# NumPy arrays

In [3]:
import numpy as np

The main data structure is np.ndarray, short for n-dimensional array.

Make a mental note: np.ndarray is the class, and np.array is the function that creates instances of it.
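To make the distinction concrete, the short check below confirms that np.array returns an object whose class is np.ndarray. This is only a sketch; the variable name `arr` is chosen for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3])  # np.array is a function that builds the array

print(type(arr))                    # the class of the returned object
print(isinstance(arr, np.ndarray))  # True: arr is an instance of np.ndarray
print(callable(np.array))           # True: np.array can be called like any function
```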

In [5]:
X = np.array([1, 10.5, -4.9, 54])  # creating a NumPy array from a Python list
X

array([ 1. , 10.5, -4.9, 54. ])

In [None]:
np.ones(shape=7) # initializing a NumPy array from scratch using ones

array([1., 1., 1., 1., 1., 1., 1.])

In [7]:
np.zeros(shape=5) # initializing a NumPy array from scratch using zeros

array([0., 0., 0., 0., 0.])

We can even initialize NumPy arrays using random numbers.

In [8]:
np.random.rand(10)

array([0.83846158, 0.27795777, 0.94666403, 0.4122031 , 0.33578458,
       0.52882447, 0.70109735, 0.66654747, 0.26186274, 0.30837049])

Most importantly, given an existing array, we can initialize another one with the same shape and data type using the np.zeros_like, np.ones_like, and np.empty_like functions.

In [9]:
np.zeros_like(X)

array([0., 0., 0., 0.])
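The other two functions behave analogously. The sketch below assumes the same `X` as above; the names `ones` and `empty` are illustrative. Note that np.empty_like only allocates memory, so its contents are arbitrary until overwritten.

```python
import numpy as np

X = np.array([1, 10.5, -4.9, 54])

ones = np.ones_like(X)    # same shape and dtype as X, filled with ones
empty = np.empty_like(X)  # same shape and dtype, contents uninitialized

print(ones)
print(ones.shape == X.shape, ones.dtype == X.dtype)
print(empty.shape, empty.dtype)  # only shape and dtype are meaningful here
```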

Just like Python lists, NumPy arrays support item assignments and slicing.

In [10]:
X[0] = 1545.215
X

array([1545.215,   10.5  ,   -4.9  ,   54.   ])

In [11]:
X[1:4]

array([10.5, -4.9, 54. ])

However, as expected, you can only store a single data type within each ndarray. When trying to assign a string as the first element, we get an error message.

In [12]:
X[0] = "str"

ValueError: could not convert string to float: 'str'

Every ndarray has a data type attribute that can be accessed at ndarray.dtype.

In [13]:
X.dtype

dtype('float64')
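The data type is normally inferred from the values passed to np.array, and it can also be requested explicitly with the dtype argument. A minimal sketch, with illustrative variable names:

```python
import numpy as np

ints = np.array([1, 2, 3])      # all integers: an integer dtype is inferred
mixed = np.array([1, 10.5])     # int and float together: stored as float64
as_float = np.array([1, 2, 3], dtype=np.float64)  # dtype requested explicitly

print(ints.dtype.kind)  # 'i' means signed integer; the exact width depends on the platform
print(mixed.dtype)
print(as_float.dtype)
```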

If a conversion can be made between the value to be assigned and the data type, it is automatically performed, making the item assignment successful.

In [14]:
val = 23
print(val, type(val))

X[0] = val
X

23 <class 'int'>


array([23. , 10.5, -4.9, 54. ])
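The automatic conversion always targets the array's data type, so it can silently lose information when the array is of an integer type. The sketch below uses hypothetical arrays `counts` and `values` to show the effect and one way around it.

```python
import numpy as np

counts = np.array([1, 2, 3])  # integer dtype inferred from the inputs
counts[0] = 2.7               # the float is converted to the array's integer dtype
print(counts)                 # the fractional part of 2.7 is dropped

values = counts.astype(np.float64)  # convert the whole array to float first
values[0] = 2.7                     # now the value is stored as given
print(values)
```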

NumPy arrays are iterable, just like other container types in Python.

In [15]:
for x in X:
    print(x)

23.0
10.5
-4.9
54.0


Are these suitable to represent vectors? YES. We'll see why!

# NumPy arrays as vectors

In [4]:
v_1 = np.array([-4.0, 1.0, 2.3])
v_2 = np.array([-8.3, -9.6, -7.7])

The addition and scalar multiplication operations are supported by default and perform as expected.

In [12]:
v_1 + v_2

array([-12.3,  -8.6,  -5.4])

In [8]:
10.0*v_1 # multiplying v_1 with a scalar

array([-40.,  10.,  23.])

In [9]:
v_1*v_2 # the elementwise product of v_1 and v_2

array([ 33.2 ,  -9.6 , -17.71])

In [10]:
np.zeros(shape=3) + 1

array([1., 1., 1.])
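In the last cell, the scalar is applied to every element of the array. The following sketch checks that reading and combines it with scalar multiplication; the names `shifted` and `v` are illustrative.

```python
import numpy as np

shifted = np.zeros(shape=3) + 1  # the scalar 1 is added to each element
print(np.array_equal(shifted, np.ones(shape=3)))

v = np.array([-4.0, 1.0, 2.3])
print(v - 1)      # subtracts 1 from every component
print(2 * v + 1)  # scalar multiplication followed by the elementwise shift
```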

We can (often) plug NumPy arrays into functions intended for scalars.

In [14]:
def f(x):
    """A polynomial function."""
    return 3*x**2 - x**4

f(v_1) 

array([-208.    ,    2.    ,  -12.1141])
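The "often" matters: a function built purely from arithmetic works on arrays, but one that branches on its argument does not, because an `if` needs a single truth value. The sketch below uses two hypothetical functions, `relu_scalar` and `relu_array`, to illustrate the failure and one possible array-friendly rewrite based on np.where.

```python
import numpy as np

def relu_scalar(x):
    """Scalar-only: the `if` needs a single True/False value."""
    if x > 0:
        return x
    return 0.0

def relu_array(x):
    """Array-friendly: np.where selects a value element by element."""
    return np.where(x > 0, x, 0.0)

v_1 = np.array([-4.0, 1.0, 2.3])

try:
    relu_scalar(v_1)  # x > 0 is an array, so the `if` cannot decide
except ValueError as err:
    print("scalar version failed:", err)

print(relu_array(v_1))
```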

So far, NumPy arrays satisfy almost everything we require to represent vectors. There is only one box to be checked: performance. To investigate this, we measure the execution time.

In [17]:
from timeit import timeit

n_runs = 100000
size = 1000

t_add_builtin = timeit(
    "[x + y for x, y in zip(v_1, v_2)]",
    setup=f"size={size}; v_1=[0 for i in range(size)]; v_2=[1 for i in range(size)]",
    number=n_runs,
)

t_add_numpy = timeit(
    "v_1 + v_2",
    setup=f"import numpy as np; size={size}; v_1=np.zeros(size); v_2=np.ones(size)",
    number=n_runs,
)

print(f"Built-in addition: {t_add_builtin} seconds")
print(f"NumPy addition: {t_add_numpy} seconds")
print(f"Speed-up: {t_add_builtin / t_add_numpy:.2f}x")

Built-in addition: 7.082233400000405 seconds
NumPy addition: 0.17135579999740003 seconds
Speed-up: 41.33x


NumPy arrays are much, much faster. This is because
* they are contiguous in memory,
* they are homogeneous in type,
* their operations are implemented in C.
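The first two properties can be inspected directly on an array. A small sketch, assuming a freshly created array `v` of the same size as in the benchmark:

```python
import numpy as np

v = np.zeros(1000)

print(v.dtype)                  # a single dtype shared by every element
print(v.itemsize)               # bytes per element (8 for float64)
print(v.nbytes)                 # itemsize times the number of elements
print(v.flags["C_CONTIGUOUS"])  # True: the data sits in one contiguous block
```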