# Lecture 8

## Numpy

Let us install numpy to begin with.

In [None]:
%pip install --user numpy

and the package can be imported.

In [1]:
import numpy as np

### Creating numpy array

We can create a numpy array using the following command:

In [2]:
x=np.array([1,2,3,4])
print(x)

[1 2 3 4]


The square brackets are required: `np.array` takes a single sequence, not separate values.

In [3]:
np.array(1,2,3,4)

TypeError: array() takes from 1 to 2 positional arguments but 4 were given

In [4]:
l=[1,2,3,4]
print(l)

[1, 2, 3, 4]


In [5]:
type(x)

numpy.ndarray

Indexing works similarly to lists $^*$

In [6]:
print(x[0],l[0])

1 1


In [7]:
print(x[0:2],l[0:2])

[1 2] [1, 2]


In [8]:
print(x[0:3:2],l[0:3:2])

[1 3] [1, 3]


`len` works the same way too.

In [9]:
print(len(x),len(l))

4 4


Arrays are also mutable.

In [10]:
x[1]=1
print(x)

[1 1 3 4]


We can also subset.

In [11]:
x_sub=x[0:2]
l_sub=l[0:2]

In [12]:
print(x_sub,l_sub)

[1 1] [1, 2]


And suppose we change the original array/list.

In [13]:
x[0]=0
l[0]=0

In [14]:
print(x_sub,l_sub)

[0 1] [1, 2]


This happens because slicing a numpy array returns a view that references the original array's data, which saves memory, whereas slicing a list creates a new list. To get an independent copy of the array slice, use the `copy` method.

In [15]:
x_sub=x[0:2].copy()
print(x_sub)
x[0]=10
print(x_sub)
print(x)

[0 1]
[0 1]
[10  1  3  4]
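
The difference between a view and a copy can be checked directly. The following is a minimal sketch using `np.shares_memory`; the variable names `view` and `copied` are chosen here for illustration and do not appear elsewhere in the lecture.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
view = x[0:2]            # basic slicing returns a view onto x's data
copied = x[0:2].copy()   # .copy() allocates independent storage

print(np.shares_memory(x, view))    # True
print(np.shares_memory(x, copied))  # False

view[0] = 99    # writing through the view also changes x
print(x)        # [99  2  3  4]
print(copied)   # [1 2]
```

Note that the link works in both directions: changing `x` changes the view, and changing the view changes `x`.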


### Dimensions

Arrays can be 2-D or, more generally, n-dimensional. With lists, the closest equivalent is a list of lists.

In [16]:
x2=np.array([[1,2],[3,4],[5,6]])
print(x2)

[[1 2]
 [3 4]
 [5 6]]


In [17]:
l2=[[1,2],[3,4],[5,6]]
print(l2)

[[1, 2], [3, 4], [5, 6]]


But arrays must have a homogeneous shape (for 2-D: all rows must have the same number of columns).

In [18]:
np.array([[1,2],[3],[5,6]])

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part.

We can access the first row the same way for both arrays and lists.

In [19]:
print(x2[0],l2[0])

[1 2] [1, 2]


But with arrays you can access individual elements directly.

In [20]:
print(x2[0,0])

1


In [21]:
print(l2[0,0])

TypeError: list indices must be integers or slices, not tuple

You can slice along rows as well as columns.

In [22]:
print(x2[0,:],x2[:,0])

[1 2] [1 3 5]


Note that you need the `:` unlike in R.

In [23]:
print(x2[0,],x2[,0])

SyntaxError: invalid syntax (4252507485.py, line 1)

`len` returns the length of the first dimension.

In [24]:
len(x2)

3

We can get the shape, total number of elements, and number of dimensions of a numpy array as follows.

In [25]:
print(x2.shape,x2.size,x2.ndim)

(3, 2) 6 2


This is equivalent to using the corresponding functions.

In [26]:
print(np.shape(x2),np.size(x2),np.ndim(x2))

(3, 2) 6 2


We can transpose arrays.

In [27]:
print(x2.T)

[[1 3 5]
 [2 4 6]]


We can also reshape it, including changing the number of dimensions.

In [28]:
x2.reshape((2,3))

array([[1, 2, 3],
       [4, 5, 6]])

What does a negative value in the shape mean?

In [29]:
x2.reshape((2,-1,3))

array([[[1, 2, 3]],

       [[4, 5, 6]]])
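
As the output suggests, a `-1` in the shape asks numpy to infer that dimension from the total number of elements. The sketch below illustrates this with the same `x2`; the name `flat` is introduced here only for the example.

```python
import numpy as np

x2 = np.array([[1, 2], [3, 4], [5, 6]])  # 6 elements in total

# -1 asks numpy to infer that axis: 6 / (2 * 3) = 1
print(x2.reshape((2, -1, 3)).shape)  # (2, 1, 3)

# A common use: flatten to 1-D without computing the length by hand
flat = x2.reshape(-1)
print(flat.shape)  # (6,)

# Only one axis may be -1, and the sizes must divide evenly
try:
    x2.reshape((4, -1))
except ValueError as err:
    print("ValueError:", err)
```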

What would be the len and shape of this array?

In [30]:
x3 = np.array([[[1],[2]],[[3],[4]],[[5],[6]]])
print(x3)

[[[1]
  [2]]

 [[3]
  [4]]

 [[5]
  [6]]]


### Datatypes

Unlike lists, every element of an array must have the same datatype. This is what lets numpy be fast and memory-efficient.

In [31]:
x.dtype

dtype('int64')

So if we assign a float to an element of an int array, the value is truncated to an int.

In [32]:
x[1]=1.5
print(x)

[10  1  3  4]


We can set the datatype of the array to float and then change it.

In [33]:
x=x.astype(np.float64)

In [34]:
x[1]=1.5
print(x)

[10.   1.5  3.   4. ]


Otherwise, we can specify the datatype when we initialize the array.

In [35]:
x=np.array([1,2,3,4],dtype=np.float64)
print(x)

[1. 2. 3. 4.]


In [36]:
x[1]=1.5
print(x)

[1.  1.5 3.  4. ]


So by default numpy uses 64-bit numbers (assuming a 64-bit platform; the default integer size can differ on some systems).

But numpy also supports lower-precision types, which save memory when the values fit.

In [37]:
x=np.array([1,2,129,257],dtype=np.int8)
print(x)

[   1    2 -127    1]


Numpy warns about the out-of-range values here (129 and 257 do not fit in `int8`) because this implicit conversion is deprecated. For the old behavior, `np.array(value).astype(dtype)` will usually give the desired result (the cast overflows).


and unsigned integers as well.

In [38]:
x=np.array([1,2,-4,257],dtype=np.uint8)
print(x)

[  1   2 252   1]


Numpy warns about the out-of-range values here as well (-4 and 257 do not fit in `uint8`). For the old behavior, `np.array(value).astype(dtype)` will usually give the desired result (the cast overflows).
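
The wrap-around in the two cells above can be reproduced with an explicit cast, following the `np.array(value).astype(dtype)` pattern. This is a minimal sketch; the names `as_int8` and `as_uint8` are chosen here for illustration.

```python
import numpy as np

# Build with the default integer type first, then cast explicitly.
as_int8 = np.array([1, 2, 129, 257]).astype(np.int8)
print(as_int8)

as_uint8 = np.array([1, 2, -4, 257]).astype(np.uint8)
print(as_uint8)

# np.iinfo reports the representable range of an integer dtype
print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)    # -128 127
print(np.iinfo(np.uint8).min, np.iinfo(np.uint8).max)  # 0 255
```

Values outside the range wrap around modulo 256, which is why 129 becomes -127 in `int8` and -4 becomes 252 in `uint8`.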


In certain cases, you might encounter missing or infinite values in numerical computations, and numpy has you covered.

In [39]:
print(np.nan,1/np.nan)
print(np.inf,1/np.inf)

nan nan
inf 0.0
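
Because `nan` never compares equal to anything, including itself, missing values are usually detected with dedicated functions rather than `==`. A short sketch, with the array `v` made up for illustration:

```python
import numpy as np

v = np.array([1.0, np.nan, np.inf, -np.inf])

print(np.nan == np.nan)  # False: nan never compares equal, even to itself
print(np.isnan(v))       # True only at the nan position
print(np.isinf(v))       # True at the +inf and -inf positions
print(np.isfinite(v))    # True only for ordinary numbers

# Aggregations propagate nan unless a nan-aware variant is used
w = np.array([1.0, np.nan, 2.0])
print(np.sum(w))     # nan
print(np.nansum(w))  # 3.0
```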


### Other ways of creating arrays

We can create an array of ones

In [40]:
y=np.ones(14)
print(y)

[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]


Or an array of zeros. And these can be of any shape.

In [41]:
y=np.zeros((2,4))
print(y)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]]


Or an empty array. But what is empty?

In [42]:
y = np.empty(10)
print(y)

[4.68607946e-310 0.00000000e+000 0.00000000e+000 0.00000000e+000
 7.74860416e-304 6.34763799e-066 4.99029311e+174 5.64164651e-091
 5.93646253e-038 7.74860568e-304]


We also have `_like` versions of these where we can get arrays of same shape as another one.

In [43]:
y=np.ones_like(x2)
print(y)

[[1 1]
 [1 1]
 [1 1]]


#### Random
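The values in an empty array are uninitialized leftovers from memory, not random numbers. We can also generate numpy arrays with actual (pseudo-)random values.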

In [44]:
y = np.random.random(10)
print(y)

[0.44274394 0.19522719 0.50427311 0.43074305 0.37891862 0.94804547
 0.52783929 0.91604648 0.93555967 0.81165586]


Or random integers in an interval (the upper bound is excluded).

In [45]:
y = np.random.randint(1,10,5)
print(y)

[4 7 1 6 1]


We can also sample from standard distributions, here a normal distribution with mean 0 and standard deviation 2.

In [46]:
y = np.random.normal(0,2,(2,5))
print(y)

[[ 0.39152102 -1.30184642  2.69662439  0.66248624  0.77166588]
 [-4.22214945 -0.46880732  2.17982422  1.08428234 -0.20799873]]
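
The outputs above change on every run. For reproducible results, one option is a seeded generator from `np.random.default_rng`, a different interface from the `np.random.*` functions used in this lecture. The seed 42 and the names `rng`, `u`, `k`, `n` below are arbitrary choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(42)  # 42 is an arbitrary seed chosen for illustration

u = rng.random(10)                 # uniform floats in [0, 1)
k = rng.integers(1, 10, size=5)    # integers in [1, 10): 10 is excluded
n = rng.normal(0, 2, size=(2, 5))  # mean 0, standard deviation 2

# Re-creating the generator with the same seed reproduces the same draws
rng_again = np.random.default_rng(42)
print(np.array_equal(u, rng_again.random(10)))  # True
```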


#### 1-D
We also have a `range`-like function in numpy.

In [47]:
y=np.arange(0,10,3)
print(y)

[0 3 6 9]


Unlike `range`, the step size can be a float.

In [48]:
y=np.arange(0,1,0.1)
print(y)

[0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9]


In [49]:
range(0,1,0.1)

TypeError: 'float' object cannot be interpreted as an integer

We also have linspace to create an array of n equally spaced elements between two values.

In [50]:
y = np.linspace(1,10,16)
print(y,len(y))

[ 1.   1.6  2.2  2.8  3.4  4.   4.6  5.2  5.8  6.4  7.   7.6  8.2  8.8
  9.4 10. ] 16


Similarly, `logspace` creates points equally spaced on a log scale; the endpoints are given as powers of 10.

In [51]:
y = np.logspace(1,10,10)
print(y,len(y))

[1.e+01 1.e+02 1.e+03 1.e+04 1.e+05 1.e+06 1.e+07 1.e+08 1.e+09 1.e+10] 10
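
The relationship between `arange`, `linspace`, and `logspace` can be made concrete with a few checks. This is a small sketch; the names `lin` and `log` are introduced here for illustration.

```python
import numpy as np

lin = np.linspace(1, 10, 16)
# Spacing is (10 - 1) / (16 - 1) = 0.6, up to floating-point rounding
print(lin[1] - lin[0])

# endpoint=False drops the stop value, like arange does
print(np.linspace(0, 1, 10, endpoint=False))

# logspace(a, b, n) is 10 raised to linspace(a, b, n)
log = np.logspace(1, 10, 10)
print(np.allclose(log, 10.0 ** np.linspace(1, 10, 10)))  # True
```

With a float step, `linspace` is often the safer choice, since the number of points is stated explicitly instead of depending on rounding of the step.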


We can also create the identity matrix

In [52]:
y = np.eye(3)
print(y)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


Given a list of values, `np.diag` builds a diagonal matrix.

In [53]:
y = np.diag([1,2,3])
print(y)

[[1 0 0]
 [0 2 0]
 [0 0 3]]


Given a 2-D array, `np.diag` instead extracts its main diagonal.

In [54]:
print(x2)
y = np.diag(x2)
print(y)

[[1 2]
 [3 4]
 [5 6]]
[1 4]


### Operations

Operations on numpy arrays are performed element-wise. This is faster than iterating over the elements manually, and the code is cleaner.

In [55]:
a = np.array([1,2,3])
b = np.array([1,4,2])

In [56]:
c = a+b
print(c)

[2 6 5]


In [57]:
c = a-b
print(c)

[ 0 -2  1]


In [58]:
c = a*b
print(c)

[1 8 6]


In [59]:
c = a/b
print(c)

[1.  0.5 1.5]
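
To make "element-wise" concrete, the sketch below writes the addition once as an explicit Python loop and once as an array expression. The names `looped` and `vectorized` are made up for this example.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 4, 2])

# Explicit Python loop over paired elements
looped = [int(ai + bi) for ai, bi in zip(a, b)]

# Vectorized form: one expression, the loop runs inside numpy
vectorized = a + b

print(looped)      # [2, 6, 5]
print(vectorized)  # [2 6 5]

# Caution: + on plain lists concatenates rather than adds
print([1, 2, 3] + [1, 4, 2])  # [1, 2, 3, 1, 4, 2]
```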


Comparison operations are supported as well.

In [60]:
c = a<b
print(c)

[False  True False]


In addition, the `@` operator performs matrix multiplication; for two 1-D arrays it gives the dot product.

In [61]:
c = a@b
print(c)

15


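Caveat: transposing a 1-D array has no effect, so this still gives the dot product rather than an outer product.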

In [62]:
c = a.T@b.T
print(c)

15


To get the outer product, we have to use the `outer` function.

In [63]:
c = np.outer(a,b)
print(c)

[[ 1  4  2]
 [ 2  8  4]
 [ 3 12  6]]


Or we can use `reshape` to make an explicit column vector and row vector.

In [64]:
c = a.reshape(-1,1)@b.reshape(1,-1)
print(c)

[[ 1  4  2]
 [ 2  8  4]
 [ 3 12  6]]
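
The caveat with `.T` comes down to shapes: a 1-D array has a single axis, so there is nothing to transpose. A brief sketch follows; `np.newaxis` is an alternative to `reshape` for adding an axis, and the names `col` and `row` are chosen here for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 4, 2])

print(a.shape, a.T.shape)  # (3,) (3,): .T changes nothing for 1-D

# Adding an axis makes explicit column and row vectors
col = a[:, np.newaxis]  # shape (3, 1)
row = b[np.newaxis, :]  # shape (1, 3)
print(col.shape, row.shape)

print((col @ row).shape)  # (3, 3): the outer product
print(row @ col)          # [[15]]: the inner product as a 1x1 matrix
```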


We have other methods, functions...

For summing over the elements.

In [65]:
a.sum()

6

For finding the maximum, minimum, mean, etc.

In [66]:
print(a.max(),a.min(),a.mean())

3 1 2.0


We can also sort the array (the `sort` method sorts in place).

In [67]:
b.sort()
print(b)

[1 2 4]


Or get the unique values and their counts (here `c` is still the outer product from above).

In [68]:
uniq, count = np.unique(c,return_counts=True)
print(uniq)
print(count)

[ 1  2  3  4  6  8 12]
[1 2 1 2 1 1 1]


We can also perform operations with scalars.

In [69]:
c = a+2
print(c)

[3 4 5]


In [70]:
c = a-2
print(c)

[-1  0  1]


In [71]:
c = a*2
print(c)

[2 4 6]


In [72]:
c = a/2
print(c)

[0.5 1.  1.5]


In [73]:
c = a<2
print(c)

[ True False False]


And this boolean array can be used as a mask to subset the array.

In [74]:
print(a[c])

[1]
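
Boolean masks can be combined and can also be used for assignment. The sketch below uses the same `a`; the names `mask` and `clipped` are made up for this example.

```python
import numpy as np

a = np.array([1, 2, 3])

mask = a < 2
print(mask)     # [ True False False]
print(a[mask])  # [1]

# Combine conditions with & and |, wrapping each comparison in parentheses
print(a[(a > 1) & (a < 3)])  # [2]

# A mask can also select positions to assign to
clipped = a.copy()
clipped[clipped > 2] = 2
print(clipped)  # [1 2 2]
```

The parentheses matter because `&` binds more tightly than the comparison operators in Python.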


What if we have arrays whose shapes are different? When the shapes are compatible, numpy broadcasts the smaller array across the larger one.

In [75]:
a = np.array([[1,2,3],[1,3,4]])
b = np.array([1,4,2])

In [76]:
a+b

array([[2, 6, 5],
       [2, 7, 6]])

In [77]:
a*b

array([[ 1,  8,  6],
       [ 1, 12,  8]])

In [78]:
a@b

array([15, 21])
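
The element-wise results above rely on broadcasting: shapes are compared from the last axis backwards, and each pair of sizes must be equal or one of them must be 1. The following sketch shows a compatible case, an incompatible one, and a fix; the vector `col` is made up for illustration, and `np.broadcast_shapes` assumes a reasonably recent numpy version.

```python
import numpy as np

a = np.array([[1, 2, 3], [1, 3, 4]])  # shape (2, 3)
b = np.array([1, 4, 2])               # shape (3,)

# (2, 3) against (3,): the last axes match, so b is reused for every row
print(np.broadcast_shapes(a.shape, b.shape))  # (2, 3)

# A length-2 vector does not line up with the last axis of length 3
col = np.array([10, 20])
try:
    a + col
except ValueError as err:
    print("ValueError:", err)

# Reshaping it to (2, 1) makes it broadcast across the columns instead
print(a + col.reshape(-1, 1))
```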

Numpy also provides mathematical functions and constants.

In [79]:
print(np.pi)

3.141592653589793


In [80]:
c = np.exp(a)
print(c)

[[ 2.71828183  7.3890561  20.08553692]
 [ 2.71828183 20.08553692 54.59815003]]


In [81]:
c = np.sin(np.radians(b))
print(c)

[0.01745241 0.06975647 0.0348995 ]


You can use these with regular numerical values as well.

In [82]:
c = np.sqrt(9)
print(c)

3.0

