# NumPy

Documentation: <k>https://numpy.org/doc/stable/user/</k>

<H3>What is NumPy?</H3>

NumPy is a Python library for working with multidimensional arrays. It can store 1D, 2D, 3D, or general n-dimensional arrays.

<H3>How are Lists different from NumPy?</H3>

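NumPy arrays are usually much faster than Python lists for numerical work.

The main reasons are:
1. Fewer bytes of memory to read: elements use fixed-size types (such as int32) instead of full Python objects.
2. No type checking when iterating, because every element has the same type.
3. NumPy stores its data in contiguous memory.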
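
The memory point can be checked with a rough sketch like the one below. The variable names and the size of 1000 elements are only illustrative, and the list figure is an estimate that depends on the Python build.

```python
import sys
import numpy as np

n = 1000  # illustrative size, not from the notes above
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int32)

# Approximate bytes for the list: the list object (an array of pointers)
# plus every individual Python int object it points to.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# Bytes for the array's data buffer: n elements * 4 bytes each.
arr_bytes = np_arr.nbytes

print(f"list : ~{list_bytes} bytes")
print(f"array: {arr_bytes} bytes")
```

The array needs exactly 4 bytes per element, while the list pays for a pointer and a full integer object per element.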

<H3>Applications of NumPy</H3>

1. Mathematics (MATLAB replacement)
2. Plotting (Matplotlib)
3. Backend (Pandas, Connect 4, Digital Photography)
4. Machine Learning

<H3>Load in NumPy</H3>

In [1]:
import numpy as np
import sys

<H3>The Basics</H3>

In [3]:
a = np.array([1,2,3])
print(a)

b = np.array([[9.0, 8.0, 7.0], [6.0, 5.0, 4.0]])
print(b)

[1 2 3]
[[9. 8. 7.]
 [6. 5. 4.]]


<H5>Some important attributes</H5>

1. ndim - number of dimensions (axes) of the array
2. shape - tuple with the number of elements along each dimension
3. dtype - data type of the array's elements
4. itemsize - size of one array element in bytes
5. nbytes - total bytes consumed by the elements of the array

In [4]:
# Get dimension
a.ndim

1

In [5]:
# Get shape
a.shape

(3,)

In [6]:
# Get Type
print(f'a type: {a.dtype}')
print(f'b type: {b.dtype}')

a type: int32
b type: float64


In [7]:
# Get size
a.itemsize

4

In [8]:
# Get total size
a.nbytes

12
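
The same attributes can be read for the 2D float array from earlier. This is a small sketch that rebuilds that array locally; it also uses the `size` attribute (total element count), which is not listed above, to show that `nbytes` equals `size * itemsize`.

```python
import numpy as np

b = np.array([[9.0, 8.0, 7.0], [6.0, 5.0, 4.0]])

print(b.ndim)      # number of dimensions
print(b.shape)     # elements along each dimension
print(b.dtype)     # element type
print(b.itemsize)  # bytes per element
print(b.size)      # total number of elements
print(b.nbytes)    # total bytes = size * itemsize
```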

<H3>Accessing/Changing specific elements, rows, columns, etc.</H3>

1. Indexing starts from 0.
2. Get a specific element [r, c]
3. Get a specific row [r, :]
4. Get a specific column [:, c]
5. Get a slice of elements [start_index:end_index:step_size], where end_index is exclusive.
6. Negative indexing is also supported: -1 is the last element and -N is the first, where N is the length of the axis.
7. With n-dimensional arrays, supply one index per dimension, working from the outermost dimension inward.
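
Negative indexing is easy to get wrong, so here is a small sketch of it. The variable names are illustrative only.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5, 6, 7],
              [8, 9, 10, 11, 12, 13, 14]])

last = a[-1, -1]           # last row, last column
first = a[-2, -7]          # same element as a[0, 0]
last_two = a[0, -2:]       # last two elements of the first row
reversed_row = a[1, ::-1]  # second row in reverse order

print(last, first)
print(last_two)
print(reversed_row)
```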

In [9]:
a = np.array([[1,2,3,4,5,6,7],[8,9,10,11,12,13,14]])
print(a)

[[ 1  2  3  4  5  6  7]
 [ 8  9 10 11 12 13 14]]


In [10]:
# Get a specific element [r, c]
a[1, 5]

13

In [11]:
# Get a specific row
a[0, :]

array([1, 2, 3, 4, 5, 6, 7])

In [12]:
# Get a specific column
a[:, 2]

array([ 3, 10])

In [14]:
# Getting a little more fancy [start_index:end_index:stepsize]
a[0, 1:6:2]

array([2, 4, 6])

In [15]:
a[1,5] = 20
print(a)

a[:,2] = [1,2]
print(a)

[[ 1  2  3  4  5  6  7]
 [ 8  9 10 11 12 20 14]]
[[ 1  2  1  4  5  6  7]
 [ 8  9  2 11 12 20 14]]


In [16]:
# 3-D example

b = np.array([[[1,2],[3,4]],[[5,6],[7,8]]])
print(b)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]


In [17]:
# Get specific element (work outside in)
b[0,1,1]

4

In [19]:
# replace
b[:,1,:] = [[9,9], [8,8]]
print(b)

[[[1 2]
  [9 9]]

 [[5 6]
  [8 8]]]


<H3>Initializing Different Types of Arrays</H3>

1. All-zeros array - <b>np.zeros(shape)</b>
2. All-ones array - <b>np.ones(shape)</b>
3. Array filled with a single value - <b>np.full(shape, value)</b>
4. Array filled with a single value, with the same shape and dtype as a given array - <b>np.full_like(array, value)</b>

5. Random array (values in [0, 1)) given the number of rows and columns - <b>np.random.rand(r, c)</b>
6. Random array (values in [0, 1)) given a shape tuple - <b>np.random.random_sample(shape)</b>
7. Random integer array - <b>np.random.randint(low, high, size=shape)</b>, where low is inclusive and high is exclusive. If only one bound is given, it is treated as high and low defaults to 0.

8. Identity matrix of order N - <b>np.identity(n)</b>, where N is a positive integer.
9. Repeat the elements of an array - <b>np.repeat(array, no_of_times, axis={0 or 1})</b>. With axis=0 on a 2D array, each row is duplicated the given number of times, so the number of rows grows. With axis=1, each element within a row is repeated, so the number of columns grows. If axis is omitted, the array is flattened first.


In [21]:
# All 0s matrix
np.zeros((2,3))

array([[0., 0., 0.],
       [0., 0., 0.]])

In [22]:
# All 1s matrix
np.ones((4,2,2), dtype='int32')

array([[[1, 1],
        [1, 1]],

       [[1, 1],
        [1, 1]],

       [[1, 1],
        [1, 1]],

       [[1, 1],
        [1, 1]]])

In [23]:
# Any other number
np.full((2,2), 5)

array([[5, 5],
       [5, 5]])

In [24]:
# Any other number (full_like) special case
np.full_like(a, 4)

array([[4, 4, 4, 4, 4, 4, 4],
       [4, 4, 4, 4, 4, 4, 4]])

In [27]:
# Random decimal numbers(passing rows and columns)
np.random.rand(4,2)

array([[0.5707351 , 0.30324427],
       [0.32621952, 0.0675638 ],
       [0.86403657, 0.42883195],
       [0.88750991, 0.18755744]])

In [28]:
# Random decimal numbers(passing shape of a array)
np.random.random_sample(a.shape)

array([[0.34571794, 0.64119821, 0.90029171, 0.8951102 , 0.0598117 ,
        0.15058284, 0.29026144],
       [0.46437939, 0.09560873, 0.85783289, 0.77918136, 0.8003194 ,
        0.87254542, 0.33748797]])

In [31]:
# Random integer values
np.random.randint(10, size=(3,3))

array([[7, 5, 6],
       [5, 5, 7],
       [2, 8, 4]])
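
The call above passes a single bound. A sketch with both bounds, using arbitrarily chosen values 4 and 8, could look like this. The printed values differ on every run.

```python
import numpy as np

# Integers drawn from [4, 8): 4 can appear, 8 cannot.
r = np.random.randint(4, 8, size=(3, 3))
print(r)
```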

In [32]:
# Main diagonal elements are 1 and all others are 0, i.e. an identity matrix of order N
np.identity(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [35]:
# repeat an array
arr = np.array([[1, 2, 3]])
r1 = np.repeat(arr,3, axis=0)
r2 = np.repeat(arr,3,axis=1)

print("Repeat along axis=0(row)")
print(r1)
print("\nRepeat along axis=1(column)")
print(r2)

Repeat along axis=0(row)
[[1 2 3]
 [1 2 3]
 [1 2 3]]

Repeat along axis=1(column)
[[1 1 1 2 2 2 3 3 3]]


<H3>
    Question 1:
    
    Create a matrix as shown below
</H3>

<Table>
    <tr>
        <td>1</td>
        <td>1</td>
        <td>1</td>
        <td>1</td>
        <td>1</td>
    </tr>
    <tr>
        <td>1</td>
        <td>0</td>
        <td>0</td>
        <td>0</td>
        <td>1</td>
    </tr>
    <tr>
        <td>1</td>
        <td>0</td>
        <td>9</td>
        <td>0</td>
        <td>1</td>
    </tr>
    <tr>
        <td>1</td>
        <td>0</td>
        <td>0</td>
        <td>0</td>
        <td>1</td>
    </tr>
    <tr>
        <td>1</td>
        <td>1</td>
        <td>1</td>
        <td>1</td>
        <td>1</td>
    </tr>
</Table>

In [43]:
a = np.ones((5,5), dtype='int16')
a[1:4,1:4] = 0
a[2,2] = 9
print(a)

[[1 1 1 1 1]
 [1 0 0 0 1]
 [1 0 9 0 1]
 [1 0 0 0 1]
 [1 1 1 1 1]]


<H3>Be careful when copying arrays</H3>

Assigning b = a does not copy the array: both names refer to the same array, so any change made through one is visible through the other. To create an independent copy, use the copy method (b = a.copy()).
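
A minimal sketch of the difference, with illustrative values:

```python
import numpy as np

a = np.array([1, 2, 3])

b = a          # b is just another name for the same array
b[0] = 100
print(a)       # a has changed as well

c = a.copy()   # c owns its own data
c[0] = -1
print(a)       # a is not affected by changes to c
print(c)
```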

<H3>Mathematics</H3>

For many more functions, see (<k>https://docs.scipy.org/doc/numpy/reference/routines.math.html</k>)

In [45]:
a = np.array([1,2,3,4])
print(a)

[1 2 3 4]


In [46]:
a + 2

array([3, 4, 5, 6])

In [47]:
a - 2

array([-1,  0,  1,  2])

In [48]:
a * 2

array([2, 4, 6, 8])

In [49]:
a / 2

array([0.5, 1. , 1.5, 2. ])

In [50]:
b = np.array([1,0,1,0])
a+b

array([2, 2, 4, 4])

In [51]:
a ** 2

array([ 1,  4,  9, 16])

In [52]:
# Take sin
np.sin(a)

array([ 0.84147098,  0.90929743,  0.14112001, -0.7568025 ])

In [53]:
np.cos(a)

array([ 0.54030231, -0.41614684, -0.9899925 , -0.65364362])

<h3>Linear Algebra</h3>

1. <b>np.matmul(a,b)</b> - Matrix multiplication
2. <b>np.linalg.det(matrix)</b> - Determinant of a matrix

Reference docs (<k>https://docs.scipy.org/doc/numpy/reference/routines.linalg.html</k>)

1. Determinant
2. Trace
3. Singular Value Decomposition
4. Eigenvalues
5. Matrix norm
6. Inverse
etc.

In [54]:
a = np.ones((2,3))
print(a)

b = np.full((3,2),2)
print(b)

np.matmul(a,b)

[[1. 1. 1.]
 [1. 1. 1.]]
[[2 2]
 [2 2]
 [2 2]]


array([[6., 6.],
       [6., 6.]])

In [55]:
# find the determinant of matrix
c = np.identity(3)
np.linalg.det(c)

1.0
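
A few of the other routines from the list can be sketched in the same way. The 2x2 diagonal matrix is an arbitrary example, chosen so the results are easy to verify by hand.

```python
import numpy as np

m = np.array([[2.0, 0.0],
              [0.0, 4.0]])

tr = np.trace(m)                # sum of the main diagonal
inv = np.linalg.inv(m)          # matrix inverse
eigvals = np.linalg.eigvals(m)  # eigenvalues (order is not guaranteed)
norm = np.linalg.norm(m)        # Frobenius norm by default
u, s, vt = np.linalg.svd(m)     # s holds the singular values, largest first

print(tr)
print(inv)
print(eigvals)
print(norm)
print(s)
```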

<h3>Statistics</h3>

1. For row-wise operations, use axis=1.
2. For column-wise operations, use axis=0.

In [56]:
stats = np.array([[1,2,3],[4,5,6]])
stats

array([[1, 2, 3],
       [4, 5, 6]])

In [60]:
# max and min in the array
print(np.min(stats), np.max(stats))

1 6


In [61]:
# min in each row
np.min(stats, axis=1)

array([1, 4])

In [62]:
# min in each column
np.min(stats, axis=0)

array([1, 2, 3])
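
The same axis rule applies to other reductions. A short sketch with np.sum and np.mean on the same data:

```python
import numpy as np

stats = np.array([[1, 2, 3], [4, 5, 6]])

total = np.sum(stats)               # sum of every element
row_sums = np.sum(stats, axis=1)    # one value per row
col_sums = np.sum(stats, axis=0)    # one value per column
col_means = np.mean(stats, axis=0)  # mean of each column

print(total)
print(row_sums)
print(col_sums)
print(col_means)
```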

<h3>Reorganizing Arrays</h3>

1. <b>array.reshape(new_shape)</b> - returns the array with the given shape; the total number of elements must stay the same.
2. <b>np.vstack([a, b])</b> - returns a combined array with a stacked on top of b. For 2D arrays this is the same as np.concatenate() with axis=0.
3. <b>np.hstack([a, b])</b> - returns a combined array with a to the left of b. For 2D arrays this is the same as np.concatenate() with axis=1.

In [64]:
before = np.array([[1,2,3,4],[5,6,7,8]])
print(before)

after = before.reshape((2,2,2))
print(after)

[[1 2 3 4]
 [5 6 7 8]]
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]


In [66]:
# vertically stacking vectors
v1 = np.array([1,2,3,4])
v2 = np.array([5,6,7,8])

np.vstack([v1,v2])

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [67]:
# horizontal stacking vectors
h1 = np.ones((2,4))
h2 = np.zeros((2,2))

np.hstack([h1,h2])

array([[1., 1., 1., 1., 0., 0.],
       [1., 1., 1., 1., 0., 0.]])
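
For 2D inputs, the stacking functions can be compared against np.concatenate directly. The two small arrays below are made up for the comparison.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

v = np.vstack([a, b])  # shape (4, 2): a on top of b
h = np.hstack([a, b])  # shape (2, 4): a to the left of b

same_v = np.array_equal(v, np.concatenate([a, b], axis=0))
same_h = np.array_equal(h, np.concatenate([a, b], axis=1))

print(v)
print(h)
print(same_v, same_h)
```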

<h3>Miscellaneous</h3>

<h4>Load Data from file</h4>

In [72]:
filedata = np.genfromtxt('data.txt', delimiter=',')
filedata = filedata.astype('int32')
filedata

array([[  1,  13,  21,  11, 196,  75,   4,   3,  34,   6,   7,   8,   0,
          1,   2,   3,   4,   5],
       [  3,  42,  12,  33, 766,  75,   4,  55,   6,   4,   3,   4,   5,
          6,   7,   0,  11,  12],
       [  1,  22,  33,  11, 999,  11,   2,   1,  78,   0,   1,   2,   9,
          8,   7,   1,  76,  88]])
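
If data.txt is not at hand, the same steps can be tried on an in-memory file. This sketch assumes a small made-up comma-separated text and uses io.StringIO as a stand-in for the file.

```python
import io
import numpy as np

# Hypothetical file contents, standing in for data.txt.
text = "1,13,21\n3,42,12\n1,22,33\n"

data = np.genfromtxt(io.StringIO(text), delimiter=',')
print(data.dtype)  # genfromtxt produces floats by default

data = data.astype('int32')  # convert to integers, as in the cell above
print(data)
```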

<h4>Boolean Masking and Advanced Indexing</h4>

In [75]:
filedata[filedata > 50]

array([196,  75, 766,  75,  55, 999,  78,  76,  88])

In [76]:
# you can index with a list in numpy
a = np.array([1,2,3,4,5,6,7,8,9])
a[[1,2,8]]

array([2, 3, 9])

In [77]:
np.any(filedata > 50, axis=0)

array([False, False, False, False,  True,  True, False,  True,  True,
       False, False, False, False, False, False, False,  True,  True])

In [78]:
np.all(filedata > 50, axis=0)

array([False, False, False, False,  True, False, False, False, False,
       False, False, False, False, False, False, False, False, False])

In [79]:
((filedata > 50) & (filedata < 100))

array([[False, False, False, False, False,  True, False, False, False,
        False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False,  True, False,  True, False,
        False, False, False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False, False,  True,
        False, False, False, False, False, False, False,  True,  True]])

In [80]:
~((filedata > 50) & (filedata < 100))

array([[ True,  True,  True,  True,  True, False,  True,  True,  True,
         True,  True,  True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True, False,  True, False,  True,
         True,  True,  True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True,  True,  True, False,
         True,  True,  True,  True,  True,  True,  True, False, False]])
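
Boolean masks like these are mostly used to select or overwrite elements. A sketch on a small made-up array (not the file data above):

```python
import numpy as np

data = np.array([[1, 13, 196, 75],
                 [3, 42, 766, 55]])

mask = (data > 50) & (data < 100)

selected = data[mask]   # elements strictly between 50 and 100
outside = data[~mask]   # all remaining elements, flattened

capped = data.copy()
capped[capped > 100] = 100  # assignment through a mask

print(selected)
print(outside)
print(capped)
```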