In [3]:
import numpy as np

# Numpy

## Numpy Arrays

- NumPy arrays hold homogeneous elements and are faster and more compact than Python lists
- All elements of a NumPy array share one type, called the array's dtype. np.array() infers it from the data, while creation routines such as np.zeros() default to np.float64
- They can be initialised from Python lists -

In [4]:
# Initialising a 1D array
arr1 = np.array([1,2,3,4,5])
# Initialising a 2D array
arr2 = np.array([[1,2,3],[4,5,6]])
# Accessing an element (row 0, column 1)
print(arr2[0, 1])

2
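
The dtype behaviour described above can be checked directly. This is a small sketch; the variable names are illustrative only, and the exact integer dtype depends on the platform and NumPy version.

```python
import numpy as np

ints = np.array([1, 2, 3])       # all integers -> an integer dtype (often int64)
floats = np.array([1, 2.5, 3])   # a single float promotes every element to float64
zeros = np.zeros(3)              # creation routines like zeros() default to float64

print(ints.dtype, floats.dtype, zeros.dtype)
```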


- An ndarray is an n-dimensional array, which covers any array regardless of its number of dimensions
- ndarray is also the name of the NumPy class that represents these arrays, including vectors (1D) and matrices (2D)
- Dimensions in NumPy are often also called axes
- As with Python lists, elements can be accessed by indexing and slicing. Unlike list slices, however, a slice of a NumPy array is a view that shares data with the original, so changes made through one are reflected in the other
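
To make the data sharing concrete, here is a minimal sketch contrasting a slice (a view) with an explicit copy. The names `base`, `view` and `independent` are made up for illustration.

```python
import numpy as np

base = np.array([1, 2, 3, 4, 5])

# A slice is a view: it shares memory with `base`
view = base[1:4]
view[0] = 99

# An explicit copy owns its own data
independent = base.copy()
independent[0] = -1

print(base)                                 # the write through `view` is visible here
print(np.shares_memory(base, view))         # True
print(np.shares_memory(base, independent))  # False
```

Call copy() whenever you need to modify a slice without touching the original array.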

## More on Initialisation

- We already saw how np.array() creates an array from a Python list. Here are some other ways -

In [10]:
# Create an array of zeroes
print(np.zeros(2))
# Create an array of ones
print(np.ones(2))
# Create an uninitialised array; its values are whatever happens to be in memory, so they are arbitrary
print(np.empty(2))
# Create an array over a range of elements with start stop and step
print(np.arange(1,20,2))
# Create an array of specified size over a range with evenly spaced elements
print(np.linspace(0,100,5))
# Initialising an array with integer elements instead of float
print(np.zeros(2,dtype=np.int64))

[0. 0.]
[1. 1.]
[1. 1.]
[ 1  3  5  7  9 11 13 15 17 19]
[  0.  25.  50.  75. 100.]
[0 0]
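
A few related creation routines, sketched here as a supplement to the cell above. It shows that a shape tuple gives more than one dimension, and that arange() excludes its stop value while linspace() includes it. Variable names are illustrative.

```python
import numpy as np

grid = np.zeros((2, 3))           # pass a tuple for a multi-dimensional shape
stepped = np.arange(0, 1, 0.25)   # stop value is excluded
spaced = np.linspace(0, 1, 5)     # stop value is included
sevens = np.full((2, 2), 7)       # constant-filled array
identity = np.eye(2)              # 2x2 identity matrix

print(grid.shape)
print(stepped)
print(spaced)
print(sevens)
print(identity)
```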


## Indexing

- NumPy offers multiple ways to index an array
1. **Slicing** - NumPy arrays can be sliced just like Python lists - 
    - For multi-dimensional arrays, specify a slice for each dimension
    - You can also mix slicing and integer indexing; each integer index removes one dimension, so the result has a lower rank than the original array

In [13]:
# Create an array - 
# [[1 2 3 4]
#  [5 6 7 8]
#  [9 10 11 12]]
arr = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])

# Create a subarray with last two rows and columns 1 and 2
arr_var = arr[1:,1:3]

# Note that slices are views and so changes to them change the actual array as well
# Initial state
print(arr)

# Making a change
arr_var[0][0] = 0

# Change reflected in main array
print(arr)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[[ 1  2  3  4]
 [ 5  0  7  8]
 [ 9 10 11 12]]
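
The rank reduction from mixing integer indexing with slicing can be seen by comparing shapes. A minimal sketch, with illustrative variable names:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row_int = arr[1, :]      # integer index drops that axis -> rank 1
row_slice = arr[1:2, :]  # slice keeps the axis with length 1 -> rank 2

print(row_int, row_int.shape)      # [5 6 7 8] (4,)
print(row_slice, row_slice.shape)  # [[5 6 7 8]] (1, 4)
```

Both expressions select the same data; only the shape of the result differs.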


2. **Integer** - Indexing with lists or arrays of integers builds a new array from the data of an existing one
    - It works a little differently from Python lists: to select multiple elements, pass one index array per dimension, and the arrays are paired up position by position
    - A nice consequence is that you can mutate a chosen set of elements without a loop

In [23]:
# Using the previous array and selecting elements at (0,1), (1,2), (2,3)
arr = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
print('array is-')
print(arr)
print('The elements we want are',arr[[0,1,2],[1,2,3]])

# Create a 3 by 4 array of zeros
arr = np.zeros([3,4],np.int64)
print('\nnew array-')
print(arr)

# Increment one element in each column (the row is given by `rows`) without a loop
rows = np.array([1,0,0,2])
arr[rows,np.arange(4)] += 1
print('mutated-')
print(arr)

array is-
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
The elements we want are [ 2  7 12]

new array-
[[0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]
mutated-
[[0 1 1 0]
 [1 0 0 0]
 [0 0 0 1]]
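
Unlike slicing, reading with integer array indexing produces a new array rather than a view. The sketch below illustrates this; `picked` is a made-up name.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Integer array indexing copies the selected elements into a new array
picked = arr[[0, 1, 2], [1, 2, 3]]
picked[0] = 0

print(picked)     # [ 0  7 12]
print(arr[0, 1])  # 2, the original is unchanged
```

Assignment through the index expression itself (as in `arr[rows, cols] += 1`) still writes into the original array.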


3. **Boolean** - Used to pick elements that satisfy some condition

In [24]:
arr = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])

bool_idx = (arr>3)
# What this does is return an array whose value at an index tells us if the corresponding element is greater than 3

print(bool_idx)

# Indexing with the boolean array returns a rank 1 array of just the elements that satisfy the condition
print(arr[bool_idx])

[[False False False  True]
 [ True  True  True  True]
 [ True  True  True  True]]
[ 4  5  6  7  8  9 10 11 12]
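
Boolean masks can also be combined and used for assignment. The following is a sketch under the assumption that we want the even elements greater than 3; `mask` and `selected` are illustrative names.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Combine conditions with & (and) or | (or); wrap each condition in parentheses
mask = (arr > 3) & (arr % 2 == 0)
selected = arr[mask]
print(selected)  # [ 4  6  8 10 12]

# Assigning through the mask modifies arr in place
arr[mask] = 0
print(arr)
```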


## Math on an Array

In [26]:
# Quick demonstration of basic algebra - which works elementwise on any array
a = np.array([1,2,3,4])
b = np.array([4,3,2,1])

# Addition
print(a+b)
print(np.add(a,b))

# Subtraction
print(a-b)
print(np.subtract(a,b))

# Multiplication
print(a*b)
print(np.multiply(a,b))

# Division
print(a/b)
print(np.divide(a,b))

# Square Root
print(np.sqrt(a))

[5 5 5 5]
[5 5 5 5]
[-3 -1  1  3]
[-3 -1  1  3]
[4 6 6 4]
[4 6 6 4]
[0.25       0.66666667 1.5        4.        ]
[0.25       0.66666667 1.5        4.        ]
[1.         1.41421356 1.73205081 2.        ]


- Note that * is elementwise multiplication. Use the dot() method or the np.dot() function (or, equivalently for 1D and 2D arrays, the @ operator) to compute inner products and matrix products

In [28]:
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])
c = np.array([9,10])
d = np.array([11,12])

# Dot product
print(c.dot(d))
print(np.dot(c,d))

# Matrix times a vector
print(a.dot(c))
print(np.dot(a,c))

# Matrix multiplication
print(a.dot(b))
print(np.dot(a,b))

219
219
[29 67]
[29 67]
[[19 22]
 [43 50]]
[[19 22]
 [43 50]]
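
For 1D and 2D arrays, the @ operator gives the same results as dot(). A short sketch reusing the values from the cell above:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
c = np.array([9, 10])

mat_mat = a @ b   # matrix product, same as a.dot(b)
mat_vec = a @ c   # matrix times vector, same as a.dot(c)

print(mat_mat)
print(mat_vec)
print(np.array_equal(mat_mat, np.dot(a, b)))  # True
```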


### Sum function
- sum() is another function you will use often
- It can sum over all elements or along a specific axis, and its where argument restricts the sum to selected elements

In [29]:
# Demonstration of the sum() function
a = np.array([[1,2],[3,4]])

# Sum all elements
print(np.sum(a))

# Sum each column
print(np.sum(a,axis=0))

# Sum each row
print(np.sum(a,axis=1))

10
[4 6]
[3 7]
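
One way to sum only some elements is sketched below, using either a boolean mask or the where argument of np.sum (available in reasonably recent NumPy versions). The threshold of 2 is an arbitrary choice for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# Sum only the elements greater than 2, via a boolean mask
total_gt2 = np.sum(a[a > 2])

# Same filter per column, using the where argument
cols_gt2 = np.sum(a, axis=0, where=a > 2)

# keepdims=True keeps the summed axis with length 1
row_sums = np.sum(a, axis=1, keepdims=True)

print(total_gt2)       # 7
print(cols_gt2)        # [3 4]
print(row_sums.shape)  # (2, 1)
```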


### Transposing and Reshaping
- Transposing is so common for matrices that every array exposes its transpose directly through the T attribute
- Reshaping is also common, and NumPy provides the reshape() function for it

In [34]:
a = np.array([[1,2],[3,4]])
print(a)
print('\nTranspose-')
print(a.T)
print('\nReshaped-')
print(np.reshape(a,(4,1)))

[[1 2]
 [3 4]]

Transpose-
[[1 3]
 [2 4]]

Reshaped-
[[1]
 [2]
 [3]
 [4]]
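
Two further details of reshape(), sketched with an illustrative array: one dimension can be given as -1 so that NumPy infers it, and the result is a view of the original data when that is possible (as it is for this contiguous array).

```python
import numpy as np

a = np.arange(6)      # [0 1 2 3 4 5]
m = a.reshape(2, -1)  # -1 is inferred as 3, giving shape (2, 3)

print(m.shape)    # (2, 3)
print(m.T.shape)  # (3, 2)

# Here the reshaped array shares data with `a`
m[0, 0] = 100
print(a[0])       # 100
```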


## Working with Arrays of Different Shapes
- What if you want to combine (for example add or multiply) a large array with a smaller array of a different shape, reusing the small array across the large one?
- The first instinct would be a loop that builds an array of the required shape from the small array, but such a loop is slow for large arrays
- NumPy offers two ways to handle this: the tile() function and broadcasting

In [35]:
a = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
b = np.array([1, 0, 1])

# The Goal: Add b to each row of a and store in an array res

# 1) Tile function
bb = np.tile(b,(4,1))             # Stacks 4 copies of b on itself
print(bb)

# Now just use this matrix
res = a + bb
print(res)

# 2) Broadcasting
res = a + b                       # Yep, you can just do that
print(res)

[[1 0 1]
 [1 0 1]
 [1 0 1]
 [1 0 1]]
[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]
[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]


- The tile() function does not help with space complexity, since it materialises full copies of the small array
- Broadcasting avoids that extra memory. a + b can be computed directly despite the different shapes, and it works as if b were actually np.tile(b,(4,1))

### Rules of Broadcasting

1. If the arrays do not have the same rank, prepend 1s to the shape of the lower-rank array until both shapes have the same length
2. Two arrays are compatible in a dimension if they have the same size in that dimension or if one of them has size 1 in it
3. Arrays can be broadcast together if they are compatible in all dimensions
4. After broadcasting, each array behaves as if its shape were the elementwise maximum of the two input shapes
5. In any dimension where one array has size 1 and the other has size greater than 1, the first array behaves as if it were copied along that dimension

- Functions that support broadcasting are called *universal functions* (ufuncs)
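
The rules can be traced on a few shapes. This is a small sketch with illustrative arrays: a (4, 3) array, a (3,) vector, a (4, 1) column, and one deliberately incompatible case.

```python
import numpy as np

a = np.ones((4, 3))
b = np.array([1, 0, 1])            # shape (3,), treated as (1, 3) by rule 1
col = np.arange(4).reshape(4, 1)   # shape (4, 1)

print((a + b).shape)   # (4, 3): b is stretched along axis 0
outer = col + b        # (4, 1) with (1, 3): both are stretched
print(outer.shape)     # (4, 3)
print(outer)

# Shapes (4, 3) and (4,) are incompatible: trailing sizes 3 and 4 differ
failed = False
try:
    a + np.arange(4)
except ValueError as err:
    failed = True
    print("cannot broadcast:", err)
```

Reshaping the length-4 vector to (4, 1), as done for `col`, is the usual way to make such a case compatible.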

## Miscellaneous

### Sorting

In [36]:
arr = np.array([42,132,432,123,1,45])
arr = np.sort(arr)
print(arr)

[  1  42  45 123 132 432]


### Concatenation

In [37]:
# Concatenating 1D arrays
a = np.zeros(4)
b = np.ones(3)
print(np.concatenate((a,b)))
# Concatenating 2D arrays
a = np.array([[1,2],[3,4]])
b = np.array([[6,7]])
print(np.concatenate((a,b),axis=0))

[0. 0. 0. 0. 1. 1. 1.]
[[1 2]
 [3 4]
 [6 7]]


### Array Attributes

In [38]:
a = np.array([[[0, 1, 2, 3],
               [4, 5, 6, 7]],

              [[0, 1, 2, 3],
               [4, 5, 6, 7]],

              [[0 ,1 ,2, 3],
               [4, 5, 6, 7]]])
# Axes/Dimensions
print(a.ndim)
# Number of elements
print(a.size)
# Shape
print(a.shape)

3
24
(3, 2, 4)
