# NumPy Introduction


## How to quickly create arrays

In [3]:
# Importing library  -- numpy
import numpy as np

In [58]:
# How to create a 1-D numpy array?
x = np.array([1,2,3]) 

In [60]:
x.dtype

dtype('int32')

In [61]:
# Need not be only numbers
x = np.array(['a', 'b', 'c'], dtype=object)

In [62]:
x.dtype   # 'O' means the elements are stored as generic Python objects

dtype('O')
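The `int32` shown above depends on the platform and NumPy version; other setups report `int64`. As a small sketch (the variable names are illustrative, not from the notebook), the following shows how NumPy infers a dtype from the values and how to request one explicitly:

```python
import numpy as np

ints = np.array([1, 2, 3])                          # platform default integer
floats = np.array([1, 2.5, 3])                      # one float promotes every element
strings = np.array(['a', 'b', 'c'])                 # fixed-width Unicode, not object
objects = np.array(['a', 'b', 'c'], dtype=object)   # generic Python objects
wide = np.array([1, 2, 3], dtype=np.int64)          # request the width explicitly

for name, arr in [('ints', ints), ('floats', floats), ('strings', strings),
                  ('objects', objects), ('wide', wide)]:
    print(name, arr.dtype)
```

Passing `dtype=` explicitly is one way to get the same integer width on every machine.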

In [36]:
# Quick Operations on numpy arrays
a = np.array([1,2,3]) 

np.sqrt(a)
np.log(a)
np.power(a, 2)
a ** 2   # Same as above

# Transpose
a = np.array([[1,3],[2,4]])
a
a.T

array([[1, 2],
       [3, 4]])

#### Matrix Vs. Arrays

In [38]:
# The NumPy documentation recommends against np.matrix; use regular arrays instead
np.matrix([[1,3],[2,4]])

matrix([[1, 3],
        [2, 4]])

In [37]:
# Instead use arrays
np.array([[1,3],[2,4]])

array([[1, 3],
       [2, 4]])
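With plain arrays, the two kinds of product are spelled differently, which is one reason arrays are enough on their own. A minimal sketch (the name `m` is illustrative):

```python
import numpy as np

m = np.array([[1, 3], [2, 4]])

elementwise = m * m    # multiplies matching entries
matrix_prod = m @ m    # true matrix multiplication

print(elementwise)     # [[ 1  9] [ 4 16]]
print(matrix_prod)     # [[ 7 15] [10 22]]
```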

## Initializing arrays and filling them with random numbers

In [None]:
# How to create a 2-D numpy array with a shape of (3, 2)?
# The numbers should be random integers from 1 to 9 (the upper bound, 10, is excluded).

In [None]:
np.random.randint(1,10,6).reshape(3,2)
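Because `randint` excludes its upper bound, the call above never returns 10. The sketch below (seed and variable names chosen only for illustration) shows the `size` argument as an alternative to `reshape`, and how to include 10 if that is what is wanted:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

via_reshape = np.random.randint(1, 10, 6).reshape(3, 2)   # values in 1..9
direct = np.random.randint(1, 11, size=(3, 2))            # values in 1..10

print(via_reshape.shape, direct.shape)
```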

In [None]:
# Arrays of zeros and Ones

In [None]:
np.zeros(10)

In [None]:
np.zeros((3,3))   # 3 * 3 array of zeros

In [None]:
np.ones(10)

In [None]:
np.ones((3,3))    # 3 * 3 array of ones

### Evenly spaced between two numbers

In [None]:
np.linspace(0,10,10)
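Note that `np.linspace(0, 10, 10)` includes both endpoints, so the spacing is 10/9 rather than 1. A short sketch of the options (variable names are illustrative):

```python
import numpy as np

ten_points, step = np.linspace(0, 10, 10, retstep=True)   # spacing is 10/9
eleven_points = np.linspace(0, 10, 11)                    # spacing is exactly 1
half_open = np.linspace(0, 10, 10, endpoint=False)        # 0, 1, ..., 9

print(step)
print(eleven_points)
print(half_open)
```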

### Usage of random functions

In [None]:
# Seed the generator to get reproducible results
np.random.seed(100)    # Re-seed before each call if you want the same numbers every time
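The seed fixes the whole stream of random numbers, not each call, which is why re-seeding is needed to repeat a result. The sketch below illustrates this, and also shows `np.random.default_rng`, the generator-based interface available in newer NumPy releases, as an alternative that avoids global state (variable names are illustrative):

```python
import numpy as np

np.random.seed(100)
first = np.random.rand(3)
second = np.random.rand(3)     # continues the stream, so it differs from first

np.random.seed(100)
repeat = np.random.rand(3)     # same seed again, so it matches first

# Generator objects carry their own state instead of using the global one
rng_a = np.random.default_rng(100)
rng_b = np.random.default_rng(100)
same_from_generators = np.array_equal(rng_a.random(3), rng_b.random(3))

print(np.array_equal(first, repeat), np.array_equal(first, second))
```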

#### UNIFORM Distribution -- This is what rand, random and sample draw from by default

In [22]:
# Let us generate 10 samples from the uniform distribution on [0, 1).

In [None]:
np.random.seed(100)
np.random.rand(10)

In [None]:
np.random.seed(100)
np.random.random(10)  # same as above

In [None]:
np.random.seed(100)
np.random.sample(10)  # same as above

In [None]:
np.random.seed(100)
np.random.random_sample(10)  # Same as above

In [None]:
np.random.seed(100)
np.random.uniform(0,1,10)  # same as above

In [21]:
# Let us generate 10 samples from uniform distribution between 0 and 5

In [None]:
np.random.seed(100)
np.random.uniform(0,5,10)   # floats in [0, 5)

In [None]:
# 10 integer samples from uniform distribution between 0 and 5 (randint excludes the upper bound, hence 6)

In [None]:
np.random.seed(100)
np.random.randint(0,6,10) 

In [None]:
np.random.seed(100)
np.random.randint(6,size=10)  # The lower bound can be omitted when it is 0

In [26]:
# 10 random integer samples from 0 to 99 (choice(100, 10) samples from np.arange(100))

In [27]:
np.random.seed(100)
np.random.choice(100, 10)

array([ 8, 24, 67, 87, 79, 48, 10, 94, 52, 98])

#### NORMAL Distribution

In [None]:
# 10 samples from the standard normal distribution: mean = 0 and std.dev = 1
np.random.seed(100)
np.random.randn(10)

In [None]:
np.random.seed(100)
np.random.standard_normal(10)   # Same as above

In [None]:
# 10 samples from normal distribution - mean=5 and std.dev=10
np.random.seed(100)
np.random.normal(5, 10, 10)

In [None]:
np.random.seed(100)
10 * np.random.randn(10) + 5    # Same as above
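The equivalence between the two cells above can be checked directly: scaling standard normal samples by the standard deviation and shifting by the mean gives the same draws as `np.random.normal` when both start from the same seed. A minimal sketch (variable names are illustrative):

```python
import numpy as np

np.random.seed(100)
direct = np.random.normal(5, 10, 10)       # mean 5, std.dev 10

np.random.seed(100)
scaled = 10 * np.random.randn(10) + 5      # std.dev * z + mean

print(np.allclose(direct, scaled))
```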

**Permutation**

In [28]:
# Randomly reorder a sequence. Passing an integer n reorders np.arange(n).
np.random.permutation(10)   # Shuffled copy of 0..9

array([9, 4, 3, 0, 1, 7, 6, 8, 2, 5])

In [29]:
np.random.permutation([1, 4, 9, 12, 15])

array([ 4,  9,  1, 12, 15])

In [30]:
arr = np.arange(9).reshape((3, 3))
np.random.permutation(arr)   # For a 2-D array, only the rows are reordered

array([[6, 7, 8],
       [3, 4, 5],
       [0, 1, 2]])
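In the 2-D case above, each row stays intact and only the row order changes. The sketch below illustrates that, and contrasts `permutation`, which returns a new array, with `np.random.shuffle`, which reorders in place (the seed and variable names are illustrative):

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

np.random.seed(100)
reordered = np.random.permutation(arr)   # new array; arr itself is untouched

in_place = arr.copy()
np.random.shuffle(in_place)              # modifies in_place and returns None

print(arr)
print(reordered)
print(in_place)
```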

## Matrix operations

### DOT operation - Vectorization

From the numpy documentation

np.dot(a, b) usage -> 

1. If both a and b are 1-D arrays, it is the inner product of vectors (without complex conjugation).
    
2. If both a and b are 2-D arrays, it is matrix multiplication, but using matmul or a @ b is preferred.    
    * **_RULE_** -> the number of columns of a (its last axis) must equal the number of rows of b (its first axis).
      
3. If either a or b is 0-D (scalar), it is equivalent to multiply, and using numpy.multiply(a, b) or a * b is preferred.
    
4. If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
    
    * **RULE**: The length of the last axis of a must match the length of b.
    
5. If a is an N-D array and b is an M-D array (where M>=2), it is a sum product over the last axis of a and the second-to-last axis of b.
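The five cases listed above can be sketched with small arrays whose shapes make the behavior visible. All array names here are illustrative and not part of the notebook:

```python
import numpy as np

v = np.array([1, 2, 3])                 # shape (3,)
w = np.array([1, 1, 1, 1])              # shape (4,)
M = np.arange(6).reshape(2, 3)          # shape (2, 3)
N = np.arange(12).reshape(3, 4)         # shape (3, 4)
T = np.arange(24).reshape(2, 3, 4)      # shape (2, 3, 4)
P = np.arange(20).reshape(4, 5)         # shape (4, 5)

case1 = np.dot(v, v)     # 1-D with 1-D: scalar inner product
case2 = np.dot(M, N)     # (2, 3) with (3, 4): matrix product of shape (2, 4)
case3 = np.dot(2, v)     # scalar with 1-D: same as 2 * v
case4 = np.dot(T, w)     # (2, 3, 4) with (4,): sums over the last axis, shape (2, 3)
case5 = np.dot(T, P)     # (2, 3, 4) with (4, 5): shape (2, 3, 5)

for name, result in [('case1', case1), ('case2', case2), ('case3', case3),
                     ('case4', case4), ('case5', case5)]:
    print(name, np.shape(result))
```

In every case the axis that is summed over disappears from the output shape.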


**First point** -- Sum product (inner product) when both are 1-D arrays

In [5]:
a_1d = np.array([1,2,3])
a_1d.shape                  # 1-D array 

(3,)

In [6]:
b_1d = np.array([1,2,3])
b_1d.shape                  # 1-D array 

(3,)

In [7]:
np.dot(a_1d, b_1d)          # Sumproduct / Inner product

14

In [39]:
np.inner(a_1d, b_1d)       # Same as above

14

__Second point:__ Matrix multiplication, so the usual rule applies: the number of columns of a must equal the number of rows of b.

In [13]:
a_2d = np.random.randint(1,10,6).reshape(3,2)
a_2d.shape

(3, 2)

In [14]:
b_2d = np.random.randint(1,10,6).reshape(2,3)
b_2d.shape

(2, 3)

In [15]:
np.dot(a_2d, b_2d)

array([[ 36,  11,  31],
       [ 48,  13,  41],
       [114,  39,  99]])

__Third point__ -- Scalar (0-D) multiplication.

In [None]:
c = 2
a_1d * c

**Fourth point** -- **VERY IMPORTANT: Row-wise/Column-wise Operation**

Use this when you want to apply a sum product to each row or each column.

__RULE: The length of the last axis of a must match the length of b.__

__This amounts to a row-wise operation on a, or a column-wise one if a is transposed first.__

In [None]:
d_1d = np.array([1,2,3])
e_1d = np.array([3,4])

In [10]:
np.dot(a_2d, d_1d)     # ERROR  -> last axis of a_2d has length 2, but d_1d has length 3

ValueError: shapes (3,2) and (3,) not aligned: 2 (dim 1) != 3 (dim 0)

The two cells below show how to avoid this error.

In [11]:
np.dot(a_2d.T, d_1d)   # a is transposed -> shapes (2, 3) and (3,)

array([26, 21])

In [12]:
np.dot(a_2d, e_1d)     # shapes (3, 2) and (2,)

array([20, 26, 30])
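To see why this counts as a row-wise or column-wise operation, the dot product can be rewritten as an element-wise multiply followed by a sum along one axis. The notebook's `a_2d` was random, so the values below are an assumption: a hypothetical matrix picked so that the results match the two outputs shown above.

```python
import numpy as np

# Hypothetical values; the original a_2d came from np.random.randint
a_2d = np.array([[4, 2],
                 [2, 5],
                 [6, 3]])
d_1d = np.array([1, 2, 3])
e_1d = np.array([3, 4])

row_wise = np.dot(a_2d, e_1d)        # one sum product per row
col_wise = np.dot(a_2d.T, d_1d)      # one sum product per column

# The same results written as multiply-then-sum
row_wise_manual = (a_2d * e_1d).sum(axis=1)
col_wise_manual = (a_2d * d_1d[:, None]).sum(axis=0)

print(row_wise, col_wise)            # [20 26 30] [26 21]
```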

### Broadcasting & Element-wise operation


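#### Element-wise multiplication, addition, subtraction  -- The output shape equals the broadcast (larger) shape.

**Rule** - The shapes must be identical or broadcast-compatible: comparing dimensions from the right, each pair must be equal or one of them must be 1.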

In [2]:
a = np.array([1,2,3])
b = np.array([1,2,3])

a - b   

array([0, 0, 0])

In [3]:
a + b

array([2, 4, 6])

In [4]:
a * b # Hadamard product

array([1, 4, 9])

In [5]:
# If you multiply a single-row array of shape (1, 4) with a 2-D array of shape (4, 4), the row is broadcast and the result has the shape of the larger array.

In [6]:
a = np.array([[1,2,3,4]])
b = np.array([[1,2,1,2],
              [3,4,3,4],
              [1,2,1,2],
              [3,4,3,4]])

In [9]:
a * b

array([[ 1,  4,  3,  8],
       [ 3,  8,  9, 16],
       [ 1,  4,  3,  8],
       [ 3,  8,  9, 16]])
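The example above works because a dimension of length 1 is stretched to match the other array. A short sketch of shapes that broadcast and one pair that does not (all names are illustrative):

```python
import numpy as np

row = np.array([[1, 2, 3, 4]])          # shape (1, 4)
col = np.array([[1], [2], [3], [4]])    # shape (4, 1)
grid = np.ones((4, 4), dtype=int)       # shape (4, 4)

row_times_grid = row * grid             # (1, 4) with (4, 4) -> (4, 4)
row_times_col = row * col               # (1, 4) with (4, 1) -> (4, 4)

try:
    np.array([1, 2, 3]) * np.array([1, 2, 3, 4])   # (3,) with (4,) cannot broadcast
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(row_times_grid.shape, row_times_col.shape)
print(type(mismatch_error).__name__)
```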

#### Outer product.

The output is a matrix of shape (number of elements in a, number of elements in b).


In [31]:
a = np.array([1,2])
b = np.array([3,4])

np.outer(a,b)  # Both a and b are always flattened to 1-D first.

array([[3, 4],
       [6, 8]])

In [None]:
# Another example for OUTER operation

In [33]:
a = np.array([[1,2,3], [3,4,5]])
a.flatten()

array([1, 2, 3, 3, 4, 5])

In [34]:
b = np.array([1,2,3,4])
np.outer(b,a)

array([[ 1,  2,  3,  3,  4,  5],
       [ 2,  4,  6,  6,  8, 10],
       [ 3,  6,  9,  9, 12, 15],
       [ 4,  8, 12, 12, 16, 20]])

In [35]:
np.outer(a,b)

array([[ 1,  2,  3,  4],
       [ 2,  4,  6,  8],
       [ 3,  6,  9, 12],
       [ 3,  6,  9, 12],
       [ 4,  8, 12, 16],
       [ 5, 10, 15, 20]])
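The outer product can also be written with broadcasting, which makes the flattening step explicit. A minimal sketch using the same `a` and `b` as above (`manual` is an illustrative name):

```python
import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5]])
b = np.array([1, 2, 3, 4])

# Flatten both, then multiply a column of a-values by a row of b-values
manual = a.ravel()[:, None] * b.ravel()[None, :]

print(manual.shape)                           # (6, 4)
print(np.array_equal(manual, np.outer(a, b)))
```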

#### Inner Product

In [None]:
a = np.array([1,2])
b = np.array([3,4])

np.inner(a,b)  # same as np.dot(a,b)  -- Sum Product

### Other operations

In [9]:
# Inverse operation

A = np.array([[1,2], [3,4]])
A

array([[1, 2],
       [3, 4]])

In [8]:
Ainv = np.linalg.inv(A)
Ainv

array([[-2. ,  1. ],
       [ 1.5, -0.5]])

In [7]:
np.round(Ainv.dot(A))  # Should give identity matrix

array([[ 1.,  0.],
       [ 0.,  1.]])

In [None]:
A.dot(Ainv)   # Also the identity matrix, up to floating-point rounding

In [None]:
# Determinant
np.linalg.det(A)
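Since the inverse and determinant are computed in floating point, the results are usually off by a tiny amount. One way to check them without rounding is a tolerance-based comparison, sketched here with `np.allclose` and `np.isclose` (the result names are illustrative):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
Ainv = np.linalg.inv(A)

is_left_inverse = np.allclose(Ainv @ A, np.eye(2))
is_right_inverse = np.allclose(A @ Ainv, np.eye(2))

det = np.linalg.det(A)                 # 1*4 - 2*3 = -2, up to rounding
det_matches = np.isclose(det, -2.0)

print(is_left_inverse, is_right_inverse, det_matches)
```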

In [None]:
# Diagonal
# Rule: If you pass a 2-D array, the output is a 1-D array of its diagonal elements.
# If you pass a 1-D array, the output is a 2-D array with those values on the diagonal and zeros elsewhere.

np.diag(A)

In [None]:
np.diag([1,2])

In [None]:
# Trace
np.diag(A).sum()

In [None]:
np.trace(A)

In [None]:
# NORM - L2 
# ||x|| = sqrt(sum(x ** 2))

# Manual way
a = np.array([2,1])
amag = np.sqrt((a*a).sum())
amag

In [None]:
# Direct way
np.linalg.norm(a)

In [40]:
# Identity Matrix
np.eye(4)

array([[ 1.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  1.]])

In [43]:
np.min(A)

1

In [44]:
np.max(A)

4

In [45]:
np.min(A, axis = 0)

array([1, 2])

In [46]:
np.min(A, axis = 1)

array([1, 3])

In [47]:
np.argmin(A)  # Index of the minimum value in the flattened array

0

In [48]:
np.argmax(A)

3

In [49]:
np.argmin(A, axis=0)

array([0, 0], dtype=int64)

In [50]:
np.argmax(A, axis=0)

array([1, 1], dtype=int64)

In [51]:
np.argmin(A, axis = 1)

array([0, 0], dtype=int64)

In [52]:
np.argmax(A , axis = 1)

array([1, 1], dtype=int64)
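Without an `axis` argument, `argmin` and `argmax` return a position in the flattened array. A short sketch of converting that flat index back to a (row, column) pair with `np.unravel_index` (variable names are illustrative):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

flat_index = np.argmax(A)                          # position in the flattened array
row, col = np.unravel_index(flat_index, A.shape)   # back to 2-D coordinates

print(flat_index, (row, col), A[row, col])
```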

### Question based on the above operations

Question: What is the difference between the multiply and dot operations?

The output shapes differ, and so do the rules for which input shapes are allowed.

Dot operation -> Follows the rules given in the dot operation section above. It multiplies and then sums along the shared axis (a sum product), so that axis disappears from the output.

Multiply operation -> Multiplies element by element, broadcasting the smaller array across the larger one where needed, so the output has the broadcast (larger) shape.

A dot product can be reproduced by multiplying element-wise and then summing along each row or column.
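The difference is easiest to see by comparing output shapes on the same inputs. A minimal sketch with illustrative arrays:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)
v = np.array([1, 0, 2])        # shape (3,)

multiplied = A * v             # broadcast multiply, shape stays (2, 3)
dotted = np.dot(A, v)          # sum product over the last axis, shape (2,)

print(multiplied)              # [[ 1  0  6] [ 4  0 12]]
print(dotted)                  # [ 7 16]
print(np.array_equal(multiplied.sum(axis=1), dotted))
```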


### Saving Arrays

In [3]:
x = np.arange(10)

In [4]:
y = np.array([0,1,1])

In [7]:
# savez stores several arrays in one .npz file; unnamed arrays are saved as arr_0, arr_1, ...
np.savez('test.npz', x, y)

In [11]:
a = np.load('test.npz')

In [16]:
a['arr_0']

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [17]:
a['arr_1']

array([0, 1, 1])

#### What if you want to store the arrays with names so that you can refer to them by name?

In [18]:
np.savez('file_named_array.npz', x=x, y=y)

In [19]:
file_named_array = np.load('file_named_array.npz')

In [20]:
file_named_array['x']

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [21]:
file_named_array['y']

array([0, 1, 1])
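The object returned by `np.load` for an `.npz` file keeps the file open until it is closed, so a `with` block is a tidy way to read from it. The sketch below writes to a temporary directory to avoid leaving files behind; the file name `named_arrays.npz` and the variable names are illustrative:

```python
import os
import tempfile

import numpy as np

x = np.arange(10)
y = np.array([0, 1, 1])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'named_arrays.npz')
    np.savez(path, x=x, y=y)

    # The with block closes the archive once the arrays have been read
    with np.load(path) as data:
        names = sorted(data.files)   # names stored in the archive
        x_loaded = data['x']
        y_loaded = data['y']

print(names)
print(x_loaded, y_loaded)
```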