# NumPy
NumPy, short for **Numerical Python**, is a library for performing calculations on arrays and matrices. It is faster than equivalent code written with plain Python lists because its core routines are implemented in C.

## Basics
The most basic operation is creating an array. NumPy's array class, `ndarray`, is different from Python lists and from the standard library's `array.array`. All values in an array must have the same data type; otherwise, NumPy implicitly casts them to a common type to maintain homogeneity. The elements of a newly created array are stored in a contiguous block of memory. Dimensions are called `axes`. Some of the important attributes of `ndarray` are `ndim`, `shape`, `size`, `dtype`, `itemsize`, and `data`. Their usage is demonstrated below.

In [1]:
!python3 --version

Python 3.8.6


In [2]:
!python3 -m pip install numpy --upgrade --quiet

In [3]:
import numpy as np # convention of aliasing

In [4]:
n = np.array([[1,2,3],[4,5,6]])
print(n.ndim) # number of dimensions (axes)
print(n.shape) # size along each axis: (rows, columns)
print(n.size) # total number of elements
print(n.dtype) # data type of the elements
print(n.itemsize) # bytes occupied by each element
print(n.data) # memoryview of the buffer that stores the array
print(type(n))

2
(2, 3)
6
int64
8
<memory at 0x106240790>
<class 'numpy.ndarray'>
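
The attributes printed above are related to one another. The following is a minimal sketch of those relationships; `nbytes` is an additional `ndarray` attribute not used in the cell above, and the `dtype` is set explicitly here only so that the byte counts do not depend on the platform default.

```python
import numpy as np

# dtype is fixed explicitly so the byte counts are the same on every platform
n = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

# size is the product of the entries of shape
size_from_shape = int(np.prod(n.shape))
print(n.size == size_from_shape)          # True

# total bytes in the buffer = number of elements * bytes per element
print(n.nbytes == n.size * n.itemsize)    # True (6 * 8 = 48 bytes)

# ndim is the length of the shape tuple
print(n.ndim == len(n.shape))             # True
```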


## Array Creation
There are many ways to create an array. One is the `np.array` function, which takes a Python list (or nested lists) as its argument. Its optional keyword argument `dtype` sets the element type explicitly. When `dtype` is omitted, NumPy picks a type that can hold every element and casts the values to it.

In [5]:
empty_arr = np.array([])
print("empty array", empty_arr)

arr = np.array([[1,2,3],[4.0,5,6]])
print("int typecasted to float", arr, arr.dtype)

complex_arr = np.array([[1,2,3],[4.0,5,6]], dtype=complex)
print("complex array", complex_arr)

a = np.array([1,2,'one'])
print("int typecasted to str", a)
b = np.array([1,2,True,3, False,'h'])
print("bool and int typecasted to str", b)
print("--------------------------------------------------------------")

empty array []
int typecasted to float [[1. 2. 3.]
 [4. 5. 6.]] float64
complex array [[1.+0.j 2.+0.j 3.+0.j]
 [4.+0.j 5.+0.j 6.+0.j]]
int typecasted to str ['1' '2' 'one']
bool and int typecasted to str ['1' '2' 'True' '3' 'False' 'h']
--------------------------------------------------------------
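
To check which type NumPy settled on after implicit casting, one option is to inspect `dtype.kind`, a one-letter code for the type family. The sketch below is illustrative; the variable names are made up for this example, and `astype` is shown as one way to convert explicitly afterwards.

```python
import numpy as np

mixed_int_float = np.array([1, 2.5])     # int is upcast to float
mixed_bool_int = np.array([True, 2])     # bool is upcast to int
mixed_int_str = np.array([1, 'one'])     # everything becomes a Unicode string

# dtype.kind: 'f' = float, 'i' = signed integer, 'U' = Unicode string
print(mixed_int_float.dtype.kind)   # f
print(mixed_bool_int.dtype.kind)    # i
print(mixed_int_str.dtype.kind)     # U

# astype returns a converted copy; the source array is left unchanged
as_int = np.array(['1', '2', '3']).astype(int)
print(as_int)                       # [1 2 3]
```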


NumPy can create placeholder arrays when the shape is known in advance. This avoids the overhead of repeatedly extending a zero-sized array. Functions for this purpose include `ones`, `zeros`, `eye`, and `empty`.

In [6]:
print("zeros array\n", np.zeros((2,3,4)))
print("ones array\n", np.ones((1,2,3)))
print("primary diagonal filled with one\n", np.eye(2)) # square when only one size is given
print("empty array, filled with garbage values\n", np.empty((2,3)))
print("--------------------------------------------------------------")

zeros array
 [[[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]

 [[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]]
ones array
 [[[1. 1. 1.]
  [1. 1. 1.]]]
primary diagonal filled with one
 [[1. 0.]
 [0. 1.]]
empty array, filled with garbage values
 [[1. 1. 1.]
 [1. 1. 1.]]
--------------------------------------------------------------
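
The placeholder functions accept a few more options than the cell above uses. The following sketch shows some of them under the assumption that integer placeholders are wanted; `np.full` is one of the "other" creation functions, and the preallocate-then-fill loop is only an illustration of why a known shape helps.

```python
import numpy as np

int_zeros = np.zeros((2, 3), dtype=int)  # integer zeros instead of the default float64
sevens = np.full((2, 2), 7)              # every element set to a chosen value
rect_eye = np.eye(2, 3)                  # ones on the main diagonal of a 2x3 matrix

# preallocate once, then fill by index instead of growing the array
squares = np.empty(5, dtype=int)
for i in range(5):
    squares[i] = i * i
print(squares)   # [ 0  1  4  9 16]
```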


`arange` is analogous to `range` in Python. It returns a 1D array of evenly spaced numbers, starting from 0 with a step of 1 by default; the stop value is excluded. `linspace(a, b, c)` returns an array of length c whose elements are evenly spaced between a and b. Here a and b are the first and last elements of the resulting array.

In [7]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [8]:
np.arange(12,56,3)

array([12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48, 51, 54])

In [9]:
"""
The parameters can be floats. However, this is not advisable: floating-point rounding error can make
the length of the generated array hard to predict. Prefer linspace when the number of points matters.
"""
np.arange(1.2,2,0.3) 

array([1.2, 1.5, 1.8])

In [10]:
np.linspace(0,2,9)

array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  ])
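
When a floating-point step is needed, `linspace` keeps the length predictable because the number of points is an explicit argument. The sketch below uses two optional `linspace` keywords, `retstep` and `endpoint`, which the cells above do not cover.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive points
values, step = np.linspace(0, 2, 9, retstep=True)
print(len(values), step)     # 9 0.25

# endpoint=False leaves out the stop value, like the half-open range of arange
half_open = np.linspace(0, 1, 10, endpoint=False)
print(len(half_open))        # 10
print(half_open[-1] < 1)     # True
```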

## Random Number Generation
**Uniform Distribution**: all outcomes are equally likely.  
**Normal Distribution**: most of the observations cluster around the central peak and the probabilities for values further away from the mean taper off equally in both directions.  
To generate uniformly distributed random numbers in the interval [0, 1), use the `rand()` function from `np.random`. To draw numbers from the standard normal distribution (mean 0, variance 1), use `randn()`.

In [11]:
# 10 random numbers in an array
np.random.rand(10) 

array([0.98702906, 0.54197758, 0.34796677, 0.29545201, 0.38748242,
       0.37096783, 0.17577888, 0.43761755, 0.74892813, 0.12136158])

In [12]:
# 2*3 matrix of random numbers
np.random.rand(2,3)

array([[0.16552314, 0.55340241, 0.2924407 ],
       [0.1037536 , 0.56710753, 0.83812335]])

In [13]:
# 1*2 matrix of random numbers drawn from the standard normal distribution
np.random.randn(1,2)

array([[-0.28523896, -0.60377998]])

To generate random integers within a defined range, use `randint(low, high)`. The lower bound is included and the upper bound is excluded. Setting a seed makes the generator produce the same sequence of random numbers on every run.

In [14]:
np.random.randint(10,20) # return one value

17

In [15]:
np.random.randint(1,10,20) # returns an array with 20 elements

array([5, 7, 4, 9, 1, 7, 9, 4, 8, 2, 3, 6, 4, 2, 1, 6, 8, 2, 6, 8])

In [16]:
# generating random numbers using a seed
np.random.seed(15)
np.random.rand(10)

array([0.8488177 , 0.17889592, 0.05436321, 0.36153845, 0.27540093,
       0.53000022, 0.30591892, 0.30447436, 0.11174128, 0.24989901])
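
The effect of the seed and the half-open range of `randint` can both be checked directly. This is a small sketch; the last part uses `np.random.default_rng`, the newer Generator interface available in recent NumPy releases, which is not used elsewhere in this notebook and is shown only as an alternative.

```python
import numpy as np

np.random.seed(15)
first = np.random.rand(3)

np.random.seed(15)           # resetting the seed restarts the same sequence
second = np.random.rand(3)
print(np.array_equal(first, second))           # True

# randint excludes the upper bound: values fall in 1..9, never 10
draws = np.random.randint(1, 10, 1000)
print(draws.min() >= 1 and draws.max() <= 9)   # True

# Generator interface: each generator carries its own seed and state
rng_a = np.random.default_rng(15)
rng_b = np.random.default_rng(15)
print(np.array_equal(rng_a.random(3), rng_b.random(3)))   # True
```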

## Reshaping Array
An array can be reshaped using `reshape`, provided the new shape holds exactly the same number of elements as the original.

In [17]:
arr = np.random.rand(2,2)
arr,arr.shape

(array([[0.9176299 , 0.26414685],
        [0.71777369, 0.86571503]]),
 (2, 2))

In [18]:
# reshape is not an in-place operation; it returns a new array object
arr.reshape(1,4)

array([[0.9176299 , 0.26414685, 0.71777369, 0.86571503]])

In [19]:
arr

array([[0.9176299 , 0.26414685],
       [0.71777369, 0.86571503]])

In [20]:
arr.reshape(4,1)

array([[0.9176299 ],
       [0.26414685],
       [0.71777369],
       [0.86571503]])
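
The rule that the element count must match can be seen with a small array. The sketch below also uses `-1` as a dimension, which asks NumPy to infer that dimension from the others; the array `arr` here is a fresh 6-element example, not the random array from the cells above.

```python
import numpy as np

arr = np.arange(6)                 # 6 elements: [0 1 2 3 4 5]

print(arr.reshape(2, 3).shape)     # (2, 3)
print(arr.reshape(3, -1).shape)    # (3, 2); -1 lets NumPy infer the remaining dimension

# 6 elements cannot fill a 4x2 grid, so NumPy raises ValueError
try:
    arr.reshape(4, 2)
except ValueError as err:
    print("reshape failed:", err)

# reshape leaves arr untouched; assign the result to keep it
reshaped = arr.reshape(2, 3)
print(arr.shape)                   # (6,)
```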

## Slicing an Array
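Slicing a 1D array works like slicing a Python list and uses square brackets. Taking a slice does not change the original array, but the slice is a view of the same data, so writing to the slice also changes the original. To get an independent copy, call `copy()` on the slice. Slicing a multi-dimensional array takes one index or slice per axis, separated by commas.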

In [21]:
a = np.arange(1,10,2)
a

array([1, 3, 5, 7, 9])

In [22]:
a[2:3], a[:3]

(array([5]), array([1, 3, 5]))

In [23]:
c = a[2:].copy() # independent copy of the slice
a, c

(array([1, 3, 5, 7, 9]), array([5, 7, 9]))

In [24]:
# Multi-dimensional array slicing
a=np.array([[1,2],[3,4],[5,6]])
a[:,1]

array([2, 4, 6])
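
The difference between a view and a copy only shows up when the slice is modified. The following sketch makes that visible; `np.shares_memory` is used here purely as a check and is not needed in normal code.

```python
import numpy as np

a = np.arange(1, 10, 2)        # [1 3 5 7 9]

view = a[1:3]                  # basic slice: a view onto a's data
view[0] = 100                  # writes through to a
print(a)                       # [  1 100   5   7   9]

independent = a[1:3].copy()    # explicit copy with its own buffer
independent[0] = -1            # does not touch a
print(a)                       # [  1 100   5   7   9]

print(np.shares_memory(a, view), np.shares_memory(a, independent))   # True False

# multi-dimensional slicing: one index or slice per axis
m = np.array([[1, 2], [3, 4], [5, 6]])
print(m[0, 1])                 # 2 (row 0, column 1)
print(m[1:, :])                # rows 1 and 2, all columns
```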

## Computations with array
Arithmetic and comparison operations on a NumPy array are applied to each element of that array.

In [25]:
a = np.array([1,2,7,2,9,7,5,4,3,2])
a>3

array([False, False,  True, False,  True,  True,  True,  True, False,
       False])

In [26]:
a[a>3]

array([7, 9, 7, 5, 4])

In [27]:
a**2

array([ 1,  4, 49,  4, 81, 49, 25, 16,  9,  4])

In [28]:
a+10

array([11, 12, 17, 12, 19, 17, 15, 14, 13, 12])

In [29]:
# element-wise addition for arrays of the same shape
a+a

array([ 2,  4, 14,  4, 18, 14, 10,  8,  6,  4])
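
Element-wise operations also work between arrays of different but compatible shapes, a mechanism NumPy calls broadcasting. The sketch below goes slightly beyond the cells above: it adds a 1D row to a 2D array, shows an incompatible pair, and combines two conditions in a boolean mask. The names `matrix` and `row` are made up for the example.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])                # shape (3,)

# row is broadcast across both rows of matrix
total = matrix + row
print(total)
# [[11 22 33]
#  [14 25 36]]

# shapes (5,) and (3,) are incompatible, so NumPy raises ValueError
try:
    np.arange(5) + np.arange(3)
except ValueError as err:
    print("cannot broadcast:", err)

# conditions combine with & (and) and | (or), each wrapped in parentheses
a = np.array([1, 2, 7, 2, 9, 7, 5, 4, 3, 2])
print(a[(a > 3) & (a < 8)])    # [7 7 5 4]
```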

## Statistical operations with array

In [30]:
a = np.array([12, 34, 23, 92, 76, 54, 76, 83, 91])
np.mean(a)

60.111111111111114

In [31]:
np.median(a)

76.0

In [32]:
np.std(a) # standard deviation

28.695506595378866

In [33]:
np.corrcoef(a) # correlation coefficient; a single 1D input is only correlated with itself, so the result is 1

1.0

In [34]:
np.var(a) # variance

823.4320987654321
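
The statistical functions above also accept extra arguments. The sketch below, using made-up data, shows three of them: `axis` to compute a statistic per row or column, a second input to `corrcoef` so the result is a correlation matrix between two variables, and `ddof=1` to get the sample rather than the population standard deviation.

```python
import numpy as np

scores = np.array([[12, 34, 23],
                   [92, 76, 54],
                   [76, 83, 91]])

col_means = np.mean(scores, axis=0)   # one mean per column
row_means = np.mean(scores, axis=1)   # one mean per row
print(col_means, row_means)

# correlation between two variables: the result is a 2x2 matrix
x = np.array([1, 2, 3, 4, 5])
y = 2 * x + 1                         # perfectly linear in x
r = np.corrcoef(x, y)
print(r.shape)                        # (2, 2)
print(np.isclose(r[0, 1], 1.0))       # True

# np.std divides by N by default; ddof=1 divides by N - 1 (sample estimate)
a = np.array([12, 34, 23, 92, 76, 54, 76, 83, 91])
print(np.std(a) < np.std(a, ddof=1))  # True
```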