# NumPy basics with Python
### NumPy is one of the most fundamental and powerful packages in Python. Every data scientist should know it, since many other packages, such as pandas and scikit-learn, are built on top of it.

### At its core, NumPy provides the ndarray object, short for n-dimensional array. An ndarray, also known as a NumPy array, stores multiple items of the same data type, which makes mathematical operations and data calculations efficient.


#### `Create Numpy Array`
#### `A NumPy array is a grid of values, all of the same type, indexed by a tuple of nonnegative integers. The number of dimensions is the *rank* of the array; the *shape* of an array is a tuple of integers giving the size of the array along each dimension. Arrays support vectorised operations while Python lists do not, so array operations are usually much faster.`

In [25]:
import numpy as np
list1 = [1,2,3,4]
arr = np.array(list1)

print(type(arr))
print(arr)
print(arr.shape)

<class 'numpy.ndarray'>
[1 2 3 4]
(4,)
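
The point about vectorised operations can be seen in a small sketch. The names `values` and `vec` are introduced here only for illustration and are not part of the cells above.

```python
import numpy as np

values = [1, 2, 3, 4]          # plain Python list (illustrative name)
vec = np.array(values)         # the same data as a NumPy array

doubled_list = values * 2      # list repetition: [1, 2, 3, 4, 1, 2, 3, 4]
doubled_vec = vec * 2          # elementwise multiplication: [2 4 6 8]

print(doubled_list)
print(doubled_vec)
print(vec.ndim, vec.shape)     # rank (number of dimensions) and shape: 1 (4,)
```

Multiplying a list repeats it, while multiplying an array applies the operation to every element without an explicit loop.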


#### `Pass a list of lists to create a matrix-like 2D array`

In [26]:
list2 = [[1,2,3], [3,4,5], [4,5,6]]
arr2 = np.array(list2)

print(type(arr2))
print(arr2)
print(arr2.shape)

<class 'numpy.ndarray'>
[[1 2 3]
 [3 4 5]
 [4 5 6]]
(3, 3)


#### `We can also specify the data type of the array, for example 'float' (float64) or 'bool'`

In [27]:
arr2_f = np.array(list2, dtype='float')
print(arr2_f)
arr3_f = np.array(list2, dtype="bool")
print(arr3_f)

[[1. 2. 3.]
 [3. 4. 5.]
 [4. 5. 6.]]
[[ True  True  True]
 [ True  True  True]
 [ True  True  True]]
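
The boolean output above is all `True` because `list2` contains no zeros. A minimal sketch with illustrative data (`ints` is a name made up for this example) shows the conversion rule, and also how an existing array can be converted with `astype`:

```python
import numpy as np

ints = np.array([[0, 1, 2], [3, 0, 5]])   # illustrative data containing zeros

as_float = ints.astype(np.float64)        # convert an existing array to float64
as_bool = ints.astype(bool)               # zero becomes False, anything else True

print(as_float.dtype)   # float64
print(as_bool)
```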


***
#### `Some ways to create a NumPy array filled with given values`

In [28]:
zero = np.zeros((2,2))      # Fills the array with zeros
print(zero)
one = np.ones((1,2))        # Fills the array with ones
print(one)
full = np.full((2,2), 7)    # Fills the array with the given value
print(full)
eye = np.eye(2)             # n x n identity matrix: ones on the diagonal, zeros elsewhere
print(eye)
rand = np.random.random((2,2))  # Fills the array with random values from [0, 1)
print(rand)

[[0. 0.]
 [0. 0.]]
[[1. 1.]]
[[7 7]
 [7 7]]
[[1. 0.]
 [0. 1.]]
[[0.00471886 0.12156912]
 [0.67074908 0.82585276]]
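
A few related creation helpers may also be useful. This is a short sketch rather than a complete list, and `template` is an illustrative name:

```python
import numpy as np

template = np.array([[1, 2, 3], [4, 5, 6]])   # illustrative array

zeros_like = np.zeros_like(template)   # zeros with template's shape and dtype
full_float = np.full((2, 2), 7.0)      # a float fill value gives a float array
eye3 = np.eye(3, dtype=int)            # 3x3 identity matrix with integer entries

print(zeros_like)
print(full_float)
print(eye3)
```

Note that `np.zeros`, `np.ones` and `np.eye` default to float64, which is why their outputs above are printed with a trailing dot.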


***
#### `Indexing`
#### `Method - 1 : Slicing`

In [31]:
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]

# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
b = a[:2, 1:3]
print(b)
# We can also modify an element by assigning to its index
a[2, 0] = 102

[[2 3]
 [6 7]]
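
One thing worth knowing about slicing is that a slice is a view onto the original data, not a copy. The sketch below uses illustrative names (`grid`, `window`, `independent`) to show this:

```python
import numpy as np

grid = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

window = grid[:2, 1:3]       # a view onto grid, not a copy
window[0, 0] = 77            # writes through to grid[0, 1]
print(grid[0, 1])            # 77

independent = grid[:2, 1:3].copy()   # explicit copy
independent[0, 0] = -1               # grid is unaffected
print(grid[0, 1])            # still 77
```

Call `.copy()` whenever the subarray should be modified without touching the original.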


#### `Integer Indexing`

In [32]:
print(a[2, 0])

102


#### `Mixing slicing and integer indexing`

In [33]:
row1 = a[1, :]
print(row1)
col1 = a[:, 1]
print(col1)

[5 6 7 8]
[ 2  6 10]
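
Mixing an integer index with a slice yields an array of lower rank, while using only slices keeps the rank. A small sketch with an illustrative array `grid`:

```python
import numpy as np

grid = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row_int = grid[1, :]      # integer index drops a dimension -> shape (4,)
row_slice = grid[1:2, :]  # slice keeps the dimension       -> shape (1, 4)
col_int = grid[:, 1]      # shape (3,)
col_slice = grid[:, 1:2]  # shape (3, 1)

print(row_int.shape, row_slice.shape)
print(col_int.shape, col_slice.shape)
```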


#### `Boolean array indexing`

In [34]:
b = a > 4
print(b)
print(a[b])
print(a[a > 4])

[[False False False False]
 [ True  True  True  True]
 [ True  True  True  True]]
[  5   6   7   8 102  10  11  12]
[  5   6   7   8 102  10  11  12]
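
Boolean masks can also be combined and used for assignment. The following is a minimal sketch on illustrative data (`data`, `mask` and `clipped` are names chosen for this example):

```python
import numpy as np

data = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
mask = (data > 4) & (data % 2 == 0)
print(data[mask])          # [ 6  8 10 12]

clipped = data.copy()
clipped[clipped > 10] = 10   # assign through a boolean mask
print(clipped)
```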


#### `Array math`

In [35]:
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum
print(np.add(x, y))

# Elementwise difference
print(np.subtract(x, y))

# Elementwise product
print(np.multiply(x, y))

# Elementwise division
print(np.divide(x, y))

# Elementwise square root
print(np.sqrt(x))

# Column-wise and row-wise minimum
print("Column wise minimum: ", np.amin(x, axis=0))
print("Row wise minimum: ", np.amin(x, axis=1))

# mean, max and min
print("Mean value is: ", x.mean())
print("Max value is: ", x.max())
print("Min value is: ", x.min())

[[ 6.  8.]
 [10. 12.]]
[[-4. -4.]
 [-4. -4.]]
[[ 5. 12.]
 [21. 32.]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[1.         1.41421356]
 [1.73205081 2.        ]]
Column wise minimum:  [1. 2.]
Row wise minimum:  [1. 3.]
Mean value is:  2.5
Max value is:  4.0
Min value is:  1.0
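
The functions above also have operator forms, and it is easy to confuse the elementwise product with the matrix product. A short sketch using the same values of `x` and `y`:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)

total = x + y          # same as np.add(x, y)
elementwise = x * y    # same as np.multiply(x, y)
matrix_prod = x @ y    # matrix product, same as np.matmul(x, y)

print(elementwise)     # [[ 5. 12.] [21. 32.]]
print(matrix_prod)     # [[19. 22.] [43. 50.]]
```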


***

#### `Flatten and Ravel`
#### `Methods to flatten data`
#### `The difference between ravel and flatten is that flatten always returns a copy, while ravel returns a view of the parent array whenever possible, so changes to the raveled array can affect the parent. This makes ravel more memory efficient. When a view is not possible, for example on a transposed (non-contiguous) array, ravel returns a copy as well.`

In [36]:
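# Transpose (the result is a non-contiguous view of the original data)
arr2 = arr2.T

b1 = arr2.flatten()
b1[0] = 100  # flatten always returns a copy, so arr2 is unchanged

b2 = arr2.ravel()
b2[0] = 101  # arr2 is transposed and not contiguous, so ravel had to copy here: arr2 is unchanged
arr2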

array([[1, 3, 4],
       [2, 4, 5],
       [3, 5, 6]])
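
Whether `ravel` returns a view or a copy depends on the memory layout, which explains why `arr2` is unchanged in the output above. The sketch below checks this with `np.shares_memory`; `base` and `flat` are illustrative names:

```python
import numpy as np

base = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])   # C-contiguous array

print(np.shares_memory(base, base.flatten()))    # False: flatten always copies
print(np.shares_memory(base, base.ravel()))      # True: ravel returns a view here
print(np.shares_memory(base.T, base.T.ravel()))  # False: the transpose is not contiguous, so ravel copies

flat = base.ravel()
flat[0] = 101          # writes through to base because flat is a view
print(base[0, 0])      # 101
```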

***
#### `Sequences, repetitions and random numbers`

In [37]:
np.arange(5)            # 0 to 4; the lower limit is 0 by default
np.arange(0, 10)        # 0 to 9
np.arange(0, 10, 2)     # 0 to 8 with step 2 (the stop value is excluded)
np.arange(10, 0, -1)    # 10 down to 1 with step -1

# Use linspace to get a specific number of evenly spaced items without giving the step explicitly
np.linspace(start=1, stop=50, num=10, dtype=int)

# Use logspace to get items evenly spaced on a logarithmic scale, here from 10**1 to 10**50
np.logspace(start=1, stop=50, num=10, base=10)

np.random.rand(2, 2)            # Random numbers in [0, 1) of shape 2,2
np.random.randn(2, 2)           # Standard normal distribution (mean=0, variance=1) of shape 2,2
np.random.randint(0, 10, size=[2,2])    # Random integers in [0, 10) of shape 2,2
np.random.choice(['a', 'b', 'c'], size=10)  # Pick 10 items from the given list, with equal probability


# To get the same random numbers on every run, set the seed; any integer works, as long as it is always the same one
np.random.seed(100)

# Create random numbers between [0,1) of shape 2,2
print(np.random.rand(2,2))

[[0.54340494 0.27836939]
 [0.42451759 0.84477613]]
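
Since the cell above does not print most of its results, here is a small sketch with values that are easy to check by hand. It also uses `np.tile` and `np.repeat` for repetitions, which the cell above does not cover; all variable names are illustrative.

```python
import numpy as np

evens = np.arange(0, 10, 2)           # stop is excluded -> 0, 2, 4, 6, 8
quarters = np.linspace(0, 1, num=5)   # stop is included -> 0, 0.25, 0.5, 0.75, 1
powers = np.logspace(0, 3, num=4)     # 10**0 ... 10**3  -> 1, 10, 100, 1000

tiled = np.tile([1, 2], 3)            # repeat the whole sequence -> 1, 2, 1, 2, 1, 2
repeated = np.repeat([1, 2], 3)       # repeat each item          -> 1, 1, 1, 2, 2, 2

# Re-seeding with the same value reproduces the same random numbers
np.random.seed(100)
first = np.random.rand(2, 2)
np.random.seed(100)
second = np.random.rand(2, 2)
print(np.array_equal(first, second))  # True
```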


***
#### `Handling missing and infinite values`

In [38]:
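# np.isnan returns a boolean array that is True where an item is NaN
missing_bool = np.isnan(arr2)
# Alternatively, np.isinf returns a boolean array that is True where an item is +inf or -inf
missing_bool = np.isinf(arr2)

# Replace the flagged items with -1
arr2[missing_bool] = -1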
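
NaN and infinity can only occur in floating-point arrays, so the integer `arr2` above never has any flagged items. A minimal sketch on illustrative float data (`readings` is a made-up name) shows both checks combined into one mask:

```python
import numpy as np

readings = np.array([1.0, np.nan, 3.0, np.inf, -np.inf])   # illustrative float data

bad = np.isnan(readings) | np.isinf(readings)   # True for NaN, +inf and -inf
# Equivalent shortcut: bad = ~np.isfinite(readings)

cleaned = readings.copy()
cleaned[bad] = -1          # replace every flagged item
print(cleaned)             # [ 1. -1.  3. -1. -1.]
```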

***
#### `Broadcasting`
#### `Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations.`


In [39]:
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = np.empty_like(x)   # Create an uninitialised array with the same shape and dtype as x

# Add the vector v to each row of the matrix x with an explicit loop
for i in range(4):
    y[i, :] = x[i, :] + v


x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = x + v  # Add v to each row of x using broadcasting
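
Broadcasting compares shapes from the last dimension backwards, so adding a vector to every column needs an extra axis. The sketch below reuses `x` and `v` from above; `w` is an illustrative vector added for this example.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])   # shape (4, 3)
v = np.array([1, 0, 1])                                        # shape (3,)
w = np.array([10, 20, 30, 40])                                 # shape (4,), illustrative

rows_added = x + v                  # (4, 3) + (3,)   -> v is added to every row
cols_added = x + w[:, np.newaxis]   # (4, 3) + (4, 1) -> w is added to every column

print(rows_added)
print(cols_added)
```

Writing `x + w` directly would raise a `ValueError`, because the shapes (4, 3) and (4,) do not line up on the last dimension.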

### `That's it for now! I will try to add more concepts and examples soon!`
