<a href="https://colab.research.google.com/github/ROHIT318/python-practice/blob/main/learn_numpy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Topics to be covered for learning numpy:**
1. Numpy array object, attributes, methods, declaration
2. Indexing and slicing on numpy arrays
3. Mathematical and Statistical operations on arrays
4. Array manipulation like reshaping, stacking and splitting arrays
5. Matrix multiplication, eigenvalues and eigenvectors in numpy arrays
6. `numpy.random` in numpy
7. Techniques for improving the performance of numpy arrays
8. Read and write data to or from files using numpy
9. How to manage memory efficiently in numpy arrays

**What is the numpy library?**
- Numpy arrays are more memory efficient and faster than Python lists, especially for large datasets
- Mathematical operations can be applied directly to entire arrays, unlike Python lists. Ex: `arr1 * arr2` multiplies element by element for numpy arrays but raises a `TypeError` for two Python lists
- Numpy provides a wide range of functions for linear algebra and statistical analysis, which are either not available for lists or are less efficient
- Numpy arrays have a fixed data type, allowing for more efficient memory storage and faster mathematical operations. Python lists can store elements of different data types, leading to higher memory overhead.
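
The points above can be sketched in a few lines. This is a minimal illustration that assumes only `numpy` is installed; the variable names (`list_a`, `arr_a`, `mixed`, ...) are made up for the example.

```python
import numpy as np

list_a = [1, 2, 3]
list_b = [4, 5, 6]

# Multiplying two Python lists is not defined and raises a TypeError
try:
    list_a * list_b
    list_product_failed = False
except TypeError:
    list_product_failed = True
print(f'list * list raises TypeError: {list_product_failed}')

# The same operation on numpy arrays is applied element by element
arr_a = np.array(list_a)
arr_b = np.array(list_b)
arr_product = arr_a * arr_b
print(f'array * array: {arr_product}')

# A numpy array has a single dtype; mixed int/float input is converted to float
mixed = np.array([1, 2.5, 3])
print(f'dtype of mixed int/float input: {mixed.dtype}')
```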

In [3]:
import numpy as np

arr1 = np.array([1,2,3,4,5])
print(f'One dimensional array: {arr1}')

arr2 = np.array([[1,2,3], [4,5,6]])
print(f'Two dimensional array: {arr2}')

One dimensional array: [1 2 3 4 5]
Two dimensional array: [[1 2 3]
 [4 5 6]]


In [None]:
import sys

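# Comparing size of python lists and numpy array
# Note: sys.getsizeof counts only the list object itself (not the ints it refers to),
# and nbytes counts only the array's data buffer (not the array object overhead)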
np_arr = np.array([1,2,3,4])
python_lists = [1,2,3,4]

print(f'Size of Python lists: {sys.getsizeof(python_lists)}')
print(f'Size of numpy array: {np_arr.nbytes}')

Size of Python lists: 88
Size of numpy array: 32
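
The two numbers above are not measured the same way: `sys.getsizeof` on a list reports only the list's own storage for references, while `nbytes` reports only the array's raw data. A rough sketch of a fairer comparison follows. It assumes CPython, uses an explicit `np.int64` dtype, and treats the summed per-object sizes as an approximation, so exact byte counts will vary by platform and Python version.

```python
import sys
import numpy as np

n = 1000
python_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# The list object itself stores only references to its elements
list_container_bytes = sys.getsizeof(python_list)
# Approximate total: container plus every int object it refers to
list_total_bytes = list_container_bytes + sum(sys.getsizeof(x) for x in python_list)

# nbytes is the raw data buffer: number of elements * bytes per element
array_data_bytes = np_arr.nbytes

print(f'List container only: {list_container_bytes} bytes')
print(f'List plus its int objects (approx.): {list_total_bytes} bytes')
print(f'Numpy data buffer: {array_data_bytes} bytes')
```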


# Indexing and Slicing in Numpy
Indexing and slicing allow you to access and manipulate elements within arrays.

**Indexing:**
- Indexing lets you access individual elements in a numpy array
- Indices are zero-based
- Syntax for a 2-D array: `arr_var[row_index, column_index]`

In [None]:
print(f'First row, first column: {arr2[0,0]}')
print(f'Last row, last column: {arr2[1,2]}')

First row, first column: 1
Last row, last column: 6
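
A few more indexing forms, shown as a small sketch. The array name `grid` is chosen for this example so that `arr2` from the cells above is left untouched.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

# Negative indices count from the end of each axis
last_element = grid[-1, -1]
print(f'Last row, last column via negative indices: {last_element}')

# A single index on a 2-D array selects a whole row
second_row = grid[1]
print(f'Second row: {second_row}')

# Assigning through an index changes the array in place
grid[0, 0] = 10
print(f'After assignment: \n{grid}')
```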


**Slicing:**
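- Slicing extracts a portion of the array
- Basic slicing returns a view of the selected elements, not a new copy, so changes made through the slice also change the original array; call `.copy()` for an independent array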

In [None]:
print(f'First row, last two elements: {arr2[0, 1:]}')
print(f'First row, first two elements: {arr2[0, :2]}')
print(f'Both rows, first two elements: {arr2[:, 0:2]}')

First row, last two elements: [2 3]
First row, first two elements: [1 2]
Both rows, first two elements: [[1 2]
 [4 5]]
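
Basic slicing in numpy returns a view that shares memory with the original array. The sketch below illustrates the difference between a view and an explicit copy; the names `base`, `view` and `independent` are only for this example.

```python
import numpy as np

base = np.array([[1, 2, 3], [4, 5, 6]])

# A basic slice shares memory with the array it came from
view = base[:, 0:2]
view[0, 0] = 99  # this also changes base[0, 0]

# An explicit copy owns its own data
independent = base[:, 0:2].copy()
independent[0, 1] = -1  # base is unaffected

print(f'base after editing the view and the copy: \n{base}')
print(f'view shares memory with base: {np.shares_memory(base, view)}')
print(f'copy shares memory with base: {np.shares_memory(base, independent)}')
```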


**Advanced Indexing:**
- Supports indexing with boolean arrays or integer arrays to select elements based on conditions or positions; unlike basic slicing, this returns a copy

In [None]:
new_arr = arr2[arr2 > 1]
print(f'Creating new array using advanced indexing {new_arr}')

Creating new array using advanced indexing [2 3 4 5 6]
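
The cell above uses a boolean condition. A short sketch of the integer-array form follows, together with a check that advanced indexing produces a copy. The arrays `values` and `grid` are illustrative.

```python
import numpy as np

values = np.array([10, 20, 30, 40, 50])

# Integer array indexing: pick the elements at positions 0, 2 and 4
picked = values[[0, 2, 4]]
print(f'Picked by position: {picked}')

# Boolean mask built from a condition
mask = values % 20 == 0
multiples_of_20 = values[mask]
print(f'Multiples of 20: {multiples_of_20}')

# On a 2-D array, paired index lists select elements (0, 2) and (1, 0)
grid = np.array([[1, 2, 3], [4, 5, 6]])
pairs = grid[[0, 1], [2, 0]]
print(f'Elements at (0,2) and (1,0): {pairs}')

# Advanced indexing returns a copy, so the original stays unchanged
picked[0] = -1
print(f'Original after editing the result: {values}')
```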


# Mathematical and Statistical operations in numpy

**Element wise operations**

In [None]:
arr_1 = np.array([1,0,2])
arr_2 = np.array([2,1,6])

arr_add = arr_1 + arr_2
print(f'Addition of two numpy arrays: {arr_add}')

arr_sub = arr_2 - arr_1
print(f'Subtraction of two numpy arrays: {arr_sub}')

arr_mul = arr_1 * arr_2
print(f'Multiplication of two numpy arrays: {arr_mul}')

# 1 / 0 yields inf and emits a RuntimeWarning instead of raising an exception
arr_div = arr_2 / arr_1
print(f'Division of two numpy arrays: {arr_div} (division by zero on purpose)')

Addition of two numpy arrays: [3 1 8]
Subtraction of two numpy arrays: [1 1 4]
Multiplication of two numpy arrays: [ 2  0 12]
Division of two numpy arrays: [ 2. inf  3.] (division by zero on purpose)
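
When a division by zero is not intended, there are a couple of ways to handle it. The following is a sketch of two common options, not the only approach; the names `numerator` and `denominator` and the fill value `0.0` are assumptions made for the example.

```python
import numpy as np

numerator = np.array([2, 1, 6])
denominator = np.array([1, 0, 2])

# Option 1: silence the warning locally and accept inf in the result
with np.errstate(divide='ignore'):
    with_inf = numerator / denominator
print(f'Division with inf kept: {with_inf}')

# Option 2: divide only where the denominator is non-zero,
# leaving the pre-filled value (0.0 here) everywhere else
safe = np.divide(
    numerator,
    denominator,
    out=np.zeros(numerator.shape, dtype=float),
    where=denominator != 0,
)
print(f'Division with zeros skipped: {safe}')
```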



**Universal Functions (ufuncs):**
- Universal functions (ufuncs) are functions that operate element-wise on arrays

In [None]:
arr_sqrt = np.sqrt(arr_1)
print(f'Square root of each numpy array element: {arr_sqrt}')

arr_exp = np.exp(arr_1)
print(f'Exponentiation of each numpy array element: {arr_exp}')

Square root of each numpy array element: [1.         0.         1.41421356]
Exponentiation of each numpy array element: [2.71828183 1.         7.3890561 ]
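
To make "element-wise" concrete, the sketch below compares a ufunc with the equivalent Python loop and shows a ufunc that takes two arrays. The sample data is made up for the example.

```python
import math
import numpy as np

data = np.array([1.0, 4.0, 9.0])

# Plain Python: call math.sqrt once per element
looped = np.array([math.sqrt(x) for x in data])

# ufunc: one call handles the whole array
vectorized = np.sqrt(data)
print(f'Loop result: {looped}, ufunc result: {vectorized}')

# Binary ufuncs combine two arrays element by element
larger = np.maximum(np.array([1, 5, 2]), np.array([3, 4, 6]))
print(f'Element-wise maximum: {larger}')

print(f'np.sqrt is a ufunc: {isinstance(np.sqrt, np.ufunc)}')
```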


# Statistical Operations:
- Mean: Also known as average, it represents the sum of a set of values divided by the number of values in that set
- Median: The middle value of the dataset once it is sorted. If the number of elements is even, it is the average of the two middle elements
- Standard Deviation: It is a measure of the amount of variation or dispersion in a set of values. A low standard deviation indicates that the values tend to be close to the mean, while a high standard deviation indicates that the values are spread out over a wider range. It is computed as follows:
    1. Find the mean of all elements
    2. Calculate the squared difference from the mean for each element
    3. Find the average of these squared differences (this is the variance)
    4. Take the square root of the variance
- Sum: Sum of all the elements in the array

In [None]:
mean_val = np.mean(arr_1)
print(f'Mean value of numpy array: {mean_val}')

median_val = np.median(arr_1)
print(f'Median value of numpy array: {median_val}')

std_dev_val = np.std(arr_1)
print(f'Standard deviation of numpy array: {std_dev_val}')

sum_val = np.sum(arr_1)
print(f'Sum of numpy array: {sum_val}')

Mean value of numpy array: 1.0
Median value of numpy array: 1.0
Standard deviation of numpy array: 0.816496580927726
Sum of numpy array: 3
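
The standard deviation can also be worked out by hand: take the mean, the squared differences from it, their average (the variance), and finally the square root. The sketch below does this for the same values and compares the result with `np.std`. The `ddof=1` line shows the sample version and is included only for contrast.

```python
import numpy as np

arr = np.array([1, 0, 2])

mean_val = arr.sum() / arr.size             # (1 + 0 + 2) / 3 = 1.0
squared_diffs = (arr - mean_val) ** 2       # [0., 1., 1.]
variance = squared_diffs.sum() / arr.size   # 2 / 3
std_manual = np.sqrt(variance)

print(f'Manual standard deviation: {std_manual}')
print(f'np.std result: {np.std(arr)}')

# np.std divides by N by default (ddof=0); ddof=1 divides by N - 1 instead
std_sample = np.std(arr, ddof=1)
print(f'Sample standard deviation (ddof=1): {std_sample}')
```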


# Array manipulation like reshaping, stacking and splitting arrays


**Reshaping Arrays:**
- Change the shape or dimensions of an array without changing its data
- The `reshape` method (or the `np.reshape` function) is used; the new shape must contain the same total number of elements

In [None]:
arr_1d = np.array([1,2,3,4,5,6])

arr_2d = arr_1d.reshape(2,3)

print(f'Original array 1-dimensional: {arr_1d}')
print(f'Converted 2-dimensional: {arr_2d}')

Original array 1-dimensional: [1 2 3 4 5 6]
Converted 2-dimensional: [[1 2 3]
 [4 5 6]]
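
Two details of `reshape` that are easy to miss, shown as a small sketch: passing `-1` lets numpy infer one dimension, and a shape with a different total number of elements raises a `ValueError`. The names are illustrative.

```python
import numpy as np

flat = np.arange(1, 7)  # [1 2 3 4 5 6]

# -1 asks numpy to infer that dimension from the total size (6 / 3 = 2)
inferred = flat.reshape(3, -1)
print(f'Shape with inferred dimension: {inferred.shape}')

# reshape(-1) flattens back to one dimension
back = inferred.reshape(-1)
print(f'Flattened again: {back}')

# 6 elements cannot fill a 4x2 array, so numpy raises ValueError
try:
    flat.reshape(4, 2)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(f'Mismatched shape raises ValueError: {mismatch_raised}')
```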


**Stacking Arrays:**
- Combine multiple arrays into a single larger array
- `np.vstack` and `np.hstack` are used for vertical (row-wise) and horizontal (column-wise) stacking respectively; 1-D inputs to `np.vstack` are treated as rows

In [None]:
vertical_stack_arr = np.vstack((arr_1d, arr_1d))
print(f'Vertically stacked array: \n{vertical_stack_arr}')

horizontal_stack_arr = np.hstack((arr_2d, arr_2d))
print(f'Horizontally stacked array: \n{horizontal_stack_arr}')

Vertically stacked array: 
[[1 2 3 4 5 6]
 [1 2 3 4 5 6]]
Horizontally stacked array: 
[[1 2 3 1 2 3]
 [4 5 6 4 5 6]]
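
Looking at the resulting shapes helps to see what each stacking function does. The sketch below also uses `np.concatenate` and `np.stack`, which are not covered in the notes above and are included here only for comparison.

```python
import numpy as np

row = np.array([1, 2, 3])
table = np.array([[1, 2, 3], [4, 5, 6]])

v = np.vstack((row, row))        # 1-D inputs become rows
h1 = np.hstack((row, row))       # 1-D inputs are joined end to end
h2 = np.hstack((table, table))   # 2-D inputs are joined column-wise
print(f'vstack of 1-d arrays: {v.shape}')
print(f'hstack of 1-d arrays: {h1.shape}')
print(f'hstack of 2-d arrays: {h2.shape}')

# For 2-D inputs, hstack matches concatenate along axis 1
same_as_concat = np.array_equal(h2, np.concatenate((table, table), axis=1))
print(f'hstack equals concatenate(axis=1): {same_as_concat}')

# np.stack is the function that joins arrays along a brand-new axis
new_axis = np.stack((table, table))
print(f'np.stack of two 2x3 arrays: {new_axis.shape}')
```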


**Splitting Arrays:**
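- Split an array into multiple smaller arrays
- `np.split(arr, num)` splits an array into `num` equal parts along the first axis (axis 0) by default and raises a `ValueError` if an equal split is not possible
- `np.hsplit(arr, num)` splits an array into `num` equal parts along the horizontal axis (columns for 2-D arrays; for 1-D arrays it splits along the only axis)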

In [None]:
print(f'1-d array: \n{arr_1d}')
print(f'2-d array: \n{arr_2d}')
print('----')

arr_1d_split = np.split(arr_1d,3)
arr_2d_split = np.split(arr_2d,2)
print(f'Split 1-d array: \n{arr_1d_split}')
print(f'Split 2-d array: \n{arr_2d_split}')
print('----')

arr_1d_hsplit = np.hsplit(arr_1d, 3)
arr_2d_hsplit = np.hsplit(arr_2d, 3)
print(f'Horizontally split 1-d array: \n{arr_1d_hsplit}')
print(f'Horizontally split 2-d array: \n{arr_2d_hsplit}')

1-d array: 
[1 2 3 4 5 6]
2-d array: 
[[1 2 3]
 [4 5 6]]
----
Split 1-d array: 
[array([1, 2]), array([3, 4]), array([5, 6])]
Split 2-d array: 
[array([[1, 2, 3]]), array([[4, 5, 6]])]
----
Horizontally split 1-d array: 
[array([1, 2]), array([3, 4]), array([5, 6])]
Horizontally split 2-d array: 
[array([[1],
       [4]]), array([[2],
       [5]]), array([[3],
       [6]])]
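
When an array cannot be divided into equal parts, `np.split` raises an error. The sketch below shows that case along with two alternatives: splitting at explicit indices, and `np.array_split`, which is not mentioned in the notes above and allows uneven parts.

```python
import numpy as np

arr = np.arange(1, 8)  # 7 elements: [1 2 3 4 5 6 7]

# 7 elements cannot be divided into 3 equal parts
try:
    np.split(arr, 3)
    raised = False
except ValueError:
    raised = True
print(f'Unequal np.split raises ValueError: {raised}')

# A list of indices splits before positions 2 and 5
by_index = np.split(arr, [2, 5])
print(f'Split at indices 2 and 5: {by_index}')

# np.array_split tolerates uneven parts
uneven = np.array_split(arr, 3)
print(f'array_split into 3 parts: {uneven}')
```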


**Matrix Multiplication**

In [5]:
arr_1 = np.array([[1,2], [3,4]])
arr_2 = np.array([[5,6], [7,8]])

# For 2-D arrays np.dot performs matrix multiplication (same as arr_1 @ arr_2)
arr_mult = np.dot(arr_1, arr_2)
print(f'Result of array multiplication: \n{arr_mult}')

Result of array multiplication: 
[[19 22]
 [43 50]]
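
Matrix multiplication is easy to confuse with the element-wise `*` used earlier. The following sketch contrasts the two on the same 2x2 matrices and shows how shapes combine; the names `a`, `b`, `left` and `right` are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Matrix product: rows of a combined with columns of b
matrix_product = a @ b
print(f'Matrix product: \n{matrix_product}')

# Element-wise product: each entry multiplied by the entry in the same position
elementwise = a * b
print(f'Element-wise product: \n{elementwise}')

# Shapes: (2, 3) @ (3, 2) gives (2, 2); the inner dimensions must match
left = np.ones((2, 3))
right = np.ones((3, 2))
print(f'Shape of (2,3) @ (3,2): {(left @ right).shape}')
```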


# `numpy.random` in numpy

- `np.random.rand(d1,d2,....,dn)`: Generates an array of random floats drawn uniformly from [0,1) with shape (d1,d2,....,dn)
- `np.random.randint(low,high,size=(d1,d2,....,dn))`: Generates an array of random integers from low (inclusive) to high (exclusive) with shape (d1,d2,....,dn)
- `np.random.randn(d1,d2,....,dn)`: Generates an array of random floats from a standard normal distribution (mean 0, standard deviation 1) with shape (d1,d2,....,dn)

In [6]:
arr_rand = np.random.rand(2,3)
print(f'Created random array using function rand and shape 2,3: \n{arr_rand}')

arr_randint = np.random.randint(5,10,size=(3,3))
print(f'Created random array using function randint and shape 3,3: \n{arr_randint}')

arr_randn = np.random.randn(2,4)
print(f'Created random array using function randn and shape 2,4: \n{arr_randn}')

Created random array using function rand and shape 2,3: 
[[0.45619214 0.22237464 0.30328478]
 [0.07284764 0.90957257 0.05040259]]
Created random array using function randint and shape 3,3: 
[[9 7 8]
 [9 6 8]
 [6 6 9]]
Created random array using function randn and shape 2,4: 
[[ 0.0704172  -0.52884912 -0.40620271 -0.54487311]
 [ 2.33584872 -2.51647414  0.79962454 -0.921952  ]]
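
The outputs above change on every run. For reproducible results, one option is numpy's `Generator` interface created with `np.random.default_rng`. The sketch below mirrors the three calls above with their `Generator` counterparts; the seed value `42` is arbitrary.

```python
import numpy as np

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=42)

uniform = rng.random((2, 3))                  # floats in [0, 1)
integers = rng.integers(5, 10, size=(3, 3))   # ints from 5 (inclusive) to 10 (exclusive)
normal = rng.standard_normal((2, 4))          # standard normal distribution

print(f'Uniform floats: \n{uniform}')
print(f'Random integers: \n{integers}')
print(f'Standard normal: \n{normal}')

# A second generator with the same seed repeats the first draw exactly
rng_again = np.random.default_rng(seed=42)
repeat = rng_again.random((2, 3))
print(f'Same seed gives same values: {np.array_equal(uniform, repeat)}')
```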
