<h1><center>NUMPY (EDA)</center></h1>

NumPy is a linear algebra library for Python.
Almost all of the libraries in the data-related ecosystem rely on NumPy as one of their main building blocks.

Numpy is also incredibly fast, as it has bindings to C libraries. For more info on why you would want to use Arrays instead of lists, check out this great [StackOverflow post](http://stackoverflow.com/questions/993984/why-numpy-instead-of-python-lists).

NumPy arrays essentially come in two flavors: vectors and matrices. Vectors are strictly 1-d arrays and matrices are 2-d (but note that a matrix can still have only one row or one column). Arrays can also have more than two dimensions.

Exploratory Data Analysis (EDA) comprises the following:
- Descriptive Statistics
- Numpy
- Pandas
- Matplotlib
- Seaborn

## Topics Covered

1- Creating Numpy Arrays
 - From a Python List
 - From Built-in Array Creation Methods
   - arange
   - zeros
   - ones
   - linspace
   - full
   - eye
 - From Random
   - rand
   - randn
   - randint

2- Array Attributes and Methods
 - Reshape
 - max,min, argmax, argmin
 - shape
 - dtype
 - size
 - ndim

3- Operations on Arrays
 - copying
 - append and Insert
 - Sorting
 - Removing/Deleting
 - Combining/Concatenating
 - Splitting

4- Data Loading & Saving
 - save
 - load
 - savetxt
 - load txt file
 - csv

5- NumPy Indexing and Selection
 - Indexing a 2D array (matrices)
 - Logical Selection

6- Broadcasting

7- Type Casting

8- Numpy Operations
 - Arithmetic
   - Add, Subtract, Multiply, Divide, Exponentiation
 
9- Universal Array Functions
 - sqrt, exp, max, sin, log

In [1]:
import numpy as np

## 1- Creating NumPy Arrays

### From a Python List
We can create an array by directly converting a list or list of lists:

In [2]:
my_list = [1,2,3]
my_list

[1, 2, 3]

In [3]:
np.array(my_list)

array([1, 2, 3])

In [4]:
my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
my_matrix

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [5]:
np.array(my_matrix)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

### From Built-in Array Creation Methods
NumPy provides a number of built-in functions for generating arrays.

#### arange

Return evenly spaced values within a given interval.

In [6]:
np.arange(0,10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [7]:
np.arange(0,11,2) # values from 0 up to (but not including) 11, in steps of 2

array([ 0,  2,  4,  6,  8, 10])
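
As a small sketch of how the interval works, `arange` also accepts a fractional or negative step, and the stop value is always left out. The variable names below are only for illustration.

```python
import numpy as np

# The stop value (1) is excluded, so this yields 0, 0.25, 0.5 and 0.75.
quarters = np.arange(0, 1, 0.25)
print(quarters)

# A negative step counts down; the stop value (0) is again excluded.
countdown = np.arange(5, 0, -1)
print(countdown)
```

For fractional steps, `linspace` is often the safer choice because the number of points is stated explicitly.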

#### zeros and ones

Generate arrays of zeros or ones

In [8]:
np.zeros(3)

array([0., 0., 0.])

In [9]:
np.zeros((2,5,3), dtype='float16') # 2 blocks, each with 5 rows and 3 columns

array([[[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]], dtype=float16)

In [10]:
np.ones(3)

array([1., 1., 1.])

In [11]:
np.ones((3,3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

#### linspace
Return evenly spaced numbers over a specified interval.

In [12]:
np.linspace(0,10,3) # 3 evenly spaced values from 0 to 10 (both ends included)

array([ 0.,  5., 10.])

In [13]:
np.linspace(0,10,50) # 50 evenly spaced values from 0 to 10 (both ends included)

array([ 0.        ,  0.20408163,  0.40816327,  0.6122449 ,  0.81632653,
        1.02040816,  1.2244898 ,  1.42857143,  1.63265306,  1.83673469,
        2.04081633,  2.24489796,  2.44897959,  2.65306122,  2.85714286,
        3.06122449,  3.26530612,  3.46938776,  3.67346939,  3.87755102,
        4.08163265,  4.28571429,  4.48979592,  4.69387755,  4.89795918,
        5.10204082,  5.30612245,  5.51020408,  5.71428571,  5.91836735,
        6.12244898,  6.32653061,  6.53061224,  6.73469388,  6.93877551,
        7.14285714,  7.34693878,  7.55102041,  7.75510204,  7.95918367,
        8.16326531,  8.36734694,  8.57142857,  8.7755102 ,  8.97959184,
        9.18367347,  9.3877551 ,  9.59183673,  9.79591837, 10.        ])
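
To make the spacing explicit, here is a minimal sketch using the `retstep` and `endpoint` parameters of `linspace`. The variable names are illustrative only.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive samples.
samples, step = np.linspace(0, 10, 5, retstep=True)
print(samples)  # five values from 0 to 10, both ends included
print(step)     # spacing is 2.5

# endpoint=False leaves the stop value out, so the spacing becomes 2.0.
open_samples = np.linspace(0, 10, 5, endpoint=False)
print(open_samples)
```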

#### full
`np.full(shape, fill_value)` returns an array of the given shape in which every element is `fill_value`.

In [14]:
np.full((3,4),2) # 3 rows and 4 columns, every element filled with 2

array([[2, 2, 2, 2],
       [2, 2, 2, 2],
       [2, 2, 2, 2]])

#### eye

Creates an identity matrix

In [15]:
np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

### Random 

To generate an array with random numbers, np.random is used. It contains multiple methods.

#### rand
Create an array of the given shape and populate it with random samples from a uniform distribution over ``[0, 1)``.

In [16]:
np.random.rand(2)

array([0.44969957, 0.58705322])

In [17]:
np.random.rand(5,5)

array([[0.07894664, 0.82523262, 0.9156    , 0.28306056, 0.13771542],
       [0.56920329, 0.21923874, 0.71926806, 0.57902896, 0.09675463],
       [0.45077658, 0.88349226, 0.61074881, 0.23066558, 0.27687787],
       [0.79392875, 0.87935445, 0.44620379, 0.90034876, 0.86114758],
       [0.51282514, 0.8440169 , 0.69739662, 0.63211592, 0.09525711]])

#### randn

Return a sample (or samples) from the "standard normal" distribution (mean 0, variance 1), unlike rand, which samples from a uniform distribution:
![image.png](attachment:image.png)

In [18]:
np.random.randn(2)

array([ 0.44341237, -0.9375116 ])

In [19]:
np.random.randn(5,5)

array([[-0.37843028,  0.50487463, -1.37114841,  1.94526012,  0.89255411],
       [ 1.37009462,  0.40429479, -0.74468981,  1.2472885 ,  0.72120868],
       [-0.18541548, -0.02038328,  1.71163036, -0.37249067, -1.62014406],
       [ 2.02302546, -1.34826448,  0.28579309, -2.26274547, -0.26177829],
       [-1.66173581, -1.60970254,  1.27335597,  0.064274  , -0.77136594]])

#### randint
Return random integers from `low` (inclusive) to `high` (exclusive).

In [20]:
np.random.randint(1,100)

47

In [21]:
np.random.randint(1,100,10)

array([20, 72, 31, 24, 55, 80, 21, 42, 17, 27])
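
The random outputs above change on every run. If repeatable numbers are needed, one option is a seeded generator. The sketch below uses `np.random.default_rng` (available in NumPy 1.17 and later) instead of the functions shown above; the seed value 42 is an arbitrary choice.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

draws_a = rng_a.integers(1, 100, size=5)  # low inclusive, high exclusive
draws_b = rng_b.integers(1, 100, size=5)
print(np.array_equal(draws_a, draws_b))   # True

# Generator counterparts of rand and randn.
uniform = rng_a.random((2, 3))            # uniform over [0, 1)
normal = rng_a.standard_normal(4)         # standard normal
print(uniform.shape, normal.shape)
```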

## 2- Array Attributes and Methods

In [22]:
arr = np.arange(25)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

In [23]:
ranarr = np.random.randint(0,50,10)
ranarr

array([46, 14, 35, 43, 38, 32, 36,  5, 23, 48])

### Reshape
Returns an array containing the same data with a new shape.

In [24]:
arr.reshape(5,5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
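
A short sketch of two details worth knowing about `reshape`: one dimension can be given as `-1` so NumPy infers it, and the total number of elements must stay the same. The names `numbers` and `grid` are illustrative.

```python
import numpy as np

numbers = np.arange(25)

# -1 asks NumPy to work out that dimension: 25 elements / 5 rows = 5 columns.
grid = numbers.reshape(5, -1)
print(grid.shape)  # (5, 5)

# 25 elements cannot fill a 4 x 6 array, so this raises a ValueError.
reshape_failed = False
try:
    numbers.reshape(4, 6)
except ValueError as exc:
    reshape_failed = True
    print("reshape failed:", exc)
```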

### max,min,argmax,argmin

These are useful methods for finding the max or min values, or for finding their index locations using argmax or argmin.

In [25]:
ranarr

array([46, 14, 35, 43, 38, 32, 36,  5, 23, 48])

In [26]:
ranarr.max()

48

In [27]:
ranarr.argmax() # index position of the maximum value

9

In [28]:
ranarr.min()

5

In [29]:
ranarr.argmin() # index position of the minimum value

7
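
The same methods also work on 2-d arrays, where an `axis` argument chooses the direction. The sketch below uses a small made-up array called `scores`.

```python
import numpy as np

scores = np.array([[3, 9, 1],
                   [7, 2, 8]])

# axis=0 works down each column, axis=1 works across each row.
column_max = scores.max(axis=0)      # 7, 9, 8
row_argmax = scores.argmax(axis=1)   # 1, 2

# Without an axis, argmax returns a position in the flattened array.
flat_index = scores.argmax()         # 1
row, col = np.unravel_index(flat_index, scores.shape)
print(int(row), int(col))            # 0 1
```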

### Shape

Shape is an attribute that arrays have (not a method):

In [30]:
# Vector
arr.shape # shape of arr

(25,)

In [31]:
# Notice the two sets of brackets
arr.reshape((1,25))

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
        16, 17, 18, 19, 20, 21, 22, 23, 24]])

In [32]:
arr.reshape(1,25).shape

(1, 25)

In [33]:
arr.reshape(25,1)

array([[ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10],
       [11],
       [12],
       [13],
       [14],
       [15],
       [16],
       [17],
       [18],
       [19],
       [20],
       [21],
       [22],
       [23],
       [24]])

In [34]:
arr.reshape(25,1).shape

(25, 1)

### dtype

We can also grab the data type of the elements in the array:

In [35]:
print(arr.dtype)

int64


### size

In [36]:
arr.size

25

### ndim

In [37]:
np.ndim(arr)

1
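
To see how these attributes relate to each other, here is a small sketch on a 3-d array. The array `cube` is made up for the example; `itemsize` and `nbytes` are two further attributes not covered above.

```python
import numpy as np

cube = np.zeros((2, 5, 3), dtype=np.float16)

print(cube.shape)     # (2, 5, 3)
print(cube.ndim)      # 3, the number of axes
print(cube.size)      # 30, the product of the shape: 2 * 5 * 3
print(cube.dtype)     # float16
print(cube.itemsize)  # 2 bytes per float16 element
print(cube.nbytes)    # 60, i.e. size * itemsize
```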

## 3- Operations on Arrays

### Copying

In [38]:
new_array = np.copy(arr)
print(new_array)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24]


### Append & Insert

In [39]:
print(np.append(new_array, (-1,-2,-3,-4,-5)))
print(np.insert(new_array, 4, (-10,-20,-30,-40,-50))) # insert the values before the given index position

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 -1 -2 -3 -4 -5]
[  0   1   2   3 -10 -20 -30 -40 -50   4   5   6   7   8   9  10  11  12
  13  14  15  16  17  18  19  20  21  22  23  24]


### Sorting

In [40]:
np.sort(new_array, kind='quicksort')

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])
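
Since the array above was already in order, a sketch with unsorted values may make the behaviour clearer. It also shows `argsort`, a descending sort, and the in-place `sort` method; the sample values are made up.

```python
import numpy as np

values = np.array([46, 14, 35, 5, 23])

# np.sort returns a sorted copy and leaves the input untouched.
ascending = np.sort(values)

# argsort returns the indices that would sort the array.
order = np.argsort(values)           # 3, 1, 4, 2, 0

# Reversing the sorted copy gives descending order.
descending = np.sort(values)[::-1]   # 46, 35, 23, 14, 5

# The array method sorts in place and returns None.
values.sort()
print(values)
```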

### Removing/Deleting

In [41]:
mat = np.linspace(1,15, 15)
print(mat)
mat = np.reshape(mat, (3,-1))
print("New Matrix after reshaping is: \n",mat)

# Delete the row at index 2 of the 2D array
print("Row deletion \n",np.delete(mat, 2, axis=0))

# Delete the column at index 2 of the 2D array
print("Col deletion \n",np.delete(mat, 2, axis=1))

[ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15.]
New Matrix after reshaping is: 
 [[ 1.  2.  3.  4.  5.]
 [ 6.  7.  8.  9. 10.]
 [11. 12. 13. 14. 15.]]
Row deletion 
 [[ 1.  2.  3.  4.  5.]
 [ 6.  7.  8.  9. 10.]]
Col deletion 
 [[ 1.  2.  4.  5.]
 [ 6.  7.  9. 10.]
 [11. 12. 14. 15.]]
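
Two points that the output above hints at, sketched on a made-up array `grid`: `np.delete` accepts a list of indices, and it returns a new array rather than changing the original.

```python
import numpy as np

grid = np.arange(1, 16).reshape(3, 5)

# Remove the first and last columns in one call.
trimmed = np.delete(grid, [0, 4], axis=1)
print(trimmed)

# The original array is unchanged; np.delete returned a new array.
print(grid.shape)  # (3, 5)
```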


### Combining/Concatenating
Concatenation can be carried out along either axis, i.e. along rows (axis=0) or along columns (axis=1).

In [42]:
array1 = np.random.randint(1,50,(2,3))
array2 = np.random.randint(60,100, (2,3))
print("First Array \n",array1)
print("Second Array \n",array2)

# Concatenation along rows
print("Concatenated arrays along rows / axis=0 \n",np.concatenate((array1, array2), axis=0))

# Concatenation along columns
print("Concatenated arrays along columns / axis=1 \n",np.concatenate((array1, array2), axis=1))


First Array 
 [[ 1  6 12]
 [14 14 17]]
Second Array 
 [[75 98 88]
 [92 75 63]]
Concatenated arrays along rows / axis=0 
 [[ 1  6 12]
 [14 14 17]
 [75 98 88]
 [92 75 63]]
Concatenated arrays along columns / axis=1 
 [[ 1  6 12 75 98 88]
 [14 14 17 92 75 63]]
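
For 2-d arrays, `np.vstack` and `np.hstack` are shorthand for the two concatenations shown above. A minimal sketch with fixed (non-random) arrays so the result is repeatable:

```python
import numpy as np

top = np.array([[1, 2, 3], [4, 5, 6]])
bottom = np.array([[7, 8, 9], [10, 11, 12]])

stacked_rows = np.vstack((top, bottom))  # same as concatenate(..., axis=0)
stacked_cols = np.hstack((top, bottom))  # same as concatenate(..., axis=1)

print(stacked_rows.shape)  # (4, 3)
print(stacked_cols.shape)  # (2, 6)
```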


### Splitting
Splits an array into sub-arrays along an axis, which can be 0 (rows) or 1 (columns).

In [43]:
np.split(array1, 2, axis=0)

[array([[ 1,  6, 12]]), array([[14, 14, 17]])]

In [44]:
np.split(array1, 3, axis=1)

[array([[ 1],
        [14]]),
 array([[ 6],
        [14]]),
 array([[12],
        [17]])]
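
`np.split` needs the axis length to divide evenly. When it does not, `np.array_split` is one way around that, as this sketch on a made-up 7-element array shows.

```python
import numpy as np

values = np.arange(7)

# 7 elements cannot be divided into 3 equal parts, so np.split raises.
split_failed = False
try:
    np.split(values, 3)
except ValueError:
    split_failed = True

# array_split allows unequal parts; here the sizes are 3, 2 and 2.
parts = np.array_split(values, 3)
print([part.tolist() for part in parts])
```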

## 4- Data Loading and Saving

### save
Saves an array to a binary file in NumPy's `.npy` format.

In [45]:
np.save("saved_array", arr, allow_pickle=False)

### load
Load arrays or pickled objects from .npy, .npz or pickled files.

In [46]:
print(np.load("saved_array.npy", allow_pickle=False))

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24]


### savetxt
Save an array to a text file.

In [47]:
np.savetxt("saved_txt_array", arr, delimiter=' ')

### load txt file

In [48]:
print(np.loadtxt("saved_txt_array", delimiter=' '))

[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15. 16. 17.
 18. 19. 20. 21. 22. 23. 24.]


### CSV
Note: For CSV files, use the same functions and set the delimiter to a comma (`delimiter=','`).
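
A minimal sketch of a CSV round trip, assuming a small made-up table and a temporary directory so no files are left behind. The file name `table.csv` and the header `a,b` are illustrative choices.

```python
import os
import tempfile

import numpy as np

table = np.array([[1.5, 2.0], [3.25, 4.0]])

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "table.csv")

    # comments="" stops savetxt from prefixing the header line with "# ".
    np.savetxt(csv_path, table, delimiter=",", fmt="%.2f",
               header="a,b", comments="")

    with open(csv_path) as handle:
        text = handle.read()

    # skiprows=1 skips the header line when reading the data back.
    restored = np.loadtxt(csv_path, delimiter=",", skiprows=1)

print(text)
print(restored)
```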

## 5- NumPy Indexing and Selection
Bracket Indexing and Selection

In [49]:
#Creating sample array
arr = np.arange(0,11)
print(arr)

[ 0  1  2  3  4  5  6  7  8  9 10]


In [50]:
#Get a value at an index
arr[8]

8

In [51]:
#Get values in a range
arr[1:5]

array([1, 2, 3, 4])

In [52]:
#Get values in a range
arr[0:5]

array([0, 1, 2, 3, 4])

### Indexing a 2D array (matrices)

The general format is **arr_2d[row][col]** or **arr_2d[row,col]**.

In [53]:
arr_2d = np.array(([5,10,15],[20,25,30],[35,40,45]))

#Show
arr_2d

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [54]:
#Indexing an element: row 1, column 0
arr_2d[1][0]


20

In [55]:
# Format is arr_2d[row][col] or arr_2d[row,col]

# Getting individual element value
arr_2d[1][0]

20

In [56]:
# Getting individual element value
arr_2d[1,0]

20

In [57]:
# 2D array slicing

#Shape (2,2) from top right corner
arr_2d[:2,1:]

array([[10, 15],
       [25, 30]])

In [58]:
#Bottom row
arr_2d[2]

array([35, 40, 45])

In [59]:
#Bottom row, with an explicit slice over all columns
arr_2d[2,:]

array([35, 40, 45])
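
The two formats give the same result for single elements, but they behave differently once slices are involved. The sketch below uses a copy of the same values under the illustrative name `grid`.

```python
import numpy as np

grid = np.array([[5, 10, 15],
                 [20, 25, 30],
                 [35, 40, 45]])

# The comma form selects along both axes at once: every row, column 1.
middle_column = grid[:, 1]   # 10, 25, 40

# The chained form applies one index after the other: grid[:] is the whole
# array, so [1] then picks row 1, not column 1.
chained = grid[:][1]         # 20, 25, 30

# Lists of indices pick specific rows and columns (here the four corners).
corners = grid[[0, 2]][:, [0, 2]]
print(corners)
```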

### Logical Selection

Logical selection based off comparison operators.

In [60]:
arr = np.arange(1,11)
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [61]:
arr > 4

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [62]:
bool_arr = arr>4

In [63]:
bool_arr

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [64]:
arr[bool_arr]

array([ 5,  6,  7,  8,  9, 10])

In [65]:
arr[arr>2]

array([ 3,  4,  5,  6,  7,  8,  9, 10])

In [66]:
x = 2
arr[arr>x]

array([ 3,  4,  5,  6,  7,  8,  9, 10])
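
Conditions can also be combined. The sketch below uses `&`, `|` and `~` (with parentheses around each comparison) and `np.where`, which is not covered above; the names are illustrative.

```python
import numpy as np

values = np.arange(1, 11)

# Use & (and), | (or), ~ (not); Python's and/or do not work element-wise.
middle = values[(values > 3) & (values < 8)]     # 4, 5, 6, 7
edges = values[(values <= 2) | (values >= 9)]    # 1, 2, 9, 10
odd = values[~(values % 2 == 0)]                 # 1, 3, 5, 7, 9

# np.where picks between two alternatives based on the condition.
capped = np.where(values > 5, 5, values)
print(capped)
```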

## 6- Broadcasting

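NumPy arrays differ from normal Python lists in their ability to broadcast: a scalar (or a smaller array) is automatically stretched to match the shape of the array it is combined with or assigned to.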

In [67]:
#Setting a value with index range (Broadcasting)
arr[0:5]=100

#Show
arr

array([100, 100, 100, 100, 100,   6,   7,   8,   9,  10])
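
Broadcasting also applies between two arrays of different shapes, as long as the shapes are compatible. A minimal sketch with made-up arrays:

```python
import numpy as np

column = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])              # shape (3,)

# The column is stretched across 3 columns and the row down 3 rows.
table = column + row
print(table.shape)  # (3, 3)
print(table)

# Shapes (3, 2) and (3,) do not line up, so this raises a ValueError.
broadcast_failed = False
try:
    np.ones((3, 2)) + np.ones(3)
except ValueError:
    broadcast_failed = True
```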

In [68]:
# Reset the array; we'll see why in a moment
arr = np.arange(0,11)

#Show
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [69]:
#Important notes on Slices
slice_of_arr = arr[0:6]

#Show slice
slice_of_arr

array([0, 1, 2, 3, 4, 5])

In [70]:
#Change Slice
slice_of_arr[:]=99

#Show Slice again
slice_of_arr

array([99, 99, 99, 99, 99, 99])

Now note the changes also occur in our original array!

In [71]:
arr

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])

The data is not copied; the slice is a view of the original array! This avoids the memory cost of duplicating large arrays.

In [72]:
#To get a copy, need to be explicit
arr_copy = arr.copy()

arr_copy

array([99, 99, 99, 99, 99, 99,  6,  7,  8,  9, 10])
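
One way to check whether two arrays are a view and its source or independent copies is `np.shares_memory`. The sketch below uses made-up names and also shows that boolean selection returns a copy.

```python
import numpy as np

original = np.arange(6)

view = original[:3]                # basic slicing gives a view
independent = original[:3].copy()  # explicit copy
selected = original[original > 2]  # boolean selection gives a copy

print(np.shares_memory(original, view))         # True
print(np.shares_memory(original, independent))  # False
print(np.shares_memory(original, selected))     # False

# Writing through the view changes the original; the copy is unaffected.
view[0] = -1
print(original[0], independent[0])
```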

## 7- Type Casting

Arrays in NumPy can be cast to another type if the need arises. For this, `.astype` is used.

In [73]:
# An array of string values
str_array = np.array(['1.5', '2.7', '3.9'])
print(str_array)
# Type casting to float
flt_array = str_array.astype(np.float16)
print(flt_array)

# Type casting to int
print(flt_array.astype(np.int16))


['1.5' '2.7' '3.9']
[1.5 2.7 3.9]
[1 2 3]


In [74]:
# Type casting back to Python list
flt_array.tolist()

[1.5, 2.69921875, 3.900390625]
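
The value 2.69921875 above comes from the limited precision of `float16`, and the integer cast truncates rather than rounds. A sketch comparing the alternatives, using `np.float64` and `np.rint` (neither appears in the cells above):

```python
import numpy as np

text_values = np.array(['1.5', '2.7', '3.9'])

# float16 has few significant bits, so 2.7 is stored only approximately.
as_float16 = text_values.astype(np.float16)
print(as_float16.tolist())

# float64 matches Python's own float precision.
as_float64 = text_values.astype(np.float64)
print(as_float64.tolist())

# astype to an integer type drops the fractional part.
truncated = as_float64.astype(np.int64)          # 1, 2, 3

# Round first if the nearest integer is wanted (1.5 rounds to the even 2).
rounded = np.rint(as_float64).astype(np.int64)   # 2, 3, 4
print(truncated, rounded)
```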

## Documentation Link
https://fgnt.github.io/python_crashkurs_doc/include/numpy.html

## 8- NumPy Operations

### Arithmetic

We can easily perform array with array arithmetic, or scalar with array arithmetic.

In [75]:
arr = np.arange(0,10)

#### Add

In [76]:
print(arr + arr)
print(np.add(arr, arr))

[ 0  2  4  6  8 10 12 14 16 18]
[ 0  2  4  6  8 10 12 14 16 18]


#### Subtract

In [77]:
print(arr - arr)
print(np.subtract(arr,arr))

[0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0]


#### Multiply

In [78]:
print(arr * arr)
print(np.multiply(arr,arr))

[ 0  1  4  9 16 25 36 49 64 81]
[ 0  1  4  9 16 25 36 49 64 81]


#### Divide

In [79]:
# Warning on division by zero, but not an error!
# Just replaced with nan
print(arr / arr)
print(np.divide(arr,arr))

[nan  1.  1.  1.  1.  1.  1.  1.  1.  1.]
[nan  1.  1.  1.  1.  1.  1.  1.  1.  1.]



In [80]:
# Also a warning rather than an error; the result is infinity
print(1 / arr)

[       inf 1.         0.5        0.33333333 0.25       0.2
 0.16666667 0.14285714 0.125      0.11111111]
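
If the warnings or the `nan`/`inf` results are unwanted, two possible approaches are sketched below: silencing the warnings with `np.errstate`, or dividing only where the denominator is non-zero. The arrays and the fallback value 0 are made up for the example.

```python
import numpy as np

numerator = np.arange(5, dtype=float)
denominator = np.array([0.0, 1.0, 2.0, 0.0, 4.0])

# errstate silences the warnings inside the block; nan and inf still appear.
with np.errstate(divide="ignore", invalid="ignore"):
    raw = numerator / denominator
print(raw)

# where= divides only where the denominator is non-zero; the other
# positions keep the value already in `out` (0 here).
safe = np.divide(numerator, denominator,
                 out=np.zeros_like(numerator),
                 where=denominator != 0)
print(safe)
```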



#### Exponentiation

In [81]:
print(arr**3)

[  0   1   8  27  64 125 216 343 512 729]


## 9- Universal Array Functions

NumPy comes with many [universal array functions](https://numpy.org/doc/stable/reference/ufuncs.html), which are essentially mathematical operations applied element-wise across the array. Let's show some common ones:

In [82]:
#Taking Square Roots
np.sqrt(arr)

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
       2.23606798, 2.44948974, 2.64575131, 2.82842712, 3.        ])

In [83]:
#Calculating the exponential (e^x)
np.exp(arr)

array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,
       2.98095799e+03, 8.10308393e+03])

In [84]:
np.max(arr) #same as arr.max()

9

In [85]:
np.sin(arr)

array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ,
       -0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849])

In [86]:
np.log(arr)

array([      -inf, 0.        , 0.69314718, 1.09861229, 1.38629436,
       1.60943791, 1.79175947, 1.94591015, 2.07944154, 2.19722458])
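
One detail worth noting: `np.max` reduces an array to a single value, while the element-wise counterpart among the universal functions is `np.maximum`. A small sketch with made-up arrays:

```python
import numpy as np

first = np.array([1, 8, 3])
second = np.array([4, 2, 9])

# np.maximum compares two arrays element by element.
pairwise = np.maximum(first, second)   # 4, 8, 9

# np.max reduces a single array to its largest value.
largest = np.max(first)                # 8

# Only the element-wise function is a ufunc object.
print(isinstance(np.maximum, np.ufunc))  # True
print(isinstance(np.max, np.ufunc))      # False
```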

In [87]:
x = 2
arr[arr>x]

array([3, 4, 5, 6, 7, 8, 9])