# Intro to NumPy 

`NumPy` --> short for Numerical Python

- It is the fundamental library for numerical computing in Data Science. Its core routines are written in C, which is why array operations run much faster than equivalent pure-Python loops

- The data object in NumPy is called `ndarray` --> it is the core data structure, where `nd` stands for `n-dimensional`

- Properties Of NumPy:
    - It is the foundation of many Data Science libraries
    - Offers a rich set of mathematical & statistical operations/functions
    - Offers efficient storage for various data types like: strings, floats, etc
    - Easy slicing, merging and indexing of arrays 
    - Provides random array or matrix generation, i.e. it can generate random data for the user
    - Arrays perform numerical operations faster than lists


To install NumPy: 
- `pip install numpy` or `pip3 install numpy`

To use NumPy in a program, we import it using:
- `import numpy as np`

> where *np* is an alias (nickname) for numpy, used by convention

In [1]:
import numpy as np

#### **1.** `np.array([list])` --> converts a List to a NumPy array

In [2]:
#Our first array in numpy:   within brackets is a list --> converted to NumPy array (more powerful)
arr = np.array([2,4,6])
arr

array([2, 4, 6])

In [3]:
#Convert List into NumPy array:
myList = [1,2,3,4,5]
arr = np.array(myList)

In [4]:
#OR
arr = np.array([1,2,3,4,5])

In [5]:
print(type(myList))
print(type(arr))

<class 'list'>
<class 'numpy.ndarray'>


#### **2.** `np.arange(start, end)` --> creates a NumPy array from (start) to (end-1)

In [6]:
#arange() func --> read it as a-range (array range), not arrange 

r = range(1,4)               # range(start, end) --> creates a range from (start) to (end-1)
a = np.arange(1,4)           # arange(start, end) --> creates a NumPy array from (start) to (end-1)
# print(r)
# print(a)
r,a

(range(1, 4), array([1, 2, 3]))

In [7]:
print(type(r))                  
print(type(a))

<class 'range'>
<class 'numpy.ndarray'>


In [8]:
#Increments:
np.arange(2, 41, 2)      #returns numbers from 2 to 40 with gap of 2 (the end value 41 is excluded)
#                |
#               gap

array([ 2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34,
       36, 38, 40])

## `Mathematical Operations` 

a) **Multiplication of NumPy Array by a Number**

In [9]:
r = list(range(1,4))
a = np.arange(1,4)

In [10]:
#eg: Multiplying 'r' and 'a' by 2

print([x * 2 for x in r])                   #a list needs a loop (list comprehension) to multiply each element by 2 --> [2, 4, 6]

print(a * 2)                                #NumPy multiplies every element by 2 directly --> [2 4 6]

#That's why we use NumPy for Math and Stats work

[2, 4, 6]
[2 4 6]
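
A minimal sketch contrasting the two approaches side by side, including a common pitfall: `*` on a plain list repeats the list instead of multiplying its values. The names `doubled_list`, `doubled_arr` and `repeated` are illustrative only.

```python
import numpy as np

r = list(range(1, 4))
a = np.arange(1, 4)

# A list needs an explicit loop (here, a list comprehension)
doubled_list = [x * 2 for x in r]

# A NumPy array applies the operation to every element at once
doubled_arr = a * 2

# Pitfall: multiplying a *list* by 2 repeats it instead of doubling the values
repeated = r * 2

print(doubled_list)   # [2, 4, 6]
print(doubled_arr)    # [2 4 6]
print(repeated)       # [1, 2, 3, 1, 2, 3]
```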


b) **Array Dimensions**

0-D Array(Dot/Scalar)

In [11]:
a0 = np.array(40)

1-D Array(Line/Vector)

In [12]:
a1 = np.array([2,3,4])

2-D Array(Table/Matrix)

In [13]:
a2 = np.array([[1,2,3],         #1st Row
               [4,5,6]])        #2nd Row

#NOTE: the double square brackets

3-D Array(Cube)

In [14]:
a3 = np.array([[[1,2,3],                 #1st Layer
                [4,5,6],
                [7,8,9]],[[1,2,3],       #2nd Layer
                          [4,5,6],
                          [7,8,9]]])

#NOTE: the triple square brackets & the comma (,) that separates the two layers

- 0-D array uses `0` square brackets
- 1-D array uses `1 pair` of square brackets
- 2-D array uses `2 pairs` of square brackets 
- 3-D array uses `3 pairs` of square brackets

#### **3.** `.ndim` --> gives the number of dimensions (**'n-dimensions'**) of the NumPy array 

In [15]:
a0.ndim, a1.ndim, a2.ndim, a3.ndim

(0, 1, 2, 3)

#### **4.** `.shape` --> gives the size of the NumPy array along each dimension (as a tuple) 


- For 0-D Array: () = NO Dimensions
- For 1-D Array: (a, ) = 'a' Elements
- For 2-D Array: (a, b) = 'a' Rows , 'b' Columns
- For 3-D Array: (a, b, c) = 'a' Layers , 'b' Rows, 'c' Columns

In [16]:
a0.shape, a1.shape, a2.shape, a3.shape

((), (3,), (2, 3), (2, 3, 3))

#### **5.** `.size` --> gives Count of Number of Elements


In [17]:
a0.size,a1.size,a2.size,a3.size

(1, 3, 6, 18)

> `.shape` is a tuple, so we can access its elements using indexing 

In [18]:
#.shape is an attribute that holds a tuple --> we can access its elements using indexing 
print(f"This array has {a3.shape[0]} Layers") 
print(f"This array has {a3.shape[1]} Rows") 
print(f"This array has {a3.shape[2]} Columns") 


This array has 2 Layers
This array has 3 Rows
This array has 3 Columns


c) **Arithmetic Operations in NumPy**

In [19]:
arr1 = np.array([3,4,5])
arr2 = np.array([10,20,30])

**Addition of 2 NumPy Arrays**

In [20]:
arr1 + arr2   #[13,24,35]

array([13, 24, 35])

**Multiplication of 2 NumPy Arrays**

In [21]:
arr1 * arr2    #[30, 80, 150]

array([ 30,  80, 150])

**Division of 2 NumPy Arrays**

In [22]:
arr1 / arr2     #[0.3, 0.2, 0.1667]

array([0.3       , 0.2       , 0.16666667])

## `Broadcasting`
- Process of applying **arithmetic operations** on 2 arrays with **Different Shapes**
- In this process, the smaller NumPy array is **stretched** in the background by NumPy to make it compatible with the bigger NumPy array

eg: Multiplication of 1-D & 0-D NumPy Array

In [23]:
'''                           np.array([1,2,3]) * np.array(10)
                                      |                   |
                                   1D Array            0D Array

                                                |
                                                v
                                        [1,2,3] * [10,10,10]
                                                
                                                |
                                                v
                                            [10,20,30]
'''


- Similarly, a 1D array can be **added** to a 2D array if the number of elements in the 1D array **matches** the number of columns in the 2D array

- Shapes are compared dimension by dimension, starting from the last (rightmost) one. Two dimensions are compatible if:
   - They are equal, or one of them is 1

> If any pair of dimensions doesn't follow these rules, broadcasting fails.


![join](https://www.altexsoft.com/static/content-image/2024/10/a4b72e73-3b9d-48b1-b5ba-7451e10c43b2.png)
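
The compatibility rule can be sketched with a few small arrays. This is a minimal illustration, not taken from the original notebook; the names `matrix`, `row` and `col` are made up for the example.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])        # shape (2, 3)
row = np.array([10, 20, 30])          # shape (3,)   -> matches the 3 columns
col = np.array([[100], [200]])        # shape (2, 1) -> the 1 is stretched to 3

print(matrix + row)   # row is added to every row:       [[11 22 33] [14 25 36]]
print(matrix + col)   # col is added to every column:    [[101 102 103] [204 205 206]]

# Shapes (2, 3) and (2,) are NOT compatible: the last dimensions are 3 and 2
try:
    matrix + np.array([1, 2])
except ValueError as err:
    print("Broadcasting failed:", err)
```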

#### **6.** `reshape()` --> changes the shape of a NumPy array without changing its data (the total number of elements must stay the same)
- eg: 
    - `1-D` to `2-D`
    - `2-D` to `3-D`
    - `1-D` to `3-D`

In [24]:
arr = np.arange(1,9)                     #(1,2,3,4,5,6,7,8)  --> 1D array
arr

array([1, 2, 3, 4, 5, 6, 7, 8])

`1-D to 2-D:`

In [25]:
arr_new = arr.reshape(2,4)                
#                     | |
#                  Rows  Columns
arr_new

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

`2-D to 3-D:`

In [26]:
arr_newer = arr_new.reshape(2,2,2)      
#                     /   |   \
#                Layers  Rows  Columns
arr_newer

array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])

`1-D to 3-D:`

In [27]:
arr_newest = arr.reshape(2,2,2)      
#                     /   |   \
#                Layers  Rows  Columns
arr_newest

array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])

`Flattening` an array --> e.g. used to flatten image pixels into 1-D so that they can be passed to a neural network

In [28]:
arr_new.reshape(-1)  # using reshape(-1) --> automatically flattens the array 

# -1 is the documented value for "infer this length"; prefer it over other negative values

array([1, 2, 3, 4, 5, 6, 7, 8])
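
The `-1` placeholder is not limited to flattening. A short sketch, assuming the same 8-element array as above (the names `two_rows`, `four_rows` and `flat` are illustrative), shows NumPy inferring one dimension and rejecting a shape whose size does not match:

```python
import numpy as np

arr = np.arange(1, 9)             # 8 elements

# -1 in one position lets NumPy work out that dimension
two_rows = arr.reshape(2, -1)     # inferred shape: (2, 4)
four_rows = arr.reshape(-1, 2)    # inferred shape: (4, 2)

# Flattening back to 1-D
flat = two_rows.reshape(-1)       # shape: (8,)

print(two_rows.shape, four_rows.shape, flat.shape)

# The total number of elements must stay the same
try:
    arr.reshape(3, 3)             # 9 slots for 8 elements
except ValueError as err:
    print("Reshape failed:", err)
```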

#### **7.** `np.mod()` --> gives Remainder(Modulo Operation)


In [29]:
#Modulus Function --> gives remainder 
a1 = np.array([10, 20, 30])
a2 = np.array([3, 4, 5])

np.mod(a1, a2)   #([1, 0, 0])

array([1, 0, 0])
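
As a small sketch of the same operation: the `%` operator on arrays gives the same result as `np.mod`, and the divisor can also be a single number (broadcast to every element). The names `by_array` and `by_scalar` are illustrative.

```python
import numpy as np

a1 = np.array([10, 20, 30])
a2 = np.array([3, 4, 5])

by_array = a1 % a2          # same as np.mod(a1, a2)
by_scalar = np.mod(a1, 7)   # the scalar 7 is applied to every element

print(by_array)    # [1 0 0]
print(by_scalar)   # [3 6 2]
```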

## `String Arrays In NumPy`

String NumPy arrays:

In [30]:
greetings = np.array(["Hello", "Welcome"])
greetings2= np.array(["Everyone", "To The NumPy Class !!"])

a) **Add Strings**
#### **8.**  `np.char.add(arr1, arr2)` -->  joins strings element by element (works position-wise, like matrix addition)

In [31]:
#add strings:
np.char.add(greetings, greetings2)  #(["HelloEveryone", "WelcomeTo The NumPy Class !!"]) --> joined position-wise, with no space added

array(['HelloEveryone', 'WelcomeTo The NumPy Class !!'], dtype='<U28')

b) **Concatenate Strings**
#### **9.**  `np.concatenate((arr1, arr2))` --> appends the elements of the second array after the elements of the first

In [32]:
#concatenation of Strings:
np.concatenate((greetings, greetings2))           #(["Hello", "Welcome", "Everyone", "To The NumPy Class !!"])

#NOTE: The double parentheses, because the arrays are passed together as one Tuple

array(['Hello', 'Welcome', 'Everyone', 'To The NumPy Class !!'],
      dtype='<U21')

c) **Concatenate Numbers**  --> this is just Stacking of Numbers on each other
#### **9.**  `np.concatenate((arr1, arr2))`  

In [33]:
#Concatenation of Numbers:
arr1 = np.array([[10,20],               #2D
                [30,40]])

arr2 = np.array([[50,60],               #2D
                 [70,80]])


np.concatenate((arr1, arr2))            # ([  [10,20],
                                        #     [30,40],
                                        #     [50,60],
                                        #     [70,80]   ])  

# This is just Stacking of Numbers on each other

array([[10, 20],
       [30, 40],
       [50, 60],
       [70, 80]])

####  `np.concatenate((arr1, arr2), axis=0/1)`  

- Concatenate along rows - `vertical` (arr2 is stacked below arr1) --> axis=0

In [34]:
np.concatenate((arr1, arr2), axis = 0)     #concat along rows - vertical
#([  [10,20],
#    [30,40],
#    [50,60],
#    [70,80]  ])

array([[10, 20],
       [30, 40],
       [50, 60],
       [70, 80]])

> axis=0 is the default, i.e. when we don't mention the axis, `np.concatenate((arr1, arr2))` behaves like `axis=0`

- Concatenate along columns - `horizontal` (arr2 is placed to the right of arr1) --> axis=1

In [35]:
np.concatenate((arr1, arr2), axis = 1)     #concat along columns - horizontal
#([  [10,20,50,60],
#    [30,40,70,80]  ])

array([[10, 20, 50, 60],
       [30, 40, 70, 80]])

> All this concatenation is valid only if both arrays are **Compatible**, i.e. their shapes match in every dimension except the one being joined
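
What "compatible" means can be sketched with two arrays of different shapes. This is an illustrative example with made-up arrays `a` and `b`, not part of the original notebook:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])               # shape (2, 2)
b = np.array([[5, 6]])               # shape (1, 2)

# axis=0: both have 2 columns, so stacking vertically works
stacked = np.concatenate((a, b), axis=0)
print(stacked.shape)                 # (3, 2)

# axis=1: the row counts differ (2 vs 1), so this raises ValueError
try:
    np.concatenate((a, b), axis=1)
except ValueError as err:
    print("Concatenation failed:", err)
```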

d) **Additional 'CHAR' Functions**

#### **10.**  `np.char.upper(arr)` --> converts all array string elements to Upper Case  

In [36]:
arr = np.array(["Hello", "hELLO", "world", "worLd"])
np.char.upper(arr)                                    # (["HELLO", "HELLO", "WORLD", "WORLD"])

array(['HELLO', 'HELLO', 'WORLD', 'WORLD'], dtype='<U5')

#### **11.**  `np.unique(arr)` --> returns the sorted unique elements of the array (removes exact duplicates)

In [37]:
np.unique(arr)                                  # matching is case-sensitive and np.char.upper() above did not modify arr, so all 4 strings are kept

array(['Hello', 'hELLO', 'worLd', 'world'], dtype='<U5')
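
To collapse values that differ only in letter case, one option is to normalise the case first and then call `np.unique`. A minimal sketch, assuming the same four strings as above; `normalized`, `values` and `counts` are illustrative names, and `return_counts=True` additionally reports how often each value occurs.

```python
import numpy as np

arr = np.array(["Hello", "hELLO", "world", "worLd"])

# np.char.upper returns a NEW array, so the result has to be stored
normalized = np.char.upper(arr)

values, counts = np.unique(normalized, return_counts=True)

print(values)   # ['HELLO' 'WORLD']
print(counts)   # [2 2]
```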

#### **12.**  `np.char.replace(arr, old, new)` --> replaces every occurrence of the substring `old` with `new` 
- This helps keep data labels consistent  

In [38]:
arr1 = np.array(["Model A", "Model B", "Concept C", "Concept D"])

np.char.replace(arr1, "Model", "Concept")                       #(["Concept A", "Concept B", "Concept C", "Concept D"])

array(['Concept A', 'Concept B', 'Concept C', 'Concept D'], dtype='<U9')

#### **13.**  `np.char.strip(arr)` --> removes leading and trailing Whitespace 


In [39]:
arr1 = np.array(["  Model A   ", "   Model B ", "  Concept C  ", "   Concept D "])

np.char.strip(arr1)                                             #(["Model A", "Model B", "Concept C", "Concept D"])

array(['Model A', 'Model B', 'Concept C', 'Concept D'], dtype='<U13')

# `Statistical Functions In NumPy`

#### **14.**  `np.random.rand(rows, columns)` --> generates an array of random floats between 0 and 1 (1 excluded) 

In [40]:
arr_x = np.random.rand(1,10)
#                      /   \
#                  rows    columns
arr_x

array([[0.6396269 , 0.40302859, 0.43328837, 0.96865056, 0.10872456,
        0.62330481, 0.89519482, 0.08639265, 0.16884962, 0.66054151]])

#### **15.**  `np.random.randint(start, stop)` --> generates a single random integer from (start) to (stop-1), NOT an array

In [41]:
arr_x = np.random.randint(2,10) 
arr_x

8
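
Because these functions return different values on every run, the outputs shown here will not match yours. A minimal sketch, assuming reproducible results are wanted: `np.random.seed` fixes the generator state so the same numbers come out again. The seed value `0` and the names `first`, `second` and `ints` are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(0)                 # fix the random state
first = np.random.rand(1, 3)

np.random.seed(0)                 # reset to the same state
second = np.random.rand(1, 3)

print(np.array_equal(first, second))   # True: same seed, same numbers

# randint(start, stop) never returns the stop value itself
ints = np.random.randint(2, 10, size=1000)
print(ints.min() >= 2, ints.max() <= 9)
```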

a) **Mean**
#### **16.** 

   - **For 1-D Array:**

####   - `np.mean(arr)` --> Returns mean of whole

In [42]:
arr = np.array([1,2,3,4,5,6])

# 1-D Array
np.mean(arr)                

3.5

   - **For 2-D Array:**

####   - `np.mean(arr)` --> Returns mean of whole
####   - `np.mean(arr,axis=0)` --> Returns mean of each Column
####   - `np.mean(arr,axis=1)` --> Returns mean of each Row

In [43]:
#Mean for 2-D Array:
arr = np.array([[1,2,3],
                [10,20,30],
                [100,200,300],
                [1000,2000,3000]])

np.mean(arr)                #this gives one overall value       (555.5)
np.mean(arr,axis=0)         #gives Mean of each Column (3)      ([277.75, 555.5, 833.25])
np.mean(arr,axis=1)         #gives Mean of each Row (4)         ([2, 20, 200, 2000])


np.mean(arr), np.mean(arr, axis=0), np.mean(arr, axis=1)

(555.5, array([277.75, 555.5 , 833.25]), array([   2.,   20.,  200., 2000.]))

   - **For 3-D Array:**

####   - `np.mean(arr)` --> Returns mean of the whole array
####   - `np.mean(arr,axis=0)` --> Averages across Layers (one value per Row/Column position)
####   - `np.mean(arr,axis=1)` --> Averages across Rows, i.e. mean of each Column within each Layer
####   - `np.mean(arr,axis=2)` --> Averages across Columns, i.e. mean of each Row within each Layer


In [44]:
#Mean for 3-D Array:
a3 = np.array([[[1,2,3],                 #1st Layer
                [4,5,6],
                [7,8,9]],[[1,2,3],       #2nd Layer
                          [4,5,6],
                          [7,8,9]]])

np.mean(a3,axis=2)      #mean of each row within each layer
#([[2,5,8]    #layer 1: row1,row2,row3
#  [2,5,8]])  #layer 2: row1,row2,row3

np.mean(a3,axis=0)      #averages the elements at the same position across the layers
# ([[1,2,3]
#   [4,5,6]
#   [7,8,9]])

arr = np.array(arr)
np.mean(a3), np.mean(a3, axis=0), np.mean(a3, axis=1), np.mean(a3, axis=2)

(5.0,
 array([[1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]]),
 array([[4., 5., 6.],
        [4., 5., 6.]]),
 array([[2., 5., 8.],
        [2., 5., 8.]]))
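
A quick way to check which axis does what is to look at the shape of the result: the axis passed to `np.mean` is the one that disappears. The sketch below uses a made-up array `cube` with three different dimension sizes so the effect is easy to see.

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)   # 2 layers, 3 rows, 4 columns

over_layers = np.mean(cube, axis=0)     # layers collapsed  -> shape (3, 4)
over_rows = np.mean(cube, axis=1)       # rows collapsed    -> shape (2, 4)
over_columns = np.mean(cube, axis=2)    # columns collapsed -> shape (2, 3)

print(over_layers.shape, over_rows.shape, over_columns.shape)
print(np.mean(cube))                    # 11.5, the mean of 0..23
```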

b) **Median**
#### **17.**  `np.median(arr)`

In [45]:
#Median:
np.median(arr_x)                #gives middle value; arr_x is the single integer from In [41], so the median is that same value

8.0

c) **Standard Deviation**
#### **18.**  `np.std(arr)`

In [46]:
#Standard Deviation:
np.std(arr_x)                   #0.0 here because arr_x holds a single value, so there is no spread

0.0
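
Median and standard deviation are more informative on an array with several values. A minimal sketch with made-up numbers (`data` is not from the original notebook); `ddof=1` is an optional argument of `np.std` that switches to the sample standard deviation.

```python
import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

median = np.median(data)             # middle of the sorted values (average of 4 and 5)
std_population = np.std(data)        # divides by n (the default)
std_sample = np.std(data, ddof=1)    # divides by n - 1

print(median)           # 4.5
print(std_population)   # 2.0
print(std_sample)
```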

d) **Maximum & Minimum**
#### **19.**
####   - `np.max(arr)` --> Returns single largest value from entire array.
####   - `np.max(arr,axis=0)` --> this gives max element from each Column
####   - `np.max(arr,axis=1)` --> this gives max element from each Row


In [47]:
arr = np.array([
    [12, 7, 9],
    [4, 15, 6],
    [10, 3, 18]
])


np.max(arr), np.max(arr, axis=0), np.max(arr, axis=1)

(18, array([12, 15, 18]), array([12, 15, 18]))
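
The minimum works the same way through `np.min`. A short sketch on the same array (the variable names are illustrative), with `np.argmax` added to show where the largest value sits in the flattened array:

```python
import numpy as np

arr = np.array([
    [12, 7, 9],
    [4, 15, 6],
    [10, 3, 18]
])

smallest = np.min(arr)               # single smallest value
min_per_column = np.min(arr, axis=0) # min element from each Column
min_per_row = np.min(arr, axis=1)    # min element from each Row
position_of_max = np.argmax(arr)     # index of 18 in the flattened array

print(smallest)          # 3
print(min_per_column)    # [4 3 6]
print(min_per_row)       # [7 4 3]
print(position_of_max)   # 8
```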

# `Filtering In NumPy`

By Using the **Traditional Method** (a condition inside square brackets, i.e. a boolean mask):

In [48]:
arr = np.arange(1,31)

#get the values that are Less Than 14:
filtered_arr = arr[arr<14]

print(filtered_arr)             #([1,2,3,4,5,6,7,8,9,10,11,12,13])

#get the values that are Less Than 14 and greater than 4:
filtered_a = arr[(arr<14) & (arr>4)]

print(filtered_a)               #([5,6,7,8,9,10,11,12,13])

[ 1  2  3  4  5  6  7  8  9 10 11 12 13]
[ 5  6  7  8  9 10 11 12 13]


> In NumPy, we use the bitwise operator '&' instead of 'and', and each condition must be wrapped in parentheses
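
The other logical operators follow the same pattern: `|` replaces `or` and `~` replaces `not`. A minimal sketch with illustrative names `mask`, `either` and `not_small`:

```python
import numpy as np

arr = np.arange(1, 31)

# A condition on an array produces a boolean array (the mask)
mask = arr < 14
print(mask.dtype)      # bool

# OR: values below 5 or above 26
either = arr[(arr < 5) | (arr > 26)]
print(either)          # [ 1  2  3  4 27 28 29 30]

# NOT: everything that is not below 28
not_small = arr[~(arr < 28)]
print(not_small)       # [28 29 30]
```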

#### **20.**  `np.where(condition, if_true, if_false)` -->  elements that satisfy the condition are replaced by the (if_true) value and elements that don't are replaced by the (if_false) value

In [49]:
arr = np.array([1, 2, 3, 4, 5])


result = np.where((arr > 3), 10, 0)     #named 'result' to avoid shadowing Python's built-in filter()
#                  /         |     \
#             (cond)    (if true)  (if false)

print(result)      #[0 0 0 10 10]

[ 0  0  0 10 10]
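
Two related uses, sketched under the same example array: `if_true` / `if_false` may be arrays as well as single values, and calling `np.where` with only a condition returns the indices where it holds. The names `capped` and `indices` are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Keep the original value where the condition is False, cap it at 3 otherwise
capped = np.where(arr > 3, 3, arr)
print(capped)      # [1 2 3 3 3]

# With only a condition, np.where returns a tuple of index arrays
indices = np.where(arr > 3)[0]
print(indices)     # [3 4]
```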


# `Slicing & Dicing of Array (2D Array)`

### Slicing --> we take out a part of the data using Indices
- It uses **Square Brackets '[ ]'**

In [50]:
arr = np.array([[2,3,5],
              [5,9,45],
              [15,23,4]])

arr[0,1]
#   /  \
#row    column

3

- `arr[0:2]` --> colon gives a range: rows 0 and 1 (2 rows)
- `arr[0, 2]` --> comma gives a single element: row 0, column 2 -> 5

> first portion represents Rows & second portion represents Columns

In [51]:
arr[1]
# this gives row 1

array([ 5,  9, 45])

In [52]:
arr[0:2]
# this gives both row 0 & 1

array([[ 2,  3,  5],
       [ 5,  9, 45]])

In [53]:
arr[-1]           #gives last row 

array([15, 23,  4])

In [54]:
arr[1:3, 1:3]     #gives row 1 to 2 & column 1 to 2
#   /      \
#row       column

array([[ 9, 45],
       [23,  4]])

In [55]:
#to slice till the end, the stop index can be left out
arr[0:]

array([[ 2,  3,  5],
       [ 5,  9, 45],
       [15, 23,  4]])

In [56]:
#to slice till the end, the stop index can be left out
arr[1: , 1: ]     #gives row 1 to end & column 1 to end 

array([[ 9, 45],
       [23,  4]])

In [57]:
#selecting all rows but from 2nd column to end column
arr[0: , 1: ]

array([[ 3,  5],
       [ 9, 45],
       [23,  4]])

In [58]:
#Addition of elements using indexes: 
arr[1,0] + arr[1,1]         # 5 + 9 = 14

14

`arr[start : stop : step]`:

1. start = 1 → start from row index 1 (2nd row)
2. stop = empty → go till the end
3. step = 2 → take every 2nd row (skip 1 row in between)

In [64]:
# Now we slice only 'rows':

#Printing every 2nd Row (starting from row index 1):
arr = np.array([[1,2,3],[4,5,6],[1,2,3],[4,5,6],[1,2,3],[4,5,6],[1,2,3],[4,5,6]])       # 8 rows x 3 columns

arr[1::2]
#arr[start : stop : step]


#Now we slice 'rows' AND 'columns' : 

arr[1::2, 1:]  #--> rows 1, 3, 5, 7 (start at row 1, step 2), keeping only column 1 to the end column

arr, arr[1::2], arr[1::2, 1:]

(array([[1, 2, 3],
        [4, 5, 6],
        [1, 2, 3],
        [4, 5, 6],
        [1, 2, 3],
        [4, 5, 6],
        [1, 2, 3],
        [4, 5, 6]]),
 array([[4, 5, 6],
        [4, 5, 6],
        [4, 5, 6],
        [4, 5, 6]]),
 array([[5, 6],
        [5, 6],
        [5, 6],
        [5, 6]]))
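
The `start : stop : step` pattern applies to columns too, and a negative step walks backwards. A small sketch on a made-up 4 x 3 array whose rows are all different, so the selected rows are easy to tell apart:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9],
                [10, 11, 12]])

every_other_row = arr[::2]        # rows 0 and 2
reversed_rows = arr[::-1]         # rows in reverse order
outer_columns = arr[:, ::2]       # all rows, columns 0 and 2
block = arr[0:3:2, 1:]            # rows 0 and 2, column 1 to the end

print(every_other_row)   # [[1 2 3] [7 8 9]]
print(reversed_rows[0])  # [10 11 12]
print(outer_columns)     # [[ 1  3] [ 4  6] [ 7  9] [10 12]]
print(block)             # [[2 3] [8 9]]
```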

### Slicing/Referencing for **3D Array**:

- `arr[layer, row, column]`:

In [68]:
arr = np.array([[[1,2,3],[4,5,6],[7,8,9]],                       #1st Layer
                [[10,20,30],[40,50,60],[70,80,90]]])             #2nd Layer 

#to get a slice from it --> (layer, row, column)
arr[0,2,0]          #7
arr[1,1,0:3]        #[40,50,60]

arr, arr[0,2,0], arr[1,1,0:3]

(array([[[ 1,  2,  3],
         [ 4,  5,  6],
         [ 7,  8,  9]],
 
        [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]]),
 7,
 array([40, 50, 60]))

# `Generating Data in NumPy`

#### **21.**  `np.random.rand(rows, columns)` and `np.random.randint(start, stop)`

a) Gives a 1 x 10 array of random floats between 0 and 1 (1 excluded):

In [69]:
np.random.rand(1,10)

array([[0.33382649, 0.95418112, 0.28715903, 0.21779862, 0.36914003,
        0.32916523, 0.69128765, 0.09571865, 0.479831  , 0.31340257]])

b) Gives one random Integer from 1 to 9 (the stop value 10 is excluded):

In [77]:
np.random.randint(1,10)

7

#### **22.**  `np.random.randint(start, stop, size)`

a) Gives an array of a defined **size** with random integers ranging from 1 to 99 (the stop value 100 is excluded):

In [78]:
np.random.randint(1,100,size=40)

array([49, 60, 78, 24, 80, 96, 20, 86,  7, 68, 41, 46, 51, 59, 31, 40, 10,
       91, 50, 95, 62, 47, 36, 31, 54, 31,  3, 66, 24,  1, 73, 10,  7, 97,
       32, 93, 12, 69, 46, 98])

b) Gives a Matrix (**Row & Column Format**) of a defined **size** filled with random integers ranging from 1 to 99:

In [80]:
np.random.randint(1,100, size=(8,7))        # 8 Rows & 7 Columns 

array([[70, 85, 83, 80, 79, 41, 85],
       [32, 92, 32, 13, 30, 96, 86],
       [34, 76, 48, 31, 62, 57, 69],
       [48, 49, 26, 41, 62, 56, 24],
       [90, 69, 83, 36, 35, 28, 23],
       [ 8,  3, 40, 58, 79, 86, 50],
       [84, 37, 67, 73, 14, 55, 79],
       [77, 18, 35, 14, 44, 75, 43]])

#### **23.**  `np.linspace(start, stop, no.of points)`

**linspace** = I want THIS MANY evenly spaced points between two values (both ends included)

a) Get 20 evenly spaced points from 0 to 1 (i.e. 19 equal gaps):

In [82]:
np.linspace(0,1,20)

array([0.        , 0.05263158, 0.10526316, 0.15789474, 0.21052632,
       0.26315789, 0.31578947, 0.36842105, 0.42105263, 0.47368421,
       0.52631579, 0.57894737, 0.63157895, 0.68421053, 0.73684211,
       0.78947368, 0.84210526, 0.89473684, 0.94736842, 1.        ])

b) Using `retstep = True`, it returns TWO things:

- 1️⃣ values → the array of 20 evenly spaced numbers from 0 to 1
- 2️⃣ step → the increment (gap) between consecutive values

In [85]:
np.linspace(0,1,20,retstep = True)              #retstep --> means Return Step

(array([0.        , 0.05263158, 0.10526316, 0.15789474, 0.21052632,
        0.26315789, 0.31578947, 0.36842105, 0.42105263, 0.47368421,
        0.52631579, 0.57894737, 0.63157895, 0.68421053, 0.73684211,
        0.78947368, 0.84210526, 0.89473684, 0.94736842, 1.        ]),
 0.05263157894736842)
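
The returned step follows from the number of gaps: with `num` points there are `num - 1` gaps, so step = (stop - start) / (num - 1). A minimal sketch checking this, and contrasting `linspace` (stop included) with `arange` (stop excluded); the variable names are illustrative.

```python
import numpy as np

values, step = np.linspace(0, 1, 20, retstep=True)

expected_step = (1 - 0) / (20 - 1)       # 19 gaps between 20 points
print(np.isclose(step, expected_step))   # True

# linspace asks for a number of points and includes the stop value
five_points = np.linspace(0, 1, 5)
print(five_points)                       # [0.   0.25 0.5  0.75 1.  ]

# arange asks for a step size and excludes the stop value
by_step = np.arange(0, 1, 0.25)
print(by_step)                           # [0.   0.25 0.5  0.75]
```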

# ✅ NumPy — completed.
