<h1 align="center" style="background-color:#001f3f;
           color:white;
           border-radius:8px;
           font-family:'Times New Roman', Times, serif;
           padding:20px;
           display:inline-block;">
What is an Array?
</h1>

<hr>

<p align="center"><b>Author:</b> Muhammad Usman</p>
<p align="center"><b>Dated:</b> 4/Oct/2025</p>

An `array` is simply a way of storing a `collection of data items` of the `same type` in a single variable, instead of using multiple variables.

**Arrays are very useful in programming because they:**

- Allow storing multiple values in a single variable.

- Support indexing and slicing, meaning you can directly access, update, or iterate over elements.

- Provide a base for numerical and matrix operations in data science, simulations, and machine learning.



- **Example in Python using a list (which works like an array):**

marks = [85, 90, 78, 92]\
print(marks[0])  

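`Python` has no built-in `array` type like C or Java (the standard-library `array` module exists but is rarely used for numerical work). Instead, it uses `lists`, which behave like flexible arrays.

Lists can store elements of `different data types`, such as integers, floats, and strings. This flexibility has a cost: every element is a separate Python object, which makes lists `slower` and more `memory-hungry` for numerical work.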

*When we apply operations, Python loops run through each element, which is slower compared to low-level C operations.*

In [None]:
lst = [1, 2, 3, 4, 5]
doubled = [x*2 for x in lst]
print(doubled)   

[2, 4, 6, 8, 10]


<h1 align="center" style="background-color:#001f3f;
           color:white;
           border-radius:8px;
           font-family:'Times New Roman', Times, serif;
           padding:20px;
           display:inline-block;">
Introduction to NumPy Arrays
</h1>

NumPy, short for `Numerical Python`, is the most important `Python library` for `numerical and scientific computing`.

It provides a high-performance `n-dimensional array` object called `ndarray` (one axis for a vector, rows and columns for a matrix, and further axes beyond that), along with a collection of mathematical functions to operate on these arrays efficiently.

<br>

Unlike Python lists, NumPy arrays are homogeneous, meaning all elements must be of the `same data type`, which makes `computations faster` and `memory-efficient`.

*With NumPy, you don’t need to write explicit loops for element-wise operations*

In [2]:
import numpy as np

In [3]:
arr = np.array([1, 2, 3, 4, 5])
doubled = arr * 2
print(doubled)

[ 2  4  6  8 10]


*Notice how we did not write a loop. The single operation `arr * 2` applies the multiplication to every element at once; this is called `vectorization`. On large arrays this is often 10x to 100x faster than looping over a list, although for very small arrays NumPy's call overhead can make a plain list just as fast or faster.*

> The main `difference` between `Python lists` and `NumPy arrays` is `performance`.
<br>
Lists are general-purpose containers, while NumPy arrays are optimized for numerical operations with vectorization and better memory usage.
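
The performance difference can be checked with a rough timing comparison. The sketch below is only illustrative: the size `n` and the repeat count are assumptions chosen for a quick run, and the absolute timings will differ from machine to machine.

```python
import timeit
import numpy as np

n = 200_000                      # assumed size, chosen only for illustration
lst = list(range(n))
arr = np.arange(n)

# Double every element: list comprehension vs. vectorized multiplication
list_time = timeit.timeit(lambda: [x * 2 for x in lst], number=5)
numpy_time = timeit.timeit(lambda: arr * 2, number=5)

print(f"list comprehension: {list_time:.4f} s")
print(f"NumPy vectorized:   {numpy_time:.4f} s")

# Both approaches produce the same values
same = np.array_equal(np.array([x * 2 for x in lst]), arr * 2)
print("Same result:", same)
```

With an array this large the vectorized version is typically much faster; with only a handful of elements the gap can shrink or even reverse.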

<h1 align="center" style="background-color:#001f3f;
           color:white;
           border-radius:8px;
           font-family:'Times New Roman', Times, serif;
           padding:20px;
           display:inline-block;">
Installation of NumPy
</h1>

In a terminal, run the command\
\
***pip install numpy***

Then run `import numpy as np` in a Python cell and start computing.


### **Making 1D array**
*Also known as a vector*

In [4]:
import numpy as np

x = np.array(
    [1,2,3,4,5] 
)

print(x)

print(type(x))
print(f"Shape of an array: {x.shape}")          # one axis with 5 elements: (5,)
print(f"Dimension: {x.ndim}")
print("Memory Size:", x.nbytes, "bytes")        # number of elements * bytes per element

%timeit x * 2

[1 2 3 4 5]
<class 'numpy.ndarray'>
Shape of an array: (5,)
Dimension: 1
Memory Size: 20 bytes
1.32 μs ± 102 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [5]:
import numpy as np
import sys
y = [1,2,3,4,5]
                      # comparison of list and NumPy array (memory and speed)
print("Python List Memory:", sys.getsizeof(y))  # size of the list object only, not the ints it references
%timeit [i*2 for i in y]

Python List Memory: 104
283 ns ± 21.6 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
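
`sys.getsizeof` on a list reports only the list object and its internal pointers, not the integer objects it refers to. A fairer (still approximate) comparison adds the size of each element. The sketch below assumes CPython; the variable names and the sample size of 1000 are illustrative only.

```python
import sys
import numpy as np

values = list(range(1000))               # hypothetical sample data
arr = np.array(values, dtype=np.int64)

# Size of the list container (object header plus one pointer per element)
list_container = sys.getsizeof(values)
# Approximate total: container plus every int object it points to
list_total = list_container + sum(sys.getsizeof(v) for v in values)

print("List container only:", list_container, "bytes")
print("List incl. elements:", list_total, "bytes")
print("NumPy data buffer:  ", arr.nbytes, "bytes")   # 1000 elements * 8 bytes
```

The exact list figures depend on the Python version, but the array's data buffer is always `size * itemsize` bytes.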


### **Making 2D array**
*Also known as a matrix*

In [6]:
# creating 2D array

mat = np.array(
    [
        
    [50,60,70],

    [880,90,100], 
    
    [77,88,99]
    
    
    ]
)

print(mat)

print("")
print(f"Shape:{mat.shape}")
print(f"Dimension:{mat.ndim}")
print("Memory Size:", mat.nbytes, "bytes")

[[ 50  60  70]
 [880  90 100]
 [ 77  88  99]]

Shape:(3, 3)
Dimension:2
Memory Size: 36 bytes


In [7]:
mat1 = np.array(
    [1,2,3], ndmin = 5)
print(mat1)
print(mat1.shape)
print(f"Dimension:{mat1.ndim}")

[[[[[1 2 3]]]]]
(1, 1, 1, 1, 3)
Dimension:5


### **Making 3D array**
*Also known as a tensor, a cube-like structure*

In [8]:
mat2 = np.array(
    
 [
     [
         [10,20,30],
         [40,50,60]
     ],

    [
         [11,22,33],
         [44,55,66]
     ]


 ]

)
# Each layer has 2 rows and 3 columns

print(mat2)

print(" ")
print(type(mat2))
print(f"Shape:{mat2.shape}")
print(f"Dimension:{mat2.ndim}")
print("Memory Size:", mat2.nbytes, "bytes")

[[[10 20 30]
  [40 50 60]]

 [[11 22 33]
  [44 55 66]]]
 
<class 'numpy.ndarray'>
Shape:(2, 2, 3)
Dimension:3
Memory Size: 48 bytes


>All rows (and all nested sequences) in an array must have the same length. Older NumPy versions silently built an object array of Python lists from inconsistent rows; recent versions (1.24 and later) raise an error instead. If you really want an inconsistent (ragged) array, pass `dtype=object` explicitly.

In [9]:
# Inconsistent rows

arr = np.array(

    [
        [1,2,3], [4,5]
        
        ], dtype = object
    
    )

print(arr)
print("dtype:", arr.dtype)
print("shape:", arr.shape)



[list([1, 2, 3]) list([4, 5])]
dtype: object
shape: (2,)


***nD arrays: an array can have any number of dimensions; higher-dimensional arrays (tensors) are common in deep learning.***
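
As a small illustration of going beyond three dimensions, the sketch below builds a 4D array. The interpretation as a batch of 8 RGB images of 32x32 pixels is a hypothetical example, not something taken from the cells above.

```python
import numpy as np

# Hypothetical layout: (batch, height, width, channels)
batch = np.zeros((8, 32, 32, 3), dtype=np.float32)

print(batch.ndim)     # 4
print(batch.shape)    # (8, 32, 32, 3)
print(batch.size)     # 8 * 32 * 32 * 3 = 24576

first_image = batch[0]         # one image, shape (32, 32, 3)
red_channel = batch[..., 0]    # first channel of every image, shape (8, 32, 32)
print(first_image.shape, red_channel.shape)
```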

<h1 align="center" style="background-color:#001f3f;
           color:white;
           border-radius:8px;
           font-family:'Times New Roman', Times, serif;
           padding:20px;
           display:inline-block;">
Data Types in NumPy
</h1>

NumPy supports many data types: `int32`, `int64`, `float32`, `float64`, `complex`, `bool`, etc.

In [10]:
# int
arr1 = np.array(

   [
        [100, 200, 300],[400,500,600]
        
        ]

   )
print(arr1)
print("")
print(arr1.dtype)  
print("----------------------------")

# float
arr2 = np.array([1.1, 2.2, 3.3])
print(arr2)
print("")
print(arr2.dtype)  
print("----------------------------")

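# bool mixed with integers: NumPy upcasts everything to an integer dtype (see output); use only True/False to get dtype bool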
arr3 = np.array([True, False, True, 1, 0])
print(arr3)
print("")
print(arr3.dtype)  
print("----------------------------")
print(" ")

# Complex
arr4 = np.array([2 + 3j, 4 + 5j])
print(arr4)
print("")
print(arr4.dtype)  
print("----------------------------")
print(" ")

# String
arr_str = np.array(["AI", "Data", "NumPyy"])
print("String array:", arr_str, arr_str.dtype)
print("----------------------------")
print(" ")

# Object Type
# if you want to store different data types in a single array, you can use dtype=object
arr_obj = np.array([1, "Hello", 3.14], dtype=object)
print("Object array:", arr_obj, arr_obj.dtype)
print("----------------------------")
print(" ")



[[100 200 300]
 [400 500 600]]

int32
----------------------------
[1.1 2.2 3.3]

float64
----------------------------
[1 0 1 1 0]

int32
----------------------------
 
[2.+3.j 4.+5.j]

complex128
----------------------------
 
String array: ['AI' 'Data' 'NumPyy'] <U6
----------------------------
 
Object array: [1 'Hello' 3.14] object
----------------------------
 


***Note: `float32` takes half the memory of `float64` and can be faster, at the cost of lower precision.
\
Use `float32` when dealing with large datasets or when memory efficiency is a concern.***
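
The memory and precision trade-off can be seen directly. This is a minimal sketch; the array length of one million is an arbitrary choice for illustration.

```python
import numpy as np

n = 1_000_000
a64 = np.ones(n, dtype=np.float64)
a32 = a64.astype(np.float32)       # astype returns a converted copy

print(a64.nbytes)   # 8000000
print(a32.nbytes)   # 4000000

# Machine epsilon: the gap between 1.0 and the next representable number.
# It is much larger for float32, so fewer significant digits survive.
print(np.finfo(np.float32).eps)
print(np.finfo(np.float64).eps)
```

Whether `float32` is precise enough depends on the task, so it is worth checking results on a small sample before converting a whole dataset.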

<h1 align="center" style="background-color:#001f3f;
           color:white;
           border-radius:8px;
           font-family:'Times New Roman', Times, serif;
           padding:20px;
           display:inline-block;">
Pre-Defined NumPy Arrays
</h1>

### **Constant Value Arrays**

In [11]:
zeros = np.zeros((2, 3)) 
print(zeros)
print(f"By default it gives float64: {zeros.dtype}")
print(f"Dimension: {zeros.ndim}")
print(zeros.itemsize)
print("-------------")
# if you want a different data type, you have to pass dtype explicitly


zeros = np.zeros((2, 3),dtype= np.int64) 
print(zeros)
print(f"With dtype=np.int64: {zeros.dtype}")

[[0. 0. 0.]
 [0. 0. 0.]]
By default it gives float64: float64
Dimension: 2
8
-------------
[[0 0 0]
 [0 0 0]]
With dtype=np.int64: int64


In [12]:
# Creating a matrix filled with ones
ones = np.ones((5,2))    # shape is (rows, columns): rows always come first
print(ones)
print(f"Dimension: {ones.ndim}")

[[1. 1.]
 [1. 1.]
 [1. 1.]
 [1. 1.]
 [1. 1.]]
Dimension: 2


In [13]:
# creating a matrix filled with one constant value (here 4)

np.full((5,5), 4)

array([[4, 4, 4, 4, 4],
       [4, 4, 4, 4, 4],
       [4, 4, 4, 4, 4],
       [4, 4, 4, 4, 4],
       [4, 4, 4, 4, 4]])

### **Identity and Diagonal Arrays**

In [14]:
identity = np.eye(4)
print(identity)
print(f"Dimension: {identity.ndim}")
print(f"By default it gives float64: {identity.dtype}")

# the identity matrix is commonly used in linear algebra, e.g. when solving linear equations


[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
Dimension: 2
By default it gives float64: float64


In [15]:
np.diag([1,2,3,4])        # ------> now the diagonal is 1,2,3,4

array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])
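
Two properties of these functions are worth a quick check: multiplying a matrix by the identity leaves it unchanged, and `np.diag` applied to a 2D array extracts the diagonal instead of building a matrix. The matrix `A` below is an arbitrary example.

```python
import numpy as np

A = np.array([[2, 1],
              [5, 3]])
I = np.eye(2)

# Matrix product with the identity returns the same values as A
print(np.array_equal(A @ I, A))    # True

# Given a 2D array, np.diag returns its diagonal as a 1D array
print(np.diag(A))                  # [2 3]
```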

### **Random Arrays**

- **Uniform Distribution**: np.random.rand(m,n) → values in [0,1), flat distribution

- **Normal Distribution**: np.random.randn(m,n) → bell curve, mean=0, std=1

- #### **Uniform Distribution**

In [16]:
# If you want to create Random numbers you can use the following functions:

random1 = np.random.rand(5,5)
print(random1)                          # at each run it will give different output
print(f"Dimension: {random1.ndim}")     

# range is between 0 to 1

[[0.56750292 0.57831992 0.76114311 0.80740698 0.77181717]
 [0.16267548 0.99381734 0.12925615 0.54097708 0.37098172]
 [0.80649746 0.67012528 0.30338269 0.9738256  0.78110172]
 [0.85744954 0.53123913 0.60962467 0.18267735 0.98163825]
 [0.32152071 0.96921784 0.99274356 0.46383081 0.70639301]]
Dimension: 2


<p align="">
  <img src="img/UD.jpg" alt="screenshot" width="400">
</p>

In [17]:
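# To get the same random values on every run, set a seed.
# Note: the seed must be set BEFORE the numbers are drawn. Here it is set afterwards,
# so random2 itself is not fixed by it; the seed only affects draws made after this line.

random2 = np.random.rand(5,5)
np.random.seed(42)
print(random2)              # not reproducible on a fresh run; later draws start from seed 42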

[[0.20128438 0.94134028 0.90706328 0.81664329 0.821996  ]
 [0.51231532 0.8806526  0.20693152 0.67333699 0.98802745]
 [0.99559309 0.71894433 0.78765577 0.61776314 0.12729836]
 [0.78145376 0.51264638 0.1098976  0.06076152 0.54242849]
 [0.9819766  0.67380192 0.33375069 0.47968631 0.74026582]]
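
For reproducible values the seed has to be set before the numbers are drawn. A minimal sketch of that pattern follows; the seed value 42 and the shape are arbitrary, and the `default_rng` variant is shown only as an alternative that avoids changing NumPy's global random state.

```python
import numpy as np

np.random.seed(42)              # seed first...
first = np.random.rand(2, 3)    # ...then draw

np.random.seed(42)              # reset to the same seed
second = np.random.rand(2, 3)

print(np.array_equal(first, second))    # True: identical draws

# Alternative: independent Generator objects with their own state
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
same_generators = np.array_equal(rng_a.random((2, 3)), rng_b.random((2, 3)))
print(same_generators)                  # True
```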


#### **Other functions for the uniform distribution**
*np.random.random(),\
 np.random.uniform()*

In [18]:
random3 = np.random.random((2, 3))
print(random3)               # ----> generate random floats between 0 and 1


[[0.37454012 0.95071431 0.73199394]
 [0.59865848 0.15601864 0.15599452]]


In [19]:
random4 = np.random.uniform(1, 5, size=(2,3))
print(random4)               # -----> control the range of uniform distribution, unlike random() which is fixed to [0,1).


[[1.23233445 4.46470458 3.40446005]
 [3.83229031 1.08233798 4.87963941]]


- #### **Normal Distribution**

In [20]:
# If you want random values that can be negative or positive (mean 0, std 1)

np.random.seed(42)  
arr6 = np.random.randn(5,5)
print(arr6)

[[ 0.49671415 -0.1382643   0.64768854  1.52302986 -0.23415337]
 [-0.23413696  1.57921282  0.76743473 -0.46947439  0.54256004]
 [-0.46341769 -0.46572975  0.24196227 -1.91328024 -1.72491783]
 [-0.56228753 -1.01283112  0.31424733 -0.90802408 -1.4123037 ]
 [ 1.46564877 -0.2257763   0.0675282  -1.42474819 -0.54438272]]


<img src="img/ND.jpg" alt="Normal Distribution" width="400">

#### **NOTE**

You will learn these two concepts in depth in the topic `Data Preprocessing`; here is a brief preview so you understand the difference and why it matters for data cleaning.

***Uniform Distribution (`np.random.rand()`)***\
(all values in the range are equally likely)
- Values are bounded between 0 and 1, so there are no extreme values
- Such data usually needs little pre-processing

***Normal Distribution (`np.random.randn()`), [other names: bell curve, Gaussian distribution]***
- Values are unbounded: most fall near the mean, but a few can lie far from it
- These far-away values can show up as outliers, so cleaning usually takes more effort
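
The difference between the two distributions shows up in simple summary statistics. The sketch below is illustrative only: the seed and the sample size of 10,000 are assumptions, and the "more than 3 standard deviations" cut-off is just one common rule of thumb for flagging outliers.

```python
import numpy as np

np.random.seed(0)                       # arbitrary seed for repeatability
uniform = np.random.rand(10_000)        # bounded in [0, 1)
normal = np.random.randn(10_000)        # unbounded, mean 0, std 1

for name, sample in [("uniform", uniform), ("normal", normal)]:
    print(f"{name:8s} min={sample.min():.3f} max={sample.max():.3f} "
          f"mean={sample.mean():.3f} std={sample.std():.3f}")

# Share of normal values lying more than 3 standard deviations from 0
far_share = (np.abs(normal) > 3).mean()
print("share beyond 3 std:", far_share)
```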

#### **Other functions for the Gaussian distribution**
*np.random.normal(mean, std, size)*

In [21]:
arr7 = np.random.normal(50, 10, size=(2,3))
print(arr7)

[[51.1092259  38.49006423 53.75698018]
 [43.9936131  47.0830625  43.98293388]]


In [22]:
# if we want 5 random integers from 1 up to (but not including) 4

abc = np.random.randint(1,4,5)
print(abc, abc.ndim)

[3 1 3 3 2] 1


### **Range-Based Arrays**

In [23]:
# If you want to create a range array

seq = np.arange(0,5+1)        # named seq to avoid shadowing Python's built-in range
print(seq)
print(f"Dimension: {seq.ndim}")
print("-----------------")

# np.arange only creates a flat 1D array, but you can reshape it into a multi-dimensional array
reshaped = seq.reshape(2,3)
print(reshaped)
print(f"Dimension: {reshaped.ndim}")


[0 1 2 3 4 5]
Dimension: 1
-----------------
[[0 1 2]
 [3 4 5]]
Dimension: 2
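
Reshaping only works when the total number of elements stays the same. A short sketch, using an arbitrary 12-element array, also shows `-1`, which asks NumPy to infer one dimension from the others.

```python
import numpy as np

values = np.arange(12)

# -1 lets NumPy work out the missing dimension from the total size
print(values.reshape(3, -1).shape)    # (3, 4)
print(values.reshape(-1, 2).shape)    # (6, 2)

# 12 elements cannot fill a 5x3 grid, so this raises ValueError
try:
    values.reshape(5, 3)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```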


In [24]:
# np.arange(start, stop, step)

rangee = np.arange(0, 10, 2)  
print(rangee)           # ----> it starts from 0, goes up to but not including 10, with step 2.


[0 2 4 6 8]


In [25]:
# np.linspace(start, stop, num)

arr = np.linspace(0, 10, 5)
print(arr)


[ 0.   2.5  5.   7.5 10. ]
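
The two functions are parameterized differently: `np.arange` takes a step and excludes the stop value, while `np.linspace` takes a number of points and includes the stop value by default. A small side-by-side sketch with arbitrary values:

```python
import numpy as np

by_step = np.arange(0, 1, 0.25)     # step given, stop (1) excluded
by_count = np.linspace(0, 1, 5)     # count given, stop (1) included

print(by_step, len(by_step))        # 4 values: 0, 0.25, 0.5, 0.75
print(by_count, len(by_count))      # 5 values: 0, 0.25, 0.5, 0.75, 1

# endpoint=False makes linspace exclude the stop value as well
open_ended = np.linspace(0, 1, 5, endpoint=False)
print(open_ended)
```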


***Question for you:\
If I say “I want exactly 101 values between 0 and 1 for plotting a smooth curve”,
which function will you choose: arange or linspace?***

In [26]:
# comment your answer


### **Empty Arrays**

In [27]:
emp = np.empty((4,5))

print(emp)
print(f"Dimension: {emp.ndim}")
print(emp.dtype)
# the values are uninitialized: whatever was already in memory (here they happen to be zeros)

[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
Dimension: 2
float64


In [28]:
arr7 = np.array([[10,2,3,4],[50,68,7,8],[90,108,11,12]])
empt = np.empty_like(arr7)

print("Original:\n", arr7)

print("")

print("Empty Like:\n", empt)


Original:
 [[ 10   2   3   4]
 [ 50  68   7   8]
 [ 90 108  11  12]]

Empty Like:
 [[  490468639  1078562299  1823429482  1078148794]
 [-1174022826  1078649060 -1227985241  1078329134]
 [ -893062260  1078430369  -956469154  1078328784]]


***What is the difference between `np.empty` and `np.empty_like`?***
<br>

`np.empty` creates a `new array` with a specified shape and data type but does not initialize its values, so it contains whatever happened to be in that memory (garbage values).
<br>

`np.empty_like` creates a new uninitialized array with the same shape and dtype as an existing array.
<br>

Both are faster than zeros or ones because they skip initialization, but you must be careful to overwrite the values before use.
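
A typical safe pattern is to allocate with `np.empty` and then write every slot before reading any of them. The sketch below is a minimal illustration; the names `squares` and `doubled` are hypothetical.

```python
import numpy as np

n = 5
squares = np.empty(n, dtype=np.int64)    # contents are undefined at this point

for i in range(n):
    squares[i] = i * i                   # every slot is written before use

print(squares)                           # [ 0  1  4  9 16]

# empty_like provides a buffer of matching shape and dtype,
# which a ufunc can then fill through its out= argument
doubled = np.empty_like(squares)
np.multiply(squares, 2, out=doubled)
print(doubled)
```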

<h1 align="center" style="background-color:#001f3f;
           color:white;
           border-radius:8px;
           font-family:'Times New Roman', Times, serif;
           padding:20px;
           display:inline-block;">
Indexing & Slicing
</h1>


In [29]:
import numpy as np

In [30]:
# let's create an array and play with it

arr5 = np.array(

    [
        [50,60,70],
        [90,110,400],
        [600,1000,200]
    ]

)
print(arr5)

[[  50   60   70]
 [  90  110  400]
 [ 600 1000  200]]


In [31]:
arr5[0:2, 1:3]

array([[ 60,  70],
       [110, 400]])

In [32]:
# if we want to access 110, how we do it?
print(f"Element found: {arr5[1,1]}")         #----> indexing starts at 0, (row 1, col 1)

Element found: 110


In [33]:
print(f"Element found: {arr5[:,1]}")         #----> this will print the entire column 1

Element found: [  60  110 1000]


In [34]:
print(f"Element found: {arr5[1,:]}")         #----> this will print entire row 1

Element found: [ 90 110 400]


In [35]:
# [[  50   60   70]       -----  row 0
#  [  90  110  400]       -----  row 1
#  [ 600 1000  200]]      -----  row 2

In [36]:
# slicing sub-matrix
                                               #----> Rows 0:2 → take row 0 and row 1 (stop before 2).
print(f"Element found:\n {arr5[0:2, 1:3]}") 
                                               #----> Columns 1:3 → take column 1 and column 2 (stop before 3).


Element found:
 [[ 60  70]
 [110 400]]


In [37]:
print(f"Element found:\n {arr5[::-1] }")     # ----> reverses the order of the rows (row 2 first, row 0 last)

# this does not modify arr5 itself

Element found:
 [[ 600 1000  200]
 [  90  110  400]
 [  50   60   70]]


In [38]:
arr5

array([[  50,   60,   70],
       [  90,  110,  400],
       [ 600, 1000,  200]])

In [39]:
print(f"Element found:\n {arr5[:, ::-1]}")     # ----> reverses the order of the columns (column 2 first, column 0 last)

Element found:
 [[  70   60   50]
 [ 400  110   90]
 [ 200 1000  600]]


In [40]:
# Boolean Indexing
print(arr5[arr5 > 50])                          #----> elements greater than 50
print(arr5.ndim)

[  60   70   90  110  400  600 1000  200]
2
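
Boolean masks can also combine several conditions and be used for assignment. The sketch below recreates `arr5` so it runs on its own; the thresholds 500 and the name `capped` are illustrative assumptions.

```python
import numpy as np

arr5 = np.array([[50, 60, 70],
                 [90, 110, 400],
                 [600, 1000, 200]])

# Combine conditions with & (and) or | (or); each condition needs parentheses
mask = (arr5 > 50) & (arr5 < 500)
print(mask)
print(arr5[mask])                 # [ 60  70  90 110 400 200]

# Assignment through a mask, done on a copy so arr5 stays unchanged
capped = arr5.copy()
capped[capped > 500] = 500
print(capped)
```

The result of `arr5[mask]` is always 1D, because the selected elements no longer form a rectangular block.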


In [41]:
print(arr5)

[[  50   60   70]
 [  90  110  400]
 [ 600 1000  200]]


In [42]:
# Fancy Indexing

print("Selected elements:", arr5[[0,2],[1,2]])     # ---> extracting elements: (0,1) → row 0, col 1 → 60

                                                                            # (2,2) → row 2, col 2 → 200

Selected elements: [ 60 200]


**Key Difference from Slicing**
\
\
*`Slicing (a[0:2, 1:3])` → extracts a block (subarray), works for `contiguous ranges`*
\
\
*`Fancy indexing (a[[0,2],[1,2]])` → extracts specific elements from specific positions, works for `non-contiguous` positions*
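
There is one more practical difference: basic slicing returns a view that shares memory with the original array, while fancy indexing returns a copy. A minimal sketch with an arbitrary 3x3 array:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

block = a[0:2, 1:3]            # basic slicing: a view onto a's data
picked = a[[0, 2], [1, 2]]     # fancy indexing: a new, independent array

print(np.shares_memory(a, block))     # True
print(np.shares_memory(a, picked))    # False

block[0, 0] = -1               # writes through to a[0, 1]
picked[0] = -99                # leaves a untouched
print(a[0, 1])                 # -1
```

When a slice needs to be modified without touching the original, calling `.copy()` on it first avoids surprises.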

<!-- <div align="center"> -->
  <h1 style="background-color:#001f3f;
             color:white;
             border-radius:8px;
             font-family:'Times New Roman', Times, serif;
             padding:20px;
             display:inline-block;">
    Done with basics:)
  </h1>
<!-- </div> -->
