# Table of Contents
 ## 1. Array Creation Functions
 ## 2. Data Types (dtype)
 ## 3. Shape Manipulation
 ## 4. Mathematical & Statistical Operations
 ## 5. Broadcasting
 ## 6. Stacking & Splitting Arrays
 ## 7. Copy vs View

# 1. Array Creation

In [1]:
import numpy as np

#### 1.1. np.eye() – identity matrix
Creates an identity matrix (1s on diagonal, 0s elsewhere).

Used in:
- Linear algebra
- Machine learning
- Matrix operations (important in ML algorithms)

In [2]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
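
As a small illustration of why the identity matrix matters in linear algebra, the sketch below checks that multiplying by it leaves a matrix unchanged. The names `identity` and `matrix` are made up for this example.

```python
import numpy as np

identity = np.eye(3)
matrix = np.arange(9.0).reshape(3, 3)   # arbitrary 3x3 example matrix

# Multiplying by the identity leaves the matrix unchanged
print(np.array_equal(matrix @ identity, matrix))   # True

# The dtype argument controls the element type; the default is float
int_identity = np.eye(2, dtype=int)
print(int_identity.tolist())   # [[1, 0], [0, 1]]
```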

#### 1.2. np.logspace() – logarithmically spaced values
Creates numbers that are evenly spaced on a logarithmic scale.

This means:
- Instead of spacing the values evenly (like 1, 2, 3, 4),
- it spaces the exponents evenly, giving powers of 10 (like 10^0, 10^1, 10^2).

Syntax: `np.logspace(start, stop, num)`, where:
- start = exponent of the first number
- stop = exponent of the last number
- num = how many numbers to generate

The base defaults to 10 and can be changed with the `base` argument.


In [3]:
np.logspace(0, 3, 4)

array([   1.,   10.,  100., 1000.])
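
One way to check the description above is to compare `np.logspace` with 10 raised to evenly spaced exponents from `np.linspace`. This is only a sketch; the variable names are illustrative.

```python
import numpy as np

# Exponents 0, 1, 2, 3 are evenly spaced on a linear scale
exponents = np.linspace(0, 3, 4)

# np.logspace(0, 3, 4) should match 10 raised to those exponents
log_values = np.logspace(0, 3, 4)
manual_values = 10.0 ** exponents

print(np.allclose(log_values, manual_values))   # True
```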

# 2. Data Types (dtype)

dtype (data type) - tells what kind of values are stored inside a NumPy array.

Example: integers, floats, booleans, complex numbers.

### 2.1. Checking the Data Type

Use .dtype to check the type of elements in the array.

In [6]:
arr = np.array([1, 2, 3])
arr.dtype

dtype('int32')

### 2.2. Converting Data Type

Use .astype() to convert an array to another type.

#### Example 1: int -> float

In [7]:
arr = np.array([1, 2, 3])
arr.astype(float)

array([1., 2., 3.])

#### Example 2: float -> int

In [10]:
arr = np.array([1.6, 2.8, 3.4])
arr.astype(int)    # fractional part is dropped (truncated, not rounded)

array([1, 2, 3])
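
Because `.astype(int)` truncates rather than rounds, it can help to compare the two explicitly. The sketch below assumes you want round-to-nearest and uses `np.rint` for that step; the sample values are made up.

```python
import numpy as np

values = np.array([-1.6, 1.6, 2.8, 3.4])

truncated = values.astype(int)          # drops the fractional part (toward zero)
rounded = np.rint(values).astype(int)   # rounds to the nearest integer first

print(truncated.tolist())   # [-1, 1, 2, 3]
print(rounded.tolist())     # [-2, 2, 3, 3]
```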

### 2.3. Common Data Types

Here are the most commonly used dtypes in Data Analysis:

| dtype    | Meaning           | Use case                  |
|----------|-------------------|---------------------------|
| int32    | 32-bit integer    | counts, labels            |
| int64    | 64-bit integer    | large integer values      |
| float32  | 32-bit float      | fast ML models            |
| float64  | 64-bit float      | precise calculations      |
| bool     | True/False        | filtering, masks          |
| complex  | complex numbers   | scientific computing      |


NumPy infers the dtype automatically from the data you pass in.
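
A quick sketch of that automatic choice is shown below. The default integer type depends on the platform and NumPy version (it can be `int32` or `int64`), so the example checks only the dtype kind for integers.

```python
import numpy as np

ints = np.array([1, 2, 3])        # all integers -> an integer dtype
mixed = np.array([1, 2.5, 3])     # one float promotes everything to float64
flags = np.array([True, False])   # booleans -> bool

print(ints.dtype.kind)   # i  (signed integer; exact width is platform dependent)
print(mixed.dtype)       # float64
print(flags.dtype)       # bool
```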

### 2.4. Why `dtype` Matters

dtype affects:

#### 1. Memory usage
- `int32` uses 4 bytes
- `int64` uses 8 bytes
- `float64` uses 8 bytes

On large datasets, the dtype choice noticeably affects memory use and performance.
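
The byte counts above can be read directly from an array through `itemsize` (bytes per element) and `nbytes` (total bytes). The array size of one million is an arbitrary choice for this sketch.

```python
import numpy as np

values64 = np.zeros(1_000_000, dtype=np.float64)
values32 = values64.astype(np.float32)   # same values, half the storage

print(values64.itemsize, values64.nbytes)   # 8 8000000
print(values32.itemsize, values32.nbytes)   # 4 4000000
```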

#### 2. Speed
Smaller dtypes (`float32`, `int32`) can be faster in ML calculations.

#### 3. Precision
For scientific calculations, float64 is preferred.


For deep learning, float32 is standard.

#### 4. Compatibility

Libraries such as pandas, scikit-learn, and TensorFlow often expect specific dtypes, so converting with `.astype()` before passing data in avoids surprises.

# 3. Shape Manipulation

These functions change the shape, dimensions, and structure of arrays, which is essential when preparing datasets.

Below are the essential ones.

#### 3.1.   .shape — check shape of array

Returns a tuple with the size of each dimension; for a 2D array this is (rows, columns).

In [41]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
arr.shape

(2, 3)

#### 3.2. .reshape() — change the shape

Used to convert 1D → 2D or 2D → 3D etc.

In [23]:
arr.reshape(3, 2)

array([[1, 2],
       [3, 4],
       [5, 6]])

Rules:
- The total number of elements must stay the same
- The data is not changed; only the shape is reinterpreted
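
The rules above can be sketched as follows. Passing `-1` for one dimension asks NumPy to infer it from the element count, and a shape with the wrong element count raises a `ValueError`.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])   # 6 elements

# -1 lets NumPy infer that dimension: 6 elements / 3 rows = 2 columns
inferred = arr.reshape(3, -1)
print(inferred.shape)   # (3, 2)

# A target shape with 8 elements does not match 6, so reshape fails
try:
    arr.reshape(4, 2)
    failed = False
except ValueError as exc:
    failed = True
    print("reshape failed:", exc)
```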

#### 3.3. .ravel() — flatten array (returns view)

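Converts a multi-dimensional array into 1D. It returns a view whenever the memory layout allows (as in the example below) and falls back to a copy otherwise, which is why it is fast.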

In [24]:
arr.ravel()

array([1, 2, 3, 4, 5, 6])

In [25]:
arr

array([[1, 2, 3],
       [4, 5, 6]])

#### 3.4. .flatten() — flatten array (returns copy)

Same result as ravel(), but always returns a new copy.

In [26]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
arr.flatten()

array([1, 2, 3, 4, 5, 6])

In [27]:
arr

array([[1, 2, 3],
       [4, 5, 6]])

Use this when the flattened data must be independent of the original array.
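
The difference between the two is easiest to see with `np.shares_memory`. The sketch below uses a hypothetical array named `grid` so it does not interfere with `arr`; note that `ravel()` on a transposed array has to copy.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

raveled = grid.ravel()       # view, because grid is contiguous in memory
flattened = grid.flatten()   # always a copy

print(np.shares_memory(grid, raveled))     # True
print(np.shares_memory(grid, flattened))   # False

# Writing through the view changes the original array
raveled[0] = 99
print(grid[0, 0])   # 99

# The transposed array is not contiguous in row-major order, so ravel() copies
print(np.shares_memory(grid, grid.T.ravel()))   # False
```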

#### 3.5. .T — transpose

Swaps rows ↔ columns.

In [28]:
arr.T

array([[1, 4],
       [2, 5],
       [3, 6]])

Used very often in linear algebra and ML.

#### 3.6. np.expand_dims() — add new dimension

Makes a 1D array into 2D or 2D into 3D.

In [29]:
np.expand_dims(arr, axis=0)  # new leading axis: shape (2, 3) -> (1, 2, 3)

array([[[1, 2, 3],
        [4, 5, 6]]])

In [30]:
np.expand_dims(arr, axis=1)   # new axis in the middle: shape (2, 3) -> (2, 1, 3)

array([[[1, 2, 3]],

       [[4, 5, 6]]])

Useful when preparing data for ML models.
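
To make the effect of `axis` concrete, the sketch below prints the resulting shapes. The last line uses `np.newaxis` indexing, which is an equivalent spelling not covered above.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)

front = np.expand_dims(arr, axis=0)    # new axis at position 0
middle = np.expand_dims(arr, axis=1)   # new axis at position 1
back = np.expand_dims(arr, axis=-1)    # new axis at the end

print(front.shape)    # (1, 2, 3)
print(middle.shape)   # (2, 1, 3)
print(back.shape)     # (2, 3, 1)

# Indexing with np.newaxis gives the same result as axis=0
print(arr[np.newaxis, :, :].shape)   # (1, 2, 3)
```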

#### 3.7. np.squeeze() — remove a dimension

Removes dimensions of size 1.

##### Example 1: no size-1 dimensions, so nothing changes

In [42]:
arr.shape

(2, 3)

In [43]:
np.squeeze(arr).shape

(2, 3)

##### Example 2: size-1 dimensions are removed

In [44]:
arr = np.array([[[1, 2, 3]]])  # shape (1,1,3)
print(arr.shape)

(1, 1, 3)


In [45]:
np.squeeze(arr).shape

(3,)

- Both size-1 dimensions are removed
- Shape (1,1,3) becomes (3,)

squeeze is useful for:
- Removing extra dimensions in ML model outputs
- Fixing shapes after reshaping
- Removing unnecessary singleton axes
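
When only one singleton axis should go, `np.squeeze` accepts an `axis` argument. The sketch below uses a made-up array `batch` of shape (1, 3, 1) to show the difference.

```python
import numpy as np

batch = np.zeros((1, 3, 1))

all_removed = np.squeeze(batch)           # removes every size-1 axis
first_only = np.squeeze(batch, axis=0)    # removes only the leading axis

print(all_removed.shape)   # (3,)
print(first_only.shape)    # (3, 1)

# Asking to squeeze an axis whose size is not 1 raises a ValueError
try:
    np.squeeze(batch, axis=1)
    squeeze_failed = False
except ValueError:
    squeeze_failed = True
print(squeeze_failed)   # True
```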

# 4. Mathematical & Statistical Operations

## A. Elementwise Mathematical Operations

These apply to every element of the array:

1. **Addition, subtraction, multiplication, division**

In [55]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

In [56]:
a + b

array([5, 7, 9])

In [57]:
a - b

array([-3, -3, -3])

In [58]:
a * b

array([ 4, 10, 18])

In [59]:
a / b

array([0.25, 0.4 , 0.5 ])

2. **NumPy math functions (ufuncs)**


These are vectorized: they operate on whole arrays in compiled code, which is much faster than a Python loop:

In [49]:
np.add(a, b)

array([5, 7, 9])

In [50]:
np.subtract(a, b)

array([-3, -3, -3])

In [51]:
np.multiply(a, b)

array([ 4, 10, 18])

In [52]:
np.divide(a, b)

array([0.25, 0.4 , 0.5 ])

In [53]:
np.power(a, 2)

array([1, 4, 9], dtype=int32)

In [54]:
np.abs(a)

array([1, 2, 3])

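## B. Matrix Multiplication

For 2D arrays, `@` and `.dot()` perform matrix multiplication. For the 1D arrays `a` and `b` defined above, both return the dot product (a single number).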

1. **Using @ operator**

In [61]:
a @ b

32

2. **Using .dot()**

In [62]:
a.dot(b)

32

Used in:
- Linear Regression
- Neural Networks
- PCA
- Statistics
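
For comparison, here is a sketch with 2D arrays, where `@` performs true matrix multiplication and `*` remains elementwise. The matrices `A` and `B` are arbitrary examples.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

matrix_product = A @ B   # rows of A combined with columns of B
elementwise = A * B      # multiplies matching positions only

print(matrix_product.tolist())   # [[19, 22], [43, 50]]
print(elementwise.tolist())      # [[5, 12], [21, 32]]
```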

## C. Statistical Operations

These help summarize your data quickly.

In [63]:
arr = np.array([1, 2, 3])

1. **Sum**

In [67]:
np.sum(arr)

6

2. **Mean**

In [69]:
np.mean(arr)

2.0

3. **Standard deviation**

In [70]:
np.std(arr)

0.816496580927726

4. **Minimum and maximum**

In [71]:
np.min(arr)

1

In [72]:
np.max(arr)

3

5. **Index of min/max**

In [73]:
np.argmin(arr)

0

In [74]:
np.argmax(arr)

2
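
Two details are worth a short sketch. `np.std` computes the population standard deviation by default, and passing `ddof=1` gives the sample version. Most of these functions also accept an `axis` argument for 2D data. The array `table` is a made-up example.

```python
import numpy as np

arr = np.array([1, 2, 3])

population_std = np.std(arr)       # divides by n (default, ddof=0)
sample_std = np.std(arr, ddof=1)   # divides by n - 1

print(population_std)   # about 0.8165
print(sample_std)       # 1.0

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

col_means = np.mean(table, axis=0)   # one mean per column
row_sums = np.sum(table, axis=1)     # one sum per row

print(col_means.tolist())   # [2.5, 3.5, 4.5]
print(row_sums.tolist())    # [6, 15]
```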

## D. More useful functions

**np.sqrt()**

In [75]:
np.sqrt(arr)

array([1.        , 1.41421356, 1.73205081])

**np.exp()**


Used in ML activation functions (softmax, sigmoid, etc.)

In [76]:
np.exp(arr)

array([ 2.71828183,  7.3890561 , 20.08553692])
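
As an illustration of how `np.exp` appears in those activation functions, here is a minimal sketch of sigmoid and softmax. The function names and the max-subtraction step are choices made for this example, not part of NumPy.

```python
import numpy as np

def sigmoid(x):
    """Map any real value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    """Turn a 1D array of scores into probabilities that sum to 1."""
    shifted = x - np.max(x)   # subtracting the max avoids overflow in exp
    exps = np.exp(shifted)
    return exps / np.sum(exps)

scores = np.array([1.0, 2.0, 3.0])
probabilities = softmax(scores)

print(sigmoid(0.0))          # 0.5
print(probabilities.sum())   # approximately 1.0
```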

**np.log()**


Used in:
- log transformations
- probability
- loss functions

In [77]:
np.log(arr)

array([0.        , 0.69314718, 1.09861229])

# 5. Broadcasting

Broadcasting lets NumPy apply operations between arrays of different shapes by automatically expanding the smaller array.

#### Simple Example (1D + Scalar)

In [78]:
arr = np.array([1, 2, 3])
arr + 5

array([6, 7, 8])

→ NumPy "stretches" the scalar 5 to match the array.

#### Example: 1D + 1D (same size)

In [80]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
a + b

array([5, 7, 9])

Same shapes → directly works.

#### Example: 2D + 1D (VERY IMPORTANT)

In [81]:
A = np.array([[1, 2, 3],
              [4, 5, 6]])

b = np.array([10, 20, 30])

A + b

array([[11, 22, 33],
       [14, 25, 36]])

→ b (shape (3,)) is broadcast to shape (2, 3), so it is added to every row of A.

### General Broadcasting Rules

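NumPy compares shapes from right to left, treating missing leading dimensions as size 1:

| Dimension in A | Dimension in B          | Result                  |
|----------------|-------------------------|-------------------------|
| n              | n                       | OK (sizes match)        |
| 1              | n                       | the 1 is expanded to n  |
| n              | 1                       | the 1 is expanded to n  |
| m              | n (m ≠ n, neither is 1) | error                   |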
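
The rules can be sketched with a column of shape (3, 1) and a row of shape (4,). The names `column` and `row` are illustrative.

```python
import numpy as np

column = np.array([[1], [2], [3]])   # shape (3, 1)
row = np.array([10, 20, 30, 40])     # shape (4,), treated as (1, 4)

# Right to left: 1 vs 4 -> expand to 4; 3 vs 1 -> expand to 3
result = column + row

print(result.shape)         # (3, 4)
print(result[0].tolist())   # [11, 21, 31, 41]
print(result[2].tolist())   # [13, 23, 33, 43]
```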

#### Example of Broadcasting Failure

In [84]:
a = np.array([1, 2, 3])
b = np.array([10, 20])
a + b      # cannot broadcast

ValueError: operands could not be broadcast together with shapes (3,) (2,) 

Reason: the trailing dimensions are 3 and 2; they differ and neither is 1, so no expansion is possible.

### Broadcasting in ML / Data Science

Used for:
- Normalizing data  `X = (X - X.mean()) / X.std()`
- Adding bias terms
- Feature scaling
- Vectorized operations that replace Python loops
- Applying functions across rows/columns
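
In practice, normalization is often done per feature (per column) rather than over the whole array. The sketch below assumes a small made-up matrix `X` with rows as samples and columns as features; broadcasting applies the per-column mean and standard deviation to every row.

```python
import numpy as np

# 3 samples (rows), 2 features (columns) on very different scales
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

col_mean = X.mean(axis=0)   # shape (2,)
col_std = X.std(axis=0)     # shape (2,)

# (3, 2) minus (2,) broadcasts the statistics across all rows
X_scaled = (X - col_mean) / col_std

print(np.allclose(X_scaled.mean(axis=0), 0.0))   # True
print(np.allclose(X_scaled.std(axis=0), 1.0))    # True
```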

# 6. Stacking & Splitting Arrays

These functions combine or separate arrays, which is useful for:
- joining datasets
- combining features
- splitting train/test data
- reshaping data

## A. Stacking Arrays (Combining Arrays)

#### 1. np.hstack() — horizontal stack

Joins arrays side-by-side (column-wise). For 1D arrays, this simply concatenates them end to end.

In [85]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

np.hstack([a, b])

array([1, 2, 3, 4, 5, 6])

#### 2. np.vstack() — vertical stack

Joins arrays top-to-bottom (row-wise).

In [86]:
np.vstack([a, b])

array([[1, 2, 3],
       [4, 5, 6]])

#### 3. np.dstack() — depth stack

Stacks arrays along depth (3rd dimension).

Turns 2D arrays into 3D.

In [91]:
x = np.array([[1, 2, 3], [4, 5, 6]])
y = np.array([[7, 8, 9], [11, 23, 45]])

np.dstack([x, y])

array([[[ 1,  7],
        [ 2,  8],
        [ 3,  9]],

       [[ 4, 11],
        [ 5, 23],
        [ 6, 45]]])

#### 4. np.column_stack() — stack as columns

Turns 1D arrays into columns and stacks them side-by-side.

Perfect for creating datasets:

In [90]:
np.column_stack([a, b])

array([[1, 4],
       [2, 5],
       [3, 6]])

#### 5. np.concatenate() — general-purpose join

You can join arrays along any axis.

In [94]:
np.concatenate([x, y], axis=0)   # vertical

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [11, 23, 45]])

In [95]:
np.concatenate([x, y], axis=1)   # horizontal

array([[ 1,  2,  3,  7,  8,  9],
       [ 4,  5,  6, 11, 23, 45]])

## B. Splitting Arrays (Breaking Arrays Apart)

#### 1. np.hsplit() — split horizontally (columns)

Splits array column-wise.

In [100]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
np.hsplit(arr, 2)    # split into 2 parts horizontally

[array([[ 1,  2],
        [ 5,  6],
        [ 9, 10],
        [13, 14]]),
 array([[ 3,  4],
        [ 7,  8],
        [11, 12],
        [15, 16]])]

#### 2. np.vsplit() — split vertically (rows)

Splits array row-wise.

In [103]:
np.vsplit(arr, 2)    # split into 2 vertical parts

[array([[1, 2, 3, 4],
        [5, 6, 7, 8]]),
 array([[ 9, 10, 11, 12],
        [13, 14, 15, 16]])]

#### 3. np.array_split() — flexible splitting

Unlike hsplit/vsplit, it works even when the array cannot be divided into equal parts. It splits along rows (axis 0) by default, and the earlier pieces receive the extra elements.

In [105]:
np.array_split(arr, 3)

[array([[1, 2, 3, 4],
        [5, 6, 7, 8]]),
 array([[ 9, 10, 11, 12]]),
 array([[13, 14, 15, 16]])]
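
As a rough sketch of the train/test use case mentioned earlier, the example below uses `np.split` with an explicit split index. `np.split` is a related function not covered above; the 80/20 ratio and the names `data`, `train`, and `test` are assumptions for illustration.

```python
import numpy as np

# Uneven split: 10 elements into 3 parts gives sizes 4, 3, 3
parts = np.array_split(np.arange(10), 3)
print([len(p) for p in parts])   # [4, 3, 3]

# Index-based split: first 80% of the rows for training, the rest for testing
data = np.arange(20).reshape(10, 2)
split_at = int(0.8 * len(data))   # row index 8
train, test = np.split(data, [split_at])

print(train.shape)   # (8, 2)
print(test.shape)    # (2, 2)
```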

# 7. Copy vs View

When you create a new array from an existing array, NumPy may return:
- a view → shares memory with the original
- a copy → completely independent

Understanding this is critical for Data Analysis and ML to avoid accidentally modifying your data.

### View

A view is a new array object that may have a different shape or cover only part of the original, but shares the same underlying data.

In [106]:
a = np.array([1, 2, 3, 4])
b = a[1:3]      # view
b[0] = 100
print(a)

[  1 100   3   4]


- Changing b changes a
- Fast
- No extra memory used

### Copy

A copy is a completely separate array with its own memory.

In [107]:
a = np.array([1, 2, 3, 4])
b = a.copy()    # copy
b[0] = 100
print(a)

[1 2 3 4]


- Changing b does NOT change a
- Safe
- Uses extra memory

### Important: Slicing creates a VIEW

In [109]:
b = a[1:4]   # view

### .copy() creates a COPY

In [110]:
b = a[1:4].copy()
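
When it is unclear whether an array is a view or a copy, `np.shares_memory` and the `.base` attribute can tell you. The sketch below also shows that indexing with a list of positions returns a copy, unlike slicing. The variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

sliced = a[1:4]          # slicing -> view
picked = a[[1, 2, 3]]    # indexing with a list of positions -> copy
copied = a[1:4].copy()   # explicit copy

print(sliced.base is a)                # True: the view refers back to a
print(np.shares_memory(a, sliced))     # True
print(np.shares_memory(a, picked))     # False
print(np.shares_memory(a, copied))     # False
```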

# Summary

### 1. Array Creation 
- np.array() - Creates a NumPy array from a Python list (or list of lists).
- np.arange() - Creates a range of numbers with a start, stop, and step (like Python range() but returns an array).
- np.zeros() - Creates an array filled with zeros of a given shape.
- np.ones() - Creates an array filled with ones of a given shape.
- np.full() - Creates an array filled with a specific constant value.
- np.linspace() - Creates a specified number of evenly spaced values between two numbers.
- np.eye() - Creates an identity matrix (1s on diagonal, 0s elsewhere).
- np.empty() - Creates an array without initializing values (contents are arbitrary until you assign them).
- np.logspace() - Creates logarithmically spaced numbers (used in engineering/scientific graphs).

### 2. Data Types (`dtype`) 

- `arr.dtype` → checks the data type of array elements.
- `arr.astype(new_type)` → converts array to a new dtype.
- Common dtypes: `int32`, `int64`, `float32`, `float64`, `bool`, `complex`.
- dtype affects memory, speed, precision, and compatibility with ML libraries.


### 3. Shape Manipulation 

- `arr.shape` → shows array shape  
- `arr.reshape()` → changes shape without changing data  
- `arr.ravel()` → flattens array (returns a view when possible)  
- `arr.flatten()` → flattens array (returns copy)  
- `arr.T` → transpose (rows ↔ columns)  
- `np.expand_dims()` → adds new axis (dimension)  
- `np.squeeze()` → removes axes of size 1  


### 4. Mathematical & Statistical Operations 

**Elementwise Operations**
- `a + b`, `a - b`, `a * b`, `a / b`
- `np.add`, `np.subtract`, `np.multiply`, `np.divide`
- `np.power`, `np.abs`

**Matrix Multiplication**
- `A @ B` (recommended)
- `A.dot(B)`

**Statistics**
- `np.sum`, `np.mean`, `np.std`
- `np.min`, `np.max`
- `np.argmin`, `np.argmax`

**Useful Functions**
- `np.sqrt`, `np.exp`, `np.log`


### 5. Broadcasting 

- Broadcasting automatically expands smaller arrays during operations.
- Works when, comparing shapes right to left, each pair of dimensions matches or one of them is 1.
- Examples:
  - `arr + 5`
  - 2D + 1D (row/column operations)
- Used in normalization, scaling, and vectorized ML operations.
- Fails when dimensions are incompatible.


### 6. Stacking & Splitting 

**Stacking (combine arrays)**
- `np.hstack()` → stack horizontally (columns)
- `np.vstack()` → stack vertically (rows)
- `np.dstack()` → stack depth-wise (3D)
- `np.column_stack()` → convert 1D arrays to columns and stack
- `np.concatenate()` → general join along any axis

**Splitting (break arrays)**
- `np.hsplit()` → split by columns
- `np.vsplit()` → split by rows
- `np.array_split()` → split into nearly equal parts, even when the size is not evenly divisible


### 7. Copy vs View 

- **View**: shares data with original array  
  - Created by slicing  
  - Fast, memory-efficient  
  - Changes reflect in the original

- **Copy**: independent array  
  - Created using `.copy()`  
  - Safe but uses extra memory  
  - Changes do NOT affect the original