In this tutorial, we will cover:
* Conda environments
* Basic Python: Basic data types (Containers, Lists, Dictionaries, Sets, Tuples), Functions, Classes
* Numpy: Arrays, Array indexing, Datatypes, Array math, Broadcasting
* Data loading: Loading the MNIST dataset, preprocessing and transformation
* Linear regression: Basic 1-D and N-D linear regression using gradient descent

## Conda

## What is Conda?

**Conda** is an open-source package and environment management system. It allows you to install, run, and manage packages and their dependencies in isolated environments.

### Key Features:
- Environment isolation
- Cross-platform support (Windows, Linux, macOS)
- Supports packages written in any language

## Installing Conda

You can install Conda through either **Anaconda** (full distribution) or **Miniconda** (minimal installer recommended for experienced users):

To install Miniconda, follow the [official installation instructions](https://www.anaconda.com/docs/getting-started/miniconda/install#basic-install-instructions).

---

## Basic Conda Commands

```bash
# Check conda version
conda --version

# Update conda
conda update conda

# Create a new environment
conda create -n <env_name> python=3.11

# Activate an environment
conda activate <env_name>

# Deactivate an environment
conda deactivate

# List all environments
conda env list

# Delete an environment
conda remove -n <env_name> --all
```



## Example: Deep Learning Environment

Create and activate an environment specifically for deep learning:

```bash
conda create -n deep-learning python=3.11
conda activate deep-learning
```

Install deep learning libraries like PyTorch or TensorFlow:

**PyTorch:**
```bash
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
```

**TensorFlow:**
```bash
conda install tensorflow
```

You can install additional libraries like NumPy, Pandas, Matplotlib, and Jupyter:

```bash
conda install numpy pandas matplotlib jupyter
```

Alternatively, install using `pip` within a Conda environment:

```bash
pip install numpy pandas matplotlib jupyter
```

## What is Pip?

**Pip** is Python's default package manager that installs and manages Python packages from the [Python Package Index (PyPI)](https://pypi.org/).

### Key Features:
- Specifically for Python packages
- Installed with Python by default
- Does not handle multiple environments by itself (but can be combined with virtual environments like `venv`)

## Differences between Conda and Pip

| Feature                 | Conda                   | Pip                        |
|-------------------------|-------------------------|----------------------------|
| Package Management      | Cross-language          | Python only                |
| Environment Management  | Built-in                | Needs tools like `venv`    |
| Binary Packages         | Pre-built binaries      | Source-based or binaries   |
| Package Sources         | Conda channels          | PyPI repository            |
| Dependency Resolution   | Robust dependency solver| Basic dependency resolver  |


---
### Export an environment using pip


```bash
# Run inside the activated environment; `python -m pip` uses that environment's pip
python -m pip freeze > requirements.txt
```

### Install an environment using pip


```bash
conda activate <env_name>
cd <path to working folder>
pip install -r requirements.txt
```
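To confirm that Python really runs from the activated environment rather than a system-wide install, a quick check from inside Python can help. This is a minimal sketch using only the standard library; the printed paths depend on your machine, and the variable name `version` is just illustrative.

```python
import sys

# Path of the interpreter that is actually running; inside an activated
# Conda environment this normally points into that environment's folder.
print(sys.executable)

# Root directory of the active environment.
print(sys.prefix)

# Major and minor version of the running interpreter, e.g. (3, 11).
version = tuple(sys.version_info[:2])
print(version)
```

If the printed paths do not contain the environment name, the environment is probably not activated in the current shell.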

---

## Tips and Tricks

- Always create a separate Conda environment for each project.
- Use descriptive environment names.
- Regularly update your Conda and package installations.

---

Happy coding!

## Basics of Python

#### Numbers

Integers and floats work as you would expect from other languages:

In [None]:
x = 3
print(x)
print(type(x))
print(dir(x))


In [None]:
print(x.__add__(1), x+1)

In [None]:
print(f"x + 1 = {x + 1}" )   # Addition
print(f"x - 1 = {x - 1}")     # Subtraction
print(f"x * 2 = {x * 2}")     # Multiplication
print("x ** 2 = {}".format(x ** 2))   # Exponentiation


In [None]:
print(x)
x += 1
print(x)
x *= 2
print(x)


In [None]:
print(type(1), type(1.0))

In [None]:
y = 2.5
print(type(y))
print(y, y + 1, y * 2, y ** 2)

In [None]:
x = 2
print(x/2, type(x/2))
print(x//2, type(x//2), int(x/2))

Note that unlike many languages, Python does not have unary increment (x++) or decrement (x--) operators.

Python integers have arbitrary precision (Python 3 has no separate `long` type), and complex numbers are built in as well; you can find all of the details in the [documentation](https://docs.python.org/3.7/library/stdtypes.html#numeric-types-int-float-long-complex).
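As a small illustration of large integers and complex numbers, the sketch below relies only on built-in behavior: Python 3 integers grow as needed instead of overflowing, and complex literals use a `j` suffix. The names `big` and `z` are illustrative.

```python
big = 2 ** 100              # no overflow: ints have arbitrary precision
print(big, type(big))
print(big.bit_length())     # number of bits needed to represent the value: 101

z = (1 + 2j) * (3 - 1j)     # complex arithmetic is built in
print(z, type(z))           # (5+5j) <class 'complex'>
print(z.real, z.imag)       # 5.0 5.0
```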

#### Booleans

Python implements all of the usual operators for Boolean logic, but uses English words rather than symbols (`&&`, `||`, etc.):

In [None]:
t, f = True, False
print(type(t))

Now let's look at the operations:

In [None]:
print(t and f, 1 and 0)     # Logical AND;
print(t or f, 1 or 0)       # Logical OR;
print(not t, not 0, not 1)  # Logical NOT;
print(t != f, 1 != 0)               # Logical XOR;

In [None]:
print(f"True * 4 = {t * 4}")
print(f"False * 4 = {f * 4}")

### Containers

Python includes several built-in container types: lists, dictionaries, sets, and tuples.

#### Lists

A list is the Python equivalent of an array, but is resizeable and can contain elements of different types:

In [None]:
xs = [3, 1, 2]   # Create a list
print(xs, xs[2])
print(xs[-1])     # Negative indices count from the end of the list; prints "2"

In [None]:
xs[2] = 'foo'    # Lists can contain elements of different types
print(xs)

In [None]:
xs[0] = lambda x: x**2
xs[0](2)

In [None]:
xs.append('bar') # Add a new element to the end of the list
print(xs)

In [None]:
x = xs.pop()     # Remove and return the last element of the list
print(x, xs)

In [None]:
print(xs*2)

As usual, you can find all the gory details about lists in the [documentation](https://docs.python.org/3.7/tutorial/datastructures.html#more-on-lists).

#### Slicing

In addition to accessing list elements one at a time, Python provides concise syntax to access sublists; this is known as slicing:

In [None]:
r = range(5)
print(f"range: {r}, type: {type(r)}")


nums = list(r)    # range() returns a lazy range object; list() turns it into a list of integers
print(f"nums: {nums}, type: {type(nums)}")

In Python’s slice syntax  
```
sequence[start:stop:step]
```  
each field maps positionally to a “slice object” that tells the interpreter how to walk the sequence and collect elements. Internally, the process goes like this:

1. **Parsing the fields**  
   - The text before the first colon (`:`) is taken as **start**.  
   - The text between the first and second colon is **stop** (sometimes called “end”).  
   - The text after the second colon is **step**.  
   If you omit a field, Python substitutes its default (see below).

2. **Applying defaults**  
   - If **start** is omitted, it defaults to `0` when `step` is positive (or to `len(sequence)-1` when `step` is negative).  
   - If **stop** is omitted, it defaults to `len(sequence)` when `step` is positive (or to `-len(sequence)-1` when `step` is negative).  
   - If **step** is omitted, it defaults to `1`.

3. **Normalizing negative indices**  
   Any negative value in **start** or **stop** is effectively shifted by adding `len(sequence)`. For example, a start of `-2` becomes `len(sequence)-2`.  

4. **Element selection rule**  
   Having resolved concrete integers for `start`, `stop`, and `step`, Python generates elements at indices  
   ```
   start, start + step, start + 2⋅step, …
   ```  
   stopping as soon as the next index would overshoot the “stop” boundary—**stop** is always exclusive.

5. **Directionality**  
   A positive **step** walks forward (low to high indices); a negative **step** walks backward (high to low indices). The defaults and boundary checks adapt accordingly, so you can, for instance, reverse a sequence by using `step = -1` without needing to swap start or stop manually.

In short, the three slice arguments map exactly in order to (1) where to begin, (2) where to end (exclusive), and (3) how to increment (or decrement) between elements. Omitted arguments pick sensible defaults, and negative values are shifted relative to the sequence’s length before slicing begins.
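The rules above can be inspected directly: the bracket syntax builds a built-in `slice` object, and its `indices(length)` method reports the concrete `(start, stop, step)` values Python resolves for a sequence of that length. A minimal sketch, with `seq` and `s` as illustrative names:

```python
seq = list(range(5))                     # [0, 1, 2, 3, 4]

# seq[1:4] and seq[slice(1, 4)] are the same operation.
s = slice(1, 4)
print(seq[s] == seq[1:4])                # True

# Omitted fields are None; indices() fills in the defaults for length 5.
print(slice(None, None, 2).indices(5))   # (0, 5, 2)
print(slice(-2, None).indices(5))        # (3, 5, 1)

# With a negative step, the resolved stop of -1 means "one position before
# index 0", so the walk includes index 0.
print(slice(None, None, -1).indices(5))  # (4, -1, -1)
```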

In [None]:
print(nums[2:4])    # Get a slice from index 2 to 4 (exclusive); prints "[2, 3]"
print(nums[2:])     # Get a slice from index 2 to the end; prints "[2, 3, 4]"
print(nums[:2])     # Get a slice from the start to index 2 (exclusive); prints "[0, 1]"
print(nums[:])      # Get a slice of the whole list; prints "[0, 1, 2, 3, 4]"
print(nums[:-1])    # Slice indices can be negative; prints "[0, 1, 2, 3]"
nums[2:4] = [8, 9]  # Assign a new sublist to a slice
print(nums)         # Prints "[0, 1, 8, 9, 4]"
print(nums[::2])    # Every second element from the start (even indices); prints "[0, 8, 4]"
print(nums[::-1])   # Reverse the list; prints "[4, 9, 8, 1, 0]"
print(nums[::-2])   # Every second element walking backward from the end; prints "[4, 8, 0]"

#### List comprehensions

When programming, frequently we want to transform one type of data into another. As a simple example, consider the following code that computes square numbers:

In [None]:
nums = [0, 1, 2, 3, 4]
squares = []
for x in nums:
    squares.append(x ** 2)
print(squares)

You can make this code simpler using a list comprehension:

In [None]:
nums = [0, 1, 2, 3, 4]
squares = [x ** 2 for x in nums]
print(squares)

You can make this code even simpler by combining a list comprehension with the built-in `range` function:

In [None]:
squares = [x ** 2 for x in range(5)]
print(squares)

In [None]:
import time

size = int(1e7)
# Loop
x = list(range(size))
start = time.time()
squares = []
for i in x:
    squares.append(i ** 2)
end = time.time()
t1 = end - start
print(f"loop: {t1:>15.3f}", "top5:", squares[:5])

# comprehensions
start = time.time()
squares = [i ** 2 for i in range(size)]
end = time.time()
t2 =  end - start
print(f"comprehensions: {t2:<.3f}", "top5:", squares[:5])



List comprehensions can also contain conditions:

In [None]:
even_squares = [x ** 2 for x in range(5) if x % 2 == 0]
print(even_squares)

In [None]:
even_squares = [x ** 2  if x % 2 == 0 else None for x in range(5)]
print(even_squares)
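The two conditional forms behave differently, which is easy to miss. A trailing `if` filters items out, while a conditional expression placed before `for` maps every item to some output. The sketch below puts them side by side; the names `values`, `filtered`, and `mapped` are illustrative.

```python
values = range(5)

# Trailing `if` filters: items that fail the test are dropped.
filtered = [x ** 2 for x in values if x % 2 == 0]

# Conditional expression before `for` maps: every item yields an output.
mapped = [x ** 2 if x % 2 == 0 else None for x in values]

print(filtered, len(filtered))   # [0, 4, 16] 3
print(mapped, len(mapped))       # [0, None, 4, None, 16] 5
```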

#### Tuples

A **tuple** in Python is an **ordered**, **immutable** sequence of items. Unlike lists, once created a tuple’s contents cannot be changed (no additions, removals, or item assignments). Tuples are typically written with items separated by commas and enclosed in parentheses, for example `(1, 'a', 2.5)`. Their key properties are:

- **Ordered**: items have a defined position, so you can access elements by index  
- **Immutable**: once defined, you cannot modify its length or contents  
- **Heterogeneous**: elements of different types can coexist (e.g., integers, strings, objects)  
- **Lightweight**: slightly more efficient than lists for fixed collections of data  

Common uses include returning multiple values from a function, grouping related but different items together, or using as keys in dictionaries (since they’re hashable).
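A short sketch of these common uses follows. The helper `min_max` and the dictionary `grid` are hypothetical names chosen for illustration, not part of any library.

```python
def min_max(values):
    """Return the smallest and largest value packed into one tuple."""
    return min(values), max(values)

lo, hi = min_max([3, 1, 2])      # tuple unpacking into two names
print(lo, hi)                    # 1 3

# Tuples are hashable (as long as their items are), so they work as dict keys.
grid = {(0, 0): "origin", (1, 2): "point"}
print(grid[(1, 2)])              # point

# Lists are mutable and therefore not hashable.
try:
    bad = {[0, 0]: "origin"}
except TypeError as err:
    print("TypeError:", err)
```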

In [None]:
t = (0, 1)
print(f"t: {t}, type: {type(t)}")

In [None]:
t[0] = 1  # Raises TypeError: tuples are immutable

In [None]:
d = {(x, x + 1): x for x in range(10)}  # Create a dictionary with tuple keys
t = (5, 6)       # Create a tuple
print(f"d: {d}")
print(f"t: {t}, d[t]: {d[t]}")

## Numpy

Numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays. If you are already familiar with MATLAB, you might find this [tutorial](http://wiki.scipy.org/NumPy_for_Matlab_Users) useful to get started with Numpy.

To use Numpy, we first need to import the `numpy` package:

In [None]:
import numpy as np
from sklearn.datasets import fetch_openml
import matplotlib.pyplot as plt
import tensorflow_datasets as tfds

### Arrays

A **NumPy array** (or **ndarray**) is an **N‑dimensional**, **homogeneous**, **mutable** container for elements of a single data type, stored in a contiguous block of memory. It forms the core data structure for numerical and scientific computing in Python.

- **N‑dimensional**: supports vectors, matrices, and higher‑order tensors via axes (shape tuple)  
- **Homogeneous**: all elements share the same data type (e.g., `float64`), enabling highly optimized storage and computation  
- **Mutable**: individual elements or entire slices can be reassigned in place  
- **Vectorized operations**: arithmetic, logical, and mathematical functions apply element‑wise without explicit Python loops  
- **Broadcasting**: automatically expands smaller arrays’ shapes to match larger ones in arithmetic operations  
- **Memory‑efficient**: backed by contiguous C arrays, minimizing overhead compared to Python lists  

**Common uses** include linear algebra, statistical analysis, image processing, and any workload requiring fast, large‑scale numerical computations.
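To make the homogeneity and memory points concrete, here is a minimal sketch. The array names are illustrative, and the integer dtype chosen by default can differ between platforms, so only the float case is annotated.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])         # ints are upcast so all elements share one dtype

print(ints.dtype, mixed.dtype)        # mixed is float64
print(mixed.itemsize, mixed.nbytes)   # 8 bytes per element, 24 bytes in total

# Vectorized arithmetic: the multiplication runs over every element
# without an explicit Python loop.
doubled = mixed * 2
print(doubled)
```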

We can initialize numpy arrays from nested Python lists, and access elements using square brackets:

In [None]:
a = np.array([1, 2, 3],)  # Create a rank 1 array
print(f"type: {type(a)}, shape: {a.shape}, a: {a}, dtype: {a.dtype}")



In [None]:
a = np.array([1, 2, 3], dtype=np.float16)  # Create a rank 1 array with an explicit dtype
print(f"type: {type(a)}, shape: {a.shape}, a: {a}, dtype: {a.dtype}")



In [None]:
a[0] = 5                 # Change an element of the array
print(a)

In [None]:
b = np.array([[1,2,3],[4,5,6]])   # Create a rank 2 array
print(f"type: {type(b)}, shape: {b.shape}, dtype: {b.dtype}")
print(f"{b}")


Numpy also provides many functions to create arrays:

In [None]:
a = np.zeros((2,2))  # Create an array of all zeros
print(f"a: zeros\n{a}", end="\n\n")

b = np.ones((1,2))   # Create an array of all ones
print(f"b: ones\n{b}", end="\n\n")

c = np.full((2,2), 7) # Create a constant array
print(f"c: full\n{c}", end="\n\n")


d = np.eye(3)        # Create a 3x3 identity matrix
print(f"d: identity\n{d}", end="\n\n")

e = np.random.random((2,2)) # Create an array filled with random values
print(f"e: random\n{e}", end="\n\n")

### Array indexing and slicing

In [None]:
# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
print("a: ")
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a)

**Array Indexing in NumPy**

NumPy provides **multiple ways** to access array elements. One of the most common is **slicing**, which works similarly to Python lists. However, because arrays can be **multidimensional**, you supply one index or slice **per axis**; any trailing axis you leave out is treated as a full slice (`:`). Mixing an integer index with slices yields an array of lower rank, while using only slices preserves the rank:


In [None]:
# slicing rows
row_r1 = a[1, :]
row_r2 = a[1:2, :]
row_r3 = a[[1], :]

print(row_r1, row_r1.shape)
print()
print(row_r2, row_r2.shape)
print()
print(row_r3, row_r3.shape)

In [None]:
# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print(col_r1, col_r1.shape)
print()
print(col_r2, col_r2.shape)

**Array Slicing: Views vs. Copies**

When you take a slice of a NumPy array, you get a **view** into the original data rather than an independent copy. This means that **modifying the slice will also modify the original array**.


In [None]:
b = a[:2, 1:3]  # Slice out rows 0-1 and columns 1-2; b is a view into a
print(a[0, 1])
b[0, 0] = 77    # b[0, 0] is the same piece of data as a[0, 1]
print(a[0, 1])
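When an independent array is needed, `.copy()` breaks the link to the original, and `np.shares_memory` reports whether two arrays overlap in memory. A minimal sketch, with `base`, `view`, and `indep` as illustrative names:

```python
import numpy as np

base = np.arange(12).reshape(3, 4)

view = base[:2, 1:3]           # basic slicing returns a view
indep = base[:2, 1:3].copy()   # explicit, independent copy

print(np.shares_memory(base, view))    # True
print(np.shares_memory(base, indep))   # False

view[0, 0] = 77                # writes through to base[0, 1]
indep[0, 1] = -1               # leaves base untouched
print(base[0, 1], base[0, 2])  # 77 2
```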

**Integer Array Indexing**

While **slicing** always returns a **view** that is a contiguous subarray, **integer array indexing** uses integer arrays to select elements at **arbitrary positions**, producing a **new array** with any shape you choose. Since it creates a copy, **modifying the result does _not_ affect the original array**.


In [None]:
# 2D arrays
a = np.array([[1,2], [3, 4], [5, 6]])
print(f"shape: {a.shape}")
print()

print(a[[0, 1, 2,], [0, 1, 0]])
print(np.array([a[0, 0], a[1, 1], a[2, 0]]))

In [None]:
# 3D arrays
a = arr_rand = np.random.rand(28, 28, 3)
print(f"shape: {a.shape}")
print()

print(a[[0, 1, 2], [2, 1, 0], [0, 0, 0]])
print(np.array([a[0, 2, 0], a[1, 1, 0], a[2, 0, 0]]))

In [None]:
plt.imshow(a)
plt.show()

In [None]:
import torchvision
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True)

npimg = np.array(testset[1][0])
print(npimg.shape)
plt.imshow(npimg)
plt.show()

One useful trick with integer array indexing is selecting or mutating one element from each row of a matrix:

In [None]:
# Create a new array from which we will select elements
a = np.array([[1,2,3, 4], [5,6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
print(a)

In [None]:
# Create an array of indices
b = np.array(range(4))

# Select one element from each row of a using the indices in b
print(a[np.arange(4), b])  # Prints elements on main diagonal

In [None]:
# Mutate one element from each row of a using the indices in b
a[np.arange(4), b] **= 2
print(a)

### Conditional (Boolean) Indexing

NumPy lets you filter elements using a **boolean mask**—an array of `True`/`False` values that matches the shape of the original. When you write:

```
filtered = arr[arr > 0]
```

- The expression `arr > 0` produces a **boolean mask**.  
- Indexing `arr` with that mask returns a **1D array** of all elements where the mask is `True`.  
- Because this operation creates a **copy**, **modifying** `filtered` **does not** affect the original `arr`.  

In [None]:
import numpy as np

arr = np.array([[1,2], [3, 4], [5, 6]])

filtered  = (arr % 2) == 0
print(f"filtered:\n {filtered}")
print()

print(f"arr filtered: {arr[filtered]}")
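Selecting with a mask and assigning through a mask behave differently: the selection is a copy, while the assignment edits the array in place. The sketch below uses a separate array named `demo` (an illustrative name) so it does not disturb `arr`.

```python
import numpy as np

demo = np.array([[1, 2], [3, 4], [5, 6]])
mask = (demo % 2) == 0

evens = demo[mask]     # copy of the selected elements, always 1-D
evens[0] = 100         # does not touch demo
print(demo[0, 1])      # 2

demo[mask] = 0         # assignment through the mask edits demo in place
print(demo)            # the even entries are now 0
```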

### Conditional Indexing with `np.where`

NumPy’s `where` function provides two complementary ways to work with boolean conditions:

1. **Finding indices where a condition is true**  
   - Calling `indices = np.where(condition)` returns a **tuple of index arrays** (one per dimension) indicating where `condition` holds.  
   - You can then use these indices to select or assign values in the original array:  



In [None]:
idx = np.where((arr % 2) == 1)       # tuple of arrays like (row_idx, col_idx, ...)
odd = arr[idx]
print(f"arr.shape: {arr.shape}, idx: {idx}")
print()
print(f"odd: {odd}")

2. **Selecting between two arrays (or values) element‑wise**  
   - Calling `result = np.where(condition, x, y)` constructs a **new array** by choosing elements from `x` wherever `condition` is `True`, and from `y` wherever it is `False`.  
   - The shapes of `condition`, `x`, and `y` must be broadcast‑compatible, and the result has the same shape as `condition`.  




In [None]:
arr_1 = np.where((arr % 2) == 0, arr, 0)  # Keep even entries, replace odd ones with 0
arr_1

**Key points**  
- `np.where(condition)` → **indices** of `True` entries (useful for indexing or assignment).  
- `np.where(condition, x, y)` → **element‑wise selection**, producing a brand‑new array.  
- In both cases, the original array is **not modified** unless you explicitly assign back using the indices.  
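The two forms can be compared on one small array. This is a sketch with made-up values; `temps`, `rows`, `cols`, and `clipped` are illustrative names.

```python
import numpy as np

temps = np.array([[-3, 5], [12, -8]])

# One-argument form: indices of the True entries, one array per axis.
rows, cols = np.where(temps < 0)
print(rows, cols)                      # [0 1] [0 1]

# Three-argument form: a new array built element by element.
clipped = np.where(temps < 0, 0, temps)
print(clipped)

# The original stays unchanged until we assign through the indices.
temps[rows, cols] = 0
print(np.array_equal(temps, clipped))  # True
```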

### Datatypes

Every numpy array is a grid of elements of the same type. Numpy provides a large set of numeric datatypes that you can use to construct arrays. Numpy tries to guess a datatype when you create an array, but functions that construct arrays usually also include an optional argument to explicitly specify the datatype. Here is an example:

In [None]:
x = np.array([1, 2])  # Let numpy choose the datatype
y = np.array([1.0, 2.0])  # Let numpy choose the datatype
z = np.array([1, 2], dtype=np.int64)  # Force a particular datatype

print(x.dtype, y.dtype, z.dtype)

You can read all about numpy datatypes in the [documentation](http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html).
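Two dtype behaviors are worth seeing once: explicit conversion with `astype`, and automatic promotion when arrays of different types meet in arithmetic. A minimal sketch with illustrative names:

```python
import numpy as np

f = np.array([1.7, -1.7, 2.0])
i = f.astype(np.int64)         # float -> int conversion truncates toward zero
print(i, i.dtype)              # values 1, -1, 2 with dtype int64

# Mixing dtypes promotes to a type that can hold both operands.
promoted = np.array([1, 2], dtype=np.int32) + np.array([0.5, 0.5], dtype=np.float32)
print(promoted.dtype)          # float64
```

Note that `astype` always returns a new array; the original `f` keeps its `float64` values.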

### Array math

Basic mathematical functions operate elementwise on arrays, and are available both as operator overloads and as functions in the numpy module:

In [None]:
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum; both produce the array
# [[ 6.  8.]
#  [10. 12.]]
print(x + y)
print(np.add(x, y))

In [None]:
# Elementwise difference; both produce the array
# [[-4. -4.]
#  [-4. -4.]]
print(x - y)
print(np.subtract(x, y))

In [None]:
# Elementwise product; both produce the array
# [[ 5. 12.]
#  [21. 32.]]
print(x * y)
print(np.multiply(x, y))

In [None]:
# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print(np.divide(x, y))

In [None]:
print(np.sqrt(x))

Note that unlike MATLAB, `*` is elementwise multiplication, not matrix multiplication. We instead use the dot function to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices. dot is available both as a function in the numpy module and as an instance method of array objects:

In [None]:
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))

You can also use the `@` operator (`np.matmul`), which gives the same result as `dot` for 1-D and 2-D arrays.

In [None]:
print(v @ w)
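For arrays with more than two dimensions, `@` and `np.dot` are no longer interchangeable. The sketch below compares the output shapes on stacks of matrices; the array names are illustrative.

```python
import numpy as np

A = np.ones((2, 3, 4))
B = np.ones((2, 4, 5))

# `@` (np.matmul) treats the leading axis as a batch of matrices.
print((A @ B).shape)         # (2, 3, 5)

# np.dot pairs every matrix in A with every matrix in B.
print(np.dot(A, B).shape)    # (2, 3, 2, 5)

# For 1-D and 2-D inputs the two agree.
M = np.array([[1, 2], [3, 4]])
u = np.array([9, 10])
print(np.array_equal(M @ u, np.dot(M, u)))  # True
```

For batched data, as is common in deep learning, `@` is usually the one you want.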

In [None]:
# Matrix / vector product; all three produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))
print(x @ v)

In [None]:
# Matrix / matrix product; all three produce the rank 2 array
# [[19 22]
#  [43 50]]
print(x.dot(y))
print(np.dot(x, y))
print(x @ y)

### Reduction Operations in NumPy



NumPy offers a set of **reduction functions** that aggregate values across one or more axes of an array.


#### 1. Descriptive Statistics
These functions compute simple statistical summaries:

- **`np.sum`** – Total sum of all elements  
- **`np.mean`** – Arithmetic mean  
- **`np.min`** / **`np.max`** – Minimum or maximum value  



#### 2. Index of Extremes: `argmin` and `argmax`

- **`np.argmax`** / **`np.argmin`** return the **index** of the maximum or minimum value along a given axis.  
- This is useful when you care not just about the value, but **where** it occurred in the array.  
- Example: To find which student scored highest in Subject 1 in a `scores` array with one row per student and one column per subject, use `scores[:, 0].argmax()`.



#### 3. Functional Reductions: `reduce` and `accumulate`

NumPy also supports functional-style reductions via:

- **`np.add.reduce(array)`** – Applies the operation cumulatively and returns the **final result** only  
- **`np.add.accumulate(array)`** – Returns an array of all **intermediate results**, showing the cumulative process  

This pattern is available for operations like `add`, `multiply`, `logical_and`, etc., and is especially useful for **step-by-step transformations** or **progressive aggregation**.



#### Controlling Dimensions

- Use the **`axis`** parameter to specify the direction of reduction.  
  For example:  
  - `axis=0` reduces **down columns**  
  - `axis=1` reduces **across rows**
- Add **`keepdims=True`** to retain reduced axes with size 1, which helps preserve dimensionality and simplifies broadcasting.



#### Controlling Data Type

- The **`dtype`** parameter controls the output data type.
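One place where `dtype` matters is integer overflow. The sketch below sums two 8-bit values with the default accumulator and with a forced 8-bit accumulator; `pixels` is an illustrative name.

```python
import numpy as np

pixels = np.array([200, 100], dtype=np.uint8)

# By default, sum() accumulates small integer types in a wider integer.
total = pixels.sum()
print(total)                         # 300

# Forcing an 8-bit accumulator makes the total wrap around modulo 256.
wrapped = pixels.sum(dtype=np.uint8)
print(wrapped)                       # 44
```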




> 🔍 **Note:** Reduction functions **return new arrays** and do **not modify** the original input. Internally, they use optimized low-level code for high performance.


In [None]:
import numpy as np

# Example data: scores of 5 students across 3 subjects
scores = np.array([
    [88, 92, 80],
    [75, 85, 89],
    [91, 90, 77],
    [83, 78, 85],
    [80, 88, 92]
])

print("Original scores:\n", scores)

print("\n### 1. Descriptive Statistics (with axis)")
print("Sum-> total:", scores.sum(),       " | axis=0 (per subject):", scores.sum(axis=0),       "| axis=1 (per student):", scores.sum(axis=1))
print("Mean-> total:", scores.mean(),      " | axis=0 (per subject):", scores.mean(axis=0),      "| axis=1 (per student):", scores.mean(axis=1))
print("Min-> total:", scores.min(),       " | axis=0 (per subject):", scores.min(axis=0),       "| axis=1 (per student):", scores.min(axis=1))
print("Max-> total:", scores.max(),       " | axis=0 (per subject):", scores.max(axis=0),       "| axis=1 (per student):", scores.max(axis=1))




In [None]:
print("\n### 2. Index of Extremes (argmax/argmin)")
print("Argmax-> overall:", scores.argmax(), "axis=0 (which student per subject):", scores.argmax(axis=0), "| axis=1 (which subject per student):", scores.argmax(axis=1))
print("Argmin-> overall:", scores.argmin(), "axis=0 (which student per subject):", scores.argmin(axis=0), "| axis=1 (which subject per student):", scores.argmin(axis=1))



In [None]:
print("\n### 3. Reduce and Accumulate (1D array)")
arr = np.array([1, 2, 3, 4])
print("Array:", arr)
print("Reduce (sum):", np.add.reduce(arr))
print("Accumulate (sum):", np.add.accumulate(arr))
print("Reduce (product):", np.multiply.reduce(arr))
print("Accumulate (product):", np.multiply.accumulate(arr))




In [None]:
print("\n### 4. keepdims=True Example")
print("Mean with keepdims=True:\n", np.mean(scores, keepdims=True))
print("Mean with axis=0, keepdims=True:\n", np.mean(scores, axis=0, keepdims=True))
print("Mean with axis=1, keepdims=True:\n", np.mean(scores, axis=1, keepdims=True))



You can find the full list of mathematical functions provided by numpy in the [documentation](http://docs.scipy.org/doc/numpy/reference/routines.math.html).

Apart from computing mathematical functions using arrays, we frequently need to reshape or otherwise manipulate data in arrays. The simplest example of this type of operation is transposing a matrix; to transpose a matrix, simply use the T attribute of an array object:

In [None]:
print(x)
print("transpose\n", x.T)

In [None]:
v = np.array([[1,2,3]])
print(v )
print("transpose\n", v.T)

In [None]:
a = np.random.random((28, 28, 3))
print(a.shape)
print(a.transpose((0, 1, 2)).shape)


(28, 28, 3)
(28, 28, 3)


In [None]:
print(a.transpose((0, 2, 1)).shape)

(28, 3, 28)


In [None]:
print(a.transpose((2, 0, 1)).shape)
print(a.transpose((2, 1, 0)).shape)

(3, 28, 28)
(3, 28, 28)


### Broadcasting



Broadcasting is NumPy’s way of performing operations between arrays of **different shapes**, by **automatically expanding** dimensions where needed—without copying data.

Rules of broadcasting:
- Compare shapes **from the end**.
- Dimensions must be either **equal** or **1**.
- NumPy **expands** dimensions with size 1 to match the other array.

Common patterns:
- Add a 1D array to each row or column of a 2D array
- Multiply a column vector and row vector to create a matrix
- Apply scalar operations to an entire array
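The rules can be checked without building any arrays by using `np.broadcast_shapes`, which is available in NumPy 1.20 and later (an assumption about your installed version). Shapes are aligned from the right, and missing leading axes count as size 1.

```python
import numpy as np

print(np.broadcast_shapes((3,), (3, 1)))        # (3, 3)
print(np.broadcast_shapes((4, 3), (3,)))        # (4, 3)
print(np.broadcast_shapes((28, 28, 3), (3,)))   # (28, 28, 3)

# (4, 3) and (4,) clash: the trailing sizes 3 and 4 differ and neither is 1.
try:
    np.broadcast_shapes((4, 3), (4,))
    compatible = True
except ValueError as err:
    compatible = False
    print("ValueError:", err)

# Adding an axis of size 1 makes the shapes compatible.
print(np.broadcast_shapes((4, 3), (4, 1)))      # (4, 3)
```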





In [None]:
# Original 1D and 2D arrays
a = np.array([1, 2, 3])              # shape (3,)
b = np.array([[10], [20], [30]])     # shape (3, 1)
c = np.array([[1, 2, 3]])            # shape (1, 3)

print("Original arrays:")
print("a (shape):", a.shape, "→", a)
print("b (shape):", b.shape, "→\n", b)
print("c (shape):", c.shape, "→", c)

print("\n### Broadcasting Examples")

# 1. Broadcasting a 1D array across rows of a 2D array
print("Add a (1D) to b (column vector):")
result1 = b + a
print("Result shape:", result1.shape)
print(result1)

# 2. Broadcasting across rows using a (1, 3) row vector
print("\nMultiply b (3×1) with c (1×3):")
result2 = b * c
print("Result shape:", result2.shape)
print(result2)

# 3. Broadcasting scalar
print("\nAdd scalar to array:")
print("a + 10:", a + 10)

print("\n### Reshaping Examples")

arr = np.arange(12)
print("Original flat array:", arr, "| shape:", arr.shape)



Original arrays:
a (shape): (3,) → [1 2 3]
b (shape): (3, 1) →
 [[10]
 [20]
 [30]]
c (shape): (1, 3) → [[1 2 3]]

### Broadcasting Examples
Add a (1D) to b (column vector):
Result shape: (3, 3)
[[11 12 13]
 [21 22 23]
 [31 32 33]]

Multiply b (3×1) with c (1×3):
Result shape: (3, 3)
[[10 20 30]
 [20 40 60]
 [30 60 90]]

Add scalar to array:
a + 10: [11 12 13]

### Reshaping Examples
Original flat array: [ 0  1  2  3  4  5  6  7  8  9 10 11] | shape: (12,)


### Reshaping

Use `.reshape()` to **change the shape** of an array without changing its data.

Key features:
- You must **preserve the total number of elements**
- You can use **`-1`** to let NumPy **infer one dimension**
- Useful for preparing data for ML models or aligning arrays for broadcasting

Examples:
- `arr.reshape(3, 4)` turns a 1D array into a 3×4 matrix
- `arr.reshape(2, -1)` auto-computes the second dimension
- `arr.reshape(-1)` flattens to 1D

In [None]:
# 1. Reshape to 3x4
arr = np.arange(12)
arr_reshaped_3x4 = arr.reshape(3, 4)
print(f"Reshaped from {arr.shape} to {arr_reshaped_3x4.shape}:\n", arr_reshaped_3x4)
print()
# 2. Reshape to 2x2x3
arr_reshaped_2x2x3 = arr.reshape(2, 2, 3)
print(f"Reshaped from {arr.shape} to {arr_reshaped_2x2x3.shape}:\n", arr_reshaped_2x2x3)
print()

# 3. Reshape 2x2x3 to 3x2x2
arr_reshaped_3x2x2 = arr_reshaped_2x2x3.reshape(3, 2, 2)
print(f"Reshaped from {arr_reshaped_2x2x3.shape} to {arr_reshaped_3x2x2.shape}:\n", arr_reshaped_3x2x2)
print()

# 4. Automatically infer one dimension with -1
arr_reshaped_auto = arr.reshape(2, -1)
print("Reshaped to 2×? (-1 infers 6):\n", arr_reshaped_auto)
print()

# 5. Flattening (reshape back to 1D)
arr_flattened = arr_reshaped_3x4.reshape(-1)
print("Flattened back to 1D:", arr_flattened)
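Two details are easy to overlook: for a contiguous array, `reshape` returns a view rather than a copy, and a shape that does not match the element count raises an error. A minimal sketch, with `flat` and `grid` as illustrative names:

```python
import numpy as np

flat = np.arange(12)
grid = flat.reshape(3, 4)

# The reshaped array shares memory with the original here.
print(np.shares_memory(flat, grid))   # True
grid[0, 0] = 99
print(flat[0])                        # 99

# The element count must match: 12 elements cannot be split into 5 rows.
try:
    flat.reshape(5, -1)
    reshaped_ok = True
except ValueError as err:
    reshaped_ok = False
    print("ValueError:", err)
```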

### Use cases for broadcasting

##### **1. Add vector to matrix**

In [None]:
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.arange(1,13).reshape(4, 3)
v = np.array([1, 2, 3])
y = np.empty_like(x)   # Create an empty matrix with the same shape as x

# Add the vector v to each row of the matrix x with an explicit loop
for i in range(4):
    y[i, :] = x[i, :] + v


print(y)

This works; however when the matrix `x` is very large, computing an explicit loop in Python could be slow. Note that adding the vector v to each row of the matrix `x` is equivalent to forming a matrix `vv` by stacking multiple copies of `v` vertically, then performing elementwise summation of `x` and `vv`. We could implement this approach like this:

In [None]:
vv = np.tile(v, (4, 1))  # Stack 4 copies of v on top of each other
print(vv)                # Prints "[[1 2 3]
                         #          [1 2 3]
                         #          [1 2 3]
                         #          [1 2 3]]"

In [None]:
y = x + vv  # Add x and vv elementwise
print(y)

Numpy broadcasting allows us to perform this computation without actually creating multiple copies of v. Consider this version, using broadcasting:

In [None]:
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
v = np.array([1, 2, 3])
y = x + v  # Add v to each row of x using broadcasting
print(y)
print()

v = np.array([[1, 2, 3, 4]]).T  # Column vector with shape (4, 1)
y = x + v  # Add v to each column of x using broadcasting
print(y)

##### **2. Normalize a NumPy matrix per row and column**

In [None]:
# Create a random matrix: 5 rows (samples), 3 columns (features)
X = np.random.randn(5, 3) * 10  # Normally distributed, not yet normalized

print("Means per row:", X.mean(axis=1), "| Means per column:", X.mean(axis=0))
print("Std per row:", X.std(axis=1), "| Std per column:", X.std(axis=0))

In [None]:
# 1. Normalize each row to mean=0 and std=1
row_mean = X.mean(axis=1, keepdims=True)        # shape: (5, 1)
row_std = X.std(axis=1, keepdims=True)          # shape: (5, 1)
X_row_norm = (X - row_mean) / row_std           # broadcasting across columns

print("Means per row:", X_row_norm.mean(axis=1))
print("Stds per row:", X_row_norm.std(axis=1))


In [None]:
# 2. Normalize each column of the result to mean=0 and std=1
col_mean = X_row_norm.mean(axis=0, keepdims=True)   # shape: (1, 3)
col_std = X_row_norm.std(axis=0, keepdims=True)     # shape: (1, 3)
X_col_norm = (X_row_norm - col_mean) / col_std      # broadcasting across rows

print("Means per column:", X_col_norm.mean(axis=0))
print("Stds per column:", X_col_norm.std(axis=0))

Broadcasting typically makes your code more concise and faster, so you should strive to use it where possible.

### Array arithmetic

In [None]:
# Element-wise Multiplication
a = np.array([1, 2, 3])
b = np.array([10])
c = np.array([[2], [3]])
d = np.array([[2, 3, 4]])
e = np.array([[1], [2], [3]])
r1 = a * a      # (3, ) * (3, )
r2 = a * b      # (3, ) * (1, )
r3 = a * c      # (3, ) * (2, 1)
r4 = a *d       # (3, ) * (1, 3)
r5 = a * e      # (3, ) * (3, 1)

print(f"r1: {a.shape} x {a.shape}\n", r1, "new shape:", r1.shape, end="\n\n")
print(f"r2: {a.shape} x {b.shape}\n", r2, "new shape:", r2.shape, end="\n\n")
print(f"r3: {a.shape} x {c.shape}\n", r3, "new shape:", r3.shape, end="\n\n")
print(f"r4: {a.shape} x {d.shape}\n", r4, "new shape:", r4.shape, end="\n\n")
print(f"r5: {a.shape} x {e.shape}\n", r5, "new shape:", r5.shape, end="\n\n")


#### Dot Product

In [None]:
# Using np.dot
a = np.array([[1, 2, 3]])
b = np.array([[10]])
c = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])


r1 = np.dot(a.T, a)
r2 = np.dot(b, a)
r3 = np.dot(a, c)
print(f"r1: {a.T.shape} x {a.shape}\n", r1, "new shape:", r1.shape, end="\n\n")
print(f"r2: {b.shape} x {a.shape}\n", r2, "new shape:", r2.shape, end="\n\n")
print(f"r3: {a.shape} x {c.shape}\n", r3, "new shape:", r3.shape, end="\n\n")
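
The difference between the matrix product and element-wise multiplication can be made explicit with a small sketch. It uses the `@` operator, which for 2-D arrays computes the same product as `np.dot`; the variable names `inner`, `outer` and `elementwise` are chosen here for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3]])                         # shape (1, 3)
c = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])   # shape (3, 3)

# For 2-D arrays, a @ c and np.dot(a, c) give the same matrix product
same = np.array_equal(a @ c, np.dot(a, c))

inner = a @ a.T        # (1, 3) @ (3, 1) -> (1, 1): sum of squares
outer = a.T @ a        # (3, 1) @ (1, 3) -> (3, 3): all pairwise products
elementwise = a * a    # equal shapes, multiplied entry by entry -> (1, 3)

print(same, inner.shape, outer.shape, elementwise.shape)
```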





#### Timing

In [None]:
# Large vectors
size = int(1e7)
a = np.random.rand(size)
b = np.random.rand(size)

# Vectorized
start = time.time()
c = a * b
end = time.time()
t1 = end - start
print(f"Vectorized time: {t1:<14.3f}", "top5:", c[:5])

# Loop (slower)
start = time.time()
c_loop = np.zeros_like(a)
for i in range(size):
    c_loop[i] = a[i] * b[i]
end = time.time()
t2 = end - start
print(f"Loop time: {t2:<20.3f}", "top5:", c_loop[:5])
print(f"The loop is about {t2 / t1:.0f}x slower")
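
A single `time.time()` measurement can be noisy. As a hedged alternative, the sketch below repeats each measurement with the standard-library `timeit` module and keeps the best run. The vector size is reduced so that the Python loop finishes quickly, and the function names are illustrative.

```python
import timeit
import numpy as np

size = 100_000  # smaller than in the cell above so the loop stays fast
rng = np.random.default_rng(0)
a = rng.random(size)
b = rng.random(size)

def multiply_vectorized():
    return a * b

def multiply_loop():
    out = np.zeros_like(a)
    for i in range(size):
        out[i] = a[i] * b[i]
    return out

# Keep the best of several repeats to reduce noise from other processes
t_vec = min(timeit.repeat(multiply_vectorized, number=1, repeat=5))
t_loop = min(timeit.repeat(multiply_loop, number=1, repeat=3))

print(f"Vectorized: {t_vec:.6f} s, loop: {t_loop:.6f} s")
print("Same result:", np.allclose(multiply_vectorized(), multiply_loop()))
```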


## Data loading

**`scikit-learn`** is one of the most widely used Python libraries for **machine learning and data preprocessing**. It provides simple, consistent tools for:

- Dataset loading and manipulation
- Data preprocessing (scaling, splitting, encoding, etc.)
- Classical ML models (e.g., SVM, logistic regression, decision trees)
- Evaluation metrics and model validation


### Loading MNIST Dataset

**MNIST Dataset**

**MNIST** is a well-known dataset of **28×28 grayscale images** of handwritten digits (0–9), commonly used for benchmarking image classification models.

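It contains **70,000 images**, conventionally split into 60,000 training and 10,000 test images. `fetch_openml` returns all 70,000 as a single array, so any split has to be made explicitly.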




In [None]:
from sklearn.datasets import fetch_openml  # harmless if already imported in an earlier cell
data, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
print(f"data.shape: {data.shape}, y.shape: {y.shape}")


In [None]:
y

#### Reshape

In [None]:
data = data.reshape(-1, 28, 28)              # From (70000, 784) → (70000, 28, 28)
y = y.astype(np.uint8)                       # Convert labels from string to int
y_hot = np.eye(y.max() +1)[y]                # One-hot encoding

print(f"data.shape: {data.shape}, y.shape: {y.shape}, y_hot.shape: {y_hot.shape}")
print(y[0], y_hot[0])


In [None]:
print(np.eye(y.max() +1)[[0]])
print()
print(np.eye(y.max() +1)[[1, 5]])
print()
print(np.eye(y.max() +1)[[3, 6, 9]])
print()
print(np.eye(y.max() +1)[[5, 6, 0, 1]])
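
The one-hot trick works because row `k` of the identity matrix is exactly the one-hot vector for class `k`, so indexing `np.eye(num_classes)` with an array of labels selects one row per label. A minimal sketch with a few hand-picked labels (the names `labels`, `one_hot` and `recovered` are illustrative):

```python
import numpy as np

labels = np.array([5, 0, 4, 1, 9], dtype=np.uint8)
num_classes = 10

# Integer-array indexing picks row labels[i] of the identity matrix for each i
one_hot = np.eye(num_classes)[labels]

# argmax along the class axis inverts the encoding
recovered = one_hot.argmax(axis=1)

print(one_hot.shape)
print(recovered)
```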

#### Preprocess and normalize

In [None]:
print(f"data: min: {data.min()}, max: {data.max()},  mean: {data.mean():6.3f}, std: {data.std():6.3f}")
data_norm = data / 255.0
data_1std = (data - data.mean()) / data.std()
print(f"data_norm: min: {data_norm.min():6.3f}, max: {data_norm.max():6.3f}, mean: {data_norm.mean():6.3f}, std: {data_norm.std():6.3f}")
print(f"data_1std: min: {data_1std.min():6.3f}, max: {data_1std.max():6.3f}, mean: {data_1std.mean():6.3f}, std: {data_1std.std():6.3f}")
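
The cell above computes the mean and standard deviation over the whole dataset. A common practice, sketched below under the assumption that the data is later split, is to compute these statistics on the training part only and reuse them for the other parts. The random `images` array is a stand-in, not the real MNIST data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for image data: 100 "images" of 28x28 pixels with values in 0..255
images = rng.integers(0, 256, size=(100, 28, 28)).astype(np.float64)
train, test = images[:80], images[80:]

# Statistics come from the training part only
mean, std = train.mean(), train.std()

train_std = (train - mean) / std
test_std = (test - mean) / std   # reuse the training statistics

print(f"train: mean {train_std.mean():6.3f}, std {train_std.std():6.3f}")
print(f"test:  mean {test_std.mean():6.3f}, std {test_std.std():6.3f}")
```

The test statistics end up close to, but not exactly, 0 and 1, which is expected since the test part did not contribute to `mean` and `std`.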


#### Visualize

In [None]:
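def visualize(data, y=None, n=9):
  """Show the first n images in a 3-column grid, optionally titled with their labels."""
  n = min(n, len(data))
  cols = 3
  rows = int(np.ceil(n / cols))
  plt.figure(figsize=(cols, rows))
  for i in range(n):
      plt.subplot(rows, cols, i + 1)
      plt.imshow(data[i], cmap="gray")
      plt.axis('off')
      if y is not None:
          plt.title(f"Label: {y[i]}")
  plt.tight_layout()
  plt.show()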

In [None]:
visualize(data_norm, y, n=9)

In [None]:
visualize(data_1std, y, n=9)


### Transformation

In [None]:
def add_noise(X, std=0.1):
    # Add Gaussian noise, then clip back to the valid [0, 1] pixel range
    return np.clip(X + np.random.normal(0, std, X.shape), 0.0, 1.0)

def flip(X, axis=2):
  return np.flip(X, axis=axis)

def random_crop(X, crop_size=24):
  X_cropped = []
  for img in X:
      pad_img = np.pad(img, ((4, 4), (4, 4)), mode='constant')
      x = np.random.randint(0, 8)
      y = np.random.randint(0, 8)
      X_cropped.append(pad_img[x:x+crop_size, y:y+crop_size])
  return np.array(X_cropped)

In [None]:
d = add_noise(data_norm[:9], 0.5)
visualize(d, y, n=9)

In [None]:
d = flip(data_norm[:9], axis=2)  # axis=2 flips each image horizontally; axis=0 would only reverse the batch order
visualize(d, y, n=9)

In [None]:
d = random_crop(data_norm[:9])
visualize(d, y, n=9)
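
In practice, augmentations are often applied at random per image rather than to the whole batch. The sketch below flips each image horizontally with probability `p`. The function name `random_horizontal_flip` and the returned mask are assumptions made for illustration, and the tiny `batch` array stands in for real images.

```python
import numpy as np

def random_horizontal_flip(X, p=0.5, rng=None):
    """Flip each image in a batch of shape (N, H, W) left-to-right with probability p."""
    rng = np.random.default_rng() if rng is None else rng
    flip_mask = rng.random(X.shape[0]) < p        # one independent decision per image
    X_out = X.copy()
    X_out[flip_mask] = X[flip_mask][:, :, ::-1]   # reverse the width axis of the selected images
    return X_out, flip_mask

batch = np.arange(4 * 2 * 3, dtype=float).reshape(4, 2, 3)  # tiny stand-in batch
flipped, mask = random_horizontal_flip(batch, p=0.5, rng=np.random.default_rng(0))
print("Indices of flipped images:", np.flatnonzero(mask))
```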

### Split into training, validation and test

In [None]:
def split_dataset_np(X, y, val_size=0.1, test_size=0.1, random_state=42):
    """
    Splits a dataset into train, validation, and test sets using NumPy indexing.

    Parameters:
        X (ndarray): Input features
        y (ndarray): Labels
        val_size (float): Fraction of the total data to use for validation
        test_size (float): Fraction of the total data to use for test
        random_state (int): Random seed for reproducibility

    Returns:
        (X_train, y_train), (X_val, y_val), (X_test, y_test)
    """
    np.random.seed(random_state)
    num_samples = X.shape[0]

    # Shuffle indices
    indices = np.random.permutation(num_samples)

    # Compute split sizes
    test_count = int(num_samples * test_size)
    val_count = int(num_samples * val_size)
    train_count = num_samples - val_count - test_count

    # Slice the indices
    test_idx = indices[:test_count]
    val_idx = indices[test_count:test_count + val_count]
    train_idx = indices[test_count + val_count:]

    # Index into data
    X_train, y_train = X[train_idx], y[train_idx]
    X_val, y_val     = X[val_idx], y[val_idx]
    X_test, y_test   = X[test_idx], y[test_idx]

    return (X_train, y_train), (X_val, y_val), (X_test, y_test)

In [None]:
(X_train, y_train), (X_val, y_val), (X_test, y_test) = split_dataset_np(
    data,
    y,
    val_size=0.1,
    test_size=0.1
)

print("Train:", X_train.shape)
print("Validation:", X_val.shape)
print("Test:", X_test.shape)

In [None]:
for i in range(10):
  print(f"Label {i} train: {np.mean(y_train == i): .0%}, ", end='')
  print(f"validation: {np.mean(y_val == i): .0%}, ", end='')
  print(f"test: {np.mean(y_test == i): .0%}")
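
The per-class fractions can also be computed for all classes at once with `np.bincount`. This is a minimal sketch; the helper `class_fractions` and the small `labels_demo` array are illustrative names, not part of the cells above.

```python
import numpy as np

def class_fractions(labels, num_classes=10):
    """Return the fraction of samples that belong to each class."""
    counts = np.bincount(labels, minlength=num_classes)
    return counts / counts.sum()

labels_demo = np.array([0, 1, 1, 2, 2, 2, 9, 9], dtype=np.uint8)
fractions = class_fractions(labels_demo)
print(fractions)
```

Calling the helper on `y_train`, `y_val` and `y_test` would give three length-10 vectors that can be compared directly.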


### Data loader

In [None]:
class NumpyDataLoader:
    def __init__(self, X, y, batch_size=32, shuffle=True):
        """
        Initialize a simple NumPy-based data loader.

        Parameters:
            X (ndarray): Input data, shape (N, ...)
            y (ndarray): Labels, shape (N,)
            batch_size (int): Number of samples per batch
            shuffle (bool): Whether to shuffle data each epoch
        """
        self.X = X
        self.y = y
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.current = 0
        self.num_samples = X.shape[0]
        self.indices = np.arange(self.num_samples)

    def __iter__(self):
        if self.shuffle:
            np.random.shuffle(self.indices)
        self.current = 0
        return self

    def __next__(self):
        if self.current >= self.num_samples:
            raise StopIteration
        start = self.current
        end = min(start + self.batch_size, self.num_samples)
        batch_idx = self.indices[start:end]
        self.current = end
        return self.X[batch_idx], self.y[batch_idx]

    def __len__(self):
        return int(np.ceil(self.num_samples / self.batch_size))

In [None]:
# Assuming you already have X_train and y_train
loader = NumpyDataLoader(X_train, y_train, batch_size=56, shuffle=True)
print(f"Number of batches: {len(loader)}")





In [None]:
# use these batches in training loop
X_batch, y_batch  = next(iter(loader))
print(X_batch.shape, y_batch.shape)
visualize(X_batch, y_batch, n=9)
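
The same batching logic can be written more compactly as a generator function. This is an alternative sketch, not a replacement for the class above; `iterate_minibatches` and the demo arrays are hypothetical names.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size=32, shuffle=True, rng=None):
    """Yield (X_batch, y_batch) pairs that together cover the dataset once."""
    rng = np.random.default_rng() if rng is None else rng
    indices = np.arange(X.shape[0])
    if shuffle:
        rng.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch_idx = indices[start:start + batch_size]
        yield X[batch_idx], y[batch_idx]

X_demo = np.arange(10).reshape(10, 1)
y_demo = np.arange(10)
batches = list(iterate_minibatches(X_demo, y_demo, batch_size=4))
batch_sizes = [len(yb) for _, yb in batches]
print(batch_sizes)
```

With 10 samples and a batch size of 4, the last batch holds only the 2 remaining samples, which matches the `ceil(N / batch_size)` count used in `__len__`.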

## Gradient descent

### 2D Linear regression

In [None]:
# Sample data (N points in 2D)
np.random.seed(0)
num_points = 32000
x = np.arange(-1, 1, 2/num_points)  # x axis: num_points evenly spaced values in the range [-1, 1)
m = np.random.random()
b_true = 5
noise = 0.1
y = m * x + b_true + np.random.normal(0, noise, size=num_points)
plt.scatter(x, y, marker='.')
plt.plot(x, m * x + b_true, color="r")
plt.xlabel("x1")
plt.ylabel("y")
plt.show()



In [None]:
data_loader = NumpyDataLoader(x, y, batch_size=32, shuffle=True)

In [None]:
# Initialize parameters
w = np.random.random()  # slope estimate (true value: m)
b = np.random.random()  # bias estimate (true value: b_true)

# Hyperparameters (the batch size is set when constructing data_loader)
lr = 0.001
epochs = 10

In [None]:
for epoch in range(epochs):
    for x_batch, y_batch in data_loader:
        y_pred = w * x_batch + b

        # Errors
        error = y_pred - y_batch

        # Gradients (vectorized!)
        grad_w = 2 * np.mean(error * x_batch)
        grad_b = 2 * np.mean(error)

        # Update parameters
        w -= lr * grad_w
        b -= lr * grad_b

    if epoch % (epochs//10) == 0 or epoch == epochs - 1:
      loss = np.mean((w * x + b - y) ** 2)
      print(f"Epoch {epoch+1}: Loss = {loss:.5f}")


print(f"\nReal slope: {m:.3f}, Learned slope: {w:.3f}")
print(f"Real bias: {b_true:.3f}, Learned bias: {b:.3f}")
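
The gradient expressions used in the training loop follow from the mean squared error over a batch of $N$ points. The symbols $L$, $\hat{y}_i$ and $\eta$ are notation introduced here, with $\eta$ standing for `lr`:

$$
L(w, b) = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2, \qquad \hat{y}_i = w x_i + b
$$

$$
\frac{\partial L}{\partial w} = \frac{2}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)\, x_i, \qquad
\frac{\partial L}{\partial b} = \frac{2}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)
$$

$$
w \leftarrow w - \eta \, \frac{\partial L}{\partial w}, \qquad
b \leftarrow b - \eta \, \frac{\partial L}{\partial b}
$$

Here $\hat{y}_i - y_i$ corresponds to `error`, and the two averaged sums correspond to `grad_w` and `grad_b`.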

### N-dimensional linear regression

In [None]:
# Generate data with num_features input dimensions (same setup as before)
np.random.seed(0)
num_features = 10
num_points = 10000
x = np.random.uniform(-1, 1, size=(num_points, num_features))
m = np.random.random(num_features)
print(f"real parameters: {m}")
b_true = 0.25
noise = 1
y = x @ m + b_true + np.random.normal(0, noise, size=num_points)

In [None]:
data_loader = NumpyDataLoader(x, y, batch_size=8, shuffle=True)

In [None]:
# Initialize weights
w = np.random.random(num_features)  # one weight per feature; the bias b is kept separately
b = np.random.random()

# Hyperparameters
lr = 0.0001
epochs = 100

In [None]:
for epoch in range(epochs):
    for x_batch, y_batch in data_loader:
        y_pred = x_batch @ w + b

        # Errors
        error = y_pred - y_batch

        # Gradients (vectorized!)
        grad_w = 2 * np.mean(x_batch.T * error, axis=1)
        grad_b = 2 * np.mean(error)

        # Update parameters
        w -= lr * grad_w
        b -= lr * grad_b

    if epoch % (epochs//10) == 0 or epoch == epochs - 1:
      loss = np.mean((x @ w + b - y) ** 2)
      print(f"Epoch {epoch+1}: Loss = {loss:.5f}")
      # print(w, b)

print()
print(f"bias -> True : {b_true:.3f}, Pred: {b:.3f}")
for i, pair in enumerate(zip(m,w)):
  print(f"w{i} -> True: {pair[0]:.3f}, Pred: {pair[1]:.3f}")
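
One way to gain confidence in a hand-derived gradient is to compare it with a finite-difference approximation. The sketch below does this for the weight gradient used in the loop above, on a small random batch. The batch size, feature count and `eps` are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
x_batch = rng.uniform(-1, 1, size=(8, 3))   # 8 samples, 3 features (illustrative sizes)
y_batch = rng.normal(size=8)
w = rng.random(3)
b = 0.1

def mse(w, b):
    return np.mean((x_batch @ w + b - y_batch) ** 2)

# Analytic gradient, same expression as in the training loop
error = x_batch @ w + b - y_batch
grad_w = 2 * np.mean(x_batch.T * error, axis=1)
grad_w_matmul = 2 * x_batch.T @ error / len(error)   # equivalent matrix form

# Central finite differences, one coordinate at a time
eps = 1e-6
grad_w_numeric = np.zeros_like(w)
for j in range(len(w)):
    step = np.zeros_like(w)
    step[j] = eps
    grad_w_numeric[j] = (mse(w + step, b) - mse(w - step, b)) / (2 * eps)

print("Analytic matches numeric:", np.allclose(grad_w, grad_w_numeric, atol=1e-6))
```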



In [None]:
print(x[0] @ w)
print(x[0] @ m)
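
Linear regression also has a closed-form least-squares solution, which can serve as a reference for what gradient descent should converge to. The sketch below regenerates data with the same settings as above, but with a separate random generator, so the exact numbers differ from the cells above. It then solves the problem with `np.linalg.lstsq`. The names `x_aug`, `w_ls` and `b_ls` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
num_points, num_features = 10_000, 10
x = rng.uniform(-1, 1, size=(num_points, num_features))
m = rng.random(num_features)
b_true = 0.25
y = x @ m + b_true + rng.normal(0, 1, size=num_points)

# Append a column of ones so the bias is estimated together with the weights
x_aug = np.hstack([x, np.ones((num_points, 1))])
solution, *_ = np.linalg.lstsq(x_aug, y, rcond=None)
w_ls, b_ls = solution[:-1], solution[-1]

print("Largest weight error:", np.abs(w_ls - m).max())
print(f"Bias -> True: {b_true:.3f}, least squares: {b_ls:.3f}")
```

Because the noise is large relative to the signal, even the least-squares estimate does not recover `m` exactly. The weights learned by gradient descent should be judged against this reference rather than against the true parameters alone.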