### **Creating a Virtual Environment and Installing Packages**

#### What's a venv

A venv (virtual environment) lets you install packages for one project without affecting your system Python or other projects.

In practice, that means:

- each project gets its own dependencies

- no version conflicts between projects

- setups are easy to reproduce

#### How to create a venv

Open a terminal (in VS Code, for example, press "ctrl + `") and follow the steps for your operating system and shell.

Windows (Command Prompt)

1. Open Command Prompt.
2. Create venv:
   `python -m venv .venv`
   (or `py -3 -m venv .venv`)
3. Activate:
   `.venv\Scripts\activate`
4. Install packages:
   `pip install pandas`
5. Freeze requirements:
   `pip freeze > requirements.txt`
6. Deactivate:
   `deactivate`
7. Remove: delete the `.venv` folder.

Windows (PowerShell)

1. Open PowerShell.
2. Create venv:
   `python -m venv .venv`
3. If script execution is blocked, run this once (the `CurrentUser` scope does not need administrator rights):
   `Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser`
4. Activate:
   `.venv\Scripts\Activate.ps1`
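5. Install packages, freeze requirements, and deactivate with the same `pip install`, `pip freeze`, and `deactivate` commands as in the Command Prompt steps.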

macOS / Linux (bash, zsh)

1. Open Terminal.
2. Create venv:
   `python3 -m venv .venv`
3. Activate:
   `source .venv/bin/activate`
4. Install packages:
   `pip install pandas`
5. Freeze:
   `pip freeze > requirements.txt`
6. Deactivate:
   `deactivate`
7. Remove:
   `rm -rf .venv`

macOS / Linux (fish shell)

1. Create venv:
   `python3 -m venv .venv`
2. Activate:
   `source .venv/bin/activate.fish`

Git Bash on Windows

1. Create venv:
   `python -m venv .venv`
2. Activate:
   `source .venv/Scripts/activate`
   (use forward slashes; Bash treats `\` as an escape character, so `.venv\Scripts\activate` does not work here)

Notes (short)

* Use `python` vs `python3` depending on your system.
* Name the folder `.venv` (hidden on macOS/Linux) or `env`; the name is up to you.
* To reproduce env on another machine: `pip install -r requirements.txt`.
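
To confirm that activation worked, you can ask Python itself. The sketch below is a minimal check that can be run from a script or a notebook cell; the helper name `in_virtualenv` is made up for this example and is not part of any library.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a venv:", in_virtualenv())
```

If the interpreter path does not point into your `.venv` folder, the environment is probably not activated in that terminal.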

#### **Connect your venv to Jupyter**
With the venv activated, register it as a Jupyter kernel so you can pick the environment from the notebook's kernel list.

```bash
pip install ipykernel
python -m ipykernel install --user --name venv
```

### **NumPy**

NumPy is a Python library for fast numerical computing.

Why it matters in data science:

* It gives you ndarrays, which are super-fast arrays for handling numbers.
* Most data science tools (Pandas, Scikit-learn, SciPy, TensorFlow, PyTorch) sit on top of NumPy.
* It makes math on large datasets easy: vectorization, matrix ops, broadcasting.
* It’s way faster than pure Python loops because it runs in optimized C code.
* It’s the foundation for working with features, matrices, tensors, linear algebra, and statistics.

In short: NumPy is the engine that powers almost all numerical work in data science.


To use NumPy, first import it. The `np` alias is the usual convention.

In [None]:
import numpy as np
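
The speed claim above is easy to try for yourself. The following is a rough sketch, not a rigorous benchmark: it computes a sum of squares once with a plain Python loop and once with a single NumPy call. The array size of 200,000 is an arbitrary choice, and the timings will vary from machine to machine.

```python
import time

import numpy as np

values = np.arange(200_000, dtype=np.float64)

# Pure Python loop: one interpreter step per element
start = time.perf_counter()
loop_total = 0.0
for v in values:
    loop_total += v * v
loop_seconds = time.perf_counter() - start

# Vectorized: the whole computation runs inside NumPy's compiled code
start = time.perf_counter()
vector_total = np.dot(values, values)
vector_seconds = time.perf_counter() - start

print(f"loop: {loop_seconds:.4f}s  vectorized: {vector_seconds:.4f}s")
print("same result:", np.isclose(loop_total, vector_total))
```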

#### Creating Arrays

In [None]:
# Creating an array with 5 elements that are all zero
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [None]:
# Creating an array with 10 elements that are all one
np.ones(10)

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [None]:
# Creating an array of five elements that are all 8
np.full(5, 8)

array([8, 8, 8, 8, 8])

In [None]:
# Creating an array with a list of numbers
a = np.array([1, 2, 3, 5, 7, 11])
a

array([ 1,  2,  3,  5,  7, 11])

In [None]:
# Viewing the elements of an array
a[2]

np.int64(3)

In [None]:
# Modifying the elements of an array
a[2] = 13
a

In [None]:
# Creating an array with a range
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
# Creating an array with a range starting at 3 and ending at an element before 10
np.arange(3, 10)

array([3, 4, 5, 6, 7, 8, 9])

In [None]:
# Creating an array with a range starting at 0 and ending at an element before 10 with a step of 2
np.arange(0, 10, 2)

array([0, 2, 4, 6, 8])

In [None]:
# Creating an array with a range starting at 0 and ending at 1 with 11 elements
np.linspace(0, 1, 11)

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])

In [None]:
# Create an array with a specific data type
arr_float = np.array([1, 2, 3, 4, 5], dtype=float)
arr_float.dtype
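
The data type also controls how values are stored and converted. Here is a small sketch of converting between types with `astype`; the variable names are only for illustration.

```python
import numpy as np

ints = np.array([1, 2, 3])

# astype returns a converted copy; the original array is unchanged
floats = ints.astype(np.float64)

# Converting floats to integers drops the fractional part (rounds toward zero)
truncated = np.array([1.7, 2.2, -0.5]).astype(np.int64)

# Smaller integer types use less memory per element
small = np.array([1, 2, 3], dtype=np.int8)

print(floats.dtype)      # float64
print(truncated)         # [1 2 0]
print(small.itemsize)    # 1 (byte per element)
```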

#### Inspecting Array Properties

In [None]:
# Check the shape of an array
arr = np.array([1, 2, 3, 4, 5])
arr.shape

In [None]:
# Check the number of elements in an array
arr.size

In [None]:
# Check the data type of an array
arr.dtype

In [None]:
# Check the shape of a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
arr_2d.shape
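
A few more attributes are often useful when inspecting an array. This sketch assumes a 2x3 array of 64-bit integers, so the byte count shown follows from that explicit `dtype`.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print(arr_2d.ndim)     # 2  -> number of dimensions
print(arr_2d.shape)    # (2, 3) -> rows, columns
print(arr_2d.size)     # 6  -> total number of elements
print(len(arr_2d))     # 2  -> length of the first dimension only
print(arr_2d.nbytes)   # 48 -> 6 elements * 8 bytes each
```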

#### Multi Dimensional Arrays

In [None]:
# Creating a 5x2 array of zeros
np.zeros((5, 2))

array([[0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.]])

In [None]:
# Creating a 3x3 array from a list of lists
n = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

n

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [None]:
# Accessing an element at row 0, column 1
n[0, 1]

np.int64(2)

In [None]:
# Modifying the element at row 0, column 1
n[0, 1] = 20
n

array([[ 1, 20,  3],
       [ 4,  5,  6],
       [ 7,  8,  9]])

In [None]:
# Accessing the entire first row
n[0]

array([ 1, 20,  3])

In [None]:
# Replacing the entire third row
n[2] = [1, 1, 1]
n

array([[ 1, 20,  3],
       [ 4,  5,  6],
       [ 1,  1,  1]])

In [None]:
# Accessing the entire second column
n[:, 1]

array([20,  5,  1])

In [None]:
# Replacing the entire third column
n[:, 2] = [0, 1, 2]
n

array([[ 1, 20,  0],
       [ 4,  5,  1],
       [ 1,  1,  2]])
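
Rows and columns can also be sliced together to pull out a sub-block. One thing to watch: a slice is a view onto the same data, so writing to it changes the original. A minimal sketch, starting again from a fresh 3x3 array:

```python
import numpy as np

n = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Rows 0-1 and columns 1-2 -> [[2, 3], [5, 6]]
block = n[0:2, 1:]

# A slice is a view, so this write also changes n
block[0, 0] = 99

# Use .copy() when you need an independent array
independent = n[0:2, 1:].copy()
independent[0, 0] = -1   # n is not affected

print(n)
print(np.shares_memory(n, block))         # True
print(np.shares_memory(n, independent))   # False
```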

#### Fancy Indexing

In [None]:
# Using integer array indexing to select multiple elements
arr = np.array([10, 20, 30, 40, 50])
indices = np.array([0, 2, 4])
arr[indices]

In [None]:
# Using fancy indexing on a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
rows = np.array([0, 2])
cols = np.array([1, 2])
arr_2d[rows, cols]
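
Note that `arr_2d[rows, cols]` pairs the indices up: it picks elements (0, 1) and (2, 2), not a 2x2 block. If you want every combination of the chosen rows and columns, `np.ix_` is one way to do it. A short sketch of the difference:

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
rows = np.array([0, 2])
cols = np.array([1, 2])

# Paired indices: elements at (0, 1) and (2, 2)
pairs = arr_2d[rows, cols]

# Every combination of rows {0, 2} and columns {1, 2}
grid = arr_2d[np.ix_(rows, cols)]

# Unlike slicing, fancy indexing returns a copy
picked = arr_2d[rows]
picked[0, 0] = 100   # arr_2d is not affected

print(pairs)   # [2 9]
print(grid)    # [[2 3]
               #  [8 9]]
```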

#### Randomly Generated Arrays

In [None]:
# Generating a 5x2 array of random numbers between 0 and 1 with seed 2
np.random.seed(2)
np.random.rand(5, 2)

array([[0.4359949 , 0.02592623],
       [0.54966248, 0.43532239],
       [0.4203678 , 0.33033482],
       [0.20464863, 0.61927097],
       [0.29965467, 0.26682728]])

In [None]:
# Generating a 5x2 array of random numbers from a standard normal distribution with seed 2
np.random.seed(2)
np.random.randn(5, 2)

array([[-0.41675785, -0.05626683],
       [-2.1361961 ,  1.64027081],
       [-1.79343559, -0.84174737],
       [ 0.50288142, -1.24528809],
       [-1.05795222, -0.90900761]])

In [None]:
# Generating a 5x2 array of random numbers between 0 and 100 with seed 2
np.random.seed(2)
100 * np.random.rand(5, 2)

array([[43.59949021,  2.59262318],
       [54.96624779, 43.53223926],
       [42.03678021, 33.0334821 ],
       [20.4648634 , 61.92709664],
       [29.96546737, 26.68272751]])

In [None]:
# Generating a 5x2 array of random integers from 0 to 99 (high is exclusive) with seed 8
np.random.seed(8)
np.random.randint(low=0, high=100, size=(5, 2))

array([[67, 84],
       [ 5, 90],
       [ 8, 83],
       [63, 48],
       [85, 60]])

In [None]:
# Randomly choosing elements from an array
choices = np.random.choice([10, 20, 30, 40, 50], size=5)
choices
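
The cells above use NumPy's legacy global random state. Newer NumPy code often uses a `Generator` object created with `np.random.default_rng` instead. The sketch below mirrors the calls above with that API; the numbers it produces will not match the legacy outputs shown earlier, because the two use different algorithms.

```python
import numpy as np

# One generator object carries its own seed and state
rng = np.random.default_rng(seed=2)

uniform = rng.random((5, 2))                            # floats in [0, 1)
normal = rng.standard_normal((5, 2))                    # standard normal distribution
integers = rng.integers(low=0, high=100, size=(5, 2))   # ints from 0 to 99
choices = rng.choice([10, 20, 30, 40, 50], size=5)      # sampled with replacement

# The same seed reproduces the same sequence
repeat = np.random.default_rng(seed=2).random((5, 2))
print(np.array_equal(uniform, repeat))   # True
```

Passing a seeded generator around keeps results reproducible without touching global state.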

#### Element-wise Operations

In [None]:
# Creating an array with a range from 0 to 4 (5 is excluded)
a = np.arange(5)
a

array([0, 1, 2, 3, 4])

In [None]:
# Adding 1 to each element
a + 1

array([1, 2, 3, 4, 5])

In [None]:
# Performing element-wise operations: (10 + a * 2) squared
(10 + (a * 2)) ** 2

array([100, 144, 196, 256, 324])

In [None]:
# Computing 10 + (a * 2)
10 + (a * 2)

array([10, 12, 14, 16, 18])

In [None]:
# Computing (10 + a * 2) squared divided by 100 and storing in b
b = (10 + (a * 2)) ** 2 / 100
b

array([1.  , 1.44, 1.96, 2.56, 3.24])

In [None]:
# Adding arrays a and b element-wise
a + b

array([1.  , 2.44, 3.96, 5.56, 7.24])
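
To see what "element-wise" means, it can help to compare the array expression with the equivalent Python loop. This sketch recomputes `b` both ways and also applies `np.sqrt`, one of NumPy's built-in element-wise functions.

```python
import numpy as np

a = np.arange(5)
b = (10 + (a * 2)) ** 2 / 100

# The same calculation written as an explicit loop over plain Python ints
by_hand = [((10 + 2 * x) ** 2) / 100 for x in range(5)]
print(np.allclose(b, by_hand))   # True

# Dividing an integer array with / produces floats
print(a.dtype.kind, b.dtype)     # i float64

# Functions such as np.sqrt are applied to every element
roots = np.sqrt(b)
print(roots)
```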

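#### Broadcasting

Broadcasting lets NumPy combine arrays of different shapes in one operation. Shapes are compared from the last dimension backwards, and each pair of dimensions must either be equal or contain a 1.

Basic idea:
NumPy virtually stretches the smaller array along its size-1 (or missing) dimensions without copying it.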

In [None]:
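# Broadcasting a column vector across a 2D array
# (arr_2d is redefined as 2x3 here so its shape is compatible with the 2x1 column)
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
col = np.array([[10], [20]])
arr_2d + col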

In [None]:
# Broadcasting a 1D array to a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
row = np.array([10, 20, 30])
arr_2d + row

In [None]:
# Broadcasting a scalar to an array
arr = np.array([1, 2, 3])
result = arr + 10
result
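
You can check ahead of time whether two shapes are compatible. The sketch below assumes NumPy 1.20 or newer for `np.broadcast_shapes`, and uses `np.broadcast_to` to show that the "stretched" array is a view rather than a copy.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])                # shape (3,)
col = np.array([[10], [20]])                # shape (2, 1)

# The resulting shape if the operands can be broadcast together
print(np.broadcast_shapes(arr_2d.shape, row.shape))   # (2, 3)
print(np.broadcast_shapes(arr_2d.shape, col.shape))   # (2, 3)

# The stretched row repeats the same memory: its stride along axis 0 is 0
stretched = np.broadcast_to(row, (2, 3))
print(stretched.strides[0])   # 0

# Shapes (2, 3) and (2,) are not compatible, so NumPy raises an error
try:
    arr_2d + np.array([10, 20])
except ValueError as exc:
    print("incompatible shapes:", exc)
```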

#### Math Functions

In [None]:
# Matrix multiplication
mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[5, 6], [7, 8]])
np.matmul(mat1, mat2)

In [None]:
# Calculate the dot product of two arrays
# (named u and v so the earlier a and b stay available for the comparison cells)
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
np.dot(u, v)

In [None]:
# Calculate the mean of an array
arr = np.array([1, 2, 3, 4, 5])
np.mean(arr)

In [None]:
# Calculate the standard deviation of an array
np.std(arr)

In [None]:
# Calculate the sum of an array
np.sum(arr)
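
Two details are worth knowing here: the `@` operator is shorthand for `np.matmul`, and most reductions accept an `axis` argument. A small sketch follows; the array named `table` is made up for the example.

```python
import numpy as np

mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[5, 6], [7, 8]])

product = mat1 @ mat2        # matrix product, same as np.matmul(mat1, mat2)
elementwise = mat1 * mat2    # element-wise product, not a matrix product

table = np.array([[1, 2, 3], [4, 5, 6]])
col_sums = table.sum(axis=0)     # collapse rows    -> [5 7 9]
row_means = table.mean(axis=1)   # collapse columns -> [2. 5.]

arr = np.array([1, 2, 3, 4, 5])
population_std = np.std(arr)         # divides by N (the default), about 1.414
sample_std = np.std(arr, ddof=1)     # divides by N - 1, about 1.581

print(product)       # [[19 22]
                     #  [43 50]]
print(elementwise)   # [[ 5 12]
                     #  [21 32]]
```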

#### Array Shape Manipulation

In [None]:
# Expand dimensions of an array
arr_1d = np.array([1, 2, 3])
expanded = np.expand_dims(arr_1d, axis=0)
expanded.shape

In [None]:
# Transpose an array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
transposed = arr_2d.T
transposed

In [None]:
# Reshape an array to a different shape
arr = np.arange(12)
reshaped = arr.reshape(3, 4)
reshaped

In [None]:
# Ravel a 2D array to 1D (returns a view when possible, otherwise a copy)
raveled = reshaped.ravel()
raveled

In [None]:
# Flatten a 2D array to 1D (always returns a copy)
flattened = reshaped.flatten()
flattened
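
The difference between `ravel` and `flatten` shows up when you check whether the result shares memory with the original. A quick sketch using `np.shares_memory`:

```python
import numpy as np

reshaped = np.arange(12).reshape(3, 4)

raveled = reshaped.ravel()       # view here, because the data is contiguous
flattened = reshaped.flatten()   # always a new copy

print(np.shares_memory(reshaped, raveled))     # True
print(np.shares_memory(reshaped, flattened))   # False

# ravel falls back to a copy when a view is impossible,
# for example on a transposed array
raveled_t = reshaped.T.ravel()
print(np.shares_memory(reshaped, raveled_t))   # False

# -1 asks reshape to work out that dimension itself
auto = np.arange(12).reshape(2, -1)
print(auto.shape)   # (2, 6)
```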

#### Comparison Operations

In [None]:
# Comparing each element of a with 2 (returns boolean array)
a >= 2

array([False, False,  True,  True,  True])

In [None]:
# Comparing arrays a and b element-wise (returns boolean array)
a > b

array([False, False,  True,  True,  True])

In [None]:
# Using a boolean array to filter elements where a > b
a[a > b]

array([2, 3, 4])
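
Boolean arrays can be combined, counted, and used to choose between values. The sketch below rebuilds `a` and `b` with the values from the element-wise section so that it runs on its own.

```python
import numpy as np

a = np.arange(5)
b = (10 + (a * 2)) ** 2 / 100   # [1., 1.44, 1.96, 2.56, 3.24]

# Combine conditions with & (and) and | (or); the parentheses are required
mask = (a >= 2) & (a < 4)
print(mask)               # [False False  True  True False]

# True counts as 1, so sum() counts the matches
count = int((a > b).sum())
print(count)              # 3

# any / all summarize a boolean array
print((a > b).any(), (a > b).all())   # True False

# np.where picks from a where the condition holds, otherwise from b
larger = np.where(a > b, a, b)
print(larger)
```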

#### Basic Linear Algebra

In [None]:
# Computing the inverse of a matrix
mat = np.array([[1, 2], [3, 4]], dtype=float)
inv_mat = np.linalg.inv(mat)
inv_mat

In [None]:
# Computing eigenvalues and eigenvectors of a matrix
eigenvalues, eigenvectors = np.linalg.eig(mat)
print("Eigenvalues:", eigenvalues)
print("Eigenvectors:\n", eigenvectors)

In [None]:
# Singular Value Decomposition (SVD)
U, S, Vt = np.linalg.svd(mat)
print("U:\n", U)
print("Singular values:", S)
print("Vt:\n", Vt)
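
A good habit is to verify these decompositions numerically. The sketch below checks each result against its definition with `np.allclose`, and adds `np.linalg.solve` as a way to solve a linear system without forming the inverse. The right-hand side `rhs` is an arbitrary example.

```python
import numpy as np

mat = np.array([[1, 2], [3, 4]], dtype=float)

# Inverse: mat @ inv_mat should be the identity matrix
inv_mat = np.linalg.inv(mat)
print(np.allclose(mat @ inv_mat, np.eye(2)))   # True

# Eigen-decomposition: mat @ v equals lambda * v for each column v
eigenvalues, eigenvectors = np.linalg.eig(mat)
print(np.allclose(mat @ eigenvectors, eigenvectors * eigenvalues))   # True

# SVD: U @ diag(S) @ Vt rebuilds the original matrix
U, S, Vt = np.linalg.svd(mat)
print(np.allclose(U @ np.diag(S) @ Vt, mat))   # True

# Solve mat @ x = rhs directly
rhs = np.array([5.0, 6.0])
x = np.linalg.solve(mat, rhs)
print(x)   # approximately [-4.   4.5]
```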