# 1. NumPy
---

NumPy provides **fast, vectorized operations** on large sets of numerical data.
A key feature is the **NumPy array** (`ndarray`), which:

* stores only **one data type** (int, float, bool…)
* allows **element-wise** math (no loops needed)
* is typically much faster than Python lists for numerical operations on large data
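
The first two properties can be checked with a minimal sketch like the one below. The variable names (`mixed`, `doubled`, `as_list`) are illustrative and not part of the original notes.

```python
import numpy as np

# Mixing ints and a float: NumPy stores everything as one dtype (float64 here).
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)     # float64

# Element-wise math: every element is doubled, no explicit loop.
doubled = mixed * 2
print(doubled)         # [2. 4. 7.]

# The same operator on a plain list repeats the list instead.
as_list = [1, 2, 3.5] * 2
print(as_list)         # [1, 2, 3.5, 1, 2, 3.5]
```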

## 1. Array vs list

### Using normal Python lists:

```python
height = [1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]

weight / height ** 2   # TypeError: lists don't support element-wise math
```

### Using NumPy arrays:

```python
import numpy as np

np_height = np.array(height)
np_weight = np.array(weight)

np_weight / np_height ** 2   # works element-wise
```

**Why it works:** NumPy arrays support **vectorized operations**, meaning the math is applied to each element automatically.
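
To make the equivalence concrete, the sketch below compares the vectorized expression with an explicit Python loop over the same data. The names `bmi_loop` and `bmi_vec` are assumptions introduced for illustration.

```python
import numpy as np

height = [1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]

# Plain Python: loop over the (weight, height) pairs explicitly.
bmi_loop = [w / h ** 2 for w, h in zip(weight, height)]

# NumPy: the same computation written as a single expression.
bmi_vec = np.array(weight) / np.array(height) ** 2

# Both approaches give the same values (up to floating-point rounding).
print(np.allclose(bmi_loop, bmi_vec))   # True
```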

## 2. 2D NumPy Arrays

```python
np_2d = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],  
                  [65.4, 59.2, 63.6, 88.4, 68.7]])
```

## 3. Subsetting

```python
# array: any 1D NumPy array; np_2d: the 2D array defined above
array[1]          # element at index 1
array > 2         # boolean array (True/False per element)
array[array > 2]  # elements greater than 2
np_2d.shape       # the array's dimensions as (rows, columns)
np_2d[0][2]       # element in row 0, column 2
np_2d[0, 2]       # same element, using a single index expression
np_2d[:, 1:3]     # all rows, columns 1 and 2 (3 is excluded)
np_2d[1, :]       # all columns of row 1
```
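
A runnable version of these expressions might look like the sketch below. The 1D `array` holds made-up values chosen for illustration; `np_2d` reuses the height and weight data from the previous section.

```python
import numpy as np

array = np.array([1, 2, 3, 4])  # illustrative 1D array
np_2d = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
                  [65.4, 59.2, 63.6, 88.4, 68.7]])

print(array[1])            # 2
print(array > 2)           # [False False  True  True]
print(array[array > 2])    # [3 4]

print(np_2d.shape)         # (2, 5)
print(np_2d[0, 2])         # 1.71
print(np_2d[:, 1:3])       # columns 1 and 2 of both rows
print(np_2d[1, :])         # the whole second row (weights)
```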

## 4. Basic Statistics

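Examples (`np_city` is a 2D array with heights in column 0 and weights in column 1; it is built in section 5):

```python
np.mean(np_city[:, 0])                     # mean height
np.median(np_city[:, 0])                   # median height
np.corrcoef(np_city[:, 0], np_city[:, 1])  # correlation between height and weight
np.std(np_city[:, 0])                      # standard deviation of height
```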
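
As a self-contained sketch, the statistics functions can be tried on a tiny hand-made array. The three rows below are a hypothetical stand-in for `np_city`, chosen so the results are easy to check by hand.

```python
import numpy as np

# Hypothetical stand-in for np_city: column 0 = height, column 1 = weight.
np_city = np.array([[1.60, 50.0],
                    [1.70, 60.0],
                    [1.80, 70.0]])

heights = np_city[:, 0]
weights = np_city[:, 1]

print(np.mean(heights))                # arithmetic mean of the heights
print(np.median(heights))              # middle value of the heights
print(np.std(heights))                 # population standard deviation (ddof=0 by default)
print(np.corrcoef(heights, weights))   # 2x2 correlation matrix
```

`np.corrcoef` returns a matrix; the height/weight correlation is the off-diagonal entry `[0, 1]`.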

## 5. Generate data

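`np.random.normal()` draws random samples from a normal (Gaussian) distribution.

Arguments of `np.random.normal(mean, std, size)` (named `loc`, `scale`, and `size` in the NumPy signature):

* **mean** (`loc`) → center of the distribution
* **std** (`scale`) → standard deviation (spread)
* **size** → number of samples (or the output shape)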

### Example

```python
height = np.round(np.random.normal(1.75, 0.20, 5000), 2)
weight = np.round(np.random.normal(60.32, 15, 5000), 2)

np_city = np.column_stack((height, weight))
```

`np_city` now contains **5000 rows** of `[height, weight]` pairs.
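
The sketch below repeats the example with a fixed seed so the result is reproducible, then checks the shape. The call to `np.random.seed(42)` and the seed value are additions for illustration, not part of the original example.

```python
import numpy as np

np.random.seed(42)  # assumption: fix the seed so repeated runs give the same data

height = np.round(np.random.normal(1.75, 0.20, 5000), 2)
weight = np.round(np.random.normal(60.32, 15, 5000), 2)

# Stack the two 1D arrays side by side as columns.
np_city = np.column_stack((height, weight))

print(np_city.shape)            # (5000, 2)
print(np.mean(np_city[:, 0]))   # close to 1.75
print(np.mean(np_city[:, 1]))   # close to 60.32
```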

## 6. Boolean Operators

### logical_and()

```python
np.logical_and(bmi > 21, bmi < 22)
```

### logical_or()

```python
np.logical_or(bmi < 21, bmi > 22)   # True where bmi is outside the 21 to 22 range
```

### logical_not()

```python
np.logical_not(bmi > 21)   # takes a single condition and inverts it
```
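
The three operators return boolean arrays, which can then be used to subset the data. A minimal sketch, assuming `bmi` is a 1D NumPy array (the values below are made up):

```python
import numpy as np

bmi = np.array([20.5, 21.4, 21.9, 23.0])  # illustrative values

# True where both conditions hold.
between = np.logical_and(bmi > 21, bmi < 22)
print(between)         # [False  True  True False]
print(bmi[between])    # [21.4 21.9]

# True where at least one condition holds.
outside = np.logical_or(bmi < 21, bmi > 22)
print(outside)         # [ True False False  True]

# Inverts a single boolean array.
print(np.logical_not(between))   # [ True False False  True]
```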

# 2. Pandas
---
## 1. DataFrame

### Create a DataFrame

```python
import pandas as pd

df = pd.read_csv('file.csv')   # from a CSV file
df = pd.DataFrame(my_dict)     # from a dictionary
```

### Index

```python
df.index = my_list                           # set row labels after creation
df = pd.read_csv('file.csv', index_col=0)    # use the first CSV column as the index
```
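
A small end-to-end sketch of building a DataFrame from a dictionary and assigning row labels. The contents of `my_dict` and `my_list` are made up for illustration (the population figures are rough round numbers, not real data).

```python
import pandas as pd

# Each dictionary key becomes a column; each list holds that column's values.
my_dict = {
    "country": ["France", "United States", "Japan"],
    "capital": ["Paris", "Washington, D.C.", "Tokyo"],
    "pop": [68, 335, 125],   # illustrative values, in millions
}
df = pd.DataFrame(my_dict)

# Replace the default 0, 1, 2 index with row labels.
my_list = ["FR", "US", "JP"]
df.index = my_list

print(df)
```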

### Bracket access `[]`

#### Column access
```python
df['column_name']     # returns a pandas.Series
df[['column_name']]   # returns a pandas.DataFrame
```

#### Row access
```python
df[1:4]               # rows at positions 1 to 3 (4 is excluded)
```
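
The difference between single and double brackets can be verified with `type()`. The small DataFrame below is a hypothetical example, not data from the original notes.

```python
import pandas as pd

df = pd.DataFrame(
    {"capital": ["Paris", "Washington, D.C.", "Tokyo"], "pop": [68, 335, 125]},
    index=["FR", "US", "JP"],   # illustrative labels and values
)

col_series = df["capital"]     # single brackets: one column as a Series
col_frame = df[["capital"]]    # double brackets: a one-column DataFrame
rows = df[1:3]                 # slice: rows at positions 1 and 2

print(type(col_series).__name__)   # Series
print(type(col_frame).__name__)    # DataFrame
print(list(rows.index))            # ['US', 'JP']
```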

### loc: label-based
```python
df.loc['FR']                              # returns a pandas.Series
df.loc[['FR']]                            # returns a pandas.DataFrame
df.loc[['FR', 'US']]                      # returns a pandas.DataFrame
df.loc[['FR', 'US'], ['capital', 'pop']]  # only the selected rows and columns
df.loc[:, ['capital', 'pop']]             # all rows, selected columns
```


### iloc: position-based
```python
df.iloc[1]                # returns a pandas.Series
df.iloc[[1]]              # returns a pandas.DataFrame
df.iloc[[1, 2]]           # returns a pandas.DataFrame from a list of positions
df.iloc[[1, 2], [0, 1]]   # only the selected rows and columns
df.iloc[:, [0, 1]]        # all rows, selected columns
```
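
Label-based and position-based access select the same data when the labels and positions line up. A minimal sketch on a hypothetical three-row DataFrame:

```python
import pandas as pd

df = pd.DataFrame(
    {"capital": ["Paris", "Washington, D.C.", "Tokyo"], "pop": [68, 335, 125]},
    index=["FR", "US", "JP"],   # illustrative labels and values
)

# 'US' is the row at position 1; 'capital' is the column at position 0.
by_label = df.loc["US", "capital"]
by_position = df.iloc[1, 0]
print(by_label == by_position)   # True

# The same sub-table, selected by labels and by positions.
sub_label = df.loc[["FR", "US"], ["capital", "pop"]]
sub_position = df.iloc[[0, 1], [0, 1]]
print(sub_label.equals(sub_position))   # True
```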

## 2. Filtering

```python
df[df['pop'] > 8]   # keep rows where pop is greater than 8

import numpy as np
df[np.logical_and(df['pop'] > 8, df['pop'] < 10)]   # combine conditions with a NumPy boolean operator
```
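
Filtering works in two steps: a comparison produces a boolean Series, and that Series selects the matching rows. The sketch below uses a hypothetical DataFrame and also shows the `&` operator, which is not in the original notes but gives the same result as `np.logical_and` here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"pop": [5, 9, 12]}, index=["A", "B", "C"])  # illustrative data

# Step 1: the comparison yields one True/False value per row.
mask = df["pop"] > 8
print(mask.tolist())        # [False, True, True]

# Step 2: the boolean Series keeps only the rows marked True.
print(list(df[mask].index))   # ['B', 'C']

# Two conditions: NumPy operator vs. the equivalent & operator
# (the parentheses around each condition are required with &).
with_numpy = df[np.logical_and(df["pop"] > 8, df["pop"] < 10)]
with_amp = df[(df["pop"] > 8) & (df["pop"] < 10)]
print(with_numpy.equals(with_amp))   # True
```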
