
# Machine Learning Reference Notebook

This notebook is a quick reference for key concepts in machine learning, for use when working on ML projects or revising concepts.

## 1. **Linear Algebra**

### Vectors and Matrices:
- **Vectors**: A one-dimensional array of numbers.
- **Matrices**: A two-dimensional array of numbers.

#### Common Operations:
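- **Dot Product**: The sum of the elementwise products of two equal-length vectors; the result is a scalar.
- **Matrix Multiplication**: For an \(m \times n\) matrix and an \(n \times p\) matrix, produces an \(m \times p\) matrix whose entries are dot products of rows of the first with columns of the second.
- **Transpose**: Flipping a matrix over its main diagonal, so that rows become columns.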

#### Python Libraries:
- `numpy` for vector and matrix operations.
```python
import numpy as np
# Example: Create a matrix
matrix = np.array([[1, 2], [3, 4]])
```
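
The three operations listed above can be sketched with NumPy as follows. The vectors and matrices are hypothetical values chosen only so the results are easy to check by hand.

```python
import numpy as np

# Hypothetical example values, chosen only for illustration.
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

dot = np.dot(u, v)   # 1*4 + 2*5 + 3*6
product = A @ B      # each entry is a row of A dotted with a column of B
A_T = A.T            # rows of A become columns
```

Note that `A @ B` and `A * B` differ: the latter multiplies elementwise.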

## 2. **Probability and Statistics**

### Basic Concepts:
- **Probability Distributions**: Describe how likely each outcome of a random variable is.
- **Expectation**: The probability-weighted mean of a random variable.
- **Variance**: The expected squared deviation from the mean; it measures the spread of a distribution.

#### Common Distributions:
- **Normal Distribution**: Bell curve, used widely in statistics.
- **Bernoulli Distribution**: Models binary outcomes (0 or 1).

#### Python Libraries:
- `scipy.stats` for probability distributions.
```python
from scipy.stats import norm
# Example: P(Z <= 1.96) for a standard normal variable (about 0.975)
norm.cdf(1.96)
```
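
To make the definitions above concrete, here is a minimal standard-library sketch of the normal CDF and of the expectation and variance of a Bernoulli variable. The helper names `normal_cdf` and `bernoulli_mean_var` are made up for this example and are not part of `scipy`.

```python
import math

def normal_cdf(x, mean=0.0, std=1.0):
    """P(X <= x) for a normal distribution, computed via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def bernoulli_mean_var(p):
    """Expectation p and variance p * (1 - p) of a Bernoulli(p) variable."""
    return p, p * (1.0 - p)

prob = normal_cdf(1.96)              # close to 0.975
mean, var = bernoulli_mean_var(0.3)  # hypothetical success probability
```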

## 3. **Supervised Learning**

### Linear Regression:
- Predicts continuous values.
- **Equation**: \(y = w_0 + w_1x_1 + \dots + w_nx_n\).
- The weights are fit by minimizing the **mean squared error** (MSE) between predictions and targets.

```python
from sklearn.linear_model import LinearRegression
# Example: Fit a linear regression model.
# X_train and y_train are assumed to be predefined training features and targets.
model = LinearRegression()
model.fit(X_train, y_train)
```
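
As a sketch of what fitting means here, the weights in the equation above can be recovered with an ordinary least-squares solve in NumPy. The data is synthetic and noise-free (generated from \(y = 1 + 2x\)), an assumption made purely so the result is easy to verify.

```python
import numpy as np

# Synthetic, noise-free data generated from y = 1 + 2x (illustrative only).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

# Design matrix with a leading column of ones for the intercept w_0.
X = np.column_stack([np.ones_like(x), x])

# The least-squares solution minimizes the sum of squared residuals,
# which is equivalent to minimizing the MSE.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

mse = np.mean((X @ w - y) ** 2)
```

Here `w[0]` plays the role of \(w_0\) and `w[1]` the role of \(w_1\).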

### Logistic Regression:
- Used for binary classification.
- **Sigmoid function**: Maps the linear model's output to a probability between 0 and 1.
```python
from sklearn.linear_model import LogisticRegression
# Example: Fit a logistic regression model
log_model = LogisticRegression()
log_model.fit(X_train, y_train)
```
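
The role of the sigmoid can be sketched in plain Python. The weights, bias, and 0.5 decision threshold below are hypothetical values for illustration, not fitted parameters.

```python
import math

def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, weights, bias):
    """Probability of the positive class for one sample."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)

# Hypothetical weights chosen so that z = 0.5*2.0 + 1.0*(-1.0) = 0.
p = predict_proba([2.0, -1.0], weights=[0.5, 1.0], bias=0.0)
label = int(p >= 0.5)  # threshold at 0.5 (a common default, assumed here)
```

This simple form can overflow in `math.exp` for very large negative `z`; a more careful implementation would guard against that.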

### Decision Trees and Random Forests:
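- **Decision Tree**: Recursively splits data into subsets based on feature values.
- **Random Forest**: Ensemble of decision trees, each trained on a bootstrap sample with random feature subsets; predictions are combined by voting or averaging to reduce overfitting.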

```python
from sklearn.ensemble import RandomForestClassifier
# Example: Fit a random forest classifier
rf_model = RandomForestClassifier()
rf_model.fit(X_train, y_train)
```
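
To illustrate how a tree might choose a split, the sketch below scores candidate thresholds on a single feature using Gini impurity. Gini is one common criterion and is assumed here; the data and candidate thresholds are made up for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure node, larger for mixed nodes."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_impurity(values, labels, threshold):
    """Size-weighted impurity of the two subsets produced by a threshold."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    return (len(left) * gini(left) + len(right) * gini(right)) / len(labels)

# Hypothetical one-feature dataset.
values = [1.0, 2.0, 3.0, 4.0]
labels = [0, 0, 1, 1]

# Pick the candidate threshold with the lowest weighted impurity.
best = min([1.5, 2.5, 3.5], key=lambda t: split_impurity(values, labels, t))
```

A full tree would repeat this search over every feature and then recurse into each subset.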

## 4. **Unsupervised Learning**

### Clustering:
- Group data points into clusters.

#### K-Means:
- Iteratively assigns each data point to its nearest centroid, then updates each centroid to the mean of its assigned points.
```python
from sklearn.cluster import KMeans
# Example: Perform K-Means clustering
kmeans = KMeans(n_clusters=3)
kmeans.fit(X_train)
```
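
The assign-and-update loop can be sketched in NumPy as below. This is a simplified illustration, not scikit-learn's implementation: it picks random data points as initial centroids and runs a fixed number of iterations, both of which are assumptions of this sketch.

```python
import numpy as np

def kmeans(X, k, n_iters=10, seed=0):
    """Minimal K-Means: returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: distance from every point to every centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: mean of assigned points (keep old centroid if empty).
        centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels, centroids

# Two well-separated hypothetical groups.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels, centroids = kmeans(X, k=2)
```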

### Dimensionality Reduction:
- Reduces the number of features in the dataset.

#### Principal Component Analysis (PCA):
- Projects data onto a lower-dimensional space spanned by the directions of greatest variance.
```python
from sklearn.decomposition import PCA
# Example: Apply PCA
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_train)
```
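
One way to see what PCA computes is through the singular value decomposition of the centered data. The sketch below is a minimal NumPy version under that approach; the function name `pca` and the toy data are assumptions made for illustration.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components via SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (S ** 2) / (len(X) - 1)
    return X_centered @ components.T, components, explained_variance[:n_components]

# Hypothetical data lying exactly on the line y = 2x.
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]])
X_proj, components, var = pca(X, n_components=1)
```

Because the toy points lie on a single line, one component captures all of the variance. The sign of each component is arbitrary.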

## 5. **Evaluation Metrics**

### Classification Metrics:
- **Accuracy**: Proportion of correct predictions.
- **Precision**: \( \frac{TP}{TP + FP} \).
- **Recall**: \( \frac{TP}{TP + FN} \).
- **F1 Score**: Harmonic mean of precision and recall.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
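# Example: Calculate evaluation metrics for binary labels.
# y_test (true labels) and y_pred (predicted labels) are assumed to be predefined.
# For multiclass labels, precision, recall, and F1 need an `average` argument.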
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
```
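
The formulas above can also be computed directly from confusion counts. The following is a minimal pure-Python sketch for binary labels, where 1 is the positive class; the function name and the sample labels are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (a convention assumed here).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

# Hypothetical labels: TP = 2, FP = 1, FN = 2, TN = 3.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0]
accuracy, precision, recall, f1 = classification_metrics(y_true, y_pred)
```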

## 6. **Deep Learning**

### Neural Networks:
- Composed of layers of neurons.
- Each neuron applies a weighted sum and an activation function.

### Activation Functions:
- **ReLU**: \(f(x) = \max(0, x)\).
- **Sigmoid**: \(f(x) = \frac{1}{1 + e^{-x}}\).

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Example: Create a neural network with Keras
model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax')
])
```
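
To show what a single dense layer computes, here is a minimal NumPy sketch of the weighted sum followed by a ReLU activation. The input, weights, and bias are hypothetical values, and `dense_forward` is a name invented for this example rather than a Keras function.

```python
import numpy as np

def relu(z):
    """Elementwise max(0, z)."""
    return np.maximum(0.0, z)

def dense_forward(x, W, b, activation):
    """One layer: weighted sum of inputs plus bias, then the activation."""
    return activation(x @ W + b)

# Hypothetical layer with 2 inputs and 2 units; W has shape (inputs, units).
x = np.array([1.0, -2.0])
W = np.array([[1.0, -1.0],
              [0.5,  2.0]])
b = np.array([0.5, 0.0])

out = dense_forward(x, W, b, relu)  # pre-activation is [0.5, -5.0]
```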

## 7. **Optimization**

### Gradient Descent:
- Iteratively updates model weights in the direction of the negative gradient of the loss function, scaled by a learning rate.
- Variants include **SGD**, **Adam**, and **RMSprop**.

```python
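# Example: Compile a neural network model.
# 'categorical_crossentropy' expects one-hot labels; use
# 'sparse_categorical_crossentropy' for integer class labels.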
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```
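
The basic update rule can be sketched on a one-dimensional loss. The loss \((w - 3)^2\), the learning rate, and the step count below are assumptions chosen so the minimum is known in advance.

```python
def gradient_descent(grad, w0, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(n_steps):
        w = w - learning_rate * grad(w)
    return w

# Hypothetical loss: (w - 3)^2, with gradient 2 * (w - 3) and minimum at w = 3.
w_min = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

Optimizers such as SGD, Adam, and RMSprop build on this rule by using mini-batch gradient estimates and, for the latter two, per-parameter adaptive step sizes.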

## 8. **Model Evaluation**

### Cross-Validation:
- Splits the data into k folds, trains on k-1 of them, evaluates on the held-out fold, and repeats so that every fold serves once as the test set.
```python
from sklearn.model_selection import cross_val_score
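# Example: Perform 5-fold cross-validation.
# Uses the scikit-learn estimator rf_model; at this point `model` refers to the
# Keras network defined above, which cross_val_score cannot use directly.
scores = cross_val_score(rf_model, X_train, y_train, cv=5)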
```
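
How the folds are formed can be sketched without any library. This minimal version uses contiguous, unshuffled folds, an assumption of the sketch; `k_fold_indices` is a hypothetical helper name.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds

folds = k_fold_indices(n_samples=10, k=5)
```

Each sample appears in exactly one test fold, so every observation is used for evaluation once.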

### Hyperparameter Tuning:
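- Use **Grid Search** (tries every combination in a parameter grid) or **Random Search** (samples combinations at random) to find model parameters that score well under cross-validation.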

```python
from sklearn.model_selection import GridSearchCV
# Example: Perform grid search for hyperparameter tuning
param_grid = {'n_estimators': [100, 200], 'max_depth': [10, 20]}
grid_search = GridSearchCV(rf_model, param_grid, cv=3)
grid_search.fit(X_train, y_train)
```
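
To make the search explicit, the sketch below expands a parameter grid into every combination and picks the best one. The scoring function `toy_score` is a made-up stand-in for a cross-validated model score, used only so the example runs on its own.

```python
from itertools import product

def expand_grid(param_grid):
    """List every combination of parameter values as a dict."""
    keys = list(param_grid)
    return [dict(zip(keys, values))
            for values in product(*(param_grid[key] for key in keys))]

def grid_search(param_grid, score_fn):
    """Return the parameter combination with the highest score."""
    return max(expand_grid(param_grid), key=score_fn)

param_grid = {'n_estimators': [100, 200], 'max_depth': [10, 20]}
candidates = expand_grid(param_grid)  # 2 x 2 = 4 combinations

# Hypothetical score; in practice this would be a cross-validation result.
def toy_score(params):
    return -abs(params['n_estimators'] - 200) - abs(params['max_depth'] - 10)

best = grid_search(param_grid, toy_score)
```

The number of candidates grows multiplicatively with each added parameter, which is the main reason to consider random search for larger grids.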

## 9. **Real-World Applications**

### Natural Language Processing (NLP):
- Tasks: Text classification, sentiment analysis.
- Tools: Bag-of-Words, TF-IDF, Word2Vec, Transformers.

### Computer Vision:
- Tasks: Image classification, object detection, segmentation.
- Tools: CNNs, ResNet, YOLO.

### Time Series Forecasting:
- Tasks: Stock price prediction, demand forecasting.
- Tools: ARIMA, LSTM.

---

## Conclusion

This notebook provides references to key concepts in Machine Learning, along with code snippets for practical implementation using Python and popular ML libraries.
