### What is a parameter?

A *parameter* is a variable in a model that is learned from data. For example, weights in a linear regression or neural network are parameters.
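
As a minimal sketch (the toy data and variable names are assumptions made for illustration), the slope and intercept of a fitted straight line are exactly such learned parameters:

```python
import numpy as np

# Toy data generated from y = 2x + 1 (an assumption made for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Least-squares fit of a degree-1 polynomial; the two coefficients are the
# parameters learned from the data.
slope, intercept = np.polyfit(x, y, deg=1)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

The polynomial degree passed to `np.polyfit` is chosen by us rather than learned, which is what distinguishes a hyperparameter from a parameter.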


### What is correlation?

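Correlation (most commonly the Pearson coefficient) measures the strength and direction of the linear relationship between two variables. Values range from -1 (perfect negative linear relationship) through 0 (no linear relationship) to +1 (perfect positive linear relationship).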
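
To make the definition concrete, here is a small pure-Python sketch of the Pearson coefficient. The helper name `pearson` and the sample lists are hypothetical, chosen only for illustration:

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [1, 2, 3, 4, 5]
r_pos = pearson(xs, [2, 4, 6, 8, 10])   # points on an increasing line
r_neg = pearson(xs, [10, 8, 6, 4, 2])   # points on a decreasing line
r_weak = pearson(xs, [2, 1, 4, 1, 3])   # no clear linear trend
print(r_pos, r_neg, r_weak)
```

In practice a library routine such as `numpy.corrcoef` would normally be used; the hand-written version is only meant to show what is being computed.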


### What does negative correlation mean?

Negative correlation means that as one variable increases, the other tends to decrease. For example, temperature vs. heating energy use (in some contexts) might be negatively correlated.


### Define Machine Learning. What are the main components in Machine Learning?

Machine Learning (ML) is a field of computer science that builds algorithms that learn patterns from data to make predictions or decisions. Main components:

- **Data** (features and labels)
- **Model** (hypothesis or algorithm)
- **Loss function** (how we measure errors)
- **Optimizer / Training procedure** (how we update parameters)
- **Evaluation** (metrics, test set)


### How does loss value help in determining whether the model is good or not?

The loss function quantifies how far the model's predictions are from the true values on a dataset. Lower loss usually indicates a better fit. Watch out for overfitting, though: a very low training loss combined with a high test loss indicates poor generalization.
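
The sketch below compares training and test loss for polynomial fits of increasing flexibility. The noisy sine data, the split, and the chosen degrees are all assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a noisy sine curve, split into train and test halves.
x = rng.uniform(-1, 1, size=40)
y = np.sin(3 * x) + rng.normal(scale=0.3, size=x.size)
x_train, y_train = x[:20], y[:20]
x_test, y_test = x[20:], y[20:]

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return float(np.mean((y_true - y_pred) ** 2))

results = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_loss = mse(y_train, np.polyval(coeffs, x_train))
    test_loss = mse(y_test, np.polyval(coeffs, x_test))
    results[degree] = (train_loss, test_loss)
    print(f"degree={degree}: train MSE={train_loss:.3f}, test MSE={test_loss:.3f}")
```

Training loss can only go down as the degree grows, because each model contains the simpler ones. Whether the test loss follows is what tells us if the extra flexibility is capturing signal or noise.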


### What are continuous and categorical variables?

**Continuous** variables take numeric values on a continuum (e.g., height, temperature). **Categorical** variables take a finite set of discrete values (e.g., color: red/green/blue).


### How do we handle categorical variables in Machine Learning? Common techniques?

Common techniques:

- **Label Encoding**: assign integer IDs to categories (suitable for ordinal categories).
- **One-Hot Encoding**: create binary columns for each category (good for nominal categories).
- **Target / Mean Encoding**: replace categories with the mean target value (use carefully to avoid leakage).
- **Binary / Hashing encoding**: for high-cardinality features.
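
A rough pandas sketch of the first and third techniques follows. The `size` and `price` columns and the two small frames are hypothetical:

```python
import pandas as pd

# Hypothetical training data and unseen rows.
train = pd.DataFrame({
    "size": ["small", "large", "medium", "small"],
    "price": [10.0, 30.0, 20.0, 12.0],
})
new = pd.DataFrame({"size": ["large", "small"]})

# Label (ordinal) encoding with an explicit order, so the integers are meaningful.
size_order = ["small", "medium", "large"]
train["size_code"] = pd.Categorical(
    train["size"], categories=size_order, ordered=True
).codes

# Mean-target encoding: learn per-category means on training rows only,
# then apply them to unseen rows so no target information leaks from new data.
size_means = train.groupby("size")["price"].mean()
new["size_target_enc"] = new["size"].map(size_means)

print(train)
print(new)
```

Encoding the training rows with their own target means would still leak information; out-of-fold means are a common remedy and are not shown here.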


### What do you mean by training and testing a dataset?

Training is the process of fitting a model's parameters using labeled data. Testing (evaluation) is measuring model performance on held-out data that the model did not see during training.


### What is sklearn.preprocessing?

A scikit-learn module that contains tools for feature scaling, normalization, encoding, and related transformations (e.g., `StandardScaler`, `MinMaxScaler`, `OneHotEncoder`). Imputation of missing values is handled by the separate `sklearn.impute` module (e.g., `SimpleImputer`).


### What is a Test set?

A test set is a portion of data held out from training and used only to evaluate final model performance and estimate generalization error.


### How do we split data for model fitting (training and testing) in Python?

Typically we use `train_test_split` from `sklearn.model_selection` to split arrays or DataFrames into training and test sets.
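
For imbalanced labels, a stratified split keeps the class proportions the same in both parts. The sketch below uses made-up data (80 samples of class 0, 20 of class 1) purely for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced data: 80 samples of class 0 and 20 of class 1.
X = np.arange(200).reshape(100, 2)
y = np.array([0] * 80 + [1] * 20)

# stratify=y preserves the 80/20 class ratio in both splits;
# random_state makes the shuffle reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
print(X_train.shape, X_test.shape)
print("class-1 share in train/test:", y_train.mean(), y_test.mean())
```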


### How do you approach a Machine Learning problem?

Short approach:
1. Understand problem & success metric
2. Collect & inspect data (EDA)
3. Clean & preprocess
4. Feature engineering
5. Choose baseline model
6. Train & validate (cross-validation)
7. Tune hyperparameters
8. Evaluate on test set
9. Deploy & monitor
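
Steps 3 to 6 can be sketched compactly with a scikit-learn pipeline and cross-validation. The choice of Iris, `LogisticRegression`, and 5 folds here is an assumption for illustration, not a recommendation:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Putting the scaler inside the pipeline means it is re-fit on each training
# fold, so validation folds never influence the preprocessing.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))

# 5-fold cross-validation; the default score for a classifier is accuracy.
scores = cross_val_score(model, X, y, cv=5)
print("fold accuracies:", scores.round(3))
print("mean accuracy:", scores.mean().round(3))
```

A held-out test set would still be kept aside for step 8; cross-validation only replaces the validation split.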


### Why do we have to perform EDA before fitting a model to the data?

Exploratory Data Analysis (EDA) helps detect data issues (missing values, outliers), understand feature distributions and relationships (correlations), and design appropriate preprocessing and modeling strategies.


### How can you find correlation between variables in Python?

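Use `pandas.DataFrame.corr()`, which computes Pearson correlation by default and also accepts `method='spearman'` or `method='kendall'`. `scipy.stats` offers the same measures with p-values (`pearsonr`, `spearmanr`, `kendalltau`).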
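
A short sketch on toy data (the `x` and `y` columns are made up) shows how rank-based measures differ from Pearson when the relationship is monotonic but not linear:

```python
import pandas as pd
from scipy.stats import kendalltau, spearmanr

# Toy data: y grows monotonically with x, but not linearly.
df = pd.DataFrame({"x": [1, 2, 3, 4, 5, 6]})
df["y"] = df["x"] ** 3

pearson_r = df["x"].corr(df["y"])               # Series.corr defaults to Pearson
spearman_rho, _ = spearmanr(df["x"], df["y"])   # rank correlation, with p-value
kendall_tau, _ = kendalltau(df["x"], df["y"])   # concordant vs. discordant pairs

print(f"Pearson={pearson_r:.3f}, Spearman={spearman_rho:.3f}, Kendall={kendall_tau:.3f}")
```

Spearman and Kendall both reach 1 here because the ordering is perfectly preserved, while Pearson stays below 1 because the points do not lie on a line.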


### What is causation? Explain difference between correlation and causation with an example.

Causation means one event causes another. Correlation is only association. Example: ice cream sales and drowning incidents are correlated (both increase in summer) but ice cream does not cause drowning. Temperature (a confounder) causes both to rise.
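
The confounding effect can be simulated. In the sketch below all numbers (effect sizes, noise levels, sample size) are invented for illustration: temperature drives both series, and once its contribution is removed the remaining correlation is close to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Temperature (the confounder) influences both quantities; they do not
# influence each other.
temperature = rng.normal(size=n)
ice_cream = 2.0 * temperature + rng.normal(size=n)
drownings = 1.5 * temperature + rng.normal(size=n)

raw_corr = np.corrcoef(ice_cream, drownings)[0, 1]

def residuals(values, confounder):
    """Remove the part of `values` explained by a straight-line fit on `confounder`."""
    slope, intercept = np.polyfit(confounder, values, deg=1)
    return values - (slope * confounder + intercept)

# Correlation after controlling for temperature (a partial correlation).
partial_corr = np.corrcoef(
    residuals(ice_cream, temperature), residuals(drownings, temperature)
)[0, 1]

print(f"raw correlation:     {raw_corr:.3f}")
print(f"partial correlation: {partial_corr:.3f}")
```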


### What is an Optimizer? Types and examples.

An optimizer updates model parameters to minimize the loss. Examples:

- **Gradient Descent (GD)**: full-batch updates using gradient of loss over entire dataset.
- **Stochastic Gradient Descent (SGD)**: uses one example (or mini-batch) per update.
- **Momentum**: accelerates SGD by adding a fraction of previous update.
- **Adam**: adaptive moment estimates combining momentum and per-parameter adaptive learning rates.

Each has trade-offs: SGD updates are noisy but cheap and scale to large datasets; Adam often converges faster in deep learning but adds extra state and hyperparameters.
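
As a minimal sketch of the simplest of these, full-batch gradient descent, the function below fits a line by repeatedly stepping against the gradient of the mean squared error. The function name, learning rate, step count, and data are all illustrative assumptions:

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of mean((w*x + b - y)^2) with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Move each parameter a small step against its gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x + 2.0 for x in xs]   # data generated from y = 3x + 2
w, b = gradient_descent(xs, ys)
print(f"w={w:.4f}, b={b:.4f}")
```

SGD would compute the same gradients on one example or a mini-batch per step instead of the whole dataset; momentum and Adam change how the step is derived from the gradient.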


### What is sklearn.linear_model?

A scikit-learn module with linear models (LinearRegression, LogisticRegression, Ridge, Lasso, SGDClassifier, etc.).


### What does model.fit() do? What arguments must be given?

`model.fit(X, y)` trains the model on features `X` and labels `y`. Some models accept sample weights or additional args; consult the estimator docs.


### What does model.predict() do? What arguments must be given?

`model.predict(X_new)` returns predictions for new features `X_new`. For probabilistic outputs, use `model.predict_proba(X_new)` (if supported).


### What is feature scaling? How does it help in Machine Learning?

Feature scaling rescales numeric features to a comparable range (e.g., zero mean and unit variance). It keeps features with large numeric ranges from dominating distance-based models (KNN, SVM) and helps gradient-based optimization converge faster.
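
The two most common rescalings can be written out by hand with NumPy. The small matrix below is a made-up example with two features on very different scales:

```python
import numpy as np

# Hypothetical features on very different scales.
X = np.array([[1.0, 1000.0],
              [2.0, 2000.0],
              [3.0, 3000.0]])

# Standardization: subtract the column mean, divide by the column std.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Min-max scaling: map each column onto the interval [0, 1].
X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

print(X_std)
print(X_minmax)
```

`StandardScaler` and `MinMaxScaler` apply the same formulas, with `StandardScaler` using the population standard deviation as `np.std` does by default.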


### How do we perform scaling in Python?

Use `StandardScaler`, `MinMaxScaler`, or `RobustScaler` from `sklearn.preprocessing`. Fit the scaler on the training data only, then transform both train and test with that same fitted scaler so that no test-set statistics leak into training.


### Explain data encoding?

Data encoding transforms categorical/text features into numeric form suitable for ML models (label encoding, one-hot encoding, target encoding, embeddings).


## Hands-on demo: Iris + synthetic regression dataset
We'll demonstrate EDA, correlation, encoding, scaling, train/test split, model training and evaluation, and charts.


In [None]:
# Standard imports and dataset loading
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.metrics import accuracy_score, mean_squared_error, log_loss
import matplotlib.pyplot as plt

# load iris dataset (classification demo)
iris = datasets.load_iris()
X_iris = pd.DataFrame(iris.data, columns=iris.feature_names)
y_iris = pd.Series(iris.target, name='species')
df_iris = pd.concat([X_iris, y_iris], axis=1)
df_iris.head()

In [None]:
# Basic EDA: distributions & scatter plots
# 1) Histograms for numeric features
for col in X_iris.columns:
    plt.figure()
    plt.hist(X_iris[col])
    plt.title(f'Histogram of {col}')
    plt.xlabel(col)
    plt.ylabel('count')
    plt.show()

# 2) Scatter plot: sepal length vs sepal width
plt.figure()
plt.scatter(X_iris['sepal length (cm)'], X_iris['sepal width (cm)'])
plt.xlabel('sepal length (cm)')
plt.ylabel('sepal width (cm)')
plt.title('Sepal length vs Sepal width (scatter)')
plt.show()

In [None]:
# Correlation matrix and simple heatmap (Pearson)
corr = df_iris.corr()
corr

In [None]:
# Visualize correlation matrix with imshow
plt.figure(figsize=(6,5))
plt.imshow(corr, interpolation='nearest')
plt.xticks(range(len(corr)), corr.columns, rotation=45)
plt.yticks(range(len(corr)), corr.columns)
plt.colorbar()
plt.title('Correlation matrix (imshow)')
plt.tight_layout()
plt.show()

### Encoding categorical feature example
We'll create a small categorical column and show One-Hot Encoding with both pandas and scikit-learn.

In [None]:
# Create a small categorical column
df_demo = pd.DataFrame({
    'num': [1.2, 3.4, 2.2, 5.1],
    'color': ['red', 'blue', 'green', 'red']
})
df_demo

In [None]:
# One-Hot Encoding using pandas.get_dummies (the sklearn OneHotEncoder version follows in the next cell)
ohe_pd = pd.get_dummies(df_demo['color'], prefix='color')
ohe_pd

In [None]:
# scikit-learn >= 1.2 uses `sparse_output`; the older `sparse` argument was removed in 1.4
encoder = OneHotEncoder(sparse_output=False, drop=None)
enc = encoder.fit_transform(df_demo[['color']])
enc_df = pd.DataFrame(enc, columns=encoder.get_feature_names_out(['color']))
enc_df

### Scaling example (StandardScaler)
Fit scaler on training split and transform both train and test.

In [None]:
# Create a regression dataset for scaling and modeling demo
from sklearn.datasets import make_regression
X_reg, y_reg = make_regression(n_samples=200, n_features=3, noise=10, random_state=42)
df_reg = pd.DataFrame(X_reg, columns=['f1', 'f2', 'f3'])
df_reg['target'] = y_reg
df_reg.head()

In [None]:
# Train/test split
X = df_reg[['f1','f2','f3']]
y = df_reg['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Fit a linear regression model and evaluate MSE
lr = LinearRegression()
lr.fit(X_train_scaled, y_train)
y_pred = lr.predict(X_test_scaled)
mse = mean_squared_error(y_test, y_pred)
print('Test MSE (LinearRegression with StandardScaler):', mse)

### Demonstration of training and prediction with classification (Iris)
We'll train a LogisticRegression and show `fit()` and `predict()` usage.

In [None]:
# Prepare iris for classification demo
X = X_iris
y = y_iris
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# Feature scaling for LogisticRegression (optional but often helpful)
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

clf = LogisticRegression(max_iter=200)
clf.fit(X_train_s, y_train)  # model.fit(X, y)
y_pred = clf.predict(X_test_s)  # model.predict(X_new)
acc = accuracy_score(y_test, y_pred)
print('Iris test accuracy:', acc)

# If model supports predict_proba we can inspect probabilities and compute log-loss (a loss function)
if hasattr(clf, 'predict_proba'):
    probs = clf.predict_proba(X_test_s)
    ll = log_loss(y_test, probs)
    print('Log loss on test set:', ll)

## Short note on Loss and Overfitting
- The **loss** quantifies model error (e.g., MSE for regression, log-loss for probabilistic classification).
- Track training vs. validation loss: if the training loss is much lower than the validation loss, the model is likely overfitting.

## Full-Stack Web Development: Compact guide + minimal example
This section contains a short overview and a small Flask app (server) + simple frontend (HTML/JS). The Flask app below serves a single page and a JSON API. In Colab you cannot run a long-lived server for external traffic, but the code is runnable locally or in an environment that allows incoming connections.


In [None]:
# Flask app example (save as app.py locally and run: `python app.py`)
flask_example = """from flask import Flask, jsonify, render_template_string

app = Flask(__name__)

INDEX_HTML = '''
<!doctype html>
<html>
  <head>
    <meta charset="utf-8"/>
    <title>Mini Flask + Frontend</title>
  </head>
  <body>
    <h1>Mini Full-Stack Example</h1>
    <div id="message">Press the button to call the API.</div>
    <button onclick="callApi()">Call API</button>
    <script>
      async function callApi() {
        const res = await fetch('/api/hello');
        const data = await res.json();
        document.getElementById('message').innerText = 'API says: ' + data.greeting;
      }
    </script>
  </body>
</html>
'''

@app.route('/')
def index():
    return render_template_string(INDEX_HTML)

@app.route('/api/hello')
def api_hello():
    return jsonify({'greeting': 'Hello from Flask!'})

if __name__ == '__main__':
    app.run(debug=True, port=5000)
"""
print('Flask snippet created. Save to app.py and run locally to test.')