# Python Lab Manual: Practical Exercises & Mini-Projects
This repository contains a sequence of hands-on lab exercises that teach core Python programming, data manipulation, visualization, numerical computing with NumPy, data analysis with Pandas, and fundamental machine learning workflows. It is designed for undergraduate or postgraduate students taking a lab course or for anyone who wants a practical, example-driven introduction to Python-based data science.
The labs progress from basic programming concepts to complete end-to-end machine learning projects, covering 19 topics:
- Demonstrate variables, different data types, and arithmetic operations.
- Implement if-elif-else conditions and loops (for, while) with real-world examples.
- Create Python functions for mathematical operations and demonstrate reading/writing files.
- Store and manipulate data using different data structures (lists, tuples, dicts, sets).
- Use NumPy for array operations.
- Use NumPy for indexing and slicing.
- Use Pandas for data manipulation (load a CSV, perform basic analysis).
- Handle missing values.
- Normalize and regularize numerical features.
- Use visualization techniques (histograms).
- Use visualization techniques (heatmaps).
- Use visualization techniques (scatter plots).
- Train a Linear Regression model on a dataset (e.g., house price prediction) and evaluate performance.
- Implement Logistic Regression to classify a dataset (e.g., predicting student pass/fail).
- Train and visualize a Decision Tree model for classification tasks.
- Implement k-NN for classification problems and find the optimal k value.
- Apply k-Means clustering on an unlabeled dataset and visualize cluster formation.
- Compare multiple ML models using metrics like confusion matrix, precision, recall, and F1-score.
- Implement an end-to-end ML project (e.g., spam detection, customer churn) and supply a detailed `README.md` for the project.
## Repository Structure

```
LAB_MANNUAL_PROJECT/
├── .git/
├── DataSets/
├── Images/
├── Lab19/
├── Lab20_House_Price_P...
├── model/
├── lab1.py
├── lab2.py
├── lab3.py
├── lab4.py
├── lab5.py
├── lab6.py
├── Lab7.py
├── Lab8.py
├── Lab9.py
├── Lab10.py
├── Lab11.py
├── Lab12.py
├── Lab13.py
├── Lab14.py
├── Lab15.py
├── Lab16.py
├── Lab17.py
├── Lab18.py
└── results.txt
```
## Requirements

- Python 3.8+
- Recommended packages (listed in `requirements.txt`):
  - numpy
  - pandas
  - matplotlib
  - scikit-learn
  - seaborn
  - jupyterlab or notebook

Install with:

```bash
pip install -r requirements.txt
```
## Getting Started

- Clone the repository.
- Review the dataset(s) in the `data/` folder.
- Open a notebook from `notebooks/` in JupyterLab or Jupyter Notebook.
- Run cells sequentially. Each notebook contains learning objectives, brief theory, step-by-step code, exercises, and a mini assignment.
## Notebook Structure

Each topic notebook follows a consistent layout:
- Objective — what you will learn
- Theory — short conceptual notes
- Code walkthrough — runnable examples
- Exercises — tasks for practice
- Mini assignment — small project to test learning
## Lab Summaries

### Lab 1: Variables, Data Types, and Arithmetic

- Demonstrations of `int`, `float`, `str`, `bool`, and type conversion.
- Arithmetic operators and simple expressions.
- Small exercises: calculate student grade percentages and write a currency-conversion snippet (see the sketch below).
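As an illustration, a minimal sketch of the kind of exercise Lab 1 targets; the name, marks, and exchange rate are all invented:

```python
# Basic variables and types
name = "Asha"          # str
marks_obtained = 432   # int
max_marks = 500        # int
passed = True          # bool

# Arithmetic with type conversion: percentage as a float
percentage = marks_obtained / max_marks * 100
print(f"{name}: {percentage:.1f}% (passed: {passed})")

# Currency-conversion snippet (the rate is illustrative only)
usd_to_inr = 83.0
amount_usd = 25
print(f"{amount_usd} USD = {amount_usd * usd_to_inr:.2f} INR")
```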
### Lab 2: Conditionals and Loops

- Real-world examples: grade categorization, attendance counting.
- `for` loops, `while` loops, `break`/`continue` usage.
- Exercise: implement passing criteria and aggregate statistics (a sketch follows).
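A minimal sketch of the passing-criteria exercise, assuming an invented marks list and a pass mark of 40:

```python
# Hypothetical marks and passing criterion for illustration
marks = [35, 72, 48, 90, 28, 55]
PASS_MARK = 40

passed_count = 0
for m in marks:
    # if-elif-else: categorize each mark
    if m >= 75:
        grade = "distinction"
    elif m >= PASS_MARK:
        grade = "pass"
    else:
        grade = "fail"
    if grade != "fail":
        passed_count += 1
    print(f"mark={m}: {grade}")

# Aggregate statistics
print(f"{passed_count}/{len(marks)} passed; average = {sum(marks) / len(marks):.1f}")
```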
### Lab 3: Functions and File Handling

- `def` functions for math ops (add, subtract, factorial, gcd).
- Reading/writing CSV and text files with `open()` and `pandas`.
- Example: a function that computes the class average and saves the result to a file (sketched below).
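A minimal sketch of that example, assuming a hypothetical `marks.csv` with a `name,score` header row:

```python
import csv

def average(scores):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(scores) / len(scores)

# Read scores from a hypothetical marks.csv (header: name,score)
scores = []
with open("marks.csv", newline="") as f:
    for row in csv.DictReader(f):
        scores.append(float(row["score"]))

# Write the result to a text file
with open("results.txt", "w") as out:
    out.write(f"class average = {average(scores):.2f}\n")
```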
### Lab 4: Data Structures

- Use and manipulation of `list`, `tuple`, `dict`, `set`.
- When to use which structure, time-complexity notes, examples.
- Exercise: aggregate student marks using dictionaries (see the sketch below).
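A sketch of the marks-aggregation exercise; the records are invented:

```python
# Invented (student, subject, mark) tuples
records = [
    ("asha", "math", 88), ("asha", "physics", 75),
    ("ravi", "math", 64), ("ravi", "physics", 71),
]

# dict of lists: collect each student's marks
marks_by_student = {}
for student, _subject, mark in records:
    marks_by_student.setdefault(student, []).append(mark)

# set comprehension: the distinct subjects
subjects = {subject for _, subject, _ in records}

for student, marks in marks_by_student.items():
    print(f"{student}: total={sum(marks)}, avg={sum(marks) / len(marks):.1f}")
print("subjects:", sorted(subjects))
```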
### Labs 5-6: NumPy Arrays, Indexing, and Slicing

- Creating arrays, broadcasting, vectorized ops, reshaping.
- Indexing, boolean masking, fancy indexing, and slicing examples.
- Exercises: normalize arrays and compute column-wise statistics (see the sketch below).
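A sketch of the normalization exercise on a small random array:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.integers(0, 100, size=(5, 3)).astype(float)  # 5 rows, 3 feature columns

# Column-wise statistics via the axis argument
col_mean = data.mean(axis=0)
col_std = data.std(axis=0)

# Broadcasting: z-score normalize every column in one vectorized expression
normalized = (data - col_mean) / col_std
print(normalized.round(2))

# Boolean masking: rows whose first column is above that column's mean
print(data[data[:, 0] > col_mean[0]])
```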
### Lab 7: Pandas for Data Manipulation

- Loading CSV files (`pd.read_csv`), inspecting data (`head()`, `info()`, `describe()`), filtering, groupby.
- Exercise: load `students.csv` and compute subject-wise averages (see the sketch below).
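A sketch of that exercise, assuming `students.csv` has `name`, `subject`, and `marks` columns (the column names are assumptions):

```python
import pandas as pd

df = pd.read_csv("students.csv")  # assumed columns: name, subject, marks

# Inspect the data
print(df.head())
df.info()
print(df.describe())

# Filter with a boolean mask, then group: subject-wise average marks
passing = df[df["marks"] >= 40]
print(f"{len(passing)} passing rows")
print(df.groupby("subject")["marks"].mean())
```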
### Lab 8: Handling Missing Values

- Detect (`isnull()`), drop (`dropna()`), and impute (`fillna()`, `SimpleImputer`) strategies.
- Case studies: mean/median imputation, forward/backward fill (see the sketch below).
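A sketch of the main strategies on a toy frame:

```python
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy frame with missing values
df = pd.DataFrame({"age": [21, None, 24, 23], "marks": [88, 75, None, 91]})

print(df.isnull().sum())       # detect: NaN count per column
print(df.dropna())             # drop rows containing any NaN
print(df.fillna(df.median()))  # impute with column medians
print(df.ffill())              # forward fill from the previous row

# Equivalent mean imputation with scikit-learn
imputer = SimpleImputer(strategy="mean")
print(pd.DataFrame(imputer.fit_transform(df), columns=df.columns))
```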
### Lab 9: Feature Scaling and Regularization

- Scaling: `MinMaxScaler`, `StandardScaler`, `RobustScaler`.
- Regularization concepts (brief): L1 vs. L2 and where they are used (linear/logistic regression).
- Exercise: scale features for a regression model (see the sketch below).
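A minimal sketch of the two most common scalers on a toy feature matrix:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])  # toy feature matrix

# Min-max scaling squashes each column into [0, 1]
print(MinMaxScaler().fit_transform(X))

# Standardization rescales each column to zero mean, unit variance
print(StandardScaler().fit_transform(X).round(3))

# Regularization lives in the model, not the scaler:
# e.g. scikit-learn's Ridge (L2) and Lasso (L1) linear models.
```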
### Labs 10-12: Visualization

- Histograms: distribution of age, marks, prices.
- Heatmaps: correlation matrices (Pearson) for feature relationships.
- Scatter plots: relationships between features (e.g., sqft vs. price).
- Each notebook contains plotting code using `matplotlib` and `seaborn`, plus interpretation notes. A combined sketch follows.
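A sketch showing all three plot types on synthetic house-price data (the columns and coefficients are invented):

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(1)
df = pd.DataFrame({"sqft": rng.uniform(500, 3000, 200)})
df["price"] = 50 * df["sqft"] + rng.normal(0, 20000, 200)  # invented relationship

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
axes[0].hist(df["price"], bins=20)               # histogram: price distribution
sns.heatmap(df.corr(), annot=True, ax=axes[1])   # heatmap: Pearson correlations
axes[2].scatter(df["sqft"], df["price"], s=10)   # scatter: sqft vs. price
plt.tight_layout()
plt.show()
```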
### Lab 13: Linear Regression

- Train/test split, fit a linear regression model, calculate RMSE, MAE, and R².
- Visualize predictions vs. actual values and residuals (a sketch of the metrics workflow follows).
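A minimal sketch using synthetic data as a stand-in for a house-price dataset:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a house-price dataset
X, y = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

rmse = np.sqrt(mean_squared_error(y_test, pred))
print(f"RMSE={rmse:.2f}  MAE={mean_absolute_error(y_test, pred):.2f}  "
      f"R2={r2_score(y_test, pred):.3f}")
```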
### Lab 14: Logistic Regression

- Binary classification (e.g., pass/fail), confusion matrix, precision, recall, ROC curve (see the sketch below).
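A sketch on synthetic data standing in for a pass/fail dataset:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (confusion_matrix, precision_score, recall_score,
                             roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a pass/fail dataset
X, y = make_classification(n_samples=300, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)
pred = clf.predict(X_test)
proba = clf.predict_proba(X_test)[:, 1]  # scores used for the ROC curve / AUC

print(confusion_matrix(y_test, pred))
print(f"precision={precision_score(y_test, pred):.2f}  "
      f"recall={recall_score(y_test, pred):.2f}  AUC={roc_auc_score(y_test, proba):.2f}")
```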
### Lab 15: Decision Trees

- Train a decision tree classifier, visualize it with `plot_tree`, and discuss overfitting and pruning (a sketch follows).
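A minimal sketch on the built-in iris dataset; `max_depth=3` is an arbitrary pre-pruning choice:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree

iris = load_iris()

# max_depth limits tree growth, a simple guard against overfitting (pre-pruning)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)

plot_tree(clf, filled=True, feature_names=iris.feature_names)
plt.show()
```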
### Lab 16: k-Nearest Neighbors

- Implement k-NN using `sklearn.neighbors.KNeighborsClassifier`; use cross-validation to find the optimal `k` (see the sketch below).
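A sketch of the cross-validated search for `k`, again on the iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# 5-fold cross-validated accuracy for each candidate k
scores = {
    k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    for k in range(1, 16)
}
best_k = max(scores, key=scores.get)
print(f"best k = {best_k} (accuracy = {scores[best_k]:.3f})")
```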
### Lab 17: k-Means Clustering

- Run k-means, use the elbow method to choose k, and visualize the clusters on a 2D PCA projection (sketched below).
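A sketch on synthetic blobs; the choice of four clusters matches how the data is generated:

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

X, _ = make_blobs(n_samples=300, centers=4, n_features=5, random_state=0)

# Elbow method: plot inertia for each candidate k and look for the bend
inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in range(1, 9)]
plt.plot(range(1, 9), inertias, marker="o")
plt.xlabel("k")
plt.ylabel("inertia")

# Fit the chosen k and visualize the clusters on a 2D PCA projection
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
X2 = PCA(n_components=2).fit_transform(X)
plt.figure()
plt.scatter(X2[:, 0], X2[:, 1], c=labels, s=10)
plt.show()
```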
### Lab 18: Model Comparison

- Train several models (Logistic Regression, Decision Tree, k-NN, and optionally SVM) and compare them via confusion matrix, precision, recall, F1-score, and classification report (see the sketch below).
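A sketch of the comparison loop on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "logistic regression": LogisticRegression(),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "k-NN": KNeighborsClassifier(),
}
for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    # classification_report includes precision, recall, and F1 per class
    print(f"--- {name} ---\n{classification_report(y_test, pred)}")
```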
### Lab 19: End-to-End ML Project

- A complete project folder that includes: data, an EDA notebook, a preprocessing script, a model-training notebook, evaluation, a saved model, and a project-level `README.md` describing how to run the project and interpret the results.
## How to Run

- Launch JupyterLab (the labs can also be opened in VS Code):

```bash
jupyter lab
```

- Run a training script from the command line (example):

```bash
python src/train_model.py --config configs/linear_regression.yaml
```

(If you prefer notebooks only, open `notebooks/13_linear_regression.ipynb`.)
## Datasets

- Small synthetic datasets are included in `data/` for the exercises.
- For larger examples (house prices, spam detection), we recommend public datasets from the UCI repository or Kaggle, or generated synthetic data. Always include a `data/README.md` in the project folder describing the dataset and its license.
## Exercises and Grading

Each notebook ends with exercises and a mini-assignment. Suggested grading rubric for assignments:
- Correctness of code: 40%
- Code readability and comments: 20%
- Analysis and interpretation of results: 30%
- Quality of plots and visualizations: 10%
## Contributing

Contributions are welcome. Suggested workflow:

- Fork the repo.
- Create a branch named `feature/<topic-number-summary>`.
- Add notebooks or improve explanations.
- Open a pull request with a description of changes.