Applied AI & Data Science Project Lab
A structured collection of notebook-based projects for machine learning, analytics, and portfolio practice.
Overview · Contents · Projects · How to Use · Learning Flow
AI-DS-100 is an applied AI and Data Science project repository. The uploaded version currently contains 26 implemented project bundles arranged into three levels: Basic, Intermediate, and Advanced.
Each project is designed for practical learning and portfolio building. The bundles generally include a notebook, dataset, exported report/PDF, and a short project description.
Dataset → Cleaning → EDA → Model training → Evaluation → Report/export
The repository name points toward a larger 100-project collection; the 26 bundles listed below are what is implemented so far.
| Repository Part | What it provides |
|---|---|
| DS-Project-Basic/ | Beginner-friendly regression/classification projects with simple datasets and baseline ML workflows. |
| DS-Project-Intermediate/ | More complete prediction projects covering churn, health risk, booking, loan, and sensor-style datasets. |
| DS-Project-Advanced/ | Larger or more involved projects such as car pricing, crime analysis, crop yield, traffic flow, and fraud prediction. |
| Project zip bundles | Each project is packaged separately so it can be downloaded, extracted, and studied independently. |
| Level README files | Each difficulty folder contains its own short level-specific README. |
| MIT License | Allows reuse and modification under the license terms. |

Basic projects (DS-Project-Basic/):

| Project | Area | Dataset Focus |
|---|---|---|
| Delhi House Price Prediction | Regression | MagicBricks housing data |
| Medical Cost Prediction | Regression | Insurance charges data |
| Pima Indians Diabetes Prediction | Classification | Clinical diabetes data |
| Red Wine Quality | Classification / Regression | Wine physicochemical data |
| SFR Analysis | Analysis / Prediction | Launch SFR records |
| Salary Prediction | Regression | Salary and profile data |
| Sleep Disorder Prediction | Classification | Sleep and lifestyle data |
| Titanic Survival Prediction | Classification | Titanic passenger data |

Intermediate projects (DS-Project-Intermediate/):

| Project | Area | Dataset Focus |
|---|---|---|
| Breast Cancer Prediction | Classification | Tumor feature data |
| Cardiovascular Disease Prediction | Classification | Cardio/health indicator data |
| Customer Churn Prediction | Classification | Bank/customer churn data |
| Diamond Price Prediction | Regression | Diamond attributes data |
| E-Commerce Product Delivery Prediction | Classification | Order delivery data |
| Heart Stroke Prediction | Classification | Stroke health data |
| Hotel Reservations Cancellation Prediction | Classification | Hotel booking data |
| House Price Prediction | Regression | Home sales data |
| Loan Approval Prediction | Classification | Applicant/credit data |
| Osteoporosis Risk Prediction | Classification | Health risk data |
| Room Occupancy Detection | Classification | Sensor readings |
| Telecom Customer Churn Prediction | Classification | Telco customer data |

Advanced projects (DS-Project-Advanced/):

| Project | Area | Dataset Focus |
|---|---|---|
| Belarus Car Price Prediction | Regression | Used car listings |
| Calgary Crime Data Analysis and Neural Network Model | Analysis / Prediction | Crime statistics |
| Crop Yield Prediction | Regression | Crop-yield spreadsheet |
| Indian Used Car Price Prediction | Regression | Indian used-car listings |
| Traffic-Flow-Prediction | Classification / Forecasting | Traffic count data |
| Warranty Claims Fraud Prediction | Classification | Warranty claim data |
Most extracted project folders follow this practical structure:
| File Type | Purpose |
|---|---|
| .ipynb | Main Jupyter notebook containing code, analysis, model training, and evaluation. |
| .csv / .xlsx | Dataset used by the notebook. |
| .pdf | Exported notebook/report for quick review. |
| description.md | Short explanation of the problem, workflow, and learning value. |
The projects mainly use familiar beginner-to-intermediate Python data science tools such as pandas, numpy, matplotlib, and scikit-learn.
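Before opening a notebook, it can help to confirm that these libraries are importable in the active environment. The following is a minimal sketch using only the standard library; the `missing_packages` helper and the `PACKAGES` mapping are illustrative names, not part of the repository.

```python
from importlib.util import find_spec

# pip package name -> import name (the two differ for scikit-learn).
PACKAGES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
}


def missing_packages(packages=PACKAGES):
    """Return the pip names of packages that cannot be found for import."""
    return [
        pip_name
        for pip_name, module_name in packages.items()
        if find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install with: pip install " + " ".join(missing))
    else:
        print("All listed packages are importable.")
```

The mapping is kept explicit because the name passed to `pip install` is not always the name used in `import`.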
Most notebooks follow a similar learning pattern:
1. Import libraries
2. Load the dataset
3. Inspect rows, columns, missing values, and basic statistics
4. Clean or encode the data
5. Explore patterns with simple visualizations
6. Split data into train/test sets
7. Train a baseline model
8. Evaluate with suitable metrics
9. Summarize results in a report/export
This consistency makes the repository useful for beginners who want to repeat the same machine-learning workflow across different real-world domains.
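The numbered steps above can be sketched in a few lines of pandas and scikit-learn. This is an illustrative outline, not code taken from any of the notebooks: the data is synthetic (a stand-in for loading a project's CSV), the column names are made up, and logistic regression is just one possible baseline. Step 5 (plots) and step 9 (report export) are left out to keep the sketch short.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Steps 1-2: build a small synthetic frame (stand-in for pd.read_csv on a real dataset).
rng = np.random.default_rng(0)
n_rows = 200
df = pd.DataFrame({
    "feature_a": rng.normal(size=n_rows),
    "feature_b": rng.normal(size=n_rows),
    "group": rng.choice(["x", "y"], size=n_rows),
})
df["target"] = (df["feature_a"] + df["feature_b"] > 0).astype(int)
df.loc[::25, "feature_b"] = np.nan  # inject a few missing values

# Step 3: inspect shape, missing values, and basic statistics.
print(df.shape)
print(df.isna().sum())
print(df.describe())

# Step 4: one-hot encode the categorical column; imputation happens inside
# the pipeline so that it is fitted on the training split only.
X = pd.get_dummies(df.drop(columns="target"), columns=["group"], dtype=float)
y = df["target"]

# Step 6: hold out a stratified test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Step 7: train a baseline model (impute -> scale -> logistic regression).
model = make_pipeline(
    SimpleImputer(strategy="median"),
    StandardScaler(),
    LogisticRegression(),
)
model.fit(X_train, y_train)

# Step 8: evaluate on the held-out data.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

For a regression project the same outline applies with a regressor and a metric such as mean absolute error in place of accuracy.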
AI-DS-100/
├── DS-Project-Basic/
│ ├── README_BASIC.md
│ └── 8 project zip bundles
├── DS-Project-Intermediate/
│ ├── README_INTERMEDIATE.md
│ └── 12 project zip bundles
├── DS-Project-Advanced/
│ ├── README_ADVANCED.md
│ └── 6 project zip bundles
├── LICENSE
└── README.md
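A quick way to check a local copy against this layout is to count the bundles in each level folder. The sketch below assumes the bundles carry a `.zip` extension and that the function is pointed at the repository root; `count_bundles` is a hypothetical helper, not something shipped with the repository.

```python
from pathlib import Path

# Level folder names as shown in the tree above.
LEVEL_DIRS = ("DS-Project-Basic", "DS-Project-Intermediate", "DS-Project-Advanced")


def count_bundles(repo_root):
    """Return {level folder name: number of .zip files directly inside it}."""
    root = Path(repo_root)
    counts = {}
    for level in LEVEL_DIRS:
        level_dir = root / level
        # A missing folder is reported as zero rather than raising an error.
        counts[level] = len(list(level_dir.glob("*.zip"))) if level_dir.is_dir() else 0
    return counts


if __name__ == "__main__":
    for level, count in count_bundles(".").items():
        print(f"{level}: {count} bundle(s)")
```

Against the current release, the counts should match the 8, 12, and 6 bundles listed in the tree.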
- Open the difficulty folder that matches your current level.
- Extract the project zip you want to study.
- Open the notebook in Jupyter Notebook, JupyterLab, VS Code, Google Colab, or Kaggle.
- Install common dependencies if required: `pip install numpy pandas matplotlib scikit-learn jupyter openpyxl`
- Run the notebook from top to bottom.
- Compare your output with the included PDF/export.
- Modify the notebook by adding better EDA, extra metrics, different models, or improved documentation.
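The extraction step can also be scripted. The following sketch unpacks one bundle with the standard library and groups the extracted files by the file types described earlier; `extract_bundle` and the paths in the usage lines are hypothetical, so substitute the actual bundle name.

```python
import zipfile
from pathlib import Path


def extract_bundle(zip_path, dest_dir):
    """Extract a project bundle and group its files by type."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as bundle:
        bundle.extractall(dest)

    found = {"notebooks": [], "datasets": [], "reports": [], "descriptions": []}
    for path in sorted(dest.rglob("*")):
        suffix = path.suffix.lower()
        if suffix == ".ipynb":
            found["notebooks"].append(path)
        elif suffix in {".csv", ".xlsx"}:
            found["datasets"].append(path)
        elif suffix == ".pdf":
            found["reports"].append(path)
        elif suffix == ".md":
            found["descriptions"].append(path)
    return found
```

Usage, with placeholder paths:

```python
files = extract_bundle("DS-Project-Basic/example-project.zip", "workspace/example-project")
print(files["notebooks"])  # the notebook(s) to open in Jupyter, VS Code, Colab, or Kaggle
```

Only extract archives from sources you trust, since `extractall` writes whatever the archive contains.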
- Building a beginner-to-intermediate data science portfolio.
- Practicing classification and regression workflows.
- Learning how similar ML steps change across different datasets.
- Preparing project explanations for resumes, GitHub, LinkedIn, or interviews.
- Using existing notebooks as a base for improved versions with cleaner code and stronger evaluation.
- Some projects are intentionally simple and use baseline models instead of heavy production pipelines.
- Datasets are stored inside individual project bundles, so extract a project before running it.
- The current release contains 26 implemented projects; more projects can be added later while keeping the same three-level structure.
Released under the MIT License.