This repository collects five data science projects. Each one works through a complete workflow: data preprocessing, model building, evaluation, and prediction.
- Task 1: Titanic Survival Prediction
- Task 2: Movie Rating Prediction with Python
- Task 3: Iris Flower Classification
- Task 4: Sales Prediction Using Python
- Task 5: Credit Card Fraud Detection
Task 1, Titanic Survival Prediction: this project predicts whether a passenger aboard the Titanic survived, based on features such as age, class, and sex.
Key Concepts:
- Data Cleaning and Preprocessing
- Feature Engineering
- Logistic Regression Model
- Accuracy Evaluation
Key Libraries Used:
- pandas, numpy, matplotlib, seaborn, scikit-learn
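The workflow for this task can be sketched roughly as follows. This is a minimal illustration, not the repository's actual notebook: the toy DataFrame, its column names (`Age`, `Pclass`, `Sex`, `Survived`), and the median imputation are all assumptions made so the example runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical toy data; a real run would load the Titanic CSV instead.
df = pd.DataFrame({
    "Age": [22, 38, np.nan, 35, 54, 2, 27, 14, np.nan, 58, 20, 39],
    "Pclass": [3, 1, 3, 1, 1, 3, 3, 2, 3, 1, 3, 3],
    "Sex": ["male", "female", "female", "female", "male", "male",
            "female", "female", "male", "female", "male", "male"],
    "Survived": [0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0],
})

# Cleaning: fill missing ages with the median (one common choice, assumed here).
df["Age"] = df["Age"].fillna(df["Age"].median())
# Encoding: map the categorical sex column to 0/1.
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})

X = df[["Age", "Pclass", "Sex"]]
y = df["Survived"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a logistic regression and score it on the held-out rows.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Accuracy: {accuracy:.2f}")
```

With only a dozen rows the accuracy figure is not meaningful; the point is the order of the steps: clean, encode, split, fit, evaluate.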
Task 2, Movie Rating Prediction with Python: this project predicts movie ratings from user preferences using machine learning models such as Linear Regression and tree-based regressors.
Key Concepts:
- Data Cleaning, Preprocessing, and Formatting
- Linear Regression and Random Forest Regressor
- Evaluation Metrics: MSE
Key Libraries Used:
- pandas, numpy, scikit-learn, matplotlib
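Comparing the two regressors by MSE might look like the sketch below. The data is synthetic and the three features (release year, duration, vote count) are hypothetical stand-ins, since the actual dataset columns are not described here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200

# Hypothetical features: release year, duration in minutes, number of votes.
X = np.column_stack([
    rng.integers(1980, 2024, n),
    rng.integers(60, 180, n),
    rng.integers(100, 100_000, n),
])
# Synthetic rating on a 1-10 scale (an assumption for illustration only).
y = np.clip(
    5 + 0.01 * (X[:, 1] - 120) + 0.00002 * X[:, 2] + rng.normal(0, 0.5, n),
    1, 10,
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "LinearRegression": LinearRegression(),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model and record its mean squared error on the test split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name}: MSE = {results[name]:.3f}")
```

Lower MSE is better. Because it is measured in squared rating units, taking the square root gives an error on the same scale as the ratings.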
Task 3, Iris Flower Classification: this project classifies Iris flower species from sepal and petal dimensions using several machine learning classifiers, including K-Nearest Neighbors and Random Forest.
Key Concepts:
- Logistic Regression Model
- Data Visualization
- Model Evaluation: Accuracy, Classification Report, Confusion Matrix
Key Libraries Used:
- pandas, numpy, scikit-learn, seaborn, matplotlib
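The three evaluation outputs listed above can be produced as in this sketch. It uses the copy of the Iris dataset bundled with scikit-learn and a K-Nearest Neighbors classifier; the choice of `n_neighbors=5` and the 80/20 split are assumptions, not settings taken from the project.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42, stratify=iris.target
)

# Fit KNN on the four sepal/petal measurements.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Accuracy, per-class precision/recall, and the confusion matrix.
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
print(classification_report(y_test, y_pred, target_names=iris.target_names))
print(cm)
```

Rows of the confusion matrix are true species and columns are predicted species, so off-diagonal entries show which species get confused with each other.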
Task 4, Sales Prediction Using Python: this project predicts sales from advertising spend on TV, Radio, and Newspaper using Linear Regression and Random Forest Regression.
Key Concepts:
- Regression Analysis
- Model Tuning
- Feature Engineering
Key Libraries Used:
- pandas, numpy, matplotlib, seaborn, scikit-learn
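One way to approach the model tuning step is a small grid search over Random Forest hyperparameters, sketched below. The advertising data is synthetic, and the parameter grid, value ranges, and the formula generating `Sales` are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(1)
n = 200

# Hypothetical advertising spend per channel.
df = pd.DataFrame({
    "TV": rng.uniform(0, 300, n),
    "Radio": rng.uniform(0, 50, n),
    "Newspaper": rng.uniform(0, 100, n),
})
# Synthetic target: sales driven mostly by TV and Radio, plus noise.
df["Sales"] = 3 + 0.05 * df["TV"] + 0.1 * df["Radio"] + rng.normal(0, 1, n)

X = df[["TV", "Radio", "Newspaper"]]
y = df["Sales"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Try a few hyperparameter combinations with 3-fold cross-validation.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    cv=3,
    scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)

# The search refits the best combination on the full training split.
r2 = r2_score(y_test, search.predict(X_test))
print("Best parameters:", search.best_params_)
print(f"Test R^2: {r2:.3f}")
```

Keeping the test split out of the grid search means the reported R^2 is not inflated by the tuning itself.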
Task 5, Credit Card Fraud Detection: this project builds a machine learning model that flags fraudulent transactions based on historical transaction data.
Key Concepts:
- Anomaly Detection
- Model Evaluation
- Handling Imbalanced Data
Key Libraries Used:
- pandas, numpy, scikit-learn, imbalanced-learn
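The sketch below shows why imbalance matters and one simple way to handle it. Note the substitution: instead of resampling with imbalanced-learn, it uses scikit-learn's `class_weight="balanced"` option so the example depends on scikit-learn alone. The dataset is synthetic, with about 2% of rows labeled as fraud (an assumed ratio).

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for transaction data: roughly 98% legitimate, 2% fraud.
X, y = make_classification(
    n_samples=5000, n_features=10, weights=[0.98], flip_y=0, random_state=0
)

# Stratify so both splits keep the same fraud ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# class_weight="balanced" reweights errors inversely to class frequency,
# so mistakes on the rare fraud class cost more during training.
clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Accuracy is misleading here; look at recall and precision on the fraud class.
fraud_recall = recall_score(y_test, y_pred)
print(f"Fraud recall: {fraud_recall:.2f}")
print(classification_report(y_test, y_pred, target_names=["legit", "fraud"]))
```

A model that labels every transaction as legitimate would score about 98% accuracy on this data while catching no fraud, which is why the per-class report is the more useful output.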
To run the code in this repository, you'll need to install the following Python packages:
pip install pandas numpy matplotlib seaborn scikit-learn imbalanced-learn

These projects demonstrate various aspects of data science, from classification to regression and anomaly detection. Each project showcases skills in building and evaluating machine learning models.