This repository collects machine learning experiments covering classification, regression, and data cleaning.
The main tools and data sources used are:
- Python
- Scikit-learn
- Pandas
- NumPy
- Matplotlib
- Jupyter Notebook
- UCI Machine Learning Repository
Below are descriptions of the experiments:
In this experiment, a Decision Tree classifier is implemented to classify iris species using the Iris dataset. The model is trained with a range of maximum depth values, and the best depth is chosen by evaluating and plotting the accuracy on the test set, giving an accuracy of [insert accuracy score]% for the chosen depth.
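The depth sweep described above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's exact code: the 70/30 split, the depth range 1 to 10, and the `random_state` values are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load the Iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)

# Assumed split: 70% train, 30% test, stratified by species.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Train one tree per candidate depth and record its test-set accuracy.
depths = list(range(1, 11))
scores = []
for depth in depths:
    clf = DecisionTreeClassifier(max_depth=depth, random_state=42)
    clf.fit(X_train, y_train)
    scores.append(clf.score(X_test, y_test))

# list.index returns the first match, so ties go to the shallowest tree.
best_depth = depths[scores.index(max(scores))]
print(f"best max_depth={best_depth}, test accuracy={max(scores):.3f}")
```

Plotting `depths` against `scores` with Matplotlib gives the accuracy curve. Choosing the depth on the test set tends to give an optimistic accuracy estimate, so cross-validation on the training data is a common alternative.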
In this experiment, a Random Forest Regressor was trained on a car price prediction dataset using only its numerical features. The model achieved an R2 score of [R2Score], which measures how much of the variation in car prices the input features explain.
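A rough sketch of this workflow is shown below. The actual dataset is not reproduced here, so the code generates synthetic data; every column name (`engine_size`, `horsepower`, `mileage_km`, `fuel_type`, `price`) and every hyperparameter is a hypothetical stand-in.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the car price data; column names are hypothetical.
rng = np.random.default_rng(0)
n = 300
cars = pd.DataFrame({
    "engine_size": rng.uniform(1.0, 4.0, n),
    "horsepower": rng.uniform(60, 300, n),
    "mileage_km": rng.uniform(0, 200_000, n),
    "fuel_type": rng.choice(["petrol", "diesel"], n),
})
cars["price"] = (
    5000
    + 3000 * cars["engine_size"]
    + 40 * cars["horsepower"]
    - 0.02 * cars["mileage_km"]
    + rng.normal(0, 500, n)
)

# Keep only numerical columns as features; this drops fuel_type.
X = cars.drop(columns="price").select_dtypes(include="number")
y = cars["price"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# R2 on the held-out rows.
r2 = r2_score(y_test, model.predict(X_test))
print(f"R2 on held-out data: {r2:.3f}")
```

Using `select_dtypes(include="number")` is one simple way to restrict the model to numerical features without listing columns by hand.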
In this experiment, a UCI dataset with 100 instances and 3 attributes (student marks, study time, and courses) is investigated and cleaned. Regression models are then built and evaluated with metrics such as R2 and RMSE, with attention to the dataset's limitations: with so few samples, accurate predictions that still apply to new data are hard to achieve.
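The evaluation step might look something like the sketch below. It uses synthetic data of the same shape (100 rows, 3 attributes); the column names, the choice of marks as the prediction target, and the use of plain linear regression are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 rows, 3 attributes; column names are hypothetical.
rng = np.random.default_rng(1)
n = 100
students = pd.DataFrame({
    "number_courses": rng.integers(3, 9, n),
    "time_study": rng.uniform(0.5, 8.0, n),
})
students["marks"] = (
    2.0 * students["number_courses"]
    + 5.0 * students["time_study"]
    + rng.normal(0, 3.0, n)
)

# Basic cleaning: drop rows with missing values and exact duplicates.
students = students.dropna().drop_duplicates()

X = students[["number_courses", "time_study"]]
y = students["marks"]  # assumed target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

# R2 is unitless; RMSE is in the same units as the target.
r2 = r2_score(y_test, pred)
rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
print(f"R2={r2:.3f}, RMSE={rmse:.3f}")
```

With only 20 held-out rows, both metrics can shift noticeably from one split to another, which is one concrete form of the small-dataset limitation mentioned above.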
In this experiment, the given diabetes dataset was cleaned, and a machine learning algorithm was implemented to predict whether a woman will have diabetes.
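Since the description names neither the cleaning steps nor the algorithm, the sketch below is only one plausible shape for this experiment. It assumes that zeros in some measurements stand for missing values, imputes them with the median, and uses logistic regression as a stand-in classifier; the data is synthetic and all column names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the diabetes data; column names are hypothetical.
rng = np.random.default_rng(2)
n = 400
data = pd.DataFrame({
    "glucose": rng.normal(120, 30, n).clip(min=40),
    "bmi": rng.normal(32, 6, n).clip(min=15),
    "age": rng.integers(21, 70, n).astype(float),
})
logit = (
    0.04 * (data["glucose"] - 120)
    + 0.08 * (data["bmi"] - 32)
    + 0.02 * (data["age"] - 45)
)
prob = 1.0 / (1.0 + np.exp(-logit))
data["outcome"] = (rng.uniform(size=n) < prob).astype(int)

# Simulate 20 bad readings recorded as zero.
bad_rows = rng.choice(n, size=20, replace=False)
data.loc[bad_rows, "glucose"] = 0.0

# Cleaning (assumed rule): a zero in these columns means "missing".
cleaned = data.copy()
zero_as_missing = ["glucose", "bmi"]
cleaned[zero_as_missing] = cleaned[zero_as_missing].replace(0.0, np.nan)
n_missing = int(cleaned["glucose"].isna().sum())

X = cleaned[["glucose", "bmi", "age"]]
y = cleaned["outcome"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=2, stratify=y
)

# Imputation sits inside the pipeline so medians come from training rows only.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    StandardScaler(),
    LogisticRegression(),
)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"missing glucose values: {n_missing}, test accuracy: {accuracy:.3f}")
```

Keeping the imputer inside the pipeline avoids leaking information from the test rows into the cleaning step.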
Find the programs in a readable format here.