Data Science Internship Task 2 - The Sparks Foundation
This repository contains the code and explanation for Task 2 of my Data Science internship at The Sparks Foundation. The task analyzes the Iris dataset, a classic benchmark for classification in machine learning.
About The Sparks Foundation
The Sparks Foundation is a non-profit organization committed to providing opportunities for students and professionals to develop skills in various fields, including data science and machine learning. They offer a range of tasks and projects to help individuals gain practical experience and knowledge in these domains.
The project is a Python script that performs a comprehensive analysis of the Iris dataset. Here's an overview of what the code does:
- Imports necessary libraries for data analysis, visualization, and modeling.
- Loads the Iris dataset from a CSV file into a DataFrame and preprocesses the data.
- Explores the data, checks for null values, and handles outliers.
- Visualizes the relationships between variables.
- Splits the data into training and testing sets.
- Applies feature scaling so that all features are on a comparable scale.
- Builds a Decision Tree Classifier model and evaluates its performance.
- Visualizes the Decision Tree for insights.
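The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: it assumes scikit-learn's bundled copy of Iris (via `load_iris`) instead of reading Iris.csv, so the block runs on its own, and it prints a text rendering of the tree instead of a plot. The function name `run_pipeline`, the 80/20 split, and the `random_state` value are all hypothetical choices; outlier handling and the plots are left out.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier, export_text


def run_pipeline(test_size=0.2, random_state=42):
    """Hypothetical end-to-end sketch: load, check, split, scale, fit, score."""
    # Stand-in for reading Iris.csv: the bundled dataset as a DataFrame
    # with four feature columns plus an integer "target" column.
    df = load_iris(as_frame=True).frame

    # Count missing values across the whole frame.
    null_count = int(df.isnull().sum().sum())

    X = df.drop(columns="target")
    y = df["target"]
    feature_names = list(X.columns)

    # Hold out a test set, keeping class proportions equal in both parts.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )

    # Fit the scaler on the training data only, then apply it to both sets.
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)

    # Train the classifier and score it on the held-out data.
    model = DecisionTreeClassifier(random_state=random_state)
    model.fit(X_train_scaled, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test_scaled))

    return model, accuracy, null_count, feature_names


if __name__ == "__main__":
    model, accuracy, null_count, feature_names = run_pipeline()
    print(f"Null values: {null_count}")
    print(f"Test accuracy: {accuracy:.3f}")
    # Text rendering of the fitted tree (thresholds are in scaled units).
    print(export_text(model, feature_names=feature_names))
```

Fitting the scaler on the training split only keeps information from the test set out of the preprocessing step. To follow the CSV-based workflow described here, the `load_iris` call would be swapped for `pandas.read_csv("Iris.csv")`, with the column names adjusted to match the file.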
To use this code and analyze the Iris dataset, follow these steps:
- Clone this repository to your local machine:
  git clone https://github.com/himanshumahajan138/Prediction-Using-Decision-Tree-Algorithm
- Install the required libraries if you haven't already. You can use pip for this:
  pip install pandas numpy matplotlib seaborn scikit-learn
- Download the Iris dataset (Iris.csv) and place it in the same directory as the script.
- Run the script:
  python iris_analysis.py
- The script will perform data analysis, build a model, and display the results and visualizations.
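For the dataset step, if a copy of Iris.csv is not at hand, one possible workaround is to write one out from the copy bundled with scikit-learn. This is only a sketch: the helper name `build_iris_frame` is hypothetical, and the column names below are an assumption that may not match what `iris_analysis.py` expects, so rename them as needed.

```python
import pandas as pd
from sklearn.datasets import load_iris


def build_iris_frame():
    """Hypothetical helper: bundled Iris data as a CSV-ready DataFrame."""
    iris = load_iris()
    # Assumed column names; adjust them to whatever the script reads.
    columns = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]
    df = pd.DataFrame(iris.data, columns=columns)
    # Map the integer class codes to their species names.
    df["Species"] = [str(iris.target_names[i]) for i in iris.target]
    return df


if __name__ == "__main__":
    # Write the file next to the script, without the pandas index column.
    build_iris_frame().to_csv("Iris.csv", index=False)
```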
Acknowledgments
- The Sparks Foundation, for providing this internship opportunity and the task.
- The Iris dataset, a classic machine learning dataset available from several public sources.
Feel free to explore and modify the code to suit your own data analysis projects. Happy coding!
Himanshu
Data Science Intern
The Sparks Foundation
This project is licensed under the MIT License - see the LICENSE file for details.