This repository contains solutions to three tasks involving data manipulation, visualization, and predictive modeling. Each task demonstrates specific data science techniques using Python.
Clean and analyze the `employee_data.csv` dataset.
- Remove duplicate entries.
- Handle missing values (fill them with default values or drop the rows).
- Convert the `JoiningDate` column to a proper datetime format.
- Filter out employees whose `Status` is "Resigned".
- Analyze the data (see the sketch after this list):
  - Find the average salary by department.
  - List employees who joined after 2020.
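A minimal pandas sketch of these steps. `JoiningDate` and `Status` come from the task description; column names such as `Salary`, `Department`, and `Name` are assumptions and may differ in the actual file:

```python
import pandas as pd

# Load the raw data (file name taken from the task description).
df = pd.read_csv("employee_data.csv")

# Remove duplicate entries.
df = df.drop_duplicates()

# Handle missing values: fill the salary gaps with a default,
# then drop any rows that are still incomplete.
df["Salary"] = df["Salary"].fillna(df["Salary"].median())  # assumed column
df = df.dropna()

# Convert JoiningDate to a proper datetime format.
df["JoiningDate"] = pd.to_datetime(df["JoiningDate"], errors="coerce")

# Filter out employees whose Status is "Resigned".
df = df[df["Status"] != "Resigned"]

# Average salary by department (Department is an assumed column name).
avg_salary = df.groupby("Department")["Salary"].mean()

# Employees who joined after 2020.
joined_after_2020 = df[df["JoiningDate"].dt.year > 2020]

print(avg_salary)
print(joined_after_2020[["Name", "JoiningDate"]])  # Name is an assumed column
```

Filling the salary column before dropping the remaining incomplete rows keeps more records than discarding every row with any gap.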
Refer to the script `Task1_Data Manipulation and Cleaning.ipynb`.
Expected output:
- Cleaned DataFrame.
- Average salary per department.
- List of employees who joined after 2020.
Explore a public dataset through visualizations.
Any public dataset can be used; this solution uses the Titanic dataset.
- Load the dataset into a Pandas DataFrame.
- Create four meaningful visualizations using Matplotlib or Seaborn.
  - Examples: bar plots, histograms, box plots, etc.
- Generate a correlation heatmap for numerical variables (a seaborn sketch follows this list).
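A rough sketch of the four plots, assuming the Titanic dataset bundled with seaborn (the column names `class`, `survived`, `age`, and `fare` come from that copy of the data):

```python
import seaborn as sns
import matplotlib.pyplot as plt

# Load the Titanic dataset shipped with seaborn.
titanic = sns.load_dataset("titanic")

fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Survival rate by passenger class (barplot averages the 0/1 survived flag).
sns.barplot(data=titanic, x="class", y="survived", ax=axes[0, 0])
axes[0, 0].set_title("Survival rate by passenger class")

# Histogram of passenger ages.
sns.histplot(data=titanic, x="age", bins=30, ax=axes[0, 1])
axes[0, 1].set_title("Passenger age distribution")

# Box plot of fare distribution.
sns.boxplot(data=titanic, y="fare", ax=axes[1, 0])
axes[1, 0].set_title("Fare distribution")

# Correlation heatmap for numerical variables.
corr = titanic.select_dtypes("number").corr()
sns.heatmap(corr, annot=True, cmap="coolwarm", ax=axes[1, 1])
axes[1, 1].set_title("Correlation heatmap")

plt.tight_layout()
plt.show()
```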
Refer to the script `Task2_Data Visualization.ipynb`.
Expected output:
- Visualizations:
  - Survival rate by passenger class.
  - Histogram of passenger ages.
  - Box plot of fare distribution.
  - Correlation heatmap.
Build a classification model to predict the presence of diabetes based on health metrics.
- Data Preprocessing:
  - Replace missing or undefined values.
  - Convert categorical variables to numerical using encoding techniques.
  - Normalize or scale features if necessary.
- Model Building:
  - Split the dataset into training and testing sets.
  - Train two classification models:
    - Logistic Regression
    - Decision Tree
- Model Evaluation:
  - Evaluate models using accuracy, precision, recall, and F1 score (see the sketch after this list).
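A compact sketch of the full pipeline, assuming a Pima-style `diabetes.csv` with columns such as `Glucose`, `BloodPressure`, `BMI`, and an `Outcome` target; the file and column names are assumptions, so adjust them to the actual dataset:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Load the health-metrics data (file name is a placeholder).
df = pd.read_csv("diabetes.csv")

# Replace undefined zero readings with the column median
# (columns listed here are assumptions for a Pima-style dataset).
for col in ["Glucose", "BloodPressure", "BMI"]:
    df[col] = df[col].replace(0, df[col].median())

X = df.drop(columns="Outcome")
y = df["Outcome"]

# Split the dataset into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Scale features (helps Logistic Regression; trees are scale-invariant).
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

models = {
    "Logistic Regression": (LogisticRegression(max_iter=1000), X_train_scaled, X_test_scaled),
    "Decision Tree": (DecisionTreeClassifier(random_state=42), X_train, X_test),
}

# Train each model and report accuracy, precision, recall, and F1 score.
for name, (model, X_tr, X_te) in models.items():
    model.fit(X_tr, y_train)
    preds = model.predict(X_te)
    print(
        f"{name}: accuracy={accuracy_score(y_test, preds):.3f}, "
        f"precision={precision_score(y_test, preds):.3f}, "
        f"recall={recall_score(y_test, preds):.3f}, "
        f"F1={f1_score(y_test, preds):.3f}"
    )
```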
Refer to the script `Task3_Predictive Modeling.ipynb`.
Expected output:
- Preprocessed dataset.
- Performance metrics for Logistic Regression and Decision Tree models:
  - Accuracy, Precision, Recall, F1 Score.
- Clone the repository:

```bash
git clone https://github.com/AbhinavSharma07/Coder_Roots.git
cd Coder_Roots
```