@@ -0,0 +1,49 @@
# **Campus Placement Analysis & Prediction**

**GOAL**

To analyze various factors and predict the salary offered to candidates during campus placements using a machine learning algorithm.

**DATASET**

The dataset can be downloaded from [here](https://www.kaggle.com/benroshan/factors-affecting-campus-placement).

**WHAT I HAVE DONE**
- Step 1: Data Preprocessing & Exploration
- Step 2: Data Training & Model Creation
- Step 3: Performance Evaluation


**Screenshots**

![](https://github.com/ayushi424/PyAlgo-Tree/blob/main/Machine%20Learning/Campus%20Placement%20Analysis%20%26%20Prediction/Images/cp1.jpg)
![](https://github.com/ayushi424/PyAlgo-Tree/blob/main/Machine%20Learning/Campus%20Placement%20Analysis%20%26%20Prediction/Images/cp2.jpg)
![](https://github.com/ayushi424/PyAlgo-Tree/blob/main/Machine%20Learning/Campus%20Placement%20Analysis%20%26%20Prediction/Images/cp3.jpg)
![](https://github.com/ayushi424/PyAlgo-Tree/blob/main/Machine%20Learning/Campus%20Placement%20Analysis%20%26%20Prediction/Images/cp4.jpg)

**MODEL USED**
- Decision Tree Regressor

**LIBRARIES NEEDED**
- pandas
- numpy
- matplotlib
- seaborn
- sklearn (for data training, importing models and performance checks)


**Accuracy of the model used**
- Using the Decision Tree Regressor model (score computed on the training data)
```python
Accuracy achieved : 1.00
```
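
For reference, the value that a scikit-learn regressor returns from `score` is the coefficient of determination (R²), not a classification accuracy. The sketch below shows how that quantity is computed; the arrays `y_true` and `y_pred` are made-up values for illustration and are not taken from the dataset.

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical salaries and predictions, for illustration only.
y_true = np.array([200000.0, 250000.0, 300000.0, 350000.0])
y_pred = np.array([210000.0, 240000.0, 310000.0, 340000.0])

ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2_manual = 1.0 - ss_res / ss_tot                # R^2 = 1 - SS_res / SS_tot

# The manual value should agree with scikit-learn's r2_score.
print(r2_manual, r2_score(y_true, y_pred))
```

A score of 1.00 therefore means the predictions match the targets exactly on whichever rows the score was computed on.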



**CONCLUSION**

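The decision tree regressor fits the training data perfectly (score of 1.00). Because this score is computed on the training set, it does not by itself show how well the model generalizes to unseen data.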


**Author**

[Ayushi Shrivastava](https://github.com/ayushi424)
@@ -0,0 +1,127 @@
# -*- coding: utf-8 -*-
"""campus_placement_analysis&prediction.ipynb

Automatically generated by Colaboratory.

Original file is located at
https://colab.research.google.com/drive/1H8LaqWZcR6OqEhPmavKEbh2duuiTLypC

# **Campus Placement Analysis & Prediction**

In this project we analyze various features that affect campus placements and then predict salary using the decision tree regressor algorithm.

The following steps are followed:

- **Data preprocessing and exploration** to understand what kind of data we will be working with.
- **Data training** using the train_test_split function from sklearn to split the data into training and testing sets, followed by **model creation** using the decision tree regressor algorithm.
- **Performance evaluation** by checking error and score to find how well the algorithm performs on this project.

The dataset used in this project is available [here](https://www.kaggle.com/benroshan/factors-affecting-campus-placement).

### **Data Preprocessing & Exploration**
"""

#importing pandas library.
import pandas as pd

#loading and reading the data.
data=pd.read_csv('/content/Placement_Data_Full_Class.csv')
data

#to view shape of the dataset i.e. total number of rows and columns.
data.shape

#to view first 5 rows of the dataset.
data.head()

#to view last 5 rows of the dataset.
data.tail()

#to view different columns of the dataset.
data.columns

#to view memory usage, non-null values, datatypes of columns.
data.info()

#to view statistical summary of the dataset.
data.describe()

#to check for any missing or null values in the dataset.
data.isnull().sum()

"""There are 67 null values in 'salary', so we can't proceed with the data as it is.

We need to replace the null values with the mean of that column.
"""
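
"""A minimal sketch of mean imputation on a toy frame, kept in a markdown cell so it does not change the notebook's state. The frame `toy` and its values are made up for illustration and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the 'salary' column; values are illustrative.
toy = pd.DataFrame({"salary": [200000.0, np.nan, 300000.0, np.nan]})

# Series.mean skips NaN values by default, so this is the mean of the known salaries.
mean_salary = toy["salary"].mean()

# Assign the result back instead of relying on inplace=True.
toy["salary"] = toy["salary"].fillna(mean_salary)

print(toy["salary"].tolist())
```

Filling with the numeric mean keeps the column as a float dtype, whereas filling with a quoted string would mix strings into a numeric column.
"""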

data['salary'].isnull().sum()

data['salary'].mean()

data['salary'] = data['salary'].fillna(data['salary'].mean())
# fillna replaces the null values with the numeric mean of the column (not a hard-coded string).

data['salary'].isnull().sum()

#to check again for any missing or null values in the dataset.
data.isnull().sum()

#to view total null values in the dataset.
data.isnull().sum().sum()

"""The dataset now has no null values, i.e. it is clean.

We can now proceed with further steps.

### **Data Training**
"""

data.info()

data1=data.drop(['gender','ssc_b','hsc_b','hsc_s','degree_t','workex','specialisation','status'],axis=1)
data1.info()

#converting data into int datatype to avoid errors below.
prepareddata=data1.astype(int)
prepareddata.head()

# Import train_test_split from sklearn.model_selection
from sklearn.model_selection import train_test_split
# Here, x holds the features and y holds the target.
x=prepareddata.drop(['salary'],axis=1)
y=prepareddata['salary']

# Split data into training data and testing data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2,random_state=500)
#Ratio used for splitting training and testing data is 8:2 respectively
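
"""A small sketch of what an 8:2 split produces, using ten made-up samples. The arrays `X_toy` and `y_toy` are illustrative only; the `test_size` and `random_state` values mirror the call above.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten toy samples with two features each; purely illustrative.
X_toy = np.arange(20).reshape(10, 2)
y_toy = np.arange(10)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=500
)

# With test_size=0.2, 2 of the 10 rows are held out and 8 are used for training.
print(X_tr.shape, X_te.shape)
```

Fixing `random_state` makes the shuffle reproducible, so the same rows land in the test set on every run.
"""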

"""### **Model Creation using Decision Tree Regressor Algorithm**"""

# Importing decision tree regressor
from sklearn.tree import DecisionTreeRegressor
reg = DecisionTreeRegressor()

#Fitting the model to the training data.
reg.fit(x_train, y_train)

# Making predictions on Test data
pred = reg.predict(x_test)

pred

"""### **Performance Evaluation**"""

import numpy as np
from sklearn.metrics import mean_squared_error
# RMSE is computed on the test set; reg.score returns the R^2 score, computed here on the training set.
print("Model\t\t\t RootMeanSquareError (test) \t\t R^2 score (train)")
print("""Decision Tree Regressor \t\t {:.2f} \t \t\t {:.2f}""".format( np.sqrt(mean_squared_error(y_test, pred)), reg.score(x_train,y_train)))
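
"""A hedged sketch of comparing the score on training rows with the score on held-out rows. The data here is synthetic (a noisy linear target generated with NumPy), so the numbers it prints say nothing about the placement dataset; the names `X_demo`, `y_demo` and `tree` are illustrative.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data (illustrative only): a noisy linear target.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 100, size=(200, 3))
y_demo = 1000 * X_demo[:, 0] + rng.normal(0, 20000, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)
tree = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)

# An unconstrained tree can memorise the training rows, so this score is optimistic.
train_r2 = tree.score(X_tr, y_tr)
# The score on held-out rows estimates performance on unseen data.
test_r2 = tree.score(X_te, y_te)
test_rmse = np.sqrt(mean_squared_error(y_te, tree.predict(X_te)))

print(train_r2, test_r2, test_rmse)
```

A large gap between the two scores is a sign of overfitting; limiting `max_depth` or `min_samples_leaf` is one common way to reduce it.
"""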

"""Conclusion Drawn:
* The decision tree regressor reaches a score of 1.00 on the training data. Because this score is computed on the rows the model was trained on, it shows that the tree fits the training set perfectly, not how it performs on unseen data.

* Decision tree regressors are widely used for regression tasks and various prediction projects.

**Author**

[Ayushi Shrivastava](https://github.com/ayushi424)
"""