# Placement Probability & Gap Analysis Engine

## Project Type
Machine-learning-based predictive and prescriptive system

## Objective
Predict each student's placement category (Tier 1, Tier 2, Mass Recruiter, or Unplaced)
and provide actionable improvement guidance through gap analysis.

## Core Components
- Multi-class classification
- Interpretable ML models
- Student profiling
- Gap analysis engine
- Visualization system
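
As a rough illustration of what the gap analysis component could compute, the sketch below compares one student's profile with a benchmark profile for a target tier and reports the shortfall per feature. The benchmark values, the function name `compute_gaps`, and the sample student are hypothetical placeholders, not values from this project. `Backlogs` is left out because a higher value is worse there, so it would need its own rule.

```python
# Hypothetical benchmark for a target tier; these numbers are illustrative assumptions.
TIER_1_BENCHMARK = {
    "CGPA": 8.5,
    "Coding_Rating": 1800,
    "Aptitude_Score": 80,
    "Internship_Count": 2,
}


def compute_gaps(student, benchmark):
    """Return benchmark minus student value per feature, keeping only positive shortfalls."""
    gaps = {}
    for feature, target in benchmark.items():
        shortfall = target - student.get(feature, 0)
        if shortfall > 0:
            gaps[feature] = round(shortfall, 2)
    return gaps


# Made-up student profile used only to exercise the function.
student = {"CGPA": 7.9, "Coding_Rating": 1500, "Aptitude_Score": 85, "Internship_Count": 1}
gaps = compute_gaps(student, TIER_1_BENCHMARK)
print(gaps)
```

Features where the student already meets the benchmark (here `Aptitude_Score`) are omitted, so the result reads directly as a list of areas to improve.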


In [3]:
import pandas as pd
import numpy as np
import os
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression  # secondary model named in the model strategy
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix


In [5]:
folders = [
    "data",
    "notebooks",
    "models",
    "src",
    "outputs",
    "visualizations"
]

# exist_ok=True makes the call a no-op for folders that already exist.
for folder in folders:
    os.makedirs(folder, exist_ok=True)

print("Project folder structure created successfully.")


Project folder structure created successfully.


In [7]:
data_schema = {
    "CGPA": "Float (0-10)",
    "Backlogs": "Binary (0/1)",
    "Internship_Count": "Integer",
    "Coding_Rating": "Integer (800-2200)",
    "Aptitude_Score": "Integer (0-100)",
    "Target_Class": "Tier_1 / Tier_2 / Mass_Recruiter / Unplaced"
}

pd.DataFrame(data_schema.items(), columns=["Feature", "Description"])


| | Feature | Description |
|---|---|---|
| 0 | CGPA | Float (0-10) |
| 1 | Backlogs | Binary (0/1) |
| 2 | Internship_Count | Integer |
| 3 | Coding_Rating | Integer (800-2200) |
| 4 | Aptitude_Score | Integer (0-100) |
| 5 | Target_Class | Tier_1 / Tier_2 / Mass_Recruiter / Unplaced |
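
The schema above could be enforced with a small validation step before any modeling. The following is a minimal sketch, assuming records arrive as plain dictionaries; `FEATURE_BOUNDS`, `VALID_CLASSES`, and `validate_record` are hypothetical names, and the lower bound of 0 for `Internship_Count` is an assumption since the schema only says "Integer".

```python
# Bounds copied from the schema; None means the schema states no bound.
FEATURE_BOUNDS = {
    "CGPA": (0, 10),
    "Backlogs": (0, 1),
    "Internship_Count": (0, None),  # assumption: counts cannot be negative
    "Coding_Rating": (800, 2200),
    "Aptitude_Score": (0, 100),
}
VALID_CLASSES = {"Tier_1", "Tier_2", "Mass_Recruiter", "Unplaced"}


def validate_record(record):
    """Return a list of human-readable problems; an empty list means the record is valid."""
    errors = []
    for feature, (low, high) in FEATURE_BOUNDS.items():
        if feature not in record:
            errors.append(f"{feature}: missing")
            continue
        value = record[feature]
        if (low is not None and value < low) or (high is not None and value > high):
            errors.append(f"{feature}: {value} outside allowed range")
    if record.get("Target_Class") not in VALID_CLASSES:
        errors.append("Target_Class: unknown label")
    return errors


# Made-up records used only to exercise the check.
good = {"CGPA": 8.1, "Backlogs": 0, "Internship_Count": 2,
        "Coding_Rating": 1600, "Aptitude_Score": 72, "Target_Class": "Tier_2"}
bad = {"CGPA": 11.0, "Backlogs": 0, "Internship_Count": 1,
       "Coding_Rating": 500, "Aptitude_Score": 72, "Target_Class": "Tier_3"}
print(validate_record(good))
print(validate_record(bad))
```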


In [9]:
target_classes = ["Tier_1", "Tier_2", "Mass_Recruiter", "Unplaced"]
target_classes


['Tier_1', 'Tier_2', 'Mass_Recruiter', 'Unplaced']
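
One detail worth keeping in mind when these labels are encoded: scikit-learn's `LabelEncoder` assigns integer codes in sorted label order, not in the order of `target_classes`. The sketch below shows that behaviour next to an explicit mapping that preserves the list order; the names `class_to_id` and `id_to_class` are hypothetical.

```python
from sklearn.preprocessing import LabelEncoder

target_classes = ["Tier_1", "Tier_2", "Mass_Recruiter", "Unplaced"]

# LabelEncoder sorts the labels, so the codes do not follow the list order above.
encoder = LabelEncoder().fit(target_classes)
print(list(encoder.classes_))

# Explicit mapping that keeps the order of target_classes (index 0 = Tier_1).
class_to_id = {name: idx for idx, name in enumerate(target_classes)}
id_to_class = {idx: name for name, idx in class_to_id.items()}
print(class_to_id)
```

Either encoding works for a decision tree; the explicit mapping is mainly useful if the tier order should be readable from the integer codes.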

In [11]:
academic_features = ["CGPA", "Backlogs"]
skill_features = ["Coding_Rating", "Aptitude_Score"]
experience_features = ["Internship_Count"]

all_features = academic_features + skill_features + experience_features
all_features


['CGPA', 'Backlogs', 'Coding_Rating', 'Aptitude_Score', 'Internship_Count']

## Project Pipeline

1. Data Generation
2. Data Preprocessing
3. Feature Engineering
4. Model Training
5. Model Evaluation
6. Prediction System
7. Gap Analysis Engine
8. Visualization
9. Result Interpretation
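
Step 1 of the pipeline above (data generation) could look roughly like the sketch below, which samples features within the ranges of the data schema. The function name `generate_students`, the sampling ranges inside the schema limits, the scoring rule, and the class thresholds are all assumptions made for illustration; they are not the project's actual generation logic.

```python
import numpy as np
import pandas as pd


def generate_students(n=200, seed=42):
    """Sample a synthetic student table that respects the schema ranges."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "CGPA": rng.uniform(5.0, 10.0, n).round(2),   # assumed sub-range of 0-10
        "Backlogs": rng.integers(0, 2, n),            # binary 0/1
        "Internship_Count": rng.integers(0, 4, n),    # assumed 0-3
        "Coding_Rating": rng.integers(800, 2201, n),  # 800-2200 inclusive
        "Aptitude_Score": rng.integers(0, 101, n),    # 0-100 inclusive
    })
    # Hypothetical scoring rule: each feature contributes a value in [0, 1],
    # and a backlog subtracts a fixed penalty.
    score = (
        df["CGPA"] / 10
        + (df["Coding_Rating"] - 800) / 1400
        + df["Aptitude_Score"] / 100
        + df["Internship_Count"] / 3
        - 0.5 * df["Backlogs"]
    )
    # Hypothetical thresholds that turn the score into the four target classes.
    df["Target_Class"] = pd.cut(
        score,
        bins=[-np.inf, 1.5, 2.2, 2.8, np.inf],
        labels=["Unplaced", "Mass_Recruiter", "Tier_2", "Tier_1"],
    ).astype(str)
    return df


students = generate_students()
print(students.head())
print(students["Target_Class"].value_counts())
```

Fixing the seed keeps the generated table reproducible across notebook runs, which makes later evaluation numbers comparable.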


In [15]:
model_strategy = {
    "Primary_Model": "Decision Tree Classifier",
    "Secondary_Model": "Logistic Regression",
    "Profiling_Model": "K-Means Clustering",
    "Prediction_Type": "Multi-Class Classification",
    "Evaluation_Focus": "Recall for Unplaced Class",
    "System_Type": "Decision Support System"
}

pd.DataFrame(model_strategy.items(), columns=["Component", "Design Choice"])


| | Component | Design Choice |
|---|---|---|
| 0 | Primary_Model | Decision Tree Classifier |
| 1 | Secondary_Model | Logistic Regression |
| 2 | Profiling_Model | K-Means Clustering |
| 3 | Prediction_Type | Multi-Class Classification |
| 4 | Evaluation_Focus | Recall for Unplaced Class |
| 5 | System_Type | Decision Support System |
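
To make the "Recall for Unplaced Class" focus concrete, here is a minimal sketch that trains a decision tree and reports recall for that single class. The data is random placeholder data with a made-up labeling rule over two features, and the tree depth and split ratio are arbitrary assumptions; only the scikit-learn calls themselves reflect the strategy above.

```python
import numpy as np
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Placeholder data: two features and a hypothetical threshold rule for the labels.
rng = np.random.default_rng(0)
n = 400
cgpa = rng.uniform(5.0, 10.0, n)
coding = rng.integers(800, 2201, n)
score = cgpa / 10 + (coding - 800) / 1400
labels = np.array(["Unplaced", "Mass_Recruiter", "Tier_2", "Tier_1"])
y = labels[np.digitize(score, [1.0, 1.3, 1.6])]
X = np.column_stack([cgpa, coding])

# Stratify so every class appears in both the training and the test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# A shallow tree stays small enough to inspect; max_depth=4 is an arbitrary choice.
clf = DecisionTreeClassifier(max_depth=4, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Restrict recall to the "Unplaced" class: the share of truly unplaced
# students in the test set that the model flags as unplaced.
unplaced_recall = recall_score(y_test, y_pred, labels=["Unplaced"], average=None)[0]
print(f"Recall for Unplaced: {unplaced_recall:.2f}")
```

Reporting recall for this class separately, instead of overall accuracy, keeps attention on at-risk students the model fails to flag.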


In [17]:
import json

project_config = {
    "project_name": "Placement Probability & Gap Analysis Engine",
    "domain": "Machine Learning",
    "type": "Predictive + Prescriptive",
    "classes": target_classes,
    "features": all_features,
    "primary_model": "Decision Tree",
    "evaluation_metric": "Recall",
    "gap_analysis": True
}

# Persist the configuration so later notebooks and scripts can reuse it.
with open("project_config.json", "w") as f:
    json.dump(project_config, f, indent=4)

print("Project configuration saved.")


Project configuration saved.
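
Later notebooks would typically read this file back before doing anything else. The sketch below shows one way to load it and fail early if expected keys are absent. `load_config` and `REQUIRED_KEYS` are hypothetical names, and the demo writes a throwaway copy to a temporary directory so that it runs on its own; inside the project the path would simply be `project_config.json`.

```python
import json
import os
import tempfile

# Keys the downstream steps are assumed to rely on.
REQUIRED_KEYS = {"classes", "features", "primary_model", "evaluation_metric"}


def load_config(path):
    """Load the JSON config and raise KeyError if a required key is missing."""
    with open(path) as f:
        config = json.load(f)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise KeyError(f"Config is missing keys: {sorted(missing)}")
    return config


# Stand-in config mirroring the structure saved above.
sample = {
    "classes": ["Tier_1", "Tier_2", "Mass_Recruiter", "Unplaced"],
    "features": ["CGPA", "Backlogs", "Coding_Rating", "Aptitude_Score", "Internship_Count"],
    "primary_model": "Decision Tree",
    "evaluation_metric": "Recall",
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "project_config.json")
    with open(path, "w") as f:
        json.dump(sample, f, indent=4)
    loaded = load_config(path)

print(loaded["classes"])
```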
