# Student Placement Data Analysis using Pandas

This notebook helps you practice real-world **Pandas Data Analysis** tasks using the dataset `student_placement_data.csv`.

Each question builds your understanding step-by-step.

---


### 1️⃣ Load and Explore the Dataset
Load the dataset and display the first 5 rows to get an overview.

In [1]:
# Import necessary library
import pandas as pd

# Load dataset
df = pd.read_csv('/content/student_placement_data (1).csv')


# Display first 5 rows
df.head()


| Unnamed: 0 | Student_ID | Student_Name | Course | Assignment_Completion_% | Content_Score_Avg | Communication_Skill_(10) | Batch_Start_Date | Placed | CTC_LPA |
|---|---|---|---|---|---|---|---|---|---|
| 0 | STU001 | Ishaan Das | Data Analytics | 83 | 53 | 9 | 2024-05-19 | Yes | 7.6 |
| 1 | STU002 | Anika Patel | Full Stack Web Dev | 96 | 42 | 5 | 2024-02-17 | No | |
| 2 | STU003 | Aditya Kumar | Full Stack Web Dev | 73 | 40 | 6 | 2024-08-12 | Yes | 4.29 |
| 3 | STU004 | Aditya Verma | Full Stack Web Dev | 59 | 44 | 10 | 2024-02-19 | Yes | 5.33 |
| 4 | STU005 | Krishna Sharma | Full Stack Web Dev | 87 | 65 | 9 | 2024-02-23 | Yes | 6.51 |
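
The `Unnamed: 0` column above usually means the CSV was written with its index included. The following is a minimal sketch, assuming that is the case here: it reads a small inline sample (a few columns of the rows shown above, not the real file) with `index_col=0` and `parse_dates`, then checks dtypes and missing values. The names `sample_csv` and `sample_df` are illustrative only.

```python
import io
import pandas as pd

# Inline sample copied from the rows shown above. The empty first header
# is how a saved index typically appears in a CSV (an assumption here).
sample_csv = io.StringIO(
    ",Student_ID,Student_Name,Course,Batch_Start_Date,Placed,CTC_LPA\n"
    "0,STU001,Ishaan Das,Data Analytics,2024-05-19,Yes,7.6\n"
    "1,STU002,Anika Patel,Full Stack Web Dev,2024-02-17,No,\n"
    "2,STU003,Aditya Kumar,Full Stack Web Dev,2024-08-12,Yes,4.29\n"
)

# index_col=0 reuses the saved index instead of creating "Unnamed: 0";
# parse_dates converts Batch_Start_Date from text to datetime64.
sample_df = pd.read_csv(sample_csv, index_col=0, parse_dates=["Batch_Start_Date"])

print(sample_df.dtypes)        # column types after parsing
print(sample_df.isna().sum())  # missing values per column
```

On the full dataset, the same two checks (`dtypes` and `isna().sum()`) give a quick view of which columns need cleaning before analysis.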


### 2️⃣ Find the average CTC for each course
Group the data by `Course` and find the **average CTC_LPA** for placed students.

In [3]:
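# mean() skips missing values by default, so rows with a blank CTC_LPA
# (the unplaced students in the rows shown above) do not affect the average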
avg_ctc = df.groupby('Course')['CTC_LPA'].mean()
print(avg_ctc)

Course
Data Analytics        5.990625
Full Stack Web Dev    6.177105
Python DA             6.472051
Name: CTC_LPA, dtype: float64
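
The task asks for the average over placed students only. A minimal sketch on made-up data (the `toy` frame and its values are hypothetical) shows that filtering on `Placed == 'Yes'` first gives the same result as relying on `mean()` to skip missing values, provided unplaced students have no `CTC_LPA` recorded.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the unplaced student has a missing CTC_LPA.
toy = pd.DataFrame({
    "Course": ["A", "A", "B", "B"],
    "Placed": ["Yes", "No", "Yes", "Yes"],
    "CTC_LPA": [6.0, np.nan, 4.0, 8.0],
})

# Implicit: mean() ignores NaN, so the unplaced row drops out on its own.
implicit = toy.groupby("Course")["CTC_LPA"].mean()

# Explicit: keep only placed students, then average per course.
explicit = toy[toy["Placed"] == "Yes"].groupby("Course")["CTC_LPA"].mean()

print(explicit)
```

The explicit filter states the intent in code and stays correct even if an unplaced row ever carried a CTC value.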


### 3️⃣ Compare average communication skill between placed and unplaced students
Find average `Communication_Skill_(10)` for both placed and unplaced students.

In [8]:
# Average communication score per placement status; reset_index() turns the result into a DataFrame
average_communication_skill = df.groupby('Placed')['Communication_Skill_(10)'].mean().reset_index()
print(average_communication_skill)

  Placed  Communication_Skill_(10)
0     No                  6.073171
1    Yes                  6.825688
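
A difference in means is easier to judge next to the group sizes. The sketch below, on hypothetical data, uses `agg` to report the mean and the count together; the `toy` frame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical scores: three placed students and two unplaced students.
toy = pd.DataFrame({
    "Placed": ["Yes", "Yes", "Yes", "No", "No"],
    "Communication_Skill_(10)": [8, 6, 10, 4, 6],
})

# agg() with a list of function names returns one column per statistic.
summary = toy.groupby("Placed")["Communication_Skill_(10)"].agg(["mean", "count"])

print(summary)
```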


### 4️⃣ Find top 10 students by assignment completion percentage
Sort the data by `Assignment_Completion_%` in descending order and display top 10 students.

In [12]:
# Sort from highest to lowest completion percentage and keep the first 10 rows
top_10_students = df.sort_values(by='Assignment_Completion_%', ascending=False).head(10)
print(top_10_students[['Student_ID','Student_Name','Course','Assignment_Completion_%']])

    Student_ID   Student_Name              Course  Assignment_Completion_%
136     STU137    Myra Sharma           Python DA                       99
41      STU042    Aarav Reddy  Full Stack Web Dev                       99
78      STU079    Meera Singh           Python DA                       98
75      STU076      Ira Verma      Data Analytics                       98
19      STU020  Reyansh Gupta  Full Stack Web Dev                       97
13      STU014     Aarav Nair  Full Stack Web Dev                       97
57      STU058       Diya Das           Python DA                       97
99      STU100   Aadhya Gupta  Full Stack Web Dev                       97
1       STU002    Anika Patel  Full Stack Web Dev                       96
42      STU043     Aarav Nair  Full Stack Web Dev                       96
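
`head(10)` cuts the list at exactly 10 rows, so students tied with the last score shown may be left out. A minimal sketch on hypothetical data compares it with `DataFrame.nlargest(..., keep='all')`, which keeps every row tied at the cutoff; the `toy` frame is made up and uses a top 2 instead of a top 10 to stay short.

```python
import pandas as pd

# Hypothetical data with a tie at the second-highest score.
toy = pd.DataFrame({
    "Student_ID": ["S1", "S2", "S3", "S4", "S5"],
    "Assignment_Completion_%": [99, 97, 97, 96, 90],
})

# sort_values + head returns exactly 2 rows and drops one of the tied students.
top_by_head = toy.sort_values(by="Assignment_Completion_%", ascending=False).head(2)

# nlargest with keep="all" returns every row tied with the cutoff value.
top_with_ties = toy.nlargest(2, "Assignment_Completion_%", keep="all")

print(len(top_by_head), len(top_with_ties))
```

Which behavior is right depends on the question: a fixed-size list, or everyone who reached the cutoff score.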


### 5️⃣ Group by course and check placement rate
Find how many students are placed and not placed in each course, then compute the placement rate per course.

In [20]:
# Count students per (Course, Placed) pair, then pivot Placed into columns.
# Long-format alternative:
#print(df.groupby(['Course','Placed']).size().reset_index(name='Count'))

placement_summary = df.groupby(['Course','Placed']).size().unstack(fill_value=0).reset_index()
placement_summary['Total_Students'] = placement_summary['Yes'] + placement_summary['No']
placement_summary['Placement_Rate_%'] = (placement_summary['Yes']/placement_summary['Total_Students'])*100

# The index was already reset above; resetting it again would add a redundant 'index' column
print(placement_summary)

Placed              Course  No  Yes  Total_Students  Placement_Rate_%
0           Data Analytics  22   32              54         59.259259
1       Full Stack Web Dev  10   38              48         79.166667
2                Python DA   9   39              48         81.250000
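
As a cross-check on the placement rate, `pd.crosstab` with `normalize='index'` computes row-wise shares directly. This is a minimal sketch on hypothetical data, not the notebook's dataset; the `toy` frame and course labels are made up.

```python
import pandas as pd

# Hypothetical data: course A places 2 of 4 students, course B places 3 of 4.
toy = pd.DataFrame({
    "Course": ["A"] * 4 + ["B"] * 4,
    "Placed": ["Yes", "Yes", "No", "No", "Yes", "Yes", "Yes", "No"],
})

# normalize="index" divides each row by its row total, giving shares per course.
rate = pd.crosstab(toy["Course"], toy["Placed"], normalize="index") * 100

print(rate)
```

The `Yes` column of `rate` plays the same role as `Placement_Rate_%` in the groupby version.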


### 6️⃣ (Optional Challenge) Find high-performing students not placed
Find students with `Content_Score_Avg` above 80 and `Communication_Skill_(10)` above 8 who are **not placed**.

In [25]:
# Combine three boolean conditions with &; each condition needs its own parentheses
df[(df['Content_Score_Avg']>80) & (df['Communication_Skill_(10)']>8) & (df['Placed']=='No')]

| Unnamed: 0 | Student_ID | Student_Name | Course | Assignment_Completion_% | Content_Score_Avg | Communication_Skill_(10) | Batch_Start_Date | Placed | CTC_LPA |
|---|---|---|---|---|---|---|---|---|---|
| 48 | STU049 | Arjun Kumar | Python DA | 53 | 83 | 10 | 2024-10-22 | No | |
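
With three conditions, naming each boolean mask can make the filter easier to read and reuse. The sketch below uses hypothetical data that includes boundary values (exactly 80 and exactly 8), which shows that the strict `>` comparisons exclude them; the `toy` frame and mask names are illustrative.

```python
import pandas as pd

# Hypothetical data, including rows that sit exactly on the thresholds.
toy = pd.DataFrame({
    "Student_ID": ["S1", "S2", "S3", "S4"],
    "Content_Score_Avg": [83, 80, 90, 85],
    "Communication_Skill_(10)": [10, 9, 8, 9],
    "Placed": ["No", "No", "No", "Yes"],
})

# One named mask per condition.
high_content = toy["Content_Score_Avg"] > 80
strong_comm = toy["Communication_Skill_(10)"] > 8
not_placed = toy["Placed"] == "No"

# Combine the masks and select only the columns of interest.
result = toy.loc[high_content & strong_comm & not_placed,
                 ["Student_ID", "Content_Score_Avg", "Communication_Skill_(10)"]]

print(result)
```

Only `S1` passes all three masks: `S2` and `S3` sit exactly on a threshold, and `S4` is placed.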
