
***Career Analysis with Pandas***

**📝 Problem Statement**

You are analyzing different career options to compare their salary, education requirements, work-life balance, and growth outlook. The goal is to organize the data in a pandas DataFrame, perform operations on it, and extract insights that support decision-making.

**Tasks:**

Create DataFrame → 10 careers, 5+ columns.

Add Column → Salary_per_EduYear_$ = Average_Salary_$ / Education_Years.

Sort → By Average_Salary_$ (descending).

Filter → Careers with Growth_Outlook_% > 15 and Work_Life_Balance_10 >= 7.

In [1]:
import pandas as pd

In [6]:
# Step 1: Create Career DataFrame
data = {
    "Career": [
        "Data Scientist", "Software Engineer", "Doctor", "Teacher", "Lawyer",
        "Mechanical Engineer", "Graphic Designer", "Civil Engineer",
        "Nurse", "Digital Marketer"
    ],
    "Average_Salary_$": [
        120000, 100000, 150000, 55000, 130000,
        85000, 60000, 75000, 70000, 65000
    ],
    "Education_Years": [6, 4, 8, 4, 7, 4, 3, 4, 4, 3],
    "Work_Life_Balance_10": [7, 6, 5, 8, 6, 6, 7, 6, 7, 8],
    "Growth_Outlook_%": [35, 25, 12, 5, 8, 10, 15, 9, 12, 20]
}

In [3]:
df = pd.DataFrame(data)


In [5]:
print("📊 Original Career Analysis DataFrame:")
print(df)

📊 Original Career Analysis DataFrame:
                Career  Average_Salary_$  Education_Years  \
0       Data Scientist            120000                6   
1    Software Engineer            100000                4   
2               Doctor            150000                8   
3              Teacher             55000                4   
4               Lawyer            130000                7   
5  Mechanical Engineer             85000                4   
6     Graphic Designer             60000                3   
7       Civil Engineer             75000                4   
8                Nurse             70000                4   
9     Digital Marketer             65000                3   

   Work_Life_Balance_10  Growth_Outlook_%  
0                     7                35  
1                     6                25  
2                     5                12  
3                     8                 5  
4                     6                 8  
5                     6                10  
6                     7                15  
7                     6                 9  
8                     7                12  
9                     8                20  
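
The printed frame above is split into two blocks (the trailing `\` marks the break) because it is wider than the display. A minimal sketch of one way around this, using only the first two rows of the career data for brevity: `DataFrame.to_string()` returns the frame as text with every column on a single line per row.

```python
import pandas as pd

# Two rows copied from the career data above, kept short for illustration.
sample = pd.DataFrame({
    "Career": ["Data Scientist", "Software Engineer"],
    "Average_Salary_$": [120000, 100000],
    "Education_Years": [6, 4],
    "Work_Life_Balance_10": [7, 6],
    "Growth_Outlook_%": [35, 25],
})

# to_string() renders the whole frame as text without wrapping columns.
text = sample.to_string(index=False)
print(text)
```

Passing `index=False` also drops the row labels, which is optional here.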

In [7]:
# Step 2: Add new column → Salary per Education Year
df["Salary_per_EduYear_$"] = (df["Average_Salary_$"] / df["Education_Years"]).round(2)
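
Every career in this dataset has at least three education years, so the division is safe as written. If a row ever had `Education_Years` equal to 0, the result would be `inf`. The sketch below shows one possible guard; the "Self-Taught Developer" row is hypothetical and is not part of the original data.

```python
import pandas as pd

# Hypothetical data: the second row has 0 education years (not in the original).
sample = pd.DataFrame({
    "Career": ["Data Scientist", "Self-Taught Developer"],
    "Average_Salary_$": [120000, 90000],
    "Education_Years": [6, 0],
})

# Keep only positive year counts; other rows become NaN, so the
# division yields NaN there instead of inf.
years = sample["Education_Years"].where(sample["Education_Years"] > 0)
sample["Salary_per_EduYear_$"] = (sample["Average_Salary_$"] / years).round(2)
print(sample)
```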


In [8]:
# Step 3: Sort careers by Average Salary
df_sorted = df.sort_values(by="Average_Salary_$", ascending=False)
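
When only the top few salaries are needed, `DataFrame.nlargest` can stand in for a full sort followed by `head`. A small sketch using the career names and salaries from the data above; the choice of three rows is arbitrary.

```python
import pandas as pd

# Career names and salaries copied from the data above.
salaries = pd.DataFrame({
    "Career": [
        "Data Scientist", "Software Engineer", "Doctor", "Teacher", "Lawyer",
        "Mechanical Engineer", "Graphic Designer", "Civil Engineer",
        "Nurse", "Digital Marketer",
    ],
    "Average_Salary_$": [
        120000, 100000, 150000, 55000, 130000,
        85000, 60000, 75000, 70000, 65000,
    ],
})

# Select the three rows with the highest salary, highest first.
top3 = salaries.nlargest(3, "Average_Salary_$")
print(top3)
```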


In [9]:
# Step 4: Filter → Careers with Growth Outlook > 15% and Work-Life Balance >= 7
filtered_df = df[(df["Growth_Outlook_%"] > 15) & (df["Work_Life_Balance_10"] >= 7)]
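
The filter combines two boolean Series with `&`, and each comparison needs its own parentheses because `&` binds more tightly than `>` and `>=`. One way to make this easier to read is to name each condition first, as sketched below on three rows taken from the data above. Note that Graphic Designer (growth exactly 15) is excluded because the growth comparison is strict.

```python
import pandas as pd

# Three rows copied from the career data above.
sample = pd.DataFrame({
    "Career": ["Data Scientist", "Graphic Designer", "Digital Marketer"],
    "Work_Life_Balance_10": [7, 7, 8],
    "Growth_Outlook_%": [35, 15, 20],
})

# Name each condition, then combine the masks element-wise with &.
high_growth = sample["Growth_Outlook_%"] > 15       # strict: 15 does not pass
good_balance = sample["Work_Life_Balance_10"] >= 7  # inclusive: 7 passes
filtered = sample.loc[high_growth & good_balance]
print(filtered)
```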


In [14]:
print("\n DataFrame with New Column (Salary per Edu Year):\n")
print(df)


 DataFrame with New Column (Salary per Edu Year):

                Career  Average_Salary_$  Education_Years  \
0       Data Scientist            120000                6   
1    Software Engineer            100000                4   
2               Doctor            150000                8   
3              Teacher             55000                4   
4               Lawyer            130000                7   
5  Mechanical Engineer             85000                4   
6     Graphic Designer             60000                3   
7       Civil Engineer             75000                4   
8                Nurse             70000                4   
9     Digital Marketer             65000                3   

   Work_Life_Balance_10  Growth_Outlook_%  Salary_per_EduYear_$  
0                     7                35              20000.00  
1                     6                25              25000.00  
2                     5                12              18750.00  
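3                     8                 5              13750.00  
4                     6                 8              18571.43  
5                     6                10              21250.00  
6                     7                15              20000.00  
7                     6                 9              18750.00  
8                     7                12              17500.00  
9                     8                20              21666.67  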

In [13]:
print("\n Sorted by Average Salary (Top First):\n")
print(df_sorted[["Career", "Average_Salary_$"]])



 Sorted by Average Salary (Top First):

                Career  Average_Salary_$
2               Doctor            150000
4               Lawyer            130000
0       Data Scientist            120000
1    Software Engineer            100000
5  Mechanical Engineer             85000
7       Civil Engineer             75000
8                Nurse             70000
9     Digital Marketer             65000
6     Graphic Designer             60000
3              Teacher             55000


In [12]:
print("\n Filter: Careers with High Growth (>15%) and Good Work-Life Balance (>=7):\n")
print(filtered_df[["Career", "Growth_Outlook_%", "Work_Life_Balance_10"]])



 Filter: Careers with High Growth (>15%) and Good Work-Life Balance (>=7):

             Career  Growth_Outlook_%  Work_Life_Balance_10
0    Data Scientist                35                     7
9  Digital Marketer                20                     8
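
The notebook runs the steps in separate cells. As an alternative sketch, not the approach used above, the same add-column, sort, and filter steps can be chained into one expression. The names `careers` and `shortlist` are illustrative. The new column is passed to `assign` through a dict because its name contains `$` and cannot be written as a keyword argument.

```python
import pandas as pd

# Same data as the notebook's Step 1.
careers = pd.DataFrame({
    "Career": [
        "Data Scientist", "Software Engineer", "Doctor", "Teacher", "Lawyer",
        "Mechanical Engineer", "Graphic Designer", "Civil Engineer",
        "Nurse", "Digital Marketer",
    ],
    "Average_Salary_$": [
        120000, 100000, 150000, 55000, 130000,
        85000, 60000, 75000, 70000, 65000,
    ],
    "Education_Years": [6, 4, 8, 4, 7, 4, 3, 4, 4, 3],
    "Work_Life_Balance_10": [7, 6, 5, 8, 6, 6, 7, 6, 7, 8],
    "Growth_Outlook_%": [35, 25, 12, 5, 8, 10, 15, 9, 12, 20],
})

shortlist = (
    careers
    # Step 2: salary earned per year of education.
    .assign(**{
        "Salary_per_EduYear_$":
            lambda d: (d["Average_Salary_$"] / d["Education_Years"]).round(2)
    })
    # Step 3: highest salary first.
    .sort_values(by="Average_Salary_$", ascending=False)
    # Step 4: growth above 15% and work-life balance of at least 7.
    .loc[lambda d: (d["Growth_Outlook_%"] > 15) & (d["Work_Life_Balance_10"] >= 7)]
)
print(shortlist[["Career", "Average_Salary_$", "Salary_per_EduYear_$"]])
```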
