**English:** Import pandas and create the initial DataFrame with employee data.

In [6]:
import pandas as pd

data = {
    'Department': ['IT', 'HR', 'IT', 'Finance', 'HR', 'Finance', 'IT', 'HR'],
    'Employee': ['Amit', 'Neha', 'Raj', 'Rina', 'Arjun', 'Tina', 'Karan', 'Divya'],
    'Salary': [70000, 50000, 65000, 80000, 48000, 85000, 72000, 51000],
    'Experience': [3, 2, 5, 7, 3, 10, 6, 4]
}

df = pd.DataFrame(data)


**English:** Display the DataFrame.

In [7]:
df

**English:** Group the data by 'Department' and calculate the mean 'Salary' for each group. The grouped mean is a Series indexed by department; `.reset_index(name='Avg_salary')` moves that index into a regular column and names the value column, which gives a DataFrame.

In [8]:
df.groupby('Department')['Salary'].mean().reset_index(name='Avg_salary')
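
**English:** As a minimal sketch of what the cell above does, the snippet below splits the chain into its two steps. It rebuilds a two-column copy of the dataset so that it runs on its own; the names `avg_series` and `avg_df` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Two columns of the first dataset, repeated so the snippet is self-contained.
df = pd.DataFrame({
    'Department': ['IT', 'HR', 'IT', 'Finance', 'HR', 'Finance', 'IT', 'HR'],
    'Salary': [70000, 50000, 65000, 80000, 48000, 85000, 72000, 51000],
})

# Step 1: the grouped mean is a Series whose index holds the department names.
avg_series = df.groupby('Department')['Salary'].mean()

# Step 2: reset_index(name=...) moves the index into a column and names the values.
avg_df = avg_series.reset_index(name='Avg_salary')

print(type(avg_series).__name__)  # Series
print(type(avg_df).__name__)      # DataFrame
print(list(avg_df.columns))       # ['Department', 'Avg_salary']
```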

**English:** Group by 'Department' and use `.agg()` to calculate the minimum, maximum, and average salary for each department. Each keyword names an output column and maps it to a `(column, function)` pair; `.round()` then rounds the results to whole numbers.

In [9]:
df.groupby('Department').agg(
    min_salary=('Salary', 'min'),
    max_salary=('Salary', 'max'),
    average_salary=('Salary', 'mean')
).round()
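
**English:** The `(column, function)` tuples used above are shorthand for `pd.NamedAgg`. The sketch below writes the same aggregation both ways on a two-column copy of the dataset; the variable names `with_tuples` and `with_named_agg` are illustrative only.

```python
import pandas as pd

# Two columns of the first dataset, repeated so the snippet is self-contained.
df = pd.DataFrame({
    'Department': ['IT', 'HR', 'IT', 'Finance', 'HR', 'Finance', 'IT', 'HR'],
    'Salary': [70000, 50000, 65000, 80000, 48000, 85000, 72000, 51000],
})

# Tuple form: output_name=(source column, aggregation function)
with_tuples = df.groupby('Department').agg(
    min_salary=('Salary', 'min'),
    max_salary=('Salary', 'max'),
    average_salary=('Salary', 'mean'),
)

# Explicit form of the same named aggregation
with_named_agg = df.groupby('Department').agg(
    min_salary=pd.NamedAgg(column='Salary', aggfunc='min'),
    max_salary=pd.NamedAgg(column='Salary', aggfunc='max'),
    average_salary=pd.NamedAgg(column='Salary', aggfunc='mean'),
)

print(with_tuples.equals(with_named_agg))  # True
```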

**English:** Find the high-paying departments in two steps. First, group by 'Department' and take the mean salary, which produces a Series indexed by department. Then apply a boolean condition (`> 60000`) to that Series to keep only the departments above the threshold. The `.index` of the filtered Series holds the department names we need.

In [-1]:
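# Mean salary per department (a Series indexed by department name)
avg_salary = df.groupby('Department')['Salary'].mean()
# Keep only the departments above the threshold and take their names
high_salary = avg_salary[avg_salary > 60000].index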

**English:** Filter the original DataFrame using `.isin()` to select only the rows whose 'Department' appears in the `high_salary` index.

In [-1]:
df[df['Department'].isin(high_salary)]

**English:** A more direct way to achieve the same result as the cells above is `.filter()` on the grouped object. It calls a function once per group (here, checking whether the group's mean salary is above 60000) and returns all rows of the groups for which the function returns `True`.

In [-1]:
df.groupby('Department').filter(lambda g: g['Salary'].mean() > 60000)
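
**English:** To check that the two approaches agree, the sketch below runs both on a copy of the first dataset and compares the results. The names `via_isin` and `via_filter` are illustrative and do not appear in the original notebook.

```python
import pandas as pd

# Three columns of the first dataset, repeated so the snippet is self-contained.
df = pd.DataFrame({
    'Department': ['IT', 'HR', 'IT', 'Finance', 'HR', 'Finance', 'IT', 'HR'],
    'Employee': ['Amit', 'Neha', 'Raj', 'Rina', 'Arjun', 'Tina', 'Karan', 'Divya'],
    'Salary': [70000, 50000, 65000, 80000, 48000, 85000, 72000, 51000],
})

# Approach 1: compute the qualifying department names, then select rows with isin()
avg_salary = df.groupby('Department')['Salary'].mean()
high_salary = avg_salary[avg_salary > 60000].index
via_isin = df[df['Department'].isin(high_salary)]

# Approach 2: let groupby().filter() keep or drop whole groups
via_filter = df.groupby('Department').filter(lambda g: g['Salary'].mean() > 60000)

print(list(high_salary))            # ['Finance', 'IT']
print(via_isin.equals(via_filter))  # True
```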

**English:** Use `.query()` for a readable way to filter aggregated results by column name. Here we keep the departments whose average experience is greater than 5.

In [-1]:
# Filter on the exact averages first, then round for display
df.groupby('Department').agg(
    max_experience=('Experience', 'max'),
    min_experience=('Experience', 'min'),
    avg_experience=('Experience', 'mean')
).query('avg_experience > 5').round()
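
**English:** The order of `.round()` and `.query()` can change which rows pass the filter. The sketch below illustrates this with a hypothetical `>= 5` threshold (the cell above uses `> 5`, for which both orders give the same answer on this data). The names `rounded_first` and `filtered_first` are illustrative only.

```python
import pandas as pd

# Two columns of the first dataset, repeated so the snippet is self-contained.
df = pd.DataFrame({
    'Department': ['IT', 'HR', 'IT', 'Finance', 'HR', 'Finance', 'IT', 'HR'],
    'Experience': [3, 2, 5, 7, 3, 10, 6, 4],
})

agg = df.groupby('Department').agg(avg_experience=('Experience', 'mean'))

# Rounding first turns IT's average of about 4.67 into 5.0, so IT passes the test.
rounded_first = agg.round().query('avg_experience >= 5')

# Filtering first compares the exact averages, so IT is excluded.
filtered_first = agg.query('avg_experience >= 5').round()

print(list(rounded_first.index))   # ['Finance', 'IT']
print(list(filtered_first.index))  # ['Finance']
```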

**English:** Redefine the DataFrame with a new dataset that includes 'Performance_Score' for more complex analysis.

In [-1]:
import pandas as pd

data = {
    'Department': ['IT', 'HR', 'Finance', 'IT', 'HR', 'Finance', 'IT', 'HR', 'Finance', 'IT'],
    'Employee': ['Amit', 'Neha', 'Rina', 'Raj', 'Arjun', 'Tina', 'Karan', 'Divya', 'Sonia', 'Rohit'],
    'Salary': [70000, 50000, 80000, 65000, 48000, 85000, 72000, 51000, 90000, 68000],
    'Experience': [3, 2, 7, 5, 3, 10, 6, 4, 12, 5],
    'Performance_Score': [88, 92, 79, 85, 95, 90, 87, 93, 78, 86]
}

df = pd.DataFrame(data)


**English:** Display the new DataFrame.

In [-1]:
df

**English:** Filter for employees in departments where the average performance score is above 85. The `.loc[lambda ...]` call filters the per-department means in the middle of the chain, and `.index` passes the surviving department names to `.isin()`.

In [-1]:
df[df['Department'].isin(
    df.groupby('Department')['Performance_Score'].mean().loc[lambda s: s > 85].index
)]
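
**English:** An alternative sketch of the same filter uses `.transform('mean')`, which broadcasts each department's average back onto its rows so the comparison yields a row-level boolean mask directly. This is a swapped-in technique, not the notebook's own method; the names `dept_avg` and `result` are illustrative.

```python
import pandas as pd

# Three columns of the second dataset, repeated so the snippet is self-contained.
df = pd.DataFrame({
    'Department': ['IT', 'HR', 'Finance', 'IT', 'HR', 'Finance', 'IT', 'HR', 'Finance', 'IT'],
    'Employee': ['Amit', 'Neha', 'Rina', 'Raj', 'Arjun', 'Tina', 'Karan', 'Divya', 'Sonia', 'Rohit'],
    'Performance_Score': [88, 92, 79, 85, 95, 90, 87, 93, 78, 86],
})

# One value per row: the mean score of that row's department
dept_avg = df.groupby('Department')['Performance_Score'].transform('mean')

# Keep the rows whose department average is above 85
result = df[dept_avg > 85]

print(sorted(result['Department'].unique()))  # ['HR', 'IT']
```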

**English:** This cell combines multiple conditions for filtering. It looks for departments where the average salary is above 65000, the average performance score is above 85, and the average experience is above 5. No department meets all three conditions (IT falls short on experience, HR on salary and experience, Finance on performance), so the result is an empty DataFrame.

In [-1]:
df[df['Department'].isin(
    df.groupby('Department').agg(
        avg_salary=('Salary', 'mean'),
        avg_exp=('Experience', 'mean'),
        avg_per=('Performance_Score', 'mean')
    ).loc[lambda d: (d['avg_salary'] > 65000) & (d['avg_per'] > 85) & (d['avg_exp'] > 5)].index
)]
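
**English:** To see why the result is empty, it can help to evaluate each condition separately. The sketch below builds a small table of per-department checks on a copy of the second dataset; the names `summary` and `checks` and the `*_ok` column names are illustrative only.

```python
import pandas as pd

# Numeric columns of the second dataset, repeated so the snippet is self-contained.
df = pd.DataFrame({
    'Department': ['IT', 'HR', 'Finance', 'IT', 'HR', 'Finance', 'IT', 'HR', 'Finance', 'IT'],
    'Salary': [70000, 50000, 80000, 65000, 48000, 85000, 72000, 51000, 90000, 68000],
    'Experience': [3, 2, 7, 5, 3, 10, 6, 4, 12, 5],
    'Performance_Score': [88, 92, 79, 85, 95, 90, 87, 93, 78, 86],
})

summary = df.groupby('Department').agg(
    avg_salary=('Salary', 'mean'),
    avg_exp=('Experience', 'mean'),
    avg_per=('Performance_Score', 'mean'),
)

# One boolean column per condition, plus a column that requires all three
checks = pd.DataFrame({
    'salary_ok': summary['avg_salary'] > 65000,
    'perf_ok': summary['avg_per'] > 85,
    'exp_ok': summary['avg_exp'] > 5,
})
checks['all_ok'] = checks.all(axis=1)

print(checks)
print(checks['all_ok'].any())  # False
```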