**English:** Import pandas for data manipulation and numpy for numerical operations.
**Hindi:** Pandas ko data manipulation ke liye aur numpy ko numerical operations ke liye import karein.

In [161]:
import pandas as pd
import numpy as np

**English:** Create a dictionary containing employee data with some missing values, then convert it into a pandas DataFrame.
**Hindi:** Kuchh missing values ke saath employee data wala ek dictionary banayein, fir use pandas DataFrame mein convert karein.

In [162]:
data = {
    'Name': ['Amit', 'Priya', 'Rohan', 'Sneha', np.nan, 'Tina', 'Arjun', 'Neha'],
    'Department': ['HR', 'IT', 'Finance', np.nan, 'Finance', 'HR', 'IT', 'Finance'],
    'Age': [25, np.nan, 35, 28, 32, 41, 30, np.nan],
    'Salary': [35000, 60000, np.nan, 58000, 72000, 40000, 62000, 67000],
    'City': ['Pune', 'Mumbai', 'Delhi', 'Pune', 'Chennai', np.nan, 'Delhi', 'Pune']
}

df = pd.DataFrame(data)


**English:** Count the number of missing (NaN) values in each column of the DataFrame.
**Hindi:** DataFrame ke har column mein missing (NaN) values ki sankhya ginein.

In [163]:
df.isna().sum()
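
As a small aside, the same boolean mask can also give the share of missing values per column, which is often easier to compare than raw counts. The sketch below uses a hypothetical `sample` frame rather than the notebook's `df`.

```python
import numpy as np
import pandas as pd

# Hypothetical illustrative frame, not the notebook's data.
sample = pd.DataFrame({
    'Age': [25, np.nan, 35, np.nan],
    'City': ['Pune', 'Mumbai', None, 'Delhi'],
})

missing_counts = sample.isna().sum()         # number of NaN values per column
missing_share = sample.isna().mean() * 100   # percentage of rows missing per column
print(missing_counts)
print(missing_share)
```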

**English:** Display the current state of the DataFrame.
**Hindi:** DataFrame ki vartaman sthiti dikhayein.

In [164]:
df

**English:** Fill missing 'Age' values with the mean of the existing ages.
**Hindi:** Missing 'Age' values ko maujooda umar ke mean se bharein.

In [165]:
df['Age'] = df['Age'].fillna(df['Age'].mean())

**English:** Convert the 'Age' column to integer type.
**Hindi:** 'Age' column ko integer type mein convert karein.

In [166]:
df['Age'] = df['Age'].astype(int)
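
One detail worth checking here: `astype(int)` drops the fractional part instead of rounding it. The sketch below reuses the 'Age' values from the dictionary above in a standalone Series (the names `ages`, `truncated` and `rounded` are illustrative) to compare the two behaviours.

```python
import numpy as np
import pandas as pd

# Same values as the 'Age' column defined earlier.
ages = pd.Series([25, np.nan, 35, 28, 32, 41, 30, np.nan])
filled = ages.fillna(ages.mean())      # mean of the six known ages is 191 / 6, about 31.83

truncated = filled.astype(int)         # drops the fraction: 31.83 becomes 31
rounded = filled.round().astype(int)   # rounds first: 31.83 becomes 32
print(truncated[1], rounded[1])
```

Whether 31 or 32 is the better fill value depends on the analysis; the point is only that the two calls differ.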

**English:** Fill missing 'Department' values with 'unknown' and missing 'City' values with 'not mentioned'.
**Hindi:** Missing 'Department' values ko 'unknown' aur missing 'City' values ko 'not mentioned' se bharein.

In [168]:
df['Department'] = df['Department'].fillna('unknown')

In [169]:
df['City'] = df['City'].fillna('not mentioned')
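
The two fills can also be written as a single call by passing `fillna` a dictionary that maps column names to fill values. This is a minimal sketch on a hypothetical `sample` frame; columns that are not listed in the dictionary are left as they are.

```python
import numpy as np
import pandas as pd

# Hypothetical illustrative frame, not the notebook's data.
sample = pd.DataFrame({
    'Department': ['HR', np.nan, 'IT'],
    'City': ['Pune', 'Delhi', np.nan],
    'Salary': [35000, np.nan, 62000],
})

# One default per column; 'Salary' is not listed, so its NaN stays in place.
sample = sample.fillna({'Department': 'unknown', 'City': 'not mentioned'})
print(sample)
```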

**English:** Display the DataFrame after filling some of the missing values.
**Hindi:** Kuchh missing values bharne ke baad DataFrame dikhayein.

In [170]:
df

**English:** Identify and display the rows that still contain any missing values.
**Hindi:** Un rows ko pehchanein aur dikhayein jinmein abhi bhi koi missing value hai.

In [171]:
rows_with_missing = df[df.isnull().any(axis=1)]

In [172]:
rows_with_missing

**English:** For each row with missing values, print the index and the name of the columns that are missing.
**Hindi:** Missing values wali har row ke liye, index aur un columns ke naam print karein jo missing hain.

In [173]:
for idx in rows_with_missing.index:
    missing_cols = df.loc[idx].isnull()
    missing_col_names = missing_cols[missing_cols].index.tolist()
    print(f'Row {idx}: Missing in columns {missing_col_names}')
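
The same report can be built from a single boolean mask and collected into a dictionary instead of being printed. The sketch below assumes a hypothetical `sample` frame; `missing_by_row` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Hypothetical illustrative frame, not the notebook's data.
sample = pd.DataFrame({
    'Name': ['Amit', np.nan, 'Rohan'],
    'Salary': [35000, 72000, np.nan],
    'City': ['Pune', np.nan, 'Delhi'],
})

mask = sample.isna()
rows_with_nan = mask.index[mask.any(axis=1)]   # labels of rows with at least one NaN

# For each such row, keep the column names where the mask is True.
missing_by_row = {
    idx: mask.columns[mask.loc[idx].to_numpy()].tolist()
    for idx in rows_with_nan
}
print(missing_by_row)
```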

**English:** Group data by 'Department' and calculate the mean salary for each.
**Hindi:** Data ko 'Department' ke anusaar group karein aur har ek ke liye mean salary calculate karein.

In [176]:
df.groupby('Department')['Salary'].mean().round()
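
At this point 'Salary' still contains a missing value. By default `mean()` skips NaN, so a department's average is taken over its known salaries only. A minimal sketch on a hypothetical `sample` frame:

```python
import numpy as np
import pandas as pd

# Hypothetical illustrative frame, not the notebook's data.
sample = pd.DataFrame({
    'Department': ['HR', 'HR', 'IT'],
    'Salary': [35000, np.nan, 60000],
})

# The NaN in HR is ignored, so HR's mean is based on one value.
mean_salary = sample.groupby('Department')['Salary'].mean()
print(mean_salary)
```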

**English:** Sort the DataFrame by 'Salary' in descending order.
**Hindi:** DataFrame ko 'Salary' ke adhaar par descending order mein sort karein.

In [177]:
df.sort_values(by='Salary',ascending=False)

**English:** Remove rows where the 'Salary' is missing.
**Hindi:** Un rows ko hatayein jahan 'Salary' missing hai.

In [178]:
df = df.dropna(subset=['Salary'])

**English:** Create a new 'category' column based on salary ranges (High, medium, low) using `np.select`.
**Hindi:** `np.select` ka upyog karke salary ranges (High, medium, low) ke aadhar par ek naya 'category' column banayein.

In [179]:
# Salaries of exactly 45000 count as 'low'; anything unmatched (e.g. NaN) falls to 'unknown'.
df['category'] = np.select(
    [df['Salary'] > 65000,
     (df['Salary'] > 45000) & (df['Salary'] <= 65000),
     df['Salary'] <= 45000],
    ['High', 'medium', 'low'],
    default='unknown')
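
For contiguous numeric ranges like these, `pd.cut` is a possible alternative to `np.select`. The sketch below is an illustration, not the notebook's method: it assumes right-closed bins, so a salary of exactly 45000 is treated as 'low' and exactly 65000 as 'medium'. The `salaries` Series is hypothetical.

```python
import pandas as pd

# Hypothetical salaries chosen to include the bin edges.
salaries = pd.Series([35000, 45000, 60000, 65000, 72000])

# Right-closed bins: (-inf, 45000] low, (45000, 65000] medium, (65000, inf) High.
categories = pd.cut(
    salaries,
    bins=[float('-inf'), 45000, 65000, float('inf')],
    labels=['low', 'medium', 'High'],
)
print(categories.tolist())
```

Unlike `np.select`, `pd.cut` returns a categorical column and leaves NaN inputs as NaN rather than assigning a default label.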

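**English:** Replace the missing name in the row labelled 4 with 'john'. Note: `iloc[3, 0]` is positional, not label-based; it reaches that row only because dropping the row with index 2 moved label 4 to position 3. A label-based locator such as `df.loc[4, 'Name']`, or a mask on `df['Name'].isna()`, would be more robust.
**Hindi:** Label 4 wali row mein missing naam ko 'john' se replace karein. Dhyaan dein: `iloc[3, 0]` positional hai, label-based nahi; yeh us row tak isliye pahunchta hai kyunki index 2 wali row drop hone se label 4 position 3 par aa gaya. `df.loc[4, 'Name']` jaisa label-based locator, ya `df['Name'].isna()` par mask, zyada majboot hoga.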

In [180]:
df.iloc[3,0] = 'john'
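
To illustrate locators that do not depend on row position, here is a minimal sketch on a hypothetical `sample` frame whose index has a gap, as a frame has after `dropna`. Both variants are assumptions about what a "more robust locator" could look like.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with non-contiguous labels, as left behind by dropna.
sample = pd.DataFrame(
    {'Name': ['Amit', 'Priya', np.nan], 'Salary': [35000, 60000, 72000]},
    index=[0, 1, 4],
)

# Label-based: targets the row labelled 4 wherever it sits.
by_label = sample.copy()
by_label.loc[4, 'Name'] = 'john'

# Mask-based: targets every row whose name is missing, whatever its label.
by_mask = sample.copy()
by_mask.loc[by_mask['Name'].isna(), 'Name'] = 'john'
print(by_mask)
```

The mask version fills every missing name, so it only fits when all of them should receive the same value.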

**English:** Display the DataFrame with the new 'category' and updated name.
**Hindi:** Nayi 'category' aur updated naam ke saath DataFrame dikhayein.

In [181]:
df

**English:** Count the occurrences of each salary category.
**Hindi:** Har salary category ki occurrences ginein.

In [182]:
df['category'].value_counts()

**English:** Group by 'Name' and 'category' and count the combinations.
**Hindi:** 'Name' aur 'category' ke anusaar group karein aur combinations ko ginein.

In [183]:
df.groupby('Name')['category'].value_counts()

**English:** Group by 'Department' and 'category' and find the mean salary for each group.
**Hindi:** 'Department' aur 'category' ke anusaar group karein aur har group ke liye mean salary pata karein.

In [185]:
df.groupby(['Department','category'])['Salary'].mean()

**English:** Find the name of the youngest employee.
**Hindi:** Sabse kam umar ke employee ka naam pata karein.

In [186]:
youngest_employee = df['Age'].idxmin()

In [187]:
name = df.loc[youngest_employee,'Name']
name
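
`idxmin()` returns only the first label at which the minimum occurs. If several employees could share the lowest age, a comparison against the minimum returns all of them. A minimal sketch on a hypothetical `sample` frame:

```python
import pandas as pd

# Hypothetical frame with a tie for the lowest age.
sample = pd.DataFrame({'Name': ['Amit', 'Sneha', 'Arjun'], 'Age': [25, 25, 30]})

# First match only.
first_youngest = sample.loc[sample['Age'].idxmin(), 'Name']

# Every row whose age equals the minimum.
all_youngest = sample.loc[sample['Age'] == sample['Age'].min(), 'Name'].tolist()
print(first_youngest, all_youngest)
```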

**English:** Create a new 'Name_length' column with the length of each name.
**Hindi:** Har naam ki lambai ke saath ek naya 'Name_length' column banayein.

In [190]:
df['Name_length'] = df['Name'].str.len()

**English:** Perform multiple aggregations on the 'Department' groups to get the total number of employees, average salary, and minimum age.
**Hindi:** 'Department' groups par multiple aggregations perform karein taaki kul employees, average salary, aur minimum age prapt ho sake.

In [192]:
df.groupby('Department').agg(total_employees = ('Name','count'),
                                avg_salary = ('Salary','mean'),
                                min_age = ('Age','min'))
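
Note that `'count'` counts only non-null entries, so a group's total is understated if 'Name' has gaps, while `'size'` counts every row. The sketch below shows the difference on a hypothetical `sample` frame; the output column names are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing name in the HR group.
sample = pd.DataFrame({
    'Department': ['HR', 'HR', 'IT'],
    'Name': ['Amit', np.nan, 'Priya'],
})

summary = sample.groupby('Department').agg(
    non_null_names=('Name', 'count'),   # skips the NaN
    rows=('Name', 'size'),              # counts every row in the group
)
print(summary)
```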

**English:** Convert all city names to uppercase.
**Hindi:** Sabhi shahar ke naamo ko uppercase mein convert karein.

In [193]:
df['City'] = df['City'].str.upper()

**English:** Display the final DataFrame.
**Hindi:** Final DataFrame dikhayein.

In [194]:
df

**English:** Save a subset of the DataFrame (Name, Salary, Age, category) to a new CSV file named 'new_csv_sachin' without the index.
**Hindi:** DataFrame ka ek subset (Name, Salary, Age, category) 'new_csv_sachin' naam ki ek nayi CSV file mein bina index ke save karein.

In [198]:
df[['Name','Salary','Age','category']].to_csv('new_csv_sachin',index = False)
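
A quick way to check an export is to read it back. The sketch below uses an in-memory `io.StringIO` buffer in place of a file on disk, and a hypothetical `sample` frame with the same four columns, to confirm that `index=False` leaves no extra index column.

```python
import io
import pandas as pd

# Hypothetical frame with the same four columns as the export above.
sample = pd.DataFrame({
    'Name': ['Amit', 'Priya'],
    'Salary': [35000.0, 60000.0],
    'Age': [25, 31],
    'category': ['low', 'medium'],
})

buffer = io.StringIO()               # stands in for a file path
sample.to_csv(buffer, index=False)   # index=False omits the row labels
buffer.seek(0)

restored = pd.read_csv(buffer)
print(restored.columns.tolist())
```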