**1. DataFrame Creation:**

*Create a pandas DataFrame from a dictionary of lists, where each list represents a column.*

In [18]:
import pandas as pd

# Creating a dictionary of lists
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}

# Creating a DataFrame
df = pd.DataFrame(data)
print("DataFrame:\n", df)


DataFrame:
       Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston
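
As a quick check of what the constructor did with the dictionary, the sketch below rebuilds the same `data` and inspects the result. It also shows that lists of different lengths are rejected. The short two-name dictionary in the `try` block is a made-up example, not part of the exercise.

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
}
df = pd.DataFrame(data)

# Dictionary keys become column labels; each list supplies one column's values.
print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))

# Lists of unequal length cannot be aligned into columns, so pandas raises.
# (Hypothetical input used only to illustrate the failure.)
try:
    pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25]})
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```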


**2. DataFrame Operations:**

*Add a new column to an existing DataFrame. Perform element-wise operations between two columns.*

In [5]:
# Adding a new column 'Salary' to the DataFrame
df['Salary'] = [50000, 60000, 70000, 80000]

# Element-wise operation: Increase each salary by 10%
df['Increased_Salary'] = df['Salary'] * 1.1

print("\nDataFrame with new columns:\n", df)



DataFrame with new columns:
       Name  Age         City  Salary  Increased_Salary
0    Alice   25     New York   50000           55000.0
1      Bob   30  Los Angeles   60000           66000.0
2  Charlie   35      Chicago   70000           77000.0
3    David   40      Houston   80000           88000.0
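
The cell above multiplies one column by a scalar. An operation between two columns works the same way: pandas matches rows by index label and applies the operator to each pair. The following is a minimal sketch on a trimmed copy of the data; the column name `Raise` is an assumption introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Salary': [50000, 60000, 70000, 80000],
})
df['Increased_Salary'] = df['Salary'] * 1.1

# Element-wise subtraction between two columns; rows are paired by index label.
# 'Raise' is a hypothetical column name, not part of the original exercise.
df['Raise'] = df['Increased_Salary'] - df['Salary']

# Round to two decimals to hide floating-point noise from the 1.1 factor.
df['Raise'] = df['Raise'].round(2)
print(df)
```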


**3. Data Selection:**

*Select rows based on a condition. Select specific columns from a DataFrame.*

In [8]:
# Selecting rows where Age is greater than 30
age_condition = df[df['Age'] > 30]
print("\nRows where Age > 30:\n", age_condition)

# Selecting specific columns: 'Name' and 'City'
selected_columns = df[['Name', 'City']]
print("\nSelected columns:\n", selected_columns)



Rows where Age > 30:
       Name  Age     City  Salary  Increased_Salary
2  Charlie   35  Chicago   70000           77000.0
3    David   40  Houston   80000           88000.0

Selected columns:
       Name         City
0    Alice     New York
1      Bob  Los Angeles
2  Charlie      Chicago
3    David      Houston
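
Row and column selection can also be done in one step with `.loc`, which takes a boolean mask for the rows and a list of labels for the columns. The sketch below assumes the same names, ages, and cities as above; the second filter (excluding Houston) is an arbitrary condition chosen only to show how masks combine.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# Rows where Age > 30, restricted to the 'Name' and 'City' columns.
older_than_30 = df.loc[df['Age'] > 30, ['Name', 'City']]
print(older_than_30)

# Combine conditions with & (and) or | (or); each condition needs parentheses.
subset = df.loc[(df['Age'] > 30) & (df['City'] != 'Houston'), ['Name', 'Age']]
print(subset)
```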


**4. Data Aggregation:**

*Group the data in a DataFrame based on a categorical column and calculate the mean of each group.*

In [11]:
# Creating a new categorical column 'Department'
df['Department'] = ['HR', 'Finance', 'IT', 'Finance']

# Grouping by 'Department' and calculating the mean of 'Salary'
grouped_mean = df.groupby('Department')['Salary'].mean()
print("\nMean salary by Department:\n", grouped_mean)



Mean salary by Department:
 Department
Finance    70000.0
HR         50000.0
IT         70000.0
Name: Salary, dtype: float64
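
If more than one statistic per group is needed, `agg` accepts a list of aggregation names and returns one column per statistic. This is a small sketch on just the `Department` and `Salary` columns; the choice of `mean`, `count`, and `max` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Department': ['HR', 'Finance', 'IT', 'Finance'],
    'Salary': [50000, 60000, 70000, 80000],
})

# One row per department, one column per aggregation.
summary = df.groupby('Department')['Salary'].agg(['mean', 'count', 'max'])
print(summary)
```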


**5. Data Cleaning:**

*Handle missing values by either removing or replacing them with appropriate values.*


In [16]:
# Introducing missing values into the DataFrame
df.loc[1, 'Age'] = None
df.loc[3, 'City'] = None
print("\nDataFrame with missing values:\n", df)

# Removing rows with missing values
df_cleaned = df.dropna()
print("\nDataFrame after removing missing values:\n", df_cleaned)

# Alternatively, replacing missing values with appropriate values
df_filled = df.fillna({
    'Age': df['Age'].mean(),  # Filling missing age with the mean age
    'City': 'Unknown'         # Filling missing city with 'Unknown'
})
print("\nDataFrame after filling missing values:\n", df_filled)



DataFrame with missing values:
       Name   Age         City  Salary  Increased_Salary Department
0    Alice  25.0     New York   50000           55000.0         HR
1      Bob   NaN  Los Angeles   60000           66000.0    Finance
2  Charlie  35.0      Chicago   70000           77000.0         IT
3    David  40.0         None   80000           88000.0    Finance

DataFrame after removing missing values:
       Name   Age      City  Salary  Increased_Salary Department
0    Alice  25.0  New York   50000           55000.0         HR
2  Charlie  35.0   Chicago   70000           77000.0         IT

DataFrame after filling missing values:
       Name        Age         City  Salary  Increased_Salary Department
0    Alice  25.000000     New York   50000           55000.0         HR
1      Bob  33.333333  Los Angeles   60000           66000.0    Finance
2  Charlie  35.000000      Chicago   70000           77000.0         IT
3    David  40.000000      Unknown   80000           88000.0    Finance
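
Before choosing between dropping and filling, it can help to count the missing values per column, and to drop rows only when a specific column is missing. The sketch below assumes a reduced version of the data with the same two gaps; the variable names `missing_per_column` and `df_age_known` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, None, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', None],
})

# Count missing entries in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Drop only the rows where 'Age' is missing; a missing 'City' is tolerated.
df_age_known = df.dropna(subset=['Age'])
print(df_age_known)
```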