## Pivoting

1. Create a DataFrame containing sales data for a store with columns: 'Date', 'Category', 'Sales'. Pivot the DataFrame to show the total sales for each category by date.
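
One possible approach to exercise 1 is sketched below. The sample rows are made up for illustration, and `pivot_table` is used rather than `pivot` on the assumption that a category can appear more than once on the same date, in which case plain `pivot` would raise an error on the duplicate entries.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
sales = pd.DataFrame({
    "Date": ["2024-01-01", "2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02"],
    "Category": ["Grocery", "Toys", "Grocery", "Grocery", "Toys"],
    "Sales": [100, 50, 25, 80, 40],
})

# One row per date, one column per category, cells hold the summed sales.
# fill_value=0 replaces missing (Date, Category) combinations with 0.
total_sales = sales.pivot_table(
    index="Date", columns="Category", values="Sales", aggfunc="sum", fill_value=0
)
print(total_sales)
```

The two 'Grocery' rows on the first date are combined into a single cell by `aggfunc="sum"`.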

2. Create a DataFrame with columns: 'Region', 'Product', 'Revenue'. Pivot the DataFrame to calculate the average revenue for each product in different regions.
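
A minimal sketch for exercise 2, assuming products as rows and regions as columns (the exercise does not fix the orientation). The sample figures are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
revenue = pd.DataFrame({
    "Region": ["North", "North", "South", "South", "North"],
    "Product": ["Laptop", "Phone", "Laptop", "Phone", "Laptop"],
    "Revenue": [1000, 600, 1200, 500, 1400],
})

# Average revenue for every (Product, Region) pair.
avg_revenue = revenue.pivot_table(
    index="Product", columns="Region", values="Revenue", aggfunc="mean"
)
print(avg_revenue)
```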

3. Using a given DataFrame, pivot the data to display the count of items sold based on 'Store' and 'Year'.
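
Exercise 3 does not say what the given DataFrame looks like, so the sketch below assumes one row per item sold with a hypothetical 'Item' column next to 'Store' and 'Year'. Under that assumption, counting rows per store and year could look like this:

```python
import pandas as pd

# Hypothetical stand-in for the "given" DataFrame: one row per item sold.
items = pd.DataFrame({
    "Store": ["S1", "S1", "S2", "S2", "S1"],
    "Year": [2022, 2022, 2022, 2022, 2023],
    "Item": ["Pen", "Book", "Pen", "Bag", "Bag"],
})

# Count the items in each (Store, Year) cell; combinations with no sales become 0.
item_counts = items.pivot_table(
    index="Store", columns="Year", values="Item", aggfunc="count", fill_value=0
)
print(item_counts)
```

If the real data stores a quantity per row instead, `aggfunc="sum"` on that quantity column would probably be the better fit.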

## Sorting & Aggregation

4. Create a DataFrame with columns: 'Name', 'Age', 'Score'. Sort the DataFrame by 'Score' in descending order.
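
A short sketch for exercise 4 with made-up rows. `sort_values` returns a new DataFrame and leaves the original order untouched.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
students = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen"],
    "Age": [20, 22, 21],
    "Score": [88, 95, 72],
})

# Highest score first.
by_score = students.sort_values("Score", ascending=False)
print(by_score)
```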

5. Create a DataFrame of students' marks across subjects. Use aggregation functions like `mean`, `sum`, and `max` to summarize marks by student names.
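
For exercise 5, one option is to keep the marks in long form (one row per student and subject) and pass several function names to `agg`. The layout and the sample marks are assumptions made for this sketch.

```python
import pandas as pd

# Hypothetical sample data in long form: one row per (student, subject).
marks = pd.DataFrame({
    "Name": ["Asha", "Asha", "Ben", "Ben"],
    "Subject": ["Math", "Science", "Math", "Science"],
    "Marks": [80, 90, 70, 60],
})

# One row per student with the mean, total, and best mark.
summary = marks.groupby("Name")["Marks"].agg(["mean", "sum", "max"])
print(summary)
```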

6. Given a DataFrame of employee details (Name, Department, Salary), sort the DataFrame by department, then by salary within each department.
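
Exercise 6 can be handled by passing a list of columns to `sort_values`. The sketch below uses invented employees and assumes ascending order for both keys, since the exercise does not state a direction.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
employees = pd.DataFrame({
    "Name": ["Ann", "Raj", "Lee", "Mia"],
    "Department": ["IT", "HR", "IT", "HR"],
    "Salary": [70000, 50000, 65000, 55000],
})

# Sort by department first, then by salary inside each department.
sorted_employees = employees.sort_values(["Department", "Salary"])
print(sorted_employees)
```

To list the highest salary first within each department, `ascending=[True, False]` can be passed alongside the column list.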

## Re-Indexing & Altering Labels

7. Create a DataFrame with default indexing. Re-index the DataFrame to have a custom index (e.g., 'A', 'B', 'C').
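
A sketch for exercise 7, using made-up values. It assumes the goal is to relabel the existing rows, which is done by assigning to `.index`. Calling `reindex` directly on the default index would instead look for rows already labelled 'A', 'B', 'C' and return NaN for all of them, so `reindex` is shown here only after the labels exist.

```python
import pandas as pd

# DataFrame with the default 0, 1, 2 index; values are invented.
original = pd.DataFrame({"Value": [10, 20, 30]})

# Relabel the rows: assign new labels to a copy so the original is unchanged.
relabelled = original.copy()
relabelled.index = ["A", "B", "C"]

# reindex() aligns on existing labels: it can reorder rows and adds
# NaN-filled rows for labels that are not present ('D' here).
reordered = relabelled.reindex(["C", "A", "D"])

print(relabelled)
print(reordered)
```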

8. Create a DataFrame and demonstrate the use of `reset_index()` to reset to default indices.
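
For exercise 8, the sketch below starts from a DataFrame with a custom index (invented data) and shows both variants of `reset_index()`: keeping the old labels as a column, and discarding them with `drop=True`.

```python
import pandas as pd

# Hypothetical DataFrame with a custom index.
indexed = pd.DataFrame(
    {"Item": ["Pen", "Book", "Bag"], "Price": [10, 50, 300]},
    index=["A", "B", "C"],
)

# Default behaviour: the old labels move into a new column named 'index'.
kept = indexed.reset_index()

# drop=True discards the old labels entirely.
dropped = indexed.reset_index(drop=True)

print(kept)
print(dropped)
```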

9. Given a DataFrame, alter the column labels to uppercase and the row indices to start from 100.
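
One way to approach exercise 9, with a small invented DataFrame standing in for the given one. Column labels are uppercased through the `.str` accessor on the column index, and the row index is replaced with a range starting at 100.

```python
import pandas as pd

# Hypothetical stand-in for the "given" DataFrame.
frame = pd.DataFrame({"name": ["Asha", "Ben", "Chen"], "score": [88, 95, 72]})

renamed = frame.copy()

# Uppercase every column label.
renamed.columns = renamed.columns.str.upper()

# Row labels 100, 101, 102, ... matching the number of rows.
renamed.index = range(100, 100 + len(renamed))

print(renamed)
```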

## `groupby()` & `transform()`

10. Create a DataFrame with columns: 'Department', 'Employee', 'Salary'. Group the data by 'Department' and calculate the average salary for each department.

In [10]:
import pandas as pd

data = {
    "Department": ["HR", "IT", "Finance", "Marketing"],
    "Employee": ["John Doe", "Jane Smith", "Emily Davis", "Michael Brown"],
    "Salary": [50000, 75000, 80000, 60000]
}

df = pd.DataFrame(data)
# print(df.to_string())

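# Sum per department. Summing the string 'Employee' column concatenates names,
# so select the numeric columns explicitly if only totals are wanted.
dep = df.groupby('Department')
print(dep.sum())
print("=======================")

# Average salary per department (each department has a single employee here,
# so the mean equals that employee's salary).
avg_salary = df.groupby('Department')['Salary'].mean()
print(avg_salary)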

                 Employee  Salary
Department                       
Finance       Emily Davis   80000
HR               John Doe   50000
IT             Jane Smith   75000
Marketing   Michael Brown   60000
=======================
Department
Finance      80000.0
HR           50000.0
IT           75000.0
Marketing    60000.0
Name: Salary, dtype: float64


11. Create a DataFrame with sales data (Store, Item, Sales). Group the data by 'Item' and use `transform()` to calculate the percentage contribution of each sale.

In [15]:
import pandas as pd

data = {
    'Store': ['Store A', 'Store B', 'Store A', 'Store C', 'Store B'],
    'Item': ['Apples', 'Bananas', 'Oranges', 'Grapes', 'Apples'],
    'Sales': [150, 200, 180, 220, 170]
}

df = pd.DataFrame(data)

# Each sale as a percentage of the total sales for its item
df['Sales_Percentage'] = df['Sales'] / df.groupby('Item')['Sales'].transform('sum') * 100
print(df['Sales_Percentage'])

0     46.875
1    100.000
2    100.000
3    100.000
4     53.125
Name: Sales_Percentage, dtype: float64


12. Given a DataFrame of student marks, group the data by 'Subject' and use `transform()` to normalize the marks within each subject.

In [38]:
import pandas as pd


student_marks = {
    'Student_ID': [101, 102, 103, 104, 105, 106],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva', 'Frank'],
    'Subject': ['Math', 'Gujarati', 'Science', 'Biology', 'English', 'Iks'],
    'Marks': [85, 78, 92, 88, 76, 81]
}

df = pd.DataFrame(student_marks)

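# Min-max scaling within each subject. Every subject here has a single student,
# so max == min and the division 0 / 0 yields NaN for every row.
df['Normalize_marks'] = df.groupby('Subject')['Marks'].transform(lambda x: (x - x.min()) / (x.max() - x.min()))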
print(df)

   Student_ID     Name   Subject  Marks  Normalize_marks
0         101    Alice      Math     85              NaN
1         102      Bob  Gujarati     78              NaN
2         103  Charlie   Science     92              NaN
3         104    David   Biology     88              NaN
4         105      Eva   English     76              NaN
5         106    Frank       Iks     81              NaN


In [37]:
import pandas as pd

# Sample student marks data
student_marks = {
    'Student_ID': [101, 102, 103, 104, 105, 106, 107, 108],
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva', 'Frank', 'Grace', 'Helen'],
    'Subject': ['Math', 'Math', 'Science', 'Science', 'English', 'English', 'Math', 'Science'],
    'Marks': [85, 78, 92, 88, 76, 81, 95, 89]
}

# Convert dictionary to DataFrame
df = pd.DataFrame(student_marks)

# Normalize Marks within each Subject (Min-Max Scaling)
df['Normalized_Marks'] = df.groupby('Subject')['Marks'].transform(lambda x: (x - x.min()) / (x.max() - x.min()))

# Display the updated DataFrame
print(df)


   Student_ID     Name  Subject  Marks  Normalized_Marks
0         101    Alice     Math     85          0.411765
1         102      Bob     Math     78          0.000000
2         103  Charlie  Science     92          1.000000
3         104    David  Science     88          0.000000
4         105      Eva  English     76          0.000000
5         106    Frank  English     81          1.000000
6         107    Grace     Math     95          1.000000
7         108    Helen  Science     89          0.250000
