# Pandas Practice: Scenario-Based Questions (Beginner to Advanced)

This notebook contains **50** practical, scenario-based questions that take you through the pandas library from beginner to advanced level. The questions are modeled on real-world data analysis tasks.

---

## 🥉 Beginner Level (1-20)

### 1. Basic DataFrame Creation and Manipulation

**Scenario:** You are a data analyst at a retail company. You receive the following sales data for analysis:

| Product | Category    | Units Sold | Unit Price | Total Revenue |
|---------|-------------|------------|------------|---------------|
| Shoes   | Apparel     | 150        | 50         | 7500          |
| Shirts  | Apparel     | 300        | 20         | 6000          |
| Laptops | Electronics | 20         | 1000       | 20000         |
| Mobiles | Electronics | 100        | 500        | 50000         |
| Watches | Accessories | 200        | 100        | 20000         |

**Tasks:**  

1. Create a DataFrame for this sales data.  

2. Add a new column "Profit" assuming a 30% profit margin on each product.  

3. Filter the DataFrame to display only the products that have sold more than 100 units.  

4. Calculate the total revenue and average unit price for each category.  

5. Sort the DataFrame by total revenue in descending order.  

6. Find the product with the highest unit price.  

7. Add a column to calculate the profit for each product.  

8. Remove the "Total Revenue" column.  

9. Rename the "Units Sold" column to "Quantity Sold".  

10. Extract the first three characters from each product name.  

11. Check for any missing values in the DataFrame.  

12. Replace the category "Accessories" with "Wearables".  

13. Add a row for a new product "Headphones" in the Electronics category.  

14. Drop the "Profit" column.  

15. Count the number of products in each category.  

16. Find the average unit price of Electronics products.  

17. Convert the "Unit Price" to a float data type.  

18. Reset the index of the DataFrame.  

19. Display only the last 3 rows of the DataFrame.  

20. Create a summary report of the DataFrame using the `describe()` function.  
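
As a starting point, a few of the tasks above could be approached as in the sketch below. It is one possible approach, not a reference solution. Two assumptions are made: the "30% profit margin" is read as 30% of Total Revenue, and the Headphones row is added with only the two values the task supplies, so its numeric fields stay missing. Variable names such as `sales` and `high_volume` are illustrative.

```python
import pandas as pd

# Task 1: build the DataFrame from the table above.
sales = pd.DataFrame({
    "Product": ["Shoes", "Shirts", "Laptops", "Mobiles", "Watches"],
    "Category": ["Apparel", "Apparel", "Electronics", "Electronics", "Accessories"],
    "Units Sold": [150, 300, 20, 100, 200],
    "Unit Price": [50, 20, 1000, 500, 100],
    "Total Revenue": [7500, 6000, 20000, 50000, 20000],
})

# Task 2 (assumption): profit is taken as 30% of Total Revenue.
sales["Profit"] = sales["Total Revenue"] * 0.30

# Task 3: products that sold strictly more than 100 units.
high_volume = sales[sales["Units Sold"] > 100]

# Task 4: total revenue and average unit price per category.
by_category = sales.groupby("Category").agg(
    total_revenue=("Total Revenue", "sum"),
    avg_unit_price=("Unit Price", "mean"),
)

# Task 5: sort by total revenue, highest first.
by_revenue = sales.sort_values("Total Revenue", ascending=False)

# Task 6: product with the highest unit price.
top_priced = sales.loc[sales["Unit Price"].idxmax(), "Product"]

# Task 9: rename a column (returns a new DataFrame).
renamed = sales.rename(columns={"Units Sold": "Quantity Sold"})

# Task 10: first three characters of each product name.
prefixes = sales["Product"].str[:3]

# Task 12: replace one category label with another.
sales["Category"] = sales["Category"].replace("Accessories", "Wearables")

# Task 13 (assumption): only Product and Category are given,
# so the remaining columns of the new row are left as NaN.
new_row = pd.DataFrame([{"Product": "Headphones", "Category": "Electronics"}])
with_new = pd.concat([sales, new_row], ignore_index=True)

# Task 11: count missing values per column (non-zero after Task 13).
missing_per_column = with_new.isna().sum()

# Task 15: number of products in each category.
category_counts = sales["Category"].value_counts()

# Task 17: convert Unit Price to float.
sales["Unit Price"] = sales["Unit Price"].astype(float)

# Tasks 19 and 20: last three rows and a numeric summary.
last_three = sales.tail(3)
summary = sales.describe()
```

Most of these calls return a new object rather than changing `sales` in place, so the result has to be assigned back when a change should persist.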

---

## 🥈 Intermediate Level (21-40)

### 2. Data Cleaning and Preparation

**Scenario:** You are working with customer feedback data for a mobile app. The data is as follows:

| Customer ID | Name    | Age | Feedback                                | Rating | Date       |
|-------------|---------|-----|-----------------------------------------|--------|------------|
| 101         | Alice   | 25  | Excellent service                       | 5      | 2023-01-01 |
| 102         | Bob     | 32  | Good app, but slow sometimes            | 3      | 2023-02-15 |
| 103         | Charlie | 28  | Needs more features, average experience | 2      | 2023-02-20 |
| 104         | Diana   |     | Excellent experience                    | 5      | 2023-03-10 |
| 105         | Eve     | 22  | Great, but needs better UI              | 4      | 2023-03-15 |

**Tasks:**  

21. Create a DataFrame for this customer feedback data.  

22. Identify and fill the missing age values with the average age.  

23. Extract the year, month, and day from the 'Date' column into separate columns (convert the column to datetime first if it is stored as text).  

24. Filter the feedback data to show only customers with a rating of 4 or higher.  

25. Calculate the average rating by month.  

26. Remove any duplicate customer IDs (if any).  

27. Sort the data by customer age in ascending order.  

28. Replace all occurrences of the word "Excellent" with "Outstanding".  

29. Count the number of unique customers in the dataset.  

30. Find the oldest customer in the dataset.  

31. Check if there are any null values in the 'Feedback' column.  

32. Convert the 'Date' column to datetime format.  

33. Extract the first initial of each customer's name.  

34. Group the feedback by rating and count the number of customers in each group.  

35. Add a column to classify feedback as "Positive" if the rating is 4 or 5, and "Negative" otherwise.  

36. Drop the "Date" column after extracting year, month, and day.  

37. Count the number of customers whose feedback contains the word "great" (in any letter case).  

38. Merge this DataFrame with another DataFrame containing customer emails.  

39. Create a bar plot to visualize the count of ratings.  

40. Save this DataFrame to a CSV file.  
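
A minimal sketch of how some of the cleaning tasks above might be tackled follows. It is one possible approach under stated assumptions: the 'Date' column is converted to datetime before its parts are extracted, the search for "great" ignores letter case, and the `emails` table with its `example.com` addresses is entirely hypothetical, made up only to demonstrate the merge.

```python
import numpy as np
import pandas as pd

# Task 21: build the DataFrame; Diana's age is missing (NaN).
feedback = pd.DataFrame({
    "Customer ID": [101, 102, 103, 104, 105],
    "Name": ["Alice", "Bob", "Charlie", "Diana", "Eve"],
    "Age": [25, 32, 28, np.nan, 22],
    "Feedback": [
        "Excellent service",
        "Good app, but slow sometimes",
        "Needs more features, average experience",
        "Excellent experience",
        "Great, but needs better UI",
    ],
    "Rating": [5, 3, 2, 5, 4],
    "Date": ["2023-01-01", "2023-02-15", "2023-02-20", "2023-03-10", "2023-03-15"],
})

# Task 22: fill missing ages with the mean of the known ages.
feedback["Age"] = feedback["Age"].fillna(feedback["Age"].mean())

# Tasks 32 and 23: convert to datetime, then extract the parts.
feedback["Date"] = pd.to_datetime(feedback["Date"])
feedback["Year"] = feedback["Date"].dt.year
feedback["Month"] = feedback["Date"].dt.month
feedback["Day"] = feedback["Date"].dt.day

# Task 24: customers with a rating of 4 or higher.
satisfied = feedback[feedback["Rating"] >= 4]

# Task 25: average rating per calendar month.
avg_rating_by_month = feedback.groupby("Month")["Rating"].mean()

# Task 26: keep the first row for each customer ID.
feedback = feedback.drop_duplicates(subset="Customer ID")

# Task 28: literal (non-regex) text replacement.
feedback["Feedback"] = feedback["Feedback"].str.replace(
    "Excellent", "Outstanding", regex=False
)

# Task 33: first letter of each name.
feedback["Initial"] = feedback["Name"].str[0]

# Task 34: number of customers per rating.
rating_counts = feedback.groupby("Rating")["Customer ID"].count()

# Task 35: label ratings of 4 or 5 as Positive, the rest as Negative.
feedback["Sentiment"] = np.where(feedback["Rating"] >= 4, "Positive", "Negative")

# Task 37 (assumption): case-insensitive match for "great".
great_count = feedback["Feedback"].str.contains("great", case=False).sum()

# Task 38: hypothetical email table, invented for illustration only.
emails = pd.DataFrame({
    "Customer ID": [101, 102, 103],
    "Email": ["alice@example.com", "bob@example.com", "charlie@example.com"],
})
# A left join keeps every feedback row; unmatched customers get NaN emails.
merged = feedback.merge(emails, on="Customer ID", how="left")
```

For the two remaining output tasks, `feedback["Rating"].value_counts().sort_index().plot(kind="bar")` would draw the rating counts (this needs matplotlib installed), and `feedback.to_csv("feedback.csv", index=False)` would write the file, where the filename is only a placeholder.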

---

## 🥇 Advanced Level (41-50)

### 3. Data Aggregation and Analysis

**Scenario:** You have been given the sales data for a global retail chain. The data includes sales from different regions as follows:

| Region  | Product | Units Sold | Unit Price | Total Revenue |
|---------|---------|------------|------------|---------------|
| North   | Shoes   | 200        | 50         | 10000         |
| South   | Shirts  | 150        | 30         | 4500          |
| East    | Laptops | 30         | 1200       | 36000         |
| West    | Mobiles | 100        | 600        | 60000         |
| Central | Watches | 180        | 150        | 27000         |
| North   | Laptops | 50         | 1000       | 50000         |
| South   | Mobiles | 120        | 550        | 66000         |

**Tasks:**  

41. Create a DataFrame for this regional sales data.  

42. Group the data by region and calculate the total revenue, total units sold, and average unit price for each region.  

43. Identify the region with the highest total revenue.  

44. Create a pivot table to summarize the total revenue for each product in each region.  

45. Visualize the total revenue by region using a bar chart.  

46. Find the product with the highest total revenue in each region.  

47. Calculate the total profit assuming a 25% profit margin for each product.  

48. Remove all rows where the total revenue is less than 5000.  

49. Find the average units sold per product (this dataset has no category column).  

50. Save this DataFrame to an Excel file.  
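
The aggregation tasks above could be sketched roughly as follows. This is one possible approach, not a definitive solution. It assumes the "25% profit margin" means 25% of Total Revenue, that "remove rows below 5000" keeps rows with revenue of 5000 or more, and that the per-product average groups by the Product column, since the table has no category column. All variable names are illustrative.

```python
import pandas as pd

# Task 41: build the DataFrame from the regional sales table.
regional = pd.DataFrame({
    "Region": ["North", "South", "East", "West", "Central", "North", "South"],
    "Product": ["Shoes", "Shirts", "Laptops", "Mobiles", "Watches", "Laptops", "Mobiles"],
    "Units Sold": [200, 150, 30, 100, 180, 50, 120],
    "Unit Price": [50, 30, 1200, 600, 150, 1000, 550],
    "Total Revenue": [10000, 4500, 36000, 60000, 27000, 50000, 66000],
})

# Task 42: several aggregates per region in one pass (named aggregation).
region_summary = regional.groupby("Region").agg(
    total_revenue=("Total Revenue", "sum"),
    total_units=("Units Sold", "sum"),
    avg_unit_price=("Unit Price", "mean"),
)

# Task 43: region label with the largest summed revenue.
top_region = region_summary["total_revenue"].idxmax()

# Task 44: regions as rows, products as columns, 0 where no sales exist.
revenue_pivot = regional.pivot_table(
    index="Region",
    columns="Product",
    values="Total Revenue",
    aggfunc="sum",
    fill_value=0,
)

# Task 46: row with the highest revenue inside each region.
best_rows = regional.groupby("Region")["Total Revenue"].idxmax()
best_per_region = regional.loc[best_rows, ["Region", "Product", "Total Revenue"]]

# Task 47 (assumption): profit is 25% of Total Revenue.
regional["Profit"] = regional["Total Revenue"] * 0.25
total_profit = regional["Profit"].sum()

# Task 48: keep only rows with revenue of at least 5000.
filtered = regional[regional["Total Revenue"] >= 5000]

# Task 49 (assumption): average units sold grouped by Product.
avg_units_per_product = regional.groupby("Product")["Units Sold"].mean()
```

For the output tasks, `region_summary["total_revenue"].plot(kind="bar")` would draw the revenue chart (matplotlib required), and `regional.to_excel("regional_sales.xlsx", index=False)` would write the workbook, assuming an Excel engine such as openpyxl is installed. The filename is a placeholder.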