# Pandas Data Manipulation Exercise

## Objective

This exercise gives you hands-on experience with key Pandas operations: creating, exploring, filtering, slicing, merging, concatenating, and grouping data. Each task builds a foundational skill in manipulating data with Pandas.

## Instructions

1. Complete each task sequentially, answering all questions.
2. Add comments to explain your code where necessary.
3. Submit your work as a Jupyter Notebook (.ipynb) or a Python script (.py).

## Dataset Setup

Create a DataFrame df_sales using the following data:

| Order_ID | Product     | Category    | Quantity | Price | Store_Location |
|----------|-------------|-------------|----------|-------|----------------|
| 1        | Laptop      | Electronics | 3        | 800   | New York       |
| 2        | Headphones  | Electronics | 5        | 150   | San Francisco  |
| 3        | Chair       | Furniture   | 10       | 85    | New York       |
| 4        | Desk        | Furniture   | 2        | 200   | Chicago        |
| 5        | Monitor     | Electronics | 4        | 300   | San Francisco  |
| 6        | Lamp        | Furniture   | 7        | 40    | New York       |
| 7        | Smartphone  | Electronics | 6        | 600   | Chicago        |
| 8        | Sofa        | Furniture   | 1        | 500   | San Francisco  |

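One common way to build a DataFrame by hand is from a dictionary of lists. The sketch below uses hypothetical toy data (`df_demo` and its columns are illustrative, not the exercise dataset), so treat it as a pattern to adapt rather than the solution.

```python
import pandas as pd

# Hypothetical toy data: each dict key becomes a column name and
# each list holds that column's values in row order.
df_demo = pd.DataFrame({
    "Item": ["Apple", "Bread", "Milk"],
    "Group": ["Fruit", "Bakery", "Dairy"],
    "Units": [4, 2, 6],
})

print(df_demo)
```

All lists must have the same length, otherwise pandas raises a `ValueError`.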

In [2]:
# Create a DataFrame using pandas library

# CODE HERE

## Tasks and Questions

## Task 1: Basic DataFrame Information

1. Display the first 5 rows and the last 5 rows of df_sales using .head() and .tail().

**Question:** What are the products listed in the first and last rows?
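As a rough illustration on hypothetical toy data (`df_demo` is not the exercise dataset), `.head(n)` and `.tail(n)` return the first and last `n` rows, and both default to 5 when called without an argument.

```python
import pandas as pd

# Hypothetical toy data for illustration only
df_demo = pd.DataFrame({
    "Item": ["Apple", "Bread", "Milk", "Eggs"],
    "Units": [4, 2, 6, 12],
})

first_two = df_demo.head(2)  # first 2 rows
last_two = df_demo.tail(2)   # last 2 rows

# .iloc[0] and .iloc[-1] pick the very first and very last row by position
first_item = df_demo.iloc[0]["Item"]
last_item = df_demo.iloc[-1]["Item"]
print(first_item, last_item)
```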

In [3]:
# CODE HERE

#### Data Structure

Use .size, .shape, and .ndim to find the total number of elements, the number of rows and columns, and the dimensions of df_sales.

**Question:** How many rows and columns are in df_sales?
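The three attributes are related: for a DataFrame, `.size` equals rows times columns. A small sketch on hypothetical toy data (`df_demo` is illustrative):

```python
import pandas as pd

# Hypothetical toy data: 4 rows, 2 columns
df_demo = pd.DataFrame({
    "Item": ["Apple", "Bread", "Milk", "Eggs"],
    "Units": [4, 2, 6, 12],
})

n_rows, n_cols = df_demo.shape  # shape is a (rows, columns) tuple
total_cells = df_demo.size      # total number of elements
n_dims = df_demo.ndim           # 2 for any DataFrame

print(n_rows, n_cols, total_cells, n_dims)
```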



In [4]:
# CODE HERE

#### Column Information
Use .columns and .dtypes to check the column names and data types.

**Question:** Which columns contain numeric data?
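Besides reading `.dtypes` by eye, `select_dtypes` can list numeric columns programmatically. This is a sketch on hypothetical toy data, offered as one possible approach:

```python
import pandas as pd

# Hypothetical toy data with one text column and two numeric columns
df_demo = pd.DataFrame({
    "Item": ["Apple", "Bread", "Milk"],
    "Units": [4, 2, 6],
    "Unit_Cost": [0.5, 2.5, 1.2],
})

print(df_demo.columns)  # column labels
print(df_demo.dtypes)   # one dtype per column

# Keep only columns whose dtype is numeric (integers and floats)
numeric_cols = df_demo.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```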

In [5]:
# CODE HERE

## Task 2: Descriptive Statistics
#### Summary Statistics

Use .describe() to view summary statistics for numeric columns.

**Question:** What is the average quantity sold? What is the maximum price?
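The result of `.describe()` is itself a DataFrame, so individual statistics can be looked up by row label (`"mean"`, `"max"`, and so on) and column name. A minimal sketch on hypothetical toy data:

```python
import pandas as pd

# Hypothetical toy data for illustration only
df_demo = pd.DataFrame({
    "Item": ["Apple", "Bread", "Milk", "Eggs"],
    "Units": [4, 2, 6, 12],
})

summary = df_demo.describe()  # count, mean, std, min, quartiles, max

mean_units = summary.loc["mean", "Units"]
max_units = summary.loc["max", "Units"]
print(mean_units, max_units)
```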


In [6]:
# CODE HERE


#### Missing Values

Use .info() to get a concise summary of df_sales.

**Question:** Are there any missing values?
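`.info()` reports a non-null count per column; comparing it with the row count reveals gaps. As a cross-check, `.isna().sum()` counts missing values directly. The toy data below is hypothetical and deliberately contains one missing value:

```python
import pandas as pd

# Hypothetical toy data; None becomes NaN in the numeric column
df_demo = pd.DataFrame({
    "Item": ["Apple", "Bread", "Milk"],
    "Units": [4, None, 6],
})

df_demo.info()  # prints dtypes and non-null counts

missing = df_demo.isna().sum()  # missing values per column
print(missing)
```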


In [7]:
# CODE HERE

#### Min and Max Price
Find the minimum, maximum, and average values of the Price column.

**Question:** What is the lowest price and which product has it?
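One way to connect a minimum value back to its row is `idxmin()`, which returns the index label of the smallest entry. A sketch on hypothetical toy data (column names are illustrative):

```python
import pandas as pd

# Hypothetical toy data for illustration only
df_demo = pd.DataFrame({
    "Item": ["Apple", "Bread", "Milk", "Eggs"],
    "Unit_Cost": [0.5, 2.5, 1.2, 0.25],
})

lowest = df_demo["Unit_Cost"].min()
highest = df_demo["Unit_Cost"].max()
average = df_demo["Unit_Cost"].mean()

# idxmin() gives the row label of the minimum; .loc looks up that row
cheapest_item = df_demo.loc[df_demo["Unit_Cost"].idxmin(), "Item"]
print(lowest, highest, average, cheapest_item)
```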

In [8]:
# CODE HERE

## Task 3: Filtering and Slicing

#### Filter by Category

Filter rows where Category is "Electronics".

**Question:** How many products in the Electronics category are in the dataset?
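Filtering in pandas usually means building a boolean mask and indexing with it. The same pattern works for text equality and for numeric comparisons such as `>`. The sketch below uses hypothetical toy data:

```python
import pandas as pd

# Hypothetical toy data for illustration only
df_demo = pd.DataFrame({
    "Item": ["Apple", "Bread", "Milk", "Eggs"],
    "Group": ["Fruit", "Bakery", "Dairy", "Dairy"],
    "Units": [4, 2, 6, 12],
})

mask = df_demo["Group"] == "Dairy"  # boolean Series, one value per row
dairy = df_demo[mask]               # keeps rows where the mask is True

dairy_count = len(dairy)
print(dairy)
```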


In [9]:
# CODE HERE

#### Filter by Price

Filter rows where Price is above 300.

**Question:** What products have a price greater than 300?

In [10]:
# CODE HERE


#### Filter by Location
Filter rows for orders from "New York".

**Question:** How many orders were made from "New York"?


In [11]:
# CODE HERE

#### Indexing and Slicing

Select only the Product and Quantity columns for the first 5 rows.

**Question:** What are the names and quantities of the first five products?
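Column selection and row slicing can be combined in several equivalent ways. The sketch below, on hypothetical toy data, shows a label-based and a position-based version:

```python
import pandas as pd

# Hypothetical toy data for illustration only
df_demo = pd.DataFrame({
    "Item": ["Apple", "Bread", "Milk", "Eggs"],
    "Group": ["Fruit", "Bakery", "Dairy", "Dairy"],
    "Units": [4, 2, 6, 12],
})

# A list of column names selects those columns; head(2) keeps two rows
subset = df_demo[["Item", "Units"]].head(2)

# Same result by position: first two rows, columns 0 and 2
by_position = df_demo.iloc[:2, [0, 2]]
print(subset)
```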


In [12]:
# CODE HERE

#### Advanced Filtering
Select rows where Category is "Furniture" and Quantity is more than 5.

**Question:** How many furniture items have quantities greater than 5?
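Multiple conditions are combined with `&` (and) or `|` (or), and each condition needs its own parentheses because of operator precedence. A sketch on hypothetical toy data:

```python
import pandas as pd

# Hypothetical toy data for illustration only
df_demo = pd.DataFrame({
    "Item": ["Apple", "Bread", "Milk", "Eggs"],
    "Group": ["Fruit", "Bakery", "Dairy", "Dairy"],
    "Units": [4, 2, 6, 12],
})

# Parentheses are required around each comparison
both = df_demo[(df_demo["Group"] == "Dairy") & (df_demo["Units"] > 10)]

match_count = len(both)
print(both)
```

Using Python's `and` instead of `&` raises a `ValueError`, because a Series has no single truth value.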


In [13]:
# CODE HERE

## Task 4: Adding and Modifying Columns

#### Add Total Price Column

Add a new column called Total_Price, calculated as Quantity * Price.

**Question:** What is the total price for each product?
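Arithmetic between two columns is applied row by row, and assigning the result to a new name creates the column. A sketch on hypothetical toy data (`Cost_Total` is an illustrative column name):

```python
import pandas as pd

# Hypothetical toy data for illustration only
df_demo = pd.DataFrame({
    "Item": ["Apple", "Bread", "Milk", "Eggs"],
    "Units": [4, 2, 6, 12],
    "Unit_Cost": [2, 5, 3, 1],
})

# Element-wise product: row i gets Units[i] * Unit_Cost[i]
df_demo["Cost_Total"] = df_demo["Units"] * df_demo["Unit_Cost"]
print(df_demo)
```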



In [14]:
# CODE HERE

#### Add Price Category


Add a new column called Price_Category, labeling products as "High" if Price > 300 and "Low" otherwise.

**Question:** How many products fall into each price category?
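A two-way label like this is often written with `numpy.where`, and `value_counts()` then tallies each label. This is one possible approach, shown on hypothetical toy data with an arbitrary threshold of 2:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for illustration only
df_demo = pd.DataFrame({
    "Item": ["Apple", "Bread", "Milk", "Eggs"],
    "Unit_Cost": [2, 5, 3, 1],
})

# np.where(condition, value_if_true, value_if_false), evaluated per row
df_demo["Cost_Level"] = np.where(df_demo["Unit_Cost"] > 2, "High", "Low")

counts = df_demo["Cost_Level"].value_counts()
print(counts)
```

Note that a strict `>` puts values equal to the threshold in the "Low" group.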

In [15]:
# CODE HERE






## Task 5: Merging and Concatenation

Create a DataFrame df_discounts with the following data:


| Category           | Discount_Percentage |
|--------------------|---------------------|
| Electronics        | 10                  |
| Furniture          | 15                  |
| Clothing           | 20                  |
| Groceries          | 5                   |
| Toys               | 25                  |
| Books              | 30                  |
| Sports Equipment   | 12                  |
| Beauty             | 18                  |




In [16]:
# CODE HERE

#### Merge DataFrames
Merge df_sales and df_discounts based on the Category column using a left join. Save the result as df_combined.

**Question:** How many rows are in df_combined?
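A left join keeps every row of the left DataFrame and attaches matching columns from the right one. Keys that exist only on the right are dropped, and left rows without a match get `NaN`. The sketch below uses hypothetical toy frames (`df_items`, `df_rates`) to show both effects:

```python
import pandas as pd

# Hypothetical toy frames for illustration only
df_items = pd.DataFrame({
    "Item": ["Apple", "Bread", "Milk", "Eggs"],
    "Group": ["Fruit", "Bakery", "Dairy", "Dairy"],
})
df_rates = pd.DataFrame({
    "Group": ["Fruit", "Dairy", "Frozen"],
    "Rate": [5, 10, 20],
})

# how="left": one output row per left row when right keys are unique
merged = pd.merge(df_items, df_rates, on="Group", how="left")
print(merged)
```

Here "Bakery" has no rate, so its `Rate` is `NaN`, and "Frozen" never appears in the result.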


In [17]:
# CODE HERE

#### Calculate Discounted Price
Add a new column called Discounted_Price in df_combined, using the formula: **Discounted_Price = Total_Price * (1 - Discount_Percentage / 100).**
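To see the formula at work, a total of 200 with a 10 percent discount becomes 200 * (1 - 10 / 100) = 180. The sketch below applies the same arithmetic column-wise on hypothetical toy data (`Gross`, `Rate`, and `Net` are illustrative names):

```python
import pandas as pd

# Hypothetical toy data for illustration only
df_demo = pd.DataFrame({
    "Gross": [200, 100],
    "Rate": [10, 15],  # discount in percent
})

# Dividing by 100 converts the percentage to a fraction
df_demo["Net"] = df_demo["Gross"] * (1 - df_demo["Rate"] / 100)
print(df_demo)
```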

In [18]:
# CODE HERE


**Question:** What is the discounted price for each product?

#### Concatenate DataFrames

Concatenate df_sales with a new DataFrame df_extra_sales:

| Order_ID | Product     | Category    | Quantity | Price | Store_Location |
|----------|-------------|-------------|----------|-------|----------------|
| 1        | Laptop      | Electronics | 3        | 800   | New York       |
| 2        | Headphones  | Electronics | 5        | 150   | San Francisco  |
| 3        | Chair       | Furniture   | 10       | 85    | New York       |
| 4        | Desk        | Furniture   | 2        | 200   | Chicago        |
| 5        | Monitor     | Electronics | 4        | 300   | San Francisco  |
| 6        | Lamp        | Furniture   | 7        | 40    | New York       |
| 7        | Smartphone  | Electronics | 6        | 600   | Chicago        |
| 8        | Sofa        | Furniture   | 1        | 500   | San Francisco  |
| 9        | Table       | Furniture   | 2        | 150   | Chicago        |
| 10       | Speaker     | Electronics | 3        | 200   | San Francisco  |


**Question:** After concatenation, how many total rows are in the updated DataFrame?
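`pd.concat` stacks DataFrames vertically by default. It does not remove rows that appear in both inputs, and it keeps the original index labels unless `ignore_index=True` is passed. A sketch on hypothetical toy frames:

```python
import pandas as pd

# Hypothetical toy frames for illustration only
df_a = pd.DataFrame({"Item": ["Apple", "Bread"], "Units": [4, 2]})
df_b = pd.DataFrame({"Item": ["Bread", "Milk"], "Units": [2, 6]})

# ignore_index=True renumbers the rows 0..n-1 in the result
stacked = pd.concat([df_a, df_b], ignore_index=True)

total_rows = len(stacked)
print(stacked)
```

The "Bread" row appears twice in the result; `drop_duplicates()` is available if repeated rows are not wanted.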



In [19]:
# CODE HERE

## Task 6: Grouping and Aggregation

#### Group by Category


Group by Category to find the total quantity sold and average price per category.

**Question:** Which category sold the highest quantity overall?
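When different columns need different aggregations, named aggregation in `.agg()` is one readable option. The sketch below uses hypothetical toy data, and the output column names (`Total_Units`, `Avg_Cost`) are illustrative. The same pattern applies to grouping by any other column.

```python
import pandas as pd

# Hypothetical toy data for illustration only
df_demo = pd.DataFrame({
    "Group": ["Fruit", "Bakery", "Dairy", "Dairy"],
    "Units": [4, 2, 6, 12],
    "Unit_Cost": [2, 5, 3, 1],
})

# new_column=(source_column, aggregation)
summary = df_demo.groupby("Group").agg(
    Total_Units=("Units", "sum"),
    Avg_Cost=("Unit_Cost", "mean"),
)

# idxmax() returns the group label with the largest value
top_group = summary["Total_Units"].idxmax()
print(summary)
print(top_group)
```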



In [20]:
# CODE HERE

#### Group by Store Location
Group by Store_Location to find the total quantity sold and the sum of Total_Price at each location.

**Question:** Which store location generated the highest revenue?



In [21]:
# CODE HERE

#### Discounted Revenue Calculation
Using df_combined, group by Category and calculate the average Discounted_Price and the total revenue after discounts (the sum of Discounted_Price).

**Question:** What is the total discounted revenue for the "Electronics" category?

In [22]:
# CODE HERE