# 🐼 Python Pandas Practice Exercises
- Learning and mastering Pandas, Python's powerful data manipulation library.

- This notebook contains a series of progressively challenging exercises (with real-world context) to help you become a Pandas pro.


## 📘 Instructions followed:
- Read each problem description carefully.
- Solved the exercises using **Pythonic and efficient Pandas operations**.
- Did **not** use any external libraries unless specified.

Let's begin!


### 1. Create a Pandas Series from a Python list `[10, 20, 30, 40, 50]`.

In [1]:
import pandas as pd

li = [10, 20, 30, 40, 50]
ps = pd.Series(li)
ps


0    10
1    20
2    30
3    40
4    50
dtype: int64

### 2. Create a Series with custom index labels using `[100, 200, 300]` and labels `['a', 'b', 'c']`.

In [2]:
ps = pd.Series([100, 200, 300], index=['a', 'b', 'c'])
ps

a    100
b    200
c    300
dtype: int64

### 3. Retrieve the 2nd and 3rd elements from a Series.

In [5]:
ps.iloc[[1, 2]]

b    200
c    300
dtype: int64

### 4. Add two Series element-wise.

In [6]:
ps1 = pd.Series([1, 2, 3])
ps2 = pd.Series([4, 5, 6])
added_ps = ps1 + ps2
added_ps

0    5
1    7
2    9
dtype: int64

### 5. Apply a lambda function to square each element of a numeric Series.

In [10]:
ps = pd.Series([1, 2, 3])
squared = ps.apply(lambda x: x ** 2)
squared

0    1
1    4
2    9
dtype: int64

### 6. Create a Pandas DataFrame from a dictionary with columns: Name, Age, and City.

In [29]:
people = {
    'name': ['Asif', 'Babu', 'Emon', 'Hassan', 'Mazid', 'Shiraj'],
    'age' : [19, 27, 23, 30, 17, None],
    'city' : ['shylet', 'noakhali', 'rajshahi', 'barishal', 'chattogram', 'rangpur']
}

df = pd.DataFrame(people)
df

Unnamed: 0,name,age,city
0,Asif,19.0,shylet
1,Babu,27.0,noakhali
2,Emon,23.0,rajshahi
3,Hassan,30.0,barishal
4,Mazid,17.0,chattogram
5,Shiraj,,rangpur


### 7. Retrieve the first 3 rows of a DataFrame.

In [12]:
df.iloc[:3]

Unnamed: 0,name,age,city
0,Asif,19,shylet
1,Babu,27,noakhali
2,Emon,23,rajshahi


### 8. Select only the 'Name' and 'City' columns from the DataFrame.

In [13]:
df[['name', 'city']]

Unnamed: 0,name,city
0,Asif,shylet
1,Babu,noakhali
2,Emon,rajshahi
3,Hassan,barishal
4,Mazid,chattogram


### 9. Filter all rows where age is above 25.

In [14]:
df[df['age'] > 25]

Unnamed: 0,name,age,city
1,Babu,27,noakhali
3,Hassan,30,barishal


### 10. Add a new column 'Salary' with values `[50000, 60000, 70000]`.

In [30]:
# The DataFrame has six rows, so the list needs one value per row; None marks the missing salary.
salary = [50000, 60000, 70000, 80000, 20000, None]
df['salary'] = salary
df

Unnamed: 0,name,age,city,salary
0,Asif,19.0,shylet,50000.0
1,Babu,27.0,noakhali,60000.0
2,Emon,23.0,rajshahi,70000.0
3,Hassan,30.0,barishal,80000.0
4,Mazid,17.0,chattogram,20000.0
5,Shiraj,,rangpur,


### 11. Sort the DataFrame by 'Age' in descending order.

In [16]:
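# ascending=False gives descending order (sort_values sorts ascending by default).
df.sort_values('age', ascending=False)

Unnamed: 0,name,age,city,salary
3,Hassan,30,barishal,80000
1,Babu,27,noakhali,60000
2,Emon,23,rajshahi,70000
0,Asif,19,shylet,50000
4,Mazid,17,chattogram,20000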


### 12. Reset the index of a DataFrame.

In [17]:
df.reset_index()

Unnamed: 0,index,name,age,city,salary
0,0,Asif,19,shylet,50000
1,1,Babu,27,noakhali,60000
2,2,Emon,23,rajshahi,70000
3,3,Hassan,30,barishal,80000
4,4,Mazid,17,chattogram,20000


### 13. Set the 'Name' column as index.

In [None]:
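# Use the 'name' column as the index; df = df.set_index('name') would do this and drop the column in one step.
df.index = df['name']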

In [19]:
df

name,name,age,city,salary
Asif,Asif,19,shylet,50000
Babu,Babu,27,noakhali,60000
Emon,Emon,23,rajshahi,70000
Hassan,Hassan,30,barishal,80000
Mazid,Mazid,17,chattogram,20000


In [23]:
# The names now live in the index, so drop the duplicate 'name' column.
df.drop('name', axis=1, inplace=True)

### 14. Filter rows where 'City' is either 'noakhali' or 'shylet'.

In [27]:
# isin() keeps rows whose city matches any value in the list.
df[df['city'].isin(['noakhali', 'shylet'])]

name,age,city,salary
Asif,19,shylet,50000
Babu,27,noakhali,60000


### 15. Replace missing values in a DataFrame with the column mean.

In [46]:
# Assign the result back instead of calling fillna(..., inplace=True) on a column selection;
# pandas warns that the chained inplace form will stop working in pandas 3.0.
df['age'] = df['age'].fillna(df['age'].mean())

In [None]:
df['salary'] = df['salary'].fillna(df['salary'].mean())


In [50]:
df

Unnamed: 0,name,age,city,salary
0,Asif,19.0,shylet,50000.0
1,Babu,27.0,noakhali,60000.0
2,Emon,23.0,rajshahi,70000.0
3,Hassan,30.0,barishal,80000.0
4,Mazid,17.0,chattogram,20000.0
5,Shiraj,23.2,rangpur,56000.0


### 16. Group the DataFrame by 'City' and compute average 'Salary'.
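One possible approach is `groupby` followed by `mean`. This is a minimal sketch: the `staff` DataFrame below is made-up sample data (an assumption, not the notebook's `df`), with repeated cities so that the grouping has something to average.

```python
import pandas as pd

# Hypothetical sample data; cities repeat so that grouping is meaningful.
staff = pd.DataFrame({
    'name': ['Asif', 'Babu', 'Emon', 'Hassan'],
    'city': ['shylet', 'noakhali', 'shylet', 'noakhali'],
    'salary': [50000, 60000, 70000, 80000],
})

# Mean salary per city; the result is a Series indexed by city.
avg_salary = staff.groupby('city')['salary'].mean()
print(avg_salary)
```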

### 17. Use `apply()` to calculate the length of each name in 'Name' column.
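A minimal sketch, assuming a small stand-in DataFrame called `names` (hypothetical, reusing the names from the earlier exercises). Passing the built-in `len` to `apply()` calls it once per value.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's DataFrame.
names = pd.DataFrame({'name': ['Asif', 'Babu', 'Emon', 'Hassan', 'Mazid', 'Shiraj']})

# apply() calls len() on every name and returns a Series of lengths.
names['name_length'] = names['name'].apply(len)
print(names)
```

For string columns, `names['name'].str.len()` is a vectorized alternative that also tolerates missing values.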

### 18. Add a new column 'Tax' that is 10% of the Salary using `apply()`.
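A minimal sketch, assuming a hypothetical `pay` DataFrame with a numeric `salary` column. The exercise asks for `apply()`, so a lambda is used here; plain arithmetic (`pay['salary'] * 0.10`) would give the same values without a Python-level loop.

```python
import pandas as pd

# Hypothetical sample data for illustration.
pay = pd.DataFrame({
    'name': ['Asif', 'Babu', 'Emon'],
    'salary': [50000, 60000, 70000],
})

# Tax is 10% of each salary, computed row by row with apply().
pay['tax'] = pay['salary'].apply(lambda s: s * 0.10)
print(pay)
```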

### 19. Filter groups with more than 1 entry using `groupby`.
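One way to do this is `groupby(...).filter(...)`, which keeps every row belonging to a group for which the function returns `True`. The `staff` data below is a made-up example in which only one city appears more than once.

```python
import pandas as pd

# Hypothetical sample data: 'shylet' appears twice, the other cities once.
staff = pd.DataFrame({
    'name': ['Asif', 'Babu', 'Emon', 'Hassan'],
    'city': ['shylet', 'noakhali', 'shylet', 'rajshahi'],
})

# Keep only the rows whose city group has more than one member.
multi = staff.groupby('city').filter(lambda g: len(g) > 1)
print(multi)
```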

### 20. Create a pivot table of average Salary by City and Age.
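A minimal sketch using `pd.pivot_table`, with cities as rows and ages as columns. The `staff` data is hypothetical; city/age combinations that never occur come out as `NaN`.

```python
import pandas as pd

# Hypothetical sample data for illustration.
staff = pd.DataFrame({
    'city': ['shylet', 'shylet', 'noakhali', 'noakhali'],
    'age': [19, 19, 27, 30],
    'salary': [50000, 70000, 60000, 80000],
})

# Rows are cities, columns are ages, cells hold the mean salary.
pivot = pd.pivot_table(staff, values='salary', index='city',
                       columns='age', aggfunc='mean')
print(pivot)
```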

### 21. Merge two DataFrames on a common column 'ID'.
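A minimal sketch with two made-up DataFrames, `employees` and `salaries` (both hypothetical), that share an `ID` column. `pd.merge` performs an inner join by default, so only IDs present in both frames survive.

```python
import pandas as pd

# Hypothetical sample data: IDs 2 and 3 appear in both frames.
employees = pd.DataFrame({'ID': [1, 2, 3], 'name': ['Asif', 'Babu', 'Emon']})
salaries = pd.DataFrame({'ID': [2, 3, 4], 'salary': [60000, 70000, 80000]})

# Inner join on the shared 'ID' column.
merged = pd.merge(employees, salaries, on='ID')
print(merged)
```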

### 22. Perform a left join using Pandas.
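A minimal sketch, again with hypothetical `employees` and `salaries` frames. Passing `how='left'` keeps every row of the left frame and fills unmatched right-hand columns with `NaN`.

```python
import pandas as pd

# Hypothetical sample data: ID 1 has no matching salary row.
employees = pd.DataFrame({'ID': [1, 2, 3], 'name': ['Asif', 'Babu', 'Emon']})
salaries = pd.DataFrame({'ID': [2, 3, 4], 'salary': [60000, 70000, 80000]})

# Left join: all employees are kept, matched or not.
left_joined = employees.merge(salaries, on='ID', how='left')
print(left_joined)
```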

### 23. Concatenate two DataFrames vertically (stack rows).
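A minimal sketch using `pd.concat` on two hypothetical frames, `batch1` and `batch2`, that have the same columns. `ignore_index=True` renumbers the rows instead of repeating the original labels.

```python
import pandas as pd

# Hypothetical sample data with identical columns.
batch1 = pd.DataFrame({'name': ['Asif', 'Babu'], 'age': [19, 27]})
batch2 = pd.DataFrame({'name': ['Emon', 'Hassan'], 'age': [23, 30]})

# Stack the rows of batch2 under batch1 and rebuild a 0..n-1 index.
stacked = pd.concat([batch1, batch2], ignore_index=True)
print(stacked)
```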

### 24. Concatenate two DataFrames horizontally (side-by-side).
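A minimal sketch using `pd.concat` with `axis=1`. The frames `left` and `right` are hypothetical; rows are aligned on the index, so both frames here use the default 0..2 index.

```python
import pandas as pd

# Hypothetical sample data sharing the same default index.
left = pd.DataFrame({'name': ['Asif', 'Babu', 'Emon']})
right = pd.DataFrame({'city': ['shylet', 'noakhali', 'rajshahi']})

# axis=1 places the columns of 'right' next to those of 'left'.
side_by_side = pd.concat([left, right], axis=1)
print(side_by_side)
```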

### 25. Join two DataFrames with different column names but similar data using `left_on` and `right_on`.
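A minimal sketch, assuming the key is called `emp_id` in one frame and `staff_id` in the other (both names are made up for illustration). `left_on` and `right_on` tell `merge` which column to use on each side; both key columns are kept in the result.

```python
import pandas as pd

# Hypothetical sample data: same kind of key, different column names.
employees = pd.DataFrame({'emp_id': [1, 2, 3], 'name': ['Asif', 'Babu', 'Emon']})
salaries = pd.DataFrame({'staff_id': [1, 2, 3], 'salary': [50000, 60000, 70000]})

# Match employees.emp_id against salaries.staff_id.
joined = employees.merge(salaries, left_on='emp_id', right_on='staff_id')
print(joined)
```

If the duplicate key is not needed afterwards, `joined.drop(columns='staff_id')` removes it.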

### 26. Load a CSV file into a Pandas DataFrame (e.g., 'sales.csv').
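A minimal sketch of `pd.read_csv`. No `sales.csv` ships with this notebook, so the example reads from an in-memory string instead; the column names (`Order ID`, `Product`, `Sales`) and the values are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real 'sales.csv' file.
csv_text = """Order ID,Product,Sales
1001,Pen,120
1002,Book,300
1003,Bag,500
"""

# With a real file on disk this would be: pd.read_csv('sales.csv')
sales = pd.read_csv(io.StringIO(csv_text))
print(sales.head())
```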

### 27. Find top 5 products with highest total sales from the sales data.
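One possible approach: sum sales per product, then take the five largest totals with `nlargest`. The `sales` data below is made up, and the column names `Product` and `Sales` are assumptions about what the sales file would contain.

```python
import pandas as pd

# Hypothetical sales records; some products appear more than once.
sales = pd.DataFrame({
    'Product': ['Pen', 'Book', 'Bag', 'Pen', 'Lamp', 'Desk', 'Chair', 'Book', 'Mug'],
    'Sales': [120, 300, 500, 80, 250, 900, 650, 150, 40],
})

# Total sales per product, then the five highest totals in descending order.
top5 = sales.groupby('Product')['Sales'].sum().nlargest(5)
print(top5)
```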

### 28. Convert a date column to datetime and extract the year.
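A minimal sketch using `pd.to_datetime` and the `.dt` accessor. The `orders` frame and its `Order Date` column are hypothetical, with dates given as ISO-formatted strings.

```python
import pandas as pd

# Hypothetical sample data: dates stored as plain strings.
orders = pd.DataFrame({'Order Date': ['2023-01-15', '2024-06-30', '2025-03-01']})

# Parse the strings into datetime64 values, then pull out the year.
orders['Order Date'] = pd.to_datetime(orders['Order Date'])
orders['Year'] = orders['Order Date'].dt.year
print(orders)
```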

### 29. Create a rolling average column for 'Sales' with a window of 3.
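A minimal sketch using `rolling(window=3).mean()` on a made-up `Sales` column. With the default settings, the first two rows are `NaN` because fewer than three values are available there.

```python
import pandas as pd

# Hypothetical sample data for illustration.
sales = pd.DataFrame({'Sales': [10, 20, 30, 40, 50]})

# Mean of the current row and the two rows before it.
sales['Rolling Avg'] = sales['Sales'].rolling(window=3).mean()
print(sales)
```

Passing `min_periods=1` to `rolling` would fill the first rows with averages over whatever values exist so far.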

### 30. Identify and drop duplicate rows based on 'Order ID'.
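
A minimal sketch: `duplicated` flags the repeated rows and `drop_duplicates` removes them. The `orders` data is hypothetical; `keep=False` marks every copy of a repeated `Order ID`, while `keep='first'` retains the first occurrence.

```python
import pandas as pd

# Hypothetical sample data: order 1001 appears twice.
orders = pd.DataFrame({
    'Order ID': [1001, 1002, 1001, 1003],
    'Product': ['Pen', 'Book', 'Pen', 'Bag'],
})

# Identify: every row whose 'Order ID' occurs more than once.
dupes = orders[orders.duplicated(subset='Order ID', keep=False)]
print(dupes)

# Drop: keep the first row for each 'Order ID'.
deduped = orders.drop_duplicates(subset='Order ID', keep='first')
print(deduped)
```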