# INTRO TO PANDAS

## Exercises (Basic)

1. Create a Pandas Series with your five favorite numbers.

2. Create a DataFrame with columns Name, Age, and Salary for 5 people.

3. Convert a dictionary to a Pandas DataFrame and print its shape.

4. Import a CSV file into a DataFrame and display the first five rows.

5. Create a DataFrame from a list of dictionaries.


In [2]:
#Q1
import pandas as pd
import numpy as np
favorite_numbers = pd.Series([1, 2, 3, 4, 5])
favorite_numbers

0    1
1    2
2    3
3    4
4    5
dtype: int64

In [3]:
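#Q2
# Named `people` rather than `dict` so the built-in dict type is not shadowed
people = {
    'Name' : ["Ahmed","Saad","Ayan","Fuzail","Mughees"],
    'Salary' : [10000,3000,30000,0,0],
    'Age' : [20,19,19,20,19]
}

df2 = pd.DataFrame(people)
print(df2)
#Q3
# shape is a (rows, columns) tuple
print(df2.shape)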

      Name  Salary  Age
0    Ahmed   10000   20
1     Saad    3000   19
2     Ayan   30000   19
3   Fuzail       0   20
4  Mughees       0   19
(5, 3)


In [4]:
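#Q4
# SaleData.xlsx is an Excel workbook, so it is loaded with read_excel;
# a CSV file would be loaded with pd.read_csv instead.
df3 = pd.read_excel('SaleData.xlsx')
df3.head()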

    OrderDate   Region  Manager   SalesMan          Item  Units  Unit_price  Sale_amt
0  2018-01-06     East   Martha  Alexander    Television   95.0      1198.0  113810.0
1  2018-01-23  Central  Hermann     Shelli  Home Theater   50.0       500.0   25000.0
2  2018-02-09  Central  Hermann       Luis    Television   36.0      1198.0   43128.0
3  2018-02-26  Central  Timothy      David    Cell Phone   27.0       225.0    6075.0
4  2018-03-15     West  Timothy    Stephen    Television   56.0      1198.0   67088.0
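
Since the exercise asks for a CSV file, here is a minimal sketch of the same step with `pd.read_csv`. No CSV file is named in this notebook, so the CSV text below is a hypothetical stand-in built from the first three rows shown above and read from memory through `io.StringIO`; with a real file you would pass its path instead.

```python
import io

import pandas as pd

# Hypothetical CSV content (assumption): mirrors the first rows of the Excel output above
csv_text = """OrderDate,Region,Manager,SalesMan,Item,Units,Unit_price,Sale_amt
2018-01-06,East,Martha,Alexander,Television,95.0,1198.0,113810.0
2018-01-23,Central,Hermann,Shelli,Home Theater,50.0,500.0,25000.0
2018-02-09,Central,Hermann,Luis,Television,36.0,1198.0,43128.0
"""

# read_csv accepts a path or any file-like object; parse_dates converts the column to datetimes
sales_csv = pd.read_csv(io.StringIO(csv_text), parse_dates=["OrderDate"])

# head() returns the first five rows by default (here only three exist)
print(sales_csv.head())
print(sales_csv.shape)
```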


In [5]:
#Q5
import pandas as pd

# Proper list of dictionaries (each dictionary representing one row)
data = [
    {'Name': 'Ahmed', 'Salary': 10000, 'Age': 20},
    {'Name': 'Saad', 'Salary': 3000, 'Age': 19},
    {'Name': 'Ayan', 'Salary': 30000, 'Age': 19},
    {'Name': 'Fuzail', 'Salary': 0, 'Age': 20},
    {'Name': 'Mughees', 'Salary': 0, 'Age': 19}
]

# Creating DataFrame
df3 = pd.DataFrame(data)

# Display DataFrame
print(df3)


      Name  Salary  Age
0    Ahmed   10000   20
1     Saad    3000   19
2     Ayan   30000   19
3   Fuzail       0   20
4  Mughees       0   19


## Exercises
1. Create a Series with country names as indices and their capitals as values.
2. Convert a NumPy array into a DataFrame.
3. Retrieve a specific column from a DataFrame.
4. Retrieve the first three rows of a DataFrame.
5. Modify a specific row's values in a DataFrame.

In [6]:
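#Q1
# Names chosen so the built-in `list` is not shadowed; values are the actual capitals
capitals = ["Islamabad","New Delhi","Washington, D.C.","Kabul","Ottawa"]
countries = ["PAK","IND","USA","AFG","CAN"]

df = pd.Series(capitals, index = countries)
print(df)

PAK           Islamabad
IND           New Delhi
USA    Washington, D.C.
AFG               Kabul
CAN              Ottawa
dtype: object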


In [7]:
#Q2
arr = np.array([1,2,4,5,8,3,2,1])
df2 = pd.DataFrame(arr)
print(df2)

   0
0  1
1  2
2  4
3  5
4  8
5  3
6  2
7  1
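
A one-dimensional array becomes a single column labelled `0`. As a small sketch of the more common case, the block below converts a two-dimensional array and names the columns; the array values and the column names `a` and `b` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical 4x2 array: each inner list becomes one row of the DataFrame
arr_2d = np.array([[1, 2], [4, 5], [8, 3], [2, 1]])

# columns= supplies labels; without it the columns would be numbered 0 and 1
df_2d = pd.DataFrame(arr_2d, columns=["a", "b"])
print(df_2d)
```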


In [8]:
#Q3
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35],'City': ['New York', 'Los Angeles', 'Chicago']}
df3 = pd.DataFrame(data)
# Square brackets with a column label return that column as a Series
print(df3['Name'])


0      Alice
1        Bob
2    Charlie
Name: Name, dtype: object


In [9]:
#Q4
# iloc[:3] takes positions 0, 1 and 2; df3.head(3) is equivalent
print(df3.iloc[:3])

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago


In [10]:
#Q5
import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}

df = pd.DataFrame(data)

# Modify a specific row (e.g., update Bob's age and city)
df.loc[1, 'Age'] = 32  # Updating a single column
df.loc[1, 'City'] = 'San Francisco'  # Updating another column

# Alternatively, modify the entire row
df.loc[1] = ['Bob', 32, 'San Francisco']

# Display the updated DataFrame
print(df)


      Name  Age           City
0    Alice   25       New York
1      Bob   32  San Francisco
2  Charlie   35        Chicago


## Exercises
1. Detect missing values in a dataset.
2. Replace missing values with the column mean.
3. Drop all rows with missing values.
4. Fill missing values with a fixed value.
5. Interpolate missing values in time series data.


In [11]:
#Q1
df = pd.DataFrame({
    "Name" : ["Ahmed","Ali",np.nan],
    "Age" : [20,np.nan,20],
    "Salary": [1000,np.nan,2000]
})

print(df.isnull())

    Name    Age  Salary
0  False  False   False
1  False   True    True
2   True  False   False


In [12]:
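#Q2
# fillna returns a new Series and leaves df unchanged, so the results are kept
# in their own variables and the later cells still see the original NaNs
age_filled = df['Age'].fillna(df['Age'].mean())
salary_filled = df['Salary'].fillna(df['Salary'].mean())
salary_filled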



0    1000.0
1    1500.0
2    2000.0
Name: Salary, dtype: float64
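
To fill every numeric column with its own mean in one step, a sketch along these lines should work. It rebuilds the same three-row frame under the hypothetical name `df_missing` so it runs on its own, and it relies on `DataFrame.fillna` accepting a Series whose index matches the column names.

```python
import numpy as np
import pandas as pd

# Same data as the Q1 cell, under a separate (hypothetical) name
df_missing = pd.DataFrame({
    "Name": ["Ahmed", "Ali", np.nan],
    "Age": [20, np.nan, 20],
    "Salary": [1000, np.nan, 2000],
})

# Mean of each numeric column; NaN is skipped and the text column is ignored
numeric_means = df_missing.mean(numeric_only=True)

# Each column is filled with the value stored under its own name in numeric_means;
# "Name" has no entry there, so its missing value stays missing
df_mean_filled = df_missing.fillna(numeric_means)
print(df_mean_filled)
```

The missing name is left as is because a mean is not defined for text; it would need a fixed value or a dropped row instead.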

In [13]:
#Q3
# dropna returns a new DataFrame containing only the rows with no missing values
df.dropna()

    Name   Age  Salary
0  Ahmed  20.0  1000.0


In [14]:
#q4
df.fillna(10)

    Name   Age  Salary
0  Ahmed  20.0  1000.0
1    Ali  10.0    10.0
2     10  20.0  2000.0


In [15]:
#q5
import pandas as pd
import numpy as np

# Creating a sample time-series dataset with missing values
data = {
    'Date': pd.date_range(start='2025-03-01', periods=6, freq='D'),
    'Value': [10, np.nan, 20, np.nan, np.nan, 50]
}

df = pd.DataFrame(data)

# Set 'Date' as the index
df.set_index('Date', inplace=True)

# Interpolating missing values based on time
df['Value'] = df['Value'].interpolate(method='time')

print(df)


            Value
Date             
2025-03-01   10.0
2025-03-02   15.0
2025-03-03   20.0
2025-03-04   30.0
2025-03-05   40.0
2025-03-06   50.0
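
With evenly spaced daily dates, `method='time'` gives the same result as plain linear interpolation. The sketch below uses made-up, unevenly spaced dates to show where the two differ: `'time'` weights by the gap between timestamps, while `'linear'` treats the rows as equally spaced.

```python
import numpy as np
import pandas as pd

# Hypothetical series: the missing point is 1 day after the first value
# and 3 days before the last one
dates = pd.to_datetime(["2025-03-01", "2025-03-02", "2025-03-05"])
values = pd.Series([10.0, np.nan, 50.0], index=dates)

# 'time' uses the DatetimeIndex: 1/4 of the way from 10 to 50, so about 20
by_time = values.interpolate(method="time")

# 'linear' ignores the index: halfway between the neighbours, so about 30
by_position = values.interpolate(method="linear")

print(by_time)
print(by_position)
```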


## Exercises
1. Select a subset of columns from a DataFrame.
2. Filter rows based on a condition.
3. Filter rows based on multiple conditions.
4. Select specific rows using .loc[] and .iloc[].
5. Retrieve only even-indexed rows from a DataFrame.


In [None]:
#Q1
df = pd.DataFrame({
    "Name" : ["Ahmed","Ali","aqsa"],
    "Age" : [20,11,20],
    "Salary": [1000,3500,2000]
})

print(df[["Name","Age"]])

    Name  Age
0  Ahmed   20
1    Ali   11
2   aqsa   20


In [26]:
#   q2
df.loc[df["Age"]>15]


    Name  Age  Salary
0  Ahmed   20    1000
2   aqsa   20    2000


In [29]:
#   q3  
df.loc[(df["Age"] > 15) & (df["Salary"] > 1500)]

   Name  Age  Salary
2  aqsa   20    2000
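
For exercise 4 (selecting rows with `.loc[]` and `.iloc[]`), a minimal sketch follows. It rebuilds the same data under the hypothetical name `staff` and gives the rows made-up string labels `a`, `b`, `c`, so that label-based and position-based selection are easy to tell apart.

```python
import pandas as pd

# Same values as the Q1 cell; the string index labels are an assumption for illustration
staff = pd.DataFrame(
    {"Name": ["Ahmed", "Ali", "aqsa"], "Age": [20, 11, 20], "Salary": [1000, 3500, 2000]},
    index=["a", "b", "c"],
)

# .loc selects by label: rows "a" and "c", columns "Name" and "Salary"
by_label = staff.loc[["a", "c"], ["Name", "Salary"]]

# .iloc selects by integer position: rows 0 and 2, columns 0 and 2
by_position = staff.iloc[[0, 2], [0, 2]]

# Slices differ: .loc includes the end label, .iloc excludes the end position
label_slice = staff.loc["a":"b"]     # rows "a" and "b"
position_slice = staff.iloc[0:1]     # row at position 0 only

print(by_label)
print(by_position)
```

Both selections return the same two rows here; they diverge only when the index labels are not the default 0, 1, 2.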


In [31]:
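#Q5
# Step through positions 0, 2, 4, ...; unlike df.index % 2 == 0, this does not
# depend on the index labels being the default integers
df.iloc[::2]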

    Name  Age  Salary
0  Ahmed   20    1000
2   aqsa   20    2000
