## **Data Frame**

1) How to iterate over the rows of a Pandas DataFrame?

In [None]:
import pandas as pd

data = {'Name': ['Sasank', 'Sampath', 'Jaswanth'],
        'Age': [22, 30, 21],
        'City': ['New York', 'London', 'Paris']}
df = pd.DataFrame(data)


for index, row in df.iterrows():
    print(f"Index: {index}, Name: {row['Name']}, Age: {row['Age']}, City: {row['City']}")


Index: 0, Name: Sasank, Age: 22, City: New York
Index: 1, Name: Sampath, Age: 30, City: London
Index: 2, Name: Jaswanth, Age: 21, City: Paris
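
As a possible alternative to `iterrows()`, the sketch below uses `itertuples()`, which yields one namedtuple per row and is usually faster because it does not build a Series for every row. The variable names `lines` and `age_next_year` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Same sample data as the cell above
df = pd.DataFrame({'Name': ['Sasank', 'Sampath', 'Jaswanth'],
                   'Age': [22, 30, 21],
                   'City': ['New York', 'London', 'Paris']})

# itertuples() yields one namedtuple per row; fields are read as attributes
lines = []
for row in df.itertuples():
    lines.append(f"Index: {row.Index}, Name: {row.Name}, Age: {row.Age}, City: {row.City}")
print("\n".join(lines))

# For simple column arithmetic, a vectorized expression avoids the loop entirely
age_next_year = df['Age'] + 1
print(age_next_year.tolist())
```

Looping is rarely needed for column-wise calculations; the vectorized form on the last lines is generally the more idiomatic choice.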


2) How to select the rows with the minimum and maximum values in a Pandas DataFrame?


In [None]:
# Find the row with the minimum age
row_with_min_age = df.loc[df['Age'].idxmin()]
print("Row with minimum age:")
print(row_with_min_age)

# Find the row with the maximum age
row_with_max_age = df.loc[df['Age'].idxmax()]
print("\nRow with maximum age:")
print(row_with_max_age)

Row with minimum age:
Name    Jaswanth
Age           21
City       Paris
Name: 2, dtype: object

Row with maximum age:
Name    Sampath
Age          30
City     London
Name: 1, dtype: object
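
Note that `idxmin()` and `idxmax()` return only the first matching label, so tied rows are missed. The sketch below shows one way to keep every tied row with a boolean mask. The extra row `'Kiran'` is a hypothetical addition used only to create a tie.

```python
import pandas as pd

# 'Kiran' is a hypothetical extra row that ties for the minimum age
df = pd.DataFrame({'Name': ['Sasank', 'Sampath', 'Jaswanth', 'Kiran'],
                   'Age': [22, 30, 21, 21]})

# idxmin() returns the label of the first minimum only
first_min = df.loc[df['Age'].idxmin()]

# A boolean mask keeps every row whose age equals the minimum
all_min = df[df['Age'] == df['Age'].min()]

print(first_min['Name'])
print(all_min)
```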


3) How to merge or join two DataFrames on a common column?

In [None]:
import pandas as pd
# Create two sample DataFrames
data1 = {'ID': [1, 2, 3],
         'Name': ['Nithin', 'Lokesh', 'Vamsi']}
df1 = pd.DataFrame(data1)

data2 = {'ID': [2, 3, 4],
         'City': ['London', 'Paris', 'New York']}
df2 = pd.DataFrame(data2)

# Merge the two DataFrames based on the 'ID' column
# Inner join: only rows with matching IDs in both DataFrames are included
merged_df_inner = pd.merge(df1, df2, on='ID', how='inner')
print("Inner Join:")
print(merged_df_inner)

# Left join: all rows from the left DataFrame (df1) are included,
# and matching rows from the right DataFrame (df2) are added.
# Where df2 has no match, the df2 columns are filled with NaN.
merged_df_left = pd.merge(df1, df2, on='ID', how='left')
print("\nLeft Join:")
print(merged_df_left)

# Right join: all rows from the right DataFrame (df2) are included,
# and matching rows from the left DataFrame (df1) are added.
# Where df1 has no match, the df1 columns are filled with NaN.
merged_df_right = pd.merge(df1, df2, on='ID', how='right')
print("\nRight Join:")
print(merged_df_right)

# Outer join: all rows from both DataFrames are included.
# Where one DataFrame has no match, its columns are filled with NaN.
merged_df_outer = pd.merge(df1, df2, on='ID', how='outer')
print("\nOuter Join:")
print(merged_df_outer)


Inner Join:
   ID    Name    City
0   2  Lokesh  London
1   3   Vamsi   Paris

Left Join:
   ID    Name    City
0   1  Nithin     NaN
1   2  Lokesh  London
2   3   Vamsi   Paris

Right Join:
   ID    Name      City
0   2  Lokesh    London
1   3   Vamsi     Paris
2   4     NaN  New York

Outer Join:
   ID    Name      City
0   1  Nithin       NaN
1   2  Lokesh    London
2   3   Vamsi     Paris
3   4     NaN  New York
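
To see which side each row of a join came from, `pd.merge` accepts `indicator=True`, which adds a `_merge` column. The following is a minimal sketch on the same two DataFrames; `left_only_ids` is an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3],
                    'Name': ['Nithin', 'Lokesh', 'Vamsi']})
df2 = pd.DataFrame({'ID': [2, 3, 4],
                    'City': ['London', 'Paris', 'New York']})

# indicator=True adds a '_merge' column: 'left_only', 'right_only', or 'both'
merged = pd.merge(df1, df2, on='ID', how='outer', indicator=True)
print(merged)

# IDs that exist only in df1
left_only_ids = merged.loc[merged['_merge'] == 'left_only', 'ID'].tolist()
print(left_only_ids)
```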


4) How to change the order of the columns of a DataFrame?

In [None]:
new_column_order = ['City', 'Name', 'Age']
df = df[new_column_order]
print(df)

       City      Name  Age
0  New York    Sasank   22
1    London   Sampath   30
2     Paris  Jaswanth   21
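
When only one column needs to move, listing every column by hand can be avoided. The sketch below builds the new order from `df.columns`; the variable `front` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Sasank', 'Sampath', 'Jaswanth'],
                   'Age': [22, 30, 21],
                   'City': ['New York', 'London', 'Paris']})

# Move one column to the front and keep the others in their existing order
front = 'City'
reordered = df[[front] + [c for c in df.columns if c != front]]
print(reordered.columns.tolist())
```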


5) How to reverse the rows of a DataFrame?

In [None]:
# Reverse the rows of the DataFrame
df_reversed = df[::-1]
print(df_reversed)

       City      Name  Age
2     Paris  Jaswanth   21
1    London   Sampath   30
0  New York    Sasank   22
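
The reversed frame above keeps its original index labels (2, 1, 0). If a fresh 0-based index is wanted, one option is to chain `reset_index(drop=True)`, as in this short sketch.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Sasank', 'Sampath', 'Jaswanth'],
                   'Age': [22, 30, 21]})

# Reverse the row order, then renumber the index from 0
df_reversed = df.iloc[::-1].reset_index(drop=True)
print(df_reversed)
```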


6) How to convert a dictionary to a DataFrame?

In [None]:
import pandas as pd

# dictionary
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 28],
        'City': ['New York', 'London', 'Paris']}

# Convert dictionary to DataFrame
df = pd.DataFrame(data)
print(df)

      Name  Age      City
0    Alice   25  New York
1      Bob   30    London
2  Charlie   28     Paris
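
Dictionaries do not always arrive as column lists. The sketch below covers two other common shapes: a list of row dictionaries, and a dictionary keyed by row label converted with `DataFrame.from_dict(..., orient='index')`. The names `records` and `by_key` are illustrative.

```python
import pandas as pd

# A list of dictionaries: each dictionary becomes one row
records = [{'Name': 'Alice', 'Age': 25},
           {'Name': 'Bob', 'Age': 30}]
df_records = pd.DataFrame(records)
print(df_records)

# A dictionary of dictionaries: with orient='index', outer keys become row labels
by_key = {'Alice': {'Age': 25, 'City': 'New York'},
          'Bob': {'Age': 30, 'City': 'London'}}
df_index = pd.DataFrame.from_dict(by_key, orient='index')
print(df_index)
```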


7) How to set a custom index on a DataFrame?

In [None]:
import pandas as pd

# Method 1: Using the index parameter during DataFrame creation
# (If you're creating the DataFrame from a dictionary or list of dictionaries)
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 28],
        'City': ['New York', 'London', 'Paris']}
custom_index = ['A', 'B', 'C']  # Your custom index values
df_with_index = pd.DataFrame(data, index=custom_index)
print(df_with_index)

      Name  Age      City
A    Alice   25  New York
B      Bob   30    London
C  Charlie   28     Paris


In [None]:
# Method 2: Using the set_index() method
# (If you already have a DataFrame)
df['ID'] = ['A', 'B', 'C']  # Add a new column with your desired index values
df = df.set_index('ID')
print(df)

       Name  Age      City
ID                        
A     Alice   25  New York
B       Bob   30    London
C   Charlie   28     Paris
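
Once a custom index is set, rows can be looked up by label, and `reset_index()` turns the index back into a regular column. A minimal sketch, using illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [25, 30, 28],
                   'ID': ['A', 'B', 'C']})

indexed = df.set_index('ID')          # 'ID' moves from a column to the index
by_label = indexed.loc['B', 'Name']   # label-based lookup on the custom index
restored = indexed.reset_index()      # 'ID' becomes an ordinary column again

print(by_label)
print(restored.columns.tolist())
```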


8) How to sort a DataFrame by multiple columns?

In [None]:
# Sort the DataFrame by 'Age' in ascending order and then by 'Name' in descending order
df_sorted = df.sort_values(['Age', 'Name'], ascending=[True, False])
print(df_sorted)

       Name  Age      City
ID                        
A     Alice   25  New York
C   Charlie   28     Paris
B       Bob   30    London
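
In the data above every age is unique, so the second sort key never comes into play. The sketch below uses a hypothetical frame with tied ages to show the effect of sorting `'Name'` in descending order within each age.

```python
import pandas as pd

# Hypothetical data with tied ages so that the second key matters
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'Dave'],
                   'Age': [25, 30, 25, 30]})

# Age ascending; within equal ages, Name descending
df_sorted = df.sort_values(['Age', 'Name'], ascending=[True, False])
print(df_sorted)
```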


9) How to apply aggregate functions to a DataFrame?

In [None]:
import pandas as pd

# Sample DataFrame
data = {'Name': ['Ariel', 'Bookie', 'Chocolate', 'Ariel', 'Bookie'],
        'Age': [25, 30, 28, 22, 35],
        'City': ['New York', 'London', 'Paris', 'New York', 'London']}
df = pd.DataFrame(data)

# Group by 'Name' and calculate the average age for each name
average_age_by_name = df.groupby('Name')['Age'].mean()
print("Average age by name:")
print(average_age_by_name)

# Count the people (rows with a non-null name) in each city
people_count_by_city = df.groupby('City')['Name'].count()
print("\nPeople count by city:")
print(people_count_by_city)

# Multiple aggregations using agg()
agg_result = df.groupby('Name').agg({'Age': ['mean', 'min', 'max'], 'City': 'nunique'})
print("\nMultiple aggregations:")
print(agg_result)


Average age by name:
Name
Ariel        23.5
Bookie       32.5
Chocolate    28.0
Name: Age, dtype: float64

People count by city:
City
London      2
New York    2
Paris       1
Name: Name, dtype: int64

Multiple aggregations:
            Age            City
           mean min max nunique
Name                           
Ariel      23.5  22  25       1
Bookie     32.5  30  35       1
Chocolate  28.0  28  28       1
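
The dictionary form of `agg()` produces two-level column headers, as the output above shows. Named aggregation is one way to get flat, descriptive column names instead. In this sketch the output names `mean_age`, `oldest`, and `cities` are illustrative choices.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ariel', 'Bookie', 'Chocolate', 'Ariel', 'Bookie'],
                   'Age': [25, 30, 28, 22, 35],
                   'City': ['New York', 'London', 'Paris', 'New York', 'London']})

# Named aggregation: output_name=(source_column, function)
summary = df.groupby('Name').agg(
    mean_age=('Age', 'mean'),
    oldest=('Age', 'max'),
    cities=('City', 'nunique'),
)
print(summary)
```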


10) How to get the statistical summary of a DataFrame?

In [None]:
import pandas as pd
# Sample DataFrame
data = {'Name': ['Ariel', 'Bookie', 'Chocolate'],
        'Age': [25, 30, 28],
        'City': ['New York', 'London', 'Paris']}
df = pd.DataFrame(data)

# Get the statistical summary of the DataFrame
summary = df.describe()
print(summary)

             Age
count   3.000000
mean   27.666667
std     2.516611
min    25.000000
25%    26.500000
50%    28.000000
75%    29.000000
max    30.000000
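
By default `describe()` summarizes only the numeric columns, which is why `'Name'` and `'City'` are missing from the output above. Passing `include='all'` is one way to cover text columns as well; statistics that do not apply to a column appear as NaN. A minimal sketch:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ariel', 'Bookie', 'Chocolate'],
                   'Age': [25, 30, 28],
                   'City': ['New York', 'London', 'Paris']})

# include='all' adds count/unique/top/freq rows for the non-numeric columns
full = df.describe(include='all')
print(full)
```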


## **Series**

1) Write a Pandas program to add, subtract, multiply, and divide two Pandas Series.
Sample Series: [2, 4, 6, 8, 10], [1, 3, 5, 7, 9]

In [None]:
import pandas as pd
ds1 = pd.Series([2, 4, 6, 8, 10])
ds2 = pd.Series([1, 3, 5, 7, 9])
ds = ds1 + ds2
print("Add two Series:")
print(ds)
print("Subtract two Series:")
ds = ds1 - ds2
print(ds)
print("Multiply two Series:")
ds = ds1 * ds2
print(ds)
print("Divide Series1 by Series2:")
ds = ds1 / ds2
print(ds)

Add two Series:
0     3
1     7
2    11
3    15
4    19
dtype: int64
Subtract two Series:
0    1
1    1
2    1
3    1
4    1
dtype: int64
Multiply two Series:
0     2
1    12
2    30
3    56
4    90
dtype: int64
Divide Series1 by Series2:
0    2.000000
1    1.333333
2    1.200000
3    1.142857
4    1.111111
dtype: float64
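
Arithmetic between Series aligns on index labels, not on position. The sketch below uses two hypothetical Series with different labels to show how an unmatched label yields NaN, and how `add(..., fill_value=0)` can treat the missing value as zero.

```python
import pandas as pd

# Hypothetical Series whose indexes only partly overlap
a = pd.Series([2, 4, 6], index=['x', 'y', 'z'])
b = pd.Series([1, 3], index=['x', 'y'])

# '+' aligns on labels; 'z' has no partner in b, so the result there is NaN
plain = a + b
print(plain)

# add() with fill_value substitutes 0 for the missing side before adding
filled = a.add(b, fill_value=0)
print(filled)
```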


2) Write a Pandas program to compare the elements of two Pandas Series.
Sample Series: [2, 4, 6, 8, 10], [1, 3, 5, 7, 10]

In [None]:
import pandas as pd
ds1 = pd.Series([2, 4, 6, 8, 10])
ds2 = pd.Series([1, 3, 5, 7, 10])
print("Series1:")
print(ds1)
print("Series2:")
print(ds2)
print("Compare the elements of the said Series:")
print("Equals:")
print(ds1 == ds2)
print("Greater than:")
print(ds1 > ds2)
print("Less than:")
print(ds1 < ds2)


Series1:
0     2
1     4
2     6
3     8
4    10
dtype: int64
Series2:
0     1
1     3
2     5
3     7
4    10
dtype: int64
Compare the elements of the said Series:
Equals:
0    False
1    False
2    False
3    False
4     True
dtype: bool
Greater than:
0     True
1     True
2     True
3     True
4    False
dtype: bool
Less than:
0    False
1    False
2    False
3    False
4    False
dtype: bool
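
The element-wise results above can be reduced to a single answer when needed. This sketch counts the matching positions and uses `Series.equals()` to test whether the two Series are identical as a whole; `matches` and `identical` are illustrative names.

```python
import pandas as pd

ds1 = pd.Series([2, 4, 6, 8, 10])
ds2 = pd.Series([1, 3, 5, 7, 10])

# True counts as 1, so summing the boolean mask counts the equal positions
matches = int((ds1 == ds2).sum())

# equals() returns one boolean for the whole Series
identical = ds1.equals(ds2)

print(matches, identical)
```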


3) Write a Pandas program to convert a dictionary to a Pandas Series.

In [None]:
import pandas as pd

d1 = {'a': 100, 'b': 200, 'c': 300, 'd': 400, 'e': 800}

new_series = pd.Series(d1)

print("Original dictionary:")
print(d1)
print("Converted series:")
print(new_series)


Original dictionary:
{'a': 100, 'b': 200, 'c': 300, 'd': 400, 'e': 800}
Converted series:
a    100
b    200
c    300
d    400
e    800
dtype: int64
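
When an explicit `index` is passed along with the dictionary, only those keys are taken, in the given order, and a key absent from the dictionary becomes NaN. In this sketch the key `'z'` is a deliberately missing, hypothetical label.

```python
import pandas as pd

d1 = {'a': 100, 'b': 200, 'c': 300, 'd': 400, 'e': 800}

# Select and reorder keys; 'z' is not in the dictionary, so its value is NaN
picked = pd.Series(d1, index=['c', 'a', 'z'])
print(picked)
```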


4) Write a Pandas program to sort a given Series.

In [None]:
import pandas as pd

s = pd.Series(['800', '152', 'John', '35.12', '40.45'])
print("Original Data Series:")
print(s)
# The values are strings, so sort_values() orders them lexicographically, not numerically
new_s = s.sort_values()
print(new_s)

Original Data Series:
0      800
1      152
2     John
3    35.12
4    40.45
dtype: object
1      152
3    35.12
4    40.45
0      800
2     John
dtype: object
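
The result above is in string order, which is why `'152'` comes before `'35.12'`. If numeric order is intended, one approach is to convert with `pd.to_numeric(..., errors='coerce')` first; values that cannot be parsed, such as `'John'`, become NaN and are placed last by default.

```python
import pandas as pd

s = pd.Series(['800', '152', 'John', '35.12', '40.45'])

# Non-numeric strings are coerced to NaN instead of raising an error
numeric = pd.to_numeric(s, errors='coerce')

# Sort numerically; NaN values go to the end by default
sorted_numeric = numeric.sort_values()
print(sorted_numeric)
```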


5) Write a Pandas program to add some data to an existing Series.

In [None]:
import pandas as pd
data_series = pd.Series(['10', '50', 'java', '80.56', '120'])
print("Original Data Series:")
print(data_series)

print("\nData Series after adding some data:")
updated_series = pd.concat([data_series, pd.Series([250, "c++"])], ignore_index=True)
print(updated_series)

Original Data Series:
0       10
1       50
2     java
3    80.56
4      120
dtype: object

Data Series after adding some data:
0       10
1       50
2     java
3    80.56
4      120
5      250
6      c++
dtype: object
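
The `ignore_index=True` argument matters here. Without it, `pd.concat` keeps each Series' own labels, which can produce duplicates. The sketch below compares the two behaviours on a shortened, illustrative version of the data.

```python
import pandas as pd

base = pd.Series(['10', '50', 'java'])
extra = pd.Series([250, 'c++'])

# Default: original labels are kept, so 0 and 1 appear twice
kept = pd.concat([base, extra])

# ignore_index=True renumbers the result from 0
renumbered = pd.concat([base, extra], ignore_index=True)

print(kept.index.tolist())
print(renumbered.index.tolist())
```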


6) Write a Pandas program to create a subset of a given Series based on a condition on its values.

In [1]:
import pandas as pd

# Sample Series
series = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Create a subset of the series where the values are greater than 5
subset = series[series > 5]

print("Original Series:")
print(series)

print("\nSubset of the Series (values greater than 5):")
print(subset)


Original Series:
0     1
1     2
2     3
3     4
4     5
5     6
6     7
7     8
8     9
9    10
dtype: int64

Subset of the Series (values greater than 5):
5     6
6     7
7     8
8     9
9    10
dtype: int64
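
Several conditions can be combined with `&` (and) and `|` (or). Each condition needs its own parentheses because these operators bind more tightly than comparisons. A short sketch on the same Series, with illustrative variable names:

```python
import pandas as pd

series = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Values greater than 3 AND even
even_above_3 = series[(series > 3) & (series % 2 == 0)]

# Values less than 3 OR greater than 8
outside = series[(series < 3) | (series > 8)]

print(even_above_3.tolist())
print(outside.tolist())
```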


7) Write a Pandas program to compute the mean and standard deviation of a given Series.

In [2]:
import pandas as pd

# Sample Series
series = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Calculate the mean of the series
mean = series.mean()

# Calculate the standard deviation of the series
std_dev = series.std()

print("Original Series:")
print(series)

print("\nMean of the Series:")
print(mean)

print("\nStandard Deviation of the Series:")
print(std_dev)


Original Series:
0     1
1     2
2     3
3     4
4     5
5     6
6     7
7     8
8     9
9    10
dtype: int64

Mean of the Series:
5.5

Standard Deviation of the Series:
3.0276503540974917
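
The value above is the sample standard deviation, since `Series.std()` divides by n - 1 by default (`ddof=1`). If the population standard deviation (dividing by n) is wanted, `ddof=0` can be passed, as in this sketch.

```python
import pandas as pd

series = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Default ddof=1: sample standard deviation (divides by n - 1)
sample_std = series.std()

# ddof=0: population standard deviation (divides by n)
population_std = series.std(ddof=0)

print(sample_std)
print(population_std)
```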
