Q1. List any five functions of the pandas library with execution.

Q2. Given a Pandas DataFrame df with columns 'A', 'B', and 'C', write a Python function to re-index the
DataFrame with a new index that starts from 1 and increments by 2 for each row.

Q3. You have a Pandas DataFrame df with a column named 'Values'. Write a Python function that
iterates over the DataFrame and calculates the sum of the first three values in the 'Values' column. The
function should print the sum to the console.

Q4. Given a Pandas DataFrame df with a column 'Text', write a Python function to create a new column
'Word_Count' that contains the number of words in each row of the 'Text' column.

Q5. How are DataFrame.size and DataFrame.shape different?

Q6. Which function of pandas do we use to read an excel file?

Q7. You have a Pandas DataFrame df that contains a column named 'Email' that contains email
addresses in the format 'username@domain.com'. Write a Python function that creates a new column
'Username' in df that contains only the username part of each email address.

Q8. You have a Pandas DataFrame df with columns 'A', 'B', and 'C'. Write a Python function that selects
all rows where the value in column 'A' is greater than 5 and the value in column 'B' is less than 10. The
function should return a new DataFrame that contains only the selected rows.

Q9. Given a Pandas DataFrame df with a column 'Values', write a Python function to calculate the mean,
median, and standard deviation of the values in the 'Values' column.

Q10. Given a Pandas DataFrame df with a column 'Sales' and a column 'Date', write a Python function to
create a new column 'MovingAverage' that contains the moving average of the sales for the past 7 days
for each row in the DataFrame. The moving average should be calculated using a window of size 7 and
should include the current day.

Q11. You have a Pandas DataFrame df with a column 'Date'. Write a Python function that creates a new
column 'Weekday' in the DataFrame. The 'Weekday' column should contain the weekday name (e.g.
Monday, Tuesday) corresponding to each date in the 'Date' column.

Q12. Given a Pandas DataFrame df with a column 'Date' that contains timestamps, write a Python
function to select all rows where the date is between '2023-01-01' and '2023-01-31'.

Q13. To use the basic functions of pandas, what is the first and foremost necessary library that needs to
be imported?

# Answers

Five commonly used functions in the pandas library and what each one does:

 read_csv(): This function is used to read data from a CSV file and create a DataFrame.

 head(): This function returns the first n rows of a DataFrame.

 groupby(): This function is used for grouping data based on one or more columns and performing aggregate operations.

 fillna(): This function is used to fill missing values in a DataFrame with a specified value or method.

 to_csv(): This function is used to save a DataFrame to a CSV file.
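
The five functions listed above can be run together in one short sketch. The CSV text and the column names (city, product, sales) are made up for illustration, and io.StringIO stands in for a file on disk so the example needs no external data.

```python
import io
import pandas as pd

# Hypothetical CSV content used in place of a real file (an assumption for this sketch).
csv_text = """city,product,sales
Pune,A,10
Pune,B,
Delhi,A,7
Delhi,B,3
"""

# read_csv(): parse CSV data into a DataFrame (a file path works the same way).
df = pd.read_csv(io.StringIO(csv_text))

# head(): return the first n rows (n defaults to 5).
first_two = df.head(2)

# fillna(): replace missing values; here the missing sales figure becomes 0.
filled = df.fillna({'sales': 0})

# groupby(): group rows by city and total the sales in each group.
totals = filled.groupby('city')['sales'].sum()

# to_csv(): write the DataFrame as CSV; with no path given it returns a string.
csv_out = filled.to_csv(index=False)

print(first_two)
print(totals)
print(csv_out)
```

Passing a file name such as 'data.csv' to read_csv() and to_csv() would read from and write to disk instead of using in-memory text.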

 

In [6]:
import pandas as pd

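def reindex_dataframe(df):
    # Odd labels 1, 3, 5, ...: stop=len(df)*2 yields exactly len(df) labels.
    # Note: this replaces the index of the DataFrame that was passed in.
    df.index = pd.RangeIndex(start=1, stop=len(df)*2, step=2)
    return df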

# Example usage:
data = {'A': [10, 20, 30, 40], 'B': [50, 60, 70, 80], 'C': [90, 100, 110, 120]}
df = pd.DataFrame(data)
reindexed_df = reindex_dataframe(df)
print(reindexed_df)


    A   B    C
1  10  50   90
3  20  60  100
5  30  70  110
7  40  80  120


In [7]:
import pandas as pd

def sum_first_three_values(df):
    # Select the first three rows by position, then iterate over them to sum.
    first_three = df['Values'].iloc[:3]
    sum_values = sum(first_three)
    print("Sum of the first three values:", sum_values)

# Example usage:
data = {'Values': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)
sum_first_three_values(df)


Sum of the first three values: 60


In [8]:
import pandas as pd

def add_word_count_column(df):
    df['Word_Count'] = df['Text'].apply(lambda x: len(str(x).split()))
    return df

# Example usage:
data = {'Text': ['Hello, world!', 'Python is awesome', 'Data Science']}
df = pd.DataFrame(data)
df_with_word_count = add_word_count_column(df)
print(df_with_word_count)


                Text  Word_Count
0      Hello, world!           2
1  Python is awesome           3
2       Data Science           2


DataFrame.size and DataFrame.shape are attributes (properties), not methods, so they are accessed without parentheses. They report different information:

DataFrame.size returns the total number of elements in the DataFrame, which equals the number of rows multiplied by the number of columns.

DataFrame.shape returns a tuple (rows, columns) giving the number of rows and the number of columns in the DataFrame.

In [9]:
import pandas as pd

data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

print(df.size)
print(df.shape)


6
(3, 2)


In [None]:
import pandas as pd

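# pd.read_excel() reads an Excel sheet into a DataFrame.
# 'data.xlsx' is a placeholder path; .xlsx files need an engine such as openpyxl installed.
df = pd.read_excel('data.xlsx')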
print(df)


In [10]:
import pandas as pd

def extract_username(df):
    df['Username'] = df['Email'].str.split('@').str[0]
    return df

# Example usage:
data = {'Email': ['john.doe@example.com', 'jane.smith@example.com', 'alex.wilson@example.com']}
df = pd.DataFrame(data)
df_with_username = extract_username(df)
print(df_with_username)


                     Email     Username
0     john.doe@example.com     john.doe
1   jane.smith@example.com   jane.smith
2  alex.wilson@example.com  alex.wilson


In [11]:
import pandas as pd

def select_rows(df):
    selected_rows = df[(df['A'] > 5) & (df['B'] < 10)]
    return selected_rows

# Example usage:
data = {'A': [3, 8, 6, 2, 9], 'B': [5, 2, 9, 3, 1], 'C': [1, 7, 4, 5, 2]}
df = pd.DataFrame(data)
selected_df = select_rows(df)
print(selected_df)


   A  B  C
1  8  2  7
2  6  9  4
4  9  1  2


In [13]:
import pandas as pd

def calculate_statistics(df):
    values_column = df['Values']
    mean_value = values_column.mean()
    median_value = values_column.median()
    std_value = values_column.std()
    
    print("Mean:", mean_value)
    print("Median:", median_value)
    print("Standard Deviation:", std_value)

# Example usage:
data = {'Values': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)
calculate_statistics(df)


Mean: 30.0
Median: 30.0
Standard Deviation: 15.811388300841896


In [14]:
import pandas as pd

def calculate_moving_average(df):
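    # Assumes one row per day, sorted by 'Date': the window covers the current row
    # and the six rows before it. min_periods=1 averages over the rows available
    # so far, so the first six rows are not NaN.
    df['MovingAverage'] = df['Sales'].rolling(window=7, min_periods=1).mean()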
    return df

# Example usage:
data = {'Date': pd.date_range(start='2023-01-01', periods=10),
        'Sales': [10, 15, 12, 20, 18, 25, 22, 19, 16, 14]}
df = pd.DataFrame(data)
df_with_moving_average = calculate_moving_average(df)
print(df_with_moving_average)


        Date  Sales  MovingAverage
0 2023-01-01     10      10.000000
1 2023-01-02     15      12.500000
2 2023-01-03     12      12.333333
3 2023-01-04     20      14.250000
4 2023-01-05     18      15.000000
5 2023-01-06     25      16.666667
6 2023-01-07     22      17.428571
7 2023-01-08     19      18.714286
8 2023-01-09     16      18.857143
9 2023-01-10     14      19.142857


In [15]:
import pandas as pd

def add_weekday_column(df):
    df['Weekday'] = df['Date'].dt.strftime('%A')
    return df

# Example usage:
data = {'Date': pd.date_range(start='2023-01-01', periods=5)}
df = pd.DataFrame(data)
df_with_weekday = add_weekday_column(df)
print(df_with_weekday)


        Date    Weekday
0 2023-01-01     Sunday
1 2023-01-02     Monday
2 2023-01-03    Tuesday
3 2023-01-04  Wednesday
4 2023-01-05   Thursday


In [16]:
import pandas as pd

def select_rows_between_dates(df):
    df['Date'] = pd.to_datetime(df['Date'])
    selected_rows = df[(df['Date'] >= '2023-01-01') & (df['Date'] <= '2023-01-31')]
    return selected_rows

# Example usage:
data = {'Date': ['2023-01-01', '2023-01-15', '2023-01-31', '2023-02-10'],
        'Values': [10, 20, 30, 40]}
df = pd.DataFrame(data)
selected_df = select_rows_between_dates(df)
print(selected_df)


        Date  Values
0 2023-01-01      10
1 2023-01-15      20
2 2023-01-31      30


The first library that must be imported to use the basic functions of pandas is the pandas library itself, conventionally under the alias pd (import pandas as pd). pandas is a widely used Python library for data manipulation and analysis. It provides data structures (Series and DataFrame) and functions designed for structured data such as tables and time series.
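
As a minimal sketch of that import in practice, the snippet below uses the conventional pd alias and builds the two core data structures. The sample values and the names s and df are illustrative only.

```python
import pandas as pd

# Once pandas is imported, its core data structures are available through the alias.
s = pd.Series([1, 2, 3], name='Values')          # one-dimensional labelled array
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})    # two-dimensional table

print(type(s).__name__)   # prints: Series
print(df.shape)           # prints: (2, 2)
```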