Q1. List any five functions of the pandas library with execution.

Answer: 

1. `pandas.read_csv()`: This function is used to read CSV data files and convert them into pandas DataFrame objects.

Example:
```python
import pandas as pd
df = pd.read_csv('data.csv')
```

2. `pandas.DataFrame()`: This constructor creates a new DataFrame object from scratch, for example from a dictionary that maps column names to lists of values.

Example:
```python
import pandas as pd
df = pd.DataFrame({'Name': ['John', 'Jane', 'Mike'], 'Age': [30, 20, 25]})
```

3. `pandas.DataFrame.describe()`: This function generates descriptive statistics that summarize the central tendency, dispersion, and shape of the values in a pandas DataFrame.

Example:
```python
import pandas as pd
df = pd.DataFrame({'Name': ['John', 'Jane', 'Mike'], 'Age': [30, 20, 25]})
print(df.describe())
```

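4. `pandas.DataFrame.apply()`: This function applies a function along an axis of a DataFrame, that is, to each column (the default) or to each row. The related `Series.apply()`, used in the example below, applies a function to each value of a single column.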

Example:
```python
import pandas as pd
df = pd.DataFrame({'Name': ['John', 'Jane', 'Mike'], 'Age': [30, 20, 25]})
def add_suffix(x):
    return x + '_person'
df['Name'] = df['Name'].apply(add_suffix)
print(df)
```

5. `pandas.DataFrame.groupby()`: This function is used to group rows of a DataFrame based on the values in one or more columns.

Example:
```python
import pandas as pd
df = pd.DataFrame({'Name': ['John', 'Jane', 'Mike', 'Meg', 'Emily'], 'Age': [30, 20, 25, 35, 25], 'Gender': ['M', 'F', 'M', 'M', 'F']})
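# Select the numeric 'Age' column first; averaging the string column 'Name' raises a TypeError in pandas 2.x
grouped_df = df.groupby('Gender')['Age'].mean()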
print(grouped_df)
```
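
As a complement to item 4, here is a minimal sketch contrasting `DataFrame.apply()`, which passes whole columns or rows to the function, with `Series.apply()`, which passes individual values. The small numeric frame and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical numeric data, used only to show what each call receives
df = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]})

# axis=0 (default): the lambda receives each column as a Series
col_range = df.apply(lambda col: col.max() - col.min())

# axis=1: the lambda receives each row as a Series
row_sum = df.apply(lambda row: row['A'] + row['B'], axis=1)

# Series.apply: the lambda receives one value at a time
doubled = df['A'].apply(lambda x: x * 2)

print(col_range.tolist())  # [2, 20]
print(row_sum.tolist())    # [11, 22, 33]
print(doubled.tolist())    # [2, 4, 6]
```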

Q2. Given a Pandas DataFrame df with columns 'A', 'B', and 'C', write a Python function to re-index the
DataFrame with a new index that starts from 1 and increments by 2 for each row.

In [None]:
import pandas as pd

def reindex_dataframe(df):
    new_index = pd.RangeIndex(start=1, step=2, stop=len(df)*2+1)
    df = df.reset_index(drop=True)
    df = df.set_index(new_index)
    return df
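
A quick check of the function above, as a minimal sketch. The three-row sample frame is made up for illustration, and the function is repeated so that the snippet runs on its own.

```python
import pandas as pd

def reindex_dataframe(df):
    # Odd labels 1, 3, 5, ... with exactly one label per row
    new_index = pd.RangeIndex(start=1, stop=2 * len(df) + 1, step=2)
    return df.reset_index(drop=True).set_index(new_index)

# Hypothetical sample data, not part of the original exercise
sample = pd.DataFrame({'A': [10, 20, 30], 'B': [1, 2, 3], 'C': [7, 8, 9]})
result = reindex_dataframe(sample)
print(result.index.tolist())  # [1, 3, 5]
```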

Q3. You have a Pandas DataFrame df with a column named 'Values'. Write a Python function that
iterates over the DataFrame and calculates the sum of the first three values in the 'Values' column. The
function should print the sum to the console.

For example, if the 'Values' column of df contains the values [10, 20, 30, 40, 50], your function should
calculate and print the sum of the first three values, which is 60.

In [None]:
import pandas as pd

def sum_first_three(df):
    sum_of_first_three = df['Values'][:3].sum()
    print("Sum of first three values in 'Values' column:", sum_of_first_three)
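
The question asks for a function that iterates, so here is one possible loop-based variant as a sketch. The name `sum_first_three_iter` and the returned total are additions for illustration; the sample values are the ones given in the question.

```python
import pandas as pd

def sum_first_three_iter(df):
    total = 0
    # head(3) limits the loop to at most the first three rows
    for value in df['Values'].head(3):
        total += value
    print("Sum of first three values in 'Values' column:", total)
    return total

sample = pd.DataFrame({'Values': [10, 20, 30, 40, 50]})
total = sum_first_three_iter(sample)  # prints the sum, 60
```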

Q4. Given a Pandas DataFrame df with a column 'Text', write a Python function to create a new column
'Word_Count' that contains the number of words in each row of the 'Text' column.

In [None]:
import pandas as pd

def word_count(df):
    df['Word_Count'] = df['Text'].apply(lambda x: len(x.split()))
    return df

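# We can then call this function, passing a DataFrame with a 'Text' column as the argument
# (the two sample sentences below are for illustration only):
df = pd.DataFrame({'Text': ['hello world', 'pandas makes data analysis easier']})
df_word_count = word_count(df)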

Q5. How are DataFrame.size() and DataFrame.shape() different?

Answer: 

`DataFrame.size` and `DataFrame.shape` are attributes, not methods, so they are accessed without parentheses. `DataFrame.size` returns the total number of elements in a DataFrame (i.e., the product of the number of rows and columns), while `DataFrame.shape` returns a tuple containing the number of rows and columns in the DataFrame. In other words, `DataFrame.size` is a single integer while `DataFrame.shape` is a tuple of integers.
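
A minimal sketch of the difference, using a made-up frame with three rows and two columns:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Both are attributes, so no parentheses are used
print(df.shape)  # (3, 2)
print(df.size)   # 6
```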

Q6. Which function of pandas do we use to read an excel file?

Answer: 

To read an Excel file, we use the `read_excel()` function in Pandas. This function can accept various parameters to customize the import process, such as the sheet name or index, header rows, columns to read, etc. 

Here is an example:

```python
import pandas as pd

# Read the Excel file
data = pd.read_excel('example.xlsx', sheet_name='Sheet1')

# Print the first few rows
print(data.head())
```

Q7. You have a Pandas DataFrame df that contains a column named 'Email' that contains email
addresses in the format 'username@domain.com'. Write a Python function that creates a new column
'Username' in df that contains only the username part of each email address.

The username is the part of the email address that appears before the '@' symbol. For example, if the
email address is 'john.doe@example.com', the 'Username' column should contain 'john.doe'. Your
function should extract the username from each email address and store it in the new 'Username'
column.

In [None]:
def extract_username(df):
    # Split each address at the first '@' and keep the part before it.
    # The vectorized .str accessor works for any index, whereas looping with
    # df.loc[i, ...] over range(len(df)) assumes the default 0..n-1 index.
    df['Username'] = df['Email'].str.split('@', n=1).str[0]
    return df

#We can call this function with your Pandas DataFrame as the argument:
new_df = extract_username(df)

Q8. You have a Pandas DataFrame df with columns 'A', 'B', and 'C'. Write a Python function that selects
all rows where the value in column 'A' is greater than 5 and the value in column 'B' is less than 10. The
function should return a new DataFrame that contains only the selected rows.
For example, if df contains the following values:<br>
  A B C <br>
0 3 5 1 <br>
1 8 2 7 <br>
2 6 9 4 <br>
3 2 3 5 <br>
4 9 1 2

Your function should select the following rows: <br>
A B C <br>
1 8 2 7 <br>
2 6 9 4 <br>
4 9 1 2 <br>
The function should return a new DataFrame that contains only the selected rows.

In [11]:
import pandas as pd

def select_rows(df):
    # Select rows where A > 5 and B < 10
    selected_rows = df[(df['A'] > 5) & (df['B'] < 10)]
    
    return selected_rows

df = pd.DataFrame({'A': [3, 8, 6, 2, 9], 'B': [5, 2, 9, 3, 1], 'C': [1, 7, 4, 5, 2]})
selected_rows = select_rows(df)

print(selected_rows)

   A  B  C
1  8  2  7
2  6  9  4
4  9  1  2


Q9. Given a Pandas DataFrame df with a column 'Values', write a Python function to calculate the mean,
median, and standard deviation of the values in the 'Values' column.

In [12]:
import pandas as pd
import numpy as np

def calculate_values_stats(df):
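    mean = np.mean(df['Values'])
    median = np.median(df['Values'])
    # np.std defaults to the population standard deviation (ddof=0);
    # pass ddof=1 for the sample standard deviation that Series.std() reports
    std_dev = np.std(df['Values'])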
    return mean, median, std_dev
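
The standard deviation has two common conventions, and NumPy and pandas pick different defaults. The sketch below, with made-up values, shows both. Which one is wanted depends on whether the column is treated as a whole population or as a sample.

```python
import numpy as np
import pandas as pd

# Hypothetical values for illustration
values = pd.Series([10, 20, 30, 40, 50])

# NumPy default: population standard deviation (divide by n, ddof=0)
population_std = np.std(values.to_numpy())

# pandas default: sample standard deviation (divide by n - 1, ddof=1)
sample_std = values.std()

print(population_std, sample_std)
```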

Q10. Given a Pandas DataFrame df with a column 'Sales' and a column 'Date', write a Python function to
create a new column 'MovingAverage' that contains the moving average of the sales for the past 7 days
for each row in the DataFrame. The moving average should be calculated using a window of size 7 and
should include the current day.

In [14]:
import pandas as pd

def create_moving_average(df):
    # Sort the DataFrame by date
    df = df.sort_values('Date')

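    # Create a rolling window over 7 rows (this assumes one row per day) and calculate
    # the moving average; min_periods=1 averages whatever is available for the first six rows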
    window = df['Sales'].rolling(window=7, min_periods=1)
    df['MovingAverage'] = window.mean()

    return df
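
A short sketch of the rolling calculation on made-up daily data. It also shows a time-based window (`'7D'` on the `'Date'` column) as an alternative that does not rely on there being exactly one row per day; treating that as the better fit for "past 7 days" is an assumption about the intended data.

```python
import pandas as pd

# Hypothetical data: eight consecutive days of sales
sample = pd.DataFrame({
    'Date': pd.date_range('2023-01-01', periods=8, freq='D'),
    'Sales': [10, 20, 30, 40, 50, 60, 70, 80],
})
sample = sample.sort_values('Date')

# Row-based window: the current row plus up to six preceding rows
sample['MovingAverage'] = sample['Sales'].rolling(window=7, min_periods=1).mean()

# Time-based window: every row within the 7 days ending at the current date
by_time = sample.rolling('7D', on='Date')['Sales'].mean()

print(sample[['Date', 'Sales', 'MovingAverage']])
```

With one row per day, both approaches give the same averages; they differ once days are missing or duplicated.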

Q11. You have a Pandas DataFrame df with a column 'Date'. Write a Python function that creates a new
column 'Weekday' in the DataFrame. The 'Weekday' column should contain the weekday name (e.g.
Monday, Tuesday) corresponding to each date in the 'Date' column.

For example, if df contains the following values:
| |Date|
|--|--|
|0 | 2023-01-01|
|1| 2023-01-02|
|2 |2023-01-03|
|3 |2023-01-04|
|4| 2023-01-05|

Your function should create the following DataFrame:

|  |Date| Weekday|
|--|--|--|
|0 |2023-01-01 |Sunday|
|1 |2023-01-02 |Monday|
|2 |2023-01-03 |Tuesday|
|3 |2023-01-04 |Wednesday|
|4 |2023-01-05| Thursday|

The function should return the modified DataFrame.

In [15]:
import pandas as pd

def add_weekday_column(df):
    # Convert 'Date' column to datetime format
    df['Date'] = pd.to_datetime(df['Date'])
    
    # Create the 'Weekday' column (dt.weekday_name was removed in pandas 1.0; dt.day_name() replaces it)
    df['Weekday'] = df['Date'].dt.day_name()
    
    return df
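
A minimal check against the first three dates from the example table, sketched with `Series.dt.day_name()`:

```python
import pandas as pd

sample = pd.DataFrame({'Date': ['2023-01-01', '2023-01-02', '2023-01-03']})
sample['Date'] = pd.to_datetime(sample['Date'])

# day_name() returns the English weekday name for each timestamp
sample['Weekday'] = sample['Date'].dt.day_name()
print(sample['Weekday'].tolist())  # ['Sunday', 'Monday', 'Tuesday']
```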

Q12. Given a Pandas DataFrame df with a column 'Date' that contains timestamps, write a Python
function to select all rows where the date is between '2023-01-01' and '2023-01-31'.

In [16]:
import pandas as pd

def select_rows_by_date(df):
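    # Half-open interval [2023-01-01, 2023-02-01): keeps every timestamp on January 31,
    # including any with fractional seconds after 23:59:59
    start_date = pd.Timestamp('2023-01-01')
    end_date = pd.Timestamp('2023-02-01')
    mask = (df['Date'] >= start_date) & (df['Date'] < end_date)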
    selected_rows = df.loc[mask]
    return selected_rows
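
The choice of end bound matters when timestamps carry sub-second precision. The sketch below uses made-up timestamps to compare an inclusive end of `23:59:59` with a half-open interval that ends just before February 1.

```python
import pandas as pd

# Hypothetical timestamps around the edges of January 2023
sample = pd.DataFrame({
    'Date': pd.to_datetime([
        '2022-12-31 23:00:00.000',
        '2023-01-01 00:00:00.000',
        '2023-01-31 23:59:59.500',
        '2023-02-01 00:00:00.000',
    ]),
})

start = pd.Timestamp('2023-01-01')

# Inclusive end at 23:59:59 drops the row at 23:59:59.500
inclusive = sample[(sample['Date'] >= start)
                   & (sample['Date'] <= pd.Timestamp('2023-01-31 23:59:59'))]

# Half-open interval keeps everything up to, but not including, February 1
half_open = sample[(sample['Date'] >= start)
                   & (sample['Date'] < pd.Timestamp('2023-02-01'))]

print(inclusive.index.tolist())  # [1]
print(half_open.index.tolist())  # [1, 2]
```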

Q13. To use the basic functions of pandas, what is the first and foremost necessary library that needs to
be imported?

Answer: 

The first and foremost necessary library that needs to be imported in order to use the basic functions of pandas is pandas itself. It is usually imported using the following command:

```python
import pandas as pd
```