Consider two Excel files recording the attendance of a workshop’s participants over two days. Each file has three
fields: ‘Name’, ‘Time of joining’, and ‘Duration’ (in minutes). Names are unique within a file, and duration
may take only one of three values (30, 40, 50). Import the data into two dataframes and do the following:
a. Merge the two dataframes to find the names of students who attended the
workshop on both days.
b. Find the names of all students who attended the workshop on either day.
c. Merge the two dataframes row-wise and find the total number of records in the resulting dataframe.
d. Merge the two dataframes and use the two columns ‘Name’ and ‘Duration’ as a multi-level row index. Generate
descriptive statistics for this multi-indexed dataframe.
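
The two constraints in the statement (unique names within a file, duration restricted to 30, 40, or 50) can be checked right after import. The following is a minimal sketch, not part of the original solution: the helper `check_attendance` and the `sample` frame are hypothetical, and the column names 'Name' and 'Duration' are assumed to match the Excel headers.

```python
import pandas as pd

# Durations permitted by the problem statement.
ALLOWED_DURATIONS = {30, 40, 50}

def check_attendance(df):
    """Return a list of problems found in one day's attendance frame (hypothetical helper)."""
    problems = []
    # Names are expected to be unique within a file.
    if df["Name"].duplicated().any():
        problems.append("duplicate names")
    # Collect any duration outside the allowed set.
    bad = df.loc[~df["Duration"].isin(ALLOWED_DURATIONS), "Duration"]
    if not bad.empty:
        problems.append(f"unexpected durations: {sorted(bad.unique().tolist())}")
    return problems

# Made-up frame with one repeated name and one disallowed duration.
sample = pd.DataFrame({"Name": ["Arun", "Manas", "Arun"], "Duration": [30, 35, 50]})
print(check_attendance(sample))
```

An empty list means the frame satisfies both constraints, so the merges that follow can rely on 'Name' being a valid key.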

In [None]:
import pandas as pd

# Import the first Excel file
df1 = pd.read_excel('file1.xlsx')

# Import the second Excel file
df2 = pd.read_excel('file2.xlsx')

# a. Inner merge on 'Name' keeps only the students present in both files
merged_df = df1.merge(df2, on='Name', how='inner')

# Get the names of students who attended the workshop on both days
students_attended_both_days = merged_df['Name'].unique()

print('Students who attended the workshop on both days:')
print(students_attended_both_days)

# b. Build the set of all students who attended on either day (union of both name columns)
all_students = set(df1['Name']).union(df2['Name'])

# Print the names of all students who attended the workshop on either of the days
print('Names of all students who attended the workshop on either of the days:')
print(all_students)

# c. Merge the two dataframes row-wise (stack df2 below df1 and renumber the rows)
merged_df = pd.concat([df1, df2], ignore_index=True)

# Get the total number of records in the merged dataframe
total_records = merged_df.shape[0]

print('Total number of records in the merged dataframe:')
print(total_records)

# d. Outer merge on the 'Name' and 'Duration' columns
merged_df = df1.merge(df2, how='outer', on=['Name', 'Duration'])

# Build a MultiIndex from 'Name' and 'Duration'; the columns are kept so describe() still reports them
merged_df.index = pd.MultiIndex.from_frame(merged_df[['Name', 'Duration']])

# Generate descriptive statistics for the multi-indexed dataframe
print(merged_df.describe(include='all'))


Students who attended the workshop on both days:
['Arun' 'Harshit' 'Gagan']
Names of all students who attended the workshop on either of the days:
{'Manas', 'Tanuj', 'Harshit', 'Gagan', 'Arun'}
Total number of records in the merged dataframe:
8
        Name Time of joining _x   Duration Time of joining _y
count      5                  4   5.000000                  4
unique     5                  4        NaN                  4
top     Arun           10:20:00        NaN           10:20:00
freq       1                  1        NaN                  1
mean     NaN                NaN  41.000000                NaN
std      NaN                NaN   6.519202                NaN
min      NaN                NaN  35.000000                NaN
25%      NaN                NaN  35.000000                NaN
50%      NaN                NaN  40.000000                NaN
75%      NaN                NaN  45.000000                NaN
max      NaN                NaN  50.000000                NaN
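
Parts a and b can also be answered from a single outer merge by asking pandas to record where each row came from. The sketch below is an alternative to the cell above, not the original solution. The names are taken from the printed output, but the joining times, the durations, and which student appears in which file are made up for illustration, and the frames `day1` and `day2` stand in for the two Excel files.

```python
import pandas as pd

# Hypothetical attendance data standing in for the two Excel files.
day1 = pd.DataFrame({
    "Name": ["Arun", "Harshit", "Gagan", "Manas"],
    "Time of joining": ["10:00", "10:05", "10:10", "10:20"],
    "Duration": [30, 40, 50, 40],
})
day2 = pd.DataFrame({
    "Name": ["Arun", "Harshit", "Gagan", "Tanuj"],
    "Time of joining": ["10:00", "10:15", "10:05", "10:20"],
    "Duration": [40, 40, 30, 50],
})

# An outer merge keeps every name; indicator=True adds a '_merge' column
# whose value is 'left_only', 'right_only', or 'both' for each row.
tagged = day1.merge(day2, on="Name", how="outer",
                    suffixes=("_day1", "_day2"), indicator=True)

# Part a: names present in both files.
both_days = sorted(tagged.loc[tagged["_merge"] == "both", "Name"])
# Part b: every name in the outer merge attended on at least one day.
either_day = sorted(tagged["Name"])

print(both_days)
print(either_day)
```

The `_merge` column also separates students who attended only the first day ('left_only') from those who attended only the second ('right_only'), which the set-based union cannot do.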
