# Lab 9 - Summarizing Attendance and Practice Attempts

In the last lab, you combined the attendance and practice quiz data into a single table. In this lab, you will transform these data into two summary tables.

1. For each class, you will create a table that summarizes the attendance of all students in that class.
2. For each class that has practice quizzes, you will create a table that summarizes the number of attempts, overall and for each module.

## Tasks

To complete this lab, work through each of the following tasks.

#### Task 1 - Attendance Summary

For each of the classes contained in `attendance_example.zip`, we want to create a table with three columns: `FirstName`, `LastName`, and `Attendance`. Here `Attendance` is the student's maximum attempt number on the attendance quiz, which corresponds to the number of days that student attended. To complete this task, do the following.

1. Write a helper function that takes a class identifier and the overall dataframe as arguments and returns the table described above.
2. Use a loop and your helper function to create the attendance table for each class, then write the contents to a csv file.

In [16]:
import pandas as pd
from dfply import *
df_columns = ['Org_Defined_ID', 'UserName', 'FirstName', 'LastName', 'Attempt_num',
       'Score', 'Out_Of', 'Attempt_Start', 'Attempt_End', 'Percent', 'Class',
       'Type', 'Module']
att_prac = pd.read_csv("./data/attendace_pratice.csv", names=df_columns, header = 0)
att_prac.head()

Unnamed: 0,Org_Defined_ID,UserName,FirstName,LastName,Attempt_num,Score,Out_Of,Attempt_Start,Attempt_End,Percent,Class,Type,Module
0,15135961,wd8670of,McKinley,Sabina,7,1,1,2019-02-05 09:07:00,2019-02-05 09:11:00,100 %,stat491,Attendance,
1,15135961,wd8670of,McKinley,Sabina,8,1,1,2019-02-07 09:01:00,2019-02-07 09:07:00,100 %,stat491,Attendance,
2,15135961,wd8670of,McKinley,Sabina,9,1,1,2019-02-14 09:00:00,2019-02-14 09:01:00,100 %,stat491,Attendance,
3,15135961,wd8670of,McKinley,Sabina,10,1,1,2019-02-21 09:08:00,2019-02-21 09:17:00,100 %,stat491,Attendance,
4,15135961,wd8670of,McKinley,Sabina,11,1,1,2019-02-26 09:04:00,2019-02-26 09:07:00,100 %,stat491,Attendance,


In [90]:
## Example
class_ex = "stat491"
(att_prac
 >> filter_by(X.Class == class_ex)
 >> filter_by(X.Type == 'Attendance')
 >> group_by(X.UserName, X.FirstName, X.LastName)
 >> summarise(Attendance = X.Attempt_num.max())
 >> drop(X.UserName)
) >> head()

Unnamed: 0,LastName,FirstName,Attendance
0,Acadia,Athenian,19
1,Erich,Hammond,16
2,Czech,Ekstrom,18
3,Arizona,McDonnell,19
4,Bogota,Dar,19


In [28]:
atten_gen = lambda class_id, df: (df
                                  >> filter_by(X.Class == class_id)
                                  >> filter_by(X.Type == 'Attendance')
                                  >> group_by(X.UserName, X.FirstName, X.LastName)
                                  >> summarise(Attendance = X.Attempt_num.max())
                                  >> drop(X.UserName))
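
For comparison, the same summary can be sketched in plain pandas. This is not the lab's `dfply` solution. The function name `attendance_table` and the toy dataframe are hypothetical, and the sketch assumes the column names used in `att_prac` above.

```python
import pandas as pd

def attendance_table(class_id, df):
    """Return one row per student with that student's highest attendance attempt number."""
    # Keep only attendance rows for the requested class.
    mask = (df["Class"] == class_id) & (df["Type"] == "Attendance")
    return (df[mask]
            # UserName is part of the key so two students with the same name stay separate.
            .groupby(["UserName", "FirstName", "LastName"], as_index=False)
            .agg(Attendance=("Attempt_num", "max"))
            .drop(columns="UserName"))

# Hypothetical toy data, not taken from attendance_example.zip.
toy = pd.DataFrame({
    "UserName":    ["aa01", "aa01", "bb02", "bb02", "bb02", "bb02", "aa01"],
    "FirstName":   ["Ann", "Ann", "Bo", "Bo", "Bo", "Bo", "Ann"],
    "LastName":    ["Lee", "Lee", "Ray", "Ray", "Ray", "Ray", "Lee"],
    "Attempt_num": [1, 2, 1, 2, 3, 9, 5],
    "Class":       ["stat491"] * 6 + ["dsci494"],
    "Type":        ["Attendance"] * 5 + ["Practice", "Attendance"],
})

toy_summary = attendance_table("stat491", toy)
print(toy_summary)
```

The practice row and the `dsci494` row are filtered out before grouping, so only the `stat491` attendance attempts contribute to each maximum.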

In [31]:
class_list = ['stat491', 'dsci494', 'stat180']
for class_ind in class_list:
    atten_gen(class_ind, att_prac).to_csv(f"./data/{class_ind}_attendance.csv", header=True, index=False)


#### Task 2 - Practice Quiz Summary

Some of the classes in `attendance_example.zip` include attempts on practice quizzes for four modules. We want to create a table for each of these classes that summarizes the practice quiz attempts. This table should contain the following columns: `FirstName`, `LastName`, `Module 1`, `Module 2`, `Module 3`, `Module 4`, and `Total Attempts`. For example, `Module 1` contains the total number of attempts each student made on the corresponding quiz, and `Total Attempts` contains the total number of attempts across all four quizzes.


1. Create a list of classes that have practice quizzes.
2. Write a helper function that takes a class identifier and the overall dataframe as arguments and returns the table described above.
3. Use a loop and your helper function to create the practice quiz table for each class, then write the contents to a csv file.

In [40]:
practice_classes = (att_prac
                    >> filter_by(X.Type == "Practice")
                    >> group_by(X.Class)
                    >> summarise(cnt = n(X.Class))
                    >> drop(X.cnt))
practice_list = list(practice_classes['Class'])
practice_list

['dsci494', 'stat491']
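
As a cross-check, the list of classes with practice quizzes can also be sketched in plain pandas. The helper name `classes_with_practice` and the toy data are hypothetical. The only assumption about the real data is that it has `Class` and `Type` columns as shown above.

```python
import pandas as pd

def classes_with_practice(df):
    """Return the sorted class identifiers that have at least one Practice row."""
    return sorted(df.loc[df["Type"] == "Practice", "Class"].unique())

# Hypothetical toy data with one class (stat180) that has no practice quizzes.
toy = pd.DataFrame({
    "Class": ["stat491", "stat180", "dsci494", "stat491"],
    "Type":  ["Practice", "Attendance", "Practice", "Attendance"],
})

toy_practice_list = classes_with_practice(toy)
print(toy_practice_list)
```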

In [85]:
from more_dfply import recode, ifelse
mod_dict = {1.0:'Module_1',
            2.0: 'Module_2',
            3.0: 'Module_3',
            4.0: 'Module_4'}

In [95]:
## Example
(att_prac
 >> filter_by(X.Class == class_ex)
 >> filter_by(X.Type == "Practice")
 >> mutate(Module = recode(X.Module, mod_dict))
 >> group_by( X.UserName, X.FirstName, X.LastName, X.Module)
 >> summarise(Attempts = X.Attempt_num.max())
 >> spread(X.Module, X.Attempts)
 >> mutate(Total_Attempts = ifelse(X.Module_1.isna(),0,X.Module_1)
           + ifelse(X.Module_2.isna(),0,X.Module_2)
           + ifelse(X.Module_3.isna(),0,X.Module_3)
           + ifelse(X.Module_4.isna(),0,X.Module_4))
 >> drop(X.UserName)
)>>head()
 

Unnamed: 0,LastName,FirstName,Module_1,Module_2,Module_3,Module_4,Total_Attempts
0,Acadia,Athenian,11.0,,,,11.0
1,Angelica,Frau,10.0,5.0,17.0,7.0,39.0
2,Arizona,McDonnell,23.0,,8.0,,31.0
3,Bogota,Dar,13.0,,,,13.0
4,Czech,Ekstrom,12.0,,8.0,7.0,27.0


In [96]:
mod_gen = lambda class_id, df: (df
                                >> filter_by(X.Class == class_id)
                                >> filter_by(X.Type == "Practice") 
                                >> mutate(Module = recode(X.Module, mod_dict))
                                >> group_by( X.UserName, X.FirstName, X.LastName, X.Module)
                                >> summarise(Attempts = X.Attempt_num.max())
                                >> spread(X.Module, X.Attempts)
                                >> mutate(Total_Attempts = ifelse(X.Module_1.isna(),0,X.Module_1)
                                                          + ifelse(X.Module_2.isna(),0,X.Module_2)
                                                          + ifelse(X.Module_3.isna(),0,X.Module_3)
                                                          + ifelse(X.Module_4.isna(),0,X.Module_4))
                                >> drop(X.UserName))
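
The long-to-wide step and the missing-module handling can also be sketched in plain pandas. This is an alternative to the `dfply` pipeline above, not a replacement for it. The names `practice_table` and `MODULE_NAMES` and the toy data are hypothetical. The sketch assumes `Module` holds the float codes 1.0 through 4.0, as in `mod_dict`.

```python
import pandas as pd

# Same mapping as mod_dict, repeated so this sketch is self-contained.
MODULE_NAMES = {1.0: "Module_1", 2.0: "Module_2", 3.0: "Module_3", 4.0: "Module_4"}

def practice_table(class_id, df):
    """Return one row per student with the max attempt number per module and their total."""
    module_cols = list(MODULE_NAMES.values())
    mask = (df["Class"] == class_id) & (df["Type"] == "Practice")
    practice = df[mask].assign(Module=lambda d: d["Module"].map(MODULE_NAMES))
    wide = (practice
            .pivot_table(index=["UserName", "FirstName", "LastName"],
                         columns="Module", values="Attempt_num", aggfunc="max")
            # Guarantee all four module columns exist, even if nobody attempted one.
            .reindex(columns=module_cols)
            .reset_index())
    wide.columns.name = None
    # sum(axis=1) skips NaN by default, so a module with no attempts counts as 0.
    wide["Total_Attempts"] = wide[module_cols].sum(axis=1)
    return wide.drop(columns="UserName")

# Hypothetical toy data, not taken from attendance_example.zip.
toy = pd.DataFrame({
    "UserName":    ["aa01", "aa01", "aa01", "bb02", "bb02", "aa01"],
    "FirstName":   ["Ann", "Ann", "Ann", "Bo", "Bo", "Ann"],
    "LastName":    ["Lee", "Lee", "Lee", "Ray", "Ray", "Lee"],
    "Attempt_num": [1, 2, 4, 5, 3, 7],
    "Class":       ["stat491"] * 6,
    "Type":        ["Practice"] * 5 + ["Attendance"],
    "Module":      [1.0, 1.0, 3.0, 1.0, 2.0, float("nan")],
})

toy_practice = practice_table("stat491", toy)
print(toy_practice)
```

Because the row sum ignores missing values, no module needs a separate `isna` guard, including `Module_1`.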

In [97]:
for class_ind in practice_list:
    mod_gen(class_ind, att_prac).to_csv(f"./data/{class_ind}_practiceAttempts.csv", header=True, index=False)
