> ### Python Class to Compute Dedication Time

>> The dedication time of a learner is the time between a login and the last activity of that login session, that is, the last activity recorded before the learner's next login event.
Moodle estimates this time by applying the concepts of session and session duration to its log entries:

1. Click: every time a user accesses a page in Moodle, a log entry is stored.
2. Session: a set of two or more consecutive clicks in which the elapsed time between each pair of consecutive clicks does not exceed an established maximum.
3. Session duration: the elapsed time between the first and the last click of the session.
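
The three definitions above can be sketched in a few lines of plain Python. This is a minimal illustration, not Moodle's own implementation: the function name `session_durations` and the default `max_gap` of 3600 seconds are assumptions made for the example, and the click times are invented.

```python
def session_durations(timestamps, max_gap=3600):
    """Return the duration (in seconds) of every session in a list of click times.

    max_gap is a hypothetical threshold: the maximum time allowed between
    two consecutive clicks of the same session.
    """
    durations = []
    clicks = sorted(timestamps)
    if not clicks:
        return durations

    start = prev = clicks[0]
    n_clicks = 1
    for t in clicks[1:]:
        if t - prev <= max_gap:
            # Still inside the current session
            n_clicks += 1
        else:
            # Gap too long: close the current session, start a new one
            if n_clicks >= 2:  # a session needs two or more clicks
                durations.append(prev - start)
            start = t
            n_clicks = 1
        prev = t

    # Close the last open session
    if n_clicks >= 2:
        durations.append(prev - start)
    return durations


# Invented click times (seconds): two sessions and one isolated click
example_clicks = [0, 100, 250, 10_000, 10_050, 50_000]
example_durations = session_durations(example_clicks, max_gap=3600)
print(example_durations)  # [250, 50]
```

The isolated click at 50 000 s contributes nothing, because a single click does not form a session under definition 2.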
            



In [1]:
%load_ext sql


In [2]:
%%sql
postgresql://postgres:postgres@localhost/moodle

In [3]:
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('postgresql://postgres:postgres@localhost/moodle')
log_df = pd.read_sql('select * from mdl_logstore_standard_log', engine)

In [19]:
log_df = log_df[log_df['userid']>=0]
logged_in = log_df[log_df.action == 'loggedin'][['userid', 'action']]
login_by_user = logged_in.groupby('userid').count().sort_values('action', ascending=False)

In [23]:
log_in_out = log_df[(log_df.action == "loggedin") | (log_df.action == "loggedout")]
log_in_out.shape

(4993, 21)

In [24]:
user_id = log_df.userid.unique()

# Total logged-in time per user in seconds (timecreated is a Unix timestamp),
# counting only sessions that end with an explicit logout
dedication_times = {}
for user in user_id:
    log_user = log_in_out[log_in_out.userid == user].sort_values('timecreated')

    d_time = 0
    is_logged_in = False
    login_time = 0
    for _, row in log_user.iterrows():
        if row.action == "loggedin":
            # A repeated login without a logout restarts the session clock
            is_logged_in = True
            login_time = row.timecreated
        elif row.action == "loggedout" and is_logged_in:
            d_time += row.timecreated - login_time
            is_logged_in = False

    dedication_times[user] = d_time
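
The loop above only counts sessions that end with an explicit `loggedout` event, so users who never log out get a dedication time of 0. As a rough sketch of the definition given at the top of this notebook (time from a login to the last activity before the next login), the same quantity could be computed with pandas as below. The function name `dedication_by_last_activity` and the toy data are assumptions for illustration; the column names `userid`, `action` and `timecreated` follow the log table used above.

```python
import pandas as pd


def dedication_by_last_activity(log):
    """Sum, per user, the time from each login to the last event before the next login."""
    log = log.sort_values(['userid', 'timecreated']).reset_index(drop=True)
    # Every login opens a new session: a per-user running count of logins
    is_login = (log['action'] == 'loggedin').astype(int)
    log['session'] = is_login.groupby(log['userid']).cumsum()
    # Events recorded before a user's first login belong to no session
    log = log[log['session'] > 0]
    # First event of a session is the login, last event is the last activity
    spans = log.groupby(['userid', 'session'])['timecreated'].agg(['min', 'max'])
    return (spans['max'] - spans['min']).groupby(level='userid').sum()


# Toy log (invented values, seconds)
demo_log = pd.DataFrame({
    'userid':      [1, 1, 1, 1, 1, 2, 2],
    'action':      ['loggedin', 'viewed', 'viewed', 'loggedin', 'viewed',
                    'viewed', 'loggedin'],
    'timecreated': [0, 50, 120, 1000, 1030, 5, 10],
})
demo_result = dedication_by_last_activity(demo_log)
print(demo_result)
```

In the toy data, user 1 has two sessions (0 to 120 and 1000 to 1030) for a total of 150 seconds, while user 2 has a login with no later activity and therefore 0 seconds. This variant applies no maximum gap between events, so long idle periods inside a session are still counted.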

In [25]:
dedication_times

{0: 0,
 2: 19113,
 3: 38607,
 1: 0,
 5: 37326,
 4: 0,
 7: 45,
 8: 0,
 9: 0,
 20: 13,
 25: 6661,
 16: 0,
 32: 0,
 15: 0,
 47: 0,
 56: 2475,
 42: 0,
 74: 403,
 65: 131,
 64: 0,
 73: 5275,
 19: 0,
 44: 0,
 61: 0,
 40: 0,
 45: 3565,
 69: 0,
 36: 0,
 59: 0,
 57: 7502,
 54: 0,
 62: 561,
 51: 0,
 38: 0,
 24: 384,
 67: 0,
 48: 83,
 60: 0,
 71: 0,
 70: 0,
 43: 0,
 46: 0,
 58: 0,
 68: 5349,
 50: 0,
 72: 0,
 75: 3494,
 55: 0,
 52: 0,
 41: 0,
 39: 0,
 77: 2087,
 80: 18436,
 102: 0,
 141: 0,
 107: 0,
 148: 0,
 222: 0,
 273: 0,
 203: 0,
 100: 2180,
 143: 1812,
 106: 0,
 145: 0,
 278: 0,
 94: 0,
 188: 1807,
 95: 0,
 180: 0,
 252: 0,
 163: 0,
 150: 0,
 169: 0,
 285: 1291,
 219: 1137,
 159: 2588,
 162: 0,
 258: 0,
 208: 0,
 127: 0,
 115: 0,
 215: 0,
 116: 0,
 250: 0,
 164: 0,
 266: 0,
 210: 11859,
 131: 5513,
 125: 0,
 147: 6143,
 268: 0,
 292: 0,
 186: 0,
 298: 0,
 221: 0,
 199: 0,
 133: 0,
 204: 0,
 84: 0,
 205: 0,
 174: 0,
 167: 188,
 124: 0,
 245: 0,
 135: 0,
 251: 0,
 96: 0,
 287: 17734,
 153: 0,


In [27]:
log_df['dedicationtime'] = log_df['userid'].map(dedication_times)

In [28]:
log_df[['userid', 'dedicationtime']].sample(10)

index,userid,dedicationtime
280862,777,0
109296,387,0
108395,118,9065
66590,99,15148
111103,331,20164
96514,165,12952
366773,155,0
409720,930,311
387468,1019,882
32426,102,0


In [17]:
def top_x(df, percent):
    """Return the top `percent` percent of rows of a DataFrame.

    Assumes `df` is already sorted in descending order by the metric of
    interest (for example login count or activity count).
    Returns a DataFrame with the user ids in that top share and their counts.
    """
    
    tot_len = df.shape[0]
    top = int((tot_len * percent)/100)
    return df.iloc[:top,]

**Rank users by login count (top 1%, 25%, etc.)**

In [18]:
login_by_user.columns = ['login_count']
top_x(login_by_user, 1)

userid,login_count
2,169
246,113
3,107
369,100
165,91
290,73


**Compute Activity Count**

In [12]:
activity_log_df = log_df[['userid', 'action']]
activity_log_by_user = pd.DataFrame(activity_log_df.groupby('userid').count().sort_values('action', ascending=False))
activity_log_by_user

userid,action
2,45023
246,13917
3,12922
917,10696
581,10533
...,...
1033,9
391,7
4,3
1,3


In [22]:
activity_log_by_user.columns = ['activity_count']
top_x(activity_log_by_user, 1)

userid,activity_count
2,45023
246,13917
3,12922
917,10696
581,10533
290,8558
347,7320
0,7257
607,6461
344,6019


In [76]:
%%sql
SELECT u.id, EXTRACT(HOUR FROM to_timestamp(u.lastaccess-log.timecreated)) AS usage,
EXTRACT(HOUR FROM to_timestamp(u.lastaccess-log.timecreated)) AS "Dedication Time (hr)"
FROM mdl_logstore_standard_log AS log
JOIN mdl_user AS u ON log.userid = u.id
LIMIT 10

 * postgresql://postgres:***@localhost/moodle
10 rows affected.


id,usage,Dedication Time (hr)
1,7.0,7.0
1,7.0,7.0
1,7.0,7.0
20,14.0,14.0
20,16.0,16.0
20,15.0,15.0
20,0.0,0.0
20,23.0,23.0
20,15.0,15.0
20,15.0,15.0


> ### Compute login and activity counts.


In [31]:
%%sql
select userid, count(action) as activity_count
    from mdl_logstore_standard_log
        
            group by userid;

 * postgresql://postgres:***@localhost/moodle
1049 rows affected.


userid,activity_count
-10,1
-1,2170
0,7257
1,3
2,45023
3,12922
4,3
5,1079
7,94
8,55


**Activity count by user and action**

In [46]:
pd.DataFrame(log_df.groupby(['userid'])['action'].value_counts())

userid,action,action (count)
1,viewed,2
1,loggedin,1
2,created,12175
2,assigned,9493
2,sent,8025
...,...,...
1049,viewed,9
1050,viewed,9
1050,failed,3
1051,viewed,9


> ### Based on the following metrics, group students as top 1%, 5%, 10%, 25%
>> Login

Number of users in the top group: top = int((total_users * percent) / 100)
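
As a sketch of how every user could be labelled with a tier in one pass, the snippet below ranks users by a count and maps the rank to the 1%, 5%, 10% and 25% groups named in the heading. The function name `assign_top_tier`, the `'rest'` label and the toy counts are assumptions for illustration; ties share the best rank, which is one possible convention among several.

```python
import pandas as pd


def assign_top_tier(counts, tiers=(1, 5, 10, 25)):
    """Label each user with the smallest top-percent tier they fall into."""
    # Rank 1 is the most active user; tied users share the best rank
    rank = counts.rank(ascending=False, method='min')
    pct = rank / len(counts) * 100
    labels = pd.Series('rest', index=counts.index)
    # Go from the widest tier to the narrowest so narrower tiers overwrite
    for t in sorted(tiers, reverse=True):
        labels[pct <= t] = f'top {t}%'
    return labels


# Toy data: 100 users whose counts are 1..100 (user 99 is the most active)
toy_counts = pd.Series(range(1, 101), index=range(100))
toy_tiers = assign_top_tier(toy_counts)
print(toy_tiers.value_counts())
```

With 100 users, the single most active user lands in the top 1%, the next four in the top 5%, the next five in the top 10%, the next fifteen in the top 25%, and the remaining 75 are labelled `'rest'`.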

In [None]:
def top_x(df, percent):
    tot_len = df.shape[0]
    top = int((tot_len * percent)/100)
    return df.iloc[:top,]

In [54]:
user_count = pd.DataFrame(log_df.groupby(['userid'])['action'].count().reset_index(name='action_count'))
