### Accessing Canvas API Click Activity
Authored by Fernando Rodriguez<br>
Last updated Sept 4th, 2018

This notebook obtains students' daily and hourly click counts from the Canvas learning management platform using the `canvasapi` Python library.<br>
http://www.python-requests.org/en/latest/


<br>
## Step 1 - Import Libraries

In [50]:
from canvasapi import Canvas
import pandas as pd
import requests 
import csv
import re

<br><br>
## Step 2 - Access Canvas API

To access a course through the Canvas API, you need an API key.

Please visit the following page for instructions on how to obtain a Canvas API key.<br>
https://community.canvaslms.com/docs/DOC-10806-4214724194

In [72]:
# Canvas API URL
API_URL = "https://canvas.instructure.com/"
# Canvas API key
API_KEY = "#####ENTER-KEY-HERE####" # <- This is where you put in the API key. Enclose the key in '' or ""
# Initialize a new Canvas object
canvas = Canvas(API_URL, API_KEY)
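
Hard-coding the key in a notebook makes it easy to share by accident. As a minimal sketch, the key could instead be read from an environment variable. The variable name `CANVAS_API_KEY` and the helper `load_api_key` are assumptions made for illustration; neither is part of the original script or of `canvasapi`.

```python
import os

def load_api_key(env_var="CANVAS_API_KEY"):
    """Return the API key stored in an environment variable (hypothetical name)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError("Set the %s environment variable to your Canvas API key" % env_var)
    return key

# Example usage (assumes the variable was set before starting the notebook):
# API_KEY = load_api_key()
# canvas = Canvas(API_URL, API_KEY)
```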

<br>
#### Use the .get_course method to access a specific course

In [54]:
# Pass the Canvas course ID as the argument
course = canvas.get_course(1160937) # <- 1160937 is the Canvas course ID for this example

#### Use the .name attribute to get the course name

In [55]:
course.name

u'SSII 17 Chem 1C:  5 week'

#### Access Instructor(s) and TA(s) names

In [56]:
staff_list = ['teacher', 'ta', 'designer']

staff_users = course.get_users(enrollment_type = staff_list)

In [None]:
for user in staff_users:
    print (user)

#### Access Student Names and IDs

In [57]:
student_users = course.get_users(enrollment_type = ['student'])

In [None]:
for user in student_users:
    print (user)

#### Saving User Names and IDs in Separate Lists

In [58]:
# splitting up user ids

user_list = []
user_name = []

student_users = course.get_users(enrollment_type = ['student'])
print(student_users)

for user in student_users:
    userstr = str(user)
    userid_split_id = userstr.split('(', 1)[1].split(')')[0]
    userid_split_name = userstr.split(' (')[0]
    user_list.append(userid_split_id)
    user_name.append(userid_split_name)

<PaginatedList of type User>
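
The splitting above assumes each user prints as `Name (id)` and that the name itself contains no parentheses. As a hedged sketch, splitting at the last ` (` instead keeps names such as `Sam (Sammy) Lee` intact. The helper `split_user_string` and the example names are made up for illustration.

```python
def split_user_string(userstr):
    """Split a string shaped like 'Name (12345)' into (name, id_string)."""
    # rpartition splits at the LAST ' (' so parentheses inside the name survive
    name, sep, rest = userstr.rpartition(" (")
    if not sep or not rest.endswith(")"):
        raise ValueError("Unexpected user string: %r" % userstr)
    return name, rest[:-1]

print(split_user_string("Jane Doe (101)"))         # ('Jane Doe', '101')
print(split_user_string("Sam (Sammy) Lee (123)"))  # ('Sam (Sammy) Lee', '123')
```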


Check the first 10 user names

In [None]:
user_name[0:10]

Get the total number of students in the course

In [59]:
user_list[0:10]
len(user_list)

247

<br><br>
## Step 3 - Access Student Page Views

In [37]:
# Spot check: confirm that one student's clicks match the dataframe
# This student's ID is 6294748, passed as the argument to get_user_in_a_course_level_participation_data
sampledata = course.get_user_in_a_course_level_participation_data(6294748) # <- individual student ID
sampledata # yes, it checks out

{u'page_views': {u'2017-08-04T21:00:00-06:00': 8,
  u'2017-08-06T10:00:00-06:00': 9,
  u'2017-08-06T22:00:00-06:00': 13,
  u'2017-08-08T11:00:00-06:00': 7,
  u'2017-08-08T18:00:00-06:00': 9,
  u'2017-08-08T23:00:00-06:00': 4,
  u'2017-08-09T10:00:00-06:00': 5,
  u'2017-08-09T16:00:00-06:00': 3,
  u'2017-08-09T18:00:00-06:00': 25,
  u'2017-08-09T19:00:00-06:00': 10,
  u'2017-08-09T22:00:00-06:00': 1,
  u'2017-08-09T23:00:00-06:00': 16,
  u'2017-08-10T10:00:00-06:00': 8,
  u'2017-08-10T11:00:00-06:00': 31,
  u'2017-08-10T12:00:00-06:00': 25,
  u'2017-08-10T13:00:00-06:00': 72,
  u'2017-08-10T14:00:00-06:00': 81,
  u'2017-08-10T15:00:00-06:00': 38,
  u'2017-08-10T16:00:00-06:00': 88,
  u'2017-08-10T19:00:00-06:00': 26,
  u'2017-08-11T14:00:00-06:00': 15,
  u'2017-08-14T12:00:00-06:00': 4,
  u'2017-08-15T11:00:00-06:00': 4,
  u'2017-08-16T08:00:00-06:00': 4,
  u'2017-08-16T14:00:00-06:00': 4,
  u'2017-08-16T15:00:00-06:00': 6,
  u'2017-08-16T17:00:00-06:00': 4,
  u'2017-08-16T22:00:00-06:0
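
The `page_views` keys are hourly timestamps, so daily counts have to be derived by summing the hours of each date. A minimal sketch, assuming the keys always begin with a `YYYY-MM-DD` date as in the output above; the helper name `daily_totals` is an assumption, not part of `canvasapi`.

```python
from collections import defaultdict

def daily_totals(page_views):
    """Sum hourly page-view counts into one total per calendar date."""
    totals = defaultdict(int)
    for timestamp, count in page_views.items():
        # Keys look like '2017-08-06T10:00:00-06:00'; the first 10 characters are the date
        totals[timestamp[:10]] += count
    return dict(totals)

# A few entries copied from the sample output above
sample = {
    "2017-08-04T21:00:00-06:00": 8,
    "2017-08-06T10:00:00-06:00": 9,
    "2017-08-06T22:00:00-06:00": 13,
}
print(daily_totals(sample))  # {'2017-08-04': 8, '2017-08-06': 22}
```

Dates are taken as written in each timestamp, that is, in the UTC offset Canvas reports.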

### Obtaining Page Views for All Students

The script below may take 5-20 minutes to run, depending on the number of students and the length of the course. 


In [60]:
pg_list = []

for user in user_list:
    views = course.get_user_in_a_course_level_participation_data(user)
    pageviews = views['page_views']
    
    pg_list.append(pageviews)
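
Because the loop makes one request per student and can run for many minutes, a single failed request would stop it partway through. The sketch below is one possible way to keep going; it is not the author's method. It stores an empty placeholder for a failed student so that the rows of `pg_list` still line up with `user_list`, which matters later when identifiers are attached by row position. The helper `collect_page_views` and its `fetch` parameter are assumptions for illustration.

```python
def collect_page_views(user_ids, fetch):
    """Call fetch(user_id) for each ID; keep one result per ID, in order."""
    results = []
    failed = []
    for user_id in user_ids:
        try:
            results.append(fetch(user_id)["page_views"])
        except Exception as exc:  # broad on purpose in this sketch
            # Keep an empty placeholder so rows stay aligned with user_ids
            results.append({})
            failed.append((user_id, str(exc)))
    return results, failed

# Intended usage with the course object from this notebook:
# pg_list, failed = collect_page_views(
#     user_list, course.get_user_in_a_course_level_participation_data)
```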

Converting page views into a dataframe.

Note that Canvas records page views from the first-ever view of the course through the last one. The date-times may therefore start much earlier than the first day of the course and end much later than its last day.
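
If only activity during the term is of interest, each student's page views can be trimmed to a date window before building the dataframe. This is a minimal sketch under stated assumptions: the start and end dates are hypothetical (the original does not give the course dates), the counts marked as made up are illustrative, and `within_course_dates` is a helper invented here.

```python
def within_course_dates(page_views, start_date, end_date):
    """Keep entries whose date falls between start_date and end_date (inclusive)."""
    # ISO dates ('YYYY-MM-DD') sort correctly as plain strings
    return {
        timestamp: count
        for timestamp, count in page_views.items()
        if start_date <= timestamp[:10] <= end_date
    }

views = {
    "2017-05-25T11:00:00-06:00": 2,   # made-up count, before the course
    "2017-08-10T13:00:00-06:00": 72,
    "2018-07-12T22:00:00-06:00": 1,   # made-up count, long after the course
}
# Hypothetical course window
print(within_course_dates(views, "2017-08-07", "2017-09-08"))
# {'2017-08-10T13:00:00-06:00': 72}
```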

In [69]:
df_pgviews = pd.DataFrame(pg_list)
df_pgviews[0:10] # <- accessing the first 10 rows of the dataframe
# Note that NaN values mean the student did not have any clicks for that particular date and hour

Unnamed: 0,2017-05-25T11:00:00-06:00,2017-05-27T16:00:00-06:00,2017-05-29T14:00:00-06:00,2017-05-29T23:00:00-06:00,2017-05-30T13:00:00-06:00,2017-06-01T12:00:00-06:00,2017-06-01T19:00:00-06:00,2017-06-03T16:00:00-06:00,2017-06-04T14:00:00-06:00,2017-06-05T12:00:00-06:00,...,2018-07-12T22:00:00-06:00,2018-07-12T23:00:00-06:00,2018-07-17T00:00:00-06:00,2018-07-17T23:00:00-06:00,2018-07-19T16:00:00-06:00,2018-07-19T17:00:00-06:00,2018-07-19T19:00:00-06:00,2018-07-19T21:00:00-06:00,2018-07-26T17:00:00-06:00,2018-08-15T21:00:00-06:00
0,,,,,,,,,,,...,,,,,,,,,,
1,,,,,,,,,,,...,,,,,,,,,,
2,,,,,,,,,,,...,,,,,,,,,,
3,,,,,,,,,,,...,,,,,,,,,,
4,,,,,,,,,,,...,,,,,,,,,,
5,,,,,,,,,,,...,,,,,,,,,,
6,,,,,,,,,,,...,,,,,,,,,,
7,,,,,,,,,,,...,,,,,,,,,,
8,,,,,,,,,,,...,,,,,,,,,,
9,,,,,,,,,,,...,,,,,,,,,,


#### Saving to .csv file (no identifiers)

In [40]:
df_pgviews.to_csv('Page Views - NO IDS.csv')

### Adding identifiers to dataframe

In [70]:
# convert user IDs into a dataframe so they can be concatenated
userids = pd.DataFrame(user_list)
userids.columns = ["studentid"]

# convert user names into a dataframe for the same purpose
usernames = pd.DataFrame(user_name)
usernames.columns = ["name_firstlast"]

# flag duplicate names; this identifies students who share the same name
duplicatenames = usernames.duplicated(keep = False)
duplicate_names = pd.DataFrame(duplicatenames)
duplicate_names.columns = ["duplicate_names"]
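
With `keep = False`, every occurrence of a repeated name is flagged, not only the second and later ones. A plain-Python sketch of the same idea, using made-up names and a hypothetical helper `flag_duplicates`:

```python
from collections import Counter

def flag_duplicates(names):
    """Return True for every name that appears more than once in the list."""
    counts = Counter(names)
    return [counts[name] > 1 for name in names]

print(flag_duplicates(["Ana Cruz", "Ben Ito", "Ana Cruz"]))  # [True, False, True]
```

Flagging every copy makes it easier to spot rows where the name alone does not identify a student, so the `studentid` column should be used instead.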

In [71]:
# concatenating all dataframes
df_pageviews_full = pd.concat([df_pgviews, userids, usernames, duplicate_names], axis = 1)

In [None]:
# the identifiers are the final three columns
df_pageviews_full[0:10]

#### Saving to .csv file (with identifiers)

In [48]:
df_pageviews_full.to_csv('Page Views with IDs.csv')