## About
The aim here is to recreate (and improve on) the graphs produced in [the original Indicators paper](https://indicatorsproject.wordpress.com/2009/10/09/the-indicators-project-identifying-effective-learning-adoption-activity-grades-and-external-factors/), which allow LMS feature adoption to be compared longitudinally across a number of years using the Malikowski model.

The intent is to create graphs that:

1. For each Malikowski category, show the % of courses each year adopting features in that category
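
As a rough illustration of what "% of courses adopting" means here, the sketch below treats a course as adopting a category when it has at least one feature in that category. The `counts_by_year` structure and its numbers are made up for illustration; they are not the output of the `Adoption` class.

```python
def adoption_percentage(feature_counts):
    """Percentage of courses with at least one feature in a category.

    feature_counts: one integer per course, the number of features
    from the category found in that course (hypothetical input shape).
    """
    if not feature_counts:
        return 0.0
    adopters = sum(1 for count in feature_counts if count > 0)
    return 100.0 * adopters / len(feature_counts)


# Made-up per-course counts for a single category, keyed by year
counts_by_year = {
    '2012': [0, 3, 5, 0],
    '2013': [2, 0, 7, 1, 4],
}

for year, counts in counts_by_year.items():
    print(year, adoption_percentage(counts))
```

Plotting one such percentage per year, per category, gives the longitudinal comparison described above.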

In [1]:
## get the necessary modules
import plotly.offline
plotly.offline.init_notebook_mode()

from Malikowski.Adoption import Adoption
from Malikowski.AdoptionView import AdoptionView

import re

In [2]:
## Obtain Malikowski adoption data for the available years

# Define how to get the data for each year
# - list of strings to be used in SQL query matching moodle.shortnames
# - if different shortnames were used, could be used to do this comparison between disciplines
#   or other groups of courses
years = [ '%%_2012_%%','%%_2013_%%','%%_2014_%%','%%_2015_%%']

# Gather data for each year
yearAdoption = []
for year in years:
    # Strip the SQL wildcards to leave just the year, e.g. '%%_2012_%%' becomes '2012'
    title = re.sub(r'%%_|_%%', '', year)
    print('Getting model for term ' + title)

    model = Adoption(title)
    model.getCoursesShortname( year )
    
    yearAdoption.append( model)

In [3]:
## Show a box plot representation
#  - but do it one category at a time 
view = AdoptionView()

## Content

In [6]:
category = { 'content':1}
#-- Show the 2009 equivalent graph
view.listCategoryPercentage(yearAdoption,category)
#-- Show a graph that shows the number of different features of each category used by each course
view.listCategoryComparison(yearAdoption,category)

### Observations - Content

**Courses with no content features?**

In this sample, a significant number of courses appear not to use any content-based feature. This is a strange finding and needs more digging to identify whether it is due to

1. a flaw in the methodology (not picking up some content features, e.g. announcements), or
1. a problem in selecting the courses (e.g. some of these courses are test courses, not real courses)

**Huge numbers of content features**

Some courses have 200+ different content features. That implies quite a lot of content, but it also reflects what is being counted as content (e.g. labels).

To do (this could be a finding for on-going work that is mentioned in the paper):

1. Explore ways to visualise the specific breakdown of content features and show quantity

**Descriptive statistics**

The [box plots allow a level of descriptive statistics](http://www.physics.csbsju.edu/stats/box2.html) to talk about the data. This could be useful for the paper. Perhaps the table summarising the median etc. could be used.
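
A minimal sketch of the numbers behind a box plot (the five-number summary), using only the standard library. The `content_counts` values are made up for illustration. Quartile conventions differ between tools, so these values may not match the plotly box plots exactly.

```python
import statistics


def five_number_summary(values):
    """Return min, Q1, median, Q3 and max for a list of numbers."""
    ordered = sorted(values)
    # 'inclusive' treats the data as the whole population of courses
    q1, median, q3 = statistics.quantiles(ordered, n=4, method='inclusive')
    return {
        'min': ordered[0],
        'q1': q1,
        'median': median,
        'q3': q3,
        'max': ordered[-1],
    }


# Made-up content feature counts per course, including one large outlier
content_counts = [0, 2, 4, 4, 5, 7, 9, 12, 200]
summary = five_number_summary(content_counts)
print(summary)
```

A table with one such row per year would be one way to present the median and spread in the paper.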

Apparently, a [t test](http://www.physics.csbsju.edu/stats/t-test.html) can be used to check whether there is any real difference between two collections. This suggests running a t-test between two different years (or other groups of courses) to determine if they are fundamentally different.
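
A rough sketch of what such a comparison computes, using Welch's unequal-variance form of the t-test (one common variant, chosen here as an assumption) and only the standard library. The two samples are made-up feature counts, not real course data. The function returns the t statistic and degrees of freedom; turning those into a p-value needs a t-distribution table or a statistics package.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic and degrees of freedom for two samples."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variance divided by sample size, for each group
    se_a = statistics.variance(sample_a) / len(sample_a)
    se_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(sample_a) - 1) + se_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Made-up feature counts per course for two years
counts_first_year = [1, 2, 3, 4, 5]
counts_second_year = [2, 3, 4, 5, 6]
t, df = welch_t(counts_first_year, counts_second_year)
print(t, df)
```

Given how skewed the feature counts look (some courses with 200+ content features), it may be worth checking whether a t-test's assumptions hold before relying on it.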

## Communication

In [5]:
category = { 'communication':1}
#-- Show the 2009 equivalent graph
view.listCategoryPercentage(yearAdoption,category)
#-- Show a graph that shows the number of different features of each category used by each course
view.listCategoryComparison(yearAdoption,category)

### Observations - communication

**100% adoption**

Could this be the influence of the announcements forum? What if it were removed? Would content pick up, but communication drop off?
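
One way to explore that question is to recompute adoption with a named feature left out. The sketch below assumes each course is represented as a dict of feature name to count; that structure, the feature names and the numbers are all hypothetical and are not what the `Adoption` class returns.

```python
def percent_adopting(courses, exclude=()):
    """Percentage of courses using at least one feature not in `exclude`."""
    if not courses:
        return 0.0
    adopting = sum(
        1
        for features in courses
        if any(count > 0 for name, count in features.items() if name not in exclude)
    )
    return 100.0 * adopting / len(courses)


# Made-up communication features for four courses
courses = [
    {'announcements': 1, 'forum': 3},
    {'announcements': 1},
    {'announcements': 1, 'chat': 1},
    {'announcements': 1},
]

with_announcements = percent_adopting(courses)
without_announcements = percent_adopting(courses, exclude={'announcements'})
print(with_announcements, without_announcements)
```

If the real data behaves like this made-up sample, the 100% figure would mostly reflect a forum that every course gets by default.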

**Descriptive statistics**

The median number of communication features in courses is consistently 4 across the years (post 2012), i.e. 50% of courses have 4 or fewer communication features. This includes a number of courses that have 0.

To do:

1. How many courses have 0 communication features?
1. What are the 4 communication features that are common across these courses?
1. Is there any impact on student usage of communication features in courses that have larger numbers of communication features?
1. What, if any, are the commonalities/differences within and between the median and above median courses?

## Assessment

In [7]:
category = { 'assessment':1}
#-- Show the 2009 equivalent graph
view.listCategoryPercentage(yearAdoption,category)
#-- Show a graph that shows the number of different features of each category used by each course
view.listCategoryComparison(yearAdoption,category)

### Observations on assessment

**Jump in assessment in 2015**

This appears to align with the move that year from EASE (USQ specific) to the Moodle assignment submission system.

This reinforces the idea that this type of analysis (LMS specific) does not capture all of the necessary data.

**Low median**

The median number of assessment features was 2 in 2013 and 2014, but moved to 5 in 2015.

1. Were quizzes the common 2 features in 2013/2014?
1. What's the common number of assignments - 3?

## CBI

In [8]:
category = { 'cbi':1}
#-- Show the 2009 equivalent graph
view.listCategoryPercentage(yearAdoption,category)
#-- Show a graph that shows the number of different features of each category used by each course
view.listCategoryComparison(yearAdoption,category)

### Comments on CBI

**Very low use**

Moodle really only has the lesson feature that we've classed as CBI. This makes CBI more limited than other categories, at least compared to content.


**Missing box points**

The box plot rendered in Jupyter does not show the individual points. It will when exported to plot.ly.