# <center>Introduction to the Curricular Analytics Toolbox</center>

<center>
    <b>Gregory L. Heileman$^\dagger$ and Hayden W. Free$^\ddagger$</b> <br>
    $^\dagger$Department of Electrical & Computer Engineering <br>
    University of Arizona <br>
    heileman@arizona.edu <br>
    $^\ddagger$Department of Computer Science <br>
    University of Kentucky <br>
    hayden.free@uky.edu
</center>

## 1. Introduction
This notebook provides an overview of how to get up and running with the Curricular Analytics toolbox.

For an overview of the curricular analytics concepts used in this toolbox, see <cite data-cite="he:18">Heileman et al. (2018)</cite>. Extensive documentation describing the various capabilities of the toolbox is available on [GitHub Pages](https://curricularanalytics.github.io/CurricularAnalytics.jl/latest/).

In order to execute the code contained in this notebook, you must first install the Julia programming language; see https://julialang.org. Next, enter "package mode" from within the Julia REPL by typing `]`, i.e.,
```julia
julia> ]
```
Then add the following packages associated with the Curricular Analytics toolbox:
```julia
pkg> add CurricularAnalytics
pkg> add CurricularVisualization
```
This may take some time, and you may be required to add additional packages (just follow the directions that Julia provides), but you only need to do this once in order to use the toolbox.

Once these packages are added, the `using` command loads a package into your current Julia environment and makes the items defined in that package available to you. Load the CurricularAnalytics and CurricularVisualization packages using the following command:

In [23]:
using CurricularAnalytics, CurricularVisualization

## 2. Creating Curricula and Degree Plans

Now that your environment is set up, let's start by creating a simple curriculum, along with a degree plan for that curriculum. Consider a Basket Weaving curriculum consisting of the following four courses:
 - BW 101 : Introduction to Baskets, 3 credits
 - BW 101L : Introduction to Baskets Lab, 1 credit; strict co-requisite: BW 101
 - BW 111 : Basic Basket Forms, 3 credits; prerequisite: BW 101
 - BW 201 : Advanced Basketry, 3 credits; co-requisite: BW 111
 
Notice that the Introduction to Baskets Lab must be taken at the same time as the Introduction to Baskets course. That is, the course is a *strict co-requisite* for the laboratory. Notice also that the Basic Basket Forms class has the Introduction to Baskets class as a *prerequisite*, and the Advanced Basketry class has the Basic Basket Forms class as a *co-requisite*.

### 2.1 Curricula

The following commands will create this curriculum:

In [24]:
# first create the courses
c1 = Course("Introduction to Baskets", 3, prefix = "BW", num = "101")
c2 = Course("Introduction to Baskets Lab", 1, prefix = "BW", num = "101L")
c3 = Course("Basic Basket Forms", 3, prefix = "BW", num = "111")
c4 = Course("Advanced Basketry", 3, prefix = "BW", num = "201")

# next create the various requisite relationships
add_requisite!(c1, c2, strict_co)
add_requisite!(c1, c3, pre)
add_requisite!(c3, c4, co)

# now store the courses in an array
courses = [c1, c2, c3, c4];

Notice that the `add_requisite!` function has an exclamation point at the end of its name. In Julia, this convention is used to indicate that the function modifies the underlying data.
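
The same naming convention appears throughout Julia's standard library. As a small illustration that uses only Base functions (nothing from the toolbox), compare `sort`, which returns a new array, with `sort!`, which reorders its argument in place. The variable names are made up for this sketch:

```julia
# Non-mutating: sort returns a new array and leaves its argument unchanged.
credits = [3, 1, 3, 3]
sorted_credits = sort(credits)

# Mutating: sort! reorders the array it is given.
credits_in_place = [3, 1, 3, 3]
sort!(credits_in_place)

println(credits)            # still in the original order
println(sorted_credits)     # sorted copy
println(credits_in_place)   # sorted in place
```

In the same way, `add_requisite!` changes the course objects passed to it rather than returning new ones.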

Next, to create a curriculum from these courses, use:

In [25]:
curric = Curriculum("Basket Weaving Program", courses);

If you'd like to visualize this curriculum, you can use the companion visualization package, CurricularVisualization (warning: downloading it may take some time, and it's often easier to add a package from the Julia REPL):

In [26]:
visualize(curric, notebook=true, scale=0.5)

#### Verifying Curricula
In order to determine whether or not a curriculum is valid, use the `isvalid_curriculum` function, as follows:

In [27]:
errors = IOBuffer()
if isvalid_curriculum(curric, errors)
    println("Curriculum $(curric.name) is valid")
else
    println("Curriculum $(curric.name) is not valid:")
    print(String(take!(errors)))
end

Curriculum Basket Weaving Program is valid


In order to demonstrate how this function works when a curriculum is not valid, let us introduce a cycle in the curriculum graph, which makes the curriculum impossible to satisfy.

In [30]:
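add_requisite!(c4, c1, pre)
# build a second curriculum from the modified courses; it is checked below as bad_curric
bad_curric = Curriculum("Basket Weaving Program -- invalid", courses);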

Note that you can create an invalid curriculum; however, when you check for validity, errors are reported:

In [31]:
if isvalid_curriculum(bad_curric, errors)
    println("Curriculum $(curric.name) is valid")
else
    println("Curriculum $(curric.name) is not valid:")
    print(String(take!(errors)))
end

Curriculum Basket Weaving Program is not valid:
 curriculum 'Basket Weaving Program -- invalid' has requisite cycles:
(Basic Basket Forms, Advanced Basketry, Introduction to Baskets)


The prerequisite that made the curriculum invalid needs to be removed:

In [32]:
delete_requisite!(c4, c1);
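
To see why the extra prerequisite made the curriculum impossible to satisfy, the following sketch runs a depth-first search for cycles on a plain adjacency list. It is an independent illustration in Base Julia, not the toolbox's own validity check; the `requires` dictionary and the `has_cycle` function are hypothetical names introduced here.

```julia
# Requisite graph as an adjacency list: course => courses that depend on it.
# The "BW 201" entry holds the extra edge that was added above.
requires = Dict(
    "BW 101"  => ["BW 101L", "BW 111"],
    "BW 101L" => String[],
    "BW 111"  => ["BW 201"],
    "BW 201"  => ["BW 101"],
)

# Depth-first search with three states: unvisited, in progress, done.
# Reaching a vertex that is still in progress means a cycle exists.
function has_cycle(graph::Dict{String,Vector{String}})
    state = Dict(v => :unvisited for v in keys(graph))
    function visit(v)
        state[v] = :in_progress
        for w in graph[v]
            if state[w] == :in_progress
                return true
            elseif state[w] == :unvisited && visit(w)
                return true
            end
        end
        state[v] = :done
        return false
    end
    return any(state[v] == :unvisited && visit(v) for v in keys(graph))
end

# Removing the extra edge restores an acyclic graph.
fixed = copy(requires)
fixed["BW 201"] = String[]

println(has_cycle(requires))
println(has_cycle(fixed))
```

With the extra edge, the search walks BW 101, BW 111, BW 201 and then returns to BW 101, which is the cycle reported in the error message above.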

###  2.2 Degree Plans
Degree plans add the sequential time element to a curriculum. They serve as a primary benchmark for advising students on when to take a course, and as such they should be carefully studied and optimized by administrators.

To create a degree plan for the above curriculum, first decide on the number of terms; in our example, there are only three.

Next, assign the courses to their respective terms:

In [9]:
t1 = Term([c1, c2])
t2 = Term([c3])
t3 = Term([c4])
terms = [t1, t2, t3];

Now the degree plan can be created, with the desired name given as a quoted string:

In [10]:
dp1 = DegreePlan("Basket Weaving Degree Plan--3 terms", curric, terms);

The `visualize` function can also be used to visualize degree plans:

In [11]:
visualize(dp1, notebook=true, scale=0.5)

Notice how various metrics are revealed when you hover your mouse over the courses in this visualization. Here's another degree plan for the same curriculum:

In [12]:
t1 = Term([c1, c2])
t2 = Term([c3, c4])
terms = [t1, t2];
dp2 = DegreePlan("Basket Weaving Degree Plan--2 terms", curric, terms)
visualize(dp2, notebook=true, scale=0.5)

#### Verifying Degree Plans

The curricular analytics toolbox also includes functionality to test whether or not a degree plan is valid. In this case, the verification includes checking to ensure:
 - All requisites are satisfied (i.e., the prerequisites for a course occur in earlier terms than the course itself).
 - All courses in the underlying curriculum are included in the degree plan.
 - Each course in the curriculum is included at most once in the degree plan.
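
The three checks listed above can be sketched in plain Julia on a simplified data model. This is not the toolbox's implementation: the `plan_errors` function, the dictionary layout, and the error wording are hypothetical, and the timing rules are an assumption (a prerequisite in a strictly earlier term, a co-requisite in the same or an earlier term, a strict co-requisite in the same term).

```julia
# Hypothetical data model: course => list of terms in which it is scheduled.
good_plan = Dict("BW 101" => [1], "BW 101L" => [1], "BW 111" => [2], "BW 201" => [3])
bad_plan  = Dict("BW 101" => [1, 2], "BW 101L" => [2], "BW 111" => [2], "BW 201" => [1])

# (requisite, course, kind) triples for the Basket Weaving curriculum.
requisites = [("BW 101", "BW 101L", :strict_co),
              ("BW 101", "BW 111", :pre),
              ("BW 111", "BW 201", :co)]
curriculum_courses = ["BW 101", "BW 101L", "BW 111", "BW 201"]

function plan_errors(plan, requisites, curriculum_courses)
    errors = String[]
    # Every course must appear, and appear only once.
    for c in curriculum_courses
        terms = get(plan, c, Int[])
        isempty(terms) && push!(errors, "$c is missing from the plan")
        length(terms) > 1 && push!(errors, "$c is listed multiple times")
    end
    # Requisite timing, using the first scheduled term of each course.
    for (req, course, kind) in requisites
        rt, ct = get(plan, req, Int[]), get(plan, course, Int[])
        (isempty(rt) || isempty(ct)) && continue
        r, t = first(rt), first(ct)
        ok = if kind == :pre
            r < t
        elseif kind == :co
            r <= t
        else            # :strict_co
            r == t
        end
        ok || push!(errors, "$req (term $r) violates its $kind requisite for $course (term $t)")
    end
    return errors
end

foreach(println, plan_errors(bad_plan, requisites, curriculum_courses))
```

Because the sketch only looks at the first term in which a duplicated course appears, its messages can differ from those the toolbox reports for the same plan.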

The validity check for degree plans can be executed using:

In [13]:
if isvalid_degree_plan(dp1, errors)
    println("Degree Plan $(dp1.name) is valid")
else
    println("Degree Plan $(dp1.name) is not valid:")
    print(String(take!(errors)))
end

Degree Plan Basket Weaving Degree Plan--3 terms is valid


Let's create an invalid degree plan in order to see what happens:

In [14]:
t1 = Term([c1, c4])
t2 = Term([c2, c3, c1])
terms = [t1, t2];
bad_dp = DegreePlan("Basket Weaving Degree Plan--Oh No!", curric, terms)
visualize(bad_dp, notebook=true, scale=0.75)

Notice that you can also visualize invalid degree plans.  Let's now check for validity:

In [15]:
if isvalid_degree_plan(bad_dp, errors)
    println("Degree Plan $(bad_dp.name) is valid")
else
    println("Degree Plan $(bad_dp.name) is not valid:")
    print(String(take!(errors)))
end

Degree Plan Basket Weaving Degree Plan--Oh No! is not valid:

-Invalid requisite: Basic Basket Forms in term 2 is a requisite for Advanced Basketry in term 1
-Invalid prerequisite: Introduction to Baskets in term 2 is a prerequisite for Basic Basket Forms in the same term
-Course Introduction to Baskets is listed multiple times in degree plan

### 2.3 Real Curricula and Degree Plans
The "toy" curricula and degree plans shown above were easy enough to enter, but this would quickly become tedious for real curricula and degree plans. Thus, we have created a spreadsheet format that can be used to specify larger curricula and degree plans. We have provided a few examples with this notebook; let's read them in.

In [16]:
UA_Aero = read_csv("./Univ_of_Arizona-Aero.csv")
visualize(UA_Aero, notebook=true, scale=1.0)

In [17]:
if isvalid_degree_plan(UA_Aero, errors)
    println("Degree Plan $(UA_Aero.name) is valid")
else
    println("Degree Plan $(UA_Aero.name) is not valid:")
    print(String(take!(errors)))
end

Degree Plan 2019-20 Degree Plan is valid


In [18]:
Ga_Tech_EE = read_csv("./Ga_Tech-EE.csv")
visualize(Ga_Tech_EE, notebook=true, scale=1.0)

In [19]:
if isvalid_degree_plan(Ga_Tech_EE, errors)
    println("Degree Plan $(Ga_Tech_EE.name) is valid")
else
    println("Degree Plan $(Ga_Tech_EE.name) is not valid:")
    print(String(take!(errors)))
end

Degree Plan 2017-18 Plan is valid


## 3. Basic Analytics
Now that we have created curricula and their associated degree plans, and verified that they contain no errors, we can quantitatively analyze a plan and, more specifically, its courses. We begin with the delay factor.

### 3.1 Delay Factor
Many curricula, particularly those in science, technology, engineering, and math (STEM) fields, contain a set of courses that must be completed in sequential order. It is not uncommon in these programs to find prerequisite pathways consisting of seven or eight courses, which span nearly every term in any possible degree plan. The ability to successfully navigate these long pathways without delay is critical for student success and on-time graduation.

We define the **delay factor** associated with a given course $v_k$ in a curriculum $c$, denoted $d_c(v_k)$, as the number of vertices in the longest path in $G_c$ that passes through $v_k$ (Slim, 2016). 

$G_c = (V, E)$ is known as the *curriculum graph*, where each vertex $v_1, \ldots, v_n \in V$ represents a requirement (i.e., a course) in curriculum $c$. There is a directed edge $(v_i, v_j) \in E$ from requirement $v_i$ to $v_j$ if $v_i$ must be satisfied prior to the satisfaction of $v_j$.

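\begin{equation}
d_c(v_k) = \max_{i,j} \{ \#(v_i \rightsquigarrow v_k \rightsquigarrow v_j) \}
\end{equation}

Here $\#(p)$ denotes the number of vertices on path $p$, and the maximum is taken over all paths in $G_c$ that run from some $v_i$ through $v_k$ to some $v_j$.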

We define the delay factor associated with an entire curriculum $c$ as:

\begin{equation}
d(G_c)= \sum_{v_k ∈ V} d_c(v_k)
\end{equation}
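
As a minimal sketch of these two definitions, the following Base Julia code computes per-course and total delay factors for the small Basket Weaving curriculum graph from Section 2.1. It is an independent illustration, not the toolbox's `delay_factor` implementation; `sketch_delay_factors` is a hypothetical name, all three requisite types are treated as edges, and the vertices are assumed to be numbered in topological order.

```julia
# Basket Weaving curriculum graph; edge (i, j) means course i is a requisite of j.
course_names = ["BW 101", "BW 101L", "BW 111", "BW 201"]
edges = [(1, 2), (1, 3), (3, 4)]

# Assumes every edge satisfies i < j (a topological numbering), so one
# forward sweep and one backward sweep suffice to find the longest paths.
function sketch_delay_factors(n, edges)
    into  = ones(Int, n)   # vertices on the longest path ending at each course
    outof = ones(Int, n)   # vertices on the longest path starting at each course
    for (i, j) in sort(edges, by = last)
        into[j] = max(into[j], into[i] + 1)
    end
    for (i, j) in sort(edges, by = first, rev = true)
        outof[i] = max(outof[i], outof[j] + 1)
    end
    # The course itself is counted in both sweeps, so subtract one.
    return into .+ outof .- 1
end

delay = sketch_delay_factors(length(course_names), edges)
for (name, d) in zip(course_names, delay)
    println("$name: delay factor $d")
end
println("Curriculum delay factor: $(sum(delay))")
```

The longest path here is BW 101, BW 111, BW 201, so every course on it has a delay factor of 3, while BW 101L lies only on the two-course path from BW 101.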

### 3.2 Blocking Factor
Another structural factor arises when one course serves as the gateway to many other courses in the curriculum. In this case, if a student is unable to pass the gateway course, they are **blocked** from attempting many of the other courses in the curriculum.

In our example, *Calculus 1* is a foundational first-term course that must be completed before taking the other major-specific classes in subsequent terms, leading to our end goal of completing *Circuits One*. A course that is a prerequisite for a large number of other courses in a curriculum is clearly a highly important course in that curriculum.

We will denote the situation where course $v_j$ is reachable from course $v_i$, via any prerequisite pathway, using $v_i \rightsquigarrow v_j$, and $v_i \nrightarrow v_j$ will be used if course $v_j$ is not reachable from course $v_i$. The blocking factor associated with course $v_i$ in curriculum $G_c = (V, E)$, denoted $b_c(v_i)$, is then given by (Slim, 2016):

\begin{equation}
b_c(v_i)= \sum_{v_j ∈ V} I(v_i,v_j)
\end{equation}

where $I$ is the indicator function:

\begin{equation}
I(v_i, v_j) = \begin{cases}
1 & \text{if } v_i \rightsquigarrow v_j \\
0 & \text{if } v_i \nrightarrow v_j
\end{cases}
\end{equation}

We define the blocking factor associated with an entire curriculum $c$ as:

\begin{equation}
b(G_c)= \sum_{v_i ∈ V} b_c(v_i)
\end{equation}
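
The blocking factor can be sketched the same way on the Basket Weaving curriculum graph from Section 2.1. This is an illustration in Base Julia rather than the toolbox's `blocking_factor` implementation; `sketch_blocking_factors` is a hypothetical name, all three requisite types are treated as edges, and a course is assumed not to block itself.

```julia
# Basket Weaving curriculum graph; edge (i, j) means course i is a requisite of j.
course_names = ["BW 101", "BW 101L", "BW 111", "BW 201"]
edges = [(1, 2), (1, 3), (3, 4)]

# For each course, count the courses reachable from it along requisite edges.
function sketch_blocking_factors(n, edges)
    successors = [Int[] for _ in 1:n]
    for (i, j) in edges
        push!(successors[i], j)
    end
    blocking = zeros(Int, n)
    for start in 1:n
        seen = falses(n)
        stack = copy(successors[start])
        while !isempty(stack)
            v = pop!(stack)
            seen[v] && continue
            seen[v] = true
            append!(stack, successors[v])
        end
        blocking[start] = count(seen)
    end
    return blocking
end

blocking = sketch_blocking_factors(length(course_names), edges)
for (name, b) in zip(course_names, blocking)
    println("$name: blocking factor $b")
end
println("Curriculum blocking factor: $(sum(blocking))")
```

BW 101 blocks all three other courses, BW 111 blocks only BW 201, and the two remaining courses block nothing.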

### 3.3 Complexity
After computing blocking and delay factor, a unitless measure for structural complexity can be applied to any curriculum.

We keep in mind that the experimental design explicitly relates this measure to the likelihood that a student can complete a curriculum.

In order to obtain this overall *Complexity* metric, we simply add the blocking and delay factors of the entire curriculum:

\begin{equation}
\text{Complexity} = b(G_c) + d(G_c)
\end{equation}
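
As a worked example of these definitions, consider the Basket Weaving curriculum from Section 2.1, whose curriculum graph has the edges BW 101 → BW 101L, BW 101 → BW 111, and BW 111 → BW 201. The values below were counted by hand from that graph, treating all three requisite types as edges and assuming a course does not block itself; they are not output from the toolbox. Terms are listed in the order BW 101, BW 101L, BW 111, BW 201.

\begin{equation}
b(G_c) = 3 + 0 + 1 + 0 = 4
\end{equation}

\begin{equation}
d(G_c) = 3 + 2 + 3 + 3 = 11
\end{equation}

\begin{equation}
\text{Complexity} = b(G_c) + d(G_c) = 4 + 11 = 15
\end{equation}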

### 3.4 Back To Our Example
In our example, all these metrics can be computed using the following commands:

In [20]:
println("Delay factor = $(delay_factor(Ga_Tech_EE.curriculum))")
println("Blocking factor = $(blocking_factor(Ga_Tech_EE.curriculum))")
println("Curricular complexity = $(complexity(Ga_Tech_EE.curriculum))")

Delay factor = (176.0, [5.0, 8.0, 6.0, 7.0, 1.0, 2.0, 8.0, 7.0, 7.0, 1.0, 2.0, 8.0, 3.0, 6.0, 3.0, 1.0, 1.0, 7.0, 8.0, 7.0, 4.0, 1.0, 8.0, 8.0, 7.0, 7.0, 1.0, 1.0, 6.0, 1.0, 1.0, 7.0, 1.0, 1.0, 1.0, 8.0, 1.0, 1.0, 1.0, 1.0, 8.0, 1.0, 1.0, 1.0])
Blocking factor = (111, [4, 15, 10, 10, 0, 1, 9, 10, 6, 0, 0, 8, 0, 4, 2, 0, 0, 6, 7, 5, 0, 0, 3, 2, 2, 2, 0, 0, 2, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0])
Curricular complexity = (287.0, Number[9.0, 23.0, 16.0, 17.0, 1.0, 3.0, 17.0, 17.0, 13.0, 1.0, 2.0, 16.0, 3.0, 10.0, 5.0, 1.0, 1.0, 13.0, 15.0, 12.0, 4.0, 1.0, 11.0, 10.0, 9.0, 9.0, 1.0, 1.0, 8.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0, 1.0, 8.0, 1.0, 1.0, 1.0])


## Basic Metrics
More basic metrics are also available for a degree plan:

1. Total credit hours
2. Avg. credits per term
3. Number of terms
4. Max. credits in a term
5. Min. credits in a term
6. Term credit hour standard deviation
7. Max. credit term (the term carrying the most credits)
8. Min. credit term (the term carrying the fewest credits)

To compute them, run the following commands. The argument in parentheses is the variable holding your degree plan, in our case `Ga_Tech_EE`:

In [21]:
basic_metrics(Ga_Tech_EE)
Ga_Tech_EE.metrics

Dict{String,Any} with 8 entries:
  "total credit hours"         => 134
  "avg. credits per term"      => 16.75
  "min. credits in a term"     => 15
  "term credit hour std. dev." => 1.08972
  "number of terms"            => 8
  "max. credits in a term"     => 18
  "min. credit term"           => 4
  "max. credit term"           => 5

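To make the meaning of these entries concrete, the sketch below recomputes the same quantities by hand for the 3-term Basket Weaving degree plan from Section 2.2, using only the `Statistics` standard library. It is not the toolbox's `basic_metrics` implementation; `term_credits` and `sketch_metrics` are hypothetical names, and the use of the population standard deviation is an assumption.

```julia
using Statistics

# Credit hours per term: [BW 101 + BW 101L], [BW 111], [BW 201].
term_credits = [3 + 1, 3, 3]

sketch_metrics = Dict(
    "total credit hours"         => sum(term_credits),
    "number of terms"            => length(term_credits),
    "avg. credits per term"      => mean(term_credits),
    "max. credits in a term"     => maximum(term_credits),
    "min. credits in a term"     => minimum(term_credits),
    # Index of the heaviest and lightest term (first one if there is a tie).
    "max. credit term"           => argmax(term_credits),
    "min. credit term"           => argmin(term_credits),
    # Assumption: population standard deviation; the toolbox may differ.
    "term credit hour std. dev." => std(term_credits, corrected = false),
)

for (key, value) in sketch_metrics
    println(rpad(key, 28), " => ", value)
end
```

Here the first term carries the most credits (4), and the second term is reported as the lightest because it is the first of the two 3-credit terms.
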
# Conclusion 
By treating a curriculum as a formal system that exists within the larger university ecosystem, we have highlighted the fact that it can be directly and rigorously analyzed. An analytical approach to the study of curricula supports not only the ability to make predictions about how curricular changes will affect student progress, but also predictions around the likely impact of particular student success interventions on curricular progression.

We hope that, through the use of our Curricular Analytics toolbox, administrators in higher education can make changes to their curricula that ultimately improve student success outcomes.

# References

Heileman, G. L., Abdallah, C. T., Slim, A., and Hickman, M. (2018). Curricular analytics: A framework for quantifying the impact of curricular reforms and pedagogical innovations. arXiv:1811.09676 [cs.CY].

Slim, A. (2016). Curricular Analytics in Higher Education. PhD thesis, University of New Mexico, Albuquerque, NM.