# <center>Introduction to Curricular Simulations</center>

<center>
    <b>Gregory L. Heileman$^\dagger$, Jiacheng Zhang$^\ddagger$ and Hayden W. Free$^\bullet$</b> <br>
    $^\dagger$Department of Electrical & Computer Engineering <br>
    University of Arizona <br>
    heileman@arizona.edu <br>
    $^\ddagger$Department of Computer Science <br>
    jiachengzhang1@arizona.edu <br>
    University of Arizona<br>
    $^\bullet$Department of Computer Science <br>
    hayden.free@uky.edu <br>
    University of Kentucky
</center>

## 1. Introduction

This notebook demonstrates how to use the simulation capabilities that are included as part of the [CurricularAnalytics toolbox](https://github.com/CurricularAnalytics/CurricularAnalytics.jl). If you would like to become more familiar with the notions behind curricular analytics, we suggest you read <cite data-cite="he:18">Heileman et al. (2018)</cite>, and also examine the Introduction to the Curricular Analytics Toolbox notebook that accompanies this notebook.

The simulation capabilities include the ability to simulate the flow of students through a curriculum, towards graduation, using discrete event simulation. Specifically, a population of students attempts to complete the selected curriculum by taking courses in the order prescribed by the curriculum. At each step (semester) of the simulation, a given student enrolls in a set of courses, earning either a passing or failing grade in each. At the end of a given semester, if a student has passed all of the courses in the curriculum, they are deemed a graduate. If a student has not yet graduated, then they may stop out (according to a prescribed stop-out model), or enroll in the next set of courses available to them. One of the intended uses of these simulation capabilities is to estimate the impact that particular curricular changes or instructional improvements will have on student progress.
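
The loop described above can be sketched in a few lines of plain Julia. This is a toy illustration and not the toolbox's implementation: the function name `toy_simulation`, the single fixed pass rate, the constant per-term stop-out probability, and the rule that a student simply takes up to `load` unpassed courses each term (ignoring prerequisites and credit hours) are all assumptions made for brevity.

```julia
using Random

# Toy cohort simulation (illustrative only; not the CurricularAnalytics implementation).
# Each student must pass `n_courses` courses, attempts at most `load` courses per term,
# passes each attempt with probability `passrate`, and, if not yet graduated,
# stops out at the end of a term with probability `stopout_prob`.
function toy_simulation(rng; n_students=1000, n_courses=40, load=5,
                        passrate=0.9, stopout_prob=0.05, max_terms=12)
    graduated = 0
    stopped_out = 0
    for _ in 1:n_students
        remaining = n_courses
        for term in 1:max_terms
            attempted = min(load, remaining)
            passed = count(_ -> rand(rng) < passrate, 1:attempted)
            remaining -= passed
            if remaining == 0
                graduated += 1      # passed every course: deemed a graduate
                break
            elseif rand(rng) < stopout_prob
                stopped_out += 1    # left the cohort before finishing
                break
            end
        end
    end
    still_enrolled = n_students - graduated - stopped_out
    return (graduated=graduated, stopped_out=stopped_out, still_enrolled=still_enrolled)
end

result = toy_simulation(MersenneTwister(1))
println(result)
```

Every student ends the run in exactly one of the three states, so the three counts always sum to the cohort size.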

The simulation framework (shown below) was originally developed by <cite data-cite="hi:18">Hickman (2014)</cite>, and subsequent development has allowed it to be integrated into the CurricularAnalytics toolbox.

<img src="SimulationFramework.png" width="600">

Notice that the simulation "engine" requires three inputs: a curriculum, a model for students, and a model for student performance. The results of the simulation are returned in an object that may be viewed using the `simulation_report()` function, as demonstrated below.

In order to perform curricular simulations, first load the Curricular Analytics toolbox modules:

In [1]:
using CurricularAnalytics, CurricularVisualization

## 2. Setting up the Simulation Environment

The first thing we will do is read in a degree plan. The students in the simulation will attempt to complete the curriculum associated with the degree plan in the order prescribed by the degree plan. Specifically, in each semester each student will enroll in courses they have not yet taken, in the order specified by the degree plan, until they reach the maximum allowed number of credit hours.
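
As a rough illustration of this enrollment rule, the sketch below fills one term's schedule from an ordered plan. It is not the toolbox's `Enrollment` module: the function `select_courses`, the course names, and the credit values are hypothetical, prerequisites are ignored, and skipping a course that does not fit (rather than stopping at it) is an assumption.

```julia
# Build one term's schedule: walk the plan in order and add each course the student
# has not yet passed, as long as it still fits under the credit cap.
# Courses are (name, credit_hours) pairs; prerequisites are ignored in this sketch.
function select_courses(plan, passed, max_credits)
    schedule = String[]
    credits = 0
    for (name, hours) in plan
        name in passed && continue                  # already completed
        credits + hours > max_credits && continue   # does not fit this term
        push!(schedule, name)
        credits += hours
    end
    return schedule, credits
end

# Hypothetical degree plan and student record.
plan = [("C1", 3), ("C2", 3), ("C3", 4), ("C4", 3), ("C5", 3), ("C6", 4), ("C7", 3)]
passed = Set(["C1", "C2"])

schedule, credits = select_courses(plan, passed, 12)
println(schedule, " => ", credits, " credit hours")
```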

### 2.1 Reading the Degree Plan

The following commands read in a degree plan stored in the CSV file format, and then display a visualization of the resulting plan.

In [2]:
AE_degree_plan = read_csv("Univ_of_Arizona-Aero.csv")
visualize(AE_degree_plan, notebook=true, scale=0.8)

In [3]:
basic_metrics(AE_degree_plan)
AE_degree_plan.metrics

Dict{String,Any} with 8 entries:
  "total credit hours"         => 129
  "avg. credits per term"      => 16.125
  "min. credits in a term"     => 15
  "term credit hour std. dev." => 0.927025
  "number of terms"            => 8
  "max. credits in a term"     => 18
  "min. credit term"           => 4
  "max. credit term"           => 3

In [4]:
CS_degree_plan = read_csv("Univ_of_Arizona-CS.csv")
visualize(CS_degree_plan, notebook=true, scale=0.8)

In [5]:
basic_metrics(CS_degree_plan)
CS_degree_plan.metrics

Dict{String,Any} with 8 entries:
  "total credit hours"         => 122
  "avg. credits per term"      => 15.25
  "min. credits in a term"     => 12
  "term credit hour std. dev." => 1.39194
  "number of terms"            => 8
  "max. credits in a term"     => 17
  "min. credit term"           => 8
  "max. credit term"           => 2

### 2.2 Creating the Student Cohort

The following commands create an initial cohort of `n` students using a simple enrollment model. Specifically, with this simple model, all students are assumed equally likely (or unlikely) to pass a given class according to the course pass rate probability model.

In [6]:
enrollment_model = Enrollment  # use the Enrollment module to determine if/when student may enroll in a course
stopouts = true  # assume that student may stop out of the cohort
n = 1000   # student cohort size will be 1000
students = simple_students(n);  # create a student cohort

### 2.3 The Course Performance Model

Next we will set the model that will be used to determine whether or not a student passes a course. We will use actual pass/fail rates computed using historical data for the courses in the degree plan shown above. 

In [7]:
performance_model = PassRate
course_passrate = 0.9  # use if a course is not contained in the CSV file
real_passrate = true  # use the actual pass rates, rather than course_passrate for all courses
set_passrates_from_csv(AE_degree_plan.curriculum.courses, "./Student_Grades_sp17_to_fall19.csv", course_passrate)
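
To make the role of the fallback value concrete, here is a minimal sketch of a pass-rate lookup with a default, followed by a single pass/fail draw. It does not reproduce `set_passrates_from_csv` or the `PassRate` module; the dictionary contents, course names, and helper functions `lookup_passrate` and `passes` are hypothetical.

```julia
using Random

# Hypothetical historical pass rates keyed by course (not taken from the grades file).
passrates = Dict("COURSE A" => 0.72, "COURSE B" => 0.68)
default_passrate = 0.9   # fallback for courses with no historical data

# Look up a course's pass rate, falling back to the default when it is missing.
lookup_passrate(course) = get(passrates, course, default_passrate)

# A pass/fail outcome is one Bernoulli draw against the course pass rate.
passes(rng, course) = rand(rng) < lookup_passrate(course)

rng = MersenneTwister(42)
n = 10_000
observed = count(_ -> passes(rng, "COURSE A"), 1:n) / n
println("COURSE A: rate = ", lookup_passrate("COURSE A"), ", observed = ", observed)
println("COURSE Z (no data): rate = ", lookup_passrate("COURSE Z"))
```

Over many draws the observed pass fraction settles near the stored rate, while any course missing from the table behaves as a 90% course.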

Note: A more realistic model for predicting student performance could be used here. Specifically, a more realistic model might:

- take student demographics into account, including the major they are in,
- take prior grades into account when predicting future grades,
- take into account factors that influence student stopout, e.g., academic standing, GPA, unmet need, etc.

Learning the model parameters using actual student data would improve the fidelity of the simulation.

### 2.4 Setting the Simulation Parameters

In [8]:
max_credits = 18  # the maximum number of credit hours a student may enroll in during a semester
duration_lock = false # rather than simulating until no students are left in the cohort, run for a fixed number of terms
num_terms = 12  # the maximum number of terms in the simulation
course_attempt_limit = 2;  # number of times a student may attempt a course

### 2.5 Running the Simulation 
The `simulate` function is used to execute the simulation. Depending upon how many students are in the cohort, this may take some time to run.

In [9]:
simulation = simulate(AE_degree_plan, course_attempt_limit, students,
                      max_credits = max_credits,
                      performance_model = performance_model,
                      enrollment_model = enrollment_model,
                      duration = num_terms,
                      duration_lock = duration_lock,
                      stopouts = stopouts);

In order to view the results of the simulation, use the `simulation_report` function:

In [10]:
simulation_report(simulation, num_terms, course_passrate, max_credits, real_passrate)


[0m[1m------------ Simulation Report ------------[22m
Aerospace Engineering, BS -- 2019-20 Degree Plan

-------- Simulation Statistics --------
Number of terms: 12
Max Credits per Term: 18
Max Course Attempts: 2
Number of Students: 1000
Preset Course Pass Rates: 90.0%

-------- Graduation Statistics --------
Number of Students Graduated: 275
Graduation Rate: 27.500000000000004%
Term Graduation Rates: 
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.016, 0.184, 0.27, 0.275, 0.0]
Average time to degree: 9.290909090909091 terms

-------- Stop out Statistics --------
Number of Students Stopped Out (Stopout Model Prediction + Reached Max Attempts): 725
Number of Students Reaching Max Attempts: 498
Stop-out Rate: 72.5%
Cumulative Term Stop-out Rates (including reached max course attempts students): 
[0.079, 0.344, 0.489, 0.595, 0.65, 0.678, 0.687, 0.709, 0.719, 0.721, 0.725, 0.0]

Cumulative Term Stop-out Rates (excluding reaching max course attempts students): 
[0.079, 0.183, 0.189, 0.215, 0.23, 

│ 25  │ 0.0%  │ 9.3%  │ 26.6% │ 32.5% │ 33.5% │ 33.5% │ 33.5%  │ 33.5%  │
│ 26  │ 6.3%  │ 17.0% │ 37.2% │ 37.8% │ 37.8% │ 37.8% │ 37.8%  │ 37.8%  │
│ 27  │ 4.0%  │ 7.2%  │ 32.7% │ 36.4% │ 36.4% │ 36.4% │ 36.4%  │ 36.4%  │
│ 28  │ 97.4% │ 97.4% │ 97.5% │ 97.6% │ 97.6% │ 97.6% │ 97.6%  │ 97.6%  │
│ 29  │ 0.0%  │ 0.0%  │ 7.9%  │ 24.3% │ 31.1% │ 32.0% │ 32.1%  │ 32.1%  │
│ 30  │ 0.0%  │ 0.0%  │ 15.7% │ 30.4% │ 32.3% │ 32.4% │ 32.4%  │ 32.4%  │
│ 31  │ 0.0%  │ 0.0%  │ 8.3%  │ 29.5% │ 31.7% │ 31.9% │ 31.9%  │ 31.9%  │
│ 32  │ 0.0%  │ 0.0%  │ 5.3%  │ 28.3% │ 32.1% │ 32.1% │ 32.1%  │ 32.1%  │
│ 33  │ 99.8% │ 99.8% │ 99.8% │ 99.8% │ 99.8% │ 99.8% │ 99.8%  │ 99.8%  │
│ 34  │ 1.0%  │ 3.6%  │ 6.9%  │ 24.6% │ 31.5% │ 31.6% │ 31.6%  │ 31.6%  │
│ 35  │ 0.0%  │ 0.0%  │ 0.0%  │ 7.9%  │ 28.4% │ 31.0% │ 31.2%  │ 31.2%  │
│ 36  │ 0.0%  │ 0.0%  │ 0.0%  │ 6.6%  │ 26.4% │ 30.5% │ 30.9%  │ 30.9%  │
│ 37  │ 0.0%  │ 0.0%  │ 0.0%  │ 6.3%  │ 28.2% │ 31.0% │ 31.2%  │ 31.2%  │
│ 38  │ 0.0%  │ 0.0%  │ 0.0%  │ 10.7% 
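
The "Term Graduation Rates" in the report appear to be cumulative fractions of the cohort. Assuming that reading, and treating a trailing 0.0 as a term in which no one graduated, the reported average time to degree can be recovered from the list. The helper `average_time_to_degree` below is a hypothetical name used only for this check.

```julia
# Recover average time to degree from cumulative per-term graduation rates.
# Assumes the rates are cumulative fractions of the cohort and that a drop back
# to 0.0 simply means no new graduates in that term.
function average_time_to_degree(cumulative_rates, n_students)
    counts = round.(Int, cumulative_rates .* n_students)
    graduates = 0      # running total of graduates seen so far
    term_sum = 0       # sum of graduation terms over all graduates
    for (term, c) in enumerate(counts)
        if c > graduates
            term_sum += term * (c - graduates)
            graduates = c
        end
    end
    return graduates == 0 ? NaN : term_sum / graduates
end

# Values copied from the Aerospace Engineering report above.
ae_rates = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.016, 0.184, 0.27, 0.275, 0.0]
ae_average = average_time_to_degree(ae_rates, 1000)
println("Average time to degree: ", round(ae_average, digits=4), " terms")
```

For these values the computation is (8·16 + 9·168 + 10·86 + 11·5) / 275 = 2555 / 275 ≈ 9.29 terms, which agrees with the figure in the report.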

Now let's run the same set of students through the Computer Science curriculum:

In [11]:
set_passrates_from_csv(CS_degree_plan.curriculum.courses, "./Student_Grades_sp17_to_fall19.csv", course_passrate)
#students = simple_students(n);  # create a student cohort
simulation = simulate(CS_degree_plan, course_attempt_limit, students,
                      max_credits = max_credits,
                      performance_model = performance_model,
                      enrollment_model = enrollment_model,
                      duration = num_terms,
                      duration_lock = duration_lock,
                      stopouts = stopouts);
simulation_report(simulation, num_terms, course_passrate, max_credits, real_passrate)


[0m[1m------------ Simulation Report ------------[22m
Computer Science, BS -- 2019-20 Degree Plan

-------- Simulation Statistics --------
Number of terms: 12
Max Credits per Term: 18
Max Course Attempts: 2
Number of Students: 1000
Preset Course Pass Rates: 90.0%

-------- Graduation Statistics --------
Number of Students Graduated: 262
Graduation Rate: 26.200000000000003%
Term Graduation Rates: 
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.103, 0.238, 0.262, 0.0, 0.0]
Average time to degree: 8.698473282442748 terms

-------- Stop out Statistics --------
Number of Students Stopped Out (Stopout Model Prediction + Reached Max Attempts): 738
Number of Students Reaching Max Attempts: 509
Stop-out Rate: 73.8%
Cumulative Term Stop-out Rates (including reached max course attempts students): 
[0.09, 0.376, 0.462, 0.579, 0.643, 0.684, 0.71, 0.725, 0.736, 0.738, 0.0, 0.0]

Cumulative Term Stop-out Rates (excluding reaching max course attempts students): 
[0.09, 0.177, 0.202, 0.223, 0.229, 0.227, 0.

│ 33  │ 0.0%  │ 0.0%  │ 7.9%  │ 27.1% │ 29.0% │ 29.0%  │ 29.0%  │ 29.0%  │
│ 34  │ 0.0%  │ 0.0%  │ 3.3%  │ 25.1% │ 28.3% │ 28.5%  │ 28.5%  │ 28.5%  │
│ 35  │ 0.0%  │ 0.0%  │ 0.5%  │ 21.7% │ 27.7% │ 28.1%  │ 28.1%  │ 28.1%  │
│ 36  │ 0.0%  │ 0.0%  │ 0.0%  │ 19.2% │ 26.8% │ 27.8%  │ 27.8%  │ 27.8%  │
│ 37  │ 0.0%  │ 0.0%  │ 0.0%  │ 15.5% │ 26.0% │ 27.3%  │ 27.3%  │ 27.3%  │
│ 38  │ 0.0%  │ 0.0%  │ 0.0%  │ 10.3% │ 23.8% │ 26.2%  │ 0.0%   │ 0.0%   │

## 3. What-if Analyses

These simulation capabilities allow us to conduct what-if analyses around the impact that changes to curricular structure or instructional improvements will have on student success.  First, let's consider how changing the number of allowed attempts (from 2 to 3) would impact graduation rates in both of these programs.

In [12]:
course_attempt_limit = 3
simulation = simulate(AE_degree_plan, course_attempt_limit, students,
                      max_credits = max_credits,
                      performance_model = performance_model,
                      enrollment_model = enrollment_model,
                      duration = num_terms,
                      duration_lock = duration_lock,
                      stopouts = stopouts);
simulation_report(simulation, num_terms, course_passrate, max_credits, real_passrate)


[0m[1m------------ Simulation Report ------------[22m
Aerospace Engineering, BS -- 2019-20 Degree Plan

-------- Simulation Statistics --------
Number of terms: 12
Max Credits per Term: 18
Max Course Attempts: 3
Number of Students: 1000
Preset Course Pass Rates: 90.0%

-------- Graduation Statistics --------
Number of Students Graduated: 565
Graduation Rate: 56.49999999999999%
Term Graduation Rates: 
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.017, 0.242, 0.471, 0.557, 0.565]
Average time to degree: 9.72212389380531 terms

-------- Stop out Statistics --------
Number of Students Stopped Out (Stopout Model Prediction + Reached Max Attempts): 434
Number of Students Reaching Max Attempts: 122
Stop-out Rate: 43.4%
Cumulative Term Stop-out Rates (including reached max course attempts students): 
[0.086, 0.201, 0.251, 0.345, 0.387, 0.407, 0.422, 0.425, 0.431, 0.432, 0.434, 0.434]

Cumulative Term Stop-out Rates (excluding reaching max course attempts students): 
[0.086, 0.201, 0.23, 0.28, 0.29

│ 4   │ 99.6% │ 99.6% │ 99.6% │ 99.6% │ 99.6% │ 99.6% │ 99.6%  │ 99.6%  │
│ 5   │ 97.7% │ 97.7% │ 97.7% │ 97.7% │ 97.7% │ 97.7% │ 97.7%  │ 97.7%  │
│ 6   │ 83.5% │ 83.8% │ 83.8% │ 83.8% │ 83.8% │ 83.8% │ 83.8%  │ 83.8%  │
│ 7   │ 86.1% │ 87.0% │ 87.0% │ 87.0% │ 87.0% │ 87.0% │ 87.0%  │ 87.0%  │
│ 8   │ 84.2% │ 84.9% │ 85.1% │ 85.1% │ 85.1% │ 85.1% │ 85.1%  │ 85.1%  │
│ 9   │ 87.9% │ 88.0% │ 88.0% │ 88.0% │ 88.0% │ 88.0% │ 88.0%  │ 88.0%  │
│ 10  │ 85.5% │ 85.8% │ 85.8% │ 85.8% │ 85.8% │ 85.8% │ 85.8%  │ 85.8%  │
│ 11  │ 84.2% │ 84.9% │ 84.9% │ 84.9% │ 84.9% │ 84.9% │ 84.9%  │ 84.9%  │
│ 12  │ 56.5% │ 69.5% │ 71.6% │ 72.1% │ 72.1% │ 72.1% │ 72.1%  │ 72.1%  │
│ 13  │ 59.8% │ 69.7% │ 71.7% │ 72.1% │ 72.1% │ 72.1% │ 72.1%  │ 72.1%  │
│ 14  │ 54.2% │ 66.5% │ 70.0% │ 70.7% │ 70.9% │ 70.9% │ 70.9%  │ 70.9%  │
│ 15  │ 70.1% │ 75.6% │ 76.5% │ 76.6% │ 76.6% │ 76.6% │ 76.6%  │ 76.6%  │
│ 16  │ 73.4% │ 77.0% │ 77.0% │ 77.0% │ 77.0% │ 77.0% │ 77.0%  │ 77.0%  │
│ 17  │ 69.8% │ 75.5% │ 76.0% │ 76.1% 

In [13]:
simulation = simulate(CS_degree_plan, course_attempt_limit, students,
                      max_credits = max_credits,
                      performance_model = performance_model,
                      enrollment_model = enrollment_model,
                      duration = num_terms,
                      duration_lock = duration_lock,
                      stopouts = stopouts);
simulation_report(simulation, num_terms, course_passrate, max_credits, real_passrate)


[0m[1m------------ Simulation Report ------------[22m
Computer Science, BS -- 2019-20 Degree Plan

-------- Simulation Statistics --------
Number of terms: 12
Max Credits per Term: 18
Max Course Attempts: 3
Number of Students: 1000
Preset Course Pass Rates: 90.0%

-------- Graduation Statistics --------
Number of Students Graduated: 572
Graduation Rate: 57.199999999999996%
Term Graduation Rates: 
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.119, 0.427, 0.554, 0.571, 0.572]
Average time to degree: 9.078671328671328 terms

-------- Stop out Statistics --------
Number of Students Stopped Out (Stopout Model Prediction + Reached Max Attempts): 428
Number of Students Reaching Max Attempts: 129
Stop-out Rate: 42.8%
Cumulative Term Stop-out Rates (including reached max course attempts students): 
[0.082, 0.21, 0.286, 0.334, 0.371, 0.396, 0.411, 0.422, 0.424, 0.426, 0.427, 0.428]

Cumulative Term Stop-out Rates (excluding reaching max course attempts students): 
[0.082, 0.21, 0.234, 0.266, 0.286, 
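
One way to see why a third attempt reduces the number of students who reach the attempt limit: if attempts at a course are independent and each succeeds with pass rate p, a student exhausts a limit of k attempts with probability (1 - p)^k. The sketch below tabulates this for a few illustrative pass rates; the independence assumption and the chosen values of p are not taken from the data.

```julia
# Probability that a student exhausts the attempt limit on a single course,
# assuming independent attempts with a fixed pass rate p.
exhaust_prob(p, attempts) = (1 - p)^attempts

for p in (0.6, 0.7, 0.8, 0.9)   # illustrative pass rates, not taken from the data
    two = exhaust_prob(p, 2)
    three = exhaust_prob(p, 3)
    println("p = $p: limit 2 -> ", round(two, digits=3),
            ", limit 3 -> ", round(three, digits=3))
end
```

Each extra attempt multiplies the failure probability by (1 - p), so the effect is largest for the courses with the lowest pass rates.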

Next, notice that CSC 110 -- Intro to Computer Programming I is clearly a gateway course for the computer science program at the University of Arizona, and that it has a relatively low success rate. There are a few other courses with the CSC prefix that also have low success rates. What would happen if the instruction and instructional support were changed in a way that enabled the students taking CSC 110, CSC 120 and CSC 352 to obtain a 90% pass rate?

In [14]:
convert_ids(CS_degree_plan.curriculum)
csc110 = course(CS_degree_plan.curriculum, "CSC", "110", "Intro to Computer Programming I", "")
csc120 = course(CS_degree_plan.curriculum, "CSC", "120", "Intro to Computer Programming II", "")
csc352 = course(CS_degree_plan.curriculum, "CSC", "352", "Systems Programming & Unix", "")
csc110.passrate = csc120.passrate = csc352.passrate = 0.9;  # set all three courses to a 90% pass rate
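
Before rerunning the simulation, a back-of-the-envelope estimate can indicate the size of the effect to expect. The sketch below compares the probability of clearing three courses within the attempt limit before and after raising their pass rates to 90%. The baseline rates are hypothetical placeholders (the actual historical values live in the grades file), and attempts are assumed independent.

```julia
# Probability of passing one course within the attempt limit (independent attempts).
pass_within(p, attempts) = 1 - (1 - p)^attempts

# Probability of clearing every course in a list within the limit.
clear_all(rates, attempts) = prod(pass_within(p, attempts) for p in rates)

baseline = [0.65, 0.75, 0.70]   # hypothetical pre-intervention pass rates
improved = [0.90, 0.90, 0.90]   # the 90% target set in the cell above
attempts = 3

println("baseline: ", round(clear_all(baseline, attempts), digits=3))
println("improved: ", round(clear_all(improved, attempts), digits=3))
```

This estimate ignores stop-out and time-to-degree effects, which is why the full simulation is still needed to quantify the change in graduation rate.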

In [15]:
simulation = simulate(CS_degree_plan, course_attempt_limit, students,
                      max_credits = max_credits,
                      performance_model = performance_model,
                      enrollment_model = enrollment_model,
                      duration = num_terms,
                      duration_lock = duration_lock,
                      stopouts = stopouts);
simulation_report(simulation, num_terms, course_passrate, max_credits, real_passrate)


[0m[1m------------ Simulation Report ------------[22m
Computer Science, BS -- 2019-20 Degree Plan

-------- Simulation Statistics --------
Number of terms: 12
Max Credits per Term: 18
Max Course Attempts: 3
Number of Students: 1000
Preset Course Pass Rates: 90.0%

-------- Graduation Statistics --------
Number of Students Graduated: 628
Graduation Rate: 62.8%
Term Graduation Rates: 
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.189, 0.525, 0.621, 0.628, 0.0]
Average time to degree: 8.874203821656051 terms

-------- Stop out Statistics --------
Number of Students Stopped Out (Stopout Model Prediction + Reached Max Attempts): 372
Number of Students Reaching Max Attempts: 68
Stop-out Rate: 37.2%
Cumulative Term Stop-out Rates (including reached max course attempts students): 
[0.081, 0.203, 0.247, 0.296, 0.338, 0.351, 0.368, 0.368, 0.371, 0.372, 0.372, 0.0]

Cumulative Term Stop-out Rates (excluding reaching max course attempts students): 
[0.081, 0.203, 0.232, 0.276, 0.296, 0.297, 0.304, 0.3

# References

Heileman, G. L., Abdallah, C.T., Slim, A., and Hickman, M. (2018). Curricular analytics: A framework for quantifying the impact of curricular reforms and pedagogical innovations. www.arXiv.org, arXiv:1811.09676 [cs.CY].

Hickman, M. (2014). Development of a Curriculum Analysis and Simulation Library with Applications in Curricular Analytics. MS thesis, University of New Mexico, Albuquerque, NM.