# Feb 2025 Full details version - Enrolment by Program and Credential type & revenue impact

## Introduction

This notebook looks at enrolment down to the program and credential level to assess the impact of international enrolment changes on revenue, along with other trends that have emerged over the period.

Following on from the lead notebook, this notebook is a more thorough look at the data, with program- and credential-specific analysis.
- Unless explicitly stated otherwise, all program types, credential types, and other parameters are included in the data at the outset, and each is examined for significance before being discarded, as we move towards

Source:
[StatCan: Postsecondary enrolments, by detailed field of study, institution, and program and student characteristics](https://www150.statcan.gc.ca/t1/tbl1/en/cv.action?pid=3710027701)

**Important notes**

1. StatCan's classification of program types [is here](https://www23.statcan.gc.ca/imdb/p3VD.pl?Function=getVD&TVD=1252482&CVD=1252483&CLV=0&MLV=2&D=1).
    - **Graduate (second cycle) means Master's programs, or those that otherwise require a Bachelor's degree**
    - **Graduate (third cycle) is PhD**
    - Certificates and diplomas are classified inconsistently and have different criteria in different provinces; see note 4

2. I'm using **2022/2023 enrolment data and 2023/2024 tuition fee figures**, the latest available. To estimate future revenue changes, I apply hypothetical declines in student enrolment (due to the IRCC changes made in January 2024) to the 2022/2023 enrolment figures, priced at 2023/2024 tuition fees.

3. For tuition fee purposes, the easiest distinction is at Program Type, between undergraduate and graduate degrees. For certificates and diplomas (popular at the colleges), however, you need to look at Credential Type.

4. Graduate diplomas/certificates are not placed consistently within 'Program Type'.
    - For example, there were 509,000 students with Credential Type: Diploma across all of Canada in 2022/2023, and 386k of them are in the 'Career, Technical or Professional Training Program' Program Type. 85k of these sit under 'Pre-University Program' (of 87k total in the pre-university category), which makes me think some PSIs (postsecondary institutions) are recording high school diplomas here, which wouldn't impact tuition fees. The remaining 55k are scattered across various other program types.
    - Certificate credentials are clearer: of 190k in Canada, almost all are captured under 'Career, Technical or Professional Training Program', 'Post Career, Technical or Professional Training Program', or 'Undergraduate'.
    - *Post career, technical or professional training program* specifically includes **Ontario graduate certificate programs**

5. I only imported student enrolment from full programs. Around 100,000 enrolments (out of 2.2m total enrolments in all programs) were in 'non-program': some non-credit, some undergraduate, some graduate. I assume these are students taking individual classes to complete programs at a later date, rather than being enrolled in an end-to-end program on a schedule.
6. The analysis above was done on both full-time and part-time students. From here on, as in my earlier analysis, I am only taking full-time PSI student data (a total of 1.7m in Canada).
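
As a rough illustration of the checks described in notes 4 to 6, the sketch below cross-tabulates credential type against program type, then narrows to full-time students. It is not the notebook's actual pipeline: the column names (`program_type`, `credential_type`, `registration_status`, `enrolment`) and every count are made-up placeholders, not StatCan figures.

```python
import pandas as pd

# Toy rows only: column names and counts are placeholders, not StatCan data.
toy = pd.DataFrame({
    "program_type": [
        "Career, technical or professional training program",
        "Pre-university program",
        "Undergraduate program",
        "Career, technical or professional training program",
        "Non-program",
    ],
    "credential_type": [
        "Diploma", "Diploma", "Certificate", "Certificate", "Not applicable",
    ],
    "registration_status": [
        "Full-time", "Full-time", "Part-time", "Full-time", "Full-time",
    ],
    "enrolment": [300, 80, 20, 150, 40],
})

# Note 5: keep enrolments in full programs only (drop 'non-program' rows).
programs = toy[toy["program_type"] != "Non-program"]

# Note 4: see where each credential type sits across program types
# (full-time and part-time together, as in the notes above).
breakdown = programs.pivot_table(
    index="credential_type",
    columns="program_type",
    values="enrolment",
    aggfunc="sum",
    fill_value=0,
)
print(breakdown)

# Note 6: restrict to full-time students for the rest of the analysis.
full_time = programs[programs["registration_status"] == "Full-time"]
print(full_time["enrolment"].sum())
```

A table like `breakdown` makes it easy to spot a credential type sitting in an unexpected program type, such as diplomas under the pre-university category.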


All this is to say that the calculations here will be estimates at best. The heavy lifting is done by the difference between domestic and international tuition fees, mostly at the undergraduate and graduate degree level, as these students are the most numerous and their fees require somewhat less granularity than fees for certificates/diplomas.
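
To make the shape of that estimate concrete, here is a minimal sketch of the arithmetic, assuming lost international seats are not backfilled by domestic students and fees stay at their 2023/2024 level. The function name `revenue_change`, the column names, the 25% decline, and all enrolment and fee values are hypothetical placeholders, not figures from the StatCan tables.

```python
import pandas as pd


def revenue_change(intl_enrolment, intl_fee, decline):
    """Change in tuition revenue when international enrolment falls.

    `decline` is a fraction between 0 and 1. Assumes (hypothetically)
    that the lost seats are not filled by domestic students.
    """
    return -decline * intl_enrolment * intl_fee


# Placeholder inputs only: NOT real enrolment or tuition figures.
scenario = pd.DataFrame({
    "program_type": ["Undergraduate", "Graduate"],
    "intl_enrolment": [1000, 500],
    "intl_fee": [30000.0, 20000.0],
})

decline = 0.25  # hypothetical 25% drop in international enrolment

scenario["revenue_change"] = revenue_change(
    scenario["intl_enrolment"], scenario["intl_fee"], decline
)
total_change = scenario["revenue_change"].sum()
print(scenario)
print(total_change)
```

Because the function works on whole columns, the same call can be repeated for several decline scenarios without changing anything else.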

TODO: begin modularising this project.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
import plotly.express as px

# for the preprocessing pipeline
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline