# Data Warehouse Utilization

In [1]:
%load_ext sql

In [2]:
%%sql
sqlite:///CourseDataWarehouse.db

'Connected: @CourseDataWarehouse.db'

## On which day are the most/least classes held?

In [3]:
%%sql
SELECT Day, COUNT(Day) AS Count
FROM TIMECODES_DW
GROUP BY Day
ORDER BY Count DESC;

 * sqlite:///CourseDataWarehouse.db
Done.


Day,Count
T,249
W,243
R,206
M,196
F,137
S,45
U,14


Knowing which days host the most classes is insightful when considering whether the University has capacity to offer more classes, or where it could cut back if necessary.  It is no surprise that Friday has the fewest weekday classes (3-day weekend?).
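To put the raw counts in proportion, each day's share of all scheduled classes can be computed. The sketch below is a minimal illustration: it copies the counts from the output above into a hypothetical in-memory scratch table (`day_counts`, with a `Classes` column, neither of which is part of the warehouse) rather than querying `TIMECODES_DW` directly.

```python
import sqlite3

# Class counts per day, copied from the query output above.
reported = [("T", 249), ("W", 243), ("R", 206), ("M", 196),
            ("F", 137), ("S", 45), ("U", 14)]

conn = sqlite3.connect(":memory:")
# day_counts is a hypothetical scratch table used only for this illustration.
conn.execute("CREATE TABLE day_counts (Day TEXT, Classes INTEGER)")
conn.executemany("INSERT INTO day_counts VALUES (?, ?)", reported)

# Each day's percentage of the total, using a scalar subquery for the total.
shares = conn.execute(
    """
    SELECT Day,
           Classes,
           ROUND(100.0 * Classes / (SELECT SUM(Classes) FROM day_counts), 1) AS Pct
    FROM day_counts
    ORDER BY Classes DESC
    """
).fetchall()

for day, classes, pct in shares:
    print(f"{day}: {classes} classes ({pct}%)")
```

Against the real warehouse, the same percentage expression could in principle be applied to the grouped counts from `TIMECODES_DW`.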

## Which classroom is the most utilized and what programs hold classes there? 

In [4]:
%%sql
SELECT location, COUNT(course_id) as count
FROM LOCATIONS_DW 
    JOIN CLASS_FACTS_DW USING (location_id)
    JOIN COURSES_DW USING (course_id)
GROUP BY location 
ORDER BY count DESC
LIMIT 10;

 * sqlite:///CourseDataWarehouse.db
Done.


location,count
DSB 112,171
BNW 341,160
DSB 109,158
NHS 301,158
CNS 101,157
DSB 106,157
CNS 108,154
BNW 137,152
CNS 106,149
DSB 108,148


In [5]:
%%sql
SELECT program_name AS Program, count(catalog_id) AS Count
FROM COURSES_DW 
    JOIN CLASS_FACTS_DW USING (course_id)
    JOIN PROGRAMS_DW USING (program_id)
    JOIN LOCATIONS_DW USING (location_id)
    WHERE location = 'DSB 112'
    GROUP BY program_name
    ORDER BY Count DESC; 

 * sqlite:///CourseDataWarehouse.db
Done.


Program,Count
Accounting,63
Marketing,38
Management,26
Finance,21
Operations Management,10
Information Systems,8
Taxation,5


This question is pertinent to the University for resource and logistics planning.  Dolan School of Business (DSB) rooms account for four of the ten most-used classrooms, and every program that holds classes in DSB 112 is a business program, which would be expected considering the many business majors at the school.
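The two cells above can also be combined so the busiest room does not have to be typed in by hand. The following is a sketch under stated assumptions: the tables are reduced to the columns the queries above rely on, the rows are made up, and the join through `COURSES_DW` is omitted for brevity, so it may not match the real schema exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema with hypothetical rows (illustration only).
conn.executescript("""
CREATE TABLE LOCATIONS_DW (location_id INTEGER PRIMARY KEY, location TEXT);
CREATE TABLE PROGRAMS_DW (program_id INTEGER PRIMARY KEY, program_name TEXT);
CREATE TABLE CLASS_FACTS_DW (course_id INTEGER, program_id INTEGER, location_id INTEGER);

INSERT INTO LOCATIONS_DW VALUES (1, 'DSB 112'), (2, 'BNW 341');
INSERT INTO PROGRAMS_DW VALUES (1, 'Accounting'), (2, 'Marketing'), (3, 'Biology');
INSERT INTO CLASS_FACTS_DW VALUES
    (10, 1, 1), (11, 1, 1), (12, 2, 1),
    (20, 3, 2), (21, 3, 2);
""")

query = """
SELECT program_name AS Program, COUNT(*) AS Classes
FROM CLASS_FACTS_DW
    JOIN PROGRAMS_DW USING (program_id)
WHERE location_id = (
    -- Busiest room: the location with the most class rows.
    SELECT location_id
    FROM CLASS_FACTS_DW
    GROUP BY location_id
    ORDER BY COUNT(*) DESC
    LIMIT 1
)
GROUP BY program_name
ORDER BY Classes DESC;
"""
rows = conn.execute(query).fetchall()
print(rows)
```

One design note: if two rooms tie for first place, `LIMIT 1` picks one of them arbitrarily, so a tie-breaking `ORDER BY` column may be worth adding.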

## Which professors have the most diverse courseload?

In [8]:
%%sql
SELECT professor_id, Name, COUNT(DISTINCT Course_id) AS CourseCount
FROM PROFESSORS_DW
    JOIN CLASS_FACTS_DW USING (professor_id)
    JOIN COURSES_DW USING (course_id)
    GROUP BY professor_id, Name
    ORDER BY CourseCount DESC
    LIMIT 10;

 * sqlite:///CourseDataWarehouse.db
Done.


Professor_id,Name,CourseCount
161,Jeffrey N. Denenberg,18
110,Michael P. Pagano,17
65,Aaron R. Van Dyke,16
111,Qin Zhang,15
511,Amalia I. Rusu,15
576,Shannon P. Gerry,14
67,Diane J. Brousseau,13
84,John R. Miecznikowski,13
93,Amanda S. Harper-Leatherman,13
117,Virginia A. Kelly,13


In [9]:
%%sql
SELECT DISTINCT term, catalog_id, course_title
FROM COURSES_DW
    JOIN CLASS_FACTS_DW USING (course_id)
    WHERE professor_id = '161';

 * sqlite:///CourseDataWarehouse.db
Done.


Term,Catalog_id,Course_title
Fall2017,ECE 0461,Green Power Generation
Fall2017,EE 0213L,Electric Circuits Lab
Fall2017,EE 0231,Introduction to Electronics Circuits and Devices
Fall2017,EE 0231L,Electronics Circuits Lab
Fall2017,EE 0361,Green Power Generation
Fall2018,BEN 0331,Biomedical Signal Processing
Fall2018,CR 0331,Biomedical Signal Processing
Fall2018,EE 0213L,Electric Circuits Lab
Fall2018,EE 0231L,Electronics Circuits Lab
Spring2018,CR 0245L,Digital Design I Lab


Jeffrey N. Denenberg teaches a wide array of Engineering classes across several distinct Engineering disciplines (Electrical, Mechanical, Biomedical).  Does Fairfield need more faculty in these areas?  Is Professor Denenberg's class schedule overloaded?
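The overload question depends on how many sections a professor teaches in each term, not only on how many distinct courses appear across all terms. A minimal sketch of that per-term breakdown follows; it assumes `term` and `section` are columns of `CLASS_FACTS_DW`, which the notebook does not confirm, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed columns and hypothetical rows (illustration only).
conn.executescript("""
CREATE TABLE CLASS_FACTS_DW (professor_id INTEGER, course_id INTEGER, term TEXT, section TEXT);
INSERT INTO CLASS_FACTS_DW VALUES
    (161, 1, 'Fall2017', '01'),
    (161, 1, 'Fall2017', '02'),
    (161, 2, 'Fall2017', '01'),
    (161, 3, 'Spring2018', '01'),
    (110, 4, 'Fall2017', '01');
""")

# Sections taught and distinct courses per term for one professor.
load = conn.execute(
    """
    SELECT term,
           COUNT(*) AS Sections,
           COUNT(DISTINCT course_id) AS Courses
    FROM CLASS_FACTS_DW
    WHERE professor_id = ?
    GROUP BY term
    ORDER BY term
    """,
    (161,),
).fetchall()

for term, sections, courses in load:
    print(f"{term}: {sections} sections across {courses} courses")
```

Passing the id as an integer parameter also avoids comparing `professor_id` against the string `'161'`.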

## Which professor teaches in the highest number of programs?

In [10]:
%%sql
SELECT professor_id, Name, COUNT(DISTINCT program_name) AS Prof_count
FROM PROFESSORS_DW
    JOIN CLASS_FACTS_DW USING (professor_id)
    JOIN PROGRAMS_DW USING (program_id)
    GROUP BY professor_id, Name
    ORDER BY Prof_count DESC
    LIMIT 10;

 * sqlite:///CourseDataWarehouse.db
Done.


Professor_id,Name,Prof_count
161,Jeffrey N. Denenberg,7
112,James P. Cavallo,5
17,Anna M. Lawrence,4
586,Djedjiga Belfadel,4
162,Douglas A. Lyon,4
892,Ellen M. Lee,4
192,Emily J. Orlando,4
20,Martha S. LoMonaco,4
290,Mary Ann M. Carolan,4
660,Michelle Leigh Farrell,4


In [11]:
%%sql
SELECT DISTINCT program_name
FROM CLASS_FACTS_DW
    JOIN PROGRAMS_DW USING (program_id)
    WHERE professor_id = '161';

 * sqlite:///CourseDataWarehouse.db
Done.


program_name
Electrical and Computer Engineering
Electrical Engineering
Bioengineering
Computer Engineering
Engineering
Mechanical Engineering
Software Engineering


Again, we see Professor Denenberg at the top of the list, which supports our previous answer about the variety of Engineering disciplines he teaches at Fairfield University.  The cell above lists all the programs he teaches in.

## Which program has the most/least count of courses?

In [12]:
%%sql
SELECT program_name, COUNT(DISTINCT course_title) AS Course_count
    FROM PROGRAMS_DW
        JOIN CLASS_FACTS_DW USING (Program_id)
        JOIN COURSES_DW USING (Course_id)
    GROUP BY program_name
    ORDER BY Course_count DESC
LIMIT 10;

 * sqlite:///CourseDataWarehouse.db
Done.


program_name,Course_count
Biology,71
Nursing,68
Psychology,67
English,63
Mathematics,45
History,43
Education,38
Religious Studies,38
Communication,34
Chemistry,30


In [13]:
%%sql
SELECT program_name, COUNT(DISTINCT course_title) AS Course_count
    FROM PROGRAMS_DW
        JOIN CLASS_FACTS_DW USING (Program_id)
        JOIN COURSES_DW USING (Course_id)
    GROUP BY program_name
    ORDER BY Course_count
LIMIT 10;

 * sqlite:///CourseDataWarehouse.db
Done.


program_name,Course_count
Black Studies,1
English for Engineers,1
Humanitarian Action,1
Irish Studies,1
Latin American and Caribbean Studies,1
"Women, Gender, and Sexuality Studies",1
Environmental Studies,2
Graphic Design,2
Greek,2
Humanities,2


This question is useful to Fairfield University when highlighting its courses of study.  The results confirm that the Biology and Nursing programs offer the most distinct courses, which is expected as they are two rigorous programs of study at Fairfield.

## Which classes go over capacity the most frequently?

In [14]:
%%sql
SELECT Course_title, COUNT(Course_title) AS Cap_count
FROM (
    -- Distinct over-capacity class rows; classes with Cap = 0 are excluded
    SELECT DISTINCT COURSES_DW.CatalogYear, term, section, professor_id, COURSES_DW.Course_id, Catalog_id, Course_title, Timecodes, Cap, Actual, Remaining
    FROM COURSES_DW
        JOIN CLASS_FACTS_DW USING (course_id)
        JOIN TIMECODES_DW USING (Timecode_id)
    WHERE Remaining < 0 AND Cap != 0
)
GROUP BY Course_title
ORDER BY Cap_count DESC
LIMIT 10;

 * sqlite:///CourseDataWarehouse.db
Done.


Course_title,Cap_count
Introduction to Management Accounting,13
Operations Management,13
Principles of Marketing,13
Introduction to Information Systems,11
Marketing Research,10
Business Strategies in the Global Environment,8
Health Assessment Lab,7
Honors Seminar,7
Introduction to Finance,7
Introduction to Management,7


Here we can see that eight of the ten classes that most often go over capacity are Business-related, which speaks to the popularity of this area of study at the school.  It does raise the question, however, of whether the University has the course or faculty availability it needs to support the programs offered.
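Raw over-capacity counts favor courses that run many sections, so a rate (over-capacity sections divided by all sections) may give a fairer comparison. The sketch below illustrates the idea on hypothetical rows; it assumes `Cap`, `Actual`, and `Remaining` are columns of `CLASS_FACTS_DW`, and the course titles are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed columns and hypothetical rows (illustration only).
# The last row has Cap = 0 and is filtered out, as in the query above.
conn.executescript("""
CREATE TABLE COURSES_DW (course_id INTEGER PRIMARY KEY, Course_title TEXT);
CREATE TABLE CLASS_FACTS_DW (course_id INTEGER, section TEXT, Cap INTEGER, Actual INTEGER, Remaining INTEGER);
INSERT INTO COURSES_DW VALUES (1, 'Course A'), (2, 'Course B');
INSERT INTO CLASS_FACTS_DW VALUES
    (1, '01', 30, 32, -2),
    (1, '02', 30, 31, -1),
    (1, '03', 30, 25, 5),
    (1, '04', 30, 28, 2),
    (2, '01', 20, 22, -2),
    (2, '02', 0, 5, -5);
""")

# In SQLite a comparison evaluates to 0 or 1, so SUM(Remaining < 0)
# counts the over-capacity sections within each group.
query = """
SELECT Course_title,
       COUNT(*) AS Sections,
       SUM(Remaining < 0) AS Over_cap,
       ROUND(1.0 * SUM(Remaining < 0) / COUNT(*), 2) AS Over_rate
FROM COURSES_DW
    JOIN CLASS_FACTS_DW USING (course_id)
WHERE Cap != 0
GROUP BY Course_title
ORDER BY Over_rate DESC, Over_cap DESC;
"""
rates = conn.execute(query).fetchall()
for title, sections, over_cap, over_rate in rates:
    print(f"{title}: {over_cap}/{sections} sections over capacity ({over_rate})")
```

In this toy data, Course A has more over-capacity sections, but Course B has the higher rate, which is the distinction the raw count hides.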

## Which classes are attracting <10 students/which classes are the least popular?

In [15]:
%%sql
-- Find classes with fewer than 10 students enrolled that filled under 30% of capacity
-- GROUP BY Course_title keeps a single row per course title
SELECT Course_title as Course, Catalog_id, Name, COURSES_DW.CatalogYear, Term, Actual, Cap, round((Actual*1.0/Cap*1.0),2) AS Ratio
    FROM COURSES_DW
    JOIN CLASS_FACTS_DW USING (course_id)
    JOIN PROFESSORS_DW USING (professor_id)
    JOIN TIMECODES_DW USING (timecode_id)
    JOIN PROGRAMS_DW USING (program_id)
    WHERE Actual < 10 AND Ratio <0.3
    GROUP BY Course_title
    ORDER BY Actual DESC

LIMIT 20;

 * sqlite:///CourseDataWarehouse.db
Done.


Course,Catalog_id,Name,CatalogYear,Term,Actual,Cap,Ratio
Precalculus,MA 0011,Scott Kaminski,2017_2018,Fall2017,9,32,0.28
Introduction to Probability and Statistics,MA 0017,Riaz A. Lalani,2017_2018,Fall2017,8,32,0.25
"Population: Birth, Death, and Migration",SO 0184,Mehmet Cansoy,2018_2019,Fall2018,8,28,0.29
20th Century United States,HI 0239,David W. McFadden,2017_2018,Summer2018,7,25,0.28
Abstract Algebra,MA 0436,Paul Baginski,2017_2018,Fall2017,7,25,0.28
Comprehensive Exam in Marriage and Family Therapy,FT 0099,Erica E. Hartwell,2018_2019,Fall2018,7,25,0.28
Politics of Humanitarian Action,PO 0129,Janie L. Leatherman,2017_2018,Fall2017,7,25,0.28
Popular Music Theory and Composition,MU 0155,Brian Q. Torff,2018_2019,Fall2018,7,25,0.28
Social Change in Developing Nations,SO 0191,Marcela Aliaga,2017_2018,Fall2017,7,24,0.29
Calculus II,MA 0172,Neha Hooda,2018_2019,Spring2019,6,25,0.24


An earlier question showed that the Mathematics program offers among the highest number of distinct courses (45), yet we see here that four of the ten low-enrollment classes listed (40%) are Math courses.  Should the University offer fewer Math courses?
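The Math share can be checked directly against the ten rows displayed above by tallying the subject prefix of each `Catalog_id`. This is a quick sketch; it assumes the `MA` prefix denotes Mathematics courses and covers only the rows shown in the output.

```python
from collections import Counter

# Catalog_id values copied from the query output above.
catalog_ids = ["MA 0011", "MA 0017", "SO 0184", "HI 0239", "MA 0436",
               "FT 0099", "PO 0129", "MU 0155", "SO 0191", "MA 0172"]

# The subject prefix is the part before the space, e.g. "MA" in "MA 0011".
subject_counts = Counter(cid.split()[0] for cid in catalog_ids)
math_share = subject_counts["MA"] / len(catalog_ids)

for subject, n in subject_counts.most_common():
    print(f"{subject}: {n} of {len(catalog_ids)}")
print(f"Math share: {math_share:.0%}")
```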