## Import modules & set database connection

In [132]:
%load_ext sql
import pandas as pd
import sqlite3 as sql

%sql sqlite:///CourseDataWarehouse.db

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


'Connected: @CourseDataWarehouse.db'
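
Before running the queries, it can help to confirm which tables the connection actually exposes. The sketch below uses Python's built-in `sqlite3` module rather than the `%sql` magic. It runs against an in-memory stand-in so that it works without the warehouse file; the helper name `list_tables` and the stand-in column layouts are assumptions for illustration, not the real schema.

```python
import sqlite3

def list_tables(conn):
    """Return the names of the user tables visible on an open SQLite connection."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]

# In-memory stand-in so the sketch runs anywhere.
# For the real data: conn = sqlite3.connect("CourseDataWarehouse.db")
conn = sqlite3.connect(":memory:")

# Hypothetical, simplified table layouts (the real columns may differ).
conn.execute("CREATE TABLE FACT_TABLE (CourseID INTEGER, InstructorID INTEGER)")
conn.execute("CREATE TABLE CATALOG_DIMENSION (CourseID INTEGER, CatalogID TEXT)")

tables = list_tables(conn)
print(tables)
conn.close()
```

The same open connection can also be passed to `pd.read_sql_query(query, conn)` to load a result set into a DataFrame.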



## Query #1


- __Query 1A__ gives a broad, general dimensional view: the number of courses and instructors within each CatalogYear.
  * This is useful when analyzing staffing levels across the entire University (macro level).
  
  
- __Query 1B__ narrows the scope of Query 1A to a lower level of the dimension, showing the number of courses taught per professor within a specific program and term. This view is more specific and more insightful.
  * This is useful when analyzing the staffing levels specific to a department (micro level).
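
The macro/micro distinction can be sketched on toy data: the same rows are first rolled up by catalog year, then filtered and grouped by instructor. The table `offerings`, the instructor names, and the catalog IDs below are hypothetical stand-ins, not rows from the warehouse.

```python
import sqlite3

# Hypothetical flattened rows: (CatalogYear, Term, ProgramName, Name, CatalogID)
rows = [
    ("2017_2018", "Fall2017", "Information Systems", "Instructor A", "IS 0100"),
    ("2017_2018", "Fall2017", "Information Systems", "Instructor A", "IS 0135"),
    ("2017_2018", "Fall2017", "Information Systems", "Instructor B", "IS 0100"),
    ("2017_2018", "Fall2017", "Biology", "Instructor C", "BI 0170"),
    ("2018_2019", "Fall2018", "Biology", "Instructor C", "BI 0170"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE offerings "
    "(CatalogYear TEXT, Term TEXT, ProgramName TEXT, Name TEXT, CatalogID TEXT)"
)
conn.executemany("INSERT INTO offerings VALUES (?, ?, ?, ?, ?)", rows)

# Macro view: one row per catalog year across all programs.
macro = conn.execute(
    """
    SELECT CatalogYear,
           COUNT(DISTINCT CatalogID) AS CatalogsCount,
           COUNT(DISTINCT Name) AS InstructorsCount
    FROM offerings
    GROUP BY CatalogYear
    ORDER BY CatalogYear DESC
    """
).fetchall()

# Micro view: one row per instructor within a single program and term.
micro = conn.execute(
    """
    SELECT Name, COUNT(DISTINCT CatalogID) AS CatalogsTaught
    FROM offerings
    WHERE Term = ? AND ProgramName LIKE ?
    GROUP BY Name
    ORDER BY CatalogsTaught DESC, Name
    """,
    ("Fall2017", "%Information%"),
).fetchall()

print(macro)
print(micro)
conn.close()
```

The only differences between the two views are the `WHERE` filter and the grouping column, which is what moving from the macro level to the micro level amounts to.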

  

#### Query 1A.

In [133]:
%%sql
SELECT CatalogYear,
       COUNT(DISTINCT CatalogID) AS CatalogsCount,
       COUNT(DISTINCT InstructorID) AS InstructorsCount
FROM CATALOG_DIMENSION
    LEFT JOIN FACT_TABLE USING(CourseID)
    LEFT JOIN COURSE_DIMENSION USING(CourseOfferingID)
    LEFT JOIN INSTRUCTOR_DIMENSION USING(InstructorID)
    JOIN TIME_DIMENSION USING(CourseMeetingID)
GROUP BY CatalogYear
ORDER BY CatalogYear DESC;

 * sqlite:///CourseDataWarehouse.db
Done.


CatalogYear,CatalogsCount,InstructorsCount
2018_2019,977,620
2017_2018,1125,642


#### Query 1B.

In [134]:
%%sql
SELECT Name, Term, ProgramName,
        COUNT(DISTINCT CatalogID) AS CatalogsTaught,
        (
         SELECT COUNT(DISTINCT CatalogID)
            FROM CATALOG_DIMENSION
                LEFT JOIN FACT_TABLE USING (CourseID)
                LEFT JOIN TIME_DIMENSION USING (CourseMeetingID)
            WHERE Term = 'Fall2017' AND ProgramName LIKE '%Information%'
        ) AS CatalogsTotal
FROM CATALOG_DIMENSION
    LEFT JOIN FACT_TABLE USING (CourseID)
    LEFT JOIN INSTRUCTOR_DIMENSION USING (InstructorID)
    LEFT JOIN TIME_DIMENSION USING (CourseMeetingID)
WHERE Term = 'Fall2017' AND ProgramName LIKE '%Information%'
GROUP BY Name, Term, ProgramName
ORDER BY CatalogsTaught DESC;

 * sqlite:///CourseDataWarehouse.db
Done.


Name,Term,ProgramName,CatalogsTaught,CatalogsTotal
Christopher L. Huntley,Fall2017,Information Systems,3,8
Vishnu V. Vinekar,Fall2017,Information Systems,3,8
Yasin Ozcelik,Fall2017,Information Systems,2,8
Kanlun Wang,Fall2017,Information Systems,1,8
Nicholas F. Socci,Fall2017,Information Systems,1,8
Patrick S. Lee,Fall2017,Information Systems,1,8
Thomas F. McCabe,Fall2017,Information Systems,1,8




## Query #2

- __Query 2A__ lists the ten courses with the most sections in a specific term, along with the number of distinct instructors teaching each one.
  * Query 2A output shows that 11 sections of CatalogID BI 0171P are taught by a single professor. This is a red flag.
  
  
- __Query 2B__ is an example of what the user might do after seeing the output from Query 2A.
  * Query 2B output drills down to the TimeCodes for the sections of BI 0171P taught by that one professor. Each listed section meets once a week for 50 minutes, so the user can conclude that the professor is not being unfairly overworked.
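
The workload check can be made concrete by converting each TimeCodes entry into weekly contact minutes. This is a minimal sketch that assumes every entry follows the layout `<day letters> <start>-<end> <date range> <room>` seen in the Query 2B output, and that each day letter stands for one meeting per week. The helper names `to_minutes` and `weekly_minutes` are illustrative.

```python
import re

# TimeCodes values copied from the Query 2B output in this notebook.
TIME_CODES = [
    "T 0300pm-0350pm 01/22-04/30 BNW 319",
    "T 0400pm-0450pm 01/22-04/30 BNW 319",
    "T 0500pm-0550pm 01/22-04/30 BNW 319",
    "T 0600pm-0650pm 01/22-04/30 BNW 319",
    "W 0300pm-0350pm 01/22-04/30 BNW 319",
    "W 0400pm-0450pm 01/22-04/30 BNW 319",
    "W 0500pm-0550pm 01/22-04/30 BNW 319",
    "W 0600pm-0650pm 01/22-04/30 BNW 319",
    "R 0300pm-0350pm 01/22-04/30 BNW 319",
    "R 0400pm-0450pm 01/22-04/30 BNW 319",
]

# Assumed layout: day letters, then start and end clock times such as 0300pm.
PATTERN = re.compile(r"(?P<days>[A-Z]+)\s+(?P<start>\d{4}[ap]m)-(?P<end>\d{4}[ap]m)")

def to_minutes(clock):
    """Convert a clock string such as '0350pm' to minutes after midnight."""
    hour, minute, suffix = int(clock[:2]), int(clock[2:4]), clock[4:]
    hour = hour % 12 + (12 if suffix == "pm" else 0)
    return hour * 60 + minute

def weekly_minutes(time_code):
    """Minutes per week for one entry, counting one meeting per day letter."""
    match = PATTERN.match(time_code)
    if match is None:
        raise ValueError(f"Unrecognized TimeCodes entry: {time_code!r}")
    duration = to_minutes(match["end"]) - to_minutes(match["start"])
    return duration * len(match["days"])

per_section = [weekly_minutes(code) for code in TIME_CODES]
total_minutes = sum(per_section)
print(per_section)
print(total_minutes)
```

For the ten entries listed, this gives 50 minutes per section and 500 minutes per week in total.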

#### Query 2A.

In [135]:
%%sql
SELECT Term,CatalogID,Title,
                COUNT(DISTINCT Section) AS NumSections,  
                COUNT(DISTINCT InstructorID) AS TotalInstructors
FROM CATALOG_DIMENSION
    LEFT JOIN FACT_TABLE USING(CourseID)
    LEFT JOIN TIME_DIMENSION USING(CourseMeetingID)
    LEFT JOIN COURSE_DIMENSION USING(CourseOfferingID)
    LEFT JOIN INSTRUCTOR_DIMENSION USING(InstructorID)
WHERE Term LIKE '%Spring2019'
GROUP BY Term,CatalogID
ORDER BY NumSections DESC
LIMIT 10;

 * sqlite:///CourseDataWarehouse.db
Done.


Term,CatalogID,Title,NumSections,TotalInstructors
Spring2019,EN 0012,Texts and Contexts II: Writing About Literature,49,25
Spring2019,EC 0012,Introduction to Macroeconomics,21,10
Spring2019,NS 0323C,Pediatric Nursing Clinical,19,6
Spring2019,MA 0217,Accelerated Statistics,18,11
Spring2019,PH 0101,Introduction to Philosophy,18,9
Spring2019,AC 0012,Introduction to Management Accounting,15,8
Spring2019,NS 0305C,Mental Health Nursing Clinical,15,5
Spring2019,NS 0312C,Medical Surgical Nursing I Clinical,14,13
Spring2019,BI 0108L,Human Anatomy and Physiology Lab II,11,4
Spring2019,BI 0171P,General Biology II PLG,11,1
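
One way to surface the red flag automatically is to rank the rows by sections per instructor. The sketch below copies the counts from the output above. The ratio is an illustrative metric, not part of the original query, and the helper name `rank_by_load` is an assumption.

```python
# (CatalogID, NumSections, TotalInstructors) copied from the Query 2A output.
rows = [
    ("EN 0012", 49, 25),
    ("EC 0012", 21, 10),
    ("NS 0323C", 19, 6),
    ("MA 0217", 18, 11),
    ("PH 0101", 18, 9),
    ("AC 0012", 15, 8),
    ("NS 0305C", 15, 5),
    ("NS 0312C", 14, 13),
    ("BI 0108L", 11, 4),
    ("BI 0171P", 11, 1),
]

def rank_by_load(rows):
    """Return (CatalogID, sections per instructor) pairs, highest ratio first."""
    ratios = [
        (catalog_id, sections / instructors)
        for catalog_id, sections, instructors in rows
        if instructors > 0  # skip courses with no instructor recorded
    ]
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)

ranked = rank_by_load(rows)
top_id, top_ratio = ranked[0]
print(top_id, top_ratio)
```

BI 0171P ranks first with 11 sections for its single instructor, which makes it the natural course to drill into next.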


#### Query 2B.

In [136]:
%%sql
SELECT DISTINCT Name,Term,Title,Section,CatalogID,TimeCodes
FROM CATALOG_DIMENSION
    JOIN FACT_TABLE USING(CourseID)
    JOIN COURSE_DIMENSION USING(CourseOfferingID)
    JOIN INSTRUCTOR_DIMENSION USING(InstructorID)
    JOIN TIME_DIMENSION USING(CourseMeetingID)
WHERE CatalogID LIKE 'BI 0171P' AND Term = 'Spring2019'
ORDER BY ProgramName;

 * sqlite:///CourseDataWarehouse.db
Done.


Name,Term,Title,Section,CatalogID,Timecodes
Catherine J. Andersen,Spring2019,General Biology II PLG,1,BI 0171P,['T 0300pm-0350pm 01/22-04/30 BNW 319']
Catherine J. Andersen,Spring2019,General Biology II PLG,2,BI 0171P,['T 0400pm-0450pm 01/22-04/30 BNW 319']
Catherine J. Andersen,Spring2019,General Biology II PLG,3,BI 0171P,['T 0500pm-0550pm 01/22-04/30 BNW 319']
Catherine J. Andersen,Spring2019,General Biology II PLG,4,BI 0171P,['T 0600pm-0650pm 01/22-04/30 BNW 319']
Catherine J. Andersen,Spring2019,General Biology II PLG,5,BI 0171P,['W 0300pm-0350pm 01/22-04/30 BNW 319']
Catherine J. Andersen,Spring2019,General Biology II PLG,6,BI 0171P,['W 0400pm-0450pm 01/22-04/30 BNW 319']
Catherine J. Andersen,Spring2019,General Biology II PLG,7,BI 0171P,['W 0500pm-0550pm 01/22-04/30 BNW 319']
Catherine J. Andersen,Spring2019,General Biology II PLG,8,BI 0171P,['W 0600pm-0650pm 01/22-04/30 BNW 319']
Catherine J. Andersen,Spring2019,General Biology II PLG,9,BI 0171P,['R 0300pm-0350pm 01/22-04/30 BNW 319']
Catherine J. Andersen,Spring2019,General Biology II PLG,10,BI 0171P,['R 0400pm-0450pm 01/22-04/30 BNW 319']
