
<img src="https://www.inderdhillon.com/files/logo-gray.png" width="100">

<h1>YorkU Grade Dashboard</h1>
<b>Inder Dhillon</b><br>
<i>inderdhillon.com</i><br><br>
I wanted to brush up on data visualization, so I scraped my GPA data from the YorkU grades website and built a dashboard for it in Power BI. <br><br>
<hr><br>

Code for scraping the YorkU Course Webpage:

In [0]:
import requests
from bs4 import BeautifulSoup


COURSE_PAGE = 'https://wrem.sis.yorku.ca/Apps/WebObjects/ydml.woa/wa/DirectAction/document?name=CourseListv1'
LOGIN_PAGE = 'https://passportyork.yorku.ca/ppylogin/ppylogin'


username = "{YORK_USERNAME_HERE}"
password = "{YORK_PASSWORD_HERE}"
session = requests.Session()

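# Request the course page first; while unauthenticated, the response is expected to contain the login form.
r = session.get(COURSE_PAGE)

login_data = {'mli': username,
              'password': password,
              'dologin': 'Login'}
# Copy the form's hidden fields so the POST carries everything the login page expects.
soup = BeautifulSoup(r.text, 'html.parser')
hiddens = soup.find_all("input", type="hidden")
for tag in hiddens:
    if tag.get('name'):
        login_data[tag['name']] = tag.get('value', '')
r = session.post(LOGIN_PAGE, data=login_data)
if 'You have successfully authenticated' not in r.text:
    raise Exception('Failed to login')

# Fetch the course page again with the authenticated session and collect the grade tables.
r = session.get(COURSE_PAGE)
soup = BeautifulSoup(r.text, 'html.parser')
grades_table = soup.find_all('table', class_='bodytext')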
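
Hard-coding credentials in a notebook makes them easy to leak when the notebook is shared. A minimal sketch of one alternative, assuming the hypothetical environment variables `YORK_USERNAME` and `YORK_PASSWORD` (neither name comes from the original notebook), with an interactive prompt as a fallback:

```python
import os
from getpass import getpass


def load_credentials(env=None, ask_user=input, ask_secret=getpass):
    """Return (username, password) from the environment, prompting if unset.

    YORK_USERNAME and YORK_PASSWORD are hypothetical variable names chosen
    for this sketch; they are not defined by York or by the notebook.
    """
    env = os.environ if env is None else env
    username = env.get("YORK_USERNAME") or ask_user("Passport York username: ")
    password = env.get("YORK_PASSWORD") or ask_secret("Passport York password: ")
    return username, password
```

Calling `username, password = load_credentials()` before creating the session could stand in for the two placeholder assignments.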

Importing required libraries:

In [0]:
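import pandas as pd
import numpy as np
from io import StringIO

In [0]:
# pandas 2.1+ deprecates passing literal HTML strings to read_html, so wrap the markup in StringIO.
grades_df = pd.read_html(StringIO(' '.join(map(str, grades_table))))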

In [0]:
grades_df = grades_df[0]

The scraped data loaded into a pandas DataFrame:

In [5]:
grades_df.head()

Unnamed: 0,Session,Course,Title,Grade
0,FW19,AP ADMS 1000 3.00 H,Introduction to Business,A
1,FW19,FA FILM 1701 3.00 M,Hollywood: Old and New,A
2,FW19,FA THEA 3135 3.00 A,Technology in Arts Management,A+
3,FW19,HH PSYC 3140 3.00 P,Abnormal Psychology,B+
4,FW19,LE EECS 3000 3.00 A,Professional Practice in Computing,B


Function for adding grade weights:

In [0]:
def grade_weight(row):
    if row['Grade'] == "A+":
        val = 9
    elif row['Grade'] == "A":
        val = 8
    elif row['Grade'] == "B+":
        val = 7
    elif row['Grade'] == "B":
        val = 6
    elif row['Grade'] == "C+":
        val = 5
    elif row['Grade'] == "C":
        val = 4
    elif row['Grade'] == "D+":
        val = 3
    elif row['Grade'] == "D":
        val = 2
    elif row['Grade'] == "E":
        val = 1
    elif row['Grade'] == "F":
        val = 0
    else:
        val = 0
    return val

grades_df['Weight'] = grades_df.apply(grade_weight, axis=1)
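
The same lookup can also be written as a dictionary, which keeps the scale in one place. This is a sketch of an alternative, not the notebook's own code; the names `GRADE_POINTS` and `grade_to_points` are made up for illustration, and the point values are copied from `grade_weight` above:

```python
# Nine-point scale, copied from the grade_weight function.
GRADE_POINTS = {
    "A+": 9, "A": 8, "B+": 7, "B": 6, "C+": 5,
    "C": 4, "D+": 3, "D": 2, "E": 1, "F": 0,
}


def grade_to_points(grade):
    """Look up a letter grade; unknown grades fall back to 0, as in grade_weight."""
    return GRADE_POINTS.get(grade, 0)


sample = ["A", "A", "A+", "B+", "B"]
print([grade_to_points(g) for g in sample])  # [8, 8, 9, 7, 6]
```

With pandas, `grades_df['Grade'].map(GRADE_POINTS).fillna(0).astype(int)` should produce the same `Weight` column without a row-wise `apply`.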

In [7]:
grades_df.head()

Unnamed: 0,Session,Course,Title,Grade,Weight
0,FW19,AP ADMS 1000 3.00 H,Introduction to Business,A,8
1,FW19,FA FILM 1701 3.00 M,Hollywood: Old and New,A,8
2,FW19,FA THEA 3135 3.00 A,Technology in Arts Management,A+,9
3,FW19,HH PSYC 3140 3.00 P,Abnormal Psychology,B+,7
4,FW19,LE EECS 3000 3.00 A,Professional Practice in Computing,B,6


Splitting course data into relevant columns:

In [0]:
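# Course strings look like "AP ADMS 1000 3.00 H": faculty, subject, number, credits, section.
# slice(-6, -1) yields "3.00 " with a trailing space, so strip it and convert to a number.
grades_df["Credit"] = pd.to_numeric(grades_df["Course"].str.slice(-6, -1).str.strip(), errors="coerce")
grades_df["Faculty"] = grades_df["Course"].str.slice(0, 2)
grades_df["Subject"] = grades_df["Course"].str.slice(3, 7)
# Overwrite Course last, since the columns above are sliced from the original string.
grades_df["Course"] = grades_df["Course"].str.slice(3, 12)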
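
Fixed-position slicing relies on every course string having the same layout. A hedged sketch of a pattern-based alternative follows; the layout is an assumption inferred from the sample rows only (two-letter faculty, two-to-four-letter subject, four-digit number, credits, one-letter section), and `COURSE_RE` and `parse_course` are illustrative names:

```python
import re

# Assumed layout, inferred from the sample rows: "AP ADMS 1000 3.00 H".
COURSE_RE = re.compile(
    r"^(?P<faculty>[A-Z]{2})\s+"
    r"(?P<subject>[A-Z]{2,4})\s+"
    r"(?P<number>\d{4})\s+"
    r"(?P<credit>\d+\.\d{2})\s+"
    r"(?P<section>[A-Z])$"
)


def parse_course(course):
    """Split a course string into its parts, or return None if it does not match."""
    match = COURSE_RE.match(course.strip())
    if match is None:
        return None
    parts = match.groupdict()
    parts["credit"] = float(parts["credit"])
    return parts


print(parse_course("LE EECS 3000 3.00 A"))
# {'faculty': 'LE', 'subject': 'EECS', 'number': '3000', 'credit': 3.0, 'section': 'A'}
```

Strings that do not fit the assumed layout return `None` instead of being silently mis-sliced, which makes odd rows easier to spot. The same pattern string (`COURSE_RE.pattern`) can be passed to pandas `Series.str.extract` to build the columns in one step.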

In [0]:
grades_df = grades_df[["Session", "Faculty", "Subject", "Course", "Credit", "Title", "Grade", "Weight"]]

In [10]:
grades_df.head()

Unnamed: 0,Session,Faculty,Subject,Course,Credit,Title,Grade,Weight
0,FW19,AP,ADMS,ADMS 1000,3.0,Introduction to Business,A,8
1,FW19,FA,FILM,FILM 1701,3.0,Hollywood: Old and New,A,8
2,FW19,FA,THEA,THEA 3135,3.0,Technology in Arts Management,A+,9
3,FW19,HH,PSYC,PSYC 3140,3.0,Abnormal Psychology,B+,7
4,FW19,LE,EECS,EECS 3000,3.0,Professional Practice in Computing,B,6
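
With `Weight` and `Credit` in place, a credit-weighted GPA on the nine-point scale follows from the two columns. Below is a small sketch using the five rows shown above. Whether the dashboard computes GPA this way is an assumption, since the Power BI measures are not part of this notebook, and `weighted_gpa` is an illustrative name:

```python
def weighted_gpa(rows):
    """Credit-weighted mean of grade points: sum(points * credits) / sum(credits)."""
    total_credits = sum(credit for _, credit in rows)
    if total_credits == 0:
        return 0.0
    return sum(points * credit for points, credit in rows) / total_credits


# (Weight, Credit) pairs from the first five rows of the table.
rows = [(8, 3.0), (8, 3.0), (9, 3.0), (7, 3.0), (6, 3.0)]
print(round(weighted_gpa(rows), 2))  # 7.6
```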


Exporting the Pandas DataFrame for use in PowerBI:

In [0]:
grades_df.to_csv("grade_data.csv")

The Final Dashboard:<br><br>
![Dashboard](https://github.com/Inder-Dhillon/YorkU-Grade-Dashboard/raw/master/dashboard.png)

The published dashboard is available at:<br>
https://app.powerbi.com/view?r=eyJrIjoiOWYzYzg1OGUtNTliYS00OGFhLWE3OTctMGM3YjQ0YmZlNDQ1IiwidCI6ImQxYTBkZTYxLThiODAtNDcxZC1hYjk5LTdkN2Q5NjQxMTA3OSJ9