### CAO Points Notebook

Include a Jupyter notebook called cao.ipynb that contains the following.

10% A clear and concise overview of how to load CAO points information from the
CAO website into a pandas data frame, pitched at your classmates.
***

20% A detailed comparison of CAO points in 2019, 2020, and 2021 using the functionality in pandas.
***

10% Appropriate plots and other visualisations to enhance your notebook for viewers.
***

In [1]:
# Importing required packages:
# HTTP Requests
import requests as rq

# Regular Expressions
import re

# Dates and times
import datetime as dt

# Data frames
import pandas as pd

# For URL downloads
import urllib.request as urlrq

# CAO Points for 2021 <br>
http://www.cao.ie/index.php?page=points&p=2021
***

In [2]:
# Getting data from the CAO website
resp = rq.get("http://www2.cao.ie/points/l8.php")

# Confirming it is ok

resp

<Response [200]>

### Saving original dataset

In [3]:
# Getting current date and time
now = dt.datetime.now()

# Format as a string
nowstr = now.strftime("%Y%m%d_%H%M%S")
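To see what the format string produces, here is a small sketch that uses a fixed moment instead of `now()`, so the result is the same on every run. The date and time chosen are arbitrary and only for illustration.

```python
import datetime as dt

# A fixed moment is used here so the result is reproducible;
# the notebook itself uses dt.datetime.now().
example = dt.datetime(2021, 12, 13, 21, 43, 5)

# Year, month, day, an underscore, then hour, minute, second.
stamp = example.strftime("%Y%m%d_%H%M%S")
print(stamp)  # 20211213_214305
```

Because the parts run from largest (year) to smallest (second), file names built from this string sort in chronological order.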

In [4]:
# Creating a file path
path = 'data/cao2021_' + nowstr + '.html'
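The path above assumes that a `data` folder already exists. A minimal sketch of creating it first, using `pathlib`, is shown here. The temporary base folder and the placeholder timestamp are assumptions made so the example can run anywhere; in the notebook the base would simply be the working directory.

```python
from pathlib import Path
import tempfile

# A temporary folder stands in for the project folder in this sketch.
base = Path(tempfile.mkdtemp())

# Placeholder timestamp for illustration; the notebook builds it with strftime.
nowstr = "20211213_214305"
path = base / "data" / ("cao2021_" + nowstr + ".html")

# Create data/ (and any missing parents); do nothing if it already exists.
path.parent.mkdir(parents=True, exist_ok=True)

print(path.parent.is_dir())
```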

### Decoding with cp1252, which includes the fada
***

In [None]:
# To include CAO website encoding screen

In [5]:
# Record the encoding reported by the server
original_encoding = resp.encoding

# Decode the response as cp1252 instead
resp.encoding = 'cp1252'
original_encoding

'iso-8859-1'
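As a rough illustration of how the two encodings relate, the sketch below decodes the same bytes both ways. The byte `0x96` is picked only as an example of a value where they differ; it is an assumption for illustration, not something taken from the CAO page.

```python
# Accented vowels sit at the same byte values in both encodings.
raw_fada = "é".encode("iso-8859-1")
same = raw_fada.decode("cp1252") == raw_fada.decode("iso-8859-1")

# Byte 0x96 is one place where they differ: an en dash in cp1252,
# an invisible control character in iso-8859-1.
raw_dash = b"\x96"
as_cp1252 = raw_dash.decode("cp1252")
as_latin1 = raw_dash.decode("iso-8859-1")

print(same, repr(as_cp1252), repr(as_latin1))
```

cp1252 assigns printable characters to the byte range 0x80 to 0x9F, where iso-8859-1 has only control characters, so decoding as cp1252 is the more forgiving choice when a page was written on Windows.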

In [6]:
# Save the original HTML file
with open(path, 'w') as f:
    f.write(resp.text)
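Once the page is saved, the `re` module imported earlier could be used to pick out the course lines. The following is only a sketch: the line layout (code, title, then a points figure, separated by two or more spaces) is an assumption, the heading line is made up, and the two course rows are borrowed from the 2019 table shown later in this notebook purely as sample text.

```python
import re

# Assumed layout: two capital letters and three digits, the title,
# then at least two spaces before the first points figure.
re_course = re.compile(r"^([A-Z]{2}[0-9]{3})\s+(.+?)\s{2,}(\d+)")

# Sample text only; the real lines would come from resp.iter_lines()
# or from the saved HTML file.
sample_lines = [
    "<h3>Some heading</h3>",
    "CW258  Cybercrime and IT Security  300",
    "",
    "CW478  Civil Engineering  348",
]

courses = []
for line in sample_lines:
    match = re_course.match(line)
    if match:  # skip headings and blank lines
        code, title, points = match.groups()
        courses.append({"code": code, "title": title, "points": int(points)})

print(courses)
```

A list of dictionaries like `courses` can be passed straight to `pd.DataFrame` to get one row per course.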

# CAO Points for 2020 <br>
https://www.cao.ie/index.php?page=points&p=2020
http://www2.cao.ie/points/CAOPointsCharts2020.xlsx
***
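The 2020 points are published as a spreadsheet, so one possible way to load them is `pd.read_excel`. This is a sketch under assumptions: the function name `load_points_2020` is hypothetical, reading `.xlsx` files needs the `openpyxl` package installed, and the number of rows to skip above the column headings has to be found by opening the file, so it is left as a parameter.

```python
import pandas as pd

URL_2020 = "http://www2.cao.ie/points/CAOPointsCharts2020.xlsx"


def load_points_2020(source=URL_2020, skiprows=0):
    """Read the 2020 spreadsheet into a data frame.

    source may be a URL or a local file path. skiprows is the number
    of rows above the column headings; 0 is a placeholder, not the
    verified value for this file.
    """
    return pd.read_excel(source, skiprows=skiprows)
```

Calling `load_points_2020()` downloads the file each time, so saving a timestamped copy first, as done for the 2021 page, keeps the original data available.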

# CAO Points for 2019 <br>
https://www.cao.ie/index.php?page=points&p=2019
http://www2.cao.ie/points/lvl8_19.pdf
***

Importing Camelot and reading the PDF. By default Camelot reads only the first page; passing `pages='all'` reads every page, as explained on Stack Overflow: https://stackoverflow.com/questions/56777241/camelot-is-reading-only-the-first-page-of-the-pdf

In [25]:
import camelot

tables2019 = camelot.read_pdf('http://www2.cao.ie/points/lvl8_19.pdf', pages='all')

tables2019

<TableList n=18>

In [15]:
tables2019.export('data/cao2019_20211213_2143.csv', f='csv', compress=True)

In [31]:

tables2019[1].df


|   | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | CW258 | Cybercrime and IT Security | 300 | 328.0 |
| 1 | CW268 | Computing in Interactive Digital Art and Design | 274 | 321.0 |
| 2 | CW438 | Construction (options) | 271 | 308.0 |
| 3 | CW468 | Architectural Technology | 252 | 290.0 |
| 4 | CW478 | Civil Engineering | 348 | 383.0 |
| 5 | CW548 | Mechanical Engineering | 310 | 351.0 |
| 6 | CW558 | Electronic Systems | 279 | 338.0 |
| 7 | CW568 | Aerospace Engineering | 366 | 422.0 |
| 8 | CW578 | TV and Media Production | 327 | 361.0 |
| 9 | CW708 | Law - LLB | 298 | 328.0 |

In [32]:
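# Without parentheses this only displays the method; to write a file, call it with a path: to_csv(path)
tables2019[0].to_csv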


<bound method Table.to_csv of <Table shape=(44, 4)>>

### References

In [10]:
# Camelot https://www.youtube.com/watch?v=LoiHI-IB3lY