nushPOSextractor

Extractor for NUS High Programme of Studies (POS) from PDF to CSV.

Usage instructions

Create a Python 3 virtual environment (I used Python 3.11, but most versions above 3.8 should work)
Run pip install -r requirements.txt
Download the POS of your choice and save it in the same folder as the scripts as "POS.pdf"
Run PDFfilter.py, which will create merged.pdf
Open merged.pdf with Microsoft Word and save it as table.docx in the same folder
Run CSVFromWord.py, which should generate pos.csv
Profit

This repo includes a pos.csv generated from the POS for C2028.

Columns currently included: "department", "level", "sem", "code", "type", "title", "description", "mcs", "prerequisites", "preclusions", "corequisites", "hrs", "remarks"
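
As a rough illustration of how the generated file could be consumed, here is a minimal sketch using Python's standard csv module. It assumes pos.csv starts with a header row using the column names listed above; the helper names (load_modules, modules_in_department) and the sample rows are made up for this example and are not part of the repo.

```python
import csv
import io

def load_modules(csv_file):
    """Read every row of an open pos.csv-style file into a list of dicts."""
    return list(csv.DictReader(csv_file))

def modules_in_department(rows, department):
    """Return the rows whose "department" column equals the given value."""
    return [row for row in rows if row["department"] == department]

# Hypothetical sample data standing in for pos.csv (not real POS content).
sample = io.StringIO(
    "department,level,sem,code,title\n"
    "DeptA,1,1,AA1001,Example Module One\n"
    "DeptB,1,2,BB1002,Example Module Two\n"
)
rows = load_modules(sample)
for row in modules_in_department(rows, "DeptA"):
    print(row["code"], row["title"])
```

To read the real file, pass a file opened with open("pos.csv", newline="", encoding="utf-8") instead of the in-memory sample. The newline="" argument lets the csv module handle any newlines inside quoted fields; the UTF-8 encoding is an assumption.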

How it works, in case this stops working

PDFfilter.py keeps only the horizontal (landscape) pages, since a page is horizontal if and only if it contains useful table data. We then exploit Microsoft Word's ability to open PDFs to turn the tables into a Word document, because PDF tables are impossible to manipulate directly. Finally, CSVFromWord.py deals with the scuffed newlines and writes everything out in CSV format.
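
If the scripts ever need rebuilding, the two automated steps could be sketched roughly as below. This is not the code in this repo: it assumes the pypdf and python-docx libraries (the repo's requirements.txt may list different ones), and every function name here is hypothetical. It also writes table rows out as-is, without mapping them to the column names above or skipping repeated header rows.

```python
import csv
import re

def is_landscape(width, height, rotation=0):
    """True if a page is wider than tall once its rotation is applied."""
    if rotation % 180 == 90:  # a 90 or 270 degree rotation swaps the sides
        width, height = height, width
    return width > height

def filter_landscape_pages(src="POS.pdf", dst="merged.pdf"):
    """Copy only the landscape pages of src into dst."""
    from pypdf import PdfReader, PdfWriter  # assumed library, imported lazily

    reader = PdfReader(src)
    writer = PdfWriter()
    for page in reader.pages:
        box = page.mediabox
        if is_landscape(float(box.width), float(box.height), page.rotation):
            writer.add_page(page)
    with open(dst, "wb") as out:
        writer.write(out)

def clean_cell(text):
    """Collapse stray newlines and repeated whitespace inside a table cell."""
    return re.sub(r"\s+", " ", text).strip()

def docx_tables_to_csv(src="table.docx", dst="pos.csv"):
    """Write every table row found in the Word document to a CSV file."""
    from docx import Document  # python-docx, assumed library

    doc = Document(src)
    with open(dst, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for table in doc.tables:
            for row in table.rows:
                writer.writerow([clean_cell(cell.text) for cell in row.cells])
```

Calling filter_landscape_pages() and then, after the manual Word step, docx_tables_to_csv() would mirror the usage instructions. The page-orientation check and the cell cleanup are kept as plain functions so they can be tried without either library installed.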

Walnit/nushPOSextractor

Folders and files

Latest commit

History

Repository files navigation

nushPOSextractor

Usage instructions

How it works, in case this stops working

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages