Added scrape_grades command #159

This is needed for grade dist. pdfs to not be tracked by git

Most of this is from Good Bull Schedules, but will most likely change as we go along. Also added __init__.py for it.

Also added a _generate_path function for use in it + load_json_file

Moved into pdf_helper, and simplifies the parse_page function accordingly

This basically just cleans up the return types so they're easier to understand

Basically just for readability purposes; it functions the same. Also removed unused function generate_year_semesters().

Also some misc linting fixes

Also added it to the lint-requirements for GitHub Actions

Also removed the redundant json.close() in load_json_file and instead returned the loaded JSON directly
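A minimal sketch of what returning directly can look like. The signature and the use of a with block are assumptions, not the code from this commit:

```python
import json
from typing import Any


def load_json_file(path: str) -> Any:
    """Return the parsed contents of the JSON file at path (assumed signature)."""
    with open(path, encoding="utf-8") as json_file:
        # The with block closes the file when it exits, even on an early
        # return, so no explicit close() call is needed.
        return json.load(json_file)
```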

Also changed pdf_reader.getNumPages() to .numPages. Also fixed linting error.

- Changed get_pdf_skip_count to assign returned variables inline
- Removed extra grades iteration by adding up num_students in existing for-loop
- Changed list addition operator to .extend for readability
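Roughly, the last two changes amount to the pattern below. collect_grades and its row layout are hypothetical names used only for illustration:

```python
from typing import List, Tuple


def collect_grades(rows: List[List[int]]) -> Tuple[List[int], int]:
    """Flatten per-row grade counts and total them in a single pass."""
    all_grades: List[int] = []
    num_students = 0
    for row in rows:
        # .extend mutates the list in place instead of building a new
        # list with all_grades = all_grades + row.
        all_grades.extend(row)
        # Summing here avoids a second loop over the grades afterwards.
        num_students += sum(row)
    return all_grades, num_students
```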

GradeManager is used for calculating an instructor's past grade distributions

Changed instructor_performance return to specify that Dict value can be a float or int. Rest of commit is minor comment fixes.

Also added beautiful soup to lint-requirements

These are incomplete, and more need to be added as commented

Only the header row of a PDF page indicates that it uses the old pdf style, so we knew the style for the header row but not for the grade rows that follow it. Because the old style has a different format, this prevented us from correctly parsing the section's grades. To remedy this, anytime old_pdf_style is True in pdf_helper.get_pdf_skip_count, we store it (in pdf_parser.parse_page) and use it for the rest of the page. Also adds the corresponding tests for it.
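The idea can be sketched as follows. skip_count_stub is a hypothetical stand-in for pdf_helper.get_pdf_skip_count, and the header markers and row format are invented for the example:

```python
from typing import List, Tuple


def skip_count_stub(row: str) -> Tuple[int, bool]:
    """Hypothetical stand-in: (rows to skip, whether the row marks old style)."""
    if row.startswith("OLD HEADER"):
        return 1, True
    if row.startswith("HEADER"):
        return 1, False
    return 0, False


def parse_page(rows: List[str]) -> List[Tuple[str, bool]]:
    """Pair each grade row with the style detected from the page header."""
    old_pdf_style = False  # reset for every page
    parsed = []
    i = 0
    while i < len(rows):
        skip, is_old = skip_count_stub(rows[i])
        if is_old:
            # Remember the style for the rest of the page, since only the
            # header row carries this information.
            old_pdf_style = True
        if skip:
            i += skip
            continue
        parsed.append((rows[i], old_pdf_style))
        i += 1
    return parsed
```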

Changed PDF_DOWNLOAD_DIR to use dirname instead of relative path. Changed scrape_pdf's counts dictionary to use defaultdict. Other misc semantic syntax changes.
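For illustration, both changes follow a common pattern. build_download_dir, count_outcomes, and the "documents" folder name are assumptions rather than names taken from the commit:

```python
import os
from collections import defaultdict
from typing import DefaultDict, Iterable


def build_download_dir(module_file: str, folder: str = "documents") -> str:
    """Anchor the download folder to the module's location, not the CWD."""
    return os.path.join(os.path.dirname(os.path.abspath(module_file)), folder)


def count_outcomes(outcomes: Iterable[str]) -> DefaultDict[str, int]:
    """Tally outcomes without initialising each key first."""
    counts: DefaultDict[str, int] = defaultdict(int)
    for outcome in outcomes:
        counts[outcome] += 1  # missing keys start at 0
    return counts
```

Inside a module this would be used as PDF_DOWNLOAD_DIR = build_download_dir(__file__), so the path no longer depends on where the command is run from.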

Moved to _create_documents_folder since that's where the actual error will occur

Example usage: python manage.py scrape_grades -c EN --year 2015
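A sketch of how these two options could be declared. Django passes an argparse-based parser to a command's add_arguments, so plain argparse is used here to keep the example self-contained. The --college long name, the help strings, and the defaults are assumptions:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the options seen in the example usage (names partly assumed)."""
    parser = argparse.ArgumentParser(prog="scrape_grades")
    # "--college" is a hypothetical long form; only "-c" appears in the usage.
    parser.add_argument("-c", "--college", default=None,
                        help="College abbreviation, e.g. EN")
    parser.add_argument("--year", type=int, default=None,
                        help="Year to scrape, e.g. 2015")
    return parser


options = build_parser().parse_args(["-c", "EN", "--year", "2015"])
```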

Also adds SSL verification back to scrape_grades.fetch_page_data

- Removed unnecessary import to pass linting
- Changed task collecting to use list comprehension
- Changed colleges & years assignment to use ternary operators
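The ternary and list-comprehension changes might look something like this. ALL_COLLEGES, ALL_YEARS, and build_tasks are hypothetical, and the real tasks are presumably more than plain tuples:

```python
from typing import List, Optional, Tuple

# Hypothetical defaults used when no filter is given on the command line.
ALL_COLLEGES = ["EN", "SC"]
ALL_YEARS = [2018, 2019]


def build_tasks(college: Optional[str] = None,
                year: Optional[int] = None) -> List[Tuple[str, int]]:
    """Return one (college, year) task per combination to scrape."""
    # Ternaries: narrow to the requested value, else fall back to everything.
    colleges = [college] if college else ALL_COLLEGES
    years = [year] if year else ALL_YEARS
    # A list comprehension replaces nested for-loops with .append calls.
    return [(c, y) for y in years for c in colleges]
```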

Commits on Mar 22, 2020