(Work in progress; proceed with caution)
Lecdown is a small script to automatically download new/updated lecture notes from a configured course website.
- Check and download configured web pages for new/updated links
- Rename files with scripting
- Use ETag header to efficiently check for updates
- Download HTML pages as PDF
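As a rough illustration of the ETag check listed above, here is a minimal sketch of a conditional GET using only the Python standard library. It is not taken from lecdown's source: the function name `fetch_if_changed` and the injectable `urlopen` parameter are assumptions made for this example. The idea is to send the previously stored ETag in an `If-None-Match` header, so the server can answer `304 Not Modified` instead of sending the file again.

```python
import urllib.error
import urllib.request


def fetch_if_changed(url, etag=None, urlopen=urllib.request.urlopen):
    """Hypothetical helper: download url only if its ETag has changed.

    Returns (body, new_etag) when the server sends the file, or
    (None, etag) when it replies 304 Not Modified.
    """
    request = urllib.request.Request(url)
    if etag:
        # Ask the server to skip the body if our copy is still current.
        request.add_header("If-None-Match", etag)
    try:
        with urlopen(request) as response:
            return response.read(), response.headers.get("ETag")
    except urllib.error.HTTPError as error:
        # urllib reports 304 as an HTTPError; treat it as "unchanged".
        if error.code == 304:
            return None, etag
        raise
```

A caller would keep the returned ETag next to the downloaded file and pass it back in on the next run.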
Install requirements for xattr and ChromeDriver:
# On Mac: I forgot...
brew install chromedriver # and something else I forgot...
# On Fedora, etc.:
sudo dnf install chromedriver python3-devel libffi-devel
Install this package:
pip3 install git+https://github.com/jasonchoimtt/lecdown
You can also clone this repository and install lecdown in "editable" mode:
git clone https://github.com/jasonchoimtt/lecdown
cd lecdown
pip3 install -e .
Now running lecdown will use the version in the local repo.
# This creates lecdown.json in the current directory
lecdown init
# Add a page to extract links from
lecdown add-source http://path.to/some/course/page
# Download!
lecdown
# List downloaded files
lecdown ls
Lecdown works by storing an index in lecdown.json. Currently, it ignores any
HTML links and downloads everything else. It does not scrape links of links
either. It associates the downloaded files with the origin link in
lecdown.json, and also uses extended file attributes (on Mac and Linux) to
keep track of file moves.
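The link selection described above can be sketched roughly as follows. This is not lecdown's actual code. In particular, deciding that a link is "HTML" by looking at its file extension is an assumption made for this example, and `LinkCollector`, `downloadable_links` and `HTML_EXTENSIONS` are hypothetical names. The sketch collects the links on a single page (no links of links) and keeps only those that do not look like HTML pages.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: links with no extension or an HTML-like one point to pages.
HTML_EXTENSIONS = {"", ".html", ".htm", ".php"}


class LinkCollector(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page's own URL.
            self.links.append(urljoin(self.base_url, href))


def downloadable_links(page_html, base_url):
    """Return the non-HTML http(s) links found directly on the page."""
    collector = LinkCollector(base_url)
    collector.feed(page_html)
    result = []
    for link in collector.links:
        parsed = urlparse(link)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        extension = posixpath.splitext(parsed.path)[1].lower()
        if extension not in HTML_EXTENSIONS:
            result.append(link)
    return result
```

Each link returned this way would then be downloaded and recorded against its origin URL in the index.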
Some web pages (e.g. Piazza resources) require a login to be scraped. You can use
lecdown browser to log in to that page, then save the cookie in the console.
Currently, only the link scraper (but not the file downloader) uses the saved
cookie, but it still works for Piazza.
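To illustrate what "using the saved cookie" could look like for the link scraper, here is a minimal sketch with the standard library. How lecdown actually stores and sends the cookie is not documented here, so the function name `request_with_cookie` and the idea of passing the cookie as a single header string are assumptions made for this example.

```python
import urllib.request


def request_with_cookie(url, cookie):
    """Hypothetical helper: build a request that carries a saved cookie.

    cookie is the raw header value, e.g. "session=abc123".
    """
    request = urllib.request.Request(url)
    # Send the saved cookie so the server treats the scraper as logged in.
    request.add_header("Cookie", cookie)
    return request
```

The resulting request could be passed to `urllib.request.urlopen` to fetch the page behind the login.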