download all the files linked to from a web page
downlink

A Python library and command line tool for scraping (and downloading) links on a web page.

library

- linkscraper.py
  LinkScraper - class for scraping links from a page
- document_linkscraper.py
  DocumentLinkScraper - subclass of LinkScraper - class for scraping "document links," which all end in a given file extension, such as ".pdf"
- __init__.py
  imports library classes for cleaner importing
- __main__.py
  main() - entrypoint for command line tool

command line tool

Basic usage:

    $ downlink "https://www.ct.gov/doh/cwp/view.asp?a=4513&q=530462" output

The above will download all PDF documents to a folder called "output", which must exist and be writable.

To download files of a different extension, use the --ext option.

For more usage details, run downlink --help
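
To illustrate the "document links" idea from the library section, here is a minimal sketch using only the Python standard library. It is not the code in linkscraper.py or document_linkscraper.py: the names `_AnchorCollector` and `scrape_document_links` are hypothetical, and the sketch works on an HTML string that has already been fetched, so it performs no network access and no downloading.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _AnchorCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def scrape_document_links(html, base_url, ext=".pdf"):
    """Return absolute URLs of links whose path ends in `ext`.

    Hypothetical function, not part of downlink's actual API.
    The comparison is case-insensitive and ignores query strings.
    """
    collector = _AnchorCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        # Resolve relative links against the page URL.
        absolute = urljoin(base_url, href)
        if urlparse(absolute).path.lower().endswith(ext.lower()):
            links.append(absolute)
    return links
```

Matching on the parsed URL path rather than the raw href is a design choice of this sketch, so that a link such as `report.pdf?v=2` still counts as a PDF; whether downlink itself treats query strings this way is not stated in the README.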