PageLoader is a command-line utility that downloads web pages and saves them to your computer. Along with the page itself, it downloads the page's resources (images, stylesheets, and JavaScript files), so you can open the saved page offline.
| Tool | Version |
|---|---|
| python | "^3.8.1" |
| requests | "^2.28.1" |
| beautifulsoup4 | "^4.11.1" |
| progress | "^1.6" |
Before installation, make sure that you have Python and Poetry installed.
- Clone the repository to your computer
git clone https://github.com/ratushnyyvm/page-loader.git
- Go to the project folder
cd page-loader
- Install the program
make setup
$ page-loader -h
usage: page-loader [-h] [-o OUTPUT] url
Downloads web-pages and save locally
positional arguments:
url
optional arguments:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
output dir (default: current directory)
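For readers curious how a command-line interface like the one above can be declared, here is a minimal sketch using the standard `argparse` module. It is an illustration only, not the project's actual source: the function name `build_parser` is hypothetical, and the default output directory is assumed to be the current working directory, as the help text suggests.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the same shape as the help text above (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="page-loader",
        # Description string copied verbatim from the help output.
        description="Downloads web-pages and save locally",
    )
    # Required positional argument: the address of the page to download.
    parser.add_argument("url")
    # Optional output directory; assumed to default to the current directory.
    parser.add_argument(
        "-o", "--output",
        default=os.getcwd(),
        help="output dir (default: current directory)",
    )
    return parser


# Parse an explicit argument list instead of sys.argv, so the sketch runs anywhere.
args = build_parser().parse_args(["-o", "/var/tmp", "https://ru.hexlet.io/courses"])
print(args.url, args.output)
```

Passing an explicit list to `parse_args` keeps the example independent of how the script is invoked; a real entry point would normally call `parse_args()` with no arguments.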
from page_loader import download
file_path = download('https://ru.hexlet.io/courses', '/var/tmp')
print(file_path) # => '/var/tmp/ru-hexlet-io-courses.html'
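The returned path suggests that the file name is derived from the URL: the scheme is dropped, every character that is not a letter or digit becomes a hyphen, and `.html` is appended. The sketch below reproduces that naming for the example above. The function `url_to_filename` is hypothetical, and the rule is inferred from this single example, so the real implementation may treat other URLs (query strings, existing file extensions) differently.

```python
import re
from urllib.parse import urlparse


def url_to_filename(url: str) -> str:
    """Turn a page URL into a local file name (hypothetical helper, inferred rule)."""
    parsed = urlparse(url)
    # Drop the scheme and keep only host + path, without a trailing slash.
    name = (parsed.netloc + parsed.path).rstrip("/")
    # Replace every character that is not an ASCII letter or digit with a hyphen.
    slug = re.sub(r"[^A-Za-z0-9]", "-", name)
    return slug + ".html"


print(url_to_filename("https://ru.hexlet.io/courses"))  # => ru-hexlet-io-courses.html
```

Joining this name with the output directory `/var/tmp` gives the path shown in the example above.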