Note: I wrote this script because my friend Pragyaditya Das needed it. I ain't sure if it's legal to scrape Geeks4Geeks. If I come across any policy against it in the future, I shall remove this repo without any hesitation.
git clone "https://github.com/tushar-rishav/g4g-dl.git"
cd g4g-dl
python setup.py install
### Default config: target: g4gPdf
usage: g4g-dl [-h] [-t TARGET] [-p POST] [-d] [-a] [-s START] [-e END]
Downloads Geeks for Geeks DS and Algorithm tutorials and save as PDF
optional arguments:
-h, --help show this help message and exit
-t TARGET, --target TARGET
absolute path of target directory to save all PDFs.
Default is g4gPdf in current dir
-p POST, --post POST link for single post
-s START, --start START
Position to start from. Default is 0
-e END, --end END Position to end at. Default is the last link
group:
-d, --ds Fetch all Data Structures
-a, --algo Fetch all Algorithms
Author:https://github.com/tushar-rishav
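The help text above could be produced by an `argparse` parser along the following lines. This is only a sketch reconstructed from the usage output, not the actual source of g4g-dl: the function name `build_parser`, the argument types, and the `store_true` actions are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the g4g-dl usage text (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="g4g-dl",
        description="Downloads Geeks for Geeks DS and Algorithm tutorials and save as PDF",
        epilog="Author:https://github.com/tushar-rishav",
    )
    parser.add_argument("-t", "--target", default="g4gPdf",
                        help="absolute path of target directory to save all PDFs. "
                             "Default is g4gPdf in current dir")
    parser.add_argument("-p", "--post", help="link for single post")
    # Assumption: positions are parsed as integers.
    parser.add_argument("-s", "--start", type=int, default=0,
                        help="Position to start from. Default is 0")
    parser.add_argument("-e", "--end", type=int, default=None,
                        help="Position to end at. Default is the last link")
    # The help output lists -d and -a under a section titled "group".
    group = parser.add_argument_group("group")
    group.add_argument("-d", "--ds", action="store_true",
                       help="Fetch all Data Structures")
    group.add_argument("-a", "--algo", action="store_true",
                       help="Fetch all Algorithms")
    return parser


if __name__ == "__main__":
    # Parse a sample command line instead of sys.argv for illustration.
    args = build_parser().parse_args(["-d", "-s", "63", "-e", "69"])
    print(args.ds, args.start, args.end)
```

Leaving `--end` as `None` by default lets the caller treat "no end given" as "up to the last link", which matches the documented default.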
To fetch a single tutorial, say Topological Sorting, run this in your shell:
g4g-dl -p http://www.geeksforgeeks.org/topological-sorting/
Note:
- The above command will save the PDF in the default directory.
- You must specify `-d` (for data structures) or `-a` (for algorithms) if you aren't fetching tutorials this way.
g4g-dl -t my_directory_abs_path -d
g4g-dl -t my_directory_abs_path -d -s 63 -e 69
Note: Positions follow the order in which links appear on the page. The above command will fetch some graph tutorials that lie between the 63rd and 69th positions (both inclusive). This way you can download selected tutorials. Go ahead and try downloading just the Dynamic Programming tutorials.
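To make the inclusive range concrete, here is a minimal sketch of how a `-s`/`-e` pair might select links from the scraped list. The helper `select_links` is hypothetical and not taken from the g4g-dl source, and it assumes positions are zero-based, as the documented default start of 0 suggests.

```python
def select_links(links, start=0, end=None):
    """Return the links from position start to position end, both inclusive.

    Hypothetical helper: assumes zero-based positions and that a missing
    end means "up to the last link".
    """
    if end is None:
        end = len(links) - 1
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    # Python slices exclude the upper bound, so add 1 to include `end`.
    return links[start:end + 1]


if __name__ == "__main__":
    # Placeholder links standing in for the scraped tutorial URLs.
    links = ["link-{}".format(i) for i in range(100)]
    selected = select_links(links, 63, 69)
    print(len(selected))              # 7
    print(selected[0], selected[-1])  # link-63 link-69
```

With both ends inclusive, `-s 63 -e 69` covers seven tutorials, not six.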
g4g-dl -h
Have an idea to make it better? Go ahead! I will be happy to see a pull request from you! 😊