A scraper to generate a PDF of the book Introductory Chemistry: A Foundation

markasoftware/mindtap-scraper

MindTap Scraper

This is a scraper that saves the contents of the online textbook "Introductory Chemistry: A Foundation", 8th edition, as PDF files. The PDFs generated by this tool are intended for personal, non-commercial use only, by people who already have access to the online textbook ;).

How to use

Before starting, please note that it can take upwards of 30 minutes for the scraper to complete. It is not an instant process.

  1. Download and install PhantomJS for your operating system. You can also install it with npm if you prefer. As root: npm i -g phantomjs-prebuilt
  2. Download or clone this repo (click on clone or download, then download zip). If it's a zip, you'll need to extract it before use.
  3. Run phantomjs main.js yourusername@example.com pAs5w0rD true from the cloned directory to begin scraping, substituting your own username and password for the examples. Omit the trailing true if you do not want answers shown by default.
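
Before starting a run that can take half an hour, it may help to confirm that the first two steps actually succeeded. The following is a minimal sketch, not part of this repo: the function name preflight is hypothetical, and it only checks that phantomjs is on the PATH and that main.js is in the current directory.

```shell
# Hypothetical pre-flight check (not part of this repo).
# Prints what is missing to stderr and returns non-zero if anything is.
preflight() {
    missing=0
    if ! command -v phantomjs >/dev/null 2>&1; then
        echo "phantomjs not found on PATH (see step 1)" >&2
        missing=1
    fi
    if [ ! -f main.js ]; then
        echo "main.js not found; run this from the cloned directory (see step 2)" >&2
        missing=1
    fi
    return "$missing"
}

# Usage (not executed here):
#   preflight && phantomjs main.js yourusername@example.com pAs5w0rD true
```

The check is deliberately shallow: it does not verify the PhantomJS version or the login details.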

Note: If you see any "JSON Parse error" messages during the scraping, do not worry. This is a bug with MindTap and does not affect the scraping process.

After following these steps, separate PDF files for every "page" in the online textbook will be located in the pdfs directory.

Sometimes the program will hang at the table of contents step, or all the PDFs will be copies of the chapter 1 table of contents. If either of these happens, just restart the script.
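
If you would rather not watch for hangs by hand, the restart can be scripted. This is a rough sketch under several assumptions: it relies on the timeout command from GNU coreutils, the function name retry_with_timeout and the 90m limit are made up for illustration, and it only catches hangs or non-zero exits. It cannot tell when the PDFs are all copies of the table of contents, so that case still needs a manual check.

```shell
# Hypothetical helper (not part of this repo).
# Usage: retry_with_timeout LIMIT ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND under GNU timeout, restarting it up to ATTEMPTS times
# if it exceeds LIMIT or exits with a non-zero status.
retry_with_timeout() {
    limit=$1
    attempts=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if timeout "$limit" "$@"; then
            return 0
        fi
        echo "attempt $n failed or timed out; restarting" >&2
        n=$((n + 1))
    done
    return 1
}

# Usage (not executed here). The limit must be generous, since a normal
# run can take upwards of 30 minutes:
#   retry_with_timeout 90m 3 phantomjs main.js yourusername@example.com pAs5w0rD true
```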

Combining and compressing the resultant PDFs

You can use whatever tool you want to combine the PDFs created, but if you're on Linux (it might work on macOS too), an easy way to do it is to run the following command from the pdfs directory (make sure Ghostscript is installed):

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -dFastWebView -dPDFSETTINGS=/screen -sOutputFile=../introductory-chemistry-a-foundation.pdf $(ls | sort -V)
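
The $(ls | sort -V) part of the command decides the page order of the combined file. A small sketch of what version sort does, using made-up file names (the scraper's real naming scheme may differ):

```shell
# Hypothetical file names, chosen only to show the ordering.
names='10.pdf
2.pdf
1.pdf'

# sort -V compares embedded numbers numerically, so 10 comes after 2.
ordered=$(printf '%s\n' "$names" | sort -V | paste -sd' ' -)
echo "$ordered"
```

Note that $(ls | sort -V) is split on whitespace by the shell, so the command above assumes the PDF file names contain no spaces.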
