random_scraper

Tools for Web Scraping

This package provides simple methods for scraping data anonymously and avoiding having your IP address blocked by web servers. The approach is to rotate both proxy servers, which changes your IP address over time, and user agents. Both free and paid proxy servers are available online. Free proxies, however, can be slow and unreliable, which may result in missing data.

This package automatically collects and refreshes a list of free proxies available online. It also provides a list of user agents and a user-friendly tool for requesting a page anonymously.
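
The rotation idea can be sketched with the Python standard library alone. This is a minimal illustration, not this package's actual API: the names `PROXIES`, `USER_AGENTS`, `pick_identity`, `build_request`, and `fetch` are hypothetical, and the proxy addresses are placeholders from a documentation-only IP range that will not work as real proxies.

```python
import http.client
import random
import urllib.request

# Hypothetical placeholder values, not taken from this package.
PROXIES = ["203.0.113.10:8080", "203.0.113.11:3128"]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


def pick_identity(proxies=PROXIES, user_agents=USER_AGENTS):
    """Choose a random (proxy, user agent) pair for one request."""
    return random.choice(proxies), random.choice(user_agents)


def build_request(url, proxy, user_agent):
    """Return an opener routed through `proxy` and a request carrying `user_agent`."""
    handler = urllib.request.ProxyHandler(
        {"http": f"http://{proxy}", "https": f"http://{proxy}"}
    )
    opener = urllib.request.build_opener(handler)
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    return opener, request


def fetch(url, retries=3, timeout=10):
    """Try up to `retries` random identities; return the page body or None."""
    for _ in range(retries):
        proxy, user_agent = pick_identity()
        opener, request = build_request(url, proxy, user_agent)
        try:
            with opener.open(request, timeout=timeout) as response:
                return response.read()
        except (OSError, http.client.HTTPException):
            # Free proxies fail often; move on to another random identity.
            continue
    return None
```

Calling `fetch("http://example.com")` picks a new proxy and user agent on every attempt and returns `None` if all attempts fail, so the caller can tell when a page went missing because of unreliable proxies.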

Please send feedback and comments to mab2343@columbia.edu.

Next steps:

  • Write detailed documentation and examples
  • Update the request_page function to scrape AJAX websites

Note: We are not responsible for any misuse of the tools provided. Please scrape content responsibly.
