A web scraper that collects the services listed in the Get Help section of the 211 SEPA website.
Python
sepa211scraper
.DS_Store
README.md
scrapy.cfg

README.md

sepa211scraper

I made this scraper to collect the services listed on the SEPA 211 website: https://211sepa.org/browse/. Changes I'd like to make include:

  • Right now the spider is hard-coded to scrape services that "serve Philadelphia" by using "search/?area_served=Philadelphia&" in the URL. I'd like it to accept either command-line arguments or user prompts for the search options available on the site, including distance from a zip code, county served, and where the agency is physically located: https://211sepa.org/search/?advanced=true.
  • Make some PEP 8 style changes.
  • Clean up the ways I use regular expressions and make them consistent.
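
As a rough sketch of the first item, the search URL could be built from command-line options instead of being hard-coded. Only the `area_served` parameter comes from the current spider. The `zip_code`, `distance`, and `located_in` query names, and the helper names `build_parser` and `build_search_url`, are placeholders I made up for illustration, so they would need to be checked against the advanced search form before use.

```python
import argparse
from urllib.parse import urlencode

BASE_URL = "https://211sepa.org/search/"


def build_parser():
    """Command-line options mirroring the site's search options."""
    parser = argparse.ArgumentParser(description="Build a 211 SEPA search URL.")
    # "area_served" is the parameter the spider already uses.
    parser.add_argument("--area-served", default="Philadelphia",
                        help="county or area served")
    # The options below are hypothetical; the real query names may differ.
    parser.add_argument("--zip-code", help="zip code to search around")
    parser.add_argument("--distance", type=int,
                        help="distance from the zip code")
    parser.add_argument("--located-in",
                        help="where the agency is physically located")
    return parser


def build_search_url(args):
    """Turn parsed options into a search URL, skipping unset options."""
    params = {
        "area_served": args.area_served,
        "zip_code": args.zip_code,      # hypothetical query name
        "distance": args.distance,      # hypothetical query name
        "located_in": args.located_in,  # hypothetical query name
    }
    # Drop anything the user did not supply so the URL stays minimal.
    params = {key: value for key, value in params.items() if value is not None}
    return BASE_URL + "?" + urlencode(params)
```

Calling `build_search_url(build_parser().parse_args())` from a small script would read the options from the command line. Scrapy can also pass values to a spider with `-a name=value`, which might be a simpler route than a separate parser.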