About

This is a basic Python script for downloading all bus and trolley route trace KML files from the SEPTA data page. The script uses the Beautiful Soup library to build a list of routes from SEPTA's list of bus and trolley routes and then downloads a KML file for each route. Using Node modules, each KML file can be converted to a geoJSON file, and the geoJSON files can be combined into a single large geoJSON file (see geoJSON Usage).
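The route-listing step can be sketched roughly as follows. This is only an illustration and not the script's actual code: it is written for Python 3, it swaps Beautiful Soup for the standard library's html.parser so that it runs without extra packages, and both the sample HTML and the href layout (/kml/[type]/[route].kml) are hypothetical, since the real structure of the SEPTA page is not described here.

```python
from html.parser import HTMLParser


class RouteLinkParser(HTMLParser):
    """Collect (route, type, href) tuples from anchors that point at .kml files."""

    def __init__(self):
        super().__init__()
        self.routes = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if not href.lower().endswith(".kml"):
            return
        # ASSUMPTION: hrefs look like .../<type>/<route>.kml (hypothetical layout).
        parts = href.split("/")
        if len(parts) < 2:
            return
        route = parts[-1][:-len(".kml")]
        route_type = parts[-2]
        self.routes.append((route, route_type, href))


def list_routes(html):
    """Return every KML route link found in an HTML string."""
    parser = RouteLinkParser()
    parser.feed(html)
    return parser.routes


# Made-up sample markup standing in for the real route list page.
SAMPLE_HTML = """
<ul>
  <li><a href="/kml/trolley/13.kml">Route 13</a></li>
  <li><a href="/kml/bus/33.kml">Route 33</a></li>
  <li><a href="/about.html">About</a></li>
</ul>
"""

for route, route_type, href in list_routes(SAMPLE_HTML):
    print(route, route_type, href)
```

Each tuple would then be handed to whatever download step fetches the KML file for that route.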

Right now the script is set up for one-time use: it records no timestamps and has no way to apply changes incrementally. To re-run it, or to use a different version of the script, you'll need to delete the output directory first: rm -rf output
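If you would rather do the cleanup from Python than from the shell, a minimal sketch could look like this. The function name clear_output is hypothetical and is not part of the script; it only mirrors what rm -rf output does.

```python
import os
import shutil


def clear_output(path="output"):
    """Remove the output directory and everything in it, if it exists.

    Returns True when a directory was removed and False when there was
    nothing to remove.
    """
    if os.path.isdir(path):
        shutil.rmtree(path)
        return True
    return False
```

Calling clear_output() before a re-run leaves the script free to recreate the directory from scratch.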

Dependencies

KML Only

  • Python 2.7
  • Python Libraries:
    • BeautifulSoup4
    • urllib2
    • requests
    • shutil

geoJSON

  • NodeJS
  • npm
  • Python subprocess library

KML Only Usage

python scrape-septa-only-kml.py

KML files will appear in output/kml. Filenames follow the pattern [route]-[type].[extension], like 13-trolley.kml or 33-bus.geojson. Open them all in Google Earth.
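The naming pattern is simple enough to express as a small helper. The function below is a hypothetical illustration of the [route]-[type].[extension] convention, not code taken from the script.

```python
def route_filename(route, route_type, extension):
    """Build a filename of the form [route]-[type].[extension]."""
    return "{}-{}.{}".format(route, route_type, extension)


print(route_filename("13", "trolley", "kml"))
print(route_filename(33, "bus", "geojson"))
```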

geoJSON Usage

npm install
python scrape-septa-kml-geojson.py

This will populate two directories, output/kml and output/geojson, each containing all routes in the respective file type.
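The script relies on Node modules for conversion and merging, but the merge itself is easy to picture. The following is a rough Python 3 sketch of the idea, assuming each per-route file holds either a FeatureCollection or a single Feature; the function name combine_geojson and its arguments are hypothetical.

```python
import glob
import json
import os


def combine_geojson(input_dir, output_path):
    """Merge every .geojson file in input_dir into one FeatureCollection."""
    features = []
    for path in sorted(glob.glob(os.path.join(input_dir, "*.geojson"))):
        with open(path) as handle:
            data = json.load(handle)
        if data.get("type") == "FeatureCollection":
            # Flatten the per-route collection into the combined list.
            features.extend(data.get("features", []))
        elif data.get("type") == "Feature":
            features.append(data)
    combined = {"type": "FeatureCollection", "features": features}
    with open(output_path, "w") as handle:
        json.dump(combined, handle)
    return combined
```

For example, combine_geojson("output/geojson", "output/all-routes.geojson") would write the combined file next to the two output directories rather than inside output/geojson, so a second run does not pick up its own result.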

If you'd like to compress all-routes.geojson, install the geojson-minifier package globally.

npm install -g geojson-minifier
geojson-minifier -o pack -f output/all-routes.geojson -p 6

That will compress all-routes.geojson to 10 or 11 MB. It's possible to load the file with Leaflet.js if you change the extension to .js and declare a variable in the first line, like:

var septa = {
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",

and so on.

Leaflet.js can load the 11 MB geoJSON file, but it will put some stress on your browser and take a long time to load, and it's surely not the best way to use this data. Proof of concept here.
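Turning the geoJSON file into a .js file with a leading variable declaration can also be automated instead of edited by hand. This is a minimal sketch in Python 3; the function name geojson_to_js is hypothetical, and the default variable name septa simply follows the example above.

```python
def geojson_to_js(geojson_path, js_path, variable="septa"):
    """Write a .js file that assigns the geoJSON content to a global variable."""
    with open(geojson_path) as source:
        text = source.read().strip()
    with open(js_path, "w") as target:
        # Produces: var septa = { ...geoJSON... };
        target.write("var {} = {};\n".format(variable, text))
```

For example, geojson_to_js("output/all-routes.geojson", "output/all-routes.js") would produce a file that can be included with a script tag, after which the septa variable is available to the page.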

Problems

This is my first project in Python; please contact me if I've made a major error.
