Python script to scrape for bus and trolley route trace KML files


This is a basic Python script for downloading all bus and trolley route trace KML files from the SEPTA data page. The script uses the Beautiful Soup library to build a list of routes from SEPTA's listing of bus and trolley routes, then downloads a KML file for each route. Using Node modules, each KML file can be converted to a geoJSON file, and the geoJSON files can be combined into a single large geoJSON file (see geoJSON Usage).
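To illustrate the route-listing step, here is a minimal sketch of pulling KML links out of a page. It is not the script's own code: it is written for Python 3 and uses the standard library's html.parser in place of Beautiful Soup so that it runs without extra packages. The sample HTML, the example.org URL, and the names KmlLinkParser and find_kml_links are all assumptions made for illustration; the real SEPTA page will be structured differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class KmlLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at a .kml file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".kml"):
            # Resolve relative links against the URL of the listing page.
            self.links.append(urljoin(self.base_url, href))


def find_kml_links(html, base_url):
    """Return absolute URLs for all .kml links found in `html`."""
    parser = KmlLinkParser(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page fragment and base URL, for illustration only.
sample = (
    '<ul>'
    '<li><a href="/kml/13.kml">Route 13</a></li>'
    '<li><a href="about.html">About</a></li>'
    '</ul>'
)
print(find_kml_links(sample, "https://example.org/routes/"))
```

Each URL returned this way could then be fetched and written to output/kml, which is where the script places its downloads.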

Right now the script is set up for one-time use; there are no timestamps and nothing to add changes incrementally. So, to re-run it or use a different version of the script, you'll need to blow away the output directory first: rm -rf output
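If you would rather do the cleanup from Python than from the shell, something like the following sketch should work. The helper name reset_output_dir is an assumption for illustration and is not part of the script; shutil.rmtree is the standard-library counterpart of rm -rf.

```python
import os
import shutil


def reset_output_dir(path="output"):
    """Delete `path` and everything under it, then recreate it empty."""
    # ignore_errors=True means a missing directory is not treated as a failure.
    shutil.rmtree(path, ignore_errors=True)
    os.makedirs(path)
    return path
```

Calling reset_output_dir() before a run gives the script a clean output directory to write into.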


KML Only

  • Python 2.7
  • Python Libraries:
    • BeautifulSoup4
    • urllib2
    • requests
    • shutil


  • NodeJS
  • npm
  • Python subprocess library

KML Only Usage


KML files will appear in output/kml. Filenames follow the pattern [route]-[type].[extension], like 13-trolley.kml or 33-bus.geojson. You can open them all in Google Earth.
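The naming pattern is easy to build and to take apart again, which helps if you want to group the output by route or by type. The two helpers below are hypothetical names used for illustration, not functions from the script.

```python
import os


def route_filename(route, kind, extension):
    """Build a name such as '13-trolley.kml' from its three parts."""
    return "{}-{}.{}".format(route, kind, extension)


def parse_route_filename(filename):
    """Split a name such as '33-bus.geojson' into (route, type, extension)."""
    stem, extension = os.path.splitext(os.path.basename(filename))
    # Split on the last hyphen so a route ID containing a hyphen stays intact.
    route, kind = stem.rsplit("-", 1)
    return route, kind, extension.lstrip(".")
```

For example, parse_route_filename("output/kml/13-trolley.kml") splits the name into the route "13", the type "trolley", and the extension "kml".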

geoJSON Usage:

npm install

This will populate two directories, output/kml and output/geojson, which contain all routes in the respective file type.
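The project does its conversion and merging with Node modules. Purely as an illustration of what the merge amounts to, here is a Python sketch that gathers the features from every .geojson file in a directory into one FeatureCollection. The function name combine_geojson is an assumption, and the sketch assumes each input file holds either a FeatureCollection or a single Feature.

```python
import glob
import json
import os


def combine_geojson(input_dir, output_path):
    """Merge every .geojson file in `input_dir` into one FeatureCollection.

    Returns the number of features written.
    """
    features = []
    for path in sorted(glob.glob(os.path.join(input_dir, "*.geojson"))):
        with open(path) as handle:
            data = json.load(handle)
        if data.get("type") == "FeatureCollection":
            features.extend(data.get("features", []))
        elif data.get("type") == "Feature":
            features.append(data)
    combined = {"type": "FeatureCollection", "features": features}
    with open(output_path, "w") as handle:
        json.dump(combined, handle)
    return len(features)
```

Something like combine_geojson("output/geojson", "output/all-routes.geojson") would write the merged file outside the input directory, so a second run does not pick up its own output.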

If you'd like to compress all-routes.geojson, install the geojson-minifier package globally.

npm install -g geojson-minifier
geojson-minifier -o pack -f output/all-routes.geojson -p 6

That will compress all-routes.geojson to about 10 or 11 MB. It's possible to load it with Leaflet.js if you change the extension to .js and declare a variable on the first line, like:

var septa = {
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",

and so on.
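Rather than editing an 11 MB file by hand, the variable declaration can be added with a few lines of Python. This is a sketch under assumptions: the helper name geojson_to_js is hypothetical, and the variable name defaults to septa only to match the snippet above.

```python
def geojson_to_js(geojson_path, js_path, variable="septa"):
    """Write a .js file that assigns the GeoJSON content to a variable."""
    with open(geojson_path) as src:
        text = src.read().strip()
    with open(js_path, "w") as dst:
        # Valid JSON is also a valid JavaScript object literal,
        # so the file content can be used as the right-hand side as-is.
        dst.write("var {} = {};\n".format(variable, text))
    return js_path
```

For instance, geojson_to_js("output/all-routes.geojson", "output/all-routes.js") would produce a file that can be included with a script tag ahead of the Leaflet code that reads the variable.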

Leaflet.js can load the 11MB geoJSON file, but it will put some stress on your browser and take a long time to load. And it's surely not the best way to use this data. Proof of concept here.


This is my first project in Python; please contact me if I've made a major error.