Hack Oregon Transportation Systems Backend Repository
This repo represents the work of the Transportation Systems project of Hack Oregon. We are a volunteer project for Open Data.
This repo is intended to be run in a Docker environment.
A note about line endings
As you probably know, text files on Linux (where we deploy and where our containers run) have, by convention, lines ending with LF. However, Windows, where some of us test and develop, uses the convention that lines in a text file end with a CR and an LF.
Git knows what we're using, and it tries to accommodate us by checking out working files with the line endings our platform uses. See this explainer from GitHub on how line endings work with Git: https://help.github.com/articles/dealing-with-line-endings/.
Now, throw in Docker and Docker Compose. Some Dockerfiles call for copying files from your host into the image during the build. If those files came from a Git repo, by default their line endings follow your host's native convention. But inside the image, which is a Linux filesystem, some files must use the Linux line ending convention or they won't work.
So far, we know that Bash and sh scripts will crash if they have Windows line endings, often with mysterious error messages. Also, Python scripts like Django's manage.py, when executed as commands, will crash mysteriously if they have Windows line endings.
One final note: vim has a command :set fileformat=unix if you want the Unix / Linux convention or :set fileformat=dos if you want the DOS / Windows convention. Then save the file and edit your .gitattributes file to declare that the file must have that convention on checkout.
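If you want to check which scripts in your working tree currently carry Windows line endings before a Docker build, a small helper along these lines may be useful. It is only a sketch: the function names and the default file patterns (*.sh and *.py) are assumptions for illustration, not something this repo ships.

```python
from pathlib import Path


def has_crlf(data: bytes) -> bool:
    """Return True if the byte string contains a Windows (CRLF) line ending."""
    return b"\r\n" in data


def find_crlf_files(root, patterns=("*.sh", "*.py")):
    """List files under root that match the patterns and contain CRLF endings.

    The default patterns are an assumption; adjust them to the files that
    get copied into your image.
    """
    hits = []
    for pattern in patterns:
        for path in sorted(Path(root).rglob(pattern)):
            # Read as bytes so Python does not translate line endings for us.
            if path.is_file() and has_crlf(path.read_bytes()):
                hits.append(str(path))
    return hits
```

Calling find_crlf_files("./bin") would return the shell scripts that still need converting; an empty list means none were found.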
Restarting Docker for Windows sometimes necessary
Sometimes, Docker for Windows loses contact with some critical resource and throws ugly messages like this:
ERROR: for transportationsystembackend_db_1 Cannot start service db: driver failed programming external connectivity on endpoint transportationsystembackend_db_1 (f944aeb0244747359af77373b4949561c6e6e1d8ee48fb0bfc8aba98aa32877e): Error starting userland proxy: mkdir /port/tcp:0.0.0.0:5439:tcp:172.18.0.2:5432: input/output error
ERROR: for db Cannot start service db: driver failed programming external connectivity on endpoint transportationsystembackend_db_1 (f944aeb0244747359af77373b4949561c6e6e1d8ee48fb0bfc8aba98aa32877e): Error starting userland proxy: mkdir /port/tcp:0.0.0.0:5439:tcp:172.18.0.2:5432: input/output error
Encountered errors while bringing up the project.
If this happens, you will need to restart Docker. Open the Settings dialog and go to Reset. Select the Restart option (the top one). Wait till the green Docker is running light shows up and then go back to your terminal. Everything should then work. This is a known Docker for Windows bug, not something you did wrong.
In order to run this you will want to:
Clone this Repository
The environment variables that Docker uses and inserts into the images it builds are taken from a file in the root of this repository called .env. Because it contains sensitive information like passwords, it is not checked into version control, so you have to create it as follows:
cp env.sample .env
Edit .env and change at least DJANGO_SECRET_KEY. You should not need to change any of the others during test and development.
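One way to produce a fresh value for DJANGO_SECRET_KEY is sketched below using only the Python standard library. The function name and the 50-character length are assumptions, and the alphabet is deliberately limited to letters and digits so the value needs no quoting or escaping in a .env file.

```python
import secrets
import string


def generate_secret_key(length=50):
    """Return a random alphanumeric string suitable for pasting into .env.

    Uses the secrets module, which draws from the OS's cryptographically
    secure random source.
    """
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

Printing "DJANGO_SECRET_KEY=" followed by generate_secret_key() gives a line you can paste over the sample value.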
Download the database .backup files from Google Drive and place them in ./Backups before doing the Docker build. The build will copy them onto the image and the first "run" in a container will restore them. See Automatic database restores for the details on the restore mechanism.
The .env file and the .backup files have been added to the .gitignore file. Provided you do not rename them or change their locations, they will not be committed to the repo and this project will build and run.
Confirm you have executable perms on all the scripts in the ./bin directory:
$ chmod +x ./bin/*.sh
Feel free to read each one and assign perms individually, because it is your computer 😜 and security is a real thing.
Run the build.sh script to build the project. Since you are going to be running it on the local machine, you will want to run:
./bin/build.sh -l
This command runs a docker-compose build in the background and downloads the images needed for the project to your local machine.
Once this completes, start up the project. We will use the start.sh script for this, again using the -l flag to run locally:
./bin/start.sh -l
The first time you run this you will see the database restores. You can ignore the error messages. You will also see the api container start up.
Once the first startup completes, stop the containers using Cmd+C or Ctrl+C, depending on your OS.
Restart the container using the same start command:
./bin/start.sh -l
Both the db and the api will start up.
Open your browser and you will be able to access the Django Rest Framework browsable front end at http://localhost:8000/api, the Swagger API schema at http://localhost:8000/schema, and the Django
To Run Tests: run the ./bin/build.sh -l followed by the
Note that the api container will write some files into your Git repository. They're in .gitignore, so they won't be checked into version control.
Run in Staging Environment
While developing the API, the built-in dev server is useful because it allows live reloading and debug messages. In a production environment, however, it is a security risk and inefficient. A staging/production environment has therefore been created using the following technologies:
- Gunicorn - A "green" HTTP server
- Gevent - Asynchronous workers
- Psycogreen - "Green" (coroutine-friendly) support for the psycopg2 database connector
- django_db_geventpool - DB pool using gevent for PostgreSQL DB.
- WhiteNoise - allows for hosting of static files by gunicorn in a prod environment vs. integrating a webserver
- Copy the /bin/env.staging.sample file to create a .env.staging file in the same directory:
$ cp ./bin/env.staging.sample ./bin/.env.staging
Open ./bin/.env.staging in your text editor and complete the environment variables.
Download and save the sql file if you have not already.
Run the build.sh script to build the project for the staging environment:
$ ./bin/build.sh -s
Start the project using the staging flag:
$ ./bin/start.sh -s
Try going to a nonexistent page and you should see a generic 404 Not found page instead of the Django debug screen.
What was configured:
This section lists what was changed from the default Django setup for the staging environment. These changes have already been made and are included here for informational purposes.
- Add gunicorn, gevent, and whitenoise to
- Set the debug variable to false in the
- Make any other necessary changes to config vars, e.g. database settings.
- Create a staging/production entrypoint file that runs the gunicorn start command instead of ./manage.py runserver.
- Create the gunicorn_config.py file to hold the gunicorn config, including the gevent worker_class.
- Create a staging/production docker-compose file that uses the correct .env file, entrypoint, and any other needed changes.
- Make changes to settings.py:
# Change the DEBUG line (environment variables are always read as strings):
DEBUG = os.environ.get('DEBUG') == "True"
# Add to MIDDLEWARE right after the security middleware:
'whitenoise.middleware.WhiteNoiseMiddleware',
# Add these just before STATIC_URL so static files are handled correctly and are compressed:
STATICFILES_STORAGE = 'whitenoise.storage.CompressedManifestStaticFilesStorage'
STATIC_ROOT = os.path.join(BASE_DIR, 'staticfiles')
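The reason DEBUG is compared against the string "True" is that environment variables are always strings, and any non-empty string (including "False") is truthy in Python. The sketch below illustrates the idea with a hypothetical helper, env_flag, which is not part of this repo. Note that it is deliberately more lenient than the settings.py line above, which accepts only the exact string "True".

```python
import os


def env_flag(name, default=False, environ=os.environ):
    """Interpret an environment variable as a boolean.

    Hypothetical helper for illustration. Accepts a few common spellings,
    case-insensitively; anything else counts as False.
    """
    value = environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


# Why a plain bool() is not enough: a non-empty string is always truthy.
naive = bool("False")
parsed = env_flag("DEBUG", environ={"DEBUG": "False"})
```

Here naive is True even though the variable says "False", while parsed is False, which is the behaviour you want for a staging or production DEBUG setting.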
To develop on the repo:
Create an issue for tracking and communication.
Clone the repo and then create a feature branch.
After branching, confirm you can follow the getting started steps above.
Develop your feature.
Update documentation and the env sample file as necessary.
Merge the current staging branch into your feature branch to resolve any merge conflicts.
Push your local feature branch to the Hack Oregon repo. Any pull requests from forks will be rejected.
Create a pull request to the staging branch. No PRs will be accepted to master unless they come from staging and are made by approved reviewers.
The PR should be reviewed by an authorized reviewer, and by another team member if possible, and must pass any automated testing requirements.
Resolve any outstanding merge conflicts.
An authorized reviewer will commit to staging.
The process for staging to master will be defined.
The primary function of this API is to act as a read-only wrapper around ODOT's Crash data and expose the underlying data to the web via HTTP requests. The secondary function is to eventually expose helper functions that could simplify data pre-processing. This API aims to be RESTful.
Note on Unmanaged models
The models in this project are unmanaged. Given that a) the API sits upon a legacy database and b) the API is intended to be read-only, the decision was made to decouple Django from database management and leave that solely to the underlying PostgreSQL shell environment. This prevents creation and deletion of the underlying data tables, primarily during development. Malicious editing (outside of the dev environment) is less of a concern since that can be handled by secure permissions for users making API calls.
Note on Permissions
All users can browse the API. Read-only access is the default permission for unauthenticated users.
Note on Testing
Testing an unmanaged model requires a few modifications to the test runner. Because migrations do not create tables for unmanaged models, the test database is created blank, which results in no test data being found. The fix is outlined in the following post - https://dev.to/patrnk/testing-against-unmanaged-models-in-django
Running a test requires that you have 'django-test-without-migrations' as part of your requirements. The only other point to remember is that tests need to be run with the ./manage.py test --no-migrations flag to prevent Django from trying to run migrations on your test db.
- API endpoints can be viewed in a browser.
- List of endpoints (assuming the local machine as host, with port 8000 exposed):
Three types of filters are currently supported -
1. Search Filters
Simple text search can be performed on the following fields:
Passenger Census Table
To look for all fields listed above that match (not exact) the string "DIS-RAG" -
2. Field Filters
The API also supports explicit filter fields as part of URL query strings. The following fields are currently supported -
If filtering just "00173" and "00174" for the field 'ser_no' -
- Filters need to be an exact match.
- URLs need to be encoded. All reserved characters should be escaped. For example, for a query on the field public_location_description with the value SW 6th & Salmon, the query should look like this - http://localhost:8000/api/passenger-census/?public_location_description=SW%206th%20%26%20Salmon. Here both spaces (%20) and ampersands (%26) have been escaped.
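Rather than escaping values by hand, you can let a client library do the encoding. The following sketch uses Python's standard urllib.parse to rebuild the example URL above; the helper name build_filter_url is hypothetical, while the endpoint and field name come from the example.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000/api/passenger-census/"


def build_filter_url(base_url, **filters):
    """Append exact-match field filters to base_url as an encoded query string.

    quote_via=quote encodes spaces as %20 (the default, quote_plus, would
    emit '+' instead) and escapes reserved characters such as '&'.
    """
    query = urlencode(filters, quote_via=quote)
    return f"{base_url}?{query}"


url = build_filter_url(BASE_URL, public_location_description="SW 6th & Salmon")
```

The resulting url is identical to the hand-encoded example, with the space and ampersand escaped as %20 and %26.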
3. Ordering Filters
Results can be sorted against any field or combinations of fields.
To show results in ascending order of the field 'ser_no':
In descending order:
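As a rough illustration of how such ordering requests could be built from a client, here is a sketch in Python. It assumes the API uses Django REST Framework's default query parameter name, ordering, with a leading minus sign for descending order; this README does not confirm the parameter name, so treat it as an assumption. The helper name ordering_url is hypothetical as well.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000/api/passenger-census/"


def ordering_url(base_url, *fields):
    """Build a URL that sorts by the given fields.

    Prefix a field with '-' for descending order. Assumes the query
    parameter is named 'ordering' (Django REST Framework's default).
    """
    # Keep commas unescaped so multiple fields read as "a,-b".
    query = urlencode({"ordering": ",".join(fields)}, safe=",")
    return f"{base_url}?{query}"


ascending = ordering_url(BASE_URL, "ser_no")
descending = ordering_url(BASE_URL, "-ser_no")
```

Passing several field names, for example ordering_url(BASE_URL, "ser_no", "-id") with a hypothetical id field, would sort by a combination of fields.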
The API supports Accept Header Versioning. Version numbers in API requests are optional; if no version is specified in the request header, the latest version is returned by default. Specify versions as numbers, as shown in the header example below -
GET /api/passenger-census HTTP/1.1
Host: example.com:8000
Accept: application/json; version=1.0
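From a Python client, the same header can be attached with the standard library. This is a minimal sketch: the helper name versioned_request is hypothetical, and the localhost URL assumes the local setup described earlier. The request object is only constructed here; nothing is sent until it is passed to urllib.request.urlopen.

```python
from urllib.request import Request


def versioned_request(url, version="1.0"):
    """Build a GET request that asks for a specific API version.

    The version travels in the Accept header as a media type parameter,
    matching the raw HTTP example above.
    """
    headers = {"Accept": f"application/json; version={version}"}
    return Request(url, headers=headers, method="GET")


req = versioned_request("http://localhost:8000/api/passenger-census/")
```

Omitting the version parameter from the Accept header entirely should return the latest version, as described above.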
Latest version: 1.0 (as of 02/24/2018)
We follow the MIT License: https://github.com/hackoregon/transportation-system-backend/blob/staging/LICENSE