Jobs API Server
NOTE: USAJobs now provides a direct API for retrieving job postings with the U.S. federal government. Please leverage that API for any new projects: https://developer.usajobs.gov/
The API in this repo is primarily for use by Search.gov.
The server code that runs the Search.gov Jobs API is here on GitHub. If you're a Ruby developer, keep reading. Fork this repo to add features (such as additional datasets) or fix bugs.
The documentation on request parameters and response format is on the API developer page. This README just covers software development of the API service itself.
Ruby
This code is currently tested against Ruby 2.3.
Gems
We use bundler to manage gems. You can install bundler and other required gems like this:
gem install bundler
bundle install
Elasticsearch
We're using Elasticsearch (>= 5.6) for full-text search. On a Mac, it's easy to install with Homebrew.
$ brew install elasticsearch@5.6
Otherwise, follow the instructions to download and run it.
Elasticsearch with Docker
Install Docker if you haven't done so yet, following the official installation instructions. Once you have Docker installed on your machine, run the following command in your terminal:
$ docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:5.6
This will download a Docker image containing Elasticsearch 5.6.5 from Docker Hub, run it, and expose ports 9200 and 9300 to your machine. You can verify your setup with the following command:
$ curl localhost:9200
{
"name" : "u2bQgL2",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "qZ-Xas_PR_2ARtHpY724Ug",
"version" : {
"number" : "5.6.5",
"build_hash" : "6a37571",
"build_date" : "2017-12-04T07:50:10.466Z",
"build_snapshot" : false,
"lucene_version" : "6.6.1"
},
"tagline" : "You Know, for Search"
}
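If you would rather check the version from Ruby than read the curl output by eye, a minimal sketch follows. The helper name `elasticsearch_version_ok?` is made up for this example and is not part of this repo; it only assumes the response body has the shape shown above.

```ruby
require "json"

# Minimum Elasticsearch version this README asks for.
MINIMUM_ES_VERSION = Gem::Version.new("5.6")

# Hypothetical helper (not part of this repo): returns true when the JSON
# body served at the Elasticsearch root URL reports a version number at or
# above MINIMUM_ES_VERSION, and false when the number is missing or older.
def elasticsearch_version_ok?(body)
  number = JSON.parse(body).dig("version", "number")
  return false if number.nil?

  Gem::Version.new(number) >= MINIMUM_ES_VERSION
end

# Example use against a local server (needs Elasticsearch running):
#   require "net/http"
#   body = Net::HTTP.get(URI("http://localhost:9200"))
#   puts elasticsearch_version_ok?(body) ? "version ok" : "version too old"
```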
Geonames
We use the United States location data from Geonames.org to help geocode the locations of each job position. By assigning latitude and longitude coordinates to each position location, we can sort job results based on proximity to the searcher's location, provided that information is sent in with the request.
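To make the idea of proximity sorting concrete, here is a small standalone sketch using the haversine great-circle formula. It is an illustration only: it is not taken from this codebase, which may well delegate distance sorting to Elasticsearch, and the `:lat` / `:lon` keys are assumed names for this example.

```ruby
# Mean Earth radius in kilometers.
EARTH_RADIUS_KM = 6371.0

# Great-circle distance in kilometers between two points given in degrees.
def haversine_km(lat1, lon1, lat2, lon2)
  to_rad = Math::PI / 180.0
  dlat = (lat2 - lat1) * to_rad
  dlon = (lon2 - lon1) * to_rad
  a = Math.sin(dlat / 2)**2 +
      Math.cos(lat1 * to_rad) * Math.cos(lat2 * to_rad) * Math.sin(dlon / 2)**2
  # Clamp to 1.0 so rounding error cannot push asin outside its domain.
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt([a, 1.0].min))
end

# Sorts jobs (hashes with hypothetical :lat and :lon keys) nearest first,
# relative to the searcher's coordinates.
def sort_by_proximity(jobs, lat, lon)
  jobs.sort_by { |job| haversine_km(lat, lon, job[:lat], job[:lon]) }
end
```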
The 'US.txt' file from the Geonames archive contains geocoding information for many entities that we aren't interested in for the purpose of government jobs (e.g., canals, churches). To keep the index small, we pick out just what we need with this AWK script:
awk -F $'\\t' '$8 ~ /PPL|ADM[0-9]?|PRK|BLDG|AIR|INSM/' US.txt > doc/filtered_US.txt
This includes populated places, administrative areas, parks, buildings, airports, and military bases.
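If AWK isn't available, roughly the same filter can be sketched in Ruby. This assumes the standard tab-separated Geonames layout, where the eighth column is the feature code; the names `keep_geoname_row?` and `FEATURE_CODE_PATTERN` are made up for this example. Because the AWK pattern is unanchored, the optional digit after ADM doesn't change what matches, so it is left out here.

```ruby
# Feature codes kept by the filter. Matching is by substring, so "PPL" also
# keeps "PPLA" or "PPLC", and "ADM" also keeps "ADM1", "ADM2", and so on.
FEATURE_CODE_PATTERN = /PPL|ADM|PRK|BLDG|AIR|INSM/

# Assumption: column 8 (index 7) of a Geonames row holds the feature code.
FEATURE_CODE_INDEX = 7

# Returns true when a tab-separated Geonames row should be kept.
def keep_geoname_row?(line)
  feature_code = line.chomp.split("\t", -1)[FEATURE_CODE_INDEX]
  !feature_code.nil? && !(feature_code =~ FEATURE_CODE_PATTERN).nil?
end

# Example use (paths as in the AWK command above):
#   File.open("doc/filtered_US.txt", "w") do |out|
#     File.foreach("US.txt") { |line| out.write(line) if keep_geoname_row?(line) }
#   end
```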
You can download, unzip, and filter a more recent version of the file if you like, or you can import the one in this repo to get started:
bundle exec rake geonames:import[doc/filtered_US.txt]
If you are running Elasticsearch with the default 1g JVM heap, this import process will be pretty slow. You may want to consider allocating more memory to Elasticsearch.
Seed jobs data
You can use the sample.xml file just to load a few jobs and see the system working.
bundle exec rake jobs:import_usajobs_xml[doc/sample.xml]
The importer adds to or updates any existing entries, so you can run it multiple times if you have multiple XML files. You can also start over with an index if you want to erase what's there or load a different dataset:
bundle exec rake jobs:recreate_index
bundle exec rake geonames:recreate_index
Production data
Federal agencies can request XML files from USAJobs as described in the SIF Guide at https://schemas.usajobs.gov/.
Running it
Fire up a server and try it all out.
bundle exec rails s
http://127.0.0.1:3000/search.json?query=nursing+jobs&organization_id=VATA&hl=1
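If you are calling the local server from a script, the query string above can be built with Ruby's standard library rather than by hand. This is a minimal sketch; `search_url` is a hypothetical helper, and the parameter names are simply the ones in the example URL.

```ruby
require "uri"

# Hypothetical helper: builds a search URL for a locally running server.
# URI.encode_www_form form-encodes the values, so spaces become "+".
def search_url(params, base = "http://127.0.0.1:3000")
  "#{base}/search.json?#{URI.encode_www_form(params)}"
end

puts search_url({ query: "nursing jobs", organization_id: "VATA", hl: 1 })
```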
Parameters and Results
Full documentation on the parameters and result format is in our Jobs API documentation.
Expiration
When a job opening's end application date has passed, it is automatically purged from the index and won't show up in search results.
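The rule itself is simple to state in code. The sketch below only illustrates the date comparison; it is not the purge logic from this repo, and the `:end_date` key is an assumed field name.

```ruby
require "date"

# Illustration only: keeps openings whose end application date has not yet
# passed. An opening that closes today is still considered open.
# Each opening is a hash with a hypothetical :end_date key holding a Date.
def unexpired(openings, today = Date.today)
  openings.reject { |opening| opening[:end_date] < today }
end
```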
API Versioning
We support API versioning with JSON format. The current/default version is v3. You can request a specific JSON API version like this:
curl -H 'Accept: application/vnd.usagov.position_openings.v3' 'http://localhost:3000/search.json?query=jobs'
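The same versioned request can be made from Ruby with Net::HTTP. This is a minimal sketch; `versioned_search_request` is a hypothetical helper, and the media type is the one from the curl example with the version number substituted in.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: builds a GET request whose Accept header asks for a
# specific JSON API version. Nothing is sent until the request is passed to
# an open Net::HTTP connection.
def versioned_search_request(url, version)
  request = Net::HTTP::Get.new(URI(url))
  request["Accept"] = "application/vnd.usagov.position_openings.v#{version}"
  request
end

# Example use (needs the Rails server running locally):
#   uri = URI("http://localhost:3000/search.json?query=jobs")
#   response = Net::HTTP.start(uri.hostname, uri.port) do |http|
#     http.request(versioned_search_request(uri.to_s, 3))
#   end
#   puts response.body
```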
Tests
These require an Elasticsearch server to be running.
bundle exec rake spec
Code Coverage
We track test coverage of the codebase over time, to help identify areas where we could write better tests and to see when poorly tested code got introduced.
After running your tests, view the report by opening coverage/index.html.
Click around on the files that have less than 100% coverage to see what lines weren't exercised by the tests.
Feedback
You can send feedback via GitHub Issues.