Populating the Postgres database from the Human Services Data Specification (HSDS) compliant CSV files

Pat Kujawa edited this page Apr 18, 2017 · 8 revisions

Ohana can load CSV files that adhere to the Human Services Data Specification. Not all sets of files that are valid HSDS can be loaded into Ohana, so please check the list below for details of what's required for Ohana.

The list below details all the CSV files you will need to prepare, along with a description of the columns in each file.

If you are producing data from scratch, you can either use the Admin interface that's built into the Ohana API app, or our Google Spreadsheets template that you can copy to your own account, and then fill in your data before exporting to load into Ohana.

If you use the template, ensure that when you export each tab as a CSV file you rename each file so that it only contains the pluralized table name, all in lowercase, e.g. organizations.csv. If you're comfortable with the command line, you can use our script to automatically generate the CSV files from your spreadsheet.

For each table, make sure each row has a unique id, and that the id is an integer.

Once your CSV files have been exported, it's a good idea to run them through CSVLint to make sure they are proper CSV files.

Once your CSV files are ready, follow the installation instructions to import them into the database.

If you already have data, and are using a script to convert them to OpenReferral CSV files, you will need to make a few changes:

  • Make sure the CSV filenames match the ones listed here. Most should be pluralized.

  • For performance and ease-of-use reasons, attributes in Ohana API that are just a collection of Strings are Postgres array fields within the table they belong to, as opposed to separate tables. These fields are: accessibility, accreditations, licenses, funding_sources, service_areas, eligibility, required_documents, fees, accepted_payments, interpretation_services, and languages. Refer to the tables below to see where each field belongs and how it should be formatted.

  • If your Locations are associated to Services via an n:m relationship, you will have to convert that to a 1:n relationship, which is what Ohana API uses for performance and simplicity reasons.

Also, note that out of the box, Ohana API does not support the HSDS metadata CSV file for keeping a history of the changes to the data. If you require this functionality, you can use the PaperTrail gem to implement it.

organizations.csv (required)

Column Requirement Details
id required The organization's unique id, starting at 1.
accreditations optional A list of accreditations an organization has received. Multiple entries must be comma-separated.
alternate_name optional Another name this organization might be known by.
date_incorporated optional The date this organization was incorporated. By default, the format for slash-separated dates is assumed to be in month/day/year format, such as 1/24/14 and 1/24/2014. Spelled out dates are also accepted, such as December 5, 2014. If you wish to use the day/month/year format, you must first update the date_format option in settings.yml to %d/%m/.
description required A description of what the organization does.
email optional The organization's primary email.
funding_sources optional A list of sources of funds for an organization. Multiple entries must be comma-separated.
legal_status optional The conditions an organization is operating under; e.g. non-profit, private corporation or a government organization.
licenses optional A list of licenses an organization has obtained to operate legally. Multiple entries must be comma-separated.
name required Name of the organization
tax_id optional Tax identifier, such as Federal Employer Identification Number.
tax_status optional Internal Revenue Service tax designation, such as 501(c)(3) for tax-exempt organizations.
website optional The organization's website. The value must be a fully qualified URL that includes http:// or https://, and any special characters in the URL must be correctly escaped.

locations.csv (required)

Column Requirement Details
id required The location's unique id, starting at 1.
organization_id required The id of the organization that the location belongs to.
accessibility optional Accessibility options available at the location. Multiple entries must be comma-separated. See the String column in the Accessibility section for accepted values.
admin_emails optional Email addresses for the people allowed to administer the location (via the Ohana API Admin interface for example). Multiple entries must be comma-separated.
alternate_name optional Another name this location might be known by.
description required Description of services provided at the location
email optional General Email address for the location. Emails that belong to contacts should go in the contacts.csv table.
languages optional Languages spoken at the location. Full language names based on the ISO 639-1 standard, such as English, French, Arabic. Multiple entries must be comma-separated.
latitude optional The location's latitude. Must be a valid WGS 84 latitude. Note that the app automatically geocodes addresses if your data doesn't include latitude and longitude.
longitude optional The location's longitude. Must be a valid WGS 84 longitude. Note that the app automatically geocodes addresses if your data doesn't include latitude and longitude.
name required Name of the location
short_desc optional Succinct description of services provided at the location.
transportation optional Public transportation options near the location
website optional The location's website. The value must be a fully qualified URL that includes http:// or https://, and any special characters in the URL must be correctly escaped.
virtual required if the location does not have a physical address Whether or not the location has a physical address. If false, it must have an address associated with it. The default value is false. Accepted values are: TRUE, FALSE, true, false, 1, or 0.

Accessibility

One or more of the following strings:

String Description
"cd" Information on CD
"deaf_interpreter" Interpreter for the deaf
"disabled_parking" Disabled Parking
"elevator" Elevator
"ramp" Ramp
"restroom" Disabled Restroom
"tape_braille" Information on tape or in Braille
"tty" TTY option available
"wheelchair" Wheelchairs available
"wheelchair_van" Wheelchair-accessible van available

addresses.csv (required unless all your locations are virtual)

Column Requirement Detail
id required The unique id for the address, starting at 1.
location_id required if the location is not virtual The id of the location that the address belongs to.
address_1 required The primary part of the Street Address
address_2 optional The secondary part of the Street Address, such as the Suite, Room, or Floor number
city required The City name
state_province required for US and Canada The 2-letter capitalized US state or Canada province abbreviation, such as CA for California, or ON for Ontario.
postal_code required A valid postal code. Note that the API currently assumes this will be a US 5-digit ZIP code. If you are using non-US data, you will need to replace the postal code validation with your own.
country required 2-letter ISO 3361-1 country code, such as US.

contacts.csv (optional)

Column Requirement Detail
id required The contact's unique id, starting at 1.
location_id required if the contact belongs to a location The id of the location that the contact belongs to.
organization_id required if the contact belongs to an organization The id of the organization that the contact belongs to.
service_id required if the contact belongs to a service The id of the service that the contact belongs to.
name required The Contact's full name
title optional The Contact's title
email optional The Contact's email address
department optional The department where the Contact works.

mail_addresses.csv (optional)

Column Requirement Detail
id required The unique id for the mailing address, starting at 1.
location_id required if the mail_address belongs to a location The id of the location that the mail address belongs to.
attention optional Name of person or organization receiving mail
address_1 required The primary part of the Street Address
address_2 optional The secondary part of the Street Address, such as the Suite, Room, or Floor number
city required The City name
state_province required for US or Canada The 2-letter capitalized US state or Canada province abbreviation, such as CA for California, or ON for Ontario.
postal_code required A valid postal code. Note that the API currently assumes this will be a US 5 or 9-digit ZIP code. 9-digit ZIP codes are separated with a dash, like this: 94025-9881. If you are using non-US data, you will need to replace the postal code validation with your own.
country required 2-letter ISO 3361-1 country code

phones.csv (required)

Column Requirement Detail
id required The phone's unique id, starting at 1.
location_id required if the phone is meant to be added at the location level The ID of the location that the phone belongs to.
contact_id required if the phone is meant to be added at the contact level The ID of the contact that the phone belongs to.
organization_id required if the phone is meant to be added at the organization level The ID of the organization that the phone belongs to.
service_id required for CSV files if the phone is meant to be added at the service level The ID of the service that the phone belongs to.
number required The 10-digit US phone number. See examples of accepted formats below.
vanity_number optional The 10-digit US phone number with vanity letters. Example format: 703-555-HELP
extension optional The phone number extension. Must only contain numbers. For example: 1234
department optional The department this phone number reaches.
number_type required The type that best describes the number. Must be one of these five values: fax, voice, hotline, sms, tty.
country_prefix optional The country prefix code. Must only contain numbers, such as 1 for the USA.

Accepted formats for US phone numbers:

213-456-7890

213 456-7890

213.456.7890

2134567890

(213) 456-7890

(213)456-7890

services.csv (required)

Column Requirement Detail
id required The service's unique id, starting at 1.
location_id required The ID of the location that the service belongs to.
program_id optional The ID of the program that the service belongs to.
accepted_payments optional Methods of payment for this service. Multiple entries must be comma-separated, such as "Cash, Credit Card, Medicare".
alternate_name optional Another name this Service might be known by.
application_process optional Description of the service's application process
description required Description of the service provided
eligibility optional Who is this service intended for?
email optional The service's main email address.
fees optional Fees charged to receive the service
funding_sources optional Source of funds used to support the service. Multiple entries must be comma-separated.
interpretation_services optional Description of the types of interpretation services available, such as a 3-way phone call using a third party company that provides support for many languages.
keywords optional Keywords that people might use to search for this service, but that you might not want to include in the service description (such as common misspellings). Multiple entries must be comma-separated.
languages optional Languages in which this service is provided. Full language names based on the ISO 639-1 standard. Multiple entries must be comma-separated.
name required Name of the service
required_documents optional The documents that are required to receive this service. Multiple entries must be comma-separated.
service_areas optional Cities and Counties served. Multiple entries must be comma-separated. See Service Areas section for more details.
status required Must be one of active, defunct, or inactive.
wait_time optional Wait times associated with the service
website optional The service's website. The value must be a fully qualified URL that includes http:// or https://, and any special characters in the URL must be correctly escaped.
taxonomy_ids optional A comma-separated list of taxonomy_ids corresponding to the taxonomy you want to use. If you want to use the Open Eligibility taxonomy, refer to data/sample-csv/taxonomy.csv for a list of valid taxonomy IDs. If you are using your own taxonomy, make sure to include the CSV file that defines your taxonomy.

Service Areas

If your data includes service areas, and you want to validate your data against a list of valid service areas, add the accepted entries in settings.yml before you populate the DB.

programs.csv (optional)

Column Requirement Detail
id required The program's unique id, starting at 1.
organization_id required for CSV files only The ID of the organization that the program belongs to.
name required The name of the program.
alternate_name optional Another name the program might be known by.

regular_schedules.csv (recommended)

Column Requirement Detail
id required The regular schedule's unique id, starting at 1.
location_id required if the schedule is for a location The ID of the location that the regular schedule belongs to.
service_id required if the schedule is for a service The ID of the service that the regular schedule belongs to.
weekday required Can be either the full English weekday name, such as Monday, or the 3-letter abbreviated name, such as Fri, or a String or Integer from 1 to 7, where 1 is Monday.
opens_at required Accepted formats are 24-hour format, such as 09:30, 8:00, 17:00, or AM/PM format, such as 8am, 5pm, or 5:00 PM.
closes_at required Accepted formats are 24-hour format, such as 09:30, 8:00, 17:00, or AM/PM format, such as 8am, 5pm, or 5:00 PM.

holiday_schedules.csv (recommended)

Column Requirement Detail
id required The holiday schedule's unique id, starting at 1.
location_id required if the schedule is for a location The ID of the location that the regular schedule belongs to.
service_id required if the schedule is for a service The ID of the service that the regular schedule belongs to.
start_date required The date when the holiday schedule or modified operating hours start taking effect. The year must be included. Accepted formats are 12/24/2014, 12/24/14, and December 24, 2014. If you wish to use the day/month/year format, you must first update the date_format option in settings.yml to %d/%m/.
end_date required The date when the holiday schedule or modified operating hours stop taking effect. The year must be included. Accepted formats are 12/24/2014, 12/24/14, and December 24, 2014. If you wish to use the day/month/year format, you must first update the date_format option in settings.yml to %d/%m/.
closed required It specifies whether the location or service is closed (true) or open (false) during the specified dates. Accepted values are one of: TRUE, true, FALSE, false, 1, or 0.
opens_at required if closed is false Accepted formats are 24-hour format, such as 09:30, 8:00, 17:00, or AM/PM format, such as 8am, 5pm, or 5:00 PM.
closes_at required if closed is false Accepted formats are 24-hour format, such as 09:30, 8:00, 17:00, or AM/PM format, such as 8am, 5pm, or 5:00 PM.

taxonomy.csv (optional)

If you want to use your own custom taxonomy instead of Open Eligibility (which Ohana API supports out of the box), then define your taxonomy as defined below. For an example of a valid CSV file that recreates the Open Eligibility taxonomy, view data/sample-csv/taxonomy.csv. For more details about how taxonomy works, read our Wiki article on taxonomy basics.

Column Requirement Detail
taxonomy_id required The category's unique taxonomy id.
name required The name of the category.
parent_id required for child categories The taxonomy_id of the parent category.
parent_name required for child categories The name of the parent category.
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.
Press h to open a hovercard with more details.