
Guide to creating an account for and using the test harvester

Kishore K. Vuppala edited this page Feb 3, 2016 · 10 revisions

Create an Account

There are three stages for setting up access to the test harvester: account creation, account approval, and account promotion.

First, create an account in the CKAN staging environment.

After completing this, you will receive an initial confirmation of your account creation.

We’re close at this stage but still need you to do one more thing.

Go to http://uat-catalog-fe-data.reisys.com, log in, then send Phil Ashlock an email telling him that you’ve logged in, and he will promote you to an admin. Unfortunately, this last step can’t take place until you log in for the first time.

You’ll now be able to create, edit, and maintain data.json harvests in an environment identical to data.gov.

Create a Harvest

To set up a harvest, you’ll first need to host the data.json file that you intend to test. Then, take its URL and go to the harvest section of this staging environment. Click ‘Add Harvest Source’ and then:

  • Add the URL of the hosted JSON file to the first field, labeled URL
  • Add a name for this harvest test
  • Ensure that the data.json style of harvest is selected
  • Choose the frequency of the harvest
  • Ensure that the 'Validation Schema' field is set to 'Project Open Data (Federal)'
  • Ensure that your organization (usually similar to ‘agcy-gov’) is selected in the organization field

After you’ve saved the harvest setup, you will need to manually run the initial harvest.

Run a Harvest

You can generate a fresh analysis of your hosted data.json file at any time by clicking the Reharvest button on the harvest admin page. The process usually takes a few minutes but may take 10-15 minutes for larger catalogs. To do this:

  • Click the 'Admin' button in the top right of the harvest's page
  • Select the Reharvest button

View the Results of a Test Harvest

  • Go to the harvest section of this staging environment and find the harvest you've set up.
  • After clicking on it, you should see an 'Admin' button in the top right.
  • Select the Jobs tab.
  • Click on the most recent job in order to see the results.

Notes

  • If the hosted file you are analyzing is not valid JSON or has serious schema errors, the harvest might be unable to complete. If this is the case, the data.json file may need correction before a harvest analysis can be successfully run.
  • When viewing the details of job reports, note that each entry begins by indicating the identifier and title of the dataset that had errors and how many errors it had. Each discrete error will then be indicated by "####### Error Message #######"
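
Since a file that is not valid JSON can keep a harvest from completing, it may save a harvest cycle to check the file locally before hosting it. The following is a minimal sketch, not part of the harvester itself. The function name `precheck_catalog` is hypothetical, and the code assumes that datasets sit in a top-level "dataset" array (a bare top-level list is also accepted). It only confirms that the text parses and that each dataset carries the identifier and title that job reports refer to; it does not replace the harvester's own schema validation.

```python
import json


def precheck_catalog(text):
    """Return a list of problems found in a data.json document (empty if none)."""
    try:
        catalog = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg} (line {exc.lineno}, column {exc.colno})"]

    # Assumption: datasets live in a top-level "dataset" array;
    # a bare top-level list of datasets is accepted as well.
    datasets = catalog.get("dataset") if isinstance(catalog, dict) else catalog
    if not isinstance(datasets, list):
        return ["no list of datasets found"]

    problems = []
    for index, entry in enumerate(datasets):
        if not isinstance(entry, dict):
            problems.append(f"dataset {index}: not a JSON object")
            continue
        # Job reports identify datasets by identifier and title,
        # so flag entries where either is missing or empty.
        for field in ("identifier", "title"):
            if not entry.get(field):
                problems.append(f"dataset {index}: missing '{field}'")
    return problems


if __name__ == "__main__":
    # Hypothetical local copy of the file you intend to host.
    with open("data.json", encoding="utf-8") as handle:
        for problem in precheck_catalog(handle.read()):
            print(problem)
```

An empty result only means the file passed these basic checks; the harvest may still report schema errors.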