data catalog API 0.7

mlaboszc edited this page May 24, 2016 · 1 revision

Overview

Enables search, retrieval and storage of metadata describing data sets.

Version information

Version: 0.4.29

URI scheme

BasePath: /

Tags

  • default: Default namespace
  • rest/datasets: Data Catalog - enables search, retrieval and storage of metadata describing data sets.

Consumes

  • application/json

Produces

  • application/json

Paths

Do a search for data sets

GET /rest/datasets

Description

The query should have the following format:

{ "query": SEARCH_TEXT, "filters": [ {FILTERED_FIELD_NAME: [FIELD_VALUE_1, FIELD_VALUE_2]} ], "from": FROM_HIT_NUMBER, "size": NUMBER_OF_HITS }

All query fields are optional. A time-range filter must supply exactly two values, one for each end of the range; -1 stands for infinity, that is, an open end.

"from" and "size" paginate the search results: "from" is the number of the first hit to return and "size" is the number of hits to return. If a query yields 20 hits, setting both "from" and "size" to 10 returns the second half of the hits.

Filter examples:

{"creationTime": [-1, "2015-02-24T14:56"]} <- all until 2015-02-24T14:56

{"format": ["csv", "json"]} <- all CSV and JSON data sets
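The query format, the time-range rule and the pagination fields can be sketched in Python as follows. This is only an illustration: `build_query` and `TIME_RANGE_FIELDS` are hypothetical names, and treating `creationTime` as the only time-range field is an assumption, not something the API states.

```python
import json

# Assumption: "creationTime" is the only time-range field; extend if needed.
TIME_RANGE_FIELDS = {"creationTime"}


def build_query(text=None, filters=None, start=None, size=None):
    """Serialize the optional query fields into a JSON string."""
    query = {}
    if text is not None:
        query["query"] = text
    if filters:
        for flt in filters:
            for name, values in flt.items():
                # Time-range filters need exactly two values (-1 means infinity).
                if name in TIME_RANGE_FIELDS and len(values) != 2:
                    raise ValueError(f"{name} filter needs exactly two values")
        query["filters"] = filters
    if start is not None:
        query["from"] = start
    if size is not None:
        query["size"] = size
    return json.dumps(query)


# Second half of 20 hits: "from" and "size" both set to 10.
example = build_query(
    text="sales",
    filters=[
        {"creationTime": [-1, "2015-02-24T14:56"]},  # all until 2015-02-24T14:56
        {"format": ["csv", "json"]},                 # all CSV and JSON data sets
    ],
    start=10,
    size=10,
)
```

Because every field is optional, the helper leaves out anything that was not supplied rather than sending empty values.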

The 'orgs' field should be a comma-separated list of org UUIDs, for example: orguuid-01,orguuid-02

The 'onlyPublic' and 'onlyPrivate' fields take a boolean value (true or false). Used alongside a query, they restrict the results to public data sets only or to private data sets only. They are mutually exclusive.
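A minimal sketch of assembling these query-string parameters in Python is shown below. `build_search_params` is a hypothetical helper; leaving out flags that are false, rather than sending them as "false", is an assumption about what the server accepts.

```python
from urllib.parse import parse_qs, urlencode


def build_search_params(query_json=None, orgs=None,
                        only_public=False, only_private=False):
    """Return the URL query string for GET /rest/datasets."""
    if only_public and only_private:
        # The two flags are mutually exclusive.
        raise ValueError("onlyPublic and onlyPrivate cannot both be set")
    params = {}
    if query_json is not None:
        params["query"] = query_json
    if orgs:
        params["orgs"] = ",".join(orgs)  # comma-separated org UUIDs
    if only_public:
        params["onlyPublic"] = "true"
    if only_private:
        params["onlyPrivate"] = "true"
    return urlencode(params)


query_string = build_search_params(
    orgs=["orguuid-01", "orguuid-02"], only_public=True
)
decoded = parse_qs(query_string)  # decode again to inspect the parameters
```

Rejecting the conflicting flags on the client side avoids a round trip for a request that the two flags' definitions rule out.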

The consumer of this endpoint must have a valid OAuth token. In addition, the user has to be a member of the organization that owns the data sets. This restriction does not apply to admins (console.admin in the token's scope), who always have access. Moreover, an admin who owns the data sets targeted by this request receives data from all orgs.
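As a rough sketch, an authenticated search request could be prepared in Python like this. The host name is hypothetical, and sending the OAuth token as a Bearer token in the Authorization header is an assumption; the page only says that a valid OAuth token is required.

```python
from urllib.request import Request

BASE_URL = "https://data-catalog.example.com"  # hypothetical host


def search_request(token, query_string=""):
    """Prepare (but do not send) an authenticated GET /rest/datasets request."""
    url = f"{BASE_URL}/rest/datasets"
    if query_string:
        url += "?" + query_string
    headers = {
        # Assumption: the token is passed using the Bearer scheme.
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }
    return Request(url, headers=headers)


request = search_request("example-token", "onlyPublic=true")
```

The prepared request can then be passed to `urllib.request.urlopen` to perform the call.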

Parameters

| Type | Name | Description | Required | Schema | Default |
|---|---|---|---|---|---|
| QueryParameter | onlyPrivate | Returns a list of the private data sets only. | false | boolean | |
| QueryParameter | query | A query JSON object. | false | string | |
| QueryParameter | onlyPublic | Returns a list of the public data sets only. | false | boolean | |
| QueryParameter | orgs | A list of org UUIDs. | false | null string array | |

Responses

| HTTP Code | Description | Schema |
|---|---|---|
| 200 | Query results returned. | SearchHits |
| 400 | Invalid or malformed query. | No Content |
| 500 | Internal error. | No Content |
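A client might translate these response codes as in the following Python sketch. `CatalogError` and `parse_search_response` are hypothetical names, and the 200 body is assumed to be a JSON document matching SearchHits.

```python
import json


class CatalogError(Exception):
    """Raised when the search endpoint does not return HTTP 200."""


# Messages taken from the documented response codes.
SEARCH_ERRORS = {
    400: "Invalid or malformed query.",
    500: "Internal error.",
}


def parse_search_response(status, body):
    """Return the decoded SearchHits for 200, raise CatalogError otherwise."""
    if status == 200:
        return json.loads(body)
    message = SEARCH_ERRORS.get(status, f"Unexpected HTTP status {status}")
    raise CatalogError(message)


hits = parse_search_response(
    200, b'{"total": 0, "hits": [], "formats": [], "categories": []}'
)
```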

Tags

  • rest/datasets

Get the number of current data sets in the index per organisation

GET /rest/datasets/count

Description

The consumer of this endpoint must have a valid OAuth token. In addition, the user has to be a member of the organization that owns the data sets. This restriction does not apply to admins (console.admin in the token's scope), who always have access. Moreover, an admin who owns the data sets targeted by this request receives data from all orgs.

Parameters

| Type | Name | Description | Required | Schema | Default |
|---|---|---|---|---|---|
| QueryParameter | onlyPrivate | Returns a list of the private data sets only. | false | boolean | |
| QueryParameter | orgs | A list of org UUIDs. | false | null string array | |
| QueryParameter | onlyPublic | Returns a list of the public data sets only. | false | boolean | |

Responses

| HTTP Code | Description | Schema |
|---|---|---|
| 200 | Data set count returned. | integer |
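The count endpoint can be sketched in Python as below. The helper names and the host are hypothetical, and the sketch assumes the 200 body is a bare JSON integer, which is what the `integer` schema suggests but does not guarantee.

```python
import json
from urllib.parse import urlencode


def count_url(base_url, orgs=None):
    """Build the URL for GET /rest/datasets/count."""
    url = base_url.rstrip("/") + "/rest/datasets/count"
    if orgs:
        # Org UUIDs are sent as one comma-separated value.
        url += "?" + urlencode({"orgs": ",".join(orgs)})
    return url


def parse_count(body):
    """Decode the response body, assumed to be a bare JSON integer."""
    value = json.loads(body)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("expected an integer data set count")
    return value


url = count_url("https://data-catalog.example.com/", ["orguuid-01", "orguuid-02"])
count = parse_count(b"42")
```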

Tags

  • rest/datasets

Puts a metadata entry in the search index under the given ID

PUT /rest/datasets/{entry_id}

Description

The consumer of this endpoint must have a valid OAuth token. In addition, the user has to be a member of the organization that owns the data sets. This restriction does not apply to admins (console.admin in the token's scope), who always have access.

Parameters

| Type | Name | Description | Required | Schema | Default |
|---|---|---|---|---|---|
| PathParameter | entry_id | ID of a metadata entry (data set). | true | string | |
| BodyParameter | body | JSON-formatted metadata entry. | true | InputMetadataEntry | |

Responses

| HTTP Code | Description | Schema |
|---|---|---|
| 200 | Entry updated. | No Content |
| 201 | Entry created. | No Content |
| 400 | Putting data set in index failed: malformed data in metadata fields. | No Content |
| 403 | Forbidden access to required organisation. | No Content |
| 503 | Putting data set in index failed: failed to connect to ElasticSearch. | No Content |
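A hedged Python sketch of preparing the PUT request and reading its outcome follows. The host, the Bearer scheme and the helper names are assumptions. The body is sent as application/json because the parameter table calls it a JSON-formatted metadata entry; whether the server also accepts other content types is not covered here.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://data-catalog.example.com"  # hypothetical host

# Success codes documented for this endpoint.
PUT_OUTCOMES = {200: "updated", 201: "created"}


def put_entry_request(token, entry_id, entry):
    """Prepare (but do not send) PUT /rest/datasets/{entry_id}."""
    url = f"{BASE_URL}/rest/datasets/{quote(entry_id, safe='')}"
    return Request(
        url,
        data=json.dumps(entry).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",  # assumed scheme
            "Content-Type": "application/json",
        },
    )


def put_outcome(status):
    """Return "updated" or "created"; raise for any other status code."""
    if status in PUT_OUTCOMES:
        return PUT_OUTCOMES[status]
    raise RuntimeError(f"PUT failed with HTTP {status}")


request = put_entry_request("example-token", "entry 1", {"title": "Example"})
```

Distinguishing 200 from 201 lets a caller tell whether an existing entry was overwritten or a new one was created.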

Consumes

  • application/x-www-form-urlencoded
  • multipart/form-data

Tags

  • rest/datasets

Deletes a metadata entry labeled with the given ID

DELETE /rest/datasets/{entry_id}

Description

The consumer of this endpoint must have a valid OAuth token. In addition, the user has to be a member of the organization that owns the data sets. This restriction does not apply to admins (console.admin in the token's scope), who always have access.

Parameters

| Type | Name | Description | Required | Schema | Default |
|---|---|---|---|---|---|
| PathParameter | entry_id | ID of the metadata entry describing some data set. | true | string | |

Responses

| HTTP Code | Description | Schema |
|---|---|---|
| 200 | Entry has been removed from ElasticSearch. The status of deletion from external services is in the response body. | DeleteResponse |
| 401 | Authorization header not found. | No Content |
| 403 | Forbidden access to the resource. | No Content |
| 404 | No entry with the given ID found. | No Content |
| 503 | Problem connecting to ElasticSearch. | No Content |
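Since a 200 response only confirms removal from the index, a client may want to inspect the DeleteResponse body as well. The Python sketch below assumes that each flag is true when the corresponding external deletion succeeded; `failed_external_deletions` is a hypothetical helper name.

```python
import json

# Flag names from the DeleteResponse definition.
EXTERNAL_SERVICE_FLAGS = ("deleted_from_downloader", "deleted_from_publisher")


def failed_external_deletions(body):
    """Return the DeleteResponse flags that are false."""
    response = json.loads(body)
    return [flag for flag in EXTERNAL_SERVICE_FLAGS if not response[flag]]


failures = failed_external_deletions(
    b'{"deleted_from_downloader": true, "deleted_from_publisher": false}'
)
```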

Tags

  • rest/datasets

Gets a metadata entry labeled with the given ID

GET /rest/datasets/{entry_id}

Description

The consumer of this endpoint must have a valid OAuth token. In addition, the user has to be a member of the organization that owns the data sets. This restriction does not apply to admins (console.admin in the token's scope), who always have access.

Parameters

| Type | Name | Description | Required | Schema | Default |
|---|---|---|---|---|---|
| PathParameter | entry_id | ID of the metadata entry describing some data set. | true | string | |

Responses

| HTTP Code | Description | Schema |
|---|---|---|
| 200 | Success | QueryHit |
| 403 | Forbidden access to the resource. | No Content |
| 404 | No entry with the given ID found. | No Content |
| 503 | Problem while connecting to the index. | No Content |
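A successful response carries a QueryHit, from which the metadata entry can be extracted. The Python sketch below is illustrative only: `entry_from_query_hit` is a hypothetical helper, the sample values are invented, and treating `found` as "the entry exists" is an assumption based on the field name.

```python
import json


def entry_from_query_hit(body):
    """Return (entry ID, metadata entry) from a QueryHit, or None if not found."""
    hit = json.loads(body)
    if not hit["found"]:
        return None
    return hit["_id"], hit["_source"]


# Invented sample body, trimmed to a single metadata field for brevity.
sample = json.dumps({
    "_index": "example-index",
    "_type": "example-type",
    "_id": "entry-1",
    "_version": 1,
    "found": True,
    "_source": {"title": "Example data set"},
}).encode("utf-8")

result = entry_from_query_hit(sample)
```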

Tags

  • rest/datasets

Updates specified attributes of metadata entry with the given ID

POST /rest/datasets/{entry_id}

Description

The body of the POST request should have the following form:

{ "argumentName": ["value01", "value02"] }

The given value replaces the current value of that argument in the specified metadata entry.

Example: { "title": "A new, better title for this data set!" }
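A partial-update body can be built as in this Python sketch. `build_update_body` is a hypothetical helper, and rejecting attribute names that do not appear in InputMetadataEntry is an assumption about what the server accepts.

```python
import json

# Attribute names taken from the InputMetadataEntry definition.
KNOWN_FIELDS = {
    "category", "recordCount", "dataSample", "isPublic", "creationTime",
    "targetUri", "format", "sourceUri", "title", "size", "orgUUID",
}


def build_update_body(changes):
    """Encode the attributes to change as a JSON request body."""
    if not changes:
        raise ValueError("no attributes to update")
    unknown = sorted(set(changes) - KNOWN_FIELDS)
    if unknown:
        raise ValueError("unknown attributes: " + ", ".join(unknown))
    return json.dumps(changes).encode("utf-8")


body = build_update_body({"title": "A new, better title for this data set!"})
```

Only the listed attributes are sent, so values that are not mentioned stay unchanged on the server.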

The consumer of this endpoint must have a valid OAuth token. In addition, the user has to be a member of the organization that owns the data set. This restriction does not apply to admins (console.admin in the token's scope), who always have access.

Parameters

| Type | Name | Description | Required | Schema | Default |
|---|---|---|---|---|---|
| PathParameter | entry_id | ID of a metadata entry (data set). | true | string | |
| BodyParameter | body | Attributes with values to change. | true | InputMetadataEntry | |

Responses

| HTTP Code | Description | Schema |
|---|---|---|
| 200 | Data set attributes are updated. | No Content |
| 400 | Wrong input data. | No Content |
| 403 | Forbidden access to the resource. | No Content |
| 404 | No entry with the given ID found. | No Content |

Tags

  • rest/datasets

Definitions

DeleteResponse

| Name | Description | Required | Schema | Default |
|---|---|---|---|---|
| deleted_from_downloader | | true | boolean | |
| deleted_from_publisher | | true | boolean | |

InputMetadataEntry

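| Name | Description | Required | Schema | Default |
|---|---|---|---|---|
| category | | true | string | |
| recordCount | | true | integer | |
| dataSample | | true | string | |
| isPublic | | true | boolean | |
| creationTime | | false | string (date-time) | |
| targetUri | | true | string | |
| format | | true | string | |
| sourceUri | | true | string | |
| title | | true | string | |
| size | | true | integer | |
| orgUUID | | true | string | |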
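The required fields and schemas above can be checked on the client side before a metadata entry is sent. The Python sketch below is one possible approach, not part of the API: `validation_errors` is a hypothetical helper and all sample values are invented.

```python
# Field names and types from the InputMetadataEntry definition.
REQUIRED_FIELDS = {
    "category": str, "recordCount": int, "dataSample": str, "isPublic": bool,
    "targetUri": str, "format": str, "sourceUri": str, "title": str,
    "size": int, "orgUUID": str,
}
OPTIONAL_FIELDS = {"creationTime": str}  # date-time string


def _has_type(value, expected):
    # bool is a subclass of int in Python, so reject it for integer fields.
    if expected is int:
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, expected)


def validation_errors(entry):
    """Return a list of problems found in a metadata entry (empty if none)."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in entry:
            errors.append(f"missing required field: {name}")
        elif not _has_type(entry[name], expected):
            errors.append(f"wrong type for field: {name}")
    for name, expected in OPTIONAL_FIELDS.items():
        if name in entry and not _has_type(entry[name], expected):
            errors.append(f"wrong type for field: {name}")
    return errors


# Invented sample entry for illustration.
sample_entry = {
    "category": "finance", "recordCount": 120, "dataSample": "a,b,c",
    "isPublic": False, "targetUri": "hdfs://example/target",
    "format": "csv", "sourceUri": "http://example.com/data.csv",
    "title": "Example data set", "size": 2048, "orgUUID": "orguuid-01",
}
problems = validation_errors(sample_entry)
```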

InputMetadataEntryWithID

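| Name | Description | Required | Schema | Default |
|---|---|---|---|---|
| category | | true | string | |
| recordCount | | true | integer | |
| dataSample | | true | string | |
| isPublic | | true | boolean | |
| id | | true | string | |
| creationTime | | false | string (date-time) | |
| targetUri | | true | string | |
| format | | true | string | |
| sourceUri | | true | string | |
| title | | true | string | |
| size | | true | integer | |
| orgUUID | | true | string | |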

QueryHit

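| Name | Description | Required | Schema | Default |
|---|---|---|---|---|
| _type | | true | string | |
| _source | | true | InputMetadataEntry | |
| _id | | true | string | |
| found | | true | boolean | |
| _index | | true | string | |
| _version | | true | integer | |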

SearchHits

| Name | Description | Required | Schema | Default |
|---|---|---|---|---|
| total | | true | integer | |
| hits | | true | InputMetadataEntryWithID array | |
| formats | | true | string array | |
| categories | | true | string array | |
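A SearchHits body could be consumed as in the following Python sketch. `summarize_hits` is a hypothetical helper, the sample body is invented and trimmed to the fields the helper reads, and using `total` to decide whether more pages remain is an assumption about what that field counts.

```python
import json


def summarize_hits(body, start=0):
    """Return (titles keyed by entry ID, whether more hits remain)."""
    result = json.loads(body)
    titles_by_id = {hit["id"]: hit["title"] for hit in result["hits"]}
    # Assumption: "total" counts all matches, not just the returned page.
    has_more = start + len(result["hits"]) < result["total"]
    return titles_by_id, has_more


# Invented sample body for illustration.
sample = json.dumps({
    "total": 3,
    "hits": [
        {"id": "entry-1", "title": "First data set"},
        {"id": "entry-2", "title": "Second data set"},
    ],
    "formats": ["csv"],
    "categories": ["finance"],
}).encode("utf-8")

titles, has_more = summarize_hits(sample, start=0)
```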