Search API
The search method of the VertNet API provides a simple way to access VertNet data programmatically, in JSON format. With appropriate API requests, you can easily automate searching for and retrieving custom data sets from VertNet.
This method returns its results as soon as they are available by wrapping matching VertNet records inside a JSON response object. Thus, calls to the search method are "blocking" in the sense that a response is not returned until the first records matching the search query have been retrieved.
See the documentation home page for information on the current URL to access the service.
Requests are made by building a query object (see below) and passing it as a parameter in the search method's URL. The query object specifies the query parameters and provides extra arguments for customizing what the call returns.
As a simple example, the following request searches for all Noturus placidus records (the threatened Neosho madtom catfish):
http://api-module.vertnet-portal.appspot.com/api/v1/search?q={"q":"noturus placidus"}
The query object is a JSON object containing the parameters that define the query to be performed, the number of records to return, and other features of the result. It is appended to the search method URL as the value of the q argument, with a question mark ? separating the URL from the argument. The URL above is an example of a basic query. In that URL, the query object is everything between (and including) the curly braces:
{"q": "noturus placidus"}
IMPORTANT NOTE: As you can see in the URL above, there are two different q values. The first one, ?q=, is the query object itself, whereas the second one, {"q":, is the query string element of the query object (see below). This distinction is important, since the query object should never be quoted and the query string should always be quoted.
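The following Python sketch shows one way such a request URL could be assembled. It assumes the base URL from the example above (the documentation home page has the current one). It also percent-encodes the JSON text, which is a general precaution for HTTP clients rather than a requirement stated on this page. The helper name `build_search_url` is hypothetical.

```python
import json
from urllib.parse import quote

# Base URL copied from the example above; it may change over time.
SEARCH_URL = "http://api-module.vertnet-portal.appspot.com/api/v1/search"


def build_search_url(query_object, base_url=SEARCH_URL):
    """Return the search URL for a query object given as a plain dict."""
    # json.dumps quotes the inner "q" string; the object itself stays unquoted.
    query_json = json.dumps(query_object, separators=(",", ":"))
    # Percent-encode braces, quotes and spaces so the URL is safe to send.
    return base_url + "?q=" + quote(query_json, safe="")


url = build_search_url({"q": "noturus placidus"})
print(url)
```

Decoding the `q` argument of the resulting URL gives back the original query object, so the encoding step does not change what the service receives.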
The query object can have the following elements.
The q element defines the query terms, that is, the records you want to retrieve. It is the most important element in the query object and is therefore mandatory: without it, the API call will fail. The query terms must be present as a single, quoted string. See the "Search query string" section for more information on how to build this element properly and for options that allow more complex queries.
Example:
{"q": "noturus placidus"}
The l element is an optional numeric value indicating the maximum number of records to be returned. Performance depends heavily on this value, as described in Stucky 2014. Two values are special: the optimum (and therefore default) value is 400, and the maximum value allowed by the API infrastructure is 1000. Any higher value will be silently replaced by 1000.
Example:
{"q": "noturus placidus", "l": 20}
The c element is an optional string defining the cursor needed to retrieve the next "batch" of results. See the "Using the c cursor element" section below for more information.
Example:
{"q": "noturus placidus", "c": "False:CrEFCuYCCr4C9wAAABn_____jIGJmo2LkZqL0o..."}
An optional string element declaring the application that made the call, useful for our usage logging systems. This is generally a developer-only element, and regular users can safely skip it.
If absent, queries will be logged in our system as generated by either the API (if used directly) or the data portal (if the query was made via this application).
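As an illustration, the elements described above could be assembled with a small helper like the one below. This is a sketch only: `make_query_object` is a hypothetical name, and the application element is left out because its key is not given here. The client-side cap simply mirrors the documented behaviour of replacing limits above 1000 with 1000.

```python
import json

MAX_LIMIT = 1000  # maximum limit allowed by the API, per the text above


def make_query_object(query_string, limit=None, cursor=None):
    """Assemble a query object (dict) with the q, l and c elements."""
    if not isinstance(query_string, str) or not query_string.strip():
        raise ValueError("the q element is mandatory and must be a non-empty string")
    query = {"q": query_string}
    if limit is not None:
        if limit < 1:
            raise ValueError("the limit must be a positive number")
        # Mirror the documented behaviour: values above 1000 become 1000.
        query["l"] = min(int(limit), MAX_LIMIT)
    if cursor is not None:
        # Pass the cursor through unchanged.
        query["c"] = cursor
    return query


print(json.dumps(make_query_object("noturus placidus", limit=20)))
```

Leaving `limit` unset omits the l element, so the service applies its default of 400.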
The result of the API call is a JSON object with the following fields:
| Field | Definition | Description |
|---|---|---|
| recs | returned record set | A list with each of the records that match the query string terms. These records are represented as JSON objects with the DarwinCore, metadata and VertNet-specific fields. |
| limit | limit used in the query | The value of the specified limit element, l, if any. Otherwise, the value 400 will be shown in this field. |
| response_records | returned records | The number of records in the returned record set. |
| matching_records | estimate of the available records | Estimate of the total number of records that match the specified criteria, regardless of the limit value. When the number of matching records is small, this number is accurate. However, when more than 10,000 records match, the estimate becomes unreliable, even by orders of magnitude. Therefore, when there are more than 10,000 matching records, the value for this field will simply be ">10000". |
| cursor | cursor for the next batch of records | String to be used on the same query to retrieve the next batch of records. See the "Using the c cursor element" section below. |
| request_date | date of the API call | Date and time (UTC) on which the query was performed. |
| request_origin | location of the API call | Estimated coordinates of the place where the query was performed. |
| api_version | used version of the API | Specific details on the API version, for feedback purposes. |
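The sketch below shows how a client might read these fields from a decoded response. It assumes the response has already been parsed into a Python dict; the function name `summarize_response` and the sample values are hypothetical, and the cursor in the sample is a placeholder rather than a usable value.

```python
def summarize_response(response):
    """Return (returned, matching, exact, has_more) for a response dict."""
    returned = response.get("response_records", len(response.get("recs", [])))
    matching = response.get("matching_records")
    # Large result sets are reported as the string ">10000", not a number.
    exact = isinstance(matching, int)
    # A cursor is present only when more records remain to be retrieved.
    has_more = bool(response.get("cursor"))
    return returned, matching, exact, has_more


sample = {
    "matching_records": 2366,
    "limit": 20,
    "response_records": 20,
    "cursor": "EXAMPLE_CURSOR",  # placeholder, not a real cursor
    "recs": [{}] * 20,
}
print(summarize_response(sample))
```

Treating `matching_records` as either an integer or the literal string ">10000" avoids arithmetic errors when a query is very broad.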
When a query doesn't return all the available records (for example, because you applied a small limit value to a very broad query), the response will include an element called cursor. Passing this value in the next query tells it to start returning records where the previous query finished.
So, for example, if you make a query that would return 100 records (according to the matching_records returned value) but specify a limit value of 30, you can get the first 30 records without specifying a cursor element. The response will show the cursor value needed to get the second batch of records (from 31 to 60). The response of applying that cursor will have the cursor for the third batch (61-90) and so on, until there are no more un-retrieved records.
The cursor element is usually a very long string, and it must appear exactly the same as it appeared in the response.
Let's see it in action. We will search for records of the Swainson's hawk, Buteo swainsoni, and apply a small limit value of 20:
{"q":"buteo swainsoni","l":20}
Here are the first elements of the response:
{
"matching_records": 2366,
"request_date": "2016-06-13T21:46:31.982520",
"request_origin": "42.811663,-1.648265",
"cursor": "False:CrEFCuYCCr4C9wAAABn_____jIGJmo2LkZqL0o-QjYuek96WkZuah9LNz87L0s_N0s7Onv8AAP90baCgmYuMoKD_AAD_XZ6Pj5qRmJaRmv8AAP9zdG2WkZuah_8AAP9dm4ic_wAA_3N0bZuQnKCWm_8AAP9dm5KRjNCdlo2b0oyPmpyWkpqRjNDNzc7OzP8AAP9zf5uSkYzQnZaNm9KMj5qclpKakYzQzc3Ozsz_AAD__wD-__6MgYmajYuRmovSj5CNi56T3paRm5qH0s3PzsvSz83Szs6e_wB0baCgmYuMoKD_AF2ej4-akZiWkZr_AHN0bZaRm5qH_wBdm4ic_wBzdG2bkJyglpv_AF2bkpGM0J2WjZvSjI-anJaSmpGM0M3Nzs7M_wBzf5uSkYzQnZaNm9KMj5qclpKakYzQzc3Ozsz_AP_-EBQhBN0EkB08Gxk5AAAAAOb___9IFFAAWgsJod6J9YocpMYQARINRG9jdW1lbnRJbmRleBqUAihBTkQgKElTICJjdXN0b21lcl9uYW1lIiAiYXBwZW5naW5lIikgKElTICJncm91cF9uYW1lIiAic352ZXJ0bmV0LXBvcnRhbCIpIChJUyAibmFtZXNwYWNlIiAiaW5kZXgtMjAxNC0wMi0xMWEiKSAoSVMgImluZGV4X25hbWUiICJkd2MiKSAoT1IgKEFORCAoT1IgKFFUICJidXRlbyIpIChJUyAiX19nYXRvbV9fIiAiYnV0ZW8iKSkgKE9SIChRVCAic3dhaW5zb25pIikgKElTICJfX2dhdG9tX18iICJzd2FpbnNvbmkiKSkpIChJUyAiX19nYXRvbV9fIiAiYnV0ZW8gc3dhaW5zb25pIikpKToZCgwoTiBvcmRlcl9pZCkQARkAAAAAAADw_0oFCABA6Ac",
"limit": 20,
"response_records": 20,
"recs": [ ... ]
}
As we can see, the query returned 20 records (response_records) even though there are many more (matching_records = 2,366). We can also see the large cursor value. To get the next batch of 20 records, we supply the cursor value, exactly as returned, in the c element of the query object, so the new query should be:
{"q":"buteo swainsoni","l":20, "c": "False:CrEFCuYCCr4C9wAAABn_____jIGJmo2LkZqL0o-QjYuek96WkZuah9LNz87L0s_N0s7Onv8AAP90baCgmYuMoKD_AAD_XZ6Pj5qRmJaRmv8AAP9zdG2WkZuah_8AAP9dm4ic_wAA_3N0bZuQnKCWm_8AAP9dm5KRjNCdlo2b0oyPmpyWkpqRjNDNzc7OzP8AAP9zf5uSkYzQnZaNm9KMj5qclpKakYzQzc3Ozsz_AAD__wD-__6MgYmajYuRmovSj5CNi56T3paRm5qH0s3PzsvSz83Szs6e_wB0baCgmYuMoKD_AF2ej4-akZiWkZr_AHN0bZaRm5qH_wBdm4ic_wBzdG2bkJyglpv_AF2bkpGM0J2WjZvSjI-anJaSmpGM0M3Nzs7M_wBzf5uSkYzQnZaNm9KMj5qclpKakYzQzc3Ozsz_AP_-EBQhBN0EkB08Gxk5AAAAAOb___9IFFAAWgsJod6J9YocpMYQARINRG9jdW1lbnRJbmRleBqUAihBTkQgKElTICJjdXN0b21lcl9uYW1lIiAiYXBwZW5naW5lIikgKElTICJncm91cF9uYW1lIiAic352ZXJ0bmV0LXBvcnRhbCIpIChJUyAibmFtZXNwYWNlIiAiaW5kZXgtMjAxNC0wMi0xMWEiKSAoSVMgImluZGV4X25hbWUiICJkd2MiKSAoT1IgKEFORCAoT1IgKFFUICJidXRlbyIpIChJUyAiX19nYXRvbV9fIiAiYnV0ZW8iKSkgKE9SIChRVCAic3dhaW5zb25pIikgKElTICJfX2dhdG9tX18iICJzd2FpbnNvbmkiKSkpIChJUyAiX19nYXRvbV9fIiAiYnV0ZW8gc3dhaW5zb25pIikpKToZCgwoTiBvcmRlcl9pZCkQARkAAAAAAADw_0oFCABA6Ac"}
When we execute this second query, we get a different set of 20 records, those in positions 21-40 according to the original sorting. The response will also contain a cursor element, with a different value, and we should use this new cursor to retrieve records 41-60, and so on. With the last batch of records, there won't be any cursor element.
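The batch-by-batch procedure can be sketched as a loop that keeps requesting pages until a response arrives without a cursor. To keep the example runnable without network access, the HTTP call is replaced by a hypothetical `fetch_page` callable, and `fake_fetch` is a stand-in that serves invented records with simplified cursors. Real cursors are long opaque strings and must be passed back unchanged.

```python
def iter_records(fetch_page, query_string, limit=20):
    """Yield every record for a query, following cursors batch by batch.

    fetch_page is any callable that takes a query object (dict) and
    returns the decoded JSON response (dict). It is a hypothetical hook.
    """
    query = {"q": query_string, "l": limit}
    while True:
        response = fetch_page(query)
        yield from response.get("recs", [])
        cursor = response.get("cursor")
        if not cursor:
            break  # last batch: the response carries no cursor
        # Repeat the same query, adding the cursor exactly as received.
        query = {"q": query_string, "l": limit, "c": cursor}


# Stand-in for the real service: 50 invented records served in batches.
FAKE_DATA = [{"id": i} for i in range(50)]


def fake_fetch(query):
    start = int(query.get("c", 0))  # simplified cursor: just an offset
    end = start + query["l"]
    batch = FAKE_DATA[start:end]
    response = {"recs": batch, "response_records": len(batch)}
    if end < len(FAKE_DATA):
        response["cursor"] = str(end)
    return response


records = list(iter_records(fake_fetch, "buteo swainsoni", limit=20))
print(len(records))
```

In real use, `fetch_page` would perform the HTTP request against the search URL and decode the JSON body. The loop itself would stay the same.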
Although the search API can retrieve large sets of records by means of the cursor element shown above, it is advisable to use the download method when the number of records to retrieve is large. By large, we could mean more than 10,000 records, but the download API method can in fact be used to retrieve any number of records efficiently. Please see that method's documentation page for more information.