-
Notifications
You must be signed in to change notification settings - Fork 55
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #242 from erinspace/feature/elasticsearch_docs
elasticsearch docs
- Loading branch information
Showing
1 changed file
with
90 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,93 @@ | ||
Elasticsearch | ||
============= | ||
|
||
Coming Soon... | ||
SHARE has an elasticsearch API endpoint that can be used for searching SHARE's normalized data, as well as for compiling | ||
summary statistics and analyses of the completeness of data from the various sources. | ||
|
||
Fields Indexed by Elasticsearch | ||
############################### | ||
|
||
Elasticsearch can be used to search the following fields in the normalized data:: | ||
|
||
'title' | ||
'language' | ||
'subject' | ||
'description' | ||
'date' | ||
'date_created' | ||
'date_modified | ||
'date_updated' | ||
'date_published' | ||
'tags' | ||
'links' | ||
'awards' | ||
'venues' | ||
'sources' | ||
'contributors' | ||
|
||
|
||
Accessing the Search API | ||
######################## | ||
|
||
Using curl | ||
********** | ||
|
||
You can acess the API via the command line using a basic query string with curl:: | ||
|
||
curl -H "Content-Type: application/json" -X POST -d '{ | ||
"query": { | ||
"query_string" : { | ||
"query" : "test" | ||
} | ||
} | ||
}' http://localhost:8000/api/search/abstractcreativework/_search | ||
|
||
The elasticsearch API also allows you to aggregate over the whole dataset. This query will also return an aggregation of which sources | ||
do not have a value specified for the field "language":: | ||
|
||
|
||
curl -H "Content-Type: application/json" -X POST -d '{ | ||
"aggs": { | ||
"sources": { | ||
"significant_terms": { | ||
"percentage": {}, | ||
"size": 0, | ||
"min_doc_count": 1, | ||
"field": "sources" | ||
} | ||
} | ||
}, | ||
"query": { | ||
"bool": { | ||
"must_not": [ | ||
{ | ||
"exists": { | ||
"field": "language" | ||
} | ||
} | ||
] | ||
} | ||
} | ||
}' http://localhost:8000/api/search/abstractcreativework/_search | ||
|
||
For more information on sending elasticsearch queries and aggregations, check out the `elasticsearch query DSL documentation <https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html>`_. | ||
|
||
Tutorials | ||
********* | ||
|
||
For a detailed series of tutorials on how to use the SHARE Search API, check out `this repository on GitHub <https://github.com/erinspace/share_tutorials>`_. | ||
|
||
|
||
sharepa - SHARE Parsing and Analysis Library | ||
******************************************** | ||
|
||
You can also use ``sharepa`` - a python library for parsing SHARE data that connects directly to the search API. It is based on the | ||
`elasticsearch DSL <http://elasticsearch-dsl.readthedocs.io/en/latest/index.html>`_. | ||
|
||
Install sharepa with:: | ||
|
||
pip install sharepa | ||
|
||
You can see the `source code for sharepa on GitHub <https://github.com/CenterForOpenScience/sharepa>`_. | ||
|
||
See some tutorials on how to use sharepa by visiting `this repository on GitHub <https://github.com/erinspace/share_tutorials>`_. |