River: A pluggable river (indexer like) support #377

kimchy opened this issue Sep 20, 2010 · 1 comment



A river is a pluggable entity running within an elasticsearch cluster. It pulls data (or has data pushed to it), and that data is then indexed into the cluster.

A river is composed of a unique name and a type. The type is the type of the river (out of the box, there is the dummy river, which simply logs that it's running). The name uniquely identifies the river within the cluster. For example, one can run a river called my_river with type dummy.

Rivers are singletons within the cluster. Each one gets allocated automatically to one of the nodes and runs there. If that node fails, the river is automatically allocated to another node.

River allocation can be controlled on each node. The node.river setting can be set to _none_, which disables any river allocation to that node. The node.river setting can also hold a comma separated list of river names or river types, controlling which rivers are allowed to run on the node. For example: my_river1,my_river2, or dummy,twitter.
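The matching rule described above can be sketched as a small shell function. This is only an illustration of the described semantics, not elasticsearch code: the function name river_allowed is hypothetical, and the behaviour for an unset node.river (no restriction) is an assumption, since it is not spelled out here.

```shell
# Hypothetical helper, not part of elasticsearch.
# Usage: river_allowed <node.river value> <river name> <river type>
# Returns 0 if the node may run the river, 1 otherwise.
river_allowed() {
    local setting="$1" name="$2" type="$3" entry
    # Assumption: an empty/unset node.river places no restriction on the node.
    [ -z "$setting" ] && return 0
    # _none_ disables any river allocation to this node.
    [ "$setting" = "_none_" ] && return 1
    # Otherwise the value is a comma separated list of river names or types.
    local IFS=','
    for entry in $setting; do
        [ "$entry" = "$name" ] && return 0
        [ "$entry" = "$type" ] && return 0
    done
    return 1
}
```

For instance, river_allowed "dummy,twitter" my_river dummy succeeds because the river type matches, while river_allowed "_none_" my_river dummy fails.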

Rivers require metadata (what type they are, plus additional information) that forms the "settings" of the river, and may also need to store runtime state (indexed up to data X, continue from it in case of failover). Everything is driven by working against an internal index called _river.

In that index (_river), each _type in the index (mapping) corresponds to a river name (do not confuse it with the river type). A document with the id _meta is required and holds the settings of that river. It must include at least the river type. To delete a river, simply delete the mapping type (river name).

Because the river information is stored in an index, it is fully persistent, and it allows for very frequent state storage (under one or more documents).

It sounds confusing, but it's really simple. Here is an example of creating the dummy river with the name my_river:

curl -XPUT 'localhost:9200/_river/my_river/_meta' -d '{
    "type" : "dummy"
}'

And deleting the river is:

curl -XDELETE 'localhost:9200/_river/my_river'    
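For scripting, the two calls above could be wrapped in small shell functions. This is a minimal sketch: the function names (river_url, create_river, delete_river, save_river_state) are hypothetical, and save_river_state assumes that runtime state is written with a regular index request into the river's mapping type under a document id of the caller's choosing.

```shell
# Hypothetical helpers, not part of elasticsearch.

# Build the base URL of a river: <host>/_river/<river name>
river_url() {
    printf '%s/_river/%s' "$1" "$2"
}

# create_river <host> <river name> <river type>
# Stores the required _meta document, which must include the river type.
create_river() {
    curl -XPUT "$(river_url "$1" "$2")/_meta" -d "{\"type\" : \"$3\"}"
}

# delete_river <host> <river name>
# Deletes the mapping type named after the river.
delete_river() {
    curl -XDELETE "$(river_url "$1" "$2")"
}

# save_river_state <host> <river name> <doc id> <json body>
# Assumption: state is an ordinary document next to _meta; the id is arbitrary.
save_river_state() {
    curl -XPUT "$(river_url "$1" "$2")/$3" -d "$4"
}
```

With these defined, create_river localhost:9200 my_river dummy issues the same request as the create example, and delete_river localhost:9200 my_river the same request as the delete example.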


This issue was closed.