River does not restart after crashing/erroring #9

kishorevarma · 2015-05-13T05:52:07Z

We have an Elasticsearch cluster that reads from a RethinkDB cluster using this plugin. Recently we updated RethinkDB, which caused RethinkDB downtime. The plugin failed to find the tables and bailed out, as shown in the logs below.

As a result, the documents in ES and RethinkDB went out of sync, and we only noticed that the plugin had completely bailed out when we went into the logs. Restarting the ES node where the plugin was running fixes this, but it would be great if the plugin attempted to restart after a failure, as ES itself does.

Logs:

[2015-04-07 11:56:48,630][INFO ][river.rethinkdb.feedworker] [cortex.layouts] Synced 450 documents
[2015-04-07 18:14:03,286][INFO ][river.rethinkdb.feedworker] [cortex.layouts] Synced 460 documents
[2015-04-08 04:02:28,360][INFO ][river.rethinkdb.feedworker] [cortex.layouts] Synced 470 documents
[2015-04-08 06:19:23,834][INFO ][river.rethinkdb.feedworker] [cortex.layouts] Synced 480 documents
[2015-04-08 12:13:20,714][INFO ][river.rethinkdb.feedworker] [cortex.layouts] Synced 490 documents
[2015-04-08 12:29:34,956][INFO ][cluster.metadata         ] [cortex-elasticsearch2] [cortex] update_mapping [layouts] (dynamic)
[2015-04-08 13:55:32,769][INFO ][cluster.metadata         ] [cortex-elasticsearch2] [cortex] update_mapping [layouts] (dynamic)
[2015-04-08 13:58:30,165][INFO ][river.rethinkdb.feedworker] [cortex.layouts] Synced 500 documents
[2015-04-08 22:07:51,018][WARN ][monitor.jvm              ] [cortex-elasticsearch2] [gc][young][4331504][511] duration [1.4s], collections [1]/[2.4s], total [1.4s]/[2.5m], memory [567.9mb]->[40mb]/[3.9gb], all_pools {[young] [531.5mb]->[3.5mb]/[532.5mb]}{[survivor] [1.4mb]->[1.2mb]/[66.5mb]}{[old] [34.8mb]->[35.3mb]/[3.3gb]}
[2015-04-14 19:56:46,340][INFO ][river.rethinkdb.feedworker] [cortex.layouts] Synced 510 documents
[2015-04-14 22:28:37,068][INFO ][river.rethinkdb.feedworker] [cortex.layouts] Synced 520 documents
[2015-04-18 14:15:29,601][WARN ][monitor.jvm              ] [cortex-elasticsearch2] [gc][young][5166964][660] duration [1.2s], collections [1]/[1.5s], total [1.2s]/[2.6m], memory [569.6mb]->[43.2mb]/[3.9gb], all_pools {[young] [532.5mb]->[6.2mb]/[532.5mb]}{[survivor] [31.1kb]->[33kb]/[66.5mb]}{[old] [37mb]->[37mb]/[3.3gb]}
[2015-04-20 18:56:07,735][INFO ][river.rethinkdb.feedworker] [cortex.layouts] Synced 530 documents
[2015-04-28 15:57:42,071][ERROR][river.rethinkdb.feedworker] [cortex.symlinks] Worker has a problem: RUNTIME_ERROR: Changefeed aborted (table unavailable).
[2015-04-28 15:57:42,075][INFO ][river.rethinkdb.feedworker] [cortex.symlinks] This probably isn't recoverable, bailing.
[2015-04-28 15:57:42,075][ERROR][river.rethinkdb.feedworker] [cortex.symlinks] failed due to exception
com.rethinkdb.RethinkDBException: RUNTIME_ERROR: Changefeed aborted (table unavailable).
    at com.rethinkdb.response.DBResultFactory.convert(DBResultFactory.java:25)
    at com.rethinkdb.Cursor.loadNextBatch(Cursor.java:62)
    at com.rethinkdb.Cursor.next(Cursor.java:85)
    at org.elasticsearch.river.rethinkdb.FeedWorker.run(FeedWorker.java:77)
    at java.lang.Thread.run(Thread.java:744)
[2015-04-28 15:57:42,075][INFO ][river.rethinkdb.feedworker] [cortex.symlinks] thread shutting down
[2015-04-28 15:57:42,075][ERROR][river.rethinkdb.feedworker] [cortex.templates] Worker has a problem: RUNTIME_ERROR: Changefeed aborted (table unavailable).
[2015-04-28 15:57:42,078][ERROR][river.rethinkdb.feedworker] [cortex.layouts] Worker has a problem: RUNTIME_ERROR: Changefeed aborted (table unavailable).
[2015-04-28 15:57:42,080][INFO ][river.rethinkdb.feedworker] [cortex.layouts] This probably isn't recoverable, bailing.
[2015-04-28 15:57:42,080][INFO ][river.rethinkdb.feedworker] [cortex.templates] This probably isn't recoverable, bailing.
[2015-04-28 15:57:42,080][ERROR][river.rethinkdb.feedworker] [cortex.layouts] failed due to exception
com.rethinkdb.RethinkDBException: RUNTIME_ERROR: Changefeed aborted (table unavailable).
    at com.rethinkdb.response.DBResultFactory.convert(DBResultFactory.java:25)
    at com.rethinkdb.Cursor.loadNextBatch(Cursor.java:62)
    at com.rethinkdb.Cursor.next(Cursor.java:85)
    at org.elasticsearch.river.rethinkdb.FeedWorker.run(FeedWorker.java:77)
    at java.lang.Thread.run(Thread.java:744)
[2015-04-28 15:57:42,080][INFO ][river.rethinkdb.feedworker] [cortex.layouts] thread shutting down
[2015-04-28 15:57:42,080][ERROR][river.rethinkdb.feedworker] [cortex.templates] failed due to exception
com.rethinkdb.RethinkDBException: RUNTIME_ERROR: Changefeed aborted (table unavailable).
    at com.rethinkdb.response.DBResultFactory.convert(DBResultFactory.java:25)
    at com.rethinkdb.Cursor.loadNextBatch(Cursor.java:62)
    at com.rethinkdb.Cursor.next(Cursor.java:85)
    at org.elasticsearch.river.rethinkdb.FeedWorker.run(FeedWorker.java:77)
    at java.lang.Thread.run(Thread.java:744)
[2015-04-28 15:57:42,081][INFO ][river.rethinkdb.feedworker] [cortex.templates] thread shutting down
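
For illustration, the restart-after-failure behaviour requested above could look roughly like the sketch below. It is not the plugin's code: `RetryingWorker`, `runWithRetry`, and `backoffMillis` are hypothetical names, and the `Runnable` stands in for whatever loop consumes the changefeed. The assumption is that a failed worker surfaces a `RuntimeException` (as `RethinkDBException` appears to in the stack traces) and that retrying with a capped exponential delay is acceptable.

```java
/** Hypothetical helper, not part of the river plugin. */
final class RetryingWorker {

    /** Capped exponential backoff: base, 2*base, 4*base, ... up to maxMillis. */
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    /**
     * Runs the task and restarts it after a RuntimeException, up to maxAttempts times.
     * Returns the number of attempts used once the task completes normally.
     */
    static int runWithRetry(Runnable task, int maxAttempts, long baseMillis, long maxMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                task.run();
                return attempt + 1;
            } catch (RuntimeException e) {
                System.err.println("Worker failed (attempt " + (attempt + 1) + "): " + e.getMessage());
                if (attempt + 1 < maxAttempts) {
                    try {
                        Thread.sleep(backoffMillis(attempt, baseMillis, maxMillis));
                    } catch (InterruptedException ie) {
                        // Preserve the interrupt so the caller can shut down cleanly.
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("Interrupted while waiting to retry", ie);
                    }
                }
            }
        }
        throw new IllegalStateException("Giving up after " + maxAttempts + " attempts");
    }
}
```

A restart alone would not close the sync gap: changes made while the feed was down would still be missed unless the worker re-reads the table when it reconnects.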

paramaggarwal · 2015-05-13T05:58:59Z

Suggestion: showing whether the plugin is running or stopped on the /_river/rethinkdb/status endpoint would help debug plugin crashes.
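
A minimal sketch of the bookkeeping such a status report might need, assuming one worker per table as the `[cortex.layouts]`-style log prefixes suggest. `WorkerStatusRegistry` and its methods are hypothetical and not part of the plugin. How the snapshot would be exposed through Elasticsearch is left out.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-table worker state tracker, not part of the river plugin. */
final class WorkerStatusRegistry {
    enum State { RUNNING, STOPPED }

    // Keyed by "db.table", e.g. "cortex.layouts"; safe to update from worker threads.
    private final Map<String, State> states = new ConcurrentHashMap<>();

    void markRunning(String table) { states.put(table, State.RUNNING); }

    void markStopped(String table) { states.put(table, State.STOPPED); }

    /** RUNNING only if at least one worker is registered and none has stopped. */
    State overall() {
        boolean healthy = !states.isEmpty() && !states.containsValue(State.STOPPED);
        return healthy ? State.RUNNING : State.STOPPED;
    }

    /** Sorted copy of the current states, suitable for serialising into a status document. */
    Map<String, State> snapshot() { return new TreeMap<>(states); }
}
```

A worker would call `markRunning` when its thread starts and `markStopped` in the same place it currently logs "thread shutting down".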

coffeemug · 2015-05-13T19:33:58Z

/cc @deontologician

We have resumable changefeeds in development, so aside from reporting the status better, this particular bug will be a lot less likely to happen.

deontologician · 2015-05-18T20:02:03Z

@kishorevarma I would be open to pull requests on this, but at the moment the plan is to deprecate this river (since rivers are deprecated by Elasticsearch) and move to the Logstash input. Once RethinkDB has resumable changefeeds, we'll add that capability to the Logstash plugin and formally deprecate the river.

paramaggarwal · 2015-05-19T09:36:50Z

Wow, didn't know that Elasticsearch has deprecated Rivers - then it makes sense to deprecate this also. (src: https://www.elastic.co/blog/deprecating_rivers)

deontologician · 2015-11-17T22:45:06Z

Closing, since this repo is no longer maintained as of RethinkDB 2.2.

deontologician added the bug label May 18, 2015

deontologician closed this as completed Nov 17, 2015
