
Unterminated string error during rethinkdb restore #3859

Closed
brandon-beacher opened this issue Mar 2, 2015 · 13 comments
@brandon-beacher

Our app (http://gatherhere.com/platform) uses RethinkDB as its primary data store.

rethinkdb restore is currently failing if we attempt to restore a dump of our database.

The failure occurs for a table containing JSON from Postmark's Inbound email webhook.

I suspect the wide range of characters that occur in email may have hit something rethinkdb restore does not yet handle?

Here is the full error message:

rethinkdb restore ~/Downloads/rethinkdb_dump_2015-03-02T11:00:01.tar.gz --force
Unzipping archive file...
  Done (8 seconds)
Importing from directory...
[                                        ]   2% 
412586 rows imported in 26 tables
Unterminated string starting at: line 2 column 15988747 (char 15988748)
In file: /var/folders/n0/fqys4_ns6kl2wl990nhd4nf40000gn/T/tmpImjL5r/gather/inbound_emails.json
Errors occurred during import
Error: rethinkdb-import failed

I tried to take a look at the character referenced in the error, but when I open the file in an editor, line 2 does not have that many characters.

Anything I can do to help diagnose this one?

@mglukhovsky
Member

@brandon-beacher, have you checked the version of your RethinkDB Python driver with pip freeze?

The import/export and dump/restore scripts rely on the installed Python driver. If you're running an older version of the driver (< 1.16.0-2), you should upgrade it:

sudo pip install -U rethinkdb

@Tryneus, is there anything else that could be causing this error?

@brandon-beacher
Author

Upgrading the driver (it was at 1.15.0-0); will report back!

@Tryneus
Member

Tryneus commented Mar 2, 2015

I think I see the problem. This didn't actually occur on line '2' - we just iteratively parse JSON rows from the file (to keep track of progress), and it happened on the second line of a parse. The problem itself appears to be that a single row is larger than 16 MB, and the import script fails in that case. This could be solved by increasing the maximum size of the buffer in the script, but that will have some performance implications.

I'll look into solving this without killing performance on very large rows.

@Tryneus
Member

Tryneus commented Mar 2, 2015

Ah, I just remembered one of the reasons we limited the maximum buffer size - some users were getting OOM-killed due to the import script using too much memory. We can't keep arbitrarily reading more data into memory until the parse works - the system will run out of memory (and if the import is running on the same machine as the server, there's a good chance the OOM killer will target the server). So we need some upper limit on the buffer. Otherwise, on bad JSON input, we would keep buffering the file until we reach the EOF.

@brandon-beacher
Author

This makes sense: since they're JSON representations of emails, the attachments are represented as base64 strings. I bet the document it's failing on just has a large attachment.
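As a quick sanity check on that theory: base64 inflates binary data by a factor of 4/3, so an attachment of roughly 12 MiB already produces about 16 MiB of JSON text, enough to blow the buffer limit on its own. The attachment bytes below are invented for illustration.

```python
import base64

# A hypothetical 12 MiB attachment (all zero bytes for illustration).
attachment = b"\x00" * (12 * 1024 * 1024)
encoded = base64.b64encode(attachment)

print(len(encoded))                    # 16777216, i.e. exactly 16 MiB
print(len(encoded) / len(attachment))  # the 4/3 base64 overhead
```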

@Tryneus
Member

Tryneus commented Mar 2, 2015

Ok, a fix is up in review 2859. This bumps up the maximum row size to 128 MB and implements a scaling buffer size so it is much faster for larger rows. It may be useful to add a command-line argument for setting this value in the future, but with any luck this should do the job until then.
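One plausible shape for such a scaling buffer (the real change lives in review 2859 and may differ; the starting size and doubling rule here are assumptions) is to start small and double on demand up to the new cap, so typical rows stay cheap while oversized rows still fit:

```python
MAX_BUFFER = 128 * 1024 * 1024  # the new cap mentioned in the fix

def grow_buffer(current_size, max_buffer=MAX_BUFFER):
    # Hypothetical scaling rule: double the buffer each time a parse fails
    # for lack of data, never exceeding the hard cap.
    return min(current_size * 2, max_buffer)

size = 512 * 1024  # assumed starting size of 512 KiB
steps = []
while size < MAX_BUFFER:
    size = grow_buffer(size)
    steps.append(size)
# Reaching the 128 MiB cap takes only 8 doublings from 512 KiB, so small
# rows never pay for a huge allocation up front.
```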

@brandon-beacher
Author

Nice! I will test this against our dump here and report back.

@danielmewes
Member

@brandon-beacher Note that the fix isn't released yet. Maybe @Tryneus can give you the branch he has implemented it in, so you could build the python driver (which contains the rethinkdb-restore script) from source?

@danielmewes danielmewes added this to the 1.16.x milestone Mar 2, 2015
@brandon-beacher
Author

Thanks Daniel. I thought 2859 was a pull request, but realized it would have auto-linked via GitHub if it were. I've got a workaround for now but will be happy to test if needed.

@Tryneus
Member

Tryneus commented Mar 2, 2015

Ah, sorry about the confusion @brandon-beacher. The fix has been approved and merged to next in commit 0b08f09, and cherry-picked into v1.16.x in commit adf4e7e. It will be in the next Python driver release, which I am terribly unsuccessful at predicting the version numbers of.

@danielmewes
Member

@brandon-beacher Since you have a work-around, would it be enough if we released this together with the next server version? ETA is about two weeks from now.
Feel free to let us know if you need the fix earlier. In that case we can push out a new version of the Python driver earlier.

@brandon-beacher
Author

Sounds great @danielmewes - also just wanted to point out that you all are awesome!

@AtnNn AtnNn modified the milestones: 2.0, 1.16.x, 1.16.3 Mar 26, 2015
@AtnNn
Member

AtnNn commented Mar 26, 2015

The fix was released in version 1.16.0-3 of the Python driver. Please re-open if there is something left to do in this issue.

@AtnNn AtnNn closed this as completed Mar 26, 2015