Data loss when using big batches #502

Closed
ohurvitz opened this issue May 5, 2014 · 5 comments

@ohurvitz

ohurvitz commented May 5, 2014

I have the following case:

  • I have a cluster of several servers (I reproduced this with 4 and with 10 servers)
  • I set split=(number of servers)
  • This reproduces with replication set to either 1 or 2
  • I write a single point to each of 25000 series, and repeat this 5 times (increasing the time for every cycle)
  • I send the data in batches, using 10 sending threads, and I spread the writes across all the servers.
  • I write as fast as I can from each send thread, and verify that I get no errors.
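
For readers without access to the gist, the workload above might be sketched roughly as follows. This is not the code from the gist: `Point`, `makeCycle`, `splitBatches`, `runWorkload`, and the `send` callback are hypothetical names, the round-robin assignment of batches to servers is an assumption about how the writes are spread, and `send` is a stub where a real client would issue the write request.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Point is a hypothetical stand-in for one data point in one series.
type Point struct {
	Series string
	Time   int64
	Value  float64
}

// makeCycle builds one point per series, all sharing the cycle's timestamp.
func makeCycle(numSeries int, ts int64) []Point {
	points := make([]Point, 0, numSeries)
	for i := 0; i < numSeries; i++ {
		points = append(points, Point{Series: fmt.Sprintf("series_%d", i), Time: ts, Value: float64(i)})
	}
	return points
}

// splitBatches cuts points into consecutive batches of at most batchSize.
func splitBatches(points []Point, batchSize int) [][]Point {
	var batches [][]Point
	for start := 0; start < len(points); start += batchSize {
		end := start + batchSize
		if end > len(points) {
			end = len(points)
		}
		batches = append(batches, points[start:end])
	}
	return batches
}

// job pairs a batch with the index of the server it should be sent to.
type job struct {
	server int
	batch  []Point
}

// runWorkload writes `cycles` rounds of one point per series, in batches,
// from `workers` goroutines, assigning batches to servers round-robin
// (an assumption). It returns the number of points that send accepted
// without error, plus the first error seen, if any.
func runWorkload(numServers, numSeries, cycles, batchSize, workers int,
	send func(server int, batch []Point) error) (int64, error) {
	jobs := make(chan job)
	var sent int64
	var firstErr error
	var errOnce sync.Once
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				if err := send(j.server, j.batch); err != nil {
					errOnce.Do(func() { firstErr = err })
					continue
				}
				atomic.AddInt64(&sent, int64(len(j.batch)))
			}
		}()
	}

	next := 0
	for c := 0; c < cycles; c++ {
		// The timestamp increases with every cycle.
		for _, b := range splitBatches(makeCycle(numSeries, int64(c+1)), batchSize) {
			jobs <- job{server: next % numServers, batch: b}
			next++
		}
	}
	close(jobs)
	wg.Wait()
	return sent, firstErr
}

func main() {
	// Stub sender: a real test would write the batch to the given server here.
	send := func(server int, batch []Point) error { return nil }
	sent, err := runWorkload(4, 25000, 5, 5000, 10, send)
	fmt.Println(sent, err) // 125000 <nil>
}
```

With a stub sender every point is counted, so the total is 25000 × 5 = 125000. Against a real cluster, this client-side count is what the number of points returned by queries would be compared with.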

As the batch size increases, some data does not make it to the database, even though no error is reported. In my tests, a batch size of 5000 points loses data almost every time, while 4000 loses data most of the time.

Code to reproduce:
https://gist.github.com/ohurvitz/e5d74ae56d8ffa20e968

Note that there are a lot of hard-coded numbers in that code, and it also expects the server name to contain a single digit '1', which is replaced by '2', '3', etc. to reach the other servers.
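
The server naming convention could look something like the sketch below. The helper `serverNames` and the example host name are hypothetical and only illustrate the "replace the single '1'" rule described above; the gist may do this differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// serverNames derives n host names from a base name that contains exactly
// one '1', replacing that digit with 1, 2, ..., n in turn.
func serverNames(base string, n int) ([]string, error) {
	if strings.Count(base, "1") != 1 {
		return nil, fmt.Errorf("server name %q must contain exactly one '1'", base)
	}
	names := make([]string, 0, n)
	for i := 1; i <= n; i++ {
		names = append(names, strings.Replace(base, "1", strconv.Itoa(i), 1))
	}
	return names, nil
}

func main() {
	// Hypothetical host name, used only for illustration.
	names, err := serverNames("influx1.example.com", 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, name := range names {
		fmt.Println(name)
	}
}
```

Rejecting names with zero or several '1' characters up front avoids silently sending every write to the same server.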

@ohurvitz
Author

ohurvitz commented May 6, 2014

I forgot to add something that might help in finding the issue: if you run with replication on, different data can be dropped on different nodes. So if you run the test program with -write_data=false, you get different errors every time (as queries are run on different nodes for each replicated shard).

@jvshahid jvshahid added this to the 0.6.1 milestone May 6, 2014
@jvshahid jvshahid self-assigned this May 6, 2014
@jvshahid
Contributor

jvshahid commented May 6, 2014

Thanks @ohurvitz. Moving this issue to 0.6.1.

@jvshahid jvshahid modified the milestones: 0.6.2, 0.6.1, Next release May 7, 2014
@ohurvitz
Author

Any update on this? It still happens on the latest version, with and without setting write-batch-size as in the new config file.

@jvshahid
Contributor

We haven't had a chance to take a look at it yet. It's definitely on my todo list.

@jvshahid
Contributor

@ohurvitz I finally got to take a look at this and have a fix. Thanks for the best bug report ever. It was a little tricky to track down, though.
