Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prevent duplicate time, sequence_number on write #695

Closed
wants to merge 2 commits into from
Closed

prevent duplicate time, sequence_number on write #695

wants to merge 2 commits into from

Conversation

code0x9
Copy link

@code0x9 code0x9 commented Jun 26, 2014

duplicate "time" or "sequence_number" column leads to panic. prevent this by filter out invalid request.

test request

/db/<database>/series?u=<user>&p=<pass>
[
  {
    "name" : "hd_used",
    "columns" : ["time", "value", "time", "host", "mount"],
    "points" : [
      [1403761091094, 23.2, 1403761091094, "serverA", "/mnt"]
    ]
  }
]

result

panic: runtime error: index out of range

goroutine 18 [running]:
runtime.panic(0x448dd60, 0x4a601f7)
    /Applications/go/src/pkg/runtime/panic.c:266 +0xb6
datastore.(*Shard).Write(0xc2100d3940, 0xc21031ae86, 0x4, 0xc2102eac48, 0x1, ...)
    /Users/niko/git/influxdb/src/datastore/shard.go:83 +0xc7f
datastore.(*ShardDatastore).Write(0xc21009baf0, 0xc21014b800, 0x0, 0x0)
    /Users/niko/git/influxdb/src/datastore/shard_datastore.go:231 +0xf2
cluster.(*WriteBuffer).write(0xc210106690, 0xc21014b800)
    /Users/niko/git/influxdb/src/cluster/write_buffer.go:88 +0x86
cluster.(*WriteBuffer).handleWrites(0xc210106690)
    /Users/niko/git/influxdb/src/cluster/write_buffer.go:79 +0xb7
created by cluster.NewWriteBuffer
    /Users/niko/git/influxdb/src/cluster/write_buffer.go:43 +0x24f

@otoolep
Copy link
Contributor

otoolep commented Jun 26, 2014

Seems like a good catch. Is this just 'time', or if any column is duplicated, will this happen?

I'm not sure, since I am new to this code base, but I think this kind of validation should happen higher up the API stack -- in src/api/http/api.go. Sanitize all incoming JSON there, removing all duplicates for example (without actually caring what column names are). Today it may be the 'time' column, tomorrow it will be something else.

What about adding a unit test?

@code0x9
Copy link
Author

code0x9 commented Jun 26, 2014

I think sanitize all JSON leads to performance degradation, And the func ConvertToDataStoreSeries at src/common/serialize_series.go treats "time" and "sequence_number" exceptionally. So if the other column is duplicated, last column wins.

Main objective of pull request above is to prevent panic which leads to broken shard data. You can test by send write request with duplicate "time" field. After that, influxdb panics and you cannot repair old series data.

@otoolep
Copy link
Contributor

otoolep commented Jun 26, 2014

What I mean is can you add a unit test to your change which sends duplicate 'time' columns, and confirms that the system doesn't blow up? And that the later time wins?

@otoolep
Copy link
Contributor

otoolep commented Jun 26, 2014

Oh, by "sanitize", I just mean the column field, removing any dupes, since it seems to have a fairly constrained specification. I agree that sanitizing the entire JSON blob doesn't make much sense.

@code0x9
Copy link
Author

code0x9 commented Jun 27, 2014

What about points? When you remove duplicate column, you have to remove obsolete point too because serialize series loops over Columns. (and that means "doesn't make sense" situation)

Think about this situation.

  • column: ["a", "b", "b", "c"]
  • points: [1, 2, 2, 3]

If you remove "b" only, column "c" will have value 2, like this.

  • column: ["a", "b", "c"]
  • points: [1, 2, 2, 3]

I don't think unit test helps. Also, tests on master is already failing on my macbook.

FAIL: wal_test.go:536: WalSuite.TestSimultaneousReplay

wal_test.go:563:
    c.Assert(len(requests), InRange, 100, 200)
... obtained int = 235
... expected greater than int = 100
... expected less than int = 200

OOPS: 24 passed, 1 FAILED, 1 PANICKED
--- FAIL: Test-8 (3.54 seconds)
FAIL
FAIL    wal 3.571s
make: *** [test] Error 1

@otoolep
Copy link
Contributor

otoolep commented Jun 27, 2014

Good point about the corresponding data -- I hadn't thought of that. I was only thinking in terms of duplicate column names, no more.

I still don't follow why, with your fix in place, a unit test couldn't be added to prove this issue was no longer an issue. It would prevent a regression in the future. Isn't that useful?

@code0x9
Copy link
Author

code0x9 commented Jun 27, 2014

Special case "time" and "sequence_number" is limited to func ConvertToDataStoreSeries. Anyone who will work on this source will notice that without reading unit test.

And. as I said, test on master is failing already. Not working test is meaningless. Don't you think?

I added tests anyway.

@jvshahid jvshahid closed this in 9508c16 Jun 30, 2014
@jvshahid
Copy link
Contributor

I merged in the pr but changed the implementation to check for duplicates in the Columns member of the series. This should avoid other confusing problems that might arise with duplicate field names other than time and sequence_number

@otoolep
Copy link
Contributor

otoolep commented Jul 1, 2014

@jvshahid -- maybe I am missing something, but I don't see how the PR that was merged differs from what was first proposed. Where is the code that checks for dupes in Columns?

@jvshahid
Copy link
Contributor

jvshahid commented Jul 1, 2014

@code0x9
Copy link
Author

code0x9 commented Jul 1, 2014

I was so curious about those needless conversations...
Thank you, @jvshahid. You teach me how to distinguish "Owner" and "NOBODY".
I won't make mistake next time.

P.S. Your patch is much better than mine.

@jvshahid
Copy link
Contributor

jvshahid commented Jul 1, 2014

Oh forgot to mention. Do you mind signing the CLA. You can find it on the website, http://influxdb.com/community/cla.html

@code0x9
Copy link
Author

code0x9 commented Jul 1, 2014

I have signed CLA. thank you.

@jvshahid jvshahid added this to the 0.7.4 milestone Jul 7, 2014
@jvshahid jvshahid modified the milestones: 0.8.0, 0.7.4 Jul 24, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants