Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

skip empty string for start position #4596

Merged
merged 1 commit into from
Oct 29, 2015

Conversation

ch33hau
Copy link
Contributor

@ch33hau ch33hau commented Oct 28, 2015

Hi,

I found out that scanLine() in models/points.go doesn't filter empty string (uint8 = 0) for start position.
This will introduce some weird issues if the input is made from a large buffer, the Key() from first point gonna be a very large string with all the empty string due to the invalid start position.

One of the obvious example is, if you change the value from uswest10 to uswest1 in https://github.com/influxdb/influxdb/blob/master/tsdb/engine/wal/wal_test.go#L327, the test will be failed due to this issue (only uswest1 will fail, uswest2 until uswest100 working fine). The reason is the first key is a large string (350000*uint8 + "cpu,host=A,region=uswest1"), it return no result when we pass in the valid key "cpu,host=A,region=uswest1" to lookup value from map (string comparison failed).

  • uswest10 (or uswest2 until uswest99) have no issue is because it is not the first point.

I have also created a test for points_test.go, this test gonna be failed without the new changes.

Sorry if my explanation is not clear enough.

@ch33hau ch33hau force-pushed the models_point_skip_empty_string branch 2 times, most recently from 92284c4 to 81438f5 Compare October 28, 2015 14:53
@ch33hau
Copy link
Contributor Author

ch33hau commented Oct 28, 2015

Again not sure why ci/circleci test -- TestServer_Query_Aggregates_FloatMany failed as below:

--- FAIL: TestServer_Query_Aggregates_FloatMany (9.65s)
    server_test.go:2499: median - odd count - float: unexpected results
        query:  SELECT MEDIAN(value) FROM floatmany where time < '2000-01-01T00:01:10Z'
        exp:    {"results":[{"series":[{"name":"floatmany","columns":["time","median"],"values":[["1970-01-01T00:00:00Z",4]]}]}]}
        actual: {"results":[{"series":[{"name":"floatmany","columns":["time","median"],"values":[["1970-01-01T00:00:00Z",null]]}]}]}

I have tried to execute test locally as well as docker, success without errors...
I will check if the test fail again...

UPDATED: the test seems passed now

@oiooj
Copy link
Contributor

oiooj commented Oct 28, 2015

check this function skipWhitespace at https://github.com/influxdb/influxdb/blob/master/models/points.go#L723

@ch33hau
Copy link
Contributor Author

ch33hau commented Oct 28, 2015

@oiooj yup I'm doing similar checking, but there is some minor different:
skipWhitespace is skipping for:

  • ' ': uint = 32
  • '\t': uint = 9

and I'm trying to skip for uint = 0

bufLength := len(buf)

for {
if bufLength > start && buf[start] == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if I had filled the buffer with some other non-zero value, which was not a valid character? Is that possible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is possible if someone purposely do it programmatically.

This fixes is more like a prevention for a programmer's mistake, eg, buffer helps us to generate empty string without we knowing it, and it is not visible until the result from len(key).is called.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that is fair enough.

On Wed, Oct 28, 2015 at 11:14 AM, Lim Chee Hau notifications@github.com
wrote:

In models/points.go
#4596 (comment):

@@ -741,6 +741,16 @@ func scanLine(buf []byte, i int) (int, []byte) {
start := i
quoted := false
fields := false

  • bufLength := len(buf)
  • for {
  •   if bufLength > start && buf[start] == 0 {
    

I think it is possible if someone purposely do it programmatically.

This fixes is more like a prevention for a programmer's mistake, eg,
buffer helps us to generate empty string without we knowing it, and it is
not visible until the result from len(key).is called.


Reply to this email directly or view it on GitHub
https://github.com/influxdb/influxdb/pull/4596/files#r43296011.

@jwilder
Copy link
Contributor

jwilder commented Oct 28, 2015

Thanks @ch33hau. The fix for this is in the wrong place though. It should be:

diff --git a/models/points.go b/models/points.go
index 6a2fdad..5fa8021 100644
--- a/models/points.go
+++ b/models/points.go
@@ -731,7 +731,7 @@ func skipWhitespace(buf []byte, i int) int {
                        return i
                }

-               if buf[i] == ' ' || buf[i] == '\t' {
+               if buf[i] == ' ' || buf[i] == '\t' || buf[i] == 0 {
                        i += 1
                        continue

Because https://github.com/influxdb/influxdb/blob/master/models/points.go#L139 is where the start position within the line is determined.

@@ -1509,3 +1510,25 @@ func TestPrecisionString(t *testing.T) {
}
}
}

func Test_ParsePointsStringWithExtraBuffer(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency with the other tests in the file, can you remove the _ since the others do not follow this naming convention.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

@ch33hau
Copy link
Contributor Author

ch33hau commented Oct 28, 2015

Thanks @jwilder ( and @oiooj too )!

I got hinted earlier but just got it now, yes skipWhitespace is the right method to be modified, I will update it

@ch33hau ch33hau force-pushed the models_point_skip_empty_string branch from 81438f5 to f13af15 Compare October 28, 2015 18:29
@otoolep
Copy link
Contributor

otoolep commented Oct 28, 2015

+1, makes sense to me, but waiting for confirmation from @jwilder

jwilder added a commit that referenced this pull request Oct 29, 2015
@jwilder jwilder merged commit 78086cf into influxdata:master Oct 29, 2015
@jwilder
Copy link
Contributor

jwilder commented Oct 29, 2015

Thanks @ch33hau!

@ch33hau
Copy link
Contributor Author

ch33hau commented Oct 29, 2015

Thanks @otoolep and @jwilder =)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants