Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make a dump command #1909

Merged
merged 17 commits into from
Mar 19, 2015
Merged

make a dump command #1909

merged 17 commits into from
Mar 19, 2015

Conversation

rothrock
Copy link
Contributor

added a 'dump' command to the cli.
I still need to see about getting the name of the db and the retention policy.
The idea is that the dump command outputs something that can be played back as input.

@corylanou
Copy link
Contributor

I think this is looking good.

I am wondering though if we shouldn't create a "dump" endpoint from the http handlers. This would allow you to dump via a curl (very useful), and then the cli becomes very dumb about it and just has to hit the http endpoint.

My suggestion would be to move the logic to create the "dump" struct into an http endpoint, and then hit the client from the cli, that hits the endpoint.

Reminder to update the CHANGELOG.md as well when finished.

@rothrock
Copy link
Contributor Author

OK, I think I follow it now. Sure. That would be useful.

Issue: 1909
move dump cmd to an http endpoint.
Add dump cmd on client to call the endpoint.
@rothrock rothrock self-assigned this Mar 13, 2015
@@ -145,6 +145,27 @@ func (c *Client) Ping() (time.Duration, string, error) {
return time.Since(now), version, nil
}

func (c *Client) Dump(db string) (*http.Response, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should return a io.ReadCloser so that anyone else consuming our library wouldn't have to add the http package if they wanted to declare a variable to receive the response.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but then others consuming our library will need to include the io package, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially, yes, but it only exposes what you want to expose. Returning the entire http response is way to much to return. Also, it's very idiomatic in Go to pass around readers/writers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for returning io.ReadCloser. It also doesn't tie the Client.Dump() interface to HTTP. Maybe we'll add a named pipes transport in the future. ;)

@corylanou
Copy link
Contributor

Looking good. Some more comments for discussion. We also need to add handler tests, and integration tests.

@beckettsean beckettsean modified the milestones: 0.9.0, Next Release Mar 14, 2015
@toddboom toddboom modified the milestones: Next Point Release, 0.9.0 Mar 14, 2015
@rothrock
Copy link
Contributor Author

@toddboom putting in a sleep "fixes" it. That's no good.

t.Fatalf("unexpected status for get: %d", status)
}

time.Sleep(500 * time.Millisecond) // Shouldn't committed data be readable?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing data is not committed data. When you do a write, that means that we accepted the message, and are sending it across brokers to write it and mark as committed. The better practice than sleep is to query and wait.

We do this in other tests here: https://github.com/influxdb/influxdb/blob/master/cmd/influxd/server_integration_test.go#L210

I wouldn't change it for now, but in general, this can lead to a racy test, as well as it artificially inflates our test suite, which adds up over time.

@rothrock
Copy link
Contributor Author

That's counterintuitive. I notice other sleeps in the test code. I also noticed tests that never verify that the data they wrote made it to the db.

@rothrock
Copy link
Contributor Author

our docs say:

Response
Once InfluxDB has accepted this data and safely persisted it to disk, it responds with HTTP 200 OK.

@rothrock
Copy link
Contributor Author

@corylanou @toddboom Can I merge this?

@corylanou
Copy link
Contributor

When we are done reviewing a PR, we typically give it a +1 or 🚢, etc. when it is ready to merge. I think there are reviews going on yet. I know I'm not done yet.

@rothrock
Copy link
Contributor Author

Sure, @corylanou, take your time. We've never really discussed procedures around here.

if err != nil {
return nil, err
}
return resp.Body, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably need to check the status code as well to see if we got an ok response, if not, return some type of error.

We do something like that here:

https://github.com/influxdb/influxdb/blob/master/client/influxdb.go#L121

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking to the Query handler for inspiration when I coded the Dump handler.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. we don't check http response codes for query or ping.

The cited example looks to be a mistake. It seem like the err returned on line 123 will always be nil. In the unlikely case that it is non-nil, the error returned doesn't speak to the problem the http client encountered. The error returned is in the context of json decoding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, now it returns

return resp.Body, fmt.Errorf("HTTP Protocol error %d", resp.StatusCode)

@otoolep
Copy link
Contributor

otoolep commented Mar 17, 2015

On Tue, Mar 17, 2015 at 3:58 PM, Joseph Rothrock notifications@github.com
wrote:

our docs say:

Response
Once InfluxDB has accepted this data and safely persisted it to disk, it
responds with HTTP 200 OK.

Note that it doesn't say anything about the data being indexed in the
time-series database. Safely to disk in this case means sent to the
brokers. The assumption is the Broker are streaming it to the topic logs,
which are on disk.This may need some tweaking with the new Raft work, but
we should be able to build a system that can give this guarantee.

Reply to this email directly or view it on GitHub
#1909 (comment).

@rothrock
Copy link
Contributor Author

@otoolep that's great that safely-to-disk means "written to a broker," but that's not intuitive.
Also, I don't see the set-and-test pattern in the test coverage that such behavior would imply.

if err != nil {
fmt.Printf("Dump failed. %s\n", err)
} else {
scanner := bufio.NewScanner(response)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can do:

_, err := io.Copy(os.Stdout, response)

See this example
http://play.golang.org/p/lK2kFDJUw0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The response is line-oriented, so this felt like a natural way to do it.
The code works.

I will change it if you insist.

@corylanou
Copy link
Contributor

@rothrock ok, done with my review. Let me know if you have any questions. Super excited to see this in production! Thanks!

@toddboom
Copy link
Contributor

@rothrock For propagation delays, I think something like this is the preferred solution (whenever possible) going forward:
https://github.com/influxdb/influxdb/blob/master/cmd/influxd/server_integration_test.go#L210

That way we can timebox our tests to some upper bound, but keep them responsive.

@pauldix
Copy link
Member

pauldix commented Mar 18, 2015

@otoolep if that's what the docs say about writes, they should be updated as they're inaccurate. Writes will return a 200 once a quorum of the raft cluster has verified the write.

However, at that stage the write is only in memory. They haven't necessarily fsync'd the write to disk. This means there is a potential for data loss on a confirmed write if you have a catastrophic failure among the brokers before they manage to sync the write to disk.

This is fine, but it should be accurate in the documentation.

We'll probably add some configuration option later to force the raft servers to fsync before every commit to the raft log, but that's not the default behavior.

@otoolep
Copy link
Contributor

otoolep commented Mar 18, 2015

Docs clarified @pauldix

rename dump flag

put an error response body if json marshalling breaks.

detect and respond to http error codes in the client.
@pauldix
Copy link
Member

pauldix commented Mar 19, 2015

@otoolep awesome, thanks!

@toddboom
Copy link
Contributor

@corylanou how's she looking now? 👍 from me to merge if you think it's gtg.

@corylanou
Copy link
Contributor

+1 🚢

toddboom added a commit that referenced this pull request Mar 19, 2015
Add a `dump` command for exporting data.
@toddboom toddboom merged commit e2391d9 into master Mar 19, 2015
@toddboom toddboom deleted the export branch March 19, 2015 22:07
mark-rushakoff pushed a commit that referenced this pull request Jan 11, 2019
…rated

Switch buckets ui to use generated client
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants