TSDB data import tool for OpenMetrics format. #5887
Conversation
(Copying my response to the last comment on the earlier PR for clarity: prometheus-junkyard/tsdb#671 (comment)) @codesome If I understand correctly, you're looking for a block structure as follows:
One roadblock I see is that we would have to take down the running TSDB instance in case the data being imported overlaps with the current data, and then force a compaction. The default mode of operation would only work if the data does not overlap, but we could certainly add a force-compaction stage for overlapping data using a flag or some such.
Sorry for the long delay.
Enough time has passed and no one has commented on the format, so let's leave it as expfmt; we can add another one if/when needed.
@dipack95 sorry for the delay too.
And there is already a flag to allow overlapping blocks, which takes care of merging the blocks. So if that flag is not enabled, then yes, you have to restart Prometheus with that flag enabled.
Thanks for the review @krasi-georgiev, @codesome, I should have some free time this weekend to implement your suggestions. Hopefully, I'll have a refreshed PR by next week.
I've made quite a few changes in the last few commits, to bring them in line with your suggestions.
There are a few other minor changes that are yet to be implemented, like shuffling around a couple of util functions to be part of
I still think that this is unnecessary. Passing ranges leaves the responsibility with the user, and we can just give some examples of best practices. Different users will have different requirements. Imagine the case of Thanos, which I think only uploads 2h blocks. In that case you would want all imported blocks to be 2h.
After a chat with @codesome I do agree that passing block ranges would be too much hassle for the user, so let's leave it as is. @bwplotka In the case of Thanos, do you have any idea/suggestion how backfilling/importing data should work?
In any case, if the importer cannot find any existing blocks to try and align the imported data with, it cuts the imported data into 2h blocks, so that should help with the Thanos use case.
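For context, Prometheus' default block range is two hours, so cutting at 2h boundaries can be sketched like this (a minimal illustration; `alignedBlockStart` is a hypothetical helper, not the PR's code):

```go
package main

import "fmt"

// blockRange is Prometheus' default block duration: 2h in milliseconds.
const blockRange int64 = 2 * 60 * 60 * 1000

// alignedBlockStart is a hypothetical helper: it floors a non-negative
// timestamp to the start of the 2h window containing it, which is how
// imported samples could be cut into 2h-aligned blocks.
func alignedBlockStart(ts int64) int64 {
	return ts - (ts % blockRange)
}

func main() {
	// A sample at 02:30:00 (in ms since epoch) lands in the block
	// starting at 02:00:00, i.e. 7200000 ms.
	fmt.Println(alignedBlockStart(2*60*60*1000 + 30*60*1000)) // prints 7200000
}
```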
Odd, one of the tests, in a completely unrelated part of the code, has failed.
Yes, this is a flaky test, and I think Simon is working on a fix.
Alright, looks like the builds + tests went through fine this time!
@krasi-georgiev @codesome Does the PR look to be in good shape now?
Regarding exporting the
Could you add the changes that trigger the error when part of tsdbutil? Then I can also have a look.
@krasi-georgiev I've added the breaking changes. The CircleCI test breaks with the error I receive on my machine; the Travis build goes through that bit fine but breaks at the test stage, complaining about circular dependencies.
I looked into it, and the only way to avoid the circular import is to put The problem is that the tsdb tests use
Okay, I'll revert my most recent commit then, to allow the builds to go through + merge.
I still think we should reuse these funcs, so please put them in a file tsdbutil.go in the tsdb root.
Oh, I thought that the new issue you've opened would take care of that as well.
My idea is to reuse the func anyway, and in the issue just refactor the code to allow a better package structure.
Makes sense, I've moved the functions to
I'm not sure this is capable of dealing with real-world data.
ping @dipack95
Nice!
It's looking very good. I plan to extend it a bit with CSV support in a next PR based on yours 🤗
The only major comment is related to input. I would suggest just specifying an output dir, nothing else. What do you think? Explained below.
cc @gouthamve
An example import command is: `tsdb import rrd_exported_data.txt /var/lib/prometheus/ --max-samples-in-mem=10000`

You can increase `max-samples-in-mem` to speed up the process, but the default value of 10000 seems a good balance. This tool creates all Prometheus blocks (see [On-disk layout][On-disk layout] above) in a temporary workspace. By default the temp workspace is /tmp/, or whatever the $TMPDIR env var points to, so you can change it if you have disk-space issues (`TMPDIR=/new/path tsdb import [...]`)!
I need to agree here.
I would simplify configuration here. What about just the following flow:
Create new blocks in a given directory. It's up to the user if this dir is tmp, the Prometheus dir, or White House Google Drive. Does not matter. We just write blocks as `<ID>.tmp` and use `os.Rename` to avoid partial reads. Done.
What do you think? (:
```go
	"strings"
	"testing"

	labels2 "github.com/prometheus/prometheus/pkg/labels"
```
No need for this being `labels2` anymore (:
```go
	},
	{
		ToParse: `# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
```
Nice, but I am missing a test that actually tests import of an OpenMetrics/Prom exposition with multiple samples 🤔
Is this supported?
Yes, the test named TestImportIntoExistingDB is for exactly that case, where we import multiple samples into the DB.
Why not have this in the same table test? Why complicate it? (:
Also, I checked that test:
```go
// genOpenMetricsText formats the given series data into OpenMetrics compliant format.
func genOpenMetricsText(metricName, metricType string, series []tsdb.MetricSample) string {
	str := fmt.Sprintf("# HELP %s This is a metric\n# TYPE %s %s", metricName, metricName, metricType)
	for _, s := range series {
		str += fmt.Sprintf("\n%s%s %f %d", metricName, labelsToStr(s.Labels), s.Value, s.Timestamp)
	}
	str += fmt.Sprintf("\n# EOF")
	return str
}
```
This does not allow me to inject many samples for a single series, just a single sample, no?
So how does a user import a series with, let's say, 100 samples? (:
```go
// It iterates through the given durations, creating progressive blocks, and once it reaches
// the last given duration, it divvies all the remaining samples into blocks of that duration.
// Input blocks are assumed sorted by time.
func binBlocks(blocks []*newBlockMeta, durations []int64) [][]*newBlockMeta {
```
What does `bin` mean here? Can we have a more verbose function name? (:
Yeah, that function basically creates bins, or tranches, of the given samples corresponding to the provided durations. I thought bin was an accurate term for it.
bin like... garbage bin? 🤔
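Terminology aside, the binning described in the doc comment above can be illustrated on plain timestamps (a hypothetical, much-simplified analogue of `binBlocks`, which really operates on block metas):

```go
package main

import "fmt"

// binTimestamps walks the given durations, cutting progressively sized bins
// from the sorted timestamps; once the last duration is reached, all
// remaining samples are divided into bins of that size. Gaps in the data
// may produce empty bins, which is fine for this sketch.
func binTimestamps(ts []int64, durations []int64) [][]int64 {
	var bins [][]int64
	if len(ts) == 0 || len(durations) == 0 {
		return bins
	}
	di := 0
	cur := []int64{}
	end := ts[0] + durations[di]
	for _, t := range ts {
		for t >= end {
			bins = append(bins, cur)
			cur = []int64{}
			if di < len(durations)-1 {
				di++ // advance to the next duration until the last one
			}
			end += durations[di]
		}
		cur = append(cur, t)
	}
	return append(bins, cur)
}

func main() {
	// First a 10-unit bin, then 5-unit bins for everything remaining.
	fmt.Println(binTimestamps([]int64{0, 3, 9, 12, 14, 18}, []int64{10, 5}))
}
```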
```go
		return err
	}

	level.Debug(logger).Log("msg", "copying newly created blocks from temp location to actual DB location")
```
What if location is remote?
I haven't tested that specific case, but that seems like something the OS / golang libs themselves should handle, right?
How? The key is to not overcomplicate the program. I would suggest a simple thing: just an output dir and nothing more. (:
```go
	testutil.Assert(t, r.Len() > 0, "import text has been completely read")
}

// func benchVirtual(vThresh int, b *testing.B) {
```
Avoid commented, dead code in the PR. It makes this PR look like it's unfinished 🤗
Yeah I think I meant to remove this before merge.
```go
// See the License for the specific language governing permissions and
// limitations under the License.

package importer
```
Suggested change:
```diff
-package importer
+package import
```
Usually packages are nouns; internal structs can then be `-er`s.
Fair enough
Also cc @JessicaGreben as there is much to reuse from this PR.
Also, I would be curious if we can find a better name for this command.
```go
	dumpMinTime    = dumpCmd.Flag("min-time", "minimum timestamp to dump").Default(strconv.FormatInt(math.MinInt64, 10)).Int64()
	dumpMaxTime    = dumpCmd.Flag("max-time", "maximum timestamp to dump").Default(strconv.FormatInt(math.MaxInt64, 10)).Int64()
	importCmd      = cli.Command("import", "import samples from file containing information formatted in the Open Metrics format. Please refer to the storage docs for more details.")
	importFilePath = importCmd.Arg("file path", "file to import samples from (must be in Open Metrics format)").Required().String()
```
I would just ask for input (: so we can do `cat X | promtool import /path/here` (:
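The pipe-friendly behaviour suggested here is commonly implemented by falling back to stdin when no path (or `-`) is given. A minimal sketch (`openInput` is a hypothetical helper, not part of the PR):

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// openInput returns the import source: stdin when the path is empty or "-"
// (so the tool composes with pipes), otherwise the named file. Stdin is
// wrapped in NopCloser so the deferred Close never closes the real stdin.
func openInput(path string) (io.ReadCloser, error) {
	if path == "" || path == "-" {
		return io.NopCloser(os.Stdin), nil
	}
	return os.Open(path)
}

func main() {
	r, err := openInput("-")
	if err != nil {
		panic(err)
	}
	defer r.Close()
	n, err := io.Copy(io.Discard, r)
	if err != nil {
		panic(err)
	}
	fmt.Println("read", n, "bytes from input")
}
```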
I have an almost-ready PR with this improvement if you want to take a look.
```go
	dumpPath    = dumpCmd.Arg("db path", "database path (default is "+defaultDBPath+")").Default(defaultDBPath).String()
	dumpMinTime = dumpCmd.Flag("min-time", "minimum timestamp to dump").Default(strconv.FormatInt(math.MinInt64, 10)).Int64()
	dumpMaxTime = dumpCmd.Flag("max-time", "maximum timestamp to dump").Default(strconv.FormatInt(math.MaxInt64, 10)).Int64()
	importCmd   = cli.Command("import", "import samples from file containing information formatted in the Open Metrics format. Please refer to the storage docs for more details.")
```
Probably it should rather be `gen-block` instead of `import` 🤔
I think `import` makes sense because you're generating blocks and placing them in the db after the fact, which is an import operation.
So we should not place them into the DB; that's something I would separate out of this tool. Separation of concerns, essentially.
```go
	importFilePath           = importCmd.Arg("file path", "file to import samples from (must be in Open Metrics format)").Required().String()
	importDbPath             = importCmd.Arg("db path", "database path").Required().String()
	importMaxSamplesInMemory = importCmd.Flag("max-samples-in-mem", "maximum number of samples to process in a cycle").Default("10000").Int()
	importMaxBlockChildren   = importCmd.Flag("max-block-children", "maximum number of children a block can have at a given time").Default("20").Int()
```
Why do we care to align blocks? 🤔 It might make sense, but it really requires a good documentation explanation.
Cool, let's document it more 🤗
@dipack95 are you on CNCF slack (https://slack.cncf.io/) maybe? So we can talk offline as well? 🤗 I am @bwplotka there.
```go
	dumpPath    = dumpCmd.Arg("db path", "database path (default is "+defaultDBPath+")").Default(defaultDBPath).String()
	dumpMinTime = dumpCmd.Flag("min-time", "minimum timestamp to dump").Default(strconv.FormatInt(math.MinInt64, 10)).Int64()
	dumpMaxTime = dumpCmd.Flag("max-time", "maximum timestamp to dump").Default(strconv.FormatInt(math.MaxInt64, 10)).Int64()
	cli         = kingpin.New(filepath.Base(os.Args[0]), "CLI tool for tsdb")
```
Just curious, why is this tool in the TSDB tool? Is this really what we want? 🤔 Just double checking; I think we all talked about promtool. Overall it looks to me like it would be nice to move the tsdb tool to a subcommand, promtool tsdb... but we can do that as a separate piece of work (:
```go
	decBuf := bytes.NewBuffer(encSample)
	sample := tsdb.MetricSample{}
	if err := gob.NewDecoder(decBuf).Decode(&sample); err != nil {
		level.Error(logger).Log("msg", "failed to decode current entry returned by file scanner", "err", err)
```
Let's avoid things like this: either log the error or return it; let's not handle errors twice (:
```go
// sampleStreamer returns a function that can be used to parse an OpenMetrics compliant
// byte array, and return each token in the user-provided byte buffer.
func sampleStreamer(buf *bytes.Buffer) func([]byte, bool) (int, []byte, error) {
```
I am not a fan of this function. Why do we encode twice? We parse, then encode into a buffer with `enc := gob.NewEncoder(buf)`. Why do we do that?
```go
	}
	defer db.Close()

	blocks, err := db.Blocks()
```
I don't think we can rely on being in the Prometheus DB path. Think about cases when you run Prometheus on remote machines.
Based on #5887. Thanks for your work so far @dipack95, it helped a lot! Changes on top of @dipack95:

* Addressed all review comments
* Use subcommands for different formats
* Simplified block creation; no need to be so complex for a first iteration.
* Simplified and separated concerns. No need to have access to the DB. Block writing is separated as well, for ease of benchmarking and testing. This will also be needed by @JessicaGreben
* Added import support for different formats.
* Removed all tests - those had to be pulled over and adjusted ):

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Hey, since you seem busy @dipack95 and we all would love to have this in, I created a separate PR with your work, plus the CSV format. I had to simplify a lot to allow different formats. Also applied all review comments from here: #7586. Please let me know if you still want to help, I would love to see us moving this forward together. Feel free to propose changes to
Thank you @bwplotka! I was planning to work on this this weekend, but it's great that you're using my work as a base.
Now, which is the stable import tool: tsdb or promtool?
The CSV import never really worked. It's actually tricky to define a reasonable structure of a CSV that would nicely fit the Prometheus data model. The most relevant open issue is probably #7119.
Thanks, that's great. I've already switched to promtool.
Created a tool to import data formatted according to the Prometheus exposition format. The tool can be accessed via the TSDB CLI.
closes #535
Signed-off-by: Dipack P Panjabi dipack.panjabi@gmail.com
(Port of prometheus-junkyard/tsdb#671)