Backfilling does not copy tags properly. #6955

Closed
sikadiamond opened this issue Jul 4, 2016 · 2 comments

@sikadiamond

sikadiamond commented Jul 4, 2016

Bug report

I wanted to use backfilling to extract a slice of info from one database into another.
The backfill method does not seem to copy all tags properly into the target.
I expanded the scope for full tables (instead of a time slice), and see the same issue.

System info: Linux 3.19.0-58-generic #64~14.04.1-Ubuntu SMP Fri Mar 18 19:05:43 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

InfluxDB info: InfluxDB shell version: 0.13.0

Steps to reproduce:

Original Database:

$ curl https://s3-us-west-1.amazonaws.com/noaa.water.database.0.9/NOAA_data.txt > NOAA_data.txt
$ influx -import -path=NOAA_data.txt -precision=s
2016/07/04 11:25:15 Processed 1 commands
2016/07/04 11:25:15 Processed 76290 inserts
2016/07/04 11:25:15 Failed 0 inserts
$ influx
Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
Connected to http://localhost:8086 version 0.13.0
InfluxDB shell version: 0.13.0

> use NOAA_water_database
Using database NOAA_water_database
> show series
key
average_temperature,location=coyote_creek
average_temperature,location=santa_monica
h2o_feet,location=coyote_creek
h2o_feet,location=santa_monica
h2o_pH,location=coyote_creek
h2o_pH,location=santa_monica
h2o_quality,location=coyote_creek,randtag=1
h2o_quality,location=coyote_creek,randtag=2
h2o_quality,location=coyote_creek,randtag=3
h2o_quality,location=santa_monica,randtag=1
h2o_quality,location=santa_monica,randtag=2
h2o_quality,location=santa_monica,randtag=3
h2o_temperature,location=coyote_creek
h2o_temperature,location=santa_monica

Ok so far, start backfilling:

> create database test
> use test
Using database test
> select * INTO "test"."default"."h2o_quality" FROM "NOAA_water_database"."default"."h2o_quality"
name: result
time    written
0   17854

All done, show series from table on the original database:

> show series from h2o_quality
key
h2o_quality,location=coyote_creek,randtag=1
h2o_quality,location=coyote_creek,randtag=2
h2o_quality,location=coyote_creek,randtag=3
h2o_quality,location=santa_monica,randtag=1
h2o_quality,location=santa_monica,randtag=2
h2o_quality,location=santa_monica,randtag=3

Show series from table on the target database:

> show series from h2o_quality
key
h2o_quality

This does not look like a desired outcome.

@jsternberg
Contributor

You have to add GROUP BY * to your SELECT * INTO query so that it treats the tags as tags instead of as string field values. From the documentation:

Note: If you use SELECT * with INTO, the query converts tags in the current measurement to fields in the new measurement. This can cause InfluxDB to overwrite points that were previously differentiated by a tag value. Use GROUP BY <tag_key> to preserve tags as tags.
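
Applied to the query from the report, the suggestion would look roughly like the following. This is a sketch that reuses the database, retention policy, and measurement names from the steps above; only the trailing GROUP BY * is new.

```
> select * INTO "test"."default"."h2o_quality" FROM "NOAA_water_database"."default"."h2o_quality" GROUP BY *
```

Running show series from h2o_quality in the target database afterwards should list the location and randtag tags again. The earlier copy already wrote location and randtag into the target as fields, so it is probably cleaner to rerun this against an empty target measurement or a fresh database.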

@jsternberg
Contributor

I'm going to close this for now as I think the question has been answered. If there is something wrong and I haven't resolved your issue appropriately, please respond with more information and I will reopen the issue.
