
[feature request] Insert new tags to existing values, like update #3904

Closed
mvadu opened this issue Aug 31, 2015 · 61 comments

@mvadu
Contributor

mvadu commented Aug 31, 2015

Can we have a query syntax which allows inserting/attaching a new set of tags (alongside the existing ones) to values/rows that are already part of a measurement?

My use case: I created a measurement from a PerfMon log, which already has a Host= tag. Now I want to categorize the data by application, so I want to add tags like "App1=, App2=", assuming I can have two apps hosted on the same server.

Then I want to be able to say Update <measurement name> add <tag name=value> where <some condition based on tags>
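For example (hypothetical syntax, not supported by InfluxQL today; the measurement and values below are invented purely for illustration):

Update perfmon add App1=true where Host=webserver01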

@beckettsean beckettsean added this to the Longer term milestone Aug 31, 2015
@airyland

airyland commented Sep 1, 2015

+1

@beckettsean
Contributor

This feature would require a fairly major architectural change to the database and is not something we will do this year, if ever. Closing the request for now.

@mvadu
Contributor Author

mvadu commented Sep 2, 2015

Please consider this during your next architecture review. I think this would be required in many enterprise situations, where you make a mistake while inserting values, and realize it later. Having an option to update would be a life saver!

@ivanscattergood

Hi,
I have two specific use cases for this functionality:

  1. Obfuscation of data for testing / demo purposes.
    As the data we deal with is market data, the tags may actually refer to a ticker, and given the other attributes of the data you can work out trade information. I am currently working on a data set which I have been asked to obfuscate for a demo. To achieve this I am loading the dataset into memory and writing it back with new tags; a rough sketch of this in-memory approach follows this list. (This would be simplified by the INSERT INTO functionality proposed in #211, "Should be able to force recalculation of continuous query for given time interval".)

  2. Change of the Tag value in the real world
    Imagine that one of the tags in your data referenced the Name of a department, if that department name changed it would be good to be able to change it historically, rather than having to have the two tag values running in tandem.
    In the world of Data Warehousing this is known as slowly changing dimensions:
    http://datawarehouse4u.info/SCD-Slowly-Changing-Dimensions.html
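
For use case 1, a minimal sketch of that load/rewrite/re-insert approach with the influxdb Python client might look like the following; the database, measurement and tag names (market_data, trades, ticker) are placeholders rather than the real schema:

from influxdb import InfluxDBClient

# Placeholder names: database 'market_data', measurement 'trades', tag 'ticker'
client = InfluxDBClient('localhost', database='market_data')
result = client.query('SELECT * FROM "trades" GROUP BY *')

aliases = {}   # real ticker -> opaque placeholder, stable within this run
points = []
for (measurement, tags), rows in result.items():
    fake = aliases.setdefault(tags['ticker'], 'TICKER_%03d' % len(aliases))
    for row in rows:
        points.append({
            'measurement': 'trades_demo',
            # keep all original tags, but swap the sensitive one for its alias
            'tags': dict(tags, ticker=fake),
            'time': row['time'],
            'fields': {k: v for k, v in row.items() if k != 'time' and v is not None},
        })

client.write_points(points, batch_size=5000)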

@rapport

rapport commented Jun 17, 2016

+1 For us, events are usually only recognised in the data following a reasoning step after capture, and we want to tag data with its event once it is recognised. I am considering using continuous queries to do some of this tagging after capture, but it seems inefficient and I am not sure it will work for all our use cases.

@srikara

srikara commented Jun 17, 2016

+1

@mrh666

mrh666 commented Jun 27, 2016

+1
This would be a surprisingly useful feature! Sometimes our developers send metrics that don't follow the namespace requirement, but do follow some pre-defined, hardcoded application name structure (the Storm metrics collector, for example), and those metrics have a lot of value for research.

@jarlwolfganger

+1, if you guys aren't going to add this any time soon, could you at least add workaround options to the current documentation, since there is nothing mentioning how to append tags to an existing measurement.

@ghost

ghost commented Jul 24, 2016

+1 this is definitely needed in many real-world solutions as architectural changes might need additional tags.

@hgomez-sonarsource

+1 to adding this feature, since as of today there is no option other than to dump a measurement as ASCII, hack the ASCII file, and send it back to a new measurement via the wire protocol.

@hgomez-sonarsource

For now, I updated my InfluxDB Fetcher tool (https://github.com/hgomez/influxdb) to transform fields into tags so I could reimport them in the proper format.

@jincejames

+1

3 similar comments
@szll

szll commented Oct 13, 2016

+1

@autumnw

autumnw commented Oct 24, 2016

+1

@andreaaizza

+1

@ryanmills

+1, please provide the ability to update existing tags

@geethanjalieswaran

+1

@mvadu
Contributor Author

mvadu commented Dec 13, 2016

@beckettsean Hi Sean, since this request was closed quite a few folks have expressed their interest in and support for this feature. Do you guys want to reconsider it?

@ryanmills

ryanmills commented Dec 13, 2016

Additional use cases:

  1. Adding a new tag to an existing schema -- it's crazy that you can't do this!

  2. You might find that you suddenly need to GROUP BY an existing field value, so you must then update the records to remove the value as a field and add it as a tag instead.

  3. Deletes are expensive, so you could instead update a tag to mark a record as active/disabled and include the status tag in queries (illustrated just below).
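
As an illustration of point 3, queries would then simply filter on the status tag; the measurement and tag names below are hypothetical:

> SELECT mean("value") FROM "sensor_data" WHERE "status" = 'active' AND time > now() - 7d GROUP BY time(1h)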

@hcsaustrup

+1

@hraftery

hraftery commented Dec 23, 2016

Issue #828 concerns renaming databases, measurements, fields and tags. Some of that functionality is provided by INTO. E.g.
SELECT "old_field_key" AS "new_field_key" INTO new_db..new_measurement FROM old_db..old_measurement GROUP BY "old_tag_key"
renames the database, measurement and field key, but cannot be extended to rename the tag key or add new tags.

Given those limitations, it may be better to dump data, convert to line protocol, go nuts with sed to change/add keys, and then POST back.

@hgomez-sonarsource's excellent InfluxDB Fetcher makes this quite quick and easy, but note:

  • It requires maven and java. Easy to install and use, but requires quite a few packages to be downloaded/built/tested/installed.
  • The invocation line is long and complex, but works just fine if you're careful.
  • The extra "data1,data2" arguments are for converting a tag to a field, so don't do anything if you don't have fields by those names. You can achieve the same in your sed script anyway.
  • You probably don't want the 'i' (integer) suffixes on your values because Influx numeric fields are floats by default, so make sure you strip them out in your sed script.

So you can do a simultaneous database rename, measurement rename, field rename, tag rename and tag addition with three chunky lines:

java -cp target/influxdb-fetcher-1.0.0-SNAPSHOT.jar com.github.hgomez.influxdb.InfluxDBFetcher http://127.0.0.1:8086 login password old_db "SELECT * from old_measurement GROUP BY *" > dump.wireproto
sed -i -e "s/old_measurement/new_measurement/;s/old_field_key/new_field_key/;s/old_tag_key/new_tag_key/;s/new_tag_key/additional_tag_key=additional_tag,new_tag_key/;s/i,/,/g;s/i / /" dump.wireproto 
curl -i -POST "http://127.0.0.1:8086/write?db=new_db" --data-binary @dump.wireproto

I used something very similar to convert 20000 points in a couple of seconds. YMMV.

@andyflury

+1

@hraftery

hraftery commented Jan 6, 2017

Just came across a possible improvement on my workaround posted above. Haven't investigated it, but influx_inspect export might be a replacement for InfluxDB Fetcher for this purpose?
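
Something like the following might work (untested sketch; the data/wal paths and the old/new names are placeholders, and the export is database-level only):

influx_inspect export -database old_db -datadir /var/lib/influxdb/data -waldir /var/lib/influxdb/wal -out /tmp/export.txt
# rename the measurement and squeeze in an extra tag, same idea as the sed line above
sed -i -e "s/^old_measurement,/new_measurement,added_tag=added_value,/" /tmp/export.txt
influx -import -path /tmp/export.txt

The exported file also carries DDL/DML header lines (database name, retention policy), so check those before re-importing.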

@einhirn

einhirn commented Jan 25, 2017

@hraftery Nice, but you can only use it at the database level, not the measurement level. The output will get very big, very soon, and I didn't manage to have it write to stdout (i.e. a pipe). Of course you can work around that with mkfifo etc., but it is still a lot of output...

Of course you can also extend the workaround by first "select"ing "into" a new db and so on 😁
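
For example (untested; "scratch" is a placeholder database created just for the export, and GROUP BY * keeps the existing tags intact):

> CREATE DATABASE scratch
> SELECT * INTO scratch.autogen.:MEASUREMENT FROM mydb.autogen./.*/ GROUP BY *

and then run influx_inspect export against the much smaller scratch database.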

@vikrammurugesan

+1 this would be a really useful feature

@mei-rune
Contributor

mei-rune commented Mar 28, 2017

+1, this would be a really useful feature when refactoring an app

@samhatchett

+1 : this could also be accomplished if SELECT INTO could alter or replace tag keys/values.

@JeffAbrahamson

+1

@meesern

meesern commented Aug 25, 2017

+1 to allow tagging measurements with post-processed tags.

@amoondra19

+1

@bkdonline

@beckettsean Can this be reconsidered at this point? It's been two years since the thread was closed, and it might be more feasible architecturally to consider this now!

This would be extremely useful for maintaining data over the medium to long term in an enterprise setting.

@gwijnja

gwijnja commented Nov 1, 2017

+1
Just realized that a CQ by default drops all tags while it downsamples data. The tags in the source table were always the same (network traffic for 1 host, 1 interface), so I never noticed the missing tags in the downsampled table. Now that I'm adding a second host, I realize that all existing downsampled entries need to have the tags added (the same for all entries) before I can add the new host. There's months of data in the table already...

Update:
Found a workaround here to add tags. If anyone has the same problem, then this may work for you as well:

  1. Drop the continuous query
> DROP CONTINUOUS QUERY cq_1m ON network
  2. Run the python script (I adjusted it a bit to my problem) to copy the data into an intermediate table, while adding some tags:
from influxdb import InfluxDBClient

client = InfluxDBClient('localhost', database='network')
db_data = client.query('select tx_bps, rx_bps from downsampled_traffic')
data_to_write = [
        {
            'measurement': 'intermediate',
            'tags': {'host': 'compass', 'if': 'eth0'},
            'time': d['time'],
            'fields': {'tx_bps': float(d['tx_bps']), 'rx_bps': float(d['rx_bps'])}
        }
        for d in db_data.get_points()
    ]
client.write_points(data_to_write)
  3. Drop the original measurement
> DROP MEASUREMENT downsampled_traffic
  4. Reload the intermediate measurement into the original measurement:
> SELECT * INTO downsampled_traffic FROM intermediate GROUP BY *
  5. Now the downsampled table contains the host and interface tags I needed:
> select * from downsampled_traffic limit 3
name: downsampled_traffic
time                host    if   rx_bps        tx_bps
----                ----    --   ------        ------
1504201620000000000 compass eth0 12962.4666667 13043
1504201740000000000 compass eth0 107997.333333 125236.466667
1504201800000000000 compass eth0 50329.7333333 51249.3333333
  6. Drop the intermediate measurement:
> DROP MEASUREMENT intermediate
  7. Recreate the continuous query, now including a GROUP BY for the host and interface (if) tags:
> CREATE CONTINUOUS QUERY cq_1m ON network BEGIN SELECT 8 * derivative(mean(rx), 1s) AS rx_bps, 8 * derivative(mean(tx), 1s) AS tx_bps INTO network.autogen.downsampled_traffic FROM network.autogen.traffic GROUP BY time(1m),host,if END

@gylu

gylu commented Nov 6, 2017

+9000
Having to loop through every single data point to insert or rename a tag is saddening and extremely resource-consuming.

@UVk

UVk commented Nov 17, 2017

+1

@rvolosatovs

Encountered this issue on a production system. We have around a million entries every day, and the current dataset has been accumulating for half a year.
I was going to set up some additional continuous queries, but it turned out that is not possible, as some of the values in measurements which I wanted to GROUP BY are stored as fields, and there is no way to group by "field" types and also no way to convert "field" types to "tag" types.

Work has been started on a simple utility to convert fields to tags in the wire protocol representation.

go get -u github.com/rvolosatovs/influx-taggify

Example usage:

influx_inspect export -database "$db" -datadir "$datadir" -waldir "$waldir" -out /tmp/influx-export
# Delete the measurements you don't need to convert using `sed`/`perl` (i.e. `perl -in -e 'print unless m/^unrelated_measurement.*/' /tmp/influx-export`)
influx-taggify -in /tmp/influx-export -out /tmp/influx-export-tagged fieldFoo fieldBar
# Drop the old database or edit generated file to change the name of the database
influx -import -path /tmp/influx-export-tagged

It worked for my use case, but your mileage may vary.
Try locally on non-critical setup first!
Feel free to try, report issues and contribute! :)

@YEMEAC

YEMEAC commented May 4, 2018

+1 @2018

@jheusser
Contributor

+1

5 similar comments
@javiergarciad

+1

@hor1z0nx

hor1z0nx commented Jun 5, 2018

+1

@longit644

+1

@kikohs

kikohs commented Jul 19, 2018

+1

@galindro

galindro commented Sep 5, 2018

+1

@hraftery

For what it's worth I've revisited this 2 years later and explored all alternative options for exporting/importing data. To save others the hours of frustration, it seems my workaround above is still the easiest way to achieve this and all manner of related export/import tasks.

Lesser alternatives:

  • The influx_inspect tool is complicated, requires sudo, relies on TSM files and dumps the whole db.
  • The HTTP API (e.g. via curl) is great, but it produces JSON that can't be read by influx!
  • The influx tool's -execute flag is promising, but even csv2influx.py can't read the CSV it generates. (Rough invocations of these last two are sketched below.)
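
For reference, the kind of invocations meant by those last two points (details may vary by version; mydb and my_measurement are placeholders):

# HTTP API: returns JSON, which influx -import cannot consume
curl -G 'http://localhost:8086/query?db=mydb' --data-urlencode 'q=SELECT * FROM "my_measurement" GROUP BY *'
# influx CLI: CSV output, but nothing standard converts it back to line protocol
influx -database mydb -format csv -execute 'SELECT * FROM "my_measurement" GROUP BY *' > dump.csv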

@samhatchett

We were trying to solve this problem for a customer - where data may have been inserted incorrectly and we then needed to go back and re-tag a series (or several series). We ended up creating a little node-based migration service. It's solving our problem for now, and our customer has a more friendly means of managing their data. May not be suitable for all cases. You can find it here.

@steverweber

steverweber commented Jan 28, 2019

+1 : this could also be accomplished if SELECT INTO could alter or replace tag keys/values.

this would be nice in a CONTINUOUS QUERY... because the data is being transformed there anyway, new tags would be the most logical addition.

example:

we have a few lab rooms and we use a CQ to find how busy each room is... The rooms are not tagged but are inferred from the host name.
It would be ideal if we could tag the room, like mc3006... but because of InfluxDB's current limitations we are using a new metric for each room.

SELECT sum("value") INTO "thinclient_active_mc3006" FROM "thinclient_active" WHERE ("host" =~ /3006/) GROUP BY time(1m, -5m)
SELECT sum("value") INTO "thinclient_active_mc3007" FROM "thinclient_active" WHERE ("host" =~ /3007/) GROUP BY time(1m, -5m)
SELECT sum("value") INTO "thinclient_active_mc3008" FROM "thinclient_active" WHERE ("host" =~ /3008/) GROUP BY time(1m, -5m)

something like the following would be a nice improvement!

SELECT sum("value"), room::tag=mc3008 INTO "thinclient_active_room" FROM "thinclient_active" WHERE ("host" =~ /3008/) GROUP BY time(1m, -5m), room

@steverweber

steverweber commented Jan 28, 2019

@hraftery please reconsider this feature request.

@hraftery

@steverweber I'm just a user - you probably want to direct your request to a contributor. But I wouldn't hold your breath. This issue was closed back in 2015. There are at least three workarounds described in this thread, but I agree, something along the lines of a SELECT INTO would be far better.

@newskooler

I sincerely wish that this would be reconsidered. It's a must-have :)

@kaszperro

Are there any plans to implement this feature in influxdb 2.0?

@kavinbright

+1

@amotl

amotl commented Feb 1, 2021

Hi there,

following up on the comment by @hraftery at #3904 (comment), everything about @hgomez' excellent InfluxDB Fetcher still holds true, but we just improved the convenience of installation.

You no longer have to build it yourself; instead you can conveniently download a ready-made influxdb-fetcher-1.0.2.jar to your workstation. The new setup instructions at [1] outline how to easily install a wrapper program which will download the .jar file automatically on its first invocation.

With kind regards,
Andreas.

[1] https://github.com/hgomez/influxdb#setup
