Added support for deleting UID mappings #291

Closed
wants to merge 2 commits into from

6 participants

@jmangs
jmangs commented Feb 28, 2014

Right now the CLI tools have no way to delete the UID mappings that show up when you run API commands like suggest and query. You can delete the data using tsdb scan --delete, but the metrics will still come back in suggest and in queries (with no data if you delete it all). We have a bunch of old metrics that no longer have data but still show up in suggest, which is a pain in metrilyx because we can't show only the pertinent metrics.

To address this, I've added a command called delete to tsdb uid:

tsdb uid delete metrics sys.cpu.1 would delete the forward/reverse UID mappings for the specified metric. I'm not 100% sure how safe this is or how much data (if any) is left over, but this lets you at least permanently remove it. Also, you may have to forcibly call dropcaches to get the metric to unregister.

Note: Deleting the UID before cleaning up its data means you cannot use the built-in CLI commands like scan on the metric anymore. You will have to clean it up in HBase yourself, so be warned.
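
To make the forward/reverse terminology concrete, here is a minimal in-memory sketch in Python. It is not the code in this PR: the `UidTable` class and its method names are hypothetical, and it only models the idea that each name has a forward (name to UID) and a reverse (UID to name) entry, both of which have to go before the name stops showing up in suggest.

```python
class UidTable:
    """Toy stand-in for one UID kind (e.g. metrics); not OpenTSDB's real class."""

    def __init__(self):
        self.forward = {}  # name -> UID bytes
        self.reverse = {}  # UID bytes -> name

    def assign(self, name, uid):
        """Record both directions of the mapping."""
        self.forward[name] = uid
        self.reverse[uid] = name

    def delete(self, name):
        """Remove both mappings for `name`; raises KeyError if it is unknown."""
        uid = self.forward.pop(name)
        self.reverse.pop(uid, None)
        return uid

    def suggest(self, prefix):
        """Return known names starting with `prefix`, sorted."""
        return sorted(n for n in self.forward if n.startswith(prefix))


metrics = UidTable()
metrics.assign("sys.cpu.0", b"\x00\x00\x01")
metrics.assign("sys.cpu.1", b"\x00\x00\x02")
metrics.delete("sys.cpu.1")
print(metrics.suggest("sys.cpu"))
```

In this toy model the deleted name disappears from `suggest` immediately; in a real deployment cached entries may linger, which is why dropcaches is mentioned above.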

Anyways, this is just meant to be an administrative tool for cleaning up garbage like rename is. If there are any caveats to this approach it would be good to know as well.

@manolama
OpenTSDB member

Hah, I was just about to start this too so thanks! I think it's important for the delete command to also delete the data points associated with a UID though. It doesn't really affect metrics, but if you scan a time series that includes a row with a deleted tagk or tagv UID, it will throw an error. We'll get David's patch from #243 in there and that will let us skip over the bad row, but for now it's best to try and delete those data points.

For metrics it's super simple: you'd just set up a scanner with a start and stop on the metric UID and delete any rows you find. For tagk and tagv though, it's a big pain since you have to scan the whole data table and look for rows with that UID in the proper position (depending on whether it's a name or a value). It may make sense to run a map-reduce job for this kind of situation to push the work off to the region servers.
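
A rough Python sketch of the two cases, just to illustrate the difference. It assumes the default data-table row key layout without salting (3-byte metric UID, 4-byte base timestamp, then 3-byte tagk/tagv UID pairs); the function names are made up for this example and no HBase client is involved.

```python
UID_WIDTH = 3        # assumed default UID width in bytes
TIMESTAMP_WIDTH = 4  # assumed base-timestamp width in bytes


def metric_scan_range(metric_uid):
    """Return (start, stop) row keys covering every row of one metric.

    `stop` is exclusive. It is None when the metric UID is the largest
    possible value, meaning "scan to the end of the table".
    """
    if len(metric_uid) != UID_WIDTH:
        raise ValueError("metric UID must be %d bytes" % UID_WIDTH)
    nxt = int.from_bytes(metric_uid, "big") + 1
    if nxt >= 1 << (8 * UID_WIDTH):
        return metric_uid, None
    return metric_uid, nxt.to_bytes(UID_WIDTH, "big")


def row_uses_tag_uid(row_key, uid, kind):
    """Check whether `uid` appears in a tagk or tagv slot of `row_key`."""
    if kind not in ("tagk", "tagv"):
        raise ValueError("kind must be 'tagk' or 'tagv'")
    tags = row_key[UID_WIDTH + TIMESTAMP_WIDTH:]
    pair = 2 * UID_WIDTH
    # tagk sits in the first half of each pair, tagv in the second half
    offset = 0 if kind == "tagk" else UID_WIDTH
    for i in range(0, len(tags) - pair + 1, pair):
        if tags[i + offset:i + offset + UID_WIDTH] == uid:
            return True
    return False


# One row: metric 000001, some base time, one tag pair (tagk 000002, tagv 000003)
row = b"\x00\x00\x01" + b"\x50\x00\x00\x00" + b"\x00\x00\x02" + b"\x00\x00\x03"
start, stop = metric_scan_range(b"\x00\x00\x01")
```

The metric case needs only the `(start, stop)` pair, so the scan touches one contiguous key range. The tag case has to apply `row_uses_tag_uid` to every row in the table, which is why pushing the work to the region servers is attractive.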

And there are some caveats in that with 2.0 there are the meta and tree tables that may include the UIDs so some cleanup needs to be done there.

Let me know if all of that is something you'd have a bit of time to work on, otherwise I can pull your work so far and get started on the rest in another week or so. This will all make it into 2.1.

@jmangs
jmangs commented Feb 28, 2014

I'd say take what I have now - I might not have the free time to work on this in the coming weeks so I want to avoid committing to it and then not having the time to do it.

I could also add a constraint to only allow the delete command on metrics in the meantime, so if someone just starts trying to delete them it won't let them.

jan-mangs Added data cleanup for metrics deletion. Added UnsupportedOperationException for tagk/tagv UID deletes for the time being.
a526b74
@jmangs
jmangs commented Feb 28, 2014

I've added the data deletion portion for metrics. For now, I've made it throw an exception if you try to delete tagk/tagv UIDs. That's probably safer for now, since you could accidentally delete a lot of data if you delete a commonly used tag key, for example.
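
The guard amounts to something like the following Python sketch. This is only an illustration of the behaviour, not the Java in the commit: the function name is hypothetical, and `NotImplementedError` stands in for Java's UnsupportedOperationException.

```python
def check_uid_kind_deletable(kind):
    """Raise unless `kind` is a UID type this sketch allows deleting."""
    if kind == "metrics":
        return
    if kind in ("tagk", "tagv"):
        # Refuse for now: one common tag key can touch a huge number of rows.
        raise NotImplementedError("deleting %s UIDs is not supported yet" % kind)
    raise ValueError("unknown UID type: %r" % kind)
```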

This was referenced Jun 3, 2014
@mxk1235
mxk1235 commented Oct 31, 2014

any update on when this will get merged in?

thanks in advance!

@Misenko Misenko added a commit to Misenko/opentsdb that referenced this pull request Jun 26, 2015
@Misenko Misenko applied pull request 1efc8d2
@gsaray101

Is this merged with the latest master? What is the actual command to remove the metric from the tsdb? scan --delete is not working for me either; I still see the metric in the uid table.

@mxk1235

I think it's probably better to put this sort of thing in the LOG, rather than console.

@mxk1235

LOG, rather than console?

@mxk1235

would love this functionality

@sidhhu
sidhhu commented Jul 30, 2015

@manolama Just looking through this. Well, if it is necessary to delete the tagk and tagv from the data table, then probably a better approach would be to do it at query time with compaction (similar to how HBase major compaction works). But that would kill read performance, so it could be an option whether to delete or not.

@manolama
OpenTSDB member

Cleaned up a bit and merged in 58ead0f

@manolama manolama closed this Sep 10, 2015
@xmj
xmj commented Sep 30, 2015

When will this merge make it into a release?

@manolama
OpenTSDB member

It's in 2.2RC1 already.
