Added support for deleting UID mappings #291
Hah, I was just about to start this too so thanks! I think it's important for the delete command to also delete the data points associated with a UID though. It doesn't really affect metrics, but if you scan a time series that includes a row with a deleted tagk or tagv UID, it will throw an error. We'll get David's patch from #243 in there and that will let us skip over the bad row, but for now it's best to try and delete those data points. For metrics it's super simple: you'd just set up a scanner with a start and stop on the metric UID and delete any rows you find. For tagks and tagvs though, it's a big pain since you have to scan the whole data table and look for rows with that UID in the proper position (depending on whether it's a tag name or a tag value). It may make sense to run a map-reduce job for this kind of situation to push the work off to the region servers. There are also some caveats: with 2.0 there are the meta and tree tables that may include the UIDs, so some cleanup needs to be done there. Let me know if all of that is something you'd have a bit of time to work on, otherwise I can pull your work so far and get started on the rest in another week or so. This will all make it into 2.1.
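To make the difference between the two cases concrete, here is a minimal in-memory sketch of the row selection described above. It is not the code from this pull request and does not talk to HBase. It assumes the default unsalted data-table row key layout of a 3-byte metric UID, a 4-byte base timestamp, then 3-byte tagk/tagv UID pairs; the function names are hypothetical.

```python
# Sketch only: assumes default 3-byte UIDs, a 4-byte base timestamp, no salting.
METRIC_WIDTH = 3
TIMESTAMP_WIDTH = 4
TAGK_WIDTH = 3
TAGV_WIDTH = 3


def metric_scan_range(metric_uid: bytes) -> tuple[bytes, bytes]:
    """Return (start, stop) row keys covering every row of one metric.

    start is inclusive, stop is exclusive. An empty stop means "scan to
    the end of the table", which is needed when the UID is all 0xFF.
    """
    if len(metric_uid) != METRIC_WIDTH:
        raise ValueError("unexpected metric UID width")
    if metric_uid == b"\xff" * METRIC_WIDTH:
        return metric_uid, b""
    stop = (int.from_bytes(metric_uid, "big") + 1).to_bytes(METRIC_WIDTH, "big")
    return metric_uid, stop


def row_uses_tag_uid(row_key: bytes, uid: bytes, kind: str) -> bool:
    """Check whether uid appears in the tagk or tagv slot of any tag pair."""
    if kind not in ("tagk", "tagv"):
        raise ValueError("kind must be 'tagk' or 'tagv'")
    tags = row_key[METRIC_WIDTH + TIMESTAMP_WIDTH:]
    pair_width = TAGK_WIDTH + TAGV_WIDTH
    if len(tags) % pair_width != 0:
        raise ValueError("malformed row key")
    for i in range(0, len(tags), pair_width):
        tagk = tags[i:i + TAGK_WIDTH]
        tagv = tags[i + TAGK_WIDTH:i + pair_width]
        if (kind == "tagk" and tagk == uid) or (kind == "tagv" and tagv == uid):
            return True
    return False


def rows_to_delete(row_keys: list[bytes], uid: bytes, kind: str) -> list[bytes]:
    """Select the rows that reference uid for the given UID kind."""
    if kind == "metrics":
        # Cheap case: a bounded range on the key prefix.
        start, stop = metric_scan_range(uid)
        return [k for k in row_keys if k >= start and (not stop or k < stop)]
    # Expensive case: every row in the table has to be inspected.
    return [k for k in row_keys if row_uses_tag_uid(k, uid, kind)]
```

The metric case only touches keys inside one contiguous range, while the tagk/tagv case has to look at every row, which is why pushing that work to the region servers is attractive.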
I'd say take what I have now. I might not have the free time to work on this in the coming weeks, so I want to avoid committing to it and then not having the time to do it. In the meantime I could also add a constraint that only allows the delete command on metrics, so that anyone who tries to delete a tagk or tagv UID is blocked.
…ception for tagk/tagv UID deletes for the time being.
I've added the data deletion portion for metrics. For now, I've made it throw an exception if you try to delete tagk/tagv UIDs. That's probably safer for now, since you could accidentally delete a lot of data if you delete a commonly used tag key, for example.
Any update on when this will get merged in? Thanks in advance!
Is this merged with the latest master? What is the actual command to remove the metric from the tsdb? scan --delete is not working for me either. I still see the metric in the uid table.
@manolama Just looking through this. If it is necessary to delete the tagk and tagv from the data table, then a better approach would probably be to do it at query time with compaction (similar to how HBase major compaction works). But that would kill the read performance, so it could be an option whether to delete or not.
Cleaned up a bit and merged in 58ead0f
When will this merge make it into a release?
It's in 2.2RC1 already.
How can we delete the UID for a specific tag value of a given metric? For example, if we have metrics like:
How can I delete the UID of only the first metric?
Right now there's no support built into the CLI tools to delete the UID mappings that show up when you run API commands like `suggest` and `query`. You can delete the data using `tsdb scan --delete`, but the metrics will still come back in suggest and in queries (with no data if you delete it all). We have a bunch of old metrics which no longer have data, but they still show up in suggest. This becomes a pain in metrilyx because we can't show only pertinent metrics.

To address this, I've added a command called `delete` to `tsdb uid`: `tsdb uid delete metrics sys.cpu.1` would delete the UID forward/reverse mappings for the specified UID. I'm not 100% sure how safe this is or how much data (if any) is left over, but this lets you at least permanently remove it. Also, you may have to forcibly call `dropcaches` to get the metric to unregister.

Note: Deleting the UID before cleaning up its data means you cannot use the built-in CLI commands like scan on the metric anymore. You will have to clean it up in HBase yourself, so be warned.
Anyway, like rename, this is just meant to be an administrative tool for cleaning up garbage. If there are any caveats to this approach it would be good to know as well.
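As a rough illustration of why both mappings have to go and why a cache flush may be needed afterwards, here is a toy in-memory model. It is not OpenTSDB's implementation: the `UidTable` class and its methods are hypothetical, and the `cache` dict only stands in for whatever a running TSD keeps in memory.

```python
class UidTable:
    """Toy model of forward/reverse UID mappings plus a lookup cache."""

    def __init__(self):
        self.forward = {}  # (kind, name) -> uid
        self.reverse = {}  # (kind, uid) -> name
        self.cache = {}    # (kind, name) -> uid, stand-in for a TSD's cache

    def assign(self, kind: str, name: str, uid: bytes) -> None:
        self.forward[(kind, name)] = uid
        self.reverse[(kind, uid)] = name

    def lookup(self, kind: str, name: str):
        """Resolve a name to a UID, preferring the cache over storage."""
        key = (kind, name)
        if key in self.cache:
            return self.cache[key]
        uid = self.forward.get(key)
        if uid is not None:
            self.cache[key] = uid
        return uid

    def delete(self, kind: str, name: str) -> None:
        """Remove both mappings from storage; the cache is left untouched."""
        uid = self.forward.pop((kind, name))  # KeyError if the name is unknown
        self.reverse.pop((kind, uid), None)

    def drop_caches(self) -> None:
        self.cache.clear()


table = UidTable()
table.assign("metrics", "sys.cpu.1", b"\x00\x00\x01")
table.lookup("metrics", "sys.cpu.1")         # populates the cache
table.delete("metrics", "sys.cpu.1")
stale = table.lookup("metrics", "sys.cpu.1")  # still answered from the cache
table.drop_caches()
gone = table.lookup("metrics", "sys.cpu.1")   # now resolves to nothing
```

In this model the name keeps resolving after the delete until the cache is cleared, which mirrors the need to call `dropcaches` mentioned above.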