make normalized non-utf-8 labels unique #24
Hi @khaf what do you think about this change? |
I'm not sure about this. Fixing an invalid UTF8 name is one thing, changing it via appending another string at the end is another. The original fix was already for a very niche situation, what does this fix address? Do you have examples of the problem to demonstrate the issue? |
I agree that it is very much an edge case. Sadly, we hit it. Yes, I do have an example. The problem we encountered is that, after the originally proposed UTF-8 normalization, multiple set labels were normalized to the same name. This fix overcomes that problem. Example log from the exporter:
|
Could you kindly share those set names? How did invalid UTF-8 strings end up as your set names? What part of the Aerospike toolchain was responsible for allowing it? |
It was possible to create sets with binary data via the C client. |
It is possible for me to get the set names but it is probably simpler to imagine the situation
|
Can't we do this differently? As in, show the invalid characters inline and escaped? Something like: []byte{set\xc0} -> "set\xc0" |
I display the escaped value and the hex of the original data (both only if it is invalid UTF-8). It could also be only the hex; I think that is sufficient. Your proposal is also nice. How would you encode it to make sure that it won't be interpreted by Prometheus? |
@khaf what do you think about this? I know it is not a big issue. I just think it is a better solution than the first one. |
Prometheus is written in Go, so I expect it to be fully utf8 compatible. |
Regardless of any of this, set names have restrictions (some implicit, unfortunately). The next version of the server will have a much stricter set of rules for naming and will enforce them. It will be something you can turn off if you want to live dangerously, but by default only namespace names, set names, and bin names that conform to a simplified naming scheme will be supported. |
Hi @tivvit, having a set name with binary data is not something that we recommend (although you were able to create them). Aerospike tools and monitoring could break with such invalid set names. Also, I think this change wouldn't actually fix your problem. The current Let me know what you think. I would suggest we close this PR. |
@spkesan I am actually using alicebob/asprom#44. I just wanted to handle this very special case in this "official exporter" too. The code is tested and it works as expected for me. I am OK with closing this PR. If the rules for sets become stricter, this won't be an issue anymore. I wanted to inform the community about this possible problem, and I think that handling this directly in the Aerospike server is the best solution. |
Thanks @tivvit |
Normalized label names may not be unique, which causes problems. I have added the hex of the original string to make them unique and consistent. Connected to alicebob/asprom#44
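As a rough illustration of the collision and of the hex-suffix idea described here, a minimal Go sketch follows. It is not the code from this PR: `normalizeOnly` and `uniqueLabel` are hypothetical names, dropping invalid bytes via `strings.ToValidUTF8` is an assumed stand-in for the original normalization, and the `_` separator is an arbitrary choice.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
	"unicode/utf8"
)

// normalizeOnly is an assumed stand-in for a lossy normalization:
// it simply drops every invalid UTF-8 byte.
func normalizeOnly(raw []byte) string {
	return strings.ToValidUTF8(string(raw), "")
}

// uniqueLabel is a hypothetical helper. Valid names pass through
// unchanged; invalid names get the hex of the original bytes appended
// so that distinct inputs keep distinct, stable labels.
func uniqueLabel(raw []byte) string {
	if utf8.Valid(raw) {
		return string(raw)
	}
	return normalizeOnly(raw) + "_" + hex.EncodeToString(raw)
}

func main() {
	a, b := []byte("set\xc0"), []byte("set\xc1")

	// Lossy normalization alone maps both names to the same label.
	fmt.Println(normalizeOnly(a) == normalizeOnly(b))

	// With the hex suffix the labels differ.
	fmt.Println(uniqueLabel(a))
	fmt.Println(uniqueLabel(b))
}
```

Because the suffix is derived only from the original bytes, the same set name always yields the same label across scrapes.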