Unicode support is added for metric names and tag names/values.#205
Unicode support is added for metric names and tag names/values.#205vasiliyk wants to merge 1 commit intoOpenTSDB:nextfrom vasiliyk:next
Conversation
Current whitelisted symbols: letter, digit, underscore symbol, dot symbol, forward slash. Dash symbol was removed from white-list.
|
We can't remove the "-" since it's already used by a bunch of folks. It would cause a ton of trouble. Also, could you look at using the "next" branch and add a Config option that would enable the unicode validator for folks that need it? |
|
@tsuna mentioned "-" on the mail list in the list of symbols that we may need to reserve, and I thought it was good idea to make it in next release. https://groups.google.com/forum/#!topic/opentsdb/4v43M3aOl-c If you still worry about performance I can propose another variant of algorithm. For non unicode letters we have 0.005ms(current filter) vs 0.012ms for 25 symbols, and minimal overhead for unicode letters (additional 4 numbers comparisons) : 0.002ms. What do you prefer config option or this variant ? public static void validateString(final String what, final String s) {
if (s == null) {
throw new IllegalArgumentException("Invalid " + what + ": null");
}
final int n = s.length();
for (int i = 0; i < n; i++) {
final char c = s.charAt(i);
if (!('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z') ||
('0' <= c && c <= '9') || c == '_' || c == '.' || c == '/' ||
Character.isLetter(c)) {
throw new IllegalArgumentException("Invalid " + what
+ " (\"" + s + "\"): illegal character: " + c);
}
}
} |
|
Since we already allowed I like @vasiliyk's last suggestion in the comment above. I didn't even think such a minimal change would be all that would be required. |
Current whitelisted symbols: letter, digit, underscore symbol, dot symbol, forward slash.
Dash symbol was removed from white-list.
Right after merge I will prepare change in documentation: dash "-" symbol is removed from list of allowed symbols.