pkg/trace/agent: consider UTF-8 characters in tags #2957

Merged
merged 5 commits into from
Jan 29, 2019

@gbbr gbbr commented Jan 29, 2019

This change fixes a problem in NormalizeTag, which assumed that every character is exactly 1 byte long. That does not hold for UTF-8, where a single character can take up to 4 bytes.

It also fixes an issue where not all Unicode letters were lowercased.
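
One way such a gap can arise is lowercasing only the ASCII range. This is an assumption for illustration, since the old code is not shown in this conversation. The sketch below contrasts it with `unicode.ToLower` from the Go standard library; `asciiLower` and `unicodeLower` are hypothetical names.

```go
package main

import (
	"fmt"
	"unicode"
)

// asciiLower (hypothetical) lowercases only A-Z and leaves every other
// rune untouched.
func asciiLower(s string) string {
	out := []rune(s)
	for i, r := range out {
		if r >= 'A' && r <= 'Z' {
			out[i] = r + ('a' - 'A')
		}
	}
	return string(out)
}

// unicodeLower (hypothetical) lowercases every rune that has a lowercase
// mapping in Go's Unicode tables.
func unicodeLower(s string) string {
	out := []rune(s)
	for i, r := range out {
		out[i] = unicode.ToLower(r)
	}
	return string(out)
}

func main() {
	in := "\u00c9COLE" // "ÉCOLE"
	fmt.Println(asciiLower(in))   // the leading É stays uppercase
	fmt.Println(unicodeLower(in)) // every letter is lowercased
}
```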

⚠️ Warning: this algorithm has an edge case. An uppercase Unicode character is not lowercased when its lowercase form would have a different encoded length. 24 characters are affected: https://play.golang.org/p/Af1bIusSbfq
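
The edge case can be explored with a short sketch like the one below. It is not the code behind the playground link. `lengthChangingUppers` and `lowerIfSameLen` are hypothetical names, and the number of characters found depends on the Unicode tables shipped with the Go version in use. One well-known member of the set is U+212A (KELVIN SIGN), which is 3 bytes in UTF-8 while its lowercase form `k` is 1 byte.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// lengthChangingUppers (hypothetical) returns every uppercase rune whose
// lowercase form encodes to a different number of UTF-8 bytes.
func lengthChangingUppers() []rune {
	var out []rune
	for r := rune(0); r <= unicode.MaxRune; r++ {
		if !unicode.IsUpper(r) {
			continue
		}
		if l := unicode.ToLower(r); utf8.RuneLen(l) != utf8.RuneLen(r) {
			out = append(out, r)
		}
	}
	return out
}

// lowerIfSameLen (hypothetical) lowercases r only when doing so keeps
// its UTF-8 byte length; otherwise r is returned unchanged.
func lowerIfSameLen(r rune) rune {
	if l := unicode.ToLower(r); utf8.RuneLen(l) == utf8.RuneLen(r) {
		return l
	}
	return r
}

func main() {
	for _, r := range lengthChangingUppers() {
		l := unicode.ToLower(r)
		fmt.Printf("%U (%d bytes) -> %U (%d bytes)\n",
			r, utf8.RuneLen(r), l, utf8.RuneLen(l))
	}
}
```

Skipping these characters lets a normalizer rewrite bytes in place, because the output never becomes longer or shorter than the input. The cost is that a few uppercase characters survive normalization.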

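Performance is virtually unchanged: in the benchmarks below, time per operation moves between -0.32% and +4.25%, while bytes and allocations per operation are identical.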

 name                      old time/op    new time/op    delta
 NormalizeTag/ok-4           68.8ns ± 0%    69.2ns ± 0%  +0.58%
 NormalizeTag/trim-4         92.8ns ± 0%    92.5ns ± 0%  -0.32%
 NormalizeTag/trim-both-4     159ns ± 0%     163ns ± 0%  +2.52%
 NormalizeTag/plenty-4        147ns ± 0%     150ns ± 0%  +2.04%
 NormalizeTag/more-4          259ns ± 0%     270ns ± 0%  +4.25%

 name                      old alloc/op   new alloc/op   delta
 NormalizeTag/ok-4            24.0B ± 0%     24.0B ± 0%   0.00%
 NormalizeTag/trim-4          32.0B ± 0%     32.0B ± 0%   0.00%
 NormalizeTag/trim-both-4     64.0B ± 0%     64.0B ± 0%   0.00%
 NormalizeTag/plenty-4        64.0B ± 0%     64.0B ± 0%   0.00%
 NormalizeTag/more-4           128B ± 0%      128B ± 0%   0.00%

 name                      old allocs/op  new allocs/op  delta
 NormalizeTag/ok-4             2.00 ± 0%      2.00 ± 0%   0.00%
 NormalizeTag/trim-4           2.00 ± 0%      2.00 ± 0%   0.00%
 NormalizeTag/trim-both-4      3.00 ± 0%      3.00 ± 0%   0.00%
 NormalizeTag/plenty-4         3.00 ± 0%      3.00 ± 0%   0.00%
 NormalizeTag/more-4           4.00 ± 0%      4.00 ± 0%   0.00%

In addition to #2951

@gbbr gbbr added this to the 6.10.0 milestone Jan 29, 2019
@gbbr gbbr requested a review from LotharSee January 29, 2019 09:17
@gbbr gbbr requested a review from a team as a code owner January 29, 2019 09:17

codecov-io commented Jan 29, 2019

Codecov Report

Merging #2957 into master will increase coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #2957      +/-   ##
==========================================
+ Coverage   56.53%   56.54%   +0.01%     
==========================================
  Files         484      484              
  Lines       34250    34260      +10     
==========================================
+ Hits        19362    19373      +11     
+ Misses      13727    13726       -1     
  Partials     1161     1161

Impacted Files                  Coverage Δ
pkg/trace/agent/tags.go         73.47% <100%> (+1.2%) ⬆️
pkg/logs/auditor/auditor.go     71.12% <0%> (-0.71%) ⬇️
pkg/trace/writer/trace.go       91.66% <0%> (+0.49%) ⬆️
pkg/forwarder/transaction.go    82.9% <0%> (+0.85%) ⬆️

ufoot previously approved these changes Jan 29, 2019
@ufoot ufoot left a comment

I think the tests say it all. LGTM.

@AlexJF AlexJF left a comment

LGTM!

@gbbr gbbr merged commit b79fe06 into master Jan 29, 2019
@gbbr gbbr deleted the gbbr/utf8 branch January 29, 2019 11:40

gbbr commented Jan 29, 2019

Thanks Alex! Great learnings here! 😄

4 participants