-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mark kubectl-sourced strings with invalid encoding as UTF-8 #646
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -34,7 +34,16 @@ def run(*args, log_failure: nil, use_context: true, use_namespace: true, output: | |
(1..attempts).to_a.each do |current_attempt| | ||
logger.debug("Running command (attempt #{current_attempt}): #{cmd.join(' ')}") | ||
out, err, st = Open3.capture3(*cmd) | ||
logger.debug("Kubectl out: " + out.gsub(/\s+/, ' ')) unless output_is_sensitive | ||
|
||
# https://github.com/Shopify/krane/issues/395 | ||
unless out.valid_encoding? | ||
out = out.dup.force_encoding(Encoding::UTF_8) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. After further thought and reading about encodings, I am comfortable saying this is safe to do. The string we're tagging here isn't arbitrary--it's coming from kubectl. I'd be shocked if kubectl decided to suddenly start outputting e.g. UTF-16 encoding. Putting that aside, the worst case is that a user gets a non-UTF8 string and still sees the original bug. If we want to be super paranoid, we can add this code after this line: unless out.valid_encoding?
out.encode(invalid: :replace, undef: :replace)
end That will force the string to be UTF8, with invalid characters replaced with "?". But I just don't think it's worth it. I spent a few minutes trying to figure out how to test that, and couldn't figure out how. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the better behaviour here is, as you suggest, to let the original bug come through if they hit it. Relying on |
||
end | ||
|
||
if logger.debug? && !output_is_sensitive | ||
# don't do the gsub unless we're going to print this | ||
logger.debug("Kubectl out: " + out.gsub(/\s+/, ' ')) | ||
end | ||
|
||
break if st.success? | ||
raise(ResourceNotFoundError, err) if err.match(ERROR_MATCHERS[:not_found]) && raise_if_not_found | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -348,6 +348,29 @@ def test_retry_delay_backoff | |
end | ||
end | ||
|
||
def test_kubectl_run_fixes_encoding_when_locales_set_to_non_utf8 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is now a much better version of the test that reproduces the Ruby environment from the bug report (to confirm, start a console session with the wrong local envs set and check the Note that I confirmed this is a true regression test that files if I remove my new code. |
||
ext_before = Encoding.default_external | ||
int_before = Encoding.default_internal | ||
logger.level = ::Logger::DEBUG | ||
utf8 = "こんにちは!hélas!" | ||
|
||
# This is how setting the env from https://github.com/Shopify/krane/issues/395 manifests internally | ||
Encoding.default_external = Encoding::US_ASCII | ||
Encoding.default_internal = nil | ||
|
||
# put the string through the default external encoder | ||
data = %x(echo #{utf8}) | ||
assert_equal(Encoding::US_ASCII, data.encoding) | ||
|
||
stub_open3(%W(kubectl get pods --namespace=testn --context=testc --request-timeout=#{timeout}), resp: data) | ||
out, _err, _st = build_kubectl.run("get", "pods") | ||
assert_equal(utf8, out) | ||
assert_equal(Encoding::UTF_8, out.encoding) | ||
ensure | ||
Encoding.default_external = ext_before | ||
Encoding.default_internal = int_before | ||
end | ||
|
||
private | ||
|
||
def timeout | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This returns false when the string uses characters that aren't in the character set for the encoding the string is claiming to have. I think this is more strictly correct than my previous solution of checking if the original tag is utf8 and always changing it if so. The trade-off is potentially performance: in this version we have to check the encoding on every single string, but we only have to dup the string and change it when a non-ASCII character is found. In the previous version, we never had to check the encoding, but if your locales are set wrong, we're dup'ing and forcing encoding on every single string.
Note that setting
Encoding.default_external
would also fix the bug, but I don't think we should do that, because as a gem, we should not set globals (not to mention that the docs say "don't set this yourself").