Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark kubectl-sourced strings with invalid encoding as UTF-8 #646

Merged
merged 2 commits into from
Jan 27, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,8 @@
## next

*Bug Fixes*
- Help ruby correctly identify kubectl output encoding. [#646](https://github.com/Shopify/krane/pull/646)

## 1.1.1

*Enhancements*
Expand Down
11 changes: 10 additions & 1 deletion lib/krane/kubectl.rb
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,16 @@ def run(*args, log_failure: nil, use_context: true, use_namespace: true, output:
(1..attempts).to_a.each do |current_attempt|
logger.debug("Running command (attempt #{current_attempt}): #{cmd.join(' ')}")
out, err, st = Open3.capture3(*cmd)
logger.debug("Kubectl out: " + out.gsub(/\s+/, ' ')) unless output_is_sensitive

# https://github.com/Shopify/krane/issues/395
unless out.valid_encoding?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This returns false when the string uses characters that aren't in the character set for the encoding the string is claiming to have. I think this is more strictly correct than my previous solution of checking if the original tag is utf8 and always changing it if so. The trade-off is potentially performance: in this version we have to check the encoding on every single string, but we only have to dup the string and change it when a non-ASCII character is found. In the previous version, we never had to check the encoding, but if your locales are set wrong, we're dup'ing and forcing encoding on every single string.

Note that setting Encoding.default_external would also fix the bug, but I don't think we should do that, because as a gem, we should not set globals (not to mention that the docs say "don't set this yourself").

out = out.dup.force_encoding(Encoding::UTF_8)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After further thought and reading about encodings, I am comfortable saying this is safe to do. The string we're tagging here isn't arbitrary--it's coming from kubectl. I'd be shocked if kubectl decided to suddenly start outputting e.g. UTF-16 encoding. Putting that aside, the worst case is that a user gets a non-UTF8 string and still sees the original bug. If we want to be super paranoid, we can add this code after this line:

          unless out.valid_encoding?
            out.encode(invalid: :replace, undef: :replace)
          end

That will force the string to be UTF8, with invalid characters replaced with "?". But I just don't think it's worth it. I spent a few minutes trying to figure out how to test that, and couldn't figure out how.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the better behaviour here is, as you suggest, to let the original bug come through if they hit it. Relying on undef: feels very close to masking a (potentially) larger problem

end

if logger.debug? && !output_is_sensitive
# don't do the gsub unless we're going to print this
logger.debug("Kubectl out: " + out.gsub(/\s+/, ' '))
end

break if st.success?
raise(ResourceNotFoundError, err) if err.match(ERROR_MATCHERS[:not_found]) && raise_if_not_found
Expand Down
23 changes: 23 additions & 0 deletions test/unit/krane/kubectl_test.rb
Original file line number Diff line number Diff line change
Expand Up @@ -348,6 +348,29 @@ def test_retry_delay_backoff
end
end

def test_kubectl_run_fixes_encoding_when_locales_set_to_non_utf8
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now a much better version of the test that reproduces the Ruby environment from the bug report (to confirm, start a console session with the wrong local envs set and check the Encoding values below). Implementation inspired by this patch on Pysch, which I found via a bug report on RubyGems.

Note that I confirmed this is a true regression test that files if I remove my new code.

ext_before = Encoding.default_external
int_before = Encoding.default_internal
logger.level = ::Logger::DEBUG
utf8 = "こんにちは!hélas!"

# This is how setting the env from https://github.com/Shopify/krane/issues/395 manifests internally
Encoding.default_external = Encoding::US_ASCII
Encoding.default_internal = nil

# put the string through the default external encoder
data = %x(echo #{utf8})
assert_equal(Encoding::US_ASCII, data.encoding)

stub_open3(%W(kubectl get pods --namespace=testn --context=testc --request-timeout=#{timeout}), resp: data)
out, _err, _st = build_kubectl.run("get", "pods")
assert_equal(utf8, out)
assert_equal(Encoding::UTF_8, out.encoding)
ensure
Encoding.default_external = ext_before
Encoding.default_internal = int_before
end

private

def timeout
Expand Down