Allow retrieval of binary data from WSClient output stream #1471

@FabianNiehaus

Description

What is the feature and why do you need it:

The API client currently does not offer any way to copy files from Pods / containers (similar to kubectl cp).
We have created a workaround by opening a stream to the Pod, creating a tar archive of the files to copy, and outputting the data to stdout. Once retrieved, the data is then extracted from the archive.
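The extraction half of this workaround can be sketched as follows. This is a minimal illustration, assuming the raw archive bytes reach the client intact; `make_tar_bytes` and `read_tar_bytes` are hypothetical helper names, and `make_tar_bytes` only stands in for the `tar` process that would run inside the container.

```python
import io
import tarfile


def make_tar_bytes(files):
    """Build an in-memory tar archive from a {name: bytes} mapping.

    Stand-in for the tar command that would run inside the container.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_tar_bytes(archive):
    """Return {name: bytes} for every regular file in an in-memory tar archive."""
    result = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        for member in tar.getmembers():
            if member.isfile():
                result[member.name] = tar.extractfile(member).read()
    return result


# Sample payload: the first bytes of a little-endian PCAP header.
archive = make_tar_bytes({"capture.pcap": b"\xd4\xc3\xb2\xa1\x02\x00\x04\x00"})
files = read_tar_bytes(archive)
```

The extraction only works if `archive` is byte-for-byte what `tar` wrote to stdout.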

However, this does not work for binary files due to how the WSClient handles incoming data.

kubernetes\stream\ws_client.py:161-178 (kubernetes v17.17.0)

    def update(self, timeout=0):
        """Update channel buffers with at most one complete frame of input."""
        if not self.is_open():
            return
        if not self.sock.connected:
            self._connected = False
            return
        r, _, _ = select.select(
            (self.sock.sock, ), (), (), timeout)
        if r:
            op_code, frame = self.sock.recv_data_frame(True)
            if op_code == ABNF.OPCODE_CLOSE:
                self._connected = False
                return
            elif op_code == ABNF.OPCODE_BINARY or op_code == ABNF.OPCODE_TEXT:
                data = frame.data
                if six.PY3:
                    data = data.decode("utf-8", "replace")

The last line always decodes the frame as UTF-8, replacing every byte sequence that is not valid UTF-8 with the replacement character (U+FFFD).
For binary data such as PCAP files, this corrupts the output.
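The effect can be reproduced without a cluster. The snippet below is a small sketch that applies the same `decode("utf-8", "replace")` call to a sample payload (the first bytes of a little-endian PCAP header, chosen here only for illustration) and encodes the result again.

```python
# Sample binary payload; not taken from a real capture.
pcap_magic = b"\xd4\xc3\xb2\xa1\x02\x00\x04\x00"

# Same call as in WSClient.update(): invalid sequences become U+FFFD.
decoded = pcap_magic.decode("utf-8", "replace")

# Encoding the text again does not restore the original bytes,
# because the replaced bytes are no longer known.
round_tripped = decoded.encode("utf-8")

print("\ufffd" in decoded)          # replacement characters were inserted
print(round_tripped == pcap_magic)  # the round trip is lossy
```

Once a byte has been replaced, the original value cannot be recovered from the decoded string.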

Describe the solution you'd like to see:
I think that changing the signature of update to allow the user the following options might resolve the issue:

  1. Choose a different error handler than replace. According to the docs, strict and ignore are options as well. I tested ignore, and decoding and then encoding again gave the desired output, with both bytes objects being the same.
  2. Allow the user to skip conversion altogether. This would mean adding a flag that skips
    if six.PY3:
        data = data.decode("utf-8", "replace")
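Both options could be combined in one place. The following is only a sketch of the idea, not the library's API: `decode_frame`, `binary`, and `errors` are hypothetical names chosen for illustration.

```python
def decode_frame(data, binary=False, errors="replace"):
    """Hypothetical helper: return a frame payload as bytes or as text.

    binary=True skips the conversion and returns the bytes unchanged.
    Otherwise the payload is decoded as UTF-8 with the given error handler
    ("replace", "ignore", or "strict").
    """
    if binary:
        return data
    return data.decode("utf-8", errors)


payload = b"\xd4\xc3\xb2\xa1\x02\x00\x04\x00"
raw = decode_frame(payload, binary=True)  # bytes, untouched
text = decode_frame(b"hello")             # str, default behaviour kept
```

Keeping `binary=False` and `errors="replace"` as defaults would leave existing callers unaffected. Note that `ignore` drops undecodable bytes instead of replacing them, so whether a decode/encode round trip succeeds depends on the payload; skipping the decode avoids that question entirely.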

If needed, I can implement the agreed-upon solution myself and open a pull request.

Labels

kind/feature: Categorizes issue or PR as related to a new feature.
