Cannot pipe GzipFile into subprocess #85062

Open
NehalPatel mannequin opened this issue Jun 6, 2020 · 4 comments
Labels
3.7 (EOL) end of life stdlib Python modules in the Lib dir topic-IO type-bug An unexpected behavior, bug, or error

Comments

@NehalPatel

NehalPatel mannequin commented Jun 6, 2020

BPO 40885
Nosy @giampaolo

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2020-06-06.00:56:53.006>
labels = ['3.7', 'type-bug', 'library', 'expert-IO']
title = 'Cannot pipe GzipFile into subprocess'
updated_at = <Date 2021-10-23.06:17:00.698>
user = 'https://bugs.python.org/NehalPatel'

bugs.python.org fields:

activity = <Date 2021-10-23.06:17:00.698>
actor = 'mherrmann.at'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)', 'IO']
creation = <Date 2020-06-06.00:56:53.006>
creator = 'Nehal Patel'
dependencies = []
files = []
hgrepos = []
issue_num = 40885
keywords = []
message_count = 4.0
messages = ['370804', '370815', '370820', '404854']
nosy_count = 4.0
nosy_names = ['giampaolo.rodola', 'SilentGhost', 'mherrmann.at', 'Nehal Patel']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue40885'
versions = ['Python 3.7']

@NehalPatel

NehalPatel mannequin commented Jun 6, 2020

The following code produces incorrect behavior:

import gzip
import subprocess

with gzip.open("foo.gz") as gz:
    res = subprocess.run("cat", stdin=gz, capture_output=True)

The contents of res.stdout are identical to the raw, still-compressed contents of "foo.gz", not the decompressed data.

It seems the subprocess somehow gets a hold of the underlying file descriptor pointing to the compressed file, and ends up being fed the compressed bytes.
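A minimal sketch of the mechanism described above, assuming CPython's standard gzip module: GzipFile.fileno() reports the descriptor of the file it wraps, and subprocess uses fileno() to wire up the child's stdin. The helper name shares_descriptor is hypothetical and exists only for illustration.

```python
import gzip


def shares_descriptor(path):
    """Return True if a GzipFile reports the same fd as the raw file it wraps.

    Hypothetical helper: it only illustrates why a child process that is
    handed the GzipFile ends up reading the compressed bytes.
    """
    with open(path, "rb") as raw:
        with gzip.GzipFile(fileobj=raw, mode="rb") as gz:
            # subprocess calls stdin.fileno(), so this is the descriptor
            # the child would inherit.
            return gz.fileno() == raw.fileno()
```

If this returns True for "foo.gz", the child is reading the file on disk directly and never sees the decompression layer, which lives only inside the Python process.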

@NehalPatel NehalPatel mannequin added 3.7 (EOL) end of life topic-IO type-bug An unexpected behavior, bug, or error labels Jun 6, 2020
@SilentGhost

SilentGhost mannequin commented Jun 6, 2020

subprocess somehow gets a hold of the underlying file descriptor pointing to the compressed file, and ends up being fed the compressed bytes

That is exactly what happens, and I'd wager this is not going to change. You could easily pass the decoded bytes into the process using the input parameter.
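A minimal sketch of that suggestion, assuming the decompressed payload fits comfortably in memory. The function name run_with_decompressed_input is hypothetical; gzip.open and the input argument of subprocess.run are the standard-library pieces being used.

```python
import gzip
import subprocess


def run_with_decompressed_input(gz_path, argv):
    """Decompress gz_path in memory and feed the bytes to argv's stdin.

    Hypothetical helper name; the whole payload is held in RAM, so this
    is only suitable for inputs of modest size.
    """
    with gzip.open(gz_path, "rb") as gz:
        data = gz.read()  # decompression happens here, in this process
    # input= makes subprocess create a pipe and write `data` into it.
    return subprocess.run(argv, input=data, capture_output=True)
```

For example, run_with_decompressed_input("foo.gz", ["cat"]).stdout should then hold the decompressed bytes rather than the compressed ones.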

@SilentGhost SilentGhost mannequin added stdlib Python modules in the Lib dir labels Jun 6, 2020
@NehalPatel

NehalPatel mannequin commented Jun 6, 2020

In my use case, I was actually trying to stream a large gzip file from the cloud directly into a subprocess without spilling it onto disk or holding it all in RAM, i.e. the code actually looked more like this:

r, w = os.pipe()
# ... launch a thread that writes the compressed data to w (the write end)
with gzip.open(os.fdopen(r, 'rb')) as gz:
    res = subprocess.run("myexe", stdin=gz, capture_output=True)
# FYI: the expected output is tiny

(In my case, I could modify the executable to expect compressed input, so I chose that solution. Another possibility would have been to use subprocess.Popen twice, once with 'gzcat' and once with 'myexe'.)
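A third option, sketched here under assumptions, is to decompress inside Python and copy the result into the child's stdin in chunks, so that neither disk nor a full in-memory copy is needed. The names stream_gzip_to_process and chunk_size are hypothetical, compressed_stream is assumed to be any binary file-like object that yields the gzip data, and the BrokenPipeError handling assumes a POSIX-like platform.

```python
import gzip
import shutil
import subprocess
import threading


def stream_gzip_to_process(compressed_stream, argv, chunk_size=64 * 1024):
    """Decompress compressed_stream on the fly and pipe it into argv's stdin.

    Hypothetical helper. Returns (returncode, captured_stdout).
    """
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        # Runs in a separate thread so the main thread can drain stdout
        # concurrently, which avoids a deadlock when pipe buffers fill up.
        try:
            with gzip.GzipFile(fileobj=compressed_stream, mode="rb") as gz:
                shutil.copyfileobj(gz, proc.stdin, chunk_size)
        except BrokenPipeError:
            pass  # the child exited before consuming all of its input
        finally:
            try:
                proc.stdin.close()  # signals end of input to the child
            except BrokenPipeError:
                pass

    feeder = threading.Thread(target=feed, daemon=True)
    feeder.start()
    output = proc.stdout.read()  # expected to be small in this use case
    feeder.join()
    return proc.wait(), output
```

The read end of an os.pipe() wrapped with os.fdopen(r, 'rb') could be passed as compressed_stream, which matches the streaming setup described above.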

I agree that, given how libgz works, it would be difficult to fix the problem. I would suggest finding a way to alert the user about this issue, because the behavior is very confusing when it happens.
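One way such an alert could look in user code is a thin guard around subprocess.run. This is only a sketch of the idea, not something the standard library provides: run_checked is a hypothetical wrapper, and it checks only for gzip.GzipFile.

```python
import gzip
import subprocess


def run_checked(argv, stdin=None, **kwargs):
    """Hypothetical wrapper around subprocess.run that rejects GzipFile stdin."""
    if isinstance(stdin, gzip.GzipFile):
        # The child would inherit the underlying descriptor and read the
        # compressed bytes, so fail loudly instead of silently misbehaving.
        raise TypeError(
            "stdin is a GzipFile: the child process would read the compressed "
            "bytes from the underlying file descriptor; decompress first and "
            "pass the data via input= or a pipe"
        )
    return subprocess.run(argv, stdin=stdin, **kwargs)
```

Raising an error is a deliberate choice in this sketch; a warning via the warnings module would be a softer alternative.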

@mherrmannat

mherrmannat mannequin commented Oct 23, 2021

I just encountered what seems to be the inverse problem of this issue: bpo-45585

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022