add cat() to MRJobRunner (stream compressed output) #17

Closed
coyotemarin opened this issue Oct 23, 2010 · 2 comments

Comments

@coyotemarin
Collaborator

cat() would work like hadoop fs -cat, except that it would automatically decompress .gz and .bz2 files.

MRJobRunner.stream_output() would just call cat(). This would allow us to stream compressed output from jobs, which we currently can't do.
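
For illustration, here is a rough sketch of the idea using only the Python standard library. It is not the code that ended up in mrjob: the function names open_maybe_compressed(), cat() and stream_output() below are hypothetical stand-ins, and the sketch assumes local file paths only (no HDFS or S3) and picks the decompressor purely from the file extension.

```python
import bz2
import gzip


def open_maybe_compressed(path):
    """Open a local file for reading bytes.

    Hypothetical helper: chooses a decompressor from the file extension.
    """
    if path.endswith('.gz'):
        return gzip.open(path, 'rb')
    if path.endswith('.bz2'):
        return bz2.open(path, 'rb')
    return open(path, 'rb')


def cat(path):
    """Yield the lines of path, decompressing .gz and .bz2 on the fly."""
    with open_maybe_compressed(path) as f:
        for line in f:
            yield line


def stream_output(paths):
    """Yield lines from every output file in order, by calling cat().

    Assumption: the caller already knows the list of output file paths.
    """
    for path in paths:
        for line in cat(path):
            yield line
```

With something like this, a caller could loop over stream_output(['part-00000.gz', 'part-00001.gz']) and get plain lines back without gunzipping anything by hand.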

@ghost ghost assigned coyotemarin Apr 11, 2011
@ghost ghost assigned wqardaji May 20, 2011
@coyotemarin
Collaborator Author

Actually, going to defer MRJobRunner.stream_output() calling cat() until v0.3.0, so that I don't unintentionally sabotage someone who was doing the gunzipping themselves.

coyotemarin pushed a commit that referenced this issue Jun 3, 2011
add cat() to MRJobRunner (solves issue #17)
@coyotemarin
Collaborator Author

Fixed in development.
