automatically tarball directories? #23

Closed
coyotemarin opened this issue Oct 23, 2010 · 6 comments

@coyotemarin
Collaborator

We have a python_archives option which allows you to upload a tarball and stick it in the $PYTHONPATH. It seems kind of silly, but it would probably be helpful to people if we would automatically tar up directories for them.

We probably want to automatically remove stray editor/MacFuse crud (~, .#, ._*) like we do when bootstrapping mrjob.

Not going to do this until someone asks for it. :)
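For illustration, the idea in this comment (tar up a directory while skipping editor/MacFUSE crud) could be sketched roughly as follows. This is not mrjob's code: `tar_directory`, `is_crud`, and `CRUD_PATTERNS` are hypothetical names, and the glob forms `*~`, `.#*`, `._*` are an assumption about what the patterns listed above are meant to match.

```python
import fnmatch
import os
import tarfile

# Assumed glob forms of the "crud" patterns mentioned in the comment:
# editor backups (*~), Emacs lock files (.#*), AppleDouble files (._*).
CRUD_PATTERNS = ('*~', '.#*', '._*')


def is_crud(name):
    """Return True if the base name of *name* matches a crud pattern."""
    base = os.path.basename(name)
    return any(fnmatch.fnmatch(base, pattern) for pattern in CRUD_PATTERNS)


def tar_directory(src_dir, tar_path):
    """Write src_dir to a gzipped tarball at tar_path, skipping crud files.

    Hypothetical helper, not part of mrjob.
    """
    def _skip_crud(tarinfo):
        # Returning None from a tarfile filter excludes that member.
        return None if is_crud(tarinfo.name) else tarinfo

    # Store paths relative to the directory's own name, e.g. "mylib/foo.py".
    arcname = os.path.basename(os.path.normpath(src_dir))
    with tarfile.open(tar_path, 'w:gz') as tar:
        tar.add(src_dir, arcname=arcname, filter=_skip_crud)
    return tar_path
```

The resulting path could then be passed wherever a hand-built tarball would otherwise go, for example in the `python_archives` option.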

@BrandonHaynes
Contributor

I have been running a (somewhat) specialized implementation of this feature that might be generalized for this purpose. Is this still an open issue in MRJob?

@coyotemarin
Collaborator Author

(whoops, didn't mean to close that. wooo GitHub keyboard shortcuts)

Yes, this is still an open issue. We didn't do this because we didn't have a good use case. Would love to see your code.

@BrandonHaynes
Contributor

I'm currently archiving the specified files on runner instantiation and appending the resulting file(s) onto the python_archives list. Although I'd generally prefer to avoid mucking with the caller's passed-in values, this is an easy shortcut. Thoughts?
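As a rough illustration only (this is not BrandonHaynes's actual code, which does not appear in this thread), the approach described above might look something like the sketch below. `archive_dirs` is a hypothetical helper, and returning a new list instead of appending to the caller's list is an assumption made here to sidestep the concern about modifying passed-in values.

```python
import os
import tarfile
import tempfile


def archive_dirs(python_archives, tmp_dir=None):
    """Return a new list in which each directory is replaced by a tarball.

    Hypothetical helper, not part of mrjob. Entries that are not
    directories (e.g. existing tarballs) are passed through unchanged,
    and the caller's list is left untouched.
    """
    tmp_dir = tmp_dir or tempfile.mkdtemp()
    result = []
    for path in python_archives:
        if os.path.isdir(path):
            name = os.path.basename(os.path.normpath(path))
            tar_path = os.path.join(tmp_dir, name + '.tar.gz')
            with tarfile.open(tar_path, 'w:gz') as tar:
                # Keep the directory name as the top-level entry.
                tar.add(path, arcname=name)
            result.append(tar_path)
        else:
            result.append(path)
    return result
```

A runner could call something like this once at instantiation and use the returned list in place of the original `python_archives` value.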

@coyotemarin
Collaborator Author

Sorry for the slow response. Just to be totally clear, can you show me what your command line looks like and/or some sample code?

@coyotemarin
Collaborator Author

Probably better to .zip directories, so they can play well with the py_files option (see #1375).
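One reason a .zip can be convenient is that Python can import packages directly from a .zip file on `sys.path`, which it cannot do with a tarball. A minimal sketch of zipping a directory with the standard library, assuming the directory should appear as a top-level package inside the archive (`zip_directory` is a hypothetical name, not an mrjob function):

```python
import os
import zipfile


def zip_directory(src_dir, zip_path):
    """Zip src_dir so that its own name is the top-level entry.

    Hypothetical helper, not part of mrjob. For src_dir="/tmp/mylib",
    members are stored as "mylib/__init__.py", "mylib/sub/x.py", etc.
    """
    src_dir = os.path.abspath(src_dir)
    parent = os.path.dirname(src_dir)
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for dirpath, _dirnames, filenames in os.walk(src_dir):
            for filename in sorted(filenames):
                full_path = os.path.join(dirpath, filename)
                # Store paths relative to the parent of src_dir.
                zf.write(full_path, os.path.relpath(full_path, parent))
    return zip_path
```

Note that this sketch stores files only, so empty subdirectories are not preserved.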

@coyotemarin coyotemarin modified the milestones: v0.5.7, v0.5.8 Nov 5, 2016
@coyotemarin coyotemarin self-assigned this Dec 21, 2016
@coyotemarin
Collaborator Author

Finally going to do this!

Just going to create tarballs (since even old versions of Hadoop support them) and not do any filtering of hidden files.

coyotemarin pushed a commit that referenced this issue Dec 28, 2016
auto-archive directories (fixes #23)
coyotemarin pushed a commit to coyotemarin/mrjob that referenced this issue Mar 27, 2020