-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-22519][flink-python] support tar python archives #15813
[FLINK-22519][flink-python] support tar python archives #15813
Conversation
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community Automated ChecksLast check on commit ec561cd (Fri Apr 30 02:12:21 UTC 2021) Warnings:
Mention the bot in a comment to re-run the automated checks. Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commandsThe @flinkbot bot supports the following commands:
|
flink-python/src/main/java/org/apache/flink/python/util/TarGzUtils.java
Outdated
Show resolved
Hide resolved
...k-python/src/main/java/org/apache/flink/python/env/beam/ProcessPythonEnvironmentManager.java
Outdated
Show resolved
Hide resolved
...k-python/src/main/java/org/apache/flink/python/env/beam/ProcessPythonEnvironmentManager.java
Outdated
Show resolved
Hide resolved
ec561cd
to
0d44263
Compare
String untarCommand = | ||
gzipped | ||
? String.format( | ||
"gzip -dc '%s' | (cd '%s' && tar -xf -)", inFilePath, targetDirPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if I can use tar -xzf
directly? It is simpler, but I am not exactly sure why hadoop-common chooses to do the more verbose way, i.e., gzip ... | tar -xf ...
...k-python/src/main/java/org/apache/flink/python/env/beam/ProcessPythonEnvironmentManager.java
Outdated
Show resolved
Hide resolved
flink-python/src/main/java/org/apache/flink/python/util/DecompressUtils.java
Outdated
Show resolved
Hide resolved
flink-python/src/main/java/org/apache/flink/python/util/TarUtils.java
Outdated
Show resolved
Hide resolved
flink-python/src/main/java/org/apache/flink/python/util/TarUtils.java
Outdated
Show resolved
Hide resolved
flink-python/src/main/java/org/apache/flink/python/util/TarUtils.java
Outdated
Show resolved
Hide resolved
if (!f.isDirectory() && !f.mkdirs()) { | ||
throw new IOException("failed to create directory " + f); | ||
} | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also considered link and hard link in https://github.com/apache/hadoop/blob/7f93349ee74da5f35276b7535781714501ab2457/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileUtil.java#L991. Should we also consider that?
@YikSanChan Thanks a lot for the update. Have left a few comments as above. Besides, it would be great if we could add some tests. |
Co-authored-by: Dian Fu <dianfu@apache.org>
180a9c6
to
1533b82
Compare
Any update on this PR ? Supporting tar is definitely necessary. |
Hi Jeff, I haven't worked on this for a while. Though if you see the need being urgent, I am happy to revisit and try to finish this next week. |
Thanks @YikSanChan I am trying to make a workaround via |
@YikSanChan Regarding to tests, you could take a look ZipUtilsTests which is added recently for reference. |
@YikSanChan What's the status of this PR? I'm asking because we are approaching the feature freeze of 1.14 which is planned at the end of this month. It would be great to include this feature in 1.14. |
@YikSanChan @dianfu Here's one zeppelin notebook which create conda env tar and zip file to use customized python env in yarn cluster. But it would be great if this PR can be merged so that I can only create one tar file. http://23.254.161.240/#/notebook/2G8N1WTTS |
@YikSanChan Any update ? |
Hi Jeff, unfortunately I am not able to work on this recently. |
@YikSanChan Since this feature is very important and you don't have much bandwidth on this work. If you don't mind, I'd like to take over this PR and continue the work based on this PR. |
Feel free, no worry at all. |
@YikSanChan Thanks a lot~ |
What is the purpose of the change
Support tar python archives.
Brief change log
Verifying this change
This change added tests and can be verified as follows: TODO
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation