-
Notifications
You must be signed in to change notification settings - Fork 28k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-41775][PYTHON][ML] Adding support for PyTorch functions #39369
Conversation
Can one of the admins verify this patch? |
9becf31
to
0d169aa
Compare
overall good. |
0d169aa
to
c6b96a5
Compare
2d1ad9f
to
ad662be
Compare
ad662be
to
5f10396
Compare
5f10396
to
d8715ba
Compare
36a9a00
to
177b291
Compare
37df2c6
to
691fc3b
Compare
691fc3b
to
71f78f7
Compare
41aebc2
to
73a62f6
Compare
Merged to master. |
@@ -15,16 +15,21 @@ | |||
# limitations under the License. | |||
# | |||
|
|||
import cloudpickle # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be our cloudpickle
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @rithwik-db , @WeichenXu123 , @HyukjinKwon .
Apache Spark should not require two versions of cloudpickle
at the same time.
I made a PR, #39715.
…`cloudpickle` ### What changes were proposed in this pull request? This is a follow-up of #39369 which aims to use `pyspark.cloudpickle` instead of outside `cloudpickle` dependency. ### Why are the changes needed? Apache PySpark should not use two versions of `cloudpickle`. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs. Closes #39715 from dongjoon-hyun/SPARK-41775. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request? This is a follow-up of #39369 which aims to fix `stderr` rerouting to `stdout` which is required for users seeing any errors that come up due to distributed training, ### Why are the changes needed? Previously, the error would be lost since it's never returned to the user. We are fixing that issue. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? N/A Closes #39724 from rithwik-db/sterr-followup. Authored-by: Rithwik Ediga Lakhamsani <rithwik.ediga@databricks.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
NOTE: If you want to only view the diff from the other WIP PR regarding the baseline changes, look at the LAST COMMIT in this PR's commit history (titled "WIP adding notebook functionality"). Since I am sending out parallel PRs that are related, you should view this commit to see the diff pertaining to this ticket.
What changes were proposed in this pull request?
This is an addition to #39299 to add support for using functions as the input for distributed training. The users would follow the first workflow in the design document to run training.
Why are the changes needed?
We want to make it easier for users to run distributed training from a notebook setting.
Does this PR introduce any user-facing change?
Users can now input a training function into the
PyTorchDistributor().run(...)
api. This will require a lot of additional documentation though since since we are internally using cloudpickle to run training so we need to be able to have the user's train() function be picklable.How was this patch tested?
This is a WIP PR so it hasn't been tested yet.