[ORT Training] Some important updates of ONNX Runtime training APIs #1335

Merged
merged 18 commits into from
Oct 18, 2023

JingyaHuang
Collaborator

@JingyaHuang JingyaHuang commented Sep 1, 2023

What does this PR do?

  • Update ORTTrainer to be compatible with transformers commit 40ea9ab2a1ad99f12e71c3a26215ad33df082ef9.
  • [Deprecation] Deprecate evaluation and prediction with ONNX Runtime.
  • Refactor the examples and tests, and re-enable the CI, which now runs with tiny functional models.

[Deprecation notes]

The Optimum team decided to deprecate the evaluation and prediction using ONNX Runtime for the reasons below:

  • After the deprecation, evaluation and prediction for the trained model remain possible within PyTorch through ORTTrainer. If you want to run inference with ORT, both evaluation and prediction can be done through the ORTModels in the library.
  • It reduces the workload of maintaining ORT inference, which breaks easily as the ORT inference APIs evolve. The feature also has no clear use case that would justify continued maintenance.
  • Ease the maintenance of ORTTrainer and training examples
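
As a rough illustration of the first point, here is a minimal sketch of running prediction and evaluation with ORT through an ORTModel once training is done. It assumes a sequence-classification checkpoint saved to a local directory by the trainer, and a recent Optimum release where `from_pretrained(..., export=True)` converts a PyTorch checkpoint to ONNX. The names `predict_with_ort`, `accuracy` and `model_dir` are hypothetical and only used for this example; they are not part of this PR.

```python
from typing import List, Sequence


def predict_with_ort(model_dir: str, texts: List[str]) -> List[int]:
    """Return the predicted class index for each text, using ONNX Runtime.

    Assumption: `model_dir` holds a PyTorch sequence-classification
    checkpoint and its tokenizer (for example, one saved after training).
    """
    # Imported lazily so this module still loads where optimum is missing.
    from optimum.onnxruntime import ORTModelForSequenceClassification
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    # export=True exports the PyTorch checkpoint to ONNX on the fly.
    model = ORTModelForSequenceClassification.from_pretrained(model_dir, export=True)

    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    logits = model(**inputs).logits
    return logits.argmax(dim=-1).tolist()


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the reference labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)
```

Evaluation then amounts to something like `accuracy(predict_with_ort(model_dir, texts), labels)`, with the metric swapped for whatever the task calls for.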

[Other subjects to discuss]

  • Automate the update and tests of ORT training examples

@JingyaHuang JingyaHuang changed the title from "Update ort trainer to 4.32.1" to "Update ORTTrainer to 4.33.1" Sep 15, 2023
@JingyaHuang JingyaHuang marked this pull request as ready for review September 15, 2023 15:43
@JingyaHuang JingyaHuang mentioned this pull request Sep 15, 2023
@fxmarty
Collaborator

fxmarty commented Sep 16, 2023

#1327 should already be fixed

Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
@JingyaHuang
Collaborator Author

yeah @fxmarty I forgot to update the description, thanks a lot for the fix.

Collaborator

@fxmarty fxmarty left a comment

LGTM, hopefully adding position_ids inputs did not break ORTTrainer?

Contributor

@regisss regisss left a comment

LGTM!

@JingyaHuang JingyaHuang changed the title from "Update ORTTrainer to 4.33.1" to "Some important updates of ONNx Runtime training APIs" Oct 12, 2023
@JingyaHuang JingyaHuang changed the title from "Some important updates of ONNx Runtime training APIs" to "[ORT Training] Some important updates of ONNx Runtime training APIs" Oct 12, 2023
@JingyaHuang JingyaHuang changed the title from "[ORT Training] Some important updates of ONNx Runtime training APIs" to "[ORT Training] Some important updates of ONNX Runtime training APIs" Oct 12, 2023
@JingyaHuang JingyaHuang added the gpu-test (trigger GPU tests) label Oct 16, 2023
@JingyaHuang JingyaHuang removed the gpu-test (trigger GPU tests) and training labels Oct 17, 2023
@JingyaHuang
Collaborator Author

All tests passed:

====================================================== test session starts =======================================================
platform linux -- Python 3.10.0, pytest-7.4.2, pluggy-1.0.0 -- /home/onnxruntimedev/miniconda3/bin/python
cachedir: .pytest_cache
rootdir: /workspace
configfile: pyproject.toml
collected 13 items                                                                                                               

onnxruntime/nightly_test_trainer.py::ORTTrainerIntegrationTest::test_trainer_fp16_0_distilbert_text_classification PASSED  [  7%]
onnxruntime/nightly_test_trainer.py::ORTTrainerIntegrationTest::test_trainer_fp16_1_gpt2_text_generation PASSED            [ 15%]
onnxruntime/nightly_test_trainer.py::ORTTrainerIntegrationTest::test_trainer_fp16_2_t5_text2text_generation PASSED         [ 23%]
onnxruntime/nightly_test_trainer.py::ORTTrainerIntegrationTest::test_trainer_fp32_0_distilbert_text_classification PASSED  [ 30%]
onnxruntime/nightly_test_trainer.py::ORTTrainerIntegrationTest::test_trainer_fp32_1_gpt2_text_generation PASSED            [ 38%]
onnxruntime/nightly_test_trainer.py::ORTTrainerIntegrationTest::test_trainer_fp32_2_t5_text2text_generation PASSED         [ 46%]
onnxruntime/nightly_test_trainer.py::ORTTrainerIntegrationTest::test_trainer_fp32_with_label_smoothing_0_distilbert_text_classification PASSED [ 53%]
onnxruntime/nightly_test_trainer.py::ORTTrainerIntegrationTest::test_trainer_fp32_with_label_smoothing_1_gpt2_text_generation PASSED [ 61%]
onnxruntime/nightly_test_trainer.py::ORTTrainerIntegrationTest::test_trainer_fp32_with_label_smoothing_2_t5_text2text_generation PASSED [ 69%]
onnxruntime/nightly_test_trainer.py::ORTTrainerIntegrationDeepSpeedTest::test_trainer_fp16_ds_stage1_0_distilbert_text_classification PASSED [ 76%]
onnxruntime/nightly_test_trainer.py::ORTTrainerIntegrationDeepSpeedTest::test_trainer_fp16_ds_stage2_0_distilbert_text_classification PASSED [ 84%]
onnxruntime/nightly_test_trainer.py::ORTTrainerOptimizerChoiceTest::test_ort_fused_adam PASSED                             [ 92%]
onnxruntime/nightly_test_trainer.py::ORTTrainerExampleTest::test_trainer_glue SKIPPED (skip for now, server socket error)  [100%]

One example test is left skipped for now; similar tests still need to be added for the other tasks.

@JingyaHuang JingyaHuang merged commit 85e6fff into huggingface:main Oct 18, 2023
49 of 52 checks passed
@JingyaHuang JingyaHuang deleted the update-ort-trainer-to-4.32 branch October 18, 2023 21:51
@JingyaHuang JingyaHuang mentioned this pull request Oct 18, 2023
3 tasks