Enable training on XPU devices in OTX2.0 #3094

Merged Mar 24, 2024 (67 commits)

kprokofi
Contributor

Summary

How to test

Checklist

  • I have added unit tests to cover my changes.
  • I have added integration tests to cover my changes.
  • I have added e2e tests for validation.
  • I have added the description of my changes into CHANGELOG in my target branch (e.g., CHANGELOG in develop).
  • I have updated the documentation in my target branch accordingly (e.g., documentation in develop).
  • I have linked related issues.

License

  • I submit my code changes under the same Apache License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below).
# Copyright (C) 2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

kprokofi and others added 29 commits January 14, 2024 23:05
@github-actions github-actions bot added the DEPENDENCY Any changes in any dependencies (new dep or its version) should be produced via Change Request on PM label Mar 13, 2024
@harimkang harimkang added this to the 2.0.0 milestone Mar 21, 2024
harimkang
harimkang previously approved these changes Mar 21, 2024
@kprokofi
Contributor Author

kprokofi commented Mar 22, 2024

I updated the strategy in 98e0b69: it removes torch.xpu.optimize for semantic segmentation. IPEX reported a bug in that case, so the optimization has to be removed to be able to train segmentation models.
I hope it will be fixed in upcoming IPEX releases.
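
A minimal sketch of the workaround described in this comment, not the code from the PR itself. The function name `setup_for_xpu`, the task strings, and the skip set are hypothetical; `optimize_fn` stands in for `torch.xpu.optimize` and is assumed to return a `(model, optimizer)` pair when an optimizer is passed. Injecting it as a parameter keeps the sketch runnable without IPEX installed.

```python
from typing import Any, Callable, Tuple

# Hypothetical task names; the identifiers actually used in OTX are not shown in this thread.
TASKS_WITHOUT_XPU_OPTIMIZE = {"SEMANTIC_SEGMENTATION"}


def setup_for_xpu(
    model: Any,
    optimizer: Any,
    task: str,
    optimize_fn: Callable[..., Tuple[Any, Any]],
) -> Tuple[Any, Any]:
    """Apply the XPU optimization unless the task is on the skip list."""
    if task in TASKS_WITHOUT_XPU_OPTIMIZE:
        # Workaround: leave the model and optimizer untouched for affected tasks.
        return model, optimizer
    # Assumption: optimize_fn accepts the optimizer as a keyword argument
    # and returns the (model, optimizer) pair.
    return optimize_fn(model, optimizer=optimizer)


if __name__ == "__main__":
    # Stand-in for the real optimization call, for illustration only.
    def fake_optimize(model: Any, optimizer: Any = None) -> Tuple[Any, Any]:
        return f"optimized({model})", optimizer

    print(setup_for_xpu("seg_model", "sgd", "SEMANTIC_SEGMENTATION", fake_optimize))
    print(setup_for_xpu("det_model", "sgd", "DETECTION", fake_optimize))
```

Keeping the skip list in one place would make it easy to drop the workaround once a fixed IPEX release is available.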

@kprokofi
Contributor Author

kprokofi commented Mar 22, 2024

@harimkang, it seems a problem with installing mmdet occurs while setting up the environment for unit tests. Could you take a look?

@harimkang
Contributor

> @harimkang, it seems a problem with installing mmdet occurs while setting up the environment for unit tests. Could you take a look?

There was a brief issue with the CI network. I reran the tests and the installation works fine.

@kprokofi kprokofi enabled auto-merge (squash) March 24, 2024 22:45
@kprokofi kprokofi merged commit 9c746da into releases/2.0.0 Mar 24, 2024
16 checks passed
@kprokofi kprokofi deleted the kp/xpu_otx2.0 branch March 24, 2024 23:30
kprokofi added a commit that referenced this pull request Apr 9, 2024
* add raising an error when metric is None

* added accelerators

* fix packages

* fix assigning model

* debug on MAX

* change precision

* update MixedPrecisionXPUPlugin

* debug

* added monkey patching

* minor

* minor

* added patch for mmengine

* fix OD and IS

* benchmark debug

* change device

* quick fix for instance seg

* fix pre-commit

* fix pre-commit

* clean the code

* added additional flag for mmcv

* added unit tests

* fixed unit test

* fix linter

* added unit tests and replied comments

* fix pre-commit

* minor fix

* added documentation

* fix unit test

* add workaround for semantic segmentation

* remove RoiAlignTest due to instability

* minor

* remove strategy back

* try to patch SingleDeviceStrategy

* added auto xpu configuration

* patch strategy

* small fix

* reply to comments

* move patching xpu packages to accelerator

* fix test_xpu test

* remove do-not-install-mmcv

* fix pre-commit

* remove torch.xpu.optimize for segmentation

---------

Co-authored-by: Emily <emily.chun@intel.com>
kprokofi added a commit that referenced this pull request Apr 16, 2024
* Enable training on XPU devices in OTX2.0 (#3094)

* Add exporter/demo unit tests (#3218)

* added unit tests. Need to clean up

* move tests

* fix pre-commit

* return demo back

* minor

* delete unnecessary comments

* fix unit test

* fix pre-commit

* fix pre-commit 2

* fix test_postprocess_openvino_model

* fix unit tests

* test_precommit

* Fix a bug that engine.test doesn't work with XPU (#3293)

* fix bug

* align with pre-commit

---------

Co-authored-by: Emily <emily.chun@intel.com>

* fix merge conflicts for pre-commit

* fix precommit 2

* fix unit test

* fix pre-commit

* fix export tests

* fix pre-commit

* fix tox

* fix pre-commit

---------

Co-authored-by: Emily <emily.chun@intel.com>
Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>
Labels

  • DOC: Improvements or additions to documentation
  • OTX 2.0: For OTX v2.0
  • TEST: Any changes in tests
7 participants