
Conversation

@dependabot dependabot bot commented on behalf of github Apr 15, 2022

Bumps horovod from 0.22.1 to 0.24.0.

Release notes

Sourced from horovod's releases.

Elastic mode improvements, MXNet async dependency engine, fixes for latest PyTorch and TensorFlow versions

Added

  • Ray: Added elastic keyword parameters to RayExecutor API: This API supports both static (non-elastic) and elastic Horovod jobs (see the sketch after this list). (#3190)
  • TensorFlow: Added in-place broadcasting of variables. (#3128)
  • Elastic: Added support for resurrecting blacklisted hosts. (#3319)
  • MXNet: Added support for MXNet async dependency engine. (#3242, #2963)
  • Spark/Lightning: Added history to lightning estimator. (#3214)
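
A minimal sketch of an elastic launch through this RayExecutor API. The elastic keyword names (min_workers, max_workers) are assumptions taken from the release note above, not a confirmed signature:

```python
# Hedged sketch: elastic RayExecutor launch. min_workers/max_workers are
# assumed names for the elastic keyword parameters added in #3190.
import ray
from horovod.ray import RayExecutor

ray.init()
settings = RayExecutor.create_settings(timeout_s=30)

executor = RayExecutor(
    settings,
    min_workers=1,   # assumed: job keeps running with as few as 1 worker
    max_workers=4,   # assumed: job scales up to at most 4 workers
    use_gpu=False,
)
executor.start()
executor.run(lambda: print("hello from a Horovod worker"))
executor.shutdown()
```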

Changed

  • Moved to CMake version 3.13 with first-class CUDA language support and re-enabled parallelized builds. Uses a temporary installation of CMake if CMake 3.13 is not found. (#3261, #3371)
  • Moved released Docker images horovod and horovod-cpu to Ubuntu 20.04 and Python 3.8. (#3393)
  • Spark Estimator: Don't shuffle row groups if training data must not be shuffled. (#3369)
  • Spark/Lightning: Reduced memory footprint of async dataloader. (#3239)
  • Elastic: Improved handling of NCCL errors in elastic scenarios. (#3112)
  • Spark/Lightning: Do not overwrite model with checkpoint by default. (#3201)
  • Make checkpoint name optional so that the user can save to h5 format. (#3411)

Deprecated

  • Deprecated ElasticRayExecutor APIs in favor of the new RayExecutor API. (#3190)

Removed

  • Spark: Removed the h5py<3 constraint as it is no longer needed for TensorFlow >2.5.0. (#3301)

Fixed

  • Elastic Spark: Fixed indices in initial task-to-task registration. (#3410)
  • PyTorch: Fixed GIL-related deadlock with PyTorch 1.10.1. (#3352)
  • PyTorch: Fixed finalization of ProcessSetTable. (#3351)
  • Fixed remote trainers to point to the correct shared lib path. (#3258)
  • Fixed imports from tensorflow.python.keras with tensorflow 2.6.0+. (#3403)
  • Fixed Adasum communicator init logic. (#3379)
  • Lightning: Fixed resume logger. (#3375)
  • Fixed the checkpoint directory structure for pytorch and pytorch lightning. (#3362)
  • Fixed possible integer overflow in multiplication. (#3368)
  • Fixed the pytorch_lightning_mnist.py example. (#3245, #3290)
  • Fixed barrier segmentation fault. (#3313)
  • Fixed hvd.barrier() tensor queue management. (#3300)
  • Fixed PyArrow "list index out of range" IndexError. (#3274)
  • Elastic: Fixed all workers sometimes failing on elastic Horovod failure. (#3264)
  • Spark/Lightning: Fixed setting limit_train_batches and limit_val_batches. (#3237)
  • Elastic: Fixed ElasticSampler and hvd.elastic.state losing some indices of processed samples when nodes dropped. (#3143)
  • Spark/Lightning: Fixed history metrics for estimator serialization. (#3216)
  • Ray: Fixed RayExecutor to fail when num_workers=0 and num_hosts=None. (#3210)
  • Spark/Lightning: Fixed checkpoint callback dirpath typo. (#3204)

Process sets, XLA support, improved GPU backend

... (truncated)

Changelog

Sourced from horovod's changelog.

[v0.24.0] - 2022-03-01

... (truncated)

Commits
  • b089df6 Bump version to 0.24.0 (#3433)
  • db19aa4 Move apt-get into non-interactive mode (#3441)
  • 2632c05 Build Horovod with temporarily installed CMake if necessary (#3371)
  • 7bf9b04 Make checkpoint name optional so that user can save to h5 format. (#3411)
  • b553974 Fix flaky ray tests (#3430)
  • 7b5346e Fix indices in initial task-to-task registration (#3410)
  • 71e10b4 Fixing GPU and CPU TF head CI failures (#3431)
  • 79ded4b Fix FindNVTX.cmake (#3421)
  • 642a6b3 [TF - Fix] Fix imports from tensorflow.python.keras with tf.version >= 2....
  • 046c071 Allow stderr of executed cmake python code appear in logs (#3398)
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
  • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
  • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
  • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the Security Alerts page.

Bumps [horovod](https://github.com/horovod/horovod) from 0.22.1 to 0.24.0.
- [Release notes](https://github.com/horovod/horovod/releases)
- [Changelog](https://github.com/horovod/horovod/blob/master/CHANGELOG.md)
- [Commits](horovod/horovod@v0.22.1...v0.24.0)

---
updated-dependencies:
- dependency-name: horovod
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added the dependencies and python labels Apr 15, 2022
@chensuyue

Will fix in the internal repo first.

@chensuyue chensuyue closed this Apr 19, 2022
@dependabot dependabot bot commented on behalf of github Apr 19, 2022

OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let me know by commenting @dependabot ignore this major version or @dependabot ignore this minor version.

If you change your mind, just re-open this PR and I'll resolve any conflicts on it.

@dependabot dependabot bot deleted the dependabot/pip/examples/pytorch/image_recognition/torchvision_models/quantization/qat/eager/distributed/horovod-0.24.0 branch April 19, 2022 14:54
xin3he pushed a commit that referenced this pull request Feb 14, 2025
* [SW-204341] explicit scale format for ops

Added wrapper around fp8 functions

Wrapper decides which flavor of the function to call,
according to scale format

Helper modules call the wrapper

Decide which cast flavor to call,
according to scale format
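
A minimal sketch of the dispatch pattern this message describes, with hypothetical names (the real wrapper lives in the fp8 code and is not reproduced here):

```python
# Hypothetical sketch: a wrapper that picks the fp8 function flavor
# according to the scale format, as described above.
from enum import Enum

class ScaleFormat(Enum):
    SCALAR = "scalar"   # scale passed as a host scalar
    CONST = "const"     # scale held as a constant tensor

class FuncWrapper:
    """Helper modules call this instead of a concrete fp8 function."""
    def __init__(self, scalar_impl, const_impl):
        self._impls = {ScaleFormat.SCALAR: scalar_impl,
                       ScaleFormat.CONST: const_impl}

    def __call__(self, *args, scale_format=ScaleFormat.SCALAR, **kwargs):
        # decide which flavor (e.g. which cast) to call
        return self._impls[scale_format](*args, **kwargs)
```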

* [SW-204341] Adjust softmax API, remove commented-out code

* [SW-204341] Fixes from CR 1

* [SW-204341] Fixed CR 2

* [SW-204341] add missing arg in fsdpa

Signed-off-by: Uri Livne <ulivne@habana.ai>

* [SW-204341] Enhance SDPA for measure and quant

* [SW-204341] remove sdpa quantized ops

* reland per op class with more enhancements

* [SW-204341] reland specific arguments, rename class to wrapper

* added call with self in patched lm head

rebased on top of master next
force push

* fix mistake in conflict resolution

restore MethodType fix

* another fix

* modified fp8 matmul test to test quantized matmul func

* another fix of rebase mistake

* hopefully last rebase mistake fix

* restore backward compatibility import protection

---------

Signed-off-by: Uri Livne <ulivne@habana.ai>
yiliu30 pushed a commit that referenced this pull request Feb 14, 2025
* [SW-204341] explicit scale format for ops

XuehaoSun added a commit that referenced this pull request Feb 27, 2025
* [SW-210525] release HPU memory when loading neural_magic fp8 models (#48)

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>

* [SW-211178] save generation_config when saving model if exists (#57)

* [SW-211178] save generation_config when saving model if exists

---------

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>

* [SW-210543] update gitignore to simplify the git message (#50)

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>

* [SW-205334][SW-187731] llama70b vLLM fix graph breaks with torch.compile (#67)

* fix graph breaks with torch.compile

* remove orig_mod from helper_modules

* fix typos

* fix test_register_apis

---------

Co-authored-by: Rafal Litka <rlitka@habana.ai>

* [SW-213890] Disable test_two_step_layer_wise temporarily (#84)

* [SW-205437] - Support LM-HEAD patching (#79)

* [SW-205437] - Support LM-HEAD patching

* fix CR comments

* Enhance and rename fix_measurements tool to postprocessing_vllm_measurements (#82)

* [SW-214088] Fix graph break caused by PatchedMixtralMoE (#74)

* [SW-208528] Support FP8 per channel Q/DQ (#13)

* add per channel qdq support

Signed-off-by: changwang <changwang@habana.ai>

* improve ut

Signed-off-by: changwang <changwang@habana.ai>

* improve get_scale_dtype func and qdq init

Signed-off-by: changwangss <changwang@habana.ai>

* improve DequantOutput QuantInput init

Signed-off-by: changwangss <changwang@habana.ai>

* add scale_method and improve PCQ

Signed-off-by: changwangss <changwang@habana.ai>

* remove scale name

Signed-off-by: changwangss <changwang@habana.ai>

* fix PCQ scale_inv expanding

Signed-off-by: changwangss <changwang@habana.ai>

* merge qdq_per_channel and qdq_per_tensor into qdq

Signed-off-by: changwangss <changwang@habana.ai>

* move scale_inv change to the QuantInput init

Signed-off-by: changwangss <changwang@habana.ai>

* remove scale_dtype list check

Signed-off-by: changwangss <changwang@habana.ai>

* fix missing axis parameter

Signed-off-by: changwangss <changwang@habana.ai>

---------

Signed-off-by: changwang <changwang@habana.ai>
Signed-off-by: changwangss <changwang@habana.ai>
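
A minimal sketch of FP8 per-channel quantize/dequantize, assuming a PyTorch build with torch.float8_e4m3fn; the function and parameter names echo the commit wording but are otherwise hypothetical:

```python
import torch

def qdq_per_channel(t: torch.Tensor, scale: torch.Tensor, axis: int = 0) -> torch.Tensor:
    """Quantize to fp8 with one scale per channel along `axis`, then dequantize."""
    shape = [1] * t.dim()
    shape[axis] = -1                       # broadcast the per-channel scales
    s = scale.reshape(shape)
    q = (t / s).to(torch.float8_e4m3fn)    # quantize
    return q.to(t.dtype) * s               # dequantize
```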

* [SW-204341] explicit scale format for ops (#73)


* [SW-213890] Revert "[SW-213890] Disable test_two_step_layer_wise temporarily (#84)" (#86)

This reverts commit 27162ae.

* Revert "[SW-205334][SW-187731] llama70b vLLM fix graph breaks with  torch.com…" (#87)

This reverts commit 01a5734.

Co-authored-by: Danny Semiat <dsemiat@habana.ai>

* [ALGO-809] PatchedLmHeadLinearAllreduce: replacing the sharding code with the one from deepspeed-fork (#85)

Change-Id: Icb9670cfefdd1880c1ebb9a804a97c9ba79ecdc3

Co-authored-by: smarkovichgolan <smarkovich@habana.ai>

* fix bug where FusedMoE object has no attribute w13_weight (#94)

Signed-off-by: yuwenzho <yuwen.zhou@intel.com>

* [SW-208588] Add HPU fp8 Dynamic MOE (#88)

* [SW-208588] Add HPU fp8 Dynamic MOE

* fix review comments

* fix more review comments

* fix comments

* fix tests

* minor config fixes (#96)

* [SW-0] minor cosmetic fixes in quant_config

* remove hooks

* [SW-196641] - Fix type mismatch in linear quantization unit tests (#99)

* [SW-196641] - Fix type mismatch in linear quantization unit tests

* fix atol value

* add hp_dtype to fp8 config dict before parsing

* [SW-214785] Apply PatchedModuleBase for all existing PatchedModules (#92)

* [SW-214785] Apply PatchedModuleBase for all existing PatchedModules

Signed-off-by: Xin He <xinhe3@habana.ai>

---------

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>

* [SW-215319] threshold of memory usage in test_block_wise.py is too tight (#100)

* [SW-215543] Revert "minor config fixes (#96)" (#104)

This reverts commit fa40142.

* fix RowParalleLinear func names from string to tuple (#106)

* [SW-215615] memory is not released when loading neural_magic models on multiple cards (#105)

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>

* [SW-212423] RuntimeError when load the gptq model from HF (#70)

* [SW-212423] RuntimeError when load the gptq model from HF
* skip tie_word_embeddings=False

Signed-off-by: Xin He <xinhe3@habana.ai>

---------

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>

* [SW-214785] fix issue when self._mod_extra_config is None (#108)

* [SW-211826] [example] demonstrate layer-wise, block-wise and lm_eval usage (#66)

* [SW-211826] [example] demonstrate layer-wise&block-wise usage to quantize LLM with limited host&device memory

Signed-off-by: Xin He <xinhe3@habana.ai>

---------

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>

* [SW-215295] Force single object from quantized func wrapper classes (#103)

* [SW-215295] Force single object from quantized func wrapper classes

* Modify the factory object to be cleared after module patching

* Move cleanup to Quantizer object
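
A hypothetical sketch of the "single object per wrapper class" idea, with the cleanup hook the bullets above mention; not the repository's actual factory:

```python
class QuantizedFuncWrapperFactory:
    _cache = {}

    @classmethod
    def get(cls, wrapper_cls, *args):
        key = (wrapper_cls, args)               # one instance per class + args
        if key not in cls._cache:
            cls._cache[key] = wrapper_cls(*args)
        return cls._cache[key]

    @classmethod
    def clear(cls):
        # per the bullets above, cleanup runs once module patching finishes,
        # triggered from the Quantizer object
        cls._cache.clear()
```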

* [SW-216292] Minor update for lm-eval (#113)

* Enable lm-eval 0.4.2 and expose `add_bos_token`

---------

Signed-off-by: Yi Liu <yiliu4@habana.ai>
Co-authored-by: Yi Liu <yiliu4@habana.ai>

* [SW-209207] add vllm fp8 dynamic MoE (#116)

* [SW-216239] Align Softmax fp8 scale calc with configuration (#112)

* [SW-217321] Skip auto round tests (#119) (#125)

* Test Commit

* [SW-217321] Skip auto round tests due to CI breakage

* remove unneeded print

* [SW-207451] Implement block-wise calibration for LLM (#24)

For LLMs, measurement in bf16 requires high HPU memory usage.
This change helps measure bf16 llama-405b on 8 Gaudi2 cards, or llama-70b on 1 Gaudi card.
Limitation: the lm_head layer cannot be measured yet; this may be enhanced later.

---------

Signed-off-by: Xin <xin3.he@intel.com>
Co-authored-by: Xin He <xinhe3@habana.ai>
Signed-off-by: Xin He <xinhe3@habana.ai>
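
A minimal sketch of the block-wise idea under hypothetical names: one transformer block at a time is moved to the device, run so the measurement observers record stats, then evicted to bound memory (which is also why lm_head, outside the loop, is not measured):

```python
import torch

def measure_block_wise(blocks, hidden_states, device="hpu"):
    for block in blocks:                      # e.g. the model's decoder layers
        block.to(device)                      # load one block onto the device
        with torch.no_grad():
            hidden_states = block(hidden_states)[0]   # observers measure here
        block.to("cpu")                       # free device memory for the next block
    return hidden_states
```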

* [SW-197077] fix bug in output arbitrary scales (#45)

* [SW-197077] fix bug

* [SW-197077] fix bug in outputs arbitrary scales

Signed-off-by: Xin He <xinhe3@habana.ai>

* [SW-197077] fix bug in output arbitrary scales (#45)

* [SW-197077] fix bug

* [SW-197077] fix bug in outputs arbitrary scales

* [SW-210500] [Optimum-Habana] [Regression] [fp8] [INC] No generated text for llava models [llava-1.5-7b-hf] [llava-1.5-13b-hf ] (#54) (#77)

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>

* [SW-213236] resolve CPU mem issue in CI (#76) (#83)

Cherry-pick from 1.19
Co-authored-by: Xin He <xin3.he@intel.com>

* [SW-213368] requirements_pt.txt: allow newer pydantic versions to >= 1.10.13 (#80)

* requirements_pt.txt: upgrade pydantic version to >= 2.0.0

* allow newer version of pydantic

newer deepspeed uses pydantic v2, which has slightly different APIs.

* Update requirements_pt.txt

* [SW-212057] Enable scalar scale to support QDQ (#98)

* [SW-212057] Enable scalar scale to support QDQ

Change-Id: Ib5f5accd7a770675609e91c18bd04497b15937c5

* PR comment fixes

Change-Id: I01be41c29721b8d59c887f3d2b4e3cef8433331c
Signed-off-by: Xin He <xinhe3@habana.ai>

* [SW-215845] Run some unit tests from top level API (#109)

Signed-off-by: Xin He <xinhe3@habana.ai>

* [SW-212629] Support saving weight-only quantization INT4 model in Hugging Face format (#101)

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>
Signed-off-by: Xin He <xinhe3@habana.ai>

* [SW-205970] update state_dict to save scalar scales (#6)

* update state_dict method in save/load function

---------

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>
Signed-off-by: Xin He <xinhe3@habana.ai>

* Revert "[SW-205970] update state_dict to save scalar scales (#6)" (#114)

This reverts commit ffcb97e.

* [SW-212092] Save vllm compatible format (#102)

* save vllm compatible format

Signed-off-by: changwangss <changwang@habana.ai>

* add assertion and make max_file_size human-readable

Signed-off-by: changwangss <changwang@habana.ai>

* support defaulting to the same behavior as huggingface when saving

Signed-off-by: changwangss <changwang@habana.ai>

* separate save function for single device and multiple devices.

Signed-off-by: changwangss <changwang@habana.ai>

* rebase

Signed-off-by: changwangss <changwang@habana.ai>

* rebase save

Signed-off-by: changwangss <changwang@habana.ai>

* remove weight and scale convert on G2

Signed-off-by: changwangss <changwang@habana.ai>

* rebase master_next due to revert #6

Signed-off-by: changwangss <changwang@habana.ai>

* improve the function converting weights to a vllm-compatible format

Signed-off-by: changwangss <changwang@habana.ai>

* replace print with logger

Signed-off-by: changwangss <changwang@habana.ai>

* move unit_mapping to common utils

Signed-off-by: changwangss <changwang@habana.ai>

---------

Signed-off-by: changwangss <changwang@habana.ai>
Signed-off-by: Xin He <xinhe3@habana.ai>
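
A hypothetical sketch of the human-readable max_file_size handling the bullets above mention (the unit_mapping name and accepted units are assumptions):

```python
UNIT_MAPPING = {"KB": 2**10, "MB": 2**20, "GB": 2**30}

def parse_max_file_size(size: str) -> int:
    """Turn a value like '5GB' into a byte count for shard splitting."""
    size = size.strip().upper()
    for unit, factor in UNIT_MAPPING.items():
        if size.endswith(unit):
            return int(float(size[: -len(unit)]) * factor)
    return int(size)  # assume a plain byte count
```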

* [SW-205970] update state_dict to save scalar scales (#115)

* [SW-205970] update state_dict to save scalar scales (#6)

* update state_dict method in save/load function

* support mixtral
---------

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>

* [SW-215009] support loading per-channel scales (#95)

* [SW-215009] support loading per-channel scales

Signed-off-by: Xin He <xinhe3@habana.ai>

* fix UT

Signed-off-by: Xin He <xinhe3@habana.ai>

---------

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>

* Refactoring scales (#22) (#122)

* Refactoring scales (#22)

* [SW-197077] refactoring maxabs scales and adding arbitrary scales.

* [SW-199696] Supporting Dynamic Quantization (#128)

* Calculating dynamic scales using nn.Modules

Change-Id: I8c344ae737803b39117037edaaa3d3b9cbd09f30

* [SW-199696] Supporting Dynamic Quantization

Change-Id: Ic5d6f04ec0b5032ac305e1b3097747c47250385b

* Code cleanup

Change-Id: I213bc7438e06bd1002775066bfb0dc6f10e8a84a

* Review changes and model print issue (circular dependency fix)

Change-Id: I5c41d2f9a937416ce260f55cb045c86858dd201a

* removed debug code from patching_common.py

* Round 2 + CI import issue

Change-Id: I27dbb33de8e027fb0b726336b38156b5d23a6896
Signed-off-by: Xin He <xinhe3@habana.ai>
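
A minimal sketch of calculating a dynamic scale inside an nn.Module, as the commit describes; all names are hypothetical and the fp8 max is an assumption (448 for e4m3):

```python
import torch
from torch import nn

class DynamicMaxAbsScale(nn.Module):
    """Recompute a max-abs scale on every forward call (dynamic quantization)."""
    def __init__(self, fp8_max: float = 448.0):
        super().__init__()
        self.fp8_max = fp8_max

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # unlike calibrated (static) scales, this depends on the live input
        return x.abs().amax().clamp(min=1e-12) / self.fp8_max
```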

* [SW-217334] enable fp8 qdq mode using PatchedModuleBase (#129)

* [SW-217334] enable fp8 qdq mode using PatchedModuleBase

* fix review comments

* [SW-218871] fp8 multi-cards is not loaded correctly (#138)

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>

* Fix bug in mixtral unitscale (#141)

* [SW-218197] fix bug in Mixtral unitscale

* [SW-218197] fix bug in Mixtral unitscale

* update version to 3.3 for release

Signed-off-by: Xin He <xinhe3@habana.ai>

* [SW-20808] Make sure save&load format is an Enum object (#58)

* [SW-20808] Make sure save&load format is an Enum object

Signed-off-by: Xin He <xinhe3@habana.ai>

* Update save_load_entry.py

---------

Signed-off-by: Xin He <xinhe3@habana.ai>
Co-authored-by: Xin He <xinhe3@habana.ai>
Signed-off-by: Xin He <xinhe3@habana.ai>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add xfail for torchvision

Signed-off-by: Xin He <xinhe3@habana.ai>

* fix ILITV-3859

Signed-off-by: xin3he <xin3.he@intel.com>

* workaround for ILITV-3858

Signed-off-by: xin3he <xin3.he@intel.com>

* fix sdxl_smooth_quant

Signed-off-by: xin3he <xin3.he@intel.com>

* fix ILITV-3854

Signed-off-by: xin3he <xin3.he@intel.com>

---------

Signed-off-by: Xin He <xinhe3@habana.ai>
Signed-off-by: changwang <changwang@habana.ai>
Signed-off-by: changwangss <changwang@habana.ai>
Signed-off-by: Uri Livne <ulivne@habana.ai>
Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Signed-off-by: Yi Liu <yiliu4@habana.ai>
Signed-off-by: Xin <xin3.he@intel.com>
Signed-off-by: xin3he <xin3.he@intel.com>
Co-authored-by: Xin He <xinhe3@habana.ai>
Co-authored-by: RafLit <rafal.litka@intel.com>
Co-authored-by: Rafal Litka <rlitka@habana.ai>
Co-authored-by: Dany Kiazada <141814181+kiazada@users.noreply.github.com>
Co-authored-by: Nir David <124874956+nirda7@users.noreply.github.com>
Co-authored-by: Yuwen Zhou <yuwen.zhou@intel.com>
Co-authored-by: Wang, Chang <changwang@habana.ai>
Co-authored-by: Uri Livne <ulivne@habana.ai>
Co-authored-by: Oz Abramovich <oabramovich@habana.ai>
Co-authored-by: Dudi Lester <160421192+dudilester@users.noreply.github.com>
Co-authored-by: Danny Semiat <dsemiat@habana.ai>
Co-authored-by: smarkovichgolan <smarkovich@habana.ai>
Co-authored-by: Yi Liu <yi4.liu@intel.com>
Co-authored-by: Yi Liu <yiliu4@habana.ai>
Co-authored-by: Linoy Buchnik <linoybu@gmail.com>
Co-authored-by: Nadav Elyahu <88962733+nelyahu@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <suyue.chen@intel.com>
Co-authored-by: Sun, Xuehao <xuehao.sun@intel.com>