Releases: alibaba/heterogeneity-aware-lowering-and-optimization

IPU_STABLE_SDK_2.2.2_v3

03 Nov 04:18
5cf3b6f

close floating point check in popart (#665)

close floating point check in popart

create pipeline resources when the computation is created, not when the cache computation item is set

fix pixel bert detect application runtime error

add weiming build script fix
Co-authored-by: yanwei yw01041751@alibaba-inc.com

Combine config cache (#659) (#663)

fix poplar sdk path symbol

deserialize config in cache file

Add error handling codes

Co-authored-by: yanwei yw01041751@alibaba-inc.com
Co-authored-by: gcuser jackz@alibaba-inc.com
Co-authored-by: gcuser gcuser@alibaba-inc.com
(cherry picked from commit 0b006c0)

Co-authored-by: yanwei-gr 64010848+yanwei-gr@users.noreply.github.com

IPU_STABLE_SDK_2.3.0_v2

28 Oct 08:56
356f0ec
update popart api to sdk2.3 & remove custom erf ,already use popart o…

…… (#655) (#656)

* update popart api to sdk2.3 & remove custom erf, already use popart origin erf

Co-authored-by: yanwei <yw01041751@alibaba-inc.com>
(cherry picked from commit 4d23a958f4cd68fd2e98073af286520fbae87337)

Co-authored-by: yanwei-gr <yw01041751@alibaba-inc.com>

IPU_STABLE_SDK_2.3.0_v1

28 Oct 05:02
356f0ec

v0.7.2

01 Oct 16:26

This release contains the following major changes since v0.7.1:

  • Enhance ops support, including

    • RNN, GRU, LSTM
    • More arithmetic ops and logical ops
  • ODLA runtime library supports TensorRT 8.0.3

  • Initial Python interface support

  • Bug fixes

IPU_STABLE_SDK_2.2.2_v2

14 Sep 18:32
7f358b9
cherry-pick master code to the SDK2.2.2 branch (#559)

* Add Custom Op for Yolov3 Post Process (#512)

* add custom op for yolov3

* reset submodule onnx

* reset tensorrt

* delete build

* merge odla_ops_nn

* modify for passing link-check

Co-authored-by: gcuser <jackz@graphcore.ai>
(cherry picked from commit 5847cd338e12b7154107ea0346b113605bb1223b)

* ODLA popART pipeline function (#522)

* First runnable with single thread & test context

* mnist runnable demo to test the pipeline

* multi thread put the data to the session run

* simple bash to compile and run test

* An example of how to use the callback in pipeline

* multi threads using local Ctx

* Can run with pipeline setting in onnx file

* Refactored and add no pipeline multi thread

* Move codes to the odla_pipeline.h .cc

* Make single empty/zero data, and delete context for empty data after get result

* Add mutex to serialize the compute requests

* Merge the changes for attention mask & previous changes

* test codes for time

* Change the CMakeList to make the pipeline.cc and new custom op compiled

* Successfully run on 24L with attention mask custom OP

* custom op attention_mask test code

* Add name scope to each node in model

* Try throughput test with MLPerf model

* only set AMP on feed forward matmul

* Run the online pipelining with config hard coded to the config read class

* Compile with SDK 2.2 with pipeline online setting

* Add config file for pipeline stage setting

* Run pipeline with similar performance of popart

* change some names & make AMP all 0.445

* Add amp parameter in config file

* Detach device and clear session when DestroyComputation

* Make the batch_per_step take effect on execution mode SEQUENCE to pass enough size of data

* Add the new lock free queue and logging

* Fix bug on empty data visit counter

* delete the empty context

* add some pipeline sync

* Make thread sleep for 5 ms when no task in the queue

* change the size() of LockFreeQueue to tail-wait

* [CI] make the call by main can work with npz files

* Move the computation init to create context

* Add common functions to common.h and common.cc

* move the computation init out

* Move common functions to the test folder

* Test the config of ODLA popART and make no configuration act as before

* Add tests for calling model.cc

* Add FP32 to save as result

* Some changes on LockFreeQueue and tests

* Fix the rsqrt wrong problem, and remove std cout&cerr to avoid crash

* fix the accuracy problem of large bps

* Add thread check for context & computation holding to avoid conflicts

* Add the batch tools to help on the test to generate model, build and run

* Decreasing the empty data put

* temporary commit to migrate crashed system

* set pipeline information on fly
change the mixed style of class member
add debug setting and default to false to make the opts set by api
remove the old pipeline set api

* Fixed the mixed code style and removed redundant codes

* Remove the function test codes of the odla_popart

* remove some redundant codes and files

* Changed the CACHE STRING to CACHE PATH

* move ENGINE_CACHE_PATH to odla_popart.cc

* format the codes with clang-format-9 -i command

* Move json.hpp to third party

* Set virtualgraph for model not using pipeline in set_session_opts

* Add virtual graph attribute when _odla_computation constructed

* Check the shape before extends it with batches_per_step

Co-authored-by: gcuser <gcuser@alibaba-inc.com>
(cherry picked from commit 6095bdf246c3a4d9d686f2802cb6955cb7d70f79)

* fix on default configuration & computation destruction

(cherry picked from commit 40b9fc840e76ed139d6038bc72f7cd4da03a7b52)

* definitions for static variables

(cherry picked from commit 18e0e83a9b4721624c291777c02fbecf189350fb)

* disable test case test_constant_popart.cc

Co-authored-by: Zars19 <1036473307@qq.com>
Co-authored-by: jackzipu <74961298+jackzipu@users.noreply.github.com>
Co-authored-by: gcuser <jackz@graphcore.ai>

IPU_STABLE_SDK_2.2.2_v1

08 Sep 06:24
Pre-release
code review

(cherry picked from commit b9f8a69edd6f71d8e645311d01f3f1ac386d535d)

IPU_STABLE_SDK_2.1.0_v3

07 Sep 03:01
Pre-release
  • Fix axis attribute for reduction instrs

IPU_STABLE_SDK_2.1.0_v2

04 Sep 01:33
  • add constant decombine pass

v0.7.1

23 Aug 16:09

This release contains the following major changes since v0.7.0:

  • Enhance ONNX ops with ODLA/DNNL runtime library, including:
    • reduction ops
    • arg min, arg max ops
    • Hardmax op

  • Improve "double" data type (FP64) support

  • Switch to LLVM 12.0.0

  • Improve error handling for ODLA APIs and code generation.

IPU_STABLE_SDK_2.1.0_v1

27 Aug 16:58
[CodeGen] Check odla status after odla API calls
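
For illustration only, the general pattern of checking a status after every runtime API call might look like the sketch below. The `Status` enum, `FakeCreate`, `FakeExecute`, `RETURN_IF_ERROR`, and `RunModel` are all hypothetical names made up for this example; they are not the real ODLA declarations, and the code the generator actually emits may differ.

```cpp
#include <cstdio>

// Hypothetical stand-ins for a runtime status type and API calls.
// These are NOT the real ODLA declarations.
enum Status { kSuccess = 0, kFailure = 1 };

Status FakeCreate(bool ok) { return ok ? kSuccess : kFailure; }
Status FakeExecute(bool ok) { return ok ? kSuccess : kFailure; }

// Evaluates `expr` once; on failure, logs the failing call and returns its
// status from the enclosing function.
#define RETURN_IF_ERROR(expr)                                \
  do {                                                       \
    Status status_ = (expr);                                 \
    if (status_ != kSuccess) {                               \
      std::fprintf(stderr, "%s failed: %d\n", #expr,         \
                   static_cast<int>(status_));               \
      return status_;                                        \
    }                                                        \
  } while (0)

// Generated-code-style call sequence: stops at the first failing call.
Status RunModel(bool create_ok, bool execute_ok) {
  RETURN_IF_ERROR(FakeCreate(create_ok));
  RETURN_IF_ERROR(FakeExecute(execute_ok));
  return kSuccess;
}
```

Returning the first failing status, rather than continuing, keeps later calls from running against a computation that was never set up correctly.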