CXX API Todo #3

Closed
tqchen opened this issue Oct 20, 2016 · 0 comments

Comments

@tqchen
Member

tqchen commented Oct 20, 2016

  • Tensor
    - InferInputDomain @icemelon9
  • Schedule @tqchen
  • Split @tqchen
  • Buffer
  • Codegen
@tqchen tqchen closed this as completed Oct 27, 2016
tqchen referenced this issue in tqchen/tvm May 26, 2018
tqchen added a commit that referenced this issue May 29, 2018
tqchen added a commit that referenced this issue May 29, 2018
tqchen referenced this issue in tqchen/tvm Jul 6, 2018
grwlf referenced this issue in grwlf/tvm Aug 8, 2018
jroesch referenced this issue in jroesch/tvm Aug 29, 2018
* Update TVM Version

* Fix Lint and Jenkins

* fix workspace

* fix doxygen complaint

* move test to nose
weberlo added a commit to weberlo/tvm that referenced this issue Jun 20, 2019
* Reorder LowLevelDevice interface

* Store shared ptr to session in all alloced objects

* Rename `micro_build` to `build`
prashantsail pushed a commit to prashantsail/incubator-tvm that referenced this issue May 14, 2020
Merge back from upstream(apache) to origin(local dev)
jcf94 added a commit to jcf94/tvm that referenced this issue Jun 22, 2020
* Split transform_step out

* Update GetProducers & GetConsumers

* Update UTs

* Add UT for CacheReadWrite & Some bug fix
tqchen pushed a commit that referenced this issue Jul 15, 2020
…generating (#5962)

* Code migration Start (#1)

* Init commit: Code migration Start

* Add loop_state.cc/h

* Add ComputeDAG basic test

* Split transform_step out & Update more UTs (#3)

* Split transform_step out

* Update GetProducers & GetConsumers

* Update UTs

* Add UT for CacheReadWrite & Some bug fix

* Add search_task, measure and serialization (#4)

* Add FollowSplit & FollowFusedSplit tests

* Update dag.InferBound & its UT

* Add search_task, measure and serialization

* Update Serialization UT

* Add MetaTileRewritePolicy (#5)

* Add feature

* Add cost_model, meta_tile_rewrite_policy

* Add MetaTileRewritePolicy basic UT

* Basic Python API for State (#6)

* Add Basic Python API for State

* Add UTs for State

* Add Python API: Measure & Task (#7)

* Update the return value of state operation

* Add task

* Copy measure.py & utils.py

* Fix LocalBuilder

* Fix LocalRunner

* Add ansor.auto_schedule() API; First AutoSchedule working version (#8)

* Add basic Python support for ansor.auto_schedule

* Update AutoSchedule API

* Bug fix for getting the attach point of a fused iter

* Update UT after infer bug fix

* Bug fix & Add python serialization API (#10)

* Delete C++ UT hack since Python is ready

* Add ndarray.non_empty

* Update Serialization python API

* Improve code style, python wrapper and test cases (#11)

* Update c++ code style and unit test

* Update python State wrapper and test cases

* fix unit tests

* Add RPCRunner & OpenCL/CUDA test (#12)

* Add RPCRunner & OpenCL search test

* Add CUDA search test

* Add RPCRunner test

* rebase to upstream/master

* Add Ansor basic tutorial (#13)

* Add basic tutorial

* migrate feature extraction (#14)

* Add XGBModel & RPCRunnerWarpper (#15)

* Add XGBModel & RPCRunnerWarpper

* Revert "Add Parallel Granularity Mutation"

* Migrate workload_registry.py (#16)

* add workload registry

* update

* update

* add task scheduler (#17)

* Add conv2d cuda tutorial with workload registry (#18)

* add tune_test.py (the old tune_wkl.py) (#19)

* add tune_test.py (the old tune_wkl.py)

* update

* fix measure

* fix for gpu

* Code refine for tune_test.py & Add a pre load callback (#20)

* Bug fix for tutorials

* Add PreLoadMeasuredStates

* Add search_callback support for task tuner

* Code refine for tune_test.py

* Update

* Update

* Update

* Update

* Bug fix

* Add python custom sketch rule (#21)

* Add custom sketch rule

* Bug fix

* Ansor Relay Integration (without layout rewrite) (#22)

* relay integration

* Add tune_op_subgraph.py & Some code clean for tune_network.py (#23)

* Add single op tune scripts

* Add tune subgraph support

* Merge all op & all subgraph to one file

* Rename file

* add explicit_unroll_max_extent (#25)

* Add Index simplification & API update (#26)

* Add vectorized cooperative_fetching test

* Update math simplify for vectorized CF

* File rename

* Update tune_network

* API update

* Update PreLoadMeasuredStates & Some bug fix (#27)

* Add a threading wrapper to fix the test bug

* Set default TVM_USE_AUTO_SCHEDULER to false

* Update PreLoadMeasuredStates callback

* Add tensorize step for loop_state (#31)

* Add tensorize step

* State python api update (#33)

* Start to update api

* Add compute_dag to state

* API update

* kernel layout rewrite (#28)

* kernel layout rewrite

* remove some hacks

* add defuse_ops pass and move kernel_layout_rewrite pass after fuse_ops pass

* set TVM_RELAY_DISABLE_BUILD_CACHE for task extraction and prepare_layout_rewrite

* [cache flush] port cache flush to ansor (#32)

* Improve relay integration (#34)

* tmp checkpoint

* Improve relay integration

* Improve relay integration

* Fix xgb error & Simplify dispatcher (#35)

* Rename "MetaTileRewritePolicy" to "SketchPolicy". (#36)

* Rename "MetaTileRewritePolicy" to "SketchPolicy".

* Add a new class for auto_unroll_max_step, storage_offset in StageNode

* fix tune_op_subgraph.py

* rebase

* Migrate all node::make to noderef's construct function (#37)

* Start to move xxxnode::make to noderef()

* Update

* Update

* Finish transform_step

* Finish compute dag & auto schedule

* Update

* Update

* Update

* Update

* Update

* Code refine

* Code refine

* Code refine

* Update

* Update

* Some lint fix & Recover the double constructor of tvm::PrimExpr (#39)

* lint fix

* clang-format-fix

* pylint fix

* Update

* Recover the double constructor of tvm::PrimExpr

* Fix pylint

* pylint fix

* pylint fix

* Add MutateComputeLocation and MutateParallel in evolutionary search (#40)

* Add MutateComputeLocation and MutateParallel in evolutionary search

* fix lint

* Improve loop state python API (stage_tensors -> stage_ops) (#41)

* improve loop state python API (stage_tensors -> stage_ops)

* fix

* ComputeDAG bug fix & Add Custom TensorCore Matmul Example (#42)

* Bug Fix

* Sample example of Custom TensorCore Matmul

* Revert Commits, Start to build minimum Ansor system

* Code clean for minimum Ansor system

* Bug fix & Delete AccessAnalyzer

* Delete attachmap & Code clean

* Doc update

Update statenode::stages from vector to Array

* Headfile update & Python doc update

* clang-format fix

* pylint fix

* Update

* Doc update

* Update

* Bug fix after code merge to the new master

* clang-format fix

* Update

* Update

* Update std::vector to Array; Update verbosity setting; Some comments addressed

* std::vector->Array & std::string->String

* Add init_state to ComputeDAG

* Update

* Update some unordered_map to Map

* clang-format fix

* Comments addressed
Delete ReplayAndInferBound
Delete ReplaySteps & InferBoundCommon

* Lint fix

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Update

* Rename ansor namespace to auto_schedule

* Update

* Rename ThreadPool to ParallelFor

* Add parallel_for

* Remove ThreadPool

* Update python/tvm/auto_schedule/auto_schedule.py

* trigger CI

Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: Minmin Sun (孙敏敏) <minmin.smm@alibaba-inc.com>
Co-authored-by: Zhao Wu <zhaowu@apache.org>
weberlo pushed a commit to weberlo/tvm that referenced this issue Aug 6, 2020
add more common cfg to DefaultOptions
CloudManX pushed a commit to CloudManX/incubator-tvm that referenced this issue Sep 15, 2020
rohanmukh added a commit to anijain2305/tvm that referenced this issue Oct 2, 2020
* InferType fix

* Skip annotation passes for non-main funcs

Co-authored-by: Rohan Mukherjee <rohan.mukherjii@gmail.com>
zhiics pushed a commit that referenced this issue Oct 3, 2020
* Change onnx importer to use dynamic upsampling3d (#3)

fix pylint

* Refactor ONNX frontend to be dynamic

Make OneHot dynamic

Support BatchMatMul with dynamically shaped inputs

fix dynamic broadcast

Add null checks to broadcast_to rel functions

fail more isolated broadcast_to test

use StructuralEqual instead of pointer comparisons in dynamic_to_static pass

add an optional weight freeze argument to onnx importer

convert onnx resize to dynamic op

add dynamic expand to onnx importer

add a shape_func for power

fix BERTSquad, lint

handle onnx graph initializer parameters more intelligently

* Dynamic ONNX importer: Upsampling and Pad (#2)

fix lint

fix Call reference

fix a type issue with expand

fix a bad test refactor

respond to review comments, fix batch matmul tests

* black format

* fix batch matmul test

* add dynamic strided slice to the onnx importer

* fix clip importer

* fix qnn tutorial

* fix bad merge, respond to review comments

* add a simple dynamic model test

* Add dynamic-shaped autopadding to convolution and pooling ops

* fix dynamic issues in a few ops

* fix pylint

* disable tests onnxrt doesn't support

* fix pytorch test

* respond to review comments

* add documentation about partially supporting dynamic shapes

Co-authored-by: Lily Orth-Smith <lorthsmith@octoml.ai>
CloudManX pushed a commit to CloudManX/incubator-tvm that referenced this issue Oct 30, 2020
ZihengJiang referenced this issue in ZihengJiang/tvm Nov 26, 2020
* Changes for TF/PT Rn50

* Refactoring

* Comments
wyanzhao pushed a commit to wyanzhao/incubator-tvm that referenced this issue Dec 9, 2020
* [CI] Add vta path to cpptest

* [DOCS] Points docs to the ASF site
monklof pushed a commit to monklof/incubator-tvm that referenced this issue Jan 22, 2021
…m_data:master to master

* commit 'cd0d52daa6942bdafa9363ff6cfa3d25fcd5b8d6': (824 commits)
  [Intrinsic] Add log1p, ldexp, atan2, hypot, nextafter, copysign (apache#5312)
  [Rust][CI] Restore Rust CI (apache#5137)
  Remove PrimExpr from String (apache#5311)
  [Requantize] Cleanup and Optimize Lowering (apache#5286)
  [IR][TRANSFORM] Enable CopyOnWrite for passes. (apache#5309)
  [PYTORCH]Abs, Arange, Softplus ops (apache#5295)
  [LLVM] Fix generation of LLVM intrinsics (apache#5282)
  [BYOC] Add example of Composite + Annotate for DNNL fused op (apache#5272)
  [Frontend][TensorFlow]Improve TensorFlow Static Shape Tensor Array (apache#5243)
  [RUNTIME] Introduce RValue reference(move) support to TypedPackedFunc (apache#5271)
  [RELAY][FRONTEND][CAFFE2] add Mul and ConvTranspose operator (apache#5302)
  [BYOC] Refine AnnotateTarget and MergeCompilerRegion Passes (apache#5277)
  [CI] Fix the hexagon string (apache#5304)
  [Arith] linear system and equation solver (apache#5171)
  [PYTORCH]Repeat, Reciprocal & Reshape Op support (apache#5280)
  [FRONTEND][TENSORFLOW] Fix gather_nd indices (apache#5279)
  Update device_annotation.cc (apache#5291)
  [REFACTOR][IR] Move to runtime::String (apache#5276)
  [NDArray] Set shape_ in NDArray::FromDLPack (apache#5301)
  [RUNTIME] Initial implementation of Hexagon runtime support (apache#5252)
  ...
Hzfengsy pushed a commit to Hzfengsy/tvm that referenced this issue Feb 19, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overridden In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>

fix

some fixes

fix test
zxybazh added a commit to zxybazh/tvm that referenced this issue Feb 21, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overridden In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (apache#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Import & Cache Mechanism (apache#26)

[BugFix] Fix Winograd Test Script (apache#25)

Add task extraction & caching (apache#27)

A few fixes for task extraction (apache#28)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh added a commit to zxybazh/tvm that referenced this issue Feb 21, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (apache#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh added a commit to zxybazh/tvm that referenced this issue Feb 21, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (apache#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Import & Cache Mechanism (apache#26)

[BugFix] Fix Winograd Test Script (apache#25)

Add task extraction & caching (apache#27)

A few fixes for task extraction (apache#28)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh added a commit to zxybazh/tvm that referenced this issue Feb 22, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (apache#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Import & Cache Mechanism (apache#26)

[BugFix] Fix Winograd Test Script (apache#25)

Add task extraction & caching (apache#27)

A few fixes for task extraction (apache#28)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
zxybazh added a commit to zxybazh/tvm that referenced this issue Feb 22, 2022
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)

Removing 2 unit tests for software pipelining (apache#562)

[MemHammer] Lower Pass + Unittests (apache#557)

Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564)

Fix Sketch Generation Unittests (apache#565)

speed up VerifyGpuCode (apache#568)

[Performance Align] fixing codegen problems (apache#569)

[Meta schedule] improve search space (apache#1)

Hot fix for bound predicate (apache#3)

[Meta Schedule] Update Tune Relay (apache#4)

[Performance Align] fixing codegen problems (apache#5)

[PerfAlign] NRM & SFM on Raspi Aligned (apache#6)

[BugFix] Apply bound predicate directly to loops when possible (apache#12)

[BugFix] Fix CrossThreadReduction on CUDA (apache#13)

[MetaSchedule] Enable BertTuning with MetaScheduler (apache#11)

[Minor][MemHammer] Minor tweaks in code review (apache#14)

[Meta Schedule] Add customizable search space to PostOrderApply. (apache#16)

Fix cooperative fetching (apache#17)

Fixes for codegen (apache#18)

[Hotfix] A unittest (apache#19)

Fix for GRP sketch gen (apache#21)

Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20)

[BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22)

[MemHammer][Refactor] Code Review (apache#15)

[Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24)

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this issue Feb 26, 2022
* Implementation of call_dps.

* Implementation of PackedFuncExpr.

* Test CallDPS for TIR function.

* Rename.

* Add header and comments.

* Update.

* Address comments.
MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this issue Mar 3, 2022
* Implementation of call_dps.

* Implementation of PackedFuncExpr.

* Test CallDPS for TIR function.

* Rename.

* Add header and comments.

* Update.

* Address comments.
zxy844288792 pushed a commit to zxy844288792/tvm that referenced this issue Mar 4, 2022
…-out (apache#8010)

* [UnitTests] Explicitly list tests that were enabled by TVM_TEST_TARGETS but were skipped

Previously, these were removed by a filter in
tvm.testing._get_targets(), and weren't listed at all.  With this
change, they are instead removed by pytest.skipif, and show up as
explicitly skipped tests in pytest's summary when using
tvm.testing.parametrize_targets.

* [UnitTests] Automatic parametrize_targets for tests that use (target,dev)

Should make it easier to convert tests from using
tvm.testing.enabled_targets to use pytest's parametrized tests
instead.

* [UnitTests] Added ability to explicitly exclude a target from a particular test

Uses tvm_exclude_targets variable, which can be set (1) in the
conftest.py to apply to a test directory, (2) in a test script to
apply to that module, or (3) on an individual test function to apply
to it.  The @tvm.testing.exclude_targets decorator is provided for
readability in case (3).

* [UnitTests] Refactored test_topi_relu.py to use pytest.mark.parametrize

* [UnitTests] Added tvm_known_failing_targets option for the unittests.

Intended to mark tests that fail for a particular target, and are
intended to be fixed in the future.  Typically, these would result
either from implementing a new test, or from an in-progress
implementation of a new target.

* [UnitTests] Known failing targets now marked with xfail instead of skipif

* [UnitTests] Removed tvm_excluded_targets and tvm_known_failing_targets

These were implemented to exclude or mark as failing an entire file or
directory of tests.  In
https://discuss.tvm.apache.org/t/rfc-parametrized-unit-tests/9946/4,
it was pointed out that the global variables would be vulnerable to
typos in the names, resulting in the option being silently ignored.
The decorators `@tvm.testing.exclude_targets` and
`@tvm.testing.known_failing_targets` do not have this failure mode,
and are the preferred version.
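
The failure mode described above can be reproduced without TVM. The following is a minimal sketch using only the standard library; `test_module` and `helpers` are hypothetical stand-ins, not the tvm.testing implementation. A misspelled module-level variable is silently ignored when it is looked up with a default, while a misspelled decorator name fails immediately.

```python
import types

# Hypothetical test module that misspells the option name (missing "s").
test_module = types.SimpleNamespace(tvm_exclude_target=["cuda"])

# Looking the option up with a default silently ignores the misspelled name.
excluded = getattr(test_module, "tvm_exclude_targets", [])

# Hypothetical namespace that offers the decorator form of the same option.
helpers = types.SimpleNamespace(
    exclude_targets=lambda *targets: (lambda func: func)
)

try:
    helpers.exclude_target("cuda")  # same typo, now an attribute lookup
    typo_detected = False
except AttributeError:
    typo_detected = True
```

With the variable form the test would run on every target without warning; with the decorator form the typo surfaces as soon as the module is imported.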

* [UnitTests] Added helper functions to tvm.testing.

- tvm.testing.parameter() defines a parameter that can be passed to
  tests.  Tests that accept more than one parameter are run for all
  combinations of parameter values.

- tvm.testing.parameters() defines multiple sets of parameter values.
  Tests that accept more than one parameter are run once for each set
  of parameter values.

- tvm.testing.fixture() is a decorator that defines setup code.  The
  `cache=True` argument can be passed to avoid repeating expensive
  setup across multiple tests.
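
The two expansion rules listed above can be sketched with the standard library alone. This is an illustration of the described behavior, not the tvm.testing implementation; `expand_independent` and `expand_joint` are hypothetical names.

```python
import itertools


def expand_independent(**params):
    """All combinations of values, as described for tvm.testing.parameter()."""
    names = list(params)
    return [
        dict(zip(names, combo))
        for combo in itertools.product(*params.values())
    ]


def expand_joint(names, *value_sets):
    """One case per value set, as described for tvm.testing.parameters()."""
    return [dict(zip(names, values)) for values in value_sets]


# Two independent parameters with two values each give four test cases.
independent = expand_independent(size=[1, 16], dtype=["int8", "float32"])

# Two jointly defined value sets give exactly two test cases.
joint = expand_joint(("size", "dtype"), (1, "int8"), (16, "float32"))
```

The joint form is the one to reach for when only certain pairings of values make sense together.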

* [UnitTests] Bugfix for auto parametrizing of "target"

Previously, if @parametrize_targets was present but had another
@pytest.mark.parametrize after it, "target" would get parametrized a
second time.  Now, the check covers more than just the closest
"parametrize" marker.

* [UnitTests] Renamed "cache" argument of tvm.testing.fixture to "cache_return_value"

* [UnitTests] Minor updates to parametrized test implementation.

As recommended by @tkonolige:

- Avoid infinite loop if LLVM target isn't enabled

- Update documentation for preferred use cases of
  tvm.testing.parametrize_targets, and recommended alternatives.

* [UnitTests] Minor updates to parametrized test implementation

- Documentation, removed previous example usage of tvm.testing.parametrize_targets

* [UnitTests] Changed accidental use of pytest fixtures to a NameError.

- Previously, a fixture function defined in a module was accessible
  through the module's global scope, so a test function that used that
  name but failed to declare the fixture as a parameter would silently
  receive the function definition.  Now, this results in a NameError.

* [UnitTests] More careful removal of fixture functions from module global scope.

- Initial implementation only checked hasattr(obj, "_pytestfixturefunction")
  before removing obj, which gave false positives for objects that implement
  __getattr__, such as caffe.layers.  Now, also check that the value
  contained is a FixtureFunctionMarker.
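
The false positive described above is easy to reproduce in plain Python. In this minimal sketch, `PermissiveNamespace`, `FixtureMarker`, and `FakeFixture` are hypothetical stand-ins for an object such as caffe.layers, for pytest's marker class, and for a decorated fixture function.

```python
class PermissiveNamespace:
    """Stand-in for an object whose __getattr__ answers for any name."""

    def __getattr__(self, name):
        return f"generated:{name}"


class FixtureMarker:
    """Hypothetical stand-in for pytest's FixtureFunctionMarker."""


class FakeFixture:
    """Stand-in for a function that was decorated as a fixture."""

    _pytestfixturefunction = FixtureMarker()


def looks_like_fixture_naive(obj):
    # hasattr() is True for anything that implements a permissive __getattr__.
    return hasattr(obj, "_pytestfixturefunction")


def looks_like_fixture_strict(obj):
    # Also require the stored value to be an actual marker instance.
    marker = getattr(obj, "_pytestfixturefunction", None)
    return isinstance(marker, FixtureMarker)
```

Only the strict check distinguishes a real fixture from an object that fabricates attributes on demand.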

* [UnitTests] Copy cached values when using tvm.testing.fixture(cache_return_value=True)

To avoid unit tests being able to influence each other through a
shared cache, all cached fixtures are passed through copy.deepcopy
prior to use.
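
The idea of caching a setup result while still isolating tests from one another can be sketched as follows. This is an illustration only, assuming a zero-argument setup function; the decorator name `cached_with_copy` is hypothetical and is not part of tvm.testing.

```python
import copy
import functools


def cached_with_copy(func):
    """Run func once, then hand every caller an independent deep copy."""
    cache = {}

    @functools.wraps(func)
    def wrapper():
        if "value" not in cache:
            cache["value"] = func()
        return copy.deepcopy(cache["value"])

    return wrapper


calls = []


@cached_with_copy
def expensive_setup():
    calls.append(1)  # record how many times the setup body really ran
    return {"weights": [1, 2, 3]}


first = expensive_setup()
first["weights"].append(99)  # one "test" mutates its own copy
second = expensive_setup()   # a later "test" still sees the original value
```

The trade-off is that each use pays for a deep copy, which is usually much cheaper than repeating the setup itself.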

* [UnitTests] Added meta-tests for tvm.testing functionality

Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this issue Mar 7, 2022
[SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2)

* add methods for Object

* axis constructors

* methods for SparseBuffer

* put into registry

* python interface

[CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4)

* upd

* upd

* fix

* upd

* upd

* upd

* upd

* upd

* fix

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* codegen-rule

* upd

* upd

* test

* upd

* fix

* two arguments

Co-authored-by: Zihao Ye <expye@outlook.com>

Fix AxisTree (apache#3)

* fix axis tree

* upd

[SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5)

* Add dtype for SparseBuffer

* Add name for SparseBuffer. Remove `ndim`

* Remove namespace sparse

* Add SparseBufferLoad/Store

* Add method `ndim()`

[SparseTIR] Introduce SpIterVar (apache#6)

* [SparseTIR] Introduce SpIterVar

* Add conversion to PrimExpr

[BugFix] Fix binary search & SpIterVar (apache#7)

[BugFix] Add field `is_reduction` for SpIterVar (apache#9)

* [BugFix] Add field `is_reduction` for SpIterVar

* Formatting

[SparseTIR] Index Lowering (apache#8)

* Add StmtFunctor/ExprFunctor for SparseBufferStore/Load

* Add basic index lowering

* Finish index lowering (maybe)

* Address comments

* Convert CRLF to LF

Frontend update, demo scripts. (apache#10)

* Format and Buffer data structure (apache#1)

* [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2)

* add methods for Object

* axis constructors

* methods for SparseBuffer

* put into registry

* python interface

* [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4)

* upd

* upd

* fix

* upd

* upd

* upd

* upd

* upd

* fix

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* codegen-rule

* upd

* upd

* test

* upd

* fix

* two arguments

Co-authored-by: Zihao Ye <expye@outlook.com>

* Fix AxisTree (apache#3)

* fix axis tree

* upd

* Format and Buffer data structure (apache#1)

* [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2)

* add methods for Object

* axis constructors

* methods for SparseBuffer

* put into registry

* python interface

* fix axis tree

* upd

* Format and Buffer data structure (apache#1)

* [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2)

* add methods for Object

* axis constructors

* methods for SparseBuffer

* put into registry

* python interface

* [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4)

* upd

* upd

* fix

* upd

* upd

* upd

* upd

* upd

* fix

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* codegen-rule

* upd

* upd

* test

* upd

* fix

* two arguments

Co-authored-by: Zihao Ye <expye@outlook.com>

* Fix AxisTree (apache#3)

* fix axis tree

* upd

* [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5)

* Add dtype for SparseBuffer

* Add name for SparseBuffer. Remove `ndim`

* Remove namespace sparse

* Add SparseBufferLoad/Store

* Add method `ndim()`

* Format and Buffer data structure (apache#1)

* [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2)

* add methods for Object

* axis constructors

* methods for SparseBuffer

* put into registry

* python interface

* [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4)

* upd

* upd

* fix

* upd

* upd

* upd

* upd

* upd

* fix

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* codegen-rule

* upd

* upd

* test

* upd

* fix

* two arguments

Co-authored-by: Zihao Ye <expye@outlook.com>

* Fix AxisTree (apache#3)

* fix axis tree

* upd

* [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5)

* Add dtype for SparseBuffer

* Add name for SparseBuffer. Remove `ndim`

* Remove namespace sparse

* Add SparseBufferLoad/Store

* Add method `ndim()`

* [SparseTIR] Introduce SpIterVar (apache#6)

* [SparseTIR] Introduce SpIterVar

* Add conversion to PrimExpr

* [BugFix] Fix binary search & SpIterVar (apache#7)

* [BugFix] Add field `is_reduction` for SpIterVar (apache#9)

* [BugFix] Add field `is_reduction` for SpIterVar

* Formatting

* upd

* upd

Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>

[SparseTIR] SparseBlock on C++/Python side (apache#11)

* Fix a bug in the last commit

* SparseBlock on C++ & Python side

[BugFix][SparseTIR] TVMScript Parser for Axis & SpIterVar (apache#12)

* Update `cord` and `pos`

* Fix `idtype`

* Formatting..

* Bug fix 1

* Move new special stmts

* Parser for Axis and SpIterVar

* Fix context_maintainer.py

[SparseTIR] Enhance SparseBlock to contain enough PrimFunc information (apache#13)

* Enhance SparseBlock to have enough PrimFunc info

* Remove `func_sparse_buffer_map_`

* Don't print the map uh-huh

[SparseTIR] Parser, Printer, Roundtrip (apache#14)

* SparseBlock scope handler (part 1)

* SparseBlock scope handler (part 2)

* SparseBlock scope handler (part 3)

* SparseBlock scope handler (fix 1)

* Add SparseBufferLoad/Store on Python side

* Parser for SparseBufferLoad/Store

* Add SparseBlock to Python __init__

* StmtFunctor for SparseBlock

* Ensure at least one dimension for SparseBuffer

* Make `axis` field of SpIterVar mandatory

* SparseBlock scope handler (fix 2)

* Update Axis syntax by removing `name` parameter

* Move to intrin.py

* Add field `from_sparse` to DenseFixedAxis

* SparseTIR script printer

* Roundtrip test

* `update_symbol` bug fix

* Fix attr visit in SparseBuffer

* Define then compare in SparseBlock

* Fix printer bug for SparseBuffer

* Enable graph match for Axis and SparseBuffer

* Complete HashReduce and EqualReduce for AxisTree and SparseBuffer

* Fix typo

* Rename test

* Bug fix 1

* Bug fix 2

* Add more tests

Move tests (apache#15)

[SparseTIR] ReprPrinter for Axis and SpIterVar (apache#16)

upd (apache#17)

flatten (apache#18)

ELL and BSR correctness test scripts (apache#19)

[SparseTIR] SparseTIR Lowering (apache#20)

* Fix a previous bug of sparse-fixed SpIterVar creation

* Fix a previous bug in `GetDenseValue`

* Refactor Collector and IndexTransformer

* Construct block and loops

* Fix a previous bug which rejects DV iters in collector

* Update buffer map

* Create root block

* Fix bug of sparse-fixed SpIterVar creation

* Fix bug on SpIterVar conversion (with refactor)

* Fix bug when getting dependent SpIterVars

* Fix bug on dependency map and index lowering

* Full block read/write region

* Test version 1

* Fix bug of loop order

* Fix bug of batch-mm iterator ordering

* Update PrimFunc args to use symbolic params

* Fix bug of test "csr_element_wise"

* Fix bug of index accumulation for sparse-fixed axis

* Update correctness test

* Test structural equality

* Refactor and use Array

fix nnz cols

Add docstring for sparse tir lowering (apache#21)

* add docstring

* upd

Add more examples part 1 (sddmm) (apache#22)

* upd

* upd

* upd

[SparseTIR][Schedule] SparseBlockRV, GetSparseBlock, SparseReorder (apache#23)

* Test initialization

* Fix a stupid bug of ReprPrinter

* Add SparseBlockRV

* Schedule: GetSparseBlock

* Schedule: Reorder

[SparseTIR][Schedule] GetSpIters (apache#24)

remove hybrid script for successful compilation

Add atomic intrinsic for output nonzero inference. (apache#25)

* upd

* upd

Add "sparse" block attribute. (apache#26)

Revert "remove hybrid script for successful compilation"

This reverts commit eebd7c1.

[SparseTIR] Hack `IsAffineBinding` check (apache#27)

* [TensorIR][Schedule] Inherit block annotation upon creating new blocks

* Fix SDDMM test

* Hack IsAffineBinding for sparse blocks

Axis Dependency Tree aware code-gen and bmm example (apache#28)

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* upd

* remove redundancy

* fix

* upd

* upd

Re-design Indices lowering (apache#29)

* upd

* upd

* upd

* upd

* upd

* init

* format

* fix

* revise coding-style

* format

Complete indices lowering (apache#30)

* upd

* upd

* upd

* done

* upd

* passed test

* upd

Add more docstrings and suppress warnings for new lowering algorithm. (apache#31)

Refactor derived axis, frontend support of fusion. (apache#32)

* upd

* upd

* fix

Fatal bugfix and change the signature of DenseVariableAxis. (apache#33)

Syntax simplification (apache#34)

Change the order of generated blocks for block isolation. (apache#35)

* upd

* upd

* upd

Syntax of AttachAxis for BMM (apache#36)

* upd

* upd

* upd

[SparseTIR] Add "square sum" lowering test (apache#37)

* Add square sum test

* Remove pylint comment

[BugFix] Fix offset caching in lowering (apache#38)

* Hack compact dataflow check in a dirty way

* Add two-K square sum test

* Mark skipped tests

* Fix offset saving in lowering

Fusion syntax fix + SDDMM example. (apache#39)

Some structure change on update offsets. (apache#40)

[Refactor] SparseTIR Lowering (apache#41)

* Take out methods in Scope

* Refactor

* Refactor "match"

* Tweak scope contents

* Refactor ViewIndexInAxis

* Refactor Scope

* SDDMM tests under implementation

* Refactor block stack

* Use Map for var_map

* Extract NeedCreateNewBlock

* Simplify SpIterVarToIterVar via GetIterExtent

* Refactor NeedCreateNewBlock

* Add docstring

* Use "auto" correctly

* Minor refactor and use some move

Remove redundant analyzers (apache#42)

Support indices lowering for attach and fuse. (apache#43)

* upd

* upd

* upd

Fix irregular BMM example. (apache#44)

* upd

* upd

* upd

* upd

RGCN forward and butterfly pattern example. (apache#45)

Fused SDDMM example. (apache#46)

* upd

* wip

* fix

Fix sparse reorder after refactor (apache#47)

[Refactor] Refactor Unittest (apache#48)

* upd

* remove redundancy

[Unittest] Correctness test for benchmarking scripts (apache#49)

Bugfix and more test for axis fusion, new workload (apache#50)

* upd

* upd

upd
prateek9623 pushed a commit to prateek9623/tvm that referenced this issue May 1, 2022
Clamp int64 input to tvm::Integer to [INT32_MIN, INT32_MAX]
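
The clamping named in this commit title can be sketched in a few lines. This is a plain-Python illustration of saturating a wide integer to the 32-bit signed range, not the actual tvm::Integer change; the function name `clamp_to_int32` is hypothetical.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def clamp_to_int32(value: int) -> int:
    """Saturate value to the closed interval [INT32_MIN, INT32_MAX]."""
    return max(INT32_MIN, min(INT32_MAX, value))
```

Values already inside the range pass through unchanged; anything outside is pinned to the nearest bound rather than wrapping around.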
jinhongyii pushed a commit to jinhongyii/tvm that referenced this issue Jun 20, 2022
* Implementation of call_dps.

* Implementation of PackedFuncExpr.

* Test CallDPS for TIR function.

* Rename.

* Add header and comments.

* Update.

* Address comments.
jinhongyii pushed a commit to jinhongyii/tvm that referenced this issue Jun 20, 2022
Hzfengsy pushed a commit to Hzfengsy/tvm that referenced this issue Jul 30, 2022
* Implementation of call_dps.

* Implementation of PackedFuncExpr.

* Test CallDPS for TIR function.

* Rename.

* Add header and comments.

* Update.

* Address comments.
mikepapadim pushed a commit to mikepapadim/tvm that referenced this issue Sep 21, 2022
* Implementation of call_dps.

* Implementation of PackedFuncExpr.

* Test CallDPS for TIR function.

* Rename.

* Add header and comments.

* Update.

* Address comments.
gigiblender referenced this issue in gigiblender/tvm Nov 3, 2022
* Implementation of call_dps.

* Implementation of PackedFuncExpr.

* Test CallDPS for TIR function.

* Rename.

* Add header and comments.

* Update.

* Address comments.
MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this issue Nov 20, 2022
* Implementation of call_dps.

* Implementation of PackedFuncExpr.

* Test CallDPS for TIR function.

* Rename.

* Add header and comments.

* Update.

* Address comments.
mehrdadh pushed a commit to mehrdadh/tvm that referenced this issue Dec 7, 2022
zxybazh pushed a commit to zxybazh/tvm that referenced this issue Jan 20, 2023
* [IR] Introduce StructInfo

* StructInfoFunctor and Analysis Support

* [TVMScript] Parse type/shape annotation with StructInfo

* remove runtime type assign

* Remove type/shape during parsing (apache#2)

* Normalizer prep: simple checks and legacy function renaming.

* Struct info deduction in BlockBuilder.

* Two TODOs

* StructInfo Normalizer Fixes (apache#3)

* StructInfo AST Fix

* Fix Extern Func Deduction and shape mutator.

* Update VoidStructInfo & globalvar (apache#4)

* Fix passes and proper sinfo propagation.

* Refactor EraseToWellDefined to Enable Remapping

* [WIP] First stab at symbolic param tracking

* Update EraseToWellDefined to support symbolic shape return (apache#5)

* fix R.shape with ndim (apache#6)

* Remove update shape/type

* Address review comment, AnnotateTypeShape=>AnnotateStructInfo

* Update include/tvm/script/ir_builder/relax/frame.h

Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>

* Address comments

* Update printer to use structinfo (apache#7)

* Update Error mechanism to prep for obj loc based reporting

* Symbolic shape aware function call return value derivation.

The main flow works as follows:
- Match and populate shape_var_map and var_map by visiting each pair of
  param and call arguments.
- Call EraseToWellDefined to map the ret parameter to new result.
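
The two steps above can be sketched in plain Python. This is a simplified illustration, not the Relax implementation: shapes are modeled as tuples whose symbolic dimensions are strings, the helper names `bind_shape_vars` and `derive_return_shape` are hypothetical, and treating an unbound symbol as unknown (`None`) is an assumption about what erasing to a well-defined result means.

```python
def bind_shape_vars(param_shapes, arg_shapes):
    """Match each parameter shape against its argument shape, collecting bindings."""
    shape_var_map = {}
    for param_shape, arg_shape in zip(param_shapes, arg_shapes):
        for dim, actual in zip(param_shape, arg_shape):
            if isinstance(dim, str):  # symbolic dimension such as "n"
                shape_var_map.setdefault(dim, actual)
    return shape_var_map


def derive_return_shape(ret_shape, shape_var_map):
    """Substitute bound symbols; unbound symbols become None (unknown)."""
    return tuple(
        shape_var_map.get(dim) if isinstance(dim, str) else dim
        for dim in ret_shape
    )


# Parameters (n, m) and (m, 4) are called with arguments (8, 16) and (16, 4).
bindings = bind_shape_vars([("n", "m"), ("m", 4)], [(8, 16), (16, 4)])

# The declared return shape (n, 4, k) mentions k, which no argument binds.
result = derive_return_shape(("n", 4, "k"), bindings)
```

In this sketch `n` and `m` are resolved from the call site, while `k` has no binding and is therefore reported as unknown instead of leaking a symbol that is undefined in the caller.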

* [ANALYSIS] Refactor well-form to only look at struct info.

* Update comments according to reviews.

* Update include/tvm/relax/struct_info.h

Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Tianqi Chen <tqchen>
Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>
csullivan pushed a commit to csullivan/tvm that referenced this issue Feb 7, 2023
* Implementation of call_dps.

* Implementation of PackedFuncExpr.

* Test CallDPS for TIR function.

* Rename.

* Add header and comments.

* Update.

* Address comments.
csullivan pushed a commit to csullivan/tvm that referenced this issue Feb 7, 2023
* [IR] Introduce StructInfo

* StructInfoFunctor and Analysis Support

* [TVMScript] Parse type/shape annotation with StructInfo

* remove runtime type assign

* Remove type/shape during parsing (apache#2)

* Normalizer prep: simple checks and legacy function renaming.

* Struct info deduction in BlockBuilder.

* Two TODOs

* StructInfo Normalizer Fixes (apache#3)

* StructInfo AST Fix

* Fix Extern Func Deduction and shape mutator.

* Update VoidStructInfo & globalvar (apache#4)

* Fix passes and proper sinfo propagation.

* Refactor EraseToWellDefined to Enable Remapping

* [WIP] First stab at symbolic param tracking

* Update EraseToWellDefined to support symbolic shape return (apache#5)

* fix R.shape with ndim (apache#6)

* Remove update shape/type

* Address review comment, AnnotateTypeShape=>AnnotateStructInfo

* Update include/tvm/script/ir_builder/relax/frame.h

Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>

* Address comments

* Update printer to use structinfo (apache#7)

* Update Error mechanism to prep for obj loc based reporting

* Symbolic shape aware function call return value derivation.

The main flow works as follows:
- Match and populate shape_var_map and var_map by visiting each pair of
  param and call arguments.
- Call EraseToWellDefined to map the ret parameter to new result.

* [ANALYSIS] Refactor well-form to only look at struct info.

* Update comments according to reviews.

* Update include/tvm/relax/struct_info.h

Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Tianqi Chen <tqchen>
Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>
wangzy0327 pushed a commit to wangzy0327/tvm that referenced this issue Mar 3, 2023
[SYCL] Support auto_scheduler
Lunderberg pushed a commit to Lunderberg/tvm that referenced this issue Mar 3, 2023
* Implementation of call_dps.

* Implementation of PackedFuncExpr.

* Test CallDPS for TIR function.

* Rename.

* Add header and comments.

* Update.

* Address comments.
Lunderberg pushed a commit to Lunderberg/tvm that referenced this issue Mar 3, 2023
* [IR] Introduce StructInfo

* StructInfoFunctor and Analysis Support

* [TVMScript] Parse type/shape annotation with StructInfo

* remove runtime type assign

* Remove type/shape during parsing (apache#2)

* Normalizer prep: simple checks and legacy function renaming.

* Struct info deduction in BlockBuilder.

* Two TODOs

* StructInfo Normalizer Fixes (apache#3)

* StructInfo AST Fix

* Fix Extern Func Deduction and shape mutator.

* Update VoidStructInfo & globalvar (apache#4)

* Fix passes and proper sinfo propagation.

* Refactor EraseToWellDefined to Enable Remapping

* [WIP] First stab at symbolic param tracking

* Update EraseToWellDefined to support symbolic shape return (apache#5)

* fix R.shape with ndim (apache#6)

* Remove update shape/type

* Address review comment, AnnotateTypeShape=>AnnotateStructInfo

* Update include/tvm/script/ir_builder/relax/frame.h

Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>

* Address comments

* Update printer to use structinfo (apache#7)

* Update Error mechanism to prep for obj loc based reporting

* Symbolic shape aware function call return value derivation.

The main flow works as follows:
- Match and populate shape_var_map and var_map by visiting each pair of
  param and call arguments.
- Call EraseToWellDefined to map the ret parameter to new result.

* [ANALYSIS] Refactor well-form to only look at struct info.

* Update comments according to reviews.

* Update include/tvm/relax/struct_info.h

Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Tianqi Chen <tqchen>
Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>
mikeseven pushed a commit to mikeseven/tvm that referenced this issue Sep 27, 2023
SIM-2981: Fix Versioning For Python Packages

Approved-by: Jeffrey Uong