[SCHEDULE] add 'void AutoFuseEwise(Schedule sch)' #36

ZihengJiang · 2017-02-08T03:15:09Z

Add Schedule Fusion(Schedule sch), which will set the stage inline if its operation is element-wise.

TODO: More test cases and comments will be added once the interface is settled.

tqchen · 2017-02-08T06:24:38Z

Consider renaming this to AutoFuseEwise, which is more explicit.
Maybe we should skip setting compute inline when the op's stage has already been scheduled.

tqchen · 2017-02-08T06:26:30Z

tests/python/unittest/test_schedule_schedule_ops.py

+  T2 = tvm.compute((m, n), lambda i, j: T1(i, j) + C(i, j), name='T2')
+
+  s = tvm.Schedule(T2.op)
+  fs = tvm.schedule.Fusion(s)


Since this mutates s, consider not returning sch at all; otherwise users will think it returns a new schedule.
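The return-value concern can be illustrated with a toy sketch. `ToySchedule` and `auto_inline_in_place` are hypothetical names, not TVM APIs, and the function marks every stage purely for illustration. The only point is the convention: a function that mutates its argument returns nothing, so callers cannot mistake the result for a fresh schedule.

```python
class ToySchedule:
    """Hypothetical stand-in for a schedule: maps stage name -> inline flag."""

    def __init__(self, stage_names):
        self.inlined = {name: False for name in stage_names}


def auto_inline_in_place(sch):
    """Mutate `sch` and return None, mirroring a `void` C++ signature."""
    for name in sch.inlined:
        sch.inlined[name] = True
    # No `return sch`: handing back the argument would suggest a new object.


s = ToySchedule(["T1", "T2"])
result = auto_inline_in_place(s)
print(result, s.inlined)
```

With this shape, the test line `fs = tvm.schedule.Fusion(s)` above would no longer bind anything useful to `fs`, which makes the in-place behavior obvious at the call site.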

ZihengJiang · 2017-02-08T06:42:39Z

Is there any way to know whether an op node has been scheduled? How about adding a flag in ComputeOpNode?

tqchen · 2017-02-08T06:45:26Z

Check whether the stage in the schedule that corresponds to the op is empty (no relations, no attachment).
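A minimal sketch of that check, under stated assumptions: `ToyStage`, its field names, and `auto_inline_elemwise` are hypothetical and only model the idea that a stage with no relations and no attachment counts as untouched. This is not the actual TVM stage layout.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStage:
    """Hypothetical stage record; field names are illustrative only."""
    name: str
    is_elemwise: bool
    relations: list = field(default_factory=list)  # e.g. split/fuse records
    attach_type: str = "none"                      # "none" means no attachment
    inlined: bool = False

    def is_scheduled(self):
        # A stage is considered scheduled once it has any relation
        # or any attachment; an empty stage is unscheduled.
        return bool(self.relations) or self.attach_type != "none"


def auto_inline_elemwise(stages):
    """Inline element-wise stages that the user has not touched yet."""
    for stage in stages:
        if stage.is_elemwise and not stage.is_scheduled():
            stage.inlined = True


stages = [
    ToyStage("T1", is_elemwise=True),
    ToyStage("T2", is_elemwise=True, relations=["split(i, 4)"]),
    ToyStage("R", is_elemwise=False),
]
auto_inline_elemwise(stages)
print([(s.name, s.inlined) for s in stages])
```

Deriving the answer from existing stage state avoids adding a separate flag to the op node, which would have to be kept in sync by every schedule primitive.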

tqchen · 2017-02-09T03:07:19Z

include/tvm/schedule_pass.h

+ *
+ * \param s The schedule to be fused.
+ */
+void AutoFuseElemWise(Schedule sch);


Change to AutoInlineElemWise

tqchen · 2017-02-09T03:07:53Z

include/tvm/ir_pass.h

+ * \brief Whether the node is element-wise.
+ * \return whether the node is element-wise.
+ */
+bool IsElemWise(const NodeRef& node);


Do not pass in NodeRef, which can be anything. Consider passing in Stmt and Array instead.

tqchen · 2017-02-09T03:08:21Z

src/schedule/fusion.cc

 namespace schedule {

-static bool is_stage_scheduled(const Stage& s) {
+namespace {
+inline bool is_stage_scheduled(const Stage& s) {


Make it a member function of Stage.

tqchen · 2017-02-09T03:08:58Z

src/schedule/fusion.cc

@@ -3,83 +3,29 @@
 * \file schedule.cc
 */
 #include <tvm/schedule_pass.h>
-#include <tvm/ir.h>
-#include "./graph.h"
+#include <tvm/ir_pass.h>


Rename the file to auto_inline_elemwise.cc

tqchen · 2017-02-09T05:22:22Z

include/tvm/schedule.h

+   * \brief whether the stage has been scheduled.
+   * \return whether the stage has been scheduled.
+   */
+  inline bool is_scheduled();


const member?

tqchen · 2017-02-09T05:22:42Z

src/pass/elem_wise.cc

+
+  void Visit(const NodeRef& e) final {
+    if (!is_elem_wise_)
+        return;


Keep the return on the same line as the if, or add {}.

tqchen · 2017-02-09T05:23:04Z

src/pass/elem_wise.cc

+    for (size_t i = 0; i < axis_.size(); ++i) {
+      const Variable *v1 = axis_[i]->var.as<Variable>();
+      const Variable *v2 = axis[i].as<Variable>();
+      if (!(v1 && v2) || (v1 != v2)) {


We can use same_as here:

axis_[i]->var.same_as(axis[i])
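The suggestion replaces a manual pointer comparison with an identity check. A rough Python analogue, assuming `same_as` compares node identity as the `v1 != v2` pointer test in the diff does: the `Var` class and `same_axes` helper below are hypothetical, and Python's `is` stands in for the identity comparison.

```python
class Var:
    """Hypothetical variable node; two nodes may share a name yet be distinct."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Var) and self.name == other.name

    def __hash__(self):
        return hash(self.name)


def same_axes(declared, used):
    """True only if each position refers to the very same node object."""
    return len(declared) == len(used) and all(
        d is u for d, u in zip(declared, used)
    )


i, j = Var("i"), Var("j")
other_i = Var("i")                       # equal by name, different node
print(same_axes([i, j], [i, j]))         # identical nodes in the same order
print(same_axes([i, j], [other_i, j]))   # name matches, identity does not
print(same_axes([i, j], [j, i]))         # permuted access
```

Identity rather than name equality is what matters here, since two distinct loop variables can carry the same name.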

tqchen · 2017-02-09T05:23:45Z

src/pass/elem_wise.cc

@@ -0,0 +1,57 @@
+/*!
+ *  Copyright (c) 2016 by Contributors


The copyright year should be 2017.

tqchen · 2017-02-09T05:24:00Z

src/pass/elem_wise.cc

+#include <tvm/ir.h>
+#include <tvm/ir_visitor.h>
+#include <tvm/operation.h>
+


Consider naming the file detect_elem_wise.

tqchen · 2017-02-09T05:25:15Z

src/pass/elem_wise.cc

+};
+
+
+bool IsElemWise(const Operation& op) {


Maybe we should consider passing in axis and body separately. Operation is not an object in the IR, so it is bad to have an IR pass depend on it.
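A toy sketch of what a detector taking axis and body separately might look like. The tuple-based expression encoding, `Var`, and `is_elem_wise` are assumptions made for this illustration and do not reflect the real IR node types; the sketch only shows that the check needs nothing beyond the axis list and the body expression.

```python
class Var:
    """Hypothetical loop variable node."""

    def __init__(self, name):
        self.name = name


def is_elem_wise(axis, body):
    """Return True if every tensor read in `body` is indexed exactly by `axis`.

    `body` is a toy expression: a Var, a number, ("call", tensor_name, args),
    or (op_name, operand, ...). This encoding is an assumption of the sketch.
    """
    if isinstance(body, tuple) and body[0] == "call":
        args = body[2]
        # Each index must be the very same axis variable, in the same order.
        return len(args) == len(axis) and all(
            a is v for a, v in zip(args, axis)
        )
    if isinstance(body, tuple):
        return all(is_elem_wise(axis, child) for child in body[1:])
    return True  # leaves: variables and constants


i, j = Var("i"), Var("j")
add = ("add", ("call", "T1", (i, j)), ("call", "C", (i, j)))
transposed = ("add", ("call", "T1", (j, i)), ("call", "C", (i, j)))
print(is_elem_wise([i, j], add), is_elem_wise([i, j], transposed))
```

The first expression mirrors the `T1(i, j) + C(i, j)` compute in the unit test; the transposed variant reads `T1` with swapped indices and is rejected.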

ZihengJiang · 2017-02-09T05:44:56Z

All the copyright notices in the files say 2016; I suggest changing them in a separate commit.

* overview * fix

* overview * fix

* overview * fix

* overview * fix

* Rename "MetaTileRewritePolicy" to "SketchPolicy". * Add a new class for auto_unroll_max_step, storage_offset in StageNode * fix tune_op_subgraph.py

…generating (#5962) * Code migration Start (#1) * Init commit: Code migration Start * Add loop_state.cc/h * Add ComputeDAG basic test * Split transform_step out & Update more UTs (#3) * Split transform_step out * Update GetProducers & GetConsumers * Update UTs * Add UT for CacheReadWrite & Some bug fix * Add search_task, measure and serialization (#4) * Add FollowSplit & FollowFusedSplit tests * Update dag.InferBound & its UT * Add search_task, measure and serialization * Update Serialization UT * Add MetaTileRewritePolicy (#5) * Add feature * Add cost_model, meta_tile_rewrite_policy * Add MetaTileRewritePolicy basic UT * Basic Python API for State (#6) * Add Basic Python API for State * Add UTs for State * Add Python API: Measure & Task (#7) * Update the return value of state operation * Add task * Copy measure.py & utils.py * Fix LocalBuilder * Fix LocalRunner * Add ansor.auto_schedule() API; First AutoSchedule working version(#8) * Add basic Python support for ansor.auto_schedule * Update AutoSchedule API * Bug fix for get the attach point of a fused iter * Update UT after infer bug fix * Bug fix & Add python serialization API (#10) * Delete C++ UT hack since Python is ready * Add ndarray.non_empty * Update Serialization python API * Improve code style, python wrapper and test cases (#11) * Update c++ code style and unit test * Update python State wrapper and test cases * fix unit tests * Add RPCRunner & OpenCL/CUDA test (#12) * Add RPCRunner & OpenCL search test * Add CUDA search test * Add RPCRunner test * rebase to upstream/master * Add Ansor basic tutorial (#13) * Add basic tutorial * migrate feature extraction (#14) * Add XGBModel & RPCRunnerWarpper (#15) * Add XGBModel & RPCRunnerWarpper * Revert "Add Parallel Granularity Mutation" * Migrate workload_registry.py (#16) * add workload registry * update * update * add task scheduler (#17) * Add conv2d cuda tutorial with workload registry (#18) * add tune_test.py (the old tune_wkl.py) (#19) * add tune_test.py 
(the old tune_wkl.py) * update * fix measure * fix for gpu * Code refine for tune_test.py & Add a pre load callback (#20) * Bug fix for tutorials * Add PreLoadMeasuredStates * Add search_callback support for task tuner * Code refine for tune_test.py * Update * Update * Update * Update * Bug fix * Add python custom sketch rule (#21) * Add custom sketch rule * Bug fix * Ansor Relay Integration (without layout rewrite) (#22) * relay integration * Add tune_op_subgraph.py & Some code clean for tune_network.py (#23) * Add single op tune scripts * Add tune subgraph support * Merge all op & all subgraph to one file * Rename file * add explicit_unroll_max_extent (#25) * Add Index simplification & API update (#26) * Add vectorized cooperative_fetching test * Update math simplify for vectorized CF * File rename * Update tune_network * API update * Update PreLoadMeasuredStates & Some bug fix (#27) * Add a threading wrapper to fix the test bug * Set default TVM_USE_AUTO_SCHEDULER to false * Update PreLoadMeasuredStates callback * Add tensorize step for loop_state (#31) * Add tensorize step * State python api update (#33) * Start to update api * Add compute_dag to state * API update * kernel layout rewrite (#28) * kernel layout rewrite * remove some hacks * add defuse_ops pass and move kernel_layout_rewrite pass after fuse_ops pass * set TVM_RELAY_DISABLE_BUILD_CACHE for task extraction and prepare_layout_rewrite * [cache flush] port cache flush to ansor (#32) * Improve relay integration (#34) * tmp checkpoint * Improve relay integration * Improve relay integration * Fix xgb error & Simplify dispatcher (#35) * Rename "MetaTileRewritePolicy" to "SketchPolicy". (#36) * Rename "MetaTileRewritePolicy" to "SketchPolicy". 
* Add a new class for auto_unroll_max_step, storage_offset in StageNode * fix tune_op_subgraph.py * rebase * Migrate all node::make to noderef's construct function (#37) * Start to move xxxnode::make to noderef() * Update * Update * Finish transform_step * Finish comute dag & auto schedule * Update * Update * Update * Update * Update * Code refine * Code refine * Code refine * Update * Update * Some lint fix & Recover the double constructor of tvm::PrimExpr (#39) * lint fix * clang-format-fix * pylint fix * Update * Recover the double constructor of tvm::PrimExpr * Fix pylint * pylint fix * pylint fix * Add MutateComputeLocation and MutateParallel in evolutionary search (#40) * Add MutateComputeLocation and MutateParallel in evolutionary search * fix lint * Improve loop state python API (stage_tensors -> stage_ops) (#41) * improve loop state python API (stage_tensors -> stage_ops) * fix * ComputeDAG bug fix & Add Custom TensorCore Matmul Example (#42) * Bug Fix * Sample example of Custom TensorCore Matmul * Rever Commits, Start to build minimum Ansor system * Code clean for minimum Ansor system * Bug fix & Delete AccessAnalyzer * Delete attachmap & Code clean * Doc update Update statenode::stages from vector to Array * Headfile update & Python doc update * clang-format fix * pylint fix * Update * Doc update * Update * Bug fix after code merge to the new master * clang-format fix * Update * Update * Update std::vector to Array; Update verbosity setting; Some commemts addressed * std::vector->Array & std::string->String * Add init_state to ComputeDAG * Update * Update some unordered_map to Map * clang-format fix * Comments addressed Delete ReplayAndInferBound Delete ReplaySteps & InferBoundCommon * Lint fix * Update * Update * Update * Update * Update * Update * Update * Update * Update * Rename ansor namespace to auto_schedule * Update * Rename ThreadPool to ParallelFor * Add parallel_for * Remove ThreadPool * Update python/tvm/auto_schedule/auto_schedule.py * 
trigger CI Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> Co-authored-by: Minmin Sun (孙敏敏) <minmin.smm@alibaba-inc.com> Co-authored-by: Zhao Wu <zhaowu@apache.org>

…generating (apache#5962) * Code migration Start (apache#1) * Init commit: Code migration Start * Add loop_state.cc/h * Add ComputeDAG basic test * Split transform_step out & Update more UTs (apache#3) * Split transform_step out * Update GetProducers & GetConsumers * Update UTs * Add UT for CacheReadWrite & Some bug fix * Add search_task, measure and serialization (apache#4) * Add FollowSplit & FollowFusedSplit tests * Update dag.InferBound & its UT * Add search_task, measure and serialization * Update Serialization UT * Add MetaTileRewritePolicy (apache#5) * Add feature * Add cost_model, meta_tile_rewrite_policy * Add MetaTileRewritePolicy basic UT * Basic Python API for State (apache#6) * Add Basic Python API for State * Add UTs for State * Add Python API: Measure & Task (apache#7) * Update the return value of state operation * Add task * Copy measure.py & utils.py * Fix LocalBuilder * Fix LocalRunner * Add ansor.auto_schedule() API; First AutoSchedule working version(apache#8) * Add basic Python support for ansor.auto_schedule * Update AutoSchedule API * Bug fix for get the attach point of a fused iter * Update UT after infer bug fix * Bug fix & Add python serialization API (apache#10) * Delete C++ UT hack since Python is ready * Add ndarray.non_empty * Update Serialization python API * Improve code style, python wrapper and test cases (apache#11) * Update c++ code style and unit test * Update python State wrapper and test cases * fix unit tests * Add RPCRunner & OpenCL/CUDA test (apache#12) * Add RPCRunner & OpenCL search test * Add CUDA search test * Add RPCRunner test * rebase to upstream/master * Add Ansor basic tutorial (apache#13) * Add basic tutorial * migrate feature extraction (apache#14) * Add XGBModel & RPCRunnerWarpper (apache#15) * Add XGBModel & RPCRunnerWarpper * Revert "Add Parallel Granularity Mutation" * Migrate workload_registry.py (apache#16) * add workload registry * update * update * add task scheduler (apache#17) * Add conv2d cuda tutorial 
with workload registry (apache#18) * add tune_test.py (the old tune_wkl.py) (apache#19) * add tune_test.py (the old tune_wkl.py) * update * fix measure * fix for gpu * Code refine for tune_test.py & Add a pre load callback (apache#20) * Bug fix for tutorials * Add PreLoadMeasuredStates * Add search_callback support for task tuner * Code refine for tune_test.py * Update * Update * Update * Update * Bug fix * Add python custom sketch rule (apache#21) * Add custom sketch rule * Bug fix * Ansor Relay Integration (without layout rewrite) (apache#22) * relay integration * Add tune_op_subgraph.py & Some code clean for tune_network.py (apache#23) * Add single op tune scripts * Add tune subgraph support * Merge all op & all subgraph to one file * Rename file * add explicit_unroll_max_extent (apache#25) * Add Index simplification & API update (apache#26) * Add vectorized cooperative_fetching test * Update math simplify for vectorized CF * File rename * Update tune_network * API update * Update PreLoadMeasuredStates & Some bug fix (apache#27) * Add a threading wrapper to fix the test bug * Set default TVM_USE_AUTO_SCHEDULER to false * Update PreLoadMeasuredStates callback * Add tensorize step for loop_state (apache#31) * Add tensorize step * State python api update (apache#33) * Start to update api * Add compute_dag to state * API update * kernel layout rewrite (apache#28) * kernel layout rewrite * remove some hacks * add defuse_ops pass and move kernel_layout_rewrite pass after fuse_ops pass * set TVM_RELAY_DISABLE_BUILD_CACHE for task extraction and prepare_layout_rewrite * [cache flush] port cache flush to ansor (apache#32) * Improve relay integration (apache#34) * tmp checkpoint * Improve relay integration * Improve relay integration * Fix xgb error & Simplify dispatcher (apache#35) * Rename "MetaTileRewritePolicy" to "SketchPolicy". (apache#36) * Rename "MetaTileRewritePolicy" to "SketchPolicy". 
* Add a new class for auto_unroll_max_step, storage_offset in StageNode * fix tune_op_subgraph.py * rebase * Migrate all node::make to noderef's construct function (apache#37) * Start to move xxxnode::make to noderef() * Update * Update * Finish transform_step * Finish comute dag & auto schedule * Update * Update * Update * Update * Update * Code refine * Code refine * Code refine * Update * Update * Some lint fix & Recover the double constructor of tvm::PrimExpr (apache#39) * lint fix * clang-format-fix * pylint fix * Update * Recover the double constructor of tvm::PrimExpr * Fix pylint * pylint fix * pylint fix * Add MutateComputeLocation and MutateParallel in evolutionary search (apache#40) * Add MutateComputeLocation and MutateParallel in evolutionary search * fix lint * Improve loop state python API (stage_tensors -> stage_ops) (apache#41) * improve loop state python API (stage_tensors -> stage_ops) * fix * ComputeDAG bug fix & Add Custom TensorCore Matmul Example (apache#42) * Bug Fix * Sample example of Custom TensorCore Matmul * Rever Commits, Start to build minimum Ansor system * Code clean for minimum Ansor system * Bug fix & Delete AccessAnalyzer * Delete attachmap & Code clean * Doc update Update statenode::stages from vector to Array * Headfile update & Python doc update * clang-format fix * pylint fix * Update * Doc update * Update * Bug fix after code merge to the new master * clang-format fix * Update * Update * Update std::vector to Array; Update verbosity setting; Some commemts addressed * std::vector->Array & std::string->String * Add init_state to ComputeDAG * Update * Update some unordered_map to Map * clang-format fix * Comments addressed Delete ReplayAndInferBound Delete ReplaySteps & InferBoundCommon * Lint fix * Update * Update * Update * Update * Update * Update * Update * Update * Update * Rename ansor namespace to auto_schedule * Update * Rename ThreadPool to ParallelFor * Add parallel_for * Remove ThreadPool * Update 
python/tvm/auto_schedule/auto_schedule.py * trigger CI Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com> Co-authored-by: Minmin Sun (孙敏敏) <minmin.smm@alibaba-inc.com> Co-authored-by: Zhao Wu <zhaowu@apache.org>

* upd * upd * upd

* Init. * Proof of concept. * Rebase on the newest branch * Move to emit_te * Update emit_te * Make RXPlaceholderOpNode as a subclass of PlaceholderOpNode * Update * run vm test_te * Update argument conversion * Reset create_primfunc * Update doc * Update test * Add error message * Update * Update * Address comment * unit test check structural and validate_te_args * raise ValueError when multiple outputs * address comments * example usage emit_te * Rename to context_mod * Handle multiple call * Address comments * Address comments * Use unique name * remove * rename args to te_args * address comments * fix TVMscript manually * spelling Co-authored-by: Andrew Liu <andrewlliu@gmail.com>

[SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <expye@outlook.com> Fix AxisTree (apache#3) * fix axis tree * upd [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5) * Add dtype for SparseBuffer * Add name for SparseBuffer. Remove `ndim` * Remove namespace sparse * Add SparseBufferLoad/Store * Add method `ndim()` [SparseTIR] Introduce SpIterVar (apache#6) * [SparseTIR] Introduce SpIterVar * Add conversion to PrimExpr [BugFix] Fix binary search & SpIterVar (apache#7) [BugFix] Add field `is_reduction` for SpIterVar (apache#9) * [BugFix] Add field `is_reduction` for SpIterVar * Formatting [SparseTIR] Index Lowering (apache#8) * Add StmtFunctor/ExprFunctor for SparseBufferStore/Load * Add basic index lowering * Finish index lowering (maybe) * Address comments * Convert CRLF to LF Frontend update, demo scripts. (apache#10) * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. 
(apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <expye@outlook.com> * Fix AxisTree (apache#3) * fix axis tree * upd * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * fix axis tree * upd * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <expye@outlook.com> * Fix AxisTree (apache#3) * fix axis tree * upd * [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5) * Add dtype for SparseBuffer * Add name for SparseBuffer. Remove `ndim` * Remove namespace sparse * Add SparseBufferLoad/Store * Add method `ndim()` * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. 
(apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <expye@outlook.com> * Fix AxisTree (apache#3) * fix axis tree * upd * [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5) * Add dtype for SparseBuffer * Add name for SparseBuffer. Remove `ndim` * Remove namespace sparse * Add SparseBufferLoad/Store * Add method `ndim()` * [SparseTIR] Introduce SpIterVar (apache#6) * [SparseTIR] Introduce SpIterVar * Add conversion to PrimExpr * [BugFix] Fix binary search & SpIterVar (apache#7) * [BugFix] Add field `is_reduction` for SpIterVar (apache#9) * [BugFix] Add field `is_reduction` for SpIterVar * Formatting * upd * upd Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com> [SparseTIR] SparseBlock on C++/Python side (apache#11) * Fix a bug in the last commit * SparseBlock on C++ & Python side [BugFix][SparseTIR] TVMScript Parser for Axis & SpIterVar (apache#12) * Update `cord` and `pos` * Fix `idtype` * Formatting.. 
* Bug fix 1 * Move new special stmts * Parser for Axis and SpIterVar * Fix context_maintainer.py [SparseTIR] Enhance SparseBlock to contain enough PrimFunc information (apache#13) * Enhance SparseBlock to have enough PrimFunc info * Remove `func_sparse_buffer_map_` * Don't print the map uh-huh [SparseTIR] Parser, Printer, Roundtrip (apache#14) * SparseBlock scope handler (part 1) * SparseBlock scope handler (part 2) * SparseBlock scope handler (part 3) * SparseBlock scope handler (fix 1) * Add SparseBufferLoad/Store on Python side * Parser for SparseBufferLoad/Store * Add SparseBlock to Python __init__ * StmtFunctor for SparseBlock * Ensure at least one dimension for SparseBuffer * Make `axis` field of SpIterVar mandatory * SparseBlock scope handler (fix 2) * Update Axis syntax by removing `name` parameter * Move to intrin.py * Add filed `from_sparse` to DenseFixedAxis * SparseTIR script printer * Roundtrip test * `update_symbol` bug fix * Fix attr visit in SparseBuffer * Define then compare in SparseBlock * Fix printer bug for SparseBuffer * Enable graph match for Axis and SparseBuffer * Complete HashReduce and EqualReduce for AxisTree and SparseBuffer * Fix typo * Rename test * Bug fix 1 * Bug fix 2 * Add more tests Move tests (apache#15) [SparseTIR] ReprPrinter for Axis and SpIterVar (apache#16) upd (apache#17) flatten (apache#18) ELL and BSR correctness test scripts (apache#19) [SparseTIR] SparseTIR Lowering (apache#20) * Fix a previous bug of sparse-fixed SpIterVar creation * Fix a previous bug in `GetDenseValue` * Refactor Collector and IndexTransformer * Construct block and loops * Fix a previous bug which rejects DV iters in collector * Update buffer map * Create root block * Fix bug of sparse-fixed SpIterVar creation * Fix bug on SpIterVar conversion (with refactor) * Fix bug when getting dependent SpIterVars * Fix bug on dependency map and index lowering * Full block read/write region * Test version 1 * Fix bug of loop order * Fix bug of batch-mm iterator 
ordering * Update PrimFunc args to use symbolic params * Fix bug of test "csr_element_wise" * Fix bug of index accumulation for sparse-fixed axis * Update correctness test * Test structural equality * Refactor and use Array fix nnz cols Add docstring for sparse tir lowering (apache#21) * add docstring * upd Add more examples part 1 (sddmm) (apache#22) * upd * upd * upd [SparseTIR][Schedule] SparseBlockRV, GetSparseBlock, SparseReorder (apache#23) * Test initialization * Fix a stupid bug of ReprPrinter * Add SparseBlockRV * Schedule: GetSparseBlock * Schedule: Reorder [SparseTIR][Schedule] GetSpIters (apache#24) remove hybrid script for successful compilation Add atomic intrinsic for output nonzero inference. (apache#25) * upd * upd Add "sparse" block attribute. (apache#26) Revert "remove hybrid script for successful compilation" This reverts commit eebd7c1. [SparseTIR] Hack `IsAffineBinding` check (apache#27) * [TensorIR][Schedule] Inherit block anotation upon creating new blocks * Fix SDDMM test * Hack IsAffineBinding for sparse blocks Axis Dependency Tree aware code-gen and bmm example (apache#28) * upd * upd * upd * upd * upd * upd * upd * upd * remove redundancy * fix * upd * upd Re-design Indices lowering (apache#29) * upd * upd * upd * upd * upd * init * format * fix * revise coding-style * format Complete indices lowering (apache#30) * upd * upd * upd * done * upd * passed test * upd Add more docstrings and depress warnings for new lowering algorithm. (apache#31) Refactor derived axis, frontend support of fusion. (apache#32) * upd * upd * fix Fatal bugfix and change the signature of DenseVariableAxis. (apache#33) Syntax simplification (apache#34) Change the order of generated blocks for block isolation. 
(apache#35) * upd * upd * upd Syntax of AttachAxis for BMM (apache#36) * upd * upd * upd [SparseTIR] Add "square sum" lowering test (apache#37) * Add square sum test * Remove pylint comment [BugFix] Fix offset caching in lowering (apache#38) * Hack compact dataflow check in a dirty way * Add two-K square sum test * Mark skipped tests * Fix offset saving in lowering Fusion syntax fix + SDDMM example. (apache#39) Some structure change on update offsets. (apache#40) [Refactor] SparseTIR Lowering (apache#41) * Take out methods in Scope * Refactor * Refactor "match" * Tweak scope contents * Refactor ViewIndexInAxis * Refactor Scope * SDDMM tests under implementation * Refactor block stack * Use Map for var_map * Extract NeedCreateNewBlock * Simplify SpIterVarToIterVar via GetIterExtent * Refactor NeedCreateNewBlock * Add docstring * Use "auto" correctly * Minor refactor and use some move Remove redundant analyzers (apache#42) Support indices lowering for attach and fuse. (apache#43) * upd * upd * upd Fix irregular BMM example. (apache#44) * upd * upd * upd * upd RGCN forward and butterfly pattern example. (apache#45) Fused SDDMM example. (apache#46) * upd * wip * fix Fix sparse reorder after refactor (apache#47) [Refactor] Refactor Unittest (apache#48) * upd * remove redundancy [Unittest] Correctness test for benchmarking scripts (apache#49) Bugfix and more test for axis fusion, new workload (apache#50) * upd * upd upd

rebased [TIR][Schedule] fix reorder/buffer_flatten & finish CPU demo (apache#59) [CPU DEMO] Update cpu gemm demo and fix bug (apache#58) * [TIR][Schedule] introduce parallel and fix bugs for cpu demo * [TIR][Schedule] update cpu demo * [TIR][Schedule] fix lint * [TIR][Schedule] fix rebased [TIR][Schedule] introduce reduction block and CPU demo (apache#53) * [TIR] reduction : split_reduction * [TIR] reduction : split_reduction * [TIR] reduction : fuse_reduction * [TIR] reduction : cpu demo * [TIR] reduction : fix * [TIR] reduction : pattern detect remains * [TIR] reduction : pattern detect remains * [TIR] reduction : pattern match done * [TIR] reduction : fix lint * [TIR] reduction : fix * [TIR] reduction : fix * [TIR] reduction : fix * [TIR] reduction : fix * [TIR] reduction : rebased * [TIR] reduction : rebased [TIR][Schedule] introduce cache_read cache_write (apache#54) * [TIR][Schedule] introduce cache_read cache_write * [TIR][Schedule] add more comments * [TIR][Schedule] fix problem and add comments * [TIR][Schedule] address comments [TIR] schedule: introduce vectorize, unroll, loop validation (apache#47) * [TIR] vectorize : basically complete * [TIR] vectorize&unroll : update comments&unroll * [TIR] vectorize&unroll : rebased * [TIR] vectorize, unroll, cpu_demo: done * [TIR] vectorize, unroll, cpu_demo: simplify * [TIR] vectorize, unroll, cpu_demo: fix * [TIR] reduction : rebased * [TIR] reduction : fix [TIR][Schedule] fix sref and scopes problem during replace and compute_at (apache#50) * [TIR][Schedule] fix sref and scopes problem during replace and compute_at * [TIR][Schedule] fix * [TIR][Schedule] fix [TIR][Refactor] move function to ScheduleNode [TIR] Schedule: introduce primitive compute_at (apache#36) * [TIR] Schedule: introduce primitive compute_at * [TIR] Schedule: address comments * [TIR] Schedule: address comments * [TIR] Schedule: address comments * [TIR] Schedule: add check to compute_at * [TIR] Schedule: address comments * [TIR] Schedule: address 
comments [TIR] Schedule: introduce primitive reorder (apache#37) * [Schedule] debug * [TIR] Schedule: reorder, loop type detect remains * [TIR] reorder complete * [TIR] reorder complete * [TIR] fix * [TIR] reorder : rebased complete * [TIR] reorder : fix container.h * [TIR] reorder : fix * [TIR] reorder : fix * [TIR] reorder : fix * [TIR] reorder : simplify * [TIR] reorder : simplify * [TIR] reorder : simplify * [TIR] reorder : fix * [TIR] reorder : fix * [TIR] reorder : rebased * [TIR] reorder : rebased rebase [TIR] Schedule: introduce BlockRealize and Block SRef reuse(apache#39) * [TIR] BlockRealize: schedule refactor * [TIR] BlockRealize: debug * [TIR] BlockRealize finish * [TIR] BlockRealize finish * [TIR] BlockRealize fix * [TIR] BlockRealize update test * [TIR] BlockRealize: add loop var reuse * [TIR] BlockRealize: add loop var reuse * [TIR] BlockRealize: fix * [TIR] BlockRealize: fix * [TIR] BlockRealize: fix * [TIR] BlockRealize: fix * [TIR] BlockRealize: fix * [TIR] BlockRealize: fix * [TIR] BlockRealize: fix * [TIR] BlockRealize: fix * [TIR] BlockRealize: fix * [TIR] BlockRealize: fix [TIR] compare for module (apache#38) * [TIR] compare for module * [TIR] fix * [TIR] fix * [TIR] fix * [TIR] fix * [TIR] fix * [TIR] fix [Hybrid] Module init [Hybrid] Module print [Hybrid] Module print with meta [Hybrid] adjust [Hybrid] finished but without lint and comment check [Hybrid] fix lint [Hybrid] comments [Hybrid] fix script decoration API [Hybrid] using IRModule [Hybrid] fix [Hybrid] adjust API [Hybrid] fix [Hybrid] fix [Hybrid] fix [Hybrid] fix symbol table, adjust API, introduce meta_mutator and resolve import issue [Hybrid] fix lint [TIR] introduce pass BufferFlatten (apache#32) * [TIR] introduce pass BufferFlatten * [Tir] add comments & remove old TeLower * [TIR] split GatherRegion and BufferFlatten to two Visitor/Mutator * [TIR] address comments: Only consider stmt scope * [TIR] BufferFlatten: address comments * [TIR] BufferFlatten: fold BlockFlattener into 
[FUSION] add Fusion(Schedule)

d15bbf6

tqchen reviewed Feb 8, 2017

[FUSION] rename to AutoFuseEwise, detect whether the stage has been scheduled

83f5c4f

ZihengJiang changed the title ~~[FUSION] WIP: add Fusion(Schedule)~~ [FUSION] add 'void AutoFuseEwise(Schedule sch)' Feb 8, 2017

[FUSION] change to visitor pattern

f66b3c9

tqchen requested changes Feb 9, 2017

ZihengJiang added 3 commits February 9, 2017 03:09

[FUSION] rename filename

d137400

[FUSION] fine-tune the interface

8d6166c

[FUSION] typo

fbf9ff0

tqchen reviewed Feb 9, 2017

src/pass/elem_wise.cc Outdated

  void Visit(const NodeRef& e) final {
    if (!is_elem_wise_)
      return;

tqchen Feb 9, 2017

no new line, or add {}
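
The style note above can be illustrated with a small standalone sketch. The `ElemWiseChecker` class, its `Visit(bool)` signature, and the `CheckAll` helper are hypothetical stand-ins, not the code under review; they only show the two layouts the comment accepts for an early return: keep `return;` on the `if` line, or put the body on its own line inside braces.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the visitor under review (not TVM code).
// Each "node" is reduced to a bool: true if it keeps the check passing.
class ElemWiseChecker {
 public:
  // Layout 1: the early return stays on the same line as the `if`.
  void Visit(bool node_ok) {
    if (!is_elem_wise_) return;
    ++visited_;
    if (!node_ok) is_elem_wise_ = false;
  }

  // Layout 2: the body moves to its own line, so it gets braces.
  void VisitBraced(bool node_ok) {
    if (!is_elem_wise_) {
      return;
    }
    ++visited_;
    if (!node_ok) {
      is_elem_wise_ = false;
    }
  }

  bool is_elem_wise() const { return is_elem_wise_; }
  int visited() const { return visited_; }

 private:
  bool is_elem_wise_{true};
  int visited_{0};
};

// Runs the checker over a sequence of nodes and reports the final flag.
inline bool CheckAll(const std::vector<bool>& nodes) {
  ElemWiseChecker checker;
  for (bool ok : nodes) checker.Visit(ok);
  // The early return can only skip nodes, never visit extra ones.
  assert(checker.visited() <= static_cast<int>(nodes.size()));
  return checker.is_elem_wise();
}
```

Both methods behave identically; the difference is purely layout. The unbraced two-line form in the reviewed snippet is the one the comment asks to avoid.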

tqchen reviewed Feb 9, 2017

src/pass/elem_wise.cc Outdated

@@ -0,0 +1,57 @@
/*!
 * Copyright (c) 2016 by Contributors

tqchen Feb 9, 2017

The copyright year should be 2017

tqchen reviewed Feb 9, 2017

move elem_wise to schedule

d547e52

rename test function

ae4de6f

tqchen approved these changes Feb 9, 2017

tqchen merged commit 6a62beb into apache:master Feb 9, 2017

ZihengJiang deleted the fusion branch February 9, 2017 17:14

tqchen changed the title ~~[FUSION] add 'void AutoFuseEwise(Schedule sch)'~~ [SCHEDULE] add 'void AutoFuseEwise(Schedule sch)' Feb 10, 2017

tqchen added a commit to tqchen/tvm that referenced this pull request May 26, 2018

overview (apache#36)

acc3d0c

* overview * fix

tqchen added a commit to tqchen/tvm that referenced this pull request May 26, 2018

[DOCS][FRONTEND] Modify from_mxnet to also return params, update docs (apache#36)

7b33b48

tqchen added a commit that referenced this pull request May 29, 2018

overview (#36)

e66c4e0

* overview * fix

tqchen added a commit that referenced this pull request May 29, 2018

[DOCS][FRONTEND] Modify from_mxnet to also return params, update docs (#36)

41d9069

tqchen added a commit to tqchen/tvm that referenced this pull request Jul 6, 2018

overview (apache#36)

af5ba3f

* overview * fix

tqchen added a commit to tqchen/tvm that referenced this pull request Jul 6, 2018

[DOCS][FRONTEND] Modify from_mxnet to also return params, update docs (apache#36)

3a5b9fc

grwlf pushed a commit to grwlf/tvm that referenced this pull request Aug 8, 2018

overview (apache#36)

feeffff

* overview * fix

grwlf pushed a commit to grwlf/tvm that referenced this pull request Aug 8, 2018

[DOCS][FRONTEND] Modify from_mxnet to also return params, update docs (apache#36)

d2bdc89

tmoreau89 pushed a commit to tmoreau89/tvm that referenced this pull request Jan 2, 2019

Add relay_bitpack.py (apache#36)

e66d5c9

tmoreau89 pushed a commit to tmoreau89/tvm that referenced this pull request Mar 22, 2019

Add relay_bitpack.py (apache#36)

5863d37

tmoreau89 pushed a commit to tmoreau89/tvm that referenced this pull request Mar 22, 2019

Add relay_bitpack.py (apache#36)

d155aec

MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this pull request Dec 21, 2021

Syntax of AttachAxis for BMM (apache#36)

4bda7c1

* upd * upd * upd

MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this pull request Dec 24, 2021

Syntax of AttachAxis for BMM (apache#36)

871e04a

* upd * upd * upd

cyx-6 pushed a commit to cyx-6/tvm that referenced this pull request May 27, 2022

Get the POC functioning (apache#36)

af44d4e

cyx-6 pushed a commit to cyx-6/tvm that referenced this pull request Jun 10, 2022

Get the POC functioning (apache#36)

eb5fe57

cyx-6 pushed a commit to cyx-6/tvm that referenced this pull request Jun 25, 2022

Get the POC functioning (apache#36)

47dd4ba

junrushao added a commit to cyx-6/tvm that referenced this pull request Jul 4, 2022

Get the POC functioning (apache#36)

212daec

cyx-6 pushed a commit to cyx-6/tvm that referenced this pull request Jul 13, 2022

Get the POC functioning (apache#36)

3aa35e7

Hzfengsy pushed a commit to Hzfengsy/tvm that referenced this pull request Jul 30, 2022

Get the POC functioning (apache#36)

8bc748b

vinx13 pushed a commit to vinx13/tvm that referenced this pull request Mar 27, 2023

[Frontend] Replace all calls to the relay importer (apache#36)

eade7df

Exclusively use relax converters in onnx frontend

mikeseven pushed a commit to mikeseven/tvm that referenced this pull request Sep 27, 2023

TVM changes to support introduction and typing of new custom operations.

75b00e8

Merged in SIM-6711 (pull request apache#36) Approved-by: Mikael Sevenier Approved-by: Joey Chou
