[WIP][Relax][Op] Register gradient for nll_loss and Conv2d #114
Closed
Ubospica wants to merge 305 commits into mlc-ai:relax from Ubospica:mlc-dev/2023-01-26-conv2d_nllloss_gradient
Conversation
Ubospica changed the title from "Register gradient for nll_loss" to "[WIP][Relax][Op] Register gradient for nll_loss" on Jan 27, 2023
Ubospica changed the title from "[WIP][Relax][Op] Register gradient for nll_loss" to "[WIP][Relax][Op] Register gradient for nll_loss and Conv2d" on Jan 28, 2023
Ubospica force-pushed the mlc-dev/2023-01-26-conv2d_nllloss_gradient branch 2 times, most recently from c987f75 to 66ef91a on January 29, 2023 17:15
* [WIP] Basic task extraction mechanism is implemented.
* [WIP] For gradual integration with Relay pipeline, meta_schedule/integration.py is created for relax to avoid potential conflict.
* support tir tuning and injection mode
* Add target field for Relax Extracted Task
* 1. Create relax namespace/tvm objects/... for metaschedule to preserve relay support. 2. Promote target field from Optional<Target> to Target
* Support ApplyHistoryBest
* Reflect feedback from Yuchen
* minor improvement and fix linter issue
* add ASF header
* Reorganize file structure
* fix lint errors
* remove the import-outside-toplevel
* Reflect comments
* remove redundant comment
* As per discussion w/ Yuchen, ApplyHistoryBest is introduced as a Relax transformation pass.
* remove redundant print msg
* fix lint
* reflect comments
* Enable tests.
* Updated.
* Updated.
* Updated.
…er (mlc-ai#76)
* [CI] Set up CI; format and lint relax code to pass CI (mlc-ai#72)
* init
* fix lint
* update task_lint
* more lint
* more lint
* lint
* jenkinsfile
* jenkinsfile
* run relax only tests
* python3.7 for pytest
* point to personal ci-cpu docker
* docker pull
* test
* fix cmake config
* update
* update
* rebase
* rebase
* AutoTIR integration (mlc-ai#58)
* [WIP] Basic task extraction mechanism is implemented.
* [WIP] For gradual integration with Relay pipeline, meta_schedule/integration.py is created for relax to avoid potential conflict.
* support tir tuning and injection mode
* Add target field for Relax Extracted Task
* 1. Create relax namespace/tvm objects/... for metaschedule to preserve relay support. 2. Promote target field from Optional<Target> to Target
* Support ApplyHistoryBest
* Reflect feedback from Yuchen
* minor improvement and fix linter issue
* add ASF header
* Reorganize file structure
* fix lint errors
* remove the import-outside-toplevel
* Reflect comments
* remove redundant comment
* As per discussion w/ Yuchen, ApplyHistoryBest is introduced as a Relax transformation pass.
* remove redundant print msg
* fix lint
* reflect comments
* Yuchen's change
* relax ConstantNode in parser and printer
* Add constant data in the metasection
* rebase
* Support ir_module(metadata=json_str)
* update test case
* remove print info
* Update tests
* clang-format
* pylint
* fix ci
* Save a copy of metadata in RelaxTransformer
* Fix comments
* fix comments

Co-authored-by: Yuchen Jin <yuchenj@cs.washington.edu>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Prakalp Srivastava <prakalp@octoml.ai>
* Clean up task extraction
* black
* Change call_tir convention and fix shape/type deduction.
* test
* output shape as 3rd arg.
* address comments.
* lint
Enhance VM Executable as a Subclass of runtime::Module
* [VM] Refactor and improve vm.
  - Have a separate function for RunInstCall.
  - Cache func_index lookup by table to avoid repetitive lookup by str.
  - Move PackedFunc call arg stack to Frame to increase locality and avoid re-allocation in repetitive calls.
  - Make frame stack of unique_ptr to avoid frame re-allocation and copy during frame.resize.
  - Pass curr_frame as arguments into sub-functions to make it explicit.
* address review comments
* improve Printer for DynTensorType & ShapeExpr
* add testcases
* Add is_device field to attr.
* Update.
* Address comment.
* update.
* Update.
* Fix call_tir parsing bug.
* update.
* fix structural_equal_hash (cherry picked from commit e7e962634999739a32129378f61cc95f58335447)
* address comment & pass the ci
The pattern field of the match shape can define variables; as a result, we need to add DefEqual and Hash here. Added a regression testcase. Lesson: we would benefit from more testcases with check_save_roundtrip checks (like this one) for more relax examples. Additional change:
- Redirected the TVMScript printer to be able to print relax fragments, which is useful for debugging.
* Add gpu ci.
* Update autotir gpu test.
TOPI has an internal implementation of `collapse_sum` (`tvm/topi/reduction.h`), but it is not exposed to FFI and cannot be called from the Python side. This patch exposes it and adds some related tests. Besides, the legalizer can now legalize `collapse_sum_like/to`! Due to the TOPI implementation, however, it cannot handle the symbolic case.
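The semantics of `collapse_sum` (sum-reduce a tensor back to a broadcast-source shape, i.e. the adjoint of broadcasting) can be sketched in NumPy. This is a simplified illustration for static shapes, not the TOPI implementation, and the helper name here is hypothetical:

```python
import numpy as np

def collapse_sum(data, target_shape):
    # Sum-reduce `data` down to `target_shape`, reversing broadcasting.
    ndim_diff = data.ndim - len(target_shape)
    # Reduce the leading axes that broadcasting prepended.
    out = data.sum(axis=tuple(range(ndim_diff)))
    # Reduce the axes that were broadcast from size 1.
    axes = tuple(i for i, s in enumerate(target_shape)
                 if s == 1 and out.shape[i] != 1)
    if axes:
        out = out.sum(axis=axes, keepdims=True)
    return out

x = np.ones((4, 3, 5))
print(collapse_sum(x, (3, 1)).shape)  # (3, 1)
```

Each output element sums the input elements that a broadcast from `target_shape` would have replicated, which is exactly what the gradients of broadcasting binary ops need.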
This PR migrates mlc-ai#46 to the new struct info infra, as part of our AD migration. Because we need to do numerical testing for gradients, this PR depends on the operator legalizer mlc-ai#96. Also, because the original version of the legalizer did not handle the negative indexing case of `relax.mean`, this PR fixes it. To lower `collapse_sum_to` and `collapse_sum_like` properly, this PR migrates a previous patch mlc-ai#43 which introduces `collapse_sum` in topi. Now we can remove the skip marker in the legalizer test for `collapse_sum_to` and `collapse_sum_like`. The gradients of `cross_entropy` and `softmax_cross_entropy` are removed; the former will be added back and adjusted to the new `cross_entropy` introduced in mlc-ai#96.

Further plan in this PR:
- [x] Add gradients for `log_softmax` and `nll_loss` once mlc-ai#94 is merged.
- [x] Gradients for some tuple related operators such as `split` and `concat`. This helps us test the correctness of AD when there are Tuple-I/O operators.
- (Not in this PR) "Undefined Gradient" representation. As we know, the gradients of some operators w.r.t. specified inputs are undefined or meaningless, such as the partial gradient w.r.t. `indices` in `take(x, indices)`. Relay directly uses `zeros_like` in this case, as it won't affect gradient propagation. Another choice is to introduce a dummy Expr named `UndefinedGradient` to represent it. How do we handle this case in relax?
Create a separate yaml file as in mlc-ai#104 for CI on MLC relax to ease future sync with tlc-pack/relax. Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
This is the PR following mlc-ai#55 after the source branch moved to a personal repo. This PR is based on mlc-ai#98. This PR adds the new automatic differentiation API:
- `Gradient(func: GlobalVar, require_grads: Optional[Union[Var, List[Var]]] = None) -> tvm.ir.transform.Pass` - transforms the given function in the IRModule, and adds a new function that calculates the gradient with regard to the function's output

Currently, Gradient only supports differentiating a function in the IRModule with one dataflow block with respect to the only return value of the function, which needs to be a scalar.

This PR adds two files for unit tests:
- `tests/python/relax/test_transform_gradient.py` only contains `assert_structural_equal` assertions.
- `tests/python/relax/test_transform_gradient_numeric.py` contains numeric checks, including manually derived gradients and the numerical differentiation method `check_numerical_grads`.

Checkpoints:
- [x] Refactor to use CopyWithNewParams and ExprFunctor
- [x] Check that int64/int32 tensors are not differentiated (now only checked in params)
- [x] Rebase & migrate to StructInfo
- [x] Refactor about Tuple
- [x] Refactor about NestedMsg
- [x] Support ops taking in tuple or returning tuple
- [x] Eliminating collapse_sum_to (done in mlc-ai#98)

Future:
- (Not in this PR) Handle undefined gradients in add and return value - for now we handle them as zeros

Co-authored-by: SiriusNEO <1713833595@qq.com>
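The numeric checks mentioned above rest on finite-difference approximation of gradients. A minimal NumPy sketch of that idea (an illustration of the technique, not TVM's `check_numerical_grads` itself) looks like:

```python
import numpy as np

def numerical_grad(f, x, eps=1e-6):
    # Central-difference approximation of df/dx for a scalar-valued f
    # over a float array x. Each element is perturbed in turn.
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        orig = x[idx]
        x[idx] = orig + eps
        fp = f(x)
        x[idx] = orig - eps
        fm = f(x)
        x[idx] = orig  # restore
        grad[idx] = (fp - fm) / (2 * eps)
    return grad

x = np.array([1.0, 2.0, 3.0])
g = numerical_grad(lambda v: (v ** 2).sum(), x)  # analytic gradient is 2*x
```

An AD-produced gradient can then be compared against `g` with a tolerance, which is how a numeric gradient test catches wrong registered gradients.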
Implements the layout conversion pass.
This PR implements the library dispatcher for Relax, which currently uses CUTLASS as one library. It introduces TIR-level pattern registration and a matching algorithm, as well as a Relax pass that splits out subgraphs matching the patterns of backends.
This PR includes:
- Introduce `R.abs` and its legalization (for L1Loss)
- Register most of the unary operators in [DataAPI](https://data-apis.org/array-api/draft/API_specification/elementwise_functions.html) (without legalization)
- Split unary arithmetic operators and check operators (e.g. `isnan`)
- Refactor `test_tvmscript_parser`, `test_op_unary` and `test_op_binary` using `tvm.testing.parameters()`.
Implements the Relax importer from PyTorch, using torch FX. An example use of the importer is:

```python
# Import the importer.
from tvm.relax.frontend import from_pytorch

# Define the module
class MyModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(in_features=10, out_features=7, bias=True)

    def forward(self, input):
        return self.linear(input)

# Instantiate the model and create the input info dict.
torch_model = MyModule()
input_info = {"input_1": ((128, 10), "float32")}

# Use the importer to import the PyTorch model to Relax.
mod: tvm.IRModule = from_pytorch(torch_model, input_info)

# Print out the imported model.
# print(mod.script())
```

Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>
This PR introduces loss functions for relax training and provides a tool `append_loss`, which enables users to append a loss after a forward function. Some previous discussions about `append_loss` can be found in mlc-ai#111. Currently supported:
- L1Loss
- MSELoss
- CrossEntropyLoss
…nvert and CUTLASS dispatch (mlc-ai#116) Fixes bugs found when importing fp16 UNet & VAE through the importer, legalization, AMP, layout conversion, and CUTLASS codegen.
…i#118) This PR enables importing MobileNetV2 from PyTorch FX, which includes the following changes:
- Add `relax.clip` op, with PrimValue support
- Support `torch.clamp` and `torch.ReLU6` in the PyTorch FX frontend
- Support `torch.nn.functional.adaptive_avg_pool2d` in the PyTorch frontend
As discussed before, we won't introduce a new Linear Op at IR level. However, this PR adds a user interface at the python side, which is composed of transpose, matmul and a bias add. Additionally, PyTorch supports 1D weight Tensor for Linear Op (https://pytorch.org/docs/stable/generated/torch.nn.functional.linear.html). So we cannot use `permute_dims(weight, axes=[1, 0])` in the fx importer. This PR fixes this issue as well.
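The composition described above (transpose + matmul + bias add, with the transpose skipped for a 1D weight) can be sketched in NumPy. This illustrates the convention only; it is not the importer's actual code:

```python
import numpy as np

def linear(x, weight, bias=None):
    # Linear expressed as permute_dims + matmul + add.
    # PyTorch's 1D-weight case must skip the transpose, since a
    # rank-1 weight has no axes to permute.
    w = weight if weight.ndim == 1 else weight.T
    out = x @ w
    if bias is not None:
        out = out + bias
    return out

x = np.ones((2, 4))
w = np.ones((3, 4))   # (out_features, in_features), as in torch.nn.functional.linear
b = np.zeros(3)
print(linear(x, w, b).shape)  # (2, 3)
```

With a 1D weight of shape `(4,)`, the same call reduces to a plain dot product per row, matching PyTorch's documented behavior.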
Update optimizer APIs.
- Remove `@property state` and `@state.setter`
- Add `init()` interface
- Remove `Optimizer.__call__()`
- Remove underscores before attributes, and unnecessary attributes

Current interfaces:

```python
class Optimizer:
    dtype: str
    name: str
    param_list: List[Var]
    state: tvm.runtime.container.ADT

    def __init__(self, name: str) -> None:
        self.name = name
        self.param_list = None
        self.state = None
        self.dtype = None

    def init(self, params: Union[Var, List[Var]]) -> "Optimizer":
        """Set the parameters, determine the dtype, and build the initial state for the optimizer."""
        pass

    def get_function(self) -> Function:
        """Use blockbuilder to build an optimizer function that executes updates of the parameters and the optimizer state."""
        pass
```

Use examples: see <https://github.com/ACMClass-TVM-20/AD-Example/blob/dc255150dc6a4a6de2fffc2c093a8b2bacc1b030/optimizer_api_example.py>

This PR also updates the Gradient API:
- Before: `def Gradient(global_var: GlobalVar, require_grads: Optional[Union[Var, List[Var]]]) -> tvm.ir.transform.Pass`
- After: `def Gradient(func_name: str, require_grads: Optional[Union[Var, List[Var]]]) -> tvm.ir.transform.Pass`

Unit tests are changed accordingly.
This is a prototype of `LiftTransformParams`. It allows compiling the end-to-end model without weights provided. The idea is to annotate the input parameters that are weights, identify and lift the transformations applied to weights, and compile them into a separate function `transform_params` that can be executed at runtime. Users can run `transform_params` with the weights to obtain the weights for the optimized model, as a prep step before deployment. In this way, we perform the same optimizations while deferring the weight transformations to the user side; the overhead of the deferred weight transformation can be ignored, as it only needs to be run once. A demo notebook is available [here](https://github.com/vinx13/relax/blob/fda857553557c34f97a4f7193a529da607a3421c/tests/python/relax/demo_lift_transform_params.ipynb). This pass is not yet integrated with the default `vm.build`, as we are going to iterate on it.
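The idea can be illustrated with a toy NumPy sketch, where a weight-only transformation is lifted out of the forward function and run once ahead of time. All names here are hypothetical and stand in for the lifted Relax functions:

```python
import numpy as np

def transform_params(weight):
    # A weight-only transformation (e.g. a layout change) that the pass
    # lifts out of the forward function. It depends only on the weight,
    # so it can run once, offline.
    return np.ascontiguousarray(weight.T)

def forward(x, transformed_weight):
    # The deployed function consumes the pre-transformed weight directly,
    # so no weight transformation happens on the hot path.
    return x @ transformed_weight

w = np.random.rand(3, 4)
w_t = transform_params(w)          # run once, before deployment
y = forward(np.ones((2, 4)), w_t)  # runs repeatedly at inference time
```

The split lets the compiler optimize `forward` without the weights being available at compile time, which is the point of the pass.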
This PR adds support for torch dynamo.
spectrometerHBH pushed a commit to spectrometerHBH/relax that referenced this pull request on Feb 9, 2023
* add DataflowBlockPass
* update fma_rewrite
* drop the skip function
* update test_fma_rewrite with DataflowBlockPass
* fix the format
* fix name
* rewrite test in tvm script
* add non-dataflow Vars check
* add fail testcases
* module->IRModule
* add docstring to DataflowBlockNode
* remove unused pattern
* Transform Pass->DataflowBlock Pass
* rename global var to global scope var
* remove print stmt
* reformat tests
* add docstring to DataflowBlockMutator
* fix filename
* minor fix
This PR brings a wrapper for relax training. The following things are done internally in this trainer:
- Maintain (store/update) the parameters of the module.
- Merge the backbone and the specified loss function together.
- Build/Compile/Run the module.
- Build/Compile/Run the optimizer (using the same vm_config as we use to run the module).

It also provides two interfaces for loading and exporting params. Example:

```
trainer = Trainer(MLP, [1, 2], "main")  # [1, 2] means input[1] and input[2] are parameters in this module.
trainer.set_loss(MSELoss(reduction="sum"), pred_sinfo, pred_sinfo)
trainer.set_vm_config(target="llvm")
trainer.set_optimizer(optim_type=SGD, lr=0.001).setup()
trainer.setup()
trainer.rand_init_params()
trainer.forward(*fwd_inputs)
trainer.backward(*bwd_inputs)
```
This PR introduces the pipeline namespace, which contains a collection of pre-defined pipelines that optimize and lower an IRModule before passing it to the minimum build.
This PR adds a new pass SimplifyNormInference that unpacks the norm operator into a sequence of operators, analogous to the SimplifyInference pass in Relay.
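For intuition, a batch-norm-style op at inference time unpacks into elementwise primitives roughly as follows. This NumPy sketch illustrates the kind of decomposition such a pass produces, not the pass itself:

```python
import numpy as np

def batch_norm_inference(x, gamma, beta, moving_mean, moving_var, eps=1e-5):
    # The fused norm op rewritten as subtract / divide / multiply / add
    # on primitive tensor ops. In inference mode the moving statistics
    # are constants, so no reduction over the batch is needed.
    return (x - moving_mean) / np.sqrt(moving_var + eps) * gamma + beta

x = np.array([[1.0, 2.0], [3.0, 4.0]])
out = batch_norm_inference(x,
                           gamma=np.ones(2), beta=np.zeros(2),
                           moving_mean=np.zeros(2), moving_var=np.ones(2))
```

Once unpacked, the individual elementwise ops can be fused and constant-folded by later passes, which is the usual motivation for this simplification.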
This PR fixes the timeout rule of the MetaSchedule RPCRunner. Prior to this PR, the RPCRunner set a timeout threshold for jobs submitted to the popen pool. As a result, jobs were timed from the moment they were sent to the remote side. Consider the case where there is only a single device for measurement: all jobs can only be executed serially, so they must queue up. The previous timeout configuration therefore counted queueing time as well. This caused some jobs, in the worst case, to time out without ever starting to execute, which had negative impacts on RPC MetaSchedule tuning in terms of both efficiency and result performance.

Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Ubospica force-pushed the mlc-dev/2023-01-26-conv2d_nllloss_gradient branch from 4792bc7 to 2d6b98b on February 12, 2023 11:02
* nll_loss gradient finished
* nll loss grad finished
* formatted
* nll_loss finished
* conv2d gradient finished
* test not finished
* rename nll_loss_backward
* trial
* conv2d finished
* conv2d finished
* tile and repeat finished
* conv2d finished
* formatted
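For reference, the gradient of a mean-reduced `nll_loss` with respect to its log-probability input is -1/N at each target position and zero elsewhere. A simplified NumPy sketch (no class weights or ignore_index, unlike the gradient this PR registers):

```python
import numpy as np

def nll_loss(logp, targets):
    # Mean negative log-likelihood over the batch.
    # logp: (N, C) log-probabilities; targets: (N,) class indices.
    n = logp.shape[0]
    return -logp[np.arange(n), targets].mean()

def nll_loss_grad(logp, targets):
    # Gradient of the mean-reduced nll_loss w.r.t. logp:
    # -1/N at each (i, targets[i]) position, zero elsewhere.
    n = logp.shape[0]
    grad = np.zeros_like(logp)
    grad[np.arange(n), targets] = -1.0 / n
    return grad

logp = np.log(np.array([[0.7, 0.3], [0.2, 0.8]]))
t = np.array([0, 1])
g = nll_loss_grad(logp, t)
```

Composed with the gradient of `log_softmax`, this sparse one-hot-style gradient yields the familiar `softmax - onehot` expression for cross-entropy.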
Ubospica force-pushed the mlc-dev/2023-01-26-conv2d_nllloss_gradient branch from f531042 to e74dc55 on February 15, 2023 12:46
Moved to #130