
This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →


[FEA] Reducing manual copy/paste between MONAI / MONAI Deploy / MONAI Label #163

Closed
aihsani opened this issue Oct 8, 2021 · 15 comments
Labels
enhancement New feature or request

@aihsani
Contributor

aihsani commented Oct 8, 2021

Is your feature request related to a problem? Please describe.

Currently, manual work (copy/pasting code) is required for a MONAI developer to create a MONAI Deploy App: there is no common API between MONAI and MONAI Deploy beyond importing MONAI utilities/libraries in a MONAI Deploy App. As a result, an experienced MONAI developer (primarily focused on developing new algorithms/training paradigms) must learn the concepts/API of MONAI Deploy, copy/paste code into the appropriate places in a MONAI Deploy Operator, and manually ensure that MONAI Deploy's data transfer concepts (e.g. InferContext) are used correctly.

Analogously, in the case of MONAI Label, an experienced MONAI developer must do much the same: familiarize themselves with the concepts of MONAI Label (e.g. InferTask, TrainTask, etc.) and copy/paste their code into the appropriate sections to ensure that the MONAI Label App functions as expected.

Describe the solution you'd like

In addition to "freestyling" training scripts with MONAI, researchers could optionally use concepts from MONAI Label and MONAI Stream such as tasks (InferTask, TrainTask, etc.) and contexts (InputContext, OutputContext, ExecutionContext) to represent the conceptual task a given branch of code is meant to accomplish, all while using data transfer types that are portable across MONAI. The aim is ultimately to reuse each of these execution branches in the application the user wants to create; this may still require some manual adjustment, but it would overall reduce the amount of manual copy/paste and flatten the learning curve across the different MONAI products.
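A minimal, purely illustrative sketch of what such shared task/context abstractions might look like. Every class and method name here is hypothetical, invented for this sketch; none of it is an existing MONAI, MONAI Label, or MONAI Deploy API:

```python
from abc import ABC, abstractmethod


class InputContext:
    """Hypothetical portable input container shared across MONAI products."""
    def __init__(self, data):
        self.data = data


class OutputContext:
    """Hypothetical portable output container."""
    def __init__(self):
        self.result = None


class InferTask(ABC):
    """A task abstraction that could, in principle, be reused by Core,
    Label, and Deploy without copy/pasting the branch logic."""

    @abstractmethod
    def run(self, in_ctx: InputContext, out_ctx: OutputContext) -> None:
        ...


class ThresholdInferTask(InferTask):
    """Toy task: binarize values, standing in for real model inference."""

    def run(self, in_ctx, out_ctx):
        out_ctx.result = [1 if v > 0.5 else 0 for v in in_ctx.data]


task = ThresholdInferTask()
out = OutputContext()
task.run(InputContext([0.2, 0.9, 0.7]), out)
print(out.result)  # [0, 1, 1]
```

The point of the sketch is only the shape of the contract: each product would host the same task objects and exchange the same context types, so moving a task from a training script to a Label or Deploy app would be a matter of wiring rather than copying.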

Describe alternatives you've considered
Learn each framework in isolation (MONAI, MONAI Label, MONAI Deploy) and copy/paste where appropriate.

@dbericat
Member

dbericat commented Oct 8, 2021

@rahul-imaging @MMelQin @gigony do you see your proposal of having a standalone transform library to be reused across Core and Deploy part of this issue or as a separate one?

@wyli for viz.

@Nic-Ma

Nic-Ma commented Oct 8, 2021

Hi @MMelQin @dbericat ,

May I know why you need to copy / paste the code into MONAI Deploy? Can't we just install MONAI v0.7 package and call the transforms or other components?

Thanks in advance.

@MMelQin
Collaborator

MMelQin commented Oct 8, 2021

Just to clarify:

  1. "Cut and paste" may be a misconception when stated of the Deploy App SDK. It is more that a Deploy/inference developer needs to understand the set of transforms (pre and post) required before and after sending the tensors to the model for inference. Training and Deploy take in different raw input, and oftentimes Deploy needs to add additional transforms, so it simply cannot directly reuse the Compose (chain of transforms) from the training application.
  2. The AI inference domain classes in the App SDK depend on MONAI Core, specifically the Transforms, Compose, and Inferer modules, but only as a runtime dependency, not a static dependency of the Deploy App SDK. This is an architectural decision: the App SDK package need not include MONAI Core, yet the AI inference classes use the same set of processing modules (transforms and inferer) used during training, for consistency.
  3. Higher-level constructs, especially workflow-related ones such as InferTask/TrainTask, may not need to be in MONAI Core, and more likely than not cannot be transferred from Train to Deploy (see the points above).
  4. Additionally, Train uses an engine that is not used in Deploy: Deploy does not use Ignite and does not support Ignite Handlers. So a pipeline used in Train can NOT be reused in Deploy.
  5. The consensus among the MONAI WG and MONAI Deploy WG is that there is a need for:

  - a self-describing model, i.e. a TorchScript file alone is not enough for Deploy; metadata is also needed to describe the input/output data/format and the transforms used in the context of that input/output data/format;
  - ideally, a separate, standalone, transforms-only library provided by MONAI Core.
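The contrast in point 1 can be sketched with tiny stand-in classes. `Compose` here is a toy re-implementation for illustration (the real one lives in `monai.transforms`), and the transform functions are hypothetical: the deploy chain prepends input-specific steps to the shared core of the training chain rather than reusing it verbatim:

```python
class Compose:
    """Toy stand-in for monai.transforms.Compose: apply transforms in order."""
    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, x):
        for t in self.transforms:
            x = t(x)
        return x


def normalize(x):
    """Shared between train and deploy: scale values to [0, 1]."""
    lo, hi = min(x), max(x)
    return [(v - lo) / ((hi - lo) or 1) for v in x]


def resample_spacing(x):
    """Deploy-only step (think SpacingD applied to raw DICOM input)."""
    return x[::2]


train_pre = Compose([normalize])
# Deploy cannot reuse train_pre as-is; it needs extra input handling first.
deploy_pre = Compose([resample_spacing] + train_pre.transforms)

print(deploy_pre([0, 2, 4, 6]))  # [0.0, 1.0]
```

The shared portion (`normalize`) stays identical in both chains, which is the consistency point 2 argues for, while the differing raw inputs force the deploy chain to differ at the edges.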

@MMelQin
Collaborator

MMelQin commented Oct 8, 2021

@Nic-Ma @aihsani
The Deploy App SDK already does what you expect: it depends on MONAI Core and uses the Transforms/Compose modules from the MONAI package at runtime.

Sorry, "cut and paste" is a statement from Alvin, referring to the statements calling the transforms in a deploy app. Maybe there is "copy/paste" in MONAI Label, but NOT in the Deploy App SDK; logically and programmatically it is NOT "cut and paste". I'd defer to Alvin to answer that.

The Deploy App SDK inference operator class has abstract pre-transforms and post-transforms methods, which a model-specific app then needs to implement. For deploying the example trained models in the MONAI repo, I did have to understand the transforms used (dictionary-based) and then use those transforms in the Deploy app for the same model: adding SpacingD, not using Handlers, tweaking SaveImageD, etc. All in all, only the transforms Compose needs to be customized.

So please see my earlier comment about a self-describing model, so that Deploy can generate code based on the metadata instead of referring to the Train app for the transforms used.

I do NOT agree with the statement that the MONAI Deploy App SDK does "cut and paste" in general.
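The operator pattern described above (abstract pre/post transform hooks that a model-specific app fills in) can be sketched in pure Python. The class and method names below are paraphrased for illustration, the real SDK class and its signatures differ, and the model call is stubbed out:

```python
from abc import ABC, abstractmethod


class InferenceOperatorSketch(ABC):
    """Sketch of the described pattern; not the actual Deploy App SDK class."""

    @abstractmethod
    def pre_process(self, data):
        """Model-specific pre-transforms (e.g. spacing, intensity scaling)."""

    @abstractmethod
    def post_process(self, data):
        """Model-specific post-transforms (e.g. inversion, saving)."""

    def compute(self, data):
        x = self.pre_process(data)
        y = self._predict(x)          # stand-in for TorchScript inference
        return self.post_process(y)

    def _predict(self, x):
        return x                      # identity stub instead of a real model


class ToySegOperator(InferenceOperatorSketch):
    """A model-specific app implements only the two transform hooks."""

    def pre_process(self, data):
        return [v / 255.0 for v in data]   # e.g. scale intensities to [0, 1]

    def post_process(self, data):
        return [round(v, 2) for v in data]


print(ToySegOperator().compute([51, 255]))  # [0.2, 1.0]
```

This is the sense in which "only the transforms Compose needs to be customized": the orchestration in `compute` stays fixed, and each model supplies its hooks.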

@Nic-Ma

Nic-Ma commented Oct 8, 2021

Hi @MMelQin ,

Thanks for your detailed explanation.
I think the entire requirement is something like a Clara MMAR with an added deploy.json to do inference, and it doesn't call any Ignite modules.

Thanks.

@Nic-Ma

Nic-Ma commented Oct 8, 2021

And BTW, we have already deprecated several Ignite handlers and suggest using transforms instead, like SegmentationSaver, TransformInverter, etc.
So I think in theory you can define your inference pipeline without Ignite now.

Thanks.

@MMelQin
Collaborator

MMelQin commented Oct 8, 2021

@Nic-Ma
You hit the nail on the head, and I had the same request so that I can again go from Train to Deploy in minutes. I was even considering providing a "MONAI infer config" to create a Deploy App that mimics the MMAR, though someone first has to come up with that config, e.g. the trainer.

Exactly, Nic, thanks for separating those transforms out of the handlers; with 0.6 I am happy to use the Saver and Inverter in the post-transforms.

Besides, I'd really hope everyone involved in the Train-to-Deploy discussion knows these nuances well, so we can align.

@Nic-Ma

Nic-Ma commented Oct 8, 2021

Hi @MMelQin ,

BTW, I don't think we need to provide a separate package for MONAI transforms, because MONAI actually depends only on numpy and PyTorch. ignite is an optional import in MONAI; if you don't use any handlers or engines, you don't even need to install ignite.

Thanks.
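The optional-import idea Nic describes (MONAI exposes a helper for this, `monai.utils.optional_import`) can be sketched with the standard library alone. The function below is a simplified stand-in, not MONAI's actual implementation:

```python
import importlib


def optional_import(module_name):
    """Return (module, True) if the module is importable, else (None, False).

    A simplified version of the pattern: heavy or specialty dependencies
    like ignite are only required when a caller actually uses them.
    """
    try:
        return importlib.import_module(module_name), True
    except ImportError:
        return None, False


# ignite may or may not be installed; either way this does not raise.
ignite, has_ignite = optional_import("ignite")

# A stdlib module is always importable.
json_mod, has_json = optional_import("json")
print(has_json)  # True
```

Code paths that need the optional dependency check the flag (or fail lazily at use time), so a pipeline built only from transforms never forces an ignite install.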

@MMelQin
Collaborator

MMelQin commented Oct 8, 2021

@Nic-Ma
Agree, and that's why I used the word "ideally".
The approach of optional imports for many specialty packages carried over from MONAI Core to the MONAI Deploy App SDK, so both packages stay light and statically depend on a very limited number of packages (Deploy does not even package torch).

@aihsani
Contributor Author

aihsani commented Oct 8, 2021

Hi @MMelQin @dbericat ,

May I know why you need to copy / paste the code into MONAI Deploy? Can't we just install MONAI v0.7 package and call the transforms or other components?

Thanks in advance.

@Nic-Ma The copy/paste part refers to a researcher developing a MONAI script to train their model and then having to learn yet another paradigm (MONAI Deploy); while they may import MONAI there, they still have to set up their transformations from scratch.

@aihsani
Contributor Author

aihsani commented Oct 8, 2021

I do NOT agree with the statement that the MONAI Deploy App SDK does "cut and paste" in general.

@MMelQin so is there a way to port MONAI script code and make it deployable with minimal manual editing (no copy/paste)?

@MMelQin
Collaborator

MMelQin commented Oct 8, 2021

@aihsani
Please try the Deploy App SDK examples, or maybe create a new one from a training example in MONAI Core. Oftentimes the deploy app developer is not the trainer; the deploy dev only needs the TorchScript model and the transforms spec.

The modality/body part pertinent to the model is also important, e.g. CT Abdomen, as is other more clinical information such as contrast vs. no contrast. One might not expect a trainer to know or care how the training NIfTI images were converted from DICOM, with or without contrast, yet this is critical information when the model is deployed to the clinical environment (e.g. for programmatic series selection in Deploy); so do not expect that from the Train script.

So, in general, the inference part of a Train script is only a starting point for a deploy app; it is not meant to be, and cannot be, reused as-is.

@ericspod
Member

I was going to write something short but I've rambled a bit so I hope I'm not totally off base with what everyone else is thinking.

Looking at the MedNIST example notebook I can see the scope for discussion about duplication or copy/pasting. The deploy part of the example does use MONAI transforms in the MedNISTClassifierOperator definition, which are similar to those in the training cells since they do the same initial regularization of the data. The body of the compute method is pretty much a straightforward implementation of inference with a PyTorch model, then stating which category the classifier chose for the given input. This code may be very similar to what is in one's training scripts (e.g. the part doing validation inference), but it requires some non-obvious adjustment to work in Deploy's different environment.

So the pathway from training code to deploy code isn't straightforward, but we could introduce definitions wrapping common concepts shared between Core and Deploy, if we can nail down the commonality in a way that isn't cumbersome and resolve how that dependency would work. Since Deploy shouldn't hard-depend on Core, this could take some work. For example, a common definition of doing inference with a network that is compatible with both training and deployment, but which doesn't introduce constraints on its use in either context. It'll be hard to avoid shoehorning the same concept into both places, though.

One idea mentioned earlier here that I think is important is a more semantically rich package for a network. As stated, a TorchScript bundle isn't enough to use a network correctly, just enough to load it without code or implementation changes causing errors. What we would benefit from is a mechanism to add more metadata to these bundles covering the format of input and output, in terms of tensor shapes but also the values they should have; what the output means, such as which classes are being predicted or which tissues the segmentation numbers correspond to; a plain-language description of what the network is for; timestamp or version information; inference-time hyperparameters; and other requirements or features. It's more than just a self-describing network: it includes a lot of important scientific information as well as runtime-relevant parameters. With this information we can, on the deploy side, define code, or generate code at runtime, which uses it to wrap the network in the necessary infrastructure to operate.

For the MedNIST example this would be used to generate the code to preprocess the inputs so they have a specific size and dimensions with scaled values, and to recognize the network as a classifier with a given set of labels and their meanings, which can be returned as the output. With this, the MedNISTClassifierOperator type wouldn't be necessary, as there would be an implicit mechanism that can operate on any correctly augmented TorchScript bundle to do everything it does. This generated code need not rely on MONAI Core either, so we can still avoid that dependency, or on a potentially separate MONAI Transforms package.
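A hypothetical illustration of what such bundle metadata might contain. Every key name here is invented for the sketch (no agreed schema existed at the time of this discussion); only the MedNIST class labels are taken from the example dataset:

```python
import json

# Hypothetical metadata that could accompany a TorchScript bundle.
model_metadata = {
    "description": "MedNIST image classifier",
    "version": "0.1.0",
    "timestamp": "2021-10-08",
    "inputs": [
        {"shape": [1, 64, 64], "dtype": "float32", "value_range": [0.0, 1.0]}
    ],
    "outputs": [
        {"labels": ["AbdomenCT", "BreastMRI", "CXR",
                    "ChestCT", "Hand", "HeadCT"]}
    ],
    # Names of the preprocessing steps the network was trained with,
    # so deploy-side code could generate the wiring at runtime.
    "pre_transforms": [
        {"name": "LoadImage"},
        {"name": "ScaleIntensity"},
        {"name": "EnsureChannelFirst"},
    ],
}

# A deploy tool would read this back and build the operator from it.
spec = json.loads(json.dumps(model_metadata))
print(spec["outputs"][0]["labels"][2])  # CXR
```

With something like this attached to the bundle, a generic operator could scale and reshape inputs from `inputs`, run the model, and map the argmax index through `outputs[0]["labels"]`, making the hand-written MedNISTClassifierOperator unnecessary.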

Other aspects needed for deployment, such as handling file IO (loading DICOMs, fixing orientation issues with NIfTI, etc.), are quite often common to Core, Deploy, and Label in very similar ways, so we may consider where a common solution would live and what it would look like. The example uses a PIL-based operator for demonstration, but in practice we'd want some unified way of representing image and other data in and out; within Core that is LoadImage to a large degree.

I'm not familiar enough with Label's code to be certain, but I feel that for Label there is a lot of scope to use Deploy in place of existing code. I imagine the duplication of concepts between the two comes down to their being developed at different times by different teams.

@SachidanandAlle

SachidanandAlle commented Oct 11, 2021

I am very much OK with using the SDK for MONAI Label and streamlining the concepts for writing infer/train definitions/tasks, but as a user I have one concern: I have to understand an extra concept here. Earlier I was using all the MONAI components and defining the workflows directly; now I have to define additional operators, decide which ones to define/use as operators vs. core MONAI components, and then define my infer/train definitions/workflows.

I am basically referring https://github.com/Project-MONAI/monai-deploy-app-sdk/blob/main/examples/apps/ai_spleen_seg_app/spleen_seg_operator.py

The current apps are for inference. Do we have any examples for training (single GPU, multi-GPU, etc.) as well, or are they coming soon? They would be good reference material for MONAI Label.

I do agree on the point of having those conversion utilities as part of Core so that we can get more contributors working on the corner cases related to DICOM => NIfTI conversion and vice versa. If I remember correctly, we discussed/proposed to the Core team as well that they own/introduce those DICOM-related utilities.

@MMelQin
Collaborator

MMelQin commented Oct 11, 2021

@SachidanandAlle, yes, MONAI Core has an abundance of tutorials for training as well as inference validation. The MONAI Deploy App SDK inference examples, Spleen, MedNIST, and UNETR, all refer to MONAI Core examples.

As I understand it, there is no MONAI Train per se. MONAI Core provides the essential building blocks, though it does not necessarily dictate a training/validation workflow or construct; a trainer or deploy app developer needs to compose an application from these building blocks. The MONAI Deploy App SDK naturally follows this paradigm as of now, though it may change depending on what ADDITIONAL commonalities can be abstracted and implemented in MONAI Core to benefit training, model validation, and deploy applications alike. The "manual copy/paste" stated in the title of this issue is an unfortunate misunderstanding of what and how the MONAI Deploy App SDK depends on in MONAI Core.

Bear in mind that other products built with MONAI Core, e.g. Clara Train V4, introduced additional concepts or constructs, e.g. the MMAR, which serves as the data contract between Train and Deploy; it is just enough that a single Clara Deploy app can be used for different models by loading the model-specific MMAR. As I understand it, MONAI Label performs both training and inference tasks, so a similar approach could also be used.

I look forward to the discussions and designs for further improvements.

@gigony gigony added this to Needs Triage in Backlog via automation Dec 8, 2021
@Project-MONAI Project-MONAI locked and limited conversation to collaborators Jan 19, 2022
@MMelQin MMelQin converted this issue into discussion #239 Jan 19, 2022
@dbericat dbericat moved this from Needs Triage to High priority in Backlog Feb 2, 2022
