DALI 2022 roadmap #3774

Closed
JanuszL opened this issue Mar 30, 2022 · 15 comments
@JanuszL
Contributor

JanuszL commented Mar 30, 2022

The following represents a high-level overview of our 2022 plan. You should be aware that this roadmap may change at any time and the order below does not reflect any type of priority.

We strongly encourage you to comment on our roadmap and provide us with feedback in this issue.

Some of the items mentioned below are a continuation of the 2021 effort (#2978).

Improving Usability:

Extending input format support:

Extending support of formats and containers, including variable frame rate videos: #3615, #3668, #4184, #4296, #4302, #4351, #4354, #4327, #4424
Image decoding operators with support for higher dynamic ranges: #4223

Performance:

New transformations:

We are constantly extending the set of operations supported by DALI. This section lists the most notable additions we plan to make this year in our areas of interest. The list is not exhaustive, and we plan to expand the set of operators as needs or requests arise.

@JanuszL JanuszL pinned this issue Mar 30, 2022
@JanuszL JanuszL assigned JanuszL and unassigned mzient Mar 30, 2022
@msaroufim

msaroufim commented Apr 26, 2022

Hi @JanuszL, I'm interested in a more seamless integration between DALI and torchvision for better end-to-end model training and inference times.

Relevant PRs

In particular, we don't necessarily need to integrate everything, including the data loader, but at the very least I think the accelerated image decoding and specialized preprocessing kernels would be of huge value, gated behind another vision backend: https://github.com/pytorch/vision#image-backend

The integration would probably look similar to the one made with accimage.

I'm guessing torchaudio and facebookresearch/mmf (multimodal) would also be similar.

@JanuszL
Contributor Author

JanuszL commented Apr 26, 2022

Hi @msaroufim,

Thank you for your feedback regarding our 2022 roadmap.
It would be nice to accelerate existing TorchVision pipelines; however, I'm not convinced that the way DALI works can be combined with them in the suggested fashion. DALI relies on a processing graph. We plan to extract some of its operators into callable entities, but that will not match the performance of the pipeline execution model and will lead to less efficient GPU memory utilization.
I would wait until this effort is at least partially completed and let the TorchVision community experiment with it and see what works best.
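
For reference, a minimal sketch of the graph-based model mentioned above, i.e. the kind of pipeline that a torchvision-style per-image call would bypass. The operators used here (fn.readers.file, fn.decoders.image, fn.resize) exist in DALI; the dataset path and sizes are placeholders.

```python
# Minimal sketch of DALI's pipeline model: operators are declared once inside a
# pipeline definition and executed as a whole graph, rather than being called
# one by one like torchvision transforms. The file_root path is hypothetical.
from nvidia.dali import pipeline_def, fn

@pipeline_def(batch_size=32, num_threads=4, device_id=0)
def decode_and_resize():
    jpegs, labels = fn.readers.file(file_root="/data/images")   # hypothetical dataset path
    images = fn.decoders.image(jpegs, device="mixed")           # hybrid CPU/GPU decode
    images = fn.resize(images, resize_x=224, resize_y=224)
    return images, labels

pipe = decode_and_resize()
pipe.build()
images, labels = pipe.run()
```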

@songyuc

songyuc commented Sep 26, 2022

Hi, @JanuszL,
Is there a timeline for a stable version of DALI with Python 3.10 support? I saw this warning:

Warning: DALI support for Python 3.10 is experimental and some functionalities may not work.
deprecation_warning("DALI support for Python 3.10 is experimental and some functionalities "

Your answer and guidance will be appreciated!

@JanuszL
Contributor Author

JanuszL commented Sep 26, 2022

Hi @songyuc,

This warning is there mostly because we don't have full test coverage for Python 3.10, although it should work fine.
I cannot commit to a particular timeline, but we hope to do it sooner rather than later.

@csheaff

csheaff commented Oct 16, 2022

Hello, @JanuszL ,

I'm currently looking at DALI for medical image processing. Might there be plans for DICOM support?

Edit: never mind, I see it was mentioned in #3275.

@JanuszL
Contributor Author

JanuszL commented Oct 17, 2022

Hi @csheaff,

Thank you for reaching out. We don't have a short-term plan to support DICOM. As I understand it, the usual workflow is to convert from DICOM to NumPy (which includes offline preprocessing, like normalization), and then the NumPy files are used for training. The conversion is done only once, while the NumPy files are reused multiple times. That is why this item is low on our priority list.
Still, can you describe your workflow so we can reprioritize it if needed?
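
For illustration, a minimal sketch of the offline conversion step described above, assuming pydicom is installed; the input/output directories and the normalization are placeholders, not a prescribed recipe.

```python
# One-time offline conversion: DICOM -> preprocessed NumPy files.
# Paths and the normalization step are hypothetical examples.
from pathlib import Path
import numpy as np
import pydicom

src = Path("/data/dicom")   # hypothetical DICOM directory
dst = Path("/data/numpy")   # hypothetical output directory
dst.mkdir(parents=True, exist_ok=True)

for path in src.glob("*.dcm"):
    ds = pydicom.dcmread(path)
    pixels = ds.pixel_array.astype(np.float32)
    # Example offline preprocessing: per-image normalization done once,
    # so the training pipeline only has to read the prepared arrays.
    pixels = (pixels - pixels.mean()) / (pixels.std() + 1e-8)
    np.save(dst / (path.stem + ".npy"), pixels)
```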

@csheaff

csheaff commented Oct 17, 2022

Thanks for the response @JanuszL. I've just discovered DALI and I'm sort of new to MLOps, so perhaps my ideas are off here. If you have better suggestions on workflow I'm happy to hear them.

My priority is low end-to-end latency in medical imaging applications. Triton is great for inference, but I'm looking for ways to handle data loading and pre-processing on the GPU as well, to be used in tandem with an inference model served by Triton.

It's true that there can be a heavy amount of pre-processing and metadata extraction with DICOMs. The metadata extraction for purposes other than the main processing pipeline will likely always be there. Perhaps this means it doesn't make sense for DALI to handle DICOMs directly.

As for the normalization, my understanding is that DALI would be able to handle such tasks. Perhaps I'm mistaken.

@JanuszL
Contributor Author

JanuszL commented Oct 17, 2022

Hi @csheaff,

As for the normalization, my understanding is that DALI would be able to handle such tasks.

DALI is mostly useful for online augmentation and, in general, the things you need to do in each iteration of your training/inference process. In the case of DICOM, the conversion to NumPy can be done only once as part of offline preparation; doing it every iteration wouldn't yield any value and would be wasteful from a resource point of view.
Nevertheless, I agree that it would be nice to handle that seamlessly in DALI.
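
For context, a minimal sketch of the online side of the workflow above: the prepared .npy files are read and augmented every iteration with DALI's existing fn.readers.numpy operator. The directory and the flip augmentation are placeholders.

```python
# Per-iteration work stays in DALI: read prepared NumPy files and augment online.
# The file_root path and the flip augmentation are just illustrative.
from nvidia.dali import pipeline_def, fn

@pipeline_def(batch_size=16, num_threads=4, device_id=0)
def numpy_training_pipeline():
    data = fn.readers.numpy(file_root="/data/numpy", random_shuffle=True)
    # Online augmentation happens here, once per iteration.
    return fn.flip(data.gpu(), horizontal=1)
```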

@blazespinnaker

blazespinnaker commented Dec 11, 2022

@JanuszL

It looks like DALI at least partially supports DICOM now by way of nvJPEG2000 and a bit of a hack, e.g. https://www.kaggle.com/competitions/rsna-breast-cancer-detection/discussion/371534 (notebook here: https://www.kaggle.com/code/tivfrvqhs5/decode-jpeg2000-dicom-with-dali?scriptVersionId=113466193)

Kaggle isn't releasing a decoded dataset for the code competition, so folks are having to decode on each run (training takes about 10 minutes, inference about 7 hours!). The DALI speedup is likely to be a huge win, but as noted it only covers the dcmfile.file_meta.TransferSyntaxUID .90 standard, and .70 makes up about half of the remaining images. Here's the breakdown:

1.2.840.10008.1.2.4.70 29519
1.2.840.10008.1.2.4.90 25187

Any thoughts on how we might be able to get .70 in there as well? Are there fundamental limitations as to why it can't be supported?

@JanuszL
Contributor Author

JanuszL commented Dec 12, 2022

Hi @blazespinnaker,

I'm glad to see that the community made that work. I think it should be possible to use the external source operator to extract the DICOM data and pass it directly to the decoder instead of writing it to disk.
Can you tell me more about the .70 standard? How is it encoded? (It may just be that DALI doesn't support such a format yet.)
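
For what it's worth, a minimal sketch of the external-source idea, assuming pydicom and the JPEG 2000 (.90) transfer syntax; the file paths, the single-frame assumption, and the grayscale output are placeholders rather than a tested recipe.

```python
# Extract the encoded JPEG 2000 bitstream from DICOM with pydicom and let DALI
# decode it on the GPU, without writing intermediate files to disk.
import numpy as np
import pydicom
from pydicom.encaps import generate_pixel_data_frame
from nvidia.dali import pipeline_def, fn, types

dicom_files = ["/data/case_0001.dcm", "/data/case_0002.dcm"]  # hypothetical paths

def dicom_bitstreams():
    # Yield one batch of raw encoded frames (first frame of each file).
    batch = []
    for path in dicom_files:
        ds = pydicom.dcmread(path)
        frame = next(generate_pixel_data_frame(ds.PixelData))
        batch.append(np.frombuffer(frame, dtype=np.uint8))
    yield batch

@pipeline_def(batch_size=len(dicom_files), num_threads=2, device_id=0)
def dicom_decode_pipeline():
    encoded = fn.external_source(source=dicom_bitstreams, dtype=types.UINT8)
    # "mixed" device decoding uses nvJPEG2000 for JPEG 2000 bitstreams.
    return fn.decoders.image(encoded, device="mixed", output_type=types.GRAY)
```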

@blazespinnaker

blazespinnaker commented Dec 12, 2022

Looks like .70 is a rarely used JPEG Lossless standard: https://crnl.readthedocs.io/jpeg_formats/index.html

If I had to guess, I'd say the question is whether nvJPEG can support it.

Maybe if we can get the pydicom folks to help support a pipeline to nvJPEG / nvJPEG2000, this can be done.

@JanuszL
Contributor Author

JanuszL commented Dec 13, 2022

I just checked with the nvJPEG team and this format is not supported yet. In this case, DALI should fall back to the CPU libjpeg-turbo decoder. I'm sorry, but I don't think we can do much now.

@blazespinnaker

blazespinnaker commented Dec 16, 2022

Hmm, I think libjpeg might be a better fallback for 1.2.840.10008.1.2.4.70? libjpeg-turbo does not yet support lossless JPEG, I believe.

libjpeg-turbo/libjpeg-turbo#638

@JanuszL
Contributor Author

JanuszL commented Dec 16, 2022

Hi @blazespinnaker,

Currently, we fully rely on libjpeg-turbo for JPEG decoding; if it fails, DALI cannot decode the image.
What we could do is try to fall back to OpenCV (although I believe it may use libjpeg-turbo as well) here: https://github.com/NVIDIA/DALI/blob/main/dali/image/jpeg.cc#L80. We would be more than happy to accept a PR adding such functionality.
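
Until something like that lands, one possible CPU-side workaround (a sketch only, not the OpenCV fallback suggested above): decode the unsupported lossless-JPEG files with pydicom, using whichever pixel-data handler is installed, and hand the already-decoded arrays to the pipeline through fn.external_source. Paths are hypothetical.

```python
# Decode on the CPU with pydicom, then let DALI handle the rest on the GPU.
import pydicom
from nvidia.dali import pipeline_def, fn

lossless_files = ["/data/lossless_0001.dcm"]  # hypothetical paths

def decoded_frames():
    # pydicom dispatches decoding to an installed pixel-data handler; DALI then
    # only copies the resulting NumPy arrays to the GPU instead of decoding.
    yield [pydicom.dcmread(p).pixel_array for p in lossless_files]

@pipeline_def(batch_size=len(lossless_files), num_threads=2, device_id=0)
def cpu_decoded_pipeline():
    frames = fn.external_source(source=decoded_frames)
    return frames.gpu()
```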

@JanuszL JanuszL mentioned this issue Jan 17, 2023
@JanuszL
Contributor Author

JanuszL commented Jan 17, 2023

Please check #4578 for the 2023 roadmap.

@JanuszL JanuszL closed this as completed Jan 17, 2023
@JanuszL JanuszL unpinned this issue Jan 17, 2023