Question: How to Maximize throughput and fps #51

Closed
jxmelody opened this issue Jul 18, 2018 · 2 comments
Labels: question (Further information is requested)

Comments

jxmelody commented Jul 18, 2018

Hi,
I am using DALI to improve the performance of my deep learning application, but it seems I can't take full advantage of my CPU cores when using DALI, which I can do when I load files with TensorFlow's slice_input_producer (multi-threaded).

So my questions:

  1. Is it possible to maximize throughput and fps by making the best use of both the CPU and the GPUs (e.g. 12 cores and 2x GTX 1080)? If yes, how do I do that?
  2. What does num_threads mean when defining my own pipeline? Is it related to GPU or CPU threads?
  3. When I use DALI, I have to set per_process_gpu_memory_fraction to limit TF memory (see #21; a sketch of that setting is after this list), and the batch size can't be large (I've tried 32, but 64 does not work). It seems that DALI needs a lot of GPU memory. Will this memory issue affect the performance of deep learning applications?
  4. Could you please provide a more general performance report based on a more common development environment (such as a GTX 1080 rather than a DGX-2)?
  5. One question about nvJPEG: nvJPEG can only open/decode JPEGs, but DALI can do a lot of image augmentation (such as resizing). Why don't you add these features to nvJPEG for more general use, rather than keeping them in a data-loading library for deep learning applications?
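For reference, here is roughly how I set the TF memory cap mentioned in item 3 (the fraction is just an example value for my setup):

```python
import tensorflow as tf

# Cap TensorFlow's GPU memory so the remainder stays available for DALI.
# The fraction is only an example; it has to be tuned per setup.
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.5

with tf.Session(config=config) as sess:
    # ... build and run the training graph here ...
    pass
```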

Thanks in advance!

JanuszL added the question label Jul 18, 2018
JanuszL (Contributor) commented Jul 18, 2018

Hi,

  1. The current idea behind DALI is to make it easy to offload data loading and augmentation to the GPU. It was designed for scenarios where the CPU is the bottleneck. In your case the CPU shouldn't be, so what you can do is construct the pipeline so that some operations are assigned to the CPU and the rest to the GPU, so the CPU is also utilized (see the sketch after this list).
  2. It is the number of CPU threads used to execute CPU operators. When you create a pipeline you may assign it to a given GPU by providing device_id; by providing num_threads you set the size of the CPU thread pool. One thing we need to document better: nvJPEG runs partially on the CPU and partially on the GPU, and for the CPU part it also creates a thread pool whose size is defined by the num_threads argument. Setting num_threads too low can hurt performance, so please check how different values work for you.
  3. It is true, additional memory is required so that DALI can process the data on the GPU. We are working on reducing the memory pressure, as @ptrendx stated in Can't process big size image #21. In your case it forces you to use small batch sizes, and this could affect overall performance.
  4. If you are asking for speed results for configurations where CPU processing power is not a bottleneck (like 1x GTX 1080), throughput should be almost the same as without DALI (it may even be a bit slower due to DALI overhead). In such a case, the main benefit of DALI is flexibility and ease of pipeline construction. That is why we don't provide general performance reports. Nevertheless, it is a good point and we may prepare a more thorough performance report.
  5. nvJPEG is designed to provide JPEG loading and (mostly) decoding; it is not planned to become an image processing library. DALI can be used for that, and it is not necessarily limited to deep learning applications. If you really need to build your own custom processing pipeline, how about mixing nvJPEG and NPP for the processing?
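To illustrate points 1 and 2, here is a rough sketch of such a hybrid pipeline. The path, class name and operator arguments are only illustrative (argument names may differ between DALI versions), so please check the documentation for the version you run:

```python
from nvidia.dali.pipeline import Pipeline
import nvidia.dali.ops as ops
import nvidia.dali.types as types

class HybridPipeline(Pipeline):
    def __init__(self, data_dir, batch_size, num_threads, device_id):
        # num_threads sizes the CPU thread pool, device_id selects the GPU
        super(HybridPipeline, self).__init__(batch_size, num_threads, device_id)
        self.input = ops.FileReader(file_root=data_dir, random_shuffle=True)
        # "mixed" decoding runs partially on the CPU and partially on the GPU (nvJPEG)
        self.decode = ops.nvJPEGDecoder(device="mixed", output_type=types.RGB)
        # explicitly placed on the GPU; use device="cpu" to keep an operator on the CPU
        self.resize = ops.Resize(device="gpu", resize_shorter=256)

    def define_graph(self):
        jpegs, labels = self.input()   # CPU
        images = self.decode(jpegs)    # mixed CPU/GPU
        images = self.resize(images)   # GPU
        return images, labels

# one pipeline instance per GPU, e.g. device_id=0 and device_id=1 for two GTX 1080s
pipe = HybridPipeline("/path/to/jpegs", batch_size=32, num_threads=4, device_id=0)
pipe.build()
images, labels = pipe.run()
```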

jxmelody (Author) commented

@JanuszL Thank you!

JanuszL closed this as completed Jul 19, 2018