DALI v0.9.1

Pre-release
JanuszL released this 02 May 16:15

Bug fixes

  • Make LMDB close properly with lazy init (#790)
  • Handle exception when NVDEC is not available for the video format (#752)
  • Minor fixes: Warning fix + enable one missing test (#764)
  • Fix build for the old version of nvJPEG (#760)
  • Fix pipeline completion callback (#745)
  • Make progressive JPEGs always be decoded by the host Huffman decoder (#739)
  • Use cv::COLOR_BGR2RGB instead of CV_BGR2RGB (#743)
  • Fix calculation of the average speed RN50 for TensorFlow test (#719)
  • Fix CacheLoad call in old nvjpeg (#728)
  • Fix TensorFlow validation pipeline (#722)
  • Fix bilinear resampling 1st row/column. (#697)
  • L1 fix (#687)
  • Handling sync pipeline with prefetch_queue_depth of 1 in Python (#688)
  • Fix shuffle_after_epoch option (#812)
  • Provide optional stream to copy_to_external API. Fix sync issue (#807)
  • Fix initialization of CUDA context on the default device during pipeline creation (#829)

Improvements

  • Add new function case for lazy init (#777)
  • Add L3 SSD test (#782)
  • Separate L0 & L2 FW iterators tests. Clear previous data in iterators loop (#779)
  • Make EpochSize prepare metadata when Reader has lazy init (#768)
  • L1 OF example (#757)
  • Make ssd random crops filter boxes the same way (#771)
  • Fix skip_cached_images feature (#769)
  • Change SSD L1 test options (#766)
  • Update SSD example to use distributed JoC model (#759)
  • Change CHECK_STRUCT_HAS_MEMBER to use CXX (#762)
  • Add support for Netpbm .pnm (.ppm/.pgm/.pbm) images using OpenCV (#599)
  • Evaluation at every epoch in TF RN50 (#717)
  • Change cast in resampling setup to silence a warning (#749)
  • Refactor nvdecoder: remove useless thread (#733)
  • Add more checks to AspectRatio test (#635)
  • Enable test for TensorFlow and CUDA 10 (#721)
  • Pinned allocator for nvJPEG CPU stage (#664)
  • Add lazy loading (#746)
  • Image cache batch copy (#742)
  • Add new ws policy for separated executor (#671)
  • Add test cases for nvJPEGDecoder fused crop variants (#716)
  • Resampling in mini-batches (#744)
  • Adding default cuda stream priority option (#734)
  • Adding test for DALI FW iterators (#706)
  • Mark stage buffers as consumed with stream callback (#712)
  • Move Optical Flow from aux to pipeline/operators/optical_flow (#720)
  • Disable hybrid huffman threshold by default, as it seems to lower performance (#736)
  • Disable Optical Flow temporal hints by default (#723)
  • Remove misleading info about OpenCV 2 support from readme. (#686)
  • Enforcing workers termination by waking up workers in Executor dtor (#699)
  • Temporarily remove broken OpticalFlow example (#731)
  • Add APEX building to SSD L1 test (#727)
  • Update as_array returned shape and update detection pipeline test (#724)
  • Skip image loading if the image is in cache (#669)
  • Handle empty tensors in the backend and frontend (#713)
  • Remove unused dependencies from tests (#715)
  • Update test_pipeline.py (#704)
  • Update support PyTorch version in README (#714)
  • Fix python L1 test for nvJPEG (#711)
  • Special handling for progressive JPEG in nvJPEG decoder (#695)
  • Use seed sequence for RandomCropAttr. Ensure consistency between different implementations of random crop attr (#692)
  • Add different decoder options to test_RN50_data_pipeline.py (#689)
  • Add unit tests for COCOReader (#709)
  • Enabling hint for Optical flow calculation (#702)
  • Rework FW Plugins to prefetch only as many batches as needed (#703)
  • Change Og to O0 to enable debug symbols in stacktrace (#701)
  • Refactor detection pipeline test (#693)
  • Make COCOReader options mutually exclusive (#698)
  • Update nvJPEG thresholds and add filename info in GPU stage (#672)
  • Optical flow support for BGR and GRAY types (#684)
  • Enable cubic filtering test. (#690)
  • Make BbFlip on CPU act as on GPU (#661)
  • Obtaining output tensor size from OpticalFlowAdapter (#680)
  • Update README (#648)
  • Add note about hue argument unit in color augmentation. (#683)
  • OF integration & example (#659)
  • Fix Presize test - add Buffer::padding() (#670)
  • Fix RN50 example for PyTorch (#667)
  • Update RN50 examples to use nvJPEG random crop decoding (#663)
  • Sort operators in docs, add padding to allocations (#660)
  • Turing OF adapter (#644)
  • Add range iterator constructor for dyn TensorShape (#629)
  • Add test for buffer presize (#647)
  • GTest submodule update (#646)

Breaking API changes

  • The internal Python pipeline API has changed. Any code that used a _* function needs to be updated to reflect the new semantics.

Known issues:

  • The new Video reader operator requires NVIDIA VIDEO CODEC SDK support in the platform. NVIDIA GPU Cloud (NGC) optimized containers prior to 19.01 lack this functionality in the default configuration. To enable it, run the container with the ‘video’ capability enabled, i.e.:
    -e "NVIDIA_DRIVER_CAPABILITIES=compute,utility,video"
  • The video loader operator requires that key frames occur at least every 10 to 15 frames of the video stream. If key frames occur less frequently, the returned frames may be out of sync.
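
As a quick sanity check for the first known issue above, the following minimal Python sketch inspects the NVIDIA_DRIVER_CAPABILITIES environment variable from inside a running container. The helper name has_video_capability is hypothetical and is not part of DALI. The sketch also assumes that the value "all" enables every capability, as it does in the NVIDIA container runtime. It only checks the variable, not whether the decoder itself is usable.

```python
import os


def has_video_capability(env=None):
    """Return True if NVIDIA_DRIVER_CAPABILITIES lists 'video' (or 'all').

    Hypothetical helper for illustration only; it is not a DALI API.
    `env` is any mapping of environment variables and defaults to os.environ.
    """
    env = os.environ if env is None else env
    raw = env.get("NVIDIA_DRIVER_CAPABILITIES", "")
    # The variable holds a comma-separated list such as "compute,utility,video".
    caps = {item.strip().lower() for item in raw.split(",") if item.strip()}
    # Assumption: "all" turns on every driver capability, including video.
    return "video" in caps or "all" in caps


if __name__ == "__main__":
    if has_video_capability():
        print("video capability is enabled")
    else:
        print("video capability is missing; restart the container with "
              '-e "NVIDIA_DRIVER_CAPABILITIES=compute,utility,video"')
```

Running the script inside the container before building a pipeline with the video reader gives an early hint that the container was started without the flag shown above.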

Binary builds

Install via pip for CUDA 9:
pip install --extra-index-url http://developer.download.nvidia.com/compute/redist/cuda/9.0 nvidia-dali==0.9.1
or for CUDA 10:
pip install --extra-index-url http://developer.download.nvidia.com/compute/redist/cuda/10.0 nvidia-dali==0.9.1

Or use direct download links (CUDA 9.0):

Or use direct download links (CUDA 10.0):

FFmpeg source code:

  • This software uses code of FFmpeg licensed under the LGPLv2.1 and its source can be downloaded here