@dmed256 dmed256 released this Sep 8, 2018 · 12 commits to master since this release

⚠️ Breaking Changes

  • [cd68708] Updated wrapMemory to take in an occa::device and occa::properties

    Before

    occa::cpu::wrapMemory(void* ptr, const udim_t bytes)
    

    After

    occa::cpu::wrapMemory(occa::device device, void* ptr, const udim_t bytes, occa::properties props)
    
  • [959ec4a] Renamed occaSetDeviceFromInfos to fit the rest of the methods

    Before

    occaSetDeviceFromInfos(const char *info)
    

    After

    occaSetDeviceFromString(const char *info)
    
  • [7735c66] Removed some redundant stream methods

    Before

    occa::device::freeStream(occa::stream) // C++
    occaDeviceFreeStream(occaStream)       // C

    After (Not new)

    occa::stream::free() // C++
    occaFree(occaStream) // C
  • [f81054d] Removed occa::opencl::event() and moved it to occa::opencl::streamTag::clEvent

  • [f81054d] Removed occa::cuda::event() and moved it to occa::cuda::streamTag::cuEvent

  • [f81054d] Removed occa::streamTag::tagTime. Tags can only be used for:

    • Waiting for queued tasks to finish (e.g. launched kernels or memory copies)
    • Measuring the time between two tags

⭐️ Features

  • [daf0300] Faster make build and added make info @v-dobrev
  • [1024a62] Switched garbage collection strategy to NULL out existing device/kernel/memory objects when one is freed. This switches SEGFAULT issues to occa::exception errors that can be more easily debugged.
  • [527494c] Linalg methods reuse device buffers for reductions
  • [ce46013] Loading cached kernels is faster because locks are avoided when possible
  • [e27b29e] Added occaJson
  • [fdd2d7c] Added occaCreateDeviceFromString
  • [fdd2d7c] Added CLI to C exampleOpenCL mode
  • [959ec4a] Added UVA methods to C API
  • [7735c66] The occa::stream class can now be extended
  • [f81054d] The occa::streamTag class can now be extended
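
The garbage collection change in [1024a62] can be pictured with a small standalone sketch. It assumes a handle type whose copies all share one slot, so freeing through any copy nulls the slot for every other copy and a later access throws instead of dereferencing a dangling pointer. The names Buffer and BufferHandle are hypothetical, and this is not OCCA's implementation.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a device allocation (not an OCCA type)
struct Buffer {
  std::vector<float> data;
  explicit Buffer(std::size_t n) : data(n, 0.0f) {}
};

// Hypothetical handle: all copies share one slot that owns the buffer
class BufferHandle {
 public:
  explicit BufferHandle(std::size_t n)
      : slot_(std::make_shared<std::unique_ptr<Buffer>>(new Buffer(n))) {}

  // Frees the buffer and nulls the shared slot, so every copy sees the free
  void free() { slot_->reset(); }

  bool isInitialized() const { return static_cast<bool>(*slot_); }

  // Throws instead of touching freed memory
  float& at(std::size_t i) {
    if (!*slot_) {
      throw std::runtime_error("buffer was already freed");
    }
    return (*slot_)->data.at(i);
  }

 private:
  std::shared_ptr<std::unique_ptr<Buffer>> slot_;
};
```

With two copies a and b of the same handle, calling a.free() makes b.isInitialized() return false and b.at(0) throw, which is the kind of failure that can be caught and reported rather than crashing.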

🐛 Bugs Fixed

  • [99ce6fb] Linalg properly deletes array allocations @jdahm
  • [b7384bc] Kernel hashes are generated only from needed props (e.g. verbose is ignored)
  • [780a06a] OpenCL __global, __local, and __kernel are properly inserted at the beginning
  • [dba0db9] memory::slice was improperly freeing UVA pointers in
  • [3260a05] The verbose property was being overwritten in CUDA mode

@dmed256 dmed256 released this Aug 9, 2018 · 61 commits to master since this release

⚠️ Breaking Changes

  • [4199d8f] Removed occa::cuda::getMappedPtr and occa::opencl::getMappedPtr and replaced them with

    occa::memory::ptr("mapped: true")
    
  • [4199d8f] Allocating mapped/pinned memory (CUDA, OpenCL)

    Passing the flag inside a mode-specific object was verbose and inflexible:

    cuda: { mapped: true }
    opencl: { mapped: true }
    

    It's now the same for both CUDA and OpenCL

    mapped: true
    
  • [4199d8f] Allocating unified memory (CUDA)

    The driver API uses the method cuMemAllocManaged so the prop was named accordingly

    cuda: { managed: true }
    

    However, most users know this feature as unified memory so we're switching the prop name to unified.
    Similar to mapped allocation, it has been shortened to

    unified: true
    

C

  • [1f513fc] occaMemoryPtr(occaMemory) → occaMemoryPtr(occaMemory, occaProperties)

⭐️ Features

  • [4199d8f] Added occa::memory::ptr(occa::properties)

  • [abc3bea] Added #pragma occa attributes option

    #pragma occa attributes @kernel
    void addVectors(const int entries,
                    const float *a,
                    const float *b,
                    float *ab) {
    #pragma occa attributes @tile(16, @outer, @inner)
      for (int i = 0; i < entries; ++i) {
        ab[i] = a[i] + b[i];
      }
    }

    This is equivalent to the inline attribute form:

    @kernel void addVectors(const int entries,
                            const float *a,
                            const float *b,
                            float *ab) {
      for (int i = 0; i < entries; ++i; @tile(16, @outer, @inner)) {
        ab[i] = a[i] + b[i];
      }
    }

🐛 Bugs Fixed

  • [b41ed34] UVA range checks had incorrect inclusive end
  • [1f513fc] Preprocessor treats undefined identifiers as 0 (Thanks @pdhahn!)
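
The UVA range fix in [b41ed34] concerns whether the end of a range counts as inside it. The notes do not show the actual change, so the following is only a sketch of the usual half-open convention, in which the address one past the last byte is outside the range. The helper name inRange is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an OCCA function): true when ptr lies in the
// half-open range [start, start + bytes), so the end address is excluded
inline bool inRange(const void* ptr, const void* start, std::size_t bytes) {
  const std::uintptr_t p = reinterpret_cast<std::uintptr_t>(ptr);
  const std::uintptr_t s = reinterpret_cast<std::uintptr_t>(start);
  return (s <= p) && (p - s < bytes);
}
```

For an 8-byte buffer buf, inRange(buf + 7, buf, 8) holds while inRange(buf + 8, buf, 8) does not; an inclusive end would wrongly accept the latter, which may be the first byte of a neighboring allocation.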

@dmed256 dmed256 released this Jul 31, 2018 · 81 commits to master since this release

⚠️ Breaking Changes

C++

  • [e529137] Removed occa::getKernelProperties()
  • [d88218a] For UVA pointers: occa::free → occa::freeUvaPtr

C

  • [6e895de] occaDeviceUmalloc → occaDeviceUMalloc
  • [6e895de] occaWaitFor → occaWaitForTag
  • [6e895de] occaDeviceWaitFor → occaDeviceWaitForTag
  • [6e895de] occaTimeBetween → occaTimeBetweenTags
  • [6e895de] occaDeviceTimeBetween → occaDeviceTimeBetweenTags

⭐️ Features

Coverage (57.9% → 70.2%!)

Part of code       % Coverage        Change      LOC Coverage     Change
Headers            74.7% → 91.5%     (+16.8%)    714 → 1374       (+660)
C API              24.1% → 99.4%     (+75.3%)    139 → 655        (+516)
C++ API            58.0% → 67.0%     (+9.0%)     8347 → 10374     (+2027)
IO Tooling         62.2% → 97.3%     (+35.1%)    225 → 326        (+101)
General Tooling    51.0% → 63.7%     (+12.7%)    1249 → 1524      (+275)
OKL Parser         61.4% → 63.7%     (+2.3%)     6479 → 7063      (+584)

C++

C

  • [9c6e5ea] Added occaPropertiesHas
  • [6b06820] Added occaFreeUvaPtr
  • [90cf5d5] Added occaUndefined and occaIsUndefined
  • [87e7000] Added occaIsDefault

Misc

  • [e529137][#154] CLI options that take arguments can be passed as: -Dfoo=1 or -D foo=3

  • [3707f9e] Examples have arg parsing to make them more interactive

  • [dae5308] occa::sys::rmrf refuses to delete any path that has fewer than 2 parent directories (e.g. / or /usr/bin) unless the safety check is disabled:

     occa::settings()["options/safe-rmrf"] = false;
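
The two argument forms accepted for CLI options in [e529137] (-Dfoo=1 and -D foo=3) can be handled by one loop. The following is a minimal sketch of that idea, not the occa command-line parser; the function name collectOptionValues is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of OCCA): collects the values of a short
// option such as "-D", accepting both the attached form (-Dfoo=1) and the
// separated form (-D foo=3)
inline std::vector<std::string> collectOptionValues(
    const std::vector<std::string>& args, const std::string& flag) {
  std::vector<std::string> values;
  for (std::size_t i = 0; i < args.size(); ++i) {
    const std::string& arg = args[i];
    if (arg == flag) {
      // Separated form: the value is the next argument, if there is one
      if (i + 1 < args.size()) {
        values.push_back(args[++i]);
      }
    } else if (arg.size() > flag.size() &&
               arg.compare(0, flag.size(), flag) == 0) {
      // Attached form: the value is whatever follows the flag
      values.push_back(arg.substr(flag.size()));
    }
  }
  return values;
}
```

Given the arguments -Dfoo=1 -D foo=3 kernel.okl and the flag -D, the helper returns foo=1 and foo=3 and leaves kernel.okl alone. A real parser would also need to distinguish flags that share a prefix, which this sketch does not attempt.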

🐛 Bugs Fixed

  • [855e967] Exclusive array was set at the end, not beginning
  • [5df9319] OpenMP was using Serial parser
  • [63c1258] String merging now works between newlines
  • [ff4310d] Fixed bug using occa::memcpy with 2 non-occa pointers
  • [dae5308] Failed kernel compilations now clear the cache directory
  • [dae5308] Non-conforming OKL kernels now properly fail

@dmed256 dmed256 released this Jul 23, 2018 · 157 commits to master since this release

⚠️ Breaking Changes

Memory

  • 405fb35 Renamed occa::opencl::getCLMappedPtr → occa::opencl::getMappedPtr

⭐️ Features

Kernel

  • 1cc0da8 Kernels can be run with 0 arguments

Memory

  • 405fb35 Added getMappedPtr for OpenCL and CUDA

OKL

🐛 Bugs Fixed

🎉 Contributors

@noelchalmers
@jdahm

@dmed256 dmed256 released this Jul 7, 2018 · 173 commits to master since this release

⚠️ Breaking Changes

Mode Properties

[d1fd6e0, 8691434] In order to standardize key names in properties, we're moving to snake_case which is a valid JSON5 identifier for JSON Objects. That way short-hand notations such as

{ mode: 'CUDA', device_id: 0 }

are still valid

Changes:

  • deviceID → device_id
  • platformID → platform_id
  • threadCount → threads
  • pinnedCores → pinned_cores
  • compilerFlags → compiler_flags
  • compilerEnvScripts → compiler_env_scripts
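
Code that builds property sets programmatically may want to translate old keys in one place. The following is a minimal sketch that assumes properties are held as a flat string-to-string map. The rename table is taken from the list above, while renamedKeys and migrateKeys are hypothetical helpers that are not part of OCCA.

```cpp
#include <map>
#include <string>

// Old camelCase key -> new snake_case key, as listed in these release notes
inline const std::map<std::string, std::string>& renamedKeys() {
  static const std::map<std::string, std::string> keys = {
    {"deviceID", "device_id"},
    {"platformID", "platform_id"},
    {"threadCount", "threads"},
    {"pinnedCores", "pinned_cores"},
    {"compilerFlags", "compiler_flags"},
    {"compilerEnvScripts", "compiler_env_scripts"},
  };
  return keys;
}

// Hypothetical helper (not part of OCCA): returns a copy of props with every
// renamed key replaced by its new name; other keys are copied unchanged
inline std::map<std::string, std::string> migrateKeys(
    const std::map<std::string, std::string>& props) {
  std::map<std::string, std::string> out;
  for (const auto& entry : props) {
    const auto it = renamedKeys().find(entry.first);
    out[it == renamedKeys().end() ? entry.first : it->second] = entry.second;
  }
  return out;
}
```

For example, a map holding mode, deviceID and threadCount comes back with the keys mode, device_id and threads, and the values untouched.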

⭐️ Features

CLI

[7055d92] Added -I/--include-path and -D/--define to occa transform and occa compile
[a7c578c] Added -v/--verbose to add transform information in comments

🐛 Bugs Fixed

Parser

[80b9972] oklForStatements check the iterator's base type

🎉 Contributors

@pdhahn

@dmed256 dmed256 released this Jul 4, 2018 · 181 commits to master since this release

⚠️ Breaking Changes

JSON

[70c9ddf] Swapped dump and toString

⭐️ Features

OKL

[6e760d2] Added vector types with {2,3,4} suffixes (such as double2, double3, double4)
[f5cf04b] restrict -> @restrict

Sys

[#145, 3fc6753] Added dlerror messages to dlopen and dlsym

IO

[cb5eec7] Fixed ~/ expansion

CLI

[d3bec39, 9d26399] Added compile and translate options to occa

🐛 Bugs Fixed

Parser

[#133, fd248c6] Added vartype nodes for parenCast expressions
[#136, #140, e672972] Added type expansion to get around issue
[#147, f8a4ac8] Fixed statement attributes getting overridden
[db2f263] withLauncher success also depends on the host

🎉 Contributors

@jedbrown

@dmed256 dmed256 released this Jun 16, 2018 · 207 commits to master since this release

Bug Fixes

  • aa757b9 Dims on GPU modes weren't being set properly

Testing

  • 895bb70 Travis CI error logs cap at 4MB
Pre-release

@dmed256 dmed256 released this Mar 31, 2018 · 428 commits to master since this release

Check out the v0.2 -> v1.0 Porting Guide

Bug Fixes

  • C++
    • #99 kernel::free() removes itself from the device kernel cache
Pre-release

@dmed256 dmed256 released this Mar 30, 2018 · 430 commits to master since this release

Check out the v0.2 -> v1.0 Porting Guide

Change Log

  • C++
    • 7b5dad1 Added mode-specific properties. For example, with the following properties, kernel compilation is verbose only when running in OpenCL mode:
    {
      kernel: { verbose: false },
      mode: {
        OpenCL: {
          kernel: { verbose: true },
        }
      }
    }
    • 57746d5 Added unicode parsing to occa::json (still keeps it as \uXXXX for the user to parse)
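
The mode-specific properties from 7b5dad1 amount to an override rule: entries under the active mode replace the matching top-level entries, and entries for other modes are ignored. The following sketch shows that rule on a flattened map with path-like keys. The flat representation, the FlatProps alias and resolveForMode are all assumptions made for illustration, since occa::json itself stores nested objects.

```cpp
#include <map>
#include <string>

// Assumed representation: nested props flattened to "path/like" keys
typedef std::map<std::string, std::string> FlatProps;

// Hypothetical helper (not part of OCCA): entries under "mode/<mode>/"
// override the matching top-level entries; other modes are dropped
inline FlatProps resolveForMode(const FlatProps& props,
                                const std::string& mode) {
  const std::string anyMode = "mode/";
  const std::string active = anyMode + mode + "/";
  FlatProps resolved;
  // First pass: copy everything that is not mode-specific
  for (const auto& entry : props) {
    if (entry.first.compare(0, anyMode.size(), anyMode) != 0) {
      resolved[entry.first] = entry.second;
    }
  }
  // Second pass: apply the overrides that belong to the active mode
  for (const auto& entry : props) {
    if (entry.first.compare(0, active.size(), active) == 0) {
      resolved[entry.first.substr(active.size())] = entry.second;
    }
  }
  return resolved;
}
```

With the two entries kernel/verbose = false and mode/OpenCL/kernel/verbose = true, resolving for OpenCL yields kernel/verbose = true, while resolving for any other mode keeps it false.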

Bug Fixes

  • C++
    • #98 Setting OCCA_VERBOSE works
  • C