v5.0.0
Changes:
- `cocl_add_executable` and `cocl_add_library` cmake macros created
- many atomics work now
- compiles/runs on Mac Sierra, using Radeon HD450
- upgraded to use LLVM4.0, under the hood
- simplified handling/switching between 32-bit/64-bit pointer offsets
- added a bunch of maths operations (`sincosf`, ...)
- revamped how gpu buffers are passed into kernels, so that each buffer is only passed in once
- missing function implementations now cause an obvious failure during compile, rather than weird unknown runtime issues
- added `CL_GPUOFFSET=1`, to choose the second gpu
- fixed some bugs involving math, so that `tf.random_uniform` and `tf.random_normal`, in tensorflow, work correctly now