Releases: hughperkins/coriander

v6.0.0

21 Jun 09:13

Changes:

  • installs to ~/coriander now
    • this allows plugins to install without needing sudo
  • installs using python2.7 script install_distro.py now, which:
    • should be more portable
    • handles installing the coriander-dnn plugin
    • handles downloading llvm-4.0
  • plugin architecture created
    • partial implementation of the NVIDIA® CUDA™ cuDNN API factored out into a plugin, coriander-dnn
  • cocl script migrated to cocl_py, which is reasonably cross-platform and can run on Windows

Under-the-hood:

  • main jenkins script migrated to python, so easy-ish to run on Windows too

v5.1.2

07 Jun 11:42

Bug fixes:

  • fix shims re-ordering, which caused runtime errors occasionally
  • fix test failures in test_floatstarstar.cu, caused by virtual memory relocation

v5.1.1

06 Jun 22:39

Changes:

  • added support for passing arrays of gpu pointers in by-value structs, i.e. something like:
struct MyStruct {
    float *buffers[8];
};

__global__ void mykernel(struct MyStruct mystruct) {
   ...
}
  • fixed some compile bugs for Eigen tests on Ubuntu 16.04

v5.0.0

04 Jun 22:31

Changes:

  • cocl_add_executable and cocl_add_library cmake macros created
  • many atomics work now
  • compiles/runs on macOS Sierra, using Radeon HD450
  • upgraded to use LLVM 4.0, under the hood
  • simplified handling/switching between 32-bit/64-bit pointer offsets
  • added a bunch of maths operations (sincosf, ...)
  • revamped how gpu buffers are passed into kernels, so that each buffer is only passed in once
  • missing function implementations now cause an obvious failure during compile, rather than weird unknown runtime issues
  • added CL_GPUOFFSET=1, to choose the second gpu
  • fixed some bugs involving math, so that tf.random_uniform and tf.random_normal, in tensorflow, work correctly now
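
As a rough illustration of the CL_GPUOFFSET item above, the sketch below shows one way an offset taken from an environment variable could be turned into a device index. The helper name select_gpu_index and its error handling are hypothetical, chosen for this example only; this is not Coriander's actual device-selection code.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Coriander): maps the value of an
// offset environment variable such as CL_GPUOFFSET to a GPU index.
// An unset or empty variable means offset 0, i.e. the first GPU.
int select_gpu_index(int num_gpus, const char *offset_env) {
    int offset = 0;
    if (offset_env != nullptr && *offset_env != '\0') {
        // std::stoi throws std::invalid_argument on non-numeric input.
        offset = std::stoi(offset_env);
    }
    if (offset < 0 || offset >= num_gpus) {
        throw std::out_of_range(
            "GPU offset " + std::to_string(offset) + " requested, but " +
            std::to_string(num_gpus) + " GPU(s) available");
    }
    return offset;
}
```

A caller would typically pass std::getenv("CL_GPUOFFSET") as the second argument; with CL_GPUOFFSET=1 and two GPUs present, the helper returns index 1, the second gpu.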

v4.0.4

25 Nov 16:17

Fixes several Eigen tests, https://bitbucket.org/hughperkins/eigen/src/eigen-cl/unsupported/test/cuda-on-cl/?at=eigen-cl :

  • argmax passes now
  • cuda_nullary passes again now (briefly didn't, in the short-lived 4.0.1)
  • reduction_tiny passes again now (briefly didn't, in the short-lived 4.0.1)

v4.0.0

24 Nov 15:23
  • radical refactoring under-the-hood
  • allocas work ok now
  • calling functions that return pointers works now, whether the pointer is always global, never global, or sometimes global and sometimes not (even for the exact same function name)
  • address-space propagation in general: to functions, through functions, to/through phis, to/through allocas, all significantly improved/existent, compared to before
  • opencl generation is at runtime now, which gives two things:
    • it's actually faster at runtime, counter-intuitively, since we only need to feed the gpu driver the small amount of OpenCL we actually need, rather than the entire program each time :-P
    • massively improves the ability to determine the address-space of pointer variables/functions/etc
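
The speed-up described above comes from generating only the kernels that actually get launched. A minimal sketch of that idea, assuming a per-kernel cache filled on first use, is shown below. The class LazyKernelCache and its generator callback are hypothetical names for this illustration and do not reflect Coriander's internal structure.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical illustration (not Coriander's code): OpenCL source for a
// kernel is generated the first time that kernel is requested, then reused.
class LazyKernelCache {
public:
    using Generator = std::function<std::string(const std::string &)>;

    explicit LazyKernelCache(Generator generate)
        : generate_(std::move(generate)) {}

    // Returns the source for kernel_name, generating it only on first use.
    const std::string &source_for(const std::string &kernel_name) {
        auto it = cache_.find(kernel_name);
        if (it == cache_.end()) {
            ++generated_count_;
            it = cache_.emplace(kernel_name, generate_(kernel_name)).first;
        }
        return it->second;
    }

    // Number of kernels generated so far; kernels never requested cost nothing.
    int generated_count() const { return generated_count_; }

private:
    Generator generate_;
    std::unordered_map<std::string, std::string> cache_;
    int generated_count_ = 0;
};
```

In this sketch a program containing many kernels only pays generation cost for the ones it launches, and repeated launches of the same kernel reuse the cached source.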

v2.0.1

13 Oct 11:36

Fix:

  • hostside_opencl_funcs_assure_initialized was being inlined, not added to libcocl.a

v2.0.0

13 Oct 11:05

(following semver: this changes the public API, so the major version is bumped)

Changes:

  • if you just want to ensure the cl context is initialized (which is entirely optional, but useful for testing), the public method now is:
    • hostside_opencl_funcs_assure_initialized, rather than just assure_initialized ...

v1.0.0

13 Oct 09:00
1.2