Current development version (1.0-SNAPSHOT)


Version 1.0.0-RC4 (20150308)

Version 1.0.0-RC3 (20130107)

  • Fixed nasty regression in getBestDevice !
  • Fixed ati byte order hack
  • Fixed byte order hack for ATI platforms
  • Fixed / optimized event callbacks (API-breaking change : CLEvent.EventCallback now takes only the completion status as argument, not the event)
  • Fixed library probe
  • Fixed handling of image2d_t and image3d_t in Maven plugin (contribution from Remi Emonet, request #308 and [issue nativelibs4java#307])
  • Fixed OpenGL interop on Windows ([issue nativelibs4java#312])
  • Fixed error about mismatching byte order for byte buffers, and replaced mentions of getKernelsDefaultByteOrder() with getByteOrder() ([issue nativelibs4java#336])
  • Fixed AMD App 2.7 Linux library loading code for
  • Fixed AMD download link in demos.
  • Added CLEvent.FIRE_AND_FORGET to avoid returning events from all the methods that accept a vararg eventsToWaitFor.
  • Added naive OSGi support to the main JAR.
  • Added list of devices in program errors.
  • Added CLBuffer.allocateCompatibleMemory(CLDevice)
  • Added client properties to CLContext (lazy + concurrent)
  • Optimized low-level bindings on OpenCL 1.1+ platforms, with dynamic runtime switch (removed synchronized keyword from all native calls), and made OpenCL 1.0 synchronization a warning.
  • Enhanced CLDevice.toString (include platform name)
  • Deprecated CLKernel.enqueueNDRange with int[] parameters
  • CLContext.createUserEvent() now returns CLUserEvent
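
The byte order fix above concerns host buffers whose order differs from the one the device expects. As a minimal sketch using only java.nio (the class and method names below are hypothetical and not part of JavaCL; the device order would come from whatever getByteOrder() reports), such a check could look like this:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, for illustration only: not JavaCL's implementation.
final class ByteOrderCheck {
    /** Allocates a direct buffer whose byte order matches the given device order. */
    static ByteBuffer allocateFor(ByteOrder deviceOrder, int byteLength) {
        // New ByteBuffers are always BIG_ENDIAN until order(...) is called.
        return ByteBuffer.allocateDirect(byteLength).order(deviceOrder);
    }

    /** Throws if the buffer's byte order differs from the order the device expects. */
    static void checkOrder(ByteBuffer buffer, ByteOrder deviceOrder) {
        if (!buffer.order().equals(deviceOrder)) {
            throw new IllegalArgumentException(
                "Buffer order " + buffer.order()
                + " does not match device order " + deviceOrder);
        }
    }
}
```

A freshly allocated ByteBuffer is big-endian regardless of the host, which is why a mismatch is easy to hit on little-endian devices unless order(...) is set explicitly.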

Version 1.0.0-RC2 (20120415, commit 6bc061dfce06b941086a29f696195e82fbaffbdc)

Version 1.0.0-RC1 (r2130, 20110621)

  • BridJ version now becomes the default : the JNA version is still maintained and available with all Maven artifact ids suffixed with "-jna" (BridJ-based JavaCL's main artifact is now "javacl", while the JNA-based version is "javacl-jna")
  • added simple Fourier-analysis classes (package com.nativelibs4java.opencl.util.fft), with double and float variants, usable with primitive arrays or OpenCL buffers :
    • naive Discrete Fourier Transform (DFT)
    • Fast Fourier Transform (FFT) for power-of-two arrays / buffers (performs better than Apache Commons on a CPU)
  • added some compiler options to CLProgram :
    • setFastRelaxedMath() (triggers all the others !)
    • setFiniteMathOnly()
    • setUnsafeMathOptimizations()
    • setMadEnable()
    • setNoSignedZero()
  • added CLContext.createBuffer(Usage, Buffer)
  • added CLBuffer.copyTo(CLQueue, CLMem destination, CLEvent...) and CLBuffer.emptyClone(Usage)
  • added NIOUtils.indirectBuffer(size, bufferClass)
  • added CLContext.toString
  • deprecated CLXXXBuffer in favor of CLBuffer<XXX> (CLIntBuffer becomes CLBuffer<Integer>, etc.)
  • changed CLContext.createBuffer(Usage, length, class) to createBuffer(Usage, class, length) to match the JavaCL/BridJ API (and to provoke migration issues : people should now use a primitive class rather than an NIO buffer class !)
  • complete rewrite of CLBuffer genericity to unify with the BridJ port : CLBuffer<DoubleBuffer> is now CLBuffer<Double>, and buffers are no longer strongly typed (they are implicitly typed with Buffer subclasses for compatibility with existing code). The BridJ port will be favoured, and its read/write/map methods use typed Pointer.
  • complete rewrite of UJMP Matrix implementation, using principles borrowed from ScalaCL
  • fixed [issue nativelibs4java#66] (create temp files in ~/.javacl subdirectories instead of /tmp)
  • fixed OpenGL sharing on MacOS X
  • fixed CLProgram.getBinaries() in some cases
  • fixed on indirect buffers
  • fixed NPE that happens with null varargs CLEvent[] array
  • fixed length = 1 case in reduction utility
  • fixed ATI detection ("ATI Stream" now replaced by "AMD Accelerated Parallel Processing", cf. Csaba's comment in [issue nativelibs4java#39])
  • fixed [issue nativelibs4java#55] : applied Kazo Csaba's patch to fix the bounds of's returned buffer
  • fixed inheritance of CLBuildException (now derives from CLException)
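
For readers unfamiliar with the "naive Discrete Fourier Transform" mentioned above, here is a minimal plain-Java sketch of the O(N²) algorithm. It is not the code from com.nativelibs4java.opencl.util.fft; the class name and the split real/imaginary array layout are assumptions made for illustration.

```java
// Illustrative sketch only: a direct O(N^2) DFT on the CPU.
final class NaiveDft {
    /**
     * Computes X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/N).
     * Returns a two-row array: row 0 holds real parts, row 1 imaginary parts.
     */
    static double[][] dft(double[] real, double[] imag) {
        int n = real.length;
        double[] outRe = new double[n];
        double[] outIm = new double[n];
        for (int k = 0; k < n; k++) {
            double sumRe = 0, sumIm = 0;
            for (int t = 0; t < n; t++) {
                double angle = -2 * Math.PI * k * t / n;
                double c = Math.cos(angle), s = Math.sin(angle);
                // Complex multiply (re + i*im) * (c + i*s)
                sumRe += real[t] * c - imag[t] * s;
                sumIm += real[t] * s + imag[t] * c;
            }
            outRe[k] = sumRe;
            outIm[k] = sumIm;
        }
        return new double[][] { outRe, outIm };
    }
}
```

An FFT produces the same result in O(N log N) for power-of-two lengths, which is why the changelog lists it separately from the naive variant.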

Version 1.0-beta-6 (r1637, 20110204)

  • Fixed support of ATI Stream 2.3 (CPU)
  • New interactive image kernel demo : lets you edit and test image kernels in a snap (bundled with a few samples)
  • Experimental BridJ port with same functionality as JNA-powered version, but smaller and faster (JAR weighs 750 kB instead of 1.8 MB, overhead per-function call about 10x smaller)
  • Added automatic and transparent program binaries caching :
    • Disabled by default on ATI Stream.
    • Can force on/off with :
      • property -Djavacl.cacheBinaries=true/false
      • environment variable JAVACL_CACHE_BINARIES=1/0
      • methods CLContext.setCacheBinaries and CLProgram.setCached
  • JavaCL.createBestContext now takes an ordered list of CLPlatform.DeviceFeature enums that help prioritize the features considered as "best" (list can be empty or contain any of CPU, GPU, DoubleSupport, MaxComputeUnits, NativeEndianness, ImageSupport, MostImageFormats...). These features are preferences, not requirements : with createBestContext(GPU, MaxComputeUnits) you might end up getting a CPU-based context if there's no GPU available, but you'll have the most powerful GPU (in terms of compute units) if there are two of them.
  • Kernels can now include files that are in the classpath (+ added CLProgram.addInclude that accepts directories and base URLs)
  • Added LibCL : growing collection of OpenCL functions that can be included from any JavaCL kernel
  • CLKernel.enqueueNDRange has a new overload without localWorkSizes argument (it's then adjusted to a good value by the OpenCL implementation).
  • ScalaCLv2 was rewritten to fit nicely into Scala's collections framework.
  • Added CLContext.createProgram(Map<CLDevice, byte[]>) to create from saved binaries (contribution from Kazo Csaba, [issue nativelibs4java#30])
  • Added CLProgram.addBuildOption(String)
  • Fixed CLBuffer.copyTo
  • Demos now use the latest jogamp JOGL binaries (see the updated build instructions)
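
The "preferences, not requirements" behaviour described for JavaCL.createBestContext can be illustrated with a small standalone sketch. Everything below is hypothetical: the Device and Feature types are stand-ins rather than JavaCL's CLDevice and CLPlatform.DeviceFeature, and ranking devices lexicographically by the ordered feature list is an assumption about how preferences combine, not a description of the library's actual scoring.

```java
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only: hypothetical types, assumed lexicographic ranking.
final class BestDeviceSketch {
    /** Hypothetical stand-in for a device description. */
    static final class Device {
        final String name;
        final boolean gpu;
        final int computeUnits;

        Device(String name, boolean gpu, int computeUnits) {
            this.name = name;
            this.gpu = gpu;
            this.computeUnits = computeUnits;
        }
    }

    /** Hypothetical subset of features; each one scores a device (higher is better). */
    enum Feature {
        GPU { int score(Device d) { return d.gpu ? 1 : 0; } },
        CPU { int score(Device d) { return d.gpu ? 0 : 1; } },
        MaxComputeUnits { int score(Device d) { return d.computeUnits; } };

        abstract int score(Device d);
    }

    /** Picks the highest-ranked device; earlier preferences dominate later ones. */
    static Device best(List<Device> devices, Feature... preferences) {
        Comparator<Device> order = (a, b) -> 0;
        for (Feature f : preferences) {
            Comparator<Device> byFeature = Comparator.comparingInt(f::score);
            order = order.thenComparing(byFeature);
        }
        return devices.stream().max(order)
            .orElseThrow(() -> new IllegalArgumentException("no devices"));
    }
}
```

With preferences (GPU, MaxComputeUnits), a list holding a CPU and two GPUs yields the GPU with the most compute units, while a list holding only CPUs still yields a device (the CPU with the most compute units) instead of failing.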

Version 1.0-beta-5 (r1067, 20100717)

Version 1.0-beta-4 (r760, 20100121)

  • Changed semantics of offset & length arguments in typed / write / map methods : now expressed in elements, not in bytes (e.g. 4 bytes per element for CLIntBuffer)
  • Added OpenGL interoperability methods to CLContext and CLQueue (can create a CLByteBuffer from an OpenGL buffer object, a CLImage2D/3D from an OpenGL 2D/3D texture or a renderbuffer).
  • Added OpenGL-compatible context creation methods to JavaCL & CLPlatform classes
  • Added basic reduction support in ReductionUtils (cumulative additions, multiplications, min, max...)
  • Created javacl-demos package, with Particles, Hardware Report and Mandelbrot demos...
  • Finished migration from NativeLong to NativeSize (changes only the low-level bindings)
  • Added profiling methods to CLEvent (+ facility CLDevice.createProfilingQueue())
  • Better JavaDoc for low-level bindings (links to Khronos online manual pages)
  • Added deferred program allocation : CLProgram.addSource(String), CLProgram.allocate() (called automatically)...
  • Added very simple OpenCL backend for UJMP (Universal Java Matrix Package), which does matrix multiplications in OpenCL.
  • Created a kernel wrapper autogenerator (Maven plugin based on JNAerator) : translates all constants on the Java side and presents kernels as methods with the correct Java argument types. It assumes OpenCL kernels (*.c, *.cl) are in src/main/opencl
  • Added wrappers around clGetKernelWorkGroupInfo
  • Fixed handling of devices whose endianness differs from the platform's
  • Fixed [issue nativelibs4java#10] : "getMaxWorkItemSizes() fails on win7 64 GTX260"
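
The change of offset and length units from bytes to elements noted above can be made concrete with a short java.nio sketch. The helper names are hypothetical and this is not JavaCL code; it only shows the element-to-byte arithmetic (4 bytes per int) and what an element-based offset means on a typed view.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

// Illustrative sketch only: element-based vs byte-based addressing.
final class ElementOffsets {
    /** Converts an element count or offset into bytes for a given element size. */
    static long toBytes(long elements, int bytesPerElement) {
        return elements * bytesPerElement;
    }

    /** Reads `length` ints starting at element index `offset` (not a byte offset). */
    static int[] readInts(ByteBuffer raw, int offset, int length) {
        IntBuffer view = raw.asIntBuffer(); // typed view; indices count ints
        int[] out = new int[length];
        view.position(offset);
        view.get(out, 0, length);
        return out;
    }
}
```

Under the old byte-based semantics, reading two ints starting at the second element would have required offset 4 and length 8; with element-based semantics the same call uses offset 1 and length 2.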

Version 1.0-beta-3 (r , 20091030)

Version 1.0-beta-2

  • JAR is now self-sufficient (includes JNA + JNAerator's runtime classes)
  • Added CLKernel.setLocalArg(argIndex, localArgByteLength)
  • Allow localWorkSizes to be null in enqueueNDRange
  • Added support for barriers and markers in CLQueue
  • Fixed [issue nativelibs4java#2] : enqueueNDRange does not work for nDim > 1
  • Added CLDevice.getMaxWorkItemSizes()
  • CLDevice.toString() now only prints the name
  • Moved method createContext from CLContext to CLPlatform
  • Added all the CL_DEVICE_PREFERRED_VECTOR_WIDTH_XXX infos to CLDevice as getPreferredVectorWidthXXX()
  • Changed return type of getExtensions() method of both CLPlatform and CLDevice from String to String[]
  • Added com.nativelibs4java.opencl.HardwareReport (with main method) : outputs an HTML report with device stats
  • Rationalized naming of all enums : CL_ENTITY_ATTRIBUTE_SOME_VALUE = CLEntity.Attribute.SomeValue (enum item SomeValue in enum Attribute in class CLEntity)
  • Added full support of images :
    • CLContext.getSupportedImageFormats + CLImageFormat and associated enums
    • CLImage2D, CLImage3D and corresponding creation methods in CLContext + all image info getters
  • CLMem is now an abstract base class
    • CLBuffer with typed subclasses (CLByteBuffer, CLIntBuffer..)
    • To create a CLBuffer : context.createIntBuffer(Input, size)
    • Added CLBuffer.copyTo (clEnqueueCopyBuffer)
    • Each typed CLBuffer subclass has map, mapLater, read methods that return typed NIO buffers
  • Added full typing of OpenCL Exceptions (now possible to selectively catch a CLException.OutOfHostMemory, for instance)
  • Added hashCode and equals method to most classes
  • Added ability to create out of order queues and change queue properties after creation
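
The enum naming rule above (SOME_VALUE becomes SomeValue) amounts to an upper-snake-case to camel-case conversion of the constant's value part. The sketch below is a hypothetical helper written for illustration, not something JavaCL ships, and actual enum names in the library may deviate from a purely mechanical conversion.

```java
import java.util.Locale;

// Illustrative sketch only: mechanical UPPER_SNAKE to CamelCase conversion.
final class EnumNaming {
    /** Converts e.g. "SOME_VALUE" to "SomeValue". */
    static String toCamelCase(String upperSnake) {
        StringBuilder sb = new StringBuilder();
        for (String word : upperSnake.split("_")) {
            if (word.isEmpty()) {
                continue; // tolerate doubled or leading underscores
            }
            sb.append(Character.toUpperCase(word.charAt(0)));
            sb.append(word.substring(1).toLowerCase(Locale.ROOT));
        }
        return sb.toString();
    }
}
```

The same conversion applied to OUT_OF_HOST_MEMORY gives OutOfHostMemory, the name used by the typed exception mentioned in this release.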

Version 1.0-beta-1

  • New CLPlatform class (~ OpenCL implementation) which now hosts the list*Devices(...) methods
  • Entry point of library is now OpenCL4Java.listPlatforms()
  • New CLEvent class, returned by all enqueue* methods (with methods waitFor, invokeUponCompletion...)
  • Better separation between blocking and non blocking calls
  • New CLSampler class supported as argument of CLKernel
  • Many info getters with typesafe enums / enum sets in classes CLDevice, CLPlatform, CLKernel...
  • Much more complete JavaDoc
  • Example & benchmark classes became JUnit tests and moved here :

While this release is rather OpenCL4Java-focused, ScalaCL also got its share of enhancements :

  • Added scalar variables IntVar, FloatVar, ShortVar...
  • 'local' keyword can be added to variables so they're local to the programs : val x = FloatVar local
  • Added many OpenCL math functions
  • Added methods ArrayVar.write(Range), ArrayVar.write(Seq)
  • Various bugfixes