Olivier Chafik
Current development version (1.0-SNAPSHOT)

- Added basic handling of structs in JavaCL Generator (issue #421, issue #422, issue #423)
- Optimized allocation of native memory throughout the library (reusing thread-local pointer pools)
- Updated to BridJ 0.7 (brings fixes and lots of performance enhancements)
- Dropped Blas artifact from main deployment (requires additional repo since ujmp is not in Maven Central)
- Dropped support for JNA variant of JavaCL. Migration is highly encouraged (features and bugfixes).
  https://code.google.com/p/javacl/wiki/MigratingFromJNAToBridJ
- Fixed illegal calls to clReleaseDevice (OpenCL 1.2) (issue #387).
- Fixed major bug in CLProgram (issue #397)
- Fixed crash on Radeon due to bug in driver (attempts to get the source from a program created with clCreateProgramWithBinary yields a segfault, issue #397)
- Fixed crash introduced in snapshot (double release, found with new -Dbridj.debug.pointer.releases=true feature) (see issue #420 and/or issue #405).
...

Version 1.0.0-RC3 (20130107)

- Fixed nasty regression in getBestDevice !
- Fixed ati byte order hack
- Fixed byte order hack for ATI platforms
- Fixed / optimized event callbacks (but broke API: CLEvent.EventCallback now only takes the completion status as argument, not the event anymore)
- Fixed library probe
- Fixed handling of image2d_t and image3d_t in Maven plugin (contrib. from Remi Emonet, request #308 and issue #307)
- Fixed OpenGL interop on Windows (issue #312)
- Fixed error about mismatching byte order for byte buffers, and replaced mentions of getKernelsDefaultByteOrder() by getByteOrder() (issue #336)
- Fixed AMD App 2.7 Linux library loading code for
- Fixed AMD download link in demos.
- Added CLEvent.FIRE_AND_FORGET to avoid returning events from all the methods that accept a vararg eventsToWaitFor.
- Added naive OSGi support to the main JAR.
- Added list of devices in program errors.
- Added CLBuffer.allocateCompatibleMemory(CLDevice)
- Added client properties to CLContext (lazy + concurrent)
- Optimized low-level bindings on OpenCL 1.1+ platforms, with dynamic runtime switch (removed synchronized keyword from all native calls), and made OpenCL 1.0 synchronization a warning.
- Enhanced CLDevice.toString (include platform name)
- Deprecated CLKernel.enqueueNDRange with int[] parameters
- Return CLUserEvent from CLContext.createUserEvent();

Version 1.0.0-RC2 (20120415, commit 6bc061dfce06b941086a29f696195e82fbaffbdc)

- Release artifacts are available in Maven Central
- Added support for sub-images reading/writing from/to CLImage2D (slower than with full images)
- Fixed endianness issues with CLBuffer (issue #80)
- Fixed migration of cached binaries to newer versions of OS (e.g. upgrading from Snow Leopard to Lion) (issue #81)
- Fixed handling compiler options containing spaces (issue #274)
- Fixed tutorial artifact pom repositories (issue #279)
- Fixed many Javadoc typos
- Fixed support of Intel's OpenCL Windows runtime (issue #297)
- Enhanced LocalSize API (added static factory methods for all primitive types)
- Deprecated CLContext.getKernelsDefaultByteOrder() and CLDevice.getKernelsDefaultByteOrder()
- Added more informative exceptions when passing null pointers to CLBuffer.writeBytes (issue #257)
- Updated to OpenCL 1.2 headers
- Added -cl-nv-verbose, -cl-nv-maxrregcount, -cl-nv-opt-level + proper log even without error when nv-verbose is set
- Enhanced handling of endianness : warn when creating contexts with devices that have mismatching endianness, throw when creating buffer out of Buffer / Pointer with bad endianness
- Changed signature of CLPlatform.listDevices (now takes a single CLDevice.Type, including All, instead of an EnumSet thereof)
- Moved sources to github (https://github.com/ochafik/nativelibs4java/tree/master/libraries/OpenCL)
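
The endianness entries above concern buffers whose byte order differs from what a device expects. The following is a minimal sketch using only java.nio, not JavaCL itself: it shows why the same bytes yield different values under each order, and the kind of check a library could perform before accepting a buffer. The class and method names (ByteOrderSketch, readInt, isCompatible) are hypothetical and chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustration only: these helpers are hypothetical, not part of JavaCL.
class ByteOrderSketch {
    // Interprets the first four bytes as an int using the given byte order.
    static int readInt(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    // True when the buffer's byte order matches the order the device expects.
    static boolean isCompatible(ByteBuffer buffer, ByteOrder deviceOrder) {
        return buffer.order() == deviceOrder;
    }

    public static void main(String[] args) {
        byte[] bytes = {0, 0, 0, 1};
        System.out.println(readInt(bytes, ByteOrder.BIG_ENDIAN));    // prints 1
        System.out.println(readInt(bytes, ByteOrder.LITTLE_ENDIAN)); // prints 16777216
        // A freshly allocated ByteBuffer is big-endian unless order(...) is called.
        System.out.println(isCompatible(ByteBuffer.allocate(4), ByteOrder.nativeOrder()));
    }
}
```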

Version 1.0.0-RC1 (r2130, 20110621)

- BridJ version now becomes the default : the JNA version is still maintained and available with all Maven artifact ids suffixed with "-jna" (BridJ-based JavaCL's main artifact is now "javacl", while the JNA-based version is "javacl-jna")
- added simple Fourier-analysis classes (package com.nativelibs4java.opencl.util.fft), with double and float variants, usable with primitive arrays or OpenCL buffers :
  - naive Discrete Fourier Transform (DFT)
  - Fast Fourier Transform (FFT) for power-of-two arrays / buffers (performs better than Apache Commons on a CPU)
- added some compiler options to CLProgram :
  - setFastRelaxedMath() (triggers all the others !)
  - setFiniteMathOnly()
  - setUnsafeMathOptimizations()
  - setMadEnable()
  - setNoSignedZero()
- added CLContext.createBuffer(Usage, Buffer)
- added CLBuffer.copyTo(CLQueue, CLMem destination, CLEvent...) and CLBuffer.emptyClone(Usage)
- added NIOUtils.indirectBuffer(size, bufferClass)
- added CLContext.toString
- deprecated CLXXXBuffer in favor of CLBuffer<XXX> (CLIntBuffer becomes CLBuffer<Integer>, etc...)
- changed CLContext.createBuffer(Usage, length, class) to createBuffer(Usage, class, length) to match the JavaCL/BridJ API (and provoke migration issues : people should now use a primitive class rather than an NIO buffer class !!!)
- complete rewrite of CLBuffer genericity to unify with the BridJ port : CLBuffer<DoubleBuffer> is now CLBuffer<Double>, and CLBuffer.read/write/map are no longer strongly typed (it is implicitly typed with Buffer subclasses for compatibility with existing code). The BridJ port will be favoured, and its read/write/map methods use typed Pointer<T>.
- complete rewrite of UJMP Matrix implementation, using principles borrowed from ScalaCL
- fixed issue #66 (create temp files in ~/.javacl subdirectories instead of /tmp)
- fixed OpenGL sharing on MacOS X
- fixed CLProgram.getBinaries() in some cases
- fixed CLBuffer.read on indirect buffers
- fixed NPE that happens with null varargs CLEvent[] array
- fixed length = 1 case in reduction utility
- fixed ATI detection ("ATI Stream" now replaced by "AMD Accelerated Parallel Processing", cf. Csaba's comment in issue #39)
- fixed issue #55 : applied Kazo Csaba's patch to fix the bounds of CLBuffer.map's returned buffer
- fixed inheritance of CLBuildException (now derives from CLException)
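
For readers unfamiliar with the "naive Discrete Fourier Transform" mentioned above, here is a minimal CPU-only sketch of the textbook O(n^2) definition for real-valued input. It is not the code of com.nativelibs4java.opencl.util.fft, and it does not reflect that package's API; the class name NaiveDftSketch and the {real, imaginary} return layout are assumptions made for this example.

```java
// Illustration only: a textbook DFT, not the JavaCL implementation.
class NaiveDftSketch {
    // Returns {re, im}, where X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/n).
    static double[][] dft(double[] input) {
        int n = input.length;
        double[] re = new double[n];
        double[] im = new double[n];
        for (int k = 0; k < n; k++) {
            for (int t = 0; t < n; t++) {
                double angle = -2.0 * Math.PI * k * t / n;
                re[k] += input[t] * Math.cos(angle);
                im[k] += input[t] * Math.sin(angle);
            }
        }
        return new double[][] { re, im };
    }

    public static void main(String[] args) {
        // A constant signal concentrates all of its energy in bin 0.
        double[][] spectrum = dft(new double[] {1, 1, 1, 1});
        System.out.println(spectrum[0][0]); // real part of bin 0
    }
}
```

An FFT computes the same result in O(n log n); the power-of-two restriction noted above is typical of radix-2 variants.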

Version 1.0-beta-6 (r1637, 20100204)

- Fixed support of ATI Stream 2.3 (CPU)
- New interactive image kernel demo : lets you edit and test image kernels in a snap (bundled with a few samples)
- Experimental BridJ port with same functionality as JNA-powered version, but smaller and faster (JAR weighs 750 kB instead of 1.8 MB, overhead per-function call about 10x smaller)
- Added automatic and transparent program binaries caching :
  - Disabled by default on ATI Stream.
  - Can force on/off with :
    - property -Djavacl.cacheBinaries=true/false
    - environment variable JAVACL_CACHE_BINARIES=1/0
    - methods CLContext.setCacheBinaries and CLProgram.setCached
- JavaCL.createBestContext now takes an ordered list of CLPlatform.DeviceFeature enums that help prioritize the features considered as "best" (list can be empty or contain any of CPU, GPU, DoubleSupport, MaxComputeUnits, NativeEndianness, ImageSupport, MostImageFormats...). These features are preferences, not requirements : with createBestContext(GPU, MaxComputeUnits) you might end up getting a CPU-based context if there's no GPU available, but you'll have the most powerful GPU (in terms of compute units) if there are two of them.
- Kernels can now include files that are in the classpath (+ added CLProgram.addInclude that accepts directories and base URLs)
- Added LibCL : growing collection of OpenCL functions that can be included from any JavaCL kernel
- CLKernel.enqueueNDRange has a new overload without localWorkSizes argument (it's then adjusted to a good value by the OpenCL implementation).
- ScalaCLv2 was rewritten to fit nicely into Scala's collections framework.
- Added CLContext.createProgram(Map<CLDevice, byte[]>) to create from saved binaries (contribution from Kazo Csaba, issue #30)
- Added CLProgram.addBuildOption(String)
- Fixed CLBuffer.copyTo
- Demos now use the latest jogamp JOGL binaries (see the updated build instructions : http://code.google.com/p/javacl/wiki/Build)
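
The createBestContext entry above describes features as ordered preferences rather than requirements. One way to picture that behaviour is a lexicographic comparison over the requested features, sketched below. This is an assumption about how such a ranking could work, not JavaCL's actual implementation; Device, Feature and pickBest are hypothetical stand-ins, not library classes.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustration only: hypothetical types, not the JavaCL API.
class BestDeviceSketch {
    enum Feature { CPU, GPU, MaxComputeUnits }

    static final class Device {
        final String name;
        final boolean gpu;
        final int computeUnits;

        Device(String name, boolean gpu, int computeUnits) {
            this.name = name;
            this.gpu = gpu;
            this.computeUnits = computeUnits;
        }
    }

    // Scores one device for one feature; higher is better.
    static int score(Device d, Feature f) {
        switch (f) {
            case GPU: return d.gpu ? 1 : 0;
            case CPU: return d.gpu ? 0 : 1;
            case MaxComputeUnits: return d.computeUnits;
            default: throw new IllegalArgumentException("Unknown feature: " + f);
        }
    }

    // Compares devices feature by feature, in the order the preferences are given.
    // No feature ever excludes a device, so a CPU still wins when no GPU exists.
    static Device pickBest(List<Device> devices, Feature... preferences) {
        Comparator<Device> order = (a, b) -> 0;
        for (Feature f : preferences) {
            order = order.thenComparingInt(d -> score(d, f));
        }
        return devices.stream().max(order)
                .orElseThrow(() -> new IllegalArgumentException("No devices"));
    }

    public static void main(String[] args) {
        List<Device> devices = Arrays.asList(
                new Device("cpu", false, 8),
                new Device("gpu-small", true, 4),
                new Device("gpu-big", true, 16));
        System.out.println(pickBest(devices, Feature.GPU, Feature.MaxComputeUnits).name); // prints gpu-big
    }
}
```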

Version 1.0-beta-5 (r1067, 20100717)

- Now using a nice configuration dialog when launching ParticlesDemo : has optional OpenCL settings with "Fastest", "Normal" and "Safest" presets + detailed platform and device choice (with optional OpenGL sharing choice).
- Added optional context properties map argument to JavaCL.createContext (can be nulled out)
- Fixed issue #18: CLImage.write calls enqueueImageRead !
- Documented workaround for Linux crashes (issue #20) : http://code.google.com/p/javacl/wiki/TroubleShootingJavaCLOnLinux
- Fixed issue #21: NIOUtils.put() doesn't accept ByteBuffer
- Fixed issue #25: CLEvent.waitFor bug causes segfault
- OpenCL 1.1 support :
  - CLContext.createUserEvent()
  - CLUserEvent.setStatus(int), setCompleted()
  - CLEvent.setCallback(status, callback), setCompletionCallback(callback)
  - CLBuffer.createSubBuffer(usage, offset, length)
  - CLContext.getDeviceCount()
  - CLDevice.getOpenCLVersion()
  - CLDevice.isHostUnifiedMemory()
  - CLDevice.getNativeVectorWidthXXX() methods
  - CLMem.setDestructorCallback(callback)
  - CLKernel.getPreferredWorkGroupSizeMultiple()
  - CLKernel.enqueueNDRange overload with potentially non-null globalOffsets
  - CLImageFormat.ChannelOrder.Rx, RGx, RGBx
- Faster enums
- Check for cl_amd_fp64 in CLDevice.isDoubleSupported()
- Fixed CLProgram.getBinaries()
- Fixed issue #22 (maven pom issue)

Version 1.0-beta-4 (r760, 20100121)

- Changed semantics of offset & length arguments in typed CLxxxBuffer.read / write / map methods : now expressed in elements, not in bytes (e.g. 4 bytes per element for CLIntBuffer)
- Added OpenGL interoperability methods to CLContext and CLQueue (can create a CLByteBuffer from an OpenGL buffer object, a CLImage2D/3D from an OpenGL 2D/3D texture or a renderbuffer).
- Added OpenGL-compatible context creation methods to JavaCL & CLPlatform classes
- Added basic reduction support in ReductionUtils (cumulative additions, multiplications, min, max...)
- Created javacl-demos package, with Particles, Hardware Report and Mandelbrot demos...
- Finished migration from NativeLong to NativeSize (changes only the low-level bindings)
- Added profiling methods to CLEvent (+ facility CLDevice.createProfilingQueue())
- Better JavaDoc for low-level bindings (links to Khronos online manual pages)
- Added deferred program allocation : CLProgram.addSource(String), CLProgram.allocate() (called automatically)...
- Added very simple OpenCL backend for UJMP (Universal Java Matrix Package), which does matrix multiplications in OpenCL.
- Created a kernel wrapper autogenerator (Maven plugin based on JNAerator) : translates all constants on the Java side and presents kernels as methods with the correct Java argument types. It assumes OpenCL kernels (*.c, *.cl) are in src/main/opencl
- Added wrappers around clGetKernelWorkGroupInfo
- Fixed respect of endianness of devices that have different endianness than platform
- Fixed issue #10: "getMaxWorkItemSizes() fails on win7 64 GTX260"
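
The change to offset and length semantics noted above means callers now count elements instead of bytes. A minimal sketch of the conversion, using plain Java and no JavaCL calls (the helper name toByteOffset is hypothetical):

```java
// Illustration only: shows the arithmetic behind element-based offsets.
class ElementOffsetSketch {
    // Converts an offset expressed in elements into the equivalent byte offset.
    static long toByteOffset(long elementOffset, int bytesPerElement) {
        return elementOffset * bytesPerElement;
    }

    public static void main(String[] args) {
        // An int occupies 4 bytes, so element 10 of an int buffer starts at byte 40.
        System.out.println(toByteOffset(10, Integer.BYTES)); // prints 40
        // A double occupies 8 bytes, so the same element index starts at byte 80.
        System.out.println(toByteOffset(10, Double.BYTES));  // prints 80
    }
}
```

Code written against the earlier byte-based semantics would need its offset and length arguments divided by the element size.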

Version 1.0-beta-3 (r , 20091030)

- Fixed Issue #8 : NativeLong's can not represent size_t on windows x64 system (all user code that uses the low-level bindings needs to be updated : NativeLong -> NativeSize)
- Added CLContext/CLDevice.isDoubleSupported, isHalfSupported, isByteAddressableStoreSupported
- Added If function to ScalaCL (operates on statements or on expressions)
- Added CLAbstractEntity.release()
- Fixed Issue #4 : CLContext.createContext(CLDevice... devices) created context on only one device
- Regenerated the low level bindings with latest JNAerator : now using NativeSize class instead of NativeLong for size_t (fixes Issue #8)
- Fixed Issue #5 : fixed formatting of CLPlatform.toString()
- Fixed Issue #6 : use max X workgroup dimension for better benchmark speed
- Fixed Issue #7 : CLMem class bug in Usage.WriteOnly and Usage.ReadWrite
- Fixed Issue #11 : call clRetainMemObject when sharing a cl_mem between CLBuffer instances.
- Choose 'best' device in benchmark test

Version 1.0-beta-2

- JAR is now self-sufficient (includes JNA + JNAerator's runtime classes)
- Added CLKernel.setLocalArg(argIndex, localArgByteLength)
- Allow localWorkSizes to be null in enqueueNDRange
- Added support for barriers and markers in CLQueue
- Fixed issue #2 : enqueueNDRange does not work for nDim > 1
- Added CLDevice.getMaxWorkItemSizes()
- CLDevice.toString() now only prints the name
- Moved method createContext from CLContext to CLPlatform
- Added all the CL_DEVICE_PREFERRED_VECTOR_WIDTH_XXX infos to CLDevice as getPreferredVectorWidthXXX()
- Changed return type of getExtensions() method of both CLPlatform and CLDevice from String to String[]
- Added com.nativelibs4java.opencl.HardwareReport (with main method) : outputs html report with devices stats
- Rationalized naming of all enums : CL_ENTITY_ATTRIBUTE_SOME_VALUE = CLEntity.Attribute.SomeValue (enum item SomeValue in enum Attribute in class CLEntity)
- Added full support of images :
  - CLContext.getSupportedImageFormats + CLImageFormat and associated enums
  - CLImage2D, CLImage3D and corresponding creation methods in CLContext + all image info getters
- CLMem is now an abstract base class
- CLBuffer with typed subclasses (CLByteBuffer, CLIntBuffer..)
- To create a CLBuffer : context.createIntBuffer(Input, size)
- Added CLBuffer.copyTo (clEnqueueCopyBuffer)
- Each typed CLBuffer subclass has map, mapLater, read methods that return typed NIO buffers
- Added full typing of OpenCL Exceptions (now possible to selectively catch a CLException.OutOfHostMemory, for instance)
- Added hashCode and equals method to most classes
- Added ability to create out of order queues and change queue properties after creation
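
The enum naming rule above (CL_ENTITY_ATTRIBUTE_SOME_VALUE becomes CLEntity.Attribute.SomeValue) boils down to turning an upper-snake-case suffix into camel case. The sketch below shows that conversion for the value part only. It is an illustration of the convention, not code from the library, and individual enum items may deviate from it; toCamelCase is a hypothetical helper.

```java
import java.util.Locale;

// Illustration only: mirrors the naming convention, not a JavaCL utility.
class EnumNamingSketch {
    // Converts e.g. "SOME_VALUE" to "SomeValue".
    static String toCamelCase(String upperSnake) {
        StringBuilder out = new StringBuilder();
        for (String word : upperSnake.split("_")) {
            if (word.isEmpty()) {
                continue; // skip empty pieces caused by doubled underscores
            }
            out.append(Character.toUpperCase(word.charAt(0)));
            out.append(word.substring(1).toLowerCase(Locale.ROOT));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCamelCase("SOME_VALUE"));         // prints SomeValue
        System.out.println(toCamelCase("OUT_OF_HOST_MEMORY")); // prints OutOfHostMemory
    }
}
```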

Version 1.0-beta-1

- New CLPlatform class (~ OpenCL implementation) which now hosts the list*Devices(...) methods
- Entry point of library is now OpenCL4Java.listPlatforms()
- New CLEvent class, returned by all enqueue* methods (with methods waitFor, invokeUponCompletion...)
- Better separation between blocking and non blocking calls
- New CLSampler class supported as argument of CLKernel
- Many info getters with typesafe enums / enum sets in classes CLDevice, CLPlatform, CLKernel...
- Much more complete JavaDoc : http://nativelibs4java.sourceforge.net/sites/OpenCL4Java/apidocs/
- Example & benchmark classes became JUnit tests and moved here : http://code.google.com/p/nativelibs4java/source/browse/#svn/trunk/lib...

While this release is rather OpenCL4Java-focused, ScalaCL also got its bunch of enhancements :
- Added scalar variables IntVar, FloatVar, ShortVar...
- 'local' keyword can be added to variables so they're local to the programs : val x = FloatVar local
- Added many OpenCL math functions
- Added methods ArrayVar.write(Range), ArrayVar.write(Seq)
- Various bugfixes
