@iskandr iskandr released this Nov 26, 2013 · 43 commits to master since this release

  • Added PyCUDA & Cygwin dependencies to README (#17)

  • Further decrease in call overhead, still ~300x slower than Python (#15)

  • Fixed small bugs (#16)

  • Pass numpy include dirs to gcc before Python's default dirs (thanks Rodrigo)



@iskandr iskandr released this Nov 22, 2013 · 67 commits to master since this release



@iskandr iskandr released this Nov 19, 2013 · 91 commits to master since this release

  • Got rid of testing dependency on SciPy
  • Deleted unused and unfinished optimizations
  • Fixed small misc. bugs



@iskandr iskandr released this Nov 19, 2013 · 105 commits to master since this release

The last release added experimental CUDA support but the performance was terrible. This release includes lots of tweaks and optimizations necessary for getting beneficial speedups on the GPU. However, the default backend remains OpenMP since some program constructs don't work on the GPU and the nvcc compile times are unacceptably slow.

  • Expanded and generalized fusion optimization
  • Filled in missing methods from shape inference
  • Using ShapeElimination on every function (repurposes the shape inference results as a symbolic execution optimization)
  • Fixed lots of small bugs in other optimizations exposed by ShapeElimination
  • Shaved off small amount of compile time by moving away from Node pseudo-ASTs to regular Python constructors
  • Added int24 (a bit hackish) purely as a sentinel for default values in reductions that need to cast up to int32 from bool, int8, or int16
  • Eliminate redundant & constant array operator arguments with SpecializeFnArgs



@iskandr iskandr released this Nov 5, 2013 · 158 commits to master since this release

  • Added experimental CUDA backend (use by passing _backend='cuda' to @jit functions)



@iskandr iskandr released this Oct 31, 2013 · 187 commits to master since this release

  • Added OpenMP backend (runs most map-like computations across multiple threads)
  • Stack-allocate representations for all structured types in C
  • Disabled Flattening -- tricky transform needs careful audit
  • Debugged and enabled CopyElimination
  • Fixed negative step in slices
  • Added RLock around AST translation to play nice with Python threads (thanks Russell Power)
  • Fixed link argument order for building on cygwin in Windows (thanks Yves-Rémi Van Eycke)



@iskandr iskandr released this Oct 10, 2013 · 241 commits to master since this release

  • Fixed Mac OS linker flags (thanks to Michelle)
  • Fixed length of np.arange with float and negative arguments
  • Added unit tests for arange and SPH kernel function
  • Desperate attempts at getting compiler working on Windows, probably fruitless
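
For context on the np.arange fix: NumPy documents the output length of arange as ceil((stop - start) / step). The release note does not say what the original bug was, so the following is only a sketch of that documented rule in plain Python. The helper name `arange_length` is hypothetical and not part of Parakeet or NumPy.

```python
import math


def arange_length(start, stop, step=1.0):
    """Element count implied by NumPy's documented rule for arange:
    ceil((stop - start) / step), clamped at zero for empty ranges.

    Hypothetical helper for illustration; not Parakeet's implementation.
    """
    if step == 0:
        raise ValueError("step must be non-zero")
    # float() forces true division even when every argument is an int.
    n = int(math.ceil(float(stop - start) / step))
    # A step pointing away from stop yields an empty range, not a negative length.
    return max(0, n)


# Float step, negative step, and an empty range.
print(arange_length(0, 1, 0.25))
print(arange_length(5, 0, -1))
print(arange_length(1, 0, 0.5))
```

The float and negative-step cases are the ones the note singles out. Truncating the quotient, or forgetting the sign of the step, gives the wrong element count.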



@iskandr iskandr released this Oct 9, 2013 · 257 commits to master since this release

  • Added support for binding multiple variables in a for loop (e.g. "for (x,(y,z)) in enumerate(zip(ys,zs)):")
  • More array constructors support 'dtype' argument
  • Lots of little bug fixes and misc. improvements
  • Slightly better support for negative indexing, but negative step sizes are still mostly broken
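
The multi-variable binding mentioned above is the ordinary Python nested-tuple pattern. The sketch below shows it in plain Python with no Parakeet import. The function name `weighted_index_sum` is made up for illustration, and it is an assumption that such a function would be compiled by decorating it with @jit.

```python
def weighted_index_sum(ys, zs):
    """Sum i * (y + z) over paired elements, binding three names per iteration."""
    total = 0
    # enumerate yields (index, item); each item from zip is itself a (y, z) pair,
    # so the loop target destructures a nested tuple in one step.
    for (i, (y, z)) in enumerate(zip(ys, zs)):
        total += i * (y + z)
    return total


print(weighted_index_sum([1, 2, 3], [10, 20, 30]))  # prints 88
```

Without this feature, the loop body would have to index into the zipped pair by hand, which is noisier and easier to get wrong.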



@iskandr iskandr released this Oct 2, 2013 · 292 commits to master since this release

  • Moved version info into submodule so setup.py can run without full dependencies (thanks rjpower).
  • Fixed support for references to global arrays.
  • Make C backend respect runtime changes to config flags.
  • Got rid of unnecessary linking against libpython.