A code generator for array-based code on CPUs and GPUs


Loopy: Transformation-Based Generation of High-Performance CPU/GPU Code


Loopy lets you easily generate the tedious, complicated code that is necessary to get good performance out of GPUs and multi-core CPUs. Loopy's core idea is that a computation should be described simply and then transformed into a version that gets high performance. This transformation takes place under user control, from within Python.

It can capture the following types of optimizations:

  • Vector and multi-core parallelism in the OpenCL/CUDA model
  • Data layout transformations (structure of arrays to array of structures)
  • Loop unrolling
  • Loop tiling with efficient handling of boundary cases
  • Prefetching/copy optimizations
  • Instruction level parallelism
  • and many more
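
To make the "describe simply, then transform" idea concrete, here is a minimal hand-written sketch in plain Python of what loop tiling with boundary handling means. It does not use Loopy's API and is not code that Loopy generates. The function names `scale_simple` and `scale_tiled` and the tile size of 4 are assumptions chosen for illustration.

```python
def scale_simple(a):
    """Simple description: one flat loop over all n elements."""
    out = [0.0] * len(a)
    for i in range(len(a)):
        out[i] = 2 * a[i]
    return out


def scale_tiled(a, tile=4):
    """Same computation, with the loop split into tiles of size `tile`."""
    n = len(a)
    out = [0.0] * n
    n_tiles = (n + tile - 1) // tile  # ceiling division
    for i_outer in range(n_tiles):
        base = i_outer * tile
        if base + tile <= n:
            # Full tile: every inner index is in bounds, so no check is needed.
            for i_inner in range(tile):
                out[base + i_inner] = 2 * a[base + i_inner]
        else:
            # Boundary tile: only the remaining n - base elements are valid.
            for i_inner in range(n - base):
                out[base + i_inner] = 2 * a[base + i_inner]
    return out


# 10 is not a multiple of 4, so the last tile is a boundary case.
data = [float(i) for i in range(10)]
assert scale_tiled(data, tile=4) == scale_simple(data)
```

Writing and maintaining the second version by hand is the kind of tedium the transformations listed above are meant to remove: the simple version stays the source of truth, and the tiled form is derived from it.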

Loopy targets array-type computations, such as the following:

  • dense linear algebra,
  • convolutions,
  • n-body interactions,
  • PDE solvers, such as finite element, finite difference, and Fast-Multipole-type computations

It is not (and does not want to be) a general-purpose programming language.

Loopy is licensed under the liberal MIT license and is free for commercial, academic, and private use. All of Loopy's dependencies can be installed automatically from the package index by running:

pip install loo.py

In addition, Loopy is compatible with and enhances pyopencl.


Places on the web related to Loopy: