Before off-induction-stencil merge. 
scotts (author)
Sun Jan 31 15:00:51 -0800 2010
commit  f7348505d67d71143485800dd2549da637aab19f
tree    b3ae27ea6185d471eac75faab33bd45c1faf5748
parent  db7f35066f2d6dc5956da9d4b2a842f5bdb1ef6c
README.txt
Cellgen: OpenMP-like support for the Cell processor
Scott Schneider, http://www.cs.vt.edu/~scschnei
See the project page, http://www.cs.vt.edu/~scschnei/cellgen, for publications.

Compiling
---------

Cellgen uses Boost.Spirit for parsing, which relies on deeply nested templates. When debugging
information is turned on (-g), the levels of nesting are not compiled away and remain in the
executable. Consequently, including debugging information increases the executable size by an
order of magnitude.

Cellgen also relies on other Boost libraries, but they should be installed on most Linux systems.

Other than that, a simple "make" should do.

Usage
-----

cellgen foo.cellgen [-n <# SPEs>] [-I <include file for PPE/SPE>]

Brief Programming Tutorial
--------------------------

Cellgen shares semantics with OpenMP, but legal OpenMP code is not legal Cellgen code, and
vice-versa. This section presents a brief tutorial of Cellgen, which serves to both provide
the reader with an intuitive feel for the programming model, and to highlight supported features.

Regarding mechanics, Cellgen is a source-to-source compiler: it accepts C code and emits
C code. The current workflow requires a programmer to run "cellgen" on a "*.cellgen" file,
which produces code for both the PPE and SPE. Currently, we rely on the sophisticated
Makefiles provided by the IBM SDK to produce executable code.

In all of these code examples, we assume the Cellgen blocks of code reside in a legal C program.

The Basics
----------

All Cellgen code is preceded by a "#pragma cell" directive.  Cellgen ignores all other lines of
code until it reaches that pragma.  The Cellgen code is also enclosed in braces. The simplest
Cellgen code transfers no data in or out of the SPE:

#pragma cell
{
  printf("Hello world");
}

This code will print the string "Hello world" from each SPE. All code within a Cellgen region
will be executed on the SPE, and all code outside will be executed on the PPE. In code terms:

printf("I will always execute on the PPE.");

#pragma cell
{
  printf("I will always execute on each SPE.");
}

In the previous two examples, the SPEs all behaved the same. While the Cellgen model is to
distribute the same code to each SPE, the power comes from giving each SPE different data. In
the following example, each SPE executes different parts of the iteration space for a loop.

#pragma cell private(int N = N) 
{
  int i;
  for (i = 0; i < N; ++i) {
    printf("iteration %d\n", i);
  }
}

In this case, each SPE executes a subset of the iteration space [0, N).
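
The README does not say how Cellgen divides the iteration space among SPEs. As a rough sketch
only, assuming a contiguous block partition, the split could look like the following. The helper
name "spe_range" and its parameters are hypothetical and are not part of Cellgen.

```c
#include <assert.h>

/* Hypothetical helper (not a Cellgen API): computes the half-open range
 * [*begin, *end) of iterations that SPE number `id` out of `num_spes` would
 * run for a loop over [0, n), assuming a contiguous block partition.
 * The first (n % num_spes) SPEs receive one extra iteration. */
void spe_range(int n, int num_spes, int id, int *begin, int *end)
{
    assert(n >= 0 && num_spes > 0 && id >= 0 && id < num_spes);
    int base  = n / num_spes;   /* iterations every SPE gets */
    int extra = n % num_spes;   /* leftover iterations to spread out */
    *begin = id * base + (id < extra ? id : extra);
    *end   = *begin + base + (id < extra ? 1 : 0);
}
```

With this scheme the ranges of all SPEs are disjoint and together cover [0, n) exactly once,
which is the property any partitioning of an independent loop needs.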

Computations with Flat Arrays
-----------------------------

None of the prior examples performed any interesting computations or even transferred any data
beyond loop parameters. The following example multiplies each element of a single-dimensional
array by a constant.

int vector[N];
int factor; // presumably set elsewhere

#pragma cell shared(int* v = vector) private(int f = factor, int N = N)
{
  int i;
  for (i = 0; i < N; ++i) {
    v[i] = v[i] * f;
  }
}

This code sample introduces several new concepts. First, in order to pass data into a Cell
region, we must specify if it is "shared" or "private". Variables declared "shared" will have
their data distributed among all SPEs, streamed in or out as needed. Cellgen performs reference
analysis to determine how to stream the variables. In this example, the data for "vector"
will be both streamed in and out of the SPEs; its result will be visible to code beyond the
Cell region. Variables declared "private" will be transferred to each SPE once, and all SPEs
will have their own local copy.

Each SPE will carry out its computation in parallel, and there is an implicit barrier at the end
of the Cell region. Note that all of the iterations of the loop are *independent*. Currently,
Cellgen can only handle independent loops.
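
To make the "shared" and "private" distinction more concrete, here is a plain C model of what
one SPE might do with its slice of the array. It is a sketch under assumptions: "scale_slice"
and "CHUNK" are hypothetical names, the memcpy calls merely stand in for DMA transfers, and the
code Cellgen actually generates is not shown in this README.

```c
#include <assert.h>
#include <string.h>

#define CHUNK 4  /* assumed elements per transfer; illustrative value only */

/* Hypothetical model of one SPE's work (not generated Cellgen code).
 * `vector` plays the role of the shared array in main memory, [begin, end)
 * is the slice this SPE owns, and `f` is its private copy of the factor. */
void scale_slice(int *vector, int begin, int end, int f)
{
    int local[CHUNK];  /* stands in for a buffer in the SPE local store */
    assert(begin >= 0 && begin <= end);
    for (int base = begin; base < end; base += CHUNK) {
        int count = (end - base < CHUNK) ? end - base : CHUNK;
        /* "stream in": stands in for a DMA get from main memory */
        memcpy(local, vector + base, (size_t)count * sizeof(int));
        for (int i = 0; i < count; ++i)
            local[i] = local[i] * f;
        /* "stream out": stands in for a DMA put back to main memory */
        memcpy(vector + base, local, (size_t)count * sizeof(int));
    }
}
```

Because the iterations are independent, calling this once per slice in any order gives the same
result as the original loop.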

Reductions
----------

The result from the previous example was an entire array. Cellgen can also handle reductions,
where the computation relies on a large dataset, but the result is reduced to a single value.

int vector[N];
int sum = 0;

#pragma cell shared(int* v = vector) reduction(+: int s = sum) private(int N = N)
{
  int i;
  for (i = 0; i < N; ++i) {
    s += v[i];
  }
}

After all SPEs have finished, "sum" contains the summation of all elements of "vector". Cellgen
supports reductions for addition ("+") and multiplication ("*").
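
One common way to implement such a reduction is for each worker to accumulate a private partial
result, which is then combined into the original variable. The sketch below models that idea
sequentially in plain C. It assumes a contiguous split of the array, and "reduce_sum" is a
hypothetical name, not something Cellgen provides or necessarily how it works internally.

```c
#include <assert.h>

/* Hypothetical sequential model of reduction(+: ...), not Cellgen output.
 * Each of `num_spes` workers sums a contiguous slice of `v` into its own
 * partial result; the partials are then added to `initial`. */
int reduce_sum(const int *v, int n, int num_spes, int initial)
{
    assert(n >= 0 && num_spes > 0);
    int sum = initial;
    for (int id = 0; id < num_spes; ++id) {
        int begin = (int)((long long)n * id / num_spes);
        int end   = (int)((long long)n * (id + 1) / num_spes);
        int partial = 0;  /* identity element for addition */
        for (int i = begin; i < end; ++i)
            partial += v[i];
        sum += partial;   /* combine step */
    }
    return sum;
}
```

For a multiplication reduction the same structure would apply, with each partial result starting
at 1 and the combine step using "*".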

Multidimensional Arrays
-----------------------

Dense matrices are usually implemented with multidimensional arrays in C. Cellgen can handle
multidimensional arrays, but it requires more information than with flat arrays, and some
programmer assistance is required with column accesses.

To start with, we shall consider row accesses. The following code multiplies each element of
a 3-dimensional array by a constant factor:

int matrix[N1][N2][N3];
int factor; 

#pragma cell shared(int* m = matrix[N1][N2][N3]) private(int f = factor)
{
  int i, j, k;
  for (i = 0; i < N1; ++i) {
    for (j = 0; j < N2; ++j) {
      for (k = 0; k < N3; ++k) {
        m[i][j][k] = m[i][j][k] * f;
      }
    }
  }
}

Cellgen needs to know the dimensions of the matrix, which are provided in the "shared"
directive. The dimensions can be either constants or variables only known at runtime. Cellgen
requires the matrix dimensions so that it can compute addresses for the DMAs which will get
and put values in main memory. All of the dimensions of the matrix are implicitly passed as 
private variables.
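
The dimensions matter because C stores multidimensional arrays in row-major order, so the byte
offset of an element depends on the sizes of the inner dimensions. The following sketch shows
that arithmetic; "offset3d" is a hypothetical helper written for illustration and is not taken
from Cellgen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a Cellgen API): byte offset of element [i][j][k]
 * from the start of a row-major array with dimensions [n1][n2][n3] whose
 * elements are `elem_size` bytes each. */
size_t offset3d(size_t i, size_t j, size_t k,
                size_t n1, size_t n2, size_t n3, size_t elem_size)
{
    assert(i < n1 && j < n2 && k < n3);
    return ((i * n2 + j) * n3 + k) * elem_size;
}
```

Note that consecutive values of the last index are adjacent in memory, while stepping the first
index jumps by n2 * n3 elements. This is why row accesses map onto contiguous transfers and
column accesses do not.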

Column accesses currently require more work from the programmer.  Because DMA lists work best
with addresses that are 16-byte aligned, Cellgen expects the programmer to pad their data. The
same computation, but accessing columns:

typedef struct int16b_t {
  int num;       // the actual value
  char pad[12];  // pads the struct to 16 bytes (assuming a 4-byte int)
} int16b_t;

int16b_t matrix[N1][N2][N3];
int factor; 

#pragma cell shared(int16b_t* m = matrix[N1][N2][N3]) private(int f = factor)
{
  int i, j, k;
  for (i = 0; i < N2; ++i) {
    for (j = 0; j < N3; ++j) {
      for (k = 0; k < N1; ++k) {
        m[k][i][j].num = m[k][i][j].num * f;
      }
    }
  }
}

In the future, Cellgen will handle data padding so that row and column accesses appear the same
to programmers.
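
The effect of the padding can be checked in plain C. The sketch below assumes a 4-byte int and a
C11 compiler; "column_is_aligned" is a hypothetical helper for illustration, not part of Cellgen.
It verifies that when the array base is 16-byte aligned, every element reached by striding
through the array is also 16-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct int16b_t {
    int  num;
    char pad[12];
} int16b_t;

/* Assumption: int is 4 bytes, so the padded struct is exactly 16 bytes. */
static_assert(sizeof(int16b_t) == 16, "int16b_t must be 16 bytes");

/* Hypothetical helper (not a Cellgen API): returns 1 if each of the `count`
 * elements base[0], base[stride], base[2 * stride], ... starts at a
 * 16-byte-aligned address, and 0 otherwise. */
int column_is_aligned(const int16b_t *base, size_t count, size_t stride)
{
    for (size_t i = 0; i < count; ++i) {
        if ((uintptr_t)(base + i * stride) % 16 != 0)
            return 0;
    }
    return 1;
}
```

With a plain int element the same walk would land on addresses that are only 4-byte aligned,
which is the situation the padding is meant to avoid.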

Further examples are available in the "unit_tests" directory.