Unroll / ILP unaware of conditionals #37

Closed
skyreflectedinmirrors opened this issue Oct 19, 2016 · 2 comments

@skyreflectedinmirrors
Contributor

Hi, I'm starting to get involved with loopy, and I noticed that the unroll / ILP tag doesn't seem to be aware of what happens in a conditional.

E.g. for this simple example:

import loopy as lp
import numpy as np
import pyopencl as cl

#init
testsize = 100
T = np.random.uniform(size=testsize, low=400, high=2300)
T_arr = lp.GlobalArg('T', shape=T.shape[0], dtype=T.dtype)

def __print_code(knl):
    code, _ = lp.generate_code(knl)
    print(code)

ctx = cl.create_some_context(interactive=False)
queue = cl.CommandQueue(ctx)

knl = lp.make_kernel('{{[k]: 0 <=k<{}}}'.format(testsize),
                 """
                     for k
                         <>Tcond = T[k] < 1000 {id=dep}
                         cp[k] = 2 * T[k] {if=Tcond}
                     end
                 """,
                 [T_arr, '...'])
__print_code(knl)

knl_ilp = lp.split_iname(knl, 'k', 2, inner_tag='ilp')
__print_code(knl_ilp)

I get the following output:

Without ILP:

#define lid(N) ((int) get_local_id(N))
#define gid(N) ((int) get_group_id(N))
#if __OPENCL_C_VERSION__ < 120
#pragma OPENCL EXTENSION cl_khr_fp64: enable
#endif

__kernel void __attribute__ ((reqd_work_group_size(1, 1, 1))) loopy_kernel(__global double const *restrict T, __global double *restrict cp)
{
  int Tcond;

  for (int k = 0; k <= 99; ++k)
  {
    Tcond = T[k] < 1000.0;
    if (Tcond)
      cp[k] = 2.0 * T[k];
  }
}


With ILP:

#define lid(N) ((int) get_local_id(N))
#define gid(N) ((int) get_group_id(N))
#if __OPENCL_C_VERSION__ < 120
#pragma OPENCL EXTENSION cl_khr_fp64: enable
#endif

__kernel void __attribute__ ((reqd_work_group_size(1, 1, 1))) loopy_kernel(__global double const *restrict T, __global double *restrict cp)
{
  int Tcond[2];

  for (int k_outer = 0; k_outer <= 49; ++k_outer)
  {
    Tcond[0] = T[2 * k_outer + 0] < 1000.0;
    if (Tcond)
      cp[2 * k_outer + 0] = 2.0 * T[2 * k_outer + 0];
    Tcond[1] = T[2 * k_outer + 1] < 1000.0;
    if (Tcond)
      cp[2 * k_outer + 1] = 2.0 * T[2 * k_outer + 1];
  }
}

Note that the if (Tcond) conditions were not updated as expected: Tcond is now an array, so they should read if (Tcond[0]) and if (Tcond[1]).

If we use a similar example for unrolling:

import loopy as lp
import numpy as np
import pyopencl as cl

#init
testsize = 100
T = np.random.uniform(size=testsize, low=400, high=2300)
T_arr = lp.GlobalArg('T', shape=T.shape[0], dtype=T.dtype)

def __print_code(knl):
    code, _ = lp.generate_code(knl)
    print(code)

ctx = cl.create_some_context(interactive=False)
queue = cl.CommandQueue(ctx)

knl = lp.make_kernel('{{[k]: 0 <=k<{}}}'.format(testsize),
                 """
                     for k
                         <>Tcond[k] = T[k] < 1000 {id=dep}
                         cp[k] = 2 * T[k] {dep=dep,if=Tcond[k]}
                     end
                 """,
                 [T_arr, '...'])
__print_code(knl)

knl_unr = lp.split_iname(knl, 'k', 2, inner_tag='unr')
__print_code(knl_unr)

we get a similarly flawed output:

#define lid(N) ((int) get_local_id(N))
#define gid(N) ((int) get_group_id(N))
#if __OPENCL_C_VERSION__ < 120
#pragma OPENCL EXTENSION cl_khr_fp64: enable
#endif

__kernel void __attribute__ ((reqd_work_group_size(1, 1, 1))) loopy_kernel(__global double const *restrict T, __global double *restrict cp)
{
  int Tcond[100];

  for (int k_outer = 0; k_outer <= 49; ++k_outer)
  {
    Tcond[2 * k_outer + 0] = T[2 * k_outer + 0] < 1000.0;
    if (Tcond[k])
      cp[2 * k_outer + 0] = 2.0 * T[2 * k_outer + 0];
    Tcond[2 * k_outer + 1] = T[2 * k_outer + 1] < 1000.0;
    if (Tcond[k])
      cp[2 * k_outer + 1] = 2.0 * T[2 * k_outer + 1];
  }
}

I'm running the latest commit, r2e562728
I'm fairly new to the machinery behind loopy. Where would I start looking to correct this?
Also, as a side-note, is it possible to do else statements?

@inducer
Owner

inducer commented Oct 20, 2016

Good catch, thanks for the report. For now, predicates are simply strings representing variable names, but your examples demonstrate that that simply won't suffice. I'll have to allow expressions to allow for these transformations to happen. Not a huge change in principle--I'll see what I can do.
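
To illustrate why a string predicate cannot follow these transformations, here is a minimal toy model in plain Python. It is not loopy's implementation: the `Var` and `Subscript` classes and the `substitute` and `to_c` helpers are hypothetical names invented for this sketch. The idea it shows is only that a rewrite which walks an expression tree can reach the predicate, whereas an opaque string passes through unchanged.

```python
from dataclasses import dataclass


# Hypothetical mini expression tree, invented for this sketch only.
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Subscript:
    aggregate: object
    index: object


def substitute(expr, mapping):
    """Replace variables named in `mapping`; plain strings are opaque."""
    if isinstance(expr, str):
        # A string predicate has no structure to walk, so it is left as-is.
        return expr
    if isinstance(expr, Var):
        return mapping.get(expr.name, expr)
    if isinstance(expr, Subscript):
        return Subscript(substitute(expr.aggregate, mapping),
                         substitute(expr.index, mapping))
    raise TypeError("unsupported node: {!r}".format(expr))


def to_c(expr):
    """Render an expression (or an opaque string) as C-like text."""
    if isinstance(expr, str):
        return expr
    if isinstance(expr, Var):
        return expr.name
    return "{}[{}]".format(to_c(expr.aggregate), to_c(expr.index))


# ILP, lane 0: the scalar temporary Tcond becomes an array element.
ilp_lane0 = {"Tcond": Subscript(Var("Tcond"), "0")}
# Unroll, second copy: the iname k is replaced by its expansion
# (kept as pre-rendered text for brevity).
unr_lane1 = {"k": "2 * k_outer + 1"}

print(to_c(substitute("Tcond", ilp_lane0)))       # Tcond (stale)
print(to_c(substitute(Var("Tcond"), ilp_lane0)))  # Tcond[0]
print(to_c(substitute(Subscript(Var("Tcond"), Var("k")), unr_lane1)))
# Tcond[2 * k_outer + 1]
```

The first `print` reproduces the behaviour reported above: the string version of the predicate is emitted untouched, while the tree versions pick up the ILP index and the unrolled iname.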

else is doable, too. FWIW, syntax like this:

if blah
   a = b
end

already exists.

@inducer
Owner

inducer commented Oct 20, 2016

This commit addresses the ILP concern, and it's possible that it might also address the unroll case, but I haven't tested that yet.

There are a few loose ends:

  • Investigate expression parsing in if-like contexts (basically check that it works, ideally add tests), given that it previously (technically) should have only allowed variables
  • I'd be grateful if you could PR me a test for the unroll case, along the lines of the one I've added for your ILP example (see that commit).
  • Add else syntax (if you like). For that, look in loopy/kernel/creation.py and search for IF_RE and its usage sites.
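
As a starting point for the unroll test mentioned in the list above, one possible approach is to scan the generated code for `if` conditions that were left untransformed. The sketch below is an assumption about how such a check could look, not the test from the referenced commit: `stale_conditions` and its parameters are hypothetical names, and the regular expression only handles simple single-line conditions like those in the output quoted in this issue.

```python
import re

# Matches the condition of a simple `if (...)`; nested parentheses are
# not handled. This is sufficient for the generated code in this issue.
_IF_COND_RE = re.compile(r"\bif\s*\(([^()]*)\)")


def stale_conditions(code, removed_inames=(), ilp_temporaries=()):
    """Return the `if` conditions in `code` that look untransformed.

    removed_inames: inames that were split away (e.g. "k") and should no
        longer appear in any condition.
    ilp_temporaries: temporaries that ILP turned into arrays (e.g. "Tcond")
        and should only appear with a subscript.
    """
    stale = []
    for cond in _IF_COND_RE.findall(code):
        uses_removed = any(
            re.search(r"\b{}\b".format(re.escape(name)), cond)
            for name in removed_inames)
        bare_array = any(
            re.search(r"\b{}\b(?!\s*\[)".format(re.escape(name)), cond)
            for name in ilp_temporaries)
        if uses_removed or bare_array:
            stale.append(cond.strip())
    return stale


# Fragment of the unrolled output reported in this issue.
unr_code = """
    Tcond[2 * k_outer + 0] = T[2 * k_outer + 0] < 1000.0;
    if (Tcond[k])
      cp[2 * k_outer + 0] = 2.0 * T[2 * k_outer + 0];
"""
print(stale_conditions(unr_code, removed_inames=["k"]))  # ['Tcond[k]']
```

In an actual test, `code` would presumably come from `lp.generate_code(knl)` after `lp.split_iname(knl, 'k', 2, inner_tag='unr')`, as in the report, and the test would assert that the returned list is empty.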
