# PyOpenCL: Arrays

## Setup code

In [1]:
import pyopencl as cl
import numpy as np
import numpy.linalg as la

In [2]:
a = np.random.rand(1024, 1024).astype(np.float32)

In [3]:
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

## Creating arrays

This notebook demonstrates working with PyOpenCL's arrays, which provide a friendlier (and more numpy-like) face on OpenCL's buffers. This is the module where they live:

In [4]:
import pyopencl.array

Now transfer `a` to a *device array*.

In [5]:
a_dev = cl.array.to_device(queue, a)

Works like a numpy array! It has `shape`, `dtype`, and `strides`.

In [6]:
a_dev.shape

(1024, 1024)

In [7]:
a_dev.dtype

dtype('float32')

In [8]:
a_dev.strides

(4096, 4)
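
The strides follow from the row-major (C-order) layout of the `float32` host array: stepping one column moves 4 bytes, and stepping one row moves 1024 × 4 = 4096 bytes. A minimal NumPy-only sketch of where these numbers come from (no OpenCL device needed; the variable names are illustrative, not part of PyOpenCL):

```python
import numpy as np

# Same shape and dtype as the array transferred above.
a = np.zeros((1024, 1024), dtype=np.float32)

itemsize = a.dtype.itemsize          # bytes per float32 entry
row_stride = a.shape[1] * itemsize   # bytes to skip to reach the next row
col_stride = itemsize                # bytes to skip to reach the next column

expected_strides = (row_stride, col_stride)
print(a.strides, expected_strides)
```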

## Working with arrays

**Goal:** Double all entries.

In [9]:
twice_a_dev = 2*a_dev

Easy to turn back into a `numpy` array.

In [10]:
twice_a = twice_a_dev.get()

Check!

In [11]:
print(la.norm(twice_a - 2*a))

0.0


Can just `print` the array, too.

In [12]:
print(twice_a_dev)

[[ 0.66639256  0.7587527   0.43297762 ...,  1.19188917  0.63074386
   1.37985754]
 [ 1.91522515  1.33606839  0.52453977 ...,  0.53368247  0.19473593
   1.86292982]
 [ 1.63350916  0.90909183  0.78772193 ...,  1.71718526  0.68106449
   0.75832587]
 ..., 
 [ 1.08252895  0.44018057  0.25621885 ...,  1.16477776  1.79530978
   1.53321719]
 [ 0.93243402  1.64793682  1.62401795 ...,  0.13441022  1.76813459
   1.33698297]
 [ 0.55730426  0.28307205  1.2758038  ...,  0.68192083  1.2898674
   0.97290993]]


----

Easy to evaluate arbitrary (elementwise) expressions.

In [13]:
import pyopencl.clmath

In [14]:
cl.clmath.sin(a_dev)**2 - (1./a_dev) + 5

array([[ 2.10573769,  2.50124764,  0.42696333, ...,  3.63703895,
         1.92534614,  3.95568466],
       [ 4.62456608,  3.88678145,  1.25435662, ...,  1.32198358,
        -5.26086712,  4.57042027],
       [ 4.30697775,  2.99277115,  2.60830212, ...,  4.4082365 ,
         2.17496896,  2.49961734],
       ..., 
       [ 3.41792631,  0.50407267, -2.78950453, ...,  3.58545685,
         4.49730206,  4.1767683 ],
       [ 3.05713558,  4.324893  ,  4.29508495, ..., -9.8753109 ,
         4.46689415,  3.88825035],
       [ 1.48695421, -2.0454402 ,  3.78699446, ...,  2.17892647,
         3.81082869,  3.16286278]], dtype=float32)
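
One way to sanity-check such an expression is to evaluate the same formula on the host with NumPy and compare. The sketch below is NumPy-only; `host_reference` is a hypothetical helper name, and the commented-out comparison assumes `a` and `a_dev` from the cells above, with tolerances that are a guess.

```python
import numpy as np

def host_reference(a):
    """Evaluate sin(a)**2 - 1/a + 5 elementwise on the host."""
    return np.sin(a)**2 - (1. / a) + 5

# Small stand-in input. Entries near zero make 1/a large, which is why
# some entries of the device result above are negative.
a_small = np.array([[np.pi / 2, 0.1]], dtype=np.float32)
ref = host_reference(a_small)
print(ref, ref.dtype)

# With a device available, a comparison could look like this
# (device and host sin may differ slightly, hence the tolerances):
# result_dev = cl.clmath.sin(a_dev)**2 - (1./a_dev) + 5
# assert np.allclose(result_dev.get(), host_reference(a), rtol=1e-4, atol=1e-4)
```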

## Low-level Access

Can still do everything manually though!

In [15]:
prg = cl.Program(ctx, """
    __kernel void twice(__global float *a)
    {
      int gid0 = get_global_id(0);
      int gid1 = get_global_id(1);
      // Row-major flat index; the row length 1024 is hardcoded here.
      int i = gid1 * 1024 + gid0;
      a[i] = 2*a[i];
    }
    """).build()
twice = prg.twice

In [16]:
twice(queue, a_dev.shape, None, a_dev.data)

<pyopencl.cffi_cl.Event at 0x7f434a534c50>

In [17]:
print(la.norm(a_dev.get() - 2*a), la.norm(a))

0.0 591.074


But the hardcoded 1024 is ... inelegant. So fix that!

(Also with argument `dtype` setting.)
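
One possible fix is to pass the row length to the kernel as an extra argument instead of hardcoding it. The sketch below is a host-side NumPy emulation of that index arithmetic, not OpenCL code; `twice_emulated` and `ncols` are hypothetical names. It assumes the global size is given as (columns, rows), so that `gid0` walks along a row.

```python
import numpy as np

def twice_emulated(flat, ncols, global_size):
    """Emulate the kernel body for every work-item in a 2D global size."""
    for gid1 in range(global_size[1]):
        for gid0 in range(global_size[0]):
            i = gid1 * ncols + gid0   # row-major flat index, width passed in
            flat[i] = 2 * flat[i]

# A deliberately non-square array, to show the index no longer depends on 1024.
a = np.arange(12, dtype=np.float32).reshape(3, 4)
buf = a.copy().ravel()                # stand-in for the device buffer

nrows, ncols = a.shape
twice_emulated(buf, ncols, (ncols, nrows))

print(np.array_equal(buf.reshape(a.shape), 2 * a))
```

In an actual PyOpenCL kernel the width would be an extra `int` parameter. A scalar like that has to reach the kernel with a known size, for example as `np.int32(ncols)` or by declaring the argument types once with the kernel's `set_scalar_arg_dtypes`. That is presumably what the remark about `dtype` refers to.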