# PyOpenCL: Arrays

## Setup code

In [1]:
import pyopencl as cl
import numpy as np
import numpy.linalg as la

In [2]:
a = np.random.rand(1024, 1024).astype(np.float32)

In [3]:
ctx = cl.create_some_context(interactive=True)
queue = cl.CommandQueue(ctx)

## Creating arrays

This notebook demonstrates working with PyOpenCL's arrays, which provide a friendlier (and more numpy-like) face on OpenCL's buffers. This is the module where they live:

In [4]:
import pyopencl.array

Now transfer `a` to the device as a *device array*.

In [5]:
a_dev = cl.array.to_device(queue, a)

Works like a numpy array! (`shape`, `dtype`, `strides`)

In [6]:
a_dev.shape

(1024, 1024)

In [7]:
a_dev.dtype

dtype('float32')

In [8]:
a_dev.strides

(4096, 4)
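The strides are byte offsets: one step along an axis skips that many bytes in the underlying buffer. As a minimal sketch, the values above can be reproduced from the shape and the 4-byte size of a `float32`, assuming a row-major (C-contiguous) layout. The helper `c_contiguous_strides` is a made-up name for illustration, not part of PyOpenCL or numpy.

```python
def c_contiguous_strides(shape, itemsize):
    """Byte strides of a row-major (C-contiguous) array with the given shape."""
    strides = []
    step = itemsize  # the last axis moves one element at a time
    for extent in reversed(shape):
        strides.append(step)
        step *= extent  # each earlier axis skips a whole block of the later axes
    return tuple(reversed(strides))


# float32 has 4 bytes per entry, so one row of 1024 entries spans 4096 bytes.
print(c_contiguous_strides((1024, 1024), 4))
```

This prints `(4096, 4)`, matching `a_dev.strides`.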

## Working with arrays

**Goal:** Double all entries.

In [9]:
twice_a_dev = 2*a_dev

Easy to turn back into a `numpy` array.

In [10]:
twice_a = twice_a_dev.get()

Check!

In [11]:
# Check: the device result should match the host computation exactly.

print(la.norm(twice_a - 2*a))

0.0


Can just `print` the array, too.

In [12]:
print(twice_a_dev)

[[1.9884549  1.9853095  0.17914777 ... 1.6878643  1.0856574  0.5063481 ]
 [0.8978882  0.8065957  1.718012   ... 1.2990425  1.3980168  0.02950122]
 [0.9073179  0.20164105 1.8345034  ... 1.777128   1.6947111  1.7179842 ]
 ...
 [1.3738339  1.1683357  0.6579218  ... 0.5115313  1.2634456  1.8362907 ]
 [0.9982358  0.7504688  0.69106174 ... 1.9170644  1.3577774  1.3122113 ]
 [0.7629418  0.7740215  1.8698179  ... 1.3633679  0.09284367 0.79888624]]


----

Easy to evaluate arbitrary (elementwise) expressions.

In [13]:
import pyopencl.clmath

In [14]:
cl.clmath.sin(a_dev)**2 - (1./a_dev) + 5

array([[  4.697005 ,   4.6939726,  -6.155966 , ...,   4.3734713,
          3.4246325,   1.1128874],
       [  2.9609196,   2.6744628,   4.409206 , ...,   3.8261938,
          3.9834414, -62.793587 ],
       [  2.9877703,  -4.9084854,   4.5401173, ...,   4.477024 ,
          4.3816566,   4.4091735],
       ...,
       [  3.9463742,   3.5923214,   2.0644927, ...,   1.1541729,
          3.76576  ,   4.542041 ],
       [  3.2255726,   2.4693146,   2.2206178, ...,   4.6264334,
          3.9212987,   3.8479989],
       [  2.5171647,   2.5585396,   4.5776696, ...,   3.9300725,
        -16.539433 ,   2.6477618]], dtype=float32)
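To see what the expression computes for a single entry, here is a scalar sketch in plain Python using only the `math` module. It assumes that the entry `0.02950122` printed in `twice_a_dev` above (second row, last column) is twice the corresponding entry of `a`. The function name `expr` is made up for illustration.

```python
import math


def expr(x):
    """Scalar version of sin(x)**2 - 1/x + 5, applied to one array entry."""
    return math.sin(x) ** 2 - (1.0 / x) + 5


# Entry of `a` inferred from the printed doubled value 0.02950122 (assumption).
x = 0.02950122 / 2
print(expr(x))
```

The result should agree, up to `float32` rounding, with the `-62.793587` shown in the same position of the output above. Entries of `a` close to zero make the `1/a` term dominate, which explains the large negative values.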

## Low-level Access

Can still do everything manually though!

In [15]:
prg = cl.Program(ctx, """
    __kernel void twice(__global float *a)
    {
      int gid0 = get_global_id(0);
      int gid1 = get_global_id(1);
      int i = gid1 * 1024 + gid0;  // flat index into the buffer; 1024 is the hardcoded row length
      a[i] = 2*a[i];
    }
    """).build()
twice = prg.twice

In [16]:
twice(queue, a_dev.shape, None, a_dev.data)

<pyopencl._cl.Event at 0x7f89ec4fcf50>

In [17]:
print(la.norm(a_dev.get() - 2*a), la.norm(a))

0.0 591.0051


But the hardcoded 1024 is inelegant: the kernel only works for arrays whose rows have exactly 1024 entries. So fix that!

(Also set the argument dtypes, e.g. with `set_scalar_arg_dtypes`.)
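One way this could look, as a sketch rather than the notebook's own solution: the row length becomes a second kernel argument (named `width` here, an assumption), and `set_scalar_arg_dtypes` tells PyOpenCL to pass it as a 32-bit integer. The helper name `double_in_place` is hypothetical, and the code assumes a 2D, C-contiguous `float32` device array.

```python
KERNEL_SRC = """
    __kernel void twice(__global float *a, int width)
    {
      int gid0 = get_global_id(0);   // position within a row
      int gid1 = get_global_id(1);   // row number
      int i = gid1 * width + gid0;   // flat index into the row-major buffer
      a[i] = 2*a[i];
    }
    """


def double_in_place(queue, a_dev):
    """Double every entry of a 2D, C-contiguous float32 device array in place.

    Hypothetical helper for illustration; returns the kernel's event.
    """
    # Imported lazily so that merely defining the helper does not
    # require PyOpenCL to be installed.
    import numpy as np
    import pyopencl as cl

    prg = cl.Program(queue.context, KERNEL_SRC).build()
    twice = prg.twice
    # None marks the buffer argument; `width` is passed as a 32-bit int.
    twice.set_scalar_arg_dtypes([None, np.int32])

    nrows, ncols = a_dev.shape
    # Global size is (ncols, nrows) so that gid0 runs along a row.
    return twice(queue, (ncols, nrows), None, a_dev.data, ncols)
```

With the `queue` and `a_dev` from above, `double_in_place(queue, a_dev).wait()` would double the entries in place. Note that the global size here is `(ncols, nrows)` rather than `a_dev.shape`, which only matters once the array is no longer square.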