# SINGA Core Classes

<img src="http://singa.apache.org/en/_static/images/singav1-sw.png" width="500px"/>

# Device

A device instance represents a hardware device with multiple execution units, e.g.,
* A GPU which has multiple CUDA streams
* A CPU which has multiple threads

All data structures (variables) are allocated on a device instance. Consequently, every operation runs on the device where its operands reside.

## Create a device instance

In [2]:
from singa import device
default_dev = device.get_default_device()
gpu = device.create_cuda_gpu()  # the first gpu device
gpu

<singa.singa_wrap.Device; proxy of <Swig Object of type 'std::shared_ptr< singa::Device > *' at 0x7fcb5c0768d0> >

**NOTE: currently the creation function can be called only once, due to a cnmem restriction.**
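One way to live with the call-once restriction is to cache the created device and reuse it everywhere. The sketch below is only an illustration: `shared_device` and `fake_create_gpu` are hypothetical names, and the stand-in factory is used so the code runs without SINGA or a GPU. With SINGA installed, one would presumably pass `device.create_cuda_gpu` instead.

```python
import functools


@functools.lru_cache(maxsize=None)
def shared_device(factory):
    """Call factory() at most once per factory and reuse the result."""
    return factory()


# Stand-in for device.create_cuda_gpu (assumption: used only for this demo).
calls = []


def fake_create_gpu():
    calls.append(1)
    return object()


first = shared_device(fake_create_gpu)
second = shared_device(fake_create_gpu)
print(first is second, len(calls))  # True 1
```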

In [None]:
gpu = device.create_cuda_gpu_on(1)  # use the gpu device with the specified GPU ID
gpu_list1 = device.create_cuda_gpus(2)  # the first two gpu devices
gpu_list2 = device.create_cuda_gpus([0,2]) # create the gpu instances on the given GPU IDs
opencl_gpu = device.create_opencl_device()  # valid if SINGA is compiled with USE_OPENCL=ON

In [3]:
device.get_num_gpus()

3

In [4]:
device.get_gpu_ids()

(0, 1, 2)
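The query functions above can be combined into a small device-selection helper. This is a sketch under assumptions: `pick_device` is a hypothetical helper, and `fake` is a stub that mimics the four `singa.device` functions used in the cells above so that the example runs without SINGA. In real use the `device` module would be passed in, keeping the call-once note in mind.

```python
from types import SimpleNamespace


def pick_device(dev, preferred_id=0):
    """Return a GPU device if the requested ID is reported, else the default device."""
    if dev.get_num_gpus() > 0 and preferred_id in dev.get_gpu_ids():
        return dev.create_cuda_gpu_on(preferred_id)
    return dev.get_default_device()


# Stub standing in for the singa.device module (assumption, for illustration).
fake = SimpleNamespace(
    get_num_gpus=lambda: 3,
    get_gpu_ids=lambda: (0, 1, 2),
    create_cuda_gpu_on=lambda gpu_id: "gpu:%d" % gpu_id,
    get_default_device=lambda: "host",
)

print(pick_device(fake, 1))  # gpu:1
print(pick_device(fake, 7))  # host
```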

# Tensor

A tensor instance represents a multi-dimensional array allocated on a device instance.
It provides linear algebra operations such as +, -, *, /, dot, pow, etc.

NOTE: class member functions are in-place; global functions are out-of-place.
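The in-place versus out-of-place distinction can be illustrated with NumPy, which follows a similar convention. This is an analogy only, not SINGA code: an in-place update is visible through every reference to the same buffer, while an out-of-place function returns a new array and leaves its input alone.

```python
import numpy as np

a = np.full((2, 3), 1.2, dtype=np.float32)
alias = a                      # second name for the same buffer

a += 1.0                       # in-place: the existing buffer is updated
in_place_visible = bool(alias[0, 0] > 2.0)

out = np.abs(a)                # out-of-place: a new array is returned
shares_memory = bool(np.shares_memory(out, a))

print(in_place_visible, shares_memory)  # True False
```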

### Create tensor instances

In [5]:
from singa import tensor
import numpy as np
a = tensor.Tensor((2, 3))
a.shape

(2, 3)

In [6]:
a.device

<singa.singa_wrap.Device; proxy of <Swig Object of type 'std::shared_ptr< singa::Device > *' at 0x7fcb5c0da4e0> >

In [7]:
gb = tensor.Tensor((2, 3), gpu)

In [8]:
gb.device

<singa.singa_wrap.Device; proxy of <Swig Object of type 'std::shared_ptr< singa::Device > *' at 0x7fcb5c0768d0> >

### Initialize tensor values

In [9]:
a.set_value(1.2)
gb.gaussian(0, 0.1)

### To and from numpy

In [10]:
tensor.to_numpy(a)

array([[ 1.20000005,  1.20000005,  1.20000005],
       [ 1.20000005,  1.20000005,  1.20000005]], dtype=float32)
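The value prints as 1.20000005 rather than 1.2 because the tensor stores 32-bit floats, and 1.2 has no exact float32 representation. A minimal NumPy check of that rounding:

```python
import numpy as np

x32 = np.float32(1.2)          # nearest float32 to 1.2
as_double = float(x32)         # widen to a Python float to see all digits

print(as_double)               # 1.2000000476837158
print(as_double == 1.2)        # False
print(abs(as_double - 1.2) < 1e-7)  # True
```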

In [12]:
tensor.to_numpy(gb)

array([[-0.01349338, -0.20518918,  0.0412962 ],
       [-0.0747437 ,  0.19155163,  0.09417564]], dtype=float32)

In [13]:
c = tensor.from_numpy(np.array([1,2], dtype=np.float32))
c.shape

(2,)

In [14]:
c.copy_from_numpy(np.array([3,4], dtype=np.float32))
tensor.to_numpy(c)

array([ 3.,  4.], dtype=float32)

### Move tensor between devices

In [15]:
gc = c.clone()
gc.to_device(gpu)
gc.device

<singa.singa_wrap.Device; proxy of <Swig Object of type 'std::shared_ptr< singa::Device > *' at 0x7fcb5c0768d0> >

In [22]:
b = gb.clone()
b.to_host()
b.device

<singa.singa_wrap.Device; proxy of <Swig Object of type 'std::shared_ptr< singa::Device > *' at 0x7fcb5c0da4e0> >

### Operations

**NOTE: tensors should be initialized if the operation would read the tensor values**

#### Summary

In [17]:
gb.l1()

0.10340828448534012

In [18]:
a.l2()

0.4898979663848877
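The two printed values are consistent with norms that are averaged over the number of elements: the sum of absolute values divided by the size for `l1()`, and the Euclidean norm divided by the size for `l2()`. The NumPy sketch below reproduces the numbers from the outputs above under that assumption; it is inferred from the printed results, not taken from the SINGA source.

```python
import numpy as np

# Values copied from the to_numpy(gb) output above.
gb = np.array([[-0.01349338, -0.20518918, 0.0412962],
               [-0.0747437, 0.19155163, 0.09417564]])
a = np.full((2, 3), 1.2)

l1 = np.abs(gb).sum() / gb.size          # averaged L1 norm
l2 = np.sqrt((a ** 2).sum()) / a.size    # Euclidean norm divided by size

print(round(float(l1), 6), round(float(l2), 6))  # 0.103408 0.489898
```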

In [19]:
e = tensor.Tensor((2, 3))
e.is_empty()

False

In [21]:
gb.size()

6L

In [33]:
gb.memsize()

24L
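The reported memory size follows from the element count and the element width: 6 float32 values at 4 bytes each give 24 bytes. The same arithmetic in NumPy, as an analogy:

```python
import numpy as np

x = np.zeros((2, 3), dtype=np.float32)
print(x.size, x.itemsize, x.nbytes)  # 6 4 24
```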

In [34]:
c.is_transpose()

False

In [35]:
et=e.T()
et.is_transpose()

True

In [36]:
et.shape

(3L, 2L)

In [37]:
et.ndim()

2L

#### Member functions (in-place)

In [23]:
a += b
tensor.to_numpy(a)

array([[ 1.18650663,  0.99481088,  1.24129629],
       [ 1.1252563 ,  1.39155173,  1.29417574]], dtype=float32)

In [24]:
a -= b
tensor.to_numpy(a)

array([[ 1.20000005,  1.20000005,  1.20000005],
       [ 1.20000005,  1.20000005,  1.20000005]], dtype=float32)

In [25]:
a *= 2
tensor.to_numpy(a)

array([[ 2.4000001,  2.4000001,  2.4000001],
       [ 2.4000001,  2.4000001,  2.4000001]], dtype=float32)

In [26]:
a /= 3
tensor.to_numpy(a)

array([[ 0.80000007,  0.80000007,  0.80000007],
       [ 0.80000007,  0.80000007,  0.80000007]], dtype=float32)

In [31]:
d = tensor.Tensor((3,))
d.uniform(-1,1)
tensor.to_numpy(d)

array([ 0.67001712, -0.7460264 ,  0.93773556], dtype=float32)

In [32]:
a.add_row(d)
tensor.to_numpy(a)

array([[ 1.47001719,  0.05397367,  1.73773563],
       [ 1.47001719,  0.05397367,  1.73773563]], dtype=float32)
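The output above matches adding the vector `d` to every row of `a`. A NumPy sketch of the same computation, using the values printed in the previous cells (broadcasting plays the role of `add_row` here):

```python
import numpy as np

a = np.full((2, 3), 0.80000007, dtype=np.float32)
d = np.array([0.67001712, -0.7460264, 0.93773556], dtype=np.float32)

a += d        # d has shape (3,), so it is added to each of the 2 rows in place
print(a)
```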

#### Global functions (out-of-place)

**Unary functions**

In [35]:
h = tensor.sign(d)
tensor.to_numpy(h)

array([ 1., -1.,  1.], dtype=float32)

In [41]:
tensor.to_numpy(d)

array([ 0.67001712, -0.7460264 ,  0.93773556], dtype=float32)

In [37]:
h = tensor.abs(d)
tensor.to_numpy(h)

array([ 0.67001712,  0.7460264 ,  0.93773556], dtype=float32)

In [38]:
h = tensor.relu(d)
tensor.to_numpy(h)

array([ 0.67001712,  0.        ,  0.93773556], dtype=float32)
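For comparison, the three unary results above can be reproduced with NumPy on the same values of `d`. This is an analogy rather than SINGA code; like the global functions, each call returns a new array and leaves `d` unchanged.

```python
import numpy as np

d = np.array([0.67001712, -0.7460264, 0.93773556], dtype=np.float32)
d_before = d.copy()

sign = np.sign(d)            # -1, 0 or 1 per element
absolute = np.abs(d)         # element-wise magnitude
relu = np.maximum(d, 0)      # negative entries clipped to 0

print(sign, absolute, relu)
```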

In [39]:
g = tensor.sum(a, 0)
g.shape

(3L,)

In [40]:
g = tensor.sum(a, 1)
g.shape

(2L,)
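The shapes above show that the second argument selects the axis that is summed away: axis 0 collapses the rows of a (2, 3) tensor and leaves 3 values, axis 1 collapses the columns and leaves 2. NumPy behaves the same way, which gives a quick check:

```python
import numpy as np

a = np.ones((2, 3), dtype=np.float32)

col_sums = np.sum(a, axis=0)   # one value per column
row_sums = np.sum(a, axis=1)   # one value per row

print(col_sums.shape, row_sums.shape)  # (3,) (2,)
```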

In [45]:
tensor.bernoulli(0.5, g)
tensor.to_numpy(g)

array([ 1.,  0.,  0.], dtype=float32)

In [46]:
tensor.gaussian(0, 0.2, g)
tensor.to_numpy(g)

array([ 0.09801694, -0.14860021,  0.28697565], dtype=float32)

**Binary functions**

In [27]:
f = a + b
tensor.to_numpy(f)

array([[ 0.78650671,  0.5948109 ,  0.84129626],
       [ 0.72525638,  0.9915517 ,  0.89417571]], dtype=float32)

In [33]:
g = a < b
tensor.to_numpy(g)

array([[ 0.,  0.,  0.],
       [ 0.,  1.,  0.]], dtype=float32)
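Note that the comparison returns a float tensor of 0s and 1s rather than booleans. A NumPy sketch reproducing the mask above, using the values of `a` (after `add_row`) and `b` printed earlier:

```python
import numpy as np

a = np.array([[1.47001719, 0.05397367, 1.73773563],
              [1.47001719, 0.05397367, 1.73773563]], dtype=np.float32)
b = np.array([[-0.01349338, -0.20518918, 0.0412962],
              [-0.0747437, 0.19155163, 0.09417564]], dtype=np.float32)

g = (a < b).astype(np.float32)   # 1.0 where a < b, else 0.0
print(g)
```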

In [47]:
tensor.add_column(2, c, 1, f)   # f = 2*c + 1*f
tensor.to_numpy(f)


array([[ 6.78650665,  6.59481096,  6.8412962 ],
       [ 8.72525597,  8.9915514 ,  8.89417553]], dtype=float32)
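Here the vector `c` is scaled and added to every column of `f`, so entry `c[i]` contributes to the whole of row `i`. A NumPy sketch with the values printed in the earlier cells reproduces the result:

```python
import numpy as np

c = np.array([3, 4], dtype=np.float32)
f = np.array([[0.78650671, 0.5948109, 0.84129626],
              [0.72525638, 0.9915517, 0.89417571]], dtype=np.float32)

# c[:, np.newaxis] has shape (2, 1), so it broadcasts across the 3 columns.
f = 2 * c[:, np.newaxis] + 1 * f
print(f)
```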

#### BLAS

In [49]:
tensor.axpy(2, a, f)  # f = 2*a + f
tensor.to_numpy(b)

array([[-0.01349338, -0.20518918,  0.0412962 ],
       [-0.0747437 ,  0.19155163,  0.09417564]], dtype=float32)
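The cell above prints `b`, which `axpy` does not modify; the updated values live in `f`. As a reminder of what the operation computes, here is a minimal NumPy sketch of the standard BLAS axpy update, y = alpha*x + y, applied in place. The `axpy` function below is a hypothetical stand-in, not the SINGA implementation.

```python
import numpy as np


def axpy(alpha, x, y):
    """In-place update y <- alpha * x + y (hypothetical NumPy stand-in)."""
    y += alpha * x
    return y


x = np.array([1.0, 2.0, 3.0])
y = np.array([10.0, 20.0, 30.0])
y_alias = y

axpy(2, x, y)
print(y)  # [12. 24. 36.]
```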

In [50]:
f = tensor.mult(a, b.T())
tensor.to_numpy(f)

array([[ 0.04085157,  0.0641166 ],
       [ 0.04085157,  0.0641166 ]], dtype=float32)

In [51]:
tensor.mult(a, b.T(), f, 2, 1)  # f = 2*a*b.T() + 1*f
tensor.to_numpy(f)

array([[ 0.12255471,  0.19234982],
       [ 0.12255471,  0.19234982]], dtype=float32)
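The two `mult` calls follow the usual GEMM pattern, C = alpha*A*B + beta*C. The NumPy sketch below reproduces both printed results from the values of `a` and `b` shown earlier. The `gemm` helper is hypothetical; its argument order simply mirrors the call in the cell above.

```python
import numpy as np


def gemm(a, b, c=None, alpha=1.0, beta=0.0):
    """Return alpha * a @ b, or update c in place to alpha * a @ b + beta * c."""
    prod = alpha * (a @ b)
    if c is None:
        return prod
    c[...] = prod + beta * c
    return c


a = np.array([[1.47001719, 0.05397367, 1.73773563],
              [1.47001719, 0.05397367, 1.73773563]])
b = np.array([[-0.01349338, -0.20518918, 0.0412962],
              [-0.0747437, 0.19155163, 0.09417564]])

f = gemm(a, b.T)               # plain matrix product, shape (2, 2)
first = f.copy()
gemm(a, b.T, f, 2, 1)          # f = 2 * a @ b.T + 1 * f, i.e. three times the product
print(first)
print(f)
```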