# Basic usage

The primary object in Bolt is the Bolt array. We can construct these arrays with familiar constructors (like `zeros` and `ones`) or from an existing array, and manipulate them like ndarrays in either local or distributed settings. This notebook highlights the core functionality, much of which hides the underlying implementation (by design!); see the other tutorials for what's going on under the hood and for more advanced usage.

## Local array

The local Bolt array is just like a NumPy array, and we can construct it without any special arguments.

In [1]:
from bolt import ones
a = ones((2,3,4))

In [2]:
a.shape

(2, 3, 4)

The local array is basically a wrapper for a NumPy array, so we can write applications against the `BoltArray` and have them work in either local or distributed settings. As such, it has the usual NumPy functionality.

In [3]:
a.sum()

24.0

In [4]:
a.transpose(2,1,0).shape

(4, 3, 2)

The `toarray` method always returns the underlying array

In [6]:
a.sum(axis=0).toarray()

array([[ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.]])
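To make the wrapper idea concrete, here is a minimal sketch in plain NumPy. The class name `LocalWrapper` and its methods are hypothetical and written only for illustration; this is not Bolt's actual class, just one way a thin wrapper could delegate to an ndarray and expose it again through `toarray`.

```python
import numpy as np


class LocalWrapper:
    """Hypothetical stand-in for a local array wrapper; not Bolt's actual class."""

    def __init__(self, values):
        self._values = np.asarray(values)

    @property
    def shape(self):
        return self._values.shape

    def sum(self, axis=None):
        # Delegate to NumPy and re-wrap the result (a simplification:
        # this sketch re-wraps even when the result is a scalar).
        return LocalWrapper(self._values.sum(axis=axis))

    def transpose(self, *axes):
        # Delegate to NumPy and re-wrap, so calls can be chained.
        return LocalWrapper(self._values.transpose(*axes))

    def toarray(self):
        # Hand back the plain ndarray underneath the wrapper.
        return self._values


w = LocalWrapper(np.ones((2, 3, 4)))
summed = w.sum(axis=0).toarray()
print(summed.shape)  # (3, 4)
```

Because every method returns another wrapper, calls chain the same way as in the cells above, and only `toarray` leaves the wrapper type.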

## Distributed array

To construct Bolt arrays backed by other engines, like Spark, we just add additional arguments to the constructor. For Spark, we pass a `SparkContext` (the cells below use `sc`, which is assumed to already exist in the session)

In [7]:
b = ones((2, 3, 4), sc)

In [8]:
b.shape

(2, 3, 4)

We can also construct from an existing local array

In [9]:
from numpy import arange
x = arange(2*3*4).reshape(2, 3, 4)

In [10]:
from bolt import array
b = array(x, sc)

In [11]:
b.shape

(2, 3, 4)

## Array operations

We can use many of the `ndarray` operations we're familiar with, including aggregations along axes

In [12]:
b.sum()

276

In [13]:
b.sum(axis=0).toarray()

array([[12, 14, 16, 18],
       [20, 22, 24, 26],
       [28, 30, 32, 34]])

In [14]:
b.max(axis=(0,1)).toarray()

array([20, 21, 22, 23])

indexing with either slices or integer lists

In [15]:
b[:,:,0:2].shape

(2, 3, 2)

In [16]:
b[0,0:2,0:2].toarray()

array([[0, 1],
       [4, 5]])

In [17]:
b[[0,1],[0,1],[0,1]].toarray()

array([ 0, 17])
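The outputs above match what plain NumPy returns for the same index expressions. The sketch below reproduces them on a local ndarray to show the difference between slices and integer lists; it uses only NumPy and says nothing about how Bolt implements indexing on a distributed array.

```python
import numpy as np

x = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Slices select a rectangular block along each axis.
block = x[0, 0:2, 0:2]

# Integer lists are paired up element by element: positions (0, 0, 0) and (1, 1, 1).
paired = x[[0, 1], [0, 1], [0, 1]]

# To take every combination of the listed indices instead, np.ix_ builds an open mesh.
outer = x[np.ix_([0, 1], [0, 1], [0, 1])]

print(block.tolist())   # [[0, 1], [4, 5]]
print(paired.tolist())  # [0, 17]
print(outer.shape)      # (2, 2, 2)
```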

and reshaping, squeezing, and transposing

In [18]:
b.shape

(2, 3, 4)

In [20]:
b.reshape(2, 4, 3).shape

(2, 4, 3)

In [21]:
b[:,:,0:1].squeeze().shape

(2, 3)

In [22]:
b.transpose(2, 1, 0).shape

(4, 3, 2)

## Functional operators

The Bolt array also supports functional-style operations, like `map`, `reduce`, and `filter`. We can use `map` to apply functions in parallel

In [24]:
a = ones((2, 3, 4), sc)

In [25]:
a.map(lambda x: x * 2).toarray()

array([[[ 2.,  2.,  2.,  2.],
        [ 2.,  2.,  2.,  2.],
        [ 2.,  2.,  2.,  2.]],

       [[ 2.,  2.,  2.,  2.],
        [ 2.,  2.,  2.,  2.],
        [ 2.,  2.,  2.,  2.]]])

If we map over the 0th axis with the `sum` function, we are taking the sum of each of 2 arrays, each 3x4

In [26]:
a.map(lambda x: x.sum(), axis=(0,)).toarray()

array([ 12.,  12.])

If we instead map over the 0th and 1st axes, we are taking the sum of each of 2x3 = 6 arrays, each of size 4

In [27]:
a.map(lambda x: x.sum(), axis=(0,1)).toarray()

array([[ 4.,  4.,  4.],
       [ 4.,  4.,  4.]])
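The role of the `axis` argument can be sketched with a serial NumPy loop. The helper `map_over_axes` below is hypothetical and written only for illustration; it assumes `axis` names the leading axes in order, and it is not how Bolt executes `map` (which runs in parallel on the backing engine).

```python
import numpy as np


def map_over_axes(values, func, axis=(0,)):
    """Apply func to each subarray indexed by the given leading axes.

    Illustrative only: a serial NumPy loop, not Bolt's implementation.
    Assumes `axis` names the leading axes in order, e.g. (0,) or (0, 1).
    """
    key_shape = tuple(values.shape[i] for i in axis)
    # np.ndindex enumerates every key, i.e. every index along the mapped axes.
    results = [func(values[key]) for key in np.ndindex(*key_shape)]
    out = np.asarray(results)
    # Restore the key dimensions in front of whatever shape func returned.
    return out.reshape(key_shape + out.shape[1:])


data = np.ones((2, 3, 4))
by_axis0 = map_over_axes(data, lambda v: v.sum(), axis=(0,))
by_axis01 = map_over_axes(data, lambda v: v.sum(), axis=(0, 1))
doubled = map_over_axes(data, lambda v: v * 2, axis=(0,))

print(by_axis0.tolist())  # [12.0, 12.0]
print(by_axis01.shape)    # (2, 3)
print(doubled.shape)      # (2, 3, 4)
```

The mapped axes act as keys and the remaining axes form the subarray passed to the function, which is why the output shape is the key shape followed by the shape of whatever the function returns.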

And we can chain these functional operations alongside array operations

In [28]:
a.map(lambda x: x * 2, axis=(0,)).sum(axis=(0,1)).toarray()

array([ 12.,  12.,  12.,  12.])

This makes it easy to write distributed applications that leverage array manipulations (like transposing and reshaping) but also apply arbitrary parallelized operations at scale.
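The `reduce` and `filter` operators mentioned at the start of this section are not demonstrated above. As a rough local analogue, and assuming they treat the subarrays along axis 0 as the records to combine or select, their effect can be sketched with NumPy and the standard library. This is an assumption about the semantics, not a use of Bolt's API.

```python
from functools import reduce

import numpy as np

x = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Treat axis 0 as the axis of records: two subarrays, each of shape (3, 4).
records = list(x)

# reduce: combine the records pairwise with a binary function.
total = reduce(lambda a, b: a + b, records)

# filter: keep only the records for which a predicate holds, then restack.
kept = np.stack([r for r in records if r.sum() > 100])

print(total.shape)  # (3, 4)
print(kept.shape)   # (1, 3, 4)
```

Here the pairwise sum gives the same result as `x.sum(axis=0)`, and only the second record (whose elements sum to 210) passes the predicate.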