# Introduction to `ArrayFire.jl`

`ArrayFire.jl` is a Julia package for easy GPU computing. It wraps the `arrayfire` library.

## What's GPU computing?
GPU computing is a new frontier of scientific computing. Scientists and engineers can accelerate their code by using special pieces of hardware on their systems called accelerators, of which GPUs are the most common. `ArrayFire.jl` lets you harness the power of the GPU on your system.

`ArrayFire.jl` has several advantages:

* Versatile library with accelerated kernels
* Easy Julian interface 
* Applications can easily be accelerated with little or no code changes

This is a basic tutorial on how to use the package, and a gentle introduction to the API. 

First let's load the library.

In [1]:
using ArrayFire

Get some basic information about the device hardware you're using and the `ArrayFire` version.

In [2]:
AFInfo()

ArrayFire v3.3.2 (CUDA, 64-bit Linux, build f65dd97)
Platform: CUDA Toolkit 7.5, Driver: 352.93
[0] GRID K520, 4096 MB, CUDA Compute 3.0


## Creating Arrays on the GPU

Create an array in Julia. This array is stored in host (CPU) memory.

In [3]:
a = rand(10,10)

10x10 Array{Float64,2}:
 0.636734    0.0076107  0.153878  …  0.180846   0.21106    0.264255
 0.344502    0.0167777  0.217612     0.864883   0.109131   0.914492
 0.934276    0.440163   0.627557     0.327908   0.451161   0.269221
 0.777876    0.196442   0.520001     0.110411   0.0929359  0.321133
 0.239102    0.249981   0.474862     0.17257    0.873454   0.96188 
 0.45416     0.102521   0.650484  …  0.701419   0.762665   0.478366
 0.579615    0.531518   0.813988     0.284322   0.288925   0.782168
 0.127449    0.881841   0.573872     0.649853   0.20246    0.409499
 0.00716678  0.141725   0.369035     0.0264311  0.234529   0.350385
 0.773948    0.0635373  0.287991     0.605099   0.371044   0.257828

Let us now transfer this to the GPU. Arrays on the GPU are represented by the `AFArray` type. Call its constructor on the Julia `Array`.

In [6]:
ad = AFArray(a)

10x10 ArrayFire.AFArray{Float64,2}:
 0.636734    0.0076107  0.153878  …  0.180846   0.21106    0.264255
 0.344502    0.0167777  0.217612     0.864883   0.109131   0.914492
 0.934276    0.440163   0.627557     0.327908   0.451161   0.269221
 0.777876    0.196442   0.520001     0.110411   0.0929359  0.321133
 0.239102    0.249981   0.474862     0.17257    0.873454   0.96188 
 0.45416     0.102521   0.650484  …  0.701419   0.762665   0.478366
 0.579615    0.531518   0.813988     0.284322   0.288925   0.782168
 0.127449    0.881841   0.573872     0.649853   0.20246    0.409499
 0.00716678  0.141725   0.369035     0.0264311  0.234529   0.350385
 0.773948    0.0635373  0.287991     0.605099   0.371044   0.257828

_**Note**: You can see the contents of the GPU array here because the notebook implicitly copies it from device to host in order to display it. This happens only for interactive display and does not occur in a script, so real applications do not perform these unnecessary transfers._

You can also generate random numbers directly on the GPU.

In [12]:
bd = rand(AFArray{Float64}, 10, 10)

10x10 ArrayFire.AFArray{Float64,2}:
 0.462508  0.930486   0.00522666  0.434718   …  0.515017   0.773266  0.85788 
 0.567271  0.248069   0.987286    0.729502      0.124347   0.449687  0.408228
 0.939965  0.156515   0.564372    0.100356      0.571643   0.367537  0.194908
 0.976486  0.446689   0.389929    0.732725      0.842081   0.86828   0.202522
 0.241424  0.384523   0.639661    0.905463      0.0294814  0.659839  0.343075
 0.178102  0.968442   0.971868    0.761931   …  0.589618   0.641351  0.26227 
 0.125846  0.0673005  0.642645    0.114108      0.52555    0.740961  0.893598
 0.427675  0.114542   0.734931    0.711423      0.299936   0.287126  0.305872
 0.858605  0.616594   0.883164    0.0877872     0.647183   0.548211  0.807914
 0.913409  0.486554   0.616011    0.384302      0.465308   0.633093  0.339871

Let us now transfer this back to the CPU by calling the `Array` constructor.

In [13]:
b = Array(bd)

10x10 Array{Float64,2}:
 0.462508  0.930486   0.00522666  0.434718   …  0.515017   0.773266  0.85788 
 0.567271  0.248069   0.987286    0.729502      0.124347   0.449687  0.408228
 0.939965  0.156515   0.564372    0.100356      0.571643   0.367537  0.194908
 0.976486  0.446689   0.389929    0.732725      0.842081   0.86828   0.202522
 0.241424  0.384523   0.639661    0.905463      0.0294814  0.659839  0.343075
 0.178102  0.968442   0.971868    0.761931   …  0.589618   0.641351  0.26227 
 0.125846  0.0673005  0.642645    0.114108      0.52555    0.740961  0.893598
 0.427675  0.114542   0.734931    0.711423      0.299936   0.287126  0.305872
 0.858605  0.616594   0.883164    0.0877872     0.647183   0.548211  0.807914
 0.913409  0.486554   0.616011    0.384302      0.465308   0.633093  0.339871
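The two transfers shown so far can be combined into a full round trip. The following is a minimal sketch that uses only calls appearing in this notebook (`AFArray`, `Array`, and scalar arithmetic on an `AFArray`); the variable names `rd` and `r` are illustrative. It assumes a working ArrayFire installation and the same Julia version as this notebook.

```julia
using ArrayFire

a  = rand(10, 10)       # ordinary Julia array in host (CPU) memory
ad = AFArray(a)         # host -> device copy

rd = (ad * 5) / 10      # computed on the device; the result stays there

r  = Array(rd)          # explicit device -> host copy
size(r)                 # same shape as the original array
```

In a script, the only device-to-host transfer is the explicit `Array(rd)` call, so it pays to keep intermediate results on the device and copy back only the final result.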

## Simple Operations

`ArrayFire.jl` is designed to mimic Base Julia, so if you're familiar with Julia's function interfaces, the API should feel familiar. Feel free to step through the following functions to get comfortable with it. For a list of supported functions, check the [README](https://github.com/JuliaComputing/ArrayFire.jl).
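Because the interface follows Base Julia, a function written without any reference to where the data lives can often be applied to both kinds of array. The sketch below illustrates this under stated assumptions: `halve` is a hypothetical helper, not part of `ArrayFire.jl`, and it uses only scalar multiplication and division, which this notebook applies to an `AFArray`.

```julia
using ArrayFire

# Hypothetical helper: written once, with no mention of CPU or GPU.
halve(x) = (x * 5) / 10

a  = rand(10, 10)     # host array
ad = AFArray(a)       # the same values on the device

h_cpu = halve(a)      # runs on the CPU and returns an Array
h_gpu = halve(ad)     # runs on the device and returns an AFArray
```

Whether a given function carries over unchanged depends on whether every operation it uses is supported for `AFArray`.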

### Arithmetic Operations

In [14]:
ad + 1

10x10 ArrayFire.AFArray{Float64,2}:
 1.63673  1.00761  1.15388  1.48084  …  1.9736   1.18085  1.21106  1.26425
 1.3445   1.01678  1.21761  1.06082     1.97606  1.86488  1.10913  1.91449
 1.93428  1.44016  1.62756  1.90554     1.85823  1.32791  1.45116  1.26922
 1.77788  1.19644  1.52     1.03838     1.0259   1.11041  1.09294  1.32113
 1.2391   1.24998  1.47486  1.53754     1.17901  1.17257  1.87345  1.96188
 1.45416  1.10252  1.65048  1.02703  …  1.53327  1.70142  1.76266  1.47837
 1.57961  1.53152  1.81399  1.14047     1.15112  1.28432  1.28893  1.78217
 1.12745  1.88184  1.57387  1.22987     1.7147   1.64985  1.20246  1.4095 
 1.00717  1.14173  1.36903  1.93489     1.56938  1.02643  1.23453  1.35038
 1.77395  1.06354  1.28799  1.54113     1.66664  1.6051   1.37104  1.25783

In [15]:
(ad * 5) / 10 

10x10 ArrayFire.AFArray{Float64,2}:
 0.318367    0.00380535  0.0769392  …  0.0904232  0.10553    0.132127
 0.172251    0.00838885  0.108806      0.432442   0.0545654  0.457246
 0.467138    0.220082    0.313779      0.163954   0.225581   0.134611
 0.388938    0.0982212   0.26          0.0552055  0.0464679  0.160567
 0.119551    0.12499     0.237431      0.0862848  0.436727   0.48094 
 0.22708     0.0512604   0.325242   …  0.35071    0.381332   0.239183
 0.289807    0.265759    0.406994      0.142161   0.144463   0.391084
 0.0637244   0.44092     0.286936      0.324927   0.10123    0.204749
 0.00358339  0.0708626   0.184517      0.0132155  0.117265   0.175192
 0.386974    0.0317687   0.143996      0.30255    0.185522   0.128914

In [16]:
sin(ad)

10x10 ArrayFire.AFArray{Float64,2}:
 0.594573    0.00761062  0.153272  …  0.179862  0.209496   0.26119 
 0.337728    0.0167769   0.215899     0.761019  0.108914   0.792253
 0.804169    0.426087    0.587169     0.322063  0.436011   0.265981
 0.701768    0.195181    0.496881     0.110187  0.0928022  0.315642
 0.23683     0.247385    0.457216     0.171714  0.766552   0.820268
 0.438708    0.102341    0.605572  …  0.645303  0.69085    0.46033 
 0.547702    0.506842    0.727031     0.280507  0.284922   0.704819
 0.127104    0.77191     0.542888     0.605069  0.20108    0.39815 
 0.00716672  0.141251    0.360715     0.026428  0.232385   0.343259
 0.698964    0.0634946   0.284027     0.568844  0.362588   0.254981

### Logical Operations

In [17]:
cd = ad .> bd


10x10 ArrayFire.AFArray{Bool,2}:
  true  false   true   true   true   true   true  false  false  false
 false  false  false  false   true   true   true   true  false   true
 false   true   true   true  false  false   true  false   true   true
 false  false   true  false   true   true  false  false  false   true
 false  false  false  false  false  false   true   true   true   true
  true  false  false  false  false  false   true   true   true   true
  true   true   true   true   true   true  false  false  false  false
 false   true  false  false   true  false  false   true  false   true
 false  false  false   true   true   true   true  false  false  false
 false  false  false   true   true   true   true   true  false  false

In [18]:
any_trues = any(cd)

true

### Indexing



In [19]:
ad[:,1]

10-element ArrayFire.AFArray{Float64,1}:
 0.636734  
 0.344502  
 0.934276  
 0.777876  
 0.239102  
 0.45416   
 0.579615  
 0.127449  
 0.00716678
 0.773948  

In [20]:
ad[1,:]

1x10 ArrayFire.AFArray{Float64,2}:
 0.636734  0.0076107  0.153878  0.480845  …  0.180846  0.21106  0.264255

In [22]:
ad[1:5, 2:3]

5x2 ArrayFire.AFArray{Float64,2}:
 0.0076107  0.153878
 0.0167777  0.217612
 0.440163   0.627557
 0.196442   0.520001
 0.249981   0.474862

### Reduction Operations

In [24]:
total_max = maximum(ad)


0.9942394049078851

In [25]:
colwise_min = min(ad,2)

10x10 ArrayFire.AFArray{Float64,2}:
 0.636734    0.0076107  0.153878  …  0.180846   0.21106    0.264255
 0.344502    0.0167777  0.217612     0.864883   0.109131   0.914492
 0.934276    0.440163   0.627557     0.327908   0.451161   0.269221
 0.777876    0.196442   0.520001     0.110411   0.0929359  0.321133
 0.239102    0.249981   0.474862     0.17257    0.873454   0.96188 
 0.45416     0.102521   0.650484  …  0.701419   0.762665   0.478366
 0.579615    0.531518   0.813988     0.284322   0.288925   0.782168
 0.127449    0.881841   0.573872     0.649853   0.20246    0.409499
 0.00716678  0.141725   0.369035     0.0264311  0.234529   0.350385
 0.773948    0.0635373  0.287991     0.605099   0.371044   0.257828

_**Note**: The result of `min(ad,2)` shown here has the same shape and the same visible values as `ad`. With a scalar second argument, `min` in this session appears to compare each element against `2` rather than reduce along dimension 2, and every entry of `ad` is below 2. A reduction along one dimension would return a single row or column instead of a 10x10 array._

### Matrix Operations and Linear Algebra

In [26]:
det(ad)

-0.03985866824160378

In [27]:
svd(ad)

(
10x10 ArrayFire.AFArray{Float64,2}:
 -0.299489   0.444634  -0.289965  …  -0.39613    -0.387945   -0.0257082 
 -0.340431  -0.381718  -0.443663      0.355136    0.198787    0.106268  
 -0.372733   0.372349  -0.161684      0.470919    0.342402    0.143832  
 -0.25131    0.130864   0.362914      0.0623559  -0.148956    0.587221  
 -0.324876  -0.252147   0.466873     -0.319245    0.307875    0.211964  
 -0.250856  -0.420285  -0.257703  …   0.033381   -0.52596     0.00558102
 -0.326813  -0.209837   0.413863      0.19289    -0.0826142  -0.64043   
 -0.303083  -0.294321  -0.133419     -0.398968    0.0274025   0.182482  
 -0.306177   0.258473   0.256826      0.273639   -0.400135   -0.0464752 
 -0.362551   0.251279  -0.147686     -0.342074    0.362561   -0.363263  ,

10-element ArrayFire.AFArray{Float64,1}:
 4.72363  
 1.38802  
 1.26264  
 1.2211   
 1.01625  
 0.902777 
 0.490787 
 0.332489 
 0.282946 
 0.0930822,
10x10 ArrayFire.AFArray{Float64,2}:
 -0.329016   -0.176931  -0.308992  …  -0.2

In [28]:
lu(ad)

(
10x10 ArrayFire.AFArray{Float64,2}:
 1.0          0.0        0.0        0.0       …  0.0         0.0       0.0
 0.136414     1.0        0.0        0.0          0.0         0.0       0.0
 0.486109    -0.135613   1.0        0.0          0.0         0.0       0.0
 0.00767094   0.168349   0.685122   1.0          0.0         0.0       0.0
 0.368737    -0.177084   0.176545  -0.155386     0.0         0.0       0.0
 0.832597    -0.206908   0.239347  -0.505505  …  0.0         0.0       0.0
 0.681527    -0.355772  -0.243194  -0.165175     0.0         0.0       0.0
 0.828394    -0.366382  -0.128711  -0.187094     1.0         0.0       0.0
 0.255922     0.167113   0.565208   0.433892     0.336735    1.0       0.0
 0.620389     0.314488   0.658602  -0.162395     0.0385726  -0.275473  1.0,

10x10 ArrayFire.AFArray{Float64,2}:
 0.934276  0.440163  0.627557   0.905536  …   0.327908   0.451161   0.269221 
 0.0       0.821796  0.488264   0.106343      0.605122   0.140915   0.372773 
 0.0       0.0    

### FFTs

In [29]:
fast_fourier = fft(ad)

10x10 ArrayFire.AFArray{Complex{Float64},2}:
          45.4876+0.0im         -3.77385+1.20825im   …   -3.77385-1.20825im 
         0.677466+0.263855im    -3.04935-1.40941im       -5.49283-2.25709im 
        -0.673062-1.04755im      1.65264+2.37748im      -0.467699-4.99881im 
         -1.44153+0.0289776im    1.72119+1.01415im      -0.773862+2.27636im 
         -2.46323+2.56445im     -0.80606+0.955454im      0.877059+1.26833im 
 1.53647+1.30104e-16im         -0.106327-3.37468im   …  -0.106327+3.37468im 
         -2.46323-2.56445im     0.877059-1.26833im       -0.80606-0.955454im
         -1.44153-0.0289776im  -0.773862-2.27636im        1.72119-1.01415im 
        -0.673062+1.04755im    -0.467699+4.99881im        1.65264-2.37748im 
         0.677466-0.263855im    -5.49283+2.25709im       -3.04935+1.40941im 

## Backends

ArrayFire allows you to change backends at runtime. This gives ArrayFire tremendous versatility: the same code can run on a variety of backends and therefore on a wide range of devices.

Run the following command to see which backend you're currently using:

In [30]:
getActiveBackend()

CUDA Backend


What are the available backends on this system?

In [31]:
getAvailableBackends()

CPU, CUDA and OpenCL


This ArrayFire installation was built with all three backends. Let us now try switching backends.

In [32]:
setBackend(AF_BACKEND_CPU)

true

In [33]:
getActiveBackend()

CPU Backend
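
Arrays created after a switch live on the newly selected backend. The following is a minimal sketch that uses only calls shown in this notebook and assumes the installation includes the CPU backend; `xd` and `x` are illustrative names. In the underlying ArrayFire library an array belongs to the backend on which it was created, so the sketch creates a fresh array after switching instead of reusing `ad` or `bd`.

```julia
using ArrayFire

setBackend(AF_BACKEND_CPU)          # select the CPU backend
getActiveBackend()                  # confirm which backend is active

xd = rand(AFArray{Float64}, 4, 4)   # created on the active (CPU) backend
x  = Array(sin(xd))                 # compute on that backend, then copy to a Julia Array
```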
