# Linear Algebra Overview

In [1]:
%load ../../rapaio-bootstrap.ipynb

Adding dependency io.github.padreati:rapaio-lib:7.0.0
Solving dependencies
Resolved artifacts count: 4
Add to classpath: /home/ati/work/rapaio-jupyter-kernel/target/mima_cache/io/github/padreati/rapaio-lib/7.0.0/rapaio-lib-7.0.0.jar
Add to classpath: /home/ati/work/rapaio-jupyter-kernel/target/mima_cache/io/github/padreati/rapaio-code-gen/7.0.0/rapaio-code-gen-7.0.0.jar
Add to classpath: /home/ati/work/rapaio-jupyter-kernel/target/mima_cache/org/antlr/ST4/4.3.4/ST4-4.3.4.jar
Add to classpath: /home/ati/work/rapaio-jupyter-kernel/target/mima_cache/org/antlr/antlr-runtime/3.5.3/antlr-runtime-3.5.3.jar

In [2]:
WS.getPrinter().withOptions(floatFormat(Format.floatFlexShort()));

rapaio.printer.standard.StandardPrinter@1d0ceafb

`Rapaio` is committed to offering a rich set of tools for manipulating linear algebra objects and structures.
Since version 7.0.0, DArrays are available and replace the older custom implementations. A DArray is an n-dimensional array that stores data elements of the same type, indexed by dimensions.
The current implementation offers only dense strided arrays, but the design allows for future implementations such as sparse arrays.

Besides standard data manipulation tools, DArrays offer some non-trivial operations, as well as matrix decompositions and linear solvers. There are implementations for four numerical data types: `byte`, `int`, `float` and `double`.
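
To make the idea of a dense strided array more concrete, here is a minimal plain-Java sketch. It is not Rapaio's implementation; the class `StrideSketch` and its methods are hypothetical names used only for illustration. All elements live in one flat array, and a multi-dimensional index is mapped to a flat position as the offset plus the sum of each index multiplied by the stride of its dimension.

```java
import java.util.Arrays;

// Hypothetical illustration of a dense strided layout; not Rapaio code.
class StrideSketch {

    // Row-major strides for a shape: the last dimension varies fastest.
    static int[] rowMajorStrides(int[] shape) {
        int[] strides = new int[shape.length];
        int step = 1;
        for (int i = shape.length - 1; i >= 0; i--) {
            strides[i] = step;
            step *= shape[i];
        }
        return strides;
    }

    // Flat position of a multi-dimensional index: offset + sum(index[i] * strides[i]).
    static int flatIndex(int offset, int[] strides, int[] index) {
        int pos = offset;
        for (int i = 0; i < index.length; i++) {
            pos += index[i] * strides[i];
        }
        return pos;
    }

    public static void main(String[] args) {
        int[] shape = {3, 3};
        int[] strides = rowMajorStrides(shape);
        System.out.println(Arrays.toString(strides));                // [3, 1]
        System.out.println(flatIndex(0, strides, new int[] {1, 2})); // 5
    }
}
```

The outputs printed later in this notebook, such as `BaseStride{DOUBLE,[3, 3],0,[3, 1]}`, appear to list the data type, the shape, the offset and the strides in this order. That reading is an assumption based on the printed values, not something stated by the library documentation quoted here.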

## Shape of a DArray

DArray elements are indexed by dimensions. You can think of a DArray as a hypercube of data elements, with one element at each position of the hypercube. A DArray can have 0, 1, 2 or more dimensions, and each dimension has a size. The dimensions and their sizes are described by a `Shape` object.
The total number of elements is the product of all dimension sizes, or 1 if there are no dimensions.

In [3]:
// a shape with no dimensions
display(Shape.of());

// a shape with 1 dimension / a vector of size 1
display(Shape.of(1));

// Both shapes are of size 1, but they are different shapes
display(Shape.of().size());
display(Shape.of(1).size());

Shape: []

Shape: [1]

1

1
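
The size rule can be sketched in a few lines of plain Java. The helper below is hypothetical and only mirrors the rule described above (the empty product is 1); it does not use or reproduce Rapaio's `Shape` class.

```java
// Hypothetical helper mirroring the size rule; not Rapaio code.
class ShapeSizeSketch {

    // Product of all dimension sizes; with no dimensions the result is 1 (a scalar).
    static int size(int... dims) {
        int size = 1;
        for (int d : dims) {
            size *= d;
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(size());        // 1, no dimensions
        System.out.println(size(1));       // 1, a vector with one element
        System.out.println(size(2, 3, 4)); // 24
    }
}
```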

A DArray with no dimensions is a scalar and can be created with:

In [4]:
DArrayManager am = DArrayManager.base();

In [5]:
var scalar = am.scalar(DType.DOUBLE, 1.);
display(scalar.shape());
display(scalar)

Shape: []

BaseStride{DOUBLE,[],0,[]}
 1  


A DArray with a single dimension is a vector, one with two dimensions is a matrix and, in general, one with more than two dimensions is called a tensor. For example:

In [6]:
am.seq(DType.FLOAT, Shape.of(100)).printFullContent()

[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 ] 


DArrays can have multiple implementations. The implementation details of a DArray are governed by a `DArrayManager`, which determines the storage factory, the implementation of the operations, and various parameters. To make things easier, there is a default manager, obtained above with `DArrayManager.base()`, which uses Java arrays and no vectorization.

Since the default manager uses Java arrays as data storage, it is possible to wrap an array into a DArray of the appropriate type and use DArray operations to change the values in the array. For example, we create an array of double values and a DArray wrapping it, then change the values of the array through DArray methods.

In [7]:
double[] array = new double[] {1., 2., 3.};
// the default manager wraps the double array because the requested type is DType.DOUBLE
var t = am.stride(DType.DOUBLE, array);
t.log1p_();

for(double a : array) {
    System.out.println(a);
}

0.6931471805599453
1.0986122886681096
1.3862943611198906


In [8]:
// for other types, pass the matching DType; here a float array is wrapped
float[] floatArray = new float[] {1.f, 2, 3, 4};
var floatTensor = am.stride(DType.FLOAT, floatArray);
floatTensor.sqr_();
for(float f : floatArray) {
    System.out.println(f);
}

1.0
4.0
9.0
16.0


In [9]:
// using the wrong type will not wrap the array, but will copy the data
DArray<Double> doubleArray = am.stride(DType.DOUBLE, floatArray);
doubleArray.sqrt_();
for(float f : floatArray) {
    System.out.println(f);
}
// we can see that the values remain unchanged

1.0
4.0
9.0
16.0
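
The difference between wrapping and copying shown in the three cells above can be sketched in plain Java. This is an illustration under stated assumptions, not Rapaio's code: `WrapSketch`, `DoubleView`, `wrap` and `copyOf` are hypothetical names. A view that keeps a reference to the caller's array makes in-place operations visible through that array, while a type conversion forces a copy and leaves the source untouched.

```java
// Hypothetical sketch of wrap-versus-copy semantics; not Rapaio code.
class WrapSketch {

    // A view that keeps a reference to the array it was given (no copy).
    static final class DoubleView {
        private final double[] data;

        DoubleView(double[] data) {
            this.data = data;
        }

        // In-place log(1 + x) on every element.
        void log1pInPlace() {
            for (int i = 0; i < data.length; i++) {
                data[i] = Math.log1p(data[i]);
            }
        }

        // In-place square root on every element.
        void sqrtInPlace() {
            for (int i = 0; i < data.length; i++) {
                data[i] = Math.sqrt(data[i]);
            }
        }

        double get(int i) {
            return data[i];
        }
    }

    // Same element type: the array is wrapped directly.
    static DoubleView wrap(double[] values) {
        return new DoubleView(values);
    }

    // Different element type: a conversion is needed, so the data is copied.
    static DoubleView copyOf(float[] values) {
        double[] copy = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            copy[i] = values[i];
        }
        return new DoubleView(copy);
    }

    public static void main(String[] args) {
        double[] array = {1., 2., 3.};
        wrap(array).log1pInPlace();
        System.out.println(array[0]);  // the original array was modified

        float[] floats = {1f, 4f, 9f, 16f};
        DoubleView view = copyOf(floats);
        view.sqrtInPlace();
        System.out.println(floats[1]); // the original array is unchanged
        System.out.println(view.get(1));
    }
}
```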


In [10]:
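// 3x3 identity matrix
DArray<Double> m = am.eye(DType.DOUBLE, 3);
Random random = new Random(42);
// random vector with 3 elements
DArray<Double> u = am.random(DType.DOUBLE, Shape.of(3), random);
// inner product of u with a unit vector taken from m; the result is the last element of u
display(u.inner(m.selsq(0, 2)));
u.printString();
m.printContent();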

-0.49948335406718436

BaseStride{DOUBLE,[3],0,[1]}
[ 0.605 0.477 -0.499 ] 
[[ 1 0 0 ]  
 [ 0 1 0 ]  
 [ 0 0 1 ]] 


In [11]:
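// transpose of m times m; for the identity matrix the result is again the identity
DArray<Double> mtm = m.t_().mm(m);
mtm.printContent();
// determinant computed from the LU decomposition
mtm.lu().det();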

[[ 1 0 0 ]  
 [ 0 1 0 ]  
 [ 0 0 1 ]] 


1.0

In [12]:
// the product of the singular values equals the absolute value of the determinant
mtm.svd().singularValues().prod();

1.0

In [13]:
mtm.qr().q()

BaseStride{DOUBLE,[3, 3],0,[3, 1]}
[[ -1  0  0 ]  
 [  0 -1  0 ]  
 [  0  0 -1 ]] 


In [14]:
mtm.qr().r()

BaseStride{DOUBLE,[3, 3],0,[3, 1]}
[[ -1  0  0 ]  
 [  0 -1  0 ]  
 [  0  0 -1 ]] 


In [15]:
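// the product of the diagonal of R equals the determinant only up to sign, since det(Q) is +1 or -1
mtm.qr().r().diag().prod()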

-1.0
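
The cells above obtain the determinant of the same matrix in three ways. The product of the singular values gives its absolute value, and the product of the diagonal of R gives it up to the sign of det(Q); here Q is the negated identity, so det(Q) = -1 and det(Q) * det(R) = 1, matching the LU result. As a minimal sketch of the LU route, the hypothetical class below computes a determinant by Gaussian elimination with partial pivoting. It is a plain-Java illustration and makes no claim about how Rapaio implements `lu().det()`.

```java
// Hypothetical determinant via elimination with partial pivoting; not Rapaio code.
class DetSketch {

    static double det(double[][] input) {
        int n = input.length;
        // Work on a copy so the caller's matrix is not modified.
        double[][] a = new double[n][];
        for (int i = 0; i < n; i++) {
            a[i] = input[i].clone();
        }
        double det = 1.0;
        for (int col = 0; col < n; col++) {
            // Pick the row with the largest absolute value in this column.
            int pivot = col;
            for (int row = col + 1; row < n; row++) {
                if (Math.abs(a[row][col]) > Math.abs(a[pivot][col])) {
                    pivot = row;
                }
            }
            if (a[pivot][col] == 0.0) {
                return 0.0; // singular matrix
            }
            if (pivot != col) {
                double[] tmp = a[pivot];
                a[pivot] = a[col];
                a[col] = tmp;
                det = -det; // a row swap flips the sign
            }
            det *= a[col][col];
            // Eliminate the entries below the pivot.
            for (int row = col + 1; row < n; row++) {
                double factor = a[row][col] / a[col][col];
                for (int k = col + 1; k < n; k++) {
                    a[row][k] -= factor * a[col][k];
                }
            }
        }
        return det;
    }

    public static void main(String[] args) {
        double[][] eye = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
        double[][] negEye = {{-1, 0, 0}, {0, -1, 0}, {0, 0, -1}};
        System.out.println(det(eye));                  // 1.0
        System.out.println(det(negEye));               // -1.0
        System.out.println(det(negEye) * det(negEye)); // 1.0
    }
}
```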

In [16]:
var x = am.random(DType.DOUBLE, Shape.of(4, 4), random)

In [17]:
var y = am.random(DType.DOUBLE, Shape.of(4), random)

In [18]:
// element-wise product with broadcasting: column j of x is multiplied by y[j]
x.mul(y)

BaseStride{DOUBLE,[4, 4],0,[4, 1]}
[[ -0.131  0.29   0.233 -0.068 ]  
 [ -0.132 -0.062  0.14   0.283 ]  
 [ -0.035  0.457 -0.052 -0.187 ]  
 [  0.053 -0.548  0.168 -0.191 ]] 


In [19]:
// here row i of x is multiplied by y[i]
x.mul(y.stretch(1))

BaseStride{DOUBLE,[4, 4],0,[4, 1]}
[[ -0.131  0.094  0.287 -0.074 ]  
 [ -0.404 -0.062  0.531  0.95  ]  
 [ -0.029  0.121 -0.052 -0.166 ]  
 [  0.048 -0.163  0.19  -0.191 ]] 


In [20]:
// same result as x.mul(y): column j of x is multiplied by y[j]
x.mul(y.stretch(0))

BaseStride{DOUBLE,[4, 4],0,[4, 1]}
[[ -0.131  0.29   0.233 -0.068 ]  
 [ -0.132 -0.062  0.14   0.283 ]  
 [ -0.035  0.457 -0.052 -0.187 ]  
 [  0.053 -0.548  0.168 -0.191 ]] 
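
Judging from the printed values, `x.mul(y)` and `x.mul(y.stretch(0))` scale each column of `x` by the matching element of `y`, while `x.mul(y.stretch(1))` scales each row. This is a reading of the outputs above, not a statement from the library documentation. The two behaviours can be sketched in plain Java as follows; `BroadcastSketch` and its methods are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical sketch of the two broadcasting directions; not Rapaio code.
class BroadcastSketch {

    // y acts as a row of shape [1, n]: column j is multiplied by y[j].
    static double[][] scaleColumns(double[][] x, double[] y) {
        double[][] out = new double[x.length][];
        for (int i = 0; i < x.length; i++) {
            out[i] = new double[x[i].length];
            for (int j = 0; j < x[i].length; j++) {
                out[i][j] = x[i][j] * y[j];
            }
        }
        return out;
    }

    // y acts as a column of shape [n, 1]: row i is multiplied by y[i].
    static double[][] scaleRows(double[][] x, double[] y) {
        double[][] out = new double[x.length][];
        for (int i = 0; i < x.length; i++) {
            out[i] = new double[x[i].length];
            for (int j = 0; j < x[i].length; j++) {
                out[i][j] = x[i][j] * y[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 2}, {3, 4}};
        double[] y = {10, 100};
        System.out.println(Arrays.deepToString(scaleColumns(x, y))); // [[10.0, 200.0], [30.0, 400.0]]
        System.out.println(Arrays.deepToString(scaleRows(x, y)));    // [[10.0, 20.0], [300.0, 400.0]]
    }
}
```

In both cases the diagonal of the result is the same, which matches the outputs above: the diagonal entries of the three printed matrices are identical.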


In [42]:
// load the first 4 numeric features of the iris dataset into a matrix
var x = Datasets.loadIrisDataset().mapVars(VarRange.of("0~3")).darray();
// compute the mean of each column (reduce along axis 0)
var m = x.mean1d(0);
// compute the standard deviation of each column (reduce along axis 0)
var std = x.std1d(0, 0);
// create a matrix which is the standardized version of x
var s = x.sub(m).div(std);
// select the first 10 rows of standardized values
s.sel(0, IntArrays.newSeq(10))

BaseStride{DOUBLE,[10, 4],0,[4, 1]}
[[ -0.901  1.019 -1.34  -1.315 ]  
 [ -1.143 -0.132 -1.34  -1.315 ]  
 [ -1.385  0.328 -1.397 -1.315 ]  
 [ -1.507  0.098 -1.283 -1.315 ]  
 [ -1.022  1.249 -1.34  -1.315 ]  
 [ -0.537  1.94  -1.17  -1.052 ]  
 [ -1.507  0.789 -1.34  -1.184 ]  
 [ -1.022  0.789 -1.283 -1.315 ]  
 [ -1.749 -0.362 -1.34  -1.315 ]  
 [ -1.143  0.098 -1.283 -1.447 ]] 


In [41]:
s.mean1d(0)

BaseStride{DOUBLE,[4],0,[1]}
[ -0 0 -0 -0 ] 


In [31]:
var std = x.std1d(0, 1);
std

BaseStride{DOUBLE,[4],0,[1]}
[ 0.828 0.436 1.765 0.762 ] 
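
The two calls `x.std1d(0, 0)` and `x.std1d(0, 1)` produce slightly different values, which is consistent with the second argument being a degrees-of-freedom correction (divide by n versus n - 1). That interpretation is an assumption drawn from the outputs. Under that assumption, column-wise standardization can be sketched in plain Java as below; `StandardizeSketch` and its methods are hypothetical names and do not reproduce Rapaio's code.

```java
import java.util.Arrays;

// Hypothetical column-wise standardization; not Rapaio code.
class StandardizeSketch {

    // Mean of each column.
    static double[] columnMeans(double[][] x) {
        int cols = x[0].length;
        double[] means = new double[cols];
        for (double[] row : x) {
            for (int j = 0; j < cols; j++) {
                means[j] += row[j];
            }
        }
        for (int j = 0; j < cols; j++) {
            means[j] /= x.length;
        }
        return means;
    }

    // Standard deviation of each column; ddof is subtracted from the row count (assumed meaning).
    static double[] columnStd(double[][] x, int ddof) {
        double[] means = columnMeans(x);
        int cols = x[0].length;
        double[] std = new double[cols];
        for (double[] row : x) {
            for (int j = 0; j < cols; j++) {
                double d = row[j] - means[j];
                std[j] += d * d;
            }
        }
        for (int j = 0; j < cols; j++) {
            std[j] = Math.sqrt(std[j] / (x.length - ddof));
        }
        return std;
    }

    // Subtract the column mean and divide by the column standard deviation.
    static double[][] standardize(double[][] x, int ddof) {
        double[] means = columnMeans(x);
        double[] std = columnStd(x, ddof);
        double[][] s = new double[x.length][x[0].length];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < x[i].length; j++) {
                s[i][j] = (x[i][j] - means[j]) / std[j];
            }
        }
        return s;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 10}, {2, 20}, {3, 30}};
        System.out.println(Arrays.toString(columnStd(x, 1)));       // [1.0, 10.0]
        System.out.println(Arrays.deepToString(standardize(x, 1)));
        // The column means of the standardized matrix are zero up to rounding.
        System.out.println(Arrays.toString(columnMeans(standardize(x, 0))));
    }
}
```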


In [32]:
x.sub(m).div(std)

BaseStride{DOUBLE,[150, 4],0,[4, 1]}
[[  -0.898  1.016 -1.336 -1.311 ]   
 [  -1.139 -0.132 -1.336 -1.311 ]   
 [  -1.381  0.327 -1.392 -1.311 ]   
 [  -1.501  0.098 -1.279 -1.311 ]   
 [  -1.018  1.245 -1.336 -1.311 ]   
 [  -0.535  1.933 -1.166 -1.049 ]   
 [  -1.501  0.786 -1.336 -1.18  ]   
 [  -1.018  0.786 -1.279 -1.311 ]   
 [  -1.743 -0.361 -1.336 -1.311 ]   
 [  -1.139  0.098 -1.279 -1.442 ]   
 [  -0.535  1.474 -1.279 -1.311 ]   
 [  -1.26   0.786 -1.222 -1.311 ]   
 [  -1.26  -0.132 -1.336 -1.442 ]   
 [  -1.864 -0.132 -1.506 -1.442 ]   
 [  -0.052  2.163 -1.449 -1.311 ]   
 [  -0.173  3.08  -1.279 -1.049 ]   
 [  -0.535  1.933 -1.392 -1.049 ]   
 [  -0.898  1.016 -1.336 -1.18  ]   
 [  -0.173  1.704 -1.166 -1.18  ]   
 [  -0.898  1.704 -1.279 -1.18  ]   
 [  -0.535  0.786 -1.166 -1.311 ]   
 [  -0.898  1.474 -1.279 -1.049 ]   
 [  -1.501  1.245 -1.562 -1.311 ]   
 [  -0.898  0.557 -1.166 -0.917 ]   
 [  -1.26   0.786 -1.053 -1.311 ]   
 [  -1.018 -0.132 -1.222 -1.311 ]   
 