# The Meataxe64 Package, Core Functionality

In [3]:
LoadPackage("meataxe64");; Read("../gap/bench.g"); LoadPackage("jupyterviz");;

## $100\, 000 \times 100\, 000$ matrices over $\mathbb{F}_2$

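The main target of the package is large computations. Here we generate a random matrix, write it to disk, square it with the file-based multi-threaded multiply, and read the result back.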

In [4]:
m := MTX64_RandomMat(MTX64_FiniteField(2),100000,100000);

< matrix 100000x100000 : <MTX64 GF(2)>>

In [7]:
MTX64_WriteMatrix(m,"a");;
ShowBench(MTX64_fMultiply, ".", "a", "a", "b");
p := MTX64_ReadMatrix("b");

wall time: 63.3s cpu time: 595s memory allocated: 72B no result returned


< matrix 100000x100000 : <MTX64 GF(2)>>

## Gaussian Elimination

Gaussian elimination is our other key primitive operation. To see properly what it does, we need a singular matrix.

We take the Kronecker (tensor) product of two rectangular matrices. 

If $A$ is $n\times m$ and $B$ is $m\times n$ with $m < n$ then $A\otimes B$ will have rank at most $m^2$.
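The bound follows from the standard fact that rank is multiplicative under the Kronecker product. As a short worked step (the equality is a general property of $\otimes$, not something specific to this package):

$$\operatorname{rank}(A\otimes B) = \operatorname{rank}(A)\,\operatorname{rank}(B) \le m\cdot m = m^2$$

For the example that follows, $n = 200$ and $m = 99$, so the $19800\times 19800$ product has rank at most $99^2 = 9801$. The bound is attained exactly when both random factors have full rank $99$, which is likely but not guaranteed.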

In [87]:
f := MTX64_FiniteField(9);
m1 := RandomMat(200,99,GF(9));;
m2 := RandomMat(99,200,GF(9));;
m := MTX64_Matrix(KroneckerProduct(m1,m2));

<MTX64 GF(3^2)>

< matrix 19800x19800 : <MTX64 GF(3^2)>>

Our basic Gaussian elimination operation, applied to a matrix $A$, computes $M$, $K$, $R$, $\gamma$ and $\rho$ satisfying: 

$$\pmatrix{M&0\cr K & 1} \rho A \gamma = \pmatrix{-1&R\cr0&0}$$ 

where $\gamma$ and $\rho$ are permutations that effectively select the pivot columns and pivot rows of $A$. 

Using this, we can compute inverses, solve systems of equations, determine nullspaces, etc. efficiently.
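As a sketch of how two of these follow from the formula, treat $\rho$ and $\gamma$ as permutation matrices (an assumption made for illustration; how the package stores them is not shown here). The bottom block row gives vectors annihilating $A$ from the left, and when $A$ is invertible the blocks $K$ and $R$ are empty, so the top block row gives the inverse:

$$\pmatrix{K & 1}\rho A = 0, \qquad M\rho A\gamma = -1 \;\Longrightarrow\; A^{-1} = -\gamma M \rho$$

The rows of $\pmatrix{K & 1}\rho$ are linearly independent because of the identity block, so for an $n\times n$ matrix of rank $r$ they form a basis of the $(n-r)$-dimensional left nullspace.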

In [92]:
ech := fail;; # suppress a warning.
ShowBench(function() ech := MTX64_Echelize(m);end); 
ech.multiplier; ech.cleaner; ech.remnant;  # M, K and R in the above formula

wall time: 11.5s cpu time: 11.2s memory allocated: 747.81MB no result returned


< matrix 9801x9801 : <MTX64 GF(3^2)>>

< matrix 9999x9801 : <MTX64 GF(3^2)>>

< matrix 9801x9999 : <MTX64 GF(3^2)>>

We can compare this runtime to multiplication. If $m$ had full rank, we would expect the two to take about the same time; since $m$ has lower rank, the Gaussian elimination is actually faster.

In [93]:
ShowBench(\*,m,m);

wall time: 24.6s cpu time: 24.4s memory allocated: 186.95MB result returned


We can also use the multi-threaded version of the Gaussian elimination:

In [95]:
MTX64_WriteMatrix(m, "a"); 
ShowBench(MTX64_fEchelize, ".", "a", "gamma", "rho", "m", "k", "r");

wall time: 4.59s cpu time: 45.8s memory allocated: 144B result returned


true

## Run-time versus matrix size

We set the field and maximum dimension and make a set of random square matrices of different sizes. With `maxdim` set to 5000, the 16 dimensions run from 312 to 4992 in steps of 312:

In [98]:
q := 625;; maxdim := 5000;; 
sizes := List([1..16], i-> i*QuoInt(maxdim, 16));;

In [99]:
mats := List(sizes, i-> MTX64_RandomMat(MTX64_FiniteField(q), i, i));;

And look at the timing for squaring them:

In [103]:
fsq := function(m) MTX64_WriteMatrix(m, "a"); MTX64_fMultiply(".", "a", "a", "b"); end;;
marks1 := List(mats, x-> BenchMark(\*,x,x));;
marksm := List(mats, x-> BenchMark(fsq,x));;
Plot(
[sizes,List(marks1, x-> x.cpu), rec(name := "single-threaded")],
[sizes,List(marksm, x-> x.cpu), rec(name := "multi-threaded CPU", 
title := "Meataxe64 runtimes for matrix multiply", xaxis := "Dimension", yaxis := "ms")],
[sizes,List(marksm, x-> QuoInt(x.wall,10^6)), rec(name := "multi-threaded wall time")]
);
