# `python-graphblas`: What's Happening in Python
### Erik Welch & Jim Kitchen
**@HPEC2022 GraphBLAS BoF, September 20**

## Summary of last year

- **Completeness**
  - cover *all* of SuiteSparse:GraphBLAS with natural syntax
  - if we missed anything, we want to know!
- **Efficiency**
  - interact efficiently with external libraries such as `numpy`, `scipy.sparse`, `networkx`
  - make it easy to write performant GraphBLAS code, and difficult to *accidentally* write inefficient code
- **User experience**
  - pragmatism over purity: our API has evolved to be friendlier and easier
  - *many* improvements and enhancements in the last year
  - add extra functionality in Python (often via `numba`) needed for workloads
- **Better documentation**
  - documentation website and improving docstrings
- **Community outreach**
  - partnering with NetworkX
  - connecting with more people and projects in the PyData/scientific Python community
  - Goal: to be more sustainable

### Shout-outs:
- **@ParticularMiner** for extensive work on `dask-graphblas` (being renamed from `dask-grblas`)
  - `dask-graphblas` is for distributed GraphBLAS in Python via Dask
  - Nearly complete coverage of GraphBLAS functionality
  - Still catching up to our changes from the last year
  - A few algorithms written
- `graphblas-algorithms` for collecting algorithms in Python
  - implements NetworkX API, but may implement more
  - best examples of idiomatic `python-graphblas` code

## Some bookkeeping
- Renamed `grblas` to `python-graphblas`
  - install via `conda install -c conda-forge python-graphblas`
  - or install via `pip install python-graphblas`
  - use as `import graphblas as gb`
- Switched to calendar versioning
  - versions follow `YYYY.M.X` (e.g., `2022.9.0`), where `X` is a counter that starts at 0
- 13 releases since HPEC 2021
  - https://github.com/python-graphblas/python-graphblas/releases

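The `YYYY.M.X` scheme above is easy to get wrong when versions are compared as plain strings. Below is a minimal sketch of parsing and comparing such versions; `CalVer` and `parse_calver` are hypothetical helper names for illustration and are not part of `python-graphblas`.

```python
from typing import NamedTuple


class CalVer(NamedTuple):
    """A parsed `YYYY.M.X` version (hypothetical helper for illustration)."""

    year: int
    month: int
    counter: int


def parse_calver(version: str) -> CalVer:
    """Split a `YYYY.M.X` string into integer fields and sanity-check the month."""
    year, month, counter = (int(part) for part in version.split("."))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {version!r}")
    return CalVer(year, month, counter)


# Tuples compare field by field, so October sorts after September here,
# whereas the raw strings "2022.9.0" and "2022.10.1" would sort the other way.
print(parse_calver("2022.9.0"))
print(parse_calver("2022.9.0") < parse_calver("2022.10.1"))
```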
### *`python-graphblas` is much more than just a C wrapper*

## Weekly community call
- Every Wednesday at 9am CT
- https://github.com/python-graphblas/python-graphblas/issues/247
- We're friendly and want to hear from you!
- Reasons to join:
  - to say hi!
  - to get help using or learning `python-graphblas`
  - to hear why Erik thinks the TACO format for sparse tensors got it wrong
  - to learn how you (or one of your students) can help :)

In [1]:
import graphblas as gb
import numpy as np

### Record (i.e., struct) UDT

In [2]:
A = gb.Matrix({"x": int, "y": float}, nrows=5, ncols=5)
A[:, :] = (1, 2)
A[0, 0] = {"x": 10, "y": 20}
A

| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 25 | 5 | 5 | {'x': INT64, 'y': FP64} | fullr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | (10, 20.0) | (1, 2.0) | (1, 2.0) | (1, 2.0) | (1, 2.0) |
| 1 | (1, 2.0) | (1, 2.0) | (1, 2.0) | (1, 2.0) | (1, 2.0) |
| 2 | (1, 2.0) | (1, 2.0) | (1, 2.0) | (1, 2.0) | (1, 2.0) |
| 3 | (1, 2.0) | (1, 2.0) | (1, 2.0) | (1, 2.0) | (1, 2.0) |
| 4 | (1, 2.0) | (1, 2.0) | (1, 2.0) | (1, 2.0) | (1, 2.0) |


In [3]:
B = A.apply(lambda v: v.x).new()
B

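| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 25 | 5 | 5 | INT64 | fullr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 10 | 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 1 | 1 | 1 | 1 | 1 |
| 3 | 1 | 1 | 1 | 1 | 1 |
| 4 | 1 | 1 | 1 | 1 | 1 |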

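To make the field extraction in the cell above concrete, here is a plain-Python sketch of the same idea. It assumes a sparse matrix can be modeled as a dict keyed by `(row, col)`; the `Record` type and the `apply` helper are illustrative names, and this is not how `python-graphblas` implements `apply` internally.

```python
from typing import Callable, Dict, NamedTuple, Tuple


class Record(NamedTuple):
    """Stand-in for the {"x": int, "y": float} record dtype."""

    x: int
    y: float


Position = Tuple[int, int]

# Model a sparse matrix as {(row, col): value}; only stored entries appear.
A: Dict[Position, Record] = {(i, j): Record(1, 2.0) for i in range(5) for j in range(5)}
A[0, 0] = Record(10, 20.0)


def apply(matrix: Dict[Position, Record], func: Callable) -> dict:
    """Apply a unary function to every stored value, keeping the structure."""
    return {pos: func(val) for pos, val in matrix.items()}


# Extract the "x" field of every record, as `lambda v: v.x` does above.
B = apply(A, lambda v: v.x)
print(B[0, 0], B[1, 1], len(B))
```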

In [4]:
A << A.select("tril")
A

| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 15 | 5 | 5 | {'x': INT64, 'y': FP64} | bitmapr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | (10, 20.0) | | | | |
| 1 | (1, 2.0) | (1, 2.0) | | | |
| 2 | (1, 2.0) | (1, 2.0) | (1, 2.0) | | |
| 3 | (1, 2.0) | (1, 2.0) | (1, 2.0) | (1, 2.0) | |
| 4 | (1, 2.0) | (1, 2.0) | (1, 2.0) | (1, 2.0) | (1, 2.0) |


In [5]:
B << gb.select.triu(B, -1)
B

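| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 19 | 5 | 5 | INT64 | bitmapr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 10 | 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | | 1 | 1 | 1 | 1 |
| 3 | | | 1 | 1 | 1 |
| 4 | | | | 1 | 1 |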

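The two selections above keep entries based on their offset from the diagonal. As a rough sketch of that rule (assuming the usual convention that `tril` keeps `j - i <= k` and `triu` keeps `j - i >= k`, and modeling the matrix as a dict keyed by `(row, col)`), the entry counts of 15 and 19 can be reproduced in plain Python. The function names mirror the operators but are local illustrations, not library calls.

```python
def tril(matrix, k=0):
    """Keep entries on or below the k-th diagonal (column - row <= k)."""
    return {(i, j): v for (i, j), v in matrix.items() if j - i <= k}


def triu(matrix, k=0):
    """Keep entries on or above the k-th diagonal (column - row >= k)."""
    return {(i, j): v for (i, j), v in matrix.items() if j - i >= k}


# A full 5x5 matrix modeled as {(row, col): value}.
full = {(i, j): 1 for i in range(5) for j in range(5)}

lower = tril(full)       # main diagonal and everything below it
upper = triu(full, -1)   # first subdiagonal and everything above it
print(len(lower), len(upper))
```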

In [6]:
gb.op.first(A & B).new()

| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 9 | 5 | 5 | {'x': INT64, 'y': FP64} | bitmapr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | (10, 20.0) | | | | |
| 1 | (1, 2.0) | (1, 2.0) | | | |
| 2 | | (1, 2.0) | (1, 2.0) | | |
| 3 | | | (1, 2.0) | (1, 2.0) | |
| 4 | | | | (1, 2.0) | (1, 2.0) |


In [7]:
B << gb.op.second(A & B)
B

| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 9 | 5 | 5 | INT64 | bitmapr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 10 | | | | |
| 1 | 1 | 1 | | | |
| 2 | | 1 | 1 | | |
| 3 | | | 1 | 1 | |
| 4 | | | | 1 | 1 |

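In the two cells above, `A & B` combines only the positions present in both operands, and `first` / `second` choose which operand supplies the value. A minimal sketch of that intersection semantics follows, using a dict keyed by `(row, col)` as the matrix model. The helper `ewise_mult` and the local `first` / `second` functions are illustrative stand-ins, not the library's implementation.

```python
def ewise_mult(left, right, op):
    """Element-wise combine over the intersection of stored positions."""
    return {pos: op(left[pos], right[pos]) for pos in left.keys() & right.keys()}


def first(x, y):
    """Binary operator that returns its left argument."""
    return x


def second(x, y):
    """Binary operator that returns its right argument."""
    return y


# A: lower triangle holding (x, y) records; B: triu(-1) pattern holding ints.
A = {(i, j): (1, 2.0) for i in range(5) for j in range(i + 1)}
A[0, 0] = (10, 20.0)
B = {(i, j): 1 for i in range(5) for j in range(5) if j - i >= -1}
B[0, 0] = 10

from_a = ewise_mult(A, B, first)    # values taken from A
from_b = ewise_mult(A, B, second)   # values taken from B
print(len(from_a), from_a[0, 0], from_b[0, 0])
```

Only the main diagonal and the first subdiagonal survive, which is why both results have 9 entries.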

In [8]:
B << gb.op.positionj(B)
B

| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 9 | 5 | 5 | INT64 | bitmapr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0 | | | | |
| 1 | 0 | 1 | | | |
| 2 | | 1 | 2 | | |
| 3 | | | 2 | 3 | |
| 4 | | | | 3 | 4 |


In [9]:
B("+", B < 2) << 10
B

| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 9 | 5 | 5 | INT64 | bitmapr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 10 | | | | |
| 1 | 10 | 11 | | | |
| 2 | | 11 | 2 | | |
| 3 | | | 2 | 3 | |
| 4 | | | | 3 | 4 |

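The expression `B("+", B < 2) << 10` combines a value mask with an accumulator. The sketch below models that step in plain Python under two assumptions: the matrix is a dict keyed by `(row, col)`, and only mask entries whose value is true allow an update. `masked_accum_assign` is a hypothetical name chosen for this illustration.

```python
import operator


def masked_accum_assign(matrix, mask, scalar, accum):
    """Assign `scalar` where the mask is true, accumulating into existing values."""
    out = dict(matrix)
    for pos, allowed in mask.items():
        if allowed:
            # Existing entries are combined with the accumulator; new ones are set.
            out[pos] = accum(out[pos], scalar) if pos in out else scalar
    return out


# B after `positionj`: every stored value equals its column index.
B = {(0, 0): 0, (1, 0): 0, (1, 1): 1, (2, 1): 1, (2, 2): 2,
     (3, 2): 2, (3, 3): 3, (4, 3): 3, (4, 4): 4}

# `B < 2` yields a boolean matrix with the same structure as B.
mask = {pos: val < 2 for pos, val in B.items()}

B = masked_accum_assign(B, mask, 10, operator.add)
print(B[0, 0], B[1, 1], B[2, 2])
```

Entries with values 0 and 1 become 10 and 11, while entries outside the mask keep their old values.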

### "compactify" and "selectk"

In [10]:
B.ss.compactify()

| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 9 | 5 | 2 | INT64 | hypercsr |

| | 0 | 1 |
|---|---|---|
| 0 | 10 | |
| 1 | 10 | 11 |
| 2 | 11 | 2 |
| 3 | 2 | 3 |
| 4 | 3 | 4 |


In [11]:
B.ss.compactify(how="last", k=1)

| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 5 | 5 | 1 | INT64 | hypercsr |

| | 0 |
|---|---|
| 0 | 10 |
| 1 | 11 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |

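Judging from the two outputs above, `compactify` shifts the stored values of each row to the left, optionally keeping only `k` of them. The sketch below is a plain-Python model inferred from those outputs, not the library's implementation. It covers only `how="first"` and `how="last"`, and represents the matrix as `{row: {col: value}}`.

```python
def compactify(rows, how="first", k=None):
    """Shift each row's stored values to the left, keeping at most k of them.

    `rows` maps row index -> {column index: value}. The result maps
    row index -> list of values, i.e. new column positions 0, 1, 2, ...
    """
    if how not in ("first", "last"):
        raise ValueError(f"unsupported how={how!r} in this sketch")
    out = {}
    for r, cols in rows.items():
        vals = [cols[c] for c in sorted(cols)]  # values in column order
        if k is not None:
            vals = vals[:k] if how == "first" else vals[max(len(vals) - k, 0):]
        out[r] = vals
    return out


# The matrix B from the cells above, stored row by row.
rows = {0: {0: 10}, 1: {0: 10, 1: 11}, 2: {1: 11, 2: 2}, 3: {2: 2, 3: 3}, 4: {3: 3, 4: 4}}

print(compactify(rows))
print(compactify(rows, how="last", k=1))
```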

In [12]:
B.ss.compactify(how="random", k=1)

| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 5 | 5 | 1 | INT64 | hypercsr |

| | 0 |
|---|---|
| 0 | 10 |
| 1 | 11 |
| 2 | 11 |
| 3 | 2 |
| 4 | 4 |


In [13]:
B.ss.selectk("first", 1)

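| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 5 | 5 | 5 | INT64 | hypercsr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 10 | | | | |
| 1 | 10 | | | | |
| 2 | | 11 | | | |
| 3 | | | 2 | | |
| 4 | | | | 3 | |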


In [14]:
B.ss.selectk("last", 1)

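| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 5 | 5 | 5 | INT64 | hypercsr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 10 | | | | |
| 1 | | 11 | | | |
| 2 | | | 2 | | |
| 3 | | | | 3 | |
| 4 | | | | | 4 |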

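Unlike `compactify`, `selectk` keeps the chosen entries at their original column positions. Here is a plain-Python sketch of the row-wise behaviour seen in the outputs above, with the matrix again modeled as `{row: {col: value}}`. It is an illustration inferred from those outputs, and its `"random"` branch simply samples columns uniformly, which may differ from what the library does.

```python
import random


def selectk(rows, how, k):
    """Keep up to k entries per row without moving them to new columns."""
    out = {}
    for r, cols in rows.items():
        ordered = sorted(cols)  # column indices in ascending order
        if how == "first":
            keep = ordered[:k]
        elif how == "last":
            keep = ordered[max(len(ordered) - k, 0):]
        elif how == "random":
            keep = random.sample(ordered, min(k, len(ordered)))
        else:
            raise ValueError(f"unsupported how={how!r} in this sketch")
        out[r] = {c: cols[c] for c in keep}
    return out


# The matrix B from the cells above, stored row by row.
rows = {0: {0: 10}, 1: {0: 10, 1: 11}, 2: {1: 11, 2: 2}, 3: {2: 2, 3: 3}, 4: {3: 3, 4: 4}}

print(selectk(rows, "first", 1))
print(selectk(rows, "last", 1))
```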

In [15]:
B.ss.selectk("random", 1)

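| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 5 | 5 | 5 | INT64 | hypercsr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 10 | | | | |
| 1 | | 11 | | | |
| 2 | | 11 | | | |
| 3 | | | 2 | | |
| 4 | | | | | 4 |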


In [16]:
B << gb.op.one(B)
B

| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 9 | 5 | 5 | INT64 | bitmapr (iso) |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 1 | | | | |
| 1 | 1 | 1 | | | |
| 2 | | 1 | 1 | | |
| 3 | | | 1 | 1 | |
| 4 | | | | 1 | 1 |


In [17]:
v = gb.Vector.from_coo(np.arange(5), np.arange(5))
v

| gb.Vector | nvals | size | dtype | format |
|---|---|---|---|---|
| | 5 | 5 | INT64 | full |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| | 0 | 1 | 2 | 3 | 4 |


In [18]:
with gb.Recorder() as rec:
    B << B * v
B

| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 9 | 5 | 5 | INT64 | bitmapr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0 | | | | |
| 1 | 0 | 1 | | | |
| 2 | | 1 | 2 | | |
| 3 | | | 2 | 3 | |
| 4 | | | | 3 | 4 |


In [19]:
rec

0
gb.Recorder


In [20]:
with rec:
    B += v
B

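| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 25 | 5 | 5 | INT64 | fullr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0 | 1 | 2 | 3 | 4 |
| 1 | 0 | 2 | 2 | 3 | 4 |
| 2 | 0 | 2 | 4 | 3 | 4 |
| 3 | 0 | 1 | 4 | 6 | 4 |
| 4 | 0 | 1 | 2 | 6 | 8 |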

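The outputs of `B * v` and `B += v` suggest that the vector is broadcast across every row, with `*` acting on the intersection of stored entries and `+` acting on the union. The following plain-Python sketch reproduces both results under that reading. The helper names are hypothetical, and the dict-based model is only an illustration of the observed behaviour.

```python
import operator


def broadcast_intersect(matrix, vector, op):
    """Combine M[i, j] with v[j] only where the matrix has a stored entry."""
    return {(i, j): op(val, vector[j]) for (i, j), val in matrix.items() if j in vector}


def broadcast_union(matrix, vector, nrows, op):
    """Combine M[i, j] with v[j] where both exist; otherwise keep whichever exists."""
    out = {(i, j): vj for i in range(nrows) for j, vj in vector.items()}
    for (i, j), val in matrix.items():
        out[i, j] = op(val, vector[j]) if j in vector else val
    return out


# B holds ones on the main diagonal and the first subdiagonal.
B = {(i, i): 1 for i in range(5)}
B.update({(i, i - 1): 1 for i in range(1, 5)})
v = {j: j for j in range(5)}

B = broadcast_intersect(B, v, operator.mul)     # like `B << B * v`
B = broadcast_union(B, v, 5, operator.add)      # like `B += v`
print([B[3, j] for j in range(5)])
```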

In [21]:
gb.op.plus_times(v @ v).new()

| gb.Scalar | value | dtype |
|---|---|---|
| | 30 | INT64 |


In [22]:
v.outer(v, gb.op.plus).new()

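| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 25 | 5 | 5 | INT64 | fullr |

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0 | 1 | 2 | 3 | 4 |
| 1 | 1 | 2 | 3 | 4 | 5 |
| 2 | 2 | 3 | 4 | 5 | 6 |
| 3 | 3 | 4 | 5 | 6 | 7 |
| 4 | 4 | 5 | 6 | 7 | 8 |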

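The last two cells are a semiring inner product and an outer product with a custom binary operator. A small plain-Python sketch of both follows, with sparse vectors modeled as `{index: value}`. The `inner` and `outer` helpers are illustrative and only mirror the arithmetic, not the library's execution.

```python
import functools
import operator


def inner(u, v, add, mul):
    """Semiring inner product: multiply matching entries, then reduce with `add`."""
    products = [mul(u[i], v[i]) for i in sorted(u.keys() & v.keys())]
    return functools.reduce(add, products) if products else None


def outer(u, v, op):
    """Outer product: apply `op` to every pair (u[i], v[j])."""
    return {(i, j): op(ui, vj) for i, ui in u.items() for j, vj in v.items()}


v = {i: i for i in range(5)}

dot = inner(v, v, operator.add, operator.mul)   # plus_times: 0 + 1 + 4 + 9 + 16
table = outer(v, v, operator.add)               # entry (i, j) holds i + j
print(dot, table[4, 4])
```

Swapping `operator.add` and `operator.mul` for other functions gives other semirings, which is the flexibility the `gb.op.plus_times(v @ v)` spelling exposes.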

### Array UDT

In [23]:
gb.Matrix.from_coo(np.arange(5), np.zeros(5, int), np.arange(25).reshape(5, 5))

| gb.Matrix | nvals | nrows | ncols | dtype | format |
|---|---|---|---|---|---|
| | 5 | 5 | 1 | INT64[5] | fullc |

| | 0 |
|---|---|
| 0 | [0, 1, 2, 3, 4] |
| 1 | [5, 6, 7, 8, 9] |
| 2 | [10, 11, 12, 13, 14] |
| 3 | [15, 16, 17, 18, 19] |
| 4 | [20, 21, 22, 23, 24] |


### SuiteSparse:GraphBLAS options

In [24]:
A.ss.config

{'bitmap_switch': 0.07999999821186066,
 'format': 'by_row',
 'hyper_switch': 0.0625,
 'sparsity_control': {'auto'},
 'sparsity_status': 'bitmap'}

In [25]:
?A.ss.config

In [26]:
A.ss.config["sparsity_control"] = {"sparse", "bitmap"}

In [27]:
A.ss.config["sparsity_control"] = "auto"

In [28]:
gb.ss.config

{'bitmap_switch': [0.03999999910593033, 0.05000000074505806, 0.05999999865889549, 0.07999999821186066, 0.10000000149011612, 0.20000000298023224, 0.30000001192092896, 0.4000000059604645],
 'burble': False,
 'chunk': 65536.0,
 'format': 'by_row',
 'gpu_chunk': 1048576.0,
 'gpu_control': 'never',
 'hyper_switch': 0.0625,
 'memory_pool': [0, 0, 0, 16384, 16384, 16384, 16384, 16384, 16384, 8192, 4096, 2048, 1024, 512, 256, 128, 64, 32, 16, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'nthreads': 8,
 'print_1based': False}

## What's next?
- Respond to user needs and to the needs of `graphblas-algorithms` and `dask-graphblas`
- Keep improving documentation
- Focus on building community, connections, and sustainability
  - Many opportunities for GraphBLAS to make inroads in PyData or scientific Python communities
- Get GraphBLAS into the hands of NetworkX users
- Make it easier for new contributors
  - Come write an algorithm for `graphblas-algorithms`!
- Add GPU support
- Stay up to date with SuiteSparse:GraphBLAS:
  - Can we use JIT in SuiteSparse from Python?
- `dask-graphblas` (longer-term goal; any interested parties?)
  - benchmark, improve, and iterate
  - add partitioning strategies
  - support multi-GPU
- Work on sparse binary file format
- See: https://github.com/python-graphblas/python-graphblas/issues
  - index via *points*, not *blocks*
  - improve array UDTs to make GraphSAGE implementation super-nice
  - wrap "other" GraphBLAS implementations, maybe vanilla SuiteSparse:GraphBLAS (no GxB)