tomwhite/array-tracker

array-tracker

An implementation of the Python array API standard that tracks array element dependencies

Why?

This library was developed to support op-art, which produces visual animations of array operations in NumPy (and related projects).

This is achieved by providing an implementation of the Python array API standard that tracks array element-level dependencies between arrays being operated on, and using that information to produce interactive animations of the operations in a code snippet.

array-tracker only tracks these element-level dependencies; all the visualization code lives in op-art. This separation makes it easier to reuse the dependency tracking for other applications (including better visualizations).

How it works

A new array object is used to encapsulate the raw array data and metadata to track element dependencies.

Every array is an instance of array_tracker.Array, which contains the underlying array (which itself conforms to the Python array API standard), a unique ID, and four arrays used to track dependencies:

  • arr_ids - an array containing this array's ID in every position. It is the same shape as the main array.
  • offsets - an array containing the offset of every element in the main array (as a 1-D index). It is the same shape as the main array.
  • src_arr_ids - an array containing, for each element, the IDs of the arrays that the element depends on. It may be None if the whole array has no dependencies; otherwise it has the same shape as the main array plus an extra trailing dimension whose size is the number of sources each element depends on. For example, if the main array has shape (3, 4) and each element has two sources, then src_arr_ids has shape (3, 4, 2).
  • src_offsets - an array containing, for each element, the offsets of the source elements that the element depends on. It has the same shape as src_arr_ids.
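To make the layout concrete, here is a minimal NumPy sketch of the two "identity" arrays for a freshly created array of shape (3, 4). The helper name make_tracking_arrays and the integer ID 7 are assumptions made for illustration; they are not taken from the array_tracker source.

```python
import numpy as np

def make_tracking_arrays(shape, array_id):
    """Hypothetical helper: build arr_ids and offsets for a new array."""
    size = int(np.prod(shape))
    # Every position holds the ID of the array it belongs to.
    arr_ids = np.full(shape, array_id, dtype=np.int64)
    # Every position holds its own flat (1-D, row-major) index.
    offsets = np.arange(size, dtype=np.int64).reshape(shape)
    return arr_ids, offsets

arr_ids, offsets = make_tracking_arrays((3, 4), array_id=7)

# A freshly created array depends on nothing, so both source arrays are None.
src_arr_ids = None
src_offsets = None
```

In this sketch the element at position (1, 2) has offset 1 * 4 + 2 = 6, so the pair (7, 6) identifies it uniquely across all tracked arrays.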

Array implements the contract defined in the Array object section of the array API standard, which includes attributes like dtype and shape, and methods like __add__ for adding two arrays together, or __neg__ for returning an array in which each element is negated.
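As an illustration of what an element-wise method such as __add__ might record, the sketch below stacks the operands' arr_ids and offsets along a new trailing axis, giving two sources per result element. This is an assumption about one plausible approach, not the library's actual code; elementwise_sources is a hypothetical name.

```python
import numpy as np

def elementwise_sources(a_ids, a_offsets, b_ids, b_offsets):
    """Hypothetical helper: source arrays for an element-wise binary op."""
    # Broadcast the operands' tracking arrays to the result shape.
    a_ids, b_ids = np.broadcast_arrays(a_ids, b_ids)
    a_offsets, b_offsets = np.broadcast_arrays(a_offsets, b_offsets)
    # One slot per source along a new trailing dimension.
    src_arr_ids = np.stack([a_ids, b_ids], axis=-1)
    src_offsets = np.stack([a_offsets, b_offsets], axis=-1)
    return src_arr_ids, src_offsets

# Array with ID 1 and shape (3, 4), added to array with ID 2 and shape (4,).
a_ids, a_offsets = np.full((3, 4), 1), np.arange(12).reshape(3, 4)
b_ids, b_offsets = np.full((4,), 2), np.arange(4)

src_arr_ids, src_offsets = elementwise_sources(a_ids, a_offsets, b_ids, b_offsets)
```

Here both source arrays have shape (3, 4, 2), matching the two-source example above. A unary method such as __neg__ would, under the same assumption, produce a trailing dimension of size 1.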

The rest of the array API spec is also implemented in array_tracker.

For each operation in the spec, the source arrays of the result must be computed, usually in a way that depends on the nature of the operation. The advantage of using arrays for tracking dependencies is that there is often an operation that can be performed on the tracking arrays of the inputs to transform them appropriately. For example, to implement the creation function tril, we can use np.tril_indices to pull out the relevant values from the arr_ids and offsets arrays of the input array to populate the source arrays of the result. Things are not always that simple, however. Have a look at the implementations of the statistical functions like sum, or the set functions like unique_values, for examples where more effort is required to recover the source elements.
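The tril case can be sketched as follows for a 2-D input. The function name tril_sources and the use of -1 as a "no source" marker for the zeroed upper triangle are assumptions made for this example; the actual implementation may represent missing sources differently.

```python
import numpy as np

NO_SOURCE = -1  # hypothetical sentinel for elements with no dependency

def tril_sources(in_ids, in_offsets, k=0):
    """Hypothetical helper: source arrays for tril of a 2-D array."""
    n_rows, n_cols = in_ids.shape
    # Row and column indices of the lower triangle that tril keeps.
    rows, cols = np.tril_indices(n_rows, k=k, m=n_cols)
    # One source per element, so the extra trailing dimension has size 1.
    src_arr_ids = np.full((n_rows, n_cols, 1), NO_SOURCE, dtype=np.int64)
    src_offsets = np.full((n_rows, n_cols, 1), NO_SOURCE, dtype=np.int64)
    # Each kept element depends on the input element at the same position.
    src_arr_ids[rows, cols, 0] = in_ids[rows, cols]
    src_offsets[rows, cols, 0] = in_offsets[rows, cols]
    return src_arr_ids, src_offsets

# Input array with ID 5 and shape (3, 3).
in_ids = np.full((3, 3), 5)
in_offsets = np.arange(9).reshape(3, 3)

tril_src_ids, tril_src_offsets = tril_sources(in_ids, in_offsets)
```

Reductions such as sum do not fit this pattern directly, because each result element depends on many input elements rather than one.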

Note that a lot of the implementation is not concerned with tracking dependencies per se, and is based on https://github.com/data-apis/array-api-strict/tree/main/array_api_strict.

Development

How to build and test the library (for developers):

conda create --name array-tracker python=3.12
conda activate array-tracker
pip install -e '.[test]'
pytest
