An implementation of the Python array API standard that tracks array element dependencies
This library was developed to support op-art, which produces visual animations of array operations in NumPy (and related projects).
This is achieved by providing an implementation of the Python array API standard that tracks element-level dependencies between the arrays being operated on. op-art then uses that information to produce interactive animations of the operations in a code snippet.
All that array-tracker does is track these element-level dependencies - all the visualization code lives in op-art. This separation makes it easier to reuse the dependency tracking for other applications (including better visualizations).
A new array object is used to encapsulate the raw array data and metadata to track element dependencies.
Every array is an instance of array_tracker.Array, which contains the underlying array (which itself conforms to the Python array API standard), a unique ID, and four arrays used to track dependencies:
arr_ids - an array containing the ID in every position. It is the same shape as the main array.
offsets - an array containing the offset of every element in the main array (as a 1-D index). It is the same shape as the main array.
src_arr_ids - an array containing the IDs of the arrays that this element depends on. It may be None if there are no dependencies (for the whole array), otherwise it is the same shape as the main array, plus an extra dimension. The size of the extra dimension corresponds to the number of sources it depends on. For example, if the main array has shape (3, 4) then src_arr_ids would be (3, 4, 2) if it has two sources.
src_offsets - an array containing the offsets of the elements that this element depends on. It has the same shape as src_arr_ids.
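To make the shapes concrete, here is a minimal sketch that builds the four tracking arrays by hand with plain NumPy for a (3, 4) array with two sources. It does not use array_tracker itself; the ID values (7, 1 and 2) are made up for illustration, and the assumption that each element depends on the element at the same position in both sources is just one possible case (an element-wise binary operation).

```python
import numpy as np

shape = (3, 4)
arr_id = 7  # made-up unique ID for this array (illustration only)

# arr_ids: the array's own ID repeated in every position.
arr_ids = np.full(shape, arr_id)

# offsets: the 1-D (flattened) index of every element.
offsets = np.arange(np.prod(shape)).reshape(shape)

# Assume this array was produced element-wise from two source arrays
# with made-up IDs 1 and 2, both of the same shape.
src_arr_ids = np.stack([np.full(shape, 1), np.full(shape, 2)], axis=-1)

# Each element depends on the element at the same offset in both sources.
src_offsets = np.stack([offsets, offsets], axis=-1)

print(arr_ids.shape, offsets.shape)          # same shape as the main array
print(src_arr_ids.shape, src_offsets.shape)  # main shape plus one extra dimension
```

The last axis of src_arr_ids and src_offsets is indexed by source, so src_arr_ids[i, j] and src_offsets[i, j] together identify every element that position (i, j) depends on.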
Array implements the contract defined in the Array object array API, which includes things like the dtype and shape attributes, and methods like __add__ for adding two arrays together, or __neg__ for returning an array where each element is the negative value.
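As a rough illustration of how such methods could record dependencies, the sketch below defines a toy class with dtype, shape, __neg__ and __add__. TrackedArray and everything inside it are hypothetical names chosen for this example; it is not the code of array_tracker.Array, and it assumes both operands of __add__ have the same shape (no broadcasting).

```python
import itertools

import numpy as np

_next_id = itertools.count(1)  # hypothetical source of unique IDs


class TrackedArray:
    """Toy stand-in for a dependency-tracking array (illustration only)."""

    def __init__(self, data, src_arr_ids=None, src_offsets=None):
        self.data = np.asarray(data)
        self.id = next(_next_id)
        self.arr_ids = np.full(self.data.shape, self.id)
        self.offsets = np.arange(self.data.size).reshape(self.data.shape)
        # None means the array has no dependencies (e.g. it was created directly).
        self.src_arr_ids = src_arr_ids
        self.src_offsets = src_offsets

    @property
    def dtype(self):
        return self.data.dtype

    @property
    def shape(self):
        return self.data.shape

    def __neg__(self):
        # One source per element: the same position in self.
        return TrackedArray(
            -self.data,
            self.arr_ids[..., np.newaxis],
            self.offsets[..., np.newaxis],
        )

    def __add__(self, other):
        # Two sources per element: the same position in each operand.
        # Assumes self and other have identical shapes.
        return TrackedArray(
            self.data + other.data,
            np.stack([self.arr_ids, other.arr_ids], axis=-1),
            np.stack([self.offsets, other.offsets], axis=-1),
        )


a = TrackedArray([[1, 2], [3, 4]])
b = TrackedArray([[10, 20], [30, 40]])
c = a + b
n = -a
```

Here c.src_arr_ids has shape (2, 2, 2) because every element of c has two sources, while n.src_arr_ids has shape (2, 2, 1) because negation has only one.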
The rest of the array API spec is also implemented in array_tracker.
For each operation in the spec, the source arrays must be computed, usually in a way that depends on the nature of the operation. The advantage of using arrays for tracking dependencies is that there is often an operation that can be performed on the source arrays to transform them appropriately. For example, to implement the creation function tril, we can use np.tril_indices to pull out the relevant values from the id and offset arrays of the input array to populate the source arrays of the result. Things are not always that simple, however. Have a look at the implementations of the statistical functions like sum, or the set functions like unique_values, for examples where more effort is required to recover the source elements.
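The tril case can be sketched roughly as follows. This is not the library's implementation: tril_sources is a hypothetical helper, it handles only 2-D inputs, and the use of -1 to mark elements with no source (those zeroed out above the diagonal) is an assumption made for this example.

```python
import numpy as np


def tril_sources(arr_ids, offsets, k=0):
    """Build source arrays for tril of a 2-D input (hypothetical helper)."""
    n_rows, n_cols = arr_ids.shape
    # Indices of the elements that tril keeps (on or below the k-th diagonal).
    rows, cols = np.tril_indices(n_rows, k=k, m=n_cols)

    # Start with "no source" everywhere; -1 is an assumed sentinel value.
    src_arr_ids = np.full(arr_ids.shape + (1,), -1)
    src_offsets = np.full(offsets.shape + (1,), -1)

    # Kept elements depend on the same position in the input array.
    src_arr_ids[rows, cols, 0] = arr_ids[rows, cols]
    src_offsets[rows, cols, 0] = offsets[rows, cols]
    return src_arr_ids, src_offsets


# Input array with made-up ID 5 and shape (3, 3).
in_arr_ids = np.full((3, 3), 5)
in_offsets = np.arange(9).reshape(3, 3)

out_src_arr_ids, out_src_offsets = tril_sources(in_arr_ids, in_offsets)
print(out_src_offsets[..., 0])
```

Reductions such as sum do not fit this pattern as directly, because each output element depends on a variable number of input elements rather than on a single position.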
Note that a lot of the implementation is not concerned with tracking dependencies per se, and is based on https://github.com/data-apis/array-api-strict/tree/main/array_api_strict.
How to build and test the library (for developers):
conda create --name array-tracker python=3.12
conda activate array-tracker
pip install -e '.[test]'
pytest