<img src="https://microsoft.github.io/Accera/assets/Accera_darktext.png" alt="Accera logo" width="600"/>

# Accera Quickstart Example

In this example, we will:

1. Implement a simple `hello_accera` function that performs basic matrix multiplication with a ReLU activation
2. Build a [HAT](https://github.com/microsoft/hat) package with a dynamic (shared) library that exports this function
3. Call the `hello_accera` function in the dynamic library with some NumPy arrays, and check the result against a NumPy implementation

### Setup

First, we'll install Accera using `pip`.

#### Optional: if running this notebook locally

* Linux/macOS: install gcc (for example, `apt install gcc` on Debian or Ubuntu).
* Windows: install Microsoft Visual Studio and run `vcvars64.bat` to set up the Visual Studio tools in your `PATH` before starting the Jupyter environment.

In [None]:
!pip install accera

### Build your first package

We'll build a package called `"mypackage"`, containing one function called `"hello_accera"`.

This function performs the operation `ReLU(C + A @ B)` on arrays `A`, `B`, and `C`.
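
To make the operation concrete, here is a minimal pure-Python sketch of `ReLU(C + A @ B)` on small 2x2 matrices stored as lists of rows. The helper name `relu_matmul_add` and the `*_demo` inputs are made up for illustration and are not part of Accera.

```python
def relu_matmul_add(A, B, C):
    """Return ReLU(C + A @ B) for matrices given as lists of rows."""
    rows, inner, cols = len(A), len(B), len(B[0])
    result = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # C[i][j] plus the dot product of row i of A with column j of B.
            value = C[i][j] + sum(A[i][k] * B[k][j] for k in range(inner))
            # ReLU clamps negative values to zero.
            row.append(max(value, 0.0))
        result.append(row)
    return result


# Hypothetical demo inputs, chosen so that some entries go negative.
A_demo = [[1.0, -2.0], [0.0, 3.0]]
B_demo = [[1.0, 0.0], [2.0, 1.0]]
C_demo = [[5.0, 1.0], [-7.0, 0.0]]

print(relu_matmul_add(A_demo, B_demo, C_demo))
```

The two entries where `C + A @ B` is negative come out as `0.0`; the others pass through unchanged.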

In [None]:
import accera as acc
import hatlib as hat
import numpy as np

A = acc.Array(role=acc.Array.Role.INPUT, shape=(16, 16))
B = acc.Array(role=acc.Array.Role.INPUT, shape=(16, 16))
C = acc.Array(role=acc.Array.Role.INPUT_OUTPUT, shape=(16, 16))

matmul = acc.Nest(shape=(16, 16, 16))
i1, j1, k1 = matmul.get_indices()

@matmul.iteration_logic
def _():
    C[i1, j1] += A[i1, k1] * B[k1, j1]

relu = acc.Nest(shape=(16, 16))
i2, j2 = relu.get_indices()

@relu.iteration_logic
def _():
    C[i2, j2] = acc.max(C[i2, j2], 0.0)

matmul_schedule = matmul.create_schedule()
relu_schedule = relu.create_schedule()

# fuse the first 2 indices of matmul and relu
schedule = acc.fuse(matmul_schedule, relu_schedule, partial=2)

package = acc.Package()
package.add(schedule, args=(A, B, C), base_name="hello_accera")

# build a dynamically-linked HAT package
package.build(name="mypackage", format=acc.Package.Format.HAT_DYNAMIC)
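
Fusing on the first two indices means the matmul and the ReLU share their `i` and `j` loops, while the `k` loop stays private to the matmul. The plain-Python sketch below shows one loop order that this sharing makes possible and checks it against the unfused two-pass version. It is only a conceptual illustration with hypothetical helper names, not the code Accera generates; the loop order Accera actually emits depends on the schedule.

```python
def two_pass(A, B, C):
    """Unfused reference: finish the whole matmul, then apply ReLU."""
    n, m, p = len(A), len(B[0]), len(B)
    out = [row[:] for row in C]
    for i in range(n):
        for j in range(m):
            for k in range(p):
                out[i][j] += A[i][k] * B[k][j]
    for i in range(n):
        for j in range(m):
            out[i][j] = max(out[i][j], 0.0)
    return out


def shared_ij(A, B, C):
    """Shared (i, j) loops: ReLU runs right after each element's k loop."""
    n, m, p = len(A), len(B[0]), len(B)
    out = [row[:] for row in C]
    for i in range(n):
        for j in range(m):
            for k in range(p):  # k belongs to the matmul only
                out[i][j] += A[i][k] * B[k][j]
            # The accumulation for (i, j) is complete, so ReLU is safe here.
            out[i][j] = max(out[i][j], 0.0)
    return out


# Hypothetical demo inputs.
A_demo = [[1.0, -2.0], [0.0, 3.0]]
B_demo = [[1.0, 0.0], [2.0, 1.0]]
C_demo = [[5.0, 1.0], [-7.0, 0.0]]

print(shared_ij(A_demo, B_demo, C_demo) == two_pass(A_demo, B_demo, C_demo))
```

In both versions the ReLU for an element runs only after that element's `k` loop has finished, which is what keeps the result correct.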

### Load the package and call the function

`package.build` produces a HAT file (`mypackage.hat`) and a dynamic library (`mypackage_*.so`) that exports the `hello_accera` function.

Let's call our function with some NumPy arrays.

In [None]:
# load the package and call the function with random test input
hat_package = hat.load("mypackage.hat")
hello_accera = hat_package["hello_accera"]

A_test = np.random.rand(16, 16).astype(np.float32)
B_test = np.random.rand(16, 16).astype(np.float32)
C_test = np.zeros((16, 16)).astype(np.float32)

# compute using NumPy as a comparison
C_np = np.maximum(C_test + A_test @ B_test, 0)

hello_accera(A_test, B_test, C_test)

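# compare the result with NumPy; float32 sums can differ in the last bits,
# so use a looser relative tolerance than the 1e-7 default
np.testing.assert_allclose(C_test, C_np, rtol=1e-5)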
print(C_np)
print(C_test)
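
The comparison uses `np.testing.assert_allclose` rather than exact equality because floating-point addition is not associative: two correct implementations that sum the same terms in a different order can disagree in the last bits. A minimal sketch of the effect, using plain Python floats (float64) instead of the float32 arrays above:

```python
import math

# The same three numbers, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# Exact comparison fails even though both sums are "correct".
print(left_to_right == right_to_left)

# A tolerance-based comparison treats them as equal.
print(math.isclose(left_to_right, right_to_left, rel_tol=1e-9))
```

The first line prints `False` and the second prints `True`. A matrix multiplication sums many products per output element, so the same effect can appear when comparing a compiled kernel against NumPy.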

### Next Steps

The function can be optimized using [schedule transformations](https://microsoft.github.io/Accera/Manual/03%20Schedules/#schedule-transformations). The [Manual](https://microsoft.github.io/Accera/Manual/00%20Introduction/) is a good place to start for an introduction to the Accera programming model.

## Documentation
Get to know Accera by reading the [Documentation](https://microsoft.github.io/Accera/).

You can find more step-by-step examples in the [Tutorials](https://microsoft.github.io/Accera/Tutorials).