Release v1.2.10 · microsoft/Accera

What's Changed

Update ci.yml to fix path changes by @lisaong in #49
Add unrolled convolution case study link by @marina-neseem in #50
Bump protobuf from 3.20.1 to 3.20.2 in /accera/onnx-emitter/test by @dependabot in #51

Merged PR 2886: [release] Bump docs to 1.2.10, sync GH to ADO. [Lisa Ong]
- Bulk docs version update
- Bump protobuf from 3.20.1 to 3.20.2 in /accera/onnx-emitter/test (d1b87ec)
- Also fixing a minor docs bug (errant backtick)
Merged PR 2884: Add DSL test for runtime size correctness. [Denny Sun]
Merged PR 2878: Optimize warp id calculation by forcing scalar registers. [Ritwik Das]
- ROCM: use __builtin_amdgcn_readfirstlane to force scalar reg usage
- CUDA: don't use anything special since __shfl_sync seems to generate slower code
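
To illustrate why the warp id is a candidate for a scalar register: every lane in a warp computes the same warp id, so broadcasting the first lane's value (which is what `__builtin_amdgcn_readfirstlane` does on ROCm) leaves the result unchanged. The Python sketch below only simulates that property. `warp_id`, `read_first_lane`, and `warp_ids_are_uniform` are hypothetical helper names, and the warp size of 64 in the usage line is an assumption for illustration, not a value taken from the PR.

```python
from typing import Sequence


def warp_id(thread_id: int, warp_size: int) -> int:
    """Warp index of a linear thread id within a block."""
    return thread_id // warp_size


def read_first_lane(values: Sequence[int]) -> int:
    """Stand-in for a first-lane broadcast: return the first lane's value."""
    return values[0]


def warp_ids_are_uniform(block_size: int, warp_size: int) -> bool:
    """Check that all lanes of each warp agree with the first lane's warp id."""
    for start in range(0, block_size, warp_size):
        lanes = range(start, min(start + warp_size, block_size))
        per_lane = [warp_id(t, warp_size) for t in lanes]
        first = read_first_lane(per_lane)
        if any(v != first for v in per_lane):
            return False
    return True


# Assumed sizes for illustration: 256 threads per block, 64 lanes per warp.
uniform = warp_ids_are_uniform(block_size=256, warp_size=64)
```

Because the value is uniform across the warp, the compiler can keep it in a scalar register once it is told so; the sketch says nothing about the resulting performance.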
Merged PR 2885: Updates python dependencies. [Kern Handa]

Updates hatlib version

Merged PR 2881: Fix the runtime crash caused by incorrectly generated LLVM IR. [Denny Sun]

- Call the specific version of the LLVM type converter for dynamic memory
- Create the MemRefDescriptor from the dynamic memory shape by associating the arrays with the correct size arguments
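
For context, MLIR's LLVM lowering represents a ranked memref as a descriptor holding an allocated pointer, an aligned pointer, an offset, and per-dimension sizes and strides. With dynamic shapes, the sizes have to come from the runtime size arguments. The sketch below models that association in plain Python for a row-major layout. The `MemRefDescriptor` dataclass and both helper functions are hypothetical illustrations, not Accera's or MLIR's actual classes, and the pointer fields are omitted.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MemRefDescriptor:
    """Illustrative model of a ranked memref descriptor (pointers omitted)."""
    offset: int
    sizes: Tuple[int, ...]
    strides: Tuple[int, ...]


def descriptor_from_runtime_sizes(sizes: Tuple[int, ...]) -> MemRefDescriptor:
    """Build a row-major descriptor from sizes that are only known at runtime."""
    strides = []
    running = 1
    # Walk from the innermost dimension outward, accumulating the stride.
    for extent in reversed(sizes):
        strides.append(running)
        running *= extent
    return MemRefDescriptor(offset=0, sizes=tuple(sizes), strides=tuple(reversed(strides)))


def linear_index(desc: MemRefDescriptor, indices: Tuple[int, ...]) -> int:
    """Flat element offset of a multi-dimensional index."""
    return desc.offset + sum(i * s for i, s in zip(indices, desc.strides))


# Example: an array of shape (M, K) where M=64 and K=32 arrive as runtime arguments.
desc_a = descriptor_from_runtime_sizes((64, 32))
```

If an array were paired with the wrong size arguments, the strides and therefore every element offset would be wrong, which is consistent with the kind of runtime crash this PR describes.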

With this change, the following DSL test succeeds and passes the correctness check.

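        # Imports assumed by this excerpt (the surrounding test module is not shown)
        import numpy as np
        from accera import Array, Dimension, Nest, Package, ScalarType

        # Runtime-sized dimensions
        M = Dimension()
        N = Dimension()
        K = Dimension()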

        A = Array(shape=(M, K), element_type=ScalarType.float32,
            role=Array.Role.INPUT)

        B = Array(shape=(K, N), element_type=ScalarType.float32,
            role=Array.Role.INPUT)

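        C = Array(shape=(M, N), element_type=ScalarType.float32,
            role=Array.Role.INPUT_OUTPUT)

        # Assumed setup, omitted from the original excerpt: the loop nest and its indices
        nest = Nest(shape=(M, N, K))
        i, j, k = nest.get_indices()

        @nest.iteration_logic
        def _():
            C[i, j] += A[i, k] * B[k, j]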

        M_test = np.int64(64)
        N_test = np.int64(128)
        K_test = np.int64(32)
        A_test = np.random.random((M_test, K_test)).astype(np.float32)
        B_test = np.random.random((K_test, N_test)).astype(np.float32)
        C_test = np.random.random((M_test, N_test)).astype(np.float32)

        correctness_check_values = {
            "pre": [M_test, N_test, K_test, A_test, B_test, C_test],
            "post": [M_test, N_test, K_test, A_test, B_test, C_test + A_test @ B_test],
        }

        # Assumed setup, omitted from the original excerpt
        package = Package()
        function = package.add(nest, args=(M, N, K, A, B, C), base_name="runtimesizes")

        with verifiers.VerifyPackage(self, "test_runtimesizes", TEST_PACKAGE_DIR) as v:
            package.build("test_runtimesizes", format=TEST_FORMAT | Package.Format.MLIR_VERBOSE, mode=TEST_MODE, output_dir=TEST_PACKAGE_DIR)
            if correctness_check_values:
                v.check_correctness(
                    function.name,
                    before=correctness_check_values["pre"],
                    after=correctness_check_values["post"],
                )
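
The expected "post" value in the test, `C_test + A_test @ B_test`, is what the loop nest computes when it accumulates into `C`. A plain NumPy sketch of the same check, independent of Accera, with the sizes passed as runtime values (`matmul_accumulate` is a hypothetical reference helper, and the small sizes are chosen only to keep the Python loops fast):

```python
import numpy as np


def matmul_accumulate(M, N, K, A, B, C):
    """Reference loop nest: C[i, j] += A[i, k] * B[k, j], sizes given at runtime."""
    out = C.copy()
    for i in range(M):
        for j in range(N):
            for k in range(K):
                out[i, j] += A[i, k] * B[k, j]
    return out


rng = np.random.default_rng(0)
M, N, K = 8, 16, 4
A = rng.random((M, K), dtype=np.float32)
B = rng.random((K, N), dtype=np.float32)
C = rng.random((M, N), dtype=np.float32)

result = matmul_accumulate(M, N, K, A, B, C)
matches = np.allclose(result, C + A @ B, rtol=1e-4)
```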

Merged PR 2879: Fix exception in GPU baseline benchmark. [Ritwik Das]
Merged PR 2856: Enable output caching in ROCM for all MMA shapes. [Ritwik Das]
Merged PR 2876: Introduce warp bindings in CUDA. [Ritwik Das]
- Bind indices to WARP_X/Y along with tensorization (exclusively from thread id mapping)
- The block's x dimension is always a multiple of the warp size. For example, when a 64x64 block tile is divided into 4 subtiles of 32x32, each computed by a single warp, the blockDim would be (64,2,1).
- This is required because tensorization needs block dims to be generated in a specific way that differs from the non-tensorized case. Calculating offsets within the matrix based on warps is non-trivial, if not impossible, with thread bindings alone.
Related work items: #3726
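
The blockDim arithmetic in the example above can be sketched in Python. This assumes a CUDA warp size of 32 and a warp grid of 2x2 for the four 32x32 subtiles. The function names, and the choice to map warp x to columns and warp y to rows, are hypothetical illustrations rather than Accera's implementation.

```python
from typing import Tuple


def block_dim_for_warp_grid(warps_x: int, warps_y: int, warp_size: int = 32) -> Tuple[int, int, int]:
    """Block dims where x is a whole number of warps and y counts warp rows."""
    return (warps_x * warp_size, warps_y, 1)


def warp_coords(thread_x: int, thread_y: int, warp_size: int = 32) -> Tuple[int, int]:
    """Warp (x, y) coordinates of a thread under the layout above."""
    return (thread_x // warp_size, thread_y)


def subtile_origin(thread_x: int, thread_y: int,
                   subtile: Tuple[int, int] = (32, 32), warp_size: int = 32) -> Tuple[int, int]:
    """(row, col) of the subtile a thread's warp computes (assumed mapping)."""
    wx, wy = warp_coords(thread_x, thread_y, warp_size)
    return (wy * subtile[0], wx * subtile[1])


# A 64x64 block tile split into 2x2 subtiles of 32x32, one warp per subtile.
block_dim = block_dim_for_warp_grid(warps_x=2, warps_y=2)
```

With this layout every thread in a warp shares the same warp coordinates, so the subtile offset follows directly from the warp binding instead of being reconstructed from individual thread ids.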
Merged PR 2874: Add unrolled convolution case study link (#50) [Lisa Ong]

- Update README.md: add unrolled convolution case study reference link
- Update the reference link according to the latest updates in the case study
Merged PR 2873: Convert function signature from dynamic memref type to llvm type. [Denny Sun]

With this change, Accera can write the correct function signature for dynamic memref types to the HAT file.
Merged PR 2871: Update hatlib version. [Denny Sun]

Updates hatlib from 0.0.23 to 0.0.25.
Merged PR 2870: Filter benchmark kernels based on scheduling policy. [Ritwik Das]
Merged PR 2867: [build][github] Update test path in github actions. [Lisa Ong]

Fixes https://github.com/microsoft/Accera/actions/runs/3071905923

Full Changelog: v1.2.9...v1.2.10
