Latest commit

This PR enables fusion in the pipeline, which addresses performance
issues for matmul + generic cases. However, it causes performance
regressions in a few cases, because some unnecessary one-trip loops are
still created. Some of them can be fixed by
#7458: when a dim size is smaller than the tiling size, the trip count
is known at compile time, so the one-trip loop does not need to be
generated.
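
To illustrate why such loops are redundant, here is a minimal Python sketch of the trip-count arithmetic. The helper names and the tile sizes are hypothetical, chosen only for illustration; they are not the tiling configuration used by IREE.

```python
def trip_count(dim_size: int, tile_size: int) -> int:
    """Iterations of `for i in range(0, dim_size, tile_size)` (ceiling division)."""
    return -(-dim_size // tile_size)


def is_one_trip(dim_size: int, tile_size: int) -> bool:
    """True when the tiled loop runs exactly once, i.e. 0 < dim_size <= tile_size."""
    return trip_count(dim_size, tile_size) == 1


# Hypothetical tile sizes, one per dimension (assumption, not taken from the PR).
TILE_SIZES = (8, 32, 16)

# Shapes taken from the matmul microbenchmark table.
for shape in [(1, 384, 384), (384, 512, 2), (128, 384, 384)]:
    counts = [trip_count(d, t) for d, t in zip(shape, TILE_SIZES)]
    one_trip = [is_one_trip(d, t) for d, t in zip(shape, TILE_SIZES)]
    print(shape, "trip counts:", counts, "one-trip:", one_trip)
```

When the dim size and the tile size are both static, the check above can be evaluated at compile time, so a loop that would run once can be replaced by its body.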

This PR also enables vectorization for cases where the dim sizes are not
divisible by the tiling sizes. See the 260x300x280 example below.
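
As a rough illustration of the non-divisible case, the sketch below splits each dimension into full tiles plus a remainder. The tile size of 16 is a hypothetical value for illustration only, and the sketch makes no claim about how the pipeline actually handles the remainder.

```python
from typing import Tuple


def split_dim(dim_size: int, tile_size: int) -> Tuple[int, int]:
    """Return (number of full tiles, leftover elements) for one dimension."""
    return divmod(dim_size, tile_size)


TILE = 16  # hypothetical tile size (assumption, not taken from the PR)

# Shapes taken from the matmul microbenchmark table.
for shape in [(260, 300, 280), (1000, 1000, 1000), (1024, 1024, 1024)]:
    splits = [split_dim(d, TILE) for d in shape]
    divisible = all(rem == 0 for _, rem in splits)
    print(shape, "(full tiles, remainder) per dim:", splits, "divisible:", divisible)
```

Under this assumed tile size, only 1024x1024x1024 divides evenly; 260x300x280 and 1000x1000x1000 leave a remainder in every dimension.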

Overall, the new pipeline is better than the TileFuseAndVectorize
pipeline, so we're going to switch matmul codegen to the new pipeline.

There are still some gaps between sandbox and IREE. It is not really an
apples-to-apples comparison because:

- IREE has some overheads, which matter when the execution time is
  small.
- IREE takes a multi-threaded approach, while sandbox takes a
  single-threaded approach.

The next step is to search parameters to see whether IREE can get closer
to sandbox and the MKL library, and then to add heuristic configurations
for those cases.

# Performance

## MobileBert Inference Benchmarks

| Before (single-threaded) | After (single-threaded) | Before (multi-threaded) | After (multi-threaded) |
| ------------------------ | ----------------------- | ----------------------- | ---------------------- |
|                   228 ms |                  200 ms |                 61.1 ms |                59.3 ms |

## MiniLM BERT Benchmarks

| Before (single-threaded) | After (single-threaded) | Before (multi-threaded) | After (multi-threaded) |
| ------------------------ | ----------------------- | ----------------------- | ---------------------- |
|                  82.7 ms |                 79.1 ms |                21.6 ms |                21.1 ms |

## Matmul Microbenchmarks (Single-threaded)

| Shape            | Before w/o exp  | After w/o exp   | After w/ exp |
| ---------------- | --------------- | --------------- | ------------ |
| 1x384x384        |        0.025 ms |        0.095 ms |     0.024 ms |
| 128x384x384      |        0.311 ms |        0.337 ms |     0.355 ms |
| 128x1536x384     |         1.51 ms |         1.53 ms |      1.59 ms |
| 128x384x1536     |         1.51 ms |         1.43 ms |      1.52 ms |
| 384x384x512      |         1.45 ms |         1.30 ms |      1.38 ms |
| 384x128x128      |        0.108 ms |        0.107 ms |     0.122 ms |
| 384x128x512      |        0.453 ms |        0.458 ms |     0.511 ms |
| 384x512x128      |        0.414 ms |        0.408 ms |     0.415 ms |
| 384x512x2        |        0.366 ms |        0.860 ms |     0.582 ms |
| 384x384x32       |        0.102 ms |        0.137 ms |     0.121 ms |
| 64x384x64        |        0.028 ms |        0.030 ms |     0.028 ms |
| 64x1536x64       |        0.105 ms |        0.099 ms |     0.100 ms |
| 192x256x128      |        0.106 ms |        0.101 ms |     0.110 ms |
| 260x300x280      |         14.2 ms |        0.836 ms |     0.823 ms |
| 1000x1000x1000   |          684 ms |         36.0 ms |      35.9 ms |
| 1024x1024x1024   |         41.6 ms |         39.9 ms |      40.7 ms |
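
The table can be read more easily as ratios. The following minimal sketch computes the speedup between the "Before w/o exp" and "After w/o exp" columns and an approximate throughput for a few rows. The function names are hypothetical, and the throughput assumes the usual 2*M*N*K floating-point operation count for a matmul; since the three dims are multiplied together, their order does not affect the result.

```python
def speedup(before_ms: float, after_ms: float) -> float:
    """Ratio > 1 means the new pipeline is faster; < 1 means a regression."""
    return before_ms / after_ms


def matmul_gflops(d0: int, d1: int, d2: int, time_ms: float) -> float:
    """Approximate GFLOP/s, counting one multiply and one add per inner step."""
    flops = 2.0 * d0 * d1 * d2
    return flops / (time_ms * 1e-3) / 1e9


# (shape) -> (Before w/o exp, After w/o exp), in milliseconds, from the table.
ROWS = {
    (1, 384, 384): (0.025, 0.095),
    (384, 512, 2): (0.366, 0.860),
    (260, 300, 280): (14.2, 0.836),
    (1000, 1000, 1000): (684.0, 36.0),
    (1024, 1024, 1024): (41.6, 39.9),
}

for shape, (before, after) in ROWS.items():
    print(
        shape,
        f"speedup: {speedup(before, after):.2f}x",
        f"after: {matmul_gflops(*shape, after):.1f} GFLOP/s",
    )
```

The two shapes that are not multiples of common tile sizes, 260x300x280 and 1000x1000x1000, improve by roughly 17x and 19x, while 1x384x384 and 384x512x2 come out below 1x, which matches the regressions described in the commit message.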
e3ded85

IREE: Intermediate Representation Execution Environment

IREE (Intermediate Representation Execution Environment, pronounced as "eerie") is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.

See our website for project details, user guides, and instructions on building from source.

Project Status

IREE is still in its early phase. We have settled on the overarching infrastructure and are actively improving various software components as well as project logistics. It is still quite far from ready for everyday use and is made available without any support at the moment. That said, we welcome any kind of feedback on any communication channel!

Communication Channels

Related Project Channels

  • MLIR topic within LLVM Discourse: IREE is enabled by and heavily relies on MLIR. IREE is sometimes referred to in certain MLIR discussions. Useful if you are also interested in MLIR evolution.

Build Status

| CI System | Build System  | Platform   | Architecture          | Configuration / Component    | Status                                          |
| --------- | ------------- | ---------- | --------------------- | ---------------------------- | ----------------------------------------------- |
| Kokoro    | Bazel         | Linux      | x86-64                |                              | kokoro status bazel/linux/x86-swiftshader/core  |
| Kokoro    | CMake & Bazel | Linux      | x86-64 (swiftshader)  | Integrations                 | kokoro status cmake-bazel/linux/x86-swiftshader |
| Kokoro    | CMake & Bazel | Linux      | x86-64 (turing)       | Integrations                 | kokoro status cmake-bazel/linux/x86-turing      |
| Kokoro    | CMake         | Linux      | x86-64 (swiftshader)  |                              | kokoro status cmake/linux/x86-swiftshader       |
| Kokoro    | CMake         | Linux      | x86-64 (swiftshader)  | asan                         | kokoro status cmake/linux/x86-swiftshader-asan  |
| Kokoro    | CMake         | Linux      | x86-64 (turing)       |                              | kokoro status cmake/linux/x86-turing            |
| Kokoro    | CMake         | Android    | arm64-v8a             | Runtime (build only)         | kokoro status cmake/android/arm64-v8a           |
| Kokoro    | CMake         | Bare Metal | risc-v-32             | Runtime                      | kokoro status cmake/baremetal/riscv32           |
| Kokoro    | CMake         | Linux      | risc-v-64             | Runtime                      | kokoro status cmake/linux/riscv64               |
| Buildkite | CMake         | Android    | arm64-v8a             | Runtime                      | buildkite status iree-android-arm64-v8a         |
| Buildkite | CMake         | Android    | arm64-v8a             | Runtime Benchmarks           | buildkite status iree-benchmark                 |
| Buildkite | CMake         | Linux      | x86-64                | Tracing + Standalone Runtime | buildkite status iree-build-configurations      |

Architecture Overview

IREE Architecture

See our website for more information.

Presentations and Talks

  • 2021-06-09: IREE Runtime Design Tech Talk (recording and slides)
  • 2020-08-20: IREE CodeGen: MLIR Open Design Meeting Presentation (recording and slides)
  • 2020-03-18: Interactive HAL IR Walkthrough (recording)
  • 2020-01-31: End-to-end MLIR Workflow in IREE: MLIR Open Design Meeting Presentation (recording and slides)

License

IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.