---
title: "A step by step guide to build the HELLO dialect for MLIR"
description: "Building an MLIR dialect from scratch"
author: "Victor Guerra"
date: 2023-05-16
date-modified: last-modified
date-format: full
draft: true
page-layout: full
categories: 
    - mlir
    - compilers
    - build-systems
    - posts
---

In this series of posts, we are going to see how to build an MLIR dialect from scratch. The dialect we are building here is the [*MLIR Hello Dialect*](https://github.com/Lewuathe/mlir-hello) by [Kai Sasaki](https://www.lewuathe.com/).

## Prerequisites

Throughout these posts I am going to assume you have a local build of MLIR. If you don't, you can check out the [official getting started guide](https://mlir.llvm.org/getting_started/) for MLIR.

## Overview of the Hello dialect

The hello dialect provides 3 basic operations:

* `hello.constant`: Constant operation that turns a literal into an SSA value.
* `hello.print`: You guessed it, a print operation.
* `hello.world`: Operation that prints "Hello, World".

For a more detailed description of the operations, check out the [README of the Hello Dialect repo](https://github.com/Lewuathe/mlir-hello#operations).

## Setting up the build system

We start by creating a directory for our project, which we name `hello-dialect`.

```bash
$> mkdir hello-dialect
```

Next, we set up CMake with a minimal set of commands to configure and build our project. For that we need a `CMakeLists.txt` file:

```cmake
cmake_minimum_required(VERSION 3.20.0)

project(mlir-hello LANGUAGES CXX C)

set(CMAKE_CXX_STANDARD 17 CACHE STRING "C++ standard to conform to")

find_package(MLIR REQUIRED CONFIG)

message(STATUS "Using MLIRConfig.cmake in: ${MLIR_DIR}")
message(STATUS "Using LLVMConfig.cmake in: ${LLVM_DIR}")
```

First, we name the project `mlir-hello` and specify the languages needed to build it, `C++` and `C` in this case. Then we locate the `MLIR` package (and, through it, `LLVM`) and load its configuration. This is the most basic `CMake` setup that you need to start working on your own dialect. You can already configure your project using this `bash` script:


```bash
#!/bin/bash

rm -rf build
mkdir build


pushd build

LLVM_BUILD_DIR=<PATH TO LLVMs BUILD DIRECTORY>

cmake -G Ninja .. \
    -DLLVM_DIR="$LLVM_BUILD_DIR/lib/cmake/llvm" \
    -DMLIR_DIR="$LLVM_BUILD_DIR/lib/cmake/mlir" \
    -DCMAKE_BUILD_TYPE=Debug

popd
```

Let's name it `build.sh` and make it executable (`chmod +x build.sh`). Note that `LLVM_BUILD_DIR` should point to the `build` directory of your local MLIR build.
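A common stumbling block at this point is a `LLVM_BUILD_DIR` that does not actually contain the CMake package files. The following is a minimal sketch of a pre-flight check that could be added near the top of `build.sh`. The helper name `check_llvm_build_dir` is hypothetical and not part of the original script; the two paths it checks are the ones passed to `-DLLVM_DIR` and `-DMLIR_DIR` above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original build.sh): verify that a
# directory looks like an LLVM/MLIR build tree before handing it to CMake.
check_llvm_build_dir() {
    local dir="$1"
    local cfg
    for cfg in lib/cmake/llvm/LLVMConfig.cmake lib/cmake/mlir/MLIRConfig.cmake; do
        if [ ! -f "$dir/$cfg" ]; then
            # Report the first missing package file on stderr.
            echo "missing $dir/$cfg" >&2
            return 1
        fi
    done
    return 0
}
```

Inside `build.sh` it could be used as `check_llvm_build_dir "$LLVM_BUILD_DIR" || exit 1`, placed before the `cmake` invocation.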

If we run our `build.sh` script, we see the following output:

```bash
$> ./build.sh
...
-- Using MLIRConfig.cmake in: <PATH TO LLVMs BUILD DIRECTORY>/build/lib/cmake/mlir
-- Using LLVMConfig.cmake in: <PATH TO LLVMs BUILD DIRECTORY>/build/lib/cmake/llvm
-- Configuring done
-- Generating done
-- Build files have been written to: <PATH TO PROJECT DIRECTORY>/hello-dialect/build
```

## Defining the dialect

Now we move forward to define the Hello Dialect in the [ODS](https://mlir.llvm.org/docs/DefiningDialects/Operations/) format. Let's start with the most basic definition.

```cpp
#ifndef HELLO_DIALECT
#define HELLO_DIALECT

include "mlir/IR/OpBase.td"

//===----------------------------------------------------------------------===//
// Hello dialect definition.
//===----------------------------------------------------------------------===//

def Hello_Dialect : Dialect {
    let name = "hello";
    let summary = "A hello out-of-tree MLIR dialect.";
    let description = [{
        This dialect is minimal example to implement hello-world kind of sample code
        for MLIR.
    }];
    let cppNamespace = "::hello";
    let hasConstantMaterializer = 1;
}

#endif // HELLO_DIALECT
```

Let's name this file `HelloDialect.td`. But where do we place it? Following the recommendation of the [`Creating a Dialect`](https://mlir.llvm.org/docs/Tutorials/CreatingADialect/) tutorial, we should place the `TableGen` files in the `include` directory. So let's create the following directory layout:

```bash
./
├── CMakeLists.txt
└── include/
    ├── CMakeLists.txt
    └── Hello/
        ├── CMakeLists.txt
        └── HelloDialect.td
```
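As a small sketch, the skeleton above can be created with the commands below, run from the project root (`hello-dialect`). The files start out empty; their contents are given in the following steps.

```shell
#!/bin/bash
# Sketch: create the include/ skeleton from inside the project root.
# The top-level CMakeLists.txt already exists, so it is not touched here.
mkdir -p include/Hello

# Empty placeholders, to be filled in by the next steps.
touch include/CMakeLists.txt \
      include/Hello/CMakeLists.txt \
      include/Hello/HelloDialect.td
```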

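Let's have a look at what the newly introduced `CMakeLists.txt` files should contain. Keep in mind that the `add_mlir_dialect` and `add_mlir_doc` commands used below are provided by MLIR's `AddMLIR` CMake module. For them to be available, the top-level `CMakeLists.txt` also has to add `${MLIR_CMAKE_DIR}` and `${LLVM_CMAKE_DIR}` to `CMAKE_MODULE_PATH`, include the `TableGen`, `AddLLVM` and `AddMLIR` modules, and call `add_subdirectory(include)`.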

`./include/CMakeLists.txt` is simple: we just need to signal the existence of the `Hello` directory:

```cmake
add_subdirectory(Hello)
```

`./include/Hello/CMakeLists.txt` contains some `MLIR` related commands:

```cmake
add_mlir_dialect(HelloDialect hello)
add_mlir_doc(HelloDialect HelloDialect Hello/ -gen-dialect-doc)
```

`add_mlir_dialect` declares a dialect in the include directory. Its first parameter is the base name of the `.td` file (without the extension), which is also used to name the generated files and the `MLIR<name>IncGen` target. Its second parameter is the dialect namespace.

`add_mlir_doc` generates the documentation for the dialect. Its parameters are the `.td` filename, the name of the output markdown file, the name of the directory in which to put the documentation, and the `mlir-tblgen` command to execute.

## Generating documentation and include files

Next, let's run our `build.sh` script to generate the build files and explore some of the new ninja targets that have been generated.

```bash
$> ./build.sh && ninja -C build -t targets
...
mlir-headers: phony
...
mlir-doc: phony
...
HelloDialectDocGen: phony
MLIRHelloDialectIncGen: phony
...
```

Among the listed targets, we get `HelloDialectDocGen`, which generates our dialect documentation, and `MLIRHelloDialectIncGen`, which generates the C++ files that contain our dialect declaration and definition. These targets are dependencies of the `mlir-doc` and `mlir-headers` targets respectively, so building the latter builds the former.

Let's first generate the docs with the `mlir-doc` target:

```bash
$> ninja -C build mlir-doc
ninja: Entering directory `build'
[1/2] Building HelloDialect.md...
```

Having a look at the `build` directory, we notice the generated markdown file. Not much content in there but it is a good start.

```bash
$> tree ./build/docs
./build/docs
└── Hello
    └── HelloDialect.md

2 directories, 1 file
```

Now, header files:

```bash
$> ninja -C build mlir-headers
ninja: Entering directory `build'
[6/6] Building HelloDialectTypes.h.inc...
```

```bash
$> tree ./build/include
./build/include
├── CMakeFiles
├── Hello
│   ├── CMakeFiles
│   ├── HelloDialect.cpp.inc
│   ├── HelloDialect.cpp.inc.d
│   ├── HelloDialect.h.inc
│   ├── HelloDialect.h.inc.d
│   ├── HelloDialect.md
│   ├── HelloDialect.md.d
│   ├── HelloDialectDialect.cpp.inc
│   ├── HelloDialectDialect.cpp.inc.d
│   ├── HelloDialectDialect.h.inc
│   ├── HelloDialectDialect.h.inc.d
│   ├── HelloDialectTypes.cpp.inc
│   ├── HelloDialectTypes.cpp.inc.d
│   ├── HelloDialectTypes.h.inc
│   ├── HelloDialectTypes.h.inc.d
│   └── cmake_install.cmake
└── cmake_install.cmake

4 directories, 16 files
```

At this very early stage, if you have a look at `./build/include/Hello/HelloDialectDialect.cpp.inc` you'll find the `C++` dialect definition:

```cpp
/*===- TableGen'erated file -------------------------------------*- C++ -*-===*\
|*                                                                            *|
|* Dialect Definitions                                                        *|
|*                                                                            *|
|* Automatically generated file, do not edit!                                 *|
|*                                                                            *|
\*===----------------------------------------------------------------------===*/

MLIR_DEFINE_EXPLICIT_TYPE_ID(::hello::HelloDialect)
namespace hello {

HelloDialect::HelloDialect(::mlir::MLIRContext *context)
    : ::mlir::Dialect(getDialectNamespace(), context, ::mlir::TypeID::get<HelloDialect>()) {
  
  initialize();
}

HelloDialect::~HelloDialect() = default;

} // namespace hello
```

## Dialect operations

In order to introduce our Hello operations, we need the base operation `Hello_Op`:

```cpp
class Hello_Op<string mnemonic, list<Trait> traits = []> :
        Op<Hello_Dialect, mnemonic, traits>;
```

Based on it, we can introduce the 3 operations we need:

* `hello.constant`: 
```cpp
def ConstantOp : Hello_Op<"constant", [Pure]> {
  let summary = "constant";
  let description = [{
    Constant operation turns a literal into an SSA value. The data is attached
    to the operation as an attribute. For example:

    ```mlir
      %0 = "hello.constant"()
      { value = dense<[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]> : tensor<2x3xf64> }
      : () -> tensor<2x3xf64>
    ```
  }];

  let builders = [
    OpBuilder<(ins "mlir::DenseElementsAttr":$value), [{
      build($_builder, $_state, value.getType(), value);
    }]>,
    OpBuilder<(ins "double":$value)>
  ];

//  let parser = [{ return ::parseConstantOp(parser, result); }];
  let arguments = (ins F64ElementsAttr:$value);
  let results = (outs F64Tensor);
}
```

* `hello.print`:
```cpp
def PrintOp : Hello_Op<"print", [Pure]> {
    let summary = "print operation";
    let description = [{
        The "print" builtin operation prints a given input tensor, and produces
        no results.
    }];

    // The print operation takes an input tensor to print.
    let arguments = (ins AnyTypeOf<[F64Tensor, F64MemRef]>:$input);

    let assemblyFormat = "$input attr-dict `:` type($input)";
}
```

* `hello.world`:
```cpp
def WorldOp : Hello_Op<"world", [Pure]> {
    let summary = "print Hello, World";
    let description = [{
        The "world" operation prints "Hello, World", and produces
        no results.
    }];
}
```

We place all these definitions in a file called `HelloOps.td` within our `include/Hello` directory. You can have a look at the entire definition of the [`HelloOps.td`](https://github.com/Lewuathe/mlir-hello/blob/main/include/Hello/HelloOps.td) file in the GitHub repo.

One important thing to note is that `HelloOps.td` needs to include the definition of the dialect using TableGen's `include` [mechanism](https://bcain-llvm.readthedocs.io/projects/llvm/en/latest/TableGen/LangRef/#syntax):

```cpp
include "HelloDialect.td"
```

Our `include/Hello` directory now looks as follows:

```bash
$> tree ./include/Hello
./include/Hello
├── CMakeLists.txt
├── HelloDialect.td
└── HelloOps.td

1 directory, 3 files
```

Now we need to make a couple of changes to our `include/Hello/CMakeLists.txt` file:

```cmake
add_mlir_dialect(HelloOps hello)
add_mlir_doc(HelloDialect HelloDialect Hello/ -gen-dialect-doc)
add_mlir_doc(HelloOps HelloOps Hello/ -gen-op-doc)
```

Note that for the dialect definition we now use `HelloOps` instead of `HelloDialect`, as the former includes the latter. We also want to generate documentation for the operations we just introduced, so we add an `add_mlir_doc` command for `HelloOps`.

You can now re-run the `build.sh` script and generate the docs and headers again:

```bash
$> ./build.sh && ninja -C build -v mlir-doc mlir-headers
```

::: {.callout-tip}
Running `ninja` in verbose mode (`-v` option) will show you all `mlir-tblgen` commands.
:::

We can now check that docs for the dialect and the ops were generated:

```bash
$> tree build/docs
build/docs
└── Hello
    ├── HelloDialect.md
    └── HelloOps.md

2 directories, 2 files
```

As well as the corresponding C++ definitions and declarations:

```bash
$> tree build/include/Hello
build/include/Hello
├── CMakeFiles
├── HelloDialect.md
├── HelloDialect.md.d
├── HelloOps.cpp.inc
├── HelloOps.cpp.inc.d
├── HelloOps.h.inc
├── HelloOps.h.inc.d
├── HelloOps.md
├── HelloOps.md.d
├── HelloOpsDialect.cpp.inc
├── HelloOpsDialect.cpp.inc.d
├── HelloOpsDialect.h.inc
├── HelloOpsDialect.h.inc.d
├── HelloOpsTypes.cpp.inc
├── HelloOpsTypes.cpp.inc.d
├── HelloOpsTypes.h.inc
├── HelloOpsTypes.h.inc.d
└── cmake_install.cmake

2 directories, 17 files
```

Note the change in the filename pattern: the generated files are now prefixed with `HelloOps` instead of `HelloDialect`, because `HelloOps.td` is now the file we pass to `add_mlir_dialect`.
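To make the naming pattern explicit, here is a small sketch that derives the expected `.inc` filenames from the `.td` base name and reports which ones are missing from a directory. Both helper names are hypothetical, and the suffix list simply mirrors the files seen in the `tree` output above; it is an assumption about this MLIR version rather than a documented interface.

```shell
#!/bin/bash
# Hypothetical helper: print the .inc files expected for a .td base name.
# The suffixes are taken from the tree output, not from an official API.
expected_inc_files() {
    local base="$1"
    local suffix
    for suffix in .h.inc .cpp.inc Dialect.h.inc Dialect.cpp.inc Types.h.inc Types.cpp.inc; do
        echo "${base}${suffix}"
    done
}

# Hypothetical helper: print the expected files that are absent from a directory.
missing_inc_files() {
    local base="$1"
    local dir="$2"
    local f
    for f in $(expected_inc_files "$base"); do
        [ -f "$dir/$f" ] || echo "$f"
    done
}
```

For example, `missing_inc_files HelloOps build/include/Hello` should print nothing once `mlir-headers` has been built, while passing `HelloDialect` as the base name would now list all six files.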

To inspect the op declarations and definitions, have a look at the files `build/include/Hello/HelloOps.*.inc`.
