feat(kernelgen): import NKIPyKernelGen as a subfolder #55
Import the open_source branch of NKIPyKernelGen into `kernelgen/` as a
self-contained subpackage. NKIPyKernelGen is a compiler that traces NumPy
functions and lowers them to NISA (Neuron Instruction Set Architecture)
for AWS Neuron hardware. Users write kernels in Python with `@trace` and
`knob.knob()` annotations; the compiler handles tiling, memory placement,
layout legalization, and NISA lowering.
What's included
---------------
- `kernelgen/nkipy_kernelgen/`: Python tracing frontend
  - `trace.py` (`@trace` decorator)
  - `knob.py` (tensor annotations: mem_space, tile_size, reduction_tile,
    partition_dim)
  - `traced_array.py` (TracedArray wrapping MLIR SSA values)
  - `op_vtable.py` (NumPy op → MLIR lowering table)
  - `transforms/nkipy_opt.py` (pipeline orchestration, shells out to
    `nkipy-opt`)
- `kernelgen/mlir/`: MLIR dialect + C++ passes
  - `nkipy.annotate` op (target, mem_space, partition_dim, tile_size,
    reduction_tile)
  - 20+ transformation passes under `mlir/lib/Transforms/` implementing
    the 24-pass compilation pipeline (InferLayout, KnobDrivenTiling,
    AnnotateMemorySpace, LegalizeLayout, InsertSpillReload,
    LinalgToNisa, etc.)
- `kernelgen/tests/`: test suite
  - `passes/`: per-pass FileCheck tests
  - `e2e/`: end-to-end tests (trace → NISA → BIR sim / HW)
  - `unit/`: Python-level unit tests
  - `harness.py`: unified test harness with LLVM/BIR_SIM/HW/FileCheck
    modes
- `kernelgen/examples/`: example kernels
- `kernelgen/compiler_explorer/`: Compiler Explorer wrapper for inspecting
  IR at any pipeline stage
- `kernelgen/setup.py`, `pyproject.toml`, `pytest.ini`, `requirements.txt`:
  build + test configuration (`pip install -e kernelgen/` builds the
  C++ passes via CMake)
- `kernelgen/CLAUDE.md`, `README.md`: pipeline docs and usage notes
Architecture notes
------------------
NKIPyKernelGen depends on the NISA dialect defined in private-nki-staging
(the `nki` wheel). NKIPyKernelGen's `nkipy-opt` binary performs the
tensor-level and bufferization phases; lowering to BIR then runs through
the upstream `nki-opt-pipeline`. This import does not bring in the NISA
dialect sources — only NKIPyKernelGen's own passes and frontend.
Ignore rules
------------
Added a `!mlir/lib/` override in `kernelgen/.gitignore` so the parent
nkipy repo's `lib/` rule (intended for Python venv `lib/` dirs) does not
silently exclude the MLIR C++ pass sources under `kernelgen/mlir/lib/`.
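The interaction between the two ignore rules can be checked in a throwaway repository. The sketch below is illustrative only: it assumes `git` is on `PATH`, and the helper name `tracked_after_add` and the sample file names (`Pass.cpp`, `venv/lib/site.py`) are hypothetical stand-ins, not files from this PR.

```python
import pathlib
import shutil
import subprocess
import tempfile


def tracked_after_add():
    """Recreate the two ignore rules in a temp repo and return the paths git tracks.

    Returns None when git is not installed, so callers can skip the check.
    """
    if shutil.which("git") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        files = {
            # Parent repo rule, intended for Python venv lib/ directories.
            ".gitignore": "lib/\n",
            # Override added in kernelgen/.gitignore.
            "kernelgen/.gitignore": "!mlir/lib/\n",
            # Hypothetical C++ pass source that must stay tracked.
            "kernelgen/mlir/lib/Transforms/Pass.cpp": "// pass source\n",
            # Hypothetical venv file that should remain ignored.
            "venv/lib/site.py": "# venv file\n",
        }
        for rel, text in files.items():
            path = root / rel
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(text)
        subprocess.run(["git", "init", "-q"], cwd=root, check=True)
        # `git add -A` silently skips ignored paths.
        subprocess.run(["git", "add", "-A"], cwd=root, check=True)
        listing = subprocess.run(
            ["git", "ls-files"], cwd=root, check=True,
            capture_output=True, text=True,
        ).stdout
        return set(listing.splitlines())


if __name__ == "__main__":
    print(sorted(tracked_after_add() or []))
```

The override works because the more deeply nested `.gitignore` takes precedence and the negated pattern re-includes the `lib/` directory itself rather than a file beneath an already excluded directory.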
Source
------
Imported from NKIPyKernelGen `open_source` branch @ commit 973c1be
("fix: correct mem_space enum values in builder.annotate()"). Internal
git history is not preserved — this is a single squash import for the
open-source release.
vgene left a comment
The PR is quite big, with some duplicated components (like tests) and some components that should be in a separate PR (like the Compiler Explorer).
Let's also review the necessity of all code under /mlir and /tests. It's best to purge any code that may be unnecessary at merge time; we can always add it back in later PRs.
@@ -0,0 +1,3 @@
[build-system]
Let's make sure this is properly integrated with uv.
The top-level pyproject.toml should support kernelgen in a monorepo structure.
| Marker | Meaning |
|----------|---------|
| `llvm_sim` | LLVM JIT simulation test (CPU) |
| `bir_sim` | BIR simulation test (CPU) |
I'm not convinced we need to include the BIR sim tests. The device is the golden reference, as in the nkipy main backend.
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
"""
Ported from nkipy/tests/kernels/attention_dynamo.py
We should deduplicate all nkipy tests. The top-level /tests folder can support the kernelgen path.
@@ -0,0 +1,99 @@
# NKIPy KernelGen
Let's discuss the name "KernelGen".
I don't think it's the best name. Would "NKI backend" be a better name?
@@ -0,0 +1,23 @@
---
Let's use uv to manage the build and avoid these scripts/skill.
@@ -0,0 +1,17 @@
# Compiler Explorer local configuration for NKIPy MLIR compiler
The Compiler Explorer feels orthogonal to the main component. Let's delay the merge of this part.
1/ This should be a top-level folder; we can also support Compiler Explorer for the HLO backend.
2/ The examples etc. should not be part of the Compiler Explorer.
namespace nkipy {

/// Maximum partition dimension size for NeuronCore hardware.
static constexpr int64_t MAX_PARTITION_DIM = 128;
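To make the 128 limit concrete, here is a minimal sketch of splitting a larger dimension into partition-sized tiles. It assumes plain contiguous tiling with a ragged last tile; `partition_tiles` is a hypothetical helper written for illustration and is not part of NKIPyKernelGen or its tiling passes.

```python
# Value copied from the C++ constant in the diff above.
MAX_PARTITION_DIM = 128


def partition_tiles(dim_size, tile_size=MAX_PARTITION_DIM):
    """Return half-open (start, stop) ranges that cover [0, dim_size).

    Every tile spans at most `tile_size` elements; the last tile may be
    shorter when `dim_size` is not a multiple of `tile_size`.
    """
    if dim_size < 0:
        raise ValueError("dim_size must be non-negative")
    if not 0 < tile_size <= MAX_PARTITION_DIM:
        raise ValueError("tile_size must be in 1..MAX_PARTITION_DIM")
    return [
        (start, min(start + tile_size, dim_size))
        for start in range(0, dim_size, tile_size)
    ]


if __name__ == "__main__":
    # A 300-row dimension needs three tiles: two full, one partial.
    print(partition_tiles(300))
```

A real tiling pass would also have to account for layout and memory-space constraints, which this sketch ignores.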
Superseded by #58, which incorporates the review feedback (renamed to |
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.