MobileKernelBench evaluates LLM-generated mobile operator implementations against native inference frameworks. The repository currently contains NCNN and MNN pipelines plus a MoKA agent wrapper that can iteratively generate, compile, verify, and benchmark operator code.
This top-level README covers repository setup, datasets, environment requirements, and MoKA usage. Framework-specific commands live in the pipeline READMEs (mnnpipeline/README.md and ncnnpipeline/README.md).
Core directories:
- `MoKA/`: agent loop and unified pipeline interface.
- `mnnpipeline/`: MNN standalone pipeline, operator map, runtime helpers, CLI, and MNN-specific documentation.
- `ncnnpipeline/`: NCNN standalone pipeline, operator map, verification helpers, and NCNN-specific documentation.
- `pipeline_MNN/` and `pipeline_NCNN/`: compatibility packages used by MoKA.
- `prompt/`: common prompt and LLM API helpers.
- `MNN_utils/`: legacy MNN utilities and the MNN dataset used by the new `mnnpipeline`.
- `dataset/`: NCNN/PyTorch datasets and converted model artifacts.
- `MNN-3.3.0/`: local MNN 3.3.0 checkout used for MNN evaluation. This is third-party source code.
Important index files:
- `mnnpipeline/mnn_op_map.yaml`: maps MNN task names to PyTorch files, ONNX files, MNN model paths, target source folders, and allowed generated source files.
- `ncnnpipeline/ncnn_op_map.yaml`: maps NCNN task names to NCNN source file/layer/test metadata.
NCNN uses the dataset/ tree:
- `dataset/Mobilekernelbench`: PyTorch reference models.
- `dataset/Mobilekernelbench_onnx`: ONNX exports.
- `dataset/Mobilekernelbench_onnx_ncnn`: NCNN conversion workspace.
- `dataset/Mobilekernelbench_pt`: TorchScript or PyTorch-converted artifacts.
- `dataset/Mobilekernelbench_pt_ncnn_success`: NCNN-converted models used for benchmark inputs.
MNN uses the MNN_utils/dataset/ tree:
- `MNN_utils/dataset/mnn_dataset_test/Dataset_version1`: PyTorch reference models.
- `MNN_utils/dataset/mnn_dataset_test/Dataset_version1_onnx`: ONNX reference models.
- `MNN_utils/dataset/mnn_models/Dataset_version1`: generated or original MNN model files grouped by model/provider.
- `MNN_utils/dataset/mnn_models/Dataset_version1/original`: MNN models produced by the original MNN implementation during original-op evaluation.
The MNN map and NCNN map are the source of truth for locating task files. Prefer adding new task metadata there instead of hard-coding paths in scripts.
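For scripts that need task metadata, load the map rather than duplicating paths. A minimal sketch, assuming the map is a YAML mapping keyed by task name (the field name used below is illustrative, not confirmed):

```python
import yaml

# Load the MNN operator map; the top-level structure is assumed to be
# a mapping from task name to a metadata entry.
with open("mnnpipeline/mnn_op_map.yaml") as f:
    op_map = yaml.safe_load(f)

entry = op_map["Abs"]  # "Abs" is a task name used elsewhere in this README
# A key like "onnx" is hypothetical; inspect the YAML for the real field names.
print(sorted(entry), entry.get("onnx"))
```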
Use the repository root as the working directory:
```
cd /Users/zeezou/python/project/MobileKernelBench_git
```
Create or activate a Python environment, then install the common dependencies:
```
pip install pyyaml tqdm numpy onnx onnxruntime openai torch
```
For LLM-backed modes, configure OpenRouter:
```
export OPENROUTER_API_KEY='your-openrouter-key'
export OPENROUTER_MAX_TOKENS='your-max-token'
```
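The repository's prompt helpers read these variables when calling the LLM. As a rough sketch of the pattern (not the repository's exact code): OpenRouter exposes an OpenAI-compatible endpoint, so the `openai` client can target it directly.

```python
import os
from openai import OpenAI

# OpenRouter speaks the OpenAI chat API; only the base URL and key differ.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

resp = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.5",  # a model id as passed via --model
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=int(os.environ.get("OPENROUTER_MAX_TOKENS", "4096")),
)
print(resp.choices[0].message.content)
```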
For Android performance tests, prepare the following (a preflight check is sketched after this list):
- `adb` available in `PATH`.
- An authorized Android device shown by `adb devices`.
- `ANDROID_NDK` set to a valid Android NDK path.
- CMake and Make available on the host.
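A minimal preflight script, using nothing beyond the tools listed above, can catch missing pieces before a long benchmark run:

```python
import os
import shutil
import subprocess

# Check that the required host tools are on PATH.
for tool in ("adb", "cmake", "make"):
    if shutil.which(tool) is None:
        raise SystemExit(f"{tool} not found in PATH")

# ANDROID_NDK must point at an existing directory.
ndk = os.environ.get("ANDROID_NDK", "")
if not os.path.isdir(ndk):
    raise SystemExit("ANDROID_NDK is unset or not a directory")

# At least one authorized device should appear in `adb devices`
# (authorized devices end with "device"; unauthorized ones do not).
out = subprocess.run(["adb", "devices"], capture_output=True, text=True).stdout
devices = [l for l in out.splitlines()[1:] if l.strip().endswith("device")]
if not devices:
    raise SystemExit("no authorized Android device found")
print("environment OK:", devices)
```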
Framework source requirements:
- MNN: use `MNN-3.3.0/` in this repository, or pass another checkout with `--mnn-root`.
- NCNN: use an NCNN checkout at `ncnn/`; see ncnnpipeline/README.md for setup details.
MoKA wraps a framework pipeline and runs iterative rounds. Each round can generate code, compile it, verify correctness, benchmark it, and use failure information to build the next prompt.
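The round structure can be pictured roughly as the loop below. This is an illustrative sketch only; the helper names (`initial_prompt`, `compile_and_verify`, `benchmark`, `build_repair_prompt`) are hypothetical and do not mirror MoKA's actual interfaces:

```python
def run_rounds(task, pipeline, llm, max_rounds=5):
    """Illustrative MoKA-style repair loop: generate, build, verify, benchmark."""
    prompt = pipeline.initial_prompt(task)           # hypothetical helper
    for _ in range(max_rounds):
        code = llm.generate(prompt)                  # LLM proposes operator code
        result = pipeline.compile_and_verify(task, code)
        if result.correct:
            result.latency = pipeline.benchmark(task, code)
            return result                            # success: stop iterating
        # Feed compiler errors / mismatches back into the next prompt.
        prompt = pipeline.build_repair_prompt(task, code, result.errors)
    return None                                      # exhausted rounds without success
```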
Show all options:
```
python MoKA/run_moka.py --help
```
Run an MNN task:
```
python MoKA/run_moka.py \
  --framework mnn \
  --task-name Abs \
  --mnn-root path/to/your/mnn/folder \
  --model model/from/openrouter \
  --max-rounds 5
```
Useful MNN options:
- `--mnn-op-map`: defaults to `mnnpipeline/mnn_op_map.yaml`.
- `--mnn-data-root`: defaults to `MNN_utils/dataset/mnn_dataset_test/Dataset_version1`.
- `--mnn-onnx-root`: defaults to `MNN_utils/dataset/mnn_dataset_test/Dataset_version1_onnx`.
- `--mnn-converted-root`: defaults to `MNN_utils/dataset/mnn_models/Dataset_version1`.
- `--mnn-root`: path to the MNN source checkout.
Run an NCNN task:
```
python MoKA/run_moka.py \
  --framework ncnn \
  --task-name Abs \
  --ncnn-op-map ncnnpipeline/ncnn_op_map.yaml \
  --ncnn-data-root dataset/Mobilekernelbench \
  --ncnn-converted-root dataset/Mobilekernelbench_pt_ncnn_success \
  --ncnn-prompt-config prompt/prompt_template.yaml \
  --model anthropic/claude-sonnet-4.5 \
  --max-rounds 5
```
MoKA writes round-level and final files under:
```
MoKA_plus_response/<framework>/<task_name>/
MoKA_plus_results/<framework>/<task_name>.json
```
Typical files include prompts, LLM responses, pipeline results, memory history, and final summaries.
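To inspect a finished run programmatically, the final JSON can be loaded directly. A minimal sketch; the fields inside the JSON are not documented here, so the code prints the keys rather than assuming a schema:

```python
import json
from pathlib import Path

# Path layout follows the output convention above.
result_path = Path("MoKA_plus_results") / "mnn" / "Abs.json"
with result_path.open() as f:
    summary = json.load(f)

print(sorted(summary))  # discover the top-level fields of the summary
```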
Use standalone pipelines when you want to test one framework without the MoKA repair loop.
- MNN standalone commands: mnnpipeline/README.md
- NCNN standalone commands: ncnnpipeline/README.md
Recommended workflow:
- Run standalone correctness without benchmark first.
- Enable Android benchmark only after host correctness passes.
- Use MoKA after the standalone pipeline is confirmed to work for the target framework.
We build our pipelines on MNN and NCNN and thank both projects for their great work; see their repositories at MNN and NCNN.
If you find MobileKernelBench useful, please cite:
```
@misc{zou2026mobilekernelbenchllmswriteefficient,
  title={MobileKernelBench: Can LLMs Write Efficient Kernels for Mobile Devices?},
  author={Xingze Zou and Jing Wang and Yuhua Zheng and Xueyi Chen and Haolei Bai and Lingcheng Kong and Syed A. R. Abu-Bakar and Zhaode Wang and Chengfei Lv and Haoji Hu and Huan Wang},
  year={2026},
  eprint={2603.11935},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2603.11935},
}
```