microsoft · timenick · May 12, 2026
@@ -4,8 +4,8 @@ We're always looking for your help to improve the product (bug fixes, new featur
 
 ## Contribute a code change
 
-* Start by reading the project [README](./README.md) to understand the scope and goals of ModelKit.
-* If your change is non-trivial or introduces new public facing APIs, please use the [feature request issue template](https://github.com/microsoft/ModelKit/issues/new) to discuss it with the team first.
+* Start by reading the project [README](./README.md) to understand the scope and goals of WinML CLI.
+* If your change is non-trivial or introduces new public facing APIs, please use the [feature request issue template](https://github.com/microsoft/WinML-ModelKit/issues/new) to discuss it with the team first.
 * For all other changes, you can directly create a pull request (PR) and we'll be happy to take a look.
 * Make sure your PR adheres to the coding conventions and standards below.
 
@@ -22,7 +22,7 @@ This installs all dependencies and enables [pre-commit hooks](https://pre-commit
 
 ### Runtime check rules
 
-When running ModelKit from a source tree (`uv run winml ...`), you need to populate the runtime check rule zips locally. See [`src/winml/modelkit/analyze/rules/runtime_check_rules/README.md`](./src/winml/modelkit/analyze/rules/runtime_check_rules/README.md) for setup options (GitHub release for external contributors, `gim-home` script for Microsoft internal, `MODELKIT_RULES_DIR` override).
+When running WinML CLI from a source tree (`uv run winml ...`), you need to populate the runtime check rule zips locally. See [`src/winml/modelkit/analyze/rules/runtime_check_rules/README.md`](./src/winml/modelkit/analyze/rules/runtime_check_rules/README.md) for setup options (GitHub release for external contributors, `gim-home` script for Microsoft internal, `WINMLCLI_RULES_DIR` override).
 
 ## Coding conventions and standards
 

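The `WINMLCLI_RULES_DIR` override named in the updated CONTRIBUTING text could be resolved along the following lines. This is a minimal sketch rather than the project's lookup code: only the environment variable name comes from the diff, while the function name `resolve_rules_dir` and its `default_dir` argument are hypothetical.

```python
import os
from pathlib import Path


def resolve_rules_dir(default_dir: Path) -> Path:
    """Return the rules directory, preferring the environment override."""
    # WINMLCLI_RULES_DIR is the variable named in the diff; how the real tool
    # reads it is an assumption made for illustration.
    override = os.environ.get("WINMLCLI_RULES_DIR", "").strip()
    return Path(override) if override else default_dir
```

Scripts that still export the old `MODELKIT_RULES_DIR` name would not be picked up by a lookup written this way, so local setups may need updating after the rename.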
@@ -1,15 +1,15 @@
-# ModelKit
+# WinML CLI
 
 [![ModelKit CI](https://github.com/microsoft/WinML-ModelKit/actions/workflows/modelkit-ci.yml/badge.svg)](https://github.com/microsoft/WinML-ModelKit/actions/workflows/modelkit-ci.yml)
 ![Status](https://img.shields.io/badge/status-early%20access-blue)
 ![Python](https://img.shields.io/badge/python-3.10%2B-blue?logo=python&logoColor=white)
 ![License](https://img.shields.io/badge/license-MIT-green)
 
-**ModelKit** is a CLI toolkit to build **portable, performant, and high-quality** models for Windows ML. It covers the entire journey from pretrained model to on-device inference — export, optimization, quantization, compilation, and benchmarking — across **all execution providers**, regardless of silicon.
+**WinML CLI** is a CLI toolkit to build **portable, performant, and high-quality** models for Windows ML. It covers the entire journey from pretrained model to on-device inference — export, optimization, quantization, compilation, and benchmarking — across **all execution providers**, regardless of silicon.
 
 ---
 
-## :dart: ModelKit Is Right for You If
+## :dart: WinML CLI Is Right for You If
 
 - [x] You want to build models that run on **any Windows device** — Qualcomm, Intel, AMD, NVIDIA, or CPU
 - [x] You want to benchmark a model with **one command** — latency, throughput, and live hardware utilization
@@ -32,7 +32,7 @@
 | **Dml** | Hardware-agnostic GPU backend | 🔶 Planned | `--ep dml` | `--device gpu` |
 | **CPU** | Cross-platform fallback | ⚪ Always available | `--ep cpu` | `--device cpu` |
 
-> **Tip:** Use `--device auto` and ModelKit picks the best available device — NPU first, then GPU, then CPU.
+> **Tip:** Use `--device auto` and WinML CLI picks the best available device — NPU first, then GPU, then CPU.
 
 ---
 
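The preference order in the `--device auto` tip (NPU first, then GPU, then CPU) can be illustrated with a small selection function. This is a sketch under stated assumptions, not the CLI's implementation: `PREFERENCE` and `pick_device` are hypothetical names, and device detection itself is left out.

```python
PREFERENCE = ("npu", "gpu", "cpu")  # order stated in the README tip


def pick_device(available: set[str]) -> str:
    """Return the most preferred device present in `available`."""
    for device in PREFERENCE:
        if device in available:
            return device
    # The README lists CPU as always available, so fall back to it.
    return "cpu"
```

For example, `pick_device({"gpu", "cpu"})` selects `"gpu"`, matching the GPU-then-CPU fallback the README describes for machines without an NPU.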
@@ -45,19 +45,19 @@
 | **Windows 11** (x64 or ARM64) | Windows 11 24H2+ required for NPU support |
 | **UV** | Install [UV](https://github.com/astral-sh/uv) |
 | **Windows App SDK Runtime 1.8** | [Latest Windows App SDK downloads](https://learn.microsoft.com/en-us/windows/apps/windows-app-sdk/downloads) |
-| **ModelKit** (Python wheel) | See release instructions |
+| **WinML CLI** (Python wheel) | See release instructions |
 
 ### Required Hardware
 
-**ModelKit targets NPU.** We recommend testing on one of the following NPU devices:
+**WinML CLI targets NPU.** We recommend testing on one of the following NPU devices:
 
 | Device | EP | Flag |
 |--------|-----|------|
 | Snapdragon X Elite (Qualcomm) | QNN | `--ep qnn --device npu` |
 | Intel AI Boost (Meteor Lake / Lunar Lake) | OpenVINO | `--ep openvino --device npu` |
 | AMD Ryzen AI (Phoenix / Hawk Point / Strix) | VitisAI | `--ep vitisai --device npu` |
 
-**No NPU?** Use `--device auto` — ModelKit will fall back to the best available device (GPU → CPU). Note that `winml compile` requires NPU and cannot run without one.
+**No NPU?** Use `--device auto` — WinML CLI will fall back to the best available device (GPU → CPU). Note that `winml compile` requires NPU and cannot run without one.
 
 ### Accepted Inputs
 
@@ -78,7 +78,7 @@ If `inspect` prints an error or shows `Unsupported`, **skip that model**. Only m
 
 ## :package: Installation
 
-ModelKit requires **Python 3.10** and is distributed as a Python wheel. We recommend [uv](https://docs.astral.sh/uv/) for fast, reproducible environment setup.
+WinML CLI requires **Python 3.10** and is distributed as a Python wheel. We recommend [uv](https://docs.astral.sh/uv/) for fast, reproducible environment setup.
 
 **1. Create a Python 3.10 environment**
 
@@ -114,7 +114,7 @@ Confirm that your target device and EP appear in the output:
 - **Intel AI Boost** — look for `OpenVINOExecutionProvider`
 - **AMD Ryzen AI** — look for `VitisAIExecutionProvider`
 
-If no NPU is detected, you can still use ModelKit with `--device auto` for most commands. The only exception is `winml compile`, which requires an NPU device.
+If no NPU is detected, you can still use WinML CLI with `--device auto` for most commands. The only exception is `winml compile`, which requires an NPU device.
 
 ---
 
@@ -177,7 +177,7 @@ If no NPU is detected, you can still use ModelKit with `--device auto` for most
 
 **`winml doctor`** — Diagnose environment issues. Checks runtimes, execution providers, and dependencies to identify configuration problems.
 
-**`winml setting`** — Configure ModelKit preferences. Set default EPs, output directories, and other global options.
+**`winml setting`** — Configure WinML CLI preferences. Set default EPs, output directories, and other global options.
 
 **`winml sys`** — System information and capability reporting. Prints detected hardware, available EPs, Python version, and installed package versions.
 
@@ -300,15 +300,15 @@ The simplest way to evaluate a model — one command, zero setup:
 winml perf -m facebook/convnext-base-224 --device npu --monitor
 ```
 
-ModelKit handles everything behind the scenes: download the model from Hugging Face, export to ONNX, optimize the graph, and run the benchmark on your NPU. The `--monitor` flag enables live hardware monitoring — real-time CPU utilization, RAM usage, and NPU activity alongside the latency results.
+WinML CLI handles everything behind the scenes: download the model from Hugging Face, export to ONNX, optimize the graph, and run the benchmark on your NPU. The `--monitor` flag enables live hardware monitoring — real-time CPU utilization, RAM usage, and NPU activity alongside the latency results.
 
 This is ideal for quick smoke tests: does the model run on this device, and how fast is it?
 
 ---
 
 ## :arrows_counterclockwise: The BYOM Workflow
 
-The **Build Your Own Model** (BYOM) workflow is the philosophy behind ModelKit. It defines how a source model becomes a production-ready, device-optimized artifact.
+The **Build Your Own Model** (BYOM) workflow is the philosophy behind WinML CLI. It defines how a source model becomes a production-ready, device-optimized artifact.
 
 ### The Pipeline
 
@@ -318,7 +318,7 @@ Source Model --> Export --> Analyze --> Optimize --> Quantize --> Compile --> Be
 
 ![BYOM Workflow](docs/assets/workflow-only.svg)
 
-Each arrow is a ModelKit command. You can enter the pipeline at any stage (for example, start with a local ONNX file and skip export), exit early (stop after optimization if you do not need quantization), or loop back to repeat a stage with different settings.
+Each arrow is a WinML CLI command. You can enter the pipeline at any stage (for example, start with a local ONNX file and skip export), exit early (stop after optimization if you do not need quantization), or loop back to repeat a stage with different settings.
 
 ### Primitive Commands vs. Config-Driven Pipeline
 
@@ -361,17 +361,17 @@ Run `winml catalog` to browse the full catalog interactively.
 
 </details>
 
-These models are verified against ModelKit's full pipeline and serve as reliable starting points. You are not limited to this list — any Hugging Face model that passes `winml inspect` is a valid input.
+These models are verified against WinML CLI's full pipeline and serve as reliable starting points. You are not limited to this list — any Hugging Face model that passes `winml inspect` is a valid input.
 
 For models not in this table, run `winml inspect -m <model-id>` to verify support before proceeding.
 
 ---
 
 ## :warning: Scope & Limitations
 
-### What ModelKit supports
+### What WinML CLI supports
 
-ModelKit targets **classic deep learning models** — CNNs, encoders, vision transformers, NLP classifiers, token classifiers, object detection models, and segmentation models.
+WinML CLI targets **classic deep learning models** — CNNs, encoders, vision transformers, NLP classifiers, token classifiers, object detection models, and segmentation models.
 
 Supported tasks include:
 - Image classification (ResNet, ViT, Swin, ConvNeXT)
@@ -380,9 +380,9 @@ Supported tasks include:
 - Object detection (Table Transformer)
 - Image segmentation (SegFormer)
 
-### What ModelKit does not support
+### What WinML CLI does not support
 
-**LLMs and generative models are not in scope.** Do not use ModelKit with GPT, LLaMA, Phi, Mistral, Stable Diffusion, or any model with a decoder-only or sequence-to-sequence generative architecture. LLM support (with LoRA) is planned for Q3-Q4 2026.
+**LLMs and generative models are not in scope.** Do not use WinML CLI with GPT, LLaMA, Phi, Mistral, Stable Diffusion, or any model with a decoder-only or sequence-to-sequence generative architecture. LLM support (with LoRA) is planned for Q3-Q4 2026.
 
 ### Known constraints
 
@@ -432,7 +432,7 @@ Supported tasks include:
 
 ## :lock: Data / Telemetry
 
-Official ModelKit releases can collect anonymous usage telemetry to
+Official WinML CLI releases can collect anonymous usage telemetry to
 help improve the product. Telemetry is classified as **Optional**. A
 one-time prompt on your first run asks for consent (default: accept —
 press Enter to enable, type `n` to decline).
@@ -459,7 +459,7 @@ locations.
 
 We welcome contributions! Please see the [contribution guidelines](CONTRIBUTING.md).
 
-For feature requests or bug reports, please file a [GitHub Issue](https://github.com/microsoft/ModelKit/issues).
+For feature requests or bug reports, please file a [GitHub Issue](https://github.com/microsoft/WinML-ModelKit/issues).
 
 ---
 

@@ -2,7 +2,7 @@
 
 ## How to file issues and get help
 
-This project uses [GitHub Issues](https://github.com/microsoft/ModelKit/issues) to track bugs and feature requests. Please search the existing
+This project uses [GitHub Issues](https://github.com/microsoft/WinML-ModelKit/issues) to track bugs and feature requests. Please search the existing
 issues before filing new issues to avoid duplicates. For new issues, file your bug or
 feature request as a new Issue.
 

@@ -1,12 +1,12 @@
-# ModelKit Privacy Statement
+# WinML CLI Privacy Statement
 
-ModelKit collects limited, anonymous telemetry to help improve the
+WinML CLI collects limited, anonymous telemetry to help improve the
 product. This page describes exactly what is collected, what is not,
 and how to control it.
 
 ## Data category
 
-All ModelKit telemetry is classified as **Optional** under Microsoft's
+All WinML CLI telemetry is classified as **Optional** under Microsoft's
 data categorization model. None of it is required to run any feature;
 it exists solely to support product improvement.
 
@@ -20,15 +20,15 @@ see the prompt and default to off.
 
 ## Events collected
 
-When telemetry is enabled, ModelKit emits three event types:
+When telemetry is enabled, WinML CLI emits three event types:
 
-### ModelKitHeartbeat
+### WinMLCLIHeartbeat
 
 Sent once per CLI invocation, just before the requested command runs.
 Carries only context attributes (OS, architecture, app version, device
 ID) — no per-event payload.
 
-### ModelKitAction
+### WinMLCLIAction
 
 Sent once per command completion.
 
@@ -41,7 +41,7 @@ Sent once per command completion.
 | `duration_ms` | Wall-clock execution time in milliseconds. |
 | `success` | Whether the command completed without raising. |
 
-### ModelKitError
+### WinMLCLIError
 
 Sent only when a command raises an unhandled exception.
 
@@ -61,7 +61,7 @@ not by the command code):
 | `device_id` | SHA256 hash of a randomly generated UUID, persisted per machine. Enables counting distinct users without identifying them. |
 | `id_status` | `EXISTING`, `NEW`, or `FAILED` — how the device ID was obtained on this run. |
 | `os.name`, `os.version`, `os.release`, `os.arch` | Operating system and architecture (e.g., `Windows`, `10.0.26200`, `11`, `AMD64`). |
-| `app_version` | ModelKit package version. |
+| `app_version` | WinML CLI package version. |
 | `app_instance_id` | A random UUID generated for this process only; not persisted. |
 | `initTs` | Epoch timestamp when telemetry was initialized. |
 
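The `device_id` and `id_status` rows above describe a hashed random UUID that is persisted per machine. One way this might work is sketched below. The file location, the choice to persist the hash rather than the raw UUID, and the empty ID returned on failure are all assumptions made for illustration; `load_or_create_device_id` is a hypothetical name.

```python
import hashlib
import uuid
from pathlib import Path


def load_or_create_device_id(id_file: Path) -> tuple[str, str]:
    """Return (device_id, id_status) using a hypothetical per-machine ID file."""
    try:
        if id_file.exists():
            stored = id_file.read_text(encoding="utf-8").strip()
            if stored:
                return stored, "EXISTING"
        # No usable ID yet: hash a fresh random UUID so only the digest is kept.
        device_id = hashlib.sha256(uuid.uuid4().bytes).hexdigest()
        id_file.parent.mkdir(parents=True, exist_ok=True)
        id_file.write_text(device_id, encoding="utf-8")
        return device_id, "NEW"
    except OSError:
        # Reading or persisting failed; the empty ID here is an assumption.
        return "", "FAILED"
```

Because the UUID is random and only its SHA256 digest is kept, the value can be used to count distinct installs without being derived from anything that identifies the machine or user.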
@@ -80,7 +80,7 @@ not by the command code):
 
 ### Consent
 
-On the first run of any command, ModelKit prompts:
+On the first run of any command, WinML CLI prompts:
 
 ```
 Enable telemetry? [Y/n]
@@ -125,15 +125,15 @@ variables are set, and no prompt is shown:
 Events that fail to send (e.g., transient network errors) are cached
 locally and retried on the next run. The cache file lives at:
 
-`%USERPROFILE%\.winml\telemetry\modelkit.cache`
+`%USERPROFILE%\.winml\telemetry\winmlcli.cache`
 
 The cache is append-only on failure and drain-then-resend on recovery.
 When telemetry is disabled, the cache is cleared so a disabled session
 never resends events the user has since opted out of.
 
 ## Dev installs
 
-ModelKit installed from source (`pip install -e .`) or run directly
+WinML CLI installed from source (`pip install -e .`) or run directly
 from a checkout never sends telemetry. The InstrumentationKey is blank
 in source and is only populated by the official build pipeline. Only
 official binary releases are capable of sending telemetry, and only

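The privacy statement describes the telemetry cache as append-only on failure, drain-then-resend on recovery, and cleared when telemetry is disabled. The sketch below illustrates those three behaviors. It assumes a JSON-lines cache file and a caller-supplied `send` callable that returns `True` on success; `flush_events` and its parameters are hypothetical and not taken from the WinML CLI source.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def flush_events(
    new_events: Iterable[dict],
    cache_file: Path,
    send: Callable[[dict], bool],
    enabled: bool = True,
) -> int:
    """Send events, caching failures. Returns the number of events delivered."""
    if not enabled:
        # Opted out: drop anything queued so it is never resent later.
        cache_file.unlink(missing_ok=True)
        return 0

    # Drain: pull previously cached events out of the file before resending.
    pending: list[dict] = []
    if cache_file.exists():
        text = cache_file.read_text(encoding="utf-8")
        pending = [json.loads(line) for line in text.splitlines() if line]
        cache_file.unlink()

    delivered = 0
    for event in pending + list(new_events):
        if send(event):
            delivered += 1
        else:
            # Append-only on failure: one JSON object per line (assumed format).
            cache_file.parent.mkdir(parents=True, exist_ok=True)
            with cache_file.open("a", encoding="utf-8") as fh:
                fh.write(json.dumps(event) + "\n")
    return delivered
```

Clearing the cache in the disabled branch is what keeps a later opt-out from resending events that were queued while telemetry was still enabled.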
@@ -1,6 +1,6 @@
-# ModelKit Naming Convention
+# WinML CLI Naming Convention
 
-This document defines the naming rules for the ModelKit codebase. All new code and refactored code must follow these conventions.
+This document defines the naming rules for the WinML CLI codebase. All new code and refactored code must follow these conventions.
 
 ## 1. Acronyms in Class Names
 

@@ -90,10 +90,10 @@ optional-dependencies.openvino = [ "openvino>=2023" ]
 optional-dependencies.qnn = [
   "onnxruntime-qnn>=1.24.1; python_version>='3.11'",
 ]
-urls."Bug Tracker" = "https://github.com/microsoft/ModelKit/issues"
-urls.Documentation = "https://github.com/microsoft/ModelKit/blob/main/README.md"
-urls.Homepage = "https://github.com/microsoft/ModelKit"
-urls.Repository = "https://github.com/microsoft/ModelKit.git"
+urls."Bug Tracker" = "https://github.com/microsoft/WinML-ModelKit/issues"
+urls.Documentation = "https://github.com/microsoft/WinML-ModelKit/blob/main/README.md"
+urls.Homepage = "https://github.com/microsoft/WinML-ModelKit"
+urls.Repository = "https://github.com/microsoft/WinML-ModelKit.git"
 # =============================================================================
 # SETUPTOOLS - Package Configuration (Flat Layout with Namespace Prefix)
 # =============================================================================

@@ -1,6 +1,6 @@
 # E2E Evaluation Scripts
 
-Batch-evaluate ModelKit's `winml perf` pipeline against a curated set of HuggingFace models.
+Batch-evaluate WinML CLI's `winml perf` pipeline against a curated set of HuggingFace models.
 Captures pass/fail, failure classification, and generates interactive reports.
 
 ## Quick Start

@@ -117,9 +117,9 @@ def _get_timeout_skip_reason(hf_id: str, task: str) -> str:
 )
 
 _HF_CACHE = Path.home() / ".cache" / "huggingface"
-_WML_CACHE = Path.home() / ".cache" / "winml"
+_WINML_CACHE = Path.home() / ".cache" / "winml"
 _TEMP_DIR = Path(os.environ.get("TEMP", os.environ.get("TMP", tempfile.gettempdir())))
-_TEMP_PREFIXES = ("wmk_", "modelkit_compat_")
+_TEMP_PREFIXES = ("winmlcli_", "winmlcli_compat_")
 
 
 def _is_no_space_error(proc: dict) -> bool:
@@ -129,8 +129,8 @@ def _is_no_space_error(proc: dict) -> bool:
 
 
 def _clear_disk_caches() -> None:
-    """Delete HuggingFace, WML cache directories and leaked temp files."""
-    for cache_dir in (_HF_CACHE, _WML_CACHE):
+    """Delete HuggingFace, WinML cache directories and leaked temp files."""
+    for cache_dir in (_HF_CACHE, _WINML_CACHE):
         if cache_dir.exists():
             safe_print(f"  [cleanup] Removing cache: {cache_dir}")
             try:
@@ -139,7 +139,7 @@ def _clear_disk_caches() -> None:
             except OSError as exc:
                 safe_print(f"  [cleanup] Warning: could not remove {cache_dir}: {exc}")
 
-    # Clean leaked temp directories/files (wmk_*, modelkit_compat_*, tmp*.onnx*)
+    # Clean leaked temp directories/files (winmlcli_*, winmlcli_compat_*, tmp*.onnx*)
     if _TEMP_DIR.is_dir():
         cleaned = 0
         for entry in _TEMP_DIR.iterdir():
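The loop body is cut off by the hunk, so the matching logic is not visible here. A minimal sketch of how the renamed prefixes and the `tmp*.onnx*` pattern from the comment might be tested against an entry name follows. `is_leaked_temp` is a hypothetical helper, and the use of `fnmatchcase` is an assumption.

```python
from fnmatch import fnmatchcase

TEMP_PREFIXES = ("winmlcli_", "winmlcli_compat_")  # values from the diff


def is_leaked_temp(name: str) -> bool:
    """Return True if a temp entry name matches the cleanup patterns."""
    # str.startswith accepts a tuple and matches if any prefix applies.
    return name.startswith(TEMP_PREFIXES) or fnmatchcase(name, "tmp*.onnx*")
```

With prefix matching of this kind, any name starting with `winmlcli_compat_` already matches `winmlcli_`, and entries left behind under the old `wmk_` or `modelkit_compat_` prefixes no longer match.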
@@ -363,8 +363,8 @@ def _run_build(
     config_path = model_dir / "build_config.json"
     model_dir.mkdir(parents=True, exist_ok=True)
 
-    # Remove any stale suffixed sub-configs BEFORE `wmk config` runs.
-    # For composite models `wmk config` writes files matching {stem}_*.json
+    # Remove any stale suffixed sub-configs BEFORE `winml config` runs.
+    # For composite models `winml config` writes files matching {stem}_*.json
     # (e.g., build_config_encoder.json); cleaning those AFTER the command would
     # delete the freshly-written configs and silently degrade composite builds
     # to single-model. Running cleanup first removes prior-run artifacts without

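The comment above explains why stale `{stem}_*.json` sub-configs are removed before `winml config` runs rather than after. A minimal sketch of that pre-run cleanup is shown below. The helper name `remove_stale_subconfigs` is hypothetical, and the real `_run_build` may do this differently.

```python
from pathlib import Path


def remove_stale_subconfigs(config_path: Path) -> list[Path]:
    """Delete sibling files named {stem}_*.json and return what was removed."""
    removed = []
    pattern = f"{config_path.stem}_*.json"  # e.g. build_config_encoder.json
    for stale in sorted(config_path.parent.glob(pattern)):
        stale.unlink()
        removed.append(stale)
    return removed
```

Calling this before the config command leaves the main `build_config.json` untouched and guarantees that any suffixed files found afterwards were written by the current run.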
@@ -11,7 +11,7 @@
 
 Dataset config is read from ``utils/dataset_config.py`` — the authoritative
 source shared with run_eval.py.  When ``winml eval`` is implemented inside
-ModelKit, it should import from the same location.
+WinML CLI, it should import from the same location.
 
 Output: prints a single JSON object as the last line on stdout:
     {"metric": "<name>", "value": <float>, "num_samples": <int>}
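Because the result is emitted as a single JSON object on the last line of stdout, a caller can recover it without parsing the rest of the log. The sketch below assumes the caller already holds the captured stdout as a string; `parse_eval_result` is a hypothetical helper, not part of the scripts in this diff.

```python
import json


def parse_eval_result(stdout: str) -> dict:
    """Parse the final non-empty stdout line as the evaluation result object."""
    lines = [line for line in stdout.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no output to parse")
    result = json.loads(lines[-1])
    if not isinstance(result, dict):
        raise ValueError("last line is not a JSON object")
    # The three keys come from the output contract in the docstring.
    missing = {"metric", "value", "num_samples"} - result.keys()
    if missing:
        raise ValueError(f"result is missing keys: {sorted(missing)}")
    return result
```

Reading only the last line lets the evaluation script print progress or warnings earlier in its output without breaking the consumer.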
@@ -59,8 +59,8 @@ def _emit_result(metric: str, value: float, num_samples: int) -> None:
 def _load_pytorch_model(model_id: str, task: str, device_str: str):
     """Load a native PyTorch model with the task-appropriate AutoModel class."""
     import torch
-
     from transformers import AutoConfig
+
     from winml.modelkit.loader.task import resolve_task_and_model_class
 
     config = AutoConfig.from_pretrained(model_id)