Merged
1 change: 1 addition & 0 deletions CONTRIBUTING.md
@@ -9,6 +9,7 @@ Encoderfile compiles transformer encoders and optional classification heads into
### Check for Duplicates

Before creating a new issue or starting work:

- [ ] Search [existing issues](https://github.com/mozilla-ai/encoderfile/issues) for duplicates
- [ ] Check [open pull requests](https://github.com/mozilla-ai/encoderfile/pulls) to see if someone is already working on it
- [ ] For bugs, verify it still exists in the `main` branch
60 changes: 1 addition & 59 deletions README.md
@@ -50,65 +50,7 @@ Encoderfiles can run as:
- CLI for batch processing
- MCP server (Model Context Protocol)

```mermaid
flowchart LR
%% Styling
classDef asset fill:#e1f5fe,stroke:#01579b,stroke-width:2px,color:#000;
classDef tool fill:#fff8e1,stroke:#ff6f00,stroke-width:2px,stroke-dasharray: 5 5,color:#000;
classDef process fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#000;
classDef artifact fill:#f5f5f5,stroke:#616161,stroke-width:2px,color:#000;
classDef service fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#000;
classDef client fill:#e3f2fd,stroke:#0277bd,stroke-width:2px,stroke-dasharray: 5 5,color:#000;

subgraph Inputs ["1. Input Assets"]
direction TB
Onnx["ONNX Model<br/>(.onnx)"]:::asset
Tok["Tokenizer Data<br/>(tokenizer.json)"]:::asset
Config["Runtime Config<br/>(config.yml)"]:::asset
end

style Inputs fill:#e3f2fd,stroke:#0277bd,stroke-width:2px,stroke-dasharray: 5 5,color:#01579b

subgraph Compile ["2. Compile Phase"]
Compiler["Encoderfile Compiler<br/>(CLI Tool)"]:::asset
end

style Compile fill:#e3f2fd,stroke:#0277bd,stroke-width:2px,stroke-dasharray: 5 5,color:#01579b

subgraph Build ["3. Build Phase"]
direction TB
Builder["Wrapper Process<br/>(Embeds Assets + Runtime)"]:::process
end

style Build fill:#fff8e1,stroke:#ff8f00,stroke-width:2px,color:#e65100

subgraph Output ["4. Artifact"]
Binary["Single Binary Executable<br/>(Static File)"]:::artifact
end
style Output fill:#fafafa,stroke:#546e7a,stroke-width:2px,stroke-dasharray: 5 5,color:#546e7a

subgraph Runtime ["5. Runtime Phase"]
direction TB
%% Added fa:fa-server icons
Grpc["fa:fa-server gRPC Server<br/>(Protobuf)"]:::service
Http["fa:fa-server HTTP Server<br/>(JSON)"]:::service
MCP["fa:fa-server MCP Server<br/>(MCP)"]:::service
%% Added fa:fa-cloud icon
Client["fa:fa-cloud Client Apps /<br/>MCP Agent"]:::client
end
style Runtime fill:#f1f8e9,stroke:#2e7d32,stroke-width:2px,color:#1b5e20


%% Connections
Onnx & Tok & Config --> Builder
Compiler -.->|"Orchestrates"| Builder
Builder -->|"Outputs"| Binary

%% Runtime Connections
Binary -.->|"Executes"| Grpc
Binary -.->|"Executes"| Http
Grpc & Http & MCP -->|"Responds to"| Client
```
![Build Diagram](docs/assets/diagram.png)

### Supported Architectures

1 change: 1 addition & 0 deletions docs/CODE_OF_CONDUCT.md
@@ -0,0 +1 @@
--8<-- "CODE_OF_CONDUCT.md"
1 change: 1 addition & 0 deletions docs/CONTRIBUTING.md
@@ -0,0 +1 @@
--8<-- "CONTRIBUTING.md"
42 changes: 42 additions & 0 deletions docs/assets/diagram.mmd
@@ -0,0 +1,42 @@
flowchart TD
subgraph Build Process
subgraph Inputs ["1. Input Assets"]
direction TB
Onnx["ONNX Model<br/>(.onnx)"]:::asset
Tok["Tokenizer Data<br/>(tokenizer.json)"]:::asset
Config["Runtime Config<br/>(config.yml)"]:::asset
end

subgraph Compile ["2. Compile Phase"]
Compiler["Encoderfile Compiler<br/>(CLI Tool)"]:::asset
end

subgraph Build ["3. Build Phase"]
direction TB
Builder["Wrapper Process<br/>(Embeds Assets + Runtime)"]:::process
end

subgraph Output ["4. Artifact"]
Binary["Single Binary Executable<br/>(Static File)"]:::artifact
end

subgraph Runtime ["5. Runtime Phase"]
direction TB
%% Added fa:fa-server icons
Grpc["fa:fa-server gRPC Server<br/>(Protobuf)"]:::service
Http["fa:fa-server HTTP Server<br/>(JSON)"]:::service
MCP["fa:fa-server MCP Server<br/>(MCP)"]:::service
%% Added fa:fa-cloud icon
Client["fa:fa-cloud Client Apps /<br/>MCP Agent"]:::client
end

%% Connections
Onnx & Tok & Config --> Builder
Compiler -.->|"Orchestrates"| Builder
Builder -->|"Outputs"| Binary

%% Runtime Connections
Binary -.->|"Executes"| Grpc
Binary -.->|"Executes"| Http
Grpc & Http & MCP -->|"Responds to"| Client
end
Binary file added docs/assets/diagram.png
Binary file removed docs/assets/encoderfile.png
Binary file not shown.
Binary file added docs/assets/encoderfile_logo.png
10 changes: 5 additions & 5 deletions docs/cookbooks/token-classification-ner.md
@@ -12,7 +12,7 @@ This cookbook walks through building, deploying, and using a Named Entity Recogn

## Prerequisites

- `encoderfile` CLI tool installed ([Installation Guide](../index.md#-installation))
- `encoderfile` CLI tool installed ([Installation Guide](../index.md#1-install-cli))
- Python with `optimum[exporters]` for ONNX export
- `curl` for testing the API

@@ -366,10 +366,10 @@ curl http://localhost:8080/health

## Next Steps

- **[Sequence Classification Cookbook](sequence-classification-sentiment.md)** - Build a sentiment analyzer
- **[Embedding Cookbook](embeddings-similarity.md)** - Create a semantic search engine
- **[Transforms Reference](../transforms/reference.md)** - Learn about custom post-processing
- **[API Reference](../reference/api-reference.md)** - Complete API documentation
- **[Transforms Guide](../transforms/index.md)** - Learn about custom post-processing with Lua scripts
- **[Transforms Reference](../transforms/reference.md)** - Complete transforms API documentation
- **[API Reference](../reference/api-reference.md)** - REST, gRPC, and MCP endpoint specifications
- **[CLI Reference](../reference/cli.md)** - Full documentation for build, serve, and infer commands

---

10 changes: 5 additions & 5 deletions docs/getting-started.md
@@ -250,7 +250,7 @@ encoderfile build -f config.yml

- Ensure the model directory has `model.onnx`, `tokenizer.json`, and `config.json`
- Verify the model type matches the architecture
- See [BUILDING.md](../BUILDING.md) for detailed troubleshooting
- See our guide on [building](reference/building.md) for detailed troubleshooting

### Server Won't Start

@@ -266,7 +266,7 @@ encoderfile build -f config.yml

## Next Steps

- **[BUILDING.md](../BUILDING.md)** - Complete build guide with advanced configuration options
- **[CLI Reference](cli.md)** - Full command-line documentation
- **[API Reference](api-reference.md)** - REST, gRPC, and MCP API documentation
- **[Contributing](../CONTRIBUTING.md)** - Help improve encoderfile
- **[Guide on building](reference/building.md)** - Complete build guide with advanced configuration options
- **[CLI Reference](reference/cli.md)** - Full command-line documentation
- **[API Reference](reference/api-reference.md)** - REST, gRPC, and MCP API documentation
- **[Contributing](CONTRIBUTING.md)** - Help improve encoderfile
30 changes: 21 additions & 9 deletions docs/index.md
@@ -1,15 +1,23 @@
# Encoderfile

**Deploy Encoder Transformers as self-contained, single-binary executables.**

[![GitHub Release](https://img.shields.io/github/v/release/mozilla-ai/encoderfile?style=flat-square)](https://github.com/mozilla-ai/encoderfile)
[![License](https://img.shields.io/github/license/mozilla-ai/encoderfile?style=flat-square)](LICENSE)
![Encoderfile](assets/encoderfile_logo.png)

<p align="center">
<strong>Deploy Encoder Transformers as self-contained, single-binary executables.</strong>
<br><br>
<a href="https://github.com/mozilla-ai/encoderfile">
<img src="https://img.shields.io/github/v/release/mozilla-ai/encoderfile?style=flat-square" />
</a>
<a href="https://github.com/mozilla-ai/encoderfile/blob/main/LICENSE">
<img src="https://img.shields.io/github/license/mozilla-ai/encoderfile?style=flat-square" />
</a>
</p>

---

**Encoderfile** packages transformer encoders—and their classification heads—into a single, self-contained executable.

Replace fragile, multi-gigabyte Python containers with lean, auditable binaries that have **zero runtime dependencies**. Written in Rust and built on ONNX Runtime, Encoderfile ensures strict determinism and high performance for financial platforms, content moderation pipelines, and search infrastructure.
Replace fragile, multi-gigabyte Python containers with lean, auditable binaries that have **zero runtime dependencies**[^1]. Written in Rust and built on ONNX Runtime, Encoderfile ensures strict determinism and high performance for financial platforms, content moderation pipelines, and search infrastructure.

## Why Encoderfile?

@@ -34,10 +42,12 @@ While **Llamafile** focuses on generative models, **Encoderfile** is purpose-bui
## Supported Models

Encoderfile supports encoder-only transformers for:
- **Feature Extraction** - Semantic search, clustering, embeddings (BERT, DistilBERT, RoBERTa)

- **Token Embeddings** - Per-token feature extraction (BERT, DistilBERT, RoBERTa)
- **Sequence Classification** - Sentiment analysis, topic classification
- **Token Classification** - Named Entity Recognition, PII detection
- **Sentence Embeddings** - Semantic search, clustering
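To illustrate the distinction between token and sentence embeddings above: a common strategy derives a single sentence vector by mask-aware mean pooling over per-token vectors. This is a hedged sketch of that idea only — the function name and sample values are hypothetical, and it is not Encoderfile's internal implementation:

```python
# Illustrative sketch (not Encoderfile internals): derive a sentence
# embedding from token embeddings by attention-mask-aware mean pooling.
def mean_pool(token_embeddings, attention_mask):
    """token_embeddings: list of per-token vectors; attention_mask: 0/1 flags."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:  # skip padding tokens
            summed = [s + v for s, v in zip(summed, vec)]
            count += 1
    return [s / count for s in summed]

tokens = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
mask = [1, 1, 0]  # last token is padding
print(mean_pool(tokens, mask))  # → [2.0, 3.0]
```

Pooling the masked token vectors this way is what turns a token-embedding model into a sentence-embedding one.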

See the [building from source guide](https://mozilla-ai.github.io/encoderfile/latest/reference/building/) for detailed instructions on compiling the CLI tool.

Generation models (GPT, T5) are not supported. See [CLI Reference](reference/cli.md) for complete model type details.
@@ -90,7 +100,7 @@ See the [API Reference](reference/api-reference.md) for complete endpoint docume

Encoderfile compiles your model into a self-contained binary by embedding ONNX weights, tokenizer, and config directly into Rust code. The result is a portable executable with zero runtime dependencies.

![Encoderfile architecture diagram illustrating the build process: compiling ONNX models, tokenizers, and configs into a single binary executable that runs as a zero-dependency gRPC, HTTP, or MCP server.](assets/encoderfile.png "Encoderfile Architecture")
![Encoderfile architecture diagram illustrating the build process: compiling ONNX models, tokenizers, and configs into a single binary executable that runs as a zero-dependency gRPC, HTTP, or MCP server.](assets/diagram.png "Encoderfile Architecture")

## Documentation

@@ -110,4 +120,6 @@ Encoderfile compiles your model into a self-contained binary by embedding ONNX w

- **[GitHub Issues](https://github.com/mozilla-ai/encoderfile/issues)** - Report bugs or request features
- **[Contributing Guide](CONTRIBUTING.md)** - Learn how to contribute
- **[Code of Conduct](CODE_OF_CONDUCT.md)** - Community guidelines
- **[Code of Conduct](CODE_OF_CONDUCT.md)** - Community guidelines

[^1]: Standard builds of Encoderfile require glibc to run because of the ONNX runtime. See [this issue](https://github.com/mozilla-ai/encoderfile/issues/69) on progress on building Encoderfile for musl linux.
6 changes: 3 additions & 3 deletions docs/reference/api-reference.md
@@ -446,7 +446,7 @@ curl -X POST http://localhost:8080/predict \

## gRPC API

The gRPC API provides the same functionality as the HTTP REST API using [Protocol Buffers](../encoderfile-core/proto). Three services are available depending on your model type.
The gRPC API provides the same functionality as the HTTP REST API using [Protocol Buffers](https://github.com/mozilla-ai/encoderfile/tree/main/encoderfile-core/proto). Three services are available depending on your model type.

### Connection Details

@@ -912,5 +912,5 @@ Encoderfile uses async I/O and can handle multiple concurrent requests. The exac
## See Also

- [CLI Documentation](cli.md) - Command-line interface reference
- [Getting Started](getting-started.md) - Getting started guide
- [Contributing Guide](CONTRIBUTING.md) - Development setup
- [Getting Started](../getting-started.md) - Getting started guide
- [Contributing Guide](../CONTRIBUTING.md) - Development setup
4 changes: 2 additions & 2 deletions docs/reference/building.md
@@ -241,7 +241,7 @@ chmod +x ./build/my-model.encoderfile
```

## Configuration Options
> For a complete set of configuration options, see the [CLI Reference](cli-reference.md)
> For a complete set of configuration options, see the [CLI Reference](cli.md)

## Model Types

@@ -446,4 +446,4 @@ You can distribute the binary by:
- [CLI Reference](https://mozilla-ai.github.io/encoderfile/cli/) - Complete command-line documentation
- [API Reference](https://mozilla-ai.github.io/encoderfile/api-reference/) - REST, gRPC, and MCP APIs
- [Getting Started Guide](https://mozilla-ai.github.io/encoderfile/getting-started/) - Step-by-step tutorial
- [Contributing](CONTRIBUTING.md) - Help improve encoderfile
- [Contributing](../CONTRIBUTING.md) - Help improve encoderfile
4 changes: 2 additions & 2 deletions docs/reference/cli.md
@@ -601,7 +601,7 @@ sentiment-analyzer serve --http-port 8080

## Additional Resources

- [Getting Started Guide](getting-started.md) - Step-by-step tutorial
- [Getting Started Guide](../getting-started.md) - Step-by-step tutorial
- [API Reference](api-reference.md) - HTTP/gRPC/MCP API documentation
- [BUILDING.md](../BUILDING.md) - Complete build guide with advanced configuration
- [BUILDING.md](building.md) - Complete build guide with advanced configuration
- [GitHub Repository](https://github.com/mozilla-ai/encoderfile) - Source code and issues
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -36,6 +36,7 @@ plugins:
- include-markdown

markdown_extensions:
- footnotes
- pymdownx.snippets:
check_paths: true
- pymdownx.highlight:
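The `footnotes` extension enabled above is what powers the `[^1]` reference added to `docs/index.md`. A minimal sketch of the Markdown syntax it parses (illustrative text, not taken from the repo):

```markdown
Standard builds require glibc[^1].

[^1]: Rendered as a numbered note at the bottom of the page.
```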