3 changes: 0 additions & 3 deletions .gitmodules
@@ -1,6 +1,3 @@
[submodule "docs/themes/hugo-theme-relearn"]
path = docs/themes/hugo-theme-relearn
url = https://github.com/McShelby/hugo-theme-relearn.git
[submodule "docs/themes/lotusdocs"]
path = docs/themes/lotusdocs
url = https://github.com/colinwilson/lotusdocs
208 changes: 0 additions & 208 deletions docs/config.toml

This file was deleted.

61 changes: 61 additions & 0 deletions docs/content/_index.md
@@ -0,0 +1,61 @@
+++
title = "LocalAI"
description = "The free alternative to OpenAI and Anthropic. Your All-in-One Complete AI Stack"
type = "home"
+++

**The free alternative to OpenAI and Anthropic. Your All-in-One Complete AI Stack** - Run powerful language models, autonomous agents, and document intelligence **locally** on your hardware.

**No cloud, no limits, no compromise.**

{{% notice tip %}}
**⭐ 33.3k+ stars on GitHub!**

**Drop-in replacement for the OpenAI API** - a modular suite of tools that work seamlessly together or independently.

Start with **[LocalAI](https://localai.io)**'s OpenAI-compatible API, extend with **[LocalAGI](https://github.com/mudler/LocalAGI)**'s autonomous agents, and enhance with **[LocalRecall](https://github.com/mudler/LocalRecall)**'s semantic search - all running locally on your hardware.

**Open Source**, MIT licensed.
{{% /notice %}}

## Why Choose LocalAI?

**OpenAI API Compatible** - Run AI models locally with our modular ecosystem. From language models to autonomous agents and semantic search, build your complete AI stack without the cloud.

### Key Features

- **LLM Inference**: LocalAI is a free, **Open Source** OpenAI alternative. Run **LLMs**, generate **images**, **audio** and more **locally** on consumer-grade hardware.
- **Agentic-first**: Extend LocalAI with LocalAGI, an autonomous AI agent platform that runs locally, no coding required. Build and deploy autonomous agents with ease.
- **Memory and Knowledge Base**: Extend LocalAI with LocalRecall, a local REST API for semantic search and memory management. Perfect for AI applications.
- **OpenAI Compatible**: Drop-in replacement for the OpenAI API. Compatible with existing applications and libraries.
- **No GPU Required**: Run on consumer-grade hardware. No need for expensive GPUs or cloud services.
- **Multiple Models**: Support for various model families including LLMs, image generation, and audio models. Supports multiple backends for inferencing.
- **Privacy Focused**: Keep your data local. No data leaves your machine, ensuring complete privacy.
- **Easy Setup**: Simple installation and configuration. Get started in minutes with prebuilt binaries, Docker, Podman, Kubernetes, or a local installation.
- **Community Driven**: Active community support and regular updates. Contribute and help shape the future of LocalAI.

## Quick Start

**Docker is the recommended installation method** for most users:

```bash
docker run -p 8080:8080 --name local-ai -ti localai/localai:latest
```
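
Once the container is running, you can check the OpenAI-compatible API from the host. A minimal sketch (the `gpt-4` model name in the chat request is only an example; it depends on which models you install or alias):

```bash
# List the models the running instance currently exposes
curl http://localhost:8080/v1/models

# Example chat request, assuming a model named "gpt-4" has been installed or aliased
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'
```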

For complete installation instructions, see the [Installation guide](/installation/).

## Get Started

1. **[Install LocalAI](/installation/)** - Choose your installation method (Docker recommended)
2. **[Quickstart Guide](/getting-started/quickstart/)** - Get started quickly after installation
3. **[Install and Run Models](/getting-started/models/)** - Learn how to work with AI models
4. **[Try It Out](/getting-started/try-it-out/)** - Explore examples and use cases

## Learn More

- [Explore available models](https://models.localai.io)
- [Model compatibility](/model-compatibility/)
- [Try out examples](https://github.com/mudler/LocalAI-examples)
- [Join the community](https://discord.gg/uJAeKSAGDy)
- [Check the LocalAI Github repository](https://github.com/mudler/LocalAI)
- [Check the LocalAGI Github repository](https://github.com/mudler/LocalAGI)
@@ -2,6 +2,7 @@
weight: 20
title: "Advanced"
description: "Advanced usage"
type: chapter
icon: settings
lead: ""
date: 2020-10-06T08:49:15+00:00
@@ -27,7 +27,7 @@ template:
  chat: chat
```

For a complete reference of all available configuration options, see the [Model Configuration]({{%relref "docs/advanced/model-configuration" %}}) page.
For a complete reference of all available configuration options, see the [Model Configuration]({{%relref "advanced/model-configuration" %}}) page.

**Configuration File Locations:**

@@ -108,7 +108,6 @@ Similarly it can be specified a path to a YAML configuration file containing a l
```yaml
- url: https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml
  name: gpt4all-j
# ...
```

### Automatic prompt caching
@@ -119,7 +118,6 @@ To enable prompt caching, you can control the settings in the model config YAML

```yaml

# Enable prompt caching
prompt_cache_path: "cache"
prompt_cache_all: true

@@ -131,20 +129,18 @@

By default LocalAI will try to autoload the model by trying all the backends. This might work for most models, but some backends are NOT configured to autoload.

The available backends are listed in the [model compatibility table]({{%relref "docs/reference/compatibility-table" %}}).
The available backends are listed in the [model compatibility table]({{%relref "reference/compatibility-table" %}}).

In order to specify a backend for your models, create a model config file in your `models` directory specifying the backend:

```yaml
name: gpt-3.5-turbo

# Default model parameters
parameters:
  # Relative to the models path
  model: ...

backend: llama-stable
# ...
```

### Connect external backends
@@ -183,7 +179,6 @@ make -C backend/python/vllm
When LocalAI runs in a container,
there are additional environment variables available that modify the behavior of LocalAI on startup:

{{< table "table-responsive" >}}
| Environment variable | Default | Description |
|----------------------------|---------|------------------------------------------------------------------------------------------------------------|
| `REBUILD` | `false` | Rebuild LocalAI on startup |
@@ -193,20 +188,17 @@ there are additional environment variables available that modify the behavior of
| `EXTRA_BACKENDS` | | A space separated list of backends to prepare. For example `EXTRA_BACKENDS="backend/python/diffusers backend/python/transformers"` prepares the python environment on start |
| `DISABLE_AUTODETECT` | `false` | Disable autodetect of CPU flagset on start |
| `LLAMACPP_GRPC_SERVERS` | | A list of llama.cpp workers to distribute the workload. For example `LLAMACPP_GRPC_SERVERS="address1:port,address2:port"` |
{{< /table >}}

Here is how to configure these variables:

```bash
# Option 1: command line
docker run --env REBUILD=true localai
# Option 2: set within an env file
docker run --env-file .env localai
```
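
The file passed to `--env-file` is a plain list of `KEY=value` pairs, one per line. A minimal sketch using variables from the table above (the values and addresses are illustrative only; adjust them to your setup):

```bash
# .env - one KEY=value per line, example values only
REBUILD=false
DISABLE_AUTODETECT=false
LLAMACPP_GRPC_SERVERS=192.168.1.10:50051,192.168.1.11:50051
```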

### CLI Parameters

For a complete reference of all CLI parameters, environment variables, and command-line options, see the [CLI Reference]({{%relref "docs/reference/cli-reference" %}}) page.
For a complete reference of all CLI parameters, environment variables, and command-line options, see the [CLI Reference]({{%relref "reference/cli-reference" %}}) page.

You can control LocalAI with command line arguments to specify a binding address, number of threads, model paths, and many other options. Any command line parameter can be specified via an environment variable.
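
As an illustration of this equivalence (the flag and variable names below are assumptions, not a complete reference; consult the CLI Reference page for the authoritative list):

```bash
# Flags on the command line...
local-ai run --address ":8080" --threads 4

# ...or the same settings via environment variables (assumed LOCALAI_*-prefixed names)
LOCALAI_ADDRESS=":8080" LOCALAI_THREADS=4 local-ai run
```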

@@ -282,20 +274,17 @@ A list of the environment variable that tweaks parallelism is the following:
### Python backends GRPC max workers
### Default number of workers for GRPC Python backends.
### This actually controls whether a backend can process multiple requests or not.
# PYTHON_GRPC_MAX_WORKERS=1

### Define the number of parallel LLAMA.cpp workers (Defaults to 1)
# LLAMACPP_PARALLEL=1

### Enable to run parallel requests
# LOCALAI_PARALLEL_REQUESTS=true
```

Note that for llama.cpp you need to set `LLAMACPP_PARALLEL` to the number of parallel processes your GPU/CPU can handle. For Python-based backends (like vLLM) you can set `PYTHON_GRPC_MAX_WORKERS` to the number of parallel requests.
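
For example, a container that accepts parallel requests could be started like this (the worker counts are illustrative; size them to what your hardware can handle):

```bash
# Enable parallel requests and size the worker pools, using the variables listed above
docker run -p 8080:8080 \
  --env LOCALAI_PARALLEL_REQUESTS=true \
  --env LLAMACPP_PARALLEL=4 \
  --env PYTHON_GRPC_MAX_WORKERS=4 \
  localai/localai:latest
```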

### VRAM and Memory Management

For detailed information on managing VRAM when running multiple models, see the dedicated [VRAM and Memory Management]({{%relref "docs/advanced/vram-management" %}}) page.
For detailed information on managing VRAM when running multiple models, see the dedicated [VRAM and Memory Management]({{%relref "advanced/vram-management" %}}) page.

### Disable CPU flagset auto detection in llama.cpp
