3 changes: 0 additions & 3 deletions .gitmodules
@@ -1,6 +1,3 @@
[submodule "docs/themes/hugo-theme-relearn"]
path = docs/themes/hugo-theme-relearn
url = https://github.com/McShelby/hugo-theme-relearn.git
[submodule "docs/themes/lotusdocs"]
path = docs/themes/lotusdocs
url = https://github.com/colinwilson/lotusdocs
208 changes: 0 additions & 208 deletions docs/config.toml

This file was deleted.

61 changes: 61 additions & 0 deletions docs/content/_index.md
@@ -0,0 +1,61 @@
+++
title = "LocalAI"
description = "The free alternative to OpenAI and Anthropic. Your All-in-One Complete AI Stack"
type = "home"
+++

**The free alternative to OpenAI and Anthropic. Your All-in-One Complete AI Stack** - Run powerful language models, autonomous agents, and document intelligence **locally** on your hardware.

**No cloud, no limits, no compromise.**

{{% notice tip %}}
**⭐ 33.3k+ stars on GitHub!**

**Drop-in replacement for the OpenAI API** - a modular suite of tools that work seamlessly together or independently.

Start with **[LocalAI](https://localai.io)**'s OpenAI-compatible API, extend with **[LocalAGI](https://github.com/mudler/LocalAGI)**'s autonomous agents, and enhance with **[LocalRecall](https://github.com/mudler/LocalRecall)**'s semantic search - all running locally on your hardware.

**Open Source**, MIT licensed.
{{% /notice %}}

## Why Choose LocalAI?

**OpenAI API Compatible** - Run AI models locally with our modular ecosystem. From language models to autonomous agents and semantic search, build your complete AI stack without the cloud.

### Key Features

- **LLM Inference**: LocalAI is a free, **Open Source** OpenAI alternative. Run **LLMs**, generate **images**, **audio** and more **locally** on consumer-grade hardware.
- **Agentic-first**: Extend LocalAI with LocalAGI, an autonomous AI agent platform that runs locally, no coding required. Build and deploy autonomous agents with ease.
- **Memory and Knowledge Base**: Extend LocalAI with LocalRecall, a local REST API for semantic search and memory management. Perfect for AI applications.
- **OpenAI Compatible**: Drop-in replacement for the OpenAI API. Compatible with existing applications and libraries.
- **No GPU Required**: Run on consumer-grade hardware. No need for expensive GPUs or cloud services.
- **Multiple Models**: Support for various model families including LLMs, image generation, and audio models. Supports multiple backends for inferencing.
- **Privacy Focused**: Keep your data local. No data leaves your machine, ensuring complete privacy.
- **Easy Setup**: Simple installation and configuration. Get started in minutes with prebuilt binaries, Docker, Podman, Kubernetes, or a local installation.
- **Community Driven**: Active community support and regular updates. Contribute and help shape the future of LocalAI.

## Quick Start

**Docker is the recommended installation method** for most users:

```bash
docker run -p 8080:8080 --name local-ai -ti localai/localai:latest
```
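
Once the container is running, you can check the OpenAI-compatible API from the host. A minimal sketch (the `gpt-4` model name in the chat request is only an example; it depends on which models you install or alias):

```bash
# List the models the running instance currently exposes
curl http://localhost:8080/v1/models

# Example chat request, assuming a model named "gpt-4" has been installed or aliased
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}'
```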

For complete installation instructions, see the [Installation guide](/installation/).

## Get Started

1. **[Install LocalAI](/installation/)** - Choose your installation method (Docker recommended)
2. **[Quickstart Guide](/getting-started/quickstart/)** - Get started quickly after installation
3. **[Install and Run Models](/getting-started/models/)** - Learn how to work with AI models
4. **[Try It Out](/getting-started/try-it-out/)** - Explore examples and use cases

## Learn More

- [Explore available models](https://models.localai.io)
- [Model compatibility](/model-compatibility/)
- [Try out examples](https://github.com/mudler/LocalAI-examples)
- [Join the community](https://discord.gg/uJAeKSAGDy)
- [Check the LocalAI Github repository](https://github.com/mudler/LocalAI)
- [Check the LocalAGI Github repository](https://github.com/mudler/LocalAGI)
@@ -2,6 +2,7 @@
weight: 20
title: "Advanced"
description: "Advanced usage"
type: chapter
icon: settings
lead: ""
date: 2020-10-06T08:49:15+00:00
@@ -27,7 +27,7 @@ template:
  chat: chat
```

For a complete reference of all available configuration options, see the [Model Configuration]({{%relref "docs/advanced/model-configuration" %}}) page.
For a complete reference of all available configuration options, see the [Model Configuration]({{%relref "advanced/model-configuration" %}}) page.

**Configuration File Locations:**

@@ -108,7 +108,6 @@ Similarly it can be specified a path to a YAML configuration file containing a l
```yaml
- url: https://raw.githubusercontent.com/go-skynet/model-gallery/main/gpt4all-j.yaml
  name: gpt4all-j
# ...
```

### Automatic prompt caching
@@ -119,7 +118,6 @@ To enable prompt caching, you can control the settings in the model config YAML

```yaml

# Enable prompt caching
prompt_cache_path: "cache"
prompt_cache_all: true

@@ -131,20 +129,18 @@

By default LocalAI will try to autoload the model by trying all the backends. This might work for most models, but some backends are NOT configured to autoload.

The available backends are listed in the [model compatibility table]({{%relref "docs/reference/compatibility-table" %}}).
The available backends are listed in the [model compatibility table]({{%relref "reference/compatibility-table" %}}).

In order to specify a backend for your models, create a model config file in your `models` directory specifying the backend:

```yaml
name: gpt-3.5-turbo

# Default model parameters
parameters:
  # Relative to the models path
  model: ...

backend: llama-stable
# ...
```

### Connect external backends
@@ -183,7 +179,6 @@ make -C backend/python/vllm
When LocalAI runs in a container,
there are additional environment variables available that modify the behavior of LocalAI on startup:

{{< table "table-responsive" >}}
| Environment variable | Default | Description |
|----------------------------|---------|------------------------------------------------------------------------------------------------------------|
| `REBUILD` | `false` | Rebuild LocalAI on startup |
@@ -193,20 +188,17 @@ there are additional environment variables available that modify the behavior of
| `EXTRA_BACKENDS` | | A space separated list of backends to prepare. For example `EXTRA_BACKENDS="backend/python/diffusers backend/python/transformers"` prepares the python environment on start |
| `DISABLE_AUTODETECT` | `false` | Disable autodetect of CPU flagset on start |
| `LLAMACPP_GRPC_SERVERS` | | A list of llama.cpp workers to distribute the workload. For example `LLAMACPP_GRPC_SERVERS="address1:port,address2:port"` |
{{< /table >}}

Here is how to configure these variables:

```bash
# Option 1: command line
docker run --env REBUILD=true localai
# Option 2: set within an env file
docker run --env-file .env localai
```
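
The file passed to `--env-file` is a plain list of `KEY=value` pairs, one per line. A minimal sketch using variables from the table above (the values and addresses are illustrative only; adjust them to your setup):

```bash
# .env - one KEY=value per line, example values only
REBUILD=false
DISABLE_AUTODETECT=false
LLAMACPP_GRPC_SERVERS=192.168.1.10:50051,192.168.1.11:50051
```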

### CLI Parameters

For a complete reference of all CLI parameters, environment variables, and command-line options, see the [CLI Reference]({{%relref "docs/reference/cli-reference" %}}) page.
For a complete reference of all CLI parameters, environment variables, and command-line options, see the [CLI Reference]({{%relref "reference/cli-reference" %}}) page.

You can control LocalAI with command line arguments to specify a binding address, number of threads, model paths, and many other options. Any command line parameter can be specified via an environment variable.
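
As an illustration of this equivalence (the flag and variable names below are assumptions, not a complete reference; consult the CLI Reference page for the authoritative list):

```bash
# Flags on the command line...
local-ai run --address ":8080" --threads 4

# ...or the same settings via environment variables (assumed LOCALAI_*-prefixed names)
LOCALAI_ADDRESS=":8080" LOCALAI_THREADS=4 local-ai run
```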

@@ -282,20 +274,17 @@ A list of the environment variable that tweaks parallelism is the following:
### Python backends GRPC max workers
### Default number of workers for GRPC Python backends.
### This actually controls whether a backend can process multiple requests or not.
# PYTHON_GRPC_MAX_WORKERS=1

### Define the number of parallel LLAMA.cpp workers (Defaults to 1)
# LLAMACPP_PARALLEL=1

### Enable to run parallel requests
# LOCALAI_PARALLEL_REQUESTS=true
```

Note that for llama.cpp you need to set `LLAMACPP_PARALLEL` to the number of parallel processes your GPU/CPU can handle. For Python-based backends (like vLLM) you can set `PYTHON_GRPC_MAX_WORKERS` to the number of parallel requests.
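
For example, a container that accepts parallel requests could be started like this (the worker counts are illustrative; size them to what your hardware can handle):

```bash
# Enable parallel requests and size the worker pools, using the variables listed above
docker run -p 8080:8080 \
  --env LOCALAI_PARALLEL_REQUESTS=true \
  --env LLAMACPP_PARALLEL=4 \
  --env PYTHON_GRPC_MAX_WORKERS=4 \
  localai/localai:latest
```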

### VRAM and Memory Management

For detailed information on managing VRAM when running multiple models, see the dedicated [VRAM and Memory Management]({{%relref "docs/advanced/vram-management" %}}) page.
For detailed information on managing VRAM when running multiple models, see the dedicated [VRAM and Memory Management]({{%relref "advanced/vram-management" %}}) page.

### Disable CPU flagset auto detection in llama.cpp
