# 📓 The GenAI Revolution Cookbook

**Title:** How to Deploy DeepSeek-R1 Locally with Ollama, MongoDB, and a Chat UI

**Description:** Build a private DeepSeek-R1 chatbot with Ollama, MongoDB, and a chat UI, with no external APIs. Deployment steps for local setups or AWS.

---

*This Jupyter notebook contains executable code examples. Run the cells below to try out the code yourself!*



By now, you have probably heard of DeepSeek R1, the open-source large language model from Chinese startup DeepSeek. Its release made headlines and coincided with a selloff in several AI-related stocks in U.S. markets. Many reviews have already covered how impressive the model is, so this guide focuses on a more practical angle instead: you can download the model and run it on your own machine.

Why would you do that? Maybe you do not want to send data to a third-party API. Or you want to control costs. Running locally also lets you fine-tune the model and customize everything to your stack.

The good news: deploying DeepSeek R1 on your own hardware is straightforward. Here is how to do it in 2025.

## Get a machine: AWS EC2 instance

First, pick a machine to run DeepSeek R1\. If you are experimenting or building a personal chatbot, your local computer can work. If you are building for production, you might prefer dedicated servers. If you want to get started quickly, a cloud instance is the fastest route.

For a lightweight start, an AWS EC2 CPU instance is enough for the 1.5B parameter variant. The original version of this walkthrough used an m5.2xlarge, which still works. You can also choose a newer generation instance like m7i.2xlarge or m7g.2xlarge (Arm-based Graviton) to improve price performance.

If you want faster responses or plan to try larger variants, use a GPU instance. A g6\.xlarge or g5\.xlarge is a good baseline. These give you an NVIDIA GPU with enough VRAM for 7B class models at practical quantization levels.

To launch an EC2 instance, follow the official AWS guide: AWS EC2 Getting Started Guide. <https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EC2_GetStarted.html>

## Set up your machine

You will install the essentials needed to run DeepSeek R1\. Start with a fresh Ubuntu LTS image to keep things clean. Ubuntu 24\.04 LTS is a solid default in late 2025\.

Connect to your instance over SSH using AWS's official steps: Connecting to Your Linux Instance Using SSH. <https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/connect-linux-inst-ssh.html>

### Install dependencies

In [None]:
# Update system packages
sudo apt update && sudo apt upgrade -y

# Install dependencies
sudo apt install -y curl git

# Install Ollama
# This script will create, enable, and start the Ollama systemd service
curl -fsSL https://ollama.com/install.sh | sh

# Install Node.js and npm (for Chat UI)
# Node.js 18 reached end-of-life in April 2025, so use a maintained LTS line instead
curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
sudo apt install -y nodejs

# Restart shell session to apply changes
exec bash
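
Before moving on, it can help to confirm that the tools from the install step are actually on your PATH. The snippet below is a minimal sketch using only the Python standard library. The helper name `missing_tools` and the `REQUIRED_TOOLS` list are illustrative assumptions, not part of Ollama or Chat UI.

```python
import shutil

# Tools installed in the step above; adjust the list if your setup differs.
REQUIRED_TOOLS = ["curl", "git", "ollama", "node", "npm"]


def missing_tools(tools):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("Missing from PATH:", ", ".join(missing))
else:
    print("All required tools are on PATH.")
```

If anything is reported missing, rerun the matching install command and open a new shell before continuing.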

### Download and serve DeepSeek R1 on Ollama

Choose the DeepSeek R1 model variant based on your resources and performance needs. In this walkthrough, you will use DeepSeek R1 1\.5B, the smallest version available in many community runtimes.

The 1\.5B model has 1\.5 billion parameters. It runs well on consumer hardware or modest cloud instances. It offers lower compute demands and delivers solid results for everyday chat and coding use cases.

As of late 2025, you can also choose larger variants in Ollama and similar runtimes. For example, several 7B and 8B options are available. Larger models respond more coherently and reason better, but they require more memory. Here is a quick rule of thumb to help you decide:

* CPU only with 16 to 32 GB RAM. Use 1\.5B or a quantized 7B model.
* Single mid range GPU with 16 to 24 GB VRAM. Use 7B or 8B quantized models comfortably.
* High VRAM GPUs. Consider larger models if you need stronger reasoning and can tolerate higher cost.
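
The rule of thumb above follows from simple arithmetic: the memory needed for the weights is roughly the parameter count times the bytes per weight. The sketch below is a rough estimate only. The function name `estimate_weight_memory_gb` is hypothetical, the bit widths are assumed inputs, and the result covers the weights alone. Context cache and runtime overhead add more, and practical 4-bit formats such as Q4_K_M use somewhat more than 4 bits per weight on average.

```python
def estimate_weight_memory_gb(params_billions, bits_per_weight):
    """Approximate memory for the weights alone, in GB (1 GB = 10^9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    # (params_billions * 1e9 weights) * bytes_per_weight / (1e9 bytes per GB)
    return params_billions * bytes_per_weight


# Nominal sizes of the DeepSeek R1 tags mentioned in this walkthrough.
for size in (1.5, 8, 14, 32, 70):
    four_bit = estimate_weight_memory_gb(size, 4)
    sixteen_bit = estimate_weight_memory_gb(size, 16)
    print(f"{size:>4}B  ~{four_bit:5.1f} GB at 4-bit  ~{sixteen_bit:6.1f} GB at 16-bit")
```

Treat the numbers as a lower bound and leave headroom when you pick an instance type.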

In [None]:
# Pull only the variant you plan to run; each pull downloads the full model weights.

# 1.5B version (smallest, lightweight, suitable for low-resource setups)
ollama pull deepseek-r1:1.5b

# Larger variants: uncomment the one you need.

# 8B version (mid-range, balances performance and resource usage)
# ollama pull deepseek-r1:8b

# 14B version (higher accuracy, requires more compute power)
# ollama pull deepseek-r1:14b

# 32B version (powerful, best for advanced tasks, needs high-end hardware)
# ollama pull deepseek-r1:32b

# 70B version (largest, highest performance, very resource-intensive)
# ollama pull deepseek-r1:70b

After the download completes, list the installed models to confirm the model is available.

In [None]:
$ ollama list
NAME                ID              SIZE      MODIFIED
deepseek-r1:1.5b    a42b25d8c10a    1.1 GB    2 seconds ago

Ollama serves on <http://127.0.0.1:11434> by default. Check that the service is running with the command below. Keep this API URL handy, because you will use it in your chat UI configuration.

In [None]:
# Check if Ollama is running and list downloaded models
curl http://127.0.0.1:11434/api/tags

You should see output listing the models available on your machine.

In [None]:
{
   "models":[
      {
         "name":"deepseek-r1:1.5b",
         "model":"deepseek-r1:1.5b",
         "modified_at":"2025-02-01T17:05:07.520024256Z",
         "size":1117322599,
         "digest":"a42b25d8c10a841bd24724309898ae851466696a7d7f3a0a408b895538ccbc96",
         "details":{
            "parent_model":"",
            "format":"gguf",
            "family":"qwen2",
            "families":[
               "qwen2"
            ],
            "parameter_size":"1.8B",
            "quantization_level":"Q4_K_M"
         }
      }
   ]
}
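
If you would rather check the installed models from Python than read raw JSON, the response shape shown above is easy to summarize. This is a minimal sketch that assumes the response keeps the fields visible in the sample output (`name`, `size`, `details.quantization_level`). The helper names `summarize_models` and `fetch_tags` are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # default Ollama address used in this walkthrough


def summarize_models(tags_response):
    """Turn an /api/tags response dict into (name, size in GB, quantization) tuples."""
    summary = []
    for model in tags_response.get("models", []):
        details = model.get("details", {})
        size_gb = round(model["size"] / 1e9, 2)  # size is reported in bytes
        summary.append((model["name"], size_gb, details.get("quantization_level", "unknown")))
    return summary


def fetch_tags(base_url=OLLAMA_URL, timeout=5):
    """Fetch and decode /api/tags from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        return json.load(response)
```

With Ollama running, `summarize_models(fetch_tags())` returns one tuple per installed model.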

Test the model with a simple generate call.

In [None]:
# Generation parameters such as num_predict go inside "options"
curl -X POST http://127.0.0.1:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "What is Ollama?",
  "stream": false,
  "options": {
    "num_predict": 100
  }
}'
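
The same generate call can be made from Python with only the standard library. This is a sketch, not an official client. The helper names are hypothetical, and `num_predict` is passed inside `options`, which is where Ollama's API accepts model parameters. The `strip_reasoning` helper assumes the model returns its reasoning inline between `<think>` and `</think>` tags, which depends on your Ollama version and settings. With a low `num_predict` the closing tag may be cut off, in which case the text is returned unchanged.

```python
import json
import re
import urllib.request


def build_generate_payload(model, prompt, num_predict=100):
    """Build the JSON body for a non-streaming /api/generate request."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_predict": num_predict},
    }


def strip_reasoning(text):
    """Remove complete <think>...</think> blocks and surrounding whitespace."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


def generate(model, prompt, base_url="http://127.0.0.1:11434", timeout=120):
    """Send the prompt to a running Ollama server and return the response text."""
    body = json.dumps(build_generate_payload(model, prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["response"]
```

With the server running, `strip_reasoning(generate("deepseek-r1:1.5b", "What is Ollama?"))` gives you just the final answer when the reasoning block is complete.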

## Set up the chat interface

DeepSeek R1 is running. Next, add a chat UI so you can talk to your model from a browser.

### Install MongoDB

The chat UI stores conversation history in a MongoDB instance. This is required for it to function properly. The simplest path is a local MongoDB container with a persistent volume. Docker makes this easy and repeatable.

In [None]:
sudo snap install docker
# Mount the named volume at /data/db, where MongoDB stores its data files
sudo docker run -d -p 27017:27017 -v mongo-chat-ui:/data/db --name mongo-chat-ui mongo:latest

When MongoDB is running, the database is available at:
mongodb://localhost:27017

You will add this URL to your chat UI configuration file (.env.local).
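
Before wiring the URL into the chat UI, you may want to confirm that something is listening on the MongoDB port. The sketch below uses only the standard library, and the helper names `mongo_host_port` and `is_reachable` are illustrative. A successful TCP connection only shows that the port is open. It does not prove the service behind it is MongoDB or that it is healthy.

```python
import socket
from urllib.parse import urlparse


def mongo_host_port(mongodb_url):
    """Extract (host, port) from a mongodb:// URL, defaulting to port 27017."""
    parsed = urlparse(mongodb_url)
    return parsed.hostname, parsed.port or 27017


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage on the machine running the container:
# host, port = mongo_host_port("mongodb://localhost:27017")
# print("MongoDB port open:", is_reachable(host, port))
```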

### Download and install Chat UI

In [None]:
# Clone Chat UI
git clone https://github.com/huggingface/chat-ui.git
cd chat-ui

# Install dependencies
npm install

### Configure Chat UI

Update your .env.local file with the following values.

* MongoDB URL: mongodb://localhost:27017\. This stores chat history.
* Ollama Endpoint: [http://127\.0\.0\.1:11434](http://127.0.0.1:11434). This is your local Ollama API.
* Ollama Model Name: deepseek\-r1:1\.5b. Replace this with the exact model tag you installed.

In [None]:
# Create a .env.local file:
nano .env.local

Paste the following configuration into the file. You can tweak the values under parameters to suit your hardware and latency goals.

In [None]:
MONGODB_URL=mongodb://localhost:27017
MODELS=`[
  {
    "name": "DeepSeek-R1",
    "chatPromptTemplate": "<s>{{#each messages}}{{#ifUser}}[INST] {{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s> {{/ifAssistant}}{{/each}}",
    "parameters": {
      "temperature": 0.3,
      "top_p": 0.95,
      "max_new_tokens": 1024,
      "stop": ["</s>"]
    },
    "endpoints": [
      {
        "type": "ollama",
        "url" : "http://127.0.0.1:11434",
        "ollamaName" : "deepseek-r1:1.5b" 
      }
    ]
  }
]`

When you finish, save and exit. Use CTRL\+X, then Y, then ENTER.
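
As an alternative to typing the file by hand, you can generate it, which avoids JSON typos when you switch model tags. This is a sketch under stated assumptions: `render_env_local` is a hypothetical helper, and the parameter values and prompt template are copied from the configuration above rather than tuned for any model.

```python
import json

# Prompt template copied verbatim from the configuration shown above.
PROMPT_TEMPLATE = (
    "<s>{{#each messages}}{{#ifUser}}[INST] {{content}} [/INST]{{/ifUser}}"
    "{{#ifAssistant}}{{content}}</s> {{/ifAssistant}}{{/each}}"
)


def render_env_local(mongodb_url, ollama_url, ollama_model, display_name="DeepSeek-R1"):
    """Return the text of a .env.local file for Chat UI with one Ollama model."""
    models = [
        {
            "name": display_name,
            "chatPromptTemplate": PROMPT_TEMPLATE,
            "parameters": {
                "temperature": 0.3,
                "top_p": 0.95,
                "max_new_tokens": 1024,
                "stop": ["</s>"],
            },
            "endpoints": [
                {"type": "ollama", "url": ollama_url, "ollamaName": ollama_model}
            ],
        }
    ]
    # MODELS is wrapped in backticks so the multi-line JSON stays one value.
    return f"MONGODB_URL={mongodb_url}\nMODELS=`{json.dumps(models, indent=2)}`\n"


print(render_env_local("mongodb://localhost:27017", "http://127.0.0.1:11434", "deepseek-r1:1.5b"))
```

Redirect the printed output to `.env.local` in the chat-ui directory, or write it with `pathlib.Path.write_text` if you prefer to stay in Python.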

## Use your very own DeepSeek R1 chatbot

You are ready to use your DeepSeek R1 chatbot.

### Start Chat UI

In [None]:
# Start the Chat UI in development mode, making it accessible on the network
$ npm run dev -- --host 0.0.0.0

# The output confirms the server is running and displays the accessible port
> chat-ui@0.9.4 dev
> vite dev --host 0.0.0.0


  VITE v5.4.14  ready in 1122 ms

  ➜  Local:   http://localhost:5173/
  ➜  Network: http://100.00.00.000:5173/
  ➜  Network: http://100.00.0.0:5173/
  ➜  press h + enter to show help

If you are running on an AWS EC2 instance, open the UI port (5173 in the output above) in the instance security group. You can do this in the AWS Console under EC2 → Security Groups → Inbound Rules, or with the AWS CLI.
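
If you manage AWS from Python, the same inbound rule can be added programmatically. The sketch below assumes the third-party `boto3` package is installed and AWS credentials are configured. The helper names, the security group ID, and the example address are placeholders, not values from this walkthrough. Restricting the rule to your own IP address is safer than opening the port to everyone.

```python
import ipaddress


def build_ingress_rule(port, cidr, description="Chat UI"):
    """Build one TCP ingress permission for a single port and IPv4 CIDR block."""
    network = ipaddress.ip_network(cidr, strict=True)  # raises ValueError if malformed
    if network.version != 4:
        raise ValueError("this sketch only handles IPv4 CIDR blocks")
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": str(network), "Description": description}],
    }


def open_port(group_id, port, cidr):
    """Add the ingress rule to a security group (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the helper above works without it

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[build_ingress_rule(port, cidr)],
    )


# Example usage with placeholder values:
# open_port("sg-0123456789abcdef0", 5173, "203.0.113.10/32")
```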

For a public deployment, consider adding a reverse proxy with HTTPS and enabling authentication. This protects both your Ollama endpoint and the chat UI.

### Access your chatbot

Open your machine's public address and the Chat UI port (5173 by default) in a browser. You should see the chat interface.

<img src='http://thegenairevolution.com/wp-content/uploads/2025/01/image-2-1024x644.png' alt='' title='' width='1024' height='644' /><img src='http://thegenairevolution.com/wp-content/uploads/2025/01/image-3-1024x534.png' alt='' title='' width='1024' height='534' />

## Conclusion

You now have a powerful AI model running under your control, on your own machine, and inside your own security perimeter.

### Recap

* You set up the environment by installing Ollama, MongoDB, and required dependencies.
* You downloaded and configured DeepSeek R1 to run locally.
* You set up a Chat UI and linked it to MongoDB and Ollama.
* You ensured network access by opening the needed ports on AWS.
* You accessed the chatbot from your browser.
* Optional. You picked a GPU instance for faster responses and larger models.

Your locally hosted DeepSeek R1 chatbot is now up and running.