[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/weaviate/recipes/blob/main/weaviate-features/model-providers/deepseek/rag_deepseek_r1:1.5b.ipynb)

# A Game Recommender RAG with DeepSeek & Ollama

_by Tuana Celik ([LI](https://www.linkedin.com/in/tuanacelik/), [X](https://x.com/tuanacelik), [🦋](https://bsky.app/profile/tuana.dev))_

In this recipe, we build a custom RAG application that recommends games.

We use `games.csv` from the Epic Games dataset made public [here on Kaggle](https://www.kaggle.com/datasets/mexwell/epic-games-store-dataset).

- LLM: We run the `deepseek-r1:1.5b` model with Ollama
- Embedding model: For this example, we use the `text2vec_openai` component with Weaviate, using the default `text-embedding-3-small` model.

## Install Dependencies & Set API Keys

In [None]:
!pip install weaviate-client pandas tqdm ollama

In [9]:
from getpass import getpass
import os

if "OPENAI_APIKEY" not in os.environ:
    os.environ["OPENAI_APIKEY"] = getpass("Enter your OpenAI API Key")

## Run the Model

For this example, we run the `deepseek-r1:1.5b` model locally with Ollama. For more information on how to run it on your OS, check out the [Ollama Docs](https://ollama.com/library/deepseek-r1).

For example, on Mac:

```bash
ollama run deepseek-r1:1.5b
```
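
Before moving on, it can help to confirm that the model is actually available locally. The snippet below is a minimal sketch, assuming Ollama is serving on its default address (`http://localhost:11434`) and that its `/api/tags` endpoint lists local models under a `models` key with a `name` field. The helper names `has_model` and `ollama_has_model` are made up for this example.

```python
import json
from urllib.request import urlopen

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default local address


def has_model(tags_payload: dict, model: str) -> bool:
    """Return True if `model` appears in a parsed /api/tags response."""
    return any(m.get("name") == model for m in tags_payload.get("models", []))


def ollama_has_model(model: str, base_url: str = OLLAMA_URL) -> bool:
    """Ask a running Ollama server whether `model` is available locally."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return has_model(json.load(resp), model)
```

With Ollama running, calling `ollama_has_model("deepseek-r1:1.5b")` should tell you whether the model is ready to use.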

## Create & Populate Weaviate Collection

First, we have to create a Weaviate collection and add some data into it. To complete this section:
1. Download the `games.csv` file from Kaggle
2. Use the following `docker-compose.yml` and run `docker compose up` to start Weaviate with the `generative-ollama` and `text2vec-openai` modules enabled. 

```yml 
---
services:
  weaviate_anon:
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.28.4
    ports:
    - 8080:8080
    - 50051:50051
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_API_BASED_MODULES: 'true'
      BACKUP_FILESYSTEM_PATH: '/var/lib/weaviate/backups'
      CLUSTER_HOSTNAME: 'node1'
      LOG_LEVEL: 'trace'
      ENABLE_MODULES: "text2vec-openai,generative-ollama"
...

```
3. Now, you can create a new collection called `Games`:

In [10]:
import weaviate
import weaviate.classes.config as wc
from weaviate.util import generate_uuid5
import os
from tqdm import tqdm
import pandas as pd

headers = {"X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY")}
client = weaviate.connect_to_local(headers=headers)

if client.collections.exists("Games"):
    client.collections.delete("Games")
client.collections.create(
    name="Games",
    properties=[
        wc.Property(name="name", data_type=wc.DataType.TEXT),
        wc.Property(name="price", data_type=wc.DataType.INT),
        wc.Property(name="platforms", data_type=wc.DataType.TEXT_ARRAY),
        wc.Property(name="release_date", data_type=wc.DataType.DATE),
        wc.Property(name="description", data_type=wc.DataType.TEXT),
    ],
    generative_config=wc.Configure.Generative.ollama(model="deepseek-r1:1.5b",
                                                     api_endpoint="http://host.docker.internal:11434"),
    vectorizer_config=wc.Configure.Vectorizer.text2vec_openai(),

)

<weaviate.collections.collection.sync.Collection at 0x12999fd40>

4. Finally, we can add some of the data from `games.csv` to our collection. 

In [11]:

games = client.collections.get("Games")

df = pd.read_csv('games.csv')

with games.batch.dynamic() as batch:
    for i, game in tqdm(df.iterrows()):
        platforms = game["platform"].split(',') if type(game["platform"]) is str else []
        game_obj = {
            "name": game["name"],
            "platforms": platforms,
            "price": game["price"],
            "release_date": game["release_date"],
            "description": game["description"],
        }

        batch.add_object(
            properties=game_obj,
            uuid=generate_uuid5(game["id"])
        )
if len(games.batch.failed_objects) > 0:
    print(f"Failed to import {len(games.batch.failed_objects)} objects")
    print(games.batch.failed_objects)

915it [00:09, 96.86it/s] 
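
The import loop above splits the `platform` column on commas. If the values in your copy of `games.csv` contain spaces after the commas (an assumption, not something verified here), the resulting list items keep that whitespace. A slightly more defensive parser, sketched with a hypothetical helper name `parse_platforms`, could look like this:

```python
from typing import List


def parse_platforms(value) -> List[str]:
    """Split a comma-separated platform string into a clean list.

    Non-string values (for example NaN from pandas for empty cells)
    yield an empty list, mirroring the check in the import loop.
    """
    if not isinstance(value, str):
        return []
    # Strip whitespace around each entry and drop empty entries
    return [p.strip() for p in value.split(",") if p.strip()]
```

You could then use `platforms = parse_platforms(game["platform"])` inside the loop.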


## Embedding Search 

The code block below returns the 3 games most relevant to the query. At this point we are only retrieving: we are not yet doing RAG with a specific instruction over the retrieved games.

In [12]:
response = games.query.near_text(query="I play the vilain", limit=3)

for o in response.objects:
    print(o.properties)

{'platforms': ['Windows'], 'description': "A dark fantasy roguelike where you play as the Devil! Lead famous evil geniuses through events and turn-based fights to spread terror and corruption, and use your evil powers to change the game's rules to your advantage.", 'price': 2499, 'release_date': datetime.datetime(2021, 9, 30, 8, 0, tzinfo=datetime.timezone.utc), 'name': 'Rogue Lords'}
{'platforms': ['Windows'], 'description': 'Smash, clobber and bash the murderous legends of Slavic mythology in this darkly funny action role-playing game that changes every time you play. Play as Ivan, a one-handed blacksmith with incredibly bad luck, who must take on the impossible tasks given to him by the tzar. All...', 'price': 2499, 'release_date': datetime.datetime(2019, 11, 12, 14, 0, tzinfo=datetime.timezone.utc), 'name': 'Yaga'}
{'platforms': ['Windows'], 'description': 'In a violent, medieval world, outplay rival gangs in intense PvPvE multiplayer heists. Moving in stealth to steal treasures un
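
To build some intuition for what `near_text` does, here is a rough, brute-force sketch of similarity ranking. It is not how Weaviate is implemented (Weaviate embeds the query with the configured vectorizer and searches a vector index); the 3-dimensional vectors and item names below are invented purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, items, k=3):
    """Return the k item names whose vectors are most similar to query_vec."""
    ranked = sorted(
        items,
        key=lambda name: cosine_similarity(query_vec, items[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy "embeddings", made up for this example only
toy_vectors = {
    "villain roguelike": [0.9, 0.1, 0.0],
    "farming sim": [0.0, 0.2, 0.9],
    "heist game": [0.7, 0.6, 0.1],
}
query_vec = [1.0, 0.2, 0.0]  # pretend embedding of "I play the villain"

print(top_k(query_vec, toy_vectors, k=2))
# → ['villain roguelike', 'heist game']
```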

## Recommendation RAG

Finally, we can create a `recommend_game` function that does RAG with a `grouped_task` instruction.

You can try changing this instruction to something else too!

Below, we create an application that recommends games based on the 5 games in the dataset most relevant to the user query, and that also states which operating systems each game is available on 👇

This generated response uses `deepseek-r1:1.5b`, which provides the thought generated by the model between `<think> </think>` tags. I've set up this function to return both `recommendation` and `thought`. You can later print these out separately. 

In [28]:
def recommend_game(query: str):
    
    response = games.generate.near_text(
        query=query,
        limit=5,
        grouped_task=f"""You've been provided some relevant games based on the user's query. 
        Provide an answer to the query. Your final answer MUST indicate the platform each game is available on. 
        User query: {query}""",
        grouped_properties=["name", "description", "price", "platforms"],     
    )
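    # Split the model's reasoning from its final answer at the closing </think> tag.
    thought, sep, recommendation = response.generated.partition('</think>')
    if not sep:  # no </think> tag found: treat the whole output as the answer
        thought, recommendation = '', response.generated
    return {'thought': thought, 'recommendation': recommendation}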
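
If you want to experiment with different instructions, one option is to build the `grouped_task` string in a small helper so that the instruction becomes a parameter. This is only a sketch: `build_grouped_task` and `DEFAULT_INSTRUCTION` are hypothetical names introduced here, not part of the Weaviate client.

```python
DEFAULT_INSTRUCTION = (
    "You've been provided some relevant games based on the user's query. "
    "Provide an answer to the query. "
    "Your final answer MUST indicate the platform each game is available on."
)


def build_grouped_task(query: str, instruction: str = DEFAULT_INSTRUCTION) -> str:
    """Combine an instruction and the user query into one prompt string."""
    return f"{instruction.strip()}\nUser query: {query.strip()}"
```

The result can be passed as `grouped_task=build_grouped_task(query, instruction)` in the `generate.near_text` call, for example with an instruction such as "Recommend one game and mention its price."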

In [30]:
response = recommend_game("What are some games that I get to role play a magical creature")
print(response['recommendation'])



Here are several games that allow you to role-play as a magical creature:

1. **Mages of Mystralia**  
   - **Platform:** Windows  
   - Description: A fantasy RPG where you design your own spells in a world of magic, allowing creativity and flexibility.

2. **Geneforge 1 - Mutagen**  
   - **Platforms:** Windows, Mac  
   - Description: An open-ended RPG with mutant monsters, multiple skills, treasures, factions, and creation possibilities, offering unparalleled freedom and replayability.

3. **Beasts of Maravilla Island**  
   - **Platform:** Windows  
   - Description: A 3D adventure game where you role as a wildlife photographer exploring magical ecosystems, focusing on behavior learning for photography.

4. **Paper Beast**  
   - **Platforms:** Windows (PC)  
   - Description: An adventure game about disrupting wildlife balance with a focus on exotic creatures and mystery-solving.

5. **Black Book**  
   - **Platform:** Windows  
   - Description: A dark RPG based on Slavic myth

In [32]:
print(response['thought'])

<think>
Okay, so I need to figure out some games that let me role-play as a magical creature. The user provided several options, each with a description and platform. Let me go through them one by one.

First up is "Mages of Mystralia." From the description, it's a fantasy RPG where you design your own spells in a world of magic. That sounds perfect because it allows for a lot of creativity as a magical creature. The platform is Windows, so maybe the user can run it on their PC or any desktop system.

Next is "Geneforge 1 - Mutagen." This seems like an open-ended RPG with mutant monsters and a battle mechanic. It's described as having countless skills, treasures, factions, and creation possibilities. Unmatched freedom and replayability make sense because it allows for various storylines. The platform here is Windows and Mac, so compatible options would be useful.

Then there's "Beasts of Maravilla Island." As the name suggests, it's a 3D adventure game where you take on a wildlife phot