# Retrieval augmented chat - baseline

This notebook is the primary demonstration of the project using the baseline model. We load the baseline model and the vector database, then ask the same questions both with and without context retrieved from the database.

In [1]:
import chromadb
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)
from IPython.display import Markdown, display

## Initialize some variables

In [2]:
collection_name = "Corpus"
merged_model_dir = "mistralai/Mistral-7B-Instruct-v0.1"
device_map = {"": 0}
device = "cuda"
database_top_n_results = 2

## Load the collection

In [3]:
client = chromadb.PersistentClient(path="db/")
collection = client.get_collection(name = collection_name)

## Load the model

In [4]:
merged_model = AutoModelForCausalLM.from_pretrained(
    merged_model_dir,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map=device_map,
)

tokenizer = AutoTokenizer.from_pretrained(merged_model_dir)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"


In [5]:
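def send_to_model(msg):
    """Wrap msg in the chat template, sample a completion, and return the decoded sequences."""
    messages = [
        {"role": "user", "content": msg},
    ]

    # Tokenize with the model's chat template and move the ids to the GPU.
    encoded = tokenizer.apply_chat_template(messages, return_tensors="pt").to(device)

    generated_ids = merged_model.generate(
        encoded,
        max_new_tokens=1000,
        do_sample=True,
        temperature=0.8,
        pad_token_id=tokenizer.eos_token_id,
    )
    # The decoded text contains the prompt followed by the completion.
    decoded = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
    return decoded

def send_chat(msg):
    """Return only the model's reply, i.e. the text after the last [/INST] marker."""
    result = send_to_model(msg)[0]
    return result.rsplit(" [/INST] ", 1)[1]

def basic_chat(msg):
    """Print the model's reply to msg."""
    print(send_chat(msg))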
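
`send_chat` relies on the decoded text containing the exact separator ` [/INST] `; if the template or its whitespace differs, the `[1]` index raises an `IndexError`. A minimal sketch of a more tolerant extraction follows. The helper name `extract_reply` is hypothetical and not part of this notebook, and it assumes the decoded text has the form `[INST] question [/INST] answer`.

```python
def extract_reply(decoded: str, marker: str = "[/INST]") -> str:
    """Return the text after the last marker, or the whole text if it is absent.

    Hypothetical helper: assumes the prompt portion ends with `marker`.
    """
    # rpartition returns ("", "", decoded) when the marker is missing,
    # so the last element is always safe to use.
    _, _, reply = decoded.rpartition(marker)
    return reply.strip()


print(extract_reply("[INST] How do I loop a GIF? [/INST] Use the -loop option."))
print(extract_reply("no marker here"))
```

Another option, under the same assumptions, is to decode only the newly generated tokens by slicing `generated_ids[:, encoded.shape[1]:]` before calling `batch_decode`, which avoids string matching altogether.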

## Try the model without access to the vector database

In [6]:
basic_chat("How do I loop a GIF?")

To loop a GIF, you can use HTML5 and JavaScript. Here is an example code:
```
<img src="your-gif-url.gif" alt="Your GIF" loop>
```
This will loop the GIF indefinitely.

If you want to add a pause and play button, you can use the following code:
```
<img src="your-gif-url.gif" alt="Your GIF" loop controls>
```
The `controls` attribute will add play and pause buttons to the GIF.


## Add some retrieval convenience methods
These methods query the vector database for the top `database_top_n_results` documents, add them to the request context, then annotate the response with links to the documents used as context.

In [7]:
def printmd(string):
    display(Markdown(string))
    
def retrieval_augmented_chat(msg):
    query_result = collection.query(
        query_texts=[msg], 
        n_results=database_top_n_results
    )
    question_with_context = ""
    if len(query_result['documents'][0]) > 0:
        question_with_context = "Based on the following documents:\n" + "\n\n".join(query_result['documents'][0]) + "\n Answer the following question with lots of details: "
    question_with_context += msg
    model_response = send_chat(question_with_context)

    doc_links = ""
    if len(query_result['metadatas'][0]) > 0:
        doc_links = "\n\n **Reference documents:** \n\n"
        # Results are nested one list per query text; index 0 is our single query.
        for metadata, score in zip(query_result['metadatas'][0], query_result['distances'][0]):
            source = metadata['source']
            doc_links += f"* [{source}]({source}) score: {score:3.2f}\n"
    return model_response + doc_links
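
To see what `retrieval_augmented_chat` sends to the model without loading the model or the database, the prompt assembly can be isolated into a pure function. This is only a sketch: `build_prompt` and the sample `fake_result` dictionary are hypothetical, and the dictionary merely mimics the nested list-per-query shape that the function above indexes with `[0]`.

```python
def build_prompt(question: str, documents: list[str]) -> str:
    """Prefix the question with retrieved documents, mirroring the cell above."""
    if not documents:
        # No retrieved context: send the bare question.
        return question
    context = "\n\n".join(documents)
    return (
        "Based on the following documents:\n"
        + context
        + "\n Answer the following question with lots of details: "
        + question
    )


# Hypothetical stand-in for a query result: one inner list per query text.
fake_result = {
    "documents": [["Doc about -loop.", "Doc about compression."]],
    "metadatas": [[{"source": "a.md"}, {"source": "b.md"}]],
    "distances": [[0.97, 1.28]],
}

prompt = build_prompt("How do I loop a GIF?", fake_result["documents"][0])
print(prompt)
```

Keeping the prompt assembly separate makes it easy to inspect or adjust the instruction wording before spending time on generation.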



## Run retrieval augmented chat
Notice that the response has shifted from a general HTML/JavaScript answer to one grounded in the retrieved documents, which give examples of encoding the GIF with ImageMagick so that it loops by default.

In [8]:
printmd(retrieval_augmented_chat("How do I loop a GIF?"))

To loop a GIF, you can use the `-loop` option in ImageMagick. The syntax is as follows:
```css
convert input.gif -loop count output.gif
```
Where `input.gif` is the name of the GIF file you want to loop, and `output.gif` is the name of the output file where you want to save the looped GIF. The `count` parameter specifies how many times you want the GIF to loop. For example, to loop the GIF indefinitely, you can set the count to `0`.

Here's an example command to loop a GIF called `chrome-samesite-missing.gif` indefinitely and save the output file as `chrome-samesite-missing-loop.gif`:
```bash
convert chrome-samesite-missing.gif -loop 0 chrome-samesite-missing-loop.gif
```
It's important to note that the output filename should come after the `-loop` option. If you specify the filename before the `-loop` option, it will not work.

You can also specify the number of times the GIF should loop using the `count` parameter. For example, to loop the GIF 10 times, you would use the following command:
```bash
convert chrome-samesite-missing.gif -loop 10 chrome-samesite-missing-loop.gif
```
In summary, to loop a GIF using ImageMagick, you need to specify the `-loop` option followed by the number of times you want the GIF to loop, and save the output file with the desired name.

 **Reference documents:** 

* [corpus/imagemagick/set-a-gif-to-loop.md](corpus/imagemagick/set-a-gif-to-loop.md) score: 0.97
* [corpus/imagemagick/compress-animated-gif.md](corpus/imagemagick/compress-animated-gif.md) score: 1.28
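
The score printed next to each reference is the distance returned by the query, so a smaller number means a closer match; here the first document (0.97) is nearer to the question than the second (1.28). One possible refinement, sketched below, is to drop results beyond a cutoff before building the prompt. The function name `filter_by_distance` and the cutoff value `1.0` are illustrative assumptions, not values tuned for this corpus.

```python
def filter_by_distance(documents, metadatas, distances, max_distance):
    """Keep only the results whose distance is at most max_distance."""
    return [
        (doc, meta, dist)
        for doc, meta, dist in zip(documents, metadatas, distances)
        if dist <= max_distance
    ]


# Sources and distances are copied from the reference list above;
# the document text is a placeholder.
results = filter_by_distance(
    documents=["<text of first document>", "<text of second document>"],
    metadatas=[
        {"source": "corpus/imagemagick/set-a-gif-to-loop.md"},
        {"source": "corpus/imagemagick/compress-animated-gif.md"},
    ],
    distances=[0.97, 1.28],
    max_distance=1.0,  # assumed cutoff, for illustration only
)
for doc, meta, dist in results:
    print(f"{meta['source']} score: {dist:3.2f}")
```

With this cutoff only the first document would be passed to the model as context; a suitable threshold depends on the embedding model and distance metric in use.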


In [10]:
printmd(retrieval_augmented_chat("Can I use npx with GitHub actions?"))

Yes, you can use `npx` with GitHub Actions. The `npx` command can be used to run a specific version of a npm package without having the package listed in the `package.json` file. This can be useful in cases where you want to use a specific version of a package for a particular task, or where you want to use a package that is not in the `package.json` file.

To use `npx` with GitHub Actions, you will need to configure the `actions/cache` action to cache the package version that you want to use. This can be done by setting the `npm-tag` or `cache-dependency-key` option in the `actions/cache` action.

For example, if you want to use `npx` to install the `get-graphql-schema` package from npm, you can configure the `actions/cache` action like this:
```yaml
- uses: actions/cache@v2
  with:
    path: /path/to/npm-dependencies
    key: ${{ runner.os }}-npx-get-graphql-schema
    restore-keys: |
      ${{ runner.os }}-npm-dependencies/*
    stage: cache
    action: "npm install"
    npm-tag: global
```
This will cache the `npx` installation of `get-graphql-schema` in the `npm-dependencies` directory. The `key` option is used to identify the package and the `npm-tag` option is used to specify that the package should be installed globally.

Once the package is cached, you can use the `npx` command to run the `get-graphql-schema` package in your GitHub Actions workflow. For example:
```yaml
- name: Install and run get-graphql-schema
  run: npx get-graphql-schema https://api.fly.io/graphql > flyctl/fly.graphql
```
This will run the `npx` command to install and run the `get-graphql-schema` package, using the version that is cached in the `npm-dependencies` directory.

Overall, using `npx` with GitHub Actions can be a useful way to run specific versions of packages without having to manage dependencies in a `package.json` file. However, it is important to configure the `actions/cache` action correctly to ensure that the correct version of the package is used and that the cache is managed properly.

 **Reference documents:** 

* [corpus/github-actions/npm-cache-with-npx-no-package.md](corpus/github-actions/npm-cache-with-npx-no-package.md) score: 0.88
* [corpus/github-actions/attach-generated-file-to-release.md](corpus/github-actions/attach-generated-file-to-release.md) score: 1.11
