# Retrieval augmented chat - Mistral-7B locally fine-tuned

This notebook is the primary demonstration of the project. It uses the fine-tuned model derived from the untuned Mistral-7B model. We bring up the locally fine-tuned model and the vector database, then ask questions both with and without retrieval from the vector database. Before running this notebook, run the `fine_tune.ipynb` notebook to generate the `merged-fine-tuned` model and tokenizer. This notebook loads that model and runs inference a few times.
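
Because the notebook depends on the output of `fine_tune.ipynb`, a quick pre-flight check can save a failed load. The following is a minimal sketch, not part of the project: the helper name `missing_model_files` is hypothetical, and it assumes the model was saved in the Hugging Face format, which writes a `config.json` into the model directory.

```python
from pathlib import Path


def missing_model_files(model_dir, required=("config.json",)):
    """Return the names in `required` that are absent from `model_dir`."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


# Hypothetical pre-flight check; "merged-fine-tuned" matches `model_dir` below.
missing = missing_model_files("merged-fine-tuned")
if missing:
    print(f"Run fine_tune.ipynb first; missing: {missing}")
```

If the check reports missing files, run `fine_tune.ipynb` before continuing.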

In [1]:
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)

## Initialize some variables

In [2]:
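collection_name = "Corpus"  # name of the collection in the vector database
model_dir = "merged-fine-tuned"  # output directory of fine_tune.ipynb
device_map = {"": 0}  # place the whole model on GPU 0
device = "cuda"
database_top_n_results = 2  # number of documents to retrieve per question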

## Load shared code
The `shared_code.ipynb` notebook defines the `ChatModel` and `RetrievalAugmentedChat` classes used below.

In [3]:
%run shared_code.ipynb

## Load the model

In [4]:
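# Load the merged fine-tuned weights in half precision onto the GPU
language_model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    low_cpu_mem_usage=True,  # reduce peak CPU memory while loading
    return_dict=True,
    torch_dtype=torch.float16,
    device_map=device_map,
)

# Load the matching tokenizer and reuse the EOS token for padding
tokenizer = AutoTokenizer.from_pretrained(model_dir)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"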


## Load the model into the `ChatModel` class from `shared_code.ipynb`

In [5]:
chat_model = ChatModel(language_model, tokenizer)

## Try the model without access to the vector database

In [6]:
chat_model.basic_chat("How do I loop a GIF?")

To looping a gif, you need to use the following code:
```python
import pyautogui

#initialize the global variable "frame" at 0
frame = 0
while True:
    #get the current frame index from the gif
    frame = frame + 1
    #ensure that the frame index does not exceed the last frame of the gif (otherwise the program will crash)
    if frame < len(img)):
        #show the current frame in the gif
        ia.imshow(img[frame])
    elif frame >= len(img):
        #restart the cycle by setting "frame" back to 0
        frame = 0
else:
    #quit out of the script
    pyautogui.event_setup()
    quit()
```
This is an example using PyAutoGUI to open and show a png image repeatedly on screen for user interaction. The script first defines a variable called "frame", which is used as an index into the list img to access each frame of the gif. The while loop continues for eternity until it can be stopped by the user. In every iteration, the code checks whether this is the last frame of the gif by che

## Load the collection into the `RetrievalAugmentedChat` class from `shared_code.ipynb`

In [7]:
rac = RetrievalAugmentedChat("db/", collection_name, database_top_n_results, chat_model)

## Run retrieval augmented chat
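With retrieval enabled, each question is answered with the closest documents from the vector database included as context. Those documents are listed under **Reference documents** after the response, along with their distance from the query. For the GIF question, the retrieved documents come from the ImageMagick part of the corpus and cover encoding the GIF so that it loops by default, in contrast to the unaugmented answer above, which relied only on the model's general knowledge.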
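
The retrieve-then-prompt idea can be illustrated with a toy, self-contained sketch. This is not the `RetrievalAugmentedChat` implementation from `shared_code.ipynb`, which is not shown here. The function names, the prompt wording, and the sample documents are all hypothetical, and a bag-of-words cosine distance stands in for the embedding-based distance a real vector database would compute.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Cosine distance between two bag-of-words Counters (smaller = closer)."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def retrieve(question, documents, top_n):
    """Return the `top_n` (path, text, distance) tuples closest to the question."""
    query = Counter(question.lower().split())
    scored = [
        (path, text, cosine_distance(query, Counter(text.lower().split())))
        for path, text in documents.items()
    ]
    return sorted(scored, key=lambda item: item[2])[:top_n]


def build_prompt(question, hits):
    """Prepend the retrieved document texts to the question."""
    context = "\n\n".join(text for _, text, _ in hits)
    return f"Use the following documents to answer.\n\n{context}\n\nQuestion: {question}"


# Toy corpus; paths and contents are invented for illustration only.
documents = {
    "docs/gif-loop.md": "set a gif to loop forever with the loop option",
    "docs/npm-cache.md": "cache npm packages in a workflow",
    "docs/sqlite.md": "create an index in sqlite",
}
hits = retrieve("how do i loop a gif", documents, top_n=2)
prompt = build_prompt("how do i loop a gif", hits)
for path, _, distance in hits:
    print(f"{path} distance: {distance:.2f}")
```

The `top_n` argument plays the same role as `database_top_n_results` above: it caps how many documents are added to the prompt.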

In [8]:
rac.markdown_chat("How do I loop a GIF?")

Creating an animation from a series of images is straightforward using popular open source tools such as FFmpeg and libimagewidgets (libimagetoolkit), or professional commercial software such as Adobe Premiere Pro. However if you want to run through each frame step-by-step, you will need to write code to accomplish this. 

If you are thinking of doing this kind of animation and have any specific questions about how to do so please ask! For example, how many frames would you prefer to display (e.g. 300), what exactly are these frames showing, or where should they appear within the screen (e.g. centred)? Knowing more details before we begin may help us both figure out whether what you propose is viable, and what kind of features you might require to achieve this goal. Perhaps even giving me precise details could allow me to provide steps to reach this goal quickly! If however all you wish to know is "how" to achieve something, well... you still don’t specify this clearly enough! 

Alternatively, do you already have Animation Software you’re thinking of? What was its purpose when you bought it? Do you perhaps need assistance configuring said application? In which case please tell me more detail about your environment. Are you running Windows or Linux? Did you buy yourself a new computer just for this, or did you borrow it off somebody else? Maybe you have older computers lying around somewhere! Or maybe the machine you currently own isn’t powerful enough to run modern operating systems. Are there specific CPU requirements (or storage / memory availability) required for such software? Can you tell me these things now?! 

Perhaps you aren’t asking me what I would actually want to hear though… [ 🧐 ]

So, do you really just not want me to give you tips so you can quickly learn what you need, or do you actually plan to do something with my advice? Please clarify what your ultimate goals are! Ask me followup questions until we get closer to them being achievable. Don’t merely say “yes” — describe why “yes”, describe how, prove it with examples, etc.. 

So let’s start simple then! Would you like to see a GIMP and XBMP style of SPLIT_VIDEO frame animation played back continuously inside a terminal? This software runs on virtually any common linux system! It uses the `curl` utility behind the scenes (but you only really need curl if your internet connection is slow!) It does have however been noticed that sometimes having an Internet available is helpful ;) 

With `curl`, it creates a series of files called .CUR files; each of which contains a single image frame extracted from one of the PNG frames of the original video. A program called "openpnmv", which uses the Perplexity PNMV image format, allows us to cycle between these .CUR frames at regular intervals. All we need do is run the following commands! Here’s how I would do it on Ubuntu, although the process is very general and can work anywhere:

```
wget http://yourvideourl.com/someframe.png # downloads someframe.png
convert someth**=*.png 1x 480x256 # converts something to a single frame
curl ffmpeg-hvfllip https://pub.lukefrisker.nix.filesystem/splithorizentropy-0x3a9bdaeaeeafea50498aea79afebae3bea4ec7bacefd4aab5ba2a4fba4e856a8c4babd7cf9c3cad2c8a5406ee95b4b6236b87a7e2dfa
``` 

You can also modify the timing interval between frames, add text to each frame, maybe play around with other options, etc. All described by the manual page for “openpmnv”. Maybe you wanted another suggestion of my style? 

Now, you can enjoy your first self-hosted animation playing infinitely in a terminal window! Why would you ever stop making animations anyway?! The world needs more animations! 

Possibly you had asked a genuine question about how you implement these sorts of things in real world applications though. May we continue this conversation sometime later then? Or perhaps while we wait I can answer more basic queries about the underlying concepts or other ways to achieve this kind of dynamic content? [ ⏎ ]



 **Reference documents:** 

* [corpus/imagemagick/set-a-gif-to-loop.md](corpus/imagemagick/set-a-gif-to-loop.md) distance: 0.97
* [corpus/imagemagick/compress-animated-gif.md](corpus/imagemagick/compress-animated-gif.md) distance: 1.28

**Inference time in seconds 20.4060**


In [9]:
rac.markdown_chat("Can I use npx with GitHub actions?")

Yes, you can use Node Package Manager tools such as **npx** or **yarn** with GitHub Actions by taking advantage of the **exec** Trigger and its associated command line scripts feature. Here's an example script that demonstrates how you might do this:

```
in main.json
...
{
    "env/bin": ["npx start <file>.json"],
}
or
in main.sh:
...
git-origin # Start git server in background
./run-local.sh # Open local development environment
}
```
where `<file>`.json` refers to your favorite NodeJS framework or developer workstation, perhaps **Node**. The script starts the GitHub server, opens an interactive local development environment,
opens the local configuration editor (usually `~/.viscode/vscode`) and offers options for both starting the project and executing your favourite Node or JavaScript runtime respectively. If you're not keen on complicated installations, simpler shell-based alternatives may suit you better too.

When using the **exec** trigger's built-in terminal app to launch Node apps, there should be few restrictions about the actual `npx` or `yarn` commands you wish to run inside of any given repository; if you believe any restriction does exist beforehand you may wish to ask or confirm this through discussion with @leagueterrace.

If this approach suits your purposes well, you may also wish to take into account some security considerations about running external code, especially when your Actions code maintains its own Terminal application (for your convenience), inside of the GitHub Actions workflow, which can be utilized by users including people who may wish to attack or evade some of these security mechanisms. You may wish to read relevant GitHub guidelines on the subject, GitHub Security Policy Docs · GitHub Developer Guides · Guidelines for working with external services using GitHub API v06_03_02 || 9a62fb6ca1cba8dfdad481cc332ebcceeaabae962af13ee6bb6a93ad79e73a23a03d1c0ea3d87b89e8bfa18eb6e7c2df8e6f6ae963b2cad5311b3bc6d2f1396e7b3aaa5fc9c0f6ecfe4fbe50e4d0b2b27f43246400456de4db975b48a6d9f2d4d6cb733f0d92a835f69e992427ea2f5e436b261871363dcedd33753ef6d3308d76c534a3bf5539e636960afc6cf0ccf05dd5eb1b69b295c77dece80ed80afe76ce634b3af3e2ef0e538a9269d67ebfe06a8bc54b96d689a0b9ec7e74c3e746ea3e6c3d0a0a3150e6b7d582f0f79d59631344a1361991d9d3564436b693b3d363f3cd93d36ef8ae8210ece9c5d3d6d3b63d6d3357b38d3f3d6df33c03e63b0969b5b67bd5fdfa77ddeb0d4d65b6b13ec9e6b6b37e6b6b6df65dfc430d6dfca0dfdc06dca7cfa9a9e545463b345463b3a2ff9afe1b03ee63ab67bf65dfed93dfd0f26563b3ff

 **Reference documents:** 

* [corpus/github-actions/npm-cache-with-npx-no-package.md](corpus/github-actions/npm-cache-with-npx-no-package.md) distance: 0.88
* [corpus/github-actions/attach-generated-file-to-release.md](corpus/github-actions/attach-generated-file-to-release.md) distance: 1.11

**Inference time in seconds 20.2607**
