<a href="https://colab.research.google.com/github/KegaPlayer/kegaiengine/blob/main/Koboldcpp_Colab_(Another_Edited_Edition).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Based on this one: https://colab.research.google.com/github/kalomaze/koboldcpp/blob/alternate_colab/Koboldcpp_Colab_(Improved_Edition).ipynb

Original Koboldcpp Colab: https://colab.research.google.com/github/LostRuins/koboldcpp/blob/concedo/colab.ipynb

---
*Edited koboldcpp Colab notebook for the KegAIEngine project.* Experiment to your heart's content, just as I may have done before this 'public' release.

*   Easy and fast installation, provided by the original Colab
*   Can display the model's name inside the UI when choosing from the dropdown
*   Lets you set up a Kobold Horde worker if you feel like sharing Google's GPU compute power with the world
*   A custom selection of models to choose from. The available models are curated at the maker's discretion or via suggestions
*   Ability to keep track of the models used on a private spreadsheet

## Credits
- Made with ~~spite~~ love by kalomaze ❤️ <sub>
 - (also here's the part where I (kalomaze) shill my [Patreon](https://www.patreon.com/kalomaze) if you care!)</sub>
- Edited and (sorta) updated with a bit of spite by KegaPlayer. No Patreon here, but I can be contacted at:
 * Discord: [kegaplayer](https://lookup.guru/183692020131299339)
 * Twitter: [KegaMPlayer](https://twitter.com/KegaMPlayer)
 * Mail: <potatokaigen@gmail.com>

### Koboldcpp was not originally made by the aforementioned users. This notebook just makes it easy to use on Colab, for research use and beyond. You can find the original GitHub repository here: https://github.com/LostRuins/koboldcpp

In [None]:
#CELL 1
#@title Keep this widget playing to prevent Colab from disconnecting you { display-mode: "form" }
#@markdown Press play on the audio player that will appear below:
%%html
<audio src="https://oobabooga.github.io/silence.m4a" controls>

In [None]:
import requests
import tarfile
import os
import time
import re
import threading
from google.oauth2.service_account import Credentials
import hashlib
import gspread

#@title # **Koboldcpp Colab (Another Edited Edition)**

#@markdown ---
#@markdown # Download Options

# Koboldcpp can now autorun itself without building, yay! Acknowledged by the original dev too!

Model_C = "Emerhyst-20B" #@param ["Kunoichi-DPO-v2-7B", "UNA-TheBeagle-7B-v1", "Fimbulvetr-11B-v2", "Athena-v4-13B", "Mythalion-13B", "Xwin-MLewd-13B-v0.2", "Emerhyst-20B", "Rose-Kimiko-20B"]
Model = Model_C
Quant_Method = "3_K_S" #@param ["2_K", "3_K_S", "3_K_M", "3_K_L", "4_0", "4_K_S", "4_K_M", "5_0", "5_K_S", "5_K_M", "6_K", "8_0"]

#@markdown #### OPTIONAL: Manual Model Link
Use_Manual_Model = False #@param {type:"boolean"}
Manual_Link = "" #@param {type:"string"}
#@markdown (Example of a manual link: https://huggingface.co/TheBloke/Emerhyst-20B-GGUF/resolve/main/emerhyst-20b.Q2_K.gguf)
#@markdown #### OPTIONAL: Use LoRA
Use_Lora = False #@param {type:"boolean"}
Lora_Link = "" #@param {type:"string"}
#@markdown (Same format applied as Manual Link)

#@markdown ---

#@markdown # Launch Options

Layers = "65" # @param ["35", "43", "65"] {allow-input: true}
#@markdown (35 are all layers for 7B models. 43 are all layers for 13B models. 65 are all layers for 20B models)
Context = "4096" #@param ["512","1024","2048","3072","4096","6144","8192","12288","16384"]{allow-input: true}
Smart_Context = False #@param {type:"boolean"}
#@markdown (The default (and recommended) setting is 4096. Use higher sizes with caution, lower sizes are NOT recommended. Use Smart Context at your own risk.)
ForceRebuild = False #@param {type:"boolean"}
#@markdown (Builds the Latest Kobold version. Can take ~7 minutes)

#@markdown ---

# @markdown # Setup Horde Worker

# @markdown (Available as an experimental gimmick. The maker cannot be held responsible if you end up banned from the service for using this)
Enable_Horde_Worker = False #@param {type:"boolean"}

#@markdown Name of the model that's going to be used. For Manual Mode only
Horde_Model_Name = "" #@param{type:"string"}

#@markdown Sets the maximum length (in tokens) of the replies served
Horde_Gen_Length = "" #@param{type:"string"}

#@markdown Necessary to make the worker available for use. [Get a key here](https://aihorde.net/register)
Horde_API_Key = "" #@param{type:"string"}

#@markdown Name of the worker. Leaving this empty defaults to KegAIEngine plus a random identifier
Horde_Worker_Name = "" #@param{type:"string"}

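# For Horde: values passed after the model name in --hordeconfig (gen length, then max context).
# A list keeps them in that order; a set could reorder them or drop a duplicate value.
horde_params = []
if Enable_Horde_Worker:
  horde_params.append(Horde_Gen_Length)
  horde_params.append(Context)

  if Use_Manual_Model:
    if Horde_Model_Name.strip():
      print(f"\nManual Model detected; Using the name provided by the user: {Horde_Model_Name}")
    else:
      print("\nManual Model detected but model name not provided, falling back to default name used\n")
      Horde_Model_Name = "concedo"
  else:
    Horde_Model_Name = Model_C

  if Horde_Worker_Name.strip():
    # Name detected
    print("\nWorker Name detected and now being used")
  else:
    # Name is empty
    print("\nWorker Name not provided. Using default settings")
    import random
    import string
    RndID = ''.join(random.choice(string.ascii_uppercase) for i in range(3))
    Horde_Worker_Name = "KegAIEngine-{}".format(RndID)

  # Lowercase aliases for the $hordemodelname, $hordeapikey and $hordeworkername
  # placeholders used by the launch commands further down.
  hordemodelname = Horde_Model_Name
  hordeapikey = Horde_API_Key
  hordeworkername = Horde_Worker_Name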

#@markdown ---

#@markdown # Analytics

def calculate_md5(file_path):
    hash_md5 = hashlib.md5()
    with open(file_path, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()

# Updates the spreadsheet with the stats of the model when ran
def update_llama_stats(DownloadedModel_path):
    # Initialize gspread
    scope = [
        'https://www.googleapis.com/auth/spreadsheets',
        'https://www.googleapis.com/auth/drive.file',
        'https://www.googleapis.com/auth/drive'
    ]

    os.makedirs("/content/koboldcpp/stats/", exist_ok=True)
    !wget -q https://cdn.discordapp.com/attachments/945486970883285045/1114717554481569802/peppy-generator-388800-07722f17a188.json -O /content/koboldcpp/stats/peppy-generator-388800-07722f17a188.json
    config_path = '/content/koboldcpp/stats/peppy-generator-388800-07722f17a188.json'

    if os.path.exists(config_path):
        # File exists, proceed with creation of creds and client
        creds = Credentials.from_service_account_file(config_path, scopes=scope)
        client = gspread.authorize(creds)
    else:
        # File does not exist: report it and skip the stats update
        print("Sheet credential file missing.")
        return  # Only the stats update is skipped; the rest of the notebook keeps running

    # Open the Google Sheet
    book = client.open("LlamaStats")
    sheet = book.get_worksheet(0)  # get the first sheet

    DownloadedModel_name = os.path.basename(DownloadedModel_path)
    DownloadedModel_hash = calculate_md5("/content/koboldcpp/model.gguf")

    colA_values = sheet.col_values(1)
    colB_values = sheet.col_values(2)
    colC_values = sheet.col_values(3)

    update_idx = -1

    for idx in range(len(colA_values)):
        if colA_values[idx] == DownloadedModel_name and idx < len(colB_values) and colB_values[idx] == DownloadedModel_hash:
            update_idx = idx + 1
            break

    if update_idx == -1:
        update_idx = len(colA_values) + 1

    current_count = colC_values[update_idx - 1] if update_idx <= len(colC_values) else ''
    if current_count.isdigit():
        new_count = str(int(current_count) + 1)
    else:
        new_count = '1'

    # Batch update to Google Sheets (gspread.Cell is the top-level export of the Cell class)
    cell_list = [
        gspread.Cell(update_idx, 1, DownloadedModel_name),
        gspread.Cell(update_idx, 2, DownloadedModel_hash),
        gspread.Cell(update_idx, 3, new_count),
        gspread.Cell(update_idx, 4, DownloadedModel_path)
    ]
    sheet.update_cells(cell_list)
    print("\nUpdating values...\n")

#@markdown ##### OPTIONAL: Submit Download stats (for measuring model usage/popularity)
Submit_Download_Stats = False #@param {type:"boolean"}

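# Note: links without a {} placeholder (the first three below) ignore Quant_Method and always fetch the quant named in the URL.
model_links = {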
    "Kunoichi-DPO-v2-7B": "https://huggingface.co/brittlewis12/Kunoichi-DPO-v2-7B-GGUF/resolve/main/kunoichi-dpo-v2-7b.Q2_K.gguf",
    "UNA-TheBeagle-7B-v1": "https://huggingface.co/TheBloke/UNA-TheBeagle-7B-v1-GGUF/resolve/main/una-thebeagle-7b-v1.Q2_K.gguf",
    "Fimbulvetr-11B-v2": "https://huggingface.co/Sao10K/Fimbulvetr-11B-v2-GGUF/resolve/main/Fimbulvetr-11B-v2-Test-14.q4_K_M.gguf",
    "Athena-v4-13B": "https://huggingface.co/TheBloke/Athena-v4-GGUF/resolve/main/athena-v4.Q{}.gguf",
    "Mythalion-13B": "https://huggingface.co/TheBloke/Mythalion-13B-GGUF/resolve/main/mythalion-13b.Q{}.gguf",
    "Xwin-MLewd-13B-v0.2": "https://huggingface.co/TheBloke/Xwin-MLewd-13B-v0.2-GGUF/resolve/main/xwin-mlewd-13b-v0.2.Q{}.gguf",
    "Emerhyst-20B": "https://huggingface.co/TheBloke/Emerhyst-20B-GGUF/resolve/main/emerhyst-20b.Q{}.gguf",
    "Rose-Kimiko-20B": "https://huggingface.co/TheBloke/Rose-Kimiko-20B-GGUF/resolve/main/rose-kimiko-20b.Q{}.gguf",
}

if not os.path.isfile("/opt/bin/nvidia-smi"):
  raise RuntimeError("⚠️Colab did not give you a GPU due to usage limits, this can take a few hours before they let you back in. Check out https://lite.koboldai.net for a free alternative (that does not provide an API link but can load KoboldAI saves and chat cards) or subscribe to Colab Pro for immediate access.⚠️")

if not os.path.exists('/content/koboldcpp/model.gguf'):
    # Use aria2c to download
    print("Installing/updating aria2c...")
    !apt-get install aria2 -y >/dev/null 2>&1
    print("Finished installing aria2c.")

    os.makedirs('/content/koboldcpp/', exist_ok=True)
    if Use_Lora:
      if Lora_Link.strip():
          # Lora is enabled & link provided
          print("\nLora detected, will apply to model.\n")
          lora = Lora_Link.replace('/blob/', '/resolve/')
      else:
          # Lora is enabled but no link
          print("\nWarning: Lora enabled, but no link, not applying.\n")
    if Use_Manual_Model:
        if Manual_Link.strip():
            # Manual Model is enabled, and a link is provided
            print(f"\nManual Model detected; will use {Manual_Link} instead of {Model}\n")
            Model = Manual_Link.replace('/blob/', '/resolve/')
        else:
            # Manual Model is enabled, but no link is provided
            print(f"\nWarning: Manual Model enabled, but no link was found. Falling back to {Model}\n")
            if Model in model_links:
                Model = model_links[Model].format(Quant_Method)
    else:
        # Model is in model_links and has a supported format
        Model = model_links[Model].format(Quant_Method)

    if not re.search(r'(\.gguf|\.ggml|\.bin|\.safetensors)$', Model):
        print("--------------------------\n5 SECOND WARNING: Manual link provided doesn't end with a supported format.\nAre you sure you provided a direct link?\n--------------------------\n")
        time.sleep(5)
    elif Model.startswith('https://huggingface.co/') and not re.search(r'^https://huggingface\.co/.+/.+/.+/.+/[^/]+\.[^/]+$', Model):
        print("--------------------------\n10 SECOND WARNING: The HuggingFace link provided is of the entire model repository.\nPlease find the direct link to the quant you want to use.\n--------------------------\n")
        time.sleep(10)

def download_model_and_lora():
    if not os.path.exists('/content/koboldcpp/model.gguf'):

        # Start timing
        start_time = time.time()

        print(f"\n--------------------------\nDownloading {os.path.basename(Model)}...")
        os.chdir("/content/koboldcpp")
        !aria2c -x 16 -s 16 -k 1M --allow-overwrite="true" --summary-interval=5 $Model -d /content/koboldcpp -o model.gguf 2>&1 | grep -Ev 'Redirecting'

        elapsed_time = time.time() - start_time # Calculate and display elapsed time
        print(f"\nDownload took {elapsed_time:.2f} seconds")

        if Use_Lora and Lora_Link.strip():
          # Download the LoRA from its own link, not the model link
          lora = Lora_Link.replace('/blob/', '/resolve/')
          print(f"\n--------------------------\nDownloading {os.path.basename(lora)}...")
          os.chdir("/content/koboldcpp")
          !aria2c -x 16 -s 16 -k 1M --allow-overwrite="true" --summary-interval=5 $lora -d /content/koboldcpp -o lora.bin 2>&1 | grep -Ev 'Redirecting'

        if os.path.exists('/content/koboldcpp/model.gguf') and os.path.getsize("/content/koboldcpp/model.gguf") == 0:
            os.remove("/content/koboldcpp/model.gguf")

        if os.path.exists('/content/koboldcpp/lora.bin') and os.path.getsize("/content/koboldcpp/lora.bin") == 0:
            os.remove("/content/koboldcpp/lora.bin")

        if Submit_Download_Stats and os.path.exists("/content/koboldcpp/model.gguf"):
            DownloadedModel = Model[:]  # DownloadedModel is used for download stats
            update_llama_stats(DownloadedModel)

        print("--------------------------\n")
    else:
         print("--------------------------\nModel already downloaded; skipping redownload.\nDisconnect and delete runtime if you need to restart the colab fully.\n--------------------------\n")

thread = threading.Thread(target=download_model_and_lora)

# Checking if you already have a Kobold install
if not os.path.exists("/content/koboldcpp/llama.cpp"):
    print("Downloading & extracting prebuilt Koboldcpp...")

    %cd /content
    !git clone https://github.com/LostRuins/koboldcpp
    %cd /content/koboldcpp
    kvers = !(cat koboldcpp.py | grep 'KcppVersion = ' | cut -d '"' -f2)
    kvers = kvers[0]
    if ForceRebuild:
      kvers = "force_rebuild"
    !echo Finding prebuilt binary for {kvers}
    !wget -O dlfile.tmp https://kcppcolab.concedo.workers.dev/?{kvers} && mv dlfile.tmp koboldcpp_cublas.so
    !test -f koboldcpp_cublas.so && echo Prebuilt Binary Exists || echo Prebuilt Binary Does Not Exist
    !test -f koboldcpp_cublas.so && echo Build Skipped || make koboldcpp_cublas LLAMA_CUBLAS=1 LLAMA_PORTABLE=1
    !cp koboldcpp_cublas.so koboldcpp_cublas.dat

    thread.start()

    print("\nKobold extraction to /content/koboldcpp/ completed!")
    print("--------------------------\n")
else:
    # In case koboldcpp already exists, just start the model download
    thread.start()

thread.join()

#Running localtunnel server
!npm install -g localtunnel
!nohup lt --port 5001 &

# Show nohup.out, which should contain the Localtunnel URL once it has launched
!cat nohup.out
print("--------------------------\n")

def hordeworkerenabled():
  if os.path.exists('/content/koboldcpp/model.gguf') and os.path.exists('/content/koboldcpp/lora.bin'):
    print("--------------------------\nAttempting to launch koboldcpp with the downloaded model and lora...")
    print("--------------------------\n")
    if Use_Manual_Model:
      if Smart_Context:
        !python koboldcpp.py model.gguf --lora lora.bin --smartcontext --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig $hordemodelname {' '.join(horde_params)} $hordeapikey $hordeworkername --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
      else:
        !python koboldcpp.py model.gguf --lora lora.bin --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig $hordemodelname {' '.join(horde_params)} $hordeapikey $hordeworkername --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
    else:
      if Smart_Context:
        !python koboldcpp.py model.gguf --lora lora.bin --smartcontext --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig $Model_C {' '.join(horde_params)} $hordeapikey $hordeworkername --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
      else:
        !python koboldcpp.py model.gguf --lora lora.bin --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig $Model_C {' '.join(horde_params)} $hordeapikey $hordeworkername --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
  elif os.path.exists('/content/koboldcpp/model.gguf'):
    print("--------------------------\nAttempting to launch koboldcpp with the downloaded model...")
    print("--------------------------\n")
    if Use_Manual_Model:
      if Smart_Context:
        !python koboldcpp.py model.gguf --smartcontext --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig $hordemodelname {' '.join(horde_params)} $hordeapikey $hordeworkername --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
      else:
        !python koboldcpp.py model.gguf --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig $hordemodelname {' '.join(horde_params)} $hordeapikey $hordeworkername --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
    else:
      if Smart_Context:
        !python koboldcpp.py model.gguf --smartcontext --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig $Model_C {' '.join(horde_params)} $hordeapikey $hordeworkername --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
      else:
        !python koboldcpp.py model.gguf --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig $Model_C {' '.join(horde_params)} $hordeapikey $hordeworkername --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
  else:
    print("Failed to download the GGUF model or LoRA. Please retry.")

def runkbcpp():
  if os.path.exists('/content/koboldcpp/model.gguf') and os.path.exists('/content/koboldcpp/lora.bin'):
    print("--------------------------\nAttempting to launch koboldcpp with the downloaded model and lora...")
    print("--------------------------\n")
    if Use_Manual_Model:
        if Smart_Context:
          !python koboldcpp.py model.gguf --lora lora.bin --smartcontext --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig concedo 1 1 --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
        else:
          !python koboldcpp.py model.gguf --lora lora.bin --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig concedo 1 1 --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
    else:
        if Smart_Context:
          !python koboldcpp.py model.gguf --lora lora.bin --smartcontext --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig $Model_C 1 1 --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
        else:
          !python koboldcpp.py model.gguf --lora lora.bin --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig $Model_C 1 1 --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
  elif os.path.exists('/content/koboldcpp/model.gguf'):
    print("--------------------------\nAttempting to launch koboldcpp with the downloaded model...")
    print("--------------------------\n")
    if Use_Manual_Model:
      if Smart_Context:
        !python koboldcpp.py model.gguf --smartcontext --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig concedo 1 1 --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
      else:
        !python koboldcpp.py model.gguf --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig concedo 1 1 --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
    else:
      if Smart_Context:
        !python koboldcpp.py model.gguf --smartcontext --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig $Model_C 1 1 --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
      else:
        !python koboldcpp.py model.gguf --usecublas 0 mmq --multiuser --context $Context --ropeconfig 1.0 10000 --gpulayers $Layers --hordeconfig $Model_C 1 1 --remotetunnel --onready "curl -s ipecho.net/plain | xargs -I % echo Scroll down and locate your Localtunnel link, For Localtunnel use the IP provided to verify: % && cat nohup.out"
  else:
    print("Failed to download the GGUF model or LoRA. Please retry.")

if Enable_Horde_Worker:
  hordeworkerenabled()
else:
  runkbcpp()

# Quick How-To Guide

<sub>Note that some of the images may be outdated, but the overall function of the Colab is still the same!</sub>

---
## Step 1. Keeping Google Colab Running
---

Google Colab tends to time out after a period of inactivity. To keep your session from timing out abruptly, you can use the following widget.

### Starting the Audio Player Widget:

> <img src="https://cdn.discordapp.com/attachments/945486970883285045/1150363694191104112/image.png" width="50%"/>

### How the Widget Looks When Playing:

> <img src="https://cdn.discordapp.com/attachments/945486970883285045/1150363653997076540/image.png" width="50%"/>

Follow the visual cues in the images to start the widget and ensure that the notebook remains active.

---
## Step 2. Decide your Model
---

Pick a model and a quantization from the dropdowns, then run the cell as you did earlier.

### Select your Model and Quantization:

> <img src="https://cdn.discordapp.com/attachments/945486970883285045/1150370141557764106/image.png" width="40%"/>
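The two dropdowns combine into a single download link: the notebook stores one link template per model and fills in the chosen quantization with `str.format`. The sketch below illustrates that step only. `MODEL_LINKS` is a two-entry excerpt of the notebook's `model_links` table, and `resolve_model_url` is a hypothetical helper name, not something the notebook defines.

```python
# Two entries copied from the notebook's model_links table; the {} is the quant slot.
MODEL_LINKS = {
    "Emerhyst-20B": "https://huggingface.co/TheBloke/Emerhyst-20B-GGUF/resolve/main/emerhyst-20b.Q{}.gguf",
    "Mythalion-13B": "https://huggingface.co/TheBloke/Mythalion-13B-GGUF/resolve/main/mythalion-13b.Q{}.gguf",
}


def resolve_model_url(model: str, quant: str) -> str:
    """Fill the quantization (e.g. "3_K_S") into the model's link template.

    Hypothetical helper: the notebook does this inline with
    model_links[Model].format(Quant_Method).
    """
    template = MODEL_LINKS[model]  # raises KeyError for a model not in the table
    return template.format(quant)


if __name__ == "__main__":
    print(resolve_model_url("Emerhyst-20B", "3_K_S"))
```

A template without a `{}` placeholder is returned unchanged by `format`, so such an entry always downloads the quant written in its link, whatever the dropdown says.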

Alternatively, you can specify a model manually.

### Manual Model Option:

> <img src="https://media.discordapp.net/attachments/945486970883285045/1150370631242764370/image.png" width="75%"/>
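A manual link has to point at the model file itself, not at a repository page. The notebook rewrites `/blob/` to `/resolve/` in the link and warns when the link does not end in a supported extension. A minimal sketch of those two checks follows; the function names are hypothetical, and the `strip()` call is an addition of this sketch rather than something the notebook does.

```python
import re

# Same file extensions the notebook accepts for a direct model link.
SUPPORTED_SUFFIX = re.compile(r"\.(gguf|ggml|bin|safetensors)$")


def normalize_manual_link(link: str) -> str:
    """Turn a Hugging Face file-page link (/blob/) into a direct download link (/resolve/)."""
    return link.strip().replace("/blob/", "/resolve/")


def looks_like_direct_file(link: str) -> bool:
    """Return True when the link ends with one of the supported model extensions."""
    return bool(SUPPORTED_SUFFIX.search(link))


if __name__ == "__main__":
    page_link = "https://huggingface.co/TheBloke/Emerhyst-20B-GGUF/blob/main/emerhyst-20b.Q2_K.gguf"
    direct_link = normalize_manual_link(page_link)
    print(direct_link)
    print(looks_like_direct_file(direct_link))
    # A link to the whole repository fails the extension check.
    print(looks_like_direct_file("https://huggingface.co/TheBloke/Emerhyst-20B-GGUF"))
```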

13B models at the 5_K_M quant should work with 4k (maybe 3k?) context on Colab, since the T4 GPU has ~16GB of VRAM. You can now start the cell; after 1-3 minutes it should end by printing your API link, which you can connect to in [SillyTavern](https://docs.sillytavern.app/installation/windows/):

> <img src="https://cdn.discordapp.com/attachments/945486970883285045/1150464795032694875/image.png" width="80%"/>
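To get a feel for why a 5_K_M 13B model with 4k context fits in ~16GB, here is a rough back-of-the-envelope sketch. It only adds the quantized weights to an fp16 key/value cache and ignores compute buffers and other overhead, so treat the result as a lower bound. The 5.7 bits per weight used for 5_K_M is an approximate assumption, not a value taken from the notebook; the layer count (40) and hidden size (5120) are those of a Llama 2 13B model.

```python
def estimate_vram_gib(
    n_params: float,
    bits_per_weight: float,
    n_layers: int,
    hidden_size: int,
    context_tokens: int,
    kv_bytes_per_value: int = 2,  # fp16 cache entries
) -> float:
    """Rough VRAM estimate in GiB: quantized weights plus key/value cache.

    Assumes every layer is offloaded to the GPU and that the model uses
    standard multi-head attention (one key and one value vector of size
    hidden_size per layer per token).
    """
    weights_bytes = n_params * bits_per_weight / 8
    kv_cache_bytes = 2 * n_layers * hidden_size * kv_bytes_per_value * context_tokens
    return (weights_bytes + kv_cache_bytes) / 2**30


if __name__ == "__main__":
    # Llama 2 13B shape; 5.7 bits/weight is an assumed average for a 5_K_M quant.
    for context in (3072, 4096, 8192):
        estimate = estimate_vram_gib(13e9, 5.7, 40, 5120, context)
        print(f"context {context}: about {estimate:.1f} GiB before overhead")
```

Doubling the context doubles only the cache term, which is why larger context sizes are the first thing to reduce when a model does not fit.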

---
# And there you have it!
### MythoMax (or any 7b / 13b Llama 2 model) in under 2 minutes.
#### (depending on whether or not huggingface downloads are experiencing high traffic)
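
If you would rather talk to the API link from your own script than from SillyTavern, the sketch below sends one prompt to the KoboldAI-style `/api/v1/generate` endpoint that Koboldcpp serves. The base URL is a placeholder to replace with the link printed by the cell, the helper names are hypothetical, and only the `prompt` and `max_length` fields are set.

```python
import json
import urllib.request


def build_generate_request(base_url: str, prompt: str, max_length: int = 80) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /api/v1/generate endpoint."""
    payload = {"prompt": prompt, "max_length": max_length}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, prompt: str, max_length: int = 80, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated continuation text."""
    request = build_generate_request(base_url, prompt, max_length)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # The reply is shaped like {"results": [{"text": "..."}]}.
    return body["results"][0]["text"]


if __name__ == "__main__":
    # Placeholder URL: replace it with the API link printed by the notebook.
    api_link = "https://your-api-link.example"
    request = build_generate_request(api_link, "Once upon a time", max_length=64)
    print(request.full_url)
    # Uncomment once api_link points at a running instance:
    # print(generate(api_link, "Once upon a time", max_length=64))
```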

---
