<a href="https://colab.research.google.com/github/z3r0beta/chatbot/blob/main/AI_Kobold.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Making the Most of your Colab Subscription



## Faster GPUs

Users who have purchased one of Colab's paid plans have access to premium GPUs. You can upgrade your notebook's GPU settings in `Runtime > Change runtime type` in the menu to enable Premium accelerator. Subject to availability, selecting a premium GPU may grant you access to an L4 or A100 Nvidia GPU.

The free of charge version of Colab grants access to Nvidia's T4 GPUs subject to quota restrictions and availability.

You can see what GPU you've been assigned at any time by executing the following cell. If the result is "Not connected to a GPU", go to `Runtime > Change runtime type` in the menu, enable a GPU accelerator, and then re-execute the code cell.


In [None]:
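gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
# 'failed' covers a driver error; 'not found' covers a missing nvidia-smi binary.
if 'failed' in gpu_info or 'not found' in gpu_info:
  print('Not connected to a GPU')
else:
  print(gpu_info)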

/bin/bash: line 1: nvidia-smi: command not found


In order to use a GPU with your notebook, select the `Runtime > Change runtime type` menu, and then set the hardware accelerator dropdown to GPU.
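
Matching on the text of the command output is brittle: a missing `nvidia-smi` binary produces "command not found", as in the output above. The following is a minimal sketch of a check that does not depend on the wording, using only the standard library. The function name `gpu_status` and its `command` parameter are hypothetical, and treating a non-zero exit code as "no GPU" is an assumption about how the tool reports failure.

```python
import shutil
import subprocess


def gpu_status(command="nvidia-smi"):
    """Return the tool's report, or a short message if no GPU is reachable."""
    # shutil.which returns None when the executable is not on PATH,
    # which corresponds to the shell's "command not found" case.
    if shutil.which(command) is None:
        return "Not connected to a GPU"
    result = subprocess.run([command], capture_output=True, text=True)
    # Assumption: a non-zero exit code means the tool could not query a GPU.
    if result.returncode != 0:
        return "Not connected to a GPU"
    return result.stdout


print(gpu_status())
```

Passing the command name as a parameter keeps the function easy to exercise on machines without any Nvidia tooling installed.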

## More memory

Users who have purchased one of Colab's paid plans have access to high-memory VMs when they are available.



You can see how much memory you have available at any time by running the following code cell. If the result is "Not using a high-RAM runtime", you can enable a high-RAM runtime via `Runtime > Change runtime type` in the menu: select High-RAM in the Runtime shape dropdown, then re-execute the code cell.


In [None]:
from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

if ram_gb < 20:
  print('Not using a high-RAM runtime')
else:
  print('You are using a high-RAM runtime!')

Your runtime has 8.0 gigabytes of available RAM

Not using a high-RAM runtime


## Longer runtimes

All Colab runtimes are reset after some period of time (which is faster if the runtime isn't executing code). Colab Pro and Pro+ users have access to longer runtimes than those who use Colab free of charge.

## Background execution

Colab Pro+ users have access to background execution, where notebooks will continue executing even after you've closed a browser tab. This is always enabled in Pro+ runtimes as long as you have compute units available.



## Relaxing resource limits in Colab Pro

Your resources are not unlimited in Colab. To make the most of Colab, avoid using resources when you don't need them. For example, only use a GPU when required and close Colab tabs when finished.



If you encounter limitations, you can relax those limitations by purchasing more compute units via Pay As You Go. Anyone can purchase compute units via [Pay As You Go](https://colab.research.google.com/signup); no subscription is required.

## Send us feedback!

If you have any feedback for us, please let us know. The best way to send feedback is by using the Help > 'Send feedback...' menu. If you encounter usage limits in Colab Pro, consider subscribing to Pro+.

If you encounter errors or other issues with billing (payments) for Colab Pro, Pro+, or Pay As You Go, please email [colab-billing@google.com](mailto:colab-billing@google.com).

## More Resources

### Working with Notebooks in Colab
- [Overview of Colab](/notebooks/basic_features_overview.ipynb)
- [Guide to Markdown](/notebooks/markdown_guide.ipynb)
- [Importing libraries and installing dependencies](/notebooks/snippets/importing_libraries.ipynb)
- [Saving and loading notebooks in GitHub](https://colab.research.google.com/github/googlecolab/colabtools/blob/main/notebooks/colab-github-demo.ipynb)
- [Interactive forms](/notebooks/forms.ipynb)
- [Interactive widgets](/notebooks/widgets.ipynb)

<a name="working-with-data"></a>
### Working with Data
- [Loading data: Drive, Sheets, and Google Cloud Storage](/notebooks/io.ipynb)
- [Charts: visualizing data](/notebooks/charts.ipynb)
- [Getting started with BigQuery](/notebooks/bigquery.ipynb)

### Machine Learning Crash Course
These are a few of the notebooks from Google's online Machine Learning course. See the [full course website](https://developers.google.com/machine-learning/crash-course/) for more.
- [Intro to Pandas DataFrame](https://colab.research.google.com/github/google/eng-edu/blob/main/ml/cc/exercises/pandas_dataframe_ultraquick_tutorial.ipynb)
- [Linear regression with tf.keras using synthetic data](https://colab.research.google.com/github/google/eng-edu/blob/main/ml/cc/exercises/linear_regression_with_synthetic_data.ipynb)


<a name="using-accelerated-hardware"></a>
### Using Accelerated Hardware
- [TensorFlow with GPUs](/notebooks/gpu.ipynb)
- [TensorFlow with TPUs](/notebooks/tpu.ipynb)

<a name="machine-learning-examples"></a>

## Machine Learning Examples

To see end-to-end examples of the interactive machine learning analyses that Colab makes possible, check out these tutorials using models from [TensorFlow Hub](https://tfhub.dev).

A few featured examples:

- [Retraining an Image Classifier](https://tensorflow.org/hub/tutorials/tf2_image_retraining): Build a Keras model on top of a pre-trained image classifier to distinguish flowers.
- [Text Classification](https://tensorflow.org/hub/tutorials/tf2_text_classification): Classify IMDB movie reviews as either *positive* or *negative*.
- [Style Transfer](https://tensorflow.org/hub/tutorials/tf2_arbitrary_image_stylization): Use deep learning to transfer style between images.
- [Multilingual Universal Sentence Encoder Q&A](https://tensorflow.org/hub/tutorials/retrieval_with_tf_hub_universal_encoder_qa): Use a machine learning model to answer questions from the SQuAD dataset.
- [Video Interpolation](https://tensorflow.org/hub/tutorials/tweening_conv3d): Predict what happened in a video between the first and the last frame.


In [None]:
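# Clone only when the directory is missing. A `cd` on a `!` line affects only a throwaway subshell; use `%cd` to change directory.
!test -d koboldcpp || git clone https://github.com/LostRuins/koboldcpp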

fatal: destination path 'koboldcpp' already exists and is not an empty directory.
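
The "already exists" error above appears whenever a clone cell is re-run. As a rough sketch, the clone can be made idempotent from Python by checking the destination first. The helper name `clone_if_missing` and its injectable `run` parameter are hypothetical and are not part of KoboldCpp or Colab.

```python
import os
import subprocess


def clone_if_missing(url, dest, run=subprocess.run):
    """Clone `url` into `dest` unless `dest` already exists and is non-empty."""
    if os.path.isdir(dest) and os.listdir(dest):
        print(f"{dest} already exists, skipping clone")
        return False
    # check=True raises CalledProcessError if git exits with an error.
    run(["git", "clone", url, dest], check=True)
    return True
```

A call such as `clone_if_missing("https://github.com/LostRuins/koboldcpp", "koboldcpp")` can then be repeated safely. Note that a `cd` placed on a `!` line runs in a temporary subshell and does not change the notebook's working directory; the `%cd` magic does.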


In [None]:
# Install the Python dependencies listed by the repository.
!pip install -r koboldcpp/requirements.txt

Collecting gguf>=0.1.0 (from -r koboldcpp/requirements.txt (line 4))
  Downloading gguf-0.9.1-py3-none-any.whl.metadata (3.3 kB)
Collecting customtkinter>=5.1.0 (from -r koboldcpp/requirements.txt (line 5))
  Downloading customtkinter-5.2.2-py3-none-any.whl.metadata (677 bytes)
Collecting darkdetect (from customtkinter>=5.1.0->-r koboldcpp/requirements.txt (line 5))
  Downloading darkdetect-0.8.0-py3-none-any.whl.metadata (3.6 kB)
Downloading gguf-0.9.1-py3-none-any.whl (49 kB)
Downloading customtkinter-5.2.2-py3-none-any.whl (296 kB)
Downloading darkdetect-0.8.0-py3-none-any.whl (9.0 kB)
Installing collected packages: gguf, darkdetect, customtkinter
Successfully installed customtkinter-5.2.2 darkdetect-0.8.0 gguf-0.9.1

Usage:   
  pip install [options] <requirement specifier> [package-index-options] ...
  pip install [options] -r <requirements file> [package-index-options] ...
  pip install [options] [-e] <vcs project url> ...
  pip install [options] [-e] <local project path> ...
  p

In [None]:
!git clone https://github.com/YellowRoseCx/koboldcpp-rocm

Cloning into 'koboldcpp-rocm'...
remote: Enumerating objects: 33875, done.
remote: Counting objects: 100% (7755/7755), done.
remote: Compressing objects: 100% (229/229), done.
remote: Total 33875 (delta 7660), reused 7526 (delta 7526), pack-reused 26120 (from 1)
Receiving objects: 100% (33875/33875), 114.88 MiB | 4.54 MiB/s, done.
Resolving deltas: 100% (24444/24444), done.


In [None]:
cd koboldcpp-rocm

/home/j0hnny/koboldcpp-rocm


In [None]:
!pip install -r requirements.txt

/bin/bash: line 1: !pip: command not found


In [None]:
!ls

 aiXcoder			        libyui-master
 anaconda3			        llama.cpp
 Android			        main_katie-on-the-bench_spec_v2.png
 AndroidStudioProjects		        main_spec_v2.json
 Applications			        Music
 AWS				        n.js
 aws-replication-installer-init         node_modules
 build				        obj
 chatbot-ui			        packages-microsoft-prod.deb
 Desktop			        Pictures
 dillo-3.1.1			        Program.cs
 Documents			        Projects
 Downloads			        Public
 git-lfs-3.5.1			        Qt
 google-cloud-cli-linux-x86_64.tar.gz   Qt-Advanced-Docking-System
 google-cloud-sdk		        risu-ai_123.0.0_amd64.AppImage
 huggingface			        settings.json
 IDE				        share
 j0hnny.csproj			        Templates
 kimitzu			        Videos
 KoboldAI			        viper-browser
 koboldcpp			       '~WeB~RooT~'
 koboldcpp-rocm			        websites


In [None]:
!make LLAMA_OPENBLAS=1 LLAMA_CLBLAST=1 LLAMA_HIPBLAS=1 -j4

I llama.cpp build info: 
I UNAME_S:  Linux
I UNAME_P:  unknown
I UNAME_M:  x86_64
I CFLAGS:   -I. -Iggml/include -Iggml/src -Iinclude -Isrc -I./include -I./include/CL -I./otherarch -I./otherarch/tools -I./otherarch/sdcpp -I./otherarch/sdcpp/thirdparty -I./include/vulkan -O3 -fno-finite-math-only -fmath-errno -DNDEBUG -std=c11   -fPIC -DLOG_DISABLE_LOGS -D_GNU_SOURCE -DGGML_USE_LLAMAFILE -pthread -s -Wno-deprecated -Wno-deprecated-declarations -pthread -march=native -mtune=native
I CXXFLAGS: -I. -Iggml/include -Iggml/src -Iinclude -Isrc -I./common -I./include -I./include/CL -I./otherarch -I./otherarch/tools -I./otherarch/sdcpp -I./otherarch/sdcpp/thirdparty -I./include/vulkan -O3 -fno-finite-math-only -fmath-errno -DNDEBUG -std=c++11 -fPIC -DLOG_DISABLE_LOGS -D_GNU_SOURCE -DGGML_USE_LLAMAFILE -pthread -s -Wno-multichar -Wno-write-strings -Wno-deprecated -Wno-deprecated-declarations -pthread
I LDFLAGS:  
I CC:       cc (GCC) 14.1.1 20240701 (Red Hat 14.1.1-7)
I CXX:      g++ (GCC

In [None]:
!python3 koboldcpp.py -h

***
Welcome to KoboldCpp - Version 1.72.yr0-ROCm
usage: koboldcpp.py [-h] [--model [filename]] [--port [portnumber]]
                    [--host [ipaddr]] [--launch] [--config [filename]]
                    [--threads [threads]] [--usecublas [[lowvram|normal]
                    [main GPU ID] [mmq] [rowsplit] ...] | --usevulkan
                    [[Device ID] ...] | --useclblast {0,1,2,3,4,5,6,7,8}
                    {0,1,2,3,4,5,6,7,8} | --noblas]
                    [--contextsize [256,512,1024,2048,3072,4096,6144,8192,12288,16384,24576,32768,49152,65536,98304,131072]]
                    [--gpulayers [[GPU layers]]] [--tensor_split [Ratios]
                    [[Ratios] ...]] [--checkforupdates]
                    [--ropeconfig [rope-freq-scale] [[rope-freq-base] ...]]
                    [--blasbatchsize {-1,32,64,128,256,512,1024,2048}]
                    [--blasthreads [threads]] [--lora [lora_filename]
                    [[lora_base] ...]] [--noshift] [--nommap] [--usemlo

In [None]:
cd ~/content/j0hnny/

[Errno 2] No such file or directory: '/home/j0hnny/content/j0hnny/'
/home/j0hnny/koboldcpp-rocm


In [None]:
!ls

build-info.h		       kcpp_docs.embd
class.py		       kcpp_sdui.embd
clblast.dll		       klite.embd
CLINFO_LICENSE		       koboldcpp.py
CMakeLists.txt		       koboldcpp.sh
colab.ipynb		       lib
common			       libopenblas.dll
common.o		       LICENSE.md
convert_hf_to_gguf.py	       llavaclip_cublas.o
convert_hf_to_gguf_update.py   llavaclip_default.o
convert_llama_ggml_to_gguf.py  llava.o
convert_lora_to_gguf.py        Makefile
cudart64_110.dll	       make_pyinstaller.bat
cudart64_12.dll		       make_pyinstaller_cuda12.bat
easy_KCPP-ROCm_install.sh      make_pyinstaller_cuda.bat
environment.yaml	       make_pyinstaller_cuda_oldcpu.bat
examples		       make_pyinstaller_exe_rocm_only.bat
expose.cpp		       make_pyinstaller.sh
expose.h		       make_pyinst_rocm_hybrid_henk_yellow.bat
expose.o		       media
ggml			       MIT_LICENSE_GGML_LLAMACPP_ONLY
ggml-aarch64.o		       model_adapter.cpp
ggml-alloc.o		       model_adapter.h
ggml-backend_cublas.o	       msvcp140_c

In [None]:
cd /content/

[Errno 2] No such file or directory: '/content/'
/home/j0hnny/koboldcpp-rocm
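
The `cd` errors above are what happens when a notebook written for hosted Colab, where `/content` is the usual working directory, runs on a local machine instead. A small sketch of choosing the directory at run time follows. The helper `pick_workdir` is a hypothetical name, and falling back to the home directory is only one reasonable choice.

```python
import os


def pick_workdir(colab_dir="/content"):
    """Return `colab_dir` if it exists, otherwise the user's home directory."""
    if os.path.isdir(colab_dir):
        return colab_dir
    return os.path.expanduser("~")


workdir = pick_workdir()
print(f"Working directory: {workdir}")
```

Follow it with `os.chdir(workdir)` (or `%cd $workdir` in a notebook cell) to actually switch directories.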


In [None]:
#@title <b>v-- Enter your model below and then click this to start Koboldcpp</b>

Model = "https://huggingface.co/KoboldAI/LLaMA2-13B-Tiefighter-GGUF/resolve/main/LLaMA2-13B-Tiefighter.Q4_K_S.gguf" #@param ["https://huggingface.co/KoboldAI/LLaMA2-13B-Tiefighter-GGUF/resolve/main/LLaMA2-13B-Tiefighter.Q4_K_S.gguf","https://huggingface.co/KoboldAI/LLaMA2-13B-Estopia-GGUF/resolve/main/LLaMA2-13B-Estopia.Q4_K_S.gguf","https://huggingface.co/mradermacher/Fimbulvetr-11B-v2-GGUF/resolve/main/Fimbulvetr-11B-v2.Q4_K_S.gguf","https://huggingface.co/TheBloke/MythoMax-L2-13B-GGUF/resolve/main/mythomax-l2-13b.Q4_K_M.gguf","https://huggingface.co/TheBloke/ReMM-SLERP-L2-13B-GGUF/resolve/main/remm-slerp-l2-13b.Q4_K_M.gguf","https://huggingface.co/TheBloke/Xwin-LM-13B-v0.2-GGUF/resolve/main/xwin-lm-13b-v0.2.Q4_K_M.gguf","https://huggingface.co/mradermacher/mini-magnum-12b-v1.1-GGUF/resolve/main/mini-magnum-12b-v1.1.Q4_K_S.gguf","https://huggingface.co/TheBloke/Stheno-L2-13B-GGUF/resolve/main/stheno-l2-13b.Q4_K_M.gguf","https://huggingface.co/TheBloke/MythoMax-L2-Kimiko-v2-13B-GGUF/resolve/main/mythomax-l2-kimiko-v2-13b.Q4_K_M.gguf","https://huggingface.co/TheBloke/MistRP-Airoboros-7B-GGUF/resolve/main/mistrp-airoboros-7b.Q4_K_S.gguf","https://huggingface.co/TheBloke/airoboros-mistral2.2-7B-GGUF/resolve/main/airoboros-mistral2.2-7b.Q4_K_S.gguf","https://huggingface.co/concedo/KobbleTinyV2-1.1B-GGUF/resolve/main/KobbleTiny-Q4_K.gguf","https://huggingface.co/grimjim/kukulemon-7B-GGUF/resolve/main/kukulemon-7B.Q8_0.gguf","https://huggingface.co/mradermacher/LemonKunoichiWizardV3-GGUF/resolve/main/LemonKunoichiWizardV3.Q4_K_M.gguf","https://huggingface.co/Lewdiculous/Kunoichi-DPO-v2-7B-GGUF-Imatrix/resolve/main/Kunoichi-DPO-v2-7B-Q4_K_M-imatrix.gguf","https://huggingface.co/mradermacher/L3-8B-Stheno-v3.2-i1-GGUF/resolve/main/L3-8B-Stheno-v3.2.i1-Q4_K_M.gguf","https://huggingface.co/Lewdiculous/Llama-3-Lumimaid-8B-v0.1-OAS-GGUF-IQ-Imatrix/resolve/main/v2-Llama-3-Lumimaid-8B-v0.1-OAS-Q4_K_M-imat.gguf","https://huggingface.co/bartowski/NeuralDaredevil-8B-abliterated-GGUF/resolve/main/NeuralDaredevil-8B-abliterated-Q4_K_M.gguf","https://huggingface.co/bartowski/L3-8B-Lunaris-v1-GGUF/resolve/main/L3-8B-Lunaris-v1-Q4_K_M.gguf","https://huggingface.co/mradermacher/L3-Umbral-Mind-RP-v2.0-8B-GGUF/resolve/main/L3-Umbral-Mind-RP-v2.0-8B.Q4_K_M.gguf"]{allow-input: true}
Layers = 99 #@param [99]{allow-input: true}
ContextSize = 4096 #@param [4096] {allow-input: true}
#@markdown <hr>
LoadLLaVAmmproj = True #@param {type:"boolean"}
LLaVAmmproj = "https://huggingface.co/koboldcpp/mmproj/resolve/main/llama-13b-mmproj-v1.5.Q4_1.gguf" #@param ["https://huggingface.co/koboldcpp/mmproj/resolve/main/llama-13b-mmproj-v1.5.Q4_1.gguf","https://huggingface.co/koboldcpp/mmproj/resolve/main/mistral-7b-mmproj-v1.5-Q4_1.gguf","https://huggingface.co/koboldcpp/mmproj/resolve/main/llama-7b-mmproj-v1.5-Q4_0.gguf","https://huggingface.co/koboldcpp/mmproj/resolve/main/LLaMA3-8B_mmproj-Q4_1.gguf"]{allow-input: true}
VCommand = ""
#@markdown <hr>
LoadImgModel = True #@param {type:"boolean"}
ImgModel = "https://huggingface.co/koboldcpp/imgmodel/resolve/main/imgmodel_ftuned_q4_0.gguf" #@param ["https://huggingface.co/koboldcpp/imgmodel/resolve/main/imgmodel_ftuned_q4_0.gguf"]{allow-input: true}
SCommand = ""
#@markdown <hr>
LoadSpeechModel = True #@param {type:"boolean"}
SpeechModel = "https://huggingface.co/koboldcpp/whisper/resolve/main/whisper-base.en-q5_1.bin" #@param ["https://huggingface.co/koboldcpp/whisper/resolve/main/whisper-base.en-q5_1.bin"]{allow-input: true}
WCommand = ""

import os
if not os.path.isfile("/opt/bin/nvidia-smi"):
  raise RuntimeError("⚠️Colab did not give you a GPU due to usage limits, this can take a few hours before they let you back in. Check out https://lite.koboldai.net for a free alternative (that does not provide an API link but can load KoboldAI saves and chat cards) or subscribe to Colab Pro for immediate access.⚠️")

%cd /content
if LLaVAmmproj and LoadLLaVAmmproj:
  VCommand = "--mmproj vmodel.gguf"
else:
  VCommand = ""
if ImgModel and LoadImgModel:
  SCommand = "--sdmodel imodel.gguf --sdthreads 4 --sdquant --sdclamped"
else:
  SCommand = ""
if SpeechModel and LoadSpeechModel:
  WCommand = "--whispermodel wmodel.bin"
else:
  WCommand = ""
!echo Downloading KoboldCpp, please wait...
!wget -O dlfile.tmp https://kcpplinux.concedo.workers.dev && mv dlfile.tmp koboldcpp_linux
!test -f koboldcpp_linux && echo Download Successful || echo Download Failed
!chmod +x ./koboldcpp_linux
!apt update
!apt install aria2 -y
# simple fix for a common URL mistake
if "https://huggingface.co/" in Model and "/blob/main/" in Model:
    Model = Model.replace("/blob/main/", "/resolve/main/")
!aria2c -x 10 -o model.gguf --summary-interval=5 --download-result=default --allow-overwrite=true --file-allocation=none $Model
if VCommand:
  !aria2c -x 10 -o vmodel.gguf --summary-interval=5 --download-result=default --allow-overwrite=true --file-allocation=none $LLaVAmmproj
if SCommand:
  !aria2c -x 10 -o imodel.gguf --summary-interval=5 --download-result=default --allow-overwrite=true --file-allocation=none $ImgModel
if WCommand:
  !aria2c -x 10 -o wmodel.bin --summary-interval=5 --download-result=default --allow-overwrite=true --file-allocation=none $SpeechModel
!./koboldcpp_linux model.gguf --usecublas 0 mmq --multiuser --gpulayers $Layers --contextsize $ContextSize --quiet --remotetunnel $VCommand $SCommand $WCommand


RuntimeError: ⚠️Colab did not give you a GPU due to usage limits, this can take a few hours before they let you back in. Check out https://lite.koboldai.net for a free alternative (that does not provide an API link but can load KoboldAI saves and chat cards) or subscribe to Colab Pro for immediate access.⚠️
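
Two pieces of the launch cell's logic are easy to check in isolation: the Hugging Face URL fix and the assembly of optional flags. The sketch below restates them as small functions. The names `normalize_hf_url` and `optional_flags` are hypothetical; the flag strings and local file names are the ones the cell itself uses.

```python
def normalize_hf_url(url):
    """Turn a Hugging Face "blob" page URL into a direct "resolve" download URL."""
    if "https://huggingface.co/" in url and "/blob/main/" in url:
        return url.replace("/blob/main/", "/resolve/main/")
    return url


def optional_flags(load_mmproj, load_img, load_speech):
    """Return the extra launch flags for each enabled optional model."""
    flags = []
    if load_mmproj:
        flags += ["--mmproj", "vmodel.gguf"]
    if load_img:
        flags += ["--sdmodel", "imodel.gguf", "--sdthreads", "4",
                  "--sdquant", "--sdclamped"]
    if load_speech:
        flags += ["--whispermodel", "wmodel.bin"]
    return flags


print(normalize_hf_url("https://huggingface.co/org/repo/blob/main/model.gguf"))
print(optional_flags(load_mmproj=True, load_img=False, load_speech=True))
```

Building one list instead of three separate command strings means each option has a single branch, so one option's `else` cannot accidentally reset another option's variable.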