**Quantization in deep learning refers to methods for reducing the numerical precision of a neural network's weights and activations, typically from 32-bit or 16-bit floating-point numbers to lower-bit integers. This can significantly reduce memory usage and computation, making it easier to deploy models on devices with limited resources.**
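
To make the idea concrete, here is a minimal sketch of symmetric 8-bit quantization with NumPy. It is a generic illustration, not the scheme GGML uses; the function names `quantize_int8` and `dequantize_int8` and the sample values are made up for this example.

```python
import numpy as np

def quantize_int8(weights):
    """Map float32 values to int8 using one shared scale (symmetric quantization)."""
    scale = float(np.max(np.abs(weights))) / 127.0
    if scale == 0.0:
        scale = 1.0  # all-zero input: any scale works
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float32 values from the int8 codes."""
    return q.astype(np.float32) * scale

weights = np.array([0.5, -1.27, 0.031, 1.0], dtype=np.float32)  # toy "weights"
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_error = float(np.max(np.abs(weights - restored)))

print("int8 codes:", q)
print("bytes before:", weights.nbytes, "bytes after:", q.nbytes)
print("largest rounding error:", max_error)
```

The int8 array takes a quarter of the memory of the float32 array, and the rounding error of each value is bounded by half the scale.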

### Quantization technique used here: GGML. Library: llama-cpp (via llama-cpp-python).

**Quantized models from the Hugging Face community: the Hugging Face community provides quantized models, which let us run the model efficiently on a T4 GPU. It is important to consult reliable sources before using any model.**

**There are several variations available, but the ones that interest us are based on the GGML library.
The different variations of Llama-2-13B-GGML can be browsed on the Hugging Face Hub.
In this case, we will use the model called Llama-2-13B-chat-GGML.**
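
As a rough guide to how the variants differ, the sketch below estimates file size from bits per weight. The block layouts are my understanding of the ggmlv3 formats (blocks of 32 weights with fp16 scale/offset fields) and should be treated as assumptions, as should the round parameter count of 13 billion. The helper names are hypothetical.

```python
# Bytes stored per block of 32 weights for some GGML quantization types.
# Assumed layouts for ggmlv3; check the llama.cpp sources before relying on them.
BLOCK_SIZE = 32
BYTES_PER_BLOCK = {
    "q4_0": 2 + 16,          # fp16 scale + 32 four-bit values
    "q4_1": 2 + 2 + 16,      # fp16 scale + fp16 minimum + 32 four-bit values
    "q5_0": 2 + 4 + 16,      # fp16 scale + 32 high bits + 32 four-bit values
    "q5_1": 2 + 2 + 4 + 16,  # fp16 scale + fp16 minimum + high bits + four-bit values
    "q8_0": 2 + 32,          # fp16 scale + 32 eight-bit values
}

def bits_per_weight(qtype):
    """Average number of bits stored per weight, including block metadata."""
    return BYTES_PER_BLOCK[qtype] * 8 / BLOCK_SIZE

def estimated_size_gb(n_params, qtype):
    """Approximate file size in GB (10**9 bytes), ignoring non-quantized tensors."""
    return n_params * bits_per_weight(qtype) / 8 / 1e9

N_PARAMS = 13.0e9  # assumption: roughly 13 billion parameters

for qtype in BYTES_PER_BLOCK:
    print(f"{qtype}: {bits_per_weight(qtype):.1f} bits/weight, "
          f"about {estimated_size_gb(N_PARAMS, qtype):.2f} GB")
```

Under these assumptions, `q5_1` comes out at 6 bits per weight and about 9.75 GB, which is close to the 9.76G reported by the download progress bar later in this notebook.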



In [1]:
# Build llama-cpp-python with GPU (cuBLAS) support.
# Version 0.1.78 can load GGML files; later releases expect the newer GGUF format.
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.78 numpy==1.23.4 --force-reinstall --upgrade --no-cache-dir --verbose

# Client library used to download the model file from the Hugging Face Hub.
!pip install huggingface_hub

Using pip 23.1.2 from /usr/local/lib/python3.10/dist-packages/pip (python 3.10)
Collecting llama-cpp-python==0.1.78
  Downloading llama_cpp_python-0.1.78.tar.gz (1.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 10.0 MB/s eta 0:00:00
  Running command pip subprocess to install build dependencies
  Using pip 23.1.2 from /usr/local/lib/python3.10/dist-packages/pip (python 3.10)
  Collecting setuptools>=42
    Downloading setuptools-69.0.3-py3-none-any.whl (819 kB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 819.5/819.5 kB 4.6 MB/s eta 0:00:00
  Collecting scikit-build>=0.13
    Downloading scikit_build-0.17.6-py3-none-any.whl (84 kB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 84.3/84.3 kB 11.9 MB/s eta 0:00:00
  Collecting cmake>=3.18
    Downloading cmake-3.28.1-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (26.3 MB)
       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.3/26.3 MB 41.0 MB/s eta 0:00:00
  Col

In [2]:
model_name_or_path = "TheBloke/Llama-2-13B-chat-GGML"

model_basename = "llama-2-13b-chat.ggmlv3.q5_1.bin"    # quantized model in .bin format, chosen because compute resources are limited

### Import All the Required Libraries

In [3]:
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

## Download the Model

In [4]:
model_path = hf_hub_download(repo_id=model_name_or_path, filename=model_basename)

model_path

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


llama-2-13b-chat.ggmlv3.q5_1.bin:   0%|          | 0.00/9.76G [00:00<?, ?B/s]

'/root/.cache/huggingface/hub/models--TheBloke--Llama-2-13B-chat-GGML/snapshots/3140827b4dfcb6b562cd87ee3d7f07109b014dd0/llama-2-13b-chat.ggmlv3.q5_1.bin'
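
Before loading a file of this size, it can be worth checking that it really is a GGML v3 file, since the pinned llama-cpp-python version will not read other formats. The sketch below reads the first eight bytes of a file. The magic number `0x67676A74` ("ggjt") and version `3` are my understanding of the ggmlv3 header and are assumptions here; the helper names are hypothetical. The demo runs on a tiny stand-in file rather than the real download.

```python
import os
import struct
import tempfile

GGJT_MAGIC = 0x67676A74  # assumed magic number ("ggjt") of GGML v3 files
GGJT_VERSION = 3         # assumed version field for "ggmlv3" files

def read_header(path):
    """Read the first two little-endian uint32 values: (magic, version)."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8:
        raise ValueError(f"{path} is too short to be a model file")
    return struct.unpack("<II", header)

def looks_like_ggml_v3(path):
    """Return True if the header matches the assumed GGML v3 magic and version."""
    magic, version = read_header(path)
    return magic == GGJT_MAGIC and version == GGJT_VERSION

# Demo on a stand-in file so the sketch runs without the 9.76 GB download.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(struct.pack("<II", GGJT_MAGIC, GGJT_VERSION))
    demo_path = tmp.name
demo_ok = looks_like_ggml_v3(demo_path)
os.remove(demo_path)
print(demo_ok)
```

In the notebook, the same check would be `looks_like_ggml_v3(model_path)`.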

## Loading the Model

In [5]:
# GPU
lcpp_llm = None
lcpp_llm = Llama(
    model_path=model_path,
    n_threads=2, # CPU cores
    n_batch=512, # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.
    n_gpu_layers=32 # Change this value based on your model and your GPU VRAM pool.
    )

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 | 


In [6]:
# Check how many layers are offloaded to the GPU

lcpp_llm.params.n_gpu_layers

32
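
One rough way to pick `n_gpu_layers` is to divide the model file size by the number of transformer layers and see how many fit in a VRAM budget. This is a back-of-the-envelope sketch only: it ignores the embedding and output tensors, the KV cache, and scratch buffers, so the real limit is lower. The function name and the budgets are hypothetical; Llama-2-13B has 40 transformer layers.

```python
def layers_that_fit(model_size_gb, n_layers, vram_budget_gb):
    """Estimate how many equally sized layers fit into a VRAM budget."""
    per_layer_gb = model_size_gb / n_layers
    fit = int(vram_budget_gb // per_layer_gb)
    return max(0, min(n_layers, fit))

MODEL_SIZE_GB = 9.76  # file size shown by the download progress bar
N_LAYERS = 40         # transformer layers in Llama-2-13B

# Hypothetical budgets: VRAM you are willing to give to the weights alone.
for budget_gb in (4.0, 8.0, 12.0):
    print(budget_gb, "GB ->", layers_that_fit(MODEL_SIZE_GB, N_LAYERS, budget_gb), "layers")
```

With an 8 GB budget for weights, this estimate gives 32 layers, the value used above.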

### Create a Prompt Template

In [11]:
prompt = "i would like to start Gen AI course by my self, so from where i need to start to learn?"
prompt_template=f'''SYSTEM: You are a helpful, respectful and honest assistant. Always answer as helpfully.

USER: {prompt}

ASSISTANT:
'''
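
If the same template is needed for several questions, it can be wrapped in a small function. This is a sketch that reproduces the template above; `build_prompt` and `SYSTEM_PROMPT` are names introduced here for illustration.

```python
SYSTEM_PROMPT = (
    "You are a helpful, respectful and honest assistant. "
    "Always answer as helpfully."
)

def build_prompt(user_message, system_prompt=SYSTEM_PROMPT):
    """Fill the SYSTEM / USER / ASSISTANT template used in this notebook."""
    return f"SYSTEM: {system_prompt}\n\nUSER: {user_message}\n\nASSISTANT:\n"

example = build_prompt("What is quantization?")
print(example)
```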

### Generating the Response

In [12]:
response = lcpp_llm(prompt=prompt_template,
                  max_tokens=256,
                  temperature=0.5,
                  top_p=0.95,
                  repeat_penalty=1.2,
                  top_k=150,
                  echo=True)

Llama.generate: prefix-match hit
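
To give a feel for what `temperature`, `top_k` and `top_p` do, here is a textbook-style sketch of how they reshape a next-token distribution. It is not llama.cpp's sampler, which may apply these steps in a different order and also applies `repeat_penalty`; the function name and the toy logits are made up for this example.

```python
import numpy as np

def filter_probs(logits, temperature=0.5, top_k=150, top_p=0.95):
    """Return renormalized token probabilities after temperature, top-k and top-p."""
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    probs /= probs.sum()

    order = np.argsort(probs)[::-1]  # token ids, most likely first
    order = order[:top_k]            # top-k: keep the k most likely tokens
    cumulative = np.cumsum(probs[order])
    # top-p: keep the smallest prefix whose cumulative probability reaches top_p
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1
    order = order[:cutoff]

    filtered = np.zeros_like(probs)
    filtered[order] = probs[order]
    return filtered / filtered.sum()

toy_logits = [2.0, 1.0, 0.0, -1.0]  # scores for a four-token vocabulary
p = filter_probs(toy_logits)
rng = np.random.default_rng(0)
next_token = int(rng.choice(len(p), p=p))
print(p, next_token)
```

A lower temperature sharpens the distribution, while `top_k` and `top_p` remove unlikely tokens before one is drawn; in this toy case only the two most likely tokens survive.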


In [13]:
print(response)

{'id': 'cmpl-44ff52d7-9564-4867-973a-e40ae6fb8104', 'object': 'text_completion', 'created': 1706762822, 'model': '/root/.cache/huggingface/hub/models--TheBloke--Llama-2-13B-chat-GGML/snapshots/3140827b4dfcb6b562cd87ee3d7f07109b014dd0/llama-2-13b-chat.ggmlv3.q5_1.bin', 'choices': [{'text': "SYSTEM: You are a helpful, respectful and honest assistant. Always answer as helpfully.\n\nUSER: i would like to start Gen AI course by my self, so from where i need to start to learn?\n\nASSISTANT:\nThank you for your question! I'm happy to help you get started with learning about General Artificial Intelligence (Gen AI). Here are some steps and resources that can guide you on your journey:\n\n1. Start by familiarizing yourself with the basics of AI: Before diving into Gen AI, it's essential to have a solid understanding of what AI is and its various subfields. You can start by reading articles or watching videos about AI, machine learning, deep learning, and neural networks.\n2. Learn programming l

In [15]:
response

{'id': 'cmpl-44ff52d7-9564-4867-973a-e40ae6fb8104',
 'object': 'text_completion',
 'created': 1706762822,
 'model': '/root/.cache/huggingface/hub/models--TheBloke--Llama-2-13B-chat-GGML/snapshots/3140827b4dfcb6b562cd87ee3d7f07109b014dd0/llama-2-13b-chat.ggmlv3.q5_1.bin',
 'choices': [{'text': "SYSTEM: You are a helpful, respectful and honest assistant. Always answer as helpfully.\n\nUSER: i would like to start Gen AI course by my self, so from where i need to start to learn?\n\nASSISTANT:\nThank you for your question! I'm happy to help you get started with learning about General Artificial Intelligence (Gen AI). Here are some steps and resources that can guide you on your journey:\n\n1. Start by familiarizing yourself with the basics of AI: Before diving into Gen AI, it's essential to have a solid understanding of what AI is and its various subfields. You can start by reading articles or watching videos about AI, machine learning, deep learning, and neural networks.\n2. Learn programmi

In [16]:
response["choices"]

[{'text': "SYSTEM: You are a helpful, respectful and honest assistant. Always answer as helpfully.\n\nUSER: i would like to start Gen AI course by my self, so from where i need to start to learn?\n\nASSISTANT:\nThank you for your question! I'm happy to help you get started with learning about General Artificial Intelligence (Gen AI). Here are some steps and resources that can guide you on your journey:\n\n1. Start by familiarizing yourself with the basics of AI: Before diving into Gen AI, it's essential to have a solid understanding of what AI is and its various subfields. You can start by reading articles or watching videos about AI, machine learning, deep learning, and neural networks.\n2. Learn programming languages: To build your skills in AI, you need to learn programming languages like Python, Java, C++, etc. These languages are widely used in the field of AI for developing algorithms and implementing them. You can start with online tutorials or courses that teach these programmi

In [17]:
response["choices"][0]

{'text': "SYSTEM: You are a helpful, respectful and honest assistant. Always answer as helpfully.\n\nUSER: i would like to start Gen AI course by my self, so from where i need to start to learn?\n\nASSISTANT:\nThank you for your question! I'm happy to help you get started with learning about General Artificial Intelligence (Gen AI). Here are some steps and resources that can guide you on your journey:\n\n1. Start by familiarizing yourself with the basics of AI: Before diving into Gen AI, it's essential to have a solid understanding of what AI is and its various subfields. You can start by reading articles or watching videos about AI, machine learning, deep learning, and neural networks.\n2. Learn programming languages: To build your skills in AI, you need to learn programming languages like Python, Java, C++, etc. These languages are widely used in the field of AI for developing algorithms and implementing them. You can start with online tutorials or courses that teach these programmin

In [18]:
response["choices"][0]['text']

"SYSTEM: You are a helpful, respectful and honest assistant. Always answer as helpfully.\n\nUSER: i would like to start Gen AI course by my self, so from where i need to start to learn?\n\nASSISTANT:\nThank you for your question! I'm happy to help you get started with learning about General Artificial Intelligence (Gen AI). Here are some steps and resources that can guide you on your journey:\n\n1. Start by familiarizing yourself with the basics of AI: Before diving into Gen AI, it's essential to have a solid understanding of what AI is and its various subfields. You can start by reading articles or watching videos about AI, machine learning, deep learning, and neural networks.\n2. Learn programming languages: To build your skills in AI, you need to learn programming languages like Python, Java, C++, etc. These languages are widely used in the field of AI for developing algorithms and implementing them. You can start with online tutorials or courses that teach these programming languag

In [14]:
print(response["choices"][0]["text"])

SYSTEM: You are a helpful, respectful and honest assistant. Always answer as helpfully.

USER: i would like to start Gen AI course by my self, so from where i need to start to learn?

ASSISTANT:
Thank you for your question! I'm happy to help you get started with learning about General Artificial Intelligence (Gen AI). Here are some steps and resources that can guide you on your journey:

1. Start by familiarizing yourself with the basics of AI: Before diving into Gen AI, it's essential to have a solid understanding of what AI is and its various subfields. You can start by reading articles or watching videos about AI, machine learning, deep learning, and neural networks.
2. Learn programming languages: To build your skills in AI, you need to learn programming languages like Python, Java, C++, etc. These languages are widely used in the field of AI for developing algorithms and implementing them. You can start with online tutorials or courses that teach these programming languages.
3. St
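
Because the call above uses `echo=True`, the returned text starts with the prompt itself. A minimal sketch for keeping only the assistant's answer is shown below; `extract_answer` is a hypothetical helper, and the response dictionary here is a small stand-in with the same shape as the real one.

```python
def extract_answer(response, prompt_text):
    """Return the generated text without the echoed prompt."""
    text = response["choices"][0]["text"]
    if text.startswith(prompt_text):
        text = text[len(prompt_text):]
    return text.strip()

# Stand-in data with the same structure as the llama-cpp-python response.
demo_prompt = "SYSTEM: Be helpful.\n\nUSER: Hello?\n\nASSISTANT:\n"
demo_response = {"choices": [{"text": demo_prompt + "Hi there!\n"}]}

answer = extract_answer(demo_response, demo_prompt)
print(answer)
```

In the notebook this would be `extract_answer(response, prompt_template)`. Alternatively, passing `echo=False` when generating leaves the prompt out of the returned text.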