![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)

# 1. Introduction

The GPT-family models which we covered earlier are undoubtedly powerful. However, their weights and architectural details are not publicly available, and even if they were, running models of that size would require significant computational resources.

Furthermore, the available APIs are not free to build on. These limitations can hold back research on Large Language Models (LLMs). Open-source alternatives (like [GPT4All](https://www.nomic.ai/gpt4all)) aim to overcome these obstacles and make LLMs more accessible to everyone.

![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)

# 2. How Does GPT4All Work?

GPT4All is trained on top of Facebook's LLaMA model, whose weights were released under a non-commercial license. Still, running this architecture at full precision on a typical local PC is impractical because of its large number of parameters (7 billion). The authors incorporated two tricks to make fine-tuning and inference efficient. We will focus on inference, since the fine-tuning process is outside the scope of this course.

The main contribution of GPT4All models is the ability to run them on a CPU. Testing these models is practically free because recent PCs have powerful Central Processing Units (CPUs). The technique that makes this possible is called quantization. It converts the pre-trained model weights to 4-bit precision and stores them in the GGML format (or its successor, GGUF, which is the format of the model file loaded later in this notebook), so the model uses fewer bits to represent each number. There are two main advantages to using this technique:

1. **Reducing Memory Usage:** Smaller weights make the models easier to deploy on low-resource devices.
2. **Faster Inference:** Generation is faster because less data has to be read from memory and processed for each token.
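To make the memory saving concrete, here is a rough back-of-the-envelope sketch. It only counts the bytes needed to hold the weights themselves, using the 7-billion-parameter figure quoted above. The helper name `weight_memory_gib` is made up for this illustration, and real quantized files are somewhat larger because they also store scale factors and metadata.

```python
def weight_memory_gib(num_params: int, bits_per_weight: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


# The 7-billion-parameter figure quoted in the text above
NUM_PARAMS = 7_000_000_000

# Compare common precisions; the labels are for display only
for label, bits in [("float32", 32), ("float16", 16), ("4-bit", 4)]:
    print(f"{label:>8}: {weight_memory_gib(NUM_PARAMS, bits):5.1f} GiB")
```

Under these assumptions, going from 32-bit to 4-bit weights shrinks the weight storage by a factor of eight, which is what brings a model of this size within reach of ordinary laptop RAM.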

This approach does sacrifice a small amount of output quality. However, the choice is between having no access at all and having access to a slightly less capable model!
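The quality loss comes from rounding each weight to one of only 16 possible values. The sketch below illustrates the idea with a simplified scheme: a block of weights shares one scale factor, and each weight is rounded to an integer in the 4-bit range. This is an assumption-laden toy, not the exact layout used by GGML or GPT4All; the function names, the block size of 32, and the random test weights are all chosen for illustration.

```python
import random


def quantize_block(block):
    """Map floats to 4-bit integers in [-8, 7] using one shared scale."""
    max_abs = max(abs(w) for w in block)
    # Guard against an all-zero block to avoid dividing by zero
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in block]
    return codes, scale


def dequantize_block(codes, scale):
    """Recover approximate floats from the 4-bit codes."""
    return [c * scale for c in codes]


random.seed(0)
# Hypothetical weights: 32 small values, roughly like a slice of a weight matrix
weights = [random.gauss(0.0, 0.02) for _ in range(32)]

codes, scale = quantize_block(weights)
restored = dequantize_block(codes, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print("4-bit codes:", codes)
print(f"largest absolute error: {max_err:.5f} (scale = {scale:.5f})")
```

In this toy scheme the rounding error of any single weight is at most half the scale factor, which is why the restored weights stay close to the originals even though each one is stored in only 4 bits.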

![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)

# 3. Let's See GPT4All in Action

## 1. Installing Dependencies

In [1]:
!pip3 install torch
!pip3 install git+https://github.com/huggingface/transformers
!pip3 install gpt4all


## 2. Load the Model and Test

In [1]:
from gpt4all import GPT4All
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")

In [3]:
# Test the model running locally
with model.chat_session():
    print(model.generate("Explain Big Bang Theory in simple terms under 10 sentences", max_tokens=1024))

Here's a simplified explanation of the Big Bang Theory:

About 13.8 billion years ago, there was no universe - just an infinitely hot and dense point called a singularity. This singularity suddenly expanded rapidly, causing space to grow and cool down. As it did, particles started forming from energy, like tiny building blocks. These particles eventually came together to form atoms, mostly hydrogen and helium. Over time, gravity pulled these atoms together into the first stars and galaxies. Our own galaxy, the Milky Way, was formed about 13 billion years ago. The universe continued to expand and cool down, allowing for planets to form around those early stars. Eventually, our solar system came together about 4.6 billion years ago, with Earth forming as we know it today. And that's basically how our entire universe began!


![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)