## GPT4All

GPT4All is an ecosystem for training and deploying powerful, customized large language models that run locally on consumer-grade CPUs. A GPT4All model is a 3 GB to 8 GB file that you can download and plug into the GPT4All open-source ecosystem software.
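As a rough sanity check on the 3 GB to 8 GB range, the sketch below estimates the size of a 4-bit quantized file. The parameter count (about 7.24 billion for a "7B" model) and the Q4_0 layout (32 weights stored in 18 bytes) are assumptions not stated in this notebook, and the estimate ignores file metadata and any tensors kept at higher precision.

```python
# Back-of-the-envelope size estimate for a 4-bit quantized model file.
# Assumptions (not from this notebook): ~7.24e9 parameters for a "7B" model,
# and a Q4_0 layout that stores 32 weights in 18 bytes (16 bytes of 4-bit
# values plus a 2-byte scale), i.e. 4.5 bits per weight.


def estimate_file_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Return the approximate file size in gigabytes (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


BITS_PER_WEIGHT_Q4_0 = 18 * 8 / 32  # 4.5 bits per weight under the assumed layout

size_gb = estimate_file_size_gb(7.24e9, BITS_PER_WEIGHT_Q4_0)
print(f"{size_gb:.2f} GB")
```

Under these assumptions a 7B model lands near the lower end of the quoted range; larger models or less aggressive quantization push the file toward the upper end.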


Model Explorer: https://gpt4all.io/index.html

## Getting started

* Install the gpt4all library

In [1]:
#! pip install -q gpt4all 
# poetry add gpt4all

In [2]:
from pathlib import Path
from gpt4all import GPT4All
from gpt4all import Embed4All

* Specify the model

In [3]:
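local_path = Path.home()/'.cache'/'gpt4all'  # directory that holds the model file
model_name = 'mistral-7b-openorca.Q4_0.gguf'

# Load the model from local_path; allow_download=False prevents downloading it if it is missing
llm = GPT4All(model_name=model_name, model_path=local_path, allow_download=False, device='cpu')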
llm

<gpt4all.gpt4all.GPT4All at 0x16bee154210>

In [4]:
llm.config

{'systemPrompt': '',
 'promptTemplate': '### Human: \n{0}\n### Assistant:\n',
 'path': 'C:\\Users\\prng\\.cache\\gpt4all\\mistral-7b-openorca.Q4_0.gguf'}
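Because the model is loaded with `allow_download=False`, the file has to be on disk already. A minimal sketch of a pre-flight check follows; `find_model_file` is a hypothetical helper written for this illustration, not part of the gpt4all package.

```python
from pathlib import Path


def find_model_file(model_dir: Path, model_name: str) -> Path:
    """Return the full path of the model file, or raise if it is missing.

    Hypothetical helper for illustration; it only inspects the file system.
    """
    candidate = Path(model_dir) / model_name
    if not candidate.is_file():
        raise FileNotFoundError(
            f"{model_name} not found in {model_dir}; download it first "
            "or load it with allow_download=True"
        )
    return candidate


# Same directory and file name as in the cell above
model_dir = Path.home() / ".cache" / "gpt4all"
try:
    print(find_model_file(model_dir, "mistral-7b-openorca.Q4_0.gguf"))
except FileNotFoundError as err:
    print(err)
```

Running the check before constructing `GPT4All` gives a clearer message than a failure deep inside model loading.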

* Generate a response

In [5]:
def request(model, question):
    # System prompt that sets the assistant's behaviour
    system_template = 'You are a helpful assistant. Your job is to politely answer questions of the user.'
    # {0} is the placeholder that gets filled with the user's question
    prompt_template = "USER: {0}\nASSISTANT:"
    with model.chat_session(system_template, prompt_template):
        response = model.generate(question)
    return response
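The `promptTemplate` shown in `llm.config` uses a `{0}` placeholder for the user's question. The sketch below only illustrates that substitution with plain `str.format`; it is an assumption that the library fills the template in an equivalent way, and `build_prompt` is a hypothetical helper, not a gpt4all function.

```python
def build_prompt(prompt_template: str, question: str) -> str:
    """Fill the {0} placeholder of a prompt template with the user's question."""
    if "{0}" not in prompt_template:
        raise ValueError("prompt_template must contain a {0} placeholder")
    return prompt_template.format(question)


# Template taken from the llm.config output shown earlier
default_template = "### Human: \n{0}\n### Assistant:\n"
# A custom template in the USER/ASSISTANT style used by this notebook
custom_template = "USER: {0}\nASSISTANT:"

print(build_prompt(default_template, "What is the capital of France?"))
print(build_prompt(custom_template, "What is the capital of France?"))
```

Keeping the placeholder in the template, rather than baking the question in with an f-string, lets the same template be reused for every question in a session.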

* Test it out!

In [6]:
question = "What is the capital of France?"
response = request(llm,question)
print(response)

 The capital city of France is Paris.


* Generate embeddings

In [7]:
def create_embedding(text: str): 
    embedder = Embed4All()    
    return embedder.embed(text)


In [8]:
text = 'Hello World!'
output_embedding = create_embedding(text)
print(output_embedding)
print(len(output_embedding))

[0.004454717505723238, 0.032058484852313995, 0.018707796931266785, -0.004652936011552811, -0.017432844266295433, -0.09676730632781982, 0.027140984311699867, -0.04684029519557953, -0.06210200861096382, 0.030343255028128624, 0.04764336347579956, 0.02099759131669998, 0.05062853917479515, -0.006406750530004501, -0.01975109800696373, -0.03310864046216011, 0.019293146207928658, -0.004116785246878862, -0.1341720074415207, 0.028399569913744926, 0.03599302098155022, 0.04383411630988121, -0.02773466892540455, 0.05283363163471222, -0.0989377424120903, -0.00440975371748209, 0.0211013313382864, 0.05013199523091316, -0.010722963139414787, -0.059255797415971756, 0.049544695764780045, 0.027990898117423058, 0.08020834624767303, -0.010854195803403854, 0.016916336491703987, 0.05536540597677231, -0.02263537235558033, -0.09096749871969223, -0.03612218052148819, 0.02368544228374958, 0.01634213514626026, -0.07730051875114441, -0.026741215959191322, 0.023395173251628876, -0.010567973367869854, 0.0273748189210

* A custom wrapper class

In [9]:
from typing import List, Optional


class OpenLLM:

    def __init__(self, model_name: str, model_path: str, n_threads: Optional[int] = None, **kwargs):
        self.model_name = model_name
        self.model_path = model_path
        # Forward n_threads and any extra keyword arguments to GPT4All
        self.llm = GPT4All(model_name=self.model_name,
                           model_path=self.model_path,
                           allow_download=False,
                           n_threads=n_threads,
                           **kwargs)
        self.embedder = Embed4All()

    def create_embedding(self, text: str) -> List[float]:
        # Return the embedding vector for the given text
        return self.embedder.embed(text)

    def invoke(self, question: str) -> str:
        system_template = 'You are a helpful assistant. Your job is to politely answer questions of the user.'
        # {0} is the placeholder that gets filled with the user's question
        prompt_template = "USER: {0}\nASSISTANT:"
        with self.llm.chat_session(system_template, prompt_template):
            response = self.llm.generate(question)
        return response


* Create an instance of my custom class

In [10]:
local_path = Path.home()/'.cache'/'gpt4all'
model_name = 'mistral-7b-openorca.Q4_0.gguf' 
openLLM = OpenLLM(model_name=model_name,model_path=local_path)

* Create embeddings

In [11]:
text = 'Hola Mundo!'
embedding = openLLM.create_embedding(text)
print(embedding)
print(len(embedding))

[-0.02067878283560276, 0.14263160526752472, 0.028110701590776443, -0.029741564765572548, -0.01575903594493866, -0.019633980467915535, 0.0456370934844017, -0.05857016518712044, 0.005482892040163279, -0.003296123119071126, 0.061384182423353195, -0.04817118123173714, -0.034032948315143585, 0.054883718490600586, -0.01698014885187149, -0.056987568736076355, -0.051771167665719986, 0.07983585447072983, 0.000270333286607638, 0.019620247185230255, 0.11311732232570648, 0.00442607281729579, -0.028395088389515877, 0.05289364606142044, -0.06899865716695786, 0.04789116606116295, 0.025682324543595314, 0.02807609923183918, 0.03174234926700592, -0.07944250851869583, -0.01303116325289011, 0.08746299892663956, -0.022139960899949074, -0.0339580699801445, -0.05298210680484772, 0.043851230293512344, 0.0009722646209411323, -0.05116035044193268, 0.008356052450835705, 0.03393075615167618, -0.0032360730692744255, -0.045054756104946136, 0.0062681687995791435, -0.01652846857905388, -0.007609410211443901, -0.06412
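Embeddings are typically compared with cosine similarity, which would show how close 'Hello World!' and 'Hola Mundo!' are in the embedding space. The sketch below is a plain-Python version run on toy vectors, since the printed embeddings above are truncated; `cosine_similarity` is a helper defined here, not part of gpt4all.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 1.0, 0.0]
print(cosine_similarity(v1, v2))
```

In this notebook the call would be `cosine_similarity(output_embedding, embedding)`, using the two vectors computed in the earlier cells.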

* Multilingual questions

In [12]:
question ='What is the capital of Spain?'
response = openLLM.invoke(question)
print(response)

 The capital city of Spain is Madrid.


In [13]:
question ='¿Cuál es la capital de España?'
response = openLLM.invoke(question)
print(response)

 La capital de España es Madrid.


In [14]:
question = "Quelle est la capitale de l'Espagne ?"
response = openLLM.invoke(question)
print(response)

 La capitale de l'Espagne est Madrid.
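The three cells above repeat the same call with a different question. A small sketch of batching them follows; `ask_all` and `fake_invoke` are hypothetical names, and the stub stands in for `openLLM.invoke` so the example runs without a model.

```python
from typing import Callable, Dict, Iterable


def ask_all(invoke: Callable[[str], str], questions: Iterable[str]) -> Dict[str, str]:
    """Send each question to `invoke` and collect the stripped answers."""
    # strip() removes the leading space seen in the raw model output
    return {question: invoke(question).strip() for question in questions}


def fake_invoke(question: str) -> str:
    """Stand-in for a real model call; returns a canned reply."""
    return f" (answer to: {question})"


questions = [
    "What is the capital of Spain?",
    "¿Cuál es la capital de España?",
    "Quelle est la capitale de l'Espagne ?",
]
answers = ask_all(fake_invoke, questions)
for question, answer in answers.items():
    print(question, "->", answer)
```

With the class defined earlier, `ask_all(openLLM.invoke, questions)` would run the same loop against the local model.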


* Details

In [15]:
openLLM.model_name

'mistral-7b-openorca.Q4_0.gguf'

In [16]:
openLLM.model_path

WindowsPath('C:/Users/prng/.cache/gpt4all')

In [17]:
openLLM.llm.config

{'systemPrompt': '',
 'promptTemplate': '### Human: \n{0}\n### Assistant:\n',
 'path': 'C:\\Users\\prng\\.cache\\gpt4all\\all-MiniLM-L6-v2-f16.gguf',
 'order': 'o',
 'md5sum': 'e479e6f38b59afc51a470d1953a6bfc7',
 'disableGUI': 'true',
 'name': 'SBert',
 'filename': 'all-MiniLM-L6-v2-f16.gguf',
 'filesize': '45887744',
 'requires': '2.5.0',
 'ramrequired': '1',
 'parameters': '40 million',
 'quant': 'f16',
 'type': 'Bert',
 'description': '<strong>LocalDocs text embeddings model</strong><br><ul><li>Necessary for LocalDocs feature<li>Used for retrieval augmented generation (RAG)'}
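The config values above are stored as strings, so numeric fields such as `filesize` need converting before use. A minimal sketch, assuming the keys shown in the output are present; `summarize_config` is a hypothetical helper.

```python
def summarize_config(config: dict) -> str:
    """Build a one-line summary from a GPT4All-style config dictionary."""
    # 'filesize' is a string holding a byte count; convert it to mebibytes
    size_mib = int(config["filesize"]) / 1024**2
    return f"{config['filename']}: {size_mib:.1f} MiB ({config['quant']})"


# Sample values copied from the config output above
sample_config = {
    "filename": "all-MiniLM-L6-v2-f16.gguf",
    "filesize": "45887744",
    "quant": "f16",
}
print(summarize_config(sample_config))
```

In the notebook the call would be `summarize_config(openLLM.llm.config)`.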

## References

* GPT4All Python bindings and parameters: https://docs.gpt4all.io/gpt4all_python.html
* Generating embeddings with GPT4All: https://docs.gpt4all.io/gpt4all_python_embedding.html
* GPU usage: https://github.com/nomic-ai/gpt4all/tree/main/gpt4all-bindings/python