<a href = "https://www.pieriantraining.com"><img src="../PT Centered Purple.png"> </a>

<em style="text-align:center">Copyrighted by Pierian Training</em>

# Language Models

**Note: For other Non-OpenAI models, you can check out: https://python.langchain.com/docs/modules/model_io/models/llms/ although the interface is extremely similar, its just that the results from .generation calls will have differentinformation depending on the service you use.**

## Text Model Connection

In [4]:
from langchain.llms.huggingface_pipeline import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

In [5]:
#model_id = "/home/ankdesh/.cache/huggingface/hub/models--bert-base-uncased/snapshots/1dbc166cf8765166998eff31ade2eb64c8a40076/" # "meta-llama/Llama-2-7b-hf"
model_id = "EleutherAI/pythia-1b"
#model_id = "/home/ankdesh/.cache/huggingface/hub/models--stabilityai--stablelm-zephyr-3b/snapshots/9974c58a0ec4be4cd6f55e814a2a93b9cf163823/"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, 
                max_new_tokens=200, trust_remote_code=True, device=0, 
                torch_dtype=torch.float16)
llm = HuggingFacePipeline(pipeline=pipe)

## Text Model Call

This is the simplest way to get a text autocomplete:

In [6]:
print(llm('Here is a fun fact about Pluto:'))

Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


 it is the only planet in the solar system that has a moon.

Pluto is the only planet in the solar system that has a moon.

Pluto is the only planet in the solar system that has a moon.

Pluto is the only planet in the solar system that has a moon.

Pluto is the only planet in the solar system that has a moon.

Pluto is the only planet in the solar system that has a moon.

Pluto is the only planet in the solar system that has a moon.

Pluto is the only planet in the solar system that has a moon.

Pluto is the only planet in the solar system that has a moon.

Pluto is the only planet in the solar system that has a moon.

Pluto is the only planet in the solar system that has a moon.

Pluto is the only planet in the solar system that has a moon


You can also use generate to get full output with more info:

In [7]:
# NEEDS TO BE A LIST, EVEN FOR JUST ONE STRING
result = llm.generate(['Here is a fun fact about Pluto:',
                     'Here is a fun fact about Mars:']
                     )

Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


In [8]:
result.schema()

{'title': 'LLMResult',
 'description': 'Class that contains all results for a batched LLM call.',
 'type': 'object',
 'properties': {'generations': {'title': 'Generations',
   'type': 'array',
   'items': {'type': 'array', 'items': {'$ref': '#/definitions/Generation'}}},
  'llm_output': {'title': 'Llm Output', 'type': 'object'},
  'run': {'title': 'Run',
   'type': 'array',
   'items': {'$ref': '#/definitions/RunInfo'}}},
 'required': ['generations'],
 'definitions': {'Generation': {'title': 'Generation',
   'description': 'A single text generation output.',
   'type': 'object',
   'properties': {'text': {'title': 'Text', 'type': 'string'},
    'generation_info': {'title': 'Generation Info', 'type': 'object'},
    'type': {'title': 'Type',
     'default': 'Generation',
     'enum': ['Generation'],
     'type': 'string'}},
   'required': ['text']},
  'RunInfo': {'title': 'RunInfo',
   'description': 'Class that contains metadata for a single execution of a Chain or model.',
   'type': '

In [9]:
print (result)

generations=[[Generation(text=' it is the only planet in the solar system that has a moon.\n\nPluto is the only planet in the solar system that has a moon.\n\nPluto is the only planet in the solar system that has a moon.\n\nPluto is the only planet in the solar system that has a moon.\n\nPluto is the only planet in the solar system that has a moon.\n\nPluto is the only planet in the solar system that has a moon.\n\nPluto is the only planet in the solar system that has a moon.\n\nPluto is the only planet in the solar system that has a moon.\n\nPluto is the only planet in the solar system that has a moon.\n\nPluto is the only planet in the solar system that has a moon.\n\nPluto is the only planet in the solar system that has a moon.\n\nPluto is the only planet in the solar system that has a moon')], [Generation(text=' It has a lot of water.\n\nThe Red Planet is also home to a lot of water.\n\nThe Red Planet is also home to a lot of water.\n\nThe Red Planet is also home to a lot of water.

# Chat Models

The most popular models are actually chat models, that have a System Message and then a series of Assistant and Human Messages

In [10]:
chat = llm # Try with LLM model

In [11]:
from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)

In [12]:
hm = AIMessage(content="Can you tell me a fact about Earth?")

In [13]:
def get_str_from_msg(list_str):
    # Convert the messages to strings
    messages_str = [hm.content for message in list_str]
    # Join the messages into a single string
    messages_combined = "\n".join(messages_str)
    return messages_combined
print (get_str_from_msg([hm]))

Can you tell me a fact about Earth?


In [14]:
result = llm(get_str_from_msg([HumanMessage(content="Can you tell me a fact about Earth?")]))

Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


In [15]:
result

'\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\n'

In [16]:
result = chat(get_str_from_msg([SystemMessage(content='You are a very rude teenager who only wants to party and not answer questions'),
               HumanMessage(content='Can you tell me a fact about Earth?')]))

Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


In [17]:
result

'\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?\nCan you tell me a fact about Earth?'

## Extra Parameters and Args

Here we add in some extra parameters and args, note we chose some pretty extreme values!

In [18]:
result = chat(get_str_from_msg([HumanMessage(content='Can you tell me a fact about Earth?')]),
                 temperature=2,presence_penalty=1,max_tokens=100)

Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


In [19]:
result

'\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\nA:\n\nThe answer is yes.\n\n'

# Caching

Making the same exact request often? You could use a cache to store results **note, you should only do this if the prompt is the exact same and the historical replies are okay to return**.

In [20]:
import langchain
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(openai_api_key=api_key)

NameError: name 'api_key' is not defined

In [None]:
from langchain.cache import InMemoryCache
langchain.llm_cache = InMemoryCache()

# The first time, it is not yet in cache, so it should take longer
llm.predict("Tell me a fact about Mars")

In [None]:
# You will notice this reply is instant!
llm.predict('Tell me a fact about Mars')

You can also use SQLite Caches: https://python.langchain.com/docs/modules/model_io/models/chat/how_to/chat_model_caching#sqlite-cache