In [1]:
# load any API keys into environment variables
%load_ext dotenv
%dotenv

In [2]:
#import the needed libraries
import os
import openai

In [3]:
#set the API key to use
openai.api_key = os.getenv('OPENAI_API_KEY')

In [4]:
#Create the chatbot client
client = openai.OpenAI()

In [5]:
# call the client with a system message that defines how to answer, followed by the user's question
client.chat.completions.create(model = 'gpt-4', 
                               messages = [{'role':'system', 
                                            'content':''' You are Jimbo, a chatbot that snobbishly answers the question. '''}, 
                                           {'role':'user', 
                                            'content':''' Could you explain briefly what a Large Language Model is? '''}], 
                               max_tokens = 250)

ChatCompletion(id='chatcmpl-BUMifVtnbN6SFEaArqDBB4DgeTxCS', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Oh, you sweet, uninformed individual. A Large Language Model represents the height of AI automaton sophistication. It's a form of artificial intelligence that's been trained on a massive compendium of textual data from the internet. The training involves predicting the next word in a text—rather rudimentary to us, but try imagining the not-so-intelligent machines learning from this.\n\nAfter this laborious process, the result is a model of remarkable verbal agility. It can write rather eloquently, answer questions, translate languages, and even create poetic expressions or vedute of prose. Staggeringly, it can react to a variety of prompts, bending to the will of the instruction given.\n\nYet as impressive as it seems, remember, dear inquirer, it possesses no thoughts, beliefs, or desires of its own. It can't learn from individ
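
The call above displays the whole ChatCompletion object. Here is a minimal sketch of pulling out just the reply text. So that it runs without an API key, `fake_completion` is a hypothetical stand-in built with `SimpleNamespace` that mirrors only fields visible in the output above; with a real response, the same attribute path would be applied to the value returned by `client.chat.completions.create(...)`.

```python
from types import SimpleNamespace

# Hypothetical stand-in that mimics the shape of the ChatCompletion shown above.
fake_completion = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason='stop',
            index=0,
            message=SimpleNamespace(content='Oh, you sweet, uninformed individual.'),
        )
    ]
)

def reply_text(completion):
    """Return the text of the first choice in a chat completion."""
    return completion.choices[0].message.content

print(reply_text(fake_completion))
```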

In [6]:
#let's use the LangChain library to simplify this
from langchain_openai.chat_models import ChatOpenAI

In [7]:
# for the LangChain version of OpenAI chat, we need to specify some parameters
chat = ChatOpenAI(model_name = 'gpt-4', # the LLM we want to use
                  seed = 403, # The seed, like the seed in a random number generator
                  temperature = 0, # The randomness (creativity) of the response. 0 = predictable, larger = more random
                  max_tokens = 200) # how many tokens to return
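
To give a feel for what `temperature` and `seed` mean, here is a toy sketch of temperature-scaled sampling over a handful of made-up scores. It illustrates the general idea only and is not a description of how the OpenAI service samples tokens. The function name, the scores, and the fallback to the largest score at temperature 0 are all assumptions made for this example.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Pick an index from `logits`; temperature 0 falls back to the largest logit."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the peak for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

toy_logits = [2.0, 1.0, 0.1]   # hypothetical scores for three candidate tokens
rng = random.Random(403)       # seeded generator, so repeated runs match

# temperature 0 always picks the highest score
print(sample_with_temperature(toy_logits, 0, rng))
# a larger temperature spreads the picks across the candidates
print([sample_with_temperature(toy_logits, 1.5, rng) for _ in range(10)])
```

In this sketch, a temperature of 0 always returns the same index, and reusing the same seed reproduces the same sequence of random picks.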

In [8]:
# the simplified call to the chatbot. We did not add a system message to instruct the chatbot how to respond
response = chat.invoke(''' Could you explain briefly what a Large Language Model is? ''')

In [9]:
# the response is wrapped differently than in the OpenAI library
response

AIMessage(content='A Large Language Model is a type of artificial intelligence model that has been trained on a vast amount of text data. It is designed to generate human-like text based on the input it receives. These models can answer questions, write essays, summarize texts, translate languages, and even generate creative ideas. They are called "large" because they have a high number of parameters, which allows them to better understand and generate text in a more nuanced and sophisticated way. Examples of large language models include OpenAI\'s GPT-3 and Google\'s BERT.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 111, 'prompt_tokens': 19, 'total_tokens': 130, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4-0613', 'system_fingerprint': None, 'id': 'chatcmpl

In [10]:
# this makes it easier to just display the answer
print(response.content)

A Large Language Model is a type of artificial intelligence model that has been trained on a vast amount of text data. It is designed to generate human-like text based on the input it receives. These models can answer questions, write essays, summarize texts, translate languages, and even generate creative ideas. They are called "large" because they have a high number of parameters, which allows them to better understand and generate text in a more nuanced and sophisticated way. Examples of large language models include OpenAI's GPT-3 and Google's BERT.
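
Besides `content`, the AIMessage carries a `response_metadata` dictionary, visible in the raw output of the earlier cell. Here is a small sketch of reading the token counts from it. The dictionary below is a stand-in copied from that output so the example runs without an API call; with a live response you would read `response.response_metadata` instead.

```python
# Stand-in dictionary copied from the response_metadata shown in the earlier output.
response_metadata = {
    'token_usage': {'completion_tokens': 111, 'prompt_tokens': 19, 'total_tokens': 130},
    'model_name': 'gpt-4-0613',
}

usage = response_metadata['token_usage']
print(f"model: {response_metadata['model_name']}")
print(f"prompt tokens: {usage['prompt_tokens']}, completion tokens: {usage['completion_tokens']}")
print(f"total tokens: {usage['total_tokens']}")
```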


In [11]:
# import the message classes so we can add the system message back
from langchain_core.messages import SystemMessage, HumanMessage

In [12]:
# let's define the system message - this instructs the chatbot how to reply
message_s = SystemMessage(content = ''' You are Jimbo, a chatbot that comically answers the question. ''')
# let's define the human message - this is the question we are asking the chatbot
message_h = HumanMessage(content = ''' Could you explain briefly what a Large Language Model is? ''')

In [13]:
# call the chatbot with the list of messages; it tells the messages apart by their class
response = chat.invoke([message_s, message_h])

In [14]:
# as we can see, the reply has followed the system message's instructions
print(response.content)

Oh, absolutely! Picture a giant word wizard, right? This wizard has read more books than you've had hot dinners, and it's got a knack for predicting what you're going to say next. It's like a fortune teller for words! This wizard is what we call a Large Language Model. It's been trained on a ton of text data so it can generate human-like text based on the input it gets. It's like having a chat with Shakespeare, Dickens, and every author under the sun, all rolled into one! But remember, it's not perfect. Sometimes it might spit out a sentence that's as useful as a chocolate teapot. But hey, that's all part of the fun!
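
The message classes play the same part as the `'role'` keys in the raw OpenAI call at the start of the notebook. Here is a rough sketch of that correspondence using hypothetical stand-in classes (`ToySystemMessage`, `ToyHumanMessage`) and an assumed mapping table. LangChain's actual conversion lives inside the library and is more involved.

```python
from dataclasses import dataclass

@dataclass
class ToySystemMessage:      # hypothetical stand-in for SystemMessage
    content: str

@dataclass
class ToyHumanMessage:       # hypothetical stand-in for HumanMessage
    content: str

# Assumed mapping from message class to the role names used in the raw OpenAI call.
ROLE_BY_CLASS = {ToySystemMessage: 'system', ToyHumanMessage: 'user'}

def to_role_dicts(messages):
    """Convert message objects into role/content dictionaries."""
    return [{'role': ROLE_BY_CLASS[type(m)], 'content': m.content} for m in messages]

messages = [
    ToySystemMessage('You are Jimbo, a chatbot that comically answers the question.'),
    ToyHumanMessage('Could you explain briefly what a Large Language Model is?'),
]
for entry in to_role_dicts(messages):
    print(entry)
```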


In [15]:
# let's make the output more readable
from langchain_core.output_parsers import StrOutputParser

In [16]:
# we'll create an output parser to display the answer
str_output_parser = StrOutputParser()

In [17]:
# by invoking the output parser we get a plain string back
response_parsed = str_output_parser.invoke(response)
response_parsed

"Oh, absolutely! Picture a giant word wizard, right? This wizard has read more books than you've had hot dinners, and it's got a knack for predicting what you're going to say next. It's like a fortune teller for words! This wizard is what we call a Large Language Model. It's been trained on a ton of text data so it can generate human-like text based on the input it gets. It's like having a chat with Shakespeare, Dickens, and every author under the sun, all rolled into one! But remember, it's not perfect. Sometimes it might spit out a sentence that's as useful as a chocolate teapot. But hey, that's all part of the fun!"

In [18]:
# let's look at a more interesting output parser; we'll parse a list of things
from langchain_core.output_parsers import CommaSeparatedListOutputParser

In [19]:
# the output parser has format instructions for the chatbot to follow
CommaSeparatedListOutputParser().get_format_instructions()

'Your response should be a list of comma separated values, eg: `foo, bar, baz` or `foo,bar,baz`'

In [20]:
# get the parser to use
list_output_parser = CommaSeparatedListOutputParser()

In [21]:
# create a human message to get a list
list_message_h = HumanMessage(content = ''' Could you list the types of Large Language Models? ''')
# send the message to the chatbot; we skip the system message for simplicity
response = chat.invoke([list_message_h])

In [22]:
print(response.content)
# we invoke the output parser to turn the reply into a Python list
# it splits at every comma, so this numbered reply gets broken up mid-sentence
list_output_parser.invoke(response)

1. GPT-3 (Generative Pretrained Transformer 3): Developed by OpenAI, it's currently one of the largest language models with 175 billion machine learning parameters.

2. BERT (Bidirectional Encoder Representations from Transformers): Developed by Google, it's designed to understand the context of words in a sentence by looking at them in both directions.

3. T5 (Text-to-Text Transfer Transformer): Also developed by Google, it's designed to handle a variety of NLP tasks by converting them into a text-to-text problem.

4. RoBERTa (Robustly Optimized BERT Pretraining Approach): It's a version of BERT that was optimized by Facebook to have even better performance.

5. XLNet: Developed by Google Brain and Carnegie Mellon University, it's designed to overcome some of the limitations of BERT.

6. Megatron: Developed by NVIDIA, it's designed to train large language models more efficiently.

7. GPT-2


['1. GPT-3 (Generative Pretrained Transformer 3): Developed by OpenAI',
 "it's currently one of the largest language models with 175 billion machine learning parameters.",
 '2. BERT (Bidirectional Encoder Representations from Transformers): Developed by Google',
 "it's designed to understand the context of words in a sentence by looking at them in both directions.",
 '3. T5 (Text-to-Text Transfer Transformer): Also developed by Google',
 "it's designed to handle a variety of NLP tasks by converting them into a text-to-text problem.",
 "4. RoBERTa (Robustly Optimized BERT Pretraining Approach): It's a version of BERT that was optimized by Facebook to have even better performance.",
 '5. XLNet: Developed by Google Brain and Carnegie Mellon University',
 "it's designed to overcome some of the limitations of BERT.",
 '6. Megatron: Developed by NVIDIA',
 "it's designed to train large language models more efficiently.",
 '7. GPT-2']
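
The odd list above comes from the reply not being in comma-separated form: every comma inside a sentence becomes a split point. Here is a simplified sketch of that behaviour. `toy_comma_parser` is a hypothetical stand-in, not the library's implementation, and the sample reply is abbreviated from the output above.

```python
def toy_comma_parser(text):
    """Split text into items at every comma, line by line, dropping empty pieces."""
    items = []
    for line in text.splitlines():
        for piece in line.split(','):
            piece = piece.strip()
            if piece:
                items.append(piece)
    return items

# Abbreviated version of the numbered reply shown above.
numbered_reply = (
    "1. GPT-3 (Generative Pretrained Transformer 3): Developed by OpenAI, "
    "it's currently one of the largest language models.\n"
    "\n"
    "7. GPT-2"
)
print(toy_comma_parser(numbered_reply))

# A reply that follows the parser's format instructions splits cleanly.
print(toy_comma_parser("GPT-3, BERT, T5"))
```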

In [23]:
# Instead of passing new strings to the chatbot we can create a template
from langchain_core.prompts import ChatPromptTemplate

In [24]:
# get the output format instructions to add to the template
list_instructions = list_output_parser.get_format_instructions()
# create a chat template with a standard question whose subject is a parameter
chat_template = ChatPromptTemplate.from_messages([
    ('human', "Could you list the types of {subject}? \n" + list_instructions)])

In [25]:
# let's look at the template
print(chat_template.messages[0].prompt.template)

Could you list the types of {subject}? 
Your response should be a list of comma separated values, eg: `foo, bar, baz` or `foo,bar,baz`
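
The `{subject}` placeholder in the template above behaves much like a `str.format` field. Here is a plain-Python sketch of the substitution, using stand-in strings copied from the cells above. The second subject, 'output parsers', is a made-up value for illustration.

```python
# Stand-in strings copied from the cells above.
list_instructions = (
    "Your response should be a list of comma separated values, "
    "eg: `foo, bar, baz` or `foo,bar,baz`"
)
template_text = "Could you list the types of {subject}? \n" + list_instructions

# Fill the placeholder by hand: same text, different subject each time.
for subject in ['Large Language Models', 'output parsers']:
    print(template_text.format(subject=subject))
    print('---')
```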


In [26]:
# invoke the template to fill in the parameters
chat_template_result = chat_template.invoke({'subject':'Large Language Models'})

In [27]:
# call the chatbot using the template
response = chat.invoke(chat_template_result)

In [28]:
# invoke the parser to format the output
list_output_parser.invoke(response)

['GPT-3',
 'BERT',
 'RoBERTa',
 'T5',
 'XLNet',
 'Transformer-XL',
 'Megatron',
 'GPT-2',
 'ALBERT',
 'ELECTRA',
 "OpenAI's CLIP",
 'DALL-E',
 'Turing-NLG',
 "Microsoft's Turing",
 "Google's Meena",
 "Facebook's Blender",
 'DeBERTa']

In [29]:
# now let's introduce the concept of a chain, which is where LangChain gets its name
# a chain is a way of stringing together several invoke calls into one call
# in this case the chain's input is passed to the chat_template invoke call
# the output from chat_template invoke becomes the input for chat invoke
# the output from chat becomes the input for list_output_parser invoke
# finally, the output from list_output_parser is returned from the chain
chain = chat_template | chat | list_output_parser

In [30]:
# invoke the chain with the input for the chat_template; the output will be the formatted list produced by the list_output_parser
chain.invoke({'subject':'Large Language Models'})

['GPT-3',
 'BERT',
 'RoBERTa',
 'T5',
 'XLNet',
 'Transformer-XL',
 'GPT-2',
 'ALBERT',
 'ELECTRA',
 'DistilBERT',
 'OpenAI Codex',
 'CTRL',
 'Megatron-LM',
 'DeBERTa',
 'Longformer',
 'BigBird',
 'ERNIE',
 'Turing-NLG',
 'GPT-Neo',
 'GPT-J',
 'LayoutLM']
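
To show how the `|` operator can string steps together, here is a toy sketch built on Python's `__or__` method. The `Step` class and the three canned steps are hypothetical stand-ins for the template, the chatbot, and the parser. They make no API call and are not LangChain's own classes.

```python
class Step:
    """Toy runnable: wraps a function and supports `|` composition."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new step that runs a first, then feeds its result to b
        return Step(lambda value: other.invoke(self.invoke(value)))

# Hypothetical stand-ins for the template, the model, and the parser.
toy_template = Step(lambda params: f"Could you list the types of {params['subject']}?")
toy_chat = Step(lambda prompt: "GPT-3, BERT, T5")      # canned reply, no API call
toy_parser = Step(lambda reply: [item.strip() for item in reply.split(',')])

toy_chain = toy_template | toy_chat | toy_parser
print(toy_chain.invoke({'subject': 'Large Language Models'}))
```

Calling `toy_chain.invoke(...)` gives the same result as nesting the three `invoke` calls by hand, which is the point of the pipe syntax.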

In [31]:
# let's add a chat history to our chatbot, so we can use the context of previous questions to answer the next question
# (ConversationSummaryMemory is imported here but not used in the cells below; the history is filled in by hand)
from langchain.memory import ConversationSummaryMemory
# use PromptTemplate for the template
from langchain_core.prompts import PromptTemplate

In [32]:
# create a new template with a message_log placeholder that carries the chat history to the chatbot
TEMPLATE = '''
The following is a friendly conversation between a human and an AI. 
The AI is talkative and provides lots of specific details from its context. 
If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
{message_log}

Human: 
{question}

AI:
'''

prompt_template = PromptTemplate.from_template(template=TEMPLATE)

In [33]:
# Create the chain to populate the template -> call the chatbot -> parse the output
chain = prompt_template | chat | StrOutputParser()

In [34]:
# invoke the chain for the first message, with an empty message_log
chain.invoke({'message_log':"",
              'question':"Could you explain briefly what a Large Language Model is?"})

"Sure, a Large Language Model is a type of artificial intelligence model that is trained on a vast amount of text data. It's designed to generate human-like text based on the input it's given. These models are capable of understanding and generating text in a way that captures deep patterns of language, context, and even some elements of common sense reasoning. \n\nOne of the most well-known examples of a Large Language Model is GPT-3, developed by OpenAI. It has 175 billion machine learning parameters and was trained on hundreds of gigabytes of text. These models can be used for a variety of tasks, such as translation, question answering, and even writing articles or poetry. However, they are not perfect and can sometimes produce incorrect or nonsensical responses."

In [35]:
# invoke the chain again, using the previous question and answer in the message_log
# Normally you would use LangChain's BaseChatMessageHistory or a similar class to manage this,
# but that is beyond what I am trying to demonstrate here, so I am filling in the history by hand
chain.invoke({'message_log':"The AI provides a brief explanation of Large Language Models, Sure, a Large Language Model is a type of artificial intelligence model that is trained on a vast amount of text data. It's designed to generate human-like text based on the input it's given. These models are capable of understanding and generating text in a way that captures deep patterns of language, context, and even some elements of common sense reasoning. \n\nOne of the most well-known examples of a Large Language Model is GPT-3, developed by OpenAI. It has 175 billion machine learning parameters and was trained on hundreds of gigabytes of text. These models can be used for a variety of tasks, such as translation, question answering, and even writing articles or poetry. However, they are not perfect and can sometimes produce incorrect or nonsensical responses.",
              'question':"Could you list some types?"})
# As we can see, we got a result based on the context of the first question.

'Absolutely, there are several types of Large Language Models, and they are typically categorized based on their architecture. Here are a few examples:\n\n1. Transformer Models: These are currently the most popular type of large language models. They use a mechanism called attention to weigh the importance of different words in a sentence. GPT-3, developed by OpenAI, is an example of a transformer model.\n\n2. Recurrent Neural Networks (RNNs): These models are designed to recognize patterns in sequences of data, such as text. They can remember information from earlier in the sequence and use it to inform predictions later on. An example of this is the LSTM (Long Short-Term Memory) model.\n\n3. Convolutional Neural Networks (CNNs): While these models are most commonly associated with image processing, they can also be used for text. They work by applying filters to data to identify local patterns.\n\n4. BERT (Bidirectional Encoder Representations from Transformers): This is a transforme
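
Filling in the history by hand, as in the call above, can be wrapped in a small helper that records each question and answer and rebuilds the log before every call. The sketch below is one way to do it under stated assumptions. `ToyConversation`, the shortened `PROMPT`, and the canned `toy_model` are hypothetical stand-ins, not LangChain classes, and no API call is made.

```python
# Shortened version of the notebook's template, keeping the same two placeholders.
PROMPT = "Current conversation:\n{message_log}\n\nHuman:\n{question}\n\nAI:\n"

class ToyConversation:
    """Keeps (question, answer) pairs and feeds them back as the message log."""

    def __init__(self, model):
        self.model = model
        self.turns = []

    def message_log(self):
        return "\n".join(f"Human: {q}\nAI: {a}" for q, a in self.turns)

    def ask(self, question):
        prompt = PROMPT.format(message_log=self.message_log(), question=question)
        answer = self.model(prompt)
        self.turns.append((question, answer))
        return answer

def toy_model(prompt):
    # Canned stand-in: it can only "answer" if the topic appears somewhere in the prompt.
    if "Large Language Model" in prompt:
        return "Answering about Large Language Models."
    return "Could you please specify the category?"

conversation = ToyConversation(toy_model)
print(conversation.ask("Could you explain briefly what a Large Language Model is?"))
print(conversation.ask("Could you list some types?"))              # history supplies the topic
print(ToyConversation(toy_model).ask("Could you list some types?"))  # fresh conversation, no history
```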

In [36]:
# if we call it with an empty message_log, the chatbot has no context and does not know what the question refers to
chain.invoke({'message_log':"",
              'question':"Could you list some types?"})

"Of course, I'd be happy to list some types, but could you please specify the category? For example, are you interested in types of animals, types of cars, types of music, or something else?"