# **Quick Intro to Large Language Models**

## **Introduction**

In this lesson, we will explore how large language models learn token distributions and predict the next token, allowing them to generate human-like text that can both amaze and perplex us.  

We'll start with a quick introduction to the inner workings of GPT-3 and GPT-4, focusing on their few-shot learning capabilities, emergent abilities, and the scaling laws that drive their success. We will then dive into some easy-to-understand examples of how these models excel at tasks such as text summarization and translation when given only a few examples, with no fine-tuning required.  

But it's not all smooth sailing in the world of LLMs. We will also discuss some of the potential pitfalls, including hallucinations and biases, which can lead to inaccurate or misleading outputs. It's essential to be aware of these limitations when using LLMs in use cases where 100% accuracy is paramount. On the flip side, their creative process can be invaluable in tasks where imagination takes center stage.  

We will also touch upon the context size and maximum number of tokens that LLMs can handle, shedding light on the factors that define their performance.  

<hr>

## **LLMs in General**  

LLMs are deep learning models with billions of parameters that excel at a wide range of natural language processing tasks. They can perform tasks like translation, sentiment analysis, and chatbot conversations without being specifically trained for them. LLMs can be used without fine-tuning by employing "prompting" techniques, where a question is presented as a text prompt with examples of similar problems and solutions.
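As a rough illustration of prompting with examples, the sketch below assembles a few-shot prompt as plain text. The helper name `build_few_shot_prompt`, the `Input:`/`Output:` labels, and the translation pairs are assumptions made for this example, not a format required by any particular model.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt: instruction, worked examples, then the new query."""
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final "Output:" is left empty so the model continues from there.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Illustrative examples (assumption): two English-to-French pairs.
examples = [
    ("Good morning", "Bonjour"),
    ("Thank you", "Merci"),
]
prompt = build_few_shot_prompt(
    "Translate English to French.", examples, "See you tomorrow"
)
print(prompt)
```

The resulting string can be sent to a model as-is. The examples carry the task description, so no weights need to change.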

- **Architecture:**  

LLMs are typically built by stacking multiple neural network layers, including embedding layers, attention layers, and feedforward layers. These layers work together to process input text and generate output predictions.  

- **Future implications:**  

While LLMs have the potential to revolutionize various industries, it is important to be aware of their limitations and ethical implications. Businesses and workers should carefully consider the trade-offs and risks associated with using LLMs, and developers should continue refining these models to minimize biases and improve their usefulness in different applications. Throughout the course, we will address certain limitations and offer potential solutions to overcome them.  

<hr>

**Maximum number of tokens**  

In the LangChain library, the LLM context size, or the maximum number of tokens the model can process, is determined by the specific implementation of the LLM. In the case of the OpenAI implementation in LangChain, the maximum number of tokens is defined by the underlying OpenAI model being used. To find the maximum number of tokens for a given OpenAI model, refer to the ```max_tokens``` value listed for that model in the OpenAI [documentation](https://platform.openai.com/docs/models/gpt-4) or API. 

For example, if you’re using the GPT-3 model, the maximum number of tokens supported by the model is 2,049. The maximum differs between model versions and their variants (e.g., ```davinci```, ```curie```, ```babbage```, or ```ada```). Each version has its own limit, with later versions typically supporting a larger number of tokens.  

It is important to ensure that the input text does not exceed the maximum number of tokens supported by the model, as this may result in truncation or errors during processing. To handle this, you can split the input text into smaller chunks and process them separately, making sure that each chunk is within the allowed token limit. You can then combine the results as needed.
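One way to reason about the limit: the context window has to hold both the prompt and the generated completion, so the room left for input is the window size minus the tokens reserved for the answer. The sketch below makes that arithmetic explicit. `count_tokens_roughly` is a hypothetical whitespace-based stand-in for a real tokenizer (such as `tiktoken`), and the reserved completion size is an illustrative assumption.

```python
def count_tokens_roughly(text):
    # Stand-in tokenizer (assumption): one token per whitespace-separated word.
    # Real tokenizers usually produce more tokens than words.
    return len(text.split())


def fits_in_context(text, context_size, reserved_for_completion,
                    count_tokens=count_tokens_roughly):
    """Return True if the prompt plus the reserved completion fit in the window."""
    prompt_budget = context_size - reserved_for_completion
    return count_tokens(text) <= prompt_budget


context_size = 2049            # limit quoted above for the GPT-3 model
reserved_for_completion = 256  # illustrative room for the generated answer

short_text = "LLMs predict the next token."
long_text = "word " * 3000

print(fits_in_context(short_text, context_size, reserved_for_completion))
print(fits_in_context(long_text, context_size, reserved_for_completion))
```

Passing a different `count_tokens` callable lets you swap in an exact tokenizer without changing the budget logic.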

Here's an example of how you might handle text that exceeds the maximum token limit for a given LLM in LangChain. Note that the following code is a simplified sketch: the two helper functions are minimal placeholders, meant to convey the idea of handling texts longer than the maximum token limit rather than to serve as a finished solution.

In [None]:
from langchain.llms import OpenAI
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Before executing the following code, make sure to have
# your OpenAI key saved in the "OPENAI_API_KEY" environment variable.
# Initialize the LLM
llm = OpenAI(model_name="gpt-3.5-turbo-instruct")

# Define the input text
input_text = "your_long_input_text"

# Determine the maximum number of tokens from documentation
max_tokens = 4097

def split_text_into_chunks(text, max_tokens):
    # Placeholder implementation: token-based splitting (requires tiktoken).
    # Only half of the window is used per chunk (an arbitrary choice) so that
    # the model still has room to generate its completion.
    splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
        chunk_size=max_tokens // 2, chunk_overlap=0
    )
    return splitter.split_text(text)

def combine_results(results):
    # Placeholder implementation: join the per-chunk outputs with newlines.
    return "\n".join(results)

# Split the input text into chunks based on the max tokens
text_chunks = split_text_into_chunks(input_text, max_tokens)

# Process each chunk separately
results = []
for chunk in text_chunks:
    result = llm.invoke(chunk)
    results.append(result)

# Combine the results as needed
final_result = combine_results(results)

In this example, ```split_text_into_chunks``` and ```combine_results``` are custom functions, shown here only in placeholder form, that you would need to adapt to your specific requirements; we will cover them in later lessons. The key takeaway is to ensure that the input text does not exceed the maximum number of tokens supported by the model.

**Note** that splitting into multiple chunks can hurt the coherence of the text.
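A common way to soften this is to let consecutive chunks overlap, so each chunk carries a little of the preceding context. The sketch below shows the idea on whitespace-separated words, which is an assumption made to keep the example self-contained; a real pipeline would count model tokens instead. The function name and its parameters are hypothetical.

```python
def split_with_overlap(text, chunk_size, overlap):
    """Split text into chunks of at most chunk_size words, sharing `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


text = "one two three four five six seven eight nine ten"
for chunk in split_with_overlap(text, chunk_size=4, overlap=1):
    print(chunk)
```

Each chunk here starts with the last word of the previous one. Larger overlaps preserve more context at the cost of processing more tokens overall.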

<hr>

## **Token Distributions and Predicting the Next Token**

Large language models like GPT-3 and GPT-4 are pretrained on vast amounts of text data and learn to predict the next token in a sequence based on the context provided by the previous tokens. GPT-family models use causal language modeling, which predicts the next token while only having access to the tokens before it. This process enables LLMs to generate contextually relevant text.
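To make "learning a token distribution" concrete, here is a toy sketch: a bigram model that counts which word follows which in a tiny corpus and then predicts the most likely next word. It is a heavy simplification resting on several assumptions (whitespace words instead of subword tokens, one word of context instead of the whole prefix, counting instead of a neural network), but the prediction objective is the same.

```python
from collections import Counter, defaultdict

# Tiny illustrative corpus (assumption), tokenized on whitespace.
corpus = "the cat sat on the mat . the cat ate the fish ."
tokens = corpus.split()

# Count how often each token follows each preceding token.
following = defaultdict(Counter)
for prev, nxt in zip(tokens, tokens[1:]):
    following[prev][nxt] += 1


def next_token_distribution(prev):
    """Return P(next | prev) estimated from the bigram counts."""
    counts = following.get(prev, Counter())
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}


def predict_next(prev):
    """Greedy prediction: pick the most probable next token."""
    dist = next_token_distribution(prev)
    return max(dist, key=dist.get)


print(next_token_distribution("the"))
print(predict_next("the"))
```

In this corpus, "the" is followed by "cat" twice and by "mat" and "fish" once each, so the greedy prediction after "the" is "cat". A real LLM replaces the count table with a neural network conditioned on all previous tokens.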

The following code uses LangChain’s ```OpenAI``` class to load the ```gpt-3.5-turbo-instruct``` model and complete the prompt, which produces the answer. Before executing the following code, save your OpenAI key in the **OPENAI_API_KEY** environment variable. Also remember to install the required packages with the following command: ```pip install langchain==0.1.4 deeplake openai==1.10.0 tiktoken```

In [3]:

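# Alternative: keep the key in a .env file instead of exporting it manually.
# Create a .env file containing OPENAI_API_KEY="open_ai_key"
# and add .env to .gitignore so the key is never committed.
# Requires the python-dotenv package: pip install python-dotenv

from dotenv import load_dotenv
load_dotenv()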

In [None]:
# !pip install langchain==0.1.4 deeplake openai==1.10.0 tiktoken

In [1]:
from langchain.llms import OpenAI

llm = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0)

text = "What would be a good company name for a company that makes colorful socks?"

# invoke() replaces the deprecated direct call llm(text)
print(llm.invoke(text))
