<a href="https://colab.research.google.com/github/ERIC10000/introduction-to-large-language-models/blob/main/Large_Language_Model_Introduction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# History of Large Language Models (LLMs)
In recent years, large language models (LLMs) have become central tools in natural language processing (NLP). Built on deep neural networks trained on large text corpora, they have changed how we work with and interpret textual data. This guide looks at how LLMs work, what they can do, and where they can be applied, with a focus on Python development.

# Understanding Large Language Models
At the core of LLMs lies the concept of deep learning, a subfield of machine learning that focuses on
training neural networks with multiple layers to extract complex patterns from data. In the context of
NLP, LLMs are neural network architectures designed to understand and generate human-like text.
One of the most prominent LLM families is OpenAI's GPT (Generative Pre-trained Transformer)
series, whose third iteration is GPT-3. The code later in this guide loads GPT-2, the openly released
second iteration, from Python.


# Applications of Large Language Models
1.  <b>Content Generation:</b> LLMs excel at generating human-like text across many genres, including storytelling, news articles, technical documentation, and reports.

2. <b>Language Translation:</b> LLMs can also facilitate language translation tasks by converting text from one
language to another.

3. <b>Sentiment Analysis:</b> LLMs can analyze the sentiment of textual data, helping businesses gain insights
into customer feedback and social media interactions. With sentiment analysis, a model classifies a text as positive, negative, or neutral based on its context.
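
As a rough illustration, the three applications above can be mapped to task names accepted by the `pipeline` helper in the Hugging Face `transformers` library. The `TASKS` dictionary and the `build_pipeline` function are hypothetical names introduced for this sketch, and English-to-French is an arbitrary choice of translation direction.

```python
# Task names understood by transformers.pipeline for the applications listed above.
TASKS = {
    "content generation": "text-generation",
    "language translation": "translation_en_to_fr",  # assumed direction: English to French
    "sentiment analysis": "sentiment-analysis",
}


def build_pipeline(application):
    """Return a transformers pipeline for the given application.

    The library is imported lazily, and a default model for the task
    is downloaded the first time the pipeline is created.
    """
    from transformers import pipeline

    return pipeline(TASKS[application])
```

Calling `build_pipeline("sentiment analysis")("I love this notebook!")` should return a list of dictionaries holding a label and a score. The exact labels depend on the default model the library selects.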


# Architecture of Large Language Models
At the heart of LLMs lies a complex neural network architecture, typically based on the transformer.
Transformer models stack multiple layers of attention mechanisms, which let them capture long-range dependencies within the input text. One of the most prominent examples of an LLM architecture is
OpenAI's GPT (Generative Pre-trained Transformer) series.
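
To make the attention mechanism more concrete, here is a minimal pure-Python sketch of scaled dot-product attention, the operation at the core of a transformer layer. It is a simplification: real models apply learned projections to obtain the queries, keys, and values, use several attention heads, and (in GPT) mask out future tokens. The input vectors are made-up numbers.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Three hypothetical 2-dimensional token vectors.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(x, x, x)  # self-attention: queries = keys = values
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how strongly one token attends to every token in the sequence. Because every token can attend to every other token regardless of distance, dependencies between far-apart words can be captured in a single layer.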

In [11]:
# Install the LLM libraries
!pip install transformers # Loads pre-trained models, such as Generative Pre-trained Transformers (GPT), from Hugging Face
!pip install tokenizers # Breaks user input into tokens that the model can process
!pip install torch torchvision torchaudio # PyTorch, the deep learning framework originally developed by Facebook
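
After installation, one way to confirm that the packages are available is to query their installed versions with the standard library. This is an optional sketch, and `installed_versions` is a helper name introduced here.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print(installed_versions(["transformers", "tokenizers", "torch"]))
```

A value of `None` for any package means the corresponding `pip install` did not succeed in the current environment.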




## Steps of Building Large Language Models
1. Requesting the User Prompt
We use a technique called <b>Prompt Engineering</b>, which helps users write prompts that a Large Language Model can interpret well. Research on Prompt Engineering Techniques...

In [10]:
prompt = input("Ask Anything: ")  # input() already returns a string

Ask Anything: What is Artificial Intelligence?
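
As a small illustration of prompt engineering, a raw question can be wrapped in a template that adds a role, an instruction, and a cue for the answer. The `build_prompt` function and its wording are assumptions made for this sketch, not a prescribed format. A base model such as GPT-2 is not instruction-tuned and may not follow the instruction.

```python
def build_prompt(question, role="a helpful tutor", max_sentences=3):
    """Wrap a raw question with a role, an instruction, and an answer cue."""
    return (
        f"You are {role}.\n"
        f"Answer the question in at most {max_sentences} sentences.\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


print(build_prompt("What is Artificial Intelligence?"))
```

The returned string can be used in place of the raw user input in the steps that follow.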


2. Load a pre-trained model from the Hugging Face AI community using the transformers library. Which model to load depends on the task, e.g. text-to-speech, sentiment analysis, code completion, text summarization, or text generation.

In [12]:
from transformers import GPT2Tokenizer, GPT2LMHeadModel
from warnings import filterwarnings
filterwarnings('ignore')

# Load pre-trained GPT-2 model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

3. Tokenization of the Prompt. Tokenization is the NLP process of breaking text into smaller units called tokens that an LLM application can work with. This step is sometimes called <b>Encoding</b> because it converts the text into numerical token IDs, returned here as a tensor.

In [13]:
tokens = tokenizer.encode(prompt, return_tensors='pt')
tokens
# The tokenizer converts the user prompt into numerical token IDs, returned as a PyTorch tensor.
# Models operate on numbers rather than raw text.

tensor([[ 2061,   318, 35941,  9345,    30]])
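
The encode and decode round trip can be illustrated with a toy word-level tokenizer. This is only a sketch of the idea: GPT-2 actually uses a byte-level subword (BPE) vocabulary, so its token IDs differ from the ones below. All function names here are introduced for illustration.

```python
def build_vocab(corpus):
    """Assign an integer ID to every distinct whitespace-separated word."""
    vocab = {}
    for word in corpus.split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def encode(text, vocab):
    """Convert text into a list of token IDs."""
    return [vocab[word] for word in text.split()]


def decode(ids, vocab):
    """Convert a list of token IDs back into text."""
    id_to_word = {i: w for w, i in vocab.items()}
    return " ".join(id_to_word[i] for i in ids)


vocab = build_vocab("what is artificial intelligence ?")
ids = encode("what is intelligence ?", vocab)
print(ids)                 # [0, 1, 3, 4]
print(decode(ids, vocab))  # what is intelligence ?
```

A word-level vocabulary fails on any word it has not seen, which is one reason production tokenizers split text into subword pieces instead.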

4. Text Generation (Decoding): In this step, the pre-trained model generates output tokens that continue the user prompt.

In [14]:
output = model.generate(tokens, max_length=100, num_beams=5, no_repeat_ngram_size=2, early_stopping=True)


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
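
The `num_beams` argument selects beam search, which keeps several candidate sequences at each step instead of only the single most likely one. The sketch below shows the idea on a hypothetical table of next-token probabilities (made-up numbers). It does not model `no_repeat_ngram_size` or `early_stopping`, and it is not how the library implements generation internally.

```python
import math

# Hypothetical next-token probabilities keyed by the previous token.
NEXT = {
    "<s>": {"AI": 0.6, "The": 0.4},
    "AI": {"is": 0.55, "was": 0.45},
    "The": {"future": 0.9, "end": 0.1},
    "is": {"</s>": 1.0},
    "was": {"</s>": 1.0},
    "future": {"</s>": 1.0},
    "end": {"</s>": 1.0},
}


def beam_search(num_beams, max_length):
    """Return the highest-scoring token sequence found with the given beam width."""
    beams = [(0.0, ["<s>"])]  # (log-probability, tokens so far)
    for _ in range(max_length):
        candidates = []
        for score, tokens in beams:
            if tokens[-1] == "</s>":
                candidates.append((score, tokens))  # finished; carry over unchanged
                continue
            for token, p in NEXT[tokens[-1]].items():
                candidates.append((score + math.log(p), tokens + [token]))
        # Keep only the `num_beams` highest-scoring sequences.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:num_beams]
    return beams[0][1]


print(beam_search(num_beams=1, max_length=3))  # ['<s>', 'AI', 'is', '</s>']
print(beam_search(num_beams=2, max_length=3))  # ['<s>', 'The', 'future', '</s>']
```

With a single beam the search is greedy and commits to "AI" because it is the most likely first token. With two beams it also keeps "The" alive and finds a sequence with a higher overall probability (0.36 versus 0.33).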


In [15]:
# Decode the output.
# This converts the output tensor of numerical token IDs back into text.
text = tokenizer.decode(output[0],skip_special_tokens=True)
text

'What is Artificial Intelligence?\n\nA lot of people think of artificial intelligence as something that\'s going to happen in the next few years, but it\'s not. Artificial intelligence is not a new concept. It\'s been around for a long time, and there are many different ways to think about it. There are a number of different types of AI, some of which are called "deep learning" or "machine learning," and many of them are very different from what we\'re used to.\n'
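
The tokenize, generate, and decode steps can be combined into one helper. The sketch below also passes `attention_mask` and `pad_token_id` explicitly, which is what the warning printed during generation asks for. `generate_text` is a name introduced here. It expects the `model` and `tokenizer` objects loaded in step 2 and uses the same generation settings as above.

```python
def generate_text(model, tokenizer, prompt, max_length=100):
    """Generate a continuation of `prompt` and return it as text."""
    # Calling the tokenizer directly returns both input_ids and attention_mask.
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token; reuse end-of-sequence
        max_length=max_length,
        num_beams=5,
        no_repeat_ngram_size=2,
        early_stopping=True,
    )
    # Convert the generated token IDs back into text.
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

With the objects from the earlier cells, `generate_text(model, tokenizer, prompt)` runs the same pipeline in a single call.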