LLM stands for "Large Language Model." Large Language Models are AI systems designed to understand, generate, and process human language. They are typically built with deep learning, most commonly the Transformer architecture, which lets them learn from large amounts of text and capture complex patterns in language.

### Key Characteristics of LLMs

1. **Size and Scale**:
   - LLMs are characterized by their large number of parameters, often numbering in the billions or even trillions. This allows them to store a vast amount of information and capture intricate patterns in data.

2. **Training Data**:
   - These models are trained on massive datasets comprising text from a variety of sources, such as books, articles, websites, and other digital content. The extensive training data helps them learn the nuances of language, including grammar, context, and semantic meaning.

3. **Architecture**:
   - Most LLMs are based on the Transformer architecture, which uses self-attention mechanisms to process input data efficiently. This architecture enables the model to handle long-range dependencies in text, making it effective for tasks that require understanding context across many words or sentences.

4. **Pretraining and Fine-Tuning**:
   - LLMs undergo a two-step training process:
     - **Pretraining**: The model is trained on a large corpus of unlabeled text in a self-supervised manner (for example, by predicting the next token or a masked token), learning general language patterns and representations.
     - **Fine-Tuning**: The pretrained model is then fine-tuned on specific datasets for particular tasks (e.g., sentiment analysis, translation, question answering) to improve performance on those tasks.

5. **Capabilities**:
   - LLMs can perform a wide range of natural language processing (NLP) tasks, including text generation, translation, summarization, question answering, and more. Their versatility and performance on various tasks make them valuable tools in many applications.
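The self-attention mechanism mentioned under **Architecture** can be sketched in a few lines of plain Python. This is a minimal illustration of scaled dot-product attention for a single head, without the learned projection matrices a real Transformer uses; the function names `softmax` and `self_attention` are chosen here for illustration only.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Three toy token vectors attending to each other
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in self_attention(tokens, tokens, tokens):
    print([round(x, 3) for x in row])
```

Because every position computes weights over every other position, distant tokens can influence each other directly, which is what lets the architecture capture long-range dependencies.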

### Examples of LLMs

1. **GPT (Generative Pre-trained Transformer)**:
   - Developed by OpenAI, GPT models are some of the most well-known LLMs. Versions include GPT-2, GPT-3, and GPT-4. These models are capable of generating coherent and contextually relevant text based on the input they receive.

2. **BERT (Bidirectional Encoder Representations from Transformers)**:
   - Developed by Google, BERT is designed for understanding the context of words in a sentence in a bidirectional manner. It's particularly effective for tasks like question answering and sentence classification.

3. **T5 (Text-To-Text Transfer Transformer)**:
   - Also developed by Google, T5 treats every NLP problem as a text-to-text problem, where both the input and output are text strings. This unified framework allows T5 to handle a variety of NLP tasks.

4. **RoBERTa (Robustly Optimized BERT Approach)**:
   - An improved version of BERT by Facebook AI, RoBERTa modifies the training approach to enhance performance, making it one of the leading models for many NLP benchmarks.
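To make the text-to-text idea behind T5 concrete, the sketch below shows how different tasks can be cast as plain input strings by prepending a task prefix. The two prefixes follow the style used with the public T5 checkpoints; the helper `to_text_to_text` and the task keys are hypothetical names introduced for this example.

```python
# Task prefixes in the style used with T5; the dictionary keys are illustrative.
TASK_PREFIXES = {
    "translate_en_de": "translate English to German: ",
    "summarize": "summarize: ",
}

def to_text_to_text(task, text):
    """Format a task as a single input string for a text-to-text model."""
    if task not in TASK_PREFIXES:
        raise KeyError(f"unknown task: {task}")
    return TASK_PREFIXES[task] + text

print(to_text_to_text("translate_en_de", "Hello, how are you?"))
print(to_text_to_text("summarize", "Transformers process sequential data using attention."))
```

The model then produces the answer as text as well, so one architecture and one training objective can serve every task.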

### Applications of LLMs

1. **Content Creation**:
   - LLMs can generate high-quality text for articles, stories, marketing copy, and more, reducing the workload for human writers.

2. **Customer Support**:
   - Chatbots and virtual assistants powered by LLMs can handle customer inquiries, providing quick and accurate responses.

3. **Translation Services**:
   - LLMs improve the accuracy and fluency of machine translation systems, bridging language barriers more effectively.

4. **Research and Data Analysis**:
   - Researchers use LLMs to extract insights from large text datasets, summarize research papers, and generate hypotheses.

5. **Personal Assistants**:
   - LLMs can power virtual personal assistants, in the style of Google Assistant, Siri, and Alexa, that understand and respond to user commands in natural language.

### Conclusion

Large Language Models represent a significant advancement in AI and NLP, providing powerful tools for understanding and generating human language. Their ability to handle complex tasks and process vast amounts of text data makes them invaluable in various industries and applications.

In [1]:
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
import torch

# Load the pre-trained model and tokenizer
model_name = "distilbert-base-uncased-distilled-squad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)


In [2]:

def answer_question_with_sliding_window(question, context, max_length=512, stride=256):
    # Let the tokenizer build overlapping windows over the context. Each window
    # is a full (question, context chunk) pair with its own special tokens.
    # Note: in the tokenizer API, `stride` is the number of tokens that
    # consecutive windows share, not the step between window starts.
    inputs = tokenizer(
        question,
        context,
        max_length=max_length,
        stride=stride,
        truncation="only_second",        # truncate the context, never the question
        return_overflowing_tokens=True,  # emit one row per window
        padding="max_length",            # equal-length rows so they stack into a tensor
        return_tensors="pt",
    )
    # Bookkeeping returned alongside the windows; it is not a model input
    inputs.pop("overflow_to_sample_mapping", None)

    # Run all windows through the model as one batch
    with torch.no_grad():
        outputs = model(**inputs)

    # Extract the most likely span from each window
    all_answers = []
    for i, window_ids in enumerate(inputs["input_ids"]):
        answer_start = int(torch.argmax(outputs.start_logits[i]))
        answer_end = int(torch.argmax(outputs.end_logits[i])) + 1
        answer = tokenizer.decode(window_ids[answer_start:answer_end], skip_special_tokens=True)
        if answer:  # skip windows where the predicted span is empty
            all_answers.append(answer)

    # Combine answers from all windows
    combined_answer = ' '.join(all_answers)
    return combined_answer
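
The windowing idea used in the function above can be shown on its own, without a model. The following is a minimal sketch on a plain list of token ids; `sliding_windows`, `window_size`, and `step` are names chosen for this illustration, where `step` is how far each window advances and `window_size - step` is the overlap between neighbours.

```python
def sliding_windows(tokens, window_size, step):
    """Split a token list into overlapping windows that cover every token."""
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + window_size])
        # Stop once the current window reaches the end of the sequence
        if start + window_size >= len(tokens):
            break
        start += step
    return windows

for window in sliding_windows(list(range(10)), window_size=4, step=2):
    print(window)
```

Choosing `step` smaller than `window_size` means an answer that straddles a window boundary still appears whole in at least one window, provided it is no longer than the overlap.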

In [3]:
# Define the context and question
contexts = """
An injured woman is shown emerging from a cave underneath a huge mountain, carrying an infant. She kills two soldiers pursuing her and attempts to cross a raging river, but slips and starts getting washed away by the current. Facing imminent death, she holds the baby aloft and prays to Lord Shiva, explaining that while she doesn't care about her life, the baby, Mahendra Baahubali, must live, for the sake of Mahishmati kingdom, and the waiting baby's mother. She offers her own life as compensation for her sins. The child is saved by the people of the local Amburi tribe, who reside near the river and worship Lord Shiva. The wife of the tribe's chieftain, Sanga, decides to adopt the boy and names him Siva.

Siva grows up to be an ambitious and mischievous child, obsessed with ascending the mountain. Despite Sanga's pleas, he tries many times to scale the cliffs but always fails. As a young man, he is shown to possess superhuman strength when he lifts a Lingam of Lord Siva and places it at the foot of the mountain. A mask then falls from the cliffs, and realizing it possesses feminine features, Siva becomes obsessed with finding the owner of the mask. He finally succeeds in scaling the mountain. Upon reaching the top, he sees a dashing warrior girl named Avantika fighting soldiers. He discovers that she is a member of a local resistance group dedicated to overthrowing the tyrannical king Bhallaladeva of the nearby kingdom Mahishmati, and rescuing their captive Princess Devasena. Siva is immediately smitten with Avantika and secretly follows her. When she discovers Siva, she attacks him, but he outmaneuvers her and returns her mask, progressively disrobing her with each step. Realizing he scaled the entire mountain to find her, she reciprocates his feelings and they make love.

After discovering Avantika's cause, Siva pledges to rescue Devasena himself and departs for Mahishmati. He infiltrates the royal palace disguised as a soldier and distracts Bhallaladeva and his guards long enough for him to rescue Devasena. Bhallaladeva sends his son, Bhadra, and the royal family's loyal slave general Kattappa to recapture Devasena. In the ensuing fight, Siva beheads Bhadra as both the Amburi tribe and resistance warriors arrive. Kattappa lunges at Siva, but stops short of attacking him upon seeing his face. He falls into submission at Siva's feet, proclaiming him to be "Baahubali".

The next morning, Kattappa reveals to Siva that he is actually Mahendra Baahubali, the son of Amarendra Baahubali, a famous invincible warrior prince from Mahishmati, and that the woman who had sacrificed herself to save him was Queen Sivagami, his grandmother. Amarendra Baahubali was born an orphan; his father, King Vikramadeva, died before he was born and his mother died giving birth to him. Bijjaladeva, Vikramadeva's brother and the next in line for the throne, is denied the position due to his scheming nature, and as such Bijjaladeva's wife, Queen Sivagami, assumes power as a caretaker with the intention of raising both her son Bhallaladeva and the orphaned Baahubali in an equal manner to select the next heir to the throne. While the two are raised as brothers and trained rigorously in numerous subjects, Bhallaladeva retains the power-hungry nature of his father while the affable Baahubali becomes beloved by the kingdom.

A traitor named Saketa turns out to be a spy for the savage Kaalakeya tribe, known as destroyers of kingdoms. Their chief Inkoshi declares war on Mahishmati, and Lord Bijjaladeva decides that whoever kills Inkoshi will be crowned king. However, after Inkoshi personally insults Sivagami on the battlefield, she demands that he be brought to her alive. Bijjaladeva cunningly arranges better and sophisticated artillery weapons and men for Bhallaladeva, but Baahubali still defeats more raiders than Bhallaladeva by using innovative tactics and by inspiring his soldiers. With Mahishmati soldiers flagging in the battle, Amarendra inspires them to not give up in the face of death while also rescuing villagers captured by the Kaalakeyas at a risk. Bhallaladeva, meanwhile, kills Kaalakeyas and villagers indiscriminately, killing Inkoshi as well. Baahubali's valour and concern for the people of his kingdom convinces Sivagami to make Baahubali the heir apparent while Bhallaladeva is made the commander-in-chief for his sheer prowess.

In the present day, Siva's adoptive parents, impressed by Kattappa's story, wish to meet Baahubali. A dejected Kattappa explains that Amarendra Baahubali is dead, stabbed behind his back by a traitor, and upon being questioned by Siva, he reveals that the traitor was none other than him.
"""

question = "who is Siva?"

# Get the answer
answer = answer_question_with_sliding_window(question, contexts)
print(f"Question: {question}")
print(f"Answer: {answer}")


In [4]:
def answer_question(question, context):
    inputs = tokenizer.encode_plus(question, context, return_tensors="pt")
    input_ids = inputs["input_ids"].tolist()[0]

    # Get model outputs
    outputs = model(**inputs)
    answer_start_scores = outputs.start_logits
    answer_end_scores = outputs.end_logits

    # Find the tokens with the highest `start` and `end` scores
    answer_start = torch.argmax(answer_start_scores)
    answer_end = torch.argmax(answer_end_scores) + 1

    # Convert tokens to string
    answer = tokenizer.convert_tokens_to_string(tokenizer.convert_ids_to_tokens(input_ids[answer_start:answer_end]))

    return answer


In [5]:
# Define the context and question
context = """
Transformers are a type of artificial neural network designed to process sequential data, such as natural language. 
They use a mechanism called attention to weigh the influence of different parts of the input data, allowing them to 
handle long-range dependencies more effectively than traditional RNNs. Large Language Models (LLMs) like GPT-4 are 
built using transformer architecture and are pre-trained on vast amounts of text data to perform a variety of natural 
language processing tasks.
"""

question = "What are transformers used for?"

# Get the answer
answer = answer_question(question, context)
print(f"Question: {question}")
print(f"Answer: {answer}")


Question: What are transformers used for?
Answer: a type of artificial neural network designed to process sequential data


In [1]:
from transformers import MarianMTModel, MarianTokenizer

# Load the pre-trained MarianMT model and tokenizer
model_name = 'Helsinki-NLP/opus-mt-en-de'
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

def translate(texts, src_lang="en", tgt_lang="de"):
    # Note: src_lang and tgt_lang are informational only. The language pair is
    # fixed by the checkpoint loaded above (opus-mt-en-de).

    # Tokenize the input texts as one padded batch
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

    # Perform translation and decode the output
    translated = model.generate(**inputs)
    translated_texts = [tokenizer.decode(t, skip_special_tokens=True) for t in translated]

    return translated_texts

# Example texts for translation
texts = [
    "Hello, how are you?",
    "Transformers are a type of artificial neural network designed to process sequential data."
]

# Translate from English to German
translated_texts = translate(texts, src_lang="en", tgt_lang="de")
for original, translated in zip(texts, translated_texts):
    print(f"Original: {original}\nTranslated: {translated}\n")


RuntimeError: Failed to import transformers.models.marian.modeling_marian because of the following error (look up to see its traceback):
cannot import name 'TypeAlias' from 'typing_extensions' (C:\Users\suman\.conda\envs\practice\lib\site-packages\typing_extensions.py)
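
The error above indicates that the installed `typing_extensions` package does not export `TypeAlias`, which usually means the installed copy is older than the one the `transformers` import expects. The following is a small diagnostic sketch, assuming nothing beyond the standard library; `exports_name` is a hypothetical helper written for this check.

```python
import importlib

def exports_name(module_name, attr):
    """Return True if the module can be imported and exposes the given name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)

print(exports_name("typing_extensions", "TypeAlias"))
```

If this prints `False` in the affected environment, upgrading the package (for example with `pip install --upgrade typing_extensions`) and restarting the kernel is the usual remedy, though the exact fix depends on how the environment was set up.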