### What are LLMs?

- LLMs (Large Language Models) are sophisticated deep learning models that excel at Natural Language Processing tasks.
- They learn complex patterns from vast amounts of training data and use them to generate human-like text.

Terms that go along with LLMs:
- Encoder-Decoder Model (Transformer Model)

<img src="https://miro.medium.com/v2/resize:fit:1400/0*376uJu_fc_uR8H3X.png" alt="encoder-decoder" width="600"/>


- Autoregressive Model

<img src="https://www.researchgate.net/publication/371123751/figure/fig1/AS:11431281162474767@1685329470503/Autoregressive-sampling-The-LLM-is-sampled-to-generate-a-single-token-continuation-of.png" alt="autoregressive" width="600">

- Pretrained/Fine Tuned Models

<img src="https://miro.medium.com/v2/resize:fit:1102/1*hb6tJWEeeiwnVmzxMOPlrQ.png" alt="pretrained" width="500">
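
To make the "autoregressive" idea concrete, here is a minimal sketch of the generation loop. The lookup table `NEXT_TOKEN` and the helper `predict_next` are hypothetical stand-ins for a real model, which would instead score every token in its vocabulary; only the loop structure is the point.

```python
# Toy next-token "model": a hypothetical lookup table, not a real LLM.
NEXT_TOKEN = {
    "<bos>": "large",
    "large": "language",
    "language": "models",
    "models": "generate",
    "generate": "text",
    "text": "<eos>",
}

def predict_next(tokens):
    """Return the next token for the sequence so far (this toy only looks at the last token)."""
    return NEXT_TOKEN.get(tokens[-1], "<eos>")

def generate(prompt, max_new_tokens=10):
    """Autoregressive loop: predict one token, append it, and feed the longer sequence back in."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        next_token = predict_next(tokens)
        if next_token == "<eos>":  # stop once the model signals end of sequence
            break
        tokens.append(next_token)
    return tokens

print(generate(["<bos>"]))
# ['<bos>', 'large', 'language', 'models', 'generate', 'text']
```

Each step conditions on everything generated so far, which is why generation cost grows with output length.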


### Text Summarizer

Facebook's BART:

BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and
an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary
noising function, and (2) learning a model to reconstruct the original text.
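
As a rough illustration of step (1), the sketch below corrupts a sentence by randomly replacing words with a mask symbol. This is plain token masking on whitespace-split words, chosen for simplicity; it is not BART's actual training code, and the name `mask_tokens`, the `<mask>` string, and the default probability are assumptions made for the example.

```python
import random

MASK = "<mask>"  # placeholder symbol, assumed for this sketch

def mask_tokens(text, mask_prob=0.3, seed=0):
    """Replace each whitespace-separated word with MASK with probability mask_prob."""
    rng = random.Random(seed)  # seeded so the corruption is reproducible
    tokens = text.split()
    return " ".join(MASK if rng.random() < mask_prob else tok for tok in tokens)

original = "BART learns to reconstruct the original text from a corrupted version"
corrupted = mask_tokens(original)
print(corrupted)
# A denoising model would be trained on (corrupted, original) pairs.
```

During pre-training, the corrupted string would be the encoder input and the original string the decoder target.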

BART is particularly effective when fine-tuned for text generation (e.g. summarization, translation)
but also works well for comprehension tasks (e.g. text classification, question answering). This
particular checkpoint has been fine-tuned on CNN Daily Mail, a large collection of text-summary pairs.

In [2]:
from transformers import pipeline

def load_model():
    # device=0 places the model on the first GPU; use device=-1 to run on CPU instead.
    model = pipeline("summarization", device=0, model='facebook/bart-large-cnn')
    return model

def generate_chunks(inp_str):
    """Split text into chunks of whole sentences, aiming for at most max_chunk words per chunk."""
    max_chunk = 500
    # Mark sentence boundaries after '.', '?' and '!' so the text can be split on them.
    inp_str = inp_str.replace('.', '.<eos>')
    inp_str = inp_str.replace('?', '?<eos>')
    inp_str = inp_str.replace('!', '!<eos>')

    sentences = inp_str.split('<eos>')
    current_chunk = 0
    chunks = []
    for sentence in sentences:
        words = sentence.split(' ')
        if len(chunks) == current_chunk + 1:
            # Add the sentence to the current chunk if it still fits; otherwise start a new chunk.
            if len(chunks[current_chunk]) + len(words) <= max_chunk:
                chunks[current_chunk].extend(words)
            else:
                current_chunk += 1
                chunks.append(words)
        else:
            # First sentence: open the first chunk.
            chunks.append(words)

    # Join each chunk's words back into a single string.
    return [' '.join(chunk) for chunk in chunks]

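def text_summarizer(text):
    # Split long inputs into chunks to help stay under the model's input length limit.
    chunks = generate_chunks(text)
    model = load_model()
    # The pipeline returns one result per chunk; join them all instead of keeping only the first.
    results = model(chunks, max_length=750, min_length=30, do_sample=True)
    summary = ' '.join(result['summary_text'] for result in results)
    return summary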
    




In [6]:
text_summarizer("""LLMs, or Large Language Models, are the key component behind text generation. 
                In a nutshell, they consist of large pretrained transformer models trained to predict 
                the next word (or, more precisely, token) given some input text. Since they predict 
                one token at a time, you need to do something more elaborate to generate new sentences 
                other than just calling the model — you need to do autoregressive generation.""")

Your max_length is set to 750, but your input_length is only 156. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=78)


'Large Language Models are the key component behind text generation. They consist of large pretrained transformer models trained to predict the next word (or, more precisely, token) given some input text. Since they predict  one token at a time, you need to do something more elaborate to generate new sentences.'
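
The warning above appears because `max_length=750` is far larger than the 156-token input. One possible way to avoid it is to derive `max_length` from the input length. The helper below is a hypothetical sketch: the name `suggest_max_length` and the 0.5 ratio are assumptions, with the ratio chosen to match the `max_length=78` hint in the warning.

```python
def suggest_max_length(input_length, ratio=0.5, min_length=30):
    """Pick a summary length cap as a fraction of the input token count.

    The result is kept above min_length so the two generation limits stay consistent.
    """
    return max(min_length + 1, int(input_length * ratio))

print(suggest_max_length(156))  # 78, the value suggested in the warning above
print(suggest_max_length(20))   # 31, never at or below min_length
```

The input token count could be obtained from the pipeline's tokenizer before calling the model, and the result passed as `max_length` in place of the fixed 750.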