In [1]:
from transformers import pipeline



### 1. Text generation

In [11]:
pipe = pipeline("text-generation", model="openai-community/gpt2")

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
All PyTorch model weights were used when initializing TFGPT2LMHeadModel.

All the weights of TFGPT2LMHeadModel were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


In [13]:
pipe("Hello, I'm a language model,", max_length=30, num_return_sequences=5)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'Hello, I\'m a language model, let me give you what you need to learn!"\n\n"Thank you, the program used for your question'},
 {'generated_text': "Hello, I'm a language model, which means, I'm able to write an interpreter for a particular language because it's a syntax. My language"},
 {'generated_text': "Hello, I'm a language model, and this is not a code generator, but a text processing model. This means that the code can be made"},
 {'generated_text': "Hello, I'm a language model, I know a lot.\n\nWhen you look at the code of a language, you need a way to"},
 {'generated_text': "Hello, I'm a language model, so let's write this example on it. It's very simple, we have this collection where we store all"}]
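
Each call returns a list of dictionaries with a `generated_text` key, and every string begins with the prompt. A minimal sketch for keeping only the continuation, using two of the outputs above as literal data. `strip_prompt` is a hypothetical helper, not part of transformers.

```python
# Hypothetical helper (not part of transformers): drop the prompt prefix
# from each generated string so only the model's continuation remains.
def strip_prompt(results, prompt):
    continuations = []
    for item in results:
        text = item["generated_text"]
        if text.startswith(prompt):
            text = text[len(prompt):]
        continuations.append(text.strip())
    return continuations


prompt = "Hello, I'm a language model,"

# Two entries copied from the output above.
results = [
    {"generated_text": "Hello, I'm a language model, I know a lot.\n\nWhen you look at the code of a language, you need a way to"},
    {"generated_text": "Hello, I'm a language model, so let's write this example on it. It's very simple, we have this collection where we store all"},
]

for continuation in strip_prompt(results, prompt):
    print(repr(continuation))
```

The text-generation pipeline should also accept `return_full_text=False` to get a similar result directly, but the helper makes the structure of the returned list explicit.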

In [15]:
pipe("Tell a joke about woman...", max_length=70, num_return_sequences=1)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'Tell a joke about woman... just tell it like it is."\n\nIn addition to his many work with women\'s liberation activism, Waverley has also produced many award winning work in the LGBTQ community, such as "The Best of the LGBTQ+ Movement: Women of the New York Board Accessible Space," focusing on gender equality through their community'}]

In [17]:
pipe("Tell a joke about man...", max_length=70, num_return_sequences=1)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "Tell a joke about man...\n\n\nDo not call your grandmother a man!\n\n\nDon't have sex that night, no one will ever give you a chance!\n\n\nDo not talk about the person to whom you are talking about to anyone else!\n\n\nDon't ask your grandma what you saw at the mall.\n\n\nDon't"}]

### 2. Summarization

In [None]:
# torch must be imported before torch.bfloat16 can be referenced
import torch

summarizer = pipeline(task="summarization",
                      model="./models/facebook/bart-large-cnn",
                      torch_dtype=torch.bfloat16)

In [19]:
import gc
gc.collect()

356

In [22]:
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

All PyTorch model weights were used when initializing TFBartForConditionalGeneration.

All the weights of TFBartForConditionalGeneration were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBartForConditionalGeneration for predictions without further training.


In [24]:
text = """Neural text embeddings now play a crucial role in various information retrieval (IR) tasks. Transformers have gradually surpassed earlier models in this task. Indeed, search applications (such as Google or Bing) have used transformers to find semantically relevant results. The first model used was BERT (or similar bidirectional models), but recently there has been interest in LLMs given their performance.
Initially, small encoders were used because they clearly produce small embeddings. Decoder-only models have been used successfully for so many applications that they have attracted the attention of researchers. For example, SGPT, a decoder-only model, has shown excellent results.
In recent years there has been quite a lot of interest in the topic, then that there are several applications, and because it has been shown at query lookup you can approximate nearest-neighbor search and you can optimize on GPU.
In addition to computational feasibility, there has been a growing interest in embeddings because there are more and more applications with LLM. In fact, one of the main problems of LLMs is that they are prone to hallucination and their knowledge cannot be easily updated.
Retrieval-augmented language models in contrast, can retrieve knowledge from an external datastore when needed, potentially reducing hallucination and increasing coverage. (source)
While initially the found knowledge was used to do model updating, today we tend instead to provide this knowledge to the model to condition generation (via prompts). So we have a retriever that searches for similarity of documents and then these are provided to the LLM for generation.
The typical system thus consists of a retriever who must find top-k relevant texts from a corpus. Later, for better performance, efforts were made to create multi-stage systems, in which found documents are then refined by a reranker (practically reorders the documents). Both the retriever and the reranker are typically transformers (e.g., BERT or T5) that have been trained to encode documents and queries or to score documents and queries.
Later articles showed that both embedding and reranking can be formulated as text generation. Therefore, it was tried to see how LLMs would perform in such tasks. A study from 2023, shows that ChatGPT has very good performance for zero-shot reranking
we argue that fine-tuning state-of-the-art large language models to function as retrievers and rerankers can yield better effectiveness than previous smaller models.
In fact, LLMs being generalists can accomplish multiple tasks at the same time. In this study, the authors show that LLaMA achieves state-of-the-art when it is fine-tuned for these tasks.
"""

In [25]:
summarizer(text, min_length=100, max_length=200)

Your max_length is set to 1000, but your input_length is only 572. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=286)


KeyboardInterrupt: 
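
The warning above suggests `max_length=286`, which is half of the reported 572-token input. A minimal sketch of that heuristic, assuming the goal is a summary no longer than half the input. `suggest_lengths` and its `min_ratio` parameter are hypothetical names, not part of transformers, and the 0.25 ratio is an arbitrary illustrative choice.

```python
def suggest_lengths(input_length, min_ratio=0.25):
    """Return (min_length, max_length) in tokens for a summary.

    max_length is half the input length, matching the value suggested
    in the warning; min_length is a fraction of max_length.
    """
    max_length = max(1, input_length // 2)
    min_length = min(max_length, max(1, int(max_length * min_ratio)))
    return min_length, max_length


# 572 is the input_length reported in the warning.
min_len, max_len = suggest_lengths(572)
print(min_len, max_len)
```

The resulting values could then be passed on as `summarizer(text, min_length=min_len, max_length=max_len)`.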

### 3. Embeddings

In [1]:
from sentence_transformers import SentenceTransformer



In [2]:
model = SentenceTransformer("all-MiniLM-L6-v2")



In [3]:
sentences1 = ['The cat sits outside',
              'A man is playing guitar',
              'The movies are awesome']

In [4]:
embeddings1 = model.encode(sentences1, convert_to_tensor=True)

In [6]:
len(embeddings1[0])

384

In [7]:
sentences2 = ['The dog plays in the garden',
              'A woman watches TV',
              'The new movie is so great']

In [8]:
embeddings2 = model.encode(sentences2, 
                           convert_to_tensor=True)

In [9]:
len(embeddings2[0])

384

In [11]:
from sentence_transformers import util
cosine_scores = util.cos_sim(embeddings1,embeddings2)
cosine_scores

tensor([[ 0.2838,  0.1310, -0.0029],
        [ 0.2277, -0.0327, -0.0136],
        [-0.0124, -0.0465,  0.6571]])
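
Each entry of this matrix is the cosine similarity between one sentence from `sentences1` (rows) and one from `sentences2` (columns). A minimal pure-Python sketch of the same kind of computation on toy vectors. `cosine` and `pairwise_cosine` are illustrative names, and the vectors are made up rather than real embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_cosine(rows, cols):
    """Build a len(rows) x len(cols) matrix of cosine similarities."""
    return [[cosine(u, v) for v in cols] for u in rows]


# Toy 2-dimensional vectors standing in for the 384-dimensional embeddings.
toy_a = [[1.0, 0.0], [1.0, 1.0]]
toy_b = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

scores = pairwise_cosine(toy_a, toy_b)
for row in scores:
    print([round(s, 4) for s in row])
```

Identical directions score 1, orthogonal vectors score 0, and opposite directions score -1, which is why unrelated sentence pairs in the matrix above sit close to zero.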

In [12]:
for i in range(len(sentences1)):
    print("{} \t\t {} \t\t Score: {:.4f}".format(sentences1[i],
                                                 sentences2[i],
                                                 cosine_scores[i][i]))

The cat sits outside 		 The dog plays in the garden 		 Score: 0.2838
A man is playing guitar 		 A woman watches TV 		 Score: -0.0327
The movies are awesome 		 The new movie is so great 		 Score: 0.6571
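
The loop above compares only sentences at the same index, that is, the diagonal of the score matrix. A minimal sketch that instead picks the best-scoring sentence in `sentences2` for each sentence in `sentences1`, using the scores printed earlier as literal values so that no model is needed.

```python
sentences1 = ['The cat sits outside',
              'A man is playing guitar',
              'The movies are awesome']
sentences2 = ['The dog plays in the garden',
              'A woman watches TV',
              'The new movie is so great']

# Values copied from the cosine_scores tensor printed earlier.
cosine_scores = [
    [0.2838, 0.1310, -0.0029],
    [0.2277, -0.0327, -0.0136],
    [-0.0124, -0.0465, 0.6571],
]

best_matches = []
for i, row in enumerate(cosine_scores):
    # Index of the highest score in this row.
    j = max(range(len(row)), key=row.__getitem__)
    best_matches.append((sentences1[i], sentences2[j], row[j]))

for left, right, score in best_matches:
    print("{} \t\t {} \t\t Score: {:.4f}".format(left, right, score))
```

With these scores, "A man is playing guitar" is matched to "The dog plays in the garden" (0.2277) rather than to its same-index partner "A woman watches TV" (-0.0327).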
