# Creational Patterns

Creational patterns are guidelines for creating objects. They provide different mechanisms for object creation that keep code flexible, reusable, and easy to maintain.

The five creational patterns available are:
1. Builder
2. Factory
3. Abstract Factory
4. Prototype
5. Singleton

Each of these patterns is distinct and suits different situations and conditions.

# Singleton

**What is a singleton?**

Singleton is a pattern that ensures a class has only one instance, and that this single instance is shared across all the code.
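
As a rough illustration of that guarantee, here is a minimal sketch that overrides `__new__`. The class `AppConfig` and its `settings` attribute are hypothetical names chosen only for this example, and this is just one of several possible ways to write a singleton in Python.

```python
class AppConfig:
    """Hypothetical class, used only to illustrate the pattern."""

    _instance = None  # the single shared instance, created lazily

    def __new__(cls):
        # Build the object on the first call, then always return the same one.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.settings = {}
        return cls._instance


first = AppConfig()
second = AppConfig()
first.settings["language"] = "es"

print(first is second)              # True
print(second.settings["language"])  # es
```

Both names point to one object, so a change made through `first` is visible through `second`.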

**When should we use it?**

A singleton is worth considering when creating new class instances is expensive and one shared instance is enough. For example, creating a connection to a database, or loading a big model for calculating text embeddings.

**Scenario**

We are going to run some experiments that calculate the embeddings of a sentence in two different languages, with the same meaning in both.

## Antipattern

In this section, we are going to create several instances of the same embedding model. Even though a single instance can process both languages, we will create one instance for English and another for Spanish.

In [7]:
from transformers import RobertaTokenizer, RobertaModel

Create the instances for Spanish

In [8]:
tokenizer_spanish = RobertaTokenizer.from_pretrained("roberta-base")
model_spanish = RobertaModel.from_pretrained("roberta-base")

Some weights of RobertaModel were not initialized from the model checkpoint at roberta-base and are newly initialized: ['pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Create the instances for English

In [9]:
tokenizer_english = RobertaTokenizer.from_pretrained("roberta-base")
model_english = RobertaModel.from_pretrained("roberta-base")

Some weights of RobertaModel were not initialized from the model checkpoint at roberta-base and are newly initialized: ['pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Let's have some text inputs

In [10]:
text_spanish = "Hola, me llamo Sebastian"
text_english = "Hi, I'm Sebastian"

Get the embeddings

In [17]:
encoded_text_spanish = tokenizer_spanish(text_spanish, return_tensors="pt")
output_spanish = model_spanish(**encoded_text_spanish)

In [18]:
encoded_text_english = tokenizer_english(text_english, return_tensors="pt")
output_english = model_english(**encoded_text_english)

However, are the tokenizers and models the same object?

In [31]:
tokenizer_english is tokenizer_spanish

False

In [32]:
model_english is model_spanish

False

As shown above, we created two instances of both the tokenizer and the model. Loading them twice is expensive, and it is also redundant because the same instance can handle Spanish and English.
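
The duplicated cost can be reproduced in miniature without downloading anything. The sketch below uses a hypothetical stand-in class, `FakeModel`, which is not part of `transformers`. It only pretends to load weights and counts how many times that load runs.

```python
import time


class FakeModel:
    """Hypothetical stand-in for a large model; not part of transformers."""

    loads = 0  # how many times the expensive load has run

    def __init__(self):
        time.sleep(0.05)  # pretend to read the weights from disk
        FakeModel.loads += 1


fake_spanish = FakeModel()
fake_english = FakeModel()

print(FakeModel.loads)               # 2
print(fake_spanish is fake_english)  # False
```

Every extra instance pays the loading cost again and keeps its own copy in memory.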

This is the kind of mistake that the Singleton pattern helps to avoid.

## Pattern

There are several ways to create a singleton; the most elaborate ones involve writing your own decorator. Personally, I prefer to keep the code simple.
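
For reference, a hand-written decorator can be sketched in a few lines. The names `simple_singleton` and `Registry` are hypothetical, and this is only an illustration of the idea, not the code of any particular library.

```python
def simple_singleton(cls):
    """Hypothetical decorator: build `cls` once and reuse that instance."""
    instance = None

    def get_instance(*args, **kwargs):
        nonlocal instance
        # Only the first call constructs the object; later calls reuse it.
        if instance is None:
            instance = cls(*args, **kwargs)
        return instance

    return get_instance


@simple_singleton
class Registry:
    def __init__(self, name):
        self.name = name


a = Registry("first")
b = Registry("second")  # arguments are ignored: the instance already exists

print(a is b)  # True
print(b.name)  # first
```

This sketch has limitations: it is not thread-safe, and the name `Registry` now refers to a function, so `isinstance(a, Registry)` no longer works.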

To create a singleton, I can apply the `singleton` decorator from the `singleton-decorator` library when defining a class.

In [19]:
from singleton_decorator import singleton

We are going to create a class that wraps the tokenizer/model creation and the embedding calculation

In [26]:
@singleton
class TextEmbedding:
    def __init__(self):
        self.tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
        self.model = RobertaModel.from_pretrained("roberta-base")

    def get_embedding(self, text):
        return self.model(**self.tokenizer(text, return_tensors="pt"))

Again, I am going to create two instances, the first one for Spanish and the second one for English

In [27]:
embedding_model_spanish = TextEmbedding()

Some weights of RobertaModel were not initialized from the model checkpoint at roberta-base and are newly initialized: ['pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [28]:
embedding_model_english = TextEmbedding()

Now, are the tokenizer, the model, and the class instance each the same object?

In [33]:
embedding_model_english is embedding_model_spanish

True

In [35]:
embedding_model_spanish.tokenizer is embedding_model_english.tokenizer

True

In [36]:
embedding_model_spanish.model is embedding_model_english.model

True

I am going to process the text again

In [29]:
output_singleton_spanish = embedding_model_spanish.get_embedding(text_spanish)

In [30]:
output_singleton_english = embedding_model_english.get_embedding(text_english)

As shown above, the singleton avoids filling memory with several copies of the same object, and it removes the overhead of constructing that object more than once. Both texts are still processed correctly through the shared instance.