# Sentence Reconstruction

The purpose of this project is to take as input a sequence of words corresponding to a random permutation of a given English sentence, and to reconstruct the original sentence.

The output can be produced either in a single shot or through an iterative (autoregressive) loop that generates a single token at a time.


CONSTRAINTS:
* No pretrained model can be used.
* The neural network models should have fewer than 20M parameters.
* No postprocessing should be done (e.g. no beam search).
* You cannot use additional training data.


BONUS PARAMETERS:

A bonus of 0-2 points will be awarded to encourage the adoption of models with a low number of parameters.

# Dataset

The dataset is composed of sentences taken from the generics_kb dataset on Hugging Face. We restricted the vocabulary to the 10K most frequent words, and only took sentences making use of this vocabulary.
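To illustrate what restricting the corpus to a fixed vocabulary means, here is a minimal plain-Python sketch. It is not the mechanism used in this notebook (which relies on `TextVectorization` and on removing sentences that contain unknown tokens); the helper name `restrict_to_top_k` and the toy sentences are assumptions made for the example.

```python
from collections import Counter

def restrict_to_top_k(sentences, k):
    """Keep only the sentences whose words all belong to the k most frequent words."""
    counts = Counter(word for s in sentences for word in s.split())
    vocab = {word for word, _ in counts.most_common(k)}
    return [s for s in sentences if all(w in vocab for w in s.split())]

# Toy corpus: "cats" and "mice" appear 3 times, "chase" twice, the rest once.
toy = ["cats chase mice", "cats chase", "mice fear cats", "owls hunt mice"]
kept = restrict_to_top_k(toy, k=3)  # vocabulary becomes {"cats", "mice", "chase"}
print(kept)
```

Sentences containing even one out-of-vocabulary word are dropped entirely, which is the same policy applied to the real corpus further down.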

In [4]:
!pip install datasets





Download the dataset

In [5]:
from datasets import load_dataset
from keras.layers import TextVectorization
import tensorflow as tf
import numpy as np
np.random.seed(42)
ds = load_dataset('generics_kb',trust_remote_code=True)['train']

Generating train split:   0%|          | 0/1020868 [00:00<?, ? examples/s]

Keep only the sentences with more than 8 words.


In [6]:
ds = ds.filter(lambda row: len(row["generic_sentence"].split(" "))>8 )
corpus = [ '<start> ' + row['generic_sentence'].replace(","," <comma>") + ' <end>' for row in ds ]
corpus = np.array(corpus)


Filter:   0%|          | 0/1020868 [00:00<?, ? examples/s]

Create a tokenizer and a detokenizer

In [51]:
# max_tokens keeps only the most frequent words; the vocabulary is sorted from the most to the least frequent token
tokenizer=TextVectorization( max_tokens=10000, standardize="lower_and_strip_punctuation", encoding="utf-8",)
tokenizer.adapt(corpus)

class TextDetokenizer:
    def __init__(self, vectorize_layer):
        self.vectorize_layer = vectorize_layer
        vocab = self.vectorize_layer.get_vocabulary()
        self.index_to_word = {index: word for index, word in enumerate(vocab)}

    def __detokenize_tokens(self, tokens):
        def check_token(t):
          if t == 3:
            s="<start>"
          elif t ==2:
            s="<end>"
          elif t ==7:
            s="<comma>"
          else:
            s=self.index_to_word.get(t, '[UNK]')
          return s

        return ' '.join([ check_token(token) for token in tokens if token != 0])

    def __call__(self, batch_tokens):
       return [self.__detokenize_tokens(tokens) for tokens in batch_tokens]



detokenizer = TextDetokenizer( tokenizer )
sentences = tokenizer( corpus ).numpy()

Remove from the corpus the sentences in which any unknown word appears

In [8]:
mask = np.sum( (sentences==1) , axis=1) >= 1
original_data = np.delete( sentences, mask , axis=0)

In [9]:
original_data.shape

(241236, 28)
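The filtering step above can be checked on a toy array. This is only a sketch: the token ids are made up, and the only assumption carried over from the notebook is that id 1 marks an unknown word and id 0 marks padding.

```python
import numpy as np

# Toy token matrix: 0 is padding, 1 is the unknown-word id.
toy_sentences = np.array([
    [3, 5, 6, 2, 0],
    [3, 1, 6, 2, 0],   # contains an unknown word
    [3, 8, 9, 4, 2],
])

has_unknown = np.any(toy_sentences == 1, axis=1)   # one boolean per row
kept = toy_sentences[~has_unknown]                  # drop the flagged rows
print(kept.shape)
```

Boolean indexing with `~has_unknown` keeps the same rows that `np.delete` with the mask removes the complement of.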

Shuffle the sentences

In [10]:
from tensorflow.keras.utils import Sequence

class DataGenerator(Sequence):
    def __init__(self, data, batch_size=32, shuffle=True, seed=42):
        self.data = data
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.seed = seed
        self.on_epoch_end()


    def __len__(self):
        return int(np.floor(len(self.data) / self.batch_size))

    def __getitem__(self, index):
        indexes = self.indexes[index*self.batch_size:(index+1)*self.batch_size]

        data_batch = np.array([self.data[k] for k in indexes])
        #copy of ordered sequences
        result = np.copy(data_batch)
        #shuffle only the relevant positions for each batch
        for i in range(data_batch.shape[0]):
          np.random.shuffle(data_batch[i,1:data_batch[i].argmin() - 1])

        return data_batch , result

    def on_epoch_end(self):
        self.indexes = np.arange(len(self.data))
        if self.shuffle:
            if self.seed is not None:
                np.random.seed(self.seed)
            np.random.shuffle(self.indexes)
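The slice `1:argmin() - 1` used in `__getitem__` is what keeps the start token, the end token and the padding in place. A small sketch on a hand-made row (the word ids 10-13 are arbitrary; 3, 2 and 0 follow the ids hard-coded in the detokenizer):

```python
import numpy as np

rng = np.random.default_rng(0)

# 3 = <start>, 2 = <end>, 0 = padding.
row = np.array([3, 10, 11, 12, 13, 2, 0, 0])

first_pad = row.argmin()          # index of the first 0, here 6
inner = row[1:first_pad - 1]      # view on the words between <start> and <end>
rng.shuffle(inner)                # shuffles the view in place, so `row` changes too
print(row)
```

Note that `argmin` returns the position of the smallest value, so for a row with no padding at all it would not point at a 0; that edge case is worth keeping in mind when reading the generator.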

In [11]:
# Randomly permute the whole dataset before splitting it into training and test sets
np.random.seed(42)
# Shuffle all the data
shuffled_indices = np.random.permutation(len(original_data))
shuffled_data = original_data[shuffled_indices]

In [12]:
test_generator = DataGenerator(shuffled_data)
x, y = test_generator.__getitem__(1)

In [13]:
x = detokenizer(x)
y = detokenizer(y)

for i in range(7):
  print("original: ", y[i])
  print("shuffled: ", x[i])
  print("\n")

original:  <start> aggression is common in both male and female rabbits <comma> especially during breeding season <end>
shuffled:  <start> common breeding aggression especially rabbits male both and <comma> female is season in during <end>


original:  <start> fuel cells differ from other chemical batteries in two major respects <end>
shuffled:  <start> from major respects batteries other two cells differ chemical in fuel <end>


original:  <start> weather is the condition of the atmosphere over a brief period of time <end>
shuffled:  <start> the weather the atmosphere brief is of condition period time a over of <end>


original:  <start> crystals can form from vapors <comma> solutions <comma> or molten materials <end>
shuffled:  <start> vapors <comma> from materials or form <comma> crystals solutions can molten <end>


original:  <start> soap disrupts the membranes of the bacteria <end>
shuffled:  <start> disrupts of the soap the bacteria membranes <end>


original:  <start> sex cells

# Metrics

Let s be the source string and p your prediction. The quality of the results will be measured according to the following metric:

1.  look for the longest common substring w between s and p
2.  compute |w|/max(|s|,|p|)

If the match is exact, the score is 1.

When computing the score, you should NOT consider the start and end tokens.



The longest common substring can be computed with the `SequenceMatcher` class of difflib, which allows a simple definition of our metric.

In [14]:
from difflib import SequenceMatcher

def score(s,p):
    match = SequenceMatcher(None, s, p).find_longest_match()
    #print(match.size)
    return (match.size/max(len(p),len(s)))

Let's do an example.

In [15]:
original = "at first henry wanted to be friends with the king of france"
generated = "henry wanted to be friends with king of france at the first"

print("your score is ",score(original,generated))

your score is  0.5423728813559322


The score must be computed as an average over at least 3K random examples taken from the test set.
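One possible way to organize that average is sketched below. The helper `average_score`, its parameters and the sampling strategy are assumptions for illustration; `score` is restated with explicit bounds passed to `find_longest_match` so that the block is self-contained.

```python
import random
from difflib import SequenceMatcher

def score(s, p):
    # Same metric as above, with explicit bounds for find_longest_match.
    match = SequenceMatcher(None, s, p).find_longest_match(0, len(s), 0, len(p))
    return match.size / max(len(p), len(s))

def average_score(pairs, n_samples=3000, seed=42):
    """Average the metric over a random sample of (original, predicted) pairs."""
    rng = random.Random(seed)
    sample = rng.sample(pairs, min(n_samples, len(pairs)))
    return sum(score(s, p) for s, p in sample) / len(sample)

# Toy usage: one perfect prediction and one completely wrong prediction.
pairs = [("cats chase mice", "cats chase mice"), ("abc", "xyz")]
print(average_score(pairs))
```

Both strings are expected to be already stripped of the start and end tokens before being scored.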

# What to deliver

You are supposed to deliver a single notebook, suitably commented.
The notebook should describe a single model, although you may briefly discuss additional attempts you made.

The notebook should contain a full trace of the training.
Weights should be made available on request.

You must also give a clear assessment of the performance of the model, computed with the metric that has been given to you.

# Good work!

## Model used
For this task, I implemented a **Transformer** model based on the "***Attention is All You Need***" paper. Since the task involved manipulating sequences of natural language, the **Transformer** architecture, due to its efficient handling of dependencies within sequences, was the first choice that came to my mind.

The data format produced by the assignment notebook's `DataGenerator` required some adjustments to match the format expected by the Transformer model. I modified the `DataGenerator` to return a tuple containing two elements:

- *X* : Permuted sentence and original ordered sentence (as a tuple)
- *y* : Ordered sequence shifted by one position

In [16]:
from tensorflow.keras.utils import Sequence

class DataGenerator(Sequence):
    def __init__(self, data, batch_size=32, shuffle=True, seed=42):
        self.data = data
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.seed = seed
        self.on_epoch_end()

    def __len__(self):
        return int(np.floor(len(self.data) / self.batch_size))

    def __getitem__(self, index):
        indexes = self.indexes[index*self.batch_size:(index+1)*self.batch_size]

        data_batch = np.array([self.data[k] for k in indexes])
        #copy of ordered sequences (decoder input, with the <end> token replaced by padding)
        result = np.copy(data_batch)
        for i in range(len(result)):
           result[i][result[i] == 2] = 0 
        target_data = [np.append(s[1:], [0]) for s in data_batch]
        #shuffle only the relevant positions for each batch
        for i in range(data_batch.shape[0]):
          np.random.shuffle(data_batch[i,1:data_batch[i].argmin() - 1])

        return (np.array(data_batch) , np.array(result)), np.array(target_data)

    def on_epoch_end(self):
        self.indexes = np.arange(len(self.data))
        if self.shuffle:
            if self.seed is not None:
                np.random.seed(self.seed)
            np.random.shuffle(self.indexes)
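To make the decoder input and target layout concrete, here is a sketch on a single hand-made row. The word ids are arbitrary; 3, 2 and 0 are the start, end and padding ids used throughout the notebook.

```python
import numpy as np

# 3 = <start>, 2 = <end>, 0 = padding.
ordered = np.array([3, 10, 11, 12, 2, 0, 0])

decoder_input = ordered.copy()
decoder_input[decoder_input == 2] = 0      # hide <end> from the decoder input
target = np.append(ordered[1:], 0)         # ordered sentence shifted left by one

print(decoder_input)  # [ 3 10 11 12  0  0  0]
print(target)         # [10 11 12  2  0  0  0]
```

At every position the decoder sees the ordered words up to that point and is trained to predict the next one, with `<end>` as the last non-padding target.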

Once the `DataGenerator` class was ready, I used it to create two separate generators: `train_generator` and `test_generator`. These generators are responsible for preparing and producing batches of data for the training and testing phases of the model, respectively.

In [17]:
train_generator = DataGenerator(shuffled_data[:220000],batch_size=128)
test_generator = DataGenerator(shuffled_data[220000:],batch_size=512)
x,y = train_generator.__getitem__(1)

print(x[0][0].shape,x[1][0].shape,y[0].shape,sep="\n")

(28,)
(28,)
(28,)


## **Transformer implementation**

In [18]:
# This function creates positional encodings for a sequence: it encodes the position of each element in the sequence.

def positional_encoding(length, depth):
  depth = depth/2

  positions = np.arange(length)[:, np.newaxis]     # (seq, 1)
  depths = np.arange(depth)[np.newaxis, :]/depth   # (1, depth)

  angle_rates = 1 / (10000**depths)         # (1, depth)
  angle_rads = positions * angle_rates      # (pos, depth)

  pos_encoding = np.concatenate(
      [np.sin(angle_rads), np.cos(angle_rads)],
      axis=-1) 

  return tf.cast(pos_encoding, dtype=tf.float32)
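The same computation can be reproduced with NumPy alone to inspect the result. This is a sketch mirroring the function above (the name `positional_encoding_np` is introduced here for illustration); like the original, it concatenates the sine and cosine halves instead of interleaving them.

```python
import numpy as np

def positional_encoding_np(length, depth):
    half = depth // 2
    positions = np.arange(length)[:, np.newaxis]        # (length, 1)
    depths = np.arange(half)[np.newaxis, :] / half      # (1, half)
    angle_rads = positions / (10000 ** depths)          # (length, half)
    # First half of the features holds sines, second half cosines.
    return np.concatenate([np.sin(angle_rads), np.cos(angle_rads)], axis=-1)

pe = positional_encoding_np(length=4, depth=8)
print(pe.shape)        # (4, 8)
print(pe[0])           # position 0: sines are 0, cosines are 1
```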


### ***Embedding***

In [19]:
import tensorflow as tf
from keras import layers


# This class implements a layer that adds the positional encoding to the word embeddings.

class PositionalEmbedding(tf.keras.layers.Layer):
  
  def __init__(self, vocab_size, d_model):
    super().__init__()
    self.d_model = d_model
    self.embedding = tf.keras.layers.Embedding(vocab_size, d_model, mask_zero=True) 
    self.pos_encoding = positional_encoding(length=2048, depth=d_model)

  def compute_mask(self, *args, **kwargs):
    return self.embedding.compute_mask(*args, **kwargs)

  def call(self, x):
    length = tf.shape(x)[1]
    x = self.embedding(x)
    # This factor sets the relative scale of the embedding and positional_encoding.
    x *= tf.math.sqrt(tf.cast(self.d_model, tf.float32))
    x = x + self.pos_encoding[tf.newaxis, :length, :]
    return x


### ***Attention***

In [20]:
import tensorflow as tf
from keras import layers


# This class is used as the superclass of the classes that implement the different attention mechanisms explained in the paper.

class BaseAttention(tf.keras.layers.Layer):
  
  def __init__(self, **kwargs):
    super().__init__()
    self.mha = tf.keras.layers.MultiHeadAttention(**kwargs)
    self.layernorm = tf.keras.layers.LayerNormalization()
    self.add = tf.keras.layers.Add()
    

In [21]:
# The class defined above is used as the superclass of the following attention classes:
#   CrossAttention, GlobalSelfAttention, CausalSelfAttention

class CrossAttention(BaseAttention):
  
  def call(self, x, context):
    attn_output, attn_scores = self.mha(
        query=x,
        key=context,
        value=context,
        return_attention_scores=True)
    
    self.last_attn_scores = attn_scores

    x = self.add([x, attn_output])
    x = self.layernorm(x)

    return x

In [22]:
class GlobalSelfAttention(BaseAttention):
  
  def call(self, x):
    attn_output = self.mha(
        query=x,
        value=x,
        key=x)
    x = self.add([x, attn_output])
    x = self.layernorm(x)
    
    return x

In [23]:
class CausalSelfAttention(BaseAttention):
  
  def call(self, x):
    attn_output = self.mha(
        query=x,
        value=x,
        key=x,
        use_causal_mask = True)
    x = self.add([x, attn_output])
    x = self.layernorm(x)
    
    return x
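The effect of `use_causal_mask=True` can be pictured with a small NumPy sketch. It is not the Keras implementation, only an illustration of the idea: each query position may attend to itself and to earlier positions, never to later ones. The function name is an assumption for this example.

```python
import numpy as np

def causal_attention_weights(scores):
    """Softmax over keys, with every key position after the query masked out."""
    seq_len = scores.shape[-1]
    allowed = np.tril(np.ones((seq_len, seq_len), dtype=bool))  # lower triangle
    masked = np.where(allowed, scores, -np.inf)
    masked = masked - masked.max(axis=-1, keepdims=True)         # numerical stability
    weights = np.exp(masked)
    return weights / weights.sum(axis=-1, keepdims=True)

# With uniform scores, position i spreads its attention evenly over positions 0..i.
weights = causal_attention_weights(np.zeros((4, 4)))
print(weights)
```

This is what prevents the decoder from looking at the words it is supposed to predict during training.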
  

### ***Feed Forward***
The `FeedForward` class implements the position-wise feed-forward block, consisting of:
- Two ***Dense layers***, the first with a non-linear (ReLU) activation function
- One ***Dropout layer*** for regularization
- One **residual connection**, followed by layer normalization.

In [24]:
class FeedForward(tf.keras.layers.Layer):
  def __init__(self, d_model, dff, dropout_rate=0.1):
    super().__init__()
    self.seq = tf.keras.Sequential([
      tf.keras.layers.Dense(dff, activation='relu'),
      tf.keras.layers.Dense(d_model),
      tf.keras.layers.Dropout(dropout_rate)
    ])
    self.add = tf.keras.layers.Add()
    self.layer_norm = tf.keras.layers.LayerNormalization()

  def call(self, x):
    x = self.add([x, self.seq(x)])
    x = self.layer_norm(x) 
    
    return x
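As a rough picture of what this block computes at inference time, here is a NumPy sketch. It leaves out dropout and the learned scale and offset of layer normalization, and the weight shapes and the `eps` value are assumptions chosen for the example.

```python
import numpy as np

def layer_norm(x, eps=1e-3):
    # Normalize each feature vector to zero mean and (roughly) unit variance.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def feed_forward_block(x, w1, b1, w2, b2):
    hidden = np.maximum(0.0, x @ w1 + b1)   # first dense layer with ReLU
    out = hidden @ w2 + b2                  # second dense layer, back to d_model
    return layer_norm(x + out)              # residual connection, then normalization

rng = np.random.default_rng(0)
d_model, dff = 8, 16
x = rng.normal(size=(2, 5, d_model))        # (batch, seq_len, d_model)
y = feed_forward_block(
    x,
    rng.normal(size=(d_model, dff)), np.zeros(dff),
    rng.normal(size=(dff, d_model)), np.zeros(d_model),
)
print(y.shape)
```

The output has the same shape as the input, which is what allows several of these blocks to be stacked.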

### ***Encoder***
The implementation of the **Encoder** defined in the "***Attention is All You Need***" paper is achieved by creating an `EncoderLayer` consisting of:
- One ***Global Attention layer***
- One ***FeedForward Layer***

What the **Encoder** does is "_encoding_" meaning and order through word embeddings, positional encoding and self-attention mechanisms.


In [25]:
class EncoderLayer(tf.keras.layers.Layer):
  
  def __init__(self,*, d_model, num_heads, dff, dropout_rate=0.1):
    super().__init__()

    self.self_attention = GlobalSelfAttention(
        num_heads=num_heads,
        key_dim=d_model,
        dropout=dropout_rate)

    self.ffn = FeedForward(d_model, dff)

  def call(self, x):
    x = self.self_attention(x)
    x = self.ffn(x)
    
    return x


In [26]:
# The Encoder class uses the previously implemented layers to create the full encoder

class Encoder(tf.keras.layers.Layer):
  
  def __init__(self, *, num_layers, d_model, num_heads,
               dff, vocab_size, dropout_rate=0.1):
    super().__init__()

    self.d_model = d_model
    self.num_layers = num_layers

    self.pos_embedding = PositionalEmbedding(
        vocab_size=vocab_size, d_model=d_model)

    self.enc_layers = [
        EncoderLayer(d_model=d_model,
                     num_heads=num_heads,
                     dff=dff,
                     dropout_rate=dropout_rate)
        for _ in range(num_layers)]
    self.dropout = tf.keras.layers.Dropout(dropout_rate)

  def call(self, x):
    # `x` is token-IDs shape: (batch, seq_len)
    x = self.pos_embedding(x)  # Shape `(batch_size, seq_len, d_model)`.

    # Add dropout.
    x = self.dropout(x)

    for i in range(self.num_layers):
      x = self.enc_layers[i](x)

    return x  # Shape `(batch_size, seq_len, d_model)`.


### ***Decoder***
The implementation of the **Decoder** defined in the "***Attention is All You Need***" paper is achieved by creating a `DecoderLayer` consisting of:
- One ***Causal Self Attention layer***
- One ***Cross Attention layer***
- One ***FeedForward Layer***

The **Decoder** uses causal self-attention to model the relationships within the sequence generated so far, and cross-attention to extract information from the encoded input; in this way, each generated word incorporates the relevant information from the source sequence.

In [27]:
class DecoderLayer(tf.keras.layers.Layer):
  def __init__(self,
               *,
               d_model,
               num_heads,
               dff,
               dropout_rate=0.1):
    super(DecoderLayer, self).__init__()

    self.causal_self_attention = CausalSelfAttention(
        num_heads=num_heads,
        key_dim=d_model,
        dropout=dropout_rate)

    self.cross_attention = CrossAttention(
        num_heads=num_heads,
        key_dim=d_model,
        dropout=dropout_rate)

    self.ffn = FeedForward(d_model, dff)

  def call(self, x, context):
    x = self.causal_self_attention(x=x)
    x = self.cross_attention(x=x, context=context)
    
    self.last_attn_scores = self.cross_attention.last_attn_scores

    x = self.ffn(x)  # Shape `(batch_size, seq_len, d_model)`.
    return x

In [28]:
# The Decoder class uses the previously implemented layers to create the full decoder

class Decoder(tf.keras.layers.Layer):
  
  def __init__(self, *, num_layers, d_model, num_heads, dff, vocab_size,
               dropout_rate=0.1):
    super(Decoder, self).__init__()

    self.d_model = d_model
    self.num_layers = num_layers

    self.pos_embedding = PositionalEmbedding(vocab_size=vocab_size,
                                             d_model=d_model)
    self.dropout = tf.keras.layers.Dropout(dropout_rate)
    self.dec_layers = [
        DecoderLayer(d_model=d_model, num_heads=num_heads,
                     dff=dff, dropout_rate=dropout_rate)
        for _ in range(num_layers)]

    self.last_attn_scores = None

  def call(self, x, context):
    # `x` is token-IDs shape (batch, target_seq_len)
    x = self.pos_embedding(x)  # (batch_size, target_seq_len, d_model)

    x = self.dropout(x)

    for i in range(self.num_layers):
      x  = self.dec_layers[i](x, context)

    self.last_attn_scores = self.dec_layers[-1].last_attn_scores

    # The shape of x is (batch_size, target_seq_len, d_model).
    return x

### ***Transformer***
The structure of the full **Transformer** is the one presented in the paper: the previously implemented ***Encoder*** and ***Decoder*** are combined together, and a ***Dense layer*** is added at the end.

In [29]:
#This class implements the full Transformer using the previously implemented Decoder and Encoder, ending with a Dense Layer
class Transformer(tf.keras.Model):
  
  def __init__(self, *, num_layers, d_model, num_heads, dff,
               input_vocab_size, target_vocab_size, dropout_rate=0.1):
    super().__init__()
    self.encoder = Encoder(num_layers=num_layers, d_model=d_model,
                           num_heads=num_heads, dff=dff,
                           vocab_size=input_vocab_size,
                           dropout_rate=dropout_rate)

    self.decoder = Decoder(num_layers=num_layers, d_model=d_model,
                           num_heads=num_heads, dff=dff,
                           vocab_size=target_vocab_size,
                           dropout_rate=dropout_rate)

    self.final_layer = tf.keras.layers.Dense(target_vocab_size)

  def call(self, inputs):
    context, x  = inputs

    context = self.encoder(context)  # (batch_size, context_len, d_model)

    x = self.decoder(x, context)  # (batch_size, target_len, d_model)

    # Final linear layer output.
    logits = self.final_layer(x)  # (batch_size, target_len, target_vocab_size)

    try:
      del logits._keras_mask
    except AttributeError:
      pass

    # Return the final logits.
    return logits

### **Training** 

After solving all the problems for the implementation of the ***Transformer*** model, the most challenging part became hyperparameter tuning.
Through extensive experimentation with various parameter combinations, I identified the following settings that achieved the best results based on the score metric defined in the assignment notebook. 

In [61]:
# Best parameters found:

num_layers = 8
d_model = 64
dff = 128
num_heads = 27
dropout_rate = 0.1
epoch = 6


In [62]:
transformer = Transformer(
    num_layers=num_layers,
    d_model=d_model,
    num_heads=num_heads,
    dff=dff,
    input_vocab_size=10000,
    target_vocab_size=10000,
    dropout_rate=dropout_rate)
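Since the assignment caps the model at 20M parameters, a back-of-the-envelope estimate for this configuration can be useful. The sketch below is an approximation written for this purpose, not output from the notebook: it assumes each attention layer has separate query, key and value projections with biases plus an output projection, and that every layer normalization has a scale and an offset. The authoritative count is the one reported by `transformer.summary()`.

```python
def mha_params(d_model, num_heads, key_dim):
    # Query, key and value projections (with biases), then the output projection.
    qkv = 3 * (d_model * num_heads * key_dim + num_heads * key_dim)
    out = num_heads * key_dim * d_model + d_model
    return qkv + out

def estimate_params(num_layers, d_model, num_heads, dff, vocab_size):
    layer_norm = 2 * d_model
    ffn = (d_model * dff + dff) + (dff * d_model + d_model) + layer_norm
    attention = mha_params(d_model, num_heads, key_dim=d_model) + layer_norm
    encoder = num_layers * (attention + ffn)          # self-attention + feed-forward
    decoder = num_layers * (2 * attention + ffn)      # causal + cross attention + feed-forward
    embeddings = 2 * vocab_size * d_model             # encoder and decoder embeddings
    final_dense = d_model * vocab_size + vocab_size
    return embeddings + encoder + decoder + final_dense

total = estimate_params(num_layers=8, d_model=64, num_heads=27, dff=128, vocab_size=10000)
print(f"estimated parameters: {total:,}")
```

Under these assumptions most of the budget goes to the attention layers, because `key_dim` is set to `d_model` for each of the 27 heads.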


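A custom ***Learning Rate Schedule***, the one proposed in the paper, was used: the learning rate grows linearly during the first `warmup_steps` steps and then decays proportionally to the inverse square root of the step number. The warm-up keeps the first updates small, and the subsequent decay helps the model converge.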

In [63]:
class CustomSchedule(tf.keras.optimizers.schedules.LearningRateSchedule):
  
  def __init__(self, d_model, warmup_steps=4000):
    super().__init__()

    self.d_model = d_model
    self.d_model = tf.cast(self.d_model, tf.float32)

    self.warmup_steps = warmup_steps

  def __call__(self, step):
    step = tf.cast(step, dtype=tf.float32)
    arg1 = tf.math.rsqrt(step)
    arg2 = step * (self.warmup_steps ** -1.5)

    return tf.math.rsqrt(self.d_model) * tf.math.minimum(arg1, arg2)
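The shape of this schedule is easier to see with a few numbers. The following plain-Python sketch evaluates the same formula, lr = d_model^(-0.5) * min(step^(-0.5), step * warmup_steps^(-1.5)); the function name and the sampled steps are chosen only for illustration.

```python
def transformer_lr(step, d_model=64, warmup_steps=4000):
    """Learning rate given by the schedule above, for a step >= 1."""
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

# The rate rises until step == warmup_steps and decreases afterwards.
for step in (1, 1000, 4000, 16000, 64000):
    print(step, f"{transformer_lr(step):.6f}")
```

With 1718 steps per epoch, as shown in the training log below, the 4000 warm-up steps correspond to a little more than two epochs.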

In [64]:
learning_rate = CustomSchedule(d_model)

optimizer = tf.keras.optimizers.Adam(learning_rate, beta_1=0.9, beta_2=0.98, epsilon=1e-9)


In [65]:
# Loss and Accuracy are computed using a mask

def masked_loss(label, pred):
  mask = label != 0
  loss_object = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction='none')
  loss = loss_object(label, pred)

  mask = tf.cast(mask, dtype=loss.dtype)
  loss *= mask

  loss = tf.reduce_sum(loss)/tf.reduce_sum(mask)
  
  return loss


def masked_accuracy(label, pred):
  pred = tf.argmax(pred, axis=2)
  label = tf.cast(label, pred.dtype)
  match = label == pred

  mask = label != 0

  match = match & mask

  match = tf.cast(match, dtype=tf.float32)
  mask = tf.cast(mask, dtype=tf.float32)
  
  return tf.reduce_sum(match)/tf.reduce_sum(mask)
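To see why the mask matters, the loss can be re-implemented in NumPy and evaluated on a tiny example. This is an illustrative sketch, not the code used for training; the function name and the toy shapes are assumptions.

```python
import numpy as np

def masked_cross_entropy(labels, logits):
    """Mean cross-entropy over the positions whose label is not 0 (padding)."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    per_token = -np.take_along_axis(log_probs, labels[..., None], axis=-1)[..., 0]
    mask = (labels != 0).astype(per_token.dtype)
    return (per_token * mask).sum() / mask.sum()

labels = np.array([[2, 1, 0]])            # the last position is padding
logits = np.zeros((1, 3, 4))              # uniform prediction over 4 tokens
loss_a = masked_cross_entropy(labels, logits)

logits_b = logits.copy()
logits_b[0, 2] = [5.0, -5.0, 0.0, 1.0]    # change only the padded position
loss_b = masked_cross_entropy(labels, logits_b)
print(loss_a, loss_b)
```

Both losses are identical: whatever the model predicts on padded positions has no influence on the training signal.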


In [66]:
transformer.compile(
    loss=masked_loss,
    optimizer=optimizer,
    metrics=[masked_accuracy])


In [67]:
output = transformer(x)




In [68]:
transformer.summary()


In [69]:
# Here weights previously computed are loaded in order to continue the optimization without starting over

transformer.load_weights('/kaggle/input/weights/transformer_13epochs.weights.h5')


In [84]:
# Training of the model

transformer.fit(train_generator,epochs=2)
transformer.save_weights('transformer15_epochs.weights.h5')


Epoch 1/2
1718/1718 ━━━━━━━━━━━━━━━━━━━━ 252s 146ms/step - loss: 0.3736 - masked_accuracy: 0.8893
Epoch 2/2
1718/1718 ━━━━━━━━━━━━━━━━━━━━ 251s 146ms/step - loss: 0.3461 - masked_accuracy: 0.8958


In [85]:
# Here the trained Transformer is used to reconstruct the order of shuffled sentences.
# The class takes a batch of sequences with shuffled words and reorders them.
# The sentence is built by an iterative (autoregressive) loop generating a single token at a time.

class Orderer(tf.Module):
  
  def __init__(self, tokenizers, transformer):
    self.tokenizers = tokenizers
    self.transformer = transformer

  def __call__(self, sentences, max_length=28):
    # The input must be a tensor of token ids
    assert isinstance(sentences, tf.Tensor)
    # A single sentence of shape (28,) is promoted to a batch of size 1
    if len(sentences.shape) == 1:
      sentences = sentences[tf.newaxis]
    batch_size = sentences.shape[0]

    # I reshape the input to a suitable size
    encoder_input = tf.reshape(sentences,[batch_size,28])

    # I save the token for the start and the end of the sequence
    start_end = self.tokenizers(['<start>',"<end>"])
    start = start_end[0][tf.newaxis][0]
    end = start_end[1][tf.newaxis][0]

      
    output_array = [[start] for i in range(batch_size)]

    for i in tf.range(max_length) :
        output = np.reshape(output_array, (batch_size, (i+1)))
        predictions = self.transformer([encoder_input, output], training=False)
        # Select the last token from the 'seq_len' dimension.
        predictions = predictions[:, -1:, :] # Shape (batch_size, 1, vocab_size)
        
        predicted_id = tf.argmax(predictions, axis=-1)
        
        # Append the predicted_id to the output which is given to the
        # decoder as its input. Sentences that already reached <end>
        # (or padding) keep repeating their last token.
        for j in range(batch_size):
            if output_array[j][-1] == end or output_array[j][-1] == 0:
                output_array[j].append(output_array[j][-1])
                continue
            output_array[j].append(predicted_id[j].numpy())
                             
    output = np.array(output_array)
    # The output shape is (batch_size, tokens, 1)
    text = detokenizer(output[:,:, 0])


    return text

In [86]:
def clean_sentence(x):
    x = x.replace('<start>', '').replace('<end>', '').replace('<pad>', '').strip()
    return x
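Stripped of the batching details, the decoding strategy used by `Orderer` is plain greedy decoding: at each step the most likely next token is appended, with no beam search. A minimal sketch of that loop, where `next_token_fn` is a hypothetical stand-in for a call to the model followed by an argmax:

```python
def greedy_decode(next_token_fn, start_id, end_id, max_length):
    """Generate tokens one at a time until `end_id` or `max_length` tokens."""
    output = [start_id]
    while len(output) < max_length and output[-1] != end_id:
        output.append(next_token_fn(output))
    return output

# Hypothetical stand-in for the model: it simply replays a fixed sentence.
reference = [3, 10, 11, 12, 2]
toy_model = lambda prefix: reference[len(prefix)]

decoded = greedy_decode(toy_model, start_id=3, end_id=2, max_length=28)
print(decoded)
```

Unlike this sketch, the batched version above cannot stop early for individual sentences, so it keeps repeating the end token until every sequence reaches `max_length`.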


### **Testing**
The model was evaluated using the scoring function provided in the assignment notebook.
The following lines of code compute the score of each batch of the test data. The final score is then obtained by averaging the scores of all the test sentences.

In [87]:
total_test_size = 0
score_ = 0
orderer=Orderer(tokenizer,transformer)
i = 0

for batch in test_generator:
    x,y = batch
    score_batch_size = x[0].shape[0]
    if score_batch_size == 0:
        break
    total_test_size = total_test_size + score_batch_size
    y_pred = orderer(tf.constant(x[0]))
    b_score = 0                   # score associated with each batch

    pred_sentences = y_pred
    original_sentences = detokenizer(x[1])
    
    print(clean_sentence(pred_sentences[0]),clean_sentence(original_sentences[0]), sep="\n")

    for j in range(score_batch_size) :
        b_score += score(clean_sentence(original_sentences[j]), clean_sentence(pred_sentences[j]))

    score_ += b_score
    print("\n====BATCH OVER====")
    print("Score as of batch ", i, ": ", score_/((i+1)*score_batch_size))
    i = i + 1

score_ = score_/total_test_size
print("\n====ALL OVER====")
print("Final score: ", score_)

recycling helps conserve natural resources and prevents precious pollution
recycling prevents pollution and helps conserve precious natural resources

====BATCH OVER====
Score as of batch  0 :  0.4625899121482372
men are responsible for what goes on during sex
men are responsible for what goes on during sex

====BATCH OVER====
Score as of batch  1 :  0.4747520524006411
some people try to deal with stress because they never learn how to avoid stressful situations
some people never learn how to deal with stress because they try to avoid stressful situations

====BATCH OVER====
Score as of batch  2 :  0.4781457651734932
houses vary greatly in the levels of radon gas they contain
houses vary greatly in the levels of radon gas they contain

====BATCH OVER====
Score as of batch  3 :  0.4802752260733347
employment is the leading source of health insurance coverage
employment is the leading source of health insurance coverage

====BATCH OVER====
Score as of batch  4 :  0.4815696527880692
vacuo