## Fetch ML Apprentice



###### Jainam Shah
###### Email Id: jainamshah1500@gmail.com


#### Import Libraries 

In [1]:
import torch
from transformers import BertModel, BertTokenizer
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning) 
from transformers import logging
logging.set_verbosity_error()

## Task 1:  Sentence Transformer Implementation

Implement a sentence transformer model using any deep learning framework of your choice. This model should be able to encode input sentences into fixed-length embeddings. Test your implementation with a few sample sentences and showcase the obtained embeddings. Describe any choices you had to make regarding the model architecture outside of the transformer backbone.



#### Initialize the Transformer Model

We will use the BERT model as our transformer backbone. We will also use the corresponding tokenizer to preprocess our input sentences. <br/>
Code to Initialize the Model and Tokenizer:



In [2]:
# Load pre-trained BERT tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

In [3]:
# Load pre-trained BERT model
model = BertModel.from_pretrained('bert-base-uncased')

In [4]:
# Ensure the model is set to evaluation mode
model.eval()

BertModel(
  (embeddings): BertEmbeddings(
    (word_embeddings): Embedding(30522, 768, padding_idx=0)
    (position_embeddings): Embedding(512, 768)
    (token_type_embeddings): Embedding(2, 768)
    (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (encoder): BertEncoder(
    (layer): ModuleList(
      (0): BertLayer(
        (attention): BertAttention(
          (self): BertSelfAttention(
            (query): Linear(in_features=768, out_features=768, bias=True)
            (key): Linear(in_features=768, out_features=768, bias=True)
            (value): Linear(in_features=768, out_features=768, bias=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (output): BertSelfOutput(
            (dense): Linear(in_features=768, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          

In [5]:
# Move model to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)


## Encode Sentences

Tokenize the input sentences.<br/>
Pass the tokenized inputs through the transformer model.<br/>
Extract and process the embeddings.<br/>


Tokenization: Convert the sentences into token IDs that the model can process.<br/>
Padding: Ensure that all input sentences are of the same length by padding shorter sentences.<br/>
Create Attention Masks: These masks identify the actual tokens vs. the padded tokens.<br/>
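
To make the padding and attention-mask steps concrete, here is a toy sketch in plain Python. It is not the Hugging Face tokenizer: the helper `pad_batch` and the token IDs are hypothetical, and the padding ID of 0 is taken from the `padding_idx=0` shown in the model summary above.

```python
# Toy illustration of padding and attention masks (not the Hugging Face tokenizer).
PAD_ID = 0  # assumed padding id, matching padding_idx=0 in the model summary

def pad_batch(token_id_lists, pad_id=PAD_ID):
    """Pad every sequence to the longest one and build matching attention masks."""
    max_len = max(len(ids) for ids in token_id_lists)
    input_ids, attention_mask = [], []
    for ids in token_id_lists:
        n_pad = max_len - len(ids)
        input_ids.append(ids + [pad_id] * n_pad)             # real tokens, then padding
        attention_mask.append([1] * len(ids) + [0] * n_pad)  # 1 = real token, 0 = padding
    return input_ids, attention_mask

# Hypothetical token ids for two sentences of different lengths
batch = [[101, 7632, 102], [101, 2664, 2178, 6251, 102]]
input_ids, attention_mask = pad_batch(batch)
print(input_ids)
print(attention_mask)
```

In the notebook itself, `tokenizer(..., padding=True)` does this work and returns both `input_ids` and `attention_mask`.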

In [6]:
def encode_sentences(sentences, tokenizer, model, device):
    # Tokenize the sentences
    inputs = tokenizer(sentences, return_tensors='pt', padding=True, truncation=True, max_length=128)
    
    # Move inputs to the appropriate device
    input_ids = inputs['input_ids'].to(device)
    attention_mask = inputs['attention_mask'].to(device)
    
    # Get the model outputs
    with torch.no_grad():  # Disable gradient calculation for inference
        outputs = model(input_ids, attention_mask=attention_mask)
    
    # Mean pooling over the sequence dimension gives one fixed-length vector per sentence.
    # Note: this plain mean also averages over padded positions.
    embeddings = outputs.last_hidden_state.mean(dim=1)
    
    return embeddings
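
The plain `.mean(dim=1)` in `encode_sentences` averages over every position, so padded positions in shorter sentences contribute to the embedding. A common alternative is to weight the average by the attention mask. The sketch below is one possible way to do that, not what the cells in this notebook ran; the helper name `masked_mean_pool` is hypothetical, and the tensors are synthetic so the example runs without downloading a model.

```python
import torch

def masked_mean_pool(last_hidden_state, attention_mask):
    """Average token embeddings, counting only positions where attention_mask == 1."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                   # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1e-9)                         # (batch, 1), avoids division by zero
    return summed / counts

# Synthetic example: 2 sentences, 3 positions, hidden size 2
hidden = torch.tensor([[[1.0, 1.0], [3.0, 3.0], [100.0, 100.0]],
                       [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]]])
mask = torch.tensor([[1, 1, 0],
                     [1, 1, 1]])
pooled = masked_mean_pool(hidden, mask)
print(pooled)  # row 0 ignores the padded position: [2., 2.]; row 1: [4., 2.]
```

Inside `encode_sentences`, the call would be `masked_mean_pool(outputs.last_hidden_state, attention_mask)`. The printed embeddings in this notebook come from the unmasked mean, so they would differ slightly for padded sentences.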

In [7]:
# Sample sentences for testing the implementation
sentences = ["Hi I am a test sentence.", "Yet another sentence for testing with Jainam."]

In [8]:
sentences

['Hi I am a test sentence.', 'Yet another sentence for testing with Jainam.']

In [9]:
# Encode the sentences
embeddings = encode_sentences(sentences, tokenizer, model, device)

In [10]:
embeddings

tensor([[ 0.0596,  0.0481,  0.2742,  ...,  0.0055,  0.1021, -0.0095],
        [ 0.3788, -0.0257, -0.2359,  ...,  0.1561, -0.1965, -0.3291]])

In [11]:
# Print the embeddings
print(embeddings)

tensor([[ 0.0596,  0.0481,  0.2742,  ...,  0.0055,  0.1021, -0.0095],
        [ 0.3788, -0.0257, -0.2359,  ...,  0.1561, -0.1965, -0.3291]])


# Test the Implementation
Encode a few sentences and display them.

In [12]:
# Sample sentences for testing
sample_sentences = [
    "You, me, or nobody is gonna hit as hard as life.",
    "Artificial intelligence is transforming the world.",
    "Now if you know what you're worth then go out and get what you're worth."
]

In [13]:
sample_sentences


['You, me, or nobody is gonna hit as hard as life.',
 'Artificial intelligence is transforming the world.',
 "Now if you know what you're worth then go out and get what you're worth."]

In [14]:
# Encode the sample sentences
sample_embeddings = encode_sentences(sample_sentences, tokenizer, model, device)

In [15]:
# Print the embeddings for the sample sentences
for i, sentence in enumerate(sample_sentences):
    print(f"Sentence: {sentence}")
    print(f"Embedding: {sample_embeddings[i]}\n")

Sentence: You, me, or nobody is gonna hit as hard as life.
Embedding: tensor([ 2.2176e-01,  2.2432e-01,  3.0695e-01,  1.1605e-01, -1.1473e-01,
        -6.4779e-02,  4.6980e-01,  8.0019e-01, -3.6654e-01, -3.0796e-01,
         5.0026e-02, -4.6094e-02, -1.4343e-01,  3.5418e-01, -1.2534e-01,
         4.2917e-01,  8.6256e-02, -1.3424e-01, -1.3792e-02,  6.0837e-01,
        -1.1695e-01,  3.4180e-01, -1.1920e-01, -2.2651e-02,  1.7207e-01,
        -5.4144e-02, -1.5742e-01, -1.4741e-02, -3.3203e-01, -1.9860e-01,
         1.6375e-01, -2.3867e-01, -2.0066e-01, -1.5885e-01,  2.5050e-01,
         2.3776e-01,  2.5012e-02, -1.1694e-01,  1.4643e-01, -1.6711e-01,
        -2.9388e-01, -1.9025e-01, -1.9884e-01,  6.5989e-03, -7.1497e-02,
        -4.7734e-01,  1.9187e-01,  1.7124e-01,  3.5785e-02, -1.8495e-01,
        -1.5227e-01,  8.4010e-02, -3.5321e-01, -2.4385e-01,  2.6484e-02,
         1.0005e-02,  2.6120e-01, -5.8567e-01, -3.4005e-01,  3.7883e-01,
         4.8994e-03,  1.0925e-01,  1.9755e-01, -3.1815
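
Raw embedding values are hard to judge by eye. One quick sanity check, sketched here under the assumption that the embeddings form a tensor of shape `(n_sentences, hidden_size)`, is to confirm the shape and compare sentences by cosine similarity. The helper `cosine_similarity_matrix` and the toy vectors are illustrative stand-ins for `sample_embeddings`.

```python
import torch
import torch.nn.functional as F

def cosine_similarity_matrix(embeddings):
    """Return the (n, n) matrix of pairwise cosine similarities."""
    normalized = F.normalize(embeddings, p=2, dim=1)  # unit-length rows
    return normalized @ normalized.T

# Stand-in vectors; in the notebook this would be `sample_embeddings`
toy_embeddings = torch.tensor([[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
similarity = cosine_similarity_matrix(toy_embeddings)
print(toy_embeddings.shape)  # (n_sentences, embedding_size)
print(similarity)
```

For `bert-base-uncased` the embedding size is 768, so `sample_embeddings.shape` should be `(3, 768)` regardless of sentence length.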

# Task 2: Multi-Task Learning Expansion


Expand the sentence transformer to handle multi-task learning for:

**Task A: Sentence Classification** – Classify sentences into predefined classes.<br/>
**Task B: Sentiment Analysis** – Classify sentences into different sentiment categories. (I have chosen sentiment analysis.)

## Define the Multi-Task Model Architecture

Add task-specific heads on top of the transformer model while keeping the transformer backbone shared between the tasks.


## Implement the Multi-Task Model

Define a custom model class and implement a forward pass that handles both tasks.

In [16]:
import torch.nn as nn
from transformers import BertModel

class MultiTaskModel(nn.Module):
    def __init__(self, transformer, num_classes, num_sentiments):
        super(MultiTaskModel, self).__init__()
        self.transformer = transformer
        self.classification_head = nn.Linear(transformer.config.hidden_size, num_classes)
        self.sentiment_head = nn.Linear(transformer.config.hidden_size, num_sentiments)
    
    def forward(self, input_ids, attention_mask, task='classification'):
        # Pass inputs through the transformer model
        outputs = self.transformer(input_ids, attention_mask=attention_mask)
        
        # Mean pooling to get a single vector representation
        pooled_output = outputs.last_hidden_state.mean(dim=1)
        
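        # Return task-specific output
        if task == 'classification':
            return self.classification_head(pooled_output)
        elif task == 'sentiment':
            return self.sentiment_head(pooled_output)
        # Fail loudly instead of silently returning None for an unknown task
        raise ValueError(f"Unknown task: {task!r}")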

# Initialize the model
num_classes = 5  # Example number of classes for classification
num_sentiments = 3  # Example number of sentiment categories

# Load the pre-trained transformer model
transformer_model = BertModel.from_pretrained('bert-base-uncased')
multi_task_model = MultiTaskModel(transformer_model, num_classes, num_sentiments)
multi_task_model.to(device)


MultiTaskModel(
  (transformer): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_aff

## Preparing Data for Both Tasks

Create sample datasets for both sentence classification and sentiment analysis.

In [17]:
# Sample data for sentence classification
classification_sentences = ["I love painting.", "The movie was pathetic.", "What a beautiful day!", "I hate waiting in lines.", "The book was fascinating."]
classification_labels = [1, 0, 1, 0, 1]  # Example labels (e.g., 1 for positive, 0 for negative)

# Sample data for sentiment analysis
sentiment_sentences = ["This product is amazing!", "I'm very disappointed.", "It's okay, not great.", "Absolutely fantastic!", "Really bad experience."]
sentiment_labels = [2, 0, 1, 2, 0]  # Example sentiment labels (e.g., 2 for positive, 1 for neutral, 0 for negative)

# Tokenize the sentences
classification_inputs = tokenizer(classification_sentences, return_tensors='pt', padding=True, truncation=True, max_length=128)
sentiment_inputs = tokenizer(sentiment_sentences, return_tensors='pt', padding=True, truncation=True, max_length=128)

# Move inputs to the appropriate device
classification_input_ids = classification_inputs['input_ids'].to(device)
classification_attention_mask = classification_inputs['attention_mask'].to(device)
sentiment_input_ids = sentiment_inputs['input_ids'].to(device)
sentiment_attention_mask = sentiment_inputs['attention_mask'].to(device)

# Convert labels to tensors
classification_labels = torch.tensor(classification_labels).to(device)
sentiment_labels = torch.tensor(sentiment_labels).to(device)


## Test the Multi-Task Model
We'll test the multi-task model with the sample data and verify the outputs. 

In [18]:
# Forward pass for classification task
classification_outputs = multi_task_model(classification_input_ids, classification_attention_mask, task='classification')
print("Classification Outputs:")
print(classification_outputs)



Classification Outputs:
tensor([[ 0.3300,  0.0477, -0.1783,  0.2054, -0.2476],
        [ 0.0081, -0.0287, -0.0305,  0.0980, -0.3529],
        [ 0.3278, -0.2119, -0.1304,  0.0764, -0.4244],
        [ 0.1804,  0.0248,  0.0760,  0.2767, -0.2558],
        [ 0.0069, -0.1111, -0.0364,  0.0935, -0.4658]],
       grad_fn=<AddmmBackward0>)


In [19]:
# Forward pass for sentiment analysis task
sentiment_outputs = multi_task_model(sentiment_input_ids, sentiment_attention_mask, task='sentiment')
print("Sentiment Analysis Outputs:")
print(sentiment_outputs)


Sentiment Analysis Outputs:
tensor([[ 0.2429, -0.2338,  0.0498],
        [-0.1002, -0.0045, -0.2453],
        [ 0.1496, -0.4888, -0.1407],
        [ 0.1473, -0.1358,  0.0823],
        [ 0.0780, -0.2310, -0.1272]], grad_fn=<AddmmBackward0>)
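
The heads return raw logits. A minimal sketch of turning them into class probabilities and predicted labels is shown below, using the sentiment logits printed above. Because the heads are randomly initialized and have not been trained yet, these predictions are not meaningful; the point is only to check that the output pipeline works.

```python
import torch

# Logits copied from the sentiment output shown above
logits = torch.tensor([[ 0.2429, -0.2338,  0.0498],
                       [-0.1002, -0.0045, -0.2453],
                       [ 0.1496, -0.4888, -0.1407],
                       [ 0.1473, -0.1358,  0.0823],
                       [ 0.0780, -0.2310, -0.1272]])

probabilities = torch.softmax(logits, dim=1)  # each row sums to 1
predicted = probabilities.argmax(dim=1)       # index of the most likely class per sentence
print(probabilities)
print(predicted)
```

The same two lines apply to `classification_outputs`, which has five logits per sentence instead of three.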


# Task 3: Training Considerations
##### Discuss the implications and advantages of each scenario and explain your rationale as to how the model should be trained given the following:<br/>
If the entire network should be frozen.<br/>
If only the transformer backbone should be frozen.<br/>
If only one of the task-specific heads (either for Task A or Task B) should be frozen.<br/><br/>
##### Consider a scenario where transfer learning can be beneficial. Explain how you would approach the transfer learning process, including:<br/>
The choice of a pre-trained model.<br/>
The layers you would freeze/unfreeze.<br/>
The rationale behind these choices.<br/>
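
The three freezing scenarios can be expressed through `requires_grad`. The sketch below uses a tiny stand-in model (`TinyMultiTask`, with hypothetical layer sizes) that only mirrors the attribute names of `MultiTaskModel`, so it runs without loading BERT. The helpers `set_trainable` and `count_trainable` are illustrative names, not part of any library.

```python
import torch.nn as nn

class TinyMultiTask(nn.Module):
    """Stand-in with the same attribute names as MultiTaskModel (hypothetical sizes)."""
    def __init__(self):
        super().__init__()
        self.transformer = nn.Linear(8, 8)          # placeholder for the BERT backbone
        self.classification_head = nn.Linear(8, 5)  # Task A head
        self.sentiment_head = nn.Linear(8, 3)       # Task B head

def set_trainable(module, trainable):
    """Freeze (False) or unfreeze (True) every parameter of a module."""
    for param in module.parameters():
        param.requires_grad = trainable

def count_trainable(model):
    """Number of scalar parameters that will receive gradient updates."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

model = TinyMultiTask()

# Scenario 1: freeze the entire network (nothing is updated)
set_trainable(model, False)
frozen_all = count_trainable(model)

# Scenario 2: freeze only the backbone, train both heads
set_trainable(model, True)
set_trainable(model.transformer, False)
backbone_frozen = count_trainable(model)

# Scenario 3: freeze one head (here Task A), train the backbone and the other head
set_trainable(model, True)
set_trainable(model.classification_head, False)
one_head_frozen = count_trainable(model)

print(frozen_all, backbone_frozen, one_head_frozen)
```

One thing to keep in mind for the third scenario: the backbone is shared, so updating it for Task B also changes the features that the frozen Task A head receives.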

## Example Implementation for Transfer Learning:

In [20]:
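import torch
import torch.nn as nn
from torch.optim import AdamW  # transformers' own AdamW is deprecated in favor of this one
from torch.utils.data import DataLoader, TensorDataset
from transformers import BertModel, BertTokenizer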



### Set Up the Environment and Model

In [21]:
# Initialize the model, tokenizer, and device
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
transformer_model = BertModel.from_pretrained('bert-base-uncased')

num_classes = 5  # Example number of classes for classification
num_sentiments = 3  # Example number of sentiment categories

class MultiTaskModel(nn.Module):
    def __init__(self, transformer, num_classes, num_sentiments):
        super(MultiTaskModel, self).__init__()
        self.transformer = transformer
        self.classification_head = nn.Linear(transformer.config.hidden_size, num_classes)
        self.sentiment_head = nn.Linear(transformer.config.hidden_size, num_sentiments)
    
    def forward(self, input_ids, attention_mask, task='classification'):
        outputs = self.transformer(input_ids, attention_mask=attention_mask)
        pooled_output = outputs.last_hidden_state.mean(dim=1)
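        if task == 'classification':
            return self.classification_head(pooled_output)
        elif task == 'sentiment':
            return self.sentiment_head(pooled_output)
        # Fail loudly instead of silently returning None for an unknown task
        raise ValueError(f"Unknown task: {task!r}")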

multi_task_model = MultiTaskModel(transformer_model, num_classes, num_sentiments)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
multi_task_model.to(device)



## Sample Data and DataLoader

In [22]:
# Sample data for sentence classification
classification_sentences = ["I love painting.", "The movie was pathetic.", "What a beautiful day!", "I hate waiting in lines.", "The book was fascinating."]
classification_labels = [1, 0, 1, 0, 1]

# Sample data for sentiment analysis
sentiment_sentences = ["This product is amazing!", "I'm very disappointed.", "It's okay, not great.", "Absolutely fantastic!", "Really bad experience."]
sentiment_labels = [2, 0, 1, 2, 0]

# Tokenize the sentences
classification_inputs = tokenizer(classification_sentences, return_tensors='pt', padding=True, truncation=True, max_length=128)
sentiment_inputs = tokenizer(sentiment_sentences, return_tensors='pt', padding=True, truncation=True, max_length=128)

# Convert inputs and labels to tensors
classification_input_ids = classification_inputs['input_ids']
classification_attention_mask = classification_inputs['attention_mask']
classification_labels = torch.tensor(classification_labels)

sentiment_input_ids = sentiment_inputs['input_ids']
sentiment_attention_mask = sentiment_inputs['attention_mask']
sentiment_labels = torch.tensor(sentiment_labels)

# Create TensorDatasets and DataLoaders
classification_dataset = TensorDataset(classification_input_ids, classification_attention_mask, classification_labels)
sentiment_dataset = TensorDataset(sentiment_input_ids, sentiment_attention_mask, sentiment_labels)

classification_loader = DataLoader(classification_dataset, batch_size=2, shuffle=True)
sentiment_loader = DataLoader(sentiment_dataset, batch_size=2, shuffle=True)


## Defining the Training Loop

In [23]:
import torch.nn as nn

# Define training parameters
num_epochs = 3  # Number of epochs
learning_rate = 1e-4

# Define loss functions
criterion_classification = nn.CrossEntropyLoss()
criterion_sentiment = nn.CrossEntropyLoss()

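# Freeze the transformer backbone so that only the task-specific heads are trained first
for param in multi_task_model.transformer.parameters():
    param.requires_grad = False

# Optimizer over the parameters that are still trainable (the two heads)
optimizer = AdamW([p for p in multi_task_model.parameters() if p.requires_grad], lr=learning_rate)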

# Training loop for task-specific heads
multi_task_model.train()

for epoch in range(num_epochs):
    # Training for classification task
    for input_ids, attention_mask, labels in classification_loader:
        input_ids, attention_mask, labels = input_ids.to(device), attention_mask.to(device), labels.to(device)
        
        optimizer.zero_grad()
        outputs = multi_task_model(input_ids, attention_mask, task='classification')
        loss = criterion_classification(outputs, labels)
        loss.backward()
        optimizer.step()
    
    # Training for sentiment analysis task
    for input_ids, attention_mask, labels in sentiment_loader:
        input_ids, attention_mask, labels = input_ids.to(device), attention_mask.to(device), labels.to(device)
        
        optimizer.zero_grad()
        outputs = multi_task_model(input_ids, attention_mask, task='sentiment')
        loss = criterion_sentiment(outputs, labels)
        loss.backward()
        optimizer.step()

print("Initial training for task-specific heads completed.")

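# Make the top four encoder layers (closest to the heads) trainable for further fine-tuning;
# they hold the most task-specific features, while lower layers capture more general ones
for layer in multi_task_model.transformer.encoder.layer[-4:]:
    for param in layer.parameters():
        param.requires_grad = True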

# Lower the learning rate for fine-tuning
optimizer = AdamW(multi_task_model.parameters(), lr=1e-5)

# Fine-tuning loop
multi_task_model.train()

for epoch in range(num_epochs):
    # Fine-tuning for classification task
    for input_ids, attention_mask, labels in classification_loader:
        input_ids, attention_mask, labels = input_ids.to(device), attention_mask.to(device), labels.to(device)
        
        optimizer.zero_grad()
        outputs = multi_task_model(input_ids, attention_mask, task='classification')
        loss = criterion_classification(outputs, labels)
        loss.backward()
        optimizer.step()
    
    # Fine-tuning for sentiment analysis task
    for input_ids, attention_mask, labels in sentiment_loader:
        input_ids, attention_mask, labels = input_ids.to(device), attention_mask.to(device), labels.to(device)
        
        optimizer.zero_grad()
        outputs = multi_task_model(input_ids, attention_mask, task='sentiment')
        loss = criterion_sentiment(outputs, labels)
        loss.backward()
        optimizer.step()

print("Fine-tuning completed.")




Initial training for task-specific heads completed.
Fine-tuning completed.


## Task 4: Layer-wise Learning Rate Implementation (BONUS) 


Implement layer-wise learning rates for the multi-task sentence transformer.<br/>
Explain the rationale for the specific learning rates you've set for each layer.

# Define Layer-wise Learning Rates
Layer-wise learning rates allow us to assign different learning rates to different parts of the model. Typically, lower layers (closer to the input) receive lower learning rates, while higher layers (closer to the output) receive higher learning rates. This is because lower layers capture more general features that should remain relatively stable, while higher layers capture task-specific features that might need more adaptation.

Layer-wise Learning Rate Configuration: 

In [24]:
# Define different learning rates for different layers
learning_rate_base = 1e-5
layer_learning_rates = {
    'transformer.encoder.layer.0': learning_rate_base,
    'transformer.encoder.layer.1': learning_rate_base,
    'transformer.encoder.layer.2': learning_rate_base,
    'transformer.encoder.layer.3': learning_rate_base,
    'transformer.encoder.layer.4': learning_rate_base * 2,
    'transformer.encoder.layer.5': learning_rate_base * 2,
    'transformer.encoder.layer.6': learning_rate_base * 2,
    'transformer.encoder.layer.7': learning_rate_base * 2,
    'transformer.encoder.layer.8': learning_rate_base * 3,
    'transformer.encoder.layer.9': learning_rate_base * 3,
    'transformer.encoder.layer.10': learning_rate_base * 3,
    'transformer.encoder.layer.11': learning_rate_base * 3,
    'classification_head': learning_rate_base * 5,
    'sentiment_head': learning_rate_base * 5,
}


## Modify the Optimizer to Use Layer-wise Learning Rates
To implement layer-wise learning rates, we need to group the model parameters by layer and assign the specified learning rates.

In [25]:
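def get_layerwise_lr_params(model, layer_learning_rates):
    params = []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        lr = None
        for layer_name in layer_learning_rates:
            # Match on the full dotted prefix so that 'layer.1' does not also match 'layer.10' and 'layer.11'
            if name.startswith(layer_name + '.'):
                lr = layer_learning_rates[layer_name]
                break
        if lr is None:
            # Parameters not listed in the dictionary (e.g. the embeddings) fall back to the base rate
            lr = learning_rate_base
        params.append({'params': param, 'lr': lr})
    return params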

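# Make every parameter trainable so that each layer receives its configured rate
for param in multi_task_model.parameters():
    param.requires_grad = True

# Create parameter groups with specified learning rates
layerwise_params = get_layerwise_lr_params(multi_task_model, layer_learning_rates)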
optimizer = AdamW(layerwise_params)
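
One subtlety when matching parameter names by prefix: `'transformer.encoder.layer.1'` is also a string prefix of `'transformer.encoder.layer.10'` and `'transformer.encoder.layer.11'`, so a bare `startswith` check can hand the upper layers the rate meant for layer 1. The sketch below shows one way to avoid that by matching on the prefix plus a trailing dot. The helper `resolve_lr` is a hypothetical name, and the dictionary is a shortened version of the configuration above.

```python
def resolve_lr(param_name, layer_learning_rates, default_lr):
    """Return the rate of the entry whose key is a full dotted prefix of param_name."""
    for layer_name, lr in layer_learning_rates.items():
        if param_name == layer_name or param_name.startswith(layer_name + "."):
            return lr
    return default_lr  # parameters not listed fall back to the default rate

base = 1e-5
rates = {
    "transformer.encoder.layer.1": base,
    "transformer.encoder.layer.10": base * 3,
    "classification_head": base * 5,
}

# The naive check would treat a layer-10 parameter as belonging to layer 1
naive_match = "transformer.encoder.layer.10.output.dense.weight".startswith("transformer.encoder.layer.1")
print(naive_match)

print(resolve_lr("transformer.encoder.layer.1.output.dense.weight", rates, base))
print(resolve_lr("transformer.encoder.layer.10.output.dense.weight", rates, base))
print(resolve_lr("classification_head.weight", rates, base))
print(resolve_lr("transformer.embeddings.word_embeddings.weight", rates, base))
```

After building the optimizer, looping over `optimizer.param_groups` and printing each group's `'lr'` is a quick way to confirm that the rates landed where intended.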


## Training Loop with Layer-wise Learning Rates

In [26]:
num_epochs = 3  # Define number of epochs

# Training loop with layer-wise learning rates
multi_task_model.train()

for epoch in range(num_epochs):
    # Training for classification task
    for input_ids, attention_mask, labels in classification_loader:
        input_ids, attention_mask, labels = input_ids.to(device), attention_mask.to(device), labels.to(device)
        
        optimizer.zero_grad()
        outputs = multi_task_model(input_ids, attention_mask, task='classification')
        loss = criterion_classification(outputs, labels)
        loss.backward()
        optimizer.step()
    
    # Training for sentiment analysis task
    for input_ids, attention_mask, labels in sentiment_loader:
        input_ids, attention_mask, labels = input_ids.to(device), attention_mask.to(device), labels.to(device)
        
        optimizer.zero_grad()
        outputs = multi_task_model(input_ids, attention_mask, task='sentiment')
        loss = criterion_sentiment(outputs, labels)
        loss.backward()
        optimizer.step()

print("Training with layer-wise learning rates completed.")


Training with layer-wise learning rates completed.


# Rationale:

**Lower Layers (0-3)**: The lowest learning rate (1x base), so that fine-tuning does not overwrite what pre-training learned. These layers capture general linguistic features that are broadly applicable and should remain stable.<br/>
**Middle Layers (4-7)**: Slightly higher learning rates (2x base) as these layers start to capture more task-specific information that may need more fine-tuning.<br/>
**Upper Layers (8-11)**: Higher learning rates (3x base) as these layers capture the most task-specific features and benefit from more adaptation to the new tasks.<br/>
**Task-specific Heads**: Highest learning rates (5x base) because these layers are entirely new and need significant tuning to learn the task-specific mappings.
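
As a quick arithmetic check, the multipliers above translate into the following effective rates when the base rate is `1e-5`, the value used in the configuration cell. The group labels are just descriptive strings for this sketch.

```python
learning_rate_base = 1e-5

# Multipliers taken from the rationale above
multipliers = {
    "lower layers (0-3)": 1,
    "middle layers (4-7)": 2,
    "upper layers (8-11)": 3,
    "task-specific heads": 5,
}

effective_rates = {group: learning_rate_base * m for group, m in multipliers.items()}
for group, lr in effective_rates.items():
    print(f"{group:<22} {lr:.0e}")
```

The heads therefore train at `5e-05`, five times faster than the lowest encoder layers.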

# Discuss the Potential Benefits of Layer-wise Learning Rates


**Benefits:**

**Better Fine-Tuning:** Each part of the network adapts at a rate suited to its role, which can improve performance compared with a single global learning rate.<br/>
**Prevent Overfitting:** Helps prevent overfitting in lower layers while allowing higher layers to adapt more readily to task-specific nuances.<br/>
**Efficiency:** Optimizes the training process by focusing learning where it's most needed, leading to potentially faster convergence and better overall performance.<br/>
**Multi-Task Setting:** In a multi-task setting, layer-wise learning rates ensure that the shared layers (transformer backbone) adapt properly while the task-specific heads receive enough focus to learn the distinct tasks effectively.
