## Finetune a Three-Layer Feedforward Neural Network on top of [paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
The weights of the pre-trained model are frozen; only the feedforward network on top of it is trained.<br>
Loss function used: `MultipleNegativesRankingLoss`
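
To illustrate what a multiple-negatives ranking loss computes, here is a minimal pure-Python sketch. It assumes the common formulation: each query is scored against every positive in the batch by scaled cosine similarity, and cross-entropy is taken with the query's own positive as the target. The function names and the `scale=20.0` value are assumptions made for this illustration, not taken from this notebook or from the library's source.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multiple_negatives_ranking_loss(
    queries: List[Sequence[float]],
    positives: List[Sequence[float]],
    scale: float = 20.0,  # assumed value, chosen for illustration
) -> float:
    """Mean cross-entropy where positives[i] is the target for queries[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, query in enumerate(queries):
        logits = [scale * cosine(query, p) for p in positives]
        # Numerically stable log-sum-exp over the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(z - max_logit) for z in logits)
        )
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: each query matches its own positive exactly...
aligned = multiple_negatives_ranking_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])
# ...versus each query matching the *other* query's positive.
swapped = multiple_negatives_ranking_loss([[1, 0], [0, 1]], [[0, 1], [1, 0]])
print(aligned < swapped)
```

Because the other items in the batch serve as negatives, the batch size also determines how many negatives each query is contrasted with.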

In [1]:
import torch
from typing import Any, List, Optional, Tuple
from llama_index.core import SimpleDirectoryReader
from llama_index.core.base.embeddings.base import BaseEmbedding
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.huggingface.base import HuggingFaceEmbedding
from llama_index.embeddings.huggingface.pooling import Pooling
from llama_index.finetuning import EmbeddingAdapterFinetuneEngine
from llama_index.finetuning.embeddings.adapter_utils import BaseAdapter
from llama_index.core.evaluation import EmbeddingQAFinetuneDataset
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(0)
torch.manual_seed(0)
%matplotlib inline



## Load dataset

In [2]:
train_dataset = EmbeddingQAFinetuneDataset.from_json("data/train_dataset.json")

## Load the model and finetune for 32 epochs

In [None]:
# requires torch dependency
# from llama_index.embeddings.adapter.utils import TwoLayerNN
from llama_index.core.embeddings import resolve_embed_model
from llama_index.embeddings.adapter import AdapterEmbeddingModel
from typing import Dict
from utils import CustomNN

In [4]:
model_name = "sentence-transformers/paraphrase-mpnet-base-v2"
base_embed_model = resolve_embed_model(f"local:{model_name}")

INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: sentence-transformers/paraphrase-mpnet-base-v2
INFO:sentence_transformers.SentenceTransformer:2 prompts are loaded, with the keys: ['query', 'text']


In [12]:
adapter_model = CustomNN(
    in_features=768,
    hidden_dim_1=1024,
    hidden_dim_2=1024,
    out_features=768,
    add_residual=True,
    dropout=0.1,
)
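
`CustomNN` is imported from a local `utils` module that is not shown here, so its exact layout is unknown. As a rough size estimate, the sketch below assumes it stacks three fully connected layers (768 → 1024 → 1024 → 768), each with a bias term, and counts only those weights. The helper `linear_params` is hypothetical, and any extra parameters (for example a learnable residual weight) are ignored.

```python
from typing import List


def linear_params(in_features: int, out_features: int, bias: bool = True) -> int:
    """Parameter count of one fully connected layer (weights plus optional bias)."""
    return in_features * out_features + (out_features if bias else 0)


# Assumed layer widths, read off the constructor arguments above.
layer_sizes: List[int] = [768, 1024, 1024, 768]

total_params = sum(
    linear_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
)
fp32_mib = total_params * 4 / 2**20  # 4 bytes per float32 parameter

print(f"{total_params:,} parameters, about {fp32_mib:.1f} MiB in float32")
```

Under these assumptions the adapter has roughly 2.6 million trainable parameters, which is small compared with the frozen base model.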

In [None]:
finetune_engine = EmbeddingAdapterFinetuneEngine(
    train_dataset,
    base_embed_model,
    model_output_path="mpnet_big_ft_ep32",
    # model_checkpoint_path="model5_ck",
    adapter_model=adapter_model,
    epochs=32,
    verbose=False,
    device="cuda",
    batch_size=32,
)



In [None]:
finetune_engine.finetune()


**Note:** Training uses 1044 MiB of GPU memory and takes 23465 s (6 h 31 m 5 s) for 32 epochs.<br>
Time per epoch: about 733 s (12 m 13 s)
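
The timing figures in the note can be reproduced with a few lines of arithmetic. This is a small sketch; `format_duration` is a helper written for this illustration only.

```python
def format_duration(total_seconds: int) -> str:
    """Render a whole number of seconds as hours, minutes and seconds."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}h {minutes}m {seconds}s"


total_seconds = 23465  # total training time reported in the note
epochs = 32

# Integer division drops the fractional part of a second.
seconds_per_epoch = total_seconds // epochs

print(format_duration(total_seconds))
print(format_duration(seconds_per_epoch))
```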