# Model Merging through Tensor Arithmetic
Model merging has recently emerged as a prominent paradigm for combining finetuned models and getting the best of their individual abilities.

This library implements various merging techniques by treating merging operations as arithmetic on tensors. Through the `ModelTensorMapper` abstraction, it maps an `nn.Module` to a `torch.Tensor` by simply flattening and concatenating its parameters.
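
To make the flatten-and-concatenate idea concrete, here is a minimal sketch that uses only PyTorch's built-in `parameters_to_vector` and `vector_to_parameters` utilities. It is not the `ModelTensorMapper` implementation, which may differ (for instance in how it treats buffers); the tiny `nn.Sequential` model is a made-up stand-in.

```python
import torch
from torch import nn
from torch.nn.utils import parameters_to_vector, vector_to_parameters

# A tiny stand-in model (hypothetical; any nn.Module would do).
model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))

# Flatten every parameter and concatenate them into a single 1-D tensor.
flat = parameters_to_vector(model.parameters()).detach().clone()
n_params = sum(p.numel() for p in model.parameters())
print(flat.shape, n_params)

# Any tensor arithmetic now acts on the whole model at once.
scaled = 0.5 * flat

# Write the edited vector back into a module with the same architecture.
with torch.no_grad():
    vector_to_parameters(scaled, model.parameters())

restored = parameters_to_vector(model.parameters()).detach()
```

The round trip only works if the target module has exactly the same parameter shapes, in the same order, as the one that was flattened.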

This simple approach allows for common operations on the tensors themselves. To start:

In [1]:
# Load any model
%load_ext autoreload
%autoreload 2
from transformers import pipeline, set_seed

gpt2 = pipeline('text-generation', model='openai-community/gpt2', device='cpu', framework='pt')

from src import ModelTensorMapper
# transform the model to a tensor
mapper = ModelTensorMapper(gpt2.model)

base = mapper.to_tensor(gpt2.model)
base.shape


`base` is just another tensor and can be handled as such. Let us now load some other models and turn them into tensors as well.

In [30]:
legal = mapper.from_hf('umarbutler/open-australian-legal-gpt2', 'text-generation')
recipe = mapper.from_hf('mrm8488/gpt2-finetuned-recipes-cooking', 'text-generation')

Now we can get funky! We can combine these tensors in any way we would like:
- addition
- subtraction
- scalar multiplication / division
- tensor multiplication / division

In [31]:
recipe + legal
base - recipe
base * legal
base / legal
base - 2*(legal - recipe)

tensor([-0.1596, -0.1107,  0.1852,  ...,  0.0649,  0.1611, -0.0301])

But now on to merging: once we have combined the weights as we please, we can map the `Tensor` back to an `nn.Module`, again using the `ModelTensorMapper` class.

In this case we're going to create a model that has knowledge about Australian law **and** cooking by applying the **task vector editing** paradigm from Ilharco et al.

In [32]:
legal_delta = legal - base    # task vector for Australian law
recipe_delta = recipe - base  # task vector for cooking recipes
LAMBDA = 0.5

multilingual = base + LAMBDA*(legal_delta + recipe_delta)/2
multilingual_pipe = mapper.to_pipeline(multilingual, base_pipeline=gpt2)
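
The same task-vector arithmetic can be checked by hand on toy tensors. The sketch below uses made-up three-element vectors in place of real flattened models, so the values are purely illustrative; only the formula mirrors the cell above.

```python
import torch

# Toy stand-ins for flattened models (values are made up for illustration).
base = torch.tensor([1.0, 2.0, 3.0])
legal = torch.tensor([1.0, 4.0, 3.0])    # pretend: finetuned on task A
recipe = torch.tensor([3.0, 2.0, 3.0])   # pretend: finetuned on task B

legal_delta = legal - base     # task vector A
recipe_delta = recipe - base   # task vector B
LAMBDA = 0.5

# Same formula as above: move the base along the averaged task vectors.
merged = base + LAMBDA * (legal_delta + recipe_delta) / 2

# Equivalent view: a weighted average of the three models.
as_average = (1 - LAMBDA) * base + (LAMBDA / 2) * legal + (LAMBDA / 2) * recipe
print(merged, as_average)
```

Written this way, it becomes clear that with `LAMBDA = 0.5` each finetuned model contributes with an effective weight of 0.25, while the base model keeps a weight of 0.5.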

Note that although the original GPT-2 model was unable to answer coherently about food or Australian law, the merged model can now do (almost) both, while also retaining its original capabilities!

In [40]:
prompts = [
    'Good morning! My name is Elizabeth and for breakfast I had',
    'How to make the perfect Omelette. Ingredients:',
    'Section 51 of the Australian Constitution provides that',
]

set_seed(1789)
for prompt in prompts:
    print(prompt)
    print('GPT-2:', gpt2(prompt, max_length=50, num_return_sequences=1, pad_token_id=50256)[0]['generated_text'][len(prompt):])
    print('Multilingual:', multilingual_pipe(prompt, max_length=50, num_return_sequences=1, pad_token_id=50256)[0]['generated_text'][len(prompt):])
    print('=============================\n')

Good morning! My name is Elizabeth and for breakfast I had
GPT-2:  to do a bunch of things…I got this shirt, a new phone and a new hat. Then, this guy is on the phone, and I'm like…what? Okay.
Multilingual:  the good fortune of meeting you on your first street and at the end of a small party with an excellently dressed woman of your choice who had a lovely and sweet face: of good

How to make the perfect Omelette. Ingredients:
GPT-2:  2 cups flour (I have not been able to find a comparable gluten free recipe for this filling), salt, pepper (1 cup gluten free flour), 1/2 tsp baking powder (optional)
Multilingual: 
1 cup white sugar - 1 egg for each 1 oz of cheese
1 cup milk
1-3 egg white
1 cup white flour
1 medium onion
1-1 cup flour

Section 51 of the Australian Constitution provides that
GPT-2:  "all persons shall exercise free and adequate rights and protection, against the state, the judiciary, civil, administrative or other courts, in the exercise of their rights under the constitu

Note how simple it was to combine these models! Although this library already implements the most popular merging methods, one of its key features is that it can be extended to implement new merging methods.

## Using the implemented merging methods
This library implements a number of ready-made merging methods; refer to `README.md` to learn more about them.

For example, let's now use the popular TIES method to merge some pretrained models.
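
For intuition, the three TIES steps (trim, elect sign, disjoint merge) can be sketched directly on flat task vectors. This is a rough illustration under simplifying assumptions, not this library's implementation: `ties_merge`, `density` and `lam` are hypothetical names, and the toy inputs are made up.

```python
import torch

def ties_merge(base, finetuned, density=0.2, lam=1.0):
    """Illustrative TIES-style merge on flat 1-D tensors (hypothetical helper)."""
    # One task vector per finetuned model, stacked as (n_models, n_params).
    deltas = torch.stack([ft - base for ft in finetuned])

    # 1) Trim: per task vector, keep only the largest-magnitude entries.
    k = max(1, int(density * deltas.shape[1]))
    threshold = deltas.abs().topk(k, dim=1).values[:, -1:]  # k-th largest per row
    trimmed = torch.where(deltas.abs() >= threshold, deltas, torch.zeros_like(deltas))

    # 2) Elect sign: per parameter, the sign carrying more total mass wins.
    elected = torch.sign(trimmed.sum(dim=0))

    # 3) Disjoint merge: average only the entries agreeing with the elected sign.
    agree = (torch.sign(trimmed) == elected) & (trimmed != 0)
    counts = agree.sum(dim=0).clamp(min=1)  # avoid division by zero
    merged_delta = (trimmed * agree).sum(dim=0) / counts

    return base + lam * merged_delta

# Toy usage with made-up values.
toy_base = torch.zeros(4)
model_a = torch.tensor([1.0, -2.0, 0.1, 0.0])
model_b = torch.tensor([3.0, 2.5, -0.1, 0.0])
toy_merged = ties_merge(toy_base, [model_a, model_b], density=0.5)
print(toy_merged)
```

In the toy run, the second parameter has conflicting signs across the two task vectors, so only the value with the elected (positive) sign contributes, while the small entries in the third position are trimmed away entirely.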