Off the shelf generation from trained hyperformer++ #2

Open
tuhinjubcse opened this issue Jul 15, 2022 · 6 comments

tuhinjubcse commented Jul 15, 2022

from hyperformer.adapters import AdapterController, AutoAdapterConfig, MetaAdapterConfig
from hyperformer.third_party.models import T5Config, T5ForConditionalGeneration
from transformers import AutoTokenizer, set_seed
import os
os.environ["CUDA_VISIBLE_DEVICES"]="0"

set_seed(42)
config = T5Config.from_pretrained('t5-3b', cache_dir="/local/nlpswordfish/tuhin/")
tokenizer = AutoTokenizer.from_pretrained('t5-3b', cache_dir="/local/nlpswordfish/tuhin/")

adapter_config = AutoAdapterConfig.get('meta-adapter')


#####################
# NOTE: data_args / training_args below come from the training script's argument
# dataclasses and are not defined in this standalone snippet -- this is the part
# I need help with.
adapter_config.input_dim = 1024
adapter_config.tasks = data_args.tasks
adapter_config.device = training_args.device
adapter_config.task_to_adapter = {task: adapter for task, adapter in zip(data_args.tasks, data_args.adapters)} if data_args.adapters is not None else None
adapter_config.task_to_embeddings = {task: embedding for task, embedding in zip(data_args.tasks, data_args.task_embeddings)} if data_args.task_embeddings is not None else None
######################

# adapter_args also comes from the training script's argument parsing.
extra_adapter_params = (
    "task_embedding_dim", "add_layer_norm_before_adapter", "add_layer_norm_after_adapter",
    "reduction_factor", "hidden_dim", "non_linearity", "train_task_embeddings",
    "projected_task_embedding_dim", "task_hidden_dim", "conditional_layer_norm",
    "train_adapters_blocks", "unique_hyper_net", "unique_hyper_net_layer_norm",
    "efficient_unique_hyper_net",
)

for p in extra_adapter_params:
    if hasattr(adapter_args, p) and hasattr(adapter_config, p):
        setattr(adapter_config, p, getattr(adapter_args, p))



model = T5ForConditionalGeneration.from_pretrained(
    "/mnt/swordfish-datastore/tuhin/hyperformer++",
    from_tf=False,
    config=config,
    cache_dir="/local/nlpswordfish/tuhin/",
    adapter_config=adapter_config,
)
model.cuda()
inputs = tokenizer.encode("it 's a charming and often affecting journey .", return_tensors="pt")

gen_kwargs = {"max_length": 256, "num_beams": 1}
gen_kwargs["task"] = "sst"
# `self` does not exist outside the trainer class; use the local config/adapter_config instead.
gen_kwargs["task_embedding"] = model.task_embedding_controller("sst") if (config.train_adapters and isinstance(adapter_config, MetaAdapterConfig)) else None
outputs = model.generate(input_ids=inputs.cuda(), **gen_kwargs)
answer = tokenizer.decode(outputs[0], skip_special_tokens=True)


print("Predicted output", answer)
tuhinjubcse (Author) commented:

@rabeehk can you help me with the required parameters inside the ##### block?

Do we need these for inference?

adapter_config.tasks = data_args.tasks
adapter_config.device = training_args.device

rabeehk (Owner) commented Jul 15, 2022

Hi @tuhinjubcse,
As far as I remember, task_to_adapter specifies a mapping from the inference tasks to the trained adapters. For instance, assume you train the model with adapters [X, Y, Z], so during training you have those tasks, and then you want to run inference on tasks [F, G]. Here I was specifying how F and G should be mapped to the available trained adapters. For instance, if we have:
data_args.tasks = [F, G]
data_args.adapters = [X, Z]
it means the following adapters are used for each task during inference:
task_to_adapter = {F: X, G: Z}

Similarly, task_to_embeddings specifies which task embedding to use at inference time. Basically, I added these because the training and inference tasks can be different, and one needs to know the mapping between the inference tasks and the trained adapters/task embeddings.
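
For the snippet in the first comment, that mapping could be built roughly like this (the task and adapter names here are only illustrative placeholders, not values from the repo):

# Hypothetical example: adapters were trained for tasks X, Y, Z,
# and inference is run on tasks F and G.
inference_tasks = ["F", "G"]
adapters_to_reuse = ["X", "Z"]        # which trained adapter serves each inference task
embeddings_to_reuse = ["X", "Z"]      # which trained task embedding serves each inference task

adapter_config.tasks = inference_tasks
adapter_config.task_to_adapter = dict(zip(inference_tasks, adapters_to_reuse))        # {"F": "X", "G": "Z"}
adapter_config.task_to_embeddings = dict(zip(inference_tasks, embeddings_to_reuse))   # {"F": "X", "G": "Z"}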

Best
Rabeeh

tuhinjubcse (Author) commented Jul 17, 2022

Yes, but in
https://github.com/rabeehk/hyperformer/blob/main/hyperformer/configs/hyperformer%2B%2B.json

we only pass data_args.tasks.

Do you know what the values should be for data_args.adapters and data_args.task_embeddings?
It seems data_args.adapters is just an array of task names?

But what should data_args.task_embeddings be, and how do I specify it?
Can you also tell me what value to use for
adapter_config.input_dim = config.d_model

I also have another question: I tried modifying your config to add a dropout of 0.1, but I get

Traceback (most recent call last):
  File "./finetune_t5_trainer.py", line 311, in <module>
    assert hasattr(config, p), f"({config.__class__.__name__}) doesn't have a `{p}` attribute"
AssertionError: (T5Config) doesn't have a `dropout` attribute

rabeehk (Owner) commented Jul 17, 2022

Hi,
I only pass data_args.tasks because, by default, one trains and tests on the same tasks (as in the samples I shared), so there is no need to set data_args.adapters or data_args.task_embeddings in the code. These are only useful if one wants to train on some tasks and test on others, and in that case they should be set as lists of strings. If these values are not passed, defaults are set, I think here:

self.task_to_adapter = {task: task for task in self.tasks}

and here:
if config.task_to_embeddings is not None:

About the dropout: I don't have access to run the code right now, but for debugging you can print the attribute names and make sure the attribute exists. Basically, you need to check the version of T5Config used in this code and make sure the names match.
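
For example, a quick (untested) way to check which dropout-related attribute the loaded config actually has:

# Print dropout-related attribute names on the T5Config used in this repo;
# the HuggingFace-style T5Config normally calls it `dropout_rate` rather than `dropout`.
from hyperformer.third_party.models import T5Config

config = T5Config.from_pretrained("t5-3b")
print([name for name in vars(config) if "drop" in name])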

config.d_model is the embedding dimension of the T5 model; small and large models have different embedding sizes.

task_embeddings is just a list of strings, as defined here:

task_embeddings: Optional[List[str]] = field(
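
Tying this back to the snippet in the first comment, the values in question would look something like this (an illustrative sketch; "sst" is taken from the snippet and assumes the same task is used for training and inference):

# Lists of strings, one entry per inference task.
tasks = ["sst"]
adapters = ["sst"]          # reuse the adapter trained for "sst"
task_embeddings = ["sst"]   # reuse the task embedding trained for "sst"

# input_dim should follow the backbone's embedding size rather than being hard-coded;
# for t5-3b, config.d_model is 1024, which matches the value in the snippet.
adapter_config.input_dim = config.d_model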

tuhinjubcse (Author) commented:

Ok, you are awesome. These seem to be something that will be very useful for zero-shot inference :) where the train and test tasks are different.

rabeehk (Owner) commented Jul 17, 2022

You're very welcome.
