Off the shelf generation from trained hyperformer++ #2

Open
tuhinjubcse opened this issue Jul 15, 2022 · 6 comments

tuhinjubcse commented Jul 15, 2022

from hyperformer.adapters import AdapterController, AutoAdapterConfig, MetaAdapterConfig
from hyperformer.third_party.models import T5Config, T5ForConditionalGeneration
from transformers import AutoTokenizer, set_seed
import os
os.environ["CUDA_VISIBLE_DEVICES"]="0"

set_seed(42)
config = T5Config.from_pretrained('t5-3b', cache_dir="/local/nlpswordfish/tuhin/")
tokenizer = AutoTokenizer.from_pretrained('t5-3b', cache_dir="/local/nlpswordfish/tuhin/")

adapter_config = AutoAdapterConfig.get('meta-adapter')


#####################
# NOTE: data_args / training_args below come from the training script's argument
# dataclasses and are not defined in this standalone snippet -- this is the part
# I need help with.
adapter_config.input_dim = 1024
adapter_config.tasks = data_args.tasks
adapter_config.device = training_args.device
adapter_config.task_to_adapter = {task: adapter for task, adapter in zip(data_args.tasks, data_args.adapters)} if data_args.adapters is not None else None
adapter_config.task_to_embeddings = {task: embedding for task, embedding in zip(data_args.tasks, data_args.task_embeddings)} if data_args.task_embeddings is not None else None
######################

# adapter_args also comes from the training script's argument parsing.
extra_adapter_params = (
    "task_embedding_dim", "add_layer_norm_before_adapter", "add_layer_norm_after_adapter",
    "reduction_factor", "hidden_dim", "non_linearity", "train_task_embeddings",
    "projected_task_embedding_dim", "task_hidden_dim", "conditional_layer_norm",
    "train_adapters_blocks", "unique_hyper_net", "unique_hyper_net_layer_norm",
    "efficient_unique_hyper_net",
)

for p in extra_adapter_params:
    if hasattr(adapter_args, p) and hasattr(adapter_config, p):
        setattr(adapter_config, p, getattr(adapter_args, p))



model = T5ForConditionalGeneration.from_pretrained(
    "/mnt/swordfish-datastore/tuhin/hyperformer++",
    from_tf=False,
    config=config,
    cache_dir="/local/nlpswordfish/tuhin/",
    adapter_config=adapter_config,
)
model.cuda()
inputs = tokenizer.encode("it 's a charming and often affecting journey .", return_tensors="pt")

gen_kwargs = {"max_length": 256, "num_beams": 1}
gen_kwargs["task"] = "sst"
# `self` does not exist outside the trainer class; use the local config/adapter_config instead.
gen_kwargs["task_embedding"] = model.task_embedding_controller("sst") if (config.train_adapters and isinstance(adapter_config, MetaAdapterConfig)) else None
outputs = model.generate(input_ids=inputs.cuda(), **gen_kwargs)
answer = tokenizer.decode(outputs[0], skip_special_tokens=True)


print("Predicted output", answer)
tuhinjubcse (Author) commented:

@rabeehk can you help me with the required parameters inside the ##### block?

Do we need these for inference?

adapter_config.tasks = data_args.tasks
adapter_config.device = training_args.device

rabeehk (Owner) commented Jul 15, 2022

Hi @tuhinjubcse,
As far as I remember, task_to_adapter specifies a mapping from the inference tasks to the trained adapters. For instance, assume you train the model with adapters [X, Y, Z], so during training you have those tasks, and then you want to run inference on tasks [F, G]. Here I was specifying how F and G should be mapped to the available trained adapters. For instance, if we have:
data_args.tasks = [F, G]
data_args.adapters = [X, Z]
it means the following adapters are used for each task during inference:
task_to_adapter = {F: X, G: Z}

Similarly, task_to_embeddings specifies which task embedding to use at inference time. Basically, I added these because the training and inference tasks can be different, and one needs to know the mapping between the inference tasks and the trained adapters/task embeddings.
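
For the snippet in the first comment, that mapping could be built roughly like this (the task and adapter names here are only illustrative placeholders, not values from the repo):

# Hypothetical example: adapters were trained for tasks X, Y, Z,
# and inference is run on tasks F and G.
inference_tasks = ["F", "G"]
adapters_to_reuse = ["X", "Z"]        # which trained adapter serves each inference task
embeddings_to_reuse = ["X", "Z"]      # which trained task embedding serves each inference task

adapter_config.tasks = inference_tasks
adapter_config.task_to_adapter = dict(zip(inference_tasks, adapters_to_reuse))        # {"F": "X", "G": "Z"}
adapter_config.task_to_embeddings = dict(zip(inference_tasks, embeddings_to_reuse))   # {"F": "X", "G": "Z"}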

Best
Rabeeh

tuhinjubcse (Author) commented Jul 17, 2022

Yes, but in
https://github.com/rabeehk/hyperformer/blob/main/hyperformer/configs/hyperformer%2B%2B.json

we only pass data_args.tasks.

Do you know what the values should be for data_args.adapters and data_args.task_embeddings?
It seems data_args.adapters is just an array of task names?

But what should data_args.task_embeddings be, and how do I specify it?
Can you also tell me what value to use for
adapter_config.input_dim = config.d_model

I also have another question: I tried modifying your config to add a dropout of 0.1, but I get

Traceback (most recent call last):
  File "./finetune_t5_trainer.py", line 311, in <module>
    assert hasattr(config, p), f"({config.__class__.__name__}) doesn't have a `{p}` attribute"
AssertionError: (T5Config) doesn't have a `dropout` attribute

rabeehk (Owner) commented Jul 17, 2022

Hi,
I only pass data_args.tasks because, by default, one trains and tests on the same tasks (as in the samples I shared), so there is no need to set data_args.adapters or data_args.task_embeddings in the code. These are only useful if one wants to train on some tasks and test on others, and in that case they should be set as lists of strings. If these values are not passed, defaults are set, I think here:

self.task_to_adapter = {task: task for task in self.tasks}

and here:
if config.task_to_embeddings is not None:

About the dropout: I don't have access to run the code right now, but for debugging you can print the attribute names and make sure the attribute exists. Basically, you need to check the version of T5Config used in this code and make sure the names match.
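
For example, a quick (untested) way to check which dropout-related attribute the loaded config actually has:

# Print dropout-related attribute names on the T5Config used in this repo;
# the HuggingFace-style T5Config normally calls it `dropout_rate` rather than `dropout`.
from hyperformer.third_party.models import T5Config

config = T5Config.from_pretrained("t5-3b")
print([name for name in vars(config) if "drop" in name])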

config.d_model is the embedding dimension of the T5 model; small and large models have different embedding sizes.

task_embeddings is just a list of strings, as defined here:

task_embeddings: Optional[List[str]] = field(
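
Tying this back to the snippet in the first comment, the values in question would look something like this (an illustrative sketch; "sst" is taken from the snippet and assumes the same task is used for training and inference):

# Lists of strings, one entry per inference task.
tasks = ["sst"]
adapters = ["sst"]          # reuse the adapter trained for "sst"
task_embeddings = ["sst"]   # reuse the task embedding trained for "sst"

# input_dim should follow the backbone's embedding size rather than being hard-coded;
# for t5-3b, config.d_model is 1024, which matches the value in the snippet.
adapter_config.input_dim = config.d_model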

tuhinjubcse (Author) commented:

Ok, you are awesome. These seem to be something that will be very useful for zero-shot inference :) where the train and test tasks are different.

rabeehk (Owner) commented Jul 17, 2022

You're very welcome.
