
Long prompt for decoder only model #221

Closed
saxenarohit opened this issue Oct 12, 2023 · 4 comments · Fixed by #227
Labels
bug Something isn't working

Comments

@saxenarohit

saxenarohit commented Oct 12, 2023

I am using the library with a Vicuna decoder-only model:

input_tokens = tokenizer.encode_plus(input_prompt, return_tensors="pt", padding="max_length", max_length=len(input_prompt)).to("cuda")

output_text = inseq_model.generate(input_tokens, do_sample=False, max_length=4000, skip_special_tokens=True)

out = inseq_model.attribute(
    input_texts=input_prompt,
    attribute_target=False,
    generated_texts=output_text,
    step_scores=["probability"],
).show()

Whenever the input prompt is longer than 2048 tokens, the attribute method raises the error below.
Generation still works, and output_text contains the correct generated text.
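
Not part of the original report, just a quick way to confirm the prompt length in tokens (assuming the same tokenizer as above):

# Hypothetical check (not from the report): count tokens to confirm the prompt
# actually exceeds the 2048-token threshold before calling attribute().
n_prompt_tokens = len(tokenizer(input_prompt)["input_ids"])
print(f"Prompt length: {n_prompt_tokens} tokens")  # error appears when this exceeds 2048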

FeatureAttribution.prepare_and_attribute(self, sources, targets, attr_pos_start, attr_pos_end, show_progress, pretty_progress, output_step_attributions, attribute_target, step_scores, include_eos_baseline, attributed_fn, attribution_args, attributed_fn_args, step_scores_args)
233 # If prepare_and_attribute was called from AttributionModel.attribute,
234 # attributed_fn is already a Callable. Keep here to allow for usage independently
235 # of AttributionModel.attribute.
236 attributed_fn = self.attribution_model.get_attributed_fn(attributed_fn)
--> 237 attribution_output = self.attribute(
238 batch,
239 attributed_fn=attributed_fn,
240 attr_pos_start=attr_pos_start,
241 attr_pos_end=attr_pos_end,
242 show_progress=show_progress,
243 pretty_progress=pretty_progress,
244 output_step_attributions=output_step_attributions,
245 attribute_target=attribute_target,
246 step_scores=step_scores,
247 attribution_args=attribution_args,
248 attributed_fn_args=attributed_fn_args,
249 step_scores_args=step_scores_args,
250 )
251 # Same here, repeated from AttributionModel.attribute
252 # to allow independent usage
253 attribution_output.info["include_eos_baseline"] = include_eos_baseline

File /opt/conda/envs/inseq_pt2/lib/python3.9/site-packages/inseq/attr/feat/feature_attribution.py:327, in FeatureAttribution.attribute(self, batch, attributed_fn, attr_pos_start, attr_pos_end, show_progress, pretty_progress, output_step_attributions, attribute_target, step_scores, attribution_args, attributed_fn_args, step_scores_args)
323 raise ValueError(
324 "Layer attribution methods do not support attribute_target=True. Use regular attributions instead."
325 )
326 self._run_compatibility_checks(attributed_fn)
--> 327 attr_pos_start, attr_pos_end = check_attribute_positions(
328 batch.max_generation_length,
329 attr_pos_start,
330 attr_pos_end,
331 )
332 logger.debug("=" * 30 + f"\nfull batch: {batch}\n" + "=" * 30)
333 # Sources are empty for decoder-only models

File /opt/conda/envs/inseq_pt2/lib/python3.9/site-packages/inseq/attr/feat/attribution_utils.py:86, in check_attribute_positions(max_length, attr_pos_start, attr_pos_end)
84 raise ValueError(f"Invalid starting position for attribution: {attr_pos_start} > {attr_pos_end}")
85 if attr_pos_start == attr_pos_end:
---> 86 raise ValueError("Start and end attribution positions cannot be the same.")
87 return attr_pos_start, attr_pos_end

ValueError: Start and end attribution positions cannot be the same.
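
For context, an assumption about why the check fires (values are illustrative, not taken from the library): once the input is capped, the first attributable position coincides with the end of the attribution window, e.g.:

# Hypothetical illustration (an assumption, not code from the library): with the
# prompt capped at 512 tokens, the attribution window collapses to zero length,
# so the positions checked in attribution_utils.py coincide.
max_generation_length = 512            # capped length of the prepared batch
attr_pos_start = 512                   # attribution starts right after the (truncated) prompt
attr_pos_end = max_generation_length   # end position is clamped to the capped length
if attr_pos_start == attr_pos_end:
    raise ValueError("Start and end attribution positions cannot be the same.")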

@saxenarohit saxenarohit added the bug Something isn't working label Oct 12, 2023
@gsarti
Member

gsarti commented Oct 12, 2023

Hi @saxenarohit, thanks for the report. Could you provide me with the code you use to load the model (at the moment I cannot see which attribution method you attached) and an example input string causing the issue?

@saxenarohit
Author

The code I am using to load the model:

inseq_model = inseq.load_model(model, "attention", tokenizer=tokenizer)
input_prompt = "This is a very long text" * 100
input_tokens = tokenizer.encode_plus(input_prompt, return_tensors="pt", padding="max_length", max_length=len(input_prompt)).to("cuda")
output_text = inseq_model.generate(input_tokens, do_sample=False, max_length=2048, skip_special_tokens=True)
print(output_text)
out = inseq_model.attribute(
    input_texts=input_prompt,
    attribute_target=False,
    generated_texts=output_text,
    step_scores=["probability"],
    very_verbose=True,
).show()

This gives the error above even if I change max_position_embeddings and max_sequence_length in model.config.
Generation produces output_text, but attribution fails.
Do I need to set max_length in the attribute arguments?
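
For clarity, this is roughly what was tried (illustrative only; the attribute names are taken from the sentence above and may not both exist on every config):

# Illustrative sketch of the attempted workaround (an assumption, not verbatim code):
model.config.max_position_embeddings = 4096
model.config.max_sequence_length = 4096  # neither change affects the cap described below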

If I reduce the prompt length below 2048 tokens, it works fine.

Thanks for your prompt response.

@saxenarohit
Author

saxenarohit commented Oct 27, 2023

Hi, it seems the issue lies in this code in huggingface_model.py around line 255. It caps the max length at the tokenizer's default of 512 tokens.

# Cap length with max_model_input_sizes instead
if max_length > 1e6:
    if hasattr(self.tokenizer, "max_model_input_sizes") and self.tokenizer.max_model_input_sizes:
        max_length = max(v for _, v in self.tokenizer.max_model_input_sizes.items())
    else:
        max_length = max_input_length
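
A quick way to see why this branch ends up at 512 (illustrative, assuming a LLaMA-style tokenizer; the exact values depend on the checkpoint):

# Illustrative inspection (not from the original thread): LLaMA-style tokenizers
# often report a huge sentinel model_max_length (~1e30), which trips the
# `max_length > 1e6` branch above and falls back to max_model_input_sizes.
print(tokenizer.model_max_length)       # e.g. 1000000000000000019884624838656
print(tokenizer.max_model_input_sizes)  # per-checkpoint caps, often as low as 512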

@gsarti
Member

gsarti commented Oct 30, 2023

Good catch @saxenarohit! That limit should not be necessary. I reproduced the issue, and with the changes in #227 it now runs correctly with prompts longer than 512 tokens!

Code:

import inseq
import torch
from transformers import AutoModelForCausalLM

model_name = "princeton-nlp/Sheared-LLaMA-1.3B"
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_8bit=True)
inseq_model = inseq.load_model(model, "attention", tokenizer=model_name)  # , tokenizer_kwargs={"legacy": False}
input_prompt = "This is a very long text" * 100
output_text = inseq_model.generate(input_prompt, do_sample=False, max_length=2048, skip_special_tokens=True)
out = inseq_model.attribute(
    input_texts=input_prompt,
    generated_texts=input_prompt + output_text[0],
    step_scores=["probability"],
)
out.show()
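
As an optional follow-up check (not part of the fix; it only uses the tokenizer of the same checkpoint), you can verify that the attributed sequence is indeed longer than the old 512-token cap:

# Optional sanity check (an assumption, not in the original comment): confirm
# the attributed sequence exceeds the old 512-token cap.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_name)
n_tokens = len(tokenizer(input_prompt + output_text[0])["input_ids"])
print(f"Attributed sequence length: {n_tokens} tokens")  # should be well above 512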

If it still doesn't work for you, please feel free to reopen at any time!
