
An initial version of executor #61

Merged
merged 29 commits into main from Eren_executor on May 17, 2023

Conversation

zhaochenyang20 (Collaborator)

Description

Following the docstring, I implemented the initial version of ModelExecutor. However, I have some quick questions:

  1. How can we get the "confidence", and where will we use it? Currently, I am just using this as the confidence:
            logits = output[0].float()
            probs = torch.softmax(logits, dim=-1)
            confidence = probs.mean().item()

I think the calculation probs.mean().item() does provide a measure that can be loosely interpreted as the model's confidence in generating the output, but it is not a sound one.

The softmax probabilities obtained from the model's output can be seen as an indication of the model's belief or preference for each token in the generated sequence. However, because each softmax distribution sums to 1, taking the mean over the vocabulary dimension as well collapses the value toward a constant (roughly 1/vocab_size) regardless of what the model generated. A more informative estimate would average only the probabilities assigned to the tokens that were actually chosen.
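
For what it's worth, a minimal sketch of that per-token alternative (the sequence_confidence helper below is illustrative, not part of this PR, and assumes greedy decoding):

    import torch

    def sequence_confidence(model, tokenizer, text: str) -> float:
        # Encode the prompt and generate, keeping the per-step scores.
        encoded = tokenizer(text, return_tensors="pt")
        output = model.generate(
            input_ids=encoded["input_ids"],
            max_new_tokens=20,
            return_dict_in_generate=True,
            output_scores=True,
        )
        # output.scores holds one logits tensor per generated token; under greedy
        # decoding (the default), the max softmax probability at each step is the
        # probability of the token that was actually chosen.
        step_probs = [torch.softmax(step, dim=-1).max().item() for step in output.scores]
        return sum(step_probs) / len(step_probs)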

  2. What is auxiliary_info, where will we use it, and how do we get it?

References

  • NA

Blocked by

@zhaochenyang20 changed the base branch from main to Eren_debug_tokenizer May 17, 2023 02:34
Base automatically changed from Eren_debug_tokenizer to main May 17, 2023 06:08
@viswavi (Collaborator) left a comment


Seems like we should support batch execution? This can be done in a separate PR if you like.
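
(For reference, batched execution with a decoder-only model might look roughly like the sketch below; the pad-token setup and left padding are assumptions, not part of this PR:)

    # Sketch: push several prompts through generate() at once.
    gpt2_tokenizer.pad_token = gpt2_tokenizer.eos_token  # assumes no pad token is set
    gpt2_tokenizer.padding_side = "left"  # left padding is recommended for causal LMs
    encoded = gpt2_tokenizer(
        ["first prompt", "second prompt"], return_tensors="pt", padding=True
    )
    output = gpt2_model.generate(
        input_ids=encoded["input_ids"],
        attention_mask=encoded["attention_mask"],
        max_new_tokens=20,
    )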

@viswavi (Collaborator) left a comment


Looks good! I only have a few small points for revision.

Comment on lines 42 to 43
    else:
        output = model.generate(input_ids=encoded_input["input_ids"])
Collaborator


Should we verify that the model in the else case is a transformers.AutoModelForCausalLM?

Suggested change
    - else:
    -     output = model.generate(input_ids=encoded_input["input_ids"])
    + elif issubclass(model.__class__, transformers.AutoModelForCausalLM):
    +     output = model.generate(input_ids=encoded_input["input_ids"])
    + else:
    +     raise ValueError(f"Model class {model.__class__} not supported")

Collaborator Author


Hmmm. I agree with you and actually tried this before. But issubclass(model.__class__, transformers.AutoModelForCausalLM) would fail even for gpt2. 🤔

Collaborator Author

@zhaochenyang20 May 17, 2023


In [3]:     gpt2_model_name = "gpt2"
   ...:     gpt2_model = AutoModelForCausalLM.from_pretrained(gpt2_model_name)
   ...:     gpt2_tokenizer = AutoTokenizer.from_pretrained(gpt2_model_name)

In [4]: model = gpt2_model

In [5]: issubclass(model.__class__, transformers.AutoModelForCausalLM)
Out[5]: False

Collaborator Author


Actually, the inheritance tree of GPT-2 is:

nn.Module -> PreTrainedModel -> GPT2PreTrainedModel -> GPT2LMHeadModel.

So issubclass(model.__class__, transformers.AutoModelForCausalLM) would fail.

But do we have a better method of type checking? 🤔 (One possibility is sketched below.)
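
(An untested sketch: since AutoModelForCausalLM is a factory rather than a base class, checking membership in transformers' causal-LM model mapping might work instead; whether this mapping is stable across transformers versions is an assumption.)

    import transformers

    # MODEL_FOR_CAUSAL_LM_MAPPING maps config classes to concrete causal-LM
    # classes (e.g., GPT2Config -> GPT2LMHeadModel), so its values cover the
    # autoregressive models that AutoModelForCausalLM can instantiate.
    causal_lm_classes = tuple(transformers.MODEL_FOR_CAUSAL_LM_MAPPING.values())
    if isinstance(model, causal_lm_classes):
        output = model.generate(input_ids=encoded_input["input_ids"])
    else:
        raise ValueError(f"Model class {model.__class__} not supported")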

Collaborator


Ah, I see. If there's no general type that can cover all autoregressive LM models, then I think it's fine how you have it now.

prompt2model/model_executor/generate.py (outdated; resolved)
tests/model_executor_test.py (resolved)
Comment on lines 69 to 72
    if gpt2_tokenizer.pad_token is None:
        gpt2_tokenizer.add_special_tokens({"pad_token": "[PAD]"})
        gpt2_model.config.pad_token_id = gpt2_model.config.eos_token_id
        gpt2_model.config.attention_mask_fn = lambda input_ids: (
Collaborator


It's a bit weird to have this setup done in the test. Can you explain this a little more?

Collaborator Author


Just like what happens in #60, directly using gpt2_tokenizer is not enough. Without this setup, generation emits the following warnings:

    Using pad_token, but it is not set yet.
    The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
    Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
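
(For context, a common minimal alternative is to reuse the EOS token as the pad token instead of adding a new [PAD] token; a sketch, not necessarily what this test should do:)

    # Sketch: silence the pad-token warnings by reusing EOS as PAD.
    gpt2_tokenizer.pad_token = gpt2_tokenizer.eos_token
    gpt2_model.config.pad_token_id = gpt2_model.config.eos_token_id
    encoded = gpt2_tokenizer("a test prompt", return_tensors="pt", padding=True)
    output = gpt2_model.generate(
        input_ids=encoded["input_ids"],
        attention_mask=encoded["attention_mask"],  # passing this avoids the warning
    )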

prompt2model/model_executor/generate.py (outdated; resolved)

@zhaochenyang20 merged commit 3eb208c into main May 17, 2023
@zhaochenyang20 deleted the Eren_executor branch May 17, 2023 17:02
@zhaochenyang20 mentioned this pull request May 18, 2023