update gpt_model#71
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors the forward methods in GPTModel and MMGPTModel by removing generic **kwargs in favor of more explicit argument handling. A high-severity issue was identified in MMGPTModel where passing the unfiltered kwargs dictionary as extra_block_kwargs could cause a TypeError due to duplicate arguments like input_ids and packed_seq_params. A code suggestion was provided to filter these keys to ensure compatibility.
| inference_params=inference_params, | ||
| packed_seq_params=packed_seq_params, | ||
| **kwargs, | ||
| extra_block_kwargs=kwargs, |
There was a problem hiding this comment.
Passing kwargs directly as extra_block_kwargs is problematic because kwargs is mutated at line 85 to include input_ids and packed_seq_params. If self.visual is None, these keys are not removed, leading to a TypeError (multiple values for argument) when calling the language model, as it already receives these arguments explicitly. Additionally, the current logic at line 51 clears kwargs if a visual encoder is present, which would cause any other legitimate extra arguments to be lost. Filtering the dictionary is a safer approach to avoid conflicts with explicit arguments.
| extra_block_kwargs=kwargs, | |
| extra_block_kwargs={k: v for k, v in kwargs.items() if k not in ['input_ids', 'packed_seq_params']}, |
No description provided.