feat(multimodal): add max_image_token_count guard with OOM risk guidance#1308
Conversation
Code Review
This pull request introduces a new configuration parameter, --max_image_token_count, which limits the number of tokens allowed for a single image to prevent potential Out-Of-Memory (OOM) issues. The changes span documentation, CLI argument definitions, and the server's resource allocation logic. Feedback from the review focuses on replacing assert statements with explicit if checks and ValueError exceptions for runtime validation, as assert can be disabled in optimized Python environments.
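The flag itself would typically be registered through `argparse`. Below is a minimal sketch of how such a guard flag could be wired up; only the `--max_image_token_count` name comes from the PR, while the argument group, default value, and `make_parser` helper are assumptions for illustration:

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the image-token guard flag (sketch only)."""
    parser = argparse.ArgumentParser(description="LightLLM server (illustrative sketch)")
    parser.add_argument(
        "--max_image_token_count",
        type=int,
        default=2048,  # hypothetical default; the real default is defined in the PR
        help="Maximum tokens allowed for a single image. "
             "Raising this limit increases runtime OOM risk.",
    )
    return parser


# Parse an explicit argv list so the sketch is runnable outside a server launch.
args = make_parser().parse_args(["--max_image_token_count", "4096"])
print(args.max_image_token_count)
```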
```python
def _assert_image_token_count(self, token_num: int):
    assert token_num <= self.args.max_image_token_count, (
        f"single image token count {token_num} exceeds max_image_token_count {self.args.max_image_token_count}. "
        f"You can increase this limit by setting --max_image_token_count to a larger value when starting "
        f"LightLLM. Warning: increasing this limit raises runtime OOM risk."
    )
    return
```
Avoid using assert for runtime validation of user input or critical safety guards. Python's assert statements are removed when the interpreter runs with optimizations (-O), which would silently disable this check and reintroduce the OOM risk. Use an explicit if check and raise a ValueError instead. The method is also renamed to better reflect its behavior.
```diff
-def _assert_image_token_count(self, token_num: int):
-    assert token_num <= self.args.max_image_token_count, (
-        f"single image token count {token_num} exceeds max_image_token_count {self.args.max_image_token_count}. "
-        f"You can increase this limit by setting --max_image_token_count to a larger value when starting "
-        f"LightLLM. Warning: increasing this limit raises runtime OOM risk."
-    )
-    return
+def _check_image_token_count(self, token_num: int):
+    if token_num > self.args.max_image_token_count:
+        raise ValueError(
+            f"single image token count {token_num} exceeds max_image_token_count {self.args.max_image_token_count}. "
+            f"You can increase this limit by setting --max_image_token_count to a larger value when starting "
+            f"LightLLM. Warning: increasing this limit raises runtime OOM risk."
+        )
+    return
```
```python
data = img.read()
# must run after init_imageitem_extral_params
token_num = self.tokenizer.get_image_token_length(img)
self._assert_image_token_count(token_num)
```
```python
self.tokenizer.init_imageitem_extral_params(img, multimodal_params, samping_params)
image_tokens += self.tokenizer.get_image_token_length(img)
token_num = self.tokenizer.get_image_token_length(img)
self._assert_image_token_count(token_num)
```
```python
assert token_num <= self.args.max_image_token_count, (
    f"single image token count {token_num} exceeds max_image_token_count {self.args.max_image_token_count}."
    f"You can increase this limit by setting --max_image_token_count to a larger value when starting "
    f"LightLLM. Warning: increasing this limit raises runtime OOM risk."
)
```
Avoid using assert for runtime validation, as it can be optimized away with -O. This suggestion also fixes a missing space between the two sentences of the error message.
```diff
-assert token_num <= self.args.max_image_token_count, (
-    f"single image token count {token_num} exceeds max_image_token_count {self.args.max_image_token_count}."
-    f"You can increase this limit by setting --max_image_token_count to a larger value when starting "
-    f"LightLLM. Warning: increasing this limit raises runtime OOM risk."
-)
+if token_num > self.args.max_image_token_count:
+    raise ValueError(
+        f"single image token count {token_num} exceeds max_image_token_count {self.args.max_image_token_count}. "
+        f"You can increase this limit by setting --max_image_token_count to a larger value when starting "
+        f"LightLLM. Warning: increasing this limit raises runtime OOM risk."
+    )
```