Generate: change flush CJK characters to print
bcol23 committed Apr 8, 2023
1 parent 2fa45c4 commit 58a4dfe
Showing 1 changed file with 2 additions and 3 deletions.
5 changes: 2 additions & 3 deletions src/transformers/generation/streamers.py
@@ -101,11 +101,10 @@ def put(self, value):
             printable_text = text[self.print_len :]
             self.token_cache = []
             self.print_len = 0
-        # If the last token is a CJK character, we flush the cache.
+        # If the last token is a CJK character, we print the characters.
         elif len(text) > 0 and self._is_chinese_char(ord(text[-1])):
             printable_text = text[self.print_len :]
-            self.token_cache = []
-            self.print_len = 0
+            self.print_len += len(printable_text)
         # Otherwise, prints until the last space char (simple heuristic to avoid printing incomplete words,
         # which may change with the subsequent token -- there are probably smarter ways to do this!)
         else:
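
The effect of this hunk can be illustrated with a small standalone sketch. It is not the library code: `stream` and `is_cjk` are hypothetical names, joining string pieces stands in for decoding the token cache, the CJK check covers only the main CJK Unified Ideographs block, and the body of the final `else` branch (not shown in this hunk) is assumed to print up to the last space. The point it shows is that, after the change, a trailing CJK character is printed while the cache is kept and only `print_len` advances.

```python
def is_cjk(cp):
    # Simplified assumption: only the main CJK Unified Ideographs block.
    return 0x4E00 <= cp <= 0x9FFF


def stream(pieces):
    """Return the chunks that would be printed for each incoming piece."""
    cache = []      # stand-in for self.token_cache
    print_len = 0   # stand-in for self.print_len
    emitted = []
    for piece in pieces:
        cache.append(piece)
        text = "".join(cache)  # stand-in for decoding the whole cache
        if text.endswith("\n"):
            # Newline: print the rest and flush the cache.
            printable = text[print_len:]
            cache = []
            print_len = 0
        elif len(text) > 0 and is_cjk(ord(text[-1])):
            # CJK character: print it, keep the cache, advance print_len.
            printable = text[print_len:]
            print_len += len(printable)
        else:
            # Assumed behaviour of the elided branch: print up to the last space.
            printable = text[print_len : text.rfind(" ") + 1]
            print_len += len(printable)
        emitted.append(printable)
    return emitted


chunks = stream(["你", "好", " wor", "ld\n"])
print(chunks)
```

Because the cache is no longer cleared in the CJK branch, later pieces are still decoded together with the earlier ones, and the already printed prefix is skipped through `print_len`.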
