My training text data is in Chinese, and the subword-nmt learn-bpe command reports a UnicodeEncodeError when verbose mode is on, while the learn_bpe.py script works fine.
I checked the code and found the reason.
It seems that the subword-nmt learn-bpe command-line entry point doesn't run the code above, so the sys.stderr used by verbose mode (see below) is the default system stderr, which encodes Unicode text with the "ascii" encoding.
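For anyone hitting the same error, here is a minimal sketch of the kind of re-wrapping that avoids it. It is not the project's actual code: the helper name `force_utf8` is hypothetical, and it assumes Python 3, where text streams expose their byte stream as `.buffer`. An in-memory stream stands in for a stderr opened with an ASCII locale.

```python
import io


def force_utf8(stream):
    """Re-wrap a text stream's underlying byte buffer so it always encodes as UTF-8.

    Hypothetical helper for illustration; assumes Python 3 (stream.buffer exists).
    """
    return io.TextIOWrapper(stream.buffer, encoding="utf-8", line_buffering=True)


# Simulate a stderr whose encoding defaulted to ASCII.
raw = io.BytesIO()
ascii_stderr = io.TextIOWrapper(raw, encoding="ascii")

try:
    ascii_stderr.write("中文\n")  # Chinese text cannot be encoded as ASCII
    failed = False
except UnicodeEncodeError:
    failed = True

# The same text goes through once the stream is re-wrapped with UTF-8.
utf8_stderr = force_utf8(ascii_stderr)
utf8_stderr.write("中文\n")
utf8_stderr.flush()
written = raw.getvalue()
```

In a real command-line entry point the equivalent would be `sys.stderr = force_utf8(sys.stderr)` before any verbose output is written. Setting the environment variable `PYTHONIOENCODING=utf-8` before running the command may also work as a stopgap, since it overrides the encoding Python picks for the standard streams.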
Thank you for reporting this. Yes, subword_nmt.py is never executed as a script, so the relevant code isn't run. I corrected this now in commit 955abfe. Please let me know if there are any other issues.