[Glossary feature] Encoding error when uploading a CSV file containing special characters; request support for special characters #229

@Azisboy

Description

Before you start...

  • I have searched the existing issues

In what scenario do you need the requested feature?

Problem description

Uploading a glossary CSV file that contains special characters (such as ú, á, û) raises a Unicode encoding error.

Traceback (most recent call last):
File "F:\download\pdf2zh-v2.6.4-BabelDOC-v0.5.10-with-assets-win64\site-packages\gradio\queueing.py", line 626, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...<5 lines>...
)
^
File "F:\download\pdf2zh-v2.6.4-BabelDOC-v0.5.10-with-assets-win64\site-packages\gradio\route_utils.py", line 322, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...<11 lines>...
)
^
File "F:\download\pdf2zh-v2.6.4-BabelDOC-v0.5.10-with-assets-win64\site-packages\gradio\blocks.py", line 2220, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
...<8 lines>...
)
^
File "F:\download\pdf2zh-v2.6.4-BabelDOC-v0.5.10-with-assets-win64\site-packages\gradio\blocks.py", line 1729, in call_function
prediction = await fn(*processed_input)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\download\pdf2zh-v2.6.4-BabelDOC-v0.5.10-with-assets-win64\site-packages\gradio\utils.py", line 907, in async_wrapper
response = await f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\download\pdf2zh-v2.6.4-BabelDOC-v0.5.10-with-assets-win64\site-packages\pdf2zh_next\gui.py", line 913, in translate_file
"glossaries": _build_glossary_list(glossary_file, service),
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\download\pdf2zh-v2.6.4-BabelDOC-v0.5.10-with-assets-win64\site-packages\pdf2zh_next\gui.py", line 680, in _build_glossary_list
temp_file.write(f.getvalue())
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "tempfile.py", line 499, in func_wrapper
UnicodeEncodeError: 'gbk' codec can't encode character '\xfb' in position 1182: illegal multibyte sequence

Proposed solution

Could UTF-8 encoding be supported?
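
The traceback shows `temp_file.write(f.getvalue())` failing with the `gbk` codec, which suggests (this is an assumption, not confirmed from the pdf2zh_next source) that the temporary file is opened in text mode without an explicit encoding, so it falls back to the Windows locale encoding. A minimal sketch of the idea, using a hypothetical helper `write_glossary_to_temp` that is not part of pdf2zh_next:

```python
import os
import tempfile


def write_glossary_to_temp(csv_text: str, encoding: str = "utf-8") -> str:
    """Write glossary CSV text to a temporary file and return its path.

    Hypothetical helper for illustration only. Passing an explicit
    encoding avoids depending on the platform's locale encoding.
    """
    temp_file = tempfile.NamedTemporaryFile(
        mode="w", suffix=".csv", delete=False, encoding=encoding, newline=""
    )
    try:
        with temp_file:
            temp_file.write(csv_text)
    except UnicodeEncodeError:
        # Do not leave a half-written file behind when encoding fails.
        os.unlink(temp_file.name)
        raise
    return temp_file.name


if __name__ == "__main__":
    glossary = "source,target\nNûrn,Nurn\n"  # made-up sample row

    # Forcing gbk mimics the locale default on a Chinese Windows system.
    try:
        write_glossary_to_temp(glossary, encoding="gbk")
    except UnicodeEncodeError as exc:
        print("gbk failed:", exc.reason)

    # With UTF-8 the same text is written without error.
    path = write_glossary_to_temp(glossary)
    with open(path, encoding="utf-8") as fh:
        print(fh.read())
    os.unlink(path)
```

If the fix takes this shape, whatever reads the temporary file afterwards would also need to open it as UTF-8, otherwise the error just moves to the reading side.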

Additional information

No response

Labels: enhancement (New feature or request), help wanted (Extra attention is needed)