mdtable reports a UnicodeDecodeError when I try to convert a UTF-8 CSV file containing Chinese characters (hereinafter 'test.csv') into a Markdown table.
background info
Windows 11, system default code page: 936, Python 3.9.18, Python default encoding: 'utf-8', package manager: miniconda
mdtable is installed at "D:\app_barry\miniconda\Scripts\mdtable.exe"; I installed it by running the following command in PowerShell:
pip install mdtable
tests I've tried
content of test.csv and test2.csv:
bbc,cctv,wapo
1,2,3
啊啊,拜拜,尺寸
1. original (test.csv, encoded as UTF-8):
PS E:\down> mdtable test.csv
Traceback (most recent call last):
  File "D:\app_barry\miniconda\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "D:\app_barry\miniconda\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\app_barry\miniconda\Scripts\mdtable.exe\__main__.py", line 7, in <module>
  File "D:\app_barry\miniconda\lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "D:\app_barry\miniconda\lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "D:\app_barry\miniconda\lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "D:\app_barry\miniconda\lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "D:\app_barry\miniconda\lib\site-packages\mdtable\cli.py", line 26, in main
    table = MDTable(
  File "D:\app_barry\miniconda\lib\site-packages\mdtable\mdtable.py", line 57, in __init__
    self._csv_dict = _read_csv(
  File "D:\app_barry\miniconda\lib\site-packages\mdtable\mdtable.py", line 180, in _read_csv
    header = next(csv_reader)
UnicodeDecodeError: 'gbk' codec can't decode byte 0xaf in position 40: illegal multibyte sequence
2. retry with a new CSV file with the same content, encoded as GBK (hereinafter 'test2.csv') > success, no error reported.
3. delete all Chinese characters from test.csv (now a pure-ASCII file saved as UTF-8, hereinafter 'test3') and retry > success, no error reported.
4. retry under code page 65001 > fail, same error as in 1.
5. check Python's default encoding > sys.getdefaultencoding() == 'utf-8'
my guesses
I guess mdtable decodes its input with the system ANSI code page, so on my computer it defaults to GBK. My input CSV file is encoded as UTF-8, so when the program reaches non-ASCII characters like '啊', it raises UnicodeDecodeError.
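The mismatch can be reproduced without mdtable. The sketch below is only an illustration under assumptions: I assume the file uses Windows (CRLF) line endings and that mdtable opens it without an explicit encoding, which I have not confirmed from its source. `try_decode` is a helper name made up for this example. It also prints the two values that are easy to confuse: `sys.getdefaultencoding()` and `locale.getpreferredencoding(False)`. In Python 3.9 without UTF-8 mode, the built-in `open()` falls back to the latter when no encoding is passed.

```python
import locale
import sys

# CSV content from the report, encoded as UTF-8.
# CRLF line endings are an assumption about how the file was saved.
CSV_TEXT = "bbc,cctv,wapo\r\n1,2,3\r\n啊啊,拜拜,尺寸\r\n"
utf8_bytes = CSV_TEXT.encode("utf-8")


def try_decode(data, encoding):
    """Hypothetical helper: return (text, None) on success, (None, error) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


# Test 1: UTF-8 bytes decoded as GBK fail on the Chinese row.
_, gbk_error = try_decode(utf8_bytes, "gbk")
print("UTF-8 bytes decoded as gbk:", gbk_error)

# The same bytes decode cleanly as UTF-8.
utf8_text, utf8_error = try_decode(utf8_bytes, "utf-8")

# Test 3: the ASCII-only rows are identical under both codecs.
ascii_bytes = "bbc,cctv,wapo\r\n1,2,3\r\n".encode("utf-8")
ascii_as_gbk, _ = try_decode(ascii_bytes, "gbk")
ascii_as_utf8, _ = try_decode(ascii_bytes, "utf-8")

# Test 5 checked the first value; open() without an encoding uses the second.
print("sys.getdefaultencoding():", sys.getdefaultencoding())
print("locale.getpreferredencoding(False):", locale.getpreferredencoding(False))
```

If the second printed value is 'cp936' on the affected machine, that would be consistent with the guess above. It might also explain test 4, if code page 65001 was applied only to the console, because the console code page is separate from the ANSI code page that Python consults.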
If my guess is correct, adding an option to select the input encoding could be a solution.
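A minimal sketch of what such an option could do, using only the standard library. This is not mdtable's code: `read_csv_rows`, `rows_to_markdown`, and `demo` are hypothetical names, and defaulting to UTF-8 is my assumption rather than anything the project has decided. The point is only that passing `encoding` to `open()` makes the result independent of the system code page.

```python
import csv
import os
import tempfile


def read_csv_rows(path, encoding="utf-8"):
    """Read a CSV file into a list of rows, decoding with an explicit encoding."""
    # newline="" is what the csv module documentation recommends for file objects.
    with open(path, newline="", encoding=encoding) as handle:
        return list(csv.reader(handle))


def rows_to_markdown(rows):
    """Render rows as a simple Markdown table; the first row is the header."""
    header, *body = rows
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


def demo():
    """Write the report's CSV as UTF-8 and as GBK, then read both back."""
    content = "bbc,cctv,wapo\r\n1,2,3\r\n啊啊,拜拜,尺寸\r\n"
    with tempfile.TemporaryDirectory() as tmp:
        utf8_path = os.path.join(tmp, "test.csv")
        gbk_path = os.path.join(tmp, "test2.csv")
        with open(utf8_path, "w", newline="", encoding="utf-8") as handle:
            handle.write(content)
        with open(gbk_path, "w", newline="", encoding="gbk") as handle:
            handle.write(content)
        # Each file is read with the encoding it was written in.
        return (
            read_csv_rows(utf8_path, encoding="utf-8"),
            read_csv_rows(gbk_path, encoding="gbk"),
        )
```

With something like this behind a command-line option, test.csv and test2.csv should both convert, as long as the user names the encoding the file was actually saved in.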