The ruwiki_good dataset is currently unavailable: load_dataset('ruwiki_good') fails with an error. This should be fixed. Or, at the very least, it should download the file and report where the .txt file is saved (so that the file can be processed manually).
If you try this:
>>> d = load_dataset('ruwiki_good')
you get something like this:
Checking if dataset "ruwiki_good" was already downloaded before
Dataset "ruwiki_good" not found on the machine
Downloading the "ruwiki_good" dataset...
100%|█████████████████████████████████████████| 51.2M/51.2M [00:46<00:00, 1.10MiB/s]
Dataset downloaded! Save path is: "/home/alvant/lib/miniconda3/envs/test2/lib/python3.8/site-packages/topicnet/dataset_manager/ruwiki_good.txt"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/alvant/lib/miniconda3/envs/test2/lib/python3.8/site-packages/topicnet/dataset_manager/api.py", line 132, in load_dataset
raise exception
File "/home/alvant/lib/miniconda3/envs/test2/lib/python3.8/site-packages/topicnet/dataset_manager/api.py", line 126, in load_dataset
return Dataset(save_path, **kwargs)
File "/home/alvant/lib/miniconda3/envs/test2/lib/python3.8/site-packages/topicnet/cooking_machine/dataset.py", line 220, in __init__
self._data = self._read_data(data_path)
File "/home/alvant/lib/miniconda3/envs/test2/lib/python3.8/site-packages/topicnet/cooking_machine/dataset.py", line 355, in _read_data
data = data_handle.read_csv(
File "/home/alvant/lib/miniconda3/envs/test2/lib/python3.8/site-packages/pandas/util/_decorators.py", line 211, in wrapper
return func(*args, **kwargs)
File "/home/alvant/lib/miniconda3/envs/test2/lib/python3.8/site-packages/pandas/util/_decorators.py", line 331, in wrapper
return func(*args, **kwargs)
File "/home/alvant/lib/miniconda3/envs/test2/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 935, in read_csv
kwds_defaults = _refine_defaults_read(
File "/home/alvant/lib/miniconda3/envs/test2/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 2063, in _refine_defaults_read
raise ValueError(
ValueError: Specified \n as separator or delimiter. This forces the python engine which does not accept a line terminator. Hence it is not allowed to use the line terminator as separator.
OS is:
Linux mx 4.19.0-16-amd64 #1 SMP Debian 4.19.181-1 (2021-03-19) x86_64 GNU/Linux
Expected Result
The dataset is 1) downloaded and 2) ready to use for topic modeling.
Current "Workaround"
If you set sep='###' in this code: then everything seems to work fine.
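For illustration, here is a minimal sketch of the idea behind this workaround, assuming the text never contains the sequence '###': pandas then never splits a line and puts each whole line into a single column. The function name read_lines_as_rows, the column name vw_text and the sample lines are made up for this example and are not taken from TopicNet.

```python
import io

import pandas as pd

# Hypothetical column name, chosen for this illustration only.
TEXT_COLUMN = "vw_text"


def read_lines_as_rows(source, sep="###"):
    """Read a text source into a one-column DataFrame, one row per line.

    Assumes `sep` never occurs in the text, so no line is ever split.
    """
    return pd.read_csv(
        source,
        sep=sep,          # multi-character separators need the python engine
        engine="python",
        header=None,      # the file has no header row
        names=[TEXT_COLUMN],
    )


# Two made-up documents standing in for the downloaded .txt file.
sample = io.StringIO("doc_1 |text alpha beta\ndoc_2 |text gamma delta\n")
frame = read_lines_as_rows(sample)
print(frame)
```

The same call should also accept a file path instead of the in-memory sample, for example the save path printed in the log above. Whether '###' is a safe choice depends on the data: any line that does contain it would be split into several fields.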