multiprocessing error / segmentation fault #488

Closed
AlexDo1 opened this issue Aug 7, 2021 · 9 comments
Labels
bug Something isn't working

Comments

@AlexDo1

AlexDo1 commented Aug 7, 2021

Describe the bug
Hi, I would like to use the package wetterdienst to load data from the DWD server into my Jupyter notebook.
For the first few days, downloading the data worked fine (apart from UnpicklingErrors for some parameter-resolution combinations). For the past few days, however, my Python kernel has crashed every time I try to extract the values from a DwdObservationRequest.

When I simply run the example code from your documentation in an IPython console, the kernel crashes with a segmentation fault / multiprocessing error as soon as I extract the values with df = request.values.all().df.dropna().

The strange thing is that the package worked for the first few days until the error suddenly appeared.

To Reproduce
This is the example code from the documentation executed in an ipython console:

❯ ipython
Python 3.9.6 (default, Jul 30 2021, 09:31:09)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.22.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from wetterdienst.provider.dwd.observation import DwdObservationRequest,
   ...:  DwdObservationDataset, DwdObservationPeriod, DwdObservationResolution

In [2]: request = DwdObservationRequest(
   ...:         parameter=[DwdObservationDataset.CLIMATE_SUMMARY],
   ...:         resolution=DwdObservationResolution.DAILY,
   ...:         start_date="1990-01-01",
   ...:         end_date="2020-01-01",
   ...:         tidy=True,
   ...:         humanize=True,
   ...:     ).filter_by_name('Rheinstetten')

In [3]: df = request.values.all().df.dropna()
[1]    18409 segmentation fault  ipython
/Users/alexd/opt/anaconda3/envs/mc_develop/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Expected behavior
The data for the station 'Rheinstetten' and parameter and resolution specified in DwdObservationRequest should be downloaded and saved to df.

Desktop:

  • OS: macOS 11.4, M1
  • Python-Version 3.8, 3.9.6
  • Python environment:
# Name                    Version                   Build  Channel
aenum                     3.1.0                    pypi_0    pypi
anyio                     2.2.0            py39hecd8cb5_1
appdirs                   1.4.4                    pypi_0    pypi
appnope                   0.1.2           py39hecd8cb5_1001
argon2-cffi               20.1.0           py39h9ed2024_1
async_generator           1.10               pyhd3eb1b0_0
attrs                     21.2.0             pyhd3eb1b0_0
babel                     2.9.1              pyhd3eb1b0_0
backcall                  0.2.0              pyhd3eb1b0_0
beautifulsoup4            4.9.3                    pypi_0    pypi
blas                      1.0                         mkl
bleach                    3.3.1              pyhd3eb1b0_0
brotli                    1.0.9                hb1e8313_2
brotlipy                  0.7.0           py39h9ed2024_1003
ca-certificates           2021.7.5             hecd8cb5_1
cachetools                4.2.2                    pypi_0    pypi
certifi                   2021.5.30        py39hecd8cb5_0
cffi                      1.14.6           py39h2125817_0
chardet                   4.0.0           py39hecd8cb5_1003
charset-normalizer        2.0.4                    pypi_0    pypi
click                     7.1.2                    pypi_0    pypi
click-params              0.1.2                    pypi_0    pypi
cloup                     0.8.2                    pypi_0    pypi
cryptography              3.4.7            py39h2fd3fbb_0
cycler                    0.10.0           py39hecd8cb5_0
dateparser                1.0.0                    pypi_0    pypi
decorator                 5.0.9              pyhd3eb1b0_0
defusedxml                0.7.1              pyhd3eb1b0_0
deprecation               2.1.0                    pypi_0    pypi
dicttoxml                 1.7.4                    pypi_0    pypi
dogpile-cache             1.1.3                    pypi_0    pypi
entrypoints               0.3              py39hecd8cb5_0
fire                      0.4.0                    pypi_0    pypi
flask                     2.0.1                    pypi_0    pypi
flask-cors                3.0.10                   pypi_0    pypi
fonttools                 4.25.0             pyhd3eb1b0_0
freetype                  2.10.4               ha233b18_0
geoalchemy2               0.9.3                    pypi_0    pypi
greenlet                  1.1.0                    pypi_0    pypi
icu                       58.2                 h0a44026_3
idna                      3.2                      pypi_0    pypi
importlib-metadata        3.10.0           py39hecd8cb5_0
importlib_metadata        3.10.0               hd3eb1b0_0
intel-openmp              2021.3.0          hecd8cb5_3375
ipykernel                 5.3.4            py39h01d92e1_0
ipython                   7.22.0           py39h01d92e1_0
ipython_genutils          0.2.0              pyhd3eb1b0_1
ipywidgets                7.6.3              pyhd3eb1b0_1
itsdangerous              2.0.1                    pypi_0    pypi
jedi                      0.17.2           py39hecd8cb5_1
jinja2                    3.0.1              pyhd3eb1b0_0
joblib                    1.0.1                    pypi_0    pypi
jpeg                      9b                   he5867d9_2
json5                     0.9.6              pyhd3eb1b0_0
jsonschema                3.2.0                      py_2
jupyter                   1.0.0            py39hecd8cb5_7
jupyter_client            6.1.12             pyhd3eb1b0_0
jupyter_console           6.4.0              pyhd3eb1b0_0
jupyter_core              4.7.1            py39hecd8cb5_0
jupyter_server            1.4.1            py39hecd8cb5_0
jupyterlab                3.1.1              pyhd8ed1ab_0    conda-forge
jupyterlab_pygments       0.1.2                      py_0
jupyterlab_server         2.6.1              pyhd3eb1b0_0
jupyterlab_widgets        1.0.0              pyhd3eb1b0_1
kiwisolver                1.3.1            py39h23ab428_0
krb5                      1.19.2               hcd88c3b_0
lcms2                     2.12                 hf1fd2bf_0
libcxx                    10.0.0                        1
libedit                   3.1.20210216         h9ed2024_1
libffi                    3.3                  hb1e8313_2
libpng                    1.6.37               ha441bb4_0
libpq                     12.2                 h1b4eb34_1
libsodium                 1.0.18               h1de35cc_0
libtiff                   4.2.0                h87d7836_0
libwebp-base              1.2.0                h9ed2024_0
lxml                      4.6.3                    pypi_0    pypi
lz4-c                     1.9.3                h23ab428_0
markupsafe                2.0.1            py39h9ed2024_0
matplotlib                3.4.2            py39hecd8cb5_0
matplotlib-base           3.4.2            py39h8b3ea08_0
measurement               3.2.0                    pypi_0    pypi
metacatalog               0.4.3                     dev_0    <develop>
metacatalog-corr          0.1.7                     dev_0    <develop>
mistune                   0.8.4           py39h9ed2024_1000
mkl                       2021.3.0           hecd8cb5_517
mkl-service               2.4.0            py39h9ed2024_0
mkl_fft                   1.3.0            py39h4a7008c_2
mkl_random                1.2.2            py39hb2f4e1b_0
mpmath                    1.2.1                    pypi_0    pypi
munkres                   1.1.4                      py_0
nbclassic                 0.2.6              pyhd3eb1b0_0
nbclient                  0.5.3              pyhd3eb1b0_0
nbconvert                 6.1.0            py39hecd8cb5_0
nbformat                  5.1.3              pyhd3eb1b0_0
ncurses                   6.2                  h0a44026_1
nest-asyncio              1.5.1              pyhd3eb1b0_0
nltk                      3.6.2                    pypi_0    pypi
notebook                  6.4.0            py39hecd8cb5_0
numpy                     1.21.1                   pypi_0    pypi
numpy-base                1.20.3           py39he0bd621_0
olefile                   0.46                       py_0
openjpeg                  2.3.0                hb95cd4c_1
openssl                   1.1.1k               h9ed2024_0
packaging                 21.0               pyhd3eb1b0_0
pandas                    1.2.5                    pypi_0    pypi
pandocfilters             1.4.3            py39hecd8cb5_1
parso                     0.7.0                      py_0
pbr                       5.6.0                    pypi_0    pypi
pexpect                   4.8.0              pyhd3eb1b0_3
pickleshare               0.7.5           pyhd3eb1b0_1003
pillow                    8.3.1            py39ha4cf6ea_0
pint                      0.17                     pypi_0    pypi
pip                       21.2.2           py39hecd8cb5_0
prometheus_client         0.11.0             pyhd3eb1b0_0
prompt-toolkit            3.0.17             pyh06a4308_0
prompt_toolkit            3.0.17               hd3eb1b0_0
psycopg2                  2.8.6            py39hbcfaee0_1
psycopg2-binary           2.9.1                    pypi_0    pypi
ptyprocess                0.7.0              pyhd3eb1b0_2
pycparser                 2.20                       py_2
pygments                  2.9.0              pyhd3eb1b0_0
pyopenssl                 20.0.1             pyhd3eb1b0_1
pyparsing                 2.4.7              pyhd3eb1b0_0
pypdf2                    1.26.0                   pypi_0    pypi
pyproj                    3.1.0                    pypi_0    pypi
pyqt                      5.9.2            py39h23ab428_6
pyrsistent                0.18.0           py39h9ed2024_0
pysocks                   1.7.1            py39hecd8cb5_0
python                    3.9.6                h88f2d9e_0
python-dateutil           2.8.2              pyhd3eb1b0_0
pytz                      2021.1             pyhd3eb1b0_0
pyzmq                     20.0.0           py39h23ab428_1
qt                        5.9.7                h468cd18_1
qtconsole                 5.1.0              pyhd3eb1b0_0
qtpy                      1.9.0                      py_0
rapidfuzz                 1.4.1                    pypi_0    pypi
readline                  8.1                  h9ed2024_0
regex                     2021.8.3                 pypi_0    pypi
requests                  2.25.1             pyhd3eb1b0_0
requests-ftp              0.3.1                    pypi_0    pypi
scipy                     1.7.1                    pypi_0    pypi
send2trash                1.5.0              pyhd3eb1b0_1
setuptools                52.0.0           py39hecd8cb5_0
shapely                   1.7.1                    pypi_0    pypi
sip                       4.19.13          py39h23ab428_0
six                       1.16.0             pyhd3eb1b0_0
sniffio                   1.2.0            py39hecd8cb5_1
soupsieve                 2.2.1                    pypi_0    pypi
sqlalchemy                1.4.22                   pypi_0    pypi
sqlite                    3.36.0               hce871da_0
stevedore                 3.3.0                    pypi_0    pypi
sympy                     1.8                      pypi_0    pypi
tabulate                  0.8.9                    pypi_0    pypi
termcolor                 1.1.0                    pypi_0    pypi
terminado                 0.9.4            py39hecd8cb5_0
testpath                  0.5.0              pyhd3eb1b0_0
tk                        8.6.10               hb0a8c7a_0
tornado                   6.1              py39h9ed2024_0
tqdm                      4.62.0                   pypi_0    pypi
traitlets                 5.0.5              pyhd3eb1b0_0
tzdata                    2021a                h52ac0ba_0
tzlocal                   2.1                      pypi_0    pypi
urllib3                   1.26.6             pyhd3eb1b0_1
validators                0.18.2                   pypi_0    pypi
wcwidth                   0.2.5                      py_0
webencodings              0.5.1            py39hecd8cb5_1
werkzeug                  2.0.1                    pypi_0    pypi
wetterdienst              0.20.3                   pypi_0    pypi
wheel                     0.36.2             pyhd3eb1b0_0
widgetsnbextension        3.5.1            py39hecd8cb5_0
xarray                    0.19.0                   pypi_0    pypi
xz                        5.2.5                h1de35cc_0
zeromq                    4.3.4                h23ab428_0
zipp                      3.5.0              pyhd3eb1b0_0
zlib                      1.2.11               h1de35cc_3
zstd                      1.4.9                h322a384_0

Additional context
Do you already know about this bug?
A possible solution to this problem could be to use the package single threaded, if that is possible.

@gutzbenj
Member

gutzbenj commented Aug 7, 2021

Dear @AlexDo1 ,

so far I have only seen a segmentation fault when installing scipy on a Mac M1.

Edit:

I have set up a patch branch with threading removed. Could you install wetterdienst from this branch and run the script again to test whether that works?

Install via

pip install git+https://github.com/earthobservations/wetterdienst.git@patch-threading

@AlexDo1
Author

AlexDo1 commented Aug 8, 2021

Thank you @gutzbenj ,

I just re-ran the code in the To Reproduce section above. I have installed wetterdienst via
pip install git+https://github.com/earthobservations/wetterdienst.git@patch-threading

Unfortunately the error remains the same:

In [5]: df = request.values.all().df.dropna()
[1]    22368 segmentation fault  ipython
/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

@gutzbenj
Member

gutzbenj commented Aug 8, 2021

Dear @AlexDo1,

  • I've now run the same code snippet in a jupyter notebook and got no error. My best guess is it's related to the M1 in combination with scipy so maybe you are able to update scipy (I know it's a hassle).
  • the warning shouldn't be a problem as it won't cause the program to shut down

I still have scipy 1.6.1 running, so maybe that's the difference.

The libraries I've installed are

Package Version
aenum 3.1.0
alabaster 0.7.12
anyio 3.3.0
appdirs 1.4.4
appnope 0.1.2
argon2-cffi 20.1.0
attrs 20.3.0
Babel 2.9.1
backcall 0.2.0
bandit 1.7.0
beautifulsoup4 4.9.3
beniget 0.3.0
bitstring 3.1.9
black 20.8b1
bleach 4.0.0
Brotli 1.0.9
cachetools 4.2.2
certifi 2021.5.30
cffi 1.14.6
cfgrib 0.9.9.0
cftime 1.5.0
charset-normalizer 2.0.4
click 7.1.2
click-params 0.1.2
cloudpickle 1.6.0
cloup 0.8.2
colorama 0.4.4
coverage 5.5
crate 0.25.0
css-html-js-minify 2.5.5
cycler 0.10.0
Cython 0.29.23
dash 1.21.0
dash-bootstrap-components 0.12.2
dash-core-components 1.17.1
dash-html-components 1.1.4
dash-table 4.12.0
dask 2021.7.1
dateparser 1.0.0
debugpy 1.4.1
decorator 5.0.9
defusedxml 0.7.1
deprecation 2.1.0
dictdiffer 0.9.0
docutils 0.16
dogpile.cache 1.1.3
duckdb 0.2.7
entrypoints 0.3
et-xmlfile 1.1.0
fastapi 0.61.2
findlibs 0.0.2
flake8 3.9.2
flake8-bandit 2.1.2
flake8-black 0.2.3
flake8-bugbear 20.11.1
flake8-isort 4.0.0
flake8-polyfill 1.0.2
flakehell 0.7.1
Flask 2.0.1
Flask-Compress 1.10.1
freezegun 1.1.0
fsspec 2021.7.0
future 0.18.2
gast 0.4.0
GDAL 3.3.1
geojson 2.5.0
gitdb 4.0.7
GitPython 3.1.20
h11 0.12.0
h5netcdf 0.11.0
h5py 3.3.0
idna 3.2
imagesize 1.2.0
importlib-resources 5.2.2
influxdb 5.3.1
influxdb-client 1.19.0
iniconfig 1.1.1
ipykernel 6.0.3
ipython 7.25.0
ipython-genutils 0.2.0
ipywidgets 7.6.3
isort 5.9.3
itsdangerous 2.0.1
jedi 0.18.0
Jinja2 3.0.1
jsonschema 3.2.0
jupyter 1.0.0
jupyter-client 6.2.0
jupyter-console 6.4.0
jupyter-core 4.7.1
jupyter-server 1.10.2
jupyter-server-mathjax 0.2.3
jupyterlab-widgets 1.0.0
kiwisolver 1.3.1
livereload 2.6.3
locket 0.2.1
lxml 4.6.3
MarkupSafe 2.0.1
matplotlib 3.4.2
matplotlib-inline 0.1.2
mccabe 0.6.1
measurement 3.2.0
mistune 0.8.4
mock 4.0.3
mpmath 1.2.1
msgpack 1.0.2
mypy-extensions 0.4.3
mysqlclient 2.0.3
nbconvert 5.6.1
nbdime 3.1.0
nbformat 5.1.3
nest-asyncio 1.5.1
netCDF4 1.5.7
networkx 2.5.1
notebook 6.4.2
numpy 1.21.1
openpyxl 3.0.7
packaging 21.0
pandas 1.3.1
pandocfilters 1.4.3
parso 0.8.2
partd 1.2.0
pastel 0.2.1
pathspec 0.9.0
pbr 5.6.0
pdbufr 0.9.0
percy 2.0.2
pexpect 4.8.0
pickleshare 0.7.5
Pillow 8.3.1
Pint 0.17
pip 21.2.3
pip-licenses 3.5.1
plotly 4.14.3
pluggy 0.13.1
ply 3.11
poethepoet 0.9.0
prometheus-client 0.11.0
prompt-toolkit 3.0.19
psycopg2-binary 2.9.1
PTable 0.9.2
ptyprocess 0.7.0
py 1.10.0
pybind11 2.6.2
pybufrkit 0.2.19
pycodestyle 2.7.0
pycparser 2.20
pydantic 1.8.2
pyflakes 2.3.1
Pygments 2.9.0
pyparsing 2.4.7
PyPDF2 1.26.0
pyrsistent 0.18.0
pytest 6.2.4
pytest-cov 2.12.1
pytest-dictsdiff 0.5.8
pytest-notebook 0.6.1
python-dateutil 2.8.2
python-slugify 5.0.2
pythran 0.9.9
pytz 2021.1
PyYAML 5.4.1
pyzmq 22.2.1
qtconsole 5.1.1
QtPy 1.9.0
rapidfuzz 1.4.1
regex 2021.8.3
requests 2.26.0
requests-ftp 0.3.1
requests-unixsocket 0.2.0
retrying 1.3.3
Rx 3.2.0
scipy 1.6.1
selenium 3.141.0
Send2Trash 1.7.1
setuptools 56.1.0
six 1.16.0
smmap 4.0.0
sniffio 1.2.0
snowballstemmer 2.1.0
soupsieve 2.2.1
Sphinx 3.5.4
sphinx-autobuild 2020.9.1
sphinx-autodoc-typehints 1.12.0
sphinx-material 0.0.30
sphinxcontrib-applehelp 1.0.2
sphinxcontrib-devhelp 1.0.2
sphinxcontrib-htmlhelp 2.0.0
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.3
sphinxcontrib-serializinghtml 1.1.5
sphinxcontrib-svg2pdfconverter 1.1.1
SQLAlchemy 1.3.24
starlette 0.13.6
stevedore 3.3.0
surrogate 0.1
sympy 1.8
tabulate 0.8.9
terminado 0.10.1
testfixtures 6.18.0
testpath 0.5.0
text-unidecode 1.3
toml 0.10.2
tomlkit 0.7.2
toolz 0.11.1
tornado 6.1
tqdm 4.62.0
traitlets 5.0.5
typed-ast 1.4.3
typing-extensions 3.10.0.0
tzlocal 2.1
Unidecode 1.2.0
urllib3 1.26.6
uvicorn 0.13.4
validators 0.18.2
wcwidth 0.2.5
webencodings 0.5.1
websocket-client 1.1.1
Werkzeug 2.0.1
wetterdienst 0.20.3
wheel 0.36.2
widgetsnbextension 3.5.1
wradlib 1.10.3
xarray 0.17.0
xmltodict 0.12.0
zipp 3.5.0

@AlexDo1
Author

AlexDo1 commented Aug 8, 2021

I just downgraded scipy to version 1.6.1, but the error is unchanged.

I also executed the sample code in a .py file with the faulthandler module enabled to get a detailed traceback. This is the output; maybe you can see something in it that could explain the error?

python -Xfaulthandler wetterdienst_test
Fatal Python error: Segmentation fault

Thread 0x000000030a8a1000 (most recent call first):
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/threading.py", line 316 in wait
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/threading.py", line 574 in wait
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/tqdm/_monitor.py", line 60 in run
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/threading.py", line 973 in _bootstrap_inner
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/threading.py", line 930 in _bootstrap

Current thread 0x00000002046d9e00 (most recent call first):
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/dogpile/cache/backends/file.py", line 233 in set_serialized
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/dogpile/cache/region.py", line 1287 in _set_cached_value_to_backend
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/dogpile/cache/region.py", line 1012 in gen_value
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/dogpile/lock.py", line 178 in _enter_create
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/dogpile/lock.py", line 94 in _enter
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/dogpile/lock.py", line 185 in __enter__
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/dogpile/cache/region.py", line 1042 in get_or_create
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/dogpile/cache/region.py", line 1577 in get_or_create_for_user_func
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/decorator.py", line 232 in fun
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/wetterdienst/provider/dwd/observation/download.py", line 48 in _download_climate_observations_data
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/wetterdienst/provider/dwd/observation/download.py", line 27 in <listcomp>
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/wetterdienst/provider/dwd/observation/download.py", line 26 in download_climate_observations_data_parallel
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/wetterdienst/provider/dwd/observation/api.py", line 191 in _collect_station_parameter
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/wetterdienst/core/scalar/values.py", line 437 in query
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/tqdm/std.py", line 1185 in __iter__
  File "/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/site-packages/wetterdienst/core/scalar/values.py", line 708 in all
  File "/Users/alexd/Desktop/wetterdienst_test", line 14 in <module>
[1]    25055 segmentation fault  python -Xfaulthandler wetterdienst_test
/Users/alexd/opt/anaconda3/envs/wetterdienst_single/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
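The traceback above was produced with the `-Xfaulthandler` interpreter option. Where passing interpreter options is inconvenient (for example inside a notebook), the standard-library `faulthandler` module can also be enabled from code. The following is a minimal sketch, not something taken from the thread: the helper name `enable_crash_traceback` and the log file location are made up for illustration.

```python
import faulthandler
import tempfile
from pathlib import Path


def enable_crash_traceback(log_path):
    """Write Python tracebacks of all threads to log_path on a hard crash."""
    # The file has to stay open for as long as faulthandler is enabled,
    # so the handle is returned to the caller instead of being closed here.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Hypothetical log location, chosen only for this example.
crash_log = Path(tempfile.gettempdir()) / "wetterdienst_crash.log"
handle = enable_crash_traceback(crash_log)

# ... now run the code that crashes, e.g. request.values.all() ...
```

If the interpreter then dies with a segmentation fault, the log file should contain a per-thread traceback similar to the one shown above.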

@gutzbenj
Member

gutzbenj commented Aug 8, 2021

Alright, so the error once again seems to be related to the current caching implementation based on dogpile.cache. Unfortunately we've seen many problems with it and are currently trying to switch to another file cache, see #431. At the moment the only fix is to delete the cache. You will find the cache_dir folder by running

wetterdienst show

Delete this folder and try running the code one more time, please.
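For anyone who prefers to do the deletion from Python, here is a minimal sketch. It assumes you pass in the cache_dir path printed by `wetterdienst show`; the helper name `remove_cache_dir` is hypothetical, and the demo below runs on a throwaway directory rather than on a real cache.

```python
import shutil
import tempfile
from pathlib import Path


def remove_cache_dir(cache_dir):
    """Delete cache_dir and everything in it; return True if it was removed."""
    path = Path(cache_dir).expanduser()
    if not path.is_dir():
        # Nothing to do, e.g. the cache was already deleted.
        return False
    shutil.rmtree(path)
    return True


# Demo on a temporary directory that stands in for the real cache folder.
# The file name "cache.dbm" is made up for this example.
demo_dir = Path(tempfile.mkdtemp()) / "wetterdienst"
demo_dir.mkdir()
(demo_dir / "cache.dbm").write_text("stale entry")
removed = remove_cache_dir(demo_dir)
```

In real use, replace `demo_dir` with the path reported by `wetterdienst show`. Double-check the path first, since `shutil.rmtree` deletes without asking.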

@AlexDo1
Author

AlexDo1 commented Aug 8, 2021

I deleted the cache folder and the code now runs without errors.

Thank you so much for helping me (especially on a Sunday noon).

@AlexDo1
Author

AlexDo1 commented Aug 8, 2021

PS: If at some point you need someone to test the new caching implementation on M1, feel free to contact me.

@amotl
Member

amotl commented Aug 8, 2021

Hi Alex,

thank you for reporting your observations. Please note that, following the rationale by @meteoDaniel at #417, Wetterdienst honors the WD_CACHE_DISABLE environment variable (#426). When set to a truthy value, all caching within Wetterdienst will be turned off completely. While this will incur a significant performance penalty, it might save you from such issues recurring.

While I see the good intentions of #489 by @gutzbenj, I believe it is misguided: the culprit for the flaws when concurrently downloading remote resources is not the parallelization on the acquisition side of things, but rather the unsafety of the dbmfile-based cache implementation when accessed concurrently. As such, I believe using WD_CACHE_DISABLE is the quickest way to work around the problem.
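In a notebook or script, the variable can be set from Python instead of the shell. This is a minimal sketch: the helper name is hypothetical, "1" is used as one truthy value, and setting the variable before importing wetterdienst is a cautious assumption, since the thread does not say when the library reads it.

```python
import os


def disable_wetterdienst_cache():
    """Set WD_CACHE_DISABLE for the current process."""
    # "1" is one truthy value; which other values are accepted
    # is not stated in this thread.
    os.environ["WD_CACHE_DISABLE"] = "1"


disable_wetterdienst_cache()

# Import wetterdienst only afterwards (assumption: the variable may be
# read at import time), for example:
# from wetterdienst.provider.dwd.observation import DwdObservationRequest
```

The shell equivalent is to export WD_CACHE_DISABLE=1 before starting Python or Jupyter.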

> If at some point you need someone to test the new caching implementation on M1, feel free to contact me.

On #431 (branch collab/fsspec), the test suite now passes successfully, so I consider it ready to go. Thanks already for being willing to take a stab!

With kind regards,
Andreas.

@AlexDo1
Author

AlexDo1 commented Aug 11, 2021

Hello Andreas,

your solution works very well. I only had to delete the wetterdienst cache folder by hand. After that, with WD_CACHE_DISABLE=1 set, the data download works smoothly, and the performance is absolutely sufficient for my use case.

Thanks again!

gutzbenj added the bug label Sep 16, 2021