Ipc forests #746

Merged
merged 115 commits into from
Oct 31, 2023

Conversation

Collaborator

@dulte dulte commented Sep 26, 2022

Reader for IPC Forests data for deposition

@jgriesfeller jgriesfeller self-assigned this Jan 2, 2023
@jgriesfeller
Member

jgriesfeller commented Jan 26, 2023

The reading of the variable fakedryo3 fails in the EBAS reader:

Running MOCAGE.cams2.40 (dryo3) vs. EBASMC (fakedryo3)
Cache file does not exist: /lustre/storeB/project/aerocom/aerocom2/pyaerocom_out/_cache/jang/EBASMC_fakedryo3.pkl
Fetching data files. This might take a while...
Retrieving EBAS files for variables
['fakedryo3']
Number of files to read reduced to 5736
Reading EBAS data from /lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/EBASMultiColumn/data/data
  0%|          | 0/5736 [00:00<?, ?it/s]Reading NASA Ames file:
/lustre/storeB/project/aerocom/aerocom1/AEROCOM_OBSDATA/EBASMultiColumn/data/data/AM0001R.20081231200000.20120419000000.uv_abs.ozone.air.4h.1h.AM01L_uv_abs_02.AM01L_uv_abs..nas
  0%|          | 0/5736 [00:00<?, ?it/s]
Failed to perform analysis: Traceback (most recent call last):
  File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya_para_ipc/lib/python3.10/site-packages/pyaerocom/colocation_auto.py", line 802, in run
    coldata = self._run_helper(mod_var, obs_var)
  File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya_para_ipc/lib/python3.10/site-packages/pyaerocom/colocation_auto.py", line 1429, in _run_helper
    args = self._prepare_colocation_args(model_var, obs_var)
  File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya_para_ipc/lib/python3.10/site-packages/pyaerocom/colocation_auto.py", line 1372, in _prepare_colocation_args
    obs_data = self.get_obs_data(obs_var)
  File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya_para_ipc/lib/python3.10/site-packages/pyaerocom/colocation_auto.py", line 705, in get_obs_data
    return self._read_ungridded(obs_var)
  File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya_para_ipc/lib/python3.10/site-packages/pyaerocom/colocation_auto.py", line 908, in _read_ungridded
    obs_data = obs_reader.read(
  File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya_para_ipc/lib/python3.10/site-packages/pyaerocom/io/readungridded.py", line 649, in read
    self.read_dataset(
  File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya_para_ipc/lib/python3.10/site-packages/pyaerocom/io/readungridded.py", line 393, in read_dataset
    data_read = reader.read(vars_to_read, **kwargs)
  File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya_para_ipc/lib/python3.10/site-packages/pyaerocom/io/read_ebas.py", line 1745, in read
    data = self._read_files(files, vars_to_retrieve, files_contain, constraints)
  File "/modules/rhel8/user-apps/aerocom/conda2022/envs/pya_para_ipc/lib/python3.10/site-packages/pyaerocom/io/read_ebas.py", line 1861, in _read_files
    data_obj._data[start:stop, data_obj._DATAINDEX] = values
ValueError: could not convert string to float: 'dtime'

@avaldebe
Collaborator

  File "/modules/centos7/user-apps/aerocom/anaconda3/envs/pya_para_ipc/lib/python3.9/site-packages/pyaerocom/io/ipcforests/reader.py", line 354, in ReadIPCForest
    start: str | datetime,
TypeError: unsupported operand type(s) for |: 'type' and 'type'

The type | type union syntax (PEP 604) was introduced in Python 3.10. On Python 3.8 and 3.9, add the following as the first import in the module so that annotations such as str | datetime are not evaluated when the function is defined:

from __future__ import annotations
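
An alternative sketch that does not rely on deferred annotations: the same kind of signature can be spelled with typing.Union, which also works on Python 3.8 and 3.9. This matters when annotations are evaluated at runtime, since the __future__ import only defers annotations and does not make str | datetime valid in ordinary expressions on those versions. Note that parse_start is a hypothetical helper for illustration, not the function in pyaerocom/io/ipcforests/reader.py.

```python
from datetime import datetime
from typing import Union


def parse_start(start: Union[str, datetime]) -> datetime:
    """Hypothetical helper: accept an ISO date string or a datetime."""
    if isinstance(start, datetime):
        return start
    # datetime.fromisoformat is available from Python 3.7 onwards
    return datetime.fromisoformat(start)


print(parse_start("2023-10-31"))
```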

@jgriesfeller
Member

I have updated the comment in the meantime, but yes, some of Daniel's code uses Python 3.10 features and was never tested on the earlier versions that we still support.

Collaborator

@avaldebe avaldebe left a comment

This PR extends the EBAS variables with many helper functions for derived quantities. However, I do not see the tests for those helper functions.

pyaerocom/aeroval/glob_defaults.py
pyaerocom/aeroval/coldatatojson_helpers.py
Comment on lines +429 to +430
"seasonal": {"obs": deepcopy(yeardict), "mod": deepcopy(yeardict)},
"yearly": {"obs": deepcopy(yeardict), "mod": deepcopy(yeardict)},
Collaborator

I understand the need for deepcopy to create separate copies of the same empty dictionary, but would it not be better to use collections.defaultdict (see docs)?

Here is how it would look with collections.defaultdict:

from collections import defaultdict
...

    ts_data = {
        "time": time,
        "seasonal": {"obs": defaultdict(dict), "mod": defaultdict(dict)},
        "yearly": {"obs": defaultdict(dict), "mod": defaultdict(dict)},
    }

Member

@jgriesfeller jgriesfeller Oct 31, 2023

I just did not know about defaultdict, but did you see the init of yeardict?

years = list(repw_res["seasonal"].year.values)
yeardict = {}
for year in years:
    yeardict[f"{year}"] = {}

yeardict has some predefined keys which wouldn't be in your defaultdict(dict), right?
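
The difference raised here can be sketched as follows. This is a minimal illustration with made-up years and a made-up value, not the code in coldatatojson_helpers.py; the names obs, mod and lazy_obs are assumptions chosen for the example. A deep copy of the pre-populated yeardict carries every year key from the start, while a defaultdict(dict) starts empty and only gains a key when that key is first accessed.

```python
from collections import defaultdict
from copy import deepcopy

years = [2019, 2020]  # made-up years, for illustration only

# Pre-populated template: every year key exists from the start.
yeardict = {f"{year}": {} for year in years}
obs = deepcopy(yeardict)
mod = deepcopy(yeardict)
obs["2019"]["DJF"] = 1.0  # made-up value; does not leak into `mod`

# defaultdict(dict): starts empty, a key appears only on first access.
lazy_obs = defaultdict(dict)
lazy_obs["2019"]["DJF"] = 1.0

print(list(obs))       # ['2019', '2020']
print(mod["2019"])     # {}
print(list(lazy_obs))  # ['2019']
```

Whether the empty year entries matter depends on what consumes ts_data downstream, which this thread does not settle.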

@jgriesfeller
Member

This PR extends the EBAS variables with many helper functions for derived quantities. However, I do not see the tests for those helper functions.

Adding data to the EBAS test dataset is a pain and takes hours to do. I have added some of the new variables to test_ebas_varinfo.py though. Is that sufficient for you?

@avaldebe
Collaborator

This PR extends the EBAS variables with many helper functions for derived quantities. However, I do not see the tests for those helper functions.

Adding data to the EBAS test dataset is a pain and takes hours to do. I have added some of the new variables to test_ebas_varinfo.py though. Is that sufficient for you?

I'm fine with the extra tests

@jgriesfeller jgriesfeller merged commit bfc8073 into main-dev Oct 31, 2023
19 of 21 checks passed
@jgriesfeller jgriesfeller deleted the ipc-forests branch October 31, 2023 14:23
Successfully merging this pull request may close these issues.

diurnal cycle analysis: Obs and model data is always the same