Found the issue: many of these files' attributes do not follow the conventions. The project_id is CORDEX-Australia, which doesn't exist as a project; the model name differs from the official CCAM (CCAM-1391M); and the domain is not included at all. While we can't solve all of these issues, it makes sense to change this part of tables.sql
FROM metadata
WHERE md_type = 'netcdf'
AND md_json->'attributes'->>'project_id' = 'CORDEX'
to
AND md_json->'attributes'->>'project_id' LIKE 'CORDEX%'
This way, legitimate CORDEX sub-projects such as CORDEX-ESD are also included.
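The effect of switching from an exact match to a prefix match can be checked with a minimal sketch. It assumes an in-memory SQLite table in which a plain `project_id` column stands in for the PostgreSQL expression `md_json->'attributes'->>'project_id'`; the table layout and sample rows are illustrative only and are not the real clef schema.

```python
import sqlite3

# Hypothetical stand-in for the metadata table: project_id is a plain column
# here, whereas tables.sql reads it from md_json->'attributes'->>'project_id'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (md_type TEXT, project_id TEXT)")
conn.executemany(
    "INSERT INTO metadata VALUES (?, ?)",
    [
        ("netcdf", "CORDEX"),
        ("netcdf", "CORDEX-ESD"),
        ("netcdf", "CORDEX-Australia"),  # non-standard value reported above
        ("netcdf", "CMIP5"),
    ],
)


def projects(condition):
    """Return the sorted project_id values selected by a WHERE condition."""
    sql = (
        "SELECT project_id FROM metadata "
        f"WHERE md_type = 'netcdf' AND {condition}"
    )
    return sorted(row[0] for row in conn.execute(sql))


exact = projects("project_id = 'CORDEX'")       # old condition
prefix = projects("project_id LIKE 'CORDEX%'")  # new condition

print(exact)
print(prefix)
```

The prefix match selects every project_id that starts with CORDEX, so it also picks up the non-standard CORDEX-Australia value. Note that SQLite's LIKE is case-insensitive for ASCII by default, whereas PostgreSQL's LIKE is case-sensitive.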
I also changed other parts of the code where a similar exclusion might apply. For example, in cli.py I changed
project='CORDEX'
to
project="CORDEX,CORDEX-Adjust,CORDEX-ESD,CORDEXReklies"
In this way the ESGF queries will search for data in any of the projects.
Before calling the local search, I change this back to 'CORDEX', because all the projects are stored in one table in the database.
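The two-step handling of the project constraint could look roughly like the sketch below. The helper `build_constraints` and the argument names are hypothetical and are not taken from cli.py; only the two project strings come from the change described above.

```python
# Project strings copied from the cli.py change described above.
ESGF_PROJECTS = "CORDEX,CORDEX-Adjust,CORDEX-ESD,CORDEXReklies"
LOCAL_PROJECT = "CORDEX"  # the local database keeps all projects in one table


def build_constraints(user_args, target):
    """Hypothetical helper: return query constraints for 'esgf' or 'local'."""
    constraints = dict(user_args)  # copy, so the caller's dict is untouched
    if target == "esgf":
        constraints["project"] = ESGF_PROJECTS
    elif target == "local":
        constraints["project"] = LOCAL_PROJECT
    else:
        raise ValueError(f"unknown target: {target!r}")
    return constraints


# Illustrative arguments only; the real option names may differ.
args = {"experiment": "historical", "variable": "tas", "frequency": "mon"}
esgf_query = build_constraints(args, "esgf")
local_query = build_constraints(args, "local")

print(esgf_query["project"])
print(local_query["project"])
```

Keeping the two values in one place makes it harder to forget the reset before the local query.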
Waiting for NCI to refresh the tables to do final tests.
I found an issue either with the clef.nci.org.au database or the database tables.sql file.
This is a simplified version of one of the queries done by clef:
clef=> SELECT * from cordex_dataset WHERE cordex_dataset.driving_experiment_name ='historical' AND cordex_dataset.frequency = 'mon' and cordex_dataset.model_id = 'CSIRO-CCAM';
 dataset_id | model_id | frequency | institute_id | cordex_domain | experiment_id | rcm_version_id | driving_model_id | driving_experiment_name | driving_model_ensemble_member
------------+----------+-----------+--------------+---------------+---------------+----------------+------------------+-------------------------+-------------------------------
(0 rows)
As you can see, with 'CSIRO-CCAM' as model_id the query returns no results.
If I query with 'UNSW-WRF360L' as model_id, it works:
clef=> SELECT * from cordex_dataset WHERE cordex_dataset.driving_experiment_name ='historical' AND cordex_dataset.frequency = 'mon' and cordex_dataset.model_id = 'UNSW-WRF360L';
              dataset_id              |   model_id   | frequency | institute_id | cordex_domain | experiment_id | rcm_version_id |  driving_model_id   | driving_experiment_name | driving_model_ensemble_member
--------------------------------------+--------------+-----------+--------------+---------------+---------------+----------------+---------------------+-------------------------+-------------------------------
 915f271a-2b7a-5ae4-56ea-05ac19d40aed | UNSW-WRF360L | mon       | UNSW         | AUS-44i       | historical    | v1             | CSIRO-BOM-ACCESS1-3 | historical              | r1i1p1
 e5e4d21e-6175-629b-8c32-41af705ae441 | UNSW-WRF360L | mon       | UNSW         | AUS-44        | historical    | v1             | CSIRO-BOM-ACCESS1-3 | historical              | r1i1p1
 32445df4-97fb-2287-4dfd-f467ffed8853 | UNSW-WRF360L | mon       | UNSW         | AUS-44i       | historical    | v1             | CSIRO-BOM-ACCESS1-0 | historical              | r1i1p1
 fbf3f02a-29f1-bc8e-6341-a1602f8588aa | UNSW-WRF360L | mon       | UNSW         | AUS-44        | historical    | v1             | CSIRO-BOM-ACCESS1-0 | historical              | r1i1p1
(4 rows)
Using the default clef query based on checksums, I can find the CCAM results:
clef cordex -e historical -v tas -m CCAM -d AUS-44i -f mon
/g/data/rr3/publications/CORDEX/output/AUS-44i/CSIRO/CNRM-CERFACS-CNRM-CM5/historical/r1i1p1/CSIRO-CCAM/v201312/mon/tas/files/d20170804/
/g/data/rr3/publications/CORDEX/output/AUS-44i/CSIRO/CNRM-CERFACS-CNRM-CM5/historical/r1i1p1/CSIRO-CCAM/v201312/mon/tas/v20170804/
/g/data/rr3/publications/CORDEX/output/AUS-44i/CSIRO/CSIRO-BOM-ACCESS1-0/historical/r1i1p1/CSIRO-CCAM/v201312/mon/tas/files/d20170804/
/g/data/rr3/publications/CORDEX/output/AUS-44i/CSIRO/CSIRO-BOM-ACCESS1-0/historical/r1i1p1/CSIRO-CCAM/v201312/mon/tas/v20170804/
.....
This last command first runs the same query on the ESGF, extracts the checksums, and then queries clef.nci.org.au for these specific checksums.
This means that these directories have somehow been crawled, but possibly the metadata has not been extracted to cordex_dataset?
Or does tables.sql need refreshing?
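To make the difference between the two lookups concrete, here is a minimal sketch of a checksum-based match. The records, the local index and the helper `match_by_checksum` are all hypothetical and do not reflect the actual clef code or database layout; the point is only that a lookup keyed on checksums never consults attributes such as model_id, so it can succeed even when the attribute-based table has no matching row.

```python
# Hypothetical ESGF search results: one record per file, with its checksum.
esgf_results = [
    {"filename": "tas_a.nc", "checksum": "aaa111"},
    {"filename": "tas_b.nc", "checksum": "bbb222"},
    {"filename": "tas_c.nc", "checksum": "ccc333"},
]

# Hypothetical local index built by the crawler: checksum -> local path.
local_index = {
    "aaa111": "/local/path/tas_a.nc",
    "ccc333": "/local/path/tas_c.nc",
}


def match_by_checksum(remote_files, index):
    """Split remote files into locally available paths and missing names."""
    found, missing = [], []
    for record in remote_files:
        path = index.get(record["checksum"])
        if path is None:
            missing.append(record["filename"])
        else:
            found.append(path)
    return found, missing


found, missing = match_by_checksum(esgf_results, local_index)
print(found)
print(missing)
```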