Investigation: How best to use MIP Convert in CMEW
How much information is required by MIP Convert to process data from u-bg466? What does the MIP Convert user configuration file require now?
- To run MIP Convert with raw suite output, the whole CDDS package has to be run.
- MIP Convert can be run on its own; however, it needs the .pp files to be in the input directory already and a configuration file to be set up. Overall, CDDS requires less configuration and generates a config file for MIP Convert according to the input specifications.
- CDDS requires a request.json file and a text file listing the required variables (we need the text file, but it is not strictly necessary to run the package).
- The text file contains a list of the variables we want to convert.
- The request.json file is generated on the command line and has default values that can be overridden, for example with --end_date 1860-01-01.
- A directory structure has to be created in the working directory
- The package is then run with a command line instruction
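As a minimal sketch of the variables text file described above: the helper below writes one entry per line and rejects anything that is not of the form table/variable, such as Amon/tas. The function name write_variables_file is hypothetical (it is not part of CDDS), and the one-entry-per-line layout is an assumption based on the single-variable example in this investigation.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of CDDS): write a variables file with one
# "<MIP table>/<variable>" entry per line, e.g. Amon/tas.
# Assumption: CDDS reads one entry per line from this file.
write_variables_file() {
    local outfile="$1"
    shift
    if [ "$#" -lt 1 ]; then
        echo "At least one table/variable entry is required" >&2
        return 1
    fi
    local entry
    # Validate every entry before touching the output file.
    for entry in "$@"; do
        case "$entry" in
            */*/* | /* | */ | *[!A-Za-z0-9_/-]*)
                echo "Invalid entry (expected table/variable): $entry" >&2
                return 1 ;;
            */*) ;;  # exactly one slash with text on both sides
            *)
                echo "Invalid entry (expected table/variable): $entry" >&2
                return 1 ;;
        esac
    done
    printf '%s\n' "$@" > "$outfile"
}
```

For example, `write_variables_file variables.txt Amon/tas` produces the same file as the echo command used in this investigation.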
How long does it take MIP Convert to process 10, 50 and 100 years of data from u-bg466? (/usr/bin/time might be useful here!)
- /usr/bin/time could not be used since this is a cylc workflow.
- 10 years took approximately 3 minutes.
- 50 years took approximately 56 minutes, 53 of which were spent downloading data from MASS; everything else took 3 minutes.
- There is a script, path_reformatter, written to allow the CDDS package to run with data already downloaded from MASS, although it is not well documented and further discussion with Piotr from the CDDS team would be required to implement it (if needed).
- 100 years took 2 hours and 11 minutes.
- Cylc review shows that, for 50 years, the extract task took approximately 53 minutes and all remaining tasks took a total of approximately 3 minutes.
- Given that I have been running this from the command line, I think it would be best run from a bash script with variables passed in (suite, start and end dates, variables text file).
- rose suite-run -- --no-detach will allow a cylc workflow to run so that the bigger cylc workflow it is called from will know whether it is running, failed, succeeded etc.
- Further discussion with the MISS team would be required before implementing this.
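To put the timings above on a common scale, the following sketch converts each reported wall-clock time into seconds per simulated year, and works out the share of the 50-year run spent in the extract task. The helper name seconds_per_year is hypothetical, and the figures are the approximate ones reported above (2 hours and 11 minutes taken as 131 minutes).

```bash
#!/usr/bin/env bash
# Hypothetical helper: wall-clock cost per simulated year, in whole seconds.
# Arguments: total minutes, number of years. Integer division rounds down.
seconds_per_year() {
    local minutes="$1" years="$2"
    echo $(( minutes * 60 / years ))
}

seconds_per_year 3 10      # 10-year run  -> 18
seconds_per_year 56 50     # 50-year run  -> 67
seconds_per_year 131 100   # 100-year run -> 78

# Share of the 50-year run spent downloading from MASS (53 of 56 minutes).
echo "$(( 53 * 100 / 56 ))%"   # -> 94%
```

On these numbers, the cost per year is much lower for the 10-year run than for the longer runs, and the 50-year run is dominated by the download from MASS rather than by the conversion itself.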
Create a working directory
mkdir cdds_bg466_processing
cd cdds_bg466_processing
mkdir proc data
Activate CDDS
source ~cdds/bin/setup_env_for_cdds 2.4.1
Create request.json. Start and end dates can be specified here (see write_rose_suite_request_json --help); otherwise they default to what is in the suite.
write_rose_suite_request_json u-bg466 cdds 120306 round-1 ap5
Create directory structure locally using -c and -t options
create_cdds_directory_structure request.json -c proc -t data
Create variables file. Can add more variables if needed.
echo Amon/tas > variables.txt
Bypass the data request and inventory by using the -r option to produce only the variables listed in the variables.txt we just made.
prepare_generate_variable_list request.json -p -c proc -t data -r variables.txt
Run cdds_convert, which essentially configures the u-ak283 suite: https://code.metoffice.gov.uk/trac/roses-u/browser/a/k/2/8/3/trunk
cdds_convert request.json -c proc -t data --skip_transfer
The above instructions can be found in the original gist.
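The steps above could be gathered into a bash script with the suite, end date and variables file passed in. The sketch below is one possible shape, not a tested implementation: the function names run and process_suite and the DRY_RUN switch are hypothetical. It assumes the CDDS environment has already been activated, that it is called from inside the working directory, and that --end_date can follow the positional arguments. The values cdds 120306 round-1 ap5 are copied unchanged from the u-bg466 example and may need changing for other suites.

```bash
#!/usr/bin/env bash
# Sketch of a wrapper around the manual steps. All function names here are
# illustrative. With DRY_RUN=1 the commands are printed instead of run.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: suite id, end date (YYYY-MM-DD), path to the variables file.
process_suite() {
    local suite="$1" end_date="$2" variables_file="$3"

    # Directories used by the -c and -t options below.
    run mkdir -p proc data || return 1

    # Positional values are taken from the u-bg466 example (assumption:
    # they are not valid for every suite).
    run write_rose_suite_request_json "$suite" cdds 120306 round-1 ap5 \
        --end_date "$end_date" || return 1

    run create_cdds_directory_structure request.json -c proc -t data || return 1

    # -r restricts processing to the variables listed in the file.
    run prepare_generate_variable_list request.json -p -c proc -t data \
        -r "$variables_file" || return 1

    run cdds_convert request.json -c proc -t data --skip_transfer
}
```

Setting `DRY_RUN=1` and calling `process_suite u-bg466 1860-01-01 variables.txt` prints the five commands without running them, which is a cheap way to check the arguments before starting a long extract from MASS.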