
Investigation: How best to use MIP Convert in CMEW

mo-tgeddes edited this page Feb 28, 2023 · 17 revisions

Introduction

How much information is required by MIP Convert to process data from u-bg466? What does the MIP Convert user configuration file require now?

  • To run MIP Convert with raw suite output, the whole CDDS package has to be run.
  • MIP Convert can be run on its own; however, it needs the .pp files to already be in the input directory and a configuration file to be set up. Overall, CDDS requires less configuration and generates a configuration file for MIP Convert according to the input specifications.
  • CDDS requires a request.json file. A text file listing the required variables is also needed for our purposes, although it is not technically necessary to run the package.
    • The text file contains a list of the variables we want to convert.
    • The request.json file is generated on the command line and has default values that can be overwritten, for example with --end_date 1860-01-01.
  • A directory structure has to be created in the working directory.
  • The package is then run from the command line.

How long does it take MIP Convert to process 10, 50 and 100 years of data from u-bg466? (/usr/bin/time might be useful here!)

  • /usr/bin/time could not be used because the processing runs as a Cylc workflow.
  • 10 years took approximately 3 minutes.
  • 50 years took approximately 56 minutes: 53 minutes were spent downloading data from MASS and everything else took 3 minutes.
    • A script, path_reformatter, has been written to allow the CDDS package to run with data already downloaded from MASS. It is not well documented, so further discussion with Piotr from the CDDS team would be required to implement it (if needed).
  • 100 years took 2 hours and 11 minutes.
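
The timings above were read from the workflow itself. For any step that does run in the foreground (for example a suite started with --no-detach), a small helper along these lines could record wall-clock time instead. This is only a sketch: time_step is a hypothetical name, and it measures nothing useful for a command that detaches and returns immediately.

```shell
# time_step LABEL COMMAND [ARGS...]
# Runs COMMAND in the foreground, prints the elapsed wall-clock seconds
# to stderr, and returns COMMAND's exit status.
# Hypothetical helper, not part of CDDS or Cylc.
time_step() {
    local label="$1"
    shift
    local start end status
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    printf '%s: %d s (exit status %d)\n' "$label" "$((end - start))" "$status" >&2
    return "$status"
}
```

For example, time_step convert some_blocking_command would print a line such as "convert: 180 s (exit status 0)" once the command finishes.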

The CDDS Rose suite breaks data down into chunks for MIP Convert; how long does this take?

  • Cylc Review shows that, for 50 years, the extract task took approximately 53 minutes and all remaining tasks took a total of approximately 3 minutes.

Is it possible to run a Cylc 7 Rose suite from a Cylc 8 workflow?

  • Given that I have been running this from the command line, it would probably be best run from a bash script with variables passed in (suite, start and end dates, variables text file).
  • rose suite-run -- --no-detach runs the Cylc 7 suite without detaching, so the larger Cylc workflow that calls it will know whether it is running, has failed, has succeeded, etc.
  • Further discussion with the MISS team would be required before implementing this.
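
As a rough illustration of the --no-detach idea, a task script in the larger workflow could call something like the function below. It is a sketch under assumptions: run_child_suite and the ROSE_CMD override are hypothetical names introduced here, and it assumes the exit status of rose suite-run reflects the outcome of the suite, which would need confirming before use.

```shell
# run_child_suite SUITE_DIR
# Runs the Rose suite found in SUITE_DIR in the foreground and reports the
# outcome through the exit status, so a calling task can succeed or fail.
# ROSE_CMD is a hypothetical override for dry runs; it defaults to "rose".
run_child_suite() {
    local suite_dir="$1"
    local rose_cmd="${ROSE_CMD:-rose}"
    # Arguments after "--" are passed on to "cylc run";
    # --no-detach keeps the suite attached to this process.
    if "$rose_cmd" suite-run --config="$suite_dir" -- --no-detach; then
        echo "child suite succeeded"
    else
        local status=$?
        echo "child suite failed with exit status $status" >&2
        return "$status"
    fi
}
```

Because the function only returns once the suite has stopped, the calling task stays in the running state for the duration of the child suite.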

Instructions for running CDDS


Create a working directory

mkdir cdds_bg466_processing
cd cdds_bg466_processing
mkdir proc data

Activate CDDS

source ~cdds/bin/setup_env_for_cdds 2.4.1

Create request.json. Start and end dates can be specified here (see write_rose_suite_request_json --help); otherwise they default to what is in the suite.

write_rose_suite_request_json u-bg466 cdds 120306 round-1 ap5

Create the directory structure locally using the -c and -t options.

create_cdds_directory_structure request.json -c proc -t data

Create the variables file. More variables can be added if needed.

echo Amon/tas > variables.txt
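
When several variables are needed, a small helper can write them in one go. This is a minimal sketch: write_variables_file is a hypothetical name, and it assumes the file takes one entry per line in the same table/variable form as Amon/tas.

```shell
# write_variables_file FILE VARIABLE [VARIABLE...]
# Writes each VARIABLE on its own line of FILE, replacing any existing file.
# Hypothetical helper; the one-entry-per-line layout is an assumption.
write_variables_file() {
    if [ "$#" -lt 2 ]; then
        echo "usage: write_variables_file FILE VARIABLE [VARIABLE...]" >&2
        return 1
    fi
    local outfile="$1"
    shift
    printf '%s\n' "$@" > "$outfile"
}
```

For example, write_variables_file variables.txt Amon/tas Amon/pr would produce a two-line file (Amon/pr is used here purely as an illustration).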

Bypass the data request and inventory by using the -r option, so that only the variables in the variables.txt we just made are produced.

prepare_generate_variable_list request.json -p -c proc -t data -r variables.txt

Run cdds_convert, which essentially configures the u-ak283 suite (https://code.metoffice.gov.uk/trac/roses-u/browser/a/k/2/8/3/trunk).

cdds_convert request.json -c proc -t data --skip_transfer

The above instructions can be found in the original gist.
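
The steps above could be collected into a single bash function, in the spirit of the "bash script with variables passed in" idea from the investigation. The sketch below makes several assumptions: the CDDS environment is already activated, the current directory is the working directory containing proc and data, and the fixed arguments (cdds 120306 round-1 ap5) are reused unchanged from the commands above. The names run, process_suite and DRY_RUN are hypothetical. Any extra arguments are passed straight through to write_rose_suite_request_json, so only options that the tool itself documents should be given.

```shell
# run COMMAND [ARGS...]
# Executes COMMAND, or just prints it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# process_suite SUITE_ID VARIABLES_FILE [EXTRA_REQUEST_ARGS...]
# Chains the four CDDS commands from the instructions above and stops
# at the first one that fails.
process_suite() {
    if [ "$#" -lt 2 ]; then
        echo "usage: process_suite SUITE_ID VARIABLES_FILE [EXTRA_REQUEST_ARGS...]" >&2
        return 1
    fi
    local suite_id="$1"
    local variables_file="$2"
    shift 2
    run write_rose_suite_request_json "$suite_id" cdds 120306 round-1 ap5 "$@" &&
    run create_cdds_directory_structure request.json -c proc -t data &&
    run prepare_generate_variable_list request.json -p -c proc -t data -r "$variables_file" &&
    run cdds_convert request.json -c proc -t data --skip_transfer
}
```

Setting DRY_RUN=1 before calling process_suite u-bg466 variables.txt --end_date 1860-01-01 prints the four commands without running them, which is a cheap way to check the arguments first.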

