> [!Warning] 
> **This project is still in an early phase of development.**
>
> The [python API](https://c-star.readthedocs.io/en/latest/api.html) is not yet stable, and some aspects of the schema for the [blueprint](https://c-star.readthedocs.io/en/latest/terminology.html#term-blueprint) will likely evolve. 
> Therefore whilst you are welcome to try out using the package, we cannot yet guarantee backwards compatibility. 
> We expect to reach a more stable version in Q1 2025.
>
> To see which systems C-Star has been tested on so far, see [Supported Systems](https://c-star.readthedocs.io/en/latest/machines.html).

# Building a `Case` and exporting it as a blueprint
In this guide, we will create a ROMS-MARBL [C-Star case](https://c-star.readthedocs.io/en/latest/terminology.html#term-Case) by:

* Creating ROMS and MARBL [BaseModel](https://c-star.readthedocs.io/en/latest/terminology.html#term-BaseModel) objects
* Creating [AdditionalCode](https://c-star.readthedocs.io/en/latest/terminology.html#term-AdditionalCode) objects to represent namelist and additional source code files for ROMS
* Creating [InputDataset](https://c-star.readthedocs.io/en/latest/terminology.html#term-InputDataset) objects to tell C-Star where to find spatiotemporal data needed to run ROMS-MARBL
* Creating a [Discretization](https://c-star.readthedocs.io/en/latest/terminology.html#term-Discretization) object to tell C-Star how to partition the simulation (processor distribution and time step)
* Bringing these various objects together to make a [ROMSComponent](https://c-star.readthedocs.io/en/latest/terminology.html#term-ROMSComponent) and a [MARBLComponent](https://c-star.readthedocs.io/en/latest/terminology.html#term-MARBLComponent)
* Creating a `Case` consisting of these two [Components](https://c-star.readthedocs.io/en/latest/terminology.html#term-Component)
* Exporting this `Case` to a [blueprint file](https://c-star.readthedocs.io/en/latest/terminology.html#term-blueprint)

On [the next page](https://c-star.readthedocs.io/en/latest/2_importing_and_running_a_case.html) we will look at how to _run_ a `Case` starting from a blueprint.

In [1]:
import cstar

## The structure of the Case:
[Here](https://c-star.readthedocs.io/en/latest/terminology.html#structure-of-c-star-case) you can get a general overview of a C-Star case. For our `roms_marbl_example` [case](https://github.com/CWorthy-ocean/cstar_blueprint_roms_marbl_example), the case structure breaks down like this:  
```
Case
├── MARBLComponent
│   └── base_model (MARBLBaseModel)
└── ROMSComponent
    ├── base_model (ROMSBaseModel)
    ├── namelists (AdditionalCode)
    ├── additional_source_code (AdditionalCode)
    ├── model_grid (ROMSInputDataset)
    ├── initial_conditions (ROMSInputDataset)
    ├── tidal_forcing (ROMSInputDataset)
    ├── surface_forcing (list of ROMSInputDatasets)
    ├── boundary_forcing (list of ROMSInputDatasets)
    └── discretization (ROMSDiscretization)

```
These are all the elements needed to create a unique, reproducible ROMS-MARBL simulation. You will notice that the `Component`, `BaseModel`, `InputDataset`, and `Discretization` objects here are specific to the model they describe (e.g. `ROMSBaseModel`). This is because the `BaseModel` describing ROMS may have attributes or operations that differ from those of the `BaseModel` describing MARBL, which has its own subclass `MARBLBaseModel`.

To build this case from the bottom up, we first need to build `BaseModel` objects for ROMS and MARBL.

## Constructing the BaseModel objects
To initialize a [base model](https://c-star.readthedocs.io/en/latest/terminology.html#term-BaseModel), we will need:

- a `source_repo` (repository URL containing the base model source code) 
- a `checkout_target` (a point in the repository history we'd like to jump to). 

For ROMS we'll use the `main` branch, which resolves to its latest commit. For MARBL we'll use `marbl0.45.0` (v0.45), around which the ROMS-MARBL driver was built:

In [2]:
from cstar.roms import ROMSBaseModel
from cstar.marbl import MARBLBaseModel

In [3]:
roms_base_model = ROMSBaseModel(
    source_repo='https://github.com/CESR-lab/ucla-roms.git',
    checkout_target='main',
)

marbl_base_model = MARBLBaseModel(
    source_repo='https://github.com/marbl-ecosys/MARBL.git',
    checkout_target='marbl0.45.0',
)

In [4]:
print(roms_base_model)

ROMSBaseModel
-------------
source_repo : https://github.com/CESR-lab/ucla-roms.git (default)
checkout_target : main (corresponding to hash 246c11fa537145ba5868f2256dfb4964aeb09a25) (default)
local_config_status: 3 (Environment variable ROMS_ROOT is not present and it is assumed the base model is not installed locally)


In [5]:
print(marbl_base_model)

MARBLBaseModel
--------------
source_repo : https://github.com/marbl-ecosys/MARBL.git (default)
checkout_target : marbl0.45.0 (corresponding to hash 6e6b2f7c32ac5427e6cf46de4222973b8bcaa3d9)
local_config_status: 3 (Environment variable MARBL_ROOT is not present and it is assumed the base model is not installed locally)


## Constructing the AdditionalCode objects

To construct an `AdditionalCode` object, we need a `location` pointing to a local or remote directory or repository. 

As we are using additional code hosted in a remote repository for this example, we also need:

- a `subdir` (subdirectory relative to the repository top level in which to find the code) 
- a `checkout_target` argument (branch, tag, or commit hash)

We also need to provide a list of filenames corresponding to our `AdditionalCode`.

In this example we are using MARBL and ROMS. As ROMS handles all input and output to MARBL, we only need `AdditionalCode` instances for ROMS...

... one for run-time files (namelists):

In [6]:
from cstar.base import AdditionalCode

In [7]:
roms_namelists = AdditionalCode(
    location = "https://github.com/CWorthy-ocean/cstar_blueprint_roms_marbl_example.git",
    subdir = "additional_code/ROMS/namelists",
    checkout_target = "cstar_alpha",
    files = [
        "roms.in_TEMPLATE",
        "marbl_in",
        "marbl_tracer_output_list",
        "marbl_diagnostic_output_list"
    ]
)
print(roms_namelists)

AdditionalCode
--------------
Location: https://github.com/CWorthy-ocean/cstar_blueprint_roms_marbl_example.git
subdirectory: additional_code/ROMS/namelists
Working path: None
Exists locally: False (get with AdditionalCode.get())
Files:
    roms.in_TEMPLATE      (roms.in will be used by C-Star based on this template)
    marbl_in
    marbl_tracer_output_list
    marbl_diagnostic_output_list


<div class="alert alert-info">

Note

For `roms_namelists`, the first entry under `files` is a template namelist. C-Star recognises the `_TEMPLATE` suffix and works with a local copy (in this case `roms.in`), which it modifies with user choices such as run length and then uses to run ROMS.
</div>
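
The naming convention described in the note can be illustrated with a short sketch. The helper `working_filename` below is hypothetical and is not part of the C-Star API; it only shows how a `_TEMPLATE` suffix could map to the name of the local working copy.

```python
TEMPLATE_SUFFIX = "_TEMPLATE"


def working_filename(name: str) -> str:
    """Return the name of the local working copy for a (possibly templated) file.

    Hypothetical helper for illustration only; C-Star's own logic may differ.
    """
    if name.endswith(TEMPLATE_SUFFIX):
        # Drop the suffix: "roms.in_TEMPLATE" becomes "roms.in"
        return name[: -len(TEMPLATE_SUFFIX)]
    # Files without the suffix are used under their original name
    return name


files = [
    "roms.in_TEMPLATE",
    "marbl_in",
    "marbl_tracer_output_list",
    "marbl_diagnostic_output_list",
]
for f in files:
    print(f, "->", working_filename(f))
```

Only the first file name changes; the other three are passed through untouched.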

... and one for compile-time files (such as ROMS' `.opt` files, which are used to set parameters):

In [8]:
roms_additional_source_code = AdditionalCode(
    location = "https://github.com/CWorthy-ocean/cstar_blueprint_roms_marbl_example.git",
    subdir = "additional_code/ROMS/source_mods",
    checkout_target = "cstar_alpha",
    files = [
        "bgc.opt",
        "bulk_frc.opt",
        "cppdefs.opt",
        "diagnostics.opt",
        "ocean_vars.opt",
        "param.opt",
        "tracers.opt",
        "Makefile",
        "Make.depend",
    ]
)

print(roms_additional_source_code)

AdditionalCode
--------------
Location: https://github.com/CWorthy-ocean/cstar_blueprint_roms_marbl_example.git
subdirectory: additional_code/ROMS/source_mods
Working path: None
Exists locally: False (get with AdditionalCode.get())
Files:
    bgc.opt
    bulk_frc.opt
    cppdefs.opt
    diagnostics.opt
    ocean_vars.opt
    param.opt
    tracers.opt
    Makefile
    Make.depend


## Constructing the InputDataset objects
In addition to a base model and additional code, we need different types of input dataset, each with a specialized subclass of the [InputDataset class](https://c-star.readthedocs.io/en/latest/generated/cstar.base.InputDataset.html):

- a grid file supplying information about the domain to ROMS ([ROMSModelGrid](https://c-star.readthedocs.io/en/latest/generated/cstar.roms.ROMSModelGrid.html))
- an initial condition file from which to start the run ([ROMSInitialConditions](https://c-star.readthedocs.io/en/latest/generated/cstar.roms.ROMSInitialConditions.html))
- boundary forcing files providing information at the edge of the domain ([ROMSBoundaryForcing](https://c-star.readthedocs.io/en/latest/generated/cstar.roms.ROMSBoundaryForcing.html))
- surface forcing files providing information at the upper boundary ([ROMSSurfaceForcing](https://c-star.readthedocs.io/en/latest/generated/cstar.roms.ROMSSurfaceForcing.html))
- tidal forcing files providing information on tidal constituents ([ROMSTidalForcing](https://c-star.readthedocs.io/en/latest/generated/cstar.roms.ROMSTidalForcing.html))

All the input files for our `roms_marbl_example` case are small enough to [fit in a repository](https://github.com/CWorthy-ocean/input_datasets_roms_marbl_example).

In [9]:
from cstar.roms import ROMSModelGrid, ROMSInitialConditions, ROMSTidalForcing, ROMSBoundaryForcing, ROMSSurfaceForcing

In [10]:
#Grid
roms_model_grid = ROMSModelGrid(
    location="https://github.com/CWorthy-ocean/input_datasets_roms_marbl_example/raw/cstar_alpha/roms_grd.nc",
    file_hash="8ad2469179d9f720da6a0517e9879b4615cc16cc7eb1c93b4d9fef2344c19508",
)
# Initial conditions
roms_initial_conditions = ROMSInitialConditions(
    location="https://github.com/CWorthy-ocean/input_datasets_roms_marbl_example/raw/main/roms_ini.nc",
    file_hash="6d3df424c906c96dcb72f8fca94a51551e62f1f1e2d6b7bb744b2c4fb078d191",
)
# Tides
roms_tidal_forcing = ROMSTidalForcing(
    location="https://github.com/CWorthy-ocean/input_datasets_roms_marbl_example/raw/main/roms_tides.nc",
    file_hash="6364aeaec0860d5384e5dfe2fc0964cec338b86d63382d24ef36670cb87e6dcb",
)
# Boundary
roms_phys_boundary_forcing = ROMSBoundaryForcing(
    location="https://github.com/CWorthy-ocean/input_datasets_roms_marbl_example/raw/main/roms_bry_201201.nc",
    file_hash="9c7ec2915b46f40ea0fd5c548d65da4147304ba081812387721b0e20e5c33165",
)
roms_bgc_boundary_forcing = ROMSBoundaryForcing(
    location = "https://github.com/CWorthy-ocean/input_datasets_roms_marbl_example/raw/main/roms_bry_bgc_clim.nc",
    file_hash = "2ffaa61ba3871922d3f270e2a11af70cca6f7aa2ccced2bac7257c45f35e261c",
)
# Surface
roms_bgc_surface_forcing = ROMSSurfaceForcing(
    location="https://github.com/CWorthy-ocean/input_datasets_roms_marbl_example/raw/main/roms_frc_bgc_2012.nc",
    file_hash="ad5f8a60cb39068a0afb42f4ebe48473f68d9906bbd0d926c041c00dfe909be2",
)
roms_phys_surface_forcing = ROMSSurfaceForcing(
    location="https://github.com/CWorthy-ocean/input_datasets_roms_marbl_example/raw/main/roms_frc.201201.nc",
    file_hash="f85e2ac88abacd6009999c8ffa9a6cd5844f13e09b358705b67335fe1af621d9",
)


<div class="alert alert-info">

Note

1. The `location` attribute can be either a **local path** or a **URL**. If it is set to a URL, the `file_hash` (a 256-bit checksum) must also be provided to verify the download.
    
2. The file described by `location` can be in either **netCDF** or **yaml** format. When C-Star sees a yaml file instead of a netCDF file for ROMS input data, it assumes the file contains a set of instructions to be passed to the `roms-tools` [package](https://roms-tools.readthedocs.io/en/latest/), which will then generate the netCDF file for us when `InputDataset.get()` is called. This makes it easier to share and save ROMS configurations without the overhead associated with potentially large netCDF files. More information on creating ROMS input datasets (both yaml and netCDF) for C-Star using `roms-tools` can be found on [this page](https://c-star.readthedocs.io/en/latest/4_preparing_roms_input_datasets.html).

</div>
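
If you host your own input files, you will need a checksum to pass as `file_hash`. The sketch below assumes the checksum is SHA-256, which is consistent with the 64-character hexadecimal values used above but is not stated explicitly in this guide. The function names `sha256_of_file` and `matches_expected` are hypothetical helpers, not part of C-Star.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks.

    Chunked reading avoids loading a large netCDF file into memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling fh.read() until it returns b""
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hash):
    """Return True if the file's digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hash.strip().lower()
```

For example, after downloading `roms_grd.nc` manually, `sha256_of_file("roms_grd.nc")` would give the string to compare against the `file_hash` passed to `ROMSModelGrid`, provided the SHA-256 assumption holds.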

We can query each input dataset to get pertinent information about its state, e.g.:

In [11]:
print(roms_phys_boundary_forcing)

-------------------
ROMSBoundaryForcing
-------------------
Source location: https://github.com/CWorthy-ocean/input_datasets_roms_marbl_example/raw/main/roms_bry_201201.nc
file_hash: 9c7ec2915b46f40ea0fd5c548d65da4147304ba081812387721b0e20e5c33165
Working path: None ( does not yet exist. Call InputDataset.get() )


## Constructing the Discretization object
Lastly, we need to tell C-Star how we will be discretizing our components. MARBL piggybacks off the discretization of its host model, so we only need to create a `ROMSDiscretization` object. This contains:

- the time step (`time_step` , in seconds)
- the number of processors along the x and y directions for running in parallel (`n_procs_x`, `n_procs_y`)

In [12]:
from cstar.roms import ROMSDiscretization

roms_discretization = ROMSDiscretization(time_step = 60,
                                         n_procs_x = 3,
                                         n_procs_y = 3)
print(roms_discretization)

ROMSDiscretization
------------------
time_step: 60s
n_procs_x: 3 (Number of x-direction processors)
n_procs_y: 3 (Number of y-direction processors)
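
To make the processor layout concrete, the sketch below works out how many processors the choice above implies and how a grid might be split among them. The grid dimensions (100 x 80 points) and the `tile_sizes` helper are hypothetical, chosen only for illustration; the way ROMS actually partitions its domain may differ.

```python
n_procs_x, n_procs_y = 3, 3

# Total number of processors (one tile per processor)
total_procs = n_procs_x * n_procs_y
print(total_procs)  # 9


def tile_sizes(n_points, n_procs):
    """Split n_points grid points as evenly as possible across n_procs tiles.

    Hypothetical helper: any remainder is given to the first tiles.
    """
    base, extra = divmod(n_points, n_procs)
    return [base + 1 if i < extra else base for i in range(n_procs)]


# Hypothetical grid dimensions, not those of the example case
nx, ny = 100, 80
print(tile_sizes(nx, n_procs_x))
print(tile_sizes(ny, n_procs_y))
```

The product `n_procs_x * n_procs_y` is the figure to keep in mind when requesting resources on your machine.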


## Putting it all together to build ROMS and MARBL components:
We now have everything we need to create the `MARBLComponent` and `ROMSComponent` objects that come together to make our Case.

In [13]:
from cstar.roms import ROMSComponent
from cstar.marbl import MARBLComponent

### MARBL:
For MARBL, we just need the base model - ROMS handles all run-time code and data on MARBL's behalf:

In [14]:
marbl_component = MARBLComponent(
    base_model = marbl_base_model
)
print(marbl_component)

MARBLComponent
--------------
base_model: MARBLBaseModel instance (query using Component.base_model)


### ROMS
Our `ROMSComponent` is a little more involved, containing not just a base model, but also all our `InputDataset`s, `AdditionalCode`, and `Discretization` information:

In [15]:
roms_component = ROMSComponent(
    base_model = roms_base_model,
    namelists = roms_namelists,
    additional_source_code = roms_additional_source_code,
    discretization = roms_discretization,
    model_grid = roms_model_grid,
    initial_conditions = roms_initial_conditions,
    tidal_forcing = roms_tidal_forcing,
    boundary_forcing = [roms_phys_boundary_forcing,roms_bgc_boundary_forcing],
    surface_forcing = [roms_phys_surface_forcing, roms_bgc_surface_forcing]
)
print(roms_component)

ROMSComponent
-------------
base_model: ROMSBaseModel instance (query using Component.base_model)
additional_source_code: AdditionalCode instance with 9 files (query using Component.additional_source_code)
namelists: AdditionalCode instance with 4 files (query using Component.namelists)
model_grid = <ROMSModelGrid instance>
initial_conditions = <ROMSInitialConditions instance>
tidal_forcing = <ROMSTidalForcing instance>
surface_forcing = <list of 2 ROMSSurfaceForcing instances>
boundary_forcing = <list of 2 ROMSBoundaryForcing instances>

Discretization:
ROMSDiscretization
------------------
time_step: 60s
n_procs_x: 3 (Number of x-direction processors)
n_procs_y: 3 (Number of y-direction processors)


## And finally, we can build the Case object:
This is instantiated using:

- a list of components
- a name
- a `caseroot` (the local path where the case will be run)
- a `valid_start_date` and `valid_end_date`

<div class="alert alert-info">

Note
    

The "valid" date range specified by `valid_start_date` and `valid_end_date` corresponds to the range of dates in which this `Case` **can** be run, rather than the date range for which it **will** be run. This can be due to scientific validation of a certain period, or just availability of input data (as in this example, where we only have forcing data for January 2012).

A Case should typically also be initialized with `start_date` and `end_date`, which are unique to the `Case` instance, and specify the dates for which the `Case` **will** be run.
    
As we are building this `Case` to export, not run, we ignore the `start_date` and `end_date` parameters for now, as they are not exported. C-Star will automatically set them to the maximum valid range.
    
</div>

In [16]:
from cstar import Case

In [17]:
roms_marbl_case = Case(
    components=[marbl_component, roms_component],
    name='roms_marbl_example_cstar_case',
    caseroot = "../examples/roms_marbl_example_cstar_case",
    valid_start_date = "20120101 12:00:00",
    valid_end_date = "20121231 12:00:00"
)
print(roms_marbl_case)

C-Star Case
-----------
Name: roms_marbl_example_cstar_case
caseroot: /global/cfs/cdirs/m4746/Users/dafydd/my_c_star/examples/roms_marbl_example_cstar_case
start_date: 2012-01-01 12:00:00
end_date: 2012-12-31 12:00:00
Is setup: False
Valid date range:
valid_start_date: 2012-01-01 12:00:00
valid_end_date: 2012-12-31 12:00:00

It is built from the following Components (query using Case.components): 
   <MARBLComponent instance>
   <ROMSComponent instance>
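
In the output above, `start_date` and `end_date` have defaulted to the full valid range. The relationship between the two date ranges can be sketched as a simple check. This is an illustration using the standard library, not C-Star's own validation; `within_valid_range` is a hypothetical helper, and the date format is taken from the strings passed to `Case` above.

```python
from datetime import datetime

# Format of the date strings used in this guide, e.g. "20120101 12:00:00"
DATE_FORMAT = "%Y%m%d %H:%M:%S"


def within_valid_range(start_date, end_date, valid_start_date, valid_end_date):
    """Return True if [start_date, end_date] lies inside the valid range."""
    start, end, valid_start, valid_end = (
        datetime.strptime(d, DATE_FORMAT)
        for d in (start_date, end_date, valid_start_date, valid_end_date)
    )
    # The run must start no earlier than the valid start, end no later
    # than the valid end, and must not end before it begins
    return valid_start <= start <= end <= valid_end


# A one-month run inside the valid range used for this Case
print(
    within_valid_range(
        "20120101 12:00:00",
        "20120131 12:00:00",
        "20120101 12:00:00",
        "20121231 12:00:00",
    )
)
```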




## Visualizing the Case:
We can see how the caseroot directory will look once the case is set up using `Case.tree()`:

In [18]:
roms_marbl_case.tree()

/global/cfs/cdirs/m4746/Users/dafydd/my_c_star/examples/roms_marbl_example_cstar_case
├── input_datasets
│   └── ROMS
│       ├── roms_grd.nc
│       ├── roms_ini.nc
│       ├── roms_tides.nc
│       ├── roms_bry_201201.nc
│       ├── roms_bry_bgc_clim.nc
│       ├── roms_frc.201201.nc
│       └── roms_frc_bgc_2012.nc
├── namelists
│   └── ROMS
│       ├── roms.in_TEMPLATE
│       ├── marbl_in
│       ├── marbl_tracer_output_list
│       └── marbl_diagnostic_output_list
└── additional_source_code
    └── ROMS
        ├── bgc.opt
        ├── bulk_frc.opt
        ├── cppdefs.opt
        ├── diagnostics.opt
        ├── ocean_vars.opt
        ├── param.opt
        ├── tracers.opt
        ├── Makefile
        └── Make.depend



## Saving the Case to a blueprint file
We can save all the information associated with this case to a YAML file using `Case.to_blueprint(filename)`.
On the [next page](https://c-star.readthedocs.io/en/latest/2_importing_and_running_a_case.html) we will import and run a `Case` using a blueprint.

In [19]:
roms_marbl_case.to_blueprint("roms_marbl_example_case.yaml")

Let's take a look at the blueprint file. It contains all the information we provided above:

In [20]:
from pathlib import Path
print(Path("roms_marbl_example_case.yaml").read_text())

registry_attrs:
  name: roms_marbl_example_cstar_case
  valid_date_range:
    start_date: '2012-01-01 12:00:00'
    end_date: '2012-12-31 12:00:00'
components:
- component:
    component_type: MARBL
    base_model:
      source_repo: https://github.com/marbl-ecosys/MARBL.git
      checkout_target: marbl0.45.0
- component:
    component_type: ROMS
    base_model:
      source_repo: https://github.com/CESR-lab/ucla-roms.git
      checkout_target: main
    additional_source_code:
      location: https://github.com/CWorthy-ocean/cstar_blueprint_roms_marbl_example.git
      subdir: additional_code/ROMS/source_mods
      checkout_target: cstar_alpha
      files:
      - bgc.opt
      - bulk_frc.opt
      - cppdefs.opt
      - diagnostics.opt
      - ocean_vars.opt
      - param.opt
      - tracers.opt
      - Makefile
      - Make.depend
    namelists:
      location: https://github.com/CWorthy-ocean/cstar_blueprint_roms_marbl_example.git
      subdir: additional_code/ROMS/namelists
      ch