# Converting the CMIP6 Data Set to the OpenVisus Visualization Format

This Jupyter notebook shows how to convert a portion of the CMIP6 Data Set to the OpenVisus Visualization Format and then visualize the result.

## Assumptions

1. You are running this notebook via [OSG's OSPool Notebooks service](https://notebook.ospool.osg-htc.org) as a user of ap40.uw.osg-htc.org.

2. You have run through the steps in [the setup notebook](cmip6_setup.ipynb).

3. You have selected the `openvisuspy` kernel for this notebook via the Jupyter interface.

## Ensure that job credentials are available

N.B. This step is necessary only to work around a bug or misconfiguration that has yet to be fixed. Normally, these credentials are handled by HTCondor automatically.

In [None]:
!echo | condor_store_cred add-oauth -s scitokens -i -

## Submit the job

Set `container_image` to the location of the `openvisuspy` container image that you created in [the setup notebook](cmip6_setup.ipynb).

Set `ds_object` to the NetCDF file that you wish to convert.

Set `pelican_loc` to the location in a Pelican federation (e.g., OSDF) to which the OpenVisus index should be copied.
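
The following sketch shows how the input settings combine in the job description: `federation_prefix` and `ds_object` are joined into the URL that HTCondor transfers, and only the file name is passed to `convert_dataset.py`. The values are copied from the configuration cell in this notebook. Using `posixpath.basename` to mimic HTCondor's `$BASENAME()` macro is an approximation made for illustration, and the names `input_url` and `input_name` are hypothetical.

```python
import posixpath

# Values copied from this notebook's configuration cell.
federation_prefix = "osdf://aws-opendata/us-west-2"
ds_object = (
    "nex-gddp-cmip6/NEX-GDDP-CMIP6/ACCESS-CM2/historical/r1i1p1f1/hurs/"
    "hurs_day_ACCESS-CM2_historical_r1i1p1f1_gn_1950.nc"
)

# Full URL of the object, as listed in `transfer_input_files`.
input_url = f"{federation_prefix}/{ds_object}"

# File name as it appears in the job's working directory.
# Assumption: posixpath.basename approximates HTCondor's $BASENAME() here.
input_name = posixpath.basename(ds_object)

print(input_url)
print(input_name)
```

Printing both values before submitting is a cheap way to catch a stray slash or a mistyped object path.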

In [None]:
## Define the container image for the OpenVisus software stack.

container_image = "osdf://ospool/ap40/data/brian.aydemir/openvisuspy-20240812-1020.sif"


## Define the object to convert.

federation_prefix = "osdf://aws-opendata/us-west-2"
ds_object = "nex-gddp-cmip6/NEX-GDDP-CMIP6/ACCESS-CM2/historical/r1i1p1f1/hurs/hurs_day_ACCESS-CM2_historical_r1i1p1f1_gn_1950.nc"
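

## Define where to store the OpenVisus output.
## `destination_dir` is the output directory inside the job;
## `pelican_loc` is where that directory is copied when the job finishes.

destination_dir = "openvisus"
pelican_loc = "openvisus-fs"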

In [None]:
import htcondor
import pathlib


## Record information about where this notebook is running

hostname, = !hostname


## Remove log files from previous runs.

for ext in [".log", ".out", ".err"]:
    pathlib.Path(f"convert_dataset{ext}").unlink(missing_ok=True)


## Submit the job.

job_description = htcondor.Submit(
    f"""
    container_image = {container_image}
    args = python3 convert_dataset.py {destination_dir} $BASENAME({ds_object})

    transfer_input_files = convert_dataset.py, {federation_prefix}/{ds_object}
    transfer_output_files = {destination_dir}
    transfer_output_remaps = "{destination_dir} = {pelican_loc}"

    ## Use the backfill EP provided by the OSPool Notebooks service.
    requirements = Machine == "CHTC-Jupyter-User-EP.{hostname}"
    +FromJupyter = true

    ## Save the job log, and standard output and error.
    log = convert_dataset.log
    output = convert_dataset.out
    error = convert_dataset.err

    ## Specify resource requests and other requirements.
    request_cpus = 2
    request_memory = 4G
    request_disk = 4G

    ## Make it easier to monitor and follow up on jobs.
    stream_output = true
    stream_error = true
    on_exit_hold = ExitCode =!= 0
    """
)

submitted_job = htcondor.Schedd().submit(job_description)

## Wait for the job to complete

In [None]:
import demo_support

demo_support.wait_for_job("convert_dataset.log")

## Visualize the dataset

### Import libraries

In [None]:
import os
import pathlib

import IPython.display
import matplotlib.animation
import matplotlib.pyplot as plt
import numpy as np
import OpenVisus
import openvisuspy as ov

### Load the dataset

In [None]:
# token = pathlib.Path("pelican_token").read_text()
dataset_loc = f"{pelican_loc}/visus.idx"
db = ov.LoadDataset(dataset_loc)

### Show some basic information

In [None]:
print("Dataset loaded from:", dataset_loc)
print("Dimensions:", db.getLogicBox())
print("Total Timesteps:", len(db.getTimesteps()))
print("Field:", db.getField().name)

### Animate the data

In [None]:
## Extract dimensions from the first timestep.

quality = 0  # full resolution = 0, coarse = -4, coarser = -8
timestep = db.getTimesteps()[0]
data3D = db.db.read(time=timestep, quality=quality)
data = data3D[:,:]
H, W = data3D.shape


## Define and show the animation.

fig, axes = plt.subplots(nrows=1, ncols=1, figsize=(10, 10 * H / W))
axes.set_xlim(0, W)
axes.set_ylim(0, H)
# TODO: Determine how to set `vmin` and `vmax`.
image = axes.imshow(data, extent=[0, W, 0, H], aspect="auto", origin="lower", vmax=110, cmap="viridis")

def frame_fn(timestep):
    data3D = db.db.read(time=timestep, quality=quality)
    data = data3D[:,:]
    image.set_data(data)

plt.rcParams["animation.embed_limit"] = 100  # MB
animation = matplotlib.animation.FuncAnimation(fig, frame_fn, frames=db.getTimesteps())
IPython.display.HTML(animation.to_jshtml())
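
One possible way to address the `TODO` about `vmin` and `vmax` is to derive both limits from percentiles of the data, so that a few outliers do not wash out the colormap. The sketch below is not part of the original workflow. The helper `robust_color_limits` and its percentile defaults are hypothetical, and the synthetic frames stand in for arrays that would otherwise come from `db.db.read(...)`.

```python
import numpy as np


def robust_color_limits(frames, lower_pct=2.0, upper_pct=98.0):
    """Return (vmin, vmax) as percentiles of the finite values in `frames`.

    Hypothetical helper; `frames` is any iterable of array-likes.
    """
    values = np.concatenate(
        [np.asarray(frame, dtype=float).ravel() for frame in frames]
    )
    values = values[np.isfinite(values)]  # ignore NaN/inf fill values
    if values.size == 0:
        raise ValueError("no finite values to compute color limits from")
    vmin, vmax = np.percentile(values, [lower_pct, upper_pct])
    return float(vmin), float(vmax)


# Synthetic stand-in for a few timesteps of a 2D field.
rng = np.random.default_rng(0)
frames = [rng.uniform(20.0, 100.0, size=(60, 144)) for _ in range(3)]
frames[0][0, 0] = np.nan  # simulate a missing value

vmin, vmax = robust_color_limits(frames)
print(vmin, vmax)
```

The resulting values could then be passed to `axes.imshow(..., vmin=vmin, vmax=vmax)`. Computing them over a sample of timesteps rather than all of them keeps the cost low, at the expense of possibly clipping extremes in frames that were not sampled.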