# Lab 1 — Data Assimilation Quickstart (CADRE on OpenShift)
Goal: run a small 3DVAR cycle on sample NOAA data inside a Red Hat OpenShift Data Science (RHODS) workbench, then hand off to Tekton for a full pipeline run.
## Steps
1. Verify GPU and storage
2. Pull the sample dataset from `/data/raw` or an S3 bucket
3. Run pre-processing
4. Execute a toy DA step (mock JEDI/DART)
5. Persist artifacts to `/data/experiments/lab1`


In [None]:
import os, json, subprocess, pathlib
from datetime import datetime
base = pathlib.Path('/data/experiments/lab1')
base.mkdir(parents=True, exist_ok=True)
# NVIDIA_VISIBLE_DEVICES is only a hint; report it as unset rather than guessing a value.
print('GPU visible:', os.environ.get('NVIDIA_VISIBLE_DEVICES', 'not set'))
print('Workspace:', base)
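
Step 1 asks for a GPU and storage check, which can go a little further than reading one environment variable. The following is a minimal sketch, not part of the original lab: the function name `check_environment`, the `/data` mount path, and the 1 GiB threshold are assumptions chosen for illustration. It relies only on the standard library and on `nvidia-smi -L`, which lists GPUs when the NVIDIA driver tools are installed.

```python
import os
import shutil
import subprocess


def check_environment(path="/data", min_free_gib=1.0):
    """Report GPU visibility and free space (hypothetical helper, illustrative threshold)."""
    report = {}

    # Set by the NVIDIA container runtime; None means the variable is absent.
    report["nvidia_visible_devices"] = os.environ.get("NVIDIA_VISIBLE_DEVICES")

    # `nvidia-smi -L` prints one "GPU <n>: ..." line per device when available.
    smi = shutil.which("nvidia-smi")
    if smi:
        result = subprocess.run([smi, "-L"], capture_output=True, text=True, check=False)
        report["gpus"] = [ln for ln in result.stdout.splitlines() if ln.startswith("GPU ")]
    else:
        report["gpus"] = []

    # Fall back to the current directory if the volume is not mounted.
    target = path if os.path.exists(path) else "."
    usage = shutil.disk_usage(target)
    report["checked_path"] = target
    report["free_gib"] = usage.free / 2**30
    report["storage_ok"] = report["free_gib"] >= min_free_gib
    return report


print(check_environment())
```

An empty `gpus` list means either that no GPU is attached or that `nvidia-smi` is missing from the image, so treat it as a prompt to investigate rather than a definitive answer.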


### Preprocess (placeholder)
Replace this cell with your real preprocessing step. It writes its outputs to `/data/experiments/lab1/preproc`.

In [None]:
pre = base/'preproc'
pre.mkdir(exist_ok=True)
with open(pre/'index.txt','w') as f:
    f.write('placeholder preproc outputs\n')
print('Preproc complete')
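
Step 2 (pulling the sample dataset) has no cell of its own, so here is one possible sketch for the local `/data/raw` case. The function name `stage_raw_files` and the `*.nc` file pattern are assumptions, not something the lab defines, and the S3 path is left out because it would need a client library and credentials.

```python
import pathlib
import shutil


def stage_raw_files(raw_dir, preproc_dir, pattern="*.nc"):
    """Copy matching files from raw_dir into preproc_dir and write an index (hypothetical helper)."""
    raw_dir = pathlib.Path(raw_dir)
    preproc_dir = pathlib.Path(preproc_dir)
    preproc_dir.mkdir(parents=True, exist_ok=True)

    staged = []
    if raw_dir.is_dir():
        # Sorted so the index is stable across runs.
        for src in sorted(raw_dir.glob(pattern)):
            if src.is_file():
                shutil.copy2(src, preproc_dir / src.name)
                staged.append(src.name)

    # index.txt lists one staged file name per line (empty if nothing matched).
    (preproc_dir / "index.txt").write_text("".join(f"{name}\n" for name in staged))
    return staged
```

In this lab it could be called as `stage_raw_files('/data/raw', '/data/experiments/lab1/preproc')`. A missing `/data/raw` yields an empty index rather than an error.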


### Run a mock DA step
This cell simulates calling a containerized DA binary by writing a stand-in `analysis.nc`. The file contains plain text, not valid NetCDF.

In [None]:
da = base/'da'
da.mkdir(exist_ok=True)
with open(da/'analysis.nc','w') as f:
    f.write('mock netcdf bytes')
print('DA step complete')
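
To make the mock step a little more concrete, the sketch below shows what a 3DVAR analysis reduces to in the simplest possible setting. It assumes every grid point is observed directly (observation operator H = I) and that the background and observation error covariances are diagonal with a single variance each, in which case minimizing the 3DVAR cost function gives a pointwise weighted average. The function name `analysis_update` and all numbers are made up for illustration; this is not how JEDI or DART are invoked.

```python
def analysis_update(background, observations, var_b, var_o):
    """Pointwise 3DVAR analysis for H = I with diagonal B and R (toy case)."""
    if len(background) != len(observations):
        raise ValueError("background and observations must have equal length")
    # Scalar gain: how much weight the innovation (y - x_b) receives.
    gain = var_b / (var_b + var_o)
    return [xb + gain * (y - xb) for xb, y in zip(background, observations)]


x_b = [280.0, 282.0, 285.0]  # made-up background temperatures (K)
y = [281.0, 281.0, 287.0]    # made-up observations (K)
x_a = analysis_update(x_b, y, var_b=1.0, var_o=1.0)
print(x_a)  # [280.5, 281.5, 286.0]
```

With equal variances the analysis lands halfway between background and observation. Raising `var_o` pulls it toward the background, and raising `var_b` pulls it toward the observations.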


### Trigger Tekton pipeline (optional)
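Requires the `oc` CLI in the workbench image and RBAC that allows creating PipelineRuns in the `cadre-labs` namespace. The cell only prints the manifest; uncomment the last line to submit it.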

In [None]:
import subprocess
start_date = '2025-07-01'
end_date = '2025-07-07'
scheme = '3dvar'
cmd = ['oc', '-n', 'cadre-labs', 'create', '-f', '-']
# PipelineRun manifest; generateName lets the API server append a unique suffix.
pipelinerun = f'''apiVersion: tekton.dev/v1
kind: PipelineRun
metadata:
  generateName: cadre-lab1-run-
spec:
  pipelineRef:
    name: cadre-train
  params:
  - name: start_date
    value: {start_date}
  - name: end_date
    value: {end_date}
  - name: scheme
    value: {scheme}
  workspaces:
  - name: shared-ws
    volumeClaimTemplate:
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi
'''
print(pipelinerun)
# subprocess.run(cmd, input=pipelinerun.encode(), check=False)
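
When you are ready to submit for real, it may help to wrap the call so that a missing CLI or a rejected manifest fails loudly instead of silently. The helper below is a sketch under assumptions: the name `submit_pipelinerun` and its `dry_run` flag are hypothetical, and it uses the same `oc -n <namespace> create -f -` invocation as the cell above.

```python
import shutil
import subprocess


def submit_pipelinerun(manifest, namespace="cadre-labs", dry_run=True):
    """Create a PipelineRun from a YAML string; return oc's output, or None on a dry run."""
    cmd = ["oc", "-n", namespace, "create", "-f", "-"]
    if dry_run:
        # Nothing is sent to the cluster in this mode.
        print("dry run, would execute:", " ".join(cmd))
        return None
    if shutil.which("oc") is None:
        raise RuntimeError("oc CLI not found on PATH")
    # check=True raises CalledProcessError if oc exits non-zero (e.g. RBAC or validation errors).
    result = subprocess.run(cmd, input=manifest, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Calling `submit_pipelinerun(pipelinerun, dry_run=False)` would send the manifest printed above. The dry-run default is a deliberate choice so that re-running the notebook does not start pipelines by accident.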
