This notebook creates HydroLight-EcoLight (HE) input files (Iroot.txt files), as well as user-supplied IOP files (ac files and bb files), which can be run in HE to create a synthetic dataset.

Note that any bio-optical model can be used; here we use a Case 2, four-component model (water, chlorophyll, CDOM and minerals). The specific IOPs (SIOPs) we use are example files provided in the HE directory.

TODO: decide which SIOPs to use.

# Initializing

## Importing libraries

In [1]:
import numpy as np
import pandas as pd

## Importing SIOPs

The SIOPs are stored in csv files - we need to read those in using pandas:

In [2]:
siops_abc = pd.read_csv('data/siops_abc.csv') #provide path to the file
siops_bb = pd.read_csv('data/siops_bb.csv') #provide path to the file

In [3]:
siops_abc

Unnamed: 0,Wavelength,a*chl,a*mss,a*cdom,b*chl,b*mss
0,400,0.052899,0.080000,1.607792,0.12,0.4
1,402,0.053036,0.078450,1.569649,0.12,0.4
2,404,0.054574,0.076930,1.532410,0.12,0.4
3,406,0.055862,0.075495,1.497387,0.12,0.4
4,408,0.057642,0.074032,1.461863,0.12,0.4
...,...,...,...,...,...,...
171,742,-0.000148,0.002939,0.027872,0.12,0.4
172,744,-0.000032,0.002882,0.027211,0.12,0.4
173,746,-0.000183,0.002826,0.026565,0.12,0.4
174,748,0.000028,0.002773,0.025958,0.12,0.4


In [4]:
siops_bb

Unnamed: 0,Wave,bb*chl,bb*mss
0,400,0.001715,0.016487
1,405,0.001700,0.016420
2,410,0.001685,0.016355
3,415,0.001671,0.016290
4,420,0.001656,0.016227
...,...,...,...
66,730,0.001117,0.013555
67,735,0.001111,0.013525
68,740,0.001106,0.013495
69,745,0.001101,0.013466


# Bio-optical model: generate IOPs

We're going to generate IOPs for a range of concentrations for each optically significant material (OSM). In this case, our OSMs are phytoplankton chlorophyll-a (chl), mineral suspended sediments (MSS) and coloured dissolved organic matter (CDOM).

Let's use the following OSM concentrations:

- Chl = 0.5 and 5 mg/m^3
- MSS = 0.1, 1 and 10 g/m^3
- CDOM = 0.01, 0.04, 0.15, and 0.6 m^-1

In [12]:
chl = [0.5, 5]
mss = [0.1, 1, 10]
cdom = [0.01, 0.04, 0.15, 0.6]

nchl = len(chl); nmss = len(mss); ncdom = len(cdom)

## From OSM concentrations and SIOPs to IOPs

The total absorption, $a$, of seawater can be written as:

$$a = a_w + a_{chl} + a_{mss} + a_{cdom}$$
$$a = a_w + [CHL]a^*_{chl} + [MSS]a^*_{mss} + [CDOM]a^*_{cdom}$$

where:
- $a_{chl}$ = chlorophyll-a absorption
- $a_{mss}$ = MSS absorption
- $a_{cdom}$ = CDOM absorption
- $[CHL]$ = chlorophyll-a concentration
- $[MSS]$ = MSS concentration
- $[CDOM]$ = CDOM concentration
- $a^*_{chl}$ = chlorophyll-a specific absorption
- $a^*_{mss}$ = MSS specific absorption
- $a^*_{cdom}$ = CDOM specific absorption
- $a_{w}$ = water absorption

[See the IOCCG Report 3, section 2.6](https://ioccg.org/wp-content/uploads/2015/10/ioccg-report-03.pdf)

Similar equations can be written for scattering ($b$) and backscattering ($b_b$), except that CDOM does not scatter and so is not included in either. Attenuation ($c$) is then the sum of absorption and scattering, $c = a + b$.
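
As a quick worked example of the absorption equation, the sketch below evaluates the non-water terms at 400 nm for the lowest concentration of each OSM. The SIOP values are the rounded 400 nm entries displayed in the `siops_abc` table above, the variable names are illustrative only, and $a_w$ is left out (the result is the non-water absorption).

```python
# Rounded SIOPs at 400 nm, copied from the siops_abc table shown earlier
astar_chl_400 = 0.052899   # a*chl
astar_mss_400 = 0.080000   # a*mss
astar_cdom_400 = 1.607792  # a*cdom

# Lowest concentration of each OSM (illustrative choice)
chl_conc, mss_conc, cdom_conc = 0.5, 0.1, 0.01

# Each term is concentration multiplied by the specific absorption
a_chl = chl_conc * astar_chl_400
a_mss = mss_conc * astar_mss_400
a_cdom = cdom_conc * astar_cdom_400

# Non-water absorption, i.e. a - a_w
a_nw_400 = a_chl + a_mss + a_cdom
print(a_chl, a_mss, a_cdom, a_nw_400)
```

Because the table values are rounded for display, the results may differ in the last digits from those computed with the full-precision csv files.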

## Absorption coefficients

The OSM concentrations are saved as 1D lists, so we need to turn them into 2D row arrays (shape `(1, n)`) before multiplying by their SIOPs ($a^*$):

In [6]:
mss2d = np.expand_dims(mss, axis = 0)
cdom2d = np.expand_dims(cdom, axis = 0)
chl2d = np.expand_dims(chl, axis = 0)

Then we can multiply by the SIOPs to calculate the absorption for each material. The SIOPs are indexed out of the dataframe as 1D arrays, so they also need to be made into 2D arrays, this time as columns (shape `(176, 1)`):

In [7]:
amss = np.expand_dims(siops_abc['a*mss'],axis=1) * mss2d
acdom = np.expand_dims(siops_abc['a*cdom'],axis=1) * cdom2d
achl = np.expand_dims(siops_abc['a*chl'],axis=1) * chl2d

And the shapes of these arrays?

In [8]:
amss.shape

(176, 3)

In [9]:
acdom.shape

(176, 4)

In [10]:
achl.shape

(176, 2)

So we have one column for each concentration, and each row is a different wavelength. 
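
The column-times-row multiplication used above relies on NumPy broadcasting and is equivalent to an outer product. The sketch below shows this with made-up numbers (the `astar` and `conc` values are illustrative, not taken from the SIOP files):

```python
import numpy as np

# Made-up SIOP spectrum (3 wavelengths) and concentrations (2 values)
astar = np.array([0.05, 0.04, 0.03])
conc = [0.5, 5]

# Same pattern as the notebook: (3, 1) column times (1, 2) row -> (3, 2)
a_expand = np.expand_dims(astar, axis=1) * np.expand_dims(conc, axis=0)

# Two equivalent ways of writing the same product
a_outer = np.outer(astar, conc)
a_index = astar[:, None] * np.asarray(conc)[None, :]

print(a_expand.shape)  # (3, 2)
```

Any of the three forms gives one row per wavelength and one column per concentration.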

### Combining OSM absorption coefficients to determine the total absorption
We need to make copies of our `achl`, `acdom` and `amss` arrays, so we can add together all 24 combinations of constituents:

| chl | mss | cdom |
|---|---|---|
| 0.5 | 0.1 | 0.01 |
| 0.5 | 0.1 | 0.04 |
| 0.5 | 0.1 | 0.15 |
| 0.5 | 0.1 | 0.6 |
| 0.5 | 1 | 0.01 |
| 0.5 | 1 | 0.04 |
| 0.5 | 1 | 0.15 |
| 0.5 | 1 | 0.6 |
| 0.5 | 10 | 0.01 |
| 0.5 | 10 | 0.04 |
| 0.5 | 10 | 0.15 |
| 0.5 | 10 | 0.6 |
| 5 | 0.1 | 0.01 |
| 5 | 0.1 | 0.04 |
| 5 | 0.1 | 0.15 |
| 5 | 0.1 | 0.6 |
| 5 | 1 | 0.01 |
| 5 | 1 | 0.04 |
| 5 | 1 | 0.15 |
| 5 | 1 | 0.6 |
| 5 | 10 | 0.01 |
| 5 | 10 | 0.04 |
| 5 | 10 | 0.15 |
| 5 | 10 | 0.6 |

First we want to repeat the `achl` array, i.e. repeat each column 12 times. Why 12? See the table above: the first `achl` column has to be added to every combination of the `amss` and `acdom` columns, which totals $3 \times 4 = 12$:

In [13]:
achlrep = np.repeat(achl,nmss*ncdom,axis=1)

Let's check the shape is what we expect (176, 24):

In [14]:
achlrep.shape

(176, 24)

Great! But are the columns in the right order? Let's check the first row: the first 12 columns should all have the same value, and the next 12 columns should all have a different value.

In [15]:
achlrep[0,:]

array([0.02644942, 0.02644942, 0.02644942, 0.02644942, 0.02644942,
       0.02644942, 0.02644942, 0.02644942, 0.02644942, 0.02644942,
       0.02644942, 0.02644942, 0.26449421, 0.26449421, 0.26449421,
       0.26449421, 0.26449421, 0.26449421, 0.26449421, 0.26449421,
       0.26449421, 0.26449421, 0.26449421, 0.26449421])

Great - this worked. Onto CDOM next...

For CDOM we want to tile the `acdom` array, i.e. make 6 copies of it side by side ($2 \times 3 = 6$, one for each Chl and MSS combination). The result is one array in which each block of 4 columns cycles through the four CDOM concentrations:

In [16]:
acdomrep = np.tile(acdom,nchl*nmss)

Check the shape is (176, 24):

In [17]:
acdomrep.shape

(176, 24)

And what does the first row look like?

In [18]:
acdomrep[0,:]

array([0.01607792, 0.06431167, 0.24116877, 0.96467507, 0.01607792,
       0.06431167, 0.24116877, 0.96467507, 0.01607792, 0.06431167,
       0.24116877, 0.96467507, 0.01607792, 0.06431167, 0.24116877,
       0.96467507, 0.01607792, 0.06431167, 0.24116877, 0.96467507,
       0.01607792, 0.06431167, 0.24116877, 0.96467507])

And finally, MSS. Here we need to first repeat, and then tile:

In [19]:
amssrep = np.tile(np.repeat(amss,ncdom,axis=1),nchl)

And is the shape (176, 24)?

In [20]:
amssrep.shape

(176, 24)

Yes! Does the first row look like we expect? 

In [21]:
amssrep[0,:]

array([0.008, 0.008, 0.008, 0.008, 0.08 , 0.08 , 0.08 , 0.08 , 0.8  ,
       0.8  , 0.8  , 0.8  , 0.008, 0.008, 0.008, 0.008, 0.08 , 0.08 ,
       0.08 , 0.08 , 0.8  , 0.8  , 0.8  , 0.8  ])

Yes!

And the total non-water absorption is:

In [22]:
anw = achlrep + amssrep + acdomrep
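
The repeat-and-tile bookkeeping above can be cross-checked with a broadcasting approach. The sketch below is an alternative, not the method used in this notebook: it uses small random stand-in arrays (5 wavelengths) in place of the real `achl`, `amss` and `acdom`, builds the sum on a 4D grid and flattens it to the same column order.

```python
import numpy as np

rng = np.random.default_rng(0)
nwl, nchl, nmss, ncdom = 5, 2, 3, 4  # small stand-in sizes

# Stand-ins for achl, amss and acdom (wavelength x concentration)
achl = rng.random((nwl, nchl))
amss = rng.random((nwl, nmss))
acdom = rng.random((nwl, ncdom))

# Repeat/tile approach from the notebook
anw_tiled = (np.repeat(achl, nmss * ncdom, axis=1)
             + np.tile(np.repeat(amss, ncdom, axis=1), nchl)
             + np.tile(acdom, nchl * nmss))

# Broadcasting: axes are (wavelength, chl, mss, cdom)
anw_4d = (achl[:, :, None, None]
          + amss[:, None, :, None]
          + acdom[:, None, None, :])

# Flattening the last three axes makes CDOM vary fastest and Chl slowest
anw_bcast = anw_4d.reshape(nwl, nchl * nmss * ncdom)

print(np.allclose(anw_tiled, anw_bcast))  # True
```

The 4D form keeps each constituent on its own axis, which can be convenient for selecting, say, all runs at one MSS concentration before flattening.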

## Backscattering and scattering coefficients for CHL and MSS

These are calculated in the same way as the absorption coefficients (concentration multiplied by SIOP), but CDOM doesn't scatter, so it isn't included here:
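
Written out in the same notation as the absorption equation, the non-water quantities computed in the following cells would be as below. The subscript $nw$ and the symbols $b^*$ and $b^*_b$ are notation assumed here to match the `b*` and `bb*` SIOP columns:

$$b_{nw} = [CHL]b^*_{chl} + [MSS]b^*_{mss}$$
$$b_{b,nw} = [CHL]b^*_{b,chl} + [MSS]b^*_{b,mss}$$
$$c_{nw} = a_{nw} + b_{nw}$$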

In [23]:
chl2d = np.expand_dims(chl, axis = 0)

bmss = np.expand_dims(siops_abc['b*mss'],axis=1) * mss2d
bchl = np.expand_dims(siops_abc['b*chl'],axis=1) * chl2d

bbmss = np.expand_dims(siops_bb['bb*mss'],axis=1) * mss2d
bbchl = np.expand_dims(siops_bb['bb*chl'],axis=1) * chl2d

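And then we repeat the arrays so that there is one column per run. The scattering arrays end up with shape (176, 24); the backscattering arrays have the same 24 columns, but one row per wavelength in `siops_bb`, which is on a coarser 5 nm grid: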

In [24]:
bchlrep = np.repeat(bchl,nmss*ncdom,axis=1)
bmssrep = np.tile(np.repeat(bmss,ncdom,axis=1),nchl)

bbchlrep = np.repeat(bbchl,nmss*ncdom,axis=1)
bbmssrep = np.tile(np.repeat(bbmss,ncdom,axis=1),nchl)

## Nonwater backscattering and attenuation coefficients

In [25]:
bbnw = bbchlrep + bbmssrep
bnw = bchlrep + bmssrep

cnw = anw + bnw

# Create HydroLight ac, bb and input files

## Creating the wavelength arrays

In [26]:
wavelac = np.linspace(400,750,176)
wac = np.expand_dims(np.insert(wavelac,0,176), axis = 0)

wavelbb = np.linspace(400,750,71)
wbb = np.expand_dims(np.insert(wavelbb,0,71), axis = 0)
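
The first entry of `wac` and `wbb` is the number of wavelengths, followed by the wavelengths themselves, and each grid should match the wavelength column of the corresponding SIOP table. A minimal sketch for checking the spacing implied by the two `np.linspace` calls (the `step_ac` and `step_bb` names are illustrative):

```python
import numpy as np

wavelac = np.linspace(400, 750, 176)
wavelbb = np.linspace(400, 750, 71)

# Spacing between neighbouring wavelengths on each grid
step_ac = np.diff(wavelac)
step_bb = np.diff(wavelbb)
print(step_ac[0], step_bb[0])

# With the SIOP tables loaded, one could also compare directly, e.g.
# np.allclose(wavelac, siops_abc['Wavelength'])
```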

### Repeating the CHL, MSS and CDOM concentration arrays

This will be useful for writing data into the ac, bb and input HydroLight files

In [27]:
chlrep = np.repeat(chl,nmss*ncdom)
mssrep = np.tile(np.repeat(mss,ncdom,),nchl)
cdomrep = np.tile(cdom,nchl*nmss)
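
One way to confirm that the three repeated concentration arrays line up with the combination table is to compare them against `itertools.product`. This is a sketch for checking only; the `combos` and `stacked` names are illustrative.

```python
import itertools
import numpy as np

chl = [0.5, 5]
mss = [0.1, 1, 10]
cdom = [0.01, 0.04, 0.15, 0.6]
nchl, nmss, ncdom = len(chl), len(mss), len(cdom)

chlrep = np.repeat(chl, nmss * ncdom)
mssrep = np.tile(np.repeat(mss, ncdom), nchl)
cdomrep = np.tile(cdom, nchl * nmss)

# itertools.product varies the last input fastest, so CDOM changes every run,
# MSS every 4 runs and Chl every 12 runs
combos = np.array(list(itertools.product(chl, mss, cdom)))
stacked = np.column_stack([chlrep, mssrep, cdomrep])

print(np.array_equal(combos, stacked))  # True
```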

## Creating the ac files

The variable `path` is the full path of each ac file. The directory it points into (`data/ac-files/` here) must already exist, because `open` does not create missing directories.

In [28]:
for run in range(0,nchl*nmss*ncdom,1):
    path='data/ac-files/ac'+str(run)+'.txt'
    with open(path, "w") as f:  
        aca=np.concatenate([[1],anw[:,run],cnw[:,run]])
        acb=np.concatenate([[-1],anw[:,run],cnw[:,run]])
        ac = np.array([aca, acb])
        osms = [chlrep[run], mssrep[run], cdomrep[run]]
        
        print('ac file',file=f)
        print('-----------',file=f)
        print('**need ten lines header**',file=f)
        print('10th line contains run information',file=f)
        print('11th line contains number of wavelengths & those wavelengths',file=f)
        print('12th line onwards contains ac data',file=f)
        print('ac data takes form of:',file=f)
        print('depth then a for each wavelength, then c for each wavelength',file=f)
        print('(NB - HE6 recognises end of data by negative depth)',file=f)
        print(osms,file=f)
        np.savetxt(f,wac,fmt='%i',delimiter=' ')
        np.savetxt(f,ac,fmt='%f',delimiter=' ', newline='\n')
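
To check that a written file has the intended layout, it can be read back in. The sketch below uses a hypothetical helper, `read_ac_file`, and a tiny stand-in file with three wavelengths and placeholder header lines; it only assumes the layout written by the loop above (ten header lines, a wavelength line, then depth, a and c on each data row).

```python
import os
import tempfile
import numpy as np

def read_ac_file(path, nheader=10):
    """Hypothetical helper: parse a file laid out like the ac files above."""
    with open(path) as f:
        lines = f.read().splitlines()
    # Line 11: number of wavelengths followed by the wavelengths
    wl_line = np.array(lines[nheader].split(), dtype=float)
    nwl, wavelengths = int(wl_line[0]), wl_line[1:]
    # Line 12 onwards: depth, a at each wavelength, c at each wavelength
    data = np.array([ln.split() for ln in lines[nheader + 1:]], dtype=float)
    depth = data[:, 0]
    a = data[:, 1:1 + nwl]
    c = data[:, 1 + nwl:1 + 2 * nwl]
    return wavelengths, depth, a, c

# Tiny stand-in file with 3 wavelengths, written the same way as above
wl = np.array([[3, 400, 402, 404]])
a_demo = np.array([0.05, 0.04, 0.03])
c_demo = np.array([0.50, 0.45, 0.40])
rows = np.array([np.concatenate([[1], a_demo, c_demo]),
                 np.concatenate([[-1], a_demo, c_demo])])

demo_path = os.path.join(tempfile.mkdtemp(), 'ac_demo.txt')
with open(demo_path, 'w') as f:
    for i in range(10):
        print('header line ' + str(i + 1), file=f)  # placeholder header
    np.savetxt(f, wl, fmt='%i', delimiter=' ')
    np.savetxt(f, rows, fmt='%f', delimiter=' ')

wavelengths, depth, a, c = read_ac_file(demo_path)
print(wavelengths, depth)
```

The same helper could be pointed at one of the real files, e.g. `data/ac-files/ac0.txt`, to compare the values against `anw[:, 0]` and `cnw[:, 0]`.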

## Creating the bb files

The variable `path` is the full path of each bb file. The directory it points into (`data/bb-files/` here) must already exist.

In [29]:
for run in range(0,nchl*nmss*ncdom,1):
    path='data/bb-files/bb'+str(run)+'.txt'
    with open(path, "w") as f:  
        bba=np.concatenate([[1],bbnw[:,run]])
        bbb=np.concatenate([[-1],bbnw[:,run]])
        bb = np.array([bba, bbb])
        osms = [chlrep[run], mssrep[run], cdomrep[run]]
        
        print('bb file',file=f)
        print('-----------',file=f)
        print('**need ten lines header**',file=f)
        print('10th line contains run information',file=f)
        print('11th line contains number of wavelengths & those wavelengths',file=f)
        print('12th line onwards contains bb data',file=f)
        print('bb data takes form of:',file=f)
        print('depth then bb for each wavelength',file=f)
        print('(NB - HE6 recognises end of data by negative depth)',file=f)
        print(osms,file=f)
        np.savetxt(f,wbb,fmt='%i',delimiter=' ')
        np.savetxt(f,bb,fmt='%f',delimiter=' ', newline='\n')

## Creating the input files

The variable `ifpath` is the full path of the directory where you want to save the HydroLight input files.

`acpcpath` and `bbpcpath` are the full paths of the directories where the ac files and bb files are saved. These matter because they are written into each input file, and HydroLight will look there for the files.

In [30]:
datapath = 'X:/cmitchell/07-teaching-mentoring/03-guestLectures/OceanOpticsSummerClass/2025/HydroLight-Lab/data/'

acpcpath = datapath + 'ac-files/'
bbpcpath = datapath + 'bb-files/'
ifpath = datapath + 'HEinputfiles/'
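
Since `open(..., 'w')` raises an error if the target directory is missing, it can help to create the three folders before writing any files. A minimal sketch, using a temporary folder as a stand-in for `datapath` so that it runs anywhere (replace it with your own path):

```python
import os
import tempfile

# Stand-in for datapath so the sketch can run anywhere; swap in your own path
datapath = os.path.join(tempfile.mkdtemp(), 'data') + '/'

acpcpath = datapath + 'ac-files/'
bbpcpath = datapath + 'bb-files/'
ifpath = datapath + 'HEinputfiles/'

for folder in (acpcpath, bbpcpath, ifpath):
    # exist_ok=True makes this safe to re-run
    os.makedirs(folder, exist_ok=True)

print(all(os.path.isdir(p) for p in (acpcpath, bbpcpath, ifpath)))  # True
```

Note that the ac and bb loops earlier in this notebook write to the relative folders `data/ac-files/` and `data/bb-files/`, so `datapath` should resolve to that same `data/` folder for HydroLight to find the files.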

In [35]:
for run in range(0,nchl*nmss*ncdom,1):
    osms = [chlrep[run], mssrep[run], cdomrep[run]] #for the run description
    with open(ifpath+'I'+str(run)+'.txt', 'w') as f: #path for where to save the input files
        print('0, 400, 700, 0.02, 488, 0.00026, 1, 5.3',file=f)
        print(osms,file=f)
        print(str(run),file=f) #setting "title"
        print('0, 0, 0, 1, 0',file=f)
        print('3, 1, 0, 0, 0, 0',file=f)
        print('2, 2',file=f)
        print('0, 0.5,',file=f)
        print('0, 1, 440, 1, 0.014',file=f)
        print('2, -666, 440, 1, 0.014',file=f)
        print('../data/H2OabDefaults_SEAwater.txt',file=f) 
        print('../data/defaults/astarpchl.txt',file=f)
        print('0, -999, -999, -999, -999, -999',file=f)
        print('-666, -999, -999, -999, -999, -999',file=f)
        print('bstarDummy.txt',file=f)
        print('dummybstar.txt',file=f)
        print('0, 0, 550, 0.01, 0',file=f)
        print('-2, 0, 550, 0.01, 0',file=f)
        print('dpf_pure_H2O.txt',file=f)
        print('dpf_Petzold_avg_particle.txt',file=f)
        print('23',file=f)
        print('400,410,420,430,440,450,460,470,480,490,500,510,520,530,540,550,560,570,580,590,600,610,620,630,640,650,660,670,680,690,700',file=f)
        print('0,0,0,0,0',file=f)
        print('2, 3, 30, 0, 0',file=f) #middle entry is solar angle
        print('-1, 0, 0, 29.92, 1, 80, 2.5, 15, 5, 300',file=f)
        print('5, 1.34, 20, 35, 3',file=f)
        print('0, 0',file=f)
        print('0, 3, 0, 1, 2',file=f) #depths
        print('../data/H2OabDefaults_SEAwater.txt',file=f)
        print('1',file=f)
        print(acpcpath+'ac'+str(run)+'.txt',file=f) #directory to where the ac files are saved
        print('dummyFilteredAc9.txt',file=f)
        print(bbpcpath+'bb'+str(run)+'.txt',file=f) #directory to where the bb files are saved
        print('dummyCHLdata.txt',file=f)
        print('dummyCDOMdata.txt',file=f)
        print('dummyR.bot',file=f)
        print('dummydata.txt',file=f)
        print('dummyComp.txt',file=f)
        print('DummyIrrad.txt',file=f)
        print('..\\data\\MyBiolumData.txt',file=f)
        print('DummyRad.txt',file=f)

## Making the run list

We also need to add our Iroot files to the runlist file. Below we make a new runlist file that lists all of our input files:

In [34]:
with open(ifpath+'runlist.txt','w') as f:
    for run in range(0,nchl*nmss*ncdom,1):
        print('I'+str(run)+'.txt',file=f)
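
Before starting a batch of runs, it may be worth confirming that every entry in the run list has a matching input file. The sketch below uses a hypothetical helper, `missing_from_runlist`, and demonstrates it in a temporary folder where one input file is deliberately left out.

```python
import os
import tempfile

def missing_from_runlist(ifpath):
    """Hypothetical check: return runlist entries with no matching file."""
    with open(os.path.join(ifpath, 'runlist.txt')) as f:
        names = [line.strip() for line in f if line.strip()]
    return [n for n in names if not os.path.isfile(os.path.join(ifpath, n))]

# Demo in a temporary folder: three listed runs, but only two input files
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, 'runlist.txt'), 'w') as f:
    for run in range(3):
        print('I' + str(run) + '.txt', file=f)
for run in range(2):
    open(os.path.join(demo_dir, 'I' + str(run) + '.txt'), 'w').close()

print(missing_from_runlist(demo_dir))  # ['I2.txt']
```

Calling `missing_from_runlist(ifpath)` on the real folder should return an empty list once all input files have been written.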