# Generating a CSV Line List From a .par File

This notebook shows how to use `MoleculeLineList` to load a HITRAN `.par` file, filter the line data, and export it as a CSV line list in the format used by iSLAT's `LINELISTS/` folder.

In [15]:
# Imports
import numpy as np
import pandas as pd

from IPython.display import display

# Import data types from iSLAT
from iSLAT.Modules.DataTypes.MoleculeLineList import MoleculeLineList
from iSLAT.Modules.FileHandling import hitran_data_folder_path, data_files_path

print(f"Pandas version: {pd.__version__}")
print(f"NumPy version: {np.__version__}")

Pandas version: 2.3.3
NumPy version: 2.4.2


## 1. Load Molecular Data

Load a HITRAN `.par` file using `MoleculeLineList`. The first load parses the file and creates a binary cache for fast subsequent loads.

In [16]:
# Load H2O line list from the HITRAN .par file
h2o_lines = MoleculeLineList(
    molecule_id="H2O",
    filename=hitran_data_folder_path / "data_Hitran_H2O.par"
)

lines_df = h2o_lines.get_pandas_table()
print(f"Total lines loaded: {len(lines_df)}")
display(lines_df)

Total lines loaded: 305561


index,nr,lev_up,lev_low,lam,freq,a_stein,e_up,e_low,g_up,g_low
0,0,0_0_0|10_2_9,0_0_0|9_3_6,933.27661,3.212257e+11,6.177000e-06,1861.25073,1845.83411,63,57
1,1,0_0_1|5_1_5,0_0_1|4_2_2,928.22180,3.229750e+11,8.967000e-06,5865.74316,5850.24268,33,27
2,2,0_2_0|6_5_1,0_2_0|7_4_4,926.64453,3.235247e+11,2.590000e-05,6039.06494,6023.53857,13,15
3,3,0_1_0|14_3_12,0_1_0|13_4_9,926.56085,3.235540e+11,9.288000e-06,6021.03809,6005.50977,87,81
4,4,0_0_0|5_1_5,0_0_0|4_2_2,922.00464,3.251529e+11,1.157000e-05,469.94110,454.33624,11,9
...,...,...,...,...,...,...,...,...,...,...
305556,305556,-2-2-2|3_3_0,0_0_0|4_3_1,0.30001,9.992745e+14,2.005000e-07,48509.87500,552.26367,7,9
305557,305557,-2-2-2|1_1_1,0_0_0|2_2_0,0.30001,9.992758e+14,1.953000e-07,48153.58203,195.90945,3,5
305558,305558,-2-2-2|2_1_1,0_0_0|3_1_2,0.30001,9.992813e+14,3.315000e-07,48207.37500,249.43471,15,21
305559,305559,-2-2-2|3_-3_-3,0_0_0|4_3_2,0.30001,9.992901e+14,5.255000e-08,48508.71484,550.35651,21,27
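
As a quick sanity check on the columns, the first row of the table above is consistent with `lam` in µm, `freq` in Hz, and `e_up`/`e_low` in K. The sketch below recomputes the frequency and the energy gap from the wavelength using only physical constants. The units are inferred from the numbers, not taken from iSLAT documentation, and both helper names are hypothetical.

```python
# Physical constants in SI units (exact by definition since 2019)
C_LIGHT = 2.99792458e8      # speed of light, m/s
H_PLANCK = 6.62607015e-34   # Planck constant, J s
K_BOLTZ = 1.380649e-23      # Boltzmann constant, J/K


def freq_from_lam(lam_um):
    """Frequency in Hz for a wavelength in microns (assumed unit of `lam`)."""
    return C_LIGHT / (lam_um * 1e-6)


def energy_gap_kelvin(lam_um):
    """Transition energy h*nu/k in kelvin for a wavelength in microns."""
    return H_PLANCK * freq_from_lam(lam_um) / K_BOLTZ


# Values copied from the first row of the table above
lam, freq, e_up, e_low = 933.27661, 3.212257e11, 1861.25073, 1845.83411

print(f"freq from lam: {freq_from_lam(lam):.6e} Hz (table: {freq:.6e})")
print(f"h*nu/k:        {energy_gap_kelvin(lam):.3f} K (e_up - e_low: {e_up - e_low:.3f})")
```

If the two pairs of numbers agree to within rounding, the assumed units hold for that row.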


## 2. Compare With an Existing iSLAT Line List

iSLAT uses CSV line lists stored in `DATAFILES/LINELISTS/`. Let's look at the format of an existing one.

In [17]:
# Load an existing iSLAT line list to see the expected format
linelists_path = data_files_path / "LINELISTS"

existing_ll = pd.read_csv(linelists_path / "MIRI_H2O_v1-1.csv")
print("Existing line list columns:", list(existing_ll.columns))
print(f"Existing line list rows: {len(existing_ll)}")
display(existing_ll)

Existing line list columns: ['species', 'lev_up', 'lev_low', 'lam', 'a_stein', 'e_up', 'e_low', 'g_up', 'g_low']
Existing line list rows: 254


index,species,lev_up,lev_low,lam,a_stein,e_up,e_low,g_up,g_low
0,H2O,0_1_0|17_2_15,0_1_0|16_1_16,10.33669,2.598,7488.47217,6096.55957,105,99.0
1,H2O,0_1_0|18_4_15,0_1_0|17_1_16,10.43161,6.625,8433.01367,7053.76709,111,105.0
2,H2O,0_1_0|16_3_14,0_1_0|15_0_15,10.98840,2.322,6974.64600,5665.28711,99,93.0
3,H2O,0_1_0|18_5_14,0_1_0|17_2_15,11.06768,10.520,8788.45215,7488.47168,111,105.0
4,H2O,0_1_0|17_3_14,0_1_0|16_2_15,11.08288,5.979,7864.93213,6566.73438,105,99.0
...,...,...,...,...,...,...,...,...,...
249,H2O,1_0_0|9_6_3,1_0_0|8_5_4,27.24858,13.640,7562.85645,7034.83691,57,51.0
250,H2O,1_0_0|8_7_2,1_0_0|7_6_1,27.28562,20.040,7500.73096,6973.42871,51,45.0
251,H2O,0_1_0|16_4_13,0_1_0|15_3_12,27.33261,13.230,7329.25049,6802.85498,99,93.0
252,H2O,0_0_1|9_6_4,0_0_1|8_5_3,27.57881,14.030,7682.55176,7160.85547,57,51.0


## 3. Filter Lines

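Apply filters to select a subset of lines. Here we keep the lines in the 10–20 µm range whose upper and lower levels both carry the vibrational label `0_1_0` (that is, `lev_up` and `lev_low` both start with `0_1_0`). Note that the existing `MIRI_H2O_v1-1.csv` list shown above also contains `1_0_0` and `0_0_1` levels, so this filter covers only part of what that file includes.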

In [18]:
# Keep lines with vibrational label 0_1_0 in both levels, in the 10-20 micron range
mask = (
    lines_df['lev_up'].str.startswith('0_1_0') &
    lines_df['lev_low'].str.startswith('0_1_0') &
    (lines_df['lam'] >= 10) &
    (lines_df['lam'] <= 20)
)

filtered_df = lines_df[mask].copy()
print(f"Filtered lines: {len(filtered_df)}")
display(filtered_df.describe())

Filtered lines: 471


statistic,nr,lam,freq,a_stein,e_up,e_low,g_up,g_low
count,471.0,471.0,471.0,471.0,471.0,471.0,471.0,471.0
mean,6240.197452,14.581656,21346290000000.0,16.82539,6673.681883,5649.22159,52.184713,50.002123
std,753.306056,2.770596,4216495000000.0,33.17214,1682.325517,1678.570725,29.913596,28.805104
min,4964.0,10.01481,15055380000000.0,8.26e-07,3461.8999,2614.90649,11.0,9.0
25%,5628.5,12.104005,18009700000000.0,0.02009,5398.88916,4246.8772,26.0,25.0
50%,6151.0,14.75455,20318650000000.0,0.2869,6557.2915,5513.75439,39.0,39.0
75%,6905.5,16.64617,24768030000000.0,10.44,7858.34326,6861.86133,81.0,75.0
max,7587.0,19.91264,29934920000000.0,149.4,10707.86621,9704.40625,123.0,117.0
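
The `startswith('0_1_0')` test relies on each level label having the form `<vibrational>|<rotational>`, as in `0_1_0|17_2_15`. A slightly more explicit alternative, sketched below, splits the label at `|` and compares the vibrational part exactly. The label format is inferred from the tables in this notebook, and `split_level` and `is_vib_band` are hypothetical helpers, not part of iSLAT.

```python
def split_level(label):
    """Split a 'vib|rot' level label into its vibrational and rotational parts."""
    vib, sep, rot = label.partition("|")
    if not sep:
        raise ValueError(f"no '|' separator in level label: {label!r}")
    return vib, rot


def is_vib_band(lev_up, lev_low, vib_up, vib_low):
    """True if the upper and lower labels carry exactly the given vibrational parts."""
    return split_level(lev_up)[0] == vib_up and split_level(lev_low)[0] == vib_low


# Labels copied from the tables in this notebook
print(split_level("0_1_0|17_2_15"))
print(is_vib_band("0_1_0|17_2_15", "0_1_0|16_1_16", "0_1_0", "0_1_0"))
print(is_vib_band("1_0_0|9_6_3", "1_0_0|8_5_4", "0_1_0", "0_1_0"))
```

In pandas, the same exact comparison could be written as `lines_df['lev_up'].str.split('|', regex=False).str[0] == '0_1_0'`.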


## 4. Convert to iSLAT Line List CSV Format

The iSLAT line list CSV format uses the columns: `species, lev_up, lev_low, lam, a_stein, e_up, e_low, g_up, g_low`.

The table loaded from the `.par` file already contains these fields, so we only need to add the `species` column and select the remaining columns in the right order.

In [19]:
# Build the CSV line list DataFrame in iSLAT format
csv_linelist = filtered_df[['lev_up', 'lev_low', 'lam', 'a_stein', 'e_up', 'e_low', 'g_up', 'g_low']].copy()
csv_linelist.insert(0, 'species', 'H2O')

# Sort by wavelength for consistency
csv_linelist = csv_linelist.sort_values('lam').reset_index(drop=True)

print(f"CSV line list: {len(csv_linelist)} lines")
display(csv_linelist)

CSV line list: 471 lines


index,species,lev_up,lev_low,lam,a_stein,e_up,e_low,g_up,g_low
0,H2O,0_1_0|11_10_1,0_1_0|10_7_4,10.01481,0.06886,6861.86133,5425.21143,69,63
1,H2O,0_1_0|11_10_2,0_1_0|10_7_3,10.01494,0.06886,6861.86133,5425.23047,23,21
2,H2O,0_1_0|20_4_16,0_1_0|19_3_17,10.01604,12.58000,10027.41309,8590.94141,41,39
3,H2O,0_1_0|13_9_4,0_1_0|12_6_7,10.03925,0.47890,7365.63818,5932.48682,81,75
4,H2O,0_1_0|13_9_5,0_1_0|12_6_6,10.06264,0.47960,7365.63721,5935.81641,27,25
...,...,...,...,...,...,...,...,...,...
466,H2O,0_1_0|9_3_7,0_1_0|8_0_8,19.84255,0.92460,4088.18481,3363.08789,19,17
467,H2O,0_1_0|10_10_1,0_1_0|9_9_0,19.88340,64.80000,6470.45605,5746.84912,63,57
468,H2O,0_1_0|10_10_0,0_1_0|9_9_1,19.88340,64.82000,6470.45605,5746.84912,21,19
469,H2O,0_1_0|11_7_4,0_1_0|11_4_7,19.89919,0.11930,5810.36084,5087.32812,69,69


## 5. Save as CSV Line List

Save the filtered line list as a CSV that can be placed in iSLAT's `DATAFILES/LINELISTS/` folder and loaded as an input line list.

In [20]:
# Save to the LINELISTS folder (or any output path)
output_path = linelists_path / "MIRI_H2O_v1-1_10-20um_custom.csv"

csv_linelist.to_csv(output_path, index=False)
print(f"Saved {len(csv_linelist)} lines to {output_path.name}")

Saved 471 lines to MIRI_H2O_v1-1_10-20um_custom.csv
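
Because the output lands next to the line lists that ship with iSLAT, it can be worth refusing to overwrite a file that already exists. The following is a minimal sketch using only `pathlib`; `safe_output_path` is a hypothetical helper, and the temporary directory stands in for `linelists_path`.

```python
import tempfile
from pathlib import Path


def safe_output_path(folder, filename):
    """Return folder/filename, raising if a file with that name already exists."""
    path = Path(folder) / filename
    if path.exists():
        raise FileExistsError(f"refusing to overwrite {path.name}")
    return path


# Demonstration in a throwaway directory (stand-in for the LINELISTS folder)
with tempfile.TemporaryDirectory() as tmp:
    target = safe_output_path(tmp, "custom_linelist.csv")
    target.write_text("species,lev_up,lev_low,lam,a_stein,e_up,e_low,g_up,g_low\n")
    try:
        safe_output_path(tmp, "custom_linelist.csv")  # second call hits the existing file
    except FileExistsError as err:
        print(f"Blocked: {err}")
```

The returned path can be passed to `to_csv` in the same way as `output_path` above.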


## 6. Verify the Saved File

Reload the CSV, print its columns for comparison with the expected iSLAT line list format, and confirm that the row count is unchanged.

In [21]:
# Reload and verify
reloaded = pd.read_csv(output_path)

print(f"Reloaded columns: {list(reloaded.columns)}")
print(f"Reloaded lines:   {len(reloaded)}")
print(f"Original lines:   {len(csv_linelist)}")
print(f"\nMatch: {len(reloaded) == len(csv_linelist)}")
display(reloaded)

Reloaded columns: ['species', 'lev_up', 'lev_low', 'lam', 'a_stein', 'e_up', 'e_low', 'g_up', 'g_low']
Reloaded lines:   471
Original lines:   471

Match: True


index,species,lev_up,lev_low,lam,a_stein,e_up,e_low,g_up,g_low
0,H2O,0_1_0|11_10_1,0_1_0|10_7_4,10.01481,0.06886,6861.86133,5425.21143,69,63
1,H2O,0_1_0|11_10_2,0_1_0|10_7_3,10.01494,0.06886,6861.86133,5425.23047,23,21
2,H2O,0_1_0|20_4_16,0_1_0|19_3_17,10.01604,12.58000,10027.41309,8590.94141,41,39
3,H2O,0_1_0|13_9_4,0_1_0|12_6_7,10.03925,0.47890,7365.63818,5932.48682,81,75
4,H2O,0_1_0|13_9_5,0_1_0|12_6_6,10.06264,0.47960,7365.63721,5935.81641,27,25
...,...,...,...,...,...,...,...,...,...
466,H2O,0_1_0|9_3_7,0_1_0|8_0_8,19.84255,0.92460,4088.18481,3363.08789,19,17
467,H2O,0_1_0|10_10_1,0_1_0|9_9_0,19.88340,64.80000,6470.45605,5746.84912,63,57
468,H2O,0_1_0|10_10_0,0_1_0|9_9_1,19.88340,64.82000,6470.45605,5746.84912,21,19
469,H2O,0_1_0|11_7_4,0_1_0|11_4_7,19.89919,0.11930,5810.36084,5087.32812,69,69
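
The check above compares row counts only. A stricter sketch, using just the standard library, also confirms that the header matches the column order listed in Section 4 and that every row has the same number of fields. `EXPECTED_COLUMNS` is copied from this notebook; `check_linelist_csv` is a hypothetical helper, and the inline sample reuses the first two rows of the table above.

```python
import csv
import io

EXPECTED_COLUMNS = ["species", "lev_up", "lev_low", "lam", "a_stein",
                    "e_up", "e_low", "g_up", "g_low"]


def check_linelist_csv(handle):
    """Return the number of data rows; raise ValueError on a header or field-count mismatch."""
    reader = csv.reader(handle)
    header = next(reader, None)
    if header != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {header}")
    n_rows = 0
    for row in reader:
        if len(row) != len(EXPECTED_COLUMNS):
            raise ValueError(
                f"line {reader.line_num}: expected {len(EXPECTED_COLUMNS)} fields, got {len(row)}"
            )
        n_rows += 1
    return n_rows


# In-memory sample standing in for the saved file
sample = io.StringIO(
    "species,lev_up,lev_low,lam,a_stein,e_up,e_low,g_up,g_low\n"
    "H2O,0_1_0|11_10_1,0_1_0|10_7_4,10.01481,0.06886,6861.86133,5425.21143,69,63\n"
    "H2O,0_1_0|11_10_2,0_1_0|10_7_3,10.01494,0.06886,6861.86133,5425.23047,23,21\n"
)
print(f"Rows checked: {check_linelist_csv(sample)}")
```

For the file written in Section 5, the same function could be called on `open(output_path, newline="")`.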
