# Converting PC-SUDS to MiniSEED — What We Actually Learned

This project produced what is likely the first modern, open-source **PC-SUDS → MiniSEED** converter in Python.  
To do that, we had to reverse-engineer how **modern EchoPro and Gecko digitizers write SUDS files**, which differ significantly from the 1990s-era SRC specification.

This document summarises the *actual* structure of the files and the logic required to extract waveforms and metadata reliably.

---

# 1. **SUDS File Structure (EchoPro / Gecko)**

Every SUDS block begins with a **12-byte STRUCTAG**:

| Bytes | Meaning |
|------|---------|
| 0 | `'S'` sync |
| 1 | `'6'` machine ID |
| 2–3 | struct ID (`uint16`) |
| 4–7 | struct length (`uint32`) |
| 8–11 | data length (`uint32`) |

A full block looks like:

```
[STRUCTAG][STRUCT_BODY][DATA_BODY]
```
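
As a minimal sketch of the tag layout in the table above, the 12 bytes can be decoded with a single `struct` format string. The helper name `parse_structag` is hypothetical, and the example tag is synthetic rather than taken from a real file.

```python
import struct

# sync char, machine char, struct ID (uint16), struct length (uint32), data length (uint32)
TAG_FORMAT = "<ccHII"
TAG_SIZE = struct.calcsize(TAG_FORMAT)  # 12 bytes, no padding with "<"


def parse_structag(tag_raw):
    """Decode one 12-byte STRUCTAG into a dict (hypothetical helper)."""
    if len(tag_raw) != TAG_SIZE:
        raise ValueError(f"Expected {TAG_SIZE} bytes, got {len(tag_raw)}")
    sync, machine, struct_id, len_struct, len_data = struct.unpack(TAG_FORMAT, tag_raw)
    if sync != b"S" or machine != b"6":
        raise ValueError("Not a PC-SUDS tag (bad sync or machine byte)")
    return {"struct_id": struct_id, "len_struct": len_struct, "len_data": len_data}


# Synthetic tag: a DESCRIPTRACE (ID 7) with a 64-byte struct body and 30000 data bytes
example_tag = struct.pack(TAG_FORMAT, b"S", b"6", 7, 64, 30000)
tag = parse_structag(example_tag)
```

The next tag then starts `12 + len_struct + len_data` bytes after the start of the current one.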

In real digitizer files (EchoPro, Gecko), the only struct types that appear are:

| ID | Struct Type |
|----|-------------|
| **5** | STATIONCOMP |
| **7** | DESCRIPTRACE |
| **20** | COMMENT |
| *(all others)* | absent |

Notably missing in modern SUDS files:

❌ **SUDS_INSTRUMENT (ID 31 in the PC-SUDS specification)**  
❌ Indexes, triggers, site structures  
❌ Per-instrument response tables  

Modern firmware simply does not emit these struct types.
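
One way to check which struct types a file contains is to walk the tags and count the IDs. The sketch below does this on an in-memory stream built from synthetic blocks; `count_struct_ids` and `_block` are hypothetical helpers, and the block sizes are placeholders rather than real payloads.

```python
import io
import struct
from collections import Counter


def count_struct_ids(stream):
    """Walk a SUDS byte stream and count how often each struct ID occurs."""
    counts = Counter()
    while True:
        tag_raw = stream.read(12)
        if len(tag_raw) < 12:
            break  # end of file (or a truncated trailing tag)
        sync, machine, struct_id, len_struct, len_data = struct.unpack("<ccHII", tag_raw)
        if sync != b"S":
            raise ValueError("Lost sync while walking SUDS blocks")
        # Skip the struct body and the data body without decoding them
        stream.seek(len_struct + len_data, io.SEEK_CUR)
        counts[struct_id] += 1
    return counts


def _block(struct_id, body=b"", data=b""):
    """Build one synthetic [STRUCTAG][STRUCT_BODY][DATA_BODY] block."""
    return struct.pack("<ccHII", b"S", b"6", struct_id, len(body), len(data)) + body + data


fake_file = io.BytesIO(
    _block(5, b"\x00" * 108)
    + _block(7, b"\x00" * 64, b"\x00" * 8)
    + _block(20, b"\x00" * 8, b"hi")
)
inventory = count_struct_ids(fake_file)
```

For a real file, pass `open(path, "rb")` instead of the `BytesIO` object.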

---

# 2. **Where Metadata Actually Lives**

## ✔ A. STATIONCOMP (ID = 5) — the main metadata block

This contains:

- latitude / longitude / elevation  
- component code  
- channel number  
- sensor type (`v`, `a`, etc.)  
- data type (`i`, `l`, `f`)  
- data units  
- polarity  
- **A/D gain** (`atod_gain`)  
- **start epoch** (effective time)  

This is the **correct** source for station and channel identity.

### Important  
STATIONCOMP contains **A/D gain**, but *not* the recorder’s “counts per volt” calibration and *not* the sensor’s volts-per-unit sensitivity.  
Those live outside the SUDS struct system.

---

## ✔ B. DESCRIPTRACE (ID = 7) — waveform header + waveform data

Earlier we thought DESCRIPTRACE headers were empty — that was wrong.  
We now know that DESCRIPTRACE reliably contains:

- `begintime` – start time as a FLOAT64 (the parser below reads it directly as epoch seconds)  
- `length` – exact number of samples  
- `rate` – the **correct sampling rate**  
- `datatype` – `'i'`, `'l'`, `'2'`, `'f'`  
- waveform min/max values  
- some gain correction floats  
- number of clipped samples  
- and then the **raw sample array**

### Why DESCRIPTRACE is essential
It provides **the correct:**

- start time  
- sample rate  
- number of samples  

and replaces the old “assume 60-second files” logic, which the converter keeps only as a fallback when the header cannot be parsed or `rate` is not positive.
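
As a worked example of why `length` and `rate` matter, the time of the last sample follows from `start + (length - 1) / rate`. The sketch below uses the LOCU values printed later in this notebook (15000 samples at 250 Hz starting at 04:50:00 UTC); the function name `trace_end_time` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def trace_end_time(start, nsamples, rate_hz):
    """Time of the last sample: start + (nsamples - 1) / rate."""
    return start + timedelta(seconds=(nsamples - 1) / rate_hz)


start = datetime(2025, 12, 9, 4, 50, 0, tzinfo=timezone.utc)
end = trace_end_time(start, 15000, 250.0)
print(end.isoformat())  # 2025-12-09T04:50:59.996000+00:00
```

This matches the end time ObsPy reports for that file, one sample interval (4 ms) short of a full minute.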

---

## ✔ C. COMMENT (ID = 20)

EchoPro writes rich text blocks:

```
DataLogger=Echo Pro
Battery Voltage=10.76
SensorA=Guralp CMG-6T-1
SensorASerial=66532
SensorB=Guralp CMG-5T-2g
...
```

This is valuable operational metadata, but *not* response information.
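
Because the COMMENT text is mostly `key=value` lines, it can be turned into a dictionary with a few lines of code. This is a minimal sketch, assuming one entry per line. The helper name `parse_comment_text` is hypothetical, and lines without `=` (such as `Time OK`, which appears in a real EchoPro comment shown later) are collected separately.

```python
def parse_comment_text(text):
    """Split COMMENT text into a key/value dict plus a list of free-form lines."""
    fields = {}
    extras = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first "=" only, so values may themselves contain "="
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
        else:
            extras.append(line)
    return fields, extras


sample = (
    "DataLogger=Echo Pro\n"
    "Battery Voltage=10.76\n"
    "SensorA=Guralp CMG-6T-1\n"
    "SensorASerial=66532\n"
    "Time OK\n"
)
fields, extras = parse_comment_text(sample)
```

All values stay as strings here; converting `Battery Voltage` and similar fields to numbers is left to the caller.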

---

# 3. **The Critical Discovery: Recorder & Sensor Sensitivity Are Not in Any SUDS Struct**

This is the biggest practical insight from the reverse-engineering effort.

| Quantity | Where it actually lives |
|---------|--------------------------|
| **Recorder sensitivity (counts per volt)** | float32 @ **absolute offset 156** |
| **Sensor sensitivity (volts per unit)** | float32 @ **absolute offset 176** |

Neither value belongs to any SUDS struct. Offsets are counted in bytes from the start of the file.

Examples found in real files:

### Recorder sensitivity
- Gecko: **419430.4**  
- EchoPro: **838860.8**

### Sensor sensitivity
- HML1 accelerometer: **1010.0**  
- LOCU velocimeter: **2400.0**  
- STBK accelerometer: **750.0**

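These values were stable across many single-station EchoPro/Gecko files. The multi-station `.dmx` event file tested later in this notebook returned 2465210112.0 and 1.0 at the same offsets, so the fixed offsets should not be assumed to hold for that variant.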
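
To show how the two sensitivities combine, here is a minimal sketch that converts raw counts to sensor units by two divisions. It assumes a flat response and ignores `atod_gain` and any frequency dependence, so it is an illustration of the arithmetic rather than a full response correction. The function name is hypothetical; the numbers are the EchoPro and HML1 values listed above.

```python
def counts_to_physical(counts, counts_per_volt, volts_per_unit):
    """Convert raw counts to sensor units, assuming a flat response."""
    volts = counts / counts_per_volt   # recorder stage: counts -> volts
    return volts / volts_per_unit      # sensor stage: volts -> physical units


# 838860.8 counts corresponds to exactly 1 V at the EchoPro input,
# which the HML1 sensor sensitivity (1010.0 V per unit) maps to 1/1010 units.
value = counts_to_physical(838860.8, counts_per_volt=838860.8, volts_per_unit=1010.0)
```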

---

# 4. **Waveform Extraction (Final, Correct Logic)**

We now use:

- **DESCRIPTRACE.rate** → sample rate  
- **DESCRIPTRACE.length** → number of samples  
- **STATIONCOMP.start_epoch** → start time  
- **STATIONCOMP.metadata** → channel identity  
- **raw `len_data`** → bytes to read  
- **datatype** → sample width  

This is robust for all tested digitizers and file variants.
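
The datatype-to-width step can be sketched with the standard library alone. The converter below uses NumPy for the same job; this version uses `struct` so that it runs without extra dependencies. The table `SAMPLE_FORMATS` and the function `decode_samples` are hypothetical names, and the mapping of `'2'` to a 32-bit integer follows what the converter does rather than the SUDS specification.

```python
import struct

# datatype code -> (struct format character, bytes per sample)
SAMPLE_FORMATS = {
    "i": ("h", 2),  # 16-bit signed integer
    "l": ("i", 4),  # 32-bit signed integer
    "2": ("i", 4),  # treated as a 32-bit integer, as in the converter below
    "f": ("f", 4),  # 32-bit float
}


def decode_samples(data_bytes, datatype, length=None):
    """Decode little-endian samples, trusting `length` only if it fits the payload."""
    fmt_char, width = SAMPLE_FORMATS[datatype]
    available = len(data_bytes) // width
    if length is not None and 0 < length <= available:
        nsamp = length
    else:
        nsamp = available  # fall back to everything the payload holds
    return list(struct.unpack(f"<{nsamp}{fmt_char}", data_bytes[: nsamp * width]))


payload = struct.pack("<4h", 1, -2, 3, -4)
all_samples = decode_samples(payload, "i")
first_two = decode_samples(payload, "i", length=2)
```

An unknown datatype raises `KeyError` here, whereas the converter silently skips the waveform.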

---

# 5. **Limitations of the SUDS Format (Based on Real Files)**

### ❌ No SEED location codes  
We must supply `"00"`, `"60"`, etc., externally.

### ❌ No instrument response information  
EchoPro/Gecko store only:
- counts-per-volt (offset 156)
- volts-per-unit (offset 176)

### ❌ Channels often have non-SEED names  
Could be:
- `E/N/Z`
- `CHE/CHN/CHZ`
- `c01/c02/c03`

User mapping is unavoidable.
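
A user-supplied mapping can be as simple as a dictionary lookup that fails loudly on unknown names. In the sketch below the mapping contents are purely illustrative: the SEED codes on the right are assumptions for a hypothetical deployment and must come from the network operator, not from the file. The function name `to_seed_channel` is also hypothetical.

```python
# Hypothetical mapping for one deployment (illustrative codes only)
EXAMPLE_CHANNEL_MAP = {
    "c01": "HHE",
    "c02": "HHN",
    "c03": "HHZ",
}


def to_seed_channel(raw_name, channel_map):
    """Return the SEED channel code configured for a raw SUDS channel name."""
    if raw_name not in channel_map:
        raise KeyError(f"No SEED channel code configured for {raw_name!r}")
    code = channel_map[raw_name]
    if len(code) != 3:
        raise ValueError(f"SEED channel codes have three characters, got {code!r}")
    return code


seed_code = to_seed_channel("c01", EXAMPLE_CHANNEL_MAP)
```

Raising on an unmapped name, rather than passing it through, keeps non-SEED names such as `c01` from ending up in the MiniSEED output unnoticed.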

### ❌ Older SUDS structures (INSTRUMENT, ARRAY, etc.) were absent from every EchoPro/Gecko file tested  
The firmware on these digitizers does not appear to write them.

---

# 6. **Summary**

- Modern digitizers produce a **minimal**, consistent subset of SUDS.
- Only three struct types appear:
  - **STATIONCOMP (ID 5)** — metadata  
  - **DESCRIPTRACE (ID 7)** — waveform header + samples  
  - **COMMENT (ID 20)** — text metadata  
- Recorder and sensor sensitivities live **outside** all structs, at fixed offsets.
- DESCRIPTRACE gives the correct sample rate and sample count. It also carries `begintime`, but the converter below takes each trace's start time from STATIONCOMP's effective time.
- STATIONCOMP gives correct channel/station metadata.
- Result: a fully functional, reliable PC-SUDS → MiniSEED converter in Python.

This gives us a transparent, reproducible extraction workflow — and replaces the often-misleading Java code with a clean, modern implementation.

In [70]:
import struct
import numpy as np
from obspy import Trace, Stream, UTCDateTime
from dataclasses import dataclass, field


In [74]:



# ============================================================
# 1. STATIONCOMP parser
# ============================================================
def parse_stationcomp_struct(raw, station_to_loc=None):
    """
    Parse a SUDS_STATIONCOMP struct of length 108 bytes
    (STATIDENT + STATIONCOMP guts + LONGIDENT) as per SRC Java.

    Returns a metadata dict with keys:
        network, station, location, channel_name, channel_number,
        component_char, data_type, start_epoch, latitude, longitude,
        elevation, atod_gain, sensor_type, data_units, polarity.
    """
    if len(raw) != 108:
        raise ValueError(f"Expected 108 bytes for STATIONCOMP, got {len(raw)}")

    # --- STATIDENT (0–11) ---
    statident = raw[0:12]
    net_short = statident[0:4].decode("ascii", errors="ignore").strip("\x00")
    sta_short = statident[4:9].decode("ascii", errors="ignore").strip("\x00")
    comp_char = statident[9:10].decode("ascii", errors="ignore")

    # --- STATIONCOMP guts (12–75, 64 bytes) ---
    sc = raw[12:76]

    lat = struct.unpack("<d", sc[4:12])[0]
    lon = struct.unpack("<d", sc[12:20])[0]
    elev = struct.unpack("<f", sc[20:24])[0]

    sensor_type = sc[31:32].decode("ascii", errors="ignore")
    data_type = sc[32:33].decode("ascii", errors="ignore")   # 'i', 'l', '2', 'f', ...
    data_units = sc[33:34].decode("ascii", errors="ignore")
    polarity = sc[34:35].decode("ascii", errors="ignore")

    channel_num = struct.unpack("<h", sc[48:50])[0]
    atod_gain = struct.unpack("<h", sc[50:52])[0]

    effective_val = struct.unpack("<i", sc[52:56])[0]
    # These files use seconds since Unix epoch
    if 1_000_000_000 < effective_val < 2_200_000_000:
        start_epoch = effective_val
    else:
        start_epoch = None

    # --- LONGIDENT (76–107, 32 bytes) ---
    li = raw[76:108]
    net_long = li[0:8].decode("ascii", errors="ignore").strip("\x00")
    sta_long = li[8:24].decode("ascii", errors="ignore").strip("\x00")
    comp_long = li[24:32].decode("ascii", errors="ignore").strip("\x00")

    # Prefer LONGIDENT if present
    network = net_long or net_short
    station = sta_long or sta_short
    channel_name = comp_long or comp_char  # 'CHE', 'CHZ', 'CHN', 'c01', …

    # Location code: SUDS has none → use mapping, else '00'
    if station_to_loc is not None and station in station_to_loc:
        location = station_to_loc[station]
    else:
        location = "00"

    return {
        "network": network,
        "station": station,
        "location": location,
        "channel_name": channel_name,
        "channel_number": channel_num,
        "component_char": comp_char,
        "data_type": data_type,
        "start_epoch": start_epoch,
        "latitude": lat,
        "longitude": lon,
        "elevation": elev,
        "atod_gain": atod_gain,
        "sensor_type": sensor_type,
        "data_units": data_units,
        "polarity": polarity,
    }


# ============================================================
# 2. DESCRIPTRACE parser (layout from the SRC Java code + hexdumps)
# ============================================================
def _parse_descriptrace_struct(raw):
    """
    Parse SUDS_DESCRIPTRACE "guts" *after* STATIDENT.

    Layout inside the struct (after 12-byte STATIDENT):

      double begintime       (8 bytes)  [seconds since epoch in these files]
      short  localtime       (2)
      char   datatype        (1)        'i', 'l', '2', 'f'
      char   descriptor      (1)
      short  digi_by         (2)
      short  processed       (2)
      int    length          (4)        # samples
      float  rate            (4)        Hz
      float  mindata         (4)
      float  maxdata         (4)
      float  avenoise        (4)
      int    numclip         (4)
      double time_correct    (8)
      float  rate_correct    (4)

    We only actually *need*:
      begintime, datatype, length, rate
    """
    if len(raw) < 12 + 8 + 2 + 1 + 1 + 2 + 2 + 4 + 4:
        raise ValueError("DESCRIPTRACE struct too short")

    pos = 12  # skip STATIDENT

    begintime_sec = struct.unpack_from("<d", raw, pos)[0]
    pos += 8

    localtime = struct.unpack_from("<h", raw, pos)[0]
    pos += 2

    datatype = struct.unpack_from("<c", raw, pos)[0].decode("ascii", errors="ignore")
    pos += 1

    descriptor = struct.unpack_from("<c", raw, pos)[0].decode("ascii", errors="ignore")
    pos += 1

    digi_by = struct.unpack_from("<h", raw, pos)[0]
    pos += 2

    processed = struct.unpack_from("<h", raw, pos)[0]
    pos += 2

    length = struct.unpack_from("<i", raw, pos)[0]
    pos += 4

    rate = struct.unpack_from("<f", raw, pos)[0]
    pos += 4

    # We could read the rest, but we don't need it here.

    return {
        "begintime_sec": begintime_sec,
        "datatype": datatype,
        "length": length,
        "rate": rate,
        "localtime": localtime,
        "digi_by": digi_by,
        "processed": processed,
        "descriptor": descriptor,
    }


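def parse_instrument_struct(struct_bytes):
    """
    Placeholder for SUDS_INSTRUMENT. Not implemented; returns an empty dict.
    ID 10, originally noted here, is FEATURE and must not be used for
    this struct.
    """
    return {}


def parse_phasepick_struct(struct_bytes):
    """
    Placeholder for phase picks. Not implemented; returns an empty dict.
    Picks are expected in FEATURE (ID 10); ID 20, originally noted here,
    is COMMENT.
    Intended result: station, channel, pick time, phase type, residual, etc.
    """
    return {}


def parse_study_struct(struct_bytes):
    """
    Placeholder for SUDS_STUDY (noted as ID=27, not verified against files).
    Not implemented; returns an empty dict.
    """
    return {}


def parse_longident_struct(struct_bytes):
    """
    Placeholder for SUDS_LONGIDENT (noted as ID=31, not verified against files).
    Extended station ID. Not implemented; returns an empty dict.
    """
    return {}


def parse_hypo_struct(struct_bytes):
    """
    Placeholder for SUDS_HYPO (noted as ID=32, not verified against files).
    Not implemented; returns an empty dict. Intended result:
        {
          "origin_time": ...,
          "latitude": ...,
          "longitude": ...,
          "depth_km": ...,
          "magnitude": ...,
          "errors": {...}
        }
    """
    return {}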


In [59]:

# ============================================================
# 3. Waveform extraction — now using DESCRIPTRACE.rate/length
# ============================================================
def extract_waveforms_with_metadata(path, station_to_loc=None):
    """
    Extract all waveforms from a SUDS file and attach metadata derived
    from STATIONCOMP + DESCRIPTRACE.

    Returns a list of dicts with keys:
        network, station, location, channel,
        start_epoch, samprate, latitude, longitude, elevation, data (np.float32).
    """
    waves = []
    meta = None  # last STATIONCOMP for pairing

    with open(path, "rb") as f:
        while True:
            tag_raw = f.read(12)
            if len(tag_raw) < 12:
                break

            sync, mach, id_struct, len_struct, len_data = struct.unpack("<ccHII", tag_raw)
            if sync != b"S" or mach != b"6":
                raise RuntimeError(f"Bad SUDS tag sync/machine in {path}")

            struct_bytes = f.read(len_struct)
            data_bytes = f.read(len_data) if len_data > 0 else b""

            # --- STATIONCOMP (metadata per channel) ---
            if id_struct == 5:
                try:
                    meta = parse_stationcomp_struct(struct_bytes, station_to_loc=station_to_loc)
                except Exception:
                    meta = None
                continue

            # --- DESCRIPTRACE (waveform header + samples) ---
            if id_struct == 7:
                if meta is None:
                    # No matching STATIONCOMP; skip waveform safely
                    continue

                # First try to parse DESCRIPTRACE header
                desc = None
                try:
                    desc = _parse_descriptrace_struct(struct_bytes)
                except Exception:
                    desc = None

                # Decide datatype
                if desc is not None and desc["datatype"] in ("i", "l", "2", "f"):
                    dt = desc["datatype"]
                else:
                    dt = meta["data_type"]

                # Map datatype -> bytes/sample, numpy dtype
                if dt == "i":
                    bps = 2
                    np_dt = "<i2"
                elif dt in ("l", "2"):
                    bps = 4
                    np_dt = "<i4"
                elif dt == "f":
                    bps = 4
                    np_dt = "<f4"
                else:
                    # Unknown type: skip this waveform
                    continue

                total_samples = len(data_bytes) // bps

                # Prefer DESCRIPTRACE.length if sane, else fall back to all bytes
                if desc is not None and 0 < desc["length"] <= total_samples:
                    nsamp = desc["length"]
                else:
                    nsamp = total_samples

                # Prefer DESCRIPTRACE.rate if sane, else derive from nsamp/duration if you want
                if desc is not None and desc["rate"] > 0:
                    samprate = float(desc["rate"])
                else:
                    # Fallback: crude guess (1-minute files). You can tweak if needed.
                    samprate = float(nsamp) / 60.0 if nsamp > 0 else 0.0

                # Read exactly nsamp samples
                nbytes = nsamp * bps
                data = np.frombuffer(data_bytes[:nbytes], dtype=np_dt).astype("float32")

                waves.append({
                    # --- High-level waveform info ---
                    "network": meta["network"],
                    "station": meta["station"],
                    "location": meta["location"],
                    "channel": meta["channel_name"],     # ObsPy channel code
                    "start_epoch": meta["start_epoch"],
                    "samprate": samprate,
                    "data": data,

                    # --- Coordinates ---
                    "latitude": meta["latitude"],
                    "longitude": meta["longitude"],
                    "elevation": meta["elevation"],

                    # --- Channel-level metadata from STATIONCOMP ---
                    "channel_number": meta["channel_number"],
                    "component_char": meta["component_char"],
                    "sensor_type": meta["sensor_type"],      # 'v', 'a', etc
                    "data_type": meta["data_type"],          # 'i', 'l', '2', 'f'
                    "data_units": meta["data_units"],        # usually 'd'
                    "polarity": meta["polarity"],            # 'n' or 'r'
                    "atod_gain": meta["atod_gain"],          # REAL value!
                })

            # All other struct types are ignored here

    return waves


# ============================================================
# 4. Stream builder (unchanged interface)
# ============================================================
def suds_file_to_stream(path, station_to_loc=None):
    """
    Parse a SUDS file and return an ObsPy Stream with one Trace per component.
    """
    waves = extract_waveforms_with_metadata(path, station_to_loc=station_to_loc)

    traces = []
    for w in waves:
        if w["start_epoch"] is None:
            raise RuntimeError(f"No valid start_epoch found for file {path}")

        tr = Trace(
            data=w["data"],
            header={
                "network": w["network"],
                "station": w["station"],
                "location": w["location"],
                "channel": w["channel"],
                "starttime": UTCDateTime(w["start_epoch"]),
                "sampling_rate": w["samprate"],
            },
        )
        traces.append(tr)

    return Stream(traces=traces)

## Test of reading a stream

In [60]:
st = suds_file_to_stream("data/20251209_0450_LOCU.seismosphere.sud")
print(st)

3 Trace(s) in Stream:
VW.LOCU.00.CHE | 2025-12-09T04:50:00.000000Z - 2025-12-09T04:50:59.996000Z | 250.0 Hz, 15000 samples
VW.LOCU.00.CHN | 2025-12-09T04:50:00.000000Z - 2025-12-09T04:50:59.996000Z | 250.0 Hz, 15000 samples
VW.LOCU.00.CHZ | 2025-12-09T04:50:00.000000Z - 2025-12-09T04:50:59.996000Z | 250.0 Hz, 15000 samples


In [61]:
st = suds_file_to_stream("data/20251208_0500_TRPU.seismosphere.sud")
print(st)

3 Trace(s) in Stream:
VW.TRPU.00.CHE | 2025-12-08T05:00:00.000000Z - 2025-12-08T05:00:59.996000Z | 250.0 Hz, 15000 samples
VW.TRPU.00.CHZ | 2025-12-08T05:00:00.000000Z - 2025-12-08T05:00:59.996000Z | 250.0 Hz, 15000 samples
VW.TRPU.00.CHN | 2025-12-08T05:00:00.000000Z - 2025-12-08T05:00:59.996000Z | 250.0 Hz, 15000 samples


In [62]:
for tr in st:
    print(tr.stats['sampling_rate'])

250.0
250.0
250.0


In [63]:
st = suds_file_to_stream("data/2025-12-05 0952 Mansfield Vic.dmx")
print(st)

63 Trace(s) in Stream:

S1.AUSMG.00.HHZ | 2025-12-05T09:51:59.000000Z - 2025-12-05T09:54:59.590000Z | 100.0 Hz, 18060 samples
...
(61 other traces)
...
OZ.FRTM.00.HHZ | 2025-12-05T09:52:00.000000Z - 2025-12-05T09:55:00.000000Z | 100.0 Hz, 18001 samples

[Use "print(Stream.__str__(extended=True))" to print all Traces]


In [64]:
for tr in st:
    print(tr.stats['sampling_rate'])

100.0
40.0
100.0
40.0
100.0
100.0
100.0
100.0
40.0
100.0
100.0
100.0
100.0
100.0
100.0
100.0
100.0
100.0
100.0
100.0
40.0
100.0
40.0
100.0
200.0
100.0
40.0
40.0
100.0
100.0
100.0
100.0
100.0
100.0
40.0
100.0
100.0
100.0
40.0
100.0
100.0
40.0
100.0
100.0
40.0
100.0
100.0
40.0
100.0
100.0
100.0
100.0
100.0
100.0
100.0
40.0
40.0
100.0
100.0
100.0
40.0
200.0
100.0


## Metadata

In [67]:
from dataclasses import dataclass
import struct
from typing import List

@dataclass
class SudsComment:
    refer: int
    item: int
    length: int
    text: str
    struct_offset: int

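def _decode_suds_comment(struct_buf: bytes, data_buf: bytes, offset: int) -> SudsComment:
    # COMMENT struct body: refer, item, length, unused (four little-endian int16).
    # unpack_from reads the first 8 bytes and tolerates a longer struct body.
    if len(struct_buf) < 8:
        raise ValueError(f"COMMENT struct too short: {len(struct_buf)} bytes")
    refer, item, length, _unused = struct.unpack_from("<hhhh", struct_buf)
    text = data_buf.decode("utf-8", errors="replace")
    return SudsComment(refer, item, length, text, offset)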

def parse_suds_comments(path: str) -> List[SudsComment]:
    COMMENT_ID = 20
    comments = []

    with open(path, "rb") as f:
        offset = 0
        while True:
            tag_raw = f.read(12)
            if len(tag_raw) < 12:
                break

            sync, machine, id_struct, len_struct, len_data = struct.unpack("<ccHII", tag_raw)
            if sync != b"S" or machine != b"6":
                break

            struct_buf = f.read(len_struct)
            data_buf = f.read(len_data) if len_data else b""

            if id_struct == COMMENT_ID:
                comments.append(
                    _decode_suds_comment(struct_buf, data_buf, offset)
                )

            offset += 12 + len_struct + len_data

    return comments


def read_float32_at_offset(path, offset):
    """
    Read a little-endian float32 from absolute byte offset in a file.
    Returns the float value or None if out of range.
    """
    with open(path, "rb") as f:
        f.seek(0, 2)
        size = f.tell()
        if offset + 4 > size:
            return None
        f.seek(offset)
        data = f.read(4)
    return struct.unpack("<f", data)[0]



def suds_file_metadata(path, station_to_loc=None):
    """
    Clean consolidated metadata structure.
    Returns:
      {
        "recorder": { ... },
        "station": { ... }
      }
    """
    # ---- 1. Recorder-level values ----
    recorder_sens = read_float32_at_offset(path, 156)
    sensor_sens   = read_float32_at_offset(path, 176)
    comments      = parse_suds_comments(path)
    comment       = comments[0] if comments else None

    # ---- 2. Station + channel metadata ----
    waves = extract_waveforms_with_metadata(path, station_to_loc=station_to_loc)

    if not waves:
        raise RuntimeError("No STATIONCOMP/trace metadata found")

    # Assumes a single-station file: all channels share the first trace's coordinates
    w0 = waves[0]

    station_meta = {
        "network": w0["network"],
        "station": w0["station"],
        "location": w0["location"],
        "latitude_deg": w0["latitude"],
        "longitude_deg": w0["longitude"],
        "elevation_m": w0["elevation"],
        "channels": {}
    }

    # Move atod_gain to recorder-level
    atod_gain = w0["atod_gain"]

    for w in waves:
        station_meta["channels"][w["channel"]] = {
            "channel_number": w["channel_number"],
            "component": w["component_char"],
            "sensor_type": w["sensor_type"],
            "data_type": w["data_type"],
            "data_units": w["data_units"],
            "polarity": w["polarity"],
        }

    return {
        "recorder": {
            "recorder_sensitivity_counts_per_volt": recorder_sens,
            "sensor_sensitivity_volts_per_unit": sensor_sens,
            "atod_gain": atod_gain,
            "comment": comment,
        },
        "station": station_meta,
    }

In [68]:
suds_file_metadata("data/20251208_0500_HML1.seismosphere.sud")

{'recorder': {'recorder_sensitivity_counts_per_volt': 838860.8125,
  'sensor_sensitivity_volts_per_unit': 1010.0,
  'atod_gain': 1,
  'comment': SudsComment(refer=-32767, item=-32767, length=446, text='DataLogger=Echo Pro\nBattery Voltage=10.76\nSupply Current=0.41\nCharger Current=-1.0\nTotal Bytes=4182016\nPercent Free=95.0\nTemperature=35.0\nSync Time=20251208 0500 18.0\nSync=0.0\nTime OK\nUTC=2025-12-08 0501 18\nLOC=2025-12-08 0501 18 (UTC+0)\nSensorA=Guralp CMG-6T-1 seismometer\nSensorASerial=66532\nSensorB=Guralp 2g CMG-5T accelerometer\nSensorBSerial=5221\n\nSTA time=0.0\nLTA time=0.0\nFilter 0.0 0.0\nNormalizingFactor=1.00000e+00\nSensorSensitivity=0.0\n', struct_offset=12354)},
 'station': {'network': 'AB',
  'station': 'HML1',
  'location': '00',
  'latitude_deg': -34.403350830078125,
  'longitude_deg': 138.5888671875,
  'elevation_m': 73.0,
  'channels': {'c01': {'channel_number': 1,
    'component': 'e',
    'sensor_type': 'v',
    'data_type': 'i',
    'data_units': 'd',
 

In [157]:
suds_file_metadata("data/2025-12-05 0952 Mansfield Vic.dmx")

{'recorder': {'recorder_sensitivity_counts_per_volt': 2465210112.0,
  'sensor_sensitivity_volts_per_unit': 1.0,
  'atod_gain': 1,
  'comment': SudsComment(refer=-32767, item=-32767, length=212, text='Battery Voltage=-1.0\nSupply Current=-1.0\nCharger Current=-1.0\nTotal Bytes=-1\nPercent Free=-1.0\nTemperature=-999.0\nSync=0.0\nSensorA=CMG-6TD,Guralp, CMG-6T, 30 s - 100 Hz, 2400,Guralp\nNormalizingFactor=1.00000e+00\n', struct_offset=72594)},
 'station': {'network': 'S1',
  'station': 'AUSMG',
  'location': '00',
  'latitude_deg': -36.416099548339844,
  'longitude_deg': 148.6083984375,
  'elevation_m': 951.0,
  'channels': {'HHZ': {'channel_number': 29,
    'component': 'H',
    'sensor_type': 'v',
    'data_type': 'f',
    'data_units': 'd',
    'polarity': 'n'},
   'BHE': {'channel_number': 33,
    'component': 'B',
    'sensor_type': 'v',
    'data_type': 'f',
    'data_units': 'd',
    'polarity': 'n'},
   'HHE': {'channel_number': 51,
    'component': 'E',
    'sensor_type': 'v',


## Metadata for multi-station SUDS

Option A: extend `suds_file_metadata()` to return a dict of stations

Requires:
- scanning all STATIONCOMP blocks
- mapping DESCRIPTRACE traces to stations even if no STATIONCOMP exists
- gracefully filling missing metadata
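
A rough sketch of what Option A's grouping step could look like is given below. It takes per-trace dicts shaped like the output of `extract_waveforms_with_metadata()` and groups them by `(network, station)`, leaving coordinates as `None` when a trace does not provide them. The function name `group_by_station` is hypothetical, and the second input record deliberately omits coordinates to illustrate the missing-metadata case.

```python
def group_by_station(waves):
    """Group per-trace dicts by (network, station), tolerating missing keys."""
    stations = {}
    for w in waves:
        key = (w["network"], w["station"])
        entry = stations.setdefault(key, {
            "latitude": None,
            "longitude": None,
            "elevation": None,
            "channels": {},
        })
        # Fill coordinates from the first trace that provides them
        for name in ("latitude", "longitude", "elevation"):
            if entry[name] is None and w.get(name) is not None:
                entry[name] = w[name]
        entry["channels"][w["channel"]] = {"samprate": w.get("samprate")}
    return stations


waves = [
    {"network": "S1", "station": "AUSMG", "channel": "HHZ", "samprate": 100.0,
     "latitude": -36.416099548339844, "longitude": 148.6083984375, "elevation": 951.0},
    {"network": "OZ", "station": "FRTM", "channel": "HHZ", "samprate": 100.0},
]
stations = group_by_station(waves)
```

This does not solve the harder part of Option A, which is deciding which station a DESCRIPTRACE belongs to when no STATIONCOMP precedes it.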

Option B: leave metadata as-is and focus on phases + waveforms first

Because:
- For event files, STATIONCOMP may not exist for every station anyway.
- Phase picks (FEATURE structs) need decoding next.


# Phase arrivals

Current status of the search for phase picks:

1. Phase picks do exist in the SUDS file (the waveform viewer displays them), but we have not yet located a struct in which the arrival time and phase code decode cleanly.
2. We inspected struct ID 31 first, because that is where the MLWM entries were. It clearly contains station identifiers but no meaningful timing fields, so it is not the arrival struct.
3. We then inspected struct ID 10 (FEATURE), which should contain arrival times and phase codes according to the SUDS specification. The bytes in this file do not decode cleanly into MS_TIME or the other FEATURE fields, so either the byte order or encoding differs, or the file uses a non-standard variant.
4. Struct ID 32 contains lists of stations participating in the event (like an association table), not picks. It is helpful context, but it does not hold the pick times.
5. The most likely explanation is that the file writes arrivals in a custom FEATURE-like struct with different offsets or packing. The next step is to reverse-engineer the true field layout directly from known arrival times (e.g., the MLWM i-P pick at 2025-12-05 09:52:09.40) by matching them in the raw bytes.
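
One way to start matching a known arrival time against the raw bytes is a brute-force scan that decodes a float64 at every byte offset and keeps those close to the target. The sketch below assumes the pick is stored as a little-endian float64 of epoch seconds, like `begintime` in DESCRIPTRACE; that assumption may well be wrong, and other encodings (integer seconds, milliseconds, big-endian) would each need their own scan. The function name and the test buffer are hypothetical.

```python
import struct
from datetime import datetime, timezone


def find_float64_times(buf, target_epoch, tolerance=0.5):
    """Return (offset, value) for every little-endian float64 near target_epoch."""
    hits = []
    for offset in range(len(buf) - 7):
        (value,) = struct.unpack_from("<d", buf, offset)
        # NaN compares False here, so garbage bytes are skipped automatically
        if abs(value - target_epoch) <= tolerance:
            hits.append((offset, value))
    return hits


# Known MLWM i-P pick time from the viewer, as epoch seconds (UTC)
target = datetime(2025, 12, 5, 9, 52, 9, 400000, tzinfo=timezone.utc).timestamp()

# Synthetic buffer with the time planted at offset 13, to show the scan working
buf = b"\xff" * 13 + struct.pack("<d", target) + b"\x00" * 5
hits = find_float64_times(buf, target)
```

On a real file, the offsets of any hits relative to the enclosing struct would then suggest where the time field sits in the layout.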