# Homework: File conversion & Trace header fixing

This notebook prompts you to read a single **SEISAN** waveform file:

`2001-02-02-0303-55S.MVO___019`

and convert it into processed MiniSEED and SAC outputs, using ObsPy. Along the way you will practice:

* parsing information from filenames
* reading SEISAN waveforms
* editing Trace metadata
* tapering & filtering
* unit scaling using a sensitivity (or calibration) value
* changing the sampling rate
* writing MiniSEED and SAC

You may NOT use AI to help with this.

**Important: Make sure you read all the way to the bottom of these instructions before you start this exercise!**

## 0) Setup

Import `read` and `UTCDateTime` from `obspy`.

Set a string variable called `filename` that contains the path to the SEISAN file, e.g.

```
filename = "/Users/thompsong/Developer/CompSciS26/week5/2001-02-02-0303-55S.MVO___019"
```


## 1) Parse the filename
Given an input filename of the form `YYYY-MM-DD-hhmm-ssS.MVO___NUM` (e.g. `2003-06-05-1845-15S.PVO___022`), extract:

1.	Start time from the prefix formatted as YYYY-MM-DD-hhmm-ss and create a UTCDateTime from it.
<br/>Example: parse 2003-06-05-1845-15 into a UTCDateTime.

2.	The format code: the single letter immediately after the time. Here it is `S`, meaning SEISAN.

3.	The number of channels: the last 3 digits (e.g. 022) → integer number of channels.

Remember you can index and subset strings. 

Print a one-line summary like:

```Parsed starttime=2003-06-05T18:45:15Z, format=S, nchans=22```
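
A minimal sketch of the parsing, using only the standard library so that it runs without the data file. It assumes the time prefix `YYYY-MM-DD-hhmm-ss` is always 18 characters long and that `name` holds the bare filename (for a full path, `os.path.basename` would strip the folders first). The variable names are illustrative; passing `start` to `UTCDateTime(...)` would turn it into an ObsPy time.

```python
from datetime import datetime

name = "2003-06-05-1845-15S.PVO___022"

# The time prefix YYYY-MM-DD-hhmm-ss occupies the first 18 characters.
start = datetime.strptime(name[:18], "%Y-%m-%d-%H%M-%S")

# The character right after the time prefix is the format code.
fmt_code = name[18]

# The last three characters are the zero-padded channel count.
nchans = int(name[-3:])

summary = (
    f"Parsed starttime={start:%Y-%m-%dT%H:%M:%S}Z, "
    f"format={fmt_code}, nchans={nchans}"
)
print(summary)
```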


## 2) Read the SEISAN file

Read the file into an ObsPy Stream.
* Verify that the number of traces in the stream matches the parsed channel count.
* If it does not match, print a warning and continue.

## 3) Fix the SEED codes

1) These all have a blank network code. Set network code to 'MV'.

2) The location code is bogus. Change it from 'J' to '00'.

3) The channel code is not SEED compliant. Make the following changes:
    * "PRS" -> "SDO" (barometer)
    * "SB" -> "BH"
    * "S " -> "SH"
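
The channel renaming can be sketched as a small helper. This is one possible approach, not the required one: `fix_channel` is a hypothetical name, and it assumes the legacy codes are three characters long, with the third character (the orientation) kept unchanged. The sample codes in the loop are made up for illustration.

```python
def fix_channel(code):
    """Return a SEED-style channel code for a legacy code (sketch)."""
    if code == "PRS":
        return "SDO"  # barometer
    # Split into the two-character prefix and whatever follows it.
    prefix, rest = code[:2], code[2:]
    mapping = {"SB": "BH", "S ": "SH"}
    # Prefixes that are not in the mapping are left as they are.
    return mapping.get(prefix, prefix) + rest


for old in ["PRS", "SBZ", "S E", "BHZ"]:
    print(repr(old), "->", repr(fix_channel(old)))
```

Inside a loop over traces, the result would be assigned back to `tr.stats.channel`, alongside setting `tr.stats.network` and `tr.stats.location`.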

## 4) Preprocess each trace

Apply these steps in this order:

1.	Taper: 5% taper

2.	High-pass filter: 0.5 Hz, two-way (zero-phase)

3.	Scale to physical units using an overall sensitivity of 8,000,000 Counts per (m/s). The trace data are currently in Counts. Convert them to m/s.
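
The unit conversion in the last step is a division, not a multiplication: Counts divided by Counts per (m/s) leaves m/s. A small sketch with plain numpy follows; the sample values are made up, and with a real Trace the same arithmetic would be applied to `tr.data` after tapering and filtering.

```python
import numpy as np

SENSITIVITY = 8_000_000.0  # Counts per (m/s), as given in the step above

# Made-up sample values in Counts, for illustration only.
counts = np.array([-16000.0, 0.0, 8000.0, 4_000_000.0])

# Counts / (Counts per (m/s)) = m/s
velocity = counts / SENSITIVITY
print(velocity)
```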


## 5) Round the sampling rate
Change each trace's sampling rate to the nearest integer (in samples per second).

## 6) Force the start time

Set the start time for every trace to:<br/>
```UTCDateTime('2001-02-02T03:00:00')```


## 7A) Write MiniSEED

1. The output filename should match the input naming style, but with the S replaced by M. Remember that the start time has changed, so the time in the filename must change too.

2. Write stream as MiniSEED.

3. Re-read the MiniSEED file and plot it.
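
One way to build the output filename, sketched with the standard library. It assumes the input name has the fixed layout used in this exercise (18-character time prefix, one format letter, then the rest) and uses illustrative variable names. `UTCDateTime` has a `strftime` method that accepts the same format codes, so the same pattern should carry over to an ObsPy time.

```python
from datetime import datetime

in_name = "2001-02-02-0303-55S.MVO___019"
new_start = datetime(2001, 2, 2, 3, 0, 0)  # the forced start time

# Everything after the format letter (".MVO___019") is reused unchanged.
suffix = in_name[19:]

# New time prefix + "M" for MiniSEED + original suffix.
out_name = new_start.strftime("%Y-%m-%d-%H%M-%S") + "M" + suffix
print(out_name)
```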



## 7B) Write SAC

1. Write SAC files (one per trace).

2. Put them in a folder like SAC/.

3. Filenames should include NET.STA.LOC.CHA and optionally start time.

4. Re-read the SAC files and plot them.
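
A sketch of how the per-trace SAC paths could be built. `sac_path` is a hypothetical helper and the station code in the example is a placeholder. For a real Trace, `tr.id` already gives the `NET.STA.LOC.CHA` string, and the folder would need to exist first (e.g. via `os.makedirs("SAC", exist_ok=True)`).

```python
import os


def sac_path(net, sta, loc, cha, folder="SAC"):
    """Return folder/NET.STA.LOC.CHA.sac for one trace (sketch)."""
    return os.path.join(folder, f"{net}.{sta}.{loc}.{cha}.sac")


# Placeholder codes, for illustration only.
path = sac_path("MV", "MBWH", "00", "SHZ")
print(path)
```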


# Deliverables

Submit:

1.	Your Python script or notebook (hw_seisan_convert.py or .ipynb)

2.	Screenshots of the stream plots of the MiniSEED and SAC files you read back in.


---

## Hints:

Ignore what I wrote above about AI. You CAN use AI for this exercise. But make sure you understand the code it writes! You might be asked to explain it in class!

Remember that strings can be indexed in just the same way as lists. If you want the last 4 characters, you can do:

```
my_str = "my_filename.123"
last_4 = my_str[-4:]
```

Or you can split on `"_"` using:
```
my_list = my_str.split("_")
```

You can round a floating-point number to the nearest integer with the `round()` function:

```
a = 75.19
b = round(a)
print(b) # b=75
```

You can build a UTCDateTime object from a string by specifying the string format, e.g.:

```
import obspy
filename = "2022-05-08-1715-05M.PVO___006"
# strptime fails on trailing characters, so pass only the 18-character time prefix
filetime = obspy.UTCDateTime.strptime(filename[:18], "%Y-%m-%d-%H%M-%S")
```

Read a waveform file like this:

```
import obspy
st = obspy.read(filename) # read outputs an ObsPy Stream object
```

Remember you can loop over Trace objects in a Stream object like this:

```
for tr in st:
    print(tr.id)  # replace this line with whatever you want to do to tr
```

You can taper and filter traces with:

```
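# Both methods need arguments; the values here are only examples.
tr.taper(max_percentage=0.1)  # fraction of the trace tapered at each end
tr.filter("bandpass", freqmin=1.0, freqmax=10.0, zerophase=True)  # zerophase=True filters two-way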
```

A Trace object has a stats attribute, and a data attribute.

```
tr.data   # a numpy array: the actual sequence of sample values in the time series
tr.stats  # a dictionary-like object with attributes such as:
    # sampling_rate
    # network
    # station
    # location
    # channel
```

To scale a numpy array called `a` by a factor of 5 you can do:

$$f = 5$$
$$a = a * f$$

Remember that `tr.data` is a numpy array.
But also see what happens if you just try to multiply `tr` directly:

```tr = tr * 5```

You can write a Stream object to a file using:

`st.write('/path/to/filename.ext', format='<format>')`

where `<format>` is `'mseed'`, `'sac'`, or some other format ObsPy can write. Note that ObsPy can read SEISAN files but cannot write them.
You can also write a Trace object to a file in exactly the same way.




