diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 000000000..b17d229ba --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,80 @@ +# AGENTS.md — Madagascar + +> Agent orientation for the Madagascar geophysical data analysis package. +> Authored for Claude Code, Copilot CLI, Codex, Gemini CLI. Also readable as plain project context by Cursor, Aider, Zed, and other agent harnesses. + +## What Madagascar is + +Madagascar is an open-source package for reproducible geophysical data analysis. Research and processing are expressed as SConstruct files that chain small composable `sf*` programs; each program reads and writes files in the RSF format (regularly sampled data, text header + binary). + +## Mental model + +1. **RSF files** are the data plane: one text header (`.rsf`) pointing at a binary (usually in `$DATAPATH`). +2. **`sf*` programs** are the compute plane: ~1,000 small Unix-style filters that read RSF on stdin and write RSF on stdout. +3. **SConstruct flows** are the orchestration plane: Python (`rsf.proj`) DSL using `Flow`, `Plot`, `Result`, `Fetch`, `Command`. +4. **vplot (`.vpl`)** is the visualization plane: `sfgrey`, `sfgraph`, `sfwiggle`, etc. produce `.vpl` files that `sfpen` displays. + +## Repo map + +- `api/` — language bindings. C is the reference; 9 other languages wrap it (C++, Python, F77, F90, Java, Julia, Matlab, Octave, Chapel). +- `framework/rsf/` — the Python DSL for flows: `proj.py` (Flow/Plot/Result/Fetch), `flow.py`, `tex.py`. +- `user//M.[c|py|cc|f|f90|...]` — user-contributed main programs. +- `plot/main/` — vplot rendering programs (`grey.c`, `graph.c`, `wiggle.c`, `contour.c`, `dots.c`, `bargraph.c`, …). +- `book/` — 1,751 example SConstructs: teaching material, tutorials, published research reproductions. +- `system/` — filesystem-level helpers (`sfrm`, `sfcp`, …). + +## RSF data model (briefly) + +- Text header (`file.rsf`) contains `n1=..., d1=..., o1=..., label1=..., unit1=...` through axis 9. +- `n1` is the fastest-varying axis. +- Binary data is in a separate file pointed to by `in=` in the header, typically under `$DATAPATH` (set via `env.sh`). +- `<` redirects stdin from an RSF file; `>` writes to a new RSF (with its binary placed in `$DATAPATH`). + +## Running things + +- `scons` in any `book/` directory builds that directory's flow. +- `sfdoc ` prints program help. Also `` with no args. +- `sfin file.rsf` inspects header contents. +- `sfattr < file.rsf` prints data statistics. +- `scons view` runs the flow and opens all `Result` plots. + +## Discovering programs + +- `sfdoc ` — full docs for a single program. +- `sfdoc -k ` — search docs. +- `user//M.*` — read the source for how a program is built. + +## Adding a program + +Create a `M.` file in `user//`. The `sf` binary is produced automatically by the local `SConstruct`. See the `authoring-sf-programs` skill family for details per language. + +## Skills + +See `skills/` in this repo. Each path below resolves to a `SKILL.md` that the Claude Code (and compatible) Skill tool can load on demand. + +| Skill | When to use | +| --- | --- | +| `skills/writing-rsf-flows/` | Writing or modifying an SConstruct that orchestrates RSF processing. | +| `skills/using-sf-programs/` | Composing a pipeline from existing `sf*` programs. | +| `skills/working-with-rsf-data/` | Inspecting, modifying, or debugging RSF headers and binaries. | +| `skills/plotting-with-vplot/` | Producing or debugging a vplot visualization. | +| `skills/authoring-sf-programs/` | Shared conventions for authoring a new `sf*` program (language-agnostic). | +| `skills/authoring-sf-programs-c/` | Authoring in C (reference implementation). | +| `skills/authoring-sf-programs-python/` | Authoring in Python. | +| `skills/authoring-sf-programs-cpp/` | Authoring in C++. | +| `skills/authoring-sf-programs-f90/` | Authoring in Fortran 90. | +| `skills/authoring-sf-programs-f77/` | Authoring in Fortran 77. | +| `skills/authoring-sf-programs-java/` | Authoring in Java. | +| `skills/authoring-sf-programs-julia/` | Authoring in Julia. | +| `skills/authoring-sf-programs-matlab/` | Authoring in Matlab. | +| `skills/authoring-sf-programs-octave/` | Authoring in GNU Octave. | +| `skills/authoring-sf-programs-chapel/` | Authoring in Chapel. | + +## Gotchas + +- `rm *.rsf` removes headers but leaves binaries behind in `$DATAPATH`. Use `sfrm` instead. +- Axis 1 is fastest-varying; `sftransp` swaps axes. +- Plot labels use escape codes: `\_` subscript, `\^` superscript, `\s` size. Forgetting these in SConstruct strings leaves literal backslashes in output. +- The `sf` prefix is implicit inside a `Flow()` string — write `bandpass`, not `sfbandpass`. +- `$DATAPATH` must be writable and end in `/`. +- Complex data types (`sf_complex`) require `data_format=native_complex` (or `ascii_complex` / `xdr_complex`) in the header — the shorthand `complex` alone is not a valid value. diff --git a/skills/.gitkeep b/skills/.gitkeep new file mode 100644 index 000000000..e69de29bb diff --git a/skills/authoring-sf-programs-c/SKILL.md b/skills/authoring-sf-programs-c/SKILL.md new file mode 100644 index 000000000..fc616bccd --- /dev/null +++ b/skills/authoring-sf-programs-c/SKILL.md @@ -0,0 +1,486 @@ +--- +name: authoring-sf-programs-c +description: Use when authoring a new Madagascar sf* main program in C (the reference implementation — all other language APIs wrap this). +--- + +## When to use + +Load this skill whenever you are writing a new `sf` main program in C. C is the **reference implementation** for Madagascar: the Fortran 77, Fortran 90, C++, Python, Julia, Java, Matlab, Octave, and Chapel APIs all wrap or mirror the C API. Every claim in this skill is grounded in `api/c/rsf.h` (the amalgamated public header at `build/api/c/rsf.h`) and the test programs under `api/c/`. (CUDA programs exist in user directories but Madagascar has no dedicated `api/cuda/` binding; they use the C API for RSF I/O and CUDA for compute.) + +This skill is C-specific. For language-agnostic conventions (file naming, self-doc format, parameter style, build integration) see the companion: + +- `../authoring-sf-programs/SKILL.md` — shared conventions (load this too) + +Typical placement: `user//M.c`. The build system finds every `M*.c` in a user directory and compiles it into `sf`. + +--- + +## Skeleton + +This is the minimal correct skeleton for a C main program. Copy it verbatim and extend from here. + +```c +#include + +int main(int argc, char* argv[]) +{ + sf_file in, out; + int n1, n2; + float d1, o1; + float *trace; + + sf_init(argc, argv); + in = sf_input("in"); + out = sf_output("out"); + + if (!sf_histint (in, "n1", &n1)) sf_error("No n1= in input"); + if (!sf_histint (in, "n2", &n2)) n2 = 1; + if (!sf_histfloat(in, "d1", &d1)) d1 = 1.0f; + if (!sf_histfloat(in, "o1", &o1)) o1 = 0.0f; + + trace = sf_floatalloc(n1); + + for (int i2 = 0; i2 < n2; i2++) { + sf_floatread(trace, n1, in); + /* process trace here */ + sf_floatwrite(trace, n1, out); + } + + free(trace); + exit(0); +} +``` + +Key invariants: +- `sf_init` must be the first RSF call. +- Open all `sf_input` files before `sf_output` — the first output inherits dimensions from the first input by default. +- Always call `exit(0)` (not `return 0`) at the end; this is the Madagascar convention. + +--- + +## Self-doc header + +Every `M.c` must begin with a comment block that the build system extracts to produce `sfdoc sf` output. The format (sourced from `user/fomels/Mpick.c` in this tree): + +```c +/* Automatic picking from semblance-like panels. + +Takes: rect1=1 rect2=1 ... + +rectN defines the size of the smoothing stencil in N-th dimension. + +Theory in Appendix B of: +S. Fomel, 2009, +Velocity analysis using AB semblance: Geophysical Prospecting, v. 57, 311-321. +Reproducible version in RSFSRC/book/tccs/avo +http://ahay.org/RSF/book/tccs/avo/paper_html/ + +August 2012 program of the month: +http://ahay.org/blog/2012/08/01/program-of-the-month-sfpick/ +*/ + +/* + Copyright (C) 2004 University of Texas at Austin + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, write to the Free Software + Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +*/ +``` + +Rules extracted from real programs in this tree: + +1. **First comment** — brief one-line description on the opening line, optional extended prose below. This entire block becomes the program's documentation. +2. **Parameter inline comments** — each `sf_getint`/`sf_getfloat`/`sf_getbool` call is followed immediately by a `/* description */` comment on the next line. The doc extractor captures these and lists them as named parameters in `sfdoc`. +3. **Second comment** — the GPL copyright block. Keep it verbatim. +4. `#include ` comes after both comment blocks, before any other code. + +Example parameter documentation from `user/fomels/Mpick.c`: + +```c +if (!sf_getint("niter",&niter)) niter=100; +/* number of iterations */ +if (!sf_getfloat("an",&an)) an=1.; +/* axes anisotropy */ +if (!sf_getint("gate",&gate)) gate=3; +/* picking gate */ +if (!sf_getbool("smooth",&smooth)) smooth=true; +/* if apply smoothing */ +``` + +Each `/* ... */` directly following a `sf_get*` call is parsed as the parameter description; it appears under "parameters:" in `sfdoc sf`. + +--- + +## I/O API + +Signatures from `build/api/c/rsf.h`. + +### Initialization + +```c +void sf_init(int argc, char *argv[]); +``` +Must be the first RSF call. Parses the command line into an internal parameter table; opens stdin/stdout for piped RSF data. + +### Opening files + +```c +sf_file sf_input (const char* tag); +sf_file sf_output (const char* tag); +``` +`tag` is a logical name used on the command line: `sfmyprog in=data.rsf out=result.rsf`. Use `"in"` and `"out"` for the primary pipes (stdin / stdout). Additional named files (`"den"`, `"mask"`, etc.) are opened the same way. Output files inherit dimension headers from the first input by default — override with explicit `sf_put*` calls before the first write. + +```c +sf_datatype sf_gettype(sf_file file); +``` +Returns the data type (`SF_FLOAT`, `SF_COMPLEX`, `SF_INT`, etc.). Use to validate input type before reading. + +```c +void sf_fileclose(sf_file file); +``` +Closes an `sf_file` handle and flushes metadata. For secondary outputs (not stdin/stdout) it is good practice to call this explicitly; for primary stdin/stdout it is called implicitly on `exit`. + +### Reading and writing float data + +```c +void sf_floatread (float* arr, size_t size, sf_file file); +void sf_floatwrite (float* arr, size_t size, sf_file file); +``` +Read/write `size` consecutive floats. `arr` must be pre-allocated. These are the most common I/O calls. + +### Reading and writing complex data + +```c +void sf_complexread (sf_complex* arr, size_t size, sf_file file); +void sf_complexwrite (sf_complex* arr, size_t size, sf_file file); +``` +`sf_complex` is `float _Complex` (or the platform equivalent). Use when the input type is `SF_COMPLEX`. + +--- + +## Header metadata API + +RSF files carry key=value pairs in a text header. There are two directions: + +- **Read from input header** — `sf_hist*` family. Returns `true` if the key exists. +- **Write to output header** — `sf_put*` family. Must be called before the first data write. + +Signatures from `build/api/c/rsf.h`: + +### Reading header values + +```c +bool sf_histint (sf_file file, const char* key, int* par); +bool sf_histfloat (sf_file file, const char* key, float* par); +char* sf_histstring(sf_file file, const char* key); +``` + +`sf_histstring` returns a heap-allocated string (or `NULL`); test with `if (NULL != (...))`. Example: + +```c +char *label; +if (NULL != (label = sf_histstring(scn, "label2"))) + sf_putstring(pik, "label", label); +``` + +### Writing header values + +```c +void sf_putint (sf_file file, const char* key, int par); +void sf_putfloat (sf_file file, const char* key, float par); +void sf_putstring (sf_file file, const char* key, const char* par); +``` + +### Axis objects + +Madagascar uses `sf_axis` structs to bundle `n`, `d`, `o`, `label`, and `unit` for each dimension. These simplify copying or transforming axes between files. + +```c +sf_axis sf_iaxa(sf_file FF, int i); /* read axis i (1-based) */ +void sf_oaxa(sf_file FF, const sf_axis AA, int i); /* write axis i */ +``` + +Example — copy axis 1 from input to output unchanged, then set a new axis 2: + +```c +sf_axis ax1, ax2; +ax1 = sf_iaxa(in, 1); +sf_oaxa(out, ax1, 1); +sf_putint(out, "n2", new_n2); +sf_putfloat(out, "d2", new_d2); +``` + +`sf_filedims` is a convenience wrapper that reads all `n` values at once: + +```c +int sf_filedims(sf_file file, int *n); /* returns number of dimensions */ +``` + +--- + +## Parameter API + +Command-line parameters are set by the caller as `key=value` pairs. Signatures from `build/api/c/rsf.h`: + +### Scalar getters + +```c +bool sf_getint (const char* key, int* par); +bool sf_getfloat (const char* key, float* par); +bool sf_getbool (const char* key, bool* par); +char* sf_getstring (const char* key); /* returns NULL if absent */ +bool sf_getlargeint (const char* key, off_t* par); +``` + +All scalar getters return `true` if the key was found (false otherwise). `sf_getstring` returns a pointer or `NULL`. + +### Array getters + +```c +bool sf_getints (const char* key, int* par, size_t n); +bool sf_getfloats (const char* key, float* par, size_t n); +bool sf_getbools (const char* key, bool* par, size_t n); +``` + +Array getters read up to `n` comma-separated values (e.g. `rect=3,5,1`). + +### Default-value pattern + +The canonical Madagascar idiom — always provide a fallback default when the parameter is optional: + +```c +if (!sf_getint("n", &n)) n = 100; +/* number of samples */ +if (!sf_getfloat("eps", &eps)) eps = 0.01f; +/* regularization parameter */ +if (!sf_getbool("verb", &verb)) verb = true; +/* verbosity flag */ +``` + +The `/* ... */` on the line after each `sf_get*` call is parsed by the doc extractor and shown in `sfdoc sf`. + +For required parameters that have no sane default, call `sf_error` when absent: + +```c +if (!sf_histint(in, "n1", &n1)) sf_error("No n1= in input"); +``` + +--- + +## Memory API + +Signatures from `build/api/c/rsf.h`. All allocators abort with a meaningful error message on allocation failure — never return `NULL`. + +### 1-D allocators + +```c +float* sf_floatalloc (size_t n); +sf_complex* sf_complexalloc(size_t n); +int* sf_intalloc (size_t n); +``` + +### 2-D and 3-D allocators + +```c +float** sf_floatalloc2 (size_t n1, size_t n2); +float*** sf_floatalloc3 (size_t n1, size_t n2, size_t n3); +sf_complex** sf_complexalloc2(size_t n1, size_t n2); +int** sf_intalloc2 (size_t n1, size_t n2); +int*** sf_intalloc3 (size_t n1, size_t n2, size_t n3); +``` + +The multi-dimensional allocators return a pointer-to-pointer (standard C array-of-arrays), **but the underlying data is one contiguous block** — `out[0]` (for 2-D) or `out[0][0]` (for 3-D) points to it. This makes it safe to pass `arr[0]` to `sf_floatread`/`sf_floatwrite` as a flat buffer: + +```c +float **scan = sf_floatalloc2(n1, n2); +sf_floatread(scan[0], n1 * n2, in); /* reads entire 2-D block at once */ +``` + +### Deallocation + +Madagascar does **not** provide a custom deallocator. Use standard `free()` directly: + +```c +free(trace); /* 1-D */ +free(scan[0]); /* 2-D: free the contiguous data block */ +free(scan); /* then free the pointer array */ +``` + +Most production Madagascar programs skip explicit `free` calls before `exit(0)` because the OS reclaims all memory on process exit. This is acceptable; add `free` calls when running under Valgrind or in regression test programs that check for leaks. + +--- + +## Error API + +Signatures from `build/api/c/rsf.h`: + +```c +void sf_error (const char *format, ...); +void sf_warning(const char *format, ...); +``` + +`sf_error` prints the formatted message to stderr, then calls `exit(1)`. Use it for unrecoverable conditions (bad input, missing required header key, wrong data type): + +```c +if (SF_FLOAT != sf_gettype(in)) sf_error("Need float input"); +if (!sf_histint(in, "n1", &n1)) sf_error("No n1= in input"); +sf_error("Size mismatch [n%d]: %d != %d", j+1, m[j], n[j]); +``` + +`sf_warning` prints to stderr but **does not exit** — execution continues. Use it for progress messages and non-fatal conditions: + +```c +sf_warning("cmp %d of %d;", i3+1, n3); /* semicolon suppresses newline */ +sf_warning("."); /* trailing dot flushes the line */ +``` + +Both functions accept `printf`-style format strings and variadic arguments. + +--- + +## Extra libraries + +Most programs need only `#include ` and link against `librsf`. When a program needs helpers from `api/c/` (FFT, solvers, splines, etc.) or a user-directory helper module, the `SConstruct` in the user directory must declare the dependency. + +### How `user/fomels/SConstruct` links extra libraries + +The key lines from `user/fomels/SConstruct` (the real file in this tree): + +```python +libs = [dynpre+'rsf'] + env.get('LIBS', []) + +mains = Split(progs) +for prog in mains: + sources = ['M' + prog] + bldutil.depends(env, sources, 'M'+prog) + env.StaticObject('M'+prog+'.c') + prog = env.Program(prog, [x + '.o' for x in sources], LIBS=libs) +``` + +`bldutil.depends` scans the source file for `#include "foo.h"` lines and automatically adds `foo.o` to `sources`. This means that if `Mpick.c` includes `"dynprog.h"`, then `dynprog.c` is compiled and linked automatically — no manual addition to `sources` required. + +For optional system libraries (FFTW, LAPACK, JPEG, TIFF), the pattern is: + +```python +fftw = env.get('FFTW') +if fftw: + env.Prepend(CPPDEFINES=['SF_HAS_FFTW']) + +# LAPACK-dependent C++ programs +if lapack: + libsxx = [dynpre+'rsf++', 'vecmatop'] + libsxx.extend(lapack) + libsxx.extend(libs) + prog = env.Program(prog, [x + '.cc' for x in sources], LIBS=libsxx) +``` + +For pure C programs that only need `librsf` helpers (the common case), the default `LIBS=libs` line is sufficient — add `#include "somehelper.h"` and `bldutil.depends` handles the rest. + +--- + +## Worked references + +All four files exist verbatim in this tree. + +### `api/c/Testfile.c` — minimal I/O + +Demonstrates the smallest possible main: `sf_init`, `sf_input`, `sf_output`, `sf_fileclose`, `exit`. No data is read or written. Useful to verify the build environment compiles cleanly. + +### `api/c/Testgetpar.c` — parameter parsing + +Shows `sf_getint`, `sf_getfloat`, `sf_getfloats`, `sf_getbool`, `sf_getbools`, and `sf_getstring` with assertion checks. The `par=Testgetpar.c` trick passes the source file itself as a par-file to supply default values — a technique used in regression tests. + +### `api/c/Testfft.c` — FFT and `kiss_fft` + +Uses `kiss_fft_next_fast_size` from `api/c/kiss_fft.h`, the portable FFT library bundled with Madagascar. Demonstrates including a sub-library header directly without going through `rsf.h`. + +### `user/fomels/Mpick.c` — well-structured real-world program + +Chosen because it demonstrates the full Madagascar C idiom in one file: + +- Self-doc header with extended description and reference citation. +- `sf_init` / `sf_input` / `sf_output` pattern. +- `sf_histfloat` to read optional header metadata with defaults. +- `sf_getfloat`, `sf_getint`, `sf_getbool`, `sf_getstring` with inline doc comments. +- `sf_floatalloc2` (2-D allocation) and `sf_floatread`/`sf_floatwrite`. +- `sf_histstring` / `sf_putstring` to propagate axis labels. +- `sf_warning("cmp %d of %d;", ...)` progress reporting. +- `sf_unshiftdim` to reshape output dimensions. +- `exit(0)` at the end. + +The only non-standard element is `#include "dynprog.h"` (a helper in the same directory), which `bldutil.depends` links automatically. + +--- + +## Building and testing + +### Build + +Run `scons` inside `user//`: + +```bash +cd /Users/jgoai/m8r/src/user/ +scons +``` + +The build system finds every `M*.c` in the directory automatically. No explicit registration in `SConstruct` is needed beyond listing the name in `progs`. + +After `scons install` (or `make install` at the repo root), the binary is at: + +``` +/Users/jgoai/madagascar/bin/sf +``` + +### Verify self-doc + +```bash +sfdoc sf +``` + +This confirms that the self-doc comment block was parsed correctly and that all `sf_get*` parameters appear with their inline descriptions. + +### Verify basic I/O + +```bash +echo "" | sfspike n1=10 | sf > /dev/null +``` + +Or pipe through `sfin` to inspect the output header: + +```bash +sfspike n1=100 n2=5 | sf | sfin +``` + +### Regression test + +For programs with non-trivial logic, write a `Test.c` alongside `M.c`. The pattern from `user/fomels/SConstruct`: + +```python +for prog in Split('myalgorithm'): + sources = ['Test' + prog, prog] + bldutil.depends(env, sources, prog) + sources = [x + '.o' for x in sources] + env.Object('Test' + prog + '.c') + env.Program(sources, PROGPREFIX='', PROGSUFFIX='.x', LIBS=libs) +``` + +The resulting `myalgorithm.x` binary can be run directly and is picked up by `scons` regression suites. See `api/c/Testfile.c` and `api/c/Testgetpar.c` for the testing style. + +### Common pitfalls + +- Forgetting `sf_init` → segfault on first `sf_input` call. +- Calling `sf_output` before the first `sf_input` → output header has no inherited dimensions. +- Reading `n1` from the command line instead of `sf_histint` → ignores the actual file dimensions. +- Using `malloc` instead of `sf_floatalloc` → no automatic out-of-memory error message, harder to debug. +- Omitting the inline `/* description */` after each `sf_get*` call → parameter missing from `sfdoc`. diff --git a/skills/authoring-sf-programs-chapel/SKILL.md b/skills/authoring-sf-programs-chapel/SKILL.md new file mode 100644 index 000000000..4f687be14 --- /dev/null +++ b/skills/authoring-sf-programs-chapel/SKILL.md @@ -0,0 +1,276 @@ +--- +name: authoring-sf-programs-chapel +description: Use when authoring a Madagascar sf* main program in Chapel. +--- + +## When to use + +Load this skill when writing a new `sf` main program in Chapel. Chapel is a +parallel, HPC-oriented language developed by Cray/HPE. It is niche within the +Madagascar ecosystem: the `api/chapel/m8r.chpl` module exists and two test programs +(`api/chapel/test/clip.chpl`, `api/chapel/test/afdm.chpl`) exist, but there are +likely **no** `M*.chpl` user programs anywhere in the tree. Prefer C or Python +unless you specifically need Chapel's `forall` data-parallelism or locale-based +distributed computing. + +Main programs use the extension `.chpl` and the file naming convention `M.chpl` +(e.g., `user/yourname/Msmooth.chpl` installs as `sfsmooth`). + +This skill is Chapel-specific. For language-agnostic conventions (file naming, +self-doc format, parameter style, build integration) also load the companion: + +- [`../authoring-sf-programs/SKILL.md`](../authoring-sf-programs/SKILL.md) — shared conventions + +--- + +## Skeleton + +Minimal correct Chapel program. The entry point is `proc main(args: [] string)` — +Chapel passes command-line arguments as a string array, which is forwarded directly +to `sf_init`. + +```chapel +// One-sentence description of what this program does. +use m8r; + +proc main(args: [] string) +{ + // Initialize Madagascar (must be first RSF call) + sf_init(args); + + // Open I/O files + var fin: sf_file = sf_input("in"); + var fout: sf_file = sf_output("out"); + + // Read header metadata from input + var n1: int(32); + if !sf_histint(fin, "n1", n1) then + sf_error("No n1= in input"); + + var n2 = sf_leftsize(fin, 1); // number of traces + + // Read a command-line parameter (required) + var clip: real(32); + if !sf_getfloat("clip", clip) then + sf_error("Need clip="); + + // Allocate trace buffer + var trace: [0..n1-1] real(32); + + // Main loop + for i2 in 0..n2-1 { + sf_floatread(trace, n1, fin); + // ... process trace ... + sf_floatwrite(trace, n1, fout); + } + + // Close files + sf_fileclose(fin); + sf_fileclose(fout); + + // Finalize Madagascar + sf_close(); +} +``` + +Key invariants: + +- `sf_init(args)` must be the **first** RSF call; pass `args` as-is. +- Open all `sf_input` files before `sf_output` — the first output inherits + dimensions from the first input. +- Call `sf_close()` at the end to flush output headers. +- Use `sf_error("msg")` (not Chapel's builtin `halt`) for fatal errors — + `sf_error` writes to stderr and exits with a non-zero status, which is the + Madagascar pipeline convention. +- Chapel arrays passed to read/write routines must match the declared element + type precisely: `real(32)` for float, `int(32)` for int, etc. + +For parallel inner loops, replace the body loop with a `forall`: + +```chapel +forall (ix, iz) in {0..` main program in C++. Choose C++ over C when: + +- The algorithm is naturally expressed using **templates** (generic numerical kernels, policy-based designs, expression-template math libraries). +- You are integrating with an existing **C++ library ecosystem** (Eigen, Boost, RVL/RVLCOM operator framework, or similar). +- You want **object-oriented encapsulation** of state across multiple passes or operator classes. +- You are porting **existing scientific C++ code** and the translation cost to C is not justified. + +The source file for a C++ `sf` program is named `M.cc` and lives in `user//`. The installed binary is `sf` (the `M` is dropped, `sf` is prepended) — identical convention to C. + +This skill is C++-specific. For language-agnostic conventions — file naming, the self-documentation comment format, parameter conventions, error handling, and testing — see the companion skill: + +- [`../authoring-sf-programs/SKILL.md`](../authoring-sf-programs/SKILL.md) — shared conventions (load this too) + +--- + +## Skeleton + +The structure below follows `api/c++/Testfile.cc` exactly. Copy it verbatim and extend from here. + +```cpp +// One-sentence description of what this program does. + +#include +#include + +int main(int argc, char* argv[]) +{ + sf_init(argc, argv); + + iRSF par(0); // parameter object (0 = command-line only, not a file) + iRSF in; // default: opens "in" (stdin) + oRSF out; // default: opens "out" (stdout) + + // Read axis metadata from the input header + int n1; + float d1, o1; + in.get("n1", n1); + in.get("d1", d1); + in.get("o1", o1); + + // Read a command-line parameter; supply default with three-arg form + int n2; + par.get("n2", n2, 1); + /* n2 — number of output slices */ + + // Check data type if it matters + if (in.type() != SF_FLOAT) + sf_error("Need float input."); + + // Allocate a trace buffer using std::valarray + std::valarray trace(n1); + + // Write axis metadata to the output header before the data loop + out.put("n1", n1); + out.put("d1", d1); + out.put("o1", o1); + out.put("n2", n2); + + // Main I/O loop: read one trace, process, write + for (int i2 = 0; i2 < n2; i2++) { + in >> trace; + // ... process trace ... + out << trace; + } + + exit(0); +} +``` + +Key points: + +- `sf_init(argc, argv)` must be the first call. +- `iRSF par(0)` opens the parameter object in command-line-only mode (not a file). Use `iRSF par` (no argument) only when `par` is also a data file. +- `in.get("n1", n1)` reads axis metadata (header key `n1`). No default — aborts if absent. +- `par.get("key", var, default)` reads a command-line parameter with a default. +- Allocate buffers with `std::valarray` for automatic memory management. +- Write all output header keys with `out.put(...)` **before** the data loop. +- Use `>>` to read and `<<` to write `std::valarray` buffers. +- `sf_error("message")` (the C function, accessible via `rsf.hh`) prints to stderr and exits. + +--- + +## API cheat sheet + +All calls below are derived directly from `api/c++/rsf.hh`. + +| Operation | C++ call | +|-----------|----------| +| Initialize | `sf_init(argc, argv);` | +| Open default input | `iRSF in;` — opens `"in"` (stdin) | +| Open named input | `iRSF vel("vel");` — opens file passed as `vel=` | +| Open parameter object | `iRSF par(0);` — command-line params, no file | +| Open default output | `oRSF out;` — opens `"out"` (stdout) | +| Open named output | `oRSF wt("weight");` — opens file passed as `weight=` | +| Read axis int (required) | `in.get("n1", n1);` | +| Read axis float (required) | `in.get("d1", d1);` | +| Read axis string | `std::string label; in.get("label1", label);` | +| Write axis int | `out.put("n1", n1);` | +| Write axis float | `out.put("d1", d1);` | +| Write axis string | `out.put("label1", "Time");` | +| Read int param (required) | `par.get("niter", niter);` | +| Read int param (with default) | `par.get("niter", niter, 100);` | +| Read float param | `par.get("eps", eps, 0.01f);` | +| Read bool param | `par.get("adj", adj, false);` | +| Read string param | `std::string mode; par.get("mode", mode, std::string("exact"));` | +| Read data (valarray) | `in >> trace;` where `trace` is `std::valarray` | +| Write data (valarray) | `out << trace;` | +| Read scalar | `float v; in >> v;` | +| Write scalar | `float v = 1.f; out << v;` | +| Set output data type | `out.type(SF_INT);` | +| Check input data type | `if (in.type() != SF_FLOAT) sf_error("need float");` | +| Query total file size | `int total = in.size(0);` — product of all axes | +| Query size along axis k | `int nk = in.size(k);` — size of axis k (1-based) | +| Error handler | `sf_error("msg: %d", val);` — stderr + exit | + +Notes on `put` overloads in `oRSF`: there are three overloads — `put(name, int)`, `put(name, float)`, and `put(name, const char*)`. There is **no** `put(name, float, size, array)` overload for float arrays (that overload is commented out in `rsf.hh`); use the int-array form `put(name, size, int_array)` only. + +--- + +## Build integration + +### What `api/c++/SConstruct` does + +`api/c++/SConstruct` compiles `rsf.cc` and `cub.cc` into a static library named `rsf++` (file: `librsf++.a`, installed to `lib/`). The relevant line: + +```python +lib = env.StaticLibrary('rsf++', ccfiles, CCFLAGS='') +env.Install('../../lib', lib) +env.Install('../../include', hhfiles) # installs rsf.hh and cub.hh +``` + +It also prepends `../../include` to `CPPPATH` and `../../lib` to `LIBPATH`, and links against `librsf` (the C core). Test programs (`Testfile.x`, `Testgetpar.x`) are built in-place for local verification. + +### For your user-directory C++ program + +Use `HuiSconsTargets` in your `user//SConstruct`, **not** `UserSconsTargets` (the latter covers only `.c`, `.py`, `.f90`, `.jl`). The `HuiSconsTargets` helper exposes a `.cc` attribute: + +```python +import sys, os +sys.path.append('../../framework') +import bldutil + +targets = bldutil.HuiSconsTargets() +targets.cc = 'myprogram anotherprogram' # base names without M prefix or .cc +targets.build_all(env, glob_build, srcroot, bindir, libdir, pkgdir) +``` + +This compiles `Mmyprogram.cc` → `sfmyprogram` and links it against both `librsf++` and `librsf`. + +If your user directory does not yet have a `SConstruct`, copy the one from a nearby C++ user directory (e.g., `user/pyang/SConstruct` or `user/chenyk/SConstruct`) and adjust the program list. + +### Linking flags (manual reference) + +If you ever need to link manually outside SCons: + +```bash +g++ -I$RSFROOT/include Mmyprogram.cc -L$RSFROOT/lib -lrsf++ -lrsf -o sfmyprogram +``` + +The C++ library is `-lrsf++` (from `librsf++.a`); the C core is `-lrsf`. Both must appear. + +--- + +## Pointers to existing templates + +All files in `api/c++/`: + +- `Testfile.cc` — minimal I/O: read an `SF_INT` trace from stdin, write it five times to stdout. The simplest possible complete program. Start here. +- `Testgetpar.cc` — parameter parsing: demonstrates `par.get` for int, float, bool, and array variants with and without defaults. +- `rsf.hh` — the C++ API public header: `iRSF` and `oRSF` class declarations, all `get`/`put`/`>>` / `<<` overloads. +- `rsf.cc` — implementation of `iRSF` and `oRSF`; wraps `sf_input`, `sf_output`, `sf_histint`, `sf_histfloat`, `sf_getint`, `sf_getfloat`, `sf_getbool`, `sf_getstring`, `sf_floatread`, `sf_floatwrite`, etc. +- `cub.hh` — higher-level `CUB` class: manages an `sf_axis*` array for multi-dimensional cubes; exposes `headin()`, `headou()`, `clone()`, `getax(int)`, `putax(int, sf_axis)`, `setup(int kd)`, and typed `>>` / `<<` operators for float, int, short, char, sf_complex, and `std::complex`. +- `cub.cc` — implementation of `CUB`; prefer `CUB` over raw `iRSF`/`oRSF` when you need per-axis `sf_axis` structs (origin, delta, label, unit). +- `SConstruct` — SCons build script: compiles `rsf.cc`+`cub.cc` into `librsf++.a`, installs `rsf.hh`+`cub.hh` to `include/`, builds test programs. +- `test/` — directory of additional test/regression scripts for the C++ API. + +--- + +## Shared conventions + +The self-documentation comment for a C++ program is a **single `//` line** immediately before the first `#include`. This is what `framework/rsf/doc.py` scrapes (`comment['c++']` regex). Parameter descriptions are `// trailing comments` on the same line as each `par.get(...)` call. File naming (`M.cc`), parameter style (`key=value`), error handling (`sf_error`), and test patterns all follow the rules set out in the shared skill. For full details on all of these — including how `sfdoc` output is generated, how to write regression flows, and how to handle optional library dependencies — see [`skills/authoring-sf-programs/`](../authoring-sf-programs/SKILL.md). diff --git a/skills/authoring-sf-programs-f77/SKILL.md b/skills/authoring-sf-programs-f77/SKILL.md new file mode 100644 index 000000000..9ee49f1c0 --- /dev/null +++ b/skills/authoring-sf-programs-f77/SKILL.md @@ -0,0 +1,197 @@ +--- +name: authoring-sf-programs-f77 +description: Use when authoring a Madagascar sf* main program in Fortran 77. +--- + +## When to use + +Use this skill for Madagascar processing programs in **Fortran 77 fixed-format** +(file extension `.f`, program named `M.f`). Typical scenarios: maintaining +legacy fixed-format code, porting existing F77 algorithms, or working with +research groups that standardize on `gfortran -std=legacy` or `ifort`. + +**F77 vs F90 key differences:** + +| Aspect | F77 | F90 | +|---|---|---| +| Source format | Fixed-format; code starts at column 7 | Free-format | +| Continuation | Any non-space/zero char in column 6 | `&` at end of line | +| File extension | `M.f` | `M.f90` | +| Comments | `C` or `*` in column 1 | `!` anywhere | +| Loops | `do 10 i=1,n` … `10 continue` | `do i=1,n` … `end do` | + +For shared conventions (SConstruct rules, self-doc headers, naming) see +[skills/authoring-sf-programs/](../authoring-sf-programs/SKILL.md). + +--- + +## Skeleton + +Minimal F77 program derived from `api/f77/Testfile.f` and `api/f77/test/clip.f`. +Columns 1-6 are reserved; statement body starts at column 7; continuation +marker goes in column 6. + +```fortran +C Mclip.f -- clip float data to +/- clip + program Clipit + implicit none + integer*8 in, out + integer*8 sf_input, sf_output, sf_leftsize + logical sf_getfloat, sf_histint + integer n1, n2, i1, i2 + real clip, trace(1000) + + call sf_init() + in = sf_input("in") + out = sf_output("out") + + if (.not. sf_histint(in,"n1",n1)) + & call sf_error("No n1= in input") + if (.not. sf_getfloat("clip",clip)) + & call sf_error("Need clip=") + n2 = sf_leftsize(in,1) + + do 10 i2=1, n2 + call sf_floatread(trace, n1, in) + do 20 i1=1, n1 + if (trace(i1) .gt. clip) trace(i1) = clip + if (trace(i1) .lt. -clip) trace(i1) = -clip + 20 continue + call sf_floatwrite(trace, n1, out) + 10 continue + stop + end +``` + +Column layout reminder: +``` +123456789... +C comment (C in col 1) + statement (col 7+) + &continuation (col 6) + 10 label (cols 1-5) +``` + +--- + +## API cheat sheet + +All calls go through the `librsff` wrapper (`fortran.c` + `cfortran.h`). +Functions returning a value must be declared with the appropriate Fortran type. + +**Init and file handles** — declare file handles as `integer*8`: + +| Call | Returns | Purpose | +|---|---|---| +| `call sf_init()` | — | Initialize Madagascar (must be first) | +| `sf_input("in")` | `integer*8` | Open named input RSF file | +| `sf_output("out")` | `integer*8` | Open named output RSF file | +| `call sf_fileclose(f)` | — | Close file handle | +| `call sf_close()` | — | Finalize I/O | + +**History reads** — declare as `logical`; `.false.` means key absent: + +| Call | Purpose | +|---|---| +| `sf_histint(f,"n1",i)` | Read integer header key | +| `sf_histfloat(f,"d1",x)` | Read float header key | +| `sf_histbool(f,"key",b)` | Read logical header key | +| `sf_histstring(f,"label")` | Read string (function returning `character*`) | + +**History writes** (`call` subroutines, no return value): + +| Call | Purpose | +|---|---| +| `call sf_putint(f,"n2",i)` | Write integer to output header | +| `call sf_putfloat(f,"d2",x)` | Write float | +| `call sf_putstring(f,"label",str)` | Write string | + +**Command-line param reads** — declare as `logical`; `.true.` if key found: + +| Call | Purpose | +|---|---| +| `sf_getint("niter",i)` | Read integer param | +| `sf_getfloat("clip",x)` | Read float param | +| `sf_getfloats("d",xv,n)` | Read n floats | +| `sf_getbool("verb",b)` | Read yes/no param | +| `sf_getstring("tag")` | Read string (function) | + +**Data I/O** (`call` subroutines): + +| Call | Purpose | +|---|---| +| `call sf_floatread(buf, n, f)` | Read `n` floats from `f` | +| `call sf_floatwrite(buf, n, f)` | Write `n` floats to `f` | +| `call sf_intread(buf, n, f)` | Read `n` integers | +| `call sf_intwrite(buf, n, f)` | Write `n` integers | +| `call sf_complexread(buf, n, f)` | Read `n` complex values | +| `call sf_complexwrite(buf, n, f)` | Write `n` complex values | + +**Utility:** + +| Call | Returns | Purpose | +|---|---|---| +| `sf_leftsize(f, dim)` | `integer*8` | Elements past dimension `dim` | +| `sf_gettype(f)` | `integer` | Data type code (3 = float) | +| `call sf_error("msg")` | — | Print and abort | +| `call sf_warning("msg")` | — | Print, continue | + +--- + +## Build integration + +The F77 API builds only when `f77` appears in the SCons `API` list +(`api/f77/SConstruct`): + +1. Detects compiler via `env.get('F77')`; sets `-DGFORTRAN` or + `-DINTEL_COMPILER` accordingly. +2. Compiles `fortran.c` (with `cfortran.h`) into `fortran.o`. +3. Bundles into `librsff.a` (static library) and installs to `lib/`. +4. User programs link against both `librsff` and core `librsf`. + +Typical user `SConstruct` snippet: + +```python +env.Append(LIBS=['rsff', 'rsf']) +env.Program('sfclip', ['Mclip.f']) +# For strict fixed-format: +env.Replace(F77FLAGS='-std=legacy -ffixed-form') +``` + +`cfortran.h` handles name-mangling: `gfortran` (single underscore) vs +`f2c`-style (double underscore) via compile-time defines. + +--- + +## Pointers + +Files in `/Users/jgoai/m8r/src/api/f77/`: + +| File | Description | +|---|---| +| `fortran.c` | C wrapper; `FCALLSC*` macros expose every `sf_*` function to F77 | +| `cfortran.h` | Third-party (Burkhard Burow/DESY) Fortran↔C name-mangling header | +| `Testfile.f` | Smoke-test: open in/out, read 100-float traces, write them back | +| `Testgetpar.f` | Exercises all param-read calls (`sf_getint`, `sf_getfloat`, etc.) | +| `SConstruct` | Builds `librsff.a`, installs to `lib/`, compiles test programs | +| `test/clip.f` | Complete real program: float clipping with header inspection | +| `test/SConstruct` | SCons build for the `test/` example programs | + +--- + +## Shared conventions + +General Madagascar conventions (self-documentation strings, parameter help, +SCons `Program` builder, installation targets) are in: + +[skills/authoring-sf-programs/](../authoring-sf-programs/SKILL.md) + +F77-specific reminders: + +- Declare `sf_input`, `sf_output`, `sf_leftsize`, `sf_filesize` as `integer*8` + — file handles and offsets are 64-bit on modern systems. +- Declare param-query functions (`sf_getint`, `sf_histfloat`, …) as `logical` + to test them in `if (.not. …)` guards. +- Use `implicit none` — F77 implicit typing silently maps `i`-`n` to integer + and the rest to real, hiding declaration bugs. +- Continuation: place the continuation character in **column 6** only. diff --git a/skills/authoring-sf-programs-f90/SKILL.md b/skills/authoring-sf-programs-f90/SKILL.md new file mode 100644 index 000000000..93ed13e4b --- /dev/null +++ b/skills/authoring-sf-programs-f90/SKILL.md @@ -0,0 +1,169 @@ +--- +name: authoring-sf-programs-f90 +description: Use when authoring a Madagascar sf* main program in Fortran 90. +--- + +## When to use + +Load this skill when writing a new `sf` main program in **Fortran 90**. F90 is preferred when: + +- The algorithm already exists as Fortran legacy or numerical code and a rewrite would be risky. +- The researcher team works primarily in Fortran and a C translation would lose maintainability. +- Dense array operations benefit from Fortran's natural multi-dimensional array semantics. + +The source file naming convention is `M.f90`, placed inside `user//`. The `M` prefix tells the SCons build system that this is a main program; the installed binary will be named `sf`. + +**Dependency note:** The F90 API requires a Fortran 90 compiler. `gfortran` is the reference compiler; `pgf90` and `ifort` are also supported (see `SConstruct` logic). The Madagascar `configure` step must detect an F90 compiler and set `API=f90`; if detection fails the `librsff90` library is not built and F90 programs cannot be compiled. + +For language-agnostic conventions (self-documentation comments, parameter style, build integration, error handling, testing), read the shared skill first: + +- [skills/authoring-sf-programs/](../authoring-sf-programs/SKILL.md) + +--- + +## Skeleton + +Minimal F90 program structure, derived from `api/f90/Testfile.f90`: + +```fortran +! sf — one-line description +! +! Usage: sf < inp.rsf > out.rsf [param=value ...] +program Mname + use rsf + + implicit none + type(file) :: in, out + integer :: n1, n2, i2 + real :: d1, o1 + real, dimension(:), allocatable :: trace + + ! 1. Initialise RSF and parse command-line parameters + call sf_init() + + ! 2. Open standard input / output files + in = rsf_input() ! default tag "in" + out = rsf_output("out") ! default tag "out" + + ! 3. Read axis metadata from input history + call from_par(in, "n1", n1) ! fast axis length (required) + call from_par(in, "d1", d1, 1.0) ! sampling (optional, default 1) + call from_par(in, "o1", o1, 0.0) ! origin (optional, default 0) + call from_par(in, "n2", n2, 1) ! slow axis (optional, default 1) + + ! 4. Read user parameter + ! call from_par("key", val) -- reads from command line; aborts if missing + ! call from_par("key", val, default) -- uses default if absent + + ! 5. Propagate / update output history + call to_par(out, "n1", n1) + call to_par(out, "n2", n2) + + ! 6. Allocate working array + allocate(trace(n1)) + + ! 7. Trace loop: read → process → write + do i2 = 1, n2 + call rsf_read(in, trace) + ! ... process trace ... + call rsf_write(out, trace) + end do + +end program Mname +``` + +Key structural points: + +- `use rsf` is the only module import needed; it re-exports the entire public API. +- `implicit none` is mandatory — the module itself uses it, and omitting it in the main program will cause silent type bugs. +- `rsf_input()` / `rsf_output()` accept an optional `tag` argument (default `"in"` and `"out"` respectively) that corresponds to the pipe or file descriptor label used on the command line. +- `from_par` and `to_par` are generic interfaces; the compiler dispatches on the type of the `value` argument (`integer`, `real`, `character`, `logical`, arrays). +- Allocate arrays *after* reading axis dimensions, never before. + +--- + +## API cheat sheet + +All entries are sourced from `api/f90/rsf.f90` and `api/f90/fortran.c`. "Subroutine" means called with `call`; "function" means used in an expression or assignment. + +| Purpose | F90 call | Notes | +|---------|----------|-------| +| **Initialise** | `call sf_init()` | Must be first; parses argv, sets up pipes | +| **Open input** | `in = rsf_input([tag])` | Function returns `type(file)`; default tag `"in"` | +| **Open output** | `out = rsf_output([tag])` | Function returns `type(file)`; default tag `"out"` | +| **Get hist int** | `call from_par(f, "n1", ival [,default])` | Reads integer from file header; aborts if missing and no default | +| **Get hist real** | `call from_par(f, "d1", rval [,default])` | Reads real from file header | +| **Get hist string** | `call from_par(f, "key", sval [,default])` | String result; max `FSTRLEN=256` chars | +| **Get hist int array** | `call from_par(f, "key", ivec [,default])` | `ivec` is `integer, dimension(:)` | +| **Get param int** | `call from_par("key", ival [,default])` | No `file` arg — reads from command line | +| **Get param real** | `call from_par("key", rval [,default])` | No `file` arg — reads from command line | +| **Get param bool** | `call from_par("key", lval [,default])` | `lval` is `logical` | +| **Get param real array** | `call from_par("key", rvec [,default])` | `rvec` is `real, dimension(:)` | +| **Get param string** | `call from_par("key", sval [,default])` | String from command line | +| **Put hist int** | `call to_par(out, "n1", ival)` | Writes integer key into output header | +| **Put hist real** | `call to_par(out, "d1", rval)` | Writes real key into output header | +| **Put hist string** | `call to_par(out, "key", sval)` | Writes string key into output header | +| **Put hist int array** | `call to_par(out, "key", ivec)` | Writes integer array | +| **Read float data** | `call rsf_read(in, array)` | Dispatches on rank (1-D to 5-D); also accepts explicit `n` | +| **Write float data** | `call rsf_write(out, array)` | Same dispatch rules as `rsf_read` | +| **Read complex data** | `call rsf_read(in, carray)` | `carray` is `complex, dimension(:...)` | +| **Write complex data** | `call rsf_write(out, carray)` | `carray` is `complex, dimension(:...)` | +| **Get/put full axis** | `call iaxa(f, ax, i)` / `call oaxa(f, ax, i)` | Reads/writes `type(axa)` struct (n,o,d,label,unit) for axis `i` | +| **Print axis** | `call raxa(ax)` | Writes axis summary to `stderr` | +| **Get file type** | `t = gettype(f)` | Returns `sf_float=3`, `sf_int=2`, `sf_complex=4`, etc. | +| **Set file type** | `call settype(f, t)` | Pass type constant, e.g. `sf_int` | +| **File size** | `s = filesize(f [,dim])` | Without `dim`: total elements; with `dim`: elements from axis `dim` onward | +| **File dimensions** | `nd = dimension(f, n)` | Fills integer array `n`; returns number of dims | +| **Seek** | `call rsf_seek(f, offset, whence)` | `whence`: `sf_seek_set=0`, `sf_seek_cur=1`, `sf_seek_end=2` | +| **Flush** | `call rsf_fileflush(out [,src])` | Flush output; optionally copy header from `src` | +| **Fatal error** | `call sf_error("message")` | Prints to stderr and aborts | + +Type constants declared in `rsf.f90`: `sf_uchar=0`, `sf_char=1`, `sf_int=2`, `sf_float=3`, `sf_complex=4`, `sf_short=5`. + +Seek constants: `sf_seek_set=0`, `sf_seek_cur=1`, `sf_seek_end=2`. + +--- + +## Build integration + +The F90 API library is built by `api/f90/SConstruct`: + +1. It checks `'f90' in env.get('API',[]) and 'F90' in env` — both must be true, which requires that `configure` detected a Fortran 90 compiler. +2. It compiles `fortran.c` (the C-to-Fortran glue) and `rsf.f90` into `librsff90.a`. +3. It installs `librsff90.a` → `lib/` and the compiled module file `rsf.mod` → `include/`. +4. Compiler-specific flags are set automatically: + - `gfortran` / `gfc`: adds `-DGFORTRAN` (selects `_gfortran_iargc` / `_gfortran_getarg_i4` symbols in `fortran.c`). + - `pgf90`: extra include path only. + - `ifort`: adds `-DINTEL_COMPILER`. + +For user F90 programs (`user//M.f90`), follow the same pattern as for C/C++ programs but use the `f90` target helpers described in the shared skill. The SCons infrastructure links against `librsff90` automatically when it detects a `.f90` source file with an `M` prefix. See the shared skill ([skills/authoring-sf-programs/](../authoring-sf-programs/SKILL.md)) for the `SConstruct` snippet and `UserSconsTargets.f90` details. + +--- + +## Pointers + +Files in `api/f90/` (as of this writing): + +| File | Description | +|------|-------------| +| `rsf.f90` | The RSF Fortran-90 module; defines `module RSF`, `type(file)`, `type(axa)`, and all public generic interfaces (`from_par`, `to_par`, `rsf_read`, `rsf_write`, etc.) | +| `fortran.c` | C glue layer; wraps C API functions (`sf_init`, `sf_histint`, `sf_floatread`, etc.) into Fortran-callable entry points using `cfortran.h` macros | +| `cfortran.h` | Portable C-to-Fortran calling-convention header (third-party); handles name mangling for gfortran, pgf90, ifort, and others via `FCALLSC*` / `CCALLSF*` macros | +| `SConstruct` | SCons build script; detects F90 compiler, compiles `librsff90.a`, installs library and module, and builds the two test programs | +| `Testfile.f90` | Template/integration test for file I/O (`rsf_input`, `rsf_output`, `from_par`, `rsf_read`, `rsf_write`) — the canonical minimal program skeleton | +| `Testgetpar.f90` | Template/integration test for parameter parsing (`from_par` on command-line args, bool, real arrays) and `sf_error` | +| `test` | Directory with additional regression-test helpers | + +--- + +## Shared conventions + +All conventions below are defined in the shared skill and apply unchanged to F90 programs: + +- Self-documentation comment block at the top of the file (processed by `doc/`) +- Parameter naming and ordering rules +- Error message style (`sf_error` with lowercase message, no trailing period) +- Testing via `sftour` / `SConstruct` `Test` targets +- Build file location (`user//SConstruct` or top-level `SConstruct`) + +See [skills/authoring-sf-programs/](../authoring-sf-programs/SKILL.md) for the full reference. diff --git a/skills/authoring-sf-programs-java/SKILL.md b/skills/authoring-sf-programs-java/SKILL.md new file mode 100644 index 000000000..a9b7b40d2 --- /dev/null +++ b/skills/authoring-sf-programs-java/SKILL.md @@ -0,0 +1,200 @@ +--- +name: authoring-sf-programs-java +description: Use when authoring a Madagascar sf* main program in Java. +--- + +## When to use + +Choose the Java path when researchers bring an existing Java background or need access +to Java scientific libraries (e.g., the Mines JTK) that have no C/Fortran equivalent. +The file naming convention follows the same pattern as other language bindings: +`M.java` (e.g., `MClip.java` produces the program `sfclip`). + +**Note:** Java support requires a JDK at configure time and the `JAVA_HOME` (or +`JAVA_SDK`) environment variable set before running SCons. It is rarely used in +practice — there are no `M*.java` programs in the main Madagascar tree. The files +under `api/java/` (`Input.java`, `Output.java`, etc.) are reference implementations +and the `test/Clip.java` example demonstrates usage. Additionally, the Mines JTK +(`MINESJTK` environment variable) is required for the older API layer under +`api/java/old/`. + +For the broader context of what an `sf*` program is, how self-documentation works, +and general conventions that apply across all languages, see the shared skill: +[skills/authoring-sf-programs/](../authoring-sf-programs/SKILL.md). + +## Skeleton + +A minimal Java program that reads a 3-D float dataset, processes it trace-by-trace, +and writes the result. Based on the `test/Clip.java` reference implementation. + +```java +import rsf.RSF; +import rsf.Input; +import rsf.Output; + +public class MExample { + // Load the JNI bridge at class-load time + static { + System.loadLibrary("jrsf"); + } + + public static void main(String[] args) { + // 1. Initialize the RSF parameter system with command-line args + RSF par = new RSF(args); + + // 2. Open standard input and output RSF files + Input in = new Input("in"); + Output out = new Output("out"); + + // 3. Read header dimensions (1-based axis indexing) + int n1 = in.getN(1); + int n2 = in.getN(2); + int n3 = in.getN(3); + + // 4. Copy axis metadata to output + out.setN(1, n1); out.setDelta(1, in.getDelta(1)); + out.setN(2, n2); out.setOrigin(1, in.getOrigin(1)); + out.setN(3, n3); + + // 5. Parse a user parameter (with default fallback) + float threshold = par.getFloat("threshold", 1.0f); + + // 6. Read, process, write one fast-axis slice at a time + float[] trace = new float[n1]; + for (int i3 = 0; i3 < n3; i3++) { + for (int i2 = 0; i2 < n2; i2++) { + in.read(trace); + // ... your processing here ... + out.write(trace); + } + } + + // 7. Close files + in.close(); + out.close(); + } +} +``` + +The `static { System.loadLibrary("jrsf"); }` block is mandatory — it loads the +SWIG-generated JNI shared library (`libjrsf.jnilib` on macOS, `jrsf.so` on Linux) +bridging Java to the underlying `libdrsf` C library. + +## API cheat sheet + +### RSF — parameter / initialisation (`api/java/RSF.java`) + +| Method | Signature | Notes | +|--------|-----------|-------| +| Constructor | `new RSF(String[] args)` | Prepends `"java"` and calls `sf_init`; pass `main`'s `args` directly | +| getBool | `boolean getBool(String key, boolean fallback)` | Returns the parsed bool; `fallback` not yet wired — underlying `sf_getbool` result is returned | +| getInt | `int getInt(String key, int fallback)` | Returns `fallback` when key absent | +| getFloat | `float getFloat(String key, float fallback)` | Returns `fallback` when key absent | +| getString | `String getString(String key, String fallback)` | Returns `fallback` when key absent or null | + +### RSFFile — header access (`api/java/RSFFile.java`) + +Abstract base for `Input` and `Output`. All index arguments are **1-based**. + +| Method | Signature | Notes | +|--------|-----------|-------| +| getN | `int getN(int axis)` | Reads `n` from header via `sf_histint` | +| setN | `void setN(int axis, int n)` | Writes `n` via `sf_putint` | +| getDelta | `float getDelta(int axis)` | Reads `d` via `sf_histfloat` | +| setDelta | `void setDelta(int axis, float d)` | Writes `d` via `sf_putfloat` | +| getOrigin | `float getOrigin(int axis)` | Reads `o` via `sf_histfloat` | +| setOrigin | `void setOrigin(int axis, float o)` | Writes `o` via `sf_putfloat` | +| getLabel | `String getLabel(int axis)` | Reads `label` via `sf_histstring` | +| setLabel | `void setLabel(int axis, String s)` | Writes `label` via `sf_putstring` | +| getUnit | `String getUnit(int axis)` | Reads `unit` via `sf_histstring` | +| setUnit | `void setUnit(int axis, String s)` | Writes `unit` via `sf_putstring` | +| close | `void close()` | Calls `sf_fileclose`; always call for both `Input` and `Output` | +| MAX_DIMS | `static final int` | Equals `SF_MAX_DIM` from the generated `m8rConstants` class | + +### Input — reading data (`api/java/Input.java`) + +Extends `RSFFile`. Delegates to `sf_floatread` via the SWIG-generated `m8r` class. + +| Method | Notes | +|--------|-------| +| `new Input("in")` | `"in"` is the standard stdin token; any RSF filename also works | +| `void read(float[] a)` | Read `a.length` floats from the data stream | +| `void read(float[][] a)` | Iterates over outer dimension, calls `read(float[])` | +| `void read(float[][]...a)` | Overloads up to 9-D arrays via recursive delegation | + +### Output — writing data (`api/java/Output.java`) + +Extends `RSFFile`. Delegates to `sf_floatwrite`. Wraps all writes in +`try/catch (Exception e)`, printing to `System.err` on failure. + +| Method | Notes | +|--------|-------| +| `new Output("out")` | `"out"` is the standard stdout token | +| `void write(float[] d)` | Write `d.length` floats to the data stream | +| `void write(float[][] d)` | Iterates and delegates to `write(float[])` | +| `void write(float[][]...d)` | Overloads up to 9-D arrays | + +### Error handling + +`Output` and `RSFFile` accessors catch `Exception`, print to `System.err`, and return +sensible defaults (0 / 0.0f / `""`). No Java-level `sf_error`; fatal errors propagate from the C layer. + +## Build integration + +The `api/java/SConstruct` file documents the full build process: + +1. **JDK detection** — reads `JAVA_HOME` or `JAVA_SDK` from the SCons environment. + If neither is set the build exits immediately. +2. **SWIG step** — `m8r.i` is processed by SWIG with `-java -package rsf` to + produce `m8r_wrap.c` and the generated Java stubs (`m8r.java`, `m8rJNI.java`, + `m8rConstants.java`, `SWIGTYPE_p_sf_File.java`, etc.). +3. **JNI shared library** — the wrap C file is compiled and linked against + `libdrsf` to produce `libjrsf.jnilib` (macOS) or `jrsf.so` (Linux). +4. **Java compilation** — `javac` compiles `RSF.java`, `RSFFile.java`, + `Input.java`, `Output.java`, and the generated stubs into `rsf/*.class`. +5. **JAR packaging** — `rsf.class` files are archived into `rsf.jar` and + installed to `$RSFROOT/lib/`. + +For a user program there is no `UserSconsTargets` Java attribute in the SCons +framework. User programs must replicate the compile-and-jar pattern manually or +use the `project.Java(...)` helper demonstrated in `api/java/test/SConstruct`: + +```python +from rsf.proj import * +project.Java('.', 'MExample.java') +Flow('result', 'input MExample.class', + '%s MExample threshold=0.5' % WhereIs('java')) +End() +``` + +The JVM must be able to locate both `rsf.jar` and the JNI library at runtime. +Set `CLASSPATH` and `java.library.path` (or `LD_LIBRARY_PATH` / `DYLD_LIBRARY_PATH`) +to point at `$RSFROOT/lib/`. + +## Pointers + +All files under `api/java/`: + +| File | Description | +|------|-------------| +| `README` | Short note on the Mines JTK dependency and `MINESJTK` environment variable | +| `RSF.java` | Top-level class: calls `sf_init` and exposes `getInt/getFloat/getString/getBool` for command-line parameter parsing | +| `RSFFile.java` | Abstract base class for RSF files: header getters/setters (`getN`, `getDelta`, `getOrigin`, `getLabel`, `getUnit`) and `close()` | +| `Input.java` | Concrete read-only RSF file: `read(float[])` overloaded up to 9 dimensions | +| `Output.java` | Concrete write-only RSF file: `write(float[])` overloaded up to 9 dimensions | +| `m8r.i` | SWIG interface file that generates the JNI bridge between Java and the Madagascar C API (`libdrsf`) | +| `SConstruct` | SCons build script: JDK detection, SWIG invocation, `javac` compilation, JAR creation, and install to `$RSFROOT/lib/` | +| `test/Clip.java` | Reference implementation of a clipping program; shows the complete pattern: `RSF`, `Input`, `Output`, header reads, trace loop, `read`/`write`, `close` | +| `test/SConstruct` | Minimal SCons flow using `project.Java()` and `WhereIs('java')` to invoke the compiled class | +| `old/Header.java` | Legacy header class (Mines JTK-based); superseded by `RSFFile.java` | +| `old/Par.java` | Legacy parameter-parsing class; superseded by `RSF.java` | +| `old/Reader.java` | Legacy reader; superseded by `Input.java` | +| `old/Writer.java` | Legacy writer; superseded by `Output.java` | + +## Shared conventions + +Self-documentation, program naming (`sf*` / `M*.java`), data-type conventions, +axis ordering (n1 is the fast axis), and the overall SCons flow are shared across +all Madagascar language bindings. See: + +[skills/authoring-sf-programs/](../authoring-sf-programs/SKILL.md) diff --git a/skills/authoring-sf-programs-julia/SKILL.md b/skills/authoring-sf-programs-julia/SKILL.md new file mode 100644 index 000000000..94722b234 --- /dev/null +++ b/skills/authoring-sf-programs-julia/SKILL.md @@ -0,0 +1,190 @@ +--- +name: authoring-sf-programs-julia +description: Use when authoring a Madagascar sf* main program in Julia. +--- + +## When to use + +Load this skill when writing a new `sf` program in Julia. Julia is a good fit +when your collaborators already work in that ecosystem, when you want direct access to +the Julia numerical stack (FFTW.jl, LinearAlgebra, LoopVectorization, etc.) without a +foreign-function call, or when you need seamless FFI to C but prefer Julia syntax over +Python. + +The source file is named `M.jl` and lives in `user//`. The build +system handles the rest. Framework support for Julia programs exists via +`UserSconsTargets.jl` (see **Build integration**). There are no existing `M*.jl` +programs in the tree yet; use `api/julia/test/` as templates (especially `clip.jl` +for the core pattern and `afdm.jl` for a heavier numerical example). + +The Julia API (`m8r.jl`) is a thin wrapper around `libdrsf` via `ccall`. All data +types map to their C equivalents: `Float32`, `Int32`, `ComplexF32`, `Int16`, `UInt8`. + +See also: [Shared language-agnostic conventions](../authoring-sf-programs/SKILL.md) + +--- + +## Skeleton + +Modelled on `api/julia/test/clip.jl`: + +```julia +#!/usr/bin/env julia +# M.jl — one-sentence description of what this program does. +# +# Additional comments, references, and notes go here. +# (No structured doc-scraper for Julia yet; the comment is for human readers.) + +import m8r + +# --- Parameter retrieval (from command line or par= file) --- +clip = m8r.getfloat("clip") # required — exits if absent +verb = m8r.getbool("verb", false) # optional — default false +niter = m8r.getint("niter", 100) # optional — default 100 + +# --- Open standard input / output --- +inp = m8r.input("in") +out = m8r.output("out") + +# --- Read header fields from inp --- +n1 = m8r.histint(inp, "n1") +n2 = m8r.leftsize(inp, 1) # total samples / n1; works for any nd array + +# --- Copy header to output, optionally modifying axes --- +m8r.putint(out, "n1", n1) +m8r.putfloat(out, "d1", m8r.histfloat(inp, "d1")) +m8r.putfloat(out, "o1", m8r.histfloat(inp, "o1")) +m8r.putstring(out, "label1", m8r.histstring(inp, "label1")) +m8r.putstring(out, "unit1", m8r.histstring(inp, "unit1")) + +# --- Trace loop --- +trace = Array{Float32}(undef, n1) +for i2 in 1:n2 + m8r.floatread(trace, n1, inp) + + # ... process trace ... + clamp!(trace, -clip, clip) + + m8r.floatwrite(trace, n1, out) +end +``` + +Key structural points: +- `import m8r` (not `using m8r`) keeps the namespace explicit; `using m8r` is also + valid and exports `rsf_read` / `rsf_write`. +- `m8r.__init__()` is called automatically when the module loads; do not call + `sf_init` manually. +- `m8r.leftsize(file, 1)` returns `total_samples / n1` — the product of all axes + beyond the first. Use this to drive the outer loop without hard-coding dimensions. +- Auxiliary files: `m8r.input("vel")` opens the file named by the command-line + argument `vel=`. + +--- + +## API cheat sheet + +| Purpose | Julia call | Notes | +|---------|-----------|-------| +| **Init** | automatic on `import m8r` | `__init__` calls `sf_init` via `ccall` | +| **Open input** | `m8r.input("in")` | `"in"` = stdin; any other string = file path from that CLI param | +| **Open output** | `m8r.output("out")` | `"out"` = stdout; other strings = named output files | +| **Close file** | `m8r.close(file)` | wraps `sf_fileclose`; called automatically by `rsf_write` | +| **Hist int** | `m8r.histint(file, "n1")` | read `Int` from header; returns `0` if key absent | +| **Hist float** | `m8r.histfloat(file, "d1")` | read `Float32` from header | +| **Hist string** | `m8r.histstring(file, "label1")` | read string from header; `""` if absent | +| **Put int** | `m8r.putint(out, "n1", n)` | write integer header key | +| **Put float** | `m8r.putfloat(out, "d1", d)` | write float header key | +| **Put string** | `m8r.putstring(out, "label1", s)` | write string header key | +| **Get int param** | `m8r.getint("niter", 100)` | CLI/par param; default if absent | +| **Get float param** | `m8r.getfloat("clip")` | returns `0f0` if absent and no default given | +| **Get string param** | `m8r.getstring("vel", "")` | returns default string if absent | +| **Get bool param** | `m8r.getbool("verb", false)` | `"y"` → `true`, `"n"` → `false` | +| **Read floats** | `m8r.floatread(arr, n, inp)` | `arr` must be `Array{Float32,1}` of length `n` | +| **Write floats** | `m8r.floatwrite(arr, n, out)` | `arr` must be `Array{Float32,1}` | +| **Read ints** | `m8r.intread(arr, n, inp)` | `Array{Int32,1}` | +| **Write ints** | `m8r.intwrite(arr, n, out)` | `Array{Int32,1}` | +| **Read complex** | `m8r.complexread(arr, n, inp)` | `Array{ComplexF32,1}` | +| **Write complex** | `m8r.complexwrite(arr, n, out)` | `Array{ComplexF32,1}` | +| **Read short** | `m8r.shortread(arr, n, inp)` | `Array{Int16,1}` | +| **Write short** | `m8r.shortwrite(arr, n, out)` | `Array{Int16,1}` | +| **Left size** | `m8r.leftsize(file, dim)` | samples in dims `> dim`; `leftsize(f,0)` = total | +| **Set format** | `m8r.setformat(out, "complex")` | must call before first write for non-float output | +| **High-level read** | `dat, n, d, o, l, u = rsf_read(file_or_name)` | returns array + axis metadata | +| **High-level write** | `rsf_write(name, dat)` | writes array to named RSF file | +| **Pipe sf programs** | `sfspike(n1=10) \|> sfsmooth \|> rsf_read` | all installed sf* programs are exported | + +--- + +## Build integration + +`api/julia/SConstruct` installs `m8r.jl` into `$RSFROOT/lib/` at build time. The +framework-level dispatch for Julia programs happens in `UserSconsTargets.jl` +(located in the SCons framework tree). To add a Julia main program to your user +directory's build: + +```python +# user//SConstruct +import sys, os +sys.path.append('../../framework') +import bldutil + +# New-style (preferred) +targets = bldutil.UserSconsTargets() +targets.jl = 'myprogram' # base name only — no M prefix, no .jl suffix +targets.build_all(env, glob_build, srcroot, bindir, libdir, pkgdir) +``` + +At configure time, Madagascar's build system checks for a Julia executable. If Julia +is not found, `targets.jl` programs are silently skipped; no error is raised. To +verify Julia was detected during configuration, inspect `$RSFROOT/include/config.py` +for a `JULIA` key, or check `env.get('JULIA')` inside the SConstruct. + +Self-documentation for Julia programs is not yet scraped by `framework/rsf/doc.py` +(no `comment['jl']` regex entry); write a clear comment block at the top of your +`M.jl` for human readers, and plan to add `sfdoc` support manually if needed. + +--- + +## Pointers + +Files in `api/julia/` and their purpose: + +| File | Description | +|------|-------------| +| `api/julia/m8r.jl` | Full Julia API module — the only file you `import`; wraps all C-API entry points via `ccall` and also auto-exports every installed `sf*` binary as a Julia function | +| `api/julia/SConstruct` | SCons build file for the API; installs `m8r.jl` to `$RSFROOT/lib/` | +| `api/julia/test/clip.jl` | Minimal template: open in/out, read header, loop over traces with `floatread`/`floatwrite` — the closest thing to an `M*.jl` skeleton | +| `api/julia/test/afdm.jl` | Full numerical example: acoustic finite-difference modelling; shows multi-file input, header axis assembly, `putint`/`putfloat`/`putstring` output header setup, and a `@fastmath @inbounds` compute kernel | +| `api/julia/test/runtests.jl` | Comprehensive unit tests for every type (`uchar`, `char`, `int`, `float`, `complex`, `short`) covering `getint`/`getfloat`/`getstring`/`getbool`, `histint`/`histfloat`/`histstring`, `putint`/`putfloat`/`putstring`, read/write round-trips, and high-level `rsf_read`/`rsf_write` plus pipe-based `sf*` function calls | + +--- + +## Julia-specific quirks + +- **Type precision**: the API uses `Float32` and `Int32` throughout (matching the C + API). Passing `Float64` arrays to `floatwrite` requires explicit conversion: + `m8r.floatwrite(Float32.(vec(arr)), n, out)`. The high-level `rsf_write` does this + conversion automatically. +- **Output type inheritance**: `m8r.output("out")` silently inherits the data type + from the most recent `m8r.input("in")` call (C API behaviour). For non-float output + (complex, int, short) call `m8r.setformat(out, "complex")` immediately after + `m8r.output` and before any write. The `rsf_write(name, arr)` variant avoids this + pitfall by writing a dummy pipe to force the correct type. +- **No `sf_error` in Julia**: there is no `m8r.error(...)` binding. Use `error("msg")` + (Julia built-in) or `throw(...)` for fatal errors; the non-zero exit will propagate + correctly through Madagascar pipelines. +- **Pipe composition**: all installed `sf*` binaries are available as Julia functions + after `using m8r`. They accept an `RSFFile`, an array, or no argument, and return + an `RSFFile` pointing to a temporary file. Chain them with `|>`. +- **RSFROOT required**: the module reads `ENV["RSFROOT"]` at load time. If the + environment variable is not set, `m8r.RSFROOT` is `nothing` and `sf*` function + generation is skipped; the low-level read/write API still works. + +--- + +## Shared conventions + +For file naming, self-documentation, parameter conventions, error handling, testing, +and SCons build patterns that apply to all languages, see: + +[skills/authoring-sf-programs/](../authoring-sf-programs/SKILL.md) diff --git a/skills/authoring-sf-programs-matlab/SKILL.md b/skills/authoring-sf-programs-matlab/SKILL.md new file mode 100644 index 000000000..bbc788f97 --- /dev/null +++ b/skills/authoring-sf-programs-matlab/SKILL.md @@ -0,0 +1,181 @@ +--- +name: authoring-sf-programs-matlab +description: Use when authoring a Madagascar sf* main program in MATLAB. +--- + +## When to use + +Use this skill when a researcher already has MATLAB installed (licensed) and wants +to write an `sf` program as a `.m` script, leveraging existing MATLAB toolboxes +(e.g. Signal Processing Toolbox, Optimization Toolbox) that would be cumbersome to +replicate in C or Python. + +**File naming**: the MATLAB main script must be named `M.m` — same `M` prefix +convention used by all Madagascar languages. There is no `sf` prefix in the source file; +the installed binary is called `sf`. + +**Important — compile-time requirement**: the MEX backing library (`rsf_create.mex*`, +`rsf_read.mex*`, etc.) is compiled at `scons` configure time only when MATLAB is +detected on the build host. See `api/matlab/SConstruct`: the entire MEX build is gated +on `'matlab' in env.get('API',[])`. Without a licensed MATLAB on the build machine the +API produces nothing and MATLAB programs cannot run. + +**Octave overlap**: the `.m` extension is shared with Octave. The build system selects +Matlab vs. Octave at configure time via the `API` variable; they are mutually exclusive +at build time. For Octave programs see `authoring-sf-programs-octave/SKILL.md`. + +Always load the shared skill alongside this one: +[`skills/authoring-sf-programs/`](../authoring-sf-programs/SKILL.md) — file naming, +self-documentation, parameter conventions, error handling, and testing conventions that +apply to every sf-program regardless of language. + +--- + +## Skeleton + +The canonical pattern is taken from `api/matlab/test/clip.m`. A MATLAB sf-program +is a plain function (not a script) that receives RSF filenames as arguments: + +```matlab +function M(in, out, param1) +% One-sentence synopsis (appears in sfdoc output). +% +% par1=default description of parameter + +% 1. Query dimensions of the input file. +dims = rsf_dim(in); % returns column vector [n1; n2; ...] +n1 = dims(1); +n2 = prod(dims(2:end)); + +% 2. Read a named parameter from the input header. +clip = rsf_par(in, 'clip', 'f', 1.0); % type 'f'=float, 'i'=int, 'b'=bool + +% 3. Create output header, inheriting geometry from input. +rsf_create(out, in); % copy header: rsf_create(outfile, infile) +% -- OR, specify sizes explicitly: +% rsf_create(out, [n1; n2]); + +% 4. Allocate a trace-length working buffer. +trace = zeros(1, n1); + +% 5. Loop over traces, reading and writing sequentially. +for i2 = 1:n2 + rsf_read(trace, in, 'same'); % read next n1 samples + % ... process trace ... + trace(trace > clip) = clip; + trace(trace < -clip) = -clip; + rsf_write(trace, out, 'same'); % write next n1 samples +end +``` + +The `'same'` flag on `rsf_read` / `rsf_write` advances an internal file pointer so +successive calls walk through the binary data sequentially — essential for loop-based +trace processing. + +For single-call whole-file I/O, prefer `rsf_read_all` / `rsf_write_all` (see cheat +sheet below). + +--- + +## API cheat sheet + +All functions are MEX-compiled C entry points. `nrhs` = number of right-hand side +(input) arguments; `nlhs` = number of left-hand side (output) arguments. + +| Function | MATLAB call | nrhs | nlhs | Notes | +|---|---|---|---|---| +| `rsf_create` | `rsf_create(outfile, infile)` | 2 | 0 | Create RSF header on disk; copies geometry from `infile` (string) | +| `rsf_create` | `rsf_create(outfile, [n1;n2;...])` | 2 | 0 | Create header; sizes from column vector of doubles | +| `rsf_read` | `rsf_read(buf, infile)` | 2 | 0 | Read `numel(buf)` samples from `infile` into pre-allocated double array `buf` | +| `rsf_read` | `rsf_read(buf, infile, 'same')` | 3 | 0 | Same, but advance internal file pointer (sequential trace loop) | +| `rsf_read_all` | `[data,sz,d,o,lbl,unit] = rsf_read_all(file)` | 1 | 1–6 | Read entire file; returns data array plus optional size/delta/origin/label/unit vectors | +| `rsf_read_header` | `[sz,d,o,lbl,unit] = rsf_read_header(file)` | 1 | 1–5 | Read header only (no data); returns size vector plus optional delta/origin/label/unit | +| `rsf_write` | `rsf_write(buf, outfile)` | 2 | 0 | Write double array `buf` to `outfile` (sets n1/n2/... from `buf` dimensions) | +| `rsf_write` | `rsf_write(buf, outfile, 'same')` | 3 | 0 | Append to existing binary; does not rewrite header | +| `rsf_write_all` | `rsf_write_all(file, cmdargs, data)` | 3–7 | 0 | Write full RSF file; `cmdargs` is cell array (e.g. `{'out=stdout'}`); optional delta/origin/label/unit row vectors | +| `rsf_par` | `v = rsf_par(infile, 'name', 'f', default)` | 4 | 1 | Read scalar header parameter; type string: `'f'`=float, `'i'`/`'d'`=int, `'l'`/`'b'`=bool; returns `default` if absent | +| `rsf_dim` | `dims = rsf_dim(infile)` | 1 | 1 | Return column vector of axis lengths `[n1; n2; ...]` for the RSF file | + +**Type details for `rsf_par`**: type string first character selects the branch — +`'f'` or `'r'` → float, `'i'` or `'d'` → integer, `'l'` or `'b'` → logical. +The return value is always a double scalar. + +**Complex data**: `rsf_read` / `rsf_write` handle `SF_COMPLEX` files automatically +when `buf` / `data` is a MATLAB complex array (`mxIsComplex`). + +**`m8r` MEX** (`m8r.c` → `m8r.mex*`): exposes `rsf_filter(cmd, data, deltas, origins, +labels, units)` — runs an arbitrary Madagascar binary as a filter on MATLAB data via +temp files. Useful for calling existing `sf*` programs from within a MATLAB session. + +--- + +## Build integration + +The MEX binaries are built by `api/matlab/SConstruct`, which is invoked automatically +by the top-level Madagascar `scons` when MATLAB is present: + +```python +# api/matlab/SConstruct (excerpt) +if 'matlab' in env.get('API',[]): + mex = env.get('MEX') # path to mex compiler, detected at configure time + suffix = env.get('MEXSUFFIX') # platform suffix, e.g. .mexa64 / .mexmaci64 + if mex: + mexcom = mex + " CC=$CC CFLAGS='$CFLAGS $_CCCOMCOM -fPIC' ..." + for inp in Split('m8r par dim read_header read read_all write write_all create'): + ... + mfile = env.Command(cfile+suffix, cfile+'.c', mexcom) + if root: + install = env.Install(libdir, mfile) +``` + +MATLAB must be detectable by the configure step for this block to execute. If MATLAB +is not installed, `env.get('API',[])` will not contain `'matlab'` and the entire block +is skipped — no MEX files are produced and MATLAB programs will fail with "undefined +function" errors at runtime. + +**No `UserSconsTargets` for MATLAB**: the `bldutil.UserSconsTargets()` helper in +`framework/bldutil.py` supports `.c`, `.f90`, `.py`, and `.jl` attributes — not `.m`. +MATLAB programs are not automatically discovered or compiled by the standard user +directory `SConstruct` machinery. + +**Deploying a new MATLAB program**: place `M.m` somewhere on MATLAB's path (or +in the current directory) and ensure that the MEX binaries installed to `libdir` are +also on MATLAB's path. The simplest approach: + +```matlab +addpath('/path/to/madagascar/lib') % directory containing rsf_*.mex* files +addpath('/path/to/your/Mname.m') +M(input_rsf, output_rsf, ...) +``` + +There is no automated install target for `.m` main programs analogous to the C or +Python install targets. + +--- + +## Pointers + +Files in `api/matlab/` with one-line descriptions: + +| File | Description | +|---|---| +| `SConstruct` | SCons build script; compiles all `rsf_*.c` and `m8r.c` to MEX when `matlab` is in `API` | +| `rsf_create.c` | MEX: create an RSF output header from a filename+infile or filename+dims vector | +| `rsf_read.c` | MEX: read samples from an RSF file into a pre-allocated MATLAB double buffer | +| `rsf_read_all.c` | MEX: read an entire RSF file in one call, returning data and optional header metadata | +| `rsf_read_header.c` | MEX: read RSF header only (sizes, deltas, origins, labels, units) without touching the binary | +| `rsf_write.c` | MEX: write a MATLAB double array to an RSF file, with optional `'same'` append mode | +| `rsf_write_all.c` | MEX: write a complete RSF file (header + data) in one call with full metadata support | +| `rsf_par.c` | MEX: read a scalar parameter (int, float, or bool) from an RSF header by name | +| `rsf_dim.c` | MEX: return the axis-length vector `[n1; n2; ...]` for an RSF file | +| `m8r.c` | MEX: run any Madagascar `sf*` binary as a filter on MATLAB data via temp RSF files | +| `test/clip.m` | Example MATLAB sf-program: clips data to `[-clip, clip]` trace by trace using `rsf_dim`, `rsf_create`, `rsf_read`/`rsf_write` with `'same'` | + +--- + +## Shared conventions + +All Madagascar sf-programs — regardless of language — follow the same conventions for +file naming, self-documentation comments, parameter style, error handling, and testing. +See [`skills/authoring-sf-programs/`](../authoring-sf-programs/SKILL.md) for the +full shared reference. diff --git a/skills/authoring-sf-programs-octave/SKILL.md b/skills/authoring-sf-programs-octave/SKILL.md new file mode 100644 index 000000000..0383d4433 --- /dev/null +++ b/skills/authoring-sf-programs-octave/SKILL.md @@ -0,0 +1,187 @@ +--- +name: authoring-sf-programs-octave +description: Use when authoring a Madagascar sf* main program in GNU Octave. +--- + +## When to use + +Use this skill when you want to write a Madagascar `sf` program using +GNU Octave — the open-source, MATLAB-compatible interpreter — and you do not +have a MATLAB license. + +**Disambiguation: `.m` files are shared by Octave and MATLAB.** A file named +`M.m` could target either runtime. The difference is entirely in how +Madagascar is configured and how the helpers are installed: + +- **Octave path** (this skill): pure `.m` helper files; no MEX compilation + required; enabled with `API=octave` at configure time; installs the + `rsf_*.m` helpers to Octave's function path. +- **MATLAB path**: C MEX extensions are compiled with `mex`; enabled with + `API=matlab`; requires a MATLAB license. See + `skills/authoring-sf-programs-matlab/SKILL.md`. + +Typical users of this path: researchers who want MATLAB-like array syntax and +existing `.m` algorithms without a proprietary license. + +Always load the shared conventions skill alongside this one: +[skills/authoring-sf-programs/](../authoring-sf-programs/SKILL.md) + +--- + +## Skeleton + +A minimal Octave `M.m` script. No MEX compilation is needed — the Octave +API is implemented as pure `.m` helper files (`rsf_create.m`, `rsf_dim.m`, +`rsf_par.m`) that are placed on Octave's function path at install time. + +```octave +% Mscale.m — sf skeleton for GNU Octave +% One-line description of what this program does. +% +% Usage: sfscale < input.rsf scale=2.0 > output.rsf + +% Read mandatory scalar parameter (no default -> error if absent) +[scale, st] = rsf_par('in.rsf', 'scale', []); +if st.err; error(st.msg); end +if isempty(scale); error('scale= required'); end + +% Read optional integer parameter with default +[niter, st] = rsf_par('in.rsf', 'niter', 100); +if st.err; error(st.msg); end + +% Query dimensions of input file +[dims, st] = rsf_dim('in.rsf'); +if st.err; error(st.msg); end +n1 = dims(1); +n2 = dims(2); % 1 if the dataset is 1-D + +% Create output header by cloning the input header +st = rsf_create('out.rsf', 'in.rsf'); +if st.err; error(st.msg); end + +% Read binary data (rsf_read / rsf_write are NOT yet in api/octave; +% use sfdd or a shell pipe to convert to a plain binary float array) +fid_in = fopen('<&0', 'rb'); % stdin when invoked as sf program +data = fread(fid_in, n1*n2, 'float32'); +fclose(fid_in); + +% Process +data = data * scale; + +% Write +fid_out = fopen('>&1', 'wb'); % stdout +fwrite(fid_out, data, 'float32'); +fclose(fid_out); +``` + +> **Note on stdin/stdout**: When Madagascar invokes a program via a pipeline, +> `stdin` carries the raw RSF binary and `stdout` must produce the raw binary +> for the next stage. Octave scripts can use `fopen('<&0','rb')` and +> `fopen('>&1','wb')` for this purpose on POSIX systems. + +--- + +## API cheat sheet + +All helpers live in `api/octave/`. They call underlying `sf*` command-line +tools via `system()` and return a `stat` struct with fields `stat.err` +(logical) and `stat.msg` (string). + +| Function | Signature | Purpose | +|---|---|---| +| `rsf_create` | `stat = rsf_create(out_filename, arg2)` | Write an RSF header. `arg2` is either an existing `.rsf` filename (copies that header) or a numeric vector of dimensions (creates a new header via `sfcreate`). | +| `rsf_dim` | `[dims, stat] = rsf_dim(in)` | Return a vector of dimensions for the RSF file `in`, with trailing length-1 dimensions stripped. Calls `sffiledims parform=n` internally. | +| `rsf_par` | `[par, stat] = rsf_par(file, name, default)` | Read scalar parameter `name` from header `file`. Returns `default` when the key is absent. Calls `sfget parform=n` internally. | + +**Error handling pattern** (use consistently): + +```octave +[val, st] = rsf_par('in.rsf', 'n1', 1); +if st.err + error('rsf_par failed: %s', st.msg); +end +``` + +**`rsf_read` / `rsf_write`** are not present in `api/octave/` (only in +`api/matlab/` as MEX entry points). Read/write the binary payload directly +via Octave's `fread`/`fwrite` or by piping through `sfdd`. + +--- + +## Build integration + +`api/octave/SConstruct` is currently empty (placeholder); the `.m` helpers are +plain Octave function files that require no compilation step. + +At configure time (`framework/configure.py`, `octave()` function): + +1. `WhereIs('octave')` is run; if found, `env['OCTAVE']` is set. +2. `WhereIs('mkoctfile')` is checked; `env['MKOCTFILE']` is set if found. + `mkoctfile` is the Octave function compiler (analogous to `mex`), but + **it is not required for the pure `.m` API** — only needed if you later + add compiled oct-files (`.oct`). The plain `rsf_*.m` helpers install + without it. +3. Configure is triggered by passing `API=octave` (or including `octave` in + the comma-separated `API` list) to `scons`. + +**Installing the helpers**: the `rsf_*.m` files must be on Octave's function +path before your script can call them. Typical approaches: + +```bash +# Option A: add to OCTAVE_PATH environment variable +export OCTAVE_PATH=/path/to/RSFROOT/lib:$OCTAVE_PATH + +# Option B: addpath() inside your script (useful for SConstruct flows) +octave --eval "addpath('/path/to/RSFROOT/lib'); Mscale(...)" +``` + +**New user programs** — no MEX compile step is needed. Place `M.m` in +`user//` and invoke it from a `Flow()` via the `OCTAVE` env +variable (similar to the pattern in +`book/rsf/school2025/plots/SConstruct`): + +```python +# In user//SConstruct +octave = env.get('OCTAVE') +if octave: + Flow('output', 'input', + '%s --eval "addpath(...); Mscale(\'${SOURCE}\', \'${TARGET}\', ...); exit;"' + % octave, stdin=0, stdout=-1) +``` + +Because the script is interpreted, the SConstruct does not need a compile +step — only an install/path setup. + +--- + +## Pointers + +Files in `api/octave/` with one-line descriptions: + +| File | Description | +|---|---| +| `rsf_create.m` | Write an RSF header to disk — copies an existing header or creates one from a dimension vector by calling `sfcreate`. | +| `rsf_dim.m` | Return the dimension vector of an RSF file by calling `sffiledims parform=n`; strips trailing length-1 dimensions. | +| `rsf_par.m` | Read a named scalar parameter from an RSF header by calling `sfget parform=n`; returns a caller-supplied default when the key is absent. | +| `SConstruct` | Placeholder (empty); no compilation targets are needed for the pure `.m` Octave API. | + +--- + +## Shared conventions + +File naming, self-documentation comment format, parameter conventions, error +handling, testing, and build integration patterns that apply to every +`sf` program regardless of language are documented in: + +[skills/authoring-sf-programs/](../authoring-sf-programs/SKILL.md) + +Key reminders from that skill relevant to Octave programs: + +- Name your file `M.m` inside `user//`. +- The installed binary is called `sf` (build system drops `M`, prepends + `sf`). +- The `.m` extension is shared with MATLAB; configure time (`API=octave` vs. + `API=matlab`) determines which runtime is used and whether MEX compilation + is required. +- Self-documentation: add a leading `%` comment block describing the program + and its parameters so `sfdoc sf` works correctly. diff --git a/skills/authoring-sf-programs-python/SKILL.md b/skills/authoring-sf-programs-python/SKILL.md new file mode 100644 index 000000000..6dc827f83 --- /dev/null +++ b/skills/authoring-sf-programs-python/SKILL.md @@ -0,0 +1,454 @@ +--- +name: authoring-sf-programs-python +description: Use when authoring a Madagascar sf* main program in Python (fast prototyping with numpy). +--- + +## When to use + +Load this skill when writing a new `sf` program in Python. Python is the fastest iteration path: no compilation step, numpy-native array operations, and easy debugging. New Python programs go in `user//M.py`. + +This skill covers the Python-specific layer only. Always also load: + +- `../authoring-sf-programs/SKILL.md` — file naming conventions, self-documentation requirements, parameter style, and build integration shared by every language. + +Use this skill instead of the C skill whenever: +- You are prototyping an algorithm and want fast iteration. +- Your operation is naturally expressed as numpy array math. +- You do not need raw C performance (inner loops in pure Python can be 10–100x slower than C; use numpy vectorization to avoid that). + +--- + +## Skeleton + +This skeleton has been smoke-tested against a real `sfspike`-generated RSF file and runs correctly. The key difference from the plan's template: the correct import is `rsf.api as rsf` (the installed module name), not `import m8r`, and `rsf.Output()` takes only a tag argument — shape is inherited automatically from the first input opened. + +```python +#!/usr/bin/env python3 +'''One-line description of what this program does.''' + +import sys +import numpy as np +import rsf.api as rsf + +par = rsf.Par() +fin = rsf.Input() # stdin by default +fout = rsf.Output() # stdout by default; inherits format from fin + +n1 = fin.int("n1") # fastest axis length (from RSF header) +n2 = fin.size(1) # product of all axes beyond n1 (number of traces) + +factor = par.float("factor", 1.0) # scale factor [default 1.0] + +trace = np.zeros(n1, dtype='f') + +for i2 in range(n2): + fin.read(trace) + trace *= factor + fout.write(trace) + +sys.exit(0) +``` + +### Bulk-read variant (whole array at once) + +When the algorithm needs the full dataset in memory, read everything at once: + +```python +#!/usr/bin/env python3 +'''One-line description.''' + +import sys +import numpy as np +import rsf.api as rsf + +par = rsf.Par() +fin = rsf.Input() +fout = rsf.Output() + +n1 = fin.int("n1") +n2 = fin.int("n2") or 1 + +factor = par.float("factor", 1.0) # scale factor [default 1.0] + +data = np.zeros((n2, n1), dtype=np.float32) +fin.read(data) + +data *= factor + +fout.write(data) +sys.exit(0) +``` + +### Correction from the plan's template + +The plan template showed `m8r.Output("out", fin)`. This is wrong in two ways: + +1. Prefer `import rsf.api as rsf` over `import m8r`. Both work (the installed `m8r.py` is bit-identical to `rsf/api.py`), but the self-doc scraper's I/O-recognition regex (`inpout['python']` in `framework/rsf/doc.py`) is anchored to `rsf.Input` / `rsf.Output` calls. Using `import m8r` would lose the auto-generated `< in.rsf > out.rsf` synopsis in `sfdoc` output. Note that `api/python/test/clip.py` and `api/python/test/afdm.py` both use `import m8r` — they still work, but their sfdoc entries are less informative. +2. `rsf.Output(tag='out', data_format=None)` accepts only a tag and an optional format string — there is no second positional argument for a source file. Shape inheritance is automatic: when writing begins, the internal `_RSF.fileflush()` copies header parameters from the first opened input. + +--- + +## Self-doc format + +The scraper in `framework/rsf/doc.py` uses this regex for Python files (`comment['python']`): + +```python +comment['python'] = re.compile( + r'[^\'\"]*([\'\"]+)(?P[^\'\"].+?)\1', + re.DOTALL +) +``` + +This matches the **first string literal** in the file — specifically the first `'...'` or `"""..."""` that appears before any other quoted string. In practice: put a module-level docstring immediately after the shebang line: + +```python +#!/usr/bin/env python3 +'''Scale input data by a scalar factor.''' +``` + +or a triple-quoted version: + +```python +#!/usr/bin/env python3 +""" +Scale input data by a scalar factor. + +Longer explanation on subsequent lines. +""" +``` + +The first line of the docstring becomes the program's one-line description shown in `sfdoc sfname`. Subsequent lines become the comments section. + +Real example from `api/python/test/afdm.py`: that file has no leading docstring, so its `sfdoc` description is empty. Real example from `user/sbader/Menergy.py`: + +```python +#!/usr/bin/env python3 +''' +Estimate energy of input + +E(t) = \sum\limits_{k=(t-\frac{R}{2})}^{(t+\frac{R}{2})}A(k)^2 +''' +``` + +The parameter scraper regex for Python (`param['python']`) captures `par.bool(...)`, `par.int(...)`, `par.float(...)`, and `par.string(...)` calls along with optional trailing `# [range] description` comments: + +```python +param['python'] = re.compile( + r'par\.(?Pbool|int|float|string)' + r'\s*\(\s*[\"\'](?P\w+)[\"\']\s*' + r'(?:\,\s*(?P[^\)]+))?\)' + r'(?:\s*\#\s*(?P[\[][^\]]+[\]])?\s*' + r'(?P[^#\n]+\S))?' +) +``` + +So a parameter line like: + +```python +factor = par.float("factor", 1.0) # [0,inf] scale factor applied to every sample +``` + +scrapes as: type=float, name=factor, default=1.0, range=[0,inf], desc="scale factor applied to every sample". + +--- + +## m8r.Input / m8r.Output + +Both classes live in `api/python/m8r.py` (installed as `rsf/api.py`). They extend `_File`, which extends `File`. + +### `rsf.Input(tag='in')` + +Opens an RSF file for reading. + +- `tag='in'` means standard input (the default, used when the program reads from a pipe). +- Any other tag string (e.g., `'vel'`, `'ref'`) opens the file named by the command-line argument `vel=some.rsf`. + +From `afdm.py`: +```python +Fw = rsf.Input() # stdin — the wavelet file +Fv = rsf.Input("vel") # opened via vel= on the command line +Fr = rsf.Input("ref") # opened via ref= on the command line +``` + +**Header reads** — extract axis metadata from the RSF header: + +```python +n1 = fin.int("n1") # returns int or None +d1 = fin.float("d1") # returns float or None +n1 = fin.int("n1", default=100) # returns default if key absent +label = fin.string("label1") # returns str or None + +# Axis convenience method (returns dict with n, d, o, l, u): +ax = fin.axis(1) +nt = ax['n']; dt = ax['d']; ot = ax['o'] +``` + +**Reading data:** + +```python +trace = np.zeros(n1, dtype='f') +fin.read(trace) # fills trace in-place (trace-by-trace loop pattern) + +n2 = fin.size(1) # product of all axes from axis 2 onward +data = np.zeros((n2, n1), dtype='f') +fin.read(data) # bulk read; shape (n2, n1) +``` + +`fin.close()` is optional — the destructor closes automatically. + +### `rsf.Output(tag='out', data_format=None)` + +Opens an RSF file for writing. + +- `tag='out'` means standard output (the default). +- Shape and format are inherited automatically from the first `rsf.Input()` opened in the same program; you do not pass a source file. +- Use `.put()` to override or add header values before any `.write()` call. + +**Writing metadata:** + +```python +fout.put("n1", n1) +fout.put("d1", 0.004) +fout.put("label1", "Time") +fout.put("unit1", "s") + +# Axis convenience: +fout.putaxis(ax, 1) # copies n, d, o, label, unit from axis dict +``` + +**Writing data:** + +```python +fout.write(trace) # write a 1D numpy array +fout.write(data) # write a 2D (or ND) numpy array — flattened internally +``` + +The `write()` method calls `np.reshape(data.astype(np.float32), (data.size,))` before writing, so the shape of the array passed does not need to match the file shape exactly — the total element count must match. + +**Closing:** + +```python +fout.close() # optional; destructor closes automatically +``` + +--- + +## m8r.Par + +`rsf.Par(argv=sys.argv)` parses command-line arguments of the form `key=value`. A default instance is created automatically on module import; call `rsf.Par()` explicitly in your program to initialize it properly. + +### Scalar parameters + +```python +par = rsf.Par() + +i = par.int("niter") # returns int or None +i = par.int("niter", 10) # returns 10 if niter= not on command line +f = par.float("eps", 0.01) # float with default +b = par.bool("verb", False) # bool: True if verb=y/Y/1; False if n/N/0 +s = par.string("mode", "fwd") # string (strips surrounding quotes) +``` + +Booleans on the command line: `verb=y` or `verb=1` → True; `verb=n` or `verb=0` → False. + +### List (array) parameters + +```python +ints = par.ints("k1", 3) # read 3 ints: k1=1,5,9 +floats = par.floats("scale", 2) # read 2 floats: scale=1.0,2.0 +bools = par.bools("flags", 4) # read 4 bools +``` + +If fewer values are given than requested, the last value is repeated to fill. + +### Program name + +```python +name = par.getprog() # returns sys.argv[0] +``` + +### Parameter file + +Pass `par=somefile.par` on the command line to read `key=value` pairs from a file; they merge with command-line arguments. + +--- + +## Numpy shape conventions + +Madagascar numbers axes starting from 1, with **axis 1 being the fastest-varying** (stored contiguously in memory, analogous to C's inner loop). Numpy uses C order (row-major), where the **last index is fastest-varying**. + +Therefore, when you allocate a numpy array to hold Madagascar data, the axis ordering is **reversed**: + +| Madagascar | Numpy shape | +|------------|-------------| +| n1=100 samples/trace | last dimension: `(..., 100)` | +| n2=50 traces | second-to-last: `(50, 100)` | +| n3=10 shots | `(10, 50, 100)` | + +**Example**: a 3D dataset with n1=500, n2=200, n3=10: + +```python +n1 = fin.int("n1") # 500 — fastest (time samples) +n2 = fin.int("n2") # 200 — traces +n3 = fin.int("n3") # 10 — shots +data = np.zeros((n3, n2, n1), dtype=np.float32) +fin.read(data) +# data[ishot, itrace, isample] — numpy index order +``` + +### Common transpose patterns + +When an algorithm expects "time along rows" (C convention: time last), your layout is already correct. When an external library expects "columns are samples" (Fortran convention), transpose: + +```python +# Madagascar stores (n2, n1) — n2 rows, n1 samples per row +data = np.zeros((n2, n1), dtype='f') +fin.read(data) + +# For a library expecting (n1, n2) — samples along rows: +data_T = data.T # view, no copy + +# Or use numpy's built-in where a copy is needed: +data_fortran = np.asfortranarray(data) +``` + +When writing back, remember to restore the Madagascar axis order: + +```python +result = some_lib_call(data.T) # result shape is (n1, n2) +fout.write(result.T) # write (n2, n1) — Madagascar order +``` + +### `fin.shape()` and axis reversal + +The `File.shape()` method returns a tuple already reversed for numpy: + +```python +sh = fin.shape() # e.g. (10, 50, 100) for n1=100, n2=50, n3=10 +data = np.zeros(sh, dtype='f') +fin.read(data) +``` + +--- + +## SConstruct integration + +### Declaring Python programs + +In `user//SConstruct`, list each Python main program name (without the `M` prefix and without `.py`) in the `pyprogs` variable: + +```python +import os, sys, re +sys.path.append('../../framework') +import bldutil + +pyprogs = 'myscale mystack' # Mscale.py and Mstack.py + +try: + Import('env root pkgdir bindir libdir incdir') + env = env.Clone() +except: + env = bldutil.Debug() + root = None + +if root: # no compilation, just rename + pymains = Split(pyprogs) + exe = env.get('PROGSUFFIX','') + for prog in pymains: + binary = os.path.join(bindir,'sf'+prog+exe) + env.InstallAs(binary,'M'+prog+'.py') + env.AddPostAction(binary,Chmod(str(binary),0o755)) +``` + +### What the build does + +When you run `scons` in the top-level source tree, the build system processes `user//SConstruct`. For each name in `pyprogs`, the inline install loop does: + +1. Copies `M.py` directly to `$RSFROOT/bin/sf` (no compilation) via `env.InstallAs`. +2. Sets the file executable (`chmod 0o755`) via `env.AddPostAction`. + +There is **no shell wrapper** generated for Python programs. The Python file itself becomes the installed binary, called directly by the system Python interpreter via the shebang line (`#!/usr/bin/env python3`). This is different from C programs (which are compiled ELF binaries) but identical in usage from the user's perspective. + +### Using Python programs in a Flow + +Once installed, use `sf` exactly like any other Madagascar program: + +```python +Flow('out', 'in', 'sfmyscale factor=2.0') +Flow('out', ['in', 'vel'], 'sfmystack vel=${SOURCES[1]}') +``` + +For testing in `user//`, you can run the Python file directly without installing: + +```bash +python3 Mmyscale.py factor=2.0 < in.rsf > out.rsf +``` + +--- + +## Worked references + +- **`api/python/test/clip.py`** — minimal example. Uses a trace-by-trace loop with `fin.size(1)` to count traces; calls `par.float("clip")` (required, no default) and clips with `numpy.clip`. ~25 lines. + +- **`api/python/test/afdm.py`** — richer example: acoustic finite-difference modeling. Opens three inputs (`Fw`, `Fv`, `Fr`) and one output (`Fo`); uses `fin.axis()` / `fout.putaxis()` for full axis metadata; reads full arrays then loops in time; writes one time snapshot per iteration. ~70 lines. + +- **`user/sbader/Menergy.py`** — real user program. Triple-quoted docstring for self-doc; uses `rsf.api as rsf`; reads a 1D dataset and computes a rolling energy estimate. Shows that `rsf.Output()` can be opened after `rsf.Input()` is already open (the output inherits the input's format automatically). + +- **`user/sbader/Mreplace.py`** — another simple 1D user program. Good minimal reference for the `rsf.api` import pattern. + +- **`user/godwinj/Mpysvd.py`** — real user program that wraps SciPy SVD. Demonstrates the recommended try/except guard around the `rsf.api` import: if the Python API or NumPy/SciPy are absent the program prints a clear error rather than a raw `ImportError`. Use this pattern whenever your program has optional heavyweight dependencies. + +--- + +## Building and testing + +### Local run (no install needed) + +```bash +cd user// +python3 Mmyscale.py factor=2.0 < in.rsf > out.rsf +sfattr < out.rsf +``` + +### Install via scons + +```bash +cd /path/to/RSFSRC +scons user// +``` + +After a successful build, `$RSFROOT/bin/sfmyscale` is the Python file itself (executable). Test it: + +```bash +sfmyscale factor=2.0 < in.rsf > out.rsf +sfattr < out.rsf +``` + +### Self-doc check + +```bash +sfdoc sfmyscale +``` + +If the program description is empty, the leading docstring was not found. Verify it is a bare string literal on the second line (right after the shebang), not a comment. + +### Regression test + +Add a test flow in the SConstruct: + +```python +Flow('test_out', 'test_in', 'sfmyscale factor=2.0') +``` + +Then `scons test_out` runs the program as part of the build. For automated comparison, use `sfmath`: + +```python +Flow('diff', ['test_out', 'reference'], 'sfmath x=${SOURCES[0]} y=${SOURCES[1]} output="x-y" | sfattr want=max') +``` + +### Common pitfalls + +- `ImportError: No module named rsf.api` — source the Madagascar env: `source $RSFROOT/etc/env.sh`. +- Empty `sfdoc` description — move the docstring to line 2 right after the shebang; it must be a string literal, not a `#` comment. +- `TypeError: Unsupported type in put` — numpy scalars (int64, float64) may need explicit casting: `fout.put("n1", int(n1))`. +- Wrong output dimensions — verify `data.size == n1 * n2 * ...` before calling `fout.write(data)`. diff --git a/skills/authoring-sf-programs/SKILL.md b/skills/authoring-sf-programs/SKILL.md new file mode 100644 index 000000000..43df36e3b --- /dev/null +++ b/skills/authoring-sf-programs/SKILL.md @@ -0,0 +1,250 @@ +--- +name: authoring-sf-programs +description: Use when creating a new sf* main program — shared conventions that apply regardless of language (file naming, self-documentation, parameter style, build integration). +--- + +## When to use + +Load this skill whenever you are writing a new `sf` program, regardless of implementation language. It covers the language-agnostic layer: file naming, the self-documentation comment format, parameter conventions, error handling, testing, and build integration. + +This skill is a **companion** to the language-specific skill `authoring-sf-programs-`. The per-language skill covers syntax — how to call `sf_init`, `sf_getbool`, `sf_error`, etc. in that language — while this skill establishes the shared contract that every sf-program must satisfy. Always load both: + +1. `skills/authoring-sf-programs/SKILL.md` — you are here; shared conventions. +2. `skills/authoring-sf-programs-/SKILL.md` — language syntax and idioms. + +--- + +## File naming + +Every `sf` main program lives in a source file named `M.` inside a user directory `user//`. The `M` prefix is the signal to the build system that the file is a main program (not a library module). The installed binary is named `sf` (the `M` is dropped, `sf` is prepended). + +Extension to language mapping: + +| Extension | Language | +|-----------|----------| +| `.c` | C | +| `.cc` | C++ | +| `.cu` | CUDA C | +| `.f` | Fortran 77 (framework support exists; no existing `M*.f` programs in this tree — see `api/f77/Test*.*` for templates) | +| `.f90` | Fortran 90 | +| `.py` | Python | +| `.jl` | Julia (framework support exists; no existing `M*.jl` programs in this tree — see `api/julia/Test*.*` for templates) | +| `.java` | Java (framework support exists; no existing `M*.java` programs in this tree — see `api/java/Test*.*` for templates) | +| `.m` | Matlab / Octave — disambiguated at configure time by which API is enabled (framework support exists; no existing `M*.m` programs in this tree — see `api/matlab/Test*.*` or `api/octave/Test*.*` for templates) | +| `.chpl` | Chapel | + +Place your file at `user//M.`. If the directory does not exist, create it along with a `SConstruct` (see **Build integration** below). + +--- + +## Self-documentation + +Every `sf` program must begin with a structured comment block. The build system scrapes this block at compile/install time using `framework/rsf/doc.py` (specifically the `getprog()` function) and turns it into `sfdoc sf` output. + +### Comment format (C) + +For C, the scraper uses this regex (from `framework/rsf/doc.py`, `comment['c']` regex): + +``` +comment['c'] = re.compile(r'\/\*(?P(?:[^*]+|\*[^/])+)\*\/') +``` + +That is: the **first** `/* ... */` block in the file. The format inside that block is: + +- **Line 1**: one-sentence synopsis (becomes the `DESCRIPTION` section of `sfdoc`). +- **Remaining lines**: multi-line comments block (becomes `COMMENTS` in `sfdoc`). If the comments begin with `Takes: ...`, that suffix is appended to the synopsis line. + +Real example from `user/fomels/Mpick.c` (which produces `sfdoc sfpick`): + +```c +/* Automatic picking from semblance-like panels. + +Takes: rect1=1 rect2=1 ... + +rectN defines the size of the smoothing stencil in N-th dimension. + +Theory in Appendix B of: +S. Fomel, 2009, +Velocity analysis using AB semblance: Geophysical Prospecting, v. 57, 311-321. +Reproducible version in RSFSRC/book/tccs/avo +http://ahay.org/RSF/book/tccs/avo/paper_html/ + +August 2012 program of the month: +http://ahay.org/blog/2012/08/01/program-of-the-month-sfpick/ +*/ +``` + +The first line `Automatic picking from semblance-like panels.` becomes the description. Everything else becomes the COMMENTS body. + +For comparison, `system/main/transp.c` (the `sftransp` program): + +```c +/* Transpose two axes in a dataset. + +If you get a "Cannot allocate memory" error, give the program a +memsize=1 command-line parameter to force out-of-core operation. +*/ +``` + +The GPL license block that follows is a **separate** `/* ... */` comment and is not scraped. + +### Parameter documentation (C) + +Parameter documentation is harvested from `sf_get*` call sites. The scraper (`doc.py`, `param['c']` regex) matches: + +``` +if (!sf_get("name",&var)) var=default; +/* description */ +``` + +or, for required parameters (no default): + +``` +if (!sf_get("name",&var)) sf_error("..."); +``` + +The inline `/* ... */` comment immediately after the `sf_get` call becomes the parameter description in `sfdoc`. Example: + +```c +if (!sf_getfloat("vel0",&vel0)) vel0=o2; +/* surface velocity */ + +if (!sf_getbool("smooth",&smooth)) smooth=true; +/* if apply smoothing */ + +if (!sf_getint("niter",&niter)) niter=100; +/* number of iterations */ +``` + +For multi-valued (array) parameters, `sf_getints`, `sf_getfloats`, `sf_getbools`, `sf_getstrings` are matched by `params['c']` (note plural). + +### Comment format in other languages + +Each language has its own `comment[lang]`, `param[lang]`, and related regexes in `doc.py`: + +- **Python**: first string literal (triple-quote or single-quote) in the file; parameters from `par.bool/int/float/string(...)` calls with an inline `# description` comment. +- **C++**: a single `// comment` line at the very top of the file (before any `#include`); parameters from `par.get(...)` calls with a trailing `// desc` comment. +- **Fortran 90**: `! comment` blocks; parameters from `from_par(...)` with `! desc`. +- **Chapel**: `// comment` lines; parameters from `config const/var name: type = default; // desc`. + +The per-language skill has the precise syntax for each. + +--- + +## Parameter conventions + +These conventions are shared across all languages. + +**Booleans** — always `y`/`n` on the command line. In C the type is `bool` and `sf_getbool` is used; the scraper normalizes `true`/`false` to `y`/`n` in the documentation. Users type `smooth=y` or `smooth=n`. + +**Numeric defaults** — supply a reasonable default whenever the parameter is optional. When there is no sensible default, call `sf_error` (or its equivalent) if the parameter is absent. The sfdoc output shows the default in the synopsis. + +**File-valued parameters** — auxiliary input/output files are opened with `sf_input("tag")` / `sf_output("tag")` where `tag` is anything other than `"in"` or `"out"`. The tag becomes the parameter name on the command line (`tag=filename.rsf`). The scraper detects these automatically from `inpout['c']` matches and adds them to the synopsis. + +**Comma-separated lists** — use `sf_getints` / `sf_getfloats` / `sf_getbools` (plural) for array-valued parameters. The size argument is part of the scraper match and appears in the documentation. + +**Required vs. defaulted** — if omission is fatal, write `if (!sf_getint("n",&n)) sf_error("n= required");`. If a default is acceptable, write `if (!sf_getint("n",&n)) n=100;`. The doc system distinguishes these by whether a default value appears in the scraped call. + +Cross-reference: `skills/using-sf-programs/SKILL.md` explains how users supply parameters (`key=value` on the command line or via pipe from `sfdd` / a `.par` file). + +--- + +## Error handling + +**C**: `sf_error("format string: %d", value)` — prints the message to stderr and exits with a non-zero status. It is declared in `rsf.h` and defined in the core library. + +**Python**: use `sys.stderr.write('msg\n'); sys.exit(1)` or `raise SystemExit('msg')`. (The RSF Python API does not provide a dedicated error helper — `sf_error` is a C API.) + +**Fortran 90**: `call sf_error("msg")` (from `rsf_f90` module). + +**C++**: `throw std::runtime_error("msg")` or `sf_error("msg")` (the C function is accessible from C++ via the C++ API headers). + +**Chapel**: `sf_error("msg")` via the `RSF` module wrapper (see `api/chapel/m8r.chpl`). + +**Pipeline behavior**: Madagascar programs communicate via Unix pipes (`stdin` → `stdout` in RSF format). When a program exits non-zero mid-pipe, the write end of the pipe closes. Downstream programs reading from stdin receive EOF and should terminate cleanly. If you run pipelines from a shell with `set -o pipefail`, any non-zero exit in the pipeline causes the entire pipeline to fail. In SCons `Flow()` commands Madagascar checks the exit status of each program automatically. + +--- + +## Testing + +**Low-level API tests** — `api/c/Test*.c` files test the C library itself (e.g., `api/c/Testfile.c`). These are compiled and linked by the `api/c/SConstruct` into `.x` test executables; they do not use the `M.c` naming convention. + +**Program-level tests** — in each user directory's `SConstruct`, test targets are SCons `Flow()` or `Command()` nodes. Run `scons` in `user//` to build and execute Flow/Command targets (including any test flows). The test typically pipes synthetic data through the new program and checks the output. + +**Minimum test for a new program**: confirm that `sfdoc sf` prints the correct description and parameter list. This validates that the self-doc comment was scraped correctly and that the program was installed. + +**Regression test** (recommended): add a `Flow()` in your `SConstruct` that synthesizes known input with `sfspike` or `sfmath`, runs your program, and compares to a reference result with `sfattr` or a diff. This becomes part of the CI run. + +CI runs the full test suite via `scons` from the build root; individual user directories can be tested with `scons` locally in `user//`. + +--- + +## Build integration + +**Standard case — no action required.** For C programs, the old-style `user//SConstruct` lists program base names in a `progs` string and loops: + +```python +mains = Split(progs) +for prog in mains: + sources = ['M' + prog] + bldutil.depends(env, sources, 'M' + prog) + env.StaticObject('M' + prog + '.c') + prog = env.Program(prog, [x + '.o' for x in sources], LIBS=libs) +``` + +The new-style uses `bldutil.UserSconsTargets()`: + +```python +targets = bldutil.UserSconsTargets() +targets.c = 'myprogram anotherprogram' +targets.py = 'myscript' +targets.build_all(env, glob_build, srcroot, bindir, libdir, pkgdir) +``` + +In both styles, you add the base name of your program (without `M` and without the extension) to the relevant list. The `SConstruct` handles compiling, linking against `librsf`, and installing to `bindir`. + +For C++ (`.cc`) or CUDA (`.cu`) programs, use `HuiSconsTargets` instead — it exposes `.cc` and `.cu` attributes. `UserSconsTargets` covers only C (`.c`), Fortran-90 (`.f90`), Python (`.py`), and Julia (`.jl`). + +**Self-doc** is generated by `env.Doc(prog, 'M' + prog)` (or `env.Doc(prog, 'M'+prog+'.py', lang='python')` for Python) and merged into the user's doc file. This depends on `framework/rsf/doc.py` and runs automatically as part of the build. + +**Extra libraries** — if your program needs FFTW, for example, the `user/fomels/SConstruct` checks: + +```python +fftw = env.get('FFTW') +if fftw: + env.Prepend(CPPDEFINES=['SF_HAS_FFTW']) +``` + +and links `fftw` into the LIBS list for programs that need it. Follow the same pattern for other optional dependencies: guard with `env.get('LIBNAME')` and use `env.RSF_Place('sfprogname', None, var='LIBNAME', package='package-name')` to produce a stub binary when the library is absent. + +--- + +## Choosing a language + +Use this decision tree as a starting point: + +- **C** — default choice for new contributions. The entire core library (`librsf`) is C, most existing programs are C, and contributions blend in naturally. Best for performance-critical code and programs that extend or wrap existing C library routines. +- **Python** — prototype quickly, leverage NumPy/SciPy for array math, or write research scripts that do not need the performance of a compiled binary. Python programs are installed as executable scripts (renamed from `M.py` to `sf`). +- **C++** — use when you need templates, object-oriented design, or integration with C++ libraries such as the Madagascar C++ API (`librsf++`) or Eigen/LAPACK wrappers. +- **Fortran 90** — preferred over F77 for new Fortran work. Use when the algorithm is naturally expressed in Fortran or when you are porting existing scientific Fortran code. +- **Fortran 77** — legacy only; avoid for new programs. Rarely used; no existing `M*.f` main programs in this tree — the per-language skill will point you at the `api/f77/Test*.*` files as templates. +- **Julia** — when your ecosystem is Julia or your collaborators prefer it; Julia programs are installed as scripts. Rarely used; no existing `M*.jl` main programs in this tree — the per-language skill will point you at the `api/julia/Test*.*` files as templates. +- **Matlab / Octave** — when the algorithm is already in `.m` form and conversion is not worth the effort; the build system uses configure flags to distinguish Matlab vs. Octave. Rarely used; no existing `M*.m` main programs in this tree — the per-language skill will point you at the `api/matlab/Test*.*` or `api/octave/Test*.*` files as templates. +- **Java** — rarely used; no existing `M*.java` main programs in this tree — the per-language skill will point you at the `api/java/Test*.*` files as templates. +- **Chapel** — parallel computing experiments; follow `user/*/M*.chpl` patterns. + +--- + +## Next step + +After loading this skill, load the language-specific skill for your target language: + +``` +skills/authoring-sf-programs-c/SKILL.md # C +skills/authoring-sf-programs-python/SKILL.md # Python +skills/authoring-sf-programs-cc/SKILL.md # C++ +skills/authoring-sf-programs-f90/SKILL.md # Fortran 90 +skills/authoring-sf-programs-julia/SKILL.md # Julia +``` + +The language skill shows the exact boilerplate, `sf_init` / `sf_input` / `sf_output` call patterns, parameter retrieval idioms, and a worked example for that language. diff --git a/skills/plotting-with-vplot/SKILL.md b/skills/plotting-with-vplot/SKILL.md new file mode 100644 index 000000000..e054bc4e9 --- /dev/null +++ b/skills/plotting-with-vplot/SKILL.md @@ -0,0 +1,644 @@ +--- +name: plotting-with-vplot +description: Use when producing or debugging a vplot visualization in Madagascar — grey, graph, wiggle, contour, dots, overlays, and label escape codes. +--- + +## When to use + +Use this skill any time a task involves: + +- Producing a `.vpl` file — Madagascar's portable vector/raster plot format. +- Calling `sfgrey`, `sfgraph`, `sfwiggle`, `sfcontour`, `sfdots`, or `sfbargraph` from an SConstruct or at the command line. +- Debugging a blank, mis-scaled, or mis-labeled plot. +- Composing multiple plots into an overlay, side-by-side panel, or animation. +- Writing axis labels that require subscripts, superscripts, or Greek letters. +- Controlling color schemes, clip levels, or scalebars. + +This skill does not cover converting `.vpl` to PDF/PNG (use `sfpen` options or `vppen`), or 3D surface plots (`sfgrey3`, `sfsurf`). + +--- + +## Mental model + +### What vplot is + +Vplot is Madagascar's internal plot language — a compact binary stream describing lines, rasters, text, and color tables. Every visualization program (sfgrey, sfgraph, etc.) reads an RSF file and writes a `.vpl` stream. The `.vpl` file can be: + +- Displayed on screen with `sfpen` (wraps the `xtpen` or `pspen` driver). +- Composed with other `.vpl` files using `vppen`. +- Converted to PostScript with `pspen`. + +### Programs vs. the pipeline + +Every vplot program is a standard Madagascar filter: + +``` +< data.rsf sfgrey color=j clip=1 title="My data" > out.vpl +``` + +At the command line, output goes to stdout. In SConstruct, SCons manages the filenames. + +### Plot vs. Result in SConstruct + +`Plot` and `Result` are the two SConstruct functions that produce `.vpl` files. + +| Function | Output location | Purpose | +|----------|----------------|---------| +| `Plot('name', ...)` | `name.vpl` (working dir) | Intermediate — used as input to composition | +| `Result('name', ...)` | `Fig/name.vpl` | Final — what `scons lock` saves; what the paper uses | + +Rules of thumb: +- Use `Plot` for any image you intend to overlay or combine. +- Use `Result` for any image that is a deliverable on its own. +- A `Result` with a composition string is both — it reads `Plot`-produced `.vpl` files and writes a final composed `.vpl` to `Fig/`. + +### How parameters reach the vplot program + +In SConstruct, the second argument to `Plot` or `Result` is either: +1. A flow string — the vplot program call with its parameters: + ```python + Result('img', 'grey title="My plot" color=j clip=2') + ``` +2. A source list plus a composition keyword: + ```python + Result('combined', 'img1 img2', 'Overlay') + ``` + +Parameters follow the same `key=value` syntax as any `sf` program. String values with spaces must be quoted. + +### Universal parameter inheritance + +All vplot programs call `sfstdplot` internally (see `sfdoc stdplot`). This means axis labels, title, min/max, grid, screen size, and layout parameters are available to every vplot program, not just the one documented in its own `sfdoc` entry. + +--- + +## Core program catalog + +| Program | Description | Signature | +|---------|-------------|-----------| +| `sfgrey` | 2D raster/density plot. Maps float values to a color table. Default: `transp=y yreverse=y`. | `< 2D.rsf sfgrey [params] > out.vpl` | +| `sfgraph` | 1D line plot. Plots n2 traces as separate curves along axis 1. | `< 1D.rsf sfgraph [params] > out.vpl` | +| `sfwiggle` | Seismic wiggle-trace display. Draws each trace as a waveform with optional filled polygons. | `< 2D.rsf sfwiggle [params] > out.vpl` | +| `sfcontour` | Contour plot. Draws iso-value lines on a 2D field. Default: `transp=y allpos=y`. | `< 2D.rsf sfcontour [params] > out.vpl` | +| `sfdots` | Lollipop/impulse plot. Each sample becomes a vertical stick with a dot. Good for sparse signals. | `< 1D_or_2D.rsf sfdots [params] > out.vpl` | +| `sfbargraph` | Bar chart. Columns of data become side-by-side or stacked bars. | `< 1D_or_2D.rsf sfbargraph [params] > out.vpl` | + +--- + +## Universal plot parameters + +These parameters come from `sfstdplot` (run `sfdoc stdplot`) and are accepted by every vplot program. + +| Parameter | Default | Effect | +|-----------|---------|--------| +| `title=` | (none) | Title string. Supports escape codes. | +| `label1=` | (from RSF header) | Label for axis 1 (vertical when `transp=y`). | +| `label2=` | (from RSF header) | Label for axis 2 (horizontal when `transp=y`). | +| `unit1=` | (from RSF header) | Unit appended to `label1` in parentheses. | +| `unit2=` | (from RSF header) | Unit appended to `label2` in parentheses. | +| `min1=` / `max1=` | (from data) | Display range on axis 1. | +| `min2=` / `max2=` | (from data) | Display range on axis 2. | +| `wantaxis=` | `y` | `n` suppresses both axes. `wantaxis1=`/`wantaxis2=` are per-axis. | +| `wanttitle=` | `y` | `n` suppresses the title. | +| `wheretitle=` | `top` | Title position: `top`, `bottom`, `left`, `right`. | +| `wherexlabel=` | `bottom` | Horizontal axis label position: `top` or `bottom`. | +| `screenratio=` | device default | Figure height/width ratio. `0.75`=landscape, `1.5`=portrait. | +| `plotfat=` | `0` | Line/curve thickness. `3`=bold, `6`=heavy. | +| `plotcol=` | `6` (yellow) | Curve color: 0=black, 1=blue, 2=red, 3=magenta, 4=green, 5=cyan, 6=yellow, 7=white. | +| `pclip=` | 99 (grey), 98 (wiggle), 100 (others) | Percentile clip. | +| `grid1=` / `grid2=` | `n` | Draw grid lines on axis 1 or 2. | +| `dash=` | `0` | Line dash: 0=solid, 1=fine dash, 2=dot, 3=dash, 4=large dash. | +| `xll=/yll=/xur=/yur=` | (auto) | Manual panel placement in inches. | +| `crowd=` | `0.75` | Fraction of frame used by plot area. `crowd1=`/`crowd2=` per axis. | +| `labelsz=` | `8` | Axis label font size. `titlesz=10` for title. | + +--- + +## Program-specific parameters + +### sfgrey — 2D raster/density plot + +`sfdoc sfgrey` | source: `plot/main/grey.c` + +Defaults: `transp=y yreverse=y xreverse=n`. With these defaults, axis 1 runs vertically (time, depth), axis 2 runs horizontally (trace, offset). This is the standard seismic display convention. + +| Parameter | Default | Notes | +|-----------|---------|-------| +| `color=` | `i` (grayscale) | Color scheme letter code. See **Color schemes** section below. | +| `clip=` | (estimated) | Hard clip value. Data outside `[-clip, clip]` saturates. If not set, estimated from `pclip`. | +| `pclip=` | `99` | Percentile used to estimate clip when `clip=` is not set. Set lower (e.g. `pclip=95`) to increase contrast. | +| `bias=` | `0` | Value mapped to the center of the color table. Shift to display asymmetric data better. | +| `allpos=` | `n` | If `y`, assume all data is positive and use the full color range for positive values only. Useful for amplitude maps. | +| `polarity=` | `n` | If `y`, reverse polarity: black represents high values, white represents low. | +| `scalebar=` | `n` | If `y`, draw a colorbar beside the plot. Use `minval=`, `maxval=` to set bar range; `barlabel=` and `barunit=` to label it. | +| `gainpanel=` | (none) | Gain normalization panel: `a` (all panels together), `e` (each panel independently), or an integer panel number. | +| `gpow=` | `1` | Raise data to this power before display. Values < 1 compress dynamic range; > 1 expand it. | +| `transp=` | `y` | Transpose axes. Set `transp=n` to put axis 1 horizontal. | +| `yreverse=` | `y` | Reverse vertical axis (time increases downward by default). | +| `xreverse=` | `n` | Reverse horizontal axis. | +| `mean=` | `n` | If `y`, set the bias to the data mean automatically. | +| `symcp=` | `n` | If `y`, use a symmetric 255-color palette (ensures that the center color maps to zero exactly). | +| `wantframenum=` | `y` if n3>1 | Show the current frame number in the corner for 3D data. | + +**clip vs. pclip:** `pclip=99` (default) estimates clip from the 99th percentile. Set `clip=` explicitly for reproducible cross-panel display. `gainpanel=a` normalizes across all 3D frames; `gainpanel=e` per frame. For strictly positive data (envelopes, migration images) use `allpos=y color=j`. + +--- + +### sfgraph — 1D line plot + +`sfdoc sfgraph` | source: `plot/main/graph.c` + +Default: `transp=n`. Plots each column (n2 traces) as a curve, with axis 1 values on the vertical axis and axis 2 index on the horizontal. + +| Parameter | Default | Notes | +|-----------|---------|-------| +| `transp=` | `n` | If `y`, transpose so axis 1 is horizontal. Use `transp=y yreverse=y` to plot a velocity curve in the same orientation as a `grey` plot. | +| `yreverse=` | `n` | Reverse vertical axis. | +| `min2=` | (from data) | Minimum on the curve-value axis. Critical for overlay alignment — must match the background `grey` plot's axis range. | +| `max2=` | (from data) | Maximum on the curve-value axis. | +| `plotcol=` | `6` | Curve color. Use `plotcol=2` for red, `plotcol=1` for blue. | +| `plotfat=` | `0` | Curve fatness. `plotfat=3` gives a bold curve suitable for overlays. | +| `symbol=` | (none) | If set, plot with symbols (markers) instead of lines. Letter determines marker shape. | +| `symbolsz=` | `2` | Symbol size. | +| `wantaxis=` | `y` | When used as an overlay, set `wantaxis=n wanttitle=n` to suppress the graph's own axes (using the background image's axes instead). | +| `pad=` | `y` | If `n`, suppress extra padding around the plot area. Required for exact alignment with an underlying `grey` image. | +| `dash=` | `0` | Line dash pattern. `dash=1` gives fine dashes; `dash=3` gives coarse dashes. | +| `pclip=` | `100` | Clip percentile (usually leave at default for graphs). | +| `color=` | `j` | Color scheme (default is jet for graphs). | + +**Overlay alignment:** Both plots must share axis ranges. With `sfgrey transp=y` (default), axis 1 is vertical and `min1/max1` control vertical range. With `sfgraph transp=y yreverse=y`, the curve values map to the vertical axis via `min2/max2`. Always use `wantaxis=n wanttitle=n pad=n` on the overlay graph. + +--- + +### sfwiggle — seismic trace display + +`sfdoc sfwiggle` | source: `plot/main/wiggle.c` + +Draws each trace as an oscillating waveform. Positive excursions can be filled (polygon mode). + +| Parameter | Default | Notes | +|-----------|---------|-------| +| `clip=` | `0` (estimated from pclip) | Hard amplitude clip. If `0`, estimated from `pclip`. | +| `pclip=` | `98` | Percentile used when `clip=0`. | +| `poly=` | `n` | If `y`, fill positive excursions with solid polygons (standard seismic display). | +| `polyneg=` | `n` | If `y`, also fill negative excursions (fills both sides). | +| `zplot=` | `0.75` | Scales the half-width allocated per trace relative to the d2 interval (`zplot *= d2` internally). Default `0.75` leaves small gaps between traces; values `> 1` cause traces to overlap. | +| `transp=` | `n` | Transpose axes. Default `n` means axis 1 is vertical (time), axis 2 is horizontal (trace). | +| `yreverse=` | `n` | Reverse vertical axis. Set `y` to make time increase downward. | +| `xreverse=` | `n` | Reverse horizontal axis. | +| `fatp=` | `1` | Polygon border fatness (line thickness of the filled wiggle boundary). | +| `seemean=` | `n` | If `y`, draw a horizontal mean line through each trace. | +| `xpos=` | (none) | Optional auxiliary RSF file with explicit trace positions (allows irregular spacing). | +| `xmask=` | `1` | Polygon fill mask for x direction. | +| `ymask=` | `1` | Polygon fill mask for y direction. | + +**Note on clip:** `sfwiggle` requires an explicit positive clip to work properly. Either set `clip=` directly, or rely on `pclip=98` (the default) which estimates the clip from the data percentile. Setting `clip=0` with no `pclip` produces no display. + +--- + +### sfcontour — contour plot + +`sfdoc sfcontour` | source: `plot/main/contour.c` + +Draws iso-value lines on a 2D scalar field. Defaults: `transp=y allpos=y`. + +| Parameter | Default | Notes | +|-----------|---------|-------| +| `nc=` | `50` | Number of contour levels. | +| `dc=` | (auto) | Contour increment (spacing between levels). Overrides `nc`. | +| `c0=` | (auto) | Value of the first contour level. | +| `c=` | (none) | Explicit list of contour values `[nc]`. | +| `cfile=` | (none) | RSF file containing contour values. | +| `allpos=` | `y` | If `y`, only contour positive values. Set `n` to contour negative values too. | +| `transp=` | `y` | Transpose axes (same convention as sfgrey). | +| `min1=` | `o1` | Minimum on axis 1. | +| `max1=` | `o1+(n1-1)*d1` | Maximum on axis 1. | +| `min2=` | `o2` | Minimum on axis 2. | +| `max2=` | `o2+(n2-1)*d2` | Maximum on axis 2. | +| `scalebar=` | `n` | If `y`, draw a scalebar. | +| `barlabel=` | (none) | Label for the scalebar. | + +**Gotcha:** Unlike other vplot programs, `sfcontour`'s scalebar also requires `barlabel=...` to activate. `scalebar=y` without a non-empty `barlabel=` silently produces no colorbar. (Source: `plot/main/contour.c` lines 97-99, which checks both `scalebar` and `barlabel` before enabling the bar.) + +**Contour overlay on grey:** A common pattern is to overlay contours on a raster image: +```python +Plot('raster', 'grey color=j') +Plot('lines', 'contour nc=20 wantaxis=n wanttitle=n') +Result('both', 'raster lines', 'Overlay') +``` + +--- + +### sfdots — lollipop/impulse display + +`sfdoc sfdots` | source: `plot/main/dots.c` + +Plots each sample as a vertical stick with a ball. Good for sparse signals and filter coefficients. Key parameters: `dots=` (1=balloon, 0=stick only), `strings=y` (draw sticks), `connect=` (1=diagonal, 2=bar), `gaineach=y` (per-trace normalization), `clip=` (-1=no clip), `overlap=0.9` (trace width fraction), `transp=`, `yreverse=`, `seemean=`. Note: the axis is drawn only when `label1=` is present. + +--- + +### sfbargraph — bar chart + +`sfdoc sfbargraph` | source: `plot/main/bargraph.c` + +Plots each column as a group of bars. Key parameters: `width=0.8` (bar width fraction), `stack=y` (stack vs. side-by-side), `transp=n` (horizontal bars if `y`). Inherits all `sfstdplot` parameters for labels, title, and axes. + +--- + +## Color schemes (`color=`) + +The `color=` parameter in `sfgrey` (default `i`) and `sfgraph` (default `j`) selects a color table. These codes are defined in `plot/lib/coltab.c`. An uppercase letter reverses the table; appending `C` clips out-of-range values in a highlight color. + +| Code | Name | Description | +|------|------|-------------| +| `i` | Grayscale | Default for `sfgrey`. Black to white. Seismic convention: white=positive, black=negative. | +| `I` | Reverse grayscale | White to black. | +| `j` | Jet | Blue → cyan → green → yellow → red. Good for amplitude/velocity maps. | +| `J` | Reverse jet | Red → yellow → green → cyan → blue. | +| `g` | Red-white-black | Red (negative) → white (zero) → black (positive). Diverging, asymmetric. | +| `G` | Reverse red-white-black | Black (negative) → white (zero) → red (positive). | +| `e` | Red-white-blue | Red (negative) → white (zero) → blue (positive). Classic diverging colormap. | +| `E` | Reverse red-white-blue | Blue (negative) → white (zero) → red (positive). | +| `h` | Hot | Black → red → orange → yellow → white. Emphasizes amplitude. | +| `H` | Reverse hot | White → yellow → red → black. | +| `p` | Pink | Softer version of hot (squished root of hot+grey). | +| `b` | Bone | Blue-tinted grayscale; similar to greyscale but cooler. | +| `a` | Rainbow (HSV) | Full hue cycle: red → yellow → green → cyan → blue → magenta → red. | +| `A` | Reverse rainbow | Reverse hue cycle. | +| `x` | Cubehelix | Perceptually linear helix through colour space (D.A. Green 2011). Monotone in luminance. | +| `X` | Reverse cubehelix | Reversed cubehelix. | +| `t` | Traffic | Green → yellow → red cycle (traffic light pattern). | +| `T` | Reverse traffic | Red → yellow → green. | +| `w` | Wheel | Full RGB colour wheel. | +| `W` | Reverse wheel | Reverse colour wheel. | +| `c` | Cool | Cyan → magenta (cyan at low, magenta at high). | +| `C` | Reverse cool | Magenta → cyan. | +| `l` | Linear (copper) | Black → dark orange/copper → bright copper. | +| `f` | Flag | Repeating red-white-blue-black pattern. Useful for phase. | + +**Custom color tables:** You can also pass a filename (path to a CSV file with R,G,B triples, one per line, values in [0,1]) as the `color=` value. Madagascar looks in `$RSFROOT/include/.csv` if the name is not a single letter. + +**Clipping highlight suffix `C`:** Appending `C` to any color code (e.g. `color=jC`) marks values that fall outside the clip range with a red highlight color (configurable via `cliprgb=`). Useful for diagnosing saturation. + +--- + +## Label escape codes + +Vplot interprets `\` as an escape character in text strings (titles, axis labels). This lets you add subscripts, superscripts, font switches, and size changes directly in label strings. + +### Complete escape reference + +All escapes that do not take an argument: + +| Escape | Effect | +|--------|--------| +| `\_` | Lower (subscript) — move down half a capital-letter height | +| `\^` | Raise (superscript) — move up half a capital-letter height | +| `\>` | Advance one inter-letter space | +| `\<` | Back up one inter-letter space | +| `\n` | Newline | +| `\\` | Print a literal backslash | +| `\g` | Ghostify (invisible text, for spacing) | +| `\G` | Resume visible text | + +Escapes that take an integer argument followed by a required space: + +| Escape | Argument | Effect | +|--------|----------|--------| +| `\s ` | integer percent | Change text size to N% of current size. `\s100 ` restores default. `\s75 ` is smaller. | +| `\F ` | font number | Switch to font N. Font 9 = Greek Simplex (β, α, etc.). `-1` restores default. | +| `\v ` | glyph number | Print glyph N from the current font (bypasses special character meaning). | +| `\c ` | color index | Change text color. `-1` restores drawing color. | +| `\f ` | fatness delta | Add N to the current line fatness. | +| `\r ` | percent height | Move up N percent of a character height (negative moves down). | +| `\k ` | percent width | Move right N percent of a space width (negative moves left). | +| `\m ` | register | Save current position in register N. | +| `\M ` | register | Restore position from register N. | + +Key font numbers: 0=Pen, 1=Roman Simplex, 3=Roman Complex, 5=Italic Complex, 9=Greek Simplex (α β γ), 10=Greek Complex, 15=Mathematics. `-1` restores the default font. + +### The Python backslash problem + +Python interprets `\s`, `\F`, `\_`, `\^` as string escapes. This means you need to double the backslash or use raw strings: + +```python +# Wrong — Python eats the backslash before vplot sees it: +Result('bad', 'grey title="v\_NMO"') + +# Correct option 1 — double the backslash: +Result('ok1', 'grey title="v\\_NMO"') + +# Correct option 2 — raw string (preferred for readability): +Result('ok2', r'grey title="v\_NMO"') + +# In a triple-quoted string, doubling still works: +Result('ok3', + ''' + grey title="v\\_\s75 NMO\s100 " + ''') +``` + +The NMO tutorial in the book (`book/rsf/tutorials/nmo/SConstruct`) uses the doubling convention: +```python +title="v\\_\\s75 NMO\\s100 \\^ Profile" +``` + +**Common recipes:** `r"v\_NMO"` → v subscript NMO; `r"x\^2\_ "` → x superscript 2 (back to baseline); `r"\F9 b\F-1 "` → Greek beta (font 9, restore with -1); `r"v\\_\\s75 NMO\\s100 "` → NMO at 75% size (doubling convention from the book). + +--- + +## Composition + +Madagascar's `rsf.proj` defines eight composition modes for combining `.vpl` files. These are invoked as the flow string in `Result` or `Plot` when the source is a list of names. + +### All eight modes at a glance + +| Mode | vppen arguments | Effect | +|------|----------------|--------| +| `Overlay` | `erase=o vpstyle=n` | Stack all panels in registration (Z-order); last source on top. | +| `SideBySideAniso` | `yscale=N vpstyle=n gridnum=N,1` | N panels left-to-right, each keeping its own aspect ratio. | +| `OverUnderAniso` | `xscale=N vpstyle=n gridnum=1,N` | N panels top-to-bottom, each keeping its own aspect ratio. | +| `SideBySideIso` | `size=r vpstyle=n gridnum=N,1` | N panels left-to-right with matched (isotropic) aspect ratios. | +| `OverUnderIso` | `size=r vpstyle=n gridnum=1,N` | N panels top-to-bottom with matched aspect ratios. | +| `TwoRows` | `size=r vpstyle=n gridnum=ceil(N/2),2` | N panels arranged in two rows, isotropic scaling. | +| `TwoColumns` | `size=r vpstyle=n gridnum=2,ceil(N/2)` | N panels arranged in two columns, isotropic scaling. | +| `Movie` | `vpstyle=n` | Flipbook animation; `sfpen` cycles through frames on display. | + +(Source: `framework/rsf/proj.py` lines 255-264.) + +### Overlay + +Stacks multiple plots in registration (Z-order, last source on top). + +```python +Plot('background', 'grey ...') +Plot('foreground', 'graph wantaxis=n wanttitle=n pad=n ...') +Result('combined', 'background foreground', 'Overlay') +``` + +Internally: `vppen erase=o vpstyle=n source1.vpl source2.vpl > out.vpl` + +The first source defines the coordinate frame. All subsequent sources must use compatible axis ranges to align correctly. + +### SideBySideAniso + +Places N plots side by side, each keeping its own aspect ratio. Good for heterogeneous panels (e.g., a narrow velocity panel next to a wide gather). + +```python +Result('comparison', 'img1 img2 img3', 'SideBySideAniso') +# Extra vppen flags via keyword argument: +Result('out', 'img1 img2', 'SideBySideAniso', vppen='txscale=1.5') +``` + +### OverUnderAniso + +Places N plots stacked vertically, each keeping its own aspect ratio. Use when the panels have different data types or shapes that should not be forced to a common scale. + +```python +Result('stacked', 'top bottom', 'OverUnderAniso') +``` + +### SideBySideIso + +Places N plots side by side with matched aspect ratios — good when panels show the same data type. + +```python +Result('panels', 'imgA imgB', 'SideBySideIso') +``` + +### OverUnderIso + +Places N plots stacked vertically with matched aspect ratios. Use when you want a column of comparable panels (e.g., the same section at different stages of processing). + +```python +Result('column', 'before after', 'OverUnderIso') +``` + +### TwoRows + +Arranges N panels into two rows with isotropic scaling (`gridnum=ceil(N/2), 2`). Convenient for even numbers of panels that should read left-to-right across two rows. + +```python +Result('grid', 'p1 p2 p3 p4', 'TwoRows') +``` + +### TwoColumns + +Arranges N panels into two columns with isotropic scaling (`gridnum=2, ceil(N/2)`). Stacks pairs of panels in two vertical columns. + +```python +Result('twocol', 'p1 p2 p3 p4', 'TwoColumns') +``` + +### Movie + +Assembles multiple frames into a flipbook animation. `sfpen` cycles through them on display. + +```python +for iframe in range(nframes): + Plot('frame%d' % iframe, 'slice%d' % iframe, 'grey ...') +Result('movie', ['frame%d' % i for i in range(nframes)], 'Movie') +``` + +Internally: `vppen vpstyle=n frame0.vpl frame1.vpl ...` + +--- + +## Frame and layout + +`screenratio=` (height/width) controls figure aspect: `0.5`=wide landscape, `0.75`=typical landscape, `1.0`=square, `1.5`=portrait. + +For manual panel placement, use `xll=/yll=/xur=/yur=` in inches from the paper's bottom-left corner: + +```python +Plot('left', 'grey ... xll=1.0 yll=1.5 xur=4.0 yur=6.5') +Plot('right', 'grey ... xll=4.5 yll=1.5 xur=7.5 yur=6.5') +Result('both', 'left right', 'Overlay') +``` + +`crowd=` (default `0.75`) controls what fraction of the frame the plot area occupies. `crowd1=`/`crowd2=` set axes independently. + +--- + +## Debugging plots + +### Empty or all-black plot + +**Symptom:** `sfgrey` produces a `.vpl` that shows as solid black or solid white. + +**Causes and fixes:** +1. Data is all zeros — check with `sfattr < data.rsf`. If `max = 0`, the flow that produced the data failed silently. +2. `clip=` was set to a value much larger than the data range — the entire range falls in one color. Check `sfattr` for actual min/max and set `clip=` accordingly, or remove it and let `pclip` estimate. +3. `pclip=` is very low (e.g. `pclip=1`) — almost all data is clipped. Increase to 99. + +### Wrong aspect ratio + +**Symptom:** Plot looks squashed or stretched. + +**Fix:** Set `screenratio=` explicitly. Remember that `sfgrey` with `transp=y` (default) swaps n1 and n2 roles — if your data is 1000×50 samples, the display is 50 wide and 1000 tall without aspect correction. + +### Missing or mangled labels + +**Symptom:** Label appears with a literal backslash and letter instead of a formatted escape. + +**Fix:** Python ate the backslash. Use raw strings (`r"..."`) or double the backslash (`"\\_"`). + +**Symptom:** Label is absent entirely. + +**Fix:** `wantaxis=n` is set, or the header fields `label1`/`label2` were not set in the RSF file and no explicit `label1=`/`label2=` was passed. + +### Misaligned overlay + +**Symptom:** Overlay curve appears at the wrong position on the background image. + +**Cause:** Axis range mismatch between the two plots. + +**Diagnosis:** +- `sfgrey` with `transp=y` (default): axis 1 is on the vertical axis, axis 2 on the horizontal. `min1/max1` control the vertical range; `min2/max2` control the horizontal. +- `sfgraph` with `transp=y`: axis 1 values are the independent axis (horizontal after transposing... actually vertical after `transp=y yreverse=y`), and the data values are on axis 2 (`min2/max2`). + +**Fix:** Both plots must share identical `min1`/`max1` and `min2`/`max2`. Also use `pad=n` on the overlay `sfgraph` to suppress the automatic padding that would shift the curve. + +### Scalebar out of range + +Set `minval=` and `maxval=` explicitly (from `sfattr`) to control scalebar labels independently of clip range. + +### Wiggle traces invisible + +`clip=0` with no `pclip` estimates zero clip — nothing shows. Use `pclip=98` (default) or an explicit `clip=` from `sfattr`. + +### Overlay Z-order + +In `vppen erase=o`, sources are drawn left to right — the last source is on top. Put foreground elements last in the source list: +```python +Result('out', 'raster_below curve_on_top', 'Overlay') +``` + +--- + +## Examples + +### Example 1 — basic grey plot + +File: `skills/plotting-with-vplot/references/example-grey.SConstruct` + +```python +from rsf.proj import * + +Flow('synth', None, + ''' + spike n1=200 n2=100 k1=50,100,150 nsp=3 mag=1,0.8,0.6 | + ricker1 frequency=40 | + noise seed=2025 var=0.005 + ''') + +Result('synth', + ''' + grey title="Synthetic gather" + label1="Time" unit1="s" label2="Trace" unit2="" + color=j scalebar=y + ''') + +End() +``` + +**Stage by stage:** + +1. `sfspike` creates a 200-sample × 100-trace array with three spikes at samples 50, 100, 150. `nsp=3` is required — `sfspike` does not infer it from the length of the `k1` list. + +2. `sfricker1 frequency=40` convolves each trace with a 40 Hz Ricker wavelet, turning the spikes into wavelets. + +3. `sfnoise` adds Gaussian noise (variance 0.005, reproducible with seed 2025) to simulate real data. + +4. `sfgrey` displays the result. Key choices: + - `color=j` — jet colormap shows the gather with blue–green–yellow–red mapping. Good for seeing both amplitude and polarity. + - `scalebar=y` — draws a colorbar on the right showing the amplitude-to-color mapping. + - `label1/unit1/label2/unit2` — explicit labels override whatever is in the RSF header. + - `transp=y yreverse=y` (defaults) — time (axis 1) runs vertically top-to-bottom, traces (axis 2) run left-to-right. + +**Output:** `Fig/synth.vpl` — a seismic-style display of a synthetic gather. + +--- + +### Example 2 — overlay with a velocity curve + +File: `skills/plotting-with-vplot/references/example-overlay.SConstruct` + +```python +from rsf.proj import * + +Flow('cmp', None, + ''' + spike n1=400 n2=60 k1=80,180,300 nsp=3 mag=1,0.7,0.5 | + ricker1 frequency=30 + ''') +Plot('cmp', 'grey title="CMP" label1=Time unit1=s label2=Offset unit2=m') + +Flow('vcurve', None, + ''' + math n1=60 output="1500+x1*10" d1=1 o1=0 + ''') +Plot('vcurve', + ''' + graph transp=y yreverse=y min2=0 max2=2000 + wantaxis=n wanttitle=n plotcol=2 plotfat=3 pad=n + ''') + +Result('cmp-with-vcurve', 'cmp vcurve', 'Overlay') +``` + +**Stage by stage:** + +1. `cmp` is a 400-sample × 60-trace synthetic gather. `Plot` (not `Result`) is used because it is an intermediate that feeds the overlay. + +2. `vcurve` is a 60-sample 1D array where sample i has value `1500 + i*10`, representing a velocity that increases linearly from 1500 to 2090 m/s across 60 offset positions. `sfmath` is used because there is no input file — `n1=60 d1=1 o1=0` defines the grid, and `output=` gives the analytic formula. + +3. The `sfgraph` call for `vcurve` uses: + - `transp=y yreverse=y` — makes the curve plot in the same orientation as the grey background (vertical time axis). + - `min2=0 max2=2000` — the velocity curve values (1500–2090) are mapped onto the horizontal axis range 0–2000. This must match what `sfgrey` uses for its horizontal axis. Here `sfgrey` uses `label2=Offset unit2=m` with the default axis range from the data (0 to 59 for the 60 traces). There is a conceptual mismatch here — in a real workflow you would align `min2/max2` precisely with the offset range. For demonstration purposes, the curve is plotted within the displayed frame. + - `wantaxis=n wanttitle=n` — suppress the graph's own axes; the background `cmp` image provides the axis frame. + - `plotcol=2` — red curve. + - `plotfat=3` — thick curve for visibility. + - `pad=n` — no extra padding; ensures the curve aligns with the raster plot coordinate frame. + +4. `Result('cmp-with-vcurve', 'cmp vcurve', 'Overlay')` — `vppen` draws `cmp.vpl` first (background), then `vcurve.vpl` on top. + +**Output:** `Fig/cmp-with-vcurve.vpl` — the gather with a red velocity curve overlaid. + +--- + +### Example 3 — wiggle trace display + +File: `skills/plotting-with-vplot/references/example-wiggle.SConstruct` + +```python +from rsf.proj import * + +Flow('traces', None, + ''' + spike n1=500 n2=15 k1=100,300 nsp=2 mag=1,0.6 | + ricker1 frequency=25 + ''') + +Result('traces', + ''' + wiggle clip=0.8 title="Wiggle display" + label1=Time unit1=s label2=Trace unit2="" + poly=y zplot=1.5 + ''') +``` + +**Stage by stage:** + +1. `sfspike` generates 500 samples × 15 traces with two wavelets per trace (at samples 100 and 300). `nsp=2` is required explicitly. + +2. `sfricker1 frequency=25` creates 25 Hz Ricker wavelets — lower frequency than example 1, producing broader, more clearly visible wiggles. + +3. `sfwiggle` displays with: + - `clip=0.8` — explicit clip at 80% of the spike amplitude. Wiggle traces are clipped to this value to control how wide the excursions can get. Without an explicit clip, the pclip=98 default would estimate one. + - `poly=y` — fills positive excursions with black polygons (standard seismic wiggle style). + - `zplot=1.5` — traces are spaced at 1.5× the default separation, so they overlap slightly. This emphasizes the waveform shape. Values > 1 cause overlap; values < 1 leave gaps between traces. + - Default `transp=n` for `sfwiggle` puts axis 1 (time) on the vertical axis and traces on the horizontal. + +**Output:** `Fig/traces.vpl` — a 15-trace wiggle display with filled positive excursions. + diff --git a/skills/plotting-with-vplot/references/example-grey.SConstruct b/skills/plotting-with-vplot/references/example-grey.SConstruct new file mode 100644 index 000000000..4dd664690 --- /dev/null +++ b/skills/plotting-with-vplot/references/example-grey.SConstruct @@ -0,0 +1,17 @@ +from rsf.proj import * + +Flow('synth', None, + ''' + spike n1=200 n2=100 k1=50,100,150 nsp=3 mag=1,0.8,0.6 | + ricker1 frequency=40 | + noise seed=2025 var=0.005 + ''') + +Result('synth', + ''' + grey title="Synthetic gather" + label1="Time" unit1="s" label2="Trace" unit2="" + color=j scalebar=y + ''') + +End() diff --git a/skills/plotting-with-vplot/references/example-overlay.SConstruct b/skills/plotting-with-vplot/references/example-overlay.SConstruct new file mode 100644 index 000000000..4ab9c226c --- /dev/null +++ b/skills/plotting-with-vplot/references/example-overlay.SConstruct @@ -0,0 +1,24 @@ +from rsf.proj import * + +# Data: a 2D gather. +Flow('cmp', None, + ''' + spike n1=400 n2=60 k1=80,180,300 nsp=3 mag=1,0.7,0.5 | + ricker1 frequency=30 + ''') +Plot('cmp', 'grey title="CMP" label1=Time unit1=s label2=Offset unit2=m') + +# A velocity curve drawn on top. +Flow('vcurve', None, + ''' + math n1=60 output="1500+x1*10" d1=1 o1=0 + ''') +Plot('vcurve', + ''' + graph transp=y yreverse=y min2=0 max2=2000 + wantaxis=n wanttitle=n plotcol=2 plotfat=3 pad=n + ''') + +Result('cmp-with-vcurve', 'cmp vcurve', 'Overlay') + +End() diff --git a/skills/plotting-with-vplot/references/example-wiggle.SConstruct b/skills/plotting-with-vplot/references/example-wiggle.SConstruct new file mode 100644 index 000000000..5d9099c8a --- /dev/null +++ b/skills/plotting-with-vplot/references/example-wiggle.SConstruct @@ -0,0 +1,16 @@ +from rsf.proj import * + +Flow('traces', None, + ''' + spike n1=500 n2=15 k1=100,300 nsp=2 mag=1,0.6 | + ricker1 frequency=25 + ''') + +Result('traces', + ''' + wiggle clip=0.8 title="Wiggle display" + label1=Time unit1=s label2=Trace unit2="" + poly=y zplot=1.5 + ''') + +End() diff --git a/skills/using-sf-programs/SKILL.md b/skills/using-sf-programs/SKILL.md new file mode 100644 index 000000000..021d7d27f --- /dev/null +++ b/skills/using-sf-programs/SKILL.md @@ -0,0 +1,489 @@ +--- +name: using-sf-programs +description: Use when composing a Madagascar data-processing pipeline from existing sf* programs — includes discovery, parameter conventions, and piping patterns. +--- + +## When to use + +Use this skill any time you are assembling a Madagascar data-processing workflow +from existing `sf*` programs: creating synthetic data, filtering, reshaping, +transforming, inspecting, or plotting. If you are *writing a new* `sf*` program +from scratch (in C or Python), consult the program-authoring skill instead. If +you need to embed pipelines inside a reproducible SCons build, consult the +`writing-rsf-flows` skill. + +The patterns here apply equally at the shell prompt, inside SConstruct `Flow` +calls, and inside short Python driver scripts — the programs themselves are +identical in all three contexts. + +## The pipeline model + +Every `sf*` program is a Unix filter. It reads an RSF dataset from **stdin** +and writes an RSF dataset to **stdout**. Because all programs speak the same +binary format, they compose without adapters: + +```bash +# Inline pipe — no temporary files: +sfspike n1=500 n2=10 k1=200 mag=1 \ + | sfbandpass fhi=60 phase=n \ + | sfwindow n1=300 f1=100 \ + | sfattr +``` + +The equivalent using explicit intermediates: + +```bash +sfspike n1=500 n2=10 k1=200 mag=1 > raw.rsf +< raw.rsf sfbandpass fhi=60 phase=n > filtered.rsf +< filtered.rsf sfwindow n1=300 f1=100 > windowed.rsf +< windowed.rsf sfattr +``` + +Both forms are semantically identical. The pipe form is more concise for +throwaway work; the intermediate-file form is better when you want to +inspect or reuse mid-stream results. + +Key properties of the model: + +- **Composability is the design goal.** Each program does one thing and passes + the header plus binary data downstream unchanged (except for the dimensions + it modifies). +- **The header travels with the data.** RSF files consist of a small ASCII + header (`.rsf`) that records `n#`, `d#`, `o#`, `label#`, `unit#`, and a + pointer to the binary data file. Programs update the header automatically; + you rarely touch it by hand. +- **Axis numbering is 1-based.** `n1` is the fast (innermost) axis — usually + time samples. `n2` is the slow axis — usually trace index. Higher axes + follow naturally. +- **stdin/stdout are the default I/O names.** Programs that take secondary + inputs (e.g. a mask file) use named parameters: `mask=head.rsf`. + +## Discovery + +### Looking up a specific program + +```bash +sfdoc sfbandpass +``` + +This prints the program description, synopsis with all parameters and defaults, +the known functions, and a list of reproducible examples that use the program. +Every parameter claim in this skill was verified with `sfdoc`. + +### Keyword search + +```bash +sfdoc -k bandpass +# → sfbandpass: Bandpass filtering. +# → sferf: Bandpass filtering using erf function. +# → sftrapepass: Trapezoid bandpass filter in the frequency domain. +``` + +`sfdoc -k ` searches program descriptions for the keyword and prints +matching program names and one-line summaries. Use it when you know what you +want to do but not which program does it. + +### Listing all sf* programs + +```bash +ls "$RSFROOT/bin" | grep '^sf' +``` + +This lists every installed program. Pipe through `grep` to narrow by name +fragment (e.g. `grep fft`). The catalog is large (hundreds of programs); +rely on `sfdoc -k` for semantic search rather than scanning the full list. + +Do **not** use `sfdoc -l ` to list programs — that flag writes LaTeX +documentation to files and is not a listing tool. + +### Reading source + +Source lives under `user//M.c` (user-contributed) or +`system/main/.c` / `system/generic/M.c` (core programs). +The `SOURCE` field in `sfdoc` output names the file. Reading source is the +authoritative way to understand edge cases and defaults not covered by the +synopsis. + +## Parameter conventions + +Parameter conventions are uniform across all `sf*` programs. The following +are grounded in actual `sfdoc` output. + +### Axis parameters (`n#`, `d#`, `o#`, `label#`, `unit#`) + +From `sfdoc sfspike`: + +``` +n#= size of #-th axis +d#=[0.004,0.1,0.1,...] sampling on #-th axis +o#=[0,0,...] origin on #-th axis +label#=[Time,Distance,Distance,...] label on #-th axis +unit#=[s,km,km,...] unit on #-th axis +``` + +- `n1=1000` — 1000 samples along the first (fast) axis. +- `d1=0.004` — 4 ms sample interval (the default for axis 1). +- `o1=0` — axis starts at time 0. +- `label1="Time"` — string labels appear in plots and in `sfin` output. + +The `#` is a literal digit: `n1`, `n2`, `n3`, … Programs that accept axis +parameters accept as many as the data has. + +### Boolean flags + +From `sfdoc sfbandpass`: + +``` +phase=n [y/n] y: minimum phase, n: zero phase +verb=n [y/n] verbosity flag +``` + +Booleans are always `y` or `n`. Never use `true`/`false` or `1`/`0`. + +### Comma-separated lists (multi-valued parameters) + +From `sfdoc sfspike`: + +``` +k#=[0,...] spike starting position [nsp] +mag= spike magnitudes [nsp] +``` + +When `nsp=2`, you supply two values separated by commas: + +```bash +sfspike n1=1000 n2=20 nsp=2 k1=300,700 mag=1,0.5 +``` + +This places two spikes on axis 1 at samples 300 and 700 with magnitudes 1.0 +and 0.5 respectively. The `[nsp]` annotation in the sfdoc synopsis means the +parameter repeats once per spike. + +**Note:** `sfspike` reads `nsp` BEFORE `k#`/`mag`, so you MUST set `nsp=N` +explicitly when passing N-element lists; it is not inferred from list length. +With the default `nsp=1`, only the first element of each list is used and only +one spike is placed. + +### File-valued parameters + +Some programs accept secondary files by name. From `sfdoc sfheaderwindow`: + +``` +mask= auxiliary input file name +``` + +Usage: + +```bash +< data.rsf sfheaderwindow mask=selection.rsf > windowed.rsf +``` + +The named file is opened separately; stdin/stdout handle the primary data flow. + +### The `output=` expression parameter + +From `sfdoc sfmath`: + +``` +output= Mathematical description of the output +``` + +This is a string containing a mathematical expression. Named input files +become variable names: + +```bash +sfmath x=file1.rsf y=file2.rsf output='sin((x+2*y)^power)' > out.rsf +sfmath < file1.rsf tau=file2.rsf output='exp(tau*input)' > out.rsf +``` + +When stdin is supplied, it is available as `input`. When producing data from +scratch (no stdin), set `nostdin=y` and specify `n1=`, `d1=`, `o1=` etc. + +### Window parameters (`f#`, `n#`, `j#`, `min#`, `max#`) + +From `sfdoc sfwindow`: + +``` +f#=(0,...) window start in #-th dimension +n#=(0,...) window size in #-th dimension +j#=(1,...) jump (decimation) in #-th dimension +min#=(o1,o2,...) minimum in #-th dimension +max#=... maximum in #-th dimension +``` + +You can mix coordinate-based (`min1=`, `max1=`) and sample-based (`f1=`, `n1=`) +windowing. Unspecified parameters default to keeping the full extent. + +## Core catalog + +The table below covers the programs used most often, grouped by purpose. Run +`sfdoc ` for full parameter lists. + +### Synthesize + +| Program | What it does | +|-----------|-------------| +| `sfspike` | Generate spikes, boxes, planes, or constant arrays. The primary tool for making synthetic data. Key params: `n1`, `n2`, `k1`, `mag`, `nsp`. | +| `sfmath` | Evaluate a mathematical expression to create or transform data. Accepts named file inputs as variables. Supports trig, log, exp, abs, erf, complex functions. | +| `sfnoise` | Add (or replace with) random noise. Key params: `var`, `range`, `mean`, `type` (y=normal, n=uniform), `rep` (replace instead of add), `seed`. | + +### Reshape + +| Program | What it does | +|-------------|-------------| +| `sfwindow` | Extract a sub-volume along any axis. Supports sample-based (`f#`, `n#`, `j#`) and coordinate-based (`min#`, `max#`) selection. | +| `sfpad` | Pad with zeros. Key params: `beg#` (prepend), `end#` (append), `n#` (set output length directly). | +| `sftransp` | Transpose two axes. Default: `plane=12` swaps axes 1 and 2. Use `plane=13` for axes 1 and 3, etc. | +| `sfput` | Set or override header parameters in place. Useful for correcting `d1`, `o1`, `label1`, etc. without touching data. | +| `sfcat` | Concatenate datasets along an axis. Key param: `axis=` (default 3). | + +### Filter + +| Program | What it does | +|---------------|-------------| +| `sfbandpass` | Butterworth bandpass along axis 1. Key params: `flo`, `fhi`, `nplo=6`, `nphi=6` (filter poles), `phase` (y=minimum-phase, n=zero-phase). | +| `sfsmooth` | Triangle smoothing along any axis. Key params: `rect1`, `rect2`, … (smoothing radius in samples on each axis). `repeat=` applies the filter multiple times. | + +### Transform + +| Program | What it does | +|-----------|-------------| +| `sffft1` | FFT along axis 1 (time → frequency). Key params: `inv=n` (forward), `sym=n`, `opt=y` (auto-pad to efficient length). | +| `sffft3` | FFT along any extra axis (default `axis=2`). Input and output are complex. Key params: `axis`, `pad=2` (padding factor), `sym=n`. | +| `sfcabs` | Convert complex RSF to float RSF by computing the complex magnitude. Use after `sffft1`/`sffft3` before float-consuming programs such as `sfgrey`. Use `sfreal` for the real part only. | + +### Inspect + +| Program | What it does | +|------------------|-------------| +| `sfin` | Print RSF header fields: `n1`, `d1`, `o1`, `label1`, … plus data-file path, element size, and a quick zero-check. | +| `sfattr` | Print amplitude statistics: rms, mean, 2-norm, variance, std dev, max, min, nonzero sample count. | +| `sfheaderwindow` | Select traces whose header key satisfies a mask. Key params: `mask=` (integer RSF file, nonzero = keep), `inv=n`. | +| `sfheadermath` | Apply math to trace headers. Key params: `key=` (header key to replace), `output=` (expression), `segy=y`. | + +### Plot (brief — see `plotting-with-vplot` skill for full coverage) + +| Program | What it does | +|------------|-------------| +| `sfgrey` | Raster (density / image) plot. Writes a vplot byte stream. | +| `sfgraph` | Line/graph plot. Writes a vplot byte stream. | +| `sfwiggle` | Wiggle-trace plot with filled lobes. Writes a vplot byte stream. | + +These write vplot streams, not RSF. Use `sfpen`, `xtpen`, or `pdfpen` to +render them, or chain into `pspen` for PDF output. Full parameter coverage +is in the `plotting-with-vplot` skill. + +## Piping patterns + +### Pattern 1: Synthesize → filter → plot + +The simplest end-to-end workflow: generate a synthetic impulse, apply a +filter, and plot the result. + +```bash +sfspike n1=500 d1=0.004 o1=0 label1=Time unit1=s k1=250 \ + | sfbandpass fhi=60 phase=n \ + | sfwiggle title="Bandpass demo" | xtpen +``` + +Stage-by-stage: +1. `sfspike n1=500 d1=0.004 … k1=250` — creates a single 500-sample trace + (axis 2 defaults to 1 trace) with a spike at sample 250. Spike positioning + is 1-based. Omitting `k1` (or setting `k1=0`) fills the **entire** trace + with a constant of magnitude `mag` — not what you usually want for an + impulse test. The axis metadata (`d1`, `label1`, `unit1`) flows + downstream. +2. `sfbandpass fhi=60 phase=n` — zero-phase low-pass at 60 Hz. `sfbandpass` + reads `d1` from the header to convert Hz to normalized frequency; you do + not need to repeat it. +3. `sfwiggle` — renders as a wiggle plot. Axis labels and title come from + header metadata. Pipe to a viewer (`xtpen`, `pspen > out.ps`, etc.). + +### Pattern 2: FK spectrum + +Compute the 2D Fourier transform (time → frequency, space → wavenumber) and +display power. + +```bash +sfspike n1=512 n2=64 d1=0.004 d2=25 k1=256 \ + | sfnoise var=0.01 \ + | sffft1 \ + | sffft3 axis=2 \ + | sfcabs \ + | sfgrey title="FK spectrum" | xtpen +``` + +Stage-by-stage: +1. `sfspike` — 512 × 64 array with a spike at time sample 256 on all traces. +2. `sfnoise var=0.01` — add Gaussian noise (variance 0.01) so the FK spectrum + is not just a line. +3. `sffft1` — FFT along axis 1 (time). Output is **complex**, shape changes + to `(n1/2+1) × 64`. +4. `sffft3 axis=2` — FFT along axis 2 (offset/space). Both axes are now in + the frequency/wavenumber domain. Output remains **complex**. +5. `sfcabs` — converts complex RSF to float RSF by computing the complex + magnitude (|re + i·im|). This is the required step before any + float-consuming program like `sfgrey`. Use `sfreal` instead if you only + want the real part. Do **not** use `sfmath output='abs(input)'` here: + on complex input `sfmath abs` returns complex output, which causes + `sfgrey` to error with "Need float input". +6. `sfgrey` — raster plot of the amplitude spectrum. + +### Pattern 3: Transpose-filter-transpose + +`sfbandpass` always filters along axis 1 (the fast axis). When you want to +filter along axis 2 instead, transpose before filtering, filter, then +transpose back. + +```bash +< data.rsf \ + sftransp \ + | sfbandpass fhi=30 phase=n \ + | sftransp \ + > filtered_axis2.rsf +``` + +Stage-by-stage: +1. `sftransp` — swaps axes 1 and 2. What was the trace axis (axis 2) is now + the fast axis (axis 1), which `sfbandpass` operates on. +2. `sfbandpass fhi=30 phase=n` — filters what was originally the trace axis. +3. `sftransp` — swaps back. Axis order is restored to the original. + +This transpose-operate-transpose idiom works for any axis-1-only operator +(FFT, smooth, etc.) when you need to operate on a different axis. + +### Pattern 4: Multi-file arithmetic + +`sfmath` accepts multiple named input files: + +```bash +sfmath x=model.rsf obs=field.rsf output='obs-x' > residual.rsf +sfmath x=residual.rsf output='x^2' | sfattr want=mean +``` + +Each named parameter other than `output`, `type`, `datapath`, and `out` is +treated as a variable in the expression. When stdin is also present it is +available as `input`. When creating data from scratch, pass `nostdin=y` and +specify axis parameters explicitly. + +## When NOT to pipe + +Piping is not always the right choice. + +**Cache expensive intermediates.** If stage 2 of your pipeline takes 10 minutes +to compute (e.g., a migration or FWI gradient), write it to disk and reference +it explicitly. In a SConstruct `Flow`, that is the natural structure anyway: +each `Flow` call is one stage with a named output. + +```python +Flow('migrated', 'shot_data', 'sfkirchhoff ...') +Flow('result', 'migrated', 'sfagc | sfbandpass fhi=60') +``` + +**Debug a broken pipeline.** Drop `sfin` or `sfattr` after each stage to check +that dimensions and values look right (see the Debugging section below). +Intermediates make that easy. + +**Pipelines with branches.** A pipeline is a linear chain; if two downstream +consumers need the same intermediate, write it to disk. The shell `tee` +command can help but adds complexity. + +**Very large datasets.** Programs that must load an entire axis into RAM +(e.g., `sftransp` with large `plane=`) may need `memsize=` tuning. Working +with named files makes it easier to profile memory at each stage. + +## Debugging a broken pipeline + +### Insert `sfin` or `sfattr` after each stage + +Break the pipe at the suspect stage and write an intermediate, then inspect: + +```bash +sfspike n1=1000 n2=20 k1=300 mag=1 > raw.rsf +sfin raw.rsf # prints n1, d1, o1, label1, data-file path +< raw.rsf sfbandpass fhi=4 phase=y > filtered.rsf +sfin filtered.rsf # verify dimensions after bandpass +sfattr < filtered.rsf # check amplitude statistics +``` + +`sfin` prints header fields and a quick zero-check; `sfattr` prints rms, +mean, 2-norm, max, min, and variance. Note that `sfin` terminates the +stream — it cannot be inserted transparently mid-pipe. + +### Check `$DATAPATH` + +RSF header files (`.rsf`) contain a pointer to a binary data file. By +default, `DATAPATH` is `/var/tmp/` (or whatever your installation sets). If +you see "cannot open data file" errors, check: + +```bash +echo $DATAPATH +sfin myfile.rsf # shows the actual data file path +``` + +Mismatches between `$DATAPATH` and the path recorded in the header happen +when files are moved without updating the header. Use `sfput` to correct the +path, or re-run the pipeline with the correct `DATAPATH` set. + +### Check axis metadata with `sfput` + +If a downstream program complains about missing `d1` or wrong axis size, use +`sfput` to inject the correct values: + +```bash +< data.rsf sfput d1=0.004 o1=0 label1=Time unit1=s > data_fixed.rsf +``` + +`sfput` passes all data through unchanged; it only rewrites header fields. + +### Use `sfheaderwindow` for trace-by-trace header diagnostics + +When working with SEGY-derived data that has trace headers, use +`sfheaderwindow` to select a subset of traces and `sfheadermath` to inspect +or correct header values: + +```bash +# Select traces where offset < 2000 m (stored in a mask file): +sfheadermath < data.rsf output='offset<2000' key=mask > mask.rsf +sfheaderwindow < data.rsf mask=mask.rsf > near_offset.rsf +``` + +## Example + +See `references/example-pipeline.sh` for a self-contained runnable example. +Run it with: + +```bash +bash references/example-pipeline.sh +``` + +### Stage-by-stage walkthrough + +**Stage 1 — Synthesize:** +`sfspike n1=1000 n2=20 nsp=2 k1=300,700 mag=1,0.5` creates a `1000 × 20` +dataset with **two** spikes at samples 300 and 700 (magnitudes 1.0 and 0.5). +`nsp=2` must be set explicitly — `sfspike` reads `nsp` before `k1`/`mag` and +does not infer it from list length; without it only the first spike would be +placed. Defaults `d1=0.004`, `d2=0.1` come from `sfdoc sfspike`. + +**Stage 2 — Bandpass:** +`sfbandpass fhi=4 phase=y` applies a 6-pole minimum-phase low-pass at 4 Hz. +`sfbandpass` reads `d1` from the header to convert Hz to normalized frequency; +you do not repeat `d1` on the command line. + +**Stage 3 — Window:** +`sfwindow n1=500 f1=250` extracts samples 250–749 along axis 1, yielding +a `500 × 20` dataset. Axis 2 is unchanged. + +**Stage 4 — Transpose:** +`sftransp` (default `plane=12`) swaps axes 1 and 2, giving `20 × 500`. +All header fields (`n#`, `d#`, `o#`, `label#`) are updated automatically. + +**Stage 5 — Summarize:** +`sfattr` prints rms, mean, 2-norm, variance, std dev, max, min, and sample +counts. The `at 1 94` notation in the output gives the 2D sample location +of the maximum. + +**Single-pipe form:** The script's final section chains all five programs with +`|`. Output is identical — RSF programs are stateless filters; results depend +only on data and parameters, not on whether intermediates were written to disk. diff --git a/skills/using-sf-programs/references/example-pipeline.sh b/skills/using-sf-programs/references/example-pipeline.sh new file mode 100644 index 000000000..a1633b734 --- /dev/null +++ b/skills/using-sf-programs/references/example-pipeline.sh @@ -0,0 +1,30 @@ +#!/bin/bash +# Example: build a synthetic trace, filter it, window it, transpose, and summarize. +# Run with: +# bash example-pipeline.sh +set -euo pipefail +mkdir -p /tmp/using-sf-programs-demo +cd /tmp/using-sf-programs-demo + +echo "=== stage 1: synthesize a 2D dataset (20 traces x 1000 samples) ===" +sfspike n1=1000 n2=20 nsp=2 k1=300,700 mag=1,0.5 > spikes.rsf + +echo "=== stage 2: bandpass each trace (low-pass at 4 Hz) ===" +< spikes.rsf sfbandpass fhi=4 phase=y > filtered.rsf + +echo "=== stage 3: window out the middle 500 samples ===" +< filtered.rsf sfwindow n1=500 f1=250 > windowed.rsf + +echo "=== stage 4: transpose (now axis 1 is trace number) ===" +< windowed.rsf sftransp > transposed.rsf + +echo "=== stage 5: summarize ===" +< transposed.rsf sfattr + +echo +echo "=== same thing as a single pipe ===" +sfspike n1=1000 n2=20 nsp=2 k1=300,700 mag=1,0.5 \ + | sfbandpass fhi=4 phase=y \ + | sfwindow n1=500 f1=250 \ + | sftransp \ + | sfattr diff --git a/skills/working-with-rsf-data/SKILL.md b/skills/working-with-rsf-data/SKILL.md new file mode 100644 index 000000000..6bcd81eeb --- /dev/null +++ b/skills/working-with-rsf-data/SKILL.md @@ -0,0 +1,215 @@ +--- +name: working-with-rsf-data +description: Use when inspecting, modifying, or debugging an RSF file, its headers, or its binary data placement. +--- + +## When to use + +Use this skill for any task that touches RSF files directly: + +- Inspecting what an RSF file contains (shape, type, axis metadata, value statistics). +- Modifying header fields (labels, units, axis parameters). +- Diagnosing "file not found", "wrong dimensions", or "garbled data" problems. +- Understanding why a pipeline produces unexpected results after copying or moving files. +- Cleaning up RSF files without orphaning their binaries. +- Working with complex-valued RSF data and choosing the right downstream program. +- Recovering from situations where headers and binaries have become separated. + +## The two-file model + +Every RSF dataset is split across two files: + +1. **Text header** (`*.rsf`) — a plain-text key=value file you can read with `cat`. It describes axes, data type, and the path to the binary. +2. **Binary file** (`*.rsf@`) — the raw sample data, written under the directory named by `$DATAPATH`. + +The `in=` field in the header connects the two. Here is a real header from `sfspike n1=100 n2=5 k1=50 mag=1 > data.rsf`: + +``` +4.3-git sfspike private/tmp/working-with-rsf-data-demo: root@hostname Sun Apr 19 12:28:16 2026 + + d2=0.1 + o1=0 + n2=5 + o2=0 + label1="Time" + data_format="native_float" + label2="Distance" + esize=4 + in="/var/tmp/data.rsf@" + unit1="s" + unit2="km" + d1=0.004 + n1=100 +``` + +Key observations: + +- The first line is a provenance comment (program name, working directory, host, timestamp). It is not parsed by Madagascar programs. +- All key=value lines are indented with a leading tab character. +- `in=` holds the absolute path to the binary file. This path is set at creation time using `$DATAPATH`. +- `data_format="native_float"` is the default: 4-byte IEEE 754 float, native byte order. +- `esize=4` is the element size in bytes. + +Because the binary lives under `$DATAPATH` and not next to the header, copying or moving `*.rsf` files with standard Unix tools (`cp`, `mv`, `rm`) only touches the header. The binary is untouched and can become orphaned. + +## Axes + +RSF supports up to nine axes, numbered 1 through 9. Each axis has five fields: + +| Field | Meaning | Default if omitted | +|-----------|--------------------------------|--------------------| +| `n` | number of samples on axis k | (required for k=1) | +| `d` | sampling interval | 1.0 | +| `o` | axis origin (value at index 0) | 0.0 | +| `label`| axis label string | "" (empty) | +| `unit` | unit string | "" (empty) | + +**Memory order**: axis 1 is the fastest-varying (innermost) axis in memory. For a 2D array `n1=100 n2=5`, sample `[i2][i1]` is at byte offset `(i2*100 + i1) * esize`. This is analogous to C's row-major layout where axis 1 is the column index. + +In seismic usage, `n1` is commonly the time or depth axis (samples per trace) and `n2` and above are spatial or offset axes. + +Trailing dimensions of size 1 are often omitted from `sfin` output. A file with `n1=100` and no `n2` is effectively 1-D. + +When reading a header, if `d1` or `o1` are missing, treat them as 1.0 and 0.0 respectively. Programs that need axis metadata will use these defaults. + +## Data types + +The `data_format` field controls how the binary is encoded. The default is native 32-bit float. + +**Float (default)** + +``` +data_format="native_float" +esize=4 +``` + +This is what `sfspike`, `sfbandpass`, and most Madagascar programs produce. It is IEEE 754 single precision, native byte order of the machine that wrote the file. The `xdr_float` variant is big-endian XDR (used for cross-platform exchange). `ascii_float` writes human-readable ASCII values. + +**Complex** + +Complex-valued files use one of three prefixed format strings: + +- `data_format="native_complex"` — pairs of native-float (re, im), 8 bytes per sample +- `data_format="ascii_complex"` — ASCII-encoded complex pairs +- `data_format="xdr_complex"` — XDR big-endian complex pairs + +IMPORTANT: `data_format="complex"` is NOT a valid value. It is a common but incorrect shorthand. Always use one of the three prefixed forms above. Programs that check `data_format` will not recognize the bare string. + +When `esize=8`, the file holds complex samples. Most float-consuming programs (`sfgrey`, `sfwiggle`, `sfattr`) do not accept complex input directly. Convert first: + +- `sfcabs` — complex → float magnitude +- `sfreal` — complex → float real part +- `sfimag` — complex → float imaginary part + +**Integer and byte** + +``` +data_format="native_int" esize=4 # 32-bit signed integer +data_format="native_short" esize=2 # 16-bit signed integer +data_format="native_uchar" esize=1 # unsigned byte (type=byte in sfin output) +``` + +`sfin` reports `type=float`, `type=complex`, `type=int`, `type=short`, or `type=byte` based on `data_format`. + +## Essential tools + +**`sfin`** — displays the shape and type of one or more RSF files. Run it as `sfin file.rsf` (not via stdin). Output includes the `in=` path, `esize`, `type`, `form` (native/xdr/ascii), all axis parameters, and the total element and byte count. Use `sfin info=n file.rsf` to suppress axis detail and show only the binary path. This is the first tool to reach for when you encounter an unfamiliar file. + +**`sfattr`** — summarizes the numeric content of an RSF file. Read from stdin: `sfattr < file.rsf`. Reports rms, mean, 2-norm, variance, standard deviation, max (with index), min (with index), nonzero sample count, and total sample count. Use `want=max` or `want=rms` to print a single statistic. Useful for sanity-checking that a pipeline produced reasonable values. Does not work on complex files without prior conversion. + +**`sfput`** — updates header fields and writes a new output file (header + binary copy). Takes input on stdin and writes output on stdout: `< in.rsf sfput key1=val1 key2=val2 > out.rsf`. The output file gets its own independent binary. Use this to relabel axes, fix sampling intervals, or correct missing metadata. Any key that appears in an RSF header can be set with `sfput`. To remove a key, set it to the empty string. Note: because `sfput` copies the binary, it has disk and time cost proportional to file size. + +**`sfrm`** — removes RSF files together with their binaries. Use it instead of Unix `rm`. Syntax mirrors `rm`: `sfrm file1.rsf file2.rsf`. Flags `-v` (verbose), `-f` (force, ignore missing), `-i` (interactive) are supported. Under the hood, `sfrm` reads the `in=` field from each header, deletes the binary at that path, then deletes the header. If the binary is already missing, `sfrm` still deletes the header. See also `sfmv` and `sfcp` for moving and copying RSF files correctly. + +## Binary location + +The `$DATAPATH` environment variable controls where Madagascar writes binary files. It must end with a trailing slash. When you run: + +```bash +sfspike n1=1000 > spike.rsf +``` + +Madagascar writes the text header to `spike.rsf` in the current directory and the binary to `$DATAPATH/spike.rsf@`. The `in=` field in the header records the full absolute path. + +**Why `rm *.rsf` leaks binaries**: deleting the text header with Unix `rm` removes only the key=value file. The binary under `$DATAPATH` has no back-reference to the header and becomes an orphan. On a busy workstation with `$DATAPATH=/var/tmp/`, orphaned binaries can accumulate silently. + +Always use `sfrm` to delete RSF files. If you have already used `rm` and need to clean up orphans, list `$DATAPATH` and look for `*.rsf@` files that no longer have a matching header anywhere on the filesystem. + +**Setting `$DATAPATH`**: the environment script `$RSFROOT/share/madagascar/etc/env.sh` sets a default `$DATAPATH`, often pointing to a system temp directory that may be periodically cleaned. If you need data to persist, set `DATAPATH` explicitly in your shell profile to a stable directory with sufficient space and a trailing slash, for example: + +```bash +export DATAPATH=/home/user/rsf-data/ +``` + +## Header manipulation + +`sfput` writes a new header AND a new binary copy — it is NOT zero-cost for large files. The convenience is the compact syntax for updating header fields (labels, units, axis deltas). For metadata-only tweaks on multi-GB files, plan for the disk and time cost. + +```bash +# Fix wrong sampling interval and add axis labels +< wrong.rsf sfput d1=0.002 label1="Depth" unit1="km" > corrected.rsf +``` + +`corrected.rsf` has its own independent binary; deleting `wrong.rsf` with `sfrm` does not affect it. + +**`sfheadermath`** handles per-trace SEGY-style trace header fields. This is about per-trace header fields in segregated SEGY-style datasets, not about the `*.rsf` file's top-level metadata (which `sfput` manages). Not to be confused with the RSF file header. SEGY files have a binary header and per-trace headers stored in a separate `*.hdr` RSF file alongside the data `*.rsf`. `sfheadermath` reads those trace headers and computes new values from existing keys: + +```bash +# Compute offset from shot and receiver x-coordinates +< data.hdr sfheadermath output="abs(sx-gx)" key="offset" > data_with_offset.hdr +``` + +Use `sfheadermath` when you need to derive or correct SEGY trace header quantities (offset, midpoint, elevation statics) without reprocessing the seismic samples themselves. + +## Recovery scenarios + +**Scenario A: missing binary** + +Symptom: a program reports it cannot open or read the binary file. + +1. Find the expected binary path: `grep "in=" file.rsf` or `sfin info=n file.rsf`. +2. Check whether the file exists at that path: `ls -lh /var/tmp/file.rsf@`. +3. If the file was created on a different machine or in an environment where `$DATAPATH` pointed to a now-inaccessible location, the `in=` path in the header will be unreachable. Use `sfput` to write a new file with `in=` pointing to the current, valid location — or copy the binary to the expected path: `< file.rsf sfput "in=/new/path/file.rsf@" > file_fixed.rsf`. +4. If the binary is truly gone (deleted with `rm` or temp-dir purge), the data is unrecoverable from the header alone. + +**Scenario B: wrong axis labels or parameters** + +Symptom: `sfin` shows incorrect `label1`, `unit1`, `d1`, or `o1`. + +Use `sfput` to fix the labels. Because `sfput` writes an independent binary copy, `file_fixed.rsf` is safe to use on its own: + +```bash +< file.rsf sfput label1="Time" unit1="s" d1=0.004 o1=0 > file_fixed.rsf +``` + +Verify with `sfin file_fixed.rsf`. Once confirmed correct, remove the original with `sfrm file.rsf` — this will not affect `file_fixed.rsf` since each has its own binary. + +**Scenario C: segregated SEGY headers** + +SEGY data imported with `sfsegyread` produces two files: `data.rsf` (sample data) and `data.hdr` (per-trace SEGY headers as a separate RSF file with integer type). Programs like `sfheadermath`, `sfheadersort`, and `sfintbin` consume the `.hdr` file. If the `.hdr` file is missing or mismatched, re-import with `sfsegyread` or reconstruct trace headers using `sfheadermath` from known geometry. + +## Example + +See `references/example-rsf-inspect.sh` for a self-contained walkthrough. Run it with: + +```bash +bash references/example-rsf-inspect.sh +``` + +The script progresses through seven stages: + +1. **Create a 2D RSF** — `sfspike n1=100 n2=5 k1=50 mag=1 > data.rsf` generates a 100-sample by 5-trace file with a spike at sample 50 in every trace. + +2. **Cat the header** — `cat data.rsf` shows the plain-text key=value format. The `in=` line reveals the binary path under `$DATAPATH`. + +3. **sfin** — `sfin data.rsf` prints the parsed shape: `n1=100 d1=0.004 o1=0 label1="Time"`, `n2=5 d2=0.1 o2=0 label2="Distance"`, element count, and byte count. + +4. **sfattr** — `sfattr < data.rsf` reports rms=0.1, max=1 at position [50, 1], and that only 5 of 500 samples are nonzero (one spike per trace). + +5. **sfput** — `< data.rsf sfput label2="Trace" unit2="" > labeled.rsf` writes a new header pointing to a new binary with updated metadata. `sfin labeled.rsf` confirms the change. + +6. **Locate the binary** — `grep "^ in=" data.rsf` prints the `in=` line, showing the absolute path under `$DATAPATH`. This is what would be orphaned by `rm data.rsf`. + +7. **sfrm** — `sfrm data.rsf labeled.rsf` removes both headers and both binaries. The final `ls` confirms no `*.rsf` files remain, demonstrating clean teardown. + +Expected final output: `clean`. diff --git a/skills/working-with-rsf-data/references/example-rsf-inspect.sh b/skills/working-with-rsf-data/references/example-rsf-inspect.sh new file mode 100755 index 000000000..97013617b --- /dev/null +++ b/skills/working-with-rsf-data/references/example-rsf-inspect.sh @@ -0,0 +1,29 @@ +#!/bin/bash +# Example: inspect, modify, and recover RSF data. +set -euo pipefail +mkdir -p /tmp/working-with-rsf-data-demo +cd /tmp/working-with-rsf-data-demo + +echo "=== 1. create a 2D RSF ===" +sfspike n1=100 n2=5 k1=50 mag=1 > data.rsf + +echo "=== 2. header is plain text ===" +cat data.rsf + +echo "=== 3. sfin tells you the shape ===" +sfin data.rsf + +echo "=== 4. sfattr summarizes data values ===" +sfattr < data.rsf + +echo "=== 5. modify header metadata (add units and labels) ===" +< data.rsf sfput label1="Time" unit1="s" label2="Trace" unit2="" > labeled.rsf +sfin labeled.rsf + +echo "=== 6. where is the binary? ===" +grep "^ in=" data.rsf +# Binary lives under $DATAPATH. If you did 'rm *.rsf' you'd orphan it. + +echo "=== 7. clean up the RIGHT way ===" +sfrm data.rsf labeled.rsf +ls *.rsf 2>/dev/null && echo "leaked headers" || echo "clean" diff --git a/skills/writing-rsf-flows/SKILL.md b/skills/writing-rsf-flows/SKILL.md new file mode 100644 index 000000000..344c0392b --- /dev/null +++ b/skills/writing-rsf-flows/SKILL.md @@ -0,0 +1,315 @@ +--- +name: writing-rsf-flows +description: Use when writing or modifying an SConstruct file that drives a Madagascar data-processing flow (Flow/Plot/Result/Fetch/Command). +--- + +## When to use + +Use this skill whenever you are authoring or editing an `SConstruct` file that drives a Madagascar geophysical data-processing pipeline. This covers creating new processing workflows, adding stages to existing flows, configuring plots or results, fetching remote datasets, and integrating non-Madagascar steps via `Command`/`Action`. Any SConstruct that starts with `from rsf.proj import *` is using this DSL. + +## The DSL surface + +`framework/rsf/proj.py` exposes six top-level functions that wrap a singleton `Project` object (a subclass of SCons `Environment`). These are the only functions you should call directly in a user-facing SConstruct. + +| Function | One-line description | +|----------|----------------------| +| `Flow(target, source, flow, **kw)` | Run a Madagascar command pipeline to produce one or more `.rsf` files. | +| `Plot(target, source, flow=None, **kw)` | Run a vplot command to produce a `.vpl` file (not a deliverable). | +| `Result(target, source, flow=None, **kw)` | Same as `Plot`, but copies the output into `Fig/` and registers it as a deliverable viewable with `scons view`. | +| `Fetch(file, dir, private=0, **kw)` | Download a data file from a remote server into the current directory. | +| `End(**kw)` | Finalize the build: wire up `view`, `print`, `lock`, and `test` aliases. Must be called at the end of every SConstruct. | +| `Command(targets, sources, action)` | SCons built-in re-exported; use for steps that are not Madagascar `sf*` programs. | + +Full module-level signatures from `proj.py`: + +```python +def Flow(target, source, flow, **kw): ... +def Plot(target, source, flow=None, **kw): ... +def Result(target, source, flow=None, **kw): ... +def Fetch(file, dir, private=0, **kw): ... +def End(**kw): ... +``` + +`Project.Flow` (the underlying method) has this full signature: + +```python +def Flow(self, target, source, flow, stdout=1, stdin=1, rsfflow=1, + suffix=sfsuffix, prefix=sfprefix, src_suffix=sfsuffix, + split=[], np=1, reduce='cat', jobmult=1, local=0, noderotate=1, + workdir=None, wall=''): +``` + +`Project.Fetch` full signature: + +```python +def Fetch(self, files, dir, private=None, server=dataserver, top='data', usedatapath=True): +``` + +## Flow in depth + +`Flow(target, source, flow)` is the workhorse of the DSL. + +**Targets and sources map to `.rsf` files.** If `target` is the string `'spike'`, the output file is `spike.rsf`. If `source` is `'spike'`, the input is `spike.rsf`. The `.rsf` suffix is appended automatically (controlled by the `suffix` and `src_suffix` keyword arguments, both defaulting to `'.rsf'`). + +**The `sf` prefix is implicit inside a `flow` string.** When `rsfflow=1` (the default), the build system prepends `sf` to each command token that is recognized as a Madagascar program. So you write `'spike n1=1000'` not `'sfspike n1=1000'`. This means you should never write the `sf` prefix inside a `Flow` command string. + +**Pipes.** Multiple stages are separated with `|` inside the flow string, just like shell pipes: + +```python +Flow('filtered_spike', None, 'spike n1=1000 k1=300 | bandpass fhi=2 phase=y') +``` + +**Ampersand-separated commands.** `&&` inside a Flow command string separates two sequential pipeline groups. Each group is assembled independently (with `sf` prefixes added, inputs/outputs wired) but they run in the same shell action in sequence, sharing the target output. Use it when you need two related commands that both contribute to producing the target — but if they're independent, prefer two separate `Flow` calls. + +**Multi-target flows.** Pass a Python list as `target` to produce several output files from a single command. Use bare names (without `.rsf`) — the DSL appends the suffix automatically: + +```python +Flow(['cmpt', 'offset'], 'input', 'some_program ...') +# or equivalently, a space-separated string: +Flow('out1 out2', 'input', 'program ...') +``` + +Note: names that already contain a `.` (e.g. `'cmpt.rsf'`) also work — the DSL skips suffix-adding when a `.` is already present — but bare names are the idiomatic form. + +**Zero-input flows (`source=None`).** Pass `None` (not an empty string) when a command generates data with no RSF input: + +```python +Flow('spike', None, 'spike n1=1000 k1=300') +``` + +Internally, when `source` is falsy the DSL sets `stdin=0`, meaning no RSF file is piped to stdin. + +**Multiple sources.** Pass a space-separated string or a list. Inside the `flow` string, refer to extra sources as `${SOURCES[1]}`, `${SOURCES[2]}`, etc.: + +```python +Flow('nmo', 'cmp offset vnmo', + 'nmo offset=${SOURCES[1]} velocity=${SOURCES[2]} half=n') +``` + +## Plot vs. Result + +`Plot` and `Result` share the same signature: + +```python +Plot(target, source, flow=None, **kw) +Result(target, source, flow=None, **kw) +``` + +**Two-argument shorthand.** When called with two positional arguments, the second is treated as `flow` and `source` is set to `target`: + +```python +Plot('spike', 'wiggle clip=0.02 title="Spike"') +# equivalent to: +Plot('spike', 'spike', 'wiggle clip=0.02 title="Spike"') +``` + +**Plot** produces `.vpl` in the current directory. It is an intermediate artifact — not tracked as a deliverable. + +**Result** is identical to `Plot` but additionally: +- Copies the `.vpl` file into `Fig/.vpl`. +- Registers the target for `scons view` (opens all results with `sfpen`). +- Registers the target for `scons print` and `scons lock`. +- Records the target name in `project.rest` for the `.rsfproj` manifest. + +Rule of thumb: use `Plot` for intermediate composites or overlays; use `Result` for the final figure you want to present. + +**Overlay example** (combining two plots): + +```python +Plot('cmp', 'grey title="Synthetic CMP"') +Plot('time', 'graph yreverse=y wanttitle=n plotfat=3') +Result('cmp-time', 'cmp time', 'Overlay') +``` + +The second argument `'cmp time'` lists source `.vpl` files; `'Overlay'` is a built-in combine mode (see the `combine` dict in `proj.py`). + +## Fetch + +```python +Fetch(file, dir, private=0, **kw) +# Underlying Project.Fetch signature: +# Fetch(self, files, dir, private=None, server=dataserver, top='data', usedatapath=True) +``` + +`Fetch` downloads one or more files from a remote server. With the default `usedatapath=True`, the file is stored under `$DATAPATH` and a symlink is placed in the current directory (matching RSF's header-and-binary split). Key keyword arguments: + +- `server` — URL of the server (default: `https://ahay.org` via `get_dataserver()`). +- `top` — top-level path component on the server (default: `'data'`). +- `private` — dict with `login`/`password`/`server` for FTP-authenticated downloads. +- `usedatapath` — if `True` (default), stores the binary in `DATAPATH` and symlinks. + +Worked example from `book/rsf/tutorials/nmo/SConstruct`: + +```python +Fetch('synthetic_cmp.npz', 'data', + server='https://raw.githubusercontent.com', + top='seg/tutorials-2017/master/1702_Step_by_step_NMO') +``` + +This downloads: +`https://raw.githubusercontent.com/seg/tutorials-2017/master/1702_Step_by_step_NMO/data/synthetic_cmp.npz` + +After `Fetch`, the file is available as a source to subsequent `Flow` or `Command` calls. + +## Command and Action + +`Command` (re-exported from SCons) is for steps that are not `sf*` Madagascar programs — calling a Python function inline, running a third-party tool, or converting between file formats. + +**Python function as an action:** + +```python +def npz2rsf(target=None, source=None, env=None): + import numpy, m8r + data = numpy.load(str(source[0])) + out = m8r.Output(str(target[0])) + out.put('n1', data['arr'].shape[0]) + out.write(data['arr']) + out.close() + return 0 + +Command(['out.rsf'], 'input.npz', action=Action(npz2rsf)) +``` + +The function signature `(target, source, env)` is required by SCons. `target` and `source` are lists of SCons `Node` objects; convert with `str(source[0])`. + +**Shell command with a non-Madagascar tool:** + +```python +Command('out.csv', 'in.rsf', + 'sfdd form=ascii < $SOURCE > $TARGET') +``` + +Use `$SOURCE` / `$TARGET` (SCons variables) for single-file targets inside a `Command` action string. + +Do NOT use `Command` as a replacement for `Flow` when the command is an `sf*` program — use `Flow` so the DSL handles path resolution, binary file cleanup, and parallel execution correctly. + +## Parameter interpolation + +SConstruct files are plain Python, so you can compute parameters at build-configuration time and interpolate them into flow strings. + +**`%` formatting (most common in existing Madagascar code):** + +```python +fhi = 4 +Flow('filter', 'spike', 'bandpass fhi=%g phase=y' % fhi) +``` + +**`.format()`:** + +```python +Flow('filter', 'spike', 'bandpass fhi={fhi} phase=y'.format(fhi=fhi)) +``` + +**f-strings (Python 3.6+):** + +```python +Flow('filter', 'spike', f'bandpass fhi={fhi} phase=y') +``` + +Note: f-strings work fine in modern Madagascar (Python 3). The historic caution was that SCons uses `$VAR` substitution in command strings, so a literal `$` inside a flow string could clash. Because Madagascar flow strings are processed by `rsf.flow.Flow` before being handed to SCons, this is not usually a problem — but avoid `$` characters in f-string expressions just to be safe. + +**Multiple parameters:** + +```python +t0, v0 = 0.2, 4000 +Flow('time', 'offset', + 'math output="sqrt(%g+input*input/%g)"' % (t0*t0, v0*v0)) +``` + +## Running a flow + +| Command | Effect | +|---------|--------| +| `scons` | Build all default targets (everything registered with `Flow`, `Plot`, `Result`). | +| `scons view` | Open all `Result` outputs with `sfpen` (requires a display). | +| `scons -n` | Dry run: print the commands that would be executed without running them. | +| `scons ` | Build only `.rsf` (or `.vpl`). Example: `scons filter`. | +| `scons print` | Print all results to the configured printer via `pspen`. | +| `scons lock` | Copy results to the locked figures directory for regression testing. | +| `scons -j4` | Parallel build using 4 jobs. | + +The `scons view` alias is only registered if at least one `Result` call has been made. + +## How it plugs into the larger build + +Each chapter/tutorial lives in its own subdirectory with its own `SConstruct`. The top-level `book/SConstruct` and intermediate `book//SConstruct` files use `from rsf.book import *` instead of `from rsf.proj import *`. The `Book()` helper in `framework/rsf/book.py` recurses into subdirectories and compiles the LaTeX papers alongside the data flows. + +A typical book hierarchy: + +``` +book/ + tutorial/ + SConstruct # uses rsf.book — recurses into chapters + users/ + SConstruct # uses rsf.proj — data flows here + authors/ + SConstruct +``` + +When writing data-processing flows (not book structure), always use `from rsf.proj import *`. Only use `from rsf.book import *` when you are building a multi-chapter book document. See `framework/rsf/book.py` for the `Book()` and `Paper()` helpers. + +## Common mistakes + +1. **Forgetting `End()`.** + Every SConstruct that uses `rsf.proj` must call `End()` at the bottom. Without it, the `view`, `print`, `lock`, and `test` SCons aliases are never registered, and the `.rsfproj` manifest is never written. + +2. **Writing `sfbandpass` instead of `bandpass` inside a Flow string.** + The `sf` prefix is added automatically by the DSL — `Project.Flow` accepts a `prefix=sfprefix` parameter and prepends it to each command token. Writing `sfbandpass` explicitly is redundant and harder to read. It also reduces portability: if `sfprefix` were ever reconfigured (the knob exists, even if it's not the default), explicitly-prefixed commands would need to be updated everywhere. In practice `sfbandpass` still works because `add_prefix` in `framework/rsf/flow.py` checks whether the prefix is already present before adding it — so existing code using `sfbandpass` is not broken, just non-idiomatic. Stick with the bare form (`bandpass`) inside Flow strings. + +3. **Using shell redirection instead of Flow pipelines.** + Do not write `'spike n1=100 > out.rsf'`. Use `Flow('out', None, 'spike n1=100')`. Shell redirection bypasses the DSL's path resolution, binary-file tracking, and cleanup logic. + +4. **Mixing relative and absolute paths in `source`.** + Sources should be base names without the `.rsf` suffix (e.g., `'spike'`, not `'./spike.rsf'`). The DSL appends `.rsf` automatically. Passing full paths can confuse the dependency tracker. + +5. **Calling `Result('name')` with only one argument.** + `Result` requires at least two positional arguments: `Result(target, source_or_flow)`. With two arguments the second is treated as the `flow` string and source defaults to `target`. With three arguments: `Result(target, source, flow)`. + +6. **Forgetting `source=None` for zero-input commands.** + Commands like `sfspike` or `sfecho` read nothing from stdin. Pass `source=None` explicitly so the DSL sets `stdin=0`. Passing an empty string `''` is not equivalent. + +7. **Using Python `print()` to inspect RSF headers at build time.** + Use `sfin` or `sfattr` as separate shell commands, not inside the SConstruct. At SConstruct-read time the RSF files may not exist yet. + +## Example + +See `references/example-flow.SConstruct` for a self-contained runnable demo. Stage-by-stage walkthrough: + +```python +from rsf.proj import * +``` +Imports the DSL and initialises the singleton `Project`. This line is mandatory. + +```python +Flow('spike', None, 'spike n1=1000 k1=300') +``` +Runs `sfspike n1=1000 k1=300 > spike.rsf`. `source=None` because `sfspike` produces data from scratch. + +```python +Flow('filter', 'spike', 'bandpass fhi=2 phase=y') +``` +Runs `sfbandpass fhi=2 phase=y < spike.rsf > filter.rsf`. Note: `bandpass` not `sfbandpass`. + +```python +fhi = 4 +Flow('filter2', 'spike', 'bandpass fhi=%g phase=y' % fhi) +``` +Demonstrates `%`-format interpolation. The value of `fhi` is baked in at SConstruct-parse time. + +```python +Plot('filter', 'wiggle clip=0.02 title="Filtered spike"') +Plot('filter2', 'wiggle clip=0.02 title="Filter (fhi=%g)"' % fhi) +``` +Each `Plot` produces a `.vpl` in the current directory. Two-argument form: source = target = `'filter'`. + +```python +Result('filter', 'wiggle clip=0.02 title="Filtered spike"') +``` +Same as the `Plot` above, but also writes `Fig/filter.vpl` and registers `filter` for `scons view`. + +```python +End() +``` +Wires up the `view`/`print`/`lock`/`test` aliases. Always the last line. + +Running `scons` in the directory produces `spike.rsf`, `filter.rsf`, `filter2.rsf`, `filter.vpl`, `filter2.vpl`, and `Fig/filter.vpl`. Running `scons view` opens `Fig/filter.vpl` with `sfpen`. diff --git a/skills/writing-rsf-flows/references/example-flow.SConstruct b/skills/writing-rsf-flows/references/example-flow.SConstruct new file mode 100644 index 000000000..23ef69e56 --- /dev/null +++ b/skills/writing-rsf-flows/references/example-flow.SConstruct @@ -0,0 +1,28 @@ +# Example flow demonstrating the rsf.proj DSL. +# +# Run with: +# cp example-flow.SConstruct /tmp/writing-rsf-flows-demo/SConstruct +# cd /tmp/writing-rsf-flows-demo && scons + +from rsf.proj import * + +# A Flow(target, source, command) produces .rsf by running . +# "None" as the source means the flow has no RSF input. +Flow('spike', None, 'spike n1=1000 k1=300') + +# Multi-stage flow via pipe. The "sf" prefix is implicit inside a Flow string. +Flow('filter', 'spike', 'bandpass fhi=2 phase=y') + +# A parameter can be interpolated at SConstruct time. +fhi = 4 +Flow('filter2', 'spike', 'bandpass fhi=%g phase=y' % fhi) + +# A Plot wraps a vplot program. It produces a .vpl file but is not a "final" result. +Plot('filter', 'wiggle clip=0.02 title="Filtered spike"') +Plot('filter2', 'wiggle clip=0.02 title="Filter (fhi=%g)"' % fhi) + +# Result == Plot, but marks the output as a deliverable (goes into Fig/ and +# is shown by `scons view`). +Result('filter', 'wiggle clip=0.02 title="Filtered spike"') + +End()