Commit e0f4de0

docs: address review feedback on staged-insert page

From @MilagrosMarin's review on #175:

- Drop the inaccurate '<blob@> written via a file handle' claim. staged_insert.py:100-101 explicitly rejects anything except codec name == 'object', so only <object@> is supported. Note the actual error behavior instead.
- Drop 'content hash for hash-addressed codecs' from the metadata list. _compute_metadata always sets hash: None for both directory and single-file branches; no hash is ever computed.
- Mention the named-store form '<object@name>' alongside '<object@>' in the Table.staged_insert1 API reference.
- Add a Limitations bullet noting that cleanup catches Exception not BaseException, so KeyboardInterrupt mid-write can leave staged objects behind; point to the garbage-collection how-to.

1 parent f45c6d7 commit e0f4de0

1 file changed

Lines changed: 4 additions & 3 deletions

src/how-to/staged-insert.md

@@ -12,7 +12,7 @@ This pattern is the right choice when:
 - You want to stream or write in chunks rather than buffer in memory
 - You want all-or-nothing semantics across object storage and the database
 
-It is only available for object-typed fields (`<...@>` syntax) and codecs that support direct storage handles — primarily `<object@>` (Zarr / HDF5 / multi-file) and `<blob@>` written via a file handle. For ordinary inserts of small or in-memory objects, use [`insert` / `insert1`](insert-data.md).
+It is only available for `<object@>` fields — the schema-addressed codec used for Zarr arrays, HDF5 files, and other multi-file objects. Attempting `staged.store()` or `staged.open()` on a field of any other type raises `DataJointError`. For ordinary inserts of small or in-memory objects, use [`insert` / `insert1`](insert-data.md).
 
 ## Quick Start
 
@@ -61,7 +61,7 @@ Inside the `with` block, the row is a draft — `staged.rec` collects attribute
 
 When the block exits without an exception, DataJoint:
 
-1. Computes object metadata (size, manifest, content hash for hash-addressed codecs) from the staged objects.
+1. Computes object metadata (size, manifest) from the staged objects.
 2. Inserts the row into the database with the populated metadata.
 
 When the block raises, DataJoint:
@@ -80,7 +80,7 @@ with Table.staged_insert1 as staged:
 ...
 ```
 
-Context manager property on every `dj.Table` subclass. Yields a `StagedInsert` object scoped to one row.
+Context manager property on every `dj.Table` subclass. Yields a `StagedInsert` object scoped to one row. Writes go to the store referenced by the field's type spec — `<object@>` uses `stores.default`, and `<object@name>` uses the named store.
 
 ### `staged.rec`
 
@@ -171,6 +171,7 @@ If the database insert itself fails on exit (e.g., duplicate primary key), the s
 - Only one row per block — use a loop of `with` blocks for many rows, or use the standard `insert` for batches that fit in memory.
 - The block must set all primary key fields before calling `store()` or `open()`.
 - Requires `stores.default` configured, or a named store referenced by the field's type spec.
+- Cleanup only runs for ordinary exceptions. `KeyboardInterrupt` (Ctrl+C) and other `BaseException` subclasses bypass the cleanup path, so a process killed mid-write may leave staged objects behind. Run the garbage collector to reclaim them — see [Clean Up Storage](garbage-collection.md).
 
 ## Troubleshooting
 
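
Note on the new Limitations bullet: the behavior it describes follows from Python's exception hierarchy, where `KeyboardInterrupt` and `SystemExit` derive from `BaseException` rather than `Exception`. The sketch below is a standalone illustration of that mechanism, not DataJoint's code; `run_with_cleanup` and `simulate` are hypothetical helpers invented for this example.

```python
def run_with_cleanup(action, cleanup):
    """Hypothetical helper: run `action`, calling `cleanup` only for ordinary exceptions."""
    try:
        action()
    except Exception:
        # KeyboardInterrupt and SystemExit are not subclasses of Exception,
        # so they skip this handler and propagate without cleanup.
        cleanup()
        raise


def simulate(exc_type):
    """Return True if cleanup ran when the action raised `exc_type`."""
    cleaned = []

    def action():
        raise exc_type("simulated")

    try:
        run_with_cleanup(action, lambda: cleaned.append(True))
    except BaseException:
        # Swallow the re-raised error so the result can be inspected.
        pass
    return bool(cleaned)


print(simulate(ValueError))         # cleanup runs
print(simulate(KeyboardInterrupt))  # cleanup is skipped
```

Catching `BaseException` (or using `try`/`finally`) would cover the interrupt case as well; whether that is appropriate for the staged-insert cleanup path is a design decision outside the scope of this commit.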