docs: address review feedback on staged-insert page
From @MilagrosMarin's review on #175:
- Drop the inaccurate '<blob@> written via a file handle' claim.
staged_insert.py:100-101 explicitly rejects anything except codec
name == 'object', so only <object@> is supported. Note the actual
error behavior instead.
- Drop 'content hash for hash-addressed codecs' from the metadata
list. _compute_metadata always sets hash: None for both directory
and single-file branches; no hash is ever computed.
- Mention the named-store form '<object@name>' alongside '<object@>'
in the Table.staged_insert1 API reference.
- Add a Limitations bullet noting that cleanup catches Exception not
BaseException, so KeyboardInterrupt mid-write can leave staged
objects behind; point to the garbage-collection how-to.
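For the second bullet, the resulting metadata shape can be pictured with a small sketch. This is not DataJoint's `_compute_metadata`: the function name `compute_metadata_sketch` and the manifest layout (a list of relative paths with sizes) are assumptions made for illustration. Only the `hash: None` behavior in both branches comes from the review note above.

```python
from pathlib import Path


def compute_metadata_sketch(path: Path) -> dict:
    """Hypothetical stand-in for the metadata step; not DataJoint's code."""
    if path.is_dir():
        # Directory branch: list every file relative to the root, with sizes.
        # (The manifest layout here is an assumption.)
        files = sorted(p for p in path.rglob("*") if p.is_file())
        manifest = [
            {"path": p.relative_to(path).as_posix(), "size": p.stat().st_size}
            for p in files
        ]
        size = sum(entry["size"] for entry in manifest)
    else:
        # Single-file branch: no manifest, just the file size.
        manifest = None
        size = path.stat().st_size
    # Per the review note, no hash is computed in either branch.
    return {"size": size, "manifest": manifest, "hash": None}
```

Both branches return the same keys, which is why the docs can list size and manifest but should not promise a content hash.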
src/how-to/staged-insert.md: 4 additions & 3 deletions
@@ -12,7 +12,7 @@ This pattern is the right choice when:
 - You want to stream or write in chunks rather than buffer in memory
 - You want all-or-nothing semantics across object storage and the database
 
-It is only available for object-typed fields (`<...@>` syntax) and codecs that support direct storage handles — primarily `<object@>` (Zarr / HDF5 / multi-file) and `<blob@>` written via a file handle. For ordinary inserts of small or in-memory objects, use [`insert` / `insert1`](insert-data.md).
+It is only available for `<object@>` fields — the schema-addressed codec used for Zarr arrays, HDF5 files, and other multi-file objects. Attempting `staged.store()` or `staged.open()` on a field of any other type raises `DataJointError`. For ordinary inserts of small or in-memory objects, use [`insert` / `insert1`](insert-data.md).
 
 ## Quick Start
 
@@ -61,7 +61,7 @@ Inside the `with` block, the row is a draft — `staged.rec` collects attribute
 
 When the block exits without an exception, DataJoint:
 
-1. Computes object metadata (size, manifest, content hash for hash-addressed codecs) from the staged objects.
+1. Computes object metadata (size, manifest) from the staged objects.
 2. Inserts the row into the database with the populated metadata.
 
 When the block raises, DataJoint:
@@ -80,7 +80,7 @@ with Table.staged_insert1 as staged:
 ...
 ```
 
-Context manager property on every `dj.Table` subclass. Yields a `StagedInsert` object scoped to one row.
+Context manager property on every `dj.Table` subclass. Yields a `StagedInsert` object scoped to one row. Writes go to the store referenced by the field's type spec — `<object@>` uses `stores.default`, and `<object@name>` uses the named store.
 
 ### `staged.rec`
 
@@ -171,6 +171,7 @@ If the database insert itself fails on exit (e.g., duplicate primary key), the s
 - Only one row per block — use a loop of `with` blocks for many rows, or use the standard `insert` for batches that fit in memory.
 - The block must set all primary key fields before calling `store()` or `open()`.
 - Requires `stores.default` configured, or a named store referenced by the field's type spec.
+- Cleanup only runs for ordinary exceptions. `KeyboardInterrupt` (Ctrl+C) and other `BaseException` subclasses bypass the cleanup path, so a process killed mid-write may leave staged objects behind. Run the garbage collector to reclaim them — see [Clean Up Storage](garbage-collection.md).
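
The Limitations bullet added in the last hunk rests on a general Python behavior that is easy to reproduce outside DataJoint. The sketch below is not the library's cleanup code. `staged_dir_sketch` and `run` are hypothetical helpers that imitate only the `except Exception` pattern named in the commit message, with a temporary directory standing in for the object store.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def staged_dir_sketch(root: Path):
    """Hypothetical staging helper; mirrors the `except Exception` pattern only."""
    staged = Path(tempfile.mkdtemp(dir=root))
    try:
        yield staged
    except Exception:
        # Ordinary errors: remove the partially written objects, then re-raise.
        shutil.rmtree(staged, ignore_errors=True)
        raise
    # KeyboardInterrupt and SystemExit derive from BaseException, not Exception,
    # so they skip the handler above and the staged directory is left behind.


def run(root: Path, exc: BaseException) -> list:
    """Raise `exc` inside the block and report what is left under `root`."""
    try:
        with staged_dir_sketch(root) as staged:
            (staged / "chunk.bin").write_bytes(b"partial")
            raise exc
    except BaseException:
        pass  # swallow the synthetic exception so the caller can inspect `root`
    return sorted(p.name for p in root.iterdir())
```

Calling `run` with a `ValueError` leaves `root` empty, while calling it with a `KeyboardInterrupt` leaves one orphaned staging directory. That is the situation the garbage-collection how-to is meant to recover from.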