DM-42435: Migrate DM schemas from ccdVisitId to (visit, detector) #211

parejkoj · 2024-04-26T18:40:03Z

(note added by @gpdf 2024-06-28)

This ticket arose from RFC-975 (Switch APDB schema from ccdVisitId to visit and detector) but was extended during the implementation to cover the Source-like DRP tables as well.

kfindeisen · 2024-05-02T22:28:29Z

yml/apdb.yaml

@@ -1,7 +1,7 @@
 ---
 name: "ApdbSchema"
 "@id": "#apdbSchema"
-version: "0.1.1"
+version: "1.0.0"


Should this be 1.0 and not, say, 0.2? What makes this version more or less "pre-release" than the old one?

We're using a sort-of semantic versioning system here, and this is a fully breaking non-backwards compatible change, hence the major number bump. Version 0.1.1 is the first "there are versions", while anything before that is considered 0.1.0.

https://dmtn-269.lsst.io#version-compatibility

Semantic versioning does not require that versions 0.x be mutually compatible, so that's no argument either way.

See the system we're using (following the one in daf_butler) in the link above: this is a major number bump, per that definition (@andy-slac can correct me if I'm wrong here).

It's similar to the versioning system for the alert packets.

I need to add that the exact meaning of the minor version number is not so well-defined now. Major and patch roles are very clear as non-compatible and compatible (and change on this PR is clearly non-compatible). Minor version role is defined now as "backward-compatible", but it is not super-clear for now what that means, as it probably implies that something on the client side may be responsible for that compatibility. In Butlerland we define minor version compatibility as "read-compatible", but for APDB that notion probably does not matter at all as APDB's primary client is AP pipelines which are always writing.

Anyways, I think we may eventually figure out minor version meaning, for now I expect changes on schema side that are either non-compatible or fully-compatible.

Additionally, PPDB schema is supposed to reflect APDB schema, so there are probably similar versioning issues on that side, but I'm not ready to discuss those now. And potential schema migration for PPDB will be a separate huge topic.

For what it's worth, for the alerts schemas we are using the FORWARD_TRANSITIVE type of the Confluent compatibility model--see https://dmtn-093.lsst.io/

andy-slac

Looks good, few minor comments. While we are at it, could we drop Instrument table from APDB schema? It was never used by dax_apdb and we now have a generic metadata table that should be used to store instrument name.

andy-slac · 2024-05-03T17:16:17Z

yml/apdb.yaml

+  - name: IDX_DiaSource_detectorVisit
+    "@id": "#IDX_DiaSource_detectorVisit"


Maybe change the name to IDX_DiaSource_visitDetector to reflect the order of the columns in the index.

Good catch, thanks! I've tried to be consistent about visit,detector wherever I can (e.g. in function signatures), but missed this one.

andy-slac · 2024-05-03T17:21:46Z

yml/apdb.yaml

+    - "#DiaForcedSource.detector"
+    - "#DiaForcedSource.visit"


Change the order to visit, detector. May not be very important, but it's better to keep it consistent with DiaSource. Also it can be useful for queries like SELECT * from DiaForcedSources WHERE visit = 1000. I don't think that query SELECT * from DiaForcedSources WHERE detector = 10 is useful.

We're definitely more likely to search on visit, but I suspect searches on just detector will be useful for debugging, e.g. "does detector 10 always have some kind of failure mode in its sources?"

If those queries are important/frequent, then maybe worth adding an extra index? I guess we'll need a systematic approach to figure out performance issues later.

yml/apdb.yaml

andy-slac · 2024-05-03T17:27:44Z

yml/apdb.yaml

    description: Id of the detector on the instrument that took this visit.
+      Datatype short instead of byte because of DB concerns about unsigned bytes.
+    ivoa:ucd: meta.id;obs.image
+    # TODO: we probably need to add an index on the above two for this table.


Primary key would be nice to have too. Are there multiple processings for visit/detector? How are they distinguished here?

@ebellm : do we have a plan for multiple processings to go into the same APDB, and if so, how will we distinguish them in DetectorVisitProcessingSummary?

Maybe we can leave this question for the implementation ticket of DetectorVisitProcessingSummary, since we aren't writing anything to it? That would be DM-39497.

Although, bundling as many schema changes now as we can would be good, it might be better to not block on a table we're not yet using.

The table is meant to allow multiple processings of the same raws (that's why it's a processing summary), so there should indeed be a key on processing. If we go the direction we're thinking on DM-24858, that could be the "release ID" that us used in the ID generators (0 for AP in PP, 1 or 2 for a reprocessing).

Offline/precursor reprocessings go in separate schemas entirely, of course.

I think it's fine to defer this to DM-39497.

This functionality will be implemented in the metadata table.

parejkoj force-pushed the tickets/DM-42435 branch 5 times, most recently from b72365e to 3beba3a Compare May 1, 2024 17:49

kfindeisen reviewed May 2, 2024

View reviewed changes

andy-slac approved these changes May 3, 2024

View reviewed changes

parejkoj force-pushed the tickets/DM-42435 branch from e82cba8 to 2993751 Compare May 3, 2024 18:06

parejkoj added 4 commits May 3, 2024 11:06

Add detector/visit columns to apdb.yaml

367df67

Add visit/detector columns to imsim.yaml

c9c87a2

Remove ccdVisitID from Source in hsc.yaml

ccb7fcc

Remove instrument table

137fd50

This functionality will be implemented in the metadata table.

parejkoj force-pushed the tickets/DM-42435 branch from 2993751 to 137fd50 Compare May 3, 2024 18:06

parejkoj merged commit 120a2ea into main May 6, 2024
4 checks passed

parejkoj deleted the tickets/DM-42435 branch May 6, 2024 20:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DM-42435: Migrate DM schemas from ccdVisitId to (visit, detector) #211

DM-42435: Migrate DM schemas from ccdVisitId to (visit, detector) #211

parejkoj commented Apr 26, 2024 •

edited by gpdf

Loading

kfindeisen May 2, 2024

parejkoj May 2, 2024

kfindeisen May 2, 2024

parejkoj May 2, 2024 •

edited

Loading

andy-slac May 3, 2024

ebellm May 3, 2024

andy-slac left a comment

andy-slac May 3, 2024

parejkoj May 3, 2024

andy-slac May 3, 2024

parejkoj May 3, 2024

andy-slac May 3, 2024

andy-slac May 3, 2024

parejkoj May 3, 2024

parejkoj May 3, 2024

ebellm May 3, 2024 •

edited

Loading

		- name: IDX_DiaSource_detectorVisit
		"@id": "#IDX_DiaSource_detectorVisit"

DM-42435: Migrate DM schemas from ccdVisitId to (visit, detector) #211

DM-42435: Migrate DM schemas from ccdVisitId to (visit, detector) #211

Conversation

parejkoj commented Apr 26, 2024 • edited by gpdf Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

parejkoj May 2, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

andy-slac left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ebellm May 3, 2024 • edited Loading

Choose a reason for hiding this comment

parejkoj commented Apr 26, 2024 •

edited by gpdf

Loading

parejkoj May 2, 2024 •

edited

Loading

ebellm May 3, 2024 •

edited

Loading