New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-13496: Prototype Butler Registry schema definition in YAML #13
Conversation
1a96b2a
to
9109263
Compare
config/registry/default_schema.yaml
Outdated
name: Visit | ||
dependencies: | ||
- Camera | ||
- PhysicalFilter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need some way of distinguishing between these relationships in this structure:
camera
is part ofVisit
's primary key, and needs to be provided along with thevisit
number field to uniquely identify aVisit
.physical_filter
is not a part ofVisit
's primary key, and is not necessary to uniquely identify aVisit
.
config/registry/default_schema.yaml
Outdated
doc: > | ||
An inclusive range of Exposures that may be open in either direction. There | ||
is no SQL representation; its fields (duplicated below) are all part of its | ||
compound primary key and are hence just present in Dataset. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have a machine-readable way to represent the fact that the columns here are just fields that must be present Dataset
, not something that requires an new Table to be created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thought on that was that if all columns are values then no table is needed and all that exist are those columns in the Datasets
table (e.g. the respective DataUnit
table is skipped). Otherwise a separate table is created for this DataUnit
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By "values", you mean "appears in the Dataset
table"? I don't see those indicated some other way (though I think the condition you want might actually be "if all columns are primary keys").
In any case, even that condition isn't enough, because it can't distinguish between "have a definition-only table" and "no table".
config/registry/default_schema.yaml
Outdated
Link to the Quantum table (only present for SuperTask-generated Datasets). | ||
# DataUnit values | ||
- !Column | ||
<<: *camera |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these going to inherit the primary_key
attributes these column definitions have in the DataUnits
section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, unless overridden in the same way as nullable
is.
config/registry/default_schema.yaml
Outdated
name: parent_dataset_id | ||
type: int | ||
primary_key: true | ||
nullable: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In SQL, primary_key=True
always implies nullable=False
. Should we take advantage of that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably, although I'd want to check the standard on that to be sure (e.g. the actual primary key in sqllite
is some auto-generated thing that you don't see so perhaps it does allow it to be nullable
? Not sure.
3562207
to
97e7517
Compare
- | ||
name: physical_filter | ||
type: string | ||
foreign_key: PhysicalFilter.physical_filter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this have one link
(physical_filter
) or two (physical_filter
and camera
)? I think you have all of the necessary information here, but just to confirm it's structured:
- No two
DataUnits
may have the samelink
. - A single
DataUnits
may (rarely) have more than onelink
(e.g.ExposureRange
). - A
DataUnit
is fully specified by itslink
and thelink
s of allrequired
dependencies
. If theDataUnit
has a table, "fully specified" means those columns correspond to its (compound) primary key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is correct. I am assuming that when a link is inserted the dependencies are first expanded and also included. Thus dupplicate links are not allowed.
Exposure: | ||
dependencies: | ||
required: | ||
- Camera |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also:
optional:
- Visit
right?
(Just making sure this an accidental omission, not something about the structure I don't understand.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this was an accidental omission (there may be more).
97e7517
to
a479a0b
Compare
No description provided.