New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-9530: Initial metric and specification layout design #1
Conversation
1835b04
to
0441ce2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Several comments.
@@ -2,4 +2,28 @@ | |||
lsst.validate.metrics | |||
##################### | |||
|
|||
**Centralized metric and specification definitions for the ``lsst.validate`` framework.** | |||
**Centralized metric and specification definitions for the lsst.validate framework.** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably have a paragraph in here about what this package really is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed some new README content. Let me know if those are to your liking. I don't want to document too much to early since we're still prototyping ideas in sqr-017.lsst.io.
doc/README.md
Outdated
@@ -0,0 +1,8 @@ | |||
Sphinx documentation support is still be developed. For now, install dependencies listed in `requirements.txt`. | |||
|
|||
To make the docs for `validate_base` alone: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
validate_metrics
here...
@@ -0,0 +1,7 @@ | |||
Place static files for `validate_base` docs, such as images and sample data, in this directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably just do a package grep for validate_base
, since there's more here.
doxygen_xml_dirname=None)) | ||
|
||
intersphinx_mapping['astropy'] = ('http://docs.astropy.org/en/stable', None) | ||
intersphinx_mapping['validate_base'] = ('https://validate-base.lsst.io', None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another validate_base
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one's legit; it lets us reference the validate_base
user guide from a (future) validate_metrics
user guide.
metrics/validate_drp.yaml
Outdated
|
||
PA1: | ||
description: > | ||
The maximum rms of the unresolved source magnitude distribution around the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should do yaml in the "docs" way of only making newlines on new sentences, instead of enforcing a hard column limit. These are closer to docs than code, I think.
page: 23 | ||
|
||
AD3: | ||
reference: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we enforce (at least in our examples here?) an ordering of the items as "description", "unit", "reference", just for consistency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could validate, in code, ordering if we use ruamel.yaml but that's not a stack dependency (yet). Maybe call this order a best practice for now and use it in all our examples.
operator: '<=' | ||
|
||
--- | ||
# validate_drp.PA2_minimum_gri.srd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these comments here just helpers for a human reader?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, what do you think? I thought it'd be a nice practice to give the fully qualified name of a specification, especially when it's a highly inherited specification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think that's useful.
specs/validate_drp/README.rst
Outdated
specs/validate_drp | ||
################## | ||
|
||
Specifications for ``validate_drp`` metrics are defined in this directory as YAML documents. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should have another sentence or two about what validate_drp metrics are for.
specs/validate_drp/cfht_gri.yaml
Outdated
--- | ||
id: 'base' | ||
provenance_query: | ||
# Prototype for query provenance. This needs to be figured out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a TODO: at the start of this comment.
specs/validate_drp/cfht_gri.yaml
Outdated
@@ -0,0 +1,56 @@ | |||
# Specifications for validate_drp metrics against validation_data_cfht | |||
# gri-band datasets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Say something about what the fact that these have provenance means.
d6145b9
to
82e724b
Compare
- Minimal Python package (lsst.validate.metrics) to permit version discovery via sconsUtils version.py. Otherwise, lsst.validate.metrics does not contain Python code. It's meant as a standalone data package that contains YAML definition files that are nominally consumed by validate_base. - Uses prototype Sphinx documentation framework (same as in validate_base). - Adopting RFC-45-inspired license management (COPYRIGHT file with references to that file from source files). - Includes GitHub issue/PR/contributing files in .github
These metric definitions are ported from github.com/lsst/validate_drp's metrics/, and distilled into the form being prototyped for validate_metrics. - Each eups package now has a YAML file, named after the package, in metrics/. - metrics/validate_drp.yaml, for example, specifically only defines metrics; it doesn't include specifications. This will allow metrics to be fairly static, while any LSST staff can separately add additional specifications. - We're adopting astropy.units definitions for percentages (%) and unitless unscaled numbers (---). No more empty string units. - We're adopting the docs convention of one-sentence-per-line for metric descriptions.
This shows how specifications for PA1, from the SRD, can be written. In LPM-17-PA1.yaml, specifications for PA1 are written in their most basic and general form. A `#PA1-base` *partial specification* contains common data for all PA1 specifications, namely the units, comparison operator, and metric name. (Our suggested best practice is to have one YAML file per metric for these types of base specification definitions.) Then in LPM-17-PA1.yaml, specifications inherit from `#PA1-base`, adding a unique name and the specification value. The name **suggests** applicable bandpasses, but this provenance query is not established here. This design allows validate_drp to test measurements against the SRD specifications regardless of the provenance of the dataset. Then in cfht_gri.yaml we add specifications that specifically apply to measurements made against the validation_data_cfht data repository in specific bandpasses. These *additional* specifications permits selective monitoring of specification tests.
These metrics are interesting because, in the SRD, the metric measurement is written such that a specification of the other metric is a configuration. In the early validate_base framework these are called "dependent metrics." Here we've decided to unroll these dependencies by creating a metric for each metric measurement function configuration. This means there are dimensions of stretch/design/minimum, and also gri/uzy. Then the specifications are simply named 'srd' since they are the SRD specification of that metric. In cfht_gri.yaml, I've added specifications that inherit from the srd specifications and mix in the provenance query for CFHT gri validation data.
82e724b
to
0724df5
Compare
We don't have real documentation yet (a sphinx user guide), but the main README and READMEs in the metrics/ and specs/ directories should help orient developers for now.
0724df5
to
254e627
Compare
I think I've addressed your comments @parejkoj, at least as an MVP. |
Design features:
/metrics/
, each in a single yaml file named after the package./specs/<package_name>/
. Specifications can be spread across multiple files.This PR includes
validate_drp
metrics, LPM-17 specifications forPA1
,PF1
andPA2
, and provenance-specification specifications for the CFHT gri-band validation dataset as examples.Sample metric:
Specification schema
A complete specification looks like:
The fully qualified name of this specification is
validate_drp.PA1.design_gri
(assuming the specification is defined in/specs/validate_drp/
).The
provenance_query
field is a hash/dictionary of query terms for provenance that this specification can be applied to. The query language is currently undefined, so the example is a pseudocode query where thefilter
must be one ofg
,r
ori
.Specification partials
Specifications might repeat information. For example, a
provenance_query
for a certain test dataset. We DRY this with partials. A partial has anid
field, and can't be a specification on it's own.For example:
A partial can be reference from the
base
field by prefixing theid
with#
. Partials can also be referenced from across files (but within the same package's specs directory) by providing a filename:Specification inheritance
Specifications can also inherit from specifications; generally to add partials. Specifications are referenced through their fully qualified name (or the FQN without the package name). Here's an example from this PR:
The fully-hydrogated
validate_drp.PA1.minimum_gri
specification is:The fully-hydrated
validate_drp.PA1.cfht_minimum_gri
specification is: