Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-35174: Add descriptions to diaObject columns #68

Merged
merged 1 commit into from Jun 27, 2022
Merged

Conversation

yalsayyad
Copy link
Contributor

and bring the non-band specific columns:
diaObjectId, ra, decl, radecTai, and nDiaSources to top of column list.

@gpdf
Copy link
Collaborator

gpdf commented Jun 23, 2022

@yalsayyad I'm puzzled. It looks like this PR does change the order of the column clauses in the YAML file, but I thought @iagaponenko warned us against that?

description:
- name: diaObjectId
"@id": "#DiaObject.diaObjectId"
description: Decl-coordinate of the position of the object at time radecTai
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per RFC-863, can we start referring to these as "Right Ascension" and "Declination" in the descriptions, and not propagating the "decl" naming if we can help it?

@yalsayyad
Copy link
Contributor Author

Ah, I misunderstood what @iagaponenko was saying. He's saying that he already did ingest steps 1-3 and doesn't want to redo them.

I'll put them back in original order the dp02_dc2.yaml.

@iagaponenko
Copy link

Probably I won't redo the Parquet-to-CSV translation and the ingest ... unless a problem will be found in the Parquet files.

Copy link
Contributor

@fritzm fritzm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks!

and bring diaObjectId, ra, decl, radecTai, and nDiaSources
to top of column list.
@yalsayyad yalsayyad merged commit 866407a into main Jun 27, 2022
@yalsayyad yalsayyad deleted the tickets/DM-35174 branch June 27, 2022 18:22
Copy link
Contributor

@parejkoj parejkoj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the descriptions should all have punctuation at the end? A few do, but most do not.

I struggled with whether we should be more explicit about the "summary statistics of diaSource things" values, stating that they come from a bunch of diaSources. That's fairly obvious if you understand diaSource vs. diaObject, but maybe less so if you're new?

- name: diaObjectId
"@id": "#DiaObject.diaObjectId"
datatype: long
mysql:datatype: BIGINT
description:
description: Unique id.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth expanding on this to say "Unique identifier for this object" or something? Maybe it's obvious from context, so the description is mostly redundant with the name?

- name: gPSFluxErrMean
"@id": "#DiaObject.gPSFluxErrMean"
datatype: double
mysql:datatype: DOUBLE
description:
description: Mean of the diaSource PSF flux errors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all these things that are "mean of diaSource X": should we expand that to e.g. "Mean of the individual diaSource X" or something? Someone not entirely familiar with diaSource vs. diaObject might need a little more explicitness, maybe?

- name: gPSFluxLinearIntercept
"@id": "#DiaObject.gPSFluxLinearIntercept"
datatype: double
mysql:datatype: DOUBLE
description:
description: y-intercept of a linear model fit to diaSource PSF flux vs time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's probably fine to not have the band mentioned in these kinds of repeated comments, since it's in the field name, but the band name does show up in the descriptions of e.g. gPSFluxSigma and gPSFluxChi2, so we're not fully consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, the intend was to only reference bands when quoting the name of another column, but I see that in e.g. gPSFluxChi2, it references gPSFlux which isn't a column. Will fix that.

I didn't add a blanket "for diaSources detected in z-band visits" for all the columns, though would if it would help.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't know if that blanket statement is really necessary, but I'm not sure.

- name: gPSFluxMAD
"@id": "#DiaObject.gPSFluxMAD"
datatype: double
mysql:datatype: DOUBLE
description:
description: Median absolute deviation of diaSource PSF flux. Does not include scale factor for comparison to sigma
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second sentence is a bit context free here: should we explicitly state that scale factor in the description (1.4-whatever assuming normal errors)?

Should we say "sigma" in that last sentence, or "standard deviation" (the former is the field name to compare to, the latter is the description of that field).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe " Multiply by 1.4826 to compare to a normal standard deviation"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's good, instead of the last sentence.

- name: gPSFluxMaxSlope
"@id": "#DiaObject.gPSFluxMaxSlope"
datatype: double
mysql:datatype: DOUBLE
description:
description: Maximum ratio of time ordered deltaFlux / deltaTime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"... for each diaSource"?

- name: gPSFluxMean
"@id": "#DiaObject.gPSFluxMean"
datatype: double
mysql:datatype: DOUBLE
description:
description: Weighted mean of diaSource PSF flux
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weighted by diaSource fluxErr, or what?

Copy link
Contributor Author

@yalsayyad yalsayyad Jun 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about Mean of individual diaSource PSF fluxes weighted by the inverse square of their fluxErr

- name: gTOTFluxMean
"@id": "#DiaObject.gTOTFluxMean"
datatype: double
mysql:datatype: DOUBLE
description:
description: Weighted mean of the PSF flux forced photometered at the diaSource position on the calibrated image
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"calibrated science image"?

- name: gTOTFluxSigma
"@id": "#DiaObject.gTOTFluxSigma"
datatype: double
mysql:datatype: DOUBLE
description:
description: Standard deviation of the PSF flux forced photometered at the diaSource position on the calibrated image
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Standard deviation of the individual diaSource TOTFluxMean forced photometered measurements"?

- name: pixelId
"@id": "#DiaObject.pixelId"
datatype: double
mysql:datatype: DOUBLE
description:
description: HtmIndex20 of ra, decl coordinate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of ra,dec coordinate

Copy link
Contributor Author

@yalsayyad yalsayyad Jun 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if I'm referring to the columns in this table named ra, decl?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm torn on this one: Maybe leave it decl, but make sure we note to fix the description in future files?

- name: radecTai
"@id": "#DiaObject.radecTai"
datatype: double
mysql:datatype: DOUBLE
description:
description: Time at which the object was at a position ra/decl.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ra/dec, or leave decl here for now?

@parejkoj
Copy link
Contributor

Checking on this: can we merge an updated version of this now, with the corrections in my comments above? Otherwise it would be easy for this to get lost.

@parejkoj
Copy link
Contributor

parejkoj commented Sep 9, 2022

Checking on this again: can we get the updated version of this merged? I believe what was merged was before I left my comments above. I'd rather not have these updates get lost.

@fritzm
Copy link
Contributor

fritzm commented Sep 9, 2022

@parejkoj I think @yalsayyad is the keeper of these changes, and I believe she is currently on vacation?

@parejkoj
Copy link
Contributor

Pinging this one again: can we get my proposed changes merged so they aren't lost?

@fritzm
Copy link
Contributor

fritzm commented Oct 30, 2022

Repeat: this is is @yalsayyad's hands... Tagging her as reviewer, and removing myself (oh, I can do neither, apparently, since this PR was technically already "merged"...)

@parejkoj
Copy link
Contributor

@yalsayyad : I'll note that I am not attempting to integrate most of the documentation changes I proposed in my comments above into the changes I'm making in DM-37196. I still think my suggestions above should be made to imsim.yaml and hsc.yaml.

@gpdf
Copy link
Collaborator

gpdf commented Dec 12, 2023

@parejkoj just called my attention to this PR. Can I make sure I understand the situation here? John, you want to make sure that the comments you made above, post-merge, are not lost, but there's no current branch on which those comments have been implemented?

(@JeremyMcCormick as well)

@parejkoj
Copy link
Contributor

I don't know whether my comments above were ever incorporated into anything that got merged or not. Some of them were done in later work (e.g. the decl->dec changes), but I don't think all?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants