Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Completely sync tiles-specstatus #1976

Merged
merged 4 commits into from Jan 27, 2023
Merged

Completely sync tiles-specstatus #1976

merged 4 commits into from Jan 27, 2023

Conversation

schlafly
Copy link
Contributor

This PR changes the behavior of desi_update_tiles_specstatus to completely update the tiles-specstatus.ecsv file rather than only updating tiles with updated LASTNIGHT.

I have changed the previous --all argument to be the default, and replaced it with an --only argument to revert to the original behavior.

I also added a --clear-qa option that sets the QA = None. This option can only be used when --tileid is specified (we don't want to accidentally allow the whole QA column to be blasted). This is better than our current approach of manually editing the file to clear QA status after reprocessing.

I updated the tests to conform to the new interface & defaults (basically moving update_all = True to update_only = False).

I went through somewhat laboriously and tried to make sure that all of the changes that this now wants to make make sense. There are 375 changed rows. See below for the gory details. In short, I think most of the changes are expected improvements that have come from reprocessing & changing pipeline behavior. There are a few cases I haven't tracked down, though. I don't think they should hold up this PR but we could imagine making separate tickets for them.

To wit:

  • tileid 80607 has RA = DEC = 0 in tiles-daily.csv, but something else in tiles-specstatus.ecsv. I don't understand this but want to treat it as a bug in tiles-daily rather than in this code.
  • GOALTIME = 1000 for SV tiles in tiles-daily.csv, while it had more complicated logic in whatever led to tiles-specstatus.
  • GOALTYPE = unknown -> other for precision dither tiles, and dark -> other for specialm31. Weird, but if we want to address this, we should address it at tiles-daily.
  • It seems like EFFTIME_GFA can change during reprocessing; that surprises me.
  • I haven't understood TILEID 42426, but that tile generated a desisurveyops ticket and a desispec PR having to do with the way that exposures are selected for inclusion in QA plots, so it's probably okay. It's a backup tile anyway.

--

differences between tiles-specstatus and tiles-daily:
total number of differences: 375
number of rows different by column:
NEXP 32
EXPTIME 32
TILERA 1
TILEDEC 1
EFFTIME_ETC 24
EFFTIME_SPEC 89
EFFTIME_GFA 229
GOALTIME 61
OBSSTATUS 52
LRG_EFFTIME_DARK 87
ELG_EFFTIME_DARK 93
BGS_EFFTIME_BRIGHT 95
LYA_EFFTIME_DARK 92
GOALTYPE 27
LASTNIGHT 14

Digging into some of these:

  • NEXP: checked one case; this had two bad exposures that weren't counted in the old specstatus and were counted in the new one. This seems harmless and we'd like consistency, so I'm ignoring these. NEXP is always higher now.
  • EXPTIME: same as NEXP, always larger now.
  • TILERA/TILEDEC: one tile 80607. It has TILERA = TILEDEC = 0 in tiles-daily.ecsv?! Needs resolution.
  • GOALTIME: it looks like the old GOALTIMEs tried to get something for SV tiles; the new ones just have 1000 s for all. only affects cmx/sv1 80xxx tiles that I think we can safely ignore for the tiles-specstatus file.
  • GOALTYPE: only affects dithprec and specialm31 tiles. all other in new. dark & unknown in old.
    dithprec: unknown -> other
    specialm31: dark -> other
    we don't care? goaltype is actually slightly important for specialm31 tileid 82635 and dark is probably better. But I don't know that we need to guarantee a particular behavior for special tiles.
  • OBSSTATUS: tiles-daily is the authority here, so tiles-specstatus should just copy this. Most of these are cases where GOALTIME changed or NEXP changed. The remaining tiles are all ones that we just observed.
  • LASTNIGHT: tiles observed in the last couple of nights, and 20220416, backup tile 42426. Presumably 42426 has another reason to be reprocessed that I'll discover later.
  • EFFTIME_ETC: these are all cases with NEXP changes.
  • EFFTIME_SPEC: 76 of these where the NEXP agree. The expectation is that these are all reprocessings.
    All of these have at least two nights between LASTNIGHT and the day of observation (and many have much longer), with the exception of two cases: 22889, 4005. 22889 changes by less than 1% and I see some chatter about it having had its center tweaked; it's possible that it has a slightly different EBV because of that? I don't think this is an important difference. I was not able to track down 4005, but with this change it will be consistent with the processing we have in daily and in the archive.
  • LRG_EFFTIME_DARK, ELG_EFFTIME_DARK, BGS_EFFTIME_BRIGHT, LYA_EFFTIME_DARK: the cases here either have different NEXP or fall under EFFTIME_SPEC or are a reprocessing.
  • EFFTIME_GFA: Most of these are cases where the old EFFTIME_GFA are not finite or are 0, or where the NEXP disagree. That leaves 47. Two of the remainder have EFFTIME_SPEC = EFFTIME_GFA = 0 now, and something small before; I don't think we care about that. The remaining 45 are all reprocessings, so I guess this is okay? I am surprised that EFFTIME GFA can change in a reprocessing, but okay.
  • That leaves 42426. I haven't completely tracked this down, but this is a tile that we identified as having weird input exposures, and we fixed that in this ticket.
    get expids from all spectra files #1767
    I conclude that the new version is more likely to be right, but we could dig in further if we think it's important for this single remaining backup tile.

@schlafly schlafly requested a review from sbailey January 20, 2023 19:54
@schlafly
Copy link
Contributor Author

I've put an updated tiles-specstatus file here in case anyone wants to check my work.
https://data.desi.lbl.gov/desi/users/schlafly/surveyops3/trunk/ops/tiles-specstatus-new.ecsv

@sbailey
Copy link
Contributor

sbailey commented Jan 24, 2023

The new option --clear-qa that also requires --tileid to be set is good, and much better than the current hand edit procedure.

For the rest of the auto-update everything, big picture questions:

I think our workflow is to propagate information from daily -> tiles-specstatus, which then triggers for desi_tile_vi whether a tile is ready for human QA. What happens when that auto-update "flips the sign" for an old tile even if there were no new observations? For the case of re-processing, I think that is the desired behavior and the whole point of this PR. But what about some other change like retuning EFFTIME_SPEC vs. EFFTIME_ETC that could propagate information for a previously approved tile to take it back into a below threshold state? Can that happen? Should we perhaps have some protections that we only update tiles that haven't been human QA approved, and once it is human QA approved it really gets frozen (unless a human uses --clear-qa --tileid TILEID to reset it)? Or is that already the case?

I don't mind auto-updates that trigger human QA review. I'm still nervous about the possibility of auto-updates overriding humans by mistake. Comments?

@schlafly
Copy link
Contributor Author

schlafly commented Jan 25, 2023

Thanks for thinking carefully about this.

Reviewing what sets what...

  • desi_tile_vi only ever shows tiles with ZDONE = False. Unless tiles have been explicitly specified, also only EFFTIME >= GOALTIME * MINTFRAC, plus some additional logic on QA status that looks buggy to me but we don't actually use in practice.
  • desi_tile_vi only sets QA, USER, QANIGHT, OVERRIDE
  • desi_archive_tilenight only shows tiles with QA = good and ZDONE = False
  • desi_archive_tilenight only sets ARCHIVEDATE and ZDONE

So, in your case where we update a human approved tile, it could happen that EFFTIME < GOALTIME * MINTFRAC. But this wouldn't seem to me to have any real negative consequences or be wrong to me? i.e., sequence of events:

  • we QA a tile as good
  • we archive it (ZDONE = True, in archive, MTLed)
  • we ~reprocess or something else happens and now EFFTIME < GOALTIME * MINTFRAC; we update this value anyway
  • nothing further happens?

This case would look weird in that we would have QAed a tile that we doesn't look finished. We have the archived version so we could verify the chain if we wanted. But tiles-specstatus would have our current best sense for the EFFTIME.

If we want to keep track of ARCHIVEEFFTIME or something we could add that column and put the EFFTIMEs when archiving in?

It's technically possible for us right now to desi_tile_vi --tileid ABCDEF on some unfinished tileid ABCDEF, and mark it as good. desi_tile_vi would complain and we'd set OVERRIDE = True. Then we would archive it automatically. I am not sure if ZDONE informs obsend / done status in tiles-daily.csv (probably not), so we would proceed in observations to finish the tile. That doesn't feel like ideal behavior, but one really has to be asking for it by explicitly specifying the unfinished tile and overriding desi_tile_vi. And I don't think this particular failure mode really directly relates to this PR.

@sbailey
Copy link
Contributor

sbailey commented Jan 26, 2023

@schlafly thanks for clearly laying out who updates what columns. That scenario looks fine.

I updated the update_specstatus docstring to clarify the defaults. Please double check that; if you agree with the updated wording we can proceed with merging. Otherwise please correct what I wrote.

@sbailey
Copy link
Contributor

sbailey commented Jan 27, 2023

Eddie approved via slack chat; merging now.

@sbailey sbailey merged commit 82551ac into main Jan 27, 2023
@sbailey sbailey deleted the completely-sync-specstatus branch January 27, 2023 23:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants