Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace hard-coded paths with environment variables #2294

Merged
merged 8 commits into from
Jul 19, 2024
Merged

Conversation

weaverba137
Copy link
Member

@weaverba137 weaverba137 commented Jul 16, 2024

This PR fixes #2272.

This PR introduces a new environment variable, DESI_TILES, similar to DESI_SURVEYOPS and also uses DESI_SURVEYOPS more extensively.

DESI_TILES will be defined in desimodules (as a fix to desihub/desimodules#49).

@weaverba137 weaverba137 added the WIP Work in Progress label Jul 16, 2024
@weaverba137 weaverba137 requested a review from sbailey July 16, 2024 19:48
@weaverba137 weaverba137 self-assigned this Jul 16, 2024
@sbailey
Copy link
Contributor

sbailey commented Jul 16, 2024

To emphasize the WIP status as a warning to my future self: this requires updates to desimodules to define $DESI_TILES before it will work, and there are other places that also reference trunk that have not yet been updated (e.g. desispec.io.fibermap.assemble_fibermap). But in the spirit of WIP, the basic structure of get_readonly_filepath(os.environ['DESI_BLATFOO']) shown here looks good.

@weaverba137
Copy link
Member Author

@sbailey, while working on this, I noticed that fiberassign file names are specified in multiple ways e.g. f'{tileid // 1000:03d} versus stileid = f'{tileid:06d}'; ... stileid[:3] .... Should we define a function instead and only ever use the function to find the fiberassign file?

@sbailey
Copy link
Contributor

sbailey commented Jul 17, 2024

@sbailey, while working on this, I noticed that fiberassign file names are specified in multiple ways e.g. f'{tileid // 1000:03d} versus stileid = f'{tileid:06d}'; ... stileid[:3] .... Should we define a function instead and only ever use the function to find the fiberassign file?

Good point. Best would be to add a new filetype entry to desispec.io.findfile, e.g. "fiberassignsvn", which would then do this $DESI_TILE+tileid etc. wrangling to report the canonical name of a fiberassign file in svn. This would augment the pre-existing filetype='fiberassign' which reports where the fiberassign file is in the raw data stream.

@weaverba137
Copy link
Member Author

OK, and while working on findfile, I found this. This can't be right can it (around line 294)?

    if isinstance(tile, str):    tile = int(night)

@sbailey
Copy link
Contributor

sbailey commented Jul 17, 2024

Indeed, that whole block was introduced (by me!) in PR #1931 and is a cut-and-paste typo on multiple lines:

    #- be robust to str vs. int
    if isinstance(healpix, str): healpix = int(healpix)
    if isinstance(night, str):   night = int(night)
    if isinstance(expid, str):   expid = int(night)
    if isinstance(tile, str):    tile = int(night)
    if isinstance(spectrograph, str): spectrogrpah = int(night)

Note the "spectrogrpah" typo as well.

@weaverba137 would you like me to fix these in a separate PR (with tests!) or will you fix it in your current branch?

@weaverba137
Copy link
Member Author

I will do a minimal fix on this branch, but let's think about a separate PR to do more robust unit tests on findfile.

In addition, I think a future policy for all PRs should be "if you are trying to create a file name, you must use findfile".

@weaverba137
Copy link
Member Author

I'm moving this out of draft status.

@sbailey, please review this listing of 'trunk' occurrences. I believe these files can be ignored:

  1. etc/cron_dailytest.sh
  2. etc/desi_nersc_pipetest.py
  3. py/desispec/database/redshift.py (the entire database directory is deprecated anyway)
  4. py/desispec/test/test_photo.py
  5. py/desispec/test/integration_test.py

I'm not sure about these, because a URL is being constructed instead of a filename:

  1. py/desispec/scripts/procdashboard.py
  2. py/desispec/scripts/zprocdashboard.py

These files have been updated:

  1. py/desispec/io/photo.py
  2. bin/desi_tile_qa_reference
  3. py/desispec/io/fibermap.py
  4. py/desispec/workflow/exptable.py
  5. py/desispec/scripts/archive_night.py

@weaverba137 weaverba137 marked this pull request as ready for review July 17, 2024 21:02
@weaverba137 weaverba137 removed the WIP Work in Progress label Jul 17, 2024
@akremin
Copy link
Member

akremin commented Jul 18, 2024 via email

py/desispec/io/fibermap.py Outdated Show resolved Hide resolved
py/desispec/io/photo.py Show resolved Hide resolved
py/desispec/workflow/exptable.py Outdated Show resolved Hide resolved
py/desispec/workflow/exptable.py Show resolved Hide resolved
@weaverba137
Copy link
Member Author

I can't add comments directly to this line item:

Thanks for noting that. Seeing that you've added a tiles_dir option to findfile, this code could intercept missing $DESI_TILES and fall back to doing what it would have done before, i.e. using tiles_dir=$DESI_TARGET/fiberassign/tiles/trunk instead (while still printing a warning).

I'm not entirely sure what you're recommending here. Maybe you could express this as a code snippet?

However, I now realize that the way findfile is modified in this branch, an exception will be thrown if FIBER_ASSIGN_DIR is not defined anyway, and so maybe the solution is to catch a potential KeyError thrown by findfile, add a helpful message, and re-raise.

@sbailey
Copy link
Contributor

sbailey commented Jul 19, 2024

I'm not entirely sure what you're recommending here. Maybe you could express this as a code snippet?

Thinking about it more, let's just go all-in on making $FIBER_ASSIGN_DIR a required variable for standard operations, and not try to support an interim period where it might not be defined, since desimodules/main already defines it.

Heads up @sybenzvi, when you update desispec at KPNO, you'll also need to make sure that module files / config gets set to define FIBER_ASSIGN_DIR=$DESI_TARGET/fiberassign/tiles/trunk .

However, I now realize that the way findfile is modified in this branch, an exception will be thrown if FIBER_ASSIGN_DIR is not defined anyway, and so maybe the solution is to catch a potential KeyError thrown by findfile, add a helpful message, and re-raise.

We're ok here. You added

if tiles_dir is None and 'tiles_dir' in required_inputs:
        tiles_dir = os.environ['FIBER_ASSIGN_DIR']

which means that $FIBER_ASSIGN_DIR is only required if tiles_dir is not specified, which is good. We'll require $FIBER_ASSIGN_DIR for normal operations, but if for whatever reason someone is running without it, findfile('fiberassignsvn', tiles_dir=...) will work without requiring an envvar that it doesn't use (I explicitly checked that).

Summarizing, I think we want the assemble_fibermap block to be

    if allow_svn_override:
        testfile, svn_exists = findfile('fiberassignsvn', tile=tileid, readonly=True, return_exists=True)
        if svn_exists:
            fafile = testfile
            log.info(f'Overriding raw fiberassign file {rawfafile} with svn {fafile}')
        else:
            log.info(f'{testfile}[.gz] not found; sticking with raw data fiberassign file')

i.e. findfile will require $FIBER_ASSIGN_DIR to exist and raise a KeyError if it doesn't, and this code handles the case where findfile works but the tile hasn't yet been committed to svn (svn_exists=False).

@weaverba137
Copy link
Member Author

@sbailey, I've made the requested changes, except for the open question about io.photo above.

Copy link
Contributor

@sbailey sbailey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks for iterating on all the details.

@sbailey sbailey merged commit 592d859 into main Jul 19, 2024
26 checks passed
@sbailey sbailey deleted the hardcoded-trunk branch July 19, 2024 18:46
@sbailey
Copy link
Contributor

sbailey commented Jul 19, 2024

For the record: I have installed this updated version at NERSC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

tiles version hardcoded to trunk
3 participants