Changes to support rms-filecache#138

Merged: rfrenchseti merged 11 commits into main from rf_241004_filecache on Oct 31, 2024
rfrenchseti (Collaborator) commented Oct 9, 2024

The primary goal of this PR is to support the rms-filecache package, which abstracts reading and writing files that may live on the local filesystem or in the cloud. As a result, people using spicedb or oops no longer need a copy of the OOPS-Resources Dropbox on their local disk; files are downloaded from a remote cloud source into a local cache as needed. Cached files persist across runs, so they don't need to be downloaded again. For usage examples and full documentation of rms-filecache, see: https://rms-filecache.readthedocs.io/en/latest/

You should see no changes with your existing setup. Existing environment variables are supported in exactly the same way, and local file accesses, although they now go through filecache methods, still reach the local filesystem unimpeded. However, if you want to use the cloud, you can do something like:

export OOPS_RESOURCES=gs://rms-node-oops-resources

and now all files will be downloaded from the Google Cloud rms-node-oops-resources bucket, which contains all of the SPICE kernels, test data, and golden masters. Note that you need to have a Google Cloud account and be authenticated locally for this to work, since the bucket is not currently publicly accessible.
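
To make the local-versus-cloud distinction concrete, here is a minimal sketch of how a resource root might be classified and combined with a relative path. This is not how rms-filecache is implemented; `is_remote` and `join_resource` are hypothetical helpers written for illustration, and `/data/oops-resources/` is a made-up local directory.

```python
from urllib.parse import urlparse


def is_remote(root: str) -> bool:
    """Guess whether a resource root is a URI rather than a local path.

    Hypothetical helper, not part of rms-filecache. Any multi-letter scheme
    (gs://, s3://, https://, and also file://) counts as a URI here.
    """
    scheme = urlparse(root).scheme
    # A one-letter "scheme" is a Windows drive letter (C:\...), not a URI.
    return len(scheme) > 1


def join_resource(root: str, *parts: str) -> str:
    """Join a root and relative parts with '/', ignoring stray slashes."""
    return '/'.join([root.rstrip('/'), *(p.strip('/') for p in parts)])


for root in ('gs://rms-node-oops-resources', '/data/oops-resources/'):
    kind = 'remote' if is_remote(root) else 'local'
    print(kind, join_resource(root, 'SPICE', 'Neptune/SPK/nep081xl.bsp'))
```

The same relative path resolves under either root, which is why calling code does not need to know where OOPS_RESOURCES points.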

  • Changes to support filecache. Fixes "Modify oops to use filecache" (#139).
    • Many parts of the code now use Path objects instead of strings.
    • Changed spicedb to use filecache. Kernels are downloaded in parallel when possible. All tests have been updated and pass. This also fixes "spicedb tests don't work" (#137).
    • Changed all oops unit tests to retrieve test files from the test_data directory and kernels from the test_data/SPICE directory using filecache; both are downloaded into a single cache, distinct from the one used by spicedb. Files are retrieved in parallel for maximum download efficiency, and the resulting local paths are then supplied to cspyce.furnsh one at a time. Even tests that are not currently operational have been modified for future use. All tests have been updated and pass.
    • Changed gold master tests to use filecache:
      • The obspath attribute in a "standard obs" is no longer an absolute path, but is relative to the prefix specified by the OOPS_TEST_DATA_PATH environment variable. This is required because it's never permitted to access files outside of the roots defined by these variables.
      • Removed all uses of os.path.join, which doesn't play nicely with non-local-filesystem URIs.
        • I also removed this from a few places in the hosts code, but not all of them.
        • I also removed many orphaned import os.path statements that are no longer (or never were) needed, which should also remind us not to use os.path.join in the future.
      • Removed all makedirs calls, since directories are now automatically created as necessary by filecache.
      • Note that you can't "rename" a file in the cloud, so rotating the summary.py and task.log files requires downloading the old version, renaming it locally, uploading it under the new name, and then overwriting the old file with the output of the current run.
      • Changed calls to _basename to use the "Linux-safe" filename format when writing to a cloud destination.
      • All tests have been updated and pass.
    • Changed Juno and Galileo hosts to retrieve SPICE kernels from the properly specified SPICE_PATH directory rather than using test_data/../SPICE. Note that these hosts should really be getting these kernels through the spicedb database mechanism.
  • Made some improvements to the use of environment variables. Now the only one that must be specified is OOPS_RESOURCES.
    • SPICE_PATH is derived from OOPS_RESOURCES if not specified, and SPICE_SQLITE_DB_NAME is derived from SPICE_PATH if not specified.
    • OOPS_TEST_DATA_PATH is derived from OOPS_RESOURCES if not specified.
    • OOPS_GOLD_MASTER_PATH is derived from OOPS_RESOURCES if not specified.
    • OOPS_BACKPLANE_OUTPUT_PATH explicitly uses the current working directory if not specified. BTW I hate this default because it puts a bunch of directories into the repo which aren't in .gitignore.
    • There is no longer any need for separate variables with trailing slashes, since filecache handles this internally.
  • Removed support for Python 3.8, since it isn't supported by some of the cloud libraries required by rms-filecache.
  • Added helpful information to README.md describing the use of environment variables and how to run tests.
  • Performed a few minor cosmetic fixes to tests that were using 2-character indentation; there are many more places that need to be fixed some day.
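
The environment-variable fallbacks described in the list above can be sketched roughly as follows. This is an illustration, not the code in this PR: `resolve_paths` is a hypothetical function, and the subdirectory and file names used as defaults are assumptions (`SPICE` is suggested by the kernel paths in the test output, `test_data` by the description; `SPICE.db` and `gold_master` are placeholders).

```python
import os


def resolve_paths(env=None):
    """Return effective settings, filling unset variables with defaults.

    Hypothetical sketch; the default names below are assumptions.
    """
    env = os.environ if env is None else env
    root = env['OOPS_RESOURCES'].rstrip('/')  # the only required variable

    def sub(base, name):
        # Join with '/', so a trailing slash on the base makes no difference.
        return f"{base.rstrip('/')}/{name}"

    spice_path = env.get('SPICE_PATH') or sub(root, 'SPICE')
    return {
        'OOPS_RESOURCES': root,
        'SPICE_PATH': spice_path,
        # Derived from SPICE_PATH, not directly from OOPS_RESOURCES.
        'SPICE_SQLITE_DB_NAME': (env.get('SPICE_SQLITE_DB_NAME')
                                 or sub(spice_path, 'SPICE.db')),
        'OOPS_TEST_DATA_PATH': (env.get('OOPS_TEST_DATA_PATH')
                                or sub(root, 'test_data')),
        'OOPS_GOLD_MASTER_PATH': (env.get('OOPS_GOLD_MASTER_PATH')
                                  or sub(root, 'gold_master')),
        # Falls back to the current working directory.
        'OOPS_BACKPLANE_OUTPUT_PATH': (env.get('OOPS_BACKPLANE_OUTPUT_PATH')
                                       or os.getcwd()),
    }


settings = resolve_paths({'OOPS_RESOURCES': 'gs://rms-node-oops-resources/'})
for name, value in settings.items():
    print(f'{name}={value}')
```

Passing a plain dict instead of `os.environ` keeps the sketch easy to test without touching the real environment.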

rfrenchseti (Collaborator, Author) commented Oct 22, 2024

Running on a fresh Google Cloud Compute Engine instance with no local files. Notice how much faster it runs the second time since the test files and kernels have been cached locally.

rfrench@instance-20241022-192743:~$ git clone https://github.com/SETI/rms-oops
Cloning into 'rms-oops'...
remote: Enumerating objects: 11377, done.
remote: Counting objects: 100% (4560/4560), done.
remote: Compressing objects: 100% (1374/1374), done.
remote: Total 11377 (delta 3359), reused 4090 (delta 3144), pack-reused 6817 (from 1)
Receiving objects: 100% (11377/11377), 12.27 MiB | 27.08 MiB/s, done.
Resolving deltas: 100% (8172/8172), done.

rfrench@instance-20241022-192743:~$ cd rms-oops

rfrench@instance-20241022-192743:~/rms-oops$ git checkout rf_241004_filecache
Branch 'rf_241004_filecache' set up to track remote branch 'rf_241004_filecache' from 'origin'.
Switched to a new branch 'rf_241004_filecache'

rfrench@instance-20241022-192743:~/rms-oops$ python3 -m venv venv

rfrench@instance-20241022-192743:~/rms-oops$ source venv/bin/activate

(venv) rfrench@instance-20241022-192743:~/rms-oops$ pip install -r requirements.txt -q

(venv) rfrench@instance-20241022-192743:~/rms-oops$ export OOPS_RESOURCES=gs://rms-node-oops-resources

(venv) rfrench@instance-20241022-192743:~/rms-oops$ python -m unittest tests/unittester_with_hosts.py 
........................................................
----------------------------------------------------------------------
Ran 56 tests in 202.970s

OK

(venv) rfrench@instance-20241022-192743:~/rms-oops$ python -m unittest spicedb
./home/rfrench/rms-oops/spicedb/__init__.py:1531: RuntimeWarning: SPICE kernel not found: gs://rms-node-oops-resources/SPICE/Neptune/SPK/nep081xl.bsp
  warnings.warn(f'SPICE kernel not found: {pfx.prefix}{filepath}',
.
----------------------------------------------------------------------
Ran 2 tests in 183.296s

OK

(venv) rfrench@instance-20241022-192743:~/rms-oops$ python -m unittest tests/unittester_with_hosts.py 
........................................................
----------------------------------------------------------------------
Ran 56 tests in 130.646s

OK
(venv) rfrench@instance-20241022-192743:~/rms-oops$ python -m unittest spicedb
./home/rfrench/rms-oops/spicedb/__init__.py:1531: RuntimeWarning: SPICE kernel not found: gs://rms-node-oops-resources/SPICE/Neptune/SPK/nep081xl.bsp
  warnings.warn(f'SPICE kernel not found: {pfx.prefix}{filepath}',
.
----------------------------------------------------------------------
Ran 2 tests in 6.221s

OK
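
The speedup between the first and second runs comes from the local cache. A minimal sketch of the pattern (retrieve files in parallel, skip anything already cached, then load the results one at a time) might look like the following. It uses only the standard library and a fake downloader rather than rms-filecache; `retrieve_all`, `fake_download`, and the kernel names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import tempfile


def retrieve_all(names, cache_dir, download, max_workers=8):
    """Return local paths for names, downloading only files not yet cached."""
    cache_dir = Path(cache_dir)

    def retrieve_one(name):
        local = cache_dir / name
        if not local.exists():  # cache miss: fetch and store the file
            local.parent.mkdir(parents=True, exist_ok=True)
            local.write_bytes(download(name))
        return local

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order even though downloads run in parallel.
        return list(pool.map(retrieve_one, names))


downloaded = []


def fake_download(name):
    """Stand-in for a cloud fetch; records which names were requested."""
    downloaded.append(name)
    return b'kernel bytes'


kernels = ['SPICE/example1.bsp', 'SPICE/example2.tpc']  # made-up names
with tempfile.TemporaryDirectory() as cache:
    paths = retrieve_all(kernels, cache, fake_download)  # first run: 2 fetches
    paths = retrieve_all(kernels, cache, fake_download)  # second run: 0 fetches
    print(len(downloaded))
    # The local paths would then be loaded sequentially, e.g.:
    # for p in paths:
    #     cspyce.furnsh(str(p))
```

Downloads are network-bound, so a thread pool is a reasonable fit; the sequential loading step is kept separate from the parallel retrieval.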

rfrenchseti marked this pull request as ready for review October 22, 2024 20:01

jnspitale (Collaborator) left a comment

Looks good, no comments.

markshowalter (Collaborator) left a comment

We've discussed all my comments

rfrenchseti merged commit 0c9a7be into main Oct 31, 2024
rfrenchseti deleted the rf_241004_filecache branch October 31, 2024 23:28