[air] pyarrow.fs persistence: Introduce StorageContext and use it for driver syncing (1/n) #37690

Merged

Commits on Jul 23, 2023

  1. Update remote storage utils to accept a custom pyarrow fs

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 23, 2023
    e3139a0
  2. Add StorageContext class

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 23, 2023
    fc90f86
  3. Add storage filesystem as a run config arg

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 23, 2023
    3d10a13
  4. Prep some URI utilities for later

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 23, 2023
    472cca4
  5. Update _DefaultSyncer to take in a custom storage fs

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 23, 2023
    979d760
  6. Pipe storage context to Tune driver (enabled for driver syncing)

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    
    Missing arg
    
    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    
    Fix typehint
    
    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 23, 2023
    9e9609b
  7. Fix tune controller to accept storage context

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 23, 2023
    297c3d8
  8. Add e2e test

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 23, 2023
    9c5ad3d
  9. Fix custom fs case with a few temporary workarounds

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 23, 2023
    1c75b53
  10. Fix lint

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 23, 2023
    79ac0d9
  11. Update the env var name

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 23, 2023
    61ae556
  12. Rename cache_dir -> cache_path

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 23, 2023
    2c0a333
  13. Some misc fixes (off by default, enable only in test)

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 23, 2023
    6f42e78

Commits on Jul 25, 2023

  1. Store checkpoint id in storage context

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    291fe8e
  2. Add an error message for custom storage_fs w/ uri storage path

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    90531cc
  3. Add docstrings + doctest overview of the class

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    058f05c
  4. Switch to just using pyarrow from_uri

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    f973491
  5. Switch to just using pyarrow from_uri in remote_storage utils

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    5069178
  6. Add test to BUILD

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    533cabe
  7. Accept the user's raw storage_config rather than a resolved version

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    8a11d7f
  8. Workaround for exp analysis remote_storage_path for now

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    d3b4c20
  9. Only use storage context for paths in TuneControllerBase

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    33ef43a
  10. Only use storage context for paths in ExpCkptManager (sync_up, sync_down)
    
    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    21bab08
  11. Only use storage context for paths in ExpCkptManager (resume, resume_auto)
    
    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    84a7475
  12. Fix lint

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    522b4a2
  13. Merge branch 'master' of https://github.com/ray-project/ray into air/persistence/storage_context_driver
    justinvyu committed Jul 25, 2023
    1cecb3f
  14. Assert non-uri for list_at_ur(fs)

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    6a22afb
  15. TuneControllerBase.experiment_state_path now uses storage context

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 25, 2023
    7ed065d

Commits on Jul 26, 2023

  1. Merge branch 'master' of https://github.com/ray-project/ray into air/persistence/storage_context_driver
    justinvyu committed Jul 26, 2023
    df18653
  2. Remove setting the legacy sync_config.syncer

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 26, 2023
    5ce6078
  3. Move utils that should be kept out of remote_storage

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 26, 2023
    bae9cf3
  4. Update docstrings + improve some arg names

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 26, 2023
    e7fc880
  5. Create a new _FilesystemSyncer that doesn't depend on remote_storage

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 26, 2023
    bd4f487
  6. Fix test

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 26, 2023
    73c45bf
  7. Add example experiment output in test

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 26, 2023
    70233c8
  8. Rename *_cache_path -> *_local_path

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 26, 2023
    e087ae5
  9. Introduce experiment_path property in StorageContext

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 26, 2023
    c79e5a5
  10. Add legacy prefix to all attributes used in the old codepath

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 26, 2023
    be37b3e
  11. Merge branch 'master' of https://github.com/ray-project/ray into air/persistence/storage_context_driver
    justinvyu committed Jul 26, 2023
    f021034
  12. Revert remote storage, move all utils to storage.py

    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    
    Remove fs_utils
    
    Signed-off-by: Justin Yu <justinvyu@anyscale.com>
    justinvyu committed Jul 26, 2023
    b11c2c2
  13. Merge branch 'master' of https://github.com/ray-project/ray into air/persistence/storage_context_driver
    justinvyu committed Jul 26, 2023
    2175261
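
Several commits above mention resolving storage with pyarrow's `from_uri` and raising an error when a custom `storage_fs` is combined with a URI storage path. The snippet below is only a rough sketch of that idea, not the code from this PR: `is_uri`, `resolve_storage`, and the parameter names `storage_path` and `storage_filesystem` are hypothetical names chosen for illustration. The one real library call is `pyarrow.fs.FileSystem.from_uri`, which returns a `(filesystem, path)` pair.

```python
from typing import Any, Optional, Tuple
from urllib.parse import urlparse


def is_uri(path: str) -> bool:
    """Return True if `path` carries a URI scheme such as s3:// or gs://."""
    return "://" in path and bool(urlparse(path).scheme)


def resolve_storage(
    storage_path: str, storage_filesystem: Optional[Any] = None
) -> Tuple[Any, str]:
    """Hypothetical helper: return a (filesystem, path) pair for a run.

    Assumption: with a custom filesystem, the path must already be a plain
    path on that filesystem, so a URI is rejected with an error message.
    """
    if storage_filesystem is not None:
        if is_uri(storage_path):
            raise ValueError(
                "A custom filesystem was given, so the storage path must be "
                f"a plain path on that filesystem, not a URI: {storage_path}"
            )
        return storage_filesystem, storage_path

    # Imported lazily so the custom-filesystem branch works without pyarrow.
    import pyarrow.fs

    # from_uri infers the filesystem from the URI scheme (or a local
    # absolute path) and returns the path with the scheme stripped.
    return pyarrow.fs.FileSystem.from_uri(storage_path)
```

With no custom filesystem, a call such as `resolve_storage("s3://bucket/experiments")` would defer entirely to pyarrow; with a custom one, the caller passes something like `"bucket/experiments"` instead.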