
ray-0.7.3

@simon-mo simon-mo released this 04 Aug 02:37

Ray 0.7.3 Release Note

Highlights

  • RLlib's ModelV2 API is ready to use. It improves support for Keras and RNN models and allows object-oriented reuse of variables. The ModelV1 API is deprecated, but no migration is needed.
  • ray.experimental.sgd.pytorch.PyTorchTrainer is ready for early adopters. Check out the documentation here. We welcome your feedback!
    model_creator = lambda config: YourPyTorchModel()
    data_creator = lambda config: (YourTrainingSet(), YourValidationSet())
    
    trainer = PyTorchTrainer(
        model_creator,
        data_creator,
        optimizer_creator=utils.sgd_mse_optimizer,
        config={"lr": 1e-4},
        num_replicas=2,
        resources_per_replica=Resources(num_gpus=1),
        batch_size=16,
        backend="auto")
    
    for i in range(NUM_EPOCHS):
        trainer.train()
  • You can query all the clients that have called ray.init to connect to the current cluster with ray.jobs(). #5076
    >>> ray.jobs()
    [{'JobID': '02000000',
      'NodeManagerAddress': '10.99.88.77',
      'DriverPid': 74949,
      'StartTime': 1564168784,
      'StopTime': 1564168798},
     {'JobID': '01000000',
      'NodeManagerAddress': '10.99.88.77',
      'DriverPid': 74871,
      'StartTime': 1564168742}]
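
The ModelV2 API mentioned above can be sketched with a minimal custom Keras model. The class and method names below follow RLlib's TFModelV2 interface; the import path, layer sizes, and model name are illustrative assumptions and may differ slightly between versions:

```python
import tensorflow as tf
from ray.rllib.models import ModelCatalog
from ray.rllib.models.tf.tf_modelv2 import TFModelV2

class MyKerasModel(TFModelV2):
    """Minimal ModelV2 sketch wrapping a Keras network."""

    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        super().__init__(obs_space, action_space, num_outputs, model_config, name)
        inputs = tf.keras.layers.Input(shape=obs_space.shape)
        hidden = tf.keras.layers.Dense(64, activation="relu")(inputs)
        logits = tf.keras.layers.Dense(num_outputs)(hidden)
        value = tf.keras.layers.Dense(1)(hidden)
        self.base_model = tf.keras.Model(inputs, [logits, value])
        self.register_variables(self.base_model.variables)

    def forward(self, input_dict, state, seq_lens):
        # Object-oriented variable reuse: the same Keras model is called
        # wherever forward() runs, instead of relying on TF variable scopes.
        logits, self._value_out = self.base_model(input_dict["obs"])
        return logits, state

    def value_function(self):
        return tf.reshape(self._value_out, [-1])

# Register the model so trainer configs can refer to it by name.
ModelCatalog.register_custom_model("my_keras_model", MyKerasModel)
```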

Core

  • Improved memory storage handling. #5143, #5216, #4893
  • Improved workflow:
    • The local_mode debugging tool now behaves more consistently. #5060
    • Improved KeyboardInterrupt exception handling; the stack trace is reduced from 115 lines to 22. #5237
  • Ray core:
    • Experimental direct actor calls. #5140, #5184
    • Improvements to the core worker, the module shared between Python and Java. #5079, #5034, #5062
    • Refactored the GCS (global control store). #5058, #5050

RLlib

  • Finished porting all major RLlib algorithms to the builder pattern. #5277, #5258, #5249
  • learner_queue_timeout can be configured for async sample optimizer. #5270
  • reproducible_seed can be used for reproducible experiments. #5197
  • Added entropy coefficient decay to IMPALA, APPO, and PPO. #5043
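
The new knobs above are plain trainer-config entries. A hedged sketch of how they might appear in an IMPALA config; the key names follow the PRs, but the values and the exact schedule format are illustrative assumptions:

```python
# Hypothetical IMPALA config fragment showing the new options.
config = {
    "env": "CartPole-v0",
    # Timeout (seconds) for the async sample optimizer's learner queue.
    "learner_queue_timeout": 300,
    # Starting entropy coefficient, decayed over training.
    "entropy_coeff": 0.01,
    "entropy_coeff_schedule": 20_000_000,
}
```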

Tune

  • Breaking: ExperimentAnalysis is now returned by default from tune.run. To obtain a list of trials, use analysis.trials. #5115
  • Breaking: Syncing behavior between head and workers can now be customized (sync_to_driver). Syncing behavior (upload_dir) between cluster and cloud is now separately customizable (sync_to_cloud). This changes the structure of the uploaded directory - now local_dir is synced with upload_dir. #4450
  • Introduce Analysis and ExperimentAnalysis objects. Analysis object will now return all trials in a folder; ExperimentAnalysis is a subclass that returns all trials of an experiment. #5115
  • Added the missing argument tune.run(keep_checkpoints_num=...), which keeps only the last N checkpoints. #5117
  • Trials on failed nodes are now prioritized for processing. #5053
  • Trial checkpointing is now more flexible. #4728
  • Added system performance tracking for GPU, RAM, VRAM, and CPU usage statistics; toggle with tune.run(log_sys_usage=True). #4924
  • Experiment checkpointing is now less frequent and can be controlled with tune.run(global_checkpoint_period=...). #4859
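
Several of the Tune changes above surface as tune.run arguments. A sketch assuming a hypothetical function trainable my_trainable; the argument values are illustrative:

```python
from ray import tune

def my_trainable(config, reporter):
    # Hypothetical trainable: report a dummy metric each iteration.
    for step in range(10):
        reporter(mean_accuracy=step / 10)

analysis = tune.run(              # tune.run now returns ExperimentAnalysis
    my_trainable,
    keep_checkpoints_num=3,       # keep only the last 3 checkpoints
    log_sys_usage=True,           # track CPU/GPU/RAM usage statistics
    global_checkpoint_period=60,  # checkpoint experiment state every 60s
)
print(analysis.trials)            # list of trials in the experiment
```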

Autoscaler

  • Added a request_cores function for manual autoscaling. You can now manually request resources from the autoscaler. #4754

  • Local cluster:

    • More readable example YAML with comments. #5290

    • Multiple cluster names are now supported. #4864

  • Improved logging with the AWS NodeProvider; create_instance calls are now logged. #4998

Other Libraries

  • SGD:
    • Added an example for training. #5292
    • Deprecated the old distributed SGD implementation. #5160
  • Kubernetes: Added a Ray namespace for k8s. #4111
  • Dev experience: Added a linting pre-push hook. #5154

Thanks

We thank the following contributors for their amazing contributions:

@joneswong, @1beb, @richardliaw, @pcmoritz, @raulchen, @stephanie-wang, @jiangzihao2009, @LorenzoCevolani, @kfstorm, @pschafhalter, @micafan, @simon-mo, @vipulharsh, @haje01, @ls-daniel, @hartikainen, @stefanpantic, @edoakes, @llan-ml, @alex-petrenko, @ztangent, @gravitywp, @MQQ, @Dulex123, @morgangiraud, @antoine-galataud, @robertnishihara, @qxcv, @vakker, @jovany-wang, @zhijunfu, @ericl