[REP] Refining the Ray AIR Surface API #36
Conversation
> ## Open Questions
>
> We are likely going to remove `PredictorWrapper` and `PredictorDeployment` and migrate the examples to use Ray Serve deployments
Could we flesh out an example of this migration in the REP? It may make the implications of this change clearer.
Yes I will do that, great point.
added now
I'm fine with this proposal and only have detail questions left.

Just for completeness (as it's not mentioned in the REP): an alternative is to move everything into AIR instead, so we would have `ray.air.training`, `ray.air.tuning`, etc. In such a world we would soften the boundaries between the libraries and keep shared modules in the AIR namespace. By moving the libraries "one level down" we would enforce a separation from Ray Core and double down on AIR as an umbrella for our downstream libraries.

It also avoids the question of where to put which modules: integrations and callbacks can quite naturally remain in `ray.air`, and both training and tuning modules can access them.
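For concreteness, the alternative layout described in this comment could look roughly like the following tree (the submodule names are illustrative only, not something the REP proposes):

```
ray.air
├── training        # today: ray.train
├── tuning          # today: ray.tune
├── integrations    # shared integrations/callbacks
└── util            # shared helpers
```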
> ## Open Questions
>
> We are likely going to remove `PredictorWrapper` and `PredictorDeployment` and migrate the examples to use Ray Serve deployments directly, and we are also likely going to move `air.integrations` to `train.integrations` and tentatively the predictors to `ray.train`.
There are a few more:

- `air.callbacks` --> also train?
- `air.examples` for cross-library examples --> docs? (related question: how do we restructure the docs? I guess out of scope for this REP...)
- `air.execution` --> tentatively train?
- `air.util` - looks like mostly data?
Thanks for bringing these up :)

- `air.callbacks` has already been moved to `air.integrations`, which will move to `train.integrations`, right?
- `air.examples`: these should not be living in the source tree, so yes, docs would probably be the right place. I don't consider `air.examples` to be part of the AIR API. On the docs: @richardliaw is owning the docs restructuring :)
- `air.execution`: these are internal APIs, so they are not a concern for this REP. We should feel free to put them into some utility namespace of either train or tune, depending on what makes the most sense going forward.
- `air.util`: the tensor extension stuff should go into `ray.data`, the torch stuff into `ray.train.torch`. (I'm a bit confused about whether this is a public API or not -- it seems to be used in the OPT example at the moment? We should clarify that and handle it appropriately.)

Is there anything else here we need to decide about?
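Put together, the tentative destinations discussed in this thread are (target paths are proposals under discussion, not decided):

```
air.callbacks  -> air.integrations (already done) -> train.integrations
air.examples   -> docs (out of the source tree)
air.execution  -> internal; a train or tune utility namespace
air.util       -> ray.data (tensor extensions), ray.train.torch (torch helpers)
```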
Would it make sense to capture all these in the REP to make it more comprehensive, i.e. to describe what it takes to fully "Disband the `ray.air` namespace"?

On your higher level comment: Not only would introducing
> ## Open Questions
>
> We are likely going to remove `PredictorWrapper` and `PredictorDeployment` and migrate the examples to use Ray Serve deployments directly, and we are also likely going to move `air.integrations` to `train.integrations` and tentatively the predictors to `ray.train`.
> Would it make sense to capture all these in the REP to make it more comprehensive, i.e. to describe what it takes to fully "Disband the `ray.air` namespace"?

The only concern is whether we are ok with this type of usage for a Ray Tune-only use case:

```python
from ray import tune
from ray import train

def objective(config):
    score = config["a"] ** 2 + config["b"]
    train.report({"score": score})

search_space = {
    "a": tune.grid_search([0.001, 0.01, 0.1, 1.0]),
    "b": tune.choice([1, 2, 3]),
}

tuner = tune.Tuner(objective, param_space=search_space)
results = tuner.fit()
print(results.get_best_result(metric="score", mode="min").config)
```

This exposes the
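As a pure-Python illustration of what the search above optimizes (no Ray required; this simply enumerates the cross product of the grid and choice values, whereas `tune.choice` actually samples):

```python
from itertools import product

a_values = [0.001, 0.01, 0.1, 1.0]   # the tune.grid_search values
b_values = [1, 2, 3]                 # the tune.choice candidates

def objective(config):
    # Same score the trainable reports via train.report
    return config["a"] ** 2 + config["b"]

# Exhaustively pick the config minimizing the score
best = min(
    ({"a": a, "b": b} for a, b in product(a_values, b_values)),
    key=objective,
)
print(best)  # {'a': 0.001, 'b': 1}
```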
```python
@serve.deployment
class XGBoostService:
    def __init__(self, checkpoint):
        self.predictor = XGBoostPredictor.from_checkpoint(checkpoint)
```
Since we are de-emphasizing predictors, shall we also change this to show something like this instead?

```python
local_dir = checkpoint.to_directory()
with open(os.path.join(local_dir, "saved_booster.pkl"), "rb") as f:
    self.model: xgboost.Booster = pickle.load(f)
```
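The directory-based loading pattern suggested above can be sketched in pure Python as follows (no Ray or XGBoost needed: `tempfile.mkdtemp()` stands in for `checkpoint.to_directory()`, and a plain dict stands in for an `xgboost.Booster`):

```python
import os
import pickle
import tempfile

# "Save" side: write a pickled model into a checkpoint-like directory.
local_dir = tempfile.mkdtemp()   # stands in for checkpoint.to_directory()
model = {"kind": "booster", "num_trees": 100}  # stands in for an xgboost.Booster
with open(os.path.join(local_dir, "saved_booster.pkl"), "wb") as f:
    pickle.dump(model, f)

# "Load" side, mirroring the deployment's __init__:
with open(os.path.join(local_dir, "saved_booster.pkl"), "rb") as f:
    loaded = pickle.load(f)

print(loaded)  # {'kind': 'booster', 'num_trees': 100}
```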
I fixed this now by adopting the new way from https://docs.google.com/document/d/1J-09US8cXc-tpl2A1BpOrlHLTEDMdIJp6Ah1ifBUw7Y/edit :)
Thanks for the updates
This is bringing the API up-to-date with ray-project/enhancements#36
@amogkam We decided that it is ok for Tune to depend on Train, since tuning training runs is going to be the most important use case of Ray Tune and also that's how most machine learning engineers / practitioners think about the relation between training and tuning :)
Looks great and feels a lot cleaner!
Signed-off-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Eric Liang <ekhliang@gmail.com>
@zhe-thoughts this can be merged

Thanks for the work @pcmoritz @ericl @krfricke @matthewdeng @amogkam
…ect#37906) This is bringing the API up-to-date with ray-project/enhancements#36 Signed-off-by: NripeshN <nn2012@hw.ac.uk>
This REP proposes to remove the `ray.air` namespace and put the functionality into the respective libraries Ray Data, Ray Serve and Ray Train.