feat: Mmann1123 ray extract #300
Conversation
@jgrss this seems to radically help with large sets of polygons, etc.
src/geowombat/core/sops.py
Outdated
    import ray

    if not ray.is_initialized():
        ray.init()
Can you pass processes to ray with ray.init(num_cpus=processes)?
src/geowombat/core/sops.py
Outdated
        This method is intended to be used with Ray for distributed computing.
        Assumes `data` is accessible in the scope where this function is called.
        """
        return data.isel(band=bands_idx, y=yidx, x=xidx).data.compute()
Do you need to pass .compute(num_workers=1) with ray? Which processes are being used? Do you have ray + dask threading running at the same time?
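For reference, a small sketch of what the `num_workers=1` call looks like on a dask-backed DataArray. The array here is a hypothetical stand-in for the raster, and whether the limit is actually needed under Ray is the open question above:

```python
import dask.array as da
import numpy as np
import xarray as xr

# Hypothetical stand-in for the opened raster: 3 bands, 100 x 100 pixels.
values = np.arange(3 * 100 * 100).reshape(3, 100, 100)
data = xr.DataArray(
    da.from_array(values, chunks=(1, 50, 50)),
    dims=("band", "y", "x"),
)

bands_idx = [0, 2]
yidx = slice(10, 20)
xidx = slice(30, 45)

# `.data` is the underlying dask array. Keyword arguments given to compute()
# are forwarded to the scheduler, so num_workers=1 limits dask's default
# threaded scheduler to a single thread for this call.
block = data.isel(band=bands_idx, y=yidx, x=xidx).data.compute(num_workers=1)
print(block.shape)
```

If each Ray task also starts a full dask thread pool, the two layers can oversubscribe the CPU, which is the concern behind the question.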
        return data.isel(band=bands_idx, y=yidx, x=xidx).data.compute()


    # Dynamically assign the Ray-enabled method to the class.
    SpatialOperations.extract_data_slice = _ray_extract_data_slice
Do you need to set this class method? Could you instead do:
    res = ray.get(
        _ray_extract_data_slice.remote(data, bands_idx, yidx, xidx)
    )

    if not ray.is_initialized():
        ray.init()
    res = ray.get(
Could you comment on what ray is doing? I'm curious how a wrapped ray.remote method calling Xarray isel is faster than calling isel directly.
@jgrss yeah, that is a good question. I may have gone off half-cocked here. I need to do some more testing to see if we really have an improvement. I am, however, struggling with an issue: when we have large stacked inputs and large polygons, I need to figure out how to chunk the process. I am working on another idea under mmann1123_ray_extract2.
Thanks @mmann1123 -- a few comments first so that I can understand the ray.remote wrapper around Xarray and a dask compute.
What is this PR changing?
Adding ray extract client - helps avoid threading conflicts for large polygon files