Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Context.make_with number of workers interface #1443

Merged
merged 24 commits into from Jun 29, 2023

Conversation

matbryan52
Copy link
Member

@matbryan52 matbryan52 commented Jun 10, 2023

Adds the interface described here: #1353 (comment) to lt.Context.make_with(..).

Some of the added code would be temporary until all the executors accept something resembling a ResourceSpec created by make_with. This only implements the user-facing part and has some glue code to handle the different cases, the idea being it might be useful to have a working prototype!

Contributor Checklist:

Reviewer Checklist:

  • /azp run libertem.libertem-data passed
  • No import of GPL code from MIT code

@matbryan52 matbryan52 changed the title Make with interface Context.make_with number of workers interface Jun 10, 2023
@matbryan52
Copy link
Member Author

The ability to do

lt.Context.make_with(cpus=4)

is really nice in practice!

@codecov
Copy link

codecov bot commented Jun 10, 2023

Codecov Report

Patch coverage: 72.91% and project coverage change: -6.72 ⚠️

Comparison is base (f2a71e8) 69.04% compared to head (3b6c7a1) 62.32%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1443      +/-   ##
==========================================
- Coverage   69.04%   62.32%   -6.72%     
==========================================
  Files         305      305              
  Lines       17837    17874      +37     
  Branches     3191     3201      +10     
==========================================
- Hits        12315    11140    -1175     
- Misses       5019     6287    +1268     
+ Partials      503      447      -56     
Impacted Files Coverage Δ
src/libertem/api.py 89.57% <68.29%> (-4.63%) ⬇️
src/libertem/exceptions.py 100.00% <100.00%> (ø)
src/libertem/executor/concurrent.py 75.92% <100.00%> (+0.45%) ⬆️

... and 35 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Member

@sk1p sk1p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ability to do

lt.Context.make_with(cpus=4)

is really nice in practice!

Agreed!

This only implements the user-facing part and has some glue code to handle the different cases, the idea being it might be useful to have a working prototype!

I would be very happy to merge this, as soon as we are 100% happy about the user-facing API. I think mostly, the internals will need to change. In #1353 (comment) we do mention "hybrid workers" as a TODO, but I think with the API structure we can implement changes like that as additional keyword arguments, if needed.

Thoughts on actually merging this, pending tests? Maybe having the cases from our design discussion in a testcase? Then, we can replace the implementation bit-by-bit with something proper, without impacting functionality, probably starting with #1353

src/libertem/api.py Outdated Show resolved Hide resolved
matbryan52 and others added 3 commits June 13, 2023 09:04
@matbryan52
Copy link
Member Author

Thanks for the feedback!

I would be very happy to merge this, as soon as we are 100% happy about the user-facing API. I think mostly, the internals will need to change. In #1353 (comment) we do mention "hybrid workers" as a TODO, but I think with the API structure we can implement changes like that as additional keyword arguments, if needed.

Good point about the hybrid workers, the current interface does explicitly separate the cpu / gpu case. Could potentially have something like:

    def make_with(
        cls,
        executor_spec: ExecutorSpecType = 'dask',
        *,
        cpus: Optional[Union[int, Iterable[int]]] = None,
        gpus: Optional[Union[int, Iterable[int]]] = None,
+       hybrids: Optional[int] = None,
        plot_class: Optional['Live2DPlot'] = None,
    ) -> 'Context':

I set Optional[int] as I would say that manual specification of CPU / GPU id groups for hybrid workers would fall into the 'too complex for make_with' category and the user should use the more advanced API...

@matbryan52
Copy link
Member Author

/azp run libertem.libertem-data

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@sk1p
Copy link
Member

sk1p commented Jun 14, 2023

/azp run libertem.libertem-data

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@matbryan52 matbryan52 marked this pull request as ready for review June 15, 2023 06:24
@sk1p
Copy link
Member

sk1p commented Jun 20, 2023

Building including slow tests here, to hopefully get proper coverage: https://dev.azure.com/LiberTEM/LiberTEM/_build/results?buildId=2372&view=results

@sk1p
Copy link
Member

sk1p commented Jun 20, 2023

Building including slow tests here, to hopefully get proper coverage: https://dev.azure.com/LiberTEM/LiberTEM/_build/results?buildId=2372&view=results

coverage of Context.make_with looks good - only missing is unsurprisingly the GPU case (should we have a "negative" test that, if gpus is set and no GPUs are available, an appropriate exception is thrown?)

Copy link
Member

@uellue uellue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, generally! Two details about the semantics in the comments.

src/libertem/api.py Outdated Show resolved Hide resolved
src/libertem/api.py Outdated Show resolved Hide resolved
@matbryan52
Copy link
Member Author

matbryan52 commented Jun 21, 2023

only missing is unsurprisingly the GPU case (should we have a "negative" test that, if gpus is set and no GPUs are available, an appropriate exception is thrown?)

The current behaviour of cluster_spec is to throw a warning rather than an exception in the case of badly specified gpus. In the future when we move to ResourceSpec the executor itself should be responsible for raising on bad arguments, I feel.

For the moment I've hard-coded the executors which are not at all gpu-capable into make_with and there is a test for this. Now that I am calling detect() in the case of dask/pipelined with some user-provided spec I can also raise if no GPUs are available, but I think this will only be a temporary behaviour of make_with itself.

@matbryan52
Copy link
Member Author

/azp run libertem.libertem-data

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@sk1p sk1p requested a review from uellue June 26, 2023 13:02
src/libertem/api.py Outdated Show resolved Hide resolved
Copy link
Member

@sk1p sk1p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uellue said:

I see several scenarios to use these parameters: [...]

I was so free to push a test case implementing these scenarios as code, which passes for me both with GPUs available and without them.

In scenario 4, if we disable all CPU workers but don't have a GPU, shouldn't an error be raised, or was there a warning that I'm missing? This is specifically for the dask executor, which I'm using in the test case, so I can also open a separate issue for this.

Another question, can we add a custom exception type instead of raising a ValueError? The docs describe the situation as "Raised when an operation or function receives an argument that has the right type but an inappropriate value, and the situation is not described by a more precise exception [...]". In this case, if the value is valid depends on the dynamics ("is there a GPU available?") and is closer to a RuntimeError. Maybe add it to the criminally underused libertem.exceptions module?

I think other than this, this PR is good to go, and is mainly missing a changelog entry, and updates to the docs. Let me know if I should take a stab at the docs updates.

@sk1p
Copy link
Member

sk1p commented Jun 27, 2023

/azp run libertem.libertem-data

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Thanks @sk1p  !

Co-authored-by: Alexander Clausen <alex@gc-web.de>
@matbryan52
Copy link
Member Author

@uellue said:

I see several scenarios to use these parameters: [...]

I was so free to push a test case implementing these scenarios as code, which passes for me both with GPUs available and without them.

In scenario 4, if we disable all CPU workers but don't have a GPU, shouldn't an error be raised, or was there a warning that I'm missing? This is specifically for the dask executor, which I'm using in the test case, so I can also open a separate issue for this.

Another question, can we add a custom exception type instead of raising a ValueError? The docs describe the situation as "Raised when an operation or function receives an argument that has the right type but an inappropriate value, and the situation is not described by a more precise exception [...]". In this case, if the value is valid depends on the dynamics ("is there a GPU available?") and is closer to a RuntimeError. Maybe add it to the criminally underused libertem.exceptions module?

I think other than this, this PR is good to go, and is mainly missing a changelog entry, and updates to the docs. Let me know if I should take a stab at the docs updates.

Happy to do all of those things, thanks for the suggestions !

matbryan52 added a commit to matbryan52/LiberTEM that referenced this pull request Jun 28, 2023
Copy link
Member

@uellue uellue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! A good sign how short and simple the example became. 👍

@matbryan52 matbryan52 added this to the 0.12 milestone Jun 28, 2023
matbryan52 added a commit to matbryan52/LiberTEM that referenced this pull request Jun 29, 2023
@sk1p
Copy link
Member

sk1p commented Jun 29, 2023

/azp run libertem.libertem-data

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@sk1p sk1p merged commit a993a0e into LiberTEM:master Jun 29, 2023
30 of 31 checks passed
matbryan52 added a commit to matbryan52/LiberTEM that referenced this pull request Jun 29, 2023
sk1p pushed a commit that referenced this pull request Jun 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants