
Fix #1085: Convolve with distributed kernel on multiple GPUs #1095

Merged
merged 4 commits into main from bug/1085-bcast-gpu
Feb 27, 2023

Conversation

shahpratham
Collaborator

Description

Issue resolved: #1085
Convolve does not work on multiple GPUs when the kernel is distributed.

Changes proposed:

  • `bcast` --> `Bcast`

Type of change

  • Bug fix

Due Diligence

  • All split configurations tested
  • Multiple dtypes tested in relevant functions
  • Documentation updated (if needed)
  • Title of PR is suitable for corresponding CHANGELOG entry

Does this change modify the behaviour of other functions? If so, which?

No.


@mtar
Collaborator

mtar commented Feb 8, 2023

Hi @shahpratham , more changes are necessary. Bcast uses the buffer for input and output.

@mtar
Collaborator

mtar commented Feb 24, 2023

I need the bug fix now. So, I'm going to fix the PR.

@codecov

codecov bot commented Feb 24, 2023

Codecov Report

Merging #1095 (10437b1) into main (576eb78) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #1095   +/-   ##
=======================================
  Coverage   91.80%   91.80%           
=======================================
  Files          72       72           
  Lines       10519    10520    +1     
=======================================
+ Hits         9657     9658    +1     
  Misses        862      862           
Flag | Coverage Δ
unit | 91.80% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files | Coverage Δ
heat/core/signal.py | 97.80% <100.00%> (+0.02%) ⬆️


@mtar mtar changed the title Fix #1085: Convolve with distributed kernel on multiplel GPUs Fix #1085: Convolve with distributed kernel on multiple GPUs Feb 24, 2023
@mtar mtar added the bug Something isn't working label Feb 24, 2023
@ClaudiaComito
Contributor

I need the bug fix now. So, I'm going to fix the PR.

Thanks @mtar. Generally speaking, if you've identified both the problem and the solution, feel free to submit a fix right away.
In this specific case, the error arises on the cluster in a multi-GPU setup, so I'm not sure @shahpratham could have done any more debugging.

I'll go back to PR reviews on March 3rd.

@mtar mtar added this pull request to the merge queue Feb 27, 2023
Merged via the queue into main with commit ddb87d6 Feb 27, 2023
@mtar mtar deleted the bug/1085-bcast-gpu branch March 23, 2023 12:33
Successfully merging this pull request may close these issues.

[Bug]: convolve with distributed kernel on multiple GPUs
3 participants