Fix BBFlip issues #1738

Merged
merged 2 commits into from
Feb 18, 2020

@mzient mzient commented Feb 17, 2020

  • Fix a bug in TensorList: the copy from TensorVector was issued on the wrong stream.
  • Rework ArgValue to remove allocation and deallocation from the steady-state execution path.
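
The ArgValue changes themselves are not shown in this conversation, so the following is only a generic sketch of the pattern the second bullet describes: keep a buffer alive across iterations so that the steady-state path reuses existing capacity instead of allocating and freeing on every call. `ReusableArgBuffer` and `RunSteadyState` are hypothetical names for illustration and are not part of DALI.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT DALI's ArgValue: it owns a scratch buffer that is
// grown on demand and then reused, so repeated requests that fit the existing
// capacity do not touch the heap.
class ReusableArgBuffer {
 public:
  // Returns storage for `n` floats. The underlying vector reallocates only
  // when `n` exceeds its current capacity; that event is counted.
  float *Acquire(std::size_t n) {
    if (n > storage_.capacity())
      ++reallocations_;
    storage_.resize(n);
    return storage_.data();
  }

  std::size_t reallocations() const { return reallocations_; }

 private:
  std::vector<float> storage_;
  std::size_t reallocations_ = 0;
};

// Runs `iterations` steps with a fixed batch size, writing into the buffer
// each time, and returns the total number of reallocations seen so far.
std::size_t RunSteadyState(ReusableArgBuffer &buf, std::size_t batch,
                           int iterations) {
  for (int i = 0; i < iterations; i++) {
    float *p = buf.Acquire(batch);
    for (std::size_t j = 0; j < batch; j++)
      p[j] = static_cast<float>(i);
  }
  return buf.reallocations();
}
```

After a warm-up call to `Acquire` with the largest expected batch size, further iterations at that size or smaller leave the reallocation count unchanged, which is the property the bullet asks of the steady-state path.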

Why do we need this PR?

  • It fixes a bug: race condition in BBFlip

What happened in this PR?

  • What solution was applied:
    • Fixed TensorList::Copy
    • Reworked ArgValue for performance (removed allocations)
  • Affected modules and functionalities:
    • TensorList
    • ArgValue helper class
    • BbFlip<GPUBackend>
  • Key points relevant for the review:
    • See modules
  • Validation and testing:
    • No reliable regression test available. Manual repro fixed.
  • Documentation (including examples):
    • N/A

JIRA TASK: N/A

@mzient mzient requested a review from a team February 17, 2020 16:33
@mzient mzient changed the title from "Fix BBFlip issues:" to "Fix BBFlip issues" Feb 17, 2020
mzient commented Feb 17, 2020

!build

* Fix a bug in TensorList: copying from TensorVector on a wrong stream.
* Rework ArgValue to remove allocation and free from steady-state execution path.

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
@dali-automaton
Collaborator

CI MESSAGE: [1133736]: BUILD STARTED

@@ -107,7 +107,7 @@ class DLL_PUBLIC TensorList : public Buffer<Backend> {
         type.template Copy<SrcBackend, Backend>(
             raw_mutable_tensor(i),
             other[i].raw_data(),
-            other[i].size(), 0);
+            other[i].size(), stream);
Contributor Author

This is the actual bug fix.
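
The change replaces a hard-coded `0` with the caller's `stream`. As a rough illustration of why that matters, here is a toy, host-only model of stream ordering: work enqueued on one stream runs in FIFO order, while nothing orders work across two different streams. `ToyStream` and `ConsumeAfterCopy` are hypothetical names, and the model deliberately ignores the additional synchronization rules that real CUDA streams have, so treat it as a sketch of the race rather than a reproduction of it.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-in for a GPU stream (NOT the CUDA API): work items run in FIFO
// order within one stream; two different streams have no mutual ordering.
struct ToyStream {
  std::deque<std::function<void()>> pending;

  void Enqueue(std::function<void()> work) {
    pending.push_back(std::move(work));
  }

  // Executes all pending work in the order it was enqueued.
  void Drain() {
    while (!pending.empty()) {
      auto work = std::move(pending.front());
      pending.pop_front();
      work();
    }
  }
};

// Enqueues a copy on `copy_stream` and a read of the destination on
// `consumer_stream`, then drains the consumer's stream first. That schedule
// is permitted when the streams differ; when both arguments are the same
// stream, FIFO order forces the copy to run before the read.
// Returns the value the consumer observed (-1 means stale data).
int ConsumeAfterCopy(ToyStream &copy_stream, ToyStream &consumer_stream) {
  std::vector<int> src = {42};
  std::vector<int> dst = {-1};
  int observed = 0;
  copy_stream.Enqueue([&] { dst = src; });
  consumer_stream.Enqueue([&] { observed = dst[0]; });
  consumer_stream.Drain();
  copy_stream.Drain();
  return observed;
}
```

In this model, issuing the copy on a stream other than the consumer's can let the consumer read stale data, whereas issuing both on the same stream cannot, which mirrors the intent of passing `stream` instead of `0`.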

@dali-automaton
Collaborator

CI MESSAGE: [1133736]: BUILD PASSED

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
mzient commented Feb 18, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1134649]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1134649]: BUILD PASSED

@mzient mzient merged commit d8ee8fa into NVIDIA:master Feb 18, 2020
@mzient mzient deleted the FixBBFlip branch March 27, 2020 13:40
4 participants