tensor.new() can have nan's, and some pytorch code is thus unsafe #1347

Closed · ankitkv opened this issue Apr 24, 2017 · 5 comments

ankitkv (Contributor) commented Apr 24, 2017

tensor.new() does not initialize memory, so the returned tensor may contain NaNs. This can be unsafe in some cases.
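
A quick way to see this (a minimal sketch using the 2017-era constructor; the printed values are nondeterministic because the memory is uninitialized):

import torch

x = torch.Tensor(5)   # allocates storage without initializing it
y = x.new(3, 3)       # same type as x; contents are whatever bytes were in memory
print(y)              # may show garbage values, occasionally NaN bit patterns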

For example, in torch/autograd/variable.py:

def mm(self, matrix):
    output = Variable(self.data.new(self.data.size(0), matrix.data.size(1)))
    return self._static_blas(Addmm, (output, 0, 1, self, matrix), False)

def bmm(self, batch):
    output = Variable(self.data.new(self.data.size(0), self.data.size(1),
                                    batch.data.size(2)))
    return self._static_blas(Baddbmm, (output, 0, 1, self, batch), False)

def mv(self, vector):
    output = Variable(self.data.new(self.data.size(0)))
    return self._static_blas(Addmv, (output, 0, 1, self, vector), False)

def ger(self, vector):
    output = Variable(self.data.new(self.data.size(0), vector.data.size(0)))
    return self._static_blas(Addr, (output, 0, 1, self, vector), False)

is dangerous because output could contain a NaN, and even though alpha is set to 0, NaN * 0 = NaN, so the result can still contain a NaN (I had an optim test fail because of a NaN that originated this way).
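
The arithmetic itself is easy to check in plain Python (a toy illustration of the scalar case, not the actual BLAS call path):

import math

nan = float('nan')          # stand-in for an uninitialized garbage value
result = 0 * nan + 1 * 6.0  # alpha * output + 1 * (self @ matrix), with alpha = 0
print(result)               # nan: the zero coefficient does not mask the NaN
print(math.isnan(result))   # True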

I haven't done an exhaustive search, so there may be other places in the code that could have this issue.

soumith (Member) commented Apr 25, 2017

Inside the BLAS side of things, we explicitly check for the special case of alpha=0 and zero the output.

If you see that this isn't true for a particular BLAS call, let me know and I'll fix it, but for mm and mv it should already be fixed.
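
(For illustration only, not the actual TH/BLAS source: a toy scalar analogue of the special case described above, keeping the thread's convention that alpha is the coefficient on the output.)

def addmm_like(alpha, out, beta, a, b):
    # When the coefficient on out is zero, never read out: its possibly
    # uninitialized contents would otherwise propagate NaNs via 0 * NaN.
    if alpha == 0:
        return beta * (a * b)
    return alpha * out + beta * (a * b)

print(addmm_like(0, float('nan'), 1, 2.0, 3.0))  # 6.0; the NaN is masked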

ankitkv (Contributor, Author) commented Apr 25, 2017

I see. I had the issue with ger; can you check?

ankitkv (Contributor, Author) commented Apr 25, 2017

For instance, undoing the zeroing of the new tensors in the four functions mentioned in the opening post causes tests to fail in #1306. The tests pass on one of the Python versions, while on my machine a different test suite fails (which is where I tracked down the root cause). It seems very possible that I introduced this vulnerability somehow, because it doesn't happen without my PR. I'll look into it.

ngimel (Collaborator) commented Apr 25, 2017

ankitkv (Contributor, Author) commented Apr 25, 2017

OK, I confirmed that this is not a problem without my PR (I manually inserted NaNs into output to test).
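
A regression check along these lines, written against today's torch.addmm signature (where beta is the coefficient on the input/output tensor), could look like:

import torch

a = torch.randn(4, 3)
b = torch.randn(3, 5)
out = torch.full((4, 5), float('nan'))  # simulate uninitialized memory full of NaNs
res = torch.addmm(out, a, b, beta=0, alpha=1)
# With the BLAS-side special case in place, beta=0 must ignore out entirely.
assert not torch.isnan(res).any()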

ankitkv closed this as completed Apr 25, 2017