
Conversation

@glaringlee (Contributor) commented Feb 25, 2020

Stack from ghstack:

Differential Revision: D20107158

glaringlee pushed a commit that referenced this pull request Feb 25, 2020
dr-ci bot commented Feb 26, 2020

💊 CircleCI build failures summary and remediations

As of commit 3229b44 (more details on the Dr. CI page):


None of the build failures appear to be your fault 💚



❄️ 2 tentatively flaky failures

2 failures tentatively classified as flaky; reruns have not been launched to confirm:

See CircleCI build pytorch_linux_xenial_cuda10_1_cudnn7_py3_multigpu_test (1/2)

Step: "Test" (full log | pattern match details) ❄️

Mar 03 20:35:22 RuntimeError: Error downloading resource!
Mar 03 20:35:22  
Mar 03 20:35:22 During handling of the above exception, another exception occurred: 
Mar 03 20:35:22  
Mar 03 20:35:22 Traceback (most recent call last): 
Mar 03 20:35:22   File "tools/download_mnist.py", line 87, in <module> 
Mar 03 20:35:22     main() 
Mar 03 20:35:22   File "tools/download_mnist.py", line 80, in main 
Mar 03 20:35:22     download(path, url, options.quiet) 
Mar 03 20:35:22   File "tools/download_mnist.py", line 41, in download 
Mar 03 20:35:22     raise RuntimeError('Error downloading resource!') 
Mar 03 20:35:22 RuntimeError: Error downloading resource! 
Mar 03 20:35:22 + cleanup 
Mar 03 20:35:22 + retcode=1 
Mar 03 20:35:22 + set +x 
Mar 03 20:35:22 =================== sccache compilation log =================== 
Mar 03 20:35:22 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Mar 03 20:35:22 Compile requests                 0 
Mar 03 20:35:22 Compile requests executed        0 
Mar 03 20:35:22 Cache hits                       0 
Mar 03 20:35:22 Cache misses                     0 
Mar 03 20:35:22 Cache timeouts                   0 

See CircleCI build pytorch_linux_xenial_cuda10_1_cudnn7_py3_NO_AVX2_test (2/2)

Step: "Test" (full log | pattern match details) ❄️

Mar 03 21:46:13 RuntimeError: Error downloading resource!
Mar 03 21:46:13  
Mar 03 21:46:13 During handling of the above exception, another exception occurred: 
Mar 03 21:46:13  
Mar 03 21:46:13 Traceback (most recent call last): 
Mar 03 21:46:13   File "tools/download_mnist.py", line 87, in <module> 
Mar 03 21:46:13     main() 
Mar 03 21:46:13   File "tools/download_mnist.py", line 80, in main 
Mar 03 21:46:13     download(path, url, options.quiet) 
Mar 03 21:46:13   File "tools/download_mnist.py", line 41, in download 
Mar 03 21:46:13     raise RuntimeError('Error downloading resource!') 
Mar 03 21:46:13 RuntimeError: Error downloading resource! 
Mar 03 21:46:13 + cleanup 
Mar 03 21:46:13 + retcode=1 
Mar 03 21:46:13 + set +x 
Mar 03 21:46:13 =================== sccache compilation log =================== 
Mar 03 21:46:13 =========== If your build fails, please take a look at the log above for possible reasons =========== 
Mar 03 21:46:13 Compile requests                15 
Mar 03 21:46:13 Compile requests executed        0 
Mar 03 21:46:13 Cache hits                       0 
Mar 03 21:46:13 Cache misses                     0 
Mar 03 21:46:13 Cache timeouts                   0 


if isinstance(resize, str):
    return "{}.resize_({}.sizes());".format(arg['name'], resize)
else:
    resize_scalar = arg.get('resize_scalar', False)
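
For reference, a rough sketch of what this codegen branch emits; the argument name and resize target below are hypothetical stand-ins, not values taken from the real declarations:

# Hypothetical inputs standing in for a parsed declaration entry.
arg = {'name': 'result'}
resize = 'self'

emitted = "{}.resize_({}.sizes());".format(arg['name'], resize)
print(emitted)  # result.resize_(self.sizes());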
@zou3519 (Contributor) commented Feb 26, 2020

The existence of resize_scalar makes it seem like ger, at some point in the past, supported accepting a 0D tensor. It doesn't accept 0D tensors on master and no one has complained about it, so this change seems fine to me.

If we want to be really safe, we can try to figure out whether torch.ger ever accepted a 0D tensor in the past and, if so, when it stopped accepting one.
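
As a quick sanity check of the behavior described above, a minimal sketch (exact error messages vary by version):

import torch

v = torch.randn(3)
s = torch.tensor(2.0)    # 0-D (scalar) tensor

torch.ger(v, v)          # OK: 3x3 outer product

try:
    torch.ger(s, v)      # rejected: ger expects 1-D arguments
except RuntimeError as e:
    print("0-D input rejected:", e)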

@glaringlee (Contributor, Author) replied:

Checked a little bit. I think the people who made the _th_ger change wanted to make this resize safe.
In the legacy code, the size check is inside the addr function, but this resize happens before addr is called. _th_ger calls addr underneath, and addr doesn't allow a 0D vec, so ger doesn't support a 0D vec anyway.
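
A minimal check of the addr-side behavior mentioned above (a sketch; the error text is version-dependent):

import torch

M = torch.zeros(3, 4)
v1 = torch.randn(3)
v2 = torch.randn(4)
s = torch.tensor(1.0)     # 0-D tensor

torch.addr(M, v1, v2)     # OK: M + outer(v1, v2)

try:
    torch.addr(M, s, v2)  # addr requires 1-D vec arguments
except RuntimeError as e:
    print("0-D vec rejected:", e)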

@zou3519 (Contributor) left a review comment:

The new resize semantics introduced in this PR are BC-breaking; we should either:

  1. follow the old behavior, or
  2. follow the old behavior and issue a deprecation warning.

glaringlee pushed a commit that referenced this pull request Feb 26, 2020
@glaringlee requested a review from zou3519 February 27, 2020 00:05
@glaringlee (Contributor, Author) commented:

Discussed with @zou3519; we will keep the old TH resize behavior in this PR. We will open a new PR if we need to deprecate the legacy resize behavior.
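
For context, a small sketch of what the preserved legacy resize behavior looks like from the Python side (an illustration under assumptions: torch.ger with out=; newer releases may additionally warn when resizing a non-empty out tensor):

import torch

a = torch.randn(3)
b = torch.randn(4)

out = torch.empty(5)          # deliberately mismatched shape
torch.ger(a, b, out=out)      # legacy TH behavior: out is resized to (3, 4)
print(out.shape)              # torch.Size([3, 4])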

Tensor& ger_out(Tensor &result, const Tensor& self, const Tensor& vec2) {
  check_1d(self, "self", "ger");
  check_1d(vec2, "vec2", "ger");
  if (result.dim() != 2 || result.size(0) != self.size(0) || result.size(1) != vec2.size(0)) {
A reviewer (Contributor) commented:

nit: I think it's clearer if you do something like

if (result.sizes() != {self.size(0), vec2.size(0)}) {
   result.resize_({...});
}

@glaringlee (Contributor, Author) replied:

Will do if there are more dimensions to check. For two dimensions, I think this is still fine and saves a memory allocation, so it's slightly faster.

glaringlee pushed a commit that referenced this pull request Feb 29, 2020
@glaringlee requested a review from ailzhang March 2, 2020 16:08
@glaringlee (Contributor, Author) commented:

This breaks the XLA CI test; adding Ailing to give an update here once the XLA side is ready.

@ailzhang (Contributor) commented Mar 2, 2020

@pytorchbot rebase this please

@ailzhang closed this Mar 2, 2020
@ailzhang reopened this Mar 2, 2020
@ailzhang (Contributor) commented Mar 2, 2020

ehhh what happened to our pytorchbot? :P
@glaringlee I think this PR should be fine if you rebase on top of master. Would you mind rebasing and seeing whether the XLA tests pass?
The current failure is caused by my update to the API between PT and XLA.

@glaringlee requested a review from ailzhang March 2, 2020 23:13
glaringlee pushed a commit that referenced this pull request Mar 3, 2020
@ailzhang (Contributor) left a review comment:

Thanks!

glaringlee pushed a commit that referenced this pull request Mar 3, 2020
glaringlee pushed a commit that referenced this pull request Mar 3, 2020
@facebook-github-bot commented:

@glaringlee merged this pull request in 57c1b80.

ttumiel pushed a commit to ttumiel/pytorch that referenced this pull request Mar 4, 2020
…ytorch#33792)

Summary: Pull Request resolved: pytorch#33792

Test Plan: Imported from OSS

Differential Revision: D20107158

Pulled By: glaringlee

fbshipit-source-id: bceddb2d39d3abf36f277daba537677312449c9c
@facebook-github-bot deleted the gh/glaringlee/7/head branch March 7, 2020 15:18