Fix failing test due to a bug in NumPy when using OpenBLAS #67679

Closed
wants to merge 4 commits into from

Conversation

Collaborator

@lezcano lezcano commented Nov 2, 2021

Stack from ghstack:

implementations

Fixes #67675

cc @mruberry

Differential Revision: D32368698

@pytorch-probot

pytorch-probot bot commented Nov 2, 2021

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/pytorch/pytorch/blob/a63ee85d5406c6a39436f28f4ab3a056d00b5237/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default

| Workflows | Labels (bold enabled) | Status |
| --- | --- | --- |
| Triggered Workflows | | |
| linux-bionic-py3.6-clang9 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/xla | ✅ triggered |
| linux-vulkan-bionic-py3.6-clang9 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/vulkan | ✅ triggered |
| linux-xenial-cuda11.3-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux | ✅ triggered |
| linux-xenial-py3-clang5-mobile-build | ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile | ✅ triggered |
| linux-xenial-py3-clang5-mobile-custom-build-dynamic | ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile | ✅ triggered |
| linux-xenial-py3-clang5-mobile-custom-build-static | ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile | ✅ triggered |
| linux-xenial-py3.6-clang7-asan | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers | ✅ triggered |
| linux-xenial-py3.6-clang7-onnx | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx | ✅ triggered |
| linux-xenial-py3.6-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux | ✅ triggered |
| linux-xenial-py3.6-gcc7 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux | ✅ triggered |
| linux-xenial-py3.6-gcc7-bazel-test | ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux | ✅ triggered |
| win-vs2019-cpu-py3 | ciflow/all, ciflow/cpu, ciflow/default, ciflow/win | ✅ triggered |
| win-vs2019-cuda11.3-py3 | ciflow/all, ciflow/cuda, ciflow/default, ciflow/win | ✅ triggered |
| Skipped Workflows | | |
| caffe2-linux-xenial-py3.6-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux | 🚫 skipped |
| docker-builds | ciflow/all | 🚫 skipped |
| libtorch-linux-xenial-cuda10.2-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux | 🚫 skipped |
| libtorch-linux-xenial-cuda11.3-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux | 🚫 skipped |
| linux-bionic-cuda10.2-py3.9-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow | 🚫 skipped |
| linux-xenial-py3-clang5-mobile-code-analysis | ciflow/all, ciflow/linux, ciflow/mobile | 🚫 skipped |
| parallelnative-linux-xenial-py3.6-gcc5.4 | ciflow/all, ciflow/cpu, ciflow/linux | 🚫 skipped |
| periodic-libtorch-linux-xenial-cuda11.1-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck | 🚫 skipped |
| periodic-linux-xenial-cuda11.1-py3.6-gcc7 | ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled | 🚫 skipped |
| periodic-win-vs2019-cuda11.1-py3 | ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win | 🚫 skipped |

You can add a comment to the PR and tag @pytorchbot with the following commands:
# ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun

# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and trigger the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow

For more information, please take a look at the CI Flow Wiki.

@facebook-github-bot
Contributor

facebook-github-bot commented Nov 2, 2021

💊 CI failures summary and remediations

As of commit a63ee85 (more details on the Dr. CI page):


  • 3/3 failures introduced in this PR

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_build (1/3)

Step: "(Optional) Merge target branch"

Automatic merge failed; fix conflicts and then commit the result.
CONFLICT (add/add): Merge conflict in .circleci/scripts/cpp_doc_push_script.sh
Auto-merging .circleci/scripts/cpp_doc_push_script.sh
CONFLICT (add/add): Merge conflict in .circleci/generate_config_yml.py
Auto-merging .circleci/generate_config_yml.py
CONFLICT (add/add): Merge conflict in .circleci/docker/common/install_rocm.sh
Auto-merging .circleci/docker/common/install_rocm.sh
CONFLICT (add/add): Merge conflict in .circleci/config.yml
Auto-merging .circleci/config.yml
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/simple/android_definitions.py
Auto-merging .circleci/cimodel/data/simple/android_definitions.py
Automatic merge failed; fix conflicts and then commit the result.


Exited with code exit status 1

See CircleCI build pytorch_xla_linux_bionic_py3_6_clang9_build (2/3)

Step: "(Optional) Merge target branch"

Automatic merge failed; fix conflicts and then commit the result.
CONFLICT (add/add): Merge conflict in .circleci/scripts/cpp_doc_push_script.sh
Auto-merging .circleci/scripts/cpp_doc_push_script.sh
CONFLICT (add/add): Merge conflict in .circleci/generate_config_yml.py
Auto-merging .circleci/generate_config_yml.py
CONFLICT (add/add): Merge conflict in .circleci/docker/common/install_rocm.sh
Auto-merging .circleci/docker/common/install_rocm.sh
CONFLICT (add/add): Merge conflict in .circleci/config.yml
Auto-merging .circleci/config.yml
CONFLICT (add/add): Merge conflict in .circleci/cimodel/data/simple/android_definitions.py
Auto-merging .circleci/cimodel/data/simple/android_definitions.py
Automatic merge failed; fix conflicts and then commit the result.


Exited with code exit status 1

See GitHub Actions build linux-xenial-py3.6-gcc5.4 / test (backwards_compat, 1, 1, linux.2xlarge) (3/3)

Step: "Test"

2021-11-11T08:37:18.6510235Z processing existing schema:  alltoall_base(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor _1, Tensor _2, int[] _3, int[] _4) -> (__torch__.torch.classes.dist_c10d.Work _0)
2021-11-11T08:37:18.6511717Z processing existing schema:  alltoall(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, Tensor[] _2) -> (__torch__.torch.classes.dist_c10d.Work _0)
2021-11-11T08:37:18.6513521Z processing existing schema:  send(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, int _2, int _3) -> (__torch__.torch.classes.dist_c10d.Work _0)
2021-11-11T08:37:18.6515480Z processing existing schema:  recv(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, int _2, int _3) -> (__torch__.torch.classes.dist_c10d.Work _0)
2021-11-11T08:37:18.6517181Z processing existing schema:  recv_anysource(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, int _2) -> (__torch__.torch.classes.dist_c10d.Work _0)
2021-11-11T08:37:18.6518416Z processing existing schema:  barrier(__torch__.torch.classes.dist_c10d.ProcessGroup _0) -> (__torch__.torch.classes.dist_c10d.Work _0)
2021-11-11T08:37:18.6519450Z processing existing schema:  __init__(__torch__.torch.classes.dist_c10d.frontend _0) -> (NoneType _0)
2021-11-11T08:37:18.6520800Z processing existing schema:  new_process_group_helper(__torch__.torch.classes.dist_c10d.frontend _0, int _1, int _2, int[] _3, str _4, __torch__.torch.classes.dist_c10d.Store _5, str? _6, int _7) -> (__torch__.torch.classes.dist_c10d.ProcessGroup _0)
2021-11-11T08:37:18.6522283Z processing existing schema:  get_process_group_by_name(__torch__.torch.classes.dist_c10d.frontend _0, str _1) -> (__torch__.torch.classes.dist_c10d.ProcessGroup _0)
2021-11-11T08:37:18.6523601Z processing existing schema:  get_name_of_process_group(__torch__.torch.classes.dist_c10d.frontend _0, __torch__.torch.classes.dist_c10d.ProcessGroup _1) -> (str _0)
2021-11-11T08:37:18.6524704Z The PR is introducing backward incompatible changes to the operator library. Please contact PyTorch team to confirm whether this change is wanted or not. 
2021-11-11T08:37:18.6525258Z 
2021-11-11T08:37:18.6525519Z Broken ops: [
2021-11-11T08:37:18.6526451Z 	aten::searchsorted.Tensor(Tensor sorted_sequence, Tensor self, *, bool out_int32=False, bool right=False, str? side=None, Tensor? sorter=None) -> (Tensor)
2021-11-11T08:37:18.6527714Z 	aten::searchsorted.Scalar(Tensor sorted_sequence, Scalar self, *, bool out_int32=False, bool right=False, str? side=None, Tensor? sorter=None) -> (Tensor)
2021-11-11T08:37:18.6529144Z 	aten::searchsorted.Tensor_out(Tensor sorted_sequence, Tensor self, *, bool out_int32=False, bool right=False, str? side=None, Tensor? sorter=None, Tensor(a!) out) -> (Tensor(a!))
2021-11-11T08:37:18.6530203Z 	aten::_foreach_norm.Scalar(Tensor[] tensors, Scalar ord=2) -> (Tensor[])
2021-11-11T08:37:18.6530964Z 	aten::split_with_sizes(Tensor(a -> *) self, int[] split_sizes, int dim=0) -> (Tensor[])
2021-11-11T08:37:18.6531732Z 	aten::split.Tensor(Tensor(a -> *) self, int split_size, int dim=0) -> (Tensor[])
2021-11-11T08:37:18.6532445Z 	aten::unbind.int(Tensor(a -> *) self, int dim=0) -> (Tensor[])
2021-11-11T08:37:18.6533115Z 	aten::unbind.Dimname(Tensor(a -> *) self, str dim) -> (Tensor[])

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@lezcano added the "module: tests" label (Issues related to tests, not the torch.testing module) on Nov 2, 2021
Collaborator

@IvanYashchuk IvanYashchuk left a comment

It's possible to get the information about which LAPACK implementation is used in NumPy with numpy.show_config(). Maybe the test should pass only when OpenBLAS is used.
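
A minimal sketch of what such a check could look like, assuming a plain substring search over the text that numpy.show_config() prints. The helper names capture_numpy_config and mentions_openblas are hypothetical and are not part of this PR or of NumPy. The printed format differs between NumPy versions, so a match like this is fragile.

```python
import contextlib
import io


def capture_numpy_config() -> str:
    """Return the text that numpy.show_config() prints to stdout."""
    # Imported lazily so the parser below can be used without NumPy installed.
    import numpy as np

    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


def mentions_openblas(config_text: str) -> bool:
    """Hypothetical helper: True if the build info names OpenBLAS anywhere.

    Assumption: a case-insensitive substring match is enough. This is a
    heuristic, not a documented NumPy interface.
    """
    return "openblas" in config_text.lower()
```

A test could then call mentions_openblas(capture_numpy_config()) and branch on the result.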

test/test_linalg.py (outdated, resolved)
@IvanYashchuk changed the title from "Fix failing test due to a bug in NumPy when using some BLAS" to "Fix failing test due to a bug in NumPy when using OpenBLAS" on Nov 2, 2021
implementations

Fixes #67675

cc mruberry

[ghstack-poisoned]
lezcano added a commit that referenced this pull request Nov 2, 2021
implementations

Fixes #67675

ghstack-source-id: 684ad941ae271a5ef3a9c0553d32ab4b1bdb99b7
Pull Request resolved: #67679
@lezcano
Collaborator Author

lezcano commented Nov 2, 2021

I've looked into numpy.show_config(), but parsing its output would make this a very brittle test. I vote for leaving the fix as it is, since I don't think we'll encounter many more NumPy bugs in that particular piece of code.

test/test_linalg.py (outdated, resolved)
implementations

Fixes #67675

cc mruberry

[ghstack-poisoned]
lezcano added a commit that referenced this pull request Nov 3, 2021
implementations

Fixes #67675

ghstack-source-id: 0b37c1ba7d522143ee3b3835ac2fe5595a2a54d2
Pull Request resolved: #67679
@lezcano
Collaborator Author

lezcano commented Nov 3, 2021

@mruberry this should be ready.

@mruberry
Collaborator

Hey @lezcano this looks great but looks like it picked up some kineto updates, too.

Would you remove those and just reping me to merge this?

implementations

Fixes #67675

cc mruberry

[ghstack-poisoned]
lezcano added a commit that referenced this pull request Nov 11, 2021
…ations

Fixes #67675

ghstack-source-id: 705569c1a653288aad45c624e3ab0fbeaddcb956
Pull Request resolved: #67679
@lezcano
Collaborator Author

lezcano commented Nov 11, 2021

Submodules are my worst enemy, or at least, certainly in the top 5.
Fixed @mruberry

Collaborator

@mruberry mruberry left a comment

Cool! Thanks @lezcano

@mruberry
Collaborator

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mruberry merged this pull request in 43874d7.

seemethere added a commit that referenced this pull request Nov 15, 2021
Summary:
Pull Request resolved: #67679

implementations

Fixes #67675

cc mruberry

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D32368698

Pulled By: mruberry

fbshipit-source-id: 3ea6ebc43c061af2f376cdf5da06884859bbbf53
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
ghstack-source-id: 856e69d7e57f4e8cd8c794feda9487f006c7dfde
jambayk pushed a commit to jambayk/pytorch that referenced this pull request Feb 11, 2022
…7679)

Summary:
Pull Request resolved: pytorch#67679

implementations

Fixes pytorch#67675

cc mruberry

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D32368698

Pulled By: mruberry

fbshipit-source-id: 3ea6ebc43c061af2f376cdf5da06884859bbbf53
jambayk pushed a commit to jambayk/pytorch that referenced this pull request Feb 14, 2022
…7679)

Summary:
Pull Request resolved: pytorch#67679

implementations

Fixes pytorch#67675

cc mruberry

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D32368698

Pulled By: mruberry

fbshipit-source-id: 3ea6ebc43c061af2f376cdf5da06884859bbbf53
jambayk pushed a commit to jambayk/pytorch that referenced this pull request Feb 14, 2022
…7679)

Summary:
Pull Request resolved: pytorch#67679

implementations

Fixes pytorch#67675

cc mruberry

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D32368698

Pulled By: mruberry

fbshipit-source-id: 3ea6ebc43c061af2f376cdf5da06884859bbbf53
malfet pushed a commit that referenced this pull request Feb 15, 2022
…72820)

Summary:
Pull Request resolved: #67679

implementations

Fixes #67675

cc mruberry

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D32368698

Pulled By: mruberry

fbshipit-source-id: 3ea6ebc43c061af2f376cdf5da06884859bbbf53

Co-authored-by: lezcano <lezcano-93@hotmail.com>
jaglinux added a commit to jaglinux/pytorch that referenced this pull request Feb 16, 2022
Cherrypicked the changes from master PT
pytorch#67679

Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>
jithunnair-amd pushed a commit to ROCm/pytorch that referenced this pull request Feb 23, 2022
Cherrypicked the changes from master PT
pytorch#67679

Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>
jataylo pushed a commit to jataylo/pytorch that referenced this pull request Aug 25, 2022
Cherrypicked the changes from master PT
pytorch#67679

Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>
(cherry picked from commit bda1d0a)
jataylo pushed a commit to jataylo/pytorch that referenced this pull request Aug 30, 2022
Cherrypicked the changes from master PT
pytorch#67679

Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>
(cherry picked from commit bda1d0a)
jithunnair-amd pushed a commit to jithunnair-amd/pytorch that referenced this pull request Sep 20, 2022
Cherrypicked the changes from master PT
pytorch#67679

Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>
(cherry picked from commit bda1d0a)
jithunnair-amd pushed a commit to ROCm/pytorch that referenced this pull request Sep 28, 2022
Cherrypicked the changes from master PT
pytorch#67679

Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>
(cherry picked from commit bda1d0a)
Labels: cla signed, Merged, module: tests (Issues related to tests, not the torch.testing module), open source
6 participants