mlazos
Contributor

@mlazos mlazos commented Sep 7, 2023

Summary:
Original commit changeset: f15956d96311

Original Phabricator Diff: D48996091

Test Plan: Reverting to Unbreak test

Differential Revision: D49065517

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov

@pytorch-bot

pytorch-bot bot commented Sep 7, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/108793

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 2 Unrelated Failures

As of commit 445576a with merge base 774c822:

NEW FAILURE - The following job has failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D49065517

@facebook-github-bot
Contributor

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorch-bot pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) Sep 8, 2023

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Collaborator

Merge failed

Reason: Command git -C /home/runner/work/pytorch/pytorch merge --squash __pull-request-108793__init__ returned non-zero exit code 1

Auto-merging torch/_inductor/ir.py
Auto-merging torch/_inductor/lowering.py
Auto-merging torch/_inductor/utils.py
CONFLICT (content): Merge conflict in torch/_inductor/utils.py
Squash commit -- not updating HEAD
Automatic merge failed; fix conflicts and then commit the result.
Details for Dev Infra team Raised by workflow job

@Chillee
Collaborator

Chillee commented Sep 8, 2023

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

@pytorchmergebot
Collaborator

Merge failed

Reason: 2 jobs have failed, first few of them are: trunk / linux-focal-rocm5.6-py3.8 / test (default, 1, 3, linux.rocm.gpu), trunk / linux-focal-rocm5.6-py3.8 / test (default, 2, 3, linux.rocm.gpu)

Details for Dev Infra team Raised by workflow job

@Chillee
Collaborator

Chillee commented Sep 8, 2023

@pytorchbot merge -f "failures unrelated"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

@atalman
Contributor

atalman commented Sep 8, 2023

@mlazos Looks like it breaks inductor tests

_________________________ AotInductorTests.test_addmm __________________________
Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
    subprocess.run(
  File "/opt/conda/envs/py_3.10/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v', '-j', '14']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/var/lib/jenkins/workspace/test/inductor/test_aot_inductor.py", line 219, in test_addmm
    actual = AOTInductorModelRunner.run(model, example_inputs, expected)
  File "/var/lib/jenkins/workspace/test/inductor/test_aot_inductor.py", line 63, in run
    optimized, exported, output_tensors, output_spec = AOTInductorModelRunner.load(
  File "/var/lib/jenkins/workspace/test/inductor/test_aot_inductor.py", line 50, in load
    optimized = torch.utils.cpp_extension.load_inline(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1635, in load_inline
    return _jit_compile(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1710, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1823, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'aot_inductor': [1/2] c++ -MMD -MF main.o.d -DTORCH_EXTENSION_NAME=aot_inductor -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -isystem /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include -isystem /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/TH -isystem /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/envs/py_3.10/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=1 -fPIC -std=c++17 -c /var/lib/jenkins/.cache/torch_extensions/py310_cu121/aot_inductor/main.cpp -o main.o
FAILED: main.o
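
For readers unfamiliar with the two-part traceback above: the shape (a `CalledProcessError` from `subprocess.run`, followed by "The above exception was the direct cause of the following exception" and a `RuntimeError`) is what Python prints when an exception is re-raised with `raise ... from e`. The following is a minimal sketch of that pattern only, not the actual `cpp_extension` code; `run_build` and the failing command are hypothetical stand-ins for the ninja invocation.

```python
import subprocess
import sys


def run_build(cmd):
    """Run a build command; re-raise a failure as RuntimeError, keeping the cause.

    Hypothetical helper for illustration; not part of torch.utils.cpp_extension.
    """
    try:
        # check=True makes subprocess.run raise CalledProcessError on a non-zero exit.
        subprocess.run(cmd, check=True, capture_output=True)
    except subprocess.CalledProcessError as e:
        # "from e" stores the original error as __cause__, which produces the
        # "direct cause of the following exception" line in the traceback.
        raise RuntimeError(
            f"Error building extension: exit status {e.returncode}"
        ) from e


if __name__ == "__main__":
    # Stand-in for the failing build step: a child process that exits with status 1.
    failing_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]
    try:
        run_build(failing_cmd)
    except RuntimeError as err:
        # The underlying CalledProcessError is still reachable for debugging.
        print(type(err.__cause__).__name__, err.__cause__.returncode)
```

When reading such a traceback, the first (inner) exception only says that the build command exited non-zero; the compiler diagnostics are carried in the message of the outer `RuntimeError`, after the `FAILED: main.o` line.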

@mlazos
Contributor Author

mlazos commented Sep 8, 2023

> @mlazos Looks like it breaks inductor tests

We force-merged this for internal reasons. I'll reland once I've debugged the failure, and that should fix the test.
