AlekseiNikiforovIBM
Collaborator

@AlekseiNikiforovIBM AlekseiNikiforovIBM commented Sep 23, 2025

Skip test_compiled_autograd_attribution on s390x

It fails on both s390x and x86_64, at least under some circumstances. Disable it on s390x for now until it works reliably.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @Lucaskabela


pytorch-bot bot commented Sep 23, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/163647

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 28caf37 with merge base 1a42656 (image):

BROKEN TRUNK - The following job failed but was also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@AlekseiNikiforovIBM AlekseiNikiforovIBM force-pushed the s390x_updates_for_testsuite branch from 9a1f383 to d211950 Compare September 24, 2025 11:12
When a bool value is assigned through as_int on big-endian systems,
the value 1 updates the last byte of the int value.
as_bool reads the first byte of the same memory location,
and that byte is not updated. If the bool value was 'true', then after such a conversion
the 'toBool' function would actually return 'false'.

This change fixes multiple tests in test_nestedtensor.py such as
test_as_nested_tensor_from_tensor_dim_2_layout_jagged_requires_grad_False_contiguous_False_cpu_float16
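
The byte-order effect described above can be reproduced on any host by laying out the bytes explicitly. The following is a minimal sketch, not the IValue code: the helper names (`store_be64`, `store_le64`, `bool_after_int_store_big_endian`, `bool_after_int_store_little_endian`) are hypothetical, and it assumes a one-byte bool that aliases the first byte of an 8-byte integer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: write v into out[] in big-endian order
// (most significant byte first), regardless of the host's byte order.
inline void store_be64(std::int64_t v, unsigned char out[8]) {
  const auto u = static_cast<std::uint64_t>(v);
  for (int i = 0; i < 8; ++i) {
    out[i] = static_cast<unsigned char>(u >> (8 * (7 - i)));
  }
}

// Hypothetical helper: write v into out[] in little-endian order
// (least significant byte first).
inline void store_le64(std::int64_t v, unsigned char out[8]) {
  const auto u = static_cast<std::uint64_t>(v);
  for (int i = 0; i < 8; ++i) {
    out[i] = static_cast<unsigned char>(u >> (8 * i));
  }
}

// Assumption: a one-byte bool sharing storage with the integer
// occupies the first byte, so reading it only inspects bytes[0].
inline bool read_first_byte_as_bool(const unsigned char bytes[8]) {
  return bytes[0] != 0;
}

// Store b as an integer (0 or 1) in big-endian layout, then read it back
// the way a one-byte bool at the same address would be read.
inline bool bool_after_int_store_big_endian(bool b) {
  unsigned char bytes[8];
  store_be64(b ? 1 : 0, bytes);
  return read_first_byte_as_bool(bytes);
}

// Same round trip with little-endian layout.
inline bool bool_after_int_store_little_endian(bool b) {
  unsigned char bytes[8];
  store_le64(b ? 1 : 0, bytes);
  return read_first_byte_as_bool(bytes);
}
```

With the big-endian layout the value 1 ends up in the last byte, so the first byte stays zero and `true` reads back as `false`. With the little-endian layout the value 1 is in the first byte and the round trip preserves it.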
@AlekseiNikiforovIBM AlekseiNikiforovIBM marked this pull request as ready for review October 2, 2025 09:27
payload.u.as_int = *mi;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
/* due to byteorder if value assigned as_int, as_bool actually is not set correctly */
payload.u.as_bool = *mi;
Collaborator Author

@AlekseiNikiforovIBM AlekseiNikiforovIBM Oct 2, 2025

I didn't simply set as_bool in every case because of this existing code:

#if defined(__clang__) && defined(__x86_64__)
// Initializing entire payload stops valgrind's from reporting
// "jump or move depends on uninitialised value" in IValue copy constructor
// See https://github.com/pytorch/pytorch/issues/37117
payload.u.as_int = b;
#else
payload.u.as_bool = b;
#endif

I can remove these checks and just set the as_bool value in every case if that would be the preferred way to fix this code.
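
For illustration, the alternative of writing as_bool unconditionally could look like the sketch below. It is not the code from this PR: `DemoPayload`, `make_bool_payload`, and `to_bool` are hypothetical stand-ins, and the sketch deliberately omits the valgrind workaround quoted above, so the bytes beyond the bool are left uninitialized.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a payload union holding either an int or a bool;
// not the actual IValue definition.
union DemoPayload {
  std::int64_t as_int;
  bool as_bool;
};

// Write through the same member that will later be read.
// This does not depend on byte order, because no integer store is
// reinterpreted as a bool.
inline DemoPayload make_bool_payload(bool b) {
  DemoPayload p;
  p.as_bool = b;
  return p;
}

// Read the bool back through the member that was written.
inline bool to_bool(const DemoPayload& p) {
  return p.as_bool;
}
```

The trade-off is that only the bool's storage is written, which is the situation the clang/x86_64 branch was added to avoid under valgrind.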

@mikaylagawarecki mikaylagawarecki added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Oct 2, 2025
Contributor

@malfet malfet left a comment

LGTM

Contributor

malfet commented Oct 14, 2025

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 14, 2025
@malfet malfet changed the title from "Update testsuite for s390x" to "Fix IValue from SymBool on big-endian system" Oct 14, 2025
Contributor

malfet commented Oct 14, 2025

@pytorchbot merge -f "This looks fine"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

zhudada0120 pushed a commit to zhudada0120/pytorch that referenced this pull request Oct 15, 2025
Skip test_compiled_autograd_attribution on s390x

It fails on both s390x and x86_64, at least under some circumstances. Disable it on s390x for now until it works reliably.

Pull Request resolved: pytorch#163647
Approved by: https://github.com/malfet
Labels

ciflow/inductor ciflow/s390 s390x-related CI jobs ciflow/trunk Trigger trunk jobs on your pull request Merged module: dynamo open source topic: not user facing topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
