@z1-cciauto
Collaborator

No description provided.

veera-sivarajan and others added 25 commits November 9, 2025 21:08
…tion's Signature (llvm#167248)

Since llvm#162441,
`buffer-results-to-out-params` transforms `private` functions only.

But, as mentioned in
llvm#162441 (comment),
this is a breaking change for pipelines handling C code. Our pipeline
@EfficientComputer is also affected by this breaking change.

Therefore, this PR adds an opt-in flag to allow `public` functions to be
transformed by `BufferResultsToOutParamsPass`.
)
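
As a rough analogy for why this changes a function's signature, the sketch below shows the same computation written with a returned buffer and with a caller-provided out-parameter. It is plain C++, not MLIR; the pass itself rewrites `memref` results, and the names `Buffer`, `makeRamp`, and `makeRampOut` are hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cstddef>

// Hypothetical illustration only; this is not MLIR code.
using Buffer = std::array<int, 4>;

// Before: the function returns its result buffer.
Buffer makeRamp(int start) {
  Buffer result{};
  for (std::size_t i = 0; i < result.size(); ++i)
    result[i] = start + static_cast<int>(i);
  return result;
}

// After: the result is written into a caller-provided out-parameter,
// so the signature and every call site change.
void makeRampOut(int start, Buffer &out) {
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = start + static_cast<int>(i);
}
```

Callers of a `public` function may live outside the module being transformed, which is presumably why rewriting them is opt-in rather than the default.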

Unfortunately this is more dynamic than anticipated.

Fixes llvm#165006
Fill out more information for sign and zero extend and add some truncate
information; however, the primary change is to int/fp conversions. In
particular, fp to (narrow) int appears to be relatively expensive.
…ble (llvm#165525)

When Polly generates a false runtime condition (RTC), the associated
Polly generated loop is never executed and is eventually eliminated. As
a result, the fallback loop becomes the default execution path.
Disabling vectorization for this fallback loop will be
counterproductive. This patch ensures that vectorization is only
disabled when the RTC is not false (no Codegen failure).
…tructor-throws' (llvm#164061)

Closes llvm#157299.

---------

Co-authored-by: Victor Chernyakin <chernyakin.victor.j@outlook.com>
…s generated (llvm#166910)

This patch doesn't change any behavior. It just adds more explicit checks
to verify what is generated when an alloca has a zero-sized
array.

I'd expect an `OpRuntimeArray`, but nothing is generated.
…cape' (llvm#164081)

Need these options to complete
llvm#160825, but I think it's
generally beneficial to fine-tune this check.

---------

Co-authored-by: EugeneZelenko <eugene.zelenko@gmail.com>
Co-authored-by: Victor Chernyakin <chernyakin.victor.j@outlook.com>
…m#167258)

The RISCV cost model cannot properly handle non-built-in vector types,
so fall back to BasicTTI in this situation.

Fixes: llvm#166732
…m#167214)

`__tuple_types` is at this point just a `__type_list` with a weird name,
so we can just replace the few places it's still used.
With `+SPV_KHR_float_controls2`, and when there is a non-int
`OpConstantNull`, we would call `MI.getOperand(1).getImm()` when `MI`
was not an `OpTypeInt` (the associated test has a zero-initialized
`OpTypeArray`).
Under these conditions an assertion is triggered.

This patch adds the missing condition.
…lvm#165863)

Extracts of unsigned i8 or i16 elements from the bottom 128 bits of a
scalable register lead to the implied zero-extend being transformed to
an AND mask. The mask is redundant since UMOV already zeroes the high
bits of the destination register.

For example:
```c
int foo(svuint8_t x) {
  return x[3];
}
```
Currently:
```gas
foo:
  umov    w8, v0.b[3]
  and     w0, w8, #0xff
  ret
```
Becomes:
```gas
foo:
  umov    w0, v0.b[3]
  ret
```
Specifically, this patch adds the following combines:
  SUB x, (CSET LO, (CMP a, b)) -> SBC x, 0, (CMP a, b)
  SUB (SUB x, y), (CSET LO, (CMP a, b)) -> SBC x, y, (CMP a, b)

The CSET may be preceded by a ZEXT.
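
The equivalence behind these combines can be checked with a small model of the flag semantics. The sketch below is a simplified illustration, not LLVM code: it assumes 32-bit operands, models only the carry flag, and all function names are hypothetical.

```cpp
#include <cstdint>

// CMP a, b sets the carry flag to 1 when a >= b (unsigned), i.e. no borrow.
constexpr bool carryAfterCmp(std::uint32_t a, std::uint32_t b) {
  return a >= b;
}

// CSET LO yields 1 when the carry flag is clear, i.e. when a < b (unsigned).
constexpr std::uint32_t csetLo(bool carry) { return carry ? 0u : 1u; }

// SBC computes x - y - (1 - C), with unsigned wraparound.
constexpr std::uint32_t sbc(std::uint32_t x, std::uint32_t y, bool carry) {
  return x - y - (carry ? 0u : 1u);
}

// SUB x, (CSET LO, (CMP a, b))
constexpr std::uint32_t subCset(std::uint32_t x, std::uint32_t a,
                                std::uint32_t b) {
  return x - csetLo(carryAfterCmp(a, b));
}

// SUB (SUB x, y), (CSET LO, (CMP a, b))
constexpr std::uint32_t subSubCset(std::uint32_t x, std::uint32_t y,
                                   std::uint32_t a, std::uint32_t b) {
  return (x - y) - csetLo(carryAfterCmp(a, b));
}

// Both patterns agree with a single SBC that consumes the CMP's carry.
static_assert(subCset(5, 1, 2) == sbc(5, 0, carryAfterCmp(1, 2)), "a < b");
static_assert(subCset(5, 2, 1) == sbc(5, 0, carryAfterCmp(2, 1)), "a > b");
static_assert(subSubCset(10, 3, 1, 2) == sbc(10, 3, carryAfterCmp(1, 2)),
              "two-operand form");
```

Since CSET LO is exactly the inverse of the carry flag, the subtraction of its result is the borrow that SBC already applies.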

Fixes llvm#164748.
Call getVectorTripCount first, and call getTripCount failing that, in
simplifyBranchConditionForVFAndUF, to simplify missed cases. While at
it, strip the dead check for a zero TC.
…lvm#166947)

This patch adds another run of DropUnnecessaryAssumes after
vectorization, to clean up assumes that are no longer needed after this
point.

The main example of such an assume is currently dereferenceable
assumptions. This complements
llvm#166945, which avoids sinking
code if it would mean removing a dereferenceable assumption.

There are a few additional cases where some unneeded assumes are left
over after vectorization that also get cleaned up.

The main motivation is to work together with
llvm#166945, but there may be a
better solution.

Adding another instance of this pass to the pipeline is not great, but
the compile-time impact seems to be in the noise:
https://llvm-compile-time-tracker.com/compare.php?from=55e71fe08b6406ec7ce2c81ce042e48717acf204&to=85da4ee3a74126f557cdc74c7b40e048dacb3fc4&stat=instructions:u

PR: llvm#166947
llvm#166756)

Section C3.2.2 (quoted below) in the ARMARM makes this a requirement of
assemblers for load/stores with unscaled offset. It makes no mention of
PRFM so I don't consider this to be a bug, although I can see why we
would want to extend this behaviour to the unscaled variants of these
instructions as well, as GCC does. This patch adds an alias for this.

C3.2.2 Load/store register (unscaled offset)

  The load/store register instructions with an unscaled offset support
  only one addressing mode:

      Base plus an unscaled 9-bit signed immediate offset.

  See Load/store addressing modes.

  The load/store register (unscaled offset) instructions are required to
  disambiguate this instruction class from the load/store register
  instruction forms that support an addressing mode of base plus a scaled,
  unsigned 12-bit immediate offset, because that can represent some offset
  values in the same range.

  The ambiguous immediate offsets are byte offsets that are both:

      In the range 0-255, inclusive.

      Naturally aligned to the access size.

  Other byte offsets in the range -256 to 255 inclusive are unambiguous.
  An assembler program translating a load/store instruction, for example
  LDR, is required to encode an unambiguous offset using the unscaled
  9-bit offset form, and to encode an ambiguous offset using the scaled
  12-bit offset form. A programmer might force the generation of the
  unscaled 9-bit form by using one of the mnemonics in Table C.3.21. Arm
  recommends that a disassembler outputs all unscaled 9-bit offset forms
  using one of these mnemonics, but unambiguous offsets can be output
  using a load/store single register mnemonic, for example, LDR.
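
The rule quoted above can be sketched as a small classifier. This is an illustration of the quoted text only, not code from LLVM or any assembler: the names `OffsetForm`, `isAmbiguousOffset`, and `chooseForm` are hypothetical, and `accessSize` is assumed to be a positive access size in bytes.

```cpp
#include <cstdint>

// Hypothetical names for illustration; not taken from any assembler.
enum class OffsetForm { Scaled12, Unscaled9, NotEncodable };

// An offset is ambiguous when both forms can represent it: it lies in
// the range 0-255 and is naturally aligned to the access size.
constexpr bool isAmbiguousOffset(std::int64_t offset,
                                 std::int64_t accessSize) {
  return offset >= 0 && offset <= 255 && offset % accessSize == 0;
}

// Picks the form required for a plain mnemonic such as LDR.
constexpr OffsetForm chooseForm(std::int64_t offset,
                                std::int64_t accessSize) {
  // Scaled form: non-negative multiple of the access size whose scaled
  // value fits in an unsigned 12-bit immediate.
  const bool fitsScaled = offset >= 0 && offset % accessSize == 0 &&
                          offset / accessSize <= 4095;
  // Unscaled form: signed 9-bit byte offset.
  const bool fitsUnscaled = offset >= -256 && offset <= 255;
  if (fitsScaled)
    return OffsetForm::Scaled12; // includes every ambiguous offset
  if (fitsUnscaled)
    return OffsetForm::Unscaled9;
  return OffsetForm::NotEncodable;
}
```

For an 8-byte access, an offset of 8 is ambiguous and takes the scaled form, while 4 and -8 are unambiguous and take the unscaled form.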

Fixes llvm#83226.
…61049)

Several components in libc++ aren't defending against overloaded
`operator,(T, Iter)` currently. Existing deleted overloads in
`test_iterators.h` are insufficient for such cases.

This PR adds corresponding deleted overloads with reversed order and
fixes these libc++ components.
- `piecewise_linear_distribution`'s iterator pair constructor,
- `piecewise_linear_distribution::param_type`'s iterator pair
constructor,
- `piecewise_constant_distribution`'s iterator pair constructor,
- `piecewise_constant_distribution::param_type`'s iterator pair
constructor,
- `money_get::do_get`,
- `money_put::do_put`, and
- `num_put::do_put`.
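
The kind of defense involved can be sketched as follows. This is a minimal illustration, not the libc++ code or the actual contents of `test_iterators.h`: `EvilPtr` and `copyRange` are hypothetical names, and the deleted overloads stand in for the ones the tests use, covering both operand orders.

```cpp
// Hypothetical iterator-like type: a comma expression with an EvilPtr on
// either side selects a deleted overload and fails to compile.
struct EvilPtr {
  const int *p;

  int operator*() const { return *p; }
  EvilPtr &operator++() {
    ++p;
    return *this;
  }
  friend bool operator!=(EvilPtr a, EvilPtr b) { return a.p != b.p; }

  // (Iter, T) order.
  template <class T> friend void operator,(const EvilPtr &, T &&) = delete;
  // (T, Iter) order, the reversed case described above.
  template <class T> friend void operator,(T &&, const EvilPtr &) = delete;
};

// Copies [first, last) to out. Casting one comma operand to void means no
// overloaded operator, can match, so the built-in comma is used.
// Writing `++first, ++out` here would not compile with EvilPtr.
inline int *copyRange(EvilPtr first, EvilPtr last, int *out) {
  for (; first != last; ++first, (void)++out)
    *out = *first;
  return out;
}
```

The cast works because no function can take a `void` argument, which leaves only the built-in comma operator as a candidate.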
@z1-cciauto z1-cciauto requested a review from a team November 10, 2025 12:06

@z1-cciauto z1-cciauto merged commit fb71c30 into amd-staging Nov 10, 2025
14 checks passed
@z1-cciauto z1-cciauto deleted the upstream_merge_202511100706 branch November 10, 2025 15:01