Commits on Dec 5, 2024

  1. 722a568
  2. ed9915f
  3. dd7a3d4
  4. Revert "[clang-format] Add cmake target clang-format-style-options for updating ClangFormatStyleOptions.rst (#111513)"
    
    Breaks the build when docs are not enabled.
    
    This reverts commit f7560ee.
    This reverts commit 6bec180.
    nikic committed Dec 5, 2024
    3740fac
  5. [InstCombine] Fold `icmp spred (X *nsw Z), (Y *nsw Z) -> icmp pred Z, 0` if `scmp(X, Y)` is known (#118726)
    
    ```
    icmp spred (X *nsw Z), (Y *nsw Z) -> icmp swap(spred) Z, 0 if X s< Y
    icmp spred (X *nsw Z), (Y *nsw Z) -> icmp spred       Z, 0 if X s> Y
    ```
    Alive2: https://alive2.llvm.org/ce/z/F2D0GE
    dtcxzyw authored Dec 5, 2024
    59720dc
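The fold in item 5 can be sanity-checked by brute force. The following is a minimal sketch, not the InstCombine implementation: it models `nsw` by skipping every case where a product leaves a small signed range (4 bits here, an arbitrary choice), and the names `PREDS`, `SWAP`, and `check_fold` are made up for illustration.

```python
import operator

BITS = 4  # assumption: a tiny bit width keeps the exhaustive search cheap
LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1

PREDS = {"slt": operator.lt, "sle": operator.le,
         "sgt": operator.gt, "sge": operator.ge}
SWAP = {"slt": "sgt", "sle": "sge", "sgt": "slt", "sge": "sle"}


def fits(value):
    """True if value is representable as a signed BITS-bit integer."""
    return LO <= value <= HI


def check_fold():
    """Exhaustively compare both sides of the fold; return (ok, cases checked)."""
    checked = 0
    for x in range(LO, HI + 1):
        for y in range(LO, HI + 1):
            if x == y:
                continue  # the fold needs X s< Y or X s> Y to be known
            for z in range(LO, HI + 1):
                if not (fits(x * z) and fits(y * z)):
                    continue  # nsw would be violated, so the input is poison
                for name, pred in PREDS.items():
                    # X s< Y uses the swapped predicate, X s> Y keeps it.
                    folded = SWAP[name] if x < y else name
                    if pred(x * z, y * z) != PREDS[folded](z, 0):
                        return False, checked
                    checked += 1
    return True, checked
```

Calling `check_fold()` returns a pair whose first element reports whether every non-overflowing case agreed with the folded comparison against zero.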
  6. [LLD][COFF] Add basic ARM64X dynamic relocations support (#118035)

    This modifies the machine field in the hybrid view to be AMD64, aligning
    it with expectations from ARM64EC modules. While this provides initial
    support, additional relocations will be necessary for full
    functionality. Many of these cases depend on implementing separate
    namespace support first.
    
    Move clearing of the .reloc section from addBaserels to assignAddresses
    to ensure it is always cleared, regardless of the relocatable
    configuration. This change also clarifies the reasoning for adding the
    dynamic relocations chunk in that location.
    cjacek authored Dec 5, 2024
    71bbafb
  7. b6217f6
  8. [Sched] Skip MemOp with unknown size when clustering (#118443)

    In #83875, we changed the type of `Width` to `LocationSize`. To get
    the cluster bytes, we use `LocationSize::getValue()` to calculate
    the value.
    
    But when `Width` is an unknown size `LocationSize`, an assertion
    "Getting value from an unknown LocationSize!" will be triggered.
    
    This patch simply skips MemOp with unknown size to fix this issue
    and keep the logic the same as before.
    
    This issue was found when implementing software pipeliner for
    RISC-V in #117546. The pipeliner may clone some memory operations
    with `BeforeOrAfterPointer` size.
    wangpc-pp authored Dec 5, 2024
    db9057e
  9. [NFC] Fix uninitialized scalar field in constructor. (#118324)

    Non-static class field is not initialized in constructor.
    zahiraam authored Dec 5, 2024
    4443314
  10. [flang][test] Recognize !$acc and !$omp spelled with capital letters (#118666)
    
    If there are any continuation lines in the source, they will be printed
    by the unparser with capital letters (at least in case of OpenMP). To
    avoid having them stripped out, recognize their spellings using capital
    letters as well.
    
    ---------
    
    Co-authored-by: Michael Kruse <github@meinersbur.de>
    kparzysz and Meinersbur authored Dec 5, 2024
    da6099c
  11. [Matrix] Fix crash in liftTranspose when instructions are folded.

    Builder.Create(F)Add may constant fold the inputs, returning a constant
    instead of an instruction. Account for that instead of crashing.
    fhahn committed Dec 5, 2024
    ffb1c21
  12. [flang] fix private pointers and default initialized variables (#118494)

    Both OpenMP privatization and DO CONCURRENT LOCAL lowering were incorrect
    for pointers and derived types with default initialization.
    
    For pointers, the descriptor was not established with the rank/type
    code/element size, leading to undefined behavior if any inquiry was made
    to it prior to a pointer assignment (and if/when using the runtime for
    pointer assignments, the descriptor must have been established).
    
    For derived types with default initialization, the copies were not
    default initialized.
    jeanPerier authored Dec 5, 2024
    ff78cd5
  13. [SystemZ] SIMM32 is a signed constant (#118634)

    A follow-up to PR #117181: SIMM32 must use getSignedTargetConstant(),
    too.
    redstar authored Dec 5, 2024
    f85be32
  14. [InstCombine] Infer nusw + nneg -> nuw for getelementptr (#111144)

    If the gep is nusw (usually via inbounds) and the offset is
    non-negative, we can infer nuw.
    
    Proof: https://alive2.llvm.org/ce/z/ihztLy
    nikic authored Dec 5, 2024
    462cb3c
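The inference in item 14 can be illustrated with a deliberately simplified model. This sketch assumes an 8-bit address space and a single byte offset, treats `nusw` as "unsigned base plus signed offset stays inside the address space" and `nuw` as "unsigned base plus unsigned offset does not wrap"; it ignores the index scaling that the real GEP flags also cover. All function names are hypothetical.

```python
BITS = 8  # assumption: tiny address space so every case can be enumerated
MASK = (1 << BITS) - 1


def to_signed(bits):
    """Interpret a BITS-wide bit pattern as a two's-complement integer."""
    bits &= MASK
    return bits - (1 << BITS) if bits >> (BITS - 1) else bits


def nusw_ok(base, off_bits):
    """Unsigned base plus *signed* offset stays within [0, 2**BITS)."""
    return 0 <= base + to_signed(off_bits) <= MASK


def nuw_ok(base, off_bits):
    """Unsigned base plus *unsigned* offset does not wrap."""
    return base + (off_bits & MASK) <= MASK


def nusw_nneg_implies_nuw():
    """Check every base/offset pair: nusw and a non-negative offset give nuw."""
    for base in range(MASK + 1):
        for off in range(MASK + 1):
            if to_signed(off) >= 0 and nusw_ok(base, off) and not nuw_ok(base, off):
                return False
    return True
```

The non-negativity condition matters in this model: with `base = 10` and an offset of `0xFF` (that is, -1), `nusw_ok` holds while `nuw_ok` does not.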
  15. [Support] Use macro var args to allow templates within DEBUG_WITH_TYPE (#117614)
    
    Use variadic args with DEBUG_WITH_TYPE("name", ...) macros to resolve a
    compilation failure that occurs when using a comma within the last macro
    argument. Commas come up when instantiating templates such as
    SmallMapVector that require multiple template args.
    TylerNowicki authored Dec 5, 2024
    8e66344
  16. [MLIR][EmitC] arith-to-emitc: Fix lowering of fptoui (#118504)

    `arith.fptoui %arg0 : f32 to i16` was lowered to
    ```
    %0 = emitc.cast %arg0 : f32 to ui32
    emitc.cast %0 : ui32 to i16
    ```
    and is now lowered to
    ```
    %0 = emitc.cast %arg0 : f32 to ui16
    emitc.cast %0 : ui16 to i16
    ```
    mgehre-amd authored Dec 5, 2024
    1f93282
  17. [SCCP] Regenerate test checks (NFC)

    The checks generated by an old UTC version fail on this test due to
    missing signature matching, so regenerate it with a newer one.
    nikic committed Dec 5, 2024
    bb03a18
  18. [AMDGPU] Refine AMDGPULateCodeGenPrepare class. NFC. (#118792)

    Use references instead of pointers for most state and initialize it all
    in the constructor, and similarly for the LiveRegOptimizer class.
    jayfoad authored Dec 5, 2024
    f9f7c42
  19. [PowerPC][Backend] using signed extend value instead of zero extend value for isIntS34Immediate() (#118703)
    
    This patch fixes issue #118695.
    diggerlin authored Dec 5, 2024
    6b5c67b
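Why the choice of extension matters for a signed-immediate range check, as in item 19, can be shown with plain integers. This is a generic sketch and not the PowerPC backend code; `sext`, `zext`, and `is_int_n` are illustrative helpers, and the 64-bit constant is an assumed example.

```python
def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def zext(value, bits):
    """Zero-extend the low `bits` bits of value to a Python int."""
    return value & ((1 << bits) - 1)


def is_int_n(n, x):
    """True if x fits in an n-bit signed integer."""
    return -(1 << (n - 1)) <= x < (1 << (n - 1))


raw = 0xFFFF_FFFF_FFFF_FFFF  # the 64-bit pattern for -1

fits_when_zero_extended = is_int_n(34, zext(raw, 64))  # False: read as 2**64 - 1
fits_when_sign_extended = is_int_n(34, sext(raw, 64))  # True: read as -1
```

Read as an unsigned quantity, the pattern is far outside the signed 34-bit range, while the sign-extended reading of the same bits is simply -1 and fits.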
  20. [ASTWriter] Do not allocate source location space for module maps used only for textual headers (#116374)
    
    This is a follow up to #112015 and it reduces the unnecessary
    duplication of source locations further.
    
    We do not need to allocate source location space in the serialized PCMs
    for module maps used only to find textual headers. Those module maps are
    never referenced from anywhere in the serialized ASTs and are re-read in
    other compilations.
    This change should not affect correctness of Clang compilations or
    clang-scan-deps in any way.
    
    We do need the InputFile entry in the serialized AST because
    clang-scan-deps relies on it. The previous patch introduced a mechanism
    to do exactly that.
    
    We have found this to finally remove any duplication of module maps we
    use internally in our build system.
    ilya-biryukov authored Dec 5, 2024
    f1d81db
  21. d25d040
  22. [InstCombine] Move gep of phi fold into separate function

    This makes sure that an early return during this fold doesn't end
    up skipping later gep folds.
    nikic committed Dec 5, 2024
    f7685af
  23. [LoopVectorize] Restore cost check lines in test (NFC)

    Accidentally dropped these while updating the test.
    nikic committed Dec 5, 2024
    707e089
  24. [OpenACC] Implement 'gang' clause for Combined Constructs

    This one is a bit complicated, as it has some interesting interactions,
    as 'gang' Sema is required to look at its containing compute construct.
    Except in the case of a combined construct, they are the same. This
    resulted in a large refactor of the checking code for CheckGangExpr,
    plus some additional work on the diagnostics for its interaction with
    'num_gangs' and 'vector'/'worker'.
    erichkeane committed Dec 5, 2024
    3a4b9f3
  25. Skip escaped newlines before checking for whitespace in Lexer::getRawToken. (#117548)
    
    The Lexer used in getRawToken is not told to keep whitespace, so when it
    skips over escaped newlines, it also ignores whitespace, regardless of
    getRawToken's IgnoreWhiteSpace parameter. 
    
    Instead of letting this case fall through to lexing, check
    for whitespace after skipping over any escaped newlines.
    bazuzi authored Dec 5, 2024
    f7e8be7
  26. [RISCV] Update matchSplatAsGather to convert vectors if they have different sizes (#117878)
    
    This patch updates the matchSplatAsGather function so we can handle vectors of different sizes. The goal is to improve the code gen for @llvm.experimental.vector.match on RISCV.
    
    Currently, we use a scalar extract and splat instead of vrgather, and the patch changes that.
    mikhailramalho authored Dec 5, 2024
    59a9e4d
  27. [NFC][SystemZ] Use SExt for signed constants (#118803)

    Use SExt instead of ZExt in XForms which produce a signed value. This is
    only to make it clear that the XForm handles a signed value.
    redstar authored Dec 5, 2024
    3bd3fa6
  28. [InstCombine] Remove nusw handling in ptrtoint of gep fold (NFCI) (#118804)
    
    Now that #111144 infers gep nuw, we no longer have to repeat the
    inference in this fold.
    nikic authored Dec 5, 2024
    d09632b
  29. 2bd3174
  30. [NFC] Complete proper copying and resource cleanup in classes. (#118655)

    Provide, where missing, a copy constructor, a copy assignment operator
    or a destructor to prevent potential issues that can arise.
    zahiraam authored Dec 5, 2024
    f59b600
  31. [Clang] Fix -Wunused-private-field false negative with defaulted comparison operators (#116871)
    
    Fix -Wunused-private-field incorrectly suppressing warnings for friend
    defaulted comparison operators. The warning should only be suppressed
    when the defaulted comparison is a class member function.
    
    Fixes #116270
    whiteio authored Dec 5, 2024
    d457100
  32. 97fd435
  33. [RISCV][NFC] Don't set UnrollAndJamInnerLoopThreshold in getUnrollingPreferences (#118572)
    
    This has no effect since it is the default value used in
    llvm::gatherUnrollingPreferences.
    michaelmaitland authored Dec 5, 2024
    34a076c
  34. [flang][test] Change re.I to flags=re.I in re.sub

    Follow-up to da6099c. As a positional argument, the `re.I` was in
    place of `count`, not `flags`.
    kparzysz committed Dec 5, 2024
    8a90b5b
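The pitfall fixed in item 34 is easy to reproduce in isolation. The signature is `re.sub(pattern, repl, string, count=0, flags=0)`, so a positional `re.I` is taken as `count` (its integer value is 2) and the match stays case-sensitive. The pattern and input text below are invented for illustration and are not taken from the flang test script.

```python
import re
import warnings

text = "!$OMP parallel\n!$omp end parallel\n"
pattern = r"!\$omp"

with warnings.catch_warnings():
    # Newer Python versions warn about positional count/flags; silence that here.
    warnings.simplefilter("ignore", DeprecationWarning)
    # Bug: re.I lands in `count`, so this is a case-sensitive sub with count=2.
    buggy = re.sub(pattern, "<omp>", text, re.I)

# Fix: pass the flag by keyword so the match really ignores case.
fixed = re.sub(pattern, "<omp>", text, flags=re.I)
```

With the positional form only the lowercase `!$omp` is rewritten; with `flags=re.I` both spellings are.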
  35. [InstCombine] Prevent infinite loop with two shifts (#118806)

    The following pattern: `(C2 << X) << C1` will usually be transformed
    into `(C2 << C1) << X`, essentially swapping `X` and `C1`.
    
    However, this should only be done when `C1` is an immediate constant,
    otherwise this can lead to both constants being swapped forever.
    
    This fixes #118798.
    momo5502 authored Dec 5, 2024
    27eaa8a
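The non-termination described in item 35 can be mimicked with a toy rewriter. This is only a sketch of the idea and not InstCombine code: `Imm`, `ConstExpr`, `Shl`, `swap_shifts`, and `count_rewrites` are hypothetical, and `ConstExpr` stands in for any constant that is not an immediate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Imm:
    """An immediate constant such as 3."""
    value: int


@dataclass(frozen=True)
class ConstExpr:
    """Hypothetical stand-in for a constant that is not an immediate."""
    name: str


@dataclass(frozen=True)
class Shl:
    lhs: object
    rhs: object


def is_constant(v):
    return isinstance(v, (Imm, ConstExpr))


def is_immediate(v):
    return isinstance(v, Imm)


def swap_shifts(expr, c1_ok):
    """Rewrite (C2 << X) << C1 into (C2 << C1) << X when c1_ok(C1) holds."""
    if isinstance(expr, Shl) and isinstance(expr.lhs, Shl):
        c2, x, c1 = expr.lhs.lhs, expr.lhs.rhs, expr.rhs
        if is_constant(c2) and c1_ok(c1):
            return Shl(Shl(c2, c1), x)
    return None


def count_rewrites(expr, c1_ok, limit=10):
    """Apply the rewrite until it stops matching or the step limit is hit."""
    steps = 0
    while steps < limit:
        rewritten = swap_shifts(expr, c1_ok)
        if rewritten is None:
            break
        expr, steps = rewritten, steps + 1
    return steps


both_constant = Shl(Shl(Imm(1), ConstExpr("@g")), ConstExpr("@h"))
with_variable = Shl(Shl(Imm(1), "x"), Imm(3))
```

In this model, `count_rewrites(both_constant, is_constant)` runs into the step limit because each swap re-enables the rule, whereas restricting `C1` to immediates with `is_immediate` leaves that expression alone and still rewrites `with_variable` exactly once.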
  36. [AMDGPU][True16][CodeGen] uaddsat/usubsat sdag for true16 format (#118708)
    
    uaddsat and usubsat SDAG codeGen pattern for True16 format with
    V_ADD/SUB_NC_U16
    broxigarchen authored Dec 5, 2024
    e7412a5
  37. [libc][docgen] update to POSIX.1-2024 (#118717)

    The recently ratified POSIX.1-2024 is newer than POSIX.1-2017.
    nickdesaulniers authored Dec 5, 2024
    fdb90ce
  38. [ProfileData] Add InstrProfWriter::writeBinaryIds (NFC) (#118754)

    The patch makes InstrProfWriter::writeImpl less monolithic by adding
    InstrProfWriter::writeBinaryIds to serialize binary IDs.  This way,
    InstrProfWriter::writeImpl can simply call the new function instead of
    handling all the details within writeImpl.
    kazutakahirata authored Dec 5, 2024
    bda0209
  39. [RISCV] Clear vill for whole vector register moves in vsetvli insertion (#118283)
    
    This is an alternative to #117866 that works by demanding a valid vtype
    instead of using a separate pass.
    
    The main advantage of this is that it allows coalesceVSETVLIs to just
    reuse an existing vsetvli later in the block.
    
    To do this we need to first transfer the vsetvli info to some arbitrary
    valid state in transferBefore when we encounter a vector copy. Then we
    add a new vill demanded field that will happily accept any other known
    vtype, which allows us to coalesce these where possible.
    
    Note we also need to check for vector copies in computeVLVTYPEChanges,
    otherwise the pass will completely skip over functions that only have
    vector copies and nothing else.
    
    This is one part of a fix for #114518. We still need to check if there
    are other cases where vector copies/whole register moves are inserted
    after vsetvli insertion.
    lukel97 authored Dec 5, 2024
    b6c0f1b
  40. [flang][cuda] Use async id for device stream allocation (#118733)

    When a stream is specified, use cudaMallocAsync with the specified stream.
    clementval authored Dec 5, 2024
    83ccaad
  41. [libc] revert all process_mrelease changes (#118650)

    Revert as its test is unstable.
    #118057
    SchrodingerZhu authored Dec 5, 2024
    245f26a
  42. [lldb] Fix the SocketTest failure on unsupported hosts (#118673)

    The test `SocketTest::TCPListen0MultiListenerGetListeningConnectionURI`
    is failing on hosts that do not map `localhost` to both an ipv4 and ipv6
    address. For example this build
    https://lab.llvm.org/buildbot/#/builders/195/builds/1909.
    
    To fix this, I added a helper to validate if the host has an /etc/hosts
    entry for both ipv4 and ipv6, otherwise we skip the test.
    ashgti authored Dec 5, 2024
    0964328
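The precondition from item 42, that `localhost` maps to both an IPv4 and an IPv6 address, can be probed from a script. This sketch uses the resolver via `socket.getaddrinfo` rather than reading /etc/hosts as the commit's helper is described to do, so it is an approximation; the function names are made up for illustration.

```python
import socket


def resolved_families(host="localhost"):
    """Return the set of address families the resolver reports for host."""
    try:
        infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return set()  # the name does not resolve at all
    return {info[0] for info in infos}


def host_supports_ipv4_and_ipv6(host="localhost"):
    """True only if host resolves to both an IPv4 and an IPv6 address."""
    families = resolved_families(host)
    return socket.AF_INET in families and socket.AF_INET6 in families
```

A test harness could call `host_supports_ipv4_and_ipv6()` up front and skip the dual-stack test when it returns `False`.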
  43. b8c4fb0
  44. [flang] Assume matching shapes in elemental assignment with non-realloc lhs. (#118552)
    
    The optimized bufferization pass cannot optimize very simple cases of
    elemental assignments, because of the suboptimal order of checks. This
    patch relies
    on the fact that in a legal program the lhs and rhs of an assignment
    have matching shapes, when lhs is not an allocatable and rhs is a result
    of an elemental array operation.
    vzakhari authored Dec 5, 2024
    3f0cc06
  45. [flang] Expand SUM(DIM=CONSTANT) into an hlfir.elemental. (#118556)

    An array SUM with the specified constant DIM argument
    may be expanded into hlfir.elemental with a reduction loop
    inside it processing all elements of the specified dimension.
    The expansion allows further optimization of the cases like
    `A=SUM(B+1,DIM=1)` in the optimized bufferization pass
    (given that it can prove there are no read/write conflicts).
    vzakhari authored Dec 5, 2024
    cc46d0b
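The shape of the expansion in item 45 can be sketched outside the compiler. The code below models `A=SUM(B+1,DIM=1)` on a rank-2 array stored as nested lists, where `b[i][j]` plays the role of Fortran's `B(i,j)`. It only illustrates the loop structure (one elemental iteration per result element with a reduction loop inside) and is not HLFIR; both function names are hypothetical.

```python
def sum_dim1_unfused(b):
    """Materialize the temporary B+1 first, then reduce over dimension 1."""
    tmp = [[v + 1 for v in row] for row in b]
    n_rows, n_cols = len(tmp), len(tmp[0])
    return [sum(tmp[i][j] for i in range(n_rows)) for j in range(n_cols)]


def sum_dim1_elemental(b):
    """Elemental form: no temporary array, the reduction sits in the inner loop."""
    n_rows, n_cols = len(b), len(b[0])
    result = []
    for j in range(n_cols):        # one iteration per element of the result
        acc = 0
        for i in range(n_rows):    # reduction loop over the DIM=1 extent
            acc += b[i][j] + 1     # B+1 is evaluated element by element
        result.append(acc)
    return result


b = [[1, 2, 3], [4, 5, 6]]
```

Both functions give `[7, 9, 11]` for this `b`; the second one never builds the `B+1` temporary, which is the kind of fusion the commit message is aiming for.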
  46. [mlir] Add ValueBoundsOpInterfaceImpl for scf.forall (#118817)

    Adds a ValueBoundsOpInterface implementation for scf.forall ops. The
    implementation supports bounding for induction variables, results,
    and block args of the forall op. Induction variables are given upper and
    lower bounds based on the lower and upper loop bounds, and dimensions of
    the results and init block arguments are constrained to be equal to the
    matching dims of the shared_outs operand.
    
    Signed-off-by: Max Dawkins <maxdawkins19@gmail.com>
    Co-authored-by: Max Dawkins <maxdawkins19@gmail.com>
    Max191 and Max Dawkins authored Dec 5, 2024
    3da843b
  47. [AArch64][SME] Fix bug on SMELd1St1 (#118109)

    Patch [1] updated the intrinsic interface for ld1/st1. According to
    ARM's documentation, "If the intrinsic also has a vnum argument, the ZA
    slice number is calculated by adding vnum to slice." However, "vnum" did
    not work in our current implementation; this patch fixes that.
    
    
    [1]ee31ba0
    wwwatermiao authored Dec 5, 2024
    409edc6
  48. [RISCV][GISel] Enable support for ArrayType arguments if the element type is also supported.
    
    This allows us to handle small coerced structs that are passed as
    [2 x i64]. This is one of the last big reasons for -O0 fallbacks
    in some of my testing.
    topperc committed Dec 5, 2024
    41c33cb
  49. 2469984
  50. 8ab2730