[lldb] Add additional assertions to TestVTableValue.test_overwrite_vt…
…able (#118719) If this test fails, you're likely going to see something like "Assertion Error: A != B" which doesn't really give much explanation for why this failed. Instead of ignoring the error, we should assert that it succeeded. This will lead to a better error message, for example: `AssertionError: 'memory write failed for 0x102d7c018' is not success`
[OpenACC] Implement 'worker' clause for combined constructs
This is very similar to 'gang', but with fewer restrictions: it only interacts with 'num_workers', and it disallows 'gang' and 'worker' in its associated statement. This patch implements it the same way 'gang' was implemented.
[Webkit Checkers] Introduce a Webkit checker for memory unsafe casts (#…
…114606) This PR introduces a new checker `[alpha.webkit.MemoryUnsafeCastChecker]` that warns on all downcasts from a base type to a derived type. rdar://137766829
[libc][windows] start time API implementation (#117775)
Add a `clock_gettime` emulation layer and use it to implement the `time` entrypoint. On Windows, the monotonic clock is emulated using `QPC`, and the realtime clock is emulated using `GetSystemTimePreciseAsFileTime`.
[C++20] Destroying delete can cause a type to be noexcept when deleti…
…ng (#118687) Given a `noexcept` operator with an operand that calls `delete`, Clang was not considering whether the selected `operator delete` function was a destroying delete or not when inspecting whether the deleted object type has a throwing destructor. Thus, the operator would return `false` for a type with a potentially throwing destructor even though that destructor would not be called due to the destroying delete. Clang now takes the kind of delete operator into consideration. Fixes #118660
[RISCV] Extract spread(2,4,8) shuffle lowering from interleave(2) (#1…
…18822) This is a prep patch for improving spread(4,8) shuffles. I also think it improves the readability of the existing code, but the primary motivation is simply staging work.
Disable test broken by #117624 (#118858)
The test fails after #117624 https://lab.llvm.org/buildbot/#/builders/186/builds/4581
Revert "[libc][windows] start time API implementation (#117775)"
This reverts commit 0adff0a because it breaks the GPU build.
[libc++][test] Refactor increasing_allocator (#115671)
The increasing_allocator<T> class, originally introduced to test shrink_to_fit() for std::vector, std::vector<bool>, and std::basic_string, has duplicated definitions across several test files. Given the potential utility of this class for capacity-related tests in various sequence containers, this patch refactors the definition of increasing_allocator<T> into a single, reusable location.
[RISCV][GISel] Add Zfa FP legalization and full tests for 9 insn (#11…
…8723) ceil, floor, round, roundeven, trunc, rint, nearbyint, maximum, minimum.
[libc++][docs] Use --show-all in the sample command-line to run bench…
…marks It's really not useful at all to run benchmarks without --show-all since you don't get the benchmark output. And since --show-all is the suggested default way to run benchmarks, it's not necessary anymore to mention it right below.
[libc++] Fix unintended ABI break in associative containers with refe…
[SelectionDAG] Stop storing EVTs in a function scoped static std::set. (
#118715) EVTs potentially contain a Type * that points into memory owned by an LLVMContext. Storing them in a function scoped static means they may outlive the LLVMContext they point to. This std::set is used to unique single element VT lists containing a single extended EVT. Single element VT list with a simple EVT are uniqued by a separate cache indexed by the MVT::SimpleValueType enum. VT lists with more than one element are uniqued by a FoldingSet owned by the SelectionDAG object. This patch moves the single element cache into SelectionDAG so that it will be destroyed when SelectionDAG is destroyed. Fixes #88233
DataLayout: Fix latent issues with getMaxIndexSizeInBits (#118740)
Because it was implemented in terms of getMaxIndexSize, it was always rounding the values up to a multiple of 8. Additionally, it was using the PointerSpec's BitWidth rather than its IndexBitWidth, which was self-evidently incorrect. Since getMaxIndexSize was only used by getMaxIndexSizeInBits, and its name and function seem niche and somewhat confusing, go ahead and remove it until a concrete need for it arises.
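The two issues described above can be illustrated with a small sketch. This is not the LLVM code: `PointerSpecSketch`, `maxIndexSizeInBitsOld`, and `maxIndexSizeInBitsNew` are hypothetical names, and the "old" function only models the behavior as the commit message describes it (byte rounding, and reading the pointer width instead of the index width).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a pointer spec; only the two widths matter here.
struct PointerSpecSketch {
  unsigned BitWidth;      // pointer width in bits
  unsigned IndexBitWidth; // index width in bits
};

// Models the old behavior: take the maximum over pointer widths measured in
// whole bytes, then convert back to bits. The result is always a multiple
// of 8 and ignores IndexBitWidth entirely.
unsigned maxIndexSizeInBitsOld(const std::vector<PointerSpecSketch> &Specs) {
  unsigned MaxBytes = 0;
  for (const PointerSpecSketch &S : Specs)
    MaxBytes = std::max(MaxBytes, (S.BitWidth + 7) / 8);
  return MaxBytes * 8;
}

// Models the fixed behavior: take the maximum over index widths, in bits,
// with no rounding.
unsigned maxIndexSizeInBitsNew(const std::vector<PointerSpecSketch> &Specs) {
  unsigned MaxBits = 0;
  for (const PointerSpecSketch &S : Specs)
    MaxBits = std::max(MaxBits, S.IndexBitWidth);
  return MaxBits;
}
```

With a spec whose pointer width is 61 bits and whose index width is 29 bits, the old model reports 64 while the new one reports 29.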
[libc] Include CheckCXXCompilerFlag when checking compiler features (#…
…118862) Includes `CheckCXXCompilerFlag` so that a standalone build of LLVM libc actually works and doesn't complain about `check_cxx_compiler_flag` not being defined.
Reapply [memprof] Update YAML traits for writer purposes (#118720)
For Frames, we prefer the inline notation for the brevity. For PortableMemInfoBlock, we go through all member fields and print out those that are populated. This iteration works around the unavailability of ScalarTraits<uintptr_t> on macOS.
SimplifyLibCalls: Use default globals address space when building new…
… global strings. (#118729) Writing a test for this transitively exposed a number of places in BuildLibCalls where we were failing to propagate address spaces properly, which are additionally fixed.
[libc][docs] reorganize documentation (#118836)
This commit does a few things:
* creates libc/docs/headers/ and moves all user API related headers under it.
* updates paths and docgen.
* updates the top level index to put these headers under a new "Implementation Status" tab.
* renames some of the files to be foo.rst for foo.h (except strings, which is currently a mix of string.h and stdlib.h).
* updates the heading of some files to be in the form foo.h.
[libc][docs] stub out assert, errno, and locale (#118852)
These were the remaining c89 library headers (besides string.h and stdlib.h; I will split strings.rst in a follow up commit). The macro support detection in docgen doesn't quite work for some of these headers. Add the stubs for these headers for now, and fix up docgen later. See the "NIST publication": Link: https://www.open-std.org/jtc1/sc22/wg14/www/projects.html
[PPC] Custom lower ssubo for i64 (#118711)
This is a follow-up patch to improve the codegen for ssubo node for i64 in 64-bit mode by custom lowering.
[JITLink] Switch to SymbolStringPtr for Symbol names (#115796)
Use SymbolStringPtr for Symbol names in LinkGraph. This reduces string interning on the boundary between JITLink and ORC, and allows pointer comparisons (rather than string comparisons) between Symbol names. This should improve the performance and readability of code that bridges between JITLink and ORC (e.g. ObjectLinkingLayer and ObjectLinkingLayer::Plugins). To enable use of SymbolStringPtr a std::shared_ptr<SymbolStringPool> is added to LinkGraph and threaded through to its construction sites in LLVM and Bolt. All LinkGraphs that are to have symbol names compared by pointer equality must point to the same SymbolStringPool instance, which in ORC sessions should be the pool attached to the ExecutionSession. --------- Co-authored-by: Lang Hames <lhames@gmail.com>
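The pointer-comparison property described above can be sketched with a toy interning pool. `StringPoolSketch` is a hypothetical stand-in, not the ORC `SymbolStringPool` API; it only shows why names interned in the same pool can be compared by address, and why names from different pools cannot.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical toy pool: each distinct string is stored once, and callers
// receive a stable pointer to the stored copy.
class StringPoolSketch {
public:
  const std::string *intern(const std::string &Name) {
    // insert() returns the existing element if Name is already present.
    // Pointers to unordered_set elements stay valid across rehashing.
    return &*Pool.insert(Name).first;
  }

private:
  std::unordered_set<std::string> Pool;
};

// Equal names from the same pool share one address, so equality is a
// pointer comparison rather than a character-by-character comparison.
bool sameSymbol(const std::string *A, const std::string *B) { return A == B; }
```

Two pools each hold their own copy of a name, which is why the commit requires all LinkGraphs whose symbol names are compared by pointer to share one pool instance.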
[RISCV][GISel] Use correct shift width for GIShiftMask32 ComplexOpera…
…ndMatcher. We should use 32 instead of XLen. This allows us to remove 'and X, 31' from the shift amount.
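A small model shows why the `and X, 31` is redundant. It assumes the 32-bit shift instruction reads only the low 5 bits of its amount operand, as the RISC-V `sllw` instruction does; the function names are hypothetical and this is not the GlobalISel matcher code.

```cpp
#include <cassert>
#include <cstdint>

// Models a 32-bit shift instruction that reads only the low 5 bits of the
// shift amount register (assumption: this mirrors RISC-V sllw).
uint32_t hardwareShl32(uint32_t Value, uint64_t Amount) {
  return Value << (Amount & 31);
}

// Shift amount masked explicitly before the shift, as in 'and X, 31'.
uint32_t shiftWithAnd(uint32_t Value, uint64_t X) {
  return hardwareShl32(Value, X & 31);
}

// Same shift with the mask dropped; the instruction masks implicitly.
uint32_t shiftWithoutAnd(uint32_t Value, uint64_t X) {
  return hardwareShl32(Value, X);
}
```

Both functions agree for every shift amount, so the explicit mask can be removed. A matcher that assumed a width of XLen (64) would only treat a mask covering 6 bits as redundant and would leave the `and X, 31` in place.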
[AMDGPU][True16][MC] create true16/fake16 mc tests for more vop3 test…
… file (#118859) This is an NFC change. Duplicate the test files into true16 and fake16 variants of the MC tests and update them with the appropriate +real-true16/-real-true16 flags. This prepares for more test changes for true16 flows.
[InstrProf][lld] Extend test to confirm order_file takes precedence o…
[examples] Add missing dependence on OrcShared.
Hopefully this will fix the linker error in https://lab.llvm.org/buildbot/#/builders/80/builds/7248.
[RISCV] Use zext and shift for spread(4,8) when types allow (#118893)
For a spread with an element type small enough, we can use a zext and shift to perform the shuffle. For e8, this covers spread(2,4,8), and for e16 covers spread(2,4). Note that spread(2) is already covered by the existing interleave logic, and is simply listed for completeness in the prior description.
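The zext-and-shift idea can be modeled on byte lanes. This sketch assumes that spread(N) places source element i in result lane N*i plus an index offset, and that lanes are numbered little-endian within each widened element; `spreadViaZextShift` is a hypothetical helper, not the RISC-V lowering code. The unused lanes are zero here, whereas a real shuffle may leave them undefined.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Spreads 8-bit elements by Factor (2, 4, or 8): each source element is
// zero-extended to a Factor*8-bit element, shifted left to select the lane
// given by Index, then viewed again as 8-bit lanes.
std::vector<uint8_t> spreadViaZextShift(const std::vector<uint8_t> &Src,
                                        unsigned Factor, unsigned Index) {
  assert(Factor >= 2 && Factor <= 8 && Index < Factor);
  std::vector<uint8_t> Result;
  Result.reserve(Src.size() * Factor);
  for (uint8_t Elt : Src) {
    // zext, then shift by whole bytes to pick the destination lane.
    uint64_t Wide = static_cast<uint64_t>(Elt) << (8 * Index);
    // Reinterpret the wide element as Factor byte lanes, lowest lane first.
    for (unsigned Byte = 0; Byte < Factor; ++Byte)
      Result.push_back(static_cast<uint8_t>((Wide >> (8 * Byte)) & 0xFF));
  }
  return Result;
}
```

For example, spreading `{1, 2, 3}` by a factor of 4 at index 1 puts the source values in lanes 1, 5, and 9 of a 12-lane result.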
[RISCV] Correct the pass name for RISCVPostRAExpandPseudo.
riscv-expand-pseudolisimm32 -> riscv-post-ra-expand-pseudoa
[clang] Compute accurate begin location for CallExpr with explicit ob…
[ORC] Provide default MemoryAccess in SimpleRemoteEPC, add WritePoint…
…ers impl. Make EPCGenericMemoryAccess the default implementation for the MemoryAccess object in SimpleRemoteEPC, and add support for the WritePointers operation to OrcTargetProcess (previously this operation was unimplemented and would have triggered an error if accessed in a remote-JIT setup). No testcase yet: This functionality requires cross-process JITing to test (or a much more elaborate unit-test setup). It can be tested once the new top-level ORC runtime project lands.
[CGData][GlobalIsel][Legalizer][DAG][MC][AsmParser][X86][AMX] Use `st…
…d::move` to avoid copy (#118068)
[flang][cuda] Distinguish constant fir.global from globals with a #cu…
…f.cuda<constant> attribute (#118912)
1. In `CufOpConversion`, `isDeviceGlobal` was renamed to `isRegisteredGlobal` and moved to the common file. `isRegisteredGlobal` excludes constant `fir.global` operations from registration. This avoids calls to `_FortranACUFGetDeviceAddress` on globals that do not have any symbols in the runtime. This was done for `_FortranACUFRegisterVariable` in #118582, but also needs to be done here after #118591.
2. `CufDeviceGlobal` no longer adds the `#cuf.cuda<constant>` attribute to the constant global. As discussed in #118582, a module variable with the `#cuf.cuda<constant>` attribute is not a compile-time constant. Yet the compile-time constant also needs to be copied into the GPU module. The candidates for copying to the GPU module are:
   - the globals needing registration regardless of their uses in device code (they can be referred to in host code as well)
   - the compile-time constants when used in device code
3. The registration of "constant" module device variables (`#cuf.cuda<constant>`) can be restored in `CufAddConstructor`.
[memprof] Rename Inline to IsInlineFrame in YAML (#118901)
This patch makes the YAML field name match the struct field name.
[Serialization] Support loading specializations lazily
Currently all the specializations of a template (including
instantiation, specialization and partial specializations) will be
loaded at once if we want to instantiate another instance for the
template, or find instantiation for the template, or just want to
complete the redecl chain.
This means basically we need to load every specialization of the
template once the template declaration is loaded. This is bad since
when we load a specialization, we need to load all of its template
arguments. Then we have to deserialize a lot of unnecessary
declarations.
For example,
```
// M.cppm
export module M;
export template <class T>
class A {};
export class ShouldNotBeLoaded {};
export class Temp {
  A<ShouldNotBeLoaded> AS;
};
// use.cpp
import M;
A<int> a;
```
We have a specialization `A<ShouldNotBeLoaded>` in `M.cppm` and we
instantiate the template `A` in `use.cpp`. Then, surprisingly, we will
deserialize `ShouldNotBeLoaded` when compiling `use.cpp`. This
patch tries to avoid that.
Given that the templates are heavily used in C++, this is a pain point
for the performance.
This patch adds MultiOnDiskHashTable for specializations in the
ASTReader. Then we will only deserialize the specializations with the
same template arguments. We do this by using the ODRHash of the template
arguments as the key of the hash table.
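The lookup scheme can be sketched with an in-memory table. This is an illustration only: `LazySpecializationTableSketch` is a hypothetical class, `std::hash<std::string>` stands in for ODRHash, and a string stands in for the template argument list. It is not the ASTReader or MultiOnDiskHashTable code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical table: specializations stay "on disk" in buckets keyed by a
// hash of their template arguments until someone asks for those arguments.
class LazySpecializationTableSketch {
public:
  void addOnDisk(const std::string &TemplateArgs, const std::string &Spec) {
    OnDisk[hashArgs(TemplateArgs)].push_back(Spec);
  }

  // Loads only the bucket whose key matches the requested arguments and
  // returns how many specializations were loaded.
  std::size_t loadFor(const std::string &TemplateArgs) {
    auto It = OnDisk.find(hashArgs(TemplateArgs));
    if (It == OnDisk.end())
      return 0;
    std::size_t Count = It->second.size();
    for (std::string &Spec : It->second)
      Loaded.push_back(std::move(Spec));
    OnDisk.erase(It);
    return Count;
  }

  const std::vector<std::string> &loaded() const { return Loaded; }

private:
  // Stand-in for ODRHash over the template arguments.
  static std::size_t hashArgs(const std::string &Args) {
    return std::hash<std::string>{}(Args);
  }

  std::unordered_map<std::size_t, std::vector<std::string>> OnDisk;
  std::vector<std::string> Loaded;
};
```

In terms of the example above, asking for `A<int>` only touches the bucket for `int`, so `A<ShouldNotBeLoaded>` stays unloaded unless the two argument lists happen to hash to the same key.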
To review this patch, I think `ASTReaderDecl::AddLazySpecializations`
may be a good entry point.
The patch was reviewed in
#83237, but that PR is a stacked
PR, and I feel the intention of the stacked PRs got lost during the
review process. So I feel it is better to merge the commits into a
single commit instead of merging them in the PR page. It is better for
us to cherry-pick and revert.
[llvm] Pass FFI CMake options through to runtimes (for offload) (#118807