- AVX
- Remove unnecessary moves from VPSHUF{D, HW, LW} (a4f89b79a)
- Reduce codegen for 256-bit VMOVMSKPD/VMOVMSKPS (394a6f28d)
- Handle two field insertions in VPERMQ (83a989c6d)
- Slightly trim codegen for VPSADBW 256-bit case (9740f488c)
- Handle trivial UZP/ZIP operations in VPERMQ (c85e426cc)
- Handle trivial cases better for VDPPS (3470dd1e7)
- Handle easily broadcastable permutations in VPERMQ (7ae55d73c)
- Skip identity insertions in VPERMQ (d5be15c90)
- Handle transpose cases in VPERMQ (ec1b24d05)
- Remove unnecessary dup if sources are the same in SHUFOpImpl (d93997c1c)
- Shave some moves off 256-bit VPALIGNR (ab4e0f653)
- Reduce moves in 256-bit VPSHUFB (fda023e7d)
- Wire up helper to VPERMILPD (588126625)
- Wire up lane helper for VPERMILPS imm variant (b846cb6c2)
- Wire up lane helper for VSHUFPD/VSHUFPS (62ecdd650)
- Wire up lane helper for VPSHUFD/VPSHUFLW/VPSHUFHW (9f195ff37)
- Reduce inserts in VBLENDPS/VPBLENDD/VPBLENDW (4eb569487)
- Remove unnecessary moves from PINSRX ops (5ad06b025)
- Allocator
- Fix and optimize VA range detection (af4da43bb)
- Arm64
- Fix byte-size handling in unaligned STLXR emulation (f66368b19)
- Arm64Emitter
- Tidy up load/stores in Push/PopCalleeSavedRegisters (500d2374a)
- Build
- Enable ccache sloppiness for time macros (cae5da577)
- CodeCache
- Support targeting WOW64/ARM64EC in FEXOfflineCompiler (08ed4fb98)
- Resolve comment https://github.com/FEX-Emu/FEX/pull/5441#discussion_r3255384588 (df73e8472)
- Implement lazy code loading (bb0d142a6)
- Config
- Use more sensible default for portable config location (33f3b8659)
- Core
- Start a new block for the next op in full SMC check (b23fa3009)
- FEXBash
- Drop implicit -c and add colored PS1 (9be7d6d11)
- FEXCore
- Pass host type that changes codegen to FEXCore (3e60aa573)
- Ensure LOCK prefix instructions are handled correctly (d848cbbc0)
- Allow InterruptFaultPage to be significantly further away (e4daea406)
- Add support for developer single stepping and read/write watching (d4c80d909)
- FEXGetConfig
- Even more correctness changes for X2E (ab9a8c62a)
- Showcase RMW versus loadstore atomic differences (50f449487)
- FEXOfflineCompiler
- Fixes HostFeature detection under Win32 (7fa4d7826)
- Add "process-all" verb (cb018257c)
- FEXServerClient
- Workaround sun_path 108 byte limit (1db45e2a7)
- Format
- Fix missed clang-format (5fd917ec2)
- Frontend
- Fix vsyscall page tracking (e02953dc1)
- HostFeatures
- Put SVE support querying into single function (b9aeccf13)
- Don't capture CTR/MIDR under simulator (07f7aa3c8)
- Only enable `dc zva` optimization on Ampere CPUs (c98cef0da)
- IR
- Add constant for swapping midsections of 256-bit vectors around (110313e7d)
- ImageTracker
- Support using image IDs as an extended volatile metadata key (ed724a61a)
- IntrusiveIRList
- Amend signature for PostRA() (b87ff1e2d)
- JIT
- Arm64: fix the loop in CacheLineClear/Clean (27a5f0918)
- Amend op typos in implementations (55c90cfc3)
- LibraryForwarding
- Implement support for CUDA (e12bd2710)
- Add annotation for snd_htimestamp_t (a5c3fc475)
- cuda
- Convert constexpr to const (9ac608ca4)
- LinuxSyscalls
- Add missing thread header (a1071ec01)
- OpcodeDispatcher
- Fix missing zeroing for vcvtps2ph (9f2e98294)
- Fixes CRC32 with high 8-bit register (43bd24345)
- Fix incorrect comment for BTOp. ZF must be preserved. (e19aa975c)
- Fix typo in comment (7dc2dc874)
- Sanitize selectors for VBLENDPD/VPBLENDD (681c5e809)
- Eliminate redundant moves in VMOVHPOp (dd44bc8d0)
- Make use of Bind consistently (110c7cb62)
- Move a few stray literal accesses to Literal() (09aa5abbd)
- Fixes 64-bit LODS with address size override (ed6a178ae)
- Passes
- Trim unnecessary forward declarations (280568df2)
- Proton
- Fixes Mafia 3 (34344e576)
- Vector
- Only signify 128-bit vector loads in UCOMISxOp (e26a792b7)
- Fix typo in VPERMQOp (70fe9a440)
- Remove unused OpcodeArgs parameter from SHUFOpImpl (ad618be97)
- VectorOps
- Eliminate unnecessary moves in VMov if applicable (6bcadde65)
- Avoid temporary if able in 256-bit VFRecp (5f1c8efe0)
- Reduce temporary usage in 64-bit AdvSIMD min max paths (16f90b33f)
- Avoid move in 256-bit VAddP/VFAddP if possible (5d8d052a7)
- Avoid dup if able in VInsElement 128-bit element path (01b0b4e65)
- WOW64
- Support disabling DEP (f5efdac2e)
- Windows
- Fixes SHM stats reallocation (b4e2f5118)
- Load unixlib if possible (78832cc0d)
- Adds empty Linux side unix library (fe4d2bc6c)
- Trace interrupt translation and prototype INT 0x29 fail-fast mapping (f5fafa5b9)
- UnixLib
- Fix loading with new MemoryWineLoadUnixLibByName mechanism (44e24c9e6)
- Adds remaining helpers (32b96c259)
- Adds support for hardware TSO (24da43f82)
- Misc
- Re-optimize FYL2X for reduced precision x87 path (3d593ce87)
- instcountci/VEX_map3: Add missing third param to VPBLENDD (d138c854f)
- instcountci/VEX_map1: Remove obsolete comments (7e2d3b07c)
- Do not exec FEX if it is a folder in FEXBash (848c4b2d6)
- Fix UAF of PS1 (4f995cbc1)
- [SVE256] Add fast paths for trivial VSHUFPD flags (0b0000 and 0b1111) (161937425)
- [SVE256] Remove unnecessary move in VCVTPS2PD (c13064e20)
- [SVE256] Remove heavy handed moves from scalar compares (37e32fbcb)
- [SVE256] More comprehensively test SSE insertions for scalar comparisons (27acbba52)
- [SVE256] Add more SSE scalar variant unit tests (6f33d2b4c)
- [SVE256] Handle SSE insertions for PCMPESTRM/PCMPISTRM ops (c5880e761)
- [SVE256] Handle SSE insertions for MOVQ2DQ (9d0c05d9c)
- [SVE256] Handle SSE insertions for EXTRQ/INSERTQ (f6a68cb7f)
- [SVE256] Handle SSE insertions for CVTPI2PD (462c78541)
- Fix incorrect RSP update for 16bit leave (f5477039f)
- [SVE256] Handle SSE insertions for PMADDWD (ee4794c99)
- [SVE256] Handle SSE insertions for CMPSD/CMPSS (3a23bb4b7)
- [SVE256] Handle SSE insertions for MOVSHDUP/MOVSLDUP (46ffb25f8)
- [SVE256] Handle SSE insertion for aligned/unaligned loads and non-temporal loads (069a4025c)
- [SVE256] Handle SSE insertions for MOVH(PD, PS, LPS) and MOVL(PD, PS, HPS) (3929d25dc)
- [SVE256] Handle SSE insertions for XOR special case (f98ac7f26)
- [SVE256] Handle SSE insertions for INSERTPS, PSIGN, PINSR, and shuffles (23099100b)
- [SVE256] Handle SSE insertions for shifts (adad3c27d)
- (470aeab21)
- [SVE256] Handle SSE insertions for more misc ops (99662b70f)
- [SVE256] Handle SSE insertions for PHMINPOSUW, DPPD, and DPPS (1a606de29)
- [SVE256] Handle SSE insertions for blends (223e0f4e5)
- (0b1f336e0)
- [SVE256] Handle more SSE insertions for some one-off instructions (d9ea6651f)
- [SVE256] Vector: Handle SSE insertion properly for various ALU operations (9d5494d9f)
- Drop unused CMakeSettings.json (97f1f47fa)
- Set sysroot to X86_DEV_ROOTFS for guest toolchain (53c269ee2)
- New CPL0 instructions from #5510 but with unittests (1240a00fa)
- code-format-helper: More dependabot changes (0b871bf54)
- Cherry-pick #5508 with instcountci changes (268081e5d)
- Windows additions for code caching (27324ded8)
- Library Forwarding: Various build system improvements (b4fe65f2c)
- JIT-inline FPREM/FPREM1 for reduced precision x87 path (0d7289048)
- arm64ec
- Single instruction optimization in EC map lookup (65b05fa8c)
- Fixes some FEX allocations that were missing TOP_DOWN (7ff2069e6)
- instcountci
- Add a few more cases for VPBLENDW/VPBLENDD and VSHUFPD/VSHUFPS (72e01274c)
- Add 16-bit pcmpxstrx variants (3d66be9e5)
- meta
- Add CONTRIBUTING.md (83f325de0)
- unittests
- Add selector tests for VPSHUF{D, HW, LW} (a79c471c3)
- Add test for stress-testing VPBLENDD selectors (417bd8604)