
Compilation optimization for modern CPU #137

Open
DarkGhostHunter wants to merge 2 commits into PHPantom-dev:main from DarkGhostHunter:optimized-builds

DarkGhostHunter commented May 20, 2026

This PR fixes #132. It adds two things:

  1. Multiple builds for modern CPUs:
  • Adds v2, v3, and v4 builds for all x86-64 targets: Windows, Linux and (Intel) Mac¹.
  • Removes the v1 (default) builds.
    • Old builds can still be compiled manually with x86-64 (the v1 baseline) or native as the target CPU.
  • Sets "friendly" suffixes for each build:
    • v2: (as-is)
    • v3: modern
    • v4: avx512
  2. CARGO environment variables for better binaries:
  • CARGO_PROFILE_RELEASE_OPT_LEVEL="3": maximum optimisation for execution speed, at the cost of longer compilation.
  • CARGO_PROFILE_RELEASE_LTO="thin": cross-crate optimisation, at the cost of longer compilation, with a lower risk of OOM than full LTO.
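
To make the naming scheme above concrete, here is a minimal sketch of how a level could map to its suffix and to the value passed as target CPU. The `Level` type, the `artifact_name` helper and the `<base>-<suffix>` layout are assumptions for illustration only; the actual artifact names produced by the workflow may differ.

```rust
/// x86-64 microarchitecture levels built by this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    V2,
    V3,
    V4,
}

impl Level {
    /// Friendly suffix from the PR description; v2 ships unsuffixed.
    pub fn suffix(self) -> Option<&'static str> {
        match self {
            Level::V2 => None,
            Level::V3 => Some("modern"),
            Level::V4 => Some("avx512"),
        }
    }

    /// Value for rustc's `-C target-cpu=` flag at this level.
    pub fn target_cpu(self) -> &'static str {
        match self {
            Level::V2 => "x86-64-v2",
            Level::V3 => "x86-64-v3",
            Level::V4 => "x86-64-v4",
        }
    }
}

/// Hypothetical artifact naming: `<base>` for v2, `<base>-<suffix>` otherwise.
pub fn artifact_name(base: &str, level: Level) -> String {
    match level.suffix() {
        Some(suffix) => format!("{base}-{suffix}"),
        None => base.to_string(),
    }
}
```

With a made-up base name such as `server-linux-x86_64`, `artifact_name` leaves the v2 name untouched and appends `-modern` or `-avx512` for v3 and v4, which is why an unmodified downloader keeps fetching the v2 build.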

The Zed Editor extension will still download the v2 builds by default on x86-64, since they are not suffixed.

It would be great if the extension could detect which CPU features are available and download the most optimised binary for the system. I believe this can be done by probing the CPU for its feature flags:

  • v2 requires (among others): popcnt, sse4_2, ssse3
  • v3 requires (among others): avx2, bmi1, bmi2, fma, movbe
  • v4 requires: avx512f, avx512bw, avx512cd, avx512dq, avx512vl
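
The flag lists above can be turned into a small, I/O-free classifier. This is only a sketch: it checks just the flags named in this description, which are a subset of what the v2 and v3 levels formally require, and the `highest_level` name and the `u8` return value are hypothetical choices.

```rust
use std::collections::HashSet;

// Flags taken from the lists in this description (not the full level definitions).
const V2_FLAGS: &[&str] = &["popcnt", "sse4_2", "ssse3"];
const V3_FLAGS: &[&str] = &["avx2", "bmi1", "bmi2", "fma", "movbe"];
const V4_FLAGS: &[&str] = &["avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"];

/// Returns the highest level (2, 3 or 4) whose flags are all present,
/// requiring every lower level to be complete as well.
/// Returns 1 when even the v2 flags are incomplete.
pub fn highest_level(flags: &HashSet<&str>) -> u8 {
    let has_all = |needed: &[&str]| needed.iter().all(|f| flags.contains(f));

    let mut level = 1;
    for (candidate, needed) in [(2, V2_FLAGS), (3, V3_FLAGS), (4, V4_FLAGS)] {
        if !has_all(needed) {
            break; // levels are cumulative, so stop at the first gap
        }
        level = candidate;
    }
    level
}
```

Walking the levels in order keeps them cumulative: a CPU that reports the AVX-512 flags but is missing a v3 flag is not classified as v4.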

AI helped with the block below while I was working out what the logic would look like. I guess it will depend on calling shell commands to check CPU support.

use zed_extension_api::{self as zed, process::Command};

#[derive(Debug, Clone, Copy)]
pub enum X86Version {
    V2,   // Generic x86-64 with SSE4.2, SSSE3, POPCNT
    V3,   // AVX, AVX2, BMI1, BMI2, FMA
    V4,   // AVX512
    None, // Non-X86 
}

fn detect_x86_version() -> X86Version {
    let (os, arch) = zed::current_platform();
    
    // On ARM or any other architecture, x86-64 microarchitecture levels don't apply.
    // zed_extension_api names the 64-bit x86 variant `X8664`.
    if !matches!(arch, zed::Architecture::X8664) {
        return X86Version::None; 
    }

    match os {
        zed::Os::Mac => {
            // macOS provides the sysctl utility. Query one key per call:
            // an unknown key prints nothing to stdout, so batching keys
            // would shift the remaining lines and misreport the level.
            let has = |key: &str| {
                Command::new("sysctl")
                    .args(["-n", key])
                    .output()
                    .map_or(false, |out| String::from_utf8_lossy(&out.stdout).trim() == "1")
            };

            // Check AVX-512 (v4)
            if has("hw.optional.avx512f") {
                return X86Version::V4;
            }

            // Check AVX2 (v3); macOS names this key `hw.optional.avx2_0`
            if has("hw.optional.avx2_0") {
                return X86Version::V3;
            }

            X86Version::V2 // Fallback: Intel Macs from Nehalem onwards support v2
        }

        zed::Os::Linux => {
            // Read /proc/cpuinfo through `cat`, since detection relies on shell commands
            if let Ok(out) = Command::new("cat").arg("/proc/cpuinfo").output() {
                let stdout = String::from_utf8_lossy(&out.stdout);

                // Use the flags line of the first core
                if let Some(flags_line) = stdout.lines().find(|l| l.starts_with("flags")) {
                    let flags: Vec<&str> = flags_line.split_whitespace().collect();
                    let has_all = |needed: &[&str]| needed.iter().all(|f| flags.contains(f));

                    // Check Level 4 (all five AVX-512 subsets listed above)
                    if has_all(&["avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"]) {
                        return X86Version::V4;
                    }

                    // Check Level 3 (AVX2, bit manipulation, FMA and MOVBE)
                    if has_all(&["avx2", "bmi1", "bmi2", "fma", "movbe"]) {
                        return X86Version::V3;
                    }
                }
            }

            X86Version::V2
        }

        zed::Os::Windows => {
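            // This PowerShell script asks the .NET hardware intrinsics classes whether AVX2 and AVX-512F are supported.
            // Caveat: `Avx512F` only exists in .NET 8 and later, so the check needs PowerShell 7.4+ (`pwsh`).
            // Windows PowerShell 5.1 (`powershell`) runs on .NET Framework, lacks these types, and falls through to v2.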
            let ps_script = r#"
                $isV4 = [System.Runtime.Intrinsics.X86.Avx512F]::IsSupported;
                $isV3 = [System.Runtime.Intrinsics.X86.Avx2]::IsSupported;
                if ($isV4) { Write-Output "v4" }
                elseif ($isV3) { Write-Output "v3" }
                else { Write-Output "v2" }
            "#;

            // Execute the script silently through the CLI
            let output = Command::new("powershell")
                .args([
                    "-NoProfile",
                    "-NonInteractive",
                    "-Command",
                    ps_script
                ])
                .output();

            if let Ok(out) = output {
                let stdout = String::from_utf8_lossy(&out.stdout).trim().to_lowercase();
                
                if stdout.contains("v4") {
                    return X86Version::V4;
                } else if stdout.contains("v3") {
                    return X86Version::V3;
                }
            }

            return X86Version::V2;
        }
    }
}
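
One way to keep the Linux branch easy to test is to separate the parsing from the command execution. The sketch below is an assumption about how that could look, not part of this PR: `parse_cpuinfo_flags` is a hypothetical helper that takes the text of `/proc/cpuinfo` and returns the flag set of the first listed core.

```rust
use std::collections::HashSet;

/// Extracts the feature flags of the first CPU listed in
/// `/proc/cpuinfo`-style text. Returns an empty set when no
/// `flags` line is present.
pub fn parse_cpuinfo_flags(cpuinfo: &str) -> HashSet<&str> {
    cpuinfo
        .lines()
        .find_map(|line| {
            // Lines look like "flags\t\t: fpu vme ..."; split key from value.
            let (key, value) = line.split_once(':')?;
            (key.trim() == "flags").then(|| value.split_whitespace().collect())
        })
        .unwrap_or_default()
}
```

The detection function would then pass the `cat` output to this helper, and the helper itself can be exercised with fixed strings without spawning any process.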

1: Intel Macs span Nehalem (2008), Haswell (2013), Skylake-X/Xeon (2017), and Ice Lake (2019).

Development

Successfully merging this pull request may close these issues.

[1.x] Optimized builds for modern CPU, remove old CPU
