is_dotnet_package_installed "already installed" check is weaker than the .NET host visibility check, leading to un-healable broken installations #694

@nmoinvaz

Description

The "already installed" short-circuit in both dotnet-install.sh and dotnet-install.ps1 decides whether a requested version is already present by checking only whether the version subdirectory exists on disk. The .NET host in dotnet/runtime uses a stricter check for the same question: it requires the directory to exist and contain a specific marker file. These two checks are asymmetric, and the asymmetry creates a class of broken installations that the install scripts refuse to repair because they consider the version "already installed," while the host refuses to use it because the marker file is missing.

Current check in install-scripts

Bash (src/dotnet-install.sh:574-589):

is_dotnet_package_installed() {
    eval $invocation

    local install_root="$1"
    local relative_path_to_package="$2"
    local specific_version="${3//[$'\t\r\n']}"

    local dotnet_package_path="$(combine_paths "$(combine_paths "$install_root" "$relative_path_to_package")" "$specific_version")"
    say_verbose "is_dotnet_package_installed: dotnet_package_path=$dotnet_package_path"

    if [ -d "$dotnet_package_path" ]; then
        return 0
    else
        return 1
    fi
}

PowerShell (src/dotnet-install.ps1:786-792):

function Is-Dotnet-Package-Installed([string]$InstallRoot, [string]$RelativePathToPackage, [string]$SpecificVersion) {
    Say-Invocation $MyInvocation

    $DotnetPackagePath = Join-Path -Path $InstallRoot -ChildPath $RelativePathToPackage | Join-Path -ChildPath $SpecificVersion
    Say-Verbose "Is-Dotnet-Package-Installed: DotnetPackagePath=$DotnetPackagePath"
    return Test-Path $DotnetPackagePath -PathType Container
}

Both are purely is-directory checks.

What the .NET host actually requires

The host's framework discovery logic in dotnet/runtime/src/native/corehost/fxr/framework_info.cpp enumerates shared/<framework_name>/ and silently skips any version whose directory is missing a specific marker file:

std::vector<pal::string_t> versions;
pal::readdir_onlydirectories(fx_dir, &versions);
for (const pal::string_t& ver : versions)
{
    // Make sure we filter out any non-version folders.
    fx_ver_t parsed;
    if (!fx_ver_t::parse(ver, &parsed, false))
        continue;

    // Check that the framework's .deps.json exists.
    pal::string_t fx_version_dir = fx_dir;
    append_path(&fx_version_dir, ver.c_str());
    if (!file_exists_in_dir(fx_version_dir, deps_file_name.c_str(), nullptr))
    {
        trace::verbose(_X("Ignoring FX version [%s] without .deps.json"), ver.c_str());
        continue;
    }
    ...
}

Required marker for a framework version: <framework_name>.deps.json (e.g., Microsoft.NETCore.App.deps.json) must exist inside shared/<framework_name>/<version>/. If the directory is present but the marker is missing, the version is silently ignored, as if it weren't installed.

SDK discovery in sdk_info.cpp has the same structural requirement, with dotnet.dll as the marker:

// Check for the existence of dotnet.dll
pal::string_t sdk_version_dir = sdk_dir;
append_path(&sdk_version_dir, version_str.c_str());
if (!file_exists_in_dir(sdk_version_dir, SDK_DOTNET_DLL, nullptr))
{
    trace::verbose(_X("Ignoring version [%s] without ") SDK_DOTNET_DLL, version_str.c_str());
    continue;
}

The failure mode

Any process that can leave a version directory present but without its marker file produces an installation state with the following properties:

  • is_dotnet_package_installed / Is-Dotnet-Package-Installed returns true.
  • The "already installed" short-circuit at dotnet-install.sh:1431 and dotnet-install.sh:1489 (and the PowerShell equivalents) exits 0 without doing any work.
  • The .NET host silently omits the version from its discovery list.
  • dotnet --info, dotnet --list-runtimes, and dotnet --list-sdks do not show the version.
  • A subsequent dotnet <command> fails with "You must install or update .NET to run this application" and the "The following frameworks were found" list omits the affected version.

Re-running dotnet-install.sh against the same version does not fix the state — the short-circuit takes the same branch on every invocation. Manual intervention (deleting the version subdirectory) is required to get the script to reinstall.
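The asymmetry can be reproduced without a real .NET install. The following is a minimal sketch that builds the broken layout in a throwaway directory and asks both questions side by side. The version number 8.0.0 and the variable names are illustrative assumptions; only the directory test and the marker file name come from the code quoted above.

```shell
#!/usr/bin/env bash
# Reproduce the "directory present, marker missing" state in a temp directory.
# The version number is hypothetical and chosen only for illustration.

root="$(mktemp -d)"
version="8.0.0"
fx_dir="$root/shared/Microsoft.NETCore.App/$version"

# Simulate an interrupted extraction: the directory exists, the marker does not.
mkdir -p "$fx_dir"

# The question as the current is_dotnet_package_installed asks it: directory only.
if [ -d "$fx_dir" ]; then
    dir_check="installed"
else
    dir_check="missing"
fi

# The question as the host asks it: the marker file must exist.
if [ -f "$fx_dir/Microsoft.NETCore.App.deps.json" ]; then
    host_check="visible"
else
    host_check="ignored"
fi

echo "install script: $dir_check"
echo "host:           $host_check"

rm -rf "$root"
```

The two answers disagree for the same directory, which is the state the short-circuit cannot recover from.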

Mechanisms that can produce this state

Anything that leaves the directory intact while removing, truncating, or failing to write the marker file can produce this state. A non-exhaustive list:

  • A cancelled install where extraction began but didn't complete. This is tracked as actions/setup-dotnet#501 and has been observed in production.
  • A second concurrent install into the same directory that overwrites some files but not others before failing or being killed.
  • A cleanup or disk-quota process that removes individual files rather than whole directories.
  • A partial extraction due to disk-full conditions.

A real-world incident of this class was recently observed in a self-hosted GitHub Actions runner using actions/setup-dotnet with a persistent DOTNET_INSTALL_DIR. The SDK was marked "already installed" and the install script short-circuited, but the matching runtime was invisible to the host. The runner state was cleaned before the exact on-disk state could be captured, so we cannot attribute the incident definitively to a missing .deps.json vs. a physically deleted directory — but the code-level inconsistency exists regardless of which branch triggered it.
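Because the on-disk state is easy to lose, a small scan run before cleanup can record which version directories the host would ignore. The sketch below is a hypothetical helper, not part of dotnet-install.sh; the name find_incomplete_versions is an assumption, and it checks only the two markers described above. Unlike the host, it does not filter out directories whose names are not valid versions.

```shell
#!/usr/bin/env bash
# find_incomplete_versions ROOT
# Prints each version directory under ROOT/sdk and ROOT/shared/<framework_name>
# that lacks the marker file the host looks for.
# Hypothetical helper for diagnosis; not part of dotnet-install.sh.
find_incomplete_versions() {
    local root="$1"
    local dir name

    # SDK versions: the marker is dotnet.dll.
    for dir in "$root"/sdk/*/; do
        [ -d "$dir" ] || continue    # the glob matched nothing
        [ -f "${dir}dotnet.dll" ] || printf '%s\n' "${dir%/}"
    done

    # Shared frameworks: the marker is <framework_name>.deps.json.
    for dir in "$root"/shared/*/*/; do
        [ -d "$dir" ] || continue
        name="$(basename "$(dirname "$dir")")"
        [ -f "${dir}${name}.deps.json" ] || printf '%s\n' "${dir%/}"
    done
}
```

On a runner with a persistent install directory this could be called as find_incomplete_versions "$DOTNET_INSTALL_DIR"; any path it prints is a candidate for the manual deletion described earlier.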

Proposed fix

Have is_dotnet_package_installed / Is-Dotnet-Package-Installed verify the same marker file the host checks for, selecting the marker based on the asset type:

  • For asset_relative_path == "sdk" → require dotnet.dll inside the version directory.
  • For asset_relative_path == "shared/Microsoft.NETCore.App" → require Microsoft.NETCore.App.deps.json.
  • For asset_relative_path == "shared/Microsoft.AspNetCore.App" → require Microsoft.AspNetCore.App.deps.json.
  • Generally, for any shared/<framework_name> → require <framework_name>.deps.json.

Sketch for dotnet-install.sh:

is_dotnet_package_installed() {
    eval $invocation

    local install_root="$1"
    local relative_path_to_package="$2"
    local specific_version="${3//[$'\t\r\n']}"

    local dotnet_package_path="$(combine_paths "$(combine_paths "$install_root" "$relative_path_to_package")" "$specific_version")"
    say_verbose "is_dotnet_package_installed: dotnet_package_path=$dotnet_package_path"

    if [ ! -d "$dotnet_package_path" ]; then
        return 1
    fi

    # Verify the host-visible marker file, matching dotnet/runtime host behavior.
    local marker=""
    if [[ "$relative_path_to_package" == "sdk" ]]; then
        marker="dotnet.dll"
    elif [[ "$relative_path_to_package" == shared/* ]]; then
        local fx_name="${relative_path_to_package#shared/}"
        marker="${fx_name}.deps.json"
    fi

    if [ -n "$marker" ] && [ ! -f "$dotnet_package_path/$marker" ]; then
        say_verbose "Directory exists but marker file '$marker' is missing; treating as not installed."
        return 1
    fi

    return 0
}

The PowerShell version would mirror this using Test-Path -PathType Leaf for the marker file.

This change keeps the fast-path optimization for the common case (a complete install is still a no-op) and only does additional work when the directory is present — an additional Test-Path/[ -f ] per short-circuit check, which is negligible. It does not require any behavioral change to how downloads or extractions work, and it makes the "already installed" contract match the host's "visible to dotnet" contract, which is the contract users actually care about.
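To try the proposed contract outside the install script, the check can be restated without the script's helpers. This is a sketch under stated assumptions: check_installed is a hypothetical name, it joins paths with a plain "/" instead of combine_paths, and the version 8.0.0 is illustrative. It walks one framework version through the three states that matter.

```shell
#!/usr/bin/env bash
# Standalone variant of the sketched check, for experimentation only.
# check_installed is a hypothetical name; it takes the same three arguments
# as is_dotnet_package_installed (install root, relative path, version).
check_installed() {
    local pkg_dir="$1/$2/$3"
    local marker=""

    [ -d "$pkg_dir" ] || return 1

    case "$2" in
        sdk)      marker="dotnet.dll" ;;
        shared/*) marker="${2#shared/}.deps.json" ;;
    esac

    # Unknown asset types keep the old directory-only behavior.
    [ -z "$marker" ] || [ -f "$pkg_dir/$marker" ]
}

root="$(mktemp -d)"
fx="shared/Microsoft.NETCore.App"

# State 1: no version directory at all.
check_installed "$root" "$fx" "8.0.0" && s1=installed || s1=missing

# State 2: directory present, marker missing (the broken state).
mkdir -p "$root/$fx/8.0.0"
check_installed "$root" "$fx" "8.0.0" && s2=installed || s2=missing

# State 3: directory and marker both present (a complete install).
touch "$root/$fx/8.0.0/Microsoft.NETCore.App.deps.json"
check_installed "$root" "$fx" "8.0.0" && s3=installed || s3=missing

echo "$s1 $s2 $s3"    # prints: missing missing installed

rm -rf "$root"
```

Only the third state short-circuits, so a re-run against the broken state would proceed to download and extract again.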
