
Bug in parsing pnpm-lock files #32

@connorshea


I noticed this while using Scrutineer and, with Claude's help, traced the bug to this package. Running Scrutineer's dependencies skill on @oxlint/migrate produced garbage in the resulting package list:

name | version | purl
"    '@typescript-eslint" | "parser'" | pkg:npm/%20%20%20%20'%40typescript-eslint@parser'

Example pnpm-lock.yaml file: https://github.com/oxc-project/oxlint-migrate/blob/e569fcfcfe8dca175ec5b9630b0fd878d8456ba6/pnpm-lock.yaml

It appears that blocks like this:

  '@typescript-eslint/eslint-plugin@8.59.3':
    resolution: {integrity: sha512-PwFvSKsXGShKGW6n5bZOhGHEcCZXM8HofLK9fNsEwZXzFRjoY+XT1Vsf1zgyXdwTr0ZYz1/2tkZ0DBTT9jZjhw==}
    engines: {node: ^18.18.0 || ^20.9.0 || >=21.1.0}
    peerDependencies:
      '@typescript-eslint/parser': ^8.59.3
      eslint: ^8.57.0 || ^9.0.0 || ^10.0.0
      typescript: '>=4.8.4 <6.1.0'

Or this:

  eslint-module-utils@2.12.1:
    resolution: {integrity: sha512-L8jSWTze7K2mTg0vos/RuLRS5soomksDPoJLXIslC7c8Wmut3bx7CPpJijDcBZtxQ5lrbUdM+s0OlNbz0DCDNw==}
    engines: {node: '>=4'}
    peerDependencies:
      '@typescript-eslint/parser': '*'
      eslint: '*'
      eslint-import-resolver-node: '*'
      eslint-import-resolver-typescript: '*'
      eslint-import-resolver-webpack: '*'
    peerDependenciesMeta:
      '@typescript-eslint/parser':
        optional: true
      eslint:
        optional: true
      eslint-import-resolver-node:
        optional: true
      eslint-import-resolver-typescript:
        optional: true
      eslint-import-resolver-webpack:
        optional: true

are the source of the misparsing, and invalid package entries end up being read from the lockfile.

The following is an explanation from Claude of the root-cause/possible fix, because I do not know Go very well :)


Symptom

The dependencies table contains rows with clearly invalid data sourced from pnpm-lock.yaml:

name | version | purl
"    '@typescript-eslint" | "parser'" | pkg:npm/%20%20%20%20'%40typescript-eslint@parser'

The name has leading whitespace and an unmatched quote; the version is the second half of a scoped package name.

Root cause

Triggering input

pnpm lockfiles can contain a peerDependenciesMeta: block inside a package definition, where each peer dependency appears as a YAML key with no version on the same line:

packages:
  '@typescript-eslint/eslint-plugin@8.59.3':
    resolution: ...
    peerDependencies:
      '@typescript-eslint/parser': ^8.59.3
    peerDependenciesMeta:
      '@typescript-eslint/parser':    # ← 6-space indent, ends with ':'
        optional: true

Step 1 — extractPnpmPackageKey accepts the wrong lines

The function is meant to match only top-level package key lines (2-space indent), but its guard only checks that the first two characters are spaces:

if len(line) < 4 || line[0] != ' ' || line[1] != ' ' {
    return "", false
}

A 6-space-indented line like '@typescript-eslint/parser': passes this guard, because its first two characters are also spaces. The function then extracts:

key = line[2:len(line)-1]  →  "    '@typescript-eslint/parser'"
                                 ^^^^
                                 4 stray leading spaces

Quote-stripping is skipped because key[0] is a space, not '. The key contains @, so it passes the final validity check and is returned as a legitimate package key.
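
To make Step 1 concrete, here is a minimal sketch that reproduces the behaviour described above. The function name looseExtractKey and the exact order of its checks are assumptions reconstructed from this write-up, not the package's actual source. Only the indent guard and the key slice are taken from the snippets quoted here.

```go
package main

import (
	"fmt"
	"strings"
)

// looseExtractKey is a HYPOTHETICAL reconstruction of the logic described in
// this issue. It is not copied from the real extractPnpmPackageKey.
func looseExtractKey(line string) (string, bool) {
	// Guard quoted in the issue: only the first two characters are checked.
	if len(line) < 4 || line[0] != ' ' || line[1] != ' ' {
		return "", false
	}
	// Assumption: key lines must end with ':'.
	if !strings.HasSuffix(line, ":") {
		return "", false
	}
	key := line[2 : len(line)-1]
	// Quote stripping only fires when the key itself begins with a quote.
	if len(key) >= 2 && key[0] == '\'' && key[len(key)-1] == '\'' {
		key = key[1 : len(key)-1]
	}
	// Final validity check as described: the key must contain '@'.
	if !strings.Contains(key, "@") {
		return "", false
	}
	return key, true
}

func main() {
	lines := []string{
		"  '@typescript-eslint/eslint-plugin@8.59.3':", // real package key
		"      '@typescript-eslint/parser':",           // peerDependenciesMeta entry
	}
	for _, line := range lines {
		key, ok := looseExtractKey(line)
		fmt.Printf("accepted=%v key=%q\n", ok, key)
	}
}
```

Both lines are accepted. The second one comes back with four leading spaces and its quotes intact, which is the value that Step 2 then mis-splits.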

Step 2 — parsePnpmPackageKey splits on /

The extracted key "    '@typescript-eslint/parser'" does not start with @ (it starts with spaces), so it falls into the non-scoped branch and is split on the first /:

if slashIdx := strings.Index(key, "/"); slashIdx > 0 {
    name    = key[:slashIdx]    // "    '@typescript-eslint"
    version = key[slashIdx+1:]  // "parser'"
    return name, version
}

This produces the garbage (name, version) pair that ends up in the database.
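
The split can be checked in isolation with a small sketch. splitOnFirstSlash is a hypothetical helper that contains only the non-scoped branch quoted above. The real parsePnpmPackageKey presumably has other branches (for example for scoped names) that are not shown in this issue and are not modelled here.

```go
package main

import (
	"fmt"
	"strings"
)

// splitOnFirstSlash is a HYPOTHETICAL helper mirroring only the non-scoped
// branch quoted in this issue.
func splitOnFirstSlash(key string) (name, version string) {
	if slashIdx := strings.Index(key, "/"); slashIdx > 0 {
		return key[:slashIdx], key[slashIdx+1:]
	}
	// Assumption: with no slash, treat the whole key as the name.
	return key, ""
}

func main() {
	// The key produced by Step 1: four stray spaces and unstripped quotes.
	key := "    '@typescript-eslint/parser'"
	name, version := splitOnFirstSlash(key)
	fmt.Printf("name=%q version=%q\n", name, version)
}
```

The resulting pair matches the bad row in the dependencies table: the four spaces in the name are what show up as %20%20%20%20 in the purl.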

Fix

Add one more condition to the guard in extractPnpmPackageKey so that it only accepts lines indented by exactly two spaces:

// Must start with EXACTLY 2 spaces (not 3 or more)
if len(line) < 4 || line[0] != ' ' || line[1] != ' ' || line[2] == ' ' {
    return "", false
}

This one-condition change causes all deeper-indented lines (nested keys, peerDependenciesMeta entries, etc.) to be rejected. The existing len(line) < 4 check guarantees that line[2] is in bounds.
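
As a quick sanity check of the proposed guard, the sketch below runs it against lines taken from the lockfile excerpts in this issue. hasExactlyTwoSpaceIndent is a hypothetical name that expresses the guard as a positive predicate. It is not a function from the package, and a real regression test would call extractPnpmPackageKey directly.

```go
package main

import "fmt"

// hasExactlyTwoSpaceIndent is a HYPOTHETICAL predicate equivalent to the
// negation of the proposed guard condition.
func hasExactlyTwoSpaceIndent(line string) bool {
	return len(line) >= 4 && line[0] == ' ' && line[1] == ' ' && line[2] != ' '
}

func main() {
	cases := []struct {
		line string
		want bool
	}{
		{"  '@typescript-eslint/eslint-plugin@8.59.3':", true},
		{"  eslint-module-utils@2.12.1:", true},
		{"    peerDependenciesMeta:", false},
		{"      '@typescript-eslint/parser':", false},
		{"packages:", false},
	}
	for _, c := range cases {
		got := hasExactlyTwoSpaceIndent(c.line)
		status := "ok"
		if got != c.want {
			status = "MISMATCH"
		}
		fmt.Printf("%-8s got=%-5v %q\n", status, got, c.line)
	}
}
```

Every case should print ok: the two real package keys pass, and the 4-space and 6-space nested keys are rejected before any key extraction happens.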
