vltpkg/reproduce

reproduce

Can we reproduce a package with the "origin" information provided?

Features · How It Works · Configuration · Strategies · Usage · Insights · FAQs

Features

  • ✅ determines whether or not a package can be reproduced from its referenced repository metadata (i.e. repository, repository.type, repository.url, repository.directory & gitHead)
  • 🔍 validates repository information against package.json if the referenced package lives on a registry (falls back to the package.json inside the tarball if the package is not in a registry)
    • 🔀 mismatching repository information is considered "manifest confusion" & will return false for "reproducibility"
  • 🗄️ provides persistent caching of results
  • 🔄 currently only supports npm as a "strategy" but will expand to support other package managers in the future
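
As a rough illustration of the repository validation described above, the sketch below compares the repository fields of two manifests and flags any mismatch. The helper names `normalizeRepository` and `hasManifestConfusion`, the example URLs, and the exact comparison rules are assumptions made for this example, not the library's actual implementation; gitHead is left out of the comparison to keep the sketch short.

```javascript
// Hypothetical helper: reduce the repository fields to a comparable shape.
// In package.json, `repository` may be a string shorthand or an object.
function normalizeRepository(manifest) {
  const repo = manifest.repository
  const isObject = repo !== null && typeof repo === 'object'
  return {
    type: isObject ? (repo.type ?? null) : null,
    url: isObject ? (repo.url ?? null) : (typeof repo === 'string' ? repo : null),
    directory: isObject ? (repo.directory ?? null) : null
  }
}

// Hypothetical check: any difference between the two manifests counts as
// "manifest confusion" in this sketch.
function hasManifestConfusion(registryManifest, tarballManifest) {
  const a = normalizeRepository(registryManifest)
  const b = normalizeRepository(tarballManifest)
  return a.type !== b.type || a.url !== b.url || a.directory !== b.directory
}

// Made-up manifests for illustration only.
const registryManifest = {
  repository: { type: 'git', url: 'git+https://github.com/example/pkg.git' }
}
const tarballManifest = {
  repository: { type: 'git', url: 'git+https://github.com/other/pkg.git' }
}

console.log(hasManifestConfusion(registryManifest, tarballManifest)) // true
console.log(hasManifestConfusion(registryManifest, registryManifest)) // false
```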

How It Works

  1. ⬇️ fetches the package & any corresponding metadata
  2. 📂 if available, does a clone/checkout of the corresponding source repository
  3. 🔄 attempts to prepare & pack the source repository using one or more strategies
  4. 🔍 validates the integrity value of #3 against the package fetched in #1
  5. 📄 returns results and caches them for future use
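
Step 4 can be sketched as a plain hash comparison. The example below assumes the integrity value is a `sha512-<base64>` digest of the tarball bytes, which matches the format of the integrity strings in the CLI output; the function names and the stand-in buffers are hypothetical. It uses `require` so it runs as a standalone Node.js script.

```javascript
const { createHash } = require('node:crypto')

// Compute an integrity string of the form sha512-<base64> for a buffer.
function integrityOf(buffer) {
  return 'sha512-' + createHash('sha512').update(buffer).digest('base64')
}

// The package counts as reproduced only if both tarballs hash identically.
function isReproduced(publishedTarball, rebuiltTarball) {
  return integrityOf(publishedTarball) === integrityOf(rebuiltTarball)
}

// Stand-in buffers; real inputs would be the fetched and the repacked tarballs.
const published = Buffer.from('example tarball bytes')
const rebuilt = Buffer.from('example tarball bytes')
const changed = Buffer.from('example tarball bytes!')

console.log(isReproduced(published, rebuilt)) // true
console.log(isReproduced(published, changed)) // false
```

A single differing byte changes the digest, which is why the packed output has to be byte-for-byte identical to the published tarball for the check to pass.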

Usage

import reproduce from 'reproduce'

// Basic usage
const result = await reproduce('package-name')

// With custom configuration
// (a different variable name avoids redeclaring `result` in the same scope)
const customResult = await reproduce('package-name', {
  cache: {},
  cacheDir: './custom-cache',
  cacheFile: 'custom-cache.json'
})

CLI

npx reproduce tsc # exit code 0 - reproducible
npx reproduce esbuild # exit code 1 - not reproducible
npx reproduce axios --json  # exit code 1 - not reproducible
{
  "reproduceVersion": "0.0.1-pre.1",
  "timestamp": "2025-02-25T10:40:24.947Z",
  "os": "darwin",
  "arch": "arm64",
  "strategy": "npm:10.9.1",
  "reproduced": false,
  "package": {
    "spec": "axios",
    "location": "https://registry.npmjs.org/axios/-/axios-1.7.9.tgz",
    "integrity": "sha512-LhLcE7Hbiryz8oMDdDptSrWowmB4Bl6RCt6sIJKpRB4XtVf0iEgewX3au/pJqm+Py1kCASkb/FFKjxQaLtxJvw=="
  },
  "source": {
    "spec": "github:axios/axios#b2cb45d5a533a5465c99559b16987e4d5fc08cbc",
    "location": "git+https://github.com/axios/axios.git",
    "integrity": "null"
  }
}
npx reproduce sleepover --json  # exit code 0 - reproducible
{
  "reproduceVersion": "0.0.1-pre.1",
  "timestamp": "2025-02-25T10:22:09.303Z",
  "os": "darwin",
  "arch": "arm64",
  "strategy": "npm:10.9.1",
  "reproduced": true,
  "package": {
    "spec": "sleepover",
    "location": "https://registry.npmjs.org/sleepover/-/sleepover-1.2.3.tgz",
    "integrity": "sha512-yNAIVUqbQifyy5+hfzAzK2Zt21wXjwXqPyWLu+tOvhOcYKG2ffUiSoBXwt/yo4KJ51IcJfUS0Uq0ktOoMWy9Yw=="
  },
  "source": {
    "spec": "github:darcyclarke/sleepover#f2586e91b3faf085583c23ed6e00819916e85c28",
    "location": "git+ssh://git@github.com/darcyclarke/sleepover.git",
    "integrity": "sha512-yNAIVUqbQifyy5+hfzAzK2Zt21wXjwXqPyWLu+tOvhOcYKG2ffUiSoBXwt/yo4KJ51IcJfUS0Uq0ktOoMWy9Yw=="
  }
}
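
A script that consumes the `--json` output might look roughly like the sketch below. It relies only on fields visible in the reports above (`reproduced`, `strategy`, `package.spec`); the helper names `summarize` and `exitCodeFor` are hypothetical, and the exit-code mapping simply mirrors the comments in the CLI examples.

```javascript
// Build a one-line summary from a parsed report object.
function summarize(report) {
  const status = report.reproduced ? 'reproducible' : 'not reproducible'
  return `${report.package.spec}: ${status} (strategy ${report.strategy})`
}

// Mirror the CLI convention shown above: 0 when reproduced, 1 otherwise.
function exitCodeFor(report) {
  return report.reproduced ? 0 : 1
}

// Trimmed version of the axios report, keeping only the fields used here.
const report = JSON.parse(
  '{"strategy":"npm:10.9.1","reproduced":false,"package":{"spec":"axios"}}'
)

console.log(summarize(report)) // axios: not reproducible (strategy npm:10.9.1)
console.log(exitCodeFor(report)) // 1
```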

Configuration

The reproduce function accepts an options object with the following configuration:

{
  cache: {},                      // Optional in-memory cache object (persisted to disk if provided)
  cacheDir: '~/.cache/reproduce', // OS-specific cache directory
  cacheFile: 'cache.json',        // Cache file name
  strategy: 'npm'                 // Strategy to use
}

Cache Locations

The cache is stored in OS-specific locations:

  • macOS: ~/Library/Caches/reproduce/
  • Windows: %LOCALAPPDATA%/reproduce/Cache/
  • Linux: $XDG_CACHE_HOME/reproduce/ or ~/.cache/reproduce/
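
The locations above could be resolved along these lines. This is a minimal sketch, not the library's code: the function name `defaultCacheDir` is hypothetical, and the fallback to `AppData/Local` when `LOCALAPPDATA` is unset is an assumption.

```javascript
const os = require('node:os')
const path = require('node:path')

// Resolve the cache directory for a platform. The parameters default to the
// current process so the function can also be exercised with fake values.
function defaultCacheDir(
  platform = process.platform,
  env = process.env,
  home = os.homedir()
) {
  if (platform === 'darwin') {
    return path.join(home, 'Library', 'Caches', 'reproduce')
  }
  if (platform === 'win32') {
    // Assumption: fall back to the usual per-user location if the variable is unset.
    const base = env.LOCALAPPDATA || path.join(home, 'AppData', 'Local')
    return path.join(base, 'reproduce', 'Cache')
  }
  // Linux and other platforms: prefer $XDG_CACHE_HOME, else ~/.cache.
  const base = env.XDG_CACHE_HOME || path.join(home, '.cache')
  return path.join(base, 'reproduce')
}

console.log(defaultCacheDir())
```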

Strategies

A strategy is a set of operations used to recreate a package. Strategies should represent common patterns for preparing, building & packing packages so that each one covers as many packages as possible. If a strategy successfully recreates a package, its ID is stored in the returned metadata.

UUID            Notes
npm:<version>   clones, checks out ref, installs deps, runs prepare scripts & packs

Note: one-off/bespoke or complex configurations will not be supported but we will continue to add more strategies as we find common patterns.
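
One way to picture how strategies and their IDs relate is the sketch below. Everything in it is hypothetical: the record shape, the step labels, and the `firstSuccessfulStrategy` helper are illustrations of the idea that the ID of the strategy that worked ends up in the result, not the library's internals.

```javascript
// Hypothetical strategy record: the ID follows the npm:<version> format,
// and the step names are descriptive labels rather than real commands.
const npmStrategy = {
  id: 'npm:10.9.1',
  steps: ['clone', 'checkout', 'install', 'prepare', 'pack']
}

// Try each strategy in order. `attempt` is a caller-supplied function that
// returns true when the repacked source matches the published package.
function firstSuccessfulStrategy(strategies, attempt) {
  for (const strategy of strategies) {
    if (attempt(strategy)) return strategy.id
  }
  return null // no strategy recreated the package
}

// Stand-in for the real comparison: pretend only npm-based strategies succeed.
const attempt = (strategy) => strategy.id.startsWith('npm:')

console.log(firstSuccessfulStrategy([npmStrategy], attempt)) // npm:10.9.1
```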

Insights

Top 5,000 High Impact Packages

Note: "High Impact" packages are defined as having >=1M downloads per week and/or >=500 dependants. This list was originally generated here. This test was run on 2025-02-26.

  • 5.78% (289) are reproducible
  • 3.72% (186) have provenance
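
The percentages above follow directly from the counts, assuming the sample size is the 5,000 packages named in the heading. The `share` helper is hypothetical:

```javascript
// Express a count as a percentage of the total, rounded to two decimals.
function share(count, total) {
  return ((count / total) * 100).toFixed(2) + '%'
}

console.log(share(289, 5000)) // 5.78%
console.log(share(186, 5000)) // 3.72%
```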
List of reproducible packages
semver
tslib
lru-cache
readable-stream
ansi-regex
commander
minimatch
yallist
glob
string-width
fs-extra
emoji-regex
which
execa
ws
minipass
cross-spawn
micromatch
whatwg-url
tr46
mime
path-type
loader-utils
write-file-atomic
callsites
ini
binary-extensions
is-binary-path
pump
read-pkg
normalize-package-data
open
json-parse-even-better-errors
cli-cursor
yocto-queue
restore-cursor
terser
fastq
sax
ip
log-symbols
reusify
ssri
nopt
normalize-url
@eslint/eslintrc
@humanwhocodes/config-array
mdn-data
mute-stream
import-local
gauge
spdx-license-ids
test-exclude
regjsparser
spdx-exceptions
is-unicode-supported
is-ci
url
source-map-js
regenerate-unicode-properties
minizlib
unicode-match-property-value-ecmascript
data-urls
html-encoding-sniffer
whatwg-mimetype
cli-spinners
xml-name-validator
abbrev
type
unicode-canonical-property-names-ecmascript
unique-slug
unique-filename
w3c-xmlserializer
dot-prop
camelcase-keys
@sindresorhus/is
foreground-child
@npmcli/fs
stream-shift
log-update
make-fetch-happen
boxen
del
tar-fs
@hapi/hoek
p-retry
has-ansi
minipass-fetch
cli-boxes
agentkeepalive
sort-keys
safe-stable-stringify
node-gyp-build
npm-normalize-package-bin
builtins
aws-sdk
elliptic
npm-package-arg
validate-npm-package-name
es5-ext
es6-symbol
strnum
path-scurry
registry-auth-token
crypto-browserify
d
html-tags
moment-timezone
npm-bundled
ignore-walk
npm-packlist
devtools-protocol
get-port
package-json
p-defer
p-event
latest-version
default-browser-id
npm-registry-fetch
compress-commons
zip-stream
lcid
filter-obj
npm-pick-manifest
pacote
read
require-in-the-middle
npm-install-checks
throttleit
@npmcli/run-script
touch
read-package-json-fast
@npmcli/promise-spawn
@npmcli/node-gyp
@npmcli/git
prebuild-install
store2
@npmcli/installed-package-contents
proc-log
postgres-interval
xregexp
webpack-hot-middleware
is-what
copy-anything
set-cookie-parser
p-filter
fast-redact
known-css-properties
remark-slug
is-builtin-module
remark-external-links
is-text-path
text-extensions
memoizee
timers-ext
spawn-command
find-versions
debounce
xmlhttprequest-ssl
pino-abstract-transport
run-applescript
use-callback-ref
use-sidecar
estree-to-babel
default-browser
bundle-name
pretty-ms
postcss-normalize
cli-color
macos-release
windows-release
remark-footnotes
import-in-the-middle
read-cmd-shim
cpy
write-json-file
cron-parser
find-babel-config
lru-memoizer
unzipper
winston-daily-rotate-file
obliterator
csv-parser
mnemonist
set-immediate-shim
through2-filter
init-package-json
winston-logzio
@npmcli/package-json
promzard
s3-streamlogger
bin-links
@npmcli/map-workspaces
@npmcli/name-from-folder
walk-up-path
ast-module-types
union
why-is-node-running
@npmcli/metavuln-calculator
hot-shots
parse-conflict-json
oidc-token-hash
prom-client
marked-terminal
promise-call-limit
node-source-walk
libmime
logzio-nodejs
postcss-sorting
@zeit/schemas
ethereum-cryptography
parse-github-url
light-my-request
detective-stylus
n
comment-json
detective-typescript
@lezer/common
@lezer/lr
precinct
redux-mock-store
detective-postcss
twilio
log
tocbot
@hapi/podium
detective-es6
get-amd-module-type
detective-sass
detective-scss
detective-cjs
generate-object-property
sprintf-kit
highcharts
graphql-subscriptions
@tailwindcss/forms
jspdf
chance
eslint-plugin-react-native

FAQs

Why look into "reproducibility"?

We believe that using reproducible builds to associate artifacts with a source repository outperforms the current provenance strategy, with the added benefit of being backwards compatible.

Will reproducibility get better with time?

Yes. As we add more strategies, we should see the percentage of reproducible packages grow over time. Both net-new & previously published packages will benefit from the additional strategies. Feel free to contribute!

Credits

Big thanks to @siddharthkp for gifting the package name reproduce to us!

Learn More

We wrote a blog post about this project & the results we found which you can read here: https://blog.vlt.sh/blog/reproducibility
