Nick Sweeting edited this page Apr 6, 2021 · 27 revisions


▶️ If you're having an issue with a breaking change, or migrating your data between versions, open an issue to get help.

ArchiveBox was previously named Pocket Archive Stream and then Bookmark Archiver.

THIS PAGE HAS BEEN MOVED: See the releases page for versioned source downloads and full changelog.

🍰 Many thanks to our 60+ contributors and everyone in the web archiving community! 🏛

  • v0.2.4 released
  • better archive corruption guards (check structure invariants on every parse & save)
  • remove title prefetching in favor of new FETCH_TITLE archive method
  • slightly improved CLI output for parsing and remote url downloading
  • re-save index after archiving completes to update titles and urls
  • remove redundant derivable data from link json schema
  • markdown link parsing support
  • faster link parsing and better symbol handling using a new compiled URL_REGEX
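As a rough illustration of the compiled-regex approach mentioned for v0.2.4, the sketch below precompiles a URL pattern once and reuses it to pull links out of arbitrary text, including markdown. The pattern and the `find_urls` helper are assumptions made for this example, not ArchiveBox's actual `URL_REGEX`.

```python
import re

# Hypothetical pattern for illustration only; the real URL_REGEX may differ.
# Compiling once at import time avoids re-parsing the pattern on every call.
URL_REGEX = re.compile(r'https?://[^\s<>"\'\)\]]+', re.IGNORECASE)


def find_urls(text):
    """Return every http(s) URL found in text, in order of appearance."""
    return URL_REGEX.findall(text)


sample = 'See [docs](https://example.com/a?b=1) and <a href="http://example.org/x">x</a>'
print(find_urls(sample))
```

Stopping at `)` and `]` keeps markdown link syntax out of the match, at the cost of truncating URLs that legitimately contain those characters.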

  • v0.2.3 released
  • fixed issues with parsing titles including trailing tags
  • fixed issues with titles defaulting to URLs instead of attempting to fetch
  • fixed issue where bookmark timestamps from RSS would be ignored and the current timestamp used instead
  • fixed issue where ONLY_NEW would overwrite existing links in archive with only new ones
  • fixed lots of issues with URL parsing by using urllib.parse instead of hand-written lambdas
  • ignore robots.txt when using wget (ssshhh don't tell anyone 😁)
  • fix RSS parser bailing out when there's whitespace around XML tags
  • fix issue with browser history export trying to run ls on wrong directory
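The move to `urllib.parse` noted for v0.2.3 can be sketched as follows. The helper names (`domain`, `hostname`, `without_query`) are hypothetical and chosen only for this example; the standard-library calls are real, but this is not the project's actual code.

```python
from urllib.parse import urlparse


def domain(url):
    """Return the network location (host plus optional port) of a URL."""
    return urlparse(url).netloc


def hostname(url):
    """Return only the lowercased host, without port or credentials."""
    return urlparse(url).hostname


def without_query(url):
    """Return the URL with its query string and fragment removed."""
    return urlparse(url)._replace(query='', fragment='').geturl()


url = 'https://example.com:8080/path/page.html?q=1#top'
print(domain(url), hostname(url), without_query(url))
```

Letting the standard library split the URL avoids the edge cases (ports, fragments, query strings) that hand-written string slicing tends to miss.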

  • v0.2.2 released
  • Shaarli RSS export support
  • Fix issues with plain text link parsing including quotes, whitespace, and closing tags in URLs
  • add USER_AGENT to archive.org submissions so they can track ArchiveBox usage
  • remove all icons similar to branding from archive UI
  • hide some of the noisier youtube-dl and wget errors
  • set permissions on youtube-dl media folder
  • fix chrome data dir incorrect path and quoting
  • better chrome binary finding
  • show which parser is used when importing links, show progress when fetching titles

  • v0.2.1 released with new logo
  • ability to import plain lists of links and almost all other raw filetypes
  • WARC saving support via wget
  • Git repository downloading with git clone
  • Media downloading with youtube-dl (video, audio, subtitles, description, playlist, etc)

  • v0.2.0 released with new name
  • renamed from Bookmark Archiver -> ArchiveBox

  • v0.1.0 released
  • support for browser history exporting added with ./bin/archivebox-export-browser-history
  • support for chrome --dump-dom to output full page HTML after JS executes

  • v0.0.3 released
  • support for chrome --user-data-dir to archive sites that need logins
  • fancy individual html & json indexes for each link
  • smartly append new links to existing index instead of overwriting

  • v0.0.2 released
  • proper HTML templating instead of format strings (thanks to!)
  • refactored into separate files, wip audio & video archiving

  • v0.0.1 released
  • index links now work without nginx URL rewrites, so the archive can be hosted on GitHub Pages
  • added script & docstrings & help commands
  • made Chromium the default instead of Google Chrome (yay free software)
  • added env-variable configuration (thanks to!)
  • renamed from Pocket Archive Stream -> Bookmark Archiver
  • added Netscape-format export support (thanks to!)
  • added Pinboard-format export support (thanks to!)
  • front-page of HN, oops! apparently I have users to support now 😁?
  • added Pocket-format export support

  • v0.0.0 released: created Pocket Archive Stream 2017/05/05