Releases: webrecorder/browsertrix-crawler

Browsertrix Crawler v1.2.1

26 Jun 16:18
4495532

What's Changed

  • browser policies: disable restoring any tabs on startup + set new tab URL to about:blank by @ikreymer in #626
  • Remove DISPLAY env var from image by @ikreymer in #625
  • Don't filter saving redirect if no response body. by @ikreymer in #628
  • Always download PDF + non HTML page cleanup + enterprise policy cleanup by @ikreymer in #629

Full Changelog: v1.2.0...v1.2.1

Browsertrix Crawler v1.2.0

21 Jun 23:35
8af8b3c

What's Changed

  • Bump version to 1.2.0 Beta + make draft release for each commit by @ikreymer in #582
  • Always add warcinfo records to all WARCs by @ikreymer in #556
  • Load non-HTML resources directly whenever possible by @ikreymer in #583
  • base image version bump to brave 1.66.115 by @ikreymer in #592
  • Add group policies, limit browser access to container filesystem by @vnznznz in #579
  • cleanup dockerfile + fix test by @ikreymer in #595
  • Consider disk usage of collDir instead of default /crawls by @benoit74 in #586
  • add --dryRun flag and mode by @ikreymer in #594
  • proxy: support setting proxy via --proxyServer, PROXY_SERVER env var or PROXY_HOST + PROXY_PORT env vars by @ikreymer in #589
  • merge 1.1.4 -> 1.2.0 beta by @ikreymer in #611
  • add EXPOSE for ports used inside container by @ikreymer in #612
  • adjust browser viewport to avoid cutting off bottom of page by @ikreymer in #614
  • clearer scope check by @ikreymer in #615
  • http auth support per seed (supersedes #566) by @ikreymer in #616
  • logging: log error message when a seed fails to be created by @ikreymer in #619
  • add yarn.lock to Docker to ensure consistent builds! by @ikreymer in #621
  • disable socat by default by @ikreymer in #622
  • bump brave to 1.67.119 by @ikreymer in #620
  • Updated rewriting for YouTube + dependency update by @ikreymer in #623
  • 1.2.0 release - deps: bump wabac.js to 2.19.1, RWP for QA to 2.1.0 by @ikreymer in #624

Full Changelog: v1.1.4...v1.2.0
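
The proxy entry above (#589) names three ways to supply a proxy: the --proxyServer flag, the PROXY_SERVER env var, or the PROXY_HOST + PROXY_PORT pair. As a rough illustration only, the sketch below shows one way such settings could be resolved into a single proxy URL. The function name resolve_proxy, the precedence order (flag, then PROXY_SERVER, then host/port), and the http:// scheme used for the host/port pair are all assumptions made for this example; the release notes do not specify them, and this is not the crawler's actual code.

```python
from typing import Mapping, Optional


def resolve_proxy(cli_proxy_server: Optional[str],
                  env: Mapping[str, str]) -> Optional[str]:
    """Return a proxy URL, or None when no proxy is configured.

    Hypothetical helper. Assumed precedence (not stated in the release notes):
    --proxyServer value, then PROXY_SERVER, then PROXY_HOST + PROXY_PORT.
    """
    # 1. An explicit command-line value wins in this sketch.
    if cli_proxy_server:
        return cli_proxy_server

    # 2. Otherwise fall back to the single env var.
    server = env.get("PROXY_SERVER")
    if server:
        return server

    # 3. Otherwise combine host and port, if both are present.
    host, port = env.get("PROXY_HOST"), env.get("PROXY_PORT")
    if host and port:
        # Assumption: the host/port pair describes a plain HTTP proxy.
        return f"http://{host}:{port}"

    return None


if __name__ == "__main__":
    import os

    # Example: resolve from the real environment with no CLI override.
    print(resolve_proxy(None, os.environ))
```

Passing the environment in as a mapping, rather than reading os.environ inside the function, keeps the sketch easy to exercise with plain dictionaries.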

Browsertrix Crawler v1.2.0-beta.3

21 Jun 22:04
65a8635
Pre-release

What's Changed

  • Updated rewriting for YouTube + Instagram, dependency update by @ikreymer in #623

Full Changelog: v1.2.0-beta.2...v1.2.0-beta.3

Browsertrix Crawler v1.2.0-beta.2

21 Jun 03:12
Pre-release

What's Changed

Full Changelog: v1.2.0-beta.1...v1.2.0-beta.2

Browsertrix Crawler v1.2.0-beta.1

14 Jun 22:26
ac722cc
Pre-release

What's Changed

Full Changelog: v1.2.0-beta.0...v1.2.0-beta.1

Browsertrix Crawler v1.1.4

14 Jun 02:16
9094a83

What's Changed

Full Changelog: v1.1.3...v1.1.4

Browsertrix Crawler v1.2.0-beta.0

10 Jun 20:19
e2b4cc1
Pre-release

What's Changed

  • Bump version to 1.2.0 Beta + make draft release for each commit by @ikreymer in #582
  • Always add warcinfo records to all WARCs by @ikreymer in #556
  • Load non-HTML resources directly whenever possible by @ikreymer in #583
  • base image version bump to brave 1.66.115 by @ikreymer in #592
  • Add group policies, limit browser access to container filesystem by @vnznznz in #579
  • cleanup dockerfile + fix test by @ikreymer in #595
  • Consider disk usage of collDir instead of default /crawls by @benoit74 in #586
  • add --dryRun flag and mode by @ikreymer in #594
  • proxy: support setting proxy via --proxyServer, PROXY_SERVER env var or PROXY_HOST + PROXY_PORT env vars by @ikreymer in #589

Full Changelog: v1.1.3...v1.2.0-beta.0

Browsertrix Crawler v1.1.3

21 May 23:48

What's Changed

  • Mention command line options when restarting by @edsu in #577
  • save state: export pending list as array of json strings + fix importing save state to support pending by @ikreymer in #576
  • Sitemap Parsing Fixes by @ikreymer in #578
  • Fix failOnFailedLimit and add tests by @tw4l in #580

Full Changelog: v1.1.2...v1.1.3

Browsertrix Crawler v1.1.2

15 May 18:09
1735c3d

What's Changed

  • improved handling of requests from workers by @ikreymer in #562
  • Skip Checking Empty Frame + eval timeout by @ikreymer in #564
  • add STORE_REGION env var to be able to specify region by @ikreymer in #565
  • PDF loading status code fix by @ikreymer in #571
  • Fix regressions with failOnFailedSeed option by @tw4l in #572
  • headers: better filtering and encoding by @ikreymer in #573

Full Changelog: v1.1.1...v1.1.2

Browsertrix Crawler v1.1.1

02 May 16:01
22b2136

What's Changed

  • Avoid crashes when editing / creating profile and navigation is interrupted
  • profiles: ensure all page.goto() promises have at least catch block/a… by @ikreymer in #559
  • profiles: ensure initial page.load() is awaited by @ikreymer in #561

Full Changelog: v1.1.0...v1.1.1