Releases: apify/crawlee

v3.5.7

05 Oct 09:03

3.5.7 (2023-10-05)

Bug Fixes

  • add a warning when RequestList (RL) and RequestQueue (RQ) are used together but the RQ is not provided explicitly (#2115) (6fb1c55), closes #1773
  • ensure the status message cannot get the crawler stuck (#2114) (9034f08)
  • RQ request count is consistent after migration (#2116) (9ab8c18), closes #1855

v3.5.6

04 Oct 10:31

3.5.6 (2023-10-04)

Bug Fixes

  • types: re-export RequestQueueOptions as an alias to RequestProviderOptions (#2109) (0900f76)

v3.5.5

02 Oct 13:02

3.5.5 (2023-10-02)

Bug Fixes

  • allow using any version of puppeteer or playwright (#2102) (0cafceb), closes #2101
  • session pool leaks memory on multiple crawler runs (#2083) (b96582a), closes #2074 #2031
  • templates: install browsers on postinstall for playwright (#2104) (323768b)
  • types: make return type of RequestProvider.open and RequestQueue(v2).open strict and accurate (#2096) (dfaddb9)

Features

  • experimental support for request locking (Request Queue v2) (#1975) (70a77ee), closes #1365
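
The idea behind request locking can be illustrated with a small in-memory sketch: a client that fetches a request also locks it for a limited time, so other clients skip it until the lock expires or the request is marked as handled. The class and method names below (`LockingQueueSketch`, `fetchAndLock`, `markHandled`) are hypothetical and chosen for illustration only; this is not the Request Queue v2 API or its implementation.

```typescript
// Minimal in-memory sketch of time-limited request locking.
// All names here are hypothetical; this is not Crawlee's RequestQueue v2.
class LockingQueueSketch {
    private pending: string[] = [];
    // Maps a URL to the timestamp (in ms) at which its lock expires.
    private locks = new Map<string, number>();

    add(url: string): void {
        this.pending.push(url);
    }

    // Returns the first request that is not currently locked and locks it
    // until `now + lockMillis`. Returns undefined if everything is locked.
    fetchAndLock(now: number, lockMillis: number): string | undefined {
        for (const url of this.pending) {
            const expiry = this.locks.get(url);
            if (expiry === undefined || expiry <= now) {
                this.locks.set(url, now + lockMillis);
                return url;
            }
        }
        return undefined;
    }

    // Removes a finished request together with its lock.
    markHandled(url: string): void {
        this.pending = this.pending.filter((u) => u !== url);
        this.locks.delete(url);
    }
}
```

Passing the current time in as an argument keeps the sketch deterministic; a real queue would read the clock itself and keep the locks on the storage side so that several clients can share them.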

v3.5.4

11 Sep 13:23

3.5.4 (2023-09-11)

Bug Fixes

  • core: allow explicit calls to purgeDefaultStorage to wipe the storage on each call (#2060) (4831f07)
  • various helpers opening KVS now respect Configuration (#2071) (59dbb16)

Features

  • remove side effect from the deprecated error context augmentation (#2069) (f9fb5c4)

v3.5.3

31 Aug 07:46

3.5.3 (2023-08-31)

Bug Fixes

  • browser-pool: improve error handling when browser is not found (#2050) (282527f), closes #1459
  • clean up inProgress cache when delaying requests via sameDomainDelaySecs (#2045) (f63ccc0)
  • crawler instances with different StorageClients do not affect each other (#2056) (3f4c863)
  • pin all internal dependencies (#2041) (d6f2b17), closes #2040
  • respect current config when creating implicit RequestQueue instance (845141d), closes #2043

Features

  • core: add default dataset helpers to BasicCrawler (#2057) (e2a7544)

v3.5.2

21 Aug 12:40

3.5.2 (2023-08-21)

Bug Fixes

  • make the Request constructor options typesafe (#2034) (75e7d65)
  • pin @crawlee/* packages versions in crawlee metapackage (#2040) (61f91c7)
  • support DELETE requests in HttpCrawler (#2039) (7ea5c41), closes #1658

v3.5.1

16 Aug 08:48

3.5.1 (2023-08-16)

Bug Fixes

  • add Request.maxRetries to the RequestOptions interface (#2024) (6433821)
  • log original error message on session rotation (#2022) (8a11ffb)

Features

  • exceeding maxSessionRotations calls failedRequestHandler (#2029) (b1cb108), closes #2028

v3.5.0

31 Jul 06:53

3.5.0 (2023-07-31)

Bug Fixes

  • clean up worker stuff from memory storage to fix vitest (#2004) (d2e098c), closes #1999
  • core: add requests from URL list (requestsFromUrl) to the queue in batches (418fbf8), closes #1995
  • core: support relative links in enqueueLinks explicitly provided via urls option (#2014) (cbd9d08), closes #2005
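
Resolving relative links against a base URL, as the last fix above describes, can be sketched with the standard WHATWG `URL` constructor. The helper name `resolveLinks` is hypothetical, and skipping unparsable entries is an assumption of this sketch, not a statement about how `enqueueLinks` behaves internally.

```typescript
// Sketch: turn a mix of relative and absolute links into absolute URLs.
// `resolveLinks` is a hypothetical helper, not part of Crawlee.
function resolveLinks(urls: string[], baseUrl: string): string[] {
    const resolved: string[] = [];
    for (const url of urls) {
        try {
            // Absolute URLs pass through unchanged; relative ones are
            // resolved against baseUrl.
            resolved.push(new URL(url, baseUrl).href);
        } catch {
            // Assumption of this sketch: entries that cannot be parsed are skipped.
        }
    }
    return resolved;
}
```

For example, with the base `https://example.com/docs/`, the link `/about` resolves to `https://example.com/about` and `page2` resolves to `https://example.com/docs/page2`.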

Features

  • add closeCookieModals context helper for Playwright and Puppeteer (#1927) (98d93bb)
  • add support for sameDomainDelaySecs (#2003) (e796883), closes #1993
  • basic-crawler: allow configuring the automatic status message (#2001) (3eb4e4c)
  • core: use RequestQueue.addRequestsBatched() in enqueueLinks helper (4d61ca9), closes #1995
  • retire session on proxy error (#2002) (8c0928b), closes #1912
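
The concept behind `sameDomainDelaySecs` is a minimum pause between two requests to the same domain. The sketch below shows one way to track that per hostname. The class `SameDomainDelaySketch` and its `reserve` method are hypothetical; Crawlee's actual implementation may differ.

```typescript
// Sketch of a per-domain delay tracker. Hypothetical names throughout;
// this is not how Crawlee necessarily implements sameDomainDelaySecs.
class SameDomainDelaySketch {
    private delayMillis: number;
    // Maps a hostname to the timestamp (in ms) of its last allowed request.
    private lastAccess = new Map<string, number>();

    constructor(delaySecs: number) {
        this.delayMillis = delaySecs * 1000;
    }

    // Returns 0 and records the access if the request may go out at `now`,
    // otherwise returns how many milliseconds the caller still has to wait.
    reserve(url: string, now: number): number {
        const domain = new URL(url).hostname;
        const last = this.lastAccess.get(domain);
        if (last !== undefined && now - last < this.delayMillis) {
            return this.delayMillis - (now - last);
        }
        this.lastAccess.set(domain, now);
        return 0;
    }
}
```

A caller that receives a non-zero value would put the request back and retry later, while requests to other domains can proceed in the meantime.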

v3.4.2

19 Jul 14:11

3.4.2 (2023-07-19)

Bug Fixes

  • basic-crawler: limit internalTimeoutMillis in addition to requestHandlerTimeoutMillis (#1981) (8122622), closes #1766

Features

  • core: add RequestQueue.addRequestsBatched() that is non-blocking (#1996) (c85485d), closes #1995
  • retryOnBlocked detects blocked webpage (#1956) (766fa9b)
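
A non-blocking batched add can be sketched as follows: the first batch is awaited so the caller can start working right away, and the remaining batches continue in the background behind a promise the caller may await later. The function `addRequestsBatchedSketch`, the `addBatch` callback, and the returned `firstBatchSize` field are hypothetical; this illustrates the pattern, not the `RequestQueue` implementation.

```typescript
// Sketch of a non-blocking batched add. Names are hypothetical.
type AddBatch = (urls: string[]) => Promise<void>;

// Splits items into consecutive chunks of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
    if (size < 1) throw new RangeError('size must be at least 1');
    const out: T[][] = [];
    for (let i = 0; i < items.length; i += size) {
        out.push(items.slice(i, i + size));
    }
    return out;
}

async function addRequestsBatchedSketch(
    urls: string[],
    addBatch: AddBatch,
    batchSize = 1000,
): Promise<{ firstBatchSize: number; waitForAllRequestsToBeAdded: Promise<void> }> {
    const [first = [], ...rest] = chunk(urls, batchSize);

    // Wait only for the first batch so processing can begin early.
    if (first.length > 0) await addBatch(first);

    // The remaining batches are added sequentially in the background.
    // Callers should await or catch this promise to observe failures.
    const waitForAllRequestsToBeAdded = (async () => {
        for (const batch of rest) await addBatch(batch);
    })();

    return { firstBatchSize: first.length, waitForAllRequestsToBeAdded };
}
```

The design choice here is latency over completeness: the call returns after one batch, and code that needs the full list in place awaits `waitForAllRequestsToBeAdded`.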

v3.4.1

13 Jul 12:22

3.4.1 (2023-07-13)

Bug Fixes

  • http-crawler: replace IncomingMessage with PlainResponse for context's response (#1973) (2a1cc7f), closes #1964

Features

  • jsdom,linkedom: expose document to crawler router context (#1950) (4536dc2)