Releases: algolia/docsearch-scraper

feat(meta): handle comma-separated version

03 Sep 11:10

This release allows comma-separated tokens in the docsearch:version meta tag.

The behaviour of the docsearch:version meta tag is now similar to that of the keywords meta tag defined in the HTML5 specification.

The docsearch:version tag can be a set of comma-separated tokens, each of which is a version relevant to the page. Each token must either comply with the SemVer specification or contain only alphanumeric characters (e.g. latest, next). As facet filters, these version tokens are case-insensitive.

For example, all records extracted from a page with the following meta tag:

<meta name="docsearch:version" content="2.0.0-alpha.62,latest">

will be tagged with the following versions:

version:["2.0.0-alpha.62", "latest"]
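The tokenization described above can be sketched in Python as follows. This is a minimal illustration, not the scraper's actual code: the function name `parse_version_tokens` is hypothetical, and the SemVer check is a simplified pattern rather than the full official grammar.

```python
import re

# Simplified SemVer-like pattern (assumption: NOT the full official SemVer grammar).
_SEMVER_LIKE = re.compile(r"\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?")
# Plain alphanumeric tokens such as "latest" or "next".
_ALNUM = re.compile(r"[0-9A-Za-z]+")


def parse_version_tokens(content):
    """Split a docsearch:version content string into a list of version tokens.

    Hypothetical helper: trims whitespace, drops empty tokens, and rejects
    tokens that are neither SemVer-like nor purely alphanumeric.
    """
    tokens = [token.strip() for token in content.split(",")]
    tokens = [token for token in tokens if token]
    for token in tokens:
        if not (_SEMVER_LIKE.fullmatch(token) or _ALNUM.fullmatch(token)):
            raise ValueError("invalid version token: %r" % token)
    return tokens


versions = parse_version_tokens("2.0.0-alpha.62,latest")
```

With the meta tag from the example, `versions` holds the two tokens `2.0.0-alpha.62` and `latest`, which matches the list shown above.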

deps: upgrade Scrapy + Chrome to stable 84

17 Aug 13:50

This PR upgrades Scrapy to its latest version (2.2.1) and removes the unnecessary use of CustomContextFactory. It also upgrades Chrome to the latest stable version available, v84.

Upgrading Scrapy introduces many benefits such as:

  • File extensions that LinkExtractor ignores by default now also include 7z, 7zip, apk, bz2, cdr, dmg, ico, iso, tar, tar.gz, webm, and xz
  • Twisted is upgraded to its latest version, which is required to mitigate CVE-2020-10109
  • Better logging system
  • A new DNS_RESOLVER setting allows enabling IPv6 support
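As an illustration of the last point, Scrapy's DNS_RESOLVER setting can be pointed at the resolver class that supports IPv6. The snippet below is a sketch of a generic Scrapy settings module; whether and where docsearch-scraper sets this value is not stated in these notes and is an assumption here.

```python
# Scrapy settings module (placement within a project is an assumption).
# CachingHostnameResolver is the Scrapy resolver that supports IPv6 addresses;
# per the Scrapy documentation it does not take DNS_TIMEOUT into account.
DNS_RESOLVER = "scrapy.resolver.CachingHostnameResolver"
```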

feat: update chrome to 83.0.4103.61

01 Jun 10:31
v1.10.0

feat: update chrome to 83.0.4103.61

feat(meta): do not jsonize version meta

27 May 12:14
v1.9.0

feat(meta): do not jsonize version meta

feat(analytics): define a consistent ObjectID

29 Nov 17:35
  • define a consistent ObjectID

feat(headless_chrome): use google chrome 78

14 May 14:19

Before porting to v3

26 Apr 13:57
4b49e80
v1.3

Before porting to Python 3

Update selenium dependencies. Enable usage of Chrome 73

03 Apr 07:29
1be3ca8

This release:

  • Updates the selenium module to the latest version
  • Uses the latest stable Chrome driver (v73) for the Unix setup
  • Removes the unused selenium server (only needed when using selenium RemoteWebDriver) #442
  • Bypasses nb_hit_updater if the UPDATE_NB_HITS environment variable is set or the terminal in use is not a TTY #429
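The bypass condition in the last item can be sketched as below. This is only an illustration of the stated rule: the function name `should_bypass_nb_hit_updater` and its parameters are hypothetical, and the real check in the scraper may inspect the variable's value or a different stream.

```python
import os
import sys


def should_bypass_nb_hit_updater(environ=None, is_tty=None):
    """Return True when the interactive nb_hit_updater step should be skipped.

    Hypothetical helper: skip when UPDATE_NB_HITS is present in the
    environment, or when standard output is not attached to a terminal.
    """
    if environ is None:
        environ = os.environ
    if is_tty is None:
        is_tty = sys.stdout.isatty()
    return "UPDATE_NB_HITS" in environ or not is_tty
```

Passing the environment and the TTY flag as arguments keeps the check easy to exercise in CI, where there is typically no terminal attached.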

v1.0 First steady version correctly released

27 Nov 13:58
9b0ac08
Merge pull request #417 from algolia/fix/build_base

Fix/build base