CervaJäger

Searching for a beer can be a hassle: browsing each web store for the best price is time-consuming and sometimes confusing. Although we have access to Google Shopping and similar services, small beer stores and importers are usually not listed by those indexers.

CervaJäger is a module that contains only the backbone for other applications. It provides methods to register data sources, search them, and organize and return the best matches for the desired beer. Its main purpose is searching web sources via scraping. However, the module is generic enough that other types of data sources (spreadsheets, REST APIs, ...) can be implemented and plugged into the final application.

This module has no association with any beer, brewery, or web store. It is not commercial in any way; it's a personal project intended to help me find the best beer prices out there. Prost!

Installation

Prerequisites:

Node.js version 14.x or later
At least 400 MB of disk space, as the module uses an embedded Chromium

The module is not available on npm. You have three options at the moment, depending on how you'd like to use the module:

Download the latest released version;
Check out the source code from this repo and install it;
Download the npm package from GitHub Packages. Refer to this guide for more info on how to configure GitHub as an npm registry.

Installing from release or source code

cd <checkout_folder>
npm install
npm run build

A companion CLI is also available and can be used to evaluate the module. A sample processor and sample data are bundled with it as an example of the module's capabilities.

cd <checkout_folder>/dist
node ./cli -s "beer name"

Installing from GitHub Packages

npm i @arturfigueira/cervajager

Example of Usage

import {
  Scraper,
  DamerauMatcher,
  SamplerProcessor,
  WebScraper,
  ScraperEngine,
} from "@arturfigueira/cervajager";

const searchTerm = "Straffe Hendrik Heritage 2013";

//Fuzzy Naming matching to filter results. Optional
const matcher = new DamerauMatcher(80);

//Initialize a new Web Scraper plugin.
const webScraperPlugin = new WebScraper(new SamplerProcessor());

//List of data sources to search for the specified term
const sources = [webScraperPlugin];

//Initialize the scraper
const scraper = new Scraper(sources);

//Search and display the resultant list of beers
scraper
  .byName(searchTerm)
  .then((result) => {
    console.log(result.scrapedBeers); //ordered by price, ascending
  })
  .catch((err) => {
    console.log(err.message);
  })
  .finally(() => ScraperEngine.halt());

API

Scraper.byName(string)

Searches for the given beer name in all registered data sources. Errors raised during the search are suppressed and do not halt the search process.

Returns a promise that resolves to an object containing a list of found beers and a list of errors, identified per source. Both lists can be empty, but are never null.
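The behaviour described above (query every source, record failures per source instead of aborting, return the beers ordered by price) can be sketched as follows. This is only an illustration and not the module's actual implementation: the BeerSource interface, the searchAll function and the errors field name are assumptions made for this example; only scrapedBeers appears in the usage example above.

```typescript
// Hypothetical shapes for illustration; the module's real types may differ.
interface Beer {
  name: string;
  price: number;
}

interface SourceError {
  source: string;
  message: string;
}

interface SearchResult {
  scrapedBeers: Beer[];
  errors: SourceError[]; // assumed field name
}

interface BeerSource {
  id: string;
  search(term: string): Promise<Beer[]>;
}

// Query every source; a failing source is recorded instead of aborting the search.
async function searchAll(
  sources: BeerSource[],
  term: string
): Promise<SearchResult> {
  const result: SearchResult = { scrapedBeers: [], errors: [] };

  await Promise.all(
    sources.map(async (source) => {
      try {
        const beers = await source.search(term);
        result.scrapedBeers.push(...beers);
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        result.errors.push({ source: source.id, message });
      }
    })
  );

  // Cheapest first. Both lists are always arrays, possibly empty, never null.
  result.scrapedBeers.sort((a, b) => a.price - b.price);
  return result;
}
```

A caller can then check the errors list to see which sources failed, while still using whatever beers the remaining sources returned.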

Plugins

This module is built around plugins, which can be developed to extend the reach of the search. All plugins should implement the sourceScraper interface.

A WebScraper plugin is available out of the box; however, each web page has its own particularities. To deal with them, a scrapeProcessor can be implemented for each web store that will be searched. The scrape processor is responsible for processing the web page and extracting the list of found beers from it.

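import { Scraper } from "./core";
import { WebScraper } from "./plugins/scrapers/web";

//CustomWebStoreProcessor is your own processor implementation, written for one specific web store
const myWebStoreProcessor: ScrapProcessor = new CustomWebStoreProcessor();
//Wrap the processor in the web scraper plugin and register it as a data source
const webScraper = new WebScraper(myWebStoreProcessor);
const scraper = new Scraper([webScraper]);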
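The processor interface itself is not shown in this README, so the following is only a rough sketch of the kind of work a store-specific processor does: turning raw listing text taken from a page into structured beer records. RawListing, ParsedBeer, parsePrice and extractBeers are hypothetical names invented for this example and are not part of the module.

```typescript
// Hypothetical shapes; the module's real processor interface may look different.
interface RawListing {
  title: string;
  priceText: string;
}

interface ParsedBeer {
  name: string;
  price: number;
}

// Turn a price label such as "29,90" or "$ 12.50" into a number.
// Returns null when no number is found. Thousands separators are not handled.
function parsePrice(text: string): number | null {
  const match = text.match(/(\d+)(?:[.,](\d{1,2}))?/);
  if (!match) return null;
  const whole = match[1];
  const cents = match[2] || "0";
  return parseFloat(whole + "." + cents);
}

// Keep only listings that have both a name and a parseable price.
function extractBeers(listings: RawListing[]): ParsedBeer[] {
  const beers: ParsedBeer[] = [];
  for (const item of listings) {
    const name = item.title.trim();
    const price = parsePrice(item.priceText);
    if (name.length > 0 && price !== null) {
      beers.push({ name, price });
    }
  }
  return beers;
}
```

In a real processor the raw listings would come from the store's page markup, which is exactly the part that differs from store to store.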

Name Matching

When searching for a term, a mixed bag of results might be returned from the data sources. Each source may handle a search differently: some return results even when the searched beer is not found, some return the found beer along with a list of similar beers, and others return only the exact match. To help filter those results, a NameMatcher interface is also available and can be loaded into the WebScraper plugin. Name matching is optional for the WebScraper; if no matcher is provided, no filtering is applied and all results found are returned.

A matcher that uses the Damerau-Levenshtein distance algorithm to calculate the similarity between terms is available out of the box.

import { DamerauMatcher } from "./core/matcher";

//Ratio of 80%. Two terms are considered a match if they are at least 80% similar, based on the Damerau-Levenshtein algorithm
const matcher = new DamerauMatcher(80);
matcher.matches("TermA", "TermB");
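To give an idea of how a ratio-based matcher of this kind can work, here is a minimal sketch. It is not the DamerauMatcher source: it uses the optimal string alignment variant of Damerau-Levenshtein (insertions, deletions, substitutions, and swaps of adjacent characters), and it assumes that similarity is the distance normalized by the length of the longer term. The function names are made up for this example.

```typescript
// Optimal string alignment distance, a restricted form of Damerau-Levenshtein.
function osaDistance(a: string, b: string): number {
  // d[i][j] holds the distance between the first i chars of a and first j chars of b.
  const d: number[][] = [];
  for (let i = 0; i <= a.length; i++) {
    const row: number[] = [];
    for (let j = 0; j <= b.length; j++) {
      row.push(i === 0 ? j : j === 0 ? i : 0);
    }
    d.push(row);
  }

  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1, // deletion
        d[i][j - 1] + 1, // insertion
        d[i - 1][j - 1] + cost // substitution
      );
      // Transposition of two adjacent characters.
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[a.length][b.length];
}

// Similarity as a percentage (assumed normalization: longer term's length).
function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 100;
  return (1 - osaDistance(a, b) / longest) * 100;
}

// True when the two terms are at least minRatio percent similar.
function isMatch(a: string, b: string, minRatio: number): boolean {
  return similarity(a, b) >= minRatio;
}
```

With this normalization, a single swapped pair of letters in a four-letter term gives 75% similarity, so it would be rejected at a ratio of 80 but accepted at 70.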

Contributing

Feel free to contribute, suggest optimizations or new features. But before getting your hands dirty, please take a minute to read these guidelines.

License

This project is licensed under the GNU GENERAL PUBLIC LICENSE - see the LICENSE.md file for details.
