Kitchenhand

A simple scraper for extracting structured recipe data from the web

This is a simple utility module that attempts to retrieve and parse recipe data from websites that use the Recipe structured data schema defined at Schema.org. The scraper can extract data both from webpages that store recipes in embedded ld+json and from webpages that flag DOM elements with itemprops. Kitchenhand should therefore be able to handle the majority of recipes stored using the common schema, provided that the markup implements that schema in a reasonably sane way.
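To make the ld+json case concrete, here is a minimal sketch of how a Recipe object might be located in a page's embedded structured data. It is not Kitchenhand's actual implementation: `findJsonLdRecipe` and `isRecipeNode` are hypothetical names, the HTML is matched with a regular expression rather than a real DOM parser, and the sample markup is invented for illustration.

```javascript
// Illustrative sketch only; these helpers are hypothetical and not part of Kitchenhand.

// True when a JSON-LD node declares the Schema.org Recipe type.
function isRecipeNode(node) {
    if (!node || typeof node !== 'object') return false;
    const type = node['@type'];
    return Array.isArray(type) ? type.includes('Recipe') : type === 'Recipe';
}

// Scans raw HTML for <script type="application/ld+json"> blocks and
// returns the first Recipe node found, or null if there is none.
function findJsonLdRecipe(html) {
    const scriptPattern =
        /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
    let match;
    while ((match = scriptPattern.exec(html)) !== null) {
        let data;
        try {
            data = JSON.parse(match[1]);
        } catch (err) {
            continue; // skip blocks that are not valid JSON
        }
        // A block may hold a single object, an array of objects, or an @graph wrapper.
        const candidates = [].concat(data).flatMap(node =>
            node && Array.isArray(node['@graph']) ? node['@graph'] : [node]);
        const recipe = candidates.find(isRecipeNode);
        if (recipe) return recipe;
    }
    return null;
}

// Invented sample markup for demonstration.
const sampleHtml = '<script type="application/ld+json">' +
    '{"@context":"https://schema.org","@type":"Recipe","name":"Chimichurri"}' +
    '</script>';
console.log(findJsonLdRecipe(sampleHtml).name); // prints "Chimichurri"
```

Pages that mark up recipes with itemprops instead would need a DOM traversal, which this sketch does not attempt.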

Basic Usage

const kitchenhand = require('kitchenhand');

kitchenhand(<url>).then(recipe => {
    console.log(recipe);
});

The returned promise yields either a recipe object or, if recipe data could not be retrieved from the specified URL, an error message:

{ 
    message: "Could not retrieve recipe data from <url>"
}
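Because a failed lookup is reported as an object with a message property, callers may want to tell the two outcomes apart before using the result. The following is a minimal sketch under two assumptions not confirmed above: that a real recipe object never carries a string `message` property, and that the promise might either resolve with the error object or reject. `isScrapeError` and `getRecipeOrNull` are hypothetical helper names; the scraper function is passed in as an argument so the sketch runs without the package installed.

```javascript
// Illustrative sketch only; these helpers are hypothetical and not part of Kitchenhand.

// Assumption: an error result is an object with a string `message`,
// and a genuine recipe object has no such property.
function isScrapeError(result) {
    return !result || typeof result.message === 'string';
}

// Calls the given scraper function and returns the recipe, or null on failure.
async function getRecipeOrNull(scrape, url) {
    try {
        const result = await scrape(url);
        if (isScrapeError(result)) {
            console.error(result ? result.message : `No result for ${url}`);
            return null;
        }
        return result;
    } catch (err) {
        // Covers the case where the promise rejects instead of
        // resolving with the error message object.
        console.error(err && err.message ? err.message : String(err));
        return null;
    }
}
```

With the package installed, this could be called as `getRecipeOrNull(require('kitchenhand'), url)`.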

Options

The kitchenhand() function also accepts an optional options parameter as its second argument. At present, only the parseIngredients option is handled.

parseIngredients

When the parseIngredients option is passed, Kitchenhand will attempt to parse the list of ingredients into objects with three properties: amount, unit, and name.

kitchenhand(<url>, { parseIngredients }).then(recipe => console.log(recipe));

If all goes well, this will result in a recipeIngredient array that looks like the following:

Recipe {
    ...,
    recipeIngredient: [
        { amount: '1/2', unit: 'cup ', name: 'fresh parsley leaves' },
        { amount: '1/2', unit: 'cup ', name: 'fresh cilantro leaves' },
        ...
    ],
}

Currently, the parseIngredients option relies on a regexp-driven algorithm that is not especially robust, given the great variety of formats in which ingredients are listed in recipes today. Your mileage may vary!
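To illustrate what a regexp-driven approach can look like, and why it is fragile, here is a minimal sketch that splits an ingredient line into amount, unit, and name. It is not Kitchenhand's own algorithm: `UNITS`, `INGREDIENT_PATTERN`, and `parseIngredient` are hypothetical names, and the unit list is a small, arbitrary sample.

```javascript
// Illustrative sketch only; not Kitchenhand's parsing code.

// A deliberately small, assumed list of units.
const UNITS = ['cup', 'cups', 'tablespoon', 'tablespoons', 'tbsp',
    'teaspoon', 'teaspoons', 'tsp', 'ounce', 'ounces', 'oz',
    'pound', 'pounds', 'lb', 'g', 'kg', 'ml', 'l'];

const INGREDIENT_PATTERN = new RegExp(
    '^\\s*(\\d+(?:\\s+\\d+/\\d+|/\\d+|\\.\\d+)?)\\s+' + // amount: 2, 1/2, 1 1/2, 1.5
    '(?:(' + UNITS.join('|') + ')\\b\\.?\\s+)?' +       // optional unit, e.g. "cup" or "tbsp."
    '(.+?)\\s*$',                                       // the rest is the ingredient name
    'i');

// Returns { amount, unit, name }; amount and unit are null when not recognized.
function parseIngredient(text) {
    const match = INGREDIENT_PATTERN.exec(text);
    if (!match) {
        return { amount: null, unit: null, name: text.trim() };
    }
    const [, amount, unit, name] = match;
    return { amount, unit: unit || null, name };
}

console.log(parseIngredient('1/2 cup fresh parsley leaves'));
console.log(parseIngredient('salt to taste'));
```

Lines such as "a pinch of salt" or "one large onion" fall through to the no-match branch, which is the kind of variety that makes this approach unreliable.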

noemata83/kitchenhand

Folders and files

Latest commit

History

Repository files navigation

Kitchenhand

A simple scraper for extracting structured recipe data from the web

Basic Usage

Options

parseIngredients

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages