Diffbot API Haskell Client

Simple client for the Diffbot API for Haskell.

Installation

The easiest way to install the package and its dependencies is to use the cabal command line tool. The Cabal-Install page explains how to use cabal.

To install the package, run the following commands:

$ git clone https://github.com/tymmym/diffbot.git
$ cd diffbot
diffbot $ cabal install

You can also generate library documentation from annotated source code using Haddock:

diffbot $ cabal haddock

Alternatively you can read it online.

Usage

Automatic APIs

Diffbot uses computer vision, natural language processing and machine learning to automatically recognize and structure specific page-types.

To use an Automatic API, call the diffbot function with the following arguments:

Argument  Description
token     Developer token
url       URL to process
request   API settings

Here is a full example that sends the default request to the Article API:

import Diffbot

main = do
    let token = "11111111111111111111111111111111"
        url   = "http://blog.diffbot.com/diffbots-new-product-api-teaches-robots-to-shop-online/"
    resp <- diffbot token url defArticle
    print resp

This code will print information about the primary article content on the submitted page:

Just fromList [("author",String "John Davi"),("title",String "Diffbot\8217s New Product API Teaches Robots to Shop Online"),...

You can extract values from the response with a parser, using parse, parseEither or, as in this example, parseMaybe from the aeson package:

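{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (Object, (.:))
import Data.Aeson.Types (parseMaybe)

-- Build "<title>, by <author>", or Nothing if either field is missing or not a string.
getInfo :: Object -> Maybe String
getInfo resp = flip parseMaybe resp $ \obj -> do
    author <- obj .: "author"
    title  <- obj .: "title"
    return $ title ++ ", by " ++ author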
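
When a field is missing, parseMaybe only returns Nothing. The parseEither function mentioned above reports why parsing failed instead. The following is a minimal sketch that uses only aeson, not the Diffbot client: the names articleSummary, completeArticle and missingAuthor are hypothetical, and the sample values are built by hand rather than fetched from the API.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (Value, object, withObject, (.:), (.=))
import Data.Aeson.Types (Parser, parseEither)

-- Hypothetical parser: extracts "<title>, by <author>" from a JSON object.
articleSummary :: Value -> Parser String
articleSummary = withObject "article" $ \obj -> do
    author <- obj .: "author"
    title  <- obj .: "title"
    return (title ++ ", by " ++ author)

-- Hand-made sample values standing in for an API response.
completeArticle :: Value
completeArticle = object
    [ "author" .= ("John Davi" :: String)
    , "title"  .= ("Some title" :: String)
    ]

missingAuthor :: Value
missingAuthor = object [ "title" .= ("Some title" :: String) ]

main :: IO ()
main = do
    -- Prints: Right "Some title, by John Davi"
    print (parseEither articleSummary completeArticle)
    -- Prints a Left carrying aeson's error message about the missing key.
    print (parseEither articleSummary missingAuthor)
```

The sketch parses a Value through withObject so that it does not depend on how a particular aeson version represents Object. Since parseEither accepts any input type, the same lambda used with parseMaybe can be passed to parseEither together with the response object.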

You can use the same diffbot function to send requests to the other Automatic APIs (Frontpage, Product, Image and Page Classifier). For example, a Frontpage request with a custom timeout:

diffbot token url . setTimeout 15000 $ defFrontPage { frontPageAll = True }
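
For reference, the one-liner above could be placed in a complete program roughly as follows. This is a sketch under two assumptions: that diffbot returns its result wrapped in Maybe (inferred from the Just in the sample output shown earlier), and that the blog front page is a suitable URL for the Frontpage API. The token is the same placeholder used in the Article example.

```haskell
import Diffbot

main :: IO ()
main = do
    let token = "11111111111111111111111111111111"  -- placeholder developer token
        url   = "http://blog.diffbot.com"            -- assumed example URL
        -- Same request as in the one-liner: Frontpage API, custom timeout.
        req   = setTimeout 15000 $ defFrontPage { frontPageAll = True }
    resp <- diffbot token url req
    -- Assumption: the result is a Maybe, as the sample output suggests.
    case resp of
        Nothing  -> putStrLn "No response"
        Just obj -> print obj
```

Binding the request to a name first keeps the call to diffbot identical in shape to the Article example, so switching APIs only changes the req definition.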

Custom API

You can also send requests to your own Custom API. To do so, implement an instance of the Request class for your request type. See the Article API sources for a reference implementation.

Crawlbot API

Crawlbot allows you to apply either the Automatic APIs or your own Custom API to extract data from an entire site.

To create a new crawl, use the crawlbot function:

import Diffbot
import Diffbot.Crawlbot

main = do
    let token = "11111111111111111111111111111111"
        crawl = defaultCrawl "sampleDiffbotCrawl" ["http://blog.diffbot.com"]
    resp <- crawlbot token $ Create crawl
    print resp

You can also view, pause, restart or delete crawls.

Details

Please consult the library documentation for additional information.

-Initial commit by Tim Tych-