Official Diffbot Ruby API Client

Diffbot API Ruby client


This is a Ruby client library for the Diffbot API.


Install the gem by adding it to your Gemfile:

gem 'diffbot-ruby-client', :git => ''

Require diffbot in your app:

require "diffbot"


Obtaining a Ruby Diffbot client is as simple as this:

client = Diffbot::APIClient.new

Because each client is a separate instance, you can build thread-safe applications and keep multiple client instances with different setups at the same time.

The initializer also accepts a block, which allows for some fancier setup:

client = Diffbot::APIClient.new do |config|
  config.token = ENV["DIFFBOT_TOKEN"]
end
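
The block-based setup can be pictured with a minimal sketch. `SketchClient` and `SketchConfig` are hypothetical stand-ins, not the gem's classes; they only illustrate how an initializer can yield a per-instance configuration object, which is what lets several differently configured clients coexist.

```ruby
# Stand-in classes for illustration only; the gem's real internals may differ.
class SketchConfig
  attr_accessor :token
end

class SketchClient
  attr_reader :config

  def initialize
    @config = SketchConfig.new
    # Hand the per-instance configuration to the caller's block, if one is given.
    yield @config if block_given?
  end
end

first  = SketchClient.new { |config| config.token = "token-a" }
second = SketchClient.new { |config| config.token = "token-b" }

puts first.config.token
puts second.config.token
```

Since nothing is stored at class level in this sketch, configuring one instance never affects another.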

Once the token is configured, we can move on to making actual requests.


The client uses Faraday as its HTTP middleware library. It can be configured as usual, even within the initialization block:

client = do |config|
  config.middleware =
    & do |builder|
      # Specify a middleware stack here
      builder.adapter :some_other_adapter


Generic calls

While the Ruby client supports each Diffbot API through dedicated classes and methods, it is still possible to call an API in a generic way. Here's an example of how to do that:

client = Diffbot::APIClient.new
response = client.get("v2/analyze", {:token => DIFFBOT_TOKEN, :url => ""})

response will then contain the JSON reply parsed into a Hash. POST requests can be issued the same way, via the post method.
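
To make the generic call more concrete, here is a rough sketch of the URL such a GET request would presumably resolve to. The helper `generic_request_url` is hypothetical and not part of the gem, and the base URL `https://api.diffbot.com/` is an assumption; the sketch only builds the URL and performs no network request.

```ruby
require "uri"

# Assumed base URL; the gem may use a different one.
DIFFBOT_BASE_URL = "https://api.diffbot.com/"

# Hypothetical helper: joins the API path to the base URL and
# encodes the parameters hash as a query string.
def generic_request_url(path, params)
  uri = URI.join(DIFFBOT_BASE_URL, path)
  uri.query = URI.encode_www_form(params)
  uri.to_s
end

url = generic_request_url("v2/analyze", :token => "my-token", :url => "http://example.com/page")
puts url
```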

Article API

Assume that we have our client configured. To use the Automatic Article API, we first need to create an Article API instance:

client.article # => Diffbot::APIClient::Article
client.article(:version => 1) # Instantiate API version 1 (2 is default)

Then we need to specify the query:

article = client.article.query(:fields => [:title, :link, :text], :timeout => 2000)
article # => Diffbot::APIClient::Article

And then make a GET or POST request:

response = article.get("")
response[:title] # => "Some page title"

response = article.post("", content)
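
The symbol-keyed access shown in the GET example suggests that replies are parsed with symbolized keys. As a small sketch of that idea using only Ruby's standard library (the response body here is made up, and whether the gem parses replies exactly this way is an assumption):

```ruby
require "json"

# Made-up response body for illustration; real replies carry more fields.
body = '{"title": "Some page title", "text": "Body text"}'

# :symbolize_names turns JSON object keys into Ruby symbols.
parsed = JSON.parse(body, :symbolize_names => true)

puts parsed[:title]
```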

We can also make a sweet one-liner out of it:

response = client.article.get("")

There is also an alternative syntax for making requests:

article = client.article.query(
  :fields => [:title, :link, :text],
  :timeout => 2000,
  :method => :get,
  :url => ""
)
response = article.execute
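
One way to picture this alternative syntax is a query object that collects every parameter and only separates the transport options (:method, :url) from the API parameters when executed. `SketchQuery` below is a hypothetical stand-in, not the gem's implementation, and its `execute` merely returns a description of the request instead of sending it.

```ruby
# Hypothetical stand-in for an API query object.
class SketchQuery
  attr_reader :params

  def initialize(params = {})
    @params = params
  end

  # Returns a new query with the extra parameters merged in,
  # leaving the receiver untouched.
  def query(extra)
    SketchQuery.new(params.merge(extra))
  end

  # Splits transport options from API parameters and describes the request.
  def execute
    api_params = params.reject { |key, _| [:method, :url].include?(key) }
    { :method => params.fetch(:method, :get), :url => params[:url], :params => api_params }
  end
end

request = SketchQuery.new.query(
  :fields => [:title, :link, :text],
  :timeout => 2000,
  :method => :get,
  :url => "http://example.com/page"
).execute

puts request.inspect
```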

Frontpage API

Calling the Frontpage API is also pretty simple:

response = client.frontpage.get("")

By default, the response is returned as DML. You can change this by adding :format to the query:

response = client.frontpage.query(:format => :json).get("")

Image API

response = client.image.get("")

Product API

response = client.product.get("")

Analyze API

Similarly, here's how you would call the Analyze API:

response = client.analyze.query(:mode => "article", :stats => true).get("")

Custom API

With the Custom API, you need to supply the API's name:

response = client.custom("my-custom-api").get("")

Bulk API

The Bulk API allows you to submit jobs to Diffbot. Jobs can use different APIs to analyse websites. This requires supplying an apiUrl, which will be used to perform the crawling. The Ruby client makes it possible to avoid writing these URLs by hand. Instead, you can use the Ruby API objects described above:

bulk = client.bulk(
  :name => "bulk-job",
  :urls => ["", ""],
  :api => client.article.query(:fields => [:title, :text]),
  :options => bulk_arguments_hash
)

The :api argument here accepts any valid API object, with or without extra query parameters.
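
To illustrate what passing an API object saves you from, here is a sketch of how an API name and its query parameters might be turned into an apiUrl string. The helper `sketch_api_url` is hypothetical, and the base URL, the version segment, and the comma-joined list values are all assumptions rather than the gem's documented behaviour.

```ruby
require "uri"

# Hypothetical helper; URL layout and parameter encoding are assumptions.
def sketch_api_url(api_name, query = {}, version = 2)
  url = "https://api.diffbot.com/v#{version}/#{api_name}"
  return url if query.empty?

  # Join list values such as [:title, :text] into "title,text".
  params = query.map { |key, value| [key, Array(value).join(",")] }
  "#{url}?#{URI.encode_www_form(params)}"
end

article_url = sketch_api_url("article", :fields => [:title, :text])
analyze_url = sketch_api_url("analyze")

puts article_url
puts analyze_url
```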

Once the bulk object is constructed, we can get the job's details, pause it, resume it, or delete it.


Finally, we can obtain the result of the bulk job.

Crawlbot API

The Crawlbot API is pretty similar to the Bulk API, but instead of the :urls parameter it requires :seeds. Here's a sample call:

crawlbot = client.crawlbot(
  :name => "test",
  :seeds => [""],
  :api => client.analyze
)

Just like the Bulk object, the Crawlbot object supports the details, pause, resume and delete operations.

Batch API

The Batch API allows you to submit multiple API calls in a single request. Once you've created a batch object, you can add API calls using the << method. After that, just call execute to submit the request:

batch = client.batch
batch << client.article.query(:fields => [:title, :link, :text], :method => :get, :url => "")
response = batch.execute
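
The << and execute pairing follows a common Ruby collection idiom. As a minimal sketch of that idiom only (the `SketchBatch` class is a hypothetical stand-in that records calls and sends nothing over the network):

```ruby
# Hypothetical stand-in for a batch object; it only records calls.
class SketchBatch
  def initialize
    @calls = []
  end

  # Returning self allows chaining: batch << a << b
  def <<(call)
    @calls << call
    self
  end

  # A real client would submit all calls in one HTTP request;
  # this sketch just returns a copy of what would be submitted.
  def execute
    @calls.dup
  end
end

sketch_batch = SketchBatch.new
sketch_batch << { :method => :get, :url => "http://example.com/a" } << { :method => :get, :url => "http://example.com/b" }

submitted = sketch_batch.execute
puts submitted.length
```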


Please see LICENSE for licensing details.


Łukasz Jachymczyk,