Scraping Web Stuff with Meteor

This package adds sophisticated scraping utilities to your Meteor app. You can request ordinary HTML pages or RSS/Atom feeds and get a detailed result object back. It is built on top of excellent NPM modules.


meteor add anonyfox:scrape


Works on the server with an easy API:

    # scrape any website
    websiteData = Scrape.website ""

    # scrape any RSS or Atom feed
    feedData = Scrape.feed ""

    # scrape wikipedia
    article = Scrape.wikipedia "web scraping"

    # scrape everything else, without further parsing
    data = Scrape.url ""


Works best for typical articles, blog posts or other content sites, but even a tweet should suffice. Example response data:

      title: 'Inside the digital war room',

      text: '21 February 2015 Last updated at 15:55 GMT It is not clear how many devices have the software installed Chinese computer maker Lenovo is offering customers a tool to help them remove pre-installed software that experts warned was a security risk. The Superfish adware [...]', # shortened here; the actual result contains the full text

      lang: 'eng',

      description: 'This would allow it - or anyone who hacked Superfish - to collect data over secure web connections. Users had initially complained about intrusive pop-up ads appearing on their browsers. Lenovo said on Thursday it had disabled it because of customer complaints. Superfish was designed to help users find products by visually analysing images on the web to find the cheapest ones. Superfish appears to work by substituting its own security key for the encryption certificates used by many websites.',

      favicon: '',

      references: [ '', ... ],

      image: '',

      [ '',
        '' ],

      [ 'devices',
       'appears' ],

      domain: '',

      url: ''
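
A result object like the one above can be reduced to a short preview for list views. The following is a minimal sketch in plain JavaScript, not part of the package: `toPreview` and the `sample` object are hypothetical, and the sketch only assumes the `title`, `text`, `description`, `lang` and `url` fields shown in the example response.

```javascript
// Hypothetical helper, not part of the package: builds a short preview
// from a result object shaped like the example response.
function toPreview(result, maxLength = 140) {
  // Prefer the description; fall back to the full text when it is missing.
  const source = result.description || result.text || '';
  const excerpt = source.length > maxLength
    ? source.slice(0, maxLength).trimEnd() + '...'
    : source;
  return {
    title: result.title || '(untitled)',
    excerpt,
    lang: result.lang,
    link: result.url
  };
}

// Stand-in data; on the server this would come from the scrape call.
const sample = {
  title: 'Inside the digital war room',
  text: 'It is not clear how many devices have the software installed.',
  lang: 'eng',
  description: '',
  url: ''
};

console.log(toPreview(sample, 30));
```

Falling back from `description` to `text` is a design choice for this sketch: short pages such as tweets may not carry a separate description.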


Takes any RSS or Atom feed and returns a list of items. For example:

    data = Scrape.feed "" # {items: [...]}

A single news item looks like this:

      title: 'AppMachine raises $15M to help non-coders build their own mobile apps',

      description: '"Build your own app" startup AppMachine has announced a $15.2 million funding round, as domain hosting and registration behemoth Endurance International Group takes 40 percent in shares.',

      language: 'eng',

      link: '',

      pubDate: Tue Feb 24 2015 16:10:17 GMT+0000 (UTC),

      image: '',

       [ 'netherlands',
         'shares' ]
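
Feed results usually need filtering and sorting before display. The sketch below, in plain JavaScript, keeps only recent items and orders them newest first. `recentItems` and the stand-in `feed` object are hypothetical; the sketch assumes that `pubDate` is a JavaScript `Date`, as the printed example item suggests, and skips items without one.

```javascript
// Hypothetical helper, not part of the package: keeps items published at or
// after `since` and returns them newest first.
function recentItems(feed, since) {
  return (feed.items || [])
    // Skip items without a usable Date (assumption: pubDate is a Date object).
    .filter((item) => item.pubDate instanceof Date && item.pubDate >= since)
    // filter() returned a new array, so sorting does not touch feed.items.
    .sort((a, b) => b.pubDate - a.pubDate);
}

// Stand-in for a Scrape.feed result of the form {items: [...]}.
const feed = {
  items: [
    { title: 'Older story', pubDate: new Date('2015-02-20T09:00:00Z') },
    {
      title: 'AppMachine raises $15M to help non-coders build their own mobile apps',
      pubDate: new Date('2015-02-24T16:10:17Z')
    },
    { title: 'Undated story' }
  ]
};

const latest = recentItems(feed, new Date('2015-02-21T00:00:00Z'));
console.log(latest.map((item) => item.title));
```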


Takes a keyword, plus an optional language code and additional tags that help resolve disambiguation pages:

    Scrape.wikipedia 'avengers', 'en', ['film']

This produces the following output:

  title: 'The Avengers (2012 film)'
  lang: 'en'
  descriptions: [ '2012 superhero film produced by Marvel Studios' ]
  tags: [ 'avengers' ]
  aliases: [ 'Marvel Avengers Assemble', 'Marvel\'s The Avengers' ]
  url: ''
  summary: '<p><i><b>Marvel\'s The Avengers</b></i> (classified under the name <i><b>Marvel Avengers Assemble</b></i> in the United Kingdom and Ireland), or simply <i><b>The Avengers</b></i>, is a 2012 American superhero film based on the Marvel Comics superhero team of the same name, produced by Marvel Studios and distributed by Walt Disney Studios Motion Pictures.<sup class="reference plainlinks nourlexpansion" id="ref_1">1</sup> It is the sixth installment in the Marvel Cinematic Universe. The film was written [...]'
    caption: 'Theatrical release poster'
    director: '[Joss Whedon]('
    producer: '[Kevin Feige]('
    screenplay: 'Joss Whedon'
    based: '[The Avengers]('
    music: '[Alan Silvestri]('
    cinematography: '[Seamus McGarvey]('
    studio: '[Marvel Studios]('
    runtime: '143 minutes'
    country: 'United States'
    language: 'English'
    budget: '$220 million'
    gross: '$1.518 billion'
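
The `summary` field contains HTML, and several of the values above use a `[label](target)` link notation. When only plain text is needed, both can be cleaned up after scraping. This is a naive sketch in plain JavaScript: `stripTags` and `linkLabel` are hypothetical helpers, not part of the package, and the regular expressions are only meant for short snippets like the ones shown, not for arbitrary HTML.

```javascript
// Naive tag stripper: fine for short, well-formed snippets, not a real HTML parser.
function stripTags(html) {
  return html
    .replace(/<sup\b[^>]*>.*?<\/sup>/g, '') // drop footnote markers entirely
    .replace(/<[^>]+>/g, '')                // drop remaining tags, keep their text
    .replace(/\s+/g, ' ')                   // collapse runs of whitespace
    .trim();
}

// Turns "[label](target)" into "label"; other strings pass through unchanged.
function linkLabel(value) {
  const match = /^\[([^\]]+)\]\(/.exec(value);
  return match ? match[1] : value;
}

// Stand-in values modelled on the example output.
const summary = '<p><i><b>The Avengers</b></i> is a 2012 American superhero film.' +
  '<sup class="reference" id="ref_1">1</sup> It is the sixth installment.</p>';

console.log(stripTags(summary));
console.log(linkLabel('[Joss Whedon]('));
console.log(linkLabel('143 minutes'));
```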


  • get it working on the client
  • image thumbnail creation
  • transform Favicon to Base64 String


Scraping is a game of catch-22, and anything may break at any time. Therefore this package is licensed under the LGPL 3.0. Do whatever you want with it, but please give improvements and bugfixes back so everyone can benefit.
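
Because a scrape can fail without warning, calling code may want to guard each request. The sketch below, in plain JavaScript, shows one way to do that. `safeScrape` and the two stand-in scrapers are hypothetical; the sketch assumes the scrape call is synchronous on the server (as the examples in this document suggest) and that failures surface as thrown errors, which this document does not confirm.

```javascript
// Hypothetical helper, not part of the package: runs any synchronous scrape
// function and returns `fallback` instead of throwing.
function safeScrape(scrapeFn, target, fallback = null) {
  try {
    const result = scrapeFn(target);
    // Treat an empty result the same as a failure.
    return result == null ? fallback : result;
  } catch (error) {
    console.warn('Scrape failed for "' + target + '": ' + error.message);
    return fallback;
  }
}

// Stand-in scrapers; in a Meteor app one might pass
// (url) => Scrape.feed(url) instead (assumption, untested here).
const brokenScraper = () => { throw new Error('layout changed'); };
const workingScraper = (url) => ({ url, items: [] });

console.log(safeScrape(brokenScraper, 'feed-a', { items: [] }));
console.log(safeScrape(workingScraper, 'feed-b'));
```

Passing the scraper in as a function keeps the helper independent of the package, so the same guard can wrap any of the scrape calls.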

