@scrapinghub

Scrapinghub

Turn web content into useful data

splash

Lightweight, scriptable browser as a service with an HTTP API

Updated Jul 30, 2016

JavaScript 4,415 641

portia

Visual scraping for Scrapy

Updated Jul 30, 2016

Python 2 1

shub-image

Client-side tool to prepare Docker images to run crawlers in Scrapinghub

Updated Jul 29, 2016

Python 243 65

frontera

A scalable frontier for web crawlers

Updated Jul 29, 2016

Python 53 22

python-scrapinghub

A client interface for Scrapinghub's API

Updated Jul 29, 2016

Python 88 57

testspiders

Useful test spiders for Scrapy

Updated Jul 29, 2016

Python 9 3

exporters

Exporters is an extensible export pipeline library that supports filtering, transformation, and several sources and destinations

Updated Jul 27, 2016

Python 23 28

shub

Scrapinghub Command Line Client

Updated Jul 27, 2016

Python 0 0

scrapinghub-stack-portia

Software stack used to run Portia spiders in the Scrapinghub cloud

Updated Jul 26, 2016

Python 1 1

scrapinghub-stack-hworker

Updated Jul 26, 2016

Python 5 3

scrapinghub-entrypoint-scrapy

Scrapy entrypoint for Scrapinghub job runner

Updated Jul 26, 2016

Python 522 69

dateparser

Python parser for human-readable dates

Updated Jul 25, 2016

Python 6 1

kafka-scanner

High Level Kafka Scanner

Updated Jul 22, 2016

Python 1 0

collection-scanner

HubStorage collection scanner library

Updated Jul 22, 2016

Shell 10 4

docker-images

Updated Jul 22, 2016

Python 2 9

doc.scrapinghub.com

Scrapinghub Documentation

Updated Jul 21, 2016

Python 236 87

scrapylib

Collection of Scrapy utilities (extensions, middlewares, pipelines, etc.)

Updated Jul 19, 2016

Python 106 9

extruct

Extract embedded metadata from HTML markup

Updated Jul 18, 2016

Shell 1 2

scrapinghub-conda-recipes

Conda packages for the scrapinghub channel

Updated Jul 14, 2016