Python, 3 stars, 1 fork

distributed-frontera

aduana

Frontera backend to guide a crawl using PageRank, HITS or other ranking algorithms based on the link structure of the web graph, even when making big crawls (one billion pages).

Python, 341 stars, 68 forks

splash

Lightweight, scriptable browser as a service with an HTTP API

Python, 1 star, 0 forks

scrapinghub-entrypoint-scrapy

Scrapy entrypoint for Scrapinghub job runner

docker-redmine

forked from sameersbn/docker-redmine

Dockerized Redmine app server with a couple of pre-installed themes and plugins

Python, 0 stars, 0 forks

doc.scrapinghub.com

Scrapinghub Documentation

Python, 365 stars, 33 forks

dateparser

Python parser for human-readable dates

JavaScript, 3,296 stars, 446 forks

portia

Visual scraping for Scrapy

Python, 56 stars, 16 forks

frontera

A flexible frontier for web crawlers

Python, 163 stars, 62 forks

scrapylib

Collection of Scrapy utilities (extensions, middlewares, pipelines, etc.)

otp

forked from erlang/otp

Erlang/OTP

Python, 10 stars, 13 forks

shub

Scrapinghub Command Line Client

Python, 3 stars, 1 fork

scrapy-crawlera

Crawlera middleware for Scrapy

webstruct

Learning the structure of the web

Python, 13 stars, 9 forks

crawlera-tools

Crawlera tools

Python, 131 stars, 18 forks

scrapyrt

Scrapy realtime

Python, 11 stars, 17 forks

python-hubstorage

HubStorage client library

python-readability

forked from buriy/python-readability

Fast Python port of arc90's readability tool, updated to match the latest readability.js

Python, 0 stars, 252 forks

kafka-python

forked from mumrah/kafka-python

Python client for Apache Kafka

Shell, 0 stars, 29 forks

docker-kibana

forked from balsamiq/docker-kibana

Balsamiq Kibana webapp Docker container
