Scrapinghub

Turn web content into useful data

  • Scrapinghub Command Line Client

    Python 67 42 Updated Jul 18, 2018
  • A Python binding for CRFsuite

    Python 404 141 MIT Updated Jul 16, 2018
  • A scalable frontier for web crawlers

    Python 748 139 9 issues need help Updated Jul 16, 2018
  • NER toolkit for HTML data

    HTML 171 40 Updated Jul 13, 2018
  • Python parser for human-readable dates

    Python 963 199 BSD-3-Clause Updated Jul 6, 2018
  • Lightweight, scriptable browser as a service with an HTTP API

    Python 1,931 277 Updated Jul 2, 2018
  • Shell 18 5 Updated Jun 29, 2018
  • Kafka Consumer Lag Checking

    Go 373 Apache-2.0 Updated Jun 28, 2018
  • A client interface for Scrapinghub's API

    Python 111 36 Updated Jun 12, 2018
  • Scrapinghub Documentation

    Python 9 12 Updated Jun 11, 2018
  • Extract embedded metadata from HTML markup

    Python 253 39 3 issues need help Updated Jun 8, 2018
  • Visual scraping for Scrapy

    Python 6,185 963 Updated Jun 6, 2018
  • RabbitMQ backend for ASGI

    Python 9 Updated Jun 4, 2018
  • Scrapinghub Learning Center

    CSS 32 15 Updated May 25, 2018
  • The Apache Kafka C/C++ library

    C 942 Updated May 23, 2018
  • Python parser for Adblock Plus filters

    Python 108 16 MIT Updated May 17, 2018
  • Scala 123 Apache-2.0 Updated May 13, 2018
  • A process for exposing JMX Beans via HTTP for Prometheus consumption

    Java 236 Apache-2.0 Updated May 11, 2018
  • Scrapy realtime

    Python 369 83 Updated May 9, 2018
  • A tool for managing Apache Kafka.

    Scala 3 1,357 Apache-2.0 Updated May 4, 2018
  • Scrapy entrypoint for Scrapinghub job runner

    Python 10 7 Updated Apr 25, 2018
  • An efficient simhash implementation for Python

    C 50 15 Updated Apr 24, 2018
  • Software stack used to run Portia spiders in Scrapinghub cloud

    Python 7 3 Updated Apr 3, 2018
  • Python 25 11 Updated Mar 27, 2018
  • Dockerized Redmine app server with a couple of pre-installed themes and plugins

    Shell 2 293 MIT Updated Mar 27, 2018
  • 11 7 Updated Mar 23, 2018
  • Sample projects showcasing Scrapinghub tech

    Python 51 39 Updated Mar 11, 2018
  • [DEPRECATED]

    62 Updated Mar 2, 2018
  • scrapylib Archived

    Collection of Scrapy utilities (extensions, middlewares, pipelines, etc.)

    Python 17 80 Updated Feb 22, 2018
  • Secure services with stunnel

    Shell 1 11 Updated Feb 6, 2018