bitmakerla/estela

main
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

* Solving issues with localzones, correcting query for spiders, deleting unnecessary endpoint project_stats_jobs, avoiding memory leak in modal frontend

* Solving issues with rest_framework Duration field parsing in frontend
bbcc64d

Git stats

Files

Permalink
Failed to load latest commit information.

estela

estela is an elastic web scraping cluster running on Kubernetes. It provides mechanisms to deploy, run and scale web scraping spiders via a REST API and a web interface.

Technologies

Docker, Python, React, Node.js

Project Structure

The project consists of three main modules:

  • REST API: Built with the Django REST framework toolkit, it exposes several endpoints to manage projects, spiders, and jobs. It uses Celery for task processing and takes care of deploying your Scrapy projects, among other things.
  • Queueing: estela needs a high-throughput, low-latency platform that handles real-time data feeds in a producer-consumer architecture. This module contains a consumer that collects the data produced by the spider jobs and transports it into a database.
  • Web: A web interface implemented with React and TypeScript that lets you manage projects and spiders.

Each of these modules works independently of the others and can be replaced. Each module is described in more detail in its corresponding directory.
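To make the producer-consumer idea behind the Queueing module more concrete, here is a minimal sketch that uses only the Python standard library. It is not estela's implementation: `queue.Queue` stands in for the queueing platform, an in-memory SQLite table stands in for the database, and the names `run_spider_job`, `consume`, `run_demo`, and the `items` table are hypothetical, chosen for illustration only.

```python
import queue
import sqlite3
import threading

# End-of-stream marker; a convention of this sketch, not of estela.
SENTINEL = None


def run_spider_job(job_id, n_items, out_queue):
    """Hypothetical producer: stands in for a spider job emitting scraped items."""
    for i in range(n_items):
        out_queue.put({"job_id": job_id, "url": f"https://example.com/{job_id}/{i}"})


def consume(in_queue, conn):
    """Consumer: drains the queue and stores every item in the database."""
    while True:
        item = in_queue.get()
        if item is SENTINEL:
            break
        conn.execute(
            "INSERT INTO items (job_id, url) VALUES (?, ?)",
            (item["job_id"], item["url"]),
        )
    conn.commit()


def run_demo():
    """Run two producers and one consumer; return the number of stored items."""
    items = queue.Queue()
    # check_same_thread=False lets the consumer thread use this connection;
    # the threads never touch it at the same time.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE items (job_id TEXT, url TEXT)")

    consumer = threading.Thread(target=consume, args=(items, conn))
    consumer.start()

    producers = [
        threading.Thread(target=run_spider_job, args=(f"job-{n}", 3, items))
        for n in range(2)
    ]
    for producer in producers:
        producer.start()
    for producer in producers:
        producer.join()

    # All producers are done, so tell the consumer to stop.
    items.put(SENTINEL)
    consumer.join()

    return conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]


if __name__ == "__main__":
    print(run_demo())
```

The spider jobs never write to the database themselves; they only put items on the queue. That decoupling is what allows the consumer and the storage backend to be changed without touching the producers.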

estela-cli

estela-cli is a command-line interface for estela.

How to Contribute

Please read CONTRIBUTING.md and follow the steps. Remember to abide by our Code of Conduct too.