Per-project job limit #140

luc4smoreira · 2016-04-06T19:54:46Z

I developed this new feature to limit the maximum number of jobs per project. Please check whether it is of interest.

…_jobs_per_project

Digenis · 2016-04-07T09:33:38Z

Hi,

What is your use-case for this feature?
Do you have problems with projects monopolizing resources?
If so, how did this happen?
What kind of projects do you run in the same scrapyd instance?

Btw, try running the tests locally before opening a PR
instead of waiting for TravisCI.

luc4smoreira · 2016-04-07T12:38:24Z

Hello Digenis.

I want to use Scrapyd in a production environment with a lot of spider projects. Some of them run only occasionally (monthly) but take about 3 days to complete all their jobs, about 500 in total. So I don't want this project to block the other projects' jobs when it starts.

I found other users that need this kind of feature too, like this one:
https://groups.google.com/forum/#!topic/scrapy-users/FME7PVpD2k8

I will work on fixing the tests today, if I have time, and push the code to this branch.
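
To make the scenario above concrete, here is a minimal sketch of a per-project cap. It is not the code from this PR: `pick_next_project` and its arguments are hypothetical names chosen for illustration, and only `max_jobs_per_project` comes from this PR. It assumes a global cap on concurrent jobs (called `max_proc` here) plus the per-project cap, and shows that a project with 500 queued jobs can no longer occupy every slot.

```python
from collections import Counter, deque


def pick_next_project(queues, running, max_proc, max_jobs_per_project):
    """Return a project that may start a job now, or None (hypothetical helper).

    queues: dict mapping project name -> deque of pending job ids
    running: list of project names, one entry per running job
    """
    if len(running) >= max_proc:
        return None  # every process slot is taken
    per_project = Counter(running)
    for project, pending in queues.items():
        # Skip projects that are idle or already at their own cap.
        if pending and per_project[project] < max_jobs_per_project:
            return project
    return None


# One large monthly project and one small daily project share 4 slots.
queues = {"monthly": deque(range(500)), "daily": deque(["d1", "d2"])}
running = []
started = []
while True:
    project = pick_next_project(queues, running, max_proc=4, max_jobs_per_project=2)
    if project is None:
        break
    started.append((project, queues[project].popleft()))
    running.append(project)

print(started)
```

With a cap of 2 per project, the monthly project takes two slots and the remaining two stay available for the daily project, instead of all four going to the monthly backlog.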

Digenis · 2016-04-07T13:38:43Z

I will need someone who's been more involved in the poller/launcher to review this when ready.

luc4smoreira · 2016-04-07T14:54:19Z

Sorry about the mess in Travis history.

I fixed the unit test using mock, with this module: https://pypi.python.org/pypi/mock

I am looking into how to add this egg to Travis.

jpmckinney · 2021-09-24T00:06:44Z

This PR has severe conflicts. Would any of the contributors be able to resolve them? If not, I will close the PR and create an issue instead (or defer to #197 as suggested in #389).

pawelmhm · 2021-11-23T06:54:00Z

This introduces postgres and rabbitmq as dependencies, which will increase technical debt. Also, some of the things added here were already done in a simpler way using sqlite in #359, which has been merged.

So in this form this PR cannot be merged.

Ideally we should just allow configurable pollers. Right now we load the QueuePoller class by default, but we could make it possible for people to write any sort of complex poller themselves. The same goes for the scheduler. I think ScrapyD should be basic and simple, but should provide building blocks to extend it with your desired functionality. The functionality from this PR could be added as a custom extension in a specific ScrapyD project, and ScrapyD should just let people integrate it easily by making all core components configurable.
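
As a rough illustration of what "configurable pollers" could look like, here is a minimal sketch, assuming the poller class is named by a dotted path in the config file. The `poller` option, `load_object`, `build_poller`, and `PerProjectLimitPoller` are hypothetical names, and the `QueuePoller` below is a stand-in, not ScrapyD's real class.

```python
import importlib
from configparser import ConfigParser


class QueuePoller:
    """Stand-in for the default poller (not ScrapyD's real class)."""

    def __init__(self, config):
        self.config = config


class PerProjectLimitPoller(QueuePoller):
    """Hypothetical user-written poller that reads a per-project cap."""

    def __init__(self, config):
        super().__init__(config)
        # 0 is used here to mean "no limit"; that convention is an assumption.
        self.limit = config.getint("scrapyd", "max_jobs_per_project", fallback=0)


def load_object(path):
    """Import 'package.module.Name' and return the object called Name."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def build_poller(config, default=QueuePoller):
    """Instantiate the class named by the hypothetical 'poller' option."""
    path = config.get("scrapyd", "poller", fallback=None)
    cls = load_object(path) if path else default
    return cls(config)


config = ConfigParser()
config.read_string("[scrapyd]\nmax_jobs_per_project = 2\n")
poller = build_poller(config)  # no 'poller' option set, so the default is used
print(type(poller).__name__)
```

Under this design, a deployment that wants the per-project limit would point the `poller` option at its own class (for example a `PerProjectLimitPoller` living in the user's package), and the core would not need postgres or rabbitmq as dependencies.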

Lucas Miranda added 6 commits March 17, 2016 13:57

Limit number of jobs per project

5709361

Documentation about the new parameter max_jobs_per_project

726aa99

Ignore files from eclipse workspace

85ed98c

Merge remote-tracking branch 'main/master'

a233941

Fixing logical operator

abf8113

Limits the maximum jobs per project using a new parameter called: max…

259e7a5

…_jobs_per_project

Lucas Miranda added 2 commits April 7, 2016 11:27

Changing visibility of method has_slot_for_project to private.

ce93726

Fixing unit test, mocking launcher object

d73bcc1

Lucas Miranda added 4 commits April 11, 2016 09:44

Unit tests corrections

56d999c

Set param as int.

5649ba5

Using config.getint()

fda04e8

Merge branch 'MAX_JOBS_PER_PROJECT'

5d21adf

# Conflicts: # scrapyd/poller.py

Digenis added the type: enhancement label May 17, 2016

Digenis added this to the 1.2 milestone May 22, 2016

wellingtonferreira added 10 commits June 15, 2016 18:55

Scrapyd extension to persist the information of the

eb5f369

schedules in postgres; this behavior can be enabled via the setting 'enable_postgres_persist = true'. This will make it possible to reprocess jobs.

[OFF] Creation of 2 methods that mark, respectively, the

02725cd

start and end dates of an execution.

[OFF] Persistence of the information (error_count, warn_count, item_count and

3aa425b

request_count) in the database at the end of an execution. In addition, parameters were added to the configuration file for the S3 buckets used to store the log and items files.

[Off] Parameter for the s3fs configuration file.

57199b2

[OFF] Native postgres support.

51a6189

[off]

2396725

[off]

41c9700

[off] Transition to version 1.3.1

248b767

[off] Creation of a singleton responsible for managing the connections w…

54af011

…ith a postgres database.

[off] Transition to version 1.3.2

7ddafd4

Digenis mentioned this pull request Nov 2, 2016

Polling order / Scheduled jobs priorities vs queue priorities #187

Open

fix to the poll method to run projects simultaneously

9199604

wellingtonferreira added 9 commits January 12, 2017 17:05

Merge branch 'MAX_JOBS_PER_PROJECT' of https://github.com/wellingtonf…

e1aed00

…erreira/scrapyd into MAX_JOBS_PER_PROJECT

[off] Handling to prevent the polling service from ceasing to func…

65af4f8

…tion when an error occurs.

#SUPPLY-366 Integration of scrapyd with rabbitmq.

e661329

[SUPPLY-366] Integration of scrapyd with Rabbitmq.

eff6f06

Removal of the pyrabbit lib

90f6e43

Removed from the constructor the statement that opens the

5f3d9c6

connection to rabbitmq.

Centralization of the total job count.

93706cd

Error handling when returning the queue size.

29f8846

Error handling in the method that consumes messages from the queue.

7259681

Digenis modified the milestones: 1.3.0, 1.2.0 Apr 6, 2017

Digenis mentioned this pull request Apr 12, 2017

avoid having 2 times the same spider running at the same time #228

Closed

Eduardo Tavares and others added 3 commits November 29, 2017 15:11

Turned off logging of the query that counts queued executions

f635425

Reverting to the execution queue configuration with postgres

05fdd94

Merge pull request #1 from wellingtonferreira/MAX_JOBS_PER_PROJECT

7e3d501

Max jobs per project

Digenis mentioned this pull request Apr 9, 2021

Project alive? #389

Closed

Digenis changed the title ~~Limits the maximum jobs per project~~ Per-project job limit Apr 13, 2021

jpmckinney mentioned this pull request Sep 23, 2021

Version 1.3 #364

Closed

jpmckinney modified the milestones: 1.3.0, 1.4.0 May 13, 2022

jpmckinney closed this Feb 2, 2023
