How could I reduce number of workers? #1598

Closed
ygbingo opened this issue Dec 17, 2021 · 1 comment · Fixed by #1744
Labels
enhancement Improvement on existing feature

Comments

ygbingo commented Dec 17, 2021

Can I reduce the number of workers?

I run doccano on my machine with these commands:

doccano init
doccano createuser ***
doccano webserver --port ***

And then I got this log:

Booting worker with pid: 19
Booting worker with pid: 20
...
Booting worker with pid: 157

It starts a lot of workers, which takes up a lot of memory. Can I change the number_of_workers variable? I saw that the default is number_of_workers = multiprocessing.cpu_count() * 2 + 1. How can I change it?
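To make the default above concrete, here is a minimal Python sketch of the formula and of one way a command-line override could work. It is not doccano's actual code: `default_workers`, `build_parser`, and the `--workers` flag name are assumptions made for illustration only.

```python
import argparse
import multiprocessing


def default_workers(cpu_count=None):
    """Return the default quoted in this issue: cpu_count * 2 + 1."""
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()
    return cpu_count * 2 + 1


def build_parser():
    # Hypothetical parser: the "--workers" flag name is an assumption,
    # not a confirmed part of the doccano CLI.
    parser = argparse.ArgumentParser(prog="webserver-sketch")
    parser.add_argument("--port", type=int, default=8000)
    parser.add_argument("--workers", type=int, default=default_workers())
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"would start {args.workers} workers on port {args.port}")
```

With this formula, a 4-core machine gets 9 workers. If the PIDs in the log were contiguous, 19 through 157 would be 139 workers, which is what a 69-CPU machine would produce, so an explicit override matters most on large hosts.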

Your Environment

  • Operating System: Linux
  • Python Version Used: Python 3.8
  • When you install doccano: 2021-11-30
  • How did you install doccano (Heroku button etc): pip install doccano

ygbingo commented Dec 22, 2021

When I tried to open a PR to solve this problem, I got this error:

fatal: 'origin' does not appear to be a git repository
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

😅

Hironsan added the enhancement (Improvement on existing feature) label Mar 17, 2022
Hironsan added a commit that referenced this issue Mar 17, 2022
ghontolux added a commit to ghontolux/doccano that referenced this issue Mar 28, 2022
* Rename RelationType to RelationTypeOld

* Add RelationType model

* Replace RelationTypeOld with RelationType in label_type app

* Rename Relation to RelationOld

* Add new Relation model

* Add data migration for new relation model

* Remove old models related to relation

* Rename RelationNew to Relation

* Simplify RelationList and RelationDetail APIs

* Update relation type urls

* Add a test case for relation type creation

* Add clean method to Relation

* Add use_relation field to SequenceLabelingProject

* Allow user to select relation labeling option

* Enable to create relation type in frontend

* Enable to show relation distribution

* Enable to label relation(first edition)

* Update the version of v-annotator

* Add LabelingMenu.vue

* Enable to update relation type

* Enable to add relation if relation type is not selected

* Add export catalog for relation extraction

* Add RelationExtractionRepository

* Add EntityAndRelationWriter

* Add factories for exporting relation data

* Rename link to relation

* Remove old entity components

* Rename SequenceLabelingLabel to Span

* Enable to reset selected entities

* Enable to highlight entity when it's selected

* Add cleanup after adding a relation

* Change switch label

* Remove direction field from Relation model

* Rename entity to span

* Update docker compose instruction, fix doccano#1601

* Update README.md to add documentation link

* Update default database location in cli

* Fix state of label selection on LabelingMenu for sequenceLabeling

Previously, labeling two entities with the same label by choosing it from the
autocomplete didn't work. Labeling the first entity worked, but when you
tried to label a new entity with the same label chosen from the
autocomplete, nothing happened.

* Fix LabelingMenu style

* Change the props of LabelDistribution.vue from colorMapping to labels

* Replace each distribution component with LabelDistribution.vue

* Change the prop name from labels to labelTypes

* Move the import button to the left

* Move the dataset import page to the dataset directory

* Move the dataset page after successful import

* Fix migration

* Black formatting

* Update migration when adding uuids

reference: https://docs.djangoproject.com/en/dev/howto/writing-migrations/#migrations-that-add-unique-fields

* Update docker documentation

* Rename parameter names in data_export

- id -> data_id
- format -> file_format

* Remove unused variables

* Remove duplicated zh

* Add data export page

* Add validate function to data import page

* Remove download dialog from dataset page

* Update tutorial document

* Specify project when retrieving labels

* Output log from gunicorn

* Add task images

* Update project creation form

* Add project creation page

* Remove creation feature from project index page

* Enable to create tags on project creation

* Update pypi workflow

* Update pypi workflow

* Upgrade npm packages

* Fix project update

* Update pypi workflow

* Update upgrade guide

* Add MyRole API

* Update pypi workflow to install poetry-dynamic-versioning

* Update LabelingMenu.vue, fix doccano#1736

* Replace git:// with https://

* fix: Replace git:// with https:// in yarn.lock

* Show id in database page, resolve doccano#1604

* Support workers option, resolve doccano#1598

* Bump waitress from 2.0.0 to 2.1.1 in /backend

Bumps [waitress](https://github.com/Pylons/waitress) from 2.0.0 to 2.1.1.
- [Release notes](https://github.com/Pylons/waitress/releases)
- [Changelog](https://github.com/Pylons/waitress/blob/master/CHANGES.txt)
- [Commits](Pylons/waitress@v2.0.0...v2.1.1)

---
updated-dependencies:
- dependency-name: waitress
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update minimist version

Co-authored-by: Hironsan <light.tree.1.13@gmail.com>
Co-authored-by: Hiroki Nakayama <hiroki.nakayama.py@gmail.com>
Co-authored-by: Jesse Hoobergs <jesse@codebergs.com>
Co-authored-by: Roland Szabo <rolisz@gmail.com>
Co-authored-by: youichiro <cinnamon416@gmail.com>
Co-authored-by: Gerhard Haß <gerhard.hass@neofonie.de>
Co-authored-by: Wojciech Kusa <wojciech.kusa@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
ghontolux added a commit to ghontolux/doccano that referenced this issue Apr 8, 2022
* added encoding format while opening of file

* Set nounset on Bash scripts (fixes doccano#860)

* fix empty export in entity-relationship-labeling

* show total progress if collaborative_annotation

* Update factory for export repository

* Add reduce_user to RelationExtractionRepository

* Set default value to IntentDetectionSlotFillingRepository

* Add reduce_user to IntentDetectionSlotFillingRepository

* Add test cases for TestTextClassificationRepository

* Add test cases for Seq2seqRepository

* Add test cases for Speech2textRepository

* Add test cases for FileRepository

* Enable to pass mypy

* Add test cases for Progress API

Co-authored-by: Hironsan <light.tree.1.13@gmail.com>
Co-authored-by: Hiroki Nakayama <hiroki.nakayama.py@gmail.com>
Co-authored-by: Jesse Hoobergs <jesse@codebergs.com>
Co-authored-by: Roland Szabo <rolisz@gmail.com>
Co-authored-by: youichiro <cinnamon416@gmail.com>
Co-authored-by: Wojciech Kusa <wojciech.kusa@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dhiraj Suvarna <dhiraj.suvarna@gmail.com>
Co-authored-by: Alexander Kurakin <kuraga333@mail.ru>
Co-authored-by: mkmark <mark@mkmark.net>
Co-authored-by: Gerhard Haß <gerhard.hass@neofonie.de>
ghontolux added a commit to ghontolux/doccano that referenced this issue Jun 2, 2022
* Add a setting for AWS

* Add a celery task to upload a file

* Specify storage to example model

* Pass save_names parameter to Writer

* Update DatasetImportAPI to upload a file

* Add GCP setting

* Rename imported file

* Hide query parameters on dataset page

* Enable to remove a file from storage on removing it in application

* iss1765: added confirmed_by field to ExampleStateSerializer

* Randomize saved file name

* Pass upload_ids to import_dataset

* Add upload file size limitation

* Ensure that the imported files will be uploaded after ingesting the data to the database

* Add post processing to the test cases for importing task

* Enable to check file type about image and audio files

* Add FileName class

* Add upload_name field to Example model

* Enable to store upload file name

* Update setting files

* Enable to specify .env file by CLI

* Add a document for cloud storage setup

* Remove unused import

* Use upload_name as an export file name

* Use upload_name in dataset page

* Replace url with fileUrl

* Move CORS option to base.py

* Update document for cloud storage setup

* Add tmp_file as a volume, fix doccano#1780

* Enable to export dataset when checked approved only

* Install pandas

* Add ExportedCategory model

* Add exported model manager

* Add labels for data export

* Add dataset class to represent export data

* Add JoinedCategoryFormatter

* Add formatters for category label

* Update constant name in data export

* Update export writers

* Enable to export dataset

* Add span model for export dataset

* Enable to accept multiple label types in export dataset

* Add test cases for exporting relation data

* Update factories

* Add test cases for export task

* Update typing to pass mypy

* Add test cases for exporting intent detection dataset

* Remove unused code

* Remove duplication from formatter

* Extract export catalog as files

* Change attribute name from field_name to column

* Remove ExportedLabelManager

* Add remove files function

* Refactor test_task.py

* Move filter_examples feature to ExportedExampleManager

* Add test cases for ExportedExample

* Add test cases for labels

* Add a test case for dataset

* Add test cases for formatters

* Add FastTextFormatter and FastTextWriter

* Add RenameFormatter

* Remove unused options from data export catalog

* Update factories of data export

* Rename exported dataset columns to match the imported dataset columns

* Rename parameter name

* Boost performance for data export

Remove is_text_project from ExportedExample because this queries db every time

* Replace zip_files with shutil.make_archive

* Remove duplication from export task

* Bump django from 4.0.2 to 4.0.4 in /backend

Bumps [django](https://github.com/django/django) from 4.0.2 to 4.0.4.
- [Release notes](https://github.com/django/django/releases)
- [Commits](django/django@4.0.2...4.0.4)

---
updated-dependencies:
- dependency-name: django
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Remove unused shell script

* Remove unused component: ChartDoughnut.vue

* Update lint rules and fix files

* Add a frontend job to workflow

* Add hadolint as a workflow

* Remove examples.py from data_import

* Add uuid field to BaseData

* Move examples methods to BulkWriter

* Replace Writer with BulkWriter

* Change the location of applying cleaner from writer to reader

* Remove Examples class

* Add LabeledExamples class

* Move bulk_create code to LabeledExamples

* Add RelationLabel class

* Add example import data for relation extraction

* Add relation extraction to import catalog

* Update import factory for relation extraction

* Add uuid field to label models

* Enable to import relation extraction dataset

* Fix the bug of RelationExamples

* Rename FileData to BinaryData

* Remove unnecessary build_data and build_label

* Divide LabeledExamples into subclasses

* Change Reader design: return dictionary and use DataFrame

* Add LabelFormatter to extract label dataframe

* Add DataFormatter to extract label dataframe

* Add formatter factories

* Add uuid field to the return value of reader

* Return uuid from formatters

* Add label classes and their test cases for data import

* Add label types and their test cases for data import

* Remove unused files

* Add test cases for data classes

* Add makers for data import

* Change method name from create to save

* Add type hint to marker

* Remove writers

* Remove unused factories for data import

* Simplify parse method in label

* Simplify readers

* Return line number from parser

* Add first class collection for label

* Add DummyLabelType model for data import

* Add labels test cases for data_import

* Avoid save labels if there is no corresponding example

* Avoid removing overlapping spans with another example uuids

* Add not empty validator to TextLabel

* Remove uploaded files when FileTypeException is raised

* Add BinaryExampleMaker to handle image and audio data

* Add datasets for import

* Add FileImportException class

* Change the default value of column data and column label

* Update frontend to handle relation dataset import

* Add plain dataset

* Apply black

* Fix relation export

* Update default celery setting to production

* Update pypi workflow

* Update create-package.sh

* Add developer guide to docs

* Change hyphen to underscore

* Add non negative constraint to span offset

* Add non empty constraint to label names

* Replace get_by_text with __getitem__

* Replace file_format string with Format class

* Update .gitignore

* Update developer guide

* Add prettier config

* Apply prettier

* Add prettier to the workflow

* Remove image caption page

* Fix problems after applying prettier

* remove print statements

* fixed data import for entity linking

* fixed data export

Co-authored-by: Hironsan <light.tree.1.13@gmail.com>
Co-authored-by: Hiroki Nakayama <hiroki.nakayama.py@gmail.com>
Co-authored-by: Jesse Hoobergs <jesse@codebergs.com>
Co-authored-by: Roland Szabo <rolisz@gmail.com>
Co-authored-by: youichiro <cinnamon416@gmail.com>
Co-authored-by: Wojciech Kusa <wojciech.kusa@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dhiraj Suvarna <dhiraj.suvarna@gmail.com>
Co-authored-by: Pisanu Federico <federico.pisanu@gruppomol.it>
Co-authored-by: Alexander Kurakin <kuraga333@mail.ru>
Co-authored-by: mkmark <mark@mkmark.net>
Co-authored-by: Gerhard Haß <gerhard.hass@neofonie.de>