feat: image captioning annotator (deeppavlov#197)
* Fix requirements.txt (#84)

* fix itsdangerous requirements

* pin itsdangerous requirements for all flask==1.1.1 servers

* update (#2)

* fix/slot extr conf (#156)

* Fix requirements.txt (#84)

* fix itsdangerous requirements

* pin itsdangerous requirements for all flask==1.1.1 servers

* fix slot extraction confidence

Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>
Co-authored-by: Dilyara Baymurzina <dilyara.rimovna@gmail.com>

* Fix/simplier skill selection (#159)

* feat: simpler skill selector

* fix: refactor skill selector

* fix: codestyle

* fix: get detected_topics

* fixes (#160)

* Feat/transformers intent catcher (#135)

* feat: train intent catcher

* feat: intent_catcher transformers train and use

* feat: intent_catcher transformers train and use

* feat: intent_catcher server and test

* fix: model to cuda

* fix: cuda is available

* fix: paths

* fix: ignore index

* fix: black style

* fix: paths

* fix: paths

* fix:model modes

* fix: load dataset

* fix: no extra info

* fix: no extra info

* fix: params

* fix: paths

* fix: paths

* fix: paths

* fix: dockerfile and downloading

* fix: black style

* fix: extra imports

* fix: dockerfile

* fix: paths and reqs

* fix: paths

* fix: lineterminator

* fix: paths to save model

* fix: paths to save model

* fix: paths

* fix: sentences

* fix: test file

* fix: working version

* fix: working

* fix info

* files

* fix: gpu for tests

* fix: gpu distr

* fix: codestyle

* fix: codestyle

* fix: friendship fallback

* fix: codestyle

* fix: book skill start if lets chat about books

* fix: no repeat

* fix book turn on

* fix: remove extra comments

* fix: some changes

* fix: use dp based model

* fix: random phrases too

* fix: dockerfile

* fix: after review

* fix: codestyle

* fix: yml configs

* fix: add files for intent catcher

* fix: codestyle

* fix: nvidia error fix

* fix: upd config

* fix nvidia keyring using wget

* fix: working version

* fix: add files

* file path in dockerfile

* fix: order of init

* fix: codestyle

* fix: upd intent catcher version v2

* fix: revert to prev version

* fix: tests for upd version

* correct scores

Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>

* fix:  proxy dialogpt (#164)

* Dream mini (#161)

* Fix requirements.txt (#84)

* fix itsdangerous requirements

* pin itsdangerous requirements for all flask==1.1.1 servers

* Add mini version of Dream

* Update cpu.yml

* Update pipeline_conf.json

* Update proxy.yml

* fix: cpu only for existing components

* fix: cpu only intent catcher

Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>
Co-authored-by: Dilyara Baymurzina <dilyara.rimovna@gmail.com>

* Feat/infilling: not used yet (#163)

* Fix requirements.txt (#84)

* fix itsdangerous requirements

* pin itsdangerous requirements for all flask==1.1.1 servers

* infilling added (not tested)

* files moved, some paths fixed

* [DGM-49] path to model fixed, test added, seems working

* takes a batch, bigger test added

* assert added to test

* assert added to test

* minor changes

* fix: codestyle

* fix: proxy pass

* fix: yml configs

* fix: refactor infilling and usage

* fix: paths

* fix: dockerfile

* fix: upd files

* fix: working version

* fix: codestyle

* fix: codestyle

* fix: works on gpu

* fix: readme

Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>
Co-authored-by: Dilyara Baymurzina <dilyara.rimovna@gmail.com>

* Feat/update dialogpt (#170)

* feat: update dialogpt

* fix: codestyle

* fix: book skill false start

* Feat/parametrize response selector (#169)

* feat: parameterization in progress

* fix: some progress

* fix: parametrised

* fix: parametrised resp selector

* feat: confs for intent responder

* working version

* fix: black style

* fix: codestyle

* fix: resp selector

* fix: dialogpt params

* fix: one more param

* fix: codestyle

* fix: params

* fix: revert dialogpt

* feat: change params of dialogpt (#172)

* Fix/protobuf version (#173)

* fix: fix protobuf version for sentseg

* fix: ner protobuf

* feat: first russian dream (#176)

* feat: docker compose with main components

* Feat/tests russian (#90)

* feat: runtests russian

* fix: test file and elements

* feat: tests russian in jenkinsfile

* feat: files for tests

* fix: program-y name

* fix: change to dff-intent-responder-skill

* fix: sync with agent updates

* fix: cleanup for both runtests

* fix: fix path to pipeline conf

* fix: remove ner from tests

* fix: unbuild english bot before russian tests

* fix: codestyle

Co-authored-by: Fedor Ignatov <ignatov.fedor@gmail.com>

* fix: no sentrewrite needed

* Feat/ru program y (#88)

* feat: ru program-y version

* fix: variable name

* fix: russian tests

* feat: test files

* fix: dff program-y skill for russian

* fix: dff program-y skill for russian docker compose

* fix: dff program-y skip eng tests

* fix: logs

* fix: add variable env

* fix: revert dangerous skill

* fix: type

* fix: imports types

* fix: line buffering

* fix: default value

* fix: tests

* fix: program-y patterns

* Feat/spellchecker levenstein ru (#89)

* feat: add files with correct rights

* feat: spell check ru

* fix: add dockerfile path

* fix: add  commit

* feat: new files

* fix: config name

* fix: config address

* fix: config as a file

* fix: config title

* fix: consider list not sample

* fix: test fix

* fix: test codestyle

* fix: levenstein tests

* fix: levenstein limit memory

* fix: levenshtein spelling

* fix: mapping for spelling

* feat: batch processing

* Feat/ru badwords (#93)

* feat: russian obscene words

* fix: badlist ru named as en

* fix: badlist language

* fix: badlist tests passing

* Feat/dummy skill ru (#94)

* fix: russian dummy responses for russian letters in human utterance

* fix: codestyle

* fix: black

* Feat/ner russian (#92)

* feat: ner config

* feat: files for ner ru

* feat:  ner model

* feat: ner integration

* fix: format yml config

* fix: format dockerfile

* fix: path to data

* fix: tests for ner russian

* fix: codestyle

* fix: update ner version

* add russian entity detection

* add russian entity linking

* Update requirements.txt

* Update ner_chunker.py

* fix: rus entity detection tests (#96)

* fix: rus entity detection tests

* black codestyle

* fix codestyle

* fix codestyle

* fix bug

* codestyle

* codestyle

* codestyle

Co-authored-by: dmitry <dmitrij.euseew@yandex.ru>

* Feat/intent catcher Ru based on multi lingual USE (#98)

* fix: intent catcher params and paths

* fix: paths in dockerfile

* fix: intent ru phrases without random ones

* fix: random intent phrases

* fix: intent training params

* fix: intent requirements

* fix: intent requirements

* fix: download model

* fix: model which to download

* fix: imports for correct work

* fix: corrected phrases

* fix: corrected phrases

* fix: corrected phrases

* fix: corrected phrases

* fix: corrected phrases

* fix: correct path to save json threshold

* feat: intent data ru json

* fix: correct path to save tests

* fix: existing var

* fix: regular phrases

* fix: next test

* fix: training logs and new threshold

* fix: training logs and new thre change phrases

* fix: change regexps

* fix: change thresholds

* fix: new template for intent phrases

* fix: tests ru

* feat: upd model

* fix: upd logs of training, upd conf value

* fix: punctuation

* fix: punctuation

* feat: upd model

* fix: training logs

* fix: tests

* fix: phrases for opinion

* feat: upd model

* feat: training logs

* feat: upd model

* fix: tests

* fix: remove opinion request intent

* feat: upd model

* feat: upd model

* fix: new train logs

* fix: new phrases

* fix: min precision for intent

* fix: lower boundary

* fix: usage of lib

* fix: codestyle

* feat: add itsdangerous requirements

* fix: spelling preproc endpoint

* Feat/dialogpt ru and dff-generative-skill (#97)

* Fix requirements.txt (#84)

* feat: initialize dialogpt_RU

* feat: files init

* feat: basic integration of dialogpt_RU

* fix: rename dialogpt

* fix: dialogpt to device

* fix: dialogpt final version

* fix: dialogpt test

* fix: dialogpt test

* fix: dialogpt  resources consumption

* fix: dialogpt  to tests

* feat: dff generative skill

* feat: dff generative skill

* fix: remove extra files

* fix: input to dialogpt

* fix: input to dialogpt

* fix: logging

* fix: turn on tests

* fix: get dialog from context

* fix: get uttrs from context

* fix: get empty uttrs

* fix: return empty resp

* fix: test file

* fix: tests

* fix: test ratio

* add speech_function_* dist

* add speech_function_* dist readme

* added sf_functions

* fix ports

* fix: codestyle

* fix deployment config

* fix: tests for generative skill

* fix: codestyle

* add formatters, fix pipeline

* update speech function * sources

* fix: check if dialogpt is ready

* fix: wait services

* rename book skill

* remove old book skill, update usages

* fix readme

* fix codestyle

* fix codestyle

* fix codestyle

* fix codestyle line length

* move res_cor.json to shared files

* fix itsdangerous requirements

* pin itsdangerous requirements for all flask==1.1.1 servers

* fix cpu.yml, dockerfiles and test for sfc, sfp

* fix codestyle issues

* blacked with -l 120

* following Dilya's holy orders

* following Dilya's not so holy orders

* fix formatters

* fix pipeline

* fix pipeline and formatters

* Adding timeouts + mapping of book skill

* removed old & irrelevant tests

* we've set confidence to super level

* feat: midas cls sent tokenize only if needed (#101)

* feat: midas cls sent tokenize only if needed

* feat: take into account tokenized uttrs by bot

* fix: codestyle

* fix: itsdangerous reqs

* fix: docker reqs

* fix: check another container

* fix: rights for file

* fix: codestyle

* fix: return tests for intent responder

* fix: revert intent responder

* fix: review fixes

* fix: codestyle

Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>
Co-authored-by: Daniel Kornev <daniel@zetuniverse.com>

* fix: remove convert and sentseg for now

* Feat/dff-intent-responder-skill ru (#99)

* feat: prepare new intent responder

* fix: responses for intent responder ru

* fix: test based on language

* fix: path to intent response phrases

* fix: remove convert and sentseg

* fix: another gpus

* fix: file path and logs

* fix: env and logs for intent responder

* fix: exit response

* fix: choose_topic to low priority intents

* feat: tests for ru

* fix: tests for exit ru

* fix: black codestyle

* fix: tests for intent catcher en

* fix: remove convert and sentseg from tests

* feat: turn on generative skill

* Feat/wiki parser RU (#114)

* update

* codestyle

* add language parameter

* fix: language arg

* fix: language arg and revert generative in dockercompose

* fix tests

* codestyle

* fix: tests for ru

* fix: language value

* fix: ru test results

* fix: test pipe

* fix: sort types_2hop

* fix: black codestyle

* fix: tests for en wiki

* fix: quotes

* fix: codestyle

* fix: sort objects

* fix: test for wiki parser

* fix: codestyle

Co-authored-by: dilyararimovna <dilyara.rimovna@gmail.com>

* Feat/ru friendship skill (#120)

* feat: add language parameters

* fix: black codestyle

* fix: codestyle

* fix: dff friendship ru

* fix: dff friendship ru

* fix: dff friendship ru

* fix: dff friendship ru, shortened utterances

* fix: dff friendship tests

* fix: dff friendship tests

* fix: language for wiki

* fix: language default value

* fix: language default value

* fix: language env var

* fix: use templates by language

* fix: ru templates

* fix: no lang env var in common

* fix: lang to ackn

* fix: codestyle

* feat: default lang value

* fix: dummy for russian

* fix: no en acks

* fix: how are you ru

* fix: logs for response functions

* fix: logs for condition functions

* fix: ru version if what to talk about

* feat: ru tests

* fix: codestyle

* fix: ru condition to resp selector

* fix: ru condition to resp selector

* fix: logging level and configuration

* fix:  ascii in tests

* fix: add 'user' to dff input

* fix: add language env variable everywhere

* Feat/dialogrpt ru (#121)

* fix: file drafts

* feat: files for dialogrpt

* feat: dialogrpt pipeline and scores

* feat: dialogrpt pipeline and scores

* feat: dialogrpt readme

* fix: small readme

* fix: no healthcheck

* feat: add dialogrpt to pipeline

* fix: codestyle

* fix: test files

* feat: upd packages in dockerfile

* fix: path to file

* fix: shared file

* fix: codestyle

* fix: imports

* fix: option consider

* fix: option consider

* fix: codestyle

* fix: vars

* fix: test file

* fix: convert to list predictions

* fix: tests

* fix: codestyle

* fix: codestyle

* fix: codestyle

* fix: readme

* fix: dialogrpt to tests

* feat: no extra files, add tokenizer as parameter

* fix: codestyle

* fix: var name

* fix: batch prediction

* fix: batch prediction parameter

* fix: test choice

* fix: format values

* fix: codestyle

* fix: upd deeppavlov download

* fix: dialogrpt container name

* fix: dialogrpt as hyp annotator

* fix: dialogrpt test

* Feat/ru personal info (#125)

* fix: ignorecase and no text in code

* fix:  russian in patterns

* fix: language env var

* fix: russian patterns and responses

* fix: russian patterns and responses

* fix: path to file

* fix: test for new version

* fix: test for en

* fix: codestyle

* fix: f placeholders

* fix: format usage

* fix: codestyle

* fix: logs

* fix: my name is not

* fix: homeland pattern fixes

* fix: my name is not function

* fix: more logs

* fix: fix my name is not function

* fix: my name is not

* fix: do you know my name

* fix: test format

* fix: test format

* fix: test format and more tests

* fix: test format

* fix: more tests

* fix: more tests

* fix: test format prints

* fix: black

* fix: en tests

* fix: en tests

* fix: en tests

* fix: en tests

* fix entity detection (#127)

* Feat/spacy lemmatizer (#129)

* fix: add spacy annotator

* fix: usage of spacy attributes

* fix: test spacy annotator

* fix: add params

* fix: add params

* fix: fix test

* fix: rights on file

* fix: codestyle

* fix: extra f string

* Feat/russian sentseg (#128)

* feat: basic config (with no changes)

* feat: data preproc

* feat: data processing

* fix: codestyle

* fix: sentseg ru like dp.ner_rus config

* fix: rename config

* fix: fpath

* fix: readme

* fix: custom sentseg reader

* fix: custom sentseg config

* feat: sent segmentation

* feat: sent segmentation tests

* fix: rights on file

* fix: codestyle

* fix:  data preproc in sentseg_ru too

* fix: metric values for sentseg trained on ru subtitles

* fix: path to sentseg to download

* fix: use sentseg ru model

* fix: rights for file

* fix: newer spacy version

* fix: newer deeppavlov version

* fix: reqs

* fix: server

* feat: new config for bert model

* fix: upd sentseg config

* fix: upd sentseg config

* fix: remove old config

* fix: config path

* fix: deeppavlov 17 2

* fix: remove extra import

* fix: new docker image base

* fix: reinstall spacy

* fix: resentseg tests

* fix: codestyle

* fix: docs

* fix: add sentseg to tests

* fix: dockerfile

* fix: model path

* fix: add dialogrpt to wait hosts

* fix: more complicated test for badwords annotator

* Fix/upd badlisted words (#130)

* fix: more complicated test for badwords annotator

* fix: revert badlisted en words

* fix: russian badlisted words

* fix: give tokenized sents after spacy

* fix: ru badlisted words

* fix: ru badlisted words folder

* fix: ru badlisted words get data

* fix: test file

* fix: ru badlisted words tokenized sent

* fix: ru badlisted words tokens

* fix: codestyle

* fix: revert badlisted to dev

* fix: pipeline conf post_skill_selector_annotators

* fix: sleep before re try to connect to dialogpt

* fix: formatter format

* fix: more russian badwords

* fix: correct endpoint for spacy annotator

* Feat/ru random questions (#131)

* feat: random russian questions

* feat: dummy provides random russian questions

* fix: refactor questions

* fix: add pre-dummy phrase

* fix: add pre-dummy phrase

* fix: codestyle

* fix: path to file

* fix: strip russian questions

* fix: last chance response

* fix: documentation

* fix: more confident generative skill

* fix: dummy response always available

* fix: intent responder check if exist

* fix: most dummy responses language based

* fix: remove punctuation if present

* fix: documentation

* fix: documentation

* fix: new limits for russian baseline

* fix: dialogrpt scores as conveval

* fix: sentseg ru remove commas

* fix: no wiki-skill yet

* fix:  ner no threads

* fix: can add prompt

* fix: prompt with conf

* fix: remove bad questions

* fix: add punctuation to generated hyp

* fix: remove quotes

* fix: re-choose hyp only for en version

* fix: dff-generative is aka script

* fix: increase intent conf thresholds

* fix: store only tokens for hyps

* fix: consider only special intents

* fix: codestyle

* fix: final fixes, resp selector and intent thresholds

* fix: more obscene words

* fix: Russian documentation

* fix: image in docs

* fix: questions

* fix: bad words

* feat: ru toxic classifier

* fix: toxic check batch hypotheses too

* fix: intent responder uses lang

* fix

* fix: correct usage of human bot utterances

* fix: return 5 hypotheses

* fix: more hyps, fix reqs

* fix: black codestyle

* fix: codestyle

* fix: codestyle

* feat: response selector uses params

* fix: requirements

* fix: requirements

* fix: remove dialogpt prev ru

* fix: requirements

* fix: add dialogrpt again

* fix: add dialogrpt

* fix: add dialogpt ru

* fix: requirements for dialogpt and dialogrpt

* fix: return pymorphy to reqs

* Feat/ru intent catcher transformers (#171)

* fix: intent catcher ru transformers

* fix: ru intent catcher

* fix: intent catcher updated

* fix: INTENT_PHRASES_PATH as a main variable

* fix: dockerfile updates

* fix: test gpu

* fix: black style

* fix: add tests files

* fix: tests

* fix: rights on file

* fix: rights on file

* fix: rights

* fix: numb hyps

* fix: remove without threads

* fix: documentation

* fix: add LET_ME_ASK_YOU_PHRASES

* fix: black style

* fix: revert extra files

* fix: dream mini uses the same params

* fix: generative default response

* fix: incase of no gpu

* fix: resources and gpus consumption

* fix: new image

* fix: add prompt ones

* fix: ru and en version sentsegs

Co-authored-by: Fedor Ignatov <ignatov.fedor@gmail.com>
Co-authored-by: Дмитрий Евсеев <dmitrij.euseew@yandex.ru>
Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>
Co-authored-by: Daniel Kornev <daniel@zetuniverse.com>

* fix: proxy usage command (#183)

* Feat/multilingual ner (#186)

* feat: ner multilingual case_agnostic

* fix: ner config

* fix: ner dockerfile

* fix: upd config

* fix: config for ner multilingual

* feat: updated config

* feat: working ner multilingual

* fix: codestyle

* feat: upd spellcheck

* fix: add cuda visible devices

* fix: cuda visible devices

* update fact-retrieval and text-qa (#168)

* update fact retrieval

* update squad

* add answer sentence

* update

* fixes

* update formatter

* fixes

* add logit ranker

* codestyle

* codestyle

* fixes

* codestyle

* fix tests

Co-authored-by: Дмитрий Евсеев <dmitrij.euseew@yandex.ru>

* feat: upd dp-ner with extended version (#189)

* feat: upd dp-ner with extended version

* fix: upd tests

* fix: working for tags

* fix: codestyle

* fix: user new model

* feat: working

* fix: config

* fix: upd ner dockerfile

* fix: revert format list

* fix: change ner for all dists

* fix: upd dialogpt en params (#190)

* fix: upd dialogpt en params

* fix: black style

* fix: upd params

* fix: context format

* fix: context format

* fix: codestyle

* docker fixes for hydra configuration poc (#34)

* docker fixes hydra configuration poc

* fix agent installation

* fix dp-agent commit in dockerfile_agent

* Fix requirements.txt (#84)

* update pr against the new main branch

* fix itsdangerous requirements

* pin itsdangerous requirements for all flask==1.1.1 servers

* minimal reproducible example for new dream

* add pem files to gitignore, small agent docker fix

* change commit, remove copy settings

* fix agent command in base compose file

* fix agent installation

* fix agent command in other dists

* fix commands in readme, add telegram section

* update en and ru readme

Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: Dilyara Baymurzina <dilyara.rimovna@gmail.com>

Co-authored-by: dmitrijeuseew <dmitrij.euseew@yandex.ru>
Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>
Co-authored-by: Dilyara Baymurzina <dilyara.rimovna@gmail.com>
Co-authored-by: Olga Sofronova <60696748+olkaso@users.noreply.github.com>
Co-authored-by: Fedor Ignatov <ignatov.fedor@gmail.com>
Co-authored-by: Daniel Kornev <daniel@zetuniverse.com>
Co-authored-by: zucchini-nlp <100715397+zucchini-nlp@users.noreply.github.com>

* image captioning

* image captioning

* Image captioning (#4)

* fix/slot extr conf (#156)

* Fix requirements.txt (#84)

* fix itsdangerous requirements

* pin itsdangerous requirements for all flask==1.1.1 servers

* fix slot extraction confidence

Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>
Co-authored-by: Dilyara Baymurzina <dilyara.rimovna@gmail.com>

* Fix/simplier skill selection (#159)

* feat: simpler skill selector

* fix: refactor skill selector

* fix: codestyle

* fix: get detected_topics

* fixes (#160)

* Feat/transformers intent catcher (#135)

* feat: train intent catcher

* feat: intent_catcher transformers train and use

* feat: intent_catcher transformers train and use

* feat: intent_catcher server and test

* fix: model to cuda

* fix: cuda is available

* fix: paths

* fix: ignore index

* fix: black style

* fix: paths

* fix: paths

* fix:model modes

* fix: load dataset

* fix: no extra info

* fix: no extra info

* fix: params

* fix: paths

* fix: paths

* fix: paths

* fix: dockerfile and downloading

* fix: black style

* fix: extra imports

* fix: dockerfile

* fix: paths and reqs

* fix: paths

* fix: lineterminator

* fix: paths to save model

* fix: paths to save model

* fix: paths

* fix: sentences

* fix: test file

* fix: working version

* fix: working

* fix info

* files

* fix: gpu for tests

* fix: gpu distr

* fix: codestyle

* fix: codestyle

* fix: friendship fallback

* fix: codestyle

* fix: book skill start if lets chat about books

* fix: no repeat

* fix book turn on

* fix: remove extra comments

* fix: some changes

* fix: use dp based model

* fix: random phrases too

* fix: dockerfile

* fix: after review

* fix: codestyle

* fix: yml configs

* fix: add files for intent catcher

* fix: codestyle

* fix: nvidia error fix

* fix: upd config

* fix nvidia keyring using wget

* fix: working version

* fix: add files

* file path in dockerfile

* fix: order of init

* fix: codestyle

* fix: upd intent catcher version v2

* fix: revert to prev version

* fix: tests for upd version

* correct scores

Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>

* fix:  proxy dialogpt (#164)

* Dream mini (#161)

* Fix requirements.txt (#84)

* fix itsdangerous requirements

* pin itsdangerous requirements for all flask==1.1.1 servers

* Add mini version of Dream

* Update cpu.yml

* Update pipeline_conf.json

* Update proxy.yml

* fix: cpu only for existing components

* fix: cpu only intent catcher

Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>
Co-authored-by: Dilyara Baymurzina <dilyara.rimovna@gmail.com>

* Feat/infilling: not used yet (#163)

* Fix requirements.txt (#84)

* fix itsdangerous requirements

* pin itsdangerous requirements for all flask==1.1.1 servers

* infilling added (not tested)

* files moved, some paths fixed

* [DGM-49] path to model fixed, test added, seems working

* takes a batch, bigger test added

* assert added to test

* assert added to test

* minor changes

* fix: codestyle

* fix: proxy pass

* fix: yml configs

* fix: refactor infilling and usage

* fix: paths

* fix: dockerfile

* fix: upd files

* fix: working version

* fix: codestyle

* fix: codestyle

* fix: works on gpu

* fix: readme

Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>
Co-authored-by: Dilyara Baymurzina <dilyara.rimovna@gmail.com>

* Feat/update dialogpt (#170)

* feat: update dialogpt

* fix: codestyle

* fix: book skill false start

* Feat/parametrize response selector (#169)

* feat: parameterization in progress

* fix: some progress

* fix: parametrised

* fix: parametrised resp selector

* feat: confs for intent responder

* working version

* fix: black style

* fix: codestyle

* fix: resp selector

* fix: dialogpt params

* fix: one more param

* fix: codestyle

* fix: params

* fix: revert dialogpt

* feat: change params of dialogpt (#172)

* Fix/protobuf version (#173)

* fix: fix protobuf version for sentseg

* fix: ner protobuf

* feat: first russian dream (#176)

* feat: docker compose with main components

* Feat/tests russian (#90)

* feat: runtests russian

* fix: test file and elements

* feat: tests russian in jenkinsfile

* feat: files for tests

* fix: program-y name

* fix: change to dff-intent-responder-skill

* fix: sync with agent updates

* fix: cleanup for both runtests

* fix: fix path to pipeline conf

* fix: remove ner from tests

* fix: unbuild english bot before russian tests

* fix: codestyle

Co-authored-by: Fedor Ignatov <ignatov.fedor@gmail.com>

* fix: no sentrewrite needed

* Feat/ru program y (#88)

* feat: ru program-y version

* fix: variable name

* fix: russian tests

* feat: test files

* fix: dff program-y skill for russian

* fix: dff program-y skill for russian docker compose

* fix: dff program-y skip eng tests

* fix: logs

* fix: add variable env

* fix: revert dangerous skill

* fix: type

* fix: imports types

* fix: line buffering

* fix: default value

* fix: tests

* fix: program-y patterns

* Feat/spellchecker levenstein ru (#89)

* feat: add files with correct rights

* feat: spell check ru

* fix: add dockerfile path

* fix: add  commit

* feat: new files

* fix: config name

* fix: config address

* fix: config as a file

* fix: config title

* fix: consider list not sample

* fix: test fix

* fix: test codestyle

* fix: levenstein tests

* fix: levenstein limit memory

* fix: levenshtein spelling

* fix: mapping for spelling

* feat: batch processing

* Feat/ru badwords (#93)

* feat: russian obscene words

* fix: badlist ru named as en

* fix: badlist language

* fix: badlist tests passing

* Feat/dummy skill ru (#94)

* fix: russian dummy responses for russian letters in human utterance

* fix: codestyle

* fix: black

* Feat/ner russian (#92)

* feat: ner config

* feat: files for ner ru

* feat:  ner model

* feat: ner integration

* fix: format yml config

* fix: format dockerfile

* fix: path to data

* fix: tests for ner russian

* fix: codestyle

* fix: update ner version

* add russian entity detection

* add russian entity linking

* Update requirements.txt

* Update ner_chunker.py

* fix: rus entity detection tests (#96)

* fix: rus entity detection tests

* black codestyle

* fix codestyle

* fix codestyle

* fix bug

* codestyle

* codestyle

* codestyle

Co-authored-by: dmitry <dmitrij.euseew@yandex.ru>

* Feat/intent catcher Ru based on multi lingual USE (#98)

* fix: intent catcher params and paths

* fix: paths in dockerfile

* fix: intent ru phrases without random ones

* fix: random intent phrases

* fix: intent training params

* fix: intent requirements

* fix: intent requirements

* fix: download model

* fix: model which to download

* fix: imports for correct work

* fix: corrected phrases

* fix: corrected phrases

* fix: corrected phrases

* fix: corrected phrases

* fix: corrected phrases

* fix: correct path to save json threshold

* feat: intent data ru json

* fix: correct path to save tests

* fix: existing var

* fix: regular phrases

* fix: next test

* fix: training logs and new threshold

* fix: training logs and new thre change phrases

* fix: change regexps

* fix: change thresholds

* fix: new template for intent phrases

* fix: tests ru

* feat: upd model

* fix: upd logs of training, upd conf value

* fix: punctuation

* fix: punctuation

* feat: upd model

* fix: training logs

* fix: tests

* fix: phrases for opinion

* feat: upd model

* feat: training logs

* feat: upd model

* fix: tests

* fix: remove opinion request intent

* feat: upd model

* feat: upd model

* fix: new train logs

* fix: new phrases

* fix: min precision for intent

* fix: lower boundary

* fix: usage of lib

* fix: codestyle

* feat: add itsdangerous requirements

* fix: spelling preproc endpoint

* Feat/dialogpt ru and dff-generative-skill (#97)

* Fix requirements.txt (#84)

* feat: initialize dialogpt_RU

* feat: files init

* feat: basic integration of dialogpt_RU

* fix: rename dialogpt

* fix: dialogpt to device

* fix: dialogpt final version

* fix: dialogpt test

* fix: dialogpt test

* fix: dialogpt  resources consumption

* fix: dialogpt  to tests

* feat: dff generative skill

* feat: dff generative skill

* fix: remove extra files

* fix: input to dialogpt

* fix: input to dialogpt

* fix: logging

* fix: turn on tests

* fix: get dialog from context

* fix: get uttrs from context

* fix: get empty uttrs

* fix: return empty resp

* fix: test file

* fix: tests

* fix: test ratio

* add speech_function_* dist

* add speech_function_* dist readme

* added sf_functions

* fix ports

* fix: codestyle

* fix deployment config

* fix: tests for generative skill

* fix: codestyle

* add formatters, fix pipeline

* update speech function * sources

* fix: check if dialogpt is ready

* fix: wait services

* rename book skill

* remove old book skill, update usages

* fix readme

* fix codestyle

* fix codestyle

* fix codestyle

* fix codestyle line length

* move res_cor.json to shared files

* fix itsdangerous requirements

* pin itsdangerous requirements for all flask==1.1.1 servers

* fix cpu.yml, dockerfiles and test for sfc, sfp

* fix codestyle issues

* blacked with -l 120

* following Dilya's holy orders

* following Dilya's not so holy orders

* fix formatters

* fix pipeline

* fix pipeline and formatters

* Adding timeouts + mapping of book skill

* removed old & irrelevant tests

* we've set confidence to super level

* feat: midas cls sent tokenize only if needed (#101)

* feat: midas cls sent tokenize only if needed

* feat: take into account tokenized uttrs by bot

* fix: codestyle

* fix: itsdangerous reqs

* fix: docker reqs

* fix: check another container

* fix: rights for file

* fix: codestyle

* fix: return tests for intent responder

* fix: revert intent responder

* fix: review fixes

* fix: codestyle

Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>
Co-authored-by: Daniel Kornev <daniel@zetuniverse.com>

* fix: remove convert and sentseg for now

* Feat/dff-intent-responder-skill ru (#99)

* feat: prepare new intent responder

* fix: responses for intent responder ru

* fix: test based on language

* fix: path to intent response phrases

* fix: remove convert and sentseg

* fix: another gpus

* fix: file path and logs

* fix: env and logs for intent responder

* fix: exit response

* fix: choose_topic to low priority intents

* feat: tests for ru

* fix: tests for exit ru

* fix: black codestyle

* fix: tests for intent catcher en

* fix: remove convert and sentseg from tests

* feat: turn on generative skill

* Feat/wiki parser RU (#114)

* update

* codestyle

* add language parameter

* fix: language arg

* fix: language arg and revert generative in dockercompose

* fix tests

* codestyle

* fix: tests for ru

* fix: language value

* fix: ru test results

* fix: test pipe

* fix: sort types_2hop

* fix: black codestyle

* fix: tests for en wiki

* fix: quotes

* fix: codestyle

* fix: sort objects

* fix: test for wiki parser

* fix: codestyle

Co-authored-by: dilyararimovna <dilyara.rimovna@gmail.com>

* Feat/ru friendship skill (#120)

* feat: add language parameters

* fix: black codestyle

* fix: codestyle

* fix: dff friendship ru

* fix: dff friendship ru

* fix: dff friendship ru

* fix: dff friendship ru shortened replies

* fix: dff friendship tests

* fix: dff friendship tests

* fix: language for wiki

* fix: language default value

* fix: language default value

* fix: language env var

* fix: use templates by language

* fix: ru templates

* fix: no lang env var in common

* fix: lang to ackn

* fix: codestyle

* feat: default lang value

* fix: dummy for russian

* fix: no en acks

* fix: how are you ru

* fix: logs for response functions

* fix: logs for condition functions

* fix: ru version if what to talk about

* feat: ru tests

* fix: codestyle

* fix: ru condition to resp selector

* fix: ru condition to resp selector

* fix: logging level and configuration

* fix:  ascii in tests

* fix: add 'user' to dff input

* fix: add language env variable everywhere

* Feat/dialogrpt ru (#121)

* fix: file drafts

* feat: files for dialogrpt

* feat: dialogrpt pipeline and scores

* feat: dialogrpt pipeline and scores

* feat: dialogrpt readme

* fix: small readme

* fix: no healthcheck

* feat: add dialogrpt to pipeline

* fix: codestyle

* fix: test files

* feat: upd packages in dockerfile

* fix: path to file

* fix: shared file

* fix: codestyle

* fix: imports

* fix: option consider

* fix: option consider

* fix: codestyle

* fix: vars

* fix: test file

* fix: convert to list predictions

* fix: tests

* fix: codestyle

* fix: codestyle

* fix: codestyle

* fix: readme

* fix: dialogrpt to tests

* feat: no extra files, add tokenizer as parameter

* fix: codestyle

* fix: var name

* fix: batch prediction

* fix: batch prediction parameter

* fix: test choice

* fix: format values

* fix: codestyle

* fix: upd deeppavlov download

* fix: dialogrpt container name

* fix: dialogrpt as hyp annotator

* fix: dialogrpt test

* Feat/ru personal info (#125)

* fix: ignorecase and no text in code

* fix:  russian in patterns

* fix: language env var

* fix: russian patterns and responses

* fix: russian patterns and responses

* fix: path to file

* fix: test for new version

* fix: test for en

* fix: codestyle

* fix: f placeholders

* fix: format usage

* fix: codestyle

* fix: logs

* fix: my name is not

* fix: homeland pattern fixes

* fix: my name is not function

* fix: more logs

* fix: fix my name is not function

* fix: my name is not

* fix: do you know my name

* fix: test format

* fix: test format

* fix: test format and more tests

* fix: test format

* fix: more tests

* fix: more tests

* fix: test format prints

* fix: black

* fix: en tests

* fix: en tests

* fix: en tests

* fix: en tests

* fix entity detection (#127)

* Feat/spacy lemmatizer (#129)

* fix: add spacy annotator

* fix: usage of spacy attributes

* fix: test spacy annotator

* fix: add params

* fix: add params

* fix: fix test

* fix: rights on file

* fix: codestyle

* fix: extra f string

* Feat/russian sentseg (#128)

* feat: basic config (with no changes)

* feat: data preproc

* feat: data processing

* fix: codestyle

* fix: sentseg ru like dp.ner_rus config

* fix: rename config

* fix: fpath

* fix: readme

* fix: custom sentseg reader

* fix: custom sentseg config

* feat: sent segmentation

* feat: sent segmentation tests

* fix: rights on file

* fix: codestyle

* fix:  data preproc in sentseg_ru too

* fix: metric values for sentseg trained on ru subtitles

* fix: path to sentseg to download

* fix: use sentseg ru model

* fix: rights for file

* fix: newer spacy version

* fix: newer deeppavlov version

* fix: reqs

* fix: server

* feat: new config for bert model

* fix: upd sentseg config

* fix: upd sentseg config

* fix: remove old config

* fix: config path

* fix: deeppavlov 17 2

* fix: remove extra import

* fix: new docker image base

* fix: reinstall spacy

* fix: resentseg tests

* fix: codestyle

* fix: docs

* fix: add sentseg to tests

* fix: dockerfile

* fix: model path

* fix: add dialogrpt to wait hosts

* fix: more complicated test for badwords annotator

* Fix/upd badlisted words (#130)

* fix: more complicated test for badwords annotator

* fix: revert badlisted en words

* fix: russian badlisted words

* fix: give tokenized sents after spacy

* fix: ru badlisted words

* fix: ru badlisted words folder

* fix: ru badlisted words get data

* fix: test file

* fix: ru badlisted words tokenized sent

* fix: ru badlisted words tokens

* fix: codestyle

* fix: revert badlisted to dev

* fix: pipeline conf post_skill_selector_annotators

* fix: sleep before re try to connect to dialogpt

* fix: formatter format

* fix: more russian badwords

* fix: correct endpoint for spacy annotator

* Feat/ru random questions (#131)

* feat: random russian questions

* feat: dummy provides random russian questions

* fix: refactor questions

* fix: add pre-dummy phrase

* fix: add pre-dummy phrase

* fix: codestyle

* fix: path to file

* fix: strip russian questions

* fix: last chance response

* fix: documentation

* fix: more confident generative skill

* fix: dummy response always available

* fix: intent responder check if exist

* fix: most dummy responses language based

* fix: remove punctuation if present

* fix: documentation

* fix: documentation

* fix: new limits for russian baseline

* fix: dialogrpt scores as conveval

* fix: sentseg ru remove commas

* fix: no wiki-skill yet

* fix:  ner no threads

* fix: can add prompt

* fix: prompt with conf

* fix: remove bad questions

* fix: add punctuation to generated hyp

* fix: remove quotes

* fix: re-choose hyp only for en version

* fix: dff-generative is aka script

* fix: increase intent conf thresholds

* fix: store only tokens for hyps

* fix: consider only special intents

* fix: codestyle

* fix: final fixes, resp selector and thresholds intent

* fix: more obscene words

* fix: Russian documentation

* fix: image in docs

* fix: questions

* fix: bad words

* feat: ru toxic classifier

* fix: toxic check batch hypotheses too

* fix: intent responder uses lang

* fix

* fix: correct usage of human bot utterances

* fix: return 5 hypotheses

* fix: more hyps, fix reqs

* fix: black codestyle

* fix: codestyle

* fix: codestyle

* feat: response selector uses params

* fix: requirements

* fix: requirements

* fix: remove dialogpt prev ru

* fix: requirements

* fix: add dialogrpt again

* fix: add dialogrpt

* fix: add dialogpt ru

* fix: requirements for dialogpt and dialogrpt

* fix: return pymorphy to reqs

* Feat/ru intent catcher transformers (#171)

* fix: intent catcher ru transformers

* fix: ru intent catcher

* fix: intent catcher updated

* fix: INTENT_PHRASES_PATH as a main variable

* fix: dockerfile updates

* fix: test gpu

* fix: black style

* fix: add tests files

* fix: tests

* fix: rights on file

* fix: rights on file

* fix: rights

* fix: numb hyps

* fix: remove without threads

* fix: documentation

* fix: add LET_ME_ASK_YOU_PHRASES

* fix: black style

* fix: revert extra files

* fix: dream mini uses the same params

* fix: generative default response

* fix: in case of no gpu

* fix: resources and gpus consumption

* fix: new image

* fix: add prompt ones

* fix: ru and en version sentsegs

Co-authored-by: Fedor Ignatov <ignatov.fedor@gmail.com>
Co-authored-by: Дмитрий Евсеев <dmitrij.euseew@yandex.ru>
Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>
Co-authored-by: Daniel Kornev <daniel@zetuniverse.com>

* fix: proxy usage command (#183)

* Feat/multilingual ner (#186)

* feat: ner multilingual case_agnostic

* fix: ner config

* fix: ner dockerfile

* fix: upd config

* fix: config for ner multilingual

* feat: updated config

* feat: working ner multilingual

* fix: codestyle

* feat: upd spellcheck

* fix: add cuda visible devices

* fix: cuda visible devices

* update fact-retrieval and text-qa (#168)

* update fact retrieval

* update squad

* add answer sentence

* update

* fixes

* update formatter

* fixes

* add logit ranker

* codestyle

* codestyle

* fixes

* codestyle

* fix tests

Co-authored-by: Дмитрий Евсеев <dmitrij.euseew@yandex.ru>

* feat: upd dp-ner with extended version (#189)

* feat: upd dp-ner with extended version

* fix: upd tests

* fix: working for tags

* fix: codestyle

* fix: user new model

* feat: working

* fix: config

* fix: upd ner dockerfile

* fix: revert format list

* fix: change ner for all dists

* fix: upd dialogpt en params (#190)

* fix: upd dialogpt en params

* fix: black style

* fix: upd params

* fix: context format

* fix: context format

* fix: codestyle

* docker fixes for hydra configuration poc (#34)

* docker fixes hydra configuration poc

* fix agent installation

* fix dp-agent commit in dockerfile_agent

* Fix requirements.txt (#84)

* update pr against the new main branch

* fix itsdangerous requirements

* pin itsdangerous requirements for all flask==1.1.1 servers

* minimal reproducible example for new dream

* add pem files to gitignore, small agent docker fix

* change commit, remove copy settings

* fix agent command in base compose file

* fix agent installation

* fix agent command in other dists

* fix commands in readme, add telegram section

* update en and ru readme

Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: Dilyara Baymurzina <dilyara.rimovna@gmail.com>

* fix: prompts from dummy skill (#193)

* Feat/sentence ranker as a service (#191)

* feat: sentence ranker almost

* feat: sentence ranker

* fix: tests

* fix: get scores

* fix: codestyle

* fix: reqs

* fix: flask jsonify

* fix: flask jsonify

* fix: json types

* fix: logs

* fix: usage of single scores calculation

* fix: codestyle

* fix: codestyle

* fix: refactor

* fix: scores and curr_single_scores

* fix: codestyle

* feat: log

* fix: model and test

* fix: upd reqs for kg service (#195)

* image captioning

* update image captioning server.py

* update image captioning server.py

* add dream_multimodal

* updates

* updated pipeline

Co-authored-by: dmitrijeuseew <dmitrij.euseew@yandex.ru>
Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk <mtalimanchuk@gmail.com>
Co-authored-by: Dilyara Baymurzina <dilyara.rimovna@gmail.com>
Co-authored-by: Olga Sofronova <60696748+olkaso@users.noreply.github.com>
Co-authored-by: Fedor Ignatov <ignatov.fedor@gmail.com>
Co-authored-by: Daniel Kornev <daniel@zetuniverse.com>
Co-authored-by: zucchini-nlp <100715397+zucchini-nlp@users.noreply.github.com>

* add: files

* fix: docs line

* fix: codestyle

* fix: run command

* fix: run command

* fix itsdangerous requirements

* image captioning

* image captioning

* Image captioning (#4)

9 people authored Sep 28, 2022
1 parent 1ab1c4a commit c61351b
Showing 15 changed files with 861 additions and 12 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -217,9 +217,11 @@ Dream Architecture is presented in the following image:
| Wiki Facts | 1.7 GiB RAM | |

## Services

| Name | Requirements | Description |
|---------------------|----------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| DialoGPT | 1.2 GiB RAM, 2.1 GiB GPU | generative service based on Transformers generative model, the model is set in docker compose argument `PRETRAINED_MODEL_NAME_OR_PATH` (for example, `microsoft/DialoGPT-small` with 0.2-0.5 sec on GPU) |
| Image captioning | 4 GiB RAM, 5.4 GiB GPU | creates text representation of a received image |
| Infilling | 1 GiB RAM, 1.2 GiB GPU | generative service based on Infilling model, for the given utterance returns utterance where `_` from original text is replaced with generated tokens |
| Knowledge Grounding | 2 GiB RAM, 2.1 GiB GPU | generative service based on BlenderBot architecture providing a response to the context taking into account an additional text paragraph |
| Masked LM | 1.1 GiB RAM, 1 GiB GPU | |
24 changes: 12 additions & 12 deletions annotators/combined_classification/requirements.txt
@@ -1,12 +1,12 @@
gunicorn==19.9.0
sentry-sdk[flask]==0.14.1
flask==1.1.1
itsdangerous==2.0.1
requests==2.22.0
uvicorn==0.11.7
prometheus-client==0.7.1
filelock==3.0.12
torch==1.5.1
transformers==4.6.0
jinja2<=3.0.3
Werkzeug<=2.0.3
gunicorn==19.9.0
sentry-sdk[flask]==0.14.1
flask==1.1.1
itsdangerous==2.0.1
requests==2.22.0
uvicorn==0.11.7
prometheus-client==0.7.1
filelock==3.0.12
torch==1.5.1
transformers==4.6.0
jinja2<=3.0.3
Werkzeug<=2.0.3
14 changes: 14 additions & 0 deletions assistant_dists/dream_multimodal/cpu.yml
@@ -0,0 +1,14 @@
version: '3.7'
services:
  dialogpt:
    environment:
      DEVICE: cpu
      CUDA_VISIBLE_DEVICES: ""
  intent-catcher:
    environment:
      DEVICE: cpu
      CUDA_VISIBLE_DEVICES: ""
  sentence-ranker:
    environment:
      DEVICE: cpu
      CUDA_VISIBLE_DEVICES: ""
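
The `cpu.yml` file above only makes sense layered on top of another compose file. As a rough sketch of that layering, assuming the usual Compose rule that mappings are merged key by key with the later file winning, the Python below merges a base definition with the CPU override. The `merge_compose` helper and the `base` values are hypothetical and exist only for illustration; they are not taken from this repository, and real Compose merging has extra rules (for example for lists) that this sketch ignores.

```python
from copy import deepcopy


def merge_compose(base: dict, override: dict) -> dict:
    """Recursively merge two compose-style mappings; override wins on conflicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_compose(merged[key], value)
        else:
            # Scalars (and anything else) are simply replaced by the override.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical base definition, NOT copied from the repository.
base = {
    "services": {
        "dialogpt": {
            "environment": {"DEVICE": "cuda", "CUDA_VISIBLE_DEVICES": "0"},
        }
    }
}

# Mirrors the dialogpt entry of cpu.yml shown above.
cpu_override = {
    "services": {
        "dialogpt": {
            "environment": {"DEVICE": "cpu", "CUDA_VISIBLE_DEVICES": ""},
        }
    }
}

merged = merge_compose(base, cpu_override)
print(merged["services"]["dialogpt"]["environment"])
```

The override only needs to name the keys it changes; everything else in the base service definition is left untouched.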
6 changes: 6 additions & 0 deletions assistant_dists/dream_multimodal/db_conf.json
@@ -0,0 +1,6 @@
{
"host": "DB_HOST",
"port": "DB_PORT",
"name": "DB_NAME",
"env": true
}
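The `"env": true` flag in this file suggests that `DB_HOST`, `DB_PORT` and `DB_NAME` are names of environment variables rather than literal values. The diff does not show how the agent consumes the file, so the sketch below only illustrates that assumed indirection. The `resolve_db_conf` function and the example values are hypothetical.

```python
import json
import os


def resolve_db_conf(raw: dict, environ=os.environ) -> dict:
    """Return connection settings, looking them up in the environment when `env` is true."""
    if not raw.get("env"):
        # Values are already literal settings.
        return {key: raw[key] for key in ("host", "port", "name")}
    # Assumption: each value is the NAME of an environment variable.
    return {
        "host": environ[raw["host"]],
        "port": int(environ[raw["port"]]),
        "name": environ[raw["name"]],
    }


# Same content as db_conf.json in this commit.
DB_CONF = json.loads(
    '{"host": "DB_HOST", "port": "DB_PORT", "name": "DB_NAME", "env": true}'
)

# Hypothetical environment, for illustration only.
example_env = {"DB_HOST": "mongo", "DB_PORT": "27017", "DB_NAME": "dream"}
print(resolve_db_conf(DB_CONF, example_env))
```

A missing variable raises `KeyError` here, which surfaces a misconfigured `.env` file at startup instead of at the first database call.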
64 changes: 64 additions & 0 deletions assistant_dists/dream_multimodal/dev.yml
@@ -0,0 +1,64 @@
# These volumes make debugging convenient: there is no need to rebuild the container every time the code changes
services:
agent:
volumes:
- ".:/dp-agent"
ports:
- 4242:4242
dff-program-y-skill:
volumes:
- "./skills/dff_program_y_skill:/src"
- "./common:/src/common"
ports:
- 8008:8008
sentseg:
volumes:
- "./annotators/SentSeg:/src"
ports:
- 8011:8011
convers-evaluation-selector:
volumes:
- "./response_selectors/convers_evaluation_based_selector:/src"
- "./common:/src/common"
ports:
- 8009:8009
dff-intent-responder-skill:
volumes:
- "./skills/dff_intent_responder_skill:/src"
- "./common:/src/common"
ports:
- 8012:8012
intent-catcher:
volumes:
- "./annotators/IntentCatcherTransformers:/src"
- "./common:/src/common"
- "~/.deeppavlov:/root/.deeppavlov"
ports:
- 8014:8014
badlisted-words:
volumes:
- "./annotators/BadlistedWordsDetector:/src"
- "./common:/src/common"
ports:
- 8018:8018
spelling-preprocessing:
volumes:
- "./annotators/spelling_preprocessing:/src"
ports:
- 8074:8074
dialogpt:
volumes:
- "./services/dialogpt:/src"
ports:
- 8125:8125
sentence-ranker:
volumes:
- "./services/sentence_ranker:/src"
ports:
- 8128:8128
image-captioning:
volumes:
- "./services/image_captioning:/src"
ports:
- 8123:8123
version: "3.7"
193 changes: 193 additions & 0 deletions assistant_dists/dream_multimodal/docker-compose.override.yml
@@ -0,0 +1,193 @@
services:
agent:
command: sh -c 'bin/wait && python -m deeppavlov_agent.run agent.pipeline_config=assistant_dists/dream_mini/pipeline_conf.json'
environment:
WAIT_HOSTS: "dff-program-y-skill:8008, sentseg:8011, convers-evaluation-selector:8009,
dff-intent-responder-skill:8012, intent-catcher:8014, badlisted-words:8018,
spelling-preprocessing:8074, dialogpt:8125, sentence-ranker:8128, image-captioning:8123"
WAIT_HOSTS_TIMEOUT: ${WAIT_TIMEOUT:-480}

dff-program-y-skill:
env_file: [.env]
build:
args:
SERVICE_PORT: 8008
SERVICE_NAME: dff_program_y_skill
LANGUAGE: EN
context: .
dockerfile: ./skills/dff_program_y_skill/Dockerfile
command: gunicorn --workers=1 server:app -b 0.0.0.0:8008 --reload
deploy:
resources:
limits:
memory: 1024M
reservations:
memory: 1024M


sentseg:
env_file: [.env]
build:
context: ./annotators/SentSeg/
command: flask run -h 0.0.0.0 -p 8011
environment:
- FLASK_APP=server
deploy:
resources:
limits:
memory: 1.5G
reservations:
memory: 1.5G

convers-evaluation-selector:
env_file: [.env]
build:
args:
TAG_BASED_SELECTION: 1
CALL_BY_NAME_PROBABILITY: 0.5
PROMPT_PROBA: 0.3
ACKNOWLEDGEMENT_PROBA: 0.3
PRIORITIZE_WITH_REQUIRED_ACT: 1
PRIORITIZE_NO_DIALOG_BREAKDOWN: 0
PRIORITIZE_WITH_SAME_TOPIC_ENTITY: 1
IGNORE_DISLIKED_SKILLS: 0
GREETING_FIRST: 1
RESTRICTION_FOR_SENSITIVE_CASE: 1
PRIORITIZE_PROMTS_WHEN_NO_SCRIPTS: 0
ADD_ACKNOWLEDGMENTS_IF_POSSIBLE: 1
PRIORITIZE_SCRIPTED_SKILLS: 0
context: .
dockerfile: ./response_selectors/convers_evaluation_based_selector/Dockerfile
command: flask run -h 0.0.0.0 -p 8009
environment:
- FLASK_APP=server
deploy:
resources:
limits:
memory: 256M
reservations:
memory: 256M

dff-intent-responder-skill:
env_file: [ .env ]
build:
args:
SERVICE_PORT: 8012
SERVICE_NAME: dff_intent_responder_skill
INTENT_RESPONSE_PHRASES_FNAME: intent_response_phrases.json
context: .
dockerfile: ./skills/dff_intent_responder_skill/Dockerfile
command: gunicorn --workers=1 server:app -b 0.0.0.0:8012 --reload
deploy:
resources:
limits:
memory: 128M
reservations:
memory: 128M

intent-catcher:
env_file: [.env]
build:
context: .
dockerfile: ./annotators/IntentCatcherTransformers/Dockerfile
args:
SERVICE_PORT: 8014
CONFIG_NAME: intents_model_dp_config.json
INTENT_PHRASES_PATH: intent_phrases.json
command: python -m flask run -h 0.0.0.0 -p 8014
environment:
- FLASK_APP=server
- CUDA_VISIBLE_DEVICES=0
deploy:
resources:
limits:
memory: 3.5G
reservations:
memory: 3.5G

badlisted-words:
env_file: [.env]
build:
context: annotators/BadlistedWordsDetector/
command: flask run -h 0.0.0.0 -p 8018
environment:
- FLASK_APP=server
deploy:
resources:
limits:
memory: 256M
reservations:
memory: 256M

spelling-preprocessing:
env_file: [.env]
build:
context: ./annotators/spelling_preprocessing/
command: flask run -h 0.0.0.0 -p 8074
environment:
- FLASK_APP=server
deploy:
resources:
limits:
memory: 50M
reservations:
memory: 50M

dialogpt:
env_file: [ .env ]
build:
args:
SERVICE_PORT: 8125
SERVICE_NAME: dialogpt
PRETRAINED_MODEL_NAME_OR_PATH: microsoft/DialoGPT-medium
N_HYPOTHESES_TO_GENERATE: 5
CONFIG_NAME: dialogpt_en.json
context: ./services/dialogpt/
command: flask run -h 0.0.0.0 -p 8125
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
deploy:
resources:
limits:
memory: 2G
reservations:
memory: 2G

sentence-ranker:
env_file: [ .env ]
build:
args:
SERVICE_PORT: 8128
PRETRAINED_MODEL_NAME_OR_PATH: sentence-transformers/bert-base-nli-mean-tokens
context: ./services/sentence_ranker/
command: flask run -h 0.0.0.0 -p 8128
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
deploy:
resources:
limits:
memory: 3G
reservations:
memory: 3G

image-captioning:
env_file: [ .env ]
build:
args:
SERVICE_PORT: 8123
SERVICE_NAME: image_captioning
context: ./services/image_captioning/
command: flask run -h 0.0.0.0 -p 8123
environment:
- CUDA_VISIBLE_DEVICES=0
- FLASK_APP=server
deploy:
resources:
limits:
memory: 5G
reservations:
memory: 5G

version: '3.7'
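The agent command in this file runs `bin/wait` before starting, and `WAIT_HOSTS` lists the `host:port` pairs it should wait for, with `WAIT_HOSTS_TIMEOUT` defaulting to 480 seconds. The script itself is not part of this diff, so the following is a rough Python sketch of the behaviour it presumably has: parse the list, then poll each port until it accepts a TCP connection. The function names `parse_wait_hosts` and `wait_for_hosts` are illustrative assumptions.

```python
import socket
import time


def parse_wait_hosts(value: str) -> list:
    """Split a WAIT_HOSTS string into (host, port) pairs, ignoring spaces and line breaks."""
    pairs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        host, _, port = item.rpartition(":")
        pairs.append((host, int(port)))
    return pairs


def wait_for_hosts(pairs, timeout: float = 480.0, interval: float = 1.0) -> bool:
    """Return True once every (host, port) accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    pending = list(pairs)
    while pending:
        still_pending = []
        for host, port in pending:
            try:
                with socket.create_connection((host, port), timeout=interval):
                    pass
            except OSError:
                # Covers refused connections, timeouts and unresolved host names.
                still_pending.append((host, port))
        pending = still_pending
        if pending:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
    return True


# Same value as WAIT_HOSTS in the agent service above.
WAIT_HOSTS = """dff-program-y-skill:8008, sentseg:8011, convers-evaluation-selector:8009,
    dff-intent-responder-skill:8012, intent-catcher:8014, badlisted-words:8018,
    spelling-preprocessing:8074, dialogpt:8125, sentence-ranker:8128, image-captioning:8123"""

for host, port in parse_wait_hosts(WAIT_HOSTS):
    print(f"{host} -> {port}")
```

Calling `wait_for_hosts(parse_wait_hosts(WAIT_HOSTS))` only makes sense inside the compose network, where the service names resolve.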

0 comments on commit c61351b
