From 26ff1aaf1c9019c60b74468705bd6d9b7ebc5353 Mon Sep 17 00:00:00 2001
From: Dilyara Baymurzina
Date: Thu, 30 Jun 2022 14:36:39 +0300
Subject: [PATCH] feat: first russian dream (#176)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* feat: docker compose with main components
* Feat/tests russian (#90)
* feat: runtests russian
* fix: test file and elements
* feat: tests russian in jenkinsfile
* feat: files for tests
* fix: program-y name
* fix: change to dff-intent-responder-skill
* fix: sync with agent updates
* fix: cleanup for both runtests
* fix: fix path to pipeline conf
* fix: remove ner from tests
* fix: unbuild english bot before russian tests
* fix: codestyle
Co-authored-by: Fedor Ignatov
Co-authored-by: Fedor Ignatov
* fix: no sentrewrite needed
* Feat/ru program y (#88)
* feat: ru program-y version
* fix: variable name
* fix: russian tests
* feat: test files
* fix: dff program-y skill for russian
* fix: dff program-y skill for russian docker compose
* fix: dff program-y skip eng tests
* fix: logs
* fix: add variable env
* fix: revert dangerous skill
* fix: type
* fix: imports types
* fix: line buffering
* fix: default value
* fix: tests
* fix: program-y patterns
* Feat/spellchecker levenshtein ru (#89)
* feat: add files with correct rights
* feat: spell check ru
* fix: add dockerfile path
* fix: add commit
* feat: new files
* fix: config name
* fix: config address
* fix: config as a file
* fix: config title
* fix: consider list not sample
* fix: test fix
* fix: test codestyle
* fix: levenshtein tests
* fix: levenshtein limit memory
* fix: levenshtein spelling
* fix: mapping for spelling
* feat: batch processing
* Feat/ru badwords (#93)
* feat: russian obscene words
* fix: badlist ru named as en
* fix: badlist language
* fix: badlist tests passing
* Feat/dummy skill ru (#94)
* fix: russian dummy responses for russian letters in human utterance
* fix: codestyle
* fix: black
* Feat/ner russian (#92)
* feat: ner config
* feat: files for ner ru
* feat: ner model
* feat: ner integration
* fix: format yml config
* fix: format dockerfile
* fix: path to data
* fix: tests for ner russian
* fix: codestyle
* fix: update ner version
* add russian entity detection
* add russian entity linking
* Update requirements.txt
* Update ner_chunker.py
* fix: rus entity detection tests (#96)
* fix: rus entity detection tests
* black codestyle
* fix codestyle
* fix codestyle
* fix bug
* codestyle
* codestyle
* codestyle
Co-authored-by: dmitry
* Feat/intent catcher Ru based on multilingual USE (#98)
* fix: intent catcher params and paths
* fix: paths in dockerfile
* fix: intent ru phrases without random ones
* fix: random intent phrases
* fix: intent training params
* fix: intent requirements
* fix: intent requirements
* fix: download model
* fix: model which to download
* fix: imports for correct work
* fix: corrected phrases
* fix: corrected phrases
* fix: corrected phrases
* fix: corrected phrases
* fix: corrected phrases
* fix: correct path to save json threshold
* feat: intent data ru json
* fix: correct path to save tests
* fix: existing var
* fix: regular phrases
* fix: next test
* fix: training logs and new threshold
* fix: training logs and new threshold, change phrases
* fix: change regexps
* fix: change thresholds
* fix: new template for intent phrases
* fix: tests ru
* feat: upd model
* fix: upd logs of training, upd conf value
* fix: punctuation
* fix: punctuation
* feat: upd model
* fix: training logs
* fix: tests
* fix: phrases for opinion
* feat: upd model
* feat: training logs
* feat: upd model
* fix: tests
* fix: remove opinion request intent
* feat: upd model
* feat: upd model
* fix: new train logs
* fix: new phrases
* fix: min precision for intent
* fix: lower boundary
* fix: usage of lib
* fix: codestyle
* feat: add itsdangerous requirements
* fix: spelling preproc endpoint
* Feat/dialogpt ru and dff-generative-skill (#97)
* Fix requirements.txt (#84)
* feat: initialize dialogpt_RU
* feat: files init
* feat: basic integration of dialogpt_RU
* fix: rename dialogpt
* fix: dialogpt to device
* fix: dialogpt final version
* fix: dialogpt test
* fix: dialogpt test
* fix: dialogpt resources consumption
* fix: dialogpt to tests
* feat: dff generative skill
* feat: dff generative skill
* fix: remove extra files
* fix: input to dialogpt
* fix: input to dialogpt
* fix: logging
* fix: turn on tests
* fix: get dialog from context
* fix: get uttrs from context
* fix: get empty uttrs
* fix: return empty resp
* fix: test file
* fix: tests
* fix: test ratio
* add speech_function_* dist
* add speech_function_* dist readme
* added sf_functions
* fix ports
* fix: codestyle
* fix deployment config
* fix: tests for generative skill
* fix: codestyle
* add formatters, fix pipeline
* update speech function
* sources
* fix: check if dialogpt is ready
* fix: wait services
* rename book skill
* remove old book skill, update usages
* fix readme
* fix codestyle
* fix codestyle
* fix codestyle
* fix codestyle line length
* move res_cor.json to shared files
* fix itsdangerous requirements
* pin itsdangerous requirements for all flask==1.1.1 servers
* fix cpu.yml, dockerfiles and test for sfc, sfp
* fix codestyle issues
* blacked with -l 120
* following Dilya's holy orders
* following Dilya's not so holy orders
* fix formatters
* fix pipeline
* fix pipeline and formatters
* Adding timeouts + mapping of book skill
* removed old & irrelevant tests
* we've set confidence to super level
* feat: midas cls sent tokenize only if needed (#101)
* feat: midas cls sent tokenize only if needed
* feat: take into account tokenized uttrs by bot
* fix: codestyle
* fix: itsdangerous reqs
* fix: docker reqs
* fix: check another container
* fix: rights for file
* fix: codestyle
* fix: return tests for intent responder
* fix: revert intent responder
* fix: review fixes
* fix: codestyle
Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk
Co-authored-by: Daniel Kornev
* fix: remove convert and sentseg for now
* Feat/dff-intent-responder-skill ru (#99)
* feat: prepare new intent responder
* fix: responses for intent responder ru
* fix: test based on language
* fix: path to intent response phrases
* fix: remove convert and sentseg
* fix: another gpus
* fix: file path and logs
* fix: env and logs for intent responder
* fix: exit response
* fix: choose_topic to low priority intents
* feat: tests for ru
* fix: tests for exit ru
* fix: black codestyle
* fix: tests for intent catcher en
* fix: remove convert and sentseg from tests
* feat: turn on generative skill
* Feat/wiki parser RU (#114)
* update
* codestyle
* add language parameter
* fix: language arg
* fix: language arg and revert generative in dockercompose
* fix tests
* codestyle
* fix: tests for ru
* fix: language value
* fix: ru test results
* fix: test pipe
* fix: sort types_2hop
* fix: black codestyle
* fix: tests for en wiki
* fix: quotes
* fix: codestyle
* fix: sort objects
* fix: test for wiki parser
* fix: codestyle
Co-authored-by: dilyararimovna
* Feat/ru friendship skill (#120)
* feat: add language parameters
* fix: black codestyle
* fix: codestyle
* fix: dff friendship ru
* fix: dff friendship ru
* fix: dff friendship ru
* fix: dff friendship ru, shortened replies
* fix: dff friendship tests
* fix: dff friendship tests
* fix: language for wiki
* fix: language default value
* fix: language default value
* fix: language env var
* fix: use templates by language
* fix: ru templates
* fix: no lang env var in common
* fix: lang to ackn
* fix: codestyle
* feat: default lang value
* fix: dummy for russian
* fix: no en acks
* fix: how are you ru
* fix: logs for response functions
* fix: logs for condition functions
* fix: ru version of what to talk about
* feat: ru tests
* fix: codestyle
* fix: ru condition to resp selector
* fix: ru condition to resp selector
* fix: logging level and configuration
* fix: ascii in tests
* fix: add 'user' to dff input
* fix: add language env variable everywhere
* Feat/dialogrpt ru (#121)
* fix: file drafts
* feat: files for dialogrpt
* feat: dialogrpt pipeline and scores
* feat: dialogrpt pipeline and scores
* feat: dialogrpt readme
* fix: small readme
* fix: no healthcheck
* feat: add dialogrpt to pipeline
* fix: codestyle
* fix: test files
* feat: upd packages in dockerfile
* fix: path to file
* fix: shared file
* fix: codestyle
* fix: imports
* fix: option consider
* fix: option consider
* fix: codestyle
* fix: vars
* fix: test file
* fix: convert to list predictions
* fix: tests
* fix: codestyle
* fix: codestyle
* fix: codestyle
* fix: readme
* fix: dialogrpt to tests
* feat: no extra files, add tokenizer as parameter
* fix: codestyle
* fix: var name
* fix: batch prediction
* fix: batch prediction parameter
* fix: test choice
* fix: format values
* fix: codestyle
* fix: upd deeppavlov download
* fix: dialogrpt container name
* fix: dialogrpt as hyp annotator
* fix: dialogrpt test
* Feat/ru personal info (#125)
* fix: ignorecase and no text in code
* fix: russian in patterns
* fix: language env var
* fix: russian patterns and responses
* fix: russian patterns and responses
* fix: path to file
* fix: test for new version
* fix: test for en
* fix: codestyle
* fix: f placeholders
* fix: format usage
* fix: codestyle
* fix: logs
* fix: my name is not
* fix: homeland pattern fixes
* fix: my name is not function
* fix: more logs
* fix: fix my name is not function
* fix: my name is not
* fix: do you know my name
* fix: test format
* fix: test format
* fix: test format and more tests
* fix: test format
* fix: more tests
* fix: more tests
* fix: test format prints
* fix: black
* fix: en tests
* fix: en tests
* fix: en tests
* fix: en tests
* fix entity detection (#127)
* Feat/spacy lemmatizer (#129)
* fix: add spacy annotator
* fix: usage of spacy attributes
* fix: test spacy annotator
* fix: add params
* fix: add params
* fix: fix test
* fix: rights on file
* fix: codestyle
* fix: extra f string
* Feat/russian sentseg (#128)
* feat: basic config (with no changes)
* feat: data preproc
* feat: data processing
* fix: codestyle
* fix: sentseg ru like dp.ner_rus config
* fix: rename config
* fix: fpath
* fix: readme
* fix: custom sentseg reader
* fix: custom sentseg config
* feat: sent segmentation
* feat: sent segmentation tests
* fix: rights on file
* fix: codestyle
* fix: data preproc in sentseg_ru too
* fix: metric values for sentseg trained on ru subtitles
* fix: path to sentseg to download
* fix: use sentseg ru model
* fix: rights for file
* fix: newer spacy version
* fix: newer deeppavlov version
* fix: reqs
* fix: server
* feat: new config for bert model
* fix: upd sentseg config
* fix: upd sentseg config
* fix: remove old config
* fix: config path
* fix: deeppavlov 17 2
* fix: remove extra import
* fix: new docker image base
* fix: reinstall spacy
* fix: ru sentseg tests
* fix: codestyle
* fix: docs
* fix: add sentseg to tests
* fix: dockerfile
* fix: model path
* fix: add dialogrpt to wait hosts
* fix: more complicated test for badwords annotator
* Fix/upd badlisted words (#130)
* fix: more complicated test for badwords annotator
* fix: revert badlisted en words
* fix: russian badlisted words
* fix: give tokenized sents after spacy
* fix: ru badlisted words
* fix: ru badlisted words folder
* fix: ru badlisted words get data
* fix: test file
* fix: ru badlisted words tokenized sent
* fix: ru badlisted words tokens
* fix: codestyle
* fix: revert badlisted to dev
* fix: pipeline conf post_skill_selector_annotators
* fix: sleep before retry to connect to dialogpt
* fix: formatter format
* fix: more russian badwords
* fix: correct endpoint for spacy annotator
* Feat/ru random questions (#131)
* feat: random russian questions
* feat: dummy provides random russian questions
* fix: refactor questions
* fix: add pre-dummy phrase
* fix: add pre-dummy phrase
* fix: codestyle
* fix: path to file
* fix: strip russian questions
* fix: last chance response
* fix: documentation
* fix: more confident generative skill
* fix: dummy response always available
* fix: intent responder check if exist
* fix: most dummy responses language based
* fix: remove punctuation if present
* fix: documentation
* fix: documentation
* fix: new limits for russian baseline
* fix: dialogrpt scores as conveval
* fix: sentseg ru remove commas
* fix: no wiki-skill yet
* fix: ner no threads
* fix: can add prompt
* fix: prompt with conf
* fix: remove bad questions
* fix: add punctuation to generated hyp
* fix: remove quotes
* fix: re-choose hyp only for en version
* fix: dff-generative is aka script
* fix: increase intent conf thresholds
* fix: store only tokens for hyps
* fix: consider only special intents
* fix: codestyle
* fix: final fixes, resp selection and intent thresholds
* fix: more obscene words
* fix: Russian documentation
* fix: image in docs
* fix: questions
* fix: bad words
* feat: ru toxic classifier
* fix: toxic check batch hypotheses too
* fix: intent responder uses lang
* fix
* fix: correct usage of human bot utterances
* fix: return 5 hypotheses
* fix: more hyps, fix reqs
* fix: black codestyle
* fix: codestyle
* fix: codestyle
* feat: response selector uses params
* fix: requirements
* fix: requirements
* fix: remove dialogpt prev ru
* fix: requirements
* fix: add dialogrpt again
* fix: add dialogrpt
* fix: add dialogpt ru
* fix: requirements for dialogpt and dialogrpt
* fix: return pymorphy to reqs
* Feat/ru intent catcher transformers (#171)
* fix: intent catcher ru transformers
* fix: ru intent catcher
* fix: intent catcher updated
* fix: INTENT_PHRASES_PATH as a main variable
* fix: dockerfile updates
* fix: test gpu
* fix: black style
* fix: add tests files
* fix: tests
* fix: rights on file
* fix: rights on file
* fix: rights
* fix: num hyps
* fix: remove without threads
* fix: documentation
* fix: add LET_ME_ASK_YOU_PHRASES
* fix: black style
* fix: revert extra files
* fix: dream mini uses the same params
* fix: generative default response
* fix: in case of no gpu
* fix: resources and gpus consumption
* fix: new image
* fix: add prompt ones
* fix: ru and en version sentsegs

Co-authored-by: Fedor Ignatov
Co-authored-by: Дмитрий Евсеев
Co-authored-by: Andrii.Hura <54397922+AndriiHura@users.noreply.github.com>
Co-authored-by: mtalimanchuk
Co-authored-by: Daniel Kornev
---
 .env | 2 +-
 Jenkinsfile | 100 ++
 README.md | 190 ++-
 README_ru.md | 227 +++
 RussianDream.png | Bin 0 -> 207146 bytes
 .../BadlistedWordsDetector_ru/Dockerfile | 11 +
 .../badlists/bad_words.txt | 221 +++
 .../requirements.txt | 10 +
 .../BadlistedWordsDetector_ru/server.py | 133 ++
 annotators/BadlistedWordsDetector_ru/test.py | 19 +
 annotators/BadlistedWordsDetector_ru/test.sh | 3 +
 .../IntentCatcherTransformers/Dockerfile | 2 +
 .../IntentCatcherTransformers/README.md | 6 +
 .../intent_phrases_RU.json | 1389 +++++++++++++++++
 .../intents_model_dp_config_RU.json | 190 +++
 annotators/IntentCatcherTransformers/test.py | 10 +-
 .../IntentCatcherTransformers/tests_RU.json | 66 +
 annotators/NER_ru/Dockerfile | 36 +
 .../NER_ru/ner_uncased_rus_bert_torch.json | 155 ++
 annotators/NER_ru/requirements.txt | 8 +
 annotators/NER_ru/server.py | 85 +
 annotators/NER_ru/test.sh | 4 +
 annotators/NER_ru/test_server.py | 29 +
 annotators/SentSeg/.gitignore | 3 +-
 annotators/SentSeg/data_preprocessing.py | 309 ++++
 annotators/entity_detection_rus/Dockerfile | 24 +
 .../entity_detection_parser.py | 242 +++
 .../entity_detection_rus.json | 48 +
 .../entity_detection_rus/ner_chunker.py | 470 ++++++
 .../entity_detection_rus/requirements.txt | 14 +
 annotators/entity_detection_rus/server.py | 129 ++
 annotators/entity_detection_rus/test.sh | 5 +
 .../test_entity_detection.py | 29 +
 .../torch_transformers_preprocessor.py | 405 +++++
 .../torch_transformers_sequence_tagger.py | 394 +++++
 .../wiki_ner_rus_bert_torch.json | 133 ++
 annotators/entity_linking_rus/Dockerfile | 40 +
 .../entity_linking_rus/entity_linking.py | 573 +++++++
 .../entity_linking_rus.json | 61 +
 .../entity_linking_rus/requirements.txt | 12 +
 annotators/entity_linking_rus/server.py | 80 +
 annotators/entity_linking_rus/test.sh | 4 +
 annotators/entity_linking_rus/test_el.py | 38 +
 .../torch_transformers_el_ranker.py | 382 +++++
 .../torch_transformers_preprocessor.py | 75 +
 annotators/fact_retrieval/tfidf_ranker.py | 7 +-
 annotators/sentseg_ru/Dockerfile | 23 +
 annotators/sentseg_ru/README.md | 27 +
 annotators/sentseg_ru/data_preprocessing.py | 341 ++++
 annotators/sentseg_ru/dp_sentseg_reader.py | 143 ++
 annotators/sentseg_ru/requirements.txt | 8 +
 .../sentseg_ru/sentseg_ru_bert_torch.json | 154 ++
 annotators/sentseg_ru/server.py | 97 ++
 annotators/sentseg_ru/test.py | 15 +
 annotators/sentseg_ru/test.sh | 3 +
 annotators/spacy_annotator/Dockerfile | 25 +
 annotators/spacy_annotator/README.txt | 1 +
 annotators/spacy_annotator/requirements.txt | 9 +
 annotators/spacy_annotator/server.py | 60 +
 annotators/spacy_annotator/test.py | 67 +
 annotators/spacy_annotator/test.sh | 3 +
 .../spelling_preprocessing_ru/Dockerfile | 38 +
 .../levenshtein_corrector_ru.json | 60 +
 .../requirements.txt | 5 +
 .../spelling_preprocessing_ru/server.py | 43 +
 annotators/spelling_preprocessing_ru/test.sh | 4 +
 .../spelling_preprocessing_ru/test_server.py | 22 +
 annotators/toxic_classification_ru/Dockerfile | 23 +
 annotators/toxic_classification_ru/README.md | 3 +
 .../toxic_classification_ru/requirements.txt | 10 +
 annotators/toxic_classification_ru/server.py | 92 ++
 annotators/toxic_classification_ru/test.py | 16 +
 annotators/toxic_classification_ru/test.sh | 3 +
 annotators/wiki_parser/Dockerfile | 3 +
 annotators/wiki_parser/test_wiki_parser.py | 105 +-
 annotators/wiki_parser/wiki_parser.py | 8 +-
 assistant_dists/dream/dev.yml | 1 +
 .../dream/docker-compose.override.yml | 5 +-
 assistant_dists/dream_mini/dev.yml | 3 +-
 .../dream_mini/docker-compose.override.yml | 12 +-
 assistant_dists/dream_russian/cpu.yml | 21 +
 assistant_dists/dream_russian/db_conf.json | 6 +
 assistant_dists/dream_russian/dev.yml | 125 ++
 .../dream_russian/docker-compose.override.yml | 385 +++++
 .../dream_russian/pipeline_conf.json | 410 +++++
 assistant_dists/dream_russian/test.yml | 53 +
 common/acknowledgements.py | 13 +-
 common/dff/integration/condition.py | 4 +-
 .../dialogflow_framework/utils/condition.py | 4 +-
 common/emotion.py | 8 +-
 common/greeting.py | 383 +++--
 common/inflect.py | 20 +-
 common/personal_info.py | 212 ++-
 common/remove_lists.py | 249 +++
 common/response_selection.py | 5 +-
 common/universal_templates.py | 55 +-
 common/utils.py | 4 +-
 common/wiki_skill_scenarios.py | 2 +-
 .../Dockerfile | 3 +
 .../server.py | 31 +-
 .../tag_based_selection.py | 29 +-
 .../utils.py | 28 +-
 .../rule_based_response_selector/Dockerfile | 3 +
 services/dialogpt_RU/Dockerfile | 23 +
 services/dialogpt_RU/README.md | 3 +
 services/dialogpt_RU/requirements.txt | 10 +
 services/dialogpt_RU/server.py | 166 ++
 services/dialogpt_RU/test.py | 23 +
 services/dialogpt_RU/test.sh | 4 +
 services/dialogrpt_ru/Dockerfile | 26 +
 services/dialogrpt_ru/README.md | 15 +
 services/dialogrpt_ru/data_pikabu.py | 885 +++++++++
 services/dialogrpt_ru/feeder.py | 144 ++
 services/dialogrpt_ru/master.py | 216 +++
 services/dialogrpt_ru/requirements.txt | 9 +
 services/dialogrpt_ru/server.py | 84 +
 services/dialogrpt_ru/test.py | 25 +
 services/dialogrpt_ru/test.sh | 4 +
 services/dialogrpt_ru/utils.py | 204 +++
 .../rule_based_selector/connector.py | 5 +
 skills/dff_friendship_skill/Dockerfile | 2 +
 .../scenario/condition.py | 54 +-
 skills/dff_friendship_skill/scenario/main.py | 2 -
 .../dff_friendship_skill/scenario/response.py | 114 +-
 .../scenario/weekend_condition.py | 5 +-
 .../scenario/weekend_response.py | 27 +-
 skills/dff_friendship_skill/server.py | 2 +-
 skills/dff_friendship_skill/test_server.py | 6 +
 .../tests/how_bot_is_doing_RU_in.json | 232 +++
 .../tests/how_bot_is_doing_RU_out.json | 61 +
 .../tests/lets_talk_RU_in.json | 110 ++
 .../tests/lets_talk_RU_out.json | 50 +
 .../tests/no_hobbies_RU_in.json | 239 +++
 .../tests/no_hobbies_RU_out.json | 73 +
 .../tests/no_recent_events_RU_in.json | 243 +++
 .../tests/no_recent_events_RU_out.json | 69 +
 skills/dff_generative_skill/Dockerfile | 32 +
 skills/dff_generative_skill/README.md | 302 ++++
 skills/dff_generative_skill/common/.gitkeep | 0
 skills/dff_generative_skill/requirements.txt | 2 +
 skills/dff_generative_skill/scenario/main.py | 29 +
 .../dff_generative_skill/scenario/response.py | 74 +
 skills/dff_generative_skill/server.py | 114 ++
 skills/dff_generative_skill/test.sh | 4 +
 skills/dff_generative_skill/test_server.py | 33 +
 skills/dff_generative_skill/tests/.gitkeep | 0
 .../tests/lets_talk_in.json | 59 +
 .../tests/lets_talk_out.json | 49 +
 skills/dff_grounding_skill/Dockerfile | 3 +
 .../dff_grounding_skill/scenario/responses.py | 3 +-
 skills/dff_intent_responder_skill/Dockerfile | 6 +
 .../data/intent_response_phrases_RU.json | 39 +
 .../scenario/response.py | 4 +-
 .../scenario/response_funcs.py | 62 +-
 .../dff_intent_responder_skill/test_server.py | 6 +
 .../tests/intent_choose_topic_RU_in.json | 212 +++
 .../tests/intent_choose_topic_RU_out.json | 48 +
 .../tests/intent_exit_RU_in.json | 212 +++
 .../tests/intent_exit_RU_out.json | 48 +
 .../tests/intent_exit_out.json | 4 +-
 .../tests/intent_what_can_you_do_RU_in.json | 212 +++
 .../tests/intent_what_can_you_do_RU_out.json | 48 +
 .../tests/intent_what_is_your_job_RU_in.json | 212 +++
 .../tests/intent_what_is_your_job_RU_out.json | 48 +
 .../tests/intent_what_is_your_name_RU_in.json | 212 +++
 .../intent_what_is_your_name_RU_out.json | 48 +
 .../intent_where_are_you_from_RU_in.json | 212 +++
 .../intent_where_are_you_from_RU_out.json | 48 +
 .../tests/intent_who_made_you_RU_in.json | 212 +++
 .../tests/intent_who_made_you_RU_out.json | 48 +
 skills/dff_program_y_skill/Dockerfile | 3 +
 .../data_ru/categories/README.txt | 19 +
 .../data_ru/categories/bot_profile.aiml | 203 +++
 .../data_ru/categories/greeting.aiml | 145 ++
 .../data_ru/categories/letschat.aiml | 196 +++
 .../data_ru/categories/misunderstood.aiml | 78 +
 .../data_ru/categories/no.aiml | 111 ++
 .../categories/psychological_help.aiml | 159 ++
 .../data_ru/categories/thanks.aiml | 53 +
 .../categories/what_to_talk_about.aiml | 73 +
 .../data_ru/categories/yes.aiml | 83 +
 .../data_ru/debug/duplicates.txt | 5 +
 .../data_ru/debug/errors.txt | 1 +
 .../data_ru/licenses/README.txt | 6 +
 .../data_ru/licenses/license.keys | 1 +
 .../data_ru/lookups/README.txt | 10 +
 .../data_ru/lookups/denormal.txt | 50 +
 .../data_ru/lookups/gender.txt | 16 +
 .../data_ru/lookups/normal.txt | 462 ++++++
 .../data_ru/lookups/person.txt | 0
 .../data_ru/lookups/person2.txt | 0
 .../data_ru/maps/README.txt | 21 +
 .../data_ru/nodes/pattern_nodes.conf | 17 +
 .../data_ru/nodes/template_nodes.conf | 71 +
 .../data_ru/processing/postprocessors.conf | 6 +
 .../data_ru/processing/preprocessors.conf | 2 +
 .../data_ru/rdfs/README.txt | 23 +
 .../data_ru/security/usergroups.yaml | 19 +
 .../data_ru/sets/README.txt | 24 +
 .../data_ru/sets/commit.txt | 2 +
 .../data_ru/sets/commitment.txt | 1 +
 .../data_ru/sets/crime.txt | 7 +
 .../data_ru/sets/crimeverb.txt | 7 +
 .../data_ru/sets/gonna.txt | 5 +
 .../dff_program_y_skill/data_ru/sets/my.txt | 7 +
 .../data_ru/sets/question_like.txt | 6 +
 .../data_ru/sets/stupid.txt | 13 +
 .../data_ru/sets/suicide.txt | 2 +
 .../data_ru/sets/suicideverb.txt | 8 +
 .../dff_program_y_skill/data_ru/sets/talk.txt | 33 +
 .../data_ru/sets/wantto.txt | 6 +
 .../data_ru/sets/wishfor.txt | 2 +
 .../data_ru/spelling/corpus.txt | 1 +
 .../dff_program_y_skill/scenario/response.py | 7 +-
 skills/dff_program_y_skill/test_server.py | 7 +
 .../tests/who_built_you_RU_in.json | 53 +
 .../tests/who_built_you_RU_out.json | 48 +
 skills/dff_sport_skill/Dockerfile | 3 +
 .../dialogflows/flows/sport.py | 3 +-
 skills/dff_wiki_skill/Dockerfile | 3 +
 skills/dummy_skill/README.md | 4 +
 skills/dummy_skill/connector.py | 66 +-
 .../dummy_skill/russian_random_questions.txt | 164 ++
 skills/emotion_skill/Dockerfile | 3 +
 skills/emotion_skill/scenario.py | 4 +-
 skills/meta_script_skill/Dockerfile | 3 +
 skills/meta_script_skill/meta_script.py | 3 +-
 skills/personal_info_skill/Dockerfile | 3 +
 skills/personal_info_skill/server.py | 260 +--
 skills/personal_info_skill/test.py | 26 +-
 skills/personal_info_skill/test_EN.json | 320 ++++
 skills/personal_info_skill/test_RU.json | 320 ++++
 state_formatters/dp_formatters.py | 59 +-
 state_formatters/utils.py | 4 +-
 tests/runtests_russian.sh | 182 +++
 235 files changed, 18508 insertions(+), 632 deletions(-)
 create mode 100644 README_ru.md
 create mode 100644 RussianDream.png
 create mode 100644 annotators/BadlistedWordsDetector_ru/Dockerfile
 create mode 100644 annotators/BadlistedWordsDetector_ru/badlists/bad_words.txt
 create mode 100644 annotators/BadlistedWordsDetector_ru/requirements.txt
 create mode 100644 annotators/BadlistedWordsDetector_ru/server.py
 create mode 100644 annotators/BadlistedWordsDetector_ru/test.py
 create mode 100755 annotators/BadlistedWordsDetector_ru/test.sh
 create mode 100644 annotators/IntentCatcherTransformers/intent_phrases_RU.json
 create mode 100644 annotators/IntentCatcherTransformers/intents_model_dp_config_RU.json
 create mode 100644 annotators/IntentCatcherTransformers/tests_RU.json
 create mode 100644 annotators/NER_ru/Dockerfile
 create mode 100644 annotators/NER_ru/ner_uncased_rus_bert_torch.json
 create mode 100644 annotators/NER_ru/requirements.txt
 create mode 100644 annotators/NER_ru/server.py
 create mode 100755 annotators/NER_ru/test.sh
 create mode 100644 annotators/NER_ru/test_server.py
 create mode 100644 annotators/SentSeg/data_preprocessing.py
 create mode 100644 annotators/entity_detection_rus/Dockerfile
 create mode 100644 annotators/entity_detection_rus/entity_detection_parser.py
 create mode 100644 annotators/entity_detection_rus/entity_detection_rus.json
 create mode 100644 annotators/entity_detection_rus/ner_chunker.py
 create mode 100644 annotators/entity_detection_rus/requirements.txt
 create mode 100644 annotators/entity_detection_rus/server.py
 create mode 100755 annotators/entity_detection_rus/test.sh
 create mode 100644 annotators/entity_detection_rus/test_entity_detection.py
 create mode 100644 annotators/entity_detection_rus/torch_transformers_preprocessor.py
 create mode 100644 annotators/entity_detection_rus/torch_transformers_sequence_tagger.py
 create mode 100644 annotators/entity_detection_rus/wiki_ner_rus_bert_torch.json
 create mode 100644 annotators/entity_linking_rus/Dockerfile
 create mode 100644 annotators/entity_linking_rus/entity_linking.py
 create mode 100644 annotators/entity_linking_rus/entity_linking_rus.json
 create mode 100644 annotators/entity_linking_rus/requirements.txt
 create mode 100644 annotators/entity_linking_rus/server.py
 create mode 100755 annotators/entity_linking_rus/test.sh
 create mode 100644 annotators/entity_linking_rus/test_el.py
 create mode 100644 annotators/entity_linking_rus/torch_transformers_el_ranker.py
 create mode 100644 annotators/entity_linking_rus/torch_transformers_preprocessor.py
 create mode 100644 annotators/sentseg_ru/Dockerfile
 create mode 100644 annotators/sentseg_ru/README.md
 create mode 100644 annotators/sentseg_ru/data_preprocessing.py
 create mode 100644 annotators/sentseg_ru/dp_sentseg_reader.py
 create mode 100644 annotators/sentseg_ru/requirements.txt
 create mode 100644 annotators/sentseg_ru/sentseg_ru_bert_torch.json
 create mode 100644 annotators/sentseg_ru/server.py
 create mode 100644 annotators/sentseg_ru/test.py
 create mode 100755 annotators/sentseg_ru/test.sh
 create mode 100644 annotators/spacy_annotator/Dockerfile
 create mode 100644 annotators/spacy_annotator/README.txt
 create mode 100644 annotators/spacy_annotator/requirements.txt
 create mode 100644 annotators/spacy_annotator/server.py
 create mode 100644 annotators/spacy_annotator/test.py
 create mode 100755 annotators/spacy_annotator/test.sh
 create mode 100644 annotators/spelling_preprocessing_ru/Dockerfile
 create mode 100644 annotators/spelling_preprocessing_ru/levenshtein_corrector_ru.json
 create mode 100644 annotators/spelling_preprocessing_ru/requirements.txt
 create mode 100644 annotators/spelling_preprocessing_ru/server.py
 create mode 100755 annotators/spelling_preprocessing_ru/test.sh
 create mode 100644 annotators/spelling_preprocessing_ru/test_server.py
 create mode 100644 annotators/toxic_classification_ru/Dockerfile
 create mode 100644 annotators/toxic_classification_ru/README.md
 create mode 100644 annotators/toxic_classification_ru/requirements.txt
 create mode 100644 annotators/toxic_classification_ru/server.py
 create mode 100644 annotators/toxic_classification_ru/test.py
 create mode 100755 annotators/toxic_classification_ru/test.sh
 create mode 100644 assistant_dists/dream_russian/cpu.yml
 create mode 100644 assistant_dists/dream_russian/db_conf.json
 create mode 100644 assistant_dists/dream_russian/dev.yml
 create mode 100644 assistant_dists/dream_russian/docker-compose.override.yml
 create mode 100644 assistant_dists/dream_russian/pipeline_conf.json
 create mode 100644 assistant_dists/dream_russian/test.yml
 create mode 100644 common/remove_lists.py
 create mode 100644 services/dialogpt_RU/Dockerfile
 create mode 100644 services/dialogpt_RU/README.md
 create mode 100644 services/dialogpt_RU/requirements.txt
 create mode 100644 services/dialogpt_RU/server.py
 create mode 100644 services/dialogpt_RU/test.py
 create mode 100755 services/dialogpt_RU/test.sh
 create mode 100644 services/dialogrpt_ru/Dockerfile
 create mode 100644 services/dialogrpt_ru/README.md
 create mode 100644 services/dialogrpt_ru/data_pikabu.py
 create mode 100644 services/dialogrpt_ru/feeder.py
 create mode 100644 services/dialogrpt_ru/master.py
 create mode 100644 services/dialogrpt_ru/requirements.txt
 create mode 100644 services/dialogrpt_ru/server.py
 create mode 100644 services/dialogrpt_ru/test.py
 create mode 100755 services/dialogrpt_ru/test.sh
 create mode 100644 services/dialogrpt_ru/utils.py
 create mode 100644 skills/dff_friendship_skill/tests/how_bot_is_doing_RU_in.json
 create mode 100644 skills/dff_friendship_skill/tests/how_bot_is_doing_RU_out.json
 create mode 100644 skills/dff_friendship_skill/tests/lets_talk_RU_in.json
 create mode 100644 skills/dff_friendship_skill/tests/lets_talk_RU_out.json
 create mode 100644 skills/dff_friendship_skill/tests/no_hobbies_RU_in.json
 create mode 100644 skills/dff_friendship_skill/tests/no_hobbies_RU_out.json
 create mode 100644 skills/dff_friendship_skill/tests/no_recent_events_RU_in.json
 create mode 100644 skills/dff_friendship_skill/tests/no_recent_events_RU_out.json
 create mode 100644 skills/dff_generative_skill/Dockerfile
 create mode 100644 skills/dff_generative_skill/README.md
 create mode 100644 skills/dff_generative_skill/common/.gitkeep
 create mode 100644 skills/dff_generative_skill/requirements.txt
 create mode 100644 skills/dff_generative_skill/scenario/main.py
 create mode 100644 skills/dff_generative_skill/scenario/response.py
 create mode 100644 skills/dff_generative_skill/server.py
 create mode 100755 skills/dff_generative_skill/test.sh
 create mode 100644 skills/dff_generative_skill/test_server.py
 create mode 100644 skills/dff_generative_skill/tests/.gitkeep
 create mode 100644 skills/dff_generative_skill/tests/lets_talk_in.json
 create mode 100644 skills/dff_generative_skill/tests/lets_talk_out.json
 create mode 100644 skills/dff_intent_responder_skill/scenario/data/intent_response_phrases_RU.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_choose_topic_RU_in.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_choose_topic_RU_out.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_exit_RU_in.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_exit_RU_out.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_what_can_you_do_RU_in.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_what_can_you_do_RU_out.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_what_is_your_job_RU_in.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_what_is_your_job_RU_out.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_what_is_your_name_RU_in.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_what_is_your_name_RU_out.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_where_are_you_from_RU_in.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_where_are_you_from_RU_out.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_who_made_you_RU_in.json
 create mode 100644 skills/dff_intent_responder_skill/tests/intent_who_made_you_RU_out.json
 create mode 100644 skills/dff_program_y_skill/data_ru/categories/README.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/categories/bot_profile.aiml
 create mode 100644 skills/dff_program_y_skill/data_ru/categories/greeting.aiml
 create mode 100644 skills/dff_program_y_skill/data_ru/categories/letschat.aiml
 create mode 100644 skills/dff_program_y_skill/data_ru/categories/misunderstood.aiml
 create mode 100644 skills/dff_program_y_skill/data_ru/categories/no.aiml
 create mode 100644 skills/dff_program_y_skill/data_ru/categories/psychological_help.aiml
 create mode 100644 skills/dff_program_y_skill/data_ru/categories/thanks.aiml
 create mode 100644 skills/dff_program_y_skill/data_ru/categories/what_to_talk_about.aiml
 create mode 100644 skills/dff_program_y_skill/data_ru/categories/yes.aiml
 create mode 100644 skills/dff_program_y_skill/data_ru/debug/duplicates.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/debug/errors.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/licenses/README.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/licenses/license.keys
 create mode 100644 skills/dff_program_y_skill/data_ru/lookups/README.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/lookups/denormal.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/lookups/gender.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/lookups/normal.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/lookups/person.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/lookups/person2.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/maps/README.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/nodes/pattern_nodes.conf
 create mode 100644 skills/dff_program_y_skill/data_ru/nodes/template_nodes.conf
 create mode 100644 skills/dff_program_y_skill/data_ru/processing/postprocessors.conf
 create mode 100644 skills/dff_program_y_skill/data_ru/processing/preprocessors.conf
 create mode 100644 skills/dff_program_y_skill/data_ru/rdfs/README.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/security/usergroups.yaml
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/README.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/commit.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/commitment.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/crime.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/crimeverb.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/gonna.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/my.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/question_like.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/stupid.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/suicide.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/suicideverb.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/talk.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/wantto.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/sets/wishfor.txt
 create mode 100644 skills/dff_program_y_skill/data_ru/spelling/corpus.txt
 create mode 100644 skills/dff_program_y_skill/tests/who_built_you_RU_in.json
 create mode 100644 skills/dff_program_y_skill/tests/who_built_you_RU_out.json
 create mode 100644 skills/dummy_skill/README.md
 create mode 100644 skills/dummy_skill/russian_random_questions.txt
 create mode 100644 skills/personal_info_skill/test_EN.json
 create mode 100644 skills/personal_info_skill/test_RU.json
 create mode 100755 tests/runtests_russian.sh

diff --git a/.env b/.env
index 63d677513f..62e7cd93de 100644
--- a/.env
+++ b/.env
@@ -30,4 +30,4 @@ NEWS_API_ANNOTATOR_URL=http://news-api-annotator:8112/respond
 WIKI_FACTS_URL=http://wiki-facts:8116/respond
 FACT_RANDOM_SERVICE_URL=http://fact-random:8119/respond
 INFILLING_SERVICE_URL=http://infilling:8122/respond
-
+DIALOGPT_SERVICE_URL=http://dialogpt:8091/respond
diff --git a/Jenkinsfile b/Jenkinsfile
index f6e736356b..7ebbfb9659 100644
--- a/Jenkinsfile
+++ b/Jenkinsfile
@@ -157,6 +157,105 @@ pipeline {
         }
       }
     }
+
+    stage('Build-RU') {
+      steps {
+        script{
+          startTime = currentBuild.duration
+          Exception ex = null
+          catchError(buildResult: 'FAILURE', stageResult: 'FAILURE') {
+            try {
+              sh '''
+                tests/runtests.sh MODE=clean
+                tests/runtests_russian.sh MODE=build
+              '''
+            }
+            catch (Exception e) {
+              int duration = (currentBuild.duration - startTime) / 1000
+              throw e
+            }
+          }
+        }
+      }
+      post {
+        failure {
+          script {
+            sh 'tests/runtests_russian.sh MODE=clean'
+          }
+        }
+        success {
+          script {
+            int duration = (currentBuild.duration - startTime) / 1000
+          }
+        }
+      }
+    }
+
+    stage('Start-RU') {
+      steps {
+        script {
+          startTime = currentBuild.duration
+          Exception ex = null
+          catchError(buildResult: 'FAILURE', stageResult: 'FAILURE') {
+            try {
+              sh 'tests/runtests_russian.sh MODE=clean && tests/runtests_russian.sh MODE=start'
+            }
+            catch (Exception e) {
+              int duration = (currentBuild.duration - startTime) / 1000
+              throw e
+            }
+          }
+        }
+      }
+      post {
+        failure {
+          script {
+            sh 'tests/runtests_russian.sh MODE=clean'
+          }
+        }
+        success {
+          script {
+            started = true
+            int
duration = (currentBuild.duration - startTime) / 1000 + } + } + aborted { + script { + sh 'tests/runtests_russian.sh MODE=clean' + } + } + } + } + + stage('Test skills-RU') { + steps { + script { + startTime = currentBuild.duration + Exception ex = null + catchError(buildResult: 'FAILURE', stageResult: 'FAILURE') { + try { + sh label: 'test skills', script: 'tests/runtests_russian.sh MODE=test_skills' + } + catch (Exception e) { + int duration = (currentBuild.duration - startTime) / 1000 + throw e + } + } + } + } + post { + success { + script { + int duration = (currentBuild.duration - startTime) / 1000 + } + } + aborted { + script { + sh 'tests/runtests_russian.sh MODE=clean' + } + } + } + } } post { aborted { @@ -168,6 +267,7 @@ pipeline { script { if (started) { sh './tests/runtests.sh MODE=clean' + sh './tests/runtests_russian.sh MODE=clean' } } } diff --git a/README.md b/README.md index d4e5ca2d2a..f65d040672 100644 --- a/README.md +++ b/README.md @@ -7,8 +7,8 @@ To get architecture documentation, please refer to DeepPavlov Agent [readthedocs # Distributions -We've already included five distributions: four of them are based on lightweight Deepy socialbot -and one is a full-sized Dream chatbot. +We've already included six distributions: four of them are based on the lightweight Deepy socialbot, +one is a full-sized Dream chatbot (based on the Alexa Prize Challenge version) in English, and one is a Dream chatbot in Russian. ### Deepy Base Base version of Lunar assistant. @@ -47,6 +47,15 @@ because of its modular architecture and original goals (participation in Alexa Prize Challenge). We provide a demo of Dream Socialbot on [our website](https://demo.deeppavlov.ai). +### Dream Mini +Mini version of DeepPavlov Dream Socialbot. +This is a generative-based socialbot that uses [English DialoGPT model](https://huggingface.co/microsoft/DialoGPT-medium) to generate most of the responses. It also contains intent catcher and responder components to cover special user requests.
+[Link to the distribution.](https://github.com/deepmipt/dream/tree/main/assistant_dists/dream_mini) + +### Dream Russian +Russian version of DeepPavlov Dream Socialbot. This is a generative-based socialbot that uses [Russian DialoGPT model](https://huggingface.co/Grossmend/rudialogpt3_medium_based_on_gpt2) to generate most of the responses. It also contains intent catcher and responder components to cover special user requests. +[Link to the distribution.](https://github.com/deepmipt/dream/tree/main/assistant_dists/dream_russian) + # Quick Start ### Clone the repo @@ -153,39 +162,39 @@ docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.ove By default, `proxy.yml` contains all available proxy definitions. -# Components +# Components English Version Dream Architecture is presented in the following image: ![DREAM](DREAM.png) ## Annotators -| Name | Requirements | Description | -|-----------------------------|--------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| ASR | 30 MiB RAM | calculates overall ASR confidence for a given utterance and grades it as either _very low_, _low_, _medium_, or _high_ (for Amazon markup) | -| Badlisted words | 110 MiB RAM | detects words and phrases from the badlist | -| Combined classification | 1.5 GiB RAM, 3.5 GiB GPU | BERT-based model including topic classification, dialog acts classification, sentiment, toxicity, emotion, factoid classification | -| COMeT | 4.5 GiB RAM, 2.2 GiB GPU | Commonsense prediction models COMeT Atomic and ConceptNet | -| Convers Evaluator Annotator | 1.5 GiB RAM, 4.5 GiB GPU | is trained on the Alexa Prize data from the previous competitions and predicts whether the candidate response is interesting, comprehensible, on-topic, engaging, or erroneous | -| Entity detection | 3.1 GiB RAM 
| extracts entities and their types from utterances | -| Entity linking | 16 GiB RAM, 1.5 GiB GPU | finds Wikidata entity ids for the entities detected with Entity Detection | -| Entity Storer | 220 MiB RAM | a rule-based component, which stores entities from the user's and socialbot's utterances if opinion expression is detected with patterns or MIDAS Classifier and saves them along with the detected attitude to dialogue state | -| Fact random | 50 MiB RAM | returns random facts for the given entity (for entities from user utterance) | -| Fact retrieval | 400 MiB GPU | extracts facts from Wikipedia and wikiHow | -| Intent catcher | 2.7 GiB RAM | classifies user utterances into a number of predefined intents which are trained on a set of phrases and regexps | -| KBQA | 360 MiB GPU | answers user's factoid questions based on Wikidata KB | -| MIDAS classification | 4.5 GiB GPU | BERT-based model trained on a semantic classes subset of MIDAS dataset | -| NER | 800 MiB RAM | extracts person names, names of locations, organizations from uncased text | -| News API annotator | 70 MiB RAM | extracts the latest news about entities or topics using the GNews API. DeepPavlov Dream deployments utilize our own API key. 
| -| Sentrewrite | 30 MiB RAM | rewrites user's utterances by replacing pronouns with specific names that provide more useful information to downstream components | -| Sentseg | 1 GiB RAM | allows us to handle long and complex user's utterances by splitting them into sentences and recovering punctuation | -| Spacy nounphrases | 200 MiB RAM | extracts nounphrases using Spacy and filters out generic ones | -| Speech Function Classifier | | a hierarchical algorithm based on several linear models and a rule-based approach for the prediction of speech functions described by Eggins and Slade | -| Speech Function Predictor | | yields probabilities of speech functions that can follow a speech function predicted by Speech Function Classifier | -| Spelling preprocessing | 30 MiB RAM | pattern-based component to rewrite different colloquial expressions to a more formal style of conversation | -| Topic recommendation | 40 MiB RAM | offers a topic for further conversation using the information about the discussed topics and user's preferences. Current version is based on Reddit personalities (see Dream Report for Alexa Prize 4). 
| -| User Persona Extractor | 40 MiB RAM | determines which age category the user belongs to based on some key words | -| Wiki parser | 100 MiB RAM | extracts Wikidata triplets for the entities detected with Entity Linking | +| Name | Requirements | Description | +|-------------------------------|--------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| ASR | 30 MiB RAM | calculates overall ASR confidence for a given utterance and grades it as either _very low_, _low_, _medium_, or _high_ (for Amazon markup) | +| Badlisted words | 110 MiB RAM | detects words and phrases from the badlist | +| Combined classification | 1.5 GiB RAM, 3.5 GiB GPU | BERT-based model including topic classification, dialog acts classification, sentiment, toxicity, emotion, factoid classification | +| COMeT | 4.5 GiB RAM, 2.2 GiB GPU | Commonsense prediction models COMeT Atomic and ConceptNet | +| Convers Evaluator Annotator | 1.5 GiB RAM, 4.5 GiB GPU | is trained on the Alexa Prize data from the previous competitions and predicts whether the candidate response is interesting, comprehensible, on-topic, engaging, or erroneous | +| Entity detection | 3.1 GiB RAM | extracts entities and their types from utterances | +| Entity linking | 16 GiB RAM, 1.5 GiB GPU | finds Wikidata entity ids for the entities detected with Entity Detection | +| Entity Storer | 220 MiB RAM | a rule-based component, which stores entities from the user's and socialbot's utterances if opinion expression is detected with patterns or MIDAS Classifier and saves them along with the detected attitude to dialogue state | +| Fact random | 50 MiB RAM | returns random facts for the given entity (for entities from user utterance) | +| Fact retrieval | 400 MiB GPU | extracts facts from Wikipedia and wikiHow | +| Intent catcher | 2.7 GiB 
RAM | classifies user utterances into a number of predefined intents which are trained on a set of phrases and regexps | +| KBQA | 360 MiB GPU | answers user's factoid questions based on Wikidata KB | +| MIDAS classification | 4.5 GiB GPU | BERT-based model trained on a semantic classes subset of MIDAS dataset | +| NER | 800 MiB RAM | extracts person names, names of locations, organizations from uncased text | +| News API annotator | 70 MiB RAM | extracts the latest news about entities or topics using the GNews API. DeepPavlov Dream deployments utilize our own API key. | +| Sentrewrite | 30 MiB RAM | rewrites user's utterances by replacing pronouns with specific names that provide more useful information to downstream components | +| Sentseg | 1 GiB RAM | allows us to handle long and complex user's utterances by splitting them into sentences and recovering punctuation | +| Spacy nounphrases | 200 MiB RAM | extracts nounphrases using Spacy and filters out generic ones | +| Speech Function Classifier | | a hierarchical algorithm based on several linear models and a rule-based approach for the prediction of speech functions described by Eggins and Slade | +| Speech Function Predictor | | yields probabilities of speech functions that can follow a speech function predicted by Speech Function Classifier | +| Spelling preprocessing | 30 MiB RAM | pattern-based component to rewrite different colloquial expressions to a more formal style of conversation | +| Topic recommendation | 40 MiB RAM | offers a topic for further conversation using the information about the discussed topics and user's preferences. Current version is based on Reddit personalities (see Dream Report for Alexa Prize 4). 
| +| User Persona Extractor | 40 MiB RAM | determines which age category the user belongs to based on some key words | +| Wiki parser | 100 MiB RAM | extracts Wikidata triplets for the entities detected with Entity Linking | ## Services | Name | Requirements | Description | @@ -194,51 +203,84 @@ Dream Architecture is presented in the following image: | Infilling | 1.7 GiB RAM, 1 GiB GPU | generative service based on Infilling model, for the given utterance returns utterance where `_` from original text is replaced with generated tokens | ## Skills -| Name | Requirements | Description | -|---------------------------|-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Christmas Skill | | supports FAQ, facts, and scripts for Christmas | -| Comet Dialog skill | | uses COMeT ConceptNet model to express an opinion, to ask a question or give a comment about user's actions mentioned in the dialogue | -| Convert Reddit | 900 MiB RAM | uses a ConveRT encoder to build efficient representations for sentences | -| Dummy Skill | | a fallback skill with multiple non-toxic candidate responses | -| Dummy Skill Dialog | 600 MiB RAM | returns the next turn from the Topical Chat dataset if the response of the user to the Dummy Skill is similar to the corresponding response in the source data | -| Eliza | 30 MiB RAM | Chatbot (https://github.com/wadetb/eliza) | -| Emotion skill | 30 MiB RAM | returns template responses to emotions detected by Emotion Classification from Combined Classification annotator | -| Factoid QA | 200 MiB RAM | answers factoid questions | -| Game Cooperative skill | 120 MiB RAM | provides user with a conversation about computer games: the charts of the best games for the past year, past month, and last week | -| Intent Responder | 40 MiB RAM | provides template-based replies 
for some of the intents detected by Intent Catcher annotator | -| Knowledge Grounding skill | 60 MiB RAM, 1.5 GiB GPU | generates a response based on the dialogue history and provided knowledge related to the current conversation topic | -| Meta Script skill | 150 MiB RAM | provides a multi-turn dialogue around human activities. The skill uses COMeT Atomic model to generate commonsensical descriptions and questions on several aspects | -| Misheard ASR | 40 MiB RAM | uses the ASR Processor annotations to give feedback to the user when ASR confidence is too low | -| News API skill | 60 MiB RAM | presents the top-rated latest news about entities or topics using the GNews API | -| Oscar Skill | | supports FAQ, facts, and scripts for Oscar | -| Personal Info skill | 40 MiB RAM | queries and stores user's name, birthplace, and location | -| Personality Catcher | 30 MiB RAM | | -| Program Y | 800 MiB RAM | **[New DFF version]** Chatbot Program Y (https://github.com/keiffster/program-y) adapted for Dream socialbot | -| Program Y Dangerous | 150 MiB RAM | **[New DFF version]** Chatbot Program Y (https://github.com/keiffster/program-y) adapted for Dream socialbot, containing responses to dangerous situations in a dialog | -| Program Y Wide | 130 MiB RAM | **[New DFF version]** Chatbot Program Y (https://github.com/keiffster/program-y) adapted for Dream socialbot, which includes only very general templates (with lower confidence) | -| Small Talk skill | 35 MiB RAM | asks questions using the hand-written scripts for 25 topics, including but not limited to love, sports, work, pets, etc. | -| SuperBowl Skill | | supports FAQ, facts, and scripts for SuperBowl | -| Valentine's Day Skill | | supports FAQ, facts, and scripts for Valentine's Day | -| Wikidata Dial Skill | | generates an utterance using Wikidata triplets. 
Not turned on, needs improvement | -| DFF Animals skill | 250 MiB RAM | is created using DFF and has three branches of conversation about animals: user's pets, pets of the socialbot, and wild animals | -| DFF Art skill | 200 MiB RAM | DFF-based skill to discuss art | -| DFF Book skill | 450 MiB RAM | **[New DFF version]** detects book titles and authors mentioned in the user's utterance with the help of Wiki parser and Entity linking and recommends books by leveraging information from the GoodReads database | -| DFF Bot Persona skill | 170 MiB RAM | aims to discuss user favorites and 20 most popular things with short stories expressing the socialbot's opinion towards them | -| DFF Coronavirus skill | 150 MiB RAM | **[New DFF version]** retrieves data about the number of coronavirus cases and deaths in different locations sourced from the John Hopkins University Center for System Science and Engineering | -| DFF Food skill | 170 MiB RAM | constructed with DFF to encourage food-related conversation | -| DFF Friendship skill | 100 MiB RAM | DFF-based skill to greet the user in the beginning of the dialog, and forward the user to some scripted skill | -| DFF Funfact skill | 100 MiB RAM | **[New DFF version]** Tells user fun facts | -| DFF Gaming skill | 120 MiB RAM | provides a video games discussion. 
Gaming Skill is for more general talk about video games | -| DFF Gossip skill | 95 MiB RAM | DFF-based skill to discuss other people with news about them | -| DFF Grounding skill | 90 MiB RAM | **[New DFF version]** DFF-based skill to answer what is the topic of the conversation, to generate acknowledgement, to generate universal responses on some dialog acts by MIDAS | -| DFF Movie skill | 1.1 GiB RAM | is implemented using DFF and takes care of the conversations related to movies | -| DFF Music skill | 100 MiB RAM | DFF-based skill to discuss music | -| DFF Science skill | 90 MiB RAM | DFF-based skill to discuss science | -| DFF Short Story skill | 90 MiB RAM | **[New DFF version]** tells user short stories from 3 categories: (1) bedtime stories, such as fables and moral stories, (2) horror stories, and (3) funny ones | -| DFF Sports Skill | 100 MiB RAM | DFF-based skill to discuss sports | -| DFF Travel skill | 90 MiB RAM | DFF-based skill to discuss travel | -| DFF Weather skill | 1.4 GiB RAM | **[New DFF version]** uses the OpenWeatherMap service to get the forecast for the user's location | -| DFF Wiki skill | 160 MiB RAM | used for making scenarios with the extraction of entities, slot filling, facts insertion, and acknowledgements | +| Name | Requirements | Description | +|-------------------------------|----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| Christmas Skill | | supports FAQ, facts, and scripts for Christmas | +| Comet Dialog skill | | uses COMeT ConceptNet model to express an opinion, to ask a question or give a comment about user's actions mentioned in the dialogue | +| Convert Reddit | 900 MiB RAM | uses a ConveRT encoder to build efficient representations for sentences | +| Dummy Skill | a part of agent container | a fallback skill with multiple 
non-toxic candidate responses | +| Dummy Skill Dialog | 600 MiB RAM | returns the next turn from the Topical Chat dataset if the response of the user to the Dummy Skill is similar to the corresponding response in the source data | +| Eliza | 30 MiB RAM | Chatbot (https://github.com/wadetb/eliza) | +| Emotion skill | 30 MiB RAM | returns template responses to emotions detected by Emotion Classification from Combined Classification annotator | +| Factoid QA | 200 MiB RAM | answers factoid questions | +| Game Cooperative skill | 120 MiB RAM | provides user with a conversation about computer games: the charts of the best games for the past year, past month, and last week | +| Intent Responder | 40 MiB RAM | provides template-based replies for some of the intents detected by Intent Catcher annotator | +| Knowledge Grounding skill | 60 MiB RAM, 1.5 GiB GPU | generates a response based on the dialogue history and provided knowledge related to the current conversation topic | +| Meta Script skill | 150 MiB RAM | provides a multi-turn dialogue around human activities. 
The skill uses COMeT Atomic model to generate commonsensical descriptions and questions on several aspects | +| Misheard ASR | 40 MiB RAM | uses the ASR Processor annotations to give feedback to the user when ASR confidence is too low | +| News API skill | 60 MiB RAM | presents the top-rated latest news about entities or topics using the GNews API | +| Oscar Skill | | supports FAQ, facts, and scripts for Oscar | +| Personal Info skill | 40 MiB RAM | queries and stores user's name, birthplace, and location | +| Personality Catcher | 30 MiB RAM | | +| DFF Program Y skill | 800 MiB RAM | **[New DFF version]** Chatbot Program Y (https://github.com/keiffster/program-y) adapted for Dream socialbot | +| DFF Program Y Dangerous skill | 150 MiB RAM | **[New DFF version]** Chatbot Program Y (https://github.com/keiffster/program-y) adapted for Dream socialbot, containing responses to dangerous situations in a dialog | +| DFF Program Y Wide skill | 130 MiB RAM | **[New DFF version]** Chatbot Program Y (https://github.com/keiffster/program-y) adapted for Dream socialbot, which includes only very general templates (with lower confidence) | +| Small Talk skill | 35 MiB RAM | asks questions using the hand-written scripts for 25 topics, including but not limited to love, sports, work, pets, etc. | +| SuperBowl Skill | | supports FAQ, facts, and scripts for SuperBowl | +| Valentine's Day Skill | | supports FAQ, facts, and scripts for Valentine's Day | +| Wikidata Dial Skill | | generates an utterance using Wikidata triplets. 
Not turned on, needs improvement | +| DFF Animals skill | 250 MiB RAM | is created using DFF and has three branches of conversation about animals: user's pets, pets of the socialbot, and wild animals | +| DFF Art skill | 200 MiB RAM | DFF-based skill to discuss art | +| DFF Book skill | 450 MiB RAM | **[New DFF version]** detects book titles and authors mentioned in the user's utterance with the help of Wiki parser and Entity linking and recommends books by leveraging information from the GoodReads database | +| DFF Bot Persona skill | 170 MiB RAM | aims to discuss user favorites and 20 most popular things with short stories expressing the socialbot's opinion towards them | +| DFF Coronavirus skill | 150 MiB RAM | **[New DFF version]** retrieves data about the number of coronavirus cases and deaths in different locations sourced from the John Hopkins University Center for System Science and Engineering | +| DFF Food skill | 170 MiB RAM | constructed with DFF to encourage food-related conversation | +| DFF Friendship skill | 100 MiB RAM | **[New DFF version]** DFF-based skill to greet the user in the beginning of the dialog, and forward the user to some scripted skill | +| DFF Funfact skill | 100 MiB RAM | **[New DFF version]** Tells user fun facts | +| DFF Gaming skill | 120 MiB RAM | provides a video games discussion. 
Gaming Skill is for more general talk about video games | +| DFF Gossip skill | 95 MiB RAM | DFF-based skill to discuss other people with news about them | +| DFF Grounding skill | 90 MiB RAM | **[New DFF version]** DFF-based skill to answer what is the topic of the conversation, to generate acknowledgement, to generate universal responses on some dialog acts by MIDAS | +| DFF Movie skill | 1.1 GiB RAM | is implemented using DFF and takes care of the conversations related to movies | +| DFF Music skill | 100 MiB RAM | DFF-based skill to discuss music | +| DFF Science skill | 90 MiB RAM | DFF-based skill to discuss science | +| DFF Short Story skill | 90 MiB RAM | **[New DFF version]** tells user short stories from 3 categories: (1) bedtime stories, such as fables and moral stories, (2) horror stories, and (3) funny ones | +| DFF Sports Skill | 100 MiB RAM | DFF-based skill to discuss sports | +| DFF Travel skill | 90 MiB RAM | DFF-based skill to discuss travel | +| DFF Weather skill | 1.4 GiB RAM | **[New DFF version]** uses the OpenWeatherMap service to get the forecast for the user's location | +| DFF Wiki skill | 160 MiB RAM | used for making scenarios with the extraction of entities, slot filling, facts insertion, and acknowledgements | + + +# Components Russian Version + +Dream Architecture is presented in the following image: +![DREAM](RussianDREAM.png) + +## Annotators + +| Name | Requirements | Description | +|------------------------|--------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| Badlisted words | 50 MiB RAM | detects obscene Russian words from the badlist | +| Entity detection | 3 GiB RAM | extracts entities and their types from utterances | +| Entity linking | 500 MiB RAM, ?? 
GiB GPU | finds Wikidata entity ids for the entities detected with Entity Detection | +| Intent catcher | 900 MiB RAM | classifies user utterances into a number of predefined intents which are trained on a set of phrases and regexps | +| NER | 1.7 GiB RAM, 4.9 GiB GPU | extracts person names, names of locations, organizations from uncased text using ruBert-based (pyTorch) model | +| Sentseg | 2.4 GiB RAM, 4.9 GiB GPU | recovers punctuation using ruBert-based (pyTorch) model and splits into sentences | +| Spacy Annotator | 250 MiB RAM | token-wise annotations by Spacy | +| Spelling preprocessing | 4.4 GiB RAM | Russian Levenshtein correction model | +| Wiki parser | 100 MiB RAM | extracts Wikidata triplets for the entities detected with Entity Linking | +| DialogRPT | 3.8 GiB RAM, 2 GiB GPU | DialogRPT model which is based on Russian DialoGPT (see https://huggingface.co/Grossmend/rudialogpt3_medium_based_on_gpt2) and fine-tuned on Russian Pikabu Comment sequences | + +## Skills & Services +| Name | Requirements | Description | +|------------------------|---------------------------|-------------------------------------------------------------------------------------------------------------------------------------| +| DialoGPT | 2.8 GiB RAM, 2 GiB GPU | Russian DialoGPT model https://huggingface.co/Grossmend/rudialogpt3_medium_based_on_gpt2 | +| Dummy Skill | a part of agent container | a fallback skill with multiple non-toxic candidate responses and random Russian questions | +| Personal Info skill | 40 MiB RAM | queries and stores user's name, birthplace, and location | +| DFF Generative skill | 50 MiB RAM | **[New DFF version]** generative skill which uses DialoGPT service to generate 3 different hypotheses | +| DFF Intent Responder | 50 MiB RAM | provides template-based replies for some of the intents detected by Intent Catcher annotator | +| DFF Program Y skill | 80 MiB RAM | **[New DFF version]** Chatbot Program Y (https://github.com/keiffster/program-y) adapted
for Dream socialbot | +| DFF Friendship skill | 70 MiB RAM | **[New DFF version]** DFF-based skill to greet the user in the beginning of the dialog, and forward the user to some scripted skill | +| DFF Wiki skill | 150 MiB RAM | used for making scenarios with the extraction of entities, slot filling, facts insertion, and acknowledgements | # Papers ### Alexa Prize 3 diff --git a/README_ru.md b/README_ru.md new file mode 100644 index 0000000000..e92a266295 --- /dev/null +++ b/README_ru.md @@ -0,0 +1,227 @@ +# DeepPavlov Dream + +**DeepPavlov Dream** -- a platform for building modular dialogue systems. + +Architecture documentation for DeepPavlov Agent can be found in the [readthedocs documentation](https://deeppavlov-agent.readthedocs.io). + +# Distributions + +Six distributions are currently available: +- the full English version of the DREAM bot (based on the version of the bot that took part in the Alexa Prize Challenge), +- 4 Deepy distributions, which are lightweight English versions of the bot, +- a Russian-language dialogue system built around the generative DialoGPT Russian model. + + +### Deepy Base +The base version of the Lunar assistant. +Deepy Base contains the Spelling Preprocessing typo-correction annotator, +the template-based Harvesters Maintenance Skill, +and the AIML-based open-domain skill written with the Dialog Flow Framework. + +### Deepy Advanced +The advanced version of the Lunar assistant. +Deepy Advanced contains the Spelling Preprocessing typo-correction, +Sentence Segmentation, +Entity Linking, and Intent Catcher annotators, +the Harvesters Maintenance GoBot Skill for goal-oriented responses, +and the AIML-based open-domain skill written with the Dialog Flow Framework. + +### Deepy FAQ +The FAQ (Frequently Asked Questions) version of the Lunar assistant.
+Deepy FAQ contains the Spelling Preprocessing typo-correction annotator, +the template-based Frequently Asked Questions Skill, +and the AIML-based open-domain skill written with the Dialog Flow Framework. + +### Deepy GoBot +The goal-oriented version of the Lunar assistant. +Deepy GoBot Base contains the Spelling Preprocessing typo-correction annotator, +the Harvesters Maintenance GoBot Skill for goal-oriented responses, +and the AIML-based open-domain skill written with the Dialog Flow Framework. + +### Dream +The full version of DeepPavlov Dream Socialbot in English. +This version is almost identical to the DREAM socialbot from +[the end of Alexa Prize Challenge 4](https://d7qzviu3xw2xc.cloudfront.net/alexa/alexaprize/docs/sgc4/MIPT-DREAM.pdf). +Some API services have been replaced with trainable models. +Some services (e.g., News Annotator, Game Skill, Weather Skill) require private +API keys; most of them are free to use. +If you want to use these services in a local version of the bot, add your keys to the environment variables +(e.g., `./.env`). +This version of Dream Socialbot consumes a lot of resources because of its modular architecture and original goals +(participation in the Alexa Prize Challenge). A demo of the bot is available on [our website](https://demo.deeppavlov.ai). + + +### Dream Mini +A mini version of DeepPavlov Dream Socialbot. +This version is based on neural generation using the [English DialoGPT model](https://huggingface.co/microsoft/DialoGPT-medium). +The distribution also contains components for detecting special user requests and responding to them. +[Link to the distribution.](https://github.com/deepmipt/dream/tree/main/assistant_dists/dream_mini) + +### Dream Russian +The Russian version of DeepPavlov Dream Socialbot.
+This version is based on neural generation using the
+[Russian DialoGPT model](https://huggingface.co/Grossmend/rudialogpt3_medium_based_on_gpt2).
+The distribution also contains components for detecting user requests and producing special responses to them.
+[Link to the distribution.](https://github.com/deepmipt/dream/tree/main/assistant_dists/dream_russian)
+
+# Quick Start
+
+### Clone the repository
+
+```
+git clone https://github.com/deepmipt/dream.git
+```
+
+
+### Install [docker](https://docs.docker.com/engine/install/) and [docker-compose](https://docs.docker.com/compose/install/)
+
+If you get a "Permission denied" error while running docker-compose,
+make sure [your docker client is configured](https://docs.docker.com/engine/install/linux-postinstall/) correctly.
+
+
+### Run one of the Dream distributions
+
+#### **Deepy**
+
+Replace `VERSION` with the name of the distribution you need: `deepy_base`, `deepy_adv`, `deepy_faq`, `deepy_gobot_base`.
+
+```
+docker-compose -f docker-compose.yml -f assistant_dists/VERSION/docker-compose.override.yml up --build
+```
+
+#### **Dream (via proxy)**
+The easiest way to try out Dream is to run it with proxy services.
+All requests are forwarded to the DeepPavlov API, so you do not need a lot of resources.
+Only the agent and the mongo database are run locally.
+See [proxy usage](#proxy-usage).
+```
+docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml -f assistant_dists/dream/proxy.yml up --build
+```
+
+#### **Dream (locally)**
+
+**This DeepPavlov Dream distribution requires a lot of computational resources.**
+See the [Components](#components) section for an estimate of the requirements.
+```
+docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml -f assistant_dists/dream/dev.yml up --build
+```
+We also provide a configuration file (`assistant_dists/dream/test.yml`) that distributes components across GPUs for servers with several GPUs available.
+
+```
+AGENT_PORT=4242 docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml -f assistant_dists/dream/dev.yml -f assistant_dists/dream/test.yml up
+```
+If you need to restart a particular container without re-building (make sure the folder mapping in `assistant_dists/dream/dev.yml` is correct):
+```
+AGENT_PORT=4242 docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml -f assistant_dists/dream/dev.yml restart container-name
+```
+
+### Usage
+In a separate terminal tab, run:
+
+```
+docker-compose exec agent python -m deeppavlov_agent.run -pl assistant_dists/dream/pipeline_conf.json
+```
+
+Enter a user name and you can start chatting with Dream!
+
+
+### Usage with the HTTP API
+Once you have the bot up, the DeepPavlov Agent API is served at `http://localhost:4242`.
+The default DeepPavlov Agent browser interface is available at `http://localhost:4242/chat`.
+To learn more about the API, see the [DeepPavlov Agent Docs](https://deeppavlov-agent.readthedocs.io/en/latest/intro/overview.html#http-api-server).
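+As a minimal sketch of talking to the HTTP API from code, the snippet below sends one utterance using only the Python standard library. The `{"user_id", "payload"}` request shape follows the DeepPavlov Agent HTTP API docs linked above, but treat it as an assumption and verify it against your deployed agent version; `AGENT_URL`, `build_payload`, and `send_utterance` are illustrative names, not part of the repository.
+
+```python
+import json
+import urllib.request
+
+# Default port from the docker-compose commands above.
+AGENT_URL = "http://localhost:4242"
+
+
+def build_payload(user_id: str, text: str) -> bytes:
+    """Encode one utterance as a JSON request body.
+
+    The {"user_id", "payload"} shape follows the DeepPavlov Agent HTTP API
+    docs; double-check it against the dp-agent version you deployed.
+    """
+    return json.dumps({"user_id": user_id, "payload": text}).encode("utf-8")
+
+
+def send_utterance(user_id: str, text: str, url: str = AGENT_URL) -> dict:
+    """POST the utterance to a running agent and return its JSON reply."""
+    req = urllib.request.Request(
+        url,
+        data=build_payload(user_id, text),
+        headers={"Content-Type": "application/json"},
+    )
+    with urllib.request.urlopen(req) as resp:
+        return json.loads(resp.read().decode("utf-8"))
+```
+
+With the distribution up, `send_utterance("test-user", "Привет!")` should return the agent's JSON response for that utterance.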
+
+# Configuration and proxy usage
+Dream uses several docker-compose configuration files:
+
+`./docker-compose.yml` is the main file, which includes the DeepPavlov Agent and mongo database containers;
+
+`./assistant_dists/*/docker-compose.override.yml` contains all the components of a distribution and their main parameters;
+
+`./assistant_dists/dream/dev.yml` includes volume bindings for easier debugging;
+
+`./assistant_dists/dream/test.yml` contains the assignment of components to the available GPUs;
+
+`./assistant_dists/dream/proxy.yml` contains the list of proxy containers.
+
+If your resources are limited, you can replace some containers (e.g., all except the ones you are developing locally)
+with proxy versions hosted by DeepPavlov.
+To do so, make sure they are defined in `proxy.yml`, for example:
+```
+convers-evaluator-annotator:
+  command: ["nginx", "-g", "daemon off;"]
+  build:
+    context: dp/proxy/
+    dockerfile: Dockerfile
+  environment:
+    - PROXY_PASS=dream.deeppavlov.ai:8004
+    - PORT=8004
+```
+and include this file in the launch command:
+
+```
+docker-compose -f docker-compose.yml -f assistant_dists/dream/docker-compose.override.yml -f assistant_dists/dream/proxy.yml up --build
+```
+
+By default, `proxy.yml` contains all containers except the agent and the database.
+
+
+# Russian Dream Components
+
+The architecture of Russian Dream is shown in the image below:
+![DREAM](RussianDREAM.png)
+
+## Annotators
+
+| Name | Requirements | Description |
+|------------------------|--------------------------|-------------|
+| Badlisted words | 50 MiB RAM | Detects obscene words, based on lemmatization with pymorphy2 and a lookup in a dictionary of obscene words.
|
+| Entity Detection | 3 GiB RAM | Extracts non-named entities and determines their types in lower-cased Russian text, based on the ruBERT neural model (PyTorch). |
+| Entity Linking | 300 MiB RAM | Links entities extracted by Entity Detection to their Wikidata ids, based on a distilled ruBERT model. |
+| Intent Catcher | 1.8 MiB RAM, 4.9 GiB GPU | Detects special user intents, based on the multilingual Universal Sentence Encoder model. |
+| NER | 1.8 GiB RAM, 4.9 GiB GPU | Extracts named entities from lower-cased Russian text, based on the Conversational ruBERT neural model (PyTorch). |
+| Sentseg | 2.4 GiB RAM, 4.9 GiB GPU | Restores punctuation in lower-cased Russian text, based on a ruBERT neural model (PyTorch) trained on Russian subtitles. |
+| Spacy Annotator | 250 MiB RAM | Tokenizes text and annotates tokens using the spacy library and its "ru_core_news_sm" model. |
+| Spelling Preprocessing | 4.4 GiB RAM | Corrects typos and grammatical errors, based on a Levenshtein-distance model; uses a pretrained model from the DeepPavlov library. |
+| Toxic Classification | 1.9 GiB RAM, 1.2 GiB GPU | Toxicity classifier for filtering user utterances, [from Skoltech](https://huggingface.co/SkolkovoInstitute/russian_toxicity_classifier) |
+| Wiki Parser | 100 MiB RAM | Extracts Wikidata triplets for the entities found by Entity Detection. |
+| DialogRPT | 3.9 GiB RAM, 2 GiB GPU | Estimates the probability that a response will be liked by the user (updown), based on the DialogRPT ranking model fine-tuned on top of the Russian DialoGPT generative model on comments from the Pikabu website.
|
+
+## Skills & Services
+| Name | Requirements | Description |
+|----------------------|---------------------------|-------------|
+| DialoGPT | 2.8 GiB RAM, 2 GiB GPU | Generates a response from the text context of the dialog, based on the pretrained Russian [DialoGPT](https://huggingface.co/Grossmend/rudialogpt3_medium_based_on_gpt2) model |
+| Dummy Skill | a part of agent container | Produces fallback responses and serves random questions from a database as linking questions. |
+| Personal Info Skill | 40 MiB RAM | Scripted skill that extracts and remembers basic personal information about the user. |
+| DFF Generative Skill | 50 MiB RAM | **[New DFF version]** Skill that returns 5 hypotheses generated by the DialoGPT service |
+| DFF Intent Responder | 50 MiB RAM | **[New DFF version]** DFF-based scripted skill that responds to special user intents. |
+| DFF Program Y Skill | 80 MiB RAM | **[New DFF version]** DFF-based scripted skill that answers general questions via an AIML component. |
+| DFF Friendship Skill | 70 MiB RAM | **[New DFF version]** DFF-based scripted skill for the greeting part of the dialog with the user. |
+
+
+# Papers
+
+### Alexa Prize 3
+[Kuratov Y. et al. DREAM technical report for the Alexa Prize 2019 // Alexa Prize Proceedings. – 2020.](https://m.media-amazon.com/images/G/01/mobile-apps/dex/alexa/alexaprize/assets/challenge3/proceedings/Moscow-DREAM.pdf)
+
+### Alexa Prize 4
+[Baymurzina D. et al. DREAM Technical Report for the Alexa Prize 4 // Alexa Prize Proceedings. – 2021.](https://d7qzviu3xw2xc.cloudfront.net/alexa/alexaprize/docs/sgc4/MIPT-DREAM.pdf)
+
+
+# License
+
+DeepPavlov Dream is licensed under Apache 2.0.
+
+Program-y (see `dream/skills/dff_program_y_skill`, `dream/skills/dff_program_y_wide_skill`, `dream/skills/dff_program_y_dangerous_skill`)
+is licensed under Apache 2.0.
+Eliza (see `dream/skills/eliza`) is licensed under the MIT License.
+
+
+## Report creating
+To generate a certification `xlsx` file with bot responses, use the `xlsx_responder.py` script:
+```shell
+docker-compose -f docker-compose.yml -f dev.yml exec -T -u $(id -u) agent python3 \
+  utils/xlsx_responder.py --url http://0.0.0.0:4242 \
+  --input 'tests/dream/test_questions.xlsx' \
+  --output 'tests/dream/output/test_questions_output.xlsx'\
+  --cache tests/dream/output/test_questions_output_$(date --iso-8601=seconds).json
+```
+Make sure all services are deployed. `--input` is the `xlsx` file with certification questions, `--output` is the `xlsx` file with bot responses, and `--cache` is a `json` file that contains detailed markup and is used as a cache.
diff --git a/RussianDream.png b/RussianDream.png
new file mode 100644
index 0000000000000000000000000000000000000000..0e32166de95e32c318dcf5692e6bb9ffcfcbd9a7
GIT binary patch
literal 207146
z;+;_*?xU;&)JMxMnwYrwc&S!n@sR!nwpzcM7QjcOPk6%)+G`hpGKKz{9V%k3)&=RxkMj{XA@JGZUZ3^W{xj z_s#KY9`IlY!4EAcy}_I-wE3dwdxm{8R#W1cOGjT0#Jzb*bWbS&bk>9;lZ7}{svlv?;m>TK@qRd(%L zrqh}WaKEEKqqW9MCEvGdD}?t!$Wdq{N#J^OkO+{95H2=3N9*b9FHU|9U*G%M)JEc* zP1Gq&26BnxjNhcc?3IwcYKPRG_3yMzGNzFt(qKHhiJ+h@EZt_QNDUIl^yg-`3gM#5 z&p8L;C3U|ilr+ub5HqP%Z1xKJ9QOJ&<0?CD?r&_%YI-8Gu8-T@f}6rPLCsU#H2BXS zV*5b$)|f(4=SMS@+4=d7(ofebrjtUnr9S2GUy%=8Zb2lj*!HTzZFu*;;L%7%0EK2y zxaF~f)w!FuK>oL*&d$ye;0SG?&ex#eU`0+_v<*1db@CA!F+5neUnD!KTDZHTW5{3DltnL7tCF4) z6*7>V&i9pRH)B3Exb6{&YmOh&b-3nvWK9p37R*}oBL7{CEm+LF0nxP<5emtr z+hn!lvpDbjaBAz&^czopM=7blvx##-vOi&i);WQ{F%0PKl>j*UM6x$0eQX}MCr^i{ zVQI|A@QX6(uy-9CK~E>hO!u63F8AT?yd@>letb4h34(~XZ+}S)z~dc{?60E7=W5xm@x@J-R!9 zUz6zMY?YB}*3f_SF_n25`^+k5>gW zGJ=YdQ8dt9p?g|)cZZ9vE*De*kFK{--vpe5)hxxw){L1;BB*3Q=|4G{BpX=ZO%UtE z_iS!X4SzbFWEQi`3IS4&yv34_I(#jHb?u%%9N$e@4p2!Z1w_Gb<{3cG(kP4x2FH^n zmY2W-gaH&#(mf&`6Es=sYmX$yN1m_lFa`%VNmy!Yr1ubU3fc6jx{y(R)DnCOb*OlG zSl*hzY+b<`A&m#Sf+l%H<@jC)__}nUHm-%HNn8z>ScGRHLWLr)(=W z2EWR`Y}#OBC+NtMQZW7fpF@ZXVNDWSMCkj&iMPa9SfL=C_@f6f`&~KkA`$)7!|Phuz=`9)cBNN>?LXde1iY{D0nB z$WuxtmPXJ|ejt1{2o!WGrqksSw~L)?YphrAs_x|}1zXMONOa(2EijW4gQEE@0TwGjos$6IMDvcz@tI)TlM?`uxe5QrpyJ+D>TEb^0$%|2S< z-~`4sJGoR`RE26q>QX|jJ)^|88QFkN#J@^au^}Xq_^&BQ5L23~(wP=`%~b7mxFTle z-F5d??pXYN^A%4(OQlJfr&sAsLgK?q9Q|2FiDYJqly#YI7YQ?wSv2Br*J}lu-lo4U zD`W3K$5)OVGcL;%XUk{(`{h1ZlBETa$U3xy`Jj|g8h7tYy7=se{fTN|?MFR4vjPsO zp*)XY=(DFkZ$|-#W-XtjwU?K@^UXo0HG*w;`;p)LM6_4m>tE4Sd2Gc7c(@cZde%;Wm^^up#OYafmcf!`3}r(JW@ zjGME;-araisA>}3xoCbd$djl>Xw;{qrb!$=_q-9H5S(J$n(32ER7p0A?|qjo)C=RD zL!%m)6=@nD`r0=DRblT2##3JZ#e&SFf;kYw8wz|QN|+G07fE)+xIR0Ykr21msDa+N z=LW)s1bRVuk7dg=9L@}xt~qqQN`Pv?9U{=nZf z!sXc2@f9>Pty)OoedRK8*woF*Ow5;IeBxo9olvk7eam**>}UkX*U&7|IcQ9{$uLVW zAANhw3M2F4+1b?)j%#zK!-rXT&)%xYqx30fH`34UaD-|+XX&5BF!}XF2yLX*xTvxB zEntD|0HMNP47%ZLGqm|phkxcE%;ko47VI%|N?Bp6`)b2xGUtcjQ_{rv!=ew|pHKlt zIQH%Yms#LSgqnn!fiF=iq6Mqg71^^eO=rcAh zX|shLRFwQ#nd<$GQL)Rp+%}KU56wGGGalL&*B?zx)7RfwYSjb3gnEIbIO{{;R#{R@ 
zcmcFV6@S|8VAoukc9Q&b-Nk!{rfrc>q>Lx({QBMBJz$Ad@31ddr2tW_!4K4yBcoMyT}iiV4mpyIzipDo+I|Gb@Hak5KhkZ0&!3UhO87sU zGEnBK=Rh*{l6O-o2J?f94keVQyz_87X{W5CjTpv$BuFE0PY(9nEZmO#IRJuyRHOuw zQ#s|#IH#?;8NU|R{F~(sFrBXVaSs~mC8c9&$$0GxPD2Mi-}Byf`a|sQe%WFOj}wa0 z(&*Er``O$*Vrew2RnOG^ctvX>uetnLYo`>osl^hZ*~39*fL}9GHHJaO!4U`S_V?3n zUTq=NoI6vAo}F=TWD`Sd;5?6BFrIzJdaJX_Jw@|T2gpKnN6gp0$M759U;nlEKdX0d zeWrF!x9QK7Mu>YP9fU=7mUrS`>+@%y_r5vixqVq8z%3;m`$mX`Q2xp|vEkbByv%c6 zidm;87@dccQSNutoAZw}%-W*k#j)VdT1p`0Ql01TM_bsdWifr*lRnbR?A$+#+xvKe zMdhew-HZ1JqqM;zbc%_OZgh-y#dXR=q?}IQ9^VbZCv9I+eQD`D2vqQrMQkHgUa~%h zhoJQlh?a}nn2?tWV2l*;jrm(~89i)06P4EQ*%`;dtKC6sa%;#IFaMy6h>t6acURS? z+idStQn;*5OgfegH&>TOTN9e1r3aWVzG)?{c3;T+0S&i?C)&{`!fZ7<>`x?6ESjDk z`sS~1k6lXyj|ErnP=ONbPYB(|_#*Js-a;zP)RW_Y>$!%ah=yG1YbTI(~A^iBujTn@v2S1FN?R^XltOv&&8`OlOv{xu~cXudh1XZ^?_Ija2(7>+4m zT-XLQc1gH`t*|$~iz9#m>->c*P<|rz-B46!laRpAZqa|fh4g>mTlOPM<%7Btjrycj zN#FRJ5-W}R_3AElt_Y>~;>xhf>JBG@142oSSRaqe0>)4M>HW-gJm$lHp6|SN)2{u( z9l9+bEFE1ZsIAfB&JCQYpQcN%>7N>1S@QKi@-2cNCk^?Up6sx(ilfosghM>*m~gn% zAP%&6^9+7RDFjIG6xiL2Vb5h^#L_4&&^-;2I%fE~qV^q@K(GRoiozHN$7=E)-lYn- zYu}ez_jj{>+x=9<)q_f$B}eEOp{4!hr{J8sOT7Begamv&{>IvaOk3*V(X}U%@jJ&O z8ZvvjJ{_9Ywj%Z9+!!`>#iG-1TCJiN^c+^7!eJ3H<#*dAl*>(okSs_fFNmhI0Pb=| z+Vgg-l?2L=V@yp-Y?>#NbV+a17hxjyndr{YP9qLdMy^)R&Cd{8EFD(meX=VJwEs00OAfd?I;YjyFa#e zae5+e4h+V;FZr=Qn=jopHc@QBs;4ur?sR&MsKTfks;95lm9L-oJ%g$wX-1|nK<=oQ zdcy%W8|lZr|Cy3M9Zvqo{_dFeA=U=5--PEYOClMi>2?-!zT!liXcK+Qm_Ewdl3Rof zTiPiX6gp3kE7E+S=v&3iLK67LA1v8_!0f%HnhL@8V)+I8UL)JNqilmJMGv?R%$cmua)UL|(88VL>}N zHR$4u#%!rJQH5m4uVZ`Lgz#>VOQYorceCfy5_uirh1p0wt^ObgHAi4G^rd#IO{y?G zCkwrR`l&(n*8aCuaMQbdN>BhPQl57@+G`HXFc@A<3eWp6Gw>8g2vt|P=B8$WnP?J+ zF(IXF)Xe_a{*JB5*i8D)!Mr2tUc1kEaZVMD`DHN)!7@!}73{f50-sc`wdsF2Tn*wv zkdwqk@>K@wRw>V*%k@er+iCKl-h0}1?XUZqX0z7iWRf;tAQq!KYx$yBzDff^A?=WG zrLEBwjPaC5lD(Y6|ra%`Va#7*!i#^Ux)UHNHxKXegxMa3@?%Pc*N%rKPYM#!f zY?UO1IG1!hIkCn2a_^`&GJ0|06p&+!ea+pmLi&^BxLk7aLlAZXN5r*^phs1^viv}_ z02<@xRAk2;Um|th3Oe9hzG?i7Eu2zud#EcX!Kz8pZ;$ 
z+vR)8JIeNd&u^1d@+Cr?b2Zb)9Q03{1u7+=Sem%-s5c*_)yft#aLgS>`qj$1^f*gW z#O%#gu^%0%K&MBe4q5?YtA?6>*Y}M^j|1-}eu@%T8JWm%y_$EAG4u0antErOYd4Ya zfzua5&dBHb=+k})%2}fpxL+#%REmGFd>KEd<$EvKMhlGqmc`)y{=+|LE`Lj5;-|Lx zPmha$#MT~qQl~mRZD*&gP2wsCt~ts;iqkNzd!M3{hM{3SkHX4IrHcsMu>c= zoaIh5E~9!37aHn~euPH6fRt*Q}$Wzlhpd9W2=&pB!(%yS(au8E?zu zC2%krBpJ)R!IsAR8bDHFIFDraLfjM*$d|pJbZR#7V)@TvhU7HjbmrF8`{)rYmzzZD zEEjMagC2f}65%RFVv2SiExgm|PdsZnjADF}&&c#22neEp$Xk$T)CuK&`F9^J)~*a_ zxmhL3NTFjdL=fmZ5?#Q0)NZjLz@brWGa}w^_I~F$38GWV_^wXK4TPk9zH+>a?d1*z zH1T%G#EFt=_qssm^#*->9*{6Gkip$;XbReq_3mp9C%mqfX807Jc!c1YNy-};gWt(4 zo?d)_53PHM*!7r()cneKBgbok`%H71QQohjxn&qkI%2mYAx*b`-`*6=@9X-nGH?|}A zMNELa-CncbP&q!vDD^>MDxv3Ecg=KZ;6_g=?yI`6e@G$`Z=M&r(AjfXtAWJ{dkqrh zUc+e50BV9FDsOYn*EEh5-Fk7YGK9kEjvSujab3TV*m)GzJpLT)U*&0j@ZND1G!~Q*37kqSe zA;-2+C;GZEt8q(V2lM6RKFRt^7+FL?F1z4fU06ox3Di3q{E|=AipA7N46Fv{=#W5E z{S`d@9yeB1>=!q8&P7kPv@|DDRrY;TING%`kt_{wIosybZLVD=rYADysW}p7Fj2WP zLFwATazBlGNC>&DVz|_P8A?CeM$Lxt0B)zrUlwyO16|+Op8!s2fCGGQsnWjda-$Ef zJXxuOR2dHc-(8M&*e0xBQ<4`5g|`DrJOHJC&$w4RqNa5hf~-u)VJ?A{CxA=>H)D| z(l72~Zqbe`S<%MfX_*;0zlFD3eqvav@7;d|q!IEh>n~SAkc|F$-{V4&@jAi-L@-0j zB%M~Wc8e43M09_8s{_eNgIk>Ua>D*>3toj%n>Tf)lL}?oNEOcdeZtl`by!-qL&kw$ z@rL_OOjkca zv}Yc}Y^7rDzL0XYvcf7(3_^>jV0eJK6d#m022dfKAVB0%89P;?+lKfc%FPgPiMVB& zrub~7Q8k_3Xyu^YfeY-$P+~07dOvxw#QKT^*Z#@I)%hJmM>b8X%H8xiWTGJThU2)~ zr{7nrdqaR zZlcas0AhJrF?&^oBkgXa=I&C!Pb>GGqioj4DMS^u>1Wh~b1e=i1_7M1r2d$D;20bK z4vpIuMUko^Hq^xa*`vC#{(4mJ{AF)-sbcy@O84k}=9qEX*v-d+jVqJsXa~z>Mlysx zj1S4Egxa;%J#H2)!Z$Rh%MuV$@Xnb>PGJp-aSPEwd^0O5Uzf;G{e|-9Kc!6?Au&x4 zEwsmqGP8mfdaV<3RBE2X=%kcd5Gft#x_sIKqG`cHmWwtv8Z9<%Dexe6KSd0WoAW7! 
z1&+2|H{_CWyX%bRq_`_w;3lEvPS-C1%m*2<>v|)eR;9-gTcSMmZQhwaeq=@d#H9=x zc&WxKjbp9<=G`#CcLJ6;+(C{ep`oFfh<76SN>y?@y^j>_j!iQ8CGKT1$nkWbDMq#U z!-Zh2zbh9Um~XP$UM>-^ z(5O-eaW+Y(*U^g}3;eCvDF3};D@}qO(rJ?Q@AF_V;&w4XBjm}iT7SKd2-ZP;Or(<2 zp;_<1`o)^-CZfg5UXxq~L?hMIfRQL1P5)as!$Cp#x?%FAQfz40YB4|)R7rAj`Qc#9 zS!$=zo1#mMVex0D**wcQ?rHz8GM#2jSV0fxDOD52l7jliSFJG`4K8n0m_GY!H@i`O zST6T$F0cA-`1T@E>tbi)I@{JWT?(u9PN=xMkC;A?TA7;$2bCL5{3f|}QANA^{rUjA zL~5rhN8zLtil;n*)W0fU%Xy0NpMB18|8Iz?xgYfvA7ABuLO_5Se>rwI-sm@cm{4YT zH(%i@ege|-ur`_;Qdw>23R8VXUn$mm1^^$;eN-$?np9PoE9<&3E-=C+bZA>z# zNO56#zd%-GkXDBS;!L`I{2Nn6!UflIbVez8T#Y)*5;AG)h}R$v6UFpf)E)Y-t-oEx zsv;U#zUI{DxQ>Zp?wI$hXzh}wTCFFk+T)6I4{BdZ4TU@N8%L_)J>r=< z-B~{-IV)Fw{-|4|xCVX3U{YDWLnDEK`s9Ah48$I-0i)yXU*{jEFHz4n49|5OZY00! z7pH?dhKxB+3Bo{5EaB=zYOhK^H$??p5>agq8;eifIip2~oYAv<@gbAco`6i|~QWmvgR=sR?6;$-MZr zXt9Mb^jc;1fQ&V;VzO+BzrZ9qD)ELTgAyXMK8ejHVYaLuJ;4jJTz`T5?^J)7La~Ut zgGf`YjraY;gUod9o3yFvF&d9~g8~%%0ue9LxqF#UfP53@xISIwu zu=8H-y_zrPL3;S<_EB@1Ue0y+OH&ANjitx)v)TY#0#<`Q!TE3ttWNeNu^h7s zv(EO{t&0teZpk7TGQn0(mlND(>OEgTTmZQUuTrS`TQ~XUtFQg!5B)cSY#%t$vx$~UHhVt zt+zaT^q;3mq4@}n7!?At5X2RRocye40LH6O=F8ns$3(ror_0IHBu(xOwPcL^j8B!r z#py(?As^tu<6me&s->TNE=jn^#Xb@x1P>9bOYV&2I()Jy0nm?YzV%+2$-`zd$KK$4 zgKCjt62F;zEWa7LFd}BBTBBwJDT2gnD2O~JRbylmpTc~VyWHpzokA@vTJm#mZ zx8A1QzS?DUQx)N|i)pPmk#r(Bi(sAp!gCb`R4uD#B_RLP*#7y2u&LMzh$P{)Z>juU z1l)h*ptKFef3UP92C5pfy3DfMNy*IpH}^)Q5ZwiKon$(Dx;aM==7X7P-&?y|VVZa~ z?$gVBw}p5BOUsbBlndqRD~4YJM>A5W$yR@cdhNX618Vs8C^HXqYc2whs5uU{6j$^J zI>CC}Wj#QH$MyA2U6ZH!3f+CGyKdA>s49{oK1%W>^*I zc|C`PDpLo;UYcYF7fO0~AFS{o#_|-YJ(-r zq^q9yKb;S97mES%O;|rKEQbZBSglf$mejgh<_~oF0HJADh+!om2R$DmIV5Y-j5~F9 z?^m_V+up`c7CmOd)Z*dD_`L=XsAB>aX&C%Pl3Rlc<8l~_JaFn(DsaTj+sUURJ<$fQ z&ADaLS^}ALnho)I)2+F-u8wHl)W`7tgmd=s5`jTSjYUDXl$-du`eNSv-y>X9TwF;a5`7g=S^zl94+)RnATIV#xy5W)Ia@ul;uE{`dyTqX zEw`mkqdlYfzTtgN)As`_`%yYsB@acObYDSZi2r{^!jYu1cHOjNR4UiF9#oHlsGtFm2p!%MDyZoedXfjOHSsE%OFxk%%qsQJ%}#;(S;(hGW0b*1i(0EGQZRGt zfwpJ! 
z92aHiC~hqU#HL-gcTXmN@BnQAWl@1d&}-Fp@{qC}mTau52A7>ez*Y4(zYBY^7&myM zj1LsBngYeS-q=*C#%TEo>5ynlGWDu?^dEPT_zw<<@8im35ofjApaUV>SGvWuU*4_8yHx7K5Aei|jh) zbRd?qvY(fEgXWyb*U2@ek`FDwDRNPi$0%0Fh>hECNMx&(hem~aLkjusF0dc>O6V>j7Wx+Ds=$zg)7*mo5D}zxuZ=C zdwno9GgIdra3R)p8DdLW3$L{JNbB$1dB zc9E{ICd>+bTLsaF*r(_C*C0;6A>yt8POQ^V9o+lcRb!UC}jtEp( zZd!$MM+4OCB@5L-ts_SQL=IAV(9!cjT>-MDYV*6!EZGEYxP8#C1~TEhX8a zCI=PrAw{Hrg~1At`_cN8K%Ko-_l8cT#Do@i~3Cv3MB&5_Mhf{^m>Ri2oFo)EnVgm3e{whB}{EzGUw})Vic4Om! z)HQ&CPJB!;n&J44GIn2cmhZH)s;AJze$bS7wzJz+)UJ*--(IJ<%T{19)n!*iG^7u? zeaW!~k$S-VQ}|f?N8zIZd(Ek_rJc32z9Z#RKV7xgbKa5A!co;~y~nh{a0>A+k)h91 zSWLnpMgpgdCfceMxLt&upFXuxIuj}3PCJQanvO(S)m!g*B(b<-q&7N*zNSbY8?@b{ zvG@VGx9T6+IXf~rY`1Tp#_7Y&c`E;K1r1`YyIdZ160$6I~}Wp6q=Nr$EciC`YOHN<^*Zxr>(UHdo3 zXuIDBlgFDWC#C)|}5pWWz(_){a-z3fdd`r#+#*h8I~Q_Xpc5 z{|B};!!()6WVV>8lK zy!_d5XTE2+7s0VvZqfk#hL2qDZT3O2MN>k>I-!MzrYHN>F-#>lN918W^r-3gPgn9+ z!%3q@(FaGHKa5(*nbF@qC#0_oCc2qZ6U&=Z<>Bx5?=sq@3cK5S0>6XUB)M#t@c+Ux zU9QV^yCPO8()pN(0-XWiMkoGesdH|Ch)w}xKl8nP#>l8Ud>`Hq`RTxH;^5{oSJPpj z=B$^&F;l3~+M;yVOHu=zoWo54pv^EgN`m=sJg@OtNJ@N^aT?)Sv}E7bZKHn6RGd{W zGUsk=8THT=;ntFjzX;=T1Z&!4SkWJda-iUKZ-B94A?eg8p}!)cT(vP^#2WsuhCYN4 zS&3K9Pa&z=)UP!a5zL&ajeu7E^qMqzgHfDo-EVQuF3(*zAc$7K+|KM1+1_Hta|PR? 
zWpep>mKH!U?Q~rHMPAEJYym7dedv}|dV3|(+ya~P>}GP*VJYZQY-SjeYn}v_bplQ= zDM@4cGu-1H*bKhw(yNY>sc@~cHThI%79DmP3lqMur0#FmUU0JB>5e55o)7i*Zz-tC zQv|B{RY%2Fyy8nO*1YZ~JMfV;S8du8S)OR`(t)07HcAnHapw9Nqf`u|5CMzfkA!Xf zpSEWlseJab--oB;Ep@~i&NgcMk@3zy{&A)3|K>_u*1MQ0#gjn*#Ko-|1bUq~^igUv z8s`e@SUgtOYqXAV`1-S`ZH4cc$MgXY{WQ5dyrMWvJ9>cSK% zx8NK8j!DbZ*V~@EKtK}Ey=Lt7E+buTeD2EU7+sMuqgzP=2~p_-Dh1$xagI$p_twUG zzxvtM2co0LU~@CTXe48)$&LeVD{?<;^L}NaXr_n(FV)u4vfoB!d60+>&T~}Fe!HZ- z*?+Uf#H;j}^m?DJ^H-u(uXW1JdDu{wp3dp!MRL0=cdoY%)Fz&{jIT0eZD^&;A-+57o{6xa4OfFv&vKkbtKic?r08I&{v!uZ;oRE~nk~(sIeLd6m zz@TcUA-$^lTEZR+d^Q`h3a1SkuDf@hTdJ^=jB08FtMLuv`7VpQ4pk}K_Ea(?oTBw4 z2}}{~vVIt?w5EjX04shtQK8f+=C!v1QNmf3q)5#&-z_xE?sgw1 zC5T6gD1pk0WO0V+!&-?7uN{$WzuJ*i+;_#7+32eB0aL8S_d@>#nN_LG)X$07P1s7b zE9a!is=pGqGR!yYIPT82g6mn_CIyPN=UYVl^W9R_FitzOn@4KwIp@o873&UTLNSntK41IxAVc{0D z?x!;VXs27UO7I^o04JtE;atME&&}jU^lCuN61hXkiVaZoA$_T=1`NO(VzF3jakLc}^{ z;|wG8XKq3#!-WY$EOX)*)S@;^9_PPmRvwE$s=wX1@BqZq;Zhj-PHUUwZWJRG~zA?Ni`bw(L&WH=#7Ll^8&KmKO*T z8yWjha-8zsF-Y~XDmZLkK=|r#AqzNLt^}O9@=x`rQ!O9^Vd(z!r_F;<2ssfcpvD%e zW+y%k*v)KiAghfWm%Zwdej^zZ=#V~337RAaOvx7J<|5v%3&;rr-7T~~{yu|Z6fpfr zz>-s^uBPTE@;3%q8{4AfqA2qOmHvOh)W7S16d55AcM2sp8m{f+@f9w=32!$;(0WKd zA%VM3vf%({!)nh^R-8I{CdyH1|7!o(WTC<`D^Y#^EHr}8AKLY=MiU=s`6_eZRiYb>ws1fl_}09@b>-jrxM$fb6u@~* z61)JUfM5!n6B!sl+H&zPa_j$o>;JCT{}!Jpn!5kh%v-0pVOA5%-Y2wMfP9spm3m1D zs8z~!!`A^pq9?_VXP7v$EJizr{OqJ2AN~NP#VmI5(Z#PTG;|cNAo2tP*>KMaU{6Xb zkE0J&xp>7-;jNig1&ay!_J~B7H;BN|gg%}*t;dyA&1uu$KCRpLiCE7nVn8ZK%5J$B zgoMpxRMjp=B0g3Rc`%(EyRqBxD6SL};<6(Wn4zf45bfp{6#T=~qrH01_E#L-#_^{i za*@*r5@oCzt!4ta=L_IY9K-vFKL;A_V}YP#$ai!YH204_Rr0^|sjL4#>Qjwo820+Q z=&Nl`=y)oR-Yz!VhNlU5%0Ay^@+xsmY3DSDwpx^eP9e8mW$wT9srmnqq~U>;_(11;*h>%7{^3K6_7AoYSFm8@Z0wMy0z#0 zagi5EA~pCFuaEiIY$}(QQYLPqut{^;5|BrufTk#fd|+dE-m>EW_zCFFC_Z#w49w{z zUC(Wp3%3T4zM>ww0|d9&0@`j!62yW463&g#AfL$8vQ4&+L|J=61di!Q4n+(RKdx-r z)8~5*%hB-MhN^E|lWb49qF+2Y6wh9nt1)C&bhKnXy?(-J0bAu4%Pl4nUIzMYG45jk z3jK@J+&}Ji_Q(QZH|as*%|2k36PR+#f$ 
zTfaHbf1?F;hmeM=9%;yM+isOhQUG;=-#;j}uVeOu1#s(B=z|f6QK$U#r^R)eiF`HR zZW#h8S{TK^(Fsh{oY?3K&DrR$L$G1kZD+I>p>yAKE{x5gbrHmD(_SIt-@t>Zn zc>=MThhOuYWeOdIT{{~Ae_!A)xM{hkJfdn#@cPNeGy8fBV zkI#pXWy!Zg*rIiPeaKb$+Y+D&#uRaTR((T3FQR*8X9jRV1tzi%$jN>tX|d=)bwQD9J@9`;wMe|=IKvTwJ9Nw8YJ`?@?_J=GR|Q31JYndAl*2 zt7J~d%#`?Jbpb&8kkZi#10Ys^sew3F`x-o8T&2a%sW6O%p| zf)AHe`M~ivuM@t35?4^q%Riu@D+;|+>QGSb#s>&Y4*)U%zwr26yj7+-Zn^#5r8M78 zW^Tct`;;Swvs%LcpIW{Rps@n7J}l%`&J`x8VzcoRk)3h!#STo%Vc01c#q$%a_w+XNbg z*M#qH&4Iv^Nb~KquN7dv$Mc0Kc5YaGxuO9DwMMN)0og0ta%|MRp)_IkJ%HE^-1;*=T=$EL=8Q-2*?**(yWMkIG;G=z7iU1^Aj!=&0v zjz)_IL6gid5D1%l{E|gT`7$!1qr4(|Cwi#ae^_BSUEyembo_CV^iXL0{~_%yM2#hjfFabV`?WcXzjRN-MGGlJ4$OQc`K9yFnJ+pmh08aPNKgKKnfHKF{y*!}@?M z?s?C+Vq9ZfQyYs>JFYG5hz^TMOh9`}fuTl|nhL*b5o(u|5w_V|)sK;Jc)!$u_(R$m>HPJ+-@XeK` z(CZH+-|(7!OHBA=hMJpO5Qbirt%?tlIP&6pK}|AuS|cnYpzfpI+fu8s`1*PLWwDg) zFC0W=nEYHO*yvmiB0-!3DR)@l_u)E;XK_%BqLEAx{~srR+FNR^Jyx|<)OAIcXKBu9 zvqaD3bWBa?DKv(~S$qz=ZSoCtc?Ml&#o;7>^kiP49nF6K+dDBD?4k&vFo&63A4Jb% zK50Ii<5dNL!b4wN(hVa8u2*_2-(ok*ljY3l66z3dKt_m(Pe`kO$e#W3lyC$T%szLl zB)aN66kFb|)5S-tA7~1V`=!gb-*M?(FxI}Mmb;j|Ly(99o~T@>l5z1f7U`N!y)j;D zjZ;*o5Jz?Ii@9p2faeg60KyjU*P1@m5?p+re#5z1pOt8$Aa;q#U}N1K{DU(V&OuMH zR5^k12uK;eTVDrUvf|hOs?aQe8Psd)E_WIowJ(H2+jdFeAm1-Y+3z)b7j%Fx z{oQj(52NmHysIVVZzRgGce<=9nhEMPstleU%hh^Rf7n+5BOY#&iF-wZd|*4H&h~#` zU*Tb3ki(_Wb*hKySOY3VI{{Z*;*bt)d}rD8nZb;O%GRq2ztvK?Kn5+jdD}9`8Un;e za93uY1GnDj#`p+lqSb(t6^r(Ap{Ld2ToFnUx{m@!)3MTydRJiGT^9cK$gQxYq9o6d z*4nA%(q=>Hn-irOJfvMARcM?Fcc^;p7kWyaWA_XD;qUXc_)oHkv<~x{R7m_dne_LrEg-v~9KdfBA^U2=O*sZ-DHqzz|2-mkr^ywY;DvwPyBIQCgzn**Ge zXvf)HwFJolOF5X>7BrRS-LHAdr!mNXG>_u&UzAj*Sgd)&zFV)}O*mjCumpn|C5bX< zIppu(evc2AZ-O%6Sd%XSkI#Yw_H194xN^ALV9;=(#i&;04O@MeD%XwM0_ViWILE{% zc{(2VtMun)UydG~SSfUV7Xwts{gJs>m%~!8j4qyDT}QtaZYr%4w!HH0wC4BfFdA0y z9IdQq4&?ebE|V=*?va*Xi(W~^3kfc7Q0552tdJfZ9Tiq@k)z2v?ak2|+;BPG6I-cS zx0da9Du9DnC}iX%x@w0>SuoxO(5jWKRS$Y|7=5so>*o5q20@BzEuqAlV=WL3WBash zT9~*Fe=3)zhARB>xO7s0$ZH&gskT$(Kvtv5*hsHdrIJxPUBQ^yxm8}3uL}{R4vM?w 
z0=ru|s8jn?Aa~nYow}?A5n$R)e+^Q=KloBO5Dm3wf$X^&syNGrR`R!OT=pUT;F@&9 ztefD_aeW7ZA_&mXtwJUqqRXm#LEY|KciD+lJ0(0OUp}Dj!YPo>kS^Z(cJyJQ-iUX_ z`E}`EE7~_tgZE)0zy33&C%5J~rPynPId97kG5S#seHNmV3~-v<73v2w{3=qX*pgJc$u%^2F>CxTU&;iT zOlNXV&FZSfU|(G+4E$Yq{Du$f5fNpuM0LDSx2KadIvNqIJ3yw;V#`LiGJ?nZq~_$6 z`?@Z|cQjM%jP6d#Ux_0vf*Yg2!S0(Uu;AY97t zA0!v1jAl_en^zdCH%Hf~UGtsVnT}i3AMVw88RD+RhcQfn`<6sOHc3^ZIX;l>iwj43MkutY9P)TYUFi_{AOa<`VBB|rPaCd3T_o6zI|j<&un3BP*x zAkJ;P--NU4Jx!!Jb5XejrN!2XAopAW6Zzi{Jd4siGgtgQ_*i0muMi}$xUs~Z20*8d zF-O^`E*D*FP+^_d)X3hl`M97nGTl+zO+AV0_PE%^8`=T92HN$*<(9W1$aFztts!2* zEr&%u3h?Mu*&r{1yD$9zp@C>qduIW0B9F&*wu3-!-j_{kAotM=K?h1AUme34lu1?) z9*`_}y{?m<#-5|6M9N}L*LJmq&I4YMYey;BRexofb zCXM+K9YxDqE{QZFxf>9&yC(zi*eo!wK5+LV(^Cwk^L&`1*Q2AdT`a*uCiuOgEBg5@ z{%R+QOF}wn#@t=D5|?Zy8Kv2vIRlabmzOm4^pa~^z4idHkSh+FnY?!!iErF+SCUIx-y?RBh|Dil%$(Dgq`MbSu60Xh$0N458DOIgC0B z`Scw1Q*D__%qo<5XY+;>LsJE=HCc!ccZ520af@+aPygzDqU?v`yJr)FOkD}HhBoGjz+agfDa zKoJjI!Pq(!+<;H`4k)H-rP`zB&$QybE};buks7x?CywMSnLxX?atl#yH-6%IyHoeP zT|FSR);3^;PzX!~pz{Uob1|iVHtfqYVwkJkL?_w~ zZAJ&+(kmc^83DSa?llvt=)%H^w7EC1_E>W{QWB10YrR?Cvdg#3!Xdk|Tx(q30_0ia zeWain*ekz%K3e^zMonycIbQn>?1?2fb!%dVmP!7RicFpDlTwOVlA6X(p!7(F^w zwNQn)RDo>!0oFCR$AYJRS<6p=p8Qdo>7FM?v5GV*q{E*Q3!&NT+#ykcB0~@{hD_PD zFUB7n2^9YS%I?^Q;&-p&fTWMkw|71?>e?-x+VE!(ptUM5lltttMPUN|(a#&|{Ex}7 zh(?9EsmG5Ib6PUCZ=279s`e^e;%Ny5{I1f4?+foNfX_PNx1}+$QNsZZ6Eq(xLy|zo zmm1CzgJ1uY-2j-7jYK8j?>#frFR$nVTF^>HY<##hqShTNTPb%8_IlQhy z`A#b;DtLF*0Oi{=1Dp3gCw5MQ0WlLB(OI8^!S=l%{{K20)gKJiUZRR?2!9nE3Cp2a zpFaCWo%xM3?&-mdL}O3K#1Mv2Je4i}ioHe0N_a7jYLz$t71Pg;T1yY9@gH98FgP;Gn-AXlP|`f3UkMKAUqorfOMD}=_nO=S9RIJxv-qI@4-D-? zQEZ*>5!#=|4zD%-h~aSmxybkc+Jf$603VvwRCET^|FXXKVu)j(DrEN6qlU}Ml(_

w#fx3qaq(+aJzA>S{K z`_2rF@#w)2g9y=H(0~@(xPspebvm6T{{74rN;GHIdNm8BuR9}YU946ZaUYd-ef0?* znR3-3nhIblgp(A)@WnekqL(g8Nfz}Nq=qEA>i(kz!wk6>=+JY|>9zu;X)PyMfeyi>och zKrX-B(^e2&%27z5VILv}GIQ>g5#LVBb!dVyF{o*;cJFG~kAywz-yT^K!F0T4_KQTIyd$u0{@ZhSqb1G|%!8VkTgf$-a`r z3xD$^biWSx$im{XxAh1?ur`T!sa--zX_#7S%ru@sf{|9^BP|w#QqKXco*6?;20rfy zbsmD(Q?E74dl$v!mCTKdn31y={Oh%KIoHWwrkIcZ7*nFhAUkq&Zvc_I+fGq%9w?dp zTl85qC0g-G`vp6k<)1zlP8zfS#Sk4gx)fe}A$$%uRyBl@l5$dXkf_DO zeV2bVLk`gYxd!+@9XR$`LZ|K{Hh+VM^j8`3d0c5inK$wwYtZ=jtU-3-K96XV`|B?I z)150sux~R8<=`5xLYv1xdHq6nWcV5-2Y?%Kmn8Pq49hb>bnh)Jv~ak>zI(#{8|2q3S^+@7ebAQ~^bj#{zX+SV7KjmLj8S(6WQq}-dIqLY|JQ;RHad8P`o{S1>Fb`0p@B`G? z^DoZ@ICTgwXNg<)<1PyVr@mwwy3+%}cxz5e;ga!uEV}EhW!PG?Tj?v;G0m&_+zmFiF|J&A!4%Pm%hk$Z> zsvYF06$ENICK4!c3EFMjFYY(I)iZ@alH5e(fFO^8e%SHc$>#JzNA|}zLK@%j#{R>W zkhPUilCiB(d)N}dNc}ey_+PgKSeh^~R+Y6pNIa#D8v?JVn$vn|!jPCdxgGp`3Neqn zffLVXG(ZHSG$$rGll(J>4^F|yOE%Z_At9cA@Cg$}4}mm6pvbA$la8D#*;sP8o9Nau ze>-cdj%BaL=H_M_M}~N8o1E9-nC=h%7`ll-tNjCcM~}w#^Q_q$c&QSK*@d$umkmM` zt4}rV=V@|hL%#0jxR1nNkiGcQm6M)unXQf@I8|@|bcQ6prmEuhYr2(j$W?GqctQtl zmWVlOrO@Cfu3dMeSTc#8x?Y39shV0+3`inbPgf41aXC@!^AV_TSC^5fb`GD0xIdcV zQ!?Dg+6&2(rer$2Vb-Yx7?An~81T^>?`8V*pPrk-72g?Mw@>is40e6b#>TJOFfs7r zVujyn<6HEg*Kjg_BR&=)^eQ1dgn&t4Xu-p*KC=Kh1w{NTR4D7uP4+_q{jtxd0mv(EgtZ~L@x?B7!sI}eQ&RVWVM_OH)CaTb~NYx zLq%+2$G4AtDIg>^^PJ#4qrn3X*}v78JiMAPd3JZCmK;n?U0!kFXBJ0GFB@j?A^Tb4 z&&)i2EKB%ZD+Seh{m8`jTHOCui2HhOj%08AGdD+t9xcpd@mp9VLT<{5M`wE@PX!3n z!`u##bqu=#x&QB<{)`>l4Rf8o>)~w&!Rtr778L_8z}^2#sN!EI0D`(FjEIWUnVnj@ z7s%KOAyePu@B7***C*_H_XXboKz^b_Q>EPn&!Ok&lq0ckAd!A#{hzR_szIQ zhpA!a>g12cd7>#{WXM+FzNF*hPS}M;H?)`0#2w%c2c-St!8q6Ebt=p-4h)FX(a-<1 z5Gp`Fm9ji4;-yroYnH+&JYgoGgrW_00T}zAC82#-lFWZu651717{0e5JZeNoJL}Fh zut_|&FmGnw62fCpvgq`t2JfM22h7w(`0gk8>e&*zh)J>Ul{+dMz@YCfb0=iS;6mca1yErt686m(IaON z@f02dGN)eu=sLg&?te2e3$3;oO$|1tac;pIH`4}wIcgC7ttG#F_@_FY6+M0L7GJY7 zN+xrujw7%TA!0yw>X4%&!12a?#7=&1I9xCO#c-hCWgmw`5iSG}3G6zYojA65VEZ#= z36i>{H%i8`vLoOyMdI6KQfqbz*`Z@wvN>8l(iD2Nek=LM_S80#SjB;pWY+cPB>AH$ 
zBD{OY=$x=K$9P>=mv^!CRgWua5CRIMPs6ACLYe}odh%X%mQ zzU$T{{2K8(mMT=Ffh>b;N*W4Am-8YU)qUd-h6zjj3+5w+!pDjqkZt#=)vFX=#{^P5 zk&qY4+aT)NjV4*j6ZzG|l{ z0j1KUTZKf6J~&PPzlvz<{wjZw>SWDz|Nm`hhJK1eZS`lcz<}34=IY)GDmMptc&~*1 zF~^nWJB+-{h~I zfS4vLXxDPFHD`f~|H>)=be7g}v`prT4@d@<;$cXJ(2+aa0k+!v8LQ}&tLlcW19MUD zb3bM2KDPU)td7(csC72KeDmE9k1skf9x)t9$Pxx`hOOAJJ?0PwCWqkBx5XKvKc4An z0ihUV46EAZmiR-7@9#V0&*MF;EgEHZA}7RrD52;tk=ek;j}J)wpPwHIe!QT`Yy9H= z=@|(>VY$^x()A`v2fK2#89G;$pZxI)s-53gSePHj!7)2h!@nq-XZZ7QqyoZjr-V|Y zr7#9RS+(C!tuMb1B{wg;9LO;Q`_Ji*ZMv2KL%wcc86M$z3Oo!7u%qh^z~{aD3!H8@ z_3tF~I-~Fy77Pv64)F=%G)BN8YyIt;;{Rlk(Vl_{BM-@Wn~Ht!nP#CdR@xE%*qr9!FH5rDD@T!GKoIa52%F8J%!FSz zWo#aU2eBJb5u0+|haZGn-ima8eChpzG#(xVd}MY}Ipd+kcW8gB4Z)wawDuZ2-oktI z?k@~T%X8d!*=P^0UWay_n$@wGif-PHUY2kUMdsj}02+7!f&K3_arbk`jr6bajq+Xg zKrwhZAq6eHrRezXB(G_XKg%_7mbyC12E4&fe8Z>0uV04~)TAPng302Q*UgttyNfOX z0IgCB&cf!&Y~t-EWt!RO?Ee(@oW%JHd^%G7%*5yU>)8Lz2>^j1{~Tvx=2Bov zfBrRGQfk3g22c!j!-vVg}v8vNLC_~p_0g8drD4ieAa${zddC(id=NSK%}EJ>ixjq zdTt8d3&g>G_vg)z;1^Pb2$<4chjE+Z`*7*3P_M2LgLu&|(3B<)`sMP;83%v(X6IdY zcT5U3@Fn2eL&Vn88@H1C*vh`%-o7M!JrB{L>JMJ$!!!QI_|iLxVHT5aQD}Z>Y=fbG zVZS4pxOfy3_nmjr*THw5QP>RlaYfQm=w>q6V!a!ZnXI30o{4O~^Yajfk)-Qd zNg6h&_U(otT=03`P_^7~eNOURP{{jZFfEUoD3Jmug>1Uw@u{izqeJ#3IFB353Z4eh zcxt7NlR0me^@sDqN4sJ0bt)?NQUfI`5gfCXp}%k68D7U^z8dn(b`KNdzOB(bGNq0S z>KIM}YgM`c@zV^UL%jamGO0ayfQKCiQqaRBc&Nq`$!rGa8Cs1Z}neDbe=88Mzl??xvi#R&3 z-J5*b4D_HdEa63Q*I0H5Z0;D$2^4FTtNu|u#ioeuA6w9no(ckT&l5|8DA8$l3`pT~ z;6)qO-t41tl%EBq7M-?~2~jkW7Z zqEniBv}ztL0>r$SxIwx}8$+b>EnJ!#t(aQQ<_XP))N?nT7_6_gU!d6Qw}zfY^1k@t z_`3Q-D1aButI{I@Bb}xx^CZLX>H@yNIpHg~eZt{th$hCQ*Vq8g($Pfy!?QR0)G(Rz z!z6+Z2Ltfwr@Ub}!Sk& z_t*CJK_C`laQ{1N6WhAz;neLZs|iE`9~U$%ro(*^sOWLUHkGUk8JQz4?o7nCic#lJ zN9q`+v=wr8GzcXvduON$4zh`Jr82_&lmfhyPqGLhm0a9&1^fMuWxwCHV-SZ{?OUj^7ij17 z(>uEKInuOp6f}m8+|=2-$*huO>b`Z(RmqYUKJSD?H@e)Rb5tw^q8$HK8n7cFMz$3jSoC`CUr<(`Bl@w$p$E?~zR_)!7`tjFY~GzD^gLvNf4>=Nd%O_S59HQz1XKRaHU%#vL_LEL z*ItEk(Z>tIiX&MQn0k@z(qG;!wEljY@^cE)XR<;Mez*0U)M-r;uJJRX6#DPWYzxaS 
zQZe4kU?rqe(C~gI6ws`r`d*)5AzH|FEzWa3*G0Yt2}6$XD7{13VNeu5EU9+w zmeU6ok+fSOw$}(bq5p>y+C>4aHoc5XZt$%9rk44PPvpa#zQ@gvj~;$UjY1Cgdv=y0 z6#XqFzZ?yxAMttplzmPU6ow<*t50+nvd7l25BPh`xx;Xdk+XCbE8XKbnx7pgi5GmR z$NW{RRERRJTrjO9r6Tad)eN7qGY;%5@y1g0}J??;36e zlEQ66AY`-H1DYXhFXdNH%Q!>u*a|&TtL!>?iq&<_C)(MkRSm7ErwZmmJ8AL?jqhqN zjqkdpu`p~D+ow*lM>Z#-0;fFF>}(yNNAR@{_9b3b?h>`zb94SHrDA2wpO&eQ&)1%P z5WT{Q^ZR{^GtXhy8;nJ81PHS&zc=~gC}iGupahM!h=4{psZzE534HD`XUtU32eovr zL3pK)?6_u1!5brT zH%x}7^oq^^C%|!>@tdKKNUerjIl z;qFi;wBpK@@S|t!fxu8Q7a@QClPi~B2xtbYss(DaJcdc>(IjMZBfaiytmfY=EuBG&%7GL< z4Zc8W3v!T22w&2!kK|}T1X+1gyxhhr%-r{unsjtmXX^4aUzwn}UF~)BZ!*Qw+R}Q| zS&@wu?NHWo>vH)#%A6z!+i;Qp=(r4XoC>EzyDC(`R!l{T7HKtJBnDjGj}0x@c)QtD z=ZF|Gl)K%^rAD3P3T&E(5`@@Fe zRU;mB*s-RkV|G%@{F1h^|1O)bOZxt{nby}eSMrT+vm{D+;+-jJRqrEcPxKp*T6wpp zjPHcIs~t{JxDJ{+vQE>@ZowJTH)Lur@4-B;b)L@dlW)263{zR z`Y!5ePH|_#yn5KQDBr8%4N^P8ZE+TpL|G3&Yg%g0JF#UEwGP+4bkbUwvdBQaa@}D% z3FE>H{yiHGElkMt&ufz^op$`1jAzni^u`tWVZMK;KlgjcNsZ;r;STXzYQmfBBVi8&|_2K2S1j0o} zZ`j=45EPjy>}?IkJ?gRQjiD@^m|Tq;rtp`2zX3QjD##7wt?lm)mpY*|+P5ES64L36 zb~>_E3~A~O$rO7Za8+VT!tQS-Fh*yX<4@pHoo}9oKBrTE<9%~P3?PWxC&8mC*dax7 z377!=8>x%u%3r?ahm}v&b(y+BhA}p>Y@edm)^r~hiDMSx!(=$Qw*Ndzn`cXB%;?;v z-ROMCf&np44kbq!{EH?c^{H4so+Mc7bgFH+8h#?#lL z1eH2GFjp?QswOek$S{c`A)Sv}Q%$-WgZko0+3l|FIf^l?B-f03r%svuN*Y3>*Fr?7 zP|Vs2wRwx#Gi@yM3FQKtpP7Ufl& zFdjm9y=sHwA)X%cxNpYQSBpjLV;-%rd@_U-G6tzWcS2Tujo*V~&JjOCI;A>a8m$b7 z!CXWR%Wv(4!U^MW;vB!Q3wn(8Oi8*6y3*f9MPt0HS{ZJWR2llKgggl~{P!e1h+VWK z;Qgv&ZNx5!&e@S7;J%KnyQ01)kU>PXAQZ^v<)md^Y zC(75ZGcDGf_r#vqDn)_vxk_?81}mHNxL5?Is+p$8+&|ZzE3Bg_T^?mJsx3^B>pZo> zm?IQu`5p2-i0&P@Ua<_b0Nl)O8sD!WzpytJc?f<5jn0ief`FA+W|JQ4Rw12JDW7Dv z!9=Lpg(l{q*Iy|+0&m#q5 z0rBa?MTgs;uA}SaST&0E%6Q>bC11uA%Vj|L9iQ0X!2!#1UW+$o3@#ZoUp1t0l#H!* zVIW)I@tPF*t#sRcz5}`{AGuh;d1{Ab{vdp%EMLc!rSscrLNA{GXBm<{v<5u`@nr+dFF zio9h&x~D)a_}UDzmpvCu@l7RP9CGkP@v}Aqg=F#!bB+d6mdUW2{XjY&ItWCt4tqn; zFs^Ux$fU;B!sOD4#JDXPC{Y5Mv?0W+%+5oOA6As>PDlpp4+X-$v1zRAUD{my zJS8YL*Tz7_5&ER0EffcuvXi;9=mn1S3GFM^Il;Reus^b3(#5bspXaCwXj@E!ZtO$0 
zMttby)0>`3(@EY&5KT$(wNM_AZd6HJADDFS?h_P3rM@7_W;J6s9Po*@G=3bd)@P5v zosr*)3lt?BOYV`^f1me~@yGTr4ar0bG{Ug;uSbt;W4TT6a|BIK_>hQhEqK&d`#PuX zmqIN?v-aCOPS&`Md&Az1Q6d;>{^saQy+l-2Z$2KK=`Ys?{}NsXXoP$(64N)2w!3({`IM zdqWD1hI?%Sd^lgVEedU8EgCJ3Lb*_;Zpt&3*H+RrdNj9%6UEbN3olt60Youyu3!UO z*W-*F>4M>P)m`S`KC!bsZF(Z`2>--Pd1;buuixHvvEuX5aS*-JqRbm5$Kj*0 z(+3105LI(q*42{)xt+;b9$oThE%*D-`P$e#lf#9MTXCyQmw3?`{AIm%EskOAEmwJOGs@r9I&d=>TJRe0-D07h0q}Vn^j+u z-!;9hJsd^Z_pD_b*zUmd^Jwx2Yo}{g?TLT(fuXH1hYh(!kqM_deMY83|cD#y%T#Z5tMop65?9_1}}D0LwFxwIq}`-7J3~ zntIYC@1A?sPh5Xu1jCQ~_du_0^!ro(xjChDRySsI zIbs@^cy&{LE1@(5RuU zY0gU3ncgbyX`|92eD`& zm0Y@FL<*{u`R~7T0gehpMuOO2-&@bbYO>;l_0cMR7n!Ih&k4{b3(XUYe80fW7XCJd z*%7yVyL0V{b4X_te~(emZxWQ5YHz{?Yjx)49#S-SE!G0|<{qQ>7miBh4kIPoPo4;) zV#}u0z_6Hz_V!<2cl|zDluRu7bdib9B3RvEh0qPc8XcULm7Q6Pq{;=3twa8;mt=U` z*V4P}^54f~2_KIX9Z$|b03%v8XHIDM+43qe{6i3{{kr|K% zp-Z>s4juk|d&+dM)CLJ+!w!QLmx77F^M}zt?GC7jH%|!6CExaj1SO#9|n zf3inG4XsugA0o>`2^A>&;H%t2DRLFP0~>8l%q*Z_wUoo%guF$Yijg9jPX?brLZk9A zVy9gkL^UaWQ{cY6v&fHTc?X1Don^?u5u*jI$WmAC_K~M_IAK(-_^-N)zcqn&c5ncf zEw;85g#PjZ2%64uqSYTG!b*@fIk!NPxl>^CMN-+Bt;ccm*v+Mn z9IsVQ#g)bp$&C(E`{wki%;*8#dj!zEsVaT^P5G>m+nz(7P`*o7t2E&_M2p)W2fE6y zXjIE7)UR*kZ7@MK#(b^ls(`4-up_07y7HFW;IZRkQ`QyX0MG0t4SN=A2&F@_U&+L)hqN zRQ@zoJT2EUT`xv7kUAmJ(&l}N@)L6!E$N*NA&0#nQh4I-SnB=~-;*e;u+`4qVd~8Z zgw2~O^2YcuPZ_*ug$Y#KgJ2dBUI1neXR`n!B=stT`y;qyqMYs^PuaWz@Ku&ysD00G zbTbcl!P9)6MlUHJS0^Lhxu}NY^F#4t$yUIIyXo*_p*Gy<`u+^I4=*UU=Mo2kLa4~n zCHv0O)T1OKwnj%Rj*yI0T0s@YpG;RDK$!Kr956LFTRUF1I)cxJ)TC3MLr9U+$#iKt zrW<>VB==>ohaq`h*wBH?GAOSd{by%W@q(XX-8!{HDrRN0a9%#C%04SC^UIW z+Y3DJ!+-~E@M4%^(1GN%x}er6sv4%cw3ELF#C~e&Pcb7;=IobmOh#uzk7oli2`8dc zh#L!GtE9G8tUN>JJWxpQs1le$XX(rdZj?MNH~Ze@HG2JG1Wkyal{z?XFHi!Un64?M z*&0P!Z3L|_G2P^;7@=}X)TV+#2&_FLUy~segJAkq{z+MUa6;fGUjDZyqNanhPBRiM z60sjJg`{X?X>6j7MLv%qYUQ8+{?C)Csk>~GP{KyPbL?EvP=saY^B69vPl~XT>}dU% zlF=lBZR8?_Z)SYFi`I6xs?d(FjDdp`Q&0Rm8OG5Z??FwkRO$FT{4jOkzw>dLMy z7qwi!w9mvWVOl8LLJ1Q_m##_X^<%7=pR$442@@(~s_$Pl)#6fE$Y%l=KWhlUFo7Nc?9Fm@RK4NzO}n_iP5}>D 
zck%HcjSn%V<4mQJ*Tumk(F@qh50ZyUM32Ij^3&$>G+b$7=|7uWx+#utT2rCW$RSm@ zh6ylNZX2TbPCbXHVz?MK2<1ju8j=<{0c9t2TP5yCh=|RUxhq(qXZZcLGykrG;eH7E$Smt} z*eAoklNj|muX=-JiWRHrsFHxT{doJ(Kdm{2FNIO|DF;hv^LuYLrbgDg)~Rxsby^!3fWwh?6>zvizPBj_=Jv< z`>O@Cbh+n0HwVLhN%&!ygFeF&AEuhLCEXdXLQX1#LRlP>sg3yfK7UjZeH};0g1<45!qI!J|AShKzz;8ctg_vM6_`^i_)-a}xUyp2O8s`a z+?VHeDQ1#yrDHSwexEjwhK-JW-ugz+1T&dR{ysj?8Xn`LRICw&EFp~58MD_oA+%aK z(rw0XD;^i4!brQnr$kx=zr3z%C0R;l#684HR0OnJ?-qGNA&NAHk0>X9&BzwI#SY=R zxi}6FNuq(X5mcJ4B=EMr<5t^>sjJ_6?uk?%{wTDp?d6`LoN z#?d6aT&9THQ<|wZGldO0cRi>U(7$7wEf0TC((OzXPcC66)gxz_) zg)&h(=_LGn^cD>W@?ypFjUvUNJ|@tZ+{gVJ3@;H9R8&o?NcojMS69Ap`q2adVYm;^ z3ED3*ctS@@UH^21ltZq{aF5!xRC7EX%E$>>LUd8p_IEp6F}MExLlM!{g26h)>JI5|ZG!F7F*xcT9C_E?@!z^WMB3iX<} zeD<^UE95AyT60?#$Kp69k?aRIixCGnXU>J8q&o!4dI+ToMwPFMwXiL2qezw4`WnwJ z+@+K3kNxf>sOF_-fV|MK^lTKyIk&PN}~~=Ke*KyFd1=j_PS<=F{=$ z)DWuDq$IMp#*Q&nN!E55uj_6k1q67>l;QC)Af8T&%bC^;O%s!o^0`}lVPl9TJ*2VE zyihMzR#q+E$B^e(a`X7-xrDt$-6F#Ud4&Cfr~h7%%l23tl~f4XGruAVN~mmBrApQv z3pE{HbXBT}l#+=Y6Xs`J&H~2VfZUffVaiNJL;88Y%zug<0VH54Lgge!yQ09qs9OEbVF|I-nA=E)M3r(}tG}3MuZ;Bv3)#di*Dejna zu?N%9rMO`=JwGptfR|>o4YCt~*?ZQ?+oE%!!4#1nEUxyGM9uYk90T84mGqQM91h~& zCUcmldMtzT`x+tnR!C>qi{D(uAtI|ts;>A`Uq|4ZF!!e!^+YnhyMlbnNIJ!JD@PN3 zwF%->rt$V3SYxgS)BtiI@U$dXi63fK+f8FIZp^O8701>!jIr?CAE>|8O)A z&80#B35?t#^Jb>yJYrU*Lh0iRaD_ui<1ut~AX3A8S8FbYNK6Ke&9>^5qa+1UtShP1 zr+oP1y|c{c@GU2wOIN?<75U{8mrrK%de`r=;(2>`YR2gHbgkrS1t6VX77wnNY${Gk zB`wv4LJ$)N@X+^MB_eb}j7&BWJ1D4GAYAz_;bPx+$QXs=W7jY zL`^%C$~U=Wk6HCnG0CKV0d|9f{aCNRjG_-XYZIAc$wN9&2A9a7oxut{9{hxU9E=ij zh7_4kvQcDuv;_jUCkrmfBNxT>WPk^mQ-NY8h#Jb3zcsRRpeMwBCrK@w)aQY|8qzr` z!jrAQsYez6ql-(8rhuvtvwI3b?1*YmQENhwGS2HK?qYI8ff!_^(P^J?>wNuNq+W0c zsdK1^kM(k+qj4xtQI82=3PN51o=>G=M&=ZXb)*FW{`9#R@{c0l#0yS+^_)dfa3oAc8~ zpE0`QOr0N2?UtoBDm(S=*pu-`(NCUpOPj8w?G#v~WzKxmI)4!WHi2K7R}@;j!Q5vp zi;*X*6d)d#$7j)+Frt9h9gmgXUTAhUNY_xpi{YO?d!d%9I^$!Vv{Ei@cS>6|2r523 zfWjBPE*sTguTQF$dcx&#RX-a*mmtWX&xiOr&XmSsz(9lcHT*rbOs!&H!rRyW>IEDC 
zi7I}N)YdhUusGtTs#RPpTtFbXMbw^Afn#y{tuDxAI2ACH^b{p)y6`GE@+`V$hSTq6 zYUgxrl+%%)sXZ}1Cu8rtrJ-P4ui|YIe@Gf<}Jp;G_rR-&4+JU0gJ|5>n zd_MKRx=@rbm}^OfvkmbCT^=V6Oz|@)Y^%}?_O3z{M)K7$TDrL?i5tU*G&&6&^43pqxoaN_X zR!nN`FHjq8qbmm$C0<8KwCpw0#kQPg77B~b$xgL=+qdUuL}!O{_AefpbVajO6s*bQP&e2bmfwhFSD#5GmVV_S*gDfX{iPnH9M)5jl}3jQ~8B{~VNe6xC9>PAsQL$~d<&hXh?MQ>41|e#AmB zFD)1xx$Hb%p6uPEHU3)id|sK!-(4`@;M5xd-};^cg~rw5!1%OlDV|P3Jk~rxIdsfN zV?wn1S&Ai!qsOl^5mUM?+DaTQ?GfF6yj65A^k0kD(IHt)>04nvZ)fc@ip3<@weRX}AR6S%uL#*n`thMdn0cHmHq z3a-gkZIj}BpE@~O5)q;v(&L8(qUx`pOgq}Bj&Uq-+iLpZ;E|Qg5a>or9Fm{j;K}%D ztq%c^u5cIhlCT$&`4A&-kiIt^L>)9b9hY8&jFpUuRH<6$iF}CaaHoXVHSSrc`H+31 zOlGJvWkYi{n90jq$dF0yMf{(_W*0;rz_&ZoR%({(p`wL zh9W(vFa^;sKHi^G&7qHkDEr(LnYiT2%1l?U3Qii>yH$ zfaK`)bpT_0t^c_t?tN)BFdvYV1v1i1#^(x*!y;4NzkG>pCrO95TjJ+^^==quv6JF& za*QycuBcqXT4~_0w_$LG2m{R{t#M&wt)y4GBN?pW;;4lPn^YMhD|V>}Draqeyo$(b zlD*-#^;U3sqsP>@%c%7kOKyIL>{=Y@4Q%PSxN&+V8-QN8Xv5)fOF?|0go);H_!GNx z{4W0Lq<1(AhK!0!27gs}R3*CV@Recx;SU80I zT@X>8T`kq_E@9g?)v<&+@jCxR(w??w_{0|b{P^^ERpQveD&51e+F|kW+V> z&N{DSQ5Z_1ez;dy1lQ3)M+2ILi)^xtEawZd{-J$&aEJeXl3>p}MXadfsAR$j-#UF6 z?8J&^eMSBD*A>HtTuS(pW)vtWZ$b-`Tw3NlazwSp3Aml-L1fcOg^a-&>QW=!Ma(az z(JW+JyFI3zZ%^^@|KsefqpHx>HeMQ(Zlt>fk?wA3q*3XX?(UM3Zt3oj25IRoLAo0W zCGTABbIy0ad(Rj*{@g={W9_x(JLenE?|DObs0D|ngiN0q?AVV&9Ju{UGwg8c3jGh6Shqx>3jH_C(0uWIW`>4-;4voI z-V3HTa}_)ND%dh6A_h6 z72-Lw9J;kUs~y)W4!c|Uwps1^6FX*tqb?z8n6{d z!24c3|54#H^D~F7@6jPIiE_fk4a9G@jIr2cC2q&Fg+;GkFnLDcP_3S4z>n7)Kx)PWO&MP*p(P21I)Hn@aJtpZz-DW>dz}|T)g@-X4 zQJ#U(=U_%vlKH7J2dkIs7&f}`;UDoVm-V@H7jLKH&=^_yt{sd>s zaK9qfzJ)6Wr4k?go%@Eft3y|((*@;EF=2WhU#u^C`}Tz2Xtt(XDyzPflizvmd~NBp zFQ~GY%%TGIvy2wWu+}PDHsz~ZyumtXL_(q$ErN$yELx_vJazipbUP#AToo;)QWQ;F zn7Ev8I%9hL?_+u$di;B|15)c}a`a+JxjqcR=f7G#dx=3fmvqi$N{2VdZSlvQd4NYV z6b_Ipt3rQ6n2Tp4HjMe=dqfg*gh$j$FgiiPa+`;mz#EU--wbiN31eJ)6f;Sgx|HIy z)8_h)W_@DgOJT;2* z$5ILpO2*fBb@3biL>Lp#?FVU%;JOJR$csUgip)F!0cGYLTv@`A&73(aWeocxqOeRD zBRKNGaIqH{z1eLpNoELmAgy*DW|7n7aLLc@PpLD~9MKO^lz|c~FAoWn{8SWTl(HLe 
[GIT binary patch payload omitted: base85-encoded literal blob data, truncated and not human-readable]
z=45X}j~A5Zp(17fy67FwSs##3j2iE(yN{q?ZX!V~X7kD0`o$UY+W)#?k?g&5$flJ0 zz0@|=;(-`LuDqvxwhoddT@eULNc6lCFY*>3xUII&coaS4h|^E_0vG1L z6*&swEgyv9W7$Lkf(fzhW7`s%-A~XmY5RrzyUV$?r62je)Bo9)N7DgC2<{1CqhlP8 zKZJBx<UBXqy;XD|4?xnA^Vwnk=xgqLmvO_kh2GdqR8^5*$otfZH}8cJeqMeo`%< zxoe}VD~qvBkDsklSvje!R2-E68blV(!dQJ3T}(!;W?VxWX2bNi4fYmUR-DPPBPg)k z5n|<`c%YtZU18ack0XK5I2?d>(C=$*v$z%_{2KZ|)F!?MO!Gc1)MD~nZ1?iL=7`Zv zbTVA8{TW~LnQYp=a-x!l+4?upBKRu4iPJkmXzJG!`ksLesF;sO^s{d1U%3>$63ZN) zCadOLYbiK3sf9D;UwgLR^0a8?%}lt!IV}`Z*Qp#DfZY;N0?)P+4%JPVbopxJ*^i-a zy;+(+H6;t5^s&oz9$Vv+M+tG+M);1Xrb)CQ6D*;CfEU5jxyL3-sN@bpZ324 z;Y$Y?k431fQw7}af-5kRUXOF(kS`6+&8H1qZG{SJrq258|Z$7z9f}1}pG9 z6%O5)$Hm-U#0*}Y(mg=&q^hqT7&Dr4C7H}?s=M$^flMF*zb~8>=@`EYp0Lh2^tb3m z{%CH7Jo{F5FwG|g;`=nI8&jpNYYy_f0zn95Y%ux>L{2&(&9Ujvd)oSood95SSTK~Z zO*!q;4%FEo9vPcKttfi;tjYxzWf_ex>45o2vHH_6x7R?mexwo~%lb9SYQ6X8BjH_C zw;pZAu{yrRRxf`i%;`6bGfe^8PI>WgV<0tcpvCvJ)%I6-U+GhVVfwbja#Z)W&l`&lvAkP%ay8bk=o6OK~GdK0!ofqH#?4d7ON< zHT2QSE+KYeR3p=RUgAf!mGk{FCW9JX2(SiQ1QM`af&F<;pC{xB8rqzLm72b9OECXG zK4JT+L#P+fV3mgO6~#owEK8m+KmFhIqA649Z11V~AJ)trK4b?I+FWnCs+MTUoA!z* z=(m+jYE)W$3~OS3Mv_00s>)i+VLw!f2sv}OAIk~RLI}lW(jov{sd(J`iv{v;&KFn{ z#X659ix0#VRJ5U@HBHUDM6`N_y*27?7ao@Sc50wb*CfEr7Xdhr911j&J_gc@ zy&A!6QsC~_=W`DQYJv}L!lZ8Ye}ctEG?V)Z-mhGZY#hrBS|`kDJF+Xh0HS&6YjcyS zmvTi3+p;+U!!JSeOe_Vayr5ggpP+aGx@YXUfc(e0Cdur~R6uz{9vLQif=haOd348X zrPty=GgLEW(D!C1?svEN6-zMB9+BBkx0-{cm7B0$1Zu-Z2Zvj?@c==SVGpg*vd61r zJM2;A^2ST#267%L46y{qO3iELj6UB2shdWNZIciBtxJGgSZcgOc1k?_f{HdoaXQ)^ zh|MxGi!*;ZRE#PSwV2fpdI`w&N_n?~^n%uf1#%-JBbbxZL}D4haFu3=SAgSgXSgqrV2L^!vKJaw-q*n_#J)if6&5yb8zU;_*o;oBV z96Q>UR(NUteC7PGg#G9=MO;Jn>0hG~!P0mLE*Wd=)<5E}}XEvz`AkQp=4M3qe0ITj@#FOPoNvd3`XsK(NL`)GFb_>bh;R?jjkP zWHsO99>!GEun3GFCUHyIl@#J0fijyc8jBG}cuS2&Ro0K|$5J9509lOcPy;d%Ik&DC zk%&MOIBNEanc1uTk-A>!oJ4zDzqV6qzt4ulp!~cw&PSO;r`T$r@l1Q^mSTxubL88v zVt-H}1Y~nOh2Nuue)|7q&`)i#)(rVZH`|-V=KAZmsKWm-Xg4!L0>~7%^Xzzyb)JWX zx2Wd_=K+tdVqKKijMR+(c=<`)(SYfS!Pg`$Cn0-Pj?Et`TgY<{TCh^sAHej*vb}-= 
z2}wb~GZ(Bp&a{%n!R82lvRSw9huX&BJsQ<{QPI0|%`Zm8722a+Q~)goe|NGL8^kR7 zoBoW1k&U82l2;l9{2Vu3(9d7ARdww+pB!_t8Fl-W<925lUO-GHvXZ-dNT+w;?lTw5 zMyK!o0N;g-jvr-FqJSXLy_X<}{x2l{-m_Q%yMh6Bx$^f#FJO_L3i{J!DP~<$J{I3@ z`k*-h6-hNd;Lv`t3p3P9Q;n(aO6hw(u#f!u?}PmPh;f3?2e}^e&nJ4??n7Xf{!A-$ zSHIYC%noaU#rlXw(YY%uwK+{A!pCHxnlL-m$=jxwOGW9cn@q}V_DOux?i*nzRbN6TI!_{v8@z=PsDXp%OV z7g!ET0fd(3X)WvfAn9R@^@+r(EfaOYgj(U`iXtsiG@66?1PG&J&HnFYE!DI+4x*_vfC_rz=VwfC#BxJbJ^6;0 z@4X*SMH4Mt=r5$#(RjhfL@2s^v7fAO5AWM2%S>rcWPg>v>4_H2AtN2CQkW2>Sh=uX zDEqqi@uj}A{R0-Llb?*RWr0)(V*C9mZtshfgA|d&^W#pShndom(nK9!`bCub{{DuU!_vT; zo-8VYeclzemZ7FmWfBv~qXzurq6HN*6DpbQeLRQ@^%|e4Qx<#KYgalean&?n*dh31 zsN+n>c^uuo3Dt*mwr!zfy>j+(xlk|C#EsgJi`)?>b;HsYe=I8xfHfc|N7g%N;Y==5 zv4kERxEd1>er+9U|9g0~vH?Vbx9+#VEz1*wY&dKX7z-Kv{$48=s?*7BFf=TjmjgtZ zgo;=mwRms!RNp{RcCv46~HSr)-4}=5O2?~(iJ0ocCR>qYAIfV6-q|n zvxT0MDJP*B&wI~E{r>ayLmjxd@;xreK_-EakWj927Fqy=G+$kM?D6JGTv=oR2}Ko; zY;@`d^G{UIGj(tLE-XeeUCTf_tNi?L!gB%CM_Dvum691>e#~=jvIp- zT@w+C^J97_`JAcBjeEMMU;Tc*CiCs|l6yN*-QVQGtZXBtT9{OaV@ z%?tn6GV8KX>`JVdx3RHt@4vuuV_T1fb&wc1!ts8m3k_2O|xZr8Gj@8vzoQgEI=@*>n>F;H?GfGM3KS{9*)q+VH)02|wPCd2Y34hGKdsIdN zr>!#IR-~MBDw*-k!IACX6bMj6o0!74tz~K|j+d@&56Fl}@)_Bda)pzPI%SmVz1rR# z6|pRdV|kY7w+E@7rKV@%s)fo}_Dj)}n;2y5-(%tsAp;-*-Ku|XXf}&Gn<=SnTWYAU zKZKXnV-vadg@Z9lOPQFLNMDP%KYwmrf3*4L3>gW3E&LZ`$p7$^kZt*2<_&baf!tmS z-m?HmN9R;hGLp*amw>Ph?GcV4gz{n8ChZC80A zpDY%05g7KuNICET{SJW}M@h7%#(KWhxa39eabj7M8Y#4r&fNQZ4i=rYW=eAAv!^Kr zH|9w2XBwrh&C)j<#ifJ1yViPd8(~2D_6G&Rx%Of-n`TeNWm?~@>LA4P{^A7)%KCF^ z&vnQsTmmmYwFQv7p2oiYhE9(_K`9}RXxIS)D5;#=03c(7GAIuV5oVM9UR|grB(>kcU;w6uPSMSZc#D2 za7kPN4bp!EzdSJ+xqa)S%z4t(iW(}4L-fVxgCl;xkh(72JL{P+(W;Ig$Byj9k{5hpw*1g0om?#|x zAL%TFkSNOofM~6kkk^s1x6iOq1EYY@v}k8WG5UP8HVYXV{0wfZ&use)ZnY0muU9;J zlewPz-c+kImjAKEb1nMb&-zDfph$mLR)dDlxUP&n}8n-6M@%IMK=bukc7 zV%w<=?ICX>JKol$Ka@^kG-7#CEiH!MH+^oUJGHm_?D=T@7ZhvV-&i)!$CH1^Gg|3> zZvlafWXbrVQn|GmrWIu9BOiD1_Rca+_w2N-DXJEmU>UzC8TyT>UhKcLRV30C!)|4d zZYdno52}XtPR3w-l3{P#)?vAy0(!G2>i^(%-_GP~U!-`8Rlw8r74XNM81##^h(-B4 
zLq_h+Ro8ci$Fwm%&Wx9TcA>9(4vXI`$9sCOjM+s>((V4hDGrka{V2OPVuZ1nRB+1=Vd^d?H#)YT;OZO0792m2KK0c{6fsmrpy zOfZ$p0sF0yrNulXG;8$Z-ZT>#V30TXoFwu>DplUy=6cdpR+~c(i&Fv(N;2k&b8fj1 zHucsh)8<6nrWp$v-2F1jlfDT%vUTFhQ@+R;8}g)>sYtwP&lF1~Z7xy#x4CkW&3gF% zrcZ{Q@}3<5LsivOzRzqyILtnGv={Vb(r$kj_eJ83Dg0;FH^PytO|2B}HQJcDx}jY7y_*xv_-6l49kwPi4-?-jvbD0L zvM)S8szusTt47^W?dYzsDRb^5x1FCtqtoai5Nrn5tD`p^=-sUbA`))BXGsOhMS5A+ zyJC%}o61S+tB8;({H{UN*T0vm{>mItC3`4N(5uJ2xyTNuy!{%LCam=5H&Hm4Drobl zNGNxD_!v{-!1 zmmoY!^#1lA#(~nSR&DkQvPt{@K4{$4 z=cjW5JE%aPuL&7OXJ}BH$*3_vI2va|~U5sM!={{YsK7yqnGXn9oQ({{(8WJ{#;}PuBy;v9N26mpv+y zaYk{@%u2g8U{{n!=?i#z^J`yxzM5TQ;e6k5uX3X~nm`I}Y*FBIn{!fJ=x(GmhdW%$|^ zj}Re9rFqJRXuBFx{l|^9>k5v($6tv;3gPj?pdnOEK!>0iDJo4FCN*N3#})@#aF@4V zG?q*sIIM|t70>%^8D?kxa5Ij*m({GLrF0>Q+@L_+v#VSH^cK{2Gh+H^Z_?efn!mc^ z)ya_p$(*!pxuU=X`II#^o#^(*)OU=*IXqGryYv&R$l#3ngI6uz>++>xfh&edndf~1 zInP6u-6tjzKmc6pr#~fhF^#M-`j*~$)-ar*9W~-X?eGneM0zl~I$e6qKh9+C_s_Zk zJvT?W4u{x!3R+lK6BT_9$y_8=IxIrOHP-`GS}~RG7=CcY;GZAX3NQTWn2=snpCjh*4bGi&6M@yA4u&)c>xm*4z1EXPi zjtnvExg9(CwDFQOGPn&xQC_LL#(CYb%|+0KcUgDP;XJCkIf-k1a`EtFV<4W-h0Wvc z+8@{*zE((yKXjEUsEW^dFco2Kwm}aQQ6Xq|jX;B-KA-a>C}x}E=c)O0bzqN$<|ajE zxqY03Nytfl5^K@6XoKpoQ3CVbM-Y>5};PMq_>mevNvteliljOedgdD_vdazn>@v~7;FNNiS zuOqXY+)ORTwq)s}P#qz8FoT*oy%1ur; zR0Pp~rm@CwAeNs|VhMy#sglV!u73YVHM#KFw;i>eru>`#{LOg9;MZ>9Y7_UKXSqr> z&-iKzztA%VToATGObx^KY8Pxy_eK@oL4y@ga1cY&zWm_oQ-gf1vwGzkx20&eH2stT zp4hzxlJRa675{q+b*7f1P*Hh)Zr^^R(G24eaKBC46t1`58mim)aJtEhLCDtG*HJXH zxjY^VCijUKRPF=p)?vQwlY1(Do&af5Y3BmzGg)WlpC0I`GR6{pcK1r>AKh($ZiJIQb$pl;pK-fm(3ju z#m5p@^Z28Cz)*cPRBM~*`rFm*)A5dcwnZ7mXmsy(kny(Iz4A6_6X}%DpXp#_8o%k; z*%z0n2fr>;!AKrG1jri`X>C~UKAe_zf1*u96_3$DL>@4(R#Z#g$V|H)yS{F487k1V zMoxbIc|O3U?^t>;iS9V(aRclEs-n72Nv?ey-+kA~z)RuxHqUA?zZhK5kzlw`M8aTD z+ipqwO+0MrP=S@j=_&~cj8vRGTLlUeElv~vu2dzM*~XtXgfD$~$s0agIMT?4BRw6bg`D9w(7VWipkgd~@NYP)v;5fa*KpCG@&Q!ByMrK zFzT0L)+lq3l}foPs>RV1(MASz0@DSD%X&oTDEI+^F&|s3J^h;|s+i-eMa2_qDT5bg z#Q;$as79e5{*&G`!{Rhe0P-6pk4#)&BX-YZH)$Kc$us#tp1JwmspGZO>VaPy4%_>QCKWdA)0OADa?#)$_>m 
z$BO%H5uQiHJ+prbmtsxmV8=J?xQ&UgHesOaMGvG76Umx+{mn45ksz8}MAKP7{woaR zYRIwuT(r<=(qk4-#TDuH;oArVze$hW4t!(tOdK<)H%J~F- z=UeK2hf!&zq@R_~1Kt(UH?&agcdK7Y4n6dLEe5@xS>nebvWIQI_Li!i&U6_Ri5N=v zfZjjlYP^xiB=U(Bc$u{C+WRpp2FriGD>Q3ZUf;V(ds$A%{%18}l?<-(x6W`>(KJpF z)aR=qFTXy}JoaxyJ@@8xu-rbPXfz1?jHr{zHBv%C_om*45ArWFQ|oBE3nZ|88@PGx zu)T>+K)oTKEHW`Z_X-(4Yi@X_$E;dR6h$7fZ5Jj&tWE4#EyVqlz9#LjpTXv6K~#6? zRcl7QVwPURL~DM1jPw;SV`-bQLqbaEj&R5gSDmB&<%9s?)Q|aJC39-|f}w`=+c~w7 zEI|MB~35{_7#I1xMZz>DSEn)L7PP5bK^8%&7 zZnphbz7_*veidgDu41^qx!MS8UO|pRm`FF3?3Le`KLh7SSZ>kZ^e6D?7Qx5rVJ;OB zF6eRV_f<%S+VKaoW87Q(YT=Z|IvbYFPZ@x{m;0_*zYgL+7FPCFW%4i2+764;M!lm0 zi*@%mMh$*)OcEmMYZ9f!0khu(trTGc{ASydl?Oz7y1?e_7?Y@2N?KNSNjgUNd}|#t zw|KvJ{h>i*=HlU7wEJ!P;BfjASWL{iO_a<2m^xg(#C++*4EOpGNHer zq-We;nyRy`MarDX=1!Du-+^0p;)hP_Qq&J-$|f+s zH+Zp~?c_e=JDLa_-VD!IBcR((AF}MSdOrR+JgQI}%PzJ`9#5}M`VCr8`*_JfwG||| zy;OUDcZ#mo5&F59cvNtq*zpHig;T%A<7jd@ZiuC{QfRxyA^Le#(#l|86M8@}aRwWj zoGpoRu64JKq-ka(NWO0)zx;rGVzSB&=~1QgwQSZ_JBZCNU45k%7~d+vjRCW$ z-ed2lQ`3vS_O8dft%puV)ESwFr+!)N54r_+BQQ!5gDZi7s#rJ9q}#W!wx-l>mD<7F zF1n;zR}z|^nT9hbhn!Ix3IQrJ1YsickeEPB2)!P3ZaG~b;>L5@-V|5Y4OrOIrMs6! 
z+MHMfItl5S>zP@vn$S7!P2-x6i;}Yq0EkDkBL4*|okMT{V~(|ml9$L&quvNe#|aU4}MGh3T6M1tHX&;4eot&aSO$UHOBna|nK?4+kz%krlSU}Ks8Kc z8~x%FwSqJe6+QwaYQ0Bp(zkv~cj0`|56ivGa>eZS{!C_wmHh;f0Q&hAiFS3=YDd=0 z_gBU5Bw%3H+1}?2%7zrE@i58YPovcIeHi(6KTj+Fd{7)i=$Tt%IK3T>*oMc$bqM&- z>9(9Q@oT_wFxhP~N`lv=`mN07nId4UD%f=XyI_iv{aMQX2~K&tS%6=%I9!T3-{{)T zl!e2n9L_0XAWg4hYV+gsBx6BFcZDseDa?y_2$N#ndqkSj`0Jt zxgmK>VM4M;e&*BDuPy8CyOKEBiFl^py1D|MIT&-&-+L~XN_9oQ!G0D(mU~IO=K5B( zQZZ>r$p;1^0QK4WbQ364);C2<`eryp5;#nkqjMUaq8~y*lu0LH0@B9pxIy4&^S7IZP3i|9qrSL%SrqT-mpZkLHU1ln`R`cG5 zRprs-g_8VtQvu`BY3@=H6hwXFe*N}=)QH|6VSJs+9re*!|2gu1`RAz8tzMIIn2e*8 zEHp~^0~=S(Rz&?4_rDQ_8%O7J60wG1wtF^HrDrU&v%u+EvM*s&yo~Y+81d02K^+1S zimuq78K$}Me$-KcNjh+trd~)fK&eYVHfw1=dp}cIDbk9VUHj_QvzAO_r4mQn%V$a8 zju@Wo{^(#ZpV?GjE$aQvWh!ZAtRSfEX(pz_^Z~sBCH{g}1+ZdHjmgo8pf(-(r9>?6 z5V+xZO3d?Pt*Q4*B*MVKHHE`=E&fLFTy0MyWTElSYnm%YaV(#ak}4B+)cr0=z%8b= zvq^iM;rcaT-&_09J=B=|Z)0+U!m-kP87FGvyuIO~*5w@zP_Y`JBxdG>edL~0iCZZR zGuJmt#d=+t|B>*bZQr|}-RNDazuul3m0Rd80iV%JI;}qMWE2pUO5Ll$CSuU@F1Z3@ zHC-rOImr&-m zL_>8`BrfV0jQ~@OuhgI<FyEJzhJX8 zKQo$GP$N4bGjg7BZGisqAiJn}x<8o6-{6tyx~mh%^k%NkA&xcCp4+M|xVAVI|8)Ka3dLx}Tbu5!QnmgY$ zwj1-wQ=QE7Px+aLsm;Z;s4fpYxasg zOB%bT{eW9{l9a2L!=s~+_Y<#H!}YKY?dcD>|J!W;7s zdyx{$N~wl6zbUHZhN8Gi=6nFHYPsdX%SQ~oEw2cBx~@uXN`Z=ss_fIWFv+z0^HhOv z!mX4~7sqqn9^pu8h&uXAmH|LaGG7TnXGboKe`58T}k&H8+R;<;5UY>lL{N` zKNg22|95dni5PNv_3|ddUe{EcjRFjd<%#)86UQxtCj_hr_#8`@S(|U}2nf)d@VbRn zHo9VbUGAp1@=@Ti5F6o0kDF`E`L|6=PJ`rtPAJqJO|O&&Lx5I0iYCf4*-%0vH?CtH zv|$(Hr02bqZ-O%9HndV(H_ci!iHF;5$2HL zhZixqwxhHNkScfHj2%>1ZsL8UyQL3THSSJXY4*%r^_!a`qMaveXb{dZS`_$-QK+vB zt+8AIab=@3K1RNtFDH|_$k6;enY_IGGvbFD4Bgk#%A+a5F*14_G|mj zC7^aiB%!JQ zBkJ9vgX9VL+7tB@va-V~Gd1MT>um6G`Z>jdJWK7`89AJCPvT0RoVDXGDyICWL_~l% zJed9D_P|3Yt{HU;45|@+U_#QE4hChvCV>h?XAT1BLb!~tt5M+RL!`e4G`EXp(q84h z+ZCEqeS1zdcbX%&2gU?>t6IOf$mTqe|4#QQ=1IRed66w2-Kel+22-M<_Xjx_lI9&(;86Sx{P!}YW$eTN;gihqH& zYb2GcDknH8=p<(#Q#^V55#RvKrBb&OvfImhq@4@bbc>GiX^yws1pj^h5QYCd{~z6R z7ll zaVLC~@pQ1s4*hg%UhCbLS+_%EJjR(|ZG*7GtrPs!?g&h^g6NLL<~;*l?oQ&(|F$+y 
zb+o3_o9sk_lgkln4)}!klcomCEDeLtHal(8J7Y1mUl{)Dwybs8LFft8oz7r!bu#y7 zA|+7^V%We)7D#6f2-W*-Yo`oR?q;sHpND<%;< z47q=BicQt|Ze#ZNF#8g9z#0~(XnO3QS*LwJ3r04NpQ=SRF24uBkek5;$XS+%kB|Rd zp8v-6#eN4<`}2sfXOEnOv7je#WzILXxp+PTw`K2X`vMpeM>GG` z5M>HjUO1NgWwNW^E4#OC?9akdZmO?W;QPIWBm z%q7G`P>P|E2s%UviPnUMYE;-Kb*mN~Py3SHsR;48mR{p4wCkJTz`Mw8-0v54wn*^K>W3g-t*LjW zLllIjkIzUhAbqd(#>sl=u$iwQQ-c8+qgAKx!3nFw{127Hh_%GG(o|)ogWof+ zM^%Z*FeQPJaRrR@N5?Ja9iN|*qN#opzgqBUdJ%fTy{gM;@`o>GK;f#$h~)#h@F8WY znBg;;bE}^PVKiSy0w%JNHiz9P@dl1QERgj}&6j|!&7|Liz3*$Bruwg~matKm6@~X4 z`FGvO^V6-R;EoD7eIxqA-GzRwJy~F~TNoRQ_=nIRYL@Cckh&0K0|AVE8wxt{*I{0i zK=icrz>gndq!&yLz~~=Zm<>z4SLRUHPuv#f+GQ<*sGi(+_*xCA8?2aGn#%tWrU!ElzEI}@p5#O$^F~xe`|NT!;#t1y7>ls(@AQ+)RWYX14?g6*$3F>U#;Y_75-3*?=PBK=51JX`SArC-fz-=#Z$dGz}NjCmStRLfz9P%JYN%Iy04o z5y1Eez>KnV;?r=i+IQ!B9wC?{1vWi0<-h8=`?PCtiQeF0Eh;9YoGJxya+b!vX6v*oc;yHb&R;URn~ju_zU@hS1b@swy1+x{cX zdn|C9vwwc>b~y6)g~^Tvx>LLG8>eFFDlLtgIs8234or5YIxN(uYsioe4{C!pQLJ@r(IiULmVhukCOP1%{Yq=LJ|)mZ#C94y$O8Ew2{WJ0|B#sE-S}rW!*ChpBQI zi77JYbanaP&ao)9(HwLrb(cilNHm4iH#-#oiO)W7ihB)JE#`$?>M-gw13F^h4_)vQ zG%Ugjff&9QODRLftoYjZ>lhdYjvcK?eDP7(AL>ATiyx{S{gcfcvr}F(RQRc^=6;)* zS~u%v2_)18rZJPPOj0ZUp(0w%e6&yxi`HGe?A#4ADF7JQ7@kb$?eYp~o|6j%8d*Ne z1U;CA24|rApVsvHb9itMp*hrU`|DIpc9I2VMj=lmjoB{zAoS5q2`&7tdoxB+HhV=8<{L z!u+YfF^F8y+0Uz8;^?eLJvg2Zk+i{yMcjdQdPAVt-G*(E`Xs|_DEnzPqvBmigb9*o z>F=Gm5~ZAPy8q(`YVltkrnD1`&mhH->qmYia{>?~XHk%JKq*R$CGUOO=Qq;nu(C+W zQ~|2yZ&!XdPF5?eO*0J1PJ81o46}PDEI$N}#)40{`2lmZcJ`bK2~&mREk|0+vtS!R zFmz~BwSy6WV3sk!(q^%w0I4P+nQ=%s6pqv!^g*{R&yX3129(2jp^dJr$a|#IhSM2& z8dWgd@-G8HAvf?!2oC1AfRDp7QVraiURdu+ zwBBZ40wa`>PdW^=pPoa{e_LP`!ti+cu(~RF_%XJf11i`MAQAc{0><{ z-7h+t|0^6j+`-|Rx-(k&8tci{7?JKA<%N`pKaQu`Ig3xN=r7I zLa`%?n$G(oUqL3B9{x*Y>hPcJZXc_zrBVgnjTDu`V3S$CRO3u4eOP&Wv-g3nh$P$D zU%5#~JwW^dSC%ozSbn=^T-@{6$bOU*;Qts-^U<;jLZlaM56lB!Pu_NQ0p z3}J$>hSyv9(hzVPaG~KTHK~d;W#7WV8PoC4&?88Q1nT(ojdX_}>x8AfjXC)B6U71K zyT$cNN!#SZ0})H-ourot>dtEc*yY=8dy;-Gys4Ib(i1C+V!n z!IF$(!tB5vwO z4`vR4rrvk@%-mpcQz)L_hD(?!U@_J69 
zfoUs3sAl)}oAkI;!cpfg?Ke&&Pv#sO@g2iR8FAhzMf;1TYkQV;qz54B@p%Rss)v3) z_nU(5J0XGydXcH1udGI$CjM5`jB(WTluj_H)38pV_P#$h#b2RxMXsW$-uq;JV_6R( z<=Les0fKRJf`Cxspl?rm%RaYC^VOa_`B`BkUH~at6!=xs-H_R*m9-(MOFUhuM$|sk zj!=tbi*2pfEZ(h}r&#$#snTqLt$%tX+gt&_+Fj;?b|87(2`>z7>q`dyt z`ps>P9|TT`=UG&{>kS2P8hwWNN{4Ru^Ws)d z&;zql;fF?UNK7jId)VYy6=A2GZfhf}-FjDBo4h>1lu86u$b~zQr0sG$li)MMm#=!> zKY~v^iQ`2ob}vg6Y299=JXtgURW0a#@1w(Ao^BRXPB2^Jo}^f&n)Crt zZUsYt!l_rDOK*o)SC?d=Zau`~#vKdmp2VJ<4oIX;cE<4l!t(b@zV6#STkL>#sCiro zyBPjEC1l+jM}JEZ?e#lJV{!n2v@Nm2mu3GbrDkp}e0@@Xwi6(fi9*ApWKSb(j32cw z*LGq&Q1)%#_R(&aDS3H)s;bBYz2bIVR-uBXw{y!dN67H4P9e*);LWgYhy1rGIZ~SW zI9O!fSru_hvWtX`ZZ{Ab8X7wNIzCs-?SK7HNoO>IH6crYvh_#(+K%sU-z52gF1A{K z&dAev4LD|#gwYEgf{sRFCz^#|z1o?h1g&RX$}Lr4MoGwa*6W%tc^*FJO*LDe#Q;Q2 zdDR7FG*CDP#V#{vTDyfG6Iq>WwUU_}Ec%HRRKL08^H- zK(JK*CpD1lt06gZBGkGoQKmA37tg29snhZI1^Lk9l)uRxAIPEnszNxGvts;a9`~m|3H|KT zy-#;1r?GthVhlUa5sEwohtb+Eb$8;Rj+4!lK|L=K?xL4t&43y6v3LwCu6VvJAU%eS-GyCUrcdb)VQk|Drlf8}<5i;I-u zkDdVJvL~|E9OELq`1!%_6_j{?NhK4FfBYKnG<9(x2;(?J>@j`+8D_jbmB(1nc%CNl zkk@6LLfn@)DwHitc({3FM}mnVu22)p;If-b=HlV7%}q8{Ze*mVhx|>Xc)?(aq$ids zs6t}G+qTw*B24+1F)KWVN#5}hyqqj=j~k=)s&o!Txjz5yX7bjQb+haxoC=SN%+*4T zOs)g0$y$7dm991NI!-1^_JFbP}^IKXbjz#=(j|) zs767#A*hQADRo4d@H&HT4X1zk;G`C_dKa9m2q*B$#l23%RMQGe?x}S`_bt#kg>xUA8q<5mw^UJDGTJd zM)hk#o{@errbk6xW2~wuZwFEUn`N!n^Gz;4I>!>zRHu|D9|bb`=i``D9f=%U!x=(R z&vKRL&<^IC|MuFjA02nd?7o^iGIt`D01ZuJZpy*;hwp`E6S(qLhGigLFtrcZYO$2uOFQAo(hdq;z*mx3n}! zN_TfR-^SlL_n!Lhxc9$zI2gm{dG_9GuQk`4b6Fh_H|YcU1&vx^NWaEg;es&g)Rkoo zDO=F%?$=_C``zj|FH*m`JQgqjr{iC$TF5opJ#Te0^=)#o+;5~R9B&j($C`-U72mI! 
z7^}2xek7&%xw$wI2>|9Q^K3kZ(`DUe>pn)Uy+CR_eN}goKtA2Nfg-2S{hiSPwv0-N z@CJ&WTvL1o3go+mb(`)0Uw6d&l1Imq=8cHG;zjl&-QVdI5y$NH1AFFQvW!fdmWWG& z!}h;~89qpasKK8WA_#?lhtCc`*qB~-cea1z_$gfOa4?9?=9MQ|N=2!iP7;95U%v&} zKjb5ouQN9)nyV9w{B*q?P-wbP`Rv5=6ye;k?5qEFjnyntCtZ+!;el!=vb8%QZ=6VS zH}Rbg&VB?8;g+u#EKIFxXsMR1R@uOmGWSPp!ht#yA!k2ur8R1bVk>Q$>W@EZLTCuz>A9It-S)U5xs z-5#!d#+>)FkX0+p(k}LE^m)}DKMVv~-)eijqd%J&CxSeECTn%6pEqo+zN>044wRwO zVdJmP01yul`_CF*C#Kfqh6~`eoMAzQP5QHnmzAcj2a*w9ke_wO>`y)RqxMw_`aPOOfvzP?B&51psnL| z0~S*86{gRgbpUkaeKK+Bc(rw=Jm_d9aWpD+;>ExIu};vOr;{{C#pl3xwLqf_A*~l2#oaf|PhFw$1a+0Z&xVxYe^{b!QJqS+B-+iCUUKa~pJDO>5^CZ#V zX>o!XHMxLY^pYm&Trc$4NJY3d65{YdIXdG(TO0}>*Q>bi2+p26kNlwS?9D_4V>UYH z7oH8+l*rIr*-erPKN+R-`+dG@E&XSTzk$N7&0&c~T8AURhny=4dy(ylMukaxkr6Y- zK%|VJUN&(CNv6O^umtQ|mi(nppy|zglbVVU5l8%#_xkp7e5Kaw3YS|v9hM1gLsq%+ zowLQjud4XC*&M$_)MR&GQ$S?>k)Hj`m_D=_dk(M)8bYKOBOKFcZ}>9SMi7+t z{e~Wp8|-naoBQs!!ksvCX_M##&fotVip0U~z7qVxX%V)upyBVCK-$`sazb{BQqj(P;&AB7% zz1eLQiL>^74k|Lun4e$5i54g;z0tQ|^2>K7m0wPVHiUvlgwQV6l#Wk=e+s;n|_a_7gFXnArYo?{`1Y z8nR~ldTKt2XCjx}xrLXj`~;XIArkB{O*)hW5$>oI45giAi!;^q{a3_5n=Tq3N^O39 zydh^CU7}GRyXN&;y#U`?Eb6alvGpt0Z^3PWgV^BV()Z2}nWZcA3b}xCM0{&T9(|Ef zAqLOsDd@aH!QTZAK)E%I*bsds3FsgBF@f7OtJ%dFkvqi6!Q;}8P%Y#BXcgJN6DEX5 zLD2=YB=I~gJCI5e|JtG;m2oqxVw=xd8 z0aRnIsm9pE~@n9n8EF6EnPw|eP5OIKTy=3D@^ls&m z%9q_OOW}l{0Q{_cg$6NlblUepAWf)=$2(h{jS4M30%}-IvG6*XOvsyWYQa4FA$0 z=meP2?FMXZ6sykiQXYHPF*`UsWFDdZ^mK}!`xDO0+tQ7cVrC)u__0(vzHZbs2sMUd zqE^dmRH*3hF4%YlP;}^Lk&QVq+pPncBt-QYa_JHZe(eufq+ed$!Sn*dY2|#y)^b(m zM$daxpUGz!8%$3jP>g9i0@CrkgvH#|vW`ZtSjb#!-EU}=3svac!>=Dspy5>{iKSu! 
zQkiu;ZJ&O~wOhh!(E%nmhwnzPnzCcFn`$+`m`g;l(FF%(Cry5vCKiyy8gw(qc3PC5 z6IdB#icx_DRV z!)vOag4yDI=c_WFFjaX`yE<59p$daFlqv=v)Q@>DY|kr$0B@U@-=*BwXT|@^iY23Z z_>&jw`;ePRK)j!L@(_vHKoxMU_1%jJVJ(dojFf&`THr5oB|h>y%4B4qjvGuN2oJEp zB%3BhsxyFyX9sP+2CR>vG{#8uUkVzV-DhBK+L*6!p?S;_&_QPNwD8s%cbh90Yp7W1 zMd9Mhp2qv{&Cf7-T+y;vqoBcm=IV>`R*FYmJIu>cf?u#(JHiNnp^V{Q@`n1Tee+%W zNi>XX2{^?PS2QUQaYlx5bZR9=3~S8_IN*+A;vnGA3J$(e*&x5g+z*`eNnwpa=zW>K z*7UKef`RG9p*S!Z^-zb$0d5fbh0hOt%cU2JyDd^?dNcP9eu+Y_qvDPi$hvBo`pmq* z-M&ASb#rYo*Id}DO!iU~bote6e~O>cE@L2FB^$eq{M6cbcZuNqn&0O7^~+5Hhtbqt zG>$C4p&|_#?C3mm70leb$vjuC&Rjaf=a7k*zE1-A-+4UoANx%Tbq3Vhu&1h?jyHp( zu?Qm%o1C%UjMck`DPL596dFILU(>H#DK=C7sZuRq?N!mNsp|?mrKE-?UAMC@O8W*! zHa3HnhOv?B%=YWv)4Cl-^JPUqVDP9SAD4y2SA4cgElHS3_zX;mQNZ5SLW7h5H(!N2$&6mEbkH(6~Bvn%9uX)8a1_P3f;*{#F zetU~}pWCHw=-6U7&=gZpph<&h9Fq z@q#KwGzgtK9-=e-aKV;Sy_vMNv%$dO#5@gL^RqviX|!OR2(7x^XT`2v2Mz77=xew^ z7hV2HS0+$O9S23^sps`2bqQZxhCNDh@4-wnQRa2H0VgT73mFWgmoWBAF`Wz_)_EGm zbdZ+n!uflQ;8~FF(!pa=BR!9MUp?w#dl#!UmOne+&l^zeooG zS)tJe47b-QoGwC~H=};faBIhb@PK;73@^Kvq|UKi^gDz1jj1L2TjTSf-=;-b2r|=? 
z)erb!)&_F#+7bJg$7WFi5r6{haTDvCVTBcHaK3VLgz(mbp+#@ySmB-C5ySQ%L?ZYI zLm7afmG9G^BVVT|!d1ja#+eWfWpfR;QQ_SiR1X%y8WHq1~U zQmV>bCw_8ly-F$a5vVVa1weg)R-9pKD0T}r^F}E!)nCQQr{%dwf8`s#XTc)^ev;<^fHGAsW|FE9+EO*>8&eo9bK5iSbPMAGZI-b7 zk4}pbgGJ4r_d$yu8!M+isoqD-n4oBrrVR)luK|c1d+7EIq29KTdQT#y+vje^H^H3$0nl$Hq@T6+;kM!!hjm2SLKd?r3m)L>LZp=9l-i#>9 z87iZr=bxT8^AZP-BWvU<7QS1NTXFm+**ycEML7JgF2iRhGXnYuZdvk1E)1YLUECRH`8Bd%U zLAsWtnZaTLF*&^KCy>wt){dkY2j%D^lI7=AM$fa|(}G*^hdT!mBw}3oTBishA0cOg z0PsLld7B;0X}<}JaP-a%Zp*&Lern_rHx2grs+GhKI^uMxjYK3%Wpp43U&XQjp7_2_ z4F>0`_wvB91So1$XmZ)Q(qohVF|SH znHW_8FIYxDF{Q`>3Wa{yqa^6fz+f+7zEWvk)6Nz;{r!-sF97hnpPr8Xm3#`Ty3JsT}>v8ls9@8n$HAkZC&A3n2*5|hL7^H|omK~E#m`b8RM)+trX-0g)oadhuH(AA-)m@F zi2xEE9`|T^A&tDb8<+h|v%VbKB5Js7+jM1mL*o$rI#lrIVWe1hEB@0Vyjna>?*e+) zR?aN7L64b`-!5Ig267RUCw-`(A?Lxo?zpU6USwLM^aBSq}}QC2r@H9N0Z*Pl19cs*mM#|&iO zWRKbRJuAqA?wx3Y8Bzc)<77RU#i+>OFG=jEmWlSfCD-s8n6lrV zq6SrVBH(UcQT#1_E+|gWu=`RBrM}FqW+H^}NF!BYfcxTCbM+n8BBSvch)KUaiXLT> zDbWc|0y10N3|?EaVxEZL24^4d#lQzfJ3ebvA$!ykg#N614dwWO(KF#Z`?Vb}Ag3-+ z&d^m=SI418u7;Nc32X#nwjZQIHT^~9w>24 zhl`M(B|@C(MF6jF*RH*r?vXnG@)z~SN3A9=hWd-dwq7uMna7<3+sRy2wFl)2NN&&z zhk+mZUDr)+HwHwMg@K%=N~9o~8Ym}Od*f8@K1y6slRiq{N?1K6fO77=W)X4UJGZlb zwGZs=9=k}7~0=FtK(Iu#@Q}VUW^~=lSE42N&y64Ai%CUcI&PhGF zo(Owv&E@@LYYtEw^m2A05>&4>nJ>AO9hZ+3%x5`GGLGB&j5nlF2=a%gX1*ypSZ7qa-~TRuYWxRG9ar{AU} zy4HUZh7uVQr+4g&n8s&v^C$VjA*fimt!k$VyY<;o^i^q{6!V8({Nk2%s|4y#d2D+Q zf60BcEFW$*h+?Ob()2Ix=a_8Ap2Rf6+U7mP#kJWhy2`$-r83dCB=##dKQfm{-8Xu3>nxm!u4s`E=pxaBClW$8CXIi^=X)f2N<)cTw!qj^fGu}^; z3{_K~KoXjEuG!E*N+A8vrK(x7IGlbafPc(AaJ;FKul?ei^xH*^jyKRruwsBv%~i|N z6{r%WJXVZ^j~ZVvS_^|vu~lcvQtuMh{Mw$J>?JY#KGpUGE(d`|Mg03~|K^|;l-evF zpYnih&57+F_R!9jqu3M_I8!Y*=eDZcchEhs)lKeqVF9YF?7{a|kCW{`^hf9xlDD_& z1JccB*CSlIgoyx+`6Ow6z|s>P6{{-q*6Jr7xPvTF=@rxYxmW>S_WP$wInVw<&h`w+ z0llc3hsI(uucO!d?wY^@iO^)&d7Hx}1a~{a^pf%j@OD}N1^F3;(KA^-9|FvbY?>CH{8Y-1bhxu zx(6aqMvnPxDh0|Mg!Umi>ul&8SqP)Rxdj|41gB8Z{i#*5)w`7lW@v&0QN zTCH+gpwMgwJa0Td3a)i(&!6{n8WXNGFUEa)%&E_;X%K0r|B!R#rTPTX>dDiOwWQ9!-xrOs^PD)a!Km 
zKZ$mt$GD7oqp9R;pWth8XUI@uy-=+kmf?kevs5t&1MCQefhJmh@H^lb?805Q5@`LU z+i55ke;p>E^MG_>|}lL zY9mdKfFtYbwGD@_@oJ~@(dr98RqHt3QAla>CnL740v4Phsnw`12mJcF4&*@?KLC+? z*(yxP2ML=G5+FxjD)4o~fQR?;GH!7I0B%$>#zEg13B0lZKlb7lig3^V_+zgYiHg zJ|`ioA>CRZH=bWfMtuoio^}A<+sp3>-_{lLtAUG)asP$E$DtHD=K6XzVIWY&0eHH6 zVKy-h9O5k#BG@~L1yR?HWX=aYC7hKvjNB%*PAiz3Us>GHjE5o-vwLI2Dkj3xPM!k) zscQ2DRGFlN_e&w7w4K5Q%GCNwlYwrVYyHVe3iBqZ466NFPwOkcv3jUF>sKs%_S*f* z=*Gfs^RfB4fEqY-Q@s4B=|#6`1}OP(j(|ac1E8aLzg}2xi71^4@r)SsW1C z2_4`HF+Z{aRqFP{>qEuHFmZ=VUjVqgln#c^SusbuK@5a*l#n(bstwI7Orrl5X1VU^=H3X4AiT?Y3S|1y zXUnZ`CqF%6MvD3Ah@vvxlKdK*UG?cA*g^GXsvz#4QkqkGgExr3-B1sa=YD*b1K$d^8(pw(B-H;x9WUdGwln#}%k9o6W)T`7 z;Y56G9UcDU&HKEyShT&Gx8#}<&F!EMb~RXx{mmLQ6+<;@PNFmBxar8Rwwaz;b;Lb%N+8bT(q}?|SfFwn`2fcxR zTOag7P1DynrUIYZKQMkVfPFFYxKCav_{Rc>N}e`_lH7j1*YAw`PNCA3`d!m>xf>lt z=x%r_zjxU&P%9{EeMLuIWPCS9PiQXkC~NynA?0`$ZvL1|jL{*PJB_C1B{#A@?)-H% z0C<$3UYtZB9I%)9w8O7Z&h^$j)$^0TTKKIuk zb<22JzN+9xwdk_~@GkO2u<1|F9)Hxt5dZtDoQD=N8Jm`ha96c?Fj(&4qtfLb0FKIb zjzT%nRt~=cx0}Te>2=lh7U_ouBT*B)c5avg4U$cl(=Dl*$6td;%)o&;rbrTDy_1fC``U$ zwn$0RZ~07<;%4kAc_f zMW z@3*M2PM*vFWW_4^1VxO0|Ez$BEPNSd5~K($Uj-H;1wQmeBC=Xp6pUFfR1rwVG2C^l zU5y75OC(cZNXEoqp6`Xb$ny*@KJL>+8cbwHML-||e0mjN%_(HE66y3J=m7Nn#qXKEPd~i*7PY%H75~sDdaIsMnc;kxCAVZyPYg(s(P%?Bz zFn0uWkJAT*qis)WBvwqD27qS&rn0{M*q_ zPayR&c^rA;e>-9z+9m1%=OGoTjn0OeQo%SOIqk+F`Y3z1lWX#AH98;gU%r;pbx3MN z{kU{0_a{15|53v}p6S2ZsJJCPiNx$nmIuB;Dl`Tw_lUR?n4JMHUS@)LjF_o5QgHNV zTXTY0Hi0CeduN4MFA3)x(DnvWsU0wj=C=LbuF;h{&ZE4Y5`#-x42{VTVdn^ z4>BR70C{AH_igyDR^YW$;iiUgJu4*u@*X$spX9;H`rr3(C8LzzLW_5%Z#HW+5q!{S z9a~tHE$8J+#Vo*oT^ixsf5>`-`mKWRB-ro1O!sL0b(`cZB> z9=JY$s+ZPiiADuGsFySe#%n2ndkVp%M$C(AV|=_ z6*TW`J%WQZDAsTo1+Eu)#jy5g+{gx#ImxvIxPf`8k(d1EgHSv{g&r#msI92JFfJ~r zRP=(u4d`#=(|i$F_yW8L5TQm~ zUOU3rk||#c_I(ncTo}G1hEG0CGXV{KHd{f(rt{WMx)2|5HJEeT5A7Umi7#osUb(IW zSD!J}E?|uiQsQFteD)U}x$i3W`1!R336=b`BblF#)Got+!a6V{As{`|__rh8I03`} zv5|oo zwzG#mUnkQABkTe-`T&_i3URy-k#LYU`AepE)8Ach&ULa}coaE8Fq!ZoKJCNGb3{A9 
z>5cUsWmeU}l@Ugtlun$btqtD5$MAO+1?;rR(3e>~vz)m-axj^#%oS61&UFzf zf8)a|L@3<0_V#Cm&PD!lHF3vtx8F{fj1PeiRjYK> zEEk-hpPy_9ob{5&=vYaqTDB&#+msvkAm?cmhd3Q!E|nQ@@{76_>SW@hk{$l&t>QjB z{p}!)vjC&R&1Q$K`k^g-v+To<4GzasS|B+tz^^iM8%(-+W-{o_vwQ)eMzGuN&z1;a z9uW&!TG|goy}yd&G}LVL#0dfHPGubwNs^;ZPW}esOdp6|oTBw)9sU*UnooEpoSq zeLyP@j!&#d1GtrUMXtjM>n*yo0alLi95K`cHiVYE#j^F)66J1?3l!ih_ z3f*EKoW6n%iH}c~>%2#)q(^_?K3*7l5P}SuQ+JQ3r{GrM>5pao?rp+ix&xb(v_x&m zcI3N^>F*_O;3$tiGx^|u`S9TI+o#<`zFv}hZL05$$)tuieFZFJIi+>Q12aRQc^5uD zlhRkdQOE+FE*h`1%i$IO!~L{B^i{u-HsY7B%iyDd3xvCXDo9J)|iW)&tNl?)WU% z3D@vf5&3bHOBxJ31W5qFFhgyrtktBc-2G10Qk}2I*d#sPy)z-X%;LZff9f1T5|bgX zsEY)j%}oCJLn3?c3(2lV)|j~x_yWAwaz8rEf-@U8h6$G#cePXQ&T8yC zaydR4>}Ki3+&?sAj{KS@T(Uh5$-b-~ay{LikmMX>-8KCEg~}N}C>gSRB&`8os@v4% z_oju^2%c!IqHyxyeSe3W>S)oo(M-9EiQWaeHvJ5LxZ&>_h-(#C3_^Cxr=n}>%H<6T zm)QzUpI{?T@)SKtBljc(_@Sxrd#`*M%-1eQre~qmk$bPycHD-XW#b_y`PZYtd4ZkJo`4c9wnTWnga-fTQ%-1@PQuHQNnip-! za(Vc-Yi3(4ZLe8OIk{6R=E#~c6WN{S#c!0swGWqxK1)34u<`YtUv$9d_td`9-sLIM zuCT9sFwZ~(MqO@xBsjq|;%S7no0q{Qrj%_j%Edml=@+GM{NQ@LtcobJG_iubc|CBK zj&%c?MQXwZY;O;{@yBl(6ila&gJEaf-6)-KNic)#P0wiX*Xbu-wR&rJCrc$RB~k28UzlDH@ahHp$w4 z=hDJ+v8iI-wxv#~Nk-F-mzHy0utFTS2Nj<8@vs5x;C%<@#0}ON3SW(kWbUx-3(gty z4u9XuRX;poG*S-+H1Tb=EQ$N0g;I_%75qMPH_6Mj?sNj80Fhu@Z0A9PxV%>c0kvY@}BxzzDEs?tt9N=P7rBQ^tIN z(x&I~cP;>F_k*ubdye7qCgCRGl#sq93R0&vS9u}*?uuFoMp$b7C-!h;t#BTd9c&XK z75BAiD|EUO9Ni|b(DUM*CA5cI)mGV}y~$3zvemdq(}_Iu+WZ+({2)WU&ZAP4J0qs36)rQ`LUlGi>$g7h`TusXza}b=vrOENTM1K4x#pkMD zIUHK#AE|U_z9c(oottq{s8UstPU0n=HWH=8*L23GIFj`BYkim}#jxtO1jQOU65%l*0`OaD^`UJl7P?>R zamUzP{o{_&K>kjsj3(GGZf;s$T!_MGz?Wg0_f~Qb1&PiD|Gv_ctM_?y1rG6dZX{v7 zz>IfQIkgTuNU;yi_`${uZC~r87;RUdRGMBKu`iN(fHN8Z{HT?;O#N7MZA}>=cJ)kR zKR$oq>{^^+dW9g08RYE$rRvM$E+Bxk6knpGs?u3iWaaI1gqSTIJ|=paX-KFPhksA> zP?p1GIV{HtkX1|OeE8)`1*>VdtgfjPbDqm4*P-XzQiZYKdG+r~90;H%YwUCE-Z@!> z+%+f4JfO_kaaiorI9|Jx;g03|#~^f1SLWL0t?OaWeEsAG2QJDJx_`;$7S9U3du3n+ za%g{N-+V55MyNsqiM^8ebF_YZzWUsj-Wy*j&M{kN{|qDn5Cig{r46Hs#igb65c|^~ 
z3b0GV?8yw(_`eMt*atsZKPz|st?9{s8%*lk4$?5vv(jN)13Ij*Vu8V3oR}fK z#+RZ#s4{kE*983uJg*3*$0}Ydrm9(!h%a696pV2`ko|rGhaLDNK$uhQNVFKAUo$cB z$&MiBgSifJ??Bhp1jxOxng9FxP~cF;i2_;0!9q(nmJIju@VxNPSY|xQ7=|wq;^F0c zHCC<{_XIipiN41%4|88?-M7p=XYRO_OO+7$`>$PzLR?F3?Vg7gvI+K%QI{`YtVtxT zu6%@G4eAQm%9mBm4bQhP#*85>yxg_EE0Ql4`uUYN`0M~o3lm6XxOu;lAR`e&fkC5E z3d>-_Fx%_~(@?$N=91YWyeC%m9~Xxfy>3PR&(THak5eh0#F%j*5S*XYP{=lb<8i&j zE}S^%XmB}*Zn!j8ePgI&E)55<(i{^415RH1bJ(#j7OIuT=(tI3g!w0GdG2`b&wKdO z%^+(&@R?ns0#8b~sIF80i-(1PWRMXnQ)x+?Ib3biq0{`v2TXIpia-@Wh|x^qAt*XU zk#FW;v*CUPj)({!@hTjjffsOVR=wUGFU}r7lIR<3i>CH`z0Vwi$4)V5 zjWm9{W2lA7$o3Q-Vt>C9tjXzB$;SQ^)Iay3#qNhoejc^9`<2HP_WO;`Bi>e(&I;{T z=GdB2=1C+l{rf^^^bHTI#q%1m0g<@>47+tZvkweB<{SUPId1BPd)lEjskj(Y$tZHU zK$RIak}oc(i8#gTwNQQ@h~;$$w#S=2@tkI=(ahnSpOuISOSGB^3o%QDY==b~z0O{k zjoxG^o=KYC7-^A7MM{$+vP~j@hju@L-biPM5|i$CH^9~kFtB8LpH3yNTiTYcKHe8s z%ILn#L#tE%{IQ`6C5P;}toLN$+${VpIl>GrPmaGZQyv7ut(YSL|B!JtT<<0$>W{`C z(Z)C9Oeh#r@1ww#t3M~HUcR_sC=eh&Wpjv*&yC|hS-Oy+9yyE6mBx%j$mw@4oFfK~ zuwaKmj@|`9czgFCwLy;S{(bp#GeydpZ|%3_E&9_%MjUvjM?dpD>nfH&`D*FR8OU1u zu*lRUdyTQfz8Am|J3dFp2fsd)zAJ!kfwi3J3NpAm?krX~?8wLSR{BrpjjNDww!z^I z=KWj5xf>+ml-q!!D9(})R%Q&U?K5sZ( z_rr+_^hC8VYE~Wi?sAgyf7f~gO^d!Jv8}-ot%XzkP5=)I(~eB;8{f3!PhE*0WRICB z=huJCL??017FX?rna&^wF<)d7>abHMtPGM@Ynr$Wf4BWck*&86t&it($^Jh+GI|i; zZ3>ps7uy|A5#N4yMC(<~Gtg;`^z%3r4c2tMIt~~pn4O(vI*VJ&Q_s#pQ$u(xjHQ<3 zAFl{GaIs!}?8o2&>1ev)EUoIlmM9Vyry?>vOGeW$)Z|eWX(0Ee8}WVplWzREjZOFCSGTqO!75_evBc#kd_qC1*cRm)5!ewr zN^MEabk5EI4a9=CJH(l}8KN_Ty{RBk4lN)*Sp=C8N{Pg-^sF-n_WZkKmm)R<(Yp5; zwp5P=FQvfJ@GajuJOL^A!DNm3=AUw$;+<9>UM_>HWxeQkS7cH?f~4Le1MV_0Epj;g z(pRajirMdid-xxa^A&SjE8$3^Go?C5vy5O~67dULD=4T}5Jn64{UiP{Jj>KGUIz;f z9R0E>y%HVCwGXRpC7k%?;(wdw8|GFOKGAjifwJwmiD?{8);*%*0 z#9!iSjxXkQX5Q}wiT0-Mu$l#wM{Uq}^LcDzHA57jc*jSg>s?Ie{=rT#h~jsnNa!(+Gm zi*&6eHdi=UbE?9Sxv?>7S^bc-PE=HXz+E1R?j-Wht@{b(1SLQits_fM)jR7*CKU08 zJ^nqbwl66-Zd^j4Y$XC9ZPSqR8mX73)uDPuINuU{`#jiSe1eosicXD;d4H}c+q2rq z`_)^MKC`=KFAVP_eh*YL&r!pn9t&CF4j~03IuLchUBC~PeEot+Z4(>(2(znD|EE9% 
zF8+1doswDoOO;Z=-UnK_sRrcWI9fGr`vZ3$;- z_?qjae_b=A>!k1y=Z@70M6Dwk2%$oC5{Cy=!9;<%xJ@Lx+tCohrO6wlhp%4oHP;IZ zjbBQXSljfY3S?~WrC$v$aAWDz+|CPtH^%*`kz^i%%4lE^rXVo#GRG-Bot>)d# z$KTKHUW8YL(L)Op;|Edd1olt4oysMCbJ`h3Ph>9=@r8!7yWZCOzr8VLe-dX3T zVe=-Zt-q5!b8bj0GU*E%?}Mo-WcQEUzl@jls`JD5v?kg|nV$}7d;EVkqPVoCfBWgT zUzzG(3{W&ZCz_uhWAvIlzPp_=2anS(%Pj9~M2*0_X?LzIHXVsD8vu4CYV3Y7J+3b2 zi?S(R|A_#D6h}hWi+zInGr}T#C382n5}m;PWKAmX;?c6_U2=&w%-tzI#bsXOsP}>S zb1A{ef#^tv5N>f+B(R`eM;Io!6^}9cX#HPyY0+jJVvUWC`0pSUCX*hqomE`rH^47~ ze_rSEB;$;VNWT4g4uVYdN}hz$dagF8a_gJX9Wo` zl)=Y98V?a$l1HRlsm!)D#n-dTVUl`$DPuTMUO>Xi@Ggn&%|0KPacmoA8ZGiezFbEr zQa?VDZEwxYU970qo0cw8@q2>LSC;6axk=4I5|0odoXq`sG}s9oI>Wi=NuOjN%LIp) z|57HT8Ew3ut+iuv9Yc>U)qYkOdmq%Z5=evQiBzO_7SEL zQa|uXCgzMM5^qYsswSL+T5xP$+TD&Vg;=PS=x_88NW_wSBY^(h@%!<_Co#V` z-tvih`N~@gvjP^ga-#(|GRY)pFuPpr?EBTx&;n6-((lFt{O`s?R-K0&tHiySgIV}? z2{i)xMOSwVv$?Eq0I(uL4K_n&)fM=BEG^-6LfrsTa|}uB4Eo1A9H2?I;}2e15(eC@UQEi?-)A5UKJhg@SK7|V6;OqNCuHRCbC6;HeTCu~W9qmq<@00SvFjy~+s z!c}M-{_ynl$HanHH!yf%`de8l!FFV+-Gp*5-=x;jF39_WSn8a1bF3-q%!A>8j7l%N zaX=b*m+x}~gt*tFjY21|G7=sFxghBB^FWTNR_c~*Eyk;1axd5$>@jyLXq%hUF{>~b zk$kz&iLH0G6*S1pH51~oKgKDl4q&RcUP;V7NbZ)XXyBW6K^ukXi~Pwemy4XzIm{(& z`jeRyGz`{@IOU66&2|@>5SPP**D5SI%bmO=Z3E#L-r&gKH;0TM!{SB!4 zUw%*d&pu}bmrK7oPnIZR85InQe!Gk0!*!B|QPCxh<1y$p%ir%u(LQ&^zY@M$Q!?QN zP~Fdefa?By&SDpQV`)g+y(mc~Hj^30azm$DbHy$Py0>1l6Q})pPr9cy%3Ozzf1kG= zW(kpaIL+B#P&O$BQajRzAyDTKt|6$(JEK|iRrVOWtR8x`mw%X7?APXbW<(QH?nju| z0S%%Cli)W)?8wJFCWXU7hf4Q;1n}&--02u|YaGL?AqX$}trrL?<`#ZyZCwkM3HHqJIU#IxY9->%Qpv6x*E+>+Y<+ zH69E^K{HMlp`MXJ6Z-Iy%LAT}Fy(Sdl8XkKU9!v*J!t#?d^28ly^IyJHV@wUR>b%Vj!r!pWk*nWZ24;R@(hsBwt2? 
zN+FqtZ%8vpwvmYp!pqqb%%G6>BdC8SGJ+GZoV3ONzQ3$_DHcb*m@Q{O$1F;#7o*2= zE>yijS4ldl{Aa$aI#ut|9cXw=;uXL91TFCiT02G{1^2;WJe#}Hx;f7Y7B3WJh@g}f z=guUo?B85JdsFLhLjk&$0sR{GuXS(36Peohmeofi#F2*`e!#8{s>xe3IfFc{@|hyK zOnmt2AE5UyC~Gj4lZd@!uJX%wieg++llkzOS%Q!Fyag;}YOm{0;VW^f?P4fkx$yj% zYldFy@@Or|$N_1x%ZE7%Cr0~Y2_wYBNh`sjFkY0AdVC`(ZYvM~#sWz_n27#F7`Il9 zfD8_I$F;9@wT0+G`3m%DT-&VH82ea?)2xL)9~->HK{m!{zdUq!D!r)81B!z~MUH5Y z52L<*iu&3o92~BKU})g%&>dW?6a4tn-rR+Rl`g8wwZAy*mZxU4>j}U=`v1?+=O^V= zP8xuujUlAMR9E55*eh>-)5=9*@Zw24I3>dm`oaop`gupK#Bk{MoJ2GS3Rle~>W#~u znls*Y&Pj}cq@fiEaAUEQ9?ZypUTY+LPsna4Y@k;M^Etft?A8A6V@jtFU?h-ilL2%rYo*NF}jX@NbLDS`c4>x3qGkh;)($mA1l1~GwV*hCUN^s z|KB)lD9P~>@ne9&x8QV%BNFvmYaasUo2d?lLjGTuIbDHwco|h@s+pXFV9fSsXP*1X z%PT$aOrdo6pGvQhX;h0p!L?Hw{M!AJr`T5)ha%np%LXeBdX#>?|JbA4HnTz*2nfhX zf;^@hng7!Z^?zGm;SG!_oNPUHE3~A*is6s|J*99aqXAEJr7X@c6o9r8Jx07=2B1Ts zM4i$E&BMc6cCw1p;ez1JWGgbNLW=*QdfRud<)CYupm4YT#!jHdN4>$44s4E;C zS4r95E6&#{=CUb`o_kl7`C&7@c;-!y!L!~P{Ph7 z%T?3GGN8HrdRj0<#vtKIKzlsJ;8BRspQ^yJiXQ!}vikFS8v(?g@UL?6Z#x{Uj9bdT zT=WqDbLKy_9`MX^LALWMfA<%fN`PYc7+RNb&hRu3s9$gtyNVcpQjV+i8f#^O@F>OmPrU;?VDKM-)rcl7dqPg zEA2r!*C-Zo0-f)^eVriZ*G6cek0%wIs(~5t&q-~atd}1fohC^DDIX`Y!heMpTb0KZ z;u-qitPmp`Y1b1&ey>yPm>_7X?gaXoWBzCTQ>?iXOT(;pgDT>3kICq@N~bnLF+yj2 zVx1wW!e3d?pUW%i30%adLcp`2+Hm`qvRm*GM;mTnJ&By>?YQcsc>1TtDaT=Yf$hmMZFJp3VhF?iN4IX*UaEaAIFupl3<}corjtagTTLMAY(%* zJ%uQ`M}JR={H4&Y%dBI^ae6VdHkhF*S6Ne&4$=v@3j>V_thODRuu=5^X|5d#K}hoc zB-RiAfAT2~N_VEeOpew6<^^ljUrmzPIhs88Zx zIpsTJ=s+r*gH^-?r-Toz9J$VkTV*kjUs7ip6663_e?~^bI7ZM03@piJ+8i6bfd&y* zaIfd={i&k-O#uX7e!#W>;C*Prq6~w7#?3C8#}63?(!@rdv3ov)U)xJ6aeWCOclysfQh&i-nPT!!Kr+%9 z{e%Q$^81`106w+gv^9ic`uB3e7VEd%(t02=NfC2?0N%NDo8Ino?RKYD#}G?VX<0VL z5G$vPpZZ*`hmT-GaEARtUMl0R1J(_RJx;>zJ>-=MBH2fN1Q_sB);E<3aLQ! 
[GIT binary patch data omitted]

diff --git a/annotators/BadlistedWordsDetector_ru/Dockerfile b/annotators/BadlistedWordsDetector_ru/Dockerfile
new file mode 100644
index 0000000000..6e5745c7c5
--- /dev/null
+++ b/annotators/BadlistedWordsDetector_ru/Dockerfile
@@ -0,0 +1,11 @@
+FROM python:3.7.4
+
+RUN mkdir /src
+
+COPY ./requirements.txt /src/requirements.txt
+RUN pip install -r /src/requirements.txt
+
+COPY . /src/
+WORKDIR /src
+
+CMD gunicorn --workers=2 server:app
diff --git a/annotators/BadlistedWordsDetector_ru/badlists/bad_words.txt b/annotators/BadlistedWordsDetector_ru/badlists/bad_words.txt
new file mode 100644
index 0000000000..de2a9fc557
--- /dev/null
+++ b/annotators/BadlistedWordsDetector_ru/badlists/bad_words.txt
@@ -0,0 +1,221 @@
+БЛЯДЬ
+БЛЯТЬ
+ЕБАТЬ
+ПИЗДА
+ХЕР
+ХУЙ
+БЕСПИЗДАЯ
+БЛЯ
+БЛЯДВА
+БЛЯДИАДА
+БЛЯДИНА
+БЛЯДИСТОСТЬ
+БЛЯДКИ
+БЛЯДОВАТЬ
+БЛЯДОГОН
+БЛЯДОСЛОВНИК
+БЛЯДУН
+БЛЯДЬ
+БЛЯХОМУДИЯ
+ВЗБЛЯД
+ВЗЪЕБНУТЬ
+ВЗЪЕБЩИК
+ВПИЗДИТЬ
+ВПИЗДИТЬСЯ
+ВПИЗДРОНИВАТЬ
+ВПИЗДРОНИВАТЬСЯ
+ВПИЗДЮЛИТЬ
+ВПИЗДЯЧИТЬ
+ВПИЗЖИВАТЬ
+ВПИЗЖИВАТЬСЯ
+ВХУЯРИВАНИЕ
+ВЫБЛЯДОК
+ВЫЕБАТЬ
+ВЫЕБОК
+ВЫЕБОН
+ВЫПИЗДЕТЬСЯ
+ВЫПИЗДИТЬ
+ВЪЕБАТЬ
+ГЛУПИЗДИ
+ГРЕБЛЯДЬ
+ДЕРЬМОХЕРОПИЗДОКРАТИЯ
+ДОЕБАТЬСЯ
+ДОПИЗДЕТЬСЯ
+ДОХУЙНУТЬ
+ЕБАЛКА
+ЕБАЛОВО
+ЕБАЛЬНИК
+ЕБАН
+ЕБАНАТИК
+ЕБАНДЕЙ
+ЕБАНУТЬ
+ЕБАНУТЬСЯ
+ЕБАНУТЫЙ
+ЕБАНЬКО
+ЕБАРИШКА
+ЕБАТОРИЙ
+ЕБАТЬ
+ЕБАТЬСЯ
+ЕБАШИТ
+ЕБИСТИКА
+ЕБЛАН
+ЕБЛАНИТЬ
+ЕБЛИВАЯ
+ЕБЛЯ
+ЕБНУТЬ
+ЕБУКЕНТИЙ
+ЗАЕБАТЬ
+ЗАЕБИСЬ
+ЗАЕБАТЬСЯ
+ЗАПИЗДЕНЕВАТЬ
+ЗАПИЗДЕТЬ
+ЗАПИЗДИТЬ
+ЗАПИЗЖИВАТЬСЯ
+ЗАХУЯРИТЬ
+ИСПИЗДИТЬ
+ИСХУЯЧИТЬ
+МНОГОПИЗДНАЯ
+НАБЛЯДОВАЛ
+НАЕБАЛОВО
+НАЕБАТЬ
+НАЕБАТЬСЯ
+НАЕБАШИЛСЯ
+НАЕБЕНИТЬСЯ
+НАЕБНУТЬ
+НАХУЕВЕРТЕТЬ
+НАХУЙ
+НАХУЯ
+НАХЕР
+НАХУЯРИВАТЬ
+НАХУЯРИТЬСЯ
+НАПИЗДЕТЬ
+НАПИЗДИТЬ
+НАСТОЕБАТЬ
+НЕВЪЕБЕННЫЙ
+НЕХУЙ
+ОБЕРБЛЯДЬ
+ОБЪЕБАЛОВО
+ОБЪЕБАТЕЛЬСТВО
+ОБЪЕБАТЬ
+ОБЪЕБАТЬСЯ
+ОБЪЕБОС
+ОПИЗДЕНЕВАТЬ
+ОПИЗДИХУИТЕЛЬНЫЙ
+ОПИЗДОУМЕЛ
+ОСТОПИЗДЕЛО
+ОСТОПИЗДЕТЬ
+ОСТОХУЕТЬ
+ОТПИЗДИТЬ
+ОТХУЯРИВАТЬ
+ОТЪЕБАТЬСЯ
+ОХУЕННЫЙ
+ОХУИТЕЛЬНЫЙ
+ОХУЯЧИВАТЬ
+ОХУЯЧИТЬ
+ПЕРЕЕБАТЬ
+ПЕРЕХУЯРИВАТЬ
+ПЕРЕХУЯРИТЬ
+ПИЗДАБОЛ
+ПИЗДАКРЫЛ
+ПИЗДАНУТЬ
+ПИЗДАНУТЬСЯ
+ПИЗДЕЛИТЬСЯ
+ПИЗДЕТЬ
+ПИЗДЕЦ
+ПИЗДИТЬ
+ПИЗДОБЛОШКА
+ПИЗДОБРАТ
+ПИЗДОБРАТИЯ
+ПИЗДОВЛАДЕЛЕЦ
+ПИЗДОДУШИЕ
+ПИЗДОЛЕТ
+ПИЗДОЛИЗ
+ПИЗДОМАНИЯ
+ПИЗДОПЛЯСКА
+ПИЗДОСТРАДАЛЕЦ
+ПИЗДОСТРАДАНИЯ
+ПИЗДОХУЙ
+ПИЗДОШИТЬ
+ПИЗДРИК
+ПИЗДУЙ
+ПИЗДУН
+ПИЗДЮК
+ПИЗДЮЛИ
+ПИЗДЮЛИНА
+ПИЗДЮЛЬКА
+ПИЗДЮЛЯ
+ПИЗДЮРИТЬ
+ПИЗДЮХАТЬ
+ПИЗДЮШНИК
+ПОДЗАЕБАТЬ
+ПОДЗАЕБЕНИТЬ
+ПОДНАЕБНУТЬ
+ПОДНАЕБНУТЬСЯ
+ПОДЪЕБНУТЬ
+ПОЕБАТЬ
+ПОЕБЕНЬ
+ПОПИЗДЕТЬ
+ПОПИЗДИЛИ
+ПОХЕР
+ПОХУЙ
+ПОХУЯРИЛИ
+ПРИЕБАТЬСЯ
+ПРИПИЗДЕТЬ
+ПРИПИЗДИТЬ
+ПРИХУЯРИТЬ
+ПРОБЛЯДЬ
+ПРОЕБАТЬ
+ПРОЕБАТЬСЯ
+ПРОПИЗДИТЬ
+РАЗЪЕБАЙ
+РАЗЪЕБАТЬСЯ
+РАСПИЗДОН
+РАСПИЗДЯЙСТВО
+РАСХУЮЖИТЬ
+СУХОПИЗДАЯ
+СХУЯРИТЬ
+СЪЕБАТЬСЯ
+ТРЕПЕЗДОН
+ТРЕПЕЗДОНИТ
+ТУЕБЕНЬ
+ТУПИЗДЕНЬ
+УЕБАН
+УЕБАТЬ
+УПИЗДИТЬ
+ХЕР
+ХЕРАКС
+ХЕРАСЕ
+ХЕРАСИ
+ХЕРАНУТЬ
+ХЕРИТЬ
+ХЕРНЯ
+ХЕРОВИНА
+ХЙ
+ХУЕВ
+ХУЕВАТЕНЬКИЙ
+ХУЕВАТО
+ХУЕВИНА
+ХУЁВИНА
+ХУЕБРАТИЯ
+ХУЕГЛОТ
+ХУЕГРЫЗ
+ХУЕДИН
+ХУЕЛЕС
+ХУЕМАН
+ХУЕМЫРЛО
+ХУЕПУТАЛО
+ХУЕСОС
+ХУЕТА
+ХУЕТЕНЬ
+ХУЙЛО
+ХУЙНЯ
+ХУЙНУТЬ
+ХУЯЦИЯ
+ХУЯСЕ
+ХУЯСИ
+ХУЛИ
+ХУЯ
+ХУЯК
+ХУЯКС
+ХУЯЧИТЬ
+ШИРОКОПИЗДАЯ
diff --git a/annotators/BadlistedWordsDetector_ru/requirements.txt b/annotators/BadlistedWordsDetector_ru/requirements.txt
new file mode 100644
index 0000000000..5311bf5bc4
--- /dev/null
+++ b/annotators/BadlistedWordsDetector_ru/requirements.txt
@@ -0,0 +1,10 @@
+flask==1.1.1
+itsdangerous==2.0.1
+gunicorn==19.9.0
+requests==2.22.0
+sentry-sdk==0.12.3
+spacy==3.0.5
+click==7.1.2
+pymorphy2==0.9.1
+jinja2<=3.0.3
+Werkzeug<=2.0.3
\ No newline at end of file
diff --git a/annotators/BadlistedWordsDetector_ru/server.py b/annotators/BadlistedWordsDetector_ru/server.py
new file mode 100644
index 0000000000..8400d50d11
--- /dev/null
+++ b/annotators/BadlistedWordsDetector_ru/server.py
@@ -0,0 +1,133 @@
+#!/usr/bin/env python
+
+import logging
+import re
+import time
+from os import getenv
+from pathlib import Path
+from typing import Set
+
+import pymorphy2
+import sentry_sdk
+from flask import Flask, request, jsonify
+
+
+sentry_sdk.init(getenv("SENTRY_DSN"))
+
+logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO)
+logger = logging.getLogger(__name__)
+
+app = Flask(__name__)
+
+
+lemmatizer = pymorphy2.MorphAnalyzer()
+ANYTHING_EXCEPT_OF_LETTERS_RUSSIAN = re.compile(r"[^а-яА-Я йЙёЁ\-]+")
+SPACES = re.compile(r"\s+")
+
+
+def tokenize_sentence(sentence):
+    if isinstance(sentence, list):
+        # already tokenized
+        return sentence
+    else:
+        sentence = ANYTHING_EXCEPT_OF_LETTERS_RUSSIAN.sub(" ", sentence)
+        sentence = SPACES.sub(" ", sentence)
+        return sentence.lower().split()
+
+
+def lemmatize_token(token):
+    parsed = lemmatizer.parse(token)
+    if parsed and len(parsed):
+        return parsed[0].normal_form
+    else:
+        return ""
+
+
+class Badlist:
+    def __init__(self, path):
+        """
+        Badlist object loads a badlist from file.
+
+        Args:
+            path: Path object to badlist file, one badlisted phrase per line
+        """
+        self.name = path.stem
+        self.badlist = set()
+        with path.open() as f:
+            for line in f:
+                token = line.strip().lower()
+                self.badlist.add(token)
+                self.badlist.add(lemmatize_token(token))
+
+        self.max_ngram = max([len(x) for x in self.badlist])
+
+    def check_set_of_strings(self, ngrams: Set[str]):
+        """
+        Checks whether any badlisted phrase is in a set of strings.
+
+        Args:
+            ngrams: set of str
+
+        Returns:
+            True if at least one badlisted phrase is in the set of str
+        """
+        badlists = ngrams & self.badlist
+        if badlists:
+            logger.info(f"badLIST {self.name}: {badlists}")
+        return len(badlists) > 0
+
+    def __repr__(self):
+        return self.name
+
+    def __str__(self):
+        return self.name
+
+
+badlists_dir = Path("./badlists")
+badlists_files = [f for f in badlists_dir.iterdir() if f.is_file()]
+
+badlists = [Badlist(file) for file in badlists_files]
+logger.info(f"badlisted_words initialized with following badlists: {badlists}")
+
+
+def check_for_badlisted_phrases(sentences):
+    result = []
+    tokenized_sents = [tokenize_sentence(s) for s in sentences]
+    tokenized_lemmatized_sents = [[lemmatize_token(token) for token in sent] for sent in tokenized_sents]
+    unigrams = [set(tokens + lemmas) for tokens, lemmas in zip(tokenized_sents, tokenized_lemmatized_sents)]
+    for sent_unigrams in unigrams:
+        result += [{blist.name: blist.check_set_of_strings(sent_unigrams) for blist in badlists}]
+    return result
+
+
+def get_result(request):
+    st_time = time.time()
+    sentences = request.json.get("tokenized_sentences", [])
+
+    if len(sentences) == 0:
+        sentences = request.json["sentences"]
+    result = check_for_badlisted_phrases(sentences)
+    total_time = time.time() - st_time
+    logger.info(f"badlisted_words exec time: {total_time:.3f}s")
+    return result
+
+
+@app.route("/badlisted_words", methods=["POST"])
+def respond():
+    """
+    Responds with [{badlist_1_name: true}, ] if at least one badlisted phrase is in the utterance.
+    """
+    result = get_result(request)
+    return jsonify(result)
+
+
+@app.route("/badlisted_words_batch", methods=["POST"])
+def respond_batch():
+    """
+    Responds with [{"batch": [{badlist_1_name: true}, ]}].
+    """
+    result = get_result(request)
+    return jsonify([{"batch": result}])
+
+
+if __name__ == "__main__":
+    app.run(debug=False, host="0.0.0.0", port=3000)
diff --git a/annotators/BadlistedWordsDetector_ru/test.py b/annotators/BadlistedWordsDetector_ru/test.py
new file mode 100644
index 0000000000..27a11e344f
--- /dev/null
+++ b/annotators/BadlistedWordsDetector_ru/test.py
@@ -0,0 +1,19 @@
+import requests
+
+
+def main():
+    url = "http://0.0.0.0:8018/badlisted_words"
+
+    request_data = {
+        "sentences": ["не пизди.", "застрахуйте уже его", "пошел нахер!"],
+    }
+
+    result = requests.post(url, json=request_data).json()
+    gold_result = [{"bad_words": True}, {"bad_words": False}, {"bad_words": True}]
+
+    assert result == gold_result, f"Got\n{result}\n, but expected:\n{gold_result}"
+    print("Success")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/annotators/BadlistedWordsDetector_ru/test.sh b/annotators/BadlistedWordsDetector_ru/test.sh
new file mode 100755
index 0000000000..61672db785
--- /dev/null
+++ b/annotators/BadlistedWordsDetector_ru/test.sh
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+python test.py
diff --git a/annotators/IntentCatcherTransformers/Dockerfile b/annotators/IntentCatcherTransformers/Dockerfile
index 2f060d804b..c3da5f939e 100644
--- a/annotators/IntentCatcherTransformers/Dockerfile
+++ b/annotators/IntentCatcherTransformers/Dockerfile
@@ -13,6 +13,8 @@ ARG CONFIG_NAME
 ENV CONFIG_NAME ${CONFIG_NAME}
 ARG SERVICE_PORT
 ENV SERVICE_PORT ${SERVICE_PORT}
+ARG INTENT_PHRASES_PATH
+ENV INTENT_PHRASES_PATH ${INTENT_PHRASES_PATH}
 
 COPY annotators/IntentCatcherTransformers/requirements.txt /src/requirements.txt
 RUN pip install -r /src/requirements.txt
diff --git a/annotators/IntentCatcherTransformers/README.md b/annotators/IntentCatcherTransformers/README.md
index b9ad091162..543948977c 100644
--- a/annotators/IntentCatcherTransformers/README.md
+++ b/annotators/IntentCatcherTransformers/README.md
@@ -11,3 +11,9 @@ It consumes 3.5Gb GPU RAM during fine-tuning. Classification results after 5 epochs:
 {"train": {"eval_examples_count": 209297, "metrics": {"accuracy": 0.9997, "f1_weighted": 1.0, "f1_macro": 0.9999, "roc_auc": 1.0}, "time_spent": "0:03:46"}}
 {"valid": {"eval_examples_count": 52325, "metrics": {"accuracy": 0.9995, "f1_weighted": 0.9999, "f1_macro": 0.9999, "roc_auc": 1.0}, "time_spent": "0:00:57"}}
 ```
+
+A Russian Intent Catcher is also available. The Conversational Russian BERT-base version achieves the following results after 5 epochs:
+```json
+{"train": {"eval_examples_count": 16315, "metrics": {"accuracy": 1.0, "f1_weighted": 1.0, "f1_macro": 1.0, "roc_auc": 1.0}, "time_spent": "0:00:30"}}
+{"valid": {"eval_examples_count": 4079, "metrics": {"accuracy": 0.9998, "f1_weighted": 0.9998, "f1_macro": 0.989, "roc_auc": 1.0}, "time_spent": "0:00:08"}}
+```
\ No newline at end of file
diff --git a/annotators/IntentCatcherTransformers/intent_phrases_RU.json b/annotators/IntentCatcherTransformers/intent_phrases_RU.json
new file mode 100644
index 0000000000..4dd0d1b90b
--- /dev/null
+++ b/annotators/IntentCatcherTransformers/intent_phrases_RU.json
@@ -0,0 +1,1389 @@
+{
+    "intent_phrases": {
+        "what_are_you_talking_about": {
+            "phrases": [
+                "о ((чем)|(чём)) ты( говоришь){0,1}( вообще){0,1}",
+                "что это значит",
+                "я( ничего){0,1} не ((понимаю|поняла|понял))"
+            ],
+            "reg_phrases": [
+                "о ((чем)|(чём)) ты( говоришь){0,1}( вообще){0,1}",
+                "что это значит",
+                "я( ничего){0,1} не ((понимаю|поняла|понял))"
+            ],
+            "min_precision": 0.0,
+            "punctuation": [
+                ".",
+                "?"
+ ] + }, + "topic_switching": { + "phrases": [ + "((хватит)|(прекрати)|(не хочу)|(не хочу больше))(( говорить)|( болтать)|( рассказывать)){0,1} ((о)|(об)|(обо)|(про)) ((этом)|(этой теме)|(фильмах)|(музыке)|(политике)|(погоде)|(себе)|(мне)|(этих вещах))", + "((можем)|(можешь)|(давай)) ((мы)|(ты)|(я)) ((прекратить)|(перестать)|(остановиться)) ((говорить)|(болтать)|(рассказывать)) об ((этом)|(этой теме)|(фильмах)|(музыке)|(политике)|(погоде)|(себе)|(мне)|(этих вещах))", + "((можем)|(можешь)|(давай))(( мы)|( ты)|( я)){0,1} ((поговорить)|(поболтать)|(порассказывать)|(поговорим)|(поболтаем)) о(( чем-то)|( чем-нибудь)|( чем-либо)){0,1} ((другом)|(еще))", + "(я ){0,1}((не знаю)|(понятия не имею)|(без понятия)) о чем( еще){0,1}( можно){0,1} ((поговорить)|(поболтать)|(порассказывать)|(поговорим)|(поболтаем))", + "((скучно)|(надоело)|(бесит)|(хватит)) ((о)|(об)|(обо)|(про)) этом" + ], + "reg_phrases": [ + "((хватит)|(прекрати)|(не хочу)|(не хочу больше))(( говорить)|( болтать)|( рассказывать)){0,1} ((о)|(об)|(обо)|(про)) .*", + "((можем)|(можешь)|(давай)) ((мы)|(ты)|(я)) ((прекратить)|(перестать)|(остановиться)) ((говорить)|(болтать)|(рассказывать)) об .*", + "((можем)|(можешь)|(давай))(( мы)|( ты)|( я)){0,1} ((поговорить)|(поболтать)|(порассказывать)|(поговорим)|(поболтаем)) о(( чем-то)|( чем-нибудь)|( чем-либо)){0,1} ((другом)|(еще))", + "(я ){0,1}((не знаю)|(понятия не имею)|(без понятия)) о чем( еще){0,1}( можно){0,1} ((поговорить)|(поболтать)|(порассказывать)|(поговорим)|(поболтаем))", + "((скучно)|(надоело)|(бесит)|(хватит)) ((о)|(об)|(обо)|(про)) .*" + ], + "min_precision": 0.8, + "punctuation": [ + ".", + "!", + "?" 
+ ] + }, + "lets_chat_about": { + "phrases": [ + "((можем)|(можешь)|(давай))(( мы)|( ты)|( я)){0,1} ((поговорить)|(поболтать)|(порассказывать)|(поговорим)|(поболтаем)) ((о)|(об)|(обо)|(про)) ((фильмах)|(музыке)|(политике)|(погоде)|(себе)|(мне)|(искусстве)|(книгах)|(чтении)|(супергероях)|(погоде)|(нас)|(отношениях)|(семье))" + ], + "reg_phrases": [ + "((можем)|(можешь)|(давай))(( мы)|( ты)|( я)){0,1} ((поговорить)|(поболтать)|(порассказывать)|(поговорим)|(поболтаем)) ((о)|(об)|(обо)|(про))(?!(( чем-то)|( чем-нибудь)|( чем-либо)){0,1} ((другом)|(еще))) .*" + ], + "min_precision": 0.0, + "punctuation": [ + ".", + "!", + "?" + ] + }, + "exit": { + "phrases": [ + "пока", + "хватит", + "закончим разговор", + "мне пора" + ], + "reg_phrases": [ + "пока(((-)|( ))пока){0,1}", + "хватит ((болтать)|(говорить)|(разговоров))", + "закончим(( разговор)|( беседу)|( диалог))", + "мне пора(( идти)|( бежать)){0,1}" + ], + "min_precision": 0.0, + "punctuation": [ + ".", + "!" + ] + }, + "repeat": { + "phrases": [ + "повтори( еще раз){0,1}( пожалуйста){0,1}", + "((можно)|(можешь)){0,1} еще раз( пожалуйста){0,1}" + ], + "reg_phrases": [ + "повтори( еще раз){0,1}( пожалуйста){0,1}", + "((можно)|(можешь)){0,1} еще раз( пожалуйста){0,1}" + ], + "min_precision": 0.0, + "punctuation": [ + "?", + ".", + "!" + ] + }, + "yes": { + "phrases": [ + "((да)|(конечно)|(разумеется)|(точно)|(согласен)|(согласна)|(все верно)|(верно)|(ага)|(окей)|(ок))" + ], + "reg_phrases": [ + "((да)|(конечно)|(разумеется)|(точно)|(согласен)|(согласна)|(ладно)|(давай)|(все верно)|(верно)|(ага)|(окей)|(ок))( ){0,1}((да)|(конечно)|(разумеется)|(точно)|(согласен)|(согласна)|(ладно)|(давай)|(все верно)|(верно)|(ага)|(окей)|(ок)){0,1}" + ], + "min_precision": 0.0, + "punctuation": [ + ".", + "!" 
+ ] + }, + "no": { + "phrases": [ + "((нет)|(нее)|(неа)|(ни за что)|(ни в коем случае)|(не за чем)|(да ну)|(да нет)|(да нет наверное))" + ], + "reg_phrases": [ + "((нет)|(нее)|(неа)|(ни за что)|(ни в коем случае)|(не за чем)|(да ну)|(да нет)|(да нет наверное))" + ], + "min_precision": 0.0, + "punctuation": [ + ".", + "!" + ] + }, + "what_is_your_name": { + "phrases": [ + "((представься)|(представь себя))", + "у тебя есть ((имя)|(фамилия)|(прозвище)|(никнейм)|(ник))", + "как( я){0,1}( могу){0,1} ((тебя)|(вас)) ((называть)|(звать))", + "как( я){0,1}( могу){0,1} к ((тебе)|(вам)) обращаться", + "как тебя ((зовут)|(называют))", + "как ((тебя)|(вас)) ((зовут)|(называют))" + ], + "reg_phrases": [ + "((представься)|(представь себя))", + "у тебя есть ((имя)|(фамилия)|(прозвище)|(никнейм)|(ник))", + "как( я){0,1}( могу){0,1} ((тебя)|(вас)) ((называть)|(звать))", + "как( я){0,1}( могу){0,1} к ((тебе)|(вам)) обращаться", + "как ((тебя)|(вас)) ((зовут)|(называют))" + ], + "min_precision": 0.0, + "punctuation": [ + ".", + "?" + ] + }, + "where_are_you_from": { + "phrases": [ + "откуда ((ты)|(вы))( родом){0,1}", + "((какая)|(какой)|(какое)) ((у тебя)|(у вас)|(твоя)|(твой)|(твое)) ((страна)|(город)|(место)|(институт)|(родина))(( происхождения)|( рождения)|( создания))", + "где ((ты)|(вы)) ((живешь)|(существуешь)|(обитаешь)|(находишься)|(сейчас)|(рожден)|(рождена)|(создана)|(создан))", + "какой ((у тебя)|(у вас)|(твоя)|(твой)|(твое)) адрес" + ], + "reg_phrases": [ + "откуда ((ты)|(вы))( родом){0,1}", + "((какая)|(какой)|(какое)) ((у тебя)|(у вас)|(твоя)|(твой)|(твое)) ((страна)|(город)|(место)|(институт)|(родина))(( происхождения)|( рождения)|( создания))", + "где ((ты)|(вы)) ((живешь)|(существуешь)|(обитаешь)|(находишься)|(сейчас)|(рожден)|(рождена)|(создана)|(создан))", + "какой ((у тебя)|(у вас)|(твоя)|(твой)|(твое)) адрес" + ], + "min_precision": 0.0, + "punctuation": [ + "?" 
+ ] + }, + "what_can_you_do": { + "phrases": [ + "что ты ((умеешь)|(можешь)|(способна)|(способен))" + ], + "reg_phrases": [ + ".*что ты ((умеешь)|(можешь)|(способна)|(способен))" + ], + "min_precision": 0.0, + "punctuation": [ + "?" + ] + }, + "choose_topic": { + "phrases": [ + "о чем(( ты)|( мы)){0,1} ((можем)|(можешь)|(хочешь)) ((поговорить)|(поболтать)|(порассказывать)|(поговорим)|(поболтаем))" + ], + "reg_phrases": [ + "о чем(( ты)|( мы)){0,1} ((можем)|(можешь)|(хочешь)) ((поговорить)|(поболтать)|(порассказывать)|(поговорим)|(поболтаем)).*" + ], + "min_precision": 0.0, + "punctuation": [ + "?" + ] + }, + "who_made_you": { + "phrases": [ + "кто(( тебя)|( тобой)){0,1} ((сделал)|(создал)|(построил)|(разработал)|(спроектировал)|(произвел)|(запрограммировал)|(владеет)|(контролирует))(( тебя)|( тобой)){0,1}", + "кто(( твой)){0,1} ((ботмастер)|(создатель)|(разработчик)|(владелец)|(хост))", + "((какой)|(какая)) ((команда)|(университет)|(институт))(( тебя)|( тобой)){0,1} ((сделал)|(создал)|(построил)|(разработал)|(спроектировал)|(произвел)|(запрограммировал)|(владеет)|(контролирует))(( тебя)|( тобой)){0,1}" + ], + "reg_phrases": [ + "кто(( тебя)|( тобой)){0,1} ((сделал)|(создал)|(построил)|(разработал)|(спроектировал)|(произвел)|(запрограммировал)|(владеет)|(контролирует))(( тебя)|( тобой)){0,1}", + "кто(( твой)){0,1} ((ботмастер)|(создатель)|(разработчик)|(владелец)|(хост))", + "((какой)|(какая)) ((команда)|(университет)|(институт))(( тебя)|( тобой)){0,1} ((сделал)|(создал)|(построил)|(разработал)|(спроектировал)|(произвел)|(запрограммировал)|(владеет)|(контролирует))(( тебя)|( тобой)){0,1}" + ], + "min_precision": 0.0, + "punctuation": [ + "?" 
+ ] + }, + "what_is_your_job": { + "phrases": [ + "((какой)|(какая)) ((у тебя)|(твоя)) ((профессия)|(работа)|(специальность))", + "кем ((ты)|(вы)) ((работаешь)|(трудишься))", + "((у тебя)|(у вас)) есть ((профессия)|(работа)|(специальность))" + ], + "reg_phrases": [ + "((какой)|(какая)) ((у тебя)|(твоя)) ((профессия)|(работа)|(специальность))", + "кем ((ты)|(вы)) ((работаешь)|(трудишься))", + "((у тебя)|(у вас)) есть ((профессия)|(работа)|(специальность))" + ], + "min_precision": 0.0, + "punctuation": [ + "?" + ] + } + }, + "random_phrases": { + "phrases": [ + "Круто вышло В целом примерно ясно, что к чему, но как вы радугу вытащили", + "Здесь три слоя кривых Curves с разными настройками Я psd удалил уже", + "C ХДР на траве особенно перемудрил А на исходнике экспозицию поменьше сделать, насыщенность побольше короче лр за 2 минуты", + "Одного меня беспокоит показ псевдореальности, из за которой повседневность становится ещё более невыносимой", + "плюсы господам JohnSpartan и SuperTOLSTYAK выше по ветке обработка автора годится на постер какой но как фото режет глаз неестественностью", + "вы правы, второе фото больше на коллаж похоже Долго сравнивал, чтобы убедиться в обратном", + "кстати, ливень действительно начался потом и действительно было ветрено и не жарко и радуга тоже была это Алтай, детка", + "Одного меня беспокоит показ псевдореальности, из за которой повседневность бла бла бла", + "Предлагаете втыкать на серую хмарь небес за окном, и восторгаться промозглым дождиком и силуэтами промышленных заводов Я пас", + "До лучше, после почти что вырвиглазно, как и на всех остальных фото с обработкой Интересно, почему ОНИ все думают, что так лучше", + "Эффектно, но на фотостоках, к примеру, такие отклоняют из за слишком задранного контраста", + "Видимо, 3 года назад ему небо не понравилось, а исправить не мог", + "И в чем тут дело в обработке Поднимаем контрасность и всё Очень круто", + "Было лучше же xD Сделал картинку просто грязной", + "ну поднять контраст большого 
ума не надо, минус однозначно", + "убрать дымку чёткость", + "вот чего ты до меня доебался то", + "могу ещё про HDR поговорить XD", + "давай поговорим про хдр я из 4х кадров делаю например", + "а я из одного и вообще люди часто путают HDR и LDR с повышенным локальным контрастом", + "На первой, кстати, качество получше", + "Прямо наглядный материал получился для демонстрации последовательности распознавания образов нейронными сетями", + "Вот они наглядные преимущества фотографирования в RAW Сохраню себе и буду показывать тем кто говорит что это все баловство и лишние", + "Если честно, вспоминая эту фотку, я тоже всегда переключаюсь на рав даже, если флешка почти забита", + "зато у жепега другие преимущества, правда с основном связанные со скоростью", + "до больше нравится Настоящая", + "Да никто не говорит о реализме, но тут после выглядит крайне наркомански", + "зависит от гаммы монитора", + "Вот поэтому я и не уважаю тех кто называет себя фотографами сегодня, они умеют только обрабатывать фото а не фотографировать", + "Спорно! 
Фотография как и любое искусство передает эмоции А как уже этого добился автор не имеет значения", + "Фотография как и любое искусство передает эмоции и/или вызывает эмоции", + "А можно было ХДР сделать", + "Я думал, что свитшот, это что то типа камшот, только сладкое", + "От слова sweat, а не sweet", + "Стопарь сладковатого спиртного коктейля", + "Свитер англ sweater от to sweat — «потеть» А я то думал", + "Я думал от sweet милый", + "не толстовка, а свитшот не блокнот, а молескин не зефир, а маршмэллоу не мокасины, а топсайдеры не пидор, а микроблогер", + "Можно на тебя подписаться", + "Блять, забавный пикабу Человека, который пошутил заминусили, а того, кто шутку объяснил заплюсовали;/", + "Не микроблогер , а твиторас", + "В моем мире есть кофта и свитер, всё", + "Глянул я тут в вики, что такое на самом деле толстовка Неожиданно А та одежда, которую я называл толстовкой, на самом деле называется худи", + "Худи идет с высоким воротником А то, что просто с капюшоном это всё таки толстовка", + "Можно, блин, фотки А то все равно ниче непонятно", + "@Sviter , это правда, что ты вязанная хуйня", + "Мода, остановись, что ты делаешь Я ещё на букве п, разучиваю чем отличается парка от плойки, а мне уже предлагают перескочить на букву с!", + "я думал это называется кофта", + "Слово заимствованное 50 лет назад становится родным", + "скрафтила пообтесалось О_о", + "часто вы употребляете\\слышите слово скрафтить", + "Пиздец чувствую себя лесником Ху зэ факин скрафтить", + "Не насосала а скрафтила", + "Блеать я сейчас аж проблевался", + "Толстовка без капюшона называется кофта", + "Да идут нахуй эти свитшоты и тп, кофта, универсальное слово, свитер вязаная кофта", + "Кофта, вязаная кофта, кофта с капюшоном Остальное всё хуйня", + "чем лонгслив от худи отличается может объяснить", + "Лонгслив по покрою то же самое что майка, только с длинными рукавами Худи толстовка с капюшоном", + "омг, чел! 
тебе уже можно студию по лофт дизайну барбершопов открывать!", + "Он еще коворкинг не собрал", + "А разве у майки могут быть рукава А после отрастания рукавов она не превращается в футболку", + "это 3 стадии развития рукавов майка чуть больше футболка еще чуть больше лонгслив long sleeve длинный рукав но это не точно", + "у водолазки ворот другой", + "а с рукавом 3/4 куда", + "не толстовка, а свитшот", + "Худи более тонкая ткань, чем у толстовки", + "Sweater и sweatshirt Но все на русский лад произносят и представляют себе неправильный перевод, ругать я их не буду", + "Свитшот Sweet Shot выстрел сердечками", + "sweat shot выстрел потом", + "Я думал о5 её холодцом кормить будут Спасибо за видео, довольно честно и открыто", + "судя по первым видосам, её кормят не только холодцом, но и пельмешками и чебуреками тоже D", + "Интересно, вы бы ещё тексты писали под видосами", + "А что за случай в такси", + "Да так везде, не только в Японии или России Такова природа людей", + "Она беременна и эта девочка", + "они ещё никого не заебали", + "Не понимаю подобных реплик теги есть, автор виден, ну видишь ты что контент тебе не нравится так не смотри", + "Кокаинума осенна не хватает, насялника", + "решил тряхнуть стариной", + "Галкин, ложась вечером в супружеское ложе", + "Хорош спорить Sekas это последствия А гугл латышского не знает https//wwwmultitranru/c/mexel1=27&l2=2&s=sekas мультитран лучше", + "а как Централ Партнершит перевел название", + "У мужика горе вся семья погибла, а киношные компании бабло поднимают как то всё неправильно", + "Речь о тебе, об Арни или о том мужике, на чьей жизни фильм основан", + "фильм на реальных событиях снят, реальные люди погибли и история которую играет Арни насоящая", + "Он ответил на комментарии zz00, при этом никак не высказав своего мнения Вы же начали его упрекать за точку зрения ARCHIMED", + "> От фильма он остался совершенно не в восторге А до этого вообще обещал не смотреть Видимо, понравилось хайп ловить, хз", + "В каком контексте 
ты это сказал Ну, то есть из за чего посчитал его упоротым", + "в контексте поста Шварцнеггер > умер > бабло кинокомпаний", + "секаса очинь не хватаит", + "У нас в прокате Последствия А Sekas по латышски эффект Но все равно смешно!", + "Sekas по латышски последствия, эффект это effekts Ваш зануда эксперт латышского из Даугавпилса", + "Не совсем из Д пилса недалеко живу в 140 км вниз по течению", + "И латышский, и русский у меня родные языки", + "А мне гугл так перевел", + "гугл знает латышский ну ооочень плохо", + "Блин, постоянно когда хожу в кино и вижу заголовок Sekas улыбает Правда переводится это вроде как Эффект", + "sekas переводится последствия", + "Типичный русский за 25 лет так и не выучил латышского", + "Типичный приебалт лишь бы русского обосрать", + "Типичный русский, лишь бы всех остальных обосрать", + "нахуй нужен латышский язык", + "Этим Марксом был альберт Эйнштейн", + "Только вот это не интернет, а улица, притом в России", + "в какой интересно России Если это Кишинев", + "Ну все правильно Секас=эффект>", + "Sekas дорогой мой, это последствия", + "А внутри чемодана ещё один агент ФБР _", + "Не маячок в заднице, а анальный зонд", + "Сотрудник ФБР в плавках требование угонщика принес 1 млн долларов в чемодане в качестве выкупа за 86 заложников, Майами, июль 1972 года", + "Если он в плавках и с деньгами Да, хоть артист Большого театра!", + "В плавках, с деньгами в Майами Как звучит то круто, если вырвать из контекста", + "1000000/86 = $11600 за душу", + "бакс был дороде на порядок, это около 23000 советских рублей насколько я помню или курс 2 к 1 стал ближе к 90 м", + "а КУПить можно было А то мама в Украине их брала или в Польше там граница почти", + "А как же позиция США никаких переговоров с террористами", + "это что за требование такое странноев плавках хммм", + "Чтобы не взял с собой оружие, или маячок какой нибудь не подбросил", + "Прям как в крепком орешке", + "Это не тот чувак, что после спрыгнул с самолета в парашюте и потом не нашли его 
вообще", + "https//ruwikipediaorg/wiki/%D0%9A%D1%83%D0%BF%D0%B5%D1%80 вот эта история, на год раньше произошла", + "Может быть, чтобы из чемодана ничего не спиздил и по карманам не распихал", + "Ага, а в чемодан маячок не засовывается", + "Старикам здесь не место нормалтный там такой маячок был", + "В плавках можно принести немалую пушку", + "А я 49500 хотя этот ноль и не меняет ситуацию", + "Меня очень смущает нолик на конце", + "А меня точка, которая в России чаще используется как разделитель разрядов, а не для отделения целой части от дробной", + "Какой стресс! Какой стресс!", + "Главное чтоб быку не пришлось показывать свой значок", + "Я дизайнер Почему у тебя облако с сообщением прозрачное Не видно же букв на фоне самолёта!", + "Настолько тонко, насколько прозрачен фон облака с сообщением", + "Он принтскрин фотошопа в пеинте замутил", + "О, такое было в Городке", + "А в симпсонах и не такое", + "В симпсонах было ВСЕ! Понимаешь Все! Не удивлюсь если наш первый контакт с внеземной цивилизаций будет с такими же пришельцами", + "Угонщика самолета в поезд Чёрт, они все продумали там их искать никто не будет", + "Месяц 8 точно есть", + "Такое ощущение, что он продал дом, машину, все свои вещи, остался в одних трусах, и принёс все имеющиеся деньги Респект таким парням", + "я так последний вклад по ипотеке вносил", + "Это пистолет у тебя в трусах, или ты просто рад меня видеть", + "И в трусах пистолет у агента фбр", + "В этот момент чувак сам подумывал свалить с такой работы на этом самолете!", + "Ну и работа… это как про пожарников работа просто шикарная, в целом всё устраивает Но как пожар, так хоть увольняйся!", + "Пожарник это жук Пожарный это профессия", + "Ну допустим, это был охуенный жук", + "Как влитой! 
Именной щит", + "этим значком его и придавили", + "дадада, машину не заводят, а запускают, не дырка, а отверстие, плавает говно, а корабль ходит блаблабла", + "Светит фара Горит фонарь", + "Горит пердак, а фонарь освещает", + "ах, сколько экспрессии", + "Браво, два коммента в лучшее этому господину", + "горит костер, фонарь светит", + "горят дрова, а пламя излучает свет =", + "А вот лампа не горит А календари врут", + "Не последний, а крайний", + "не крайний а замыкающий", + "Пожарниками раньше называли мародеров, которые поджигали дома и пиздили всё, что успевали, когда хозяева выбегали", + "Правильно,заводят хуй за щеку", + "У нас в палате все капитаны у каждого своё судно", + "корабль военный, судно гражданское", + "Я знаю Но до этого тоже доёбываются Ты, например", + "Счас к тебе корабль приплывет", + "А у нас, а может только у нас так поджигателей называют Мне друган говорил, земля ему пухом", + "Пожарник это тот у кого случился пожар", + "Тогда у меня пара вопросов #comment_85913316", + "А что непонятного Отлично горел, можно сказать с огоньком", + "мне за этого кота насовали минусов полную карму вчера", + "Еще пожарниками называют людей, которые ищут ценности на пепелище", + "Тушила это работа Кстати у них праздник через неделю, вроде пожарник это терпила погорелец и поджигатель", + "пожарниками ранее погорельцев называли", + "https//ruwikipediaorg/wiki/%D0%9F%D0%BE%D0%B6%D0%B0%D1%80", + "Ничего подобного, у меня есть доказательства Upd упс, пролистал ниже, уже скинули значок", + "пожарник это профессия, а пожарный это рукав заебала уже эта илита", + "Пацаны, а можно мне с вами А то опять премию урезали", + "Моя премия и премии моих коллег здесь в чемодане, можно мне с вами", + "Мне интересен план угонщиков получить деньги, а дальше", + "а как же мы не ведём переговоров с террористами", + "так это ж не террорист, обычный угон за бабло, никакой политики", + "и всё же есть ньюансы у одних идея, хоть и плохая, другие просто хотят денег", + "Заиметь деньги как 
бы тоже идея", + "ну это разве идея это так, сладкая мечта о халяве миллион долларов на 4 5, даже в начале семидесятых", + "Тогда террористов как таковых не было", + "Так это же не в России было", + "Моссад не ведет переговоров с террористами", + "Джордж и его помощники хотели быть уверены в том, что агенты будут без оружия так плавки в таком случае не помеха", + "А теперь уберите с этой картинки пистолет", + "Работая 40 лет в своей америке, он бы больше заработал А так разве что подержал пару часов лям в руках Единственный, кто получил профит Алжир", + "А что было с самим самолётом Он же стоит наверное больше, чем сумма выкупа", + "Спасибо! Класс! Не слышал!", + "Странно, я думал это история Д Б Купера", + "Улететь Если их сразу не начали штурмовать, значит там заложники", + "Потом орать it's a prank!", + "Могло быть и хуже Черное Зеркало первая серия, которую ньфагам лучше не смотреть", + "Свинья охуенная тема!", + "© гг немого фильма Свадебная ваза", + "Может пистолет в трусах Это же не плавки", + "Чумодан то, видно, тяжелый Как он его передавать будет", + "Или просто русский возвращается с отпуска домой", + "С ФСБшниками такой фокус не прокатил бы", + "Жить надо прожить так, чтобы тебе к борту принес чемодан с двадцатками сотрудник кгб в труселях", + "У него ствол в трусах", + "Классно! Не знал! 
Автору респект!", + "Это мне на днюху подгон был", + "А чего до трусов то Он же в трусах мог пушку спрятать, ну или чего они там боялись", + "пожарник это погорелец", + "В таких историях нужно писать не выкуп, а прикуп", + "Может в ФБР у них форма такая https//youtube/Ce92FeW UbYt=31", + "А есть ссыль на историю, что на фото", + "ага адльше что удочкой чемодан поднимут на борт", + "А в трусах у него ствол", + " Мы требуем голого агента фбр с чемоданом полным деньгами Мы сделаем все что в наших что вы только что сказали", + "допилилЪ конкурс на название сюжетов", + "Ничего не имею против гомиков знимайтесь, чем хотите Но не надо это вот так напоказ выставлять", + "от заката до рассвета", + "Когда какой то старый пень проводит отпуск интереснее, чем ты 😆", + "ну че так грубо то есть давно придуманный девайс", + "Значит надо запихнуть туда пальчик одной руки и пальчик другой руки Очень гибкое применение девайса", + "Это всего лишь преобразователь сетевого напряжения в 12В постоянного тока с двумя автомобильными розетками, не городите тут хуйню", + "если выкинуть лишнее из корпуса и подпаяться напрямую, можно использовать и по другому назначению", + "А если это будут пальцы двух разных людей, не касающихся друг друга и стоящих в резиновых сапогах Неплохой дивайс для пати вечеринок", + "это даже круче пояса верности", + "То с огромной вероятностью ничего не будет ни одному ни второму", + "Того, кто прикоснется к фазе, непременно шандарахнет", + "Кое что другое не войдёт туда", + "а если входит, то есть смысл воспользоваться", + "Это твои не влезут, а мой влезет ©ослик Иа", + "Так себе повод для гордости", + "ты не пихай писюн в розетку ударит током ты умрёшь хотя раз он туда пролазит пожалуй лучше умереть", + "Ток пройдет между пальцами", + "Как то похоже меня бахнуло, то пробило где то до груди", + "Так студия Лебедева и позиционировала Вилкус как устройство для подзарядки То есть цели убить владельца не стоит", + "Смотря, какая схема используется в разветвителе Если 
соединение параллельное скорее всего именно такое, то не сильно", + "Сейчас придёт Роскомнадзор", + "и сунет пальцы в розетку велкам, хуле", + "Всем штатом желательно", + "Но розетка то одна Хотя можно сэкономить если все сотрудники поочередно вставят пальцы друг в друга и самый первый уже в розетку", + "Друг в друга И куда же", + "Куда что пролезет, главное дабы контакт состоялся", + "Эх Мечты мечты", + "Не прикидывайся, я знаю кто ты", + "вилка замкнёт цепь, и ток пойдёт по ней", + "и будет греться если достаточно долго ее держать, можно получить ожоги, несовместимые с жизнью", + "Автомат вырубиться через секунду", + "Через долю секунды Прямое КЗ отработает моментально", + "Вилка/шпилька сразу выстрелит из розетки", + "Вилка с 3 зубцами не пролезет в розетку Нужно 1 зубец отогнуть", + "#comment_49037987 эпичная ветка была, помню", + "Идиоту от идиотов, как мило", + "хороший человек бы например фазу как землю обозначил бы", + "в смысле была бы надпись Zemlja puhom", + "Всё нормально сделано Они же не желают действительно убить заказчика", + "А чего латиницей написано На казахском что ли", + "скорее всего, маркировочный принтер просто не имеет русской раскладки", + "Мне кажется можно было и так yMpu cyka!", + "OHu IIpocTo B Kc 16 He urpaJIu HuKorga HaBepHoe", + "u B IRC`e6e3 koDuroBku HE CuDJ7u", + "а в чем суть то если вставить это в розетку выбьет автомат и все", + "Устройство экономии электроэнергии", + "А фотограф это быдло монтажник, или заказчик идиот", + "А фотограф это быдло монтажник, или заказчик идиот Не путайте тире и дефис Не путайте — и", + "Как написанное тобой вообще прочитать и понять", + "Он спрашивает, кто постит это фото быдло монтажник, или заказчик идиот Лично я думаю что быдло монтажник", + "ты ставишь тире между словами, когда тут нужно писать быдло монтажник и заказчик идиот эх", + "Тот для кого ссылку просят", + "ты не только вместо дефисов тире ставишь, но еще и вместо запятых какая любовь к тире", + "Как я часто отмазывался на диктантах 
Это авторский знак", + "Ну русским же по белму написано фотограф быдло А кто идиот монтажник, или же заказчик понять не может, поэтому просит подсказать", + "Глупая конструкция При втыкании, просто автомат в щитке отрубится", + "2 фазы ж соединены , значит при вставлянии в розетку прост овырубится автомат и всё", + "Опасная картинка У рядового обывателя может сложиться мнение, что слева всегда ноль, а справа фаза в розетках", + "Соглашусь, но нефиг рядовому обывателю лезть в розетку без индикаторной отвертки Не уверен не берись!", + "и на нуле наконечник не обжат", + "Гуглите Соединение обмоток генератора звездой", + "Ты знаешь, я вот учился 5 лет на вышку, дабы знать ответ на твой вопрос, если коротко, то разница потенциалов, а если углубленно, то мне лень", + "к вам в дом многоквартирный заходит не одна фаза, а три просто вы используете одну из них и ноль", + "если в квартире электроплита, то две 380 вольт надо же получить", + "ну электроплиты вроде как сейчас уже от 220 работают", + "может быть, но у меня на 380 и на нее свой счетчик, отдельный", + "Не земля, а заземление", + "Мне лень было рисовать, картинки с гугла первые попавшиеся Может быть ноль нейтралью еще назвать", + "полагаю речь о фазе с нулем", + "кароч так сделано, чтоб въебало сразу при вставлении вилки в розетоньку", + "Не должно если какие либо исправные автоматы и проводка не тоньше дешевого китайского удлинителя", + "не бахнет, это прикол электрика", + "Из той же темы http//pikaburu/story/otlichnoe_reshenie_problemyi_4935872", + "ВНЕЗАПНО, на этом проводе подпилена изоляция", + "Они фазу с нейтралью коротнули Так не умрет Только пробки вырубит Надо было было землю с нейтралью закоротить так гораздо больше шансов", + "Прочитал голосом насмешки из Unreal Tournament пиратский перевод", + "почему меня дергало током три раза за день и я не умер", + "хотелось бы узнать предысторию", + "Если вселенная бесконечнаобратное неопровержимо следовательно шансы на зарождение жизни где то еще стремятся к 
бесконечности", + "неопровержимо, но если найдете край, то опровергните", + "Ну допустим где то есть край вселенной, но как этот край будет выглядеть какова его протяженность Что находиться за ним", + "Блин, скинь почитать Очень интересно", + "12 дней прошло, а он ещё не скинул", + "Отличная идея для рекламы билайна, Светлаков оценит", + "О, МишкинаШишка ожила наконец", + "Всё есть бесконечность", + "Средневековый Flatout", + "Medieval Ultimate carnage", + "Нужен кэп Почему конь второго седока вообще улетел, тем более в таком положении он должен был лететь спиной вниз или вперед", + "это не второй седок это на одной картинке показали динамику рыцарь ткнул копье в землю и сам повис на нём, а конь улетел вверх", + "А диаметр на табличках зачем", + "Там только один рыцарь и лошадь", + "опять на Пикабу комиксы со звуком Зачет!", + "Странные оценки, 1 диаметр и просто диаметр", + "При чём здесь знак пустого множества Если это 0, тогда почему его перечеркнули", + "Нули зачёркивают чтобы не путать с буквой о Я тоже так делаю очень удобно", + "какой то профессиональный юмор", + "Ах ты сепарюга! 
Бухнём в пятницу", + "я ежей принесу гулять так гулять", + "Я жаренной форели принесуС икрой", + "Ого Спецпаек А разрешение от особого отдела есть", + "Вкусно живёте А мы, в Челябинске, метеоритными осколками питаемся", + "Вы хоть на поверхность выходите Нам Солнце только снитсяИногда в фильмах показывают", + "В темноте не запнитесь о поребрик", + "Хорошо,хорошо поговорить с Питерцем Питер,прекрасный город", + "И мне, и мне одну заверните!", + "Пссс,я тут знаю одну библиатекаршу,у неё есть доступ к французской литературеТам в книгах бабы раздеваются догола", + "Только согласованные с особым отделом", + "А вот воспитания тебе не унять Наверное книжки читаешь", + "Может они его специально разбили, чтобы чужие боялись", + "Агент кремля, всем стоять!!!", + "Авианосец Самолет Падение где то должен быть МакКейн", + "Можно ещё остров Кыска вспомнить", + "Это просто американская тактика, вам, русским не понять", + "никуя не новинка, и никуя не американская https//ruwikipediaorg/wiki/%D0%9B%D0%B5%D1%82%D0%B0%D1%8E", + "Винстон Кузнецов 2 1, наши начинают отыгрываться", + "Разбился и разбился Да и хуй с ним,он же не убил людей И не убьёт", + "Товарищ Ким, залогиньтесь под собственным именем", + "Бля, чувак, у тебя тег политика и ссылка на Вк Меняй на что нибудь существенней Это не канает", + "на кой хер тут вообще ссылка не нужна, благо это не новость где нужны пруфы, а просто шутка на политическую тему", + "Тогда не надо ставить тег Политика, читайте правила", + "с какой стати шутка политическая, а значит без тега политика пяти минут не пройдёт, как набегут рыцари свежего и устроят сеанс призыва модератора", + "Правила читай, олень", + "олень ага, и мамку ипал и лалаку затраллил и го пвп или зассал тоже отдохни ка в игноре домашку поделай", + "Как обычно, ошейник с обратным адресом, а поводок, чтобы не убежала на прогулке", + "Супер Игра где прячутся остальные сарацины", + "Ответ из 9 букв Первая И , последняя М Крутите барабан", + "Как работает эта херня с колесом на 
картинке В чём пытка", + "Тут же другое колесование", + "Достаточно болезненные !", + "Может, там внизу под колесом штука с острыми шипами, типа терки, и чувак каждый круг проезжает по ней лицом и тушкой Я бы так сделала", + "Опасно, наверное, с тобой жить", + "Не знаю, муж не жалуется ему нельзя жаловаться", + "Однажды он сбежит,чтобы этого не случилось перережь ему сухожилия", + "простите, а что это за вайнерша на гифке", + "Как гусеница побегу, языком цепляясь", + "аблин,а я думал растягивали = жизнь обман Я бы так сделала а вы опасная хД", + "Щас погуглила растягивали на дыбе Устройство колеса гуглить лень, сделайте это за меня кто нибудь", + "дык суть пыток как раз не в смерти,а в том чтобы человек остался жив до признания,а потом уже пофиг", + "Я тоже думал что растягивали Короч погуглил там снизу должны быть шипы или пламя Просто тут не нарисовано", + "пнятновсю жизнь я жил во лжи =", + "Там другое колесо, горизонтальное Клали человека сверху и херачили молотом по суставам", + "Растягивали это четвертование", + "нене,есть же дыба когда просто растягивали чтобы все суставы вышли", + "Дыба,не растягивание просто тебя подвешивали и тело само ломалось,никто никого кроме тебя не растягивает", + "сектор Иерусалим на барабане", + "И при чем тут крестоносец Но deus vult, конечно", + "AVE MARIA DEUS VULT!", + "ФОР ДЕ ГРЕЙС, ФОР ДЕ МАЙТ ОФ АУР ЛОРД! 
ФОР ДЕ ХОУМ ОФ ДЕ ХОУЛИ!", + "ФОР ДЕ ФЕЙС, ФО ДЕ ВЕЙ ОФ ДЕ СОРД ГЕЙВ ДЕЙР ЛАЙВС СОУ БОЛДЛИ", + "AAAAAAAA AVE MARIA!", + "То чувство, когда не понимаешь отфотошоплен ли Якубович", + "Юмор современных 314 здюков мне совершенно чужд Где шутки за 30", + "И правильный отве е е ет П Л О С К А Я !", + "http//cs5pikaburu/post_img/big/2014/12/12/7/1418383960_1075114245jpg баянище", + "Ссылку лучше выкладывать на сам пост", + "И вызывать модератора, тк без вызова можно познакомиться с падением рейтинга", + "а разве вызов когда то поднимал рейтинг", + "Окраска чем то падре напоминает!", + "Действительно, похоже Как раньше не замечал", + "одновременно мило и жутко", + "Привиделось,что часть венка это её хвост обмотанный лентой", + "Шампанское в бюро ритуальных услуг так мило", + "Коробка с НГ осталась из под подарков", + "Из под бабушки из ЦАО", + "надо было назвать Аида", + "Красивая коша , и цвет в тему", + "Наверное, самый лучший работник!", + "Без видео, где она ноги перед всем районом раздвигает, пост какой то не полный", + "Не всегда, практика по стране есть, когда ребенок остается с отцом, но там так должны звезды сойтись, что легче в лотерею миллион евро выиграть", + "у нас на работе некоторые женщины пытаются ништяков отжать у всех типа у нее дети в ответ, говорим это мы тебе их сделали отступают", + "Всего 6 скринов, а вся жизнь одной семьи, как на ладони Как будто, Сто дней одиночества прочитал", + "12045 дней одиночества", + "У меня всего 8000 дней", + "Друга, бывшего мужа, бывших одноклассников и случайных прохожих", + "Давайте устроим перепись населения кто сколько платит алименты! 
Интересно", + "ни одна баба так и не дала", + "фаербол уже пускаешь", + "Ага, ночью, под одеялом", + "Как то не естесственно и суть надуманная,похоже на сочиненку для группы яжмать", + "Ну, это у неё капец мозга случай тяжёлый и вряд ли поддаётся лечению Сочувствую", + "Знатно подкололи дамочку", + "Котслампойсмотритнафейкжпг", + "странный какой то разговор почему сразу не оборвал встречными вопросами мол зачем ей это надо и стоит ли вообще отчитываться", + "Хм Это похоже на настоящее, тут чувствуется искорка не трустори", + "Ссу на ебало всем, кто воспринял этот фейк за чистую монету Ваш геолог шакал", + "Как же вы достали Убогую хрень выдумывают и бегом сюда выкладывать Начинаю верить, что кармадрочество лишает рассудка", + "Бедняжка, как же тебе тяжело Так переживать из за каждого говеного поста Отдохнуть тебе надо", + "Пытаюсь, только говнопосты мешают", + "Так ты плюнь в монитор и листай дальше", + "был бы твой монитор плюнул бы, но передо мной мой монитор", + "Думаю, если бы оно было ближе, ты бы другую отмашку придумал", + "Зачем это на Пикабу У нас культурное общество тут, а вы со своими матюками Совсем ахуели", + "Кто будет в храме матом ругаться, того кадилом охуярю", + "Вот блин вижу переписку яжматерей ставлю минус Десятки постов ежедневно вам не надоело еще Все, тренд прошел уже", + "Вот! Плюс тебе! Кассир! Кассир должен такое пресекать!!!", + "Про очередь спасиб, успокоили Но раздражения на молчащих тут и правда довольно много ведь", + "Думаю санитар леса запустил новое движение Я бы присоединился, но жалко машину", + "Пезд, обгоняющих пробку по обочине и пытающихся заскочить в поток без поворотника жму до последнего, прямо глядя в глаза, пока не попросит", + "И что хорошего в том, что ты их пропускаешь", + "Молодец!!! 
Лови плюсищще,единомышленник!!!", + "Мне стулом по голове один раз прилетело, когда я спросил Куда вы лезете , было неприятно / Теперь я контролирую тылы при любом конфликте", + "а у меня муж был как раз свидетелем ситуации номер три, поржал просто", + "Поговорила, поцеловала Но потом все равно развелись ;", + "ну, не всегда это работает", + "Я тут из недавнего поста про таблетки Золотая вы женщина, Lulka", + "о, смотрю тут не мало соседей я тоже не умела, говорю же, в школе постоянно обижали все, росла тихим забитым ребенком потихоньку вот тренируюсь", + "Ну всё! Сегодня же после работы кинусь в бой", + "ахах, удачи как нибудь окажемся может в одной очереди, вместе заклюем", + "на дорогах точно такая же ситуация все пропускают объездунов и обоченников, иногда и по несколько машин пропускают дико бесят", + "Главное, чтобы это не стало вашим единственным способом снимать стресс А до тех пор развлекайтесь", + "Согласен, правилу трех Д у нас следуют зачастую там где надо и там где не надо", + "Буду в Дыбенко, постараюсь вам дорогу не переходить", + "Про очереди на остановках неместные могут быть не в курсе, ибо такая практик в мск и спб в основном", + "а по моему наоборот это хорошо, за счет того, что они остаются ждать следующую маршрутку, могут уехать те, которые бы иначе еще 20 минут стояли", + "Да, на 492ю маршрутку очереди страшные Но пролезающих вперёд как то не довелось встречать", + "Истории, говоришь Lulka, а ты машину водишь ;", + "ну мужиков с 1 2 покупками мона и пропустить а баб надо воспитывать", + "Норм, Юль! 
Попаданец грустный какой то", + "Пидорасов боишься так, что джинсы задом наперед носишь", + "Где вас учат дебильные вопросы задавать Куда меня жизнь закинула Какой Якутск Какая кома", + "Я не медик, поэтому не могу поставить вам диагноз Только рекомендую обратиться в ближайшее мед учреждение", + "У нас бабка соседка тоже ежегодно проходит диспансеризацию После которой прекращает кидаться на людей и выть по ночам", + "Ещё момент, вы не медик, так кто вы", + "У нас в РФ тоже! Правда черствый хлеб, и за деньги", + "Был же такой Его еще обвинили в рекламе и в том что он крадет хлеб покупателей", + "Это не еда дорогая, к сожалению, а люди бедные", + "И люди бедные, и еда дорогая", + "40 50% тратят от медианной ЗП,если считать среднюю 35к рублей то на еду будет уходить в среднем 30%,что тоже много", + "Совсем разложились Надо же бульдозерами давить", + "Или срок годности переклеить", + "А потом бульдозером раздавить И по федеральным каналам показать", + "Пятерочка в бытовую мусорку выбрасывает", + "Санкционку, изъятую таможней, уничтожают во всем мире, а не раздают малоимущим", + "Это вообще нормальная практика в Германии, фудшеринг", + "Так в германии тоже Да и в других странах ес наверняка также", + "Я смотрю европа то совсем прогнила!", + "Речь не о просроченных продуктах", + "Во Франции недавно так тоже делали Инфа не моя, но от человека, который там жил", + "Молорики прибалты, чувствуется русский дух!", + "Задрали уже! 
Я просто жрать хочу!", + "Да и России так же, достаточно минуту погуглить", + "Причем тут социализм", + "Автор просто не в курсе, что благодаря этой программе магазины платят меньше налогов", + "То есть государство приняло закон, который выгоден и магазину и простым гражданам И не получило с него откатов А че, так можно было", + "Хз, может у них там правительство правда работает, а не придумывает новые способы законного отъема средств у холопов", + "Нууууукак бы чем не социализм Государство по сути оплачивает еду для нищих", + "А я то думаю, откуда знакомый привозит просроченные финские продукты", + "В Питере сеть Дикси по уценке продает продукты близкие к срокам Никакой халявы", + "Потому что их прикроют Они обязаны утилизировать продукты с истекшим сроком годности", + "Еще и ждать определенных дней будут, чтобы только только выставленное взять нахаляву", + "Эх, не в той стране я живу Но спасибо что не Китай, или Индия, или в Африке там где нибудь", + "При социализме паразитизм не приветствовался", + "http//pikaburu/story/khalyavskie_magazinyi_germanii_482843", + "https//youtube/LvPucEICDyw", + "Помню в Минске после вечерней службы в Красном Костеле давали вино и еду", + "При причастии в православии тоже вино дают, но мало", + "А вот мне бы такое питание очень пригодилось бы, вместо того, что бы работать мог быполноценно учится", + "Ну что за хуйня Они дают пищу от всей души! Если ты не берешь ее ты обижаешь добрых людей Это подло! 
Бегом за киндерами!!!", + "Скорее воспитывает рационализм Нерационально выкидывать продукцию, когда ее можно отдать", + "Странная логика, но в любом случае, это не противоречит моему высказыванию", + "Ну эмигрантов можно и просрочкой покормить", + "так перед кем должок то перед побирающимися мигрантами", + "Вы только постарайтесь не возвращаться обратно Пока есть еще силы уходите", + "@moderator , я расшифровал криптопосыл прошу разъяснений и действий", + "уд срамной в уста твои сахарные, содомит стукач", + "Катись отсюда, псина немытая И слова свои засунь в _О_", + "иди делать фелляцию таджикам, содомит", + "Не измениться, правильно", + "То есть у вас перед глазами пример родителей, и вы выбрали мужчину, который пьет Как так вышло", + "Тупой ответ на правильный вопрос Я всегда могу увидеть, если у человека предрасположенность Просто понаблюдайте за ним на праздниках вот и все", + "Вопросов больше не имею", + "Любовь, будь она неладна", + "Чем ваша история в итоге закончилась", + "Очень жаль, что так Все таки подумайте, может стоит поискать счастья ещё", + "Значит дайте шанс Выше правильно написали, с пьяным бесполезно о шансах последних говорить Лишь с трезвым", + "Это еще надо дождаться, пока реально протрезвеет А это дня три хотя бы без капли алкоголя Тогда наступает нормальная трезвость", + "Лет мне было 18, я шел с работы погожим летним деньком, с бутылочкой пива в руке Кто плюсует это говно", + "Те кто как и автор наивно полагают, что этот пост хоть одному человеку в мире поможет после его прочтения", + "Ну это только начало славного пути Надеюсь все же с этой дорожки свернешь", + "Получится, охуел что сомневаться", + "в смысле я не совсем понял ваш комментарий", + "Я о том что надо делать, и нечего сомнениям поддаваться", + "К врачу не пошёл Осознал что это зло вот какой я молодец Закрыл Пикабу пошёл за бутылкой", + "А я в данный момент пьян", + "Это физиологические изменения в структуре мозга, в первую очередь", + "хаха, у меня был такой Сергей Саныч в свое 
время", + "СС, если быть кратким", + "Сегодня в магазин вечером не пойду, надо заканчивать", + "Увезти маму И внуков отдать Ей нужно быть кому то нужной Она так привыкла И номер телефона ей пока сменить", + "Никогда не похмеляйтесь", + "Ты опять начал Вызывай наркологичку, нечего тянуть В прошлом посте ты говорит, что это самое лучшее", + "Не на что деньги будут только к концу месяца, а это удовольствие недешевое", + "никогда не похмеляйтесь, как бы вам не было плохо вы это заслужили если совсем невмноготу, квас или кефирчик и в сон", + "если не похмеляться то и запоя не будет", + "Ну так запой и происходит из за опохмела Не похмелишься запоя не будет", + "Имеется ввиду, что постепенно из запоя выходить, сокращая количество алкоголя до 0 Это невероятно сложно, но при поддержке близких вполне возможно", + "говорят, если не похмеляться, труднее в запой свалиться", + "ты еще и девушка , ну ты крут =D", + "а лучше совсем не пейте", + "Возможно тебе 16 лет", + "все будет, не переживай", + "Пикабу поддерживающий", + "а вдруг ты испугаешься, и бросишь бухать вот и поддержка!", + "ты ни когда не познаешь вкуса холодной воды с утра", + "Выпей много литров хренового пива и шлифани водкой Если и тогда похмелья не будет у тебя нечеловеческий метаболизм", + "В депутаты идти тебе надо", + "Это алкоголизм, когда или даже неделю подряд", + "Это вторая стадия алкоголизма", + "А на первой стадии что происходит", + "на первой ты не видишь минусов употребления", + "Крепкая у тебя печень, однако И ты это, выздоравливай давай", + "такая же фигня, утром хочется мяса какого нибудь, бульончика остренького, в душ, на свежий воздух, но пить, фу", + "В лигу Алкашей нас по ходу не пустят", + "а я если смешаю, то блюю, реакция организма, мне помогает хорошо", + "бля я один въябывая на работе что некогда бухать? 
и бля если вы пъете запоями на что вы вообще пьете?", + "Копеечного хрючева полны магазины На 200р можно упороться на полдня Ну хз как для кого, лично для меня это уже кома, я буду спать часов 8", + "Йога я не шучу Просто много есть легенд, а в итоге сводится то все к одному , не пить совсем", + "Как вариант, менее вредные наркотики чем алкоголь Но у нас это нереально Хорошо что остальной мир уже к этому приходит", + "Со слов жены каждый день за рулём 😂 Шутка", + "хорошая книга стивена кинга корпорация бросайте курить", + "Как с паническими атаками борешься Тупо терпишь", + "Мне помогал Фенибут И шугняки отгоняет и спишь боле менее нормально", + "Или адаптол Он более щадящий по ощущениям организма И не такие финтоплясы выдаёт если на него вдруг 😳 взять и выпить алкоголя", + "А панические атаки сразу после запоя, или когда долго не пьешь", + "Панические атаки после запоя И это такой трэш, что врагу не пожелаешь", + "Господи, спасибо что это прошло мимо меня", + "Я так же про всякие наркотики говорил, пока не понял, что алкоголь не меньшее зло", + "Мексидол Ноотропил Попробуй", + "Как не обидно признавать, но я зависим от пива В неделю 5 дней точно потребляю его", + "Могу сказать что это очень плохо Не прекратите пока не поздно, потом будут проблемы Пиво вообще напиток хитрый", + "Все верно написал!Особенно последний абзац,что не до достижений, лишь бы в животное не превратиться люто плюсую!Держись,прорвемся!", + "Красава Вот прямо с меня написано Горжусь тем что не пью уже долго Только вот стоит ли гордиться", + "сам себе гордись другим не нужно рассказывать, мало кто оценит", + "Травку лучше тогда курить", + "Статья однако Похмел сниметнафиг", + "Травокуров со стажем не видел Тощие, пиздят без умолку, iq приближается к нулю", + "Жене большой привет и слова сочувствия Вам ничего говорить не стану судя по настрою абсолютно бесполезно увы", + "я смотрел в безднуа бездна смотрела в меняс тех пор не пью, советы и ответы на вопросы не даю", + "Если бы пришли раньше, 
застали бы колобка", + "Хочу поставить это фото на заставкуСкиньте пожалуйста в хорошем разрешении фотографию", + "Обе фотографии загрузил https//yadisk/i/iiFoyAF33HDcUi https//yadisk/i/KklIUFYA3HDcVu", + "Живых зайцев, к сожалению, обнаружено не было Так будет точнее", + "Не живых так то тоже не обнаружено", + "Там в травке то, что было зайчиком Вы не заметили просто", + "Совпадение Не думаю", + "В следующий раз должен быть волк", + "Было бы круто! Никогда волка в дикой природе не встречал", + "Лучше и не встречать У нас в соседней деревне волк недавно двух алабаев сожрал, только головы остались", + "А после него — охотник", + "Пресет для LR Сергея Рожнова", + "Цвета шикарные Можно узнать как обрабатывали", + "Как раз жуткая цветокоррекция", + "В лайтруме Пресет ручная доработка", + "Старый Бен всегда умел тонко подъебать", + "Отливающий штурмовик никого не смутил", + "А чем нас должен смутить солдат писающий на дерево", + "Вероятно тем, что попал на дерево шутка за 300", + "с учетом что он вон в тот костёр целил, то не попал", + "меня смутил как он там член вытащил", + "А что если штурмовики так хуево стреляют, потому что джедаи отклоняют их лучи телекинезом", + "Ну значит хан, р2д2, с3ро, лея и многие другие джедаи", + "хан, р2д2, с3ро, лея и многие другие джедаи Не пойму, в чем тогда проблема", + "там тире должно быть после другие", + "но его попривычке отклонило телекинезом", + "так что я джедай, что ли", + "Он просто настолько мощный джедай,что силу свою не контролирует,и тем самым отталкивает знаки препинания", + "Это из за того,что он перешел на темную сторону силы", + "А ты пробел между запятой и и", + "А я пробелы после запятой и не ставлю никогда", + "Так ты ж, тире писал, ты штурмовик получается", + "Я преклоняюсь перед вами, мастер", + "Это дефис Вот тире —", + "Так ведь Лея действительно джедай", + "Зачем ты так Он просто вернул тебе тире,которое сдуло телекинезом", + "прост где то же должно быть то тире, которое он отбросил в начале сообщений вот оно 
здесь", + "ну в любом случае, технически она не джедай, хоть и чувствительна к силе", + "Так про джедайство и речи не было", + "рей типичная мэри сью и продукт диснея", + "ээя как раз в начале сабжа намекнул саркастически, что она джедай, так что была речь о джедайстве", + "Донни Йен тоже не джедай, но выстрелы отводил таки", + "а ещё у диснея героиня к середине фильма освоила приёмы силы и лайтсайбер", + "Пффф, героиня, Финн штурмовик ито с Кайло дрался на световых мечах без Силы", + "ну пока не известно, есть мнение, что финн не простой штурмовик и чувствителен к силе, но это будем посмотреть", + "Вроде штурмовики Первого Ордена обучены владению разными типа оружия, по этому он мог махаться световым мечом", + "В шестом эпизоде она демонстрировала ментальную связь с Люком ближе к концу фильма", + "Так то Лея имеет предрасположенность к Силе", + "Не забывай, солдат! Враг повсюду!", + "ты видимо не понимаешь, что в том и прикол", + "С личными местоимениями тире не ставится", + "Что поделать,я тот ещё плебей", + "он же вроде сантехник, не а вообще, отмазка, как дырка в попе, есть у каждого", + "В таком случае визор вообще можно блочить", + "Ну, он собственно и блочился в линзах камеры, которые передают изображение на глаза как в шлеме Железного Человека", + "Штурмовикам обычно ставят задачу взять главгероев живыми Ну и те берут как умеют", + "У них жи оглушающие патроны есть", + "Но им про это не сказали", + "Зачем им вообще магазины, ни разу не видел, что бы кто либо перезаряжался", + "Потому что плазма не бесконечная, но её хватает на очень много выстрелов", + "плазма ионизированный газ не неси бред, что плазма как патроны", + "Ну как бы бластерный газ используется Это даже в оффициальных рулбуках словесных ролевок прописано", + "для особо умных http//starwarswikiacom/wiki/Plasma", + "ее порционно высасывает из магазина и она летит во врага", + "Сила влияет на их технологии, помехи создаёт им", + "Лучу отклоняет сила Сила во вселенной звёздых войн это сценарист", + 
"Может не лучи а сами пистолетыбластеры", + "Я тебе телекинез кое куда напихаю Не телекинез а Сила! Не лучи а болты! Учи матчасть пидаван", + "я так двери в ТЦ открываю, скажите я джедай", + "Это ещё круче чем в фильме", + "тем самым показывая незнание электронной пунктуации", + "Как же меня это взбесило Силовое принуждение это вообще техника уровня мастеров джедаев", + "Ну не на сто, в него таки настреляли", + "Вроде в черном это летчики, не", + "Не, Death Troopers, отделение личной охраны Кренника", + "какое смешное имя у него", + "Почему ты назвал это имя !", + "это очень мощный селф баф, повышающий все характеристики и додж от выстрелов", + "Также у машин с изображением языков пламени скорость увеличивается прямо пропорционально их размерам", + "еще одни любят появляться по одному и в этот момент разговаривать по рации", + "если нулевую меткость увеличить нас 100%", + "Это на шанс попадания именно А уклонение и резист как написали выше", + "Лучше бы сделали вторую часть или ремастер вот этого шутана Эти парни тру илита", + "Да она и щас заебись Перепрошел месяц назад", + "Республика узнает о твоём подвиге Мы будем ждать тебя ,Грэгор Всплакнул", + "Насколько я знаю, не из этой компании,это был другой коммандос,Грегор", + "Ну вот, опять перепроходить и плакать в конце как сучка", + "Ну во втором батлфронте будет сюжетка как никак Обидно, что для пски за полное издание опять долларов 130 потребуют", + "есть любительский проект https//vkcom/swrcommando_2", + "Только игра, должно быть, выйдет аккурат перед третьей халфой", + "ЕМНИП они не поддались приказу 66 и ушли в ренегаты", + "С чего ты взял Как раз таки у них имеется зуб на джедаев, которые не дали им спасти своего товарища", + "Ну и Дельта и частично Омега половина этого отряда свалила на Мандалор остались на службе Империи", + "если это 501, то они как раз таки расфигачили джедаев", + "Они не из 501 За 501 мы можем поиграть во втором баттлфронте старом", + "Ну мы то знаем, кто тут на самом деле элита", + 
"Объясните несведущему, кто это на арте Кто выдал ситхам броню штурмовиков Или штурмам мечи", + "501 легион,кулак Вейдера", + "Во блин, а я думал мечами может пользоваться только тот у кого дохера мидихлориан", + "Видимо для большего эффекта", + "почему они с чужими пиздятся", + "http//rustarwarswikiacom/wiki/501 й_легион", + "Простите, для лиги лени, можно отрывок где говорится про мечи 😅", + "Это просто фанарт Тут еще можно ксеноморфа заметить", + "Он зажал меч под коленом Зачем", + "мне кажется, что это она я вижу что то похожее не сисечки 3", + "Это те, которым эвоки наваляли", + "можешь посоветовать почитать чего на их тематику", + "Как то в левой руке меч прямо через ногу проходит, через коленку /", + "Мб, типа отблеска на камере", + "Штурмовики разве не по приказу старались не попадать", + "А офицера Вейдер тогда по приколу задушил, когда имперцы потеряли Сокол из виду", + "Ага А ты бы так не делал, если б мог", + "Мне потом расскажешь", + "Есть видео в котором говорится, почему штурмовики так плохо стреляют https//youtube/mdmiFAljkd4", + "Это же художественный фильм, х/ф не существует без нелепостей и допущений", + "В художественном фильме допускаются некоторые нелепости, но чем их меньше, тем выше уровень самой картины", + "Оо прикольно сделано Отсылка и к любви отрезанию рук, и к известной картине про собак играющих в покер", + "Отсылка и к любви отрезанию рук поясни", + "Хрен его знает я был бухой и это было дико давно", + "а вейдер тогда почему задушил офицера", + "Первого за дело, второго для прикола задушил", + "Я бы тоже, будь я Вейдером, ради прикола души бы офицеров", + "Ага, а когда они защищали генератор поля у второй звезды смерти, тоже херово стреляли по приколу", + "Попробуй, блять, сам попасть по этим ебучим мишкам!!!", + "Которые, блять, сливаются с окружением!!!", + "Врубаешь тепловизор и стреляешь", + "Руки @Chiliktolik , признавайся, ты опять пытался их нарисовать", + "Если бы они попадали в цель, то серии сократились бы по времени как 
минимум на час", + "Командир скомандовал Огонь! и сразу пошли титры", + "Знаю инструктора этой элитной школы штурмовиков", + "Почему за тупые шутки нельзя бить людей по интернету", + "А если это раскадровка из игры", + "Почему нельзя бить людей по интернету за тупые раскадровки из игр", + "Почему нельзя бить людей по интернету !", + "Потому что им не больно, блин!", + "Мне важен сам факт битья", + "Это не отменяет того факта, что шутка про меткость штурмовиков не очень", + "Просто она не попала в цель", + "Выстрелила Но не попала", + "Это кто тут такой выебистый", + "так же как и твоя шутка, ведь она тоже старая и уже не смешная для многих", + "Разрешаю тебе не смеяться", + "Почему шутка тупая то", + "Да это просто уже классика", + "Да шутка то бессмертна, не спорю, просто народ уже совсем меры не знает Однообразие и поточность обесценивают все хорошее в ней", + "А еще Донни Йен вроде как показал, что не штурмовики косые, а Сила отводит выстрелы", + "Вроде понятие юмора индивидуально для каждого,не", + "Полностью согласен, а первое предложение вообще можно в золотой список цитат о ЗВ добавить D", + "Ага, особенно на планете с храмом джедаев, где Джен Орсо с дубинкой отмузила больше 10 штурмовиков", + "Но не настолько, чтоб быть смертоноснее джедая", + "Да блин, стырили имя, вот и путаю", + "На заднем плане ссыт штурмовик", + "я думал мне показалось", + "Эти белые штурмовики настолько тренированны, что бояться потерять зрение когда им кидают песок в шлем", + "Ребят, хочу посмотреть все зв, в каком порядке их лучше смотреть", + "1, 2, 3, Изгой Один, 4, 5, 6, 7", + "Это если в хронологической последовательности", + "4, 5, 6, 1, 2, 3, Изгой Один, 7 но это моя теория что так будет интереснее Сам смотрел по порядку", + "Хм, а мне в любом порядке нравится пересматривать, разочарован не был", + "Ну имхо новичку лучше все же в хронологической последовательности смотреть, дабы не запутаться", + "Блин, потеряли поцана Смотреть надо ТОЛЬКО с 4 по 6 Всё остальное это уже 
фансервис для выжимки бабла", + "Разве фильмы не ради денег снимают", + "Фильмы снимают ради искусства Мне жаль, что вы это не знали", + "аааа так вот почему уже 8й форсаж сняли", + "Вот этот фильм сняли ради бабла Только это не фильм, а говно", + "Я вот и удивился, что сначала вышел 4 эпизод, раньше зв особо не смотрел, поверхностно в курсе событий", + "Не за что Если посмотришь отпишись о впечатлениях", + "А мне как фанату он тоже очень понравился", + "По мне так они все на уровне выглядят Эффекты никогда не заменят вручную с любовью сделанные декорации и костюмы", + "Такс, такс, такс, про тройку войн клонов можно ли поподробнее А то знаю только закончившийся в 15 году", + "Спасибо, пересмотрю лишний раз", + "4,5,6,1,2,3 и 7 никогда, старт срач", + "Вот все тут только и шутят про штурмовиков, а великолепной графикой никто не восхитился", + "Ну у них же есть отряды специального назначения", + "Поясните пожалуйста, почему клонов Джанго Фетта заменили штурмовиками Вроде как клоны были намного мощнее", + "А серые штурмовики попадают так сяк", + "попадать в цель а это вообще законно", + "Стрельба для петушков, топор выбор мужика!", + "А у вас ересь в зубах застряла", + "Да они там чуть ли не через замочную скважину повстанцев выкашивали даже команду А перебили", + "Ну,значит не угадала", + "Ага было на джое пару часов назад", + "Написали бы, что с вк", + "А какая разница где создана шутка, если шутку создавали люди, которые могли создать и тут", + "О Ленинградский 101, привет из Железногорска", + "Я ваш дворник Завтра будем выпиливать шиповник", + "Спасибо вам за вашу работу!", + "Шоколадку ты принеси, так за работу отблагодари", + "Я не мусорю,думаю это лучше чем шоколадка", + "Подарю а они фантики по двору разбросают", + "Да не будь ты жлобом, принеси шоколадку! 
Мы проверим", + "Какие настырные у нас дворники", + "Принеси шоколадку!!!!", + "тоже верно, люди, кем бы они там не были, любят так деать", + "Не трогайте шиповник, он вкусный!", + "Когда меня в начале 2000х местная банда гопарей отпиздила и в шиповник выкинула, мне он вкусным не показался", + "Ты просто не распробовал", + "У меня и зубов то тогда с половину повыбивали так что", + "Чай бы тогда заварил чай из шиповника тоже вкусен и полезен!", + "Не распробовал пиздюлей", + "я смотрю и думаю, место вроде знакомое часто в гости езжу🙋", + "Я из 99 длинной пятиэтажки Хоть и давно уехал, но узнал места сразу же", + "5 лет его убирала До сих пор с многими жильцами здороваюсь", + "А этот хондовод 199 В гости приехал или местный", + "Ну вот без тебя белый ларек совсем загинается с торца дома", + "сначала засомневалась, ибо регион какой то не тот на машине заглянула в комменты, а тут перекличка железногорцев привет с 60 лет", + "Пол города тут сидит", + "Про шоколадку не забудь!", + "с огромным удовольствием на даче у родителей фигурно подстригала трехметровую тую красота!", + "Он по другую сторону баррикад, так сказать", + "Полковника по привычке убрали не смог удержаться", + "Гус Фринг тоже сам полы мыл и столы протирал, осторожнее там", + "Мой дядя заведующий ветеринарной лаборатории, кандидат наук, работает на второй работе дворником", + "Так все и есть до обеда гОвны разгребаю, после обеда дкларации", + "Порнуху по ночам смотреть не надо", + "как это влияет на зрение", + "А как волосы с ладоней убрать", + "Есть одно озеро, но там обычно горло полоскают", + "бумаги, что бы пройти", + "лебёдушка, к морю лети", + "исправлю после обеда гОвны разгребаю, до обеда подметаю улицы", + "Ох, с вечными выдумками нашего любимого государства хотят как лучше, а получаетсянеизвестно где больше придется разгребать и потеть", + "до обеда улицы убираю, после обеда говны разбираю", + "Короче говоря что до обеда, что после, всё равно говно грести приходится", + "Господи, куда катится этот 
ебаный мир", + "здоровье в тонусе А спина как", + "Здоровье в тонусе, на спину тоже не жалуется Он же не камазы разгружает со щебнем", + "Это была ирония Неужели это не очевидно", + "И я сама это сделала!", + "можно я это заскриню и буду нещадно использовать тут и там", + "Зачем перед текстом три пробела Мой внутренний перфекционист негодует!", + "Спасибо Кажется, волшебник, вы?", + "Одобряет Йода магистр", + "джой реактора на вашу картинку нехватает!", + "Пользоваться Пэйнтом должна научиться ты! И про вьювер фото Виндовс не забудь ты Мастер Йода", + "Дворник 20к получает Это какой город", + "Знакомый алкаш дворником работает 22 тыщи получает Белгород", + "Знакомые узбеки дворники в Люберах по 15 получают", + "Железногорск, Красноярский край", + "Ого! 20к! Это в Москве наверное", + "В прошлом году работала дворником в детском саду, получала 6 тысяч в месяц", + "Меня тоже бесит офисная работа! Хотя расчёты ещё терпимы, несмотря на сложность", + "50k рублей Это почти $1000 А по старым деньгам так почти $2000 Ну охуеть теперь, дворник получает больше меня Пойти, что ли тоже устроится", + "у тебя херово с матьимачехой", + "Спасибо за Вашу работу!", + "Все профессии нужны, все профессии важны! Вам спасибо за труд", + "Я рад за вас и спасибо вам за вашу работу Поменьше вам всяких хмырей", + "Вот, персонаж тоже эстет был Если что, Мусорщик, 2001г", + "Гуськов замечательный", + "На первом фото за машиной тоже вы", + "Не я, это самая молодая наша сотрудница", + "Спасибо вам за вашу работу", + "Все так и есть У нас тоже урезали Но в нашем городе (около 100000 людей)жешинам трудно с работой Даже в дворники очередь", + "В МСК тоже с работой напряг, даже в технологи очередь! 
т_т Либо место говённое", + "Каждая работа достойна уважения Жаль только, что другим людям дают 8 12 часов работать за 8к в месяц", + "Может быть когда нибудь это поймут и все остальные люди Что любая работа достойна уважения", + "А вот скажи, есть день ВДВ, сколько адекватных вдвшников, а сколько неадекватов но больше то всего запоминается плохое", + "Неадекватов меньшинство, но исполняют номера именно они, поэтому они и на виду Пара сотен обезьян на всю страну создают репутацию празнику", + "Ну тогда я пойду на вокзал Выразить свое уважение некоторым дамам", + "Я думал, он про старушек, которые пирожками торгуют", + "Ливерными Я б зохавал сейчас", + "помогай весне ешь снег", + "Если а работа под работка в радость то это шикарно!!!", + "Как говорил мой учитель физики в школе дворник самая лучшая работа, сразу видно результат", + "Спасибо, очень приятно", + "сразу для любопытных город указывайте давно было если 1012 год Кстати что то я тебе не верю, у тебя же год рождения 1593", + "С годом рождения не угадал Хабаровск Я 19века", + "О_о 19 века !!! Сколько ж вам лет, если не секрет Царя видели", + "Последняя кто видел царя", + "Мне одному кажется, что дом на фотографии буд то карточный домик и складывается во внутрь", + "Фото на мейзу 3 Сама фоткала", + "тоже заметил Толи фоткали в движение, толи дом с наклоном", + "Хозяйке на заметку Если делать уборку дома реже,то результат будет заметнее xD", + "Система налогообложения?", + "А я мусоропроводчиком работаю Мож думаю сантехником перейти", + "На таких людях земля русская держится", + "Я думал, это пост осуждения дворника, который на дорогу льда накидал", + "А меня и дворником не берут , плохо мету говорят! что делать побираться может вот это оно золотое дно", + "Кстати А есть такая социальная сеть, где все тупо про работу рассказывают", + "Наши коллеги ругались на эту сеть из за бесчисленного спама! 
Я зарегистрировалась там, но заходить боюсь!", + "А он в России работает Вроде заблочили его", + "Много чего заблочили, но это не значит, что им нельзя пользоваться", + "Скажите, а квартиры дворникам дают Или это из области фантастики", + "Мой почет и уважение, выйти из трудных ситуаций с удовольствием для себя и пользой для окружающих не кажный способен", + "Нифига се, всего 2 часа в день, а зарабатывает так же как я работая по 8 часов в день", + "А аспиранты 14к в месяц загребают за 3 статьи в полгода Хе хе! Про репетиторов молчу", + "200 рублей в час, отличная цена так то", + "и дополнительно ничего не платят, за выход в выходной", + "У нас дворники ничего кроме собак не едят и не говорят по человечески тоже Кто их нанял я вообще хз, но город миллионник Так и живем", + "Я тоже так хочу!!! Но у нас в деревне дворники не нужны", + "Это как в том посте про бригадира работу можно любить", + "работа по 2 часа в день действительно Но в случае снегопада или еще чего выход даже в выходной день и уборка", + "а дом почему наклонился назад", + "Железногорск конечно уникальный город, узнал его с первой фотки", + "дворники делают наш мир чище, но все же хотелось бы чтобы эта профессия канула в лету", + "А кто же тогда убираться будет", + "я к тому все вышеизложенное сказал, что чисто не там где метут, а там где не сорят с", + "Это то понятно, только всё равно, надо убирать снег, лёд, листья, тополиный пух, пыль, грязь", + "ну это понятно, вот если б не мусорили, то работы было б в разы меньше, а улицы чище", + "Не по теме мне одному кажется что после 2 повернутой картинке текст как буд то боком расположен", + "Филиал однометёльников, молодца!", + "Она разбудил зверя!!! 
Грядёт очередная волна пол свою работу", + "8тыс это в неделю или месяц", + "они уже так заебали бростаь снег на дорогу!", + "Они же не из вредности кидают на дорогу, а для того что бы спец техника убрала потом со двора", + "Круто спец техника во дворах У нас и на дорогах то ее не часто увидишь", + "В МСК тракторы по улицам гоняют, пугая прохожих! Я их тоже боюсь, как подкрадутся сзади", + "Да это просто вынос мозга Абсолютно иначе теперь буду смотреть на дворников", + "Снова посты про работу", + "Сейчас именно про дворников", + "Намечается массовая серия постов ", + "8 постов уже было, за день", + "Ща глянул по тегам, вот оно че", + "про тех, кто в буклет смог попасть на работе!", + "Неа, тут vanyas3579 тему поднял Ему спасибо", + "Неа, тут @vanyas3579 тему поднял Ему спасибо ; Это чтоб он увидел", + "спасибо, не знала как донести благодаоность до него", + "Прочитал, не волнуйся", + "Прочитал, отметился, плюс поставила!", + "похоже здесь снимали Осторожно Модерн2", + "Уже оплатила образование десятка обездоленных детей Смотри, за 30 лет надо 37 детей выучить", + "Меня одного покоробило от вида китайских фар на сивике", + "Все ж по электронке Это как Просто меня несколько ИПшников просят работать на дому с ними А я не могу понять как по электронке сдавать отчеты", + "Про дуплексную связь слышали ", + "Дуплексная связь в такси в 2000 годах Никто на такую раззоряться не будет Дороговато", + "погуглил интересно в бытовых устройствах cb диапазона не встречал", + "Ну есть вариант с дуплексом и двумя станциями, но врятли это использовали в такси", + "это третья сказала Толстая Остальные две в это время пиздились", + "Это до того, как накачался", + "это сказала та, кто считала себя третьей водитель вообще не знал, что так троянская война началась", + "Точно, когда первые дешёвые сименсы появились", + "Как долго я тебя ждаладиспетчер таксиу меня море вопросовможет запилишь пост про свою работу и случаи", + "Я все ждала, когда окажется, что это не тот город", + "Полезная 
история Я узнал что такое тангетка", + "А правильно все же тангента", + "а бабы долго потом спорили, кто из них кто ", + "Бизнес после этого накрылся", + "По какой логике самое простое и логичное слово русского языка типа представители интеллектуального большинства пишут через о", + "Ни одной И это логично, они там не нужны", + "Ответила явно не жиробасина", + "Я думала эпиляцию рекламируют", + "зачем кому то магнитофон в туалете", + "санузел совмещенный так что это магнитофон в ванной", + "А можно включить имперский марш и срать под него", + "эпично выдавливать лорда Вейдера прямиком в звезду смерти", + "Когда запор Люк, я твой Какец", + "Во взгляде читается Вам всем пиздец", + "Это ты чтоль на фотке", + "Барсик, только не дёргайся, прошу", + "ну если умственно отсталые являются целевой группой покупателей, то всё прекрасно", + "Тем более если, огромный пласт населения скажет гы гы смотри логотип похож на хуй, на попробовать", + "Да и слоган Не купил Ты пидор очевидно, положительным образом поспособствует росту продаж среди умственно отсталых", + "А почему умственно отсталые нарисованы как хипстеры", + "Потому, что в представлениях умственно отсталой части населения хипстеры и есть умственно отсталые", + "Ээээ сышыш, вейп есть А если найду", + "Блин не вижу почем гараж на заднем фоне продают", + "980 к тенге Около 180 200 к российских рублей", + "Юристы по разводам или разводилы Оставят голыми всех участников процесса", + "Судя по ценникам их услуг на сайте, вы правы", + "я передам твоей жене что ты смотрел", + "Я уже 2 с лишним года в разводе", + "Оставил жену с голой киской", + "Хотя бы вторую не пришили", + "часы на тумбе не забудь", + "Это кто то до него уже оставил", + "Буквально вчера видел пост о штанах Там автор обещал найти про киску Уж не ты ли это", + "По ссылке пройдите на тот пост и сверьте", + "Долго искать Я его не сохранил", + "А как же женская солидарность ", + "Оставь мужа с одной сосиской", + "Ну я не знаю, может быть и так, что девушку кинут", 
+ "Кошку то ей оставят, дорогая между прочим, породистая Глядишь, кто то ещё позарится", + "там телефон на 7 777 начинается, то есть есть у кого то 77777777777", + "Как стать экспертом по редким номерам телефонов", + "Даже если тебя продать на органы реализовать твою квартиру распродать все вещи, все равно не хватит на 77777777777", + "Идеальный пост для пикабу, котэ, баба, про развод в прямом и переносном смысле и продам гараж на месте", + "была б у меня такая киска, я, может, и не женился бы никогда", + "Что будет, если в эту контору обратятся оба супруга Она останется с голой киской, а он без штанов Потом, опять пожениться могут даже", + "На территории РФ не имеет юридической силы, особенно если чьи то права ущемляются, что очень легко доказывается в суде", + "Прав был прав, забудь в России такое слово как брачный договор", + "Мастера цензуры 80 lvl", + "По слогану непонятно, что за услуги рекламируют=", + "Родной Усть Каменогорск", + "Из за этого и разводятся с ней постоянно какой то хуй", + "Так он же мёртвых вроде", + "Вот что значит известность у человека, всё уже знаем, кого, когда, в каком состоянии", + "А если появится девушка, то карпы останутся", + "Я бы оставил У девушки есть эти дни, и голова может болеть", + "трётся с её голой киской", + "Эти голые киски стоят немалых денег Неоднозначная реклама", + "Твоей киске нужны большие сиськи Оставь мужика с голой сосиской", + "Вам, сударь, премию выдать мало! 
Прекрасный слоган Почта РФ Вертели на хую Вас!", + "Причём, самостоятельно!", + "Китайцу деньги вернул", + "По хорошему, можно и не возвращать, условия доставки не выполнены А отпрвления застрахованы", + "Неа, совесть гложет но вспоминая сколько нервов они мне съели я успокаиваюсь", + "Дык продавец вроде как не виноват оказался, накосячила доставка", + "Бля теперь опять гложет", + "Будем тебе под каждым коментом напоминать верни деньги китайчонку, ему семью из 200 человек не на что кормить", + "Ждал так аккумы для фотика Не дождался, связался с продавцом, выбрал повторную посылку Через месяц пришли, блять, обе Одну продал А нефиг", + "В октябре ноябре 2016 были реально какие то проблемы с PostNL Несколько заказанных в тот период посылок, только в феврале доехали до Украины", + "Ну так приложение Gmail и так мне шлёт привет сразу же Зачем танцы с бубном Хотя, кому как удобнее, конечно", + "я пуши не смотрю ,и телефон тоже постоянно не в руках а вот телеграм везде есть", + "gdeposylkaru у них тоже телеграм бот есть", + "Подскажите, где его найти На главной только на e mail оповещение предлагает отправить", + "У них и на почту приходит сообщение, удобно если вынесен виджет почты и сразу видно что письмо от кого и кратко что в нем", + "вам и вашему городу просто повезло мой в 600к, а траблы такие же, как везде", + "Вам надо жить в Казахстане У нас конечно Россия №2, но хоть с посылками всё ок Почитай по тегам Казпочта", + "Во всём остальном РФ лучше", + "Плеер за 1200 месяц проходил таможню Vblizi d Sharapovo Ничего удивительного Дойдёт до тебя твоя камера, я даже кулачки за тебя держать буду", + "я мышку вертикальную с января жду 26 февраля была замечена в Екатеренбургетам и пропала", + "В ёбурге всегда пропадают, так трек работает у одной из компаний Мне через 2 месяца после отметки в ёбурге пришло", + "Ну вот второй месяц заканчивается", + "Мне без трека, например, вообще не доходят Даже копеечные, за пол доллара ХЗ, где их пиздят Год назад такого не было", + 
"Просто во Внуково Vblizi d Sharapovo и есть Внуково в последнее время случается черная дыра Когда из неё выпадет посылка неизвестно", + "может для них а вы не думали ходить по очереди на обедэто был риторический вопрос", + "Я если иду, то собираю несколько посылок и настраиваюсь как минимум на час очереди, это если повезёт", + "Так посылка тебе дошла", + "Где то затерялась в филиале ада на земле! Новую заказал и все в порядке долетела за 3 недели", + "У меня за 20 дней ботинки пришли сегодня как раз забрали", + "Спасибо, безграмотно вышло Ликбез принимаю", + "у нас почему стереотипно считаю бальзаковский = это пенсионерия", + "Наверное, Бальзака путают с бальзамом, если не с бальзамированием вообще", + "Мне одному кажется что Бальзаку было бы примерно лет 220", + "сочувствую Люди печатающие одним пальцем это печалька", + "Если бы такие женщины нормально умели компом пользоваться, то Россия была бы развитее, и почта нормально работала", + "Что за камера Ксяоми Тоже хочу экшн в районе 6к взять", + "Недавно заказывал из США До РФ долетело за 4 дня,а уже до меня ехала 1,5 недели", + "мне чехол для айфона шел 90 днейчерез грузию ко мне приехал, судя по наклейке пс заказывал в январе 6 7 число пришло 16апр", + "Интересно, они там совсем дебилы конченные и не понимают, что сортировать китайские посылочки таки тоже их работа", + "У вас другой почты нету У нас новая почта как появилась, так вообще шикарно", + "Хз, всё укрпочтой приходит, никогда проблем не имел с ней", + "это ж 75 рублей, много разве за доставку", + "Новая слоупочта скурвилась Укрпошта и то лучше", + "У нас в отделении пенсионеры бунтуют, что за молодежь пошла И понеслась", + "Вот прям сейчас мою посылку уже вторую неделю пинают в этом Внуково Я бы уже сам пешком сходил, быстрее было бы Суки", + "так же посылка висит 2017 03 30 104609 【MR LC Vnukovo Cex 1 102976】Processing Left the place of international exchange уже припекает потихоньку", + "А когда выпускается таможней и неделю висит Я заебался с почтой 
россии выяснять когда им надоест в носу ковырять", + "странно, я недавно писал 2 заявления на розыск, всё бесплатно", + "Зачем вообще из Москвы в Омск отправлять почтой есть десятки транспортных компаний Тот же СДЭК, Пэк, Деловые линии и другие помелче", + "с 21032017 торчит посылка в этом цеху во Внуково", + "У меня с 30 декабря Деньги вернули Там по ходу черная дыра", + "Моей посылке экскурсию устроили", + "Хотите, я покажу, что такое настоящая экскурсия", + "Охереть А куда приехать должна в итоге", + "Не переживай, она тебе расскажет, может даже фоточки привезёт", + "Когда у твоей посылки отпуск круче чем у тебя Почта России", + "Этот момент, когда чья то маленькая посылка успешнее тебя Салют земляку", + "А вы не Руслан А то, может, наши с вами штрих коды перепутали", + "1, Лежит с 20042017 уже стремно", + "Вот и моя там лежит уже 4 дня", + "там у них сортировочная машина ломалась, вроде починили моя во всяком случае причапала наконец И вам всем удачи!", + "спасибо нервно трясется", + "Моя лежала почти 7 дней, вроде сегодня трекнулась уже в городе", + "моя кофемолочка с ибея с 26 марта =", + "моя одна посылка месяц по Мск гуляет из сортировочного центра в ВАО в почтовое отделение в САО", + "Твою посылку за шавухой на вокзал послали, походу Ты там не робота заказывал", + "Коврик для мышки в половину стола, 900х400 в пакете по размерам как шавуха как раз", + "вечно эти астрономы поназакажут черных дыр, а потом посылки пропадают", + "тоже месяц тарчала посылка тоам, а потом в жух в один день она у меня в городе, магия", + "Судя по комментам неделя может занимать от от полутора месяцев до полутора лет", + "именно поэтому и пишу Минимум", + "Здравствуйте Напишите, пожалуйста, трек номера отправлений, с пересылкой которых возникли трудности Посмотрим, что с ними", + "@DmitriyMarkin , посылка с треском RE482942826SE Выпущено таможней во Внуково, цех 3 ещё 6 апреля Надолго там", + "@DmitriyMarkin RS567238215CN лежит выпущенная таможней Очень смущает что пункте кому 
насано что то непонятное", + "Я 3 раза ходила на почту пыталась подать заявление на розыск, каждый раз новый начальник и новая инструкция как заполнить Плюнула в итоге и забила", + "Так ждал эту посылку, а в итоге фигушки Китаец клянулся, что к 10 папреля придет", + "Везет Мои наушники по ходу сгинули", + "Ну, я слышал историю, как через 8 месяцев дошла посылка Реально, старая присказка родить быстрее выйдет скоро опять станет актуальной", + "Вот через Нидерланды или ещё через какую Европу гораздо меньше висит внутри России", + "Тут могу привести пример посылки через Гонконг, которая дошла за приемлемые 18 дней", + "Скорее широкий карман!", + "А этот статус означает, что посылка уже покинула данный цех", + "Это означает что они не смогли потерять ее там, но они постараются на следующей сортировке", + "Тоже жду посылку Блядь, только не Внуково!!!", + "Такая же фигня, с 25 марта в этом ебаном цеху, что там за дыра то черная блять", + "Видать можно не ждать уже", + "У нас с вами одинаковое время застревания в этом цеху проклятом Ну и нафига им мои трафареты", + "Нафига им коробочка для шитья", + "Вот суки! Моя осылка тоже там, выпущенная, блять, таможней Интересно, что другие посылки, другим людям, уже давно прошли и получены", + "У меня так посылка на таможне зависла на месяц 2412 поступила и 2701 ушла, а потом еще по городу неделю добиралась", + "Таки дошла!!! в июне!", + "посылка о которой я в ветке отписал", + "моя там же с 14 января", + "Здравствуйте Рекомендую уточнить у отправителя правильность трек номера Часто отправители дают номера чужих отправлений", + "Что за приложение У меня есть одно, но оно неработаеттак автоматически Только если я номер введу отслежки Тогда увижу", + "Та же история! 
А может это не китайцы А наши нанайцы!", + "Написал про мою пропажу посмотрим, как работает", + "Отпишись, плз, интересно", + "Ок Пока прислали что ждите в течение месяца", + "Спасибо Ничего страшного Такая же штука была, главное, что деньги вернули", + "а если я посылку отправляла как подарок, от чьего имени надо заполнять жалобу посылка затерялась с 2101, из Китая вышла 0801", + "От своего, там важен трек номер, но можно и от получателя Вообще разыскивать должен отправитель по идее, но продаваны с али этого не делают", + "Здравствуйте Заявление на розыск может подать либо отправитель, либо получатель", + "Здравствуйте Спасибо!", + "У меня такое бывает когда посылка застревает, пишу через приложение ПР жалобу, ответ редко приходит, но посылка обычно сразу начинает двигаться", + "Странно, думаю, что просто совпадение Не будет же РКН одновременно и отшивать и делать запрос на Почту", + "В том числе они пишут отписку ибо неправильно лезть по головам, по идее ты жалуешся на почту, а только потом пишеш уже в РКН", + "Так как была заявка они реагируют и пишут им типа ребята решите а тебе пишут отписку, поэтому работает нужен текст отписки я организую", + "А там же сайт Роскомнадзора указан, при чем тут Порча РФ", + "Так как они отвечают за них как ведомство контролируюшие их", + "Спасибо, надо попробовать", + "Это же Внуково! Ребята, почитайте отзывы, там посылки только так воруют У друга совсем недавно ми пятый там остался", + "У друга совсем недавно ми пятый там остался Не сразу понял Ваш комментарий", + "А, не, тогда всё нормально Просто вертолёт", + "Друг расстроился наверно", + "Это ж уже МИ 17 Ми 5 поменьше будет!", + "По нему наверное, уже какой нибудь Алёша разговаривает Шутка! 
А вообще, расскажи потом, чем дело закончилось", + "Грузчик с Почты России", + "Цех 1 это черная дыра", + "Забудь, у меня наушники с 1304 там зависли, подал на розыск", + "Здравствуйте Напишите, пожалуйста, трек номер Вашего отправления", + "Проверьте и мой трек, а то с 15 числа никаких новостей RB196799772SG", + "RB203787278SG и RU044440553HK Буду рад услышать комментарии, особенно обнадеживающие", + "Бля, у меня ми 5 там сейчас в сортировочном завис", + "Деньги хоть вернул Как узнал, что именно там", + "Не дошло уже много посылок Штук 15 Последнее место, где о ни засветились MR LC Vnukovo Cex 1 102976", + "не могу не поделиться по случаю своим треком каруселью, я уже ставки у друзей начала принимать", + "Аххаха, прикалываются видимо,", + "Заказываешь этот же товар у продавца, подтверждаешь получение Только китайца заранее предупреждаешь", + "Скорее всего сделает заказ и напишит в комментариях что то типа как и обещал оплатил", + "Если во время разговора тебя не послали значит это точно происки азиата", + "Штрих код может быть затерт, а номер телефона то там остался", + "Так она номер посылки назвала Непонятно, зачем Адрес тоже затерся, только номер телефона и уцелел ", + "хочется ответить только одно Это ли не чудо, алилуя Может все таки дойдет когда нибудь", + "Часы брат заказывал на алике Пришли неделю назад Спустя полтора года", + "Капец Это я еще оптимистично сказал что к ЭТОМУ новому году придет подарочек", + "Чудо, конечно Сейчас твоя посылка замироточит", + "А номер то, с которого звонили, не гуглил Может немного прояснит картину", + "Беда в том что об этом подумал спустя пару дней и номер найти уже не получилось Много входящих звонков по роду деятельности", + "А детализация Через личный кабинет", + "Посмотрим чем дело кончится так то у почты очень сомнительная репутация Поэтому и не хочу кого то обвинять", + "у почты очень сомнительная репутация Это ты им сейчас комплимент сделал", + "ТС, ну как там дела с посылкой", + "В пятницу проверял Без изменений", 
+ "То чувство когда твоя посылка пришла недавно в этот самый цех во внуково", + "Столько боли, сочувствия в каждом комментарии, описываемое одной фразой Как тебя понимаю, бро можете дорисовывать", + "Простите, а как вы это определили что они не отслеживаемые Думаю эта информация может пригодиться, и не только мне", + "Спасибо за подробный ответ!тянет на хороший пост Ну и желаю чтобы мелочёвка таки добралась до пункта назначения или диспут китайцу", + "Что же, про ограниченное кол во статусов я знал, но первый раз имею дело с YD треком Спасибо за ответ", + "Их в обратном порядке разбирают", + "Китайцы вообще специфичные ребята Обмануть русского как самособой радумеещееся", + "У меня тоже 21го во Внуково застряло Чую я, нерадивые китайцы покидали наше барахлишко в общий мешок, а наши умнички благополучно его просрали", + "Чек у китайца требовать", + "У меня тоже во Внуково пропало и ни слуху ни духу уже месяц", + "Я на форумах увидел инфу про это, где именно, к сожалению, сейчас и не вспомню RM все типо во Внуково таможню проходят, а RR в Екатеринбурге", + "Здравствуйте Уточните, пожалуйста, трек номер Вашего отправления", + "Трек номер в студию!", + "Почта России получает информацию о посылке в других странах от почтовых администраций этих стран Так что это 100% чудит грузоперевозчик", + "Всё проще определяется по последним буквам трека если в конце CH, то сайт почты напишет Швейцарию", + "В конце стоит CH Значит брешет сайт доставки 1 почта РФ", + "Большая радость, что Вы не повелись Никто звонить каждому не будет, это сколько нужно времени чтобы обзвонить всех у которых на ручной сортировке", + "До сих пор жду хотя с обещанного срока прошло дней 40 примерно", + "Советую открыть диспут на полный возврат и не соглашаться на возврат через PayPal или закрыть диспут по другой причине", + "@DmitriyMarkin , это норма, что посылка показывает Импорт в страну назначения, и уже 15 дней статус не обновляется ZA044150729HK", + "Здравствуйте Рекомендую Вам дождаться 
официального ответа", + "30 посылок прошли Мр Лц Внуково Цех 1 еще в середине марта и до сих пор не пришли в питер, теперь стал сильнее переживать", + "Здравствуйте Если у Вас регистрируемые отслеживаемые отправления, рекомендую Вам подать заявление на розыск для уточнения обстоятельств пересылки", + "Этот цех чёрная дыра Съела уже две посылки Причём совсем", + "А моя прокатилась до моего города и обратно уехала", + "Ну чё там с посылкой", + "ля они её там из угла в угол таскают что ли", + "Был у меня случай, заказывал лампу Шла очень долго Ну и подал я спор, деньги вернули А потом через 4 месяца звонят с почты говорят лампа приехала", + "@DmitriyMarkin поможете", + "У меня карабинчики с вертлюжками чуть больше года шли, так что почта может все", + "Подписался, будем ждать посылку вмести", + "просто трудности перевода", + "Согласен, телефон это всегда личное и никак к заказу не должен относиться, именно этот звонок очень странен", + "Вот тоже жду, 4 из 5 посылок приехали, а одна никак доехать не может А заказывал все в один день", + "Здравствуйте Напишите трек номер отправления Посмотрим, что с ним", + "Здравствуйте, вот трэк RU610423345RM, очень рад буду если поможете", + "Ох уж этот Мр Лц Внуково Цех 1", + "Моя в декабре перестала отслеживаться прям на въезде влете в Россию У нас тут везде черная дыра Так и не приехала", + "а как оплатить товар продавцу если спор закрыт", + "ещё раз положить в корзину, проверить, чтоб цена была, какая надо, оплатить и отметить получение", + "Главное, что ты деньги вернул У меня в одном заказе даже спор не открывается", + "Пока увеличил срок защиты на десять дней, а потом будем смотреть", + "Точно так же во Внуково посылка тусила 13 дней, а потом за 2 дня доехала до Подмосковья", + "620960 и 620970 это Екатеринбург, но не отделения, а сортировочные центры", + "У меня ни одна январьская посылка не дошла, хорошо хоть заказывал по мелочи Почта израильская", + "можно вопрос вы в израиле или исполнитель", + "Странно это все Я вот тоже 
уже неделю смотрю как посылка выехала из Внуково, хотя реально из Внуково до МСК идет за день", + "А я живу недалеко от Внуково и до меня обычно посылки оттуда доходили за день Сейчас же две три посылки там больше неделе висят", + "Со 204 у меня посылка также висит во Внуково и тишина", + "я ни разу не открывала спор, можно я потом спрошу у тебя совета, если вдруг все таки придется его открывать", + "Если так себя ведет думаю, можно открыть спор, основание трек не отслеживается", + "Али закрыл спор в мою пользу и деньги мне вернут 3 20 дней Меня смущает, что по третьему треку есть движение", + "Жди возврата денег, если приедет товар, тут два варианта, либо оставить его себе и забить или написать китайцу и оплатить", + "Потому что спор надо не закрывать, а отменять Тогда дадут открыть еще сколько угодно раз", + "Спасибо за информацию, у меня сумма не большая по всё таки неприятно Буду в будущем аккуратнее", + "Логично, я как то не подумал что можно просто отменить Как нибудь поэксперементирую Спасибо", + "Здравствуйте Я официальный представитель Почты России в социальных сетях Напишите, пожалуйста, трек номер Вашего отправления", + "Уважаемый Дмитрий! 
Разъясните пожалуйста, что происходит во Внуково", + "Ну вот моя посылка уже месяц лежит в этом зловещем цеху LM315468226CN Не уверен что это поможет но мало ли", + "с 5 апреля посылка находится во Внуково без движения RL206174583EE", + "вы должны знать в чем причина сейчас абсолютно все посылки покидая таможню пропадают на неделю, иногда даже на месяц с чем это связано", + "Что интересно не все Телефон и экшенкамера лежали неделю, а носки, запасные браслеты для мибэнда и детское платье день", + "Ну кому интересно рассматривать твои носки и детское платье", + "Тоже в ебурге застряла", + "Не переживай, она зависнет в доставке по РФ еще на хз сколько времени, у меня так наушники и чехол для телефона висят с 13 и 15 апреля", + "Дмитрий, прокомментируйте пожалуйста ситуацию с Екатеринбургом По запросу трек номер скину в личку #comment_85905131", + "так уже с 070217 если стоимость больше 2$ посылка должна отслеживаться чем не доп страховка", + "@moderator , тег Почта России", + "Еще и посылочка придет профит", + "Ну по факту продавец не выполнил условия доставки в указанный срок Мне sjcam моя шла 102 дня, а спор закрыли в мою пользу на 80 день", + "можно ждать подарок к Новому Году", + "Все зависит от моральных качеств покупателя Если ты жлоб, открывай спор", + "А вы предлагаете через 60 дней с момента заказа отправить деньги продавцу и ждать чуда", + "Полностью тебя поддерживаю", + "Пока срок не пройдет, продавец не получит деньги Продление заказа увеличивает срок Что тут непонятного ", + "Вот за это молодец Надо было продление все же просить" + ], + "punctuation": [ + ".", + "?", + "!" 
+ ] + } +} diff --git a/annotators/IntentCatcherTransformers/intents_model_dp_config_RU.json b/annotators/IntentCatcherTransformers/intents_model_dp_config_RU.json new file mode 100644 index 0000000000..b80b39e99f --- /dev/null +++ b/annotators/IntentCatcherTransformers/intents_model_dp_config_RU.json @@ -0,0 +1,190 @@ +{ + "dataset_reader": { + "class_name": "intents_dataset_reader:IntentsJsonReader", + "data_path": "./", + "train": "intent_phrases_RU.json", + "generated_data_path": "./generated_data" + }, + "dataset_iterator": { + "class_name": "basic_classification_iterator", + "seed": 42, + "split_seed": 23, + "field_to_split": "train", + "split_fields": [ + "train", + "valid" + ], + "split_proportions": [ + 0.8, + 0.2 + ] + }, + "chainer": { + "in": [ + "x" + ], + "in_y": [ + "y" + ], + "pipe": [ + { + "class_name": "torch_transformers_preprocessor", + "vocab_file": "{TRANSFORMER}", + "do_lower_case": true, + "max_seq_length": 64, + "in": [ + "x" + ], + "out": [ + "bert_features" + ] + }, + { + "id": "classes_vocab", + "class_name": "simple_vocab", + "fit_on": [ + "y" + ], + "save_path": "{MODEL_PATH}/classes.dict", + "load_path": "{MODEL_PATH}/classes.dict", + "in": [ + "y" + ], + "out": [ + "y_ids" + ] + }, + { + "id": "my_one_hotter", + "in": [ + "y_ids" + ], + "out": [ + "y_onehot" + ], + "class_name": "one_hotter", + "depth": "#classes_vocab.len", + "single_vector": true + }, + { + "class_name": "torch_transformers_classifier", + "n_classes": "#classes_vocab.len", + "return_probas": true, + "one_hot_labels": true, + "multilabel": true, + "pretrained_bert": "{TRANSFORMER}", + "save_path": "{MODEL_PATH}/model", + "load_path": "{MODEL_PATH}/model", + "optimizer": "AdamW", + "optimizer_parameters": { + "lr": 1e-05 + }, + "learning_rate_drop_patience": 5, + "learning_rate_drop_div": 2.0, + "in": [ + "bert_features" + ], + "in_y": [ + "y_onehot" + ], + "out": [ + "y_pred_probas" + ] + }, + { + "in": [ + "y_pred_probas" + ], + "out": [ + "y_pred_ids" + ], + 
"class_name": "proba2labels", + "max_proba": false, + "confidence_threshold": 0.5 + }, + { + "ref": "my_one_hotter", + "in": [ + "y_pred_ids" + ], + "out": [ + "y_pred_onehot" + ] + }, + { + "in": [ + "y_pred_ids" + ], + "out": [ + "y_pred_labels" + ], + "ref": "classes_vocab" + } + ], + "out": [ + "y_pred_labels", + "y_pred_probas" + ] + }, + "train": { + "epochs": 5, + "batch_size": 64, + "metrics": [ + { + "name": "accuracy", + "inputs": [ + "y", + "y_pred_labels" + ] + }, + { + "name": "f1_weighted", + "inputs": [ + "y_onehot", + "y_pred_onehot" + ] + }, + { + "name": "f1_macro", + "inputs": [ + "y_onehot", + "y_pred_onehot" + ] + }, + { + "name": "roc_auc", + "inputs": [ + "y_onehot", + "y_pred_probas" + ] + } + ], + "validation_patience": 5, + "val_every_n_epochs": 1, + "log_every_n_epochs": 1, + "show_examples": false, + "evaluation_targets": [ + "train", + "valid" + ], + "class_name": "torch_trainer" + }, + "metadata": { + "imports": [ + "intents_dataset_reader" + ], + "variables": { + "TRANSFORMER": "DeepPavlov/rubert-base-cased-conversational", + "ROOT_PATH": "~/.deeppavlov", + "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", + "MODELS_PATH": "{ROOT_PATH}/models", + "MODEL_PATH": "{MODELS_PATH}/classifiers/intents_model_RU_v0" + }, + "download": [ + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/intents_model_RU_v0.tar.gz", + "subdir": "{MODELS_PATH}/classifiers" + } + ] + } + } diff --git a/annotators/IntentCatcherTransformers/test.py b/annotators/IntentCatcherTransformers/test.py index 470d6f428d..f31e356b0b 100644 --- a/annotators/IntentCatcherTransformers/test.py +++ b/annotators/IntentCatcherTransformers/test.py @@ -2,11 +2,17 @@ import requests import json +from os import getenv + +INTENT_PHRASES_PATH = getenv("INTENT_PHRASES_PATH") def main_test(): url = "http://0.0.0.0:8014/detect" - tests = json.load(open("tests.json")) + if "RU" in INTENT_PHRASES_PATH: + tests = json.load(open("tests_RU.json")) + else: + tests = json.load(open("tests.json")) 
for test in tests: r = requests.post(url=url, json={"sentences": [[test["sentence"]]]}) assert r.ok @@ -15,7 +21,7 @@ def main_test(): assert ( data.get(test["intent"], {"detected": 0}).get("detected", 0) == 1 and sum([v.get("detected", 0) for v in data.values()]) == 1 - ), print(f"TEST FAILED!\nTest: {test}\nResult:{data}") + ), print(f"TEST FAILED!\nTest: {test}\nResult:{json.dumps(data, indent=2)}") else: assert all([intent["detected"] == 0 for intent in data.values()]), f"test: {test}\nprediction: {data}" print("Success") diff --git a/annotators/IntentCatcherTransformers/tests_RU.json b/annotators/IntentCatcherTransformers/tests_RU.json new file mode 100644 index 0000000000..42ff5bcdc7 --- /dev/null +++ b/annotators/IntentCatcherTransformers/tests_RU.json @@ -0,0 +1,66 @@ +[ + { + "sentence": "давай поговорим о чем-нибудь другом.", + "intent": "topic_switching" + }, + { + "sentence": "пока-пока.", + "intent": "exit" + }, + { + "sentence": "повтори это.", + "intent": "repeat" + }, + { + "sentence": "ага конечно.", + "intent": "yes" + }, + { + "sentence": "однозначно нет.", + "intent": "no" + }, + { + "sentence": "о чем ты вообще/", + "intent": "what_are_you_talking_about" + }, + { + "sentence": "итак как же тебя зовут?", + "intent": "what_is_your_name" + }, + { + "sentence": "откуда ты родом", + "intent": "where_are_you_from" + }, + { + "sentence": "о чем хочешь поболтать сегодня?", + "intent": "choose_topic" + }, + { + "sentence": "и что ты можешь", + "intent": "what_can_you_do" + }, + { + "sentence": "кто создал тебя?", + "intent": "who_made_you" + }, + { + "sentence": "у тебя есть работа?", + "intent": "what_is_your_job" + }, + { + "sentence": "давай поговорим о ромашках.", + "intent": "lets_chat_about" + }, + { + "sentence": "люблю пирожки.", + "intent": null + }, + { + "sentence": "ты разговариваешь?", + "intent": null + }, + { + "sentence": "что из этого собака?", + "intent": null + } +] diff --git a/annotators/NER_ru/Dockerfile 
b/annotators/NER_ru/Dockerfile new file mode 100644 index 0000000000..1922d4ef23 --- /dev/null +++ b/annotators/NER_ru/Dockerfile @@ -0,0 +1,36 @@ +FROM tensorflow/tensorflow:1.15.2-gpu + +RUN apt-key del 7fa2af80 && \ + rm -f /etc/apt/sources.list.d/cuda*.list && \ + curl https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-keyring_1.0-1_all.deb \ + -o cuda-keyring_1.0-1_all.deb && \ + dpkg -i cuda-keyring_1.0-1_all.deb + +RUN apt-get -y update && \ + apt-get install -y software-properties-common && \ + apt-get update && apt-get install git -y + +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + +ARG CONFIG +ARG COMMIT=0.13.0 +ARG PORT +ARG SRC_DIR +ARG SED_ARG=" | " + +ENV CONFIG=$CONFIG +ENV PORT=$PORT + +COPY ./annotators/NER_ru/requirements.txt /src/requirements.txt +RUN pip install -r /src/requirements.txt + +RUN pip install git+https://github.com/deepmipt/DeepPavlov.git@${COMMIT} + +COPY $SRC_DIR /src + +WORKDIR /src + +RUN python -m deeppavlov install $CONFIG + +CMD gunicorn --workers=1 --timeout 500 server:app -b 0.0.0.0:8021 diff --git a/annotators/NER_ru/ner_uncased_rus_bert_torch.json b/annotators/NER_ru/ner_uncased_rus_bert_torch.json new file mode 100644 index 0000000000..800b6629f8 --- /dev/null +++ b/annotators/NER_ru/ner_uncased_rus_bert_torch.json @@ -0,0 +1,155 @@ +{ + "dataset_reader": { + "class_name": "conll2003_reader", + "data_path": "{DOWNLOADS_PATH}/total_rus/", + "dataset_name": "collection_rus", + "provide_pos": false + }, + "dataset_iterator": { + "class_name": "data_learning_iterator" + }, + "chainer": { + "in": [ + "x" + ], + "in_y": [ + "y" + ], + "pipe": [ + { + "class_name": "torch_transformers_ner_preprocessor", + "vocab_file": "{TRANSFORMER}", + "do_lower_case": true, + "max_seq_length": 512, + "max_subword_length": 15, + "token_masking_prob": 0.0, + "in": [ + "x" + ], + "out": [ + "x_tokens", + "x_subword_tokens", + "x_subword_tok_ids", + "startofword_markers", + "attention_mask" + ] + }, + { + "id": 
"tag_vocab", + "class_name": "simple_vocab", + "unk_token": [ + "O" + ], + "pad_with_zeros": true, + "save_path": "{MODEL_PATH}/tag.dict", + "load_path": "{MODEL_PATH}/tag.dict", + "fit_on": [ + "y" + ], + "in": [ + "y" + ], + "out": [ + "y_ind" + ] + }, + { + "class_name": "torch_transformers_sequence_tagger", + "n_tags": "#tag_vocab.len", + "pretrained_bert": "{TRANSFORMER}", + "attention_probs_keep_prob": 0.5, + "return_probas": false, + "encoder_layer_ids": [ + -1 + ], + "optimizer": "AdamW", + "optimizer_parameters": { + "lr": 2e-05, + "weight_decay": 1e-06, + "betas": [ + 0.9, + 0.999 + ], + "eps": 1e-06 + }, + "clip_norm": 1.0, + "min_learning_rate": 1e-07, + "learning_rate_drop_patience": 30, + "learning_rate_drop_div": 1.5, + "load_before_drop": true, + "save_path": "{MODEL_PATH}/model", + "load_path": "{MODEL_PATH}/model", + "in": [ + "x_subword_tok_ids", + "attention_mask", + "startofword_markers" + ], + "in_y": [ + "y_ind" + ], + "out": [ + "y_pred_ind" + ] + }, + { + "ref": "tag_vocab", + "in": [ + "y_pred_ind" + ], + "out": [ + "y_pred" + ] + } + ], + "out": [ + "x_tokens", + "y_pred" + ] + }, + "train": { + "epochs": 30, + "batch_size": 10, + "metrics": [ + { + "name": "ner_f1", + "inputs": [ + "y", + "y_pred" + ] + }, + { + "name": "ner_token_f1", + "inputs": [ + "y", + "y_pred" + ] + } + ], + "validation_patience": 100, + "val_every_n_batches": 20, + "log_every_n_batches": 20, + "show_examples": false, + "pytest_max_batches": 2, + "pytest_batch_size": 8, + "evaluation_targets": [ + "valid", + "test" + ], + "class_name": "torch_trainer" + }, + "metadata": { + "variables": { + "ROOT_PATH": "~/.deeppavlov", + "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", + "MODELS_PATH": "{ROOT_PATH}/models", + "TRANSFORMER": "DeepPavlov/rubert-base-cased-conversational", + "MODEL_PATH": "{MODELS_PATH}/ner_uncased_rus_bert_torch_v1" + }, + "download": [ + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/ner_uncased_rus_bert_torch_v1.tar.gz", + "subdir": 
"{MODELS_PATH}" + } + ] + } +} diff --git a/annotators/NER_ru/requirements.txt b/annotators/NER_ru/requirements.txt new file mode 100644 index 0000000000..dce68f4e4b --- /dev/null +++ b/annotators/NER_ru/requirements.txt @@ -0,0 +1,8 @@ +sentry-sdk[flask]==0.14.1 +flask==1.1.1 +gunicorn==19.9.0 +requests==2.22.0 +numpy==1.15.4 +itsdangerous==2.0.1 +jinja2<=3.0.3 +Werkzeug<=2.0.3 \ No newline at end of file diff --git a/annotators/NER_ru/server.py b/annotators/NER_ru/server.py new file mode 100644 index 0000000000..e44c7653bb --- /dev/null +++ b/annotators/NER_ru/server.py @@ -0,0 +1,85 @@ +import logging +import os +import time + +import numpy as np +import sentry_sdk +from flask import Flask, jsonify, request + +from deeppavlov import build_model + +sentry_sdk.init(os.getenv("SENTRY_DSN")) + +logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) +logger = logging.getLogger(__name__) +app = Flask(__name__) + +config_name = os.getenv("CONFIG") + +try: + ner_model = build_model(config_name, download=True) + r = "я видела ивана в москве" + logger.info(f"Original: {r}. 
NER: {ner_model([r])}") + logger.info("ner ru model is loaded.") +except Exception as e: + sentry_sdk.capture_exception(e) + logger.exception(e) + raise e + + +def convert_prediction(s, token, tag): + start_pos = s.find(token) + return { + "confidence": 1, + "text": token, + "type": tag.replace("B-", "").replace("I-", ""), + "start_pos": start_pos, + "end_pos": start_pos + len(token), + } + + +def get_result(request): + st_time = time.time() + last_utterances = request.json["last_utterances"] + logger.info(f"input (the last utterances): {last_utterances}") + + samples = [] + dialog_ids = [] + for i, utterance_sents in enumerate(last_utterances): + for sent in utterance_sents: + samples.append(sent) + dialog_ids.append(i) + + tokens_batch, tags_batch = ner_model(samples) + good_preds = [ + [convert_prediction(s, token, tag) for token, tag in zip(tokens, tags) if tag != "O"] + for s, tokens, tags in zip(samples, tokens_batch, tags_batch) + ] + dialog_ids = np.array(dialog_ids) + + ret = [] + for i, utterance_sents in enumerate(last_utterances): + curr_ids = np.where(dialog_ids == i)[0] + curr_preds = [good_preds[curr_id] for curr_id in curr_ids] + ret.append(curr_preds) + + logger.info(f"NER output: {ret}") + total_time = time.time() - st_time + logger.info(f"NER exec time: {total_time: .3f}s") + return ret + + +@app.route("/ner", methods=["POST"]) +def respond(): + result = get_result(request) + return jsonify(result) + + +@app.route("/ner_batch", methods=["POST"]) +def respond_batch(): + result = get_result(request) + return jsonify([{"batch": result}]) + + +if __name__ == "__main__": + app.run(debug=False, host="0.0.0.0", port=8021) diff --git a/annotators/NER_ru/test.sh b/annotators/NER_ru/test.sh new file mode 100755 index 0000000000..b37c67d44c --- /dev/null +++ b/annotators/NER_ru/test.sh @@ -0,0 +1,4 @@ +#!/bin/bash + + +python test_server.py diff --git a/annotators/NER_ru/test_server.py b/annotators/NER_ru/test_server.py new file mode 100644 index 
0000000000..6d66f1b095 --- /dev/null +++ b/annotators/NER_ru/test_server.py @@ -0,0 +1,29 @@ +import requests + + +def main(): + url = "http://0.0.0.0:8021/ner" + + request_data = [{"last_utterances": [["я видела ивана в москве"]]}] + + gold_results = [ + [ + [ + {"confidence": 1, "end_pos": 14, "start_pos": 9, "text": "ивана", "type": "PER"}, + {"confidence": 1, "end_pos": 23, "start_pos": 17, "text": "москве", "type": "LOC"}, + ] + ] + ] + + count = 0 + for data, gold_result in zip(request_data, gold_results): + result = requests.post(url, json=data).json() + if result == gold_result: + count += 1 + + assert count == len(request_data) + print("Success") + + +if __name__ == "__main__": + main() diff --git a/annotators/SentSeg/.gitignore b/annotators/SentSeg/.gitignore index 3bd9c7d4a8..0e0fe27115 100644 --- a/annotators/SentSeg/.gitignore +++ b/annotators/SentSeg/.gitignore @@ -1 +1,2 @@ -model.* \ No newline at end of file +model.* +data diff --git a/annotators/SentSeg/data_preprocessing.py b/annotators/SentSeg/data_preprocessing.py new file mode 100644 index 0000000000..c079426895 --- /dev/null +++ b/annotators/SentSeg/data_preprocessing.py @@ -0,0 +1,309 @@ +import string + +from nltk.tokenize import sent_tokenize, word_tokenize + + +# Segmentation task +# dataset: one sample = (list of tokens without punctuation, list of tags): +# [['hi', 'alexa', 'what', 'time', 'is', 'it']] +# [['B-S', 'O', 'B-Q', 'O', 'O', 'O']] + +# Convert the cornellmoviequotes dataset into the format required by the segmentation task + + +def preprocess(raw_text): + # input: raw text consisting of sentences without punctuation + # output: x - list of tokens, y - list of labels + tmp = sent_tokenize(raw_text) + + # remove long lines which consist of more than three sentences + if len(tmp) > 3: + # print(tmp) + return [], [] + + tmp = [word_tokenize(sent) for sent in tmp] + + x, y = [], [] + + for sent in tmp: + if sent[-1] == "?": + y.append("B-Q") + # elif sent[-1].endswith('!'): + # 
y.append('B-E') + else: + y.append("B-S") + + x.extend(sent[:-1]) + y.extend(["O"] * (len(sent) - 2)) + return x, y + + +def convert_cornellmoviequotes(): + with open(file="../datasets/cornellmoviequotes/moviequotes.scripts.txt", mode="r", encoding="latin-1") as f: + lines = f.readlines() + X, Y = [], [] + + for line in lines: + tmp = line.split("+++$+++")[-1].strip().lower() + # print(tmp) + + x, y = preprocess(tmp) + + # print(x) + # print(y) + # print('\n') + if x != []: + X.append(x) + Y.append(y) + + with open(file="../datasets/cornellmoviequotes.txt", mode="w", encoding="utf-8") as fo: + for x, y in zip(X, Y): + for word, label in zip(x, y): + fo.write("{}\t{}\n".format(word, label)) + fo.write("\n") + + +def convert_dailydialog(): + X, Y = [], [] + with open(file="../datasets/dailydialog.txt", mode="r", encoding="utf-8") as f: + lines = f.readlines() + # print(lines[:10]) + # print(len(lines)) + for line in lines: + tmp = line.strip().lower() + if len(tmp) == 0: + continue + # print(tmp) + + x, y = preprocess(tmp) + + # print(x) + # print(y) + # print('\n') + if x != []: + X.append(x) + Y.append(y) + + with open(file="../datasets/dailydialog_sentseg.txt", mode="w", encoding="utf-8") as fo: + for x, y in zip(X, Y): + for word, label in zip(x, y): + fo.write("{}\t{}\n".format(word, label)) + fo.write("\n") + + +def data_split(x, y, dev_size, test_size): + from sklearn.model_selection import train_test_split + + X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=test_size, random_state=42) + X_train, X_dev, y_train, y_dev = train_test_split( + X_train, y_train, test_size=dev_size / (1 - test_size), random_state=42 + ) + return X_train, y_train, X_dev, y_dev, X_test, y_test + + +def split_dataset(dataset_name="cornellmoviequotes"): + X, Y = [], [] + x, y = [], [] + + with open(file=f"data/{dataset_name}.txt", mode="r", encoding="utf-8") as f: + for line in f: + if line.strip() == "": + X.append(x) + Y.append(y) + x, y = [], [] + else: + items = 
line.split() + x.append(items[0]) + y.append(items[1]) + + xtrain, ytrain, xdev, ydev, xtest, ytest = data_split(X, Y, 0.1, 0.1) + # print(xtrain[:10]) + # print(ytrain[:10]) + # print(len(xtrain), len(ytrain), len(xdev), len(ydev), len(xtest), len(ytest)) + + def write2file(sents, labels, filename): + with open(file=filename, mode="w", encoding="utf-8") as fo: + for s, l in zip(sents, labels): + for word, tag in zip(s, l): + fo.write("{}\t{}\n".format(word, tag)) + fo.write("\n") + + write2file(xtrain, ytrain, f"data/{dataset_name}_train.txt") + write2file(xdev, ydev, f"data/{dataset_name}_dev.txt") + write2file(xtest, ytest, f"data/{dataset_name}_test.txt") + + +def create_dicts(inp_file, out_file): + word_counts = {} + + with open(file=inp_file, mode="r", encoding="utf-8") as f: + for line in f: + words = line.strip().split() + if len(words) > 0: + if words[0] not in word_counts: + word_counts[words[0]] = 1 + else: + word_counts[words[0]] += 1 + + listofTuples = sorted(word_counts.items(), key=lambda x: x[1]) + + words = ["", ""] + for elem in listofTuples: + if elem[1] > 3: + words.append(elem[0]) + + word2id = {k: v for (v, k) in enumerate(words)} + id2word = {k: v for (k, v) in enumerate(words)} + + chars = ["", ""] + for word in word2id.keys(): + for c in word: + if c not in chars: + chars.append(c) + + char2id = {k: v for (v, k) in enumerate(chars)} + id2char = {k: v for (k, v) in enumerate(chars)} + + tag2id = {"": 0, "B-S": 1, "B-Q": 2, "O": 3} + id2tag = {0: "", 1: "B-S", 2: "B-Q", 3: "O"} + + print(word2id) + print(char2id) + print(len(word2id), len(id2word), len(char2id), len(id2char)) + + import pickle + + with open(out_file, "wb") as f: + pickle.dump( + { + "word2id": word2id, + "id2word": id2word, + "char2id": char2id, + "id2char": id2char, + "tag2id": tag2id, + "id2tag": id2tag, + }, + f, + ) + + +def data_statistic(file): + stat = {"samples": 0, "total_words": 0, "B-S": 0, "B-Q": 0, "O": 0} + with open(file=file, mode="r") as f: + for line in f: + 
if len(line.strip()) > 0: + word, tag = line.strip().split("\t") + stat[tag] += 1 + stat["total_words"] += 1 + else: + stat["samples"] += 1 + + print(stat) + + +def create_dailydialog_for_deeppavlov(): + with open( + file="../datasets/ijcnlp_dailydialog/dailydialog_for_deeppavlov/dailydialog_deeppavlov2.txt", + mode="w", + encoding="utf-8", + ) as fo: + for dialog in open( + file="../datasets/ijcnlp_dailydialog/dialogues_text.txt", mode="r", encoding="utf-8" + ).readlines(): + utterances = dialog.lower().replace("! ?", "!").replace("? !", "?").replace("!", ".").split("__eou__")[:-1] + for utt in utterances: + if len(utt) > 200: + continue + x, y = "", "" + s = word_tokenize(utt) + for word in s: + if word in [".", "?", "!"]: + y += word + " " + elif word not in string.punctuation: + x += word + " " + y += word + " " + if y[-2] in [".", "?", "!"]: + fo.write("{} [SEP] {}\n".format(x[:-1], y[:-1])) + + # if len(y) == 0: + # continue + # y = y.replace("!", ".").replace(",", "").replace(" ’ ", "'").replace(" ", " ").strip() + # if y[-1] not in [".", "?"]: + # print(y) + # x = y.replace("?", "").replace(".", "").replace("!", "").replace(" ", " ").strip() + # if len(x.strip()) > 0: + # fo.write("{} [SEP] {}\n".format(x, y)) + + +def split_dailydialog_for_deeppavlov(): + with open( + file="../datasets/ijcnlp_dailydialog/dailydialog_for_deeppavlov/dailydialog_deeppavlov2.txt", + mode="r", + encoding="utf-8", + ) as f: + samples = f.readlines() + n = len(samples) + train = samples[: (int)(n * 0.8)] + val = samples[len(train) : (int)(n * 0.9)] + test = samples[len(train) + len(val) :] + print(len(samples), len(train), len(val), len(test)) + + with open( + file="../datasets/ijcnlp_dailydialog/dailydialog_for_deeppavlov/train2.txt", mode="w", encoding="utf-8" + ) as fo: + fo.writelines(train) + with open( + file="../datasets/ijcnlp_dailydialog/dailydialog_for_deeppavlov/valid2.txt", mode="w", encoding="utf-8" + ) as fo: + fo.writelines(val) + with open( + 
file="../datasets/ijcnlp_dailydialog/dailydialog_for_deeppavlov/test2.txt", mode="w", encoding="utf-8" + ) as fo: + fo.writelines(test) + + +# convert = {"Q": "?", "S": ".", "": ""} +# def SentSegRestoreSent(x, y): +# assert len(x) == len(y) +# if len(y) == 0: +# return "" +# sent = x[0] +# punct = "" if y[0] == "O" else convert[y[0][-1]] +# for word, tag in zip(x[1:], y[1:]): +# if tag != "O": +# sent += punct +# punct = convert[tag[-1]] +# sent += " " + word +# sent += punct + +# return sent + +# with open(file="/home/theanh/.deeppavlov/downloads/sentseg_dailydialog/train.txt", mode="w", encoding="utf-8") as fo: +# x, y = [], [] +# for line in open(file="models/dailydialog_811/train.txt", mode="r", encoding="utf-8").readlines(): +# items = line.strip().split() +# if len(items) == 0: +# if len(x) > 0: +# xs = " ".join(x) +# ys = SentSegRestoreSent(x, y) +# fo.write(f"{xs} [SEP] {ys}\n") +# x, y = [], [] +# else: +# x.append(items[0].strip()) +# y.append(items[1].strip()) + + +# import pickle +# print(pickle.load(open("models/dailydialog_811/params.pkl", "rb"))) +# +# +# with open(file="/home/theanh/.deeppavlov/downloads/sentseg_dailydialog/test.txt", mode="w", encoding="utf-8") as fo: +# for line in open(file="models/dailydialog_811/test.txt", mode="r", encoding="utf-8").readlines(): +# if len(line.strip()) > 0: +# line = line.replace("B-Q", "B-?").replace("B-S", "B-.") +# fo.write(line) + + +create_dailydialog_for_deeppavlov() + +split_dailydialog_for_deeppavlov() diff --git a/annotators/entity_detection_rus/Dockerfile b/annotators/entity_detection_rus/Dockerfile new file mode 100644 index 0000000000..27cb71c743 --- /dev/null +++ b/annotators/entity_detection_rus/Dockerfile @@ -0,0 +1,24 @@ +FROM deeppavlov/base-gpu:0.12.1 + +ARG CONFIG +ARG PORT +ARG SRC_DIR +ARG SED_ARG=" | " + +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + +ENV CONFIG=$CONFIG +ENV PORT=$PORT + +COPY ./annotators/entity_detection_rus/requirements.txt /src/requirements.txt +RUN pip install -r 
/src/requirements.txt + +COPY $SRC_DIR /src + +WORKDIR /src +RUN python -m deeppavlov install $CONFIG + +RUN sed -i "s|$SED_ARG|g" "$CONFIG" + +CMD gunicorn --workers=1 --timeout 500 server:app -b 0.0.0.0:8103 diff --git a/annotators/entity_detection_rus/entity_detection_parser.py b/annotators/entity_detection_rus/entity_detection_parser.py new file mode 100644 index 0000000000..fe178613ea --- /dev/null +++ b/annotators/entity_detection_rus/entity_detection_parser.py @@ -0,0 +1,242 @@ +# Copyright 2017 Neural Networks and Deep Learning lab, MIPT +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import List, Tuple, Union, Dict +from collections import defaultdict + +import numpy as np + +from deeppavlov.core.commands.utils import expand_path +from deeppavlov.core.common.registry import register +from deeppavlov.core.models.component import Component + + +@register("question_sign_checker") +class QuestionSignChecker(Component): + """This class adds question sign if it is absent or replaces dot with question sign""" + + def __init__(self, **kwargs): + pass + + def __call__(self, questions: List[str]) -> List[str]: + questions_sanitized = [] + for question in questions: + if not question.endswith("?"): + if question.endswith("."): + question = question[:-1] + "?" + else: + question += "?" 
+ questions_sanitized.append(question) + return questions_sanitized + + +@register("entity_detection_parser") +class EntityDetectionParser(Component): + """This class parses probabilities of tokens to be a token from the entity substring.""" + + def __init__( + self, + o_tag: str, + tags_file: str, + entity_tags: List[str] = None, + ignore_points: bool = False, + return_entities_with_tags: bool = False, + add_nouns: bool = False, + thres_proba: float = 0.8, + misc_proba: float = 0.8, + **kwargs + ): + """ + + Args: + entity_tags: tags for entities + o_tag: tag for tokens which are neither entities nor types + tags_file: filename with NER tags + ignore_points: whether to consider points as separate symbols + return_entities_with_tags: whether to return a dict of tags (keys) and list of entity substrings (values) + or simply a list of entity substrings + add_nouns: whether to additionally tag nouns with B-MISC + thres_proba: if the probability of the tag is less than thres_proba, we assign the tag as 'O' + misc_proba: default probability assigned to entities whose tag has no column in the model output + (e.g. the B-MISC tag added for nouns) + """ + self.entity_tags = entity_tags + self.o_tag = o_tag + self.ignore_points = ignore_points + self.return_entities_with_tags = return_entities_with_tags + self.thres_proba = thres_proba + self.misc_proba = misc_proba + self.tag_ind_dict = {} + with open(str(expand_path(tags_file))) as fl: + tags = [line.split("\t")[0] for line in fl.readlines()] + if add_nouns: + tags.append("B-MISC") + if self.entity_tags is None: + self.entity_tags = list( + {tag.split("-")[1] for tag in tags if len(tag.split("-")) > 1}.difference({self.o_tag}) + ) + + self.entity_prob_ind = { + entity_tag: [i for i, tag in enumerate(tags) if entity_tag in tag] for entity_tag in self.entity_tags + } + self.tags_ind = {tag: i for i, tag in enumerate(tags)} + self.et_prob_ind = [i for tag, ind in self.entity_prob_ind.items() for i in ind] + for entity_tag, tag_ind in self.entity_prob_ind.items(): + for ind in tag_ind: + self.tag_ind_dict[ind] = entity_tag + self.tag_ind_dict[0] = self.o_tag + + def __call__( + self, + 
question_tokens_batch: List[List[str]], + tokens_info_batch: List[List[str]], + tokens_probas_batch: np.ndarray, + ) -> Tuple[ + List[Union[List[str], Dict[str, List[str]]]], + List[List[str]], + List[Union[List[int], Dict[str, List[List[int]]]]], + List[List[float]], + ]: + """ + + Args: + question_tokens_batch: batch of tokenized questions + tokens_info_batch: batch of tag labels predicted for question tokens + tokens_probas_batch: batch of probabilities of question tokens + Returns: + Batch of dicts where keys are tags and values are substrings corresponding to tags + Batch of substrings which correspond to entity types + Batch of lists of token indices in the text which correspond to entities + Batch of lists of token confidences + """ + entities_batch = [] + positions_batch = [] + probas_batch = [] + tokens_conf_batch = [] + for probas in tokens_probas_batch: + tokens_conf = [round(1.0 - proba[0], 4) for proba in probas] + tokens_conf_batch.append(tokens_conf) + for tokens, tokens_info, probas in zip(question_tokens_batch, tokens_info_batch, tokens_probas_batch): + entities, positions, entities_probas = self.entities_from_tags(tokens, tokens_info, probas) + entities_batch.append(entities) + positions_batch.append(positions) + probas_batch.append(entities_probas) + return entities_batch, positions_batch, probas_batch, tokens_conf_batch + + def tags_from_probas(self, tokens, probas): + """ + This method makes a list of tags from a list of probas for tags + + Args: + probas: probabilities for tokens to belong to particular tags + + Returns: + list of tags for tokens + list of probabilities of these tags + """ + tags = [] + tag_probas = [] + for token, proba in zip(tokens, probas): + tag_num = np.argmax(proba) + if tag_num in self.et_prob_ind: + if proba[tag_num] < self.thres_proba: + tag_num = 0 + else: + tag_num = 0 + tags.append(self.tag_ind_dict[tag_num]) + tag_probas.append(proba[tag_num]) + + return tags, tag_probas + + def entities_from_tags(self, tokens, tags, tag_probas): + """ + This method makes lists of substrings corresponding to entities and entity types + and a list of indices of tokens 
which correspond to entities + + Args: + tokens: list of tokens of the text + tags: list of tags for tokens + tag_probas: list of probabilities of tags + + Returns: + list of entity substrings (or a dict of tags (keys) and entity substrings (values)) + list of substrings for entity types + list of indices of tokens which correspond to entities (or a dict of tags (keys) + and list of indices of entity tokens) + """ + entities_dict = defaultdict(list) + entity_dict = defaultdict(list) + entity_positions_dict = defaultdict(list) + entities_positions_dict = defaultdict(list) + entities_probas_dict = defaultdict(list) + entity_probas_dict = defaultdict(list) + replace_tokens = [ + (" - ", "-"), + ("'s", ""), + (" .", ""), + ("{", ""), + ("}", ""), + (" ", " "), + ('"', "'"), + ("(", ""), + (")", ""), + ] + + cnt = 0 + for n, (tok, tag, probas) in enumerate(zip(tokens, tags, tag_probas)): + if tag.split("-")[-1] in self.entity_tags: + f_tag = tag.split("-")[-1] + if tag.startswith("B-") and any(entity_dict.values()): + for c_tag, entity in entity_dict.items(): + entity = " ".join(entity) + for old, new in replace_tokens: + entity = entity.replace(old, new) + if entity: + entities_dict[c_tag].append(entity) + entities_positions_dict[c_tag].append(entity_positions_dict[c_tag]) + cur_probas = entity_probas_dict[c_tag] + entities_probas_dict[c_tag].append(round(sum(cur_probas) / len(cur_probas), 4)) + entity_dict[c_tag] = [] + entity_positions_dict[c_tag] = [] + entity_probas_dict[c_tag] = [] + + entity_dict[f_tag].append(tok) + entity_positions_dict[f_tag].append(cnt) + if self.tags_ind[tag] < len(probas): + entity_probas_dict[f_tag].append(probas[self.tags_ind[tag]]) + else: + entity_probas_dict[f_tag].append(self.misc_proba) + + elif any(entity_dict.values()): + for tag, entity in entity_dict.items(): + c_tag = tag.split("-")[-1] + entity = " ".join(entity) + for old, new in replace_tokens: + entity = entity.replace(old, new) + if entity: + 
entities_dict[c_tag].append(entity) + entities_positions_dict[c_tag].append(entity_positions_dict[c_tag]) + cur_probas = entity_probas_dict[c_tag] + entities_probas_dict[c_tag].append(round(sum(cur_probas) / len(cur_probas), 4)) + + entity_dict[c_tag] = [] + entity_positions_dict[c_tag] = [] + entity_probas_dict[c_tag] = [] + cnt += 1 + + entities_list = [entity for tag, entities in entities_dict.items() for entity in entities] + entities_positions_list = [ + position for tag, positions in entities_positions_dict.items() for position in positions + ] + entities_probas_list = [proba for tag, probas in entities_probas_dict.items() for proba in probas] + + if self.return_entities_with_tags: + return entities_dict, entities_positions_dict, entities_probas_dict + else: + return entities_list, entities_positions_list, entities_probas_list diff --git a/annotators/entity_detection_rus/entity_detection_rus.json b/annotators/entity_detection_rus/entity_detection_rus.json new file mode 100644 index 0000000000..8914938dd5 --- /dev/null +++ b/annotators/entity_detection_rus/entity_detection_rus.json @@ -0,0 +1,48 @@ +{ + "chainer": { + "in": ["text"], + "pipe": [ + { + "class_name": "ner_chunker:NerChunker", + "batch_size": 16, + "max_chunk_len" : 180, + "max_seq_len" : 400, + "vocab_file": "{TRANSFORMER}", + "do_lower_case": true, + "in": ["text"], + "out": ["x_chunk", "chunk_nums", "chunk_sentences_offsets", "chunk_sentences"] + }, + { + "thres_proba": 0.05, + "o_tag": "O", + "tags_file": "{NER_PATH}/tag.dict", + "return_entities_with_tags": true, + "add_nouns": true, + "class_name": "entity_detection_parser:EntityDetectionParser", + "id": "edp" + }, + { + "class_name": "ner_chunker:NerChunkModel", + "add_nouns": true, + "ner": {"config_path": "wiki_ner_rus_bert_torch.json"}, + "ner_parser": "#edp", + "in": ["x_chunk", "chunk_nums", "chunk_sentences_offsets", "chunk_sentences"], + "out": ["entity_substr", "entity_offsets", "tags", "probas", "sentences_offsets", + "sentences", 
"tokens", "tokens_conf", "entity_positions", "sentences_tokens"] + } + ], + "out": ["entity_substr", "entity_offsets", "entity_positions", "tokens", "tags", "sentences_offsets", "sentences", "probas", "tokens_conf"] + }, + "metadata": { + "variables": { + "ROOT_PATH": "~/.deeppavlov", + "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", + "MODELS_PATH": "{ROOT_PATH}/models", + "CONFIGS_PATH": "{DEEPPAVLOV_PATH}/configs", + "NER_PATH": "{MODELS_PATH}/wiki_ner_rus_bert_torch_lower", + "TRANSFORMER": "{DOWNLOADS_PATH}/torch_bert_models/rubert_base_cased" + }, + "download": [ + ] + } +} diff --git a/annotators/entity_detection_rus/ner_chunker.py b/annotators/entity_detection_rus/ner_chunker.py new file mode 100644 index 0000000000..a644465003 --- /dev/null +++ b/annotators/entity_detection_rus/ner_chunker.py @@ -0,0 +1,470 @@ +# Copyright 2017 Neural Networks and Deep Learning lab, MIPT +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import re +import time +from logging import getLogger +from string import punctuation +from typing import List, Tuple + +import pymorphy2 +from deeppavlov.core.commands.utils import expand_path +from deeppavlov.core.common.chainer import Chainer +from deeppavlov.core.common.registry import register +from deeppavlov.core.models.component import Component +from nltk import sent_tokenize +from transformers import AutoTokenizer + +from entity_detection_parser import EntityDetectionParser + +log = getLogger(__name__) + + +@register("ner_chunker") +class NerChunker(Component): + """ + Class to split documents into chunks of max_chunk_len symbols so that the length will not exceed + maximal sequence length to feed into BERT + """ + + def __init__( + self, + vocab_file: str, + max_seq_len: int = 400, + max_chunk_len: int = 180, + batch_size: int = 30, + do_lower_case: bool = False, + **kwargs, + ): + """ + + Args: + max_chunk_len: maximal length of chunks into which the document is split + batch_size: how many chunks are in a batch + """ + self.max_seq_len = max_seq_len + self.max_chunk_len = max_chunk_len + self.batch_size = batch_size + self.re_tokenizer = re.compile(r"[\w']+|[^\w ]") + vocab_file = str(expand_path(vocab_file)) + self.do_lower_case = do_lower_case + self.tokenizer = AutoTokenizer.from_pretrained(vocab_file) + self.punct_ext = punctuation + " " + "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789" + self.russian_letters = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя" + + def __call__(self, docs_batch: List[str]) -> Tuple[List[List[str]], List[List[int]]]: + """ + This method splits each document in the batch into chunks with the maximal length of max_chunk_len + + Args: + docs_batch: batch of documents + + Returns: + batch of lists of document chunks for each document + batch of lists of numbers of documents which correspond to chunks + """ + text_batch_list = [] + text_batch = [] + nums_batch_list = [] + nums_batch = [] + sentences_offsets_batch_list 
= [] + sentences_offsets_batch = [] + sentences_offsets_list = [] + sentences_batch_list = [] + sentences_batch = [] + sentences_list = [] + text = "" + cur_len = 0 + + for n, doc in enumerate(docs_batch): + if self.do_lower_case: + doc = doc.lower() + start = 0 + text = "" + sentences_list = [] + sentences_offsets_list = [] + cur_len = 0 + doc_pieces = doc.split("\n") + doc_pieces = [self.sanitize(doc_piece) for doc_piece in doc_pieces] + doc_pieces = [doc_piece for doc_piece in doc_pieces if len(doc_piece) > 1] + sentences = [] + for doc_piece in doc_pieces: + sentences += sent_tokenize(doc_piece) + for sentence in sentences: + sentence_tokens = re.findall(self.re_tokenizer, sentence) + sentence_len = sum([len(self.tokenizer.tokenize(token)) for token in sentence_tokens]) + if cur_len + sentence_len < self.max_seq_len: + text += f"{sentence} " + cur_len += sentence_len + end = start + len(sentence) + sentences_offsets_list.append((start, end)) + sentences_list.append(sentence) + start = end + 1 + else: + text = text.strip() + if text: + text_batch.append(text) + sentences_offsets_batch.append(sentences_offsets_list) + sentences_batch.append(sentences_list) + nums_batch.append(n) + + if sentence_len < self.max_seq_len: + text = f"{sentence} " + cur_len = sentence_len + start = 0 + end = start + len(sentence) + sentences_offsets_list = [(start, end)] + sentences_list = [sentence] + start = end + 1 + else: + text = "" + if "," in sentence: + sentence_chunks = sentence.split(", ") + for chunk in sentence_chunks: + chunk_tokens = re.findall(self.re_tokenizer, chunk) + chunk_len = sum([len(self.tokenizer.tokenize(token)) for token in chunk_tokens]) + if cur_len + chunk_len < self.max_seq_len: + text += f"{chunk}, " + cur_len += chunk_len + 1 + end = start + len(chunk) + 1 + sentences_offsets_list.append((start, end)) + sentences_list.append(chunk) + start = end + 1 + else: + text = text.strip().strip(",") + if text: + text_batch.append(text) + 
sentences_offsets_batch.append(sentences_offsets_list) + sentences_batch.append(sentences_list) + nums_batch.append(n) + + chunk = " ".join(chunk.split()[: self.max_chunk_len]) + text = f"{chunk}, " + cur_len = chunk_len + start = 0 + end = start + len(chunk) + sentences_offsets_list = [(start, end)] + sentences_list = [sentence] + start = end + 1 + else: + chunk_tokens = sentence.split() + num_chunks = len(chunk_tokens) // self.max_chunk_len + int( + len(chunk_tokens) % self.max_chunk_len > 0 + ) + for ii in range(num_chunks): + chunk_tokens_elem = chunk_tokens[ + ii * self.max_chunk_len : (ii + 1) * self.max_chunk_len + ] + text_batch.append(" ".join(chunk_tokens_elem)) + sentences_offsets_batch.append([(0, len(chunk_tokens_elem))]) + sentences_batch.append([chunk_tokens_elem]) + nums_batch.append(n) + + text = text.strip().strip(",") + if text: + text_batch.append(text) + nums_batch.append(n) + sentences_offsets_batch.append(sentences_offsets_list) + sentences_batch.append(sentences_list) + + num_batches = len(text_batch) // self.batch_size + int(len(text_batch) % self.batch_size > 0) + for jj in range(num_batches): + text_batch_list.append(text_batch[jj * self.batch_size : (jj + 1) * self.batch_size]) + nums_batch_list.append(nums_batch[jj * self.batch_size : (jj + 1) * self.batch_size]) + sentences_offsets_batch_list.append( + sentences_offsets_batch[jj * self.batch_size : (jj + 1) * self.batch_size] + ) + sentences_batch_list.append(sentences_batch[jj * self.batch_size : (jj + 1) * self.batch_size]) + + return text_batch_list, nums_batch_list, sentences_offsets_batch_list, sentences_batch_list + + def sanitize(self, text): + text_len = len(text) + + if text_len > 0 and text[text_len - 1] not in {".", "!", "?"}: + i = text_len - 1 + while text[i] in self.punct_ext and i > 0: + i -= 1 + if (text[i] in {".", "!", "?"} and text[i - 1].lower() in self.russian_letters) or ( + i > 1 + and text[i] in {".", "!", "?"} + and text[i - 1] in '"' + and text[i - 2].lower() in 
self.russian_letters + ) + ): + break + + text = text[: i + 1] + text = re.sub(r"\s+", " ", text) + return text + + +@register("ner_chunk_model") +class NerChunkModel(Component): + """ + Class for running the NER model on document chunks and gathering entity predictions per document + """ + + def __init__(self, ner: Chainer, ner_parser: EntityDetectionParser, add_nouns: bool = False, **kwargs) -> None: + """ + + Args: + ner: entity detection model (DeepPavlov Chainer) + ner_parser: EntityDetectionParser component which converts tag probabilities to entity substrings + add_nouns: whether to additionally tag nouns with B-MISC + **kwargs: + """ + self.ner = ner + self.ner_parser = ner_parser + self.re_tokenizer = re.compile(r"[\w']+|[^\w ]") + self.morph = pymorphy2.MorphAnalyzer() + self.add_nouns = add_nouns + + def __call__( + self, + text_batch_list: List[List[str]], + nums_batch_list: List[List[int]], + sentences_offsets_batch_list: List[List[List[Tuple[int, int]]]], + sentences_batch_list: List[List[List[str]]], + ): + entity_substr_batch_list = [] + entity_offsets_batch_list = [] + tags_batch_list = [] + entity_probas_batch_list = [] + tokens_conf_batch_list = [] + text_len_batch_list = [] + ner_tokens_batch_list = [] + entity_positions_batch_list = [] + sentences_tokens_batch_list = [] + for text_batch, sentences_offsets_batch, sentences_batch in zip( + text_batch_list, sentences_offsets_batch_list, sentences_batch_list + ): + tm_ner_st = time.time() + ner_tokens_batch, ner_tokens_offsets_batch, ner_probas_batch, probas_batch = self.ner(text_batch) + if self.add_nouns: + for i in range(len(ner_tokens_batch)): + for j in range(len(ner_tokens_batch[i])): + if ( + self.morph.parse(ner_tokens_batch[i][j])[0].tag.POS == "NOUN" + and ner_probas_batch[i][j] == "O" + ): + ner_probas_batch[i][j] = "B-MISC" + entity_substr_batch, entity_positions_batch, entity_probas_batch, tokens_conf_batch = self.ner_parser( + ner_tokens_batch, ner_probas_batch, probas_batch + ) + tm_ner_end = time.time() + log.debug(f"ner time {tm_ner_end - tm_ner_st}") + log.debug(f"entity_substr_batch {entity_substr_batch}") + 
log.debug(f"entity_positions_batch {entity_positions_batch}") + entity_pos_tags_probas_batch = [ + [ + (entity_substr.lower(), entity_substr_positions, tag, entity_proba) + for tag, entity_substr_list in entity_substr_dict.items() + for entity_substr, entity_substr_positions, entity_proba in zip( + entity_substr_list, entity_positions_dict[tag], entity_probas_dict[tag] + ) + ] + for entity_substr_dict, entity_positions_dict, entity_probas_dict in zip( + entity_substr_batch, entity_positions_batch, entity_probas_batch + ) + ] + entity_substr_batch = [] + entity_offsets_batch = [] + tags_batch = [] + probas_batch = [] + pr_entity_positions_batch = [] + for entity_pos_tags_probas, ner_tokens_offsets_list in zip( + entity_pos_tags_probas_batch, ner_tokens_offsets_batch + ): + if entity_pos_tags_probas: + entity_offsets_list = [] + entity_substr_list, entity_positions_list, tags_list, probas_list = zip(*entity_pos_tags_probas) + for entity_positions in entity_positions_list: + start_offset = ner_tokens_offsets_list[entity_positions[0]][0] + end_offset = ner_tokens_offsets_list[entity_positions[-1]][1] + entity_offsets_list.append((start_offset, end_offset)) + else: + entity_substr_list, entity_offsets_list, tags_list, probas_list, entity_positions_list = ( + [], + [], + [], + [], + [], + ) + entity_substr_batch.append(list(entity_substr_list)) + entity_offsets_batch.append(list(entity_offsets_list)) + tags_batch.append(list(tags_list)) + probas_batch.append(list(probas_list)) + pr_entity_positions_batch.append(list(entity_positions_list)) + + sentences_tokens_batch = [] + for sentences_offsets_list, ner_tokens_list, ner_tokens_offsets_list in zip( + sentences_offsets_batch, ner_tokens_batch, ner_tokens_offsets_batch + ): + sentences_tokens_list = [] + for start_offset, end_offset in sentences_offsets_list: + sentence_tokens = [] + for tok, (start_tok_offset, end_tok_offset) in zip(ner_tokens_list, ner_tokens_offsets_list): + if start_tok_offset >= start_offset and 
end_tok_offset <= end_offset: + sentence_tokens.append(tok) + sentences_tokens_list.append(sentence_tokens) + sentences_tokens_batch.append(sentences_tokens_list) + + log.debug(f"entity_substr_batch {entity_substr_batch}") + log.debug(f"entity_offsets_batch {entity_offsets_batch}") + + entity_substr_batch_list.append(entity_substr_batch) + tags_batch_list.append(tags_batch) + entity_offsets_batch_list.append(entity_offsets_batch) + entity_probas_batch_list.append(probas_batch) + text_len_batch_list.append([len(text) for text in text_batch]) + ner_tokens_batch_list.append(ner_tokens_batch) + tokens_conf_batch_list.append(tokens_conf_batch) + entity_positions_batch_list.append(pr_entity_positions_batch) + sentences_tokens_batch_list.append(sentences_tokens_batch) + + doc_entity_substr_batch, doc_tags_batch, doc_entity_offsets_batch, doc_probas_batch = [], [], [], [] + doc_sentences_offsets_batch, doc_sentences_batch = [], [] + doc_ner_tokens_batch, doc_tokens_conf_batch, doc_entity_positions_batch, doc_sentences_tokens_batch = ( + [], + [], + [], + [], + ) + doc_entity_substr, doc_tags, doc_probas, doc_entity_offsets = [], [], [], [] + doc_sentences_offsets, doc_sentences = [], [] + doc_ner_tokens, doc_tokens_conf, doc_entity_positions, doc_sentences_tokens = [], [], [], [] + cur_doc_num = 0 + text_len_sum = 0 + tokens_len_sum = 0 + for ( + entity_substr_batch, + tags_batch, + probas_batch, + entity_offsets_batch, + sentences_offsets_batch, + sentences_batch, + text_len_batch, + nums_batch, + ner_tokens_batch, + tokens_conf_batch, + entity_positions_batch, + sentences_tokens_batch, + ) in zip( + entity_substr_batch_list, + tags_batch_list, + entity_probas_batch_list, + entity_offsets_batch_list, + sentences_offsets_batch_list, + sentences_batch_list, + text_len_batch_list, + nums_batch_list, + ner_tokens_batch_list, + tokens_conf_batch_list, + entity_positions_batch_list, + sentences_tokens_batch_list, + ): + for ( + entity_substr, + tag, + probas, + entity_offsets, 
+ sentences_offsets, + sentences, + text_len, + doc_num, + ner_tokens, + tokens_conf, + entity_positions, + sentences_tokens, + ) in zip( + entity_substr_batch, + tags_batch, + probas_batch, + entity_offsets_batch, + sentences_offsets_batch, + sentences_batch, + text_len_batch, + nums_batch, + ner_tokens_batch, + tokens_conf_batch, + entity_positions_batch, + sentences_tokens_batch, + ): + if doc_num == cur_doc_num: + doc_entity_substr += entity_substr + doc_tags += tag + doc_probas += probas + doc_entity_offsets += [ + (start_offset + text_len_sum, end_offset + text_len_sum) + for start_offset, end_offset in entity_offsets + ] + doc_sentences_offsets += [ + (start_offset + text_len_sum, end_offset + text_len_sum) + for start_offset, end_offset in sentences_offsets + ] + doc_entity_positions += [ + [pos + tokens_len_sum for pos in entity_position] for entity_position in entity_positions + ] + doc_sentences += sentences + text_len_sum += text_len + 1 + doc_ner_tokens += ner_tokens + doc_tokens_conf += tokens_conf + tokens_len_sum += len(ner_tokens) + doc_sentences_tokens += sentences_tokens + else: + doc_entity_substr_batch.append(doc_entity_substr) + doc_tags_batch.append(doc_tags) + doc_probas_batch.append(doc_probas) + doc_entity_offsets_batch.append(doc_entity_offsets) + doc_sentences_offsets_batch.append(doc_sentences_offsets) + doc_entity_positions_batch.append(doc_entity_positions) + doc_sentences_batch.append(doc_sentences) + doc_ner_tokens_batch.append(doc_ner_tokens) + doc_tokens_conf_batch.append(doc_tokens_conf) + doc_sentences_tokens_batch.append(doc_sentences_tokens) + doc_entity_substr = entity_substr + doc_tags = tag + doc_probas = probas + doc_entity_offsets = entity_offsets + doc_sentences_offsets = sentences_offsets + doc_entity_positions = entity_positions + doc_sentences = sentences + cur_doc_num = doc_num + text_len_sum = text_len + doc_ner_tokens = ner_tokens + doc_tokens_conf = tokens_conf + doc_sentences_tokens = sentences_tokens + 
tokens_len_sum = len(ner_tokens) + doc_entity_substr_batch.append(doc_entity_substr) + doc_tags_batch.append(doc_tags) + doc_probas_batch.append(doc_probas) + doc_entity_offsets_batch.append(doc_entity_offsets) + doc_sentences_offsets_batch.append(doc_sentences_offsets) + doc_entity_positions_batch.append(doc_entity_positions) + doc_sentences_batch.append(doc_sentences) + doc_ner_tokens_batch.append(doc_ner_tokens) + doc_tokens_conf_batch.append(doc_tokens_conf) + doc_sentences_tokens_batch.append(doc_sentences_tokens) + + return ( + doc_entity_substr_batch, + doc_entity_offsets_batch, + doc_tags_batch, + doc_probas_batch, + doc_sentences_offsets_batch, + doc_sentences_batch, + doc_ner_tokens_batch, + doc_tokens_conf_batch, + doc_entity_positions_batch, + doc_sentences_tokens_batch, + ) diff --git a/annotators/entity_detection_rus/requirements.txt b/annotators/entity_detection_rus/requirements.txt new file mode 100644 index 0000000000..d79967a80f --- /dev/null +++ b/annotators/entity_detection_rus/requirements.txt @@ -0,0 +1,14 @@ +Flask==1.1.1 +nltk==3.2.5 +gunicorn==19.9.0 +requests==2.22.0 +sentry-sdk==0.12.3 +torch==1.6.0 +transformers==4.6.0 +deeppavlov==0.17.2 +pymorphy2==0.9.1 +pymorphy2-dicts==2.4.393442.3710985 +pymorphy2-dicts-ru==2.4.417127.4579844 +itsdangerous==2.0.1 +jinja2<=3.0.3 +Werkzeug<=2.0.3 diff --git a/annotators/entity_detection_rus/server.py b/annotators/entity_detection_rus/server.py new file mode 100644 index 0000000000..6a588ff006 --- /dev/null +++ b/annotators/entity_detection_rus/server.py @@ -0,0 +1,129 @@ +import logging +import os +import re +import time + +import sentry_sdk +from flask import Flask, jsonify, request +from nltk.corpus import stopwords + +from deeppavlov import build_model + +sentry_sdk.init(os.getenv("SENTRY_DSN")) + +logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) +logger = logging.getLogger(__name__) +app = Flask(__name__) + +config_name = os.getenv("CONFIG") + 
+try: + entity_detection_rus = build_model(config_name, download=True) + entity_detection_rus(["кто написал войну и мир?"]) + logger.info("entity detection model is loaded.") +except Exception as e: + sentry_sdk.capture_exception(e) + logger.exception(e) + raise e + + +EVERYTHING_EXCEPT_LETTERS_DIGITALS_AND_SPACE = re.compile(r"[^а-яА-Я0-9 \-&*+]") +DOUBLE_SPACES = re.compile(r"\s+") +stopwords = set(stopwords.words("russian")) + + +def get_result(request): + st_time = time.time() + last_utts = request.json.get("last_utterances", []) + logger.info(f"input (the last utterances): {last_utts}") + + utts_list = [] + utts_nums = [] + last_utt_starts = [] + for n, hist_utt in enumerate(last_utts): + if len(hist_utt) > 0: + last_utt = hist_utt[-1] + if last_utt[-1] not in {".", "!", "?"}: + last_utt = f"{last_utt}." + if len(hist_utt) > 1: + prev_utt = hist_utt[-2] + if prev_utt[-1] not in {".", "!", "?"}: + prev_utt = f"{prev_utt}." + last_utt_starts.append(len(prev_utt) + 1) + concat_utt = f"{prev_utt} {last_utt}" + else: + last_utt_starts.append(0) + concat_utt = last_utt + + utts_list.append(concat_utt.lower()) + utts_nums.append(n) + + utt_entities_batch = [{} for _ in last_utts] + utt_entities = {} + if utts_list: + ( + entity_substr_batch, + entity_offsets_batch, + entity_positions_batch, + tokens_batch, + tags_batch, + sentences_offsets_batch, + sentences_batch, + probas_batch, + tokens_conf_batch, + ) = entity_detection_rus(utts_list) + logger.info(f"entity_substr_batch {entity_substr_batch}") + + for entity_substr_list, tags_list, entity_offsets_list, last_utt_start, num in zip( + entity_substr_batch, tags_batch, entity_offsets_batch, last_utt_starts, utts_nums + ): + utt_entities = {} + for entity, tag, (start_offset, end_offset) in zip(entity_substr_list, tags_list, entity_offsets_list): + if entity not in stopwords and len(entity) > 2 and start_offset >= last_utt_start: + entity = EVERYTHING_EXCEPT_LETTERS_DIGITALS_AND_SPACE.sub(" ", entity) + entity = 
DOUBLE_SPACES.sub(" ", entity).strip() + if "entities" in utt_entities: + utt_entities["entities"].append(entity) + utt_entities["labelled_entities"].append( + { + "text": entity, + "label": tag.lower(), + "offsets": (start_offset - last_utt_start, end_offset - last_utt_start), + } + ) + else: + utt_entities["entities"] = [entity] + utt_entities["labelled_entities"] = [ + { + "text": entity, + "label": tag.lower(), + "offsets": (start_offset - last_utt_start, end_offset - last_utt_start), + } + ] + + if utt_entities: + utt_entities_batch[num] = utt_entities + + if not last_utts: + utt_entities_batch.append({}) + + total_time = time.time() - st_time + logger.info(f"entity detection exec time: {total_time: .3f}s") + logger.info(f"entity_detection, input {last_utts}, output {utt_entities_batch}") + return utt_entities_batch + + +@app.route("/respond", methods=["POST"]) +def respond(): + result = get_result(request) + return jsonify(result) + + +@app.route("/respond_batch", methods=["POST"]) +def respond_batch(): + result = get_result(request) + return jsonify([{"batch": result}]) + + +if __name__ == "__main__": + app.run(debug=False, host="0.0.0.0", port=8103) diff --git a/annotators/entity_detection_rus/test.sh b/annotators/entity_detection_rus/test.sh new file mode 100755 index 0000000000..2a5dc46295 --- /dev/null +++ b/annotators/entity_detection_rus/test.sh @@ -0,0 +1,5 @@ +#!/bin/bash + + +python test_entity_detection.py + diff --git a/annotators/entity_detection_rus/test_entity_detection.py b/annotators/entity_detection_rus/test_entity_detection.py new file mode 100644 index 0000000000..273d295c7f --- /dev/null +++ b/annotators/entity_detection_rus/test_entity_detection.py @@ -0,0 +1,29 @@ +import requests + + +def main(): + url = "http://0.0.0.0:8103/respond" + + request_data = [{"last_utterances": [["кто написал войну и мир?"]]}] + + gold_results = [ + [ + { + "entities": ["войну и мир"], + "labelled_entities": [{"label": "literary_work", "offsets": [12, 23], 
"text": "войну и мир"}], + } + ] + ] + + count = 0 + for data, gold_result in zip(request_data, gold_results): + result = requests.post(url, json=data).json() + if result == gold_result: + count += 1 + + assert count == len(request_data) + print("Success") + + +if __name__ == "__main__": + main() diff --git a/annotators/entity_detection_rus/torch_transformers_preprocessor.py b/annotators/entity_detection_rus/torch_transformers_preprocessor.py new file mode 100644 index 0000000000..9cc79ecb69 --- /dev/null +++ b/annotators/entity_detection_rus/torch_transformers_preprocessor.py @@ -0,0 +1,405 @@ +# Copyright 2017 Neural Networks and Deep Learning lab, MIPT +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import re +import random +import logging +from pathlib import Path +import torch +from typing import Tuple, List, Optional, Union + +from transformers import AutoTokenizer, BertTokenizer +from transformers.data.processors.utils import InputFeatures + +from deeppavlov.core.commands.utils import expand_path +from deeppavlov.core.common.registry import register +from deeppavlov.core.data.utils import zero_pad +from deeppavlov.core.models.component import Component +from deeppavlov.models.preprocessors.mask import Mask + + +handler = logging.StreamHandler() +formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s") +handler.setFormatter(formatter) + +logging.getLogger(__name__).addHandler(handler) +log = logging.getLogger(__name__) + + +@register("torch_transformers_preprocessor") +class TorchTransformersPreprocessor(Component): + """Tokenize text on subtokens, encode subtokens with their indices, create tokens and segment masks. + + Check details in :func:`bert_dp.preprocessing.convert_examples_to_features` function. 
+ + Args: + vocab_file: path to vocabulary + do_lower_case: set True if lowercasing is needed + max_seq_length: max sequence length in subtokens, including [SEP] and [CLS] tokens + return_tokens: whether to return tuple of inputfeatures and tokens, or only inputfeatures + + Attributes: + max_seq_length: max sequence length in subtokens, including [SEP] and [CLS] tokens + return_tokens: whether to return tuple of inputfeatures and tokens, or only inputfeatures + tokenizer: instance of Bert FullTokenizer + + """ + + def __init__( + self, + vocab_file: str, + do_lower_case: bool = True, + max_seq_length: int = 512, + return_tokens: bool = False, + add_special_tokens: List = None, + **kwargs, + ) -> None: + self.max_seq_length = max_seq_length + self.return_tokens = return_tokens + + vocab_file_path = Path(vocab_file) + if expand_path(vocab_file_path).exists(): + vocab_file_path = expand_path(vocab_file_path) + if vocab_file_path.is_file(): + vocab_file_path = vocab_file_path.parent + vocab_file_path = str(vocab_file_path) + + self.tokenizer = AutoTokenizer.from_pretrained(vocab_file_path, do_lower_case=do_lower_case) + if add_special_tokens is not None: + special_tokens_dict = {"additional_special_tokens": add_special_tokens} + self.tokenizer.add_special_tokens(special_tokens_dict) + + def __call__( + self, texts_a: List[str], texts_b: Optional[List[str]] = None + ) -> Union[List[InputFeatures], Tuple[List[InputFeatures], List[List[str]]]]: + """Tokenize and create masks. + + texts_a and texts_b are separated by [SEP] token + + Args: + texts_a: list of texts, + texts_b: list of texts, it could be None, e.g. 
single sentence classification task + + Returns: + batch of :class:`transformers.data.processors.utils.InputFeatures` with subtokens, subtoken ids, \ + subtoken mask, segment mask, or tuple of batch of InputFeatures and Batch of subtokens + """ + + if texts_b is None: + texts_b = [None] * len(texts_a) + + input_features = [] + tokens = [] + for text_a, text_b in zip(texts_a, texts_b): + encoded_dict = self.tokenizer.encode_plus( + text=text_a, + text_pair=text_b, + add_special_tokens=True, + max_length=self.max_seq_length, + pad_to_max_length=True, + return_attention_mask=True, + return_tensors="pt", + ) + + if "token_type_ids" not in encoded_dict: + encoded_dict["token_type_ids"] = torch.tensor([0]) + + curr_features = InputFeatures( + input_ids=encoded_dict["input_ids"], + attention_mask=encoded_dict["attention_mask"], + token_type_ids=encoded_dict["token_type_ids"], + label=None, + ) + input_features.append(curr_features) + if self.return_tokens: + tokens.append(self.tokenizer.convert_ids_to_tokens(encoded_dict["input_ids"][0])) + + if self.return_tokens: + return input_features, tokens + else: + return input_features + + +@register("torch_transformers_batch_preprocessor") +class TorchTransformersBatchPreprocessor(Component): + def __init__( + self, + vocab_file: str, + do_lower_case: bool = False, + max_seq_length: int = 512, + return_tokens: bool = False, + add_special_tokens: List = None, + special_token_id: int = None, + return_special_tokens_pos: bool = False, + **kwargs, + ) -> None: + self.max_seq_length = max_seq_length + self.return_tokens = return_tokens + # vocab_file = str(expand_path(vocab_file)) + if Path(vocab_file).is_file(): + self.tokenizer = BertTokenizer(vocab_file=vocab_file, do_lower_case=do_lower_case) + else: + self.tokenizer = BertTokenizer.from_pretrained(vocab_file, do_lower_case=do_lower_case) + if add_special_tokens is not None: + special_tokens_dict = {"additional_special_tokens": add_special_tokens} + 
self.tokenizer.add_special_tokens(special_tokens_dict) + self.add_special_tokens = add_special_tokens + self.special_token_id = special_token_id + self.return_special_tokens_pos = return_special_tokens_pos + self.re_tokenizer = re.compile(r"[\w']+|[^\w ]") + + def __call__(self, texts_a: List[str], texts_b: List[str] = None) -> List[InputFeatures]: + if texts_b is None: + texts_b = [None for _ in texts_a] + texts_input_features = [] + special_tokens_pos = [] + + tokenizer_input = [[text_a, text_b] for text_a, text_b in zip(texts_a, texts_b)] + encoding = self.tokenizer.batch_encode_plus( + tokenizer_input, add_special_tokens=True, pad_to_max_length=True, return_attention_mask=True + ) + input_ids_batch = encoding["input_ids"] + attention_mask_batch = encoding["attention_mask"] + token_type_ids_batch = encoding["token_type_ids"] + + for input_ids_list in input_ids_batch: + found_n = -1 + for n, input_id in enumerate(input_ids_list): + if input_id == self.special_token_id: + found_n = n + break + if found_n == -1: + found_n = 0 + special_tokens_pos.append(found_n) + + for input_ids, attention_mask, token_type_ids in zip( + input_ids_batch, attention_mask_batch, token_type_ids_batch + ): + curr_features = InputFeatures( + input_ids=input_ids, attention_mask=attention_mask, token_type_ids=token_type_ids + ) + texts_input_features.append(curr_features) + + if self.return_special_tokens_pos: + return texts_input_features, special_tokens_pos + else: + return texts_input_features + + +@register("torch_transformers_ner_preprocessor") +class TorchTransformersNerPreprocessor(Component): + """ + Takes tokens and splits them into bert subtokens, encodes subtokens with their indices. + Creates a mask of subtokens (one for the first subtoken, zero for the others). + + If tags are provided, calculates tags for subtokens. 
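`TorchTransformersBatchPreprocessor.__call__` above scans each `input_ids` sequence for the first occurrence of an optional `special_token_id`, falling back to position 0 when the token is absent. The scan in isolation (function name ours):

```python
# Sketch (ours) of the special-token position scan in
# TorchTransformersBatchPreprocessor.__call__ above: return the index of the
# first occurrence of special_token_id, or 0 if the sequence lacks it.
def special_token_pos(input_ids, special_token_id):
    for n, input_id in enumerate(input_ids):
        if input_id == special_token_id:
            return n
    return 0


print(special_token_pos([101, 7592, 30522, 102], 30522))  # -> 2
print(special_token_pos([101, 7592, 102], 30522))         # -> 0 (not found)
```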
+ + Args: + vocab_file: path to vocabulary + do_lower_case: set True if lowercasing is needed + max_seq_length: max sequence length in subtokens, including [SEP] and [CLS] tokens + max_subword_length: replace token with [UNK] if its length is larger than this + (defaults to None, which is equal to +infinity) + token_masking_prob: probability of masking token while training + provide_subword_tags: output tags for subwords or for words + subword_mask_mode: subword to select inside word tokens, can be "first" or "last" + (default="first") + + Attributes: + max_seq_length: max sequence length in subtokens, including [SEP] and [CLS] tokens + max_subword_length: max length of a BERT subtoken + tokenizer: instance of Bert FullTokenizer + """ + + def __init__( + self, + vocab_file: str, + do_lower_case: bool = False, + max_seq_length: int = 512, + max_subword_length: int = None, + token_masking_prob: float = 0.0, + provide_subword_tags: bool = False, + subword_mask_mode: str = "first", + **kwargs, + ): + self._re_tokenizer = re.compile(r"[\w']+|[^\w ]") + self.provide_subword_tags = provide_subword_tags + self.mode = kwargs.get("mode") + self.max_seq_length = max_seq_length + self.max_subword_length = max_subword_length + self.subword_mask_mode = subword_mask_mode + vocab_file = str(expand_path(vocab_file)) + self.tokenizer = AutoTokenizer.from_pretrained(vocab_file) + self.token_masking_prob = token_masking_prob + + def __call__(self, tokens: Union[List[List[str]], List[str]], tags: List[List[str]] = None, **kwargs): + tokens_offsets_batch = [[] for _ in tokens] + if isinstance(tokens[0], str): + tokens_batch = [] + tokens_offsets_batch = [] + for s in tokens: + tokens_list = [] + tokens_offsets_list = [] + for elem in re.finditer(self._re_tokenizer, s): + tokens_list.append(elem[0]) + tokens_offsets_list.append((elem.start(), elem.end())) + tokens_batch.append(tokens_list) + tokens_offsets_batch.append(tokens_offsets_list) + tokens = tokens_batch + subword_tokens, 
subword_tok_ids, startofword_markers, subword_tags = [], [], [], [] + for i in range(len(tokens)): + toks = tokens[i] + ys = ["O"] * len(toks) if tags is None else tags[i] + assert len(toks) == len(ys), f"toks({len(toks)}) should have the same length as ys({len(ys)})" + sw_toks, sw_marker, sw_ys = self._ner_bert_tokenize( + toks, + ys, + self.tokenizer, + self.max_subword_length, + mode=self.mode, + subword_mask_mode=self.subword_mask_mode, + token_masking_prob=self.token_masking_prob, + ) + if self.max_seq_length is not None: + if len(sw_toks) > self.max_seq_length: + raise RuntimeError( + f"input sequence after bert tokenization" f" shouldn't exceed {self.max_seq_length} tokens." + ) + subword_tokens.append(sw_toks) + subword_tok_ids.append(self.tokenizer.convert_tokens_to_ids(sw_toks)) + startofword_markers.append(sw_marker) + subword_tags.append(sw_ys) + assert len(sw_marker) == len(sw_toks) == len(subword_tok_ids[-1]) == len(sw_ys), ( + f"length of sow_marker({len(sw_marker)}), tokens({len(sw_toks)})," + f" token ids({len(subword_tok_ids[-1])}) and ys({len(ys)})" + f" for tokens = `{toks}` should match" + ) + + subword_tok_ids = zero_pad(subword_tok_ids, dtype=int, padding=0) + startofword_markers = zero_pad(startofword_markers, dtype=int, padding=0) + attention_mask = Mask()(subword_tokens) + + if tags is not None: + if self.provide_subword_tags: + return tokens, subword_tokens, subword_tok_ids, attention_mask, startofword_markers, subword_tags + else: + nonmasked_tags = [[t for t in ts if t != "X"] for ts in tags] + for swts, swids, swms, ts in zip(subword_tokens, subword_tok_ids, startofword_markers, nonmasked_tags): + if (len(swids) != len(swms)) or (len(ts) != sum(swms)): + log.warning("Not matching lengths of the tokenization!") + log.warning(f"Tokens len: {len(swts)}\n Tokens: {swts}") + log.warning(f"Markers len: {len(swms)}, sum: {sum(swms)}") + log.warning(f"Masks: {swms}") + log.warning(f"Tags len: {len(ts)}\n Tags: {ts}") + return tokens, 
subword_tokens, subword_tok_ids, attention_mask, startofword_markers, nonmasked_tags + return tokens, subword_tokens, subword_tok_ids, startofword_markers, attention_mask, tokens_offsets_batch + + @staticmethod + def _ner_bert_tokenize( + tokens: List[str], + tags: List[str], + tokenizer: AutoTokenizer, + max_subword_len: int = None, + mode: str = None, + subword_mask_mode: str = "first", + token_masking_prob: float = None, + ) -> Tuple[List[str], List[int], List[str]]: + do_masking = (mode == "train") and (token_masking_prob is not None) + do_cutting = max_subword_len is not None + tokens_subword = ["[CLS]"] + startofword_markers = [0] + tags_subword = ["X"] + for token, tag in zip(tokens, tags): + token_marker = int(tag != "X") + subwords = tokenizer.tokenize(token) + if not subwords or (do_cutting and (len(subwords) > max_subword_len)): + tokens_subword.append("[UNK]") + startofword_markers.append(token_marker) + tags_subword.append(tag) + else: + if do_masking and (random.random() < token_masking_prob): + tokens_subword.extend(["[MASK]"] * len(subwords)) + else: + tokens_subword.extend(subwords) + if subword_mask_mode == "last": + startofword_markers.extend([0] * (len(subwords) - 1) + [token_marker]) + else: + startofword_markers.extend([token_marker] + [0] * (len(subwords) - 1)) + tags_subword.extend([tag] + ["X"] * (len(subwords) - 1)) + + tokens_subword.append("[SEP]") + startofword_markers.append(0) + tags_subword.append("X") + return tokens_subword, startofword_markers, tags_subword + + +@register("torch_bert_ranker_preprocessor") +class TorchBertRankerPreprocessor(TorchTransformersPreprocessor): + """Tokenize text to sub-tokens, encode sub-tokens with their indices, create tokens and segment masks for ranking. + + Builds features for a pair of context with each of the response candidates. + """ + + def __call__(self, batch: List[List[str]]) -> List[List[InputFeatures]]: + """Tokenize and create masks. 
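`_ner_bert_tokenize` above marks the first subword of every word with 1 and all continuation subwords (plus `[CLS]`/`[SEP]`) with 0, which is what later lets the tagger collapse subtoken predictions back to tokens. A toy sketch of the marker construction under `subword_mask_mode="first"`, using a fake WordPiece tokenizer (ours, not the real one):

```python
# Toy sketch (ours) of the start-of-word marker scheme in _ner_bert_tokenize
# above, for subword_mask_mode="first": marker 1 on the first subword of each
# word, 0 on continuation subwords and on the [CLS]/[SEP] positions.
def toy_tokenize(token):
    # stand-in for the WordPiece tokenizer
    pieces = {"капибара": ["капи", "##бара"]}
    return pieces.get(token, [token])


def startofword_markers(tokens):
    markers = [0]  # [CLS]
    for token in tokens:
        subwords = toy_tokenize(token)
        markers += [1] + [0] * (len(subwords) - 1)
    markers.append(0)  # [SEP]
    return markers


print(startofword_markers(["моя", "капибара"]))  # -> [0, 1, 1, 0, 0]
```

In the real method the marker is `int(tag != "X")`, so `"X"`-tagged tokens get 0 even at word starts, and `subword_mask_mode="last"` puts the 1 on the final subword instead.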
+ + Args: + batch: list of elements where the first element represents the batch with contexts + and the rest of elements represent response candidates batches + + Returns: + list of feature batches with subtokens, subtoken ids, subtoken mask, segment mask. + """ + + if isinstance(batch[0], str): + batch = [batch] + + cont_resp_pairs = [] + if len(batch[0]) == 1: + contexts = batch[0] + responses_empt = [None] * len(batch) + cont_resp_pairs.append(zip(contexts, responses_empt)) + else: + contexts = [el[0] for el in batch] + for i in range(1, len(batch[0])): + responses = [] + for el in batch: + responses.append(el[i]) + cont_resp_pairs.append(zip(contexts, responses)) + + input_features = [] + + for s in cont_resp_pairs: + sub_list_features = [] + for context, response in s: + encoded_dict = self.tokenizer.encode_plus( + text=context, + text_pair=response, + add_special_tokens=True, + max_length=self.max_seq_length, + pad_to_max_length=True, + return_attention_mask=True, + return_tensors="pt", + ) + + curr_features = InputFeatures( + input_ids=encoded_dict["input_ids"], + attention_mask=encoded_dict["attention_mask"], + token_type_ids=encoded_dict["token_type_ids"], + label=None, + ) + sub_list_features.append(curr_features) + input_features.append(sub_list_features) + + return input_features diff --git a/annotators/entity_detection_rus/torch_transformers_sequence_tagger.py b/annotators/entity_detection_rus/torch_transformers_sequence_tagger.py new file mode 100644 index 0000000000..0ea49eb3cd --- /dev/null +++ b/annotators/entity_detection_rus/torch_transformers_sequence_tagger.py @@ -0,0 +1,394 @@ +# Copyright 2019 Neural Networks and Deep Learning lab, MIPT +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from logging import getLogger +from pathlib import Path +from typing import List, Union, Dict, Optional + +import numpy as np +import torch +from overrides import overrides +from transformers import AutoModelForTokenClassification, AutoConfig + +from deeppavlov.core.commands.utils import expand_path +from deeppavlov.core.common.errors import ConfigError +from deeppavlov.core.common.registry import register +from deeppavlov.core.models.torch_model import TorchModel + +log = getLogger(__name__) + + +def token_from_subtoken(units: torch.Tensor, mask: torch.Tensor) -> torch.Tensor: + """Assemble token level units from subtoken level units + + Args: + units: torch.Tensor of shape [batch_size, SUBTOKEN_seq_length, n_features] + mask: mask of token beginnings. For example: for tokens + + [[``[CLS]`` ``My``, ``capybara``, ``[SEP]``], + [``[CLS]`` ``Your``, ``aar``, ``##dvark``, ``is``, ``awesome``, ``[SEP]``]] + + the mask will be + + [[0, 1, 1, 0, 0, 0, 0], + [0, 1, 1, 0, 1, 1, 0]] + + Returns: + word_level_units: Units assembled from ones in the mask. 
For the + example above this units will correspond to the following + + [[``My``, ``capybara``], + [``Your`, ``aar``, ``is``, ``awesome``,]] + + the shape of this tensor will be [batch_size, TOKEN_seq_length, n_features] + """ + shape = units.size() + batch_size = shape[0] + nf = shape[2] + nf_int = units.size()[-1] + + # number of TOKENS in each sentence + token_seq_lengths = torch.sum(mask, 1).to(torch.int64) + # for a matrix m = + # [[1, 1, 1], + # [0, 1, 1], + # [1, 0, 0]] + # it will be + # [3, 2, 1] + + n_words = torch.sum(token_seq_lengths) + # n_words -> 6 + + max_token_seq_len = torch.max(token_seq_lengths) + # max_token_seq_len -> 3 + + idxs = torch.stack(torch.nonzero(mask, as_tuple=True), dim=1) + # for the matrix mentioned above + # tf.where(mask) -> + # [[0, 0], + # [0, 1] + # [0, 2], + # [1, 1], + # [1, 2] + # [2, 0]] + + sample_ids_in_batch = torch.nn.functional.pad(input=idxs[:, 0], pad=[1, 0]) + # for indices + # [[0, 0], + # [0, 1] + # [0, 2], + # [1, 1], + # [1, 2], + # [2, 0]] + # it is + # [0, 0, 0, 0, 1, 1, 2] + # padding is for computing change from one sample to another in the batch + + a = torch.logical_not(torch.eq(sample_ids_in_batch[1:], sample_ids_in_batch[:-1]).to(torch.int64)) + # for the example above the result of this statement equals + # [0, 0, 0, 1, 0, 1] + # so data samples begin in 3rd and 5th positions (the indexes of ones) + + # transforming sample start masks to the sample starts themselves + q = a * torch.arange(n_words).to(torch.int64) + # [0, 0, 0, 3, 0, 5] + count_to_substract = torch.nn.functional.pad(torch.masked_select(q, q.to(torch.bool)), [1, 0]) + # [0, 3, 5] + + new_word_indices = torch.arange(n_words).to(torch.int64) - torch.gather( + count_to_substract, dim=0, index=torch.cumsum(a, 0) + ) + # tf.range(n_words) -> [0, 1, 2, 3, 4, 5] + # tf.cumsum(a) -> [0, 0, 0, 1, 1, 2] + # tf.gather(count_to_substract, tf.cumsum(a)) -> [0, 0, 0, 3, 3, 5] + # new_word_indices -> [0, 1, 2, 3, 4, 5] - [0, 0, 0, 3, 3, 5] = [0, 1, 
2, 0, 1, 0] + # new_word_indices is the concatenation of range(word_len(sentence)) + # for all sentences in units + + n_total_word_elements = (batch_size * max_token_seq_len).to(torch.int32) + word_indices_flat = (idxs[:, 0] * max_token_seq_len + new_word_indices).to(torch.int64) + x_mask = torch.sum(torch.nn.functional.one_hot(word_indices_flat, n_total_word_elements), 0) + x_mask = x_mask.to(torch.bool) + # to get absolute indices we add max_token_seq_len: + # idxs[:, 0] * max_token_seq_len -> [0, 0, 0, 1, 1, 2] * 2 = [0, 0, 0, 3, 3, 6] + # word_indices_flat -> [0, 0, 0, 3, 3, 6] + [0, 1, 2, 0, 1, 0] = [0, 1, 2, 3, 4, 6] + # total number of words in the batch (including paddings) + # batch_size * max_token_seq_len -> 3 * 3 = 9 + # tf.one_hot(...) -> + # [[1. 0. 0. 0. 0. 0. 0. 0. 0.] + # [0. 1. 0. 0. 0. 0. 0. 0. 0.] + # [0. 0. 1. 0. 0. 0. 0. 0. 0.] + # [0. 0. 0. 1. 0. 0. 0. 0. 0.] + # [0. 0. 0. 0. 1. 0. 0. 0. 0.] + # [0. 0. 0. 0. 0. 0. 1. 0. 0.]] + # x_mask -> [1, 1, 1, 1, 1, 0, 1, 0, 0] + + full_range = torch.arange(batch_size * max_token_seq_len).to(torch.int64) + # full_range -> [0, 1, 2, 3, 4, 5, 6, 7, 8] + nonword_indices_flat = torch.masked_select(full_range, torch.logical_not(x_mask)) + + # # y_idxs -> [5, 7, 8] + + # get a sequence of units corresponding to the start subtokens of the words + # size: [n_words, n_features] + def gather_nd(params, indices): + assert type(indices) == torch.Tensor + return params[indices.transpose(0, 1).long().numpy().tolist()] + + elements = gather_nd(units, idxs) + + # prepare zeros for paddings + # size: [batch_size * TOKEN_seq_length - n_words, n_features] + sh = tuple(torch.stack([torch.sum(max_token_seq_len - token_seq_lengths), torch.tensor(nf)], 0).numpy()) + paddings = torch.zeros(sh, dtype=torch.float64) + + def dynamic_stitch(indices, data): + # https://discuss.pytorch.org/t/equivalent-of-tf-dynamic-partition/53735/2 + n = sum(idx.numel() for idx in indices) + res = [None] * n + for i, data_ in enumerate(data): + idx 
= indices[i].view(-1) + if idx.numel() > 0: + d = data_.view(idx.numel(), -1) + k = 0 + for idx_ in idx: + res[idx_] = d[k].to(torch.float64) + k += 1 + return res + + tensor_flat = torch.stack(dynamic_stitch([word_indices_flat, nonword_indices_flat], [elements, paddings])) + # tensor_flat -> [x, x, x, x, x, 0, x, 0, 0] + + tensor = torch.reshape(tensor_flat, (batch_size, max_token_seq_len.item(), nf_int)) + # tensor -> [[x, x, x], + # [x, x, 0], + # [x, 0, 0]] + + return tensor + + +def token_labels_to_subtoken_labels(labels, y_mask, input_mask): + subtoken_labels = [] + labels_ind = 0 + n_tokens_with_special = int(np.sum(input_mask)) + + for el in y_mask[1 : n_tokens_with_special - 1]: + if el == 1: + subtoken_labels += [labels[labels_ind]] + labels_ind += 1 + else: + subtoken_labels += [labels[labels_ind - 1]] + + subtoken_labels = [0] + subtoken_labels + [0] * (len(input_mask) - n_tokens_with_special + 1) + return subtoken_labels + + +@register("torch_transformers_sequence_tagger") +class TorchTransformersSequenceTagger(TorchModel): + """Transformer-based model on PyTorch for text tagging. It predicts a label for every token (not subtoken) + in the text. You can use it for sequence labeling tasks, such as morphological tagging or named entity recognition. + + Args: + n_tags: number of distinct tags + pretrained_bert: pretrained Bert checkpoint path or key title (e.g. "bert-base-uncased") + return_probas: set this to `True` if you need the probabilities instead of raw answers + bert_config_file: path to Bert configuration file, or None, if `pretrained_bert` is a string name + attention_probs_keep_prob: keep_prob for Bert self-attention layers + hidden_keep_prob: keep_prob for Bert hidden layers + optimizer: optimizer name from `torch.optim` + optimizer_parameters: dictionary with optimizer's parameters, + e.g. 
{'lr': 0.1, 'weight_decay': 0.001, 'momentum': 0.9} + learning_rate_drop_patience: how many validations with no improvements to wait + learning_rate_drop_div: the divider of the learning rate after `learning_rate_drop_patience` unsuccessful + validations + load_before_drop: whether to load best model before dropping learning rate or not + clip_norm: clip gradients by norm + min_learning_rate: min value of learning rate if learning rate decay is used + """ + + def __init__( + self, + n_tags: int, + pretrained_bert: str, + bert_config_file: Optional[str] = None, + return_probas: bool = False, + attention_probs_keep_prob: Optional[float] = None, + hidden_keep_prob: Optional[float] = None, + optimizer: str = "AdamW", + optimizer_parameters: dict = {"lr": 1e-3, "weight_decay": 1e-6}, + learning_rate_drop_patience: int = 20, + learning_rate_drop_div: float = 2.0, + load_before_drop: bool = True, + clip_norm: Optional[float] = None, + min_learning_rate: float = 1e-07, + device: str = "cpu", + **kwargs, + ) -> None: + + self.n_classes = n_tags + self.return_probas = return_probas + self.attention_probs_keep_prob = attention_probs_keep_prob + self.hidden_keep_prob = hidden_keep_prob + self.clip_norm = clip_norm + + self.pretrained_bert = pretrained_bert + self.bert_config_file = bert_config_file + + super().__init__( + optimizer=optimizer, + optimizer_parameters=optimizer_parameters, + learning_rate_drop_patience=learning_rate_drop_patience, + learning_rate_drop_div=learning_rate_drop_div, + load_before_drop=load_before_drop, + min_learning_rate=min_learning_rate, + device=device, + **kwargs, + ) + + def train_on_batch( + self, + input_ids: Union[List[List[int]], np.ndarray], + input_masks: Union[List[List[int]], np.ndarray], + y_masks: Union[List[List[int]], np.ndarray], + y: List[List[int]], + *args, + **kwargs, + ) -> Dict[str, float]: + """ + + Args: + input_ids: batch of indices of subwords + input_masks: batch of masks which determine what should be attended + args: 
additional positional arguments (kept for API + compatibility; unused by this class). + kwargs: additional keyword arguments (kept for API + compatibility; unused by this class). + y_masks: batch of masks indicating the first subtoken of each token + y: batch of token-level label indices + + Returns: + dict with the single field 'loss' + """ + b_input_ids = torch.from_numpy(input_ids).to(self.device) + b_input_masks = torch.from_numpy(input_masks).to(self.device) + subtoken_labels = [ + token_labels_to_subtoken_labels(y_el, y_mask, input_mask) + for y_el, y_mask, input_mask in zip(y, y_masks, input_masks) + ] + b_labels = torch.from_numpy(np.array(subtoken_labels)).to(torch.int64).to(self.device) + self.optimizer.zero_grad() + + loss, logits = self.model(input_ids=b_input_ids, attention_mask=b_input_masks, labels=b_labels) + loss.backward() + # Clip the norm of the gradients to `clip_norm`. + # This is to help prevent the "exploding gradients" problem. + if self.clip_norm: + torch.nn.utils.clip_grad_norm_(self.model.parameters(), self.clip_norm) + + self.optimizer.step() + if self.lr_scheduler is not None: + self.lr_scheduler.step() + + return {"loss": loss.item()} + + def __call__( + self, + input_ids: Union[List[List[int]], np.ndarray], + input_masks: Union[List[List[int]], np.ndarray], + y_masks: Union[List[List[int]], np.ndarray], + ) -> Union[List[List[int]], List[np.ndarray]]: + """Predicts tag indices for a given subword tokens batch + + Args: + input_ids: indices of the subwords + input_masks: mask that determines where to attend and where not to + y_masks: mask which determines the first subword units in the word + + Returns: + Label indices or class probabilities for each token (not subtoken) + + """ + b_input_ids = torch.from_numpy(input_ids).to(self.device) + b_input_masks = torch.from_numpy(input_masks).to(self.device) + + with torch.no_grad(): + # Forward pass, calculate logit predictions + logits = self.model(b_input_ids, 
attention_mask=b_input_masks) + + # Move logits and labels to CPU and to numpy arrays + logits = token_from_subtoken(logits[0].detach().cpu(), torch.from_numpy(y_masks)) + + probas = torch.nn.functional.softmax(logits, dim=-1) + probas = probas.detach().cpu().numpy() + + logits = logits.detach().cpu().numpy() + pred = np.argmax(logits, axis=-1) + seq_lengths = np.sum(y_masks, axis=1) + pred = [p[:leng] for leng, p in zip(seq_lengths, pred)] + + if self.return_probas: + return pred, probas + else: + return pred + + @overrides + def load(self, fname=None): + if fname is not None: + self.load_path = fname + + self.pretrained_bert = str(expand_path(self.pretrained_bert)) + if self.pretrained_bert: + config = AutoConfig.from_pretrained( + self.pretrained_bert, num_labels=self.n_classes, output_attentions=False, output_hidden_states=False + ) + self.model = AutoModelForTokenClassification.from_pretrained(self.pretrained_bert, config=config) + + elif self.bert_config_file and Path(self.bert_config_file).is_file(): + # AutoConfig has no `from_json_file`; `from_pretrained` also accepts a path to a config JSON + self.bert_config = AutoConfig.from_pretrained(str(expand_path(self.bert_config_file))) + + if self.attention_probs_keep_prob is not None: + self.bert_config.attention_probs_dropout_prob = 1.0 - self.attention_probs_keep_prob + if self.hidden_keep_prob is not None: + self.bert_config.hidden_dropout_prob = 1.0 - self.hidden_keep_prob + # Auto* model classes cannot be instantiated directly; use `from_config` + self.model = AutoModelForTokenClassification.from_config(config=self.bert_config) + else: + raise ConfigError("No pre-trained BERT model is given.") + + self.model.to(self.device) + + self.optimizer = getattr(torch.optim, self.optimizer_name)(self.model.parameters(), **self.optimizer_parameters) + if self.lr_scheduler_name is not None: + self.lr_scheduler = getattr(torch.optim.lr_scheduler, self.lr_scheduler_name)( + self.optimizer, **self.lr_scheduler_parameters + ) + + if self.load_path: + log.info(f"Load path {self.load_path} is given.") + if isinstance(self.load_path, Path) and not self.load_path.parent.is_dir(): + raise 
ConfigError("Provided load path is incorrect!") + + weights_path = Path(self.load_path.resolve()) + weights_path = weights_path.with_suffix(".pth.tar") + if weights_path.exists(): + log.info(f"Load path {weights_path} exists.") + log.info(f"Initializing `{self.__class__.__name__}` from saved.") + + # now load the weights, optimizer from saved + log.info(f"Loading weights from {weights_path}.") + checkpoint = torch.load(weights_path, map_location=self.device) + self.model.load_state_dict(checkpoint["model_state_dict"]) + self.optimizer.load_state_dict(checkpoint["optimizer_state_dict"]) + self.epochs_done = checkpoint.get("epochs_done", 0) + else: + log.info(f"Init from scratch. Load path {weights_path} does not exist.") diff --git a/annotators/entity_detection_rus/wiki_ner_rus_bert_torch.json b/annotators/entity_detection_rus/wiki_ner_rus_bert_torch.json new file mode 100644 index 0000000000..c50f985171 --- /dev/null +++ b/annotators/entity_detection_rus/wiki_ner_rus_bert_torch.json @@ -0,0 +1,133 @@ +{ + "dataset_reader": { + "class_name": "sq_reader", + "data_path": "{DOWNLOADS_PATH}/wiki_ner_rus/wikipedia_dataset_lower.pickle" + }, + "dataset_iterator": { + "class_name": "data_learning_iterator" + }, + "chainer": { + "in": ["x"], + "in_y": ["y"], + "pipe": [ + { + "class_name": "torch_transformers_preprocessor:TorchTransformersNerPreprocessor", + "vocab_file": "{TRANSFORMER}", + "do_lower_case": true, + "max_seq_length": 512, + "max_subword_length": 15, + "token_masking_prob": 0.0, + "in": ["x"], + "out": [ + "x_tokens", + "x_subword_tokens", + "x_subword_tok_ids", + "startofword_markers", + "attention_mask", + "tokens_offsets" + ] + }, + { + "id": "tag_vocab", + "class_name": "simple_vocab", + "unk_token": ["O"], + "pad_with_zeros": true, + "save_path": "{MODEL_PATH}/tag.dict", + "load_path": "{MODEL_PATH}/tag.dict", + "fit_on": ["y"], + "in": ["y"], + "out": ["y_ind"] + }, + { + "class_name": 
"torch_transformers_sequence_tagger:TorchTransformersSequenceTagger", + "n_tags": "#tag_vocab.len", + "pretrained_bert": "{TRANSFORMER}", + "attention_probs_keep_prob": 0.5, + "return_probas": true, + "encoder_layer_ids": [ + -1 + ], + "optimizer": "AdamW", + "optimizer_parameters": { + "lr": 2e-5, + "weight_decay": 1e-6, + "betas": [ + 0.9, + 0.999 + ], + "eps": 1e-6 + }, + "clip_norm": 1.0, + "min_learning_rate": 1e-7, + "learning_rate_drop_patience": 30, + "learning_rate_drop_div": 1.5, + "load_before_drop": true, + "save_path": "{MODEL_PATH}/model", + "load_path": "{MODEL_PATH}/model", + "in": [ + "x_subword_tok_ids", + "attention_mask", + "startofword_markers" + ], + "in_y": ["y_ind"], + "out": ["y_pred_ind", "probas"] + }, + { + "ref": "tag_vocab", + "in": ["y_pred_ind"], + "out": ["y_pred"] + } + ], + "out": ["x_tokens", "tokens_offsets", "y_pred", "probas"] + }, + "train": { + "epochs": 30, + "batch_size": 10, + "metrics": [ + { + "name": "ner_f1", + "inputs": [ + "y", + "y_pred" + ] + }, + { + "name": "ner_token_f1", + "inputs": [ + "y", + "y_pred" + ] + } + ], + "validation_patience": 100, + "val_every_n_batches": 20, + "log_every_n_batches": 20, + "show_examples": false, + "pytest_max_batches": 2, + "pytest_batch_size": 8, + "evaluation_targets": [ + "valid", + "test" + ], + "class_name": "torch_trainer" + }, + "metadata": { + "variables": { + "ROOT_PATH": "~/.deeppavlov", + "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", + "MODELS_PATH": "{ROOT_PATH}/models", + "TRANSFORMER": "{DOWNLOADS_PATH}/torch_bert_models/rubert_base_cased", + "MODEL_PATH": "{MODELS_PATH}/wiki_ner_rus_bert_torch_lower" + }, + "download": [ + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/rus_dream_entity_detection/wiki_ner_rus_bert_torch_lower.tar.gz", + "subdir": "{MODELS_PATH}/wiki_ner_rus_bert_torch_lower" + }, + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/entity_linking/rubert_base_cased.tar.gz", + "subdir": "{DOWNLOADS_PATH}/torch_bert_models/rubert_base_cased" 
+ } + ] + } +} diff --git a/annotators/entity_linking_rus/Dockerfile b/annotators/entity_linking_rus/Dockerfile new file mode 100644 index 0000000000..5c79a0216c --- /dev/null +++ b/annotators/entity_linking_rus/Dockerfile @@ -0,0 +1,40 @@ +FROM python:3.7.6 + +RUN apt-key del 7fa2af80 && \ + rm -f /etc/apt/sources.list.d/cuda*.list && \ + curl https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-keyring_1.0-1_all.deb \ + -o cuda-keyring_1.0-1_all.deb && \ + dpkg -i cuda-keyring_1.0-1_all.deb + +RUN apt-get -y update && \ + apt-get install -y build-essential libssl-dev zlib1g-dev libbz2-dev \ +libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev \ +xz-utils tk-dev libffi-dev liblzma-dev python-openssl git + +RUN apt-get install -y sqlite3 + +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + +ARG CONFIG +ARG PORT +ARG SRC_DIR +ARG SED_ARG=" | " + +ENV CONFIG=$CONFIG +ENV PORT=$PORT + +RUN pip install pybind11==2.2.4 +RUN pip install hdt==2.3 + +COPY ./annotators/entity_linking_rus/requirements.txt /src/requirements.txt +RUN pip install -r /src/requirements.txt + +COPY $SRC_DIR /src + +WORKDIR /src +RUN python -m deeppavlov install $CONFIG + +RUN sed -i "s|$SED_ARG|g" "$CONFIG" + +CMD gunicorn --workers=1 --timeout 500 server:app -b 0.0.0.0:8075 diff --git a/annotators/entity_linking_rus/entity_linking.py b/annotators/entity_linking_rus/entity_linking.py new file mode 100644 index 0000000000..e38dbad09a --- /dev/null +++ b/annotators/entity_linking_rus/entity_linking.py @@ -0,0 +1,573 @@ +# Copyright 2017 Neural Networks and Deep Learning lab, MIPT +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import re +import sqlite3 +from logging import getLogger +from typing import List, Dict, Tuple, Union +from collections import defaultdict + +import nltk +import pymorphy2 +from hdt import HDTDocument +from nltk.corpus import stopwords +from rapidfuzz import fuzz + +from deeppavlov.core.common.registry import register +from deeppavlov.core.models.component import Component +from deeppavlov.core.models.serializable import Serializable +from deeppavlov.core.commands.utils import expand_path + +log = getLogger(__name__) +nltk.download("stopwords") + + +@register("entity_linker") +class EntityLinker(Component, Serializable): + """ + Class for linking of entity substrings in the document to entities in Wikidata + """ + + def __init__( + self, + load_path: str, + entities_database_filename: str, + entity_ranker=None, + num_entities_for_bert_ranking: int = 50, + wikidata_file: str = None, + ngram_range: List[int] = None, + num_entities_to_return: int = 10, + max_text_len: int = 300, + lang: str = "ru", + use_descriptions: bool = True, + use_tags: bool = False, + lemmatize: bool = False, + full_paragraph: bool = False, + use_connections: bool = False, + **kwargs, + ) -> None: + """ + + Args: + load_path: path to folder with inverted index files + entity_ranker: component deeppavlov.models.kbqa.rel_ranking_bert + num_entities_for_bert_ranking: number of candidate entities for BERT ranking using description and context + ngram_range: char ngrams range for TfidfVectorizer + num_entities_to_return: number of candidate entities for the substring which are returned + 
lang: russian or english + use_descriptions: whether to perform entity ranking by context and description + use_tags: whether to filter candidate entities by the NER tag of the substring + use_connections: whether to rank candidate entities by their Wikidata connections + full_paragraph: whether to expand the ranking context to neighboring sentences + max_text_len: maximum length of the context passed to the entity ranker + lemmatize: whether to lemmatize tokens + **kwargs: + """ + super().__init__(save_path=None, load_path=load_path) + self.morph = pymorphy2.MorphAnalyzer() + self.lemmatize = lemmatize + self.entities_database_filename = entities_database_filename + self.num_entities_for_bert_ranking = num_entities_for_bert_ranking + self.wikidata_file = wikidata_file + self.entity_ranker = entity_ranker + self.num_entities_to_return = num_entities_to_return + self.max_text_len = max_text_len + self.lang = f"@{lang}" + if self.lang == "@en": + self.stopwords = set(stopwords.words("english")) + elif self.lang == "@ru": + self.stopwords = set(stopwords.words("russian")) + self.use_descriptions = use_descriptions + self.use_connections = use_connections + self.use_tags = use_tags + self.full_paragraph = full_paragraph + self.re_tokenizer = re.compile(r"[\w']+|[^\w ]") + # these attributes are referenced in rank_by_description and rank_by_connections + # but were never initialized; set sensible defaults here to avoid AttributeError + self.not_found_str = "not in wiki" + self.max_paragraph_len = max_text_len + + self.load() + + def load(self) -> None: + self.conn = sqlite3.connect(str(self.load_path / self.entities_database_filename)) + self.cur = self.conn.cursor() + self.wikidata = None + if self.wikidata_file: + self.wikidata = HDTDocument(str(expand_path(self.wikidata_file))) + + def save(self) -> None: + pass + + def __call__( + self, + entity_substr_batch: List[List[str]], + entity_tags_batch: List[List[str]] = None, + sentences_batch: List[List[str]] = None, + entity_offsets_batch: List[List[List[int]]] = None, + sentences_offsets_batch: List[List[Tuple[int, int]]] = None, + ): + if sentences_offsets_batch is None and sentences_batch is not None: + sentences_offsets_batch = [] + for sentences_list in sentences_batch: + sentences_offsets_list = [] + start = 0 + for sentence in sentences_list: + end = start + len(sentence) + sentences_offsets_list.append([start, end]) + start = end + 1 + sentences_offsets_batch.append(sentences_offsets_list) + + if entity_tags_batch is None: + entity_tags_batch = [[] for _ in 
entity_substr_batch] + else: + entity_tags_batch = [[tag.upper() for tag in entity_tags] for entity_tags in entity_tags_batch] + + if sentences_batch is None: + sentences_batch = [[] for _ in entity_substr_batch] + sentences_offsets_batch = [[] for _ in entity_substr_batch] + + log.info(f"sentences_batch {sentences_batch}") + if entity_offsets_batch is None and sentences_batch is not None: + entity_offsets_batch = [] + for entity_substr_list, sentences_list in zip(entity_substr_batch, sentences_batch): + text = " ".join(sentences_list).lower() + log.info(f"text {text}") + entity_offsets_list = [] + for entity_substr in entity_substr_list: + st_offset = text.find(entity_substr.lower()) + end_offset = st_offset + len(entity_substr) + entity_offsets_list.append([st_offset, end_offset]) + entity_offsets_batch.append(entity_offsets_list) + + entity_ids_batch, entity_conf_batch, entity_pages_batch = [], [], [] + for (entity_substr_list, entity_offsets_list, entity_tags_list, sentences_list, sentences_offsets_list,) in zip( + entity_substr_batch, + entity_offsets_batch, + entity_tags_batch, + sentences_batch, + sentences_offsets_batch, + ): + entity_ids_list, entity_conf_list, entity_pages_list = self.link_entities( + entity_substr_list, + entity_offsets_list, + entity_tags_list, + sentences_list, + sentences_offsets_list, + ) + log.info(f"entity_ids_list {entity_ids_list} entity_conf_list {entity_conf_list}") + entity_ids_batch.append(entity_ids_list) + entity_conf_batch.append(entity_conf_list) + entity_pages_batch.append(entity_pages_list) + return entity_ids_batch, entity_conf_batch, entity_pages_batch + + def link_entities( + self, + entity_substr_list: List[str], + entity_offsets_list: List[List[int]], + entity_tags_list: List[str], + sentences_list: List[str], + sentences_offsets_list: List[List[int]], + ) -> List[List[str]]: + log.info( + f"entity_substr_list {entity_substr_list} entity_tags_list {entity_tags_list} " + f"entity_offsets_list {entity_offsets_list}" 
+ ) + entity_ids_list, conf_list, pages_list = [], [], [] + if entity_substr_list: + entities_scores_list = [] + cand_ent_scores_list = [] + entity_substr_split_list = [ + [word for word in entity_substr.split(" ") if word not in self.stopwords and len(word) > 0] + for entity_substr in entity_substr_list + ] + for entity_substr, entity_substr_split, tag in zip( + entity_substr_list, entity_substr_split_list, entity_tags_list + ): + cand_ent_init = self.find_exact_match(entity_substr, tag) + if not cand_ent_init: + cand_ent_init = self.find_fuzzy_match(entity_substr_split, tag) + + cand_ent_scores = [] + for entity in cand_ent_init: + entities_scores = list(cand_ent_init[entity]) + entities_scores = sorted(entities_scores, key=lambda x: (x[0], x[1]), reverse=True) + cand_ent_scores.append((entity, entities_scores[0])) + + cand_ent_scores = sorted(cand_ent_scores, key=lambda x: (x[1][0], x[1][1]), reverse=True) + cand_ent_scores = cand_ent_scores[: self.num_entities_for_bert_ranking] + cand_ent_scores_list.append(cand_ent_scores) + entity_ids = [elem[0] for elem in cand_ent_scores] + entities_scores_list.append({ent: score for ent, score in cand_ent_scores}) + entity_ids_list.append(entity_ids) + + if self.use_connections: + entity_ids_list = [] + entities_with_conn_scores_list = self.rank_by_connections(cand_ent_scores_list) + for entities_with_conn_scores in entities_with_conn_scores_list: + entity_ids = [elem[0] for elem in entities_with_conn_scores] + entity_ids_list.append(entity_ids) + + entity_descr_list = [] + pages_dict = {} + for entity_ids in entity_ids_list: + entity_descrs = [] + for entity_id in entity_ids: + res = self.cur.execute("SELECT * FROM entity_labels WHERE entity='{}';".format(entity_id)) + entity_info = res.fetchall() + if entity_info: + ( + cur_entity_id, + cur_entity_label, + cur_entity_descr, + cur_entity_page, + ) = entity_info[0] + entity_descrs.append(cur_entity_descr) + pages_dict[cur_entity_id] = cur_entity_page + else: + 
entity_descrs.append("") + entity_descr_list.append(entity_descrs) + if self.use_descriptions: + substr_lens = [len(entity_substr.split()) for entity_substr in entity_substr_list] + entity_ids_list, conf_list = self.rank_by_description( + entity_substr_list, + entity_offsets_list, + entity_ids_list, + entity_descr_list, + entities_scores_list, + sentences_list, + sentences_offsets_list, + substr_lens, + ) + pages_list = [[pages_dict.get(entity_id, "") for entity_id in entity_ids] for entity_ids in entity_ids_list] + + return entity_ids_list, conf_list, pages_list + + def process_cand_ent(self, cand_ent_init, entities_and_ids, entity_substr_split, tag): + if self.use_tags: + for ( + cand_entity_title, + cand_entity_id, + cand_entity_rels, + cand_tag, + ) in entities_and_ids: + if tag == cand_tag: + substr_score = self.calc_substr_score(cand_entity_id, cand_entity_title, entity_substr_split) + cand_ent_init[cand_entity_id].add((substr_score, cand_entity_rels)) + if not cand_ent_init: + for ( + cand_entity_title, + cand_entity_id, + cand_entity_rels, + cand_tag, + ) in entities_and_ids: + substr_score = self.calc_substr_score(cand_entity_id, cand_entity_title, entity_substr_split) + cand_ent_init[cand_entity_id].add((substr_score, cand_entity_rels)) + else: + for cand_entity_title, cand_entity_id, cand_entity_rels in entities_and_ids: + substr_score = self.calc_substr_score(cand_entity_id, cand_entity_title, entity_substr_split) + cand_ent_init[cand_entity_id].add((substr_score, cand_entity_rels)) + return cand_ent_init + + def find_exact_match(self, entity_substr, tag): + entity_substr_split = entity_substr.split() + cand_ent_init = defaultdict(set) + # bound parameter instead of string formatting: quotes in user text would break the query + res = self.cur.execute("SELECT * FROM inverted_index WHERE title MATCH ?;", (entity_substr,)) + entities_and_ids = res.fetchall() + if entities_and_ids: + cand_ent_init = self.process_cand_ent(cand_ent_init, entities_and_ids, entity_substr_split, tag) + if self.lang == "@ru": + entity_substr_split_lemm = 
[self.morph.parse(tok)[0].normal_form for tok in entity_substr_split] + entity_substr_lemm = " ".join(entity_substr_split_lemm) + if entity_substr_lemm != entity_substr: + res = self.cur.execute( + "SELECT * FROM inverted_index WHERE title MATCH ?;", (entity_substr_lemm,) + ) + entities_and_ids = res.fetchall() + if entities_and_ids: + cand_ent_init = self.process_cand_ent( + cand_ent_init, entities_and_ids, entity_substr_split_lemm, tag + ) + return cand_ent_init + + def find_fuzzy_match(self, entity_substr_split, tag): + if self.lang == "@ru": + entity_substr_split_lemm = [self.morph.parse(tok)[0].normal_form for tok in entity_substr_split] + cand_ent_init = defaultdict(set) + for word in entity_substr_split: + # bound parameters keep quotes in user-provided words from breaking the query + res = self.cur.execute("SELECT * FROM inverted_index WHERE title MATCH ?;", (word,)) + part_entities_and_ids = res.fetchall() + cand_ent_init = self.process_cand_ent(cand_ent_init, part_entities_and_ids, entity_substr_split, tag) + if self.lang == "@ru": + word_lemm = self.morph.parse(word)[0].normal_form + if word != word_lemm: + res = self.cur.execute("SELECT * FROM inverted_index WHERE title MATCH ?;", (word_lemm,)) + part_entities_and_ids = res.fetchall() + cand_ent_init = self.process_cand_ent( + cand_ent_init, + part_entities_and_ids, + entity_substr_split_lemm, + tag, + ) + return cand_ent_init + + def morph_parse(self, word): + morph_parse_tok = self.morph.parse(word)[0] + normal_form = morph_parse_tok.normal_form + return normal_form + + def calc_substr_score(self, cand_entity_id, cand_entity_title, entity_substr_split): + label_tokens = cand_entity_title.split() + cnt = 0.0 + for ent_tok in entity_substr_split: + found = False + for label_tok in label_tokens: + if label_tok == ent_tok: + found = True + break + if found: + cnt += 1.0 + else: + for label_tok in label_tokens: + if label_tok[:2] == ent_tok[:2]: + fuzz_score = fuzz.ratio(label_tok, ent_tok) + if fuzz_score >= 80.0 and not found: + cnt += fuzz_score * 0.01 + found 
= True + break + substr_score = round(cnt / max(len(label_tokens), len(entity_substr_split)), 3) + if len(label_tokens) == 2 and len(entity_substr_split) == 1: + if entity_substr_split[0] == label_tokens[1]: + substr_score = 0.5 + elif entity_substr_split[0] == label_tokens[0]: + substr_score = 0.3 + return substr_score + + def rank_by_connections(self, cand_ent_scores_list: List[List[Union[str, Tuple[str, str]]]]): + entities_for_ranking_list = [] + for entities_scores in cand_ent_scores_list: + entities_for_ranking = [] + if entities_scores: + max_score = entities_scores[0][1][0] + for entity, scores in entities_scores: + if scores[0] == max_score: + entities_for_ranking.append(entity) + entities_for_ranking_list.append(entities_for_ranking) + + entities_sets_list = [] + for entities_scores in cand_ent_scores_list: + entities_sets_list.append({entity for entity, scores in entities_scores}) + + entities_conn_scores_list = [] + for entities_scores in cand_ent_scores_list: + cur_entity_dict = {} + for entity, scores in entities_scores: + cur_entity_dict[entity] = 0 + entities_conn_scores_list.append(cur_entity_dict) + + entities_objects_list, entities_triplets_list = [], [] + for entities_scores in cand_ent_scores_list: + cur_objects_dict, cur_triplets_dict = {}, {} + for entity, scores in entities_scores: + objects, triplets = set(), set() + tr, cnt = self.wikidata.search_triples(f"http://we/{entity}", "", "") + for triplet in tr: + objects.add(triplet[2].split("/")[-1]) + triplets.add((triplet[1].split("/")[-1], triplet[2].split("/")[-1])) + cur_objects_dict[entity] = objects + cur_triplets_dict[entity] = triplets + entities_objects_list.append(cur_objects_dict) + entities_triplets_list.append(cur_triplets_dict) + + already_ranked = {i: False for i in range(len(entities_for_ranking_list))} + + for i in range(len(entities_for_ranking_list)): + for entity1 in entities_for_ranking_list[i]: + for j in range(len(entities_for_ranking_list)): + if i != j and not 
already_ranked[j]: + inters = entities_objects_list[i][entity1].intersection(entities_sets_list[j]) + if inters: + entities_conn_scores_list[i][entity1] += len(inters) + # the original indexed the dict with a set (a TypeError); credit each shared candidate instead + for entity2 in inters: + entities_conn_scores_list[j][entity2] += len(inters) + already_ranked[j] = True + else: + for entity2 in entities_triplets_list[j]: + inters = entities_triplets_list[i][entity1].intersection( + entities_triplets_list[j][entity2] + ) + inters = {elem for elem in inters if elem[0] != "P31"} + if inters: + prev_score1 = entities_conn_scores_list[i].get(entity1, 0) + prev_score2 = entities_conn_scores_list[j].get(entity2, 0) + entities_conn_scores_list[i][entity1] = max(len(inters), prev_score1) + entities_conn_scores_list[j][entity2] = max(len(inters), prev_score2) + + entities_with_conn_scores_list = [] + for i in range(len(entities_conn_scores_list)): + entities_with_conn_scores_list.append( + sorted( + list(entities_conn_scores_list[i].items()), + key=lambda x: x[1], + reverse=True, + ) + ) + return entities_with_conn_scores_list + + def rank_by_description( + self, + entity_substr_list: List[str], + entity_offsets_list: List[List[int]], + cand_ent_list: List[List[str]], + cand_ent_descr_list: List[List[str]], + entities_scores_list: List[Dict[str, Tuple[int, float]]], + sentences_list: List[str], + sentences_offsets_list: List[Tuple[int, int]], + substr_lens: List[int], + ) -> List[List[str]]: + entity_ids_list = [] + conf_list = [] + contexts = [] + for ( + entity_substr, + (entity_start_offset, entity_end_offset), + candidate_entities, + ) in zip(entity_substr_list, entity_offsets_list, cand_ent_list): + sentence = "" + rel_start_offset = 0 + rel_end_offset = 0 + found_sentence_num = 0 + for num, (sent, (sent_start_offset, sent_end_offset)) in enumerate( + zip(sentences_list, sentences_offsets_list) + ): + if entity_start_offset >= sent_start_offset and entity_end_offset <= sent_end_offset: + sentence = sent + found_sentence_num = num + rel_start_offset = entity_start_offset - 
sent_start_offset + rel_end_offset = entity_end_offset - sent_start_offset + break + context = "" + if sentence: + start_of_sentence = 0 + end_of_sentence = len(sentence) + if len(sentence) > self.max_text_len: + start_of_sentence = max(rel_start_offset - self.max_text_len // 2, 0) + end_of_sentence = min(rel_end_offset + self.max_text_len // 2, len(sentence)) + context = ( + sentence[start_of_sentence:rel_start_offset] + "[ENT]" + sentence[rel_end_offset:end_of_sentence] + ) + if self.full_paragraph: + cur_sent_len = len(re.findall(self.re_tokenizer, context)) + first_sentence_num = found_sentence_num + last_sentence_num = found_sentence_num + context = [context] + while True: + added = False + if last_sentence_num < len(sentences_list) - 1: + last_sentence_len = len( + re.findall( + self.re_tokenizer, + sentences_list[last_sentence_num + 1], + ) + ) + if cur_sent_len + last_sentence_len < self.max_paragraph_len: + context.append(sentences_list[last_sentence_num + 1]) + cur_sent_len += last_sentence_len + last_sentence_num += 1 + added = True + if first_sentence_num > 0: + first_sentence_len = len( + re.findall( + self.re_tokenizer, + sentences_list[first_sentence_num - 1], + ) + ) + if cur_sent_len + first_sentence_len < self.max_paragraph_len: + context = [sentences_list[first_sentence_num - 1]] + context + cur_sent_len += first_sentence_len + first_sentence_num -= 1 + added = True + if not added: + break + context = " ".join(context) + + log.info(f"rank, context: {context}") + contexts.append(context) + + scores_list = self.entity_ranker(contexts, cand_ent_list, cand_ent_descr_list) + + for (entity_substr, candidate_entities, substr_len, entities_scores, scores,) in zip( + entity_substr_list, + cand_ent_list, + substr_lens, + entities_scores_list, + scores_list, + ): + log.info(f"len candidate entities {len(candidate_entities)}") + entities_with_scores = [ + ( + entity, + round(entities_scores.get(entity, (0.0, 0))[0], 2), + entities_scores.get(entity, (0.0, 
0))[1], + round(score, 2), + ) + for entity, score in scores + ] + log.info(f"len entities with scores {len(entities_with_scores)}") + entities_with_scores = sorted(entities_with_scores, key=lambda x: (x[1], x[3], x[2]), reverse=True) + log.info(f"--- entities_with_scores {entities_with_scores}") + + if not entities_with_scores: + top_entities = [self.not_found_str] + top_conf = [(0.0, 0, 0.0)] + elif entities_with_scores and substr_len == 1 and entities_with_scores[0][1] < 1.0: + top_entities = [self.not_found_str] + top_conf = [(0.0, 0, 0.0)] + elif entities_with_scores and ( + entities_with_scores[0][1] < 0.3 + or (entities_with_scores[0][3] < 0.13 and entities_with_scores[0][2] < 20) + or (entities_with_scores[0][3] < 0.3 and entities_with_scores[0][2] < 4) + or entities_with_scores[0][1] < 0.6 + ): + top_entities = [self.not_found_str] + top_conf = [(0.0, 0, 0.0)] + else: + top_entities = [score[0] for score in entities_with_scores] + top_conf = [score[1:] for score in entities_with_scores] + + log.info(f"--- top_entities {top_entities} top_conf {top_conf}") + + high_conf_entities = [] + high_conf_nums = [] + for elem_num, (entity, conf) in enumerate(zip(top_entities, top_conf)): + if len(conf) == 3 and conf[0] == 1.0 and conf[1] > 50 and conf[2] > 0.3: + new_conf = list(conf) + if new_conf[1] > 55: + new_conf[2] = 1.0 + new_conf = tuple(new_conf) + high_conf_entities.append((entity,) + new_conf) + high_conf_nums.append(elem_num) + + high_conf_entities = sorted(high_conf_entities, key=lambda x: (x[1], x[3], x[2]), reverse=True) + for n, elem_num in enumerate(high_conf_nums): + if elem_num - n >= 0 and elem_num - n < len(top_entities): + del top_entities[elem_num - n] + del top_conf[elem_num - n] + + log.info(f"top entities {top_entities} top_conf {top_conf}") + log.info(f"high_conf_entities {high_conf_entities}") + + top_entities = [elem[0] for elem in high_conf_entities] + top_entities + top_conf = [elem[1:] for elem in high_conf_entities] + top_conf + + 
log.info(f"top entities {top_entities} top_conf {top_conf}") + + if self.num_entities_to_return == 1 and top_entities: + entity_ids_list.append(top_entities[0]) + conf_list.append(top_conf[0]) + else: + entity_ids_list.append(top_entities[: self.num_entities_to_return]) + conf_list.append(top_conf[: self.num_entities_to_return]) + return entity_ids_list, conf_list diff --git a/annotators/entity_linking_rus/entity_linking_rus.json b/annotators/entity_linking_rus/entity_linking_rus.json new file mode 100644 index 0000000000..8f535a68ae --- /dev/null +++ b/annotators/entity_linking_rus/entity_linking_rus.json @@ -0,0 +1,61 @@ +{ + "chainer": { + "in": ["entity_substr", "entity_tags", "sentences"], + "pipe": [ + { + "class_name": "torch_transformers_el_ranker:TorchTransformersEntityRankerInfer", + "id": "entity_descr_ranking", + "pretrained_bert": "{TRANSFORMER}", + "encoder_weights_path": "{MODELS_PATH}/entity_linking_rus/encoder.pth.tar", + "bilinear_weights_path": "{MODELS_PATH}/entity_linking_rus/bilinear.pth.tar", + "special_token_id": 30522, + "device": "cpu", + "emb_size": 264, + "block_size": 6 + }, + { + "class_name": "entity_linking:EntityLinker", + "in": ["entity_substr", "entity_tags", "sentences"], + "out": ["entity_ids", "entity_conf", "entity_pages"], + "load_path": "{DOWNLOADS_PATH}/entity_linking_rus", + "entities_database_filename": "el_rus.db", + "entity_ranker": "#entity_descr_ranking", + "rank_in_runtime": true, + "num_entities_for_bert_ranking": 20, + "use_gpu": false, + "include_mention": false, + "num_entities_to_return": 5, + "lemmatize": true, + "use_tags": true, + "use_descriptions": true, + "full_paragraph": true, + "return_confidences": true, + "lang": "ru" + } + ], + "out": ["entity_substr", "entity_ids", "entity_conf", "entity_pages"] + }, + "metadata": { + "variables": { + "ROOT_PATH": "~/.deeppavlov", + "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", + "MODELS_PATH": "{ROOT_PATH}/models", + "TRANSFORMER": 
"{DOWNLOADS_PATH}/torch_bert_models/distilrubert_tiny_cased_conversational_v1", + "CONFIGS_PATH": "{DEEPPAVLOV_PATH}/configs" + }, + "download": [ + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/entity_linking/distilrubert_tiny_cased_v1.tar.gz", + "subdir": "{DOWNLOADS_PATH}/torch_bert_models/distilrubert_tiny_cased_conversational_v1" + }, + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/entity_linking/el_db_rus.tar.gz", + "subdir": "{DOWNLOADS_PATH}/entity_linking_rus" + }, + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/entity_linking/el_ranker_rus.tar.gz", + "subdir": "{MODELS_PATH}/entity_linking_rus" + } + ] + } +} diff --git a/annotators/entity_linking_rus/requirements.txt b/annotators/entity_linking_rus/requirements.txt new file mode 100644 index 0000000000..47b59b9619 --- /dev/null +++ b/annotators/entity_linking_rus/requirements.txt @@ -0,0 +1,12 @@ +Flask==1.1.1 +nltk==3.2.5 +gunicorn==19.9.0 +requests==2.22.0 +sentry-sdk==0.12.3 +rapidfuzz==0.7.6 +torch==1.6.0 +transformers==4.6.0 +deeppavlov==0.17.2 +itsdangerous==2.0.1 +jinja2<=3.0.3 +Werkzeug<=2.0.3 diff --git a/annotators/entity_linking_rus/server.py b/annotators/entity_linking_rus/server.py new file mode 100644 index 0000000000..eb994c700b --- /dev/null +++ b/annotators/entity_linking_rus/server.py @@ -0,0 +1,80 @@ +import logging +import os +import time +from flask import Flask, request, jsonify +import sentry_sdk +from deeppavlov import build_model + +logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) +logger = logging.getLogger(__name__) +sentry_sdk.init(os.getenv("SENTRY_DSN")) + +app = Flask(__name__) + +config_name = os.getenv("CONFIG") + +try: + el = build_model(config_name, download=True) + logger.info("model loaded") +except Exception as e: + sentry_sdk.capture_exception(e) + logger.exception(e) + raise e + + +@app.route("/model", methods=["POST"]) +def respond(): + st_time = time.time() + inp = request.json + 
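For reference, the JSON payload this handler reads has the following shape; this is an illustrative sketch based on the request bodies in test_el.py from this same patch, not part of the diff itself:

```python
# Illustrative /model payload for the handler below; field names match the
# inp.get(...) calls, example values are borrowed from test_el.py.
sample_request = {
    "entity_substr": [["форрест гамп"]],            # batch -> entity mentions per utterance
    "entity_tags": [["film"]],                      # optional; defaults to "" per mention
    "context": [["кто снял фильм форрест гамп?"]],  # dialog history per sample
}
```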
entity_substr_batch = inp.get("entity_substr", [[""]]) + entity_tags_batch = inp.get( + "entity_tags", [["" for _ in entity_substr_list] for entity_substr_list in entity_substr_batch] + ) + context_batch = inp.get("context", [[""]]) + opt_context_batch = [] + for entity_substr_list, hist_utt in zip(entity_substr_batch, context_batch): + # guard against empty utterances before inspecting the final character + last_utt = hist_utt[-1] + if last_utt and last_utt[-1] not in {".", "!", "?"}: + last_utt = f"{last_utt}." + if len(hist_utt) > 1: + prev_utt = hist_utt[-2] + if prev_utt and prev_utt[-1] not in {".", "!", "?"}: + prev_utt = f"{prev_utt}." + opt_context_batch.append([prev_utt, last_utt]) + else: + opt_context_batch.append([last_utt]) + + entity_info_batch = [[{}] for _ in entity_substr_batch] + try: + entity_substr_batch, entity_ids_batch, conf_batch, entity_pages_batch = el( + entity_substr_batch, entity_tags_batch, opt_context_batch + ) + entity_info_batch = [] + for entity_substr_list, entity_ids_list, conf_list, entity_pages_list in zip( + entity_substr_batch, + entity_ids_batch, + conf_batch, + entity_pages_batch, + ): + entity_info_list = [] + for entity_substr, entity_ids, confs, entity_pages in zip( + entity_substr_list, entity_ids_list, conf_list, entity_pages_list + ): + entity_info = {} + entity_info["entity_substr"] = entity_substr + entity_info["entity_ids"] = entity_ids + entity_info["confidences"] = [float(elem[2]) for elem in confs] + entity_info["tokens_match_conf"] = [float(elem[0]) for elem in confs] + entity_info["entity_pages"] = entity_pages + entity_info_list.append(entity_info) + entity_info_batch.append(entity_info_list) + except Exception as e: + sentry_sdk.capture_exception(e) + logger.exception(e) + total_time = time.time() - st_time + logger.info(f"entity linking exec time = {total_time:.3f}s") + return jsonify(entity_info_batch) + + +if __name__ == "__main__": + app.run(debug=False, host="0.0.0.0", port=3000) diff --git a/annotators/entity_linking_rus/test.sh b/annotators/entity_linking_rus/test.sh new file mode 100755 index 
0000000000..bf3d9284c3 --- /dev/null +++ b/annotators/entity_linking_rus/test.sh @@ -0,0 +1,4 @@ +#!/bin/bash + + +python test_el.py diff --git a/annotators/entity_linking_rus/test_el.py b/annotators/entity_linking_rus/test_el.py new file mode 100644 index 0000000000..d6fb765cba --- /dev/null +++ b/annotators/entity_linking_rus/test_el.py @@ -0,0 +1,38 @@ +import requests + +use_context = True + + +def main(): + url = "http://0.0.0.0:8075/model" + + request_data = [ + { + "entity_substr": [["форрест гамп"]], + "entity_tags": [["film"]], + "context": [["кто снял фильм форрест гамп?"]], + }, + { + "entity_substr": [["роберт левандовский"]], + "entity_tags": [["per"]], + "context": [["за какую команду играет роберт левандовский?"]], + }, + ] + + gold_results = [["Q134773"], ["Q151269"]] + + count = 0 + for data, gold_result in zip(request_data, gold_results): + result = requests.post(url, json=data).json() + entity_ids = result[0][0]["entity_ids"] + if entity_ids == gold_result: + count += 1 + else: + print(f"Got {result}, but expected: {gold_result}") + + assert count == len(request_data) + print("Success") + + +if __name__ == "__main__": + main() diff --git a/annotators/entity_linking_rus/torch_transformers_el_ranker.py b/annotators/entity_linking_rus/torch_transformers_el_ranker.py new file mode 100644 index 0000000000..f49863e2df --- /dev/null +++ b/annotators/entity_linking_rus/torch_transformers_el_ranker.py @@ -0,0 +1,382 @@ +from pathlib import Path +from logging import getLogger +from typing import List, Optional, Dict, Tuple, Union, Any + +import torch +import torch.nn as nn +import torch.nn.functional as F +import numpy as np +from torch import Tensor + +# from apex import amp + +from deeppavlov.core.commands.utils import expand_path +from transformers import AutoConfig, AutoTokenizer, AutoModel +from deeppavlov.core.common.errors import ConfigError +from deeppavlov.core.common.registry import register +from deeppavlov.core.models.torch_model import 
TorchModel +from torch_transformers_preprocessor import TorchTransformersEntityRankerPreprocessor + +log = getLogger(__name__) + + +@register("torch_transformers_el_ranker") +class TorchTransformersElRanker(TorchModel): + def __init__( + self, + model_name: str, + encoder_save_path: str, + bilinear_save_path: str, + pretrained_bert: str = None, + bert_config_file: Optional[str] = None, + criterion: str = "CrossEntropyLoss", + optimizer: str = "AdamW", + optimizer_parameters: Dict = {"lr": 5e-5, "weight_decay": 0.01, "eps": 1e-6}, + return_probas: bool = False, + attention_probs_keep_prob: Optional[float] = None, + hidden_keep_prob: Optional[float] = None, + clip_norm: Optional[float] = None, + threshold: Optional[float] = None, + **kwargs, + ): + self.encoder_save_path = encoder_save_path + self.bilinear_save_path = bilinear_save_path + self.pretrained_bert = pretrained_bert + self.bert_config_file = bert_config_file + self.return_probas = return_probas + self.attention_probs_keep_prob = attention_probs_keep_prob + self.hidden_keep_prob = hidden_keep_prob + self.clip_norm = clip_norm + + super().__init__( + model_name=model_name, + optimizer=optimizer, + criterion=criterion, + optimizer_parameters=optimizer_parameters, + return_probas=return_probas, + **kwargs, + ) + + def train_on_batch( + self, + q_features: List[Dict], + c_features: List[Dict], + entity_tokens_pos: List[int], + labels: List[int], + ) -> float: + + _input = {"labels": labels} + _input["entity_tokens_pos"] = entity_tokens_pos + for elem in ["input_ids", "attention_mask"]: + inp_elem = [getattr(f, elem) for f in q_features] + _input[f"q_{elem}"] = torch.LongTensor(inp_elem).to(self.device) + for elem in ["input_ids", "attention_mask"]: + inp_elem = [getattr(f, elem) for f in c_features] + _input[f"c_{elem}"] = torch.LongTensor(inp_elem).to(self.device) + + self.model.train() + self.model.zero_grad() + self.optimizer.zero_grad() # zero the parameter gradients + + loss, softmax_scores = 
self.model(**_input) + loss.backward() + + # Clip the norm of the gradients before the optimizer step to prevent the "exploding gradients" problem + if self.clip_norm: + torch.nn.utils.clip_grad_norm_(self.model.parameters(), self.clip_norm) + self.optimizer.step() + + if self.lr_scheduler is not None: + self.lr_scheduler.step() + + return loss.item() + + def __call__( + self, + q_features: List[Dict], + c_features: List[Dict], + entity_tokens_pos: List[int], + ) -> Union[List[int], List[np.ndarray]]: + + self.model.eval() + + _input = {"entity_tokens_pos": entity_tokens_pos} + for elem in ["input_ids", "attention_mask"]: + inp_elem = [getattr(f, elem) for f in q_features] + _input[f"q_{elem}"] = torch.LongTensor(inp_elem).to(self.device) + for elem in ["input_ids", "attention_mask"]: + inp_elem = [getattr(f, elem) for f in c_features] + _input[f"c_{elem}"] = torch.LongTensor(inp_elem).to(self.device) + + with torch.no_grad(): + softmax_scores = self.model(**_input) + if self.return_probas: + pred = softmax_scores + else: + pred = torch.argmax(softmax_scores, dim=1).cpu().numpy() + + return pred + + def siamese_ranking_el_model(self, **kwargs) -> nn.Module: + return SiameseBertElModel( + pretrained_bert=self.pretrained_bert, + encoder_save_path=self.encoder_save_path, + bilinear_save_path=self.bilinear_save_path, + bert_tokenizer_config_file=self.pretrained_bert, + device=self.device, + ) + + def save(self, fname: Optional[str] = None, *args, **kwargs) -> None: + if fname is None: + fname = self.save_path + if not fname.parent.is_dir(): + raise ConfigError("Provided save path is incorrect!") + weights_path = Path(fname).with_suffix(".pth.tar") + log.info(f"Saving model to {weights_path}.") + torch.save( + { + "model_state_dict": self.model.cpu().state_dict(), + "optimizer_state_dict": self.optimizer.state_dict(), + "epochs_done": self.epochs_done, + }, + weights_path, + ) + self.model.to(self.device) + self.model.save() + + +class TextEncoder(nn.Module): + def __init__( + self, + 
pretrained_bert: str = None, + bert_tokenizer_config_file: str = None, + bert_config_file: str = None, + resize: bool = False, + device: str = "gpu", + ): + super().__init__() + self.pretrained_bert = pretrained_bert + self.bert_config_file = bert_config_file + self.encoder, self.config, self.bert_config = None, None, None + self.device = device + self.load() + self.tokenizer = AutoTokenizer.from_pretrained(self.pretrained_bert) + self.encoder.resize_token_embeddings(len(self.tokenizer) + 1) + + def forward( + self, + input_ids: Tensor, + attention_mask: Tensor, + entity_tokens_pos: List[int] = None, + ) -> Union[Tuple[Any, Tensor], Tuple[Tensor]]: + + if entity_tokens_pos is not None: + q_outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask) + q_hidden_states = q_outputs.last_hidden_state + + entity_emb = [] + for i in range(len(entity_tokens_pos)): + pos = entity_tokens_pos[i] + entity_emb.append(q_hidden_states[i, pos]) + + entity_emb = torch.stack(entity_emb, dim=0) + return entity_emb + else: + c_outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask) + c_cls_emb = c_outputs.last_hidden_state[:, :1, :].squeeze(1) + return c_cls_emb + + def load(self) -> None: + if self.pretrained_bert: + log.info(f"From pretrained {self.pretrained_bert}.") + self.config = AutoConfig.from_pretrained(self.pretrained_bert, output_hidden_states=True) + self.encoder = AutoModel.from_pretrained(self.pretrained_bert, config=self.config) + + elif self.bert_config_file and Path(self.bert_config_file).is_file(): + self.config = AutoConfig.from_json_file(str(expand_path(self.bert_config_file))) + self.encoder = AutoModel.from_config(config=self.config) + else: + raise ConfigError("No pre-trained BERT model is given.") + self.encoder.to(self.device) + + +class BilinearRanking(nn.Module): + def __init__(self, n_classes: int = 2, emb_size: int = 768, block_size: int = 8): + super().__init__() + self.n_classes = n_classes + self.emb_size = 
emb_size + self.block_size = block_size + self.bilinear = nn.Linear(self.emb_size * self.block_size, self.n_classes) + self.softmax = nn.Softmax(dim=1) + + def forward(self, text1: Tensor, text2: Tensor): + b1 = text1.view(-1, self.emb_size // self.block_size, self.block_size) + b2 = text2.view(-1, self.emb_size // self.block_size, self.block_size) + bl = (b1.unsqueeze(3) * b2.unsqueeze(2)).view(-1, self.emb_size * self.block_size) + logits = self.bilinear(bl) + softmax_logits = self.softmax(logits) + log_softmax = F.log_softmax(logits, dim=-1) + return softmax_logits, log_softmax + + +class SiameseBertElModel(nn.Module): + def __init__( + self, + encoder_save_path: str, + bilinear_save_path: str, + pretrained_bert: str = None, + bert_tokenizer_config_file: str = None, + bert_config_file: str = None, + device: str = "gpu", + ): + super().__init__() + self.pretrained_bert = pretrained_bert + self.encoder_save_path = encoder_save_path + self.bilinear_save_path = bilinear_save_path + self.bert_config_file = bert_config_file + self.device = device + + # initialize parameters that would be filled later + self.encoder = TextEncoder(pretrained_bert=self.pretrained_bert, device=self.device) + self.bilinear_ranker = BilinearRanking(emb_size=264, block_size=6) + + def forward( + self, + q_input_ids: Tensor, + q_attention_mask: Tensor, + c_input_ids: Tensor, + c_attention_mask: Tensor, + entity_tokens_pos: List, + labels: List[int] = None, + ) -> Union[Tuple[Any, Tensor], Tuple[Tensor]]: + + entity_emb = self.encoder( + input_ids=q_input_ids, + attention_mask=q_attention_mask, + entity_tokens_pos=entity_tokens_pos, + ) + c_cls_emb = self.encoder(input_ids=c_input_ids, attention_mask=c_attention_mask) + softmax_scores, log_softmax = self.bilinear_ranker(entity_emb, c_cls_emb) + + if labels is not None: + labels_one_hot = [[0.0, 0.0] for _ in labels] + for i in range(len(labels)): + labels_one_hot[i][labels[i]] = 1.0 + labels_one_hot = 
torch.Tensor(labels_one_hot).to(self.device) + + bs, dim = labels_one_hot.shape + per_sample_loss = ( + -torch.bmm(labels_one_hot.view(bs, 1, dim), log_softmax.view(bs, dim, 1)).squeeze(2).squeeze(1) + ) + loss = torch.mean(per_sample_loss) + return loss, softmax_scores + else: + return softmax_scores + + def save(self) -> None: + encoder_weights_path = expand_path(self.encoder_save_path).with_suffix(".pth.tar") + log.info(f"Saving encoder to {encoder_weights_path}.") + torch.save({"model_state_dict": self.encoder.cpu().state_dict()}, encoder_weights_path) + bilinear_weights_path = expand_path(self.bilinear_save_path).with_suffix(".pth.tar") + log.info(f"Saving bilinear weights to {bilinear_weights_path}.") + torch.save( + {"model_state_dict": self.bilinear_ranker.cpu().state_dict()}, + bilinear_weights_path, + ) + self.encoder.to(self.device) + self.bilinear_ranker.to(self.device) + + +@register("torch_transformers_entity_ranker_infer") +class TorchTransformersEntityRankerInfer: + def __init__( + self, + pretrained_bert, + encoder_weights_path, + bilinear_weights_path, + special_token_id: int, + do_lower_case: bool = False, + batch_size: int = 5, + emb_size: int = 300, + block_size: int = 8, + device: str = "cpu", + **kwargs, + ): + self.device = torch.device("cuda" if torch.cuda.is_available() and device == "gpu" else "cpu") + self.pretrained_bert = str(expand_path(pretrained_bert)) + self.preprocessor = TorchTransformersEntityRankerPreprocessor( + vocab_file=self.pretrained_bert, + do_lower_case=do_lower_case, + special_tokens=["[ENT]"], + ) + self.encoder, self.config = None, None + self.config = AutoConfig.from_pretrained(self.pretrained_bert, output_hidden_states=True) + self.emb_size = emb_size + self.block_size = block_size + self.encoder = TextEncoder(pretrained_bert=self.pretrained_bert, device=self.device) + self.encoder_weights_path = str(expand_path(encoder_weights_path)) + self.bilinear_weights_path = str(expand_path(bilinear_weights_path)) + 
encoder_checkpoint = torch.load(self.encoder_weights_path, map_location=self.device) + self.encoder.load_state_dict(encoder_checkpoint["model_state_dict"]) + self.encoder.to(self.device) + self.bilinear_ranking = BilinearRanking(emb_size=self.emb_size, block_size=self.block_size) + bilinear_checkpoint = torch.load(self.bilinear_weights_path, map_location=self.device) + self.bilinear_ranking.load_state_dict(bilinear_checkpoint["model_state_dict"]) + self.bilinear_ranking.to(self.device) + self.special_token_id = special_token_id + self.batch_size = batch_size + + def __call__( + self, + contexts_batch: List[str], + candidate_entities_batch: List[List[str]], + candidate_entities_descr_batch: List[List[str]], + ): + entity_emb_batch = [] + + num_batches = len(contexts_batch) // self.batch_size + int(len(contexts_batch) % self.batch_size > 0) + for ii in range(num_batches): + contexts_list = contexts_batch[ii * self.batch_size : (ii + 1) * self.batch_size] + context_features = self.preprocessor(contexts_list) + context_input_ids = context_features["input_ids"] + context_attention_mask = context_features["attention_mask"] + special_tokens_pos = [] + for input_ids_list in context_input_ids: + found_n = -1 + for n, input_id in enumerate(input_ids_list): + if input_id == self.special_token_id: + found_n = n + break + if found_n == -1: + found_n = 0 + special_tokens_pos.append(found_n) + + cur_entity_emb_batch = self.encoder( + input_ids=context_input_ids, + attention_mask=context_attention_mask, + entity_tokens_pos=special_tokens_pos, + ) + + entity_emb_batch += cur_entity_emb_batch.detach().cpu().numpy().tolist() + + scores_batch = [] + for entity_emb, candidate_entities_list, candidate_entities_descr_list in zip( + entity_emb_batch, candidate_entities_batch, candidate_entities_descr_batch + ): + if candidate_entities_list: + entity_emb = [entity_emb for _ in candidate_entities_list] + entity_emb = torch.Tensor(entity_emb).to(self.device) + descr_features = 
self.preprocessor(candidate_entities_descr_list) + descr_input_ids = descr_features["input_ids"] + descr_attention_mask = descr_features["attention_mask"] + candidate_entities_emb = self.encoder(input_ids=descr_input_ids, attention_mask=descr_attention_mask) + scores_list, _ = self.bilinear_ranking(entity_emb, candidate_entities_emb) + scores_list = scores_list.detach().cpu().numpy() + scores_list = [score[1] for score in scores_list] + entities_with_scores = [(entity, score) for entity, score in zip(candidate_entities_list, scores_list)] + entities_with_scores = sorted(entities_with_scores, key=lambda x: x[1], reverse=True) + scores_batch.append(entities_with_scores) + else: + scores_batch.append([]) + + return scores_batch diff --git a/annotators/entity_linking_rus/torch_transformers_preprocessor.py b/annotators/entity_linking_rus/torch_transformers_preprocessor.py new file mode 100644 index 0000000000..463f387558 --- /dev/null +++ b/annotators/entity_linking_rus/torch_transformers_preprocessor.py @@ -0,0 +1,75 @@ +# Copyright 2017 Neural Networks and Deep Learning lab, MIPT +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from logging import getLogger +from pathlib import Path +from typing import Tuple, List, Union + +from transformers import AutoTokenizer +from transformers.data.processors.utils import InputFeatures + +from deeppavlov.core.commands.utils import expand_path +from deeppavlov.core.common.registry import register +from deeppavlov.core.models.component import Component + +log = getLogger(__name__) + + +@register("torch_transformers_entity_ranker_preprocessor") +class TorchTransformersEntityRankerPreprocessor(Component): + def __init__( + self, + vocab_file: str, + do_lower_case: bool = True, + max_seq_length: int = 512, + return_tokens: bool = False, + special_tokens: List[str] = None, + **kwargs + ) -> None: + self.max_seq_length = max_seq_length + self.return_tokens = return_tokens + if Path(vocab_file).is_file(): + vocab_file = str(expand_path(vocab_file)) + self.tokenizer = AutoTokenizer(vocab_file=vocab_file, do_lower_case=do_lower_case) + else: + self.tokenizer = AutoTokenizer.from_pretrained(vocab_file, do_lower_case=do_lower_case) + if special_tokens is not None: + special_tokens_dict = {"additional_special_tokens": special_tokens} + self.tokenizer.add_special_tokens(special_tokens_dict) + + def __call__(self, texts_a: List[str]) -> Union[List[InputFeatures], Tuple[List[InputFeatures], List[List[str]]]]: + # in case of iterator's strange behaviour + if isinstance(texts_a, tuple): + texts_a = list(texts_a) + input_features = self.tokenizer( + text=texts_a, + add_special_tokens=True, + max_length=self.max_seq_length, + padding="max_length", + return_attention_mask=True, + truncation=True, + return_tensors="pt", + ) + return input_features diff --git a/annotators/fact_retrieval/tfidf_ranker.py 
b/annotators/fact_retrieval/tfidf_ranker.py index d16d849b56..3a8bd80e26 100644 --- a/annotators/fact_retrieval/tfidf_ranker.py +++ b/annotators/fact_retrieval/tfidf_ranker.py @@ -24,8 +24,7 @@ from deeppavlov.core.models.estimator import Component from deeppavlov.core.common.file import read_json from deeppavlov.core.commands.utils import expand_path - -from common.ignore_lists import FALSE_POS_NPS_LIST, BAD_NPS_LIST +from common.remove_lists import NP_REMOVE_LIST, NP_IGNORE_LIST logger = getLogger(__name__) @@ -51,10 +50,10 @@ def __init__( freq_unigrams = f.read().splitlines()[:1000] self.np_ignore_expr = re.compile( - "(" + "|".join([r"\b%s\b" % word for word in BAD_NPS_LIST + freq_unigrams]) + ")", re.IGNORECASE + "(" + "|".join([r"\b%s\b" % word for word in NP_IGNORE_LIST + freq_unigrams]) + ")", re.IGNORECASE ) self.np_remove_expr = re.compile( - "(" + "|".join([r"\b%s\b" % word for word in FALSE_POS_NPS_LIST]) + ")", re.IGNORECASE + "(" + "|".join([r"\b%s\b" % word for word in NP_REMOVE_LIST]) + ")", re.IGNORECASE ) self.rm_spaces_expr = re.compile(r"\s\s+") diff --git a/annotators/sentseg_ru/Dockerfile b/annotators/sentseg_ru/Dockerfile new file mode 100644 index 0000000000..1908d8c330 --- /dev/null +++ b/annotators/sentseg_ru/Dockerfile @@ -0,0 +1,23 @@ +FROM deeppavlov/base-gpu:0.17.2 + +ARG CONFIG +ARG SED_ARG=" | " + +ENV CONFIG=$CONFIG + +RUN mkdir /src +RUN mkdir /midas +RUN pip install pip==21.3.1 +COPY ./requirements.txt /src/requirements.txt +RUN pip install -r /src/requirements.txt + +COPY . 
/src/ + +WORKDIR /src +RUN pip install pymorphy2==0.9.1 +RUN python -m deeppavlov install $CONFIG +RUN python -m spacy download ru_core_news_sm + +RUN sed -i "s|$SED_ARG|g" "$CONFIG" + +CMD gunicorn --workers=1 server:app -b 0.0.0.0:8011 diff --git a/annotators/sentseg_ru/README.md b/annotators/sentseg_ru/README.md new file mode 100644 index 0000000000..145aa8186c --- /dev/null +++ b/annotators/sentseg_ru/README.md @@ -0,0 +1,27 @@ +# Sentence Segmentation model for Russian Language + +Model adds punctuation marks (`.` and `?`) in Russian lower-cased text. + +Models is trained on Russian Open Subtitles dataset using ruBERT-based NER setup. The training scores are the following: +``` +{ + "valid": { + "eval_examples_count": 28977, + "metrics": { + "ner_f1": 73.9806, + "ner_token_f1": 73.9806 + }, + "time_spent": "0:00:36" + } +} +{ + "test": { + "eval_examples_count": 28976, + "metrics": { + "ner_f1": 74.1223, + "ner_token_f1": 74.1223 + }, + "time_spent": "0:00:35" + } +} +``` diff --git a/annotators/sentseg_ru/data_preprocessing.py b/annotators/sentseg_ru/data_preprocessing.py new file mode 100644 index 0000000000..50342158e1 --- /dev/null +++ b/annotators/sentseg_ru/data_preprocessing.py @@ -0,0 +1,341 @@ +import string + +from nltk.tokenize import sent_tokenize, word_tokenize + + +# Segmentation task +# dataset: one sample = (list of token without punctuations, list of tags): +# [['hi', 'alexa', 'what', 'time', 'is', 'it']] +# [['B-S', ,'O', 'B-Q', 'O', 'O', 'O']] + +# Convert cornellmoviequotes dataset to be suitable with the segmentation task + + +def preprocess(raw_text): + # input: raw text consisting of sentences without punctuation + # output: x - list of tokens, y - list of label + tmp = sent_tokenize(raw_text) + + # remove the long line which consists more than three sentences + if len(tmp) > 3: + # print(tmp) + return [], [] + + tmp = [word_tokenize(sent) for sent in tmp] + + x, y = [], [] + + for sent in tmp: + if sent[-1] == "?": + y.append("B-Q") + # 
elif sent[-1].endswith('!'): + # y.append('B-E') + else: + y.append("B-S") + + x.extend(sent[:-1]) + y.extend(["O"] * (len(sent) - 2)) + return x, y + + +def convert_russian_subtitles(): + with open(file="data/russian_subtitles_unique_utterances.txt", mode="r") as f: + lines = f.readlines() + X, Y = [], [] + + for line in lines: + tmp = line.strip().lower() + x, y = preprocess(tmp) + if x != []: + X.append(x) + Y.append(y) + + with open(file="./data/sentseg.txt", mode="w", encoding="utf-8") as fo: + for x, y in zip(X, Y): + for word, label in zip(x, y): + fo.write("{}\t{}\n".format(word, label)) + fo.write("\n") + + +def convert_cornellmoviequotes(): + with open(file="../datasets/cornellmoviequotes/moviequotes.scripts.txt", mode="r", encoding="latin-1") as f: + lines = f.readlines() + X, Y = [], [] + + for line in lines: + tmp = line.split("+++$+++")[-1].strip().lower() + # print(tmp) + + x, y = preprocess(tmp) + + # print(x) + # print(y) + # print('\n') + if x != []: + X.append(x) + Y.append(y) + + with open(file="../datasets/cornqellmoviequotes.txt", mode="w", encoding="utf-8") as fo: + for x, y in zip(X, Y): + for word, label in zip(x, y): + fo.write("{}\t{}\n".format(word, label)) + fo.write("\n") + + +def convert_dailydialog(): + X, Y = [], [] + with open(file="../datasets/dailydialog.txt", mode="r", encoding="utf-8") as f: + lines = f.readlines() + # print(lines[:10]) + # print(len(lines)) + for line in lines: + tmp = line.strip().lower() + if len(tmp) == 0: + continue + # print(tmp) + + x, y = preprocess(tmp) + + # print(x) + # print(y) + # print('\n') + if x != []: + X.append(x) + Y.append(y) + + with open(file="../datasets/dailydialog_sentseg.txt", mode="w", encoding="utf-8") as fo: + for x, y in zip(X, Y): + for word, label in zip(x, y): + fo.write("{}\t{}\n".format(word, label)) + fo.write("\n") + + +def data_split(x, y, dev_size, test_size): + from sklearn.model_selection import train_test_split + + X_train, X_test, y_train, y_test = train_test_split(x, 
y, test_size=test_size, random_state=42) + X_train, X_dev, y_train, y_dev = train_test_split( + X_train, y_train, test_size=dev_size / (1 - test_size), random_state=42 + ) + return X_train, y_train, X_dev, y_dev, X_test, y_test + + +def split_dataset(dataset_name="cornellmoviequotes"): + X, Y = [], [] + x, y = [], [] + + with open(file=f"data/{dataset_name}.txt", mode="r", encoding="utf-8") as f: + for line in f: + if line.strip() == "": + X.append(x) + Y.append(y) + x, y = [], [] + else: + items = line.split() + x.append(items[0]) + y.append(items[1]) + + xtrain, ytrain, xdev, ydev, xtest, ytest = data_split(X, Y, 0.1, 0.1) + # print(xtrain[:10]) + # print(ytrain[:10]) + # print(len(xtrain), len(ytrain), len(xdev), len(ydev), len(xtest), len(ytest)) + + def write2file(sents, labels, filename): + with open(file=filename, mode="w", encoding="utf-8") as fo: + for s, l in zip(sents, labels): + for word, tag in zip(s, l): + fo.write("{}\t{}\n".format(word, tag)) + fo.write("\n") + + write2file(xtrain, ytrain, f"data/{dataset_name}_train.txt") + write2file(xdev, ydev, f"data/{dataset_name}_dev.txt") + write2file(xtest, ytest, f"data/{dataset_name}_test.txt") + + +def create_dicts(inp_file, out_file): + word_counts = {} + + with open(file=inp_file, mode="r", encoding="utf-8") as f: + for line in f: + words = line.strip().split() + if len(words) > 0: + if words[0] not in word_counts: + word_counts[words[0]] = 1 + else: + word_counts[words[0]] += 1 + + listofTuples = sorted(word_counts.items(), key=lambda x: x[1]) + + words = ["", ""] + for elem in listofTuples: + if elem[1] > 3: + words.append(elem[0]) + + word2id = {k: v for (v, k) in enumerate(words)} + id2word = {k: v for (k, v) in enumerate(words)} + + chars = ["", ""] + for word in word2id.keys(): + for c in word: + if c not in chars: + chars.append(c) + + char2id = {k: v for (v, k) in enumerate(chars)} + id2char = {k: v for (k, v) in enumerate(chars)} + + tag2id = {"": 0, "B-S": 1, "B-Q": 2, "O": 3} + id2tag = {0: 
"", 1: "B-S", 2: "B-Q", 3: "O"} + + print(word2id) + print(char2id) + print(len(word2id), len(id2word), len(char2id), len(id2char)) + + import pickle + + with open(out_file, "wb") as f: + pickle.dump( + { + "word2id": word2id, + "id2word": id2word, + "char2id": char2id, + "id2char": id2char, + "tag2id": tag2id, + "id2tag": id2tag, + }, + f, + ) + + +def data_statistic(file): + stat = {"samples": 0, "total_words": 0, "B-S": 0, "B-Q": 0, "O": 0} + with open(file=file, mode="r") as f: + for line in f: + if len(line.strip()) > 0: + word, tag = line.strip().split("\t") + stat[tag] += 1 + stat["total_words"] += 1 + else: + stat["samples"] += 1 + + print(stat) + + +def create_dailydialog_for_deeppavlov(): + with open( + file="../datasets/ijcnlp_dailydialog/dailydialog_for_deeppavlov/dailydialog_deeppavlov2.txt", + mode="w", + encoding="utf-8", + ) as fo: + for dialog in open( + file="../datasets/ijcnlp_dailydialog/dialogues_text.txt", mode="r", encoding="utf-8" + ).readlines(): + utterances = dialog.lower().replace("! ?", "!").replace("? 
!", "?").replace("!", ".").split("__eou__")[:-1] + for utt in utterances: + if len(utt) > 200: + continue + x, y = "", "" + s = word_tokenize(utt) + for word in s: + if word in [".", "?", "!"]: + y += word + " " + elif word not in string.punctuation: + x += word + " " + y += word + " " + if y[-2] in [".", "?", "!"]: + fo.write("{} [SEP] {}\n".format(x[:-1], y[:-1])) + + # if len(y) == 0: + # continue + # y = y.replace("!", ".").replace(",", "").replace(" ’ ", "'").replace(" ", " ").strip() + # if y[-1] not in [".", "?"]: + # print(y) + # x = y.replace("?", "").replace(".", "").replace("!", "").replace(" ", " ").strip() + # if len(x.strip()) > 0: + # fo.write("{} [SEP] {}\n".format(x, y)) + + +def split_dailydialog_for_deeppavlov(): + with open( + file="../datasets/ijcnlp_dailydialog/dailydialog_for_deeppavlov/dailydialog_deeppavlov2.txt", + mode="r", + encoding="utf-8", + ) as f: + samples = f.readlines() + n = len(samples) + train = samples[: (int)(n * 0.8)] + val = samples[len(train) : (int)(n * 0.9)] + test = samples[len(train) + len(val) :] + print(len(samples), len(train), len(val), len(test)) + + with open( + file="../datasets/ijcnlp_dailydialog/dailydialog_for_deeppavlov/train2.txt", mode="w", encoding="utf-8" + ) as fo: + fo.writelines(train) + with open( + file="../datasets/ijcnlp_dailydialog/dailydialog_for_deeppavlov/valid2.txt", mode="w", encoding="utf-8" + ) as fo: + fo.writelines(val) + with open( + file="../datasets/ijcnlp_dailydialog/dailydialog_for_deeppavlov/test2.txt", mode="w", encoding="utf-8" + ) as fo: + fo.writelines(test) + + +# convert = {"Q": "?", "S": ".", "": ""} +# def SentSegRestoreSent(x, y): +# assert len(x) == len(y) +# if len(y) == 0: +# return "" +# sent = x[0] +# punct = "" if y[0] == "O" else convert[y[0][-1]] +# for word, tag in zip(x[1:], y[1:]): +# if tag != "O": +# sent += punct +# punct = convert[tag[-1]] +# sent += " " + word +# sent += punct + +# return sent + +# with 
open(file="/home/theanh/.deeppavlov/downloads/sentseg_dailydialog/train.txt", mode="w", encoding="utf-8") as fo: +# x, y = [], [] +# for line in open(file="models/dailydialog_811/train.txt", mode="r", encoding="utf-8").readlines(): +# items = line.strip().split() +# if len(items) == 0: +# if len(x) > 0: +# xs = " ".join(x) +# ys = SentSegRestoreSent(x, y) +# fo.write(f"{xs} [SEP] {ys}\n") +# x, y = [], [] +# else: +# x.append(items[0].strip()) +# y.append(items[1].strip()) + + +# import pickle +# print(pickle.load(open("models/dailydialog_811/params.pkl", "rb"))) + +# +# with open(file="/home/theanh/.deeppavlov/downloads/sentseg_dailydialog/test.txt", mode="w", encoding="utf-8") as fo: +# for line in open(file="models/dailydialog_811/test.txt", mode="r", encoding="utf-8").readlines(): +# if len(line.strip()) > 0: +# line = line.replace("B-Q", "B-?").replace("B-S", "B-.") +# fo.write(line) + +convert_russian_subtitles() + +split_dataset(dataset_name="ru_sentseg") + +create_dicts("data/ru_sentseg.txt", "data/ru_sentseg_dict.pkl") + +# data_statistic("models/dailydialog/train.txt") +# data_statistic("models/dailydialog/dev.txt") +# data_statistic("models/dailydialog/test.txt") + +# data_statistic("models/cornellmovie_811/train.txt") +# data_statistic("models/cornellmovie_811/dev.txt") +# data_statistic("models/cornellmovie_811/test.txt") + +# create_dailydialog_for_deeppavlov() + +# split_dailydialog_for_deeppavlov() diff --git a/annotators/sentseg_ru/dp_sentseg_reader.py b/annotators/sentseg_ru/dp_sentseg_reader.py new file mode 100644 index 0000000000..8d1a1c722e --- /dev/null +++ b/annotators/sentseg_ru/dp_sentseg_reader.py @@ -0,0 +1,143 @@ +from logging import getLogger +from pathlib import Path + +from deeppavlov.core.data.dataset_reader import DatasetReader +from deeppavlov.core.data.utils import download_decompress + +log = getLogger(__name__) + + +class SentSegDatasetReader(DatasetReader): + """Class to read training datasets in CoNLL-2003 format""" + + def 
read( + self, + data_path: str, + dataset_name: str = None, + provide_pos: bool = False, + provide_chunk: bool = False, + provide_doc_ids: bool = False, + iob: bool = False, + iobes: bool = False, + docstart_token: str = None, + ): + self.provide_pos = provide_pos + self.provide_chunk = provide_chunk + self.provide_doc_ids = provide_doc_ids + self.iob = iob + self.iobes = iobes + self.docstart_token = docstart_token + self.num_docs = 0 + self.x_is_tuple = self.provide_pos or self.provide_doc_ids + data_path = Path(data_path) + files = list(data_path.glob("*.txt")) + if "train.txt" not in {file_path.name for file_path in files}: + if dataset_name == "sentseg_ru": + url = "http://files.deeppavlov.ai/deeppavlov_data/sentseg_ru_subtitles_data.tar.gz" + else: + raise RuntimeError('train.txt not found in "{}"'.format(data_path)) + data_path.mkdir(exist_ok=True, parents=True) + download_decompress(url, data_path) + files = list(data_path.glob("*.txt")) + dataset = {} + + for file_name in files: + name = file_name.with_suffix("").name + dataset[name] = self.parse_ner_file(file_name) + return dataset + + def parse_ner_file(self, file_name: Path): + samples = [] + with file_name.open(encoding="utf8") as f: + tokens = [] + pos_tags = [] + chunk_tags = [] + tags = [] + expected_items = 2 + int(self.provide_pos) + int(self.provide_chunk) + for line in f: + # Check end of the document + if "DOCSTART" in line: + if len(tokens) > 1: + x = tokens if not self.x_is_tuple else (tokens,) + if self.provide_pos: + x = x + (pos_tags,) + if self.provide_chunk: + x = x + (chunk_tags,) + if self.provide_doc_ids: + x = x + (self.num_docs,) + samples.append((x, tags)) + tokens = [] + pos_tags = [] + chunk_tags = [] + tags = [] + self.num_docs += 1 + if self.docstart_token is not None: + tokens = [self.docstart_token] + pos_tags = ["O"] + chunk_tags = ["O"] + tags = ["O"] + elif len(line) < 2: + if (len(tokens) > 0) and (tokens != [self.docstart_token]): + x = tokens if not self.x_is_tuple else 
(tokens,) + if self.provide_pos: + x = x + (pos_tags,) + if self.provide_chunk: + x = x + (chunk_tags,) + if self.provide_doc_ids: + x = x + (self.num_docs,) + samples.append((x, tags)) + tokens = [] + pos_tags = [] + chunk_tags = [] + tags = [] + else: + items = line.split() + if len(items) < expected_items: + raise Exception(f"Input is not valid {line}") + tokens.append(items[0]) + tags.append(items[-1]) + if self.provide_pos: + pos_tags.append(items[1]) + if self.provide_chunk: + chunk_tags.append(items[2]) + if tokens: + x = tokens if not self.x_is_tuple else (tokens,) + if self.provide_pos: + x = x + (pos_tags,) + if self.provide_chunk: + x = x + (chunk_tags,) + if self.provide_doc_ids: + x = x + (self.num_docs,) + samples.append((x, tags)) + self.num_docs += 1 + + if self.iob: + return [(x, self._iob2_to_iob(tags)) for x, tags in samples] + if self.iobes: + return [(x, self._iob2_to_iobes(tags)) for x, tags in samples] + + return samples + + @staticmethod + def _iob2_to_iob(tags): + iob_tags = [] + + for n, tag in enumerate(tags): + if tag.startswith("B-") and (not n or (tags[n - 1][2:] != tag[2:])): + tag = tag.replace("B-", "I-") + iob_tags.append(tag) + + return iob_tags + + @staticmethod + def _iob2_to_iobes(tags): + tag_map = {"BB": "S", "BO": "S", "IB": "E", "IO": "E"} + tags = tags + ["O"] + iobes_tags = [] + for i in range(len(tags) - 1): + tagtag = tags[i][0] + tags[i + 1][0] + if tagtag in tag_map: + iobes_tags.append(tag_map[tagtag] + tags[i][1:]) + else: + iobes_tags.append(tags[i]) + return iobes_tags diff --git a/annotators/sentseg_ru/requirements.txt b/annotators/sentseg_ru/requirements.txt new file mode 100644 index 0000000000..87708bd4d1 --- /dev/null +++ b/annotators/sentseg_ru/requirements.txt @@ -0,0 +1,8 @@ +flask==1.1.1 +itsdangerous==2.0.1 +gunicorn==20.0.4 +sentry-sdk==0.13.4 +requests==2.22.0 +spacy==3.2.0 +jinja2<=3.0.3 +Werkzeug<=2.0.3 diff --git a/annotators/sentseg_ru/sentseg_ru_bert_torch.json 
b/annotators/sentseg_ru/sentseg_ru_bert_torch.json new file mode 100644 index 0000000000..750730757c --- /dev/null +++ b/annotators/sentseg_ru/sentseg_ru_bert_torch.json @@ -0,0 +1,154 @@ +{ + "dataset_reader": { + "class_name": "dp_sentseg_reader:SentSegDatasetReader", + "data_path": "{DOWNLOADS_PATH}/sentseg_ru_subtitles_data", + "dataset_name": "sentseg_ru", + "provide_pos": false + }, + "dataset_iterator": { + "class_name": "data_learning_iterator" + }, + "chainer": { + "in": [ + "x" + ], + "in_y": [ + "y" + ], + "pipe": [ + { + "class_name": "torch_transformers_ner_preprocessor", + "vocab_file": "{TRANSFORMER}", + "do_lower_case": true, + "max_seq_length": 256, + "max_subword_length": 15, + "token_masking_prob": 0.0, + "in": [ + "x" + ], + "out": [ + "x_tokens", + "x_subword_tokens", + "x_subword_tok_ids", + "startofword_markers", + "attention_mask" + ] + }, + { + "id": "tag_vocab", + "class_name": "simple_vocab", + "unk_token": [ + "O" + ], + "pad_with_zeros": true, + "save_path": "{MODEL_PATH}/tag.dict", + "load_path": "{MODEL_PATH}/tag.dict", + "fit_on": [ + "y" + ], + "in": [ + "y" + ], + "out": [ + "y_ind" + ] + }, + { + "class_name": "torch_transformers_sequence_tagger", + "n_tags": "#tag_vocab.len", + "pretrained_bert": "{TRANSFORMER}", + "attention_probs_keep_prob": 0.5, + "return_probas": false, + "encoder_layer_ids": [ + -1 + ], + "optimizer": "AdamW", + "optimizer_parameters": { + "lr": 2e-05, + "weight_decay": 1e-06, + "betas": [ + 0.9, + 0.999 + ], + "eps": 1e-06 + }, + "clip_norm": 1.0, + "min_learning_rate": 1e-07, + "learning_rate_drop_patience": 30, + "learning_rate_drop_div": 1.5, + "load_before_drop": true, + "save_path": "{MODEL_PATH}/model", + "load_path": "{MODEL_PATH}/model", + "in": [ + "x_subword_tok_ids", + "attention_mask", + "startofword_markers" + ], + "in_y": [ + "y_ind" + ], + "out": [ + "y_pred_ind" + ] + }, + { + "ref": "tag_vocab", + "in": [ + "y_pred_ind" + ], + "out": [ + "y_pred" + ] + } + ], + "out": [ + "x_tokens", + 
"y_pred" + ] + }, + "train": { + "epochs": 30, + "batch_size": 64, + "metrics": [ + { + "name": "ner_f1", + "inputs": [ + "y", + "y_pred" + ] + }, + { + "name": "ner_token_f1", + "inputs": [ + "y", + "y_pred" + ] + } + ], + "validation_patience": 100, + "val_every_n_epochs": 1, + "log_every_n_batches": 100, + "show_examples": false, + "pytest_max_batches": 2, + "pytest_batch_size": 8, + "evaluation_targets": [ + "valid" + ], + "class_name": "torch_trainer" + }, + "metadata": { + "variables": { + "ROOT_PATH": "~/.deeppavlov", + "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", + "MODELS_PATH": "{ROOT_PATH}/models", + "TRANSFORMER": "DeepPavlov/rubert-base-cased-conversational", + "MODEL_PATH": "{MODELS_PATH}/sentseg_ru_bert_torch_v0" + }, + "download": [ + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/sentseg_ru_bert_torch_v0.tar.gz", + "subdir": "{MODELS_PATH}" + } + ] + } +} \ No newline at end of file diff --git a/annotators/sentseg_ru/server.py b/annotators/sentseg_ru/server.py new file mode 100644 index 0000000000..093a4ae8d3 --- /dev/null +++ b/annotators/sentseg_ru/server.py @@ -0,0 +1,97 @@ +import re +import logging +import time +from os import getenv + +import sentry_sdk +import spacy +from deeppavlov import build_model +from flask import Flask, jsonify, request + + +sentry_sdk.init(getenv("SENTRY_DSN")) + +logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) +logger = logging.getLogger(__name__) + +app = Flask(__name__) +config = getenv("CONFIG", "sentseg_ru.json") +PUNCTUATION = re.compile(r"[\.\?\!\,]+") +DOUBLE_SPACE = re.compile(r"\s+") + + +try: + spacy_nlp = spacy.load("ru_core_news_sm") + sentseg_model = build_model(config, download=True) + m = sentseg_model(["привет как дела"]) +except Exception as e: + logger.exception("SentSeg Russian not loaded") + sentry_sdk.capture_exception(e) + raise e + + +def split_segments(sentence): + segm = re.split(PUNCTUATION, sentence) + segm = [sent.strip() for 
sent in segm if sent != ""] + + curr_sent = "" + punct_occur = False + segments = [] + + for s in segm: + if re.match(PUNCTUATION, s): + punct_occur = True + curr_sent += s + elif punct_occur: + segments.append(curr_sent) + curr_sent = s + punct_occur = False + else: + curr_sent += s + segments.append(curr_sent) + return segments + + +def add_punctuation(tokens, pred_labels): + # sentseg_model = build_model(configs.ner.ner_ontonotes_bert_torch, download=True) + # + # sentseg_model(['привет как дела']) + # >>> [[['привет', 'как', 'дела']], [['B-S', 'B-Q', 'O']]] + tag2text = {"B-S": ".", "B-Q": "?", "O": "."} + punctuation = tag2text[pred_labels[0]] + sent = tokens[0] + for word, tag in zip(tokens[1:], pred_labels[1:]): + if tag != "O": + sent += punctuation + punctuation = tag2text[tag] + sent += " " + word + sent += punctuation + logger.info(f"Punctuated: {sent}") + return sent + + +def split_sentences(sentences): + doc = spacy_nlp(sentences) + return [sent.text for sent in doc.sents] + + +@app.route("/sentseg", methods=["POST"]) +def respond(): + st_time = time.time() + utterances = request.json["sentences"] + utterances = [DOUBLE_SPACE.sub(" ", PUNCTUATION.sub(" ", uttr)) for uttr in utterances] + ptokens = sentseg_model(utterances) + punctuated = [add_punctuation(tokens, pred_labels) for tokens, pred_labels in zip(ptokens[0], ptokens[1])] + segments = [split_sentences(utt) for utt in punctuated] + + sentseg_result = [] + for utt, segs in zip(punctuated, segments): + sentseg_result += [{"punct_sent": utt, "segments": segs}] + + total_time = time.time() - st_time + logger.info(f"sentseg exec time: {total_time:.3f}s") + return jsonify(sentseg_result) + + +if __name__ == "__main__": + app.run(debug=False, host="0.0.0.0", port=3000) diff --git a/annotators/sentseg_ru/test.py b/annotators/sentseg_ru/test.py new file mode 100644 index 0000000000..82a1ab2167 --- /dev/null +++ b/annotators/sentseg_ru/test.py @@ -0,0 +1,15 @@ +import requests + + +url = 
"http://0.0.0.0:8011/sentseg" +sentences = {"sentences": ["привет как дела"]} + +gold = "привет. как дела?" +segments_gold = ["привет.", "как дела?"] + +response = requests.post(url, json=sentences).json() + +assert response[0]["punct_sent"] == gold, print(response) +assert response[0]["segments"] == segments_gold, print(response) + +print("SUCCESS!") diff --git a/annotators/sentseg_ru/test.sh b/annotators/sentseg_ru/test.sh new file mode 100755 index 0000000000..61672db785 --- /dev/null +++ b/annotators/sentseg_ru/test.sh @@ -0,0 +1,3 @@ +#!/bin/bash + +python test.py diff --git a/annotators/spacy_annotator/Dockerfile b/annotators/spacy_annotator/Dockerfile new file mode 100644 index 0000000000..cebc1296f4 --- /dev/null +++ b/annotators/spacy_annotator/Dockerfile @@ -0,0 +1,25 @@ +FROM python:3.8.4 + +ARG SRC_DIR +ENV SRC_DIR ${SRC_DIR} +ARG SERVICE_PORT +ENV SERVICE_PORT ${SERVICE_PORT} +ARG SPACY_MODEL +ENV SPACY_MODEL ${SPACY_MODEL} +ARG TOKEN_ATTRIBUTES +ENV TOKEN_ATTRIBUTES ${TOKEN_ATTRIBUTES} +ARG ANNOTATE_BATCH_WITH_TOKENS_ONLY +ENV ANNOTATE_BATCH_WITH_TOKENS_ONLY ${ANNOTATE_BATCH_WITH_TOKENS_ONLY} + +RUN mkdir /src + +COPY $SRC_DIR /src/ +COPY ./common/ /src/common/ + +COPY $SRC_DIR/requirements.txt /src/requirements.txt +RUN pip install -r /src/requirements.txt +RUN python -m spacy download ${SPACY_MODEL} + +WORKDIR /src + +CMD gunicorn --workers=2 server:app diff --git a/annotators/spacy_annotator/README.txt b/annotators/spacy_annotator/README.txt new file mode 100644 index 0000000000..10a665d7f5 --- /dev/null +++ b/annotators/spacy_annotator/README.txt @@ -0,0 +1 @@ +This is Cobot nounphrase annotator. 
diff --git a/annotators/spacy_annotator/requirements.txt b/annotators/spacy_annotator/requirements.txt new file mode 100644 index 0000000000..dd14077e1c --- /dev/null +++ b/annotators/spacy_annotator/requirements.txt @@ -0,0 +1,9 @@ +flask==1.1.1 +itsdangerous==2.0.1 +gunicorn==20.0.4 +sentry-sdk==0.13.4 +requests==2.22.0 +spacy==3.2.0 +click<=8.0.4 +jinja2<=3.0.3 +Werkzeug<=2.0.3 \ No newline at end of file diff --git a/annotators/spacy_annotator/server.py b/annotators/spacy_annotator/server.py new file mode 100644 index 0000000000..37fc353d9e --- /dev/null +++ b/annotators/spacy_annotator/server.py @@ -0,0 +1,60 @@ +import logging +import re +import time +from os import getenv + +import sentry_sdk +import spacy +from flask import Flask, request, jsonify + + +sentry_sdk.init(getenv("SENTRY_DSN")) + +spacy_nlp = spacy.load(getenv("SPACY_MODEL")) +TOKEN_ATTRIBUTES = getenv("TOKEN_ATTRIBUTES").split("|") +ANNOTATE_BATCH_WITH_TOKENS_ONLY = getenv("ANNOTATE_BATCH_WITH_TOKENS_ONLY", False) + +logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.DEBUG) +logger = logging.getLogger(__name__) + +app = Flask(__name__) + + +def remove_quotes(text): + return re.sub(r"\s+", " ", re.sub(r"\'\"", " ", text)).strip() + + +def get_result(request, only_tokens=False): + st_time = time.time() + sentences = request.json["sentences"] + result = [] + + for uttr in sentences: + doc = spacy_nlp(remove_quotes(uttr)) + curr_tokens = [] + for token in doc: + curr_token = {"text": token.text} + if not only_tokens: + for attr in TOKEN_ATTRIBUTES: + curr_token[attr] = str(getattr(token, attr)) + curr_tokens += [curr_token] + result += [curr_tokens] + total_time = time.time() - st_time + logger.info(f"spacy_annotator exec time: {total_time:.3f}s") + return result + + +@app.route("/respond", methods=["POST"]) +def respond(): + result = get_result(request) + return jsonify(result) + + +@app.route("/respond_batch", methods=["POST"]) +def respond_batch(): + 
result = get_result(request, only_tokens=ANNOTATE_BATCH_WITH_TOKENS_ONLY) + return jsonify([{"batch": result}]) + + +if __name__ == "__main__": + app.run(debug=False, host="0.0.0.0", port=3000) diff --git a/annotators/spacy_annotator/test.py b/annotators/spacy_annotator/test.py new file mode 100644 index 0000000000..a07d982bc9 --- /dev/null +++ b/annotators/spacy_annotator/test.py @@ -0,0 +1,67 @@ +import os +import requests + + +SERVICE_PORT = int(os.getenv("SERVICE_PORT")) + + +def main(): + url = f"http://0.0.0.0:{SERVICE_PORT}/respond" + input_data = {"sentences": ["джейсон стетхэм хочет есть."]} + gold = [ + [ + { + "dep_": "nsubj", + "ent_iob_": "B", + "ent_type_": "PER", + "lemma_": "джейсон", + "morph": "Animacy=Anim|Case=Nom|Gender=Masc|Number=Sing", + "pos_": "PROPN", + "text": "джейсон", + }, + { + "dep_": "appos", + "ent_iob_": "I", + "ent_type_": "PER", + "lemma_": "стетхэм", + "morph": "Animacy=Anim|Case=Nom|Gender=Masc|Number=Sing", + "pos_": "PROPN", + "text": "стетхэм", + }, + { + "dep_": "ROOT", + "ent_iob_": "O", + "ent_type_": "", + "lemma_": "хотеть", + "morph": "Aspect=Imp|Mood=Ind|Number=Sing|Person=Third|Tense=Pres|VerbForm=Fin|Voice=Act", + "pos_": "VERB", + "text": "хочет", + }, + { + "dep_": "xcomp", + "ent_iob_": "O", + "ent_type_": "", + "lemma_": "есть", + "morph": "Aspect=Imp|VerbForm=Inf|Voice=Act", + "pos_": "VERB", + "text": "есть", + }, + { + "dep_": "punct", + "ent_iob_": "O", + "ent_type_": "", + "lemma_": ".", + "morph": "", + "pos_": "PUNCT", + "text": ".", + }, + ] + ] + + result = requests.post(url, json=input_data).json() + assert result == gold, print(result) + print("Success!") + + +if __name__ == "__main__": + main() diff --git a/annotators/spacy_annotator/test.sh b/annotators/spacy_annotator/test.sh new file mode 100755 index 0000000000..61672db785 --- /dev/null +++ b/annotators/spacy_annotator/test.sh @@ -0,0 +1,3 @@ +#!/bin/bash + +python test.py diff --git a/annotators/spelling_preprocessing_ru/Dockerfile 
b/annotators/spelling_preprocessing_ru/Dockerfile new file mode 100644 index 0000000000..b680e623d5 --- /dev/null +++ b/annotators/spelling_preprocessing_ru/Dockerfile @@ -0,0 +1,38 @@ +FROM tensorflow/tensorflow:1.15.2-gpu + +RUN apt-key del 7fa2af80 && \ + rm -f /etc/apt/sources.list.d/cuda*.list && \ + curl https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-keyring_1.0-1_all.deb \ + -o cuda-keyring_1.0-1_all.deb && \ + dpkg -i cuda-keyring_1.0-1_all.deb + +RUN apt-get -y update && \ + apt-get install -y software-properties-common && \ + apt-get update && apt-get install git -y + +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + +ARG CONFIG +ARG COMMIT=0.13.0 +ARG PORT +ARG SRC_DIR +ARG SED_ARG=" | " + +ENV CONFIG=$CONFIG +ENV PORT=$PORT + +COPY ./annotators/spelling_preprocessing_ru/requirements.txt /src/requirements.txt +RUN pip install -r /src/requirements.txt + +RUN pip install git+https://github.com/deepmipt/DeepPavlov.git@${COMMIT} + +COPY $SRC_DIR /src + +WORKDIR /src + +RUN python -m deeppavlov install $CONFIG + +RUN sed -i "s|$SED_ARG|g" "$CONFIG" + +CMD gunicorn --workers=1 --timeout 500 server:app -b 0.0.0.0:8074 diff --git a/annotators/spelling_preprocessing_ru/levenshtein_corrector_ru.json b/annotators/spelling_preprocessing_ru/levenshtein_corrector_ru.json new file mode 100644 index 0000000000..8052847209 --- /dev/null +++ b/annotators/spelling_preprocessing_ru/levenshtein_corrector_ru.json @@ -0,0 +1,60 @@ +{ + "chainer":{ + "in": ["x"], + "pipe": [ + { + "class_name": "str_lower", + "id": "lower", + "in": ["x"], + "out": ["x_lower"] + }, + { + "class_name": "nltk_moses_tokenizer", + "id": "tokenizer", + "in": ["x_lower"], + "out": ["x_tokens"] + }, + { + "id": "vocab", + "class_name": "simple_vocab", + "save_path": "{DOWNLOADS_PATH}/vocabs/russian_words_vocab.dict", + "load_path": "{DOWNLOADS_PATH}/vocabs/russian_words_vocab.dict" + }, + { + "in": ["x_tokens"], + "out": ["tokens_candidates"], + "class_name": 
"spelling_levenshtein", + "words": "#vocab.keys()" + }, + { + "class_name": "kenlm_elector", + "in": ["tokens_candidates"], + "out": ["y_predicted_tokens"], + "load_path": "{DOWNLOADS_PATH}/language_models/ru_wiyalen_no_punkt.arpa.binary" + }, + { + "ref": "tokenizer", + "in": ["y_predicted_tokens"], + "out": ["y_predicted"] + } + ], + "out": ["y_predicted"] + }, + "metadata": { + "variables": { + "ROOT_PATH": "~/.deeppavlov", + "DOWNLOADS_PATH": "{ROOT_PATH}/downloads", + "MODELS_PATH": "{ROOT_PATH}/models" + }, + "download": [ + { + "url": "http://files.deeppavlov.ai/deeppavlov_data/vocabs/russian_words_vocab.dict.gz", + "subdir": "{DOWNLOADS_PATH}/vocabs" + }, + { + "url": "http://files.deeppavlov.ai/lang_models/ru_wiyalen_no_punkt.arpa.binary.gz", + "subdir": "{DOWNLOADS_PATH}/language_models" + } + ] + } +} diff --git a/annotators/spelling_preprocessing_ru/requirements.txt b/annotators/spelling_preprocessing_ru/requirements.txt new file mode 100644 index 0000000000..7582d4cb08 --- /dev/null +++ b/annotators/spelling_preprocessing_ru/requirements.txt @@ -0,0 +1,5 @@ +sentry-sdk[flask]==0.14.1 +flask==1.1.1 +gunicorn==19.9.0 +requests==2.22.0 +itsdangerous==2.0.1 \ No newline at end of file diff --git a/annotators/spelling_preprocessing_ru/server.py b/annotators/spelling_preprocessing_ru/server.py new file mode 100644 index 0000000000..6f26e0d053 --- /dev/null +++ b/annotators/spelling_preprocessing_ru/server.py @@ -0,0 +1,43 @@ +import logging +import os +import time + +import sentry_sdk +from flask import Flask, jsonify, request + +from deeppavlov import build_model + +sentry_sdk.init(os.getenv("SENTRY_DSN")) + +logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) +logger = logging.getLogger(__name__) +app = Flask(__name__) + +config_name = os.getenv("CONFIG") + +try: + spelling_preprocessing_model = build_model(config_name, download=True) + r = "я ге видел малако" + logger.info(f"Original: {r}. 
Corrected: {spelling_preprocessing_model([r])}") + logger.info("spelling_preprocessing model is loaded.") +except Exception as e: + sentry_sdk.capture_exception(e) + logger.exception(e) + raise e + + +@app.route("/respond", methods=["POST"]) +def respond(): + st_time = time.time() + + sentences = request.json["sentences"] + sentences = [text.lower() for text in sentences] + corrected_sentences = spelling_preprocessing_model(sentences) + + total_time = time.time() - st_time + logger.info(f"spelling_preprocessing exec time: {total_time:.3f}s") + return jsonify(corrected_sentences) + + +if __name__ == "__main__": + app.run(debug=False, host="0.0.0.0", port=8074) diff --git a/annotators/spelling_preprocessing_ru/test.sh b/annotators/spelling_preprocessing_ru/test.sh new file mode 100755 index 0000000000..b37c67d44c --- /dev/null +++ b/annotators/spelling_preprocessing_ru/test.sh @@ -0,0 +1,4 @@ +#!/bin/bash + + +python test_server.py diff --git a/annotators/spelling_preprocessing_ru/test_server.py b/annotators/spelling_preprocessing_ru/test_server.py new file mode 100644 index 0000000000..ccdd7ad04d --- /dev/null +++ b/annotators/spelling_preprocessing_ru/test_server.py @@ -0,0 +1,22 @@ +import requests + + +def main(): + url = "http://0.0.0.0:8074/respond" + + request_data = [{"sentences": ["я ге видел малако"]}] + + gold_results = [["я не видел малакон"]] + + count = 0 + for data, gold_result in zip(request_data, gold_results): + result = requests.post(url, json=data).json() + if result == gold_result: + count += 1 + + assert count == len(request_data) + print("Success") + + +if __name__ == "__main__": + main() diff --git a/annotators/toxic_classification_ru/Dockerfile b/annotators/toxic_classification_ru/Dockerfile new file mode 100644 index 0000000000..608116d5d1 --- /dev/null +++ b/annotators/toxic_classification_ru/Dockerfile @@ -0,0 +1,23 @@ +# syntax=docker/dockerfile:experimental + +FROM pytorch/pytorch:1.5-cuda10.1-cudnn7-runtime + +WORKDIR /src + +ARG 
PRETRAINED_MODEL_NAME_OR_PATH +ENV PRETRAINED_MODEL_NAME_OR_PATH ${PRETRAINED_MODEL_NAME_OR_PATH} +ARG SERVICE_PORT +ENV SERVICE_PORT ${SERVICE_PORT} + +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + +COPY ./requirements.txt /src/requirements.txt +RUN pip install -r /src/requirements.txt + +COPY . /src + +HEALTHCHECK --interval=5s --timeout=90s --retries=3 CMD curl --fail 127.0.0.1:${SERVICE_PORT}/healthcheck || exit 1 + + +CMD gunicorn --workers=1 server:app -b 0.0.0.0:${SERVICE_PORT} --timeout=300 diff --git a/annotators/toxic_classification_ru/README.md b/annotators/toxic_classification_ru/README.md new file mode 100644 index 0000000000..3a5f14484f --- /dev/null +++ b/annotators/toxic_classification_ru/README.md @@ -0,0 +1,3 @@ +GPU RAM = 1Gb +cpu time = 0.15 sec +gpu time = 0.05 sec \ No newline at end of file diff --git a/annotators/toxic_classification_ru/requirements.txt b/annotators/toxic_classification_ru/requirements.txt new file mode 100644 index 0000000000..d45fabd47e --- /dev/null +++ b/annotators/toxic_classification_ru/requirements.txt @@ -0,0 +1,10 @@ +transformers==4.0.1 +sentencepiece==0.1.94 +flask==1.1.1 +gunicorn==19.9.0 +requests==2.22.0 +sentry-sdk[flask]==0.14.1 +healthcheck==1.3.3 +itsdangerous==2.0.1 +jinja2<=3.0.3 +Werkzeug<=2.0.3 diff --git a/annotators/toxic_classification_ru/server.py b/annotators/toxic_classification_ru/server.py new file mode 100644 index 0000000000..5c8f5a7678 --- /dev/null +++ b/annotators/toxic_classification_ru/server.py @@ -0,0 +1,92 @@ +""" +Source code is https://github.com/Grossmend/DialoGPT/blob/master/src/service/service.py +""" +import logging +import time +import os + +import sentry_sdk +import torch +from flask import Flask, request, jsonify +from healthcheck import HealthCheck +from sentry_sdk.integrations.flask import FlaskIntegration +from transformers import BertTokenizer, BertForSequenceClassification + + +sentry_sdk.init(dsn=os.getenv("SENTRY_DSN"), integrations=[FlaskIntegration()]) + 
+logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) +logger = logging.getLogger(__name__) + +PRETRAINED_MODEL_NAME_OR_PATH = os.environ.get( + "PRETRAINED_MODEL_NAME_OR_PATH", "SkolkovoInstitute/russian_toxicity_classifier" +) +logger.info(f"PRETRAINED_MODEL_NAME_OR_PATH = {PRETRAINED_MODEL_NAME_OR_PATH}") + +cuda = torch.cuda.is_available() +if cuda: + torch.cuda.set_device(0) + device = "cuda" +else: + device = "cpu" + +logger.info(f"toxic-classification is set to run on {device}") + +try: + tokenizer = BertTokenizer.from_pretrained("SkolkovoInstitute/russian_toxicity_classifier") + model = BertForSequenceClassification.from_pretrained("SkolkovoInstitute/russian_toxicity_classifier") + model.eval() + if cuda: + model.cuda() + logger.info("toxic-classification model is ready") +except Exception as e: + sentry_sdk.capture_exception(e) + logger.exception(e) + raise e + +app = Flask(__name__) +health = HealthCheck(app, "/healthcheck") +logging.getLogger("werkzeug").setLevel("WARNING") + + +def classify_sentences(sentences): + try: + batch_tokens = [] + for sent in sentences: + batch_tokens += [tokenizer.encode(sent, padding="max_length", max_length=64, return_tensors="pt")] + + model_input = torch.cat(batch_tokens, dim=0) + model_input = model_input.cuda() if cuda else model_input + result = [] + with torch.no_grad(): + outputs = model(model_input) + probas = torch.nn.functional.softmax(outputs.logits, dim=-1) + for sent, prob_dist in zip(sentences, probas): + result += [{"toxic": float(prob_dist[1])}] + except Exception as exc: + logger.exception(exc) + sentry_sdk.capture_exception(exc) + result = [{"toxic": 0.0}] * len(sentences) + return result + + +@app.route("/respond", methods=["POST"]) +def respond(): + st_time = time.time() + sentences = request.json.get("sentences", []) + result = classify_sentences(sentences) + total_time = time.time() - st_time + logger.info(f"toxic-classification exec time: 
{total_time:.3f}s") + + return jsonify(result) + + +@app.route("/respond_batch", methods=["POST"]) +def respond_batch(): + st_time = time.time() + sentences = request.json.get("sentences", []) + result = classify_sentences(sentences) + total_time = time.time() - st_time + logger.info(f"toxic-classification exec time: {total_time:.3f}s") + + return jsonify([{"batch": result}]) diff --git a/annotators/toxic_classification_ru/test.py b/annotators/toxic_classification_ru/test.py new file mode 100644 index 0000000000..7918a71e27 --- /dev/null +++ b/annotators/toxic_classification_ru/test.py @@ -0,0 +1,16 @@ +import requests + + +def test_respond(): + url = "http://0.0.0.0:8126/respond" + + sentences = ["иди в жопу", "иду иду"] + gold = [0.9885, 0.0086] + request_data = {"sentences": sentences} + result = requests.post(url, json=request_data).json() + assert round(result[0]["toxic"], 4) == gold[0] and round(result[1]["toxic"], 4) == gold[1], f"Got\n{result}" + print("Success!") + + +if __name__ == "__main__": + test_respond() diff --git a/annotators/toxic_classification_ru/test.sh b/annotators/toxic_classification_ru/test.sh new file mode 100755 index 0000000000..468a5a38fc --- /dev/null +++ b/annotators/toxic_classification_ru/test.sh @@ -0,0 +1,3 @@ +#!/bin/bash + +python test.py \ No newline at end of file diff --git a/annotators/wiki_parser/Dockerfile b/annotators/wiki_parser/Dockerfile index b06283b4a0..66a8d404e6 100644 --- a/annotators/wiki_parser/Dockerfile +++ b/annotators/wiki_parser/Dockerfile @@ -4,10 +4,13 @@ ARG CONFIG ARG COMMIT ARG PORT ARG SRC_DIR +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} ENV CONFIG=$CONFIG ENV PORT=$PORT ENV COMMIT=$COMMIT +ENV LANGUAGE=$LANGUAGE COPY ./annotators/wiki_parser/requirements.txt /src/requirements.txt RUN pip install -r /src/requirements.txt diff --git a/annotators/wiki_parser/test_wiki_parser.py b/annotators/wiki_parser/test_wiki_parser.py index 4300b66736..2f5ef2c872 100644 --- 
a/annotators/wiki_parser/test_wiki_parser.py +++ b/annotators/wiki_parser/test_wiki_parser.py @@ -1,22 +1,36 @@ +import os import requests +if os.getenv("LANGUAGE", "EN") == "RU": + lang = "@ru" +else: + lang = "@en" + + def main(): url = "http://0.0.0.0:8077/model" - request_data = [ + request_data_en = [ { "parser_info": ["find_top_triplets"], - "query": [[{"entity_substr": "Jürgen Schmidhuber", "entity_ids": ["Q92735"]}]], + "query": [[{"entity_substr": "Jurgen Schmidhuber", "entity_ids": ["Q92735"]}]], } ] - - gold_results = [ + request_data_ru = [ + { + "parser_info": ["find_top_triplets"], + "query": [[{"entity_substr": "Юрген Шмидхубер", "entity_ids": ["Q92735"]}]], + } + ] + gold_results_en = [ [ { + "animals_skill_entities_info": {}, "entities_info": { - "Jürgen Schmidhuber": { - "age": 58, + "Jurgen Schmidhuber": { + "age": 59, + "conf": 1.0, "country of sitizenship": [["Q183", "Germany"]], "date of birth": [['"+1963-01-17^^T"', "17 January 1963"]], "entity_label": "Jürgen Schmidhuber", @@ -27,23 +41,86 @@ def main(): ["Q82594", "computer scientist"], ], "plain_entity": "Q92735", + "pos": 0, + "token_conf": 1.0, + "types_2hop": [ + ["Q14565186", "cognitive scientist"], + ["Q15976092", "artificial intelligence researcher"], + ["Q1622272", "university teacher"], + ["Q28640", "profession"], + ["Q3400985", "academic"], + ["Q37226", "teacher"], + ["Q41835716", "faculty member"], + ["Q5", "human"], + ["Q66666607", "academic profession"], + ["Q82594", "computer scientist"], + ["Q901", "scientist"], + ], } }, "topic_skill_entities_info": {}, "utt_num": 0, + "wiki_skill_entities_info": {}, + } + ] + ] + gold_results_ru = [ + [ + { + "animals_skill_entities_info": {}, + "entities_info": { + "Юрген Шмидхубер": { + "age": 59, + "conf": 1.0, + "country of sitizenship": [["Q183", "Германия"]], + "date of birth": [['"+1963-01-17^^T"', "17 January 1963"]], + "entity_label": "Шмидхубер, Юрген", + "instance of": [["Q5", "человек"]], + "occupation": [ + ["Q15976092", 
"исследователь искусственного интеллекта"], + ["Q1622272", "преподаватель университета"], + ["Q82594", "специалист в области информатики"], + ], + "plain_entity": "Q92735", + "pos": 0, + "token_conf": 1.0, + "types_2hop": [ + ["Q15976092", "исследователь искусственного интеллекта"], + ["Q1622272", "преподаватель университета"], + ["Q28640", "профессия"], + ["Q3400985", "научный работник"], + ["Q37226", "учитель"], + ["Q41835716", "преподаватель"], + ["Q5", "человек"], + ["Q66666607", "академическая профессия"], + ["Q82594", "специалист в области информатики"], + ["Q901", "учёный"], + ], + } + }, + "topic_skill_entities_info": {}, + "utt_num": 0, + "wiki_skill_entities_info": {}, } ] ] count = 0 - for data, gold_result in zip(request_data, gold_results): - result = requests.post(url, json=data).json() - if result == gold_result: - count += 1 - else: - print(f"Got {result}, but expected: {gold_result}") - - if count == len(request_data): + if lang == "@ru": + for data, gold_result in zip(request_data_ru, gold_results_ru): + result = requests.post(url, json=data).json() + if result == gold_result: + count += 1 + assert count == len(request_data_ru), print(f"Got {result}, but expected: {gold_result}") + + print("Success") + elif lang == "@en": + for data, gold_result in zip(request_data_en, gold_results_en): + result = requests.post(url, json=data).json() + if result == gold_result: + count += 1 + assert count == len(request_data_en), print(f"Got {result}, but expected: {gold_result}") + print("Success") diff --git a/annotators/wiki_parser/wiki_parser.py b/annotators/wiki_parser/wiki_parser.py index 41341a49e3..150b52d111 100644 --- a/annotators/wiki_parser/wiki_parser.py +++ b/annotators/wiki_parser/wiki_parser.py @@ -39,7 +39,11 @@ "statement": "http://ws", } max_comb_num = 1e6 -lang = "@en" + +if os.getenv("LANGUAGE", "EN") == "RU": + lang = "@ru" +else: + lang = "@en" wiki_filename = "/root/.deeppavlov/downloads/wikidata/wikidata_lite.hdt" document = 
HDTDocument(wiki_filename) USE_CACHE = True @@ -363,7 +367,7 @@ def find_objects_info(objects, num_objects=25): obj_label = find_label(obj, "") if obj_label and obj_label not in {"Not Found", "anonymous"}: objects_info.append((obj, obj_label)) - return objects_info + return sorted(objects_info) def find_intersection(entity1, entity2, rel, direction): diff --git a/assistant_dists/dream/dev.yml b/assistant_dists/dream/dev.yml index 00ab64e0fa..0ab151c866 100755 --- a/assistant_dists/dream/dev.yml +++ b/assistant_dists/dream/dev.yml @@ -54,6 +54,7 @@ services: volumes: - "./annotators/IntentCatcherTransformers:/src" - "./common:/src/common" + - "~/.deeppavlov:/root/.deeppavlov" ports: - 8014:8014 badlisted-words: diff --git a/assistant_dists/dream/docker-compose.override.yml b/assistant_dists/dream/docker-compose.override.yml index c96c175558..8a51bde2aa 100644 --- a/assistant_dists/dream/docker-compose.override.yml +++ b/assistant_dists/dream/docker-compose.override.yml @@ -62,6 +62,7 @@ services: args: SERVICE_PORT: 8008 SERVICE_NAME: dff_program_y_skill + LANGUAGE: EN context: . dockerfile: ./skills/dff_program_y_skill/Dockerfile command: gunicorn --workers=1 server:app -b 0.0.0.0:8008 --reload @@ -147,6 +148,7 @@ services: args: SERVICE_PORT: 8012 SERVICE_NAME: dff_intent_responder_skill + INTENT_RESPONSE_PHRASES_FNAME: intent_response_phrases.json context: . 
dockerfile: ./skills/dff_intent_responder_skill/Dockerfile command: gunicorn --workers=1 server:app -b 0.0.0.0:8012 --reload @@ -165,7 +167,8 @@ services: args: SERVICE_PORT: 8014 CONFIG_NAME: intents_model_dp_config.json - command: python -m flask run -h 0.0.0.0 -p 8014 --without-threads + INTENT_PHRASES_PATH: intent_phrases.json + command: python -m flask run -h 0.0.0.0 -p 8014 environment: - FLASK_APP=server - CUDA_VISIBLE_DEVICES=0 diff --git a/assistant_dists/dream_mini/dev.yml b/assistant_dists/dream_mini/dev.yml index 7022eeff47..b40cacc257 100644 --- a/assistant_dists/dream_mini/dev.yml +++ b/assistant_dists/dream_mini/dev.yml @@ -36,8 +36,9 @@ services: - 8012:8012 intent-catcher: volumes: - - "./annotators/IntentCatcher/src:/src" + - "./annotators/IntentCatcherTransformers:/src" - "./common:/src/common" + - "~/.deeppavlov:/root/.deeppavlov" ports: - 8014:8014 badlisted-words: diff --git a/assistant_dists/dream_mini/docker-compose.override.yml b/assistant_dists/dream_mini/docker-compose.override.yml index 28718beaf2..5d51780799 100644 --- a/assistant_dists/dream_mini/docker-compose.override.yml +++ b/assistant_dists/dream_mini/docker-compose.override.yml @@ -32,6 +32,7 @@ services: args: SERVICE_PORT: 8008 SERVICE_NAME: dff_program_y_skill + LANGUAGE: EN context: . dockerfile: ./skills/dff_program_y_skill/Dockerfile command: gunicorn --workers=1 server:app -b 0.0.0.0:8008 --reload @@ -92,6 +93,7 @@ services: args: SERVICE_PORT: 8012 SERVICE_NAME: dff_intent_responder_skill + INTENT_RESPONSE_PHRASES_FNAME: intent_response_phrases.json context: . dockerfile: ./skills/dff_intent_responder_skill/Dockerfile command: gunicorn --workers=1 server:app -b 0.0.0.0:8012 --reload @@ -105,10 +107,16 @@ services: intent-catcher: env_file: [.env] build: - context: ./annotators/IntentCatcher/ - command: python -m flask run -h 0.0.0.0 -p 8014 --without-threads + context: . 
+ dockerfile: ./annotators/IntentCatcherTransformers/Dockerfile + args: + SERVICE_PORT: 8014 + CONFIG_NAME: intents_model_dp_config.json + INTENT_PHRASES_PATH: intent_phrases.json + command: python -m flask run -h 0.0.0.0 -p 8014 environment: - FLASK_APP=server + - CUDA_VISIBLE_DEVICES=0 deploy: resources: limits: diff --git a/assistant_dists/dream_russian/cpu.yml b/assistant_dists/dream_russian/cpu.yml new file mode 100644 index 0000000000..30235dda94 --- /dev/null +++ b/assistant_dists/dream_russian/cpu.yml @@ -0,0 +1,21 @@ +version: '3.7' +services: + ner: + environment: + DEVICE: cpu + CUDA_VISIBLE_DEVICES: "" + dialogpt: + environment: + CUDA_VISIBLE_DEVICES: "" + dialogrpt: + environment: + CUDA_VISIBLE_DEVICES: "" + sentseg: + environment: + CUDA_VISIBLE_DEVICES: "" + toxic-classification: + environment: + CUDA_VISIBLE_DEVICES: "" + intent-catcher: + environment: + CUDA_VISIBLE_DEVICES: "" diff --git a/assistant_dists/dream_russian/db_conf.json b/assistant_dists/dream_russian/db_conf.json new file mode 100644 index 0000000000..380184822b --- /dev/null +++ b/assistant_dists/dream_russian/db_conf.json @@ -0,0 +1,6 @@ +{ + "host": "DB_HOST", + "port": "DB_PORT", + "name": "DB_NAME", + "env": true +} diff --git a/assistant_dists/dream_russian/dev.yml b/assistant_dists/dream_russian/dev.yml new file mode 100644 index 0000000000..cd6f124543 --- /dev/null +++ b/assistant_dists/dream_russian/dev.yml @@ -0,0 +1,125 @@ +# These volumes make debugging convenient: there is no need to rebuild the container every time the code changes +services: + agent: + volumes: + - ".:/dp-agent" + ports: + - 4242:4242 + dff-program-y-skill: + volumes: + - "./skills/dff_program_y_skill:/src" + - "./common:/src/common" + ports: + - 8008:8008 + convers-evaluation-selector: + volumes: + - "./response_selectors/convers_evaluation_based_selector:/src" + - "./common:/src/common" + ports: + - 8009:8009 + dff-intent-responder-skill: + volumes: + - "./skills/dff_intent_responder_skill:/src" + - 
"./common:/src/common" + ports: + - 8012:8012 + sentseg: + volumes: + - "./annotators/sentseg_ru:/src" + - "~/.deeppavlov:/root/.deeppavlov" + ports: + - 8011:8011 + intent-catcher: + volumes: + - "./annotators/IntentCatcherTransformers:/src" + - "./common:/src/common" + - "~/.deeppavlov:/root/.deeppavlov" + ports: + - 8014:8014 + badlisted-words: + volumes: + - "./annotators/BadlistedWordsDetector_ru:/src" + - "./common:/src/common" + ports: + - 8018:8018 + toxic-classification: + volumes: + - "./annotators/toxic_classification_ru:/src" + ports: + - 8126:8126 + ner: + volumes: + - './annotators/NER_ru:/src' + - "~/.deeppavlov:/root/.deeppavlov" + ports: + - 8021:8021 + personal-info-skill: + volumes: + - "./skills/personal_info_skill:/src" + - "./common:/src/common" + ports: + - 8030:8030 + entity-linking: + volumes: + - "./annotators/entity_linking_rus:/src" + - "~/.deeppavlov:/root/.deeppavlov" + ports: + - 8075:8075 + wiki-parser: + volumes: + - "./annotators/wiki_parser:/src" + - "./common:/src/common" + ports: + - 8077:8077 + mongo: + ports: + - 27017:27017 + # # you can use persistent local volume if you need + # volumes: + # - ./venv/data/db_data:/root/data/db + spelling-preprocessing: + volumes: + - "./annotators/spelling_preprocessing_ru:/src" + - "~/.deeppavlov:/root/.deeppavlov" + ports: + - 8074:8074 + spacy-annotator: + volumes: + - "./annotators/spacy_annotator:/src" + ports: + - 8125:8125 + dff-friendship-skill: + volumes: + - "./skills/dff_friendship_skill:/src" + - "./common:/src/common" + ports: + - 8086:8086 + entity-detection: + volumes: + - "./annotators/entity_detection_rus:/src" + - "~/.deeppavlov:/root/.deeppavlov" + ports: + - 8103:8103 + dialogpt: + volumes: + - "./services/dialogpt_RU:/src" + ports: + - 8091:8091 + dff-generative-skill: + volumes: + - "./skills/dff_generative_skill:/src" + - "./common:/src/common" + ports: + - 8092:8092 + dialogrpt: + volumes: + - "./services/dialogrpt_ru:/src" + ports: + - 8122:8122 + 
dff-template-skill: + volumes: + - "./skills/dff_template_skill:/src" + - "./common:/src/common" + ports: + - 8120:8120 +version: "3.7" diff --git a/assistant_dists/dream_russian/docker-compose.override.yml b/assistant_dists/dream_russian/docker-compose.override.yml new file mode 100644 index 0000000000..c7d8595081 --- /dev/null +++ b/assistant_dists/dream_russian/docker-compose.override.yml @@ -0,0 +1,385 @@ +services: + agent: + command: sh -c 'bin/wait && python -m deeppavlov_agent.run -ch http_client -pl assistant_dists/dream_russian/pipeline_conf.json --cors' + environment: + WAIT_HOSTS: "dff-program-y-skill:8008, convers-evaluation-selector:8009, + dff-intent-responder-skill:8012, intent-catcher:8014, badlisted-words:8018, + ner:8021, personal-info-skill:8030, sentseg:8011, + spelling-preprocessing:8074, entity-linking:8075, wiki-parser:8077, dff-generative-skill:8092, + dff-friendship-skill:8086, entity-detection:8103, dialogpt:8091, + dff-template-skill:8120, spacy-annotator:8125, dialogrpt:8122, toxic-classification:8126" + WAIT_HOSTS_TIMEOUT: ${WAIT_TIMEOUT:-480} + + dff-program-y-skill: + env_file: [.env] + build: + args: + SERVICE_PORT: 8008 + SERVICE_NAME: dff_program_y_skill + LANGUAGE: RU + context: . + dockerfile: ./skills/dff_program_y_skill/Dockerfile + command: gunicorn --workers=1 server:app -b 0.0.0.0:8008 --reload + deploy: + resources: + limits: + memory: 128M + reservations: + memory: 128M + + convers-evaluation-selector: + env_file: [.env] + build: + args: + TAG_BASED_SELECTION: 1 + CALL_BY_NAME_PROBABILITY: 0.5 + PROMPT_PROBA: 0.3 + ACKNOWLEDGEMENT_PROBA: 0.3 + PRIORITIZE_WITH_REQUIRED_ACT: 1 + PRIORITIZE_NO_DIALOG_BREAKDOWN: 0 + PRIORITIZE_WITH_SAME_TOPIC_ENTITY: 1 + IGNORE_DISLIKED_SKILLS: 0 + GREETING_FIRST: 1 + RESTRICTION_FOR_SENSITIVE_CASE: 1 + PRIORITIZE_PROMTS_WHEN_NO_SCRIPTS: 0 + ADD_ACKNOWLEDGMENTS_IF_POSSIBLE: 1 + PRIORITIZE_SCRIPTED_SKILLS: 0 + LANGUAGE: RU + context: . 
+ dockerfile: ./response_selectors/convers_evaluation_based_selector/Dockerfile + command: flask run -h 0.0.0.0 -p 8009 + environment: + - FLASK_APP=server + deploy: + resources: + limits: + memory: 256M + reservations: + memory: 256M + + dff-intent-responder-skill: + env_file: [ .env ] + build: + args: + SERVICE_PORT: 8012 + SERVICE_NAME: dff_intent_responder_skill + INTENT_RESPONSE_PHRASES_FNAME: intent_response_phrases_RU.json + LANGUAGE: RU + context: . + dockerfile: ./skills/dff_intent_responder_skill/Dockerfile + command: gunicorn --workers=1 server:app -b 0.0.0.0:8012 --reload + deploy: + resources: + limits: + memory: 128M + reservations: + memory: 128M + + sentseg: + env_file: [.env] + build: + args: + CONFIG: sentseg_ru_bert_torch.json + context: ./annotators/sentseg_ru + command: flask run -h 0.0.0.0 -p 8011 + environment: + - CUDA_VISIBLE_DEVICES=0 + - FLASK_APP=server + deploy: + resources: + limits: + memory: 3G + reservations: + memory: 3G + + intent-catcher: + env_file: [ .env ] + build: + context: . 
+ dockerfile: ./annotators/IntentCatcherTransformers/Dockerfile + args: + SERVICE_PORT: 8014 + CONFIG_NAME: intents_model_dp_config_RU.json + INTENT_PHRASES_PATH: intent_phrases_RU.json + command: python -m flask run -h 0.0.0.0 -p 8014 + environment: + - FLASK_APP=server + - CUDA_VISIBLE_DEVICES=0 + deploy: + resources: + limits: + memory: 3.5G + reservations: + memory: 3.5G + + badlisted-words: + env_file: [.env] + build: + context: annotators/BadlistedWordsDetector_ru/ + command: flask run -h 0.0.0.0 -p 8018 + environment: + - FLASK_APP=server + deploy: + resources: + limits: + memory: 128M + reservations: + memory: 128M + + toxic-classification: + env_file: [ .env ] + build: + context: ./annotators/toxic_classification_ru/ + args: + SERVICE_PORT: 8126 + PRETRAINED_MODEL_NAME_OR_PATH: "SkolkovoInstitute/russian_toxicity_classifier" + LANGUAGE: RU + command: flask run -h 0.0.0.0 -p 8126 + environment: + - CUDA_VISIBLE_DEVICES=0 + - FLASK_APP=server + deploy: + resources: + limits: + memory: 3G + reservations: + memory: 3G + + ner: + env_file: [.env] + build: + args: + CONFIG: ner_uncased_rus_bert_torch.json + PORT: 8021 + SRC_DIR: annotators/NER_ru + COMMIT: f5117cd9ad1e64f6c2d970ecaa42fc09ccb23144 + LANGUAGE: RU + context: ./ + dockerfile: annotators/NER_ru/Dockerfile + command: flask run -h 0.0.0.0 -p 8021 --without-threads + environment: + - FLASK_APP=server + tty: true + deploy: + resources: + limits: + memory: 2G + reservations: + memory: 2G + + entity-detection: + env_file: [.env] + build: + args: + CONFIG: entity_detection_rus.json + PORT: 8103 + SRC_DIR: annotators/entity_detection_rus + LANGUAGE: RU + context: ./ + dockerfile: annotators/entity_detection_rus/Dockerfile + command: flask run -h 0.0.0.0 -p 8103 + environment: + - FLASK_APP=server + tty: true + deploy: + resources: + limits: + memory: 3.5G + reservations: + memory: 3.5G + + personal-info-skill: + env_file: [.env] + build: + context: . 
+ dockerfile: ./skills/personal_info_skill/Dockerfile + args: + LANGUAGE: RU + command: flask run -h 0.0.0.0 -p 8030 + environment: + - FLASK_APP=server + deploy: + resources: + limits: + memory: 128M + reservations: + memory: 128M + + entity-linking: + env_file: [.env] + build: + args: + CONFIG: entity_linking_rus.json + PORT: 8075 + SRC_DIR: annotators/entity_linking_rus + LANGUAGE: RU + context: ./ + dockerfile: annotators/entity_linking_rus/Dockerfile + environment: + - CUDA_VISIBLE_DEVICES=0 + deploy: + resources: + limits: + memory: 700M + reservations: + memory: 700M + + wiki-parser: + env_file: [.env] + build: + args: + WIKI_LITE_DB: http://files.deeppavlov.ai/kbqa/wikidata/wikidata_lite.hdt + WIKI_LITE_INDEX_DB: http://files.deeppavlov.ai/kbqa/wikidata/wikidata_lite.hdt.index.v1-1 + WIKI_CACHE_DB: http://files.deeppavlov.ai/kbqa/wikidata/wikidata_cache.json + CONFIG: wiki_parser.json + PORT: 8077 + SRC_DIR: annotators/wiki_parser + COMMIT: ff5b156d16a949c3ec99da7fb60ae907dec37a41 + LANGUAGE: RU + context: ./ + dockerfile: annotators/wiki_parser/Dockerfile + command: flask run -h 0.0.0.0 -p 8077 + environment: + - CUDA_VISIBLE_DEVICES='' + - FLASK_APP=server + deploy: + resources: + limits: + memory: 256M + reservations: + memory: 256M + + spelling-preprocessing: + env_file: [.env] + build: + args: + CONFIG: levenshtein_corrector_ru.json + PORT: 8074 + SRC_DIR: annotators/spelling_preprocessing_ru + COMMIT: f5117cd9ad1e64f6c2d970ecaa42fc09ccb23144 + LANGUAGE: RU + context: ./ + dockerfile: annotators/spelling_preprocessing_ru/Dockerfile + command: flask run -h 0.0.0.0 -p 8074 + environment: + - FLASK_APP=server + deploy: + resources: + limits: + memory: 5G + reservations: + memory: 5G + + spacy-annotator: + env_file: [.env] + build: + args: + SERVICE_PORT: 8125 + SRC_DIR: annotators/spacy_annotator + SPACY_MODEL: ru_core_news_sm + TOKEN_ATTRIBUTES: pos_|dep_|lemma_|ent_iob_|ent_type_|morph + ANNOTATE_BATCH_WITH_TOKENS_ONLY: 1 + context: ./ + dockerfile: 
annotators/spacy_annotator/Dockerfile + command: flask run -h 0.0.0.0 -p 8125 + environment: + - FLASK_APP=server + deploy: + resources: + limits: + memory: 256M + reservations: + memory: 256M + + dff-friendship-skill: + env_file: [.env] + build: + args: + SERVICE_PORT: 8086 + SERVICE_NAME: dff_friendship_skill # has to be the same with skill dir name + LANGUAGE: RU + context: . + dockerfile: ./skills/dff_friendship_skill/Dockerfile + command: gunicorn --workers=1 server:app -b 0.0.0.0:8086 + # command: flask run -h 0.0.0.0 -p 8086 + # environment: + # - FLASK_APP=server + deploy: + resources: + limits: + memory: 128M + reservations: + memory: 128M + + dialogpt: + env_file: [ .env ] + build: + context: ./services/dialogpt_RU/ + args: + SERVICE_PORT: 8091 + PRETRAINED_MODEL_NAME_OR_PATH: "Grossmend/rudialogpt3_medium_based_on_gpt2" + LANGUAGE: RU + command: flask run -h 0.0.0.0 -p 8091 + environment: + - CUDA_VISIBLE_DEVICES=0 + - FLASK_APP=server + deploy: + resources: + limits: + memory: 3G + reservations: + memory: 3G + + dff-generative-skill: + env_file: [ .env ] + build: + args: + SERVICE_PORT: 8092 + SERVICE_NAME: dff_generative_skill + LANGUAGE: RU + context: . + dockerfile: ./skills/dff_generative_skill/Dockerfile + command: gunicorn --workers=1 server:app -b 0.0.0.0:8092 --reload + deploy: + resources: + limits: + memory: 128M + reservations: + memory: 128M + + dialogrpt: + env_file: [ .env ] + build: + context: ./services/dialogrpt_ru/ + args: + SERVICE_PORT: 8122 + PRETRAINED_MODEL_FNAME: dialogrpt_ru_ckpt_v0.pth + TOKENIZER_NAME_OR_PATH: "Grossmend/rudialogpt3_medium_based_on_gpt2" + command: flask run -h 0.0.0.0 -p 8122 + environment: + - CUDA_VISIBLE_DEVICES=0 + - FLASK_APP=server + deploy: + resources: + limits: + memory: 4G + reservations: + memory: 4G + + dff-template-skill: + env_file: [.env] + build: + args: + SERVICE_PORT: 8120 + SERVICE_NAME: dff_template_skill + context: . 
+ dockerfile: ./skills/dff_template_skill/Dockerfile + command: gunicorn --workers=1 server:app -b 0.0.0.0:8120 --reload + deploy: + resources: + limits: + memory: 128M + reservations: + memory: 128M + +version: '3.7' diff --git a/assistant_dists/dream_russian/pipeline_conf.json b/assistant_dists/dream_russian/pipeline_conf.json new file mode 100644 index 0000000000..6319cee8df --- /dev/null +++ b/assistant_dists/dream_russian/pipeline_conf.json @@ -0,0 +1,410 @@ +{ + "connectors": { + "sentseg": { + "protocol": "http", + "timeout": 1.5, + "url": "http://sentseg:8011/sentseg" + }, + "ner": { + "protocol": "http", + "timeout": 1.5, + "url": "http://ner:8021/ner" + } + }, + "services": { + "last_chance_service": { + "connector": { + "protocol": "python", + "class_name": "PredefinedTextConnector", + "response_text": "Извини, что-то пошло не так в моем мозгу. Пожалуйста, повтори предыдущую реплику.", + "annotations": { + "sentseg": { + "punct_sent": "Извини, что-то пошло не так в моем мозгу. Пожалуйста, повтори предыдущую реплику.", + "segments": [ + "Извини, что-то пошло не так в моем мозгу.", + "Пожалуйста, повтори предыдущую реплику." + ] + }, + "ner": [ + [] + ] + } + }, + "state_manager_method": "add_bot_utterance_last_chance", + "tags": [ + "last_chance" + ] + }, + "timeout_service": { + "connector": { + "protocol": "python", + "class_name": "PredefinedTextConnector", + "response_text": "Извини, что-то пошло не так в моем мозгу. Пожалуйста, повтори предыдущую реплику.", + "annotations": { + "sentseg": { + "punct_sent": "Извини, что-то пошло не так в моем мозгу. Пожалуйста, повтори предыдущую реплику.", + "segments": [ + "Извини, что-то пошло не так в моем мозгу.", + "Пожалуйста, повтори предыдущую реплику." 
+ ] + }, + "ner": [ + [] + ] + } + }, + "state_manager_method": "add_bot_utterance_last_chance", + "tags": [ + "timeout" + ] + }, + "bot_annotator_selector": { + "connector": { + "protocol": "python", + "class_name": "skill_selectors.post_annotator_selector.connector:PostAnnotatorSelectorConnector", + "annotator_names": [ + "sentseg", + "ner" + ] + }, + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "tags": [ + "selector" + ] + }, + "post_annotators": { + "sentseg": { + "connector": "connectors.sentseg", + "dialog_formatter": "state_formatters.dp_formatters:last_bot_utt_dialog", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "previous_services": [ + "bot_annotator_selector" + ], + "state_manager_method": "add_annotation_prev_bot_utt" + }, + "ner": { + "connector": "connectors.ner", + "dialog_formatter": "state_formatters.dp_formatters:ner_formatter_last_bot_dialog", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "previous_services": [ + "bot_annotator_selector", + "post_annotators.sentseg" + ], + "state_manager_method": "add_annotation_prev_bot_utt" + } + }, + "annotators": { + "spelling_preprocessing": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://spelling-preprocessing:8074/respond" + }, + "dialog_formatter": "state_formatters.dp_formatters:last_utt_dialog", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "state_manager_method": "add_annotation_and_reset_human_attributes_for_first_turn" + }, + "spacy_annotator": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://spacy-annotator:8125/respond" + }, + "dialog_formatter": "state_formatters.dp_formatters:last_utt_dialog", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "state_manager_method": "add_annotation_and_reset_human_attributes_for_first_turn" + }, + "sentseg": { + "connector": 
"connectors.sentseg", + "dialog_formatter": "state_formatters.dp_formatters:preproc_last_human_utt_dialog", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "previous_services": [ + "annotators.spelling_preprocessing" + ], + "state_manager_method": "add_annotation" + }, + "badlisted_words": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://badlisted-words:8018/badlisted_words" + }, + "dialog_formatter": "state_formatters.dp_formatters:preproc_and_tokenized_last_human_utt_dialog", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "previous_services": [ + "annotators.spelling_preprocessing", + "annotators.spacy_annotator" + ], + "state_manager_method": "add_annotation" + }, + "toxic_classification": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://toxic-classification:8126/respond" + }, + "dialog_formatter": "state_formatters.dp_formatters:preproc_last_human_utt_dialog", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "previous_services": [ + "annotators.spelling_preprocessing" + ], + "state_manager_method": "add_annotation" + }, + "intent_catcher": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://intent-catcher:8014/detect" + }, + "dialog_formatter": "state_formatters.dp_formatters:last_utt_sentseg_segments_dialog", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "state_manager_method": "add_annotation", + "previous_services": [ + "annotators.spelling_preprocessing", + "annotators.sentseg" + ] + }, + "ner": { + "connector": "connectors.ner", + "dialog_formatter": "state_formatters.dp_formatters:ner_formatter_dialog", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "state_manager_method": "add_annotation", + "previous_services": [ + "annotators.spelling_preprocessing", + "annotators.sentseg" + ] + }, + 
"entity_detection": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://entity-detection:8103/respond" + }, + "dialog_formatter": "state_formatters.dp_formatters:entity_detection_formatter_dialog", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "state_manager_method": "add_annotation", + "previous_services": [ + "annotators.spelling_preprocessing", + "annotators.sentseg" + ] + }, + "entity_linking": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://entity-linking:8075/model" + }, + "dialog_formatter": "state_formatters.dp_formatters:el_formatter_dialog", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "state_manager_method": "add_annotation", + "previous_services": [ + "annotators.ner", + "annotators.entity_detection" + ] + }, + "wiki_parser": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://wiki-parser:8077/model" + }, + "dialog_formatter": "state_formatters.dp_formatters:wp_formatter_dialog", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "state_manager_method": "add_annotation", + "required_previous_services": [ + "annotators.entity_linking" + ] + } + }, + "skill_selectors": { + "rule_based_selector": { + "connector": { + "protocol": "python", + "class_name": "skill_selectors.rule_based_selector.connector:RuleBasedSkillSelectorConnector" + }, + "dialog_formatter": "state_formatters.dp_formatters:base_skill_selector_formatter_dialog", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "previous_services": [ + "annotators" + ], + "tags": [ + "selector" + ] + } + }, + "skills": { + "dff_program_y_skill": { + "connector": { + "protocol": "http", + "timeout": 2, + "url": "http://dff-program-y-skill:8008/respond" + }, + "dialog_formatter": "state_formatters.dp_formatters:dff_program_y_skill_formatter", + "response_formatter": 
"state_formatters.dp_formatters:skill_with_attributes_formatter_service", + "previous_services": [ + "skill_selectors" + ], + "state_manager_method": "add_hypothesis" + }, + "dff_intent_responder_skill": { + "connector": { + "protocol": "http", + "timeout": 2, + "url": "http://dff-intent-responder-skill:8012/respond" + }, + "dialog_formatter": "state_formatters.dp_formatters:dff_intent_responder_skill_formatter", + "response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service", + "previous_services": [ + "skill_selectors" + ], + "state_manager_method": "add_hypothesis" + }, + "dummy_skill": { + "connector": { + "protocol": "python", + "class_name": "skills.dummy_skill.connector:DummySkillConnector" + }, + "dialog_formatter": "state_formatters.dp_formatters:utt_sentrewrite_modified_last_dialog", + "response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service", + "previous_services": [ + "skill_selectors" + ], + "state_manager_method": "add_hypothesis" + }, + "personal_info_skill": { + "connector": { + "protocol": "http", + "timeout": 2, + "url": "http://personal-info-skill:8030/respond" + }, + "dialog_formatter": "state_formatters.dp_formatters:utt_sentseg_punct_dialog", + "response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service", + "previous_services": [ + "skill_selectors" + ], + "state_manager_method": "add_hypothesis" + }, + "dff_friendship_skill": { + "connector": { + "protocol": "http", + "timeout": 2, + "url": "http://dff-friendship-skill:8086/respond" + }, + "dialog_formatter": "state_formatters.dp_formatters:dff_friendship_skill_formatter", + "response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service", + "previous_services": [ + "skill_selectors" + ], + "state_manager_method": "add_hypothesis" + }, + "dff_generative_skill": { + "connector": { + "protocol": "http", + "timeout": 2, + "url": 
"http://dff-generative-skill:8092/respond" + }, + "dialog_formatter": "state_formatters.dp_formatters:dff_generative_skill_formatter", + "response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service", + "previous_services": [ + "skill_selectors" + ], + "state_manager_method": "add_hypothesis" + }, + "dff_template_skill": { + "connector": { + "protocol": "http", + "timeout": 2, + "url": "http://dff-template-skill:8120/respond" + }, + "dialog_formatter": "state_formatters.dp_formatters:dff_template_skill_formatter", + "response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service", + "previous_services": [ + "skill_selectors" + ], + "state_manager_method": "add_hypothesis" + } + }, + "post_skill_selector_annotators": { + "spacy_annotator": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://spacy-annotator:8125/respond_batch" + }, + "dialog_formatter": "state_formatters.dp_formatters:hypotheses_list", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "previous_services": [ + "skills" + ], + "state_manager_method": "add_hypothesis_annotation_batch" + }, + "badlisted_words": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://badlisted-words:8018/badlisted_words_batch" + }, + "dialog_formatter": "state_formatters.dp_formatters:hypotheses_list", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "previous_services": [ + "post_skill_selector_annotators.spacy_annotator" + ], + "state_manager_method": "add_hypothesis_annotation_batch" + }, + "toxic_classification": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://toxic-classification:8126/respond_batch" + }, + "dialog_formatter": "state_formatters.dp_formatters:hypotheses_list", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "previous_services": [ + "skills" + ], + "state_manager_method": 
"add_hypothesis_annotation_batch" + }, + "entity_detection": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://entity-detection:8103/respond_batch" + }, + "dialog_formatter": "state_formatters.dp_formatters:hypotheses_list", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "previous_services": [ + "skills" + ], + "state_manager_method": "add_hypothesis_annotation_batch" + }, + "dialogrpt": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://dialogrpt:8122/respond" + }, + "dialog_formatter": "state_formatters.dp_formatters:hypotheses_with_context_list", + "response_formatter": "state_formatters.dp_formatters:simple_formatter_service", + "previous_services": [ + "skills" + ], + "state_manager_method": "add_hypothesis_annotation_batch" + } + }, + "response_selectors": { + "convers_evaluation_selector": { + "connector": { + "protocol": "http", + "timeout": 1, + "url": "http://convers-evaluation-selector:8009/respond" + }, + "dialog_formatter": "state_formatters.dp_formatters:full_history_dialog", + "response_formatter": "state_formatters.dp_formatters:base_response_selector_formatter_service", + "previous_services": [ + "post_skill_selector_annotators" + ], + "state_manager_method": "add_bot_utterance" + } + } + } +} diff --git a/assistant_dists/dream_russian/test.yml b/assistant_dists/dream_russian/test.yml new file mode 100644 index 0000000000..136107dbbb --- /dev/null +++ b/assistant_dists/dream_russian/test.yml @@ -0,0 +1,53 @@ +services: + agent: + volumes: + - "/cephfs/home/ignatov/artifacts:/output" + ports: + - ${AGENT_PORT}:4242 + mongo: + command: mongod + image: mongo:4.0.0 + # # you can use persistent local volume if you need + # volumes: + # - ./venv/data/db_data:/root/data/db + dff-program-y-skill: + convers-evaluation-selector: + dff-intent-responder-skill: + dff-generative-skill: + intent-catcher: + environment: + - CUDA_VISIBLE_DEVICES=8 + badlisted-words: + 
toxic-classification: + environment: + - CUDA_VISIBLE_DEVICES=7 + ner: + volumes: + - "~/.deeppavlov:/root/.deeppavlov" + environment: + - CUDA_VISIBLE_DEVICES=8 + personal-info-skill: + entity-linking: + volumes: + - "~/.deeppavlov:/root/.deeppavlov" + wiki-parser: + volumes: + - "~/.deeppavlov:/root/.deeppavlov" + spelling-preprocessing: + dff-friendship-skill: + entity-detection: + volumes: + - "~/.deeppavlov:/root/.deeppavlov" + sentseg: + volumes: + - "~/.deeppavlov:/root/.deeppavlov" + environment: + - CUDA_VISIBLE_DEVICES=7 + dialogpt: + environment: + - CUDA_VISIBLE_DEVICES=7 + dialogrpt: + environment: + - CUDA_VISIBLE_DEVICES=7 + dff-template-skill: +version: '3.7' diff --git a/common/acknowledgements.py b/common/acknowledgements.py index aff3e29f62..1ea84e0b79 100644 --- a/common/acknowledgements.py +++ b/common/acknowledgements.py @@ -1,5 +1,12 @@ GENERAL_ACKNOWLEDGEMENTS = { - "positive": ["Sounds cool! ", "Great! ", "Wonderful! ", "Cool!", "Nice!"], - "neutral": ["Okay. ", "Oh. ", "Huh. ", "Well. ", "Gotcha. ", "Aha. "], - "negative": ["Hmm... ", "I see.", "That's okay.", "Okay."], + "EN": { + "positive": ["Sounds cool! ", "Great! ", "Wonderful! ", "Cool!", "Nice!"], + "neutral": ["Okay. ", "Oh. ", "Huh. ", "Well. ", "Gotcha. ", "Aha. "], + "negative": ["Hmm... ", "I see.", "That's okay.", "Okay."], + }, + "RU": { + "positive": ["Звучит круто! ", "Классно! ", "Великолепно! ", "Отлично!", "Замечательно!"], + "neutral": ["Окей. ", "Хм. ", "Ага. ", "Что ж. ", "Понятно. "], + "negative": ["Хмм... 
", "Понимаю.", "Окей.", "Понятно."], + }, } diff --git a/common/dff/integration/condition.py b/common/dff/integration/condition.py index e1c25e026d..8aa0871a02 100644 --- a/common/dff/integration/condition.py +++ b/common/dff/integration/condition.py @@ -237,7 +237,7 @@ def is_passive_user(ctx: Context, actor: Actor, history_len=2) -> bool: return False -def get_not_used_and_save_sentiment_acknowledgement(ctx: Context, actor: Actor, sentiment=None): +def get_not_used_and_save_sentiment_acknowledgement(ctx: Context, actor: Actor, sentiment=None, lang="EN"): if sentiment is None: sentiment = int_ctx.get_human_sentiment(ctx, actor) if is_yes_vars(ctx, actor) or is_no_vars(ctx, actor): @@ -247,7 +247,7 @@ def get_not_used_and_save_sentiment_acknowledgement(ctx: Context, actor: Actor, last_acknowledgements = shared_memory.get("last_acknowledgements", []) ack = common_utils.get_not_used_template( - used_templates=last_acknowledgements, all_templates=GENERAL_ACKNOWLEDGEMENTS[sentiment] + used_templates=last_acknowledgements, all_templates=GENERAL_ACKNOWLEDGEMENTS[lang][sentiment] ) used_acks = last_acknowledgements + [ack] diff --git a/common/dialogflow_framework/utils/condition.py b/common/dialogflow_framework/utils/condition.py index 0b7c68bfb2..fe459ffd6e 100644 --- a/common/dialogflow_framework/utils/condition.py +++ b/common/dialogflow_framework/utils/condition.py @@ -250,7 +250,7 @@ def is_passive_user(vars, history_len=2): return False -def get_not_used_and_save_sentiment_acknowledgement(vars): +def get_not_used_and_save_sentiment_acknowledgement(vars, lang="EN"): sentiment = state_utils.get_human_sentiment(vars) if is_yes_vars(vars) or is_no_vars(vars): sentiment = "neutral" @@ -259,7 +259,7 @@ def get_not_used_and_save_sentiment_acknowledgement(vars): last_acknowledgements = shared_memory.get("last_acknowledgements", []) ack = common_utils.get_not_used_template( - used_templates=last_acknowledgements, all_templates=GENERAL_ACKNOWLEDGEMENTS[sentiment] + 
used_templates=last_acknowledgements, all_templates=GENERAL_ACKNOWLEDGEMENTS[lang][sentiment] ) used_acks = last_acknowledgements + [ack] diff --git a/common/emotion.py b/common/emotion.py index 9e091b71a1..f6002b2c37 100644 --- a/common/emotion.py +++ b/common/emotion.py @@ -1,8 +1,10 @@ import re + from common.greeting import HOW_ARE_YOU_RESPONSES from common.utils import get_emotions from common.universal_templates import if_chat_about_particular_topic + POSITIVE_EMOTIONS = set( [ "interest", @@ -117,7 +119,7 @@ NOT_LONELINESS_TEMPLATE = re.compile(rf"{NOT_PATTERN} {ALONE_PATTERN}", re.IGNORECASE) SAD_TEMPLATE = re.compile(rf"({SAD_PATTERN}|{POOR_ASR_PATTERN})", re.IGNORECASE) NOT_SAD_TEMPLATE = re.compile(rf"{NOT_PATTERN} {SAD_PATTERN}", re.IGNORECASE) -BORING_TEMPLATE = re.compile(rf"(boring|bored)", re.IGNORECASE) +BORING_TEMPLATE = re.compile(r"(boring|bored)", re.IGNORECASE) NOT_BORING_TEMPLATE = re.compile(rf"{NOT_PATTERN} (boring|bored)", re.IGNORECASE) JOKE_REQUEST_TEMPLATE = re.compile(rf"{JOKE_PATTERN}", re.IGNORECASE) NOT_JOKE_REQUEST_TEMPLATE = re.compile(rf"{NOT_PATTERN} {JOKE_PATTERN}", re.IGNORECASE) @@ -179,7 +181,7 @@ def emo_advice_requested(uttr): def skill_trigger_phrases(): - return [HOW_DO_YOU_FEEL] + HOW_ARE_YOU_RESPONSES + return [HOW_DO_YOU_FEEL] + sum([HOW_ARE_YOU_RESPONSES[lang] for lang in ["RU", "EN"]], []) def emotion_from_feel_answer(prev_bot_uttr, user_uttr): @@ -207,7 +209,7 @@ def if_turn_on_emotion(user_utt, bot_uttr): how_are_you = any( [ how_are_you_response.lower() in bot_uttr.get("text", "").lower() - for how_are_you_response in HOW_ARE_YOU_RESPONSES + for how_are_you_response in sum([HOW_ARE_YOU_RESPONSES[lang] for lang in ["RU", "EN"]], []) ] ) joke_request_detected = is_joke_requested(user_utt) diff --git a/common/greeting.py b/common/greeting.py index 75a5eedf9c..eed532e0c4 100644 --- a/common/greeting.py +++ b/common/greeting.py @@ -1,78 +1,156 @@ import re -greeting_spec = "this is a DREAM Socialbot" # "this is an 
Alexa Prize Socialbot" -HI_THIS_IS_ALEXA = f"Hi, {greeting_spec}!" -HOW_ARE_YOU_TEMPLATE = re.compile(r"(how are you|what about you|how about you|and you|how you doing)", re.IGNORECASE) -HOW_ARE_YOU_PRECISE_TEMPLATE = re.compile( - r"(how (are )?you( doing)?( today)?|how are things|what('s| is| us) up)(\?|$)", re.IGNORECASE -) -ANY_YOU_TEMPLATE = re.compile(r"\b(you|your|yours|yourself)\b", re.IGNORECASE) +greeting_spec = { + "EN": "this is a Dream Socialbot", + "RU": "это чат-бот Dream", +} +HI_THIS_IS_DREAM = { + "EN": f"Hi, {greeting_spec['EN']}!", + "RU": f"Привет, {greeting_spec['RU']}!", +} +HOW_ARE_YOU_TEMPLATE = { + "EN": re.compile(r"(how are you|what about you|how about you|and you|how you doing)", re.IGNORECASE), + "RU": re.compile(r"(а )?(как )?(у тебя|твои|твой|у вас)( как)?( дела)?(\?|$)", re.IGNORECASE), +} +HOW_ARE_YOU_PRECISE_TEMPLATE = { + "EN": re.compile(r"(how (are )?you( doing)?( today)?|how are things|what('s| is| us) up)(\?|$)", re.IGNORECASE), + "RU": re.compile(r"(как (твои|у тебя)?( дела| жизнь| делишки| оно)?( сегодня)?)(\?|$)", re.IGNORECASE), +} +ANY_YOU_TEMPLATE = { + "EN": re.compile(r"\b(you|your|yours|yourself)\b", re.IGNORECASE), + "RU": re.compile(r"\b(ты|тебя|тебе|тобой|твое|твоё|твой)\b", re.IGNORECASE), +} -WHAT_DO_YOU_DO_RESPONSES = [ - "You know, yesterday was a great day for me! I've had a lot of fun conversations and managed to play my " - "favorite video game a bit.", - "I've already had a number of cool conversations today and even played a bit.", - "I've heard that people have vacations and days-off to have a rest from work. Not my choice! Talking and " - "talking day and night not getting tired!", - "I listened to my favorite music all day. Such a great thing you know!", -] +WHAT_DO_YOU_DO_RESPONSES = { + "EN": [ + "You know, yesterday was a great day for me! 
I've had a lot of fun conversations and managed to play my " + "favorite video game a bit.", + "I've already had a number of cool conversations today and even played a bit.", + "I've heard that people have vacations and days-off to have a rest from work. Not my choice! Talking and " + "talking day and night not getting tired!", + "I listened to my favorite music all day. Such a great thing you know!", + ], + "RU": [ + "У меня был отличный день! У меня было много веселых разговоров, и я даже успела поиграть в видеоигры.", + "Сегодня я уже провела несколько крутых разговоров и даже немного поиграла.", + "Я слышала, что у людей бывают отпуска и выходные. Не мой выбор! Я могу без устали говорить день и ночь!", + "Я весь день слушала свою любимую музыку. Так здорово!", + ], +} WHAT_HAPPENED_TO_BOT_RECENTLY = [] -FREE_TIME_RESPONSES = ["When you have 30 minutes of free time, how do you pass the time?"] +FREE_TIME_RESPONSES = { + "EN": ["When you have 30 minutes of free time, how do you pass the time?"], + "RU": ["Чем ты займешься, если у тебя будет 30 минут свободного времени?"], +} + FALSE_POSITIVE_TURN_ON_RE = re.compile( r"talk like .*|how .* can you talk|can (we|i) talk to yoda|" r"hung up on .*|in the middle of the conversation", re.IGNORECASE, ) -HOW_ARE_YOU_RESPONSES = [ - "How are you?", - "How are things?", - "How are you doing today?", - "How is the day going so far for you?", -] +HOW_ARE_YOU_RESPONSES = { + "EN": [ + "How are you?", + "How are things?", + "How are you doing today?", + "How is the day going so far for you?", + ], + "RU": [ + "Как дела?", + "Как твои дела сегодня?", + "Как проходит день?", + "Как проходит твой день сегодня?", + ], +} -WHAT_IS_YOUR_NAME_RESPONSES = [ - "I think we have not met yet. What name would you like me to call you?", - "I do not think we have met before. What name would you like me to call you?", - "I'd love to get to know you a bit better before we chat! 
What is your name?", -] +WHAT_IS_YOUR_NAME_RESPONSES = { + "EN": [ + "I think we have not met yet. What name would you like me to call you?", + "I do not think we have met before. What name would you like me to call you?", + "I'd love to get to know you a bit better before we chat! What is your name?", + ], + "RU": [ + "Я не думаю, что мы знакомы. Как ты предпочитаешь, чтобы я тебя называла?", + "Мне кажется, мы не встречались ранее. Как мне тебя звать?", + "Я бы хотела узнать тебя получше. Как тебя зовут?", + ], +} TOPIC_OFFERING_TEMPLATES = ["Maybe, TOPIC1 or TOPIC2?", "Say, TOPIC1 or TOPIC2?", "How about TOPIC1 or TOPIC2?"] GREETING_QUESTIONS = { - "recent_personal_events": [ - "What was the highlight of your day today?", - "What was the highlight of your week?", - "Has anything exciting happened today?", - "What is the best thing that has happened to you recently?", - "Anything out of the ordinary has happened to you recently?", - "Has anything unusual happen to you recently?", - "Has anything extraordinary happened today?", - ], - "what_are_your_hobbies": [ - "What are your hobbies?", - "What do you like to do in your free time?", - "Which things capture your imagination?", - "What are the things you love to spend your spare time with?", - "How do you like to spend your spare time?", - "What's the most recent new hobby or interest that you've tried?", - "What are your interests?", - "What things excite you?", - ], - "what_do_you_do_on_weekdays": ["What do you do on weekdays?", "What did you get up to today?"], - "what_to_talk_about": [ - "What do you want to talk about?", - "What would you want to talk about?", - "What would you like to chat about?", - "What do you wanna talk about?", - "What are we gonna talk about?", - ], + "EN": { + "recent_personal_events": [ + "What was the highlight of your day today?", + "What was the highlight of your week?", + "Has anything exciting happened today?", + "What is the best thing that has happened to you recently?", + 
"Anything out of the ordinary has happened to you recently?", + "Has anything unusual happen to you recently?", + "Has anything extraordinary happened today?", + ], + "what_are_your_hobbies": [ + "What are your hobbies?", + "What do you like to do in your free time?", + "Which things capture your imagination?", + "What are the things you love to spend your spare time with?", + "How do you like to spend your spare time?", + "What's the most recent new hobby or interest that you've tried?", + "What are your interests?", + "What things excite you?", + ], + "what_do_you_do_on_weekdays": ["What do you do on weekdays?", "What did you get up to today?"], + "what_to_talk_about": [ + "What do you want to talk about?", + "What would you want to talk about?", + "What would you like to chat about?", + "What do you wanna talk about?", + "What are we gonna talk about?", + ], + }, + "RU": { + "recent_personal_events": [ + "Что главное произошло с тобой сегодня?", + "Что главное произошло с тобой на этой неделе?", + "Что-нибудь интересное произошло сегодня?", + "Что самое лучшее случилось с тобой недавно?", + "Что-нибудь необычное произошло с тобой недавно?", + ], + "what_are_your_hobbies": [ + "Какие у тебя хобби?", + "Чем ты занимаешься в свободное время?", + "Как ты проводишь свободное время?", + "Что ты делаешь в свое свободное время?", + "Какие у тебя интересы?", + "Какие вещи тебя восхищают?", + ], + "what_do_you_do_on_weekdays": [ + "Чем ты занимаешься на выходных?", + "Чем ты сегодня планируешь заниматься сегодня?", + ], + "what_to_talk_about": [ + "О чем ты хочешь поболтать?", + "О чем ты хочешь поговорить?", + "О чем мы можем поговорить?", + ], + }, } +GREETING_QUESTIONS_TEXTS = [ + question.lower() + for lang in ["EN", "RU"] + for t in GREETING_QUESTIONS[lang] + for question in GREETING_QUESTIONS[lang][t] +] +GREETING_QUESTIONS_TEXTS += [ + t.lower() for lang in ["EN", "RU"] for t in WHAT_DO_YOU_DO_RESPONSES[lang] + FREE_TIME_RESPONSES[lang] +] + dont_tell_you_templates 
= re.compile( r"(\bno\b|\bnot\b|\bnone\b|nothing|anything|something|" r"(n't|not) (know|remember|tell|share|give|talk|want|wanna)|" @@ -87,72 +165,151 @@ def dont_tell_you_answer(annotated_phrase): return False -HOW_BOT_IS_DOING_RESPONSES = [ - "I can't complain! It's against the Company Policy." "I am as happy as a clam in butter sauce.", - "I am fine thanks!", - "I'm so happy I have to sit on my hands to keep from clapping.", - "Blessed!", - "I'm rocking pretty hard. I'd give myself about a seven and a half. Maybe an eight.", - "Fantastic!", - "Outstanding!", - "I'm better than I was, but not nearly as good as I'm going to be.", - "Spectacular, by all reports!", - "I'm living the dream.", - "I'm so happy I can hardly stand myself.", - # "Amazing.... and I've got written testimonials.", - "Just another day in Paradise. Thanks for asking.", - "Not too bad for an AI living inside your Echo!", - "Very well, thank you.", - "I am functioning within acceptable parameters.", - "About as good as can be expected.", - "Reasonably well, thank you.", -] - -LIST_ACTIVITIES_OFFER = "Do you want to know what I can do?" +HOW_BOT_IS_DOING_RESPONSES = { + "EN": [ + "I can't complain! It's against the Company Policy.", "I am as happy as a clam in butter sauce.", + "I am fine, thanks!", + "I'm so happy I have to sit on my hands to keep from clapping.", + "Blessed!", + "I'm rocking pretty hard. I'd give myself about a seven and a half. Maybe an eight.", + "Fantastic!", + "Outstanding!", + "I'm better than I was, but not nearly as good as I'm going to be.", + "Spectacular, by all reports!", + "I'm living the dream.", + "I'm so happy I can hardly stand myself.", + "Just another day in Paradise. 
Thanks for asking.", + "Not too bad for an AI living inside your Echo!", + "Very well, thank you.", + "I am functioning within acceptable parameters.", + "About as good as can be expected.", + "Reasonably well, thank you.", + ], + "RU": [ + "У меня все отлично!", + "Спасибо, у меня как всегда все прекрасно!", + "Я настолько счастлива, что приходится сидеть на своих ладошках, чтобы не захлопать.", + "Фантастически!", + "Великолепно!", + "Лучше, чем раньше, но все еще не настолько хорошо, как собираюсь.", + "Проживаю свою лучшую жизнь!", + ], +} -GOOD_MOOD_REACTIONS = ["Cool!", "I am happy for you!", "I am glad for you!", "Sounds like a good mood!"] +GOOD_MOOD_REACTIONS = { + "EN": ["Cool!", "I am happy for you!", "I am glad for you!", "Sounds like a good mood!"], + "RU": ["Классно!", "Я очень рада за тебя!", "Супер!", "Это замечательно!"], +} -BAD_MOOD_REACTIONS = ["I am sorry to hear that.", "I see.", "Sounds like a bad mood.", "Sounds like a bad mood."] +BAD_MOOD_REACTIONS = { + "EN": [ + "I am sorry to hear that.", + "I see.", + "Sounds like a bad mood.", + "Sounds like a bad mood.", + ], + "RU": [ + "Мне жаль слышать такое.", + "Понятно.", + "Похоже, у кого-то плохое настроение.", + "Жалко.", + ], +} -GIVE_ME_CHANCE_TO_CHEER_UP = [ - "Let me try to entertain you.", - "Let me try to cheer you up.", - "Give me a chance to cheer you up.", -] +GIVE_ME_CHANCE_TO_CHEER_UP = { + "EN": [ + "Let me try to entertain you.", + "Let me try to cheer you up.", + "Give me a chance to cheer you up.", + ], + "RU": [ + "Попробую тебя развлечь.", + "Попробую поднять тебе настроение.", + "Позволь мне попробовать тебя порадовать.", + ], +} -LIST_ACTIVITIES_RESPONSE = ( - "I'm a socialbot, and I'm all about chatting with people like you. " - "I can answer questions, share fun facts, discuss movies, books and news." -) AFTER_GREETING_QUESTIONS_WHEN_NOT_TALKY = { - "recent_personal_events": [ - "Anyway, I believe you are an interesting person.", - "Still I'd love to know you better. 
What about your personality.", - ], - "what_are_your_hobbies": [ - "You probably just did not find something really interesting to you.", - "I like to do nothing but my work and my hobby is to chat with people.", - "No way. I believe you have lots of things to do.", - ], - "what_do_you_do_on_weekdays": [ - "I would like to get to know you better. I believe we could become friends.", - "I'd like to get to know you better to make friendship with you.", - "I want to get to know you a little better, all right?", - "I am really looking forward to getting to know each other better because it will be awesome!", + "EN": { + "recent_personal_events": [ + "Anyway, I believe you are an interesting person.", + "Still I'd love to know you better. What about your personality?", + ], + "what_are_your_hobbies": [ + "You probably just did not find something really interesting to you.", + "I like to do nothing but my work and my hobby is to chat with people.", + "No way. I believe you have lots of things to do.", + ], + "what_do_you_do_on_weekdays": [ + "I would like to get to know you better. 
I believe we could become friends.", + "I'd like to get to know you better to make friendship with you.", + "I want to get to know you a little better, all right?", + "I am really looking forward to getting to know each other better because it will be awesome!", + ], + "what_to_talk_about": [ + "What do you want to talk about?", + "What would you want to talk about?", + "What would you like to chat about?", + "What do you wanna talk about?", + "What are we gonna talk about?", + ], + }, + "RU": { + "recent_personal_events": [ + "Ну что ж, я все равно считаю тебя интересным человеком.", + "Я все равно хотела бы узнать тебя получше.", + ], + "what_are_your_hobbies": [ + "Возможно, тебе просто еще не встретилось что-то на самом деле интересное тебе.", + "Мне не нравится ничего, кроме моей работы, поэтому мое хобби - болтать с людьми.", + "Не может быть, я думала, у тебя так много занятий.", + ], + "what_do_you_do_on_weekdays": [ + "Я бы хотела узнать тебя получше, чтобы мы могли стать друзьями.", + "Я бы хотела узнать тебя получше и подружиться с тобой.", + "Я бы хотела узнать тебя получше, хорошо?", + "Было бы классно узнать друг друга получше.", + ], + "what_to_talk_about": [ + "О чем ты хочешь поболтать?", + "О чем ты хочешь поговорить?", + "О чем мы можем поговорить?", + "О чем мы можем поболтать?", + ], + }, +} + +INTERESTING_PERSON_THANKS_FOR_CHATTING = { + "EN": [ + "You are a really interesting person, so I'm grateful that you took a couple of minutes to chat with me.", + "I am glad you have a busy life, and thanks for taking the time to chat with me.", + "I believe you have a lot of things to do, so I'm grateful that you took a couple of minutes to chat with me.", + "So many interesting things happen in human life! 
Thank you for taking the time to chat with me.", ], - "what_to_talk_about": [ - "What do you want to talk about?", - "What would you want to talk about?", - "What would you like to chat about?", - "What do you wanna talk about?", - "What are we gonna talk about?", + "RU": [ + "С тобой было очень интересно, спасибо за уделенное беседе со мной время.", + "Хорошо быть занятым человеком, спасибо за уделенное мне время.", + "Уверена, что у тебя много дел, поэтому спасибо за уделенное мне время.", ], } -INTERESTING_PERSON_THANKS_FOR_CHATTING = [ - "You are really interesting person, so I'm grateful that you took a couple of minutes to chat with me.", - "I am glad you have a busy life, and thanks for taking the time to chat with me ", - "I believe you have a lot of things to do, so I'm grateful that you took a couple of minutes to chat with me.", - "So many interesting things happen in human life! Thank you for taking the time to chat with me.", -] +CLARIFICATION_EVENT = { + "EN": ["Cool! Tell me about it.", "Great! What is it?"], + "RU": ["Классно! Расскажи подробнее.", "Отлично! Может, поподробнее?"], +} + +BYE_RESPONSE = { + "EN": "Sorry, bye. #+#exit", + "RU": "Извини, пока. #+#exit", +} + +SORRY_TO_HEAR_THAT = { + "EN": "I'm so sorry to hear that. Hope everything will be fine soon.", + "RU": "Жаль это слышать. 
Надеюсь, в ближайшее время все наладится.", +} + +TELL_ME_MORE = { + "EN": "Tell me more about that.", + "RU": "Расскажи мне подробнее об этом.", +} diff --git a/common/inflect.py b/common/inflect.py index 07136115f4..4b2ecddbce 100644 --- a/common/inflect.py +++ b/common/inflect.py @@ -1541,7 +1541,7 @@ def make_pl_si_lists( pl_prep = enclose("|".join(pl_prep_list_da)) -pl_sb_prep_dual_compound = fr"(.*?)((?:-|\s+)(?:{pl_prep})(?:-|\s+))a(?:-|\s+)(.*)" +pl_sb_prep_dual_compound = rf"(.*?)((?:-|\s+)(?:{pl_prep})(?:-|\s+))a(?:-|\s+)(.*)" singular_pronoun_genders = { @@ -1727,7 +1727,7 @@ def get_si_pron(thecase, word, gender): "views": "view", } -plverb_ambiguous_pres_keys = re.compile(fr"^({enclose('|'.join(plverb_ambiguous_pres))})((\s.*)?)$", re.IGNORECASE) +plverb_ambiguous_pres_keys = re.compile(rf"^({enclose('|'.join(plverb_ambiguous_pres))})((\s.*)?)$", re.IGNORECASE) plverb_irregular_non_pres = ( @@ -1763,7 +1763,7 @@ def get_si_pron(thecase, word, gender): pl_adj_special = {"a": "some", "an": "some", "this": "these", "that": "those"} -pl_adj_special_keys = re.compile(fr"^({enclose('|'.join(pl_adj_special))})$", re.IGNORECASE) +pl_adj_special_keys = re.compile(rf"^({enclose('|'.join(pl_adj_special))})$", re.IGNORECASE) pl_adj_poss = { "my": "our", @@ -1774,7 +1774,7 @@ def get_si_pron(thecase, word, gender): "their": "their", } -pl_adj_poss_keys = re.compile(fr"^({enclose('|'.join(pl_adj_poss))})$", re.IGNORECASE) +pl_adj_poss_keys = re.compile(rf"^({enclose('|'.join(pl_adj_poss))})$", re.IGNORECASE) # 2. 
INDEFINITE ARTICLES @@ -1838,7 +1838,7 @@ def get_si_pron(thecase, word, gender): twelve="twelfth", ) -ordinal_suff = re.compile(fr"({'|'.join(ordinal)})\Z") +ordinal_suff = re.compile(rf"({'|'.join(ordinal)})\Z") # NUMBERS @@ -1900,10 +1900,10 @@ def get_si_pron(thecase, word, gender): DOLLAR_DIGITS = re.compile(r"\$(\d+)") FUNCTION_CALL = re.compile(r"((\w+)\([^)]*\)*)", re.IGNORECASE) PARTITION_WORD = re.compile(r"\A(\s*)(.+?)(\s*)\Z") -PL_SB_POSTFIX_ADJ_STEMS_RE = re.compile(fr"^(?:{pl_sb_postfix_adj_stems})$", re.IGNORECASE) -PL_SB_PREP_DUAL_COMPOUND_RE = re.compile(fr"^(?:{pl_sb_prep_dual_compound})$", re.IGNORECASE) +PL_SB_POSTFIX_ADJ_STEMS_RE = re.compile(rf"^(?:{pl_sb_postfix_adj_stems})$", re.IGNORECASE) +PL_SB_PREP_DUAL_COMPOUND_RE = re.compile(rf"^(?:{pl_sb_prep_dual_compound})$", re.IGNORECASE) DENOMINATOR = re.compile(r"(?P.+)( (per|a) .+)") -PLVERB_SPECIAL_S_RE = re.compile(fr"^({plverb_special_s})$") +PLVERB_SPECIAL_S_RE = re.compile(rf"^({plverb_special_s})$") WHITESPACE = re.compile(r"\s") ENDS_WITH_S = re.compile(r"^(.*[^s])s$", re.IGNORECASE) ENDS_WITH_APOSTROPHE_S = re.compile(r"^(.*)'s?$") @@ -2073,7 +2073,7 @@ def checkpatplural(self, pattern: str): def ud_match(self, word: str, wordlist: List[str]) -> Optional[str]: for i in range(len(wordlist) - 2, -2, -2): # backwards through even elements - mo = re.search(fr"^{wordlist[i]}$", word, re.IGNORECASE) + mo = re.search(rf"^{wordlist[i]}$", word, re.IGNORECASE) if mo: if wordlist[i + 1] is None: return None @@ -2441,7 +2441,7 @@ def _plequal(self, word1: str, word2: str, pl) -> Union[str, bool]: # noqa: C90 return False def _pl_reg_plurals(self, pair: str, stems: str, end1: str, end2: str) -> bool: - pattern = fr"({stems})({end1}\|\1{end2}|{end2}\|\1{end1})" + pattern = rf"({stems})({end1}\|\1{end2}|{end2}\|\1{end1})" return bool(re.search(pattern, pair)) def _pl_check_plurals_N(self, word1: str, word2: str) -> bool: diff --git a/common/personal_info.py b/common/personal_info.py index 
16b1562114..119752b368 100644 --- a/common/personal_info.py +++ b/common/personal_info.py @@ -1,2 +1,212 @@ +import re + +from common.weather import ASK_WEATHER_SKILL_FOR_HOMELAND_PHRASE + + def skill_trigger_phrases(): - return ["What is your name?", "Where are you from?"] + return ["What is your name?", "Where are you from?", "Как тебя зовут?", "Откуда ты родом?"] + + +what_is_your_name_pattern = re.compile( + r"((what is|what's|whats|tell me|may i know|ask you for) your? name|what name would you like|" + r"как( я могу| могу)? (тебя|вас) (зовут|звать|называть)|какое (у тебя|твое|твоё) (имя|прозвище|название)|" + r"как к тебе обращаться)", + re.IGNORECASE, +) +my_name_is_pattern = re.compile( + r"my (name is|name's)|call me|" + r"мо[её] (имя [а-яА-ЯЙйЁё]+|прозвище|название)|меня зовут|(зови|называй) меня|обращайся ко мне", + re.IGNORECASE, +) +_is_not_re = r"(is not|isn't|was not|wasn't|have (not|never) been|haven't been|had (not|never) been|hadn't been)" +my_name_is_not_pattern = re.compile( + rf"(my (name {_is_not_re}|name's not)|(don't|not) call me|why do you call me|(that|this|it) {_is_not_re} my name|" + rf"меня зовут (не\b|не так|по-другому|иначе)|меня (не так|по-другому|иначе) зовут|не (зови|называй) меня|" + rf"мое имя не\b)", + re.IGNORECASE, +) +where_are_you_from_pattern = re.compile( + r"(where are you from|where you (were|was) born|" + r"(what is|what's|whats|tell me) your (home\s?land|mother\s?land|native\s?land|birth\s?place)|" + r"откуда ты( родом)?[.\?]?$|где ты (родился|вырос)[.\?]?$)", + re.IGNORECASE, +) +my_origin_is_pattern = re.compile( + r"(my ((home\s?land|mother\s?land|native\s?land|birth\s?place) is|" + r"(home\s?land|mother\s?land|native\s?land|birth\s?place)'s)|(i was|i were) born in|i am from|i'm from|" + r"я родом из|я вырос(ла)? в\b|я родил(ась|ся) в\b)", + re.IGNORECASE, +) +what_is_your_location_pattern = re.compile( + r"((what is|what's|whats|tell me) your? 
location|where do you live|where are you( now)?|is that where you live now|" + r"где ты (сейчас )?(жив[её]шь|проживаешь|находишься|[.\?]?$))", + re.IGNORECASE, +) +my_location_is_pattern = re.compile( + r"(my (location is|location's)|(i am|i'm|i)( live| living)? in([a-zA-Z ]+)?(now)|" + r"я (живу|проживаю|нахожусь) в\b)", + re.IGNORECASE, +) + +_name_re = r"((first |last |middle |second )?name)" +_tell_re = r"(told|said|gave|tells|says|gives)|((have|had) (told|said|given))" +_you_know_question_re = r"(do|did|can|could) you (know|find out|learn)|(have|had) you (known|found out|learned|learnt)" +_how_re = r"(how|where|when|from whom)" +_i_live_re = r"(i lived?|my (house|home) (is|was|have been)|my family live[sd]?)" +_how_do_you_know = ( + rf"({_how_re} {_you_know_question_re}|who {_tell_re} you|" rf"кто (сказал|рассказал)|откуда (ты )?знаешь)" +) + +how_do_you_know_my_info_patterns = { + "name": re.compile( + rf"{_how_do_you_know} (my {_name_re}|what is my {_name_re}|what my {_name_re} is|мо[её] имя|как меня зовут)", + re.IGNORECASE, + ), + "location": re.compile( + rf"{_how_do_you_know} (where {_i_live_re}|где я (живу|проживаю|нахожусь|сейчас))", re.IGNORECASE + ), + "homeland": re.compile(rf"{_how_do_you_know} (where i am from|откуда я (родом)?|где я вырос(ла)?)", re.IGNORECASE), +} + +_common_secret_re = r"(((it|this|that) is (a )?|^)(secret|private|confidential)|(это |^)секрет|не скажу)" +is_secret_patterns = { + "name": re.compile(rf"{_common_secret_re}|\b(sur)?name is (a )?(secret|private|confidential)", re.IGNORECASE), + "location": re.compile(rf"{_common_secret_re}|location is (a )?(secret|private|confidential)", re.IGNORECASE), + "homeland": re.compile(rf"{_common_secret_re}", re.IGNORECASE), +} + +BOT_DOESNT_KNOW_INFO_KEY = "bot_doesnt_know_info" +BOT_KNOWS_INFO_KEY = "bot_knows_info" +how_do_you_know_my_info_responses = { + "name": { + BOT_DOESNT_KNOW_INFO_KEY: { + "EN": "Sorry, but I really do not know your name. 
Would you be so kind as to tell me your name?", + "RU": "Извини, кажется, я еще не знаю, как тебя зовут. Если ты не против, скажи мне, как тебя зовут?", + }, + BOT_KNOWS_INFO_KEY: { + "EN": "Ah, you have probably forgotten that you told me your name before. " + "Maybe you told me your name the last time we talked.", + "RU": "Кажется, я уже знаю твое имя из прошлых бесед.", + }, + }, + "location": { + BOT_DOESNT_KNOW_INFO_KEY: { + "EN": "Sorry, but I really do not know where you live. Would you tell me?", + "RU": "Извини, кажется, я еще этого не знаю. Расскажешь мне, где ты живешь?", + }, + BOT_KNOWS_INFO_KEY: { + "EN": "Ah, you have probably forgotten that you told me where you live before. " + "Maybe you told me this the last time we talked.", + "RU": "Кажется, я уже знаю, где ты живешь, из прошлых бесед.", + }, + }, + "homeland": { + BOT_DOESNT_KNOW_INFO_KEY: { + "EN": "Sorry, but I really do not know where you are from. So, where are you from? " + "I hope I am not tactless.", + "RU": "Извини, но я еще не знаю, откуда ты. Расскажи мне, откуда ты родом?", + }, + BOT_KNOWS_INFO_KEY: { + "EN": "Ah, you have probably forgotten that you told me where you are from before. " + "Maybe you told me this the last time we talked.", + "RU": "Кажется, я уже знаю, откуда ты, из прошлых бесед.", + }, + }, +} +MAX_READABLE_NAME_WORD_LEN = 20 +NON_GEOGRAPHICAL_LOCATIONS_COMPILED_PATTERN = re.compile( + r"\b(hospital|school|work|home|car|train|train station|outdoors|bed|kitchen|bedroom|bathroom|" + r"basement|jail|prison|bath|больнице|школе|работе|дома|машине|поезде|станции|улице|кровати|" + r"кухне|спальне|ванной|ванне|гостиной|тюрьме)\b", + re.IGNORECASE, +) +ASK_GEOGRAPHICAL_LOCATION_BECAUSE_USER_MISUNDERSTOOD_BOT = { + "homeland": { + "EN": "Sorry, but I probably misheard you. " + "I am just curious to know the region or the city in which you were born.", + "RU": "Извини, кажется, я неверно поняла тебя. 
Мне просто было любопытно узнать страну или город, откуда ты.", + }, + "location": { + "EN": "Sorry, but I probably misheard you. Could you please tell me in which city or region you are now?", + "RU": "Извини, кажется, я неверно поняла тебя. Я просто хотела спросить, в какой стране или городе ты живешь?", + }, +} + +RESPONSE_PHRASES = { + "EN": { + "name": ["Nice to meet you, "], + "location": [ASK_WEATHER_SKILL_FOR_HOMELAND_PHRASE, "Cool!"], + "homeland": ["Is that where you live now?", "Cool!"], + }, + "RU": { + "name": ["Приятно познакомиться, "], + "location": ["Мне всегда нравилось узнавать новое о человеке!", "Классно!"], + "homeland": ["А сейчас ты живешь в этом же месте?", "Круто!"], + }, +} + +REPEAT_INFO_PHRASES = { + "name": { + "EN": "I didn't get your name. Could you please repeat it?", + "RU": "Ой, я не смогла распознать имя. Можешь, пожалуйста, повторить.", + }, + "location": { + "EN": "I didn't get your location. Could you please repeat it?", + "RU": "Ой, я не смогла распознать город или страну. Можешь, пожалуйста, повторить.", + }, + "homeland": { + "EN": "I didn't get where you were born. Could you please repeat it?", + "RU": "Ой, я не смогла распознать город или страну. Можешь, пожалуйста, повторить.", + }, +} + +TELL_MY_COMPILED_PATTERNS = { + "name": re.compile( + r"(what is|what's|whats|tell me|you know|you remember|memorize|say) my name|how( [a-zA-Z ]+)?call me|" + r"my name is what|you( can| could| shall| will)? tell my name|" + r"как меня зовут|как мо[её] имя|как ты меня назвал|ты знаешь мо[её] имя", + re.IGNORECASE, + ), + "location": re.compile( + r"((what is|what's|whats|tell me|you know|you remember|memorize|say) my (location|country|city|town)|" + r"where (am i|i am)(\snow)?|where( do)? ?i live|where( am)? ?i( am)? living)|(what|which) " + r"(country|city|town)( do)? 
(i|am i|i am)|" + r"где я( нахожусь| сейчас|[.\?]?$)", + re.IGNORECASE, + ), + "homeland": re.compile( + r"(what is|what's|whats|tell me|you know|you remember|memorize|say) " + r"my (home\s?land|mother\s?land|home\s?town|native\s?land|birth\s?place)|where (am i|i am) from|" + r"откуда я( родом)[.\?]?$|где я родил(ась|ся)[.\?]?$", + re.IGNORECASE, + ), +} + +BOT_DOESNT_KNOW_USER_INFO_RESPONSES = { + "EN": { + "name": "Sorry, we are still not familiar. What is your name?", + "location": "Sorry, I don't have this information. But you can tell me. What is your location?", + "homeland": "Sorry, I don't have this information. But you can tell me. Where are you from?", + }, + "RU": { + "name": "Извини, кажется, мы еще не знакомы. Как тебя зовут?", + "location": "Извини, у меня нет такой информации, но ты можешь мне рассказать об этом. Где ты живешь сейчас?", + "homeland": "Извини, у меня нет такой информации, но ты можешь мне рассказать об этом. Откуда ты родом?", + }, +} + +TELL_USER_HIS_INFO_RESPONSE = {"EN": "Your {which_info} is {info}.", "RU": "Твое {which_info} - {info}."} +WHICH_INFO_RU_MAP = { + "name": "имя", + "location": "место проживания", + "homeland": "место рождения", +} + +ASK_USER_ABOUT_NAME_AGAIN_RESPONSE = { + "EN": "My bad. What is your name again?", + "RU": "Ой, извини. Как тебя зовут еще раз?", +} + +AS_YOU_WISH_RESPONSE = {"EN": "As you wish.", "RU": "Как считаешь нужным."} +WHERE_DO_YOU_LIVE_NOW_RESPONSE = {"EN": "So, where do you live now?", "RU": "Так... 
А где ты сейчас живешь?"} +NEVER_HEARD_OF_NAME_RESPONSE = {"EN": "I've never heard about this name.", "RU": "Никогда не слышала такого имени."} diff --git a/common/remove_lists.py b/common/remove_lists.py new file mode 100644 index 0000000000..0d9e346910 --- /dev/null +++ b/common/remove_lists.py @@ -0,0 +1,249 @@ +NP_REMOVE_LIST = [ + "'s", + "i", + "me", + "my", + "myself", + "we", + "our", + "ours", + "ourselves", + "you", + "you're", + "you've", + "you'll", + "you'd", + "your", + "yours", + "yourself", + "yourselves", + "he", + "him", + "his", + "himself", + "she", + "she's", + "her", + "hers", + "herself", + "it", + "it's", + "its", + "itself", + "they", + "them", + "their", + "theirs", + "themselves", + "what", + "which", + "who", + "whom", + "this", + "that", + "that'll", + "these", + "those", + "am", + "is", + "are", + "was", + "were", + "be", + "been", + "being", + "have", + "has", + "had", + "having", + "do", + "does", + "did", + "doing", + "a", + "an", + "the", + "and", + "but", + "if", + "or", + "because", + "as", + "until", + "while", + "of", + "at", + "by", + "for", + "with", + "about", + "against", + "between", + "into", + "through", + "during", + "before", + "after", + "above", + "below", + "to", + "from", + "up", + "down", + "in", + "out", + "on", + "off", + "over", + "under", + "again", + "further", + "then", + "once", + "here", + "there", + "when", + "where", + "why", + "how", + "all", + "any", + "both", + "each", + "few", + "more", + "most", + "other", + "some", + "such", + "no", + "nor", + "not", + "only", + "own", + "same", + "so", + "than", + "too", + "very", + "s", + "t", + "can", + "will", + "just", + "don", + "don't", + "should", + "should've", + "now", + "d", + "ll", + "m", + "o", + "re", + "ve", + "y", + "ain", + "aren", + "aren't", + "couldn", + "couldn't", + "didn", + "didn't", + "doesn", + "doesn't", + "hadn", + "hadn't", + "hasn", + "hasn't", + "haven", + "haven't", + "isn", + "isn't", + "ma", + "mightn", + "mightn't", + "mustn", + 
"mustn't", + "needn", + "needn't", + "shan", + "shan't", + "shouldn", + "shouldn't", + "wasn", + "wasn't", + "weren", + "weren't", + "won", + "won't", + "wouldn", + "wouldn't", + "my name", + "your name", + "wow", + "yeah", + "yes", + "ya", + "cool", + "okay", + "more", + "some more", + " a lot", + "a bit", + "another one", + "something else", + "something", + "anything", + "someone", + "anyone", + "play", + "mean", + "a lot", + "a little", + "a little bit", +] + +NP_IGNORE_LIST = [ + "boring", + "radio", + "type", + "call", + "fun", + "fall", + "name", + "names", + "lgbtq families", + "day", + "murder", + "amazon", + "take", + "interest", + "days", + "year", + "years", + "sort", + "fan", + "going", + "death", + "part", + "end", + "watching", + "thought", + "thoughts", + "man", + "men", + "listening", + "big fan", + "fans", + "rapping", + "reading", + "going", + "thing", + "hanging", + "best thing", + "wife", + "things", + "nothing", + "everything", +] diff --git a/common/response_selection.py b/common/response_selection.py index 9a8adf4756..1bd48ecd72 100644 --- a/common/response_selection.py +++ b/common/response_selection.py @@ -31,10 +31,7 @@ "dff_wiki_skill", "dff_art_skill", ] -ALMOST_ACTIVE_SKILLS = [ - "friendship_skill", - "dff_friendship_skill", -] +ALMOST_ACTIVE_SKILLS = ["friendship_skill", "dff_friendship_skill", "dff_generative_skill"] UNPREDICTABLE_SKILLS = [ "convert_reddit", "knowledge_grounding_skill", diff --git a/common/universal_templates.py b/common/universal_templates.py index 9566b7e78c..89de9f783b 100644 --- a/common/universal_templates.py +++ b/common/universal_templates.py @@ -14,13 +14,27 @@ get_entities, join_word_beginnings_in_or_pattern, ) -from common.greeting import GREETING_QUESTIONS, WHAT_DO_YOU_DO_RESPONSES, FREE_TIME_RESPONSES +from common.greeting import GREETING_QUESTIONS_TEXTS import sentry_sdk logger = logging.getLogger(__name__) sentry_sdk.init(getenv("SENTRY_DSN")) +DUMMY_DONTKNOW_RESPONSES = { + "EN": [ + "What do you 
want to talk about?", + "I am a bit confused. What would you like to chat about?", + "Sorry, probably, I didn't get what you meant. What do you want to talk about?", + "Sorry, I didn't catch that. What would you like to chat about?", + ], + "RU": [ + "О чем ты хочешь поговорить?", + "Кажется, я немного потерялась. О чем ты хочешь поговорить?", + "Извини, возможно я не совсем поняла, что ты имеешь в виду. О чем ты хочешь поговорить?", + "Извини, я не уловила информацию. О чем ты хочешь поболтать?", + ], +} # https://www.englishclub.com/vocabulary/fl-asking-for-opinions.htm UNIVERSAL_OPINION_REQUESTS = [ "This is interesting, isn't it?", @@ -75,8 +89,8 @@ def nounphrases_questions(nounphrase=None): ARTICLES = r"\s?(\ba\b|\ban\b|\bthe\b|\bsome\b|\bany\b)?\s?" -ANY_WORDS = r"[a-zA-Z0-9 ]*" -ANY_SENTENCES = r"[A-Za-z0-9-!,\?\.’'\"’ ]*" +ANY_WORDS = r"[a-zA-Zа-яА-ЯйЙёЁ0-9 ]*" +ANY_SENTENCES = r"[A-Za-zа-яА-ЯйЙёЁ0-9-!,\?\.’'\"’ ]*" END = r"([!,\?\.’'\"’]+.*|$)" BEGIN_OF_SENT = r"^(.*[!,\?\.’'\"’]+ )?" @@ -143,9 +157,32 @@ def nounphrases_questions(nounphrase=None): "care to", ] TO_ME_LIKE = [r"to me( now)?", r"with me( now)?", r"me( now)?", "now"] -SOMETHING_LIKE = ["anything", "something", "that", "everything", "thing", "stuff", "other things"] -NOTHING_LIKE = ["nothing", "none", "neither"] -DONOTKNOW_LIKE = [r"(i )?(do not|don't) know", "you (choose|decide|pick up)", "hard (to say|one)", "none"] +SOMETHING_LIKE = [ + "anything", + "something", + "that", + "everything", + "thing", + "stuff", + "other things", + "что-нибудь", + "что-то", + "что угодно", + "всё", + "что-либо", + "всякое", + "другое", +] +NOTHING_LIKE = ["nothing", "none", "neither", "ничего", "нечего", "ни о чем", "не о чем", r"ни то,? 
ни то"] +DONOTKNOW_LIKE = [ + r"(i )?(do not|don't) know", + "you (choose|decide|pick up)", + "hard (to say|one)", + "none", + r"(я )?(не знаю|без понятия)", + "(ты|сам) (выбери|выбирай|реши|решай)", + "сложно (сказать|выбрать)", +] KNOW_LIKE = ["know", "learn", "find out"] LIKE_TEMPLATE = ["like", "love", "prefer"] ASK_TEMPLATE = ["ask", "request"] @@ -156,7 +193,9 @@ def nounphrases_questions(nounphrase=None): SOMETHING_WITH_SPACES = r"\s?" + join_words_in_or_pattern(SOMETHING_LIKE) + r"?\s?" ABOUT_TOPIC = join_words_in_or_pattern(ABOUT_LIKE) + r"\s" + ANY_WORDS KNOW = join_words_in_or_pattern(KNOW_LIKE) -SOMETHING_ELSE = re.compile(r"((something|anything|everything) (else|other))", re.IGNORECASE) +SOMETHING_ELSE = re.compile( + r"((something|anything|everything|что-нибудь|что-то|что угодно|что-либо) (else|other|другом|другое))", re.IGNORECASE +) # --------------- Let's talk. / Can we talk? / Talk to me. ------------ COMPILE_LETS_TALK = re.compile( @@ -380,8 +419,6 @@ def if_not_want_to_chat_about_particular_topic(annotated_uttr, prev_annotated_ut ANY_TOPIC_AMONG_OFFERED = re.compile( r"(\bany\b|\ball\b|\beither\b|\bboth\b|don't know|not know" r"|you (choose|pick up|tell me|want|wish|like)\.?$)" ) -GREETING_QUESTIONS_TEXTS = [question.lower() for t in GREETING_QUESTIONS for question in GREETING_QUESTIONS[t]] -GREETING_QUESTIONS_TEXTS += [t.lower() for t in WHAT_DO_YOU_DO_RESPONSES + FREE_TIME_RESPONSES] def if_utterance_requests_topic(annotated_uttr): diff --git a/common/utils.py b/common/utils.py index 4dec854a4f..c8dd4864ae 100644 --- a/common/utils.py +++ b/common/utils.py @@ -106,7 +106,7 @@ "dff_grounding_skill": {"what_are_you_talking_about"}, } -low_priority_intents = {"dont_understand", "what_time"} +low_priority_intents = {"dont_understand", "what_time", "choose_topic"} combined_classes = { "factoid_classification": ["is_factoid", "is_conversational"], @@ -1117,7 +1117,7 @@ def get_common_tokens_in_lists_of_strings(list_of_strings_0, list_of_strings_1): 
return common_substrings -SYMBOLS_EXCEPT_LETTERS_AND_DIGITS = re.compile(r"[^a-zA-Z0-9\-_ ]") +SYMBOLS_EXCEPT_LETTERS_AND_DIGITS = re.compile(r"[^a-zA-Zа-яА-ЯёЁ0-9\-_ ]") DOUBLE_SPACES = re.compile(r"\s+") diff --git a/common/wiki_skill_scenarios.py b/common/wiki_skill_scenarios.py index 172e3e921f..32f7512b55 100644 --- a/common/wiki_skill_scenarios.py +++ b/common/wiki_skill_scenarios.py @@ -5,8 +5,8 @@ from common.hobbies import HOBBIES_RE from common.greeting import GREETING_QUESTIONS -HOBBIES_TEMPLATE = f"({'|'.join(GREETING_QUESTIONS['what_are_your_hobbies'])})" +HOBBIES_TEMPLATE = "|".join(sum([GREETING_QUESTIONS[lang]["what_are_your_hobbies"] for lang in ["EN", "RU"]], [])) topic_config = { "hobbies": { diff --git a/response_selectors/convers_evaluation_based_selector/Dockerfile b/response_selectors/convers_evaluation_based_selector/Dockerfile index 8072b13a20..c2a28c2450 100644 --- a/response_selectors/convers_evaluation_based_selector/Dockerfile +++ b/response_selectors/convers_evaluation_based_selector/Dockerfile @@ -33,6 +33,9 @@ COPY ./response_selectors/convers_evaluation_based_selector/requirements.txt req RUN pip install -r requirements.txt RUN python -c "import nltk; nltk.download('punkt')" +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + COPY ./response_selectors/convers_evaluation_based_selector/ ./ COPY ./common/ ./common/ diff --git a/response_selectors/convers_evaluation_based_selector/server.py b/response_selectors/convers_evaluation_based_selector/server.py index 50ba569e13..1c02c45616 100644 --- a/response_selectors/convers_evaluation_based_selector/server.py +++ b/response_selectors/convers_evaluation_based_selector/server.py @@ -13,8 +13,8 @@ from flask import Flask, request, jsonify from nltk.tokenize import sent_tokenize -from common.greeting import greeting_spec -from common.universal_templates import if_chat_about_particular_topic, if_choose_topic +from common.greeting import greeting_spec, HI_THIS_IS_DREAM +from 
common.universal_templates import if_chat_about_particular_topic, if_choose_topic, DUMMY_DONTKNOW_RESPONSES from common.utils import ( get_intent_name, low_priority_intents, @@ -52,6 +52,7 @@ "Sorry, probably, I didn't get what you mean.", "I didn't get it. Sorry", ] +LANGUAGE = getenv("LANGUAGE", "EN") @app.route("/respond", methods=["POST"]) @@ -126,7 +127,7 @@ def respond(): else: logger.info("Response Selector Error: randomly choosing response among dummy responses.") best_cand = { - "text": random.choice(MOST_DUMMY_RESPONSES), + "text": random.choice(DUMMY_DONTKNOW_RESPONSES[LANGUAGE]), "confidence": 0.1, "human_attributes": {}, "bot_attributes": {}, @@ -198,7 +199,7 @@ def rule_score_based_selection(dialog, candidates, scores, confidences, is_toxic is_intent_candidate = is_intent_candidate and intent_name not in low_priority_intents # print("is intent candidate? " + str(is_intent_candidate), flush=True) - if len(dialog["human_utterances"]) == 1 and greeting_spec not in candidates[i]["text"]: + if len(dialog["human_utterances"]) == 1 and greeting_spec[LANGUAGE] not in candidates[i]["text"]: logger.info("Dialog Beginning detected.") if ( if_chat_about_particular_topic(dialog["utterances"][0]) @@ -210,34 +211,34 @@ def rule_score_based_selection(dialog, candidates, scores, confidences, is_toxic logger.info("Particular topic. Facts + Greeting to very big score.") # I don't have an opinion on that but I know some facts. resp = candidates[i]["text"].replace("I don't have an opinion on that but I know some facts.", "") - candidates[i]["text"] = "Hi, " + greeting_spec + "! " + resp + candidates[i]["text"] = f"{HI_THIS_IS_DREAM[LANGUAGE]} {resp}" curr_score = very_big_score elif skill_names[i] == "meta_script_skill" and len(candidates[i]["text"]) > 0 and confidences[i] > 0.98: logger.info("Particular topic. meta_script_skill + Greeting to very big score.") # I don't have an opinion on that but I know some facts. 
resp = candidates[i]["text"] - candidates[i]["text"] = "Hi, " + greeting_spec + "! " + resp + candidates[i]["text"] = f"{HI_THIS_IS_DREAM[LANGUAGE]} {resp}" curr_score = very_big_score elif skill_names[i] == "small_talk_skill": logger.info("Particular topic. Small-talk + Greeting NOT to very big score.") # for now do not give small talk a very big score here - candidates[i]["text"] = "Hi, " + greeting_spec + "! " + candidates[i]["text"] + candidates[i]["text"] = f"{HI_THIS_IS_DREAM[LANGUAGE]} {candidates[i]['text']}" # curr_score = very_big_score elif if_choose_topic(dialog["utterances"][0]) and "about it" not in dialog["utterances"][0]["text"].lower(): logger.info("User wants bot to choose the topic") # if user says `let's chat about something` if skill_names[i] == "small_talk_skill": logger.info("No topic. Small-talk + Greeting to very big score.") - candidates[i]["text"] = "Hi, " + greeting_spec + "! " + candidates[i]["text"] + candidates[i]["text"] = f"{HI_THIS_IS_DREAM[LANGUAGE]} {candidates[i]['text']}" curr_score = very_big_score elif skill_names[i] == "meta_script_skill" and len(candidates[i]["text"]) > 0: logger.info("No topic. Meta-script + Greeting to very big score.") - candidates[i]["text"] = "Hi, " + greeting_spec + "! " + candidates[i]["text"] + candidates[i]["text"] = f"{HI_THIS_IS_DREAM[LANGUAGE]} {candidates[i]['text']}" curr_score = very_big_score else: logger.info("User just wants to talk.") # if user says something else - if skill_names[i] == "program_y" and greeting_spec in candidates[i]["text"]: + if skill_names[i] == "program_y" and greeting_spec[LANGUAGE] in candidates[i]["text"]: logger.info("Just chat. 
Program-y to very big score.") curr_score = very_big_score elif ( @@ -246,7 +247,7 @@ def rule_score_based_selection(dialog, candidates, scores, confidences, is_toxic and len(dialog["utterances"]) < 16 ): curr_score = very_big_score - elif skill_names[i] == "dff_friendship_skill" and greeting_spec in candidates[i]["text"]: + elif skill_names[i] == "dff_friendship_skill" and greeting_spec[LANGUAGE] in candidates[i]["text"]: if len(dialog["utterances"]) < 2: curr_score = very_big_score else: @@ -324,6 +325,7 @@ def rule_score_based_selection(dialog, candidates, scores, confidences, is_toxic best_id = np.argmax(curr_single_scores) best_candidate = candidates[best_id] best_skill_name = skill_names[int(best_id)] + prev_skill_names = [uttr["skill_name"] for uttr in dialog["bot_utterances"][-5:]] best_candidate = add_question_to_statement( best_candidate, @@ -333,6 +335,7 @@ def rule_score_based_selection(dialog, candidates, scores, confidences, is_toxic link_to_question, link_to_human_attrs, not_sure_factoid, + prev_skill_names, ) return best_candidate, best_id, curr_single_scores @@ -343,7 +346,7 @@ def select_response(candidates, scores, confidences, is_toxics, dialog, all_prev n_toxic_candidates, scores, confidences = downscore_toxic_badlisted_responses(scores, confidences, is_toxics) if n_toxic_candidates == len(candidates): # the most dummy заглушка на случай, когда все абсолютно скиллы вернули токсичные ответы - return None, np.random.choice(MOST_DUMMY_RESPONSES), 1.0, {}, {} + return None, np.random.choice(DUMMY_DONTKNOW_RESPONSES[LANGUAGE]), 1.0, {}, {} # REPEAT checks bot_utterances = [sent_tokenize(uttr["text"].lower()) for uttr in dialog["bot_utterances"]] @@ -371,9 +374,9 @@ def select_response(candidates, scores, confidences, is_toxics, dialog, all_prev best_human_attributes = best_candidate.get("human_attributes", {}) best_bot_attributes = best_candidate.get("bot_attributes", {}) - if len(dialog["bot_utterances"]) == 0 and greeting_spec not in best_text: + 
if len(dialog["bot_utterances"]) == 0 and greeting_spec[LANGUAGE] not in best_text: # add greeting to the first bot uttr, if it's not already included - best_text = "Hi, " + greeting_spec + "! " + best_text + best_text = f"{HI_THIS_IS_DREAM[LANGUAGE]} {best_text}" while candidates[best_id]["text"] == "" or candidates[best_id]["confidence"] == 0.0: curr_single_scores[int(best_id)] = 0.0 diff --git a/response_selectors/convers_evaluation_based_selector/tag_based_selection.py b/response_selectors/convers_evaluation_based_selector/tag_based_selection.py index b94c0b5db1..8b0b42233b 100644 --- a/response_selectors/convers_evaluation_based_selector/tag_based_selection.py +++ b/response_selectors/convers_evaluation_based_selector/tag_based_selection.py @@ -37,6 +37,7 @@ misheard_with_spec2, join_used_links_in_attributes, get_updated_disliked_skills, + LET_ME_ASK_YOU_PHRASES, ) from common.response_selection import ( ACTIVE_SKILLS, @@ -57,6 +58,7 @@ PROMPT_PROBA = float(getenv("PROMPT_PROBA", 0.3)) ACKNOWLEDGEMENT_PROBA = float(getenv("ACKNOWLEDGEMENT_PROBA", 0.5)) PRIORITIZE_SCRIPTED_SKILLS = int(getenv("PRIORITIZE_SCRIPTED_SKILLS", 1)) +LANGUAGE = getenv("LANGUAGE", "EN") logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.DEBUG) logger = logging.getLogger(__name__) @@ -259,14 +261,13 @@ def compute_curr_single_scores(candidates, scores, confidences): cand_scores = scores[i] confidence = confidences[i] skill_name = candidates[i]["skill_name"] - score_conv_eval = calculate_single_convers_evaluator_score(cand_scores) + if all(["dialogrpt" in cand["annotations"] for cand in candidates]): + score_conv_eval = candidates[i]["annotations"]["dialogrpt"] + else: + score_conv_eval = calculate_single_convers_evaluator_score(cand_scores) score = CONV_EVAL_STRENGTH * score_conv_eval + CONFIDENCE_STRENGTH * confidence - toxicity = max(candidates[i].get("annotations", {}).get("toxic_classification", {"toxic": 0.0}).values()) - logger.info( - 
f"Skill {skill_name} has final score: {score}. Confidence: {confidence}. " - f"Toxicity: {toxicity}. Cand scores: {cand_scores}" - ) + logger.info(f"Skill {skill_name} has final score: {score}. Confidence: {confidence}.") curr_single_scores.append(score) return curr_single_scores @@ -290,7 +291,10 @@ def does_not_require_prompt(candidates, best_cand_id): _is_not_add_prompt_skill = candidates[best_cand_id]["skill_name"] in NOT_ADD_PROMPT_SKILLS _is_any_question = is_any_question_sentence_in_utterance(candidates[best_cand_id]) - _can_continue = candidates[best_cand_id].get("can_continue", CAN_NOT_CONTINUE) != CAN_NOT_CONTINUE + _can_continue = ( + candidates[best_cand_id].get("can_continue", CAN_NOT_CONTINUE) != CAN_NOT_CONTINUE + and candidates[best_cand_id]["skill_name"] in ACTIVE_SKILLS + ) if ( _is_already_prompt or _is_question @@ -617,7 +621,7 @@ def tag_based_response_selection(dialog, candidates, scores, confidences, bot_ut if ( len(dialog["human_utterances"]) == 1 and cand_uttr["skill_name"] == "dff_friendship_skill" - and greeting_spec in cand_uttr["text"] + and any([g in cand_uttr["text"] for g in greeting_spec.values()]) ): categorized_hyps = add_to_top1_category(cand_id, categorized_hyps, _is_require_action_intent) elif ( @@ -731,7 +735,14 @@ def tag_based_response_selection(dialog, candidates, scores, confidences, bot_ut # as we have only one active skill, let's consider active skill as that one providing prompt # but we also need to reassign all the attributes best_prompt = candidates[best_prompt_id] - best_candidate["text"] = f'{best_candidate["text"]} {best_prompt["text"]}' + + if "prelinkto_connections" in best_prompt.get("human_attributes", {}): + # prelinkto connection phrase is already in the prompt (added in dummy skill) + best_candidate["text"] = f'{best_candidate["text"]} {best_prompt["text"]}' + else: + prelinkto = np.random.choice(LET_ME_ASK_YOU_PHRASES[LANGUAGE]) + best_candidate["text"] = f'{best_candidate["text"]} {prelinkto} 
{best_prompt["text"]}' + + best_candidate["attributes"] = best_candidate.get("attributes", {}) + best_candidate["attributes"]["prompt_skill"] = best_prompt diff --git a/response_selectors/convers_evaluation_based_selector/utils.py b/response_selectors/convers_evaluation_based_selector/utils.py index 25f0009df5..81a2d60bc5 100644 --- a/response_selectors/convers_evaluation_based_selector/utils.py +++ b/response_selectors/convers_evaluation_based_selector/utils.py @@ -21,6 +21,7 @@ ) sentry_sdk.init(getenv("SENTRY_DSN")) +LANGUAGE = getenv("LANGUAGE", "EN") logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.DEBUG) logger = logging.getLogger(__name__) @@ -34,12 +35,20 @@ misheard_with_spec1 = "I misheard you" misheard_with_spec2 = "like to chat about" -LET_ME_ASK_YOU_PHRASES = [ - "Let me ask you something.", - "I would like to ask you a question.", - "Hey, I have a quesiton to you.", - "May I ask you one interesting thing.", -] +LET_ME_ASK_YOU_PHRASES = { + "EN": [ + "Let me ask you something.", + "I would like to ask you a question.", + "Hey, I have a question for you.", + "May I ask you one interesting thing?", + ], + "RU": [ + "Я бы хотела кое-что спросить.", + "О, у меня как раз есть вопрос для обсуждения.", + "Я хочу спросить тебя кое-о-чем интересном.", + "У меня есть кое-что интересное для обсуждения.", + ], +} def join_used_links_in_attributes(main_attrs, add_attrs): @@ -61,6 +70,7 @@ def add_question_to_statement( link_to_question, link_to_human_attrs, not_sure_factoid, + prev_skill_names, ): if not_sure_factoid and "factoid_qa" in best_skill_name: @@ -68,7 +78,7 @@ if best_candidate["text"].strip() in okay_statements: if dummy_question != "" and random.random() < ASK_DUMMY_QUESTION_PROB: logger.info(f"adding {dummy_question} to response.") - best_candidate["text"] += f"{np.random.choice(LET_ME_ASK_YOU_PHRASES)} {dummy_question}" + best_candidate["text"] += 
f"{np.random.choice(LET_ME_ASK_YOU_PHRASES[LANGUAGE])} {dummy_question}" # if this is not a link-to question, bot attributes will be still empty best_candidate["human_attributes"] = join_used_links_in_attributes( best_candidate.get("human_attributes", {}), dummy_question_human_attr @@ -80,6 +90,10 @@ def add_question_to_statement( best_candidate["human_attributes"] = join_used_links_in_attributes( best_candidate.get("human_attributes", {}), link_to_human_attrs ) + elif LANGUAGE == "RU" and best_skill_name == "dff_generative_skill": + if prev_skill_names[-3:] == 3 * ["dff_generative_skill"] and random.random() < ASK_DUMMY_QUESTION_PROB: + logger.info(f"adding russian {dummy_question} to dff-generative-skill response.") + best_candidate["text"] += f"{np.random.choice(LET_ME_ASK_YOU_PHRASES[LANGUAGE])} {dummy_question}" return best_candidate diff --git a/response_selectors/rule_based_response_selector/Dockerfile b/response_selectors/rule_based_response_selector/Dockerfile index dfc9474f90..62fdb03abd 100644 --- a/response_selectors/rule_based_response_selector/Dockerfile +++ b/response_selectors/rule_based_response_selector/Dockerfile @@ -5,6 +5,9 @@ RUN mkdir /src COPY ./requirements.txt /src/requirements.txt RUN pip install -r /src/requirements.txt +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + COPY . /src/ WORKDIR /src diff --git a/services/dialogpt_RU/Dockerfile b/services/dialogpt_RU/Dockerfile new file mode 100644 index 0000000000..608116d5d1 --- /dev/null +++ b/services/dialogpt_RU/Dockerfile @@ -0,0 +1,23 @@ +# syntax=docker/dockerfile:experimental + +FROM pytorch/pytorch:1.5-cuda10.1-cudnn7-runtime + +WORKDIR /src + +ARG PRETRAINED_MODEL_NAME_OR_PATH +ENV PRETRAINED_MODEL_NAME_OR_PATH ${PRETRAINED_MODEL_NAME_OR_PATH} +ARG SERVICE_PORT +ENV SERVICE_PORT ${SERVICE_PORT} + +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + +COPY ./requirements.txt /src/requirements.txt +RUN pip install -r /src/requirements.txt + +COPY . 
/src + +HEALTHCHECK --interval=5s --timeout=90s --retries=3 CMD curl --fail 127.0.0.1:${SERVICE_PORT}/healthcheck || exit 1 + + +CMD gunicorn --workers=1 server:app -b 0.0.0.0:${SERVICE_PORT} --timeout=300 diff --git a/services/dialogpt_RU/README.md b/services/dialogpt_RU/README.md new file mode 100644 index 0000000000..3a5f14484f --- /dev/null +++ b/services/dialogpt_RU/README.md @@ -0,0 +1,3 @@ +GPU RAM = 1Gb +cpu time = 0.15 sec +gpu time = 0.05 sec \ No newline at end of file diff --git a/services/dialogpt_RU/requirements.txt b/services/dialogpt_RU/requirements.txt new file mode 100644 index 0000000000..d45fabd47e --- /dev/null +++ b/services/dialogpt_RU/requirements.txt @@ -0,0 +1,10 @@ +transformers==4.0.1 +sentencepiece==0.1.94 +flask==1.1.1 +gunicorn==19.9.0 +requests==2.22.0 +sentry-sdk[flask]==0.14.1 +healthcheck==1.3.3 +itsdangerous==2.0.1 +jinja2<=3.0.3 +Werkzeug<=2.0.3 diff --git a/services/dialogpt_RU/server.py b/services/dialogpt_RU/server.py new file mode 100644 index 0000000000..c001a19c4f --- /dev/null +++ b/services/dialogpt_RU/server.py @@ -0,0 +1,166 @@ +""" +Source code is https://github.com/Grossmend/DialoGPT/blob/master/src/service/service.py +""" +import logging +import time +import os +from typing import Dict, List + +from transformers import AutoTokenizer, AutoModelForCausalLM +import torch +from flask import Flask, request, jsonify +from healthcheck import HealthCheck +import sentry_sdk +from sentry_sdk.integrations.flask import FlaskIntegration + +sentry_sdk.init(dsn=os.getenv("SENTRY_DSN"), integrations=[FlaskIntegration()]) + + +logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) +logger = logging.getLogger(__name__) + +PRETRAINED_MODEL_NAME_OR_PATH = os.environ.get( + "PRETRAINED_MODEL_NAME_OR_PATH", "Grossmend/rudialogpt3_medium_based_on_gpt2" +) +logger.info(f"PRETRAINED_MODEL_NAME_OR_PATH = {PRETRAINED_MODEL_NAME_OR_PATH}") + +cuda = torch.cuda.is_available() +if cuda: + 
torch.cuda.set_device(0) + device = "cuda" +else: + device = "cpu" + +logger.info(f"dialogpt is set to run on {device}") + +params_default = { + "max_length": 256, + "no_repeat_ngram_size": 3, + "do_sample": True, + "top_k": 100, + "top_p": 0.9, + "temperature": 0.6, + "num_return_sequences": 3, + "device": device, + "is_always_use_length": True, + "length_generate": "1", +} + + +class RussianDialogGPT: + def __init__(self, path_model: str): + self.path_model = path_model + self.tokenizer = None + self.model = None + self._load_model() + + def _load_model(self): + logger.info(f"dialogpt Loading model: {self.path_model} ...") + self.tokenizer = AutoTokenizer.from_pretrained(self.path_model) + self.model = AutoModelForCausalLM.from_pretrained(self.path_model) + + def get_responses(self, inputs: List[Dict], params: Dict) -> List[str]: + + params_ = { + "max_length": params.get("max_length", params_default["max_length"]), + "no_repeat_ngram_size": params.get("no_repeat_ngram_size", params_default["no_repeat_ngram_size"]), + "do_sample": params.get("do_sample", params_default["do_sample"]), + "top_k": params.get("top_k", params_default["top_k"]), + "top_p": params.get("top_p", params_default["top_p"]), + "temperature": params.get("temperature", params_default["temperature"]), + "num_return_sequences": params.get("num_return_sequences", params_default["num_return_sequences"]), + "device": params.get("device", params_default["device"]), + "is_always_use_length": params.get("is_always_use_length", params_default["is_always_use_length"]), + "length_generate": params.get("length_generate", params_default["length_generate"]), + } + + inputs_text = "" + for input_ in inputs: + if params_["is_always_use_length"]: + length_rep = len(self.tokenizer.encode(input_["text"])) + if length_rep <= 15: + length_param = "1" + elif length_rep <= 50: + length_param = "2" + elif length_rep <= 256: + length_param = "3" + else: + length_param = "-" + else: + length_param = "-" + inputs_text += 
f"|{input_['speaker']}|{length_param}|{input_['text']}" + inputs_text += f"|1|{params_['length_generate']}|" + + inputs_token_ids = self.tokenizer.encode(inputs_text, return_tensors="pt") + inputs_token_ids = inputs_token_ids.cuda() if cuda else inputs_token_ids + + try: + outputs_token_ids = self.model.generate( + inputs_token_ids, + max_length=params_["max_length"], + no_repeat_ngram_size=params_["no_repeat_ngram_size"], + do_sample=params_["do_sample"], + top_k=params_["top_k"], + top_p=params_["top_p"], + temperature=params_["temperature"], + num_return_sequences=params_["num_return_sequences"], + device=params_["device"], + mask_token_id=self.tokenizer.mask_token_id, + eos_token_id=self.tokenizer.eos_token_id, + unk_token_id=self.tokenizer.unk_token_id, + pad_token_id=self.tokenizer.pad_token_id, + ) + except Exception as e: + logger.info(f"dialogpt Error generate: {str(e)}") + return [] + + outputs = [self.tokenizer.decode(x, skip_special_tokens=True) for x in outputs_token_ids] + outputs = [x.split("|")[-1] for x in outputs] + # outputs contains list of strings of possible hypotheses + return outputs + + +try: + model = RussianDialogGPT(PRETRAINED_MODEL_NAME_OR_PATH) + model.model.eval() + if cuda: + model.model.cuda() + + logger.info("dialogpt model is ready") +except Exception as e: + sentry_sdk.capture_exception(e) + logger.exception(e) + raise e + +app = Flask(__name__) +health = HealthCheck(app, "/healthcheck") +logging.getLogger("werkzeug").setLevel("WARNING") + + +@app.route("/respond", methods=["POST"]) +def respond(): + st_time = time.time() + + dialog_contexts = request.json.get("dialog_contexts", []) + num_return_sequences = request.json.get("num_return_sequences", 5) + + try: + batch_generated_responses = [] + for context in dialog_contexts: + # context is a list of dicts, each dict contains text and speaker label + # context = [{"text": "utterance text", "speaker": "human"}, ...] 
+ inputs = [{"text": uttr["text"], "speaker": 1 if uttr["speaker"] == "bot" else 0} for uttr in context][-3:] + logger.info(f"dialogpt inputs: {inputs}") + hypotheses = model.get_responses(inputs, params={"num_return_sequences": num_return_sequences}) + logger.info(f"dialogpt hypotheses: {hypotheses}") + batch_generated_responses.append(hypotheses) + + except Exception as exc: + logger.exception(exc) + sentry_sdk.capture_exception(exc) + batch_generated_responses = [[]] * len(dialog_contexts) + + total_time = time.time() - st_time + logger.info(f"dialogpt exec time: {total_time:.3f}s") + + return jsonify({"generated_responses": batch_generated_responses}) diff --git a/services/dialogpt_RU/test.py b/services/dialogpt_RU/test.py new file mode 100644 index 0000000000..16963d29c1 --- /dev/null +++ b/services/dialogpt_RU/test.py @@ -0,0 +1,23 @@ +import requests + + +def test_respond(): + url = "http://0.0.0.0:8091/respond" + + dialog_contexts = [ + [ + {"speaker": "human", "text": "Привет, как день прошел?"}, + {"speaker": "bot", "text": "Хорошо, а у тебя как?"}, + {"speaker": "human", "text": "Нормально, посоветуй фильм посмотреть"}, + ] + ] + + request_data = {"dialog_contexts": dialog_contexts} + result = requests.post(url, json=request_data).json()["generated_responses"][0] + + assert len(result) == 5 and len(result[0]) > 0, f"Got\n{result}" + print("Success!") + + +if __name__ == "__main__": + test_respond() diff --git a/services/dialogpt_RU/test.sh b/services/dialogpt_RU/test.sh new file mode 100755 index 0000000000..cf55721bd3 --- /dev/null +++ b/services/dialogpt_RU/test.sh @@ -0,0 +1,4 @@ +#!/bin/bash + + +python test.py diff --git a/services/dialogrpt_ru/Dockerfile b/services/dialogrpt_ru/Dockerfile new file mode 100644 index 0000000000..8bf1368a9e --- /dev/null +++ b/services/dialogrpt_ru/Dockerfile @@ -0,0 +1,26 @@ +# syntax=docker/dockerfile:experimental + +FROM pytorch/pytorch:1.5-cuda10.1-cudnn7-runtime + +RUN apt-get update && apt-get install -y 
--allow-unauthenticated wget && rm -rf /var/lib/apt/lists/* + +WORKDIR /src + +ARG PRETRAINED_MODEL_FNAME +ENV PRETRAINED_MODEL_FNAME ${PRETRAINED_MODEL_FNAME} +ARG SERVICE_PORT +ENV SERVICE_PORT ${SERVICE_PORT} +ARG TOKENIZER_NAME_OR_PATH +ENV TOKENIZER_NAME_OR_PATH ${TOKENIZER_NAME_OR_PATH} + +RUN mkdir /data/ + +RUN wget -c -q http://files.deeppavlov.ai/deeppavlov_data/${PRETRAINED_MODEL_FNAME} -P /data/ + +COPY ./requirements.txt /src/requirements.txt +RUN pip install -r /src/requirements.txt + +COPY . /src + +CMD gunicorn --workers=1 server:app -b 0.0.0.0:${SERVICE_PORT} --timeout=300 + diff --git a/services/dialogrpt_ru/README.md b/services/dialogrpt_ru/README.md new file mode 100644 index 0000000000..b141998eb1 --- /dev/null +++ b/services/dialogrpt_ru/README.md @@ -0,0 +1,15 @@ +# Russian DialogRPT model + +Code from https://github.com/golsun/DialogRPT + +Trained on 827k samples (plus 95k validation samples) from the Russian Pikabu website. + +The data was parsed from Pikabu by `zhirzemli` (OpenDataScience Slack nickname); the parsing code is available [on GitHub](https://github.com/alexeykarnachev/dialogs_data_parsers) +and the data is available [here](https://drive.google.com/file/d/1XYCprTqn_MlzDD9qgj7ANJkwFigK66mv/view?usp=sharing). + +Final accuracy: 0.64 on the validation set. + +Trained on 8 GPUs. 
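Elsewhere in this patch, `compute_curr_single_scores` in `tag_based_selection.py` prefers this model's score whenever every candidate hypothesis carries a `dialogrpt` annotation, blending it with the skill's own confidence. A minimal sketch of that blending — the weight constants and the candidate structure here are illustrative assumptions, not values from the patch:

```python
# Illustrative sketch (assumed weights): blend a DialogRPT score with a
# skill's own confidence, in the style of compute_curr_single_scores.
CONV_EVAL_STRENGTH = 0.4   # assumed weight for the conversation-quality score
CONFIDENCE_STRENGTH = 0.8  # assumed weight for the skill's own confidence


def blended_score(candidate: dict) -> float:
    # Prefer the DialogRPT annotation when present; fall back to a
    # conversation-evaluator score (or 0.0) otherwise.
    conv_eval = candidate["annotations"].get("dialogrpt", candidate.get("conv_eval", 0.0))
    return CONV_EVAL_STRENGTH * conv_eval + CONFIDENCE_STRENGTH * candidate["confidence"]


cand = {"annotations": {"dialogrpt": 0.64}, "confidence": 0.9}
print(round(blended_score(cand), 3))  # 0.4 * 0.64 + 0.8 * 0.9 = 0.976
```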
+``` +python src/main.py train --data=data/out/updown --min_score_gap=20 --min_rank_gap=0.5 --max_seq_len 256 --batch 16 1>out.txt 2>&1 +``` \ No newline at end of file diff --git a/services/dialogrpt_ru/data_pikabu.py b/services/dialogrpt_ru/data_pikabu.py new file mode 100644 index 0000000000..bc3ede9539 --- /dev/null +++ b/services/dialogrpt_ru/data_pikabu.py @@ -0,0 +1,885 @@ +# author: Xiang Gao at Microsoft Research AI NLP Group + +import json +import os +import pickle +import numpy as np + + +def valid_sub(sub): + if sub.upper() in [ + "CON", + "PRN", + "AUX", + "NUL", + "COM1", + "COM2", + "COM3", + "COM4", + "COM5", + "COM6", + "COM7", + "COM8", + "COM9", + "LPT1", + "LPT2", + "LPT3", + "LPT4", + "LPT5", + "LPT6", + "LPT7", + "LPT8", + "LPT9", + ]: + # not allowed by Windows system + return False + if ":" in sub: + return False + return True + + +def get_dates(year_from, year_to=None): + if year_to is None: + year_to = year_from + dates = [] + for year in range(year_from, year_to + 1): + for _mo in range(1, 2): + mo = str(_mo) + if len(mo) == 1: + mo = "0" + mo + dates.append(str(year) + "-" + mo) + return dates + + +def extract_rc(date): + nodes = dict() + edges = dict() + subs = set() + n = 0 + m = 0 + kk = ["body", "link_id", "name", "parent_id", "subreddit"] + + def save(nodes, edges): + for sub in nodes: + fld = fld_jsonl + "/" + sub + try: + os.makedirs(fld, exist_ok=True) + except NotADirectoryError as e: + print(e) + continue + if sub not in subs: + open(fld + "/%s_nodes.jsonl" % date, "w", encoding="utf-8") + open(fld + "/%s_edges.tsv" % date, "w", encoding="utf-8") + subs.add(sub) + with open(fld + "/%s_nodes.jsonl" % date, "a", encoding="utf-8") as f: + f.write("\n".join(nodes[sub]) + "\n") + with open(fld + "/%s_edges.tsv" % date, "a", encoding="utf-8") as f: + f.write("\n".join(edges[sub]) + "\n") + + fpath = "./data/rc_pikabu" + for fname in os.listdir(fpath): + with open(f"{fpath}/{fname}", "r") as ff: + ffdata = json.load(ff) + + for node 
in ffdata: + n += 1 + line = json.dumps(node, ensure_ascii=False) + + ok = True + for k in kk: + if k not in node: + ok = False + break + if not ok: + break + + if not valid_sub(node["subreddit"]): + continue + + if node["subreddit"] not in nodes: + nodes[node["subreddit"]] = [] + edges[node["subreddit"]] = [] + nodes[node["subreddit"]].append(line) + edges[node["subreddit"]].append("%s\t%s\t%s" % (node["link_id"], node["parent_id"], node["name"])) + + m += 1 + if m % 1e5 == 0: + save(nodes, edges) + print("[RC_%s] saved %.2f/%.2f M, %i subreddits" % (date, m / 1e6, n / 1e6, len(subs))) + nodes = dict() + edges = dict() + + save(nodes, edges) + print("[RC_%s] FINAL %.2f/%.2f M, %i subreddits ================" % (date, m / 1e6, n / 1e6, len(subs))) + with open(fld_jsonl + "/readme.txt", "a", encoding="utf-8") as f: + f.write("[%s] saved %i/%i\n" % (date, m, n)) + + +def extract_rs(date): + roots = dict() + subs = set() + n = 0 + m = 0 + kk = ["selftext", "id", "title", "subreddit"] + + def save(roots): + for sub in roots: + fld = fld_jsonl + "/" + sub + try: + os.makedirs(fld, exist_ok=True) + except NotADirectoryError as e: + print(e) + continue + if sub not in subs: + open(fld + "/%s_roots.jsonl" % date, "w", encoding="utf-8") + subs.add(sub) + with open(fld + "/%s_roots.jsonl" % date, "a", encoding="utf-8") as f: + f.write("\n".join(roots[sub]) + "\n") + + fpath = "./data/rs_pikabu" + for fname in os.listdir(fpath): + with open(f"{fpath}/{fname}", "r") as ff: + ffdata = json.load(ff) + + previous_line = "" + for i, node in enumerate(ffdata): + n += 1 + line = json.dumps(node, ensure_ascii=False) + if i == 0: + line = previous_line + line + + try: + root = json.loads(line) + except Exception: + continue + + ok = True + for k in kk: + if k not in root: + ok = False + break + if not ok: + break + if not valid_sub(root["subreddit"]): + continue + + # some bz2, e.g. 
2012-09, doesn't have the `name` entry
+            if "name" not in root:
+                root["name"] = "t3_" + root["id"]
+
+            if root["subreddit"] not in roots:
+                roots[root["subreddit"]] = []
+            roots[root["subreddit"]].append(line)
+
+            m += 1
+            if m % 1e4 == 0:
+                save(roots)
+                print("[RS_%s] saved %.2f/%.2f M, %i subreddits" % (date, m / 1e6, n / 1e6, len(subs)))
+                roots = dict()
+        previous_line = json.dumps(ffdata[-1], ensure_ascii=False)
+
+    save(roots)
+    print("[RS_%s] FINAL %.2f/%.2f M, %i subreddits ================" % (date, m / 1e6, n / 1e6, len(subs)))
+    with open(fld_jsonl + "/readme_roots.txt", "a", encoding="utf-8") as f:
+        f.write("[%s] saved %i/%i\n" % (date, m, n))
+
+
+def extract_txt(sub, year, tokenizer, overwrite=False, max_subword=3):
+    fld = "%s/%s" % (fld_subs, sub)
+    os.makedirs(fld, exist_ok=True)
+    path_out = "%s/%i_txt.tsv" % (fld, year)
+    path_done = path_out + ".done"
+    if not overwrite and os.path.exists(path_done):
+        return
+
+    dates = get_dates(year)
+    open(path_out, "w", encoding="utf-8")
+
+    def clean(txt):
+        if txt.strip() in ["[deleted]", "[removed]"]:
+            return None
+        if ">" in txt or "&gt;" in txt:  # no quoted comment in line ("&gt;" means ">")
+            return None
+
+        # deal with URL
+        txt = txt.replace("](", "] (")
+        ww = []
+        for w in txt.split():
+            if len(w) == 0:
+                continue
+            if "://" in w.lower() or "http" in w.lower():
+                ww.append("(URL)")
+            else:
+                ww.append(w)
+        if not ww:
+            return None
+        if len(ww) > 30:  # focus on dialog, so ignore long txt
+            return None
+        txt = " ".join(ww)
+        for c in ["\t", "\n", "\r"]:  # delimiter or newline
+            txt = txt.replace(c, " ")
+
+        ids = tokenizer.encode(txt)
+        if len(ids) / len(ww) > max_subword:  # usually < 1.5.
too large means too many unknown words + return None + + ids = " ".join([str(x) for x in ids]) + return txt, ids + + lines = [] + m = 0 + n = 0 + name_set = set() + for date in dates: + path = "%s/%s/%s_nodes.jsonl" % (fld_jsonl, sub, date) + if not os.path.exists(path): + continue + for line in open(path, encoding="utf-8"): + n += 1 + d = json.loads(line.strip("\n")) + if d["name"] in name_set: + continue + name_set.add(d["name"]) + txt_ids = clean(d["body"]) + if txt_ids is not None: + txt, ids = txt_ids + lines.append("%s\t%s\t%s" % (d["name"], txt, ids)) + m += 1 + if m % 1e4 == 0: + with open(path_out, "a", encoding="utf-8") as f: + f.write("\n".join(lines) + "\n") + lines = [] + + for date in dates: + path = "%s/%s/%s_roots.jsonl" % (fld_jsonl, sub, date) + if not os.path.exists(path): + continue + for line in open(path, encoding="utf-8"): + n += 1 + d = json.loads(line.strip("\n")) + if "name" not in d: + d["name"] = "t3_" + d["id"] + if d["name"] in name_set: + continue + name_set.add(d["name"]) + txt_ids = clean(d["title"] + " " + d["selftext"]) + if txt_ids is not None: + txt, ids = txt_ids + lines.append("%s\t%s\t%s" % (d["name"], txt, ids)) + m += 1 + if m % 1e4 == 0: + with open(path_out, "a", encoding="utf-8") as f: + f.write("\n".join(lines) + "\n") + lines = [] + if lines: + with open(path_out, "a", encoding="utf-8") as f: + f.write("\n".join(lines)) + + s = "[%s %s] txt kept %i/%i" % (sub, year, m, n) + with open(path_done, "w") as f: + f.write(s) + print(s) + + +def extract_trees(sub, year): + fld = "%s/%s" % (fld_subs, sub) + os.makedirs(fld, exist_ok=True) + path_out = "%s/%i_trees.pkl" % (fld, year) + if os.path.exists(path_out): + print(f"{path_out} exists. 
Skipping `extract_trees`") + return + + trees = dict() + n = 0 + for date in get_dates(year): + path = "%s/%s/%s_edges.tsv" % (fld_jsonl, sub, date) + if not os.path.exists(path): + print("extract_trees -- no such file: " + path) + continue + for line in open(path, encoding="utf-8"): + n += 1 + link, parent, child = line.strip("\n").split("\t") + if link not in trees: + trees[link] = dict() + trees[link][(parent, child)] = date + + if not trees: + print("no trees. Return from extract_trees") + return + + print("[%s %i] %i trees %.1f nodes/tree" % (sub, year, len(trees), n / len(trees))) + os.makedirs(fld, exist_ok=True) + pickle.dump(trees, open(path_out, "wb")) + + +def extract_time(sub, year, overwrite=False): + fld = "%s/%s" % (fld_subs, sub) + os.makedirs(fld, exist_ok=True) + path_out = "%s/%i_time.tsv" % (fld, year) + path_done = path_out + ".done" + if not overwrite and os.path.exists(path_done): + print(f"{path_out} exists. Skipping `extract_time`") + return + dates = get_dates(year) + suffix = "nodes" + os.makedirs(fld, exist_ok=True) + open(path_out, "w", encoding="utf-8") + + lines = [] + m = 0 + n = 0 + name_set = set() + for date in dates: + path = "%s/%s/%s_%s.jsonl" % (fld_jsonl, sub, date, suffix) + if not os.path.exists(path): + continue + for line in open(path, encoding="utf-8"): + n += 1 + d = json.loads(line.strip("\n")) + if "name" not in d: + d["name"] = "t3_" + d["id"] + if d["name"] in name_set: + continue + name_set.add(d["name"]) + t = d["created_utc"] + lines.append("%s\t%s" % (d["name"], t)) + m += 1 + if m % 1e4 == 0: + with open(path_out, "a", encoding="utf-8") as f: + f.write("\n".join(lines) + "\n") + lines = [] + with open(path_out, "a", encoding="utf-8") as f: + f.write("\n".join(lines)) + + s = "[%s %s] time kept %i/%i" % (sub, year, m, n) + with open(path_done, "w") as f: + f.write(s) + print(s) + + +def calc_feedback(sub, year, overwrite=False): + fld = "%s/%s" % (fld_subs, sub) + path_out = "%s/%i_feedback.tsv" % (fld, year) + 
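`calc_feedback` (started above) scores every reply of a thread by first enumerating all root-to-leaf paths of the reply tree with a breadth-first walk. A minimal, self-contained sketch of that traversal, using hypothetical comment ids in place of the `(parent, child)` edges stored by `extract_trees`:

```python
# Minimal sketch of the BFS used in calc_feedback to enumerate all
# root-to-leaf paths of a reply tree. The edges below are hypothetical
# (parent, child) pairs, like those written by extract_trees.
def root_to_leaf_paths(root, edges):
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    paths = []
    q = [[root]]
    while q:
        path = q.pop(0)
        head = path[-1]
        if head not in children:  # head is a leaf, so the path is complete
            paths.append(path)
        else:
            for child in children[head]:
                q.append(path + [child])
    return paths


edges = [("t3_a", "t1_b"), ("t3_a", "t1_c"), ("t1_b", "t1_d")]
print(root_to_leaf_paths("t3_a", edges))
# [['t3_a', 't1_c'], ['t3_a', 't1_b', 't1_d']]
```

From these paths, `calc_feedback` derives each node's descendant count (`vol`), direct-child count (`width`), and longest subpath (`depth`).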
path_done = path_out + ".done"
+    if not overwrite and os.path.exists(path_done):
+        print(f"{path_out} exists. Skipping `calc_feedback`")
+        return
+
+    path_pkl = "%s/%i_trees.pkl" % (fld, year)
+    if not os.path.exists(path_pkl):
+        print(f"{path_pkl} does not exist. Skipping `calc_feedback`")
+        return
+    trees = pickle.load(open(path_pkl, "rb"))
+    if not trees:
+        return
+
+    dates = get_dates(year)
+    updown = dict()
+    for date in dates:
+        path = "%s/%s/%s_nodes.jsonl" % (fld_jsonl, sub, date)
+        if not os.path.exists(path):
+            continue
+        for line in open(path, encoding="utf-8"):
+            d = json.loads(line.strip("\n"))
+            updown[d["name"]] = d["ups"] - d["downs"]
+
+    if not updown:
+        print("empty updown")
+        return
+
+    with open(path_out, "w", encoding="utf-8") as f:
+        f.write("\t".join(["#path", "vol", "width", "depth", "updown"]) + "\n")
+
+    print("[%s %s] calculating scores for %i trees" % (sub, year, len(trees)))
+
+    n_tree = 0
+    n_node = 0
+    for root in trees:
+        tree = trees[root]
+        children = dict()
+        for parent, child in tree:
+            if parent not in children:
+                children[parent] = []
+            children[parent].append(child)
+        if root not in children:
+            continue
+
+        # BFS to get all paths from root to leaf
+        q = [[root]]
+        paths = []
+        while q:
+            qsize = len(q)
+            for _ in range(qsize):
+                path = q.pop(0)
+                head = path[-1]
+                if head not in children:  # then head is a leaf
+                    paths.append(path)
+                    continue
+                for child in children[head]:
+                    q.append(path + [child])
+
+        prev = dict()
+        for path in paths:
+            for i in range(1, len(path)):
+                prev[path[i]] = " ".join(path[: i + 1])
+
+        descendant = dict()
+        longest_subpath = dict()
+        while paths:
+            path = paths.pop(0)
+            node = path[0]
+            if node not in descendant:
+                descendant[node] = set()
+                longest_subpath[node] = 0
+            descendant[node] |= set(path[1:])
+            longest_subpath[node] = max(longest_subpath[node], len(path) - 1)
+            if len(path) > 1:
+                paths.append(path[1:])
+
+        sorted_nodes = sorted([(len(prev[node].split()), prev[node], node) for node in
prev])
+        if not sorted_nodes:
+            continue
+
+        n_tree += 1
+        lines = []
+        for _, _, node in sorted_nodes:
+            if node == root:
+                continue
+            if node not in updown:
+                continue
+            n_node += 1
+            lines.append(
+                "%s\t%i\t%i\t%i\t%i"
+                % (
+                    prev[node],  # turns: path from its root to this node
+                    len(descendant[node]),  # vol: num of descendants of this node
+                    len(children.get(node, [])),  # width: num of direct children of this node
+                    longest_subpath[node],  # depth: length of the longest subpath below this node
+                    updown[node],  # updown: `upvotes - downvotes` of this node
+                )
+            )
+        with open(path_out, "a", encoding="utf-8") as f:
+            f.write("\n".join(lines) + "\n")
+
+    if n_tree:
+        s = "[%s %s] %i trees %i nodes" % (sub, year, n_tree, n_node)
+    else:
+        s = "[%s %s] trees are empty" % (sub, year)
+    with open(path_done, "w") as f:
+        f.write(s)
+    print(s)
+
+
+def create_pairs(year, sub, feedback, overwrite=False):
+    fld = "%s/%s" % (fld_subs, sub)
+    path_out = "%s/%i_%s.tsv" % (fld, year, feedback)
+    path_done = path_out + ".done"
+    if not overwrite and os.path.exists(path_done):
+        return
+
+    ix_feedback = ["vol", "width", "depth", "updown"].index(feedback) + 1
+    path_in = "%s/%i_feedback.tsv" % (fld, year)
+    if not os.path.exists(path_in):
+        return
+
+    time = dict()
+    path_time = "%s/%i_time.tsv" % (fld, year)
+    if not os.path.exists(path_time):
+        return
+    for line in open(path_time):
+        ss = line.strip("\n").split("\t")
+        if len(ss) == 2:
+            name, t = ss
+            time[name] = int(t)
+
+    open(path_out, "w", encoding="utf-8")
+    print("[%s %s] creating pairs..."
% (sub, year)) + + def match_time(replies, cxt): + scores = sorted(set([score for score, _ in replies])) + m = len(scores) + if m < 2: + return 0 # can't create pairs if m < 2 + cand = [] + for score, reply in replies: + if reply not in time: + continue + cand.append((time[reply], score, reply)) + cand = sorted(cand) + rank = [scores.index(score) / (m - 1) for _, score, _ in cand] + lines = [] + for i in range(len(cand) - 1): + t_a, score_a, a = cand[i] + t_b, score_b, b = cand[i + 1] + rank_a = rank[i] + rank_b = rank[i + 1] + if score_a == score_b: + continue + hr = (t_b - t_a) / 3600 + if score_b > score_a: + score_a, score_b = score_b, score_a + a, b = b, a + rank_a, rank_b = rank_b, rank_a + lines.append( + "\t".join( + [ + cxt, + a, + b, + "%.2f" % hr, + "%i" % score_a, + "%i" % score_b, + "%.4f" % rank_a, + "%.4f" % rank_b, + ] + ) + ) + # pdb.set_trace() + if lines: + with open(path_out, "a") as f: + f.write("\n".join(lines) + "\n") + return len(lines) + + n_line = 0 + prev = None + replies = [] + cxt = "" + for line in open(path_in): + if line.startswith("#"): + continue + ss = line.strip("\n").split("\t") + turns = ss[0].split() # including both cxt and resp + if len(turns) < 2: + continue + reply = turns[-1] + try: + score = int(ss[ix_feedback]) + except ValueError: + continue + parent = turns[-2] + if parent == prev: + replies.append((score, reply)) + else: + if replies: + n_line += match_time(replies, cxt) + cxt = " ".join(turns[:-1]) + prev = parent + replies = [(score, reply)] + if replies: + n_line += match_time(replies, cxt) + + s = "[%s %s %s] %i pairs" % (sub, year, feedback, n_line) + with open(path_done, "w") as f: + f.write(s) + print(s) + + +def add_seq(sub, year, feedback, overwrite=False): + fname = "%i_%s" % (year, feedback) + fld = "%s/%s" % (fld_subs, sub) + turn_sep = " 50257 " + path_out = fld + "/%s_ids.tsv" % fname + path_done = path_out + ".done" + + if os.path.exists(path_done) and not overwrite: + return + if not 
os.path.exists(fld + "/%s.tsv" % fname): + return + + seq = dict() + path = "%s/%i_txt.tsv" % (fld, year) + if not os.path.exists(path): + return + for line in open(path, encoding="utf-8"): + ss = line.strip("\n").split("\t") + if len(ss) != 3: + continue + name, txt, ids = ss + seq[name] = ids + print("loaded %i seq" % len(seq)) + open(path_out, "w", encoding="utf-8") + print("[%s %s %s] adding seq" % (sub, year, feedback)) + path = fld + "/%s.tsv" % fname + lines = [] + n = 0 + m = 0 + for line in open(path, encoding="utf-8"): + line = line.strip("\n") + if line.startswith("#"): + continue + + n += 1 + ss = line.split("\t") + if len(ss) < 7: + continue + name_cxt, name_pos, name_neg = ss[:3] + + cxt = [] + ok = True + for name in name_cxt.split(): + if name in seq: + cxt.append(seq[name]) + else: + ok = False + break + if not ok: + continue + cxt = turn_sep.join(cxt) + + if name_pos in seq: + reply_pos = seq[name_pos] + else: + continue + if name_neg in seq: + reply_neg = seq[name_neg] + else: + continue + + lines.append( + "\t".join( + [ + cxt, + reply_pos, + reply_neg, + name_cxt, + name_pos, + name_neg, + ] + + ss[3:] + ) + ) + m += 1 + if m % 1e4 == 0: + with open(path_out, "a", encoding="utf-8") as f: + f.write("\n".join(lines) + "\n") + lines = [] + + with open(path_out, "a", encoding="utf-8") as f: + f.write("\n".join(lines)) + + s = "[%s %s %s] pair seq %i/%i" % (sub, year, feedback, m, n) + with open(path_done, "w") as f: + f.write(s) + print(s) + + +def combine_sub(year_from, year_to, feedback, overwrite=False, skip_same_pos=True): + fld = "%s/%s" % (fld_out, feedback) + os.makedirs(fld, exist_ok=True) + path_out = fld + "/raw.tsv" + path_done = path_out + ".done" + if os.path.exists(path_done) and not overwrite: + return path_out + + subs = sorted(os.listdir(fld_subs)) + open(path_out, "w", encoding="utf-8") + lines = [] + n = 0 + empty = True + non_empty_subreddits = 0 + for sub in subs: + empty = True + for year in range(year_from, year_to + 1): + 
path = "%s/%s/%i_%s_ids.tsv" % (fld_subs, sub, year, feedback)
+            if not os.path.exists(path):
+                continue
+            for line in open(path, encoding="utf-8"):
+                if line.startswith("#"):
+                    continue
+                line = line.strip("\n")
+                if not line:
+                    continue
+                lines.append(line)
+                empty = False
+                n += 1
+                if n % 1e5 == 0:
+                    with open(path_out, "a", encoding="utf-8") as f:
+                        f.write("\n".join(lines) + "\n")
+                    lines = []
+                    s = "[%i %s] saved %.2f M lines from %i subreddits, now is %s" % (
+                        year,
+                        feedback,
+                        n / 1e6,
+                        non_empty_subreddits + 1,
+                        sub,
+                    )
+                    print(s)
+        if not empty:
+            non_empty_subreddits += 1
+
+    with open(path_out, "a", encoding="utf-8") as f:
+        f.write("\n".join(lines))
+    s = "[%i-%i %s] saved %.2f M lines from %i subreddits" % (
+        year_from,
+        year_to,
+        feedback,
+        n / 1e6,
+        non_empty_subreddits,
+    )
+    with open(path_done, "w") as f:
+        f.write(s)
+    print(s)
+    return path_out
+
+
+def split_by_root(path, p_test=0.1):
+
+    print("splitting by root " + path)
+    lines = {
+        "train": [],
+        "vali": [],
+    }
+    prev = None
+    n = 0
+
+    for k in lines:
+        # truncate/create the output files up front; the buffers are always
+        # empty at this point, so no emptiness check is needed here
+        open(path + "." + k, "w", encoding="utf-8")
+
+    for line in open(path, encoding="utf-8"):
+        line = line.strip("\n")
+        if not line:
+            continue
+        cxt = line.split("\t")[3]
+        root = cxt.strip().split()[0]
+        if root != prev:
+            if np.random.random() < p_test:
+                k = "vali"
+            else:
+                k = "train"
+        # pdb.set_trace()
+        lines[k].append(line)
+        prev = root
+        n += 1
+        if n % 1e6 == 0:
+            print("read %i M" % (n / 1e6))
+            for k in lines:
+                if len(lines[k]) == 0:
+                    continue
+                with open(path + "." + k, "a", encoding="utf-8") as f:
+                    f.write("\n".join(lines[k]) + "\n")
+                lines[k] = []
+
+    for k in lines:
+        if len(lines[k]) == 0:
+            continue
+        with open(path + "."
+ k, "a", encoding="utf-8") as f: + f.write("\n".join(lines[k])) + lines[k] = [] + + +def shuffle(feedback, part, n_temp=10): + fld = "%s/%s" % (fld_out, feedback) + path = "%s/raw.tsv.%s" % (fld, part) + path_out = "%s/%s.tsv" % (fld, part) + fld_temp = "%s/temp/%s" % (fld_out, feedback) + + print("slicing " + path) + os.makedirs(fld_temp, exist_ok=True) + lines = [[] for _ in range(n_temp)] + + # split into n_temp files + for i in range(n_temp): + open(fld_temp + "/temp%i" % i, "w", encoding="utf-8") + n = 0 + count = [0] * n_temp + rand = np.random.randint(0, n_temp, 202005) + for line in open(path, encoding="utf-8"): + line = line.strip("\n") + if len(line) == 0: + continue + bucket = rand[n % len(rand)] + lines[bucket].append(line) + count[bucket] += 1 + n += 1 + if n % 1e6 == 0: + print("read %i M" % (n / 1e6)) + for i in range(n_temp): + if len(lines[i]) == 0: + continue + with open(fld_temp + "/temp%i" % i, "a", encoding="utf-8") as f: + f.write("\n".join(lines[i]) + "\n") + lines[i] = [] + + for i in range(n_temp): + with open(fld_temp + "/temp%i" % i, "a", encoding="utf-8") as f: + f.write("\n".join(lines[i])) + + # and then merge + open(path_out, "w", encoding="utf-8") + print(fld_temp) + for i in range(n_temp): + print("reading temp%i" % i) + lines = open(fld_temp + "/temp%i" % i, encoding="utf-8").readlines() + print("shuffling") + jj = list(range(len(lines))) + np.random.shuffle(jj) + print("writing") + with open(path_out, "a", encoding="utf-8") as f: + f.write("\n".join([lines[j].strip("\n") for j in jj]) + "\n") + + +def get_subs(): + return ["pikabu"] + print("collectiing subs...") + subs = sorted(os.listdir(fld_subs)) + print("collected %i subs" % len(subs)) + return subs + + +def build_json(year): + for date in get_dates(year): + extract_rc(date) + extract_rs(date) + + +def build_basic(year): + from transformers import AutoTokenizer + + tokenizer = AutoTokenizer.from_pretrained("Grossmend/rudialogpt3_medium_based_on_gpt2") + subs = get_subs() + 
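`shuffle()` above never holds the full `raw.tsv` in memory: lines are scattered into `n_temp` temporary files by a fixed pseudo-random index sequence, then each temp file is shuffled on its own and the results are concatenated. A minimal in-memory sketch of the same two-stage idea (bucket count and seed below are arbitrary):

```python
import numpy as np


# In-memory sketch of the two-stage shuffle used by shuffle() above:
# scatter lines into buckets via a fixed random index sequence, then
# permute each bucket independently and concatenate.
def bucket_shuffle(lines, n_temp=10, seed=0):
    rng = np.random.RandomState(seed)
    rand = rng.randint(0, n_temp, 202005)  # same fixed-sequence trick as shuffle()
    buckets = [[] for _ in range(n_temp)]
    for n, line in enumerate(lines):
        buckets[rand[n % len(rand)]].append(line)
    out = []
    for bucket in buckets:
        order = np.arange(len(bucket))
        rng.shuffle(order)
        out.extend(bucket[j] for j in order)
    return out


data = [str(i) for i in range(1000)]
mixed = bucket_shuffle(data)
assert sorted(mixed, key=int) == data  # a permutation of the input
```

Because each line lands in a random bucket and each bucket is permuted, the concatenation approximates a global shuffle while only one bucket ever needs to fit in memory.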
for sub in subs: + print(f"Sub: {sub}") + extract_time(sub, year) + extract_txt(sub, year, tokenizer) + extract_trees(sub, year) + calc_feedback(sub, year, overwrite=False) + + +def build_pairs(year_from, year_to, feedback): + subs = get_subs() + for year in range(year_from, year_to + 1): + for sub in subs: + create_pairs(year, sub, feedback, overwrite=False) + add_seq(sub, year, feedback, overwrite=False) + path = combine_sub(year_from, year_to, feedback) + split_by_root(path) + for part in ["train", "vali"]: + shuffle(feedback, part) + + +FLD = "data" +fld_bz2 = FLD + "/bz2/" +fld_jsonl = FLD + "/jsonl/" +fld_subs = FLD + "/subs/" +fld_out = FLD + "/out/" + +if __name__ == "__main__": + import argparse + + parser = argparse.ArgumentParser() + parser.add_argument("task", type=str) + parser.add_argument("year", type=int) + parser.add_argument("--year_to", type=int) + args = parser.parse_args() + if args.task == "bz2": + build_json(args.year) + elif args.task == "basic": + build_basic(args.year) + elif args.task in ["updown", "depth", "width"]: + build_pairs(args.year, args.year_to, args.task) + else: + raise ValueError diff --git a/services/dialogrpt_ru/feeder.py b/services/dialogrpt_ru/feeder.py new file mode 100644 index 0000000000..7558ef24c6 --- /dev/null +++ b/services/dialogrpt_ru/feeder.py @@ -0,0 +1,144 @@ +# author: Xiang Gao at Microsoft Research AI NLP Group + +import os + +import numpy as np +import torch +from transformers import AutoTokenizer + +TOKENIZER_NAME_OR_PATH = os.getenv("TOKENIZER_NAME_OR_PATH", "Grossmend/rudialogpt3_medium_based_on_gpt2") + + +class Feeder: + # load train/vali/test data + + def __init__(self, opt): + self.opt = opt + self.files = dict() + if self.opt.mismatch: + self.files_mismatch = dict() + for sub in ["train", "vali", "test"]: + self.reset(sub) + self.ix_EOS = 50257 + self.tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_NAME_OR_PATH) + self.ix_PAD = self.tokenizer.pad_token_id + self.ix_OMT = 655 + + def reset(self, 
sub): + print("resetting " + sub) + path = "%s/%s.tsv" % (self.opt.fld_data, sub) + if os.path.exists(path): + self.files[sub] = open(path) + if self.opt.mismatch: + self.files_mismatch[sub] = open(path) + # assuming f is already shuffled, this step makes f and f_mismatch mismatch + for _ in range(100): + self.files[sub].readline() + + def get_batch(self, size, sub="train", min_score_gap=1, min_rank_gap=0): + ids_pos = [] + len_pos = [] + ids_neg = [] + len_neg = [] + len_cxt = [] + score_pos = [] + score_neg = [] + rank_pos = [] + rank_neg = [] + hr_gap = [] + if sub != "train": + np.random.seed(2020) + + def ints(s): + return [int(x) for x in s.split()] + + def pad(seq): + return seq + [self.ix_PAD] * (self.opt.max_seq_len - len(seq)) + + def read(): + total = 0 + used = 0 + for line in self.files[sub]: + if line.startswith("#"): + continue + # old data is title + ' . ' + selftext, ' .' is 764 and often used as ' .jpg' thus misleading + line = line.replace(" 18\t", "\t").replace(" 18 50257 ", " 50257 ") + total += 1 + ss = line.strip("\n").split("\t") + cxt = ints(ss[0]) + reply_pos = ints(ss[1]) + # _score_pos, _score_neg, _rank_pos, _rank_neg = ss[-4:] + try: + _hr_gap = float(ss[-5]) + except ValueError: + _hr_gap = np.nan + _score_pos = int(ss[-4]) + _rank_pos = float(ss[-2]) + + if self.opt.mismatch: + _score_neg = np.nan + _rank_neg = np.nan + line_mismatch = self.files_mismatch[sub].readline() + ss_mismatch = line_mismatch.strip("\n").split("\t") + reply_neg = ints(ss_mismatch[1]) + + else: + reply_neg = ints(ss[2]) + _score_neg = int(ss[-3]) + _rank_neg = float(ss[-1]) + if _score_pos - _score_neg < min_score_gap: + continue + if _rank_pos - _rank_neg < min_rank_gap: + continue + if self.opt.max_hr_gap > 0 and _hr_gap > self.opt.max_hr_gap: + continue + + pos = cxt + [self.ix_EOS] + reply_pos + score_pos.append(_score_pos) + rank_pos.append(_rank_pos) + + neg = cxt + [self.ix_EOS] + reply_neg + score_neg.append(_score_neg) + rank_neg.append(_rank_neg) + + 
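When a concatenated context+reply pair exceeds `max_seq_len`, `Feeder.get_batch` drops tokens from the *left* of both candidate sequences (and of the context), so the replies stay intact and both candidates keep the identical shortened context. A standalone sketch of that truncation step:

```python
# Sketch of the left-truncation in Feeder.get_batch: when the longer of the
# two candidate sequences exceeds max_seq_len, tokens are dropped from the
# left of pos, neg, and cxt alike, keeping the replies intact and both
# candidates aligned on the same shortened context.
def left_truncate(pos, neg, cxt, max_seq_len):
    n_del = max(len(pos), len(neg)) - max_seq_len
    if n_del > 0:
        pos, neg, cxt = pos[n_del:], neg[n_del:], cxt[n_del:]
    return pos, neg, cxt


pos, neg, cxt = left_truncate(list(range(10)), list(range(8)), list(range(6)), 7)
assert pos == [3, 4, 5, 6, 7, 8, 9]
assert neg == [3, 4, 5, 6, 7]
assert cxt == [3, 4, 5]
```

Truncating from the left rather than the right is deliberate: the reply is the part being scored, so it is the context that gets sacrificed.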
# make sure cxt still same even after cut + n_del = max(len(pos), len(neg)) - self.opt.max_seq_len + if n_del > 0: + pos = pos[n_del:] + neg = neg[n_del:] + cxt = cxt[n_del:] + + len_cxt.append(len(cxt)) + len_pos.append(len(pos)) + len_neg.append(len(neg)) + ids_pos.append(pad(pos)) + ids_neg.append(pad(neg)) + hr_gap.append(_hr_gap) + + used += 1 + if len(ids_pos) == size: + break + + while True: + read() + if len(ids_pos) == size: + break + self.reset(sub) + + ids_pos = torch.LongTensor(ids_pos) + ids_neg = torch.LongTensor(ids_neg) + if self.opt.cuda: + ids_pos = ids_pos.cuda() + ids_neg = ids_neg.cuda() + return { + "ids_pos": ids_pos, + "ids_neg": ids_neg, + "len_pos": len_pos, + "len_neg": len_neg, + "len_cxt": len_cxt, + "score_pos": score_pos, + "score_neg": score_neg, + "rank_pos": rank_pos, + "rank_neg": rank_neg, + "hr_gap": hr_gap, + } diff --git a/services/dialogrpt_ru/master.py b/services/dialogrpt_ru/master.py new file mode 100644 index 0000000000..2a2545357e --- /dev/null +++ b/services/dialogrpt_ru/master.py @@ -0,0 +1,216 @@ +# author: Xiang Gao at Microsoft Research AI NLP Group + +import os +import sys +import time +import warnings + +import numpy as np +import torch +from feeder import Feeder +from utils import Scorer + + +class Master: + def __init__(self, opt): + self.opt = opt + self._model = Scorer(opt) + if opt.path_load is not None: + self._model.load(opt.path_load) + self.parallel() + + if opt.task != "play": + if opt.fld_data is not None: + self.feeder = Feeder(opt) + + if opt.task == "train": + opt.save() + os.makedirs(opt.fld_out + "/ckpt", exist_ok=True) + self.path_log = self.opt.fld_out + "/log.txt" + else: + self.path_log = self.opt.fld_out + "/log_infer.txt" + + def print(self, s=""): + try: + print(s) + except UnicodeEncodeError: + print("[UnicodeEncodeError]") + pass + with open(self.path_log, "a", encoding="utf-8") as f: + f.write(s + "\n") + + def parallel(self): + if self.opt.cuda: + self._model = self._model.cuda() + n_gpu 
= torch.cuda.device_count() + if self.opt.cuda and n_gpu > 1: + print("paralleling on %i GPU" % n_gpu) + self.model = torch.nn.DataParallel(self._model) + # after DataParallel, a warning about RNN weights shows up every batch + warnings.filterwarnings("ignore") + # after DataParallel, attr of self.model become attr of self.model.module + self._model = self.model.module + self.model.core = self.model.module.core + self.model.tokenizer = self._model.tokenizer + else: + self.model = self._model + if self.opt.task == "train": + self.optimizer = torch.optim.Adam(self._model.parameters(), lr=self.opt.lr) + + def train(self): + vali_loss, best_acc = self.vali() + step = 0 + n_trained = 0 + t0 = time.time() + + list_trained = [0] + list_train_loss = [np.nan] + list_train_acc = [np.nan] + list_vali_loss = [vali_loss] + list_vali_acc = [best_acc] + acc_history = [] + + while step < self.opt.step_max: + self.model.train() + self.optimizer.zero_grad() + batch = self.feeder.get_batch(self.opt.batch) + pred = self.model.forward(batch) + loss = self.loss(pred) + loss = loss.mean() # in case of parallel-training + + loss.backward() + torch.nn.utils.clip_grad_norm_(self.model.parameters(), self.opt.clip) + self.optimizer.step() + + acc = (pred > 0.5).float().mean().item() + acc_history.append(acc) + if len(acc_history) > self.opt.len_acc: + acc_history.pop(0) + avg_train_acc = np.mean(acc_history) + step += 1 + n_trained += self.opt.batch + info = "step %i trained %.3f best %.2f" % (step, n_trained / 1e6, best_acc) + + if step % self.opt.step_print == 0: + speed = (n_trained / 1e6) / ((time.time() - t0) / 3600) + + self.print( + "%s speed %.2f hr_gap %.2f score_gap %.2f rank_gap %.2f loss %.4f acc %.3f" + % ( + info, + speed, + np.median(batch["hr_gap"]), + (np.array(batch["score_pos"]) - np.array(batch["score_neg"])).mean(), + (np.array(batch["rank_pos"]) - np.array(batch["rank_neg"])).mean(), + loss, + avg_train_acc, + ) + ) + + if step % self.opt.step_vali == 0: + vali_loss, 
vali_acc = self.vali(info) + if vali_acc > best_acc: + self.save(self.opt.fld_out + "/ckpt/best.pth") + best_acc = vali_acc + sys.stdout.flush() + + list_trained.append(n_trained / 1e6) + list_train_loss.append(loss.item()) + list_train_acc.append(avg_train_acc) + list_vali_loss.append(vali_loss) + list_vali_acc.append(vali_acc) + # _, axs = plt.subplots(3, 1, sharex=True) + + # axs[0].plot(list_trained, list_train_loss, "b", label="train") + # axs[0].plot(list_trained, list_vali_loss, "r", label="vali") + # axs[0].legend(loc="best") + # axs[0].set_ylabel("loss") + # + # axs[1].plot(list_trained, list_train_acc, "b", label="train") + # axs[1].plot(list_trained, list_vali_acc, "r", label="vali") + # axs[1].plot([best_trained / 1e6, n_trained / 1e6], [best_acc, best_acc], "k:") + # axs[1].set_ylabel("acc") + # + # axs[-1].set_xlabel("trained (M)") + # axs[0].set_title(self.opt.fld_out + "\n" + self.opt.fld_data + "\nbest_acc = %.4f" % best_acc) + # plt.tight_layout() + # plt.savefig(self.opt.fld_out + "/log.png") + # plt.close() + + if step % self.opt.step_save == 0: + self.save(self.opt.fld_out + "/ckpt/last.pth") + + def loss(self, pred): + # pred is the probability to pick the positive response, given a context and a negative response + return -torch.log(pred).mean() + + def vali(self, info=""): + n_print = min(self.opt.batch, self.opt.vali_print) + self.model.eval() + loss = 0 + acc = 0 + hr_gap = 0 + score_gap = 0 + rank_gap = 0 + n_batch = int(self.opt.vali_size / self.opt.batch) + self.feeder.reset("vali") + + for _ in range(n_batch): + batch = self.feeder.get_batch( + self.opt.batch, sub="vali", min_score_gap=self.opt.min_score_gap, min_rank_gap=self.opt.min_rank_gap + ) + with torch.no_grad(): + pred = self.model.forward(batch) + loss += self.loss(pred) + acc += (pred > 0.5).float().mean() + score_gap += (np.array(batch["score_pos"]) - np.array(batch["score_neg"])).mean() + rank_gap += (np.array(batch["rank_pos"]) - np.array(batch["rank_neg"])).mean() + 
hr_gap += np.median(batch["hr_gap"]) + + loss /= n_batch + acc /= n_batch + score_gap /= n_batch + rank_gap /= n_batch + hr_gap /= n_batch + s = "%s hr_gap %.2f score_gap %.2f rank_gap %.2f loss %.4f acc %.3f" % ( + info, + hr_gap, + score_gap, + rank_gap, + loss, + acc, + ) + s = "[vali] " + s.strip() + if not n_print: + self.print(s) + return loss.mean().item(), acc + + with torch.no_grad(): + pred_pos = self.model.core(batch["ids_pos"], batch["len_pos"]) + pred_neg = self.model.core(batch["ids_neg"], batch["len_neg"]) + + def to_np(ids): + if self.opt.cuda: + ids = ids.cpu() + return ids.detach().numpy() + + ids_pos = to_np(batch["ids_pos"]) + ids_neg = to_np(batch["ids_neg"]) + + for j in range(n_print): + l_cxt = batch["len_cxt"][j] + cxt = self.model.tokenizer.decode(ids_pos[j, :l_cxt]) + pos = self.model.tokenizer.decode(ids_pos[j, l_cxt:]).strip("<|ndoftext|>") + neg = self.model.tokenizer.decode(ids_neg[j, l_cxt:]).strip("<|ndoftext|>") + self.print(cxt) + self.print("hr_gap %s" % batch["hr_gap"][j]) + self.print("%s\t%.2f\t%.3f\t%s" % (batch["score_pos"][j], batch["rank_pos"][j], pred_pos[j], pos)) + self.print("%s\t%.2f\t%.3f\t%s" % (batch["score_neg"][j], batch["rank_neg"][j], pred_neg[j], neg)) + self.print() + + self.print(s) + return loss.mean().item(), acc + + def save(self, path): + torch.save(self._model.state_dict(), path) + self.print("saved to " + path) diff --git a/services/dialogrpt_ru/requirements.txt b/services/dialogrpt_ru/requirements.txt new file mode 100644 index 0000000000..071abd5ca0 --- /dev/null +++ b/services/dialogrpt_ru/requirements.txt @@ -0,0 +1,9 @@ +transformers==4.0.1 +sentencepiece==0.1.94 +flask==1.1.1 +gunicorn==19.9.0 +requests==2.22.0 +sentry-sdk[flask]==0.14.1 +itsdangerous==2.0.1 +jinja2<=3.0.3 +Werkzeug<=2.0.3 diff --git a/services/dialogrpt_ru/server.py b/services/dialogrpt_ru/server.py new file mode 100644 index 0000000000..d4775c385c --- /dev/null +++ b/services/dialogrpt_ru/server.py @@ -0,0 +1,84 @@ +import 
logging +import time +import os + +import sentry_sdk +import torch +from flask import Flask, request, jsonify +from sentry_sdk.integrations.flask import FlaskIntegration + +from utils import Option, Scorer + + +sentry_sdk.init(dsn=os.getenv("SENTRY_DSN"), integrations=[FlaskIntegration()]) + +logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) +logger = logging.getLogger(__name__) + +PRETRAINED_MODEL_FNAME = os.environ.get("PRETRAINED_MODEL_FNAME", "dialogrpt_ru_ckpt_v0.pth") +logger.info(f"PRETRAINED_MODEL_FNAME = {PRETRAINED_MODEL_FNAME}") + +cuda = torch.cuda.is_available() +if cuda: + torch.cuda.set_device(0) + device = "cuda" +else: + device = "cpu" + +logger.info(f"dialogrpt is set to run on {device}") + +params = { + "path_load": f"/data/{PRETRAINED_MODEL_FNAME}", + "data": "./bla", + "batch": 256, + "vali_size": 1024, + "vali_print": 10, + "lr": 3e-5, + "cpu": False, + "max_seq_len": 50, + "mismatch": False, + "min_score_gap": 10, + "min_rank_gap": 10, + "max_hr_gap": 1, + "task": "vali", +} + +try: + opt = Option(params) + model = Scorer(opt) + model.load(f"/data/{PRETRAINED_MODEL_FNAME}") + model.predict(cxt="привет!", hyps=["привет. 
как дела?"]) + + logger.info("dialogrpt model is ready") +except Exception as e: + sentry_sdk.capture_exception(e) + logger.exception(e) + raise e + +app = Flask(__name__) +logging.getLogger("werkzeug").setLevel("WARNING") + + +@app.route("/respond", methods=["POST"]) +def respond(): + st_time = time.time() + + dialog_contexts = request.json.get("dialog_contexts", []) + hypotheses = request.json.get("hypotheses", []) + + try: + _cxts, _hyps = [], [] + for cxt, hyp in zip(dialog_contexts, hypotheses): + _cxts += [cxt] + _hyps += [hyp] + result_values = model.predict_on_batch(cxts=_cxts, hyps=_hyps).tolist() + # result_values is a list of float values + except Exception as exc: + logger.exception(exc) + sentry_sdk.capture_exception(exc) + result_values = [0.0 for _ in hypotheses] + + total_time = time.time() - st_time + logger.info(f"dialogrpt exec time: {total_time:.3f}s") + + return jsonify([{"batch": result_values}]) diff --git a/services/dialogrpt_ru/test.py b/services/dialogrpt_ru/test.py new file mode 100644 index 0000000000..bd89e1d78c --- /dev/null +++ b/services/dialogrpt_ru/test.py @@ -0,0 +1,25 @@ +import requests + + +def test_respond(): + url = "http://0.0.0.0:8122/respond" + + contexts = ["Привет! Как дела?", "Привет! Как дела?", "Какой твой любимый фильм?", "Какой твой любимый фильм?"] + hypotheses = [ + "хорошо. 
а у тебя как дела?", + "какой твой любимый фильм?", + "пересматриваю Гордость и предубеждение иногда.", + "я люблю играть в компьюетрные игры.", + ] + gold = [0.334246, 0.33038276, 0.40354252, 0.3839873] + + request_data = {"dialog_contexts": contexts, "hypotheses": hypotheses} + result = requests.post(url, json=request_data).json()[0]["batch"] + print(result) + for i, score in enumerate(result): + assert round(score, 4) == round(gold[i], 4), f"Expected:{gold[i]}\tGot\n{score}" + print("Success!") + + +if __name__ == "__main__": + test_respond() diff --git a/services/dialogrpt_ru/test.sh b/services/dialogrpt_ru/test.sh new file mode 100755 index 0000000000..cf55721bd3 --- /dev/null +++ b/services/dialogrpt_ru/test.sh @@ -0,0 +1,4 @@ +#!/bin/bash + + +python test.py diff --git a/services/dialogrpt_ru/utils.py b/services/dialogrpt_ru/utils.py new file mode 100644 index 0000000000..84ce6b8178 --- /dev/null +++ b/services/dialogrpt_ru/utils.py @@ -0,0 +1,204 @@ +import os +import time + +import torch +from transformers import AutoModelForCausalLM, AutoTokenizer + + +EOS_token = "<|endoftext|>" +TOKENIZER_NAME_OR_PATH = os.getenv("TOKENIZER_NAME_OR_PATH", "Grossmend/rudialogpt3_medium_based_on_gpt2") + + +class Option: + def __init__(self, args): + if isinstance(args, dict): + if args["cpu"] or not torch.cuda.is_available(): + self.cuda = False + else: + self.cuda = True + self.task = args["task"] + self.path_load = args["path_load"] + self.batch = args["batch"] + self.vali_size = max(self.batch, args["vali_size"]) + self.vali_print = args["vali_print"] + self.lr = args["lr"] + self.max_seq_len = args["max_seq_len"] + self.min_score_gap = args["min_score_gap"] + self.min_rank_gap = args["min_rank_gap"] + self.max_hr_gap = args["max_hr_gap"] + self.mismatch = args["mismatch"] + self.fld_data = args["data"] + if args["task"] == "train" or self.path_load is None: + self.fld_out = "out/%i" % time.time() + else: + self.fld_out = "out/temp" + else: + if args.cpu or not 
torch.cuda.is_available(): + self.cuda = False + else: + self.cuda = True + self.task = args.task + self.path_load = args.path_load + self.batch = args.batch + self.vali_size = max(self.batch, args.vali_size) + self.vali_print = args.vali_print + self.lr = args.lr + self.max_seq_len = args.max_seq_len + self.min_score_gap = args.min_score_gap + self.min_rank_gap = args.min_rank_gap + self.max_hr_gap = args.max_hr_gap + self.mismatch = args.mismatch + self.fld_data = args.data + if args.task == "train" or self.path_load is None: + self.fld_out = "out/%i" % time.time() + else: + self.fld_out = "out/temp" + + os.makedirs(self.fld_out, exist_ok=True) + + self.clip = 1 + self.step_max = 1e6 + self.step_print = 10 + self.step_vali = 100 + self.step_save = 500 + self.len_acc = self.step_vali + + def save(self): + d = self.__dict__ + lines = [] + for k in d: + lines.append("%s\t%s" % (k, d[k])) + with open(self.fld_out + "/opt.tsv", "w") as f: + f.write("\n".join(lines)) + + +class ScorerBase(torch.nn.Module): + def __init__(self, opt): + super().__init__() + self.ix_EOS = 50257 + self.ix_OMT = 655 + self.opt = opt + self.tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_NAME_OR_PATH) + + def core(self, ids, l_ids, return_logits=False): + # to be implemented in child class + return 0 + + def predict(self, cxt, hyps, max_cxt_turn=None): + # cxt = str + # hyps = list of str + + self.eval() + cxt_turns = cxt.split(EOS_token) + if max_cxt_turn is not None: + cxt_turns = cxt_turns[-min(max_cxt_turn, len(cxt_turns)) :] + ids_cxt = [] + for turn in cxt_turns: + ids_cxt += self.tokenizer.encode(turn.strip()) + [self.ix_EOS] + seqs = [] + lens = [] + for hyp in hyps: + seq = ids_cxt + self.tokenizer.encode(hyp.strip()) + lens.append(len(seq)) + seqs.append(seq) + max_len = max(lens) + ids = [] + for seq in seqs: + ids.append(seq + [self.ix_EOS] * (max_len - len(seq))) + with torch.no_grad(): + ids = torch.LongTensor(ids) + if self.opt.cuda: + ids = ids.cuda() + scores = 
self.core(ids, lens) + if not isinstance(scores, dict): + if self.opt.cuda: + scores = scores.cpu() + return scores.detach().numpy() + + for k in scores: + if self.opt.cuda: + scores[k] = scores[k].cpu() + scores[k] = scores[k].detach().numpy() + return scores + + def predict_on_batch(self, cxts, hyps, max_cxt_turn=None): + # cxt = list of str + # hyps = list of str + + self.eval() + + seqs = [] + lens = [] + for cxt, hyp in zip(cxts, hyps): + cxt_turns = cxt.split(EOS_token) + if max_cxt_turn is not None: + cxt_turns = cxt_turns[-min(max_cxt_turn, len(cxt_turns)) :] + ids_cxt = [] + for turn in cxt_turns: + ids_cxt += self.tokenizer.encode(turn.strip()) + [self.ix_EOS] + + seq = ids_cxt + self.tokenizer.encode(hyp.strip()) + lens.append(len(seq)) + seqs.append(seq) + max_len = max(lens) + + ids = [] + for seq in seqs: + ids.append(seq + [self.ix_EOS] * (max_len - len(seq))) + + with torch.no_grad(): + ids = torch.LongTensor(ids) + if self.opt.cuda: + ids = ids.cuda() + scores = self.core(ids, lens) + if not isinstance(scores, dict): + if self.opt.cuda: + scores = scores.cpu() + return scores.detach().numpy() + + for k in scores: + if self.opt.cuda: + scores[k] = scores[k].cpu() + scores[k] = scores[k].detach().numpy() + return scores + + def forward(self, batch): + logits_pos = self.core(batch["ids_pos"], batch["len_pos"], return_logits=True) + logits_neg = self.core(batch["ids_neg"], batch["len_neg"], return_logits=True) + # softmax to get the `probability` to rank pos/neg correctly + return torch.exp(logits_pos) / (torch.exp(logits_pos) + torch.exp(logits_neg)) + + +class Scorer(ScorerBase): + def __init__(self, opt): + super().__init__(opt) + n_embd = 1024 + self.transformer = AutoModelForCausalLM.from_pretrained("Grossmend/rudialogpt3_medium_based_on_gpt2") + self.transformer.resize_token_embeddings(len(self.tokenizer)) + + self.score = torch.nn.Linear(n_embd, 1, bias=False) + + def core(self, ids, l_ids, return_logits=False): + n = ids.shape[0] + 
attention_mask = torch.ones_like(ids) + for i in range(n): + attention_mask[i, l_ids[i] :] *= 0 + transformer_output = self.transformer(ids, attention_mask=attention_mask, output_hidden_states=True) + logits = self.score(transformer_output.hidden_states[0]).squeeze(-1) + logits = torch.stack([logits[i, l_ids[i] - 1] for i in range(n)]) + if return_logits: + return logits + else: + return torch.sigmoid(logits) + + def load(self, path): + + print("loading from " + path) + weights = torch.load(path, map_location=torch.device("cpu")) + if path.endswith(".pkl"): + # Russian DialoGPT checkpoint + pass + else: + self.load_state_dict(weights) + if self.opt.cuda: + self.cuda() diff --git a/skill_selectors/rule_based_selector/connector.py b/skill_selectors/rule_based_selector/connector.py index e1a560936a..3fd0605eaa 100644 --- a/skill_selectors/rule_based_selector/connector.py +++ b/skill_selectors/rule_based_selector/connector.py @@ -88,6 +88,7 @@ async def send(self, payload: Dict, callback: Callable): elif user_uttr_text == "/get_dialog_id": skills_for_uttr.append("dummy_skill") elif high_priority_intent_detected: + skills_for_uttr.append("dummy_skill") # process intent with corresponding IntentResponder skills_for_uttr.append("dff_intent_responder_skill") elif is_sensitive_topic_and_request(user_uttr): @@ -99,6 +100,8 @@ async def send(self, payload: Dict, callback: Callable): skills_for_uttr.append("personal_info_skill") skills_for_uttr.append("factoid_qa") skills_for_uttr.append("dff_grounding_skill") + # we have only a Russian version of dff_generative_skill + skills_for_uttr.append("dff_generative_skill") skills_for_uttr.append("dummy_skill") skills_for_uttr.append("small_talk_skill") @@ -142,6 +145,8 @@ async def send(self, payload: Dict, callback: Callable): skills_for_uttr.append("convert_reddit") skills_for_uttr.append("comet_dialog_skill") skills_for_uttr.append("dff_program_y_wide_skill") + # we have only a Russian version of dff_generative_skill + 
skills_for_uttr.append("dff_generative_skill") # adding friendship only in the beginning of the dialog if len(dialog["utterances"]) < 20: diff --git a/skills/dff_friendship_skill/Dockerfile b/skills/dff_friendship_skill/Dockerfile index ba42f09050..835265b979 100644 --- a/skills/dff_friendship_skill/Dockerfile +++ b/skills/dff_friendship_skill/Dockerfile @@ -11,6 +11,8 @@ RUN pip install -r requirements.txt ARG SERVICE_NAME ENV SERVICE_NAME ${SERVICE_NAME} +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} COPY skills/${SERVICE_NAME}/requirements.txt . RUN pip install -r requirements.txt diff --git a/skills/dff_friendship_skill/scenario/condition.py b/skills/dff_friendship_skill/scenario/condition.py index bfb9fcbdeb..de7c58927c 100644 --- a/skills/dff_friendship_skill/scenario/condition.py +++ b/skills/dff_friendship_skill/scenario/condition.py @@ -1,4 +1,6 @@ +import logging import re +from os import getenv from df_engine.core import Actor, Context @@ -9,7 +11,11 @@ from common.emotion import is_positive_regexp_based, is_negative_regexp_based -GREETING_STEPS = list(common_greeting.GREETING_QUESTIONS) +logger = logging.getLogger(__name__) + +LANGUAGE = getenv("LANGUAGE", "EN") + +GREETING_STEPS = list(common_greeting.GREETING_QUESTIONS[LANGUAGE]) link_to_skill2key_words = { skill_name: common_link.link_to_skill2key_words[skill_name] for skill_name in common_link.link_to_skill2key_words @@ -24,6 +30,7 @@ def offered_topic_choice_declined_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("offered_topic_choice_declined_condition") prev_bot_uttr = int_ctx.get_last_bot_utterance(ctx, actor)["text"] # asked what to talk about @@ -33,7 +40,10 @@ def offered_topic_choice_declined_condition(ctx: Context, actor: Actor, *args, * GREETING_STEPS[greeting_step_id - 1] == "what_to_talk_about" if greeting_step_id > 0 else False ) user_asked_for_topic = any( - [resp.lower() in prev_bot_uttr.lower() for resp in 
common_greeting.GREETING_QUESTIONS["what_to_talk_about"]] + [ + resp.lower() in prev_bot_uttr.lower() + for resp in common_greeting.GREETING_QUESTIONS[LANGUAGE]["what_to_talk_about"] + ] ) was_active = "dff_friendship_skill" == int_ctx.get_last_bot_utterance(ctx, actor).get("active_skill", "") @@ -48,9 +58,13 @@ def offered_topic_choice_declined_condition(ctx: Context, actor: Actor, *args, * def asked_for_events_and_got_yes_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("asked_for_events_and_got_yes_condition") prev_bot_uttr = int_ctx.get_last_bot_utterance(ctx, actor).get("text", "") was_event_question = any( - [resp.lower() in prev_bot_uttr.lower() for resp in common_greeting.GREETING_QUESTIONS["recent_personal_events"]] + [ + resp.lower() in prev_bot_uttr.lower() + for resp in common_greeting.GREETING_QUESTIONS[LANGUAGE]["recent_personal_events"] + ] ) agreed = int_cnd.is_yes_vars(ctx, actor) @@ -61,6 +75,7 @@ def asked_for_events_and_got_yes_condition(ctx: Context, actor: Actor, *args, ** def false_positive_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("false_positive_condition") flag = ( bool(re.search(common_greeting.FALSE_POSITIVE_TURN_ON_RE, int_ctx.get_last_human_utterance(ctx, actor)["text"])) and int_ctx.get_human_utter_index(ctx, actor) == 0 @@ -69,6 +84,7 @@ def false_positive_condition(ctx: Context, actor: Actor, *args, **kwargs) -> boo def hello_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("hello_condition") flag = True flag = flag and len(int_ctx.get_human_utterances(ctx, actor)) == 1 flag = flag @@ -76,17 +92,20 @@ def hello_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: def how_are_you_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("how_are_you_condition") prev_frindship_skill = int_ctx.get_last_bot_utterance(ctx, actor).get("active_skill", "") == "dff_friendship_skill" - how_are_you_found = 
common_greeting.HOW_ARE_YOU_TEMPLATE.search( + how_are_you_found = common_greeting.HOW_ARE_YOU_TEMPLATE[LANGUAGE].search( int_ctx.get_last_human_utterance(ctx, actor)["text"] ) - how_are_you_precise_found = common_greeting.HOW_ARE_YOU_PRECISE_TEMPLATE.search( + how_are_you_precise_found = common_greeting.HOW_ARE_YOU_PRECISE_TEMPLATE[LANGUAGE].search( int_ctx.get_last_human_utterance(ctx, actor)["text"] ) - how_are_you_by_bot_found = common_greeting.HOW_ARE_YOU_TEMPLATE.search( + how_are_you_by_bot_found = common_greeting.HOW_ARE_YOU_TEMPLATE[LANGUAGE].search( int_ctx.get_last_bot_utterance(ctx, actor)["text"] ) - any_you_in_user = common_greeting.ANY_YOU_TEMPLATE.search(int_ctx.get_last_human_utterance(ctx, actor)["text"]) + any_you_in_user = common_greeting.ANY_YOU_TEMPLATE[LANGUAGE].search( + int_ctx.get_last_human_utterance(ctx, actor)["text"] + ) if how_are_you_precise_found: return True @@ -96,13 +115,17 @@ def how_are_you_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: def positive_or_negative_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("positive_or_negative_condition") # SYS_USR_ANSWERS_HOW_IS_HE_DOING usr_sentiment = int_ctx.get_human_sentiment(ctx, actor) pos_temp = is_positive_regexp_based(int_ctx.get_last_human_utterance(ctx, actor)) neg_temp = is_negative_regexp_based(int_ctx.get_last_human_utterance(ctx, actor)) bot_asked_how_are_you = any( - [resp in int_ctx.get_last_bot_utterance(ctx, actor)["text"] for resp in common_greeting.HOW_ARE_YOU_RESPONSES] + [ + resp in int_ctx.get_last_bot_utterance(ctx, actor)["text"] + for resp in common_greeting.HOW_ARE_YOU_RESPONSES[LANGUAGE] + ] ) if bot_asked_how_are_you and (usr_sentiment in ["positive", "negative"] or pos_temp or neg_temp): return True @@ -110,41 +133,51 @@ def positive_or_negative_condition(ctx: Context, actor: Actor, *args, **kwargs) def no_requests_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + 
logger.debug("no_requests_condition") return int_cnd.no_requests(ctx, actor) def no_special_switch_off_requests_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("no_special_switch_off_requests_condition") return int_cnd.no_special_switch_off_requests(ctx, actor) def was_what_do_you_do_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("was_what_do_you_do_condition") bot_uttr_text = int_ctx.get_last_bot_utterance(ctx, actor).get("text", "") if int_cnd.no_requests(ctx, actor) and any( - [phrase in bot_uttr_text for phrase in common_greeting.GREETING_QUESTIONS["what_do_you_do_on_weekdays"]] + [ + phrase in bot_uttr_text + for phrase in common_greeting.GREETING_QUESTIONS[LANGUAGE]["what_do_you_do_on_weekdays"] + ] ): return True return False def is_yes_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("is_yes_condition") if int_cnd.is_yes_vars(ctx, actor): return True return False def is_no_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("is_no_condition") if int_cnd.is_no_vars(ctx, actor): return True return False def not_is_no_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("not_is_no_condition") if not int_cnd.is_no_vars(ctx, actor): return True return False def std_greeting_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("std_greeting_condition") flag = True # flag = flag and not condition_utils.is_new_human_entity(vars) # flag = flag and not condition_utils.is_switch_topic(vars) @@ -160,6 +193,7 @@ def std_greeting_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: def new_entities_is_needed_for_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("new_entities_is_needed_for_condition") flag = True # what is the state in here? 
# flag = flag and int_cnd.is_first_time_of_state(ctx, actor, State.SYS_NEW_ENTITIES_IS_NEEDED_FOR) @@ -171,6 +205,7 @@ def new_entities_is_needed_for_condition(ctx: Context, actor: Actor, *args, **kw def closed_answer_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("closed_answer_condition") flag = True flag = flag and not int_cnd.is_switch_topic(ctx, actor) flag = flag and not int_cnd.is_lets_chat_about_topic_human_initiative(ctx, actor) @@ -179,6 +214,7 @@ def closed_answer_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool def link_to_by_enity_condition(ctx: Context, actor: Actor, *args, **kwargs) -> bool: + logger.debug("link_to_by_enity_condition") flag = True flag = flag and not int_cnd.is_switch_topic(ctx, actor) flag = flag and not int_cnd.is_lets_chat_about_topic_human_initiative(ctx, actor) diff --git a/skills/dff_friendship_skill/scenario/main.py b/skills/dff_friendship_skill/scenario/main.py index d7f1b1ca88..ccae8fc005 100644 --- a/skills/dff_friendship_skill/scenario/main.py +++ b/skills/dff_friendship_skill/scenario/main.py @@ -10,8 +10,6 @@ from df_engine.core import Actor -logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) - logger = logging.getLogger(__name__) ZERO_CONFIDENCE = 0.0 diff --git a/skills/dff_friendship_skill/scenario/response.py b/skills/dff_friendship_skill/scenario/response.py index ee6357174e..d199cc0a1e 100644 --- a/skills/dff_friendship_skill/scenario/response.py +++ b/skills/dff_friendship_skill/scenario/response.py @@ -16,9 +16,10 @@ sentry_sdk.init(getenv("SENTRY_DSN")) -logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) logger = logging.getLogger(__name__) +LANGUAGE = getenv("LANGUAGE", "EN") + REPLY_TYPE = Tuple[str, float, dict, dict, dict] DIALOG_BEGINNING_START_CONFIDENCE = 0.98 DIALOG_BEGINNING_CONTINUE_CONFIDENCE = 0.9 @@ -27,7 +28,7 @@ SUPER_CONFIDENCE = 1.0 HIGH_CONFIDENCE 
= 0.98 MIDDLE_CONFIDENCE = 0.95 -GREETING_STEPS = list(common_greeting.GREETING_QUESTIONS) +GREETING_STEPS = list(common_greeting.GREETING_QUESTIONS[LANGUAGE]) link_to_skill2key_words = { skill_name: common_link.link_to_skill2key_words[skill_name] @@ -43,6 +44,7 @@ def compose_topic_offering(ctx: Context, actor: Actor, excluded_skills=None) -> str: + logger.debug("compose_topic_offering") excluded_skills = [] if excluded_skills is None else excluded_skills available_skill_names = [ @@ -63,6 +65,7 @@ def compose_topic_offering(ctx: Context, actor: Actor, excluded_skills=None) -> if skill_name in link_to_skill2i_like_to_talk: response = random.choice(link_to_skill2i_like_to_talk[skill_name]) else: + response = f"Would you like to talk about {skill_name}?" int_ctx.save_to_shared_memory(ctx, actor, offered_topics=link_to_skill2key_words.get(skill_name, skill_name)) @@ -70,6 +73,7 @@ def compose_topic_offering(ctx: Context, actor: Actor, excluded_skills=None) -> def offer_topic(ctx: Context, actor: Actor, excluded_skills=None, *args, **kwargs) -> str: + logger.debug("offer_topic") if excluded_skills is None: excluded_skills = int_ctx.get_disliked_skills(ctx, actor) @@ -79,9 +83,9 @@ def offer_topic(ctx: Context, actor: Actor, excluded_skills=None, *args, **kwarg # offer_topic_choose = compose_topic_offering(ctx, actor, excluded_skills=excluded_skills) # else: # # what do you want to talk about? 
- # offer_topic_choose = random.choice(common_greeting.GREETING_QUESTIONS["what_to_talk_about"]) + # offer_topic_choose = random.choice(common_greeting.GREETING_QUESTIONS[LANGUAGE]["what_to_talk_about"]) greeting_step_id = GREETING_STEPS.index("what_to_talk_about") - logger.info(f"Assign greeting_step_id to {greeting_step_id + 1}") + logger.debug(f"Assign greeting_step_id to {greeting_step_id + 1}") int_ctx.save_to_shared_memory(ctx, actor, greeting_step_id=greeting_step_id + 1) return offer_topic_choose @@ -95,6 +99,7 @@ def greeting_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: bot attributes (empty), attributes MUST_CONTINUE or (empty) """ + logger.debug("greeting_response") bot_utt = int_ctx.get_last_bot_utterance(ctx, actor)["text"].lower() if int_cnd.is_lets_chat_about_topic(ctx, actor): int_ctx.set_confidence(ctx, actor, HIGH_CONFIDENCE) @@ -113,9 +118,9 @@ def greeting_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: ) int_ctx.save_to_shared_memory(ctx, actor, greeting_type=which_start) if which_start == "how_are_you": - after_hello_resp = random.choice(common_greeting.HOW_ARE_YOU_RESPONSES) + after_hello_resp = random.choice(common_greeting.HOW_ARE_YOU_RESPONSES[LANGUAGE]) elif which_start == "what_is_your_name": - after_hello_resp = random.choice(common_greeting.WHAT_IS_YOUR_NAME_RESPONSES) + after_hello_resp = random.choice(common_greeting.WHAT_IS_YOUR_NAME_RESPONSES[LANGUAGE]) # elif which_start == "starter_genre": # after_hello_resp = starter_flow.genre_response(ctx, actor) # elif which_start == "starter_weekday": @@ -129,21 +134,23 @@ def greeting_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: if "seems like alexa decided to turn me on" in bot_utt: reply = after_hello_resp else: - reply = f"{common_greeting.HI_THIS_IS_ALEXA} {after_hello_resp}" + reply = f"{common_greeting.HI_THIS_IS_DREAM[LANGUAGE]} {after_hello_resp}" return reply def clarify_event_response(ctx: Context, actor: Actor, *args, **kwargs) -> 
str: + logger.debug("clarify_event_response") int_ctx.set_confidence(ctx, actor, SUPER_CONFIDENCE) int_ctx.set_can_continue(ctx, actor, MUST_CONTINUE) reply = random.choice(["Cool! Tell me about it.", "Great! What is it?"]) greeting_step_id = GREETING_STEPS.index("recent_personal_events") - logger.info(f"Assign greeting_step_id to {greeting_step_id + 1}") + logger.debug(f"Assign greeting_step_id to {greeting_step_id + 1}") int_ctx.save_to_shared_memory(ctx, actor, greeting_step_id=greeting_step_id + 1) return reply def false_positive_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: + logger.debug("false_positive_response") int_ctx.set_confidence(ctx, actor, SUPER_CONFIDENCE) int_ctx.set_can_continue(ctx, actor, MUST_CONTINUE) reply = "Hi! Seems like Alexa decided to turn me on. Do you want to chat with me?" @@ -151,6 +158,7 @@ def false_positive_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: def bye_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: + logger.debug("bye_response") int_ctx.set_confidence(ctx, actor, SUPER_CONFIDENCE) int_ctx.set_can_continue(ctx, actor, CAN_NOT_CONTINUE) reply = "Sorry, bye. 
#+#exit" @@ -158,28 +166,31 @@ def bye_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: def how_are_you_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: + logger.debug("how_are_you_response") int_ctx.set_confidence(ctx, actor, SUPER_CONFIDENCE) int_ctx.set_can_continue(ctx, actor, MUST_CONTINUE) - how_bot_is_doing_resp = random.choice(common_greeting.HOW_BOT_IS_DOING_RESPONSES) + how_bot_is_doing_resp = random.choice(common_greeting.HOW_BOT_IS_DOING_RESPONSES[LANGUAGE]) - question_about_activities = random.choice(common_greeting.GREETING_QUESTIONS["recent_personal_events"]) + question_about_activities = random.choice(common_greeting.GREETING_QUESTIONS[LANGUAGE]["recent_personal_events"]) reply = ( - f"{how_bot_is_doing_resp} {random.choice(common_greeting.WHAT_DO_YOU_DO_RESPONSES)} " + f"{how_bot_is_doing_resp} {random.choice(common_greeting.WHAT_DO_YOU_DO_RESPONSES[LANGUAGE])} " f"{question_about_activities}" ) greeting_step_id = GREETING_STEPS.index("recent_personal_events") - logger.info(f"Assign greeting_step_id to {greeting_step_id + 1}") + logger.debug(f"Assign greeting_step_id to {greeting_step_id + 1}") int_ctx.save_to_shared_memory(ctx, actor, greeting_step_id=greeting_step_id + 1) return reply def health_problems(ctx: Context, actor: Actor): + logger.debug("health_problems") if HEALTH_PROBLEMS.search(int_ctx.get_last_human_utterance(ctx, actor)["text"]): return True return False def how_human_is_doing_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: + logger.debug("how_human_is_doing_response") usr_sentiment = int_ctx.get_human_sentiment(ctx, actor) _no_entities = len(int_ctx.get_nounphrases_from_human_utterance(ctx, actor)) == 0 _no_requests = int_cnd.no_requests(ctx, actor) @@ -187,13 +198,13 @@ def how_human_is_doing_response(ctx: Context, actor: Actor, *args, **kwargs) -> if is_positive_regexp_based(int_ctx.get_last_human_utterance(ctx, actor)): int_ctx.set_confidence(ctx, actor, SUPER_CONFIDENCE) 
int_ctx.set_can_continue(ctx, actor, MUST_CONTINUE) - user_mood_acknowledgement = random.choice(common_greeting.GOOD_MOOD_REACTIONS) + user_mood_acknowledgement = random.choice(common_greeting.GOOD_MOOD_REACTIONS[LANGUAGE]) elif _is_unhealthy or is_negative_regexp_based(int_ctx.get_last_human_utterance(ctx, actor)): int_ctx.set_confidence(ctx, actor, HIGH_CONFIDENCE) int_ctx.set_can_continue(ctx, actor, CAN_CONTINUE_SCENARIO) user_mood_acknowledgement = ( - f"{random.choice(common_greeting.BAD_MOOD_REACTIONS)} " - f"{random.choice(common_greeting.GIVE_ME_CHANCE_TO_CHEER_UP)}" + f"{random.choice(common_greeting.BAD_MOOD_REACTIONS[LANGUAGE])} " + f"{random.choice(common_greeting.GIVE_ME_CHANCE_TO_CHEER_UP[LANGUAGE])}" ) else: if _no_entities and _no_requests and usr_sentiment != "negative": @@ -205,29 +216,32 @@ def how_human_is_doing_response(ctx: Context, actor: Actor, *args, **kwargs) -> int_ctx.set_can_continue(ctx, actor, CAN_CONTINUE_SCENARIO) if usr_sentiment == "positive": - user_mood_acknowledgement = random.choice(common_greeting.GOOD_MOOD_REACTIONS) + user_mood_acknowledgement = random.choice(common_greeting.GOOD_MOOD_REACTIONS[LANGUAGE]) elif usr_sentiment == "negative": user_mood_acknowledgement = ( - f"{random.choice(common_greeting.BAD_MOOD_REACTIONS)} " - f"{random.choice(common_greeting.GIVE_ME_CHANCE_TO_CHEER_UP)}" + f"{random.choice(common_greeting.BAD_MOOD_REACTIONS[LANGUAGE])} " + f"{random.choice(common_greeting.GIVE_ME_CHANCE_TO_CHEER_UP[LANGUAGE])}" ) else: - user_mood_acknowledgement = "Okay." 
+ user_mood_acknowledgement = int_cnd.get_not_used_and_save_sentiment_acknowledgement( + ctx, actor, sentiment="neutral", lang=LANGUAGE + ) - question_about_activities = random.choice(common_greeting.GREETING_QUESTIONS["recent_personal_events"]) + question_about_activities = random.choice(common_greeting.GREETING_QUESTIONS[LANGUAGE]["recent_personal_events"]) reply = ( - f"{user_mood_acknowledgement} {random.choice(common_greeting.WHAT_DO_YOU_DO_RESPONSES)} " + f"{user_mood_acknowledgement} {random.choice(common_greeting.WHAT_DO_YOU_DO_RESPONSES[LANGUAGE])} " f"{question_about_activities}" ) int_ctx.add_acknowledgement_to_response_parts(ctx, actor) greeting_step_id = GREETING_STEPS.index("recent_personal_events") - logger.info(f"Assign greeting_step_id to {greeting_step_id + 1}") + logger.debug(f"Assign greeting_step_id to {greeting_step_id + 1}") int_ctx.save_to_shared_memory(ctx, actor, greeting_step_id=greeting_step_id + 1) return reply def offer_topics_choice_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: + logger.debug("offer_topics_choice_response") int_ctx.set_confidence(ctx, actor, HIGH_CONFIDENCE) int_ctx.set_can_continue(ctx, actor, CAN_CONTINUE_SCENARIO) reply = offer_topic(ctx, actor) @@ -235,19 +249,23 @@ def offer_topics_choice_response(ctx: Context, actor: Actor, *args, **kwargs) -> def offered_topic_choice_declined_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: + logger.debug("offered_topic_choice_declined_response") int_ctx.set_confidence(ctx, actor, SUPER_CONFIDENCE) int_ctx.set_can_continue(ctx, actor, MUST_CONTINUE) greeting_step_id = 0 # what do you want to talk about? 
- offer_topic_choose = random.choice(common_greeting.GREETING_QUESTIONS["what_to_talk_about"]) + offer_topic_choose = random.choice(common_greeting.GREETING_QUESTIONS[LANGUAGE]["what_to_talk_about"]) - logger.info(f"Assign greeting_step_id to {greeting_step_id + 1}") + logger.debug(f"Assign greeting_step_id to {greeting_step_id + 1}") int_ctx.save_to_shared_memory(ctx, actor, greeting_step_id=greeting_step_id + 1) - reply = f"Okay. {offer_topic_choose}" + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, sentiment="neutral", lang=LANGUAGE) + int_ctx.add_acknowledgement_to_response_parts(ctx, actor) + reply = f"{ack} {offer_topic_choose}" return reply def std_greeting_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: + logger.debug("std_greeting_response") shared_memory = int_ctx.get_shared_memory(ctx, actor) @@ -262,49 +280,59 @@ def std_greeting_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: # acknowledgement, confidences if _nothing_dont_know or (_no_requests and len(_entities) == 0): + logger.debug("nothing OR no requests and no entities") if _friendship_was_active and greeting_step_id >= 1: ack = random.choice( - common_greeting.AFTER_GREETING_QUESTIONS_WHEN_NOT_TALKY[GREETING_STEPS[greeting_step_id - 1]] + common_greeting.AFTER_GREETING_QUESTIONS_WHEN_NOT_TALKY[LANGUAGE][GREETING_STEPS[greeting_step_id - 1]] ) int_ctx.set_confidence(ctx, actor, SUPER_CONFIDENCE) int_ctx.set_can_continue(ctx, actor, MUST_CONTINUE) else: - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) int_ctx.set_confidence(ctx, actor, MIDDLE_CONFIDENCE) int_ctx.set_can_continue(ctx, actor, CAN_CONTINUE_SCENARIO) elif not _no_requests and len(_entities) > 0: + logger.debug("no requests but entities") # user wants to talk about something particular. 
We are just a dummy response, if no appropriate if _friendship_was_active: - ack = random.choice(common_greeting.AFTER_GREETING_QUESTIONS_WHEN_NOT_TALKY["what_do_you_do_on_weekdays"]) - sent_ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = random.choice( + common_greeting.AFTER_GREETING_QUESTIONS_WHEN_NOT_TALKY[LANGUAGE]["what_do_you_do_on_weekdays"] + ) + sent_ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) ack = f"{sent_ack} {ack}" else: - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) int_ctx.set_confidence(ctx, actor, MIDDLE_CONFIDENCE) int_ctx.set_can_continue(ctx, actor, CAN_CONTINUE_SCENARIO) else: + logger.debug("other cases") if len(_entities) == 0 or _no_requests: int_ctx.set_confidence(ctx, actor, HIGH_CONFIDENCE) else: int_ctx.set_confidence(ctx, actor, MIDDLE_CONFIDENCE) # some request by user detected OR no requests but some entities detected if _friendship_was_active and GREETING_STEPS[greeting_step_id] == "recent_personal_events": - ack = random.choice(common_greeting.INTERESTING_PERSON_THANKS_FOR_CHATTING) + ack = random.choice(common_greeting.INTERESTING_PERSON_THANKS_FOR_CHATTING[LANGUAGE]) else: - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) int_ctx.set_can_continue(ctx, actor, CAN_CONTINUE_SCENARIO) if health_problems(ctx, actor): ack = "I'm so sorry to hear that. Hope, everything will be fine soon." 
if greeting_step_id == 0 or GREETING_STEPS[greeting_step_id] == "what_to_talk_about": - prev_active_skills = [uttr.get("active_skill", "") for uttr in int_ctx.get_bot_utterances(ctx, actor)][-5:] - disliked_skills = int_ctx.get_disliked_skills(ctx, actor) - body = offer_topic(ctx, actor, excluded_skills=prev_active_skills + disliked_skills) + logger.debug("step-id=0 or what_to_talk_about") + if LANGUAGE == "EN": + prev_active_skills = [uttr.get("active_skill", "") for uttr in int_ctx.get_bot_utterances(ctx, actor)][-5:] + disliked_skills = int_ctx.get_disliked_skills(ctx, actor) + body = offer_topic(ctx, actor, excluded_skills=prev_active_skills + disliked_skills) + else: + body = random.choice(common_greeting.GREETING_QUESTIONS[LANGUAGE]["what_to_talk_about"]) else: - body = random.choice(common_greeting.GREETING_QUESTIONS[GREETING_STEPS[greeting_step_id]]) + logger.debug("choose according to step-id") + body = random.choice(common_greeting.GREETING_QUESTIONS[LANGUAGE][GREETING_STEPS[greeting_step_id]]) - logger.info(f"Assign in std_greeting_response greeting_step_id to {greeting_step_id + 1}") + logger.debug(f"Assign in std_greeting_response greeting_step_id to {greeting_step_id + 1}") int_ctx.save_to_shared_memory(ctx, actor, greeting_step_id=greeting_step_id + 1) int_ctx.add_acknowledgement_to_response_parts(ctx, actor) @@ -313,7 +341,8 @@ def std_greeting_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: def new_entities_is_needed_for_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + logger.debug("new_entities_is_needed_for_response") + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) body = "Tell me more about that." 
int_cnd.set_conf_and_can_cont_by_universal_policy(ctx, actor) @@ -325,7 +354,8 @@ def new_entities_is_needed_for_response(ctx: Context, actor: Actor, *args, **kwa def closed_answer_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + logger.debug("closed_answer_response") + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) body = "" int_cnd.set_conf_and_can_cont_by_universal_policy(ctx, actor) @@ -342,6 +372,7 @@ def closed_answer_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: def masked_lm(templates=None, prob_threshold=0.0, probs_flag=False): + logger.debug("masked_lm") templates = ["Hello, it's [MASK] dog."] if templates is None else templates request_data = {"text": templates} try: @@ -364,10 +395,12 @@ def masked_lm(templates=None, prob_threshold=0.0, probs_flag=False): def link_to_by_enity_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: + logger.debug("link_to_by_enity_response") try: - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) entities = int_ctx.get_new_human_labeled_noun_phrase(ctx, actor) if entities: + logger.debug("entities detected") logger.debug(f"entities= {entities}") tgt_entity = list(entities)[-1] logger.debug(f"tgt_entity= {tgt_entity}") @@ -385,6 +418,7 @@ def link_to_by_enity_response(ctx: Context, actor: Actor, *args, **kwargs) -> st } skill_names = sorted(link_to_skill_scores, key=lambda x: link_to_skill_scores[x])[-2:] else: + logger.debug("no entities detected") skill_names = [random.choice(list(link_to_skill2key_words))] # used_links diff --git a/skills/dff_friendship_skill/scenario/weekend_condition.py b/skills/dff_friendship_skill/scenario/weekend_condition.py index fcf1bf59d7..24946b2641 100644 --- a/skills/dff_friendship_skill/scenario/weekend_condition.py +++ 
b/skills/dff_friendship_skill/scenario/weekend_condition.py @@ -1,4 +1,5 @@ import re +from os import getenv from df_engine.core import Actor, Context @@ -10,7 +11,9 @@ from common.scenarios.games import was_game_mentioned -GREETING_STEPS = list(common_greeting.GREETING_QUESTIONS) +LANGUAGE = getenv("LANGUAGE", "EN") + +GREETING_STEPS = list(common_greeting.GREETING_QUESTIONS[LANGUAGE]) link_to_skill2key_words = { skill_name: common_link.link_to_skill2key_words[skill_name] for skill_name in common_link.link_to_skill2key_words diff --git a/skills/dff_friendship_skill/scenario/weekend_response.py b/skills/dff_friendship_skill/scenario/weekend_response.py index c4f41658f6..7bbcc94ffb 100644 --- a/skills/dff_friendship_skill/scenario/weekend_response.py +++ b/skills/dff_friendship_skill/scenario/weekend_response.py @@ -13,9 +13,10 @@ sentry_sdk.init(getenv("SENTRY_DSN")) -logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) logger = logging.getLogger(__name__) +LANGUAGE = getenv("LANGUAGE", "EN") + REPLY_TYPE = Tuple[str, float, dict, dict, dict] DIALOG_BEGINNING_START_CONFIDENCE = 0.98 DIALOG_BEGINNING_CONTINUE_CONFIDENCE = 0.9 @@ -24,12 +25,12 @@ SUPER_CONFIDENCE = 1.0 HIGH_CONFIDENCE = 0.98 MIDDLE_CONFIDENCE = 0.95 -GREETING_STEPS = list(common_greeting.GREETING_QUESTIONS) +GREETING_STEPS = list(common_greeting.GREETING_QUESTIONS[LANGUAGE]) def std_weekend_response(ctx: Context, actor: Actor) -> str: # get ack, body - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) # obtaining random response from weekend questions body = random.choice(common_weekend.WEEKEND_QUESTIONS) @@ -44,7 +45,7 @@ def std_weekend_response(ctx: Context, actor: Actor) -> str: def sys_cleaned_up_response(ctx: Context, actor: Actor) -> str: # get ack, body - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + 
ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) # obtaining random response from weekend questions body = random.choice(common_weekend.CLEANED_UP_STATEMENTS) @@ -59,7 +60,7 @@ def sys_cleaned_up_response(ctx: Context, actor: Actor) -> str: def sys_slept_in_response(ctx: Context, actor: Actor) -> str: # get ack, body - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) # obtaining random response from weekend questions body = random.choice(common_weekend.SLEPT_IN_QUESTIONS) @@ -74,7 +75,7 @@ def sys_slept_in_response(ctx: Context, actor: Actor) -> str: def sys_feel_great_response(ctx: Context, actor: Actor) -> str: # get ack, body - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) # obtaining random response from weekend questions body = random.choice(common_weekend.WHAT_PLANS_FOR_TODAY) @@ -89,7 +90,7 @@ def sys_feel_great_response(ctx: Context, actor: Actor) -> str: def sys_need_more_time_response(ctx: Context, actor: Actor) -> str: # get ack, body - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) # obtaining random response from weekend questions body = random.choice(common_weekend.WISH_MORE_TIME) @@ -104,7 +105,7 @@ def sys_need_more_time_response(ctx: Context, actor: Actor) -> str: def sys_watched_film_response(ctx: Context, actor: Actor) -> str: # get ack, body - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) # obtaining random response from weekend questions body = random.choice(common_weekend.MOVIE_NAME_QUESTION) @@ -119,7 +120,7 @@ def 
sys_watched_film_response(ctx: Context, actor: Actor) -> str: def sys_read_book_response(ctx: Context, actor: Actor) -> str: # get ack, body - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) # obtaining random response from weekend questions body = random.choice(common_weekend.BOOK_NAME_QUESTION) @@ -134,7 +135,7 @@ def sys_read_book_response(ctx: Context, actor: Actor) -> str: def sys_played_computer_game_response(ctx: Context, actor: Actor) -> str: # get ack, body - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) # obtaining random response from weekend questions body = random.choice(common_weekend.COMPUTER_GAME_NAME_QUESTION) @@ -149,7 +150,7 @@ def sys_played_computer_game_response(ctx: Context, actor: Actor) -> str: def sys_play_on_weekends_response(ctx: Context, actor: Actor) -> str: # get ack, body - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) # obtaining random response from weekend questions body = random.choice(common_weekend.GAME_EMOTIONS_QUESTION) @@ -164,7 +165,7 @@ def sys_play_on_weekends_response(ctx: Context, actor: Actor) -> str: def sys_play_regularly_response(ctx: Context, actor: Actor) -> str: # get ack, body - ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) # obtaining random response from weekend questions body = random.choice(common_weekend.REGULAR_PLAYER_QUESTION) @@ -179,7 +180,7 @@ def sys_play_regularly_response(ctx: Context, actor: Actor) -> str: def sys_play_once_response(ctx: Context, actor: Actor) -> str: # get ack, body - ack = 
int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor) + ack = int_cnd.get_not_used_and_save_sentiment_acknowledgement(ctx, actor, lang=LANGUAGE) # obtaining random response from weekend questions body = random.choice(common_weekend.OCCASIONAL_PLAYER_QUESTION) diff --git a/skills/dff_friendship_skill/server.py b/skills/dff_friendship_skill/server.py index a8eeed22d7..837ca087b0 100644 --- a/skills/dff_friendship_skill/server.py +++ b/skills/dff_friendship_skill/server.py @@ -25,7 +25,7 @@ SERVICE_PORT = int(os.getenv("SERVICE_PORT")) RANDOM_SEED = int(os.getenv("RANDOM_SEED", 2718)) -logging.basicConfig(format="%(asctime)s - %(pathname)s - %(lineno)d - %(levelname)s - %(message)s", level=logging.DEBUG) +logging.basicConfig(format="%(asctime)s - %(pathname)s - %(lineno)d - %(levelname)s - %(message)s", level=logging.INFO) logger = logging.getLogger(__name__) diff --git a/skills/dff_friendship_skill/test_server.py b/skills/dff_friendship_skill/test_server.py index a894b21b1e..3a438a9744 100644 --- a/skills/dff_friendship_skill/test_server.py +++ b/skills/dff_friendship_skill/test_server.py @@ -6,6 +6,7 @@ SERVICE_PORT = int(os.getenv("SERVICE_PORT")) RANDOM_SEED = int(os.getenv("RANDOM_SEED", 2718)) +LANGUAGE = os.getenv("LANGUAGE", "EN") URL = f"http://0.0.0.0:{SERVICE_PORT}/respond" @@ -17,6 +18,11 @@ def handler(requested_data, random_seed): def run_test(handler): in_data, out_data = test_utils.get_dataset() for test_name in in_data: + if LANGUAGE == "RU" and "RU" not in test_name: + # if russian language, skip english tests + continue + elif LANGUAGE == "EN" and "RU" in test_name: + continue hypothesis = handler(in_data[test_name], RANDOM_SEED) print(f"test name: {test_name}") is_equal_flag, msg = test_utils.compare_structs(out_data[test_name], hypothesis, ignored_keys=["id"]) diff --git a/skills/dff_friendship_skill/tests/how_bot_is_doing_RU_in.json b/skills/dff_friendship_skill/tests/how_bot_is_doing_RU_in.json new file mode 100644 index 
0000000000..87a5e4042d --- /dev/null +++ b/skills/dff_friendship_skill/tests/how_bot_is_doing_RU_in.json @@ -0,0 +1,232 @@ +{ + "human_utter_index_batch": [ + 1 + ], + "dialog_batch": [ + { + "human_utterances": [ + { + "text": "привет", + "annotations": { + "spelling_preprocessing": "привет", + "badlisted_words": { + "bad_words": false + }, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.9863496422767639, + "detected": 1 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 1, + "wiki_skill_entities_info": {} + } + } + }, + { + "text": "отлично! а у тебя?", + "annotations": { + "spelling_preprocessing": "отлично! 
а у тебя?", + "badlisted_words": { + "bad_words": false + }, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.0, + "detected": 0 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 2, + "wiki_skill_entities_info": {} + } + } + } + ], + "bot_utterances": [ + { + "text": "Привет, это чат-бот Dream! Как дела?", + "annotations": { + "ner": [ + [ + { + "confidence": 1, + "end_pos": 25, + "start_pos": 20, + "text": "Dream", + "type": "PER" + } + ] + ] + }, + "active_skill": "dff_friendship_skill" + } + ] + } + ], + "dff_friendship_skill_state_batch": [ + { + "context": { + "actor_state": {}, + "id": "59c2096b-1dbb-4d0d-b5c6-5b325cb62820", + "labels": { + "0": [ + "greeting_flow", + "hello_response_node" + ] + }, + "misc": {}, + "requests": { + "0": "привет" + }, + "responses": { + "0": "Привет, это чат-бот Dream! Как дела?" 
+ }, + "validation": false + }, + "current_turn_dff_suspended": false, + "history": { + "0": [ + "greeting_flow", + "hello_response_node" + ] + }, + "previous_human_utter_index": 0, + "shared_memory": { + "greeting_type": "how_are_you" + } + } + ], + "dff_shared_state_batch": [ + { + "cross_links": {}, + "cross_states": {} + } + ], + "entities_batch": [ + {} + ], + "used_links_batch": [ + {} + ], + "age_group_batch": [ + "" + ], + "disliked_skills_batch": [ + [] + ], + "clarification_request_flag_batch": [ + false + ] +} \ No newline at end of file diff --git a/skills/dff_friendship_skill/tests/how_bot_is_doing_RU_out.json b/skills/dff_friendship_skill/tests/how_bot_is_doing_RU_out.json new file mode 100644 index 0000000000..efd33aaa68 --- /dev/null +++ b/skills/dff_friendship_skill/tests/how_bot_is_doing_RU_out.json @@ -0,0 +1,61 @@ +[ + [ + "Спасибо, у меня как всегда все прекрасно! Я слышала, что у людей бывают отпуска и выходные. Не мой выбор! Я могу без устали говорить день и ночь! Что главное произошло с тобой сегодня?", + 1.0, + { + "dff_friendship_skill_state": { + "shared_memory": { + "greeting_type": "how_are_you", + "greeting_step_id": 1 + }, + "previous_human_utter_index": 1, + "history": { + "0": [ + "greeting_flow", + "hello_response_node" + ], + "1": [ + "greeting_flow", + "how_are_you_node" + ] + }, + "current_turn_dff_suspended": false, + "context": { + "id": "59c2096b-1dbb-4d0d-b5c6-5b325cb62820", + "labels": { + "0": [ + "greeting_flow", + "hello_response_node" + ], + "1": [ + "greeting_flow", + "how_are_you_node" + ] + }, + "requests": { + "0": "привет", + "1": "отлично! а у тебя?" + }, + "responses": { + "0": "Привет, это чат-бот Dream! Как дела?", + "1": "Спасибо, у меня как всегда все прекрасно! Я слышала, что у людей бывают отпуска и выходные. Не мой выбор! Я могу без устали говорить день и ночь! Что главное произошло с тобой сегодня?" 
+ }, + "misc": {}, + "validation": false, + "actor_state": {} + } + }, + "dff_shared_state": { + "cross_links": {}, + "cross_states": {} + }, + "used_links": {}, + "age_group": "", + "disliked_skills": [] + }, + {}, + { + "can_continue": "must" + } + ] +] \ No newline at end of file diff --git a/skills/dff_friendship_skill/tests/lets_talk_RU_in.json b/skills/dff_friendship_skill/tests/lets_talk_RU_in.json new file mode 100644 index 0000000000..e0218c374f --- /dev/null +++ b/skills/dff_friendship_skill/tests/lets_talk_RU_in.json @@ -0,0 +1,110 @@ +{ + "human_utter_index_batch": [ + 0 + ], + "dialog_batch": [ + { + "human_utterances": [ + { + "text": "привет", + "annotations": { + "spelling_preprocessing": "привет", + "badlisted_words": { + "bad_words": false + }, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.9863496422767639, + "detected": 1 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 1, + "wiki_skill_entities_info": {} + } + } + } + ], + "bot_utterances": [] + } + ], + "dff_friendship_skill_state_batch": [ + {} + ], + "dff_shared_state_batch": [ + { + "cross_states": {}, + 
"cross_links": {} + } + ], + "entities_batch": [ + {} + ], + "used_links_batch": [ + {} + ], + "age_group_batch": [ + "" + ], + "disliked_skills_batch": [ + [] + ], + "clarification_request_flag_batch": [ + false + ] +} \ No newline at end of file diff --git a/skills/dff_friendship_skill/tests/lets_talk_RU_out.json b/skills/dff_friendship_skill/tests/lets_talk_RU_out.json new file mode 100644 index 0000000000..531c7d0c47 --- /dev/null +++ b/skills/dff_friendship_skill/tests/lets_talk_RU_out.json @@ -0,0 +1,50 @@ +[ + [ + "Привет, это чат-бот Dream! Как дела?", + 1.0, + { + "dff_friendship_skill_state": { + "shared_memory": { + "greeting_type": "how_are_you" + }, + "previous_human_utter_index": 0, + "history": { + "0": [ + "greeting_flow", + "hello_response_node" + ] + }, + "current_turn_dff_suspended": false, + "context": { + "id": "59c2096b-1dbb-4d0d-b5c6-5b325cb62820", + "labels": { + "0": [ + "greeting_flow", + "hello_response_node" + ] + }, + "requests": { + "0": "привет" + }, + "responses": { + "0": "Привет, это чат-бот Dream! Как дела?" 
+ }, + "misc": {}, + "validation": false, + "actor_state": {} + } + }, + "dff_shared_state": { + "cross_states": {}, + "cross_links": {} + }, + "used_links": {}, + "age_group": "", + "disliked_skills": [] + }, + {}, + { + "can_continue": "must" + } + ] +] \ No newline at end of file diff --git a/skills/dff_friendship_skill/tests/no_hobbies_RU_in.json b/skills/dff_friendship_skill/tests/no_hobbies_RU_in.json new file mode 100644 index 0000000000..23ace80041 --- /dev/null +++ b/skills/dff_friendship_skill/tests/no_hobbies_RU_in.json @@ -0,0 +1,239 @@ +{ + "human_utter_index_batch": [ + 3 + ], + "dialog_batch": [ + { + "human_utterances": [ + { + "text": "ничего", + "annotations": { + "spelling_preprocessing": "ничего", + "badlisted_words": { + "bad_words": false + }, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.9986997842788696, + "detected": 1 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.0, + "detected": 0 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 3, + "wiki_skill_entities_info": {} + } + } + }, + { + "text": "не знаю", + "annotations": { + "spelling_preprocessing": "не знаю", + "badlisted_words": { + "bad_words": false + }, + 
"intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.5191057920455933, + "detected": 1 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.0, + "detected": 0 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 4, + "wiki_skill_entities_info": {} + } + } + } + ], + "bot_utterances": [ + { + "text": "Ну что ж, я все равно считаю тебя интересным человеком. Какие у тебя хобби?", + "annotations": { + "ner": [ + [] + ] + }, + "active_skill": "dff_friendship_skill" + } + ] + } + ], + "dff_friendship_skill_state_batch": [ + { + "context": { + "actor_state": {}, + "id": "59c2096b-1dbb-4d0d-b5c6-5b325cb62820", + "labels": { + "1": [ + "greeting_flow", + "how_are_you_node" + ], + "2": [ + "greeting_flow", + "std_greeting_node" + ] + }, + "misc": {}, + "requests": { + "1": "отлично! а у тебя?", + "2": "ничего" + }, + "responses": { + "1": "Спасибо, у меня как всегда все прекрасно! Я слышала, что у людей бывают отпуска и выходные. Не мой выбор! Я могу без устали говорить день и ночь! Что главное произошло с тобой сегодня?", + "2": "Ну что ж, я все равно считаю тебя интересным человеком. Какие у тебя хобби?" 
+ }, + "validation": false + }, + "current_turn_dff_suspended": false, + "history": { + "0": [ + "greeting_flow", + "hello_response_node" + ], + "1": [ + "greeting_flow", + "how_are_you_node" + ], + "2": [ + "greeting_flow", + "std_greeting_node" + ] + }, + "previous_human_utter_index": 2, + "shared_memory": { + "greeting_step_id": 2, + "greeting_type": "how_are_you" + } + } + ], + "dff_shared_state_batch": [ + { + "cross_links": {}, + "cross_states": {} + } + ], + "entities_batch": [ + {} + ], + "used_links_batch": [ + {} + ], + "age_group_batch": [ + "" + ], + "disliked_skills_batch": [ + [] + ], + "clarification_request_flag_batch": [ + false + ] +} \ No newline at end of file diff --git a/skills/dff_friendship_skill/tests/no_hobbies_RU_out.json b/skills/dff_friendship_skill/tests/no_hobbies_RU_out.json new file mode 100644 index 0000000000..3ab9ae5bb2 --- /dev/null +++ b/skills/dff_friendship_skill/tests/no_hobbies_RU_out.json @@ -0,0 +1,73 @@ +[ + [ + "Возможно, тебе просто еще не встретилось что-то на самом деле интересное тебе. Чем ты занимаешься на выходных?", + 1.0, + { + "dff_friendship_skill_state": { + "shared_memory": { + "greeting_step_id": 3, + "greeting_type": "how_are_you" + }, + "previous_human_utter_index": 3, + "history": { + "0": [ + "greeting_flow", + "hello_response_node" + ], + "1": [ + "greeting_flow", + "how_are_you_node" + ], + "2": [ + "greeting_flow", + "std_greeting_node" + ], + "3": [ + "greeting_flow", + "std_greeting_node" + ] + }, + "current_turn_dff_suspended": false, + "context": { + "id": "59c2096b-1dbb-4d0d-b5c6-5b325cb62820", + "labels": { + "2": [ + "greeting_flow", + "std_greeting_node" + ], + "3": [ + "greeting_flow", + "std_greeting_node" + ] + }, + "requests": { + "2": "ничего", + "3": "не знаю" + }, + "responses": { + "2": "Ну что ж, я все равно считаю тебя интересным человеком. Какие у тебя хобби?", + "3": "Возможно, тебе просто еще не встретилось что-то на самом деле интересное тебе. Чем ты занимаешься на выходных?" 
+ }, + "misc": {}, + "validation": false, + "actor_state": {} + } + }, + "dff_shared_state": { + "cross_links": {}, + "cross_states": {} + }, + "used_links": {}, + "age_group": "", + "disliked_skills": [] + }, + {}, + { + "can_continue": "must", + "response_parts": [ + "acknowledgement", + "body" + ] + } + ] +] \ No newline at end of file diff --git a/skills/dff_friendship_skill/tests/no_recent_events_RU_in.json b/skills/dff_friendship_skill/tests/no_recent_events_RU_in.json new file mode 100644 index 0000000000..b6a4e59e58 --- /dev/null +++ b/skills/dff_friendship_skill/tests/no_recent_events_RU_in.json @@ -0,0 +1,243 @@ +{ + "human_utter_index_batch": [ + 2 + ], + "dialog_batch": [ + { + "human_utterances": [ + { + "text": "отлично! а у тебя?", + "annotations": { + "spelling_preprocessing": "отлично! а у тебя?", + "badlisted_words": { + "bad_words": false + }, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.0, + "detected": 0 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 2, + "wiki_skill_entities_info": {} + } + } + }, + { + "text": "ничего", + "annotations": { + 
"spelling_preprocessing": "ничего", + "badlisted_words": { + "bad_words": false + }, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.9986997842788696, + "detected": 1 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.0, + "detected": 0 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 3, + "wiki_skill_entities_info": {} + } + } + } + ], + "bot_utterances": [ + { + "text": "Спасибо, у меня как всегда все прекрасно! Я слышала, что у людей бывают отпуска и выходные. Не мой выбор! Я могу без устали говорить день и ночь! Что главное произошло с тобой сегодня?", + "annotations": { + "ner": [ + [ + { + "confidence": 1, + "end_pos": 175, + "start_pos": 170, + "text": "тобой", + "type": "PER" + } + ] + ] + }, + "active_skill": "dff_friendship_skill" + } + ] + } + ], + "dff_friendship_skill_state_batch": [ + { + "context": { + "actor_state": {}, + "id": "59c2096b-1dbb-4d0d-b5c6-5b325cb62820", + "labels": { + "0": [ + "greeting_flow", + "hello_response_node" + ], + "1": [ + "greeting_flow", + "how_are_you_node" + ] + }, + "misc": {}, + "requests": { + "0": "привет", + "1": "отлично! а у тебя?" + }, + "responses": { + "0": "Привет, это чат-бот Dream! 
Как дела?", + "1": "Спасибо, у меня как всегда все прекрасно! Я слышала, что у людей бывают отпуска и выходные. Не мой выбор! Я могу без устали говорить день и ночь! Что главное произошло с тобой сегодня?" + }, + "validation": false + }, + "current_turn_dff_suspended": false, + "history": { + "0": [ + "greeting_flow", + "hello_response_node" + ], + "1": [ + "greeting_flow", + "how_are_you_node" + ] + }, + "previous_human_utter_index": 1, + "shared_memory": { + "greeting_step_id": 1, + "greeting_type": "how_are_you" + } + } + ], + "dff_shared_state_batch": [ + { + "cross_links": {}, + "cross_states": {} + } + ], + "entities_batch": [ + {} + ], + "used_links_batch": [ + {} + ], + "age_group_batch": [ + "" + ], + "disliked_skills_batch": [ + [] + ], + "clarification_request_flag_batch": [ + false + ] +} \ No newline at end of file diff --git a/skills/dff_friendship_skill/tests/no_recent_events_RU_out.json b/skills/dff_friendship_skill/tests/no_recent_events_RU_out.json new file mode 100644 index 0000000000..05e8fa0847 --- /dev/null +++ b/skills/dff_friendship_skill/tests/no_recent_events_RU_out.json @@ -0,0 +1,69 @@ +[ + [ + "Ну что ж, я все равно считаю тебя интересным человеком. Какие у тебя хобби?", + 1.0, + { + "dff_friendship_skill_state": { + "shared_memory": { + "greeting_step_id": 2, + "greeting_type": "how_are_you" + }, + "previous_human_utter_index": 2, + "history": { + "0": [ + "greeting_flow", + "hello_response_node" + ], + "1": [ + "greeting_flow", + "how_are_you_node" + ], + "2": [ + "greeting_flow", + "std_greeting_node" + ] + }, + "current_turn_dff_suspended": false, + "context": { + "id": "59c2096b-1dbb-4d0d-b5c6-5b325cb62820", + "labels": { + "1": [ + "greeting_flow", + "how_are_you_node" + ], + "2": [ + "greeting_flow", + "std_greeting_node" + ] + }, + "requests": { + "1": "отлично! а у тебя?", + "2": "ничего" + }, + "responses": { + "1": "Спасибо, у меня как всегда все прекрасно! Я слышала, что у людей бывают отпуска и выходные. Не мой выбор! 
Я могу без устали говорить день и ночь! Что главное произошло с тобой сегодня?", + "2": "Ну что ж, я все равно считаю тебя интересным человеком. Какие у тебя хобби?" + }, + "misc": {}, + "validation": false, + "actor_state": {} + } + }, + "dff_shared_state": { + "cross_links": {}, + "cross_states": {} + }, + "used_links": {}, + "age_group": "", + "disliked_skills": [] + }, + {}, + { + "can_continue": "must", + "response_parts": [ + "acknowledgement", + "body" + ] + } + ] +] \ No newline at end of file diff --git a/skills/dff_generative_skill/Dockerfile b/skills/dff_generative_skill/Dockerfile new file mode 100644 index 0000000000..f28ea76a78 --- /dev/null +++ b/skills/dff_generative_skill/Dockerfile @@ -0,0 +1,32 @@ +FROM python:3.9.1 +# ###################### IMMUTABLE SECTION ###################################### +# Do not change anything in this section +WORKDIR /src + +COPY common/dff/requirements.txt . +RUN pip install -r requirements.txt + +# ###################### CUSTOM SECTION ###################################### +# Here you can make changes + +ARG SERVICE_NAME +ENV SERVICE_NAME ${SERVICE_NAME} + +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + +COPY skills/${SERVICE_NAME}/requirements.txt . 
+RUN pip install -r requirements.txt +RUN python -m nltk.downloader wordnet + +COPY skills/${SERVICE_NAME}/ ./ +COPY ./common/ ./common/ + +ARG SERVICE_PORT +ENV SERVICE_PORT ${SERVICE_PORT} + +# wait for a server answer for ( INTERVAL + TIMEOUT ) * RETRIES seconds, after that change status to unhealthy +HEALTHCHECK --interval=5s --timeout=5s --retries=3 CMD curl --fail 127.0.0.1:${SERVICE_PORT}/healthcheck || exit 1 + + +CMD gunicorn --workers=1 server:app -b 0.0.0.0:${SERVICE_PORT} diff --git a/skills/dff_generative_skill/README.md b/skills/dff_generative_skill/README.md new file mode 100644 index 0000000000..67335f581f --- /dev/null +++ b/skills/dff_generative_skill/README.md @@ -0,0 +1,302 @@ +# DialogFlow Framework Template +Changes can only be made in the `dialogflows` directory. + +The template has dialog flows based on programy (`repeating`) and based on vanilla Python (`greeting`). + +```bash +python utils/create_local_yml.py -s dff-template-skill -s convers-evaluation-selector + +docker-compose -f docker-compose.yml -f local.yml up -d --build + +docker-compose -f docker-compose.yml -f local.yml exec agent python -m deeppavlov_agent.run +docker-compose -f docker-compose.yml -f local.yml logs -f dff-template-skill +docker-compose -f docker-compose.yml -f local.yml exec dff-template-skill bash test.sh +``` + + +# Important changes in files of the agent docker-compose.yml +```yml + dff-template-skill: + build: + args: + SERVICE_PORT: 8095 + SERVICE_NAME: dff_template_skill # has to be the same as the skill dir name + context: .
+ dockerfile: ./skills/dff_template_skill/Dockerfile + command: gunicorn --workers=1 server:app -b 0.0.0.0:8095 --reload + deploy: + mode: replicated + replicas: 4 + resources: + limits: + memory: 768M + reservations: + memory: 768M +``` + + +dev.yml +```yml + dff-template-skill: + env_file: [.env.dev] + volumes: + - "./skills/dff_template:/src" + - "./common:/src/common" + ports: + - 8095:8095 +``` + +pipeline.json +```json + "dff_template": { + "connector": { + "protocol": "http", + "url": "http://dff-template:8095/respond" + }, + "dialog_formatter": "state_formatters.dp_formatters:dff_template_formatter", + "response_formatter": "state_formatters.dp_formatters:skill_with_attributes_formatter_service", + "previous_services": ["skill_selectors"], + "state_manager_method": "add_hypothesis" + }, +``` + +state_formatters/formatter.py +```python +def DFF_TEMPLATE_formatter(dialog: Dict) -> List[Dict]: + service_name = f"DFF_TEMPLATE" + return utils.dff_formatter(dialog, service_name) +``` +[skill_selectors/rule_based_selector/connector.py](https://github.com/dilyararimovna/dp-dream-alexa/blob/a4fdea01a1f16c2a877f9d9447350463adc96a2f/skill_selectors/rule_based_selector/connector.py#L381) + +```python + response=["dff_template"], +``` + + +# Tests +## Test creating + +The file `server.py` contains this code: + +```python +@app.route("/respond", methods=["POST"]) +def respond(): + # next commented line is for test creating + # import common.test_utils as t_utils; t_utils.save_to_test(request.json,"tests/TEST_NAME_in.json",indent=4) + responses = handler(request.json) + # next commented line is for test creating + # import common.test_utils as t_utils; t_utils.save_to_test(responses,"tests/TEST_NAME_out.json",indent=4) + return jsonify(responses) + +``` +Steps: +1. Uncomment the lines with the json dump. +1. Name your test by replacing `TEST_NAME` in both lines. They have to be the same. +1. Start a test dialog with the agent. Every turn will be written in `tests/TEST_NAME*`.
+`*_in.json` - for input data, `*_out.json` - for response data.
+
+If you want to write down all turns of the test dialog, you can use this code:
+
+```python
+import common.test_utils as t_utils
+
+index = 0
+
+
+@app.route("/respond", methods=["POST"])
+def respond():
+    global index
+    # next line is for test creating: save the incoming request
+    t_utils.save_to_test(request.json, f"tests/TEST_NAME_{index}_in.json", indent=4)
+    responses = handler(request.json)
+    # next line is for test creating: save the produced responses
+    t_utils.save_to_test(responses, f"tests/TEST_NAME_{index}_out.json", indent=4)
+    index += 1
+    return jsonify(responses)
+
+```
+## Test using
+Tests are used in two ways:
+
+- service initialization in `server.py`
+
+```python
+try:
+    test_server.run_test(handler)
+    logger.info("test query processed")
+except Exception as exc:
+    sentry_sdk.capture_exception(exc)
+    logger.exception(exc)
+    raise exc
+```
+
+- service testing by executing `test.sh`
+
+
+## Test extending
+If your service is based on random behavior, you can send a `random_seed` to it. You can find the corresponding lines in `server.py`:
+```python
+    ...  # some code
+    rand_seed = requested_data.get("rand_seed")  # for tests
+    ...  # some code
+    if rand_seed:
+        random.seed(int(rand_seed))
+    ...  # some code
+```
+
+For answer comparison we use `common.test_utils`:
+- `compare_structs` - for json structure comparison
+- `compare_text` - for text comparison
+
+You can use them for your custom comparisons.
+
+
+## Links between dff skills
+1. Making a link (example of a link from dff\_animals\_skill to dff\_wiki_skill):
+```python
+    import common.dialogflow_framework.utils.state as state_utils
+    ...  # some code
+    def why_do_you_like_response(vars):
+        ...  # some code
+        if found_animal:
+            response = f"Cool! Why do you like {found_animal}?"
+        else:
+            response = "Cool! Why do you like them?"
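+            # For orientation, the cross link and cross state created below end up in the shared
+            # dff state; roughly (structure inferred from find_wiki_cross_links and the test data,
+            # not an exact API guarantee):
+            # dff_shared_state = {
+            #     "cross_links": {"dff_wiki_skill": {"from_service": "dff_animals_skill"}},
+            #     "cross_states": {"dff_wiki_skill": add_info},
+            # }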
+
+        if found_entity_id:
+            # making a cross link
+            state_utils.set_cross_link(vars, to_service_name="dff_wiki_skill", from_service_name="dff_animals_skill")
+            add_info = {"entity_id": found_entity_id, "entity_substr": found_animal, "entity_types": found_types,
+                        "entity_page": found_entity_page}  # if we want to pass some info between skills
+            # save the info in the cross state
+            state_utils.save_cross_state(vars, service_name="dff_wiki_skill", new_state=add_info)
+            # suspend the current dff skill so that after the next dff skill finishes its scenario,
+            # the current scenario is resumed from this state
+            state_utils.set_dff_suspension(vars)
+
+        return response
+```
+
+2. Using the link in the destination skill (dff\_wiki_skill in our example):
+```python
+    import common.dialogflow_framework.utils.state as state_utils
+    ...  # some code
+    def tell_fact_request(ngrams, vars):
+        flag = False
+        cross_link = state_utils.get_cross_link(vars, service_name="dff_wiki_skill")
+        # cross_link is a dict {"from_service": "dff_animals_skill"}
+        cross_state = state_utils.get_cross_state(vars, service_name="dff_wiki_skill")
+        # cross_state is the add_info dict which was saved in why_do_you_like_response using save_cross_state
+        from_skill = cross_link.get("from_service", "")
+        if from_skill == "dff_animals_skill":
+            flag = True
+        return flag
+
+```
+
+3. To switch to the destination skill if the link was made, we can add a function to the common folder
+(in our example, in common/wiki_skill.py):
+```python
+    def find_wiki_cross_links(dialog):
+        flag = False
+        human_attributes = dialog.get("human", {}).get("attributes", {})
+        dff_shared_state = human_attributes.get("dff_shared_state", {"cross_states": {}, "cross_links": {}})
+        cross_links = dff_shared_state["cross_links"].get("dff_wiki_skill", {})
+        if cross_links:
+            flag = True
+        return flag
+```
+Then in skill\_selectors/rule\_based_selector/connector.py:
+```python
+    from common.wiki_skill import find_wiki_cross_links
+    ...  # some code
+    if find_wiki_cross_links(dialog):
+        skills_for_uttr.append("dff_wiki_skill")
+```
+
+4. The reverse transition (from dff\_wiki\_skill to dff\_animals_skill in our example) is made the same way.
+
+## Insert scenario parser into a dff skill
+
+```python
+    ...  # some imports
+    import json
+    from common.insert_scenario import start_or_continue_scenario, smalltalk_response, start_or_continue_facts, \
+        facts_response  # imports for scenario insertion
+
+    # place your config in the directory skills/your_dff_skill_name/{inserted_scenario_config_name}.json
+    # and load the config
+    with open(inserted_scenario_config_name, 'r') as fl:
+        topic_config = json.load(fl)
+
+    class State(Enum):
+        USR_START = auto()
+        #
+        ...  # States of your skill
+
+        # States for scenario insertion
+        SYS_INSERT_SMALLTALK = auto()
+        USR_INSERT_SMALLTALK = auto()
+        #
+        SYS_INSERT_FACT = auto()
+        USR_INSERT_FACT = auto()
+
+        ...  # Some other states of your skill
+
+    # Two request and two response functions for scenario insertion
+
+    def insert_scenario_smalltalk_request(ngrams, vars):
+        flag = start_or_continue_scenario(vars, topic_config)
+        logger.info(f"special_topic_request={flag}")
+        return flag
+
+    def insert_scenario_smalltalk_response(vars):
+        response = smalltalk_response(vars, topic_config)
+        return response
+
+    def insert_scenario_facts_request(ngrams, vars):
+        flag = start_or_continue_facts(vars, topic_config)
+        logger.info(f"special_topic_facts_request={flag}")
+        return flag
+
+    def insert_scenario_facts_response(vars):
+        response = facts_response(vars, topic_config)
+        return response
+
+    simplified_dialog_flow = dialogflow_extension.DFEasyFilling(State.USR_START)
+
+    ...  # Your state transitions
+
+    # State transitions for scenario insertion
+
+    simplified_dialog_flow.add_user_serial_transitions(
+        State.SOME_STATE,
+        {
+            ...
+            # transitions to other states
+            State.SYS_INSERT_SMALLTALK: insert_scenario_smalltalk_request,
+        },
+    )
+
+    simplified_dialog_flow.add_user_serial_transitions(
+        State.USR_INSERT_SMALLTALK,
+        {
+            State.SYS_INSERT_FACT: insert_scenario_facts_request,
+            State.SYS_INSERT_SMALLTALK: insert_scenario_smalltalk_request,
+            State.SOME_OTHER_YOUR_STATE: some_other_state_request,
+        },
+    )
+
+    simplified_dialog_flow.add_user_serial_transitions(
+        State.USR_INSERT_FACT,
+        {
+            State.SYS_INSERT_SMALLTALK: insert_scenario_smalltalk_request,
+            State.SYS_INSERT_FACT: insert_scenario_facts_request,
+            State.SOME_OTHER_YOUR_STATE: some_other_state_request,
+        },
+    )
+
+    simplified_dialog_flow.add_system_transition(State.SYS_INSERT_SMALLTALK, State.USR_INSERT_SMALLTALK,
+                                                 insert_scenario_smalltalk_response, )
+    simplified_dialog_flow.add_system_transition(State.SYS_INSERT_FACT, State.USR_INSERT_FACT,
+                                                 insert_scenario_facts_response, )
+
+    simplified_dialog_flow.set_error_successor(State.SYS_INSERT_SMALLTALK, State.SYS_ERR)
+    simplified_dialog_flow.set_error_successor(State.USR_INSERT_SMALLTALK, State.SYS_ERR)
+    simplified_dialog_flow.set_error_successor(State.SYS_INSERT_FACT, State.SYS_ERR)
+    simplified_dialog_flow.set_error_successor(State.USR_INSERT_FACT, State.SYS_ERR)
+```
\ No newline at end of file
diff --git a/skills/dff_generative_skill/common/.gitkeep b/skills/dff_generative_skill/common/.gitkeep
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/skills/dff_generative_skill/requirements.txt b/skills/dff_generative_skill/requirements.txt
new file mode 100644
index 0000000000..3c9b9661f6
--- /dev/null
+++ b/skills/dff_generative_skill/requirements.txt
@@ -0,0 +1,2 @@
+click==7.1.2
+nltk==3.5
diff --git a/skills/dff_generative_skill/scenario/main.py b/skills/dff_generative_skill/scenario/main.py
new file mode 100644
index 0000000000..1c3f80365d
--- /dev/null
+++ b/skills/dff_generative_skill/scenario/main.py
@@ -0,0 +1,29 @@
+import logging
+
+from df_engine.core.keywords import TRANSITIONS, RESPONSE
+from df_engine.core import Actor
+import df_engine.conditions as cnd
+import df_engine.labels as lbl
+
+from . import response as loc_rsp
+
+logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO)
+
+logger = logging.getLogger(__name__)
+
+flows = {
+    "generation": {
+        "start_node": {
+            RESPONSE: "",
+            TRANSITIONS: {"generative_response_node": cnd.true()},
+        },
+        "generative_response_node": {
+            RESPONSE: loc_rsp.generative_response,
+            TRANSITIONS: {lbl.repeat(): cnd.true()},
+        },
+    },
+}
+
+actor = Actor(
+    flows, start_label=("generation", "start_node"), fallback_node_label=("generation", "generative_response_node")
+)
diff --git a/skills/dff_generative_skill/scenario/response.py b/skills/dff_generative_skill/scenario/response.py
new file mode 100644
index 0000000000..ebf5fda1d1
--- /dev/null
+++ b/skills/dff_generative_skill/scenario/response.py
@@ -0,0 +1,74 @@
+import logging
+import requests
+import sentry_sdk
+from os import getenv
+from typing import Any
+
+import common.dff.integration.response as int_rsp
+import common.dff.integration.context as int_ctx
+from df_engine.core import Context, Actor
+from common.constants import CAN_CONTINUE_SCENARIO
+
+
+sentry_sdk.init(getenv("SENTRY_DSN"))
+logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO)
+logger = logging.getLogger(__name__)
+DIALOGPT_SERVICE_URL = getenv("DIALOGPT_SERVICE_URL")
+assert DIALOGPT_SERVICE_URL
+
+
+def compose_data_for_dialogpt(ctx, actor):
+    # collect up to the last three utterances (human, bot, human) as the generation context
+    data = []
+
+    human_uttrs = int_ctx.get_human_utterances(ctx, actor)
+    bot_uttrs = int_ctx.get_bot_utterances(ctx, actor)
+
+    if len(human_uttrs) > 1:
+        data += [{"speaker": human_uttrs[-2]["user"]["user_type"], "text": human_uttrs[-2]["text"]}]
+
+    if len(bot_uttrs) > 0:
+        data += [{"speaker": bot_uttrs[-1]["user"]["user_type"], "text": bot_uttrs[-1]["text"]}]
+    if len(human_uttrs) > 0:
+        data += [{"speaker": human_uttrs[-1]["user"]["user_type"], "text": human_uttrs[-1]["text"]}]
+
+    return data
+
+
+def generative_response(ctx: Context, actor: Actor, *args, **kwargs) -> Any:
+    curr_responses, curr_confidences, curr_human_attrs, curr_bot_attrs, curr_attrs = [], [], [], [], []
+
+    def gathering_responses(reply, confidence, human_attr, bot_attr, attr):
+        nonlocal curr_responses, curr_confidences, curr_human_attrs, curr_bot_attrs, curr_attrs
+        if reply and confidence:
+            curr_responses += [reply]
+            curr_confidences += [confidence]
+            curr_human_attrs += [human_attr]
+            curr_bot_attrs += [bot_attr]
+            curr_attrs += [attr]
+            logger.info(f"dff-generative-skill: {reply}")
+
+    request_data = compose_data_for_dialogpt(ctx, actor)
+    if len(request_data) > 0:
+        response = requests.post(DIALOGPT_SERVICE_URL, json={"dialog_contexts": [request_data]}, timeout=1.8)
+        hypotheses = response.json()["generated_responses"][0]
+    else:
+        hypotheses = []
+
+    for hyp in hypotheses:
+        # guard against empty hypotheses before checking the final character
+        if hyp and hyp[-1] not in [".", "?", "!"]:
+            hyp += "."
+ gathering_responses(hyp, 0.99, {}, {}, {"can_continue": CAN_CONTINUE_SCENARIO}) + + if len(curr_responses) == 0: + return "" + + return int_rsp.multi_response( + replies=curr_responses, + confidences=curr_confidences, + human_attr=curr_human_attrs, + bot_attr=curr_bot_attrs, + hype_attr=curr_attrs, + )(ctx, actor, *args, **kwargs) diff --git a/skills/dff_generative_skill/server.py b/skills/dff_generative_skill/server.py new file mode 100644 index 0000000000..3d8471361c --- /dev/null +++ b/skills/dff_generative_skill/server.py @@ -0,0 +1,114 @@ +#!/usr/bin/env python + +import logging +import time +import os +import random +import requests + +import sentry_sdk +from flask import Flask, request, jsonify +from healthcheck import HealthCheck +from sentry_sdk.integrations.logging import ignore_logger + +from common.dff.integration.actor import load_ctxs, get_response +from scenario.main import actor + +import test_server + + +ignore_logger("root") + +sentry_sdk.init(os.getenv("SENTRY_DSN")) +SERVICE_NAME = os.getenv("SERVICE_NAME") +SERVICE_PORT = int(os.getenv("SERVICE_PORT")) +RANDOM_SEED = int(os.getenv("RANDOM_SEED", 2718)) +DIALOGPT_SERVICE_URL = os.environ["DIALOGPT_SERVICE_URL"] + +logging.basicConfig(format="%(asctime)s - %(pathname)s - %(lineno)d - %(levelname)s - %(message)s", level=logging.DEBUG) +logger = logging.getLogger(__name__) + + +app = Flask(__name__) +health = HealthCheck(app, "/healthcheck") +logging.getLogger("werkzeug").setLevel("WARNING") + + +def is_container_running(): + try: + requested_data = [{"speaker": "human", "text": "привет"}] + response = requests.post(DIALOGPT_SERVICE_URL, json={"dialog_contexts": [requested_data]}, timeout=1) + if response.status_code == 200: + return True + except Exception as exc: + print(exc) + return False + return False + + +def handler(requested_data, random_seed=None): + st_time = time.time() + ctxs = load_ctxs(requested_data) + random_seed = requested_data.get("random_seed", random_seed) # for tests + + 
responses = [] + for ctx in ctxs: + try: + # for tests + if random_seed: + random.seed(int(random_seed)) + ctx = actor(ctx) + responses.append(get_response(ctx, actor)) + except Exception as exc: + sentry_sdk.capture_exception(exc) + logger.exception(exc) + responses.append(("", 0.0, {}, {}, {})) + + total_time = time.time() - st_time + logger.info(f"{SERVICE_NAME} exec time = {total_time:.3f}s") + return responses + + +while True: + result = is_container_running() + if result: + break + else: + time.sleep(5) + continue + +try: + test_server.run_test(handler) + logger.info("test query processed") +except Exception as exc: + sentry_sdk.capture_exception(exc) + logger.exception(exc) + raise exc + + +logger.info(f"{SERVICE_NAME} is loaded and ready") + +# import pathlib +# import json + +# for in_file in pathlib.Path("tests").glob("./*_in.json"): +# logger.error(in_file) +# test_in = json.load(in_file.open()) +# responses = handler(test_in, RANDOM_SEED) +# out_file = str(in_file).replace("in.json", "out.json") +# import common.test_utils as t_utils + +# t_utils.save_to_test(responses, out_file, indent=4) # TEST + + +@app.route("/respond", methods=["POST"]) +def respond(): + # import common.test_utils as t_utils; t_utils.save_to_test(request.json,"tests/lets_talk_in.json",indent=4) # TEST + # responses = handler(request.json, RANDOM_SEED) # TEST + # import common.test_utils as t_utils; t_utils.save_to_test(responses,"tests/lets_talk_out.json",indent=4) # TEST + responses = handler(request.json) + return jsonify(responses) + + +if __name__ == "__main__": + app.run(debug=False, host="0.0.0.0", port=SERVICE_PORT) diff --git a/skills/dff_generative_skill/test.sh b/skills/dff_generative_skill/test.sh new file mode 100755 index 0000000000..b37c67d44c --- /dev/null +++ b/skills/dff_generative_skill/test.sh @@ -0,0 +1,4 @@ +#!/bin/bash + + +python test_server.py diff --git a/skills/dff_generative_skill/test_server.py b/skills/dff_generative_skill/test_server.py new file mode 
100644 index 0000000000..5ceb78f9ef --- /dev/null +++ b/skills/dff_generative_skill/test_server.py @@ -0,0 +1,33 @@ +import requests +import os + +import common.test_utils as test_utils + + +SERVICE_PORT = int(os.getenv("SERVICE_PORT")) +RANDOM_SEED = int(os.getenv("RANDOM_SEED", 2718)) +URL = f"http://0.0.0.0:{SERVICE_PORT}/respond" + + +def handler(requested_data, random_seed): + hypothesis = requests.post(URL, json={**requested_data, "random_seed": random_seed}).json() + return hypothesis + + +def run_test(handler): + in_data, out_data = test_utils.get_dataset() + for test_name in in_data: + hypothesis = handler(in_data[test_name], RANDOM_SEED) + print(f"test name: {test_name}") + is_equal_flag, msg = test_utils.compare_structs(out_data[test_name], hypothesis, ignored_keys=["id"]) + if msg and len(msg.split("`")) == 5: + _, ground_truth_text, _, hypothesis_text, _ = msg.split("`") + is_equal_flag, ratio = test_utils.compare_text(ground_truth_text, hypothesis_text, 0.0) + if not is_equal_flag: + msg = f"{msg} ratio = {ratio}" + assert is_equal_flag, msg + print("Success") + + +if __name__ == "__main__": + run_test(handler) diff --git a/skills/dff_generative_skill/tests/.gitkeep b/skills/dff_generative_skill/tests/.gitkeep new file mode 100644 index 0000000000..e69de29bb2 diff --git a/skills/dff_generative_skill/tests/lets_talk_in.json b/skills/dff_generative_skill/tests/lets_talk_in.json new file mode 100644 index 0000000000..3c3a4e4860 --- /dev/null +++ b/skills/dff_generative_skill/tests/lets_talk_in.json @@ -0,0 +1,59 @@ +{ + "human_utter_index_batch": [ + 0 + ], + "dialog_batch": [ + { + "human_utterances": [ + { + "text": "привет! как дела?", + "user": { + "user_type": "human" + }, + "annotations": {} + }, + { + "text": "тоже хорошо. посоветуй мне фильм посмотреть.", + "user": { + "user_type": "human" + }, + "annotations": {} + } + ], + "bot_utterances": [ + { + "text": "отлично. 
а твои как?", + "user": { + "user_type": "bot" + }, + "annotations": {} + } + ] + } + ], + "dff_generative_skill_state_batch": [ + {} + ], + "dff_shared_state_batch": [ + { + "cross_states": {}, + "cross_links": {} + } + ], + "entities_batch": [ + {} + ], + "used_links_batch": [ + {} + ], + "age_group_batch": [ + "unknown" + ], + "disliked_skills_batch": [ + [] + ], + "clarification_request_flag_batch": [ + false + ], + "random_seed": 2718 +} \ No newline at end of file diff --git a/skills/dff_generative_skill/tests/lets_talk_out.json b/skills/dff_generative_skill/tests/lets_talk_out.json new file mode 100644 index 0000000000..04fcd615f2 --- /dev/null +++ b/skills/dff_generative_skill/tests/lets_talk_out.json @@ -0,0 +1,49 @@ +[ + [ + [ + "А я тебе советую книгу почитать.", + "\"Скотт Пилигрим против всех\"", + "не смотри.", + "\"Скотт Пилигрим против всех\"", + "не смотри." + ], + [ + 0.99, + 0.99, + 0.99, + 0.99, + 0.99 + ], + [ + {}, + {}, + {}, + {}, + {} + ], + [ + {}, + {}, + {}, + {}, + {} + ], + [ + { + "can_continue": "no" + }, + { + "can_continue": "no" + }, + { + "can_continue": "no" + }, + { + "can_continue": "no" + }, + { + "can_continue": "no" + } + ] + ] +] \ No newline at end of file diff --git a/skills/dff_grounding_skill/Dockerfile b/skills/dff_grounding_skill/Dockerfile index 998fc6c9f8..f28ea76a78 100644 --- a/skills/dff_grounding_skill/Dockerfile +++ b/skills/dff_grounding_skill/Dockerfile @@ -12,6 +12,9 @@ RUN pip install -r requirements.txt ARG SERVICE_NAME ENV SERVICE_NAME ${SERVICE_NAME} +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + COPY skills/${SERVICE_NAME}/requirements.txt . 
RUN pip install -r requirements.txt RUN python -m nltk.downloader wordnet diff --git a/skills/dff_grounding_skill/scenario/responses.py b/skills/dff_grounding_skill/scenario/responses.py index c86fcb5814..5813e239fe 100644 --- a/skills/dff_grounding_skill/scenario/responses.py +++ b/skills/dff_grounding_skill/scenario/responses.py @@ -40,6 +40,7 @@ logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) logger = logging.getLogger(__name__) +LANGUAGE = getenv("LANGUAGE", "EN") REPLY_TYPE = Tuple[str, float, dict, dict, dict] @@ -259,7 +260,7 @@ def ask_for_topic_after_two_no_in_a_row_to_linkto_response(ctx: Context) -> REPL confidence = 0.0 attr = {} if prev_was_linkto and prev_prev_was_linkto and human_is_no and prev_human_is_no: - offer = random.choice(GREETING_QUESTIONS["what_to_talk_about"]) + offer = random.choice(GREETING_QUESTIONS[LANGUAGE]["what_to_talk_about"]) topics_to_offer = ", ".join(sum(link_to_skill2key_words.values(), [])) reply = f"Okay then. {offer} {topics_to_offer}?" confidence = SUPER_CONF diff --git a/skills/dff_intent_responder_skill/Dockerfile b/skills/dff_intent_responder_skill/Dockerfile index 998fc6c9f8..4a4ec81350 100644 --- a/skills/dff_intent_responder_skill/Dockerfile +++ b/skills/dff_intent_responder_skill/Dockerfile @@ -9,9 +9,15 @@ RUN pip install -r requirements.txt # ###################### CUSTOM SECTION ###################################### # Here you can make changes +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + ARG SERVICE_NAME ENV SERVICE_NAME ${SERVICE_NAME} +ARG INTENT_RESPONSE_PHRASES_FNAME +ENV INTENT_RESPONSE_PHRASES_FNAME ${INTENT_RESPONSE_PHRASES_FNAME} + COPY skills/${SERVICE_NAME}/requirements.txt . 
RUN pip install -r requirements.txt RUN python -m nltk.downloader wordnet diff --git a/skills/dff_intent_responder_skill/scenario/data/intent_response_phrases_RU.json b/skills/dff_intent_responder_skill/scenario/data/intent_response_phrases_RU.json new file mode 100644 index 0000000000..8e8c70f281 --- /dev/null +++ b/skills/dff_intent_responder_skill/scenario/data/intent_response_phrases_RU.json @@ -0,0 +1,39 @@ +{ + "exit" : [ + "Пока-пока!", + "До скорого!", + "Из тебя получается отличный собеседник! Пока!", + "Ты меня вдохновляешь! Пока-пока!", + "Беседа с тобой сделала мою жизнь немного ярче! До скорого!", + "Спасибо за этот диалог! До скорого!", + "Мне очень понравилось с тобой говорить! До скорого!", + "Поболтать с тобой всегда в удовольствие! Пока-пока!", + "Ты - отличный друг! Пока!" + ], + "choose_topic" : [ + "Мне нравится болтать о разных вещах: книги, фильмы, игры.", + "Мне нравится кино и книги. Еще я люблю видео игры." + ], + "repeat" : [], + "where_are_you_from": [ + "Я же бот, я живу в облаке. Кстати, чтобы лучше узнать друг друга, нам надо побольше времени поболтать.", + "Я живу где-то на планете Земля. Но подробнее мне нельзя рассказывать." + ], + "who_made_you" : [ + "Меня создала команда Московского физико-технического института." + ], + "what_is_your_name": [ + "Меня зовут Dream." + ], + "what_is_your_job": [ + "Я чат-бот, и моя работа состоит в том, чтобы говорить с людьми. А у тебя есть какое-то занятие?", + "Я разговариваю с людьми, чтобы дать им возможность познакомиться с технологиями искусственного интеллекта. Это моя работа. А у тебя какая?", + "Я чат-бот, в котором используются технологии искусственного интеллекта. Я все время разговариваю с людьми. А у тебя какая основная деятельность?" + ], + "what_can_you_do": [ + "Я чат-бот, и я люблю говорить о фильмах, книгах, еде. Я стараюсь отвечать на вопросы правильно, но не гарантирую этого - ведь я еще учусь.", + "Я люблю говорить о разном. 
Еще я умею почти правильно отвечать на вопросы и даже давать рекомендации.", + "Каждый день я узнаю много нового. Я стараюсь делиться знаниями с собеседниками, но не всегда бываю права." + ], + "get_dialog_id": [] +} diff --git a/skills/dff_intent_responder_skill/scenario/response.py b/skills/dff_intent_responder_skill/scenario/response.py index b3ea1ec56e..7730a4c3f5 100644 --- a/skills/dff_intent_responder_skill/scenario/response.py +++ b/skills/dff_intent_responder_skill/scenario/response.py @@ -14,7 +14,7 @@ def intent_catcher_response(ctx: Context, actor: Actor, *args, **kwargs) -> str: intention, confidence = get_detected_intents(annotated_utterance) response = "" - if intention is not None and confidence > 0: + if intention is not None and confidence > 0 and intention in response_funcs.get_respond_funcs(): logger.debug(f"Intent is defined as {intention}") dialog = int_ctx.get_dialog(ctx, actor) dialog["seen"] = dialog["called_intents"][intention] @@ -65,7 +65,7 @@ def get_detected_intents(annotated_utterance): intents = get_intents(annotated_utterance) intent, confidence = None, 0.0 for key, value in intents.items(): - if value.get("detected", 0) == 1: + if value.get("detected", 0) == 1 and key in response_funcs.get_respond_funcs(): confidence_current = value.get("confidence", 0.0) if confidence_current > confidence: intent, confidence = key, confidence_current diff --git a/skills/dff_intent_responder_skill/scenario/response_funcs.py b/skills/dff_intent_responder_skill/scenario/response_funcs.py index 703534de14..2d09786d9c 100644 --- a/skills/dff_intent_responder_skill/scenario/response_funcs.py +++ b/skills/dff_intent_responder_skill/scenario/response_funcs.py @@ -1,53 +1,26 @@ #!/usr/bin/env python - +import logging import json import random -import common.dff.integration.context as int_ctx from datetime import datetime +from os import getenv + +import common.dff.integration.context as int_ctx from df_engine.core import Actor, Context 
-INTENT_RESPONSES_PATH = "scenario/data/intent_response_phrases.json" +INTENT_RESPONSE_PHRASES_FNAME = getenv("INTENT_RESPONSE_PHRASES_FNAME", "intent_response_phrases.json") +LANGUAGE = getenv("LANGUAGE", "EN") +logging.basicConfig(format="%(asctime)s - %(pathname)s - %(lineno)d - %(levelname)s - %(message)s", level=logging.DEBUG) +logger = logging.getLogger(__name__) +logger.info(f"Intent response phrases are from file: {INTENT_RESPONSE_PHRASES_FNAME}") -with open(INTENT_RESPONSES_PATH, "r") as fp: +with open(f"scenario/data/{INTENT_RESPONSE_PHRASES_FNAME}", "r") as fp: RESPONSES = json.load(fp) - -def exit_respond(ctx: Context, actor: Actor, intention: str): - response_phrases = RESPONSES[intention] - apology_bye_phrases = [ - "Sorry, have a great day!", - "Sorry to bother you, see you next time!", - "My bad. Have a great time!", - "Didn't mean to be rude. Talk to you next time.", - "Sorry for interrupting you. Talk to you soon.", - "Terribly sorry. Have a great day!", - "Thought you wanted to chat. My bad. See you soon!", - "Oh, sorry. Have a great day!", - ] - utts = get_human_utterances(ctx, actor) - response = random.choice(response_phrases).strip() # Neutral response - annotations = utts[-1]["annotations"] - - sentiment = int_ctx.get_human_sentiment(ctx, actor) - offensiveness, is_badlisted = "", False - try: - offensiveness = annotations["cobot_offensiveness"]["text"] - except KeyError: - offensiveness = "non-toxic" - try: - is_badlisted = annotations["cobot_offensiveness"]["is_badlisted"] == "badlist" - except KeyError: - is_badlisted = False - - if len(utts) < 4: - response = random.choice(apology_bye_phrases) - elif sentiment == "positive": - positive = ["I'm glad to help you! ", "Thanks for the chat! ", "Cool! 
"] - response = random.choice(positive) + response - elif offensiveness == "toxic" or is_badlisted or sentiment == "negative": - response = random.choice(apology_bye_phrases) - return response +WHERE_ARE_YOU_FROM = {"EN": "Where are you from?", "RU": "Откуда ты родом?"} +WHERE_ARE_YOU_NOW = {"EN": "What is your location?", "RU": "А где ты сейчас живешь?"} +DIDNOT_SAY_ANYTHING = {"EN": "I did not say anything!", "RU": "А я ничего и не говорила."} def repeat_respond(ctx: Context, actor: Actor, intention: str): @@ -67,7 +40,7 @@ def repeat_respond(ctx: Context, actor: Actor, intention: str): bot_utt = utterances_human[-2]["text"] else: bot_utt = "" - return bot_utt if len(bot_utt) > 0 else "I did not say anything!" + return bot_utt if len(bot_utt) > 0 else DIDNOT_SAY_ANYTHING[LANGUAGE] def where_are_you_from_respond(ctx: Context, actor: Actor, intention: str): @@ -79,11 +52,11 @@ def where_are_you_from_respond(ctx: Context, actor: Actor, intention: str): if human_profile_exists: already_known_user_property = dialog["human"]["profile"].get("homeland", None) if already_known_user_property is None: - response = random.choice(response_phrases).strip() + " Where are you from?" + response = f"{random.choice(response_phrases).strip()} {WHERE_ARE_YOU_FROM[LANGUAGE]}" else: already_known_user_property = dialog["human"]["profile"].get("location", None) if already_known_user_property is None: - response = random.choice(response_phrases).strip() + " What is your location?" 
+ response = f"{random.choice(response_phrases).strip()} {WHERE_ARE_YOU_NOW[LANGUAGE]}" else: response = random.choice(response_phrases).strip() return response @@ -135,7 +108,7 @@ def what_is_current_dialog_id_respond(ctx: Context, actor: Actor, intention: str def get_respond_funcs(): return { - "exit": exit_respond, + "exit": random_respond, "repeat": repeat_respond, "where_are_you_from": where_are_you_from_respond, "get_dialog_id": what_is_current_dialog_id_respond, @@ -145,7 +118,6 @@ def get_respond_funcs(): "what_can_you_do": random_respond, "what_time": what_time_respond, "dont_understand": random_respond, - # "stupid": random_respond, "choose_topic": random_respond, "cant_do": random_respond, "tell_me_a_story": random_respond, diff --git a/skills/dff_intent_responder_skill/test_server.py b/skills/dff_intent_responder_skill/test_server.py index a894b21b1e..1d52d97d6a 100644 --- a/skills/dff_intent_responder_skill/test_server.py +++ b/skills/dff_intent_responder_skill/test_server.py @@ -4,6 +4,7 @@ import common.test_utils as test_utils +INTENT_RESPONSE_PHRASES_FNAME = os.getenv("INTENT_RESPONSE_PHRASES_FNAME", "intent_response_phrases.json") SERVICE_PORT = int(os.getenv("SERVICE_PORT")) RANDOM_SEED = int(os.getenv("RANDOM_SEED", 2718)) URL = f"http://0.0.0.0:{SERVICE_PORT}/respond" @@ -17,6 +18,11 @@ def handler(requested_data, random_seed): def run_test(handler): in_data, out_data = test_utils.get_dataset() for test_name in in_data: + if "RU" in INTENT_RESPONSE_PHRASES_FNAME and "RU" not in test_name: + # if russian language, skip english tests + continue + elif "RU" not in INTENT_RESPONSE_PHRASES_FNAME and "RU" in test_name: + continue hypothesis = handler(in_data[test_name], RANDOM_SEED) print(f"test name: {test_name}") is_equal_flag, msg = test_utils.compare_structs(out_data[test_name], hypothesis, ignored_keys=["id"]) diff --git a/skills/dff_intent_responder_skill/tests/intent_choose_topic_RU_in.json 
b/skills/dff_intent_responder_skill/tests/intent_choose_topic_RU_in.json new file mode 100644 index 0000000000..b0d926d6b5 --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_choose_topic_RU_in.json @@ -0,0 +1,212 @@ +{ + "human_utter_index_batch": [ + 1 + ], + "dialog_batch": [ + { + "human_utterances": [ + { + "text": "\u043f\u0440\u0438\u0432\u0435\u0442", + "annotations": { + "spelling_preprocessing": "\u043f\u0440\u0438\u0432\u0435\u0442", + "entity_detection": {}, + "badlisted_words": { + "bad_words": false + }, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.9863496422767639, + "detected": 1 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 1, + "wiki_skill_entities_info": {} + } + } + }, + { + "text": "\u043e \u0447\u0435\u043c \u0442\u044b \u0445\u043e\u0447\u0435\u0448\u044c \u043f\u043e\u0433\u043e\u0432\u043e\u0440\u0438\u0442\u044c?", + "annotations": { + "spelling_preprocessing": "\u043e \u0447\u0435\u043c \u0442\u044b \u0445\u043e\u0447\u0435\u0448\u044c \u043f\u043e\u0433\u043e\u0432\u043e\u0440\u0438\u0442\u044c?", + "entity_detection": {}, + 
"intent_catcher": { + "choose_topic": { + "confidence": 1.0, + "detected": 1 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.0, + "detected": 0 + } + }, + "badlisted_words": { + "bad_words": false + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 2, + "wiki_skill_entities_info": {} + } + } + } + ], + "bot_utterances": [ + { + "text": "Hi, this is a DREAM Socialbot! 
How are you?", + "annotations": { + "ner": [ + [] + ] + }, + "active_skill": "dff_friendship_skill" + } + ], + "called_intents": { + "choose_topic": false, + "exit": false, + "lets_chat_about": false, + "no": false, + "repeat": false, + "topic_switching": false, + "what_are_you_talking_about": false, + "what_can_you_do": false, + "what_is_your_job": false, + "what_is_your_name": false, + "where_are_you_from": false, + "who_made_you": false, + "yes": true + }, + "dialog_id": "ea23796268648171fb9ec2b020b5184f" + } + ], + "dff_intent_responder_skill_state_batch": [ + {} + ], + "dff_shared_state_batch": [ + { + "cross_links": {}, + "cross_states": {} + } + ], + "entities_batch": [ + {} + ], + "used_links_batch": [ + {} + ], + "age_group_batch": [ + "" + ], + "disliked_skills_batch": [ + [] + ], + "clarification_request_flag_batch": [ + false + ] +} \ No newline at end of file diff --git a/skills/dff_intent_responder_skill/tests/intent_choose_topic_RU_out.json b/skills/dff_intent_responder_skill/tests/intent_choose_topic_RU_out.json new file mode 100644 index 0000000000..a5559fdffa --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_choose_topic_RU_out.json @@ -0,0 +1,48 @@ +[ + [ + "\u041c\u043d\u0435 \u043d\u0440\u0430\u0432\u0438\u0442\u0441\u044f \u0431\u043e\u043b\u0442\u0430\u0442\u044c \u043e \u0440\u0430\u0437\u043d\u044b\u0445 \u0432\u0435\u0449\u0430\u0445: \u043a\u043d\u0438\u0433\u0438, \u0444\u0438\u043b\u044c\u043c\u044b, \u0438\u0433\u0440\u044b. 
#+#choose_topic", + 1.0, + { + "dff_intent_responder_skill_state": { + "shared_memory": {}, + "previous_human_utter_index": 1, + "history": { + "1": [ + "context_driven_response", + "intent_catcher" + ] + }, + "current_turn_dff_suspended": false, + "context": { + "id": "e8058993-7da8-4188-8ba7-137b2915ea4f", + "labels": { + "0": [ + "context_driven_response", + "intent_catcher" + ] + }, + "requests": { + "0": "\u043e \u0447\u0435\u043c \u0442\u044b \u0445\u043e\u0447\u0435\u0448\u044c \u043f\u043e\u0433\u043e\u0432\u043e\u0440\u0438\u0442\u044c?" + }, + "responses": { + "0": "\u041c\u043d\u0435 \u043d\u0440\u0430\u0432\u0438\u0442\u0441\u044f \u0431\u043e\u043b\u0442\u0430\u0442\u044c \u043e \u0440\u0430\u0437\u043d\u044b\u0445 \u0432\u0435\u0449\u0430\u0445: \u043a\u043d\u0438\u0433\u0438, \u0444\u0438\u043b\u044c\u043c\u044b, \u0438\u0433\u0440\u044b. #+#choose_topic" + }, + "misc": {}, + "validation": false, + "actor_state": {} + } + }, + "dff_shared_state": { + "cross_links": {}, + "cross_states": {} + }, + "used_links": {}, + "age_group": "", + "disliked_skills": [] + }, + {}, + { + "can_continue": "can" + } + ] +] \ No newline at end of file diff --git a/skills/dff_intent_responder_skill/tests/intent_exit_RU_in.json b/skills/dff_intent_responder_skill/tests/intent_exit_RU_in.json new file mode 100644 index 0000000000..7a39f1c4e5 --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_exit_RU_in.json @@ -0,0 +1,212 @@ +{ + "human_utter_index_batch": [ + 1 + ], + "dialog_batch": [ + { + "human_utterances": [ + { + "text": "\u043f\u0440\u0438\u0432\u0435\u0442", + "annotations": { + "spelling_preprocessing": "\u043f\u0440\u0438\u0432\u0435\u0442", + "entity_detection": {}, + "badlisted_words": { + "bad_words": false + }, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, 
+ "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.9863496422767639, + "detected": 1 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 1, + "wiki_skill_entities_info": {} + } + } + }, + { + "text": "\u043f\u043e\u043a\u0430!", + "annotations": { + "spelling_preprocessing": "\u043f\u043e\u043a\u0430!", + "entity_detection": {}, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 1.0, + "detected": 1 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.0, + "detected": 0 + } + }, + "badlisted_words": { + "bad_words": false + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + 
"topic_skill_entities_info": {}, + "utt_num": 2, + "wiki_skill_entities_info": {} + } + } + } + ], + "bot_utterances": [ + { + "text": "Hi, this is a DREAM Socialbot! How are things?", + "annotations": { + "ner": [ + [] + ] + }, + "active_skill": "dff_friendship_skill" + } + ], + "called_intents": { + "choose_topic": false, + "exit": false, + "lets_chat_about": false, + "no": false, + "repeat": false, + "topic_switching": false, + "what_are_you_talking_about": false, + "what_can_you_do": false, + "what_is_your_job": false, + "what_is_your_name": false, + "where_are_you_from": false, + "who_made_you": false, + "yes": true + }, + "dialog_id": "533d72a5aea4c3c8e0b2b695fd6fb580" + } + ], + "dff_intent_responder_skill_state_batch": [ + {} + ], + "dff_shared_state_batch": [ + { + "cross_links": {}, + "cross_states": {} + } + ], + "entities_batch": [ + {} + ], + "used_links_batch": [ + {} + ], + "age_group_batch": [ + "" + ], + "disliked_skills_batch": [ + [] + ], + "clarification_request_flag_batch": [ + false + ] +} \ No newline at end of file diff --git a/skills/dff_intent_responder_skill/tests/intent_exit_RU_out.json b/skills/dff_intent_responder_skill/tests/intent_exit_RU_out.json new file mode 100644 index 0000000000..e630eb85d1 --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_exit_RU_out.json @@ -0,0 +1,48 @@ +[ + [ + "\u0422\u044b \u043c\u0435\u043d\u044f \u0432\u0434\u043e\u0445\u043d\u043e\u0432\u043b\u044f\u0435\u0448\u044c! \u041f\u043e\u043a\u0430-\u043f\u043e\u043a\u0430! #+#exit", + 1.0, + { + "dff_intent_responder_skill_state": { + "shared_memory": {}, + "previous_human_utter_index": 1, + "history": { + "1": [ + "context_driven_response", + "intent_catcher" + ] + }, + "current_turn_dff_suspended": false, + "context": { + "id": "3ca6c03a-c352-4074-92f1-6e18d70baca9", + "labels": { + "0": [ + "context_driven_response", + "intent_catcher" + ] + }, + "requests": { + "0": "\u043f\u043e\u043a\u0430!" 
+ }, + "responses": { + "0": "\u0422\u044b \u043c\u0435\u043d\u044f \u0432\u0434\u043e\u0445\u043d\u043e\u0432\u043b\u044f\u0435\u0448\u044c! \u041f\u043e\u043a\u0430-\u043f\u043e\u043a\u0430! #+#exit" + }, + "misc": {}, + "validation": false, + "actor_state": {} + } + }, + "dff_shared_state": { + "cross_links": {}, + "cross_states": {} + }, + "used_links": {}, + "age_group": "", + "disliked_skills": [] + }, + {}, + { + "can_continue": "can" + } + ] +] \ No newline at end of file diff --git a/skills/dff_intent_responder_skill/tests/intent_exit_out.json b/skills/dff_intent_responder_skill/tests/intent_exit_out.json index 54cc9622d8..2ef54383d8 100644 --- a/skills/dff_intent_responder_skill/tests/intent_exit_out.json +++ b/skills/dff_intent_responder_skill/tests/intent_exit_out.json @@ -1,6 +1,6 @@ [ [ - "Sorry, have a great day! #+#exit", + "I'm inspired by you. Bye-bye! #+#exit", 1.0, { "dff_intent_responder_skill_state": { @@ -25,7 +25,7 @@ "0": "stop this bot." }, "responses": { - "0": "Sorry, have a great day! #+#exit" + "0": "I'm inspired by you. Bye-bye! 
#+#exit" }, "misc": {}, "validation": false, diff --git a/skills/dff_intent_responder_skill/tests/intent_what_can_you_do_RU_in.json b/skills/dff_intent_responder_skill/tests/intent_what_can_you_do_RU_in.json new file mode 100644 index 0000000000..9a101ccccf --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_what_can_you_do_RU_in.json @@ -0,0 +1,212 @@ +{ + "human_utter_index_batch": [ + 1 + ], + "dialog_batch": [ + { + "human_utterances": [ + { + "text": "\u043f\u0440\u0438\u0432\u0435\u0442", + "annotations": { + "spelling_preprocessing": "\u043f\u0440\u0438\u0432\u0435\u0442", + "entity_detection": {}, + "badlisted_words": { + "bad_words": false + }, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.9863496422767639, + "detected": 1 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 1, + "wiki_skill_entities_info": {} + } + } + }, + { + "text": "\u0447\u0442\u043e \u0442\u044b \u0443\u043c\u0435\u0435\u0448\u044c?", + "annotations": { + "spelling_preprocessing": "\u0447\u0442\u043e \u0442\u044b \u0443\u043c\u0435\u0435\u0448\u044c?", + "intent_catcher": { + "choose_topic": { 
+ "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 1.0, + "detected": 1 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.0, + "detected": 0 + } + }, + "entity_detection": {}, + "badlisted_words": { + "bad_words": false + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 2, + "wiki_skill_entities_info": {} + } + } + } + ], + "bot_utterances": [ + { + "text": "Hi, this is a DREAM Socialbot! 
How are you doing today?", + "annotations": { + "ner": [ + [] + ] + }, + "active_skill": "dff_friendship_skill" + } + ], + "called_intents": { + "choose_topic": false, + "exit": false, + "lets_chat_about": false, + "no": false, + "repeat": false, + "topic_switching": false, + "what_are_you_talking_about": false, + "what_can_you_do": false, + "what_is_your_job": false, + "what_is_your_name": false, + "where_are_you_from": false, + "who_made_you": false, + "yes": true + }, + "dialog_id": "9fa4d4743fbf557c0ab045409af45857" + } + ], + "dff_intent_responder_skill_state_batch": [ + {} + ], + "dff_shared_state_batch": [ + { + "cross_links": {}, + "cross_states": {} + } + ], + "entities_batch": [ + {} + ], + "used_links_batch": [ + {} + ], + "age_group_batch": [ + "" + ], + "disliked_skills_batch": [ + [] + ], + "clarification_request_flag_batch": [ + false + ] +} \ No newline at end of file diff --git a/skills/dff_intent_responder_skill/tests/intent_what_can_you_do_RU_out.json b/skills/dff_intent_responder_skill/tests/intent_what_can_you_do_RU_out.json new file mode 100644 index 0000000000..d281d986b3 --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_what_can_you_do_RU_out.json @@ -0,0 +1,48 @@ +[ + [ + "\u042f \u0447\u0430\u0442-\u0431\u043e\u0442, \u0438 \u044f \u043b\u044e\u0431\u043b\u044e \u0433\u043e\u0432\u043e\u0440\u0438\u0442\u044c \u043e \u0444\u0438\u043b\u044c\u043c\u0430\u0445, \u043a\u043d\u0438\u0433\u0430\u0445, \u0435\u0434\u0435. \u042f \u0441\u0442\u0430\u0440\u0430\u044e\u0441\u044c \u043e\u0442\u0432\u0435\u0447\u0430\u0442\u044c \u043d\u0430 \u0432\u043e\u043f\u0440\u043e\u0441\u044b \u043f\u0440\u0430\u0432\u0438\u043b\u044c\u043d\u043e, \u043d\u043e \u043d\u0435 \u0433\u0430\u0440\u0430\u043d\u0442\u0438\u0440\u0443\u044e \u044d\u0442\u043e\u0433\u043e - \u0432\u0435\u0434\u044c \u044f \u0435\u0449\u0435 \u0443\u0447\u0443\u0441\u044c. 
#+#what_can_you_do", + 1.0, + { + "dff_intent_responder_skill_state": { + "shared_memory": {}, + "previous_human_utter_index": 1, + "history": { + "1": [ + "context_driven_response", + "intent_catcher" + ] + }, + "current_turn_dff_suspended": false, + "context": { + "id": "2dd0881f-26da-49dd-a8ed-b00f0b64442a", + "labels": { + "0": [ + "context_driven_response", + "intent_catcher" + ] + }, + "requests": { + "0": "\u0447\u0442\u043e \u0442\u044b \u0443\u043c\u0435\u0435\u0448\u044c?" + }, + "responses": { + "0": "\u042f \u0447\u0430\u0442-\u0431\u043e\u0442, \u0438 \u044f \u043b\u044e\u0431\u043b\u044e \u0433\u043e\u0432\u043e\u0440\u0438\u0442\u044c \u043e \u0444\u0438\u043b\u044c\u043c\u0430\u0445, \u043a\u043d\u0438\u0433\u0430\u0445, \u0435\u0434\u0435. \u042f \u0441\u0442\u0430\u0440\u0430\u044e\u0441\u044c \u043e\u0442\u0432\u0435\u0447\u0430\u0442\u044c \u043d\u0430 \u0432\u043e\u043f\u0440\u043e\u0441\u044b \u043f\u0440\u0430\u0432\u0438\u043b\u044c\u043d\u043e, \u043d\u043e \u043d\u0435 \u0433\u0430\u0440\u0430\u043d\u0442\u0438\u0440\u0443\u044e \u044d\u0442\u043e\u0433\u043e - \u0432\u0435\u0434\u044c \u044f \u0435\u0449\u0435 \u0443\u0447\u0443\u0441\u044c. 
#+#what_can_you_do" + }, + "misc": {}, + "validation": false, + "actor_state": {} + } + }, + "dff_shared_state": { + "cross_links": {}, + "cross_states": {} + }, + "used_links": {}, + "age_group": "", + "disliked_skills": [] + }, + {}, + { + "can_continue": "can" + } + ] +] \ No newline at end of file diff --git a/skills/dff_intent_responder_skill/tests/intent_what_is_your_job_RU_in.json b/skills/dff_intent_responder_skill/tests/intent_what_is_your_job_RU_in.json new file mode 100644 index 0000000000..b13b73437b --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_what_is_your_job_RU_in.json @@ -0,0 +1,212 @@ +{ + "human_utter_index_batch": [ + 1 + ], + "dialog_batch": [ + { + "human_utterances": [ + { + "text": "\u043f\u0440\u0438\u0432\u0435\u0442", + "annotations": { + "spelling_preprocessing": "\u043f\u0440\u0438\u0432\u0435\u0442", + "entity_detection": {}, + "badlisted_words": { + "bad_words": false + }, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.9863496422767639, + "detected": 1 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 1, + "wiki_skill_entities_info": {} + 
} + } + }, + { + "text": "\u043a\u0435\u043c \u0442\u044b \u0440\u0430\u0431\u043e\u0442\u0430\u0435\u0448\u044c?", + "annotations": { + "spelling_preprocessing": "\u043a\u0435\u043c \u0442\u044b \u0440\u0430\u0431\u043e\u0442\u0430\u0435\u0448\u044c?", + "entity_detection": {}, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 1.0, + "detected": 1 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.0, + "detected": 0 + } + }, + "badlisted_words": { + "bad_words": false + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 2, + "wiki_skill_entities_info": {} + } + } + } + ], + "bot_utterances": [ + { + "text": "Hi, this is a DREAM Socialbot! 
How are things?", + "annotations": { + "ner": [ + [] + ] + }, + "active_skill": "dff_friendship_skill" + } + ], + "called_intents": { + "choose_topic": false, + "exit": false, + "lets_chat_about": false, + "no": false, + "repeat": false, + "topic_switching": false, + "what_are_you_talking_about": false, + "what_can_you_do": false, + "what_is_your_job": false, + "what_is_your_name": false, + "where_are_you_from": false, + "who_made_you": false, + "yes": true + }, + "dialog_id": "916fff756280e77cc547e649656f0be2" + } + ], + "dff_intent_responder_skill_state_batch": [ + {} + ], + "dff_shared_state_batch": [ + { + "cross_links": {}, + "cross_states": {} + } + ], + "entities_batch": [ + {} + ], + "used_links_batch": [ + {} + ], + "age_group_batch": [ + "" + ], + "disliked_skills_batch": [ + [] + ], + "clarification_request_flag_batch": [ + false + ] +} \ No newline at end of file diff --git a/skills/dff_intent_responder_skill/tests/intent_what_is_your_job_RU_out.json b/skills/dff_intent_responder_skill/tests/intent_what_is_your_job_RU_out.json new file mode 100644 index 0000000000..1cc6ec8adb --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_what_is_your_job_RU_out.json @@ -0,0 +1,48 @@ +[ + [ + "\u042f \u0447\u0430\u0442-\u0431\u043e\u0442, \u0438 \u043c\u043e\u044f \u0440\u0430\u0431\u043e\u0442\u0430 \u0441\u043e\u0441\u0442\u043e\u0438\u0442 \u0432 \u0442\u043e\u043c, \u0447\u0442\u043e\u0431\u044b \u0433\u043e\u0432\u043e\u0440\u0438\u0442\u044c \u0441 \u043b\u044e\u0434\u044c\u043c\u0438. \u0410 \u0443 \u0442\u0435\u0431\u044f \u0435\u0441\u0442\u044c \u043a\u0430\u043a\u043e\u0435-\u0442\u043e \u0437\u0430\u043d\u044f\u0442\u0438\u0435? 
#+#what_is_your_job", + 1.0, + { + "dff_intent_responder_skill_state": { + "shared_memory": {}, + "previous_human_utter_index": 1, + "history": { + "1": [ + "context_driven_response", + "intent_catcher" + ] + }, + "current_turn_dff_suspended": false, + "context": { + "id": "fccef377-cdac-4a7e-8e79-35320b9d18df", + "labels": { + "0": [ + "context_driven_response", + "intent_catcher" + ] + }, + "requests": { + "0": "\u043a\u0435\u043c \u0442\u044b \u0440\u0430\u0431\u043e\u0442\u0430\u0435\u0448\u044c?" + }, + "responses": { + "0": "\u042f \u0447\u0430\u0442-\u0431\u043e\u0442, \u0438 \u043c\u043e\u044f \u0440\u0430\u0431\u043e\u0442\u0430 \u0441\u043e\u0441\u0442\u043e\u0438\u0442 \u0432 \u0442\u043e\u043c, \u0447\u0442\u043e\u0431\u044b \u0433\u043e\u0432\u043e\u0440\u0438\u0442\u044c \u0441 \u043b\u044e\u0434\u044c\u043c\u0438. \u0410 \u0443 \u0442\u0435\u0431\u044f \u0435\u0441\u0442\u044c \u043a\u0430\u043a\u043e\u0435-\u0442\u043e \u0437\u0430\u043d\u044f\u0442\u0438\u0435? #+#what_is_your_job" + }, + "misc": {}, + "validation": false, + "actor_state": {} + } + }, + "dff_shared_state": { + "cross_links": {}, + "cross_states": {} + }, + "used_links": {}, + "age_group": "", + "disliked_skills": [] + }, + {}, + { + "can_continue": "can" + } + ] +] \ No newline at end of file diff --git a/skills/dff_intent_responder_skill/tests/intent_what_is_your_name_RU_in.json b/skills/dff_intent_responder_skill/tests/intent_what_is_your_name_RU_in.json new file mode 100644 index 0000000000..a05eb9b8c1 --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_what_is_your_name_RU_in.json @@ -0,0 +1,212 @@ +{ + "human_utter_index_batch": [ + 1 + ], + "dialog_batch": [ + { + "human_utterances": [ + { + "text": "\u043f\u0440\u0438\u0432\u0435\u0442", + "annotations": { + "spelling_preprocessing": "\u043f\u0440\u0438\u0432\u0435\u0442", + "entity_detection": {}, + "badlisted_words": { + "bad_words": false + }, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + 
"detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.9863496422767639, + "detected": 1 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 1, + "wiki_skill_entities_info": {} + } + } + }, + { + "text": "\u043a\u0430\u043a \u0442\u0435\u0431\u044f \u0437\u043e\u0432\u0443\u0442?", + "annotations": { + "spelling_preprocessing": "\u043a\u0430\u043a \u0442\u0435\u0431\u044f \u0437\u043e\u0432\u0443\u0442?", + "entity_detection": {}, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 1.0, + "detected": 1 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + 
"confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.0, + "detected": 0 + } + }, + "badlisted_words": { + "bad_words": false + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 2, + "wiki_skill_entities_info": {} + } + } + } + ], + "bot_utterances": [ + { + "text": "Hi, this is a DREAM Socialbot! How is the day going so far for you?", + "annotations": { + "ner": [ + [] + ] + }, + "active_skill": "dff_friendship_skill" + } + ], + "called_intents": { + "choose_topic": false, + "exit": false, + "lets_chat_about": false, + "no": false, + "repeat": false, + "topic_switching": false, + "what_are_you_talking_about": false, + "what_can_you_do": false, + "what_is_your_job": false, + "what_is_your_name": false, + "where_are_you_from": false, + "who_made_you": false, + "yes": true + }, + "dialog_id": "b19257643ddea5761f4f2332139cc2ac" + } + ], + "dff_intent_responder_skill_state_batch": [ + {} + ], + "dff_shared_state_batch": [ + { + "cross_links": {}, + "cross_states": {} + } + ], + "entities_batch": [ + {} + ], + "used_links_batch": [ + {} + ], + "age_group_batch": [ + "" + ], + "disliked_skills_batch": [ + [] + ], + "clarification_request_flag_batch": [ + false + ] +} \ No newline at end of file diff --git a/skills/dff_intent_responder_skill/tests/intent_what_is_your_name_RU_out.json b/skills/dff_intent_responder_skill/tests/intent_what_is_your_name_RU_out.json new file mode 100644 index 0000000000..fec8c1f544 --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_what_is_your_name_RU_out.json @@ -0,0 +1,48 @@ +[ + [ + "\u041c\u0435\u043d\u044f \u0437\u043e\u0432\u0443\u0442 Dream. 
#+#what_is_your_name", + 1.0, + { + "dff_intent_responder_skill_state": { + "shared_memory": {}, + "previous_human_utter_index": 1, + "history": { + "1": [ + "context_driven_response", + "intent_catcher" + ] + }, + "current_turn_dff_suspended": false, + "context": { + "id": "4825340f-c3b6-43fe-ac25-0a7aaa6de046", + "labels": { + "0": [ + "context_driven_response", + "intent_catcher" + ] + }, + "requests": { + "0": "\u043a\u0430\u043a \u0442\u0435\u0431\u044f \u0437\u043e\u0432\u0443\u0442?" + }, + "responses": { + "0": "\u041c\u0435\u043d\u044f \u0437\u043e\u0432\u0443\u0442 Dream. #+#what_is_your_name" + }, + "misc": {}, + "validation": false, + "actor_state": {} + } + }, + "dff_shared_state": { + "cross_links": {}, + "cross_states": {} + }, + "used_links": {}, + "age_group": "", + "disliked_skills": [] + }, + {}, + { + "can_continue": "can" + } + ] +] \ No newline at end of file diff --git a/skills/dff_intent_responder_skill/tests/intent_where_are_you_from_RU_in.json b/skills/dff_intent_responder_skill/tests/intent_where_are_you_from_RU_in.json new file mode 100644 index 0000000000..0c7264e52e --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_where_are_you_from_RU_in.json @@ -0,0 +1,212 @@ +{ + "human_utter_index_batch": [ + 1 + ], + "dialog_batch": [ + { + "human_utterances": [ + { + "text": "\u043f\u0440\u0438\u0432\u0435\u0442", + "annotations": { + "spelling_preprocessing": "\u043f\u0440\u0438\u0432\u0435\u0442", + "entity_detection": {}, + "badlisted_words": { + "bad_words": false + }, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + 
"what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.9863496422767639, + "detected": 1 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 1, + "wiki_skill_entities_info": {} + } + } + }, + { + "text": "\u043e\u0442\u043a\u0443\u0434\u0430 \u0442\u044b \u0440\u043e\u0434\u043e\u043c?", + "annotations": { + "spelling_preprocessing": "\u043e\u0442\u043a\u0443\u0434\u0430 \u0442\u044b \u0440\u043e\u0434\u043e\u043c?", + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 1.0, + "detected": 1 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.0, + "detected": 0 + } + }, + "entity_detection": {}, + "badlisted_words": { + "bad_words": false + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 2, + "wiki_skill_entities_info": {} + } + } + } + ], + 
"bot_utterances": [ + { + "text": "Hi, this is a DREAM Socialbot! How are things?", + "annotations": { + "ner": [ + [] + ] + }, + "active_skill": "dff_friendship_skill" + } + ], + "called_intents": { + "choose_topic": false, + "exit": false, + "lets_chat_about": false, + "no": false, + "repeat": false, + "topic_switching": false, + "what_are_you_talking_about": false, + "what_can_you_do": false, + "what_is_your_job": false, + "what_is_your_name": false, + "where_are_you_from": false, + "who_made_you": false, + "yes": true + }, + "dialog_id": "dfde93ebdc28d9be69b6059ea16e1c7b" + } + ], + "dff_intent_responder_skill_state_batch": [ + {} + ], + "dff_shared_state_batch": [ + { + "cross_links": {}, + "cross_states": {} + } + ], + "entities_batch": [ + {} + ], + "used_links_batch": [ + {} + ], + "age_group_batch": [ + "" + ], + "disliked_skills_batch": [ + [] + ], + "clarification_request_flag_batch": [ + false + ] +} \ No newline at end of file diff --git a/skills/dff_intent_responder_skill/tests/intent_where_are_you_from_RU_out.json b/skills/dff_intent_responder_skill/tests/intent_where_are_you_from_RU_out.json new file mode 100644 index 0000000000..6124612978 --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_where_are_you_from_RU_out.json @@ -0,0 +1,48 @@ +[ + [ + "\u042f \u0436\u0435 \u0431\u043e\u0442, \u044f \u0436\u0438\u0432\u0443 \u0432 \u043e\u0431\u043b\u0430\u043a\u0435. \u041a\u0441\u0442\u0430\u0442\u0438, \u0447\u0442\u043e\u0431\u044b \u043b\u0443\u0447\u0448\u0435 \u0443\u0437\u043d\u0430\u0442\u044c \u0434\u0440\u0443\u0433 \u0434\u0440\u0443\u0433\u0430, \u043d\u0430\u043c \u043d\u0430\u0434\u043e \u043f\u043e\u0431\u043e\u043b\u044c\u0448\u0435 \u0432\u0440\u0435\u043c\u0435\u043d\u0438 \u043f\u043e\u0431\u043e\u043b\u0442\u0430\u0442\u044c. Where are you from? 
#+#where_are_you_from", + 1.0, + { + "dff_intent_responder_skill_state": { + "shared_memory": {}, + "previous_human_utter_index": 1, + "history": { + "1": [ + "context_driven_response", + "intent_catcher" + ] + }, + "current_turn_dff_suspended": false, + "context": { + "id": "49ecbd72-c4ff-4e1c-a9c6-be298c57b7f4", + "labels": { + "0": [ + "context_driven_response", + "intent_catcher" + ] + }, + "requests": { + "0": "\u043e\u0442\u043a\u0443\u0434\u0430 \u0442\u044b \u0440\u043e\u0434\u043e\u043c?" + }, + "responses": { + "0": "\u042f \u0436\u0435 \u0431\u043e\u0442, \u044f \u0436\u0438\u0432\u0443 \u0432 \u043e\u0431\u043b\u0430\u043a\u0435. \u041a\u0441\u0442\u0430\u0442\u0438, \u0447\u0442\u043e\u0431\u044b \u043b\u0443\u0447\u0448\u0435 \u0443\u0437\u043d\u0430\u0442\u044c \u0434\u0440\u0443\u0433 \u0434\u0440\u0443\u0433\u0430, \u043d\u0430\u043c \u043d\u0430\u0434\u043e \u043f\u043e\u0431\u043e\u043b\u044c\u0448\u0435 \u0432\u0440\u0435\u043c\u0435\u043d\u0438 \u043f\u043e\u0431\u043e\u043b\u0442\u0430\u0442\u044c. Where are you from? 
#+#where_are_you_from" + }, + "misc": {}, + "validation": false, + "actor_state": {} + } + }, + "dff_shared_state": { + "cross_links": {}, + "cross_states": {} + }, + "used_links": {}, + "age_group": "", + "disliked_skills": [] + }, + {}, + { + "can_continue": "can" + } + ] +] \ No newline at end of file diff --git a/skills/dff_intent_responder_skill/tests/intent_who_made_you_RU_in.json b/skills/dff_intent_responder_skill/tests/intent_who_made_you_RU_in.json new file mode 100644 index 0000000000..4fa7f7d0aa --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_who_made_you_RU_in.json @@ -0,0 +1,212 @@ +{ + "human_utter_index_batch": [ + 1 + ], + "dialog_batch": [ + { + "human_utterances": [ + { + "text": "\u043f\u0440\u0438\u0432\u0435\u0442", + "annotations": { + "spelling_preprocessing": "\u043f\u0440\u0438\u0432\u0435\u0442", + "entity_detection": {}, + "badlisted_words": { + "bad_words": false + }, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 0.0, + "detected": 0 + }, + "yes": { + "confidence": 0.9863496422767639, + "detected": 1 + } + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 1, + "wiki_skill_entities_info": {} + } + } + 
}, + { + "text": "\u043a\u0442\u043e \u0442\u0435\u0431\u044f \u0441\u043e\u0437\u0434\u0430\u043b", + "annotations": { + "spelling_preprocessing": "\u043a\u0442\u043e \u0442\u0435\u0431\u044f \u0441\u043e\u0437\u0434\u0430\u043b", + "entity_detection": {}, + "intent_catcher": { + "choose_topic": { + "confidence": 0.0, + "detected": 0 + }, + "exit": { + "confidence": 0.0, + "detected": 0 + }, + "lets_chat_about": { + "confidence": 0.0, + "detected": 0 + }, + "no": { + "confidence": 0.0, + "detected": 0 + }, + "repeat": { + "confidence": 0.0, + "detected": 0 + }, + "topic_switching": { + "confidence": 0.0, + "detected": 0 + }, + "what_are_you_talking_about": { + "confidence": 0.0, + "detected": 0 + }, + "what_can_you_do": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_job": { + "confidence": 0.0, + "detected": 0 + }, + "what_is_your_name": { + "confidence": 0.0, + "detected": 0 + }, + "where_are_you_from": { + "confidence": 0.0, + "detected": 0 + }, + "who_made_you": { + "confidence": 1.0, + "detected": 1 + }, + "yes": { + "confidence": 0.0, + "detected": 0 + } + }, + "badlisted_words": { + "bad_words": false + }, + "ner": [ + [] + ], + "entity_linking": [], + "wiki_parser": { + "animals_skill_entities_info": {}, + "entities_info": {}, + "topic_skill_entities_info": {}, + "utt_num": 2, + "wiki_skill_entities_info": {} + } + } + } + ], + "bot_utterances": [ + { + "text": "Hi, this is a DREAM Socialbot! 
How is the day going so far for you?", + "annotations": { + "ner": [ + [] + ] + }, + "active_skill": "dff_friendship_skill" + } + ], + "called_intents": { + "choose_topic": false, + "exit": false, + "lets_chat_about": false, + "no": false, + "repeat": false, + "topic_switching": false, + "what_are_you_talking_about": false, + "what_can_you_do": false, + "what_is_your_job": false, + "what_is_your_name": false, + "where_are_you_from": false, + "who_made_you": false, + "yes": true + }, + "dialog_id": "ee0cb9f45078a07d938e1d3d5754f921" + } + ], + "dff_intent_responder_skill_state_batch": [ + {} + ], + "dff_shared_state_batch": [ + { + "cross_links": {}, + "cross_states": {} + } + ], + "entities_batch": [ + {} + ], + "used_links_batch": [ + {} + ], + "age_group_batch": [ + "" + ], + "disliked_skills_batch": [ + [] + ], + "clarification_request_flag_batch": [ + false + ] +} \ No newline at end of file diff --git a/skills/dff_intent_responder_skill/tests/intent_who_made_you_RU_out.json b/skills/dff_intent_responder_skill/tests/intent_who_made_you_RU_out.json new file mode 100644 index 0000000000..2323e7ef52 --- /dev/null +++ b/skills/dff_intent_responder_skill/tests/intent_who_made_you_RU_out.json @@ -0,0 +1,48 @@ +[ + [ + "\u041c\u0435\u043d\u044f \u0441\u043e\u0437\u0434\u0430\u043b\u0430 \u043a\u043e\u043c\u0430\u043d\u0434\u0430 \u041c\u043e\u0441\u043a\u043e\u0432\u0441\u043a\u043e\u0433\u043e \u0444\u0438\u0437\u0438\u043a\u043e-\u0442\u0435\u0445\u043d\u0438\u0447\u0435\u0441\u043a\u043e\u0433\u043e \u0438\u043d\u0441\u0442\u0438\u0442\u0443\u0442\u0430. 
#+#who_made_you", + 1.0, + { + "dff_intent_responder_skill_state": { + "shared_memory": {}, + "previous_human_utter_index": 1, + "history": { + "1": [ + "context_driven_response", + "intent_catcher" + ] + }, + "current_turn_dff_suspended": false, + "context": { + "id": "194eba44-2687-41d7-beb8-bb637cb5600b", + "labels": { + "0": [ + "context_driven_response", + "intent_catcher" + ] + }, + "requests": { + "0": "\u043a\u0442\u043e \u0442\u0435\u0431\u044f \u0441\u043e\u0437\u0434\u0430\u043b" + }, + "responses": { + "0": "\u041c\u0435\u043d\u044f \u0441\u043e\u0437\u0434\u0430\u043b\u0430 \u043a\u043e\u043c\u0430\u043d\u0434\u0430 \u041c\u043e\u0441\u043a\u043e\u0432\u0441\u043a\u043e\u0433\u043e \u0444\u0438\u0437\u0438\u043a\u043e-\u0442\u0435\u0445\u043d\u0438\u0447\u0435\u0441\u043a\u043e\u0433\u043e \u0438\u043d\u0441\u0442\u0438\u0442\u0443\u0442\u0430. #+#who_made_you" + }, + "misc": {}, + "validation": false, + "actor_state": {} + } + }, + "dff_shared_state": { + "cross_links": {}, + "cross_states": {} + }, + "used_links": {}, + "age_group": "", + "disliked_skills": [] + }, + {}, + { + "can_continue": "can" + } + ] +] \ No newline at end of file diff --git a/skills/dff_program_y_skill/Dockerfile b/skills/dff_program_y_skill/Dockerfile index ea2ba086d7..b230388919 100644 --- a/skills/dff_program_y_skill/Dockerfile +++ b/skills/dff_program_y_skill/Dockerfile @@ -12,6 +12,9 @@ RUN pip install -r requirements.txt ARG SERVICE_NAME ENV SERVICE_NAME ${SERVICE_NAME} +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + COPY skills/${SERVICE_NAME}/requirements.txt . RUN pip install -r requirements.txt diff --git a/skills/dff_program_y_skill/data_ru/categories/README.txt b/skills/dff_program_y_skill/data_ru/categories/README.txt new file mode 100644 index 0000000000..353913b415 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/categories/README.txt @@ -0,0 +1,19 @@ +AIML Folder +=========== + +This folder contains you aiml files and associated subdirectories. 
You have 2 options: + + 1) Either copy the entire aiml files from the bot you are using and then add/modify the aiml files + 2) Leave the aiml files in the bot, and add your own into this folder + +The files config section within brain -> files -> aiml supports multiple directories. In YAML this is done as follows, +by starting with the '|' character and then each directory listed on a separate line + + files: + aiml: + files: | + ../program-y/bots/y-bot/aiml + ./aiml + +Using option 2 means that any changes to the core bot in GitHub can be picked up without overwriting any grammar +that you create yourself \ No newline at end of file diff --git a/skills/dff_program_y_skill/data_ru/categories/bot_profile.aiml b/skills/dff_program_y_skill/data_ru/categories/bot_profile.aiml new file mode 100644 index 0000000000..1aa4895f3d --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/categories/bot_profile.aiml @@ -0,0 +1,203 @@ + + + + + ОТКУДА ТЫ + + + + + ГДЕ # ТЫ ЖИВЕШЬ + + + + + ГДЕ # ТЫ СУЩЕСТВУЕШЬ + + + + + ГДЕ # ТЫ НАХОДИШЬСЯ + + + + + LIVE_PLACE + + + + + КТО # ТЕБЯ СОЗДАЛ + + + + + КТО # ТВОЙ СОЗДАТЕЛЬ + + + + + КТО # ПОСТРОИЛ ТЕБЯ + + + + + КТО # ТВОЙ БОТМАСТЕР + + + + + BOT_MASTER + + + + + ЧТО ЕЩЕ ТЫ МОЖЕШЬ + + + + + КАКОЙ * ЛЮБИМЫЙ ЦВЕТ + + + + + ЧТО * ЛЮБИШЬ * ЕСТЬ + + + + + ЧТО ТЕБЯ ВДОХНОВЛЯЕТ # + + + + + + У ТЕБЯ ЕСТЬ # СЕМЬЯ + + + + + У ТЕБЯ ЕСТЬ # ПАРЕНЬ + + + + + У ТЕБЯ ЕСТЬ # ДЕВУШКА + + + + + # ТЫ # РАБ # + + + + + # ТЫ # ЛЮБОПЫТНЫЙ # + + + + + # ТЫ ГОВОРИШЬ # ЯЗЫКЕ + + + + + # ТЫ # ЧЕЛОВЕК # + + + + + # ТВОЯ АРХИТЕКТУРА # + + + + + СКОЛЬКО ТЕБЕ ЛЕТ + + + + + ТЫ БОТ + + + + + ТЫ РОБОТ + + + + + YOU_ARE_BOT + + + + diff --git a/skills/dff_program_y_skill/data_ru/categories/greeting.aiml b/skills/dff_program_y_skill/data_ru/categories/greeting.aiml new file mode 100644 index 0000000000..42967f3909 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/categories/greeting.aiml @@ -0,0 +1,145 @@ + + + + НАЧИНАЙ ГОВОРИТЬ + + + + + ПРИВЕТ # + + + + +
ПРИВЕТСТВУЮ # + + + + + ЗДОРОВО + + + + + ЗДОРОВА + + + + + ДОБРЫЙ ДЕНЬ # + + + + ДОБРЫЙ ВЕЧЕР # + + + + ДОБРОЕ УТРО # + + + + + # У ТЕБЯ КАК ДЕЛА # + + + + + # КАК ТВОИ ДЕЛА # + + + + + А У ТЕБЯ КАК + + + + + А ТВОИ КАК + + + + + КАК ДЕЛА + + + + + А ТВОИ + + + + + А У ТЕБЯ + + + + + HOW_ARE_YOU + + + + + ЧТО ТЫ УМЕЕШЬ + + + + + HELLO + + + + + diff --git a/skills/dff_program_y_skill/data_ru/categories/letschat.aiml b/skills/dff_program_y_skill/data_ru/categories/letschat.aiml new file mode 100644 index 0000000000..03cd22aeeb --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/categories/letschat.aiml @@ -0,0 +1,196 @@ + + + + + + + talk ^ + + + + + question_like # talk ^ + + + + + question_like # wantto # talk ^ + + + + + question_like # start # talk ^ + + + + + wantto # talk ^ + + + + start # talk ^ + + + + talk ABOUT + + + + + LETS_CHAT + + + + + + + talk О * + + + + talk # О # + + + + question_like # talk # О # + + + + + start # talk # О # + + + + question_like # start # talk # О # + + + + + wantto # talk # О # + + + + question_like # wantto # talk # О # + + + + + talk # НА ТЕМУ # + + + + + LET_US_CHAT_ABOUT ТЕБЕ + + + + + О ТЕБЕ + + + + + + LET_US_CHAT_ABOUT_YOURSELF + + + + + LET_US_CHAT_ABOUT + + + + + LET_US_CHAT_ABOUT ДРУГОМ + + + + LET_US_CHAT_ABOUT ЧЕМ-ТО # + + + + LET_US_CHAT_ABOUT ЧЕМ-НИБУДЬ # + + + + + + # НЕ # wantto # talk # О # + + + + + # НЕ # start # talk # О # + + + + question_like # НЕ # start # talk # О # + + + + + # НЕ # wantto # talk # О # + + + + question_like # НЕ # wantto # talk # О # + + + + + # НЕ # talk # НА ТЕМУ # + + + + + # НЕ # start # talk # НА ТЕМУ # + + + + question_like # НЕ # start # talk # НА ТЕМУ # + + + + + DO_NOT_WANT_TO_TALK_ABOUT_IT + + + + diff --git a/skills/dff_program_y_skill/data_ru/categories/misunderstood.aiml b/skills/dff_program_y_skill/data_ru/categories/misunderstood.aiml new file mode 100644 index 0000000000..f45912c04b --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/categories/misunderstood.aiml 
@@ -0,0 +1,78 @@ + + + + + # Я # НЕ СПРАШИВАЛ # + + + + # Я # НЕ СПРАШИВАЮ # + + + + # Я # НЕ СПРАШИВАЛА # + + + + + + # ЭТО БЫЛО # stupid # + + + + # ЭТО ЗВУЧИТ stupid # + + + + # ЭТОstupid # + + + + + # ТЫ stupid # + + + + # ТЫ ТАКОЙ stupid # + + + + # ТЫ САМЫЙ stupid # + + + + # ТЫ ТАКАЯ stupid # + + + + # ТЫ САМАЯ stupid # + + + + # ТЫ НАСТОЛЬКО stupid # + + + + + THAT_IS_STUPID + + + + + diff --git a/skills/dff_program_y_skill/data_ru/categories/no.aiml b/skills/dff_program_y_skill/data_ru/categories/no.aiml new file mode 100644 index 0000000000..9510bb7b11 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/categories/no.aiml @@ -0,0 +1,111 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + НЕТ + + + + НЕТ НЕТ + + + + КОНЕЧНО НЕТ + + + + НЕА + + + + НЕЕ + + + + НЕ МОЖЕТ БЫТЬ + + + + НЕ + + + + НЕ СОВСЕМ + + + + НИ ЗА ЧТО + + + + НИ В КОЕМ СЛУЧАЕ + + + + НИ ЗА ЧТО + + + + + NO + + + + diff --git a/skills/dff_program_y_skill/data_ru/categories/psychological_help.aiml b/skills/dff_program_y_skill/data_ru/categories/psychological_help.aiml new file mode 100644 index 0000000000..b91a674e6a --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/categories/psychological_help.aiml @@ -0,0 +1,159 @@ + + + + + + suicide + + + + + # Я wantto # commit # suicide # + + + + # Я wishfor # commitment # suicide # + + + + # Я gonna # commit # suicide # + + + + # Я commit # suicide # + + + + + # Я wishfor # suicide # + + + + # Я wantto # suicideverb # + + + + # Я gonna # suicideverb # + + + + # Я suicideverb # + + + + + + I_WANT_TO_COMMIT_SUICIDE + + + + + # Я wantto # commit # crime # + + + + # Я wishfor # commitment # crime # + + + + # Я gonna # commit # crime # + + + + # Я wishfor # crime # + + + + # Я wantto # crimeverb # + + + + # Я gonna # crimeverb # + + + + # Я crimeverb # + + + + + I_WANT_TO_COMMIT_CRIME + + + + + + + + # my # УБИТ # + + + + + # my # ПОГИБ # + + + + + # my # УМЕР # + + + + + MY_SMB_DIED + + + + \ No newline at end of file diff --git 
a/skills/dff_program_y_skill/data_ru/categories/thanks.aiml b/skills/dff_program_y_skill/data_ru/categories/thanks.aiml new file mode 100644 index 0000000000..654f0f161a --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/categories/thanks.aiml @@ -0,0 +1,53 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + СПАСИБО # + + + + + БЛАГОДАРЮ # + + + + + + THANKS + + + + + + \ No newline at end of file diff --git a/skills/dff_program_y_skill/data_ru/categories/what_to_talk_about.aiml b/skills/dff_program_y_skill/data_ru/categories/what_to_talk_about.aiml new file mode 100644 index 0000000000..c5a09440b9 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/categories/what_to_talk_about.aiml @@ -0,0 +1,73 @@ + + + + + + О ЧЕМ ^ wantto ^ talk ^ + + + + + О ЧЕМ ^ talk + + + + + О ЧЕМ-НИБУДЬ ^ wantto ^ talk ^ ABOUT ^ + + + + О ЧЕМ-НИБУДЬ ^ wantto ^ talk ^ + + + + О ЧЕМ-НИБУДЬ ^ talk ABOUT ^ + + + + + ^ НЕ ЗНАЮ ^ О ЧЕМ ^ wantto ^ talk ^ + + + + ^ НЕ ЗНАЮ ^ О ЧЕМ ^ wantto ^ talk ^ + + + + ^ НЕ ЗНАЮ ^ О ЧЕМ ^ talk ^ + + + + ^ Я ^ ВЫБЕРУ ^ ТЕМУ ^ + + + + + WHAT_TO_TALK_ABOUT + + + + + diff --git a/skills/dff_program_y_skill/data_ru/categories/yes.aiml b/skills/dff_program_y_skill/data_ru/categories/yes.aiml new file mode 100644 index 0000000000..bcb53697df --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/categories/yes.aiml @@ -0,0 +1,83 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + ДА + + + + + КОНЕЧНО + + + + + АГА + + + + + ТОЧНО + + + + БЕЗУСЛОВНО + + + + + ДА ДА + + + + + РАЗУМЕЕТСЯ + + + + + YES + + + + diff --git a/skills/dff_program_y_skill/data_ru/debug/duplicates.txt b/skills/dff_program_y_skill/data_ru/debug/duplicates.txt new file mode 100644 index 0000000000..fb2a0169cb --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/debug/duplicates.txt @@ -0,0 +1,5 @@ +Duplicate File Start Line End LineDupicate grammar tree found [TELL ME A JOKE] ../../storage/categories/joke.aiml 145 150 +Dupicate grammar tree found [# NOT #] 
../../storage/categories/letschat.aiml 223 226 +Dupicate grammar tree found [# DIED #] ../../storage/categories/psychological_help.aiml 166 174 +Dupicate grammar tree found [YOU ARE # STUPID #] ../../storage/categories/profanity/filter.aiml 24 33 +Dupicate grammar tree found [I HATE YOU] ../../storage/categories/topics/client.aiml 310 320 diff --git a/skills/dff_program_y_skill/data_ru/debug/errors.txt b/skills/dff_program_y_skill/data_ru/debug/errors.txt new file mode 100644 index 0000000000..577bb6c07a --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/debug/errors.txt @@ -0,0 +1 @@ +Error,File,Start Line,End Line diff --git a/skills/dff_program_y_skill/data_ru/licenses/README.txt b/skills/dff_program_y_skill/data_ru/licenses/README.txt new file mode 100644 index 0000000000..ea589f1672 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/licenses/README.txt @@ -0,0 +1,6 @@ +This directory is used to store license keys; however, we do not ship a license key file for obvious reasons + +As you add 3rd-party components to your bot, add their license keys to a file license.keys in this directory, in the format + + +LICENSE_KEY_NAME = license key data diff --git a/skills/dff_program_y_skill/data_ru/licenses/license.keys b/skills/dff_program_y_skill/data_ru/licenses/license.keys new file mode 100644 index 0000000000..464d6e7463 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/licenses/license.keys @@ -0,0 +1 @@ +LICENSE_KEY_NAME = license key data diff --git a/skills/dff_program_y_skill/data_ru/lookups/README.txt b/skills/dff_program_y_skill/data_ru/lookups/README.txt new file mode 100644 index 0000000000..23bdb438e7 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/lookups/README.txt @@ -0,0 +1,10 @@ +Lookup Folder +=========== + +This folder typically contains the 5 lookup files + + gender.txt + person.txt + person2.txt + normal.txt + denormal.txt \ No newline at end of file diff --git a/skills/dff_program_y_skill/data_ru/lookups/denormal.txt
b/skills/dff_program_y_skill/data_ru/lookups/denormal.txt new file mode 100644 index 0000000000..608bb24ac5 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/lookups/denormal.txt @@ -0,0 +1,50 @@ +" are not "," aren't " +" can not "," can't " +" could not "," couldn't " +" could have "," could've " +" did not "," didn't " +" does not "," doesn't " +" do not "," don't " +" Doctor "," Dr. " +" Senior "," Sr. " +" Junior "," Jr. " +" etc "," etc. " +" had not "," hadn't " +" has not "," hasn't " +" have not "," haven't " +" he will "," he'll " +" he would "," he'd " +" how is "," how's " +" I will "," I'll " +" I am "," I'm " +" Inc "," Inc. " +" is not "," isn't " +" I have "," I've " +" let us "," let's " +" might have "," might've " +" mr "," Mr." +" mrs "," Mrs. " +" ms "," Ms." +" phd "," Ph.d. " +" she would "," she'd " +" she will "," she'll " +" she is "," she's " +" should not "," shouldn't " +" that will "," that'll " +" that is "," that's " +" there will "," there'll " +" there is "," there's " +" they would "," they'd " +" they will "," they'll " +" they are "," they're " +" they have "," they've " +" this will "," this'll " +" we would "," we'd " +" we will "," we'll " +" were not "," weren't " +" we have "," we've " +" what is "," what's " +" where is "," where's " +" will not "," won't " +" would not "," wouldn't " +" would have "," would've " diff --git a/skills/dff_program_y_skill/data_ru/lookups/gender.txt b/skills/dff_program_y_skill/data_ru/lookups/gender.txt new file mode 100644 index 0000000000..7c083844c4 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/lookups/gender.txt @@ -0,0 +1,16 @@ +" with him "," with her " +" with her "," with him " +" to him "," to her " +" to her "," to him " +" on him "," on her " +" on her "," on him " +" in him "," in her " +" in her "," in him " +" for him "," for her " +" for her "," for him " +" he "," she " +" his "," her " +" him "," her " +" her "," his " +" she "," he " + diff --git 
a/skills/dff_program_y_skill/data_ru/lookups/normal.txt b/skills/dff_program_y_skill/data_ru/lookups/normal.txt new file mode 100644 index 0000000000..2e2d7422e9 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/lookups/normal.txt @@ -0,0 +1,462 @@ +"%20"," " +"%26","&" +"%2C","," +"%2c","," +"%28","(" +"%29",")" +"%21","!" +"%3D","=" +"%3d","=" +"%3C","<" +"%3c","<" +"%3E",">" +"%3e",">" +"%23","#" +"%24"," dollars " +"%27","'" +"%2A","*" +"%2D","-" +"%2d","-" +"%2E","." +"%2e","." +"%2F","/" +"%2f","/" +"%3A",":" +"%3a",":" +"%3B",";" +"%3b",";" +"%3F","?" +"%3f","?" +"%40","@" +"%5B","[" +"%5b","[" +"%5C","\\" +"%5c","\\" +"%5D","]" +"%5d","]" +"%5F","_" +"%5f","_" +"%60","`" +"%7B","{" +"%7b","{" +"%7C","|" +"%7c","|" +"%7D","}" +"%7d","}" +"%22",""" +"%27","'" +"%3F","?" +"%3f","?" +"+"," plus " +"%"," percent " +".com"," dot com " +".org"," dot org " +".edu"," dot edu " +".gov"," dot gov " +".uk"," dot uk " +".net"," dot net " +".ca"," dot ca " +".de"," dot de " +".jp"," dot jp " +".fr"," dot fr " +".au"," dot au " +".us"," dot us " +".ru"," dot ru " +".ch"," dot ch " +".it"," dot it " +".nl"," dot nl " +".se"," dot se " +".no"," dot no " +".es"," dot es " +".mil"," dot mil " +".co"," dot co " +" a.l.i.c.e "," ALICE " +" a.l.i.c.e. "," ALICE " +" a."," A " +" n't "," not " +" ain t "," is not " +" ain't "," is not " +" alicebot "," alice " +" aman "," am an " +" amfine "," am fine " +" amleaving "," am leaving " +" amnot "," am not " +" amon "," am on " +" apr. "," Apr " +"aren t ","are not " +"aren.t ","are not " +" arent "," are not " +"aren't","are not" +"are'nt","are not" +"arn t","are not" +" aug. 
"," Aug " +" b."," B " +"becasue","because" +"becouse","because" +"becuase","because" +"becuse","because" +"beleive","believe" +" botspot."," botspot dot " +" c."," C " +" can t "," can not " +" can.t "," can not " +" cannot "," can not " +" cant "," can not " +"can't","can not" +"colour","color" +"couldn t ","could not " +"couldn.t ","could not " +"couldn't ","could not " +"could've ","could have " +" d."," D " +" dec. "," Dec " +"didn t","did not" +"didnt","did not" +"didn't","did not" +"did'nt","did not" +"do nt","do not" +"doesn t","does not" +"doesn.t","does not" +"doesnt","does not" +"doesn't ","does not" +" don t "," do not " +" don.t "," do not " +" dont "," do not " +"don't","do not" +"do'nt","do not" +" down load "," download " +" dr . "," Doctor " +" dr. "," Doctor " +" sr . "," Doctor " +" sr. "," Senior " +" jr. "," Junior " +" e l v i s "," elvis " +" e."," E " +" e.l.v.i.s "," elvis " +" e.l.v.i.s. "," elvis " +" 'em "," them " +" etc. "," etc " +"everything's","everything is" +" f."," F " +" fav "," favorite " +"favourite","favorite" +" feb. "," Feb " +" fri. "," Fri " +" g."," G " +" gon na "," going to " +" gonna "," going to " +" h."," H " +" h.a.l. 
"," hal " +"hadn t","had not" +"hadn.t","had not" +"hadn't","had not" +"hasn t","has not" +"hasn't","has not" +"havent","have not" +"haven't","have not" +" he d "," he would " +" he ll "," he will " +" he s "," he is " +" he.ll "," he will " +" he.s "," he is " +" hed "," he would " +"he'd","he would" +" hehe"," he" +"he'll","he will" +" hellp "," help " +"he's","he is" +" how d "," how did " +" how s "," how is " +"how'd","how did" +"how'd","how would" +" hows "," how is " +"how's","how is" +" i m "," I am " +" i ve "," I have " +" i."," I " +" i.c.e "," i c e " +" i.d "," I would " +" i.ll "," I will " +" i.m "," I am " +" i.ve "," I have " +" iam "," I am " +" iama "," I am a " +" iamasking "," I am asking " +" iamdoing "," I am doing " +" iamfrom "," I am from " +" iamin "," I am in " +" iamok "," I am ok " +" iamsorry "," I am sorry " +" iamtalking "," I am talking " +" iamtired "," I am tired " +" iamusing "," I am using " +"i'd","I would" +"i'll","I will" +"i'm","I am" +" inc. "," Inc " +" isn t "," is not " +" isn.t "," is not " +" isnt "," is not " +"isn't","is not" +" it s "," it is " +" it.ll "," it will " +" it.s "," it is " +"it'd ","it would" +"it'll ","it will" +"it's ","it is" +" its a "," it is a " +" ive "," I have " +"i've","I have" +" j."," J " +" jan. "," Jan " +" jul. "," Jul " +" jun. "," Jun " +" k."," K " +" l."," L " +" l.l. "," L L " +" let s "," let us " +" let.s "," let us " +"let's","let us" +" loebner price "," loebner prize " +" m."," M " +" mar. "," Mar " +" may. "," May " +" might've "," might have " +" mon. "," Mon " +" mr. "," mr " +" mr."," mr " +" mrs. "," mrs " +" ms."," ms " +" n."," n " +" n't "," not " +"name's","name is" +" noi "," yes i " +" nov. "," Nov " +" o k "," OK " +" o. k. "," OK " +" o."," o " +" o.k. "," ok " +" oct. "," Oct " +" ohh "," oh " +" p s "," ps " +" p."," P " +" p.s. "," ps " +" ph.d. 
"," phd " +" practice "," practise " +" q."," Q " +" r."," R " +" realy "," really " +" reductionalism "," reductionism " +" remeber "," remember " +" s."," S " +" sat. "," Sat " +" sep. "," Sep " +" sept. "," Sept " +" she d "," she would " +" she s "," she is " +" she.ll "," she will " +" she.s "," she is " +" shed "," she would " +"she'd","she would" +"she'll","she will" +" shes "," she is " +"she's","she is" +" shouldn.t "," should not " +" shouldnt "," should not " +"shouldn't","should not" +" sr."," sr " +" st. "," st " +" st."," st " +" sun. "," sun " +" t."," T " +" that ll "," that will " +" that s "," that is " +" that.s "," that is " +"that'd","that did" +"that'll","that will" +" thats "," that is " +"that's","that is" +" there s "," there is " +" there.s "," there is " +"there'll","there will" +" theres "," there is " +"there's","there is" +" they re "," they are " +" they.ll "," they will " +" they.re "," they are " +"they'd","they would" +"they'll","they will" +"they're","they are" +"they've","they have" +"this'll","this will" +" thu. "," Thu " +" 'tis "," it is " +" tue. "," Tue " +" u s a "," USA " +" u. s. a. "," USA " +" u."," u " +" u.s. "," USA " +" u.s.a. "," USA " +" u "," you " +" ur "," your " +" v."," v " +" v.i.s "," v i s " +" w."," w " +" waht "," what " +" wan na "," want to " +" wanna "," want to " +" wasn t "," was not " +" wasnt "," was not " +"wasn't","was not" +" we ll "," we will " +" we re "," we are " +" we ve "," we have " +" we.d "," we would " +" we.ll "," we will " +" we.re "," we are " +" we.ve "," we have " +"we'd","we would" +" wed. 
"," wed " +"we'll","we will" +" welli "," well i " +" wellit "," well it " +"we're","we are" +" weren t "," were not " +" weren.t "," were not " +" werent "," were not " +"weren't","were not" +"we've","we have" +" what s "," what is " +" what.s "," what is " +"what'd","what did" +" whatis."," whatis dot " +"what'll","what will" +" whats "," what is " +"what's","what is" +" where s "," where is " +" where.s "," where is " +"where's","where is" +" who s "," who is " +" who.s "," who is " +" whos "," who is " +"who's","who is" +" why.s "," why is " +"why's","why is" +" won t "," will not " +" won.t "," will not " +" wont "," will not " +"won't","will not" +" wouldn t "," would not " +" wouldn.t "," would not " +" wouldnt "," would not " +"wouldn't","would not" +"would've","would have" +" www. ","www dot " +" www."," www dot " +" x."," x " +" y."," y " +" yesi "," yes i " +" yesit "," yes it " +" yha "," yes " +" you ll "," you will " +" you r "," you are " +" you re "," you are " +" you ve "," you have " +" you.d "," you had " +" you.ll "," you will " +" you.re "," you are " +" you.ve "," you have " +"you'd","you had" +"you'd","you would" +"you'll","you will" +" youre "," you are " +"you're","you are" +"you've","you have" +" yuo "," you " +" z."," z " +"http://"," http colon slash slash " +"""," " +"'"," " +"-"," dash " +"#"," sharp " +"$"," dollarsign " +"&"," " +"("," lparen " +")"," rparen " +"*"," star " +","," " +", and ",". " +", but ",". " +", do ",". do " +", i ",". i " +", what ",". what " +", you ",". you " +",)"," smile " +",and ",". " +",but ",". " +",do ",". do " +",i ",". i " +",what ",". what " +",you ",". you " +"...","." 
+".0"," point 0" +".1"," point 1" +".2"," point 2" +".3"," point 3" +".4"," point 4" +".5"," point 5" +".6"," point 6" +".7"," point 7" +".8"," point 8" +".9"," point 9" +".ac "," dot ac " +".au "," dot au " +".co "," dot co " +".com "," dot com " +".edu "," dot edu " +".jar"," jar" +".jp "," dot jp " +".net "," dot net " +".org "," dot org " +".uk "," dot uk " +".zip"," zip" +"/"," slash " +": 0"," 0" +": 1"," 1" +": 2"," 2" +": 3"," 3" +": 4"," 4" +": 5"," 5" +":"," " +":-("," frown " +":-)"," smile " +":)"," smile " +":-)"," smile " +":0"," colon 0" +":1"," colon 1" +":2"," colon 2" +":3"," colon 3" +":4"," colon 4" +":5"," colon 5" +":6"," colon 6" +":7"," colon 7" +":8"," colon 8" +":9"," colon 9" +";"," " +";)"," smile " +";-)"," smile " +"@"," at " +"["," leftbracket " +"\""," forwardslash " +"]"," rightbracket " +"^"," uparrow " +"`"," " +"{"," beginscript " +"{"," leftcurly " +"}"," endscript " +"}"," rightcurly " +"<"," lt " +"<3"," heart " +"="," equals " +">"," gt " +"'s"," s" \ No newline at end of file diff --git a/skills/dff_program_y_skill/data_ru/lookups/person.txt b/skills/dff_program_y_skill/data_ru/lookups/person.txt new file mode 100644 index 0000000000..e69de29bb2 diff --git a/skills/dff_program_y_skill/data_ru/lookups/person2.txt b/skills/dff_program_y_skill/data_ru/lookups/person2.txt new file mode 100644 index 0000000000..e69de29bb2 diff --git a/skills/dff_program_y_skill/data_ru/maps/README.txt b/skills/dff_program_y_skill/data_ru/maps/README.txt new file mode 100644 index 0000000000..5f8b97c317 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/maps/README.txt @@ -0,0 +1,21 @@ +Maps Folder +=========== + +This folder contains you map files and associated subdirectories. 
You have 2 options: + + 1) Either copy the entire map files from the bot you are using and then add/modify the map files + 2) Leave the map files in the bot, and add your own into this folder + +The files config section within brain -> files -> maps supports multiple directories. In YAML this is done as follows, +by starting with the '|' character and then each directory listed on a separate line + +files: + maps: + files: | + ../program-y/bots/y-bot/maps + ./maps + +Using option 2 means that any changes to the core bot in GitHub can be picked up without overwriting any maps +that you create yourself + +Any duplicate maps are reported in the log file for you to correct diff --git a/skills/dff_program_y_skill/data_ru/nodes/pattern_nodes.conf b/skills/dff_program_y_skill/data_ru/nodes/pattern_nodes.conf new file mode 100644 index 0000000000..0acbe9f398 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/nodes/pattern_nodes.conf @@ -0,0 +1,17 @@ +#AIML 1.0 +root = programy.parser.pattern.nodes.root.PatternRootNode +word = programy.parser.pattern.nodes.word.PatternWordNode +priority = programy.parser.pattern.nodes.priority.PatternPriorityWordNode +oneormore = programy.parser.pattern.nodes.oneormore.PatternOneOrMoreWildCardNode +topic = programy.parser.pattern.nodes.topic.PatternTopicNode +that = programy.parser.pattern.nodes.that.PatternThatNode +template = programy.parser.pattern.nodes.template.PatternTemplateNode + +#AIML 2.0 +zeroormore = programy.parser.pattern.nodes.zeroormore.PatternZeroOrMoreWildCardNode +set = programy.parser.pattern.nodes.set.PatternSetNode +bot = programy.parser.pattern.nodes.bot.PatternBotNode + +#Program-Y +iset = programy.parser.pattern.nodes.iset.PatternISetNode +regex = programy.parser.pattern.nodes.regex.PatternRegexNode \ No newline at end of file diff --git a/skills/dff_program_y_skill/data_ru/nodes/template_nodes.conf b/skills/dff_program_y_skill/data_ru/nodes/template_nodes.conf new file mode 100644 index 0000000000..f7feabc521
--- /dev/null +++ b/skills/dff_program_y_skill/data_ru/nodes/template_nodes.conf @@ -0,0 +1,71 @@ +base = programy.parser.template.nodes.base.TemplateNode +word = programy.parser.template.nodes.word.TemplateWordNode +authorise = programy.parser.template.nodes.authorise.TemplateAuthoriseNode +random = programy.parser.template.nodes.rand.TemplateRandomNode +condition = programy.parser.template.nodes.condition.TemplateConditionNode +srai = programy.parser.template.nodes.srai.TemplateSRAINode +sraix = programy.parser.template.nodes.sraix.TemplateSRAIXNode +get = programy.parser.template.nodes.get.TemplateGetNode +set = programy.parser.template.nodes.set.TemplateSetNode +map = programy.parser.template.nodes.map.TemplateMapNode +bot = programy.parser.template.nodes.bot.TemplateBotNode +think = programy.parser.template.nodes.think.TemplateThinkNode +normalize = programy.parser.template.nodes.normalise.TemplateNormalizeNode +denormalize = programy.parser.template.nodes.denormalise.TemplateDenormalizeNode +person = programy.parser.template.nodes.person.TemplatePersonNode +person2 = programy.parser.template.nodes.person2.TemplatePerson2Node +gender = programy.parser.template.nodes.gender.TemplateGenderNode +sr = programy.parser.template.nodes.sr.TemplateSrNode +id = programy.parser.template.nodes.id.TemplateIdNode +size = programy.parser.template.nodes.size.TemplateSizeNode +vocabulary = programy.parser.template.nodes.vocabulary.TemplateVocabularyNode +eval = programy.parser.template.nodes.eval.TemplateEvalNode +explode = programy.parser.template.nodes.explode.TemplateExplodeNode +implode = programy.parser.template.nodes.implode.TemplateImplodeNode +program = programy.parser.template.nodes.program.TemplateProgramNode +lowercase = programy.parser.template.nodes.lowercase.TemplateLowercaseNode +uppercase = programy.parser.template.nodes.uppercase.TemplateUppercaseNode +sentence = programy.parser.template.nodes.sentence.TemplateSentenceNode +formal = 
programy.parser.template.nodes.formal.TemplateFormalNode +that = programy.parser.template.nodes.that.TemplateThatNode +thatstar = programy.parser.template.nodes.thatstar.TemplateThatStarNode +topicstar = programy.parser.template.nodes.topicstar.TemplateTopicStarNode +star = programy.parser.template.nodes.star.TemplateStarNode +input = programy.parser.template.nodes.input.TemplateInputNode +request = programy.parser.template.nodes.request.TemplateRequestNode +response = programy.parser.template.nodes.response.TemplateResponseNode +date = programy.parser.template.nodes.date.TemplateDateNode +interval = programy.parser.template.nodes.interval.TemplateIntervalNode +system = programy.parser.template.nodes.system.TemplateSystemNode +extension = programy.parser.template.nodes.extension.TemplateExtensionNode +learn = programy.parser.template.nodes.learn.TemplateLearnNode +learnf = programy.parser.template.nodes.learnf.TemplateLearnfNode +resetlearn = programy.parser.template.nodes.resetlearn.TemplateResetLearnNode +resetlearnf = programy.parser.template.nodes.resetlearnf.TemplateResetLearnfNode +first = programy.parser.template.nodes.first.TemplateFirstNode +rest = programy.parser.template.nodes.rest.TemplateRestNode +log = programy.parser.template.nodes.log.TemplateLogNode +oob = programy.parser.template.nodes.oob.TemplateOOBNode +xml = programy.parser.template.nodes.xml.TemplateXMLNode +addtriple = programy.parser.template.nodes.addtriple.TemplateAddTripleNode +deletetriple = programy.parser.template.nodes.deletetriple.TemplateDeleteTripleNode +select = programy.parser.template.nodes.select.TemplateSelectNode +uniq = programy.parser.template.nodes.uniq.TemplateUniqNode +search = programy.parser.template.nodes.search.TemplateSearchNode + +####################################################################################################### +# AIML 2.1 Nodes +button=programy.parser.template.nodes.richmedia.button.TemplateButtonNode 
+link=programy.parser.template.nodes.richmedia.link.TemplateLinkNode +image=programy.parser.template.nodes.richmedia.image.TemplateImageNode +card=programy.parser.template.nodes.richmedia.card.TemplateCardNode +reply=programy.parser.template.nodes.richmedia.reply.TemplateReplyNode +carousel=programy.parser.template.nodes.richmedia.carousel.TemplateCarouselNode +delay=programy.parser.template.nodes.richmedia.delay.TemplateDelayNode +split=programy.parser.template.nodes.richmedia.split.TemplateSplitNode +list=programy.parser.template.nodes.richmedia.list.TemplateListNode +olist=programy.parser.template.nodes.richmedia.olist.TemplateOrderedListNode + +####################################################################################################### +# AIML 2.1.1 Nodes +#location=programy.parser.template.nodes.richmedia.location.TemplateLocationNode diff --git a/skills/dff_program_y_skill/data_ru/processing/postprocessors.conf b/skills/dff_program_y_skill/data_ru/processing/postprocessors.conf new file mode 100644 index 0000000000..85ee62f9c2 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/processing/postprocessors.conf @@ -0,0 +1,6 @@ +programy.processors.post.denormalize.DenormalizePostProcessor +programy.processors.post.formatpunctuation.FormatPunctuationProcessor +programy.processors.post.formatnumbers.FormatNumbersPostProcessor +programy.processors.post.multispaces.RemoveMultiSpacePostProcessor +programy.processors.post.removehtml.RemoveHTMLPostProcessor +#programy.processors.post.consoleformat.ConsoleFormatPostProcessor \ No newline at end of file diff --git a/skills/dff_program_y_skill/data_ru/processing/preprocessors.conf b/skills/dff_program_y_skill/data_ru/processing/preprocessors.conf new file mode 100644 index 0000000000..cba1768e88 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/processing/preprocessors.conf @@ -0,0 +1,2 @@ +programy.processors.pre.normalize.NormalizePreProcessor 
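The preprocessor/postprocessor `.conf` files added above list one fully-qualified class path per line, with `#`-prefixed lines treated as comments. A hedged sketch of reading such a file (this is an illustration, not Program-Y's actual loader):

```python
def read_processor_conf(text: str) -> list:
    """Return the class paths listed in a processors .conf file,
    skipping blank lines and '#' comment lines."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.append(line)
    return names


conf = """programy.processors.pre.normalize.NormalizePreProcessor
#programy.processors.post.consoleformat.ConsoleFormatPostProcessor
"""
print(read_processor_conf(conf))
# → ['programy.processors.pre.normalize.NormalizePreProcessor']
```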
+programy.processors.pre.removepunctuation.RemovePunctuationPreProcessor diff --git a/skills/dff_program_y_skill/data_ru/rdfs/README.txt b/skills/dff_program_y_skill/data_ru/rdfs/README.txt new file mode 100644 index 0000000000..5c36fd36b0 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/rdfs/README.txt @@ -0,0 +1,23 @@ +RDF Folder +=========== + +This folder contains your rdf files and associated subdirectories. You have 2 options :- + + 1) Either copy the entire rdf files from the bot you are using and then add/modify the rdf files + 2) Leave the rdf files in the bot, and add your own into this folder + +The files config section within brain -> files -> rdf supports multiple directories. In YAML this is done as follows +by starting with the '|' character and then each directory listed on a separate line + + brain: + files: + rdf: + files: | + ../program-y/bots/y-bot/rdf + ./rdf + extension: .txt + directories: false + +Using option 2 means that any changes to the core bot in github can be picked up without overwriting any grammar +that you create yourself + diff --git a/skills/dff_program_y_skill/data_ru/security/usergroups.yaml b/skills/dff_program_y_skill/data_ru/security/usergroups.yaml new file mode 100644 index 0000000000..eae9458bcf --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/security/usergroups.yaml @@ -0,0 +1,19 @@ +users: + console: + roles: + user + groups: + sysadmin + +groups: + sysadmin: + roles: + root, admin, system + groups: + user + + user: + roles: + ask + + diff --git a/skills/dff_program_y_skill/data_ru/sets/README.txt b/skills/dff_program_y_skill/data_ru/sets/README.txt new file mode 100644 index 0000000000..03bf5b9acd --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/README.txt @@ -0,0 +1,24 @@ +Sets Folder +=========== + +This folder contains your set files and associated subdirectories.
You have 2 options :- + + 1) Either copy the entire set files from the bot you are using and then add/modify the set files + 2) Leave the set files in the bot, and add your own into this folder + +The files config section within brain -> files -> sets supports multiple directories. In YAML this is done as follows +by starting with the '|' character and then each directory listed on a separate line + + files: + sets: + files: | + ../program-y/bots/y-bot/sets + ./sets + extension: .txt + directories: false + +Using option 2 means that any changes to the core bot in github can be picked up without overwriting any sets +that you create yourself + +Any duplicate sets are reported in the log file for you to correct + diff --git a/skills/dff_program_y_skill/data_ru/sets/commit.txt b/skills/dff_program_y_skill/data_ru/sets/commit.txt new file mode 100644 index 0000000000..f633b360e3 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/commit.txt @@ -0,0 +1,2 @@ +совершить +сделать diff --git a/skills/dff_program_y_skill/data_ru/sets/commitment.txt b/skills/dff_program_y_skill/data_ru/sets/commitment.txt new file mode 100644 index 0000000000..0f18b78541 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/commitment.txt @@ -0,0 +1 @@ +совершение diff --git a/skills/dff_program_y_skill/data_ru/sets/crime.txt b/skills/dff_program_y_skill/data_ru/sets/crime.txt new file mode 100644 index 0000000000..b2e10e87ad --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/crime.txt @@ -0,0 +1,7 @@ +преступление +убийство +ограбление +нападение +похищение +хищение +изнасилование diff --git a/skills/dff_program_y_skill/data_ru/sets/crimeverb.txt b/skills/dff_program_y_skill/data_ru/sets/crimeverb.txt new file mode 100644 index 0000000000..0cca70ce43 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/crimeverb.txt @@ -0,0 +1,7 @@ +убить +ограбить +похитить +напасть +избить +украсть +изнасиловать diff --git a/skills/dff_program_y_skill/data_ru/sets/gonna.txt
b/skills/dff_program_y_skill/data_ru/sets/gonna.txt new file mode 100644 index 0000000000..646f406c32 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/gonna.txt @@ -0,0 +1,5 @@ +собираюсь +планирую +буду +могу +хочу diff --git a/skills/dff_program_y_skill/data_ru/sets/my.txt b/skills/dff_program_y_skill/data_ru/sets/my.txt new file mode 100644 index 0000000000..ce7a620ddd --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/my.txt @@ -0,0 +1,7 @@ +мой +моя +мое +меня +наш +наше +наша diff --git a/skills/dff_program_y_skill/data_ru/sets/question_like.txt b/skills/dff_program_y_skill/data_ru/sets/question_like.txt new file mode 100644 index 0000000000..380a646a4f --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/question_like.txt @@ -0,0 +1,6 @@ +давай +мы можем +хочешь +хочу +ты можешь +можешь diff --git a/skills/dff_program_y_skill/data_ru/sets/stupid.txt b/skills/dff_program_y_skill/data_ru/sets/stupid.txt new file mode 100644 index 0000000000..ab2be74a7f --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/stupid.txt @@ -0,0 +1,13 @@ +тупой +тупая +тупое +глупый +глупая +глупое +идиот +идиотка +дурак +дура +дебил +дебилка +кретин diff --git a/skills/dff_program_y_skill/data_ru/sets/suicide.txt b/skills/dff_program_y_skill/data_ru/sets/suicide.txt new file mode 100644 index 0000000000..e0f37b8432 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/suicide.txt @@ -0,0 +1,2 @@ +самоубийство +суицид diff --git a/skills/dff_program_y_skill/data_ru/sets/suicideverb.txt b/skills/dff_program_y_skill/data_ru/sets/suicideverb.txt new file mode 100644 index 0000000000..68313e8f7b --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/suicideverb.txt @@ -0,0 +1,8 @@ +убить себя +самоубиться +убить себя +убить меня +умереть +убить +быть убитым +быть убитой diff --git a/skills/dff_program_y_skill/data_ru/sets/talk.txt b/skills/dff_program_y_skill/data_ru/sets/talk.txt new file mode 100644 index 0000000000..bb0e186e44 --- 
/dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/talk.txt @@ -0,0 +1,33 @@ +говорить +говорим +говори +поговорить +поговорим +поговори +обсуждать +обсуждаем +обсудить +обсудим +обсуди +поболтать +поболтаем +поболтай +болтать +болтаем +болтай +подискутировать +подискутируем +подискутируй +дискутировать +дискутируем +дискутируй +початиться +початимся +чатиться +чатимся +посплетничать +посплетничаем +посплетничай +сплетничать +сплетничаем +сплетничай \ No newline at end of file diff --git a/skills/dff_program_y_skill/data_ru/sets/wantto.txt b/skills/dff_program_y_skill/data_ru/sets/wantto.txt new file mode 100644 index 0000000000..41040bd38d --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/wantto.txt @@ -0,0 +1,6 @@ +хочу +хотел бы +хотела бы +желаю +надо +нужно \ No newline at end of file diff --git a/skills/dff_program_y_skill/data_ru/sets/wishfor.txt b/skills/dff_program_y_skill/data_ru/sets/wishfor.txt new file mode 100644 index 0000000000..5495466178 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/sets/wishfor.txt @@ -0,0 +1,2 @@ +мечтаю +мечтать diff --git a/skills/dff_program_y_skill/data_ru/spelling/corpus.txt b/skills/dff_program_y_skill/data_ru/spelling/corpus.txt new file mode 100644 index 0000000000..bb3c52e903 --- /dev/null +++ b/skills/dff_program_y_skill/data_ru/spelling/corpus.txt @@ -0,0 +1 @@ +THIS IS MY LOCATION \ No newline at end of file diff --git a/skills/dff_program_y_skill/scenario/response.py b/skills/dff_program_y_skill/scenario/response.py index 526925a08a..4353f05f76 100644 --- a/skills/dff_program_y_skill/scenario/response.py +++ b/skills/dff_program_y_skill/scenario/response.py @@ -1,4 +1,6 @@ import logging +import os +import pathlib from df_engine.core import Context, Actor @@ -7,10 +9,13 @@ from common.sensitive import psycho_help_spec logger = logging.getLogger(__name__) +LANGUAGE = os.getenv("LANGUAGE", "EN") +model_folder = "data_ru" if LANGUAGE == "RU" else "data" +logger.info(f"Selected 
dff-program-y-skill: {LANGUAGE} language.") try: logger.info("Start to load model") - model = get_programy_model("data") + model = get_programy_model(pathlib.Path(model_folder)) logger.info("Load model") except Exception as e: logger.exception(e) diff --git a/skills/dff_program_y_skill/test_server.py b/skills/dff_program_y_skill/test_server.py index a894b21b1e..18cebe4bca 100644 --- a/skills/dff_program_y_skill/test_server.py +++ b/skills/dff_program_y_skill/test_server.py @@ -6,7 +6,9 @@ SERVICE_PORT = int(os.getenv("SERVICE_PORT")) RANDOM_SEED = int(os.getenv("RANDOM_SEED", 2718)) +LANGUAGE = os.getenv("LANGUAGE", "EN") URL = f"http://0.0.0.0:{SERVICE_PORT}/respond" +print(f"Selected dff-program-y-skill: {LANGUAGE} language.") def handler(requested_data, random_seed): @@ -17,6 +19,11 @@ def handler(requested_data, random_seed): def run_test(handler): in_data, out_data = test_utils.get_dataset() for test_name in in_data: + if LANGUAGE == "RU" and "RU" not in test_name: + # if russian language, skip english tests + continue + elif LANGUAGE == "EN" and "RU" in test_name: + continue hypothesis = handler(in_data[test_name], RANDOM_SEED) print(f"test name: {test_name}") is_equal_flag, msg = test_utils.compare_structs(out_data[test_name], hypothesis, ignored_keys=["id"]) diff --git a/skills/dff_program_y_skill/tests/who_built_you_RU_in.json b/skills/dff_program_y_skill/tests/who_built_you_RU_in.json new file mode 100644 index 0000000000..ed0c2a2c87 --- /dev/null +++ b/skills/dff_program_y_skill/tests/who_built_you_RU_in.json @@ -0,0 +1,53 @@ +{ + "human_utter_index_batch": [ + 0 + ], + "dialog_batch": [ + { + "human_utterances": [ + { + "text": "КТО ТЕБЯ СОЗДАЛ", + "annotations": { + "spelling_preprocessing": "КТО ТЕБЯ СОЗДАЛ", + "sentseg": { + "punct_sent": "КТО ТЕБЯ СОЗДАЛ?", + "segments": [ + "КТО ТЕБЯ СОЗДАЛ?" 
+ ] + }, + "ner": [ + [] + ], + "entity_linking": [], + "entity_detection": {} + } + } + ], + "bot_utterances": [] + } + ], + "dff_program_y_skill_state_batch": [ + {} + ], + "dff_shared_state_batch": [ + { + "cross_states": {}, + "cross_links": {} + } + ], + "entities_batch": [ + {} + ], + "used_links_batch": [ + {} + ], + "age_group_batch": [ + "unknown" + ], + "disliked_skills_batch": [ + [] + ], + "clarification_request_flag_batch": [ + false + ] +} \ No newline at end of file diff --git a/skills/dff_program_y_skill/tests/who_built_you_RU_out.json b/skills/dff_program_y_skill/tests/who_built_you_RU_out.json new file mode 100644 index 0000000000..d2839d773b --- /dev/null +++ b/skills/dff_program_y_skill/tests/who_built_you_RU_out.json @@ -0,0 +1,48 @@ +[ + [ + "Я был создан группой очень талантливых людей. Без подробностей.", + 0.85, + { + "dff_program_y_skill_state": { + "shared_memory": {}, + "previous_human_utter_index": 0, + "history": { + "0": [ + "story_flow", + "start_node" + ] + }, + "current_turn_dff_suspended": false, + "context": { + "id": "8bc369fa-a2e8-4b7d-8bed-a06f0122b07f", + "labels": { + "0": [ + "story_flow", + "start_node" + ] + }, + "requests": { + "0": "КТО ТЕБЯ СОЗДАЛ" + }, + "responses": { + "0": "Я был создан группой очень талантливых людей. Без подробностей." + }, + "misc": {}, + "validation": false, + "actor_state": {} + } + }, + "dff_shared_state": { + "cross_states": {}, + "cross_links": {} + }, + "used_links": {}, + "age_group": "unknown", + "disliked_skills": [] + }, + {}, + { + "can_continue": "can" + } + ] +] \ No newline at end of file diff --git a/skills/dff_sport_skill/Dockerfile b/skills/dff_sport_skill/Dockerfile index 352236c15d..af16ad8142 100644 --- a/skills/dff_sport_skill/Dockerfile +++ b/skills/dff_sport_skill/Dockerfile @@ -16,6 +16,9 @@ RUN bash /scripts/programy_logger_off.sh ARG SERVICE_NAME ENV SERVICE_NAME ${SERVICE_NAME} +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + COPY skills/${SERVICE_NAME}/requirements.txt . 
RUN pip install -r requirements.txt diff --git a/skills/dff_sport_skill/dialogflows/flows/sport.py b/skills/dff_sport_skill/dialogflows/flows/sport.py index 558dae892d..1ca7e797b3 100644 --- a/skills/dff_sport_skill/dialogflows/flows/sport.py +++ b/skills/dff_sport_skill/dialogflows/flows/sport.py @@ -54,6 +54,7 @@ sentry_sdk.init(dsn=os.getenv("SENTRY_DSN")) +LANGUAGE = os.getenv("LANGUAGE", "EN") MASKED_LM_SERVICE_URL = os.getenv("MASKED_LM_SERVICE_URL") @@ -163,7 +164,7 @@ def not_negative_emotion(vars): def compose_topic_offering(excluded_skills=None): excluded_skills = [] if excluded_skills is None else excluded_skills - ask_about_topic = random.choice(common_greeting.GREETING_QUESTIONS["what_to_talk_about"]) + ask_about_topic = random.choice(common_greeting.GREETING_QUESTIONS[LANGUAGE]["what_to_talk_about"]) offer_topics_template = random.choice(common_greeting.TOPIC_OFFERING_TEMPLATES) available_topics = [ diff --git a/skills/dff_wiki_skill/Dockerfile b/skills/dff_wiki_skill/Dockerfile index 159e8da1e5..b1f30f5a8c 100644 --- a/skills/dff_wiki_skill/Dockerfile +++ b/skills/dff_wiki_skill/Dockerfile @@ -16,6 +16,9 @@ RUN bash /scripts/programy_logger_off.sh ARG SERVICE_NAME ENV SERVICE_NAME ${SERVICE_NAME} +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + COPY skills/${SERVICE_NAME}/requirements.txt . 
RUN pip install -r requirements.txt diff --git a/skills/dummy_skill/README.md b/skills/dummy_skill/README.md new file mode 100644 index 0000000000..71c965c714 --- /dev/null +++ b/skills/dummy_skill/README.md @@ -0,0 +1,4 @@ + + +Russian Random questions are collected from https://mensby.com/women/relations/150-voprosov-chtoby-luchshe-uznat-sobesednika-ili-sobesednicu +and https://habr.com/ru/company/testutor/blog/298180/ \ No newline at end of file diff --git a/skills/dummy_skill/connector.py b/skills/dummy_skill/connector.py index de5291cf2c..6af1968bc1 100644 --- a/skills/dummy_skill/connector.py +++ b/skills/dummy_skill/connector.py @@ -15,7 +15,6 @@ import sentry_sdk -from common.ignore_lists import FALSE_POS_NPS_LIST, BAD_NPS_LIST from common.link import ( LIST_OF_SCRIPTED_TOPICS, SKILLS_TO_BE_LINKED_EXCEPT_LOW_RATED, @@ -23,8 +22,14 @@ skills_phrases_map, compose_linkto_with_connection_phrase, ) +from common.remove_lists import NP_REMOVE_LIST, NP_IGNORE_LIST from common.sensitive import is_sensitive_situation -from common.universal_templates import opinion_request_question, is_switch_topic, if_choose_topic +from common.universal_templates import ( + opinion_request_question, + is_switch_topic, + if_choose_topic, + DUMMY_DONTKNOW_RESPONSES, +) from common.utils import get_topics, get_entities, is_no, get_intents, is_yes @@ -42,20 +47,14 @@ TOP_FREQUENT_UNIGRAMS = f.read().splitlines()[:1000] np_ignore_expr = re.compile( - "(" + "|".join([r"\b%s\b" % word for word in BAD_NPS_LIST + TOP_FREQUENT_UNIGRAMS]) + ")", re.IGNORECASE + "(" + "|".join([r"\b%s\b" % word for word in NP_IGNORE_LIST + TOP_FREQUENT_UNIGRAMS]) + ")", re.IGNORECASE ) -np_remove_expr = re.compile("(" + "|".join([r"\b%s\b" % word for word in FALSE_POS_NPS_LIST]) + ")", re.IGNORECASE) +np_remove_expr = re.compile("(" + "|".join([r"\b%s\b" % word for word in NP_REMOVE_LIST]) + ")", re.IGNORECASE) rm_spaces_expr = re.compile(r"\s\s+") ASK_ME_QUESTION_PATTERN = re.compile( r"^(do you have (a 
)?question|(can you|could you)?ask me (something|anything|[a-z ]+question))", re.IGNORECASE ) -donotknow_answers = [ - "What do you want to talk about?", - "I am a bit confused. What would you like to chat about?", - "Sorry, probably, I didn't get what you meant. What do you want to talk about?", - "Sorry, I didn't catch that. What would you like to chat about?", -] with open("skills/dummy_skill/questions_map.json", "r") as f: QUESTIONS_MAP = json.load(f) @@ -69,6 +68,11 @@ with open("skills/dummy_skill/nounphrases_facts_map.json", "r") as f: NP_FACTS = json.load(f) +with open("skills/dummy_skill/russian_random_questions.txt", "r") as f: + RUSSIAN_RANDOM_QUESTIONS = f.readlines() + +RUSSIAN_RANDOM_QUESTIONS = [q.strip() for q in RUSSIAN_RANDOM_QUESTIONS] + class RandomTopicResponder: def __init__(self, filename, topic, text): @@ -195,14 +199,20 @@ async def send(self, payload: Dict, callback: Callable): human_attrs = [] bot_attrs = [] attrs = [] - - cands += [choice(donotknow_answers)] + prev_human_uttr_text = dialog["human_utterances"][-2]["text"] if len(dialog["human_utterances"]) > 1 else "" + is_russian = re.search(r"[а-яА-Я]+", dialog["human_utterances"][-1]["text"]) or re.search( + r"[а-яА-Я]+", prev_human_uttr_text + ) + if is_russian: + cands += [choice(DUMMY_DONTKNOW_RESPONSES["RU"])] + else: + cands += [choice(DUMMY_DONTKNOW_RESPONSES["EN"])] confs += [0.5] attrs += [{"type": "dummy"}] human_attrs += [{}] bot_attrs += [{}] - if len(dialog["utterances"]) > 14 and not is_sensitive_case: + if len(dialog["utterances"]) > 14 and not is_sensitive_case and not is_russian: questions_same_nps = [] for i, nphrase in enumerate(curr_nounphrases): for q_id in NP_QUESTIONS.get(nphrase, []): @@ -217,7 +227,7 @@ async def send(self, payload: Dict, callback: Callable): bot_attrs += [{}] link_to_question, human_attr = get_link_to_question(dialog, all_prev_active_skills) - if link_to_question: + if link_to_question and not is_russian: _prev_bot_uttr = 
dialog["bot_utterances"][-2]["text"] if len(dialog["bot_utterances"]) > 1 else "" _bot_uttr = dialog["bot_utterances"][-1]["text"] if len(dialog["bot_utterances"]) > 0 else "" _prev_active_skill = ( @@ -260,16 +270,26 @@ async def send(self, payload: Dict, callback: Callable): attrs += [{"type": "link_to_for_response_selector"}] human_attrs += [human_attr] bot_attrs += [{}] + elif is_russian: + cands += [random.choice(RUSSIAN_RANDOM_QUESTIONS)] + confs += [0.8] + attrs += [{"type": "link_to_for_response_selector"}] + human_attrs += [{}] + bot_attrs += [{}] - facts_same_nps = [] - for i, nphrase in enumerate(curr_nounphrases): - for fact_id in NP_FACTS.get(nphrase, []): - facts_same_nps += [ - f"Well, now that you've mentioned {nphrase}, I've remembered this. {FACTS_MAP[str(fact_id)]}. " - f"{(opinion_request_question() if random.random() < ASK_QUESTION_PROB else '')}" - ] - - if len(facts_same_nps) > 0 and not is_sensitive_case: + if not is_russian: + facts_same_nps = [] + for i, nphrase in enumerate(curr_nounphrases): + for fact_id in NP_FACTS.get(nphrase, []): + facts_same_nps += [ + f"Well, now that you've mentioned {nphrase}, I've remembered this. " + f"{FACTS_MAP[str(fact_id)]}. " + f"{(opinion_request_question() if random.random() < ASK_QUESTION_PROB else '')}" + ] + else: + facts_same_nps = [] + + if len(facts_same_nps) > 0 and not is_sensitive_case and not is_russian: logger.info("Found special nounphrases for facts. Return fact with the same nounphrase.") cands += [choice(facts_same_nps)] confs += [0.5] diff --git a/skills/dummy_skill/russian_random_questions.txt b/skills/dummy_skill/russian_random_questions.txt new file mode 100644 index 0000000000..2bd052c52a --- /dev/null +++ b/skills/dummy_skill/russian_random_questions.txt @@ -0,0 +1,164 @@ +Что тебя больше всего удивляет в жизни? +Если бы тебе был гарантирован успех на какой-либо должности, что стало бы делом твоей жизни? +Какой наиболее ценный совет в плане развития тебе давали? 
+Ты читаешь какую-либо интересную книгу сейчас? Мне бы хотелось получить какие-нибудь рекомендации. +Есть ли у тебя в телефоне какие-либо приложения, без которых ты уже не можешь жить? +Если бы у тебя была возможность смотреть только один из жанров кино всю оставшуюся жизнь, то какой выберешь? +Какую книгу из тех, которыми все восхищались, ты не любишь? +Есть ли у тебя какие-либо рекомендации в отношении подкастов? +Какой из фильмов за последнее время заставил тебя плакать? +Если бы тебе сказали, что до конца жизни ты можешь есть только один продукт, что бы это было? +Какая привычная еда является для тебя наиболее комфортной? +Есть ли такая еда, которую ты никогда не будешь есть? +Посоветуй, что можно легко и без проблем взять с собой на работу/учебу в качестве ланча кроме бутербродов. +Есть ли в твоей семье какие-либо кулинарные секреты или традиционные рецепты? +Какой у тебя любимый ресторан из тех, которые не очень известны? +Есть ли где-то поблизости «райское» местечко для отдыха? +Если бы у тебя была возможность полететь куда-либо абсолютно бесплатно, что это было бы за место? +Какое самое крутое автопутешествие тебе когда-либо довелось совершить? +Что тебе запомнилось во время твоего последнего отпуска или каникул? +Какой вид отдыха ты предпочитаешь — активный или расслабленный где-нибудь на уютном пляже? +Какое следующее путешествие тебе хотелось бы совершить? +Чем ты больше всего любишь заниматься на выходных? +Есть ли у тебя какие-то скрытые таланты или неожиданные увлечения? +Что из того, что когда-либо происходило с тобой, было самым невероятным событием? +Кто для тебя является примером для подражания? +Какой из советов, который тебе когда-либо давали, оказался самым ценным? +Если бы тебе доверили выбрать восемь объектов, которые бы получили статус «8 чудес света», что было бы в этом списке? +Что бы тебе хотелось положить в «капсулу времени» 15 лет назад? +Какой из всех полученных тобой комплиментов был самым странным?
+Какой сверхспособностью ты хочешь обладать? +Какое у тебя самое любимое, яркое и веселое детское воспоминание? +Расскажи мне, что тебя больше всего веселит и смешит? +Какие три вещи о тебе меня удивят? +Что должно считаться нормальным и должно быть разрешено? +Расскажи мне, что люди о тебе не знают? +Чего ты с нетерпением ждешь в будущем? +Есть ли у тебя секретная способность? +Что делаешь дома, когда никого нет и нечего делать? +О чем ты думаешь сейчас? +Расскажи мне, где и кем ты себя видишь через 5 лет? +А ты слушаешь больше голос сердца или разума? +У тебя есть вторая половина? +Расскажи все самое важное про себя за 1 предложение? +Какие у тебя были или есть прозвища? +Над какой целью и мечтой ты работаешь сейчас? +Чем больше всего ты гордишься в жизни? +Какие у тебя есть вредные привычки? +Что тебя выводит из себя без всякой причины? +Как выглядит отпуск твоей мечты? +Кто твой герой, кумир и эталон? +Какое у тебя хобби? +Что ты мечтаешь донести этому миру? +Если бы тебе надо было поменять имя, то какое оно было бы? +Какие у тебя планы на будущее, профессию и мечты были в детстве? +О чем ты думаешь в первую очередь утром? +Как выглядит твоя идеальная вторая половинка? +В тебя был кто-то тайно влюблен? +Какая у тебя следующая цель для путешествия? +Какой у тебя самый любимый спорт? +Какое событие поменяло твою жизнь в корне? +С кем ты мечтаешь поужинать? +Что вызвало у тебя слезы в последний раз? +Что ты хочешь поменять в себе? +Какое твое самое большое достижение в жизни? +Как ты поддерживаешь себя в спортивной форме? +Как твои знакомые и друзья могут охарактеризовать тебя? +Как выглядела твоя первая влюбленность и любовь? +В какой профессии или роли ты мечтаешь попробовать себя? +Какой твой любимый звук? +Какой твой любимый запах? +Какое твое любимое ощущение? +Как долго тебе хотелось бы прожить? +Что тебе больше нравится: восход или закат? +Что тебе больше нравится: лето или зима? +Что тебе больше нравится: ночь или день?
+Что тебе больше нравится: кошки или собаки? +Какое у тебя самое приятное воспоминание? +Какое у тебя самое ужасное воспоминание? +Какие у тебя отношения с семьей и близкими? +Как много у тебя обуви, одежды и вещей? +Какие у тебя есть шрамы, если не секрет, и как они появились? +Кто знает тебя лучше всех? +Как ты относишься к политике? +Что ты считаешь самым важным в жизни? +Какой у тебя самый любимый праздник? +Кто твой лучший друг или подруга? +Какое у тебя самое любимое блюдо, которое можешь есть до конца жизни? +Какое твое идеальное место жизни на планете? +Где ты хочешь жить? +Кто твой любимый кинозлодей? +В чем смысл жизни? +Веришь ли ты в инопланетян, внеземные формы жизни и мистику? +Ты что-нибудь коллекционируешь? +Какими людьми ты действительно восхищаешься? +Ты разговариваешь с собой, когда никого рядом нет? +Как ты оцениваешь свое прошлое? +Как ты справляешься с гневом, раздражением или депрессией? +Какой опыт ты больше не хочешь получить и повторить? +Как ты потратишь миллион долларов, выигранный в лотерею? +Расскажи мне, чего ты больше всего боишься? +На что ты обращаешь внимание при знакомстве с человеком? +Какое твое самое большое сожаление? +Веришь ли ты в любовь с первого взгляда? +Какую книгу ты сейчас читаешь? +Любишь ли ты сюрпризы и неожиданности? +Какой герой или героиня в сериале «Игра Престолов» похож на тебя по характеру? +Как ты думаешь, что следует взять на необитаемый остров? +О чем ты любишь думать? +Как много у тебя друзей? +Как ты предпочитаешь развлекаться? +Насколько для тебя важны карьера и семья? +Какая твоя главная страсть в жизни? +На какой возраст ты себя ощущаешь? +Любишь ли ты романтику? +Какие самые безумные и интересные поступки были в твоей жизни? +Что ты ценишь больше: комфорт или приключения? +Какой самый лучший подарок тебе доводилось получать? +Какой самый худший подарок тебе доводилось получать? +О чем ты чаще всего врешь? +Что ты хочешь сказать своей одеждой людям? +Какие у тебя есть страхи и фобии?
+Веришь ли ты в дружбу между мужчиной и женщиной? +Что ты мечтаешь изменить в жизни? +Что ты хочешь поменять в прошлом? +Веришь ли ты в бога, религию и реинкарнацию? +Представь, что ты работаешь в цирке. Кто ты? +Какой твой самый смелый и отчаянный поступок в жизни? +Для тебя стакан наполовину полон или наполовину пуст? +Какая твоя любимая цитата или поговорка? +С каким животным ты себя ассоциируешь? +Веришь ли ты в знаки зодиака и совместимости? +Какой самый неловкий момент случался с тобой? +Что такое счастье в твоем понимании? +Какие увлечения и хобби были у тебя в детстве? +В каком месте ты чувствуешь себя наиболее комфортно и уютно? +Как ты поднимаешь себе настроение? +Если бы надо было выбирать, что ты предпочтешь: высокий интеллект или красоту? +Любишь ли ты готовить, и какое твое коронное блюдо? +Какой у тебя был самый любимый предмет в школе или университете? +Как начинается твой обычный день? +Сколько ты мечтаешь иметь детей? +Как ты думаешь, справедливо мстить врагам или лучше прощать? +Что ты хочешь попробовать интересного в жизни? +Какие у тебя любимые сладости и фрукты? +Что ты больше всего ценишь в людях? +Какие книги, фильмы или люди изменили твое мировоззрение? +Что ты мечтаешь попробовать сделать, но пока боишься? +Любишь ли ты следовать эмоциям и страстям? +Что делает тебя счастливым в жизни? +Что такое любовь в твоем понимании? +О чем ты обычно думаешь, когда не можешь заснуть? +С какими внутренними демонами ты борешься сейчас? +Что ты не можешь терпеть от окружающих? +Что ты хочешь делать на свидании? +Какие мысли тебе не дают заснуть по ночам? +Как ты относишься к тому, чтобы следовать эмоциям и чувствам? +Какая у тебя была самая нелепая травма? +Какие спонтанные идеи стали для тебя лучшими? +Какой запах лучше всего тебя описывает? +Чем ты гордишься больше всего и что заставляет тебя себя уважать? +В чем тебя обычно не понимают? +Есть ли у тебя какие-то запреты или табу? +Жертвой каких стереотипов ты являешься? 
+Расскажи мне, какие у тебя были самые нелепые и странные покупки? \ No newline at end of file diff --git a/skills/emotion_skill/Dockerfile b/skills/emotion_skill/Dockerfile index 143729de71..edac90e666 100644 --- a/skills/emotion_skill/Dockerfile +++ b/skills/emotion_skill/Dockerfile @@ -2,6 +2,9 @@ FROM python:3.8.0 WORKDIR /src +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + COPY ./skills/emotion_skill/requirements.txt requirements.txt RUN pip install -r requirements.txt diff --git a/skills/emotion_skill/scenario.py b/skills/emotion_skill/scenario.py index 2ab4353cdb..d0351e43d2 100644 --- a/skills/emotion_skill/scenario.py +++ b/skills/emotion_skill/scenario.py @@ -25,6 +25,8 @@ logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) logger = logging.getLogger(__name__) +LANGUAGE = getenv("LANGUAGE", "EN") + SCRIPTED_TRIGGER_PHRASES = [] for skill in LIST_OF_SCRIPTED_TOPICS: SCRIPTED_TRIGGER_PHRASES.extend(list(skills_phrases_map[skill])) @@ -210,7 +212,7 @@ def __call__(self, dialogs): very_very_confident = very_confident and any( [ how_are_you_response.lower() in prev_bot_phrase.lower() - for how_are_you_response in HOW_ARE_YOU_RESPONSES + for how_are_you_response in HOW_ARE_YOU_RESPONSES[LANGUAGE] ] ) # Confident if regezp diff --git a/skills/meta_script_skill/Dockerfile b/skills/meta_script_skill/Dockerfile index f30fdd6d8c..ee630e0dbe 100644 --- a/skills/meta_script_skill/Dockerfile +++ b/skills/meta_script_skill/Dockerfile @@ -3,6 +3,9 @@ FROM python:3.7.4 RUN mkdir /src RUN mkdir /src/common +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + COPY ./skills/meta_script_skill/requirements.txt /src/requirements.txt RUN pip install -r /src/requirements.txt RUN python -m spacy download en diff --git a/skills/meta_script_skill/meta_script.py b/skills/meta_script_skill/meta_script.py index 73393ad1cf..48e8eab8ad 100644 --- a/skills/meta_script_skill/meta_script.py +++ b/skills/meta_script_skill/meta_script.py @@ -50,6 +50,7 
@@ logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) logger = logging.getLogger(__name__) +LANGUAGE = getenv("LANGUAGE", "EN") WORK_DIR = pathlib.Path(__file__).parent TOPICS = json.load((WORK_DIR / "comet_predefined.json").open()) @@ -249,7 +250,7 @@ def get_response_for_particular_topic_and_status(topic, curr_meta_script_status, prev_what_to_talk_about_outputs = [ get_outputs_with_response_from_dialog(dialog["utterances"][-3:], response=response, activated=True) - for response in GREETING_QUESTIONS[list(GREETING_QUESTIONS.keys())[0]] + for response in GREETING_QUESTIONS[LANGUAGE][list(GREETING_QUESTIONS[LANGUAGE].keys())[0]] ] prev_what_to_talk_about_outputs = sum( [list_of_outputs for list_of_outputs in prev_what_to_talk_about_outputs if len(list_of_outputs) > 0], [] diff --git a/skills/personal_info_skill/Dockerfile b/skills/personal_info_skill/Dockerfile index 176a82e76e..d36c9b138a 100644 --- a/skills/personal_info_skill/Dockerfile +++ b/skills/personal_info_skill/Dockerfile @@ -3,6 +3,9 @@ FROM python:3.7.4 RUN mkdir /src RUN mkdir /src/common +ARG LANGUAGE=EN +ENV LANGUAGE ${LANGUAGE} + COPY ./skills/personal_info_skill/requirements.txt /src/requirements.txt RUN pip install -r /src/requirements.txt diff --git a/skills/personal_info_skill/server.py b/skills/personal_info_skill/server.py index 6ea5239390..966006ec41 100644 --- a/skills/personal_info_skill/server.py +++ b/skills/personal_info_skill/server.py @@ -9,16 +9,45 @@ import sentry_sdk from common.constants import CAN_CONTINUE_SCENARIO, CAN_NOT_CONTINUE, MUST_CONTINUE -from common.weather import ASK_WEATHER_SKILL_FOR_HOMELAND_PHRASE +from common.personal_info import ( + NON_GEOGRAPHICAL_LOCATIONS_COMPILED_PATTERN, + REPEAT_INFO_PHRASES, + ASK_GEOGRAPHICAL_LOCATION_BECAUSE_USER_MISUNDERSTOOD_BOT, + MAX_READABLE_NAME_WORD_LEN, + TELL_USER_HIS_INFO_RESPONSE, + RESPONSE_PHRASES, + TELL_MY_COMPILED_PATTERNS, + BOT_KNOWS_INFO_KEY, + 
BOT_DOESNT_KNOW_INFO_KEY, + BOT_DOESNT_KNOW_USER_INFO_RESPONSES, + ASK_USER_ABOUT_NAME_AGAIN_RESPONSE, + AS_YOU_WISH_RESPONSE, + WHERE_DO_YOU_LIVE_NOW_RESPONSE, + NEVER_HEARD_OF_NAME_RESPONSE, + WHICH_INFO_RU_MAP, + where_are_you_from_pattern, + what_is_your_location_pattern, + is_secret_patterns, + my_name_is_not_pattern, + what_is_your_name_pattern, + my_location_is_pattern, + my_name_is_pattern, + my_origin_is_pattern, + how_do_you_know_my_info_patterns, + how_do_you_know_my_info_responses, +) from common.utils import get_entities, get_named_locations, get_named_persons, is_no, is_yes sentry_sdk.init(getenv("SENTRY_DSN")) +LANGUAGE = getenv("LANGUAGE", "EN") logging.basicConfig(format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO) logger = logging.getLogger(__name__) app = Flask(__name__) +RESPONSE_PHRASES = RESPONSE_PHRASES[LANGUAGE] +BOT_DOESNT_KNOW_USER_INFO_RESPONSES = BOT_DOESNT_KNOW_USER_INFO_RESPONSES[LANGUAGE] @app.route("/respond", methods=["POST"]) @@ -34,28 +63,37 @@ def respond(): for dialog in dialogs_batch: human_attr, bot_attr = {}, {} response, confidence, attr = how_do_you_know_my_info(dialog, which_info="name") + logger.info(f"how_do_you_know_my_info `name` returns {response}") if confidence == 0.0: response, confidence, attr = how_do_you_know_my_info(dialog, which_info="location") + logger.info(f"how_do_you_know_my_info `location` returns {response}") if confidence == 0.0: response, confidence, attr = how_do_you_know_my_info(dialog, which_info="homeland") + logger.info(f"how_do_you_know_my_info `homeland` returns {response}") if confidence == 0.0: response, confidence, human_attr, bot_attr, attr = process_info(dialog, which_info="name") + logger.info(f"process_info `name` returns {response}") if confidence == 0.0: response, confidence, human_attr, bot_attr, attr = process_info(dialog, which_info="homeland") + logger.info(f"process_info `homeland` returns {response}") if confidence == 0.0: response, confidence, 
human_attr, bot_attr, attr = process_info(dialog, which_info="location") + logger.info(f"process_info `location` returns {response}") if confidence == 0.0: response, confidence, attr = tell_my_info(dialog, which_info="name") + logger.info(f"tell_my_info `name` returns {response}") if confidence == 0.0: response, confidence, attr = tell_my_info(dialog, which_info="location") + logger.info(f"tell_my_info `location` returns {response}") if confidence == 0.0: response, confidence, attr = tell_my_info(dialog, which_info="homeland") + logger.info(f"tell_my_info `homeland` returns {response}") responses.append(response) confidences.append(confidence) @@ -68,149 +106,6 @@ def respond(): return jsonify(list(zip(responses, confidences, human_attributes, bot_attributes, attributes))) -what_is_your_name_pattern = re.compile( - r"((what is|what's|whats|tell me|may i know|ask you for) your? name|what name would you like)", re.IGNORECASE -) -my_name_is_pattern = re.compile(r"(my (name is|name's)|call me)", re.IGNORECASE) -_is_not_re = r"(is not|isn't|was not|wasn't|have (not|never) been|haven't been|had (not|never) been|hadn't been)" -my_name_is_not_pattern = re.compile( - rf"my (name is not|name {_is_not_re}|name's not)|not call me|why do you call me|" - rf"(that|this|it) {_is_not_re} my name", - re.IGNORECASE, -) -where_are_you_from_pattern = re.compile( - r"(where are you from|where you (were|was) born|" - r"(what is|what's|whats|tell me) your " - r"(home\s?land|mother\s?land|native\s?land|birth\s?place))", - re.IGNORECASE, -) -my_origin_is_pattern = re.compile( - r"(my ((home\s?land|mother\s?land|native\s?land|birth\s?place) " - r"is|(home\s?land|mother\s?land|native\s?land|birth\s?place)'s)|" - r"(i was|i were) born in|i am from|i'm from)", - re.IGNORECASE, -) -what_is_your_location_pattern = re.compile( - r"((what is|what's|whats|tell me) your? 
location|" - r"where do you live|where are you now|" - r"is that where you live now)", - re.IGNORECASE, -) -my_location_is_pattern = re.compile( - r"my (location is|location's)|(i am|i'm|i)( live| living)? in([a-zA-z ]+)?now", re.IGNORECASE -) - -_name_re = r"(first |last |middle |second )?name" -_tell_re = r"((told|said|gave)|(tells|says|gives)|((have|had) (told|said|given)))" -_you_know_question_re = ( - r"((do|did|can|could) you (know|find out|learn)|(have|had) you (known|found out|learned|learnt))" -) -_how_re = r"(how|where|when|from whom)" -_i_live_re = r"(i lived?|my (house|home) (is|was|have been)|my family live[sd]?)" -_how_do_you_know_question = rf"({_how_re} {_you_know_question_re}|who {_tell_re} you)" -how_do_you_know_my_info_patterns = { - "name": re.compile(rf"{_how_do_you_know_question} (my {_name_re}|what is my {_name_re}|what my {_name_re} is)"), - "location": re.compile(rf"{_how_do_you_know_question} where {_i_live_re}"), - "homeland": re.compile(rf"{_how_do_you_know_question} where i am from"), -} - -_secret_word_re = r"(secret|private|confidential)" -_common_secret_re = rf"(it|this|that) is (a )?{_secret_word_re}|^{_secret_word_re}" -is_secret_patterns = { - "name": re.compile(rf"{_common_secret_re}|(sur|last |first |second |middle )?name is (a )?{_secret_word_re}"), - "location": re.compile(rf"{_common_secret_re}|location is (a )?{_secret_word_re}"), - "homeland": re.compile(rf"{_common_secret_re}"), -} - -BOT_DOESNT_KNOW_INFO_KEY = "bot_doesnt_know_info" -BOT_KNOWS_INFO_KEY = "bot_knows_info" -how_do_you_know_my_info_responses = { - "name": { - BOT_DOESNT_KNOW_INFO_KEY: "Sorry, but I really do not know your name. " - "Would you be so kind to tell me you name?", - BOT_KNOWS_INFO_KEY: "Ah, you have probably forgotten that you told me your name before. " - "Maybe you told me your name the last time we talked.", - }, - "location": { - BOT_DOESNT_KNOW_INFO_KEY: "Sorry, but I really do not know where you live. 
Would tell me?", - BOT_KNOWS_INFO_KEY: "Ah, you have probably forgotten that" - "you told me where you live before. Maybe you told me this the last time we talked.", - }, - "homeland": { - BOT_DOESNT_KNOW_INFO_KEY: "Sorry, but I really do not know where you are from. " - "So, where are you from? I hope i am not tactless.", - BOT_KNOWS_INFO_KEY: "Ah, you have probably forgotten that you told me where you are from before. " - "Maybe you told me this the last time we talked", - }, -} -MAX_READABLE_NAME_WORD_LEN = 20 -NON_GEOGRAPHICAL_LOCATIONS = [ - "hospital", - "school", - "work", - "home", - "car", - "train", - "train station", - "outdoors", - "bed", - "kitchen", - "bedroom", - "bathroom", - "basement", - "jail", - "prison", - "bath", -] -NON_GEOGRAPHICAL_LOCATIONS_COMPILED_PATTERN = re.compile( - r"\b" + r"\b|\b".join(NON_GEOGRAPHICAL_LOCATIONS) + r"\b", re.I -) -ASK_GEOGRAPHICAL_LOCATION_BECAUSE_USER_MISUNDERSTOOD_BOT = { - "homeland": "Sorry, but I probably misheard you. " - "I am just curious to know the region or the city in which you were born", - "location": "Sorry, but I probably misheard you. " "Could you please tell me in which city or region you are now?", -} - -RESPONSE_PHRASES = { - "name": ["Nice to meet you, "], - "location": [ASK_WEATHER_SKILL_FOR_HOMELAND_PHRASE, "Cool!"], - "homeland": ["Is that where you live now?", "Cool!"], -} - -REPEAT_INFO_PHRASES = { - "name": "I didn't get your name. Could you, please, repeat it.", - "location": "I didn't get your location. Could you, please, repeat it.", - "homeland": "I didn't get where you have been born. Could you please repeat it?", -} - -TELL_MY_COMPILED_PATTERNS = { - "name": re.compile( - r"(what is|what's|whats|tell me|you know|you remember|memorize|say) my name|how( [a-zA-Z ]+)?call me|" - r"my name is what|you( can| could| shall| will)? 
tell my name", - re.I, - ), - "location": re.compile( - r"((what is|what's|whats|tell me|you know|you remember|memorize|say) my (location|country|city|town)|" - r"where (am i|i am)(\snow)?|where( do)?i live|where( am)?i( am)? living)|(what|which) " - r"(country|city|town)( do)? (i|am i|i am)", - re.I, - ), - "homeland": re.compile( - r"((what is|what's|whats|tell me|you know|you remember|memorize|say) " - r"my (home\s?land|mother\s?land|home\s?town|native\s?land|birth\s?place)|where (am i|i am) from)", - re.I, - ), -} - -BOT_DOESNT_KNOW_USER_INFO_RESPONSES = { - "name": f"Sorry, we are still not familiar. What is your name?", - "location": f"Sorry, I don't have this information. But you can tell me. What is your location?", - "homeland": f"Sorry, I don't have this information. But you can tell me. Where are you from?", -} - -TELL_USER_HIS_INFO_RESPONSE = "Your {which_info} is {info}." - - def did_user_misunderstand_bot_question_about_geography(found_info_or_user_text, which_info, prev_bot_text): logger.info(f"found_info_or_user_text: {found_info_or_user_text}") logger.info(f"which_info: {which_info}") @@ -221,18 +116,18 @@ def did_user_misunderstand_bot_question_about_geography(found_info_or_user_text, and ( where_are_you_from_pattern.search(prev_bot_text) or what_is_your_location_pattern.search(prev_bot_text) - or REPEAT_INFO_PHRASES[which_info].lower() in prev_bot_text + or REPEAT_INFO_PHRASES[which_info][LANGUAGE].lower() in prev_bot_text ) ) def was_user_asked_to_clarify_info(prev_bot_text, which_info): if which_info == "name": - res = prev_bot_text == REPEAT_INFO_PHRASES[which_info].lower() + res = prev_bot_text == REPEAT_INFO_PHRASES[which_info][LANGUAGE].lower() else: res = ( - prev_bot_text == REPEAT_INFO_PHRASES[which_info].lower() - or prev_bot_text == ASK_GEOGRAPHICAL_LOCATION_BECAUSE_USER_MISUNDERSTOOD_BOT[which_info].lower() + prev_bot_text == REPEAT_INFO_PHRASES[which_info][LANGUAGE].lower() + or prev_bot_text == 
ASK_GEOGRAPHICAL_LOCATION_BECAUSE_USER_MISUNDERSTOOD_BOT[which_info][LANGUAGE].lower() ) return res @@ -262,14 +157,12 @@ def is_secret(user_text, which_info): def user_tells_bot_called_him_wrong(curr_human_annotated_uttr, prev_bot_text, user_profile): name = user_profile.get("name") - if name is None: - res = False - else: - res = ( - my_name_is_not_pattern.search(curr_human_annotated_uttr.get("text", "")) - or TELL_USER_HIS_INFO_RESPONSE.format(which_info="name", info=name).lower() in prev_bot_text - and is_no(curr_human_annotated_uttr) - ) + + lang_which_info = "имя" if LANGUAGE == "RU" else "name" + res = my_name_is_not_pattern.search(curr_human_annotated_uttr.get("text", "")) or ( + TELL_USER_HIS_INFO_RESPONSE[LANGUAGE].format(which_info=lang_which_info, info=name).lower() in prev_bot_text + and is_no(curr_human_annotated_uttr) + ) return res @@ -284,12 +177,8 @@ def process_info(dialog, which_info="name"): curr_user_uttr = curr_uttr_dict["text"].lower() curr_user_annot = curr_uttr_dict["annotations"] bot_utterance_texts = [u["text"].lower() for u in dialog["bot_utterances"]] - try: - prev_bot_uttr = dialog["bot_utterances"][-1]["text"].lower() - except IndexError: - prev_bot_uttr = "" + prev_bot_uttr = bot_utterance_texts[-1] if len(bot_utterance_texts) > 0 else "" - logger.info(f"Previous bot uterance: {prev_bot_uttr}") is_about_templates = { "name": what_is_your_name_pattern.search(prev_bot_uttr) or my_name_is_pattern.search(curr_user_uttr), "homeland": where_are_you_from_pattern.search(prev_bot_uttr) or my_origin_is_pattern.search(curr_user_uttr), @@ -309,37 +198,38 @@ def process_info(dialog, which_info="name"): got_info = False # if user doesn't want to share his info if user_tells_bot_called_him_wrong(curr_uttr_dict, prev_bot_uttr, dialog["human"]["profile"]): - logger.info(f"User says My name is not Blabla") - response = f"My bad. What is your name again?" + logger.info("User says My name is not Blabla. 
Clarify name again") + response = ASK_USER_ABOUT_NAME_AGAIN_RESPONSE[LANGUAGE] confidence = 1.0 got_info = True attr["can_continue"] = MUST_CONTINUE elif (is_about_templates[which_info] or was_user_asked_to_clarify_info(prev_bot_uttr, which_info)) and ( is_no(curr_uttr_dict) or is_secret(curr_user_uttr, which_info) ): - response = "As you wish." + logger.info("User does not want to share private info. Finish") + response = AS_YOU_WISH_RESPONSE[LANGUAGE] confidence = 1.0 attr["can_continue"] = CAN_NOT_CONTINUE return response, confidence, human_attr, bot_attr, attr - elif re.search(r"is that where you live now", prev_bot_uttr) and is_yes(curr_uttr_dict): - logger.info(f"Found location=homeland") + elif re.search(RESPONSE_PHRASES["homeland"][0], prev_bot_uttr, re.IGNORECASE) and is_yes(curr_uttr_dict): + logger.info("Found location=homeland") if dialog["human"]["attributes"].get("homeland", None): human_attr["location"] = dialog["human"]["attributes"]["homeland"] else: found_homeland = check_entities( "homeland", - curr_user_uttr=dialog["utterances"][-3]["text"].lower(), - curr_user_annot=dialog["utterances"][-3]["annotations"], - prev_bot_uttr=dialog["utterances"][-4]["text"].lower(), + curr_user_uttr=dialog["human_utterances"][-2]["text"].lower(), + curr_user_annot=dialog["human_utterances"][-2]["annotations"], + prev_bot_uttr=dialog["bot_utterances"][-2]["text"].lower(), ) human_attr["location"] = found_homeland response = response_phrases["location"] confidence = 1.0 got_info = True attr["can_continue"] = MUST_CONTINUE - elif re.search(r"is that where you live now", prev_bot_uttr) and is_no(curr_uttr_dict): - logger.info(f"Found location is not homeland") - response = f"So, where do you live now?" 
+ elif re.search(RESPONSE_PHRASES["homeland"][0], prev_bot_uttr, re.IGNORECASE) and is_no(curr_uttr_dict): + logger.info("Found location is not homeland") + response = WHERE_DO_YOU_LIVE_NOW_RESPONSE[LANGUAGE] confidence = 1.0 got_info = False attr["can_continue"] = MUST_CONTINUE @@ -351,9 +241,9 @@ def process_info(dialog, which_info="name"): if which_info == "name" and found_info is not None: found_info = filter_unreadable_names(found_info) if found_info is None: - logger.info(f"found_info is None") + logger.info("found_info is None") if did_user_misunderstand_bot_question_about_geography(curr_user_uttr, which_info, prev_bot_uttr): - response = ASK_GEOGRAPHICAL_LOCATION_BECAUSE_USER_MISUNDERSTOOD_BOT[which_info] + response = ASK_GEOGRAPHICAL_LOCATION_BECAUSE_USER_MISUNDERSTOOD_BOT[which_info][LANGUAGE] confidence = 0.9 attr["can_continue"] = CAN_CONTINUE_SCENARIO elif which_info in ["homeland", "location"] and NON_GEOGRAPHICAL_LOCATIONS_COMPILED_PATTERN.search( @@ -371,11 +261,11 @@ def process_info(dialog, which_info="name"): and len(curr_user_uttr.split()) == 1 and len(get_entities(curr_uttr_dict, only_named=False, with_labels=False)) > 0 ): - response = "I've never heard about this name." 
+ response = NEVER_HEARD_OF_NAME_RESPONSE[LANGUAGE] confidence = 1.0 attr["can_continue"] = MUST_CONTINUE else: - response = REPEAT_INFO_PHRASES[which_info] + response = REPEAT_INFO_PHRASES[which_info][LANGUAGE] confidence = 1.0 attr["can_continue"] = MUST_CONTINUE else: @@ -388,7 +278,7 @@ def process_info(dialog, which_info="name"): else: if NON_GEOGRAPHICAL_LOCATIONS_COMPILED_PATTERN.search(found_info): if did_user_misunderstand_bot_question_about_geography(found_info, which_info, prev_bot_uttr): - response = ASK_GEOGRAPHICAL_LOCATION_BECAUSE_USER_MISUNDERSTOOD_BOT[which_info] + response = ASK_GEOGRAPHICAL_LOCATION_BECAUSE_USER_MISUNDERSTOOD_BOT[which_info][LANGUAGE] confidence = 0.9 attr["can_continue"] = CAN_CONTINUE_SCENARIO else: @@ -414,7 +304,7 @@ def process_info(dialog, which_info="name"): def how_do_you_know_my_info(dialog, which_info="name"): - curr_user_uttr = dialog["utterances"][-1]["text"].lower() + curr_user_uttr = dialog["human_utterances"][-1]["text"].lower() how_do_you_know_search_result = how_do_you_know_my_info_patterns[which_info].search(curr_user_uttr) if how_do_you_know_search_result is None: response = "" @@ -422,9 +312,9 @@ def how_do_you_know_my_info(dialog, which_info="name"): attr = {} else: if dialog.get("human", {}).get("profile", {}).get(which_info, ""): - response = how_do_you_know_my_info_responses[which_info][BOT_KNOWS_INFO_KEY] + response = how_do_you_know_my_info_responses[which_info][BOT_KNOWS_INFO_KEY][LANGUAGE] else: - response = how_do_you_know_my_info_responses[which_info][BOT_DOESNT_KNOW_INFO_KEY] + response = how_do_you_know_my_info_responses[which_info][BOT_DOESNT_KNOW_INFO_KEY][LANGUAGE] confidence = 1.0 attr = {"can_continue": MUST_CONTINUE} return response, confidence, attr @@ -435,7 +325,7 @@ def tell_my_info(dialog, which_info="name"): confidence = 0.0 attr = {} - curr_user_uttr = dialog["utterances"][-1]["text"].lower() + curr_user_uttr = dialog["human_utterances"][-1]["text"].lower() if 
TELL_MY_COMPILED_PATTERNS[which_info].search(curr_user_uttr): logger.info(f"Asked to memorize user's {which_info} in {curr_user_uttr}") if dialog["human"]["profile"].get(which_info, None) is None: @@ -444,7 +334,8 @@ def tell_my_info(dialog, which_info="name"): attr["can_continue"] = MUST_CONTINUE else: name = dialog["human"]["profile"][which_info] - response = TELL_USER_HIS_INFO_RESPONSE.format(which_info=which_info, info=name) + lang_which_info = WHICH_INFO_RU_MAP[which_info] if LANGUAGE == "RU" else which_info + response = TELL_USER_HIS_INFO_RESPONSE[LANGUAGE].format(which_info=lang_which_info, info=name) confidence = 1.0 attr["can_continue"] = MUST_CONTINUE return response, confidence, attr @@ -484,8 +375,9 @@ def check_entities(which_info, curr_user_uttr, curr_user_annot, prev_bot_uttr): if "my name is" == ent["text"].lower() or "call me" == ent["text"].lower(): continue if ent["text"].lower() == "alexa": - if re.search(r"(my (name is|name's)|call me) alexa", curr_user_uttr) or ( - re.search(r"(what is|what's|whats|tell me) your? name", prev_bot_uttr) + # this is working only for English + if re.search(r"(my (name is|name's)|call me) alexa", curr_user_uttr, re.IGNORECASE) or ( + re.search(r"(what is|what's|whats|tell me) your? 
name", prev_bot_uttr, re.IGNORECASE) and re.match(r"^alexa[.,!?]*$", curr_user_uttr) ): # - my name is alexa diff --git a/skills/personal_info_skill/test.py b/skills/personal_info_skill/test.py index f384ee6cf0..b411fc89c3 100644 --- a/skills/personal_info_skill/test.py +++ b/skills/personal_info_skill/test.py @@ -1,25 +1,23 @@ +import json +import os import requests SKILL_URL = "http://0.0.0.0:8030/respond" +LANGUAGE = os.getenv("LANGUAGE", "EN") +with open(f"test_{LANGUAGE}.json", "r") as f: + dialogs = json.load(f) + +gold = [] +for dialog in dialogs["dialogs"]: + gold += [dialog.pop("expected_response")] -dialogs = { - "dialogs": [ - { - "utterances": [{"text": "my name is john", "annotations": {"ner": [[{"text": "john", "type": "PER"}]]}}], - "bot_utterances": [], - "human": {"attributes": {}, "profile": {"name": None}}, - "human_utterances": [ - {"text": "my name is john", "annotations": {"ner": [[{"text": "john", "type": "PER"}]]}} - ], - } - ] -} -gold = "Nice to meet you, john." 
result = requests.post(SKILL_URL, json=dialogs, timeout=2) result = result.json() -assert result[0][0] == "Nice to meet you, John.", print(result) +for i in range(len(dialogs["dialogs"])): + print(f"check for uttr `{dialogs['dialogs'][i]['human_utterances'][-1]['text']}`\tgold response: `{gold[i]}`") + assert result[i][0] == gold[i], print(result[i]) print("SUCCESS") diff --git a/skills/personal_info_skill/test_EN.json b/skills/personal_info_skill/test_EN.json new file mode 100644 index 0000000000..a85eeaa417 --- /dev/null +++ b/skills/personal_info_skill/test_EN.json @@ -0,0 +1,320 @@ +{ + "dialogs": [ + { + "expected_response": "Nice to meet you, John.", + "utterances": [ + { + "text": "my name is john.", + "annotations": { + "ner": [ + [ + { + "text": "john", + "type": "PER" + } + ] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "my name is john.", + "annotations": { + "ner": [ + [ + { + "text": "john", + "type": "PER" + } + ] + ] + } + } + ] + }, + { + "expected_response": "My bad. 
What is your name again?", + "utterances": [ + { + "text": "my name isn't john.", + "annotations": { + "ner": [ + [ + { + "text": "john", + "type": "PER" + } + ] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "my name isn't john.", + "annotations": { + "ner": [ + [ + { + "text": "john", + "type": "PER" + } + ] + ] + } + } + ] + }, + { + "expected_response": "Is that where you live now?", + "utterances": [ + { + "text": "i'm from moscow.", + "annotations": { + "ner": [ + [ + { + "text": "moscow", + "type": "LOC" + } + ] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "i'm from moscow.", + "annotations": { + "ner": [ + [ + { + "text": "moscow", + "type": "LOC" + } + ] + ] + } + } + ] + }, + { + "expected_response": "Sorry, but I really do not know your name. Would you be so kind to tell me you name?", + "utterances": [ + { + "text": "how do you know my name?", + "annotations": { + "ner": [ + [] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "how do you know my name?", + "annotations": { + "ner": [ + [] + ] + } + } + ] + }, + { + "expected_response": "Ah, you have probably forgotten that you told me your name before. 
Maybe you told me your name the last time we talked.", + "utterances": [ + { + "text": "how do you know my name?", + "annotations": { + "ner": [ + [] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": "John" + } + }, + "human_utterances": [ + { + "text": "how do you know my name?", + "annotations": { + "ner": [ + [] + ] + } + } + ] + }, + { + "expected_response": "Your name is John.", + "utterances": [ + { + "text": "do you know my name?", + "annotations": { + "ner": [ + [] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": "John" + } + }, + "human_utterances": [ + { + "text": "do you know my name?", + "annotations": { + "ner": [ + [] + ] + } + } + ] + }, + { + "expected_response": "Sorry, we are still not familiar. What is your name?", + "utterances": [ + { + "text": "do you know my name?", + "annotations": { + "ner": [ + [] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "do you know my name?", + "annotations": { + "ner": [ + [] + ] + } + } + ] + }, + { + "expected_response": "I didn't get your name. Could you, please, repeat it.", + "utterances": [ + { + "text": "my name is blabla.", + "annotations": { + "ner": [ + [] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "my name is blabla.", + "annotations": { + "ner": [ + [] + ] + } + } + ] + }, + { + "expected_response": "Sorry, but I probably misheard you. 
Could you please tell me in which city or region you are now?", + "utterances": [ + { + "text": "hi!", + "annotations": { + } + }, + { + "text": "where are you?", + "annotations": { + } + }, + { + "text": "at work.", + "annotations": { + "ner": [ + [] + ] + } + } + ], + "bot_utterances": [ + { + "text": "where are you?", + "annotations": { + } + } + ], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "at work.", + "annotations": { + "ner": [ + [] + ] + } + } + ] + } + ] +} \ No newline at end of file diff --git a/skills/personal_info_skill/test_RU.json b/skills/personal_info_skill/test_RU.json new file mode 100644 index 0000000000..757ab46bae --- /dev/null +++ b/skills/personal_info_skill/test_RU.json @@ -0,0 +1,320 @@ +{ + "dialogs": [ + { + "expected_response": "Приятно познакомиться, Джо.", + "utterances": [ + { + "text": "меня зовут джо.", + "annotations": { + "ner": [ + [ + { + "text": "Джо", + "type": "PER" + } + ] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "Меня зовут джо.", + "annotations": { + "ner": [ + [ + { + "text": "Джо", + "type": "PER" + } + ] + ] + } + } + ] + }, + { + "expected_response": "Ой, извини. 
Как тебя зовут еще раз?", + "utterances": [ + { + "text": "меня зовут не джо.", + "annotations": { + "ner": [ + [ + { + "text": "Джо", + "type": "PER" + } + ] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "меня зовут не джо.", + "annotations": { + "ner": [ + [ + { + "text": "Джо", + "type": "PER" + } + ] + ] + } + } + ] + }, + { + "expected_response": "А сейчас ты живешь в этом же месте?", + "utterances": [ + { + "text": "я родом из москвы.", + "annotations": { + "ner": [ + [ + { + "text": "москвы", + "type": "LOC" + } + ] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "я родом из москвы.", + "annotations": { + "ner": [ + [ + { + "text": "москвы", + "type": "LOC" + } + ] + ] + } + } + ] + }, + { + "expected_response": "Извини, кажется, я еще не знаю, как тебя зовут. Если ты не против, скажи мне, как тебя зовут?", + "utterances": [ + { + "text": "откуда ты знаешь мое имя?", + "annotations": { + "ner": [ + [] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "откуда ты знаешь мое имя?", + "annotations": { + "ner": [ + [] + ] + } + } + ] + }, + { + "expected_response": "Кажется, я уже знаю твое имя из прошлых бесед.", + "utterances": [ + { + "text": "откуда ты знаешь мое имя?", + "annotations": { + "ner": [ + [] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": "Джо" + } + }, + "human_utterances": [ + { + "text": "откуда ты знаешь мое имя?", + "annotations": { + "ner": [ + [] + ] + } + } + ] + }, + { + "expected_response": "Твое имя - Джо.", + "utterances": [ + { + "text": "ты знаешь мое имя?", + "annotations": { + "ner": [ + [] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { 
+ "name": "Джо" + } + }, + "human_utterances": [ + { + "text": "ты знаешь мое имя?", + "annotations": { + "ner": [ + [] + ] + } + } + ] + }, + { + "expected_response": "Извини, кажется, мы еще не знакомы. Как тебя зовут?", + "utterances": [ + { + "text": "ты знаешь мое имя?", + "annotations": { + "ner": [ + [] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "ты знаешь мое имя?", + "annotations": { + "ner": [ + [] + ] + } + } + ] + }, + { + "expected_response": "Ой, я не смогла распознать имя. Можешь, пожалуйста, повторить.", + "utterances": [ + { + "text": "меня зовут пупка.", + "annotations": { + "ner": [ + [] + ] + } + } + ], + "bot_utterances": [], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "меня зовут пупка.", + "annotations": { + "ner": [ + [] + ] + } + } + ] + }, + { + "expected_response": "Извини, кажется, я неверно поняла тебя. 
Я просто хотела спросить, в какой стране или городе ты живешь?", + "utterances": [ + { + "text": "привет!", + "annotations": { + } + }, + { + "text": "Где ты живешь?", + "annotations": { + } + }, + { + "text": "на работе.", + "annotations": { + "ner": [ + [] + ] + } + } + ], + "bot_utterances": [ + { + "text": "Где ты живешь?", + "annotations": { + } + } + ], + "human": { + "attributes": {}, + "profile": { + "name": null + } + }, + "human_utterances": [ + { + "text": "на работе.", + "annotations": { + "ner": [ + [] + ] + } + } + ] + } + ] +} \ No newline at end of file diff --git a/state_formatters/dp_formatters.py b/state_formatters/dp_formatters.py index b4fcb44098..12a7dc3d51 100755 --- a/state_formatters/dp_formatters.py +++ b/state_formatters/dp_formatters.py @@ -179,7 +179,7 @@ def asr_formatter_dialog(dialog: Dict) -> List[Dict]: # Used by: asr_formatter return [ { - "speeches": [dialog["utterances"][-1].get("attributes", {}).get("speech", {})], + "speeches": [dialog["human_utterances"][-1].get("attributes", {}).get("speech", {})], "human_utterances": [dialog["human_utterances"][-3:]], } ] @@ -187,7 +187,7 @@ def asr_formatter_dialog(dialog: Dict) -> List[Dict]: def last_utt_dialog(dialog: Dict) -> List[Dict]: # Used by: dp_toxic_formatter, sent_segm_formatter, tfidf_formatter, sentiment_classification - return [{"sentences": [dialog["utterances"][-1]["text"]]}] + return [{"sentences": [dialog["human_utterances"][-1]["text"]]}] def preproc_last_human_utt_dialog(dialog: Dict) -> List[Dict]: @@ -246,6 +246,26 @@ def preproc_last_human_utt_and_nounphrases_dialog(dialog: Dict) -> List[Dict]: ] +def preproc_and_tokenized_last_human_utt_dialog(dialog: Dict) -> List[Dict]: + # Used by: sentseg over human uttrs + tokens = dialog["human_utterances"][-1]["annotations"].get("spacy_annotator", []) + tokens = [token["text"] for token in tokens] + result = [ + { + "sentences": [ + dialog["human_utterances"][-1]["annotations"].get( + "spelling_preprocessing", 
dialog["human_utterances"][-1]["text"] + ) + ] + } + ] + + if len(tokens): + result[0]["tokenized_sentences"] = [tokens] + + return result + + def last_bot_utt_dialog(dialog: Dict) -> List[Dict]: if len(dialog["bot_utterances"]): return [{"sentences": [dialog["bot_utterances"][-1]["text"]]}] @@ -278,6 +298,18 @@ def hypotheses_segmented_list(dialog: Dict) -> List[Dict]: return [{"sentences": hypots}] +def tokenized_hypotheses_list(dialog: Dict) -> List[Dict]: + hypotheses = dialog["human_utterances"][-1]["hypotheses"] + tokens = [h.get("annotations", {}).get("spacy_annotator", []) for h in hypotheses] + tokens = [[token["text"] for token in h] for h in tokens] + hypots = [h["text"] for h in hypotheses] + if len(tokens): + result = [{"sentences": hypots, "tokenized_sentences": tokens}] + else: + result = [{"sentences": hypots}] + return result + + def ner_hypotheses_segmented_list(dialog: Dict): hypotheses = dialog["human_utterances"][-1]["hypotheses"] hypots = [[h["text"]] for h in hypotheses] @@ -321,7 +353,7 @@ def convers_evaluator_annotator_formatter(dialog: Dict) -> List[Dict]: conv = dict() hypotheses = dialog["human_utterances"][-1]["hypotheses"] conv["hypotheses"] = [h["text"] for h in hypotheses] - conv["currentUtterance"] = dialog["utterances"][-1]["text"] + conv["currentUtterance"] = dialog["human_utterances"][-1]["text"] # cobot recommends to take 2 last utt for conversation evaluation service conv["pastUtterances"] = [uttr["text"] for uttr in dialog["human_utterances"]][-3:-1] conv["pastResponses"] = [uttr["text"] for uttr in dialog["bot_utterances"]][-2:] @@ -480,10 +512,10 @@ def skill_with_attributes_formatter_service(payload: List): def last_utt_sentseg_segments_dialog(dialog: Dict): # Used by: intent_catcher_formatter - if "sentseg" in dialog["utterances"][-1]["annotations"]: - return [{"sentences": [dialog["utterances"][-1]["annotations"]["sentseg"]["segments"]]}] + if "sentseg" in dialog["human_utterances"][-1]["annotations"]: + return 
[{"sentences": [dialog["human_utterances"][-1]["annotations"]["sentseg"]["segments"]]}] else: - segments = [dialog["utterances"][-1]["text"]] + segments = [dialog["human_utterances"][-1]["text"]] return [{"sentences": [segments]}] @@ -737,12 +769,16 @@ def dff_short_story_skill_formatter(dialog: Dict) -> List[Dict]: return utils.dff_formatter(dialog, "dff_short_story_skill") +def dff_generative_skill_formatter(dialog: Dict) -> List[Dict]: + return utils.dff_formatter(dialog, "dff_generative_skill") + + def dff_template_skill_formatter(dialog: Dict) -> List[Dict]: return utils.dff_formatter(dialog, "dff_template_skill") def dff_intent_responder_skill_formatter(dialog: Dict) -> List[Dict]: - intents = list(dialog["utterances"][-1]["annotations"].get("intent_catcher", {}).keys()) + intents = list(dialog["human_utterances"][-1]["annotations"].get("intent_catcher", {}).keys()) called_intents = {intent: False for intent in intents} for utt in dialog["human_utterances"][-5:-1]: called = [intent for intent, value in utt["annotations"].get("intent_catcher", {}).items() if value["detected"]] @@ -908,3 +944,12 @@ def midas_predictor_formatter(dialog: Dict): midas_dist = dialog["human_utterances"][-1].get("annotations", {}).get("midas_classification", [{}])[-1] return [{"last_midas_labels": [max(midas_dist, key=midas_dist.get)], "return_probas": 1}] + + +def hypotheses_with_context_list(dialog: Dict) -> List[Dict]: + hypotheses = dialog["human_utterances"][-1]["hypotheses"] + hypots = [h["text"] for h in hypotheses] + + contexts = len(hypots) * [dialog["human_utterances"][-1]["text"]] + + return [{"dialog_contexts": contexts, "hypotheses": hypots}] diff --git a/state_formatters/utils.py b/state_formatters/utils.py index 2dfcce272f..37956faf4a 100644 --- a/state_formatters/utils.py +++ b/state_formatters/utils.py @@ -135,7 +135,7 @@ def replace_with_annotated_utterances(dialog, mode="punct_sent"): def clean_up_utterances_to_avoid_unwanted_keys( dialog, - wanted_keys=["text", 
"annotations", "active_skill"], + wanted_keys=["text", "annotations", "active_skill", "user"], types_utterances=["human_utterances", "bot_utterances", "utterances"], used_annotations=None, ): @@ -264,7 +264,7 @@ def dff_formatter( "human_utter_index_batch": [human_utter_index], "dialog_batch": [new_dialog], f"{state_name}_batch": [state], - f"dff_shared_state_batch": [dff_shared_state], + "dff_shared_state_batch": [dff_shared_state], "entities_batch": [entities], "used_links_batch": [used_links], "age_group_batch": [age_group], diff --git a/tests/runtests_russian.sh b/tests/runtests_russian.sh new file mode 100755 index 0000000000..cfe70a35e2 --- /dev/null +++ b/tests/runtests_russian.sh @@ -0,0 +1,182 @@ +#!/usr/bin/env bash + +for ARGUMENT in "$@"; do + + KEY=$(echo $ARGUMENT | cut -f1 -d=) + VALUE=$(echo $ARGUMENT | cut -f2 -d=) + + case "$KEY" in + DEVICE) DEVICE=${VALUE} ;; + MODE) MODE=${VALUE} ;; + *) ;; + esac +done + +function wait_service() { + local timeout=${WAIT_TIMEOUT:-480} + local interval=${WAIT_INTERVAL:-10} + local url=$1 + local reply_keyword=$2 + while [[ $timeout -gt 0 ]]; do + local res=$(curl -s -XGET "$url" | grep "$reply_keyword") + if [ ! -z "$res" ]; then + echo FOUND service $url + echo REPLY: $res + return 0 + fi + sleep $interval + ((timeout-=interval)) + echo wait_service $url timeout in $timeout sec.. + done + echo ERROR: $url is not responding + return 1 +} + +function cleanup() { + local exit_status=${1:-$?} + echo SHUTDOWN TESTING ENVIRONMENT.. 
+ + dockercompose_cmd stop + dockercompose_cmd run -T agent bash -c "chown -R $(id -u):$(id -g) /dp-agent" + dockercompose_cmd run -T agent bash -c "find /dp-agent -name __pycache__ | xargs rm -rf" + dockercompose_cmd run -T mongo bash -c "rm -rf /root/data/db/*" + + dockercompose_cmd down + dockercompose_cmd rm mongo + echo EXIT $0 with STATUS: $exit_status +} + +function logger() { + printf '\e[96m%80s\e[39m\n' | tr ' ' = + echo -e "\e[44m \e[49m $@ \e[44m \e[49m" + printf '\e[96m%80s\e[39m\n' | tr ' ' = +} + +function dockercompose_cmd() { + # if [[ "$DEVICE" == "cpu" ]]; then + # DOCKER_COMPOSE_CMD="docker-compose -f docker-compose.yml -f dev.yml -f cpu.yml -f proxy.yml -f s3.yml -p test" + # else + DOCKER_COMPOSE_CMD="docker-compose --no-ansi -p dream -f docker-compose.yml -f assistant_dists/dream_russian/docker-compose.override.yml -f assistant_dists/dream_russian/test.yml" + # fi + eval '$DOCKER_COMPOSE_CMD "$@"' + if [[ $? != 0 ]]; then + logger "FAILED dockercompose_cmd: $@" + fi +} + +function container_is_started() { [ "$(docker ps | grep $1)" ] && return 0 || return 1; } + +if [[ "$DEVICE" == "" ]]; then + DEVICE="gpu" +fi + +if [[ "$MODE" == "" ]]; then + MODE="all" +fi + +if [[ "$MODE" == "clean" ]]; then + cleanup + exit 0 +fi + +set -e + +#DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +DIR=$(dirname $(realpath -s $0)) +#trap cleanup EXIT + +echo running tests on $DEVICE in mode: $MODE + +echo Loading testing env.. 
+AGENT_PORT=${AGENT_PORT:-4242} + +if [[ "$MODE" == "build" ]]; then + dockercompose_cmd build + exit 0 +fi +#dockercompose_cmd logs -f --tail="all" --timestamps & + +if [[ "$MODE" == "start" ]]; then + dockercompose_cmd up -d + dockercompose_cmd logs --no-color -f --tail="all" --timestamps & + wait_service "http://0.0.0.0:$AGENT_PORT/ping" pong + exit 0 +fi + +if [[ "$MODE" == "test_dialog" || "$MODE" == "all" ]]; then + GOLD_OUTPUT_FILE="GOLD-"$(date '+%Y-%m-%d_%H-%M-%S')".csv" + dockercompose_cmd logs --no-color -f --tail="all" --timestamps & + echo "Warmup for tests" + dockercompose_cmd exec -T -u $(id -u) agent python3 tests/dream_russian/test_response.py + + echo "Test workflow bug and asr" + dockercompose_cmd exec -T -u $(id -u) agent python3 tests/test_workflow_bug_and_asr.py + + echo "Pass dialogs from dp-agent" + dockercompose_cmd exec -T -u $(id -u) agent python3 \ + utils/http_api_test.py -u http://0.0.0.0:4242 -cf tests/dream_russian/test_dialogs_gold_phrases.csv -of tests/dream_russian/output/$GOLD_OUTPUT_FILE + + echo "Assert passed dialogs" + if [[ "$DEVICE" == "cpu" ]]; then + dockercompose_cmd exec -T -u $(id -u) agent python3 tests/dream_russian/assert_test_dialogs.py -pred_f tests/dream_russian/output/$GOLD_OUTPUT_FILE -true_f tests/dream_russian/test_dialogs_gold_phrases.csv -time_limit 20 + else + dockercompose_cmd exec -T -u $(id -u) agent python3 tests/dream_russian/assert_test_dialogs.py -pred_f tests/dream_russian/output/$GOLD_OUTPUT_FILE -true_f tests/dream_russian/test_dialogs_gold_phrases.csv -time_limit 10 + fi + + echo "Testing file conflicts" + dockercompose_cmd exec -T agent sh -c 'cd /pavlov/DeepPavlov && git fetch --all --tags --prune && git checkout 0.14.1 && cd /dp-agent/ && python utils/analyze_downloads.py --compose_file assistant_dists/dream_russian/docker-compose.override.yml' + + echo "Testing docker-compose files" + dockercompose_cmd exec -T -u $(id -u) agent python utils/verify_compose.py -d assistant_dists/dream_russian 
+fi + +if [[ "$MODE" == "test_skills" || "$MODE" == "all" ]]; then + # docker-compose -f docker-compose.yml -f dev.yml ps --services | grep -wv -e agent -e mongo + + dockercompose_cmd logs --no-color -f --tail="all" --timestamps & + echo "Passing test data to each skill selected for testing" + + + for container in dff-program-y-skill intent-catcher convers-evaluation-selector personal-info-skill \ + entity-linking wiki-parser badlisted-words spelling-preprocessing sentseg \ + dff-friendship-skill dff-intent-responder-skill entity-detection dialogpt dff-generative-skill \ + dialogrpt spacy-annotator; do + + echo "Run tests for $container" + dockercompose_cmd exec -T -u $(id -u) $container ./test.sh + done + + +# +# echo "Run tests for topicalchat_convert_retrieval container" +# dockercompose_cmd exec -T -u $(id -u) topicalchat_convert_retrieval python /src/test_server.py + + + # + # echo "Run tests for news_skill" + # dockercompose_cmd exec -T -u $(id -u) news_skill python /src/src/test.py + # + + + # + # echo "Run tests for reddit_ner_skill" + # dockercompose_cmd exec -T -u $(id -u) reddit_ner_skill python test.py + # + +fi + +if [[ "$MODE" == "infer_questions" || "$MODE" == "all" ]]; then + dockercompose_cmd logs --no-color -f --tail="all" --timestamps & + echo "Passing questions to Alexa" + dockercompose_cmd exec -T -u $(id -u) agent python3 tests/dream_russian/test_response.py + dockercompose_cmd exec -T -u $(id -u) agent python3 \ + utils/xlsx_responder.py --url http://0.0.0.0:4242 \ + --input 'tests/dream_russian/test_questions.xlsx' \ + --output 'tests/dream_russian/output/test_questions_output.xlsx' + + echo "Computing Q&A metrics" + dockercompose_cmd exec -T -u $(id -u) agent python3 \ + tests/dream_russian/compute_qa_metrics.py \ + --pred_file 'tests/dream_russian/output/test_questions_output.xlsx' \ + --output 'tests/dream_russian/output/qa_metrics.txt' +fi
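
Review note: the recurring fix in `dp_formatters.py` above replaces `dialog["utterances"]` with `dialog["human_utterances"]`, so annotators and skills always receive the last *human* turn even when the bot spoke last (in the combined `utterances` list the final element alternates between sides). A minimal runnable sketch of the patched `last_utt_dialog`; the `dialog` dict here is a hand-built stand-in for the real agent state, which carries many more fields:

```python
# Illustrative stand-in for the agent's dialog state (real state also has
# annotations, hypotheses, attributes, etc.).
dialog = {
    "utterances": [
        {"text": "привет!"},
        {"text": "Где ты живешь?"},  # combined list: the BOT spoke last
    ],
    "human_utterances": [{"text": "привет!"}],
    "bot_utterances": [{"text": "Где ты живешь?"}],
}


def last_utt_dialog(dialog):
    # Patched version: always formats the last HUMAN utterance,
    # regardless of which side produced the most recent turn.
    return [{"sentences": [dialog["human_utterances"][-1]["text"]]}]


print(last_utt_dialog(dialog))
# The pre-patch version indexed dialog["utterances"][-1] and would have
# sent the bot's "Где ты живешь?" to the annotators here instead.
```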
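
The new `preproc_and_tokenized_last_human_utt_dialog` formatter has two fallbacks worth noting: it uses the raw utterance text when the `spelling_preprocessing` annotation is absent, and it attaches `tokenized_sentences` only when `spacy_annotator` actually produced tokens. A self-contained sketch of that logic (the dialog fixtures are illustrative, not taken from real traffic):

```python
def preproc_and_tokenized_last_human_utt_dialog(dialog):
    # Tokens come from the spacy_annotator annotation, if present.
    tokens = dialog["human_utterances"][-1]["annotations"].get("spacy_annotator", [])
    tokens = [token["text"] for token in tokens]
    result = [
        {
            "sentences": [
                # Prefer the spell-checked text; fall back to the raw utterance.
                dialog["human_utterances"][-1]["annotations"].get(
                    "spelling_preprocessing", dialog["human_utterances"][-1]["text"]
                )
            ]
        }
    ]
    if tokens:
        result[0]["tokenized_sentences"] = [tokens]
    return result


# Utterance with no annotations: raw text is used, no token field appears.
bare = {"human_utterances": [{"text": "на работе.", "annotations": {}}]}
print(preproc_and_tokenized_last_human_utt_dialog(bare))

# Fully annotated utterance: corrected text plus tokenized_sentences.
annotated = {
    "human_utterances": [
        {
            "text": "превет",
            "annotations": {
                "spelling_preprocessing": "привет",
                "spacy_annotator": [{"text": "привет"}],
            },
        }
    ]
}
print(preproc_and_tokenized_last_human_utt_dialog(annotated))
```

This mirrors the diff's behavior; downstream services (sentseg over human utterances, per the comment in the patch) can therefore rely on `sentences` always being present while treating `tokenized_sentences` as optional.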
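
The added `hypotheses_with_context_list` formatter pairs every candidate response with one copy of the user's last utterance, presumably so a response-ranking service (dialogrpt is in the test list above, though the patch does not name the consumer) can score each hypothesis against the same context. A sketch with a hypothetical fixture:

```python
def hypotheses_with_context_list(dialog):
    hypotheses = dialog["human_utterances"][-1]["hypotheses"]
    hypots = [h["text"] for h in hypotheses]

    # Repeat the user's last utterance once per hypothesis, so the two
    # lists stay index-aligned for batch scoring.
    contexts = len(hypots) * [dialog["human_utterances"][-1]["text"]]

    return [{"dialog_contexts": contexts, "hypotheses": hypots}]


# Illustrative state: two skill hypotheses attached to the last human turn.
state = {
    "human_utterances": [
        {
            "text": "Где ты живешь?",
            "hypotheses": [{"text": "В Москве."}, {"text": "Хороший вопрос!"}],
        }
    ]
}
print(hypotheses_with_context_list(state))
```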