Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bot kg annotator #586

Merged
merged 261 commits into from
Dec 12, 2023
Merged

Bot kg annotator #586

merged 261 commits into from
Dec 12, 2023

Conversation

annakorz
Copy link
Collaborator

@annakorz annakorz commented Oct 31, 2023

Bot-knowledge-memorizer to store and retrieve data into/from bot's knowledge graph
dream_bot_kg_prompted distribution with dff-bot-knowledge-prompted-skill that demonstrates bot-km in action

IMPORTANT: not to delete the branch after merge

dmitrijeuseew and others added 30 commits April 9, 2023 10:16
This let us avoid assertion error about lists lengths in deeppavlov-kg
This fixes the problem of storing the same entity many times
This prevents creating the same entity many times in KG
annakorz and others added 5 commits December 1, 2023 15:09
* component card changed in pipeline_conf

* sentence-ranker url removed from .env
* changes in bot-km server file

* custom-el formatter changed
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all weights/models files should be downloaded from files.deeppavlov NOT stored in github.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed, prob it was merged from old branches

@@ -154,6 +154,12 @@ def generate_triplets(uttr_batch, relations_pred_batch):
return triplets_corr_batch


def is_question(uttr: str):
uttr = uttr.lower()
is_q = any([uttr.startswith(q_word) for q_word in ["what ", "who ", "when ", "where "]]) or "?" in uttr
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have the same function to use in common (like is_any_question - i do not remember the title clearly but try to find it and use it)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is only is_questions that checks for question marks or another for factoid-only questions in common.utils.py

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_any_question from common.universal_templates.py does not fit in my case. I am checking each sentence and filtering out questions from it, bu tthe function checks the whole uttr from annotations

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not repeating templates you still can do the following:
is_yes({"text": sentence, "annotations": {}})

@@ -219,6 +219,7 @@ services:
command: flask run -h 0.0.0.0 -p 8020
environment:
- FLASK_APP=server
- CUDA_VISIBLE_DEVICES=0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have this info in your component and containers cards?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted

@@ -238,9 +239,9 @@ services:
deploy:
resources:
limits:
memory: 128M
memory: 512M
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you increase the value in component/container cards?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this should stay unchanged.
Reverted

skills/dff_knowledge_prompted_skill/Dockerfile Outdated Show resolved Hide resolved
@@ -37,4 +37,7 @@ ENV ENVVARS_TO_SEND ${ENVVARS_TO_SEND}
ARG USE_KG_DATA
ENV USE_KG_DATA=$USE_KG_DATA

ARG USE_BOT_KG_DATA
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ARG USE_BOT_KG_DATA
ARG USE_BOT_KG_DATA=0

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise in the server.py getting these vars from env will get you None values (not 0 as you pointed out in deault)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for suggestion. Added default value to Dockerfile.

@@ -37,4 +37,7 @@ ENV ENVVARS_TO_SEND ${ENVVARS_TO_SEND}
ARG USE_KG_DATA
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ARG USE_KG_DATA
ARG USE_KG_DATA=0

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

dialog = utils.get_last_n_turns(dialog, bot_last_turns=1)
dialog = utils.replace_with_annotated_utterances(dialog, mode="punct_sent")
if len(dialog["bot_utterances"]):
context = [dialog["bot_utterances"][-1]["text"]]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

herr [str]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

state_formatters/dp_formatters.py Show resolved Hide resolved
dialog = utils.get_last_n_turns(dialog, bot_last_turns=1)
dialog = utils.replace_with_annotated_utterances(dialog, mode="punct_sent")
if len(dialog["utterances"]) >= 2:
dialog_history = [dialog["utterances"][-2]["text"]]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be last humn uttr? I do not get what utterance you are trying to get to history

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be last bot utterance, fixed it

@@ -154,6 +154,12 @@ def generate_triplets(uttr_batch, relations_pred_batch):
return triplets_corr_batch


def is_question(uttr: str):
uttr = uttr.lower()
is_q = any([uttr.startswith(q_word) for q_word in ["what ", "who ", "when ", "where "]]) or "?" in uttr
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not repeating templates you still can do the following:
is_yes({"text": sentence, "annotations": {}})

uttrs.append(sentrewrite(utt_prev_l, utt_cur_l))
if not init_uttrs[0][0]:
triplets_batch = [[""]]
uttrs = [""]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a strange check for me. You jsut check the first element and assign uttrs to list of empty string. But you are not sure that there is only one element. And why do you do this here?
Probably it;'s better to iterate over the batch of input and check every sample and append to uttr empty string if the input init_uttrs[i] is empty (or whatever check is)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was all used to manage empty or very long bot utterances. Now, I removed the check for empty and moved it for later when iterating the utterances. The is_question function is also removed and I reverted the question-check to the way it was before.

entity_substr_list, entity_tags_list, context = prepare_el_input_last_bot(dialog)
else:
entity_substr_list, entity_tags_list, context = [""], [""], [""]
return [{"entity_substr": [entity_substr_list], "entity_tags": [entity_tags_list], "context": [context]}]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in another PR I have already asked for that, so I have to ask you too))
As these are batches, please name them in multiple way, so contexts not `context1 (and in your formatters bellow)

zucchini-nlp and others added 9 commits December 11, 2023 13:18
* changes reflected in services server-files

* style format
* returns the output in the usual form and prevents skill from error
* property-extraction

* entity-detection

* entity-linking

* custom-entity-linking

* ner

* response_selector

* bot-knowledge-memorizer

* dff-bot-knowledge-prompted-skill
* changes reflected in service_configs for dream_kg_prompted and dream_bot_kg_prompted
@@ -0,0 +1,26 @@
name: response_selector
display_name: Ranking-based Response Selector
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you do not need a new card for this response selector because it is already existing

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

YJzc7NwGrLmKp6gfZJh7X1.yml

@@ -0,0 +1,26 @@
name: ner
display_name: NER
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this card already exists iBC0L15gOFWymHhZEAybUQ.yml

dilyararimovna and others added 2 commits December 12, 2023 15:26
* bot-knowledge-prompted-skill

* response_selector
Copy link
Collaborator

@dilyararimovna dilyararimovna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy Birthday! merge like a present
image

@dilyararimovna dilyararimovna merged commit 442bb26 into dev Dec 12, 2023
32 checks passed
@dilyararimovna dilyararimovna deleted the bot_kg_annotator branch December 13, 2023 07:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants