Request feedback from CSE about the input of success markers. #9714

Closed
5 tasks done
aeshky opened this issue Sep 24, 2021 · 6 comments
Labels
area:rasa-oss/success-markers Success markers effort:atom-squad/2 Label which is used by the Rasa Atom squad to do internal estimation of task sizes.

Comments

@aeshky
Contributor

aeshky commented Sep 24, 2021

Description:

We need feedback from CSE about the input of success markers.

We want to check the following:

For the input:

  1. the meaning of the conditions is clear
  2. it’s easy to construct new conditions.

(please add any additional questions we want answering)

Proposed format for feedback request:

Prepare a short doc explaining the feature.
Prepare a set of examples to answer questions that we have. This can be a list of questions, or it can take the form of an exercise (e.g., what does this condition mean? construct the following condition).

Definition of Done:

  • Feedback document (+ questionnaire or exercise) prepared for Enable and CSE
  • Feedback returned from Enable Ty and Research
  • Feedback returned from CSE (identified team members, perhaps 1:1 interview with each?)
  • Implementation proposal updated to reflect changes
  • Issue created to address changes, or existing issues updated
@aeshky aeshky added the area:rasa-oss/success-markers Success markers label Sep 24, 2021
@TyDunn TyDunn added the effort:atom-squad/2 Label which is used by the Rasa Atom squad to do internal estimation of task sizes. label Sep 24, 2021
@aeshky aeshky changed the title from "Request feedback from CSE about the input and the output of success markers." to "Request feedback from CSE about the input of success markers." Oct 4, 2021
@aeshky
Contributor Author

aeshky commented Oct 8, 2021

Blank feedback exercise.

Feedback given from:
Ty, Kedz, Thomas K, Melinda, Greg, Akela, and Ben Q.
Will share feedback.

@aeshky
Contributor Author

aeshky commented Oct 13, 2021

Feedback summary

Notion page
Google form questionnaire results

Ty

  • Ty attempted to combine operators. I didn't give him a chance to read the description, which says that you can't do that. [From then on, I gave all participants a chance to read the instructions first.]
  • There was some ambiguity in the descriptions of the second exercise. Ty suggested framing them as KPIs, which I did, and I think it helped people complete the exercise more easily.
  • The scope of the conditions was not clear to Ty: do conditions apply per dialogue or per turn? [I ended up adding a scope argument to the marker, which can take one of two values: dialogue or user-turn. If the scope is dialogue, the condition is checked once at the end of the dialogue (zero or one marker per tracker). If the scope is user-turn, the condition is checked after every user turn (zero or more markers per tracker).]
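
To make the two scopes concrete, here is a minimal Python sketch. It is not the marker implementation: the event dicts, `evaluate_marker`, and `greeted` are hypothetical names made up for illustration, and it assumes a condition is just a function over the event history seen so far.

```python
from typing import Callable, Dict, List

# Hypothetical event representation: a dict with a "type" key ("user" or "bot").
Event = Dict[str, str]
Condition = Callable[[List[Event]], bool]


def evaluate_marker(events: List[Event], condition: Condition, scope: str) -> List[int]:
    """Return the event indices at which the marker applies."""
    if scope == "dialogue":
        # Checked once against the full history: zero or one marker.
        return [len(events) - 1] if events and condition(events) else []
    if scope == "user-turn":
        # Checked after every user event against the history so far:
        # zero or more markers.
        return [
            i
            for i, event in enumerate(events)
            if event["type"] == "user" and condition(events[: i + 1])
        ]
    raise ValueError(f"unknown scope: {scope!r}")


def greeted(history: List[Event]) -> bool:
    """Example condition: the user has greeted at some point."""
    return any(e["type"] == "user" and e.get("intent") == "greet" for e in history)


events = [
    {"type": "user", "intent": "greet"},
    {"type": "bot", "action": "utter_greet"},
    {"type": "user", "intent": "book"},
]

print(evaluate_marker(events, greeted, "dialogue"))   # [2]
print(evaluate_marker(events, greeted, "user-turn"))  # [0, 2]
```

With scope user-turn the same condition can fire several times in one tracker, which is what makes counting possible.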

Thomas

  • Thomas remarked that boteval checks the conditions only at the end of the tracker, but said he liked the added flexibility of checking the conditions per user-turn (which is similar to weizenbot). He liked that it allows us to count how many times a condition was satisfied in a dialogue (if scope == user-turn).
  • Sometimes slots have a value but it's set to NULL. We should check this in extract markers. (This came up with e2e team).
  • How do we reset slots? false, false, true, true, reset(?), true
  • It's still not super easy to determine what the scope of the marker should be when constructing conditions (this could be linguistic ambiguity in the descriptions, but we need to make sure this is clear in the docs for users).
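
One possible reading of the NULL and reset questions, sketched in Python. This is an assumption, not a decision: it treats a slot event whose value is None as a reset, so a "slot is set" condition turns false again. The `(slot_name, value)` tuples and the function name are hypothetical.

```python
from typing import Any, List, Optional, Tuple

# Hypothetical slot events: (slot_name, value). A value of None models a slot
# that has an entry but is set to NULL.
SlotEvent = Tuple[str, Optional[Any]]


def slot_is_set_after_each_event(events: List[SlotEvent], slot: str) -> List[bool]:
    """Track whether `slot` holds a non-null value after each event."""
    is_set = False
    states = []
    for name, value in events:
        if name == slot:
            # Assumption: None counts as a reset, not as "slot was set".
            is_set = value is not None
        states.append(is_set)
    return states


events = [
    ("name", "Ada"),
    ("city", None),
    ("city", "Berlin"),
    ("name", "Grace"),
    ("city", None),
    ("city", "Paris"),
]

print(slot_is_set_after_each_event(events, "city"))
# [False, False, True, True, False, True]
```

The printed sequence corresponds to the false, false, true, true, reset, true pattern in Thomas's question.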

Kedz

  • In SEQ operations, should we constrain the number of turns between the items in the sequence? For example, checking that an affirm immediately followed an action.
  • How do we process user uttered and bot uttered? String matching or regular expressions? (Also asked by Thomas.)
  • Is there a way to reset conditions?
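
A rough Python sketch of what the first two questions could look like if we supported them. Everything here is hypothetical: `seq_matches`, its `max_gap` parameter, and `text_matches` are illustrative names, and the event history is simplified to a flat list of labels.

```python
import re
from typing import List, Optional


def seq_matches(
    labels: List[str], first: str, second: str, max_gap: Optional[int] = None
) -> bool:
    """True if `second` occurs after `first`, at most `max_gap` events later.

    `max_gap=None` leaves the distance unconstrained.
    """
    for i, label in enumerate(labels):
        if label != first:
            continue
        end = len(labels) if max_gap is None else min(len(labels), i + 1 + max_gap)
        if second in labels[i + 1 : end]:
            return True
    return False


def text_matches(text: str, pattern: str, use_regex: bool = False) -> bool:
    """Compare an utterance by exact string match or by regular expression."""
    if use_regex:
        return re.search(pattern, text) is not None
    return text == pattern


labels = ["action_ask_confirm", "utter_chitchat", "affirm"]

print(seq_matches(labels, "action_ask_confirm", "affirm"))             # True
print(seq_matches(labels, "action_ask_confirm", "affirm", max_gap=1))  # False
print(text_matches("Yes, please!", "yes"))                             # False
print(text_matches("Yes, please!", r"(?i)\byes\b", use_regex=True))    # True
```

With `max_gap=1` the affirm has to be the very next event, which is the "affirm followed an action" case in Kedz's example.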

Melinda

  • Melinda initially used the term "successful" to refer to when the marker applies, but noticed that we mark failed situations too.
  • What happens when we place all items under one heading? Is it an AND or the same operator?

Greg

  • Greg was concerned about the design approach. He felt that writing marker conditions would be a lot of work for customers who already write stories and rules.
  • For analytics he said customers build their own bespoke solution and don't need this, or they would use something like Snowflake. He said he doesn't expect customers to use this feature and didn't understand why we were building it.
  • Greg was not sure what we meant by dialogues. He said we don't use this term anywhere: it's either session or conversation. [We do use the term dialogue in DialogueStateTracker. I prefer dialogue over conversation for reasons to do with the dialogue/NLP literature, where "conversational AI" is used for open-ended domains (chitchat and question answering) whereas "dialogue" is used in task-oriented domains. These conventions probably don't mean much to our customers.]
  • Greg did the first exercise correctly and said it was easy but gave it a neutral score in the questionnaire because he didn't like the feature.
  • He ran out of time and didn't do the second exercise.

Akela

  • Akela said it would be useful to have something to indicate success or failure. Non-technical users will want to filter just by success and failure. [Maybe this can be an extra label, or it can be part of the marker name.]
  • Akela thinks that YAML is not the right interface for enterprise customers. [When we consider offering support for this feature in the enterprise product and Rasa X, we can consider suitable interfaces.]
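
For the success/failure suggestion, a small Python sketch of the "extra label" option. The `MarkerResult` class, its `outcome` field, and the example marker names are all hypothetical; nothing like this exists in the proposal yet.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MarkerResult:
    name: str
    outcome: str  # hypothetical extra label: "success" or "failure"
    dialogue_id: str


def filter_by_outcome(results: List[MarkerResult], outcome: str) -> List[MarkerResult]:
    """Keep only the marker results carrying the given outcome label."""
    return [r for r in results if r.outcome == outcome]


results = [
    MarkerResult("booking_completed", "success", "d1"),
    MarkerResult("handoff_to_human", "failure", "d1"),
    MarkerResult("booking_completed", "success", "d2"),
]

print([r.dialogue_id for r in filter_by_outcome(results, "success")])  # ['d1', 'd2']
```

Keeping the label separate from the marker name would let non-technical users filter without knowing any naming convention.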

Ben Q

  • Ben mentioned that users can pick up old conversations. [How is this represented in tracker stores?]
  • How will we manage the complexity of the conditions as they grow? [We spoke about the possibility of an advanced developer version (Python conditions) and an Enterprise version (a GUI).]
  • Ben suggested giving people templates to change instead of writing conditions from scratch.
  • Ben noticed the connection between steps in stories and SEQ in the marker conditions. If there's a strong connection, we should clarify it.

@aeshky
Contributor Author

aeshky commented Oct 27, 2021

@usc-m I can see you're assigned to this issue. Anything else needs doing besides updating the implementation proposal to reflect the changes (e.g., new syntax, etc.)? I'm happy to update it (or to link to Kathrin's doc).

@usc-m
Contributor

usc-m commented Oct 27, 2021

I wasn't sure what the actionable outcomes here were and was focusing mostly on the code side, so nothing has changed with regards to this ticket, but I have brought up some of the feedback with Kathrin when we were going over things last week and we've tried to incorporate it where possible. The implementation proposal would still need updated to reflect those changes though as you say.

@aeshky
Contributor Author

aeshky commented Oct 27, 2021

Thanks Matt. I ticked off the item "Issue created to address changes, or existing issues updated". Do you think that's accurate or is there something left to add?

@usc-m
Contributor

usc-m commented Oct 27, 2021

I think so, but there's probably a longer term bit that's missing around capturing this feedback in a more long-term form for later iterations beyond the current release (i.e. issues have only been created or modified in response to changes we thought it would make sense to do now, not in the mid-to-long term)

@aeshky aeshky closed this as completed Oct 29, 2021

3 participants