Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

custom action_session_start can't access user message metadata #7420

Closed
nys99 opened this issue Nov 30, 2020 · 30 comments · Fixed by #7922
Closed

custom action_session_start can't access user message metadata #7420

nys99 opened this issue Nov 30, 2020 · 30 comments · Fixed by #7922
Assignees
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework area:rasa-oss/training-data Issues focused around Rasa training data (stories, NLU, domain, etc.) effort:atom-squad/2 Label which is used by the Rasa Atom squad to do internal estimation of task sizes. type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors.

Comments

@nys99
Copy link

nys99 commented Nov 30, 2020

Rasa version: 2.1.1

Rasa SDK version (if used & relevant): 2.1.2

Rasa X version (if used & relevant): n/a

Python version: 3.7.9

Operating system (windows, osx, ...): ubuntu 20.04

Issue:
Same issue as: #6348

"Trying to pass metadata to slots using action_session_start; the events key on the tracker data, on the action server request, is arriving empty."

This issue is not present on 1.10.14.

Error (including full traceback):

2020-11-30 14:58:06 WARNING  actions.actions  - session_start but no metadata, tracker.events: []

Command or request that led to error:


Content of configuration file (config.yml) (if relevant):

Content of domain file (domain.yml) (if relevant):

Additional Details (added by @wochinge )

The use case for this is that the first user message contains metadata which is supposed to be passed onto the action_session_started action. To make this a bit more tricky action_session_started is overriden by a custom action.

The problem is a custom action can only access information from the past. However, action_session_started is the first event in the conversation and there is no past (as we execute the session initializtion before we log the UserUttered event). We hence have the task to make metadata available to the custom action before this action actually happened.

Visualization of the problem:

  1. User sends first message
  2. InputChannel extras metadata and passes it along with the UserMessage
  3. processor finds out this is the first message in the conversation and runs action_session_start which is a custom action
  4. action_session_start runs but has no access on the metadata from the UserMessage
  5. processor logs action_session_started + the SessionStarted event on the tracker
  6. the processer runs the NLU prediction and logs it on the tracker (together with the metadata)
  7. the processor predicts and executes further bot actions

The proposed solution is outlined here

@nys99 nys99 added area:rasa-oss 🎡 Anything related to the open source Rasa framework type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors. labels Nov 30, 2020
@sara-tagger
Copy link
Collaborator

Thanks for the issue, @amn41 will get back to you about it soon!

You may find help in the docs and the forum, too 🤗

@rgstephens
Copy link
Contributor

image

@degiz
Copy link
Contributor

degiz commented Jan 18, 2021

I've checked, and found that:

  • This line removed in the screenshot was causing a bug, and we’ve fixed it, so it shouldn't be there indeed
  • Metadata is passed to the action_session_start here

So it's the action that will have a property metadata, that contains the metadata.
Check default implementation ActionSessionStart here, it uses exactly that.

Please note however, that it's up to the income channel to make sure that metadata is actually passed from the webhook down the stack. In case of rest channel, user has to implement get_metadata method, like this:

def get_metadata(self, request: Request) -> Optional[Dict[Text, Any]]:
    return request.json.get("metadata", None)

@degiz
Copy link
Contributor

degiz commented Jan 18, 2021

Probably it makes sense to include the snippet from above to the mainline 🤔

@theresazobel
Copy link

Thanks for debugging this. Where do you add the implementation of get_metadata / which file?

@NareshDen
Copy link

NareshDen commented Jan 19, 2021

I've checked, and found that:

  • This line removed in the screenshot was causing a bug, and we’ve fixed it, so it shouldn't be there indeed
  • Metadata is passed to the action_session_start here

So it's the action that will have a property metadata, that contains the metadata.
Check default implementation ActionSessionStart here, it uses exactly that.

Please note however, that it's up to the income channel to make sure that metadata is actually passed from the webhook down the stack. In case of rest channel, user has to implement get_metadata method, like this:

def get_metadata(self, request: Request) -> Optional[Dict[Text, Any]]:
    return request.json.get("metadata", None)

if action.name() == ACTION_SESSION_START_NAME: action.metadata = metadata
If you do this, this metadata is only passed to default action_session_start ( which is inside rasa OSS ), but we do custom implementation of this action ( which is remote ) and then this metadata is not available. As call to remote action does not have that parameter and attribute you set on "action.metadata = metadata " will not transmit to custom remote action. As tracker is empty, metadata info is lost.
async def run( self, output_channel: "OutputChannel", nlg: "NaturalLanguageGenerator", tracker: "DialogueStateTracker", domain: "Domain", ) -> List[Event]

There is no metadata passed in run method and tracker is empty, metadata used to be in tracker at session start.

@kaustuk
Copy link
Contributor

kaustuk commented Jan 20, 2021

@degiz what ever you mentions doesn't work if we are overwriting action_session_start in actions server to set some custom slot value using metadata for example bot name, customer name, language, etc. So we can solve this issue by two approaches

First approach

Producer part:
Here is the code were we don't pass the metadata to our action server for any actions. The correct implementation should be as follow

def _action_call_format(
        self, tracker: "DialogueStateTracker", domain: "Domain"
    ) -> Dict[Text, Any]:
        """Create the request json send to the action server."""
        from rasa.shared.core.trackers import EventVerbosity
        tracker_state = tracker.current_state(EventVerbosity.ALL)
        .
        value = {
            "next_action": self._name,
            "sender_id": tracker.sender_id,
            "tracker": tracker_state,
            "domain": domain.as_dict(),
            "version": rasa.__version__,
        }
        if self._name == ACTION_SESSION_START_NAME:
            value["metadata"] = self.metadata
        return value

So this will basically send metadata to our actions server for action action_session_start

Consumer part:
Here Add the following code

tracker.metadata = action_call.get("metadata",{})

Note: here we are using tracker to pass the metadata to action run method.

Second approach
As @rgstephens share in above image we can add back the deleted line.

@amn41 What do you think which approach we should take? Also in first approach not sure about consumer part I feel it's not that very clean. Can you give some pointer around it? I would be happy to open PR with any approach

@degiz
Copy link
Contributor

degiz commented Jan 20, 2021

Hey @kaustuk

Second approach

This will cause a session restart each time you attach metadata, so that's unlikely an option.

First approach

This sounds quite ok! I'm not sure if the metadata was implemented with custom actions in mind, but this approach (extending the _action_call_format) seems very reasonable. Feel free to open PR!

@degiz
Copy link
Contributor

degiz commented Jan 20, 2021

Hey @theresazobel

Where do you add the implementation of get_metadata / which file?

I meant InputChannel::get_metadata() (channel.py).
So in case of rest channel it's RestInput::get_metadata() (rest.py)

@kaustuk
Copy link
Contributor

kaustuk commented Jan 20, 2021

Hi @degiz thanks for the reply.

This sounds quite ok! I'm not sure if the metadata was implemented with custom actions in mind, but this approach (extending the _action_call_format) seems very reasonable. Feel free to open PR!

What do you think about consumer part in first approach? Is it good enough? or should do something else?

will create PR to 2.2.x branch, is that fine?

@degiz
Copy link
Contributor

degiz commented Jan 21, 2021

cc @RasaHQ/enable-squad

@NareshDen
Copy link

Hey @kaustuk

Second approach

This will cause a session restart each time you attach metadata, so that's unlikely an option.

First approach

This sounds quite ok! I'm not sure if the metadata was implemented with custom actions in mind, but this approach (extending the _action_call_format) seems very reasonable. Feel free to open PR!

For first approach don't you have to change rasa sdk also?

@kaustuk
Copy link
Contributor

kaustuk commented Jan 25, 2021

For first approach don't you have to change rasa sdk also?

@NareshDen Yeah correct we need to change into Rasa SDK as well. The producer part of the approach would be part of rasa open source and consumer part is for Rasa SDK.

@kaustuk
Copy link
Contributor

kaustuk commented Jan 25, 2021

Hey @amn41 did you got chance to look into this? what do you think about the approach I mentioned above? Should I create both the PRs(for producer and consumer)?

PS: We need this feature to set some slots before conversation start. it would be helpful if you could look into this ASAP or point to someone who can help us out here.

@wochinge
Copy link
Contributor

Second approach
As @rgstephens share in above image we can add back the deleted line.

I think we should do this (for now). That's the quickest solution and qualifies as a bug fix. Note that this leads to the issue that we have two SessionStarted events in the tracker (one before the action_session_start and one after) which I don't think is ideal.

This will cause a session restart each time you attach metadata, so that's unlikely an option.

It doesn't as this code only get's applied in _update_tracker_session and if the tracker session is outdated.

General opinion on this:
This is a tricky problem because we try to pass information from an event to an event before it has happened. The metadata belongs to action_session_started but this action wasn't executed yet. Doing the second SessionStarted event is in my opinion a hack around this. I think we should consider finding a clean solution how to pass this information, e.g. a separate MetadataEvent or an extra action before or some extra field in the request to the SDK.

@kaustuk I can do a quick PR or you change your existing one, which option do you prefer? (cc @RasaHQ/enable-squad )

@wochinge
Copy link
Contributor

This line removed in the screenshot was causing a bug, and we’ve fixed it, so it shouldn't be there indeed

@degiz Which bug are you referring to?

@wochinge
Copy link
Contributor

Ok, @degiz is referring to this PR. I still don't see how this could actually happen, did we ever verify this issue?

@wochinge
Copy link
Contributor

The only reason I can imagine which speaks against this solution is if you have an action_session_start implementation which is deciding dynamically if the session should be restarted.

@kaustuk
Copy link
Contributor

kaustuk commented Jan 28, 2021

@kaustuk I can do a quick PR or you change your existing one, which option do you prefer? (cc @RasaHQ/enable-squad )

@wochinge you can create PR and fix it

@wochinge
Copy link
Contributor

@kaustuk The issue I'm having is that I don't know the motivation of the PR which removed this. Would it be okay to come up with a proper solution (avoiding duplicate SessionStarted events) until the next minor (planned for February 11). I can imagine something along the lines of your current solution or something like a new event.

@kaustuk
Copy link
Contributor

kaustuk commented Jan 28, 2021

@wochinge Just thinking out loud. In second approach, we can modify the if condition to action.name() == ACTION_SESSION_START_NAME then only add the sessionStarted event in tracker. And in action_session_start implementation remove this line.

_events = [SessionStarted(metadata=self.metadata)]

So by this only 1 SessionStarted event would be added in tracker. And who ever is overwriting the action_session_start do not need to add this event as well.

Don't know if it has any side effects or not. What do you think?

@wochinge
Copy link
Contributor

wochinge commented Jan 29, 2021

The first SessionStarted event unfortunately would have side effects for users who decide dynamically in their custom action_sesion_start action whether to restart the session (it's in my opinion an unlikely use case, but it might happen).

I was thinking about it a bit more and I think a nice solution would be to append a SlotSet event for an unfeaturized slot instead of using a second SessionStarted. We could e.g. use the convention that there is a slot action_session_start_metadata which stores the metadata. We could either add this slots to the default slots or users have to define it explicitly in their domain and we document the naming scheme in the docs. What do you think? (cc @akelad @rgstephens )

  1. User sends first message
  2. InputChannel extras metadata and passes it along with the UserMessage
  3. processor finds out this is the first message in the conversation and
    • sets unfeaturized slot action_session_start_metadata with user message metadata
    • runs action_session_start which is a custom action
  4. action_session_start runs but has no access on the metadata from the UserMessage
  5. processor logs action_session_started + the SessionStarted event on the tracker
  6. the processer runs the NLU prediction and logs it on the tracker (together with the metadata)
  7. the processor predicts and executes further bot actions

@alwx alwx added the area:rasa-oss/training-data Issues focused around Rasa training data (stories, NLU, domain, etc.) label Jan 29, 2021
@joejuzl joejuzl added the effort:atom-squad/2 Label which is used by the Rasa Atom squad to do internal estimation of task sizes. label Jan 29, 2021
@kaustuk
Copy link
Contributor

kaustuk commented Jan 30, 2021

The first SessionStarted event unfortunately would have side effects for users who decide dynamically in their custom action_sesion_start action whether to restart the session (it's in my opinion an unlikely use case, but it might happen).

@wochinge Understood. Thanks for the reply.

Assuming you guys would be busy with minor release If we can decide on solution, then I can also help with creating the PR.

@rgstephens
Copy link
Contributor

The first SessionStarted event unfortunately would have side effects for users who decide dynamically in their custom action_session_start action whether to restart the session (it's in my opinion an unlikely use case, but it might happen).

What side effect are you thinking of.

I was thinking about it a bit more and I think a nice solution would be to append a SlotSet event for an unfeaturized slot instead of using a second SessionStarted. We could e.g. use the convention that there is a slot action_session_start_metadata which stores the metadata

This sounds like a breaking change. Existing implementations count on action_session_start including metadata. The user can also structure their metadata in any fashion they want. The proposed change seems like it would require a formal definition of a metadata object that isn't currently required.

Minor point but the idea of a slot called action_session_start_metadata is confusing because it implies to me that it is an Event.

@wochinge
Copy link
Contributor

wochinge commented Feb 2, 2021

What side effect are you thinking of.

If users decide dynamically in their custom action_session_started to not restart the session, they'd have to also revert the potentially session_started event which was added for the metadata. Unfortunately nobody asked in the removal PR what the actual reason was to remove this so I can only guess that this might have been the reason. Having two session_started events definitely wouldn't be clean.

This sounds like a breaking change. Existing implementations count on action_session_start including metadata.

You mean in case action_session_start isn't overriden? We can still keep this.

The user can also structure their metadata in any fashion they want.

We can make the slot value an arbitrary payload.

Minor point but the idea of a slot called action_session_start_metadata is confusing because it implies to me that it is an Event.

👍🏻 Open for renaming suggestions. Note that the entire idea is that we log the metadata as event so the custom action_session_started action can access this data.

@wochinge wochinge changed the title Tracker state is sent with no events to action server custom action_session_start can't access user message metadata Feb 3, 2021
@NareshDen
Copy link

NareshDen commented Feb 5, 2021

To initialize a conversation, the only way to get metadata is through tracker(in some form of event) before session starts. If action_session_start_metadata is set, will it available in action_session_start? I thought tracker is always empty and tracker is updated after. Not sure also if there is limitation on slot value, metadata could be large sometime. I don't understand "custom action_session_started to not restart the session". Why would somebody do that at session_start?

@TyDunn TyDunn added this to the 2.3 Rasa Open Source milestone Feb 5, 2021
@wochinge
Copy link
Contributor

wochinge commented Feb 5, 2021

If action_session_start_metadata is set, will it available in action_session_start?

Yes, it would be an event before action_session_start.

I thought tracker is always empty and tracker is updated after

It's only empty for the first session. However, we can insert events before forwarding the tracker to the action server.

Not sure also if there is limitation on slot value, metadata could be large sometime.

As long as it's serializable we should be fine. It's not different to having the metadata field as part of events.

I don't understand "custom action_session_started to not restart the session". Why would somebody do that at session_start?

Do you mean why would somebody not want to restart the session upon executing action_session_started? I don't know but there might be specific use cases where you'd want to decide not to restart the conversation 🤷🏻 Having two session_started events is definitely unclean and error prone in my opinion.

@rgstephens unless there are further comments in response to my previous one I'd go forward and implement a solution so we can release this with 2.3.

@rgstephens
Copy link
Contributor

I'd like to clarify that in the proposed solution, action_session_start will no longer have the metadata object. Is that correct?

If so, then this is a breaking change from how it worked in the past and anyone migrating from 1.x to 2.3 will still not receive metadata?

@wochinge
Copy link
Contributor

wochinge commented Feb 9, 2021

I'd like to clarify that in the proposed solution, action_session_start will no longer have the metadata object. Is that correct?

We can add it after the custom action was run. To be clear: the metadata was never attached to action_session_start. It was previously attached to the session_started part.

If so, then this is a breaking change from how it worked in the past and anyone migrating from 1.x to 2.3 will still not receive metadata?

The breaking change was done when removing this in 2.0. I don't want to re-introduce a buggy (two SessionStarted events) solution for this.

@wochinge
Copy link
Contributor

wochinge commented Feb 9, 2021

I verified that the bug which was supposedly fixed by #5767, doesn't happen. I inserted metadata with 1.x on every user message, this is the tracker:

{
  "sender_id": "tobi2",
  "slots": {},
  "latest_message": {
    "intent": {
      "name": "greet",
      "confidence": 0.9992140531539917
    },
    "entities": [],
    "intent_ranking": [
      {
        "name": "greet",
        "confidence": 0.9992140531539917
      },
      {
        "name": "affirm",
        "confidence": 0.0005684308125637472
      },
      {
        "name": "mood_unhappy",
        "confidence": 0.0001770617818692699
      },
      {
        "name": "mood_great",
        "confidence": 0.000020368892364786007
      },
      {
        "name": "bot_challenge",
        "confidence": 0.00001121014247473795
      },
      {
        "name": "goodbye",
        "confidence": 0.000007328536412387621
      },
      {
        "name": "deny",
        "confidence": 0.0000015816849554539658
      }
    ],
    "response_selector": {
      "default": {
        "response": {
          "name": null,
          "confidence": 0.0
        },
        "ranking": [],
        "full_retrieval_intent": null
      }
    },
    "text": "hi"
  },
  "latest_event_time": 1612865478.57138,
  "followup_action": null,
  "paused": false,
  "events": [
    {
      "event": "session_started",
      "timestamp": 1612865450.723252,
      "metadata": {
        "metadata": "hardcoded by tobi". # session start with metadata
      }
    },
    {
      "event": "action",
      "timestamp": 1612865450.7233338,
      "name": "action_session_start",
      "policy": null,
      "confidence": null
    },
    {
      "event": "session_started",
      "timestamp": 1612865450.72334
    },
    {
      "event": "action",
      "timestamp": 1612865450.7233531,
      "name": "action_listen",
      "policy": null,
      "confidence": null
    },
    {
      "event": "user",
      "timestamp": 1612865450.9300501,
      "metadata": {
        "metadata": "hardcoded by tobi". # user message with passed metadata
      },
      "text": "hi",
      "parse_data": {
        "intent": {
          "name": "greet",
          "confidence": 0.9992140531539917
        },
        "entities": [],
        "intent_ranking": [
          {
            "name": "greet",
            "confidence": 0.9992140531539917
          },
          {
            "name": "affirm",
            "confidence": 0.0005684308125637472
          },
          {
            "name": "mood_unhappy",
            "confidence": 0.0001770617818692699
          },
          {
            "name": "mood_great",
            "confidence": 0.000020368892364786007
          },
          {
            "name": "bot_challenge",
            "confidence": 0.00001121014247473795
          },
          {
            "name": "goodbye",
            "confidence": 0.000007328536412387621
          },
          {
            "name": "deny",
            "confidence": 0.0000015816849554539658
          }
        ],
        "response_selector": {
          "default": {
            "response": {
              "name": null,
              "confidence": 0.0
            },
            "ranking": [],
            "full_retrieval_intent": null
          }
        },
        "text": "hi"
      },
      "input_channel": "rest",
      "message_id": "097716f9b7d846c8abfbebb210a58bde"
    },
    {
      "event": "action",
      "timestamp": 1612865450.9343739,
      "name": "utter_greet",
      "policy": "policy_0_MemoizationPolicy",
      "confidence": 1.0
    },
    {
      "event": "bot",
      "timestamp": 1612865450.934377,
      "text": "Hey! How are you?",
      "data": {
        "elements": null,
        "quick_replies": null,
        "buttons": null,
        "attachment": null,
        "image": null,
        "custom": null
      },
      "metadata": {}
    },
    {
      "event": "action",
      "timestamp": 1612865450.938633,
      "name": "action_listen",
      "policy": "policy_0_MemoizationPolicy",
      "confidence": 1.0
    },
    {
      "event": "user", # next user message with metadata - no session restart
      "timestamp": 1612865478.567907,
      "metadata": {
        "metadata": "hardcoded by tobi"
      },
      "text": "hi",
      "parse_data": {
        "intent": {
          "name": "greet",
          "confidence": 0.9992140531539917
        },
        "entities": [],
        "intent_ranking": [
          {
            "name": "greet",
            "confidence": 0.9992140531539917
          },
          {
            "name": "affirm",
            "confidence": 0.0005684308125637472
          },
          {
            "name": "mood_unhappy",
            "confidence": 0.0001770617818692699
          },
          {
            "name": "mood_great",
            "confidence": 0.000020368892364786007
          },
          {
            "name": "bot_challenge",
            "confidence": 0.00001121014247473795
          },
          {
            "name": "goodbye",
            "confidence": 0.000007328536412387621
          },
          {
            "name": "deny",
            "confidence": 0.0000015816849554539658
          }
        ],
        "response_selector": {
          "default": {
            "response": {
              "name": null,
              "confidence": 0.0
            },
            "ranking": [],
            "full_retrieval_intent": null
          }
        },
        "text": "hi"
      },
      "input_channel": "rest",
      "message_id": "6451251e0092416a973f659ca96c3ed4"
    },
    {
      "event": "action",
      "timestamp": 1612865478.569858,
      "name": "utter_greet",
      "policy": "policy_1_TEDPolicy",
      "confidence": 0.9876126646995544
    },
    {
      "event": "bot",
      "timestamp": 1612865478.569862,
      "text": "Hey! How are you?",
      "data": {
        "elements": null,
        "quick_replies": null,
        "buttons": null,
        "attachment": null,
        "image": null,
        "custom": null
      },
      "metadata": {}
    },
    {
      "event": "action",
      "timestamp": 1612865478.57138,
      "name": "action_listen",
      "policy": "policy_1_TEDPolicy",
      "confidence": 0.944972038269043
    }
  ],
  "latest_input_channel": "rest",
  "active_form": {},
  "latest_action_name": "action_listen"
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework area:rasa-oss/training-data Issues focused around Rasa training data (stories, NLU, domain, etc.) effort:atom-squad/2 Label which is used by the Rasa Atom squad to do internal estimation of task sizes. type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors.
Projects
None yet
Development

Successfully merging a pull request may close this issue.