Tracker gets very big for long conversations #3011

Open
wochinge opened this issue Feb 25, 2019 · 7 comments

Comments

@wochinge
Contributor

commented Feb 25, 2019

When a user conversation gets very long, the tracker will get huge and might slow down everything.
Possible solutions:

  • implement some sort of session concept for the tracker
  • only return the last x events
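
A minimal sketch of the second option, assuming the tracker's events are available as a plain list of dicts. `last_events` is a hypothetical helper for illustration, not part of Rasa's API:

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


def last_events(events: List[Event], limit: int) -> List[Event]:
    """Return at most `limit` of the most recent events, oldest first."""
    if limit <= 0:
        # events[-0:] would return the whole list, so handle this explicitly.
        return []
    return events[-limit:]
```

Returning a slice keeps the stored history untouched; only the payload handed back to the caller shrinks.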
@souvikg10

commented Feb 26, 2019

Just one question: the tracker depends on the max history parameter of the training policies, right? Maybe it would be possible to use a hot/cold switch to reduce the footprint: perhaps two KV stores, with the hot store emitting the last x events, where x depends on the max history size.

I would hate to lose the conversations in the tracker store if they were deleted after a user session without proper archiving. I am assuming most folks are using an RDBMS to store the tracker and later use it for analytics.
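
The hot/cold idea could be sketched roughly as follows. This is an in-memory illustration only: `HotColdStore`, `hot_size`, and the method names are hypothetical, and a real setup would back the two stores with a KV store and a database rather than dicts.

```python
from collections import deque
from typing import Any, Deque, Dict, List

Event = Dict[str, Any]


class HotColdStore:
    """Bounded 'hot' view of recent events plus a full 'cold' archive."""

    def __init__(self, hot_size: int) -> None:
        self.hot_size = hot_size
        self.hot: Dict[str, Deque[Event]] = {}
        self.cold: Dict[str, List[Event]] = {}

    def append(self, sender_id: str, event: Event) -> None:
        # Cold store: append-only full history, kept for archiving/analytics.
        self.cold.setdefault(sender_id, []).append(event)
        # Hot store: a deque with maxlen drops the oldest event automatically.
        if sender_id not in self.hot:
            self.hot[sender_id] = deque(maxlen=self.hot_size)
        self.hot[sender_id].append(event)

    def recent(self, sender_id: str) -> List[Event]:
        """Events needed at prediction time (at most hot_size of them)."""
        return list(self.hot.get(sender_id, ()))

    def archive(self, sender_id: str) -> List[Event]:
        """The complete conversation history."""
        return list(self.cold.get(sender_id, ()))
```

Here `hot_size` would be chosen from the max history size mentioned above, so nothing is deleted while the data passed around per request stays bounded.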

@wochinge

Contributor Author

commented Feb 28, 2019

@souvikg10 Thanks for your input! We have not decided yet how to tackle this one, but we will definitely discuss your approach!

@netcarver

commented Feb 28, 2019

I'm pretty new to Rasa, but there seem to be a number of areas that stand out for a possible reduction in the size of the tracker, although I appreciate that for established code some of these ideas may introduce breaking changes...

  1. Limit the number of decimal places recorded for confidence scores.
    Currently I'm seeing scores recorded with confidences like 0.8011654335672462, but see little point in passing around copies of the tracker store with anything after the 3rd decimal place. Isn't 0.801 enough for most cases?

  2. Allow the number of entries recorded in the intent ranking to be configurable (if not already). In my current project I only really need to know the top two. I assume any such setting would need to take into consideration the number of alternatives needed for the fallback policy.

  3. Don't repeat the data. The winning intent from user input is repeated in the intent ranking...

        {
            "event": "user",
            "timestamp": 1551291525.4992907,
            "text": "do you have a facebook page?",
            "parse_data": {
                "intent": {
                    "name": "ask_socialmedia",
                    "confidence": 0.9274418680727952
                },
                "intent_ranking": [
                    {
                        "name": "ask_socialmedia",
                        "confidence": 0.9274418680727952
                    },
                  ...

so if you only need the top two intents, you only need 1 entry in the intent ranking.

  4. In bot events, allow for configurable omission of null data...

        {
            "event": "bot",
            "timestamp": 1551291525.5324903,
            "text": "You can find us on [Facebook](...), [Twitter](...) and [LinkedIn](...).",
            "data": {
                "elements": null,
                "buttons": null,
                "attachment": null
            }
        },
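
Taken together, the four ideas above might look like the sketch below. It works on plain event dicts shaped like the samples in this comment; `compact_event`, `decimals`, and `extra_intents` are hypothetical names, not existing Rasa settings, and any real option would still have to respect what the fallback policy needs from the ranking.

```python
import copy
from typing import Any, Dict

Event = Dict[str, Any]


def compact_event(event: Event, decimals: int = 3, extra_intents: int = 1) -> Event:
    """Return a smaller copy of an event; the input is left unchanged."""
    event = copy.deepcopy(event)

    if event.get("event") == "user":
        parse_data = event.get("parse_data") or {}
        intent = parse_data.get("intent") or {}
        # 1. Limit the decimal places of the winning confidence.
        if "confidence" in intent:
            intent["confidence"] = round(intent["confidence"], decimals)
        ranking = parse_data.get("intent_ranking")
        if ranking is not None:
            # 3. The winning intent is already stored under "intent".
            others = [r for r in ranking if r.get("name") != intent.get("name")]
            # 2. Keep only a configurable number of runner-up intents.
            kept = others[: max(extra_intents, 0)]
            for entry in kept:
                if "confidence" in entry:
                    entry["confidence"] = round(entry["confidence"], decimals)
            parse_data["intent_ranking"] = kept
    elif event.get("event") == "bot":
        data = event.get("data")
        if isinstance(data, dict):
            # 4. Omit keys whose value is null.
            event["data"] = {k: v for k, v in data.items() if v is not None}

    return event
```

With the defaults, a user event keeps the winning intent plus one runner-up, which matches the "top two" case described in point 2.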

@tmbo transferred this issue from RasaHQ/rasa_core Mar 21, 2019

@akelad added the enhancement label Mar 21, 2019

@wochinge

Contributor Author

commented Jul 12, 2019

Community members are also facing this issue: https://forum.rasa.com/t/rasa-performance-for-increasingly-large-trackers/11126/2

I think we should deal with this in the near future.

@lingvisa

commented Jul 17, 2019

I want to add that I am seeing exactly the same issue:

https://forum.rasa.com/t/my-bot-is-running-slow-very-quickly/13018

Rasa Team: it would be great if a solution for this could be implemented soon. Thanks.

@GaoQ1

Contributor

commented Aug 27, 2019

@wochinge So how can this problem be solved? I tried setting max_event_history=100, but slot info is lost once the number of events exceeds 100. So I think we could save the slot info in a separate store and set max_event_history at the same time to reduce the size of the tracker.
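
One way to picture this, as a sketch under assumptions rather than a description of how `max_event_history` behaves: before dropping old events, carry the latest value of each slot forward. The code assumes slot events are dicts of the form `{"event": "slot", "name": ..., "value": ...}`, and `truncate_keeping_slots` is a hypothetical helper.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


def truncate_keeping_slots(events: List[Event], max_events: int) -> List[Event]:
    """Keep the last `max_events` events, preceded by the most recent
    slot event for each slot that would otherwise be dropped."""
    if max_events <= 0:
        dropped, kept = list(events), []
    else:
        dropped, kept = events[:-max_events], events[-max_events:]

    latest: Dict[str, Event] = {}
    for event in dropped:
        if event.get("event") == "slot" and "name" in event:
            # Later slot events overwrite earlier ones for the same slot.
            latest[event["name"]] = event

    return list(latest.values()) + kept
```

The result can exceed `max_events` by the number of distinct slots, and the sketch ignores events that reset slots in other ways (for example a restart), so it is a starting point only.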

@thepurpleowl

commented Sep 4, 2019

Can we also look into this issue? This will happen even with max_history = 0, if there are a lot of checkpoints.

#2765

7 participants