Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved startup time #8141

Merged
merged 28 commits into from Mar 16, 2021
Merged

Improved startup time #8141

merged 28 commits into from Mar 16, 2021

Conversation

tmbo
Copy link
Member

@tmbo tmbo commented Mar 8, 2021

Proposed changes:

  • avoid importing tensorflow where possible in CLI (tensorflow has an import time of ~4s)
  • command to profile imports locally (requires https://github.com/nschloe/tuna to be installed and python >= 3.7):
    python -X importtime -m rasa.__main__ --help  2> import.log
    tuna import.log
    
  • fixes Improve Rasa CLI start-up time #4280
  • added a test to avoid a regression (checks if tensorflow gets imported on CLI start)

Status (please check what you already did):

  • added some tests for the functionality
  • updated the documentation
  • updated the changelog (please check changelog for instructions)
  • reformat files using black (please check Readme for instructions)

@tmbo tmbo requested a review from wochinge March 8, 2021 16:59
@tmbo
Copy link
Member Author

tmbo commented Mar 8, 2021

changes make the commands --help & init a lot quicker, but I don't think it makes sense to merge anything without a change as it is very easy to break this again.

Copy link
Contributor

@wochinge wochinge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this! This is both amazing for the users and our CI speed 🚀

Regarding the regression test: How about testing for the tensorflow import instead of a certain timing? E.g. you could write a script which

  • imports rasa
  • sets sys.argv according to the commands you want to test
  • runs the command (maybe with mocked behavior)
  • assert that tensorflow isn't in the imported modules

rasa/cli/data.py Outdated Show resolved Hide resolved
rasa/telemetry.py Show resolved Hide resolved
rasa/run.py Outdated Show resolved Hide resolved
rasa/__init__.py Outdated Show resolved Hide resolved
@tmbo tmbo requested a review from a team as a code owner March 10, 2021 19:31
@tmbo tmbo changed the title WIP improved startup time Improved startup time Mar 10, 2021
@tmbo
Copy link
Member Author

tmbo commented Mar 10, 2021

ready for another review 😉

@tmbo tmbo requested a review from wochinge March 10, 2021 20:39
Copy link
Contributor

@wochinge wochinge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for wrapping this up so quickly! Let a few more minor comments but good to go from my side!

rasa/api.py Outdated Show resolved Hide resolved
rasa/core/config.py Outdated Show resolved Hide resolved
rasa/shared/core/constants.py Outdated Show resolved Hide resolved
rasa/core/config.py Outdated Show resolved Hide resolved
tests/cli/test_cli.py Show resolved Hide resolved
tests/shared/core/test_constants.py Outdated Show resolved Hide resolved
@wochinge
Copy link
Contributor

@tmbo How severe is the problem with clashing imports? Would be nice to get it in the release today 👀

tmbo and others added 4 commits March 11, 2021 10:56
Co-authored-by: Tobias Wochinger <t.wochinger@rasa.com>
Co-authored-by: Tobias Wochinger <t.wochinger@rasa.com>
@tmbo
Copy link
Member Author

tmbo commented Mar 11, 2021

@wochinge I made the other changes, but I can't look into the import issue today, my suspicion is that it worked beforehand because had

from rasa.train import train

in rasa.__init__ which populated the global state FIRST with rasa.train (module) and AFTERWARDS with rasa.train() (function) - due to the evaluation order.

Now, we are not importing rasa.train (module) in init anymore, which means that as soon as we do that for the first time, it apparently overwrites the defined function (and apparently windows is using a different resolution logic here?)

state on master:
Bildschirmfoto 2021-03-11 um 11 24 43

state on this branch:
Bildschirmfoto 2021-03-11 um 11 25 04

@tmbo
Copy link
Member Author

tmbo commented Mar 11, 2021

If you have time to investigate or try moving this around I am happy to help but I won't have enough time today to dig deeper myself 😞

@wochinge
Copy link
Contributor

Thanks for the heads up. I'll have a short look but it doesn't seem like something we should rush just to get it in the release. Especially as it might require change to our Python API.

@tmbo
Copy link
Member Author

tmbo commented Mar 11, 2021

I mean there is one quick solution we could try: move the functions back into the original modules and move all the imports in there into the functions (or at least all rasa ones). That will likely work and we could revisit this later.

@wochinge
Copy link
Contributor

I mean there is one quick solution we could try: move the functions back into the original modules and move all the imports in there into the functions (or at least all rasa ones). That will likely work and we could revisit this later.

This seems way harder to maintain. I'm in favor of having these things explict. In my opinion it would be fine to rename the modules (e.g. train -> model_training, test -> model_testing and create_agent from run can go to server.py). They are not part of the official Python API as the official Python API is to use rasa.train(...) etc. so in my opinion this would be ok to do, what do you think?

@tmbo
Copy link
Member Author

tmbo commented Mar 11, 2021

well, if they don't appear in the docs I am good with that 👍

@wochinge
Copy link
Contributor

I searched

  • import rasa.core
  • import rasa.nlu
  • import rasa.test
  • from rasa import
  • from rasa.core (has matches but not related to train, run, test
  • from rasa.nlu
  • from rasa.test

and had 0 matches 🎉 So good to rename 🚀

@wochinge wochinge mentioned this pull request Mar 15, 2021
12 tasks
@tmbo
Copy link
Member Author

tmbo commented Mar 15, 2021

yes that makes sense 👍

@tmbo
Copy link
Member Author

tmbo commented Mar 15, 2021

@wochinge are you able to make these changes? Otherwise I'll put it in enable's inbox to make these additional changes.

@wochinge
Copy link
Contributor

Sure, will wrap it up later 👍🏻

@wochinge wochinge self-assigned this Mar 15, 2021
@wochinge
Copy link
Contributor

What about deleted the __init__ imports (here and here) instead of renaming the train / visualize in rasa.core and train / test in rasa.nlu?

They aren't used except by us and in my opinion users should rather go through the __init__ imports in rasa or rasa.model_training, what do you think? This also avoids needless imports this stuff.

@@ -269,6 +270,33 @@ async def schedule_model_pulling(
)


def create_agent(model: Text, endpoints: Text = None) -> "Agent":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved from rasa.run

@wochinge
Copy link
Contributor

@tmbo Changes are ready. Do you have time for a short review? 🙌🏻

Copy link
Member Author

@tmbo tmbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wochinge can't approve since I opened the PR - but changes look good 👍

rasa/model_training.py Show resolved Hide resolved
rasa/model_testing.py Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve Rasa CLI start-up time
2 participants