Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race conditions resulting in KeyError in module reloading #6404

Open
4 of 5 tasks
martinal opened this issue Mar 31, 2023 · 6 comments
Open
4 of 5 tasks

Race conditions resulting in KeyError in module reloading #6404

martinal opened this issue Mar 31, 2023 · 6 comments
Assignees
Labels
priority:P3 status:confirmed Bug has been confirmed by the Streamlit team type:bug Something isn't working

Comments

@martinal
Copy link

Checklist

  • I have searched the existing issues for similar issues.
  • I added a very descriptive title to this issue.
  • I have provided sufficient information below to help reproduce this issue.

Summary

The streamlit module reloading has race conditions that lead to "KeyError: 'mylib'" from deep inside python import machinery. It has been mentioned in passing in other issues before but here I'm making it reproducible.

Reproducible Code Example

A single code example is not enough to reproduce, but I managed to create a stress test setup that should reproduce the issue quickly by simulating editing files in the background with randomized timing while streamlit is trying to keep up with hotreloading.

Steps To Reproduce

  1. Go to https://github.com/martinal/streamlit-stresstest
  2. Download repo
  3. Run stress test per the repo README

Expected Behavior

No exceptions.

Current Behavior

────────────────────────── Traceback (most recent call last) ───────────────────────────
  /.../streamlit-stresstest/venv/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py:565 in _run_script                                        
                                                                                        
  /.../streamlit-stresstest/main.py:5 in <module>                                   

...                                                                                        
  ❱  5 import mylib                                                                     
...

  in _find_and_load:1027                                                                
  in _find_and_load_unlocked:1006                                                       
  in _load_unlocked:699                                                                 
────────────────────────────────────────────────────────────────────────────────────────
KeyError: 'mylib'

Is this a regression?

  • Yes, this used to work in a previous version.

Debug info

  • Streamlit version: 1.20
  • Python version: 3.10
  • Operating System: Linux
  • Browser: Any
  • Virtual environment: Minimal plus editable modules in PYTHONPATH, see linked gitrepo.

Additional Information

I won't pretend to comprehend the full picture here, this is tricky stuff! Some clues I've found:

The KeyError issue was mentioned in passing in the comments on this issue in October 2022 but that issue is about another topic:
#4595

A possible source of the problem was pointed out in this issue in 2020:
#1430

The problem probably stems from non-threadsafe manipulation of sys.modules such as in this code:
https://github.com/streamlit/streamlit/blob/develop/lib/streamlit/runtime/scriptrunner/script_runner.py#L558

This is similar to what you can optionally have the runpy module do with alter_sys=True, explained in the python documentation as not threadsafe here:
https://docs.python.org/3/library/runpy.html#runpy.run_module

A better solution then might be using runpy.run_module with alter_sys=False. This will however have some side effects, as you mention in comments in script_runner.py. I may be willing to help out fixing this but these side effect issues seem to run deep so the streamlit core developers would need to give their blessing to the approach.

Are you willing to submit a PR?

  • Yes, I am willing to submit a PR!
@martinal martinal added status:needs-triage Has not been triaged by the Streamlit team type:bug Something isn't working labels Mar 31, 2023
@AnOctopus AnOctopus added status:confirmed Bug has been confirmed by the Streamlit team priority:P3 and removed status:needs-triage Has not been triaged by the Streamlit team labels Apr 3, 2023
@AnOctopus
Copy link
Contributor

Thanks for providing the reproduction code, I was able to see the attribute error after tweaking the edit script to run a bit faster.

Since there are some reasons to want the path to be updated, it might not be viable to switch to run_module with the sys modification turned off, but I haven't dived into the exact behavioral requirements of our import machinery in about a year. I'd be happy to work with you on a PR for this, or once I have more slack I might look into it myself, since this kind of thing is tricky to work on in a very fun way.

@martinal
Copy link
Author

Thanks for the fast response. What are the reasons for keeping the sys modifications? Is it only to allow the __name__ == "__main__" notation or is there something else as well?

I do understand you can't just switch behaviour, some user code could break for sure. Myself I would be happy with a configuration option to skip sys modifications that defaults to current behaviour if that's possible. Depends on how deep this goes?

@martinal
Copy link
Author

martinal commented Apr 11, 2023

If it's only to set __name__, there's an argument for that:

runpy.run_module("pagename", run_name="__main__")

@AnOctopus
Copy link
Contributor

The only use I could find mentioned in the commit history is setting the name, but that is used for both the __name__ == "__main__" thing and for making pickle work in the user code.

I tried out switching to run_module (and removing the sys path modification), but when running the repro program I continued to get the errors. This'll take further investigation, but I don't want to get too nerd-sniped by it right now (already spent more time revisiting the import machinery documentation than I intended).

If you want to try out some changes and get something that works on the repro repo, I'll gladly work with you to verify that it still works for the known usecases. Otherwise, I will probably revisit this in a week or two, when I have slightly less competing for my time and attention.

@nsenthilkumar
Copy link

Hi I know this is a little old now but I wanted to +1 that I am encountering this same issue. For me it is not a breaking issue as far as I can tell

@blipk
Copy link

blipk commented Oct 11, 2023

Hi.

I've run into this issue on a project that uses the streamlit module watcher to reload the page from a thread (by modifying an imported file)

To try and debug I excepted the KeyError in streamlits script_runner.py (~L555 on the exec call is where it happens):

            with modified_sys_path(self._main_script_path), self._set_execing_flag():
                # Run callbacks for widgets whose values have changed.
                if rerun_data.widget_states is not None:
                    self._session_state.on_script_will_rerun(rerun_data.widget_states)

                ctx.on_script_start()
                prep_time = timer() - start_time
                print("Y", module.__dict__.keys())
                exec(code, module.__dict__)
                self._session_state.maybe_check_serializable()
                self._session_state[SCRIPT_RUN_WITHOUT_ERRORS_KEY] = True
        except KeyError:
            print("X", module.__dict__.keys())
            raise

Which logs something like this:

Y dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__file__'])
X dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__file__', '__builtins__', 'os', '__streamlitmagic__', 'sys', 'st', 'AuthException', 'LoginError', 'RegisterError', 'ResetError', 'UpdateError', 'log', 'main'])
Y dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__file__'])
Exception in thread ScriptRunner.scriptThread:
Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "/home/user/.local/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 297, in _run_script_thread
    self._run_script(request.rerun_data)
  File "/home/user/.local/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 542, in _run_script
    exec(code, module.__dict__)
  File "/development/streamlitapp/app/app.py", line 56, in <module>
    main()
  File "/development/streamlitapp/app/app.py", line 19, in main
    import routes.routing as routing  # noqa: E402
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/development/streamlitapp/app/routes/routing.py", line 11, in <module>
    from routes.pages import app_pages
  File "/development/streamlitapp/app/routes/pages.py", line 1, in <module>
    from routes.authpage import auth_page
  File "/development/streamlitapp/app/routes/authpage.py", line 21, in <module>
    from core.firestoredb import titan_db
  File "/development/streamlitapp/app/core/firestoredb.py", line 11, in <module>
    from core.models import UserConfig, Company
  File "/development/streamlitapp/app/core/models.py", line 10, in <module>
    from core.external.firebase import firestore_client
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 701, in _load_unlocked
KeyError: 'core.external'

Since I want to rerun anyway the easiest solution I found was to except the KeyError at the root of my streamlit app and call st.rerun() or similar - it would be good to have this more thread safe though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority:P3 status:confirmed Bug has been confirmed by the Streamlit team type:bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants