Add the ability to programatically re-run the streamlit script #653

Closed
blester125 opened this issue Nov 8, 2019 · 34 comments · Fixed by #2060
Labels
area:experimental Related to experimental features type:enhancement Requests for feature enhancements or new features

Comments

@blester125

blester125 commented Nov 8, 2019

Problem

I am building a small labeling tool that reads in an example; the user selects labels from a list of checkboxes and clicks submit to save them. On submit, a field marking the example as labeled is added to the data, so the next read skips that example. The problem is that the submit button causes a rerun in which the labels are collected and saved, but it doesn't cause a second rerun in which the data is reread (which would move the app on to the next example).

Solution

I think a solution would be an st.rerun() function that you can call to re-execute the script from the top. In my case this would be called after my save and would trigger a re-read. Basically it would do the same thing I currently use a second Next button for, but without the need for user interaction.

@blester125 blester125 added the type:enhancement Requests for feature enhancements or new features label Nov 8, 2019
@tvst
Contributor

tvst commented Nov 15, 2019

Hey @blester125

We've been thinking a lot about different ways to solve issues like yours recently. We have a few really nice and clean solutions in mind but they will take a little time to implement.

In the meantime, if I understand your app correctly, I think what you need can be implemented by persisting some data between reruns of your script, via a feature we call Session State.

The prototype for Session State can be downloaded from this Gist.

After you download it, you can create a labeling app this way:

import streamlit as st
import SessionState

s = SessionState.get(current_file_index = 0)

# This is a function that loads your file, given the file index.
f = get_file_to_label(s.current_file_index)

# These are your labels.
label1 = st.checkbox("foo")
label2 = st.checkbox("bar")
label3 = st.checkbox("baz")

if st.button("submit"):
  # This is a function that saves labels, given a file index.
  save_labels(s.current_file_index, label1, label2, label3)

  s.current_file_index += 1
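
To see how an index that persists across reruns drives the app forward, here is a minimal, Streamlit-free sketch of the rerun model. Every name in it (`FILES`, `SAVED`, `run_script`, the `SimpleNamespace` state) is hypothetical and only imitates the pattern above; it is not the Gist's implementation.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the files to label and the saved labels.
FILES = ["a.txt", "b.txt", "c.txt"]
SAVED = {}


def run_script(state, submit_clicked, labels=()):
    """Imitate one top-to-bottom run of the app script."""
    shown = FILES[state.current_file_index]  # loaded at the top of the run
    if submit_clicked:
        SAVED[shown] = tuple(labels)
        state.current_file_index += 1  # only takes effect on the next run
    return shown


# State that outlives individual runs, like the session state object above.
state = SimpleNamespace(current_file_index=0)

first = run_script(state, submit_clicked=False)
second = run_script(state, submit_clicked=True, labels=("foo",))
third = run_script(state, submit_clicked=False)
```

Note that the run handling the submit still shows the old file. The next file only appears on the following run, which is the gap the requested rerun call would close.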

That said, it's very possible I misunderstood your app and its requirements. So if you really need to have some way to rerun apps you can actually do it today by delving into Streamlit's internals:

import streamlit as st
from streamlit.ScriptRunner import StopException, RerunException

# do stuff

if st.button("rerun"):
  # do other stuff
  raise RerunException()  # This causes the app to rerun

# even more stuff

You can also use StopException to stop the execution at any point.
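
For readers unfamiliar with exception-driven control flow, the following is a rough sketch of how a runner loop could react to such exceptions. It assumes nothing about Streamlit's internals: `RerunRequested`, `StopRequested`, and `run_until_settled` are hypothetical names used only for illustration.

```python
class RerunRequested(Exception):
    """Hypothetical stand-in for a rerun exception."""


class StopRequested(Exception):
    """Hypothetical stand-in for a stop exception."""


def run_until_settled(script, max_runs=10):
    """Run `script` from the top, restarting whenever it asks for a rerun."""
    runs = 0
    while runs < max_runs:
        runs += 1
        try:
            script(runs)
        except RerunRequested:
            continue  # start again from the top
        except StopRequested:
            break  # abandon the rest of this run
        else:
            break  # script finished normally
    return runs


log = []


def demo_script(run_number):
    log.append(f"run {run_number}: top")
    if run_number == 1:
        raise RerunRequested()  # nothing below executes in this run
    log.append(f"run {run_number}: bottom")


total_runs = run_until_settled(demo_script)
```

The `max_runs` cap is a design choice of this sketch: a script that raises on every run would otherwise loop forever.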

@braaannigan

braaannigan commented Nov 18, 2019

I'm confused by the behaviour on any button/checkbox click. I want to click on multiple checkboxes and then hit a submit button.

When I try to implement your first example below I can never click on the submit button because as soon as I change the first checkbox the submit button disappears and returns me to my initial setup. This is a problem because I want to label multiple items at a time, but I can only ever do one.

Adding in StopExceptions doesn't seem to help either

        label1 = st.checkbox("Relevant",False)
        label2 = st.checkbox("Not relevant",False)
        if label1:
            labelled.loc[row,'relevance'] = 1
            raise StopException()
        if label2:
            labelled.loc[row,'relevance'] = 0
            raise StopException()
        if st.button('submit'):
            labelled = tm.addToTrainValidation(labelled.loc[[row]],train,valid,unlabelled)

Any suggestions?

@braaannigan

Duplicate of #166 I guess

@hcoohb

hcoohb commented Dec 16, 2019

@tvst Thanks for providing a workaround while the permanent functionality is being worked out.
However, by placing the RerunException as indicated, it seems that I get an error:

TypeError: __init__() missing 1 required positional argument: 'rerun_data'
Traceback:
  File "c:\users\fabien~1.val\trial1~1\venv~1\lib\site-packages\streamlit\ScriptRunner.py", line 311, in _run_script
    exec(code, module.__dict__)
  File "C:\Users\Fab\trial 1\streamlit_gui.py", line 110, in <module>
    raise RerunException()

What data should I provide to this exception?
Thanks

@hcoohb

hcoohb commented Dec 16, 2019

Oh, sorry, I just found your gist, which does not raise the error. (However, the values of the GUI elements are reset to their defaults and do not seem to match the UI.)

@demmerichs

This issue has been open for quite some time. Could someone on the team elaborate on the time scale on which this will be implemented, if ever?
Also, the above-mentioned gist no longer seems to work in more recent versions (it works for me under 0.53.0, but under 0.56.0 it fails with an error complaining that _main_dg is not a member of session). Some internal changes probably broke this, so it would be nice if someone could update the gist accordingly.

Thank you for this amazing project anyway; I have become quite fond of it.

@jrhone
Contributor

jrhone commented Mar 23, 2020

Hi @DavidS3141 , here's the updated gist, https://gist.github.com/tvst/036da038ab3e999a64497f42de966a92

Regarding the triage of this issue, will get back to you shortly!

@demmerichs

Sorry, I was not clear. I really need the rerun functionality, as the SessionState workaround is not viable for me. Maybe take another look at the gist I linked (here again) to make sure what I mean. Thanks!

@demmerichs

Okay, I looked at your linked gist again and understood why you linked it, as it shows how to fix the _main_dg issue! I fixed the rerun logic accordingly and the resulting fork can be found here.

@SimonBiggs
Contributor

A neater fix seems to be the following:

import streamlit as st

def rerun():
    raise st.ScriptRunner.RerunException(st.ScriptRequestQueue.RerunData(None))

Worked this out from the comment written over at:

# Data attached to RERUN requests
RerunData = namedtuple(
    "RerunData",
    [
        # WidgetStates protobuf to run the script with. If this is None, the
        # widget_state from the most recent run of the script will be used instead.
        "widget_state"
    ],
)

Meaning, passing None to RerunData defaults to no change in state.
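
The fallback described in that comment can be sketched as follows. The `RerunData` definition mirrors the namedtuple quoted above, but `resolve_widget_state` is a hypothetical helper, and plain dicts stand in for the WidgetStates protobuf; this is an illustration of the rule, not Streamlit's code.

```python
from collections import namedtuple

# Same shape as the namedtuple quoted above.
RerunData = namedtuple("RerunData", ["widget_state"])


def resolve_widget_state(rerun_data, most_recent_state):
    """Pick the widget state for the next run (hypothetical helper)."""
    if rerun_data.widget_state is None:
        # No new state supplied: reuse the state from the most recent run.
        return most_recent_state
    return rerun_data.widget_state


previous = {"checkbox_foo": True}  # dict used as a stand-in for the protobuf
kept = resolve_widget_state(RerunData(None), previous)
replaced = resolve_widget_state(RerunData({"checkbox_foo": False}), previous)
```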

@ZGainsforth

ZGainsforth commented Aug 18, 2020

Another use case for this: I have an output file being created by a molecular dynamics simulation -- every few minutes it outputs the latest results. I have a streamlit app that reads the file and plots relevant information about the simulation so I can see how it is progressing. Right now I have to refresh the page to see updates. But if I had a rerun capability I could just rerun when the file changes, or on a timer.

@SimonBiggs
Contributor

SimonBiggs commented Aug 18, 2020 via email

@ZGainsforth

ZGainsforth commented Aug 18, 2020 via email

@SimonBiggs
Contributor

SimonBiggs commented Aug 18, 2020 via email

@SimonBiggs
Contributor

@tvst would I be able to do a pull request on this one?

@ZGainsforth

Ah, OK. As a workaround I'll give that a shot. Thanks.

@SimonBiggs
Contributor

The code for this has changed to:

def rerun():
    raise st.script_runner.RerunException(st.script_request_queue.RerunData(None))

@SimonBiggs
Contributor

SimonBiggs commented Aug 24, 2020

@ZGainsforth

For reference, below is a tool I have made that allows modules to be flagged to trigger an auto-reload on file change:

# app.py

import st_rerun

import another_module
import and_another

wait_for_rerun = st_rerun.auto_reload_on_module_changes(__name__, [another_module, and_another])

# do stuff ...

wait_for_rerun()

# st_rerun.py

import importlib
import pathlib
import queue
import types

from watchdog import events, observers

import streamlit as st


def rerun():
    raise st.script_runner.RerunException(st.script_request_queue.RerunData(None))


class WatchdogEventHandler(events.FileModifiedEvent):
    def __init__(self, module, module_bucket):
        self.module = module
        self.module_bucket = module_bucket

        super().__init__(self.module.__file__)

    def dispatch(self, event):
        if event.src_path == self.module.__file__:
            self.module_bucket.put(self.module)


def rerun_on_module_reload(module: types.ModuleType, module_bucket):
    observer = observers.polling.PollingObserver()

    module_directory = pathlib.Path(module.__file__).parent

    event_handler = WatchdogEventHandler(module, module_bucket)
    observer.schedule(event_handler, module_directory, recursive=False)

    observer.start()


@st.cache(suppress_st_warning=True)
def auto_reload_on_module_changes(current_module_name, modules):
    current_module = importlib.import_module(current_module_name)

    if isinstance(modules, types.ModuleType):
        modules = [modules]

    modules.append(current_module)
    module_bucket = queue.Queue()

    for module in modules:
        rerun_on_module_reload(module, module_bucket)

    def wait_for_rerun():
        module = module_bucket.get(block=True)
        if module != current_module:
            print(f"Reloading {module.__file__}")
            importlib.reload(module)

        print("Rerunning streamlit")
        rerun()

    return wait_for_rerun

The biggest issue is that I have to have a function at the bottom that blocks any further execution. I was trying to make it so that I could enqueue a ScriptRequest.RERUN onto the ScriptRequestQueue. My thought was that this would let me trigger the rerun from any watchdog thread. But I just could not work out how to do it.
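
The idea of enqueueing a rerun request from a background thread can be illustrated with the standard library alone. This is only a sketch of the pattern: the `RERUN` marker, `watcher`, and `script_runner` are hypothetical, and the queue is a plain `queue.Queue`, not Streamlit's ScriptRequestQueue.

```python
import queue
import threading

RERUN = "RERUN"  # hypothetical request marker


def watcher(request_queue, changed_event):
    """Background thread: wait for a (simulated) file change, then ask for a rerun."""
    changed_event.wait(timeout=5)
    request_queue.put(RERUN)


def script_runner(request_queue, expected_requests):
    """Consume requests from the queue; each RERUN counts as one run."""
    runs = 0
    while runs < expected_requests:
        request = request_queue.get(timeout=5)
        if request == RERUN:
            runs += 1
    return runs


requests = queue.Queue()
file_changed = threading.Event()
thread = threading.Thread(target=watcher, args=(requests, file_changed), daemon=True)
thread.start()

file_changed.set()  # simulate the watchdog callback firing
runs_completed = script_runner(requests, expected_requests=1)
thread.join(timeout=5)
```

Because `queue.Queue` is thread-safe, the watcher never touches the script directly; it only leaves a request for the runner to pick up, so nothing needs to block at the bottom of the script.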

@treuille, might you be able to offer some pointers on how I might be able to append ScriptRequest.RERUN to the ScriptRequestQueue from within a Watchdog thread?

Any help would be massively appreciated.

Cheers,
Simon

@SimonBiggs
Contributor

SimonBiggs commented Aug 24, 2020

Actually! :) I worked it out :).

I am now successfully using the following code within my deployed GUI:

# app.py

import streamlit as st

import rerun

import another_module
import and_another

# this will make it so that streamlit will reload and rerun on module changes
rerun.autoreload([another_module, and_another])

# do stuff ...

if st.button('foo'):
    rerun.rerun()  # this will rerun

# rerun.py

import importlib
import pathlib
import types

from watchdog import events
from watchdog.observers import polling

import streamlit as st


def get_session_id():
    ctx = st.report_thread.get_report_ctx()
    session_id = ctx.session_id

    return session_id


def rerun(session_id=None):
    if session_id is None:
        session_id = get_session_id()

    server = st.server.server.Server.get_current()
    session = server._get_session_info(  # pylint: disable = protected-access
        session_id
    ).session

    session.request_rerun()


class WatchdogEventHandler(events.FileModifiedEvent):
    def __init__(self, module, session_id):
        self.module = module
        self.session_id = session_id

        super().__init__(self.module.__file__)

    def dispatch(self, event):
        if event.src_path == self.module.__file__:
            print(f"Reloading {self.module.__file__}")
            importlib.reload(self.module)
            print("Rerunning streamlit session")
            rerun(self.session_id)


@st.cache()
def reload_and_rerun_on_module_changes(module: types.ModuleType, session_id):
    observer = polling.PollingObserver()

    module_directory = pathlib.Path(module.__file__).parent

    event_handler = WatchdogEventHandler(module, session_id)
    observer.schedule(event_handler, module_directory, recursive=False)

    observer.start()


def autoreload(modules):
    session_id = get_session_id()

    if isinstance(modules, types.ModuleType):
        modules = [modules]

    for module in modules:
        reload_and_rerun_on_module_changes(module, session_id)

@ZGainsforth

Pretty fancy! Submit a pull request?

@SimonBiggs
Contributor

SimonBiggs commented Aug 25, 2020

Yup, I'd be more than happy to. @tvst and @treuille would you guys be okay if I made a PR to add the following three API calls:

  • st.experimental.rerun(session_id=None)
    • If no session_id provided it'll rerun the session of the current thread.
  • st.experimental.autoreload(modules)
    • modules can be an iterable of imported modules (types.ModuleType) or a single imported module.
  • st.experimental.file_contents(path)
    • Returns the contents of a file and triggers a rerun whenever the contents of that file changes (reusing the same logic as used for autoreload).
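
As a rough illustration of the proposed file_contents behaviour, the sketch below returns a file's contents and fires a callback when they change. It polls by comparing contents instead of using watchdog, and `FileContentsWatcher`, `poll`, and the `on_change` callback are all hypothetical names, not part of any Streamlit API.

```python
import os
import tempfile


class FileContentsWatcher:
    """Hypothetical helper: returns a file's contents and reports changes."""

    def __init__(self, path, on_change):
        self.path = path
        self.on_change = on_change
        self._last_contents = self._read()

    def _read(self):
        with open(self.path, "r", encoding="utf-8") as handle:
            return handle.read()

    def contents(self):
        return self._last_contents

    def poll(self):
        """Re-read the file; call `on_change` if the contents differ."""
        current = self._read()
        if current != self._last_contents:
            self._last_contents = current
            self.on_change()
            return True
        return False


rerun_requests = []

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "results.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("step 1")

    watcher = FileContentsWatcher(path, on_change=lambda: rerun_requests.append("rerun"))
    unchanged = watcher.poll()  # nothing has changed yet

    with open(path, "w", encoding="utf-8") as handle:
        handle.write("step 2")
    changed = watcher.poll()  # detects the new contents
    latest = watcher.contents()
```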

@SimonBiggs
Contributor

@ZGainsforth what are your thoughts about the st.experimental.file_contents(path) API mentioned above?

@ZGainsforth

@SimonBiggs The st.experimental.file_contents(path) API is likely to be very useful. I would use it. I think the simple refresh on a single file is probably the most important use case. Thinking ahead, I imagine somebody is going to want to do that on a directory, or a set of specific files or on a network port. However, I think that options after a single file and possibly a single directory yield diminishing returns. This is especially the case since st.experimental.rerun() should suffice for folks who want to rerun after polling the USB port connected to the voltmeter connected to the laser and other fun but very specialized use cases! At least that's my two cents.

@SimonBiggs
Contributor

SimonBiggs commented Aug 26, 2020

I imagine somebody is going to want to do that on a directory

Yup, I actually already wanted to do that over at #543 (comment)

Potentially also a st.experimental.directory_tree('path/to/directory', recursive=False, ignore_directories=False) that updates on watchdog delete, create, or move events and simply provides the user with a list of the contents of that directory. Users can then chain that with the file_contents function to cover multiple files in a directory.

An extra thought on the file_contents function: the contents can actually be cached, and the cache itself can then be updated whenever the watchdog observer pings. That way, when multiple users read from the same file, its contents only need to be read once.
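
A minimal sketch of that caching idea is shown below, under a couple of assumptions: `SharedFileCache` is a hypothetical class, the watcher notification is simulated by calling `invalidate` by hand, and the cache is refreshed lazily on the next read instead of being updated eagerly when the observer pings.

```python
import os
import tempfile


class SharedFileCache:
    """Hypothetical cache: one disk read per change, shared by all readers."""

    def __init__(self):
        self._contents = {}
        self.disk_reads = 0

    def get(self, path):
        if path not in self._contents:
            with open(path, "r", encoding="utf-8") as handle:
                self._contents[path] = handle.read()
            self.disk_reads += 1
        return self._contents[path]

    def invalidate(self, path):
        """Call this from the file-watcher callback when `path` changes."""
        self._contents.pop(path, None)


cache = SharedFileCache()

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "data.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("v1")

    user_a = cache.get(path)
    user_b = cache.get(path)  # served from the cache, no second disk read
    reads_before_change = cache.disk_reads

    with open(path, "w", encoding="utf-8") as handle:
        handle.write("v2")
    cache.invalidate(path)  # simulated watcher notification
    user_c = cache.get(path)
    reads_after_change = cache.disk_reads
```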

@SimonBiggs
Contributor

Some key issues with the implementation provided above. Each session_id has its own observers. This may create quite the memory leak. These observers aren't being closed down whenever a session_id is closed. Instead, I should have one observer per module, and if an observer for a given file is already registered, then add a session_id to it.

I imagine this logic is already handled within streamlit's watchdog code. I may be able to reuse that. Anyway, a bit more thought needed.

Also, in its current implementation, should someone remove the autoreload function from their file, this won't actually stop the autoreload from occurring, until the cache is cleared.

# TODO: Make it so that instead of creating an observer for every
# session_id, instead, if a file is already being observed, just append
# the new session_id to the rerun trigger.

# TODO: Provide a way to automatically deregister the listeners in the
# case where the autoreload function is no longer being called, or
# some modules are no longer being provided to autoreload function

# TODO: Also need to deregister the reload observer when a session is
# closed.
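
One possible shape for the bookkeeping these TODO notes describe is sketched below. `ObserverRegistry` and its methods are hypothetical, and the observers themselves are represented only by counters; a real version would start and stop watchdog observers at the marked points.

```python
class ObserverRegistry:
    """Hypothetical registry: one observer per watched path, many sessions."""

    def __init__(self):
        self._sessions_by_path = {}
        self.observers_started = 0
        self.observers_stopped = 0

    def register(self, path, session_id):
        if path not in self._sessions_by_path:
            self._sessions_by_path[path] = set()
            self.observers_started += 1  # a real observer would start here
        self._sessions_by_path[path].add(session_id)

    def close_session(self, session_id):
        """Drop a session everywhere; stop observers nobody listens to."""
        for path in list(self._sessions_by_path):
            sessions = self._sessions_by_path[path]
            sessions.discard(session_id)
            if not sessions:
                del self._sessions_by_path[path]
                self.observers_stopped += 1  # a real observer would stop here

    def sessions_for(self, path):
        return set(self._sessions_by_path.get(path, ()))


registry = ObserverRegistry()
registry.register("another_module.py", "session-1")
registry.register("another_module.py", "session-2")
started = registry.observers_started  # still one observer for two sessions

registry.close_session("session-1")
remaining = registry.sessions_for("another_module.py")
registry.close_session("session-2")  # last listener gone, observer stops
```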

@tvst
Contributor

tvst commented Aug 31, 2020

Just jumping in here (without reading the whole thread, sorry!) to let y'all know we're planning on adding this feature to st.experimental soon. It will likely be called something like st.experimental_rerun.

So please continue the conversation here and add ideas, etc, as we'll be combing through these before starting any work!

@SimonBiggs
Contributor

Just jumping in here (without reading the whole thread, sorry!)

All good :) I would like to propose including both st.experimental_session_id and st.experimental_rerun(session_id=None), so that a separate thread (say, within a watchdog observer) can use that session ID to trigger the reload.

Also, for just the rerun option, I'd be quite keen to be able to submit the code as a pull request, if you're okay with that. It's a bit nice having the git version control record give attribution for the work I did...

@nthmost
Contributor

nthmost commented Sep 3, 2020

@SimonBiggs I think you're cleared for takeoff if you'd like to submit a PR for this one...!

@nthmost nthmost self-assigned this Sep 3, 2020
@SimonBiggs
Contributor

@SimonBiggs I think you're cleared for takeoff if you'd like to submit a PR for this one...!

Awesome :), thanks for that. I'll jump on it one evening soon :).

@nthmost nthmost removed their assignment Sep 17, 2020
@SimonBiggs
Contributor

Hi @nthmost,

So I managed to find some time to implement this this evening. See the PR over at #2060.

Thanks for letting me add this to the API :)

@nthmost
Contributor

nthmost commented Oct 5, 2020

@SimonBiggs Oh, great! I was on vacation when you dropped this. I'll check it out tomorrow morning!

@SimonBiggs
Contributor

@kmcgrady would you be okay with me opening another issue that covers the key parts discussed in this issue that were not covered by #2060 ?

@kmcgrady
Collaborator

@SimonBiggs Yes! That sounds great. Please do, and just be sure to link in this PR and the issue. Let me know when you file it, and I can then close #2060.

Thank you so much for your help!

@SimonBiggs
Contributor

SimonBiggs commented Oct 14, 2020

@SimonBiggs Yes! That sounds great. Please do, and just be sure to link in this PR and the issue. Let me know when you file it, and I can then close #2060.

Done :) see #2180

Thank you so much for your help!

My pleasure :)
