Description
I think it would be useful to have a source that only emits lines that have been appended to the file since the stream was started.
This could be implemented as a keyword argument to Stream.from_textfile that triggers file.seek:
    def __init__(self, f, poll_interval=0.100, delimiter='\n', start=False,
                 from_end=False, **kwargs):
        # ...
        self.from_end = from_end
        # ...

    def start(self):
        self.stopped = False
        if self.from_end:
            self.file.seek(0, 2)
        self.loop.add_callback(self.do_poll)

Use case:
I'm running simulations of biologically realistic neural networks with NEST, which writes measurements (membrane potentials, etc.) as lines appended to text files. These simulations can run for a long time. I want to use pd.read_csv to quickly load the existing output into a DataFrame and plot the activity, then stream any lines that are appended afterwards so that the plot updates in real time. If the existing output is large, streaming everything instead of using pd.read_csv is prohibitively slow (at least in my implementation; maybe that could be improved).
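For illustration, here is a minimal standard-library sketch of the behaviour I have in mind, independent of streamz. The helper names `open_at_end` and `read_appended_lines` are hypothetical and not part of any library; the sketch assumes a plain text file with `\n` delimiters and leaves the polling loop to the caller.

```python
import os


def open_at_end(path):
    """Open `path` for reading with the cursor at the current end of file.

    Hypothetical helper: anything already in the file is skipped, so only
    text appended after this call will be seen by later reads.
    """
    f = open(path, "r")
    f.seek(0, os.SEEK_END)  # same effect as f.seek(0, 2)
    return f


def read_appended_lines(f, leftover=""):
    """Return (complete_lines, leftover) for text appended since the last call.

    `leftover` carries a trailing partial line (one without its newline yet)
    over to the next call, so a line is only emitted once it is complete.
    """
    data = leftover + f.read()
    *lines, leftover = data.split("\n")
    return lines, leftover
```

A caller would invoke `read_appended_lines` on a timer (for example every `poll_interval` seconds) and pass the returned leftover back in on the next call. One caveat with this approach: any lines appended between the bulk load with pd.read_csv and the seek to the end would be skipped, so a real implementation might need to record the offset where the bulk load stopped instead.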