
Slow simulation in PySD #374

Closed
jmf119 opened this issue Oct 2, 2022 · 7 comments

@jmf119

jmf119 commented Oct 2, 2022

How do I speed up simulation in PySD? The model I translated directly from Vensim takes about 10 minutes to run once using PySD, but simulates in milliseconds in Vensim. What are some tips for improving run time in PySD?

@enekomartinmartinez
Collaborator

Hi @jmf119

We have put a lot of work into improving the performance of PySD. Nevertheless, it can still be far slower than Vensim, especially for models with many variables and subscripts.

Some suggestions (all of them can be applied as arguments when calling the run method):

  • Use the return_columns argument to select only the variables you need.
  • If you don't need to save variables at every step, make sure your saveper value is greater than time_step, or select the returned values with return_timestamps.
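To illustrate why saving fewer points helps, here is a toy sketch of the idea (this is not PySD's actual integrator; the `simulate` function and its `save_every` parameter are made up for illustration). Recording output only every Nth step, as a saveper larger than time_step does, shrinks the bookkeeping done per step and the amount of data stored:

```python
# Toy Euler integrator (hypothetical, for illustration only): records y
# only every `save_every` steps, mimicking PySD's saveper idea.

def simulate(deriv, y0, t0, t1, dt, save_every=1):
    """Integrate dy/dt = deriv(t, y) with Euler steps of size dt,
    keeping only every `save_every`-th value in the results."""
    n_steps = int(round((t1 - t0) / dt))
    y = y0
    results = {}
    for i in range(n_steps + 1):
        t = t0 + i * dt
        if i % save_every == 0:
            results[t] = y
        y += deriv(t, y) * dt
    return results

# Simple growth model dy/dt = 0.1 * y, t from 0 to 100, dt = 0.01:
dense = simulate(lambda t, y: 0.1 * y, 1.0, 0.0, 100.0, 0.01)
sparse = simulate(lambda t, y: 0.1 * y, 1.0, 0.0, 100.0, 0.01, save_every=1000)
print(len(dense), len(sparse))  # prints "10001 11"
```

The integration work is identical in both runs; only the output storage changes, which is exactly the part that dominates in the profile discussed further down this thread.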

Note that you can also run parts of your model: if you split it by modules, you can run just some views:
https://pysd.readthedocs.io/en/master/advanced_usage.html#selecting-and-running-a-submodel

You can also run the model until a given time and then save all the states to run the simulation from that time on (you could save time if your model integrates a historical period that is always the same):
https://pysd.readthedocs.io/en/master/advanced_usage.html#starting-simulations-from-an-end-state-of-another-simulation

We have improved PySD's performance a lot in the last two years, but we still have plenty of work to do. We are planning to move the xarray.DataArray backend for subscripted variables to a numpy.ndarray backend (see #373), and we have several other open issues related to performance: https://github.com/SDXorg/pysd/issues?q=is%3Aissue+is%3Aopen+label%3Aperformance

I would also like to invite you to help with the development of performance improvements, if you are available. Any contribution is welcome!

@jmf119
Author

jmf119 commented Oct 2, 2022 via email

@easyas314159

This is something I've been fighting with for most of the last week. I added a couple dozen new components to a model that used to run in 5-6s; after adding the components, the model now takes ~38s to run.

The underlying performance issue is caused by models that have a large number of components or a large number of time steps. The way pysd appends data to the pandas.DataFrame is extremely inefficient. It's a fixable issue but requires some significant changes to the pysd.py_backend.output.DataFrameHandler class.
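The pattern is easy to reproduce outside PySD. A minimal pandas sketch (not PySD code; column names and sizes here are invented): growing a DataFrame one row at a time with `.loc` re-allocates and fragments the frame on every step, while accumulating values in plain Python lists and building the frame once at the end does a single allocation.

```python
import pandas as pd
from collections import defaultdict

columns = [f"var_{i}" for i in range(20)]

# Slow pattern: append one row per time step directly to the DataFrame.
df_slow = pd.DataFrame(columns=columns)
for t in range(300):
    df_slow.loc[t] = [float(t + i) for i in range(20)]

# Fast pattern: accumulate in Python lists, build the frame once.
ds = defaultdict(list)
for t in range(300):
    for i, col in enumerate(columns):
        ds[col].append(float(t + i))
df_fast = pd.DataFrame.from_dict(ds)

# Both approaches produce the same data.
assert (df_slow.astype(float).to_numpy() == df_fast.to_numpy()).all()
```

The row counts here are illustrative; the gap between the two patterns grows with the number of steps and columns, which is why the cost only became obvious after adding more components.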

The existing approach (as of pysd==3.12.0, pandas==2.1.1, and python==3.11.6) generates a huge number of fragmentation/performance warnings from pandas:

pysd/py_backend/output.py:535: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`

When I wrap my model.run() call with cProfile, concatenating data in pandas accounts for ~80% of the run time.

import cProfile
import pstats

# Profile only the model run; report stats after the profiler stops.
with cProfile.Profile(builtins=False) as pr:
    df = model.run()

stats = pstats.Stats(pr)
stats.sort_stats('cumulative')
stats.print_stats(0.1)
         39656497 function calls (38638101 primitive calls) in 38.891 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   38.918   38.918 pysd/py_backend/model.py:1325(run)
        1    0.029    0.029   38.758   38.758 pysd/py_backend/model.py:2086(_integrate)
     6001    0.010    0.000   36.707    0.006 pysd/py_backend/output.py:60(update)
     6001    0.048    0.000   36.697    0.006 pysd/py_backend/output.py:474(update)
     6001    0.052    0.000   34.360    0.006 pandas/core/indexing.py:867(__setitem__)
     6001    0.039    0.000   33.589    0.006 pandas/core/indexing.py:1747(_setitem_with_indexer)
     6001    0.352    0.000   33.541    0.006 pandas/core/indexing.py:2141(_setitem_with_indexer_missing)
     6000    0.144    0.000   32.035    0.005 pandas/core/frame.py:10182(_append)
     6000    0.031    0.000   19.067    0.003 pandas/core/reshape/concat.py:157(concat)
     6000    0.189    0.000   18.821    0.003 pandas/core/reshape/concat.py:618(get_result)
     6000    1.526    0.000   17.631    0.003 pandas/core/internals/concat.py:94(concatenate_managers)
    12093    0.454    0.000   10.439    0.001 pandas/core/internals/managers.py:308(apply)
     6001    0.020    0.000    9.460    0.002 pandas/core/generic.py:6705(infer_objects)
     6001    0.017    0.000    9.367    0.002 pandas/core/internals/managers.py:422(convert)
558093/6001    2.519    0.000    9.188    0.002 pandas/core/internals/blocks.py:533(convert)
     6001    0.464    0.000    9.122    0.002 pandas/core/internals/blocks.py:425(split_and_operate)
   552000    2.754    0.000    8.956    0.000 pandas/core/internals/concat.py:572(_is_uniform_join_units)
  1655946    0.951    0.000    4.978    0.000 pandas/core/internals/concat.py:597(<genexpr>)
  1104000    2.822    0.000    4.027    0.000 pandas/core/internals/concat.py:389(is_na)
     6000    1.479    0.000    3.068    0.001 pandas/core/internals/concat.py:296(_get_combined_plan)
   558172    2.496    0.000    2.549    0.000 <__array_function__ internals>:177(concatenate)
   552092    0.415    0.000    2.355    0.000 pandas/core/internals/blocks.py:247(make_block)
     6001    0.318    0.000    2.079    0.000 pysd/py_backend/output.py:489(<listcomp>)
     6000    0.036    0.000    1.962    0.000 pysd/py_backend/model.py:2120(_integrate_step)
     6000    0.137    0.000    1.896    0.000 pysd/py_backend/model.py:2073(_euler_step)
   564096    1.549    0.000    1.827    0.000 pandas/core/internals/blocks.py:2388(new_block)
     6000    0.034    0.000    1.497    0.000 pandas/core/generic.py:1135(rename_axis)
     6000    0.024    0.000    1.470    0.000 pysd/py_backend/model.py:467(ddt)
1318866/858160    0.619    0.000    1.468    0.000 pysd/py_backend/cache.py:21(cached_func)
     6000    0.125    0.000    1.447    0.000 pysd/py_backend/model.py:468(<listcomp>)
     6000    0.028    0.000    1.431    0.000 pandas/core/generic.py:1310(_set_axis_name)
   576529    0.678    0.000    1.310    0.000 numpy/core/numeric.py:289(full)
12005/12004    0.159    0.000    1.267    0.000 pandas/core/series.py:371(__init__)
     6000    0.021    0.000    1.194    0.000 pandas/core/generic.py:6553(copy)
  1104000    0.956    0.000    1.179    0.000 pandas/core/internals/concat.py:322(_get_block_for_concat_plan)
     6000    0.021    0.000    1.116    0.000 pandas/core/internals/managers.py:540(copy)
    42000    0.375    0.000    1.116    0.000 pysd/py_backend/statefuls.py:209(ddt)
     6001    0.010    0.000    1.090    0.000 pandas/core/frame.py:3747(T)
     6001    1.084    0.000    1.084    0.000 pandas/core/internals/blocks.py:409(_split)
     6001    0.086    0.000    1.080    0.000 pandas/core/frame.py:3575(transpose)

I ended up getting a 30x speed-up by monkey-patching pysd.py_backend.output.ModelOutput.__init__ with the following code, which just appends results to lists and constructs the pandas.DataFrame once at the end. The model I mentioned at the beginning now runs in 1.3s. I can clean this up a bit and submit a PR with a proper fix in the next couple of weeks if there is interest.

import pysd
import pandas as pd

from collections import defaultdict
from unittest.mock import patch


class FastDataFrameHandler(pysd.py_backend.output.OutputHandlerInterface):
    def process_output(self, out_file):
        # Handle only the default in-memory DataFrame output.
        if out_file is None:
            return self

    def initialize(self, model):
        self.length = 0
        self.ds = defaultdict(list)

    def update(self, model):
        # Append each captured value to a plain Python list instead of
        # growing the DataFrame one row at a time.
        for key in self.capture_elements_step:
            component = getattr(model.components, key)
            self.ds[component.name].append(
                model.time.round() if key == 'time' else component())
        self.length += 1

    def postprocess(self, **kwargs):
        # Build the DataFrame once, at the very end of the run.
        df = pd.DataFrame.from_dict(self.ds)
        df.set_index('Time', inplace=True)
        return df

    def add_run_elements(self, model):
        # Run-level (constant) elements are repeated for every saved step.
        for key in self.capture_elements_run:
            component = getattr(model.components, key)
            self.ds[component.name] = [component()] * self.length


def ModelOutput_init(self, *args, **kwargs):
    self.handler = FastDataFrameHandler()


# Load/setup model here

with patch('pysd.py_backend.output.ModelOutput.__init__', ModelOutput_init):
    df = model.run()

@enekomartinmartinez
Collaborator

Thanks a lot, @easyas314159, for checking that!
We really didn't check the potential improvements in the DataFrame output as we were using NetCDF output for big models.
We will be very happy to include your contribution. Feel free to open a PR whenever you have time! :)

@enekomartinmartinez
Collaborator

Hi @easyas314159, are you working on the implementation? Otherwise, I could implement it on your behalf. Thanks a lot!

@enekomartinmartinez
Collaborator

I have already added the improvements to the dev branch. They will be included in the next release. Thanks a lot!

@easyas314159

@enekomartinmartinez Thanks for taking this. I had grant funding nonsense land in my lap shortly after my initial post and didn't have a chance to post an update.
