Import the required libraries

In [1]:
from apsimNGpy.core.mult_cores import MultiCoreManager
from pathlib import Path

Initialize the API

In [2]:
db = Path.home() / 'jupiter_example.db'
Parallel = MultiCoreManager(db_path=db, agg_func='sum', table_prefix='jupter_example',)

Let's mimic some jobs

In [3]:
jobs = ({'model': 'Maize', 'ID': i, 'payload': [{'path': '.Simulations.Simulation.Field.Fertilise at sowing',
                                                    'Amount': i}]} for i in range(200))

Submit the jobs and run all

In [4]:
Parallel.run_all_jobs(jobs=jobs, n_cores=6, engine='csharp', threads=False, chunk_size=100,
                          subset=['Yield'],
                          progressbar=True)

Processing 200 jobs wait.. ██████████ 100%  >> completed (elapsed=>03:06, eta=>00:00) 


If you don't need to monitor progress, set progressbar=False. You can also run the jobs purely in Python by specifying engine='python'.

Retrieve the results as follows:

In [5]:
df = Parallel.results

In [6]:
df.shape

(200, 5)

In [7]:
df

            Yield  source_table   ID  Amount  MetaProcessID
0    53408.879389        Report   98      98           6140
1    28794.711983        Report   32      32           6140
2    50174.152258        Report   90      90          23104
3    37700.438595        Report   56      56           6140
4    27759.223405        Report   29      29          35132
...           ...           ...  ...     ...            ...
195  56012.946973        Report  197     197          36640
196  57353.566422        Report  112     112          43856
197  56556.839573        Report  145     145          59104
198  57242.400862        Report  117     117          48736


The returned DataFrame has 200 rows, one for each of the 200 simulations that were executed. Because agg_func='sum' was applied, each simulation contributes a single summarized row.

When no aggregation is applied, the number of rows increases because each simulation contributes multiple
records. For example, if each simulation spans 10 years, the resulting DataFrame will contain 10 × 200 = 2,000 rows.
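
The row-count arithmetic above can be checked without running APSIM. The sketch below uses made-up yearly yields (hypothetical data, not model output) and plain Python to mimic what a 'sum' aggregation does conceptually: it collapses 10 yearly records per simulation into one row per simulation ID. The names raw_rows, totals and aggregated_rows are illustrative only, and this is not a description of how MultiCoreManager aggregates internally.

```python
from collections import defaultdict

N_SIMULATIONS = 200   # same job count as in the example above
N_YEARS = 10          # assumed simulation length, for illustration only

# Hypothetical raw records: one row per simulation per year.
raw_rows = [
    {"ID": sim_id, "year": year, "Yield": 100.0 + sim_id}  # made-up yield values
    for sim_id in range(N_SIMULATIONS)
    for year in range(N_YEARS)
]

# Mimic a 'sum' aggregation: add up Yield within each simulation ID.
totals = defaultdict(float)
for row in raw_rows:
    totals[row["ID"]] += row["Yield"]

aggregated_rows = [{"ID": sim_id, "Yield": total} for sim_id, total in totals.items()]

print(len(raw_rows))         # 2000 rows without aggregation
print(len(aggregated_rows))  # 200 rows, one per simulation
```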

With engine='python', the workflow is as follows:

Remember that the jobs above were created as a generator and were fully consumed by the first run, so we have to create new ones first.


In [9]:
jobs = ({'model': 'Maize', 'ID': i, 'payload': [{'path': '.Simulations.Simulation.Field.Fertilise at sowing',
                                                    'Amount': i}]} for i in range(200))
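
The jobs have to be recreated because of ordinary Python generator behaviour: a generator can be iterated only once. The sketch below demonstrates this with the same job dictionaries, wrapped in a hypothetical helper, make_jobs (not part of apsimNGpy), that returns a fresh generator on every call.

```python
def make_jobs(n=200):
    """Hypothetical helper: return a fresh generator of job dictionaries."""
    return (
        {
            "model": "Maize",
            "ID": i,
            "payload": [
                {"path": ".Simulations.Simulation.Field.Fertilise at sowing", "Amount": i}
            ],
        }
        for i in range(n)
    )


jobs = make_jobs()
first_pass = sum(1 for _ in jobs)   # iterating consumes the generator
second_pass = sum(1 for _ in jobs)  # nothing is left on the second pass

print(first_pass, second_pass)  # 200 0

jobs = make_jobs()  # a fresh generator, ready to be submitted again
```

Calling a small factory like this before each run avoids accidentally submitting an already exhausted generator.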

In [10]:
Parallel.run_all_jobs(jobs=jobs, n_cores=6, engine='python', threads=False, 
                          subset=['Yield'],
                          progressbar=True)

APSIM running[0f] ██████████ 100% (200/200) >> completed (elapsed=>05:51, eta=>00:00) 


Judging by the elapsed time in the progress bar, the run with engine='python' took longer (05:51) than the run with engine='csharp' (03:06).
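
The elapsed times printed by the two progress bars (03:06 for engine='csharp', 05:51 for engine='python') can be converted to seconds for a rough comparison. The helper to_seconds below is hypothetical and only parses the MM:SS strings shown in the output. Note that the two calls also differ in chunk_size, so this is an indication, not a controlled benchmark.

```python
def to_seconds(mm_ss: str) -> int:
    """Convert an 'MM:SS' string, as shown in the progress bar, to seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


csharp_elapsed = to_seconds("03:06")  # from the engine='csharp' run
python_elapsed = to_seconds("05:51")  # from the engine='python' run

ratio = python_elapsed / csharp_elapsed
print(csharp_elapsed, python_elapsed, round(ratio, 2))  # 186 351 1.89
```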

It is also possible to get the raw data, without subsetting any columns or aggregating, as follows:

In [ ]:
Parallel = MultiCoreManager(db_path=db, agg_func=None, table_prefix='jupter_example',)
jobs = ({'model': 'Maize', 'ID': i, 'payload': [{'path': '.Simulations.Simulation.Field.Fertilise at sowing',
                                                    'Amount': i}]} for i in range(200))
Parallel.run_all_jobs(jobs=jobs, n_cores=6, engine='csharp', threads=False, 
                          subset=None,
                          progressbar=True)

Processing 200 jobs wait..              0%  >> completed (elapsed=>00:00, eta=>?) 