Import the required libraries

In [1]:
from apsimNGpy.core.mult_cores import MultiCoreManager
from pathlib import Path

Initialize the API

In [2]:
db = Path.home() / 'jupiter_example.db'
Parallel = MultiCoreManager(db_path=db, agg_func='sum', table_prefix='jupter_example',)

Let's mimic some jobs

In [3]:
jobs = ({'model': 'Maize', 'ID': i, 'payload': [{'path': '.Simulations.Simulation.Field.Fertilise at sowing',
                                                    'Amount': i}]} for i in range(200))

The model key can be specified as a path on your computer; in the example above, "Maize" refers to the default Maize module.

The jobs object is an iterator of dictionaries, with each dictionary representing an independent simulation. While jobs may also be provided as a list, doing so can be memory-intensive when working with large numbers of simulations (e.g., 50,000 jobs), potentially slowing down execution or causing failures on systems with limited RAM. Using generators avoids this issue by producing jobs lazily.
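To make the memory argument concrete, the sketch below builds the same job dictionaries as a list and as a generator and compares the two containers with `sys.getsizeof`. The helper name `make_job` is hypothetical and exists only for this illustration. `sys.getsizeof` reports the size of the container object itself, not of the dictionaries it references, so the real gap for a list is larger than the figure printed here.

```python
import sys

def make_job(i):
    """Build one job dictionary shaped like the example above (hypothetical helper)."""
    return {
        'model': 'Maize',
        'ID': i,
        'payload': [{'path': '.Simulations.Simulation.Field.Fertilise at sowing',
                     'Amount': i}],
    }

n_jobs = 50_000

# A list materializes every job dictionary up front.
jobs_list = [make_job(i) for i in range(n_jobs)]

# A generator expression builds each job only when it is requested.
jobs_gen = (make_job(i) for i in range(n_jobs))

list_bytes = sys.getsizeof(jobs_list)  # grows with n_jobs
gen_bytes = sys.getsizeof(jobs_gen)    # does not depend on n_jobs

print(f"list container: {list_bytes} bytes, generator: {gen_bytes} bytes")

first_job = next(jobs_gen)  # only now is the first job dictionary created
```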

Another important argument in the example is the ID key. When engine="csharp", the ID key is required and must be unique across all simulations.

Finally, the payload key (which can also be named inputs) carries the model-editing instructions for each simulation. Supply only one of these keys, not both. When provided, the API expects a dictionary (or, as in the example above, a list of such dictionaries) whose structure follows the edit_model_by_path syntax defined in the apsimNGpy.core.ApsimModel class.
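The two rules above (unique IDs, and only one of payload or inputs) can be checked lazily before submission. The sketch below wraps any iterable of jobs in a generator that raises as soon as a rule is violated. The function `validated` is a hypothetical helper, not part of apsimNGpy, and the sketch makes no assumption about whether `run_all_jobs` performs similar checks itself.

```python
def validated(jobs):
    """Yield jobs unchanged; raise ValueError on a duplicate ID or on a job
    that carries both 'payload' and 'inputs'.

    Hypothetical helper for illustration only (not part of apsimNGpy).
    """
    seen_ids = set()  # stores only the IDs, not the job dictionaries
    for job in jobs:
        job_id = job['ID']
        if job_id in seen_ids:
            raise ValueError(f"duplicate job ID: {job_id!r}")
        seen_ids.add(job_id)
        if 'payload' in job and 'inputs' in job:
            raise ValueError(f"job {job_id!r} has both 'payload' and 'inputs'")
        yield job

path = '.Simulations.Simulation.Field.Fertilise at sowing'
jobs = ({'model': 'Maize', 'ID': i, 'payload': [{'path': path, 'Amount': i}]}
        for i in range(5))

# list() is used here only because the example is tiny; in practice the
# wrapped generator would be passed on as-is so that jobs stay lazy.
checked = list(validated(jobs))
print(len(checked))
```

Because `validated` is itself a generator, it keeps the lazy behaviour of the original jobs object and could be passed directly as `jobs=validated(jobs)`.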

Submit the jobs and run them all

In [4]:
Parallel.run_all_jobs(jobs=jobs, n_cores=6, engine='csharp', threads=False, chunk_size=100,
                          subset=['Yield'],
                          progressbar=True)

Processing 200 jobs wait.. ██████████ 100%  >> completed (elapsed=>03:06, eta=>00:00) 


If you don't need to monitor progress, set progressbar=False. You can also run the simulations purely in Python by specifying engine='python'.

Retrieve the results as follows:

In [5]:
df = Parallel.results

In [6]:
df.shape

(200, 5)

In [7]:
df

Unnamed: 0,Yield,source_table,ID,Amount,MetaProcessID
0,53408.879389,Report,98,98,6140
1,28794.711983,Report,32,32,6140
2,50174.152258,Report,90,90,23104
3,37700.438595,Report,56,56,6140
4,27759.223405,Report,29,29,35132
...,...,...,...,...,...
195,56012.946973,Report,197,197,36640
196,57353.566422,Report,112,112,43856
197,56556.839573,Report,145,145,59104
198,57242.400862,Report,117,117,48736


The returned DataFrame has 200 rows, one summarized row for each of the 200 simulations that were executed.

When no aggregation is applied, the number of rows increases because each simulation contributes multiple
records. For example, if each simulation spans 10 years, the resulting DataFrame will contain 10 × 200 = 2,000 rows.
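The row-count arithmetic can be reproduced with synthetic data. The sketch below uses plain Python rather than apsimNGpy or pandas, and the yield values are made up. It illustrates only why a sum aggregation collapses 2,000 yearly rows into 200 rows, not how the library aggregates internally.

```python
from collections import defaultdict

n_simulations = 200
n_years = 10

# Synthetic yearly records: one row per simulation per year (made-up values).
raw_rows = [
    {'ID': sim_id, 'year': 1991 + year, 'Yield': 100.0 * (sim_id + 1)}
    for sim_id in range(n_simulations)
    for year in range(n_years)
]

# Summing per simulation ID collapses the yearly rows into one row per simulation.
totals = defaultdict(float)
for row in raw_rows:
    totals[row['ID']] += row['Yield']
aggregated_rows = [{'ID': sim_id, 'Yield': total} for sim_id, total in totals.items()]

print(len(raw_rows))         # rows without aggregation: n_years * n_simulations
print(len(aggregated_rows))  # rows with aggregation: n_simulations
```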

When engine is 'python', the workflow is as follows:

Because the jobs above were created as a generator, they have all been consumed by the previous run, so we must create new ones first.
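The exhaustion behaviour is standard for Python generators and can be seen without running any simulations. The sketch below counts the items on two passes over the same generator, then builds a fresh one through a small factory function. `make_jobs` is a hypothetical helper introduced here for convenience; it is not part of apsimNGpy.

```python
def make_jobs(n=200):
    """Return a fresh generator of job dictionaries (hypothetical helper)."""
    path = '.Simulations.Simulation.Field.Fertilise at sowing'
    return ({'model': 'Maize', 'ID': i, 'payload': [{'path': path, 'Amount': i}]}
            for i in range(n))

jobs = make_jobs()
first_pass = sum(1 for _ in jobs)   # iterating consumes the generator
second_pass = sum(1 for _ in jobs)  # nothing is left on a second pass

fresh_jobs = make_jobs()            # a new call returns a new, unconsumed generator
third_pass = sum(1 for _ in fresh_jobs)

print(first_pass, second_pass, third_pass)
```

Wrapping the generator expression in a function like this avoids retyping it before every run.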


In [9]:
jobs = ({'model': 'Maize', 'ID': i, 'payload': [{'path': '.Simulations.Simulation.Field.Fertilise at sowing',
                                                    'Amount': i}]} for i in range(200))

In [10]:
Parallel.run_all_jobs(jobs=jobs, n_cores=6, engine='python', threads=False, 
                          subset=['Yield'],
                          progressbar=True)

APSIM running[0f] ██████████ 100% (200/200) >> completed (elapsed=>05:51, eta=>00:00) 


Judging by the elapsed time shown in the progress bar, the run with engine='python' took considerably longer (05:51) than the earlier run with engine='csharp' (03:06).

It is also possible to retrieve the raw data, without subsetting any columns or aggregating, as follows:

In [11]:
Parallel = MultiCoreManager(db_path=db, agg_func=None, table_prefix='jupter_example',)
jobs = ({'model': 'Maize', 'ID': i, 'payload': [{'path': '.Simulations.Simulation.Field.Fertilise at sowing',
                                                    'Amount': i}]} for i in range(200))
Parallel.run_all_jobs(jobs=jobs, n_cores=6, engine='csharp', threads=False, 
                          subset=None,
                          progressbar=True)

Processing 200 jobs wait.. ██████████ 100%  >> completed (elapsed=>03:03, eta=>00:00) 


In [12]:
Parallel.results

Unnamed: 0,CheckpointID,SimulationID,Zone,Clock.Today,Maize.Phenology.CurrentStageName,Maize.AboveGround.Wt,Maize.AboveGround.N,Yield,Maize.Grain.Wt,Maize.Grain.Size,Maize.Grain.NumberFunction,Maize.Grain.Total.Wt,Maize.Grain.N,Maize.Total.Wt,source_table,ID,Amount,MetaProcessID
0,1,1,Field,1991-05-28 00:00:00,HarvestRipe,1603.309641,16.366990,8469.615813,846.961581,0.278267,3043.698222,846.961581,11.178291,1728.427114,Report,185,185,52824
1,1,1,Field,1992-04-09 00:00:00,HarvestRipe,817.797377,9.341324,4540.831319,454.083132,0.271141,1674.714279,454.083132,6.056934,886.580841,Report,185,185,52824
2,1,1,Field,1993-03-16 00:00:00,HarvestRipe,183.160959,1.913657,557.779903,55.777990,0.304051,183.449754,55.777990,0.756003,204.570395,Report,185,185,52824
3,1,1,Field,1994-03-15 00:00:00,HarvestRipe,788.863023,8.252594,3456.675675,345.667568,0.228642,1511.831746,345.667568,4.813789,862.708748,Report,185,185,52824
4,1,1,Field,1995-04-04 00:00:00,HarvestRipe,1527.616645,16.507376,7834.925837,783.492584,0.273649,2863.125457,783.492584,10.482506,1668.106838,Report,185,185,52824
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1995,1,1,Field,1996-03-15 00:00:00,HarvestRipe,505.673220,3.234905,2618.606662,261.860666,0.185948,1408.248365,261.860666,2.427413,530.900585,Report,19,19,57128
1996,1,1,Field,1997-04-05 00:00:00,HarvestRipe,744.071359,5.766247,3310.593638,331.059364,0.166372,1989.879491,331.059364,4.059290,836.466104,Report,19,19,57128
1997,1,1,Field,1998-03-06 00:00:00,HarvestRipe,307.056387,2.170159,1174.404277,117.440428,0.114856,1022.500610,117.440428,1.355511,347.746649,Report,19,19,57128
1998,1,1,Field,1999-04-10 00:00:00,HarvestRipe,1073.201308,7.201122,4188.053764,418.805376,0.162080,2583.943329,418.805376,4.911547,1179.518498,Report,19,19,57128


The result has 2,000 rows and 18 columns.