# Split the basin model into multiple domains

In this notebook we split the single-model base simulation generated in the previous step into a multi-model simulation. To do this we use a facility in FloPy called the __Model Splitter__. After the splitting, we will run the simulation in parallel before continuing to the next notebook.

## Imports

In [None]:
# General
import os
import shutil
import matplotlib.pyplot as plt
import pathlib as pl

# FloPy
import flopy
from flopy.mf6.utils import Mf6Splitter

# Local
from utilities import *

## Set the number of domains

Here we set the number of domains to split the simulation into. The splitter generates one MODFLOW 6 model per domain, and in a parallel run each model can then execute on its own processor core. Avoid oversubscribing your machine, i.e. starting more processes than there are physical processor cores. Library methods, including `os.cpu_count()`, often report the number of logical cores, which includes hyperthreading: a physical core that accepts two processes at the same time but runs them only quasi-concurrently. For details on the available CPU architecture, run the `lscpu` command in your shell (on Linux).
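
As a rough cross-check of the core count, the sketch below compares the logical count from `os.cpu_count()` with a physical count. It assumes the optional third-party package `psutil` may be installed and falls back to `None` when it is not; the helper name `core_counts` is hypothetical and is not part of this notebook's `utilities` module.

```python
import os


def core_counts():
    """Return (logical, physical) core counts; physical is None if unknown."""
    # os.cpu_count() reports logical cores, so hyperthreads are included
    logical = os.cpu_count()
    try:
        # psutil is an optional third-party package (assumption: may be absent)
        import psutil

        physical = psutil.cpu_count(logical=False)
    except ImportError:
        physical = None
    return logical, physical


logical, physical = core_counts()
print(f"logical cores: {logical}, physical cores: {physical}")
```

If a physical count is reported, it is a reasonable upper bound for `max_nr_cores`; otherwise, fall back to inspecting the machine with `lscpu`.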

In [None]:
# set the number of domains to split into and
# the number of parallel processes to use
ndomains = 4

# set the max. number of physical cores you can run
# on, excluding hyperthreading. (Check with 'lscpu'
# on the command line what your architecture is)
max_nr_cores = 4

# check that we do not oversubscribe the available cores
if ndomains > max_nr_cores:
    raise ValueError("Partitioning into more domains than cores available")

# we need at least 2 domains, because 1 domain is just the base model
if ndomains < 2:
    raise ValueError("Error: this will not work with fewer than 2 domains...")

## Load the base model of the watershed

In [None]:
# get the path to the base model directory
base_ws = get_serial_workspace()

# load the FloPy simulation
base_sim = flopy.mf6.MFSimulation.load(
    sim_ws=base_ws,
)

In [None]:
# the base GWF model
base_gwf = base_sim.get_model()
total_nr_cells, nr_active_cells = get_model_cell_count(base_gwf)

print(f"The base model has {nr_active_cells} active cells")

## Split the base simulation

Here we use the Model Splitter on the simulation that is loaded into memory. In the background the splitter uses PyMetis (https://pypi.org/project/PyMetis/) to partition the grid. PyMetis itself wraps the Metis (http://glaros.dtc.umn.edu/gkhome/views/metis) graph partitioning software for the actual work.
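
For intuition, a splitting array is an integer array with one model number per grid cell. The sketch below builds such an array by hand with NumPy for a small, hypothetical structured grid, cutting it into vertical strips of columns. The function name `strip_mask` and the grid dimensions are illustrative assumptions, and this naive scheme is not what the splitter's optimizer does.

```python
import numpy as np


def strip_mask(nrow, ncol, nparts):
    """Assign each column of an (nrow, ncol) grid to one of nparts strips."""
    # np.array_split spreads the columns over the parts as evenly as possible
    chunks = np.array_split(np.arange(ncol), nparts)
    col_ids = np.concatenate(
        [np.full(len(chunk), i, dtype=int) for i, chunk in enumerate(chunks)]
    )
    # every row gets the same column-to-model assignment
    return np.tile(col_ids, (nrow, 1))


# hypothetical 4 x 10 grid split into 4 domains
mask = strip_mask(nrow=4, ncol=10, nparts=4)
print(mask[0])  # [0 0 0 1 1 1 2 2 3 3]
```

Strips like these ignore which cells are active, so the resulting models can differ considerably in size. That is the kind of imbalance a graph partitioner is meant to reduce.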

In [None]:
# pass the base simulation object to the splitter
mfsplit = Mf6Splitter(base_sim)

# create a splitting array from the set number of domains
split_array = mfsplit.optimize_splitting_mask(nparts=ndomains)

# plot the splitting array, every color (value) is a model
fig, ax = plt.subplots(figsize=(8, 4))
ax.set_aspect("equal")
pmv = flopy.plot.PlotMapView(model=base_gwf, ax=ax)
pa = pmv.plot_array(split_array)
plt.colorbar(pa, shrink=0.5)
plt.show()

Now generate a new, partitioned simulation object:

In [None]:
# this is the actual model splitting
new_sim = mfsplit.split_model(split_array)

# check the model sizes
nr_active_cells_par = []
for model_name in new_sim.model_names:
    model = new_sim.get_model(model_name)
    nr_active_cells_par.append(get_model_cell_count(model)[1])
print(f"Active cells in split simulation: {nr_active_cells_par}")
print(f"Active cells in single model: {nr_active_cells}")


Write everything to disk:

In [None]:
# create the simulation directory
parallel_ws = get_workspace(ndomains)
shutil.rmtree(parallel_ws, ignore_errors=True)

# set it and write
new_sim.set_sim_path(parallel_ws)
new_sim.write_simulation(silent=True)

# save the node mapping
# (this will help when combining the results from the domains and
# comparing against the base simulation)
mfsplit.save_node_mapping(parallel_ws / "mfsplit_node_mapping.json")

## Parallel run of the multi-model simulation

In [None]:
# run parallel
new_sim.run_simulation(
    processors=ndomains,
)