# atmodeller

## Tutorial 3: Monte Carlo experiment

We can devise a simple Monte Carlo (MC) approach to sample the probable atmospheres that can arise for different planetary conditions.

In [1]:
from atmodeller import debug_logger
from atmodeller.interior_atmosphere import InteriorAtmosphereSystem, Planet, Species
from atmodeller.constraints import MassConstraint, IronWustiteBufferConstraintHirschmann, SystemConstraints
from atmodeller.interfaces import GasSpecies
from atmodeller.solubilities import PeridotiteH2O, BasaltDixonCO2, BasaltLibourelN2
from atmodeller.utilities import earth_oceans_to_kg
from atmodeller.initial_condition import InitialConditionRegressor, InitialConditionSwitchRegressor
import numpy as np
import logging

For production runs, set the logger level to INFO or higher (i.e. WARNING, ERROR, or CRITICAL). At lower levels such as DEBUG, the MC runs more slowly simply because of the time spent writing output to the logger.

In [2]:
logger = debug_logger()
logger.setLevel(logging.INFO)
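
To see how the level filters messages, here is a minimal sketch using only the standard `logging` module. The logger name `mc_logging_demo` and the `ListHandler` class are hypothetical names introduced for illustration; they are not part of atmodeller, and this is not how `debug_logger()` is implemented.

```python
import logging


class ListHandler(logging.Handler):
    """Collects emitted messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


# Hypothetical demo logger, kept separate from the atmodeller logger
demo_logger = logging.getLogger("mc_logging_demo")
demo_logger.propagate = False
handler = ListHandler()
demo_logger.addHandler(handler)

# At INFO, DEBUG messages are dropped before any handler sees them
demo_logger.setLevel(logging.INFO)
demo_logger.debug("per-iteration solver detail")
demo_logger.info("solution converged")

# At WARNING, INFO messages are dropped as well
demo_logger.setLevel(logging.WARNING)
demo_logger.info("solution converged")
demo_logger.warning("solver failed")

print(handler.messages)
```

Only one INFO message and one WARNING message reach the handler, which is why raising the level reduces the logging overhead in a long MC run.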

We now create the species that we are interested in.

In [3]:
species: Species = Species()
species.append(GasSpecies(formula='H2O', solubility=PeridotiteH2O()))
species.append(GasSpecies(formula='H2'))
species.append(GasSpecies(formula='O2'))
species.append(GasSpecies(formula='CO'))
species.append(GasSpecies(formula='CO2', solubility=BasaltDixonCO2()))
species.append(GasSpecies(formula='N2', solubility=BasaltLibourelN2()))
species

[15:57:27 - atmodeller.interfaces          - INFO     ] - Creating a GasSpecies: H2O (H2O)
[15:57:27 - atmodeller.interfaces          - INFO     ] - Creating a GasSpecies: H2 (H2)
[15:57:27 - atmodeller.interfaces          - INFO     ] - Creating a GasSpecies: O2 (O2)
[15:57:27 - atmodeller.interfaces          - INFO     ] - Creating a GasSpecies: CO (CO)
[15:57:27 - atmodeller.interfaces          - INFO     ] - Creating a GasSpecies: CO2 (CO2)
[15:57:27 - atmodeller.interfaces          - INFO     ] - Creating a GasSpecies: N2 (N2)


Species([GasSpecies(chemical_formula='H2O', name_in_thermodynamic_data='H2O', thermodynamic_dataset=<atmodeller.interfaces.ThermodynamicDatasetJANAF object at 0x11b44fad0>, formula=Formula('H2O'), output=None, thermodynamic_data=ThermodynamicDatasetJANAF.ThermodynamicDataForSpecies(species=..., data=<thermochem.janaf.JanafPhase object at 0x11b16f4d0>), solubility=<atmodeller.solubilities.PeridotiteH2O object at 0x11b44fb90>, solid_melt_distribution_coefficient=0, eos=IdealGas(critical_temperature=1, critical_pressure=1, standard_state_pressure=1)),
         GasSpecies(chemical_formula='H2', name_in_thermodynamic_data='H2', thermodynamic_dataset=<atmodeller.interfaces.ThermodynamicDatasetJANAF object at 0x11b1d3e50>, formula=Formula('H2'), output=None, thermodynamic_data=ThermodynamicDatasetJANAF.ThermodynamicDataForSpecies(species=..., data=<thermochem.janaf.JanafPhase object at 0x11b816b50>), solubility=<atmodeller.interfaces.NoSolubility object at 0x11b44fe90>, solid_melt_distributio

Now create a planet. We recall that we can sample different planetary properties by updating the attributes of this object, even though in this tutorial we don't do this.

In [4]:
planet: Planet = Planet()

[15:57:31 - atmodeller.interior_atmosphere - INFO     ] - Creating a new planet
[15:57:31 - atmodeller.interior_atmosphere - INFO     ] - Mantle mass (kg) = 4208261222595110885130240.000000
[15:57:31 - atmodeller.interior_atmosphere - INFO     ] - Mantle melt fraction = 1.000000
[15:57:31 - atmodeller.interior_atmosphere - INFO     ] - Core mass fraction = 0.295335
[15:57:31 - atmodeller.interior_atmosphere - INFO     ] - Planetary radius (m) = 6371000.000000
[15:57:31 - atmodeller.interior_atmosphere - INFO     ] - Planetary mass (kg) = 5972000000000000327155712.000000
[15:57:31 - atmodeller.interior_atmosphere - INFO     ] - Surface temperature (K) = 2000.000000
[15:57:31 - atmodeller.interior_atmosphere - INFO     ] - Surface gravity (m/s^2) = 9.819973
[15:57:31 - atmodeller.interior_atmosphere - INFO     ] - Melt Composition = None


Now set up the main driver of the Monte Carlo (MC) approach. This establishes the ranges over which we sample certain properties.

In [5]:
def monte_carlo(interior_atmosphere: InteriorAtmosphereSystem, number_of_realisations: int = 100) -> None:
    """Monte Carlo driver
    
    Args:
        interior_atmosphere: An interior-atmosphere system
        number_of_realisations: Number of simulations to perform
    """

    # Parameters are uniformly distributed between bounds.
    number_ocean_moles = np.random.uniform(1, 10, number_of_realisations)
    ch_ratios = np.random.uniform(0.1, 1, number_of_realisations)
    fo2_shifts = np.random.uniform(-4, 4, number_of_realisations)

    # ppmw of Nitrogen in the mantle. 2.8 is the mantle value of N.
    N_ppmw = 2.8

    # The nitrogen mass is constant. Note that this reads the module-level planet created above.
    mass_N = N_ppmw * 1.0e-6 * planet.mantle_mass

    for realisation in range(number_of_realisations):

        mass_H = earth_oceans_to_kg(number_ocean_moles[realisation])
        mass_C = ch_ratios[realisation] * mass_H
        constraints = SystemConstraints([
            MassConstraint(species="H", value=mass_H),
            MassConstraint(species="C", value=mass_C),
            MassConstraint(species="N", value=mass_N),
            IronWustiteBufferConstraintHirschmann(log10_shift=fo2_shifts[realisation])
        ])

        # Extra quantities to write to the output
        # For example, it's often helpful to have the constraints expressed in a more convenient
        # form for analysis and plotting.
        extra = {
            'fO2_shift': fo2_shifts[realisation],
            'C/H ratio': ch_ratios[realisation],
            'Number of ocean moles': number_ocean_moles[realisation],
        }

        interior_atmosphere.solve(constraints, extra_output=extra, factor=0.1)
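
The driver above draws from the global NumPy random state, so each run samples different parameters. As a rough sketch of how the sampling could be made reproducible, the following uses only the standard `random` module with an explicit seed. The function `draw_parameters` and the constant names are hypothetical and not part of atmodeller; the bounds are those used in the driver, and the mantle mass is the value reported in the planet log output above. With NumPy, a seeded generator from `np.random.default_rng(seed)` would play the same role.

```python
import random


def draw_parameters(number_of_realisations, seed):
    """Draw uniformly distributed MC parameters from a seeded generator.

    Hypothetical helper for illustration only. Each realisation is drawn in
    turn, whereas the driver above draws each parameter as a full array; the
    two are statistically equivalent.
    """
    rng = random.Random(seed)
    return [
        {
            "number_ocean_moles": rng.uniform(1, 10),
            "ch_ratio": rng.uniform(0.1, 1),
            "fo2_shift": rng.uniform(-4, 4),
        }
        for _ in range(number_of_realisations)
    ]


# The same seed gives the same parameter sets, so a run can be repeated exactly
samples_a = draw_parameters(100, seed=0)
samples_b = draw_parameters(100, seed=0)

# Worked arithmetic for the constant nitrogen mass used in the driver
N_PPMW = 2.8
MANTLE_MASS_KG = 4208261222595110885130240.0  # from the planet log output
mass_N = N_PPMW * 1.0e-6 * MANTLE_MASS_KG  # roughly 1.18e19 kg
```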


We can run the MC as follows. This may take a minute or two to run.

In [6]:
interior_atmosphere: InteriorAtmosphereSystem = InteriorAtmosphereSystem(species=species, planet=planet)
monte_carlo(interior_atmosphere)

[15:57:40 - atmodeller.initial_condition   - INFO     ] - Creating InitialConditionConstant
[15:57:40 - atmodeller.interior_atmosphere - INFO     ] - Creating an interior-atmosphere system
[15:57:40 - atmodeller.interior_atmosphere - INFO     ] - Creating a reaction network
[15:57:40 - atmodeller.interior_atmosphere - INFO     ] - Species = ['H2O', 'H2', 'O2', 'CO', 'CO2', 'N2']
[15:57:40 - atmodeller.interior_atmosphere - INFO     ] - Reactions = 
{0: '1.0 H2O + 1.0 CO = 1.0 H2 + 1.0 CO2', 1: '2.0 H2O = 2.0 H2 + 1.0 O2'}


[15:57:40 - atmodeller.interior_atmosphere - INFO     ] - Set constraints
[15:57:40 - atmodeller.interior_atmosphere - INFO     ] - The solution converged.
[15:57:40 - atmodeller.interior_atmosphere - INFO     ] - InitialConditionConstant: RMSE (actual vs initial) = 0.8269556722045084
[15:57:40 - atmodeller.interior_atmosphere - INFO     ] - {'CO': 272.9260325626845,
 'CO2': 175.0515301688739,
 'H2': 4.279818426947156,
 'H2O': 12.515473778330128,
 'N2': 2.5610023430170434,
 'O2': 7.127313022944436e-07}
[15:57:40 - atmodeller.interior_atmosphere - INFO     ] - Set constraints
[15:57:41 - atmodeller.interior_atmosphere - INFO     ] - The solution converged.
[15:57:41 - atmodeller.interior_atmosphere - INFO     ] - InitialConditionConstant: RMSE (actual vs initial) = 1.1619049417862792
[15:57:41 - atmodeller.interior_atmosphere - INFO     ] - {'CO': 22.97791510737669,
 'CO2': 110.70627207468107,
 'H2': 0.04348149541397646,
 'H2O': 0.9551401007073314,
 'N2': 3.081084112876332,
 'O2': 4.021

The simulation data can be exported to an Excel file, a pickle file, or both by setting the corresponding keyword arguments of the output method:

In [7]:
interior_atmosphere.output(file_prefix='tutorial3_monte_carlo', to_excel=True, to_pickle=True)

[15:58:52 - atmodeller.output              - INFO     ] - Output data written to tutorial3_monte_carlo.pkl
[15:58:52 - atmodeller.output              - INFO     ] - Output data written to tutorial3_monte_carlo.xlsx


{'solution':           H2O         H2            O2          CO         CO2        N2
 0   12.515474   4.279818  7.127313e-07  272.926033  175.051530  2.561002
 1    0.955140   0.043481  4.021677e-05   22.977915  110.706272  3.081084
 2    7.103025   0.165927  1.527344e-04   22.870718  214.736686  3.135241
 3    3.444295   0.045539  4.767742e-04   10.386988  172.307443  3.192798
 4    7.620206   1.262448  3.036604e-06  154.971372  205.165134  2.779950
 ..        ...        ...           ...         ...         ...       ...
 95  13.002986  10.628530  1.247446e-07  278.886519   74.833558  2.313835
 96   2.308985  88.429758  5.682342e-11   21.685489    0.124191  0.595224
 97   0.727272  13.621937  2.375738e-10   37.964128    0.444560  1.669838
 98   0.423506   0.778440  2.466896e-08   74.756010    8.920302  2.260480
 99   6.431108   1.291662  2.066121e-06   51.638306   56.390747  2.655088
 
 [100 rows x 6 columns],
 'atmosphere':     total_pressure  mean_molar_mass
 0       467.333858   

If you just want to access the dataframes in a dictionary you can use:

In [8]:
output_data = interior_atmosphere.output(to_dataframes=True)
output_data

When performing an MC simulation, problems can arise when the chosen model parameters result in a solution that is far from the initial guess. Internally, atmodeller chooses an initial guess for the solution and uses it as the starting point for the numerical solver. If this initial guess is far from the actual solution, the solver may fail. To address this, it is often convenient to first run a smaller MC simulation with reduced parameter bounds in order to generate some output. We can then use this output to train a new initial condition that provides an improved initial guess for a new MC run.
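
The sensitivity of an iterative solver to its starting point can be illustrated with a textbook example that is unrelated to atmodeller's own solver: Newton's method applied to arctan(x) = 0, which converges from a nearby guess but diverges from a distant one. The function `newton_arctan` is a hypothetical name used only for this sketch.

```python
import math


def newton_arctan(x0, max_iterations=20, tolerance=1e-10):
    """Solve arctan(x) = 0 with Newton's method.

    Returns the root, or None if the iteration diverges or fails to
    converge within max_iterations.
    """
    x = x0
    for _ in range(max_iterations):
        if abs(x) > 1.0e6:
            # The iterates are running away from the root at x = 0
            return None
        # Newton step f(x) / f'(x), with f'(x) = 1 / (1 + x^2)
        step = math.atan(x) * (1.0 + x * x)
        x -= step
        if abs(step) < tolerance:
            return x
    return None


close_guess = newton_arctan(0.5)  # converges to the root at zero
far_guess = newton_arctan(2.0)    # diverges, so None is returned
```

The equation being solved is identical in both calls; only the initial guess differs. This is the motivation for training a better initial condition rather than changing the model.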

In the following, we use the generated output from the previous run to inform the selection of the initial condition:

In [8]:
initial_condition = InitialConditionRegressor.from_pickle('tutorial3_monte_carlo.pkl', fit=True, fit_batch_size=100, partial_fit=True, partial_fit_batch_size=50)

[16:02:15 - atmodeller.initial_condition   - INFO     ] - Creating InitialConditionRegressor
[16:02:15 - atmodeller.initial_condition   - INFO     ] - InitialConditionRegressor: Reading data from tutorial3_monte_carlo.pkl
[16:02:15 - atmodeller.initial_condition   - INFO     ] - InitialConditionRegressor: Fit (None, None)
[16:02:15 - atmodeller.initial_condition   - INFO     ] - InitialConditionRegressor: Found constraints = ['H_mass', 'C_mass', 'N_mass', 'O2_fugacity']
[16:02:15 - atmodeller.initial_condition   - INFO     ] - InitialConditionRegressor: Found species = ['H2O', 'H2', 'O2', 'CO', 'CO2', 'N2']


In the above, fit = True, which means that the training data from the previous run (as computed from the output in the pickle file) are only used for the first fit_batch_size = 100 simulations. The regressor then re-trains itself on just the fit_batch_size = 100 samples generated by the current model, discarding what it learned from the previous data. After that, every partial_fit_batch_size = 50 simulations, it updates its training with the latest batch of newly generated samples to better inform the selection of subsequent initial conditions. This is known as a dynamic or online learning approach.
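
The retraining schedule implied by the arguments passed to from_pickle above (fit_batch_size=100, partial_fit_batch_size=50) can be sketched as follows. This is only an illustration of the schedule as described in this tutorial, assuming that retraining is triggered exactly when each batch is complete; `training_events` is a hypothetical helper and does not reflect how atmodeller implements the regressor internally.

```python
def training_events(number_of_realisations, fit_batch_size, partial_fit_batch_size):
    """List the solve counts at which a full fit or a partial fit would occur.

    Hypothetical helper: assumes a full fit once fit_batch_size solutions
    exist, then a partial fit after every further partial_fit_batch_size
    solutions.
    """
    events = []
    for count in range(1, number_of_realisations + 1):
        if count == fit_batch_size:
            events.append((count, "fit"))
        elif count > fit_batch_size and (count - fit_batch_size) % partial_fit_batch_size == 0:
            events.append((count, "partial_fit"))
    return events


events = training_events(200, fit_batch_size=100, partial_fit_batch_size=50)
for count, kind in events:
    print(f"after {count} solutions: {kind}")
```

Under these assumptions, a run of 200 realisations performs one full fit after 100 solutions and partial fits after 150 and 200 solutions.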

It is necessary to pass the initial condition to the interior atmosphere system when it is created:

In [9]:
interior_atmosphere_ic: InteriorAtmosphereSystem = InteriorAtmosphereSystem(species=species, initial_condition=initial_condition, planet=planet)
monte_carlo(interior_atmosphere_ic, number_of_realisations=200)

[16:02:25 - atmodeller.interior_atmosphere - INFO     ] - Creating an interior-atmosphere system
[16:02:25 - atmodeller.interior_atmosphere - INFO     ] - Creating a reaction network
[16:02:25 - atmodeller.interior_atmosphere - INFO     ] - Species = ['H2O', 'H2', 'O2', 'CO', 'CO2', 'N2']
[16:02:25 - atmodeller.interior_atmosphere - INFO     ] - Reactions = 
{0: '1.0 H2O + 1.0 CO = 1.0 H2 + 1.0 CO2', 1: '2.0 H2O = 2.0 H2 + 1.0 O2'}
[16:02:25 - atmodeller.interior_atmosphere - INFO     ] - Set constraints
[16:02:26 - atmodeller.interior_atmosphere - INFO     ] - The solution converged.
[16:02:26 - atmodeller.interior_atmosphere - INFO     ] - InitialConditionRegressor: RMSE (actual vs initial) = 0.37439537913501897
[16:02:26 - atmodeller.interior_atmosphere - INFO     ] - {'CO': 327.070417030486,
 'CO2': 7.722422170914213,
 'H2': 81.10861585515926,
 'H2O': 8.731327261559702,
 'N2': 1.798126147246543,
 'O2': 9.658473329199923e-10}
[16:02:26 - atmodeller.interior_atmosphere - INFO     ] -

If you compare the log output for the two MC runs, you will see that in the second run the initial condition re-trained itself after 100 samples had been generated and then partially re-trained itself every 50 samples. This keeps the RMSE between the initial guess and the actual solution smaller than it would be with a constant initial condition. In addition, fit = True allows you to train an initial condition on a similar but not identical model (for example, one with different solubility laws or gas equations of state). Once enough samples have been generated, you would prefer to use only the new model to generate new estimates, since the behaviour of the new model and the previous, similar model will diverge.

If you want to combine an initial condition that is constant, and then switches to a regressor, you can use:

In [None]:
initial_condition_switch = InitialConditionSwitchRegressor(value=10, fit=True, fit_batch_size=100, partial_fit=True, partial_fit_batch_size=50)

You must specify the initial constant value first, and then the arguments associated with the regressor. In the above, a constant initial condition will be used for the first 100 samples, after which the initial condition will train itself on those first 100 samples. Then, for every subsequent 50 samples, the initial condition will partially re-train itself. This approach can work well, but of course it relies on finding those first 100 samples to train the regressor.

Although it is tempting to set fit_batch_size to as small a number as possible to begin training the initial condition, formally this initial sample should capture some of the variability in the solution since it is used to calibrate the scalings. This is important because the scalings are fixed for an initial condition regressor and are not updated (unlike the trained model) during partial fitting. Hence in practice, it may be necessary to incrementally add complexity in terms of solubility and equations of state, where the initial condition for each MC is trained on a previous simpler MC run.
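
To illustrate why the initial batch should capture the variability of the solution, the sketch below calibrates a simple standardising scaler on a narrow batch and on a broad batch, then applies both to a later value. The class `FixedScaler` and all numbers are made up for illustration; the actual scaling used by atmodeller's regressor is not shown in this tutorial and may differ.

```python
import statistics


class FixedScaler:
    """Standardises values with a mean and standard deviation fixed at calibration.

    Hypothetical stand-in for scalings that are set once on the initial
    batch and never updated afterwards.
    """

    def __init__(self, calibration_batch):
        self.mean = statistics.fmean(calibration_batch)
        self.std = statistics.stdev(calibration_batch)

    def transform(self, value):
        return (value - self.mean) / self.std


# Made-up calibration batches with the same mean but different spread
narrow_batch = [1.0, 1.1, 0.9, 1.05, 0.95]
broad_batch = [-3.0, -1.0, 1.0, 3.0, 5.0]

# A value encountered later in the MC run
later_value = 4.0

z_narrow = FixedScaler(narrow_batch).transform(later_value)
z_broad = FixedScaler(broad_batch).transform(later_value)
```

With the narrow calibration batch, the later value is scaled to tens of standard deviations from the mean, whereas the broad batch keeps it within about one. Because the scalings are never updated, a poorly calibrated scaler affects every subsequent partial fit.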

In [None]:
interior_atmosphere_ic_switch: InteriorAtmosphereSystem = InteriorAtmosphereSystem(species=species, initial_condition=initial_condition_switch, planet=planet)
monte_carlo(interior_atmosphere_ic_switch, number_of_realisations=200)