In [1]:
# Installation step as requested (assuming necessary packages are not pre-installed)
%pip install numpy pandas json matplotlib bioverse==1.1.8

[31mERROR: Could not find a version that satisfies the requirement json (from versions: none)[0m[31m
[0m[31mERROR: No matching distribution found for json[0m[31m
[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.2.1[0m[39;49m -> [0m[32;49m25.3[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython3 -m pip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


# Tutorial: Generating planetary systems (Updated Small Planets) - Test Version

In this tutorial, we review how to use the `Generator` class to generate a sample of planetary systems, using the updated small planet occurrence rates of Bergsten et al. (2022). The standard plotting cells have been replaced with data saving steps for a test environment.

## Setup

Let's start by importing the necessary modules from Bioverse.

In [2]:
# Import numpy and pandas
import numpy as np
import pandas as pd # Added for data saving
import json # Added for saving analysis results

# Import the Generator class
from bioverse.generator import Generator
from bioverse.constants import ROOT_DIR

# Set a seed for reproducibility
np.random.seed(42)

## Loading the Generator and Replacing Steps

We will load the default generator and replace the planet generation step with the one based on the Bergsten et al. (2022) occurrence rates, which are stored in the Bioverse function `create_planets_Bergsten22`.

In [3]:
# Open the transit mode generator
generator = Generator('transit')

# Replace the planet creation step with the one from Bergsten et al. (2022)
generator.replace_step('create_planets', 'create_planets_Bergsten22')

# Save the new step info to a file
step_info = str(generator.steps[1])
output_filename = 'generator_bergsten22_info.txt'
with open(output_filename, 'w') as f:
    f.write(step_info)
print(f"Generator step replaced. New step info saved to {output_filename}")

TypeError: 'str' object cannot be interpreted as an integer

The new function uses an argument `f_E_min` to set the minimum Earth-radius for a planet to be considered an Exo-Earth Candidate (EEC). Let's change this value and then run the generator.

In [3]:
# Set f_E_min = 0.9 (default is 0.8)
generator.set_arg('f_E_min', 0.9)

# Run the generator with d_max = 100 parsecs
sample = generator.generate(d_max=100)
print("Generated a sample of {:d} transiting planets including {:d} exo-Earth candidates.".format(len(sample), sample['EEC'].sum()))

# Convert the sample to a Pandas DataFrame for saving
try:
    df_sample = sample.to_pandas()
except AttributeError:
    df_sample = pd.DataFrame(sample)

# Save the raw simulated data ('sample') to a CSV file
output_filename = 'updated_planet_sample.csv'
df_sample.to_csv(output_filename, index=False)
print(f"Sample data saved to {output_filename}")

Generated a sample of 2792 transiting planets including 71 exo-Earth candidates.
Sample data saved to updated_planet_sample.csv


## Planet Occurrence Analysis

Now, we can analyze the generated sample to compare the occurrence rates of 'super-Earths' ($1 R_\oplus < R < R_{split}$) and 'sub-Neptunes' ($R_{split} < R < 3.5 R_\oplus$) as a function of orbital period. $R_{split}$ is the radius where the population divide occurs, defined as $R_{split} = 2 (M_{st} / 1 M_\odot)^{1/4}$.

In [None]:
# Calculate R_split based on the mean stellar mass of the sample
M_st_mean = sample['M_st'].mean()
Rsplit = 2 * (M_st_mean / 1.0)**(1/4)

# Filter the sample into super-Earths (sE) and sub-Neptunes (sN)
sE = sample[(sample['R'] >= 1.0) & (sample['R'] < Rsplit)]
sN = sample[(sample['R'] >= Rsplit) & (sample['R'] < 3.5)]

# Define period bins for analysis (replacing the plot binning)
pbin_edges = np.array([2, 5, 10, 20, 40, 100])
pbin_centers = (pbin_edges[:-1] + pbin_edges[1:]) / 2

occurrence_data = []
for i in range(len(pbin_edges) - 1):
    p_min, p_max = pbin_edges[i], pbin_edges[i+1]
    
    # Planets in the current period bin
    this_sample = sample[(sample['P'] >= p_min) & (sample['P'] < p_max)]
    N_total = len(this_sample)
    
    # Super-Earths in the bin
    this_sE = this_sample[(this_sample['R'] >= 1.0) & (this_sample['R'] < Rsplit)]
    # Sub-Neptunes in the bin
    this_sN = this_sample[(this_sample['R'] >= Rsplit) & (this_sample['R'] < 3.5)]
    
    # Calculate fractional occurrence (N_subpopulation / N_total_planets in bin)
    if N_total > 0:
        frac_sE = len(this_sE) / N_total
        frac_sN = len(this_sN) / N_total
    else:
        frac_sE = 0.0
        frac_sN = 0.0
        
    occurrence_data.append({
        'P_center': float(pbin_centers[i]),
        'Frac_SuperEarth': float(frac_sE),
        'Frac_SubNeptune': float(frac_sN),
        'N_planets_in_bin': int(N_total)
    })

# Save the summary to a JSON file
summary = {
    'R_split_mean_Mst': float(Rsplit),
    'occurrence_by_period_bin': occurrence_data
}

output_filename = 'occurrence_rates_summary.json'
with open(output_filename, 'w') as f:
    json.dump(summary, f, indent=4)
print(f"Planet occurrence analysis saved to {output_filename}")

Planet occurrence analysis saved to occurrence_rates_summary.json


The data saved above can be used to verify the simulated population distribution, specifically the expected increase in the Super-Earth fraction relative to the Sub-Neptune fraction at longer orbital periods, as suggested by Bergsten et al. (2022).

## Cleanup

The following lines of code will clean up the files created during this exercise:

In [5]:
import os
trash = [
    'generator_bergsten22_info.txt',
    'updated_planet_sample.csv',
    'occurrence_rates_summary.json'
]
for filename in trash:
    if os.path.exists(filename):
        os.remove(filename)
        print(f"Cleaned up: {filename}")

Cleaned up: generator_bergsten22_info.txt
Cleaned up: updated_planet_sample.csv
Cleaned up: occurrence_rates_summary.json
