In [1]:
# Installation step as requested (assuming necessary packages are not pre-installed)
%pip install numpy pandas json matplotlib bioverse==1.1.8

[31mERROR: Could not find a version that satisfies the requirement json (from versions: none)[0m[31m
[0m[31mERROR: No matching distribution found for json[0m[31m
[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.2.1[0m[39;49m -> [0m[32;49m25.3[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython3 -m pip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


# Tutorial 2: Simulating a transit spectroscopy survey (Test Version)

In this tutorial, we will use the `TransitSurvey` class to simulate a dataset from a transit spectroscopy survey, such as one conducted with the James Webb Space Telescope (JWST) or ARIEL. The standard plotting cells have been replaced with data saving steps for a test environment.

## Setup

Let's start by importing the necessary modules from Bioverse.

In [2]:
# Import numpy and pandas
import numpy as np
import pandas as pd # Added for data saving
import json # Added for saving analysis results

# Import the relevant modules
from bioverse.survey import TransitSurvey
from bioverse.generator import Generator

# Import pyplot (for making plots later) and adjust some of its settings
from matplotlib import pyplot as plt
%matplotlib inline
plt.rcParams['font.size'] = 20.

np.random.seed(42)

## Loading the Generator and Survey

We will now load the Generator from the previous example (which assumes a transiting planet population) and the Transit Survey.

In [3]:
# Load the Generator that was saved in Tutorial 1 (assuming it was run)
# If transit_oceans.pkl doesn't exist, this will fail in a real test environment.
# We will use the default 'transit' and manually apply the settings for a robust test.
generator = Generator('transit')
generator.set_arg('eta_Earth', 0.15)
generator.set_arg('transit_mode', True)

# Load the default Transit Survey (JWST-like)
survey = TransitSurvey('default')

The default survey is designed to simulate a JWST-like transit spectroscopy survey. You can explore its properties similarly to how we did with the Generator:

In [4]:
# Display properties is replaced by saving the string representation
survey_info = str(survey)

# Save survey info to a file
output_filename = 'transit_survey_info.txt'
with open(output_filename, 'w') as f:
    f.write(survey_info)
print(f"Survey info saved to {output_filename}")

Survey info saved to transit_survey_info.txt


The `t_total` argument controls the total amount of survey time in days. By default, this is set to 365.25 days (one year). Let's increase this to two years:

In [5]:
survey.set_arg('t_total', 2*365.25)

AttributeError: 'TransitSurvey' object has no attribute 'set_arg'

## Running the Survey Simulation

The `Survey` class contains the `quickrun()` method that runs the entire simulation chain: the Generator creates the sample of planetary systems, the survey simulator determines which planets are observable, and a simple model determines which biosignatures are detectable. Let's run this now:

In [5]:
sample, detected, data = survey.quickrun(generator)

print(f"Simulated sample size: {len(sample)}")
print(f"Number of detected planets: {len(detected)}")

# Save a summary of the detected planets
summary = {
    'N_detected': int(len(detected)),
    'N_EECs_detected': int(detected['EEC'].sum()),
    'mean_R_detected': float(detected['R'].mean()),
    'mean_P_detected': float(detected['P'].mean())
}
output_filename = 'detected_planets_summary.json'
with open(output_filename, 'w') as f:
    json.dump(summary, f, indent=4)
print(f"Detected planet summary saved to {output_filename}")

Simulated sample size: 2831
Number of detected planets: 31
Detected planet summary saved to detected_planets_summary.json


The `data` object, which is returned by `quickrun()`, contains the detected planetary systems (the same as `detected`) as well as a list of which molecules were detected or non-detected. We can summarize this to see the breakdown of detections. (Plotting replaced with saving the breakdown to a file.)

In [None]:
molecule_names = ['H2O', 'O2', 'O3', 'CH4', 'N2O']
detection_counts = {
    mol: int(data[f'has_{mol}'].sum()) 
    for mol in molecule_names
}
detection_counts['N_total_detected'] = len(data)

output_filename = 'detection_breakdown.json'
with open(output_filename, 'w') as f:
    json.dump(detection_counts, f, indent=4)
print(f"Detection breakdown saved to {output_filename}")

Detection breakdown saved to detection_breakdown.json


## Comparing survey designs

If we wanted to compare this JWST-like survey with a hypothetical high-throughput transit survey, we can make a second survey object. The easiest way to simulate a high-throughput survey is to scale down the per-target observation time by an order of magnitude, which scales up the number of stars that can be observed for the same total observing time.

In [7]:
survey_high_throughput = TransitSurvey('default')
survey_high_throughput.set_arg('t_total', 2*365.25)
survey_high_throughput.set_arg('t_obs', 0.1)

sample_ht, detected_ht, data_ht = survey_high_throughput.quickrun(generator)

Let's compare the results of the two survey designs: (Plotting replaced with saving a comparison summary.)

In [None]:
comparison_summary = {
    'Default_Survey': {
        'N_detected': int(len(data)),
        'N_H2O_detected': int(data['has_H2O'].sum()),
        'N_O2_detected': int(data['has_O2'].sum())
    },
    'High_Throughput_Survey': {
        'N_detected': int(len(data_ht)),
        'N_H2O_detected': int(data_ht['has_H2O'].sum()),
        'N_O2_detected': int(data_ht['has_O2'].sum())
    }
}

output_filename = 'survey_comparison_summary.json'
with open(output_filename, 'w') as f:
    json.dump(comparison_summary, f, indent=4)
print(f"Survey comparison summary saved to {output_filename}")

Survey comparison summary saved to survey_comparison_summary.json


The high-throughput survey detects more planets overall, but the longer observation time of the default survey allows for deeper spectroscopy and thus more biosignature detections. Which survey design is optimal depends on the hypothesis one wishes to test.

The following lines of code will clean up the files created during this exercise:

In [9]:
import os
trash = [
    'transit_survey_info.txt',
    'detected_planets_summary.json',
    'detection_breakdown.json',
    'survey_comparison_summary.json'
]
for filename in trash:
    if os.path.exists(filename):
        os.remove(filename)
        print(f"Cleaned up: {filename}")

Cleaned up: transit_survey_info.txt
Cleaned up: detected_planets_summary.json
Cleaned up: detection_breakdown.json
Cleaned up: survey_comparison_summary.json


The next example will focus on testing a hypothesis with the simulated data.