
In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
import jax
import jax.numpy as jnp
import numpy as np

# Define simulation parameters
NUM_AGENTS = 50000
SEQUENCE_LENGTH = 100

# Define integer representations for bases
BASE_A = 0
BASE_U = 1
BASE_G = 2
BASE_C = 3

# Define protoribosome states
PROTO_STATE_ACTIVE = 0
PROTO_STATE_ARRESTED = 1
PROTO_STATE_INACTIVE = 2

# Initialize JAX random key
key = jax.random.PRNGKey(42)

# Initialize population state
key, strategy_key, rna_key, proto_key = jax.random.split(key, 4) # Split key for initializations
initial_strategies = jax.random.randint(strategy_key, (NUM_AGENTS,), 0, 2)

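# Initialize RNA sequences with random bases (0-3 representing A, U, G, C).
# NumPy is used for this array; a seeded Generator keeps the run reproducible
# (the seed 42 is an assumption, chosen to mirror the JAX key above).
rna_rng = np.random.default_rng(42)
initial_rna_sequences = rna_rng.integers(0, 4, size=(NUM_AGENTS, SEQUENCE_LENGTH), dtype=np.int8)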

# Initialize protoribosome states (e.g., all start as ACTIVE)
initial_protoribosome_states = np.full(NUM_AGENTS, PROTO_STATE_ACTIVE, dtype=np.int8)

population_state = {
    'strategy': np.array(initial_strategies),  # 0 = cautious, 1 = reckless
    # Per-base damage, shape (NUM_AGENTS, SEQUENCE_LENGTH)
    'rna_damage': np.zeros((NUM_AGENTS, SEQUENCE_LENGTH), dtype=np.float32),
    'has_stop_codon': np.zeros(NUM_AGENTS, dtype=bool),
    'status': np.ones(NUM_AGENTS, dtype=np.int8),  # 1 = alive, 0 = dead
    'rna_sequence': initial_rna_sequences,  # Integer-encoded bases per agent
    'protoribosome_state': initial_protoribosome_states  # One PROTO_STATE_* value per agent
}

print("Initial population state created.")
print(f"Number of agents: {NUM_AGENTS}")
print(f"RNA sequence length: {SEQUENCE_LENGTH}")
print(f"Initial strategies distribution: {np.bincount(population_state['strategy'])}")
print(f"Initial protoribosome states distribution: {np.bincount(population_state['protoribosome_state'])}")

Initial population state created.
Number of agents: 50000
RNA sequence length: 100
Initial strategies distribution: [25131 24869]
Initial protoribosome states distribution: [50000]


In [None]:
import pickle
import os
import re # Import regex module

# Define the directory where checkpoints are saved.
# It must match the directory the simulation loop writes its checkpoints to.
CHECKPOINT_DIR = "/content/drive/MyDrive/Colab Notebooks/qtpu/replica_1/Expt_2025-7-6_08-50"

# Create the directory if it doesn't exist
os.makedirs(CHECKPOINT_DIR, exist_ok=True)
print(f"Ensured checkpoint directory exists: {CHECKPOINT_DIR}")


# Initialize an empty list called simulation_data to store the loaded checkpoint data.
simulation_data = []
loaded_steps = [] # To keep track of the step corresponding to each loaded state

# List all files in the checkpoint directory
all_files = os.listdir(CHECKPOINT_DIR)

# Filter for checkpoint files and sort them by step number
checkpoint_files_to_load = []
# Regex to match checkpoint files (initial_state.pkl, checkpoint_step_X.pkl, final_state_step_X.pkl)
checkpoint_pattern = re.compile(r'^(initial_state\.pkl|checkpoint_step_(\d+)\.pkl|final_state_step_(\d+)\.pkl)$')

for filename in all_files:
    match = checkpoint_pattern.match(filename)
    if match:
        file_path = os.path.join(CHECKPOINT_DIR, filename)
        # Extract the step number for sorting. The regex only captures digits,
        # so int() cannot fail on a matched filename.
        if filename == 'initial_state.pkl':
            step = 0
        elif match.group(3) is not None:
            step = int(match.group(3))  # final_state_step_(\d+).pkl
        else:
            step = int(match.group(2))  # checkpoint_step_(\d+).pkl
        checkpoint_files_to_load.append((step, file_path))

# Sort files by step number
checkpoint_files_to_load.sort(key=lambda x: x[0])

# Now load the sorted files
print("Loading simulation data from checkpoints...")
for step, file_path in checkpoint_files_to_load:
    try:
        with open(file_path, 'rb') as f:
            state = pickle.load(f)
        simulation_data.append(state)
        # The step number was parsed from the filename when the files were sorted
        loaded_steps.append(step)

        print(f"Successfully loaded: {file_path} (Step: {loaded_steps[-1]})")
    except FileNotFoundError:
        print(f"Error: File not found at {file_path}. Skipping.")
    except Exception as e:
        print(f"Error loading {file_path}: {e}")

print(f"\nLoaded data from {len(simulation_data)} checkpoints.")
print(f"Corresponding steps: {loaded_steps}")

# At this point, simulation_data is a list of dictionaries (states at different steps)
# and loaded_steps is a list of the corresponding step numbers.
# You would typically process this data into a pandas DataFrame for plotting.

Ensured checkpoint directory exists: /content/drive/MyDrive/Colab Notebooks/qtpu/replica_1/Expt_2025-7-6_08-50
Loading simulation data from checkpoints...

Loaded data from 0 checkpoints.
Corresponding steps: []


In [None]:
import pandas as pd
import numpy as np # Import numpy as it might be needed for calculations

# Assuming simulation_data is a list of state dictionaries loaded from checkpoints
# and loaded_steps is a list of corresponding step numbers.
# These variables should be available from the previous data loading cell.

if not simulation_data:
    print("No simulation data loaded. Please ensure the data loading cell ran successfully.")
else:
    # Create lists to store the extracted metrics
    steps = []
    cautious_pops = []
    reckless_pops = []
    active_protos = []
    arrested_protos = []
    inactive_protos = []
    avg_damages = []
    total_pops = [] # Add total population for context

    # Iterate through the loaded data
    for i, state in enumerate(simulation_data):
        step = loaded_steps[i] # Get the step number
        steps.append(step)

        # Calculate population counts and other metrics from the state dictionary.
        # Only agents with status == 1 (alive) are counted.
        alive = state['status'] == 1
        total_pop = np.sum(alive) # Total living population
        cautious_pop = np.sum((state['strategy'] == 0) & alive)
        reckless_pop = np.sum((state['strategy'] == 1) & alive)
        active_proto = np.sum((state['protoribosome_state'] == PROTO_STATE_ACTIVE) & alive)
        arrested_proto = np.sum((state['protoribosome_state'] == PROTO_STATE_ARRESTED) & alive)
        inactive_proto = np.sum((state['protoribosome_state'] == PROTO_STATE_INACTIVE) & alive)
        # Average damage over all bases of living agents (0.0 if nobody is alive)
        avg_damage = np.mean(state['rna_damage'][alive]) if total_pop > 0 else 0.0

        cautious_pops.append(cautious_pop)
        reckless_pops.append(reckless_pop)
        active_protos.append(active_proto)
        arrested_protos.append(arrested_proto)
        inactive_protos.append(inactive_proto)
        avg_damages.append(avg_damage)
        total_pops.append(total_pop)

    # Create a pandas DataFrame
    simulation_df = pd.DataFrame({
        'Step': steps,
        'Cautious Population': cautious_pops,
        'Reckless Population': reckless_pops,
        'Total Population': total_pops, # Add total population to DataFrame
        'Active Protoribosomes': active_protos,
        'Arrested Protoribosomes': arrested_protos,
        'Inactive Protoribosomes': inactive_protos,
        'Average RNA Damage': avg_damages
    })

    print("\nSimulation data processed into a DataFrame:")
    display(simulation_df.head()) # Display the first few rows of the DataFrame
    display(simulation_df.tail()) # Display the last few rows

# The simulation_df DataFrame is now ready for plotting.

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns # Import seaborn for potentially nicer plots

# Assuming the simulation_df DataFrame is available from the data processing cell.

if 'simulation_df' not in locals() or simulation_df.empty:
    print("Simulation DataFrame not found or is empty. Please ensure the data processing cell ran successfully.")
else:
    print("\nGenerating plots...")

    # --- Plotting Population Counts Over Time ---
    plt.figure(figsize=(12, 6))
    sns.lineplot(data=simulation_df, x='Step', y='Cautious Population', label='Cautious')
    sns.lineplot(data=simulation_df, x='Step', y='Reckless Population', label='Reckless')
    sns.lineplot(data=simulation_df, x='Step', y='Total Population', label='Total') # Plot total population
    plt.xlabel('Simulation Step')
    plt.ylabel('Population Count')
    plt.title('Population Dynamics Over Time (Cautious vs Reckless)')
    plt.legend()
    plt.grid(True)
    plt.show()

    # --- Plotting Protoribosome States Over Time ---
    plt.figure(figsize=(12, 6))
    sns.lineplot(data=simulation_df, x='Step', y='Active Protoribosomes', label='Active')
    sns.lineplot(data=simulation_df, x='Step', y='Arrested Protoribosomes', label='Arrested')
    sns.lineplot(data=simulation_df, x='Step', y='Inactive Protoribosomes', label='Inactive')
    plt.xlabel('Simulation Step')
    plt.ylabel('Number of Agents')
    plt.title('Protoribosome States Over Time')
    plt.legend()
    plt.grid(True)
    plt.show()


    # --- Plotting Average RNA Damage Over Time ---
    plt.figure(figsize=(12, 6))
    sns.lineplot(data=simulation_df, x='Step', y='Average RNA Damage')
    plt.xlabel('Simulation Step')
    plt.ylabel('Average RNA Damage')
    plt.title('Average RNA Damage Over Time')
    plt.grid(True)
    plt.show()

    print("Plots generated.")


## Simulation Results Analysis

Now that you have the plots for population counts, protoribosome states, and average RNA damage over time, you can analyze the simulation results:

1.  **Population Dynamics Plot**:
    *   Observe how the "Cautious" and "Reckless" populations change over the 10,000 simulation steps.
    *   Did one strategy outcompete the other?
    *   Does the total population remain stable, grow, or decline?
    *   Look for fluctuations or trends that might correlate with the daily and seasonal weather cycles (though the weather itself isn't plotted here).

2.  **Protoribosome States Plot**:
    *   Examine the proportions of agents with "Active", "Arrested", and "Inactive" protoribosomes over time.
    *   How do these states relate to the overall population changes?
    *   Does the number of "Arrested" protoribosomes increase with damage or stop codons?
    *   Does the number of "Inactive" protoribosomes increase, perhaps indicating successful peptide production or other factors leading to inactivation?

3.  **Average RNA Damage Plot**:
    *   Track the average RNA damage level across all living agents over time.
    *   Does the damage level fluctuate with the weather cycles (UV and temperature)?
    *   Does the average damage level stabilize, increase, or decrease over the long term?
    *   How does the average damage relate to the population dynamics and protoribosome states?

By examining these plots together, you can gain insights into how the different strategies, environmental conditions, and the mutation/repair/protoribosome mechanisms influenced the population's survival and composition over time.

---

This concludes the planned steps for running the simulation and visualizing the initial results.

**Next Steps:**

*   **Interpret the plots**: Carefully analyze the generated graphs based on the points above.
*   **Refine parameters**: If the simulation results are not as expected or you want to explore different scenarios, you can go back to the initial setup cell and modify parameters like mutation rates, repair rates, damage thresholds, weather patterns, etc., and re-run the simulation and plotting steps.
*   **Further analysis**: You might want to perform more detailed analysis on the final population state (e.g., examine the RNA sequences of surviving agents).
*   **Experiment with strategies**: You could modify the simulation logic to introduce new strategies or change how existing strategies behave.
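
As a rough illustration of the "further analysis" item, the sketch below tallies base frequencies among the surviving agents of one state dictionary. The helper name `survivor_base_frequencies` and the tiny hand-made state are assumptions for illustration only; the `rna_sequence` and `status` keys and the 0-3 base encoding are taken from the setup cell.

```python
import numpy as np

BASE_NAMES = ("A", "U", "G", "C")  # index matches BASE_A..BASE_C (0-3) in the setup cell


def survivor_base_frequencies(state):
    """Return {base: fraction} over all bases of living agents (status == 1).

    Hypothetical helper: it assumes `state` has the same layout as
    population_state ('rna_sequence' of shape (agents, length), 'status').
    """
    alive = state["status"] == 1
    sequences = state["rna_sequence"][alive]
    if sequences.size == 0:
        return {name: 0.0 for name in BASE_NAMES}
    # bincount needs a flat, non-negative integer array
    counts = np.bincount(sequences.ravel().astype(np.int64), minlength=4)
    fractions = counts / counts.sum()
    return {name: float(frac) for name, frac in zip(BASE_NAMES, fractions)}


# Tiny made-up state: three agents, the second one is dead and is ignored.
toy_state = {
    "rna_sequence": np.array([[0, 0, 1, 2], [3, 3, 3, 3], [0, 1, 1, 3]], dtype=np.int8),
    "status": np.array([1, 0, 1], dtype=np.int8),
}
frequencies = survivor_base_frequencies(toy_state)
print(frequencies)
```

Applied to the last entry of `simulation_data`, the same function would summarize the final population; comparing it with the first entry would show whether base composition drifted during the run.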

# Task
Implement a simulation loop with checkpointing and resume functionality for the provided population state and simulation parameters.

## Define simulation step logic

### Subtask:
Outline the individual processes that occur within a single simulation step (e.g., applying environmental effects like UV damage, handling mutations and repair, updating protoribosome states, implementing strategy-based actions, handling reproduction/death, etc.).


**Reasoning**:
Outline the individual processes within a single simulation step based on the provided context and the subtask instructions. This will involve describing environmental effects, damage/repair, mutations, protoribosome updates, strategy actions, and reproduction/death rules.



In [None]:
# 1. Environmental Factors:
#    - UV radiation: Increases RNA damage. Intensity could vary daily (peak at noon) and seasonally (higher in summer). Represented by an environmental variable that scales damage application.
#    - Temperature: Could affect reaction rates (damage, repair) or metabolic costs. Could also vary daily and seasonally. Represented by an environmental variable.

# 2. RNA Damage and Repair:
#    - Damage Application: In each step, apply damage to each base in the 'rna_sequence'. The amount of damage applied per base is influenced by the current UV radiation level. This updates the 'rna_damage' array.
#    - Repair Mechanism: Agents might have a repair rate. In each step, damage could be reduced on each base based on the repair rate. The repair rate could potentially be influenced by strategy or temperature. This also updates the 'rna_damage' array, reducing values towards zero.

# 3. Mutations:
#    - Mutation Occurrence: Mutations can occur spontaneously or potentially be influenced by high damage levels. In each step, with a certain probability per base, a base in the 'rna_sequence' can change to a random other base. This updates the 'rna_sequence' array.

# 4. Protoribosome State Update:
#    - States: ACTIVE, ARRESTED, INACTIVE.
#    - Transition to ARRESTED: A protoribosome transitions from ACTIVE to ARRESTED if the damage on a base exceeds a threshold during translation or if a stop codon is encountered. The 'has_stop_codon' field, determined from the 'rna_sequence', is key here. High 'rna_damage' values can also trigger this transition.
#    - Transition from ARRESTED: An arrested protoribosome might become ACTIVE again if damage is repaired below a threshold or if the stop codon is somehow resolved (less likely in this simple model). It might transition to INACTIVE if it stays arrested for too long or if the cell dies.
#    - Transition to INACTIVE: A protoribosome might become INACTIVE after successful translation (producing a "peptide" - implicitly assumed if not arrested) or if the agent's status becomes dead.
#    - Updates the 'protoribosome_state' array.

# 5. Strategy-Based Actions:
#    - Cautious Strategy (0): May have a higher repair rate, a lower damage threshold for protoribosome arrest, or slower translation speed to reduce damage accumulation. They might invest more resources in repair.
#    - Reckless Strategy (1): May have a lower repair rate, a higher damage threshold for protoribosome arrest, or faster translation speed, potentially leading to faster resource acquisition but also higher risk of damage and arrest. They might invest less in repair.
#    - Strategy influences how damage and protoribosome states affect survival and reproduction.

# 6. Reproduction and Death:
#    - Reproduction: Agents might reproduce if they accumulate enough resources (not explicitly modeled yet, but could be linked to successful protein production) and are healthy (low damage, active protoribosomes). Reproduction creates a new agent with a potentially mutated RNA sequence.
#    - Death: Agents die if their damage level is too high, their protoribosome is arrested for too long, or they fail to reproduce/acquire resources over a period. The 'status' field is set to 0 for dead agents. Dead agents are removed from active consideration but might be kept in the state for analysis.

# 7. Has Stop Codon Determination:
#    - In each step, or when the 'rna_sequence' changes, check if the sequence contains a stop codon (e.g., a specific sequence of 3 bases like UAA, UAG, UGA represented by integers).
#    - Updates the 'has_stop_codon' boolean array for each agent.

# 8. Other Processes:
#    - Resource Dynamics: While not explicitly defined, resource acquisition (linked to successful translation) and consumption (for maintenance, repair, reproduction) would be implicit drivers of survival and reproduction.
#    - Agent Removal: Dead agents ('status' == 0) are effectively removed from the living population for subsequent calculations but their data might be retained for analysis of extinction dynamics.
#    - Step Increment: The simulation step counter is incremented.
#    - Checkpointing: Periodically save the full 'population_state' dictionary to a file.
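
Items 1 to 3 of the outline above could be sketched with NumPy roughly as follows. Every rate, the sinusoidal day and season model, and the function names (`uv_intensity`, `apply_damage_and_repair`, `mutate`) are illustrative assumptions, not values taken from this notebook; only the array layout follows `population_state`.

```python
import numpy as np

STEPS_PER_DAY = 24          # assumed: one step is one hour
DAYS_PER_YEAR = 365         # assumed
BASE_DAMAGE_RATE = 0.01     # assumed damage per base per step at UV = 1
REPAIR_RATES = np.array([0.002, 0.0005], dtype=np.float32)  # assumed: cautious, reckless
MUTATION_RATE = 1e-3        # assumed per-base mutation probability per step


def uv_intensity(step):
    """Assumed UV model: zero at night, peak at noon, stronger in mid-year."""
    hour = step % STEPS_PER_DAY
    daily = max(0.0, np.sin(np.pi * (hour - 6) / 12))  # positive between 06:00 and 18:00
    day_of_year = (step // STEPS_PER_DAY) % DAYS_PER_YEAR
    seasonal = 1.0 + 0.5 * np.sin(2 * np.pi * (day_of_year - 80) / DAYS_PER_YEAR)
    return float(daily * seasonal)


def apply_damage_and_repair(state, step, rng):
    """Add UV-scaled random damage to every base, then repair by strategy."""
    alive = (state["status"] == 1)[:, None]          # column mask, broadcast over bases
    damage = state["rna_damage"]
    noise = rng.random(damage.shape, dtype=np.float32)
    damage = damage + alive * BASE_DAMAGE_RATE * uv_intensity(step) * noise
    repair = REPAIR_RATES[state["strategy"]][:, None]
    damage = np.maximum(damage - alive * repair, 0.0)  # damage never drops below zero
    state["rna_damage"] = damage.astype(np.float32)


def mutate(state, rng):
    """With probability MUTATION_RATE, swap a living agent's base for a different one."""
    seq = state["rna_sequence"]
    hit = rng.random(seq.shape) < MUTATION_RATE
    hit &= (state["status"] == 1)[:, None]
    # Adding an offset of 1-3 modulo 4 always yields a base different from the original
    offset = rng.integers(1, 4, size=seq.shape, dtype=np.int8)
    state["rna_sequence"] = np.where(hit, (seq + offset) % 4, seq).astype(np.int8)


# Small stand-in population: agent 2 is dead and must stay untouched.
rng = np.random.default_rng(0)
state = {
    "strategy": np.array([0, 1, 1]),
    "status": np.array([1, 1, 0], dtype=np.int8),
    "rna_damage": np.zeros((3, 8), dtype=np.float32),
    "rna_sequence": rng.integers(0, 4, size=(3, 8), dtype=np.int8),
}
for step in range(48):  # two assumed days
    apply_damage_and_repair(state, step, rng)
    mutate(state, rng)
print(state["rna_damage"].mean(axis=1))
```

Each function updates the state dictionary in place, so calls like these could stand where the loop currently has its placeholder comment. Porting them to `jax.numpy` would mean replacing the in-place updates and the NumPy generator with functional updates and explicit PRNG keys.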

## Implement the simulation loop

### Subtask:
Write the main loop that iterates for the specified number of simulation steps.


**Reasoning**:
Define the total number of simulation steps and initialize the simulation loop with a placeholder print statement as requested by the instructions.



In [None]:
# 1. Define the total number of simulation steps
NUM_STEPS = 10000

# Determine the starting step. If checkpoints were loaded (loaded_steps is not empty),
# start from the step after the last loaded one. Otherwise, start from 0.
start_step = loaded_steps[-1] + 1 if loaded_steps else 0

print(f"Starting simulation from step {start_step} for a total of {NUM_STEPS} steps.")

# 2. Initialize a loop that runs from the starting step up to NUM_STEPS.
# 3. Inside the loop, call a function or include the logic for a single simulation step (placeholder).
# 4. Include a placeholder print statement inside the loop to indicate the current simulation step.
# 5. Ensure the loop updates a variable tracking the current simulation step.
for step in range(start_step, NUM_STEPS + 1):
    # Placeholder for simulation step logic (will be implemented later)
    # This is where the population_state dictionary would be updated based on the step's processes.
    # For now, we just have the print statement.

    # Placeholder print statement
    if step % 100 == 0 or step == start_step or step == NUM_STEPS:
        print(f"Simulating step {step}/{NUM_STEPS}")

    # The loop variable 'step' automatically tracks the current simulation step.

    # Placeholder for checkpointing logic (will be implemented later)
    # if step % CHECKPOINT_INTERVAL == 0:
    #     save_checkpoint(population_state, step)

print("Simulation loop initialized.")

Starting simulation from step 0 for a total of 10000 steps.
Simulating step 0/10000
Simulating step 100/10000
Simulating step 200/10000
Simulating step 300/10000
Simulating step 400/10000
Simulating step 500/10000
Simulating step 600/10000
Simulating step 700/10000
Simulating step 800/10000
Simulating step 900/10000
Simulating step 1000/10000
Simulating step 1100/10000
Simulating step 1200/10000
Simulating step 1300/10000
Simulating step 1400/10000
Simulating step 1500/10000
Simulating step 1600/10000
Simulating step 1700/10000
Simulating step 1800/10000
Simulating step 1900/10000
Simulating step 2000/10000
Simulating step 2100/10000
Simulating step 2200/10000
Simulating step 2300/10000
Simulating step 2400/10000
Simulating step 2500/10000
Simulating step 2600/10000
Simulating step 2700/10000
Simulating step 2800/10000
Simulating step 2900/10000
Simulating step 3000/10000
Simulating step 3100/10000
Simulating step 3200/10000
Simulating step 3300/10000
Simulating step 3400/10000
Simulat

## Integrate checkpointing

### Subtask:
Add code within the loop to save the `population_state` to pickle files in the `CHECKPOINT_DIR` periodically. This will allow for resuming the simulation and for later analysis.


**Reasoning**:
Add checkpointing logic inside the simulation loop to save the population state periodically.



In [None]:
import pickle
import os

# Define checkpoint interval
CHECKPOINT_INTERVAL = 1000 # Save checkpoint every 1000 steps

# Check if simulation_data was loaded. If so, resume from the last loaded state.
if simulation_data:
    population_state = simulation_data[-1]
    start_step = loaded_steps[-1] + 1
    print(f"Resuming simulation from step {start_step} with loaded state.")
else:
    # If no data was loaded, use the initial state defined earlier
    # The initial_strategies, initial_rna_sequences, initial_protoribosome_states
    # and the initial population_state dictionary should be available from previous cells.
    # We also need to ensure the initial_state is saved.
    start_step = 0
    print(f"Starting new simulation from step {start_step} with initial state.")
    # Save the initial state
    initial_state_filename = os.path.join(CHECKPOINT_DIR, 'initial_state.pkl')
    with open(initial_state_filename, 'wb') as f:
        pickle.dump(population_state, f)
    print(f"Saved initial state to {initial_state_filename}")


# Main simulation loop
NUM_STEPS = 10000
print(f"Running simulation from step {start_step} to {NUM_STEPS}")

for step in range(start_step, NUM_STEPS + 1):
    # Placeholder for simulation step logic (will be implemented later)
    # This is where the population_state dictionary would be updated based on the step's processes.
    # For now, we just have the print statement.

    # Placeholder print statement
    if step % 1000 == 0 or step == start_step or step == NUM_STEPS:
        print(f"Simulating step {step}/{NUM_STEPS}")

    # Add checkpointing logic
    if step > 0 and (step % CHECKPOINT_INTERVAL == 0 or step == NUM_STEPS):
        if step == NUM_STEPS:
            checkpoint_filename = os.path.join(CHECKPOINT_DIR, f'final_state_step_{step}.pkl')
        else:
            checkpoint_filename = os.path.join(CHECKPOINT_DIR, f'checkpoint_step_{step}.pkl')

        try:
            with open(checkpoint_filename, 'wb') as f:
                pickle.dump(population_state, f)
            print(f"Checkpoint saved at step {step} to {checkpoint_filename}")
        except Exception as e:
            print(f"Error saving checkpoint at step {step}: {e}")


print("Simulation loop finished.")


Starting new simulation from step 0 with initial state.
Saved initial state to /content/drive/MyDrive/Colab Notebooks/qtpu/replica_1/Expt_2025-7-6_08-50/initial_state.pkl
Running simulation from step 0 to 10000
Simulating step 0/10000
Simulating step 1000/10000
Checkpoint saved at step 1000 to /content/drive/MyDrive/Colab Notebooks/qtpu/replica_1/Expt_2025-7-6_08-50/checkpoint_step_1000.pkl
Simulating step 2000/10000
Checkpoint saved at step 2000 to /content/drive/MyDrive/Colab Notebooks/qtpu/replica_1/Expt_2025-7-6_08-50/checkpoint_step_2000.pkl
Simulating step 3000/10000
Checkpoint saved at step 3000 to /content/drive/MyDrive/Colab Notebooks/qtpu/replica_1/Expt_2025-7-6_08-50/checkpoint_step_3000.pkl
Simulating step 4000/10000
Checkpoint saved at step 4000 to /content/drive/MyDrive/Colab Notebooks/qtpu/replica_1/Expt_2025-7-6_08-50/checkpoint_step_4000.pkl
Simulating step 5000/10000
Checkpoint saved at step 5000 to /content/drive/MyDrive/Colab Notebooks/qtpu/replica_1/Expt_2025-7-6_0
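
An interrupted write can leave a truncated pickle behind, which would then fail to load on resume. One hedged way to reduce that risk is to write to a temporary file first and rename it into place. The helper `save_checkpoint_atomic` below is a hypothetical sketch, not part of the notebook; it uses only `pickle`, `os`, and `tempfile` from the standard library. `os.replace` is atomic on a local POSIX filesystem; whether the mounted Drive folder gives the same guarantee is not verified here.

```python
import os
import pickle
import tempfile


def save_checkpoint_atomic(state, directory, filename):
    """Pickle `state` to directory/filename via a temporary file and os.replace."""
    os.makedirs(directory, exist_ok=True)
    final_path = os.path.join(directory, filename)
    # Create the temporary file in the same directory so the rename stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the partial temporary file, then re-raise the original error
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path


# Usage with a throwaway directory and a stand-in state dictionary
with tempfile.TemporaryDirectory() as demo_dir:
    path = save_checkpoint_atomic({"step": 1000}, demo_dir, "checkpoint_step_1000.pkl")
    with open(path, "rb") as f:
        restored = pickle.load(f)
    leftover_tmp = [name for name in os.listdir(demo_dir) if name.endswith(".tmp")]
print(restored, leftover_tmp)
```

The `.tmp` suffix does not match the filename pattern used by the loading cell, so a stray temporary file would be skipped when checkpoints are loaded.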

## Integrate resume functionality

### Subtask:
Modify the simulation loop to check if checkpoint files exist in the `CHECKPOINT_DIR` at the start. If they do, load the latest state to resume the simulation.
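
A minimal sketch of this check, assuming the same filename convention as the loading cell (`initial_state.pkl`, `checkpoint_step_X.pkl`, `final_state_step_X.pkl`). The names `find_latest_checkpoint` and `load_or_init` are hypothetical helpers, not functions defined elsewhere in this notebook. Unlike the loading cell, this version unpickles only the newest file.

```python
import os
import pickle
import re
import tempfile

# Same filename convention as the loading cell
CHECKPOINT_RE = re.compile(r'^(initial_state|checkpoint_step_(\d+)|final_state_step_(\d+))\.pkl$')


def find_latest_checkpoint(directory):
    """Return (step, path) of the highest-step checkpoint, or None if there is none."""
    latest = None
    for name in os.listdir(directory):
        match = CHECKPOINT_RE.match(name)
        if not match:
            continue
        # initial_state.pkl has no captured digits and counts as step 0
        step = int(match.group(2) or match.group(3) or 0)
        if latest is None or step > latest[0]:
            latest = (step, os.path.join(directory, name))
    return latest


def load_or_init(directory, make_initial_state):
    """Resume from the latest checkpoint, or build a fresh state starting at step 0."""
    latest = find_latest_checkpoint(directory)
    if latest is None:
        return make_initial_state(), 0
    step, path = latest
    with open(path, 'rb') as f:
        # Resume at the step after the saved one, as the loop cell does
        return pickle.load(f), step + 1


# Demonstration in a throwaway directory with stand-in states
with tempfile.TemporaryDirectory() as demo_dir:
    fresh_state, fresh_start = load_or_init(demo_dir, lambda: {"status": [1, 1]})
    for saved_step in (0, 1000, 2000):
        name = 'initial_state.pkl' if saved_step == 0 else f'checkpoint_step_{saved_step}.pkl'
        with open(os.path.join(demo_dir, name), 'wb') as f:
            pickle.dump({"saved_at": saved_step}, f)
    resumed_state, resumed_start = load_or_init(demo_dir, dict)
print(fresh_start, resumed_state, resumed_start)
```

In the notebook, `load_or_init(CHECKPOINT_DIR, ...)` would be called once before the loop, with the second argument returning the initial `population_state`.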


## Combine with initialization and plotting

### Subtask:
Show how the new simulation loop code connects with the existing initialization and plotting cells to create a complete workflow.


**Reasoning**:
Add markdown cells and comments to explain the flow of the notebook and the purpose of each code cell to show how the new simulation loop connects with the existing initialization and plotting cells.



In [None]:
# This notebook simulates the evolution of a population of agents with different strategies
# and RNA repair mechanisms under environmental pressures.

# --- Section 1: Initialization ---
# This section initializes the simulation parameters and the initial state of the agent population.

# Code cell: Define simulation parameters and initial population state
# Defines constants like NUM_AGENTS, SEQUENCE_LENGTH, base and protoribosome states.
# Initializes the population_state dictionary with initial strategies, RNA sequences,
# RNA damage, stop codon status, agent status (alive/dead), and protoribosome states.

# --- Section 2: Data Loading and Checkpoint Setup ---
# This section sets up the directory for saving checkpoints and attempts to load
# any existing checkpoint data to resume a previous simulation run.

# Code cell: Load simulation data from checkpoints
# Defines the directory for checkpoints (CHECKPOINT_DIR).
# Creates the directory if it doesn't exist.
# Lists, sorts, and loads existing checkpoint files into simulation_data and loaded_steps.
# This allows the simulation to resume from the last saved state.

# --- Section 3: Simulation Loop with Checkpointing ---
# This is the main simulation loop where the state of the population is updated over time.
# It includes the logic for applying simulation processes and saving checkpoints periodically.

# Code cell: Main simulation loop with checkpointing and resume functionality
# Determines the starting step based on whether checkpoint data was loaded.
# Sets the total number of simulation steps (NUM_STEPS).
# If resuming, loads the population state from the last checkpoint.
# If starting new, saves the initial state.
# Iterates from the start_step to NUM_STEPS.
# *** Placeholder for the actual simulation step logic *** (This is where environmental effects,
# damage, repair, mutations, protoribosome updates, reproduction, and death would be implemented).
# Saves the population_state to a checkpoint file at regular intervals (CHECKPOINT_INTERVAL)
# and at the final step.

# --- Section 4: Data Processing ---
# After the simulation is complete (or after loading checkpoints for analysis),
# this section processes the loaded state data into a structured format (pandas DataFrame)
# suitable for analysis and plotting.

# Code cell: Process simulation data into a DataFrame
# Checks if simulation_data was loaded.
# Iterates through each state dictionary in simulation_data.
# Extracts metrics like population counts (cautious, reckless, total),
# protoribosome state counts (active, arrested, inactive), and average RNA damage.
# Creates a pandas DataFrame (simulation_df) with these metrics, one row per loaded step.

# --- Section 5: Data Visualization ---
# This section generates plots based on the processed simulation data in the DataFrame
# to visualize the dynamics of the population and other key metrics over time.

# Code cell: Generate plots from simulation data
# Checks if the simulation_df DataFrame exists and is not empty.
# Uses matplotlib and seaborn to create line plots:
# - Population counts over time (Cautious, Reckless, Total).
# - Protoribosome states over time (Active, Arrested, Inactive).
# - Average RNA damage over time.
# Displays the generated plots.

# --- Section 6: Analysis and Next Steps ---
# Provides guidance on interpreting the generated plots and suggests potential
# next steps for further analysis or simulation experiments.

# Markdown cell: Simulation Results Analysis
# Contains explanatory text guiding the user on how to analyze the plots
# and what insights to look for regarding population dynamics, protoribosome states,
# and RNA damage in relation to the simulation parameters and processes.
# Suggests next steps like interpreting plots, refining parameters, further analysis,
# and experimenting with strategies.


## Summary:

### Data Analysis Key Findings

*   The simulation step logic was outlined, detailing processes such as environmental factors, RNA damage and repair, mutations, protoribosome state updates, strategy-based actions, reproduction and death, and stop codon determination.
*   A main simulation loop was implemented to iterate for a specified number of steps, incorporating placeholders for the simulation step logic and progress tracking.
*   Checkpointing functionality was integrated into the loop to periodically save the `population_state` to pickle files, including saving the initial and final states.
*   The resume functionality, which checks for and loads the latest checkpoint at the start of the simulation, was confirmed to be already implemented.
*   The complete workflow, connecting initialization, data loading (including resume), the simulation loop with checkpointing, data processing, and plotting, was outlined using markdown and comments.

### Insights or Next Steps

*   The current implementation provides a solid framework for an evolutionary simulation. The next crucial step is to replace the placeholder in the simulation loop with the actual logic for updating the `population_state` based on the detailed processes outlined in the first step.
*   The data processing and plotting cells already turn the checkpointed `population_state` data into a pandas DataFrame and line plots. With the placeholder loop, however, every checkpoint holds the same unchanged state, so these cells will only show meaningful population, protoribosome, and damage dynamics once the step logic is implemented.
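
As one possible starting point for replacing the loop placeholder, the sketch below covers outline items 4, 6, and 7: stop codon detection, protoribosome transitions, and death. The thresholds, the choice of reading frame 0, and the function names are assumptions for illustration; the state constants and the A=0, U=1, G=2, C=3 encoding follow the setup cell.

```python
import numpy as np

PROTO_STATE_ACTIVE, PROTO_STATE_ARRESTED, PROTO_STATE_INACTIVE = 0, 1, 2
# Stop codons UAA, UAG, UGA in the notebook's encoding (A=0, U=1, G=2, C=3)
STOP_CODONS = np.array([[1, 0, 0], [1, 0, 2], [1, 2, 0]], dtype=np.int8)
ARREST_THRESHOLDS = np.array([0.5, 0.8], dtype=np.float32)  # assumed: cautious, reckless
DEATH_THRESHOLD = 1.0                                        # assumed lethal mean damage


def detect_stop_codons(sequences):
    """True for each agent whose sequence has a stop codon in reading frame 0 (assumed frame)."""
    usable = sequences.shape[1] - sequences.shape[1] % 3   # drop a trailing partial codon
    codons = sequences[:, :usable].reshape(sequences.shape[0], -1, 3)
    # Compare every codon with every stop codon: (agents, codons, 1, 3) against (3, 3)
    is_stop = (codons[:, :, None, :] == STOP_CODONS).all(axis=-1).any(axis=-1)
    return is_stop.any(axis=1)


def update_protoribosomes_and_status(state):
    """Apply one assumed set of transition rules to the state dictionary in place."""
    alive = state["status"] == 1
    state["has_stop_codon"] = detect_stop_codons(state["rna_sequence"])
    peak_damage = state["rna_damage"].max(axis=1)
    threshold = ARREST_THRESHOLDS[state["strategy"]]
    blocked = state["has_stop_codon"] | (peak_damage > threshold)

    proto = state["protoribosome_state"].copy()
    # Masks are taken before any update so one agent makes at most one transition
    active = proto == PROTO_STATE_ACTIVE
    arrested = proto == PROTO_STATE_ARRESTED
    proto[alive & active & blocked] = PROTO_STATE_ARRESTED
    proto[alive & arrested & ~blocked] = PROTO_STATE_ACTIVE  # repaired: translation resumes

    # Death: mean damage above the assumed threshold; dead agents become INACTIVE
    dies = alive & (state["rna_damage"].mean(axis=1) > DEATH_THRESHOLD)
    state["status"] = np.where(dies, 0, state["status"]).astype(np.int8)
    proto[dies] = PROTO_STATE_INACTIVE
    state["protoribosome_state"] = proto


state = {
    "strategy": np.array([0, 1, 0, 1]),
    "status": np.ones(4, dtype=np.int8),
    "rna_sequence": np.array([[0, 0, 0, 2, 2, 2],    # no stop codon
                              [1, 0, 0, 3, 3, 3],    # UAA in frame
                              [0, 1, 0, 0, 3, 3],    # UAA out of frame: ignored
                              [2, 2, 2, 3, 3, 3]], dtype=np.int8),
    "rna_damage": np.zeros((4, 6), dtype=np.float32),
    "has_stop_codon": np.zeros(4, dtype=bool),
    "protoribosome_state": np.zeros(4, dtype=np.int8),
}
state["rna_damage"][0, 2] = 0.6   # above the assumed cautious arrest threshold
state["rna_damage"][3, :] = 2.0   # lethal for agent 3 under the assumed threshold
update_protoribosomes_and_status(state)
print(state["has_stop_codon"], state["protoribosome_state"], state["status"])
```

In this toy run, agents 0 and 1 end up arrested (damage and stop codon respectively), agent 2 stays active, and agent 3 dies and becomes inactive. Reproduction is left out because the outline does not yet define a resource model.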
