## GSpace Simulation

GSpace reads its simulation parameters from a file named *GSpaceSettings.txt*. The parameters used here are shown below:

In [None]:
%%%%%%%% SIMULATION SETTINGS %%%%%%%%%%%%%%%
Data_filename=Simulated_sequences
Run_Number=1

%%%%%%%% OUTPUT FILE FORMAT SETTINGS %%%%%%%
Output_Dir=../../TestExample_GSpace/results
Coordinate_file=true
Sequence_characteristics_file=true
Fasta=true
Fasta_Single_Line_Seq=True

%%%%%%%% MARKERS SETTINGS %%%%%%%%%%%%%%%%%%
Ploidy=Haploid
Chromosome_number=1
Sequence_Size=1000
Mutation_Model=HKY
Mutation_Rate=0.0005

%%%%%%%% RECOMBINATION SETTINGS %%%%%%%%%%%%
Recombination_Rate=0

%%%%%%%% DEMOGRAPHIC SETTINGS %%%%%%%%%%%%%%
%% LATTICE
Lattice_Size_X=20
Lattice_Size_Y=20
Ind_Per_Pop=30

%% DISPERSAL
Dispersal_Distribution=uniform
Disp_Dist_Max=1,1
Total_Emigration_Rate=0.05

%%%%%%%% SAMPLE SETTINGS %%%%%%%%%%%%%%%%%%%
%Sample_Size_X=2
%Sample_Size_Y=2
%Min_Sample_Coordinate_X=9
%Min_Sample_Coordinate_Y=12
Ind_Per_Node_Sampled=5
SampleCoordinateX=9,9,10,10
SampleCoordinateY=12,13,12,13

#### Simulation Settings
- **Data_filename**: `Simulated_sequences`
  Prefix for all output files.
- **Run_Number**: `1`
  Number of simulated datasets to generate.

---

#### Output File Format Settings
- **Output_Dir**: `../../TestExample_GSpace/results`
  Directory where output files will be saved.
- **Coordinate_file**: `true`
  Save a file with coordinates of sampled individuals.
- **Sequence_characteristics_file**: `true`
  Save additional sequence characteristics (e.g., mutations, coordinates).
- **Fasta**: `true`
  Export simulated sequences in FASTA format.
- **Fasta_Single_Line_Seq**: `True`
  Write each sequence on a single line in FASTA files.

---

#### Markers Settings
- **Ploidy**: `Haploid`
  Simulate haploid individuals.
- **Chromosome_number**: `1`
  Each individual has 1 chromosome.
- **Sequence_Size**: `1000`
  Each chromosome is 1000 nucleotides long.
- **Mutation_Model**: `HKY`
  Use the Hasegawa-Kishino-Yano (HKY) nucleotide substitution model.
- **Mutation_Rate**: `0.0005`
  Mutation rate per site per generation.

---

#### Recombination Settings
- **Recombination_Rate**: `0`
  No recombination within chromosomes.

---

#### Demographic Settings
- **Lattice_Size_X**: `20`
- **Lattice_Size_Y**: `20`
  Simulate a 20x20 grid (lattice) representing spatial structure.
- **Ind_Per_Pop**: `30`
  30 individuals per grid node (deme).

- **Dispersal_Distribution**: `uniform`
  Dispersal occurs uniformly to neighboring nodes.
- **Disp_Dist_Max**: `1,1`
  Maximum dispersal distance is 1 unit in both X and Y directions.
- **Total_Emigration_Rate**: `0.05`
  Each individual has a 5% probability of emigrating from its node per generation.

---

#### Sample Settings
- **Ind_Per_Node_Sampled**: `5`
  Sample 5 individuals per selected node.
- **SampleCoordinateX**: `9,9,10,10`
- **SampleCoordinateY**: `12,13,12,13`
  Sampling occurs at 4 nodes: (9,12), (9,13), (10,12), (10,13).

> _Note_: The rectangular sampling settings are commented out.
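
To check how these settings fit together, the key/value layout can be read with a few lines of Python. This is a minimal sketch, not GSpace's own parser: it assumes that lines starting with `%` are comments (as the commented-out rectangular sampling settings suggest) and that every other non-empty line has the form `key=value`. The helper name `parse_settings` is hypothetical.

```python
def parse_settings(text):
    """Parse 'key=value' lines, skipping blank lines and '%' comments (assumed syntax)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# A shortened copy of the settings shown above.
example = """%% LATTICE
Lattice_Size_X=20
Lattice_Size_Y=20
Ind_Per_Pop=30
%Sample_Size_X=2
Ind_Per_Node_Sampled=5
SampleCoordinateX=9,9,10,10
SampleCoordinateY=12,13,12,13
"""

settings = parse_settings(example)

# Pair the i-th X value with the i-th Y value to recover the sampled nodes.
xs = [int(v) for v in settings["SampleCoordinateX"].split(",")]
ys = [int(v) for v in settings["SampleCoordinateY"].split(",")]
nodes = list(zip(xs, ys))

# Individuals sampled in total = sampled nodes * individuals per sampled node.
total_sampled = len(nodes) * int(settings["Ind_Per_Node_Sampled"])

# Individuals on the whole lattice = X * Y * individuals per node.
total_individuals = (
    int(settings["Lattice_Size_X"])
    * int(settings["Lattice_Size_Y"])
    * int(settings["Ind_Per_Pop"])
)

print(nodes)              # [(9, 12), (9, 13), (10, 12), (10, 13)]
print(total_sampled)      # 20
print(total_individuals)  # 12000
```

With these values, 20 of the 12,000 individuals on the lattice are sampled, all from a 2x2 block of adjacent nodes.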


In [1]:
import random

# Parameters
lattice_size_x = 20
lattice_size_y = 20
num_sampled_nodes = 4   # Number of distinct nodes to sample
ind_per_node_sampled = 5

# Generate unique random coordinates
sampled_positions = set()
while len(sampled_positions) < num_sampled_nodes:
    x = random.randint(1, lattice_size_x)
    y = random.randint(1, lattice_size_y)
    sampled_positions.add((x, y))

# Separate X and Y coordinates
sample_x = ",".join(str(pos[0]) for pos in sampled_positions)
sample_y = ",".join(str(pos[1]) for pos in sampled_positions)

# GSpace settings template
gspace_settings = f"""%%%%%%%% SIMULATION SETTINGS %%%%%%%%%%%%%%%
Data_filename=Example_simulated_sequences
Run_Number=1

%%%%%%%% OUTPUT FILE FORMAT SETTINGS %%%%%%%
Output_Dir=../../TestExample_GSpace/results
Coordinate_file=true
Sequence_characteristics_file=true
Fasta=true
Fasta_Single_Line_Seq=True

%%%%%%%% MARKERS SETTINGS %%%%%%%%%%%%%%%%%%
Ploidy=Haploid
Chromosome_number=1
Sequence_Size=1000
Mutation_Model=HKY
Mutation_Rate=0.0005

%%%%%%%% RECOMBINATION SETTINGS %%%%%%%%%%%%
Recombination_Rate=0

%%%%%%%% DEMOGRAPHIC SETTINGS %%%%%%%%%%%%%%
%% LATTICE
Lattice_Size_X={lattice_size_x}
Lattice_Size_Y={lattice_size_y}
Ind_Per_Pop=30

%% DISPERSAL
Dispersal_Distribution=uniform
Disp_Dist_Max=1,1
Total_Emigration_Rate=0.05

%%%%%%%% SAMPLE SETTINGS %%%%%%%%%%%%%%%%%%%
SampleCoordinateX={sample_x}
SampleCoordinateY={sample_y}
Ind_Per_Node_Sampled={ind_per_node_sampled}
"""

# Write to file
with open("GSpaceSettings.txt", "w") as f:
    f.write(gspace_settings)

print("GSpaceSettings.txt generated with random sampling coordinates!")

GSpaceSettings.txt generated with random sampling coordinates!


## Generate GSpaceSettings.txt with random sampling positions

This script automates the creation of a `GSpaceSettings.txt` file for GSpace simulations, drawing the sampling coordinates at random within the defined lattice.

---
### 1. Import required modules

```python
import random
```

Import Python's built-in `random` module, which provides the pseudo-random number functions used in the script (here, `random.randint`).

---
### 2. Define parameters

```python
lattice_size_x = 20
lattice_size_y = 20
num_sampled_nodes = 4
ind_per_node_sampled = 5
```

- `lattice_size_x` / `lattice_size_y`: Define the grid size.
- `num_sampled_nodes`: Number of distinct grid nodes to sample.
- `ind_per_node_sampled`: Number of individuals to sample per node.

---
### 3. Generate unique random coordinates

```python
sampled_positions = set()
while len(sampled_positions) < num_sampled_nodes:
    x = random.randint(1, lattice_size_x)
    y = random.randint(1, lattice_size_y)
    sampled_positions.add((x, y))
```

- Uses a set to ensure all sampled positions are unique.
- Randomly selects (x, y) coordinates within the lattice bounds until reaching the desired number of sampled nodes.
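
As an alternative to the rejection loop above, the nodes can be drawn without replacement in a single call. The sketch below is not part of the original script: it assumes lattice coordinates run from 1 to the lattice size, as in the loop, and the helper name `draw_sample_nodes` and its `seed` argument are hypothetical. Unlike the `while` loop, which would never terminate if more nodes were requested than the lattice contains, `random.sample` raises a `ValueError` in that case.

```python
import random
from itertools import product


def draw_sample_nodes(size_x, size_y, n_nodes, seed=None):
    """Draw n_nodes distinct (x, y) lattice nodes with 1-based coordinates."""
    rng = random.Random(seed)  # passing a seed makes the draw reproducible
    all_nodes = list(product(range(1, size_x + 1), range(1, size_y + 1)))
    # Sampling without replacement guarantees uniqueness;
    # raises ValueError if n_nodes > size_x * size_y.
    return rng.sample(all_nodes, n_nodes)


nodes = draw_sample_nodes(20, 20, 4, seed=42)
print(nodes)
```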

---
### 4. Format coordinates for GSpace

```python
sample_x = ",".join(str(pos[0]) for pos in sampled_positions)
sample_y = ",".join(str(pos[1]) for pos in sampled_positions)
```

- Extracts X and Y coordinates separately.
- Converts them into comma-separated strings matching GSpace’s expected input format:
  - `SampleCoordinateX=...`
  - `SampleCoordinateY=...`
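
Because X and Y are written on separate lines, the i-th entries of both lists must describe the same node. The snippet below illustrates this with fixed positions. Sorting first is an optional addition (not in the original script) that makes the output order reproducible, since the iteration order of a set is arbitrary.

```python
sampled_positions = {(10, 13), (9, 12), (10, 12), (9, 13)}

# Sort once so both strings are built from the same, reproducible order.
ordered = sorted(sampled_positions)

# zip(*ordered) transposes [(x1, y1), (x2, y2), ...] into (x1, x2, ...) and (y1, y2, ...).
xs, ys = zip(*ordered)
sample_x = ",".join(map(str, xs))
sample_y = ",".join(map(str, ys))

print(f"SampleCoordinateX={sample_x}")  # SampleCoordinateX=9,9,10,10
print(f"SampleCoordinateY={sample_y}")  # SampleCoordinateY=12,13,12,13
```

Reading the two lines column by column gives back the four nodes (9,12), (9,13), (10,12), (10,13).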
