# Simulation using EpiGeoPop

This walk-through follows the basic instructions in the README for running a simulation with EpiGeoPop, and adds further detail, especially on generating a population using the EpiGeoPop repository.

Please use the other example notebook, `toypop_example.ipynb`, for a walk-through of a simulation using a toy population.

## Step 1: Set up rEpiabm

### Prerequisites

Before beginning, ensure you have:
- Git installed and configured
- RStudio installed
- Access to a terminal
- A GitHub account with permission to access the repositories used below

### Installation Steps

1. Clone the GitHub rEpiabm repository:

```bash
git clone git@github.com:SABS-R3-Epidemiology/rEpiabm.git
cd rEpiabm
```

2. Create a GitHub Personal Access Token:

- Navigate to GitHub Settings → Developer Settings
- Select "Personal access tokens (fine-grained)"
- Create a new token

<div style="margin: 1em 0; padding: 1em; border-left: 4px solid #f0ad4e; background-color: #fcf8f2;">
<strong>Important:</strong> Make sure to copy your token immediately after creation - you won't be able to see it again!
</div>

3. Configure RStudio with your token:
- In the RStudio console, open your `.Renviron` file:

```r
file.edit("~/.Renviron")
```

- Add this line to the `.Renviron` file:

```bash
GITHUB_PAT=<your_personal_access_token>
```

<div style="margin: 1em 0; padding: 1em; border-left: 4px solid #5bc0de; background-color: #f4f8fa;">
<strong>Note:</strong> Replace <code>&lt;your_personal_access_token&gt;</code> with your actual token from GitHub
</div>

- Save the file and restart RStudio

4. Install the required R packages from the RStudio console:

```r
install.packages("devtools")
devtools::install_github("SABS-R3-Epidemiology/rEpiabm")
```

5. Set up your country's data structure:
- Navigate to the `data` folder
- Copy the `Andorra` folder structure
- Rename the copy with your country's name (capitalize the initial letter)

Your folder structure should look like this:

<img src="../images/your_country.png" width="40%"> <br>
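The copy-and-rename step above can also be scripted. The following is a minimal sketch, assuming that only the folder layout of `Andorra` is wanted and not its data files; the helper names `copy_country_template` and `_ignore_files` are hypothetical and not part of rEpiabm.

```python
import os
import shutil
from pathlib import Path


def _ignore_files(directory, names):
    """Tell copytree to skip regular files so that only folders are copied."""
    return [n for n in names if os.path.isfile(os.path.join(directory, n))]


def copy_country_template(data_dir, country, template="Andorra"):
    """Copy the template's folder layout to data/<Country> and return the new path.

    Assumption: the template folder sits directly inside data_dir.
    """
    data_dir = Path(data_dir)
    # Capitalise only the initial letter and leave the rest of the name as given.
    target = data_dir / (country[:1].upper() + country[1:])
    # copytree raises FileExistsError if the target already exists,
    # so an existing country folder is never overwritten.
    shutil.copytree(data_dir / template, target, ignore=_ignore_files)
    return target
```

Run from the root of the rEpiabm repository, a call such as `copy_country_template("data", "luxembourg")` would create `data/Luxembourg` with the same subfolders as `data/Andorra`.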


<div style="margin: 1em 0; padding: 1em; border-left: 4px solid #5bc0de; background-color: #f4f8fa;">
<strong>Success:</strong> You are now ready to generate your population.
</div>



## Step 2: Generate your population with [EpiGeoPop](https://github.com/SABS-R3-Epidemiology/EpiGeoPop)

The following walk-through was completed on a Mac. Please amend commands as needed for your operating system.

### Environment Setup

1. Clone the [EpiGeoPop](https://github.com/SABS-R3-Epidemiology/EpiGeoPop) repository:

```bash
git clone git@github.com:SABS-R3-Epidemiology/EpiGeoPop.git
cd EpiGeoPop
```

2. Create and activate a Python 3.11 environment:

 <div style="margin: 1em 0; padding: 1em; border-left: 4px solid #5bc0de; background-color: #f4f8fa;">
<strong>Note:</strong> EpiGeoPop will not run on Python 3.12.
</div>

```bash
/usr/local/opt/python@3.11/bin/python3.11 -m venv .venv
source .venv/bin/activate
```
    
3. Install dependencies:

```bash
pip install -r requirements.txt
```

 <div style="margin: 1em 0; padding: 1em; border-left: 4px solid #5bc0de; background-color: #f4f8fa;">
<strong>Note:</strong> If you encounter GDAL-related errors, install GDAL first, then re-run the <code>pip install</code> command:
</div> 

```bash
brew install gdal
export GDAL_CONFIG=/usr/local/bin/gdal-config
export GDAL_VERSION=$(gdal-config --version)
```
    
At this point, you should have your local repository, dependencies and environment set up and activated.
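Before moving on, it can be worth confirming that the activated environment really uses Python 3.11. The snippet below is a small sketch to run inside the virtual environment; `is_supported_python` is a hypothetical helper, and treating 3.11 as the only accepted version is a stricter assumption than the note above, which only rules out 3.12.

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True only for Python 3.11.x, the version this walk-through uses."""
    return tuple(version_info[:2]) == (3, 11)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if is_supported_python() else "not the expected 3.11"
    print(f"Python {found}: {status}")
```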

### Configuration steps

1. Configure the parameters for `<your_country>`:

- Locate the parameter files in the `configs/countries` directory
- Edit or copy an existing parameter file
- Update the country and household distribution data:

```json
{
"household_size_distribution": [
    0.228400, 0.252600, 0.252600, 0.110500,
    0.110400, 0.020000, 0.015000, 0.005500,
    0.002500, 0.002500
],
"country": "<your_country>"
}
```
- Save the file with `<your_country>` at the start of the filename, for example `<your_country>_parameters.json`
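A quick sanity check on the edited parameter file can catch typos before the pipeline runs. The sketch below assumes that the entries of `household_size_distribution` are probabilities that should sum to 1 and that `country` is a quoted string; `check_parameters` is a hypothetical helper, and `"Andorra"` is only a placeholder value.

```python
import json
import math


def check_parameters(text):
    """Parse a parameter file and return a list of problems (empty if none)."""
    params = json.loads(text)
    problems = []
    sizes = params.get("household_size_distribution", [])
    if not sizes:
        problems.append("household_size_distribution is missing or empty")
    elif any(p < 0 for p in sizes):
        problems.append("household_size_distribution has a negative entry")
    elif not math.isclose(sum(sizes), 1.0, abs_tol=1e-6):
        # Assumption: the distribution is expected to be normalised.
        problems.append(f"household_size_distribution sums to {sum(sizes):.6f}, not 1")
    if not isinstance(params.get("country"), str) or not params["country"]:
        problems.append("country must be a non-empty string")
    return problems


example = """
{
"household_size_distribution": [
    0.228400, 0.252600, 0.252600, 0.110500,
    0.110400, 0.020000, 0.015000, 0.005500,
    0.002500, 0.002500
],
"country": "Andorra"
}
"""
print(check_parameters(example))
```

To check a real file, read it first, for example `check_parameters(open(path).read())`, where `path` points at your parameter file.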

2. Update the data preparation script:
- Open `prep.sh`
- Replace the population data URL:

```bash
# Comment out old URL
# curl -O https://jeodpp.jrc.ec.europa.eu/ftp/jrc-opendata/GHSL/GHS_POP_MT_GLOBE_R2019A/GHS_POP_E2015_GLOBE_R2019A_4326_30ss/V1-0/GHS_POP_E2015_GLOBE_R2019A_4326_30ss_V1_0.zip

# Add new URL
curl -O https://jeodpp.jrc.ec.europa.eu/ftp/jrc-opendata/GHSL/GHS_POP_GLOBE_R2023A/GHS_POP_E2025_GLOBE_R2023A_4326_30ss/V1-0/GHS_POP_E2025_GLOBE_R2023A_4326_30ss_V1_0.zip
```

- Unzip the new file:

```bash
# Change the filename after the unzip command to match the new download
unzip GHS_POP_E2025_GLOBE_R2023A_4326_30ss_V1_0.zip
```

- Comment out the download commands that are no longer needed:

```bash
# curl -LO "https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_countries_lakes.zip"
# curl -LO "https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip"
# curl -LO "https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_urban_areas_landscan.zip"
# echo "Downloading population age file..."
# # This server uses an outdated SSL protocol so we need to enable legacy renegotiation
# OPENSSL_CONF=../../openssl.cnf curl -O "https://population.un.org/wpp/Download/Files/1_Indicators%20(Standard)/CSV_FILES/WPP2022_PopulationByAge5GroupSex_Medium.zip"
# unzip WPP2022_PopulationByAge5GroupSex_Medium.zip
```

<div style="margin: 1em 0; padding: 1em; border-left: 4px solid #5bc0de; background-color: #f4f8fa;">
<strong>Note:</strong> The files that are still required will be downloaded manually from the website in step 3.
</div>

- Save the file and run:

```bash 
bash prep.sh
```

3. Download additional required data:

- Visit [Natural Earth](https://www.naturalearthdata.com/downloads/10m-cultural-vectors/)
- Under Admin 0 - Countries, click the `Download without boundary lakes` link

<img src="../images/download.png" width="50%"> <br>
    
- Save `ne_10m_admin_0_countries_lakes.zip` to `data/raw/` in your local EpiGeoPop repository.
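The manual download is easy to misplace, so a short check that the file landed in the right folder may help. This is a sketch only: `missing_raw_files` and `EXPECTED_RAW_FILES` are hypothetical names, and the list holds just the one archive named above; add any other files your Snakefile reads from `data/raw/`.

```python
from pathlib import Path

# Only the Natural Earth archive name comes from the step above; extend this
# tuple with any other inputs your Snakefile expects in data/raw/.
EXPECTED_RAW_FILES = ("ne_10m_admin_0_countries_lakes.zip",)


def missing_raw_files(raw_dir="data/raw", expected=EXPECTED_RAW_FILES):
    """Return the expected file names that are not present in raw_dir."""
    raw_dir = Path(raw_dir)
    return [name for name in expected if not (raw_dir / name).is_file()]
```

Run from the root of the EpiGeoPop repository, `missing_raw_files()` returns an empty list when everything listed is in place.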

### Generate Population Data

1. Update the Snakefile configuration:

```bash
# Replace the country name
"data/processed/countries/<your_country>_microcells.csv",

# Update the .tif file references (3 lines within the file)
"GHS_POP_E2025_GLOBE_R2023A_4326_30ss_V1_0.tif"

# Comment out row 9:
"data/processed/countries/Luxembourg_pop_dist.json",

# Comment out row 19:
"outputs/dag.pdf"

# Comment out the first rule:
rule render_dag:
    input:
        "Snakefile"
    output:
        "outputs/dag.pdf"
    shell:
        "snakemake --dag | dot -Tpdf > outputs/dag.pdf"

# Comment out the last rule:
rule make_pop_dist:
    input:
        "data/raw/WPP2022_PopulationByAge5GroupSex_Medium.csv",
        "configs/{region}/{place}_parameters.json"
    output:
        "data/processed/{region}/{place}_pop_dist.json"
    script:
        "scripts/get_pop_dist.py"
```
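Because the `.tif` reference appears on three separate lines, it is easy to miss one. The sketch below lists any active (uncommented) Snakefile lines that still mention an old name. It assumes the old references contain the dataset tag `R2019A` from the original download URL; `stale_lines` is a hypothetical helper.

```python
def stale_lines(snakefile_text, old_tag="R2019A"):
    """Return (line_number, line) pairs that mention old_tag and are not commented out."""
    hits = []
    for number, line in enumerate(snakefile_text.splitlines(), start=1):
        stripped = line.strip()
        # Lines starting with '#' are already commented out and are ignored.
        if old_tag in stripped and not stripped.startswith("#"):
            hits.append((number, stripped))
    return hits
```

For example, `stale_lines(open("Snakefile").read())` should return an empty list once all three references are updated. The same helper can be called with a different `old_tag`, such as the previous country name, to find other leftovers.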

2. Generate the population data:

```bash
snakemake --cores 1
```

<div style="margin: 1em 0; padding: 1em; border-left: 4px solid #5cb85c; background-color: #f3f8f3;">
<strong>Tip:</strong> You can speed up the process by using multiple cores, for example <code>snakemake --cores 4</code>.
</div>
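If you are unsure how many cores to request, the count can be derived from the machine itself. This is a minimal sketch that builds the command rather than running it; `snakemake_command` is a hypothetical helper, and leaving one core free is simply a conservative choice.

```python
import os


def snakemake_command(reserve=1):
    """Build a snakemake command that leaves `reserve` cores free (at least 1 is used)."""
    available = os.cpu_count() or 1  # cpu_count() can return None
    cores = max(1, available - reserve)
    return ["snakemake", "--cores", str(cores)]


print(" ".join(snakemake_command()))
```

The printed line can be pasted into the terminal, or the list can be passed to `subprocess.run` from within the activated environment.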

3. Copy the generated files:

- Locate `<your_country>_microcells.csv` in `data/processed/countries/`
- Copy it to your rEpiabm repository: 

`data/<your_country>/inputs/`
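The copy can also be scripted, which helps when regenerating a population several times. The sketch below takes the source folder from the Snakefile output path (`data/processed/countries/`) and assumes both repositories are cloned locally; `copy_microcells` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def copy_microcells(epigeopop_root, repiabm_root, country):
    """Copy <country>_microcells.csv into rEpiabm's data/<country>/inputs/ folder."""
    source = (
        Path(epigeopop_root) / "data" / "processed" / "countries"
        / f"{country}_microcells.csv"
    )
    if not source.is_file():
        raise FileNotFoundError(f"{source} not found; has snakemake finished?")
    target_dir = Path(repiabm_root) / "data" / country / "inputs"
    # Create the inputs folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 preserves file metadata and returns the path of the new copy.
    return Path(shutil.copy2(source, target_dir))
```

A call such as `copy_microcells("EpiGeoPop", "rEpiabm", "Luxembourg")` returns the path of the copied file.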

You're now ready to run simulations with your country's population data.