# Simulation using EpiGeoPop

This walk-through follows the basic instructions in the README for running a simulation with EpiGeoPop, and adds further detail, especially on generating a population using the EpiGeoPop repository.

Please use the other example notebook, `toypop_example.ipynb`, for a walk-through of a simulation using a toy population.

## Step 1: Set up rEpiabm

### Prerequisites

Before beginning, ensure you have:
- Git installed and configured
- RStudio installed
- Access to a terminal
- A GitHub account with permission to create a personal access token

### Installation Steps

1. Clone the GitHub rEpiabm repository:

   ```bash
   git clone git@github.com:SABS-R3-Epidemiology/rEpiabm.git
   cd rEpiabm
   ```

2. Create a GitHub Personal Access Token:

   - Navigate to GitHub Settings → Developer Settings
   - Select "Personal access tokens (fine-grained)"
   - Create a new token

<div style="margin: 1em 0; padding: 1em; border-left: 4px solid #f0ad4e; background-color: #fcf8f2;">
<strong>Important:</strong> Make sure to copy your token immediately after creation - you won't be able to see it again!
</div>

<div style="margin: 1em 0; padding: 1em; border-left: 4px solid #f0ad4e; background-color: #fcf8f2;">
<strong>Warning:</strong> This step requires administrator privileges.
</div>

<div style="margin: 1em 0; padding: 1em; border-left: 4px solid #5bc0de; background-color: #f4f8fa;">
<strong>Note:</strong> The installation may take several minutes to complete.
</div>

<div style="margin: 1em 0; padding: 1em; border-left: 4px solid #5cb85c; background-color: #f3f8f3;">
<strong>Tip:</strong> You can speed up the process by using multiple cores.
</div>


3. Configure RStudio with your token. In the RStudio console, open your `.Renviron` file:

    ```file.edit("~/.Renviron")```

4. In the `.Renviron` file, add the line

    ```GITHUB_PAT=<your_personal_access_token>```

    replacing `<your_personal_access_token>` with your real token from GitHub, and make sure to **save** the file.

    <img src="../images/token.jpg" width="50%"> <br>

5. Restart RStudio.

6. In the R console, enter:
    ```
    install.packages("devtools")
    devtools::install_github("SABS-R3-Epidemiology/rEpiabm")
    ```

7. Copy the example `Andorra` folder structure within the `data` folder and name the copy after your country (capitalise the initial letter). Keep the file it includes, as you will need to edit it for your simulation.

    Current folder structure:
    <img src="../images/andorra_file_structure.png" width="50%"> <br>
    
    Amended file structure:
    <img src="../images/your_country.png" width="50%"> <br>


You are now ready to generate your population.
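
The folder copy in the last installation step can also be scripted. The following is a minimal Python sketch, not part of rEpiabm: the function name `create_country_folder` is hypothetical, and the `inputs` subfolder used in the demonstration is only a stand-in for whatever the real `Andorra` folder contains.

```python
import shutil
import tempfile
from pathlib import Path


def create_country_folder(data_dir, country, template="Andorra"):
    """Copy the template country folder to a new folder named after `country`."""
    data_dir = Path(data_dir)
    # Capitalise only the initial letter, as the walk-through asks.
    name = country[:1].upper() + country[1:]
    source = data_dir / template
    target = data_dir / name
    if not source.is_dir():
        raise FileNotFoundError(f"Template folder not found: {source}")
    if target.exists():
        raise FileExistsError(f"Country folder already exists: {target}")
    shutil.copytree(source, target)
    return target


if __name__ == "__main__":
    # Demonstrate on a throwaway directory so no real clone is touched.
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp) / "data"
        # Stand-in for the example folder shipped with the repository.
        (data_dir / "Andorra" / "inputs").mkdir(parents=True)
        print(create_country_folder(data_dir, "gibraltar").name)
```

To use it on a real clone, pass the path to the `data` folder of your local rEpiabm repository instead of the temporary directory.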

## Step 2: Generate your population using EpiGeoPop

The following walk-through was completed on a Mac. Please amend the commands for your operating system.

1. Go to the [EpiGeoPop](https://github.com/SABS-R3-Epidemiology/EpiGeoPop) repository on GitHub.

2. Clone the repository using your preferred method. The example uses the terminal:

    ```
    git clone git@github.com:SABS-R3-Epidemiology/EpiGeoPop.git
    cd EpiGeoPop
    ```

    The directory structure should be as below:

    <img src="../images/epigeopop_folder_structure.png" width="50%"> <br>

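3. Create a Python 3.11 virtual environment, as EpiGeoPop will not run on Python 3.12. The interpreter path below is where Homebrew places Python 3.11 on an Intel Mac, so adjust it to match your installation. In the terminal, type: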
    ```
    /usr/local/opt/python@3.11/bin/python3.11 -m venv .venv
    ```

4. Activate the environment. In the terminal, type:
    ```
    source .venv/bin/activate
    ```

5. Install the dependencies. In the terminal, type:
    ```
    pip install -r requirements.txt
    ```
    
    The installation may fail with an error ending:

        <string>:78: DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead
        WARNING:root:Failed to get options via gdal-config: [Errno 2] No such file or directory: 'gdal-config'
        CRITICAL:root:A GDAL API version must be specified. Provide a path to gdal-config using a GDAL_CONFIG environment variable or use a GDAL_VERSION environment variable.
        [end of output]

        note: This error originates from a subprocess, and is likely not a problem with pip.
        error: subprocess-exited-with-error
        × Getting requirements to build wheel did not run successfully.
        │ exit code: 1
        ╰─> See above for output.
        note: This error originates from a subprocess, and is likely not a problem with pip.


    This is because Fiona, which is a Python package for reading and writing spatial data, requires GDAL (Geospatial Data Abstraction Library) to be installed on your system first.

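    To resolve this, install GDAL with the commands below, then re-run `pip install -r requirements.txt`. The `gdal-config` path shown is the Homebrew location on an Intel Mac; if yours differs, `which gdal-config` will show it. In the terminal, type: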

    ```
    brew install gdal
    export GDAL_CONFIG=/usr/local/bin/gdal-config
    export GDAL_VERSION=$(gdal-config --version)
    ```

At this point, you should have your local repository, dependencies and environment set up and activated.
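
Before moving on, it can help to confirm the two things the steps above depend on: the interpreter version and the presence of `gdal-config`. The sketch below is an illustrative check only, assuming it is run with the interpreter from the activated `.venv`; the function names are hypothetical and are not part of EpiGeoPop.

```python
import shutil
import sys


def python_is_supported(version_info=sys.version_info):
    """Return True for Python 3.11, the version this walk-through uses."""
    return (version_info[0], version_info[1]) == (3, 11)


def gdal_config_path(which=shutil.which):
    """Return the path to gdal-config if it is on PATH, otherwise None."""
    return which("gdal-config")


if __name__ == "__main__":
    print("Python 3.11:", python_is_supported())
    print("gdal-config:", gdal_config_path() or "not found, install GDAL first")
```

If `gdal-config` is reported as not found, the Fiona build error described in the dependency step is the likely outcome of `pip install -r requirements.txt`.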

6. EpiGeoPop reads a JSON parameter file that specifies which country's population to extract and that country's household size distribution. Edit one of the existing parameter files in the folder shown below.

    <img src="../images/epigeopop_country.png" width="50%"> <br>

    Change the `household_size_distribution` figures to your country's household size distribution. The example below is for Gibraltar:

    ```
    "household_size_distribution": [0.228400, 0.252600, 0.252600, 0.110500,
                                        0.110400, 0.020000, 0.015000, 0.005500,
                                        0.002500, 0.002500]
    ```

    Change line 16 to the country of your choice, for example:

    ```
        "country": "Andorra",
    ```


    Save the file with your country's name at the start of the filename, for example `<your_country>_parameters.json`.
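
A quick sanity check on the household figures can catch typing mistakes before the pipeline runs. The sketch below assumes the entries are proportions that should be non-negative and sum to 1; that requirement is an assumption about how the list is interpreted, and `validate_household_distribution` is a hypothetical helper, not an EpiGeoPop function.

```python
import json
import math


def validate_household_distribution(dist, tol=1e-6):
    """Check that the values are non-negative and sum to 1 within `tol`."""
    if any(p < 0 for p in dist):
        raise ValueError("household proportions cannot be negative")
    total = math.fsum(dist)
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"distribution sums to {total:.6f}, expected 1")
    return True


# Gibraltar figures from the walk-through above.
gibraltar = [0.228400, 0.252600, 0.252600, 0.110500,
             0.110400, 0.020000, 0.015000, 0.005500,
             0.002500, 0.002500]

if __name__ == "__main__":
    validate_household_distribution(gibraltar)
    # Print the list as JSON, ready to paste into the parameter file.
    print(json.dumps({"household_size_distribution": gibraltar}, indent=4))
```

The check raises a `ValueError` with the offending total, which makes it easy to spot a dropped or mistyped entry.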

7. Next, the file ```prep.sh``` downloads data from two different websites. However, these websites have since been updated, so the following changes need to be made.

    Comment out the first `curl` line:

    ```curl -O https://jeodpp.jrc.ec.europa.eu/ftp/jrc-opendata/GHSL/GHS_POP_MT_GLOBE_R2019A/GHS_POP_E2015_GLOBE_R2019A_4326_30ss/V1-0/GHS_POP_E2015_GLOBE_R2019A_4326_30ss_V1_0.zip```

    and replace it with:

    ```curl -O https://jeodpp.jrc.ec.europa.eu/ftp/jrc-opendata/GHSL/GHS_POP_GLOBE_R2023A/GHS_POP_E2025_GLOBE_R2023A_4326_30ss/V1-0/GHS_POP_E2025_GLOBE_R2023A_4326_30ss_V1_0.zip```

    Copy the filename from the end of this path, ```GHS_POP_E2025_GLOBE_R2023A_4326_30ss_V1_0.zip```, and use it to replace the filename after the `unzip` command:

    ```unzip GHS_POP_E2025_GLOBE_R2023A_4326_30ss_V1_0.zip```

    Finally, comment out the remaining lines of code. We will need to download the files from the website directly.

    Your file should look like this (the full long links are not shown):

    <img src="../images/prep_sh.png" width="100%"> <br>

    Save the file and in the terminal, type:

    ```bash prep.sh```

8. For the final download, go to [Natural Earth](https://www.naturalearthdata.com/downloads/10m-cultural-vectors/) and click on the link ```Download without boundary lakes```. This downloads a zip file,

    ```ne_10m_admin_0_countries_lakes.zip```

    Save it in the ```data/raw``` folder, which was created by ```prep.sh```.
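
Once both downloads are done, a short check can confirm the raw files are where the pipeline will look for them. This is a sketch under assumptions: the file names come from the steps above, but it assumes that the unzipped `.tif` sits directly in `data/raw` alongside the Natural Earth zip, and `missing_raw_files` is a hypothetical helper.

```python
from pathlib import Path

# Names taken from the steps above; the .tif location in data/raw is assumed.
EXPECTED_RAW_FILES = [
    "GHS_POP_E2025_GLOBE_R2023A_4326_30ss_V1_0.tif",
    "ne_10m_admin_0_countries_lakes.zip",
]


def missing_raw_files(raw_dir, expected=EXPECTED_RAW_FILES):
    """Return the expected file names that are not present in `raw_dir`."""
    raw_dir = Path(raw_dir)
    return [name for name in expected if not (raw_dir / name).is_file()]


if __name__ == "__main__":
    # Run from the root of the EpiGeoPop clone.
    missing = missing_raw_files("data/raw")
    print("All raw files present" if not missing else f"Missing: {missing}")
```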

9. The Snakefile uses these data extracts to create the CSV file. However, the Snakefile needs to be amended as follows.

    Open the file ```Snakefile``` and amend row 8 to be the country of your choice (replacing Luxembourg):

    ```"data/processed/countries/<your_country>_microcells.csv",```

    Comment out row 9:

    ```"data/processed/countries/Luxembourg_pop_dist.json",```
        
    Comment out row 19:

    ```"outputs/dag.pdf"```

    Comment out the first rule:

    ```
    rule render_dag:
        input:
            "Snakefile"
        output:
            "outputs/dag.pdf"
        shell:
            "snakemake --dag | dot -Tpdf > outputs/dag.pdf"
    ```

    Replace the `.tif` file references in rows 31, 40 and 49 with the following:

    ```GHS_POP_E2025_GLOBE_R2023A_4326_30ss_V1_0.tif```

    as this is the new file downloaded using ```prep.sh```.

    Finally, scroll to the bottom of the Snakefile and comment out the following:

    ```
    rule make_pop_dist:
        input:
            "data/raw/WPP2022_PopulationByAge5GroupSex_Medium.csv",
            "configs/{region}/{place}_parameters.json"
        output:
            "data/processed/{region}/{place}_pop_dist.json"
        script:
            "scripts/get_pop_dist.py"
    ```

    as this is not needed.
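
Commenting out a whole rule by hand is easy to get wrong if one line is missed. As an illustration, the sketch below prefixes every line of a named rule with `# `, treating the rule as ending at the next non-indented line. It is a hypothetical text helper operating on the Snakefile's contents, not a Snakemake feature, and it assumes the rule body is indented beneath its `rule` line.

```python
def comment_out_rule(snakefile_text, rule_name):
    """Return the text with every line of the named rule commented out."""
    out = []
    in_rule = False
    for line in snakefile_text.splitlines():
        if line.startswith(f"rule {rule_name}:"):
            in_rule = True
        elif in_rule and line.strip() and not line[0].isspace():
            # The next top-level statement marks the end of the rule.
            in_rule = False
        # Blank lines are left untouched so the file layout is preserved.
        out.append("# " + line if in_rule and line.strip() else line)
    trailing = "\n" if snakefile_text.endswith("\n") else ""
    return "\n".join(out) + trailing


if __name__ == "__main__":
    sample = (
        'rule make_pop_dist:\n'
        '    output:\n'
        '        "data/processed/{region}/{place}_pop_dist.json"\n'
        '\n'
        'rule other:\n'
        '    output:\n'
        '        "kept.txt"\n'
    )
    print(comment_out_rule(sample, "make_pop_dist"))
```

Applied to the real file, the same function could be called once for `render_dag` and once for `make_pop_dist`, with the result reviewed before it is written back.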


10. Extract the population for your country using the terminal by typing:

    ```snakemake --cores 1```
    
    This creates an output directory called ```data/processed/countries```, matching the target path set in the Snakefile.

    Copy ```<your_country>_microcells.csv``` to the ```data/<your_country>/inputs``` folder you created in Step 1 in your local rEpiabm repository.


Now you have the population for your country ready for rEpiabm to run a simulation.
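
The final copy between the two repositories can be sketched in Python as follows. The helper `copy_microcells` is hypothetical, and it assumes the Snakemake output lands in `data/processed/countries` (as the Snakefile target path suggests) and that the destination is `data/<your_country>/inputs` in the rEpiabm clone.

```python
import shutil
import tempfile
from pathlib import Path


def copy_microcells(epigeopop_dir, repiabm_dir, country):
    """Copy <country>_microcells.csv from EpiGeoPop into rEpiabm's inputs."""
    source = (Path(epigeopop_dir) / "data" / "processed" / "countries"
              / f"{country}_microcells.csv")
    target_dir = Path(repiabm_dir) / "data" / country / "inputs"
    if not source.is_file():
        raise FileNotFoundError(f"Microcells file not found: {source}")
    # Create the inputs folder if the Step 1 copy did not already include it.
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, target_dir))


if __name__ == "__main__":
    # Demonstrate with throwaway directories standing in for the two clones.
    with tempfile.TemporaryDirectory() as tmp:
        epigeopop = Path(tmp) / "EpiGeoPop"
        repiabm = Path(tmp) / "rEpiabm"
        out_dir = epigeopop / "data" / "processed" / "countries"
        out_dir.mkdir(parents=True)
        (out_dir / "Gibraltar_microcells.csv").write_text("placeholder\n")
        print(copy_microcells(epigeopop, repiabm, "Gibraltar").name)
```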

