# Semantic Segmentation of Aerial Imagery with Raster Vision 
## Part 1: Tutorial Setup on SCINet

This tutorial series walks through an example of using [Raster Vision](https://rastervision.io/) to train a deep learning model to identify buildings in satellite imagery.</br>

*Primary Libraries and Tools*:

|Name|Description|Link|
|-|-|-|
| `Raster Vision` | Library and framework for geospatial semantic segmentation, object detection, and chip classification with Python | https://rastervision.io/ |
| `Singularity` | Containerization software that allows for transportable and reproducible software | https://docs.sylabs.io/guides/3.5/user-guide/introduction.html |
| `pandas` | Dataframes and other datatypes for data analysis and manipulation | https://pandas.pydata.org/ |
| `geopandas` | Extends datatypes used by pandas to allow spatial operations on geometric types | https://geopandas.org/en/stable/ |
| `rioxarray` | Data structures and routines for working with gridded geospatial data | https://github.com/corteva/rioxarray |
| `plotnine` | A plotting library for Python modeled after R's [ggplot2](https://ggplot2.tidyverse.org/) | https://plotnine.readthedocs.io/en/v0.12.3/ |
| `pathlib` | A Python library for handling files and paths in the filesystem | https://docs.python.org/3/library/pathlib.html |

*Prerequisites*:
  * Basic familiarity with the Linux command line, including navigating among directories and editing text files
  * Basic Python skills, including an understanding of object-oriented programming, function calls, and basic data types
  * Basic understanding of shell scripts and job scheduling with Slurm for running code on Atlas
  * A SCINet account for running this tutorial on Atlas

*Tutorials in this Series*:
  * 1\. **Tutorial Setup on SCINet <span style="color: red;">_(You are here)_</span>**
  * 2\. **Overview of Deep Learning for Imagery and the Raster Vision Pipeline**
  * 3\. **Constructing and Exploring the Singularity Image**
  * 4\. **Exploring the dataset and problem space**
  * 5\. **Overview of Raster Vision Model Configuration and Setup**
  * 6\. **Breakdown of Raster Vision Code Version 1**
  * 7\. **Evaluating training performance and visualizing predictions**

## Tutorial Setup

This first tutorial in the series sets up your computational environment on SCINet. To begin, launch [Open OnDemand](https://atlas-ood.hpc.msstate.edu/pun/sys/dashboard) in your browser and log in with your SCINet credentials. </br>

#### Project Group Identification
This tutorial requires you to specify a project account name to launch a Jupyter session and to run batch scripts through Slurm. If you are a part of a project group, you can use that group name as your account name. </br> </br>
From MSU OnDemand, click <b>Clusters</b>, then <b>Atlas Shell Access</b>. </br>
![Cluster_tab.png](imgs/atlas_shell_access.png) </br>
This will open up a terminal tab in another browser window. Log in with your SCINet credentials, then run the following command: </br>
`sacctmgr -Pns show user format=account where user=$USER` </br> </br>
This will output a list of project groups you are a part of. If you are a part of a project group, you can use any of these project group names to launch jobs for this tutorial. </br> </br>
If you are <b>not</b> a part of a project group, you can use the account `sandbox`. This will only grant you access to limited computational resources, and the scripts included in this tutorial will take longer to run. </br></br>
Take note of the project group name you would like to use, as we will need it in the next section.
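
The account selection described above can be sketched as a small shell helper. This is a minimal sketch and not part of the workshop files: `pick_account` is a hypothetical function name, and it assumes the `sacctmgr` output lists one account per line. It prints the first account listed and falls back to `sandbox` when the list is empty.

```shell
# Hypothetical helper (not part of the workshop files): read account names
# from standard input, one per line, and print the first non-blank one.
# Prints "sandbox" when no account is listed.
pick_account() {
    first=$(grep -v '^[[:space:]]*$' | head -n 1)
    echo "${first:-sandbox}"
}

# On Atlas, feed it the output of the sacctmgr command shown above.
# The guard skips this step on machines without Slurm installed.
if command -v sacctmgr >/dev/null 2>&1; then
    sacctmgr -Pns show user format=account where user="$USER" | pick_account
fi
```

If you belong to several project groups, you may prefer a different account than the first one listed; any of them should work for this tutorial.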

#### Launching JupyterLab
Click on <b> Interactive Apps </b>, then <b>Jupyter</b>. </br>
![interactive_session.png](imgs/interactive_session.png) </br>
Specify the following input values on the page, replacing the Account Name value with your project group name. You may also wish to change the number of hours based on how long you intend to work on this tutorial. </br>
- Python Version: 3.10.8
- Lab or Notebook: JupyterLab
- Working Directory: <b> path to desired project directory </b>
- Account Name: geospatialworkshop
- Partition Name: atlas
- QOS: ood – Max Time: 8-00:00:00
- Number of hours: 4
- Number of nodes: 1
- Number of tasks: 1
- Additional Slurm Parameters: --mem=32gb

Then click the `Launch` button at the bottom of the page. Once your session loads, click the `Connect to Jupyter` button.

Once the Jupyter session is launched, we will open up a terminal. Click the `+` button above the navigation pane on the left side of the screen.</br>
![plus_button.png](imgs/plus_button.png)</br>
Then click on the `Terminal` button. </br>
![open_terminal.png](imgs/open_terminal.png) </br>

<a id='var_setup'></a>
#### Setting Project Shell Variables
In the terminal, you will save your project directory path and project group name as shell variables.

First, decide on a project directory location. You may use your home directory, though you may quickly run out of space, so we recommend using the 90daydata directory instead. If you have space in a project directory that you would prefer to use over 90daydata, you may use that location instead.

Navigate to the directory in which you would like to create your project directory, then run the following commands. They create the project directory `rv_workbook`, store its path in the shell variable `project_dir`, and store your project group name in the shell variable `project_name`. If you are not a part of the "geospatialworkshop" project group, replace "geospatialworkshop" in the last command with the name of a project group that you are a part of. </br></br>
`mkdir rv_workbook` </br>
``project_dir=`pwd`/rv_workbook`` </br>
`project_name=geospatialworkshop` </br>
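
Before moving on, it can help to confirm that both variables are set as intended. The check below is a minimal sketch; `check_project_vars` is a hypothetical helper name, not a workshop script. Keep in mind that shell variables set this way only last for the current terminal session, so they need to be set again if you open a new terminal.

```shell
# Hypothetical helper: succeed only if project_name is non-empty
# and project_dir points at an existing directory.
check_project_vars() {
    if [ -z "${project_name:-}" ]; then
        echo "project_name is not set" >&2
        return 1
    fi
    if [ ! -d "${project_dir:-}" ]; then
        echo "project_dir is not an existing directory: ${project_dir:-}" >&2
        return 1
    fi
    echo "Using account '$project_name' in $project_dir"
}
```

Running `check_project_vars` in the same terminal prints the account and directory when both are valid, and an error message otherwise.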

#### Transferring Workshop Files to Project Directory
This workshop refers to files stored in the `/reference/workshops/rastervision` folder. We will only transfer some of the contents of `/reference/workshops/rastervision` to our project directory because some of the files are very large and can be referenced in-place.

Use the following commands to copy the reference files to your project directory. </br>
`cd $project_dir` </br>
`cp -r /reference/workshops/rastervision/model/ .` </br>
`cp /reference/workshops/rastervision/*.ipynb .` </br>
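
The copy step can also be wrapped in a function so it is easy to rerun and check. This is a sketch under stated assumptions: `copy_workshop_files` is a hypothetical function name, and it assumes the source folder contains a `model` directory and at least one `.ipynb` file, as the commands above imply.

```shell
# Hypothetical helper: copy the model directory and the notebooks
# from a source folder into a destination folder.
#   $1 = source folder (for example /reference/workshops/rastervision)
#   $2 = destination folder (for example $project_dir)
copy_workshop_files() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # The recursive flag is needed because model is a directory.
    cp -r "$src/model" "$dest/" || return 1
    cp "$src"/*.ipynb "$dest/" || return 1
    # List what arrived so a missing file is easy to spot.
    ls "$dest"
}
```

On Atlas, this would be called as `copy_workshop_files /reference/workshops/rastervision "$project_dir"`. Other files in the source folder are left in place.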

#### Creating the Kernel

Run these commands in the terminal to create the jupyter kernel: </br>
`source /project/geospatialworkshop/workshop_venv/bin/activate` </br>
`ipython kernel install --name "grwg_workshop" --user` </br>
`cp /project/geospatialworkshop/grwg_workshop.json ~/.local/share/jupyter/kernels/grwg_workshop/kernel.json` </br>
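
To confirm the kernel was created, you can check for the kernel spec file that the last command writes. The snippet below is a minimal sketch; `kernel_installed` is a hypothetical helper name, and it only checks the per-user kernels directory used in the commands above.

```shell
# Hypothetical helper: check that a kernel spec file exists for the
# given kernel name in the per-user Jupyter kernels directory.
kernel_installed() {
    [ -f "$HOME/.local/share/jupyter/kernels/$1/kernel.json" ]
}

if kernel_installed grwg_workshop; then
    echo "grwg_workshop kernel found"
else
    echo "grwg_workshop kernel not found; rerun the commands above"
fi
```

Running `jupyter kernelspec list` is another way to see which kernels Jupyter can find.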

#### Open Workbook

From the navigation pane on the left side of the screen, navigate to your `rv_workbook` directory.</br>
</br>
![open_workbook_directory.png](imgs/open_workbook_directory.png) </br>
Next, click on `Raster_Vision_workbook.ipynb` to launch the workbook.</br>
![open_workbook.png](imgs/open_workbook.png) </br>

Lastly, set the kernel by clicking on the `Kernel` tab, selecting `Change Kernel...`, and then selecting the `grwg_workshop` kernel. </br>
![change_kernel.png](imgs/change_kernel.png)