# Dealing with limited data for semantic segmentation
> Strategies for efficiently collecting more data to target the specific areas where a model underperforms, and techniques for getting the most out of the data you have



**Audience:** This post is geared towards intermediate users who are comfortable with basic machine learning concepts. 

**Time Estimated**: 60-120 min



## Setup Notebook

In [None]:
# install required libraries
!pip install -q rasterio
!pip install -q geopandas
!pip install git+https://github.com/tensorflow/examples.git
!pip install -U tfds-nightly
!pip install focal-loss
!pip install tensorflow-addons==0.8.3


### Getting set up with the data

Create a Drive shortcut to the tiled imagery in your own My Drive folder by right-clicking the shared folder `terrabio`. Once Drive is mounted with the google.colab `drive` module, the folder will be available at the following path: `'/content/gdrive/My Drive/servir-tf/terrabio/'`

We'll be working with the following folders in the `tiled` folder:
```
tiled/
├── images/
├── images_bright/
├── indices/
├── indices_800/
├── labels/
└── labels_800/
```

In [3]:
# mount google drive
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive
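
Once the drive is mounted, it can help to confirm that the shortcut resolved and that the subfolders listed above are visible. The following is a minimal sketch, not part of the original notebook: the helper name `find_missing_subdirs` and the constant `EXPECTED_SUBDIRS` are hypothetical, and the path assumes the shortcut location described above.

```python
import os

# Subfolder names copied from the folder listing above.
# EXPECTED_SUBDIRS and find_missing_subdirs are illustrative names, not part of the notebook.
EXPECTED_SUBDIRS = [
    "images",
    "images_bright",
    "indices",
    "indices_800",
    "labels",
    "labels_800",
]


def find_missing_subdirs(tiled_dir, expected=EXPECTED_SUBDIRS):
    """Return the expected subfolder names that do not exist under tiled_dir."""
    return [
        name for name in expected
        if not os.path.isdir(os.path.join(tiled_dir, name))
    ]


# Assumed location of the tiled data, based on the path given in the setup text.
tiled_dir = "/content/gdrive/My Drive/servir-tf/terrabio/tiled"
missing = find_missing_subdirs(tiled_dir)
if missing:
    print("Missing folders under", tiled_dir, ":", missing)
else:
    print("All expected folders found under", tiled_dir)
```

If any folders are reported missing, the most likely cause is that the shortcut was created somewhere other than the location the path assumes.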


In [4]:
# set your root directory and tiled data folders
import os

if 'google.colab' in str(get_ipython()):
    root_dir = '/content/gdrive/My Drive/servir-tf/terrabio/' 
    print('Running on CoLab')
else:
    root_dir = './data/' 
    print(f'Not running on CoLab, data needs to be downloaded locally at {os.path.abspath(root_dir)}')

img_dir = os.path.join(root_dir,'tiled/indices/') # or root_dir+'tiled/images_bright/' if using the optical tiles
label_dir = os.path.join(root_dir,'tiled/labels/')

Running on CoLab


In [5]:
# go to root directory
%cd $root_dir 

/content/gdrive/My Drive/servir-tf/terrabio


### Enabling GPU

This notebook can use a GPU, and it works better with one. The code below checks whether the current session has a GPU attached.

If no GPU is found, change your session/notebook to use one by following these [instructions](https://colab.research.google.com/notebooks/gpu.ipynb#scrollTo=sXnDmXR7RDr2).

In [None]:
# This is a Google Colab-specific command that ensures TF version 2 is used.
# It won't work in a regular Jupyter notebook; there, make sure TF version 2 is installed.
%tensorflow_version 2.x
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

Found GPU at: /device:GPU:0


### Check out the labels

In [27]:
# Read the classes
import pandas as pd

class_index = pd.read_csv(root_dir+'tiled/terrabio_classes.csv')
class_names = class_index.class_name.unique()
print(class_index) 


   class_id   class_name
0         0   Background
1         1     Bushland
2         2      Pasture
3         3        Roads
4         4        Cocoa
5         5   Tree cover
6         6    Developed
7         7        Water
8         8  Agriculture
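
With limited data, it is often useful to know how pixels are distributed across these classes. As a rough illustration of how the class ids in the table map back to names, the sketch below counts class occurrences in a flat list of ids. The names `ID_TO_NAME` and `summarize_label_ids` are hypothetical, and the example ids are made up for illustration rather than read from the label tiles.

```python
from collections import Counter

# Class names in class_id order, copied from the table printed above.
CLASS_NAMES = [
    "Background",
    "Bushland",
    "Pasture",
    "Roads",
    "Cocoa",
    "Tree cover",
    "Developed",
    "Water",
    "Agriculture",
]

# Map class_id -> class_name (illustrative helper, not part of the notebook).
ID_TO_NAME = dict(enumerate(CLASS_NAMES))


def summarize_label_ids(label_ids):
    """Count how often each class name appears in a flat iterable of class ids."""
    counts = Counter(label_ids)
    return {ID_TO_NAME[class_id]: n for class_id, n in sorted(counts.items())}


# Made-up ids standing in for the flattened pixels of a label tile.
example_ids = [0, 0, 4, 4, 4, 7]
print(summarize_label_ids(example_ids))
```

For a real label tile, the same function could be applied to the flattened array of pixel values, assuming the tiles store these class ids directly.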
