# Automating the Digitization of Drawn Figures on Maps

### Libraries/Dependencies used

**OpenCV** for color extraction and image detection <br>
**NumPy** for numeric and array manipulation <br>
**glob** for directory navigation <br>
**os** for environment variable use <br>
**azureml** for workspace connectivity and dataset creation and access

In [1]:
import numpy as np
import glob
import cv2
import os

### Subscription Information & Datastore Access

These values are required to connect the workspace to storage and to import the data.

For details on how to obtain them, see the [**README file**](./README.md).


In [2]:
#Access to subscription information
sub_id = os.getenv("SUBSCRIPTION_ID", default="<YOUR_SUBSCRIPTION_ID>")
rsc_group = os.getenv("RESOURCE_GROUP", default="<YOUR_RESOURCE_GROUP>")
ws_name = os.getenv("WORKSPACE_NAME", default="<YOUR_CURRENT_WORKSPACE>")
ws_region = os.getenv("WORKSPACE_REGION", default="eastus2")

#Access to storage information
azure_storage_account_name = "<YOUR_STORAGE_ACCOUNT_NAME>"
azure_storage_account_key = "<YOUR_STORAGE_ACCOUNT_KEY>"
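
The storage values above are hardcoded placeholders, while the subscription values are read from the environment. As a minimal sketch, the same `os.getenv` pattern could be applied to the storage settings. The helper `read_setting` and the environment variable names `AZURE_STORAGE_ACCOUNT_NAME` and `AZURE_STORAGE_ACCOUNT_KEY` are assumptions made for illustration; they are not part of the original notebook.

```python
import os

def read_setting(env_name, placeholder):
    """Hypothetical helper: read a setting from the environment.

    Returns (value, is_placeholder) so the caller can tell whether the
    fallback placeholder such as "<YOUR_...>" is still in use.
    """
    value = os.getenv(env_name, default=placeholder)
    is_placeholder = value.startswith("<") and value.endswith(">")
    return value, is_placeholder

# Environment variable names below are assumptions, not defined by the project.
azure_storage_account_name, name_missing = read_setting(
    "AZURE_STORAGE_ACCOUNT_NAME", "<YOUR_STORAGE_ACCOUNT_NAME>")
azure_storage_account_key, key_missing = read_setting(
    "AZURE_STORAGE_ACCOUNT_KEY", "<YOUR_STORAGE_ACCOUNT_KEY>")

if name_missing or key_missing:
    print("Storage settings are still placeholders; set them before connecting.")
```

Reading the key from the environment keeps it out of the notebook file itself, which matters if the notebook is committed to a repository.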

In [3]:
#Connecting to a functional workspace 
from azureml.core import Workspace
from azureml.core import Dataset, Datastore
from azureml.data.datapath import DataPath

try:
    ws = Workspace(subscription_id = sub_id, resource_group = rsc_group, workspace_name = ws_name)
    # write the details of the workspace to a configuration file to the notebook library
    ws.write_config()
    print("Workspace configuration succeeded. Skip the workspace creation steps below")
except Exception:
    # `ws` stays undefined on failure, so the datastore calls below will raise a NameError
    print("Workspace not accessible. Change your parameters or create a new workspace below")


# create file dataset from files in datastore
datastore = Datastore.get(ws, '<NAME_OF_THE_DATASTORE>')
datastore_path = DataPath(datastore)
file_dataset = Dataset.File.from_files(path=datastore_path)

Performing interactive authentication. Please follow the instructions on the terminal.
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code A8FQC8CVB to authenticate.
Interactive authentication successfully completed.
Workspace not accessible. Change your parameters or create a new workspace below


NameError: name 'ws' is not defined

## Downloading the Data into our Compute

This step is only needed when there is new data to process or the datastore has changed. It downloads the previously created [File Dataset](https://docs.microsoft.com/python/api/azureml-core/azureml.data.filedataset?view=azure-ml-py?WT.mc_id=mapdigitdemo-github-cxa) into our compute so that the files can be processed locally.

In [None]:
#Download the data into the AzureML compute instance.
file_list = file_dataset.download(target_path="<DATA_FOLDER>")

## Feature Extraction

For this project, we will be extracting drawings based on color using [**OpenCV**](https://docs.opencv.org/3.4/d6/d00/tutorial_py_root.html). <br>
OpenCV loads a color image as a *numpy* array of shape (height, width, 3), with the three channels stored in BGR order rather than RGB. <br>
Our goal is to extract the pixels that fall within a specified color range (written in BGR to match OpenCV) and use them to create a mask of the original image.
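
To make the range test concrete, here is a small NumPy-only sketch of what a per-pixel range check computes: a pixel is kept only if all three of its channels lie between the lower and upper bounds. The helper `in_range_mask` and the synthetic 2x2 image are assumptions for illustration; the notebook itself uses `cv2.inRange`, which produces the same kind of 0/255 mask.

```python
import numpy as np

def in_range_mask(image, lower, upper):
    """Illustrative stand-in for cv2.inRange: 255 where every channel of a
    pixel lies within [lower, upper], 0 elsewhere."""
    lower = np.asarray(lower, dtype=np.uint8)
    upper = np.asarray(upper, dtype=np.uint8)
    inside = np.all((image >= lower) & (image <= upper), axis=-1)
    return inside.astype(np.uint8) * 255

# Synthetic 2x2 image in BGR order: a reddish pixel, white, a bluish pixel, black.
image = np.array(
    [[[30, 30, 200], [255, 255, 255]],
     [[200, 60, 20], [0, 0, 0]]],
    dtype=np.uint8,
)

# Same "red" bounds the notebook uses, written as (B, G, R).
red_mask = in_range_mask(image, [17, 15, 100], [100, 106, 250])
print(red_mask)
# [[255   0]
#  [  0   0]]
```

Only the reddish pixel survives, because it is the only one whose blue, green, and red values all fall inside the bounds.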

In [None]:
#General-purpose function. Loops over our dataset, reading each image and creating a mask for each of them.
def color_masking():
    #Iterating over the test files; would iterate over the repo in the future
    for filename in glob.glob("./<DATA_FOLDER>/*.jpg"):
        image = cv2.imread(filename)
        #cv2.imread returns None when a file cannot be read, so skip those
        if image is None:
            continue
        create_mask(image, filename)


In [None]:
#This function creates the masks from a given set of color ranges. These were determined using a simple color picker on our sample images.
#FUTURE WORK: Discard the output image creation, use the final coordinates for cross-referencing and creation of GeoJSON objects.
def create_mask(image, filename):

    #Color boundaries for OpenCV's inRange function.
    #Each entry holds a lower and an upper limit written in BGR order (the reverse of RGB),
    #and together they define what we consider a color. In this case, color_boundaries[0] = red, color_boundaries[1] = blue and color_boundaries[2] = green.
    color_boundaries = [
        ("red", [17, 15, 100], [100, 106, 250]),
        ("blue", [86, 31, 4], [220, 88, 50]),
        ("green", [57, 64, 36], [105, 125, 58])
    ]

    for (color, lower, upper) in color_boundaries:
        lower = np.array(lower, dtype = "uint8")
        upper = np.array(upper, dtype = "uint8")

        mask = cv2.inRange(image, lower, upper)
        #Extract pixel coordinates from our mask as (row, column) pairs.
        result = np.where(mask != 0)
        coordinate_list = list(zip(result[0], result[1]))

        #Mask images called for output visualization. To be discarded for cross-referencing and GeoJSON object creation in future iterations.
        mask_images(image, mask, color, filename)
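
The coordinate extraction in `create_mask` relies on `np.where`, which returns row indices first and column indices second. The sketch below, using a small synthetic mask (an assumption for illustration), shows the resulting (row, column) pairs and how they could be reordered into (x, y) points, which is the order most drawing and mapping tools expect.

```python
import numpy as np

# Synthetic 3x4 mask with two "on" pixels.
mask = np.zeros((3, 4), dtype=np.uint8)
mask[1, 2] = 255
mask[2, 0] = 255

# np.where returns (row_indices, column_indices) in row-major order.
rows, cols = np.where(mask != 0)
row_col_pairs = list(zip(rows.tolist(), cols.tolist()))
print(row_col_pairs)  # [(1, 2), (2, 0)]

# Reorder to (x, y): x is the column, y is the row.
xy_points = np.column_stack((cols, rows))
print(xy_points.tolist())  # [[2, 1], [0, 2]]
```

Keeping the row/column versus x/y distinction explicit avoids transposed shapes once the points are mapped to geographic coordinates.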
    

In [None]:
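#Output visualization function. Currently used to showcase the created masks.
def mask_images(image, mask, color, filename):

    #Keep only the pixels of the original image that fall inside the mask
    result_img = cv2.bitwise_and(image, image, mask = mask)

    #cv2.imwrite needs both the target path and the image, and does not create missing folders
    os.makedirs("./output", exist_ok=True)
    cv2.imwrite("./output/" + color + "_" + os.path.basename(filename), result_img)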

## Creating the Color-Masked Images

This cell is a placeholder output. For now, it writes each mask to disk as an image.

In [None]:
color_masking()

## Work in Progress:  Coordinate Referencing & Mask Clean Up

The project is nearing completion. What is still missing is the coordinate transformation, from pixel coordinates to GPS coordinates, within the actual range of our map's coordinates.
The step after that is processing our masks, mainly to eliminate noise (drawings outside the map). <br>
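
One way the pixel-to-GPS transformation could work is plain linear interpolation between the map's known corner coordinates. The sketch below is only an illustration under strong assumptions: the image is north-up and unrotated, it covers exactly the given bounding box, and the area is small enough that longitude and latitude vary linearly with pixel position. The function `pixel_to_lonlat`, the image size, and the bounds are all hypothetical and not taken from the project.

```python
def pixel_to_lonlat(row, col, image_shape, bounds):
    """Hypothetical helper: map a pixel centre to (lon, lat).

    image_shape: (height, width, ...) as in a NumPy image array.
    bounds: (west, south, east, north) in degrees, covering the whole image.
    """
    height, width = image_shape[:2]
    west, south, east, north = bounds
    # Columns grow eastward; rows grow southward, so latitude decreases with row.
    lon = west + (col + 0.5) / width * (east - west)
    lat = north - (row + 0.5) / height * (north - south)
    return lon, lat

# Made-up image size and bounding box, for illustration only.
image_shape = (100, 200, 3)
bounds = (-67.0, 18.0, -66.0, 18.5)

pixels = [(0, 0), (99, 199)]  # (row, col) pairs, e.g. from a mask
coordinates = [list(pixel_to_lonlat(r, c, image_shape, bounds)) for r, c in pixels]

# GeoJSON stores positions as [longitude, latitude].
feature = {
    "type": "Feature",
    "geometry": {"type": "MultiPoint", "coordinates": coordinates},
    "properties": {"color": "red"},
}
```

A scanned or photographed map that is rotated or skewed would need a proper georeferencing step instead of this linear mapping.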

Next Steps:
- Accurately match the masks' features with referenced spatial coordinates ([Azure Maps Image Layering](https://docs.microsoft.com/javascript/api/azure-maps-control/atlas.layer.imagelayer?view=azure-maps-typescript-latest&viewFallbackFrom=azure-iot-typescript-latest?WT.mc_id=mapdigitdemo-github-cxa))
- Clean up of noise (marking the valid area and eliminating pixels outside of it)
- Possible: Implementation of a Custom Vision model to differentiate between shapes and their meanings (crosses from polygons)
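
The noise clean-up step listed above could start as something as simple as discarding every mask pixel outside a rectangle that marks the valid map area. This is a minimal sketch, assuming the valid area is an axis-aligned rectangle given in pixel coordinates; the helper `keep_valid_area` and the rectangle values are hypothetical.

```python
import numpy as np

def keep_valid_area(mask, top, left, bottom, right):
    """Hypothetical helper: zero out mask pixels outside the rectangle
    rows [top, bottom) and columns [left, right)."""
    cleaned = np.zeros_like(mask)
    cleaned[top:bottom, left:right] = mask[top:bottom, left:right]
    return cleaned

# Synthetic 4x4 mask where every pixel is "on".
mask = np.full((4, 4), 255, dtype=np.uint8)

# Keep only the central 2x2 block; everything else is treated as noise.
cleaned = keep_valid_area(mask, top=1, left=1, bottom=3, right=3)
print(np.count_nonzero(cleaned))  # 4
```

For a map outline that is not rectangular, the same idea applies with a polygon-shaped validity mask combined with the color mask.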

LinkedIn: https://www.linkedin.com/in/gcordidoa/