# Workforce Automation

<br>
<br>
![alt text](http://sandbox.maps.arcgis.com/sharing/rest/content/items/9e79e17002904ed3b47d7247d389deda/data "Glacier National Park")
<br>
<br>
### Overview
<br>
In this tutorial, we use Python and the ArcGIS API for Python to automate a core workflow in Workforce for ArcGIS. Specifically, we import a small workforce from a CSV and spatially assign its members assets to monitor. With a large workforce, coordination can be difficult and time-consuming. Automating a major step of the process saves time and increases overall productivity.
<br>
Glacier National Park is a US National Park that spans more than one million acres. It is managed by the National Park Service and receives over two million visitors annually. Given the sheer size of the park and its number of visitors, there is a responsibility to maintain it to a high standard. With limited resources, effectively managing the park's representatives becomes essential. As part of this, preserving and protecting water quality is a fundamental mission of the National Park Service. In this example, a number of mock water sampling sites have been created across the park. Our objective is to efficiently assign these sites to a mock workforce, which will sample them over a summer to record the water quality in the park. With Workforce for ArcGIS, this task is made easy.
<br>
<br>
Even though Glacier National Park is used as a test case to demonstrate the workflow, this script is easily adaptable to other large or small datasets. See the Input section below for more information on the datasets needed to run the script successfully. Once you feel comfortable with the Jupyter Notebook, feel free to adapt it to your data and more specialized workflows.

<br>
Before diving in, make sure that the modules are installed by running the first cell. (The shortcut to run cells in Jupyter Notebooks is <b>Shift</b> + <b>Enter</b>.) Afterward, step through the rest of the tutorial.
<br>
<br>
##### Input
* Empty Workforce Project
* Two CSV files (Add_Worker.csv and Worker_Sections.csv)
* A feature service for assets
* A 5km by 5km grid (hex, fishnet, or some other form) that covers the entirety of the region (see http://pro.arcgis.com/en/pro-app/tool-reference/cartography/grid-index-features.htm if you do not have a grid for your region)

<br>
<br>
<i> Test data is available via the GitHub Repository and on ArcGIS Online</i>
<br>
<br>

##### Output
* Assignment Layer available in Workforce


<br>
##### General Step Through
* Import Python Modules
* Authentication and accessing an existing project
* Add a new assignment type
* Convert and preview the Add_Worker.csv
* Adds workers from the dataframe to the workforce project
* Filtering workers based on responsibilities
* Access the Glacier National Park feature layer
* Access the individual layer by selecting the first layer
* Add the water sampling sites layer to the workforce web maps
* Convert the grid feature service to a dataframe
* Preview section assignment csv
* Creating the dictionary with workers assigned to grid cells
* Create an empty dataframe
* Add records to Empty dataframe
* Spatially creates assignments based on worker's grid cells

<br>
<br>
<br>
<i> Note - This tutorial is intended to give you a better sense of how Python can be integrated into existing applications to improve workflows. Python, when combined with GIS, is a powerful tool that has a plethora of applications. Please review, adapt, and build upon things in this code to suit your specific needs.
</i>
<br>
<br>
<br>

## Let's Start!

<br>
#### Import Python modules
<br>
The first part of every Python script imports the tools the script needs. The modules imported below are each needed by some portion of the code. If the cell block fails, something may not be set up correctly in the environment. Refer to the Environment Setup document if you're experiencing difficulties at this step.

In [2]:
import arcgis
from arcgis.gis import GIS
from arcgis.apps import workforce
from arcgis.geometry import Geometry
import pandas as pd
import getpass
import datetime
import random
import sys
from matplotlib.pyplot import figure
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
import numpy as np

#### Authentication and accessing an existing project
<br>
Create a blank Workforce project online, but do not alter anything in the project. Retrieve the ID of the project, and then copy and paste the ID in the <b> empty_project </b> variable. You will be prompted to enter a username and password, so enter your ArcGIS Online credentials for authentication.

In [None]:
# Specify Username
user_name = input("Enter a user name for ArcGIS Online Account... ")

# Specify password
print("Enter your password... ")
password = getpass.getpass()

# Connect to ArcGIS Online with credentials (username, password)
gis = GIS('https://www.arcgis.com/',user_name,password)

# Retrieve the workforce project/Copy and paste an ID for a blank Workforce project
empty_project = gis.content.get("72d75221341e4d94aff44271489f7483")

# Instantiate the project
project = workforce.Project(empty_project)

#### Add a new assignment type
<br>
In Workforce for ArcGIS, types of assignments need to be created in order to instruct your workforce about what tasks need to be completed. In this cell block, an assignment type is specified and added to the project.

In [None]:
# Asset name
asset_name = "Tourist Amenity"

# Adds assignment type to the workforce project
asset_type = project.assignment_types.add(name=asset_name)

#### Convert and preview the Add_Worker.csv
<br>
In the code block below, the <b> Add_Worker.csv </b> is converted to a DataFrame using the Pandas module. The <b> Add_Worker.csv </b> contains information on workers and dispatchers. This information is used to add each worker and dispatcher to a Workforce project, and then to record their responsibilities. If more columns are added representing more asset types, the code in the following cells will account for them. More workers and dispatchers can also be added to the CSV.

<br>
<br>
The structure of the CSV is important because that is what the code is reading. Make sure to alter the code if the structure of your CSV is different.

In [3]:
# Convert the CSV to a dataframe (forward slashes work on Windows, macOS, and Linux)
workforce_df = pd.read_csv("Data/Add_Worker.csv")

# Preview the dataframe
workforce_df

Unnamed: 0,name,user_id,status,title,role,Asset_1,Asset_2,Asset_3,Asset_4,Asset_5,Asset_6,Asset_7,Asset_8,Asset_9,Asset_10,Asset_11
0,demo user,xx_python_1,not_working,Park Ranger,worker,Tourist Amenity,Road,Sign,Vehicles,Culvert,Gate,Fence Post,,,,
1,Joseph Munyao,JMunyao_Sandbox,not_working,Senior Park Ecologist,worker,Culvert,Bridge,Sign,Power Generator,Tourist Amenity,Road,Restroom,,,,
2,Jennifer Laws,JLaws_Sandbox,not_working,Ecologist,worker,Vehicles,Tourist Amenity,Road,Pavement,Sign,Power Generator,,,,,
3,Aaron Pulver,aaron_nitro,,Park Manager,dispatcher,,,,,,,,,,,
4,Alex NoHe,anohe_Nitro,not_working,GIS Analyst,worker,Bridge,Sign,Gate,Fence Post,Tourist Amenity,Road,,,,,
5,Jame McManus,james_Nitro,not_working,Seasonal Park Ranger,worker,Solar Panel,Culvert,Gate,Tourist Amenity,Park Bench,,,,,,
6,Joel Whitney,joel_Nitro,not_working,Park Ranger,worker,Utility Pole,Fence Post,Bridge,Well,Gate,Overlook,Tourist Amenity,,,,
7,Jennifer Vaughan-Gibson,JVaughanGibson_HSE,,Director of Field Operations,dispatcher,,,,,,,,,,,
8,Coleman Shepard,Cshepard_Sandbox,not_working,Intern,worker,Power Generator,Road,Vehicle,Solar Panel,Culvert,Coffee Maker,Restroom,,,,
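
Because the later cells read specific columns by name, it can help to check the CSV structure before adding anyone to the project. The following is a minimal sketch and not part of the original notebook: the helper `missing_columns` and the inline sample data are hypothetical, and the required column names are taken from the preview above.

```python
import io
import pandas as pd

# Columns that the later cells read by name (taken from the preview above)
REQUIRED_COLUMNS = ["name", "user_id", "status", "title", "role"]

def missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from df (hypothetical helper)."""
    return [col for col in required if col not in df.columns]

# Small inline sample standing in for Add_Worker.csv (illustrative data only)
sample_csv = io.StringIO(
    "name,user_id,status,title,role,Asset_1\n"
    "demo user,xx_python_1,not_working,Park Ranger,worker,Tourist Amenity\n"
)
sample_df = pd.read_csv(sample_csv)

# An empty list means every required column is present
problems = missing_columns(sample_df)

# Every row should be either a worker or a dispatcher
roles_ok = sample_df["role"].isin(["worker", "dispatcher"]).all()

print("Missing columns:", problems)
print("All roles valid:", roles_ok)
```

If `problems` is not empty, adjust either the CSV or the column names used in the following cells before continuing.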


#### Adds workers from the dataframe to the workforce project
<br>
This cell block adds workers to a Workforce project from the dataframe specified in the code cell above. The cell block is able to add both dispatchers and workers. It uses the field names: name, user_id, status, title, and role.
<br>
<br>
<i> If there are already users with the same user_id in the workforce project, then this code block will fail. </i>

In [None]:
# Blank list to add workers to 
workerList = []

# Loop through the dataframe
for index, row in workforce_df.iterrows():
    
    # Checks the role of the person (worker/dispatcher)
    if row["role"] == "worker":
        
        # Add worker to the workforce project
        project.workers.add(name=row["name"],
                        user_id=row["user_id"],
                        title = row["title"],
                        status=row["status"])
        
        # If count of the worker in the list is less than one, append it to the list
        if workerList.count(row["user_id"]) < 1:
            workerList.append(row["user_id"])
    
    # Checks if role is a dispatcher
    elif row["role"] == "dispatcher":
        
        # Add dispatcher to the workforce
        project.dispatchers.add(name=row["name"],
                user_id = row["user_id"])
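
Since the cell above fails when a user_id is already present, one option is to separate new rows from colliding rows before calling the project. This is only a sketch under stated assumptions: `split_new_and_existing` is a hypothetical helper, the rows are plain dictionaries standing in for dataframe rows, and `existing_ids` is assumed to be a set of user IDs you have already collected from the project (how to collect them is not shown here).

```python
def split_new_and_existing(rows, existing_ids):
    """Separate rows whose user_id is new from rows that would collide.

    rows: list of dicts with a "user_id" key (stand-in for dataframe rows)
    existing_ids: user IDs assumed to already be in the project
    """
    seen = set(existing_ids)
    new_rows, skipped = [], []
    for row in rows:
        if row["user_id"] in seen:
            skipped.append(row)
        else:
            new_rows.append(row)
            # Remember the ID so duplicates inside the CSV are caught as well
            seen.add(row["user_id"])
    return new_rows, skipped

# Illustrative rows; the third one repeats the first user_id
rows = [
    {"name": "demo user", "user_id": "xx_python_1", "role": "worker"},
    {"name": "Joseph Munyao", "user_id": "JMunyao_Sandbox", "role": "worker"},
    {"name": "demo user (again)", "user_id": "xx_python_1", "role": "worker"},
]

new_rows, skipped = split_new_and_existing(rows, existing_ids={"JMunyao_Sandbox"})
print([r["user_id"] for r in new_rows])  # ['xx_python_1']
print([r["name"] for r in skipped])      # ['Joseph Munyao', 'demo user (again)']
```

Only the rows in `new_rows` would then be passed to the add calls, while `skipped` can be reviewed by hand.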

#### Filtering workers based on responsibilities
<br>
Since we are only evaluating one asset type at the moment, only workers responsible for this asset type should be assigned to maintain it. To accomplish this, we alter the <b>workerList</b> variable from above. In the code block below, we loop through each record (worker) and then through the asset types that worker is responsible for. Each time an asset type in the <b> workforce_df </b> dataframe matches the asset we specified in the <b> Add a new assignment type </b> block, a counter is incremented. If the counter variable, worker_asset_count, is less than one, the worker is removed from workerList.

In [None]:
# Loop through each record (worker or dispatcher) in the dataframe
for index, row in workforce_df.iterrows():

    # Dispatchers were never added to workerList, so skip them
    if row["role"] != "worker":
        continue

    # Counter variable, reset for every worker
    worker_asset_count = 0

    # Loop through the row horizontally based on the column headers
    for i in list(workforce_df)[5:]:

        # Check if the cell value equals the asset that is being monitored
        if row[i] == asset_name:

            # Counter variable to keep track if the conditional above is triggered
            worker_asset_count += 1

    # A count below one means that the worker is not responsible for that asset
    if worker_asset_count < 1 and row["user_id"] in workerList:

        # Remove the worker if the counter is less than one
        workerList.remove(row["user_id"])
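
The same filter can also be expressed without explicit loops. The sketch below is an alternative and not the notebook's own method: it uses a small illustrative dataframe (`demo_df`) with the same column layout, and it assumes that every asset column name starts with `Asset_`, as in the preview of Add_Worker.csv.

```python
import pandas as pd

# Illustrative stand-in for workforce_df (same column layout, fewer rows)
demo_df = pd.DataFrame({
    "user_id": ["u1", "u2", "u3"],
    "role": ["worker", "worker", "dispatcher"],
    "Asset_1": ["Tourist Amenity", "Road", None],
    "Asset_2": ["Sign", "Bridge", None],
})
target_asset = "Tourist Amenity"

# Select every asset column by its prefix (assumed naming convention)
asset_columns = [col for col in demo_df.columns if col.startswith("Asset_")]

# True for rows where any asset column matches the target asset
is_responsible = demo_df[asset_columns].eq(target_asset).any(axis=1)

# Keep only workers (not dispatchers) who are responsible for the asset
is_worker = demo_df["role"] == "worker"
responsible_ids = demo_df.loc[is_worker & is_responsible, "user_id"].tolist()

print(responsible_ids)  # ['u1']
```

Selecting columns by prefix keeps the filter working when more asset columns are added to the CSV.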

#### Access the Glacier National Park feature layer
<br>
Using the ArcGIS API for Python, we can easily access content from ArcGIS Online. This web layer contains three layers: the water sampling sites, the park boundary, and a 5km by 5km grid representing the different sections of the park. In the following blocks of code, we access this feature layer and select individual layers.

In [None]:
# Gets GIS Content for ArcGIS Online (Feature Service to maintain)
web_layer = gis.content.get("255cf85b55d940d093b00ac7647ae29f")

# Preview web_layer
web_layer

#### Access the water sampling sites layer by selecting the first layer
<br>
The water sampling sites layer is accessed from the web layer on ArcGIS Online.

In [None]:
# Assign the correct layer in the web layer (Water Quality Sampling Points)
asset_item = web_layer.layers[0]

#### Add the water sampling sites layer to the workforce web maps
<br>
Add the water sampling sites layer to the project map so the sites are easily visible to workers and dispatchers when accessing the application. Updating the sharing property is not necessary if the layer is already public in this case. 

In [None]:
# Add Layer to the dispatcher map & update it
project.dispatcher_webmap.add_layer(asset_item)
project.dispatcher_webmap.update({})

# Add Layer to the worker map & update it
project.worker_webmap.add_layer(asset_item)
project.worker_webmap.update({})

# Share the whole web_layer 
web_layer.share(groups=[project.group])

#### Convert the grid feature service to a dataframe
<br>
From the <b> web_layer </b> variable, access the Glacier National Park Grid layer. Convert it to a spatial dataframe, add the layer to the worker and dispatcher project maps, and then preview the dataframe. A spatial dataframe makes it easy to run analysis using the ArcGIS API for Python.

In [None]:
# Convert Grid to dataframe
grids_df = web_layer.layers[2].query().df

# Add Layer to the dispatcher map & update it
project.dispatcher_webmap.add_layer(web_layer.layers[2])
project.dispatcher_webmap.update({})

# Add Layer to the worker map & update it
project.worker_webmap.add_layer(web_layer.layers[2])
project.worker_webmap.update({})

# Preview dataframe
grids_df.head()

#### Preview section assignment csv
<br>
Load in the CSV that has cell assignments for each worker. This CSV has a field representing each worker. A 1 or 0 is used to determine if the worker operates in that grid cell. 1 represents that the worker is responsible for that specific grid cell, and 0 represents a grid cell that the worker is not responsible for. In the following cells, this will allow us to associate grid cell assignments with each asset. If people operate as teams and will be assigned to all of the same tasks, only list one person in the table.


In [4]:
# Load in the CSV as a dataframe (a different CSV than above)
worker_section = pd.read_csv("Data/Worker_Sections.csv")

# Preview the data by using the DataFrame.head() method in pandas
worker_section.head()

Unnamed: 0,section,demo user,Joseph Munyao,Jennifer Laws,Alex NoHe,Joel Whitney,Jame McManus,Coleman Shepard
0,1,0,0,0,0,0,0,0
1,2,0,0,0,0,0,0,0
2,3,0,0,0,0,0,0,0
3,4,0,0,0,0,0,0,0
4,5,0,0,0,0,0,0,0


#### Creating the dictionary with workers assigned to grid cells
<br>
Create a dictionary that consists of grid cells as keys with a list of the workers that operate in that cell. 
<br>
###### Example Format for the section_lookup dictionary
<br>
{<br>
277: [worker 1, worker 2], <br>
297: [worker 1, worker 3, worker 4] <br>
}

In [None]:
# Empty dictionary to add the workers along with the sections they work in. 
section_lookup = {}

# Loops through workerList, which is a list of user_ids (see the section that adds workers from the dataframe to the workforce project)
for i in workerList:

    # Retrieve the worker object from the project by its user_id
    worker = project.workers.get(user_id = i)
    
    # Loop through each row (grid section) of the section assignment dataframe
    for index, row in worker_section.iterrows():
        
        # Retrieve the 1/0 flag stored in the column named after the worker
        works_in_section = row[worker.name]
        
        # 1 is used to determine if a person works in a cell 
        if works_in_section == 1:
            section = row["section"]
            
            # Checks if the section is in the dictionary already
            if section in section_lookup:
                
                # If it is, the worker is appended to that section's list, for example --> 277: [worker 1, worker 2]
                section_lookup[section].append(worker)
            
            else:
                # If an entry doesn't exist, a new entry is created for the section
                section_lookup[section] = [worker]

# Preview the dictionary                
section_lookup
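
The structure of the lookup can be tried out without a Workforce project. The sketch below uses plain dictionaries in place of the dataframe rows and worker names in place of worker objects; the rows, names, and the `active_workers` list are illustrative only.

```python
from collections import defaultdict

# Illustrative rows shaped like Worker_Sections.csv (one column per worker)
rows = [
    {"section": 277, "worker 1": 1, "worker 2": 1, "worker 3": 0},
    {"section": 297, "worker 1": 1, "worker 2": 0, "worker 3": 1},
    {"section": 300, "worker 1": 0, "worker 2": 0, "worker 3": 0},
]

# Stand-in for workerList after filtering: worker 2 has been removed
active_workers = ["worker 1", "worker 3"]

# defaultdict creates the empty list the first time a section is seen
lookup = defaultdict(list)
for row in rows:
    for worker_name in active_workers:
        # 1 means the worker operates in this grid cell
        if row[worker_name] == 1:
            lookup[row["section"]].append(worker_name)

# Convert back to a regular dictionary for display
lookup = dict(lookup)
print(lookup)  # {277: ['worker 1'], 297: ['worker 1', 'worker 3']}
```

Sections with no active worker, such as 300 here, never appear as keys, which is why the later cells check `if section in section_lookup` before using a section.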

#### Create an empty dataframe
<br>
Create an empty dataframe to populate with information about which grid cell contains each asset.


In [None]:
# Create columns/fields for the DataFrame
columns = ["OBJECTID", "section", "x", "y","cluster_id"]

# Create the dataframe and add the columns
df_new = pd.DataFrame(columns = columns)

# Display the blank dataframe
df_new

#### Add records to the empty dataframe
<br>
This cell block adds rows to the new dataframe for assets that intersect with grid cells. 

In [None]:
# Dispatcher to create assignments
dispatcher = project.dispatchers.get(user_id=gis.users.me.username)

# Query all of the assets
asset_features = asset_item.query().features

# Loop through the assets that are being assigned
for asset in asset_features:
    
    # Determine which grid (if any) the asset is in
    contains_df = grids_df.contains(Geometry(asset.geometry))
    
    # Determine the grid 
    container = grids_df[contains_df]
    
    # If container has assets contained within it
    if not container.empty: 
        
        # Assign a section
        section = container['SECTION'].iloc[0]
        
        # Determines if the section is in the section_lookup dictionary
        if section in section_lookup:
            
            # Create a new row
            new_row = [str(asset.attributes["OBJECTID"]),str(section),asset.geometry["x"],asset.geometry["y"],0]
            
            # Add the new row to the dataframe
            df_new.loc[len(df_new)] = new_row
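
To make the containment step easier to picture, here is a simplified sketch in plain Python. It assumes axis-aligned square cells described by their extents, which is a simplification: the notebook relies on the spatial dataframe's own containment test, which also handles hexagons and other shapes. The `find_section` helper, the cell extents, and the asset coordinates are all hypothetical.

```python
def find_section(x, y, cells):
    """Return the section whose extent contains (x, y), or None if no cell does."""
    for cell in cells:
        if cell["xmin"] <= x < cell["xmax"] and cell["ymin"] <= y < cell["ymax"]:
            return cell["section"]
    return None

# Two adjacent 5 km by 5 km cells in projected coordinates (metres); illustrative values
cells = [
    {"section": 1, "xmin": 0, "ymin": 0, "xmax": 5000, "ymax": 5000},
    {"section": 2, "xmin": 5000, "ymin": 0, "xmax": 10000, "ymax": 5000},
]

assets = [
    {"OBJECTID": 10, "x": 1200.0, "y": 4300.0},
    {"OBJECTID": 11, "x": 7500.0, "y": 100.0},
    {"OBJECTID": 12, "x": 20000.0, "y": 100.0},  # lies outside the grid
]

# Build rows in the same column order as the empty dataframe:
# OBJECTID, section, x, y, cluster_id
records = []
for asset in assets:
    section = find_section(asset["x"], asset["y"], cells)
    if section is not None:
        records.append([str(asset["OBJECTID"]), str(section), asset["x"], asset["y"], 0])

print(records)
# [['10', '1', 1200.0, 4300.0, 0], ['11', '2', 7500.0, 100.0, 0]]
```

Assets that fall outside every cell are simply skipped, mirroring the `if not container.empty` check in the cell above.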

#### Spatially Creates Assignments based on Worker's Grid Cells
<br>
This block of code takes the section_lookup dictionary, in which workers are assigned to grid cells. Workers assigned to a grid cell are automatically assigned to all the assets in that grid cell. If multiple people are assigned to a grid cell and type of asset, the assets are divided among them using K-Means, a machine learning clustering method, or at random if the number of assets in the cell does not exceed the <b>cluster_number</b> variable. There are more possibilities to efficiently allocate points based on variables such as schedule, distance to assets, the urgency of certain assets, etc. Feel free to adapt this code to best suit your workflow.
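
Before running the full cell, the clustering idea can be tried on a handful of points. The following is a minimal sketch with made-up coordinates and placeholder worker names. It creates one cluster per worker, as the cell below does, and additionally sets `n_init` and `random_state` so that repeated runs give the same grouping.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated groups of illustrative x/y coordinates
points = np.array([
    [0.0, 0.0], [1.0, 0.5], [0.5, 1.0],
    [100.0, 100.0], [101.0, 100.5], [100.5, 101.0],
])

# Placeholder names standing in for the workers of one grid cell
workers = ["worker 1", "worker 2"]

# One cluster per worker; random_state makes the run repeatable
model = KMeans(n_clusters=len(workers), init="k-means++", n_init=10, random_state=0)
labels = model.fit_predict(points)

# Map each cluster label to a worker, then each point to its worker
worker_for_label = {label: workers[idx] for idx, label in enumerate(sorted(set(labels)))}
assigned = [worker_for_label[label] for label in labels]

print(assigned)
```

Points that are close together end up with the same worker, which keeps each person's sites geographically compact.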

In [None]:
# List of inspections to add workforce assignments to
inspections = []

# Number of assets per grid cell to cluster on
cluster_number = 8

# Loop through the unique sections in the dataframe
for i in df_new.section.unique():
    
    # Create a subset dataframe for each section 
    kmeans_df = df_new.loc[df_new['section'] == i]
    
    # If more than 1 person is assigned to a grid section and it holds more assets than cluster_number, assets are split with K-Means
    if len(section_lookup[int(i)]) > 1 and len(kmeans_df) > cluster_number:
        
        # Retrieving the x/y values
        f1 = kmeans_df['x'].values
        f2 = kmeans_df['y'].values
        
        # Find how many clusters there are by looking at how many workers are assigned to the cell
        number_clusters = len(section_lookup[int(i)])

        # Creates an array with the x/y values.
        X = np.array(list(zip(f1, f2)))
        
        # Runs K-Means from Scikit-Learn with the number of clusters specified and k-means++ chosen for centroid placement
        kmeans = KMeans(n_clusters=number_clusters, init="k-means++")
        
        # Fitting the input data
        kmeans = kmeans.fit(X)
        
        # More information on K-Means can be found here http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
        
        # Prediction/labels
        labels = kmeans.predict(X)
        
        # Find out unique cluster_ids
        cluster_ids = list(set(labels))
        
        # Centroid values
        centroids = kmeans.cluster_centers_
        
        # Worker and Section Values
        section = int(i)
        workers = section_lookup[section]
        
        # ----------------------------------------------------------------------------------------------------
        # If the number of centroids/clusters and workers are the same then this conditional is triggered
        if len(centroids) == len(workers):
            
            # Loop through the cluster_ids list by index (0,1,2,3,etc)
            for h in range(len(cluster_ids)):
                
                # Retrieve a single worker 
                worker = workers[h]
                
                # Return a single cluster
                cluster = cluster_ids[h]
                
                # Now, loop through the cluster label of every asset in the section
                for k in range(len(labels)):
                    
                    # Retrieving the x/y values
                    x = f1[k]
                    y = f2[k]

                    if cluster == labels[k]:
                        status = "assigned"
                        print(project, {'x': x, 'y': y}, section, status, asset_type, worker, dispatcher)
                        #---------------------------------------------------------------------------------------
                        # Create the assignment and add all of the values to the inspection list
                        inspections.append(workforce.Assignment(
                            
                            # Workforce project
                            project,
                            # Geometry of the asset
                            geometry = {'x': x, 'y': y},
                            # Grid cell that the asset falls into
                            location = str(section),
                #             location = asset.attributes['LOCDESC'] or f"{asset.geometry['x']}, {asset.geometry['y']}",
                            status=status,
                            assignment_type=asset_type,
                            # Worker that the asset is assigned to
                            worker = worker,
                            # Dispatcher who assigned the asset
                            dispatcher = dispatcher,
                            # Priority is set to high, but change that as need be (Low, Medium, High, Critical)
                            priority="high",
                            # Due date is set to three days, but this can change
                            due_date=datetime.datetime.now() + datetime.timedelta(days=3),
                            assigned_date=datetime.datetime.now()
                        ))

    # If the clustering conditional statement isn't met above, then this is triggered
    elif len(section_lookup[int(i)]) > 1 and len(kmeans_df) <= cluster_number:
        
        section = int(i)
        
        # Retrieving the x/y values
        f1 = kmeans_df['x'].values
        f2 = kmeans_df['y'].values
        
        # Status of assignment
        status = "assigned"
        
        # Loops through the assets in the kmeans_df DataFrame
        for b in range(len(kmeans_df)):
            
            # Retrieve x/y value for each asset
            x = f1[b]
            y = f2[b]
            
            # Randomly select a worker, since the number of assets is at or below the threshold specified
            worker = random.choice(section_lookup[section])
            
            # Create the assignment and add all of the values to the inspection list
            inspections.append(workforce.Assignment(
                                # Workforce project
                                project,
                                # Geometry of the asset
                                geometry = {'x': x, 'y': y},
                                # Grid cell that the asset falls into
                                location = str(section),
                    #             location = asset.attributes['LOCDESC'] or f"{asset.geometry['x']}, {asset.geometry['y']}",
                                status=status,
                                assignment_type=asset_type,
                                # Worker that the asset is assigned to
                                worker = worker,
                                # Dispatcher who assigned the asset
                                dispatcher = dispatcher,
                                # Priority is set to high, but change that as need be (Low, Medium, High, Critical)
                                priority="high",
                                # Due date is set to three days, but this can change
                                due_date=datetime.datetime.now() + datetime.timedelta(days=3),
                                assigned_date=datetime.datetime.now()
                            ))

    # If there is only one worker per grid cell then conditional is triggered
    elif len(section_lookup[int(i)]) == 1:
        
        section = int(i)
        
        # This will be a single worker
        worker = section_lookup[section][0]
        
        # Retrieving the x/y values
        f1 = kmeans_df['x'].values
        f2 = kmeans_df['y'].values
        
        # Status of assignment
        status = "assigned"
        
        for b in range(len(kmeans_df)):
            
            # Retrieve x/y value for each asset
            x = f1[b]
            y = f2[b]
            
            # Create the assignment and add all of the values to the inspection list
            inspections.append(workforce.Assignment(
                                # Workforce project
                                project,
                                # Geometry of the asset
                                geometry = {'x': x, 'y': y},
                                # Grid cell that the asset falls into
                                location = str(section),
                    #             location = asset.attributes['LOCDESC'] or f"{asset.geometry['x']}, {asset.geometry['y']}",
                                status=status,
                                assignment_type=asset_type,
                                # Worker that the asset is assigned to
                                worker = worker,
                                # Dispatcher who assigned the asset
                                dispatcher = dispatcher,
                                # Priority is set to high, but change that as need be (Low, Medium, High, Critical)
                                priority="high",
                                # Due date is set to three days, but this can change
                                due_date=datetime.datetime.now() + datetime.timedelta(days=3),
                                assigned_date=datetime.datetime.now()
                            ))


# Batch add the assignments and suppress the output printed by the function
import contextlib
import os

# Redirect stdout to the null device; the file is closed automatically afterward
with open(os.devnull, 'w') as devnull, contextlib.redirect_stdout(devnull):
    project.assignments.batch_add(inspections)

In [None]:
len(inspections)
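
As noted earlier, random selection is only one way to split a small number of assets. One possible alternative, sketched here and not part of the original notebook, is a round-robin rotation, which keeps the workload even. The `round_robin_assign` helper, the asset IDs, and the worker names are hypothetical.

```python
from collections import Counter
from itertools import cycle

def round_robin_assign(asset_ids, workers):
    """Pair each asset with a worker, rotating through the workers in order."""
    rotation = cycle(workers)
    return {asset_id: next(rotation) for asset_id in asset_ids}

# Illustrative asset IDs and placeholder worker names
asset_ids = ["101", "102", "103", "104", "105"]
workers = ["worker 1", "worker 2"]

assignment = round_robin_assign(asset_ids, workers)

# Count how many assets each worker received
workload = Counter(assignment.values())

print(assignment)
print(workload)  # Counter({'worker 1': 3, 'worker 2': 2})
```

A similar `Counter` over the worker of each created assignment is a quick way to check how evenly the work was distributed.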