# ArcGIS Online Organization Administration at Scale
## ESRI EdUC 2019
## Seth Peery, Sr. GIS Architect, Virginia Tech


In [1]:
# This simply sets up the environment for us to run the Jupyter Notebook as a presentation
from notebook.services.config import ConfigManager
cm = ConfigManager()
cm.update('livereveal', {
        'width': 1440,
        'height': 900,
        'scroll': True,
})

{'width': 1440, 'height': 900, 'scroll': True}

# Link to this Presentation and Code

![qr](images/qr.png)
https://github.com/sspeery/educ2019



# Rationale for organization management
* Software as a Service products must be managed with the same attention we give on-premises systems
* Management objective is to lower impediments to use of ArcGIS Online
* AGOL depends on finite shared resources; org administration is the stewardship of these resources
 * named users
 * service credits
 * Pro and other licenses
 * usability and organization of the site
* AGOL orgs can become unwieldy at scale, and require different approaches
* User lifecycle stages require different management practices over time
* Processes, training and automation become increasingly important for large orgs


## A brief history of SaaS WebGIS Administration
* From a mindset of scarcity to abundance
* From a niche content service to the foundation of a WebGIS
* Scripting evolved to fill gaps in the API
* Constant evolution of the technology makes past administration best practice obsolete


## Scale Considerations
* DevOps metaphor: cattle v. pets
* An amorphous blob of users vs first name basis
* Use batch operations as often as possible
* Think in the aggregate
* Anticipate user life cycle events (especially important in university communities with transient populations!)
    * onboarding
    * ad hoc requests
    * deprovisioning


![events](images/adminflow.png)

# Org administration tasks
* Enterprise Logins
* Auto provisioning
* Initial default privileges and entitlements
* Credit stewardship
* Pro Licenses
* Ad hoc requests
* Content migration
* Deprovisioning

## Enterprise Logins with automatic provisioning

Once your organization reaches a certain scale, sending invitations to new users is untenable.

We assume that most entities with large AGOL organizations will have an Enterprise idP (Identity Provider) with which AGOL can federate for authentication.

Therefore, it's a best practice to use Enterprise Logins with the auto-provisioning model.  This is step #1.

At that point, the work is shifted from managing users on the front end (onboarding) to figuring out what to do with them after they show up. 

<img src="images/entlogins.png" alt="Enterprise Logins" style="width: 500px;"/>

## Setting Default User Privileges
After many years of complaining by the education community, ESRI has finally come out with an out of the box solution for auto-provisioning of many of the things we've had to use scripting for in the past: *New Member Defaults.*

This allows for automatic allocation of Software licenses, Credit Budgets, Groups, User Types, Roles, and ESRI Access, and *dramatically* cuts down on onboarding overhead. *
<img src="images/newmemberdefaults.png"  style="width: 500px;"/>

* it also made a big chunk of this presentation obsolete and I had to redo it after the June Release.  Thanks, ESRI.


## The Geri Miller Guide to Geo-enabling Your Campus with New Member Defaults
1. Enable enterprise logins, commonly known as SSO – integrate with existing business systems and do not create arcgis-only accounts (except when working with outside affiliates).
2. Enable "Automatically" join for enterprise logins (also known as auto-provisioning), so that new users are automatically granted access to ArcGIS.
3. Configure the "New Member Defaults" so that people have access to everything in ArcGIS they might need to do their job:
    1. Leave the default user type set to "Creator". (FYI, the default user type will soon become GIS Professional Advanced for those with institutional agreements.)
    2. Set the role to "Publisher" – empower everyone with the abilities to do the work they need to do
    3. Configure default Add-on Licenses (ArcGIS Pro, Insights, Business Analyst, Community Analyst, GeoPlanner, etc.)
    4. Set a Credit Quota (1000, 2000, 5000, etc.) whatever fits your institution -- enable your community to do their work, yet protect them from accidental mistakes
    5. Enable Esri Access for everyone – allow them to utilize Esri Academy (E-Learning/Esri Training), GeoNet, etc.
4. Encourage students, faculty, staff to leverage all free options for self-learning, such as Learn ArcGIS, Esri Academy, documentation, etc.



## New Member Defaults Philosophy

### Give. Everybody. Everything.

* ideally license count is 1:1 with your named user count, in which case grant indiscriminately
* with credit budgeting it is OK to give the Publisher Role
* ESRI access for everybody
* Credits are not the limiting factor (but licenses and named users might be)

## Documentation
* The customization and automation necessary in large scale organizations become a critical part of your infrastructure!
* It can be challenging to remember what you did for non-recurring operations
* Jupyter Notebooks, like this presentation, can store documentation alongside executable code
    * These can be run locally or on a jupyter notebook server.  
    * [Here's how to set up a Jupyter Notebook Server](https://github.com/sspeery/educ2019/blob/master/jupyter_notebook_server.md)
* Revision control systems, like GitHub, can be used for storing the authoritative versions of scripts (and sharing them!)
* Internal wikis are also a good documentation repository
* As we move towards Infrastructure as Code, artifacts that we create are more amenable to revision control. 

## Reverse Engineering the ArcGIS Online Admin GUI

There are times we want to do something that seems like it should be possible in the GUI, but is not straightforward to do in Python.  
* Use Chrome Developer Tools to inspect the http traffic to the Portal's REST API
* Use PostMan to craft our own requests to mimic what the GUI is doing 
* Use the requests module to make the request from Python
<table><tr><td><img src="images/ChromeDevTools.png" alt="devtools" style="width: 500px;"/></td><td><img src="images/postman.png" alt="postman" style="width: 500px;"/></td></tr></table>

In [None]:
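import json
import requests

# portalURLRest is assumed to be defined earlier and to hold the REST URL of
# your portal (the same base URL seen in the GUI's requests in Chrome Developer Tools).
def get_roles(token):
    """Return the organization's role definitions, or None if the request fails."""
    parameters = {'token': token, 'f': 'json'}
    response = requests.get(portalURLRest + "/roles", params=parameters)
    if response.ok:
        jsonResponse = json.loads(response.content)
        # An error payload may come back without a 'roles' key, so use .get()
        return jsonResponse.get('roles')
    return None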

## User Interaction Events
* When operating at scale we need efficient mechanisms to communicate with our users
* It is very useful to maintain a listserv of AGOL users for sitewide notifications (and also is a decent proxy for your institution's GIS user base!)
* When users need to contact us, it gets unwieldy if they e-mail the admins directly.
* A good practice in this space is to create points of entry into a ticketing system for AGOL ad hoc events
* Even better if one can automate the fulfillment of the tickets using Python!


## Communication
* Successful management of AGOL is a collaborative process with stakeholders.
* The out of the box functionality of an AGOL org is wrapped in a business process layer that will depend on your administration strategy.

_Users need to know how you operate your organizations._

Define standard operating procedures for 
* User onboarding
* Credit consumption and budgeting
* Default role privileges and escalation processes
* Process for deprovisioning
* Process to request more credits, privileges, or content transfer
* Identify a single point to request ad hoc operations or ask questions.
* User training via documentation, seminars, presentations



  
     
       
       


## Removing "Drive by Users"

### [Drive By Users Example](https://jupyter2.aws.gis.cloud.vt.edu:8888/jupyter/notebooks/educ2019/DriveByUserDeletion.ipynb)

## New User cron job

### New User Cron href
This needs to be updated for the all-org entitlements feature.
It also demonstrates the ethic of giving everybody everything.

## Topping off a user's credits on demand

### With AWS Lambda goodness and ticketing system integration
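
A minimal sketch of what the Lambda side of a credit top-off could look like. The event shape (`username` and `target` keys), the `DEFAULT_TARGET` value, the environment variable names, and the helper names are all assumptions made for illustration; only `gis.users.get` and `gis.admin.credits.allocate` are calls from the Python API, and the sketch assumes credit budgeting is enabled in the org.

```python
import os

DEFAULT_TARGET = 500  # hypothetical per-user credit budget


def credits_needed(available, target=DEFAULT_TARGET):
    """Credits required to bring a user's balance back up to the target."""
    return max(0.0, float(target) - float(available))


def top_off(gis, username, target=DEFAULT_TARGET):
    """Reset a user's allocation to `target` if their balance has dropped below it."""
    user = gis.users.get(username)
    if user is None:
        return {"username": username, "status": "not found"}
    # Assumption: availableCredits is present because credit budgeting is enabled.
    available = getattr(user, "availableCredits", 0) or 0
    if credits_needed(available, target) == 0:
        return {"username": username, "status": "already at target"}
    gis.admin.credits.allocate(username=username, credits=target)
    return {"username": username, "status": "allocated", "credits": target}


def connect_from_env():
    """Build a GIS connection from (hypothetical) environment variables."""
    from arcgis.gis import GIS  # imported lazily; only needed when actually deployed
    return GIS(os.environ["AGOL_URL"], username=os.environ["AGOL_USER"],
               password=os.environ["AGOL_PASSWORD"])


def lambda_handler(event, context, connect=connect_from_env):
    """Entry point; the ticketing system is assumed to send a username and optional target."""
    gis = connect()
    return top_off(gis, event["username"], event.get("target", DEFAULT_TARGET))
```

Passing the connection factory as a parameter keeps the credit logic testable without touching a real organization; Lambda itself only ever supplies `event` and `context`.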

## Creating and maintaining a user e-mail list
Both can be handled in the new user cron job, provided we collect all the e-mail addresses.


## REST API example
tbd

## Identifying users who are no longer affiliated

### itpalsout of band rest

How you determine who is an affiliate is up to you.
* Some idPs (e.g., Active Directory) may return a negative authentication response once a user has been deactivated or removed, so those users are locked out automatically.
* At the University of Michigan, identities don’t expire upon graduation, but the idP inserted code into their authentication backend that checks a username for the required affiliations, then returns a negative authN if authZ is negative.
* VT’s idP returns positive authN regardless of authZ, so we developed a microservice to check eligibility and ran that out of band.
* *It would be nice* if we could read extended attributes from the idP.

Either way, you end up with a list of users who should not be in your org anymore... then what?  This can't be fully automated.
* send e-mail  (lambda?)
* move content to an archival account?
* export content elsewhere?


## Content Migration and Deprovisioning
* AGO Assistant: https://ago-assistant.esri.com/
* Know when a user should be deprovisioned
    * Important for license compliance
    * Important for site security
    * "Could" be automatic with the Enterprise Login model, depending on the idP
* Decide what to do with the content, if there is any
    * User copies it somewhere else (self-service operation)
    * An active user inherits it (an admin must do this)
    * Archive it
    * Delete it
* Communicate timelines clearly, and then delete the account
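
The "archive it, then delete the account" path can be sketched as a small function. This is an illustration under assumptions: the `deprovision` name, the archive account, and the dry-run default are hypothetical, and it relies on the Python API's `User.items()`, `User.reassign_to()` and `User.delete()` methods. Note that `items()` only lists a limited number of items from the user's root folder, so the count is a lower bound.

```python
def deprovision(user, archive_username, dry_run=True):
    """Hand a departing user's content to an archive account, then delete the user.

    Returns the list of actions taken (or, in a dry run, the actions that would be taken).
    """
    actions = []
    item_count = len(user.items())  # root-folder items only; treat as a lower bound
    if item_count > 0:
        actions.append("reassign {} item(s) to {}".format(item_count, archive_username))
        if not dry_run:
            user.reassign_to(archive_username)
    actions.append("delete {}".format(user.username))
    if not dry_run:
        user.delete()
    return actions
```

Defaulting to a dry run means the function has to be told explicitly to make changes, which suits an operation that cannot be undone.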

## Future Directions
* More work is needed on what to do with old content
* We really need to tackle the storage problem
*jupyter teaser for top storage offenders*

## Scripting
* The UI has evolved but has its limits when performing large batch operations
* When Scripting for ArcGIS Online organization administration, we mostly use the Python API
* But the Python API wraps the REST API, which in some cases is the only way to do certain things

## The Python API for organization administration
The Python API provides a set of objects for administering your Web GIS.
See the diagram of the `gis` module below:
![GIS Module](http://esri.github.io/arcgis-python-api/notebooks/nbimages/guide_gis_module_01.png)

## Connecting to your Web GIS (ArcGIS Online / ArcGIS Enterprise)
To connect to your organization, we import the requisite libraries and then create an object of type "GIS":

In [None]:
from arcgis.gis import GIS
import requests
import json
import time  # used later to format each user's lastLogin
import pandas
# Connection syntax is
# gis = GIS("https://myorganization.maps.arcgis.com",username="An_admin_user",password="Please_do_not_put_this_in_clear_text")
with open('vtActiveConfig.json') as configFile:
    myConfig = json.load(configFile)
gis = GIS(myConfig['agolOrg']['url'],username=myConfig['agolOrg']['username'],password=myConfig['agolOrg']['password'])    
gis

In my case I put org-specific stuff into a config file so this notebook can be more easily shared with others.
The file looks like this:

In [None]:


{
    "agolOrg":{
            "url":"https://yourOrgShortName.maps.arcgis.com",
            "username":"*******",
            "password":"*******",
            "shortName":"yourOrgShortName"
    },
    "authService":{
            "url":"https://some_url_that_checks_usernames",
            "username":"*****",
            "password":"*****",
            "valueIfTrue":"true"
    }
}
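
One way to keep the password out of the shared file altogether is to let an environment variable override it. The sketch below assumes a hypothetical `load_config` helper and a hypothetical `AGOL_PASSWORD` variable name; the file layout is the one shown above.

```python
import json
import os


def load_config(path, env=os.environ):
    """Load the org config file, letting an environment variable supply the password."""
    with open(path) as config_file:
        config = json.load(config_file)
    # AGOL_PASSWORD is a hypothetical variable name chosen for this sketch
    password = env.get("AGOL_PASSWORD")
    if password:
        config["agolOrg"]["password"] = password
    return config
```

With this in place, the JSON file can carry a placeholder password and still be committed or shared safely.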

##  Now let's get some info about our users.
We're creating a notebook local data structure in a single operation that will then allow subsequent cells to search through a large number of users without a bunch of round trips to AGOL.
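
Two of the per-user values need a small conversion before they are useful, and those conversions can be sketched on their own. The helper names are hypothetical. The sketch assumes `lastLogin` is reported in epoch milliseconds (as the division by 1000 in the cell below implies) and that a non-positive value means the user never logged in; it also uses UTC rather than local time. ISO-formatted dates have the side benefit of sorting correctly as strings.

```python
from datetime import datetime, timezone


def last_login_date(last_login_ms):
    """Convert epoch milliseconds to a date, or None if the user never logged in."""
    if last_login_ms is None or last_login_ms <= 0:
        return None
    return datetime.fromtimestamp(last_login_ms / 1000, tz=timezone.utc).date()


def strip_org_suffix(username, short_name):
    """Recover the enterprise ID from an AGOL username of the form <id>_<orgShortName>."""
    suffix = "_" + short_name
    if username.endswith(suffix):
        return username[:-len(suffix)]
    return username
```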

In [None]:
userList = []
users = gis.users.search(max_users=2000)

for user in users:
    #These things come straight from the user dict
    d_esriUsername = user.username
    d_fullName = user.fullName
    d_email = user.email
    d_role = user.role
    d_storage = (user.storageUsage / 1024)
    
    #number of content items <=100 is returned by length of items arr
    d_items = len(user.items())
    #print(d_items)
    
    #VT PID is returned by stripping off the _virginiatech
    d_pid = user.username.rsplit("_"+myConfig['agolOrg']['shortName'])[0]
    
    #last access comes from https://developers.arcgis.com/python/guide/accessing-and-managing-users/
    t_last_accessed = time.localtime(user.lastLogin/1000)
    d_lastAccess = "{}/{}/{}".format(t_last_accessed[0], t_last_accessed[1], t_last_accessed[2])
    
    #count of groups this user is a member of
    d_groupCount = len(user.groups)
    
    #Now build a data structure    
    currentUserInfo = {"pid":d_pid,
                        "esriUsername":d_esriUsername,
                        "fullName":d_fullName,
                        "email":d_email,
                        "storage":d_storage,
                        "role":d_role,
                        "lastAccess":d_lastAccess,
                        "groups":d_groupCount,
                        "items":d_items}
    userList.append(currentUserInfo)
    
# iteration done.
# now let's make a dataframe.  We'll use this later.
df = pandas.DataFrame(userList)
df




Now that we have a data structure of user information in memory, we can make administrative decisions based on it.

* Search for a role and grant its members ArcGIS Pro licenses
* Delete "drive by" users
* Identify users not affiliated with the institution using an out-of-band lookup service

In [None]:
# What are these weird custom roles?   Let's find out
roles = gis.users.roles.all()
for role in roles:
    print(role)

In [None]:
gis.users.roles.get_role('pH1lPndPVtVbrE6l')

In [None]:
# Let's get all the users whose role is "New User".
# We're using pandas boolean indexing with .loc here
df.loc[(df['role'] == 'pH1lPndPVtVbrE6l')]

In [None]:
# Let's give our new users an ArcGIS Pro license
# For now, we'll use the licenses and entitlements we expect
entitlements = {'ArcGIS Pro': ['geostatAnalystN', 'spatialAnalystN', 'networkAnalystN', 'dataReviewerN',
                               'dataInteropN', 'workflowMgrN', '3DAnalystN', 'desktopAdvN']}
licenses = {lic: gis.admin.license.get(lic) for lic in entitlements}
new_users = gis.users.search("pH1lPndPVtVbrE6l")            
for u in new_users:
    for license_type in entitlements:
        lic = licenses[license_type]
        # THIS IS A DEMO SO WE DON'T ACTUALLY PULL THE TRIGGER
        #lic.assign(username=u.username, entitlements=entitlements[license_type])
        print('{0} entitlements granted to user {1.username}'.format(license_type, u))

## Example 2: Drive by users

In [None]:
#Let's just look for the users who have no content items or storage, nor are they in any groups.
driveBy = df.loc[(df['storage'] == 0) & (df['items'] == 0) &(df['groups'] ==0)]
driveBy

In [None]:
len(df)

In [None]:
# Sort by last accessed time.
driveBy.sort_values('lastAccess')

####  I feel reasonably confident we can get rid of users who 
* own no content items
* are members of no groups
* use no storage
* have not logged in for a year

... since if that user logs back in, it will be like they never left.
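
The four criteria can be written as a single predicate. This is a sketch on made-up records rather than the notebook's dataframe: it assumes `lastAccess` is stored as a `datetime.date` (or `None`) instead of a string, and it treats a user with no recorded login as stale, which is a policy choice you may not share.

```python
from datetime import date, timedelta


def is_drive_by(record, today, inactive_days=365):
    """True if the user owns nothing, belongs to nothing, and has been idle too long."""
    last_access = record["lastAccess"]
    stale = last_access is None or (today - last_access) > timedelta(days=inactive_days)
    return (record["storage"] == 0 and record["items"] == 0
            and record["groups"] == 0 and stale)


# Made-up records shaped like the entries in userList
sample = [
    {"esriUsername": "a_org", "storage": 0, "items": 0, "groups": 0, "lastAccess": date(2017, 9, 1)},
    {"esriUsername": "b_org", "storage": 0, "items": 0, "groups": 0, "lastAccess": date(2019, 5, 20)},
    {"esriUsername": "c_org", "storage": 12.5, "items": 3, "groups": 1, "lastAccess": date(2016, 1, 15)},
]
candidates = [r["esriUsername"] for r in sample if is_drive_by(r, today=date(2019, 7, 1))]
```

Comparing against a rolling cutoff, rather than a hard-coded year, means the same check still works when the notebook is re-run next year.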

In [None]:
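# So we nuke them from orbit: same criteria as above, plus no login within the past year
lastLogin = pandas.to_datetime(df['lastAccess'], format='%Y/%m/%d')
cutoff = pandas.Timestamp.now() - pandas.DateOffset(years=1)
deleteList = df.loc[(df['storage'] == 0) & (df['items'] == 0) & (df['groups'] == 0) & (lastLogin < cutoff)]
deleteList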

In [None]:
len(deleteList)

In [None]:
for index, row in deleteList.iterrows():
    sUserToDelete = row['esriUsername']
    print("Deleting " + sUserToDelete + "...")
    # Here we would simply call
    # userToDelete = gis.users.get(sUserToDelete)
    # userToDelete.delete()

## Out of band affiliation check
Since ArcGIS Online cannot pull extended attributes from a SAML identity provider beyond username and email, we developed a web service to perform a check for "Active" status for any username we provide it.  

NOTE that this does not translate outside of the Virginia Tech context; we developed a custom solution for this.  
Your custom web services can be integrated... in orgConfig.json, provide values for the "authService" group:
* URL
* authKey or password to access the service
* value to be returned if the user is a valid member
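
Because the answer from this service can lead to an account being deprovisioned, it may be worth distinguishing "the service said no" from "the service could not be reached". The sketch below is one way to do that. The function names and the three-way result are assumptions, it uses the standard library's `urllib` in place of `requests`, and the request parameters mirror the `authService` settings described above.

```python
import urllib.parse
import urllib.request


def http_post(url, params, timeout=10):
    """POST form-encoded params and return the response body as stripped text."""
    data = urllib.parse.urlencode(params).encode("utf-8")
    with urllib.request.urlopen(url, data=data, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def affiliation_status(pid, service, post=http_post):
    """Return 'active', 'inactive', or 'unknown' for one username."""
    try:
        answer = post(service["url"], {"authkey": service["password"], "pid": pid})
    except OSError:
        # Network errors and timeouts are not evidence that the user has left
        return "unknown"
    return "active" if answer == service["valueIfTrue"] else "inactive"
```

Users whose status comes back as `unknown` can simply be checked again on the next run instead of being added to the deprovisioning list.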

In [None]:
userList = []
users = gis.users.search(max_users=2000)

# Wrap the call to the REST service in a function
def isActiveVT(pid):
    result = False
    url = myConfig['authService']['url']
    params = {'authkey':myConfig['authService']['password'],'pid':pid}
    r = requests.post(url,data=params)
    if(r.text==myConfig['authService']['valueIfTrue']):
        result=True
    return result

for user in users:
    d_esriUsername = user.username
    d_fullName = user.fullName
    d_email = user.email
    d_pid = user.username.rsplit("_"+myConfig['agolOrg']['shortName'])[0]

    # Call the custom REST service 
    d_active = "false"
    if(isActiveVT(d_pid)):
        d_active = myConfig['authService']['valueIfTrue'] 
    
    currentUserInfo = {"pid":d_pid,
                        "esriUsername":d_esriUsername,
                        "fullName":d_fullName,
                        "email":d_email,
                        "active":d_active}
    userList.append(currentUserInfo)
print(str(len(userList)) + " users checked against the affiliation service.")
    
# iteration done.
# now let's make a dataframe.  We'll use this later.
df = pandas.DataFrame(userList)
df

In [None]:
nonAffiliates = df.loc[(df['active'] == 'false') ]
print(str(len(nonAffiliates)) + " users are no longer VT Affiliates.")
nonAffiliates

# Automation needs to integrate with your business processes 
* Once we generate a list of users who need to be deprovisioned, then what?
 * We could send them an automatic e-mail notification
 * Provide references for self-deprovisioning (AGO-Assistant)
 * Set a time limit for automated content migration
   * to another user
   * to an orphaned content account
* Note that you can't delete users with remaining content.
* Content management is a whole separate presentation
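
As an illustration of the notification step, the message itself can be composed with the standard library. The function name, sender address, subject line and wording are placeholders to adapt to your own process; sending it (for example with `smtplib`) is left out.

```python
from email.message import EmailMessage

AGO_ASSISTANT_URL = "https://ago-assistant.esri.com/"


def build_deprovision_notice(full_name, email, deadline, sender="gis-admin@example.edu"):
    """Compose (but do not send) a removal notice for one user."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = email
    msg["Subject"] = "Your ArcGIS Online account is scheduled for removal"
    msg.set_content(
        "Hello {name},\n\n"
        "Our records show you are no longer affiliated with the institution.\n"
        "Your ArcGIS Online account will be removed on {deadline}.\n"
        "To copy your content elsewhere before then, see {url}\n".format(
            name=full_name, deadline=deadline, url=AGO_ASSISTANT_URL))
    return msg
```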



![ago-assist](agoassist.png)

# Options for running Python API code in production
* Interactively 
 * Jupyter notebook
 * Your preferred Python IDE
 * Command line
* Event driven
 * AWS Lambda
* Recurring
 * AWS Lambda
 * cron job/ scheduled task
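
For the cron job or scheduled task option, notebook code usually needs to be wrapped in a script with a command-line entry point. The skeleton below is only a sketch: the flag names are hypothetical, and the actual connection and admin tasks are left as a comment.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options for an unattended housekeeping run."""
    parser = argparse.ArgumentParser(description="ArcGIS Online housekeeping (sketch)")
    parser.add_argument("--config", default="vtActiveConfig.json",
                        help="path to the org config file")
    parser.add_argument("--apply", action="store_true",
                        help="make changes; without this flag the script only reports")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    mode = "APPLY" if args.apply else "DRY RUN"
    print("[{}] using config {}".format(mode, args.config))
    # ... connect with GIS(...) and run the admin tasks here ...
    return 0

# In a standalone script, finish with:
#   if __name__ == "__main__":
#       raise SystemExit(main())
```

A crontab entry such as `0 2 * * * python3 /path/to/housekeeping.py --apply` (the path is a placeholder) would then run it nightly, while running it by hand without `--apply` stays read-only.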

![lambda](lambda.png)

This presentation was built in Jupyter Notebooks using the RISE notebook extension.

I'm running it from a [Jupyter Notebook Server.](https://github.com/sspeery/educ2019/blob/master/jupyter_notebook_server.md)
![Architecture](jupyterarch.png)

# Presenter Contact Information
>*Seth Peery*  
>    Senior GIS Architect  
>    Enterprise GIS (0214)  
>        1700 Pratt Drive  
>        Blacksburg, VA 24061  
>    (540) 231-2178  
>    sspeery@vt.edu   
>    http://gis.vt.edu
![qr](qr.png)
GitHub Repository for this presentation and related examples:  https://github.com/sspeery/educ2019