# Visualization of project coordinate data

This notebook visualizes the coordinates in the project data. The coordinates come from two sources: records already georeferenced in iDigBio and records georeferenced with Geolocate's batch georeferencing system. The visualizations are meant to show whether any Geolocate coordinates differ significantly from the coordinates already present in the iDigBio data, and so to establish whether more detailed georeferencing is necessary.

Throughout the notebook, Geolocate-georeferenced records are marked with a teal marker and iDigBio-georeferenced records with an orange marker. Clicking a marker on the map displays the record's iDigBio uuid, except on the larger Puma concolor and Lynx canadensis maps, where popups are disabled.
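
The color rule described above can be sketched as a small helper. This is a minimal sketch, assuming the `flags` column holds a JSON array serialized as a string (as the code cells in this notebook suggest). The function name `marker_color` and the fallback to orange for empty or unparsable values are assumptions of this sketch, not part of the original notebook.

```python
import json

GEOLOCATE_FLAG = "geolocate_georeference"  # flag value checked in the notebook's cells
TEAL = "#0A8A9F"    # Geolocate-georeferenced records
ORANGE = "#E37222"  # iDigBio-georeferenced records


def marker_color(flags_field):
    """Return the marker color for one record's `flags` field.

    The field is assumed to be a JSON array stored as a string, for example
    '["geolocate_georeference"]'. Only the first element is checked, as in
    the notebook's cells. Empty or unparsable values fall back to orange,
    which is an assumption of this sketch.
    """
    try:
        flags = json.loads(flags_field)
    except (TypeError, ValueError):
        return ORANGE
    if isinstance(flags, list) and flags and flags[0] == GEOLOCATE_FLAG:
        return TEAL
    return ORANGE


print(marker_color('["geolocate_georeference"]'))
print(marker_color('["some_other_flag"]'))
```

A helper like this could replace the repeated `if`/`else` in each cell, and it avoids an `IndexError` when a record has an empty flags array.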

#### NOTE: The maps do not display on GitHub. Use the following link to view the notebook: http://nbviewer.jupyter.org/github/ams939/COOP2018/blob/master/HabitatProject/Notebooks/Preliminary%20coordinate%20visualization.ipynb

### Leopardus pardalis records
Below is a visualization of the coordinates in the Leopardus pardalis (ocelot) data.

In [1]:
import folium
import pandas as pd
import json

# Initialize map, centered on Panama City
leopardus_map = folium.Map(location=[8.9967396, -79.5232352],
                        zoom_start=3,
                        tiles="CartoDB dark_matter")

# Import coordinate data for leopardus pardalis
df = pd.read_csv("leoparduspardalis.csv")

# Add points to map
for index, row in df.iterrows():
    lon, lat, flag, uuid = row["longitude"], row["latitude"], row["flags"], row["uuid"]
    point = [lat, lon]
    
    # Flag field contents array stored as string, extract it with json module
    flag = json.loads(flag)[0]
    
    # Assign color based on flag
    if flag == "geolocate_georeference":
        marker_color = "#0A8A9F" # Teal
    else:
        marker_color = "#E37222" # Orange
    
    # Initialize marker and add to the map
    folium.CircleMarker(
        location=point, 
        color = marker_color, 
        fill=True,
        fill_color = marker_color,
        radius = 1,
        popup = uuid
    ).add_to(leopardus_map)

# Map legend html code
legend_html = '''
     <div style="position: fixed; 
     bottom: 50px; left: 50px; width: 150px; height: 90px; 
     border:2px orange; z-index:9999; font-size:14px;
     ">&nbsp; <font color="white">Marker legend</font> <br>
     &nbsp; <font face = "Verdana" color="white">geolocate</font> &nbsp; <i class="fa fa-circle"
                  style="color:#0A8A9F"></i><br>
     &nbsp; <font face = "Verdana" color="white">idigbio</font> &nbsp; <i class="fa fa-circle"
                  style="color:#E37222"></i>
      </div>
     '''

# Output map
leopardus_map.get_root().html.add_child(folium.Element(legend_html))
leopardus_map

For the most part, the Geolocate and iDigBio records seem fairly consistent with each other. There are exceptions: Panama, for example, has far more Geolocate coordinates than iDigBio coordinates, while in Ecuador the opposite is true. The outliers appear to be coordinates located within the US (excluding Texas), some of which appear to be from zoos. Another outlier is a record off the coast of Madagascar.
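
A visual impression such as "far more Geolocate coordinates in Panama" can be checked with a simple count. The following is a minimal sketch, assuming records are available as dictionaries with `latitude`, `longitude` and `flags` keys (for the notebook's data, `df.to_dict("records")` would produce that shape). The bounding box is a rough, illustrative approximation of Panama and overlaps neighboring countries. The names `PANAMA_BBOX`, `source_of` and `count_sources_in_bbox`, and the sample records, are hypothetical.

```python
import json
from collections import Counter

# Rough bounding box (south, north, west, east) in decimal degrees.
# Approximate values for illustration only, not an authoritative boundary.
PANAMA_BBOX = (7.0, 9.7, -83.1, -77.1)


def source_of(flags_field):
    """Label a record 'geolocate' or 'idigbio' from the first entry of its flags array."""
    first_flag = json.loads(flags_field)[0]
    return "geolocate" if first_flag == "geolocate_georeference" else "idigbio"


def count_sources_in_bbox(records, bbox):
    """Count records per source whose coordinates fall inside bbox."""
    south, north, west, east = bbox
    counts = Counter()
    for rec in records:
        if south <= rec["latitude"] <= north and west <= rec["longitude"] <= east:
            counts[source_of(rec["flags"])] += 1
    return counts


# Made-up records for demonstration, not taken from the project data.
sample = [
    {"latitude": 8.9, "longitude": -79.5, "flags": '["geolocate_georeference"]'},
    {"latitude": 9.1, "longitude": -79.7, "flags": '["geolocate_georeference"]'},
    {"latitude": 8.5, "longitude": -80.2, "flags": '["some_other_flag"]'},
    {"latitude": -1.2, "longitude": -78.5, "flags": '["some_other_flag"]'},  # outside the box
]
print(count_sources_in_bbox(sample, PANAMA_BBOX))
```

A bounding box is only a coarse stand-in for a country outline, so counts near borders should be read with care.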

### Panthera onca records
Below is a visualization of the coordinates in the Panthera onca (jaguar) data.

In [32]:
# Initialize map, centered on Panama City
panthera_map = folium.Map(location=[8.9967396, -79.5232352],
                        zoom_start=3,
                        tiles="CartoDB dark_matter")

# Import coordinate data for panthera onca
df = pd.read_csv("pantheraonca.csv")

# Add points to map
for index, row in df.iterrows():
    lon, lat, flag, uuid = row["longitude"], row["latitude"], row["flags"], row["uuid"]
    point = [lat, lon]
    
    # Flag field contents array stored as string, extract it with json module
    flag = json.loads(flag)[0]
    
    # Assign color based on flag
    if flag == "geolocate_georeference":
        marker_color = "#0A8A9F" # Teal
    else:
        marker_color = "#E37222" # Orange
    
    # Initialize marker and add to the map
    folium.CircleMarker(
        location=point, 
        color = marker_color, 
        fill=True,
        fill_color = marker_color,
        radius = 1,
        popup = uuid
    ).add_to(panthera_map)

# Map legend html code
legend_html = '''
     <div style="position: fixed; 
     bottom: 50px; left: 50px; width: 150px; height: 90px; 
     border:2px orange; z-index:9999; font-size:14px;
     ">&nbsp; <font color="white">Marker legend</font> <br>
     &nbsp; <font face = "Verdana" color="white">geolocate</font> &nbsp; <i class="fa fa-circle"
                  style="color:#0A8A9F"></i><br>
     &nbsp; <font face = "Verdana" color="white">idigbio</font> &nbsp; <i class="fa fa-circle"
                  style="color:#E37222"></i>
      </div>
     '''

# Output map
panthera_map.get_root().html.add_child(folium.Element(legend_html))
panthera_map

For the most part, the Geolocate and iDigBio georeferenced records appear in the same areas. There are many more Geolocate coordinates in the US, however, some of which are likely to be outliers such as specimens from zoos. There also appear to be many more Geolocate georeferenced records in the more southern parts of South America. For example, there are many more Geolocate coordinates than iDigBio coordinates in Brazil. A major outlier is a Geolocate coordinate showing up in France. Overall, at a cursory glance, the Geolocate coordinates seem to support the iDigBio coordinates and do not add significantly different areas.
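
Isolated outliers such as the point in France could also be screened for numerically. The sketch below flags points that lie far from the median latitude or longitude, measured in median absolute deviations (MAD). It is one possible screen, not the method used in this notebook. The function name `mad_outliers`, the threshold of 4 and the sample coordinates are assumptions chosen for illustration. The approach treats longitude as a plain number, so it would need adjusting for data that crosses the 180° meridian, and it is unlikely to catch zoo specimens that sit only slightly outside the natural range.

```python
from statistics import median


def mad_outliers(points, threshold=4.0):
    """Return indices of points far from the median in latitude or longitude.

    points: list of (latitude, longitude) tuples in decimal degrees.
    A point is flagged when its deviation from the median exceeds
    `threshold` median absolute deviations (MAD) on either axis.
    """
    flagged = set()
    for axis in (0, 1):
        values = [p[axis] for p in points]
        center = median(values)
        mad = median([abs(v - center) for v in values])
        if mad == 0:
            continue  # no spread on this axis, nothing to compare against
        for i, v in enumerate(values):
            if abs(v - center) / mad > threshold:
                flagged.add(i)
    return sorted(flagged)


# Made-up coordinates: six points in the Americas and one in France.
sample_points = [
    (-3.1, -60.0), (-15.8, -47.9), (4.6, -74.1),
    (19.4, -99.1), (-12.0, -77.0), (8.98, -79.5),
    (46.2, 2.2),
]
print(mad_outliers(sample_points))
```

Flagged indices can then be looked up in the original table to inspect the corresponding uuids by hand.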

### Puma concolor records
Below is a visualization of the Geolocate and iDigBio coordinates for the Puma concolor (puma) records.
NOTE: The points on the map do not have popups when clicked due to limitations of the plotting library.

In [71]:
# Initialize map, centered on Panama City
puma_map = folium.Map(location=[8.9967396, -79.5232352],
                        zoom_start=3,
                        tiles="CartoDB dark_matter")

# Import coordinate data for puma concolor
df = pd.read_csv("pumaconcolor.csv")

# Add points to map
for index, row in df.iterrows():
    lon, lat, flag, uuid = row["longitude"], row["latitude"], row["flags"], row["uuid"]
    point = [lat, lon]
    
    # Flag field contents array stored as string, extract it with json module
    flag = json.loads(flag)[0]
    
    # Assign color based on flag
    if flag == "geolocate_georeference":
        marker_color = "#0A8A9F" # Teal
    else:
        marker_color = "#E37222" # Orange
    
    # Initialize marker and add to the map
    folium.CircleMarker(
        location=point, 
        color = marker_color, 
        fill=True,
        fill_color = marker_color,
        radius = 1,
        # popup = uuid  # Disabled: popups do not seem to work with a large number of records
    ).add_to(puma_map)

# Map legend html code
legend_html = '''
     <div style="position: fixed; 
     bottom: 50px; left: 50px; width: 150px; height: 90px; 
     border:2px orange; z-index:9999; font-size:14px;
     ">&nbsp; <font color="white">Marker legend</font> <br>
     &nbsp; <font face = "Verdana" color="white">geolocate</font> &nbsp; <i class="fa fa-circle"
                  style="color:#0A8A9F"></i><br>
     &nbsp; <font face = "Verdana" color="white">idigbio</font> &nbsp; <i class="fa fa-circle"
                  style="color:#E37222"></i>
      </div>
     '''

# Output map
puma_map.get_root().html.add_child(folium.Element(legend_html))
puma_map

Once again, the Geolocate and iDigBio coordinates seem to be in the same areas for the most part. One notable exception is Idaho, where there is a very large cluster of solely Geolocate coordinates within the state's National Forests. This cluster appears to be a very valuable addition to the iDigBio coordinates. A similar cluster can be seen in Texas. A significant outlier is a Geolocate record off the coast of Africa. Overall, the Geolocate coordinates seem to add new and possibly significant areas to the coordinate data.
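
Clusters that come from a single source, like the one described in Idaho, could be located by binning the coordinates into a grid and listing the cells where only one source appears. This is a minimal sketch under stated assumptions: records are given as `(latitude, longitude, source)` tuples, the cell size is one degree, and a cell needs at least three records to count as a cluster. The function name `single_source_cells`, these parameter values and the sample points are all hypothetical.

```python
import math
from collections import defaultdict


def single_source_cells(records, cell_size=1.0, min_count=3):
    """Find grid cells that contain records from exactly one source.

    records: iterable of (latitude, longitude, source) tuples, where source
    is a label such as "geolocate" or "idigbio".
    Returns {(lat_index, lon_index): (source, count)} for cells holding at
    least `min_count` records, all from the same source.
    """
    cells = defaultdict(list)
    for lat, lon, source in records:
        key = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        cells[key].append(source)

    result = {}
    for key, sources in cells.items():
        if len(sources) >= min_count and len(set(sources)) == 1:
            result[key] = (sources[0], len(sources))
    return result


# Made-up points: one Geolocate-only cell and one mixed cell.
sample = [
    (44.1, -115.2, "geolocate"), (44.5, -115.7, "geolocate"), (44.9, -115.1, "geolocate"),
    (39.2, -105.3, "geolocate"), (39.6, -105.8, "idigbio"), (39.4, -105.1, "geolocate"),
]
print(single_source_cells(sample))
```

With a cell size of one degree, each key identifies the cell's southwest corner, so `(44, -116)` covers latitudes 44 to 45 and longitudes -116 to -115.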

### Lynx canadensis records
Below is a visualization of the iDigBio and Geolocate coordinate data for the Lynx canadensis (Canada lynx) records.
NOTE: The points on the map do not have popups when clicked due to limitations of the plotting library.

In [2]:
# Initialize map, centered on Edmonton, Canada
lynx_map = folium.Map(location=[53.5559564, -113.774813],
                        zoom_start=3,
                        tiles="CartoDB dark_matter")

# Import coordinate data for lynx canadensis
df = pd.read_csv("lynxcanadensis.csv")

# Add points to map
for index, row in df.iterrows():
    lon, lat, flag, uuid = row["longitude"], row["latitude"], row["flags"], row["uuid"]
    point = [lat, lon]
    
    # Flag field contents array stored as string, extract it with json module
    flag = json.loads(flag)[0]
    
    # Assign color based on flag
    if flag == "geolocate_georeference":
        marker_color = "#0A8A9F" # Teal
    else:
        marker_color = "#E37222" # Orange
    
    # Initialize marker and add to the map
    folium.CircleMarker(
        location=point, 
        color = marker_color, 
        fill=True,
        fill_color = marker_color,
        radius = 1,
        # popup = uuid  # Disabled: popups do not seem to work with large datasets
    ).add_to(lynx_map)

# Map legend html code
legend_html = '''
     <div style="position: fixed; 
     bottom: 50px; left: 50px; width: 150px; height: 90px; 
     border:2px orange; z-index:9999; font-size:14px;
     ">&nbsp; <font color="white">Marker legend</font> <br>
     &nbsp; <font face = "Verdana" color="white">geolocate</font> &nbsp; <i class="fa fa-circle"
                  style="color:#0A8A9F"></i><br>
     &nbsp; <font face = "Verdana" color="white">idigbio</font> &nbsp; <i class="fa fa-circle"
                  style="color:#E37222"></i>
      </div>
     '''

# Output map
lynx_map.get_root().html.add_child(folium.Element(legend_html))
lynx_map

It appears that the Geolocate coordinates add points to many locations where iDigBio coordinates are sparse, for example in the northwestern US and western Canada. In Alaska, there is a large cluster of both Geolocate and iDigBio coordinates near its eastern border, followed by a sparse, broad spread of Geolocate coordinates towards the western parts of the state. In addition, there is a very large cluster of iDigBio coordinates in Ontario near the Great Lakes, with very few Geolocate coordinates mixed in. Overall, the Geolocate coordinates seem to add more data points to areas where iDigBio coordinates are.
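
Whether a Geolocate point falls in an area that iDigBio already covers can be estimated from its distance to the nearest iDigBio point. The sketch below uses the haversine great-circle formula and a brute-force search, which is simple but slow for large datasets. The function names, the 200 km cutoff and the sample coordinates are assumptions for illustration, not values taken from the project.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point rounding before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def isolated_points(candidates, reference, min_km=200.0):
    """Return (point, distance) pairs for candidates with no reference point within min_km.

    candidates, reference: lists of (latitude, longitude) tuples.
    """
    isolated = []
    for lat, lon in candidates:
        nearest = min(haversine_km(lat, lon, rlat, rlon) for rlat, rlon in reference)
        if nearest > min_km:
            isolated.append(((lat, lon), nearest))
    return isolated


# Made-up coordinates: two iDigBio points in Ontario, one nearby Geolocate
# point and one Geolocate point in Alaska.
idigbio_points = [(46.5, -81.0), (47.0, -84.5)]
geolocate_points = [(46.8, -82.0), (64.8, -147.7)]
for point, distance in isolated_points(geolocate_points, idigbio_points):
    print(point, round(distance), "km from the nearest iDigBio point")
```

Points returned by `isolated_points` are candidates for the "new area" cases discussed above, while points with a close iDigBio neighbor mostly reinforce existing coverage.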