# Plotting Tara Oceans sample locations

This notebook shows how to plot the geospatial metadata attached to the samples gathered by the Tara Oceans project. Plotting the sample locations also reveals an error in the data.

### Dependencies
- [folium](https://github.com/python-visualization/folium)
- [pandas](https://pandas.pydata.org/)

In [1]:
from idr import connection
try:
    import folium
except ImportError:
    !conda install -n python2 folium -y
    import folium

Each plate in the [Tara Oceans dataset](https://idr.openmicroscopy.org/webclient/?show=screen-1201) contains multiple microscopy images taken from a water sample. Each plate has metadata stored as key-value pairs, including the coordinates of the water sample location.

In [2]:
c = connection()
screen = c.getObject('Screen', 1201)
coords = []
for p in screen.listChildren():
    try:
        d = dict(p.getAnnotation().getValue())
    except AttributeError:
        continue
    if all(f in d for f in (
        'EVENT_LATITUDE_Start',
        'EVENT_LONGITUDE_Start',
        'EVENT_LATITUDE_End',
        'EVENT_LONGITUDE_End',
    )):
        co = (
            p.getId(),
            p.getName(),
            float(d['EVENT_LATITUDE_Start']),
            float(d['EVENT_LONGITUDE_Start']),
            float(d['EVENT_LATITUDE_End']),
            float(d['EVENT_LONGITUDE_End']),
        )
        coords.append(co)
c.close()

Connected to IDR ...


Create a folium feature for each sample, drawn as a line from the start coordinate to the end coordinate.

In [3]:
features = folium.FeatureGroup(name="Samples")
for coord in coords:
    line = [
        (coord[2], coord[3]),
        (coord[4], coord[5]),
    ]
    label = '<a href="https://idr.openmicroscopy.org/webclient/?show=plate-%d" target="_blank">Plate %s</a>' % coord[:2]
    sample = folium.PolyLine(line, popup=label, color='red', weight=10)
    features.add_child(sample)

Create and display the map. Clicking on a sample opens a popup showing the plate name, which links to the IDR data.

In [4]:
m = folium.Map(location=[0, 0], zoom_start=2)
m.add_child(features)
m

There is a very obvious artifact corresponding to [TARA_HCS1_H5_G100006116_G100006253--2013_10_30_19_38_12_chamber--U01--V01](https://idr.openmicroscopy.org/webclient/?show=plate-4807). Examination of the metadata in the key-value pairs suggests either the start or end coordinate has an incorrect east/west longitude:

```
EVENT_LATITUDE_Start    25.5264
EVENT_LONGITUDE_Start  -88.394
EVENT_LATITUDE_End      25.5416
EVENT_LONGITUDE_End     88.4044
```

From the map it should be obvious that the eastern end of the sample is on land (zoom in to the endpoints of the incorrectly plotted sample if this isn't clear). `EVENT_LONGITUDE_End` is incorrect and should probably be `-88.4044`.
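The same kind of inconsistency can be flagged programmatically instead of by eye. The following is a minimal sketch, assuming records laid out like the `coords` tuples built earlier (plate id, name, start latitude, start longitude, end latitude, end longitude). The function name `find_suspect_samples`, the 1-degree threshold, and the second example record are illustrative assumptions, not part of the IDR API or the dataset.

```python
def find_suspect_samples(coords, max_degrees=1.0):
    """Return records whose start and end points are implausibly far apart.

    Each record is (plate_id, name, lat_start, lon_start, lat_end, lon_end).
    The threshold is an arbitrary illustrative choice.
    """
    suspects = []
    for rec in coords:
        _, _, lat_s, lon_s, lat_e, lon_e = rec
        dlat = abs(lat_e - lat_s)
        dlon = abs(lon_e - lon_s)
        # Longitude wraps at +/-180 degrees, so take the shorter way round.
        dlon = min(dlon, 360.0 - dlon)
        if dlat > max_degrees or dlon > max_degrees:
            suspects.append(rec)
    return suspects


example = [
    # Values copied from the key-value pairs of plate 4807 shown above.
    (4807, 'plate-4807', 25.5264, -88.394, 25.5416, 88.4044),
    # Hypothetical record with the sign corrected, for comparison.
    (1, 'hypothetical corrected plate', 25.5264, -88.394, 25.5416, -88.4044),
]
suspects = find_suspect_samples(example)
print([rec[0] for rec in suspects])
```

Only the record with mismatched longitude signs is reported. A check like this cannot tell which endpoint is wrong, so the flagged plates still need to be compared against the original logsheets.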

The IDR has full provenance information for the Tara Oceans samples, so we can verify this by scrolling through the key-value pairs to [`EVENT_URI_Event_Logsheet`](http://store.pangaea.de/Projects/TARA-OCEANS/Logsheets_Event/TARA_20120109T1341Z_142_EVENT_PUMP.pdf) and clicking on the link to view the scanned logsheet.

The start and end longitudes on the logsheet are both in the west (i.e. negative). Note that the coordinates on the logsheets are in a different format (degrees and minutes rather than decimal degrees), but after conversion the end longitude matches our expectation:

In [5]:
88 + (24.261 / 60)

88.40435
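The conversion can be wrapped in a small helper that also applies the sign. This is a sketch only: it assumes the logsheet gives whole degrees, decimal minutes, and a hemisphere letter, as the arithmetic above suggests, and the name `dm_to_decimal` is hypothetical.

```python
def dm_to_decimal(degrees, minutes, hemisphere):
    """Convert degrees and decimal minutes to signed decimal degrees.

    South and west are negative, matching the convention of the
    EVENT_LATITUDE_* and EVENT_LONGITUDE_* values.
    """
    if hemisphere not in ('N', 'S', 'E', 'W'):
        raise ValueError('hemisphere must be one of N, S, E, W')
    value = degrees + minutes / 60.0
    return -value if hemisphere in ('S', 'W') else value


# 88 degrees 24.261 minutes west, the end longitude computed above
lon_end = dm_to_decimal(88, 24.261, 'W')
print(round(lon_end, 4))
```

Rounded to four decimal places, the result agrees with the proposed corrected value of `-88.4044` for `EVENT_LONGITUDE_End`.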