# Mining Images

__tl;dr__: This is a starting point for image mining.

---

## Before you start

### Start the microservices needed for this notebook

To get started you will need to start both the [video asset manager](https://github.com/mbari-media-management/vampire-squid) and [annotation](https://github.com/mbari-media-management/annosaurus) microservices using [Docker](https://www.docker.com/). One of the easiest ways to do this is to use the [m3-microservices project](https://github.com/mbari-media-management/m3-microservices):

```
git clone https://github.com/mbari-media-management/m3-microservices.git
cd m3-microservices
# Edit .env as per the README
docker-compose build
docker-compose up
```

### Get your IP address

On Mac/Linux: 

```
ifconfig | grep "inet " | grep -Fv 127.0.0.1 | awk '{print $2}'
```
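
If you would rather stay inside the notebook, the same address can usually be found from Python. The following is a minimal sketch, not part of the original notebook: `local_ip` is a hypothetical helper, and the probe address `192.0.2.1` is only a placeholder from the documentation address range (connecting a UDP socket selects a route but sends no packets). It falls back to `127.0.0.1` when no route is available.

```python
import socket

def local_ip(probe_host="192.0.2.1", probe_port=80):
    """Best-effort guess at this machine's outward-facing IPv4 address (hypothetical helper)."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            # Connecting a UDP socket only picks a local interface; nothing is transmitted.
            s.connect((probe_host, probe_port))
            return s.getsockname()[0]
    except OSError:
        # No usable route (for example, no network): fall back to loopback.
        return "127.0.0.1"

print(local_ip())
```

On machines with several network interfaces the result may not be the address the Docker services are reachable on, so check it against the `ifconfig` output.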

In [1]:
# Enter your IP address here
ip_address = "134.89.113.105"

### Set your client secrets

Look in `m3-microservices/.env` for the values for:

- ANNO_APP_CLIENT_SECRET
- VAMP_APP_CLIENT_SECRET

and set them below. I've already set them to the default values, so if you haven't changed them in the `.env` file, you can skip this step.

In [2]:
anno_secret = "foo"
vam_secret = "foo"
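
Instead of copying the values by hand, they could be read straight from the `.env` file. This is a sketch under assumptions: it treats the file as plain `KEY=VALUE` lines with `#` comments, which may not cover everything the real `m3-microservices/.env` contains. `read_env` is a hypothetical helper and the file location is a guess relative to the notebook.

```python
from pathlib import Path

def read_env(path):
    """Parse simple KEY=VALUE lines into a dict (hypothetical helper).

    Blank lines, lines starting with '#', and lines without '=' are skipped.
    """
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

env_path = Path("m3-microservices/.env")  # assumed location; adjust as needed
if env_path.exists():
    env = read_env(env_path)
    # Fall back to the default "foo" when a key is missing
    anno_secret = env.get("ANNO_APP_CLIENT_SECRET", "foo")
    vam_secret = env.get("VAMP_APP_CLIENT_SECRET", "foo")
```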

---
## Set up

### Endpoints

In [3]:
annosaurus_url = "http://%s:8082/anno/v1" % (ip_address)
vampire_squid_url = "http://%s:8084/vam/v1" % (ip_address)

# Useful annosaurus endpoints
annotation_url = annosaurus_url + "/annotations"
image_url = annosaurus_url + "/images"
imaged_moments_url = annosaurus_url + "/imagedmoments"
observation_url = annosaurus_url + "/observations"
association_url = annosaurus_url + "/associations"
data_url = annosaurus_url + "/ancillarydata"

# Useful vampire-squid endpoints
media_url = vampire_squid_url + "/media"

### Helper Functions

In [4]:
# %load m3_rest.py
import datetime
import dateutil
import json
import pprint
import random
import requests
import urllib
import uuid

def show(s, data = None):
    "Display the json response from API calls"
    pp = pprint.PrettyPrinter(indent=2)
    print("--- " + s)
    if data:
        pp.pprint(data)
    
def iso8601():
    "Return the current UTC time as an ISO 8601 string ending in 'Z'"
    return datetime.datetime.now(datetime.timezone.utc).isoformat()[0:-6] + "Z"

def auth_header(access_token):
    "Convenience method to build JWT authorization header"
    return {"Authorization": "Bearer " + access_token}

def pretty_dict(d, indent=0):
    "Pretty print a python dictionary"
    for key, value in d.items():
        print('\t' * indent + str(key))
        if isinstance(value, dict):
            # Recurse into nested dictionaries, one tab deeper
            pretty_dict(value, indent+1)
        else:
            print('\t' * (indent+1) + str(value))
    
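def parse_response(r):
    "Parse a JSON response; on failure, print the request details and return {}"
    try:
        return json.loads(r.text)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError
        s = "URL: %s\n%s (%s): %s" % (r.request.url, r.status_code, r.reason, r.text)
        print(s)
        return {}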
    
# --- Some helper functions that display the web traffic
#     Useful for demo
def pretty_print(pr):
    "Pretty print an HTTP request"
    print('{}\n{}\n{}\n\n{}'.format(
        '-----------REQUEST-----------',
        pr.method + ' ' + pr.url,
        '\n'.join('{}: {}'.format(k, v) for k, v in pr.headers.items()),
        pr.body,
    ))
    
def send(pr):
    pretty_print(pr)
    s = requests.Session()
    return s.send(pr)
     
def pretty_delete(url, access_token):
    r = requests.Request('DELETE', url, headers=auth_header(access_token))
    pr = r.prepare()
    return parse_response(send(pr))

def pretty_get(url):
    r = requests.Request('GET', url)
    pr = r.prepare()
    return parse_response(send(pr))

def pretty_post(url, access_token, data=None):
    "POST with a JWT header, printing the request before it is sent"
    r = requests.Request('POST', url, data=data, headers=auth_header(access_token))
    pr = r.prepare()
    return parse_response(send(pr))

def pretty_put(url, access_token, data=None):
    "PUT with a JWT header, printing the request before it is sent"
    r = requests.Request('PUT', url, data=data, headers=auth_header(access_token))
    pr = r.prepare()
    return parse_response(send(pr))
    
    
# --- Basic REST calls, you'd probably use these in your own 
#     applications instead of the pretty-fied versions above. 
def delete(url, headers):
    return parse_response(requests.delete(url, headers=headers))

def get(url):
    return parse_response(requests.get(url))
    
def post(url, headers, data = {}):
    return parse_response(requests.post(url, data, headers=headers))

def put(url, headers, data = {}):
    return parse_response(requests.put(url, data, headers=headers))

---
## Finding Concepts

In [5]:
# Get a list of all concepts that were used in annotations
concept_url = observation_url + "/concepts"
concepts = pretty_get(concept_url)
print("---------PARSED RESPONSE------------")
print(concepts)


-----------REQUEST-----------
GET http://134.89.11.93:8082/anno/v1/observations/concepts


None
---------PARSED RESPONSE------------
['1-gallon paint bucket', '2G Robotics structured light laser', '55-gallon drum', "a'a", 'abandoned research equipment', 'Abraliopsis', 'Abyssocucumis', 'Abyssocucumis abyssorum', 'Acanella', 'Acanthamunnopsis', 'Acanthamunnopsis milleri', 'Acanthascinae', 'Acanthascinae sp. 1', 'Acanthephyra', 'Acanthogorgia', 'Acantholiparis', 'Acanthoptilum', 'Acesta', 'Acesta mori', 'Acesta sphoni', 'Acharax', 'Acoustic Current Meter', 'Acoustic Doppler Current Profiler', 'Acoustic Probe', 'Acrocirridae', 'Actinauge', 'Actinauge verrillii', 'Actinernidae', 'Actinernus', 'Actiniaria', 'Actiniidae', 'Actiniidae sp. 1', 'Actinopteri', 'Actinoscyphia', 'Actinoscyphiidae', 'Actinostola', 'Actinostolidae', 'adhesive', 'Aegina', 'Aegina citrea', 'Aegina rosea', 'Aeginidae', 'Aeginura', 'Aeginura grimaldii', 'Aeolidiidae', 'Aequorea', 'Aequorea aequorea', 'Agalmatidae', 'Agla
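
The full list is long, so it often helps to narrow it before choosing a concept to mine. A minimal sketch, using a handful of names copied from the response above; `find_concepts` is a hypothetical helper, not part of the notebook or the API:

```python
def find_concepts(concepts, text):
    """Return the concept names that contain `text`, ignoring case (hypothetical helper)."""
    needle = text.lower()
    return [c for c in concepts if needle in c.lower()]

# A small sample taken from the parsed response above
sample = ["Aegina", "Aegina citrea", "Aegina rosea", "Aeginidae",
          "Aeginura", "Aequorea", "adhesive"]
print(find_concepts(sample, "aegina"))
```

In the notebook itself the call would be `find_concepts(concepts, "aegina")`, using the list returned by the cell above.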

---
## Image Mining

For help, see the [Annotation API docs](https://app.swaggerhub.com/apis/mbari/annosaurus/1.0.1-oas3).

In [6]:
# Concept of interest
concept = "Aegina"

# Get a count of images of interest
img_count_url = imaged_moments_url + "/concept/images/count/" + concept
img_count = pretty_get(img_count_url)["count"]

# Fetch a few of the images
img_url = imaged_moments_url + "/concept/images/" + concept + "?limit=2"
imgs = pretty_get(img_url)
print("---------PARSED RESPONSE------------")
print(imgs)


-----------REQUEST-----------
GET http://134.89.11.93:8082/anno/v1/imagedmoments/concept/images/count/Aegina


None
URL: http://134.89.11.93:8082/anno/v1/imagedmoments/concept/images/count/Aegina
404 (Not Found): Requesting "GET /concept/images/count/Aegina" on servlet "/v1/imagedmoments" but only have: <ul><li>DELETE /:uuid</li><li>DELETE /videoreference/:uuid</li><li>GET /</li><li>GET /:uuid</li><li>GET /concept/:name</li><li>GET /concept/count/:name</li><li>GET /imagereference/:uuid</li><li>GET /modified/:start/:end</li><li>GET /modified/count/:start/:end</li><li>GET /observation/:uuid</li><li>GET /videoreference</li><li>GET /videoreference/:uuid</li><li>PUT /:uuid</li></ul>



KeyError: 'count'
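
The 404 response above lists the routes this server actually exposes under `/v1/imagedmoments`, including `GET /concept/:name` and `GET /concept/count/:name`. The `KeyError` follows because `parse_response` returns `{}` when a request fails. A hedged sketch of a more defensive version: it targets the listed count route (which presumably counts imaged moments rather than images), percent-encodes the concept name, and reads `count` without raising. `concept_count_url` and `safe_count` are hypothetical helpers, the base URL is a placeholder, and the `count` key in a successful response is an assumption carried over from the original cell.

```python
from urllib.parse import quote

def concept_count_url(imaged_moments_url, concept):
    """Build the URL for the GET /concept/count/:name route listed in the 404 message."""
    # quote() percent-encodes spaces and apostrophes, e.g. in "55-gallon drum" or "a'a"
    return imaged_moments_url + "/concept/count/" + quote(concept, safe="")

def safe_count(response):
    """Return response["count"] if present, else None (parse_response gives {} on errors)."""
    if isinstance(response, dict):
        return response.get("count")
    return None

base = "http://localhost:8082/anno/v1/imagedmoments"  # placeholder host
print(concept_count_url(base, "55-gallon drum"))
print(safe_count({}))
```

In the notebook this could be used as `img_count = safe_count(pretty_get(concept_count_url(imaged_moments_url, concept)))`, followed by a check for `None` before fetching any images.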