# Object Identification (TensorFlow Hub) - Image URLs

This notebook checks whether a given object is present in a set of images. When the object is found, the result is shown as a dataframe with the file name, score and bounding boxes.
We use a dataset from [UCF](https://www.crcv.ucf.edu/data/GMCP_Geolocalization/#Dataset) that contains 62,058 high-quality Google Street View images, and [TensorFlow Hub](https://www.tensorflow.org/hub), a library and platform for transfer learning.

## Dependencies

This notebook has been tested with Python 3.10.6 and the following package versions:

- pandas==1.5.2
- verta==0.21.1

In [None]:
!pip install pandas==1.5.2
!pip install verta==0.21.1

## Imports

In [None]:
import multiprocessing as mp
import os
import pandas as pd
import time

from verta import Client

## Helper Functions

In [None]:
def load_urls():
    urls = [
        'http://s3.amazonaws.com/verta-starter/street-view-images/000001_0.jpg',
        'http://s3.amazonaws.com/verta-starter/street-view-images/000001_1.jpg',
        'http://s3.amazonaws.com/verta-starter/street-view-images/000001_2.jpg',
        'http://s3.amazonaws.com/verta-starter/street-view-images/000001_3.jpg',
        'http://s3.amazonaws.com/verta-starter/street-view-images/000001_4.jpg',
        'http://s3.amazonaws.com/verta-starter/street-view-images/000001_5.jpg',
        'http://s3.amazonaws.com/verta-starter/street-view-images/000002_0.jpg',
        'http://s3.amazonaws.com/verta-starter/street-view-images/000002_1.jpg',
        'http://s3.amazonaws.com/verta-starter/street-view-images/000002_2.jpg',
        'http://s3.amazonaws.com/verta-starter/street-view-images/000002_3.jpg'
    ]
    
    return [[url.strip().split('/')[-1], url.strip()] for url in urls]

In [None]:
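def show_metrics(n_urls, n_threads, start_time, end_time):
    # divmod keeps the minute count from wrapping at one hour,
    # which time.gmtime() combined with '%M' would do.
    minutes, seconds = divmod(int(end_time - start_time), 60)
    total_time = f'{minutes:02d}m {seconds:02d}s'

    print(f'URLs processed: {n_urls}.')
    # mp.Pool starts worker processes, so n_threads is a process count.
    print(f'Worker processes: {n_threads}.')
    print(f'Total time: {total_time}.')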

In [None]:
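def show_results(results):
    # Each result is expected to be a dict whose last key, 'bboxes', maps to
    # a dict of bounding-box values; every other key holds a scalar field.
    if not results:
        print('No results to show.')
        return

    cols = list(results[0].keys())[:-1]
    cols.extend(results[0]['bboxes'].keys())
    data = []

    for item in results:
        # Slice the same way as for cols so values and column names line up.
        values = list(item.values())[:-1]
        values.extend(item['bboxes'].values())
        data.append(values)

    df = pd.DataFrame(data, columns=cols)
    print(df)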
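
To make the expected input concrete, the sketch below flattens two made-up records into the same column layout that `show_results` builds, and also writes the table to a CSV file in case it should be kept. The record keys (`file`, `object`, `score`) and the bounding-box keys are assumptions for illustration only; the real output of the deployed model may differ. `flatten_results` is a hypothetical helper, not part of this notebook.

```python
import os
import tempfile

import pandas as pd


def flatten_results(results):
    """Flatten records with a nested 'bboxes' dict into one row per record."""
    rows = []
    for item in results:
        # Copy every scalar field, then merge in the bounding-box values.
        row = {key: value for key, value in item.items() if key != 'bboxes'}
        row.update(item['bboxes'])
        rows.append(row)
    return pd.DataFrame(rows)


# Made-up records: key names and values are assumptions, not real model output.
sample_results = [
    {'file': '000001_0.jpg', 'object': 'car', 'score': 0.91,
     'bboxes': {'ymin': 0.10, 'xmin': 0.20, 'ymax': 0.55, 'xmax': 0.60}},
    {'file': '000001_1.jpg', 'object': 'car', 'score': 0.78,
     'bboxes': {'ymin': 0.30, 'xmin': 0.05, 'ymax': 0.70, 'xmax': 0.40}},
]

df = flatten_results(sample_results)

# Write the table to a temporary CSV file and read it back as a round-trip check.
csv_path = os.path.join(tempfile.mkdtemp(), 'results.csv')
df.to_csv(csv_path, index=False)
print(pd.read_csv(csv_path))
```

Building each row from a dict means the column names come from the keys themselves, so the layout does not depend on `bboxes` being the last key of the record.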

## Main Function
After setting up Verta, it processes the images in parallel, presents metrics and prints the results as a dataframe.

In [None]:
def main():
    VERTA_HOST = 'app.verta.ai'
    ENDPOINT_NAME = 'object-detection-url'

    # Fill in your Verta credentials before running.
    os.environ['VERTA_EMAIL'] = ''
    os.environ['VERTA_DEV_KEY'] = ''

    client = Client(VERTA_HOST, debug=True)
    endpoint = client.get_or_create_endpoint(ENDPOINT_NAME)
    model = endpoint.get_deployed_model()

    n_threads = mp.cpu_count()  # one worker process per CPU core
    n_urls = 5
    urls = load_urls()[:n_urls]

    start_time = time.time()
    pool = mp.Pool(n_threads)
    # chunksize=1 hands out one URL per task, so the remaining-task count
    # printed below matches the number of URLs still to be processed.
    map_results = pool.map_async(model.predict, urls, chunksize=1)

    while not map_results.ready():
        # _number_left is a private attribute and may change between Python versions.
        print(f"URLs remaining: {map_results._number_left}")
        time.sleep(1.5)

    results = map_results.get()
    pool.close()
    pool.join()
    end_time = time.time()

    show_metrics(n_urls, n_threads, start_time, end_time)
    show_results(results)
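
The fan-out pattern used in `main()` can be tried without a deployed endpoint. The sketch below is a minimal stand-in, not the notebook's method: it swaps `mp.Pool` for `multiprocessing.pool.ThreadPool`, on the assumption that prediction calls spend most of their time waiting on the network, and it replaces `model.predict` with `fake_predict`, a hypothetical function that only sleeps. The `example.com` URLs are placeholders.

```python
import time
from multiprocessing.pool import ThreadPool


def fake_predict(item):
    """Hypothetical stand-in for model.predict; takes a [file_name, url] pair."""
    file_name, url = item
    time.sleep(0.2)  # simulate network latency
    return {'file': file_name, 'url': url}


# Placeholder inputs in the same [file_name, url] shape that load_urls() returns.
items = [[f'{i:06d}_0.jpg', f'http://example.com/{i:06d}_0.jpg'] for i in range(1, 6)]

start_time = time.time()
with ThreadPool(4) as pool:
    async_result = pool.map_async(fake_predict, items, chunksize=1)
    while not async_result.ready():
        # wait() returns after the timeout or as soon as all tasks have finished.
        async_result.wait(timeout=0.1)
    results = async_result.get()
elapsed = time.time() - start_time

print(f'Processed {len(results)} items in {elapsed:.2f}s')
```

`map_async` returns results in input order, so each entry in `results` lines up with the corresponding entry in `items`. Polling with `wait(timeout=...)` relies only on the public `AsyncResult` interface.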

## Execution

In [None]:
if __name__ == '__main__':
    main()