# Image Searching
This notebook searches ImageNet for images related to a keyword.
It works as follows:

1. The user enters a keyword and the number of images to retrieve.
2. The keyword is looked up in WordNet (through NLTK) to obtain its synset ID.
3. A parent (hypernym) of the synset is selected as the source of similar images.
4. Up to 5 hyponyms of that hypernym (siblings of the synset) are collected.
5. A random synset is chosen to supply unrelated images.
6. A set number of images is retrieved from each synset:
   - About 30% of the images match the keyword exactly.
   - About 50% come from sibling synsets of the keyword.
   - About 20% are unrelated images from the random synset.
   
The percentage of exact, related, and unrelated images is subject to change depending on what works best for the neural net.
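
The split described above can be sketched as a small helper. This is only an illustration: the function name `split_image_count` and its parameters are assumptions, not part of the notebook, and the 30/50/20 shares are simply the defaults quoted in the list.

```python
def split_image_count(total, exact_share=0.3, related_share=0.5):
    """Split `total` images into (exact, related, unrelated) counts.

    Hypothetical helper: the shares default to the 30% / 50% split quoted
    above, and the unrelated bucket takes whatever remains.
    """
    exact = round(total * exact_share)
    related = round(total * related_share)
    # The remainder goes to the unrelated bucket so the counts sum to `total`.
    unrelated = total - exact - related
    return exact, related, unrelated


print(split_image_count(10))  # (3, 5, 2)
```

Computing the last bucket as a remainder keeps the three counts consistent with the requested total even when rounding shifts the first two.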

## Imports and Initialization

In [None]:
import requests
import random
import shutil
from IPython.display import Image

# Download wordnet corpus using nltk
from nltk import download
download("wordnet")

# Import the wordnet from nltk corpus
from nltk.corpus import wordnet as wn

API = {
    'allsynsets': "http://image-net.org/api/text/imagenet.synset.obtain_synset_list",
    'wordsfor': "http://image-net.org/api/text/wordnet.synset.getwords?wnid={}",
    'urlsfor': "http://image-net.org/api/text/imagenet.synset.geturls?wnid={}",
    'hyponymfor': "http://image-net.org/api/text/wordnet.structure.hyponym?wnid={}",
}

# Fetch the IDs (wnids) of every synset available in ImageNet, one per line
synsets = requests.get(API['allsynsets']).content.decode().splitlines()

In [None]:
def getSynsetId(synset):
    """Build the ImageNet ID for a noun synset: 'n' plus the zero-padded 8-digit offset."""
    return "n{}".format(str(synset.offset()).zfill(8))
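
The ID format used by `getSynsetId` can be illustrated without NLTK. The sketch below works on plain integers; `wnid_from_offset` and `offset_from_wnid` are hypothetical names introduced here for illustration only.

```python
def wnid_from_offset(offset, pos="n"):
    """Format a WordNet offset as an ID: part-of-speech letter plus 8 digits."""
    return "{}{:08d}".format(pos, offset)


def offset_from_wnid(wnid):
    """Split an ID such as 'n02121620' back into (pos, offset)."""
    return wnid[0], int(wnid[1:])


print(wnid_from_offset(2121620))      # n02121620
print(offset_from_wnid("n02121620"))  # ('n', 2121620)
```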

## 1. User Prompt

In [None]:
keyword = input("Keyword: ")
imgCount = int(input("Image Count: "))

## 2. Obtain Synset ID
Hyponym: a child of the synset (a more specific concept)  
Hypernym: a parent of the synset (a more general concept)
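
To make the terminology concrete, here is a minimal sketch using a tiny hand-written hierarchy instead of WordNet. The `HYPONYMS` map and both helper functions are hypothetical and exist only for illustration; the notebook itself uses NLTK's `hypernyms()` and `hyponyms()`.

```python
from itertools import islice

# Hypothetical parent -> children map, written by hand for illustration.
HYPONYMS = {
    "feline": ["cat", "big_cat"],
    "big_cat": ["lion", "tiger"],
}


def hypernym_of(name):
    """Return the parent of `name`, or None if it has no parent in the map."""
    for parent, children in HYPONYMS.items():
        if name in children:
            return parent
    return None


def siblings_of(name, limit=5):
    """Return up to `limit` other children of the parent of `name`."""
    parent = hypernym_of(name)
    if parent is None:
        return []
    others = (child for child in HYPONYMS[parent] if child != name)
    return list(islice(others, limit))


print(hypernym_of("cat"))  # feline
print(siblings_of("cat"))  # ['big_cat']
```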

In [None]:
# Use the first noun synset for the keyword; fail clearly if WordNet has none
synset = next(iter(wn.synsets(keyword, pos=wn.NOUN)), None)
if synset is None:
    raise ValueError("No noun synset found for '{}'".format(keyword))
synsetId = getSynsetId(synset)
print("{} : {} : {}".format(keyword, synset, synsetId))

synInImagenet = synsetId in synsets
print("In imagenet? {}".format(synInImagenet))

## 3. Obtain Synset Parent

In [None]:
# A synset can have several hypernyms; pick one at random
parent = random.choice(synset.hypernyms())
print(parent)

## 4. Obtain Siblings of Synset

In [77]:
# Collect up to 5 hyponyms of the parent, excluding the synset itself
siblings = [s for s in parent.hyponyms() if s != synset][:5]

for sibling in siblings:
    print(sibling)

Synset('big_cat.n.01')


## 5. Obtain Random Synset

In [112]:
randomSynsetId = random.choice(synsets)
# Resolve the synset directly from the ID's offset, so the synset always matches
# the ID (looking it up as "<word>.n.01" can return a different sense of the word)
randomSynset = wn.synset_from_pos_and_offset("n", int(randomSynsetId[1:]))

print(randomSynset)

Synset('coat.n.01')


## 6. Display Obtained Synsets

In [113]:
print("Synset:")
print("-------")
print("{} Id('{}')\n".format(synset, synsetId))

print("Siblings:")
print("-------")
for sibling in siblings:
    print("{} Id('{}')\n".format(sibling, getSynsetId(sibling)))

print("Random:")
print("-------")
print("{} Id('{}')\n".format(randomSynset, randomSynsetId))

Synset:
-------
Synset('cat.n.01') Id('n02121620')

Siblings:
-------
Synset('big_cat.n.01') Id('n02127808')

Random:
-------
Synset('coat.n.01') Id('n03057021')
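
The cells above stop before the image retrieval described in step 6 of the overview. A minimal sketch of one way to pick image URLs for a synset follows, assuming the `urlsfor` endpoint returns plain text with one URL per line. The function `sample_image_urls` is hypothetical, and the listing below is placeholder data rather than a real API response.

```python
import random


def sample_image_urls(url_listing, count, rng=random):
    """Pick up to `count` distinct URLs from a newline-separated listing.

    Hypothetical helper: `url_listing` is assumed to be the decoded text
    of a URL list, one URL per line, possibly with blank lines.
    """
    urls = [line.strip() for line in url_listing.splitlines() if line.strip()]
    # Never ask for more URLs than the listing contains.
    return rng.sample(urls, min(count, len(urls)))


# Placeholder listing, not real ImageNet data.
listing = "http://example.com/a.jpg\n\nhttp://example.com/b.jpg\nhttp://example.com/c.jpg\n"
picked = sample_image_urls(listing, 2, rng=random.Random(0))
print(len(picked))  # 2
```

Passing a seeded `random.Random` instance makes the selection reproducible, which can help when comparing different exact/related/unrelated splits.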

