## Finding Images and Using WordNet

This notebook demonstrates how to download images from Google Image Search and how to use WordNet to find related search terms.

In [1]:
import compsyn
import os

The ```compsyn.helperfunctions``` module contains helper functions that download images and use NLTK's WordNet interface to find additional search terms.

In [2]:
from compsyn.helperfunctions import (
    get_wordnet_tree_data,
    search_and_download,
    run_google_vision,
    write_img_classifications_to_file,
)

### Settings

In [3]:
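number_images = 100          # images to request per search term
search_terms = ['emotion']   # seed term(s) to start from
filter_data = True           # run the Google Vision filter after downloading
get_tree_data = True         # expand the seed terms using WordNet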

### Get WordNet data

In [4]:
home = os.getcwd()

In [5]:
n_categories = 10  # maximum number of search terms to keep

In [6]:
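if get_tree_data:
    print("Adding Search Terms from Tree")
    # Expand the seed terms with related terms from WordNet
    tree_search_terms, raw_tree, all_tree_data = get_wordnet_tree_data(search_terms, home)
    # Keep only the first n_categories terms (slicing makes a copy of the list)
    search_terms = tree_search_terms[:n_categories]
    print(all_tree_data.head())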

Adding Search Terms from Tree
  ref_term                        new_term     role  \
0  emotion                           anger  hyponym   
1  emotion                         anxiety  hyponym   
2  emotion  conditioned_emotional_response  hyponym   
3  emotion                 emotional_state  hyponym   
4  emotion                            fear  hyponym   

                                          synset Branch_fact Num_senses  
0                           Synset('anger.n.01')          19          5  
1                         Synset('anxiety.n.02')           6          2  
2  Synset('conditioned_emotional_response.n.01')           1          1  
3                 Synset('emotional_state.n.01')          16          1  
4                            Synset('fear.n.01')          25          8  
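
The table above lists WordNet hyponyms of the seed term, with lemma names such as ```conditioned_emotional_response``` that later appear as plain search phrases. As a rough illustration of where such terms can come from, here is a minimal sketch that queries NLTK's WordNet interface directly. It is not the implementation of ```get_wordnet_tree_data```; the function names ```lemma_to_search_term``` and ```hyponym_search_terms``` are hypothetical, and the sketch assumes NLTK and its WordNet corpus are installed.

```python
def lemma_to_search_term(lemma_name: str) -> str:
    """Turn a WordNet lemma name into a plain search phrase."""
    return lemma_name.replace("_", " ")


def hyponym_search_terms(word: str) -> list:
    """Collect search phrases for the direct hyponyms of each noun sense of `word`.

    Assumes NLTK is installed and the corpus has been fetched once with
    nltk.download("wordnet").
    """
    # Imported lazily so lemma_to_search_term works without NLTK installed
    from nltk.corpus import wordnet as wn

    terms = []
    for synset in wn.synsets(word, pos=wn.NOUN):
        for hyponym in synset.hyponyms():
            # Use the first lemma of each hyponym synset as its search phrase
            term = lemma_to_search_term(hyponym.lemma_names()[0])
            if term not in terms:
                terms.append(term)
    return terms
```

Calling ```hyponym_search_terms("emotion")``` should return phrases comparable to the ```new_term``` column, although the helper in ```compsyn``` may order, rank, or extend them differently.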


In [7]:
tree_search_terms

['emotion',
 'conditioned emotional response',
 'joy',
 'love',
 'hate',
 'emotional state',
 'feeling',
 'anxiety',
 'anger',
 'fear']

In [8]:
search_terms

['emotion',
 'conditioned emotional response',
 'joy',
 'love',
 'hate',
 'emotional state',
 'feeling',
 'anxiety',
 'anger',
 'fear']

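Some terms are too abstract to return clear images, so we may want to remove them from the list. Because ```pop``` removes by index, the positions shift after the first call, so ```pop(2)``` below removes 'love' rather than 'joy'.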

In [9]:
search_terms.pop(1)

'conditioned emotional response'

In [10]:
search_terms.pop(2)

'love'
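
Removing terms by index depends on the current order of the list. A minimal alternative sketch is to filter by value instead, which gives the same result regardless of position. The helper name ```drop_terms``` is hypothetical and not part of ```compsyn```.

```python
def drop_terms(terms, unwanted):
    """Return a new list without the unwanted terms, preserving order."""
    unwanted = set(unwanted)
    return [term for term in terms if term not in unwanted]


example_terms = [
    'emotion', 'conditioned emotional response', 'joy', 'love', 'hate',
    'emotional state', 'feeling', 'anxiety', 'anger', 'fear',
]
kept = drop_terms(example_terms, ['conditioned emotional response', 'love'])
print(kept)
```

This leaves the same eight terms as the two ```pop``` calls above, without having to track how the indices move.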

### Download Images

In [11]:
# Path to the chromedriver binary on your machine; change this to your own location
DRIVER_PATH = "/Users/bhargavvader/open_source/comp-syn/chromedriver"
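
Since the driver path above is specific to one machine, it can help to check that the file exists before starting a long download. The following is a minimal sketch under stated assumptions: the helper ```resolve_driver_path``` and the environment variable name ```CHROMEDRIVER_PATH``` are hypothetical conveniences, not something ```compsyn``` defines.

```python
import os
import shutil


def resolve_driver_path(explicit_path=None, env_var="CHROMEDRIVER_PATH"):
    """Return the first existing chromedriver path among several candidates.

    Candidates, in order: the explicit path, the (hypothetical) environment
    variable, and a `chromedriver` executable found on the system PATH.
    """
    candidates = [
        explicit_path,
        os.environ.get(env_var),
        shutil.which("chromedriver"),
    ]
    for candidate in candidates:
        if candidate and os.path.isfile(candidate):
            return candidate
    raise FileNotFoundError(
        "No chromedriver found; pass a path or set " + env_var
    )
```

For example, ```DRIVER_PATH = resolve_driver_path("/path/to/chromedriver")``` fails immediately with a clear message if the file is missing, rather than partway through the search loop.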

In [12]:
img_urls_dict = {}
for search_term in search_terms:
    print(search_term)
    # Search for the term, download the images, and keep the returned URLs
    urls = search_and_download(search_term=search_term, driver_path=DRIVER_PATH,
                               home=home, number_images=number_images)
    img_urls_dict[search_term] = urls

emotion
Found: 100 search results. Extracting links from 0:100
Found: 101 image links, done!
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcQ2FqCXoQqCqQ82R1YC-0uhSTcDgSn3fZ-f09MbzRTnM4QRYWDX&usqp=CAU - as ./downloads/emotion/455661512c.jpg
SUCCESS - saved https://thumbs.dreamstime.com/z/emotion-scale-speedometer-emotions-portraits-infographic-control-element-customer-satisfaction-quality-score-rating-faces-154466813.jpg - as ./downloads/emotion/2a4e8f3e20.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcQ5OVcMm3akqXyp-mdOYh7r2Fe-JmZiiOaDnMxX3F-zXaMnub1V&usqp=CAU - as ./downloads/emotion/6ca9217c77.jpg
SUCCESS - saved https://www.researchgate.net/profile/Radoslaw_Nielek/publication/319045412/figure/fig1/AS:541648786554880@1506150542812/Plutchik-wheel-of-emotion.png - as ./downloads/emotion/b0bdbcd0f3.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcQ2rPnX3HyrnMZ56VYd-e7ket1y9P-3M3hZTAsFupbMKXumaYEM&usqp=CAU -

SUCCESS - saved https://barrel.blog/wp-content/uploads/2018/08/emotionindex-1200x628.jpg - as ./downloads/emotion/122440db0a.jpg
SUCCESS - saved https://upload.wikimedia.org/wikipedia/commons/thumb/c/ce/Plutchik-wheel.svg/220px-Plutchik-wheel.svg.png - as ./downloads/emotion/380a0da359.jpg


  "Palette images with Transparency expressed in bytes should be "


SUCCESS - saved https://i0.wp.com/flowingdata.com/wp-content/uploads/2015/06/Inside-Out-Emotion-Combos.png?fit=620%2C577&ssl=1 - as ./downloads/emotion/9b7a1a9fd0.jpg
SUCCESS - saved https://ww2.kqed.org/app/uploads/sites/23/2019/11/emotion-scientists.jpg - as ./downloads/emotion/8db69e8eaa.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcTh3ZGPiA4vVs1Q4_d73xgJ4WPT880qgE4Can6wR_scEzIuRyD6&usqp=CAU - as ./downloads/emotion/1fadc0ab5c.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcTphvvWBzUAOp9J7kQtEjv3e29_Df-kcg82tvpy22RpohXdD192&usqp=CAU - as ./downloads/emotion/8269989413.jpg
SUCCESS - saved https://i.ytimg.com/vi/3zPJMU3PxuY/maxresdefault.jpg - as ./downloads/emotion/80ad0b065a.jpg
SUCCESS - saved https://i.pinimg.com/originals/dc/7d/18/dc7d186ac389a22090de717f2302bd79.png - as ./downloads/emotion/5b601924cb.jpg
ERROR - Could not save https://martechtoday.com/wp-content/uploads/2020/03/emotion-buttons-stock-1920.jpg - cannot ide

Found: 200 search results. Extracting links from 0:200
Found: 101 image links, done!
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcTz9HsOITRJCm3XnklAhWrymcY0Z8DpY9zUKToWw0ZhwO2K2DVK&usqp=CAU - as ./downloads/joy/2f8c549313.jpg
SUCCESS - saved https://thenypost.files.wordpress.com/2020/02/alexa-anya-taylor-joy-9.jpg?quality=80&strip=all&w=978&h=652 - as ./downloads/joy/6d287b69c7.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcTV4y6LOimYm4t4PjF--oX9kdI4meX1gBGfAmPGg08fn7YfZx3C&usqp=CAU - as ./downloads/joy/5c7a80b612.jpg
SUCCESS - saved https://mitpress.mit.edu/sites/default/files/styles/large_book_cover/http/mitp-content-server.mit.edu%3A18180/books/covers/cover/%3Fcollid%3Dbooks_covers_0%26isbn%3D9780262042871%26type%3D.jpg?itok=r3uyyomS - as ./downloads/joy/e47a8b4a47.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcRh13258AkFLpBnK4SSv166ytjfvzEqsWO03155SV61W-h_QNNX&usqp=CAU - as ./downloads/joy/b4560d41

SUCCESS - saved https://media.swncdn.com/via/6879-istockgetty-images-plusdigitalskillet-1.jpg - as ./downloads/joy/7b3beebd3a.jpg
SUCCESS - saved https://pmcvariety.files.wordpress.com/2020/03/joy-behar.jpg?w=681&h=383&crop=1 - as ./downloads/joy/10aba0c59f.jpg
ERROR - Could not save https://www.longislandpress.com/wp-content/uploads/2018/05/Joy-Mangano-.jpg - cannot identify image file <_io.BytesIO object at 0x105334d70>
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcRq-UCAXBB1PMTx36oeR601F_AXvIt_xSyhsX8lRwx4zgMCvSns&usqp=CAU - as ./downloads/joy/bab4c30fc3.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcTXoAGqGdjdER5vTMRslxnDWL4BWCwcwW9cGt-ERqd-Dmgr_OOO&usqp=CAU - as ./downloads/joy/4b9120a6d5.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcTKW18U3S4u1QyHsxPl1_xDJHImY82h_gemsD91F35zqn64kmDH&usqp=CAU - as ./downloads/joy/2299493d81.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9

SUCCESS - saved https://upload.wikimedia.org/wikipedia/commons/4/41/Joy.png - as ./downloads/joy/cef009e0f6.jpg
SUCCESS - saved https://feedmore.org/wp-content/uploads/FM_Holiday_Instagram-Post.png - as ./downloads/joy/487504bf92.jpg
SUCCESS - saved https://withjoy.dexecure.net/assets/img/joy_logo_transparent.png?opt=aggressive - as ./downloads/joy/3d29478219.jpg
SUCCESS - saved https://a57.foxnews.com/static.foxnews.com/foxnews.com/content/uploads/2020/03/931/524/Joy-Behar.jpg?ve=1&tl=1 - as ./downloads/joy/9405d1d61e.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcQQzEOP3f1Sw7KXs6yx3Qpz7k5E1HbbauOR2zJZ79ojefL0B05c&usqp=CAU - as ./downloads/joy/913cb8a638.jpg
hate
Found: 200 search results. Extracting links from 0:200
Found: 96 image links, looking for more ...
emotional state
Found: 200 search results. Extracting links from 0:200
Found: 91 image links, looking for more ...
feeling
Found: 100 search results. Extracting links from 0:100
Found: 17 image links

SUCCESS - saved https://miro.medium.com/max/4800/1*7iSxaz9_H9bQvN_uhcLRAQ.jpeg - as ./downloads/anger/565a00b8ef.jpg
SUCCESS - saved https://www.bphope.com/wp-content/uploads/2019/12/bipolar-anger-management-strategies-840061704.gif - as ./downloads/anger/0edca16e44.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcR6L0yNXCGbAmf9W26pNIZKo5ZV-HIbRkMvtXkqtTRG4sZz4K75&usqp=CAU - as ./downloads/anger/9c4c4d91bf.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcQ4XM47Hp7u82J5qCKPGfDvbNUmzRhIhWTsqxcmXwC89mokkXqs&usqp=CAU - as ./downloads/anger/29459d7e35.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcQLWEVwh4P4lk-EiOVXkR65yWOTRRXYrMu_7zrSMOhZFiZKjog-&usqp=CAU - as ./downloads/anger/8422c37173.jpg
SUCCESS - saved https://i.ytimg.com/vi/BsVq5R_F6RA/maxresdefault.jpg - as ./downloads/anger/fcce5de60e.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcSwNJaAzABtxYoM0AK6BXiimryyuGY4G-RM7zLh2r

SUCCESS - saved https://media4.s-nbcnews.com/i/newscms/2017_37/2156151/170915-anger-screaming-stock-njs-12p_b54ffc85cdc4c9170a757211f51069f2.jpg - as ./downloads/anger/eae4841726.jpg
SUCCESS - saved https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcRljHKt3DVo-H-ANDfeg8NarsISbbBnCS8HWJS30FAtxk2eK9Y5&usqp=CAU - as ./downloads/anger/5449fccd1e.jpg
SUCCESS - saved https://www.healthyplace.com/sites/default/files/uploads/2016/11/anger-depression-symptom.jpg - as ./downloads/anger/2e59b6a162.jpg
SUCCESS - saved https://www.hsdinstitute.org/assets/images/face-anger-with-focus.2.png - as ./downloads/anger/af8d9399c4.jpg
SUCCESS - saved https://cdn10.bigcommerce.com/s-hi7gs6/product_images/uploaded_images/release-anger-spells-rituals.jpg?t=1575401498 - as ./downloads/anger/c936baf28a.jpg
SUCCESS - saved https://miro.medium.com/max/960/1*YZIKJ_a_eAGurNOa2PyFVw.jpeg - as ./downloads/anger/588da7351e.jpg
SUCCESS - saved https://www.apa.org/images/anger-title-image_tcm7-230128.jpg - as ./down
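
The log above shows that some downloads fail, so a folder may end up with fewer images than requested. A minimal sketch for checking how many files were actually saved per term follows. It assumes the layout seen in the log, ```./downloads/<term>/<name>.jpg```; the helper ```count_downloads``` is hypothetical, and the folder names used for multi-word terms are not visible in the log, so they may differ from the term itself.

```python
import os


def count_downloads(terms, root="downloads"):
    """Count the .jpg files saved under root/<term> for each search term."""
    counts = {}
    for term in terms:
        folder = os.path.join(root, term)
        if os.path.isdir(folder):
            counts[term] = sum(
                1 for name in os.listdir(folder)
                if name.lower().endswith(".jpg")
            )
        else:
            # No folder means nothing was saved under this exact name
            counts[term] = 0
    return counts
```

Running ```count_downloads(search_terms, root=os.path.join(home, "downloads"))``` after the loop gives a quick per-term summary before moving on to filtering.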

### Run Google Vision Filter 

In [None]:
if filter_data:
    # Classify the downloaded image URLs with Google Vision, then write the results to file
    img_classified_dict = run_google_vision(img_urls_dict)
    write_img_classifications_to_file(home, search_terms, img_classified_dict)

You should now have up to ```number_images``` images (100 here, fewer where downloads failed) for each term in ```search_terms``` saved on your machine. You can now run the analysis presented in the ```compsyn_package_pipeline```. Note that 