# Examples Dictionary Generation

This notebook creates a JSON file containing links and metadata for examples of artwork used in the training dataset. The file is used by the Modern Art Classifier web application (notebook available at `/app/art-classifier-ui.ipynb`). When the classifier predicts an image's class, a random sample of each class is taken from this file. The metadata, including an example image of the class, is displayed in the user interface.

The file created by this notebook is called `exampleArtists.json`. Its structure follows this pattern:

```
{
    "Surrealism": [
        {
            "artistName": "Rene Magritte",
            "title": "The obsession",
            "style": "Surrealism",
            "link": "https://www.wikiart.org/en/rene-magritte",
            "imageUrl": "https://uploads6.wikiart.org/images/rene-magritte/the-obsession-1928(1).jpg"
        },
        ...
    ],
    "Expressionism": [
        {
            "artistName": "Pablo Picasso",
            "title": "Boy leading a horse",
            "style": "Expressionism",
            "link": "https://www.wikiart.org/en/pablo-picasso",
            "imageUrl": "https://uploads0.wikiart.org/images/pablo-picasso/boy-leading-a-horse-1906.jpg"
        },
        ...
    ],
    ...
}
```
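
As a rough illustration of how a consumer of this file might use it, the sketch below picks one random example for a predicted class. The function name `pick_example` and the in-memory `examples` dictionary are hypothetical stand-ins; the actual web application may load and sample the file differently.

```python
import random

# Hypothetical in-memory stand-in for the contents of exampleArtists.json,
# reusing the two entries shown in the pattern above.
examples = {
    "Surrealism": [
        {
            "artistName": "Rene Magritte",
            "title": "The obsession",
            "style": "Surrealism",
            "link": "https://www.wikiart.org/en/rene-magritte",
            "imageUrl": "https://uploads6.wikiart.org/images/rene-magritte/the-obsession-1928(1).jpg",
        }
    ],
    "Expressionism": [
        {
            "artistName": "Pablo Picasso",
            "title": "Boy leading a horse",
            "style": "Expressionism",
            "link": "https://www.wikiart.org/en/pablo-picasso",
            "imageUrl": "https://uploads0.wikiart.org/images/pablo-picasso/boy-leading-a-horse-1906.jpg",
        }
    ],
}


def pick_example(examples, predicted_class, rng=random):
    """Return one randomly chosen example for the class, or None if there is none."""
    candidates = examples.get(predicted_class, [])
    if not candidates:
        return None
    return rng.choice(candidates)


example = pick_example(examples, "Surrealism")
print(example["artistName"], "-", example["title"])
```

Returning `None` for an unknown or empty class lets the caller decide whether to hide the example panel or show a fallback.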

In [1]:
import os
import json
import requests
from artist_name_format import correct_artist_name

working_path = '/Users/tod/todgru/notes/pdev/wikiart-working'

classes_to_oversample = [
    'Abstract Art',
    'Abstract Expressionism', # grouped with 'Action painting'
    'Naïve Art (Primitivism)',
    'Op Art',
    'Suprematism', # grouped with 'Neo-Suprematism'
    'Street art',
]
classes = [
    'Surrealism',
    'Expressionism',
    'Cubism',
    'Pop Art',
    *classes_to_oversample
]

print(classes)
print("number of classes:", len(classes))


['Surrealism', 'Expressionism', 'Cubism', 'Pop Art', 'Abstract Art', 'Abstract Expressionism', 'Naïve Art (Primitivism)', 'Op Art', 'Suprematism', 'Street art']
number of classes: 10
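
The comments in the class list mention that two styles are grouped with a related class. One way to express that grouping is an alias table applied before filtering on the class list. The names `STYLE_ALIASES`, `CLASSES`, and `normalize_style` below are hypothetical and only sketch the idea; stripping whitespace around the style name is also an assumption.

```python
# Hypothetical alias table: each key is folded into the class on the right.
STYLE_ALIASES = {
    "Action painting": "Abstract Expressionism",
    "Neo-Suprematism": "Suprematism",
}

# Same ten class names as printed above, held in a set for fast lookup.
CLASSES = {
    "Surrealism", "Expressionism", "Cubism", "Pop Art", "Abstract Art",
    "Abstract Expressionism", "Naïve Art (Primitivism)", "Op Art",
    "Suprematism", "Street art",
}


def normalize_style(style_field):
    """Map a comma-separated style field to a class name, or None if unused."""
    if style_field is None:
        return None
    # Only the first listed style is considered (whitespace stripping is an assumption).
    first = style_field.split(",")[0].strip()
    # Apply the alias before the membership test so grouped styles are kept.
    style = STYLE_ALIASES.get(first, first)
    return style if style in CLASSES else None


print(normalize_style("Action painting"))
print(normalize_style("Cubism, Surrealism"))
print(normalize_style("Baroque"))
```

Applying the alias before the membership test matters: the aliased names are not themselves in the class list, so filtering first would discard them.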


In [6]:
# This is the same metadata used for the training dataset:
# JSON metadata files describing each artist's individual paintings
PATH_META = "./artist-meta-data"

exampleArtists = {}
for c in classes:
    exampleArtists[c] = []

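def parse_style_field(row):
    """Return the class name for an artwork's style field, or None if unused."""
    if row is None:
        return None
    # only the first style in the comma-separated list is considered
    style = row.split(",")[0]

    # map grouped styles onto their class before filtering; they are not
    # in `classes` themselves, so filtering first would discard them
    if style == 'Neo-Suprematism':
        style = 'Suprematism'
    elif style == 'Action painting':
        style = 'Abstract Expressionism'

    if style not in classes:
        return None
    return style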

def check_if_image_is_valid(url, check_url=False):
    """Strip any '!' suffix from the URL; optionally confirm that it resolves."""
    url = url.split("!")[0]
    # do not check url unless generating new data
    if not check_url:
        return url
    response = requests.head(url)
    print(str(response.status_code) + " : " + url)
    if response.status_code == 200:
        return url
    return None

for artist_meta_filename in os.listdir(PATH_META):
    # don't include these files when collating file name and art style
    if artist_meta_filename in ['artists.json', 'original-artists.json']:
        continue

    # read file
    with open(os.path.join(PATH_META, artist_meta_filename), 'r') as file:
        obj = json.load(file)

    for artwork in obj:
        # art meta: artistName, title, image, style, link:https://www.wikiart.org/en/artistUrl
        temp = {}
        temp["artistName"] = correct_artist_name[artwork['artistName'].rstrip()]
        temp["title"] = artwork['title']
        temp["style"] = parse_style_field(artwork['style'])
        temp["link"] = "https://www.wikiart.org/en/{}".format(artwork['artistUrl'])
        temp["imageUrl"] = check_if_image_is_valid(artwork['image'])

        # skip artworks without a usable image URL or a style in `classes`
        if temp["imageUrl"] is None or temp["style"] is None:
            continue
        exampleArtists[temp["style"]].append(temp)
                

total = 0
for style, examples in exampleArtists.items():
    count = len(examples)
    total += count
    print("example paintings per style, {}:".format(style), count)
print("total example paintings", total)

# Write json to file
with open('./app/exampleArtists.json', 'w', encoding='utf-8') as file:
    json.dump(exampleArtists, file, ensure_ascii=False, indent=4)
            
            

example paintings per style, Surrealism: 784
example paintings per style, Expressionism: 562
example paintings per style, Cubism: 411
example paintings per style, Pop Art: 341
example paintings per style, Abstract Art: 210
example paintings per style, Abstract Expressionism: 41
example paintings per style, Naïve Art (Primitivism): 192
example paintings per style, Op Art: 165
example paintings per style, Suprematism: 118
example paintings per style, Street art: 77
total example paintings 2901
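
Before the generated file is used by the application, a quick structural check can catch entries that do not match the pattern described at the top of this notebook. The helper `find_problems` below is a hypothetical sketch, not part of the notebook. It assumes every entry should carry the five keys shown in the example pattern and that each entry's `style` should equal the key it is filed under. The sample data is made up for illustration.

```python
REQUIRED_KEYS = {"artistName", "title", "style", "link", "imageUrl"}


def find_problems(examples):
    """Return a list of human-readable problems found in the examples dict."""
    problems = []
    for style, entries in examples.items():
        for index, entry in enumerate(entries):
            missing = REQUIRED_KEYS - set(entry)
            if missing:
                problems.append("{}[{}]: missing {}".format(style, index, sorted(missing)))
            elif entry["style"] != style:
                problems.append("{}[{}]: filed under the wrong style".format(style, index))
    return problems


# Made-up placeholder entries, not real notebook output.
good = {
    "artistName": "Example Artist",
    "title": "Untitled",
    "style": "Cubism",
    "link": "https://example.com/artist",
    "imageUrl": "https://example.com/image.jpg",
}
no_image = {k: v for k, v in good.items() if k != "imageUrl"}
wrong_style = dict(good, style="Pop Art")

sample = {"Cubism": [good, no_image, wrong_style]}
for problem in find_problems(sample):
    print(problem)
```

To check the real output, the same function could be called on the result of `json.load` applied to `exampleArtists.json`.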
