# Network of Painters: building a dataset from paintings datasets, then creating links

The aim of this project is to build a dataset of painters from sources such as WikiArt and Art500k by combining their features, filling in missing painter data through web scraping (Google and the Wiki API), and then creating links between painters based on stylistic similarity and on geographical and social interaction.

Note: One long-term goal is to create a JSON file that combines everything hierarchically. For example, the top level could be art movements; each movement contains its artists with base data such as birthplace, years of birth and death, and other geographical data; each artist contains their paintings with all available data. Even better, each painter could be split into eras, with the paintings nested inside the eras. We could then use this structure to create a network of art movements, artists, and paintings.
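
A minimal sketch of what such a hierarchy could look like, assuming flat painting records with `movement`, `artist`, `style` and `genre` fields. The records, the `artist_info` lookup and the `build_hierarchy` helper are hypothetical and only illustrate the nesting; they are not part of the existing datasets or code.

```python
import json

# Hypothetical flat painting records. The field names mirror the WikiArt
# columns, but the rows themselves are illustrative placeholders.
paintings = [
    {"movement": "Impressionism", "artist": "Claude Monet",
     "style": "Impressionism", "genre": "landscape"},
    {"movement": "Impressionism", "artist": "Claude Monet",
     "style": "Realism", "genre": "portrait"},
    {"movement": "Byzantine Art", "artist": "Andrei Rublev",
     "style": "Moscow school of icon painting", "genre": "religious painting"},
]

# Hypothetical per-artist base data; None marks a missing value.
artist_info = {
    "Claude Monet": {"birth_place": "Paris", "birth_year": 1840},
    "Andrei Rublev": {"birth_place": None, "birth_year": None},
}


def build_hierarchy(paintings, artist_info):
    """Nest paintings under artists, and artists under art movements."""
    hierarchy = {}
    for p in paintings:
        movement = hierarchy.setdefault(p["movement"], {"artists": {}})
        artist = movement["artists"].setdefault(
            p["artist"],
            {**artist_info.get(p["artist"], {}), "paintings": []},
        )
        artist["paintings"].append({"style": p["style"], "genre": p["genre"]})
    return hierarchy


hierarchy = build_hierarchy(paintings, artist_info)
print(json.dumps(hierarchy, indent=2, ensure_ascii=False))
```

An "era" level could be added in the same way, as one more dictionary between the artist and the painting list.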

NEXT STEPS:<br>
-Add "Places" for Art500k datasets (+change datasets_notebook save.csv loads)<br>
-Add aliases for painters in Art500k datasets<br>
-Combine the datasets on authors<br>
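
One possible way to approach the alias and combination steps, sketched under assumptions: names are normalized into a join key (accents stripped, lowercased, whitespace collapsed), a small alias table maps known variants to a canonical key, and the two artist tables are outer-joined on that key. The `ALIASES` table, the `name_key` helper and the tiny example frames are hypothetical, not the project's actual implementation.

```python
import unicodedata

import pandas as pd

# Hypothetical alias table: normalized variant -> normalized canonical key.
ALIASES = {"domenikos theotokopoulos": "el greco"}


def name_key(name, aliases=ALIASES):
    """Build a join key: strip accents, lowercase, collapse whitespace, apply aliases."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    key = " ".join(no_accents.lower().split())
    return aliases.get(key, key)


# Tiny illustrative stand-ins for the two artist tables.
wikiart = pd.DataFrame(
    {"artist": ["El Greco", "Claude Monet"], "birth_year": [1541.0, 1840.0]}
)
art500k = pd.DataFrame(
    {
        "artist": ["Doménikos Theotokópoulos", "CLAUDE MONET", "Banksy"],
        "Nationality": ["Spanish,Greek", "French", None],
    }
)

wikiart["key"] = wikiart["artist"].map(name_key)
art500k["key"] = art500k["artist"].map(name_key)

# The outer join keeps artists found in only one source;
# the `_merge` column records which source each row came from.
combined = wikiart.merge(
    art500k, on="key", how="outer",
    suffixes=("_wikiart", "_art500k"), indicator=True,
)
print(combined[["key", "artist_wikiart", "artist_art500k", "_merge"]])
```

Rows marked `left_only` or `right_only` are the ones worth checking by hand for aliases that are still missing.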

FURTHER STEPS: <br>
-Define connections between painters<br>
-Create a network of painters<br>
-Analyze the network<br>

<details><summary><u> Update 11.06:</u></summary>
<p>
I e-mailed Maximilian Schich, an art researcher whom Elisa suggested, asking about datasets for our project. He said:

-We do not have a record of social interactions between artists at the corpus scale. The closest thing is co-exhibition networks, which you may already know from the work of Fraiberger et al. (incl. Laszlo Barabasi) (http://genetics.bwh.harvard.edu/courses/Biophysics205/Papers/All_papers/Fraiberger_2018.pdf, page 2). The issue there is that the network covers a short time span, circa 1985 to 2020.

-Hyperlink networks (I guess WikiLinks, PageRank and such), such as those found in Wikipedia, are obviously beset with all kinds of issues, even though they do recapitulate the evolution of conventional style periods pretty well (cf. the work of Doron Goldfarb et al., incl. myself). More locally speaking, it is a core topic in art history to shed light on the social network of artists and their patrons, but this does not lend itself to quantitative analysis.

-I personally have done a visualization for Max Planck, based on the social network of 5500 individuals related to the Roman Baroque (https://zuccaro.schich.info/), which revealed another issue: for painters, art historians tend to research family relationships (more cliques), while for architects they focus on business relationships (more hubs). But here you have the inverse problem that there is not much information on the paintings.

-There is a question/issue he raised from this: "Should we really assume social interaction influencing the styles of artists? Note that this may substantially underestimate the plasticity of the human brain/mind! It is like assuming that cellists only hang out with cellists, when we all know that grunge bands in Seattle all did hang out together and missing a bassist. Meanwhile we do have evidence that artists such as Rubens did routinely hang out with different(!) artists, who could serve clients with different genres and if necessary styles. Bramante did build Gothic in Milan and Renaissance style in Rome at the same time. Rubens would call in Elsheimer to do miniatures, etc. And since the mid 19th century, all artists in the Western scene were essentially familiar, not only with the same corpus of classic artists and their works, but also with the contemporary production. Large art exhibitions in Paris literally drew millions of people each year in the mid 19th century (think Burning Man or SXSW today). So it is safe to say that most artists of note were familiar with a great number of styles. Styles may bifurcate. For artists the opposite may be true (cf. Run DMC meets Aerosmith => https://www.youtube.com/watch?v=4B_UYYPb-Gk). If I were you, I'd turn the question around, pointing into the opposite direction: **If two artists have similar style, can we find traces that they (eventually) knew each other**?" He said influence is B.S. (literally) and that there is 100 times more evidence for similarity than for influence between two artworks, and suggested answering "does style lead to social interaction?".

-"Here is how this question can be attacked with the available data: The standard "corpus" for artists is their "catalog raisonne", i.e. the catalog of all their works, which does not exist for all artists and is typically a lot of work, sold in expensive books. We are a long way from a comprehensive dataset like this. Yet, for the purpose of a more limited project, you could use general conventional style similarity from the usual suspect databases (Wikiart, Art500k, etc.). As a proxy of social interaction, you could use the hyperlink and/or wikidata links connected to the same artists. Even though these two sources are limited, you could still compare the two graphs as in "Wikipedia connection" vs. "visual similarity".

We have recently published a paper on general similarity using compression ensembles, using a subset of art500k/WikiArt, which is essentially 65k paintings with a reliable year as data. We have also used the first 100 days of the hic et nunc NFT art platform (where, coincidentally, you get both social interaction and painting information). See "Availability of data and materials" in https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-023-00397-3#Sec21"

So this could be interesting to think about.
</p>
</details>
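
The comparison suggested in the update ("Wikipedia connection" vs. "visual similarity") could be sketched as an overlap measure between two edge sets over the same painters. This is only one possible reading: the Jaccard index is a swapped-in, simple choice, and the toy edges below are placeholders, not measured links.

```python
def edge_set(pairs):
    """Turn (a, b) pairs into a set of undirected edges, ignoring self-loops."""
    return {frozenset(pair) for pair in pairs if pair[0] != pair[1]}


def jaccard(edges_a, edges_b):
    """Share of edges present in both graphs among all edges present in either."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 0.0


# Placeholder edges for illustration only.
wikipedia_links = edge_set([
    ("Claude Monet", "Pierre-Auguste Renoir"),
    ("Claude Monet", "Alfred Sisley"),
    ("El Greco", "Titian"),
])
visual_similarity = edge_set([
    ("Pierre-Auguste Renoir", "Claude Monet"),  # same edge, reversed order
    ("Claude Monet", "Alfred Sisley"),
    ("Claude Monet", "Gustave Courbet"),
])

shared = wikipedia_links & visual_similarity
overlap = jaccard(wikipedia_links, visual_similarity)
print("Shared edges:", sorted(sorted(edge) for edge in shared))
print("Jaccard overlap:", overlap)  # 2 shared edges out of 4 distinct -> 0.5
```

The set of shared edges is the direct counterpart of the reversed question: pairs with similar style for which a trace of a connection also exists.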

In [2]:
import pandas as pd
import numpy as np

## Complete dataset (Combination of WikiArt and Art500k)

The data preparation is done in the GitHub repository [PainterPalette](https://github.com/me9hanics/PainterPalette). The final dataset is available as "artists.csv".

In [8]:
artists = pd.read_csv('https://raw.githubusercontent.com/me9hanics/PainterPalette/main/datasets/artists.csv')
artists

Unnamed: 0,artist,Nationality,birth_place,birth_year,styles,styles_extended,StylesYears,StylesCount,PlacesCount,Contemporary,...,FirstYear,LastYear,Places,PlacesYears,PaintingSchool,Influencedby,Influencedon,Pupils,Teachers,FriendsandCoworkers
0,Ad Reinhardt,American,Buffalo,1913.0,"Abstract Art, Abstract Expressionism, Color Fi...","{Abstract Art:15},{Abstract Expressionism:5},{...","Expressionism:1944-1946,Abstract Art:1937-1941...","{Expressionism:7}, {Abstract Art:15}, {Color F...","{New York City:29},{NY:31},{US:32},{Buffalo:2}...",No,...,1937.0,1966.0,"US, NY, Canberra, Fort Worth, Buffalo, Austral...","New York City:1938-1966,NY:1938-1966,US:1938-1...","New York School,American Abstract Artists,Iras...","Piet Mondrian,Kazimir Malevich,Josef Albers,","Donald Judd,Barnett Newman,Mark Rothko,Frank S...",,,"Jackson Pollock,"
1,Adnan Coker,Turkish,,,"Abstract Art, Abstract Expressionism","{Abstract Art:25},{Abstract Expressionism:3}","Abstract Art:1992-2008,Abstract Expressionism:...","{Abstract Art:25}, {Abstract Expressionism:3}",,Yes,...,1968.0,2008.0,,,,,,,,
2,Akkitham Narayanan,Indian,Kerala,1939.0,Abstract Art,{Abstract Art:17},Abstract Art:1974-1974,{Abstract Art:17},,No,...,1974.0,1974.0,,,,,,,,
3,Alberto Magnelli,"Italian,French",Florence,1888.0,"Abstract Art, Art Nouveau (Modern), Cubism, Ex...","{Abstract Art:19},{Art Nouveau (Modern):2},{Cu...","Abstract Art:1916-1971,Cubism:1914-1935,Metaph...","{Abstract Art:21}, {Cubism:10}, {Metaphysical ...",,No,...,1909.0,1971.0,,,Abstraction-Création,,,,,
4,Alekos Kontopoulos,Greek,Lamia,1904.0,"Abstract Art, Cubism, Expressionism, Post-Impr...","{Abstract Art:26},{Cubism:5},{Expressionism:10...","Post-Impressionism:1932-1955,Expressionism:193...","{Post-Impressionism:8}, {Expressionism:11}, {R...",,No,...,1931.0,1974.0,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2661,Gohar Fermanyan,Armenian,,,Post-Impressionism,{Post-Impressionism:3},,{Post-Impressionism:3},,,...,,,,,,,,,,
2662,JAROSLAV KELUC,Czech,,,Impressionism,{Impressionism:33},Impressionism:1949-1979,{Impressionism:33},,No,...,1949.0,1979.0,,,,,,,,
2663,Ding Yi,,"Suixi County, Anhui",150.0,Maximalism,{Maximalism:29},,,,,...,1989.0,1991.0,,,,,,,,
2664,Phase 2,,,,Street art,{Street art:13},,,"{New York:1},{United States:1}",,...,,,"New York, United States",,,,,,,
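
Columns such as `styles_extended` and `StylesCount` store counts as strings like `{Abstract Art:25},{Abstract Expressionism:3}`. A small parser along the following lines could turn them into dictionaries; `parse_counts` is a hypothetical helper, and the pattern assumes every entry has the form `{name:integer}`.

```python
import re

# One "{name:count}" entry; the name may contain spaces, commas or parentheses.
PAIR = re.compile(r"\{([^{}]+):(\d+)\}")


def parse_counts(cell):
    """Parse '{Abstract Art:25},{Cubism:3}' into {'Abstract Art': 25, 'Cubism': 3}.

    Missing values (NaN or None) give an empty dict.
    """
    if not isinstance(cell, str):
        return {}
    return {name.strip(): int(count) for name, count in PAIR.findall(cell)}


print(parse_counts("{Abstract Art:25},{Abstract Expressionism:3}"))
print(parse_counts("{Expressionism:7}, {Abstract Art:15}"))
print(parse_counts(float("nan")))
```

Applied to the table above, this could be used as `artists['styles_extended'].map(parse_counts)`.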


## Other: WikiArt data

Load the cleaned paintings data

In [4]:
wa_paintings = pd.read_csv('https://raw.githubusercontent.com/me9hanics/PainterPalette/main/datasets/wikiart_paintings_refined.csv') 
print("Length:", len(wa_paintings))
wa_paintings.head()  # Consider dropping rows where style is "Unknown"

Length: 175313


Unnamed: 0,artist,style,genre,movement,tags
0,Andrei Rublev,Moscow school of icon painting,religious painting,Byzantine Art,"['Christianity', 'saints-and-apostles', 'angel..."
1,Andrei Rublev,Moscow school of icon painting,religious painting,Byzantine Art,"['Christianity', 'Old-Testament', 'Daniel', 'p..."
2,Andrei Rublev,Moscow school of icon painting,miniature,Byzantine Art,"['Christianity', 'saints-and-apostles', 'Khitr..."
3,Andrei Rublev,Moscow school of icon painting,religious painting,Byzantine Art,"['Christianity', 'saints-and-apostles', 'St.-L..."
4,Andrei Rublev,Moscow school of icon painting,miniature,Byzantine Art,"['Christianity', 'arts-and-crafts', 'saints-an..."


Load the grouped data: artists grouped by style

In [6]:
wa_grouped = pd.read_csv('https://raw.githubusercontent.com/me9hanics/PainterPalette/main/datasets/wikiart_artists_styles_grouped.csv')
print("Length:", len(wa_grouped), "\n", "Number of groups with only 1 count:", (wa_grouped['count'] == 1).sum())
wa_grouped[wa_grouped['artist'].str.contains("Monet")].sort_values(by=['count'], ascending=False)

Length: 7646 
 Number of groups with only 1 count: 1115


Unnamed: 0,style,artist,movement,count
2963,Impressionism,Claude Monet,Impressionism,1341
5468,Realism,Claude Monet,Impressionism,12
7041,Unknown,Claude Monet,Impressionism,12
462,Academicism,Claude Monet,Impressionism,1
3339,Japonism,Claude Monet,Impressionism,1
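
Grouped counts like these could feed a simple style-similarity measure between painters. The sketch below builds one style-count vector per artist, drops the "Unknown" style, and compares vectors with cosine similarity. Cosine similarity is an assumed choice here, not the project's confirmed method, and "Painter B" and "Painter C" are hypothetical rows added for contrast.

```python
import math
from collections import defaultdict

# (style, artist, count) rows. The Monet rows repeat the output above;
# "Painter B" and "Painter C" are hypothetical.
rows = [
    ("Impressionism", "Claude Monet", 1341),
    ("Realism", "Claude Monet", 12),
    ("Unknown", "Claude Monet", 12),
    ("Academicism", "Claude Monet", 1),
    ("Japonism", "Claude Monet", 1),
    ("Impressionism", "Painter B", 30),
    ("Realism", "Painter B", 10),
    ("Cubism", "Painter C", 40),
]


def style_vectors(rows, drop=("Unknown",)):
    """Collect per-artist {style: count} dicts, skipping the styles in `drop`."""
    vectors = defaultdict(dict)
    for style, artist, count in rows:
        if style not in drop:
            vectors[artist][style] = vectors[artist].get(style, 0) + count
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity of two sparse count vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = style_vectors(rows)
print(cosine(vectors["Claude Monet"], vectors["Painter B"]))
print(cosine(vectors["Claude Monet"], vectors["Painter C"]))
```

Raw counts let prolific painters' dominant styles outweigh everything else, so normalizing each vector to proportions first may be worth trying.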


### Birthplaces, birth years

In [7]:
artists_A = pd.read_csv('https://raw.githubusercontent.com/me9hanics/PainterPalette/main/datasets/wikiart_artists.csv')
artists_A

Unnamed: 0,artist,styles,movement,styles_extended,pictures_count,birth_place,birth_year
0,Ad Reinhardt,"Abstract Art, Abstract Expressionism, Color Fi...",Abstract Expressionism,"{Abstract Art:15},{Abstract Expressionism:5},{...",52,Buffalo,1913.0
1,Adnan Coker,"Abstract Art, Abstract Expressionism",Abstract Art,"{Abstract Art:25},{Abstract Expressionism:3}",28,,
2,Akkitham Narayanan,Abstract Art,Abstract Art,{Abstract Art:17},17,Kerala,1939.0
3,Alberto Magnelli,"Abstract Art, Art Nouveau (Modern), Cubism, Ex...",Abstract Art,"{Abstract Art:19},{Art Nouveau (Modern):2},{Cu...",35,Florence,1888.0
4,Alekos Kontopoulos,"Abstract Art, Cubism, Expressionism, Post-Impr...",Social Realism,"{Abstract Art:26},{Cubism:5},{Expressionism:10...",79,Lamia,1904.0
...,...,...,...,...,...,...,...
3198,Serhij Schyschko,Unknown,Academic Art,{Unknown:9},9,,
3199,Vudon Baklytsky,Unknown,Soviet Nonconformist Art,{Unknown:46},46,,
3200,Wolfgang Tillmans,Unknown,Contemporary,{Unknown:9},9,Remscheid,1968.0
3201,Wu Daozi,Unknown,Tang Dynasty (618–907),{Unknown:8},8,Chang'an,680.0


## Art500K

First dataset (from the official website)

In [13]:
art500k = pd.read_csv('https://raw.githubusercontent.com/me9hanics/PainterPalette/main/datasets/art500k_paintings_cleaned.csv')
art500k.head(10)

Unnamed: 0,author_name,Genre,Style,Nationality,PaintingSchool,ArtMovement,Date,Influencedby,Influencedon,Tag,Pupils,Location,Teachers,FriendsandCoworkers
0,Gustave Courbet,,,,,,,,,,,,,
1,Auguste Rodin,,,,,,,,,,,,,
2,Frida Kahlo,,,,,,,,,,,,,
3,Banksy,,,,,,,,,,,in a settlement in Palestine in the middle east,,
4,El Greco,,,,,,ca. 1610-1614,,,,,,,
5,El Greco,,,,,,,,,,,,,
6,Diego Rivera,,,,,,,,,,,,,
7,Claude Monet,,,,,,,,,,,,,
8,Francisco Goya,,,,,,,,,,,,,
9,Francisco Goya,,,,,,,,,,,,,


In [12]:
art500k_artists = pd.read_csv('https://raw.githubusercontent.com/me9hanics/PainterPalette/main/datasets/art500k_artists.csv')
art500k_artists.head(7)

Unnamed: 0,artist,Nationality,PaintingSchool,ArtMovement,Influencedby,Influencedon,Pupils,Teachers,FriendsandCoworkers,FirstYear,LastYear,Places,PlacesYears,StylesYears,StylesCount,PlacesCount,Contemporary,Type
0,Gustave Courbet,French,,{Realism:272},"Rembrandt,Caravaggio,Diego Velazquez,Peter Pau...","Edouard Manet,Claude Monet,Pierre-Auguste Reno...",,,,1830.0,1877.0,"London, Montpellier, Moscow, CA, UK, Norway, D...","France:1841-1876,Switzerland:1844-1874,Lille:1...","Realism:1835-1877,Romanticism:1830-1849","{Realism:257}, {Romanticism:13}","{France:88},{Switzerland:7},{Lille:8},{Paris:4...",No,Painting/Sculpture
1,Auguste Rodin,French,,"{Modern art:3},{Impressionism:91}","Michelangelo,Donatello,","Georgia O'Keeffe,Man Ray,Aristide Maillol,Olex...","Constantin Brancusi,",,,1865.0,1985.0,"London, CA, UK, Switzerland, Lisbon, US, Germa...","France:1865-1889,Paris:1865-1898,CA:1891-1891,...",Impressionism:1865-1905,{Impressionism:90},"{France:52},{Paris:15},{Brussels:2},{Belgium:1...",,Painting/Sculpture
2,Frida Kahlo,Mexican,,"{Naïve Art (Primitivism),Surrealism:99}","Amedeo Modigliani,Diego Rivera,Jose Clemente O...","Judy Chicago,Georgia O'Keeffe,Feminist Art,",,,,1922.0,1954.0,"CA, LA, New York, US, New Orleans, Washington ...","Mexico:1927-1954,San Francisco:1931-1933,Mexic...","Naïve Art (Primitivism):1922-1954,Surrealism:1...","{Naïve Art (Primitivism):99}, {Surrealism:15}","{Mexico:50},{San Francisco:6},{New York:4},{Me...",No,Painting/Sculpture
3,Banksy,,,,,,,,,2011.0,2011.0,"Los Angeles, London, UK, Palestine, California...","London:2011-2011,UK:2011-2011",,,"{Palestine:1},{Los Angeles:3},{California:3},{...",Yes,Painting/Sculpture
4,El Greco,"Spanish,Greek",Cretan School,"{Spanish Renaissance:1},{Renaissance:2},{Manne...","Byzantine Art,","Expressionism,Cubism,Eugene Delacroix,Edouard ...",,"Titian,","Giulio Clovio,",1568.0,1614.0,"Seville, London, Illescas, Romania, Moscow, Gr...","Spain:1577-1599,London:1600-1600,UK:1600-1600,...",Mannerism (Late Renaissance):1568-1600,"{Renaissance:2}, {XVI CenturySpanish Painting:...","{Spain:75},{Boston:1},{MA:1},{US:27},{Museo de...",No,Painting/Sculpture
5,Diego Rivera,Mexican,"Mexican Mural Renaissance,La Ruche","{Social Realism,Muralism:146}","Marc Chagall,Robert Delaunay,","Frida Kahlo,Pedro Coronel,Vlady,",,,"Amedeo Modigliani,Saturnino Herran,Roberto Mon...",1904.0,1956.0,"Moscow, CA, Acapulco, New York, Spain, Northam...","Acapulco:1956-1956,Mexico:1905-1956,Guerrero:1...","Cubism:1912-1916,Muralism:1922-1956,Art Deco:1...","{Post-impressionism:1}, {Cubism:19}, {Mexican ...","{France:1},{Paris:1},{Moscow:1},{Acapulco:2},{...",No,Painting/Sculpture
6,Claude Monet,French,,"{Modern art:3},{Impressionism:1340}","Gustave Courbet,Charles-Francois Daubigny,John...","Childe Hassam,Robert Delaunay,Wassily Kandinsk...",,"Eugene Boudin,Charles Gleyre,","Alfred Sisley,Pierre-Auguste Renoir,Camille Pi...",1858.0,1926.0,"London, Main, Moscow, Rotterdam, Giverny, CA, ...","France:1861-1924,London:1869-1889,UK:1869-1908...","Impressionist:1879-1904,Impressionism:1864-192...",{Nineteenth-Century European PaintingImpressio...,"{France:79},{Giverny:1},{London:6},{UK:15},{Bo...",No,Painting/Sculpture
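
The `PlacesYears` and `StylesYears` columns encode year ranges as `name:start-end` entries. As a rough proxy for geographical interaction, one could look for places where two painters' year ranges overlap. The sketch below assumes the ranges indicate when a painter is associated with a place, which the data may not guarantee. `parse_years` and `shared_places` are hypothetical helpers, and the input strings are only the visible leading entries of the truncated cells shown above.

```python
import re

# One "name:start-end" entry; names may contain spaces or commas.
ENTRY = re.compile(r"([^:,][^:]*?):(\d+)-(\d+)")


def parse_years(cell):
    """Parse 'France:1841-1876,Lille:1850-1850' into {'France': (1841, 1876), ...}."""
    if not isinstance(cell, str):
        return {}
    return {name.strip(): (int(start), int(end))
            for name, start, end in ENTRY.findall(cell)}


def shared_places(a, b):
    """Places both artists are associated with, with the overlapping year interval."""
    overlaps = {}
    for place in a.keys() & b.keys():
        start = max(a[place][0], b[place][0])
        end = min(a[place][1], b[place][1])
        if start <= end:
            overlaps[place] = (start, end)
    return overlaps


# Leading entries of the (truncated) PlacesYears cells shown above.
courbet = parse_years("France:1841-1876,Switzerland:1844-1874")
monet = parse_years("France:1861-1924,London:1869-1889")

print(shared_places(courbet, monet))  # {'France': (1861, 1876)}
```

Country-level matches such as "France" are coarse; restricting the comparison to city-level entries would likely give more meaningful links.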


## Creating networks

The network-related work is in the networks folder, mostly in the *networks.ipynb* notebook.<br>
Some additional work can be found in the *scipy.ipynb* notebook.<br>
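
For orientation, here is a minimal sketch of how directed edges could be derived from a relation column such as `Influencedby`, whose cells are comma-separated names with a trailing comma. This is not taken from the notebooks. `split_names` and `influence_edges` are hypothetical helpers, and the records repeat values visible in the tables above.

```python
def split_names(cell):
    """Split 'A,B,C,' into ['A', 'B', 'C'], tolerating trailing commas and missing values."""
    if not isinstance(cell, str):
        return []
    return [name.strip() for name in cell.split(",") if name.strip()]


def influence_edges(records):
    """Build directed (influencer, influenced) edges from (artist, Influencedby) pairs."""
    edges = set()
    for artist, influenced_by in records:
        for source in split_names(influenced_by):
            edges.add((source, artist))
    return edges


# (artist, Influencedby) values as they appear in the tables above.
records = [
    ("Ad Reinhardt", "Piet Mondrian,Kazimir Malevich,Josef Albers,"),
    ("Auguste Rodin", "Michelangelo,Donatello,"),
    ("Banksy", None),
]

edges = influence_edges(records)
for source, target in sorted(edges):
    print(f"{source} -> {target}")
```

The relation columns also contain movement names (for example "Byzantine Art" for El Greco), so entries would need to be filtered against the list of known artists before building a painter-only network.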