### By: Mowafak Allaham
In this notebook I'm interested in answering the following questions:

- Where do the artworks displayed at the MET come from?
- Which artists is the MET interested in?
- Which department hosts the largest number of artworks?
###### Note: all pie charts are interactive!


In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

In [2]:
data = pd.read_csv('../input/MetObjects.csv')

In [3]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 448203 entries, 0 to 448202
Data columns (total 43 columns):
﻿Object Number             448203 non-null object
Is Highlight               448203 non-null bool
Is Public Domain           448203 non-null bool
Object ID                  448203 non-null int64
Department                 448203 non-null object
Object Name                445568 non-null object
Title                      416906 non-null object
Culture                    186518 non-null object
Period                     71882 non-null object
Dynasty                    23018 non-null object
Reign                      10817 non-null object
Portfolio                  20370 non-null object
Artist Role                259909 non-null object
Artist Prefix              88928 non-null object
Artist Display Name        261111 non-null object
Artist Display Bio         224064 non-null object
Artist Suffix              10212 non-null object
Artist Alpha Sort          261088 non-null object


# Where are the artworks coming from?

In [4]:
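# Count artworks per country; value_counts() skips missing values
sCountry = pd.DataFrame(data['Country'].value_counts())
sCountry.columns = ['Count']
sCountry['Country'] = sCountry.index.tolist()
# sort_values returns a new frame, so assign the result back
sCountry = sCountry.sort_values(by="Count", ascending=False)
sCountry = sCountry.reset_index(drop=True)
sCountry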

Unnamed: 0,Count,Country
0,30914,Egypt
1,8501,United States
2,5886,Iran
3,3422,Peru
4,1673,Byzantine Egypt
5,1670,France
6,1537,Mexico
7,1440,India
8,1394,Indonesia
9,1059,England
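The count table above can also be built in a single chain. The following is a minimal sketch on toy data, not the notebook's own method; the helper name `count_table` and the toy frame are assumptions made for illustration.

```python
import pandas as pd

def count_table(frame, column, label):
    """Return a table of value counts for one column, largest count first."""
    counts = frame[column].value_counts()   # sorted descending, NaN dropped
    table = counts.rename_axis(label).reset_index(name="Count")
    return table[["Count", label]]          # same column order as the table above

# Toy data standing in for data['Country'] (hypothetical values)
toy = pd.DataFrame(
    {"Country": ["Egypt", "Iran", "Egypt", None, "Peru", "Egypt", "Iran"]}
)
sCountry_toy = count_table(toy, "Country", "Country")
print(sCountry_toy)
```

The same helper could in principle be reused for the artist and department tables further down, since all three follow the same pattern.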


###### Selecting the countries that contributed at least 100 pieces

In [5]:
# The following two imports are needed to use plotly offline
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.graph_objs as go

In [6]:
# Keep only countries with at least 100 pieces (plotly draws its own figure, so no plt.figure call is needed)
temp = sCountry[sCountry['Count']>=100]
init_notebook_mode(connected=True)
labels=temp['Country']
values=temp['Count']
trace=go.Pie(labels=labels,values=values)

iplot([trace])

### Where are the artworks coming from?
From this pie chart, it looks like almost 45% of the artworks from countries with at least 100 pieces come from Egypt, the U.S. and Iran.
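A pie chart shows each slice as a share of the plotted values only, so the percentage read off the chart is relative to the countries that passed the 100-piece filter. The sketch below illustrates the difference on toy counts; the function name `top_share` and the numbers are assumptions, not values from the dataset.

```python
import pandas as pd

def top_share(counts, n, min_count):
    """Share of the n largest counts among entries with count >= min_count."""
    kept = counts[counts >= min_count].sort_values(ascending=False)
    return kept.head(n).sum() / kept.sum()

# Hypothetical counts per category
toy_counts = pd.Series({"A": 500, "B": 300, "C": 200, "D": 100, "E": 40})

share_filtered = top_share(toy_counts, n=2, min_count=100)  # what the pie shows
share_all = top_share(toy_counts, n=2, min_count=0)         # share of every entry

print(f"filtered: {share_filtered:.1%}, all: {share_all:.1%}")
```

The filtered share is never smaller than the unfiltered one, because dropping small categories shrinks only the denominator.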

In [7]:
data.keys()

Index(['﻿Object Number', 'Is Highlight', 'Is Public Domain', 'Object ID',
       'Department', 'Object Name', 'Title', 'Culture', 'Period', 'Dynasty',
       'Reign', 'Portfolio', 'Artist Role', 'Artist Prefix',
       'Artist Display Name', 'Artist Display Bio', 'Artist Suffix',
       'Artist Alpha Sort', 'Artist Nationality', 'Artist Begin Date',
       'Artist End Date', 'Object Date', 'Object Begin Date',
       'Object End Date', 'Medium', 'Dimensions', 'Credit Line',
       'Geography Type', 'City', 'State', 'County', 'Country', 'Region',
       'Subregion', 'Locale', 'Locus', 'Excavation', 'River', 'Classification',
       'Rights and Reproduction', 'Link Resource', 'Metadata Date',
       'Repository'],
      dtype='object')
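The first column label in the output above appears to start with an invisible byte-order mark (U+FEFF), which would make `data['Object Number']` fail with a `KeyError`. A minimal sketch of one way to strip it, using a toy frame with made-up values rather than the real file:

```python
import pandas as pd

# Toy frame whose first column label starts with a byte-order mark,
# mimicking the 'Object Number' label shown above (values are made up).
df = pd.DataFrame({"\ufeffObject Number": ["A1"], "Department": ["Dept"]})

# Strip a leading BOM from every column label
cleaned = df.rename(columns=lambda name: name.lstrip("\ufeff"))

print(list(cleaned.columns))
```

Passing `encoding='utf-8-sig'` to `pd.read_csv` is another option that may avoid the problem at load time, depending on the file and the pandas version.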

# Which artists is the MET interested in?

In [8]:
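# Count artworks per artist; value_counts() skips missing values
sArtist = pd.DataFrame(data['Artist Display Name'].value_counts())
sArtist.columns=['Count']
sArtist['Name'] = sArtist.index.tolist()
# sort_values returns a new frame, so assign the result back
sArtist = sArtist.sort_values(by="Count", ascending=False)
sArtist = sArtist.reset_index(drop=True)
sArtist.head(5)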

Unnamed: 0,Count,Name
0,9659,Walker Evans
1,4282,Kinney Brothers
2,3184,Allen & Ginter
3,3148,"W. Duke, Sons & Co."
4,2747,Goodwin & Company


### Visualizing artists with at least 1000 art pieces

In [9]:
# Keep only artists with at least 1000 pieces (plotly draws its own figure, so no plt.figure call is needed)
temp = sArtist[sArtist['Count']>=1000]
init_notebook_mode(connected=True)
labels=temp['Name']
values=temp['Count']
trace=go.Pie(labels=labels,values=values)

iplot([trace])

### Which artists is the MET interested in?
Artworks by Walker Evans seem to make up a large share of the pieces by artists with at least 1000 works at the MET, followed by Kinney Brothers and Allen & Ginter.
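According to the `data.info()` output, only 261,111 of the 448,203 rows (about 58%) have an `Artist Display Name`, and `value_counts()` leaves out missing values by default, so the ranking covers attributed works only. A small sketch on toy data (the series below is hypothetical) shows how to make the missing entries visible:

```python
import pandas as pd

# Toy stand-in for data['Artist Display Name'] with missing entries
toy = pd.Series(
    ["Walker Evans", None, "Walker Evans", "Kinney Brothers", None, None],
    name="Artist Display Name",
)

named_only = toy.value_counts()                # default: missing values dropped
with_missing = toy.value_counts(dropna=False)  # missing values counted as a group
attributed_share = toy.notna().mean()          # fraction of rows with a name

print(with_missing)
print(f"attributed: {attributed_share:.0%}")
```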

# Which department hosts the largest number of artworks?

In [10]:
# Count artworks per department
sDpt = pd.DataFrame(data['Department'].value_counts())
sDpt.columns=['Count']
sDpt['Name'] = sDpt.index.tolist()
# sort_values returns a new frame, so assign the result back
sDpt = sDpt.sort_values(by="Count", ascending=False)
sDpt = sDpt.reset_index(drop=True)
sDpt.head(5)

Unnamed: 0,Count,Name
0,154445,Drawings and Prints
1,42528,European Sculpture and Decorative Arts
2,36727,Asian Art
3,36258,Photographs
4,33681,Costume Institute


In [11]:
# Keep only departments with at least 1000 pieces (plotly draws its own figure, so no plt.figure call is needed)
temp = sDpt[sDpt['Count']>=1000]
init_notebook_mode(connected=True)
labels=temp['Name']
values=temp['Count']
trace=go.Pie(labels=labels,values=values)

iplot([trace])

### Which department hosts the largest number of artworks?
It looks like the "Drawings and Prints" department holds roughly 35% of the artworks at the MET.
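The figure can be cross-checked against the raw counts printed earlier. This is a quick arithmetic sketch using only numbers from the outputs above; note that the chart itself is limited to departments with at least 1000 pieces, so its slice may be marginally larger.

```python
# Counts taken from the outputs above
drawings_and_prints = 154445  # 'Drawings and Prints' row of the department table
total_objects = 448203        # number of entries reported by data.info()

share = drawings_and_prints / total_objects
print(f"{share:.1%}")
```

On a DataFrame, `data['Department'].value_counts(normalize=True)` should give the same kind of proportion for every department at once.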