# Dimensions of 2D Artworks in MoMA

Here, I visualize the dimensions of two-dimensional artworks in the MoMA collection. For this small exercise, I treat Drawings, Paintings, and Photographs as my 2D artworks.

*(Note: I made a number of assumptions in creating this visualization; I will do my best to explain them as we go.)*

In [None]:
# Import modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Seaborn Settings
sns.set_style(style='white')

# Import data 
df = pd.read_csv("../input/artworks.csv")

The cell below lists all the distinct Classifications in the artworks file. From these, I will select the rows whose value is `Drawing`, `Painting`, or `Photograph`.

In [None]:
df.Classification.unique()

I'll create a separate dataframe for each of these categories and concatenate them into one dataframe, `df_2d`.

In [None]:
# Get all 2D works: Drawing, Painting, Photograph
df_drawing = df[df['Classification'] == 'Drawing']
df_painting = df[df['Classification'] == 'Painting']
df_photo = df[df['Classification'] == 'Photograph']


# Concatenate all the artworks into one dataframe
all_2D = [df_painting, df_drawing, df_photo]
df_2d = pd.concat(all_2D, axis=0, ignore_index=True)
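
As a side note, the same selection can probably be written with a single boolean mask instead of three dataframes and a concat. The sketch below uses a tiny made-up table (`toy`, not the MoMA data) and `Series.isin`; note that it keeps the rows in their original order rather than grouping them by classification.

```python
import pandas as pd

# Toy stand-in for the artworks table (made-up rows, not MoMA data)
toy = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Classification": ["Drawing", "Design", "Painting", "Photograph"],
})

# Keep only the three 2D classifications with one boolean mask
classes_2d = ["Drawing", "Painting", "Photograph"]
toy_2d = toy[toy["Classification"].isin(classes_2d)].reset_index(drop=True)

print(toy_2d["Classification"].tolist())
```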

We can then inspect the columns of the `df_2d` dataframe. I only need the height and width, so I'll drop every other column except the title and classification of the artwork.

In [None]:
df_2d.info()

I will also drop rows containing `NaN` from the combined dataframe. This means that artworks with missing information (no height or width, even if they are 2D) are excluded.

In [None]:
# Keep only Title, Classification, Height, and Width
df_2d = df_2d[['Title', 'Classification', 'Height (cm)', 'Width (cm)']]
df_2d = df_2d.rename(columns={'Height (cm)': 'height', 'Width (cm)': 'width'})

# Remove artworks with NaN
df_2d = df_2d.dropna()
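
One thing to keep in mind: a plain `dropna()` also removes rows where only the title is missing. If that is not wanted, a possible variant is to restrict the check to the two dimension columns. The sketch below uses a made-up table, and it additionally assumes that zero or negative sizes are data-entry errors, since their logarithm is undefined.

```python
import numpy as np
import pandas as pd

# Toy table (made-up values, not MoMA data)
toy = pd.DataFrame({
    "Title": ["A", None, "C", "D"],
    "height": [10.0, 20.0, np.nan, 0.0],
    "width": [5.0, 8.0, 7.0, 3.0],
})

# Only require the two dimensions; a missing Title does not drop the row
clean = toy.dropna(subset=["height", "width"])

# Assumption: non-positive sizes are errors, and log10 of them is undefined
clean = clean[(clean["height"] > 0) & (clean["width"] > 0)]

print(len(toy), "->", len(clean))
```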

In [None]:
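# Quantities to plot: ratio of the log-dimensions, and the log-width
ratio = np.log10(df_2d['height']) / np.log10(df_2d['width'])
width = np.log10(df_2d['width'])

# Reference values for 4x3 and 3x4, in the same log-ratio form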
four_thirds = np.log10(4)/np.log10(3)
three_fourths = np.log10(3)/np.log10(4)
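
It is worth checking what these reference values mean. The quantity on the y-axis is a ratio of logarithms, not the logarithm of a ratio, so it depends on the absolute size of the work as well as its proportions. A quick sketch of the arithmetic (the helper `log_ratio` is a name introduced here for illustration):

```python
import numpy as np

def log_ratio(height, width):
    """Ratio of log-dimensions, as computed for the y-axis."""
    return np.log10(height) / np.log10(width)

small = log_ratio(4, 3)    # a 4 cm x 3 cm work
large = log_ratio(40, 30)  # same proportions, ten times larger

print(round(float(small), 4), round(float(large), 4))
```

The two values differ (roughly 1.26 versus 1.08), so the red reference lines correspond exactly to a 4 cm by 3 cm work rather than to every work with 4:3 proportions.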


In [None]:
plt.scatter(width, ratio, alpha=0.02, c='c')
plt.axhline(y=1.0, color='k', linestyle='-', linewidth=0.75, label='Square')
plt.axhline(y=four_thirds, color='r', linestyle='-', linewidth=0.75, label='4x3')
plt.axhline(y=three_fourths, color='r', linestyle='-', linewidth=0.75)
plt.xlim((0.5, 3.5))
plt.ylim((0.6, 1.4))
plt.title("Dimensions of 2D Artworks in MoMA")
plt.xlabel('log10(Width in cm)')
plt.ylabel('log10(Height) / log10(Width)')
plt.legend()
plt.show()
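
If the goal is to compare proportions independently of size, one possible alternative is to plot the plain height-to-width ratio and let matplotlib apply the log scaling to the axes. This is only a sketch on made-up data (`toy_width`, `toy_height`), not a replacement for the figure above; with this form, 4:3 and 3:4 works fall on horizontal lines at 4/3 and 3/4 whatever their size.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up dimensions in cm (not MoMA data)
rng = np.random.default_rng(0)
toy_width = rng.uniform(5, 300, size=200)
toy_height = toy_width * rng.choice([3 / 4, 1.0, 4 / 3], size=200)

aspect = toy_height / toy_width  # plain height-to-width ratio

fig, ax = plt.subplots()
ax.scatter(toy_width, aspect, alpha=0.2, c='c')
ax.set_xscale('log')  # log-scaled axes instead of log-transformed data
ax.set_yscale('log')
ax.axhline(y=1.0, color='k', linewidth=0.75, label='Square')
ax.axhline(y=4 / 3, color='r', linewidth=0.75, label='4x3')
ax.axhline(y=3 / 4, color='r', linewidth=0.75)
ax.set_xlabel('Width (cm)')
ax.set_ylabel('Height / Width')
ax.legend()
```

In a notebook the figure renders at the end of the cell; in a script, call `plt.show()` afterwards.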