# Cleaning Data: Python Data Playbook

[Cleaning Data: Python Data Playbook](https://app.pluralsight.com/library/courses/cleaning-data-python-data-playbook)
Cleaning a dataset is an essential part of any data project, but it can be challenging. This course teaches the basics of cleaning datasets with pandas, along with techniques that you can apply immediately in real-world projects.

## Author
[Chris Achard](https://app.pluralsight.com/profile/author/chris-achard)

## Learning Platform
[Pluralsight](https://www.pluralsight.com/)

## Description
At the core of any successful project that involves a real-world dataset is a thorough knowledge of how to clean that dataset of missing, bad, or inaccurate data. In this course, Cleaning Data: Python Data Playbook, you'll learn how to use pandas to clean a real-world dataset. First, you'll learn how to understand, view, and explore the data you have. Next, you'll explore how to access just the data that you want to keep in your dataset. Finally, you'll discover different ways to handle bad and missing data. When you're finished with this course, you'll have a foundational knowledge of cleaning real-world datasets with pandas that will help you as you move on to real-world data science or machine learning problems.


In [42]:
import pandas as pd
import wget
from os.path import exists
import re
from numpy import nan

In [3]:
if not exists('data/artwork_data.csv'):
    wget.download('https://raw.githubusercontent.com/vsvale/datasets/main/artwork_data.csv','data')

In [4]:
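# on_bad_lines='skip' replaces error_bad_lines=False, which was deprecated in pandas 1.3 and removed in 2.0
art = pd.read_csv('data/artwork_data.csv', on_bad_lines='skip')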
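
Older pandas releases spelled the "skip malformed rows" option as `error_bad_lines=False`; from pandas 1.3 onward the equivalent is `on_bad_lines='skip'`. The sketch below shows the effect on a tiny in-memory CSV. The CSV text is a made-up sample, not part of the artwork dataset.

```python
import io

import pandas as pd

# Hypothetical CSV: the second data row has one field too many.
csv_text = "id,artist\n1,Blake\n2,Blake,extra\n3,Richter\n"

# on_bad_lines="skip" drops rows with too many fields instead of raising.
df = pd.read_csv(io.StringIO(csv_text), on_bad_lines="skip")
print(df)
```

Skipping is silent, so it may be worth comparing the row count against the raw file when the number of dropped lines matters.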


In [5]:
art.width = pd.to_numeric(art.width,errors='coerce')
art.height = pd.to_numeric(art.height,errors='coerce')
art.year = pd.to_numeric(art.year,errors='coerce')
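
As a small illustration of what `errors='coerce'` does, the sketch below converts a toy Series in which some entries are not numbers. The sample values are hypothetical.

```python
import pandas as pd

# Hypothetical raw values: two numeric strings, one text marker, one missing entry.
raw = pd.Series(["394", "311", "no date", None])

# errors="coerce" converts anything that cannot be parsed into NaN instead of raising.
clean = pd.to_numeric(raw, errors="coerce")
print(clean)
```

The result is a float column, which is why `year`, `width`, and `height` appear as `float64` afterwards.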

In [6]:
art.head()

Unnamed: 0,id,accession_number,artist,artistRole,artistId,title,dateText,medium,creditLine,year,acquisitionYear,dimensions,width,height,depth,units,inscription,thumbnailCopyright,thumbnailUrl,url
0,1035,A00001,"Blake, Robert",artist,38,A Figure Bowing before a Seated Old Man with h...,date not known,"Watercolour, ink, chalk and graphite on paper....",Presented by Mrs John Richmond 1922,,1922.0,support: 394 x 419 mm,394.0,419.0,,mm,,,http://www.tate.org.uk/art/images/work/A/A00/A...,http://www.tate.org.uk/art/artworks/blake-a-fi...
1,1036,A00002,"Blake, Robert",artist,38,"Two Drawings of Frightened Figures, Probably f...",date not known,Graphite on paper,Presented by Mrs John Richmond 1922,,1922.0,support: 311 x 213 mm,311.0,213.0,,mm,,,http://www.tate.org.uk/art/images/work/A/A00/A...,http://www.tate.org.uk/art/artworks/blake-two-...
2,1037,A00003,"Blake, Robert",artist,38,The Preaching of Warning. Verso: An Old Man En...,?c.1785,Graphite on paper. Verso: graphite on paper,Presented by Mrs John Richmond 1922,1785.0,1922.0,support: 343 x 467 mm,343.0,467.0,,mm,,,http://www.tate.org.uk/art/images/work/A/A00/A...,http://www.tate.org.uk/art/artworks/blake-the-...
3,1038,A00004,"Blake, Robert",artist,38,Six Drawings of Figures with Outstretched Arms,date not known,Graphite on paper,Presented by Mrs John Richmond 1922,,1922.0,support: 318 x 394 mm,318.0,394.0,,mm,,,http://www.tate.org.uk/art/images/work/A/A00/A...,http://www.tate.org.uk/art/artworks/blake-six-...
4,1039,A00005,"Blake, William",artist,39,The Circle of the Lustful: Francesca da Rimini...,"1826–7, reprinted 1892",Line engraving on paper,Purchased with the assistance of a special gra...,1826.0,1919.0,image: 243 x 335 mm,243.0,335.0,,mm,,,http://www.tate.org.uk/art/images/work/A/A00/A...,http://www.tate.org.uk/art/artworks/blake-the-...


In [7]:
art.dtypes

id                      int64
accession_number       object
artist                 object
artistRole             object
artistId                int64
title                  object
dateText               object
medium                 object
creditLine             object
year                  float64
acquisitionYear       float64
dimensions             object
width                 float64
height                float64
depth                 float64
units                  object
inscription            object
thumbnailCopyright     object
thumbnailUrl           object
url                    object
dtype: object

In [8]:
art.describe()

Unnamed: 0,id,artistId,year,acquisitionYear,width,height,depth
count,69201.0,69201.0,63804.0,69156.0,65834.0,65859.0,2514.0
mean,39148.026213,1201.063251,1867.227823,1910.646856,323.472435,346.436173,479.197772
std,25980.468687,2019.422535,72.012718,64.202148,408.810182,538.037975,1051.141734
min,3.0,0.0,1545.0,1823.0,3.0,6.0,1.0
25%,19096.0,558.0,1817.0,1856.0,118.0,117.0,48.25
50%,37339.0,558.0,1831.0,1856.0,175.0,190.0,190.0
75%,54712.0,1137.0,1953.0,1982.0,345.0,359.0,450.0
max,129068.0,19232.0,2012.0,2013.0,11960.0,37500.0,18290.0


In [9]:
art.year

0           NaN
1           NaN
2        1785.0
3           NaN
4        1826.0
          ...  
69196    1975.0
69197    1976.0
69198    1996.0
69199    2000.0
69200    1764.0
Name: year, Length: 69201, dtype: float64

In [10]:
art.year.min()

1545.0

In [11]:
art.year.max()

2012.0

In [12]:
art.year.agg(['mean','std'])

mean    1867.227823
std       72.012718
Name: year, dtype: float64

In [13]:
art.year.count()

63804

In [14]:
art.year.groupby(art.year).count()

year
1545.0      2
1557.0      1
1563.0      1
1565.0      1
1569.0      1
         ... 
2008.0    111
2009.0     84
2010.0     64
2011.0     29
2012.0     46
Name: year, Length: 360, dtype: int64

In [15]:
normalize = (art.height - art.height.mean()) / art.height.std()
normalize

0        0.134867
1       -0.248005
2        0.224081
3        0.088402
4       -0.021255
           ...   
69196   -0.077013
69197   -0.077013
69198    3.835350
69199         NaN
69200    0.582791
Name: height, Length: 69201, dtype: float64

In [16]:
minmax = (art.height - art.height.min()) / (art.height.max() - art.height.min())
minmax

0        0.011015
1        0.005521
2        0.012295
3        0.010348
4        0.008775
           ...   
69196    0.007975
69197    0.007975
69198    0.064117
69199         NaN
69200    0.017443
Name: height, Length: 69201, dtype: float64
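
The two rescalings above are standardization (z-score) and min-max scaling. A minimal sketch on a toy Series, with made-up values, makes the arithmetic easy to check by hand:

```python
import pandas as pd

# Hypothetical heights; the missing value stays NaN through both rescalings.
heights = pd.Series([100.0, 200.0, 300.0, None])

# z-score: subtract the mean, divide by the sample standard deviation (ddof=1).
z_score = (heights - heights.mean()) / heights.std()

# min-max: map the smallest value to 0 and the largest to 1.
min_max = (heights - heights.min()) / (heights.max() - heights.min())

print(pd.DataFrame({"height": heights, "z_score": z_score, "min_max": min_max}))
```

pandas skips NaN when computing `mean`, `std`, `min`, and `max`, so missing heights do not distort the statistics; they simply remain missing in the output.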

In [17]:
art['height_norm'] = normalize
art['height_minmax'] = minmax

In [18]:
art['height_cm'] = art.height.transform(lambda x: x/10)

In [19]:
art.groupby('artist')['height'].transform('mean')

0         373.250000
1         373.250000
2         373.250000
3         373.250000
4         266.944134
            ...     
69196     305.000000
69197     305.000000
69198    1491.285714
69199    3515.333333
69200     660.000000
Name: height, Length: 69201, dtype: float64
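
`groupby(...).transform('mean')` returns one value per original row (the mean of that row's group), which is why the output above has the same length as `art`. A toy sketch with hypothetical artists and heights:

```python
import pandas as pd

# Hypothetical data: two works by artist "A", one by artist "B".
df = pd.DataFrame({
    "artist": ["A", "A", "B"],
    "height": [100.0, 300.0, 50.0],
})

# transform broadcasts each group's mean back onto the rows of that group.
df["artist_mean_height"] = df.groupby("artist")["height"].transform("mean")
print(df)
```

Because the result is aligned with the original index, it can be assigned directly as a new column, unlike `groupby(...).mean()`, which returns one row per group.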

In [20]:
simpleart = art.filter(items=['artist', 'height', 'width'])
simpleart

Unnamed: 0,artist,height,width
0,"Blake, Robert",419.0,394.0
1,"Blake, Robert",213.0,311.0
2,"Blake, Robert",467.0,343.0
3,"Blake, Robert",394.0,318.0
4,"Blake, William",335.0,243.0
...,...,...,...
69196,"P-Orridge, Genesis",305.0,305.0
69197,"P-Orridge, Genesis",305.0,305.0
69198,"Hatoum, Mona",2410.0,45.0
69199,"Creed, Martin",,


In [21]:
simpleart.drop(0)

Unnamed: 0,artist,height,width
1,"Blake, Robert",213.0,311.0
2,"Blake, Robert",467.0,343.0
3,"Blake, Robert",394.0,318.0
4,"Blake, William",335.0,243.0
5,"Blake, William",338.0,240.0
...,...,...,...
69196,"P-Orridge, Genesis",305.0,305.0
69197,"P-Orridge, Genesis",305.0,305.0
69198,"Hatoum, Mona",2410.0,45.0
69199,"Creed, Martin",,


In [22]:
simpleart['columtodrop'] = 0
simpleart['columntodrop'] = 1
simpleart

Unnamed: 0,artist,height,width,columtodrop,columntodrop
0,"Blake, Robert",419.0,394.0,0,1
1,"Blake, Robert",213.0,311.0,0,1
2,"Blake, Robert",467.0,343.0,0,1
3,"Blake, Robert",394.0,318.0,0,1
4,"Blake, William",335.0,243.0,0,1
...,...,...,...,...,...
69196,"P-Orridge, Genesis",305.0,305.0,0,1
69197,"P-Orridge, Genesis",305.0,305.0,0,1
69198,"Hatoum, Mona",2410.0,45.0,0,1
69199,"Creed, Martin",,,0,1


In [23]:
simpleart.drop('columtodrop',axis=1,inplace=True)
simpleart.drop(columns=['columntodrop'],inplace=True)
simpleart

Unnamed: 0,artist,height,width
0,"Blake, Robert",419.0,394.0
1,"Blake, Robert",213.0,311.0
2,"Blake, Robert",467.0,343.0
3,"Blake, Robert",394.0,318.0
4,"Blake, William",335.0,243.0
...,...,...,...
69196,"P-Orridge, Genesis",305.0,305.0
69197,"P-Orridge, Genesis",305.0,305.0
69198,"Hatoum, Mona",2410.0,45.0
69199,"Creed, Martin",,


In [24]:
art.columns

Index(['id', 'accession_number', 'artist', 'artistRole', 'artistId', 'title',
       'dateText', 'medium', 'creditLine', 'year', 'acquisitionYear',
       'dimensions', 'width', 'height', 'depth', 'units', 'inscription',
       'thumbnailCopyright', 'thumbnailUrl', 'url', 'height_norm',
       'height_minmax', 'height_cm'],
      dtype='object')

In [26]:
art.columns = [re.sub(r'([A-Z])',r'_\1' ,x).lower() for x in art.columns]
art.columns

Index(['id', 'accession_number', 'artist', 'artist_role', 'artist_id', 'title',
       'date_text', 'medium', 'credit_line', 'year', 'acquisition_year',
       'dimensions', 'width', 'height', 'depth', 'units', 'inscription',
       'thumbnail_copyright', 'thumbnail_url', 'url', 'height_norm',
       'height_minmax', 'height_cm'],
      dtype='object')
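
The renaming step inserts an underscore before every capital letter and then lowercases the result. The sketch below applies the same substitution to a few of the column names shown above, outside of pandas:

```python
import re

# A few camelCase column names taken from the dataset.
columns = ["artistRole", "acquisitionYear", "thumbnailUrl", "url"]

# Prefix each capital letter with "_" and lowercase the whole name.
snake = [re.sub(r"([A-Z])", r"_\1", name).lower() for name in columns]
print(snake)
```

One caveat: a name that starts with a capital letter would gain a leading underscore with this pattern, so it fits this dataset's lower-camel-case names but may need adjusting elsewhere.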

In [27]:
art.rename(columns={'thumbnail_url':'thumbnail'},inplace=True)

In [28]:
art['artist'][0]

'Blake, Robert'

In [29]:
art[['artist','medium']]['artist'][0]

'Blake, Robert'

In [30]:
art[['artist','medium']][1:2]

Unnamed: 0,artist,medium
1,"Blake, Robert",Graphite on paper


In [31]:
art[art['year'] > 2000][['artist','medium','year']]

Unnamed: 0,artist,medium,year
1763,"Richter, Gerhard",Glass and wood,2004.0
1765,"Richter, Gerhard",Oil paint on canvas,2004.0
1766,"Richter, Gerhard",Oil paint on aluminium,2002.0
1767,"Richter, Gerhard",Oil paint on aluminium,2002.0
1770,"Mueck, Ron",Mixed media,2005.0
...,...,...,...
69178,"Walker, Kelley",2 screenprints on fibreboard,2010.0
69179,"Bulloch, Angela","4 aluminium pixel boxes with DMX control box, ...",2012.0
69183,"Black, Karla","Cellophane, paint, sellotape, plaster powder, ...",2011.0
69187,"Lord, Andrew","Ceramic, silver and epoxy",2004.0


In [32]:
art.loc[art.artist == 'Richter, Gerhard',['title','year']]

Unnamed: 0,title,year
1759,"Mirror Painting (Grey, 735-2)",1991.0
1760,"Corner Mirror, brown-blue (737-1, 737-2)",1991.0
1761,"Corner Mirror, green-red (737-2 A, 737-2 B)",1991.0
1762,48 Portraits,1971.0
1763,11 Panes,2004.0
1764,Abstract Painting (809-3),1994.0
1765,Abstract Painting (Skin) (887-3),2004.0
1766,Abstract Painting (Silicate) (880-4),2002.0
1767,Abstract Painting (Grey) (880-3),2002.0
1919,"Self Portrait Standing, Three Times, 17.3.1991",1991.0
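
Both filtering styles above combine a boolean row mask with a column selection. A minimal sketch on a toy frame (hypothetical rows) shows that `.loc[mask, columns]` does this in a single indexing step:

```python
import pandas as pd

# Hypothetical rows for illustration only.
df = pd.DataFrame({
    "artist": ["Artist A", "Artist B", "Artist A"],
    "title": ["Work 1", "Work 2", "Work 3"],
    "year": [2004.0, 2005.0, 1971.0],
})

# Chained form: filter rows first, then pick columns.
recent_chained = df[df["year"] > 2000][["artist", "year"]]

# Single-step form: row mask and column list in one .loc call.
recent_loc = df.loc[df["year"] > 2000, ["artist", "year"]]

print(recent_loc)
```

The two forms return the same rows here; the `.loc` form is generally the safer choice when the selection is later used for assignment.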


In [33]:
art.medium

0        Watercolour, ink, chalk and graphite on paper....
1                                        Graphite on paper
2              Graphite on paper. Verso: graphite on paper
3                                        Graphite on paper
4                                  Line engraving on paper
                               ...                        
69196     Perspex, Wood, hairpiece, tampon and human blood
69197    Wood, Perspex, plastic, photograph on paper, t...
69198                                 Soap and glass beads
69199                                     Gallery lighting
69200                                  Oil paint on canvas
Name: medium, Length: 69201, dtype: object

In [34]:
art.loc[art.medium.str.contains('Graphite',case=False,na=False),['artist','medium']]

Unnamed: 0,artist,medium
0,"Blake, Robert","Watercolour, ink, chalk and graphite on paper...."
1,"Blake, Robert",Graphite on paper
2,"Blake, Robert",Graphite on paper. Verso: graphite on paper
3,"Blake, Robert",Graphite on paper
32,"Blake, William",Graphite on paper. Verso: graphite on paper
...,...,...
69140,"Horn, Rebecca","Graphite, coloured graphite and acrylic paint ..."
69154,"Hepworth, Dame Barbara","Graphite, watercolour, crayon and oil paint on..."
69155,"Hepworth, Dame Barbara","Oil paint, watercolour, crayon and graphite on..."
69156,"Hepworth, Dame Barbara","Oil paint, watercolour, crayon and graphite on..."


In [35]:
art.loc[art.medium.str.contains('(?i)Graphite',regex=True,na=False),['artist','medium']]

Unnamed: 0,artist,medium
0,"Blake, Robert","Watercolour, ink, chalk and graphite on paper...."
1,"Blake, Robert",Graphite on paper
2,"Blake, Robert",Graphite on paper. Verso: graphite on paper
3,"Blake, Robert",Graphite on paper
32,"Blake, William",Graphite on paper. Verso: graphite on paper
...,...,...
69140,"Horn, Rebecca","Graphite, coloured graphite and acrylic paint ..."
69154,"Hepworth, Dame Barbara","Graphite, watercolour, crayon and oil paint on..."
69155,"Hepworth, Dame Barbara","Oil paint, watercolour, crayon and graphite on..."
69156,"Hepworth, Dame Barbara","Oil paint, watercolour, crayon and graphite on..."
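
The two cells above express the same case-insensitive match in different ways: the `case=False` argument and the inline regex flag `(?i)`. A toy sketch with hypothetical medium strings confirms that they produce the same mask:

```python
import pandas as pd

# Hypothetical values, including a missing entry.
medium = pd.Series([
    "Graphite on paper",
    "Oil paint and graphite on board",
    None,
    "Line engraving on paper",
])

# na=False treats missing values as non-matches, so the mask is purely boolean.
mask_flag = medium.str.contains("graphite", case=False, na=False)
mask_inline = medium.str.contains("(?i)graphite", regex=True, na=False)

print(mask_flag.tolist())
```

Without `na=False`, missing values would propagate into the mask and make it unsuitable for boolean indexing.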


In [36]:
art.loc[
        art.medium.str.contains('Graphite|line',case=False,na=False) &
        art.artist.str.contains('(?i)Blake',regex=True,na=False) &
        art.year.astype('str').str.contains('1797',case=False,na=False),
        ['artist','medium','year']
        ]

Unnamed: 0,artist,medium,year
62011,"Blake, William",Line engraving and etching on paper,1797.0
62012,"Blake, William",Line engraving and etching on paper,1797.0
62013,"Blake, William",Line engraving and etching on paper,1797.0


In [37]:
art.loc[art.title.str.contains(r'\s$', regex=True, na=False)]

Unnamed: 0,id,accession_number,artist,artist_role,artist_id,title,date_text,medium,credit_line,year,...,height,depth,units,inscription,thumbnail_copyright,thumbnail,url,height_norm,height_minmax,height_cm
49498,4308,P07466,"Hamilton Finlay, Ian",artist,1093,Port-Distinguishing Letters of Scottish Fishin...,1976,Screenprint on ceramic tile,Purchased 1981,1976.0,...,153.0,,mm,,© Estate of Ian Hamilton Finlay,http://www.tate.org.uk/art/images/work/P/P07/P...,http://www.tate.org.uk/art/artworks/hamilton-f...,-0.359521,0.003921,15.3
50534,2235,P11065,"Clarke, Brian",artist,911,Boys,1981,Screenprint on paper,Presented by Paul Beldock 1983,1981.0,...,700.0,,mm,date inscribed,© Brian Clarke. All Rights Reserved 2014 / DACS,http://www.tate.org.uk/art/images/work/P/P11/P...,http://www.tate.org.uk/art/artworks/clarke-boy...,0.657135,0.01851,70.0
50535,2236,P11066,"Clarke, Brian",artist,911,Buildings,1981,Screenprint on paper,Presented by Paul Beldock 1983,1981.0,...,698.0,,mm,date inscribed,© Brian Clarke. All Rights Reserved 2014 / DACS,http://www.tate.org.uk/art/images/work/P/P11/P...,http://www.tate.org.uk/art/artworks/clarke-bui...,0.653418,0.018456,69.8
50537,2238,P11068,"Clarke, Brian",artist,911,Pray for Josquin,1981,Screenprint on paper,Presented by Paul Beldock 1983,1981.0,...,688.0,,mm,date inscribed,© Brian Clarke. All Rights Reserved 2014 / DACS,http://www.tate.org.uk/art/images/work/P/P11/P...,http://www.tate.org.uk/art/artworks/clarke-pra...,0.634832,0.01819,68.8
53186,21168,P77679,"Bourgeois, Louise",artist,2351,Untitled (Safety Pins),1991,Drypoint on paper,Purchased 1994,1991.0,...,379.0,,mm,,© The Easton Foundation,http://www.tate.org.uk/art/images/work/P/P77/P...,http://www.tate.org.uk/art/artworks/bourgeois-...,0.060523,0.009948,37.9
56283,5826,T00705,"Hamilton, Richard",artist,1244,Towards a definitive statement on the coming t...,1962,"Oil paint, cellulose paint and printed paper o...",Purchased 1964,1962.0,...,813.0,,mm,date inscribed,© The estate of Richard Hamilton,http://www.tate.org.uk/art/images/work/T/T00/T...,http://www.tate.org.uk/art/artworks/hamilton-t...,0.867158,0.021523,81.3
67409,88191,T12064,"Leach, David",artist,7651,2 Standard Ware Mead Cups,1945–55,Ochre porcelain,Accepted by HM Government in lieu of inheritan...,1945.0,...,80.0,80.0,mm,,© The estate of Bernard Leach,http://www.tate.org.uk/art/images/work/T/T12/T...,http://www.tate.org.uk/art/artworks/leach-2-st...,-0.4952,0.001974,8.0
67432,88215,T12087,"Leach, Bernard",artist,1478,Bowl,c.1960,Porcelain,Accepted by HM Government in lieu of inheritan...,1960.0,...,140.0,140.0,mm,,© The estate of Bernard Leach,http://www.tate.org.uk/art/images/work/T/T12/T...,http://www.tate.org.uk/art/artworks/leach-bowl...,-0.383683,0.003574,14.0


In [38]:
art.title = art.title.str.strip()

In [39]:
pd.isna(art.loc[:,'date_text'])

0        False
1        False
2        False
3        False
4        False
         ...  
69196    False
69197    False
69198    False
69199    False
69200    False
Name: date_text, Length: 69201, dtype: bool

In [43]:
art.replace({'date_text':{'date not known': nan}},inplace=True)

In [44]:
art.loc[art.date_text=='date not known',['date_text']] = nan
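
The two cells above are alternative ways to turn a placeholder string into a real missing value. A minimal sketch on a toy column (the sample strings are hypothetical, apart from the `'date not known'` marker used in the dataset):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"date_text": ["1931", "date not known", "c.1960"]})

# Before: the placeholder is an ordinary string, so isna() does not see it.
missing_before = int(df["date_text"].isna().sum())

# Replace the sentinel with NaN so pandas treats it as missing.
df["date_text"] = df["date_text"].replace("date not known", np.nan)
missing_after = int(df["date_text"].isna().sum())

print(missing_before, missing_after)
```

Once the sentinel is NaN, `isna`, `fillna`, and `dropna` all handle it consistently.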

In [45]:
art.loc[art.year.notnull() & art.year.astype(str).str.contains('^[0-9]')]

Unnamed: 0,id,accession_number,artist,artist_role,artist_id,title,date_text,medium,credit_line,year,...,height,depth,units,inscription,thumbnail_copyright,thumbnail,url,height_norm,height_minmax,height_cm
2,1037,A00003,"Blake, Robert",artist,38,The Preaching of Warning. Verso: An Old Man En...,?c.1785,Graphite on paper. Verso: graphite on paper,Presented by Mrs John Richmond 1922,1785.0,...,467.0,,mm,,,http://www.tate.org.uk/art/images/work/A/A00/A...,http://www.tate.org.uk/art/artworks/blake-the-...,0.224081,0.012295,46.7
4,1039,A00005,"Blake, William",artist,39,The Circle of the Lustful: Francesca da Rimini...,"1826–7, reprinted 1892",Line engraving on paper,Purchased with the assistance of a special gra...,1826.0,...,335.0,,mm,,,http://www.tate.org.uk/art/images/work/A/A00/A...,http://www.tate.org.uk/art/artworks/blake-the-...,-0.021255,0.008775,33.5
5,1040,A00006,"Blake, William",artist,39,Ciampolo the Barrator Tormented by the Devils,"1826–7, reprinted 1892",Line engraving on paper,Purchased with the assistance of a special gra...,1826.0,...,338.0,,mm,,,http://www.tate.org.uk/art/images/work/A/A00/A...,http://www.tate.org.uk/art/artworks/blake-ciam...,-0.015680,0.008855,33.8
6,1041,A00007,"Blake, William",artist,39,The Baffled Devils Fighting,"1826–7, reprinted 1892",Line engraving on paper,Purchased with the assistance of a special gra...,1826.0,...,334.0,,mm,,,http://www.tate.org.uk/art/images/work/A/A00/A...,http://www.tate.org.uk/art/artworks/blake-the-...,-0.023114,0.008748,33.4
7,1042,A00008,"Blake, William",artist,39,The Six-Footed Serpent Attacking Agnolo Brunel...,"1826–7, reprinted 1892",Line engraving on paper,Purchased with the assistance of a special gra...,1826.0,...,340.0,,mm,,,http://www.tate.org.uk/art/images/work/A/A00/A...,http://www.tate.org.uk/art/artworks/blake-the-...,-0.011962,0.008908,34.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
69196,122960,T13865,"P-Orridge, Genesis",artist,16646,Larvae (from Tampax Romana),1975,"Perspex, Wood, hairpiece, tampon and human blood",Transferred from Tate Archive 2012,1975.0,...,305.0,135.0,mm,,,,http://www.tate.org.uk/art/artworks/p-orridge-...,-0.077013,0.007975,30.5
69197,122961,T13866,"P-Orridge, Genesis",artist,16646,Living Womb (from Tampax Romana),1976,"Wood, Perspex, plastic, photograph on paper, t...",Transferred from Tate Archive 2012,1976.0,...,305.0,135.0,mm,,,,http://www.tate.org.uk/art/artworks/p-orridge-...,-0.077013,0.007975,30.5
69198,121181,T13867,"Hatoum, Mona",artist,2365,Present Tense,1996,Soap and glass beads,Presented by Tate Members 2013,1996.0,...,2410.0,2990.0,mm,,,,http://www.tate.org.uk/art/artworks/hatoum-pre...,3.835350,0.064117,241.0
69199,112306,T13868,"Creed, Martin",artist,2760,Work No. 227: The lights going on and off,2000,Gallery lighting,"Purchased with funds provided by Tate Members,...",2000.0,...,,,,,,,http://www.tate.org.uk/art/artworks/creed-work...,,,


In [46]:
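# year is already float64 (coerced in In [5]), so this comparison matches no rows here;
# the pattern is only useful while the column still holds raw strings.
art.loc[art.year=='no date',['year']] = nan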

In [47]:
art.fillna(value={'depth':0},inplace=True)

In [48]:
art.dropna()

Unnamed: 0,id,accession_number,artist,artist_role,artist_id,title,date_text,medium,credit_line,year,...,height,depth,units,inscription,thumbnail_copyright,thumbnail,url,height_norm,height_minmax,height_cm
1025,550,A01029,"Ardizzone, Edward",artist,659,At the Brasserie,1931,Watercolour and ink on paper,Purchased 1940,1931.0,...,222.0,0.0,mm,date inscribed,© Tate,http://www.tate.org.uk/art/images/work/A/A01/A...,http://www.tate.org.uk/art/artworks/ardizzone-...,-0.231278,0.005761,22.2
1026,551,A01030,"Ardizzone, Edward",artist,659,The Meeting,1931,Watercolour and ink on paper,Purchased 1940,1931.0,...,159.0,0.0,mm,date inscribed,© Tate,http://www.tate.org.uk/art/images/work/A/A01/A...,http://www.tate.org.uk/art/artworks/ardizzone-...,-0.348370,0.004081,15.9
1027,552,A01031,"Ardizzone, Edward",artist,659,The Arrival,1931,Watercolour and ink on paper,Purchased 1940,1931.0,...,184.0,0.0,mm,date inscribed,© Tate,http://www.tate.org.uk/art/images/work/A/A01/A...,http://www.tate.org.uk/art/artworks/ardizzone-...,-0.301905,0.004747,18.4
1028,553,A01032,"Ardizzone, Edward",artist,659,The Bedroom,1931,Watercolour and ink on paper,Purchased 1940,1931.0,...,229.0,0.0,mm,date inscribed,© Tate,http://www.tate.org.uk/art/images/work/A/A01/A...,http://www.tate.org.uk/art/artworks/ardizzone-...,-0.218267,0.005948,22.9
1029,554,A01033,"Ardizzone, Edward",artist,659,The Departure,1931,Watercolour and ink on paper,Purchased 1940,1931.0,...,210.0,0.0,mm,date inscribed,© Tate,http://www.tate.org.uk/art/images/work/A/A01/A...,http://www.tate.org.uk/art/artworks/ardizzone-...,-0.253581,0.005441,21.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
67576,89914,T12231,"Moon, Jeremy",artist,1656,Drawing [18/7/71],1971,Graphite and pastel on paper,Purchased 2006,1971.0,...,253.0,0.0,mm,date inscribed,© Estate of Jeremy Moon,http://www.tate.org.uk/art/images/work/T/T12/T...,http://www.tate.org.uk/art/artworks/moon-drawi...,-0.173661,0.006588,25.3
67577,89915,T12232,"Moon, Jeremy",artist,1656,Drawing [13/12/71],1971,Graphite and pastel on paper,Purchased 2006,1971.0,...,253.0,0.0,mm,date inscribed,© Estate of Jeremy Moon,http://www.tate.org.uk/art/images/work/T/T12/T...,http://www.tate.org.uk/art/artworks/moon-drawi...,-0.173661,0.006588,25.3
67578,89916,T12233,"Moon, Jeremy",artist,1656,Drawing [25/5/73],1973,Graphite and pastel on paper,Purchased with assistance from Anne Best 2006,1973.0,...,253.0,0.0,mm,date inscribed,© Estate of Jeremy Moon,http://www.tate.org.uk/art/images/work/T/T12/T...,http://www.tate.org.uk/art/artworks/moon-drawi...,-0.173661,0.006588,25.3
67579,89917,T12234,"Moon, Jeremy",artist,1656,Drawing [24/6/74],1973,Graphite and pastel on paper,Purchased 2006,1973.0,...,253.0,0.0,mm,date inscribed,© Estate of Jeremy Moon,http://www.tate.org.uk/art/images/work/T/T12/T...,http://www.tate.org.uk/art/artworks/moon-drawi...,-0.173661,0.006588,25.3


In [49]:
art.dropna(subset=['year','acquisition_year'],how='all',inplace=True)

In [50]:
art.drop_duplicates(inplace=True)
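
The last few cells can be read as a small pipeline: fill a column's missing values, drop rows where every listed key column is missing, then remove exact duplicates. A sketch on a toy frame with hypothetical values shows the effect of each step:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "year": [1931.0, np.nan, np.nan, 1931.0],
    "acquisition_year": [1940.0, 1922.0, np.nan, 1940.0],
    "depth": [np.nan, 80.0, np.nan, np.nan],
})

# 1. Fill missing depth with 0; other columns are left untouched.
df = df.fillna({"depth": 0})

# 2. how="all" drops a row only if BOTH subset columns are missing (row 2 here).
df = df.dropna(subset=["year", "acquisition_year"], how="all")

# 3. Drop rows identical to an earlier row (row 3 repeats row 0).
df = df.drop_duplicates()

print(df)
```

Reassigning the result instead of using `inplace=True` is a stylistic choice in this sketch; the resulting data is the same.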