# Introduction to pandas

In this notebook, you'll learn the basics of reading in and exploring data using the `pandas` library.

### Import the `pandas` library, aliased as `pd`.

In [55]:
import pandas as pd

### Let's explore these pandas methods, attributes, and accessors
 - .head()
 - .tail()
 - .shape
 - .info()
 - .dtypes
 - .columns
 - .drop()
 - .unique()
 - .nunique()
 - .value_counts()
 - .query()
 - .rename()
 - .loc[]
 - [[]]

### Read in the public art data and examine the head, tail, shape, info and dtypes

In [56]:
art = pd.read_csv('../data/public_art.csv')

To inspect a portion of the dataframe, you can use `.head()` (to see the first few rows) or `.tail()` (to see the last few rows). Both show five rows by default; pass a number to see more or fewer.

In [57]:
art.head(2)

Unnamed: 0,Title,Last Name,First Name,Location,Medium,Type,Description,Latitude,Longitude,Mapped Location
0,[Cross Country Runners],Frost,Miley,"4001 Harding Rd., Nashville TN",Bronze,Sculpture,,36.12856,-86.8366,"(36.12856, -86.8366)"
1,[Fourth and Commerce Sculpture],Walker,Lin,"333 Commerce Street, Nashville TN",,Sculpture,,36.16234,-86.77774,"(36.16234, -86.77774)"


In [58]:
art.tail(2)

Unnamed: 0,Title,Last Name,First Name,Location,Medium,Type,Description,Latitude,Longitude,Mapped Location
130,Women Suffrage Memorial,LeQuire,Alan,"600 Charlotte Avenue, Nashville TN",Bronze sculpture,Sculpture,,36.16527,-86.78382,"(36.16527, -86.78382)"
131,Youth Opportunity Center-STARS Nashville - Pea...,Rudloff,Andee,1704 Charlotte Ave.,House paint on vinyl,Mural,,36.15896,-86.799,"(36.15896, -86.799)"


In [59]:
art.shape

(132, 10)

In [60]:
art.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 132 entries, 0 to 131
Data columns (total 10 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   Title            132 non-null    object 
 1   Last Name        132 non-null    object 
 2   First Name       122 non-null    object 
 3   Location         131 non-null    object 
 4   Medium           128 non-null    object 
 5   Type             132 non-null    object 
 6   Description      87 non-null     object 
 7   Latitude         132 non-null    float64
 8   Longitude        132 non-null    float64
 9   Mapped Location  132 non-null    object 
dtypes: float64(2), object(8)
memory usage: 10.4+ KB


**What do you notice?**

Quite a few missing Descriptions (45 of 132), 10 missing First Names, 4 missing Mediums, and one missing Location.
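
If you'd rather not subtract the non-null counts by hand, one possible approach is to count the missing values directly with `.isna().sum()`. The sketch below uses a small made-up dataframe (`toy` and its values are hypothetical stand-ins, not the public art data).

```python
import pandas as pd

# Hypothetical toy data standing in for the public art table
toy = pd.DataFrame({
    'Title': ['A', 'B', 'C'],
    'First Name': ['Kim', None, 'Lin'],
    'Description': [None, None, 'A mural'],
})

# .isna() marks each missing cell as True; summing counts them per column
missing_counts = toy.isna().sum()
print(missing_counts)
```

On the real data, the same call would be `art.isna().sum()`.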

In [61]:
art.dtypes

Title               object
Last Name           object
First Name          object
Location            object
Medium              object
Type                object
Description         object
Latitude           float64
Longitude          float64
Mapped Location     object
dtype: object

You may notice that most of the columns have the `object` dtype. This is the datatype that `pandas` uses here for text data.

The `float64` datatype is a numeric datatype that can handle decimal values.

In [62]:
art.columns

Index(['Title', 'Last Name', 'First Name', 'Location', 'Medium', 'Type',
       'Description', 'Latitude', 'Longitude', 'Mapped Location'],
      dtype='object')

Since the Mapped Location information is already contained in the Latitude and Longitude columns, you really don't need to store it twice. You can use the `.drop()` method to get rid of that column.

In [63]:
art.drop(columns='Mapped Location')

Unnamed: 0,Title,Last Name,First Name,Location,Medium,Type,Description,Latitude,Longitude
0,[Cross Country Runners],Frost,Miley,"4001 Harding Rd., Nashville TN",Bronze,Sculpture,,36.128560,-86.836600
1,[Fourth and Commerce Sculpture],Walker,Lin,"333 Commerce Street, Nashville TN",,Sculpture,,36.162340,-86.777740
2,12th & Porter Mural,Kennedy,Kim,114 12th Avenue N,Porter all-weather outdoor paint,Mural,Kim Kennedy is a musician and visual artist wh...,36.157900,-86.788170
3,A Splash of Color,Stevenson and Stanley and ROFF (Harroff),Doug and Ronnica and Lynn,616 17th Ave. N.,"Steel, brick, wood, and fabric on frostproof c...",Mural,Painted wooden hoop dancer on a twenty foot po...,36.162020,-86.799750
4,A Story of Nashville,Ridley,Greg,"615 Church Street, Nashville TN",Hammered copper repousse,Frieze,"Inside the Grand Reading Room, this is a serie...",36.162150,-86.782050
...,...,...,...,...,...,...,...,...,...
127,We Are Our Stories,Omari Booker & The REAL Program at Oasis Center,,1037 28th Avenue North,acrylic & spray paint on plywood,Mural,"""We Are Our Stories"" is a public art project t...",36.165101,-86.822209
128,Welcome to Flatrock,Cooper,Michael,3756 Nolensville Rd,Silicate paint on concrete,Mural,Trompe L'oeil animals and architectural stonew...,36.090820,-86.734450
129,Wind Reeds,Kahn,Ned,"1 Terminal Drive, Nashville TN",Aluminum panels,Sculpture,Hinged aluminum panels that cover a wall of th...,36.134690,-86.667770
130,Women Suffrage Memorial,LeQuire,Alan,"600 Charlotte Avenue, Nashville TN",Bronze sculpture,Sculpture,,36.165270,-86.783820


In [64]:
art.head(2)

Unnamed: 0,Title,Last Name,First Name,Location,Medium,Type,Description,Latitude,Longitude,Mapped Location
0,[Cross Country Runners],Frost,Miley,"4001 Harding Rd., Nashville TN",Bronze,Sculpture,,36.12856,-86.8366,"(36.12856, -86.8366)"
1,[Fourth and Commerce Sculpture],Walker,Lin,"333 Commerce Street, Nashville TN",,Sculpture,,36.16234,-86.77774,"(36.16234, -86.77774)"


What happened? The `.drop()` method returns a new dataframe and leaves the original untouched, and we failed to save that result. We need to assign the result back to the `art` variable.

In [65]:
art = art.drop(columns = 'Mapped Location')
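
Here is a minimal sketch of that behavior on a made-up dataframe (the name `toy` and its values are hypothetical). It shows that calling `.drop()` on its own leaves the original alone, and that reassigning is what makes the change stick.

```python
import pandas as pd

# Hypothetical toy data with a redundant column
toy = pd.DataFrame({
    'Title': ['A', 'B'],
    'Latitude': [36.1, 36.2],
    'Mapped Location': ['(36.1, -86.8)', '(36.2, -86.7)'],
})

# .drop() returns a new dataframe; toy itself is not modified
dropped = toy.drop(columns='Mapped Location')
unchanged = 'Mapped Location' in toy.columns  # True: the original still has it

# Assigning the result back is what actually removes the column from toy
toy = toy.drop(columns='Mapped Location')
print(list(toy.columns))
```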

What are the different Types of artwork in this dataset?

In [66]:
art['Type'].unique()

array(['Sculpture', 'Mural', 'Frieze', 'Monument', 'Mobile', 'Furniture',
       'Mosaic', 'Relief', 'Stained Glass', 'Bronzes',
       'Sculpture/Fountain', 'Various', 'Street Art', 'mural', 'Fountain',
       'Multipart'], dtype=object)

If you only care about the _number_ of unique values in a column, you can use `.nunique()`.

For example, if you want to know the number of artist last names:

In [67]:
art['Last Name'].nunique()

82
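
One difference worth knowing about: `.unique()` includes a missing value in its result, while `.nunique()` leaves missing values out of the count unless you pass `dropna=False`. A small sketch with made-up names (the `names` series is hypothetical):

```python
import pandas as pd

# Hypothetical series with a repeated name and one missing value
names = pd.Series(['Faxon', 'Cooper', 'Faxon', None])

distinct = names.unique()                      # distinct values, missing value included
n_distinct = names.nunique()                   # counts only non-missing distinct values
n_with_missing = names.nunique(dropna=False)   # counts the missing value as well

print(n_distinct, n_with_missing)
```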

Which is the most popular Type?

In [68]:
art['Type'].value_counts()

Sculpture             61
Mural                 38
Monument              16
Mosaic                 2
Various                2
Mobile                 2
Frieze                 2
Fountain               1
Street Art             1
Furniture              1
Stained Glass          1
Multipart              1
Bronzes                1
Sculpture/Fountain     1
Relief                 1
mural                  1
Name: Type, dtype: int64
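
If you want shares rather than raw counts, `.value_counts()` accepts `normalize=True`. The sketch below uses a made-up series (`types` is hypothetical) and also shows why `mural` and `Mural` appear as separate rows above: the comparison is case-sensitive.

```python
import pandas as pd

# Hypothetical series of artwork types, including one lowercase entry
types = pd.Series(['Sculpture', 'Sculpture', 'Mural', 'mural'])

counts = types.value_counts()                 # raw counts, most common first
shares = types.value_counts(normalize=True)   # fraction of all rows per value

print(counts)
print(shares)
```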

What if you want to see all of the Murals? You can slice a dataframe using the `.query` method.

In [69]:
art.query('Type == "Mural"')

Unnamed: 0,Title,Last Name,First Name,Location,Medium,Type,Description,Latitude,Longitude
2,12th & Porter Mural,Kennedy,Kim,114 12th Avenue N,Porter all-weather outdoor paint,Mural,Kim Kennedy is a musician and visual artist wh...,36.1579,-86.78817
3,A Splash of Color,Stevenson and Stanley and ROFF (Harroff),Doug and Ronnica and Lynn,616 17th Ave. N.,"Steel, brick, wood, and fabric on frostproof c...",Mural,Painted wooden hoop dancer on a twenty foot po...,36.16202,-86.79975
5,Aerial Innovations Mural,Rudloff,Andee,202 South 17th St.,House paint on wood,Mural,,36.17354,-86.73994
10,April Baby,Prestwod,Seth,3020 Charlotte Avenue,Acrylic Paint,Mural,portrait of artists little sister with links t...,36.15399,-86.819539
16,Bicycle Bus-Green Fleet,Rudloff,Andee,1st Avenue (under John Seigenthaler Pedestrian...,Metallic paint on metal/found object,Mural,,36.16131,-86.77336
19,Building a Positive Community,"Healing Arts Project, Inc.",Healing Arts Project,East Park Community Center,interior wall paint on board,Mural,"The Healing Arts Project, Inc. sponsored the c...",36.17214,-86.76244
26,Cool Fences,Guion,Scott,"500 East Iris Dr., Nashville, TN",Latex house paint on wood fence,Mural,Portraits of iconic musicians on decorative ba...,36.11554,-86.76366
28,Demonbreun Hill Mural,Deese,Bryan,1524 Demonbreun Street,Latex paint and spray paint,Mural,This piece celebrates Demonbreun Hills former ...,36.153,-86.790492
29,Dragon Wall Mural,Randolf and Glick,Adam and David,21st Avenue and Belcourt Ave.,painting,Mural,,36.1375,-86.80119
30,Eastside Mural,Sterling Goller-Brown. Ian Lawrence,,1008 Forrest Ave,Spray Paint,Mural,,36.178323,-86.75024


If you want to do further work or exploration with the sliced dataframe, you need to save it to a new variable.

In [70]:
murals = art.query('Type == "Mural"')
murals.shape

(38, 9)

Who is the most prolific mural painter in Nashville?

In [71]:
murals['Last Name'].value_counts()

Rudloff                                                6
Cooper                                                 6
Saporiti                                               5
Sterling Goller-Brown.  Ian Lawrence                   3
Deese                                                  2
Healing Arts Project, Inc.                             1
Bryan Deese, Audie Adams, Ryan Shrader                 1
Williams                                               1
Guion                                                  1
Prado                                                  1
Stevenson and Stanley and ROFF (Harroff)               1
Kennedy                                                1
Purcell                                                1
Prestwod                                               1
Randolf and Glick                                      1
Hughes                                                 1
Omari Booker & The REAL Program at Oasis Center        1
Ulibarri                       

Let's see all of the artwork that Cooper painted.

In [72]:
murals.query('Last Name == "Cooper"')

SyntaxError: invalid syntax (<unknown>, line 1)

Oh no. What went wrong?

The `.query` method cannot parse a column name that contains a space, because it reads the expression as Python-like code. Such names can be escaped by wrapping the column name in backticks.

In [73]:
murals.query('`Last Name` == "Cooper"')

Unnamed: 0,Title,Last Name,First Name,Location,Medium,Type,Description,Latitude,Longitude
39,Gone Fishing,Cooper,Michael,Church Street Park,Acrylic on Brick,Mural,Just having some fun with Trompe L'oeil balconies,36.16298,-86.78184
40,Happy Times at The Arcade,Cooper,Michael,In the Alley between 4th and 5th off of Union,Silicate paint on brick and concrete block,Mural,Trompe L'oeil artwork celebrating The Arcade,36.1647,-86.78043
56,Lane Motor Museum,Cooper,Michael,702 Murfreesboro Pike,Acrylic on Brick; acrylic on metal wall panels,Mural,"Trompe l'oeil scene on garage wall; custom ""po...",36.14034,-86.7344
58,Phillips Toy Mart,Cooper,Michael,5207 Harding Pike,Acrylic on Drywall,Mural,"Planes, trains and automobiles!",36.10255,-86.8697
75,Piecing It All Together,Cooper,Michael,"600 Church Street, Nashville TN",Painting on Stone,Mural,,36.16281,-86.78186
128,Welcome to Flatrock,Cooper,Michael,3756 Nolensville Rd,Silicate paint on concrete,Mural,Trompe L'oeil animals and architectural stonew...,36.09082,-86.73445
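
The same pattern on a tiny made-up dataframe (the name `toy` and the titles are hypothetical), in case you want to try the backtick syntax without loading the full dataset:

```python
import pandas as pd

# Hypothetical toy data with a space in one column name
toy = pd.DataFrame({
    'Last Name': ['Cooper', 'Rudloff', 'Cooper'],
    'Title': ['Work A', 'Work B', 'Work C'],
})

# Backticks tell .query() to treat `Last Name` as a single column name
cooper_rows = toy.query('`Last Name` == "Cooper"')
print(cooper_rows)
```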


But another option is to rename our columns to remove the spaces.

In [74]:
art.columns

Index(['Title', 'Last Name', 'First Name', 'Location', 'Medium', 'Type',
       'Description', 'Latitude', 'Longitude'],
      dtype='object')

Column names can be set either by specifying a **list** of names:

In [75]:
art.columns = ['title', 'last_name', 'first_name', 'location', 'medium',
              'type', 'description', 'lat', 'lng']

or by using the `.rename` method and passing in a **dictionary**. A dictionary is a collection of key-value pairs; here, each key is a current column name and each value is its new name.

In [76]:
art = art.rename(columns = {'Title': 'title', 
                            'Last Name': 'last_name', 
                            'First Name': 'first_name',
                            'Location': 'loc', 
                            'Medium': 'medium', 
                            'Description': 'desc', 
                            'Latitude': 'lat', 
                            'Longitude': 'lng'})
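
One thing to watch for with `.rename`: by default, a key that does not match any existing column is silently skipped, so a misspelled name produces no error and no change. Passing `errors='raise'` is one way to catch this. The sketch below uses a made-up dataframe and a deliberately misspelled key (`toy` and `'Titel'` are hypothetical).

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({'Title': ['A'], 'Latitude': [36.1]})

# 'Titel' is a deliberate typo: it matches no column, so nothing is renamed for it
renamed = toy.rename(columns={'Titel': 'title', 'Latitude': 'lat'})
print(list(renamed.columns))

# With errors='raise', the same typo raises a KeyError instead of passing silently
try:
    toy.rename(columns={'Titel': 'title'}, errors='raise')
    typo_caught = False
except KeyError:
    typo_caught = True
print(typo_caught)
```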

So now you can try to slice the murals dataframe using the new column name.

In [77]:
murals.query('last_name == "Cooper"')

UndefinedVariableError: name 'last_name' is not defined

Oops, what went wrong? Just because we renamed the art dataframe doesn't mean the murals one will be renamed as well. Fix the issue so that the query above will work.

In [78]:
murals = murals.rename(columns = {'Title': 'title', 
                            'Last Name': 'last_name', 
                            'First Name': 'first_name',
                            'Location': 'loc', 
                            'Medium': 'medium', 
                            'Desccription': 'desc()', 
                            'Latiitude': 'lat', 
                            'Longitude': 'lng'})

In [79]:
murals.query('last_name == "Cooper"')

Unnamed: 0,title,last_name,first_name,loc,medium,Type,Description,Latitude,lng
39,Gone Fishing,Cooper,Michael,Church Street Park,Acrylic on Brick,Mural,Just having some fun with Trompe L'oeil balconies,36.16298,-86.78184
40,Happy Times at The Arcade,Cooper,Michael,In the Alley between 4th and 5th off of Union,Silicate paint on brick and concrete block,Mural,Trompe L'oeil artwork celebrating The Arcade,36.1647,-86.78043
56,Lane Motor Museum,Cooper,Michael,702 Murfreesboro Pike,Acrylic on Brick; acrylic on metal wall panels,Mural,"Trompe l'oeil scene on garage wall; custom ""po...",36.14034,-86.7344
58,Phillips Toy Mart,Cooper,Michael,5207 Harding Pike,Acrylic on Drywall,Mural,"Planes, trains and automobiles!",36.10255,-86.8697
75,Piecing It All Together,Cooper,Michael,"600 Church Street, Nashville TN",Painting on Stone,Mural,,36.16281,-86.78186
128,Welcome to Flatrock,Cooper,Michael,3756 Nolensville Rd,Silicate paint on concrete,Mural,Trompe L'oeil animals and architectural stonew...,36.09082,-86.73445


In [80]:
murals['last_name'].value_counts()

Rudloff                                                6
Cooper                                                 6
Saporiti                                               5
Sterling Goller-Brown.  Ian Lawrence                   3
Deese                                                  2
Healing Arts Project, Inc.                             1
Bryan Deese, Audie Adams, Ryan Shrader                 1
Williams                                               1
Guion                                                  1
Prado                                                  1
Stevenson and Stanley and ROFF (Harroff)               1
Kennedy                                                1
Purcell                                                1
Prestwod                                               1
Randolf and Glick                                      1
Hughes                                                 1
Omari Booker & The REAL Program at Oasis Center        1
Ulibarri                       

Take another look at the murals dataframe and notice that Sterling Goller-Brown and Ian Lawrence collaborated on multiple murals, but these are stored in the dataframe differently. What if we want to slice down and find these rows?

In [81]:
goller_lawrence = ['Sterling Goller-Brown.  Ian Lawrence', 'Sterling Goller-Brown and Ian Lawrence, co-creators']

In [82]:
murals.query('last_name in @goller_lawrence')

Unnamed: 0,title,last_name,first_name,loc,medium,Type,Description,Latitude,lng
30,Eastside Mural,Sterling Goller-Brown. Ian Lawrence,,1008 Forrest Ave,Spray Paint,Mural,,36.178323,-86.75024
69,"Our Past, Your Future",Sterling Goller-Brown. Ian Lawrence,,1524 Gallatin Ave,Spray Paint,Mural,,36.194354,-86.743985
116,Tomatoes,"Sterling Goller-Brown and Ian Lawrence, co-cre...",,701 Porter Rd at Eastland Ave,paint on brick,Mural,Tomatoes,36.182437,-86.733449
118,Two Musicians,Sterling Goller-Brown. Ian Lawrence,,1008 Forrest Ave,Spray Paint,Mural,,36.178323,-86.75024
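
The `@` prefix tells `.query` to look up a Python variable rather than a column. A minimal sketch with made-up data (the names `toy` and `wanted` are hypothetical):

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({
    'last_name': ['Faxon', 'Cooper', 'Rudloff'],
    'title': ['Work A', 'Work B', 'Work C'],
})

# A regular Python list defined outside the query string
wanted = ['Faxon', 'Rudloff']

# @wanted refers to the Python variable; last_name refers to the column
matches = toy.query('last_name in @wanted')
print(matches)
```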


Another method of slicing a dataframe is by using `.loc`.

We can fetch rows based on their **index** values (the row labels displayed at the far left of the dataframe).

In [83]:
art.loc[20]

title                                                    Can-Do
last_name                                       Lucking-Reilley
first_name                                                 Mary
location       Corner of 12th Ave S & Sevier Park, Nashville TN
medium                                            Painted metal
type                                                  Sculpture
description            Trash can created by artist and children
lat                                                     36.1216
lng                                                    -86.7903
Name: 20, dtype: object

You can fetch a range of rows. Note that a `.loc` slice includes both endpoints:

In [84]:
art.loc[20:25]

Unnamed: 0,title,last_name,first_name,location,medium,type,description,lat,lng
20,Can-Do,Lucking-Reilley,Mary,"Corner of 12th Ave S & Sevier Park, Nashville TN",Painted metal,Sculpture,Trash can created by artist and children,36.12161,-86.79027
21,Chet Atkins,Faxon,Russell,"Corner of Fifth Avenue North and Union Street,...",Bronze stool and guitar on a granite plynth,Sculpture,A sculpture of a young Chet Atkins seated on a...,36.16466,-86.78102
22,Children's Chairs For The Seasons,McGraw,Deloss,"615 Church Street, Nashville TN",Mixed Media - wood and paint,Furniture,chairs depicting the four seasons,36.16215,-86.78205
23,Confederate Memorial,Nicoll,Carlo,"1101 Lebanon Pike, Nashville TN",Marble on barre granite,Monument,,36.14883,-86.73239
24,Confederate Private Monument,Zolnay,George Julian,"2500 West End Avenue, Nashville TN",Bronze on a limestone and granite base,Monument,"A seated portrait of Sam Davis, a boy hero of ...",36.14788,-86.81261
25,Confluence,Medwedeff,John,1515 Fifth Ave North,Steel,Sculpture,,36.17902,-86.79186
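
For contrast, here is a small sketch comparing `.loc` with its position-based sibling `.iloc`, which this notebook does not otherwise cover. It uses a made-up dataframe whose index labels start at 10 (the name `toy` is hypothetical) so that labels and positions are easy to tell apart.

```python
import pandas as pd

# Hypothetical toy data with index labels 10 through 15
toy = pd.DataFrame({'title': list('abcdef')}, index=[10, 11, 12, 13, 14, 15])

# .loc slices by label and includes the end label
by_label = toy.loc[11:13]

# .iloc slices by position and excludes the end position, like a Python list
by_position = toy.iloc[1:3]

print(list(by_label.index))
print(list(by_position.index))
```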


You can also fetch only columns you're interested in by specifying them by name.

In [85]:
art.loc[20:25, ['title', 'last_name', 'medium']]

Unnamed: 0,title,last_name,medium
20,Can-Do,Lucking-Reilley,Painted metal
21,Chet Atkins,Faxon,Bronze stool and guitar on a granite plynth
22,Children's Chairs For The Seasons,McGraw,Mixed Media - wood and paint
23,Confederate Memorial,Nicoll,Marble on barre granite
24,Confederate Private Monument,Zolnay,Bronze on a limestone and granite base
25,Confluence,Medwedeff,Steel


It's also possible to slice down to a list of columns using double brackets `[[ ]]`:

In [86]:
art[['title', 'type', 'description']]

Unnamed: 0,title,type,description
0,[Cross Country Runners],Sculpture,
1,[Fourth and Commerce Sculpture],Sculpture,
2,12th & Porter Mural,Mural,Kim Kennedy is a musician and visual artist wh...
3,A Splash of Color,Mural,Painted wooden hoop dancer on a twenty foot po...
4,A Story of Nashville,Frieze,"Inside the Grand Reading Room, this is a serie..."
...,...,...,...
127,We Are Our Stories,Mural,"""We Are Our Stories"" is a public art project t..."
128,Welcome to Flatrock,Mural,Trompe L'oeil animals and architectural stonew...
129,Wind Reeds,Sculpture,Hinged aluminum panels that cover a wall of th...
130,Women Suffrage Memorial,Sculpture,
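
The inner brackets are simply a Python list of column names. With single brackets and one name you get a Series back; with double brackets you get a DataFrame, even for one column. A quick sketch on made-up data (the name `toy` is hypothetical):

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({'title': ['A', 'B'], 'type': ['Mural', 'Sculpture']})

as_series = toy['title']     # single brackets: one column as a Series
as_frame = toy[['title']]    # double brackets: a DataFrame with one column

print(type(as_series).__name__, type(as_frame).__name__)
```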


And finally, you can slice based on a condition, similar to using `.query`.

In [87]:
art.loc[art['last_name'] == 'Faxon']

Unnamed: 0,title,last_name,first_name,location,medium,type,description,lat,lng
9,Anticipation,Faxon,Russell,"505 Deaderick Street, Nashville TN",Cold cast bronze,Sculpture,A statue of a woman with legs crossed sitting ...,36.16489,-86.78184
17,Blowing Bubbles,Faxon,Russell,"4715 Harding Pike, Nashville TN",Bronze,Sculpture,,36.11975,-86.85343
18,Breaking Bread,Faxon,Russell,705 Drexel Street,bronze,Sculpture,The work represents the principal of sharing.,36.15203,-86.77849
21,Chet Atkins,Faxon,Russell,"Corner of Fifth Avenue North and Union Street,...",Bronze stool and guitar on a granite plynth,Sculpture,A sculpture of a young Chet Atkins seated on a...,36.16466,-86.78102
31,Ed and Bernice Johnson and Mary,Faxon,Russell,"1900 Belmont Blvd, Nashville TN",Bronze,Sculpture,,36.13264,-86.79473
51,Isabelle and Calvin,Faxon,Russell,1718 Patterson Street,bronze on granite bench,Sculpture,,36.15759,-86.79837
67,"Oh, Roy",Faxon,Russell,"116 5th Ave N, Nashville TN",Bronze,Sculpture,Minnie Pearl and Roy Acuff seated on a bench i...,36.16151,-86.77816
95,St. Vincent de Paul,Faxon,Russell,"2216 State Street, Nashville TN",Hollow metal,Sculpture,De Paul wears long robes and a skull cap. He ...,36.15234,-86.80603
101,Tennessee Korean War Memorial,Faxon,Russell,"301 6th Avenue North, Nashville TN",Bronze sculpture on bronze base with black gra...,Monument,Two male figures dressed in combat fatigues an...,36.16414,-86.78289
108,The Readers,Faxon,Russell,"3701 Benham Avenue, Nashville",bronze on bronze bench,Sculpture,,36.10997,-86.80924
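
Conditions can also be combined. With `.loc`, wrap each condition in parentheses and join them with `&` (and) or `|` (or); with `.query`, the words `and` and `or` work. The sketch below, on made-up data (the name `toy` is hypothetical), suggests that both routes give the same rows.

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({
    'last_name': ['Faxon', 'Faxon', 'Cooper'],
    'medium': ['Bronze', 'Hollow metal', 'Bronze'],
})

# Boolean mask: parentheses are needed around each comparison
mask = (toy['last_name'] == 'Faxon') & (toy['medium'] == 'Bronze')
with_loc = toy.loc[mask]

# The same filter written as a query string
with_query = toy.query('last_name == "Faxon" and medium == "Bronze"')

print(with_loc.equals(with_query))
```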


Note that to slice to values in a list, you can use `.isin`.

In [88]:
art.loc[art['last_name'].isin(goller_lawrence)]

Unnamed: 0,title,last_name,first_name,location,medium,type,description,lat,lng
30,Eastside Mural,Sterling Goller-Brown. Ian Lawrence,,1008 Forrest Ave,Spray Paint,Mural,,36.178323,-86.75024
69,"Our Past, Your Future",Sterling Goller-Brown. Ian Lawrence,,1524 Gallatin Ave,Spray Paint,Mural,,36.194354,-86.743985
116,Tomatoes,"Sterling Goller-Brown and Ian Lawrence, co-cre...",,701 Porter Rd at Eastland Ave,paint on brick,Mural,Tomatoes,36.182437,-86.733449
118,Two Musicians,Sterling Goller-Brown. Ian Lawrence,,1008 Forrest Ave,Spray Paint,Mural,,36.178323,-86.75024


You can also negate a condition by adding a tilde `~` before that condition. So if you want to find all murals not painted by Goller-Brown and Lawrence:

In [102]:
murals.loc[~murals['last_name'].isin(goller_lawrence)]

Unnamed: 0,title,last_name,first_name,loc,medium,Type,Description,Latitude,lng
2,12th & Porter Mural,Kennedy,Kim,114 12th Avenue N,Porter all-weather outdoor paint,Mural,Kim Kennedy is a musician and visual artist wh...,36.1579,-86.78817
3,A Splash of Color,Stevenson and Stanley and ROFF (Harroff),Doug and Ronnica and Lynn,616 17th Ave. N.,"Steel, brick, wood, and fabric on frostproof c...",Mural,Painted wooden hoop dancer on a twenty foot po...,36.16202,-86.79975
5,Aerial Innovations Mural,Rudloff,Andee,202 South 17th St.,House paint on wood,Mural,,36.17354,-86.73994
10,April Baby,Prestwod,Seth,3020 Charlotte Avenue,Acrylic Paint,Mural,portrait of artists little sister with links t...,36.15399,-86.819539
16,Bicycle Bus-Green Fleet,Rudloff,Andee,1st Avenue (under John Seigenthaler Pedestrian...,Metallic paint on metal/found object,Mural,,36.16131,-86.77336
19,Building a Positive Community,"Healing Arts Project, Inc.",Healing Arts Project,East Park Community Center,interior wall paint on board,Mural,"The Healing Arts Project, Inc. sponsored the c...",36.17214,-86.76244
26,Cool Fences,Guion,Scott,"500 East Iris Dr., Nashville, TN",Latex house paint on wood fence,Mural,Portraits of iconic musicians on decorative ba...,36.11554,-86.76366
28,Demonbreun Hill Mural,Deese,Bryan,1524 Demonbreun Street,Latex paint and spray paint,Mural,This piece celebrates Demonbreun Hills former ...,36.153,-86.790492
29,Dragon Wall Mural,Randolf and Glick,Adam and David,21st Avenue and Belcourt Ave.,painting,Mural,,36.1375,-86.80119
39,Gone Fishing,Cooper,Michael,Church Street Park,Acrylic on Brick,Mural,Just having some fun with Trompe L'oeil balconies,36.16298,-86.78184
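
A handy way to think about `~` is that a mask and its negation split the dataframe into two non-overlapping pieces. A minimal sketch on made-up data (the names `toy`, `selected`, and `others` are hypothetical):

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({'last_name': ['Faxon', 'Cooper', 'Rudloff', 'Cooper']})

# True where the last name is in the list, False elsewhere
mask = toy['last_name'].isin(['Cooper'])

selected = toy.loc[mask]    # rows where the condition holds
others = toy.loc[~mask]     # rows where it does not

print(len(selected), len(others))
```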
