# Introduction to pandas

In this notebook, you'll get familiar with the basics of reading in and getting acquainted with data using the `pandas` library.

### Import the `pandas` library, aliased as `pd`.

In [1]:
import pandas as pd

### Let's explore these pandas methods, attributes, and accessors
 
**Methods of Inspecting** 
 * .head()
 * .tail()
 * .shape
 * .info()

**Methods of Modifying**
 * .drop()
 * renaming columns
 
**Methods of Summarizing**
 * .unique()
 * .nunique()
 * .value_counts()

**Methods of Slicing and Filtering**
 * .loc[]

## Step 1: Reading in Data and Initial Inspection

In [2]:
art = pd.read_csv('../data/public_art.csv')

To inspect a portion of the dataframe, you can use `.head()` (to see the first few rows) or `.tail()` (to see the last few rows).

In [3]:
art.head(2)

Unnamed: 0,Title,Last Name,First Name,Location,Medium,Type,Description,Latitude,Longitude,Mapped Location
0,[Cross Country Runners],Frost,Miley,"4001 Harding Rd., Nashville TN",Bronze,Sculpture,,36.12856,-86.8366,"(36.12856, -86.8366)"
1,[Fourth and Commerce Sculpture],Walker,Lin,"333 Commerce Street, Nashville TN",,Sculpture,,36.16234,-86.77774,"(36.16234, -86.77774)"


In [4]:
art.tail(2)

Unnamed: 0,Title,Last Name,First Name,Location,Medium,Type,Description,Latitude,Longitude,Mapped Location
130,Women Suffrage Memorial,LeQuire,Alan,"600 Charlotte Avenue, Nashville TN",Bronze sculpture,Sculpture,,36.16527,-86.78382,"(36.16527, -86.78382)"
131,Youth Opportunity Center-STARS Nashville - Pea...,Rudloff,Andee,1704 Charlotte Ave.,House paint on vinyl,Mural,,36.15896,-86.799,"(36.15896, -86.799)"


To see the number of rows and columns, you can access the `.shape` attribute. This shows (number of rows, number of columns).

In [5]:
art.shape

(132, 10)
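
As a small self-contained sketch of the three inspection tools above, the block below builds a made-up seven-row dataframe (the name `toy` and its contents are placeholders, not the real `public_art.csv`). It shows that `.head()` with no argument returns the first five rows, and that `.shape` is an attribute, so it takes no parentheses.

```python
import pandas as pd

# Hypothetical toy data standing in for the public art file
toy = pd.DataFrame({
    "Title": ["A", "B", "C", "D", "E", "F", "G"],
    "Type": ["Mural", "Sculpture", "Mural", "Frieze", "Mural", "Mosaic", "Mural"],
})

first_rows = toy.head()      # no argument: the first 5 rows
last_rows = toy.tail(2)      # the last 2 rows
n_rows, n_cols = toy.shape   # attribute access, no parentheses

print(first_rows)
print(last_rows)
print(n_rows, n_cols)
```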

To get more information about what is contained in each column, you can use `.info()`.

In [7]:
art.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 132 entries, 0 to 131
Data columns (total 10 columns):
Title              132 non-null object
Last Name          132 non-null object
First Name         122 non-null object
Location           131 non-null object
Medium             128 non-null object
Type               132 non-null object
Description        87 non-null object
Latitude           132 non-null float64
Longitude          132 non-null float64
Mapped Location    132 non-null object
dtypes: float64(2), object(8)
memory usage: 10.4+ KB


**What do you notice?**

There are quite a few missing Descriptions (only 87 of the 132 rows have one), 10 missing First Names, 4 missing Mediums, and one missing Location.

You may notice that most of the columns have the `object` datatype. This is the datatype that `pandas` uses for text data.

The `float64` datatype is a numeric datatype that can handle decimal values.
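
If you would rather not subtract the non-null counts from the row total by hand, one possible approach is to count the missing values directly with `.isna().sum()`. The sketch below uses a made-up three-row dataframe (`toy` and its values are illustrative assumptions, not taken from the real file).

```python
import pandas as pd

# Hypothetical toy data with one missing first name
toy = pd.DataFrame({
    "Title": ["A", "B", "C"],
    "First Name": ["Kim", None, "Alan"],
    "Latitude": [36.1, 36.2, 36.3],
})

# .isna() marks each missing cell as True; .sum() counts the Trues per column
missing_per_column = toy.isna().sum()

# .dtypes lists the datatype of every column
column_dtypes = toy.dtypes

print(missing_per_column)
print(column_dtypes)
```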

## Step 2: Making Modifications

Since the Mapped Location information is already contained in the Latitude and Longitude columns, you really don't need to store it twice. You can use the `.drop()` method to get rid of that column.

In [8]:
art.drop(columns='Mapped Location')

Unnamed: 0,Title,Last Name,First Name,Location,Medium,Type,Description,Latitude,Longitude
0,[Cross Country Runners],Frost,Miley,"4001 Harding Rd., Nashville TN",Bronze,Sculpture,,36.128560,-86.836600
1,[Fourth and Commerce Sculpture],Walker,Lin,"333 Commerce Street, Nashville TN",,Sculpture,,36.162340,-86.777740
2,12th & Porter Mural,Kennedy,Kim,114 12th Avenue N,Porter all-weather outdoor paint,Mural,Kim Kennedy is a musician and visual artist wh...,36.157900,-86.788170
3,A Splash of Color,Stevenson and Stanley and ROFF (Harroff),Doug and Ronnica and Lynn,616 17th Ave. N.,"Steel, brick, wood, and fabric on frostproof c...",Mural,Painted wooden hoop dancer on a twenty foot po...,36.162020,-86.799750
4,A Story of Nashville,Ridley,Greg,"615 Church Street, Nashville TN",Hammered copper repousse,Frieze,"Inside the Grand Reading Room, this is a serie...",36.162150,-86.782050
5,Aerial Innovations Mural,Rudloff,Andee,202 South 17th St.,House paint on wood,Mural,,36.173540,-86.739940
6,Airport Sun Project,Eldred,Dale,"1 Terminal Drive, Nashville TN",Light interference and transformation panels,Sculpture,Colorful panels along the truss system and tic...,36.130810,-86.668970
7,Andrew Jackson,Mills,Clark,"600 Charlotte Avenue, Nashville TN",Bronze sculpture with patina on a Tennessee ma...,Sculpture,A portrait sculpture of Andrew Jackson on a re...,36.166090,-86.783660
8,Angel,Ralston,William,"4715 Harding Pike, Nashville TN",Stone on concrete and brick base,Sculpture,An elongated figure consisting of a stylized f...,36.119750,-86.853430
9,Anticipation,Faxon,Russell,"505 Deaderick Street, Nashville TN",Cold cast bronze,Sculpture,A statue of a woman with legs crossed sitting ...,36.164890,-86.781840


In [9]:
art.head(2)

Unnamed: 0,Title,Last Name,First Name,Location,Medium,Type,Description,Latitude,Longitude,Mapped Location
0,[Cross Country Runners],Frost,Miley,"4001 Harding Rd., Nashville TN",Bronze,Sculpture,,36.12856,-86.8366,"(36.12856, -86.8366)"
1,[Fourth and Commerce Sculpture],Walker,Lin,"333 Commerce Street, Nashville TN",,Sculpture,,36.16234,-86.77774,"(36.16234, -86.77774)"


What happened? The `.drop()` method returns a new dataframe and leaves the original unchanged, and we failed to save that result. We need to assign the result back to the `art` dataframe.

In [12]:
art = art.drop(columns='Mapped Location')

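Note that running this cell a second time raises an error, because the column has already been dropped:

KeyError: "['Mapped Location'] not found in axis"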

In [13]:
art.head(2)

Unnamed: 0,Title,Last Name,First Name,Location,Medium,Type,Description,Latitude,Longitude
0,[Cross Country Runners],Frost,Miley,"4001 Harding Rd., Nashville TN",Bronze,Sculpture,,36.12856,-86.8366
1,[Fourth and Commerce Sculpture],Walker,Lin,"333 Commerce Street, Nashville TN",,Sculpture,,36.16234,-86.77774
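
The behavior of `.drop()` can be sketched on a made-up two-row dataframe (`toy` is a placeholder, not the real data). The block shows that the original is untouched until you reassign, and that passing `errors="ignore"` is one way to make the cell safe to re-run, since dropping a column that is already gone otherwise raises a `KeyError`.

```python
import pandas as pd

# Hypothetical toy data with a redundant column
toy = pd.DataFrame({
    "Title": ["A", "B"],
    "Latitude": [36.1, 36.2],
    "Mapped Location": ["(36.1, -86.8)", "(36.2, -86.7)"],
})

dropped = toy.drop(columns="Mapped Location")    # returns a new dataframe
still_there = "Mapped Location" in toy.columns   # the original is unchanged

toy = toy.drop(columns="Mapped Location")        # reassign to keep the change

# The column is already gone; errors="ignore" skips it instead of raising
toy = toy.drop(columns="Mapped Location", errors="ignore")

print(still_there)
print(list(toy.columns))
```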


Let's say you want to rename the columns of the art dataframe. One way to do this is to assign a new list of values to the `columns` attribute.

In [14]:
art.columns = ['title', 'last_name', 'first_name', 'location', 'medium', 'type', 'description', 'lat', 'lng']

In [15]:
art.head(2)

Unnamed: 0,title,last_name,first_name,location,medium,type,description,lat,lng
0,[Cross Country Runners],Frost,Miley,"4001 Harding Rd., Nashville TN",Bronze,Sculpture,,36.12856,-86.8366
1,[Fourth and Commerce Sculpture],Walker,Lin,"333 Commerce Street, Nashville TN",,Sculpture,,36.16234,-86.77774
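
Assigning to `columns` requires a name for every column, in order. If you only want to rename a few, an alternative is the `.rename()` method with a dictionary mapping old names to new ones. The sketch below uses a made-up one-row dataframe (`toy` is a placeholder). Like `.drop()`, `.rename()` returns a new dataframe, so the result is saved to a variable.

```python
import pandas as pd

# Hypothetical toy data with three of the original column names
toy = pd.DataFrame({
    "Title": ["Angel"],
    "Last Name": ["Ralston"],
    "Latitude": [36.11975],
})

# Only the columns named in the dictionary are renamed
renamed = toy.rename(columns={"Last Name": "last_name", "Latitude": "lat"})

print(list(renamed.columns))
print(list(toy.columns))  # the original keeps its names
```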


## Step 3: Exploring and Slicing
What are the different types of artwork in this dataset?

In [16]:
art['type'].unique()

array(['Sculpture', 'Mural', 'Frieze', 'Monument', 'Mobile', 'Furniture',
       'Mosaic', 'Relief', 'Stained Glass', 'Bronzes',
       'Sculpture/Fountain', 'Various', 'Street Art', 'mural', 'Fountain',
       'Multipart'], dtype=object)

If you only care about the _number_ of unique values in a column, you can use `.nunique()`.

For example, if you want to know the number of artist last names:

In [17]:
art['last_name'].nunique()

82
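
One detail worth knowing when a column has missing values: `.unique()` includes the missing value in its result, while `.nunique()` leaves it out of the count by default. The sketch below illustrates this on a made-up Series (`names` is a placeholder, not the real column).

```python
import pandas as pd

# Hypothetical toy column with a repeated name and one missing value
names = pd.Series(["Cooper", "Rudloff", "Cooper", None])

distinct_values = names.unique()                        # the missing value is listed
n_distinct = names.nunique()                            # missing values are not counted
n_distinct_with_missing = names.nunique(dropna=False)   # count the missing value too

print(distinct_values)
print(n_distinct, n_distinct_with_missing)
```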

Which is the most popular type?

In [20]:
art['type'].value_counts()

Sculpture             61
Mural                 38
Monument              16
Mosaic                 2
Various                2
Mobile                 2
Frieze                 2
Furniture              1
Stained Glass          1
Bronzes                1
Fountain               1
Relief                 1
mural                  1
Street Art             1
Multipart              1
Sculpture/Fountain     1
Name: type, dtype: int64
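
Notice that `Mural` and `mural` are counted separately, because `.value_counts()` treats text that differs only in capitalization as different values. One possible way to merge them, sketched here on a made-up Series (`types` is a placeholder), is to lowercase the text with `.str.lower()` before counting.

```python
import pandas as pd

# Hypothetical toy column with inconsistent capitalization
types = pd.Series(["Sculpture", "Mural", "mural", "Sculpture", "Sculpture", "Frieze"])

raw_counts = types.value_counts()                     # "Mural" and "mural" stay separate
normalized_counts = types.str.lower().value_counts()  # capitalization no longer matters
most_common = normalized_counts.idxmax()              # label with the highest count

print(raw_counts)
print(normalized_counts)
print(most_common)
```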

What if you want to see all of the Murals? You can filter the rows of a DataFrame by passing a conditional expression to `.loc[]`.

In [21]:
art.loc[art['type'] == 'Mural']

Unnamed: 0,title,last_name,first_name,location,medium,type,description,lat,lng
2,12th & Porter Mural,Kennedy,Kim,114 12th Avenue N,Porter all-weather outdoor paint,Mural,Kim Kennedy is a musician and visual artist wh...,36.1579,-86.78817
3,A Splash of Color,Stevenson and Stanley and ROFF (Harroff),Doug and Ronnica and Lynn,616 17th Ave. N.,"Steel, brick, wood, and fabric on frostproof c...",Mural,Painted wooden hoop dancer on a twenty foot po...,36.16202,-86.79975
5,Aerial Innovations Mural,Rudloff,Andee,202 South 17th St.,House paint on wood,Mural,,36.17354,-86.73994
10,April Baby,Prestwod,Seth,3020 Charlotte Avenue,Acrylic Paint,Mural,portrait of artists little sister with links t...,36.15399,-86.819539
16,Bicycle Bus-Green Fleet,Rudloff,Andee,1st Avenue (under John Seigenthaler Pedestrian...,Metallic paint on metal/found object,Mural,,36.16131,-86.77336
19,Building a Positive Community,"Healing Arts Project, Inc.",Healing Arts Project,East Park Community Center,interior wall paint on board,Mural,"The Healing Arts Project, Inc. sponsored the c...",36.17214,-86.76244
26,Cool Fences,Guion,Scott,"500 East Iris Dr., Nashville, TN",Latex house paint on wood fence,Mural,Portraits of iconic musicians on decorative ba...,36.11554,-86.76366
28,Demonbreun Hill Mural,Deese,Bryan,1524 Demonbreun Street,Latex paint and spray paint,Mural,This piece celebrates Demonbreun Hills former ...,36.153,-86.790492
29,Dragon Wall Mural,Randolf and Glick,Adam and David,21st Avenue and Belcourt Ave.,painting,Mural,,36.1375,-86.80119
30,Eastside Mural,Sterling Goller-Brown. Ian Lawrence,,1008 Forrest Ave,Spray Paint,Mural,,36.178323,-86.75024


If you want to do further work or exploration with the sliced dataframe, you need to save it to a new variable.

In [23]:
murals = art.loc[art['type'] == 'Mural']
murals.shape

(38, 9)
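
To see why this works, it can help to look at the conditional expression on its own. The sketch below uses a made-up three-row dataframe (`toy`, `is_mural`, and `toy_murals` are placeholder names). The comparison produces a Series of `True`/`False` values, and `.loc[]` keeps the rows where it is `True`. The `.copy()` call is optional; it makes the result an independent dataframe.

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({
    "title": ["Gone Fishing", "Angel", "Tomatoes"],
    "type": ["Mural", "Sculpture", "Mural"],
})

is_mural = toy["type"] == "Mural"      # boolean Series, one value per row
toy_murals = toy.loc[is_mural].copy()  # keep only the rows marked True

print(is_mural)
print(toy_murals)
```

Notice that the filtered rows keep their original index labels, which is why the mural tables in this notebook show index values such as 2, 3, 5, and 10.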

Who is the most prolific mural painter in Nashville?

In [24]:
murals['last_name'].value_counts()

Cooper                                                 6
Rudloff                                                6
Saporiti                                               5
Sterling Goller-Brown.  Ian Lawrence                   3
Deese                                                  2
Purcell                                                1
Haas                                                   1
Hughes                                                 1
Healing Arts Project, Inc.                             1
Sterling Goller-Brown and Ian Lawrence, co-creators    1
Kennedy                                                1
Brian Law / Jenna Colt                                 1
Williams                                               1
Randolf and Glick                                      1
Omari Booker & The REAL Program at Oasis Center        1
Stevenson and Stanley and ROFF (Harroff)               1
Bryan Deese, Audie Adams, Ryan Shrader                 1
Guion                          

Let's see all of the murals that Cooper painted.

In [25]:
murals.loc[murals['last_name'] == 'Cooper']

Unnamed: 0,title,last_name,first_name,location,medium,type,description,lat,lng
39,Gone Fishing,Cooper,Michael,Church Street Park,Acrylic on Brick,Mural,Just having some fun with Trompe L'oeil balconies,36.16298,-86.78184
40,Happy Times at The Arcade,Cooper,Michael,In the Alley between 4th and 5th off of Union,Silicate paint on brick and concrete block,Mural,Trompe L'oeil artwork celebrating The Arcade,36.1647,-86.78043
56,Lane Motor Museum,Cooper,Michael,702 Murfreesboro Pike,Acrylic on Brick; acrylic on metal wall panels,Mural,"Trompe l'oeil scene on garage wall; custom ""po...",36.14034,-86.7344
58,Phillips Toy Mart,Cooper,Michael,5207 Harding Pike,Acrylic on Drywall,Mural,"Planes, trains and automobiles!",36.10255,-86.8697
75,Piecing It All Together,Cooper,Michael,"600 Church Street, Nashville TN",Painting on Stone,Mural,,36.16281,-86.78186
128,Welcome to Flatrock,Cooper,Michael,3756 Nolensville Rd,Silicate paint on concrete,Mural,Trompe L'oeil animals and architectural stonew...,36.09082,-86.73445


In [29]:
murals['last_name'].value_counts()

Cooper                                                 6
Rudloff                                                6
Saporiti                                               5
Sterling Goller-Brown.  Ian Lawrence                   3
Deese                                                  2
Purcell                                                1
Haas                                                   1
Hughes                                                 1
Healing Arts Project, Inc.                             1
Sterling Goller-Brown and Ian Lawrence, co-creators    1
Kennedy                                                1
Brian Law / Jenna Colt                                 1
Williams                                               1
Randolf and Glick                                      1
Omari Booker & The REAL Program at Oasis Center        1
Stevenson and Stanley and ROFF (Harroff)               1
Bryan Deese, Audie Adams, Ryan Shrader                 1
Guion                          

Take another look at the value counts and notice that Sterling Goller-Brown and Ian Lawrence collaborated on multiple murals, but their names are recorded in two different ways in the `last_name` column. What if we want to find all of these rows? You can put both spellings in a list and use `.isin()`.

In [30]:
goller_lawrence = ['Sterling Goller-Brown.  Ian Lawrence', 'Sterling Goller-Brown and Ian Lawrence, co-creators']

In [31]:
murals.loc[murals['last_name'].isin(goller_lawrence)]

Unnamed: 0,title,last_name,first_name,location,medium,type,description,lat,lng
30,Eastside Mural,Sterling Goller-Brown. Ian Lawrence,,1008 Forrest Ave,Spray Paint,Mural,,36.178323,-86.75024
69,"Our Past, Your Future",Sterling Goller-Brown. Ian Lawrence,,1524 Gallatin Ave,Spray Paint,Mural,,36.194354,-86.743985
116,Tomatoes,"Sterling Goller-Brown and Ian Lawrence, co-cre...",,701 Porter Rd at Eastland Ave,paint on brick,Mural,Tomatoes,36.182437,-86.733449
118,Two Musicians,Sterling Goller-Brown. Ian Lawrence,,1008 Forrest Ave,Spray Paint,Mural,,36.178323,-86.75024
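
Listing every spelling works when there are only a few of them. As a possible alternative, you could match on a shared piece of text with `.str.contains()`. The sketch below compares the two approaches on a made-up three-row dataframe (`toy` is a placeholder that reuses a few values shown above). Passing `regex=False` treats the pattern as plain text.

```python
import pandas as pd

# Hypothetical toy data with two spellings of the same pair of artists
toy = pd.DataFrame({
    "title": ["Eastside Mural", "Tomatoes", "Gone Fishing"],
    "last_name": [
        "Sterling Goller-Brown.  Ian Lawrence",
        "Sterling Goller-Brown and Ian Lawrence, co-creators",
        "Cooper",
    ],
})

variants = [
    "Sterling Goller-Brown.  Ian Lawrence",
    "Sterling Goller-Brown and Ian Lawrence, co-creators",
]

by_list = toy.loc[toy["last_name"].isin(variants)]  # exact matches only
by_substring = toy.loc[
    toy["last_name"].str.contains("Goller-Brown", regex=False)  # any value containing the text
]

print(by_list)
print(by_substring)
```

Keep in mind that `.isin()` needs an exact match, including spacing, while a substring match may also pick up rows you did not intend, so it is worth checking the result.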


If you are only interested in certain columns, you can pass a list of column names as the second argument to `.loc[]`. The `:` in the first position means "keep all rows".

In [35]:
murals.loc[:,['title', 'location', 'medium']]

Unnamed: 0,title,location,medium
2,12th & Porter Mural,114 12th Avenue N,Porter all-weather outdoor paint
3,A Splash of Color,616 17th Ave. N.,"Steel, brick, wood, and fabric on frostproof c..."
5,Aerial Innovations Mural,202 South 17th St.,House paint on wood
10,April Baby,3020 Charlotte Avenue,Acrylic Paint
16,Bicycle Bus-Green Fleet,1st Avenue (under John Seigenthaler Pedestrian...,Metallic paint on metal/found object
19,Building a Positive Community,East Park Community Center,interior wall paint on board
26,Cool Fences,"500 East Iris Dr., Nashville, TN",Latex house paint on wood fence
28,Demonbreun Hill Mural,1524 Demonbreun Street,Latex paint and spray paint
29,Dragon Wall Mural,21st Avenue and Belcourt Ave.,painting
30,Eastside Mural,1008 Forrest Ave,Spray Paint
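
The row and column positions of `.loc[]` can also be used together. As a sketch on a made-up three-row dataframe (`toy` is a placeholder), the block below filters the rows with a condition and selects two columns in a single step.

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({
    "title": ["Gone Fishing", "Angel", "Tomatoes"],
    "type": ["Mural", "Sculpture", "Mural"],
    "medium": ["Acrylic on Brick", "Stone on concrete and brick base", "paint on brick"],
})

# First position: which rows to keep; second position: which columns to keep
mural_media = toy.loc[toy["type"] == "Mural", ["title", "medium"]]

print(mural_media)
```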


Finally, you can negate a condition by adding a tilde (`~`) before that condition. So if you want to find all murals not painted by Goller-Brown and Lawrence:

In [36]:
murals.loc[~murals['last_name'].isin(goller_lawrence)]

Unnamed: 0,title,last_name,first_name,location,medium,type,description,lat,lng
2,12th & Porter Mural,Kennedy,Kim,114 12th Avenue N,Porter all-weather outdoor paint,Mural,Kim Kennedy is a musician and visual artist wh...,36.1579,-86.78817
3,A Splash of Color,Stevenson and Stanley and ROFF (Harroff),Doug and Ronnica and Lynn,616 17th Ave. N.,"Steel, brick, wood, and fabric on frostproof c...",Mural,Painted wooden hoop dancer on a twenty foot po...,36.16202,-86.79975
5,Aerial Innovations Mural,Rudloff,Andee,202 South 17th St.,House paint on wood,Mural,,36.17354,-86.73994
10,April Baby,Prestwod,Seth,3020 Charlotte Avenue,Acrylic Paint,Mural,portrait of artists little sister with links t...,36.15399,-86.819539
16,Bicycle Bus-Green Fleet,Rudloff,Andee,1st Avenue (under John Seigenthaler Pedestrian...,Metallic paint on metal/found object,Mural,,36.16131,-86.77336
19,Building a Positive Community,"Healing Arts Project, Inc.",Healing Arts Project,East Park Community Center,interior wall paint on board,Mural,"The Healing Arts Project, Inc. sponsored the c...",36.17214,-86.76244
26,Cool Fences,Guion,Scott,"500 East Iris Dr., Nashville, TN",Latex house paint on wood fence,Mural,Portraits of iconic musicians on decorative ba...,36.11554,-86.76366
28,Demonbreun Hill Mural,Deese,Bryan,1524 Demonbreun Street,Latex paint and spray paint,Mural,This piece celebrates Demonbreun Hills former ...,36.153,-86.790492
29,Dragon Wall Mural,Randolf and Glick,Adam and David,21st Avenue and Belcourt Ave.,painting,Mural,,36.1375,-86.80119
39,Gone Fishing,Cooper,Michael,Church Street Park,Acrylic on Brick,Mural,Just having some fun with Trompe L'oeil balconies,36.16298,-86.78184
