<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

## Lab: Cleaning Rock Song Data

_Authors: Dave Yerrington (SF)_

---


In [6]:
import pandas as pd
import numpy as np 
import seaborn as sns

%matplotlib inline

### 1. Load `rock.csv` and do an initial examination of its data columns.

In [7]:
rockfile = "./datasets/rock.csv"

In [8]:
# Load the data.
rockfile = pd.read_csv("./datasets/rock.csv", sep=',')

In [9]:
# Look at the information regarding its columns.
rockfile

Unnamed: 0,Song Clean,ARTIST CLEAN,Release Year,COMBINED,First?,Year?,PlayCount,F*G
0,Caught Up in You,.38 Special,1982,Caught Up in You by .38 Special,1,1,82,82
1,Fantasy Girl,.38 Special,,Fantasy Girl by .38 Special,1,0,3,0
2,Hold On Loosely,.38 Special,1981,Hold On Loosely by .38 Special,1,1,85,85
3,Rockin' Into the Night,.38 Special,1980,Rockin' Into the Night by .38 Special,1,1,18,18
4,Art For Arts Sake,10cc,1975,Art For Arts Sake by 10cc,1,1,1,1
...,...,...,...,...,...,...,...,...
2225,She Loves My Automobile,ZZ Top,,She Loves My Automobile by ZZ Top,1,0,1,0
2226,Tube Snake Boogie,ZZ Top,1981,Tube Snake Boogie by ZZ Top,1,1,32,32
2227,Tush,ZZ Top,1975,Tush by ZZ Top,1,1,109,109
2228,TV Dinners,ZZ Top,1983,TV Dinners by ZZ Top,1,1,1,1


In [10]:
rockfile.columns

Index(['Song Clean', 'ARTIST CLEAN', 'Release Year', 'COMBINED', 'First?',
       'Year?', 'PlayCount', 'F*G'],
      dtype='object')

### 2.  Clean up the column names.

Let's clean up the column names. There are two ways we can accomplish this:

#### 2.A Change the column names when you import the data using `pd.read_csv()`.

Notice that, when passing `names=[..A LIST OF STRING..]` with a number of columns that matches the number of strings in the passed list, you replace the column names.

NOTE: When you create custom column names, the first row of the `.csv` already represents a header. It is important to tell `pandas` to skip that row. The `skiprows=1` keyword argument to `read_csv()` will tell `pandas` to skip the first row.

In [11]:
# Change the column names when loading the '.csv':
all_names = ['Song', 'Artist', 'Release Year', 'Combined', 'First?', 'Year?', 'Play_Count', 'F*G']
rockfile = pd.read_csv("./datasets/rock.csv", sep=',', skiprows = 1, names = all_names)
rockfile.head()
#
#pd.read_csv()

Unnamed: 0,Song,Artist,Release Year,Combined,First?,Year?,Play_Count,F*G
0,Caught Up in You,.38 Special,1982.0,Caught Up in You by .38 Special,1,1,82,82
1,Fantasy Girl,.38 Special,,Fantasy Girl by .38 Special,1,0,3,0
2,Hold On Loosely,.38 Special,1981.0,Hold On Loosely by .38 Special,1,1,85,85
3,Rockin' Into the Night,.38 Special,1980.0,Rockin' Into the Night by .38 Special,1,1,18,18
4,Art For Arts Sake,10cc,1975.0,Art For Arts Sake by 10cc,1,1,1,1


#### 2.B Change column names using the `.rename()` function.

The `.rename()` function takes an argument, `columns=name_dict`, in which `name_dict` is a dictionary containing the original column names as keys and the new column names as values.

In [12]:
# Change the column names using the `.rename()` function.
new_names = {'Song Clean':'Song',
            'ARTIST CLEAN': 'Artist',
            'COMBINED':'Combined'}
rockfile.rename(new_names)

Unnamed: 0,Song,Artist,Release Year,Combined,First?,Year?,Play_Count,F*G
0,Caught Up in You,.38 Special,1982,Caught Up in You by .38 Special,1,1,82,82
1,Fantasy Girl,.38 Special,,Fantasy Girl by .38 Special,1,0,3,0
2,Hold On Loosely,.38 Special,1981,Hold On Loosely by .38 Special,1,1,85,85
3,Rockin' Into the Night,.38 Special,1980,Rockin' Into the Night by .38 Special,1,1,18,18
4,Art For Arts Sake,10cc,1975,Art For Arts Sake by 10cc,1,1,1,1
...,...,...,...,...,...,...,...,...
2225,She Loves My Automobile,ZZ Top,,She Loves My Automobile by ZZ Top,1,0,1,0
2226,Tube Snake Boogie,ZZ Top,1981,Tube Snake Boogie by ZZ Top,1,1,32,32
2227,Tush,ZZ Top,1975,Tush by ZZ Top,1,1,109,109
2228,TV Dinners,ZZ Top,1983,TV Dinners by ZZ Top,1,1,1,1


#### 2.C Reassigning the `.columns` attribute of a DataFrame.

You can also just reassign the `.columns` attribute to a list of strings containing the new column names. 

The only caveat with reassigning `.columns` is that you have to reassign all of the column names at once. You can't partially replace a value by working on `.columns` directly. You have to reassign `.columns` with a list of equal length. 

In [40]:
# Replace the column names by reassigning the `.columns` attribute.
rockfile = pd.read_csv("./datasets/rock.csv", sep=',')
rockfile.columns =['Song', 'Artist', 'Release_Year', 'Combined', 'First?', 'Year?', 'Play_Count', 'F*G']
rockfile[rockfile.columns]

Unnamed: 0,Song,Artist,Release_Year,Combined,First?,Year?,Play_Count,F*G
0,Caught Up in You,.38 Special,1982,Caught Up in You by .38 Special,1,1,82,82
1,Fantasy Girl,.38 Special,,Fantasy Girl by .38 Special,1,0,3,0
2,Hold On Loosely,.38 Special,1981,Hold On Loosely by .38 Special,1,1,85,85
3,Rockin' Into the Night,.38 Special,1980,Rockin' Into the Night by .38 Special,1,1,18,18
4,Art For Arts Sake,10cc,1975,Art For Arts Sake by 10cc,1,1,1,1
...,...,...,...,...,...,...,...,...
2225,She Loves My Automobile,ZZ Top,,She Loves My Automobile by ZZ Top,1,0,1,0
2226,Tube Snake Boogie,ZZ Top,1981,Tube Snake Boogie by ZZ Top,1,1,32,32
2227,Tush,ZZ Top,1975,Tush by ZZ Top,1,1,109,109
2228,TV Dinners,ZZ Top,1983,TV Dinners by ZZ Top,1,1,1,1


### 3. Subsetting data where null values exist.

We have mixed `str` and `NaN` values in the `release` column. `NaN` stands for "not a number" and is the way `pandas` handles "nulls" or nonexistent data. We can use the `.isnull()` method of a Series to find null values.

Print the header of the data subset to where the `release` column is null values.

In [14]:
# Show records where df['release'] is null
rockfile[rockfile.Release_Year.isnull()]

Unnamed: 0,Song,Artist,Release_Year,Combined,First?,Year?,Play_Count,F*G
1,Fantasy Girl,.38 Special,,Fantasy Girl by .38 Special,1,0,3,0
10,"Baby, Please Don't Go",AC/DC,,"Baby, Please Don't Go by AC/DC",1,0,1,0
13,CAN'T STOP ROCK'N'ROLL,AC/DC,,CAN'T STOP ROCK'N'ROLL by AC/DC,1,0,5,0
16,Girls Got Rhythm,AC/DC,,Girls Got Rhythm by AC/DC,1,0,24,0
24,Let's Get It Up,AC/DC,,Let's Get It Up by AC/DC,1,0,4,0
...,...,...,...,...,...,...,...,...
2216,"I'm Bad, I'm Nationwide",ZZ Top,,"I'm Bad, I'm Nationwide by ZZ Top",1,0,10,0
2218,Just Got Paid,ZZ Top,,Just Got Paid by ZZ Top,1,0,2,0
2221,My Head's In Mississippi,ZZ Top,,My Head's In Mississippi by ZZ Top,1,0,1,0
2222,Party On The Patio,ZZ Top,,Party On The Patio by ZZ Top,1,0,14,0


### 4. Update slices of your DataFrame based on mask selection/slices.

In many scenarios, we want to upate values in our DataFrame according to criteria. Let's say we wanted to set all of the null values in `release` to 0.

With newer versions of `pandas`, in order to manipulate data in the original DataFrame, we have to use `.loc` while performing reassignment using a mask and an index.

For example, the following won't always work:
```python
df[row_mask]['column_name'] = new_value
```

The best way to accomplish the same task is:
```python
df.loc[row_mask, 'column_name'] = new_value
```

For multiple column assignment, you would use:
```python
df.loc[row_mask, ['col_1', 'col_2', 'col_3']] = new_value
```

#### 4.A Let's try it out. Make all of the null values in `release` 0.

In [41]:
# Replace release nulls with 0
#rockfile.loc[rockfile.Release_Year.isnull()] = 0
#rockfile.Release_Year


rockfile.loc[rockfile.Release_Year.isnull()] = 0
rockfile[rockfile.columns]

Unnamed: 0,Song,Artist,Release_Year,Combined,First?,Year?,Play_Count,F*G
0,Caught Up in You,.38 Special,1982,Caught Up in You by .38 Special,1,1,82,82
1,0,0,0,0,0,0,0,0
2,Hold On Loosely,.38 Special,1981,Hold On Loosely by .38 Special,1,1,85,85
3,Rockin' Into the Night,.38 Special,1980,Rockin' Into the Night by .38 Special,1,1,18,18
4,Art For Arts Sake,10cc,1975,Art For Arts Sake by 10cc,1,1,1,1
...,...,...,...,...,...,...,...,...
2225,0,0,0,0,0,0,0,0
2226,Tube Snake Boogie,ZZ Top,1981,Tube Snake Boogie by ZZ Top,1,1,32,32
2227,Tush,ZZ Top,1975,Tush by ZZ Top,1,1,109,109
2228,TV Dinners,ZZ Top,1983,TV Dinners by ZZ Top,1,1,1,1


#### 4.B Verify that `release` contains no null values.

In [16]:
# A:
rockfile[rockfile.Release_Year.isnull()]
#rockfile.Release_Year.value_counts(dropna=False)

Unnamed: 0,Song,Artist,Release_Year,Combined,First?,Year?,Play_Count,F*G


### 5. Ensure that the data types of the columns make sense. 

Verifying column data types is a critical part of data munging. If columns have the wrong data type, then there is usually corrupted or incorrect data in some of the observations.

#### 5.A Look at the data types for the columns. Are any incorrect given what the data represents?

In [17]:
# A:

rockfile.dtypes
#release year should be int

Song            object
Artist          object
Release_Year    object
Combined        object
First?           int64
Year?            int64
Play_Count       int64
F*G              int64
dtype: object

### 6. Investigate and clean up the `release` column.

The `release` column is a string data type when it should be an integer.

#### 6.A Figure out what value(s) are causing the `release` column to be encoded as a string instead of an integer.

In [21]:
# A: 
#rockfile.Release_Year.dtype == int #- this returns false
#rockfile.Release_Year.dtype == str #- this returns false
#rockfile.Release_Year.dtype == object #this returns true

#rockfile.loc[rockfile.Release_Year != 0]
#I am going to change this column to an int
#rockfile = rockfile.astype({'Release_Year':'int64'})

rockfile.select_dtypes(include='object', exclude= 'int').Release_Year


0       1982
1          0
2       1981
3       1980
4       1975
        ... 
2225       0
2226    1981
2227    1975
2228    1983
2229    1973
Name: Release_Year, Length: 2230, dtype: object

#### 6.B Look at the rows in which there is incorrect data in the `release` column.

In [42]:
# A:
rockfile.Release_Year.value_counts()
#songfacts.com and 1071

0                577
1973             104
1975              83
1977              83
1970              81
1971              75
1969              72
1980              70
1978              64
1979              63
1967              61
1981              61
1983              60
1976              56
1982              54
1984              51
1972              50
1974              48
1968              46
1987              39
1985              39
1986              37
1991              34
1989              32
1966              30
1988              29
1965              28
1994              25
1990              22
1993              19
1992              14
1964              14
1999              13
1995              10
1996               9
1997               9
1963               9
2002               6
1998               6
2005               5
2012               5
2004               5
2001               4
2008               3
2003               3
2011               3
1962               3
2000         

#### 6.C. Clean up the data. Normally we may replace the offending data with null np.nan values, however we previously converted all of the nan values in the release column to zeros so we might as well continue with the same practice. Replacing with 0 (or nan) will allow us to convert the column to numeric.

In [43]:
# A:
#replacing the two wrong values with new values
rockfile.Release_Year = rockfile.Release_Year.replace(['SONGFACTS.COM', '1071'], [0,'1971'])
rockfile = rockfile.astype({'Release_Year':'int64'}) #converting everything to int
rockfile.Release_Year.value_counts()

0       578
1973    104
1977     83
1975     83
1970     81
1971     76
1969     72
1980     70
1978     64
1979     63
1981     61
1967     61
1983     60
1976     56
1982     54
1984     51
1972     50
1974     48
1968     46
1985     39
1987     39
1986     37
1991     34
1989     32
1966     30
1988     29
1965     28
1994     25
1990     22
1993     19
1964     14
1992     14
1999     13
1995     10
1996      9
1997      9
1963      9
2002      6
1998      6
2012      5
2005      5
2004      5
2001      4
2000      3
2008      3
1962      3
2007      3
2003      3
2011      3
2013      2
2014      2
2006      1
1955      1
1961      1
1958      1
Name: Release_Year, dtype: int64

### 7. Get summary statistics for the `release` column using the `.describe()` function.

Now that the `release` column is finally a numeric data type, we can apply the `.describe()` function.  

#### 7.A Print out the summary stats for the `release` column. What is the earliest and latest release date?

In [36]:
# A:
rockfile.Release_Year.describe()

count    2230.000000
mean     1465.734978
std       867.221986
min         0.000000
25%         0.000000
50%      1973.000000
75%      1981.000000
max      2014.000000
Name: Release_Year, dtype: float64

#### 7.B Based on the summary statistics, is there anything else wrong with the `release` column? 

In [None]:
# A: 1071 was also wrong which I fixed above already when fixing the SONGFACTS.COM

_Looking at the DataFrame that contains the year 1071, we can see that the year was probably corrupted and should be replaced with something else if possible._

### 8. Make changes and investigate using custom functions with `.apply()`.

Let's say we want to traverse every single row in our data set and apply a function to that row.

#### 8.A Write a function that will take a row of a DataFrame and print out the song, artist, and whether or not the release date is < 1970.


In [114]:
# A:
def sard1970(row):
    
    if row.Release_Year > 1970:
        print(row[['Song', 'Artist', 'Release_Year']]) #with double bracket it went into dataframe
    #else:
   #     print('this song is before 1970 or equal')


rockfile.apply(sard1970, axis = 1)

Song            Caught Up in You
Artist               .38 Special
Release_Year                1982
Name: 0, dtype: object
Song            Hold On Loosely
Artist              .38 Special
Release_Year               1981
Name: 2, dtype: object
Song            Rockin' Into the Night
Artist                     .38 Special
Release_Year                      1980
Name: 3, dtype: object
Song            Art For Arts Sake
Artist                       10cc
Release_Year                 1975
Name: 4, dtype: object
Song              Kryptonite
Artist          3 Doors Down
Release_Year            2000
Name: 5, dtype: object
Song                   Loser
Artist          3 Doors Down
Release_Year            2000
Name: 6, dtype: object
Song            When I'm Gone
Artist           3 Doors Down
Release_Year             2002
Name: 7, dtype: object
Song               What's Up?
Artist          4 Non Blondes
Release_Year             1992
Name: 8, dtype: object
Song            Take On Me
Artist               

Song                 Them Bones
Artist          Alice In Chains
Release_Year               1992
Name: 90, dtype: object
Song                     Would?
Artist          Alice In Chains
Release_Year               1992
Name: 91, dtype: object
Song            Ain't Wastin' Time No More
Artist                Allman Brothers Band
Release_Year                          1972
Name: 92, dtype: object
Song                        Blue Sky
Artist          Allman Brothers Band
Release_Year                    1972
Name: 93, dtype: object
Song                  Good Clean Fun
Artist          Allman Brothers Band
Release_Year                    1990
Name: 94, dtype: object
Song                         Jessica
Artist          Allman Brothers Band
Release_Year                    1973
Name: 96, dtype: object
Song                         Melissa
Artist          Allman Brothers Band
Release_Year                    1972
Name: 97, dtype: object
Song              No One to Run With
Artist          Allman Brother

Name: 197, dtype: object
Song            All the Small Things
Artist                     Blink-182
Release_Year                    1999
Name: 199, dtype: object
Song            Call Me
Artist          Blondie
Release_Year       1980
Name: 200, dtype: object
Song            Heart of Glass
Artist                 Blondie
Release_Year              1978
Name: 201, dtype: object
Song            One Way Or Another
Artist                     Blondie
Release_Year                  1978
Name: 202, dtype: object
Song            (Don't Fear) The Reaper
Artist                 Blue Oyster Cult
Release_Year                       1976
Name: 203, dtype: object
Song             Burnin' for You
Artist          Blue Oyster Cult
Release_Year                1981
Name: 205, dtype: object
Song                    Godzilla
Artist          Blue Oyster Cult
Release_Year                1977
Name: 206, dtype: object
Song            I Love the Night
Artist          Blue Oyster Cult
Release_Year                1977
Na

Name: 304, dtype: object
Song             Run to You
Artist          Bryan Adams
Release_Year           1984
Name: 305, dtype: object
Song               Somebody
Artist          Bryan Adams
Release_Year           1984
Name: 306, dtype: object
Song            Summer Of '69
Artist            Bryan Adams
Release_Year             1984
Name: 307, dtype: object
Song                 Sorry
Artist          Buckcherry
Release_Year          2007
Name: 308, dtype: object
Song            Comedown
Artist              Bush
Release_Year        1994
Name: 312, dtype: object
Song            Machinehead
Artist                 Bush
Release_Year           1994
Name: 313, dtype: object
Song            Far Behind
Artist           Candlebox
Release_Year          1993
Name: 314, dtype: object
Song            IT'S TOO LATE
Artist            Carole King
Release_Year             1971
Name: 317, dtype: object
Song             WILD WORLD
Artist          Cat Stevens
Release_Year           1971
Name: 319, dtype: obje

Name: 470, dtype: object
Song            Sultans of Swing
Artist              Dire Straits
Release_Year                1978
Name: 472, dtype: object
Song            Walk of Life
Artist          Dire Straits
Release_Year            1985
Name: 473, dtype: object
Song            Counting Blue Cars
Artist                   Dishwalla
Release_Year                  1995
Name: 474, dtype: object
Song            Drift Away
Artist          Dobie Gray
Release_Year          1973
Name: 475, dtype: object
Song            Heavy Metal
Artist           Don Felder
Release_Year           1981
Name: 479, dtype: object
Song            All She Wants to Do Is Dance
Artist                            Don Henley
Release_Year                            1985
Name: 480, dtype: object
Song            Dirty Laundry
Artist             Don Henley
Release_Year             1982
Name: 481, dtype: object
Song            Sunset Grill
Artist            Don Henley
Release_Year            1985
Name: 482, dtype: object
Song   

Name: 561, dtype: object
Song                From the Beginning
Artist          Emerson, Lake & Palmer
Release_Year                      1972
Name: 562, dtype: object
Song                       Karn Evil 9
Artist          Emerson, Lake & Palmer
Release_Year                      1973
Name: 564, dtype: object
Song              Still You Turn Me On
Artist          Emerson, Lake & Palmer
Release_Year                      1973
Name: 566, dtype: object
Song                Bad Love
Artist          Eric Clapton
Release_Year            1989
Name: 568, dtype: object
Song                 Cocaine
Artist          Eric Clapton
Release_Year            1977
Name: 570, dtype: object
Song             Forever Man
Artist          Eric Clapton
Release_Year            1985
Name: 572, dtype: object
Song            I Can't Stand It
Artist              Eric Clapton
Release_Year                1981
Name: 573, dtype: object
Song            I Shot The Sheriff
Artist                Eric Clapton
Release_Year       

Name: 674, dtype: object
Song               Mama
Artist          Genesis
Release_Year       1983
Name: 675, dtype: object
Song            Man On The Corner
Artist                    Genesis
Release_Year                 1981
Name: 676, dtype: object
Song            Misunderstanding
Artist                   Genesis
Release_Year                1980
Name: 677, dtype: object
Song            No Reply At All
Artist                  Genesis
Release_Year               1981
Name: 678, dtype: object
Song            That's All
Artist             Genesis
Release_Year          1983
Name: 679, dtype: object
Song            The Lamb Lies Down On Broadway
Artist                                 Genesis
Release_Year                              1974
Name: 680, dtype: object
Song            Tonight Tonight (live)
Artist                         Genesis
Release_Year                      1986
Name: 681, dtype: object
Song            Turn It On Again
Artist                   Genesis
Release_Year              

Song             Load Out/Stay
Artist          Jackson Browne
Release_Year              1977
Name: 827, dtype: object
Song            Redneck Friend
Artist          Jackson Browne
Release_Year              1973
Name: 828, dtype: object
Song            Running On Empty
Artist            Jackson Browne
Release_Year                1977
Name: 829, dtype: object
Song            Somebody's Baby
Artist           Jackson Browne
Release_Year               1982
Name: 830, dtype: object
Song            The Load-out/stay
Artist             Jackson Browne
Release_Year                 1977
Name: 832, dtype: object
Song            The Road and the Sky
Artist                Jackson Browne
Release_Year                    1977
Name: 833, dtype: object
Song            YOU'VE GOT A FRIEND
Artist                 JAMES TAYLOR
Release_Year                   1971
Name: 836, dtype: object
Song            Been Caught Stealing
Artist              Jane's Addiction
Release_Year                    1990
Name: 837, d

Name: 950, dtype: object
Song            Only the Young
Artist                 Journey
Release_Year              1985
Name: 951, dtype: object
Song            Open Arms
Artist            Journey
Release_Year         1982
Name: 952, dtype: object
Song            Send Her My Love
Artist                   Journey
Release_Year                1983
Name: 953, dtype: object
Song            Separate Ways (Worlds Apart)
Artist                               Journey
Release_Year                            1983
Name: 954, dtype: object
Song            Wheel in the Sky
Artist                   Journey
Release_Year                1978
Name: 956, dtype: object
Song            Who's Crying Now
Artist                   Journey
Release_Year                1981
Name: 957, dtype: object
Song            Breaking The Law
Artist              Judas Priest
Release_Year                1980
Name: 958, dtype: object
Song            Electric Eye
Artist          Judas Priest
Release_Year            1982
Name: 960, 

Name: 1097, dtype: object
Song                Simple Man
Artist          Lynyrd Skynyrd
Release_Year              1973
Name: 1098, dtype: object
Song            Sweet Home Alabama
Artist              Lynyrd Skynyrd
Release_Year                  1974
Name: 1099, dtype: object
Song                That Smell
Artist          Lynyrd Skynyrd
Release_Year              1977
Name: 1100, dtype: object
Song            The Ballad Of Curtis Loew
Artist                     Lynyrd Skynyrd
Release_Year                         1974
Name: 1101, dtype: object
Song            Tuesday's Gone
Artist          Lynyrd Skynyrd
Release_Year              1973
Name: 1102, dtype: object
Song            What's Your Name
Artist            Lynyrd Skynyrd
Release_Year                1977
Name: 1103, dtype: object
Song            Blinded by the Light
Artist                  Manfred Mann
Release_Year                    1976
Name: 1106, dtype: object
Song            Paradise By The Dashboard Light
Artist                  

Name: 1206, dtype: object
Song            STILL THE ONE
Artist                Orleans
Release_Year             1976
Name: 1207, dtype: object
Song            If You Wanna Get to Heaven
Artist           Ozark Mountain Daredevils
Release_Year                          1973
Name: 1209, dtype: object
Song                          Jackie Blue
Artist          Ozark Mountain Daredevils
Release_Year                         1975
Name: 1210, dtype: object
Song            Bark At The Moon
Artist             Ozzy Osbourne
Release_Year                1983
Name: 1211, dtype: object
Song              Crazy Train
Artist          Ozzy Osbourne
Release_Year             1980
Name: 1213, dtype: object
Song            Diary Of A Madman
Artist              Ozzy Osbourne
Release_Year                 1981
Name: 1214, dtype: object
Song            Flying High Again
Artist              Ozzy Osbourne
Release_Year                 1981
Name: 1215, dtype: object
Song            Goodbye To Romance
Artist             

Name: 1299, dtype: object
Song            In the Air Tonight
Artist                Phil Collins
Release_Year                  1981
Name: 1302, dtype: object
Song            Another Brick In The Wall
Artist                         Pink Floyd
Release_Year                         1979
Name: 1303, dtype: object
Song            Another Brick in the Wall, Pt. 2
Artist                                Pink Floyd
Release_Year                                1979
Name: 1304, dtype: object
Song            Any Colour You Like
Artist                   Pink Floyd
Release_Year                   1973
Name: 1305, dtype: object
Song            Brain Damage
Artist            Pink Floyd
Release_Year            1973
Name: 1308, dtype: object
Song            Brain Damage /Eclipse
Artist                     Pink Floyd
Release_Year                     1973
Name: 1309, dtype: object
Song               Breathe
Artist          Pink Floyd
Release_Year          1973
Name: 1310, dtype: object
Song            Comforta

Song            Keep On Loving You
Artist              REO Speedwagon
Release_Year                  1980
Name: 1409, dtype: object
Song            Take It on the Run
Artist              REO Speedwagon
Release_Year                  1981
Name: 1414, dtype: object
Song            Rock And Roll, Hoochie Koo
Artist                      Rick Derringer
Release_Year                          1974
Name: 1416, dtype: object
Song               Jessie's Girl
Artist          Rick Springfield
Release_Year                1981
Name: 1417, dtype: object
Song            Back Off Boogaloo
Artist                Ringo Starr
Release_Year                 1972
Name: 1418, dtype: object
Song            Early 1970 [*]
Artist             Ringo Starr
Release_Year              1971
Name: 1419, dtype: object
Song            It Don't Come Easy
Artist                 Ringo Starr
Release_Year                  1971
Name: 1420, dtype: object
Song             No No Song
Artist          Ringo Starr
Release_Year           1

Name: 1558, dtype: object
Song            Still Loving You
Artist                 Scorpions
Release_Year                1984
Name: 1559, dtype: object
Song              The Zoo
Artist          Scorpions
Release_Year         1980
Name: 1560, dtype: object
Song            Wind Of Change
Artist               Scorpions
Release_Year              1990
Name: 1561, dtype: object
Song             Remedy
Artist          Seether
Release_Year       2005
Name: 1562, dtype: object
Song            God Save The Queen
Artist                 Sex Pistols
Release_Year                  1977
Name: 1564, dtype: object
Song                   45
Artist          Shinedown
Release_Year         2003
Name: 1566, dtype: object
Song               Don't You
Artist          Simple Minds
Release_Year            1985
Name: 1571, dtype: object
Song            18 And Life
Artist             Skid Row
Release_Year           1989
Name: 1572, dtype: object
Song            I Remember You
Artist                Skid Row
Release_

Name: 1881, dtype: object
Song            Train In Vain
Artist              The Clash
Release_Year             1979
Name: 1882, dtype: object
Song            Fire Woman
Artist            The Cult
Release_Year          1989
Name: 1883, dtype: object
Song            Another Park, Another Sunday
Artist                   The Doobie Brothers
Release_Year                            1974
Name: 1884, dtype: object
Song                    Black Water
Artist          The Doobie Brothers
Release_Year                   1974
Name: 1885, dtype: object
Song                    China Grove
Artist          The Doobie Brothers
Release_Year                   1973
Name: 1886, dtype: object
Song            Jesus Is Just Alright
Artist            The Doobie Brothers
Release_Year                     1972
Name: 1888, dtype: object
Song            Listen To The Music
Artist          The Doobie Brothers
Release_Year                   1972
Name: 1889, dtype: object
Song             Long Train Runnin'
Artist      

Name: 1999, dtype: object
Song            Seven Nation Army
Artist          The White Stripes
Release_Year                 2003
Name: 2000, dtype: object
Song               5:15
Artist          The Who
Release_Year       1973
Name: 2001, dtype: object
Song             Athena
Artist          The Who
Release_Year       1982
Name: 2003, dtype: object
Song            Baba O'Reilly
Artist                The Who
Release_Year             1971
Name: 2004, dtype: object
Song            Bargain
Artist          The Who
Release_Year       1971
Name: 2005, dtype: object
Song            Behind Blue Eyes
Artist                   The Who
Release_Year                1971
Name: 2006, dtype: object
Song            Eminence Front
Artist                 The Who
Release_Year              1982
Name: 2007, dtype: object
Song            Going Mobile
Artist               The Who
Release_Year            1971
Name: 2009, dtype: object
Song            Join Together
Artist                The Who
Release_Year       

Name: 2102, dtype: object
Song            Vertigo
Artist               U2
Release_Year       2004
Name: 2103, dtype: object
Song            When Love Comes to Town
Artist                               U2
Release_Year                       1988
Name: 2104, dtype: object
Song            Where the Streets Have No Name
Artist                                      U2
Release_Year                              1987
Name: 2105, dtype: object
Song            With Or Without You
Artist                           U2
Release_Year                   1987
Name: 2106, dtype: object
Song            Red Red Wine
Artist                  UB40
Release_Year            1983
Name: 2107, dtype: object
Song            Lights Out (Live)
Artist                        UFO
Release_Year                 1976
Name: 2108, dtype: object
Song              Stealin'
Artist          Uriah Heep
Release_Year          1973
Name: 2111, dtype: object
Song            The Wizard
Artist          Uriah Heep
Release_Year          1972


Name: 2223, dtype: object
Song            Sharp Dressed Man
Artist                     ZZ Top
Release_Year                 1983
Name: 2224, dtype: object
Song            Tube Snake Boogie
Artist                     ZZ Top
Release_Year                 1981
Name: 2226, dtype: object
Song              Tush
Artist          ZZ Top
Release_Year      1975
Name: 2227, dtype: object
Song            TV Dinners
Artist              ZZ Top
Release_Year          1983
Name: 2228, dtype: object
Song            WAITIN' FOR THE BUS/JESUS JUST LEFT CHICAGO
Artist                                               ZZ Top
Release_Year                                           1973
Name: 2229, dtype: object


0       None
1       None
2       None
3       None
4       None
        ... 
2225    None
2226    None
2227    None
2228    None
2229    None
Length: 2230, dtype: object

#### 8.B Using the `.apply()` function, apply the function you wrote to the first four rows of the DataFrame.

You will need to tell the `apply` function to operate row by row. Setting the keyword argument as `axis=1` indicates that the function should be applied to each row individually.

In [123]:
# A:
def sard1970(row):
    
    if row.Release_Year > 1970 & row.Release_Year != 0:
        print(row[['Song', 'Artist', 'Release_Year']]) #with double bracket it went into dataframe, single is a list
    #else:
   #     print('this song is before 1970 or equal')


rockfile.head(n=4).apply(sard1970, axis = 1)

Song            Caught Up in You
Artist               .38 Special
Release_Year                1982
Name: 0, dtype: object
Song            Hold On Loosely
Artist              .38 Special
Release_Year               1981
Name: 2, dtype: object
Song            Rockin' Into the Night
Artist                     .38 Special
Release_Year                      1980
Name: 3, dtype: object


0    None
1    None
2    None
3    None
dtype: object

You'll notice that there will be a final output Series of `None` values. The `.apply()` function, if a return value is not specified, will return a Series of `None` values (similar to how the default return for Python functions is `None` when a return statement is not specified).

### 9. Write a function that converts cells in a DataFrame to float and otherwise replaces them with `np.nan`.

If applied to our data, it would keep only the numeric information and otherwise input null values.

Recall that the try-except syntax in Python is a great way to try something and take another action if the initial step fails:

```python
try:
    Perform some action.
except:
   Perform some other action if the first failed with an error.
```

#### 9.A Write the function that takes a column and converts all of its values to float if possible and `np.nan` otherwise. The return value should be the converted Series.

In [136]:
# A:
def floatornan(dfcol):
    
    if dfcol.dtypes != float:
       # print('this column is not a float')
        try:
            dfcol = dfcol.astype(float)
            print('this column could turn into a float')
            print(dfcol)
        except:
            dfcol = np.nan
            print('this column could not turn into a float')
            print(dfcol)
    #return dfcol

#rockfile = rockfile.astype({'Release_Year':'int64'})
rockfile.apply(floatornan, axis = 0)

this column could not turn into a float
nan
this column could not turn into a float
nan
this column could turn into a float
0       1982.0
1          0.0
2       1981.0
3       1980.0
4       1975.0
         ...  
2225       0.0
2226    1981.0
2227    1975.0
2228    1983.0
2229    1973.0
Name: Release_Year, Length: 2230, dtype: float64
this column could not turn into a float
nan
this column could turn into a float
0       1.0
1       0.0
2       1.0
3       1.0
4       1.0
       ... 
2225    0.0
2226    1.0
2227    1.0
2228    1.0
2229    1.0
Name: First?, Length: 2230, dtype: float64
this column could turn into a float
0       1.0
1       0.0
2       1.0
3       1.0
4       1.0
       ... 
2225    0.0
2226    1.0
2227    1.0
2228    1.0
2229    1.0
Name: Year?, Length: 2230, dtype: float64
this column could turn into a float
0        82.0
1         0.0
2        85.0
3        18.0
4         1.0
        ...  
2225      0.0
2226     32.0
2227    109.0
2228      1.0
2229      2.0
Name: P

Song            None
Artist          None
Release_Year    None
Combined        None
First?          None
Year?           None
Play_Count      None
F*G             None
dtype: object

#### 9.B Try your function out on the rock song data and ensure the output is what you expected.


In [141]:
# A:

def floatornan(dfcol):
    
    try:
        dfcol = dfcol.astype(float)
    except:
        dfcol = np.nan
    return dfcol


rockfile.apply(floatornan, axis = 0)


Song                                                          NaN
Artist                                                        NaN
Release_Year    [1982.0, 0.0, 1981.0, 1980.0, 1975.0, 2000.0, ...
Combined                                                      NaN
First?          [1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, ...
Year?           [1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, ...
Play_Count      [82.0, 0.0, 85.0, 18.0, 1.0, 13.0, 1.0, 6.0, 3...
F*G             [82.0, 0.0, 85.0, 18.0, 1.0, 13.0, 1.0, 6.0, 3...
dtype: object

#### 9.C Describe the new float-only DataFrame.

In [140]:
# A:
def floatornan(dfcol):
    
    try:
        dfcol = dfcol.astype(float)
    except:
        dfcol = np.nan
    return dfcol


rockfile.apply(floatornan, axis = 0)
rockfile.describe()

Unnamed: 0,Release_Year,First?,Year?,Play_Count,F*G
count,2230.0,2230.0,2230.0,2230.0,2230.0
mean,1465.734978,0.741256,0.741256,15.04843,15.04843
std,867.221986,0.438043,0.438043,25.288366,25.288366
min,0.0,0.0,0.0,0.0,0.0
25%,0.0,0.0,0.0,0.0,0.0
50%,1973.0,1.0,1.0,3.0,3.0
75%,1981.0,1.0,1.0,18.0,18.0
max,2014.0,1.0,1.0,142.0,142.0
