### Reading and Editing the Rollercoaster CSV File

First we opened the rollercoaster CSV and stripped whitespace from the column names. We renamed the "Height (feet)" column to "Height", corrected the height entry in row 111 so that every value in the column was numeric, and kept the first 300 rows. In the duration column we replaced missing ("NaN") and blank values with "0:00", and in the length column we replaced missing values with 0. Finally, we converted every value in the Height column to a float.

In [0]:
import pandas as pd

roller = pd.read_csv('rollercoaster.csv')
roller.columns = roller.columns.str.strip()
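roller = roller.rename(columns={"Height (feet)": "Height"})
# Manually correct the Height entry in row 111 so the column can be cast to a number.
roller.loc[111, 'Height'] = 200
# Keep the first 300 rows; .copy() avoids SettingWithCopyWarning on the assignments below.
roller = roller.iloc[0:300].copy()
# Treat missing or blank durations as "0:00" and missing lengths as 0.
roller['Duration (min:sec)'] = roller['Duration (min:sec)'].fillna('0:00')
roller['Duration (min:sec)'] = roller['Duration (min:sec)'].replace(' ', '0:00')
roller['Length (feet)'] = roller['Length (feet)'].fillna(0)
roller['Height'] = roller['Height'].astype(float)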
roller.columns

Index(['Name', 'Park', 'City/Region', 'City/State/Region', 'Country/Region',
       'Geographic Region', 'Construction', 'Type', 'Status',
       'Year/Date Opened', 'Height', 'Speed (mph)', 'Length (feet)',
       'Inversions (YES or NO)', 'Number of Inversions', 'Drop (feet)',
       'Duration (min:sec)', 'G Force', 'Vertical Angle (degrees)'],
      dtype='object')
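
To see what these cleaning steps do, here is a minimal sketch that runs them on a tiny in-memory CSV. The three rows are made up for illustration and only a few of the real columns are included; it is not a reproduction of the actual file.

```python
import io
import pandas as pd

# Hypothetical three-row sample: column names mirror the real file,
# the values are invented for illustration.
sample_csv = io.StringIO(
    "Name, Height (feet) ,Length (feet),Duration (min:sec)\n"
    "Coaster A,120,2500,1:45\n"
    "Coaster B,85,,\n"
    "Coaster C,300,5100, \n"
)

sample = pd.read_csv(sample_csv)
sample.columns = sample.columns.str.strip()               # drop stray spaces in headers
sample = sample.rename(columns={"Height (feet)": "Height"})
sample["Duration (min:sec)"] = (
    sample["Duration (min:sec)"].fillna("0:00").replace(" ", "0:00")
)
sample["Length (feet)"] = sample["Length (feet)"].fillna(0)
sample["Height"] = sample["Height"].astype(float)
print(sample)
```

Both the missing duration (Coaster B) and the blank one (Coaster C) end up as "0:00", and the missing length becomes 0.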

### Duration Function

The Duration Function awards points based on how long the ride lasts: longer rides receive more points than shorter ones. We first converted each time in the table from minutes and seconds to total seconds. We then awarded one point for each 70-second band, from 1 point for rides of 70 seconds or less up to 5 points for the longest rides.

In [0]:
#DURATION
def mintosec(time):
    new = time.split(':')
    seconds = (int(new[0])*60) + int(new[1])
    return seconds

rollerduration = roller['Duration (min:sec)']
def duration(time):
    # `time` is a single "min:sec" string from the duration column.
    seconds = mintosec(time)
    if seconds <= 70:
        durationpoints = 1
    elif seconds <= 140:
        durationpoints = 2
    elif seconds <= 210:
        durationpoints = 3
    elif seconds <= 280:
        durationpoints = 4
    else:
        # Longer than 280 seconds: top score, with no upper limit,
        # so rides over 350 seconds still receive a rating.
        durationpoints = 5
    return durationpoints

roller['duration rating'] = roller['Duration (min:sec)'].apply(duration)
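
The 70-second bands can also be written as a single formula. The sketch below is an alternative to the if/elif chain, not the notebook's own code; the helper names `min_sec_to_seconds` and `duration_points` are hypothetical, and capping rides longer than 350 seconds at 5 points is an assumption.

```python
import math

def min_sec_to_seconds(text):
    """Convert a "min:sec" string such as "2:30" to a number of seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)

def duration_points(seconds, step=70, max_points=5):
    """One point per started block of `step` seconds, clamped to 1..max_points."""
    return min(max(math.ceil(seconds / step), 1), max_points)

# Made-up durations that sit on or next to the band boundaries.
for text in ["0:00", "1:10", "1:11", "2:30", "5:50"]:
    secs = min_sec_to_seconds(text)
    print(text, secs, duration_points(secs))
```

For example, "2:30" is 150 seconds, which falls in the third band (141 to 210 seconds) and scores 3 points.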

### Length Function

The Length Function awards points based on the length of the ride in feet. Rides with a greater length are awarded more points than shorter rides. We added points on a scale based on the minimum and maximum length values in the Length column.

In [0]:
#LENGTH
rollerlength = roller['Length (feet)']
def length(rollerlength):
    # Score one track length in feet; missing lengths were filled with 0 above.
    if rollerlength <= 1650:
        lengthpoints = 1
    elif rollerlength <= 3330:
        lengthpoints = 2
    elif rollerlength <= 4980:
        lengthpoints = 3
    elif rollerlength <= 6630:
        lengthpoints = 4
    else:
        # Longer than 6630 feet: top score, with no upper limit.
        lengthpoints = 5
    return lengthpoints
roller['length rating'] = roller['Length (feet)'].apply(length)
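
Binning a whole column like this can also be done in one call with `pd.cut`. This is a sketch of that alternative on made-up lengths, not the method used above. The bin edges are copied from the function, and leaving the top bin open-ended is an assumption.

```python
import numpy as np
import pandas as pd

lengths = pd.Series([0, 1650, 1651, 3330, 5000, 8280])  # made-up lengths in feet

# pd.cut uses right-inclusive intervals by default, matching the `<=` tests above.
bin_edges = [-np.inf, 1650, 3330, 4980, 6630, np.inf]
length_points = pd.cut(lengths, bins=bin_edges, labels=[1, 2, 3, 4, 5]).astype(int)
print(length_points.tolist())
```

A length that sits exactly on an edge, such as 1650 or 3330, lands in the lower band, just as it does in the if/elif version.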

### Speed Function

The Speed Function awards points based on the maximum speed of the ride. Faster rides are awarded more points than slower rides. We added points on a scale based on the minimum and maximum speed values in the Speed column.

In [0]:
#SPEED
rollerspeed=roller['Speed (mph)']
def speed(rollerspeed):
    points = 0  # speeds below 25 mph, or missing speeds, score 0
    if 25 <= rollerspeed < 50:
        points = 1
    elif 50 <= rollerspeed < 75:
        points = 2
    elif 75 <= rollerspeed < 100:
        points = 3
    elif 100 <= rollerspeed < 125:
        points = 4
    elif rollerspeed >= 125:
        points = 5
    return points
roller['speed rating'] = roller['Speed (mph)'].apply(speed)
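
Because the speed bands are left-inclusive and evenly ordered, the score is just the number of band edges at or below the speed. The sketch below shows this with the standard-library `bisect` module; `SPEED_EDGES` and `speed_points` are hypothetical names, and the test speeds are made up.

```python
import bisect
import math

SPEED_EDGES = [25, 50, 75, 100, 125]  # lower edge of each scoring band, in mph

def speed_points(mph):
    """Count how many band edges are <= mph; missing speeds score 0."""
    if mph is None or math.isnan(mph):
        return 0
    return bisect.bisect_right(SPEED_EDGES, mph)

for mph in [10, 25, 49.9, 50, 124, 125, float("nan")]:
    print(mph, speed_points(mph))
```

The explicit NaN check matters here: without it, `bisect_right` would place a NaN after every edge and return 5 instead of 0.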

### Date Opened Function

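The "Date Opened" function gives points to a rollercoaster based on how recently it was built. Newer rollercoasters are awarded more points than older ones: 5 points for rides opened in 2000 or later, down to 1 point for rides opened between 1920 and 1939. Rides with an earlier or missing opening year receive 0 points.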

In [0]:
#AGE
rollerage = roller['Year/Date Opened']
def years(rollerage):
    points = 0
    if rollerage >= 2000:
        points += 5
    if 1980 <= rollerage < 2000:
        points += 4
    if 1960 <= rollerage < 1980:
        points += 3
    if 1940 <= rollerage < 1960:
        points += 2
    if 1920 <= rollerage < 1940:
        points += 1
    if pd.isna(rollerage):
        # A missing year is a float NaN, not the string 'NaN'; it scores 0.
        points += 0
    return points
        
roller['age rating'] = roller['Year/Date Opened'].apply(years)
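
The comparisons in `years` only work if every entry in the column is numeric. The sketch below assumes, without confirmation from the data, that the "Year/Date Opened" column might hold text entries, and shows one way to coerce them to NaN before scoring. The sample values and the helper `year_points` are hypothetical.

```python
import pandas as pd

raw_years = pd.Series(["1999", 2005, "unknown", None])  # made-up entries

# errors="coerce" turns anything that is not a number into NaN.
clean_years = pd.to_numeric(raw_years, errors="coerce")

def year_points(year):
    """One point per 20-year band starting in 1920, capped at 5; 0 if missing or earlier."""
    if pd.isna(year) or year < 1920:
        return 0
    return min(int((year - 1920) // 20) + 1, 5)

print(clean_years.apply(year_points).tolist())
```

Here 1999 falls in the 1980 to 1999 band (4 points), 2005 gets the top score of 5, and both unusable entries score 0.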

### Height Function

The Height Function awards points to roller coasters based on their maximum height. The minimum height was 29 feet and the maximum height was 456 feet, so we added a point for every 100 feet of height.

In [0]:
#HEIGHT
rollerheight=roller['Height']


def Height(rollerheight):
    
    points = 0

    if rollerheight < 100 :
        points += 1
    if 100 <= rollerheight <  200 :
        points += 2
    if 200 <= rollerheight <  300 :
        points += 3
    if 300 <= rollerheight <  400 :
        points += 4
    if 400 <= rollerheight <  500 :
        points += 5
    return points

roller['height rating'] = roller['Height'].apply(Height)

### Inversions Function

The Inversions Function awards points to a roller coaster based on the number of inversions it has. The maximum number of inversions was 14 and the minimum was 0. We awarded roughly one point for every 3 inversions, and 0 points to rides with none.

In [0]:
#INVERSIONS
rollerinversions=roller['Number of Inversions']
def Inversions(rollerinversions):   
    points = 0
    if rollerinversions == 0:
        points += 0
    if 1 <= rollerinversions < 3 :
        points += 1
    if 3 <= rollerinversions < 6 :
        points += 2
    if 6 <= rollerinversions < 9 :
        points += 3
    if 9 <= rollerinversions < 12 :
        points += 4
    if 12 <= rollerinversions:
        points += 5  # top band: 12 or more inversions
    return points
roller['inversions rating'] = roller['Number of Inversions'].apply(Inversions)

### Final Rating Function

The Final Rating Function adds the points from every rating column and stores the sum in a new column called "total rating".

In [1]:
def finalrating(rollerspeed, rollerage, rollerduration, rollerlength, rollerheight, rollerinversions):
    # Sum the six individual ratings into one overall score.
    totalrating = (rollerage + rollerspeed + rollerduration
                   + rollerlength + rollerheight + rollerinversions)
    return totalrating

roller['total rating'] = finalrating(roller['speed rating'], roller['age rating'],
                                     roller['duration rating'], roller['length rating'],
                                     roller['height rating'], roller['inversions rating'])


In [0]:
roller.sort_values(by='total rating',ascending = False)

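
The summing and ranking steps can be checked on a small stand-in table. The sketch below uses three made-up coasters and only three of the rating columns; picking the columns by their " rating" suffix is an illustrative choice, not something the notebook does.

```python
import pandas as pd

# Hypothetical ratings for three invented coasters.
ratings = pd.DataFrame({
    "Name": ["Coaster A", "Coaster B", "Coaster C"],
    "speed rating": [3, 5, 1],
    "height rating": [2, 5, 1],
    "inversions rating": [0, 1, 4],
})

# Collect the rating columns first, then add them up row by row.
rating_columns = [c for c in ratings.columns if c.endswith(" rating")]
ratings["total rating"] = ratings[rating_columns].sum(axis=1)

ranked = ratings.sort_values(by="total rating", ascending=False)
print(ranked[["Name", "total rating"]])
```

Summing over a list of columns means a newly added rating is included automatically, which avoids leaving one out of a hand-written sum.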