<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Practice Grouping Data with Pandas

---

You are going to investigate UFO sightings around the US.  This lab will give you practice performing `groupby` operations to split data along multiple dimensions and investigate patterns between subsets of the data using basic aggregation.

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(font_scale=1.5)

%matplotlib inline
%config InlineBackend.figure_format = 'retina'

#### 1. Load and print the header for the UFO data.

In [2]:
ufo_csv = '../../../../../resource-datasets/ufo_sightings/ufo.csv'

In [3]:
ufo_data = pd.read_csv(ufo_csv)  # A:

In [4]:
ufo_data.shape

(80543, 5)

In [5]:
ufo_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 80543 entries, 0 to 80542
Data columns (total 5 columns):
City               80496 non-null object
Colors Reported    17034 non-null object
Shape Reported     72141 non-null object
State              80543 non-null object
Time               80543 non-null object
dtypes: object(5)
memory usage: 3.1+ MB


In [6]:
ufo_data.dtypes

City               object
Colors Reported    object
Shape Reported     object
State              object
Time               object
dtype: object

In [7]:
ufo_data.describe()   # no mean/std here: every column is object dtype, so describe() reports count, unique, top and freq

,City,Colors Reported,Shape Reported,State,Time
count,80496,17034,72141,80543,80543
unique,13504,31,27,52,68901
top,Seattle,ORANGE,LIGHT,CA,7/4/2014 22:00
freq,646,5216,16332,10743,45


In [8]:
ufo_data.tail()

,City,Colors Reported,Shape Reported,State,Time
80538,Neligh,,CIRCLE,NE,9/4/2014 23:20
80539,Uhrichsville,,LIGHT,OH,9/5/2014 1:14
80540,Tucson,RED BLUE,,AZ,9/5/2014 2:40
80541,Orland park,RED,LIGHT,IL,9/5/2014 3:43
80542,Loughman,,LIGHT,FL,9/5/2014 5:30


#### 2. How many null values exist per column?

In [9]:
ufo_data.isnull().sum() # A:

City                  47
Colors Reported    63509
Shape Reported      8402
State                  0
Time                   0
dtype: int64

#### 3. Which city has the most observations?

In [10]:
ufo_data.City.unique()    # A:

array(['Ithaca', 'Willingboro', 'Holyoke', ..., 'Uhrichsville',
       'Orland park', 'Loughman'], dtype=object)

In [18]:
new_title = ["city","colors","shape","state","time"]

In [19]:
ufo_data.columns=new_title

In [20]:
ufo_data.head()

,city,colors,shape,state,time
0,Ithaca,,TRIANGLE,NY,6/1/1930 22:00
1,Willingboro,,OTHER,NJ,6/30/1930 20:00
2,Holyoke,,OVAL,CO,2/15/1931 14:00
3,Abilene,,DISK,KS,6/1/1931 13:00
4,New York Worlds Fair,,LIGHT,NY,4/18/1933 19:00


In [26]:
ufo_data.colors.unique()

array([nan, 'RED', 'GREEN', 'BLUE', 'ORANGE', 'YELLOW', 'ORANGE YELLOW',
       'RED GREEN', 'RED BLUE', 'RED ORANGE', 'RED GREEN BLUE',
       'RED YELLOW GREEN', 'RED YELLOW', 'GREEN BLUE',
       'ORANGE GREEN BLUE', 'ORANGE GREEN', 'YELLOW GREEN',
       'RED YELLOW BLUE', 'ORANGE BLUE', 'RED YELLOW GREEN BLUE',
       'YELLOW GREEN BLUE', 'RED ORANGE YELLOW', 'RED ORANGE YELLOW BLUE',
       'YELLOW BLUE', 'RED ORANGE GREEN', 'RED ORANGE BLUE',
       'ORANGE YELLOW GREEN', 'ORANGE YELLOW BLUE',
       'RED ORANGE GREEN BLUE', 'RED ORANGE YELLOW GREEN',
       'ORANGE YELLOW GREEN BLUE', 'RED ORANGE YELLOW GREEN BLUE'],
      dtype=object)

In [27]:
for color in ufo_data.colors.unique():
    print(color, ufo_data[ufo_data.colors == color].shape[0])

nan 0
RED 4809
GREEN 1897
BLUE 1855
ORANGE 5216
YELLOW 842
ORANGE YELLOW 137
RED GREEN 469
RED BLUE 445
RED ORANGE 486
RED GREEN BLUE 166
RED YELLOW GREEN 35
RED YELLOW 146
GREEN BLUE 147
ORANGE GREEN BLUE 13
ORANGE GREEN 66
YELLOW GREEN 53
RED YELLOW BLUE 36
ORANGE BLUE 58
RED YELLOW GREEN BLUE 26
YELLOW GREEN BLUE 13
RED ORANGE YELLOW 32
RED ORANGE YELLOW BLUE 2
YELLOW BLUE 27
RED ORANGE GREEN 12
RED ORANGE BLUE 21
ORANGE YELLOW GREEN 5
ORANGE YELLOW BLUE 3
RED ORANGE GREEN BLUE 8
RED ORANGE YELLOW GREEN 4
ORANGE YELLOW GREEN BLUE 3
RED ORANGE YELLOW GREEN BLUE 2
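The `nan 0` line in the output above appears because `NaN` never compares equal to anything, including itself, so the mask `ufo_data.colors == color` matches no rows when `color` is `NaN`. The sketch below shows the effect on a small made-up Series (the `colors` values here are illustrative, not taken from the dataset) and two ways to count missing values correctly.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the "colors" column; the values are made up for illustration.
colors = pd.Series(["RED", np.nan, "RED", "BLUE", np.nan, np.nan])

# NaN != NaN, so an equality mask against NaN is False everywhere.
nan_matches = (colors == np.nan).sum()

# value_counts(dropna=False) keeps missing values as their own category.
counts = colors.value_counts(dropna=False)

# isna() counts missing entries directly.
n_missing = colors.isna().sum()

print(nan_matches)
print(counts)
print(n_missing)
```

On the real data, `ufo_data.colors.value_counts(dropna=False)` should give the same per-color counts as the loop in a single call, with the missing values counted as well.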


In [29]:
ufo_data.city.unique()

array(['Ithaca', 'Willingboro', 'Holyoke', ..., 'Uhrichsville',
       'Orland park', 'Loughman'], dtype=object)

In [37]:
for ufo in ufo_data.city.unique():
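    # Each iteration scans the whole column once, so looping over roughly
    # 13,500 unique cities is slow (this run was interrupted by hand).
    city_ufo = (ufo, ufo_data[ufo_data.city == ufo].shape[0])
    print(city_ufo)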

('Ithaca', 30)
('Willingboro', 2)
('Holyoke', 12)
('Abilene', 34)
('New York Worlds Fair', 1)
('Valley City', 7)
('Crater Lake', 2)
('Alma', 12)
('Eklutna', 1)
('Hubbard', 8)
('Fontana', 44)
('Waterloo', 36)
('Belton', 21)
('Keokuk', 7)
('Ludington', 6)
('Forest Home', 1)
('Los Angeles', 416)
('Hapeville', 1)
('Oneida', 6)
('Bering Sea', 1)
('Nebraska', 4)
(nan, 0)
('Owensboro', 18)
('Wilderness', 2)
('San Diego', 401)
('Clovis', 52)
('Los Alamos', 15)
('Ft. Duschene', 1)
('South Kingstown', 16)
('North Tampa', 1)
('Ft. Lee', 2)
('Salinas AFB', 1)
('Jasper', 21)
('Winston-Salem', 28)
('Portsmouth', 49)
('Dallas', 195)
('Huntington Beach', 72)
('San Antonio', 201)
('Roswell', 57)
('New York City', 612)
('Merced', 22)
('Alice', 8)
('Blairsden', 1)
('Index', 4)
('South Portland', 4)
('Oak Lawn', 30)
('Dome', 1)
('Conroe', 15)
('Syracuse', 58)
('Miami', 279)
('San Deigo', 1)
('Minden', 7)
('Cleveland', 131)
('Espanola', 7)
('Oroville', 25)
('Oakmont', 1)
('Winona', 6)
('Gackle', 1)
...

('Boston', 88)
('Carteret', 19)
('Crystal', 3)
('Cherokee', 8)
('Belmont', 26)
('Watchung', 6)
('Cottonwood', 28)
('Saugus', 17)
('Framingham', 10)
('Dickinson', 7)
('Kingsport', 20)
('Susanville', 12)
('Kaneohe', 12)
('Burnsville', 32)
('Westbury', 8)
('Sumner', 22)
('Santa Clara', 36)
('Kendallville', 8)
('Lancaster', 123)
('Panhandle', 2)
('McGuire AFB', 1)
('Shenandoah', 3)
('Altus', 3)
('Parris Island', 1)
('Alpine', 21)
('Opa Locka', 7)
('Carrollton', 35)
('Huntsville', 69)
('Richardson', 19)
('Frederick', 23)
('Webster Groves', 2)
('Greensboro', 63)
('Orlando', 264)
('La Mirada', 12)
('Rotan', 2)
('Lawrenceville', 40)
('West Haven', 15)
('Sapulpa', 7)
('Gettysburg', 11)
('Newberg', 12)
('Arlington Heights', 21)
('Glenshaw', 1)
('Palatine', 23)
('Evendale', 2)
('Fort Benning', 2)
('Wahoo', 1)
('Indialantic', 7)
('Bakersfield', 104)
('Leesville', 2)
('Bartow', 3)
('Forestburg', 1)
('Key West', 20)
('Fayetteville', 97)
('Snohomish', 35)
('Wendover', 10)
('Clayton', 34)
...

('Gerry', 2)
('Laddonia', 1)
('Yucca Valley', 11)
('Hyde Park', 10)
('Ventura', 71)
('Boise', 134)
('Decatur', 41)
('Cedar Rapids', 55)
('Poteau', 10)
('Thomasville', 13)
('DeSoto', 3)
('Highway', 2)
('China Lake', 3)
('Edison', 22)
('Arbuckle', 5)
('Baker', 19)
('Fairfield', 80)
('none', 6)
('Kansas city', 1)
('Lander', 4)
('Oakwood', 9)
('Goleta', 16)
('Pine River', 1)
('Portmouth', 1)
('Reidsville', 9)
('Slidell', 19)
('Schofield Barracks', 3)
('Aiken', 10)
('Ft. Lewis', 1)
('Glenview', 6)
('New Market', 2)
('Maury Mountains', 1)
('Chatfield', 5)
('East Hanover', 3)
('Wyoming', 23)
('Marshfield', 16)
('Carmichaels', 5)
('Boydstown', 1)
('Lakeville', 27)
('Barboursville', 7)
('Crystal Beach', 7)
('West Lake', 1)
('West Palm Beach', 64)
('Churchville', 2)
('Wadsworth', 11)
('Marathon County', 1)
('Coxsackie', 4)
('Downey', 22)
('Davis Park', 1)
('Sturgeon Bay', 9)
('Middlesboro', 6)
('Babylon', 10)
('Alamosa', 8)
('Little Rock', 63)
('Westford', 7)
('Melbourne', 42)
('Burien', 18)
...

('Brea', 14)
('Morton', 11)
('Chesterfield', 30)
('Mariemont', 1)
('Wayland', 11)
('West Pittson', 1)
('Mojave', 18)
('High Island', 2)
('Bloomington', 77)
('Ellensburg', 21)
('Mt. Sterling', 3)
('Milwaukie', 18)
('Sayreville', 9)
('Coldwater', 13)
('Waukesha', 17)
('Bixby South', 1)
('Onawa', 3)
('Loveland', 55)
('Maple Valley', 21)
('Hermann', 6)
('Hartville', 7)
('Sweet Home', 3)
('Potwin', 1)
('Denton', 36)
('Mosheim', 3)
('Pickens', 7)
('Fortuna', 15)
('Eatonton', 5)
('Harlan County', 1)
('West Bloomfield', 12)
('North Tarrytown', 1)
('Wellsville', 7)
('Monroeville', 14)
('Winding Stair Camp Ground', 1)
('Trinidad', 9)
('Hiawassee', 4)
('Lebanon', 79)
('Chattanooga', 36)
('Eddington', 4)
('Mussel Shoals', 5)
('Farmington', 65)
('Scranton', 18)
('Glendora', 14)
('Port Sulphur', 1)
('Fort Rucker', 1)
('Lake Okeechoobee', 2)
('St. Augustine', 52)
('Plain', 2)
('Inkster', 2)
('Hartford', 27)
('Lufkin', 9)
('Goose Creek', 13)
('Mayfield', 10)
('Glenolden', 3)
('Oakcliff', 2)
...

('Hydes', 1)
('Newtown', 22)
('South Lake Tahoe', 13)
('Mountaintop', 3)
('Bell', 2)
('Pine City', 3)
('Midvale', 9)
('Pass Christian', 7)
('Federal Way', 44)
('Meridien', 1)
('Pacific Palisades', 6)
('Barberton', 10)
('Upper Saddle River', 3)
('Chunky', 1)
('Glennville', 2)
('Santa Paula', 9)
('Covington', 45)
('Shepherd', 4)
('Point Pleasent', 1)
('Cossayuna', 1)
('Gresham', 35)
('Kannapolis', 17)
('Rutherfordton', 2)
('Everett', 122)
('Copley', 3)
('Imperial Beach', 9)
('Littleton', 62)
('Wurtland', 1)
('South  Hattisburg', 1)
('Greenbush', 2)
('Woodberry Forest Preparatory School, VA', 1)
('Bonney Lake', 6)
('Bergenfield', 9)
('Ford City', 5)
('Mt. Royal', 1)
('Simi Valley', 55)
('Smithtown', 7)
('Western', 2)
('Wyandotte', 12)
('Beaver Creek', 3)
('Joshua Tree National Monument', 3)
('Wilkes-Barre', 23)
('Drumright', 2)
('Hobbs', 9)
('Dale', 5)
('Warden', 2)
('Dunbar', 4)
('Lake McClure', 1)
('Kutztown', 4)
('Cactus Flat', 1)
('Berrien Springs', 2)
('Hadley', 6)
...

('Fort Valley', 1)
('Twenty Nine Palms', 11)
('North Miami Beach', 7)
('Little Falls', 12)
('New Londons', 1)
('Mammoth Lakes', 8)
('Bistolville', 1)
('Campbellsville', 5)
('West New York', 5)
('Beach Haven Terrace', 3)
('Painesville', 4)
('Clay City', 3)
('Lake Wateree', 1)
('Cass City', 2)
('North Platte', 15)
('Crowley', 4)
('Commack', 12)
('Farmville', 3)
('Kenna', 3)
('Spotsylvania', 9)
('Sickles', 1)
('Lake Conroe', 2)
('Mount Wilson Observatory', 1)
('East New York', 1)
('Mobile', 29)
('Cape Elizabeth', 3)
('Cottage Grove', 17)
('New Brighton', 13)
('Haltom Ciy', 1)
('Ballentine', 2)
('Joliet', 24)
('Lomita', 3)
('Cranbury', 3)
('Security', 2)
('Barnegat', 4)
('Bemidji', 17)
('Gig Harbor', 31)
('Wausau', 9)
('Emerson', 1)
('Wickenburg', 16)
('Hatcreek Campground', 1)
('Wassaic', 2)
('Clarksburg', 16)
('Ft. Jackson', 1)
('Water Valley', 2)
('Morris', 17)
('Houlton', 3)
('Chicopee', 10)
('Orinda', 9)
('Murphy', 10)
('Bend', 45)
('Bisbee', 1)
('Patterson', 17)
('Manassas', 23)
...

('Eustis', 7)
('Alvordon', 1)
("I don't know", 2)
('Los Alamitos', 6)
('Golden Pond', 1)
('Hardy', 3)
('Yosemite National Park', 7)
('Kona', 2)
('Presidio', 2)
('Yosemite Forest', 1)
('Plant City', 19)
('Haledon', 3)
('Belle Chasse', 2)
('Birch Bay', 4)
('Holmdel', 4)
('Trumbull', 15)
('Griffith', 4)
('Copalis Beach', 1)
('Ironton', 11)
('Mercer Island', 13)
('Woodside', 6)
('San Carlos', 9)
('Cimarron', 2)
('Black Diamond', 3)
('Wilburton', 2)
('Eau Claire', 15)
('South Nags Head', 5)
('Tarryall Reservoir', 1)
('Friendship', 9)
('Mechanics Town', 1)
('Rockland', 7)
('Waterbury', 34)
('Wappingers Falls', 7)
('Kaunakakai', 2)
('La Veta', 6)
('Downingtown', 19)
('Beacon Falls', 4)
('Annandale', 14)
('Coconut Creek', 6)
('Holton', 4)
('Santa Margarita', 3)
('Lawrence', 41)
('Swansboro', 1)
('San Ramon', 14)
('Hartford city', 1)
('Throop', 5)
('Moorestown', 2)
('Plan', 1)
('Lou  Costello Recreation Center', 1)
('Vernon Valley', 1)
('South Milwaukee', 6)
('Cedar Grove', 2)
('Big Cane', 1)
...

('Malad', 2)
('Escondido', 59)
('Massapequa', 8)
('Milledgeville', 3)
('Grants', 6)
('Lake Mead', 2)
('Central Texas', 1)
('Northfield', 13)
('Lone Pine', 11)
('Rio Rancho', 26)
('Port Vue', 1)
('Port St. Lucie', 36)
('Whitestone', 2)
('Central Islip', 5)
('Monument Valley', 8)
('Atlantic City', 19)
('Warrenton', 17)
('Dwarf', 2)
('Highway 24', 1)
('Dixon', 15)
('Brawley', 6)
('W. Wendover', 1)
('Gaffnet', 1)
('Kingston area', 1)
('Mifflinville', 1)
('Galconda', 1)
('Trementina', 1)
('Oconomowoc', 8)
('Okanogan', 5)
('Whitefish', 19)
('Hawk Mountain', 1)
('Cuba', 10)
('Maunk Chunk', 1)
('Mulliken', 1)
('Elk River', 6)
('St. David', 2)
('Suisun City', 2)
('Grove Land', 1)
('McKinney', 25)
('Sumter', 16)
('Berea', 7)
('Abingdon', 15)
('Sunset', 2)
('Crittenden', 1)
('Shields', 1)
('Hamden', 13)
('Romeo', 6)
('Chowchilla', 3)
('Hagerhill', 1)
('Donnelly', 3)
('New Idria', 2)
('Yakima Indian Reservation', 1)
('Yakama Indian Reservation', 1)
('Groton', 17)
('Lewiston', 33)
('Greenbrier', 5)

("Coconino Nat'l Forest north of Wupatki NM, AZ", 1)
('Borrego Springs', 9)
('Avondale', 16)
('N. Judson', 1)
('North Judson', 5)
('Whatcom County', 1)
('Haymarket', 6)
('Shepard AFB', 1)
('Cannis City', 1)
('S. Greensburg', 1)
('Lockwood', 3)
('Evington', 2)
('Patricia', 1)
('Ft Benning', 1)
('Beaver', 19)
('Hwy. 18', 1)
('Snoqualmie', 11)
('North Dade', 1)
('Kingsford Heights', 1)
('Harmony', 3)
('McAllen', 14)
('Shippensburg', 6)
('Rainsville', 1)
('Elma', 4)
('Mt. Ranch', 1)
('Central U.S.', 1)
('Smith River', 1)
('Haslett', 3)
('Boyertown', 4)
('Pridice Valley', 1)
('Tajunga', 1)
('Yarldey', 1)
('Norristown', 7)
('Friday Harbor', 9)
('Brooklyn', 13)
('West Memphis', 7)
('Newell area', 1)
('Earlville', 2)
('Pinehurst', 6)
('Urbana', 14)
('Villa Rica', 9)
('Pacificia', 1)
('Leesport', 2)
('Gilbertsville', 2)
('New Smyrna Beach', 12)
('Pembrook Pines', 1)
('Lake Sullivan', 1)
('Watauga', 8)
('Oregon', 16)
('MM 99 mile Exit', 1)
('Bradenton', 48)
('Hickory Corners', 1)
...

('Boerne', 5)
('Vilonia', 2)
('Merrimac', 2)
('Reno to Dallas', 1)
('Canyonville', 3)
('Hobe Sound', 9)
('Harker Heights', 3)
('Blossburg', 2)
('Gorman VOR', 1)
('Bandy', 1)
('Frostburg', 7)
('Big Flats', 1)
('Stamping Ground', 2)
('French Camp', 1)
('Arnolds Park', 2)
('Bloomsburg', 8)
('Graham County', 1)
('Wheelersburg', 4)
('Outlook', 1)
('La Joya', 1)
('Shreveport', 23)
('Wrightwood Mtn.', 1)
('Jonesburg', 1)
('Highway 15', 1)
('Gulf Breeze', 13)
('Camarillo', 24)
('I-10, Marker 174', 1)
('Karnak', 2)
('Sunrise', 20)
('Highway 37', 1)
('Hartsburg', 2)
('Chilhowie', 5)
('Chanute', 7)
('Arrayo Grande', 1)
('Johnston', 11)
('Clatonia', 1)
('Cayce', 1)
('Tom Sauk Moutain', 1)
('Steinhatchee', 3)
('Haleiwa', 6)
('Amelia', 7)
("Don't Know", 1)
('South Attleboro', 2)
('Lombard', 21)
('McKinlyville', 1)
('Calabash', 5)
('Pico Rivera', 17)
('Millsboro', 4)
('Seguin', 8)
('Babson Park', 3)
('Blackwell', 3)
('Altus AFB', 1)
('Alta', 1)
('East Greenwich', 4)
('Alto', 6)
('Unincorporated', 1)


('Buffalo Grove', 6)
('Old Appleton', 1)
('Boston Location', 1)
('Tucker', 5)
('Maple Shade', 4)
('Elmira', 11)
('Whitehorse', 1)
('Penfield', 6)
('Andrews', 5)
('Leo', 3)
('Nipomo', 9)
('West Plains', 4)
('West Richland', 6)
('Hunstville', 1)
('Bayfield', 9)
('Cockeysville', 5)
('Liberty County', 2)
('Rancho Tehama', 1)
('Lake Weir', 1)
('Rydal', 4)
('Ocotillo', 1)
('West Paterson', 2)
('Bailey', 8)
('Montverde', 4)
('Elba', 3)
('Cornville', 2)
('Lynch Mountain', 1)
('Newalla', 3)
('Grayling', 5)
('Halsey', 1)
('West Chicago', 3)
('Cartago', 1)
('Boy Scout Camp Royenea', 1)
('CA', 1)
('Glide', 1)
('Irwindale', 2)
('Kalkaska', 2)
('Navajo Lake', 3)
('Pontoto', 1)
('Stoddard Valley', 1)
('Campwood', 1)
('Healdsburg', 13)
('McHenry', 14)
('Aspers', 1)
('Narcoossee', 1)
('Claremore', 12)
('St.Helens', 1)
('Wrightwood', 3)
('Cary', 34)
('Waveland', 2)
('Neshkoro', 1)
('Trumann', 2)
('East Amwell', 1)
('East Amwell Township', 1)
('Ellicott City', 13)
('Fitchburg', 12)
('Central Point', 2)
...

('Cartwright', 2)
('Flippin', 3)
('Carmel Valley', 4)
('Andrade', 1)
('US Hwy 200 between MP 39', 1)
('Lansdowne', 2)
('Sacramento area', 2)
('Greenwater', 3)
('Lyle', 3)
('Lorian', 1)
('Wacker City', 1)
('Tustin', 17)
('Theodore Roosevelt National Park exit 32 on I-94 North Dakot', 1)
('Interstate 10, somewhere by the border', 1)
('Olympic Mountains', 2)
('Cheraw', 1)
('Paso Robles', 14)
('Vestal', 10)
('Pekin', 6)
('Guilford', 10)
('South Jordan', 6)
('Rocky Hill', 4)
('Rowland Heights', 3)
('East Stroudsburg', 6)
('Batavia', 20)
('Roslindale', 4)
('Sandusky', 10)
('Lost Park area', 1)
('Merrill', 5)
('Delavan', 3)
('Mount Shasta City', 1)
('Post Falls', 16)
('Florida Panhandle', 1)
('Machesney Park', 7)
('Meyers-Tahoe paradise', 1)
('Royal Plam Beach', 1)
('Spranger', 1)
('Cortez', 9)
('Newburg', 1)
('The Colony', 11)
('Johnson Park, NE CA.,Shasta Co., calif', 1)
('Firth', 2)
('Sleepy Hollow', 1)
('Alburquerqe', 1)
('Shenandoah County', 1)
('Plain City', 2)
('Wallkill', 3)
...

("D'Hanis", 1)
('Gansevoort', 2)
('Newman Lake', 2)
('Loves Park', 10)
('Patoka', 2)
('Erath', 1)
('Redlands', 21)
('La Quinta', 9)
('Sanger', 4)
('Jurupa', 1)
('Duryea', 2)
('Coleman', 6)
('Summerfield,', 1)
('Exton', 5)
('Lakemoor', 1)
('Rock Spings, 2miles west of', 1)
('Riley County', 1)
('Austell', 4)
('Maitland', 3)
('Las Vegas Blvd.', 1)
('Mannford', 3)
('Brownell', 1)
('Pine Mountain', 3)
('Charlesbourg', 1)
('Keystone Heights', 2)
('Channelview', 2)
('West Fargo', 3)
('Christiana', 1)
('Becker', 1)
('Buttonwillow', 3)
('Dayton township', 1)
('Sixes', 1)
('Bryant', 8)
('Belcamp', 3)
('Tacna', 4)
('Edinburg', 9)
('Parrish', 9)
('San Gabriel Valley', 3)
('Northbend', 1)
('New Wilmington', 2)
('Babbitt', 3)
('Glenwood Landing', 1)
('Bagdad', 2)
('Crystal River', 5)
('Sealy', 1)
('Youngsville', 8)
('St. Albans', 7)
('Sebring', 13)
('Pima', 3)
('Ionia', 7)
('Harbor City', 6)
('Avon Park', 5)
('Harrington', 15)
('Sea Tac Airport', 1)
('Hermitage', 16)
('Mahony City', 1)
...

('Fountain City', 1)
('Aptos', 5)
('Wax', 1)
('Progresso', 1)
("Kapa'a", 1)
('Galway', 1)
('Kilauea', 3)
('Lawai', 3)
('Devils Lake', 2)
('Lemon Grove', 5)
('Seldon', 2)
('Anthem', 9)
('Darrington', 2)
('Thorntown', 2)
('Redwood Valley', 6)
('Gillsville', 2)
('Pioneer', 1)
('Fargher Lake', 1)
('Loon Lake', 3)
('Foxboro', 6)
('Elizabeth City', 6)
('Amhearst', 1)
('MM 110', 1)
('Kettleman City', 4)
('Toomsuba', 1)
('Oliphant', 1)
('St. Maries', 2)
('Fallston', 1)
('Greenfiled', 1)
('Prospect Park', 2)
('Imperial', 16)
("Kea'au", 2)
('Saco', 7)
('Havre', 13)
('Cerrilillo', 1)
('Murphreesboro', 1)
('El Campo', 1)
('Berthoud Pass', 1)
('Sisterdale', 2)
('Albrightsville', 3)
('Eufaula', 6)
('Capitola', 13)
('Grant Park', 1)
('Ybor', 1)
('Monroeville to Seven Springs', 1)
('North Whitefield', 1)
('Kellyville', 1)
('Miramar', 6)
('Luka', 2)
('Merrillan', 1)
('Oakham', 1)
('Ballston Lake', 5)
('Lebo', 1)
('Payette', 5)
('Woodinvile', 1)
('Miller', 1)
('South Bay', 2)
('Crookston', 3)
...

('Bayboro', 1)
('Cross River', 2)
('Tellico Plains', 2)
('Eastlake', 2)
('Kahului', 7)
('Foster City', 7)
('Fergus Falls', 7)
('West Grove', 4)
('Johnson', 3)
('Maplewood', 5)
('Dunlo', 1)
('Centennial', 9)
('Caro', 3)
('Tionesta', 3)
('Bramptom', 1)
('Camdenton', 3)
('Rancho Dominguez', 1)
('Redwood City', 15)
('Indian Creek', 1)
('Howey in the Hills', 1)
('PA', 1)
('East Barnard', 1)
('S. Bossier', 1)
('Parker', 25)
('Goodland', 7)
('Sneads Ferry', 3)
('Eucha', 1)
('Aledo', 5)
('Ozona', 2)
('Sioux Center', 1)
('Perkins', 2)
('Falcon', 8)
('Brock', 1)
('Mount Dora', 2)
('Ferrum', 2)
('Wilsall', 1)
('Wentworth', 3)
('Ucaipa', 1)
('Glen Dale', 1)
('Watseka', 1)
('Aneta', 1)
('Wyncote', 1)
('Bennettsville', 1)
('Bessemer City', 3)
('DeBary', 6)
('Missouri Valley', 1)
('Port Jefferson Station', 5)
('Long Island City', 5)
('New Hyde Park', 2)
('Banner Elk', 2)
('Wainwright', 1)
('Eagleville', 1)
('W. Rutland', 1)
('Dubuque', 16)
('Cordes', 1)
('Tomball', 20)
('Gilliam', 1)
...

('Heltonville', 1)
('Eastman', 1)
('Ship Bottom', 2)
('Savage', 5)
('Joseph', 2)
('Dallesport', 3)
('Palos Verdes Estates', 3)
('Creswell', 5)
('Ivins', 1)
('Murphys', 2)
('Fall River Mills', 2)
('Elmo', 2)
('Gaysville', 1)
('Clarks Summit', 5)
('Bland', 1)
('Bosler', 1)
('Rexford', 2)
('Galion', 1)
('Chaco Canyon National Historic Park', 1)
('Halletsville', 2)
('Gilmer County', 1)
('Kotzebue', 1)
('Story City', 2)
('Hells Canyon', 1)
('LaCenter', 2)
('Latham', 8)
('St. Clairsville', 1)
('Guttenberg', 1)
('Summit Lake Exit, Hwy 42', 1)
('Bertram', 9)
('Brown City', 2)
('Marshallville', 1)
('Crossing California into Arizona', 1)
('Coralville', 5)
('Port  Orchard', 1)
('Brush Prairie', 1)
('Moundridge', 1)
('Atascadero', 8)
('Nogales', 4)
('Grand Isle', 5)
('Saint Cloud', 9)
('Homer Glenn', 1)
('Chesnee', 3)
('Mullen', 1)
('Arkansas City', 1)
('Calumet', 2)
('Ivesdale', 1)
('Townville', 2)
('Kettle Falls', 2)
('Garwood', 1)
('Los Angeles Area', 1)
('Orcutt', 2)
('Academy', 1)
...

('Mississippi River', 1)
('No.Hollywood', 1)
('Rio Grande City', 1)
('La Vista', 3)
('Shell Knob', 1)
('Rainbow Lakes Estates', 1)
('Lebanon Church', 1)
('Intervalle', 1)
('Kennedyville', 2)
('Brookings Harbor', 1)
('New Buffulo', 1)
('Temperance', 5)
('Crescent Beach', 2)
('Gleason', 6)
('Reevesville', 1)
('Kansas ??', 1)
('Trego', 2)
('Crandall', 3)
('Wofford Heights', 1)
('Sylvan Beach', 1)
('Mckinleyville', 1)
('Hanna', 2)
('Buckeye Lake', 1)
('West Jordan', 18)
('Purlear', 2)
('Route 80 W', 1)
('Tuscumbia', 1)
('Mukwonago', 19)
('Miama', 1)
('Yermo', 1)
('Turners Falls', 2)
('Three Sisters', 1)
('Mercersburg', 2)
('Honea Path', 3)
('Vanceboro', 2)
('Indian Trail', 6)
('Scandia', 4)
('Caswell', 2)
('New River', 5)
('Murrells Inlet', 19)
('Centerton', 2)
('Cliffside', 1)
('Skyline Drive', 1)
('Philippine Sea', 1)
('Geary', 1)
('Bruce', 2)
('Trotwood', 3)
('Maud', 2)
('Lake Hiawatha', 1)
('Staunton', 8)
('Hot Springs Village', 1)
('Rayne', 4)
('Powder Springs', 7)
...

('Hidalgo', 1)
('Brackettville', 1)
('Zelienople', 1)
('Dale Hollow Lake', 2)
('Pacheco', 4)
('Waikele', 1)
('Puxico', 1)
('Whangarei', 1)
('Raeford', 6)
('Mastic Beach', 3)
('Mount Cobb', 2)
('Darien Center', 1)
('Ewing', 5)
('Anoka', 4)
('Embden', 1)
('Lead', 1)
('Auburn Hills', 5)
('Attalla', 3)
('Harleysville', 9)
('South Phoenix', 2)
('Robinson', 2)
('Willow River', 1)
('Topanga', 2)
('Dugas', 1)
('Blue Mound', 2)
('Crozet', 2)
('Shumway', 1)
('Levasy', 1)
('Browns Point', 1)
('Toledo-Cleveland', 1)
('Paramount', 8)
('Columbus Junction', 1)
('Brookpark', 2)


KeyboardInterrupt: 

In [51]:
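# city_ufo holds only the last (city, count) tuple from the loop above, so this
# counts the two elements of that tuple, not the sightings per city.
pd.Series(city_ufo).value_counts()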

Brookpark    1
2            1
dtype: int64

In [70]:
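# Leftover cell: `df` is a tuple here (see the error below) and 'playcount' is not
# a column of the UFO data, so this call cannot work as written.
df.sort_values(by='playcount')[-20:]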


AttributeError: 'tuple' object has no attribute 'sort_values'

In [53]:
ufo_data.head()

,city,colors,shape,state,time
0,Ithaca,,TRIANGLE,NY,6/1/1930 22:00
1,Willingboro,,OTHER,NJ,6/30/1930 20:00
2,Holyoke,,OVAL,CO,2/15/1931 14:00
3,Abilene,,DISK,KS,6/1/1931 13:00
4,New York Worlds Fair,,LIGHT,NY,4/18/1933 19:00


In [57]:
grouped_by_city = ufo_data.groupby("city")
grouped_by_city.max().head()

city,state,time
((HOAX??)),PA,1/20/2010 19:30
((Unspecified location)),IL,9/28/2010 19:30
(City not specified),MO,2/21/2014 19:45
(above mountains in airplane),UT,6/20/2010 15:00
Abbeville,SC,7/9/2002 23:45


In [63]:
ufo_data["city"].value_counts().head()

Seattle          646
New York City    612
Phoenix          533
Las Vegas        442
Portland         438
Name: city, dtype: int64
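Since this lab is about `groupby`, the same per-city count can also be obtained by grouping and taking the size of each group. The sketch below uses a small made-up frame (`ufo_toy` is a hypothetical stand-in for `ufo_data`, not real sightings) so that it runs without the CSV.

```python
import pandas as pd

# Hypothetical miniature of the renamed UFO frame; rows are made up.
ufo_toy = pd.DataFrame({
    "city": ["Seattle", "Seattle", "Phoenix", None, "Seattle", "Phoenix", "Portland"],
    "shape": ["LIGHT", "DISK", "LIGHT", "OVAL", None, "LIGHT", "CIRCLE"],
})

# size() counts rows per group; rows whose city is missing are dropped by default.
city_counts = ufo_toy.groupby("city").size().sort_values(ascending=False)

# idxmax() returns the label (city name) with the largest count.
top_city = city_counts.idxmax()

print(city_counts)
print(top_city)
```

Unlike the explicit loop over `unique()`, this makes a single pass over the data, which is why it finishes quickly on the full 80,543-row frame.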

#### 4. What is the observation count per shape?

In [68]:
ufo_data["shape"].value_counts()

LIGHT        16332
TRIANGLE      7816
CIRCLE        7725
FIREBALL      6249
OTHER         5506
SPHERE        5231
DISK          5226
OVAL          3721
FORMATION     2405
CIGAR         1983
VARIOUS       1957
FLASH         1329
RECTANGLE     1295
CYLINDER      1252
DIAMOND       1152
CHEVRON        940
EGG            733
TEARDROP       723
CONE           310
CROSS          241
DELTA            7
ROUND            2
CRESCENT         2
HEXAGON          1
FLARE            1
DOME             1
PYRAMID          1
Name: shape, dtype: int64

In [67]:
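# Bug: ufo_data.shape is the DataFrame's (rows, columns) tuple, not the "shape" column.
# Select the column with ufo_data["shape"] to avoid the AttributeError below.
for tri in ufo_data.shape.unique():
    print(tri, ufo_data[ufo_data.shape == tri].shape[0])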

AttributeError: 'tuple' object has no attribute 'unique'
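The error above comes from a name clash: a column called `shape` cannot be reached with attribute access, because `DataFrame.shape` already exists and returns the frame's dimensions. A minimal sketch of the difference, using a made-up frame (`ufo_toy` is hypothetical):

```python
import pandas as pd

# Hypothetical miniature of the renamed UFO frame; rows are made up.
ufo_toy = pd.DataFrame({
    "city": ["Seattle", "Phoenix", "Seattle"],
    "shape": ["LIGHT", "DISK", "LIGHT"],
})

# Attribute access resolves to the built-in property: a (rows, columns) tuple.
dims = ufo_toy.shape

# Bracket access always returns the column, whatever it is named.
shape_col = ufo_toy["shape"]

for value in shape_col.unique():
    print(value, (shape_col == value).sum())
```

Bracket indexing is the safer habit for any column whose name collides with a DataFrame attribute or method (`shape`, `size`, `count`, and so on).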

In [13]:
# A:

#### 5. Create a subset of the data that is only the observations where the city is in the top 5 cities AND the shape is in the top 5 shapes.

In [14]:
# A:
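One possible approach to question 5, sketched on made-up data: take the index of the top `value_counts()` entries for each column, then combine two `isin` masks with `&`. The frame `ufo_toy` and the cutoff `top_n = 2` are assumptions chosen to keep the example small; the lab itself asks for the top 5 on `ufo_data`.

```python
import pandas as pd

# Hypothetical miniature of the renamed UFO frame; rows are made up.
ufo_toy = pd.DataFrame({
    "city": ["Seattle", "Seattle", "Seattle", "Seattle",
             "Phoenix", "Phoenix", "Phoenix", "Portland"],
    "shape": ["LIGHT", "LIGHT", "DISK", "OVAL",
              "LIGHT", "DISK", "DISK", "LIGHT"],
})

top_n = 2  # the lab asks for 5; 2 keeps the toy example readable

# value_counts() sorts by frequency, so head(top_n).index holds the most common labels.
top_cities = ufo_toy["city"].value_counts().head(top_n).index
top_shapes = ufo_toy["shape"].value_counts().head(top_n).index

# Keep rows that satisfy both conditions (element-wise AND).
subset = ufo_toy[ufo_toy["city"].isin(top_cities) & ufo_toy["shape"].isin(top_shapes)]

print(subset)
```

Rows with a missing city or shape drop out automatically, because `value_counts()` ignores `NaN` and `isin` therefore never matches it.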

#### 6. CHALLENGE: With the subset, find the percent of each shape seen by city.

In [8]:
# A:
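A sketch for question 6, assuming "percent of each shape seen by city" means the share of each shape within each city (so the percentages for one city sum to 100). The frame `subset_toy` is a made-up stand-in for the subset from question 5.

```python
import pandas as pd

# Hypothetical stand-in for the top-cities/top-shapes subset; rows are made up.
subset_toy = pd.DataFrame({
    "city": ["Seattle", "Seattle", "Seattle",
             "Phoenix", "Phoenix", "Phoenix", "Phoenix"],
    "shape": ["LIGHT", "LIGHT", "DISK",
              "LIGHT", "DISK", "DISK", "DISK"],
})

# Count sightings for every (city, shape) pair.
counts = subset_toy.groupby(["city", "shape"]).size()

# Divide each count by its city's total, then scale to percent.
city_totals = counts.groupby(level="city").transform("sum")
pct = counts / city_totals * 100

# Cross-check: crosstab with normalize="index" gives the same row-wise shares.
pct_table = pd.crosstab(subset_toy["city"], subset_toy["shape"], normalize="index") * 100

print(pct)
print(pct_table)
```

If the intended reading is instead "for each shape, what share was seen in each city", the same code works with the roles of `city` and `shape` swapped.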

#### 7. Make a grouped bar chart with your subset data showing counts of shapes seen by city.

In [9]:
# A:
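One way to approach question 7, sketched with pandas' own plotting on made-up data: count the (city, shape) pairs, pivot the shapes into columns with `unstack`, and draw a bar plot so that each city gets one cluster of bars. The frame `subset_toy`, the figure size, and the axis labels are assumptions for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the top-cities/top-shapes subset; rows are made up.
subset_toy = pd.DataFrame({
    "city": ["Seattle", "Seattle", "Seattle",
             "Phoenix", "Phoenix", "Phoenix", "Phoenix"],
    "shape": ["LIGHT", "LIGHT", "DISK",
              "LIGHT", "DISK", "DISK", "DISK"],
})

# Rows become cities, columns become shapes, cells hold sighting counts.
shape_by_city = subset_toy.groupby(["city", "shape"]).size().unstack(fill_value=0)

# With kind="bar", each column (shape) is drawn as its own bar within each city group.
ax = shape_by_city.plot(kind="bar", figsize=(8, 4))
ax.set_xlabel("city")
ax.set_ylabel("number of sightings")
ax.set_title("UFO shapes reported by city (toy data)")
plt.tight_layout()
```

In a notebook with `%matplotlib inline` the figure renders when the cell finishes. A seaborn `countplot` with `x="city"` and `hue="shape"` would be a reasonable alternative, since seaborn is already imported at the top of the lab.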