In [1]:
import pandas as pd

open_seven_days_df = pd.read_parquet("../../data/pandas/open_seven_days_df.parquet")

A theme in this course will be learning transformations across languages: the ability to select the proper tool for the job depends on knowing _what tools exist_.

In this lesson, we'll cover creating columns and select + filter operations in Pandas.

Creating columns in Pandas is as simple as assigning to them with the syntax:

```python
dataframe['column_name'] = column_value
```

In [2]:
import numpy as np

open_seven_days_df["closed_open"] = np.where(
    open_seven_days_df["standardHours.thursday"] == "Closed", "Closed", "Open"
)
# The comparison already yields a boolean column, so np.where is not needed here.
open_seven_days_df["is_closed"] = open_seven_days_df["standardHours.thursday"] == "Closed"

Now you might be asking, "When are we assigning a single value to a column vs. performing a calculation on a column?" That would be a great question! The answer lies in _vectorization_: the process of performing calculations on entire columns at once. A single (scalar) value is broadcast to every row, while an expression built from other columns is evaluated element by element.

Certain operations can be vectorized and act on other columns, while others need to be _applied_ row-by-row. We'll talk about applying row-wise functions later in the course, but for now we'll focus on vectorized operations. 
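To make the distinction concrete, here is a minimal sketch on a small, made-up dataframe (`toy_df`, its column names, and the `describe` helper are assumptions for illustration, not part of the parks data). It shows a scalar assignment, a vectorized calculation, and the row-wise equivalent using `apply`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the parks dataframe.
toy_df = pd.DataFrame(
    {
        "name": ["Park A", "Park B", "Park C"],
        "thursday": ["All Day", "Closed", "9:00AM - 5:00PM"],
    }
)

# Scalar assignment: the single value is broadcast to every row.
toy_df["source"] = "toy"

# Vectorized: the comparison and np.where act on the whole column at once.
toy_df["status_vectorized"] = np.where(toy_df["thursday"] == "Closed", "Closed", "Open")


# Row-wise: apply calls this Python function once per row (axis=1).
def describe(row):
    return "Closed" if row["thursday"] == "Closed" else "Open"


toy_df["status_applied"] = toy_df.apply(describe, axis=1)

print(toy_df)
```

Both approaches produce the same column here; the vectorized form is usually shorter and faster, so it is a reasonable default whenever an operation can be expressed that way.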

In [3]:
open_seven_days_df.columns

Index(['relevanceScore', 'designation', 'weatherInfo', 'addresses',
       'operating_hours', 'entrancePasses', 'name', 'description',
       'directionsUrl', 'fees', 'topics', 'states', 'entranceFees', 'contacts',
       'activities', 'url', 'longitude', 'id', 'images', 'directionsInfo',
       'fullName', 'parkCode', 'latLong', 'latitude', 'category',
       'operating_hours_description', 'exceptions', 'standardHours.friday',
       'standardHours.sunday', 'standardHours.thursday',
       'standardHours.tuesday', 'standardHours.saturday',
       'standardHours.monday', 'standardHours.wednesday', 'monday_hours',
       'tuesday_hours', 'wednesday_hours', 'thursday_hours', 'friday_hours',
       'saturday_hours', 'sunday_hours', 'open_seven_days_a_week',
       'closed_open', 'is_closed'],
      dtype='object')

In [4]:
open_seven_days_df["open_closed"] = (
    "Today, the park is: " + open_seven_days_df["closed_open"]
)

open_seven_days_df["open_closed"]

6      Today, the park is: Open
8      Today, the park is: Open
10     Today, the park is: Open
14     Today, the park is: Open
15     Today, the park is: Open
                 ...           
665    Today, the park is: Open
666    Today, the park is: Open
667    Today, the park is: Open
668    Today, the park is: Open
669    Today, the park is: Open
Name: open_closed, Length: 537, dtype: object

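It's also possible to select rows in Pandas using `iloc` and `loc`. As the names suggest, `iloc` selects by _integer position_ (position 0 is always the first row), while `loc` selects by _label_, that is, by the values in the index (and, optionally, by column name).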
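A minimal sketch of the difference, assuming a toy dataframe whose index labels (6, 8, 10) do not match the row positions (0, 1, 2). The dataframe and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data with a non-contiguous index, like a filtered dataframe.
toy_df = pd.DataFrame(
    {"name": ["First", "Second", "Third"]},
    index=[6, 8, 10],
)

# iloc slices by position; the end position is excluded (positions 0 and 1).
by_position = toy_df.iloc[0:2]

# loc slices by label; both endpoints are included (labels 6 through 8).
by_label = toy_df.loc[6:8]

# loc also accepts a row label and a column name together.
single_cell = toy_df.loc[8, "name"]

print(by_position.index.tolist(), by_label.index.tolist(), single_cell)
```

The two slices return the same rows here only because labels 6 and 8 happen to sit at positions 0 and 1.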

In [17]:
open_seven_days_df.head(10)

Unnamed: 0,relevanceScore,designation,weatherInfo,addresses,operating_hours,entrancePasses,name,description,directionsUrl,fees,...,tuesday_hours,wednesday_hours,thursday_hours,friday_hours,saturday_hours,sunday_hours,open_seven_days_a_week,closed_open,is_closed,open_closed
6,1,Memorial Parkway,Summers on the parkway are generally hot and h...,"[{'city': 'McLean', 'countryCode': 'US', 'line...",{'description': 'The George Washington Memoria...,[],George Washington,The George Washington Memorial Parkway was des...,http://www.nps.gov/gwmp/planyourvisit/directio...,[],...,All Day,All Day,All Day,All Day,All Day,All Day,True,Open,False,"Today, the park is: Open"
8,1,National Historic Site,Spring: Temperatures usually range from 40°F t...,"[{'city': 'Hyde Park', 'countryCode': 'US', 'l...",{'description': 'Park and Val-Kill Cottage tou...,[],Eleanor Roosevelt,"Visit the home of Eleanor Roosevelt. Here, Fra...",http://www.nps.gov/elro/planyourvisit/directio...,[],...,7:00AM - 5:00PM,7:00AM - 5:00PM,7:00AM - 5:00PM,7:00AM - 5:00PM,7:00AM - 5:00PM,7:00AM - 5:00PM,True,Open,False,"Today, the park is: Open"
10,1,National Historical Park,Weather in this part of New Jersey is typical ...,"[{'city': 'Morristown', 'countryCode': 'US', '...",{'description': 'Current Grounds Hours: 8 am ...,[],Morristown,Morristown National Historical Park commemorat...,http://www.nps.gov/morr/planyourvisit/directio...,[],...,8:00AM - 5:00PM,8:00AM - 5:00PM,8:00AM - 5:00PM,8:00AM - 5:00PM,8:00AM - 5:00PM,8:00AM - 5:00PM,True,Open,False,"Today, the park is: Open"
14,1,National Monument,"Cedar Breaks sits at over 10,000 feet in eleva...","[{'city': 'Brian Head', 'countryCode': 'US', '...","{'description': 'Visitor access, services, and...","[{'cost': 35.0, 'description': 'Cedar Breaks a...",Cedar Breaks,"Crowning the grand staircase, Cedar Breaks sit...",https://www.nps.gov/cebr/planyourvisit/directi...,[],...,All Day,All Day,All Day,All Day,All Day,All Day,True,Open,False,"Today, the park is: Open"
15,1,National Monument,Devils Postpile National Monument is located a...,"[{'city': 'Mammoth Lakes', 'countryCode': 'US'...","{'description': 'In the operating season, Devi...",[],Devils Postpile,Established in 1911 by presidential proclamati...,http://www.nps.gov/depo/planyourvisit/directio...,[],...,All Day,All Day,All Day,All Day,All Day,All Day,True,Open,False,"Today, the park is: Open"
16,1,National Park,Isle Royale National Park is a remote island w...,"[{'city': 'Houghton', 'countryCode': 'US', 'li...",{'description': 'Isle Royale National Park is ...,"[{'cost': 60.0, 'description': 'Isle Royale Se...",Isle Royale,"Explore a rugged, isolated island far from our...",http://www.nps.gov/isro/planyourvisit/houghton...,[],...,All Day,All Day,All Day,All Day,All Day,All Day,True,Open,False,"Today, the park is: Open"
17,1,National Historical Park and Preserve,All temperatures in degrees Fahrenheit. Note t...,"[{'city': 'New Orleans', 'countryCode': 'US', ...",{'description': 'Each of Jean Lafitte's six si...,[],Jean Lafitte,"In Jean Lafitte's day, silver and gold filled ...",http://www.nps.gov/jela/planyourvisit/directio...,[],...,All Day,All Day,All Day,All Day,All Day,All Day,True,Open,False,"Today, the park is: Open"
18,1,National Scenic Trail,Wisconsin has four distinctly different season...,"[{'city': 'Cross Plains', 'countryCode': 'US',...",{'description': 'The Ice Age National Scenic T...,[],Ice Age,"A mere 15,000 years ago during the Ice Age, mu...",http://www.nps.gov/iatr/planyourvisit/directio...,[],...,All Day,All Day,All Day,All Day,All Day,All Day,True,Open,False,"Today, the park is: Open"
20,1,National Recreation Area,This area of the Texas Panhandle has a wide va...,"[{'city': 'Fritch', 'countryCode': 'US', 'line...",{'description': 'The National Recreation Area ...,[],Lake Meredith,Within the dry plains of the Texas Panhandle l...,http://www.nps.gov/lamr/planyourvisit/directio...,[],...,All Day,All Day,All Day,All Day,All Day,All Day,True,Open,False,"Today, the park is: Open"
22,1,National Park,Today's Weather: http://www.weather.com/weathe...,"[{'city': 'Montrose', 'countryCode': 'US', 'li...",{'description': 'The park is open 24 hours a d...,"[{'cost': 55.0, 'description': 'An annual pass...",Black Canyon Of The Gunnison,"Big enough to be overwhelming, still intimate ...",http://www.nps.gov/blca/planyourvisit/directio...,[],...,All Day,All Day,All Day,All Day,All Day,All Day,True,Open,False,"Today, the park is: Open"


In [18]:
# this gets the first three rows of the dataframe (positions 0, 1, and 2)
open_seven_days_df.iloc[0:3]

Unnamed: 0,relevanceScore,designation,weatherInfo,addresses,operating_hours,entrancePasses,name,description,directionsUrl,fees,...,tuesday_hours,wednesday_hours,thursday_hours,friday_hours,saturday_hours,sunday_hours,open_seven_days_a_week,closed_open,is_closed,open_closed
6,1,Memorial Parkway,Summers on the parkway are generally hot and h...,"[{'city': 'McLean', 'countryCode': 'US', 'line...",{'description': 'The George Washington Memoria...,[],George Washington,The George Washington Memorial Parkway was des...,http://www.nps.gov/gwmp/planyourvisit/directio...,[],...,All Day,All Day,All Day,All Day,All Day,All Day,True,Open,False,"Today, the park is: Open"
8,1,National Historic Site,Spring: Temperatures usually range from 40°F t...,"[{'city': 'Hyde Park', 'countryCode': 'US', 'l...",{'description': 'Park and Val-Kill Cottage tou...,[],Eleanor Roosevelt,"Visit the home of Eleanor Roosevelt. Here, Fra...",http://www.nps.gov/elro/planyourvisit/directio...,[],...,7:00AM - 5:00PM,7:00AM - 5:00PM,7:00AM - 5:00PM,7:00AM - 5:00PM,7:00AM - 5:00PM,7:00AM - 5:00PM,True,Open,False,"Today, the park is: Open"
10,1,National Historical Park,Weather in this part of New Jersey is typical ...,"[{'city': 'Morristown', 'countryCode': 'US', '...",{'description': 'Current Grounds Hours: 8 am ...,[],Morristown,Morristown National Historical Park commemorat...,http://www.nps.gov/morr/planyourvisit/directio...,[],...,8:00AM - 5:00PM,8:00AM - 5:00PM,8:00AM - 5:00PM,8:00AM - 5:00PM,8:00AM - 5:00PM,8:00AM - 5:00PM,True,Open,False,"Today, the park is: Open"


In [6]:
# this gets the rows whose index labels fall between 6 and 7, inclusive; only label 6 exists, which happens to be the first row :)
open_seven_days_df.loc[6:7]

Unnamed: 0,relevanceScore,designation,weatherInfo,addresses,operating_hours,entrancePasses,name,description,directionsUrl,fees,...,tuesday_hours,wednesday_hours,thursday_hours,friday_hours,saturday_hours,sunday_hours,open_seven_days_a_week,closed_open,is_closed,open_closed
6,1,Memorial Parkway,Summers on the parkway are generally hot and h...,"[{'city': 'McLean', 'countryCode': 'US', 'line...",{'description': 'The George Washington Memoria...,[],George Washington,The George Washington Memorial Parkway was des...,http://www.nps.gov/gwmp/planyourvisit/directio...,[],...,All Day,All Day,All Day,All Day,All Day,All Day,True,Open,False,"Today, the park is: Open"


Filtering in Pandas is most easily accomplished by supplying a boolean condition when selecting data. For example:

In [7]:
parks_df = pd.read_parquet("../../data/nps/nps_public_data_parks.parquet")

parks_df[parks_df["fullName"] == "Zion National Park"]

Unnamed: 0,relevanceScore,designation,weatherInfo,addresses,operatingHours,entrancePasses,name,description,directionsUrl,fees,...,activities,url,longitude,id,images,directionsInfo,fullName,parkCode,latLong,latitude
257,1,National Park,Zion is known for a wide range of weather cond...,"[{'type': 'Physical', 'line2': '1 Zion Park Bl...","[{'name': 'Zion National Park', 'standardHours...",[],Zion,Follow the paths where people have walked for ...,http://www.nps.gov/zion/planyourvisit/directio...,[],...,"[{'name': 'Arts and Culture', 'id': '09DF0950-...",https://www.nps.gov/zion/index.htm,-113.026514,41BAB8ED-C95F-447D-9DA1-FCC4E4D808B2,[{'url': 'https://www.nps.gov/common/uploads/s...,"Zion National Park's main, south entrance and ...",Zion National Park,zion,"lat:37.29839254, long:-113.0265138",37.298393


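We can combine any number of boolean conditions with `&` (and) and `|` (or) to filter a dataframe this way. Wrap each comparison in parentheses, since `&` and `|` bind more tightly than `==`, `<`, and `>`: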

In [8]:
parks_df[parks_df["states"].str.contains("UT") & parks_df["states"].str.contains("AZ")]

Unnamed: 0,relevanceScore,designation,weatherInfo,addresses,operatingHours,entrancePasses,name,description,directionsUrl,fees,...,activities,url,longitude,id,images,directionsInfo,fullName,parkCode,latLong,latitude
220,1,National Historic Trail,Due to the length of Old Spanish National Hist...,"[{'type': 'Physical', 'line2': 'Old Spanish Na...",[{'name': 'Old Spanish National Historic Trail...,[],Old Spanish,Follow the routes of mule pack trains across t...,http://www.nps.gov/olsp/planyourvisit/directio...,[],...,"[{'name': 'Arts and Culture', 'id': '09DF0950-...",https://www.nps.gov/olsp/index.htm,-112.114693,8B07C05F-9C4A-4799-8428-DFA2BBB733AC,[{'url': 'https://www.nps.gov/common/uploads/s...,You can visit many sites of the Old Spanish Na...,Old Spanish National Historic Trail,olsp,"lat:37.0791782514, long:-112.114693431",37.079178
364,1,National Recreation Area,The weather in Glen Canyon National Recreation...,"[{'type': 'Physical', 'line2': 'Park Headquart...","[{'name': 'Glen Canyon Open Hours', 'standardH...",[{'description': 'Annual pass is a card signed...,Glen Canyon,"Encompassing over 1.25 million acres, Glen Can...",http://www.nps.gov/glca/planyourvisit/directio...,[],...,"[{'name': 'Auto and ATV', 'id': '5F723BAD-7359...",https://www.nps.gov/glca/index.htm,-111.485594,F7EB51F0-240D-49BC-968C-792756A3B1A0,[{'url': 'https://www.nps.gov/common/uploads/s...,There are multiple districts in Glen Canyon ve...,Glen Canyon National Recreation Area,glca,"lat:36.9357464677, long:-111.485594268",36.935746
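One caveat with `str.contains` is worth a sketch. By default it treats the pattern as a regular expression and returns a missing value for missing input, and a mask containing missing values can raise an error when used for filtering. The toy data below is made up, and whether the real `states` column has missing values is an assumption.

```python
import pandas as pd

# Hypothetical toy data; the last row has a missing "states" value.
toy_df = pd.DataFrame(
    {
        "fullName": ["Park A", "Park B", "Park C", "Park D"],
        "states": ["AZ,UT", "UT", "ID,MT,WY", None],
    }
)

# regex=False matches the text literally; na=False counts missing values as non-matches.
in_utah = toy_df["states"].str.contains("UT", regex=False, na=False)
in_arizona = toy_df["states"].str.contains("AZ", regex=False, na=False)

# Rows that list both states.
both = toy_df[in_utah & in_arizona]

print(both)
```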


In [9]:
parks_df[
    (parks_df["states"].str.contains("UT") & parks_df["states"].str.contains("AZ"))
    | parks_df["states"].str.contains("WY")
]

Unnamed: 0,relevanceScore,designation,weatherInfo,addresses,operatingHours,entrancePasses,name,description,directionsUrl,fees,...,activities,url,longitude,id,images,directionsInfo,fullName,parkCode,latLong,latitude
18,1,National Park,"Yellowstone's weather can vary quite a bit, ev...","[{'type': 'Physical', 'line2': 'Yellowstone Na...","[{'name': 'All Park Hours', 'standardHours': {...",[{'description': 'Provides unlimited entry for...,Yellowstone,"On March 1, 1872, Yellowstone became the first...",http://www.nps.gov/yell/planyourvisit/directio...,[],...,"[{'name': 'Arts and Culture', 'id': '09DF0950-...",https://www.nps.gov/yell/index.htm,-110.547169,F58C6D24-8D10-4573-9826-65D42B8B83AD,[{'url': 'https://www.nps.gov/common/uploads/s...,"Yellowstone National Park covers nearly 3,500 ...",Yellowstone National Park,yell,"lat:44.59824417, long:-110.5471695",44.598244
149,1,National Historic Trail,Due to the length of the California National H...,"[{'type': 'Physical', 'line2': 'California Nat...",[{'name': 'California National Historic Trail'...,[],California,"Follow in the footsteps of over 250,000 emigra...",http://www.nps.gov/cali/planyourvisit/directio...,[],...,"[{'name': 'Auto and ATV', 'id': '5F723BAD-7359...",https://www.nps.gov/cali/index.htm,-108.702415,B39C368F-CB27-49EC-B2A9-E6C1552430FB,[{'url': 'https://www.nps.gov/common/uploads/s...,Those portions of the California National Hist...,California National Historic Trail,cali,"lat:42.3999643979, long:-108.702415369",42.399964
170,1,National Monument,Obtain forecast information before beginning y...,"[{'type': 'Physical', 'line2': '', 'line1': '1...","[{'name': 'Park Hours', 'standardHours': {'fri...",[{'description': 'A great option if you visit ...,Devils Tower,The Tower is an astounding geologic feature th...,http://www.nps.gov/deto/planyourvisit/directio...,[],...,"[{'name': 'Astronomy', 'id': '13A57703-BB1A-41...",https://www.nps.gov/deto/index.htm,-104.715634,335368E4-B5CE-4370-8324-4A841AFA5025,[{'url': 'https://www.nps.gov/common/uploads/s...,The park entrance is located 33 miles northeas...,Devils Tower National Monument,deto,"lat:44.59064655, long:-104.7156341",44.590647
185,1,National Park,"Grand Teton National Park has long, cold winte...","[{'type': 'Physical', 'line2': '', 'line1': '1...","[{'name': 'Grand Teton National Park', 'standa...",[{'description': 'Pass is valid for one year t...,Grand Teton,"Soaring over a landscape rich with wildlife, p...",http://www.nps.gov/grte/planyourvisit/directio...,[],...,"[{'name': 'Arts and Culture', 'id': '09DF0950-...",https://www.nps.gov/grte/index.htm,-110.705467,FF73E2AA-E274-44E1-A8F5-9DD998B0F579,[{'url': 'https://www.nps.gov/common/uploads/s...,Grand Teton National Park is located in northw...,Grand Teton National Park,grte,"lat:43.81853565, long:-110.7054666",43.818536
209,1,National Historic Trail,Due to the length of the Mormon Pioneer Nation...,"[{'type': 'Physical', 'line2': 'Mormon Pioneer...",[{'name': 'Mormon Pioneer National Historic Tr...,[],Mormon Pioneer,Explore the Mormon Pioneer National Historic T...,http://www.nps.gov/mopi/planyourvisit/directio...,[],...,"[{'name': 'Auto and ATV', 'id': '5F723BAD-7359...",https://www.nps.gov/mopi/index.htm,-101.840838,AD731EF0-685F-4397-88C1-A9BB21EBF034,[{'url': 'https://www.nps.gov/common/uploads/s...,The Mormon Pioneer National Historic Trail cro...,Mormon Pioneer National Historic Trail,mopi,"lat:41.2650321741, long:-101.84083756",41.265032
220,1,National Historic Trail,Due to the length of Old Spanish National Hist...,"[{'type': 'Physical', 'line2': 'Old Spanish Na...",[{'name': 'Old Spanish National Historic Trail...,[],Old Spanish,Follow the routes of mule pack trains across t...,http://www.nps.gov/olsp/planyourvisit/directio...,[],...,"[{'name': 'Arts and Culture', 'id': '09DF0950-...",https://www.nps.gov/olsp/index.htm,-112.114693,8B07C05F-9C4A-4799-8428-DFA2BBB733AC,[{'url': 'https://www.nps.gov/common/uploads/s...,You can visit many sites of the Old Spanish Na...,Old Spanish National Historic Trail,olsp,"lat:37.0791782514, long:-112.114693431",37.079178
222,1,National Historic Trail,Due to the length of the Oregon National Histo...,"[{'type': 'Physical', 'line2': 'Oregon Nationa...","[{'name': 'Oregon National Historic Trail', 's...",[],Oregon,Imagine yourself an emigrant headed for Oregon...,http://www.nps.gov/oreg/planyourvisit/directio...,[],...,"[{'name': 'Auto and ATV', 'id': '5F723BAD-7359...",https://www.nps.gov/oreg/index.htm,-109.634201,4E072F12-3E85-4219-B456-A8F04C688879,[{'url': 'https://www.nps.gov/common/uploads/s...,"More than 2,000 miles of trail ruts and traces...",Oregon National Historic Trail,oreg,"lat:41.9876977831, long:-109.634200841",41.987698
229,1,National Historic Trail,Due to the length of the Pony Express National...,"[{'type': 'Physical', 'line2': 'Pony Express N...",[{'name': 'Pony Express National Historic Trai...,[],Pony Express,It is hard to believe that young men once rode...,http://www.nps.gov/poex/planyourvisit/directio...,[],...,"[{'name': 'Auto and ATV', 'id': '5F723BAD-7359...",https://www.nps.gov/poex/index.htm,-109.266148,A0125F49-080E-4292-BD91-C1FA1657F547,[{'url': 'https://www.nps.gov/common/uploads/s...,You can visit many sites of the Pony Express N...,Pony Express National Historic Trail,poex,"lat:42.2365761128, long:-109.266148417",42.236576
299,1,National Historic Site,Fort Laramie has mild weather for Wyoming. Whi...,"[{'type': 'Physical', 'line2': '', 'line1': '9...",[{'name': 'Park grounds and historic buildings...,[],Fort Laramie,Originally established as a private fur tradin...,http://www.nps.gov/fola/planyourvisit/directio...,[],...,"[{'name': 'Fishing', 'id': 'AE42B46C-E4B7-4889...",https://www.nps.gov/fola/index.htm,-104.545911,53C5C818-EB7C-4FEB-B94F-892588205D64,[{'url': 'https://www.nps.gov/common/uploads/s...,The park is located in southeast Wyoming appro...,Fort Laramie National Historic Site,fola,"lat:42.20301694, long:-104.5459112",42.203017
302,1,National Monument,Expect a variety of weather conditions no matt...,"[{'type': 'Physical', 'line2': '', 'line1': '8...","[{'name': 'Fossil Butte National Monument', 's...",[],Fossil Butte,Some of the world's best preserved fossils are...,http://www.nps.gov/fobu/planyourvisit/directio...,[],...,"[{'name': 'Hiking', 'id': 'BFF8C027-7C8F-480B-...",https://www.nps.gov/fobu/index.htm,-110.762475,57CCF213-6285-408F-AD58-CC2DB9D5B6C1,[{'url': 'https://www.nps.gov/common/uploads/s...,"By car: Travel 9 miles west of Kemmerer, Wyomi...",Fossil Butte National Monument,fobu,"lat:41.85635223, long:-110.7624754",41.856352


In [10]:
parks_df[
    (parks_df["longitude"] < -140)
    & (parks_df["latitude"] > 60)
    & (parks_df["designation"] == "National Park")
]

Unnamed: 0,relevanceScore,designation,weatherInfo,addresses,operatingHours,entrancePasses,name,description,directionsUrl,fees,...,activities,url,longitude,id,images,directionsInfo,fullName,parkCode,latLong,latitude
199,1,National Park,"Snow, rain, and freezing temperatures can occu...","[{'type': 'Physical', 'line2': '', 'line1': '1...","[{'name': 'Kobuk Valley National Park', 'stand...",[],Kobuk Valley,"Caribou, sand dunes, the Kobuk River, Onion Po...",http://www.nps.gov/kova/planyourvisit/directio...,[],...,"[{'name': 'Boating', 'id': '071BA73C-1D3C-46D4...",https://www.nps.gov/kova/index.htm,-159.200229,691831BF-F280-4E02-BF4A-FF476BC66B23,[{'url': 'https://www.nps.gov/common/uploads/s...,Kobuk Valley National Park is very remote. The...,Kobuk Valley National Park,kova,"lat:67.35631336, long:-159.2002293",67.356313
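When a filter has several conditions, naming each mask first can make it easier to read. The sketch below uses a made-up dataframe (all names and coordinates are invented for illustration) and also shows `~`, which negates a mask.

```python
import pandas as pd

# Hypothetical toy rows; names and coordinates are invented.
toy_df = pd.DataFrame(
    {
        "fullName": ["Park A", "Park B", "Park C", "Park D"],
        "designation": [
            "National Park",
            "National Park",
            "National Monument",
            "National Preserve",
        ],
        "longitude": [-159.2, -113.0, -163.5, -150.0],
        "latitude": [67.4, 37.3, 67.4, 63.0],
    }
)

# Each comparison produces one boolean value per row.
far_west = toy_df["longitude"] < -140
far_north = toy_df["latitude"] > 60
is_national_park = toy_df["designation"] == "National Park"

# & keeps rows where every mask is True; ~ flips True and False.
northern_parks = toy_df[far_west & far_north & is_national_park]
everything_else = toy_df[~is_national_park]

print(northern_parks["fullName"].tolist())
print(everything_else["fullName"].tolist())
```

Because each mask is already a finished boolean column, no extra parentheses are needed when combining them.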


We can select entire columns by passing a list of column names, and combine that with our filtering, too:

In [11]:
parks_df[["fullName", "states"]]

Unnamed: 0,fullName,states
0,Federal Hall National Memorial,NY
1,Lewis & Clark National Historic Trail,"IA,ID,IL,IN,KS,KY,MO,MT,NE,ND,OH,OR,PA,SD,WA,WV"
2,National Capital Parks-East,DC
3,Adams National Historical Park,MA
4,George Washington Memorial Parkway,"DC,MD,VA"
...,...,...
466,Navajo National Monument,AZ
467,Cabrillo National Monument,CA
468,Golden Spike National Historical Park,UT
469,Fort Union Trading Post National Historic Site,"MT,ND"


In [12]:
parks_df[
    (parks_df["longitude"] < -140)
    & (parks_df["latitude"] > 60)
    & (parks_df["designation"] == "National Park")
][["fullName", "states"]]

Unnamed: 0,fullName,states
199,Kobuk Valley National Park,AK
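The filter-then-select pattern above can also be written as a single `.loc` call that takes the row mask and the column list together. This is a sketch on made-up data (the rows in `toy_df` are invented), not a claim about how the course intends it to be done.

```python
import pandas as pd

# Hypothetical toy rows; names and values are invented for illustration.
toy_df = pd.DataFrame(
    {
        "fullName": ["Park A", "Park B", "Park C"],
        "states": ["AK", "UT", "AK"],
        "designation": ["National Park", "National Park", "National Preserve"],
        "latitude": [67.4, 37.3, 63.0],
    }
)

mask = (toy_df["latitude"] > 60) & (toy_df["designation"] == "National Park")

# Two steps: filter the rows, then select columns from the result.
chained = toy_df[mask][["fullName", "states"]]

# One step: .loc receives the row mask and the column list together.
single_step = toy_df.loc[mask, ["fullName", "states"]]

print(single_step)
```

For reading data the two forms give the same result. The single `.loc` call tends to matter more when assigning values to the selection, because chained indexing may operate on a temporary copy.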
