# Indexing, Selecting, and Filtering

## Setup

In [1]:
import pandas as pd

## Indexing & Slicing in `Python`

In [2]:
students = ["Alice", "Bob", "Clara", "David", "Emily", "Fred"]

In [3]:
students[0]

'Alice'

In [4]:
students[-1]

'Fred'

In [5]:
students[:3]

['Alice', 'Bob', 'Clara']

In [6]:
students[-3:]

['David', 'Emily', 'Fred']

In [7]:
students[:]

['Alice', 'Bob', 'Clara', 'David', 'Emily', 'Fred']
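
As a short recap of why the results above look the way they do: a Python slice `seq[start:stop]` includes the `start` position and excludes the `stop` position. The sketch below reuses the `students` list; the variable name `middle` is only illustrative.

```python
students = ["Alice", "Bob", "Clara", "David", "Emily", "Fred"]

# The stop position is excluded, so [1:4] yields positions 1, 2 and 3.
middle = students[1:4]
print(middle)  # ['Bob', 'Clara', 'David']

# For in-range positions, the length of a slice is stop - start.
print(len(middle))  # 3

# Splitting at the same position neither duplicates nor loses an element.
print(students[:3] + students[3:] == students)  # True
```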

## Creation

In [8]:
data = {
    "Capital": {
        "Spain": "Madrid",
        "Belgium": "Brussels",
        "France": "Paris",
        "Italy": "Roma",
        "Germany": "Berlin",
        "Portugal": "Lisbon",
        "Norway": "Oslo",
        "Greece": "Athens",
    },
    "Population": {
        "Spain": 46733038,
        "Belgium": 11449656,
        "France": 67076000,
        "Italy": 60390560,
        "Germany": 83122889,
        "Portugal": 10295909,
        "Norway": 5391369,
        "Greece": 10718565,
    },
    "Monarch": {
        "Spain": "Felipe VI",
        "Belgium": "Philippe",
        "Norway": "Harald V",
    },
    "Area": {
        "Spain": 505990,
        "Belgium": 30688,
        "France": 640679,
        "Italy": 301340,
        "Germany": 357022,
        "Portugal": 92212,
        "Norway": 385207,
        "Greece": 131957,
    },
}

In [9]:
# Setup steps; their details do not matter for now:
df = pd.DataFrame(data)
df["Capital"] = df["Capital"].astype("string")
df["Monarch"] = df["Monarch"].astype("string")

In [10]:
df

Unnamed: 0,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957
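
The nested dictionary passed to `pd.DataFrame` maps each column name to a `{row label: value}` dictionary, and a row label that is absent from one inner dictionary ends up as a missing value in that column. A minimal sketch, assuming a hypothetical two-country subset named `small`:

```python
import pandas as pd

small = pd.DataFrame(
    {
        "Capital": {"Spain": "Madrid", "France": "Paris"},
        "Monarch": {"Spain": "Felipe VI"},  # no entry for France
    }
)

# Outer keys become columns, inner keys become the index.
print(small.columns.tolist())  # ['Capital', 'Monarch']
print(sorted(small.index))     # ['France', 'Spain']

# The label missing from the "Monarch" dictionary is filled with a missing value.
print(pd.isna(small.loc["France", "Monarch"]))  # True
```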


Apple stock data, taken from the [`matplotlib` sample datasets](https://github.com/matplotlib/sample_data/blob/master/aapl.csv)

In [11]:
# Setup steps; their details do not matter for now:
apple = pd.read_csv("AAPL.csv")
apple["Date"] = apple["Date"].astype("datetime64[ns]")
apple = apple.set_index("Date")
apple = apple.sort_index()

In [12]:
apple.head()

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


## Demo 1: Selecting one column (as a `Series`)

In [13]:
df

Unnamed: 0,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


Select one column:

In [14]:
df["Capital"]

Spain         Madrid
Belgium     Brussels
France         Paris
Italy           Roma
Germany       Berlin
Portugal      Lisbon
Norway          Oslo
Greece        Athens
Name: Capital, dtype: string

Check the type of the object returned:

In [15]:
type(df["Capital"])

pandas.core.series.Series

## Exercise 1

In [16]:
apple.head()

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


Select the "Volume" column of the `apple` DataFrame:

In [17]:
apple["Volume"]

Date
1984-09-07     2981600
1984-09-10     2346400
1984-09-11     5444000
1984-09-12     4773600
1984-09-13     7429600
                ...   
2008-10-08    78847900
2008-10-09    57763700
2008-10-10    79260700
2008-10-13    54967000
2008-10-14    70749800
Name: Volume, Length: 6081, dtype: int64

Check the type of the object returned:

In [18]:
type(apple["Volume"])

pandas.core.series.Series

## Demo 2: Selecting several columns

In [19]:
df

Unnamed: 0,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


Select several columns:

In [20]:
df[["Capital", "Area"]]

Unnamed: 0,Capital,Area
Spain,Madrid,505990
Belgium,Brussels,30688
France,Paris,640679
Italy,Roma,301340
Germany,Berlin,357022
Portugal,Lisbon,92212
Norway,Oslo,385207
Greece,Athens,131957


Check the type of the object returned:

In [21]:
type(df[["Capital", "Area"]])

pandas.core.frame.DataFrame

## Exercise 2

In [22]:
apple.head()

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


Select the "Open" and "Close" columns of the `apple` DataFrame:

In [23]:
apple[["Open","Close"]]

Unnamed: 0_level_0,Open,Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1
1984-09-07,26.50,26.50
1984-09-10,26.50,26.37
1984-09-11,26.62,26.87
1984-09-12,26.87,26.12
1984-09-13,27.50,27.50
...,...,...
2008-10-08,85.91,89.79
2008-10-09,93.35,88.74
2008-10-10,85.70,96.80
2008-10-13,104.55,110.26


Check the type of the object returned:

In [24]:
type(apple[["Open","Close"]])

pandas.core.frame.DataFrame

In [25]:
apple[["Open"]]

Unnamed: 0_level_0,Open
Date,Unnamed: 1_level_1
1984-09-07,26.50
1984-09-10,26.50
1984-09-11,26.62
1984-09-12,26.87
1984-09-13,27.50
...,...
2008-10-08,85.91
2008-10-09,93.35
2008-10-10,85.70
2008-10-13,104.55


## Demo 3: Selecting one column (as a `DataFrame`)

In [26]:
df

Unnamed: 0,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


Select one column, and return a DataFrame:

In [27]:
df[["Monarch"]]

Unnamed: 0,Monarch
Spain,Felipe VI
Belgium,Philippe
France,
Italy,
Germany,
Portugal,
Norway,Harald V
Greece,


Check the type of the object returned:

In [28]:
type(df[["Monarch"]])

pandas.core.frame.DataFrame
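
The difference between the two ways of selecting one column comes down to what is placed inside the brackets: a single label gives a one-dimensional `Series`, while a list of labels (even a list of one) gives a two-dimensional `DataFrame`. A minimal sketch on a hypothetical two-row frame named `toy`:

```python
import pandas as pd

toy = pd.DataFrame(
    {"Monarch": ["Felipe VI", "Philippe"]},
    index=["Spain", "Belgium"],
)

as_series = toy["Monarch"]   # single label -> Series
as_frame = toy[["Monarch"]]  # list of labels -> DataFrame

print(type(as_series).__name__, as_series.shape)  # Series (2,)
print(type(as_frame).__name__, as_frame.shape)    # DataFrame (2, 1)
```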

## Exercise 3

In [29]:
apple.head()

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


Select the "Adj Close" column of the `apple` DataFrame, and return a DataFrame:

In [30]:
apple[["Adj Close"]]

Unnamed: 0_level_0,Adj Close
Date,Unnamed: 1_level_1
1984-09-07,3.02
1984-09-10,3.01
1984-09-11,3.07
1984-09-12,2.98
1984-09-13,3.14
...,...
2008-10-08,89.79
2008-10-09,88.74
2008-10-10,96.80
2008-10-13,110.26


Check the type of the object returned:

In [31]:
type(apple[["Adj Close"]])

pandas.core.frame.DataFrame

## Demo 4: Slicing rows (using the index)

In [32]:
df

Unnamed: 0,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


Slice the first rows, up to and including "Italy":

In [33]:
df[:"Italy"]

Unnamed: 0,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340


Check the shape:

In [34]:
df[:"Italy"].shape

(4, 4)

<div class="alert alert-warning">

<b>Beware:</b> Unlike in <code>Python</code>, <b>the end point is included</b> when slicing in <code>pandas</code> <b>using the index!</b>

</div>

Slice the last few rows, starting from "Italy":

In [35]:
df["Italy":]

Unnamed: 0,Capital,Population,Monarch,Area
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


Check the shape:

In [36]:
df["Italy":].shape

(5, 4)

Slice the rows from "Belgium" up to and including "Germany":

In [37]:
df["Belgium":"Germany"]

Unnamed: 0,Capital,Population,Monarch,Area
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022


Check the shape:

In [38]:
df["Belgium":"Germany"].shape

(4, 4)

<div class="alert alert-warning">

<b>Beware:</b> Unlike in <code>Python</code>, <b>the end point is included</b> when slicing in <code>pandas</code> <b>using the index!</b>

</div>
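
A label-based slice written with plain brackets selects the same rows as that slice passed to `.loc`, which is the explicit label-based accessor in `pandas`. The sketch below checks this on a hypothetical three-row frame named `toy`; it is an illustration, not part of the original demo.

```python
import pandas as pd

toy = pd.DataFrame(
    {"Area": [505990, 30688, 640679]},
    index=["Spain", "Belgium", "France"],
)

by_brackets = toy["Spain":"Belgium"]
by_loc = toy.loc["Spain":"Belgium"]

# Both end points are included when slicing with labels.
print(by_brackets.index.tolist())  # ['Spain', 'Belgium']
print(by_brackets.equals(by_loc))  # True
```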

## Exercise 4

In [39]:
apple.head()

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


Slice the first rows of the `apple` DataFrame, up to and including 14 September 1984:

In [40]:
apple[:"1984-09-14"]

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14
1984-09-14,27.62,28.5,27.62,27.87,8826400,3.18


Slice the last rows of the `apple` DataFrame, starting from 1 October 2008:

In [41]:
apple["2008-10-01":]

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
2008-10-01,111.92,112.36,107.39,109.12,46303000,109.12
2008-10-02,108.01,108.79,100.0,100.1,57477300,100.1
2008-10-03,104.0,106.5,94.65,97.07,81942800,97.07
2008-10-06,91.96,98.78,87.54,98.14,75264900,98.14
2008-10-07,100.48,101.5,88.95,89.16,67099000,89.16
2008-10-08,85.91,96.33,85.68,89.79,78847900,89.79
2008-10-09,93.35,95.8,86.6,88.74,57763700,88.74
2008-10-10,85.7,100.0,85.0,96.8,79260700,96.8
2008-10-13,104.55,110.53,101.02,110.26,54967000,110.26
2008-10-14,116.26,116.4,103.14,104.08,70749800,104.08


Slice the rows of the `apple` DataFrame for the month of February 2001:

In [42]:
apple["2001-02":"2001-02"]

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
2001-02-01,20.69,21.5,20.5,21.12,13205400,10.56
2001-02-02,21.12,21.94,20.5,20.62,15263400,10.31
2001-02-05,20.5,20.56,19.75,20.19,10228800,10.1
2001-02-06,20.16,21.39,20.0,21.12,16528400,10.56
2001-02-07,20.66,20.87,19.81,20.75,14071600,10.38
2001-02-08,20.56,21.06,20.19,20.75,21585000,10.38
2001-02-09,20.5,20.81,18.69,19.12,21082600,9.56
2001-02-12,19.06,20.0,18.81,19.69,9795600,9.85
2001-02-13,19.94,20.44,19.0,19.12,8470600,9.56
2001-02-14,19.19,19.62,18.5,19.5,11040000,9.75
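
Slicing by month works because a `DatetimeIndex` accepts partial date strings: `"2001-02"` stands for the whole month. The sketch below uses a small synthetic frame (the name `prices` and its values are made up, since the CSV file is not assumed to be available) and passes the partial string to `.loc`.

```python
import pandas as pd

# Synthetic daily data covering the end of January to the start of March 2001.
dates = pd.date_range("2001-01-30", "2001-03-02", freq="D")
prices = pd.DataFrame({"Close": range(len(dates))}, index=dates)

# A partial date string selects every row that falls within that month.
february = prices.loc["2001-02"]

print(february.index.min().date(), february.index.max().date())  # 2001-02-01 2001-02-28
print(len(february))  # 28

# The slice form used above selects the same rows.
print(february.equals(prices["2001-02":"2001-02"]))  # True
```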


## Demo 5: Slicing rows (using integers)

In [43]:
df

Unnamed: 0,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


Slice the first 4 rows:

In [44]:
df[:4]

Unnamed: 0,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340


Check the shape:

In [45]:
df[:4].shape

(4, 4)

<div class="alert alert-warning">

<b>Beware:</b> Like in <code>Python</code>, <b>the end point is NOT included</b> when slicing in <code>pandas</code> <b>using integers!</b>

</div>

Slice the last 3 rows:

In [46]:
df[-3:]

Unnamed: 0,Capital,Population,Monarch,Area
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


Check the shape:

In [47]:
df[-3:].shape

(3, 4)

Slice the rows from the third to the fifth (both included):

In [48]:
df

Unnamed: 0,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


In [49]:
df[2:5]

Unnamed: 0,Capital,Population,Monarch,Area
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022


Check the shape:

In [50]:
df[2:5].shape

(3, 4)

<div class="alert alert-warning">

<b>Beware:</b> Like in <code>Python</code>, <b>the end point is NOT included</b> when slicing in <code>pandas</code> <b>using integers!</b>

</div>
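
An integer slice written with plain brackets is positional, so it selects the same rows as that slice passed to `.iloc`, the explicit position-based accessor. A minimal sketch on a hypothetical five-row frame named `toy`:

```python
import pandas as pd

toy = pd.DataFrame(
    {"Area": [505990, 30688, 640679, 301340, 357022]},
    index=["Spain", "Belgium", "France", "Italy", "Germany"],
)

by_brackets = toy[2:5]
by_iloc = toy.iloc[2:5]

# Positions 2, 3 and 4 are returned; position 5 is excluded.
print(by_brackets.index.tolist())  # ['France', 'Italy', 'Germany']
print(by_brackets.equals(by_iloc))  # True
```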

## Exercise 5

In [51]:
apple.head()

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


Slice the first 3 rows of the `apple` DataFrame:

In [52]:
apple[:3]

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07


Check the shape:

In [53]:
apple[:3].shape

(3, 6)

Slice the last 6 rows of the `apple` DataFrame:

In [54]:
apple[-6:]

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
2008-10-07,100.48,101.5,88.95,89.16,67099000,89.16
2008-10-08,85.91,96.33,85.68,89.79,78847900,89.79
2008-10-09,93.35,95.8,86.6,88.74,57763700,88.74
2008-10-10,85.7,100.0,85.0,96.8,79260700,96.8
2008-10-13,104.55,110.53,101.02,110.26,54967000,110.26
2008-10-14,116.26,116.4,103.14,104.08,70749800,104.08


Check the shape:

In [55]:
apple[-6:].shape

(6, 6)

Slice the second to the fourth rows of the `apple` DataFrame:

In [56]:
apple[1:4]

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98


Check the shape:

In [60]:
apple[1:4].shape

(3, 6)

## Demo 6: Selecting data with a boolean array

In [57]:
df

Unnamed: 0,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
Belgium,Brussels,11449656,Philippe,30688
France,Paris,67076000,,640679
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


In [58]:
df["Population"]

Spain       46733038
Belgium     11449656
France      67076000
Italy       60390560
Germany     83122889
Portugal    10295909
Norway       5391369
Greece      10718565
Name: Population, dtype: int64

Comparisons return boolean arrays:

In [59]:
df["Population"] < 15_000_000

Spain       False
Belgium      True
France      False
Italy       False
Germany     False
Portugal     True
Norway       True
Greece       True
Name: Population, dtype: bool

Select the rows for which the population is less than 15 million people:

In [60]:
df[df["Population"] < 15_000_000]

Unnamed: 0,Capital,Population,Monarch,Area
Belgium,Brussels,11449656,Philippe,30688
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


Check the shape:

In [61]:
df[df["Population"] < 15_000_000].shape

(4, 4)

Select the rows for which the area is greater than or equal to 400 thousand square km:

In [62]:
df["Area"]

Spain       505990
Belgium      30688
France      640679
Italy       301340
Germany     357022
Portugal     92212
Norway      385207
Greece      131957
Name: Area, dtype: int64

In [63]:
df["Area"] >= 400_000

Spain        True
Belgium     False
France       True
Italy       False
Germany     False
Portugal    False
Norway      False
Greece      False
Name: Area, dtype: bool

In [64]:
df[df["Area"] >= 400_000]

Unnamed: 0,Capital,Population,Monarch,Area
Spain,Madrid,46733038,Felipe VI,505990
France,Paris,67076000,,640679


Check the shape:

In [65]:
df[df["Area"] >= 400_000].shape

(2, 4)

Select the rows for which the area is smaller than 400 thousand square km:

In [66]:
df[~(df["Area"] >= 400_000)]

Unnamed: 0,Capital,Population,Monarch,Area
Belgium,Brussels,11449656,Philippe,30688
Italy,Roma,60390560,,301340
Germany,Berlin,83122889,,357022
Portugal,Lisbon,10295909,,92212
Norway,Oslo,5391369,Harald V,385207
Greece,Athens,10718565,,131957


Check the shape:

In [67]:
df[~(df["Area"] >= 400_000)].shape

(6, 4)
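
A note on negating a condition with `~`: in Python, `~` binds more tightly than comparison operators, so the comparison has to be wrapped in parentheses before it is negated. Otherwise `~` is applied to the integers themselves. The sketch below shows the difference on a hypothetical three-element Series named `area`.

```python
import pandas as pd

area = pd.Series([505990, 30688, 640679], index=["Spain", "Belgium", "France"])

# Without parentheses, ~ is applied to the integers first (bitwise NOT: ~x == -x - 1),
# and the resulting negative numbers are then compared with 400_000.
unparenthesised = ~area < 400_000
print(unparenthesised.tolist())  # [True, True, True]

# With parentheses, the boolean result of the comparison is negated.
negated = ~(area >= 400_000)
print(negated.tolist())  # [False, True, False]

# Negating ">=" is the same as writing "<" directly.
print(negated.equals(area < 400_000))  # True
```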

The original DataFrame has been split into two parts:

In [None]:
df.shape

Select the rows for which the capital is "Roma":

In [None]:
df["Capital"] == "Roma"

In [None]:
df[df["Capital"] == "Roma"]

Check the shape:

In [None]:
df[df["Capital"] == "Roma"].shape

## Exercise 6

In [73]:
apple.head()

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1984-09-07,26.5,26.87,26.25,26.5,2981600,3.02
1984-09-10,26.5,26.62,25.87,26.37,2346400,3.01
1984-09-11,26.62,27.37,26.62,26.87,5444000,3.07
1984-09-12,26.87,27.0,26.12,26.12,4773600,2.98
1984-09-13,27.5,27.62,27.5,27.5,7429600,3.14


Select the rows of the `apple` DataFrame for which the "Open" column was less than or equal to 26.50:

In [76]:
apple.Open

Date
1984-09-07     26.50
1984-09-10     26.50
1984-09-11     26.62
1984-09-12     26.87
1984-09-13     27.50
               ...  
2008-10-08     85.91
2008-10-09     93.35
2008-10-10     85.70
2008-10-13    104.55
2008-10-14    116.26
Name: Open, Length: 6081, dtype: float64

In [77]:
apple.Open <= 26.5

Date
1984-09-07     True
1984-09-10     True
1984-09-11    False
1984-09-12    False
1984-09-13    False
              ...  
2008-10-08    False
2008-10-09    False
2008-10-10    False
2008-10-13    False
2008-10-14    False
Name: Open, Length: 6081, dtype: bool

In [78]:
apple[apple.Open <= 26.50]

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1984-09-07,26.50,26.87,26.25,26.50,2981600,3.02
1984-09-10,26.50,26.62,25.87,26.37,2346400,3.01
1984-09-25,26.50,26.50,26.12,26.12,5977600,2.98
1984-09-26,26.12,27.25,25.75,25.75,3987200,2.94
1984-09-27,25.75,25.87,25.75,25.75,3796000,2.94
...,...,...,...,...,...,...
2004-05-04,25.97,26.55,25.50,26.14,9999400,13.07
2004-05-05,26.20,26.75,25.96,26.65,8503800,13.32
2004-05-06,26.40,26.75,25.90,26.58,9412800,13.29
2004-05-10,26.27,26.60,25.94,26.28,8927800,13.14


Check the shape:

In [79]:
apple[apple.Open <= 26.50].shape

(1757, 6)

Select the rows of the `apple` DataFrame for which the "Volume" column was greater than 100_000_000:

In [80]:
apple[apple.Volume > 100_000_000]

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
1997-08-06,25.25,27.75,25.0,26.31,149671200,6.58
1997-08-07,28.75,29.56,28.37,29.19,134124400,7.3
1999-09-21,73.19,73.25,69.0,69.25,119931200,17.31
2000-09-29,28.19,29.0,25.37,25.75,265069000,12.88
2005-01-13,73.71,74.42,69.73,69.8,113025600,34.9
2007-01-09,86.45,92.98,85.15,92.57,119617800,92.57
2007-01-10,94.75,97.8,93.45,97.0,105460000,97.0
2008-01-23,136.19,140.0,126.14,139.07,120463200,139.07


Check the shape:

In [81]:
apple[apple.Volume > 100_000_000].shape

(8, 6)

Using the `apple` DataFrame, find out on how many days the "Close" value was exactly 14.00:

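In [82]:
apple[apple.Close == 14.00]

Unnamed: 0_level_0,Open,High,Low,Close,Volume,Adj Close
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1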

Check the shape:

In [83]:
apple[apple.Close == 14.00].shape

(0, 6)
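
An empty result may simply mean that no such day exists. Still, exact `==` comparisons on floating-point columns can also miss values that differ only by rounding error. One common alternative, not used in this notebook, is a tolerance-based comparison with `numpy.isclose`. The sketch below uses made-up values in a Series named `close`.

```python
import numpy as np
import pandas as pd

# 0.1 + 0.2 is not exactly 0.3 in binary floating point.
close = pd.Series([13.99, 0.1 + 0.2, 14.00])

exact = close == 0.3               # strict equality
tolerant = np.isclose(close, 0.3)  # equality within a small tolerance

print(int(exact.sum()))     # 0
print(int(tolerant.sum()))  # 1
```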

## Summary

 Command                       | Result
:------------------------------|:------------------------------------------------------
`df["Column"]`                 | Selects one column, and returns a `Series`
`df[["Column_1", "Column_2"]]` | Selects several columns, and returns a `DataFrame`
`df[["Column"]]`               | Selects one column, and returns a `DataFrame`
`df[:"Spain"]`                 | Slices rows using the index, and returns a `DataFrame`
`df[:10]`                      | Slices rows using integers, and returns a `DataFrame`
`df[df["Column"] > 0]`         | Selects rows using a boolean array, and returns a `DataFrame`
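
The rows of the table can be checked side by side on a small frame. The sketch below assumes a hypothetical three-row frame named `toy`, built from a subset of the country data used in the demos.

```python
import pandas as pd

toy = pd.DataFrame(
    {"Population": [46733038, 11449656, 67076000], "Area": [505990, 30688, 640679]},
    index=["Spain", "Belgium", "France"],
)

print(type(toy["Area"]).__name__)         # Series
print(toy[["Population", "Area"]].shape)  # (3, 2)
print(toy[["Area"]].shape)                # (3, 1)
print(toy[:"Belgium"].shape)              # (2, 2), end label included
print(toy[:1].shape)                      # (1, 2), end position excluded
print(toy[toy["Area"] > 400_000].shape)   # (2, 2)
```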