# Selecting Rows and Columns in pandas

### 1. Reading Data
We first import `pandas` and load a table into a DataFrame.

In [None]:
import pandas as pd

df = pd.read_csv('population.csv', index_col=0)
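
The `index_col=0` argument tells pandas to use the first column of the file as row labels. The sketch below illustrates this with a tiny in-memory CSV; the text, the name `toy`, and all values are made up for illustration and are not taken from `population.csv`.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file (values are made up).
csv_text = "country,1900,2000\nA,1,2\nB,3,4\n"

# index_col=0 turns the first column ('country') into the row labels.
toy = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(toy.index)    # row labels taken from the first column
print(toy.columns)  # header names are read as strings, e.g. '1900'
```

Because the header names are read as strings, columns are later selected with quoted labels such as `df['2015']`.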

### 2. Attributes and Methods

`.shape` is an *attribute*: you access it with a dot after the DataFrame's name, without parentheses. It holds the number of rows and columns of the DataFrame as a Python *tuple*:

In [None]:
df.shape
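
Since `.shape` is an ordinary tuple, it can be unpacked into two variables. A minimal sketch, assuming a small made-up DataFrame called `toy` in place of the population table:

```python
import pandas as pd

# Hypothetical stand-in for the population table (values are made up).
toy = pd.DataFrame(
    {"1900": [1.0, 2.0, 3.0], "2000": [4.0, 5.0, 6.0]},
    index=["A", "B", "C"],
)

# Tuple unpacking: the first element is the row count, the second the column count.
n_rows, n_cols = toy.shape
print(n_rows, n_cols)
```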

`.head()` is a *method*: you call it on a DataFrame with a dot, followed by parentheses.
It returns the first N rows of the DataFrame (five if no number is given).

In [None]:
df.head(3)
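
The effect of the argument can be checked on a small example. This is a sketch with a made-up eight-row table named `toy`, not the population data:

```python
import pandas as pd

# Hypothetical table with eight rows (values are made up).
toy = pd.DataFrame({"value": range(8)}, index=list("abcdefgh"))

first_default = toy.head()  # no argument: the first 5 rows
first_three = toy.head(3)   # explicit argument: the first 3 rows

print(len(first_default), len(first_three))
```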

### 3. Selecting Rows and Columns
Match the Python commands in the cells further down with the following descriptions.

* remove rows with missing values
* select a single row
* inspect column labels
* select multiple columns
* select rows by position
* select rows that match a condition
* select multiple rows
* select a single column
* select values in a given range
* select rows and columns by position
* inspect row labels
* select rows and columns

Then create a new Markdown cell above each command to give it a heading.

In [None]:
# inspect row labels
df.index

In [None]:
# inspect column labels
df.columns

In [None]:
# select a single column
df['2015']

In [None]:
# select multiple columns
df[['1900', '1950', '2000']]

In [None]:
# select a single row
df.loc['Estonia']

In [None]:
# select multiple rows
df.loc[['Japan', 'China']]

In [None]:
# select rows and columns
df.loc['Croatia', '2000']

In [None]:
# select rows by position
df.iloc[10:15]

In [None]:
# select rows and columns by position
df.iloc[10:15, 75:]
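
One difference between `.loc` and `.iloc` is easy to miss: label slices include the end label, while positional slices exclude the stop position. A minimal sketch, assuming a made-up table `toy` with four rows and three columns:

```python
import pandas as pd

# Hypothetical stand-in table (values are made up).
toy = pd.DataFrame(
    {"1900": [10, 20, 30, 40], "1950": [11, 21, 31, 41], "2000": [12, 22, 32, 42]},
    index=["A", "B", "C", "D"],
)

# Label-based slices include both endpoints: rows B and C, columns 1900 and 1950.
by_label = toy.loc["B":"C", "1900":"1950"]

# Position-based slices exclude the stop: rows 1 and 2, columns 0 and 1.
by_position = toy.iloc[1:3, 0:2]

print(by_label.equals(by_position))
```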

In [30]:
# select rows that match a condition
df.loc[df['2000'] > 200_000_000]

| Total population | 1800 | 1810 | 1820 | 1830 | 1840 | 1850 | 1860 | 1870 | 1880 | 1890 | ... | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| China | 321675013.0 | 350542958.0 | 380055273.0 | 402373519.0 | 411213424.0 | 402711280.0 | 380047548.0 | 363661158.0 | 365544192.0 | 377135349.0 | ... | 1312601000.0 | 1319625000.0 | 1326691000.0 | 1333807000.0 | 1340969000.0 | 1348174000.0 | 1355387000.0 | 1362514000.0 | 1369436000.0 | 1376049000.0 |
| India | 168574895.0 | 171940819.0 | 176225709.0 | 182214537.0 | 189298397.0 | 196657653.0 | 204966302.0 | 213725049.0 | 223020377.0 | 232819584.0 | ... | 1162088000.0 | 1179686000.0 | 1197070000.0 | 1214182000.0 | 1230985000.0 | 1247446000.0 | 1263590000.0 | 1279499000.0 | 1295292000.0 | 1311051000.0 |
| Indonesia | 16108545.0 | 16537268.0 | 17236636.0 | 18460171.0 | 20052305.0 | 21979198.0 | 24209376.0 | 27062539.0 | 30212871.0 | 33747355.0 | ... | 229264000.0 | 232296800.0 | 235360800.0 | 238465200.0 | 241613100.0 | 244808300.0 | 248037900.0 | 251268300.0 | 254454800.0 | 257563800.0 |
| United States | 6801854.0 | 8294928.0 | 10361646.0 | 13480460.0 | 17942443.0 | 24136293.0 | 31936643.0 | 40821569.0 | 51256498.0 | 63810074.0 | ... | 298860500.0 | 301656000.0 | 304473100.0 | 307232000.0 | 309876200.0 | 312390400.0 | 314799500.0 | 317135900.0 | 319448600.0 | 321773600.0 |
| USSR | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | 287266500.0 | 287330800.0 | 287447400.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
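
A condition such as `df['2000'] > 200_000_000` produces a boolean Series, and several conditions can be combined with `&` (and) or `|` (or), each wrapped in parentheses. The sketch below uses a made-up table `toy` and arbitrary thresholds to show the idea:

```python
import pandas as pd

# Hypothetical stand-in table (values are made up).
toy = pd.DataFrame(
    {"1900": [5, 50, 500], "2000": [10, 100, 1000]},
    index=["A", "B", "C"],
)

# Each comparison yields a boolean Series; parentheses are required around each one.
mask = (toy["2000"] > 50) & (toy["1900"] < 100)

# Only rows where the mask is True are kept.
selected = toy.loc[mask]
print(selected)
```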


In [None]:
# select values in a given range
df[df['2000'].between(500_000, 1_000_000)]
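
By default, `.between()` includes both boundary values, so it behaves like a pair of `>=` and `<=` comparisons. A small sketch with a made-up table `toy` to check this:

```python
import pandas as pd

# Hypothetical stand-in table (values are made up).
toy = pd.DataFrame(
    {"2000": [400_000, 500_000, 750_000, 1_000_000, 2_000_000]},
    index=list("ABCDE"),
)

# Default behaviour: both boundaries are part of the range.
in_range = toy[toy["2000"].between(500_000, 1_000_000)]

# The same selection written with explicit comparisons.
manual = toy[(toy["2000"] >= 500_000) & (toy["2000"] <= 1_000_000)]

print(in_range.equals(manual))
```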

In [None]:
# remove rows with missing values
df.dropna()
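
Note that `.dropna()` returns a new DataFrame and leaves the original unchanged; the optional `subset` argument restricts the check to selected columns. A minimal sketch, assuming a made-up table `toy` with two gaps:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in table with missing values (values are made up).
toy = pd.DataFrame(
    {"1900": [1.0, np.nan, 3.0], "2000": [4.0, 5.0, np.nan]},
    index=["A", "B", "C"],
)

# Keep only rows that have no missing value in any column.
complete = toy.dropna()

# Keep rows that have a value in column '2000', whatever the other columns hold.
has_2000 = toy.dropna(subset=["2000"])

print(len(toy), len(complete), len(has_2000))
```

To keep the cleaned table, assign the result to a variable as done here.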

## License
(c) 2017 Kristian Rother.
Distributed under the conditions of the MIT License.