## Selectors

### Selecting with .loc and .iloc

Aside from using double brackets `[][]` to access values, DataFrame provides the `.loc[]` and `.iloc[]` indexers to select values by row label (index) or by position, respectively.

Here are some examples of using `.loc[]`:

In [None]:
import pandas as pd

# read data from csv
flights = pd.read_csv('../data/flights.csv', header=0)

# select single row by index
flights.loc[0]

# select multiple rows with a list of labels or a label slice (end label included)
flights.loc[[0, 5, 7, 10]]
flights.loc[0:3]

# select multiple rows and columns by index
flights.loc[0:3,['airline', 'src', 'dest']]

:::info `.loc[[rows],[columns]]`

With `.loc`, the first bracket selects rows and the second bracket selects columns. This is the reverse of the double-bracket order, where the column comes first.

:::
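
To make the row-then-column order concrete, here is a minimal sketch. It assumes a small made-up DataFrame (`flights_demo`, not the real `flights.csv`) with the same column names used above, so it can run without the data file.

```python
import pandas as pd

# Hypothetical stand-in for flights.csv; the values are made up for illustration.
flights_demo = pd.DataFrame({
    "airline": ["DL", "AS", "UA", "DL", "AS"],
    "flight_number": [101, 202, 303, 404, 505],
    "src": ["PDX", "LAX", "SFO", "LAX", "PDX"],
    "dest": ["JFK", "SEA", "ORD", "JFK", "LAX"],
})

# Rows first, columns second. Label slices include the end label,
# so 0:3 returns the four rows labelled 0, 1, 2 and 3.
subset = flights_demo.loc[0:3, ["airline", "src", "dest"]]
print(subset.shape)  # (4, 3)

# A single row label plus a single column label returns one value.
first_src = flights_demo.loc[0, "src"]
print(first_src)  # PDX
```

Note that `.loc[0:3]` behaves differently from a regular Python slice: the end label is part of the result.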

`.iloc[]` works the same way, but instead of labels (index) you select by row and column position numbers. In this case, since our flight records have a RangeIndex, the row positions are the **same** as the labels:

In [None]:
# select first row
flights.iloc[0]

# select multiple rows with a list of positions or a position slice (end position excluded)
flights.iloc[[0, 5, 7, 10]]
flights.iloc[0:3]

# select multiple rows and columns by position
flights.iloc[0:3,[0, 2, 4]]
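
Labels and positions only coincide when the index is the default RangeIndex. The following sketch uses a hypothetical DataFrame (`demo`) indexed by made-up flight numbers to show where `.loc` and `.iloc` diverge.

```python
import pandas as pd

# Hypothetical data indexed by flight number instead of the default RangeIndex.
demo = pd.DataFrame(
    {"airline": ["DL", "AS", "UA"], "src": ["PDX", "LAX", "SFO"]},
    index=[404, 101, 303],
)

by_label = demo.loc[101]     # the row whose index label is 101
by_position = demo.iloc[1]   # the second row, whatever its label is
print(by_label.equals(by_position))  # True

# Position slices exclude the end, so 0:2 returns positions 0 and 1 only.
first_two = demo.iloc[0:2]
print(list(first_two.index))  # [404, 101]
```

Here `demo.loc[1]` would raise a `KeyError`, because no row carries the label `1`.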

:::info Mixing `.loc` and `.iloc`

You can chain `.loc` and `.iloc` together:

:::


In [None]:
# mixing loc and iloc
# select the rows at positions 5-9 and a few columns
flights.iloc[5:10].loc[:, ['flight_number', 'src', 'dest']]
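
Chaining works, but the same selection can also be written as a single `.iloc` call by translating column names into positions first. This is one possible alternative, sketched on a hypothetical `demo` DataFrame; it relies on `Index.get_indexer`, which returns the integer positions of the given labels.

```python
import pandas as pd

# Hypothetical data with the same column names as the flights examples.
demo = pd.DataFrame({
    "airline": ["DL", "AS", "UA", "DL"],
    "flight_number": [101, 202, 303, 404],
    "src": ["PDX", "LAX", "SFO", "LAX"],
    "dest": ["JFK", "SEA", "ORD", "JFK"],
})

# Chained version: rows by position, then columns by label.
chained = demo.iloc[1:3].loc[:, ["flight_number", "src"]]

# Single-call version: look up the column positions, then use .iloc once.
col_positions = demo.columns.get_indexer(["flight_number", "src"])
single_call = demo.iloc[1:3, col_positions]

print(chained.equals(single_call))  # True
```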

### Conditional Selections

You can specify criteria for selecting values within the DataFrame:

In [None]:
# select delta airline flights
flights.loc[flights.airline == 'DL']
# same as above
flights.loc[flights['airline'] == 'DL']

# flights where distance is not null
flights.loc[flights.distance.notna()]
# or where distance is null
flights.loc[flights.distance.isna()]

# select flights out of PDX over 500 miles
flights.loc[(flights.src == 'PDX') & (flights.distance > 500.0)]

# apply multiple conditions:
# select delta or alaska flights
flights.loc[(flights.airline == 'DL') | (flights.airline == 'AS')]
# select delta airlines flights from LAX-JFK
flights.loc[(flights.airline == 'DL') & (flights.src == 'LAX') & (flights.dest == 'JFK')]

# select delta and alaska flights from LAX-JFK
a = flights.loc[(flights.airline.isin(['DL', 'AS'])) & 
            (flights.src == 'LAX') & (flights.dest == 'JFK')]

with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.max_colwidth', 500, 'display.width', 500):
    print(a)

:::tip Handy selection methods

Pandas has special selection methods for almost everything, such as `.isin()`, `.isna()`, and `.notna()`. Remember them and use them rigorously. See the examples above.

:::
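
Each condition in the examples above evaluates to a boolean Series, which `.loc` then uses as a row mask. The sketch below shows this on a hypothetical `demo` DataFrame with made-up values, including negation with `~`.

```python
import pandas as pd

# Hypothetical data; one distance is deliberately missing.
demo = pd.DataFrame({
    "airline": ["DL", "AS", "UA", "DL"],
    "src": ["PDX", "LAX", "PDX", "LAX"],
    "distance": [2454.0, None, 550.0, 2475.0],
})

# A condition is just a boolean Series with one entry per row.
mask = (demo.src == "PDX") & (demo.distance > 500.0)
print(mask.tolist())  # [True, False, True, False]

# ~ negates a mask: flights that are neither Delta nor Alaska.
not_dl_or_as = demo.loc[~demo.airline.isin(["DL", "AS"])]
print(not_dl_or_as.airline.tolist())  # ['UA']

# Rows where distance is missing.
missing = demo.loc[demo.distance.isna()]
print(len(missing))  # 1
```

The parentheses around each comparison matter, because `&` and `|` bind more tightly than `==` and `>` in Python.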

### Using the query() method

If you are more comfortable with SQL-style `WHERE` conditions, you can write the condition as a string and pass it to the pandas `.query()` method:


In [None]:
# select flights from PDX over 500 miles
flights.query("(src == 'PDX') & (distance > 500.0)")
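
Query strings can also use the keywords `and` / `or` and refer to Python variables with an `@` prefix. A minimal sketch, assuming a hypothetical `demo` DataFrame and an illustrative `min_distance` threshold:

```python
import pandas as pd

# Hypothetical data with made-up distances.
demo = pd.DataFrame({
    "airline": ["DL", "AS", "UA", "DL"],
    "src": ["PDX", "LAX", "PDX", "LAX"],
    "distance": [2454.0, 954.0, 550.0, 2475.0],
})

# Illustrative threshold; @min_distance pulls it into the query string.
min_distance = 1000.0
result = demo.query("src == 'PDX' and distance > @min_distance")
print(result.index.tolist())  # [0]
```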

### Subselections

You can save a selection by assigning it to a variable, then subselect further within that set:

In [None]:
# select flights from PDX
pdx_flights = flights.loc[flights.src == 'PDX']
# find long distance flights
pdx_long_distance = pdx_flights.query("distance > 500.0")
pdx_long_distance
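
One thing to keep in mind is that a subselection keeps the index labels of the original DataFrame. The sketch below uses a hypothetical `demo` DataFrame to show this, and how `reset_index(drop=True)` renumbers the rows if that is wanted.

```python
import pandas as pd

# Hypothetical data with made-up distances.
demo = pd.DataFrame({
    "src": ["LAX", "PDX", "SFO", "PDX"],
    "distance": [2475.0, 2454.0, 1846.0, 129.0],
})

pdx_demo = demo.loc[demo.src == "PDX"]
long_demo = pdx_demo.query("distance > 500.0")

# The surviving row still carries its original label.
print(long_demo.index.tolist())  # [1]

# Renumber from 0 and discard the old labels.
renumbered = long_demo.reset_index(drop=True)
print(renumbered.index.tolist())  # [0]
```

After a subselection, `.loc[0]` may therefore fail even though `.iloc[0]` works.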