# Pandas Library Practice II

- Using iloc, loc, & ix to select rows and columns

Since May 2018

Tianyu

## Pandas Data Selection
There are multiple ways to select and index rows and columns from Pandas DataFrames. I find tutorials online focusing on advanced selections of row and column choices a little complex for my requirements.

### Selection Options
There are three main options for selecting and indexing data in Pandas, which can be confusing. The three selection cases and methods covered in this post are:

- Selecting data by row numbers (.iloc)
- Selecting data by label or by a conditional statement (.loc)
- Selecting in a hybrid approach (.ix) (deprecated since Pandas 0.20 and removed in Pandas 1.0, so skipped here)

### Data Setup

![loc selection](pic/Pandas-selections-and-indexing.png)

This blog post, inspired by other tutorials, describes selection activities with these operations. The tutorial is suited to the general data science situation in which I typically find myself:

- Each row in the data frame represents a data sample.
- Each column is a variable, and is usually named. I rarely select columns without their names.
- I need to quickly and often select relevant rows from the data frame for modelling and visualisation activities.



### Selection and Indexing Methods for Pandas DataFrames
For these explorations we’ll need some sample data – I downloaded the uk-500 sample data set from www.briandunning.com. This data contains artificial names, addresses, companies and phone numbers for fictitious UK characters. 

To follow along, you can download the .csv file here. Load the data as follows:

In [36]:
import pandas as pd
import random
 
# read the data from the downloaded CSV file.
data = pd.read_csv('data/uk-500.csv')

# set a numeric id for use as an index for examples.
data['id'] = [random.randint(0,1000) for x in range(data.shape[0])]
 
data.head(5)

Unnamed: 0,first_name,last_name,company_name,address,city,county,postal,phone1,phone2,email,web,id
0,Aleshia,Tomkiewicz,Alan D Rosenburg Cpa Pc,14 Taylor St,St. Stephens Ward,Kent,CT2 7PP,01835-703597,01944-369967,atomkiewicz@hotmail.com,http://www.alandrosenburgcpapc.co.uk,506
1,Evan,Zigomalas,Cap Gemini America,5 Binney St,Abbey Ward,Buckinghamshire,HP11 2AX,01937-864715,01714-737668,evan.zigomalas@gmail.com,http://www.capgeminiamerica.co.uk,798
2,France,Andrade,"Elliott, John W Esq",8 Moor Place,East Southbourne and Tuckton W,Bournemouth,BH6 3BE,01347-368222,01935-821636,france.andrade@hotmail.com,http://www.elliottjohnwesq.co.uk,544
3,Ulysses,Mcwalters,"Mcmahan, Ben L",505 Exeter Rd,Hawerby cum Beesby,Lincolnshire,DN36 5RP,01912-771311,01302-601380,ulysses@hotmail.com,http://www.mcmahanbenl.co.uk,801
4,Tyisha,Veness,Champagne Room,5396 Forth Street,Greets Green and Lyng Ward,West Midlands,B70 9DT,01547-429341,01290-367248,tyisha.veness@hotmail.com,http://www.champagneroom.co.uk,323
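
If the CSV file is not to hand, a small stand-in frame is enough to try the selections in this post. The sketch below is an assumption of mine rather than part of the original data setup: `sample` is a hypothetical name, it keeps only a few columns for the five people shown above, and it draws ids with `random.sample` instead of `random.randint` so that no two rows share an id (with `randint`, repeats are possible, and a label lookup on a repeated id returns several rows).

```python
import random

import pandas as pd

# Hypothetical stand-in for uk-500.csv: the five people from the head() output,
# with only a few of the columns kept.
sample = pd.DataFrame({
    "first_name": ["Aleshia", "Evan", "France", "Ulysses", "Tyisha"],
    "last_name": ["Tomkiewicz", "Zigomalas", "Andrade", "Mcwalters", "Veness"],
    "county": ["Kent", "Buckinghamshire", "Bournemouth", "Lincolnshire", "West Midlands"],
})

# random.sample draws without replacement, so every id is distinct.
# The seed value is arbitrary and only makes reruns repeatable.
random.seed(42)
sample["id"] = random.sample(range(0, 1001), k=len(sample))

print(sample)
```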


## 1. Selecting pandas data using “iloc”
The iloc indexer for Pandas Dataframe is used for integer-location based indexing / selection by position.

The iloc indexer syntax is data.iloc[<row selection>, <column selection>], which is sure to be a source of confusion for R users. “iloc” in pandas is used to select rows and columns by number, in the order that they appear in the data frame. You can imagine that each row has a row number from 0 to one less than the total number of rows (data.shape[0] - 1), and iloc[] allows selections based on these numbers. The same applies to columns (ranging from 0 to data.shape[1] - 1).

There are two “arguments” to iloc – a row selector, and a column selector.  For example:

In [37]:
# Single selections using iloc and DataFrame

# Rows:
data.iloc[0] # first row of data frame (Aleshia Tomkiewicz) - Note a Series data type output.
data.iloc[1] # second row of data frame (Evan Zigomalas)
data.iloc[-1] # last row of data frame (Mi Richan)

# Columns:
data.iloc[:,0] # first column of data frame (first_name)
data.iloc[:,1] # second column of data frame (last_name)
data.iloc[:,-1].head() # last column of data frame (id)

0    506
1    798
2    544
3    801
4    323
Name: id, dtype: int64
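
The valid positions for iloc can be checked against `shape`. This is a minimal sketch on a hypothetical three-row frame (`df` is not the uk-500 data): it shows that the last valid row position is one less than the row count, that negative positions count from the end, and that going one past the end raises an `IndexError`.

```python
import pandas as pd

# Hypothetical three-row frame, used only to illustrate positions.
df = pd.DataFrame({
    "first_name": ["Aleshia", "Evan", "France"],
    "last_name": ["Tomkiewicz", "Zigomalas", "Andrade"],
})

n_rows, n_cols = df.shape         # 3 rows, 2 columns
first_row = df.iloc[0]            # position 0 is the first row
last_row = df.iloc[n_rows - 1]    # the last valid position is n_rows - 1
same_last_row = df.iloc[-1]       # negative positions count from the end

try:
    df.iloc[n_rows]               # one past the end
except IndexError as err:
    print("out of bounds:", err)
```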

Multiple columns and rows can be selected together using the .iloc indexer.

In [38]:
# Multiple row and column selections using iloc and DataFrame
data.iloc[0:5] # first five rows of dataframe
data.iloc[:, 0:2] # first two columns of data frame with all rows
data.iloc[[0,3,6,24], [0,5,6]] # 1st, 4th, 7th, 25th row + 1st 6th 7th columns.
data.iloc[0:5, 5:8] # first 5 rows and 6th, 7th, 8th columns of data frame (county -> phone1).

Unnamed: 0,county,postal,phone1
0,Kent,CT2 7PP,01835-703597
1,Buckinghamshire,HP11 2AX,01937-864715
2,Bournemouth,BH6 3BE,01347-368222
3,Lincolnshire,DN36 5RP,01912-771311
4,West Midlands,B70 9DT,01547-429341


There are two gotchas to remember when using iloc in this manner:

1. Note that .iloc returns a **Pandas Series** when a single row or a single column is selected with a plain integer, and a **Pandas DataFrame** when multiple rows or columns are selected. To counter this, pass a single-valued list (e.g. data.iloc[[0]]) if you require DataFrame output.


2. When selecting multiple columns or multiple rows in this manner, remember that in a selection such as [1:5] the rows/columns selected run from the first number to one minus the second number, e.g. [1:5] will go 1,2,3,4. In general, [x:y] goes from x to y-1.
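
Both gotchas are easy to confirm on a small frame. The sketch below uses a hypothetical five-row `df` (not the full data set) and simply inspects what each selection returns.

```python
import pandas as pd

# Hypothetical five-row frame for checking return types and slice bounds.
df = pd.DataFrame({
    "first_name": ["Aleshia", "Evan", "France", "Ulysses", "Tyisha"],
    "county": ["Kent", "Buckinghamshire", "Bournemouth", "Lincolnshire", "West Midlands"],
})

row_as_series = df.iloc[0]        # plain integer -> Series
row_as_frame = df.iloc[[0]]       # one-element list -> DataFrame with one row
col_as_series = df.iloc[:, 0]     # plain integer column -> Series
col_as_frame = df.iloc[:, [0]]    # one-element list -> DataFrame with one column

middle = df.iloc[1:4]             # positions 1, 2, 3; position 4 is excluded

print(type(row_as_series).__name__, type(row_as_frame).__name__)
print(middle)
```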

## 2. Selecting pandas data using “loc”
The Pandas loc indexer can be used with DataFrames for two different use cases:

- a.) Selecting rows by label/index
- b.) Selecting rows with a boolean / conditional lookup

The loc indexer is used with the same syntax as iloc: **data.loc[<row selection>, <column selection>]**.

### 2a. Label-based / Index-based indexing using .loc
Selections using the loc method are based on the index of the data frame (if any). Where the index is set on a DataFrame, using <code>df.set_index()</code>, the .loc method directly selects based on index values of any rows. For example, setting the index of our test data frame to the person's “last_name”:

In [39]:
data.set_index("last_name", inplace=True)
data.head()

last_name,first_name,company_name,address,city,county,postal,phone1,phone2,email,web,id
Tomkiewicz,Aleshia,Alan D Rosenburg Cpa Pc,14 Taylor St,St. Stephens Ward,Kent,CT2 7PP,01835-703597,01944-369967,atomkiewicz@hotmail.com,http://www.alandrosenburgcpapc.co.uk,506
Zigomalas,Evan,Cap Gemini America,5 Binney St,Abbey Ward,Buckinghamshire,HP11 2AX,01937-864715,01714-737668,evan.zigomalas@gmail.com,http://www.capgeminiamerica.co.uk,798
Andrade,France,"Elliott, John W Esq",8 Moor Place,East Southbourne and Tuckton W,Bournemouth,BH6 3BE,01347-368222,01935-821636,france.andrade@hotmail.com,http://www.elliottjohnwesq.co.uk,544
Mcwalters,Ulysses,"Mcmahan, Ben L",505 Exeter Rd,Hawerby cum Beesby,Lincolnshire,DN36 5RP,01912-771311,01302-601380,ulysses@hotmail.com,http://www.mcmahanbenl.co.uk,801
Veness,Tyisha,Champagne Room,5396 Forth Street,Greets Green and Lyng Ward,West Midlands,B70 9DT,01547-429341,01290-367248,tyisha.veness@hotmail.com,http://www.champagneroom.co.uk,323
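
The cell above uses `inplace=True`, which changes `data` itself. As a hedged aside, `set_index` without that argument returns a new frame and leaves the original alone, and `reset_index` moves the index back into a regular column. The frame and variable names below are hypothetical.

```python
import pandas as pd

# Hypothetical three-row frame for illustrating set_index / reset_index.
df = pd.DataFrame({
    "first_name": ["Aleshia", "Evan", "France"],
    "last_name": ["Tomkiewicz", "Zigomalas", "Andrade"],
})

by_name = df.set_index("last_name")   # returns a new frame; df is unchanged
restored = by_name.reset_index()      # moves the index back into a column

print(by_name.loc["Andrade", "first_name"])
print(list(restored.columns))
```

Keeping the indexed frame in its own variable makes it easier to switch between several indexes without losing a column along the way.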


Now with the index set, we can directly select rows for different “last_name” values using .loc[<label>]  – either singly, or in multiples. For example:

In [40]:
data.loc['Andrade']

first_name                                France
company_name                 Elliott, John W Esq
address                             8 Moor Place
city              East Southbourne and Tuckton W
county                               Bournemouth
postal                                   BH6 3BE
phone1                              01347-368222
phone2                              01935-821636
email                 france.andrade@hotmail.com
web             http://www.elliottjohnwesq.co.uk
id                                           544
Name: Andrade, dtype: object

In [41]:
data.loc[['Andrade','Veness']]

last_name,first_name,company_name,address,city,county,postal,phone1,phone2,email,web,id
Andrade,France,"Elliott, John W Esq",8 Moor Place,East Southbourne and Tuckton W,Bournemouth,BH6 3BE,01347-368222,01935-821636,france.andrade@hotmail.com,http://www.elliottjohnwesq.co.uk,544
Veness,Tyisha,Champagne Room,5396 Forth Street,Greets Green and Lyng Ward,West Midlands,B70 9DT,01547-429341,01290-367248,tyisha.veness@hotmail.com,http://www.champagneroom.co.uk,323


Select columns with .loc using the names of the columns. In most of my data work, typically I have named columns, and use these named selections.

In [42]:
data.loc[['Andrade','Veness'],['first_name','address','city']]

last_name,first_name,address,city
Andrade,France,8 Moor Place,East Southbourne and Tuckton W
Veness,Tyisha,5396 Forth Street,Greets Green and Lyng Ward


You can select ranges of index labels – the selection <code>data.loc['Bruch':'Julio']</code> will return all rows in the data frame between the index entries for “Bruch” and “Julio”, including both end points. The following examples should now make sense:

In [None]:
# Change the index to be based on the 'id' column
data.set_index('id', inplace=True)

In [47]:
data.head()

id,first_name,company_name,address,city,county,postal,phone1,phone2,email,web
506,Aleshia,Alan D Rosenburg Cpa Pc,14 Taylor St,St. Stephens Ward,Kent,CT2 7PP,01835-703597,01944-369967,atomkiewicz@hotmail.com,http://www.alandrosenburgcpapc.co.uk
798,Evan,Cap Gemini America,5 Binney St,Abbey Ward,Buckinghamshire,HP11 2AX,01937-864715,01714-737668,evan.zigomalas@gmail.com,http://www.capgeminiamerica.co.uk
544,France,"Elliott, John W Esq",8 Moor Place,East Southbourne and Tuckton W,Bournemouth,BH6 3BE,01347-368222,01935-821636,france.andrade@hotmail.com,http://www.elliottjohnwesq.co.uk
801,Ulysses,"Mcmahan, Ben L",505 Exeter Rd,Hawerby cum Beesby,Lincolnshire,DN36 5RP,01912-771311,01302-601380,ulysses@hotmail.com,http://www.mcmahanbenl.co.uk
323,Tyisha,Champagne Room,5396 Forth Street,Greets Green and Lyng Ward,West Midlands,B70 9DT,01547-429341,01290-367248,tyisha.veness@hotmail.com,http://www.champagneroom.co.uk


In [48]:
# select the row with 'id' = 506
data.loc[506]

first_name                                   Aleshia
company_name                 Alan D Rosenburg Cpa Pc
address                                 14 Taylor St
city                               St. Stephens Ward
county                                          Kent
postal                                       CT2 7PP
phone1                                  01835-703597
phone2                                  01944-369967
email                        atomkiewicz@hotmail.com
web             http://www.alandrosenburgcpapc.co.uk
Name: 506, dtype: object

Note that .loc selects by index label while .iloc selects by position: data.loc[487] (the row with index value 487, if there is one) is generally not the same row as data.iloc[487] (the row at position 487, i.e. the 488th row in the data). The index of the DataFrame can be out of numeric order, and/or a string or multi-value.
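
The difference between label and position is easiest to see on an index that is out of numeric order. This is a minimal sketch on a hypothetical frame that reuses the five ids shown above. It also contrasts a label slice, which includes its end label, with a position slice, which excludes its end position.

```python
import pandas as pd

# Hypothetical frame indexed by the five ids from the head() output.
df = pd.DataFrame(
    {"first_name": ["Aleshia", "Evan", "France", "Ulysses", "Tyisha"]},
    index=pd.Index([506, 798, 544, 801, 323], name="id"),
)

by_label = df.loc[544, "first_name"]      # the row whose index value is 544
by_position = df.iloc[2]["first_name"]    # the row at position 2 (the third row)

label_slice = df.loc[798:801]     # labels 798, 544, 801: the end label is included
position_slice = df.iloc[1:3]     # positions 1 and 2: the end position is excluded

try:
    df.loc[2]                     # there is no row labelled 2
except KeyError:
    print("no row is labelled 2, although position 2 exists")
```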



### 2b. Boolean / Logical indexing using .loc
Conditional selection with boolean arrays using **data.loc[<selection>]** is the method that I use most often with Pandas DataFrames. With boolean indexing or logical selection, you pass an array or Series of True/False values to the .loc indexer to select the rows where your Series has True values.

In most use cases, you will make selections based on the values of different columns in your data set.

For example, the statement data['first_name'] == 'Antonio' produces a Pandas Series with a True/False value for every row in the ‘data’ DataFrame, where there are “True” values for the rows where the first_name is “Antonio”. This type of boolean array can be passed directly to the .loc indexer like so:

In [49]:
# Select rows with first name Antonio,
# and all columns between 'city' and 'email'
data.loc[data['first_name'] == 'Antonio', 'city':'email']

id,city,county,postal,phone1,phone2,email
651,Little Parndon and Hare Street,Hertfordshire,CM20 2HT,01559-403415,01388-777812,antonio.villamarin@gmail.com
786,Gaer Community,Newport,NP20 3DE,01463-409090,01242-318420,antonio_glasford@glasford.co.uk
576,Ipplepen,Devon,TQ12 5LL,01324-171614,01442-946357,antonio.heilig@gmail.com
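
It can help to look at the boolean Series on its own before handing it to .loc. The sketch below uses a hypothetical four-row frame and a hypothetical variable name `mask`. The mask has one True/False entry per row, and its sum is the number of rows that will be selected.

```python
import pandas as pd

# Hypothetical four-row frame with two rows named Antonio.
df = pd.DataFrame({
    "first_name": ["Antonio", "Evan", "Antonio", "Tyisha"],
    "city": ["Ipplepen", "Abbey Ward", "Gaer Community", "Greets Green and Lyng Ward"],
    "email": [
        "antonio.heilig@gmail.com",
        "evan.zigomalas@gmail.com",
        "antonio_glasford@glasford.co.uk",
        "tyisha.veness@hotmail.com",
    ],
})

mask = df["first_name"] == "Antonio"
print(mask.tolist())    # one boolean per row
print(mask.sum())       # number of rows that match

# The same mask used as the row selector, with a label slice for the columns.
selected = df.loc[mask, "city":"email"]
print(selected)
```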


As before, a second argument can be passed to .loc to select particular columns out of the data frame. Again, columns are referred to by name for the loc indexer and can be a single string, a list of columns, or a slice “:” operation.

In [50]:
data.loc[data['first_name'] == 'Erasmo', ['city','email']]

id,city,email
496,Darton West Ward,erasmo.talentino@hotmail.com
918,Sunningdale,egath@hotmail.com
560,Mortehoe,erasmo_rhea@hotmail.com


Note that when selecting columns, if one column only is selected, the .loc operator returns a Series. For a single column DataFrame, use a one-element list to keep the DataFrame format, for example:

In [51]:
data.loc[data['first_name'] == 'Antonio', 'email']

id
651       antonio.villamarin@gmail.com
786    antonio_glasford@glasford.co.uk
576           antonio.heilig@gmail.com
Name: email, dtype: object

In [52]:
data.loc[data['first_name'] == 'Antonio', ['email']]

id,email
651,antonio.villamarin@gmail.com
786,antonio_glasford@glasford.co.uk
576,antonio.heilig@gmail.com


The following additional examples of .loc selections are worth working through until each one is clear:

In [53]:
# Select rows where the email column ends with 'hotmail.com', include all columns
data.loc[data['email'].str.endswith("hotmail.com")]   

id,first_name,company_name,address,city,county,postal,phone1,phone2,email,web
506,Aleshia,Alan D Rosenburg Cpa Pc,14 Taylor St,St. Stephens Ward,Kent,CT2 7PP,01835-703597,01944-369967,atomkiewicz@hotmail.com,http://www.alandrosenburgcpapc.co.uk
544,France,"Elliott, John W Esq",8 Moor Place,East Southbourne and Tuckton W,Bournemouth,BH6 3BE,01347-368222,01935-821636,france.andrade@hotmail.com,http://www.elliottjohnwesq.co.uk
801,Ulysses,"Mcmahan, Ben L",505 Exeter Rd,Hawerby cum Beesby,Lincolnshire,DN36 5RP,01912-771311,01302-601380,ulysses@hotmail.com,http://www.mcmahanbenl.co.uk
323,Tyisha,Champagne Room,5396 Forth Street,Greets Green and Lyng Ward,West Midlands,B70 9DT,01547-429341,01290-367248,tyisha.veness@hotmail.com,http://www.champagneroom.co.uk
844,Marg,Wrangle Hill Auto Auct & Slvg,7457 Cowl St #70,Bargate Ward,Southampton,SO14 3TY,01865-582516,01362-620532,marg@hotmail.com,http://www.wranglehillautoauctslvg.co.uk
42,Lura,Bizerba Usa Inc,929 Augustine St,Staple Hill Ward,South Gloucestershire,BS16 4LL,01907-538509,01340-713951,lura@hotmail.com,http://www.bizerbausainc.co.uk
481,Dewitt,Rittenhouse Motor Co,7 Richmond St,Parkham,Devon,EX39 5DJ,01253-528327,01241-964675,dewitt.julio@hotmail.com,http://www.rittenhousemotorco.co.uk
100,Tess,Smart Signs,61 Rossett St,Chichester,West Sussex,PO19 1RH,01473-229124,01848-116775,tess_sitra@hotmail.com,http://www.smartsigns.co.uk
225,German,Jackson & Heit Machine Co Inc,7 Shenstone St,Longhill Ward,"Yorkshire, East (North Humbers",HU8 9PZ,01400-269033,01366-210656,german@hotmail.com,http://www.jacksonheitmachinecoinc.co.uk
987,Luis,Wa Inst For Plcy Studies,2 Birchfield Rd,Whittington,Shropshire,SY11 4PH,01462-648669,01405-648623,luis@hotmail.com,http://www.wainstforplcystudies.co.uk


In [54]:
# Select rows with first_name equal to some values, all columns
data.loc[data['first_name'].isin(['France', 'Tyisha', 'Eric'])]   

id,first_name,company_name,address,city,county,postal,phone1,phone2,email,web
544,France,"Elliott, John W Esq",8 Moor Place,East Southbourne and Tuckton W,Bournemouth,BH6 3BE,01347-368222,01935-821636,france.andrade@hotmail.com,http://www.elliottjohnwesq.co.uk
323,Tyisha,Champagne Room,5396 Forth Street,Greets Green and Lyng Ward,West Midlands,B70 9DT,01547-429341,01290-367248,tyisha.veness@hotmail.com,http://www.champagneroom.co.uk
209,Eric,"Thompson, Michael C Esq",9472 Lind St,Desborough,Northamptonshire,NN14 2GH,01969-886290,01545-817375,erampy@rampy.co.uk,http://www.thompsonmichaelcesq.co.uk


In [55]:
# Select rows with first name Antonio AND gmail email addresses
data.loc[data['email'].str.endswith("gmail.com") & (data['first_name'] == 'Antonio')] 

id,first_name,company_name,address,city,county,postal,phone1,phone2,email,web
651,Antonio,Combs Sheetmetal,353 Standish St #8264,Little Parndon and Hare Street,Hertfordshire,CM20 2HT,01559-403415,01388-777812,antonio.villamarin@gmail.com,http://www.combssheetmetal.co.uk
576,Antonio,Radisson Suite Hotel,35 Elton St #3,Ipplepen,Devon,TQ12 5LL,01324-171614,01442-946357,antonio.heilig@gmail.com,http://www.radissonsuitehotel.co.uk
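
Conditions are combined element-wise with `&` (and), `|` (or) and `~` (not). Because these operators bind more tightly than comparisons such as `==`, each comparison needs its own parentheses, or its own variable as in this sketch. The frame and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical four-row frame for combining boolean conditions.
df = pd.DataFrame({
    "first_name": ["Antonio", "Antonio", "Evan", "Tyisha"],
    "email": [
        "antonio.heilig@gmail.com",
        "antonio_glasford@glasford.co.uk",
        "evan.zigomalas@gmail.com",
        "tyisha.veness@hotmail.com",
    ],
})

is_antonio = df["first_name"] == "Antonio"
is_gmail = df["email"].str.endswith("gmail.com")

both = df.loc[is_antonio & is_gmail]      # Antonio AND a gmail address
either = df.loc[is_antonio | is_gmail]    # Antonio OR a gmail address
not_gmail = df.loc[~is_gmail]             # everyone without a gmail address

# Written inline, the comparison must be wrapped in parentheses:
inline = df.loc[(df["first_name"] == "Antonio") & is_gmail]
```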


In [56]:
# A lambda function that yields True/False values can also be used.
# Select rows where the company name has 4 words in it.
data.loc[data['company_name'].apply(lambda x: len(x.split(' ')) == 4)] 

id,first_name,company_name,address,city,county,postal,phone1,phone2,email,web
544,France,"Elliott, John W Esq",8 Moor Place,East Southbourne and Tuckton W,Bournemouth,BH6 3BE,01347-368222,01935-821636,france.andrade@hotmail.com,http://www.elliottjohnwesq.co.uk
209,Eric,"Thompson, Michael C Esq",9472 Lind St,Desborough,Northamptonshire,NN14 2GH,01969-886290,01545-817375,erampy@rampy.co.uk,http://www.thompsonmichaelcesq.co.uk
711,Charlesetta,"Cain, John M Esq",5 Hygeia St,Loundsley Green Ward,Derbyshire,S40 4LY,01276-816806,01517-624517,charlesetta_erm@gmail.com,http://www.cainjohnmesq.co.uk
317,Michell,Weiss Spirt & Guyer,89 Noon St,Carbrooke,Norfolk,IP25 6JQ,01967-580851,01672-496478,mthrossell@throssell.co.uk,http://www.weissspirtguyer.co.uk
263,Edgar,"Crowan, Kenneth W Esq",99 Guthrie St,New Milton,Hampshire,BH25 5DF,01326-532337,01666-638176,edgar.kanne@yahoo.com,http://www.crowankennethwesq.co.uk
153,Mee,Galloway Electric Co Inc,9 Pengwern St,Marldon,Devon,TQ3 1SA,01578-287816,01939-815208,mee.lapinski@yahoo.com,http://www.gallowayelectriccoinc.co.uk
459,Peter,Niagara Custombuilt Mfg Co,4410 Tarlton St,Prestatyn Community,Denbighshire,LL19 9EG,01842-767201,01859-648598,peter_gutierres@yahoo.com,http://www.niagaracustombuiltmfgco.co.uk
948,Martha,"Curtin, Patricia M Esq",148 Rembrandt St,Warlingham,Surrey,CR6 9SW,01677-684257,01583-287367,mteplica@teplica.co.uk,http://www.curtinpatriciamesq.co.uk
119,Tamesha,"Wilhelm, James E Jr",2200 Nelson St #58,Newport,Isle of Wight,PO30 5AL,01217-342071,01280-786847,tveigel@veigel.co.uk,http://www.wilhelmjamesejr.co.uk
230,Leonard,Arctic Star Distributing Inc,41 Canning St,Steornabhagh a Deas Ward,Western Isles,HS1 2PZ,01230-623547,01604-718601,lkufner@kufner.co.uk,http://www.arcticstardistributinginc.co.uk
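
The same word count can also be written with the `.str` accessor instead of a lambda. This is an alternative sketch rather than the method used above. The Series `names` is hypothetical and holds a few company names taken from the sample data.

```python
import pandas as pd

# Hypothetical Series of company names from the sample data.
names = pd.Series(
    ["Elliott, John W Esq", "Champagne Room", "Galloway Electric Co Inc", "Cap Gemini America"],
    name="company_name",
)

# The lambda from the example above: split on spaces and count the pieces.
with_apply = names.apply(lambda x: len(x.split(" ")) == 4)

# The same test with string methods: split each name, then take the list length.
with_str = names.str.split(" ").str.len() == 4

print(with_apply.tolist())
print(with_str.tolist())
```

One practical difference: the `.str` accessor passes missing values through rather than raising, whereas the lambda would fail on a row that is not a string.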


In [57]:
# Selections can be achieved outside of the main .loc for clarity:
# Form a separate variable with your selections:
idx = data['company_name'].apply(lambda x: len(x.split(' ')) == 4)
# Select only the True values in 'idx' and only the 3 columns specified:
data.loc[idx, ['email', 'first_name', 'company_name']]

id,email,first_name,company_name
544,france.andrade@hotmail.com,France,"Elliott, John W Esq"
209,erampy@rampy.co.uk,Eric,"Thompson, Michael C Esq"
711,charlesetta_erm@gmail.com,Charlesetta,"Cain, John M Esq"
317,mthrossell@throssell.co.uk,Michell,Weiss Spirt & Guyer
263,edgar.kanne@yahoo.com,Edgar,"Crowan, Kenneth W Esq"
153,mee.lapinski@yahoo.com,Mee,Galloway Electric Co Inc
459,peter_gutierres@yahoo.com,Peter,Niagara Custombuilt Mfg Co
948,mteplica@teplica.co.uk,Martha,"Curtin, Patricia M Esq"
119,tveigel@veigel.co.uk,Tamesha,"Wilhelm, James E Jr"
230,lkufner@kufner.co.uk,Leonard,Arctic Star Distributing Inc
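
A list passed to .loc that names a column which does not exist (for instance 'company' where the column is called 'company_name') is treated as an error in recent pandas versions. The sketch below, on a hypothetical two-row frame, shows two ways of dealing with that which I would consider: checking the requested names first, or using `reindex`, which keeps an unknown column and fills it with missing values.

```python
import pandas as pd

# Hypothetical two-row frame with the three columns used in the example above.
df = pd.DataFrame({
    "first_name": ["France", "Eric"],
    "company_name": ["Elliott, John W Esq", "Thompson, Michael C Esq"],
    "email": ["france.andrade@hotmail.com", "erampy@rampy.co.uk"],
})

wanted = ["email", "first_name", "company"]   # 'company' is not a real column

# Option 1: find out which requested names are missing before selecting.
missing = [col for col in wanted if col not in df.columns]
present = df.loc[:, [col for col in wanted if col in df.columns]]
print("missing columns:", missing)

# Option 2: reindex keeps the request as written and fills unknown columns with NaN.
padded = df.reindex(columns=wanted)
print(padded)
```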


Logical selections and boolean Series can also be passed to the generic [] indexer of a pandas DataFrame and will give the same results: data.loc[data['first_name'] == 'Antonio'] returns the same rows as data[data['first_name'] == 'Antonio'].
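
That a boolean mask gives the same rows through .loc and through [] can be checked with `DataFrame.equals`. This is a minimal sketch on a hypothetical three-row frame.

```python
import pandas as pd

# Hypothetical three-row frame for comparing the two indexers.
df = pd.DataFrame({
    "first_name": ["Antonio", "Evan", "Antonio"],
    "city": ["Ipplepen", "Abbey Ward", "Gaer Community"],
})

mask = df["first_name"] == "Antonio"
via_loc = df.loc[mask]
via_brackets = df[mask]

print(via_loc.equals(via_brackets))
```

The .loc form has the advantage that a column selector can be added as a second argument in the same statement.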

## Setting values in DataFrames using .loc
With a slight change of syntax, you can actually update your DataFrame in the same statement as you select and filter using .loc indexer. This particular pattern allows you to update values in columns depending on different conditions.

**The setting operation does not make a copy of the data frame, but edits the original data.**

As an example:

In [66]:
data = data.reset_index(drop=False)
data.head()

Unnamed: 0,index,id,first_name,company_name,address,city,county,postal,phone1,phone2,email,web
495,495,136,Avery,Plaza Gourmet Delicatessen,9166 Devon St #905,Knightsbridge and Belgravia Wa,Greater London,SW1W 8JY,01748-625058,01369-185737,avery@veit.co.uk,http://www.plazagourmetdelicatessen.co.uk
496,496,122,Reid,"Fitzgerald, Edward J",70 Foster St,Inverness Ness-Side Ward,Highland,IV2 6WT,01916-963261,01370-319414,reuresti@euresti.co.uk,http://www.fitzgeraldedwardj.co.uk
497,497,625,Charlette,Furey & Associates,714 Fonthill Rd,Darton West Ward,South Yorkshire,S75 5EJ,01888-152110,01301-312487,cbrenning@brenning.co.uk,http://www.fureyassociates.co.uk
498,498,743,Celestina,Bfg Federal Credit Union,9 Milton St,Consett North ED,County Durham,DH8 5LP,01877-379681,01600-463475,celestina_keeny@gmail.com,http://www.bfgfederalcreditunion.co.uk
499,499,867,Mi,Nelson Wright Haworth Golf Crs,6 Norwood Grove,Tanworth-in-Arden,Warwickshire,B94 5RZ,01451-785624,01202-738406,mi@hotmail.com,http://www.nelsonwrighthaworthgolfcrs.co.uk


In [59]:
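# Change the first name of all rows with an ID greater than 2000 to "John"
# (the ids generated at load time lie between 0 and 1000, so no row matches here; lower the threshold to see a change)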
data.loc[data['id'] > 2000, "first_name"] = "John"
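
To see the assignment take effect, the pattern can be tried on a frame where some ids really are above the threshold. The sketch below uses a hypothetical four-row frame with made-up ids (two of them above 2000). It assumes nothing about the real data.

```python
import pandas as pd

# Hypothetical frame: the ids are invented so that two rows exceed 2000.
df = pd.DataFrame({
    "id": [506, 798, 2544, 2801],
    "first_name": ["Aleshia", "Evan", "France", "Ulysses"],
})

# Row selector: a boolean mask. Column selector: the column to overwrite.
df.loc[df["id"] > 2000, "first_name"] = "John"

print(df)
```

Only the rows matching the condition change, and `df` itself is modified, so a copy is worth taking first if the original values are still needed.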