# Introduction

Selecting specific values of a pandas DataFrame or Series to work on is an implicit step in almost any data operation you'll run, so one of the first things you need to learn in working with data in Python is how to go about selecting the data points relevant to you quickly and effectively.

In [2]:
import pandas as pd
df = pd.read_csv("../Datasets/PakistanDroneAttacksWithTemp Ver 9 (October 19, 2017).csv",  encoding='cp1252')

Unnamed: 0,S#,Date,Time,Location,City,Province,No of Strike,Al-Qaeda,Taliban,Civilians Min,...,Injured Min,Injured Max,Women/Children,Special Mention (Site),Comments,References,Longitude,Latitude,Temperature(C),Temperature(F)
0,1.0,"Friday, June 18, 2004",22:00,Near Wana,south Waziristan,FATA,1.0,,1.0,0.0,...,,,N,Blast occured in courtyard of the house of lon...,Village in Wana,http://archives.dawn.com/2004/06/19/top1.htm,69.900000,33.033300,28.475,83.255
1,2.0,"Sunday, May 08, 2005",23:30,Mir Ali (Near Afghan Border),North Waziristan,FATA,1.0,1.0,,0.0,...,,,N,Drone struck a car driven by local warlord- ki...,Civilian killied was Samiullah Khan who was a ...,http://www.msnbc.msn.com/id/7847008/,70.145500,32.974600,11.475,52.655
2,3.0,"Thursday, December 01, 2005",,Haisori- Miran Shah,North Waziristan,FATA,1.0,1.0,,0.0,...,,2.0,,Explosive occurred at a mud house,No. 3 Al-Qaeda's Leader AbuHamza Rabia killed ...,http://edition.cnn.com/2005/WORLD/asiapcf/12/0...,70.145500,32.974600,7.080,44.744
3,4.0,"Friday, January 06, 2006",,Saidgai village- 115km north of Wana,North Waziristan,FATA,1.0,,,,...,,2.0,,,,http://www.reuters.com/article/2007/04/27/us-p...,70.145500,32.974600,0.535,32.963
4,5.0,"Friday, January 13, 2006",3:00,Damadola Village,Bajaur Agency,FATA,1.0,,,0.0,...,,2.0,Y,Three houses were tarheted in Damadola village...,Masood Khan house was among those bombed. Want...,http://www.dailytimes.com.pk/default.asp?page=...,71.500000,34.683300,10.025,50.045
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
400,402.0,"Monday, June 12, 2017",21:00,Spin Thal,Hangu,KPK,1.0,,,0.0,...,0.0,0.0,N,Haqqani network leader Abubakar and his partne...,Thal city falls in Hangu district and lies clo...,https://www.dawn.com/news/1339293,33.358693,70.540720,,
401,403.0,"Monday, July 03, 2017",,,South Waziristan,FATA,2.0,,,0.0,...,0.0,0.0,N,a CIA-operated drone carried out a missile att...,,https://www.dawn.com/news/1343100,32.120819,69.589987,23.000,74.000
402,404.0,"Friday, September 15, 2017",,Ghuz Ghari,Kurram Agency,FATA,2.0,,,0.0,...,2.0,2.0,N,A US drone killed three suspected Afghan Talib...,,https://www.dawn.com/news/1357853; https://www...,33.732174,70.150755,,
403,405.0,"Monday, October 16, 2017",,Zero-point,Lower Kurram Agency,FATA,4.0,,5.0,,...,,,N,At least five suspected militants were killed ...,Conflict of Report: Foreign media reported tha...,http://www.thesundaily.my/news/2017/10/18/deat...,,,,


# Native accessors

Native Python objects provide good ways of indexing data. Pandas carries all of these over, which makes it easy to get started.

Consider this DataFrame:

In [16]:
df

Unnamed: 0,S#,Date,Time,Location,City,Province,No of Strike,Al-Qaeda,Taliban,Civilians Min,...,Injured Min,Injured Max,Women/Children,Special Mention (Site),Comments,References,Longitude,Latitude,Temperature(C),Temperature(F)
0,1.0,"Friday, June 18, 2004",22:00,Near Wana,south Waziristan,FATA,1.0,,1.0,0.0,...,,,N,Blast occured in courtyard of the house of lon...,Village in Wana,http://archives.dawn.com/2004/06/19/top1.htm,69.9000,33.0333,28.475,83.255
1,2.0,"Sunday, May 08, 2005",23:30,Mir Ali (Near Afghan Border),North Waziristan,FATA,1.0,1.0,,0.0,...,,,N,Drone struck a car driven by local warlord- ki...,Civilian killied was Samiullah Khan who was a ...,http://www.msnbc.msn.com/id/7847008/,70.1455,32.9746,11.475,52.655
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
403,405.0,"Monday, October 16, 2017",,Zero-point,Lower Kurram Agency,FATA,4.0,,5.0,,...,,,N,At least five suspected militants were killed ...,Conflict of Report: Foreign media reported tha...,http://www.thesundaily.my/news/2017/10/18/deat...,,,,
404,,,,,,,,49.0,662.0,1304.0,...,402.0,1329.0,,,,,,,,


In Python, we can access the property of an object by accessing it as an attribute. A `book` object, for example, might have a `title` property, which we can access by calling `book.title`. Columns in a pandas DataFrame work in much the same way. 

Hence to access the `City` property of `df` we can use:

In [3]:
df.City

0         south Waziristan
1         North Waziristan
2         North Waziristan
3         North Waziristan
4           Bajaur  Agency
              ...         
400                  Hangu
401       South Waziristan
402          Kurram Agency
403    Lower Kurram Agency
404                    NaN
Name: City, Length: 405, dtype: object

If we have a Python dictionary, we can access its values using the indexing (`[]`) operator. We can do the same with columns in a DataFrame:

In [4]:
df['City']

0         south Waziristan
1         North Waziristan
2         North Waziristan
3         North Waziristan
4           Bajaur  Agency
              ...         
400                  Hangu
401       South Waziristan
402          Kurram Agency
403    Lower Kurram Agency
404                    NaN
Name: City, Length: 405, dtype: object

These are the two ways of selecting a specific Series out of a DataFrame. Neither of them is more or less syntactically valid than the other, but the indexing operator `[]` does have the advantage that it can handle column names containing spaces or other characters that are not valid in a Python identifier (e.g. for this dataset's `No of Strike` column, `df.No of Strike` wouldn't work).
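
As a small illustration of the point above, here is a sketch on a hypothetical two-row DataFrame (`toy` and its values are made up for this example; only the column-name style is borrowed from the dataset). It shows that both accessors return the same Series, and that a name containing spaces has to go through `[]`.

```python
import pandas as pd

# Hypothetical toy data, not taken from the real CSV.
toy = pd.DataFrame({
    "City": ["Wana", "Mir Ali"],
    "No of Strike": [1.0, 2.0],
})

# Attribute access works when the column name is a valid Python identifier.
city_by_attr = toy.City
city_by_key = toy["City"]
same = city_by_attr.equals(city_by_key)

# "No of Strike" contains spaces, so it cannot be written as an attribute;
# the indexing operator handles it without trouble.
strike_list = toy["No of Strike"].tolist()

print(same)
print(strike_list)
```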

Doesn't a pandas Series look kind of like a fancy dictionary? It pretty much is, so it's no surprise that, to drill down to a single specific value, we need only use the indexing operator `[]` once more:

In [5]:
df['City'][0]

'south Waziristan'

# Indexing in pandas

The indexing operator and attribute selection are nice because they work just like they do in the rest of the Python ecosystem. As a novice, this makes them easy to pick up and use. However, pandas has its own accessor operators, `loc` and `iloc`. For more advanced operations, these are the ones you're supposed to be using.

### Index-based selection

Pandas indexing works in one of two paradigms. The first is **index-based selection**: selecting data based on its numerical position in the data. `iloc` follows this paradigm.

To select the first row of data in a DataFrame, we may use the following:

In [6]:
df.iloc[0]

S#                                                                      1.0
Date                                                  Friday, June 18, 2004
Time                                                                  22:00
Location                                                          Near Wana
City                                                       south Waziristan
Province                                                               FATA
No of Strike                                                            1.0
Al-Qaeda                                                                NaN
Taliban                                                                 1.0
Civilians Min                                                           0.0
Civilians Max                                                           4.0
Foreigners Min                                                          NaN
Foreigners Max                                                          NaN
Total Died M

Both `loc` and `iloc` are row-first, column-second. This is the opposite of the native-style indexing shown earlier (`df['City'][0]`), which is column-first, row-second.

This means that it's marginally easier to retrieve rows, and marginally harder to retrieve columns. To get a column with `iloc`, we can do the following:

In [7]:
df.iloc[:, 0]

0        1.0
1        2.0
2        3.0
3        4.0
4        5.0
       ...  
400    402.0
401    403.0
402    404.0
403    405.0
404      NaN
Name: S#, Length: 405, dtype: float64

In [8]:
df.iloc[:, 1]

0            Friday, June 18, 2004
1             Sunday, May 08, 2005
2      Thursday, December 01, 2005
3         Friday, January 06, 2006
4         Friday, January 13, 2006
                  ...             
400          Monday, June 12, 2017
401          Monday, July 03, 2017
402     Friday, September 15, 2017
403       Monday, October 16, 2017
404                            NaN
Name: Date, Length: 405, dtype: object

On its own, the `:` operator, which also comes from native Python, means "everything". When combined with other selectors, however, it can be used to indicate a range of values. For example, to select the `City` column from just the first, second, and third row, we would do:

In [9]:
df.iloc[:3, 4]

0    south Waziristan
1    North Waziristan
2    North Waziristan
Name: City, dtype: object

Or, to select just the second and third entries, we would do:

In [10]:
df.iloc[1:3, 4]

1    North Waziristan
2    North Waziristan
Name: City, dtype: object

It's also possible to pass a list:

In [11]:
df.iloc[[0, 1, 2], 4]

0    south Waziristan
1    North Waziristan
2    North Waziristan
Name: City, dtype: object

Finally, it's worth knowing that negative numbers can be used in selection. These count backwards from the _end_ of the values. So, for example, here are the last five rows of the dataset.

In [12]:
df.iloc[-5:]

Unnamed: 0,S#,Date,Time,Location,City,Province,No of Strike,Al-Qaeda,Taliban,Civilians Min,...,Injured Min,Injured Max,Women/Children,Special Mention (Site),Comments,References,Longitude,Latitude,Temperature(C),Temperature(F)
400,402.0,"Monday, June 12, 2017",21:00,Spin Thal,Hangu,KPK,1.0,,,0.0,...,0.0,0.0,N,Haqqani network leader Abubakar and his partne...,Thal city falls in Hangu district and lies clo...,https://www.dawn.com/news/1339293,33.358693,70.54072,,
401,403.0,"Monday, July 03, 2017",,,South Waziristan,FATA,2.0,,,0.0,...,0.0,0.0,N,a CIA-operated drone carried out a missile att...,,https://www.dawn.com/news/1343100,32.120819,69.589987,23.0,74.0
402,404.0,"Friday, September 15, 2017",,Ghuz Ghari,Kurram Agency,FATA,2.0,,,0.0,...,2.0,2.0,N,A US drone killed three suspected Afghan Talib...,,https://www.dawn.com/news/1357853; https://www...,33.732174,70.150755,,
403,405.0,"Monday, October 16, 2017",,Zero-point,Lower Kurram Agency,FATA,4.0,,5.0,,...,,,N,At least five suspected militants were killed ...,Conflict of Report: Foreign media reported tha...,http://www.thesundaily.my/news/2017/10/18/deat...,,,,
404,,,,,,,,49.0,662.0,1304.0,...,402.0,1329.0,,,,,,,,
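
The `iloc` patterns in this section can be tried on a small frame without loading the CSV. The sketch below uses a hypothetical five-row DataFrame (`toy`, with made-up values and deliberately non-zero-based index labels) to show that `iloc` only ever looks at position.

```python
import pandas as pd

# Hypothetical toy data; the index labels 10..50 are chosen on purpose so
# that positions (0..4) and labels differ.
toy = pd.DataFrame(
    {"City": ["A", "B", "C", "D", "E"], "Strikes": [1, 2, 3, 4, 5]},
    index=[10, 20, 30, 40, 50],
)

first_city = toy.iloc[0, 0]              # row position 0, column position 0
first_three = toy.iloc[:3, 0].tolist()   # slice: positions 0, 1, 2
picked = toy.iloc[[0, 2], 1].tolist()    # list of row positions
last_two = toy.iloc[-2:, 1].tolist()     # negative positions count from the end

print(first_city, first_three, picked, last_two)
```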


### Label-based selection

The second paradigm for attribute selection is the one followed by the `loc` operator: **label-based selection**. In this paradigm, it's the data index value, not its position, which matters.

For example, to get the `City` value of the first entry in `df`, we would now do the following:

In [13]:
df.loc[0, 'City']

'south Waziristan'

`iloc` is conceptually simpler than `loc` because it ignores the dataset's indices. When we use `iloc` we treat the dataset like a big matrix (a list of lists), one that we have to index into by position. `loc`, by contrast, uses the information in the indices to do its work. Since your dataset usually has meaningful indices, it's usually easier to do things using `loc` instead. For example, here's one operation that's much easier using `loc`:

In [14]:
df.loc[:, ['Date', 'Location', 'Province']]

Unnamed: 0,Date,Location,Province
0,"Friday, June 18, 2004",Near Wana,FATA
1,"Sunday, May 08, 2005",Mir Ali (Near Afghan Border),FATA
2,"Thursday, December 01, 2005",Haisori- Miran Shah,FATA
3,"Friday, January 06, 2006",Saidgai village- 115km north of Wana,FATA
4,"Friday, January 13, 2006",Damadola Village,FATA
...,...,...,...
400,"Monday, June 12, 2017",Spin Thal,KPK
401,"Monday, July 03, 2017",,FATA
402,"Friday, September 15, 2017",Ghuz Ghari,FATA
403,"Monday, October 16, 2017",Zero-point,FATA
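
To make the contrast with `iloc` concrete, here is a minimal sketch on a hypothetical DataFrame with string row labels (`toy` and its values are invented for illustration). With `loc` we name the rows and columns we want; the equivalent `iloc` call needs us to know their positions.

```python
import pandas as pd

# Hypothetical toy data with string labels in the index.
toy = pd.DataFrame(
    {
        "Date": ["2004-06-18", "2005-05-08", "2005-12-01"],
        "Location": ["Near Wana", "Mir Ali", "Miran Shah"],
        "Province": ["FATA", "FATA", "FATA"],
    },
    index=["a", "b", "c"],
)

# One value, addressed by row label and column label.
single = toy.loc["b", "Location"]

# All rows, a list of column labels.
subset = toy.loc[:, ["Date", "Province"]]
subset_cols = list(subset.columns)

# The same selection with iloc requires the column positions instead.
same = subset.equals(toy.iloc[:, [0, 2]])

print(single, subset_cols, same)
```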


### Choosing between `loc` and `iloc`

When choosing or transitioning between `loc` and `iloc`, there is one "gotcha" worth keeping in mind, which is that the two methods use slightly different indexing schemes.

`iloc` uses the Python stdlib indexing scheme, where the first element of the range is included and the last one excluded. So `0:10` will select entries `0,...,9`. `loc`, meanwhile, indexes inclusively. So `0:10` will select entries `0,...,10`.

Why the change? Remember that `loc` can index any stdlib type: strings, for example. If we have a DataFrame with index values `Apples, ..., Potatoes, ...`, and we want to select "all the alphabetical fruit choices between Apples and Potatoes", then it's a lot more convenient to index `df.loc['Apples':'Potatoes']` than it is to index something like `df.loc['Apples':'Potatoet']` (`t` coming after `s` in the alphabet).

This is particularly confusing when the DataFrame index is a simple numerical list, e.g. `0,...,1000`. In this case `df.iloc[0:1000]` will return 1000 entries, while `df.loc[0:1000]` returns 1001 of them! To get 1000 elements using `loc`, you will need to go one lower and ask for `df.loc[0:999]`.

Otherwise, the semantics of using `loc` are the same as those for `iloc`.
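
The slicing difference is easy to check on a small example. The sketch below uses two hypothetical DataFrames (`numbers` and `produce`, both invented for illustration): one with the default integer index, and one with string labels like the fruit example above.

```python
import pandas as pd

# Hypothetical frame with the default integer index 0..19.
numbers = pd.DataFrame({"value": range(20)})

n_iloc = len(numbers.iloc[0:10])   # positions 0..9, end excluded
n_loc = len(numbers.loc[0:10])     # labels 0..10, end included

# Hypothetical frame with a sorted string index.
produce = pd.DataFrame(
    {"price": [1.0, 2.0, 3.0, 4.0]},
    index=["Apples", "Bananas", "Potatoes", "Tomatoes"],
)

# The label slice includes both endpoints.
labels = produce.loc["Apples":"Potatoes"].index.tolist()

print(n_iloc, n_loc, labels)
```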

# Manipulating the index

Label-based selection derives its power from the labels in the index. Critically, the index we use is not immutable. We can manipulate the index in any way we see fit.

The `set_index()` method can be used to do the job. Here is what happens when we `set_index` to the `S#` field:

In [15]:
df.set_index("S#")

S#,Date,Time,Location,City,Province,No of Strike,Al-Qaeda,Taliban,Civilians Min,Civilians Max,...,Injured Min,Injured Max,Women/Children,Special Mention (Site),Comments,References,Longitude,Latitude,Temperature(C),Temperature(F)
1.0,"Friday, June 18, 2004",22:00,Near Wana,south Waziristan,FATA,1.0,,1.0,0.0,4.0,...,,,N,Blast occured in courtyard of the house of lon...,Village in Wana,http://archives.dawn.com/2004/06/19/top1.htm,69.900000,33.033300,28.475,83.255
2.0,"Sunday, May 08, 2005",23:30,Mir Ali (Near Afghan Border),North Waziristan,FATA,1.0,1.0,,0.0,1.0,...,,,N,Drone struck a car driven by local warlord- ki...,Civilian killied was Samiullah Khan who was a ...,http://www.msnbc.msn.com/id/7847008/,70.145500,32.974600,11.475,52.655
3.0,"Thursday, December 01, 2005",,Haisori- Miran Shah,North Waziristan,FATA,1.0,1.0,,0.0,1.0,...,,2.0,,Explosive occurred at a mud house,No. 3 Al-Qaeda's Leader AbuHamza Rabia killed ...,http://edition.cnn.com/2005/WORLD/asiapcf/12/0...,70.145500,32.974600,7.080,44.744
4.0,"Friday, January 06, 2006",,Saidgai village- 115km north of Wana,North Waziristan,FATA,1.0,,,,,...,,2.0,,,,http://www.reuters.com/article/2007/04/27/us-p...,70.145500,32.974600,0.535,32.963
5.0,"Friday, January 13, 2006",3:00,Damadola Village,Bajaur Agency,FATA,1.0,,,0.0,18.0,...,,2.0,Y,Three houses were tarheted in Damadola village...,Masood Khan house was among those bombed. Want...,http://www.dailytimes.com.pk/default.asp?page=...,71.500000,34.683300,10.025,50.045
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
402.0,"Monday, June 12, 2017",21:00,Spin Thal,Hangu,KPK,1.0,,,0.0,0.0,...,0.0,0.0,N,Haqqani network leader Abubakar and his partne...,Thal city falls in Hangu district and lies clo...,https://www.dawn.com/news/1339293,33.358693,70.540720,,
403.0,"Monday, July 03, 2017",,,South Waziristan,FATA,2.0,,,0.0,0.0,...,0.0,0.0,N,a CIA-operated drone carried out a missile att...,,https://www.dawn.com/news/1343100,32.120819,69.589987,23.000,74.000
404.0,"Friday, September 15, 2017",,Ghuz Ghari,Kurram Agency,FATA,2.0,,,0.0,0.0,...,2.0,2.0,N,A US drone killed three suspected Afghan Talib...,,https://www.dawn.com/news/1357853; https://www...,33.732174,70.150755,,
405.0,"Monday, October 16, 2017",,Zero-point,Lower Kurram Agency,FATA,4.0,,5.0,,,...,,,N,At least five suspected militants were killed ...,Conflict of Report: Foreign media reported tha...,http://www.thesundaily.my/news/2017/10/18/deat...,,,,


This is useful if you can come up with an index for the dataset which is better than the current one.
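
One detail worth seeing in a small example: by default `set_index()` returns a new DataFrame and leaves the original untouched. The sketch below uses a hypothetical three-row frame (`toy` is invented for illustration) and then looks a row up by its new label with `loc`.

```python
import pandas as pd

# Hypothetical toy data with a serial-number column like the dataset's "S#".
toy = pd.DataFrame({"S#": [1, 2, 3], "City": ["A", "B", "C"]})

# set_index returns a new DataFrame; `toy` itself is not modified.
indexed = toy.set_index("S#")

original_columns = list(toy.columns)     # "S#" is still a regular column here
indexed_columns = list(indexed.columns)  # "S#" has moved into the index
index_name = indexed.index.name

# Label-based lookup now uses the serial number.
city_of_2 = indexed.loc[2, "City"]

print(original_columns, indexed_columns, index_name, city_of_2)
```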

# Conditional selection

So far we've been indexing various strides of data, using structural properties of the DataFrame itself. To do *interesting* things with the data, however, we often need to ask questions based on conditions. 

For example, suppose that we're interested specifically in drone attacks in North Waziristan whose `Al-Qaeda` value is 1.

We can start by checking if each Attack is in North Waziristan or not:

In [16]:
df.City == 'North Waziristan'

0      False
1       True
2       True
3       True
4      False
       ...  
400    False
401    False
402    False
403    False
404    False
Name: City, Length: 405, dtype: bool

This operation produced a Series of `True`/`False` booleans based on the `City` of each record.  This result can then be used inside of `loc` to select the relevant data:

In [17]:
df.loc[df.City == 'North Waziristan']

Unnamed: 0,S#,Date,Time,Location,City,Province,No of Strike,Al-Qaeda,Taliban,Civilians Min,...,Injured Min,Injured Max,Women/Children,Special Mention (Site),Comments,References,Longitude,Latitude,Temperature(C),Temperature(F)
1,2.0,"Sunday, May 08, 2005",23:30,Mir Ali (Near Afghan Border),North Waziristan,FATA,1.0,1.0,,0.0,...,,,N,Drone struck a car driven by local warlord- ki...,Civilian killied was Samiullah Khan who was a ...,http://www.msnbc.msn.com/id/7847008/,70.145500,32.974600,11.475,52.655
2,3.0,"Thursday, December 01, 2005",,Haisori- Miran Shah,North Waziristan,FATA,1.0,1.0,,0.0,...,,2.0,,Explosive occurred at a mud house,No. 3 Al-Qaeda's Leader AbuHamza Rabia killed ...,http://edition.cnn.com/2005/WORLD/asiapcf/12/0...,70.145500,32.974600,7.080,44.744
3,4.0,"Friday, January 06, 2006",,Saidgai village- 115km north of Wana,North Waziristan,FATA,1.0,,,,...,,2.0,,,,http://www.reuters.com/article/2007/04/27/us-p...,70.145500,32.974600,0.535,32.963
6,7.0,"Friday, April 27, 2007",,Saidgai village,North Waziristan,FATA,1.0,,,0.0,...,,2.0,,attack on house of Habibullah which was next t...,millitanta were killed while making bombs,http://www.reuters.com/article/idUSISL11111020...,70.145500,32.974600,25.770,78.386
7,8.0,"Tuesday, June 19, 2007",10:30,MamiRogha- Dattakhel,North Waziristan,FATA,3.0,,,20.0,...,,15.0,Y,Explosion occured at cluster of 3 houses and t...,millitanta were killed while making bombs,http://www.express.com.pk/epaper/PoPupwindow.a...,70.145500,32.974600,24.395,75.911
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
394,395.0,"Saturday, January 09, 2016",,Mangroti area,North Waziristan,FATA,1.0,,,,...,,,N,Atleast five suspected militants were killed ...,,http://www.dailykashmirimages.com/Details/1003...,32.320237,69.859741,-7.765,18.023
395,396.0,"Friday, January 22, 2016",,Kurram agency,North Waziristan,FATA,4.0,,,,...,1.0,1.0,N,At least three people were killed and one pers...,,http://dunyanews.tv/en/Pakistan/324105-Three-d...,33.695975,70.336069,-14.155,6.521
397,398.0,"Thursday, March 02, 2017",16:00,"Sara Khwa, Kurram agency",North Waziristan,FATA,1.0,0.0,2.0,0.0,...,0.0,0.0,N,killed two suspected militants in a border vil...,,https://www.thenews.com.pk/print/189928-First-...,32.293811,70.105000,23.000,73.000
398,400.0,"Wednesday, April 26, 2017",12:30,Lawara Mandi area,North Waziristan,FATA,2.0,,7.0,3.0,...,,,,"There are two militant commanders, Abdul Rehma...",,https://www.geo.tv/latest/139727-Suspected-US-...,33.145357,70.029226,29.000,85.000


This DataFrame has 289 rows. The original had 405. That means that around 71% of the rows are Attacks in North Waziristan.

We also want to know which of these Attacks have an `Al-Qaeda` value of 1. We can use the ampersand (`&`) to bring the two questions together:

In [18]:
df.loc[(df.City == 'North Waziristan') & (df['Al-Qaeda'] == 1.0)]

Unnamed: 0,S#,Date,Time,Location,City,Province,No of Strike,Al-Qaeda,Taliban,Civilians Min,...,Injured Min,Injured Max,Women/Children,Special Mention (Site),Comments,References,Longitude,Latitude,Temperature(C),Temperature(F)
1,2.0,"Sunday, May 08, 2005",23:30,Mir Ali (Near Afghan Border),North Waziristan,FATA,1.0,1.0,,0.0,...,,,N,Drone struck a car driven by local warlord- ki...,Civilian killied was Samiullah Khan who was a ...,http://www.msnbc.msn.com/id/7847008/,70.1455,32.9746,11.475,52.655
2,3.0,"Thursday, December 01, 2005",,Haisori- Miran Shah,North Waziristan,FATA,1.0,1.0,,0.0,...,,2.0,,Explosive occurred at a mud house,No. 3 Al-Qaeda's Leader AbuHamza Rabia killed ...,http://edition.cnn.com/2005/WORLD/asiapcf/12/0...,70.1455,32.9746,7.08,44.744
10,11.0,"Tuesday, January 29, 2008",4:00,Khushali village,North Waziristan,FATA,1.0,1.0,,5.0,...,,2.0,Y,Missile hit a house of local resident,Dead were pro taliban militants and some of th...,http://news.bbc.co.uk/2/hi/south_asia/7220823.stm,70.1455,32.9746,2.18,35.924
300,301.0,"Monday, June 04, 2012",3:30,Heshokhel Village- 3km East of Mirali,North Waziristan,FATA,1.0,1.0,0.0,0.0,...,0.0,5.0,N,Al-Qaeda's Second-in-command Abu Yahya al-Libi...,,http://www.bbc.co.uk/news/world-asia-18327634 ...,70.1455,32.9746,24.5,76.1
326,327.0,"Sunday, December 09, 2012",,Tabbi Village- Miranshah,North Waziristan,FATA,4.0,1.0,0.0,0.0,...,0.0,6.0,N,4 Missiles were fired on a house,Senior Al-Qaeda Commander Muhammad Ahmed Al-Ma...,http://www.express.com.pk/epaper/PoPupwindow.a...,70.1455,32.9746,6.555,43.799

Suppose instead that we want any Attack that was in North Waziristan _or_ has an `Al-Qaeda` value of 1. For this we use a pipe (`|`):


In [19]:
df.loc[(df.City == 'North Waziristan') | (df['Al-Qaeda'] == 1.0)]

Unnamed: 0,S#,Date,Time,Location,City,Province,No of Strike,Al-Qaeda,Taliban,Civilians Min,...,Injured Min,Injured Max,Women/Children,Special Mention (Site),Comments,References,Longitude,Latitude,Temperature(C),Temperature(F)
1,2.0,"Sunday, May 08, 2005",23:30,Mir Ali (Near Afghan Border),North Waziristan,FATA,1.0,1.0,,0.0,...,,,N,Drone struck a car driven by local warlord- ki...,Civilian killied was Samiullah Khan who was a ...,http://www.msnbc.msn.com/id/7847008/,70.145500,32.974600,11.475,52.655
2,3.0,"Thursday, December 01, 2005",,Haisori- Miran Shah,North Waziristan,FATA,1.0,1.0,,0.0,...,,2.0,,Explosive occurred at a mud house,No. 3 Al-Qaeda's Leader AbuHamza Rabia killed ...,http://edition.cnn.com/2005/WORLD/asiapcf/12/0...,70.145500,32.974600,7.080,44.744
3,4.0,"Friday, January 06, 2006",,Saidgai village- 115km north of Wana,North Waziristan,FATA,1.0,,,,...,,2.0,,,,http://www.reuters.com/article/2007/04/27/us-p...,70.145500,32.974600,0.535,32.963
6,7.0,"Friday, April 27, 2007",,Saidgai village,North Waziristan,FATA,1.0,,,0.0,...,,2.0,,attack on house of Habibullah which was next t...,millitanta were killed while making bombs,http://www.reuters.com/article/idUSISL11111020...,70.145500,32.974600,25.770,78.386
7,8.0,"Tuesday, June 19, 2007",10:30,MamiRogha- Dattakhel,North Waziristan,FATA,3.0,,,20.0,...,,15.0,Y,Explosion occured at cluster of 3 houses and t...,millitanta were killed while making bombs,http://www.express.com.pk/epaper/PoPupwindow.a...,70.145500,32.974600,24.395,75.911
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
394,395.0,"Saturday, January 09, 2016",,Mangroti area,North Waziristan,FATA,1.0,,,,...,,,N,Atleast five suspected militants were killed ...,,http://www.dailykashmirimages.com/Details/1003...,32.320237,69.859741,-7.765,18.023
395,396.0,"Friday, January 22, 2016",,Kurram agency,North Waziristan,FATA,4.0,,,,...,1.0,1.0,N,At least three people were killed and one pers...,,http://dunyanews.tv/en/Pakistan/324105-Three-d...,33.695975,70.336069,-14.155,6.521
397,398.0,"Thursday, March 02, 2017",16:00,"Sara Khwa, Kurram agency",North Waziristan,FATA,1.0,0.0,2.0,0.0,...,0.0,0.0,N,killed two suspected militants in a border vil...,,https://www.thenews.com.pk/print/189928-First-...,32.293811,70.105000,23.000,73.000
398,400.0,"Wednesday, April 26, 2017",12:30,Lawara Mandi area,North Waziristan,FATA,2.0,,7.0,3.0,...,,,,"There are two militant commanders, Abdul Rehma...",,https://www.geo.tv/latest/139727-Suspected-US-...,33.145357,70.029226,29.000,85.000
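
The way boolean masks combine can be checked on a few rows. This is a minimal sketch on a hypothetical DataFrame (`toy` and its values are invented; only the column names mirror the dataset). Note the parentheses in the dataset examples: `&` and `|` bind more tightly than `==`, so each comparison has to be wrapped or stored in a variable first.

```python
import pandas as pd

# Hypothetical toy data; None becomes NaN in the numeric column.
toy = pd.DataFrame({
    "City": ["North Waziristan", "South Waziristan", "North Waziristan", "Hangu"],
    "Al-Qaeda": [1.0, 1.0, None, None],
})

# Each comparison yields a boolean Series aligned with the rows.
in_north = toy["City"] == "North Waziristan"
one_aq = toy["Al-Qaeda"] == 1.0   # NaN compares as False

# & keeps rows where both masks are True; | keeps rows where either is True.
both_idx = toy.loc[in_north & one_aq].index.tolist()
either_idx = toy.loc[in_north | one_aq].index.tolist()

print(both_idx, either_idx)
```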


Pandas comes with a few built-in conditional selectors, two of which we will highlight here. 

The first is `isin`. `isin` lets you select data whose value "is in" a list of values. For example, here's how we can use it to select Attacks only from **North Waziristan** or **South Waziristan**:

In [20]:
df.loc[df.City.isin(['North Waziristan', 'South Waziristan'])]

Unnamed: 0,S#,Date,Time,Location,City,Province,No of Strike,Al-Qaeda,Taliban,Civilians Min,...,Injured Min,Injured Max,Women/Children,Special Mention (Site),Comments,References,Longitude,Latitude,Temperature(C),Temperature(F)
1,2.0,"Sunday, May 08, 2005",23:30,Mir Ali (Near Afghan Border),North Waziristan,FATA,1.0,1.0,,0.0,...,,,N,Drone struck a car driven by local warlord- ki...,Civilian killied was Samiullah Khan who was a ...,http://www.msnbc.msn.com/id/7847008/,70.145500,32.974600,11.475,52.655
2,3.0,"Thursday, December 01, 2005",,Haisori- Miran Shah,North Waziristan,FATA,1.0,1.0,,0.0,...,,2.0,,Explosive occurred at a mud house,No. 3 Al-Qaeda's Leader AbuHamza Rabia killed ...,http://edition.cnn.com/2005/WORLD/asiapcf/12/0...,70.145500,32.974600,7.080,44.744
3,4.0,"Friday, January 06, 2006",,Saidgai village- 115km north of Wana,North Waziristan,FATA,1.0,,,,...,,2.0,,,,http://www.reuters.com/article/2007/04/27/us-p...,70.145500,32.974600,0.535,32.963
6,7.0,"Friday, April 27, 2007",,Saidgai village,North Waziristan,FATA,1.0,,,0.0,...,,2.0,,attack on house of Habibullah which was next t...,millitanta were killed while making bombs,http://www.reuters.com/article/idUSISL11111020...,70.145500,32.974600,25.770,78.386
7,8.0,"Tuesday, June 19, 2007",10:30,MamiRogha- Dattakhel,North Waziristan,FATA,3.0,,,20.0,...,,15.0,Y,Explosion occured at cluster of 3 houses and t...,millitanta were killed while making bombs,http://www.express.com.pk/epaper/PoPupwindow.a...,70.145500,32.974600,24.395,75.911
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
395,396.0,"Friday, January 22, 2016",,Kurram agency,North Waziristan,FATA,4.0,,,,...,1.0,1.0,N,At least three people were killed and one pers...,,http://dunyanews.tv/en/Pakistan/324105-Three-d...,33.695975,70.336069,-14.155,6.521
397,398.0,"Thursday, March 02, 2017",16:00,"Sara Khwa, Kurram agency",North Waziristan,FATA,1.0,0.0,2.0,0.0,...,0.0,0.0,N,killed two suspected militants in a border vil...,,https://www.thenews.com.pk/print/189928-First-...,32.293811,70.105000,23.000,73.000
398,400.0,"Wednesday, April 26, 2017",12:30,Lawara Mandi area,North Waziristan,FATA,2.0,,7.0,3.0,...,,,,"There are two militant commanders, Abdul Rehma...",,https://www.geo.tv/latest/139727-Suspected-US-...,33.145357,70.029226,29.000,85.000
399,401.0,"Wednesday, May 24, 2017",12:00,"Garvik area, Shawal Tehsil",North Waziristan,FATA,1.0,,,,...,0.0,0.0,N,"The US drone attack killed three militants, in...",killing two TTP militants and a key commander ...,https://www.thenews.com.pk/latest/206381-Suspe...,32.706186,69.415781,26.000,80.000


The second is `isnull` (and its companion `notnull`). These methods let you highlight values which are (or are not) empty (`NaN`). For example, to filter out Attacks lacking location in the dataset, here's what we would do:

In [21]:
df.loc[df.Location.notnull()]

Unnamed: 0,S#,Date,Time,Location,City,Province,No of Strike,Al-Qaeda,Taliban,Civilians Min,...,Injured Min,Injured Max,Women/Children,Special Mention (Site),Comments,References,Longitude,Latitude,Temperature(C),Temperature(F)
0,1.0,"Friday, June 18, 2004",22:00,Near Wana,south Waziristan,FATA,1.0,,1.0,0.0,...,,,N,Blast occured in courtyard of the house of lon...,Village in Wana,http://archives.dawn.com/2004/06/19/top1.htm,69.900000,33.033300,28.475,83.255
1,2.0,"Sunday, May 08, 2005",23:30,Mir Ali (Near Afghan Border),North Waziristan,FATA,1.0,1.0,,0.0,...,,,N,Drone struck a car driven by local warlord- ki...,Civilian killied was Samiullah Khan who was a ...,http://www.msnbc.msn.com/id/7847008/,70.145500,32.974600,11.475,52.655
2,3.0,"Thursday, December 01, 2005",,Haisori- Miran Shah,North Waziristan,FATA,1.0,1.0,,0.0,...,,2.0,,Explosive occurred at a mud house,No. 3 Al-Qaeda's Leader AbuHamza Rabia killed ...,http://edition.cnn.com/2005/WORLD/asiapcf/12/0...,70.145500,32.974600,7.080,44.744
3,4.0,"Friday, January 06, 2006",,Saidgai village- 115km north of Wana,North Waziristan,FATA,1.0,,,,...,,2.0,,,,http://www.reuters.com/article/2007/04/27/us-p...,70.145500,32.974600,0.535,32.963
4,5.0,"Friday, January 13, 2006",3:00,Damadola Village,Bajaur Agency,FATA,1.0,,,0.0,...,,2.0,Y,Three houses were tarheted in Damadola village...,Masood Khan house was among those bombed. Want...,http://www.dailytimes.com.pk/default.asp?page=...,71.500000,34.683300,10.025,50.045
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
398,400.0,"Wednesday, April 26, 2017",12:30,Lawara Mandi area,North Waziristan,FATA,2.0,,7.0,3.0,...,,,,"There are two militant commanders, Abdul Rehma...",,https://www.geo.tv/latest/139727-Suspected-US-...,33.145357,70.029226,29.000,85.000
399,401.0,"Wednesday, May 24, 2017",12:00,"Garvik area, Shawal Tehsil",North Waziristan,FATA,1.0,,,,...,0.0,0.0,N,"The US drone attack killed three militants, in...",killing two TTP militants and a key commander ...,https://www.thenews.com.pk/latest/206381-Suspe...,32.706186,69.415781,26.000,80.000
400,402.0,"Monday, June 12, 2017",21:00,Spin Thal,Hangu,KPK,1.0,,,0.0,...,0.0,0.0,N,Haqqani network leader Abubakar and his partne...,Thal city falls in Hangu district and lies clo...,https://www.dawn.com/news/1339293,33.358693,70.540720,,
402,404.0,"Friday, September 15, 2017",,Ghuz Ghari,Kurram Agency,FATA,2.0,,,0.0,...,2.0,2.0,N,A US drone killed three suspected Afghan Talib...,,https://www.dawn.com/news/1357853; https://www...,33.732174,70.150755,,
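
Both selectors return boolean Series, so they plug into `loc` exactly like the comparisons earlier. Here is a small sketch on a hypothetical four-row DataFrame (`toy` and its values are invented for illustration).

```python
import pandas as pd

# Hypothetical toy data; None marks a missing Location.
toy = pd.DataFrame({
    "City": ["North Waziristan", "South Waziristan", "Hangu", "North Waziristan"],
    "Location": ["Mir Ali", None, "Spin Thal", None],
})

# isin: keep rows whose City is one of the listed values.
waziristan_idx = toy.loc[
    toy["City"].isin(["North Waziristan", "South Waziristan"])
].index.tolist()

# notnull / isnull: keep rows that do (or do not) have a Location.
with_location_idx = toy.loc[toy["Location"].notnull()].index.tolist()
without_location_idx = toy.loc[toy["Location"].isnull()].index.tolist()

print(waziristan_idx, with_location_idx, without_location_idx)
```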


# Assigning data

Going the other way, assigning data to a DataFrame is easy. You can assign either a constant value:

In [22]:
df['critic'] = 'everyone'
df['critic']

0      everyone
1      everyone
2      everyone
3      everyone
4      everyone
         ...   
400    everyone
401    everyone
402    everyone
403    everyone
404    everyone
Name: critic, Length: 405, dtype: object

Or with an iterable of values:

In [23]:
df['index_backwards'] = range(len(df), 0, -1)
df['index_backwards']

0      405
1      404
2      403
3      402
4      401
      ... 
400      5
401      4
402      3
403      2
404      1
Name: index_backwards, Length: 405, dtype: int64
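
The same two assignments can be reproduced on a small frame. The sketch below uses a hypothetical three-row DataFrame (`toy` is invented for illustration) and also shows what happens when the iterable's length does not match the number of rows: pandas raises a `ValueError` rather than padding or truncating.

```python
import pandas as pd

# Hypothetical toy data.
toy = pd.DataFrame({"City": ["A", "B", "C"]})

# A constant is broadcast to every row.
toy["critic"] = "everyone"

# An iterable must supply exactly one value per row.
toy["index_backwards"] = range(len(toy), 0, -1)

critic_values = toy["critic"].tolist()
backwards = toy["index_backwards"].tolist()

# A list of the wrong length is rejected.
mismatch_rejected = False
try:
    toy["too_short"] = [1, 2]
except ValueError:
    mismatch_rejected = True

print(critic_values, backwards, mismatch_rejected)
```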