# Boolean Indexing # 


Boolean indexing selects subsets of data based on the actual values in the DataFrame rather than on row/column labels or integer locations. A boolean vector (a sequence of True/False values) acts as the filter: rows marked True are kept and rows marked False are dropped.

We can filter data with boolean indexing in four ways:
 

* Accessing a DataFrame with a boolean index
* Applying a boolean mask to a dataframe
* Masking data based on column value
* Masking data based on index value

# Accessing a DataFrame with a boolean index #
To access a DataFrame with a boolean index, we first create a DataFrame whose index labels are the boolean values “True” and “False”. For example:



In [1]:
# importing pandas as pd
import pandas as pd

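# dictionary of lists (named student_data to avoid shadowing the built-in dict)
student_data = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
                'degree': ["MBA", "BCA", "M.Tech", "MBA"],
                'score': [90, 40, 80, 98]}

# creating a dataframe with boolean index labels
df = pd.DataFrame(student_data, index=[True, False, True, False])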

print(df)


         name  degree  score
True   aparna     MBA     90
False  pankaj     BCA     40
True   sudhir  M.Tech     80
False   Geeku     MBA     98


Now that we have a DataFrame with a boolean index, we can access its rows through that index. pandas originally offered three indexers for this: .loc[], .iloc[], and .ix[]. Note that .ix[] was deprecated in pandas 0.20 and removed in pandas 1.0, so only .loc[] and .iloc[] work on current versions.
 

# Accessing a Dataframe with a boolean index using .loc[] #
To access a DataFrame with a boolean index using .loc[], we pass one of the boolean labels (True or False) to .loc[]; it returns every row whose index label matches.

In [2]:


# dictionary of lists (named student_data to avoid shadowing the built-in dict)
student_data = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
                'degree': ["MBA", "BCA", "M.Tech", "MBA"],
                'score': [90, 40, 80, 98]}

# creating a dataframe with boolean index
df = pd.DataFrame(student_data, index=[True, False, True, False])

# selecting every row whose index label is True using .loc[]
print(df.loc[True])


        name  degree  score
True  aparna     MBA     90
True  sudhir  M.Tech     80



# Accessing a Dataframe with a boolean index using .iloc[] #
Unlike .loc[], .iloc[] selects rows by integer position, not by label. Passing a bare boolean value (True or False) to .iloc[] raises an error, so we can access a row only by passing its integer position.
Code #1: 
 

In [4]:

# dictionary of lists (named student_data to avoid shadowing the built-in dict)
student_data = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
                'degree': ["MBA", "BCA", "M.Tech", "MBA"],
                'score': [90, 40, 80, 98]}

# creating a dataframe with boolean index
df = pd.DataFrame(student_data, index=[True, False, True, False])

# accessing a dataframe using .iloc[]
print(df.iloc[1])  # integer position 1 is the second row, whose index label is False


name      pankaj
degree       BCA
score         40
dtype: object
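
As a minimal sketch of the behaviour described for .iloc[] (reusing the same sample data; `student_data` and the other variable names are only illustrative), the snippet below contrasts a bare boolean, which .iloc[] rejects, with an integer position and a boolean list with one entry per row, which it accepts.

```python
import pandas as pd

student_data = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
                'degree': ["MBA", "BCA", "M.Tech", "MBA"],
                'score': [90, 40, 80, 98]}
df = pd.DataFrame(student_data, index=[True, False, True, False])

# a bare boolean is not a valid position, so .iloc[] rejects it
try:
    df.iloc[True]
    scalar_bool_rejected = False
except TypeError:
    scalar_bool_rejected = True

# an integer position works regardless of the boolean labels
second_row = df.iloc[1]

# a boolean list with one entry per row is treated as a mask
masked = df.iloc[[True, False, True, False]]
print(masked)
```

The boolean list here is a positional mask, so it keeps the rows at positions 0 and 2 no matter what the index labels are.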





# Accessing a Dataframe with a boolean index using .ix[] # 
.ix[] was a hybrid of .loc[] and .iloc[] that accepted both boolean labels (True or False) and integer positions. It was deprecated in pandas 0.20 and removed in pandas 1.0, so on current versions use .loc[] for labels and .iloc[] for positions.
Code #1: 
 



In [12]:

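# dictionary of lists (named student_data to avoid shadowing the built-in dict)
student_data = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
                'degree': ["MBA", "BCA", "M.Tech", "MBA"],
                'score': [90, 40, 80, 98]}

# creating a dataframe with boolean index
df = pd.DataFrame(student_data, index=[True, False, True, False])

# .ix[] is not available in pandas 1.0 and later, so this cell does not call it;
# it indexes the Index object (df.index) directly with a boolean value instead
print(df.index[True])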


[[True False True False]]
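
Since .ix[] is no longer available from pandas 1.0 onward, lookups have to be written with .loc[] (labels) or .iloc[] (positions). The sketch below shows one possible way to express label-based, position-based, and mixed lookups on the same sample data; it assumes a current pandas version, and the variable names are illustrative.

```python
import pandas as pd

student_data = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
                'degree': ["MBA", "BCA", "M.Tech", "MBA"],
                'score': [90, 40, 80, 98]}
df = pd.DataFrame(student_data, index=[True, False, True, False])

# label-based lookup goes through .loc[]
rows_labelled_true = df.loc[True]

# position-based lookup goes through .iloc[]
row_at_position_1 = df.iloc[1]

# mixing both: take the rows labelled True, then the first of them by position
first_true_row = df.loc[True].iloc[0]
print(first_true_row['name'])
```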


  
# Applying a boolean mask to a dataframe #
We can apply a boolean mask to a dataframe with the [] accessor (the __getitem__ method). The mask is a list of True and False values with the same length as the number of rows in the dataframe. Applying the mask returns only the rows whose mask value is True.
Code #1: 

In [13]:
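# dictionary of lists (named student_data to avoid shadowing the built-in dict)
student_data = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
                'degree': ["MBA", "BCA", "M.Tech", "MBA"],
                'score': [90, 40, 80, 98]}

df = pd.DataFrame(student_data, index=[0, 1, 2, 3])

# keep the rows whose mask value is True (positions 0 and 2)
print(df[[True, False, True, False]])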


     name  degree  score
0  aparna     MBA     90
2  sudhir  M.Tech     80
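
The mask does not have to be a plain list. As a small sketch (the variable names are illustrative), the same selection can be written with a list, a NumPy boolean array, or through .loc[], and all three return the same rows.

```python
import numpy as np
import pandas as pd

student_data = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
                'degree': ["MBA", "BCA", "M.Tech", "MBA"],
                'score': [90, 40, 80, 98]}
df = pd.DataFrame(student_data, index=[0, 1, 2, 3])

list_mask = [True, False, True, False]
array_mask = np.array(list_mask)

via_list = df[list_mask]       # [] accessor with a list
via_array = df[array_mask]     # [] accessor with a NumPy boolean array
via_loc = df.loc[list_mask]    # .loc[] also accepts a boolean mask

# compare the three results
print(via_list.equals(via_array) and via_list.equals(via_loc))
```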


Code #2: 
 

In [16]:

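# making data frame from csv file, using the "Name" column as the index
data = pd.read_csv("D:/Python/input/nba.csv", index_col="Name")

# passing a DataFrame together with index= reindexes it by label; the labels
# 0-12 do not exist in the "Name" index, so every row is filled with NaN
df = pd.DataFrame(data, index=[0, 1, 2, 3, 4, 5, 6,
                               7, 8, 9, 10, 11, 12])

# keep the rows at even positions
df[[True, False, True, False, True,
    False, True, False, True, False,
    True, False, True]]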


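    Team  Number  Position  Age  Height  Weight  College  Salary
0    NaN     NaN       NaN  NaN     NaN     NaN      NaN     NaN
2    NaN     NaN       NaN  NaN     NaN     NaN      NaN     NaN
4    NaN     NaN       NaN  NaN     NaN     NaN      NaN     NaN
6    NaN     NaN       NaN  NaN     NaN     NaN      NaN     NaN
8    NaN     NaN       NaN  NaN     NaN     NaN      NaN     NaN
10   NaN     NaN       NaN  NaN     NaN     NaN      NaN     NaN
12   NaN     NaN       NaN  NaN     NaN     NaN      NaN     NaN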
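
The empty values in this output come from how the DataFrame constructor treats an existing DataFrame: passing index= reindexes it by label rather than relabelling its rows. The sketch below reproduces this on a tiny hand-made frame (the `players` data is invented for illustration and is not taken from the CSV) and shows reset_index() as one possible way to get integer labels while keeping the values.

```python
import pandas as pd

# hypothetical stand-in for a CSV loaded with index_col="Name"
players = pd.DataFrame({'Age': [25, 31, 27]},
                       index=pd.Index(["A", "B", "C"], name="Name"))

# labels 0, 1, 2 do not exist in the "Name" index, so every value becomes NaN
reindexed = pd.DataFrame(players, index=[0, 1, 2])
print(reindexed)

# reset_index() moves "Name" into a column and assigns the labels 0, 1, 2
renumbered = players.reset_index()
kept = renumbered[[True, False, True]]
print(kept)
```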


# Masking data based on column value #
We can filter a dataframe based on the values in a column by applying a condition to that column with a comparison operator such as ==, >, <, <=, or >=. The comparison produces a Series of True and False values, which can then be used as a mask.
Code #1: 
 

In [17]:

# dictionary of lists (named student_data to avoid shadowing the built-in dict)
student_data = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
                'degree': ["BCA", "BCA", "M.Tech", "BCA"],
                'score': [90, 40, 80, 98]}

# creating a dataframe
df = pd.DataFrame(student_data)

# using a comparison operator to build a boolean Series
print(df['degree'] == 'BCA')


0     True
1     True
2    False
3     True
Name: degree, dtype: bool


In [19]:
  
# using greater than operator for filtering of data
print(data['Age'] > 25)

Name
Avery Bradley    False
Jae Crowder      False
John Holland      True
R.J. Hunter      False
Jonas Jerebko     True
                 ...  
Shelvin Mack      True
Raul Neto        False
Tibor Pleiss      True
Jeff Withey       True
NaN              False
Name: Age, Length: 458, dtype: bool
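
The cells in this section only print the boolean Series. To actually filter, the Series is passed back to the dataframe, and several conditions can be combined with & (and), | (or), and ~ (not). A minimal sketch on the sample data, with illustrative variable names:

```python
import pandas as pd

student_data = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
                'degree': ["BCA", "BCA", "M.Tech", "BCA"],
                'score': [90, 40, 80, 98]}
df = pd.DataFrame(student_data)

# keep only the rows where the condition holds
bca_students = df[df['degree'] == 'BCA']

# each condition needs its own parentheses, because & and | bind
# more tightly than comparison operators such as == and >
bca_high_scores = df[(df['degree'] == 'BCA') & (df['score'] > 50)]

# ~ inverts a mask
not_bca = df[~(df['degree'] == 'BCA')]

print(bca_high_scores)
```

Python's `and`, `or`, and `not` keywords do not work element-wise on a Series, which is why the bitwise operators are used here.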


  
# Masking data based on index value #
We can also filter a dataframe based on its index values by building a mask from the index with a comparison operator such as ==, >, or <.
Code #1: 

In [20]:

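# dictionary of lists (named student_data to avoid shadowing the built-in dict)
student_data = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
                'degree': ["BCA", "BCA", "M.Tech", "BCA"],
                'score': [90, 40, 80, 98]}

df = pd.DataFrame(student_data, index=[0, 1, 2, 3])

# boolean array that is True only where the index label equals 0
mask = df.index == 0

print(df[mask])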


     name degree  score
0  aparna    BCA     90


In [21]:


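# reindexing the csv data to the integer labels 0-12; these labels do not
# exist in the "Name" index, so every row is filled with NaN
df = pd.DataFrame(data, index=[0, 1, 2, 3, 4, 5, 6,
                               7, 8, 9, 10, 11, 12])

# filtering data on index value
mask = df.index > 7

df[mask]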


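    Team  Number  Position  Age  Height  Weight  College  Salary
8    NaN     NaN       NaN  NaN     NaN     NaN      NaN     NaN
9    NaN     NaN       NaN  NaN     NaN     NaN      NaN     NaN
10   NaN     NaN       NaN  NaN     NaN     NaN      NaN     NaN
11   NaN     NaN       NaN  NaN     NaN     NaN      NaN     NaN
12   NaN     NaN       NaN  NaN     NaN     NaN      NaN     NaN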
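
Index-based masks can also be combined, or built with Index.isin(). The following sketch, again on the small sample frame and with illustrative variable names, shows a range condition and a membership test on the index.

```python
import pandas as pd

student_data = {'name': ["aparna", "pankaj", "sudhir", "Geeku"],
                'degree': ["BCA", "BCA", "M.Tech", "BCA"],
                'score': [90, 40, 80, 98]}
df = pd.DataFrame(student_data, index=[0, 1, 2, 3])

# rows whose index label lies between 1 and 2 (inclusive)
in_range = df[(df.index >= 1) & (df.index <= 2)]

# rows whose index label is one of the listed values
selected = df[df.index.isin([0, 3])]

print(in_range)
print(selected)
```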
