# Lesson 7 Logical Statements

## The purpose of today's lecture is to introduce the use of logical statements in programming.

## Put simply, a logical statement is a statement that 
## produces an output whose value is either **True** or **False**.  

## From a coding perspective, we could also map those two states onto a binary variable which takes on the value 1 or 0.  

## This type of variable is known as a **Boolean** variable.   

## In this lesson, we will learn how to set up and evaluate various types of logical statements. 

## In this course, we will use logical statements to create Boolean variables that serve 
* ### as *masks* to segregate data. 
* ### to *control* the conditional execution of code.   
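
### As a preview, here is a minimal sketch of both uses. The array `times` and its values are made up for illustration, and the indexing and `np.any` calls are covered properly later in the lesson.

```python
import numpy as np

# Hypothetical response times in seconds (made-up values for illustration)
times = np.array([0.42, 1.75, 0.61, 2.30, 0.55])

# Use 1: a Boolean mask that segregates the data
is_fast = (times < 1.0)        # one True/False value per entry
fast_times = times[is_fast]    # keep only the entries where the mask is True
print(is_fast)
print(fast_times)

# Use 2: a Boolean value that controls conditional execution
if np.any(~is_fast):
    print('At least one response took 1 second or longer')
```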

In [None]:
import numpy as np
from numpy import random
from matplotlib import pyplot as plt
import pandas as pd

## 7.1 Comparison Operators

### A logical statement almost always involves a comparison. The comparison operators available in Python are 
* #### `==` - equal                    
* #### `!=` - not equal                
* #### `>`  - greater than             
* #### `>=` - greater than or equal    
* #### `<`  - less than                
* #### `<=` - less than or equal       

### Example 1  

In [None]:
x = 5

In [None]:
x == 5   #Test of Equality, True statement, returns the value True 

In [None]:
test = x == 5  #Equal, True statement, correct syntax, bad formatting 
print(test)

In [None]:
type(test)

In [None]:
test = (x == 5)  #Equal, True Statement, correct syntax, good formatting 
print(test)

In [None]:
test = (x != 5)  #Not Equal, False Statement 
print(test)

### Example 2

In [None]:
x = np.array([1, 2, 3, 4, 5])
print(x)

In [None]:
test = (x < 3)  # less than
print(test)
print(type(test))   # the result is a NumPy array ...
print(test.dtype)   # ... whose elements are Boolean

In [None]:
test = (x > 3)  # greater than
print(test)

In [None]:
test = (x <= 3)  # less than or equal
print(test)

In [None]:
test = (x >= 3)  # greater than or equal
print(test)

In [None]:
test = (x != 3)  # not equal
print(test)

In [None]:
test = (x == 3)  # equal
print(test)

## 7.2 Rules of Logic 1 - Set Theory Perspective

### Understanding the rules of logic and set theory is important for evaluating logical statements meaningfully. 

### We can view the array `x = [1 2 3 4 5]` as a **set**. This set defines a **space**, or universe.<br>

### Any of the logical and comparison operations divides the space into two **subspaces**, one where the logical statement is **True**, and another where the logical statement is **False**.<br>  

### **The most important thing to remember is that the True and False subsets do not overlap, and together they must combine to form the original set**

### Example 3

In [None]:
test1 = (x < 3)
print('test1 = ',test1)
test2 = (x > 3)
print('test2 = ',test2)
test3 = (x >= 3)
print('test3 = ',test3)

### 7.2.1 Logical complement, or logical_not

### The complement of a boolean variable is the opposite state.  The complement of **True** is **False** and the complement of **False** is **True** 

### There are two ways to take the complement of a Boolean array.  One is the NumPy function `np.logical_not`; the other is the operator `~`.


In [None]:
test4 = np.logical_not(test1)
print('test4 = ',test4)
test5 = ~test1
print('test5 = ',test5)
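
### The complement makes the point about subsets concrete. The sketch below is only an illustration: it uses Boolean indexing (covered in Section 7.7) to pull out the True and False subsets, and the names `true_subset`, `false_subset`, `overlap`, and `combined` are my own.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
test = (x < 3)

true_subset = x[test]      # entries where the statement is True
false_subset = x[~test]    # entries where the statement is False
print(true_subset, false_subset)

# The two subsets share no elements ...
overlap = np.intersect1d(true_subset, false_subset)
print(overlap.size == 0)

# ... and together they rebuild the original set
combined = np.union1d(true_subset, false_subset)
print(np.array_equal(combined, x))

# The complement of (x < 3) is (x >= 3), not (x > 3)
print(np.array_equal(~test, x >= 3))
print(np.array_equal(~test, x > 3))
```

### Note that `x > 3` leaves out the value 3, so `x < 3` and `x > 3` do not combine to the original set.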

## 7.3 Manual Entry of a Boolean Array 

### Sometimes we need to manually create a Boolean array.  Two ways to do this are shown below. 


In [None]:
boolean_array1 = [True,False,False,True]
print(boolean_array1)
boolean_numpy_array1 = np.array(boolean_array1)
print(boolean_numpy_array1)

In [None]:
boolean_array2 = [0,1,1,0]  # a plain list of integers, not yet Boolean 
print(boolean_array2)

In [None]:
boolean_array3 = np.array([0,1,1,0],dtype=bool)  # force the data type to boolean 
print(boolean_array3)

In [None]:
boolean_array3 = ~boolean_array3
print(boolean_array3)

## 7.4 Conversion to a Boolean Array - Can I turn anything into a Boolean array? 
### What are the rules? 

### 0, None, False or empty strings ARE **False**

### Values other than 0, None, False or empty strings ARE **True**.
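
### Python's built-in `bool` applies these same rules to single values. The short sketch below uses a few made-up values to show the edge cases. Note that a string holding only a space, and even the string `'False'`, count as **True** because they are not empty.

```python
# Made-up values chosen to probe the rules above
values = [0, 0.0, None, '', ' ', 'False', -1]

for value in values:
    print(repr(value), '->', bool(value))

# Collect the results in a list for later use
truth_values = [bool(value) for value in values]
```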

In [None]:
strange_list = [1, 0.5, 0, 0.0, None, 'a', '', ' ',True, False]
bool_arr = np.array(strange_list, dtype=bool)
print(strange_list)
print(bool_arr)

Interpreted languages like Python, MATLAB, and R are easy to use because you can be careless like this and things still kind of work.  

**This is their strength AND their weakness**

It's really easy to have a mistake in your code and have your code seem to work in these languages.  

## 7.5 Comparison of Arrays

### Comparison operators can also be used to compare two arrays of the same shape.  Such comparisons proceed element by element. 

In [None]:
rng = random.default_rng(seed = 21)
array_a = rng.integers(0,10,8)
array_b = rng.integers(0,10,8)

In [None]:
test = (array_a > array_b)
print(array_a)
print(array_b)
print(test)

In [None]:
x = np.array([1,2,3,4,5])
test = (2**x == x**2)
print(test)
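
### Because the comparison is element by element, the result is an array rather than a single True or False. If the question is whether two arrays match everywhere, the result has to be reduced to one value. A minimal sketch with made-up arrays `a` and `b`:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([1, 2, 0, 4])

elementwise = (a == b)        # one result per pair of entries
print(elementwise)

print(np.all(a == b))         # True only if every pair matches
print(np.array_equal(a, b))   # same question, and it also checks the shapes
```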

## 7.6 Working with Boolean Arrays

### Why do we want to make use of Boolean arrays?  
### Here I will show some useful operations that Boolean arrays allow us to do.  Then I will show an example of how we use Boolean arrays to organize data.  

In [None]:
x = rng.integers(0,9,(3,4))
print(x)

In [None]:
x < 6

### 7.6.1 Counting entries

### To count the number of ``True`` entries in a Boolean array, ``np.count_nonzero`` is useful:

In [None]:
# how many values less than 6?
np.count_nonzero(x < 6)

### We see that `count_nonzero` counted the number of entries that were `True` in the entire matrix. 
### Python interprets `True` as having the numeric value of 1, and `False` as zero.  

In [None]:
#how many values in the 2nd row (index = 1) less than 6 
count = np.count_nonzero(x[1,:] < 6) # count the row with index = 1 
print(count)

### Another way to get at this information is to use ``np.sum``; in this case, ``False`` is interpreted as ``0``, and ``True`` is interpreted as ``1``:

In [None]:
np.sum(x < 6)

### Like other NumPy aggregation functions, ``sum`` can also be applied along rows or columns:

In [None]:
# how many values less than 6 in each row?
np.sum(x < 6, axis=1)

### This counts the number of values less than 6 in each row of the matrix.
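
### The `axis` argument names the dimension that gets collapsed, which is easy to mix up. The sketch below uses a small fixed matrix `m` (made-up values, not the random `x` above) so the counts can be checked by eye. It also shows that `np.count_nonzero` accepts `axis`.

```python
import numpy as np

# A small fixed matrix (made-up values) so the counts are easy to verify
m = np.array([[5, 0, 3, 3],
              [7, 8, 3, 5],
              [2, 4, 7, 6]])

per_row = np.sum(m < 6, axis=1)      # collapse the columns: one count per row
per_column = np.sum(m < 6, axis=0)   # collapse the rows: one count per column
per_row_alt = np.count_nonzero(m < 6, axis=1)  # same row counts

print(per_row)
print(per_column)
print(per_row_alt)
```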

### If we're interested in quickly checking whether any or all the values are true, we can use (you guessed it) ``np.any`` or ``np.all``:

In [None]:
# are there any values greater than 8?
np.any(x > 8)

In [None]:
# are there any values less than zero?
np.any(x < 0)

In [None]:
# are all values less than 10?
np.all(x < 10)

In [None]:
# are all values equal to 6?
np.all(x == 6)

### ``all`` and ``any`` can be used along particular axes as well. For example:

In [None]:
# are all values in each row less than 8?
np.all(x < 8, axis=1)

## 7.7 Finding Entries - Boolean Indexing 

### We can use the Boolean array that results from a logical statement as an **index** into an array, to extract the values that meet the condition of the logical statement.   

### We will also often want the **indices** of the array entries that meet a logical condition. 

### Here NumPy's `where` function is useful, but it has a slightly odd syntax because it returns a tuple of index arrays

In [None]:
my_array = rng.integers(0,10,20)
print(my_array)

In [None]:
#test if the array entries are bigger than 5
test = (my_array > 5)
print(test)

In [None]:
#how many values are there greater than 5 
count = np.sum(test)
print(count)

In [None]:
#Let's get the values in my_array that are larger than 5 
my_greater_than_5_array = my_array[test]
print(my_greater_than_5_array)

In [None]:
np.size(my_greater_than_5_array)
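
### A mask does not have to index the array it was built from. It can index any array of the same length, which is what makes masks useful for organizing data. A minimal sketch, assuming two made-up arrays `group` and `score` with one entry per trial:

```python
import numpy as np

# Made-up example data: one group label and one score per trial
group = np.array([1, 2, 1, 3, 2, 1])
score = np.array([10, 20, 30, 40, 50, 60])

in_group_1 = (group == 1)       # mask built from the label array
scores_1 = score[in_group_1]    # mask applied to the score array

print(in_group_1)
print(scores_1)
```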

In [None]:
#find the indices where the array entries are bigger than 5 
test_indices = np.where(my_array > 5)
print(test_indices)

### In a deeply annoying way, `np.where` returns a tuple holding one index array per dimension, even for a 1-D array.  This can cause shape problems.

In [None]:
test_array = my_array[test_indices]

In [None]:
np.shape(test_array)
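
### One way to sidestep the tuple is to pull the index array out of it explicitly, or to use `np.flatnonzero`, which returns the index array directly. This is a sketch with a made-up array `values`, not the only way to handle it.

```python
import numpy as np

values = np.array([3, 9, 1, 7, 6, 2])

result = np.where(values > 5)    # a tuple holding one index array
indices = result[0]              # pull the index array out of the tuple
same_indices = np.flatnonzero(values > 5)  # the index array, no tuple

print(type(result), len(result))
print(indices)
print(same_indices)
```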

## 7.8 Organizing Data with Boolean Arrays 

### Data is rarely delivered to you organized in the manner that you want for data analysis. 

### For example, consider a behavioral experiment with *3* conditions.  In a typical experimental protocol, the subject will be presented with a random condition on each trial. The subject's response will be recorded on each trial. 

### At the end, we might then expect to have a data file where the data has been recorded continuously IN THE ORDER OF THE EXPERIMENT, not organized by condition.  

In [None]:
rtdata = pd.read_excel('rtdata.xlsx')

In [None]:
rtdata.keys()

In [None]:
print(rtdata['trialnumber'])


### The first thing to do when we load a data file is to look at the keys. 

In [None]:
print(rtdata['condition'])

In [None]:
print(rtdata['responsetime'])

### Before we get started, let's copy the values from the DataFrame to variables. 
### I always go ahead and convert them into NumPy arrays. 

In [None]:
trialnumber = np.array(rtdata['trialnumber'])
condition = np.array(rtdata['condition'])
responsetime = np.array(rtdata['responsetime'])


### 7.8.1 `unique` values

### Now, it would be quite useful to know how many conditions there are.  From looking at the printout, you might think there are 3 - 1,2,3.  But only 10 values are displayed there, so we cannot be sure.  

### A really useful function in numpy is `unique`

In [None]:
the_conditions = np.unique(condition)
print(the_conditions)
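
### Once the unique values are known, each one can be turned into a Boolean mask. The sketch below counts the trials per label using a made-up array `labels` standing in for the real data. It also shows the `return_counts` option of `np.unique`, which gives the same counts directly.

```python
import numpy as np

# Made-up condition labels standing in for the real data file
labels = np.array([2, 1, 3, 1, 2, 2, 3, 1, 2])

for value in np.unique(labels):
    mask = (labels == value)    # True on the trials with this label
    print('condition', value, 'has', np.sum(mask), 'trials')

# np.unique can also report the counts itself
values, counts = np.unique(labels, return_counts=True)
print(values)
print(counts)
```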

### 7.8.2  TASK: Compute the mean, median, 25th percentile, and 75th percentile of response time for each condition.  

### HINT: USE LOGICAL STATEMENTS TO CREATE BOOLEAN VARIABLES THAT IDENTIFY THE TRIALS CORRESPONDING TO EACH CONDITION.   