# Excel IF statements in Python
---


## **Excel:**

    =IF(logical_test, [value_if_true], [value_if_false])


## **Python:**
    
    np.where(logical_test, value_if_true, value_if_false)
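
To make the mapping concrete, here is a minimal sketch on a small hand-made array. The `losses` values are hypothetical sample numbers, not the notebook's data. `np.where` evaluates the test for every element at once, much like an `=IF(...)` formula filled down a whole column.

```python
import numpy as np

# Hypothetical sample values, used only for illustration
losses = np.array([2491, 86826, 280, 133914])

# Excel equivalent, filled down a column: =IF(A1 > 50000, 1, 0)
flags = np.where(losses > 50000, 1, 0)

print(flags)
```

Each element greater than 50000 becomes 1 and every other element becomes 0.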

<br><br>

### Load required packages and data
---

In [1]:
# Import required packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
# Save the GitHub file locations to variables
failed_bank_path = 'https://github.com/The-Calculated-Life/python_analysis_for_excel/blob/main/data/failed_banks.xlsx?raw=true'
bx_books_path = 'https://raw.githubusercontent.com/The-Calculated-Life/python_analysis_for_excel/main/data/bx_books.csv'

# Read the Excel and CSV files
bank_detail = pd.read_excel(failed_bank_path, sheet_name='detail')
bx_books = pd.read_csv(bx_books_path)

<br><br>
### IF statement examples
---
<br>


In [3]:
# Flag all banks which had more than $50,000 in ESTIMATED LOSS
np.where(bank_detail['ESTIMATED LOSS'] > 50000, 1, 0)

array([0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1,
       0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
       0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0,
       1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1,
       1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0,
       0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0,
       1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1,
       0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0,
       0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0,
       1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0,
       1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1,
       0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0,
       1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0,

<br><br>

In [4]:
# Save those flags to a column called BIG_LOSSES
bank_detail['BIG_LOSSES'] = np.where(bank_detail['ESTIMATED LOSS'] > 50000, 1, 0)

<br><br>

In [5]:
# Look at the result (top 10 rows)
bank_detail.head(10)

|   | CERT | FIN | CHARTER | ESTIMATED LOSS | ASSETS | DEPOSITS | RESOLUTION | BIG_LOSSES |
|---|------|-----|---------|----------------|--------|----------|------------|------------|
| 0 | 14361 | 10536.0 | COMMERCIAL | NaN | 152400 | 139526 | FAILURE | 0 |
| 1 | 18265 | 10535.0 | COMMERCIAL | NaN | 100879 | 95159 | FAILURE | 0 |
| 2 | 21111 | 10534.0 | COMMERCIAL | 2491.0 | 120574 | 111234 | FAILURE | 0 |
| 3 | 58112 | 10532.0 | COMMERCIAL | 4547.0 | 29726 | 26473 | FAILURE | 0 |
| 4 | 58317 | 10533.0 | OTHER | 2188.0 | 27119 | 26151 | FAILURE | 0 |
| 5 | 10716 | 10531.0 | COMMERCIAL | 21577.0 | 36738 | 31254 | FAILURE | 0 |
| 6 | 30570 | 10530.0 | OTHER | 86826.0 | 166345 | 143964 | FAILURE | 1 |
| 7 | 17719 | 10529.0 | COMMERCIAL | 280.0 | 33012 | 27466 | FAILURE | 0 |
| 8 | 1802 | 10528.0 | OTHER | 8544.0 | 34370 | 33972 | FAILURE | 0 |
| 9 | 30003 | 10527.0 | OTHER | 133914.0 | 1031900 | 1002026 | FAILURE | 1 |
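
The first two rows in the preview have no ESTIMATED LOSS, yet BIG_LOSSES is 0 for both. That happens because a comparison against a missing value (NaN) evaluates to False, so `np.where` falls through to the "false" branch. The sketch below reproduces this on a hypothetical three-row frame (`df` and the column `BIG_LOSSES_OR_NA` are illustrative names, not part of the notebook) and shows one possible way to keep missing values missing by nesting `np.where`, the same way IF statements are nested in Excel.

```python
import numpy as np
import pandas as pd

# Hypothetical mini-frame mirroring a few ESTIMATED LOSS values from the preview
df = pd.DataFrame({'ESTIMATED LOSS': [np.nan, 2491.0, 86826.0]})
loss = df['ESTIMATED LOSS']

# NaN > 50000 evaluates to False, so a missing loss is flagged 0
df['BIG_LOSSES'] = np.where(loss > 50000, 1, 0)

# Nested np.where, like =IF(ISBLANK(A1), "", IF(A1 > 50000, 1, 0)):
# the outer test keeps missing losses as NaN instead of flagging them 0
df['BIG_LOSSES_OR_NA'] = np.where(loss.isna(), np.nan,
                                  np.where(loss > 50000, 1, 0))

print(df)
```

Whether a missing loss should count as 0 or stay missing depends on the analysis; the nested version simply makes that choice explicit.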


<br><br>
**QUICK CHALLENGE #1:**

**Task: Use np.where() and the `bx_books` dataframe to flag all books published in 2002. Save the result to a column named `pub_2002`**

<br>

*Hint: The equality comparison operator is `==` in Python*

In [10]:
# Your code for quick challenge #1 here:
np.where(bx_books['year_of_publication'] == 2002, 1, 0)

array([1, 0, 0, ..., 0, 0, 0])

In [11]:
bx_books['pub_2002'] = np.where(bx_books['year_of_publication'] == 2002, 1, 0)

In [12]:
bx_books.head()

|   | isbn | book_title | book_author | year_of_publication | publisher | pub_2002 |
|---|------|------------|-------------|---------------------|-----------|----------|
| 0 | 195153448 | Classical Mythology | Mark P. O. Morford | 2002 | Oxford University Press | 1 |
| 1 | 2005018 | Clara Callan | Richard Bruce Wright | 2001 | HarperFlamingo Canada | 0 |
| 2 | 60973129 | Decision in Normandy | Carlo D'Este | 1991 | HarperPerennial | 0 |
| 3 | 374157065 | Flu: The Story of the Great Influenza Pandemic... | Gina Bari Kolata | 1999 | Farrar Straus Giroux | 0 |
| 4 | 393045218 | The Mummies of Urumchi | E. J. W. Barber | 1999 | W. W. Norton & Company | 0 |
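
A quick way to sanity-check a 1/0 flag column is to sum it, which counts the flagged rows much like `=SUM()` over the flag column in Excel. The sketch below uses a hypothetical stand-in frame named `books`, built from the five publication years shown in the preview, rather than the full `bx_books` data. It also checks that casting the boolean comparison to integers gives the same flags as `np.where` with 1 and 0.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for bx_books, using the years from the preview above
books = pd.DataFrame({'year_of_publication': [2002, 2001, 1991, 1999, 1999]})

books['pub_2002'] = np.where(books['year_of_publication'] == 2002, 1, 0)

# Summing a 1/0 flag column counts the flagged rows
n_flagged = books['pub_2002'].sum()

# Casting the boolean mask to int yields the same 1/0 flags
mask_as_int = (books['year_of_publication'] == 2002).astype(int)
same_flags = (mask_as_int == books['pub_2002']).all()

print(n_flagged, same_flags)
```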
