# VLOOKUP in Python (Part 2)
---

## **In part 1:**



*   pandas' `merge()` is the Python equivalent of Excel's VLOOKUP


    df1.merge(df2, on='lookup_value', how='left')
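As a quick refresher, here is a minimal sketch of that pattern on toy data. The `orders` and `prices` frames and their values are hypothetical and are not part of the notebook's datasets; only the `merge()` call mirrors the line above.

```python
import pandas as pd

# Hypothetical toy tables (not from the notebook's data)
orders = pd.DataFrame({'lookup_value': ['A', 'B', 'C'], 'qty': [1, 2, 3]})
prices = pd.DataFrame({'lookup_value': ['A', 'B'], 'price': [9.5, 4.0]})

# Left merge: keep every row of `orders` and pull in `price` where the key matches.
# Keys with no match (here 'C') get NaN, much like VLOOKUP returning #N/A.
result = orders.merge(prices, on='lookup_value', how='left')
print(result)
```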

<br>

---

<br>

### Load required packages and data
---

In [1]:
# Import required packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
# Save Github location paths to a variable
failed_bank_path = 'https://github.com/The-Calculated-Life/python_analysis_for_excel/blob/main/data/failed_banks.xlsx?raw=true'
bx_books_path = 'https://raw.githubusercontent.com/The-Calculated-Life/python_analysis_for_excel/main/data/bx_books.csv'
bx_ratings_path = 'https://raw.githubusercontent.com/The-Calculated-Life/python_analysis_for_excel/main/data/bx_ratings.csv'
bx_users_path = 'https://raw.githubusercontent.com/The-Calculated-Life/python_analysis_for_excel/main/data/bx_users.csv'

# Read excel and CSV files
bank_detail = pd.read_excel(failed_bank_path, sheet_name='detail')
bank_list = pd.read_excel(failed_bank_path, sheet_name='banks')
bank_dividends = pd.read_excel(failed_bank_path, sheet_name='dividends')
bx_books = pd.read_csv(bx_books_path)
bx_ratings = pd.read_csv(bx_ratings_path)
bx_users = pd.read_csv(bx_users_path)

bx_users.columns = ['user', 'location', 'age']

# Apply transformations to bank_dividends
bank_dividends[['Bank Number', 'Name', 'Date Closed']] = bank_dividends['Bank Number:Name:Date Closed'].str.split(':', expand=True)
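# Keep only final dividends; .copy() avoids a SettingWithCopyWarning on the assignment below
bank_dividends = bank_dividends[bank_dividends['Dividend Type'] == 'Final'].copy()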
bank_dividends['Bank Number'] = bank_dividends['Bank Number'].astype('float')
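The transformation above splits one colon-delimited column into three and converts the key to a numeric type so it can be matched against `FIN` later. A small sketch of the same idea on a standalone frame follows. The two combined strings are taken from the output shown later in this notebook; the `combined` column name and the `'Interim'` dividend type are assumptions made only for illustration.

```python
import pandas as pd

# Two sample rows; 'Interim' is a made-up value used only to show the filter
raw = pd.DataFrame({
    'combined': ['10000:METROPOLITAN SAVINGS BANK:2/2/2007',
                 '10002:MIAMI VALLEY BANK :10/4/2007'],
    'Dividend Type': ['Final', 'Interim'],
})

# expand=True returns one column per piece instead of a list per row
parts = raw['combined'].str.split(':', expand=True)
parts.columns = ['Bank Number', 'Name', 'Date Closed']

# Strip stray whitespace and make the key numeric so it can match a float column
parts['Name'] = parts['Name'].str.strip()
parts['Bank Number'] = parts['Bank Number'].astype('float')

# Attach the new columns and keep only the final dividends
final_only = raw.join(parts)
final_only = final_only[final_only['Dividend Type'] == 'Final'].copy()
print(final_only)
```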

<br>

### Examples: VLOOKUP (Part 2)
---

In [3]:
# View "bank_list"
bank_list.head()

Unnamed: 0,CERT,Bank Name,City,ST,Acquiring Institution,Closing Date
0,14361,The First State Bank,Barboursville,WV,"MVB Bank, Inc.",2020-04-03
1,18265,Ericson State Bank,Ericson,NE,Farmers and Merchants Bank,2020-02-14
2,21111,City National Bank of New Jersey,Newark,NJ,Industrial Bank,2019-11-01
3,58317,Resolute Bank,Maumee,OH,Buckeye State Bank,2019-10-25
4,58112,Louisa Community Bank,Louisa,KY,Kentucky Farmers Bank Corporation,2019-10-25


<br>

In [4]:
# View "bank_detail"
bank_detail.head()

Unnamed: 0,CERT,FIN,CHARTER,ESTIMATED LOSS,ASSETS,DEPOSITS,RESOLUTION
0,14361,10536.0,COMMERCIAL,,152400,139526,FAILURE
1,18265,10535.0,COMMERCIAL,,100879,95159,FAILURE
2,21111,10534.0,COMMERCIAL,2491.0,120574,111234,FAILURE
3,58112,10532.0,COMMERCIAL,4547.0,29726,26473,FAILURE
4,58317,10533.0,OTHER,2188.0,27119,26151,FAILURE


<br>

In [5]:
# Merge "bank_list" and "bank_detail"
bank_list.merge(bank_detail, how='left', on='CERT')

Unnamed: 0,CERT,Bank Name,City,ST,Acquiring Institution,Closing Date,FIN,CHARTER,ESTIMATED LOSS,ASSETS,DEPOSITS,RESOLUTION
0,14361,The First State Bank,Barboursville,WV,"MVB Bank, Inc.",2020-04-03,10536.0,COMMERCIAL,,152400,139526,FAILURE
1,18265,Ericson State Bank,Ericson,NE,Farmers and Merchants Bank,2020-02-14,10535.0,COMMERCIAL,,100879,95159,FAILURE
2,21111,City National Bank of New Jersey,Newark,NJ,Industrial Bank,2019-11-01,10534.0,COMMERCIAL,2491.0,120574,111234,FAILURE
3,58317,Resolute Bank,Maumee,OH,Buckeye State Bank,2019-10-25,10533.0,OTHER,2188.0,27119,26151,FAILURE
4,58112,Louisa Community Bank,Louisa,KY,Kentucky Farmers Bank Corporation,2019-10-25,10532.0,COMMERCIAL,4547.0,29726,26473,FAILURE
...,...,...,...,...,...,...,...,...,...,...,...,...
556,32646,"Superior Bank, FSB",Hinsdale,IL,"Superior Federal, FSB",2001-07-27,6004.0,OTHER,286673.0,1765455,1609501,FAILURE
557,6629,Malta National Bank,Malta,OH,North Valley Bank,2001-05-03,4648.0,COMMERCIAL,769.0,9075,8728,FAILURE
558,34264,First Alliance Bank & Trust Co.,Manchester,NH,Southern New Hampshire Bank & Trust,2001-02-02,4647.0,COMMERCIAL,817.0,17438,16931,FAILURE
559,3815,National State Bank of Metropolis,Metropolis,IL,Banterra Bank of Marion,2000-12-14,4646.0,COMMERCIAL,2670.0,90397,71277,FAILURE


<br>

In [6]:
# Merge bank_list and bank_detail but only keep ASSETS from bank_detail
bank_list.merge(bank_detail[['CERT', 'ASSETS']], how='left', on='CERT')

Unnamed: 0,CERT,Bank Name,City,ST,Acquiring Institution,Closing Date,ASSETS
0,14361,The First State Bank,Barboursville,WV,"MVB Bank, Inc.",2020-04-03,152400
1,18265,Ericson State Bank,Ericson,NE,Farmers and Merchants Bank,2020-02-14,100879
2,21111,City National Bank of New Jersey,Newark,NJ,Industrial Bank,2019-11-01,120574
3,58317,Resolute Bank,Maumee,OH,Buckeye State Bank,2019-10-25,27119
4,58112,Louisa Community Bank,Louisa,KY,Kentucky Farmers Bank Corporation,2019-10-25,29726
...,...,...,...,...,...,...,...
556,32646,"Superior Bank, FSB",Hinsdale,IL,"Superior Federal, FSB",2001-07-27,1765455
557,6629,Malta National Bank,Malta,OH,North Valley Bank,2001-05-03,9075
558,34264,First Alliance Bank & Trust Co.,Manchester,NH,Southern New Hampshire Bank & Trust,2001-02-02,17438
559,3815,National State Bank of Metropolis,Metropolis,IL,Banterra Bank of Marion,2000-12-14,90397


<br>

In [7]:
# Save the original merged "bank_list" and "bank_detail" as "bank_merged"
bank_merged = bank_list.merge(bank_detail, how='left', on='CERT')

<br>

In [8]:
# View "bank_merged"
bank_merged.head()

Unnamed: 0,CERT,Bank Name,City,ST,Acquiring Institution,Closing Date,FIN,CHARTER,ESTIMATED LOSS,ASSETS,DEPOSITS,RESOLUTION
0,14361,The First State Bank,Barboursville,WV,"MVB Bank, Inc.",2020-04-03,10536.0,COMMERCIAL,,152400,139526,FAILURE
1,18265,Ericson State Bank,Ericson,NE,Farmers and Merchants Bank,2020-02-14,10535.0,COMMERCIAL,,100879,95159,FAILURE
2,21111,City National Bank of New Jersey,Newark,NJ,Industrial Bank,2019-11-01,10534.0,COMMERCIAL,2491.0,120574,111234,FAILURE
3,58317,Resolute Bank,Maumee,OH,Buckeye State Bank,2019-10-25,10533.0,OTHER,2188.0,27119,26151,FAILURE
4,58112,Louisa Community Bank,Louisa,KY,Kentucky Farmers Bank Corporation,2019-10-25,10532.0,COMMERCIAL,4547.0,29726,26473,FAILURE


<br><br>
**QUICK CHALLENGE #1:**

**Task: Use merge() to match books in `bx_books` to their ratings in `bx_ratings`**

* Use a default (inner) join
* Save the result as `bx_merged`

In [9]:
bx_books.head()

Unnamed: 0,isbn,book_title,book_author,year_of_publication,publisher
0,195153448,Classical Mythology,Mark P. O. Morford,2002,Oxford University Press
1,2005018,Clara Callan,Richard Bruce Wright,2001,HarperFlamingo Canada
2,60973129,Decision in Normandy,Carlo D'Este,1991,HarperPerennial
3,374157065,Flu: The Story of the Great Influenza Pandemic...,Gina Bari Kolata,1999,Farrar Straus Giroux
4,393045218,The Mummies of Urumchi,E. J. W. Barber,1999,W. W. Norton &amp; Company


In [10]:
bx_ratings.head()

Unnamed: 0,user_id,isbn,rating
0,276725,034545104X,0
1,276726,0155061224,5
2,276727,0446520802,0
3,276729,052165615X,3
4,276729,0521795028,6


In [12]:
# Your code for quick challenge #1 here:
bx_merged = bx_books.merge(bx_ratings, on='isbn')

In [13]:
# View the result: "bx_merged"
bx_merged.head()

Unnamed: 0,isbn,book_title,book_author,year_of_publication,publisher,user_id,rating
0,074322678X,Where You'll Find Me: And Other Stories,Ann Beattie,2002,Scribner,8,5
1,080652121X,Hitler's Secret Bankers: The Myth of Swiss Neu...,Adam Lebor,2000,Citadel Press,8,0
2,1552041778,Jane Doe,R. J. Kaiser,1999,Mira Books,8,5
3,1558746218,A Second Chicken Soup for the Woman's Soul (Ch...,Jack Canfield,1998,Health Communications,8,0
4,1558746218,A Second Chicken Soup for the Woman's Soul (Ch...,Jack Canfield,1998,Health Communications,3363,0
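To see what the default inner join does differently from the left join used earlier, here is a small sketch on toy data. The `books` and `ratings` frames and their values are hypothetical stand-ins for `bx_books` and `bx_ratings`.

```python
import pandas as pd

# Hypothetical stand-ins for bx_books and bx_ratings
books = pd.DataFrame({'isbn': ['111', '222', '333'],
                      'book_title': ['A', 'B', 'C']})
ratings = pd.DataFrame({'isbn': ['111', '111', '444'],
                        'rating': [5, 0, 7]})

# Inner join (the default): only ISBNs present in BOTH tables survive,
# and a book with several ratings appears once per rating
inner = books.merge(ratings, on='isbn')

# Left join: every book is kept; books without a rating get NaN
left = books.merge(ratings, on='isbn', how='left')

print(len(inner), len(left))
```

An inner join can therefore both drop rows (unrated books) and repeat rows (books rated more than once), which is why the same title appears on consecutive rows in the output above.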


<br><br>

### VLOOKUP with different "on" conditions
---

In [14]:
# Look at "bank_merged" dataframe
bank_merged.head()

Unnamed: 0,CERT,Bank Name,City,ST,Acquiring Institution,Closing Date,FIN,CHARTER,ESTIMATED LOSS,ASSETS,DEPOSITS,RESOLUTION
0,14361,The First State Bank,Barboursville,WV,"MVB Bank, Inc.",2020-04-03,10536.0,COMMERCIAL,,152400,139526,FAILURE
1,18265,Ericson State Bank,Ericson,NE,Farmers and Merchants Bank,2020-02-14,10535.0,COMMERCIAL,,100879,95159,FAILURE
2,21111,City National Bank of New Jersey,Newark,NJ,Industrial Bank,2019-11-01,10534.0,COMMERCIAL,2491.0,120574,111234,FAILURE
3,58317,Resolute Bank,Maumee,OH,Buckeye State Bank,2019-10-25,10533.0,OTHER,2188.0,27119,26151,FAILURE
4,58112,Louisa Community Bank,Louisa,KY,Kentucky Farmers Bank Corporation,2019-10-25,10532.0,COMMERCIAL,4547.0,29726,26473,FAILURE


<br>

In [15]:
# Look at "bank_dividends" dataframe
bank_dividends.head()

Unnamed: 0,Bank Number:Name:Date Closed,Dividend Type,Priority Paid,% Paid,Total Paid,Date Paid,Bank Number,Name,Date Closed
0,10000:METROPOLITAN SAVINGS BANK:2/2/2007,Final,Depositor,0.04178,0.46375,2012-07-06,10000.0,METROPOLITAN SAVINGS BANK,2/2/2007
13,10002:MIAMI VALLEY BANK :10/4/2007,Final,Depositor,0.17015,0.52799,2016-08-12,10002.0,MIAMI VALLEY BANK,10/4/2007
16,10003:DOUGLASS NATIONAL BANK :1/25/2008,Final,Depositor,0.0,0.85705,2012-08-09,10003.0,DOUGLASS NATIONAL BANK,1/25/2008
32,"10006:FIRST INTEGRITY BANK, N.A. :5/30/2008",Final,Depositor,0.04828,0.79249,2016-09-06,10006.0,"FIRST INTEGRITY BANK, N.A.",5/30/2008
52,"10009:FIRST HERITAGE BANK, NA :7/25/2008",Final,Depositor,0.01289,0.64104,2016-08-09,10009.0,"FIRST HERITAGE BANK, NA",7/25/2008


<br>

In [16]:
# Merge (VLOOKUP) bank_dividends onto bank_merged: match FIN (left) to Bank Number (right).
# Note: this brings in every bank_dividends column, not just Total Paid.
bank_merged.merge(bank_dividends, how='left', left_on='FIN', right_on='Bank Number')

Unnamed: 0,CERT,Bank Name,City,ST,Acquiring Institution,Closing Date,FIN,CHARTER,ESTIMATED LOSS,ASSETS,DEPOSITS,RESOLUTION,Bank Number:Name:Date Closed,Dividend Type,Priority Paid,% Paid,Total Paid,Date Paid,Bank Number,Name,Date Closed
0,14361,The First State Bank,Barboursville,WV,"MVB Bank, Inc.",2020-04-03,10536.0,COMMERCIAL,,152400,139526,FAILURE,,,,,,NaT,,,
1,18265,Ericson State Bank,Ericson,NE,Farmers and Merchants Bank,2020-02-14,10535.0,COMMERCIAL,,100879,95159,FAILURE,,,,,,NaT,,,
2,21111,City National Bank of New Jersey,Newark,NJ,Industrial Bank,2019-11-01,10534.0,COMMERCIAL,2491.0,120574,111234,FAILURE,,,,,,NaT,,,
3,58317,Resolute Bank,Maumee,OH,Buckeye State Bank,2019-10-25,10533.0,OTHER,2188.0,27119,26151,FAILURE,,,,,,NaT,,,
4,58112,Louisa Community Bank,Louisa,KY,Kentucky Farmers Bank Corporation,2019-10-25,10532.0,COMMERCIAL,4547.0,29726,26473,FAILURE,,,,,,NaT,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
559,32646,"Superior Bank, FSB",Hinsdale,IL,"Superior Federal, FSB",2001-07-27,6004.0,OTHER,286673.0,1765455,1609501,FAILURE,6004:SUPERIOR BANK FSB :7/27/2001,Final,Depositor,0.00648,0.81582,2014-05-21,6004.0,SUPERIOR BANK FSB,7/27/2001
560,6629,Malta National Bank,Malta,OH,North Valley Bank,2001-05-03,4648.0,COMMERCIAL,769.0,9075,8728,FAILURE,,,,,,NaT,,,
561,34264,First Alliance Bank & Trust Co.,Manchester,NH,Southern New Hampshire Bank & Trust,2001-02-02,4647.0,COMMERCIAL,817.0,17438,16931,FAILURE,,,,,,NaT,,,
562,3815,National State Bank of Metropolis,Metropolis,IL,Banterra Bank of Marion,2000-12-14,4646.0,COMMERCIAL,2670.0,90397,71277,FAILURE,4646:THE NATIONAL STATE BANK OF METROPOLIS 1:1...,Final,Depositor,0.00381,0.95109,2005-03-17,4646.0,THE NATIONAL STATE BANK OF METROPOLIS 1,12/14/2000
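The merged output above ends at row label 562, while `bank_merged` ended at 559, which suggests that some `Bank Number` values occur more than once in `bank_dividends`. Unlike VLOOKUP, which returns only the first match, `merge()` returns one row per match. The sketch below shows one possible way to bring across only `Total Paid` and keep one row per bank. The toy frames and their values are hypothetical, and keeping the first duplicate is an assumption rather than the notebook's method.

```python
import pandas as pd

# Hypothetical toy data: key 3.0 appears twice on the right-hand side
banks = pd.DataFrame({'FIN': [1.0, 2.0, 3.0],
                      'Bank Name': ['Bank A', 'Bank B', 'Bank C']})
dividends = pd.DataFrame({'Bank Number': [1.0, 3.0, 3.0],
                          'Total Paid': [0.9, 0.8, 0.7]})

# Keep only the key and the column we want, one row per key (first match wins)
lookup = dividends[['Bank Number', 'Total Paid']].drop_duplicates(subset='Bank Number')

# validate='many_to_one' raises an error if the right-hand keys are not unique
merged = banks.merge(lookup, how='left',
                     left_on='FIN', right_on='Bank Number',
                     validate='many_to_one')
print(merged)
```

With unique keys on the right, the result has exactly as many rows as the left table, which is usually what a VLOOKUP user expects.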


<br><br>
**QUICK CHALLENGE #2:**

**Task: Now use merge() to add the `bx_users` data to `bx_merged` from above**

* Use a left merge (join)
* Keep the `user` and `location` columns from `bx_users`

In [20]:
# Your code for quick challenge #2 here:
bx_merged.merge(bx_users[['user', 'location']], how='left', left_on='user_id', right_on='user')

Unnamed: 0,isbn,book_title,book_author,year_of_publication,publisher,user_id,rating,user,location
0,074322678X,Where You'll Find Me: And Other Stories,Ann Beattie,2002,Scribner,8,5,8,"timmins, ontario, canada"
1,080652121X,Hitler's Secret Bankers: The Myth of Swiss Neu...,Adam Lebor,2000,Citadel Press,8,0,8,"timmins, ontario, canada"
2,1552041778,Jane Doe,R. J. Kaiser,1999,Mira Books,8,5,8,"timmins, ontario, canada"
3,1558746218,A Second Chicken Soup for the Woman's Soul (Ch...,Jack Canfield,1998,Health Communications,8,0,8,"timmins, ontario, canada"
4,1558746218,A Second Chicken Soup for the Woman's Soul (Ch...,Jack Canfield,1998,Health Communications,3363,0,3363,"knoxville, tennessee, usa"
...,...,...,...,...,...,...,...,...,...
196837,3596156904,Amok.,Emmanuel Carrere,2003,"Fischer (Tb.), Frankfurt",274719,0,274719,"wilhering, oberösterreich, austria"
196838,1582380805,Tropical Rainforests: 230 Species in Full Colo...,"Allen M., Ph.D. Young",2001,Golden Guides from St. Martin's Press,275970,0,275970,"pittsburgh, pennsylvania, usa"
196839,1845170423,Cocktail Classics,David Biggs,2004,Connaught,275970,7,275970,"pittsburgh, pennsylvania, usa"
196840,014002803X,Anti Death League,Kingsley Amis,1975,Viking Press,276077,0,276077,"badalona, catalonia, spain"


In [17]:
bx_merged.columns

Index(['isbn', 'book_title', 'book_author', 'year_of_publication', 'publisher',
       'user_id', 'rating'],
      dtype='object')

In [18]:
bx_users.columns

Index(['user', 'location', 'age'], dtype='object')
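Because the key columns have different names (`user_id` and `user`), merging with `left_on`/`right_on` keeps both in the result, as the `user` column in the challenge output shows. Two possible ways to avoid the duplicate key are sketched below on toy data. The `ratings` and `users` frames are hypothetical stand-ins, with the two locations borrowed from the output above.

```python
import pandas as pd

# Hypothetical stand-ins for bx_merged and bx_users
ratings = pd.DataFrame({'user_id': [8, 8, 3363], 'rating': [5, 0, 0]})
users = pd.DataFrame({'user': [8, 3363],
                      'location': ['timmins, ontario, canada',
                                   'knoxville, tennessee, usa']})

# Option 1: merge on differently named keys, then drop the redundant one
dropped = (ratings
           .merge(users, how='left', left_on='user_id', right_on='user')
           .drop(columns='user'))

# Option 2: rename the right-hand key first so a single `on=` is enough
renamed = ratings.merge(users.rename(columns={'user': 'user_id'}),
                        how='left', on='user_id')

print(dropped.equals(renamed))
```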

<br><br>

### VLOOKUP (Merge) on index
---

In [21]:
# Look at bank_list
bank_list.head()

Unnamed: 0,CERT,Bank Name,City,ST,Acquiring Institution,Closing Date
0,14361,The First State Bank,Barboursville,WV,"MVB Bank, Inc.",2020-04-03
1,18265,Ericson State Bank,Ericson,NE,Farmers and Merchants Bank,2020-02-14
2,21111,City National Bank of New Jersey,Newark,NJ,Industrial Bank,2019-11-01
3,58317,Resolute Bank,Maumee,OH,Buckeye State Bank,2019-10-25
4,58112,Louisa Community Bank,Louisa,KY,Kentucky Farmers Bank Corporation,2019-10-25


<br>

In [23]:
# Set CERT as index in "bank_list"
bank_list.set_index('CERT', inplace=True)

# View "bank_list"
bank_list.head()

CERT,Bank Name,City,ST,Acquiring Institution,Closing Date
14361,The First State Bank,Barboursville,WV,"MVB Bank, Inc.",2020-04-03
18265,Ericson State Bank,Ericson,NE,Farmers and Merchants Bank,2020-02-14
21111,City National Bank of New Jersey,Newark,NJ,Industrial Bank,2019-11-01
58317,Resolute Bank,Maumee,OH,Buckeye State Bank,2019-10-25
58112,Louisa Community Bank,Louisa,KY,Kentucky Farmers Bank Corporation,2019-10-25


<br>

In [24]:
# View "bank_detail"
bank_detail.head()

Unnamed: 0,CERT,FIN,CHARTER,ESTIMATED LOSS,ASSETS,DEPOSITS,RESOLUTION
0,14361,10536.0,COMMERCIAL,,152400,139526,FAILURE
1,18265,10535.0,COMMERCIAL,,100879,95159,FAILURE
2,21111,10534.0,COMMERCIAL,2491.0,120574,111234,FAILURE
3,58112,10532.0,COMMERCIAL,4547.0,29726,26473,FAILURE
4,58317,10533.0,OTHER,2188.0,27119,26151,FAILURE


<br>

In [25]:
# Use the index of bank_list (CERT) as the lookup value and match it to the CERT column in bank_detail
bank_list.merge(bank_detail, how='left', left_index=True, right_on='CERT')

Unnamed: 0,Bank Name,City,ST,Acquiring Institution,Closing Date,CERT,FIN,CHARTER,ESTIMATED LOSS,ASSETS,DEPOSITS,RESOLUTION
0,The First State Bank,Barboursville,WV,"MVB Bank, Inc.",2020-04-03,14361,10536.0,COMMERCIAL,,152400,139526,FAILURE
1,Ericson State Bank,Ericson,NE,Farmers and Merchants Bank,2020-02-14,18265,10535.0,COMMERCIAL,,100879,95159,FAILURE
2,City National Bank of New Jersey,Newark,NJ,Industrial Bank,2019-11-01,21111,10534.0,COMMERCIAL,2491.0,120574,111234,FAILURE
4,Resolute Bank,Maumee,OH,Buckeye State Bank,2019-10-25,58317,10533.0,OTHER,2188.0,27119,26151,FAILURE
3,Louisa Community Bank,Louisa,KY,Kentucky Farmers Bank Corporation,2019-10-25,58112,10532.0,COMMERCIAL,4547.0,29726,26473,FAILURE
...,...,...,...,...,...,...,...,...,...,...,...,...
569,"Superior Bank, FSB",Hinsdale,IL,"Superior Federal, FSB",2001-07-27,32646,6004.0,OTHER,286673.0,1765455,1609501,FAILURE
570,Malta National Bank,Malta,OH,North Valley Bank,2001-05-03,6629,4648.0,COMMERCIAL,769.0,9075,8728,FAILURE
571,First Alliance Bank & Trust Co.,Manchester,NH,Southern New Hampshire Bank & Trust,2001-02-02,34264,4647.0,COMMERCIAL,817.0,17438,16931,FAILURE
572,National State Bank of Metropolis,Metropolis,IL,Banterra Bank of Marion,2000-12-14,3815,4646.0,COMMERCIAL,2670.0,90397,71277,FAILURE
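In the output above, the row labels (0, 1, 2, 4, 3, ...) appear to come from `bank_detail` rather than from the `CERT` index of `bank_list`. If you would rather keep `CERT` as the index of the result, one option is to put the key in the index on both sides. The sketch below uses two rows taken from the tables above; the variable names are illustrative only.

```python
import pandas as pd

# Two rows from the tables above; the left frame is indexed by CERT
left = pd.DataFrame({'Bank Name': ['The First State Bank', 'Ericson State Bank']},
                    index=pd.Index([14361, 18265], name='CERT'))
right = pd.DataFrame({'CERT': [18265, 14361], 'ASSETS': [100879, 152400]})

# Index-on-index merge: the left index (CERT) is preserved in the result
by_index = left.merge(right.set_index('CERT'), how='left',
                      left_index=True, right_index=True)

# DataFrame.join is a shorthand for the same index-based left merge
by_join = left.join(right.set_index('CERT'))

print(by_index)
```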
