# Matching Congressional Stock Trades to Committees with String Algorithms

This notebook explores matching Congressional stock trades, via their stock descriptions, to Congressional committee descriptions.

Reference (NLTK stop words): https://pythonspot.com/nltk-stop-words/

----

#### Imports

In [190]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, HTML

In [191]:
# pd.set_option('display.max_columns', None)
# pd.set_option('display.max_rows', None)
# pd.set_option('display.width', None)
# pd.set_option('display.max_colwidth', None)
# pd.set_option('max_seq_item', None)

In [192]:
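# string matching imports
from difflib import SequenceMatcher
# thefuzz is the renamed successor of fuzzywuzzy and exposes the same fuzz/process API,
# so importing both only shadows the first pair of names; thefuzz alone is enough
from thefuzz import fuzz
from thefuzz import process
import textdistance
import jaro
import jellyfish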

In [193]:
#natural language processing imports
import nltk
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.corpus import stopwords
from nltk.corpus import treebank
import string

In [194]:
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('averaged_perceptron_tagger')
nltk.download('maxent_ne_chunker')
nltk.download('words')
nltk.download('treebank')

[nltk_data] Downloading package punkt to /Users/sm/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /Users/sm/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /Users/sm/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package maxent_ne_chunker to
[nltk_data]     /Users/sm/nltk_data...
[nltk_data]   Package maxent_ne_chunker is already up-to-date!
[nltk_data] Downloading package words to /Users/sm/nltk_data...
[nltk_data]   Package words is already up-to-date!
[nltk_data] Downloading package treebank to /Users/sm/nltk_data...
[nltk_data]   Package treebank is already up-to-date!


True

----

### Reading Dataframes

Read in stock trades by members of Congress, enriched with Yahoo Finance stock information.

In [195]:
df_trades = pd.read_csv("..//data//processed//stock_watchers_w_yfinance_03_12_2022.csv", encoding="utf-8")

In [196]:
df_trades.head(1)

Unnamed: 0,transaction_date,disclosure_date,politician,owner,ticker,amount,asset_description,asset_type,transaction_type,comment,...,cap_gains,amount_low,amount_high,ticker2,name,sector,industry,longbusinesssummary,website,stock_description
0,02/24/2022,03/11/2022,Shelley M Capito,Spouse,NEE,1001 - 15000,"NextEra Energy, Inc. Common Stock",Stock,Sale (Partial),--,...,,1001,15000.0,NEE,"NextEra Energy, Inc.",Utilities,Utilities—Regulated Electric,"NextEra Energy, Inc., through its subsidiaries...",https://www.nexteraenergy.com,"Utilities, Utilities—Regulated Electric, NextE..."


In [197]:
df_trades['sector_industry'] = df_trades['sector'] + ' ' + df_trades['industry']

In [198]:
df_trades.head(1)

Unnamed: 0,transaction_date,disclosure_date,politician,owner,ticker,amount,asset_description,asset_type,transaction_type,comment,...,amount_low,amount_high,ticker2,name,sector,industry,longbusinesssummary,website,stock_description,sector_industry
0,02/24/2022,03/11/2022,Shelley M Capito,Spouse,NEE,1001 - 15000,"NextEra Energy, Inc. Common Stock",Stock,Sale (Partial),--,...,1001,15000.0,NEE,"NextEra Energy, Inc.",Utilities,Utilities—Regulated Electric,"NextEra Energy, Inc., through its subsidiaries...",https://www.nexteraenergy.com,"Utilities, Utilities—Regulated Electric, NextE...",Utilities Utilities—Regulated Electric


In [262]:
# df_trades.columns

Read in Congressional committee descriptions extracted from the committees' .gov sites (with a few exceptions).

In [199]:
df_subcomittees = pd.read_csv('..//data//handmade//congress_commitee_descriptions.csv')

In [200]:
df_subcomittees.head(1)

Unnamed: 0,committee,committee_fullname,committee_description,website
0,SSFR09,Africa and Global Health Policy,The subcommittee deals with all matters concer...,https://www.foreign.senate.gov/download/2021-1...


In [201]:
df_subcomittees['committee_description2'] = df_subcomittees['committee_fullname'] + ' ' + df_subcomittees['committee_description']

In [263]:
df_subcomittees.columns

Index(['committee', 'committee_fullname', 'committee_description', 'website',
       'committee_description2', 'committee_description3'],
      dtype='object')

In [202]:
df_subcomittees.head(1)

Unnamed: 0,committee,committee_fullname,committee_description,website,committee_description2
0,SSFR09,Africa and Global Health Policy,The subcommittee deals with all matters concer...,https://www.foreign.senate.gov/download/2021-1...,Africa and Global Health Policy The subcommitt...


In [203]:
df_committee_members = pd.read_csv("..//data//processed//congress_committees.csv", encoding="utf-8")

In [204]:
df_committee_members.head(2)

Unnamed: 0,committee,name,party,rank,bioguide
0,SSAF,Debbie Stabenow,majority,1,S000770
1,SSAF,Patrick J. Leahy,majority,2,L000174


In [279]:
df_committee_members.columns

Index(['committee', 'name', 'party', 'rank', 'bioguide'], dtype='object')

-----

### Cleaning the Stock Description Columns

In [205]:
df_trades['stock_description2'] = df_trades.stock_description
# df.head(1)

In [206]:
df_trades.stock_description2 = df_trades.stock_description2.astype(str).str.lower()

In [207]:
df_trades.stock_description2.head(1)

0    utilities, utilities—regulated electric, nexte...
Name: stock_description2, dtype: object

In [208]:
df_trades.sector_industry = df_trades.sector_industry.astype(str).str.lower()

In [209]:
df_trades.sector_industry.head(1)

0    utilities utilities—regulated electric
Name: sector_industry, dtype: object

Data Notes:

1. Combine the committee full name with the committee description in a new column.
2. Remove duplicate words in each description.
3. Handle word variants (agriculture vs. agricultural).
4. Remove numbers, punctuation, and low-information words such as:

* a
* includes
* deals
* shall
* jurisdiction
* policy
* member
* ranking
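The cleaning steps in these notes can be sketched in plain Python. This is a minimal sketch, not the notebook's implementation: `clean_description` and `EXTRA_STOPS` are hypothetical names, the extra stop words are the ones listed in the notes, and the sample sentence is made up for illustration.

```python
import re

# Hypothetical domain stop words, taken from the list in the notes
EXTRA_STOPS = {"a", "includes", "deals", "shall", "jurisdiction",
               "policy", "member", "ranking"}

def clean_description(fullname, description, extra_stops=EXTRA_STOPS):
    """Combine name and description, then strip noise words and duplicates."""
    text = f"{fullname} {description}".lower()
    # replace digits, punctuation, and any other non a-z character with a space
    text = re.sub(r"[^a-z\s]", " ", text)
    words = [w for w in text.split() if w not in extra_stops]
    # dict.fromkeys keeps the first occurrence of each word, in order
    return " ".join(dict.fromkeys(words))

# made-up sample input
print(clean_description(
    "Africa and Global Health Policy",
    "The subcommittee deals with all matters concerning Africa, 2021.",
))
```

Note that the `[^a-z\s]` pattern also drops accented letters, so it would need widening for text that is not plain ASCII.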

-----

##### Removing Punctuation from description

In [210]:
string.punctuation

'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

In [211]:
df_trades.stock_description2 = df_trades.stock_description2.str.replace('[{}]'.format(string.punctuation), '', regex=True)


In [212]:
df_trades.stock_description2.head(1)

0    utilities utilities—regulated electric nextera...
Name: stock_description2, dtype: object

In [213]:
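# NOTE: Series.replace matches whole cell values, so this call leaves the em dash
# inside the text untouched (see the output below). A substring replacement would be
# df_trades.stock_description2.str.replace('—', ' ', regex=False)
df_trades.stock_description2 = df_trades.stock_description2.replace('—', ' ')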

In [214]:
df_trades.stock_description2.head(1)

0    utilities utilities—regulated electric nextera...
Name: stock_description2, dtype: object
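The em dash survives both steps for two separate reasons: `string.punctuation` only contains ASCII characters, and `Series.replace` without `regex=True` matches whole cell values rather than substrings. The first point can be checked with the standard library alone. The sketch below also shows one way to strip Unicode punctuation; `strip_punctuation` is a hypothetical helper, not part of the notebook.

```python
import string
import unicodedata

sample = "utilities—regulated electric"

# string.punctuation is ASCII-only, so the em dash is not in it
print("—" in string.punctuation)  # False

def strip_punctuation(text):
    """Replace every Unicode punctuation character with a space."""
    return "".join(
        " " if unicodedata.category(ch).startswith("P") else ch
        for ch in text
    )

print(strip_punctuation(sample))  # utilities regulated electric
```

Characters such as `$` or `+` are classed as symbols (category `S`) rather than punctuation, so this helper complements the `string.punctuation` step instead of replacing it.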

In [215]:
df_trades.sector_industry = df_trades.sector_industry.str.replace('[{}]'.format(string.punctuation), '', regex=True)


In [216]:
df_trades.sector_industry.head(1)

0    utilities utilities—regulated electric
Name: sector_industry, dtype: object

##### Removing "Stop Words"

In [217]:
stops = set(stopwords.words('english'))
print(stops)

{'hers', 'against', 'as', 'until', 'mightn', 'my', "wasn't", "hadn't", 'on', 'm', 'wouldn', 'to', 'been', 'where', 'can', 'through', 'didn', 'what', 'and', 'which', "shouldn't", 's', 'yours', 'me', 'just', 'has', 'up', "you'd", "don't", 'than', 'same', 'your', 'before', 'nor', 'with', "that'll", "should've", 'out', 'don', 'herself', 'isn', 'then', "doesn't", 'above', 'so', 'its', 'is', 'd', 'myself', 'haven', 'who', 'below', 'whom', 'aren', 'between', 've', 'during', "needn't", 'ma', 'them', 'ourselves', "you're", 'had', 'do', 'were', 'our', "isn't", 'into', 'mustn', 'shouldn', 'from', 'only', 'by', 'each', 'further', "didn't", "wouldn't", 'him', 'there', 'too', 'his', 'having', 'have', 'no', "couldn't", 'hadn', 'he', 'shan', 'couldn', 'does', 'once', "you've", 'after', 'needn', 'own', "it's", 'ours', 'or', 'such', 'how', 'a', 'they', 'themselves', 'about', 'this', 'will', 'all', "you'll", 'very', 'when', 'should', 'these', 'but', "weren't", 'down', "aren't", 'an', 'itself', 'over', 'w

In [218]:
df_trades['stock_description3'] = df_trades.stock_description2.apply(lambda x: ' '.join([word for word in x.split() if word not in (stops)]))

In [219]:
df_trades.stock_description3.head(1)

0    utilities utilities—regulated electric nextera...
Name: stock_description3, dtype: object

In [220]:
# df.head(2)

##### Removing numbers/digits from descriptions

In [221]:
df_trades.stock_description3 = df_trades.stock_description3.str.replace(r'\d+', '', regex=True)


In [222]:
df_trades.stock_description3.head(1)

0    utilities utilities—regulated electric nextera...
Name: stock_description3, dtype: object
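Deleting digits leaves the surrounding spaces behind, which is why runs of spaces show up in the cleaned descriptions printed later in the notebook (for example `december   company`). A minimal sketch of collapsing them with the standard library; `remove_digits` is a hypothetical helper name.

```python
import re

def remove_digits(text):
    """Delete digit runs, then collapse the leftover whitespace."""
    without_digits = re.sub(r"\d+", "", text)
    return re.sub(r"\s+", " ", without_digits).strip()

print(remove_digits("december 31 2020 company operated"))
# december company operated
```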

In [223]:
# df_trades

In [224]:
# stops2 = stopwords.words('english')
# print(stops2)

In [225]:
# stops2 = stopwords.words('english')

In [226]:
# Consider the word: Antinationalist, Morpheme
# https://www.analyticsvidhya.com/blog/2021/06/part-3-step-by-step-guide-to-nlp-text-cleaning-and-preprocessing/

Save a Copy

In [227]:
# df_trades.to_csv('..//data//processed//stock_watchers_w_yfinance_edited_03_13_2022.csv', index = False)

----

### Cleaning the Committee Description Column

In [228]:
df_subcomittees.committee_description2 = df_subcomittees.committee_description2.astype(str).str.lower()

In [229]:
df_subcomittees.committee_description2.head(1)

0    africa and global health policy the subcommitt...
Name: committee_description2, dtype: object

##### Removing Punctuation from description

In [230]:
df_subcomittees.committee_description2 = df_subcomittees.committee_description2.str.replace('[{}]'.format(string.punctuation), '', regex=True)


In [231]:
df_subcomittees.committee_description2.head(2)

0    africa and global health policy the subcommitt...
1    africa global health and global human rights t...
Name: committee_description2, dtype: object

##### Removing "Stop Words"

In [232]:
df_subcomittees['committee_description3'] = df_subcomittees.committee_description2.apply(lambda x: ' '.join([word for word in x.split() if word not in (stops)]))

In [233]:
df_subcomittees.committee_description3.head(1)

0    africa global health policy subcommittee deals...
Name: committee_description3, dtype: object

In [234]:
# df.stock_description2

##### Removing numbers/digits from descriptions

In [235]:
df_subcomittees.committee_description3 = df_subcomittees.committee_description3.str.replace(r'\d+', '', regex=True)


In [236]:
df_subcomittees.committee_description3.head(1)

0    africa global health policy subcommittee deals...
Name: committee_description3, dtype: object

In [237]:
# df_subcomittees.head(1)

##### Step 3

In [238]:
stops = set(stopwords.words('english'))
# print(stops)

In [239]:
# pat = r'\b(?:{})\b'.format('|'.join(stop))
# test['tweet_without_stopwords'] = test['tweet'].str.replace(pat, '')
# test['tweet_without_stopwords'] = test['tweet_without_stopwords'].str.replace(r'\s+', ' ')
# # Same results.
# # 0              I love car
# # 1       This view amazing
# # 2    I feel great morning
# # 3       I excited concert
# # 4          He best friend
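The commented-out pattern above removes stop words with a single regular expression instead of a per-word list comprehension. A runnable sketch of that idea using only the standard library; the three-word stop list and `remove_stops` are illustrative assumptions, not the NLTK list.

```python
import re

stops = {"the", "and", "with"}  # tiny illustrative stop list

# \b anchors keep "the" from matching inside words such as "other";
# re.escape guards against stop words containing regex metacharacters
pat = re.compile(r"\b(?:{})\b".format("|".join(map(re.escape, sorted(stops)))))

def remove_stops(text):
    """Delete stop words, then collapse the whitespace they leave behind."""
    return re.sub(r"\s+", " ", pat.sub("", text)).strip()

print(remove_stops("africa and global health policy the subcommittee deals with other matters"))
# africa global health policy subcommittee deals other matters
```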

Save a copy

In [240]:
# df_subcomittees.to_csv('..//data//handmade//congress_commitee_descriptions_edited_03_13_22.csv', index = False)

-----

### Merging with algorithm on stock and committee descriptions

In [241]:
# Plan (pseudocode):
# merged = empty_df
# for each trade in df_trades:
#     for each committee in df_subcomittees:
#         # match stock description to committee description (OR ANY OTHER ALGORITHM)
#         flt_s_score = similar(stock_desc, committee_desc)
#         if flt_s_score > 0.6:
#             add this trade + committee description to merged
# drop the rows where the member's committee != the matched committee
# do analysis
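The plan above can be sketched end to end with the standard library. This is an illustration under stated assumptions: `difflib.SequenceMatcher.ratio()` stands in for the matching algorithm because it returns a score between 0 and 1, which fits the 0.6 threshold in the pseudocode, and the two small lists of dictionaries (including the committee codes `X01` and `X02`) are made-up stand-ins for the dataframes.

```python
from difflib import SequenceMatcher

def similar(a, b):
    """Similarity between 0.0 and 1.0 (stand-in for the algorithm TBD)."""
    return SequenceMatcher(None, a, b).ratio()

# made-up rows standing in for df_trades and df_subcomittees
trades = [{"ticker": "NEE", "stock_desc": "utilities regulated electric energy"}]
committees = [
    {"committee": "X01", "committee_desc": "utilities regulated electric power"},
    {"committee": "X02", "committee_desc": "elections"},
]

merged = []
for trade in trades:
    for committee in committees:
        score = similar(trade["stock_desc"], committee["committee_desc"])
        if score > 0.6:
            # keep the trade together with the committee it matched
            merged.append({**trade, **committee, "score": score})

print([row["committee"] for row in merged])
```

The threshold has to follow the scale of the chosen scorer: a 0 to 1 ratio pairs with 0.6, while a 0 to 100 scorer would need 60.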

In [242]:
# relevant columns

# df.stock_description3
# df_subcomittees.committee_description3

In [243]:
# define the similarity algorithm being used
def similar(a, b):
    """Return the fuzz.partial_ratio score for two strings (an integer from 0 to 100)."""
    return fuzz.partial_ratio(a, b)

In [244]:
# checking that similar works
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

91
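The two test sentences contain the same words in a different order, and `fuzz.partial_ratio` is sensitive to that order, which is why the score is 91 rather than 100. When word order should not matter, one common option is to sort the tokens before comparing. The sketch below illustrates the idea with `difflib` from the standard library; it is not thefuzz's implementation, and `token_sort_similarity` is a hypothetical name.

```python
from difflib import SequenceMatcher

def token_sort_similarity(a, b):
    """Compare two strings after sorting their lowercase tokens (0-100 scale)."""
    a_sorted = " ".join(sorted(a.lower().split()))
    b_sorted = " ".join(sorted(b.lower().split()))
    return round(100 * SequenceMatcher(None, a_sorted, b_sorted).ratio())

print(token_sort_similarity("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))  # 100
```

thefuzz provides `fuzz.token_sort_ratio` for the same purpose.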

In [245]:
# for column in df_trades[0:5]:
#     print(df_trades[column].values)

In [246]:
for trade in df_trades.stock_description3[0:1]:
    print(trade)

utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida


In [247]:
# ls_stock_description3 = df_trades.stock_description3.values.tolist()

In [248]:
for committee in df_subcomittees.committee_description3[0:1]:
    print(committee)

africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response


In [249]:
# ls_committee_description3 = df_subcomittees.committee_description3.values.tolist()

In [250]:
# next(df_trades.iterrows())

In [251]:
# Jasen Notes

# merged = empty_df
# for each trade in df_trades:
#     for each committee in df_subcomittees:
#         # match stock description to committee description (OR ANY OTHER ALGORITHM)
#         flt_s_score = similar(stock_desc, committee_desc)
#         if flt_s_score > 0.6:
#             add this trade + committee description to merged
# drop the rows where the member's committee != the matched committee
# do analysis

In [252]:
from itertools import chain

In [253]:
# working for loop!!

# establish an empty list to collect the matched rows
ls_rows = []

# iterate through each row (trade) of the trades dataframe (first 2 rows while testing)
for trade_idx, trade in df_trades[0:2].iterrows():

    # iterate through each committee in the committee dataframe (rows 50-59 while testing)
    # a separate index name avoids overwriting the outer loop's index
    for committee_idx, committee in df_subcomittees[50:60].iterrows():

        # match stock description to committee description with ALGORITHM (which one TBD)
        flt_s_score = similar(trade['stock_description3'], committee['committee_description3'])

        # fuzz.partial_ratio scores run from 0 to 100, so 20 is a low bar
        if flt_s_score > 20:
            # add this trade + committee description to the list of merged rows
            new_row = list(chain(trade, committee))
            ls_rows.append(new_row)



In [254]:
ls_rows[0:5]

[['02/24/2022',
  '03/11/2022',
  'Shelley M Capito',
  'Spouse',
  'NEE',
  '1001 - 15000',
  'NextEra Energy, Inc. Common Stock',
  'Stock',
  'Sale (Partial)',
  '--',
  'https://efdsearch.senate.gov/search/view/ptr/e7893c34-0761-4c2b-ac52-e303f166517f/',
  nan,
  nan,
  '1001',
  15000.0,
  'NEE',
  'NextEra Energy, Inc.',
  'Utilities',
  'Utilities—Regulated Electric',
  'NextEra Energy, Inc., through its subsidiaries, generates, transmits, distributes, and sells electric power to retail and wholesale customers in North America. The company generates electricity through wind, solar, nuclear, and fossil fuel, such as coal and natural gas facilities. It also develops, constructs, and operates long-term contracted assets with a focus on renewable generation facilities, electric transmission facilities, and battery storage projects; and owns, develops, constructs, manages and operates electric generation facilities in wholesale energy markets. As of December 31, 2020, the company ope

In [264]:
merged = pd.DataFrame(ls_rows)
# trade columns first, then committee columns
# note: 'website' appears twice (once per source table)
merged.columns = [
    'transaction_date', 'disclosure_date', 'politician', 'owner', 'ticker',
    'amount', 'asset_description', 'asset_type', 'transaction_type', 'comment',
    'ptr_link', 'location', 'cap_gains', 'amount_low', 'amount_high',
    'ticker2', 'name', 'sector', 'industry', 'longbusinesssummary',
    'website', 'stock_description', 'sector_industry',
    'stock_description2', 'stock_description3',
    'committee', 'committee_fullname', 'committee_description', 'website',
    'committee_description2', 'committee_description3',
]

In [265]:
merged.head(10)

Unnamed: 0,transaction_date,disclosure_date,politician,owner,ticker,amount,asset_description,asset_type,transaction_type,comment,...,stock_description,sector_industry,stock_description2,stock_description3,committee,committee_fullname,committee_description,website,committee_description2,committee_description3
0,02/24/2022,03/11/2022,Shelley M Capito,Spouse,NEE,1001 - 15000,"NextEra Energy, Inc. Common Stock",Stock,Sale (Partial),--,...,"Utilities, Utilities—Regulated Electric, NextE...",utilities utilities—regulated electric,utilities utilities—regulated electric nextera...,utilities utilities—regulated electric nextera...,HSHA08,Elections,The Subcommittee on Elections handles matters ...,https://cha.house.gov/subcommittees/elections-...,elections the subcommittee on elections handle...,elections subcommittee elections handles matte...
1,02/24/2022,03/11/2022,Shelley M Capito,Spouse,NEE,1001 - 15000,"NextEra Energy, Inc. Common Stock",Stock,Sale (Partial),--,...,"Utilities, Utilities—Regulated Electric, NextE...",utilities utilities—regulated electric,utilities utilities—regulated electric nextera...,utilities utilities—regulated electric nextera...,SSHR11,Employment and Workplace Safety,The Subcommittee Chairman is Senator John Hick...,https://www.help.senate.gov/about/subcommittees,employment and workplace safety the subcommitt...,employment workplace safety subcommittee chair...
2,02/24/2022,03/11/2022,Shelley M Capito,Spouse,NEE,1001 - 15000,"NextEra Energy, Inc. Common Stock",Stock,Sale (Partial),--,...,"Utilities, Utilities—Regulated Electric, NextE...",utilities utilities—regulated electric,utilities utilities—regulated electric nextera...,utilities utilities—regulated electric nextera...,HSII06,Energy and Mineral Resources,Energy and mineral resources: Monitoring the d...,https://naturalresources.house.gov/about/the-c...,energy and mineral resources energy and minera...,energy mineral resources energy mineral resour...
3,01/14/2022,02/14/2022,Thomas H Tuberville,Joint,NEE,15001 - 50000,"NextEra Energy, Inc. Common Stock",Stock,Sale (Full),--,...,"Utilities, Utilities—Regulated Electric, NextE...",utilities utilities—regulated electric,utilities utilities—regulated electric nextera...,utilities utilities—regulated electric nextera...,HSHA08,Elections,The Subcommittee on Elections handles matters ...,https://cha.house.gov/subcommittees/elections-...,elections the subcommittee on elections handle...,elections subcommittee elections handles matte...
4,01/14/2022,02/14/2022,Thomas H Tuberville,Joint,NEE,15001 - 50000,"NextEra Energy, Inc. Common Stock",Stock,Sale (Full),--,...,"Utilities, Utilities—Regulated Electric, NextE...",utilities utilities—regulated electric,utilities utilities—regulated electric nextera...,utilities utilities—regulated electric nextera...,SSHR11,Employment and Workplace Safety,The Subcommittee Chairman is Senator John Hick...,https://www.help.senate.gov/about/subcommittees,employment and workplace safety the subcommitt...,employment workplace safety subcommittee chair...
5,01/14/2022,02/14/2022,Thomas H Tuberville,Joint,NEE,15001 - 50000,"NextEra Energy, Inc. Common Stock",Stock,Sale (Full),--,...,"Utilities, Utilities—Regulated Electric, NextE...",utilities utilities—regulated electric,utilities utilities—regulated electric nextera...,utilities utilities—regulated electric nextera...,HSII06,Energy and Mineral Resources,Energy and mineral resources: Monitoring the d...,https://naturalresources.house.gov/about/the-c...,energy and mineral resources energy and minera...,energy mineral resources energy mineral resour...
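The merged frame ends up with two columns named `website`, one from each source table, so `merged['website']` returns both columns at once. One way to avoid this is to make the labels unique before assigning them. A small sketch with a hypothetical helper, `make_unique`, that suffixes repeated names:

```python
def make_unique(names):
    """Append _2, _3, ... to names that have already been seen."""
    seen = {}
    unique = []
    for name in names:
        seen[name] = seen.get(name, 0) + 1
        unique.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return unique

print(make_unique(["ticker", "website", "committee", "website"]))
# ['ticker', 'website', 'committee', 'website_2']
```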


### Matching with Member Committee Assignments

In [278]:
df_committee_members.head(3)

Unnamed: 0,committee,name,party,rank,bioguide
0,SSAF,Debbie Stabenow,majority,1,S000770
1,SSAF,Patrick J. Leahy,majority,2,L000174
2,SSAF,Sherrod Brown,majority,3,B000944
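When matching trades to members, only the members of one committee are relevant at a time, so the membership table can be grouped by committee code once, up front. A minimal sketch using plain dictionaries; the three rows copy the sample shown above, and `members_by_committee` is a hypothetical name.

```python
from collections import defaultdict

# rows copied from the sample of df_committee_members shown above
members = [
    ("SSAF", "Debbie Stabenow"),
    ("SSAF", "Patrick J. Leahy"),
    ("SSAF", "Sherrod Brown"),
]

members_by_committee = defaultdict(list)
for committee, name in members:
    members_by_committee[committee].append(name)

# look up one committee instead of scanning every membership row
print(members_by_committee["SSAF"])
# ['Debbie Stabenow', 'Patrick J. Leahy', 'Sherrod Brown']
print(members_by_committee.get("HSHA08", []))  # [] (not in this sample)
```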


In [258]:
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

91

In [259]:
similar("Thomas H Tuberville", "Tommy Tuberville")

75

In [267]:
similar("Shelley M Capito", "Shelley Moore Capito")

75
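Names differ between the two sources mainly in middle names, initials, and nicknames (`Thomas H` vs. `Tommy`), so a rule-based comparison can complement the fuzzy score. The sketch below treats two names as the same person when the surnames match and the first names share an initial. `same_person`, `name_tokens`, and the suffix list are assumptions for illustration, and the rule would confuse relatives who share a surname and an initial.

```python
import re

SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}  # assumed list of name suffixes

def name_tokens(name):
    """Lowercase a name, strip punctuation, and drop generational suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    return [t for t in tokens if t not in SUFFIXES]

def same_person(a, b):
    """True when surnames match and first names start with the same letter."""
    ta, tb = name_tokens(a), name_tokens(b)
    if not ta or not tb:
        return False
    return ta[-1] == tb[-1] and ta[0][0] == tb[0][0]

print(same_person("Thomas H Tuberville", "Tommy Tuberville"))   # True
print(same_person("Shelley M Capito", "Shelley Moore Capito"))  # True
print(same_person("John Hoeven", "John Thune"))                 # False
```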

In [274]:
# test

# establish an empty list to collect the confirmed rows
ls_rows2 = []

# iterate through each row of the merged dataframe
for row_idx, row in merged.iterrows():
    print(row_idx, row['politician'])
    print(row_idx, row['committee'])

    # iterate through each row of the member committee assignment dataframe
    # a separate index name avoids overwriting the outer loop's index
    for member_idx, member in df_committee_members.iterrows():
        print(member_idx, member['name'])
        print(member_idx, member['committee'])

        # fuzzy-match the politician name on the trade to the committee member name
        name_score = similar(row['politician'], member['name'])
        if row['committee'] == member['committee'] and name_score > 60:
            print('eureka')

            # add this merged row + committee member row to the confirmed list
            new_row2 = list(chain(row, member))
            ls_rows2.append(new_row2)



0 Shelley M Capito
0 HSHA08
0 Debbie Stabenow
0 SSAF
1 Patrick J. Leahy
1 SSAF
2 Sherrod Brown
2 SSAF
3 Amy Klobuchar
3 SSAF
4 Michael F. Bennet
4 SSAF
5 Kirsten E. Gillibrand
5 SSAF
6 Tina Smith
6 SSAF
7 Richard J. Durbin
7 SSAF
8 Cory A. Booker
8 SSAF
9 Ben Ray Luján
9 SSAF
10 Raphael G. Warnock
10 SSAF
1615 Donald Norcross
1615 HSED
1616 Russ Fulcher
1616 HSED
1617 Pramila Jayapal
1617 HSED
1618 Fred Keller
1618 HSED
1619 Joseph D. Morelle
1619 HSED
1620 Gregory F. Murphy
1620 HSED
1621 Susan Wild
1621 HSED
1622 Mariannette Miller-Meeks
1622 HSED
1623 Lucy McBath
1623 HSED
1624 Burgess Owens
1624 HSED
1625 Jahana Hayes
1625 HSED
1626 Bob Good
1626 HSED
1627 Andy Levin
1627 HSED
1628 Lisa C. McClain
1628 HSED
1629 Ilhan Omar
1629 HSED
1630 Diana Harshbarger
1630 HSED
1631 Haley M. Stevens
1631 HSED
1632 Mary E. Miller
1632 HSED


2099 HSSY
2100 Mikie Sherrill
2100 HSSY
2101 Anthony Gonzalez
2101 HSSY
2102 Jamaal Bowman
2102 HSSY
2103 Michael Waltz
2103 HSSY
2104 Melanie A. Stansbury
2104 HSSY
2105 James R. Baird
2105 HSSY
2106 Brad Sherman
2106 HSSY
2107 Daniel Webster
2107 HSSY
2108 Ed Perlmutter
2108 HSSY
2109 Mike Garcia
2109 HSSY
2110 Jerry McNerney
2110 HSSY
2111 Stephanie I. Bice
2111 HSSY
2112 Paul Tonko
2112 HSSY
2113 Young Kim
2113 HSSY
2114 Bill Foster
2114 HSSY
2115 Randy Feenstra
2115 HSSY
2116 Donald Norcross
2116 HSSY
2117 Jake LaTurner
2117 HSSY
2118 Donald S. Beyer, Jr.
2118 HSSY
2119 Carlos A. Gimenez
2119 HSSY
2120 Charlie Crist
2120 HSSY
2121 Jay Obernolte
2121 HSSY
2122 Sean Casten
2122 HSSY
2123 Peter Meijer
2123 HSSY
2124 Conor Lamb
2124 HSSY
2125 Jake Ellzey
2125 HSSY
2126 Deborah K. Ross
2126 HSSY
2127 Gwen Moore
2127 HSSY
2128 Daniel T. Kildee
2128 HSSY
2129 Susan Wild
2129 HSSY
2130 Lizzie Fletcher
2130 HSSY
2131 Mark Takano
2131 HSVR
2132 Mike Bost
2132 HSVR
2133 Julia Brownley
2133 H

2634 Steven Horsford
2634 HSAS29
2635 James R. Langevin
2635 HSAS35
2636 Jim Banks
2636 HSAS35
2637 Rick Larsen
2637 HSAS35
2638 Elise M. Stefanik
2638 HSAS35
2639 Seth Moulton
2639 HSAS35
2640 Mo Brooks
2640 HSAS35
2641 Ro Khanna
2641 HSAS35
2642 Matt Gaetz
2642 HSAS35
2643 William R. Keating
2643 HSAS35
2644 Mike Johnson
2644 HSAS35
2645 Andy Kim
2645 HSAS35
2646 Stephanie I. Bice
2646 HSAS35
2647 Chrissy Houlahan
2647 HSAS35
2648 C. Scott Franklin
2648 HSAS35
2649 Jason Crow
2649 HSAS35
2650 Blake D. Moore
2650 HSAS35
2651 Elissa Slotkin
2651 HSAS35
2652 Pat Fallon
2652 HSAS35
2653 Veronica Escobar
2653 HSAS35
2654 Joseph D. Morelle
2654 HSAS35
2655 Donald S. Beyer, Jr.
2655 HSSY16
2656 Brian Babin
2656 HSSY16
2657 Zoe Lofgren
2657 HSSY16
2658 Mo Brooks
2658 HSSY16
2659 Ami Bera
2659 HSSY16
2660 Bill Posey
2660 HSSY16
2661 Brad Sherman
2661 HSSY16
2662 Daniel Webster
2662 HSSY16
2663 Ed Perlmutter
2663 HSSY16
2664 Young Kim
2664 HSSY16
2665 Charlie Crist
2665 HSSY16
2666 Donald Norc

3224 HSAS25
3225 Marc A. Veasey
3225 HSAS25
3226 Mark E. Green
3226 HSAS25
3227 Stephanie N. Murphy
3227 HSAS25
3228 Ronny Jackson
3228 HSAS25
3229 Steven Horsford
3229 HSAS25
3230 Ruben Gallego
3230 HSAS26
3231 Trent Kelly
3231 HSAS26
3232 Rick Larsen
3232 HSAS26
3233 Austin Scott
3233 HSAS26
3234 Jim Cooper
3234 HSAS26
3235 Sam Graves
3235 HSAS26
3236 William R. Keating
3236 HSAS26
3237 Don Bacon
3237 HSAS26
3238 Filemon Vela
3238 HSAS26
3239 Liz Cheney
3239 HSAS26
3240 Mikie Sherrill
3240 HSAS26
3241 Michael Waltz
3241 HSAS26
3242 Jimmy Panetta
3242 HSAS26
3243 C. Scott Franklin
3243 HSAS26
3244 Stephanie N. Murphy
3244 HSAS26
3245 Joe Neguse
3245 HSII10
3246 Russ Fulcher
3246 HSII10
3247 Gregorio Kilili Camacho Sablan
3247 HSII10
3248 Thomas P. Tiffany
3248 HSII10
3249 Diana DeGette
3249 HSII10
3250 Louie Gohmert
3250 HSII10
3251 Paul Tonko
3251 HSII10
3252 Doug Lamborn
3252 HSII10
3253 Rashida Tlaib
3253 HSII10
3254 Tom McClintock
3254 HSII10
3255 Lori Trahan
3255 HSII10
3256 Jody

3746 Bradley Scott Schneider
3746 HSFA13
3747 Joaquin Castro
3747 HSFA17
3748 Nicole Malliotakis
3748 HSFA17
3749 Sara Jacobs
3749 HSFA17
3750 Darrell Issa
3750 HSFA17
3751 Brad Sherman
3751 HSFA17
3752 Lee M. Zeldin
3752 HSFA17
3753 Ilhan Omar
3753 HSFA17
3754 Claudia Tenney
3754 HSFA17
3755 Chrissy Houlahan
3755 HSFA17
3756 Andy Kim
3756 HSFA17
3757 Lucille Roybal-Allard
3757 HSAP15
3758 Charles J. "Chuck" Fleischmann
3758 HSAP15
3759 Henry Cuellar
3759 HSAP15
3760 Steven M. Palazzo
3760 HSAP15
3761 Lauren Underwood
3761 HSAP15
3762 John H. Rutherford
3762 HSAP15
3763 David E. Price
3763 HSAP15
3764 Ashley Hinson
3764 HSAP15
3765 C. A. Dutch Ruppersberger
3765 HSAP15
3766 Mike Quigley
3766 HSAP15
3767 Pete Aguilar
3767 HSAP15
3768 G. K. Butterfield
3768 HSHA08
3769 Bryan Steil
3769 HSHA08
3770 Pete Aguilar
3770 HSHA08
3771 Teresa Leger Fernandez
3771 HSHA08
3772 Norma J. Torres
3772 HSRU04
3773 Guy Reschenthaler
3773 HSRU04
3774 Ed Perlmutter
3774 HSRU04
3775 Tom Cole
3775 HSRU04
377

235 Lindsey Graham
235 SSAP22
236 John Hoeven
236 SSAP22
237 Cindy Hyde-Smith
237 SSAP22
238 Bill Hagerty
238 SSAP22
239 Chris Van Hollen
239 SSAP23
240 Christopher A. Coons
240 SSAP23
241 Richard J. Durbin
241 SSAP23
242 Joe Manchin, III
242 SSAP23
243 Patrick J. Leahy
243 SSAP23
244 Cindy Hyde-Smith
244 SSAP23
245 Jerry Moran
245 SSAP23
246 John Boozman
246 SSAP23
247 John Kennedy
247 SSAP23
248 Richard C. Shelby
248 SSAP23
249 Jack Reed
249 SSAP08
250 Christopher Murphy
250 SSAP08
251 Martin Heinrich
251 SSAP08
252 Patrick J. Leahy
252 SSAP08
253 Mike Braun
253 SSAP08
254 Richard C. Shelby
254 SSAP08
255 Marco Rubio
255 SSAP08
256 Martin Heinrich
256 SSAP19
257 Brian Schatz
257 SSAP19
258 Jon Tester
258 SSAP19
259 Patty Murray
259 SSAP19
260 Jack Reed
260 SSAP19
261 Tammy Baldwin
261 SSAP19
262 Christopher A. Coons
262 SSAP19
263 Joe Manchin, III
263 SSAP19
264 Patrick J. Leahy
264 SSAP19
265 John Boozman
265 SSAP19
266 Mitch McConnell
266 SSAP19
267 Lisa Murkowski
267 SSAP19
268 Jo

719 Elizabeth Warren
719 SSFI
720 Mike Crapo
720 SSFI
721 Chuck Grassley
721 SSFI
722 John Cornyn
722 SSFI
723 John Thune
723 SSFI
724 Richard Burr
724 SSFI
725 Rob Portman
725 SSFI
726 Patrick J. Toomey
726 SSFI
727 Tim Scott
727 SSFI
728 Bill Cassidy
728 SSFI
729 James Lankford
729 SSFI
730 Steve Daines
730 SSFI
731 Todd Young
731 SSFI
732 Ben Sasse
732 SSFI
733 John Barrasso
733 SSFI
734 Michael F. Bennet
734 SSFI12
735 Thomas R. Carper
735 SSFI12
736 Mark R. Warner
736 SSFI12
737 Sheldon Whitehouse
737 SSFI12
738 Margaret Wood Hassan
738 SSFI12
739 Ron Wyden
739 SSFI12
740 James Lankford
740 SSFI12
741 John Cornyn
741 SSFI12
742 Tim Scott
742 SSFI12
743 John Barrasso
743 SSFI12
744 Steve Daines
744 SSFI12
745 Mike Crapo
745 SSFI12
746 Elizabeth Warren
746 SSFI14
747 Ron Wyden
747 SSFI14
748 Bill Cassidy
748 SSFI14
749 Richard Burr
749 SSFI14
750 Mike Crapo
750 SSFI14
751 Debbie Stabenow
751 SSFI10
752 Robert Menendez
752 SSFI10
753 Thomas R. Carper
753 SSFI10
754 Benjamin L. Cardin

1219 Angus S. King, Jr.
1219 JSPR
1220 Jamie Raskin
1220 JSPR
1221 Alex Padilla
1221 JSPR
1222 Teresa Leger Fernandez
1222 JSPR
1223 Roy Blunt
1223 JSPR
1224 Roger F. Wicker
1224 JSPR
1225 Rodney Davis
1225 JSPR
1226 Barry Loudermilk
1226 JSPR
1227 Ron Wyden
1227 JSTX
1228 Richard E. Neal
1228 JSTX
1229 Debbie Stabenow
1229 JSTX
1230 Lloyd Doggett
1230 JSTX
1231 Maria Cantwell
1231 JSTX
1232 Mike Thompson
1232 JSTX
1233 Mike Crapo
1233 JSTX
1234 Chuck Grassley
1234 JSTX
1235 Kevin Brady
1235 JSTX
1236 Amy Klobuchar
1236 JSLC
1237 Zoe Lofgren
1237 JSLC
1238 Patrick J. Leahy
1238 JSLC
1239 Tim Ryan
1239 JSLC
1240 Mark R. Warner
1240 JSLC
1241 G. K. Butterfield
1241 JSLC
1242 Roy Blunt
1242 JSLC
1243 Richard C. Shelby
1243 JSLC
1244 Rodney Davis
1244 JSLC
1245 Barry Loudermilk
1245 JSLC
1246 Martin Heinrich
1246 JSEC
1247 Donald S. Beyer, Jr.
1247 JSEC
1248 Amy Klobuchar
1248 JSEC
1249 David J. Trone
1249 JSEC
1250 Margaret Wood Hassan
1250 JSEC
1251 Joyce Beatty
1251 JSEC
1252 Mark Kelly

1718 HSGO
1719 Rashida Tlaib
1719 HSGO
1720 Pete Sessions
1720 HSGO
1721 Katie Porter
1721 HSGO
1722 Fred Keller
1722 HSGO
1723 Cori Bush
1723 HSGO
1724 Andy Biggs
1724 HSGO
1725 Danny K. Davis
1725 HSGO
1726 Andrew S. Clyde
1726 HSGO
1727 Debbie Wasserman Schultz
1727 HSGO
1728 Nancy Mace
1728 HSGO
1729 Peter Welch
1729 HSGO
1730 C. Scott Franklin
1730 HSGO
1731 Henry C. "Hank" Johnson, Jr.
1731 HSGO
1732 Jake LaTurner
1732 HSGO
1733 John P. Sarbanes
1733 HSGO
1734 Pat Fallon
1734 HSGO
1735 Jackie Speier
1735 HSGO
1736 Yvette Herrell
1736 HSGO
1737 Robin L. Kelly
1737 HSGO
1738 Byron Donalds
1738 HSGO
1739 Brenda L. Lawrence
1739 HSGO
1740 Mark DeSaulnier
1740 HSGO
1741 Jimmy Gomez
1741 HSGO
1742 Ayanna Pressley
1742 HSGO
1743 Mike Quigley
1743 HSGO
1744 Zoe Lofgren
1744 HSHA
1745 Rodney Davis
1745 HSHA
1746 Jamie Raskin
1746 HSHA
1747 Barry Loudermilk
1747 HSHA
1748 G. K. Butterfield
1748 HSHA
1749 Bryan Steil
1749 HSHA
1750 Pete Aguilar
1750 HSHA
1751 Mary Gay Scanlon
1751 HSHA
1752

2239 Todd Young
2239 SSCM34
2240 Mike Lee
2240 SSCM34
2241 Ron Johnson
2241 SSCM34
2242 Shelley Moore Capito
2242 SSCM34
2243 Rick Scott
2243 SSCM34
2244 Cynthia M. Lummis
2244 SSCM34
2245 Roger F. Wicker
2245 SSCM34
2246 Richard Blumenthal
2246 SSCM35
2247 Amy Klobuchar
2247 SSCM35
2248 Brian Schatz
2248 SSCM35
2249 Edward J. Markey
2249 SSCM35
2250 Tammy Baldwin
2250 SSCM35
2251 Ben Ray Luján
2251 SSCM35
2252 Maria Cantwell
2252 SSCM35
2253 Marsha Blackburn
2253 SSCM35
2254 John Thune
2254 SSCM35
2255 Roy Blunt
2255 SSCM35
2256 Jerry Moran
2256 SSCM35
2257 Todd Young
2257 SSCM35
2258 Mike Lee
2258 SSCM35
2259 Roger F. Wicker
2259 SSCM35
2260 Tammy Baldwin
2260 SSCM36
2261 Richard Blumenthal
2261 SSCM36
2262 Brian Schatz
2262 SSCM36
2263 Edward J. Markey
2263 SSCM36
2264 Gary C. Peters
2264 SSCM36
2265 Ben Ray Luján
2265 SSCM36
2266 Maria Cantwell
2266 SSCM36
2267 Dan Sullivan
2267 SSCM36
2268 Ted Cruz
2268 SSCM36
2269 Deb Fischer
2269 SSCM36
2270 Marsha Blackburn
2270 SSCM36
2271 Tod

2718 HSIF18
2719 Bill Johnson
2719 HSIF18
2720 Janice D. Schakowsky
2720 HSIF18
2721 Markwayne Mullin
2721 HSIF18
2722 John P. Sarbanes
2722 HSIF18
2723 Richard Hudson
2723 HSIF18
2724 Yvette D. Clarke
2724 HSIF18
2725 Earl L. "Buddy" Carter
2725 HSIF18
2726 Raul Ruiz
2726 HSIF18
2727 Jeff Duncan
2727 HSIF18
2728 Scott H. Peters
2728 HSIF18
2729 Gary J. Palmer
2729 HSIF18
2730 Debbie Dingell
2730 HSIF18
2731 John R. Curtis
2731 HSIF18
2732 Nanette Diaz Barragán
2732 HSIF18
2733 Dan Crenshaw
2733 HSIF18
2734 A. Donald McEachin
2734 HSIF18
2735 Lisa Blunt Rochester
2735 HSIF18
2736 Darren Soto
2736 HSIF18
2737 Tom O’Halleran
2737 HSIF18
2738 John B. Larson
2738 HSWM01
2739 Tom Reed
2739 HSWM01
2740 Bill Pascrell, Jr.
2740 HSWM01
2741 Tom Rice
2741 HSWM01
2742 Linda T. Sánchez
2742 HSWM01
2743 Jodey C. Arrington
2743 HSWM01
2744 Brian Higgins
2744 HSWM01
2745 Ron Estes
2745 HSWM01
2746 Steven Horsford
2746 HSWM01
2747 Kevin Hern
2747 HSWM01
2748 Earl Blumenauer
2748 HSWM01
2749 Terri Sewe

3218 HSAS25
3219 Anthony Brown
3219 HSAS25
3220 Scott DesJarlais
3220 HSAS25
3221 Mikie Sherrill
3221 HSAS25
3222 Matt Gaetz
3222 HSAS25
3223 Kaiali’i Kahele
3223 HSAS25
3224 Don Bacon
3224 HSAS25
3225 Marc A. Veasey
3225 HSAS25
3226 Mark E. Green
3226 HSAS25
3227 Stephanie N. Murphy
3227 HSAS25
3228 Ronny Jackson
3228 HSAS25
3229 Steven Horsford
3229 HSAS25
3230 Ruben Gallego
3230 HSAS26
3231 Trent Kelly
3231 HSAS26
3232 Rick Larsen
3232 HSAS26
3233 Austin Scott
3233 HSAS26
3234 Jim Cooper
3234 HSAS26
3235 Sam Graves
3235 HSAS26
3236 William R. Keating
3236 HSAS26
3237 Don Bacon
3237 HSAS26
3238 Filemon Vela
3238 HSAS26
3239 Liz Cheney
3239 HSAS26
3240 Mikie Sherrill
3240 HSAS26
3241 Michael Waltz
3241 HSAS26
3242 Jimmy Panetta
3242 HSAS26
3243 C. Scott Franklin
3243 HSAS26
3244 Stephanie N. Murphy
3244 HSAS26
3245 Joe Neguse
3245 HSII10
3246 Russ Fulcher
3246 HSII10
3247 Gregorio Kilili Camacho Sablan
3247 HSII10
3248 Thomas P. Tiffany
3248 HSII10
3249 Diana DeGette
3249 HSII10
3250 

3718 Juan Vargas
3718 HSBA15
3719 David Kustoff
3719 HSBA15
3720 Al Lawson, Jr.
3720 HSBA15
3721 John Rose
3721 HSBA15
3722 Michael F. Q. San Nicolas
3722 HSBA15
3723 William R. Timmons IV
3723 HSBA15
3724 Sean Casten
3724 HSBA15
3725 Ayanna Pressley
3725 HSBA15
3726 Ritchie Torres
3726 HSBA15
3727 Theodore E. Deutch
3727 HSFA13
3728 Joe Wilson
3728 HSFA13
3729 Gerald E. Connolly
3729 HSFA13
3730 Scott Perry
3730 HSFA13
3731 David N. Cicilline
3731 HSFA13
3732 Adam Kinzinger
3732 HSFA13
3733 Ted Lieu
3733 HSFA13
3734 Lee M. Zeldin
3734 HSFA13
3735 Colin Z. Allred
3735 HSFA13
3736 Brian J. Mast
3736 HSFA13
3737 Tom Malinowski
3737 HSFA13
3738 Tim Burchett
3738 HSFA13
3739 Kathy E. Manning
3739 HSFA13
3740 W. Gregory Steube
3740 HSFA13
3741 William R. Keating
3741 HSFA13
3742 Ronny Jackson
3742 HSFA13
3743 Brad Sherman
3743 HSFA13
3744 Maria Elvira Salazar
3744 HSFA13
3745 Juan Vargas
3745 HSFA13
3746 Bradley Scott Schneider
3746 HSFA13
3747 Joaquin Castro
3747 HSFA17
3748 Nicole Malliot

833 Marco Rubio
833 SSFR
834 Ron Johnson
834 SSFR
835 Mitt Romney
835 SSFR
836 Rob Portman
836 SSFR
837 Rand Paul
837 SSFR
838 Todd Young
838 SSFR
839 John Barrasso
839 SSFR
840 Ted Cruz
840 SSFR
841 Mike Rounds
841 SSFR
842 Bill Hagerty
842 SSFR
843 Chris Van Hollen
843 SSFR09
844 Cory A. Booker
844 SSFR09
845 Tim Kaine
845 SSFR09
846 Jeff Merkley
846 SSFR09
847 Christopher A. Coons
847 SSFR09
848 Robert Menendez
848 SSFR09
849 Mike Rounds
849 SSFR09
850 Marco Rubio
850 SSFR09
851 Todd Young
851 SSFR09
852 John Barrasso
852 SSFR09
853 Rand Paul
853 SSFR09
854 James E. Risch
854 SSFR09
855 Edward J. Markey
855 SSFR02
856 Christopher A. Coons
856 SSFR02
857 Christopher Murphy
857 SSFR02
858 Brian Schatz
858 SSFR02
859 Jeff Merkley
859 SSFR02
860 Robert Menendez
860 SSFR02
861 Mitt Romney
861 SSFR02
862 Ted Cruz
862 SSFR02
863 Ron Johnson
863 SSFR02
864 Mike Rounds
864 SSFR02
865 Bill Hagerty
865 SSFR02
866 James E. Risch
866 SSFR02
867 Jeanne Shaheen
867 SSFR01
868 Benjamin L. Cardin
86

1338 Bobby L. Rush
1338 HSAG
1339 David Rouzer
1339 HSAG
1340 Chellie Pingree
1340 HSAG
1341 Trent Kelly
1341 HSAG
1342 Gregorio Kilili Camacho Sablan
1342 HSAG
1343 Don Bacon
1343 HSAG
1344 Ann Kuster
1344 HSAG
1345 Dusty Johnson
1345 HSAG
1346 Cheri Bustos
1346 HSAG
1347 James R. Baird
1347 HSAG
1348 Sean Patrick Maloney
1348 HSAG
1349 Jim Hagedorn
1349 HSAG
1350 Stacey E. Plaskett
1350 HSAG
1351 Chris Jacobs
1351 HSAG
1352 Tom O’Halleran
1352 HSAG
1353 Troy Balderson
1353 HSAG
1354 Salud O. Carbajal
1354 HSAG
1355 Michael Cloud
1355 HSAG
1356 Ro Khanna
1356 HSAG
1357 Tracey Mann
1357 HSAG
1358 Al Lawson, Jr.
1358 HSAG
1359 Randy Feenstra
1359 HSAG
1360 J. Luis Correa
1360 HSAG
1361 Mary E. Miller
1361 HSAG
1362 Angie Craig
1362 HSAG
1363 Barry Moore
1363 HSAG
1364 Josh Harder
1364 HSAG
1365 Kat Cammack
1365 HSAG
1366 Cynthia Axne
1366 HSAG
1367 Michelle Fischbach
1367 HSAG
1368 Kim Schrier
1368 HSAG
1369 Julia Letlow
1369 HSAG
1370 Jimmy Panetta
1370 HSAG
1371 Ann Kirkpatrick
1371 H

1875 Doug Lamborn
1875 HSII
1876 Jared Huffman
1876 HSII
1877 Robert J. Wittman
1877 HSII
1878 Alan S. Lowenthal
1878 HSII
1879 Tom McClintock
1879 HSII
1880 Ruben Gallego
1880 HSII
1881 Joe Neguse
1881 HSII
1882 Garret Graves
1882 HSII
1883 Mike Levin
1883 HSII
1884 Jody B. Hice
1884 HSII
1885 Katie Porter
1885 HSII
1886 Aumua Amata Coleman Radewagen
1886 HSII
1887 Teresa Leger Fernandez
1887 HSII
1888 Daniel Webster
1888 HSII
1889 Melanie A. Stansbury
1889 HSII
1890 Jenniffer González-Colón
1890 HSII
1891 Nydia M. Velázquez
1891 HSII
1892 Russ Fulcher
1892 HSII
1893 Diana DeGette
1893 HSII
1894 Pete Stauber
1894 HSII
1895 Julia Brownley
1895 HSII
1896 Thomas P. Tiffany
1896 HSII
1897 Debbie Dingell
1897 HSII
1898 Jerry L. Carl
1898 HSII
1899 A. Donald McEachin
1899 HSII
1900 Matthew M. Rosendale, Sr.
1900 HSII
1901 Darren Soto
1901 HSII
1902 Blake D. Moore
1902 HSII
1903 Michael F. Q. San Nicolas
1903 HSII
1904 Yvette Herrell
1904 HSII
1905 Jesús G. "Chuy" García
1905 HSII
1906 Laure

2456 Bruce Westerman
2456 HSPW12
2457 Greg Stanton
2457 HSPW12
2458 Mike Gallagher
2458 HSPW12
2459 Colin Z. Allred
2459 HSPW12
2460 Brian K. Fitzpatrick
2460 HSPW12
2461 Jesús G. "Chuy" García
2461 HSPW12
2462 Jenniffer González-Colón
2462 HSPW12
2463 Antonio Delgado
2463 HSPW12
2464 Troy Balderson
2464 HSPW12
2465 Chris Pappas
2465 HSPW12
2466 Pete Stauber
2466 HSPW12
2467 Conor Lamb
2467 HSPW12
2468 Tim Burchett
2468 HSPW12
2469 Jake Auchincloss
2469 HSPW12
2470 Dusty Johnson
2470 HSPW12
2471 Carolyn Bourdeaux
2471 HSPW12
2472 Michael Guest
2472 HSPW12
2473 Marilyn Strickland
2473 HSPW12
2474 Troy E. Nehls
2474 HSPW12
2475 Grace F. Napolitano
2475 HSPW12
2476 Nancy Mace
2476 HSPW12
2477 Jared Huffman
2477 HSPW12
2478 Nicole Malliotakis
2478 HSPW12
2479 Salud O. Carbajal
2479 HSPW12
2480 Beth Van Duyne
2480 HSPW12
2481 Sharice Davids
2481 HSPW12
2482 Carlos A. Gimenez
2482 HSPW12
2483 Seth Moulton
2483 HSPW12
2484 Michelle Steel
2484 HSPW12
2485 Kaiali’i Kahele
2485 HSPW12
2486 Sam G

2962 HSPW02
2963 David Rouzer
2963 HSPW02
2964 Jared Huffman
2964 HSPW02
2965 Daniel Webster
2965 HSPW02
2966 Eddie Bernice Johnson
2966 HSPW02
2967 John Katko
2967 HSPW02
2968 John Garamendi
2968 HSPW02
2969 Brian Babin
2969 HSPW02
2970 Alan S. Lowenthal
2970 HSPW02
2971 Garret Graves
2971 HSPW02
2972 Tom Malinowski
2972 HSPW02
2973 Mike Bost
2973 HSPW02
2974 Antonio Delgado
2974 HSPW02
2975 Randy K. Weber, Sr.
2975 HSPW02
2976 Chris Pappas
2976 HSPW02
2977 Doug LaMalfa
2977 HSPW02
2978 Carolyn Bourdeaux
2978 HSPW02
2979 Bruce Westerman
2979 HSPW02
2980 Frederica S. Wilson
2980 HSPW02
2981 Brian J. Mast
2981 HSPW02
2982 Salud O. Carbajal
2982 HSPW02
2983 Jenniffer González-Colón
2983 HSPW02
2984 Greg Stanton
2984 HSPW02
2985 Nancy Mace
2985 HSPW02
2986 Eleanor Holmes Norton
2986 HSPW02
2987 Sam Graves
2987 HSPW02
2988 Steve Cohen
2988 HSPW02
2989 Peter A. DeFazio
2989 HSPW02
2990 Julia Brownley
2990 HSVR03
2991 Jack Bergman
2991 HSVR03
2992 Conor Lamb
2992 HSVR03
2993 Aumua Amata Cole

3493 Jim Banks
3493 HSED02
3494 Susan Wild
3494 HSED02
3495 Diana Harshbarger
3495 HSED02
3496 Lucy McBath
3496 HSED02
3497 Mary E. Miller
3497 HSED02
3498 Andy Levin
3498 HSED02
3499 Scott Fitzgerald
3499 HSED02
3500 Haley M. Stevens
3500 HSED02
3501 Frank J. Mrvan
3501 HSED02
3502 Raja Krishnamoorthi
3502 HSGO05
3503 Michael Cloud
3503 HSGO05
3504 Katie Porter
3504 HSGO05
3505 Fred Keller
3505 HSGO05
3506 Cori Bush
3506 HSGO05
3507 C. Scott Franklin
3507 HSGO05
3508 Jackie Speier
3508 HSGO05
3509 Andrew S. Clyde
3509 HSGO05
3510 Henry C. "Hank" Johnson, Jr.
3510 HSGO05
3511 Byron Donalds
3511 HSGO05
3512 Mark DeSaulnier
3512 HSGO05
3513 Ayanna Pressley
3513 HSGO05
3514 Jamie Raskin
3514 HSRU05
3515 Michelle Fischbach
3515 HSRU05
3516 Deborah K. Ross
3516 HSRU05
3517 Tom Cole
3517 HSRU05
3518 Norma J. Torres
3518 HSRU05
3519 Mark DeSaulnier
3519 HSRU05
3520 James P. McGovern
3520 HSRU05
3521 Barbara Lee
3521 HSAP04
3522 Harold Rogers
3522 HSAP04
3523 Grace Meng
3523 HSAP04
3524 Mario 

82 Michael F. Bennet
82 SSAF15
83 Richard J. Durbin
83 SSAF15
84 Debbie Stabenow
84 SSAF15
85 Joni Ernst
85 SSAF15
86 Mitch McConnell
86 SSAF15
87 Tommy Tuberville
87 SSAF15
88 Chuck Grassley
88 SSAF15
89 Deb Fischer
89 SSAF15
90 Mike Braun
90 SSAF15
91 John Boozman
91 SSAF15
92 Patrick J. Leahy
92 SSAP
93 Patty Murray
93 SSAP
94 Dianne Feinstein
94 SSAP
95 Richard J. Durbin
95 SSAP
96 Jack Reed
96 SSAP
97 Jon Tester
97 SSAP
98 Jeanne Shaheen
98 SSAP
99 Jeff Merkley
99 SSAP
100 Christopher A. Coons
100 SSAP
101 Brian Schatz
101 SSAP
102 Tammy Baldwin
102 SSAP
103 Christopher Murphy
103 SSAP
104 Joe Manchin, III
104 SSAP
105 Chris Van Hollen
105 SSAP
106 Martin Heinrich
106 SSAP
107 Richard C. Shelby
107 SSAP
108 Mitch McConnell
108 SSAP
109 Susan M. Collins
109 SSAP
110 Lisa Murkowski
110 SSAP
111 Lindsey Graham
111 SSAP
112 Roy Blunt
112 SSAP
113 Jerry Moran
113 SSAP
114 John Hoeven
114 SSAP
115 John Boozman
115 SSAP
116 Shelley Moore Capito
116 SSAP
117 John Kennedy
117 SSAP
118 Cind

663 Benjamin L. Cardin
663 SSEV15
664 Sheldon Whitehouse
664 SSEV15
665 Edward J. Markey
665 SSEV15
666 Debbie Stabenow
666 SSEV15
667 Mark Kelly
667 SSEV15
668 Thomas R. Carper
668 SSEV15
669 Cynthia M. Lummis
669 SSEV15
670 James M. Inhofe
670 SSEV15
671 Kevin Cramer
671 SSEV15
672 John Boozman
672 SSEV15
673 Dan Sullivan
673 SSEV15
674 Joni Ernst
674 SSEV15
675 Shelley Moore Capito
675 SSEV15
676 Jeff Merkley
676 SSEV09
677 Bernard Sanders
677 SSEV09
678 Edward J. Markey
678 SSEV09
679 Mark Kelly
679 SSEV09
680 Alex Padilla
680 SSEV09
681 Thomas R. Carper
681 SSEV09
682 Roger F. Wicker
682 SSEV09
683 Richard C. Shelby
683 SSEV09
684 Dan Sullivan
684 SSEV09
685 Joni Ernst
685 SSEV09
686 Lindsey Graham
686 SSEV09
687 Shelley Moore Capito
687 SSEV09
688 Benjamin L. Cardin
688 SSEV08
689 Bernard Sanders
689 SSEV08
690 Sheldon Whitehouse
690 SSEV08
691 Jeff Merkley
691 SSEV08
692 Tammy Duckworth
692 SSEV08
693 Debbie Stabenow
693 SSEV08
694 Mark Kelly
694 SSEV08
695 Alex Padilla
695 SSEV

1206 SSVA
1207 Margaret Wood Hassan
1207 SSVA
1208 Jerry Moran
1208 SSVA
1209 John Boozman
1209 SSVA
1210 Bill Cassidy
1210 SSVA
1211 Mike Rounds
1211 SSVA
1212 Thom Tillis
1212 SSVA
1213 Dan Sullivan
1213 SSVA
1214 Marsha Blackburn
1214 SSVA
1215 Kevin Cramer
1215 SSVA
1216 Tommy Tuberville
1216 SSVA
1217 Amy Klobuchar
1217 JSPR
1218 Zoe Lofgren
1218 JSPR
1219 Angus S. King, Jr.
1219 JSPR
1220 Jamie Raskin
1220 JSPR
1221 Alex Padilla
1221 JSPR
1222 Teresa Leger Fernandez
1222 JSPR
1223 Roy Blunt
1223 JSPR
1224 Roger F. Wicker
1224 JSPR
1225 Rodney Davis
1225 JSPR
1226 Barry Loudermilk
1226 JSPR
1227 Ron Wyden
1227 JSTX
1228 Richard E. Neal
1228 JSTX
1229 Debbie Stabenow
1229 JSTX
1230 Lloyd Doggett
1230 JSTX
1231 Maria Cantwell
1231 JSTX
1232 Mike Thompson
1232 JSTX
1233 Mike Crapo
1233 JSTX
1234 Chuck Grassley
1234 JSTX
1235 Kevin Brady
1235 JSTX
1236 Amy Klobuchar
1236 JSLC
1237 Zoe Lofgren
1237 JSLC
1238 Patrick J. Leahy
1238 JSLC
1239 Tim Ryan
1239 JSLC
1240 Mark R. Warner
1240 JS

1775 Bonnie Watson Coleman
1775 HSHM
1776 Jake LaTurner
1776 HSHM
1777 Kathleen M. Rice
1777 HSHM
1778 Peter Meijer
1778 HSHM
1779 Val Butler Demings
1779 HSHM
1780 Kat Cammack
1780 HSHM
1781 Nanette Diaz Barragán
1781 HSHM
1782 August Pfluger
1782 HSHM
1783 Josh Gottheimer
1783 HSHM
1784 Andrew R. Garbarino
1784 HSHM
1785 Elaine G. Luria
1785 HSHM
1786 Tom Malinowski
1786 HSHM
1787 Ritchie Torres
1787 HSHM
1788 Frank Pallone, Jr.
1788 HSIF
1789 Cathy McMorris Rodgers
1789 HSIF
1790 Bobby L. Rush
1790 HSIF
1791 Fred Upton
1791 HSIF
1792 Anna G. Eshoo
1792 HSIF
1793 Michael C. Burgess
1793 HSIF
1794 Diana DeGette
1794 HSIF
1795 Steve Scalise
1795 HSIF
1796 Michael F. Doyle
1796 HSIF
1797 Robert E. Latta
1797 HSIF
1798 Janice D. Schakowsky
1798 HSIF
1799 Brett Guthrie
1799 HSIF
1800 G. K. Butterfield
1800 HSIF
1801 David B. McKinley
1801 HSIF
1802 Doris O. Matsui
1802 HSIF
1803 Adam Kinzinger
1803 HSIF
1804 Kathy Castor
1804 HSIF
1805 H. Morgan Griffith
1805 HSIF
1806 John P. Sarbanes
18

2331 SSJU28
2332 Amy Klobuchar
2332 SSJU28
2333 Mazie K. Hirono
2333 SSJU28
2334 Jon Ossoff
2334 SSJU28
2335 Ben Sasse
2335 SSJU28
2336 Lindsey Graham
2336 SSJU28
2337 Josh Hawley
2337 SSJU28
2338 John Kennedy
2338 SSJU28
2339 Marsha Blackburn
2339 SSJU28
2340 Jared Huffman
2340 HSII13
2341 Cliff Bentz
2341 HSII13
2342 Grace F. Napolitano
2342 HSII13
2343 Jerry L. Carl
2343 HSII13
2344 Jim Costa
2344 HSII13
2345 Don Young
2345 HSII13
2346 Mike Levin
2346 HSII13
2347 Robert J. Wittman
2347 HSII13
2348 Julia Brownley
2348 HSII13
2349 Tom McClintock
2349 HSII13
2350 Debbie Dingell
2350 HSII13
2351 Garret Graves
2351 HSII13
2352 Ed Case
2352 HSII13
2353 Aumua Amata Coleman Radewagen
2353 HSII13
2354 Alan S. Lowenthal
2354 HSII13
2355 Daniel Webster
2355 HSII13
2356 Steve Cohen
2356 HSII13
2357 Jenniffer González-Colón
2357 HSII13
2358 Darren Soto
2358 HSII13
2359 Russ Fulcher
2359 HSII13
2360 Raúl M. Grijalva
2360 HSII13
2361 Lauren Boebert
2361 HSII13
2362 Nydia M. Velázquez
2362 HSII13
2

2913 Brad Sherman
2913 HSBA16
2914 Bill Huizenga
2914 HSBA16
2915 Carolyn B. Maloney
2915 HSBA16
2916 Ann Wagner
2916 HSBA16
2917 David Scott
2917 HSBA16
2918 J. Hill
2918 HSBA16
2919 James A. Himes
2919 HSBA16
2920 Tom Emmer
2920 HSBA16
2921 Bill Foster
2921 HSBA16
2922 Alexander Mooney
2922 HSBA16
2923 Gregory W. Meeks
2923 HSBA16
2924 Warren Davidson
2924 HSBA16
2925 Juan Vargas
2925 HSBA16
2926 Trey Hollingsworth
2926 HSBA16
2927 Josh Gottheimer
2927 HSBA16
2928 Anthony Gonzalez
2928 HSBA16
2929 Vicente Gonzalez
2929 HSBA16
2930 Bryan Steil
2930 HSBA16
2931 Michael F. Q. San Nicolas
2931 HSBA16
2932 Van Taylor
2932 HSBA16
2933 Cynthia Axne
2933 HSBA16
2934 Sean Casten
2934 HSBA16
2935 Emanuel Cleaver
2935 HSBA16
2936 David E. Price
2936 HSAP20
2937 Mario Diaz-Balart
2937 HSAP20
2938 Mike Quigley
2938 HSAP20
2939 Steve Womack
2939 HSAP20
2940 Katherine M. Clark
2940 HSAP20
2941 John H. Rutherford
2941 HSAP20
2942 Bonnie Watson Coleman
2942 HSAP20
2943 Mike Garcia
2943 HSAP20
2944 No

3456 Sean Casten
3456 HSSY20
3457 Conor Lamb
3457 HSSY20
3458 Deborah K. Ross
3458 HSSY20
3459 Chellie Pingree
3459 HSAP06
3460 David P. Joyce
3460 HSAP06
3461 Betty McCollum
3461 HSAP06
3462 Michael K. Simpson
3462 HSAP06
3463 Derek Kilmer
3463 HSAP06
3464 Chris Stewart
3464 HSAP06
3465 Josh Harder
3465 HSAP06
3466 Mark E. Amodei
3466 HSAP06
3467 Susie Lee
3467 HSAP06
3468 Marcy Kaptur
3468 HSAP06
3469 Matt Cartwright
3469 HSAP06
3470 Rosa L. DeLauro
3470 HSAP07
3471 Tom Cole
3471 HSAP07
3472 Lucille Roybal-Allard
3472 HSAP07
3473 Andy Harris
3473 HSAP07
3474 Barbara Lee
3474 HSAP07
3475 Charles J. "Chuck" Fleischmann
3475 HSAP07
3476 Mark Pocan
3476 HSAP07
3477 Jaime Herrera Beutler
3477 HSAP07
3478 Katherine M. Clark
3478 HSAP07
3479 John R. Moolenaar
3479 HSAP07
3480 Lois Frankel
3480 HSAP07
3481 Ben Cline
3481 HSAP07
3482 Cheri Bustos
3482 HSAP07
3483 Bonnie Watson Coleman
3483 HSAP07
3484 Brenda L. Lawrence
3484 HSAP07
3485 Josh Harder
3485 HSAP07
3486 Mark DeSaulnier
3486 HSED02

16 Roger Marshall
16 SSAF
17 Tommy Tuberville
17 SSAF
18 Chuck Grassley
18 SSAF
19 John Thune
19 SSAF
20 Deb Fischer
20 SSAF
21 Mike Braun
21 SSAF
22 Raphael G. Warnock
22 SSAF13
23 Sherrod Brown
23 SSAF13
24 Richard J. Durbin
24 SSAF13
25 Tina Smith
25 SSAF13
26 Kirsten E. Gillibrand
26 SSAF13
27 Ben Ray Luján
27 SSAF13
28 Debbie Stabenow
28 SSAF13
29 John Hoeven
29 SSAF13
30 Mitch McConnell
30 SSAF13
31 Cindy Hyde-Smith
31 SSAF13
32 Tommy Tuberville
32 SSAF13
33 Chuck Grassley
33 SSAF13
34 John Thune
34 SSAF13
35 John Boozman
35 SSAF13
36 Michael F. Bennet
36 SSAF14
37 Patrick J. Leahy
37 SSAF14
38 Cory A. Booker
38 SSAF14
39 Ben Ray Luján
39 SSAF14
40 Sherrod Brown
40 SSAF14
41 Amy Klobuchar
41 SSAF14
42 Debbie Stabenow
42 SSAF14
43 Roger Marshall
43 SSAF14
44 John Hoeven
44 SSAF14
45 Cindy Hyde-Smith
45 SSAF14
46 Tommy Tuberville
46 SSAF14
47 John Thune
47 SSAF14
48 Mike Braun
48 SSAF14
49 John Boozman
49 SSAF14
50 Kirsten E. Gillibrand
50 SSAF17
51 Patrick J. Leahy
51 SSAF17
52 Ti

493 Jon Tester
493 SSBK05
494 Kyrsten Sinema
494 SSBK05
495 Jon Ossoff
495 SSBK05
496 Sherrod Brown
496 SSBK05
497 Bill Hagerty
497 SSBK05
498 Mike Crapo
498 SSBK05
499 John Kennedy
499 SSBK05
500 Steve Daines
500 SSBK05
501 Patrick J. Toomey
501 SSBK05
502 Robert Menendez
502 SSBK04
503 Jack Reed
503 SSBK04
504 Mark R. Warner
504 SSBK04
505 Elizabeth Warren
505 SSBK04
506 Catherine Cortez Masto
506 SSBK04
507 Tina Smith
507 SSBK04
508 Kyrsten Sinema
508 SSBK04
509 Raphael G. Warnock
509 SSBK04
510 Sherrod Brown
510 SSBK04
511 Tim Scott
511 SSBK04
512 Richard C. Shelby
512 SSBK04
513 Mike Crapo
513 SSBK04
514 Mike Rounds
514 SSBK04
515 Thom Tillis
515 SSBK04
516 John Kennedy
516 SSBK04
517 Cynthia M. Lummis
517 SSBK04
518 Jerry Moran
518 SSBK04
519 Patrick J. Toomey
519 SSBK04
520 Maria Cantwell
520 SSCM
521 Amy Klobuchar
521 SSCM
522 Richard Blumenthal
522 SSCM
523 Brian Schatz
523 SSCM
524 Edward J. Markey
524 SSCM
525 Gary C. Peters
525 SSCM
526 Tammy Baldwin
526 SSCM
527 Tammy Duck

966 Tammy Baldwin
966 SSHR11
967 Tina Smith
967 SSHR11
968 Jacky Rosen
968 SSHR11
969 Ben Ray Luján
969 SSHR11
970 Patty Murray
970 SSHR11
971 Mike Braun
971 SSHR11
972 Tommy Tuberville
972 SSHR11
973 Rand Paul
973 SSHR11
974 Tim Scott
974 SSHR11
975 Mitt Romney
975 SSHR11
976 Richard Burr
976 SSHR11
977 Bernard Sanders
977 SSHR12
978 Robert P. Casey, Jr.
978 SSHR12
979 Tammy Baldwin
979 SSHR12
980 Christopher Murphy
980 SSHR12
981 Tim Kaine
981 SSHR12
982 Margaret Wood Hassan
982 SSHR12
983 Jacky Rosen
983 SSHR12
984 Ben Ray Luján
984 SSHR12
985 Patty Murray
985 SSHR12
986 Susan M. Collins
986 SSHR12
987 Rand Paul
987 SSHR12
988 Lisa Murkowski
988 SSHR12
989 Roger Marshall
989 SSHR12
990 Tim Scott
990 SSHR12
991 Jerry Moran
991 SSHR12
992 Bill Cassidy
992 SSHR12
993 Mike Braun
993 SSHR12
994 Richard Burr
994 SSHR12
995 Gary C. Peters
995 SSGA
996 Thomas R. Carper
996 SSGA
997 Margaret Wood Hassan
997 SSGA
998 Kyrsten Sinema
998 SSGA
999 Jacky Rosen
999 SSGA
1000 Alex Padilla
1000 SSGA

1538 Madeleine Dean
1538 HSBA
1539 Alexandria Ocasio-Cortez
1539 HSBA
1540 Jesús G. "Chuy" García
1540 HSBA
1541 Sylvia R. Garcia
1541 HSBA
1542 Nikema Williams
1542 HSBA
1543 Jake Auchincloss
1543 HSBA
1544 John A. Yarmuth
1544 HSBU
1545 Jason Smith
1545 HSBU
1546 Hakeem S. Jeffries
1546 HSBU
1547 Trent Kelly
1547 HSBU
1548 Brian Higgins
1548 HSBU
1549 Tom McClintock
1549 HSBU
1550 Brendan F. Boyle
1550 HSBU
1551 Glenn Grothman
1551 HSBU
1552 Lloyd Doggett
1552 HSBU
1553 Lloyd Smucker
1553 HSBU
1554 David E. Price
1554 HSBU
1555 Chris Jacobs
1555 HSBU
1556 Janice D. Schakowsky
1556 HSBU
1557 Michael C. Burgess
1557 HSBU
1558 Daniel T. Kildee
1558 HSBU
1559 Earl L. "Buddy" Carter
1559 HSBU
1560 Joseph D. Morelle
1560 HSBU
1561 Ben Cline
1561 HSBU
1562 Steven Horsford
1562 HSBU
1563 Ashley Hinson
1563 HSBU
1564 Barbara Lee
1564 HSBU
1565 Bob Good
1565 HSBU
1566 Judy Chu
1566 HSBU
1567 Byron Donalds
1567 HSBU
1568 Stacey E. Plaskett
1568 HSBU
1569 Jay Obernolte
1569 HSBU
1570 Jennifer We

2037 Nikema Williams
2037 HSPW
2038 Marie Newman
2038 HSPW
2039 Troy A. Carter
2039 HSPW
2040 James P. McGovern
2040 HSRU
2041 Tom Cole
2041 HSRU
2042 Norma J. Torres
2042 HSRU
2043 Michael C. Burgess
2043 HSRU
2044 Ed Perlmutter
2044 HSRU
2045 Guy Reschenthaler
2045 HSRU
2046 Jamie Raskin
2046 HSRU
2047 Michelle Fischbach
2047 HSRU
2048 Mary Gay Scanlon
2048 HSRU
2049 Joseph D. Morelle
2049 HSRU
2050 Mark DeSaulnier
2050 HSRU
2051 Deborah K. Ross
2051 HSRU
2052 Joe Neguse
2052 HSRU
2053 Nydia M. Velázquez
2053 HSSM
2054 Blaine Luetkemeyer
2054 HSSM
2055 Jared F. Golden
2055 HSSM
2056 Jim Hagedorn
2056 HSSM
2057 Jason Crow
2057 HSSM
2058 Pete Stauber
2058 HSSM
2059 Sharice Davids
2059 HSSM
2060 Roger Williams
2060 HSSM
2061 Kweisi Mfume
2061 HSSM
2062 Daniel Meuser
2062 HSSM
2063 Dean Phillips
2063 HSSM
2064 Claudia Tenney
2064 HSSM
2065 Marie Newman
2065 HSSM
2066 Andrew R. Garbarino
2066 HSSM
2067 Carolyn Bourdeaux
2067 HSSM
2068 Young Kim
2068 HSSM
2069 Troy A. Carter
2069 HSSM
2070

2566 Raul Ruiz
2566 HSVR09
2567 Mike Levin
2567 HSVR10
2568 Barry Moore
2568 HSVR10
2569 Chris Pappas
2569 HSVR10
2570 Tracey Mann
2570 HSVR10
2571 Anthony Brown
2571 HSVR10
2572 Nancy Mace
2572 HSVR10
2573 David J. Trone
2573 HSVR10
2574 Madison Cawthorn
2574 HSVR10
2575 Ruben Gallego
2575 HSVR10
2576 Sanford D. Bishop, Jr.
2576 HSAP01
2577 Chellie Pingree
2577 HSAP01
2578 Robert B. Aderholt
2578 HSAP01
2579 Mark Pocan
2579 HSAP01
2580 Andy Harris
2580 HSAP01
2581 Lauren Underwood
2581 HSAP01
2582 David G. Valadao
2582 HSAP01
2583 Barbara Lee
2583 HSAP01
2584 John R. Moolenaar
2584 HSAP01
2585 Betty McCollum
2585 HSAP01
2586 Dan Newhouse
2586 HSAP01
2587 Debbie Wasserman Schultz
2587 HSAP01
2588 Henry Cuellar
2588 HSAP01
2589 Grace Meng
2589 HSAP01
2590 Betty McCollum
2590 HSAP02
2591 Ken Calvert
2591 HSAP02
2592 Tim Ryan
2592 HSAP02
2593 Harold Rogers
2593 HSAP02
2594 C. A. Dutch Ruppersberger
2594 HSAP02
2595 Tom Cole
2595 HSAP02
2596 Marcy Kaptur
2596 HSAP02
2597 Steve Womack
2597 

2996 Frank J. Mrvan
2996 HSVR03
2997 Gregory F. Murphy
2997 HSVR03
2998 Gregorio Kilili Camacho Sablan
2998 HSVR03
2999 Matthew M. Rosendale, Sr.
2999 HSVR03
3000 Lauren Underwood
3000 HSVR03
3001 Mariannette Miller-Meeks
3001 HSVR03
3002 Colin Z. Allred
3002 HSVR03
3003 Lois Frankel
3003 HSVR03
3004 Chris Pappas
3004 HSVR08
3005 Tracey Mann
3005 HSVR08
3006 Conor Lamb
3006 HSVR08
3007 Aumua Amata Coleman Radewagen
3007 HSVR08
3008 Elaine G. Luria
3008 HSVR08
3009 Jack Bergman
3009 HSVR08
3010 Lauren Underwood
3010 HSVR08
3011 Michael F. Doyle
3011 HSIF16
3012 Robert E. Latta
3012 HSIF16
3013 Jerry McNerney
3013 HSIF16
3014 Steve Scalise
3014 HSIF16
3015 Yvette D. Clarke
3015 HSIF16
3016 Brett Guthrie
3016 HSIF16
3017 Marc A. Veasey
3017 HSIF16
3018 Adam Kinzinger
3018 HSIF16
3019 A. Donald McEachin
3019 HSIF16
3020 Gus M. Bilirakis
3020 HSIF16
3021 Darren Soto
3021 HSIF16
3022 Bill Johnson
3022 HSIF16
3023 Tom O’Halleran
3023 HSIF16
3024 Billy Long
3024 HSIF16
3025 Kathleen M. Rice
30

3489 Joe Wilson
3489 HSED02
3490 Donald Norcross
3490 HSED02
3491 Tim Walberg
3491 HSED02
3492 Joseph D. Morelle
3492 HSED02
3493 Jim Banks
3493 HSED02
3494 Susan Wild
3494 HSED02
3495 Diana Harshbarger
3495 HSED02
3496 Lucy McBath
3496 HSED02
3497 Mary E. Miller
3497 HSED02
3498 Andy Levin
3498 HSED02
3499 Scott Fitzgerald
3499 HSED02
3500 Haley M. Stevens
3500 HSED02
3501 Frank J. Mrvan
3501 HSED02
3502 Raja Krishnamoorthi
3502 HSGO05
3503 Michael Cloud
3503 HSGO05
3504 Katie Porter
3504 HSGO05
3505 Fred Keller
3505 HSGO05
3506 Cori Bush
3506 HSGO05
3507 C. Scott Franklin
3507 HSGO05
3508 Jackie Speier
3508 HSGO05
3509 Andrew S. Clyde
3509 HSGO05
3510 Henry C. "Hank" Johnson, Jr.
3510 HSGO05
3511 Byron Donalds
3511 HSGO05
3512 Mark DeSaulnier
3512 HSGO05
3513 Ayanna Pressley
3513 HSGO05
3514 Jamie Raskin
3514 HSRU05
3515 Michelle Fischbach
3515 HSRU05
3516 Deborah K. Ross
3516 HSRU05
3517 Tom Cole
3517 HSRU05
3518 Norma J. Torres
3518 HSRU05
3519 Mark DeSaulnier
3519 HSRU05
3520 Jame

In [277]:
ls_rows2[0:2]

[['01/14/2022',
  '02/14/2022',
  'Thomas H Tuberville',
  'Joint',
  'NEE',
  '15001 - 50000',
  'NextEra Energy, Inc. Common Stock',
  'Stock',
  'Sale (Full)',
  '--',
  'https://efdsearch.senate.gov/search/view/ptr/c9da6bea-fa14-4a3a-9d8b-1745e834da59/',
  nan,
  nan,
  '15001',
  50000.0,
  'NEE',
  'NextEra Energy, Inc.',
  'Utilities',
  'Utilities—Regulated Electric',
  'NextEra Energy, Inc., through its subsidiaries, generates, transmits, distributes, and sells electric power to retail and wholesale customers in North America. The company generates electricity through wind, solar, nuclear, and fossil fuel, such as coal and natural gas facilities. It also develops, constructs, and operates long-term contracted assets with a focus on renewable generation facilities, electric transmission facilities, and battery storage projects; and owns, develops, constructs, manages and operates electric generation facilities in wholesale energy markets. As of December 31, 2020, the company op

In [280]:
edited = pd.DataFrame(ls_rows2)
edited.columns =['transaction_date', 'disclosure_date', 'politician', 'owner', 'ticker', 'amount', 'asset_description', 'asset_type', 'transaction_type', 'comment', 'ptr_link', 'location', 'cap_gains', 'amount_low', 'amount_high', 'ticker2', 'name', 'sector', 'industry', 'longbusinesssummary', 'website', 'stock_description','sector_industry', 'stock_description2', 'stock_description3', 'committee', 'committee_fullname', 'committee_description', 'website','committee_description2', 'committee_description3', 'committee', 'name', 'party', 'rank', 'bioguide']
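The column list above repeats three names (`name`, `website`, `committee`), and with repeated labels `edited['committee']` selects two columns rather than one. A minimal sketch of how the repeats could be detected and disambiguated follows; the `_2` suffix scheme is an assumption for illustration, not what the notebook does.

```python
from collections import Counter

# Copy of the column names assigned to `edited` above
columns = ['transaction_date', 'disclosure_date', 'politician', 'owner', 'ticker',
           'amount', 'asset_description', 'asset_type', 'transaction_type', 'comment',
           'ptr_link', 'location', 'cap_gains', 'amount_low', 'amount_high', 'ticker2',
           'name', 'sector', 'industry', 'longbusinesssummary', 'website',
           'stock_description', 'sector_industry', 'stock_description2',
           'stock_description3', 'committee', 'committee_fullname',
           'committee_description', 'website', 'committee_description2',
           'committee_description3', 'committee', 'name', 'party', 'rank', 'bioguide']

# Names that occur more than once
counts = Counter(columns)
duplicates = [name for name, n in counts.items() if n > 1]
print(duplicates)

# Hypothetical fix: keep the first occurrence, suffix later ones with their count
seen = Counter()
unique_columns = []
for name in columns:
    seen[name] += 1
    unique_columns.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
print(len(unique_columns), len(set(unique_columns)))
```

Assigning `unique_columns` instead of the original list would make each committee and name column addressable on its own.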

In [281]:
edited.head(10)

Unnamed: 0,transaction_date,disclosure_date,politician,owner,ticker,amount,asset_description,asset_type,transaction_type,comment,...,committee_fullname,committee_description,website,committee_description2,committee_description3,committee,name,party,rank,bioguide
0,01/14/2022,02/14/2022,Thomas H Tuberville,Joint,NEE,15001 - 50000,"NextEra Energy, Inc. Common Stock",Stock,Sale (Full),--,...,Employment and Workplace Safety,The Subcommittee Chairman is Senator John Hick...,https://www.help.senate.gov/about/subcommittees,employment and workplace safety the subcommitt...,employment workplace safety subcommittee chair...,SSHR11,Tommy Tuberville,minority,2,T000278


In [68]:
# drop the rows where member committee != ticker committee
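The filter described in this comment could look like the sketch below. It uses toy rows and hypothetical keys `ticker_committee` and `member_committee`, since the real frame has two columns that are both called `committee`.

```python
# Toy rows only; the keys "ticker_committee" and "member_committee" are hypothetical
rows = [
    {"politician": "Member A", "ticker": "AAA",
     "ticker_committee": "SSHR11", "member_committee": "SSHR11"},
    {"politician": "Member A", "ticker": "AAA",
     "ticker_committee": "HSIF16", "member_committee": "SSHR11"},
    {"politician": "Member B", "ticker": "BBB",
     "ticker_committee": "HSBU", "member_committee": "HSBU"},
]

# Keep a row only when the committee matched to the stock is one the member sits on
kept = [r for r in rows if r["ticker_committee"] == r["member_committee"]]

for r in kept:
    print(r["politician"], r["ticker"], r["member_committee"])
```

With pandas, and assuming the two committee columns had been renamed to those hypothetical names, the same filter would be a boolean mask: `edited[edited["ticker_committee"] == edited["member_committee"]]`.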


In [69]:
# do analysis (edited) 

-----

### Fine-tuning the Algorithms

Exploring how different string-similarity algorithms score the stock description columns against the committee description columns.
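The cells in this section redefine `similar` for each algorithm and rerun the same pairs by hand. A small harness that scores labelled pairs with several scorers at once is one way to compare them side by side. This is only a sketch: the pair texts are shortened excerpts of the notebook's strings, and `shared_token_fraction` is a hypothetical extra scorer that does not appear in the notebook.

```python
from difflib import SequenceMatcher

def seq_ratio(a, b):
    """Character-level similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a, b).ratio()

def shared_token_fraction(a, b):
    """Hypothetical scorer: share of the smaller word set found in the other text."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / min(len(tokens_a), len(tokens_b))

# (expected label, stock text, committee text); shortened excerpts for illustration
pairs = [
    ("should match", "technology consumer electronics",
     "cybersecurity infrastructure protection innovation science technology directorate"),
    ("shouldn't match", "technology consumer electronics",
     "africa global health policy subcommittee"),
]

scorers = {"seq_ratio": seq_ratio, "shared_token_fraction": shared_token_fraction}

def score_pairs(pairs, scorers):
    """Return one dict per pair holding the expected label and every scorer's value."""
    rows = []
    for label, stock_text, committee_text in pairs:
        row = {"expected": label}
        for name, scorer in scorers.items():
            row[name] = scorer(stock_text, committee_text)
        rows.append(row)
    return rows

results = score_pairs(pairs, scorers)
for row in results:
    print(row)
```

A scorer is useful for this task only if its "should match" rows score clearly above its "shouldn't match" rows, which is the property the cells below check one algorithm at a time.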

##### Simple Ratios

In [70]:
# difflib.SequenceMatcher.ratio(): character-level similarity between 0.0 and 1.0
def similar(a, b):
    return SequenceMatcher(None, a, b).ratio()

In [71]:
# high match
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

0.9090909090909091
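To see where this number comes from: `SequenceMatcher.ratio()` is defined as `2 * M / T`, where `M` is the number of matched characters and `T` is the combined length of both strings. The sketch below recomputes it from the matching blocks for the same pair.

```python
from difflib import SequenceMatcher

a = "fuzzy wuzzy was a bear"
b = "wuzzy fuzzy was a bear"

sm = SequenceMatcher(None, a, b)

# get_matching_blocks() ends with a zero-length sentinel, which adds nothing to the sum
matched = sum(block.size for block in sm.get_matching_blocks())
total = len(a) + len(b)

manual_ratio = 2 * matched / total
print(matched, total, manual_ratio)
```

Because the score is driven by aligned character runs, two long descriptions on the same topic but with different wording can still share very few matched characters, which fits the very low values in the cells that follow.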

In [72]:
# should match
similar("utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

0.017437961099932932

In [73]:
# should match
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

0.01625072547881602

In [74]:
# should match using industry/sector and committee description
similar("technology consumer electronics", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

0.05188679245283019

In [75]:
# should match using industry/sector and committee description
similar("utilities utilities—regulated electric", "All matters relating to energy research, development, and demonstration projects therefor; commercial application of energy technology; Department of Energy research, development, and demonstration programs; Department of Energy laboratories; Department of Energy science activities; Department of Energy international research, development, and demonstration projects; energy supply activities; nuclear, solar, and renewable energy, and other advanced energy technologies; uranium supply and enrichment, and Department of Energy waste management; Department of Energy environmental management research, development, and demonstration; fossil energy research and development; clean coal technology; energy conservation research and development, including building performance, alternate fuels, distributed power systems, and industrial process improvements; pipeline research, development, and demonstration projects; energy standards; other appropriate matters as referred by the Chair; and relevant oversight.") 

0.005719733079122974

In [76]:
# shouldn't match 
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

0.015608740894901144

In [77]:
# fuzz.ratio: whole-string similarity as an integer from 0 to 100
def similar(a, b):
    return fuzz.ratio(a, b)

In [78]:
# high match
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

91
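The scores recorded in this group of cells equal the earlier `SequenceMatcher` values multiplied by 100 and rounded (0.909 becomes 91, 0.017 becomes 2, and so on). The stdlib-only sketch below reproduces that scaling. It is an illustration, not the library's source: depending on the installed version, `fuzz.ratio` may use a different backend and can differ slightly on some inputs.

```python
from difflib import SequenceMatcher

def ratio_0_100(a, b):
    """SequenceMatcher ratio rescaled to an integer between 0 and 100."""
    return round(100 * SequenceMatcher(None, a, b).ratio())

score = ratio_0_100("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
print(score)
```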

In [79]:
# should match
similar("utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

2

In [80]:
# should match
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

2

In [81]:
# should match using industry/sector and committee description
similar("technology consumer electronics", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

5

In [82]:
# should match using industry/sector and committee description
similar("utilities utilities—regulated electric", "All matters relating to energy research, development, and demonstration projects therefor; commercial application of energy technology; Department of Energy research, development, and demonstration programs; Department of Energy laboratories; Department of Energy science activities; Department of Energy international research, development, and demonstration projects; energy supply activities; nuclear, solar, and renewable energy, and other advanced energy technologies; uranium supply and enrichment, and Department of Energy waste management; Department of Energy environmental management research, development, and demonstration; fossil energy research and development; clean coal technology; energy conservation research and development, including building performance, alternate fuels, distributed power systems, and industrial process improvements; pipeline research, development, and demonstration projects; energy standards; other appropriate matters as referred by the Chair; and relevant oversight.") 

1

In [83]:
# shouldn't match 
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

2

In [85]:
# fuzz.token_set_ratio: compares the sets of words, ignoring word order and repeats
def similar(a, b):
    return fuzz.token_set_ratio(a, b)

In [86]:
# high match
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

100
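The reordered sentence scores 100 here because a token-set comparison ignores word order and repeated words. The sketch below is a simplified, stdlib-only approximation of that idea: split into word sets, sort, then compare the shared words against each side's shared-plus-unique words. The function name and the plain `lower().split()` tokenisation are assumptions; the library's own preprocessing and scoring may differ.

```python
from difflib import SequenceMatcher

def _ratio(a, b):
    """SequenceMatcher ratio rescaled to an integer between 0 and 100."""
    return round(100 * SequenceMatcher(None, a, b).ratio())

def token_set_ratio_sketch(a, b):
    """Simplified token-set score: order and duplicate words do not matter."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0
    common = " ".join(sorted(tokens_a & tokens_b))
    only_a = " ".join(sorted(tokens_a - tokens_b))
    only_b = " ".join(sorted(tokens_b - tokens_a))
    combined_a = f"{common} {only_a}".strip()
    combined_b = f"{common} {only_b}".strip()
    # Take the best of the three pairwise comparisons
    return max(_ratio(common, combined_a),
               _ratio(common, combined_b),
               _ratio(combined_a, combined_b))

print(token_set_ratio_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))
```

Since shared words are compared first, a short sector string that shares even one word with a committee description can score well above its character-level ratio.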

In [87]:
# should match
similar("utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

4

In [88]:
# should match
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

17

In [89]:
# should match using industry/sector and committee description
similar("technology consumer electronics", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

49

In [90]:
# should match using industry/sector and committee description
similar("utilities utilities—regulated electric", "All matters relating to energy research, development, and demonstration projects therefor; commercial application of energy technology; Department of Energy research, development, and demonstration programs; Department of Energy laboratories; Department of Energy science activities; Department of Energy international research, development, and demonstration projects; energy supply activities; nuclear, solar, and renewable energy, and other advanced energy technologies; uranium supply and enrichment, and Department of Energy waste management; Department of Energy environmental management research, development, and demonstration; fossil energy research and development; clean coal technology; energy conservation research and development, including building performance, alternate fuels, distributed power systems, and industrial process improvements; pipeline research, development, and demonstration projects; energy standards; other appropriate matters as referred by the Chair; and relevant oversight.") 

0

In [91]:
# shouldn't match 
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

9

##### Partial Ratio

In [92]:
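# fuzz.partial_ratio: scores the best-matching substring of the longer string against the shorter one
def similar(a, b):
    return fuzz.partial_ratio(a, b)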

In [93]:
# high match
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

91
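Both strings have the same length here, so there is only one possible alignment and the partial score equals the plain ratio of 91. The idea behind a partial ratio can be sketched as a brute-force sliding window: compare the shorter string with every same-length slice of the longer one and keep the best score. This is a simplified illustration with an assumed function name; the library selects candidate windows more efficiently and may return slightly different values.

```python
from difflib import SequenceMatcher

def partial_ratio_sketch(a, b):
    """Best 0-100 score of the shorter string against any same-length window of the longer."""
    shorter, longer = sorted((a, b), key=len)
    if not shorter:
        return 0
    best = 0.0
    # Slide a window the size of the shorter string across the longer string
    for start in range(len(longer) - len(shorter) + 1):
        window = longer[start:start + len(shorter)]
        best = max(best, SequenceMatcher(None, shorter, window).ratio())
    return round(100 * best)

print(partial_ratio_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))
print(partial_ratio_sketch("electric", "regulated electric utilities"))
```

This favours a short sector string paired with a long committee description, since only the best-aligned slice of the description counts toward the score.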

In [94]:
# should match
similar("utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

14

In [95]:
# should match
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

4

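Note: the partial ratio is much higher when only the industry/sector text is compared with the committee description (61 and 45 in the next two cells, versus 4 to 18 with the full company descriptions).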

In [96]:
# should match using industry/sector and committee description
similar("technology consumer electronics", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

61

In [97]:
# should match using industry/sector and committee description
similar("utilities utilities—regulated electric", "All matters relating to energy research, development, and demonstration projects therefor; commercial application of energy technology; Department of Energy research, development, and demonstration programs; Department of Energy laboratories; Department of Energy science activities; Department of Energy international research, development, and demonstration projects; energy supply activities; nuclear, solar, and renewable energy, and other advanced energy technologies; uranium supply and enrichment, and Department of Energy waste management; Department of Energy environmental management research, development, and demonstration; fossil energy research and development; clean coal technology; energy conservation research and development, including building performance, alternate fuels, distributed power systems, and industrial process improvements; pipeline research, development, and demonstration projects; energy standards; other appropriate matters as referred by the Chair; and relevant oversight.") 

45

In [98]:
# shouldn't match 
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

18

##### Token Sort Ratio

In [99]:
def similar(a,b):
    return fuzz.token_sort_ratio(a, b)

In [100]:
# high match
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

100
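
The score of 100 here follows from how token sorting works: both strings contain the same words, so they become identical once the words are sorted. A minimal standard-library sketch of that idea is shown below. The name `token_sort_ratio_sketch` is hypothetical, and the preprocessing (lowercasing and whitespace splitting) is a simplifying assumption rather than what `fuzz.token_sort_ratio` does internally.

```python
from difflib import SequenceMatcher


def token_sort_ratio_sketch(a: str, b: str) -> int:
    """Compare two strings after sorting their whitespace-separated tokens."""
    # Sorting removes word-order differences before the character comparison.
    sorted_a = " ".join(sorted(a.lower().split()))
    sorted_b = " ".join(sorted(b.lower().split()))
    ratio = SequenceMatcher(None, sorted_a, sorted_b, autojunk=False).ratio()
    return round(ratio * 100)


print(token_sort_ratio_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))
```

Sorting only helps when the two strings share most of their words. For long descriptions with little vocabulary in common, the sorted strings still differ almost everywhere.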

In [101]:
# should match
similar("utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

2

In [102]:
# should match
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

2

In [103]:
# should match
similar("technology consumer electronics", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

1

In [104]:
# should match
similar("utilities utilities—regulated electric", "All matters relating to energy research, development, and demonstration projects therefor; commercial application of energy technology; Department of Energy research, development, and demonstration programs; Department of Energy laboratories; Department of Energy science activities; Department of Energy international research, development, and demonstration projects; energy supply activities; nuclear, solar, and renewable energy, and other advanced energy technologies; uranium supply and enrichment, and Department of Energy waste management; Department of Energy environmental management research, development, and demonstration; fossil energy research and development; clean coal technology; energy conservation research and development, including building performance, alternate fuels, distributed power systems, and industrial process improvements; pipeline research, development, and demonstration projects; energy standards; other appropriate matters as referred by the Chair; and relevant oversight.") 

1

In [105]:
# shouldn't match 
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

1

##### Token Set Ratio

In [106]:
def similar(a,b):
    return fuzz.token_set_ratio(a, b)

In [107]:
# high match
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

100
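
The token set approach can be sketched as follows, assuming the commonly described recipe: split each string into a set of tokens, build one string from the shared tokens and one per input from the shared tokens plus that input's remaining tokens, then keep the best pairwise score. This is an illustration built on `difflib`, not the library's code; `token_set_ratio_sketch` and `_ratio` are hypothetical names.

```python
from difflib import SequenceMatcher


def _ratio(x: str, y: str) -> int:
    """Character-level similarity on a 0-100 scale."""
    return round(SequenceMatcher(None, x, y, autojunk=False).ratio() * 100)


def token_set_ratio_sketch(a: str, b: str) -> int:
    """Compare the shared tokens with each string's shared-plus-unique tokens."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    common = " ".join(sorted(tokens_a & tokens_b))
    only_a = " ".join(sorted(tokens_a - tokens_b))
    only_b = " ".join(sorted(tokens_b - tokens_a))
    combined_a = (common + " " + only_a).strip()
    combined_b = (common + " " + only_b).strip()
    # Keep the best of the three pairwise comparisons.
    pairs = [(common, combined_a), (common, combined_b), (combined_a, combined_b)]
    return max(_ratio(x, y) for x, y in pairs)


print(token_set_ratio_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))
```

In this sketch, duplicate words are collapsed by the set, and a string whose tokens are a subset of the other's scores 100 because the shared-token string equals its combined string.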

In [108]:
# should match
similar("utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

4

In [109]:
# should match
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

17

In [110]:
# should match
similar("technology consumer electronics", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

49

In [111]:
# should match
similar("utilities utilities—regulated electric", "All matters relating to energy research, development, and demonstration projects therefor; commercial application of energy technology; Department of Energy research, development, and demonstration programs; Department of Energy laboratories; Department of Energy science activities; Department of Energy international research, development, and demonstration projects; energy supply activities; nuclear, solar, and renewable energy, and other advanced energy technologies; uranium supply and enrichment, and Department of Energy waste management; Department of Energy environmental management research, development, and demonstration; fossil energy research and development; clean coal technology; energy conservation research and development, including building performance, alternate fuels, distributed power systems, and industrial process improvements; pipeline research, development, and demonstration projects; energy standards; other appropriate matters as referred by the Chair; and relevant oversight.") 

0

In [112]:
# shouldn't match 
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

9

##### Hamming Distance (counting the positions where the strings differ)

In [113]:
textdistance.hamming.normalized_similarity('arrow', 'arow')

0.4
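
The 0.4 above can be reproduced by hand. The sketch below assumes that strings of unequal length are compared position by position, with the missing positions of the shorter string counted as mismatches. That assumption is consistent with the output above, but the helper names are hypothetical and this is not the `textdistance` implementation.

```python
from itertools import zip_longest


def hamming_distance_sketch(a: str, b: str) -> int:
    """Count the positions at which the two strings differ."""
    # zip_longest pads the shorter string with None, which never equals a character.
    return sum(ca != cb for ca, cb in zip_longest(a, b))


def hamming_similarity_sketch(a: str, b: str) -> float:
    """1 minus the distance divided by the length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1 - hamming_distance_sketch(a, b) / longest


# 'arrow' vs 'arow': positions 3, 4 and 5 differ, so distance 3 and similarity 1 - 3/5.
print(hamming_distance_sketch("arrow", "arow"))
print(hamming_similarity_sketch("arrow", "arow"))
```

Because the comparison is strictly positional, dropping a single letter shifts every later character out of alignment. That is why Hamming similarity collapses on long texts of different lengths, even when they share many words.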

In [114]:
# raw Hamming distance: the number of positions at which the two strings differ (lower means more similar)
def similar(a,b):
    return textdistance.hamming(a, b)

In [115]:
# high match
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

2

In [116]:
# should match
similar("utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

855

In [117]:
# should match
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

1305

In [118]:
# should match
similar("technology consumer electronics", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

391

In [119]:
# should match
similar("utilities utilities—regulated electric", "All matters relating to energy research, development, and demonstration projects therefor; commercial application of energy technology; Department of Energy research, development, and demonstration programs; Department of Energy laboratories; Department of Energy science activities; Department of Energy international research, development, and demonstration projects; energy supply activities; nuclear, solar, and renewable energy, and other advanced energy technologies; uranium supply and enrichment, and Department of Energy waste management; Department of Energy environmental management research, development, and demonstration; fossil energy research and development; clean coal technology; energy conservation research and development, including building performance, alternate fuels, distributed power systems, and industrial process improvements; pipeline research, development, and demonstration projects; energy standards; other appropriate matters as referred by the Chair; and relevant oversight.") 

1009

In [120]:
# shouldn't match 
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

1295

In [121]:
# normalized Hamming similarity: 1 - (differing positions / length of the longer string), from 0 to 1
def similar(a,b):
    return textdistance.hamming.normalized_similarity(a, b)

In [122]:
# high match
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

0.9090909090909091

In [123]:
# should match
similar("utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

0.04894327030033374

In [124]:
# should match
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

0.018796992481203034

In [125]:
# should match
similar("technology consumer electronics", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

0.0050890585241730735

In [126]:
# should match
similar("utilities utilities—regulated electric", "All matters relating to energy research, development, and demonstration projects therefor; commercial application of energy technology; Department of Energy research, development, and demonstration programs; Department of Energy laboratories; Department of Energy science activities; Department of Energy international research, development, and demonstration projects; energy supply activities; nuclear, solar, and renewable energy, and other advanced energy technologies; uranium supply and enrichment, and Department of Energy waste management; Department of Energy environmental management research, development, and demonstration; fossil energy research and development; clean coal technology; energy conservation research and development, including building performance, alternate fuels, distributed power systems, and industrial process improvements; pipeline research, development, and demonstration projects; energy standards; other appropriate matters as referred by the Chair; and relevant oversight.") 

0.001978239366963397

In [127]:
# shouldn't match 
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

0.02631578947368418

##### Levenshtein Distance

In [128]:
# number of single-character edits (insert, delete, substitute) needed to turn one string into the other
textdistance.levenshtein('arrow', 'arow')

1
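
To make the edit count above concrete, here is a minimal pure-Python sketch of the usual dynamic-programming formulation of Levenshtein distance. The function name `levenshtein` is chosen here for illustration; this is not the code `textdistance` runs internally, only the standard textbook recurrence.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance computed row by row (illustrative sketch)."""
    # previous[j] is the distance between the prefix of `a` handled so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (free if equal)
            ))
        previous = current
    return previous[-1]


print(levenshtein("arrow", "arow"))  # 1
print(levenshtein("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))  # 2
```

Because the result is a raw count, it grows with the length of the inputs, which is worth keeping in mind when comparing short sector labels against long committee descriptions.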

In [129]:
# raw edit distance: lower means more similar (0 = identical strings),
# and the value grows with the length of the inputs
def similar(a, b):
    return textdistance.levenshtein(a, b)

In [130]:
# high match
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

2

In [131]:
# should match
similar("utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

639

In [132]:
# should match
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

1069

In [133]:
# should match
similar("technology consumer electronics", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

362

In [134]:
# should match
similar("utilities utilities—regulated electric", "All matters relating to energy research, development, and demonstration projects therefor; commercial application of energy technology; Department of Energy research, development, and demonstration programs; Department of Energy laboratories; Department of Energy science activities; Department of Energy international research, development, and demonstration projects; energy supply activities; nuclear, solar, and renewable energy, and other advanced energy technologies; uranium supply and enrichment, and Department of Energy waste management; Department of Energy environmental management research, development, and demonstration; fossil energy research and development; clean coal technology; energy conservation research and development, including building performance, alternate fuels, distributed power systems, and industrial process improvements; pipeline research, development, and demonstration projects; energy standards; other appropriate matters as referred by the Chair; and relevant oversight.") 

975

In [135]:
# shouldn't match 
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

1004

In [136]:
textdistance.levenshtein.normalized_similarity('arrow', 'arow')

0.8
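
The 0.8 above is consistent with dividing the edit distance by the length of the longer string and subtracting from 1, i.e. 1 - 1/5. The sketch below reproduces that arithmetic in plain Python; the helper names are made up for illustration, and the normalisation by the longer length is an assumption that matches the outputs shown here rather than a statement about `textdistance` internals.

```python
def levenshtein(a: str, b: str) -> int:
    # compact row-by-row edit distance (illustrative)
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    # assumed normalisation: 1 - distance / length of the longer string
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical here
    return 1 - levenshtein(a, b) / longest


print(normalized_similarity("arrow", "arow"))
print(normalized_similarity("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))
```

On this scale 1.0 means identical strings and values near 0 mean almost every character has to change, so scores become comparable across pairs of different lengths.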

In [137]:
# similarity on a 0-1 scale: 1.0 means identical strings, higher means more similar
def similar(a, b):
    return textdistance.levenshtein.normalized_similarity(a, b)

In [138]:
# high match
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

0.9090909090909091

In [139]:
# should match
similar("utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

0.289210233592881

In [140]:
# should match
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

0.19624060150375944

In [141]:
# should match
similar("technology consumer electronics", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

0.07888040712468192

In [142]:
# should match
similar("utilities utilities—regulated electric", "All matters relating to energy research, development, and demonstration projects therefor; commercial application of energy technology; Department of Energy research, development, and demonstration programs; Department of Energy laboratories; Department of Energy science activities; Department of Energy international research, development, and demonstration projects; energy supply activities; nuclear, solar, and renewable energy, and other advanced energy technologies; uranium supply and enrichment, and Department of Energy waste management; Department of Energy environmental management research, development, and demonstration; fossil energy research and development; clean coal technology; energy conservation research and development, including building performance, alternate fuels, distributed power systems, and industrial process improvements; pipeline research, development, and demonstration projects; energy standards; other appropriate matters as referred by the Chair; and relevant oversight.") 

0.035608308605341255

In [143]:
# shouldn't match 
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

0.24511278195488717

##### Jaccard Index 

The Jaccard index is the number of tokens the two inputs share, divided by the number of distinct tokens across both inputs: |A ∩ B| / |A ∪ B|.

"We first tokenize the string by default space delimiter, to make words in the strings as tokens. Then we compute the similarity score."

In [144]:
tokens_1 = "hello world".split()
tokens_2 = "world hello".split()

In [145]:
textdistance.jaccard(tokens_1 , tokens_2)

1.0

In [146]:
tokens_1 = "hello new world".split()
tokens_2 = "hello world".split()

In [147]:
textdistance.jaccard(tokens_1 , tokens_2)

0.6666666666666666
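
The two token examples above can be checked by hand. The following is a minimal sketch of a token-level Jaccard index built on `collections.Counter`, so repeated tokens are counted as a multiset; `jaccard_tokens` is a name invented for this illustration, and returning 1.0 for two empty inputs is a choice made here.

```python
from collections import Counter


def jaccard_tokens(a: str, b: str) -> float:
    """Jaccard index over whitespace-separated tokens (illustrative sketch)."""
    ca, cb = Counter(a.split()), Counter(b.split())
    shared = sum((ca & cb).values())  # tokens present in both inputs
    total = sum((ca | cb).values())   # tokens present in either input
    return shared / total if total else 1.0


print(jaccard_tokens("hello world", "world hello"))      # 1.0
print(jaccard_tokens("hello new world", "hello world"))  # 0.6666666666666666
```

Word order does not matter for this measure, which is why the first pair scores 1.0.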

In [148]:
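# note: a and b are passed as raw strings, so with textdistance's default settings
# the overlap is measured over characters, not over whitespace-separated tokens
def similar(a, b):
    return textdistance.jaccard(a, b)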

In [149]:
# high match
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

1.0

In [150]:
# should match
similar("utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

0.6493362831858407

In [151]:
# should match
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

0.28967065868263475

In [152]:
# should match
similar("technology consumer electronics", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

0.07888040712468193

In [153]:
# should match
similar("utilities utilities—regulated electric", "All matters relating to energy research, development, and demonstration projects therefor; commercial application of energy technology; Department of Energy research, development, and demonstration programs; Department of Energy laboratories; Department of Energy science activities; Department of Energy international research, development, and demonstration projects; energy supply activities; nuclear, solar, and renewable energy, and other advanced energy technologies; uranium supply and enrichment, and Department of Energy waste management; Department of Energy environmental management research, development, and demonstration; fossil energy research and development; clean coal technology; energy conservation research and development, including building performance, alternate fuels, distributed power systems, and industrial process improvements; pipeline research, development, and demonstration projects; energy standards; other appropriate matters as referred by the Chair; and relevant oversight.") 

0.036561264822134384

In [154]:
# shouldn't match 
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

0.4397003745318352
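
One thing to keep in mind when reading the Jaccard scores above: `similar` receives whole strings rather than token lists, and as far as I can tell `textdistance` then compares them character by character. That would explain why two unrelated descriptions still score fairly high, since long English texts share most of their letters. The sketch below illustrates the difference with a generic helper (`jaccard`, written here for illustration, not the library function) applied once to characters and once to tokens.

```python
from collections import Counter


def jaccard(seq_a, seq_b) -> float:
    """Multiset Jaccard index over the items of any two sequences (illustrative)."""
    ca, cb = Counter(seq_a), Counter(seq_b)
    total = sum((ca | cb).values())
    return sum((ca & cb).values()) / total if total else 1.0


a, b = "listen", "silent"

# iterating a str yields characters: both words use exactly the same letters
print(jaccard(a, b))                  # 1.0

# splitting first compares whole words: the two words are different tokens
print(jaccard(a.split(), b.split()))  # 0.0
```

If token-level overlap is what the comparison is meant to capture, splitting the descriptions before scoring (as in the `tokens_1` / `tokens_2` cells) is the safer route.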

##### Sorensen-Dice

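"Falling under set similarity, the logic is to find the common tokens, and divide it by the total number of tokens present by combining both sets."

In formula form the score is 2|A ∩ B| / (|A| + |B|): the shared tokens are counted twice, which is why "hello new world" against "hello world" gives 2·2 / (3 + 2) = 0.8.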

In [155]:
tokens_1 = "hello world".split()
tokens_2 = "world hello".split()

In [156]:
textdistance.sorensen(tokens_1 , tokens_2)

1.0

In [157]:
tokens_1 = "hello new world".split()
tokens_2 = "hello world".split()

In [158]:
textdistance.sorensen(tokens_1 , tokens_2)

0.8
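
As a check on the 0.8 above, here is a minimal sketch of the Sorensen-Dice coefficient over token lists. The function name `sorensen_dice` is made up for this example, tokens are counted as a multiset, and returning 1.0 for two empty inputs is a choice made here.

```python
from collections import Counter


def sorensen_dice(tokens_a, tokens_b) -> float:
    """2 * |A ∩ B| / (|A| + |B|) over two token sequences (illustrative sketch)."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    shared = sum((ca & cb).values())
    total = sum(ca.values()) + sum(cb.values())
    return 2 * shared / total if total else 1.0


print(sorensen_dice("hello world".split(), "world hello".split()))      # 1.0
print(sorensen_dice("hello new world".split(), "hello world".split()))  # 0.8
```

Compared with Jaccard on the same pair (2/3), Dice gives more weight to the shared tokens, so its scores are never lower for the same inputs.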

In [159]:
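# note: a and b are passed as raw strings, so with textdistance's default settings
# the overlap is measured over characters, not over whitespace-separated tokens
def similar(a, b):
    return textdistance.sorensen(a, b)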

In [160]:
# high match
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

1.0

In [161]:
# should match
similar("utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

0.7873910127431254

In [162]:
# should match
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

0.4492164828786999

In [163]:
# should match
similar("technology consumer electronics", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

0.14622641509433962

In [164]:
# should match
similar("utilities utilities—regulated electric", "All matters relating to energy research, development, and demonstration projects therefor; commercial application of energy technology; Department of Energy research, development, and demonstration programs; Department of Energy laboratories; Department of Energy science activities; Department of Energy international research, development, and demonstration projects; energy supply activities; nuclear, solar, and renewable energy, and other advanced energy technologies; uranium supply and enrichment, and Department of Energy waste management; Department of Energy environmental management research, development, and demonstration; fossil energy research and development; clean coal technology; energy conservation research and development, including building performance, alternate fuels, distributed power systems, and industrial process improvements; pipeline research, development, and demonstration projects; energy standards; other appropriate matters as referred by the Chair; and relevant oversight.") 

0.07054337464251668

In [165]:
# shouldn't match 
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

0.6108220603537982

##### Ratcliff-Obershelp similarity

In [166]:
string1, string2 = "i am going home", "gone home"

In [167]:
textdistance.ratcliff_obershelp(string1, string2)

0.6666666666666666
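
Ratcliff-Obershelp (gestalt pattern matching) finds the longest common substring, recurses on the pieces to its left and right, and scores 2 × (matched characters) / (total characters in both strings). The standard library's `difflib.SequenceMatcher` is documented as being based on a similar idea, so it can serve as a rough cross-check; the sketch below is an approximation and may not agree with `textdistance.ratcliff_obershelp` on every input.

```python
from difflib import SequenceMatcher


def ratcliff_obershelp(a: str, b: str) -> float:
    # autojunk=False switches off difflib's heuristic that ignores
    # very frequent characters in long sequences
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


# matched blocks: " home" (5) + "go" (2) + "n" (1) = 8 characters
# score: 2 * 8 / (15 + 9)
print(ratcliff_obershelp("i am going home", "gone home"))  # 0.6666666666666666
```

Unlike the set-based measures, this score depends on character order, so reordered but otherwise identical text is penalised.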

In [168]:
# similarity on a 0-1 scale based on the longest common substrings of a and b
def similar(a, b):
    return textdistance.ratcliff_obershelp(a, b)

In [169]:
# high match
similar("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

0.9090909090909091

In [170]:
# should match 
similar("utilities utilities—regulated electric nextera energy inc subsidiaries generates transmits distributes sells electric power retail wholesale customers north america company generates electricity wind solar nuclear fossil fuel coal natural gas facilities also develops constructs operates longterm contracted assets focus renewable generation facilities electric transmission facilities battery storage projects owns develops constructs manages operates electric generation facilities wholesale energy markets december   company operated approximately  megawatts net generating capacity serves approximately  million people approximately  million customer accounts east lower west coasts florida approximately  circuit miles transmission distribution lines  substations company formerly known fpl group inc changed name nextera energy inc  nextera energy inc founded  headquartered juno beach florida", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

0.12206572769953052

In [171]:
# should match
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

0.06848520023215322

In [172]:
# should match: sector/industry only vs. the cybersecurity subcommittee
similar("technology consumer electronics", "cybersecurity infrastructure protection innovation cyber security infrastructure protection innovation subcommittee jurisdiction cybersecurity infrastructure security agency cisa science technology directorate focuses efforts advance federal network security improve critical infrastructure security also oversees cisa‚Äôs chemical security programs crosscutting science technology initiatives")

0.09433962264150944

In [173]:
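# should match: sector/industry only vs. an energy subcommittee description
# note: this committee text is raw (capitalization and punctuation not stripped)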
similar("utilities utilities—regulated electric", "All matters relating to energy research, development, and demonstration projects therefor; commercial application of energy technology; Department of Energy research, development, and demonstration programs; Department of Energy laboratories; Department of Energy science activities; Department of Energy international research, development, and demonstration projects; energy supply activities; nuclear, solar, and renewable energy, and other advanced energy technologies; uranium supply and enrichment, and Department of Energy waste management; Department of Energy environmental management research, development, and demonstration; fossil energy research and development; clean coal technology; energy conservation research and development, including building performance, alternate fuels, distributed power systems, and industrial process improvements; pipeline research, development, and demonstration projects; energy standards; other appropriate matters as referred by the Chair; and relevant oversight.") 

0.06291706387035272

In [174]:
# shouldn't match: Apple's full description vs. the Africa and global health subcommittee
similar("technology consumer electronics apple inc designs manufactures markets smartphones personal computers tablets wearables accessories worldwide also sells various related services addition company offers iphone line smartphones mac line personal computers ipad line multipurpose tablets airpods max overear wireless headphone wearables home accessories comprising airpods apple tv apple watch beats products homepod ipod touch provides applecare support services cloud services store services operates various platforms including app store allow customers discover download applications digital content books music video games podcasts additionally company offers various services apple arcade game subscription service apple music offers users curated listening experience ondemand radio stations apple news subscription news magazine service apple tv offers exclusive original content apple card cobranded credit card apple pay cashless payment service well licenses intellectual property company serves consumers small midsized businesses education enterprise government markets distributes thirdparty applications products app store company also sells products retail online stores direct sales force thirdparty cellular network carriers wholesalers retailers resellers apple inc incorporated  headquartered cupertino california", "africa global health policy subcommittee deals matters concerning us relations countries africa except like countries north africa specifically covered subcommittees well regional intergovernmental organizations like african union economic community west african states subcommittee’s regional responsibilities include matters within geographic region including matters relating  terrorism nonproliferation  crime illicit narcotics  us foreign assistance programs  promotion us trade exports addition subcommittee global responsibility healthrelated policy including disease outbreak response")

0.18106139438085328
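
The scores above run opposite to the labels: the "should match" pairs score about 0.06 to 0.09, while the "shouldn't match" pairs score about 0.12 to 0.18. One possible explanation is that a character-level ratio rewards long strings that happen to share many common character runs, regardless of topic. The cell below is a minimal sketch of a word-level alternative, not the method used in this notebook. It assumes `similar` is a thin wrapper around `difflib.SequenceMatcher(...).ratio()` (its definition is outside this section), and `char_ratio`, `token_overlap`, and the shortened example strings are hypothetical names and excerpts introduced only for illustration.

```python
from difflib import SequenceMatcher


def char_ratio(a: str, b: str) -> float:
    """Character-level ratio; assumed to mirror the notebook's similar()."""
    return SequenceMatcher(None, a, b).ratio()


def token_overlap(a: str, b: str) -> float:
    """Overlap coefficient on word sets: |A & B| / min(|A|, |B|)."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / min(len(tokens_a), len(tokens_b))


# Abbreviated excerpts of the strings compared above (shortened for readability)
sector = "technology consumer electronics"
cyber = "cybersecurity infrastructure protection innovation science technology directorate"
africa = "africa global health policy subcommittee disease outbreak response"

for label, committee in [("cyber", cyber), ("africa", africa)]:
    print(label,
          "char_ratio:", round(char_ratio(sector, committee), 3),
          "token_overlap:", round(token_overlap(sector, committee), 3))
```

On these excerpts the word-level score is non-zero only for the cybersecurity text, because it shares the token `technology` with the sector string. A single shared word is weak evidence, so this is a starting point for comparison rather than a matching rule.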

----