pip install bank_scrapers
bank-scrape {subcommand} $LOGIN_USER $LOGIN_PASS
bank_scrapers is a library containing drivers for scraping account information from various financial websites.
Since most traditional financial institutions don't provide an API for accessing one's account data, most of these drivers use Playwright to impersonate the user with the provided credentials.
Unfortunately, undetected-playwright will only start consistently while using Chrome. Here's how to install:
wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add - && \
sudo sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' && \
sudo apt update && \
sudo apt install -y google-chrome-stable
Since these modules are run in virtual displays to avoid detection, the xvfb package is required.
sudo apt update && sudo apt install -y xvfb xserver-xephyr tigervnc-standalone-server
pip install bank_scrapers
pip install git+https://github.com/eebette/bank_scrapers.git
💡 Usage examples for each driver are listed in that driver's documentation
bank-scrape -h
bank-scrape {subcommand} $LOGIN_USER $LOGIN_PASS
API results are returned as a Python list of pandas dataframes, containing relevant data scraped from the site. See each driver's section for info on what is in that driver's return tables.
import asyncio
from bank_scrapers.scrapers.becu.driver import get_accounts_info
tables = asyncio.run(get_accounts_info(username="{username}", password="{password}"))
for t in tables:
    print(t.to_string())
As of version 1.1, there is a single get_accounts_info() function available in the module bank_scrapers.get_accounts_info that takes the institution name as the first argument and the rest of the institution's required arguments after that.
Example
import asyncio
from bank_scrapers.get_accounts_info import get_accounts_info
accounts_info = asyncio.run(get_accounts_info("chase", "{username}", "{password}"))
for table in accounts_info:
    print(table)
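To illustrate the calling convention described above (institution name first, that institution's arguments after), here is a minimal sketch of a name-based dispatcher. It is not the library's implementation: `DRIVERS`, `dispatch`, and the two `_fake_*` coroutines are hypothetical stand-ins that return plain dicts instead of dataframes.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


# Hypothetical stand-ins for per-institution drivers (not the real scrapers).
async def _fake_chase(username: str, password: str) -> List[dict]:
    return [{"institution": "chase", "user": username}]


async def _fake_zillow(suffix: str) -> List[dict]:
    return [{"institution": "zillow", "suffix": suffix}]


# Assumed registry mapping an institution name to its driver coroutine.
DRIVERS: Dict[str, Callable[..., Awaitable[List[Any]]]] = {
    "chase": _fake_chase,
    "zillow": _fake_zillow,
}


async def dispatch(institution: str, *args: Any, **kwargs: Any) -> List[Any]:
    """Look up the driver by name and forward the remaining arguments to it."""
    try:
        driver = DRIVERS[institution]
    except KeyError:
        raise ValueError(f"Unknown institution: {institution!r}") from None
    return await driver(*args, **kwargs)


result = asyncio.run(dispatch("chase", "alice", "hunter2"))
print(result)
```

Because each institution needs different arguments (credentials for a bank, a URL suffix for Zillow), the dispatcher forwards `*args` and `**kwargs` untouched rather than fixing one signature.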
As of version 1.1, it is possible to output the metrics in the form of [labels] metric by passing the prometheus=True parameter to get_accounts_info.
Passing this parameter will cause the API to return the following format: Tuple(List, List)
- The first list in the tuple will return a list of labels (one per symbol per account) and their Quantity (i.e. number of shares/units).
- The second list in the tuple will return a list of labels (one per symbol per account) and the symbol's USD value (i.e. #.# if it's a US-based bank account, such as Chase, or the share value of a stock, such as in Vanguard accounts).
The metric comes back in the following format:
(['<institution_name>', <account_number>, '<account_type>', '<symbol>'], <metric>)
Example
This functionality is meant to make these metrics easily ingest-able into a Prometheus server.
import asyncio
from bank_scrapers.scrapers.vanguard.driver import get_accounts_info
prometheus_output = asyncio.run(
    get_accounts_info("{username}", "{password}", prometheus=True)
)
print((prometheus_output[0][0], prometheus_output[1][0]))
> ((['Vanguard', ########, 'deposit', 'TTWO'], ##.#), (['Vanguard', ########, 'deposit', 'TTWO'], ###.##))
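from typing import List

from prometheus_client import CollectorRegistry, Gauge

# Placeholder registry, metric name, and help text; substitute your own
registry = CollectorRegistry()
name = "account_quantity"
documentation = "Quantity held per symbol per account"

LABELS = [
    "institution",
    "account",
    "account_type",
    "symbol",
]
metrics = Gauge(
    name,
    documentation,
    LABELS,
    registry=registry,
)
# prometheus_output[0] holds the quantity metrics
for metric in prometheus_output[0]:
    labels: List[str] = metric[0]
    value: float = metric[1]
    metrics.labels(*labels).set(value)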
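Since the first list carries quantities and the second carries per-unit USD values, the two can be joined on their labels to get a total USD value per account. The sketch below assumes both lists use identical label lists for the same holding; `sample_output` is made-up data shaped like the documented format, and `total_usd_by_account` is a hypothetical helper, not part of the library.

```python
from typing import Dict, List, Tuple

Metric = Tuple[List, float]

# Made-up sample shaped like the documented Tuple(List, List) output:
# quantities first, per-unit USD values second.
sample_output: Tuple[List[Metric], List[Metric]] = (
    [
        (["Vanguard", 12345678, "deposit", "TTWO"], 10.0),
        (["Vanguard", 12345678, "deposit", "PAYC"], 2.0),
    ],
    [
        (["Vanguard", 12345678, "deposit", "TTWO"], 150.0),
        (["Vanguard", 12345678, "deposit", "PAYC"], 200.0),
    ],
)


def total_usd_by_account(output: Tuple[List[Metric], List[Metric]]) -> Dict[Tuple, float]:
    """Sum quantity * per-unit USD value for each (institution, account) pair."""
    quantities, usd_values = output
    # Index per-unit values by their labels so the two lists can be joined.
    # Assumption: both lists carry identical labels for the same holding.
    value_by_labels = {tuple(labels): value for labels, value in usd_values}
    totals: Dict[Tuple, float] = {}
    for labels, quantity in quantities:
        institution, account, _account_type, _symbol = labels
        key = (institution, account)
        totals[key] = totals.get(key, 0.0) + quantity * value_by_labels[tuple(labels)]
    return totals


totals = total_usd_by_account(sample_output)
print(totals)
```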
As of version 1.1, it is possible to automate the Multi-Factor Authentication (MFA) workflows in both the API and the CLI by providing a Python dict (or a JSON file in the case of the CLI) with the following keys:
- otp_contact_option: The list option to use for MFA (e.g. when a site asks whether you'd like to be contacted via 1 Phone or 2 SMS)
- otp_code_location: The directory in which to look for a file containing the One-Time Password (OTP). See OTP File Requirements below
Example
import asyncio
from bank_scrapers.scrapers.roundpoint.driver import get_accounts_info
tables = asyncio.run(
    get_accounts_info(
        "{username}",
        "{password}",
        mfa_auth={"otp_contact_option": 1, "otp_code_location": "/tmp/otp_codes"},
    )
)
or
bank-scrape roundpoint $LOGIN_USER $LOGIN_PASS --json_file ~/roundpoint_mfa.json
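The JSON file passed to the CLI carries the same keys as the mfa_auth dict. A minimal sketch of writing one with the standard library follows; the temporary directory and the file name are stand-ins chosen for illustration, and in practice you would write to a path of your own choosing.

```python
import json
import tempfile
from pathlib import Path

# Same keys as the mfa_auth dict used with the Python API.
mfa_auth = {"otp_contact_option": 1, "otp_code_location": "/tmp/otp_codes"}

# Written to a throwaway directory here; pick your own location in practice.
json_path = Path(tempfile.mkdtemp()) / "roundpoint_mfa.json"
json_path.write_text(json.dumps(mfa_auth, indent=2))

# Read the file back to confirm that it round-trips.
loaded = json.loads(json_path.read_text())
print(loaded)
```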
- The scraper will begin searching the text in files in the otp_code_location ## seconds after the OTP request is submitted on the site.
- The scraper will look at each file in the otp_code_location in reverse alphabetical order. For this reason, if you are automatically moving your SMS messages to this folder through some automation system, it is recommended to prepend the file names with a timestamp.
- Each scraper has a string term that it searches for in each file (to ensure that the OTP was sent from/belongs to the correct institution). These values can be found in each scraper's documentation below.
- The scraper will NOT delete the file once it is done. Maintaining this directory is up to you.
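The lookup rules above can be sketched roughly as follows. This is an illustration, not the scrapers' actual code: `find_otp` is a hypothetical helper, the 6-digit code pattern and the "RoundPoint" search term are assumptions, and the initial wait is left out.

```python
import re
import tempfile
from pathlib import Path
from typing import Optional


def find_otp(directory: Path, search_term: str, pattern: str = r"\b(\d{6})\b") -> Optional[str]:
    """Return the first code found in a file whose text mentions search_term."""
    # Reverse alphabetical order, so timestamp-prefixed names are read newest first.
    for path in sorted(directory.iterdir(), key=lambda p: p.name, reverse=True):
        if not path.is_file():
            continue
        text = path.read_text(errors="ignore")
        # Skip messages that do not come from the expected institution.
        if search_term not in text:
            continue
        match = re.search(pattern, text)
        if match:
            return match.group(1)
    return None


# Demo with made-up SMS files in a throwaway directory.
otp_dir = Path(tempfile.mkdtemp())
(otp_dir / "20240101T120000_sms.txt").write_text("RoundPoint: your code is 111111")
(otp_dir / "20240101T120500_sms.txt").write_text("RoundPoint: your code is 222222")
(otp_dir / "20240101T121000_sms.txt").write_text("Other Bank: your code is 333333")

code = find_otp(otp_dir, "RoundPoint")
print(code)  # 222222
```

The newest file belongs to a different institution and is skipped, so the code from the most recent matching file is returned. The files are left in place, in line with the note that cleanup is up to you.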
Automating the transfer of SMS messages with OTP codes from your phone to .txt files on your PC is outside the scope of this project, but SMS to URL Forwarder and webhook are a good place to start.
These are all written in Python using the Playwright driver and, for the most part, try to simulate the real user experience/workflow as seen in the eyes of the website provider.
Boeing Employees' Credit Union
This is a Playwright driver that logs in using provided credentials and reads account info from the landing page.
❗️Driver does NOT currently support MFA
bank-scrape becu $LOGIN_USER $LOGIN_PASS
import asyncio
from bank_scrapers.scrapers.becu.driver import get_accounts_info
tables = asyncio.run(get_accounts_info(username="{username}", password="{password}"))
for t in tables:
    print(t.to_markdown(index=False))
Account | YTD Interest | Current Balance | Available Balance | account_type | symbol | usd_value |
---|---|---|---|---|---|---|
########## | ###.## | #####.## | #####.## | deposit | USD | 1 |
########## | ###.## | #####.## | #####.## | deposit | USD | 1 |
########## | ###.## | #####.## | #####.## | deposit | USD | 1 |
Account | Current Balance | Available Credit | account_type | symbol | usd_value |
---|---|---|---|---|---|
#### | ####.## | ##### | credit | USD | 1 |
This is a Playwright driver that logs in using provided credentials, navigates MFA, navigates to the detail account info from the landing page, and reads the account info from the page.
✔️ Driver supports handling of MFA
❗️This driver is designed to crawl and pull data for Chase credit card services only. Chase shared bank accounts are currently not in the scope of this project
bank-scrape chase $LOGIN_USER $LOGIN_PASS
import asyncio
from bank_scrapers.scrapers.chase.driver import get_accounts_info
tables = asyncio.run(get_accounts_info(username="{username}", password="{password}"))
for t in tables:
    print(t.to_markdown(index=False))
>>> # Example MFA workflow
>>> tables = asyncio.run(get_accounts_info(username="{username}", password="{password}"))
1: Get a text
2: Get a call
Please select one: {user_choose_mfa_option}
Enter OTP Code: {user_enters_otp_code}
Note that Chase has 2 MFA workflows. otp_contact_option refers to the (now) more common one with a binary Call Me/Text Me choice. otp_contact_option_alternate refers to the traditional workflow with a list of numbers and contact options in a dropdown list.
{
"otp_contact_option": 2,
"otp_contact_option_alternate": 2,
"otp_code_location": "/tmp/otp_codes"
}
Chase
Current balance | Pending charges | Available credit | Total credit limit | Next closing date | Balance on last statement | Remaining statement balance | Payments are due on the | account | account_type | symbol | usd_value |
---|---|---|---|---|---|---|---|---|---|---|---|
####.## | ##.## | #####.# | ##### | ##### | ####.## | ####.## | # | #### | credit | USD | 1 |
Last payment | Minimum payment | Automatic Payments | account | account_type | symbol |
---|---|---|---|---|---|
####.## | ##.#### | #### | credit | USD |
Points available | account | account_type | symbol |
---|---|---|---|
###### | #### | credit | USD |
Cash advance balance | Available for cash advance | Cash advance limit | account | account_type | symbol |
---|---|---|---|---|---|
# | #### | #### | #### | credit | USD |
Purchase APR | Cash advance APR | account | account_type | symbol |
---|---|---|---|---|
##.## | ##.## | #### | credit | USD |
Program details | account | account_type | symbol |
---|---|---|---|
#### | credit | USD |
Provides int-ified values for each of the columns.
❗️Dates will be converted to their spreadsheet-friendly int representation
❗️Any text values are dropped. Most notably, this affects the Automatic Payments and Program details columns, which are currently out of the scope of this project
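For readers unfamiliar with spreadsheet date integers, here is a small sketch of one common convention: the number of days since 1899-12-30, as used by Google Sheets and (for modern dates) Excel. That the driver uses exactly this epoch is an assumption on my part, and `to_spreadsheet_serial` is a hypothetical helper.

```python
from datetime import date

# Assumed epoch: day zero of the serial dates used by common spreadsheets.
SPREADSHEET_EPOCH = date(1899, 12, 30)


def to_spreadsheet_serial(d: date) -> int:
    """Return the number of whole days between the assumed epoch and d."""
    return (d - SPREADSHEET_EPOCH).days


serial = to_spreadsheet_serial(date(2024, 1, 1))
print(serial)  # 45292
```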
❗️This driver is designed to work on the webpage for Fidelity NetBenefits, which is Fidelity's net interface for 401(k) holders and stock plan participants for various companies. It is not designed to work for general brokerage account holders, though I suspect it would work with minimal effort
️✔️ This driver will pull holdings info for all Fidelity accounts for the account holder, including general brokerage accounts
This is a Playwright driver that logs in using provided credentials, navigates MFA, navigates to the detail account info from the landing page for Fidelity NetBenefits.
Instead of scraping the user's account info from the page, this driver will navigate to the user's positions summary and download the accounts info provided by Fidelity using a folder of the user's choice
✔️ Driver supports handling of MFA
bank-scrape fidelity-nb $LOGIN_USER $LOGIN_PASS
💡 The API and CLI backends handle the creation of a tmp directory in the user's home directory by default.
import asyncio
from bank_scrapers.scrapers.fidelity_netbenefits.driver import get_accounts_info
tables = asyncio.run(get_accounts_info(username="{username}", password="{password}"))
for t in tables:
    print(t.to_markdown(index=False))
>>> # Example MFA workflow
>>> tables = asyncio.run(get_accounts_info(username="{username}", password="{password}"))
1: Text me the code
2: Call me with the code
Please select one: {user_choose_mfa_option}
Enter OTP Code: {user_enters_otp_code}
Note that Fidelity doesn't have any otp_contact_option.
{
"otp_code_location": "/tmp/otp_codes"
}
NetBenefits
Account Number | Account Name | Symbol | Description | Quantity | Last Price | Last Price Change | Current Value | Today's Gain/Loss Dollar | Today's Gain/Loss Percent | Total Gain/Loss Dollar | Total Gain/Loss Percent | Percent Of Account | Cost Basis Total | Average Cost Basis | Type | account_type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Z######## | Individual - TOD | USD | HELD IN FCASH | ##.## | # | nan | $##.## | nan | nan | nan | nan | #.##% | nan | nan | Cash | deposit |
Z######## | Individual - TOD | AMZN | AMAZON.COM INC | ### | ###.# | +$#.## | $#####.## | +$###.## | +#.##% | +$####.## | +##.##% | ##.##% | $#####.## | $###.## | Cash | deposit |
##### | ###### ###(K) PLAN | SSGA LG CAP GROWTH | SSGA LG CAP GROWTH | ####.## | ##.## | -$#.## | $#####.## | -$###.## | -#.##% | +$#####.## | +##.##% | ##.##% | $#####.## | $##.## | nan | retirement |
##### | ###### ###(K) PLAN | #####N### | VANGUARD TARGET #### | ###.### | ###.## | -$#.## | $#####.## | -$##.## | -#.##% | +$####.## | +##.##% | #.##% | $#####.## | $###.## | nan | retirement |
##### | ###### ###(K) PLAN | AMZN | AMAZON.COM STOCK | ##.### | ###.# | +$#.## | $#####.## | +$###.## | +#.##% | +$####.## | +##.##% | #.##% | $#####.## | $###.## | nan | retirement |
##### | ###### ###(K) PLAN | VFTNX | VANG FTSE SOC IDX IS | ####.## | ##.## | -$#.## | $#####.## | -$###.## | -#.##% | +$#####.## | +##.##% | ##.##% | $#####.## | $##.## | nan | retirement |
This is a Playwright driver that logs in using provided credentials, navigates MFA, navigates to the detail account info from the landing page for a mortgage serviced by RoundPoint Mortgage.
✔️ Driver supports handling of MFA
bank-scrape roundpoint $LOGIN_USER $LOGIN_PASS
import asyncio
from bank_scrapers.scrapers.roundpoint.driver import get_accounts_info
tables = asyncio.run(get_accounts_info(username="{username}", password="{password}"))
for t in tables:
    print(t.to_markdown(index=False))
>>> # Example MFA workflow
>>> tables = asyncio.run(get_accounts_info(username="{username}", password="{password}"))
1: Email (**********@##.##)
2: Text (***-***-####)
Please select one: {user_choose_mfa_option}
Enter OTP Code: {user_enters_otp_code}
1 is email.
2 is SMS.
{
"otp_contact_option": 2,
"otp_code_location": "/tmp/otp_codes"
}
Servicing Digital
Balance | Monthly Payment Amount | Actual Due Date | Next Draft Date | Payment Method | account_number | account_type | usd_value | symbol |
---|---|---|---|---|---|---|---|---|
#####.# | ###.## | July ##, #### | July ##, #### | Checking Account (####) | ########## | loan | 1 | USD |
Sumitomo Mitsui Banking Corporation PRESTIA
This is a Playwright driver that logs in using provided credentials, navigates to the detail account info and scrapes account info for a member account of SMBC Prestia.
❗️Driver does NOT currently support MFA
bank-scrape smbc-prestia $LOGIN_USER $LOGIN_PASS
import asyncio
from bank_scrapers.scrapers.smbc_prestia.driver import get_accounts_info
tables = asyncio.run(get_accounts_info(username="{username}", password="{password}"))
for t in tables:
    print(t.to_markdown(index=False))
Account Number | Available Amount | symbol | account_type | usd_value |
---|---|---|---|---|
####### | ####### | JPY | deposit | #.######## |
######## | # | JPY | deposit | #.######## |
University of Hawaii Federal Credit Union
This is a Playwright driver that logs in using provided credentials, navigates MFA, and navigates to the detail account info from the landing page for a UHFCU account. It will also navigate to the credit card management system used by UHFCU and pull info for each credit card on the dashboard.
✔️ Driver supports handling of MFA
bank-scrape uhfcu $LOGIN_USER $LOGIN_PASS
import asyncio
from bank_scrapers.scrapers.uhfcu.driver import get_accounts_info
tables = asyncio.run(get_accounts_info(username="{username}", password="{password}"))
for t in tables:
    print(t.to_markdown(index=False))
>>> # Example MFA workflow
>>> tables = asyncio.run(get_accounts_info(username="{username}", password="{password}"))
1: #********#@##.##
2: ###-***-**##
Please select one: {user_choose_mfa_option}
Enter OTP Code: {user_enters_otp_code}
1 is email.
2 is SMS.
{
"otp_contact_option": 2,
"otp_code_location": "/tmp/otp_codes"
}
University of Hawaii Federal Credit Union
Account Type | Account Desc | Available | Current Balance | symbol | account_type | usd_value |
---|---|---|---|---|---|---|
Savings | XXX ##-S#### | $#.## | #.## | USD | deposit | 1 |
Checking | XXX ##-S#### | $#,###.## | ####.## | USD | deposit | 1 |
Current Balance | Pending Balance | Statement Balance | Available Credit | Last Payment | Total Minimum Due | Payment Due Date | Last Login | Account Desc | symbol | account_type | usd_value |
---|---|---|---|---|---|---|---|---|---|---|---|
# | $#.## | $#.## | $##,###.## | $##.## | $#.## | Not Available | Jun ##, ####, #:##:## PM | #### | USD | credit | 1 |
️✔️ This driver will pull holdings info for all Vanguard accounts for the account holder, including general brokerage accounts
This is a Playwright driver that logs in using provided credentials, navigates MFA, navigates to the detail account info in the Downloads Center from the landing page.
Instead of scraping the user's account info from the page, this driver will navigate to the user's positions summary and download the accounts info provided by Vanguard using a folder of the user's choice
➖️ Driver has limited support for MFA (only supports mobile app touch authentication)
bank-scrape vanguard $LOGIN_USER $LOGIN_PASS
💡 The API and CLI backends handle the creation of a tmp directory in the user's home directory by default
import asyncio
from bank_scrapers.scrapers.vanguard.driver import get_accounts_info
tables = asyncio.run(get_accounts_info(username="{username}", password="{password}"))
for t in tables:
    print(t.to_markdown(index=False))
>>> # Example MFA workflow
>>> tables = asyncio.run(get_accounts_info(username="{username}", password="{password}"))
1: Click to verify with the Vanguard App
2: Click to verify with security code
Please select one: {user_choose_mfa_option}
Enter OTP Code: {user_enters_otp_code}
1 is app verification.
2 is SMS.
{
"otp_contact_option": 2,
"otp_code_location": "/tmp/otp_codes"
}
Vanguard
Account Number | account_type | Investment Name | Symbol | Shares | Share Price | Total Value |
---|---|---|---|---|---|---|
######## | deposit | TAKE-TWO INTERACTIVE SOFTWARE INC | TTWO | ## | ###.## | ####.## |
######## | deposit | PAYCOM SOFTWARE INC | PAYC | # | ###.## | ###.## |
This is a Playwright driver that finds a property's Zestimate from a user-provided URL suffix (the part after https://www.zillow.com/homedetails/).
bank-scrape zillow $URL_SUFFIX_FOR_PROPERTY
💡 The suffix of the Zillow URL is the part after 'homedetails'. Note that you only need to provide the part that ends with "zpid"
💡 For example, this is a valid suffix argument (provided # is replaced by actual digits): ########_zpid
import asyncio
from bank_scrapers.scrapers.zillow.driver import get_accounts_info
tables = asyncio.run(get_accounts_info(suffix="########_zpid"))
for t in tables:
    print(t.to_markdown(index=False))
address | zestimate | symbol | account_type | usd_value |
---|---|---|---|---|
123 Apple Lane | ###### | USD | real_estate | 1 |
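If you only have the full Zillow URL at hand, the suffix argument can be pulled out with a short regular expression. This is a sketch under the assumption that the relevant segment always looks like digits followed by `_zpid`; `zpid_suffix` is a hypothetical helper and the URL below is made up.

```python
import re
from typing import Optional


def zpid_suffix(url: str) -> Optional[str]:
    """Return the '<digits>_zpid' segment of a Zillow URL, or None if absent."""
    match = re.search(r"(\d+_zpid)", url)
    return match.group(1) if match else None


# Made-up URL for illustration only.
suffix = zpid_suffix("https://www.zillow.com/homedetails/123-Apple-Lane/12345678_zpid/")
print(suffix)  # 12345678_zpid
```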
These are wrappers around API endpoints offered by the providers themselves. Their main purpose is to make retrieving account info consistent across this library.
This is an API wrapper for pulling Kraken account holdings based on Kraken's documentation.
The main purpose of this wrapper is to provide an even simpler interface for pulling account holdings and to align the data provided by Kraken with the rest of the financial data pulled by this package.
bank-scrape kraken $API_KEY $SECRET_KEY
from bank_scrapers.api_wrappers.kraken.driver import get_accounts_info
tables = get_accounts_info(
    api_key="*****************/**************************************",
    api_sec="********+*************************+****+********//******************/**************+**==",
)
for t in tables:
    print(t.to_markdown(index=False))
symbol | quantity | account_id | account_type | usd_value |
---|---|---|---|---|
ETHW | #.##### | #################/###################################### | cryptocurrency | #.##### |
XETH | #.##e-## | #################/###################################### | cryptocurrency | # |
This library also contains a few handy functions for pulling the value of a given crypto wallet for some popular tokens.
This is an API wrapper for pulling a Bitcoin wallet's holdings using the Bitcoin wallet's xpub or zpub.
Under the hood, this is just another Playwright-based scraper that uses Blockpath to do the dirty work of getting the wallet balance. Unfortunately, there isn't a publicly available API for doing this programmatically without registration.
If your xpub changes after each transaction, and you want to pull the full wallet's BTC balance, convert the xpub used in the latest transaction to a zpub here and use that.
bank-scrape bitcoin $BITCOIN_ZPUB
import asyncio
from bank_scrapers.crypto.bitcoin.driver import get_accounts_info
tables = asyncio.run(
    get_accounts_info(zpub="*****************/**************************************")
)
for t in tables:
    print(t.to_markdown(index=False))
zpub | balance | symbol | account_type | usd_value |
---|---|---|---|---|
zpub########################################################################################################### | #.###### | BTC | cryptocurrency | #####.# |
This is an API wrapper for pulling an Ethereum wallet's holdings using the Ethereum wallet's address.
bank-scrape ethereum $ETHEREUM_ADDRESS
from bank_scrapers.crypto.ethereum.driver import get_accounts_info
tables = get_accounts_info(
    address="0x########################################",
)
for t in tables:
    print(t.to_markdown(index=False))
address | balance | symbol | account_type | usd_value |
---|---|---|---|---|
#x######################################## | #.##### | ETH | cryptocurrency | ####.## |
The intended purpose of this code is purely academic in nature, and it IS NOT intended to be used for any real life production use, nefarious or otherwise.
Usage of this code is potentially against your bank's terms of service and could result in you or your IP getting flagged, listed, or blocked as bad actors. I don't take any responsibility for any effects this code may have on your bank accounts or your relationships with your banking institutions.
Please don't try to learn anything about me or my life based on the banks that I've arbitrarily decided to write drivers for.