Quantitative analysis of altcoins, or "Bitcoin alternatives", for Princeton's COS 597E: Bitcoin and Cryptocurrency Technologies.
Project by Dan Kang, Charlie Marsh, and Shubhro Saha.
Altcoins have evolved into a popular means of innovation and self-expression in the cryptocurrency landscape. CoinMarketCap lists nearly 500 altcoins, which range in popularity, value, and technical specification. This project provides the tools to perform quantitative analysis of altcoins in a general and modular way that assumes very little about the coins themselves.
The bulk of the tooling is contained in two Python modules:
- coin.py: An abstraction that takes an altcoin's ticker as argument (e.g., LTC) and allows for easy retrieval of that coin's blockchain data as Pandas data frames.
- graph_analyzer.py: An abstraction that takes an altcoin's ticker as argument and constructs a representation of the coin's transaction graph, including clusters as generated by the heuristics in A Fistful of Bitcoins (Meiklejohn 13). Once loaded, the Graph Analyzer can track balances over time for individual addresses and clusters, as well as report back other statistics about the blockchain's transaction graph.
Additionally, mining_analyzer.py can be used in tandem with graph_analyzer.py to extract data about miners, while price.py is a slimmed-down version of coin.py that only operates over price data, which can be downloaded quickly without syncing a blockchain.
Usage examples can be found in the IPython notebooks.
This repository requires you to sync an altcoin's full blockchain with the network and store it as a Postgres database on your machine, which can be a lengthy and computationally intensive process.
Each altcoin needs to be assigned its own Postgres database, identified by its ticker (prefixed with the string "abe-"), and loaded from blockchain to Postgres using the bitcoin-abe open source block browser (see the bitcoin-abe documentation for more) with the configuration files provided in the conf folder.
For example, Litecoin's ticker is LTC. Thus, its database is labeled "abe-ltc" and its bitcoin-abe configuration file is labeled conf/abe-ltc.conf.
In addition, you'll need to change config.py to adjust your Postgres username accordingly.
Run download_price_data.py to generate CSV files in the price-data folder. Prices will be downloaded for both USD and BTC exchange rates, and will be placed in price-data/usd and price-data/btc, respectively.
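Once the download has finished, the two folders can be read back with a few lines of Python. The following is only a minimal sketch: the function name load_price_rows is hypothetical, and it assumes each CSV file starts with a header row. The actual file names and column names are whatever download_price_data.py writes and are not assumed here.

```python
import csv
from pathlib import Path


def load_price_rows(root="price-data"):
    """Map (currency, file stem) -> list of row dicts for every CSV found.

    Assumption: every CSV has a header row, so csv.DictReader can key
    each row by column name. Column names are not assumed.
    """
    rows = {}
    for currency in ("usd", "btc"):
        # Missing folders simply contribute no files.
        for csv_path in sorted(Path(root, currency).glob("*.csv")):
            with open(csv_path, newline="") as handle:
                rows[(currency, csv_path.stem)] = list(csv.DictReader(handle))
    return rows


if __name__ == "__main__":
    for (currency, name), data in load_price_rows().items():
        print(currency, name, len(data), "rows")
```

For real analysis, price.py is the intended entry point; this sketch is only a quick way to inspect what was downloaded.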
Generating the transaction graph can be a time- and resource-intensive process, depending on the size of the altcoin's blockchain. It's a multi-step procedure that runs as follows (assuming that you're in the repository's home directory, and that you're generating the graph for Litecoin):
- Run
    psql -U postgres -d abe-ltc -A -F ',' -f src/sql/abe_schema_edges.sql > edges.csv
  to export network edges.
- Run
    psql -U postgres -d abe-ltc -A -F ',' -f src/sql/abe_schema_txin.sql > txin.csv
  to export transaction input counts.
- Run
    psql -U postgres -d abe-ltc -A -F ',' -f src/sql/abe_schema_txout.sql > txout.csv
  to export transaction output counts.
- Run
    python src/clustergen.py edges.csv > clusters-ltc.txt
  to generate cluster listings, one per line.
- Run
    python src/balancegen.py edges.csv txin.csv txout.csv balances-ltc.pickle
  to generate balances for each public key at every point in time, exported as a pickled data structure.
- Run
    rm edges.csv txin.csv txout.csv
  to clean up remaining CSV files.
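The steps above differ between coins only in the ticker, so they can be generated programmatically. The sketch below is a hypothetical helper (graph_commands is not part of this repository) that prints the same six commands for any ticker. It assumes the "abe-" database naming described earlier and takes the Postgres username as a parameter.

```python
import shlex

# SQL export scripts and the CSV file each one produces, in step order.
SQL_EXPORTS = [
    ("src/sql/abe_schema_edges.sql", "edges.csv"),
    ("src/sql/abe_schema_txin.sql", "txin.csv"),
    ("src/sql/abe_schema_txout.sql", "txout.csv"),
]


def graph_commands(ticker, pg_user="postgres"):
    """Return the shell commands for the graph-generation steps, in order."""
    coin = ticker.lower()
    database = "abe-" + coin
    commands = []
    for sql_file, csv_file in SQL_EXPORTS:
        psql = ["psql", "-U", pg_user, "-d", database,
                "-A", "-F", ",", "-f", sql_file]
        # Quote each argument for the shell, then redirect output to the CSV.
        commands.append(" ".join(shlex.quote(arg) for arg in psql)
                        + " > " + csv_file)
    commands.append("python src/clustergen.py edges.csv > clusters-%s.txt" % coin)
    commands.append("python src/balancegen.py edges.csv txin.csv txout.csv "
                    "balances-%s.pickle" % coin)
    commands.append("rm edges.csv txin.csv txout.csv")
    return commands


if __name__ == "__main__":
    # Dry run: print the commands for Namecoin instead of executing them.
    for command in graph_commands("NMC"):
        print(command)
```

To execute rather than print, each string could be passed to subprocess.run(command, shell=True, check=True), which stops at the first failing step.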
The altcoin ticker (ltc) can be replaced with any other coin code, as long as that coin has been synced to your machine. Much of this process is captured by the edges.sh script, for reference.
At this point, you can analyze the generated clusters and balances using a GraphAnalyzer from the graph_analyzer.py module. Make sure to change the paths to clusters-ltc.txt and balances-ltc.pickle to reflect your local file system.
Additional data that might be useful when analyzing altcoins.
Every address actually has three representations:
- The address as acknowledged by the blockchain. For example, Namecoin addresses look like this: NCAzVGKq8JrsETxAkgw3MsDPinAEPwsTfn.
- The pubkey_hash as used by Abe. For example, the pubkey hash of that address looks like this: ab1651e5bff4186dc3eb23b9508201458c82b0a2.
- The pubkey_id, again as used by Abe (and typically used in our code). This is just an integer that is incremented for each new public key that Abe encounters. For example, the pubkey ID of that hash is 2.
When we analyze the blockchain, we usually use the ID. But it's convenient to convert from ID to address to verify that what you're seeing matches up with block explorers online; it's also nice to go from address to ID, as you might want to check if a certain real address is in a certain cluster, which will operate over pubkey IDs.
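The first two representations are related by the standard Base58Check encoding used by Bitcoin-derived coins: the address is the Base58 form of a version byte, the 20-byte pubkey hash, and a 4-byte double-SHA-256 checksum. The sketch below illustrates that relationship only. It is not how address_handler.py is implemented, the function names are hypothetical, and the Namecoin version byte of 52 is an assumption. The pubkey_id, by contrast, exists only in the Abe database and cannot be computed this way.

```python
import hashlib
from typing import Tuple

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
NAMECOIN_VERSION = 52  # assumption: version byte for Namecoin addresses


def _checksum(payload: bytes) -> bytes:
    """First four bytes of SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def hash_to_address(pubkey_hash_hex: str, version: int) -> str:
    """Base58Check-encode a hex pubkey hash under the given version byte."""
    payload = bytes([version]) + bytes.fromhex(pubkey_hash_hex)
    data = payload + _checksum(payload)
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is written as a leading '1'.
    zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * zeros + encoded


def address_to_hash(address: str) -> Tuple[int, str]:
    """Decode an address into (version byte, hex pubkey hash)."""
    number = 0
    for char in address:
        number = number * 58 + ALPHABET.index(char)
    zeros = len(address) - len(address.lstrip("1"))
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    data = b"\x00" * zeros + body
    payload, checksum = data[:-4], data[-4:]
    if _checksum(payload) != checksum:
        raise ValueError("checksum mismatch: not a valid address")
    return payload[0], payload[1:].hex()


if __name__ == "__main__":
    pubkey_hash = "ab1651e5bff4186dc3eb23b9508201458c82b0a2"
    print(hash_to_address(pubkey_hash, NAMECOIN_VERSION))
```

Decoding validates the checksum, so a mistyped address raises a ValueError instead of silently mapping to the wrong hash.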
The module address_handler.py can convert in both directions. It takes three arguments:
- The method, either encode or decode. The former goes from address to pubkey ID, and the latter from pubkey ID to address.
- The address or pubkey ID, respectively.
- The ticker of the relevant coin.
Here are two sample calls to illustrate the functionality:
python address_handler.py encode NCAzVGKq8JrsETxAkgw3MsDPinAEPwsTfn NMC
# 2
python address_handler.py decode 2 NMC
# NCAzVGKq8JrsETxAkgw3MsDPinAEPwsTfn
The same address handling is provided by the Graph Analyzer, which allows for on-the-fly conversion. For example, you could look up the current balance associated with NCAzVGKq8JrsETxAkgw3MsDPinAEPwsTfn as follows:
import graph_analyzer

# Load transaction graph into memory
analyzer = graph_analyzer.GraphAnalyzer('NMC')
# Convert address to pubkey ID and check balance
address = 'NCAzVGKq8JrsETxAkgw3MsDPinAEPwsTfn'
pubkey_id = analyzer.pubkey_for_address(address)
analyzer.balance_for_pubkey(pubkey_id)
License: MIT.