# Generate the miners.json file and perform the initial third-party attribution for blocks_attribution.json

This notebook generates:
* `miners.json`: the miners file containing mining pool addresses and markers.
* `blocks_attribution_0-$(current_blockheight).json`: attributes miners to blocks based on third-party information.

### Types of attributions:
The added attributions are based **only** on the respective sources. Updates made to the *miners* data during attribution are only relevant for future (custom) attributions:
- `blockchain_info_marker`: (unique) from the pools.json file that blockchain.info publishes on GitHub, using the pool's coinbase tag. This information is stored in miners_initial.json.
- `blockchain_info_address`: (one for each output address) from the same pools.json file, using the payout address information.
- `blockchain_info`: the union of the two attributions above.
- `blocktrail`: (unique) uses data from blocktrail.com.
- `btccom_*`: the same attributions derived from the btccom pools file, which has the same syntax as the blockchain.info file.
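
The two basic attribution types can be illustrated with a minimal sketch. It assumes a reduced, hand-written `pools` dict in the pools.json layout; the two entries are taken from a conflict that appears later in this notebook, with `link` left empty. The helper names `match_tag` and `match_address` are hypothetical and are not the functions in `util`.

```python
import binascii

# Reduced stand-in for pools.json (assumption: only two entries, links left empty).
pools = {
    "coinbase_tags": {"/BTC.COM/": {"name": "BTC.com", "link": ""}},
    "payout_addresses": {
        "1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ": {"name": "Waterhole", "link": ""}
    },
}

def match_tag(coinbase_hex, pools):
    """Return pool names whose coinbase tag occurs in the raw coinbase bytes."""
    raw = binascii.unhexlify(coinbase_hex)
    return sorted({info["name"]
                   for tag, info in pools["coinbase_tags"].items()
                   if tag.encode("utf-8") in raw})

def match_address(addresses, pools):
    """Return pool names for every known payout address among the outputs."""
    known = pools["payout_addresses"]
    return sorted({known[a]["name"] for a in addresses if a in known})

# First 18 bytes of the coinbase of block 482059; they contain "/BTC.COM/".
coinbase_hex = "030b5b0704465fa1592f4254432e434f4d2f"

by_tag = match_tag(coinbase_hex, pools)
by_address = match_address(["1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ"], pools)
union = sorted(set(by_tag) | set(by_address))
print(by_tag, by_address, union)
```

In this example the two attribution types disagree, which is exactly the kind of case the notebook records as a conflict.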

## Imports

In [1]:
# python3.5
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

import re
import binascii
import sys
import random # draw samples

import csv
import json
from collections import Counter
from collections import defaultdict
import copy

import os.path # os.path.isfile; 

# dict diffing and printing
import dictdiffer
import pprint
pp = pprint.PrettyPrinter(indent=4)

In [2]:
# custom imports 
import util
from importlib import reload
reload(util)

<module 'util' from '/home/matteo/deep_dive/util.py'>

## Global variables and functions

In [3]:
# data up to blockheight 
current_blockheight = util.CURRENT_BLOCKHEIGHT
print(current_blockheight)

556400


### Input files

#### `blocks_0-$(current_blockheight).json`
The blocks.json file filled with raw data from the blockchain.

In [4]:
blocks_json_file = "./dataset/blocks_0-" + str(current_blockheight) + ".json"
assert( os.path.isfile(blocks_json_file) )

#### `blockchain.info_$(date -I).json`
Pools file from blockchain.info, hosted on GitHub:
* https://github.com/blockchain/Blockchain-Known-Pools
* https://github.com/blockchain/Blockchain-Known-Pools/tree/82ed31956388e3950845cc2faeaf6679a057ee5b
* UPDATE 2019-01-28: https://github.com/blockchain/Blockchain-Known-Pools/tree/29ab27c844ebdb63110f8783f73b9decd4abc221

In [6]:
!pwd

/home/aljosha/minerconomics-analytics/1_block_attribution


In [5]:
!wget --output-document='./dataset/blockchain.info_2019-01-28.json' "https://raw.githubusercontent.com/blockchain/Blockchain-Known-Pools/29ab27c844ebdb63110f8783f73b9decd4abc221/pools.json"

--2019-06-02 14:42:39--  https://raw.githubusercontent.com/blockchain/Blockchain-Known-Pools/29ab27c844ebdb63110f8783f73b9decd4abc221/pools.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.192.133, 151.101.128.133, 151.101.64.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.192.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 22210 (22K) [text/plain]
Saving to: ‘./dataset/blockchain.info_2019-01-28.json’


2019-06-02 14:42:39 (2,70 MB/s) - ‘./dataset/blockchain.info_2019-01-28.json’ saved [22210/22210]



In [6]:
#pools_blockchain_info_json_file = './miners_and_pools_sources/blockchain.info_2018-03-10.json'
pools_blockchain_info_json_file = './dataset/blockchain.info_2019-01-28.json'
assert( os.path.isfile(pools_blockchain_info_json_file) )

#### `btccom_$(date -I).json`
Pools file from btccom, hosted on GitHub:
* https://github.com/btccom/Blockchain-Known-Pools
* https://raw.githubusercontent.com/btccom/Blockchain-Known-Pools/650a92227bf65b06ff0a5b58bb57c13856a3babf/pools.json

In [7]:
!wget --output-document='./dataset/btccom_2019-02-15.json' "https://raw.githubusercontent.com/btccom/Blockchain-Known-Pools/650a92227bf65b06ff0a5b58bb57c13856a3babf/pools.json"

--2019-06-02 14:42:47--  https://raw.githubusercontent.com/btccom/Blockchain-Known-Pools/650a92227bf65b06ff0a5b58bb57c13856a3babf/pools.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.192.133, 151.101.128.133, 151.101.64.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.192.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 25075 (24K) [text/plain]
Saving to: ‘./dataset/btccom_2019-02-15.json’


2019-06-02 14:42:47 (4,78 MB/s) - ‘./dataset/btccom_2019-02-15.json’ saved [25075/25075]



In [9]:
#pools_blockchain_info_json_file = './miners_and_pools_sources/blockchain.info_2018-03-10.json'
pools_btccom_json_file = './dataset/btccom_2019-02-15.json'
assert( os.path.isfile(pools_btccom_json_file) )

#### `blocks_blocktrail_0-514240.json`
Attribution information fetched from blocktrail.com.
This file was generated with `blocktrail_extract.py` from previously fetched data, because the API has since changed.

New API:
* https://www.blocktrail.com/api

In [10]:
blocks_blocktrail_json_file = "./dataset/blocks_blocktrail_0-514240.json"
assert( os.path.isfile(blocks_blocktrail_json_file) )

### Output files

#### `miners_initial_blockchaininfo.json`
The mining entity data from:
* blockchain.info 

in our custom JSON format (see the example structure below)

In [11]:
miners_initial_blockchaininfo_json_file = './dataset/miners_initial_blockchaininfo.json'
if os.path.isfile(miners_initial_blockchaininfo_json_file):
    print("Output file " + miners_initial_blockchaininfo_json_file + " exists, will be overwritten.")

Output file ./dataset/miners_initial_blockchaininfo.json exists, will be overwritten.


#### `miners_initial_blockchaininfo_conflicts.json`
The conflicts found while attributing blocks with the data from:
* blockchain.info 

stored as a JSON list with one entry per conflict

In [12]:
miners_initial_blockchaininfo_conflicts_json_file = './dataset/miners_initial_blockchaininfo_conflicts.json'
if os.path.isfile(miners_initial_blockchaininfo_conflicts_json_file):
    print("Output file " + miners_initial_blockchaininfo_conflicts_json_file + " exists, will be overwritten.")

Output file ./dataset/miners_initial_blockchaininfo_conflicts.json exists, will be overwritten.


#### `miners_initial_btccom.json`
The mining entity data from:
* btccom 

in our custom JSON format (see the example structure below)

In [13]:
miners_initial_btccom_json_file = './dataset/miners_initial_btccom.json'
if os.path.isfile(miners_initial_btccom_json_file):
    print("Output file " + miners_initial_btccom_json_file + " exists, will be overwritten.")

Output file ./dataset/miners_initial_btccom.json exists, will be overwritten.


#### `miners_initial_btccom_conflicts.json`
The mining entity data from:
* btccom 

in our custom JSON format (see the example structure below)

In [14]:
miners_initial_btccom_conflicts_json_file = './dataset/miners_initial_btccom_conflicts.json'
if os.path.isfile(miners_initial_btccom_conflicts_json_file):
    print("Output file " + miners_initial_btccom_conflicts_json_file + " exists, will be overwritten.")

#### `miners_initial_incl_blocktrail.json`
The mining entity data from:
* including some miners identified by blocktrail 

in our custom JSON format (see the example structure below)

In [15]:
miners_initial_incl_blocktrail_json_file = './dataset/miners_initial_incl_blocktrail.json'
if os.path.isfile(miners_initial_incl_blocktrail_json_file):
    print("Output file " + miners_initial_incl_blocktrail_json_file + " exists, will be overwritten.")

#### `miners_initial.json`
The mining entity data from:
* blockchain.info
* btccom 

in our custom JSON format (see the example structure below)

In [16]:
miners_initial_json_file = './dataset/miners_initial.json'
if os.path.isfile(miners_initial_json_file):
    print("Output file " + miners_initial_json_file + " exists, will be overwritten.")

#### `miners_initial_conflicts.json`
The mining entity data from:
* blockchain.info
* btccom 

in our custom JSON format (see the example structure below)

In [17]:
miners_initial_conflicts_json_file = './dataset/miners_initial_conflicts.json'
if os.path.isfile(miners_initial_conflicts_json_file):
    print("Output file " + miners_initial_conflicts_json_file + " exists, will be overwritten.")

#### `miners.json`
The mining entity data from:
* blockchain.info 
* blocktrail.com

Again, this data is in our custom JSON format (see the example structure below).

In [18]:
miners_json_file = './dataset/miners.json'
if os.path.isfile(miners_json_file):
    print("Output file " + miners_json_file + " exists, will be overwritten.")

#### `blocks_attribution_0-$(current_blockheight).json`
Block attributions already performed, linking the same miners and pools together:
- `blockchain_info_marker`
- `blockchain_info_address`
- `blockchain_info`
- `blocktrail`
- `btccom`
- `btccom_address`
- `btccom_marker`

In [19]:
blocks_attribution_json_file = './dataset/blocks_attribution_0-' + str(current_blockheight) + '.json'
if os.path.isfile(blocks_attribution_json_file):
    print("Output file " + blocks_attribution_json_file + " exists, will be overwritten.")

Output file ./dataset/blocks_attribution_0-556400.json exists, will be overwritten.
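
The existence check is repeated for every output file. A minimal sketch of a helper that could replace the repeated cells follows; the name `output_path` is hypothetical and not part of `util`.

```python
import os.path

def output_path(path):
    """Return path unchanged and print a notice if the file already exists."""
    if os.path.isfile(path):
        print("Output file " + path + " exists, will be overwritten.")
    return path

# Example with a path used in this notebook:
miners_json_file = output_path('./dataset/miners.json')
```

Returning the path keeps each declaration a single line while still warning before a file is overwritten.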


## Example structure of miners.json, which gets populated in this notebook


In [22]:
# Show example file
with open(miners_example_file) as miners_example_fp:
    miners_example = json.load(miners_example_fp)
    
print(json.dumps(miners_example, indent=2, sort_keys=True))


{
  "miner_id": {
    "addresses": {
      "addresss": {
        "comment": "",
        "currencies": [
          ""
        ],
        "firstUsed": "",
        "lastUsed": "",
        "sources": [
          ""
        ]
      }
    },
    "markers": {
      "marker": {
        "comment": "",
        "currencies": [
          ""
        ],
        "firstUsed": "",
        "lastUsed": "",
        "sources": [
          ""
        ]
      }
    },
    "names": {
      "name": {
        "comment": "",
        "currencies:": [
          ""
        ],
        "firstUsed": "",
        "fullName": "",
        "lastUsed": "",
        "sources": [
          ""
        ],
        "url": ""
      }
    }
  }
}
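
To make the structure concrete, here is a minimal sketch that builds one miner record in this layout. The helpers `make_entry` and `make_miner` are hypothetical (the notebook itself uses `util.add_miner`), the optional `comment` field is omitted, and the sample values are the BW.COM entry that appears later in this notebook.

```python
import json

def make_entry(sources, currencies=("BTC",), **extra):
    """Build one names/markers/addresses entry with the common fields."""
    entry = {
        "currencies": list(currencies),  # fresh list per entry, never shared
        "firstUsed": 0,
        "lastUsed": 0,
        "sources": list(sources),
    }
    entry.update(extra)
    return entry

def make_miner(name, url, marker, address, sources):
    """Build a miner record with one name, one marker and one address."""
    return {
        "names": {name: make_entry(sources, url=url, fullName="")},
        "markers": {marker: make_entry(sources)},
        "addresses": {address: make_entry(sources)},
    }

miners = {
    "BW.COM": make_miner(
        name="BW.COM",
        url="https://bw.com",
        marker="BW Pool",
        address="1JLRXD8rjRgQtTS9MvfQALfHgGWau9L9ky",
        sources=["blockchain.info github"],
    )
}
print(json.dumps(miners, indent=2, sort_keys=True))
```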


## Add pools file from blockchain.info 

In [20]:
with open (pools_blockchain_info_json_file) as pools_bc_data:
    pools_bc = json.load(pools_bc_data)

In [21]:
for field in pools_bc:
    print(field)

coinbase_tags
payout_addresses


In [22]:
assert "payout_addresses" in pools_bc.keys()
assert "coinbase_tags" in pools_bc.keys()

In [23]:
len(pools_bc["payout_addresses"]) # 85

85

In [24]:
len(pools_bc["coinbase_tags"]) # 89

89
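
Beyond checking the two top-level fields, the per-entry keys can be validated as well. The sketch below assumes that every entry carries a `name` and a `link`, as the entries printed in this notebook do; `validate_pools` is a hypothetical helper, and the sample dict is a hand-written stand-in rather than the real file.

```python
def validate_pools(pools):
    """Return a list of problems found in a pools.json-style dict."""
    problems = []
    for field in ("coinbase_tags", "payout_addresses"):
        if field not in pools:
            problems.append("missing field: " + field)
            continue
        for key, info in pools[field].items():
            for required in ("name", "link"):
                if required not in info:
                    problems.append(field + "[" + repr(key) + "] lacks " + required)
    return problems

# Hand-written sample: one complete entry and one entry without a link.
sample = {
    "coinbase_tags": {"/Helix/": {"name": "Helix", "link": ""}},
    "payout_addresses": {"1JLRXD8rjRgQtTS9MvfQALfHgGWau9L9ky": {"name": "BW.COM"}},
}
print(validate_pools(sample))
```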

## Add pools file from btccom

Compare and add the info from the btccom file

In [25]:
with open (pools_btccom_json_file) as pools_btccom_data:
    pools_btccom = json.load(pools_btccom_data)

In [26]:
for field in pools_btccom:
    print(field)

coinbase_tags
payout_addresses


In [27]:
assert "payout_addresses" in pools_btccom.keys()
assert "coinbase_tags" in pools_btccom.keys()

In [31]:
len(pools_btccom["payout_addresses"]) # 93

93

In [32]:
len(pools_btccom["coinbase_tags"]) # 104

104

In [28]:
diffs = list(dictdiffer.diff(pools_bc,pools_btccom))
for diff in diffs:
    # show only entries that exist in one file but not in the other
    if diff[0] in ("add", "remove"):
        pprint.pprint(diff)

('add',
 'coinbase_tags',
 [('BITFARMS', {'link': 'https://www.bitarms.io/', 'name': 'Bitfarms'}),
  ('/Huobi/', {'link': 'https://www.poolhb.com/', 'name': 'Huobi.pool'}),
  ('/E2M & BTC.TOP/', {'link': 'http://www.easy2mine.com/', 'name': 'WAYI.CN'}),
  ('/canoepool/', {'link': 'https://www.canoepool.com/', 'name': 'CanoePool'}),
  ('Mined By AntPool', {'link': 'https://www.antpool.com/', 'name': 'AntPool'}),
  ('BWPool', {'link': 'https://bwpool.net/', 'name': 'BWPool'}),
  ('/DCEX/', {'link': 'http://dcexploration.cn', 'name': 'DCEX'}),
  ('/BTPOOL/', {'link': '', 'name': 'BTPOOL'}),
  ('/Rawpool.com/',
   {'link': 'https://www.rawpool.com/', 'name': 'Rawpool.com'}),
  ('/Helix/', {'link': '', 'name': 'Helix'}),
  ('/Bitcoin-Ukraine.com.ua/',
   {'link': 'https://bitcoin-ukraine.com.ua/', 'name': 'Bitcoin-Ukraine'}),
  ('/poolin.com', {'link': 'https://www.poolin.com/', 'name': 'Poolin'}),
  ('/SecretSuperstar/', {'link': '', 'name': 'SecretSuperstar'}),
  ('/tigerpool.net', {'link

In [29]:
for diff in diffs:
    # show only changes that affect a pool name
    if diff[0] == "change" and "name" in diff[1]:
        pprint.pprint(diff)

('change',
 ['coinbase_tags', '/solo.ckpool.org/', 'name'],
 ('Solo CKPool', 'Solo CK'))
('change', 'coinbase_tags./NiceHashSolo.name', ('NiceHash Solo', 'NiceHash'))
('change',
 'coinbase_tags./BitClub Network/.name',
 ('BitClub Network', 'BitClub'))
('change', 'coinbase_tags.BTCChina Pool.name', ('BTCC Pool', 'BTCC'))
('change', ['coinbase_tags', 'btcchina.com', 'name'], ('BTCC Pool', 'BTCC'))
('change', ['coinbase_tags', 'BTCChina.com', 'name'], ('BTCC Pool', 'BTCC'))
('change', 'coinbase_tags./BTCC/.name', ('BTCC Pool', 'BTCC'))
('change', 'coinbase_tags.BW Pool.name', ('BW.COM', 'BWPool'))
('change',
 ['coinbase_tags', 'xbtc.exx.com&bw.com', 'name'],
 ('xbtc.exx.com&bw.com', 'EXX&BW'))
('change', 'coinbase_tags./CANOE/.name', ('CANOE', 'CanoePool'))
('change', 'coinbase_tags./haominer/.name', ('Haominer', 'haominer'))
('change',
 'payout_addresses.1JLRXD8rjRgQtTS9MvfQALfHgGWau9L9ky.name',
 ('BW.COM', 'BWPool'))
('change',
 'payout_addresses.155fzsEBHy9Ri2bMQ8uuuR3tv1YzcDywd4.name'
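
The add/remove/change classification can also be reproduced with plain set operations, which may be easier to post-process than the `dictdiffer` tuples. This is a minimal sketch; `diff_tags` is a hypothetical helper, and the two small dicts are hand-written stand-ins that reuse a few entries from the output above (unknown links are left empty).

```python
def diff_tags(old, new, field="coinbase_tags"):
    """Return (added, removed, renamed) keys of one field between two pools dicts."""
    old_tags, new_tags = old[field], new[field]
    added = sorted(set(new_tags) - set(old_tags))
    removed = sorted(set(old_tags) - set(new_tags))
    renamed = {
        tag: (old_tags[tag]["name"], new_tags[tag]["name"])
        for tag in set(old_tags) & set(new_tags)
        if old_tags[tag]["name"] != new_tags[tag]["name"]
    }
    return added, removed, renamed

# Stand-ins for pools_bc and pools_btccom.
old = {"coinbase_tags": {"/BTCC/": {"name": "BTCC Pool", "link": ""}}}
new = {"coinbase_tags": {
    "/BTCC/": {"name": "BTCC", "link": ""},
    "/Huobi/": {"name": "Huobi.pool", "link": "https://www.poolhb.com/"},
}}

added, removed, renamed = diff_tags(old, new)
print(added, removed, renamed)
```

The same function works for `payout_addresses` by passing `field="payout_addresses"`.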

## Create miners\_initial\_....json file

In [30]:
class ConflictingPoolData(Exception):
    pass

def add_makers_from_source(pools_dict,miners_dict,source,currency=[ "BTC",] ):
    # Add all markers from blockchain.info pools.json style file converted to a pools_dict
    # fields of blockchain.info dict
    fpayout="payout_addresses"
    ftags="coinbase_tags"
    fname="name"
    furl="link"

    for marker in pools_dict[ftags]:
        pool_name = pools_dict[ftags][marker][fname]
        pool_name_lookup = util.get_miner_id_by_name(miners_dict,pool_name)
        if pool_name_lookup is not None and pool_name_lookup != pool_name:
            # print a note if a pool has more than one name, this is ok 
            print("duplicate name in markers: " + pool_name + " - " + pool_name_lookup)
            pool_name = pool_name_lookup
        pool_url = pools_dict[ftags][marker][furl]

        
        if pool_name in miners_dict.keys() and marker in miners_dict[ pool_name ][ util.D_MARKERS ]:
            # print a note if this marker is already registered for the pool, this is ok 
            print("Same marker for pool: " + pool_name + " marker:" +  marker)
        elif pool_name in miners_dict.keys():
            # print a note if a pool has more than one marker, this is ok 
            print("Multiple markers for pool: " + pool_name)

        names_dict = { pools_dict[ftags][marker][fname]: { util.DD_URL:pool_url,
                                    util.DD_CURRENCIES: currency,
                                    util.DD_FULLNAME: "",
                                    util.DD_FIRSTUSED: 0,
                                    util.DD_LASTUSED: 0,
                                    util.DD_SOURCES: source} } 

        markers_dict = { marker: {  util.DD_CURRENCIES: currency,
                                    util.DD_FIRSTUSED: 0,
                                    util.DD_LASTUSED: 0,
                                    util.DD_SOURCES: source } }  

        util.add_miner(pool_name,
                 miners_dict,
                 names_dict,
                 markers_dict)

        #print(pool_name + "\t" + marker + "\t" + pool_url)
    return miners_dict


In [31]:
def add_addresses_from_source(pools_dict,miners_dict,source,currency=[ "BTC",] ):
    # Add all addresses from blockchain.info style file 
    # fields of blockchain.info dict
    fpayout="payout_addresses"
    ftags="coinbase_tags"
    fname="name"
    furl="link"
    
    for address in pools_dict[fpayout]:
        pool_name = pools_dict[fpayout][address][fname]
        pool_url = pools_dict[fpayout][address][furl]
        
        pool_name_lookup = util.get_miner_id_by_name(miners_dict,pool_name)
        if pool_name_lookup is not None and pool_name_lookup != pool_name:
            # print a note if a pool has more than one name, this is ok 
            print("duplicate name in addresses: " + pool_name + " - " + pool_name_lookup)
            pool_name = pool_name_lookup
        
        if pool_name in miners_dict:
            # add address to an existing pool
            #print("Adding address " + address + " to pool "+ pool_name)

            # check for duplicated pool_names with different URLs, this should not happen
            #print(json.dumps(miners, indent=2, sort_keys=True))
            stored_pool_url = miners_dict[ pool_name ][ util.D_NAMES ][ pool_name ]["url"]

            # compare URLs without scheme, "www." prefix and surrounding slashes;
            # a regex is used because str.strip() removes characters, not prefixes
            url_prefix = re.compile(r"^https?://(www\.)?")
            if ( url_prefix.sub("", pool_url.strip("/")).strip("/")
            != url_prefix.sub("", stored_pool_url.strip("/")).strip("/") ):
                print("CONFLICTING URL: " + repr(pool_url) )
                print("CONFLICTING URL: " + repr(stored_pool_url) )
                #raise ConflictingPoolData()

            # check if address already added 
            #if address in miners[pool_name]["addresses"]:
            #    print(" " + repr(pool_url) + repr(pool_name) + repr(address))
            #    print(" " + repr(miners[pool_name]["addresses"][address]) )
            #    raise ConflictingPoolData()
                
            names_dict = { pools_dict[fpayout][address][fname]: { util.DD_URL:pool_url,
                                    util.DD_CURRENCIES: currency,
                                    util.DD_FULLNAME: "",
                                    util.DD_FIRSTUSED: 0,
                                    util.DD_LASTUSED: 0,
                                    util.DD_SOURCES: source} }
            
            addresses_dict={ address: { util.DD_CURRENCIES: currency,
                                                 util.DD_FIRSTUSED: 0,
                                                 util.DD_LASTUSED: 0,
                                                 util.DD_SOURCES: source } }

            util.add_miner(miner_id=pool_name,
                     miners=miners_dict,
                     names_dict=names_dict,
                     addresses_dict=addresses_dict)
        else:
            # create new pool for new address
            print("new pool: " + pool_name)
            util.add_miner(miner_id=pool_name,
                     miners=miners_dict,
                     names_dict= { pool_name: { util.DD_URL:pool_url,
                                                util.DD_CURRENCIES: currency,
                                                util.DD_FULLNAME: "",
                                                util.DD_FIRSTUSED: 0,
                                                util.DD_LASTUSED: 0,
                                                util.DD_SOURCES:source } },
                     addresses_dict= { address: { util.DD_CURRENCIES: currency,
                                                  util.DD_FIRSTUSED: 0,
                                                  util.DD_LASTUSED: 0,
                                                  util.DD_SOURCES: source } }, )
    return miners_dict
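
When comparing pool URLs it helps to remember that `str.strip` removes any of the given characters from both ends rather than a literal prefix. A minimal sketch of the difference, using a URL from the btccom file; `normalize_url` is a hypothetical helper, not part of `util`.

```python
import re

# Matches an http/https scheme followed by an optional "www." prefix.
_PREFIX = re.compile(r"^https?://(www\.)?", re.IGNORECASE)

def normalize_url(url):
    """Drop scheme, optional "www." prefix and trailing slashes."""
    return _PREFIX.sub("", url.strip()).rstrip("/")

url = "https://www.poolin.com/"

# str.strip treats its argument as a set of characters, so the leading
# "p" of "poolin" is removed as well.
naive = url.strip("/").strip("https://www.")
print(naive)               # oolin.com
print(normalize_url(url))  # poolin.com
```

Because both sides of a comparison are mangled the same way, the character-set variant often still gives the right answer, but it can also make two different host names compare equal.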
        

## Create miners_initial_blockchaininfo.json file from blockchain.info
This initial pools file is based solely on data from the *blockchain.info* GitHub repository. 
It gives us a clean starting point. 

In [32]:
# Bootstrap pools files with blockchain.info github data
# https://github.com/blockchain/Blockchain-Known-Pools/tree/82ed31956388e3950845cc2faeaf6679a057ee5b
miners_initial_blockchaininfo = dict()
miners_initial_blockchaininfo.clear()

miners_initial_blockchaininfo = add_makers_from_source(pools_bc,miners_dict=miners_initial_blockchaininfo,source=["blockchain.info github",])

Multiple markers for pool: OzCoin
Multiple markers for pool: TripleMining
Multiple markers for pool: Polmine
Multiple markers for pool: AntPool
Multiple markers for pool: BTCC Pool
Multiple markers for pool: BTCC Pool
Multiple markers for pool: BTCC Pool
Multiple markers for pool: BitFury
Multiple markers for pool: ViaBTC
Multiple markers for pool: PHash.IO


In [33]:
miners_initial_blockchaininfo = add_addresses_from_source(pools_bc,miners_dict=miners_initial_blockchaininfo,source=["blockchain.info github",])

new pool: BTC Nuggets
new pool: EkanemBTC
new pool: Huobi
new pool: CloudHashing
new pool: digitalX Mintsy
new pool: Telco 214
new pool: BTC Pool Party
new pool: Multipool
new pool: transactioncoinmining
new pool: BTCDig
new pool: Tricky's BTC Pool
new pool: BTCMP
new pool: Eobot
new pool: UNOMP
new pool: Patel's Mining pool
new pool: GoGreenLight
new pool: Poolin


In [34]:
# 'digitalX Mintsy' => 'digitalBTC'
util.add_miner('digitalBTC',miners_initial_blockchaininfo) # already exists 
util.unify_miners('digitalX Mintsy','digitalBTC',miners_initial_blockchaininfo)
print()




In [35]:
util.get_sample(miners_initial_blockchaininfo)

{'names': {'BW.COM': {'url': 'https://bw.com',
   'currencies': ['BTC'],
   'fullName': '',
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github']}},
 'markers': {'BW Pool': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github']}},
 'addresses': {'1JLRXD8rjRgQtTS9MvfQALfHgGWau9L9ky': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github']}}}

In [36]:
with open(miners_initial_blockchaininfo_json_file, 'w') as outfile:
    json.dump(miners_initial_blockchaininfo, outfile)

## blockchain.info only attribution

In [37]:
# freshly read clean blocks.json file without any attributions
blocks = dict()
blocks.clear()

with open(blocks_json_file, 'r') as fp:
    blocks = json.load(fp)

In [38]:
# freshly read clean miners_initial file (from blockchain.info) without any modifications
miners_initial_blockchaininfo = dict()
miners_initial_blockchaininfo.clear()

with open(miners_initial_blockchaininfo_json_file, 'r') as fp:
    miners_initial_blockchaininfo = json.load(fp)

In [39]:
util.get_sample(miners_initial_blockchaininfo)

{'names': {'Bitcoin.com': {'url': 'https://www.bitcoin.com',
   'currencies': ['BTC'],
   'fullName': '',
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github']}},
 'markers': {'pool.bitcoin.com': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github']}},
 'addresses': {}}

In [40]:
def attribute_blocks(blocks,
                     miners_dict,
                     addr_attr,
                     marker_attr,
                     both_attr,
                     source,
                     override=False,
                     update=False):
    """ Attribute given blocks based on given miners_dict json
    
    Takes the attribution names to use for address, marker and combined matches, as well as the
    source from which the miners_dict information comes. Overrides existing attributions stored
    under the given names if the override flag is set. 
    Returns a tuple (blocks, miners_dict, conflicts) and may modify miners_dict in the process.
    """
    i = 0
    conflicts = list()
    conflicts.clear()

    for blknum in blocks:
        match = list()
        addr_match = list()
        cb_match = list()

        try:
            # first always test if not already attributed 
            if ( addr_attr not in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys() ) or override:
                # match address
                if len( blocks[ blknum ][ util.D_ADDRESSES ] ) == 1:
                    # only match if there is just one output address in the coinbase 
                    address = blocks[ blknum ][ util.D_ADDRESSES ][0]
                    match = util.match_address_to_miner( address, miners_dict, strict=False, blknum=int(blknum) )

                    if len( match ) >= 1:
                        # if the address is listed for several miners we can get more than one match
                        matched_miners = defaultdict(list)
                        for ma in match:
                            matched_miners[ ma[0] ].append( ma[1] )
                        j = 0
                        attr = ""
                        for mi in matched_miners:
                            blocks[ blknum ][ util.D_ATTRIBUTIONS ][ addr_attr + attr ] = { util.DDD_MINER:mi,
                                                                                               "matches":matched_miners[mi],
                                                                                               util.DDD_SRC:source }
                            j += 1
                            attr = str(j)

            if ( marker_attr not in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys() ) or override:
                # match coinbase
                coinbase = blocks[ blknum ][ util.D_CB ]
                match = util.match_coinbase_to_miner( coinbase, miners_dict, strict=False, blknum=int(blknum) )

                if len( match ) >= 1:
                    # if multiple coinbase markers match we can get more than one match
                    matched_miners = defaultdict(list)
                    for ma in match:
                        matched_miners[ ma[0] ].append( ma[1] )
                    j = 0
                    attr = ""
                    for mi in matched_miners:
                        blocks[ blknum ][ util.D_ATTRIBUTIONS ][ marker_attr + attr ] = { util.DDD_MINER:mi,
                                                                                           "matches":matched_miners[mi],
                                                                                           util.DDD_SRC:source }
                        j += 1
                        attr = str(j)

            if ( both_attr not in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys() ) or override:
                # match both and update miners
                coinbase = blocks[ blknum ][ util.D_CB ]
                if len( blocks[ blknum ][ util.D_ADDRESSES ] ) == 1:
                    address = blocks[ blknum ][ util.D_ADDRESSES ][0]
                    match = util.match_miner(miners_dict,address,coinbase,update=update, blknum=int(blknum) )
                else:
                    match = util.match_miner(miners=miners_dict,coinbase=coinbase, blknum=int(blknum) )

                if len( match ) > 0:
                    # There could be more than one marker of the same pool that matches simultaneously
                    matches = list()
                    #print(match)
                    for m in match:
                        matches.append( m[1] )
                    blocks[ blknum ][ util.D_ATTRIBUTIONS ][ both_attr ] = { util.DDD_MINER:match[0][0],
                                                                                    "matches":matches,
                                                                                    util.DDD_SRC:source }
        except util.ConflictingMinerData as e:
            print()
            print("Message    = ",e.message)
            print("Blockheight= ",blknum)
            print("Miner1     = ",e.miner1)
            print("Miner2     = ",e.miner2)
            print("Coinbase   = ",e.coinbase)
            print("CoinbaseStr= ",repr(binascii.unhexlify(e.coinbase)))
            print("Addesses   = ",e.address)
            print("addr_match = ",e.addr_match)
            print("cb_match   = ",e.cb_match)
            conflicts.append( { "message":e.message,
                                util.DDD_MINER + "1":e.miner1,
                                util.DDD_MINER + "2":e.miner2,
                                util.D_CB + "1":e.coinbase,
                                "address": e.address,
                                "addr_match": e.addr_match,
                                "cb_match": e.cb_match,
                                util.DDD_SRC:source } )

        # progress bar     
        i+=1
        if i % 1000 == 0:
            print(i,end="\r")
            sys.stdout.flush()
    return (blocks,miners_dict,conflicts)
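
The key naming used for multiple matches can be shown in isolation: matches are grouped per miner, the first miner is stored under the plain attribution name, and further miners get a numeric suffix. This is a minimal sketch with a hypothetical helper `attribution_entries` and made-up pool names and tags; the field names `miner`, `matches` and `src` are the ones visible in the block records of this notebook.

```python
from collections import defaultdict

def attribution_entries(matches, attr_name, source):
    """Group (miner, match_info) pairs by miner and assign suffixed keys."""
    by_miner = defaultdict(list)
    for miner, info in matches:
        by_miner[miner].append(info)

    entries = {}
    for j, (miner, infos) in enumerate(by_miner.items()):
        # first miner: plain name; following miners: name + "1", name + "2", ...
        key = attr_name if j == 0 else attr_name + str(j)
        entries[key] = {"miner": miner, "matches": infos, "src": source}
    return entries

# Hypothetical matches: two tags of PoolA and one tag of PoolB in one coinbase.
matches = [
    ("PoolA", {"cb_match": "tagA1"}),
    ("PoolA", {"cb_match": "tagA2"}),
    ("PoolB", {"cb_match": "tagB"}),
]
entries = attribution_entries(matches, "blockchain_info_marker", "blockchain.info")
for key, value in entries.items():
    print(key, value)
```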

In [46]:
# this dict will be updated when coinbase matches are found and there is only one address 
# as the coinbase output
#miners = dict()
#miners.clear()
#miners = copy.deepcopy(miners_initial_blockchaininfo)

Attribute blocks to miners according to the **blockchain.info** GitHub file.
Even this file alone produces some conflicts, where the address-based and the marker-based attribution point to different miners.
The conflicts are listed below:

In [41]:
# attribute blocks to miners according to blockchain.info initial mapping
(blocks,miners_initial_blockchaininfo,conflicts) = util.attribute_blocks(blocks=blocks,
                                   miners_dict=miners_initial_blockchaininfo,
                                   addr_attr=util.DD_BCI_ADDR_ATTR,
                                   marker_attr=util.DD_BCI_MARK_ATTR,
                                   both_attr=util.DD_BCI_ATTR,
                                   source="blockchain.info",
                                   override=True)

482000
Message    =  Addr and Cb match differ
Blockheight=  482059
Miner1     =  Waterhole
Miner2     =  BTC.com
Coinbase   =  030b5b0704465fa1592f4254432e434f4d2ffabe6d6dff48161efa0f0b44fc9463e380e6b585ff91ac3f9c973d156dfff35fd553cdb101000000000000000228124b7faf010000000000
CoinbaseStr=  b'\x03\x0b[\x07\x04F_\xa1Y/BTC.COM/\xfa\xbemm\xffH\x16\x1e\xfa\x0f\x0bD\xfc\x94c\xe3\x80\xe6\xb5\x85\xff\x91\xac?\x9c\x97=\x15m\xff\xf3_\xd5S\xcd\xb1\x01\x00\x00\x00\x00\x00\x00\x00\x02(\x12K\x7f\xaf\x01\x00\x00\x00\x00\x00'
Addesses   =  1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ
addr_match =  [('Waterhole', {'addr_match': '1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ'})]
cb_match   =  [('BTC.com', {'cb_match': '/BTC.COM/'})]

Message    =  Addr and Cb match differ
Blockheight=  482221
Miner1     =  Waterhole
Miner2     =  BTC.com
Coinbase   =  03ad5b0704a4e6a2592f4254432e434f4d2ffabe6d6d0dd4f6c7dac894c84821c9fed8c15a621583ada8e4177c182c011cab52e2d6d0010000000000000001292f8e92ae010000000000
CoinbaseStr=  b'\x03\xad[\x

In [42]:
len(conflicts) # 2

2

In [43]:
# checkpoint: the initial blockchain.info attribution is expected to yield exactly 2 conflicts
assert 2 == len(conflicts)

In [105]:
# attribute blocks to miners according to blockchain.info initial mapping
(blocks,miners_initial_blockchaininfo,conflicts) = util.attribute_blocks(blocks=blocks,
                                   miners_dict=miners_initial_blockchaininfo,
                                   addr_attr=util.DD_BCI_ADDR_ATTR + "_update",
                                   marker_attr=util.DD_BCI_MARK_ATTR + "_update",
                                   both_attr=util.DD_BCI_ATTR + "_update",
                                   source="blockchain.info",
                                   override=True,
                                   update=True)

In [104]:
(blocks,miners_initial_blockchaininfo,conflicts) = util.attribute_blocks(blocks=blocks,
                                   miners_dict=miners_initial_blockchaininfo,
                                   addr_attr=util.DD_BCI_ADDR_ATTR + "_update",
                                   marker_attr=util.DD_BCI_MARK_ATTR + "_update",
                                   both_attr=util.DD_BCI_ATTR + "_update",
                                   source="blockchain.info",
                                   override=True,
                                   update=True)

In [46]:
len(conflicts) # 65

65

In [47]:
with open(miners_initial_blockchaininfo_conflicts_json_file, 'w') as outfile:
    json.dump(conflicts, outfile)

In [55]:
blocks["353757"]

{'addresses': ['1NY15MK947MLzmPUa2gL7UgyR8prLh2xfu'],
 'attribution': '',
 'attributions': {'blockchain_info': {'matches': [{'addr_match': '1NY15MK947MLzmPUa2gL7UgyR8prLh2xfu'}],
   'miner': 'digitalBTC',
   'src': 'blockchain.info'},
  'blockchain_info_address': {'matches': [{'addr_match': '1NY15MK947MLzmPUa2gL7UgyR8prLh2xfu'}],
   'miner': 'digitalBTC',
   'src': 'blockchain.info'},
  'blockchain_info_address_update': {'matches': [{'addr_match': '1NY15MK947MLzmPUa2gL7UgyR8prLh2xfu'}],
   'miner': 'digitalBTC',
   'src': 'blockchain.info'},
  'blockchain_info_update': {'matches': [{'addr_match': '1NY15MK947MLzmPUa2gL7UgyR8prLh2xfu'}],
   'miner': 'digitalBTC',
   'src': 'blockchain.info'}},
 'cb': '03dd6505062f503253482f047c713c5508fabe6d6d00009582bc1f01000000002f6d75746172745365646f6e2f0dd38a2f05e3ffff010000000000000077ffffe5053d8c490d2f6e6f64655374726174756d2f',
 'conflicts': 0,
 'hash': '00000000000000000e6ed6f2db8296aab547670f4cf23847fec76161723efc46',
 'miner': '',
 'payout': '25

Print and investigate the conflicts:

In [56]:
blocks["482059"]

{'addresses': ['1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ'],
 'attribution': '',
 'attributions': {'blockchain_info_address': {'matches': [{'addr_match': '1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ'}],
   'miner': 'Waterhole',
   'src': 'blockchain.info'},
  'blockchain_info_address_update': {'matches': [{'addr_match': '1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ'}],
   'miner': 'Waterhole',
   'src': 'blockchain.info'},
  'blockchain_info_marker': {'matches': [{'cb_match': '/BTC.COM/'}],
   'miner': 'BTC.com',
   'src': 'blockchain.info'},
  'blockchain_info_marker_update': {'matches': [{'cb_match': '/BTC.COM/'}],
   'miner': 'BTC.com',
   'src': 'blockchain.info'}},
 'cb': '030b5b0704465fa1592f4254432e434f4d2ffabe6d6dff48161efa0f0b44fc9463e380e6b585ff91ac3f9c973d156dfff35fd553cdb101000000000000000228124b7faf010000000000',
 'conflicts': 0,
 'hash': '000000000000000000ee874c85580ea9c9297d4fcf4fe0bf59b696d1dbd10290',
 'miner': '',
 'payout': '1701199633',
 'phash': None,
 'time': 1503747909}

In [57]:
blocks["482221"]

{'addresses': ['1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ'],
 'attribution': '',
 'attributions': {'blockchain_info_address': {'matches': [{'addr_match': '1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ'}],
   'miner': 'Waterhole',
   'src': 'blockchain.info'},
  'blockchain_info_address_update': {'matches': [{'addr_match': '1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ'}],
   'miner': 'Waterhole',
   'src': 'blockchain.info'},
  'blockchain_info_marker': {'matches': [{'cb_match': '/BTC.COM/'}],
   'miner': 'BTC.com',
   'src': 'blockchain.info'},
  'blockchain_info_marker_update': {'matches': [{'cb_match': '/BTC.COM/'}],
   'miner': 'BTC.com',
   'src': 'blockchain.info'}},
 'cb': '03ad5b0704a4e6a2592f4254432e434f4d2ffabe6d6d0dd4f6c7dac894c84821c9fed8c15a621583ada8e4177c182c011cab52e2d6d0010000000000000001292f8e92ae010000000000',
 'conflicts': 0,
 'hash': '000000000000000000d172ef46944db6127dbebe815664f26f37fef3e22fd65b',
 'miner': '',
 'payout': '1293786138',
 'phash': None,
 'time': 1503848099}

In [58]:
blocks["159929"]

{'addresses': ['1GG9HQZchCRxPSBV5SwZ9GoYEVq9vVLGqU'],
 'attribution': '',
 'attributions': {'blockchain_info': {'matches': [{'cb_match': 'ozco.in'}],
   'miner': 'OzCoin',
   'src': 'blockchain.info'},
  'blockchain_info_address_update': {'matches': [{'addr_match': '1GG9HQZchCRxPSBV5SwZ9GoYEVq9vVLGqU'}],
   'miner': 'Yourbtc.net',
   'src': 'blockchain.info'},
  'blockchain_info_marker': {'matches': [{'cb_match': 'ozco.in'}],
   'miner': 'OzCoin',
   'src': 'blockchain.info'},
  'blockchain_info_marker_update': {'matches': [{'cb_match': 'ozco.in'}],
   'miner': 'OzCoin',
   'src': 'blockchain.info'}},
 'cb': '70736a04ba760e1a0418930700522cfabe6d6dc15059fd58be57a512008aff27237100949e3afd1e7f7be1dc5cc5346f7169b401000000000000006f7a636f2e696eac1eeeed88',
 'conflicts': 0,
 'hash': '00000000000004a975cb9331f6264330b8b5d96db58b5917f41d721141efba9a',
 'miner': '',
 'payout': '5004514921',
 'phash': None,
 'time': 1325312734}

In [59]:
blocks["159846"]

{'addresses': ['1GG9HQZchCRxPSBV5SwZ9GoYEVq9vVLGqU'],
 'attribution': '',
 'attributions': {'blockchain_info': {'matches': [{'cb_match': 'ozco.in'}],
   'miner': 'OzCoin',
   'src': 'blockchain.info'},
  'blockchain_info_address_update': {'matches': [{'addr_match': '1GG9HQZchCRxPSBV5SwZ9GoYEVq9vVLGqU'}],
   'miner': 'Yourbtc.net',
   'src': 'blockchain.info'},
  'blockchain_info_marker': {'matches': [{'cb_match': 'ozco.in'}],
   'miner': 'OzCoin',
   'src': 'blockchain.info'},
  'blockchain_info_marker_update': {'matches': [{'cb_match': 'ozco.in'}],
   'miner': 'OzCoin',
   'src': 'blockchain.info'}},
 'cb': '70736a04ba760e1a040c9b0100522cfabe6d6dd5198c9a7f3796c9599592dc199792cc88c0a5588c2b9f5b491c1d8cbed5700601000000000000006f7a636f2e696eac1eeeed88',
 'conflicts': 0,
 'hash': '0000000000000afe20d9da4428d75588696ba4352931a386d9732ca74bfcfbc2',
 'miner': '',
 'payout': '5000300000',
 'phash': None,
 'time': 1325270885}

In [60]:
blocks["159964"]

{'addresses': ['1GG9HQZchCRxPSBV5SwZ9GoYEVq9vVLGqU'],
 'attribution': '',
 'attributions': {'blockchain_info': {'matches': [{'cb_match': 'ozco.in'}],
   'miner': 'OzCoin',
   'src': 'blockchain.info'},
  'blockchain_info_address_update': {'matches': [{'addr_match': '1GG9HQZchCRxPSBV5SwZ9GoYEVq9vVLGqU'}],
   'miner': 'Yourbtc.net',
   'src': 'blockchain.info'},
  'blockchain_info_marker': {'matches': [{'cb_match': 'ozco.in'}],
   'miner': 'OzCoin',
   'src': 'blockchain.info'},
  'blockchain_info_marker_update': {'matches': [{'cb_match': 'ozco.in'}],
   'miner': 'OzCoin',
   'src': 'blockchain.info'}},
 'cb': '70736a04ba760e1a04f9850000522cfabe6d6daa7399390129c5b877981adab8face0bd391539f3127cad753419a506405ced001000000000000006f7a636f2e696eac1eeeed88',
 'conflicts': 0,
 'hash': '00000000000004f1db65f2026e4d91e3f88644c2da9b213cee1b01848dc16ecd',
 'miner': '',
 'payout': '5000000000',
 'phash': None,
 'time': 1325333051}
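
The conflicts printed above occur when the address match and the coinbase marker match name different miners. The marker side of that check can be sketched as follows. This is a minimal illustration, not the implementation in `util.attribute_blocks`: `match_markers` and the small `markers` table are hypothetical, and `cb_hex` holds only the leading bytes of the coinbase of block 482059 shown above.

```python
import binascii

# Leading bytes of the coinbase script of block 482059, copied from the output above.
cb_hex = "030b5b0704465fa1592f4254432e434f4d2ffabe6d6d"

# Hypothetical marker table (marker -> miner); in the notebook this
# information lives in the "markers" section of each miner record.
markers = {"/BTC.COM/": "BTC.com", "ozco.in": "OzCoin"}

def match_markers(cb_hex, markers):
    """Return the sorted miners whose marker occurs in the decoded coinbase."""
    raw = binascii.unhexlify(cb_hex)
    return sorted({miner for marker, miner in markers.items()
                   if marker.encode("utf-8") in raw})

cb_miners = match_markers(cb_hex, markers)

# The address match reported for this block names a different miner,
# which is what gets recorded as a conflict.
addr_miner = "Waterhole"
conflict = bool(cb_miners) and addr_miner not in cb_miners
print(cb_miners, conflict)
```

Matching on the raw bytes rather than on a decoded string avoids decoding errors, since most of the coinbase script is not valid text.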

Print and check a random block:

In [61]:
util.get_sample(blocks)

{'addresses': ['1H1sq6Msgt9HjRrBuz8ieZdThzWXS6oPVA',
  '1P7ym2BcYePrhBrTwNKBe8zgettjtPUzuG'],
 'attribution': '',
 'attributions': {'blockchain_info': {'matches': [{'cb_match': 'KnCMiner'}],
   'miner': 'KnCMiner',
   'src': 'blockchain.info'},
  'blockchain_info_marker': {'matches': [{'cb_match': 'KnCMiner'}],
   'miner': 'KnCMiner',
   'src': 'blockchain.info'},
  'blockchain_info_marker_update': {'matches': [{'cb_match': 'KnCMiner'}],
   'miner': 'KnCMiner',
   'src': 'blockchain.info'},
  'blockchain_info_update': {'matches': [{'cb_match': 'KnCMiner'}],
   'miner': 'KnCMiner',
   'src': 'blockchain.info'}},
 'cb': '0363c204184b6e434d696e6572422d5031c4654b832a94115b53cdbcacf80f00005a730200',
 'conflicts': 0,
 'hash': '00000000000000002e7db7f737b81db0965bd17aa7886683bb22ceceef24d75c',
 'miner': '',
 'payout': '2508543256',
 'phash': None,
 'time': 1405992108}

Check how many addresses have been added by marker detection:

In [48]:
def check_for_miner_addresses_from_markers(miners):
    # check for miners addresses added based on markers 
    i = 0 
    for miner in miners:
        for addr in miners[ miner ][ util.D_ADDRESSES ]:
            if util.DD_SOURCES in miners[ miner ][ util.D_ADDRESSES ][ addr ]:
                if "cb marker" in miners[ miner ][ util.D_ADDRESSES ][ addr ][ util.DD_SOURCES ]:
                    #print("miner: ",miner," addr: ", addr)
                    i += 1
            #else:
                #print(miners[ miner ])
                #break
    return i
            

In [49]:
print("Added addresses by marker: ",util.check_for_miner_addresses_from_markers(miners_initial_blockchaininfo)) # 6021

Added addresses by marker:  6021


In [50]:
assert util.check_for_miner_addresses_from_markers(miners_initial_blockchaininfo) == 6021

Check if these additions have caused address collisions:

In [51]:
def check_for_obvious_address_collisions(miners):
    # check for obvious address collisions:
    addr_collision = False
    address_conflicts = Counter()

    for m in miners:
        for a in miners[ m ][ util.D_ADDRESSES ].keys():
            address_conflicts[ a ] += 1

    for at in address_conflicts.most_common():
        if at[1] > 1:
            #address_conflicts_list.append( at[0] )
            print(at[0],":",at[1])
            addr_collision = True
    return addr_collision

In [52]:
assert util.check_for_obvious_address_collisions(miners_initial_blockchaininfo) == False

In [65]:
util.get_sample(miners_initial_blockchaininfo)

{'addresses': {},
 'markers': {'Mined by MultiCoin.co': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github']}},
 'names': {'MultiCoin.co': {'currencies': ['BTC'],
   'firstUsed': 0,
   'fullName': '',
   'lastUsed': 0,
   'sources': ['blockchain.info github'],
   'url': 'http://multicoin.co'}}}

## Fresh btccom only attribution and create miners_initial_btccom.json

Attribute all blocks based solely on the btccom attribution.

In [53]:
# Bootstrap pools files with blockchain.info github data
# https://github.com/blockchain/Blockchain-Known-Pools/tree/82ed31956388e3950845cc2faeaf6679a057ee5b
miners_initial_btccom = dict()
miners_initial_btccom.clear()

miners_initial_btccom = add_makers_from_source(pools_btccom,miners_initial_btccom,source=["btccom github",])

Multiple markers for pool: OzCoin
Multiple markers for pool: TripleMining
Multiple markers for pool: Polmine
Multiple markers for pool: AntPool
Multiple markers for pool: AntPool
Multiple markers for pool: BTCC
Multiple markers for pool: BTCC
Multiple markers for pool: BTCC
Multiple markers for pool: BWPool
Multiple markers for pool: BitFury
Multiple markers for pool: ViaBTC
Multiple markers for pool: CanoePool
Multiple markers for pool: PHash.IO


In [54]:
miners_initial_btccom = add_addresses_from_source(pools_btccom,miners_initial_btccom,source=["btccom github",])

CONFLICTING URL: 'https://www.bitfarms.io/'
CONFLICTING URL: 'https://www.bitarms.io/'
new pool: BTC Nuggets
new pool: Huobi
new pool: CloudHashing
new pool: digitalX Mintsy
new pool: Telco 214
new pool: BTC Pool Party
new pool: Multipool
new pool: transactioncoinmining
new pool: BTCDig
new pool: Tricky's BTC Pool
new pool: BTCMP
new pool: Eobot
new pool: UNOMP
new pool: Patels
new pool: GoGreenLight
new pool: BitcoinIndia
CONFLICTING URL: 'https://btc.canoepool.com/'
CONFLICTING URL: 'https://www.canoepool.com/'
new pool: EkanemBTC
new pool: CANOE
new pool: tiger


In [68]:
# rename miners to match the names used by blockchain.info:
# create a new empty miner and move the existing one into it
# 'BTCC' => 'BTCC Pool'
util.add_miner('BTCC Pool',miners_initial_btccom)
util.unify_miners('BTCC','BTCC Pool',miners_initial_btccom)
print()




In [55]:
# 'BWPool' => 'BW.COM'
util.add_miner('BW.COM',miners_initial_btccom)
util.unify_miners('BWPool','BW.COM',miners_initial_btccom)
print()





In [56]:
# 'Solo CK' => 'Solo CKPool'
util.add_miner('Solo CKPool',miners_initial_btccom)
util.unify_miners('Solo CK','Solo CKPool',miners_initial_btccom)
print()




In [57]:
# 'BitClub' => 'BitClub Network'
util.add_miner('BitClub Network',miners_initial_btccom)
util.unify_miners('BitClub','BitClub Network',miners_initial_btccom)
print()




In [58]:
# 'Huobi.pool' => 'Huobi'
#util.add_miner('Huobi',miners_initial_btccom) # already exists 
util.unify_miners('Huobi.pool','Huobi',miners_initial_btccom)
print()




In [59]:
# 'sigmapool.com' => 'SigmaPool.com'
util.add_miner('SigmaPool.com',miners_initial_btccom)  
util.unify_miners('sigmapool.com','SigmaPool.com',miners_initial_btccom)
print()




In [60]:
# 'EXX&BW' => 'xbtc.exx.com&bw.com'
util.add_miner('xbtc.exx.com&bw.com',miners_initial_btccom) 
util.unify_miners('EXX&BW','xbtc.exx.com&bw.com',miners_initial_btccom)
print()




In [61]:
# 'haominer' => 'Haominer'
util.add_miner('Haominer',miners_initial_btccom) # already exists 
util.unify_miners('haominer','Haominer',miners_initial_btccom)
print()




In [62]:
# "Patels" => "Patel's Mining pool"
util.add_miner("Patel's Mining pool",miners_initial_btccom) # already exists 
util.unify_miners("Patels","Patel's Mining pool",miners_initial_btccom)
print()




In [63]:
# 'digitalX Mintsy' => 'digitalBTC'
util.add_miner('digitalBTC',miners_initial_btccom) # already exists 
util.unify_miners('digitalX Mintsy','digitalBTC',miners_initial_btccom)
print()




In [64]:
# 'NiceHash' => 'NiceHash Solo'
util.add_miner('NiceHash Solo',miners_initial_btccom) # already exists 
util.unify_miners('NiceHash','NiceHash Solo',miners_initial_btccom)
print()




In [65]:
miners_initial_btccom["DCExploration"]

{'names': {'DCExploration': {'url': 'http://dcexploration.cn',
   'currencies': ['BTC'],
   'fullName': '',
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'markers': {'/DCExploration/': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'addresses': {}}

In [66]:
miners_initial_btccom["DCEX"]

{'names': {'DCEX': {'url': 'http://dcexploration.cn',
   'currencies': ['BTC'],
   'fullName': '',
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'markers': {'/DCEX/': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'addresses': {}}

In [67]:
util.unify_miners("DCEX","DCExploration",miners_initial_btccom)
print()
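
The renames and merges above all follow one pattern: the names, markers and addresses of one miner record are moved into another, and the old key disappears. A minimal sketch of such a merge is given below, assuming the record layout shown in the outputs of this notebook. `merge_miner_records` is a hypothetical helper for illustration and is not the implementation of `util.unify_miners`; the demo records are trimmed copies of the `DCEX` and `DCExploration` entries printed above.

```python
def merge_miner_records(src, dst, miners):
    """Move names, markers and addresses of miners[src] into miners[dst].

    Hypothetical helper for illustration only.
    """
    for section in ("names", "markers", "addresses"):
        target = miners[dst].setdefault(section, {})
        for key, entry in miners[src].get(section, {}).items():
            if key in target:
                # Same key on both sides: keep dst's entry, union the sources.
                existing = target[key].setdefault("sources", [])
                for s in entry.get("sources", []):
                    if s not in existing:
                        existing.append(s)
            else:
                target[key] = entry
    # The source record is dropped once its content has been moved.
    del miners[src]
    return miners

miners_demo = {
    "DCExploration": {
        "names": {"DCExploration": {"sources": ["btccom github"]}},
        "markers": {"/DCExploration/": {"sources": ["btccom github"]}},
        "addresses": {}},
    "DCEX": {
        "names": {"DCEX": {"sources": ["btccom github"]}},
        "markers": {"/DCEX/": {"sources": ["btccom github"]}},
        "addresses": {}},
}

merge_miner_records("DCEX", "DCExploration", miners_demo)
print(sorted(miners_demo["DCExploration"]["markers"]))
```

Keeping the old name inside the `names` section of the target is what allows later lookups by either name to resolve to the same miner.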




In [68]:
miners_initial_btccom["Bitcoin India"]

{'names': {'Bitcoin India': {'url': 'https://bitcoin-india.org',
   'currencies': ['BTC'],
   'fullName': '',
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'markers': {'/Bitcoin-India/': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'addresses': {}}

In [69]:
miners_initial_btccom["BitcoinIndia"]

{'names': {'BitcoinIndia': {'url': 'https://pool.bitcoin-india.org/',
   'currencies': ['BTC'],
   'fullName': '',
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'markers': {},
 'addresses': {'1AZ6BkCo4zgTuuLpRStJH8iNsehXTMp456': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}}}

In [70]:
util.unify_miners("BitcoinIndia","Bitcoin India",miners_initial_btccom)
print()




In [71]:
miners_initial_btccom["CANOE"]

{'names': {'CANOE': {'url': 'https://www.canoepool.com',
   'currencies': ['BTC'],
   'fullName': '',
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'markers': {},
 'addresses': {'1Afcpc2FpPnREU6i52K3cicmHdvYRAH9Wo': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}}}

In [72]:
miners_initial_btccom["CanoePool"]

{'names': {'CanoePool': {'url': 'https://www.canoepool.com/',
   'currencies': ['BTC'],
   'fullName': '',
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'markers': {'/canoepool/': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']},
  '/CANOE/': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'addresses': {'1GP8eWArgpwRum76saJS4cZKCHWJHs9PQo': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}}}

In [73]:
util.unify_miners("CanoePool","CANOE",miners_initial_btccom)
print()




In [74]:
util.get_sample(miners_initial_btccom)

{'names': {'Bitsolo': {'url': 'http://bitsolo.net/',
   'currencies': ['BTC'],
   'fullName': '',
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'markers': {'Bitsolo Pool': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'addresses': {'18zRehBcA2YkYvsC7dfQiFJNyjmWvXsvon': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}}}

In [75]:
with open(miners_initial_btccom_json_file, 'w') as outfile:
    json.dump(miners_initial_btccom, outfile)

In [76]:
# this dict will be updated when coinbase matches are found and there is only one address 
# as the coinbase output
miners = dict()
miners.clear()
miners = copy.deepcopy(miners_initial_btccom)

In [77]:
# attribute blocks to miners according to btccom initial mapping
(blocks,miners_initial_btccom,conflicts) = util.attribute_blocks(blocks=blocks,
                                   miners_dict=miners_initial_btccom,
                                   addr_attr=util.DD_BTCCOM_ADDR_ATTR,
                                   marker_attr=util.DD_BTCCOM_MARK_ATTR,
                                   both_attr=util.DD_BTCCOM_ATTR,
                                   source="btccom",
                                   override=True)

482000
Message    =  Addr and Cb match differ
Blockheight=  482059
Miner1     =  Waterhole
Miner2     =  BTC.com
Coinbase   =  030b5b0704465fa1592f4254432e434f4d2ffabe6d6dff48161efa0f0b44fc9463e380e6b585ff91ac3f9c973d156dfff35fd553cdb101000000000000000228124b7faf010000000000
CoinbaseStr=  b'\x03\x0b[\x07\x04F_\xa1Y/BTC.COM/\xfa\xbemm\xffH\x16\x1e\xfa\x0f\x0bD\xfc\x94c\xe3\x80\xe6\xb5\x85\xff\x91\xac?\x9c\x97=\x15m\xff\xf3_\xd5S\xcd\xb1\x01\x00\x00\x00\x00\x00\x00\x00\x02(\x12K\x7f\xaf\x01\x00\x00\x00\x00\x00'
Addesses   =  1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ
addr_match =  [('Waterhole', {'addr_match': '1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ'})]
cb_match   =  [('BTC.com', {'cb_match': '/BTC.COM/'})]

Message    =  Addr and Cb match differ
Blockheight=  482221
Miner1     =  Waterhole
Miner2     =  BTC.com
Coinbase   =  03ad5b0704a4e6a2592f4254432e434f4d2ffabe6d6d0dd4f6c7dac894c84821c9fed8c15a621583ada8e4177c182c011cab52e2d6d0010000000000000001292f8e92ae010000000000
CoinbaseStr=  b'\x03\xad[\x

In [79]:
len(conflicts) # 3

3

In [93]:
assert len(conflicts) == 3

In [94]:
blocks["524045"]

{'addresses': ['165GCEAx81wce33FWEnPCRhdjcXCrBJdKn'],
 'attribution': '',
 'attributions': {'blockchain_info_address_update': {'matches': [{'addr_match': '165GCEAx81wce33FWEnPCRhdjcXCrBJdKn'}],
   'miner': 'BitcoinRussia',
   'src': 'blockchain.info'},
  'blockchain_info_update': {'matches': [{'addr_match': '165GCEAx81wce33FWEnPCRhdjcXCrBJdKn'}],
   'miner': 'BitcoinRussia',
   'src': 'blockchain.info'},
  'btccom_address': {'matches': [{'addr_match': '165GCEAx81wce33FWEnPCRhdjcXCrBJdKn'}],
   'miner': 'BitcoinRussia',
   'src': 'btccom'},
  'btccom_marker': {'matches': [{'cb_match': '/Bitcoin-Ukraine.com.ua/'}],
   'miner': 'Bitcoin-Ukraine',
   'src': 'btccom'}},
 'cb': '030dff0704bbbe055b08fabe6d6d5161e32de4ba2440240366a8777307840c73e448a8e5239e663916c17e0be144010000000000000078054a364b5fe107182f426974636f696e2d556b7261696e652e636f6d2e75612f',
 'conflicts': 0,
 'hash': '00000000000000000038b67cc69238ebc13b7c63410200d03b67be05ed8af0e0',
 'miner': '',
 'payout': '1265971913',
 'phash'

In [81]:
# attribute blocks to miners according to btccom initial mapping
# update initial mapping when coinbase marker matches and there is only one output address of coinbase tx
(blocks,miners_initial_btccom,conflicts) = util.attribute_blocks(blocks=blocks,
                                   miners_dict=miners_initial_btccom,
                                   addr_attr=util.DD_BTCCOM_ADDR_ATTR + "_update",
                                   marker_attr=util.DD_BTCCOM_MARK_ATTR + "_update",
                                   both_attr=util.DD_BTCCOM_ATTR + "_update",
                                   source="btccom",
                                   override=True,
                                   update=True)

In [82]:
len(conflicts)

388

In [83]:
i = 0 
for c in conflicts:
    if c[ "miner1" ] == "CANOE" or c[ "miner2"] == "CANOE":
        i += 1
i

322
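
The count above singles out one miner. To see which pairs of miners account for the conflicts overall, the same loop can be generalized into a tally per pair. The sketch below assumes only that each conflict entry carries the `miner1` and `miner2` keys used in the cell above; `tally_conflict_pairs` is a hypothetical helper, and the demo entries are illustrative rather than taken from the real `conflicts` list.

```python
from collections import Counter

def tally_conflict_pairs(conflicts):
    """Count conflicts per unordered (miner1, miner2) pair."""
    pairs = Counter()
    for c in conflicts:
        # Sort the two names so that (A, B) and (B, A) share one counter.
        pairs[tuple(sorted((c["miner1"], c["miner2"])))] += 1
    return pairs

# Illustrative entries only; real entries may carry additional keys.
conflicts_demo = [
    {"miner1": "Waterhole", "miner2": "BTC.com"},
    {"miner1": "BTC.com", "miner2": "Waterhole"},
    {"miner1": "BitcoinRussia", "miner2": "Bitcoin-Ukraine"},
]

tally = tally_conflict_pairs(conflicts_demo)
for pair, count in tally.most_common():
    print(pair, count)
```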

In [85]:
# attribute blocks to miners according to btccom initial mapping 
# updated with single addresses on which coinbase marker matched 
(blocks,miners_initial_btccom,conflicts) = util.attribute_blocks(blocks=blocks,
                                   miners_dict=miners_initial_btccom,
                                   addr_attr=util.DD_BTCCOM_ADDR_ATTR + "_update",
                                   marker_attr=util.DD_BTCCOM_MARK_ATTR + "_update",
                                   both_attr=util.DD_BTCCOM_ATTR + "_update",
                                   source="btccom",
                                   override=True,
                                   update=True)

In [86]:
len(conflicts)

388

In [87]:
with open(miners_initial_btccom_conflicts_json_file, 'w') as outfile:
    json.dump(conflicts, outfile)

In [88]:
print("Added addresses by marker: ",util.check_for_miner_addresses_from_markers(miners_initial_btccom)) # 6067

Added addresses by marker:  6067


In [104]:
assert util.check_for_miner_addresses_from_markers(miners_initial_btccom) == 6067

In [89]:
assert util.check_for_obvious_address_collisions(miners_initial_btccom) == False

In [106]:
util.get_sample(blocks)

{'addresses': ['1DDVcMv7DqcThcFDmqD9aLUkRByMDuAm5v'],
 'attribution': '',
 'attributions': {},
 'cb': '04d21c081b0147',
 'conflicts': 0,
 'hash': '00000000000069c0cd2c3ed74f04cdc43f00870f583d0d670e8d0c9f278ff481',
 'miner': '',
 'payout': '5000000000',
 'phash': None,
 'time': 1291440815}

## Miner unification between blockchain.info and btccom and enrich miners file from btccom

In [90]:
# Bootstrap pools files with blockchain.info github data
# https://github.com/blockchain/Blockchain-Known-Pools/tree/82ed31956388e3950845cc2faeaf6679a057ee5b
miners = dict()
miners.clear()

miners = add_makers_from_source(pools_bc,miners,source=["blockchain.info github",])

Multiple markers for pool: OzCoin
Multiple markers for pool: TripleMining
Multiple markers for pool: Polmine
Multiple markers for pool: AntPool
Multiple markers for pool: BTCC Pool
Multiple markers for pool: BTCC Pool
Multiple markers for pool: BTCC Pool
Multiple markers for pool: BitFury
Multiple markers for pool: ViaBTC
Multiple markers for pool: PHash.IO


In [91]:
diffs = list(dictdiffer.diff(pools_bc,pools_btccom))
for diff in diffs:
    if diff[0] == "change":
        if "name" in diff[1]:
            #pprint.pprint(diff)
            util.add_name(miner=diff[2][0],miners=miners,name=diff[2][1],source="btccom github",currencies=["BTC",])

In [92]:
assert util.get_miner_id_by_name(miners,'NiceHash') == 'NiceHash Solo'
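
The assertion above relies on a reverse lookup from any known name to the key of the miner record. A minimal sketch of how such a lookup could work is shown below, assuming the `names` section layout seen in the samples. `find_miner_id_by_name` is hypothetical and may differ from `util.get_miner_id_by_name`.

```python
def find_miner_id_by_name(miners, name):
    """Return the key of the miner whose 'names' section contains name, else None."""
    for miner_id, record in miners.items():
        if name in record.get("names", {}):
            return miner_id
    return None

# Trimmed, illustrative record: one miner known under two names.
miners_demo = {
    "NiceHash Solo": {
        "names": {"NiceHash Solo": {}, "NiceHash": {}},
        "markers": {},
        "addresses": {}},
}

print(find_miner_id_by_name(miners_demo, "NiceHash"))
```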

In [93]:
miners = add_makers_from_source(pools_btccom,miners,source=["btccom github",])

duplicate name in markers: CanoePool - CANOE
Multiple markers for pool: CANOE
Same marker for pool: BTC.TOP marker:/BTC.TOP/
Same marker for pool: Bitcoin.com marker:pool.bitcoin.com
Same marker for pool: 175btc marker:Mined By 175btc.com
Same marker for pool: GBMiners marker:/mined by gbminers/
Same marker for pool: A-XBT marker:/A-XBT/
Same marker for pool: ASICMiner marker:ASICMiner
Same marker for pool: BitMinter marker:BitMinter
Same marker for pool: BitcoinRussia marker:/Bitcoin-Russia.ru/
Same marker for pool: BTCServ marker:btcserv
Same marker for pool: simplecoin.us marker:simplecoin
Same marker for pool: BTC Guild marker:BTC Guild
Same marker for pool: Eligius marker:Eligius
Same marker for pool: OzCoin marker:ozco.in
Same marker for pool: OzCoin marker:ozcoin
Same marker for pool: EclipseMC marker:EMC
Same marker for pool: MaxBTC marker:MaxBTC
Same marker for pool: TripleMining marker:triplemining
Same marker for pool: TripleMining marker:Triplemining.com
Same marker for poo

In [94]:
assert "BWPool" in miners["BW.COM"][ util.D_NAMES ].keys()

In [95]:
miners = add_addresses_from_source(pools_bc,miners,source=["blockchain.info github",])

new pool: BTC Nuggets
new pool: EkanemBTC
new pool: Huobi
new pool: CloudHashing
new pool: digitalX Mintsy
new pool: Telco 214
new pool: BTC Pool Party
new pool: Multipool
new pool: transactioncoinmining
new pool: BTCDig
new pool: Tricky's BTC Pool
new pool: BTCMP
new pool: Eobot
new pool: UNOMP
new pool: Patel's Mining pool
new pool: GoGreenLight


In [96]:
miners["Huobi.pool"]

{'names': {'Huobi.pool': {'url': 'https://www.poolhb.com/',
   'currencies': ['BTC'],
   'fullName': '',
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github', 'blockchain.info github']}},
 'markers': {'/Huobi/': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github', 'blockchain.info github']}},
 'addresses': {}}

In [97]:
miners["Huobi"]

{'names': {'Huobi': {'url': 'http://www.huobi.com',
   'currencies': ['BTC'],
   'fullName': '',
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github']}},
 'markers': {},
 'addresses': {'3HuobiNg2wHjdPU2mQczL9on8WF7hZmaGd': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github']}}}

In [98]:
util.unify_miners("Huobi.pool","Huobi",miners)
print()




In [99]:
miners = add_addresses_from_source(pools_btccom,miners,source=["btccom github",])

CONFLICTING URL: 'https://www.bitfarms.io/'
CONFLICTING URL: 'https://www.bitarms.io/'
duplicate name in addresses: BWPool - BW.COM
CONFLICTING URL: 'https://bwpool.net/'
CONFLICTING URL: 'https://bw.com'
duplicate name in addresses: BitClub - BitClub Network
duplicate name in addresses: BTCC - BTCC Pool
new pool: Patels
duplicate name in addresses: BitcoinIndia - Bitcoin India
CONFLICTING URL: 'https://pool.bitcoin-india.org/'
CONFLICTING URL: 'https://bitcoin-india.org'
duplicate name in addresses: CanoePool - CANOE
CONFLICTING URL: 'https://btc.canoepool.com/'
CONFLICTING URL: 'https://www.canoepool.com'
new pool: tiger


In [117]:
miners["DCExploration"]

{'addresses': {},
 'markers': {'/DCExploration/': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github', 'btccom github']}},
 'names': {'DCExploration': {'currencies': ['BTC'],
   'firstUsed': 0,
   'fullName': '',
   'lastUsed': 0,
   'sources': ['blockchain.info github', 'btccom github'],
   'url': ''}}}

In [118]:
miners["DCEX"]

{'addresses': {},
 'markers': {'/DCEX/': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github', 'blockchain.info github']}},
 'names': {'DCEX': {'currencies': ['BTC'],
   'firstUsed': 0,
   'fullName': '',
   'lastUsed': 0,
   'sources': ['btccom github', 'blockchain.info github'],
   'url': 'http://dcexploration.cn'}}}

In [100]:
util.unify_miners("DCEX","DCExploration",miners)
print()




In [120]:
miners["sigmapool.com"]

{'addresses': {},
 'markers': {'/SigmaPool.com/': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github', 'blockchain.info github']}},
 'names': {'sigmapool.com': {'currencies': ['BTC'],
   'firstUsed': 0,
   'fullName': '',
   'lastUsed': 0,
   'sources': ['btccom github', 'blockchain.info github'],
   'url': 'https://sigmapool.com'}}}

In [121]:
miners["SigmaPool.com"]

{'addresses': {},
 'markers': {'SigmaPool.com': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github', 'btccom github']}},
 'names': {'SigmaPool.com': {'currencies': ['BTC'],
   'firstUsed': 0,
   'fullName': '',
   'lastUsed': 0,
   'sources': ['blockchain.info github', 'btccom github'],
   'url': 'https://www.sigmapool.com/'}}}

In [101]:
util.unify_miners("sigmapool.com","SigmaPool.com",miners)
print()




In [123]:
miners["tiger"]

{'addresses': {'1LsFmhnne74EmU4q4aobfxfrWY4wfMVd8w': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'markers': {},
 'names': {'tiger': {'currencies': ['BTC'],
   'firstUsed': 0,
   'fullName': '',
   'lastUsed': 0,
   'sources': ['btccom github'],
   'url': ''}}}

In [124]:
miners["Patels"]

{'addresses': {'197miJmttpCt2ubVs6DDtGBYFDroxHmvVB': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']},
  '19RE4mz2UbDxDVougc6GGdoT4x5yXxwFq2': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['btccom github']}},
 'markers': {},
 'names': {'Patels': {'currencies': ['BTC'],
   'firstUsed': 0,
   'fullName': '',
   'lastUsed': 0,
   'sources': ['btccom github'],
   'url': 'http://patelsminingpool.com/'}}}

In [102]:
assert "BitcoinIndia" in miners["Bitcoin India"][ util.D_NAMES ].keys()
assert "CanoePool" in miners["CANOE"][ util.D_NAMES ].keys()

In [126]:
util.get_sample(miners)

{'addresses': {'18cBEMRxXHqzWWCxZNtU91F5sbUNKhL5PX': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github', 'btccom github']}},
 'markers': {'/ViaBTC/': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github', 'btccom github']},
  'viabtc.com deploy': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github', 'btccom github']}},
 'names': {'ViaBTC': {'currencies': ['BTC'],
   'firstUsed': 0,
   'fullName': '',
   'lastUsed': 0,
   'sources': ['blockchain.info github', 'btccom github'],
   'url': 'https://viabtc.com'}}}

## Miner unification between blockchain.info and blocktrail.com

In [106]:
with open(blocks_blocktrail_json_file, 'r') as fp:
    blocks_blocktrail = json.load(fp)

In [107]:
blocks_blocktrail["400000"]

{'miner': 'BW Pool', 'src': 'blocktrail.com'}

Try to match every pool/miner name in the *blocktrail.com* attribution to a name from the *blockchain.info* attribution

In [108]:
miners_bci = set()
miners_bt = set()
miners_both = set()

for blknum in blocks:
    if blknum in blocks_blocktrail.keys():
        # check if there is a blocktrail.com attribution for this block
        if ( util.DD_BCI_ATTR in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys() or
             util.DD_BCI_MARK_ATTR in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys() or
             util.DD_BCI_ADDR_ATTR in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys() ):
            # check if there is a blockchain.info attribution already
            
            # assign blockchain.info attribution:
            if util.DD_BCI_ATTR in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys():
                miner_bci = blocks[ blknum ][ util.D_ATTRIBUTIONS ][ util.DD_BCI_ATTR ][ util.DDD_MINER ]
                src_bci = blocks[ blknum ][ util.D_ATTRIBUTIONS ][ util.DD_BCI_ATTR ][ util.DDD_SRC ]
            elif util.DD_BCI_MARK_ATTR in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys():
                miner_bci = blocks[ blknum ][ util.D_ATTRIBUTIONS ][ util.DD_BCI_MARK_ATTR ][ util.DDD_MINER ]
                src_bci = blocks[ blknum ][ util.D_ATTRIBUTIONS ][ util.DD_BCI_MARK_ATTR ][ util.DDD_SRC ]
            elif util.DD_BCI_ADDR_ATTR in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys():
                miner_bci = blocks[ blknum ][ util.D_ATTRIBUTIONS ][ util.DD_BCI_ADDR_ATTR ][ util.DDD_MINER ]
                src_bci = blocks[ blknum ][ util.D_ATTRIBUTIONS ][ util.DD_BCI_ADDR_ATTR ][ util.DDD_SRC ]
            
            # assign blocktrail.com attribution:
            miner_bt = blocks_blocktrail[ blknum ][ util.DDD_MINER ]
            src_bt = blocks_blocktrail[ blknum ][ util.DDD_SRC ]
            
            if miner_bt == "unknown" or miner_bt == "Unknown Entity":
                # only try to match known miners 
                continue
            
            # try some simple name matching of pool/miner:
            if miner_bci == miner_bt:
                miners_both.add(miner_bci)
                
            elif miner_bci.lower() == miner_bt.lower():
                miners_both.add(miner_bci.lower())
                
                util.add_name( miner_bci, miners, miner_bci.lower(), "blocktrail.com" )
                util.add_name( miner_bci, miners, miner_bt, "blocktrail.com" ) 
                
            elif re.sub('[^a-z0-9]','',miner_bci.lower() ) == re.sub('[^a-z0-9]','',miner_bt.lower() ):
                miners_both.add( re.sub('[^a-z0-9]', '', miner_bci.lower() ) )
                
                util.add_name( miner_bci, miners, re.sub('[^a-z0-9]','',miner_bci.lower() ), "blocktrail.com" )
                util.add_name( miner_bci, miners, miner_bt, "blocktrail.com" )
                
            elif ( re.sub( 'pool', '', re.sub('[^a-z0-9]','',miner_bci.lower() ) ) == 
                   re.sub( 'pool', '', re.sub('[^a-z0-9]','',miner_bt.lower() ) ) ):
                miners_both.add( re.sub( 'pool', '', re.sub('[^a-z0-9]', '', miner_bci.lower() ) ) )
                
                util.add_name( miner_bci, miners, re.sub( 'pool', '', re.sub('[^a-z0-9]','',miner_bci.lower() ) ), "blocktrail.com" )
                util.add_name( miner_bci, miners, miner_bt, "blocktrail.com" )
            else:
                # if no name has matched yet and no other miner matches directly, add both names to a list
                # to resolve manually later
                if miner_bt not in miners.keys():
                    miners_bci.add(miner_bci)
                    miners_bt.add(miner_bt)
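
The matching cascade above compares the two names exactly, then lowercased, then with all non-alphanumeric characters removed, and finally with the substring `pool` removed. A minimal sketch of the same idea as a reusable helper follows; the function names `name_keys` and `match_level` and the pool names `Acme`/`ACME` are hypothetical and not part of `util`:

```python
import re

def name_keys(name):
    """Return progressively looser comparison keys for a pool name."""
    lowered = name.lower()
    alnum = re.sub(r'[^a-z0-9]', '', lowered)   # keep only a-z and 0-9
    no_pool = re.sub('pool', '', alnum)         # drop the word "pool"
    return [name, lowered, alnum, no_pool]

def match_level(name_a, name_b):
    """Return the first level at which both names agree, or None if none does."""
    for level, (key_a, key_b) in enumerate(zip(name_keys(name_a), name_keys(name_b))):
        if key_a == key_b:
            return level
    return None

# illustrative (made-up) names, one per level
print(match_level("Acme", "Acme"))            # 0: exact match
print(match_level("AcMe", "Acme"))            # 1: case-insensitive match
print(match_level("Acme.Net", "acme net"))    # 2: alphanumeric-only match
print(match_level("Acme Pool", "ACME"))       # 3: match after removing "pool"
# a pair from the lists below that needs a manual fix
print(match_level("Huobi", "Avalon + Huobi")) # None
```

A `None` result corresponds to the `else` branch of the cell above, where both names are collected for manual resolution.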


In [109]:
len(sorted(miners_both))

59

In [131]:
assert len(sorted(miners_both)) == 59

In [110]:
sorted(miners_bci)

['BW.COM',
 'Bitcoin Affiliate Network',
 'Bixin',
 'F2Pool',
 'Huobi',
 'KanoPool',
 'Yourbtc.net',
 'simplecoin.us',
 'xbtc.exx.com&bw.com']

In [111]:
assert len(miners_bci) == 9

In [112]:
sorted(miners_bt)

['Avalon + Huobi',
 'BTCC',
 'BW Pool',
 'BitAffNet',
 'DiscusFish / F2Pool',
 'EXX & BW',
 'HaoBTC',
 'Kano CKPool',
 'Simplecoin',
 'YourBTC']

In [113]:
assert len(miners_bt) == 10

Manually fix pool/miner names that are not recognized automatically by adding them to the *names* dict of a *miner*:

In [114]:
# manual miner name fixes

# merge pool names:
util.add_name( "Huobi", miners, 'Avalon + Huobi', "blocktrail.com" )

util.add_name( "BTCC Pool", miners, 'BTCC', "blocktrail.com" )

util.add_name( "BW.COM", miners, 'BW Pool', "blocktrail.com" )

util.add_name( "Bitcoin Affiliate Network", miners, 'BitAffNet', "blocktrail.com" )

util.add_name( "CANOE", miners, 'Canoe Pool', "blocktrail.com" )

util.add_name( "F2Pool", miners, 'DiscusFish / F2Pool', "blocktrail.com" )
util.add_name( "F2Pool", miners, 'DiscusFish', "blocktrail.com" )

util.add_name( "xbtc.exx.com&bw.com", miners, "EXX & BW", "blocktrail.com" )

util.add_name( "Bixin", miners, "HaoBTC", "blocktrail.com" )
names_dict= { "HaoBTC": {  util.DD_URL:"https://haobtc.com/",
                           util.DD_CURRENCIES: ["BTC",],
                           util.DD_FULLNAME: "",
                           util.DD_FIRSTUSED: 0,
                           util.DD_LASTUSED: 0,
                           util.DD_SOURCES:["blocktrail.com",] } }
util.add_miner( "Bixin", miners, names_dict, update=True )

util.add_name( "KanoPool", miners, "Kano CKPool", "blocktrail.com" )

util.add_name( "simplecoin.us", miners, "Simplecoin", "blocktrail.com" )

util.add_name( "Yourbtc.net", miners, "YourBTC", "blocktrail.com" )

# add non-matched pool name manually
util.add_name( "digitalX Mintsy", miners, "digitalX Mintsy", "blocktrail.com" )


{'names': {'digitalX Mintsy': {'url': 'https://www.mintsy.co',
   'currencies': ['BTC'],
   'fullName': '',
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github', 'btccom github', 'blocktrail.com']}},
 'markers': {},
 'addresses': {'1NY15MK947MLzmPUa2gL7UgyR8prLh2xfu': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github', 'btccom github', 'blocktrail.com']}}}

In [115]:
assert "KNCMiner" in miners["KnCMiner"][ util.D_NAMES ].keys()
assert "kncminer" in miners["KnCMiner"][ util.D_NAMES ].keys()

In [116]:
with open(miners_initial_json_file, 'w') as outfile:
    json.dump(miners, outfile)

In [117]:
miners_initial = copy.deepcopy(miners)

## blocktrail.com attribution with unified miner ids
Run the blocktrail.com attribution again, this time mapping each blocktrail.com name to its unified miner id before attributing the block.

In [118]:
unknown_counter = 0
attribution_counter = 0 

for blknum in blocks:
    # iterate over all blocks once
    if blknum in blocks_blocktrail.keys():
        # check if block was attributed by blocktrail
        miner_bt = blocks_blocktrail[ blknum ][ util.DDD_MINER ]
        if miner_bt == "unknown" or miner_bt == "Unknown Entity":
            unknown_counter +=1 
            continue
            
        miner_uid = miner_bt
        mapped = False
        for miner_id in miners:
            # check if there is a unified miner id for this miner already in miners.json
            if miner_bt in miners[ miner_id ][ util.D_NAMES ].keys():
                # unified miner_id found
                miner_uid = miner_id
                mapped = True
                break
                
        if not mapped and len( blocks[ blknum ][ util.D_ADDRESSES ] ) == 1:
            # if miner name could not be mapped to list of existing miner names
            # it is probably a new miner and gets added as such
            address = blocks[ blknum ][ util.D_ADDRESSES ][0]
            print(miner_uid,":",address)
            attribution_counter += 1
            util.add_miner(miner_uid,
                     miners,
                     names_dict= { miner_uid: { util.DD_URL:"",
                                            util.DD_CURRENCIES: ["BTC",],
                                            util.DD_FULLNAME: "",
                                            util.DD_FIRSTUSED: 0,
                                            util.DD_LASTUSED: 0,
                                            util.DD_SOURCES:["blocktrail.com",] } },
                     addresses_dict= { address: { util.DD_CURRENCIES: ["BTC",],
                                              util.DD_FIRSTUSED: 0,
                                              util.DD_LASTUSED: 0,
                                              util.DD_SOURCES:["blocktrail.com",] } }, )
        elif mapped and len( blocks[ blknum ][ util.D_ADDRESSES ] ) == 1:
            # if the miner could be mapped, check whether the address of the coinbase output
            # is already mapped to this miner
            # (only if there is exactly one address in the block coinbase output).
            # If we don't have the address, add it; if we have it, add "blocktrail.com" as a source to the address
            address = blocks[ blknum ][ util.D_ADDRESSES ][0]
            """
            if address == "1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ":
                # check for specific address 
                for attr in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys():
                    if blocks[ blknum ][ util.D_ATTRIBUTIONS ][ attr ][ "miner" ] != miner_uid:
                        print(blknum,":",blocks[ blknum ][ util.D_ATTRIBUTIONS ][ attr ][ "miner" ],"--",miner_uid)
            """            
            util.add_addr(miner_uid,miners,address,source="blocktrail.com",currencies=["BTC",])
            attribution_counter += 1
        
        # attribute block based on miner_uid in any case
        blocks[ blknum ][ util.D_ATTRIBUTIONS ][ util.DD_BT_ATTR ] = { util.DDD_MINER:miner_uid,
                                                                       util.DDD_SRC:"blocktrail.com" }       
        

DeepBit : 14RxyduJ3CJnCqMGo7Bbz43axv1wdFZ5yM
P2Pool.org : 15rLkQjruGDKnQpm8Y6RdvRJJ9BNKzbvEw
Satoshi Systems : 1E9UoAzRpnZeymLh4JnLjADEaVyG5xKWma
itzod : 1JaepCfDnErPTPA96HJr7kfHZXLsN4asmH
IceDrill : 12ej4RUwoszmQoKYyFg6Ej27L82xhFS5Ao
Poolz 4 you : 1HhZiLEY8YatYS1KywhgetZVvnZ6j3pA8z
TangPool : 12Taz8FFXQ3E2AGn3ZW1SZM5bLnYGX4xR6


In [141]:
len(blocks_blocktrail)

514240

In [119]:
attribution_counter

301026

In [120]:
unknown_counter

189135
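
The attribution cell above scans every miner's *names* dict for each block. A reverse index from name to unified miner id gives the same result with one dictionary lookup per block. The sketch below assumes that `util.D_NAMES` equals `"names"` (as the printed records suggest); `build_name_index` is a hypothetical helper, not part of `util`:

```python
def build_name_index(miners, names_key="names"):
    """Map every known pool name to the id of the first miner that lists it."""
    index = {}
    for miner_id, record in miners.items():
        for name in record.get(names_key, {}):
            # setdefault keeps the first miner found, like the `break` in the loop
            index.setdefault(name, miner_id)
    return index

# tiny example with names taken from the manual fixes above
miners_example = {
    "F2Pool": {"names": {"F2Pool": {}, "DiscusFish / F2Pool": {}}},
    "Bixin": {"names": {"Bixin": {}, "HaoBTC": {}}},
}
index = build_name_index(miners_example)

# fall back to the blocktrail.com name itself when no unified id exists
print(index.get("HaoBTC", "HaoBTC"))    # Bixin
print(index.get("DeepBit", "DeepBit"))  # DeepBit
```

Because the attribution loop adds new miners while it runs, such an index would have to be updated whenever a miner or name is added.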

## Check block attributions

In [121]:
blocks["0"]

{'time': 1231006505,
 'cb': '04ffff001d0104455468652054696d65732030332f4a616e2f32303039204368616e63656c6c6f72206f6e206272696e6b206f66207365636f6e64206261696c6f757420666f722062616e6b73',
 'addresses': ['1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa'],
 'miner': '',
 'conflicts': 0,
 'attribution': '',
 'attributions': {},
 'hash': '000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f',
 'phash': None,
 'payout': '5000000000'}

In [123]:
minblk = len(blocks) # check if start is really genesis at height 0 
maxblk = 0
for blknum in blocks.keys():
    if int(blknum) < minblk:
        minblk = int(blknum)
    if int(blknum) > maxblk:
        maxblk = int(blknum)
        
assert minblk == 0
assert maxblk == current_blockheight
print(minblk)
print(maxblk)

0
556400
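
The minimum/maximum check above does not reveal gaps between the two ends. A small sketch that also reports missing heights, assuming the keys of `blocks` are block heights stored as strings; `check_heights` is a hypothetical helper:

```python
def check_heights(blocks, expected_max):
    """Verify the first and last height and return a sorted list of missing heights."""
    heights = sorted(int(h) for h in blocks)
    assert heights[0] == 0, "dataset does not start at the genesis block"
    assert heights[-1] == expected_max, "dataset does not end at the expected height"
    return sorted(set(range(expected_max + 1)) - set(heights))

# illustrative data: height 2 is missing
blocks_example = {"0": {}, "1": {}, "3": {}}
print(check_heights(blocks_example, 3))  # [2]
```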


### Analyze and resolve address/attribution conflicts 

Search for address conflicts where one address was attributed to more than one pool

In [124]:
# there should be 6 address conflicts now
util.check_for_obvious_address_collisions(miners)

1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY : 2
1MimPd6LrPKGftPRHWdfk8S3KYBfN4ELnD : 2
147SwRQdpCfj5p8PnfsXV2SsVVpVcz3aPq : 2
1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ : 2
19RE4mz2UbDxDVougc6GGdoT4x5yXxwFq2 : 2
197miJmttpCt2ubVs6DDtGBYFDroxHmvVB : 2


True
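
A collision check of this kind can be sketched in a single pass that also records which miners claim each address. This is not the implementation of `util.check_for_obvious_address_collisions`; `find_address_collisions` is a hypothetical helper, and `"addresses"` is assumed to be the value of `util.D_ADDRESSES`:

```python
from collections import defaultdict

def find_address_collisions(miners, addresses_key="addresses"):
    """Return {address: [miner ids]} for every address listed by more than one miner."""
    owners = defaultdict(list)
    for miner_id, record in miners.items():
        for address in record.get(addresses_key, {}):
            owners[address].append(miner_id)
    return {addr: ids for addr, ids in owners.items() if len(ids) > 1}

# reduced example based on one of the conflicts listed above
miners_example = {
    "F2Pool": {"addresses": {"1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY": {}}},
    "BTCC Pool": {"addresses": {"1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY": {}}},
    "Bixin": {"addresses": {}},
}
print(find_address_collisions(miners_example))
# {'1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY': ['F2Pool', 'BTCC Pool']}
```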

In [125]:
address_conflicts_list = list()

remove_conflicts = False
address_conflicts = Counter()

for m in miners:
    for a in miners[ m ][ util.D_ADDRESSES ].keys():
        address_conflicts[ a ] += 1

for at in address_conflicts.most_common():
    if at[1] > 1:
        address_conflicts_list.append( at[0] )

#print(address_conflicts_list)

for a in address_conflicts_list:
    print("---")
    for m in miners:
        if a in miners[ m ][ util.D_ADDRESSES ].keys():
            print(a,":\nminer = ",m,"\n",miners[ m ][ util.D_ADDRESSES ][ a ],"\n")
            if remove_conflicts and 'blocktrail.com' in miners[ m ][ util.D_ADDRESSES ][ a ][ util.DD_SOURCES ]:
                miners[ m ][ util.D_ADDRESSES ].pop( a , None)


---
1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY :
miner =  F2Pool 
 {'currencies': ['BTC'], 'firstUsed': 0, 'lastUsed': 0, 'sources': ['blockchain.info github', 'btccom github', 'blocktrail.com']} 

1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY :
miner =  BTCC Pool 
 {'sources': ['blocktrail.com'], 'firstUsed': 0, 'lastUsed': 0, 'currencies': ['BTC']} 

---
1MimPd6LrPKGftPRHWdfk8S3KYBfN4ELnD :
miner =  digitalBTC 
 {'currencies': ['BTC'], 'firstUsed': 0, 'lastUsed': 0, 'sources': ['blockchain.info github', 'btccom github', 'blocktrail.com']} 

1MimPd6LrPKGftPRHWdfk8S3KYBfN4ELnD :
miner =  digitalX Mintsy 
 {'sources': ['blocktrail.com'], 'firstUsed': 0, 'lastUsed': 0, 'currencies': ['BTC']} 

---
147SwRQdpCfj5p8PnfsXV2SsVVpVcz3aPq :
miner =  BTC.TOP 
 {'sources': ['blocktrail.com'], 'firstUsed': 0, 'lastUsed': 0, 'currencies': ['BTC']} 

147SwRQdpCfj5p8PnfsXV2SsVVpVcz3aPq :
miner =  CANOE 
 {'sources': ['blocktrail.com'], 'firstUsed': 0, 'lastUsed': 0, 'currencies': ['BTC']} 

---
1FLH1SoLv4U68yUERhDiWzrJ

#### 1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY

This address was misattributed by *blocktrail.com* because the coinbase contains both `btcc` (in `Mined by smcbtcc`) and the fish emoji (`\xf0\x9f\x90\x9f`) of DiscusFish (F2Pool).

In [126]:
for blknum in blocks:
    if blknum in blocks_blocktrail.keys() and len( blocks[ blknum ][ util.D_ADDRESSES ] ) == 1 and blocks[ blknum ][ util.D_ADDRESSES ][ 0 ] == "1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY":
        miner_bt = blocks_blocktrail[ blknum ][ util.DDD_MINER ]
        miner_uid = ""
        mapped = False  # reset per block so a stale value from an earlier block or cell is not reused
        for miner_id in miners:
            # check if there is a unified miner id for this miner already in miners.json
            if miner_bt in miners[ miner_id ][ util.D_NAMES ].keys():
                # unified miner_id found
                miner_uid = miner_id
                mapped = True
                break
        if mapped:        
            for attr in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys():
                if blocks[ blknum ][ util.D_ATTRIBUTIONS ][ attr ][ util.DDD_MINER ] != miner_uid:
                    print(blknum,": ",blocks[ blknum ][ util.D_ATTRIBUTIONS ][ attr ][ util.DDD_MINER ]," (",
                          blocks[ blknum ][ util.D_ATTRIBUTIONS ][ attr ][ util.DDD_SRC ],") -- ",
                          miner_uid," ( blocktrail.com )")
                    print("\t", repr(binascii.unhexlify(blocks[ blknum ][ util.D_CB ]) ) )
                    break

482886 :  F2Pool  ( blockchain.info ) --  BTCC Pool  ( blocktrail.com )
	 b'\x03F^\x07\x05/NYA/,\xfa\xbemm\xf8\x01\xe6\x8c4\xd5\x08\xa0\xf3\xd6u\xe1\x04\xe3l\xc4\xcfr0\xb1[V?\r\xbe\xd8\xff\xaf\xf0\xdc\xaf\xb1\x04\x00\x00\x00\xf0\x9f\x90\x9f\x10Mined by smcbtcc\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'


In [151]:
print(miners["BTCC Pool"][ util.D_ADDRESSES ][ "1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY" ])
miners["BTCC Pool"][ util.D_ADDRESSES ].pop( "1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY" , None)

{'lastUsed': 0, 'currencies': ['BTC'], 'sources': ['blocktrail.com'], 'firstUsed': 0}


{'currencies': ['BTC'],
 'firstUsed': 0,
 'lastUsed': 0,
 'sources': ['blocktrail.com']}
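
To see why both pools can be matched on block 482886, the coinbase printed above can be searched for both markers. The sketch uses only a short excerpt of that coinbase; the choice of markers (the UTF-8 fish emoji and the lowercase string `btcc`) is an assumption based on the explanation above, not the exact marker list from `miners.json`:

```python
# excerpt of the coinbase of block 482886 as printed above
coinbase_excerpt = b'\x04\x00\x00\x00\xf0\x9f\x90\x9f\x10Mined by smcbtcc'

# U+1F41F (fish emoji) encoded as UTF-8
fish = "\U0001F41F".encode("utf-8")
print(fish)  # b'\xf0\x9f\x90\x9f'

# both candidate markers occur in the same coinbase
print(fish in coinbase_excerpt)             # True
print(b"btcc" in coinbase_excerpt.lower())  # True
```

A marker-based attribution that stops at the first hit can therefore report either pool, depending on the order in which the markers are tested.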

#### 1MimPd6LrPKGftPRHWdfk8S3KYBfN4ELnD
#### 19vvtxUpbidB8MT5CsSYYTBEjMRnowSZj4
#### 12f1FoTYvYiSmiDSVfeHcw8gS8Fp7xREUW
For these addresses, *blocktrail.com* and *blockchain.info* attribute blocks differently on multiple occasions: *blockchain.info* names **digitalBTC** where *blocktrail.com* names **digitalX Mintsy**.
According to an online search, both names belong to the same company, digitalBTC:
* https://www.coindesk.com/digitalbtc-launches-mining-contracts-platform-digitalx-mintsy/

In [127]:
i = 0
for blknum in blocks:
    if blknum in blocks_blocktrail.keys() and len( blocks[ blknum ][ util.D_ADDRESSES ] ) == 1 and blocks[ blknum ][ util.D_ADDRESSES ][ 0 ] == "1MimPd6LrPKGftPRHWdfk8S3KYBfN4ELnD":
        miner_bt = blocks_blocktrail[ blknum ][ util.DDD_MINER ]
        miner_uid = ""
        mapped = False  # reset per block so a stale value from an earlier block or cell is not reused
        for miner_id in miners:
            # check if there is a unified miner id for this miner already in miners.json
            if miner_bt in miners[ miner_id ][ util.D_NAMES ].keys():
                # unified miner_id found
                miner_uid = miner_id
                mapped = True
                break
        if mapped:        
            for attr in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys():
                if blocks[ blknum ][ util.D_ATTRIBUTIONS ][ attr ][ util.DDD_MINER ] != miner_uid:
                    print(blknum,": ",blocks[ blknum ][ util.D_ATTRIBUTIONS ][ attr ][ util.DDD_MINER ]," (",
                          blocks[ blknum ][ util.D_ATTRIBUTIONS ][ attr ][ util.DDD_SRC ],") -- ",
                          miner_uid," ( blocktrail.com )")
                    print("\t", repr(binascii.unhexlify(blocks[ blknum ][ util.D_CB ]) ) )
                    i += 1
                    break
print(i)

298955 :  digitalBTC  ( blockchain.info ) --  digitalX Mintsy  ( blocktrail.com )
	 b"\x03\xcb\x8f\x04\x06/P2SH/\x04\xab!eS\x08@\x00\x00'\x0f\x91\x85\xca\r/nodeStratum/"
299011 :  digitalBTC  ( blockchain.info ) --  digitalX Mintsy  ( blocktrail.com )
	 b'\x03\x03\x90\x04\x06/P2SH/\x04\x95\xafeS\x08P\x00\x00\n\x10\xc5\x18@\r/nodeStratum/'
299228 :  digitalBTC  ( blockchain.info ) --  digitalX Mintsy  ( blocktrail.com )
	 b'\x03\xdc\x90\x04\x06/P2SH/\x04\xe9\xc1gS\x08p\x00\x00<\x0c\xb5\xb8d\r/nodeStratum/'
299240 :  digitalBTC  ( blockchain.info ) --  digitalX Mintsy  ( blocktrail.com )
	 b'\x03\xe8\x90\x04\x06/P2SH/\x04\x12\xd9gS\x08\x10\x00\x00a\x14\x1c\xec#\r/nodeStratum/'
299273 :  digitalBTC  ( blockchain.info ) --  digitalX Mintsy  ( blocktrail.com )
	 b'\x03\t\x91\x04\x06/P2SH/\x047-hS\x08 \x00\x00\x02\x15\xb7\xfd\x8b\r/nodeStratum/'
299274 :  digitalBTC  ( blockchain.info ) --  digitalX Mintsy  ( blocktrail.com )
	 b'\x03\n\x91\x04\x06/P2SH/\x04\x8a-hS\x08 \x00\x00 \x14`R\x7f\r/

74


In [128]:
print(miners["digitalX Mintsy"])
util.unify_miners("digitalX Mintsy","digitalBTC",miners)
print()

{'names': {'digitalX Mintsy': {'url': 'https://www.mintsy.co', 'currencies': ['BTC'], 'fullName': '', 'firstUsed': 0, 'lastUsed': 0, 'sources': ['blockchain.info github', 'btccom github', 'blocktrail.com']}}, 'markers': {}, 'addresses': {'1NY15MK947MLzmPUa2gL7UgyR8prLh2xfu': {'currencies': ['BTC'], 'firstUsed': 0, 'lastUsed': 0, 'sources': ['blockchain.info github', 'btccom github', 'blocktrail.com']}, '1MimPd6LrPKGftPRHWdfk8S3KYBfN4ELnD': {'sources': ['blocktrail.com'], 'firstUsed': 0, 'lastUsed': 0, 'currencies': ['BTC']}, '19vvtxUpbidB8MT5CsSYYTBEjMRnowSZj4': {'sources': ['blocktrail.com'], 'firstUsed': 0, 'lastUsed': 0, 'currencies': ['BTC']}, '12f1FoTYvYiSmiDSVfeHcw8gS8Fp7xREUW': {'sources': ['blocktrail.com'], 'firstUsed': 0, 'lastUsed': 0, 'currencies': ['BTC']}}}
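
Unifying two miner records amounts to merging their *names*, *markers* and *addresses* and taking the union of the sources where an entry exists on both sides. The sketch below is not the implementation of `util.unify_miners`; `merge_miner_records`, its argument order (source first, target second) and the shortened source lists in the example are assumptions for illustration:

```python
import copy

def merge_miner_records(miners, source_id, target_id):
    """Merge miners[source_id] into miners[target_id] and drop the source record."""
    source = miners.pop(source_id)
    target = miners[target_id]
    for section in ("names", "markers", "addresses"):
        target_section = target.setdefault(section, {})
        for key, entry in source.get(section, {}).items():
            if key not in target_section:
                target_section[key] = copy.deepcopy(entry)
                continue
            # entry exists on both sides: union the source lists, keeping order
            merged_sources = target_section[key].setdefault("sources", [])
            for src in entry.get("sources", []):
                if src not in merged_sources:
                    merged_sources.append(src)
    return target

addr_shared = "1MimPd6LrPKGftPRHWdfk8S3KYBfN4ELnD"
addr_mintsy = "1NY15MK947MLzmPUa2gL7UgyR8prLh2xfu"
miners_example = {
    "digitalBTC": {
        "names": {"digitalBTC": {"sources": ["blockchain.info github"]}},
        "markers": {},
        "addresses": {addr_shared: {"sources": ["blockchain.info github", "blocktrail.com"]}},
    },
    "digitalX Mintsy": {
        "names": {"digitalX Mintsy": {"sources": ["blocktrail.com"]}},
        "markers": {},
        "addresses": {addr_mintsy: {"sources": ["blocktrail.com"]},
                      addr_shared: {"sources": ["blocktrail.com"]}},
    },
}
merged = merge_miner_records(miners_example, "digitalX Mintsy", "digitalBTC")
print(sorted(merged["names"]))      # ['digitalBTC', 'digitalX Mintsy']
print(sorted(merged["addresses"]))  # both addresses, now under digitalBTC
```

After the merge, the shared address no longer appears under two miners, which removes the corresponding address conflict.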



#### 147SwRQdpCfj5p8PnfsXV2SsVVpVcz3aPq
There are multiple occasions where *blocktrail.com* attributed blocks differently from *blockchain.info*: **CANOE** versus **BTC.TOP**.

This cannot be resolved with the information currently available.

In [130]:
i = 0
for blknum in blocks:
    if blknum in blocks_blocktrail.keys() and len( blocks[ blknum ][ util.D_ADDRESSES ] ) == 1 and blocks[ blknum ][ util.D_ADDRESSES ][ 0 ] == "147SwRQdpCfj5p8PnfsXV2SsVVpVcz3aPq":
        miner_bt = blocks_blocktrail[ blknum ][ util.DDD_MINER ]
        miner_uid = ""
        mapped = False  # reset per block so a stale value from an earlier block or cell is not reused
        for miner_id in miners:
            # check if there is a unified miner id for this miner already in miners.json
            if miner_bt in miners[ miner_id ][ util.D_NAMES ].keys():
                # unified miner_id found
                miner_uid = miner_id
                mapped = True
                break
        if mapped:        
            for attr in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys():
                if blocks[ blknum ][ util.D_ATTRIBUTIONS ][ attr ][ util.DDD_MINER ] != miner_uid:
                    print(blknum,": ",blocks[ blknum ][ util.D_ATTRIBUTIONS ][ attr ][ util.DDD_MINER ]," (",
                          blocks[ blknum ][ util.D_ATTRIBUTIONS ][ attr ][ util.DDD_SRC ],") -- ",
                          miner_uid," ( blocktrail.com )")
                    print("\t", repr(binascii.unhexlify(blocks[ blknum ][ util.D_CB ]) ) )
                    i += 1
                    break
print(i)

#### 1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ
Conflicting attributions (**Waterhole** versus **BTC.com**) occur at block heights:
* 482059
* 482221

This cannot be resolved with the information currently available.

In [136]:
with open(miners_initial_incl_blocktrail_json_file, 'w') as outfile:
    json.dump(miners, outfile)

## Redo `blockchain.info` and `blocktrail.com` attributions based on modified miner data
Repeat the attributions with the updated miners.

In [137]:
# reload the miners as persisted before the blocktrail.com additions
with open(miners_initial_json_file, 'r') as fp:
    miners = json.load(fp)

In [140]:
# update miners_initial in memory only; it is not persisted after this step
print(miners_initial["digitalX Mintsy"])
util.unify_miners("digitalX Mintsy","digitalBTC",miners_initial)

In [159]:
util.get_sample(miners)

{'addresses': {'1KPQkehgYAqwiC6UCcbojM3mbGjURrQJF2': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github', 'btccom github', 'blocktrail.com']}},
 'markers': {'/ConnectBTC - Home for Miners/': {'currencies': ['BTC'],
   'firstUsed': 0,
   'lastUsed': 0,
   'sources': ['blockchain.info github', 'btccom github', 'blocktrail.com']}},
 'names': {'ConnectBTC': {'currencies': ['BTC'],
   'firstUsed': 0,
   'fullName': '',
   'lastUsed': 0,
   'sources': ['blockchain.info github', 'btccom github', 'blocktrail.com'],
   'url': 'https://www.connectbtc.com'}}}

In [160]:
util.get_sample(blocks)

{'addresses': ['16mCKDJqGp2yGM4vYkkHydQgQogwiWtX9F',
  '16NB52SewXETqfrawp1Fooo1kS6oDFgPNc',
  '18d3HV2bm94UyY4a9DrPfoZ17sXuiDQq2B',
  '18FHEZ47jUcgo29WfoF5JN3imQi414D5iQ',
  '18pcZWZa79VqkCBS55mAyXwh9UCj1PrxKq',
  '19dNbhJ4LeoThc5ck8GafN43xQ8EsVDPie',
  '1DKsUFRMH4yiQeKyauHoBadPZnJEcpqWbW',
  '1DXfYWL1S1cpdNMS9bRXHYEqXevhKgE9HQ',
  '1G3nmSyhtGc2FFdUKiQfjMEjNZLFVE66MF',
  '1GvZ7Y5NejG4bHS9NpxPVSNA4tawcXnoRV',
  '1JKi5gSthVsDJ7brXN9jSqEicB2Qx4cjDK',
  '1JqMb42q7PpqvBEVxNNK6SXTRTMCRncPFW',
  '1Kseaf5QeTv61ghTobCHdzRAkMSiKVUPtW',
  '1MegiuY2FZNSQb86zZR4qFvJQS3zE4MZze',
  '1MNdLkFk8y5rk5WzmXjn8ESxSU5sixv4sP',
  '1MU9UeqgsWDNkjppbyJUdTbq7JrEJKGFvQ',
  '1MXGXsZTBewv8XUjL8vnZ3PGsZ95mEexLk',
  '1NFvvAu3DoFsDhW31Lnr71aMhVhF3ETP7K',
  '1NNSeVpihhfgo5zHW86xC22Dd4nMj1wHAk',
  '1NqJH2zjSBAgat16ZpZQ1FAb83J8MaSCom',
  '1Rq7HRPfBXjjzPWT1ptko4MFUyiXQJQC2'],
 'attribution': '',
 'attributions': {'blockchain_info': {'matches': [{'cb_match': 'Eligius'}],
   'miner': 'Eligius',
   'src': 'blockchain.info'}

In [141]:
# attribute blocks to miners according to the initial blockchain.info mapping
(blocks,miners,conflicts) = util.attribute_blocks(blocks=blocks,
                                   miners_dict=miners,
                                   addr_attr="btccom_blockchain_info_addr",
                                   marker_attr="btccom_blockchain_info_marker",
                                   both_attr="btccom_blockchain_info",
                                   source="attribution based on btccom and blockchain.info sources",
                                   override=True,
                                   update=False)

482000
Message    =  Addr and Cb match differ
Blockheight=  482059
Miner1     =  Waterhole
Miner2     =  BTC.com
Coinbase   =  030b5b0704465fa1592f4254432e434f4d2ffabe6d6dff48161efa0f0b44fc9463e380e6b585ff91ac3f9c973d156dfff35fd553cdb101000000000000000228124b7faf010000000000
CoinbaseStr=  b'\x03\x0b[\x07\x04F_\xa1Y/BTC.COM/\xfa\xbemm\xffH\x16\x1e\xfa\x0f\x0bD\xfc\x94c\xe3\x80\xe6\xb5\x85\xff\x91\xac?\x9c\x97=\x15m\xff\xf3_\xd5S\xcd\xb1\x01\x00\x00\x00\x00\x00\x00\x00\x02(\x12K\x7f\xaf\x01\x00\x00\x00\x00\x00'
Addesses   =  1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ
addr_match =  [('Waterhole', {'addr_match': '1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ'})]
cb_match   =  [('BTC.com', {'cb_match': '/BTC.COM/'})]

Message    =  Addr and Cb match differ
Blockheight=  482221
Miner1     =  Waterhole
Miner2     =  BTC.com
Coinbase   =  03ad5b0704a4e6a2592f4254432e434f4d2ffabe6d6d0dd4f6c7dac894c84821c9fed8c15a621583ada8e4177c182c011cab52e2d6d0010000000000000001292f8e92ae010000000000
CoinbaseStr=  b'\x03\xad[\x

In [142]:
len(conflicts) 

3

In [145]:
# attribute blocks to miners according to the initial blockchain.info mapping (this time with update=True)
(blocks,miners,conflicts) = util.attribute_blocks(blocks=blocks,
                                   miners_dict=miners,
                                   addr_attr="btccom_blockchain_info_addr_update",
                                   marker_attr="btccom_blockchain_info_marker_update",  # distinct from both_attr, as in the initial_*_update call
                                   both_attr="btccom_blockchain_info_update",
                                   source="attribution based on btccom and blockchain.info sources",
                                   override=True,
                                   update=True)

In [144]:
len(conflicts) # 479 # 455

388

In [146]:
# reload the miners including the blocktrail.com additions
with open(miners_initial_incl_blocktrail_json_file, 'r') as fp:
    miners = json.load(fp)

In [148]:
(blocks,miners,conflicts) = util.attribute_blocks(blocks=blocks,
                                   miners_dict=miners,
                                   addr_attr="initial_addr",
                                   marker_attr="initial_marker",
                                   both_attr="initial",
                                   source="initial attribution based on all sources",
                                   override=True,
                                   update=False)

In [149]:
len(conflicts)

455

In [151]:
(blocks,miners,conflicts) = util.attribute_blocks(blocks=blocks,
                                   miners_dict=miners,
                                   addr_attr="initial_addr_update",
                                   marker_attr="initial_marker_update",
                                   both_attr="initial_update",
                                   source="initial attribution based on all sources including update",
                                   override=True,
                                   update=True)

In [152]:
len(conflicts) # 479 # 455

455

In [153]:
with open(miners_initial_conflicts_json_file, 'w') as fp:
    json.dump(conflicts, fp)

Redo the blocktrail.com attribution in the blocks JSON so that the name digitalX Mintsy is unified to digitalBTC in this attribution data.

In [154]:
# blocktrail.com again to map 'digitalX Mintsy' to 'digitalBTC' correctly

for blknum in blocks:
    # iterate over all blocks once
    if blknum in blocks_blocktrail.keys():
        # check if block was attributed by blocktrail
        
        # !!! remove old blocktrail.com attribution
        blocks[ blknum ][ util.D_ATTRIBUTIONS ].pop( util.DD_BT_ATTR, None)
        
        miner_bt = blocks_blocktrail[ blknum ][ util.DDD_MINER ]
        if miner_bt == "unknown" or miner_bt == "Unknown Entity":
            continue
            
        miner_uid = miner_bt
        mapped = False
        for miner_id in miners:
            # check if there is a unified miner id for this miner already in miners.json
            if miner_bt in miners[ miner_id ][ util.D_NAMES ].keys():
                # unified miner_id found
                miner_uid = miner_id
                mapped = True
                break
                
        if not mapped and len( blocks[ blknum ][ util.D_ADDRESSES ] ) == 1:
            # if miner name could not be mapped to list of existing miner names
            # it is probably a new miner and gets added as such
            address = blocks[ blknum ][ util.D_ADDRESSES ][0]
            print(miner_uid,":",address)
            util.add_miner(miner_uid,
                     miners,
                     names_dict= { miner_uid: { util.DD_URL:"",
                                            util.DD_CURRENCIES: ["BTC",],
                                            util.DD_FULLNAME: "",
                                            util.DD_FIRSTUSED: 0,
                                            util.DD_LASTUSED: 0,
                                            util.DD_SOURCES:["blocktrail.com",] } },
                     addresses_dict= { address: { util.DD_CURRENCIES: ["BTC",],
                                              util.DD_FIRSTUSED: 0,
                                              util.DD_LASTUSED: 0,
                                              util.DD_SOURCES:["blocktrail.com",] } }, )
        elif mapped and len( blocks[ blknum ][ util.D_ADDRESSES ] ) == 1:
            # if the miner could be mapped, check whether the address of the coinbase output
            # is already mapped to this miner
            # (only if there is exactly one address in the block coinbase output).
            # If we don't have the address, add it; if we have it, add "blocktrail.com" as a source to the address
            address = blocks[ blknum ][ util.D_ADDRESSES ][0]
            """
            if address == "1FLH1SoLv4U68yUERhDiWzrJn5TggMqkaZ":
                # check for specific address 
                for attr in blocks[ blknum ][ util.D_ATTRIBUTIONS ].keys():
                    if blocks[ blknum ][ util.D_ATTRIBUTIONS ][ attr ][ "miner" ] != miner_uid:
                        print(blknum,":",blocks[ blknum ][ util.D_ATTRIBUTIONS ][ attr ][ "miner" ],"--",miner_uid)
            """            
            util.add_addr(miner_uid,miners,address,source="blocktrail.com",currencies=["BTC",])
        
        # attribute block based on miner_uid in any case
        blocks[ blknum ][ util.D_ATTRIBUTIONS ][ util.DD_BT_ATTR ] = { util.DDD_MINER:miner_uid,
                                                                       util.DDD_SRC:"blocktrail.com" }       
        

Reapply the fixes and check that everything is consistent.

In [155]:
# first undo again the incorrect blocktrail.com attribution, which mixes up BTCC and F2Pool:
print(miners["BTCC Pool"][ util.D_ADDRESSES ][ "1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY" ])
miners["BTCC Pool"][ util.D_ADDRESSES ].pop( "1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY" , None)

{'sources': ['blocktrail.com'], 'firstUsed': 0, 'lastUsed': 0, 'currencies': ['BTC']}


{'sources': ['blocktrail.com'],
 'firstUsed': 0,
 'lastUsed': 0,
 'currencies': ['BTC']}

In [156]:
for m in miners:
    if util.D_ADDRESSES in miners[ m ].keys():
        for addr in miners[ m ][ util.D_ADDRESSES ]:
            if addr == "1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY":
                print(m)
                pprint.pprint(miners[ m ])

F2Pool
{'addresses': {'13NA7X1u18CgGa7RzTDyvuuoLJtLRvXgke': {'currencies': ['BTC'],
                                                      'firstUsed': 0,
                                                      'lastUsed': 0,
                                                      'sources': ['blocktrail.com']},
               '1KFHE7w8BhaENAswwryaoccDb6qcT6DbYY': {'currencies': ['BTC'],
                                                      'firstUsed': 0,
                                                      'lastUsed': 0,
                                                      'sources': ['blockchain.info '
                                                                  'github',
                                                                  'btccom '
                                                                  'github',
                                                                  'blocktrail.com']},
               '1LoveYoURwCeQu6dURqTQ7hrhYXDA4eJyn': {'currencies': ['BTC'],

## Persist files 

In [157]:
with open(miners_json_file, 'w') as fp:
    json.dump(miners, fp)

In [158]:
with open(blocks_attribution_json_file, 'w') as fp:
    json.dump(blocks, fp)
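
The printed records earlier in this notebook show the same fields in varying key order. If the persisted files are to be compared between runs, writing them with sorted keys keeps the output stable. A minimal sketch using only the standard library; `dump_json_sorted`, `roundtrip_ok` and the file name `miners_example.json` are hypothetical:

```python
import json
import os
import tempfile

def dump_json_sorted(obj, path):
    """Write obj as JSON with sorted keys so repeated runs produce identical files."""
    with open(path, "w") as fp:
        json.dump(obj, fp, sort_keys=True)

def roundtrip_ok(obj, path):
    """Reload the file and confirm that it matches the in-memory object."""
    with open(path, "r") as fp:
        return json.load(fp) == obj

example = {"b": {"sources": ["blocktrail.com"], "firstUsed": 0}, "a": {}}
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "miners_example.json")
    dump_json_sorted(example, path)
    with open(path) as fp:
        text = fp.read()
    ok = roundtrip_ok(example, path)

print(text)  # {"a": {}, "b": {"firstUsed": 0, "sources": ["blocktrail.com"]}}
print(ok)    # True
```

The round-trip check only holds for data that JSON represents without loss, for example dicts with string keys such as the block heights used here.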