*Sometimes the rabbit hole goes deeper than you thought.*

[![](./01.png)](https://steemit.com/steem/@lextenebris/steem-the-whale-wars-bernie-and-haejin-sitting-in-a-tree-structure#@crokkon/re-lextenebris-re-crokkon-re-lextenebris-re-abh12345-re-lextenebris-re-abh12345-re-lextenebris-steem-the-whale-wars-bernie-and-haejin-sitting-in-a-tree-structure-20180307t161114284z)

And so it began, with a simple back and forth between @crokkon and me, talking about the methods of deriving `steem-per-mvest` and whether or not it was even worth doing.

This is going to be an example of how asking the simplest questions can take you down some very strange rabbit holes, and if you're willing to follow them you can stumble on things you have no way of understanding. But at least you found them.

The basic question:

Has the effective ratio between steem and mega-vests changed significantly enough over the last year to even bother worrying about generating a precise ratio in order to determine how much a given vote is worth?

It seems like a very simple question to answer. And it is! But digging out the answer ends up revealing more than it intends.

## Getting Out the Shovel

Like most things lately that involve code, this starts simply enough by loading up the basics of SteemData, pulling in some useful analysis tools, and finishing off the imports with something to draw pretty pictures. I know up front that I'm going to want to plot at least one graph of the ratio between deposits and withdrawals in the table of operations of the system fulfilling requests to convert steem to SP.

If you've been following my code posts over the last few weeks, you know what's coming up now. Imports, database initialization, query, and then we start looking at the data we get back.

In [1]:
# Setting up the imports 

from steemdata import SteemData
import datetime
from datetime import datetime as dt

import numpy as np

import bokeh
import bokeh.plotting as bplt

In [2]:
# Init connection to database

db = SteemData()

In [3]:
query = {
    'type' : 'fill_vesting_withdraw',
    'timestamp' : {'$gte': dt.now() - datetime.timedelta(days=360)}}
    
proj = {'deposited.amount': 1, 'withdrawn.amount': 1, 'timestamp': 1, 'from_account': 1, 'to_account':1, '_id': 0}

sort = [('timestamp', -1)]

In [4]:
%%time

result = db.Operations.find(query,
                            projection=proj,
                            sort=sort)

fvL = list(result)

Wall time: 1min 33s


I decided to go big rather than to go home and just grab the last year of transaction information on vesting withdrawals. That sounds like complete overkill, and it probably is, but as a result I can look at the data and really get a feeling for what's going on under the hood. Also, it gives us a chance to make a solid prediction (technically hindcast) of what the ratio was and how much it's varied.

It turns out that there are just over a million vesting withdrawals that occurred in the last year. Honestly, that could be a lot worse. We can really work with this amount of data.

While we're looking, let's check out the first few and the last few in the list. Since I told the database to return the list in reverse chronological order, those at the top are the most recent and those at the bottom are the oldest.

In [5]:
len(fvL)

1049438

In [180]:
fvL[:3], fvL[-3:]

([{'deposited': {'amount': 35.77},
   'from_account': 'zakariashikder',
   'timestamp': datetime.datetime(2018, 3, 8, 18, 18, 33),
   'to_account': 'zakariashikder',
   'withdrawn': {'amount': 73050.513285}},
  {'deposited': {'amount': 20.336},
   'from_account': 'nobutsd',
   'timestamp': datetime.datetime(2018, 3, 8, 18, 17, 33),
   'to_account': 'nobutsd',
   'withdrawn': {'amount': 41531.82173}},
  {'deposited': {'amount': 0.927},
   'from_account': 'junhokim',
   'timestamp': datetime.datetime(2018, 3, 8, 18, 17, 12),
   'to_account': 'junhokim',
   'withdrawn': {'amount': 1894.72076}}],
 [{'deposited': {'amount': 65.278},
   'from_account': 'midnightoil',
   'timestamp': datetime.datetime(2017, 3, 13, 13, 32, 18),
   'to_account': 'midnightoil',
   'withdrawn': {'amount': 135825.952611}},
  {'deposited': {'amount': 43.866},
   'from_account': 'catulhu',
   'timestamp': datetime.datetime(2017, 3, 13, 13, 32, 9),
   'to_account': 'catulhu',
   'withdrawn': {'amount': 91274.494945}}

There's nothing particularly shocking or surprising about this information really.

Well, that and the fact that @tinfoilfedora, who showed up once, has one of the best names that I've ever seen on the platform. Bravo!

Now that we have the raw data that we came for, we need to come up with the actual ratio in question. This is relatively easy if we just make a list comprehension which pulls out the necessary values and gives them a quick divide. The ratio just falls out.

While we're at it, let's look at the first and last five just to see if the data looks relatively coherent and there's nothing obviously wrong with it.

In [7]:
spmL = [e['deposited']['amount'] / e['withdrawn']['amount']
        for e in fvL]

In [8]:
spmL[:5], spmL[-5:]

([0.0004896611726798767,
  0.0004896486393543035,
  0.0004892541526805249,
  0.000489633353887179,
  0.000489619055570078],
 [0.0004806019912260807,
  0.0004800960347037997,
  0.0004806003473206151,
  0.00048059427802293165,
  0.00048055643711602])

In [9]:
np.median(spmL)

0.00048264105814548454

Things look pretty good in terms of raw data. Not only that, but we can tell from just an easy eyeball of the information that the ratio of steem to vests hasn't really changed all that much in the last year. (These numbers are steem per single vest; multiply by a million to get steem per mega-vest.) The only really conspicuous changes are down in the millionths place, which might be meaningful to the really big whales in the pool but for most of the people reading this – that's probably less than noise in the signal.
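To put those tiny numbers into more familiar units, here is a minimal sketch that scales the ratio up to steem per mega-vest and uses it to turn a pile of vests back into steem. The helper names `steem_per_mvest` and `vests_to_steem` are made up for illustration and are not part of any library; the inputs are the median and the newest record printed above.

```python
# Median steem-per-vest ratio, copied from the np.median() output above.
MEDIAN_RATIO = 0.00048264105814548454


def steem_per_mvest(ratio):
    """Scale a steem-per-vest ratio up to steem per million vests (hypothetical helper)."""
    return ratio * 1_000_000


def vests_to_steem(vests, ratio=MEDIAN_RATIO):
    """Estimate how much steem a number of vests converts to (hypothetical helper)."""
    return vests * ratio


# The newest record above withdrew 73050.513285 vests and deposited 35.77 steem.
estimate = vests_to_steem(73050.513285)
print(f"steem per mvest: {steem_per_mvest(MEDIAN_RATIO):.3f}")
print(f"estimated deposit: {estimate:.3f} steem (actual: 35.770)")
```

Using the year's median on the most recent withdrawal lands roughly half a steem short of the real deposit, which gives a feel for how far a single fixed ratio can drift.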

Just because we have the information, I decided to pull the median out. Not the average, not the mean, not even the mode – the median. For those not familiar with statistical operations, that's what you get when you sort a list of numbers highest to lowest and literally pick the one that is in the middle of the list.

This data is not entirely clean, as you will see later, but because the sample is so large, the median was going to pull out the most reasonable value in the structure. It passes the sniff test as a reasonable value in between those at the beginning and end of the last year.
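A small sketch of why the median shrugs off dirty data where the mean does not. The five "clean" ratios are copied from the output above; the two junk values (a zero and a one) are made up for illustration, standing in for the kind of bad entries a raw pull can contain.

```python
from statistics import mean, median

# Five plausible ratios taken from the output above.
clean = [0.0004896611726798767, 0.0004896486393543035,
         0.0004806019912260807, 0.0004800960347037997,
         0.0004806003473206151]

# The same list with two made-up junk values tacked on.
dirty = clean + [0.0, 1.0]

print("median, clean:", median(clean))
print("median, dirty:", median(dirty))
print("mean,   dirty:", mean(dirty))
```

One junk value lands on each side of the sorted list, so the middle element does not move at all, while the mean gets dragged up by several orders of magnitude.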

## Table That Motion

@crokkon was good enough to share [a link to some code that he had written a little over two weeks ago which did a little dumpster diving in the database to do these calculations and work out the ratio over time.](https://steemit.com/steemdev/@crokkon/re-schererf-re-crokkon-re-schererf-tutorial-get-the-value-of-your-steemit-earnings-part-2-calculate-steem-power-20180219t153715756z) He cleverly managed to avoid showing off the filters that he had implemented "for simplicity", but I figured it couldn't be that big a deal.

It's actually a pretty big deal. We'll take a look at why here shortly.

First, though, let's shove some data into a structure which is designed for manipulating tabular data.

[Pandas.](http://pandas.pydata.org/pandas-docs/stable/10min.html)

I know that a lot of coders on the platform are already familiar with pandas because it is one of the best known big data manipulating tools in the world. It's new to me. That probably doesn't say a lot positive about my experience as an analyst, but bear with me – we'll run with it.

After all, you've made it this far.

To shove the data from the database into a form that we can manipulate easily, we'll basically just implement a series of implicit for loops in the form of some list comprehensions. Essentially we just build a nice record dictionary in a semi-lazy way, fold it together with the list of ratios that we generated, and incidentally tell the data frame that the bit of information that primarily differentiates one of these entries from the other is the timestamp.

That last bit is important because pandas has some nice tools for working with timeseries, which we'll talk about shortly.

In [10]:
import pandas as pd

In [11]:
Data = pd.DataFrame({'deposited': [e['deposited']['amount'] for e in fvL],
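                     # Note: the tables printed below were produced with this column read from
                     # e['deposited'] by mistake, so their 'withdrawn' values mirror 'deposited'.
                     'withdrawn': [e['withdrawn']['amount'] for e in fvL],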
                     'from_account': [e['from_account'] for e in fvL],
                     'to_account': [e['to_account'] for e in fvL],
                     'ratio': spmL
                    }, 
                     index=[e['timestamp'] for e in fvL])

Now we have a nice, flexible, high-speed data structure – which is full of data with questionable consistency.

Having spent a little bit of time trawling through this pile, there were two things that immediately leapt out at me as indicators of less than useful knowledge. Firstly, some of the calculated ratios were zero. That can really only occur when the deposit is equal to zero. There were also times when the ratio was 1, which can only happen when – well, nothing good.

So let's filter out bits of the database where the ratio is greater than 0.99 and entries where deposits are zero.

In [12]:
Data = Data[(Data['ratio'] < 0.99)]
Data = Data[(Data['deposited'] > 0)]

One of the nice things about pandas is that when given tabular data, it's really quite nice about creating a clear output.

Again, let's look at the top and the bottom of this list since it's still in reverse chronological order. Now we have a nice temporal index, and all of our fields are nicely lined up.

Again, the ratios make sense and pandas is very good about giving us just enough information to see where there is some differentiation.

In [13]:
Data.head()

| timestamp | deposited | from_account | ratio | to_account | withdrawn |
| --- | --- | --- | --- | --- | --- |
| 2018-03-08 18:18:33 | 35.77 | zakariashikder | 0.00049 | zakariashikder | 35.77 |
| 2018-03-08 18:17:33 | 20.336 | nobutsd | 0.00049 | nobutsd | 20.336 |
| 2018-03-08 18:17:12 | 0.927 | junhokim | 0.000489 | junhokim | 0.927 |
| 2018-03-08 18:15:45 | 10.278 | jakiasultana | 0.00049 | jakiasultana | 10.278 |
| 2018-03-08 18:15:33 | 2.673 | markboss | 0.00049 | markboss | 2.673 |


In [14]:
Data.tail()

| timestamp | deposited | from_account | ratio | to_account | withdrawn |
| --- | --- | --- | --- | --- | --- |
| 2017-03-13 13:40:15 | 2337.791 | salva82 | 0.000481 | salva82 | 2337.791 |
| 2017-03-13 13:36:48 | 0.689 | romangelsi | 0.00048 | romangelsi | 0.689 |
| 2017-03-13 13:32:18 | 65.278 | midnightoil | 0.000481 | midnightoil | 65.278 |
| 2017-03-13 13:32:09 | 43.866 | catulhu | 0.000481 | catulhu | 43.866 |
| 2017-03-13 13:31:36 | 5.902 | dennygalindo | 0.000481 | dennygalindo | 5.902 |


Since we have everything in such a nice structure, we can query it directly for various manipulations rather than having to write the code to do all those things by hand.

For instance, what if we just want to pile up all the accounts that have received these withdrawals and add up the total amount deposited and withdrawn over the last year? If we were to actually write the Python code for that, it would involve at least one loop, a set to hold the accounts and unify them, and probably an accumulator.

With a pandas data frame? It's a one-liner.

In [178]:
Data.groupby('to_account')[['deposited', 'withdrawn']].sum().head()

| to_account | deposited | withdrawn |
| --- | --- | --- |
| a-a | 6.153 | 6.153 |
| a-a-lifemix | 216.746 | 216.746 |
| a-angel | 6.785 | 6.785 |
| a-c-s | 7.721 | 7.721 |
| a-condor | 737.694 | 737.694 |


In [179]:
len(Data.groupby('to_account')[['deposited', 'withdrawn']].sum())

32112
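For comparison, the hand-rolled version described above might look something like this. It is only a sketch: the three records are made up (in the same shape as the entries in `fvL`), and a dictionary's keys stand in for the set of accounts.

```python
from collections import defaultdict

# A few made-up records in the same shape as the entries in fvL.
records = [
    {'to_account': 'alice', 'deposited': {'amount': 1.5}, 'withdrawn': {'amount': 3000.0}},
    {'to_account': 'bob', 'deposited': {'amount': 2.0}, 'withdrawn': {'amount': 4000.0}},
    {'to_account': 'alice', 'deposited': {'amount': 2.25}, 'withdrawn': {'amount': 4500.0}},
]

# One accumulator per recipient; the dict keys double as the set of accounts.
totals = defaultdict(lambda: {'deposited': 0.0, 'withdrawn': 0.0})
for rec in records:
    acct = totals[rec['to_account']]
    acct['deposited'] += rec['deposited']['amount']
    acct['withdrawn'] += rec['withdrawn']['amount']

for name in sorted(totals):
    print(name, totals[name])
print(len(totals), "distinct recipient accounts")
```

Nothing difficult, but it is a dozen lines of bookkeeping that the data frame folds into a single `groupby`.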

The interesting thing here is that of that 1 million set of implicit withdrawals, only 32,000 accounts actually received funds.

That's – kind of surprising. This is the kind of thing that perks my ears when stuff rolls around. Essentially, this is a much smaller number of accounts than I ever expected to see involved over the last year.

I can't even really begin to theorize about why these numbers are so small. I find it more than a little concerning. There are a lot of promotional people who constantly tout the number of active accounts which often exceed 70,000 – but here we see 32,000 accounts which have triggered this operation on the blockchain.

I can't explain that. Yet.

We will leave aside that mystery for the moment and make use of pandas again to do something that would be much harder without it – carving all of these transactions up into weekly blocks. Again, something that in straight Python would be a real challenge but here turns into a single line where we simply tell the structure to resample itself and break down into one-week chunks, taking the numeric values within those chunks and taking the median value.

Yes, I tried using the average value. Because of some of the weird bits going on under the hood, the mean was just a bad choice. The median is just fine for our purposes, and even though it's probably not the right thing for the deposited or withdrawn values, I think it's interesting to look at because one of the things that we can tell immediately is that the usual value is relatively low.

52 weeks ago, there was a particularly odd outlier in values. One day, that might be worth investigating more thoroughly. Everything after that looks pretty normal.

In [16]:
# numeric_only skips the account-name columns, which have no median
WkData = Data.resample('1W').median(numeric_only=True)

WkData

| week ending | deposited | ratio | withdrawn |
| --- | --- | --- | --- |
| 2017-03-19 | 2.281 | 0.000481 | 2.281 |
| 2017-03-26 | 2.544 | 0.000481 | 2.544 |
| 2017-04-02 | 2.626 | 0.000481 | 2.626 |
| 2017-04-09 | 1.765 | 0.000481 | 1.765 |
| 2017-04-16 | 2.339 | 0.000481 | 2.339 |
| 2017-04-23 | 2.471 | 0.000481 | 2.471 |
| 2017-04-30 | 2.181 | 0.000482 | 2.181 |
| 2017-05-07 | 2.473 | 0.000482 | 2.473 |
| 2017-05-14 | 2.474 | 0.000482 | 2.474 |
| 2017-05-21 | 1.03 | 0.000482 | 1.03 |
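To show roughly what that resample is doing, here is a plain-Python sketch that buckets samples by the Sunday ending their week and takes the median of each bucket. It is an approximation of the idea, not a reimplementation of pandas; `week_ending` is a made-up helper, and the samples pair timestamps from the output above with rounded ratios purely for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import median


def week_ending(ts):
    """Return the Sunday that closes the week containing ts (hypothetical helper)."""
    return (ts + timedelta(days=6 - ts.weekday())).date()


# (timestamp, ratio) pairs, paired up arbitrarily for illustration.
samples = [
    (datetime(2017, 3, 13, 13, 32, 18), 0.0004806),
    (datetime(2017, 3, 13, 13, 32, 9), 0.0004801),
    (datetime(2017, 3, 19, 23, 59, 59), 0.0004807),
    (datetime(2018, 3, 8, 18, 18, 33), 0.0004897),
    (datetime(2018, 3, 8, 18, 17, 33), 0.0004896),
]

# Collect every ratio under the Sunday that ends its week.
buckets = defaultdict(list)
for ts, ratio in samples:
    buckets[week_ending(ts)].append(ratio)

# One median per week, in chronological order.
weekly_median = {week: median(vals) for week, vals in sorted(buckets.items())}
for week, value in weekly_median.items():
    print(week, value)
```

The first bucket comes out labelled 2017-03-19, matching the first row of the weekly table.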


Another thing that you learn while dumpster diving in data is to look for strange aberrations. Anything different is something worth looking at.

Strangest thing I've seen in this database? That some of these transactions don't have the same from account and to account. It's not even a particularly small portion of the database. Out of 1 million transactions, almost 300,000 have different origin and destination accounts.

No, I don't have a really good explanation for why that's the case, either. It's far too common to be unusual behavior, but there's something about the way that these withdrawal transactions are happening that I don't understand.

In [17]:
MultData = Data[(Data['from_account'] != Data['to_account'])]

In [177]:
MultData.head()

| timestamp | deposited | from_account | ratio | to_account | withdrawn |
| --- | --- | --- | --- | --- | --- |
| 2018-03-08 17:08:33 | 0.397 | jennykidd | 0.000489 | cabi5boh | 0.397 |
| 2018-03-08 17:08:18 | 0.415 | bitrexx | 0.000489 | cabi5boh | 0.415 |
| 2018-03-08 15:45:09 | 0.387 | index178 | 0.000489 | sigmajin | 0.387 |
| 2018-03-08 15:45:09 | 0.387 | index180 | 0.000489 | sigmajin | 0.387 |
| 2018-03-08 15:45:09 | 0.387 | index181 | 0.000489 | sigmajin | 0.387 |


In [176]:
len(MultData)

296947

While we've got the information out, let's see what happens when we group by the receiving account.

In [175]:
MultData.groupby('to_account').mean(numeric_only=True)

| to_account | deposited | ratio | withdrawn |
| --- | --- | --- | --- |
| aaron | 2621.641637 | 0.000487 | 2621.641637 |
| adiel | 3.751456 | 0.000483 | 3.751456 |
| ahmeddimassi | 0.836500 | 0.000489 | 0.836500 |
| aizensou | 7.521411 | 0.000481 | 7.521411 |
| alittle | 21509.356333 | 0.000486 | 21509.356333 |
| alot | 72956.143818 | 0.000486 | 72956.143818 |
| alpha | 7272.757897 | 0.000481 | 7272.757897 |
| anastacia | 81067.693000 | 0.000483 | 81067.693000 |
| another | 2.276687 | 0.000481 | 2.276687 |
| anyx | 6.104247 | 0.000482 | 6.104247 |


Now, this is where it gets strange. There are only 187 distinct recipient accounts for these 300,000 withdrawal transactions.

It was looking a little strange before – but anytime you see a sample this large have fewer than 200 of anything happening, it's worth looking at.

I went with the mean derivation on this one because it seemed like an actual average for each of these values would be meaningful. They are certainly interesting. Some of these things are surprisingly vast, while in general most of the leakage effects are relatively small.

Again, I have no idea what's going on here – but because there are only 187 distinct rows that's a series of relationships which we can easily graph, and we will – after we do a few other things first.
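Just to pin the proportions down, here is the arithmetic on the counts printed above. One caveat: the total is `len(fvL)`, taken before the zero and near-one ratios were filtered out, so the share is a close approximation rather than an exact figure.

```python
total_withdrawals = 1049438   # len(fvL), before the ratio/deposit filters
cross_account = 296947        # len(MultData): from_account != to_account
recipients = 187              # distinct to_account values among those

share = cross_account / total_withdrawals
per_recipient = cross_account / recipients

print(f"share of withdrawals landing in another account: {share:.1%}")
print(f"average fills per recipient account: {per_recipient:.0f}")
```

That is a bit over a quarter of the traffic, and on average well over a thousand of these operations per receiving account across the year.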

## Do You Think My Slope Is Pretty?

It's time, finally, to answer the question we started this whole thing with. We already have a pretty good idea, but a picture is worth a thousand words. Which probably means that a graph is worth about half of what I've written already about this problem?

When I put it that way, it starts to be kind of sad.

Rather than use [plotly](https://plot.ly/) today, I decided to learn an entirely new graphing system, because if you don't learn at least three new complicated things a day what good are you? Besides, while plotly does make some really nice online graphs – they really do require that a server elsewhere crunch the data, and I am almost certain to have more hardware sitting here under me to do this graph crunching than any online service is likely to let me have for free.

I was right about that. Crunching the graphs with [bokeh](https://bokeh.pydata.org/en/latest/) took almost 0 time and generated a very nice HTML interface locally which could be saved out to a convenient image.

Setting up a graph is a little bit fiddly and I don't have exactly what I wanted – but it's good enough for government work which means it's more than good enough for this job.

In [20]:
bplt.output_file('weeklyRatio.html')

In [21]:
Plot = bplt.figure( tools='pan,box_zoom,reset,save',
                              title='Median Ratio Per Week For a Year',
                              x_axis_label='Week', y_axis_label='Ratio',
                              x_axis_type='datetime',
                              width=600, height=400, sizing_mode='scale_height'
                            )

Plot.axis.minor_tick_in = -3
Plot.axis.minor_tick_out = 6

Plot.yaxis[0].formatter = bokeh.models.NumeralTickFormatter(format="0.000000")

# Plot.yaxis.major_label_orientation = "vertical"

Plot.grid.grid_line_alpha = 0.4
Plot.ygrid.minor_grid_line_color = 'grey'
Plot.ygrid.minor_grid_line_alpha = 0.2

Plot.ygrid.band_fill_color='olive'
Plot.ygrid.band_fill_alpha=0.1

# legend_label replaces the older legend= keyword in current Bokeh releases
Plot.line(WkData.index, WkData.ratio, color='red', legend_label='Ratio')

In [22]:
bplt.show(Plot)

![Imgur](https://i.imgur.com/Q62VDMb.png)

This thing looks like a climate change graph, and it has something in common with global warming.

The slope of that curve is nowhere near as intense as it appears on the sheet.

Notice what the range on the Y axis actually is. It just runs over the range that we've already seen, differing by 1/100000 from the top to the bottom. I also used the weekly numbers rather than the full 1 million element set because there was absolutely no need for that level of resolution. 52 points along the timeline are more than sufficient.

While the slope isn't particularly useful for us to know, the fact that it exists is interesting. There has been a general trend upwards on the steem per mega-vests over the last year. As I understand it, that could only be the result of witnesses making a conscious decision to increase that number ever so slightly.

Again, I could be completely wrong about that. Once you start getting down to this level of poking at the blockchain, there is a lot of mystery and a lot of absolute absence of documentation, with the only reference that you can look at which makes a difference being the source code.

I haven't quite descended to that level of madness. Yet.

## Who's Touching Who?

Let's take the pause that refreshes and look back up at those accounts which make up 30% of the overall transactions of this nature but only target 187 accounts or so. What's going on there?

I have no idea what's going on there, but we can see what that network of accounts actually looks like. All we need to do is create a relationship map which sums the transfers between them and look at what's in front of us.

That's going to take our old friend [graphviz](https://www.graphviz.org/), so let's get that set up.

In [117]:
from graphviz import Digraph

In [152]:
dot = Digraph(comment="Steem-to-Mvest Oddity Map", 
              format="svg",
              engine="sfdp")

In [153]:
dot.attr('graph', overlap='false')
dot.attr('graph', ratio='auto')
dot.attr('graph', size='1000000,1000000')
dot.attr('graph', start='1.0')
dot.attr('graph', K='10')
dot.attr('graph', margin='5')

We'll start with accounts being light grey, just for some visual texture.

In [154]:
dot.attr('node', shape='rectangle', 
         style='filled', color='black', 
         fillcolor='lightgrey')

Now that we have a framework, there needs to be some data. We'll build a new data frame to give us something to work with.

No timeline info is going to be necessary, which is nice. All that can go. Just accounts and values of what's been withdrawn.

In [155]:
rGraphData = pd.DataFrame({'to': MultData.to_account,
                           'from': MultData.from_account,
                           'deposit': MultData.deposited,
                           'withdraw': MultData.withdrawn})

In [156]:
rGraphData.head(20)

| timestamp | deposit | from | to | withdraw |
| --- | --- | --- | --- | --- |
| 2018-03-08 17:08:33 | 0.397 | jennykidd | cabi5boh | 0.397 |
| 2018-03-08 17:08:18 | 0.415 | bitrexx | cabi5boh | 0.415 |
| 2018-03-08 15:45:09 | 0.387 | index178 | sigmajin | 0.387 |
| 2018-03-08 15:45:09 | 0.387 | index180 | sigmajin | 0.387 |
| 2018-03-08 15:45:09 | 0.387 | index181 | sigmajin | 0.387 |
| 2018-03-08 15:45:09 | 0.387 | index182 | sigmajin | 0.387 |
| 2018-03-08 15:45:09 | 0.387 | index183 | sigmajin | 0.387 |
| 2018-03-08 15:45:09 | 0.387 | index184 | sigmajin | 0.387 |
| 2018-03-08 15:45:09 | 0.387 | index185 | sigmajin | 0.387 |
| 2018-03-08 15:45:09 | 0.387 | index186 | sigmajin | 0.387 |


In [157]:
rGraphData.tail()

| timestamp | deposit | from | to | withdraw |
| --- | --- | --- | --- | --- |
| 2017-03-13 16:59:42 | 0.935 | newbie6 | originate | 0.935 |
| 2017-03-13 16:58:45 | 0.898 | newbie26 | originate | 0.898 |
| 2017-03-13 16:58:09 | 1.272 | newbie25 | originate | 1.272 |
| 2017-03-13 16:57:36 | 1.67 | newbie24 | originate | 1.67 |
| 2017-03-13 16:56:57 | 1.67 | newbie23 | originate | 1.67 |


At first glance, I thought I had absolutely blown something in the translation. All of the from accounts appeared to be the same with the same timestamp – but then I looked closer. A lot closer. All of these transactions occurred during the same second, it's absolutely true. But they are being issued from a series of accounts with integer increasing names to a single withdrawal target.

Is this what a bot farm paying off looks like?

I have no idea what this is. But whatever it is, it's synchronized tightly enough so that all of these things are being issued at the same moment.

Looking at the other end of the record, the oldest things look pretty standard. Though, again, we see a situation in which a series of integer-named accounts are all withdrawing to the same account.

At least they're not all hitting the blockchain at once.

Again, what's going on here? I have no idea. Maybe @sigmajin and @originate would like to let us know?
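Spotting those integer-named families by eye does not scale, so here is a minimal sketch of how one might flag them automatically. The regular expression, the `families` structure, and the threshold of three numbered senders are all my own assumptions for illustration; the account pairs are copied from the tables above.

```python
import re
from collections import defaultdict

# (from_account, to_account) pairs copied from the tables above.
pairs = [
    ('jennykidd', 'cabi5boh'), ('bitrexx', 'cabi5boh'),
    ('index178', 'sigmajin'), ('index180', 'sigmajin'),
    ('index181', 'sigmajin'), ('index182', 'sigmajin'),
    ('newbie6', 'originate'), ('newbie23', 'originate'),
    ('newbie24', 'originate'), ('newbie25', 'originate'),
]

# A name that ends in digits splits into (prefix, number), e.g. index178.
NUMBERED = re.compile(r'^(.*?)(\d+)$')

# Map (recipient, name prefix) -> list of the integer suffixes seen.
families = defaultdict(list)
for sender, recipient in pairs:
    match = NUMBERED.match(sender)
    if match:
        prefix, number = match.groups()
        families[(recipient, prefix)].append(int(number))

# Keep only prefixes that feed one recipient from several numbered accounts.
# The cut-off of 3 is arbitrary.
suspicious = {key: sorted(nums) for key, nums in families.items() if len(nums) >= 3}
print(suspicious)
```

On this tiny sample it picks out the `index` family feeding @sigmajin and the `newbie` family feeding @originate, and ignores the ordinary-looking names.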

Since we know that there are different numbers of from and to accounts, or at least that's the assumption we're going on for the moment, maybe we should figure out exactly how big the disparity between those two groups is.

That's pretty easy if we look at the data simply grouped by each of them separately.

In [171]:
rGraphData.groupby(['from', 'to']).sum().head()

| from | to | deposit | withdraw |
| --- | --- | --- | --- |
| a-ok | nrg | 51.622 | 51.622 |
| a00 | nrg | 40.528 | 40.528 |
| a1r | nrg | 40.622 | 40.622 |
| a2v6aaz0tf | sigmajin | 15.304 | 15.304 |
| a4elentano | dart | 40.231 | 40.231 |


In [172]:
rGraphData.groupby(['from', 'to']).sum().tail()

| from | to | deposit | withdraw |
| --- | --- | --- | --- |
| zyl | onthax | 209.354 | 209.354 |
| zynnnymehetele | dart | 35.217 | 35.217 |
| zzz | www | 35819.052 | 35819.052 |
| zzz3ya | dart | 35.126 | 35.126 |
| zzzya | dart | 35.1 | 35.1 |


Well, now we know how that works out. Nearly 24,000 source accounts are sending to just under 200 recipient accounts on a regular basis, making up nearly 1/3 of the traffic of this particular kind of operation on the blockchain, day in, day out, for the entirety of the last year.

I'm not going to say that such things look sketchy, but they look sketchy as Hell.

In [169]:
toRG = rGraphData.groupby(['to', 'from'])

toRG.sum().head()

| to | from | deposit | withdraw |
| --- | --- | --- | --- |
| aaron | jesta | 211542.903 | 211542.903 |
| aaron | powerbot-1 | 4255.104 | 4255.104 |
| aaron | powerbot-2 | 8486.685 | 8486.685 |
| aaron | powerbot-3 | 6477.957 | 6477.957 |
| aaron | powerbot-4 | 7806.74 | 7806.74 |


In [170]:
toRG.sum().tail()

| to | from | deposit | withdraw |
| --- | --- | --- | --- |
| zear | bellyrub | 3454.115 | 3454.115 |
| zear | bellyrubbank | 66.964 | 66.964 |
| zear | itzzia | 57.687 | 57.687 |
| zear | psych101 | 33.549 | 33.549 |
| zear | zeartul | 4690.089 | 4690.089 |


And there's our activity, all laid out neatly and effectively with all of the accounts which are pumping up into others completely revealed.

I find it interesting that, thinking back on the recent @bellyrub insanity, we can see what accounts that @zear has been involved with – at least as regards whatever this particular vesting operation represents.

All right, let's get on making that relationship graph. I'm more curious than ever to see what it looks like.

In [160]:
toNodes = set()
fromNodes = set()

for e in list(toRG.groups.keys()):
    (t, f) = e
    toNodes.add(t)
    fromNodes.add(f)    

In [161]:
len(toNodes), len(fromNodes)

(187, 21507)

In [162]:
for n in toNodes:
    dot.node(n)

In [163]:
dot.attr('node', fillcolor='lightgreen', shape='oval')

In [164]:
for n in fromNodes:
    dot.node(n)

Let's look at making the connecting edges a little more interesting. We'll label them with the actual sum of transfers that have gone along that road.

In [165]:
sumToRG = toRG.sum()

In [166]:
dot.attr('edge', fontcolor='darkred')

In [167]:
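# The index keys are (to, from) tuples because toRG was grouped by ['to', 'from'];
# each edge is drawn from the source account to the recipient.
for e in sumToRG['deposit'].items():
    (t, f), v = e
    dot.edge(f, t, taillabel=str(v))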

In [168]:
%time dot.render('02')

Wall time: 2min 41s


'02.svg'

![](./02.svg)

[(Available straight from the source on Github, if you want it.)](https://github.com/SquidLord/Steem-Per-Vests-Over-Time/blob/master/02.svg)

That graph is huge, and I apologize for the necessity – and for the fact that it needed to be rendered with SVG because no binary representation was going to be effective enough to allow enough scaling so that you can make out any part of it.

Notice what we are seeing. There are clearly visible islands with very little interconnection between them which represent – I'm not exactly sure what.
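Those islands can also be counted rather than eyeballed. What follows is a minimal sketch, not something I ran against the full data set: it treats the relationship map as an undirected graph and finds its connected components with a breadth-first search. The handful of edges is copied from the tables above.

```python
from collections import defaultdict, deque

# (from_account, to_account) edges copied from the tables above.
edges = [
    ('jennykidd', 'cabi5boh'), ('bitrexx', 'cabi5boh'),
    ('index178', 'sigmajin'), ('a2v6aaz0tf', 'sigmajin'),
    ('a-ok', 'nrg'), ('a00', 'nrg'),
    ('bellyrub', 'zear'), ('bellyrubbank', 'zear'),
]

# Treat the graph as undirected: an island is a connected component.
neighbours = defaultdict(set)
for sender, recipient in edges:
    neighbours[sender].add(recipient)
    neighbours[recipient].add(sender)

seen = set()
islands = []
for start in sorted(neighbours):
    if start in seen:
        continue
    # Breadth-first search outward from an account we have not visited yet.
    seen.add(start)
    island, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        island.add(node)
        for other in neighbours[node] - seen:
            seen.add(other)
            queue.append(other)
    islands.append(island)

print(len(islands), "islands")
for island in islands:
    print(sorted(island))
```

Fed the full edge list from `sumToRG`, the same loop would give a count of islands and their sizes to set beside the picture.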

[Further research suggests, but does not prove, that this particular operation on the blockchain is related to powering down,](https://steemkr.com/steemsql/@steemreports/re-demotruk-re-steemreports-re-demotruk-re-steemreports-re-demotruk-re-steemreports-re-demotruk-upvote-bounty-can-someone-confirm-how-many-transfer-transactions-were-performed-yesterday-20170808t151406235z) filling the request for steem in exchange for SP/vests, which would make sense as to why it would be the source for deriving the ratio of steem to vests.

What we see here are accounts which are receiving steem for other accounts powering down. How does that even work? If that is the explanation for what we're seeing, there are literally 187 and only 187 accounts which are profiting from that process in the last year.

Looking deeper at that list of 187, you see some very interesting things, like the fact that most of them have no posts or votes. They are simply repositories for funds.

Perhaps the most interesting thing that I saw while just poking around through the list of 187 was [this particular transaction on the blockchain,](https://steemblockexplorer.com/tx/c90b519349ee036fd346899815431eb4fd40af50) where @gtg apparently transferred 0.001 SBD to one of these minimally active accounts in exchange for a witness vote.

Maybe that's common practice, I don't know – but I do know that it struck me as interesting.

![Imgur](https://i.imgur.com/72wvMo7.png)

## Take It Away

I realize that I have a lot more questions than I have answers. That seems to be the way that investigation goes for me; find something interesting, follow it down the rabbit hole, watch it turn into the World Serpent.

What have I found here?

Well, firstly – you can totally replace your expected steem per vest with a pretty straightforward constant and generally expect to be close with any calculations that you might have been inclined to do in order to determine the value of a vote or other interaction with the blockchain.
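How close is "close"? A quick sketch using only numbers printed earlier: take the year's median as the constant and compare it against the newest and oldest ratios in the list. `relative_error` is a made-up helper for illustration.

```python
MEDIAN_RATIO = 0.00048264105814548454   # np.median(spmL) from above
newest = 0.0004896611726798767          # spmL[0], March 2018
oldest = 0.00048055643711602            # spmL[-1], March 2017


def relative_error(constant, actual):
    """Signed error of using a constant in place of the actual ratio (hypothetical helper)."""
    return (constant - actual) / actual


for label, actual in (("newest", newest), ("oldest", oldest)):
    print(f"{label}: {relative_error(MEDIAN_RATIO, actual):+.2%}")
```

At both ends of the year the constant is off by less than one and a half percent, which is the sense in which a fixed number is good enough for estimating a vote.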

But that's not the question that you are really interested in at this point, is it?

Me either.

What is going on with these 187 accounts? That's the question that most concerns me right now.

If you go panning around the graph, you'll see an entire swath of associated accounts which are integer, sequentially numbered. I would bet dollars to doughnuts that those are almost all bot-created and bot maintained. A lot of other accounts have the whiff of procedural generation about their naming, even if they don't go as far as labeling themselves *"newbie1."*

I'm pretty sure what we are seeing here are vast numbers of bots which are run in one of two modes, either centralized commander-node architecture or a more distributed ring of mutually owned nodes which are trying very hard to stay under the radar by distributing their interactions across a much wider number of accounts.

Not all of these accounts in a given island are bots, most assuredly. But all of them have a relationship.

The fact that there are extremely isolated islands suggests that there is very little collusion between these groups of related accounts. Looking at the graph as a whole at a high level, the nodes of activity are extremely obvious.

What does it mean? *I haven't the foggiest.*

What I *do* know is that over the last year, a full one third of the blockchain traffic of the `fill_vesting_withdraw` operation has involved these 187 accounts. That's 300,000 of 1,000,000.

If this stuff had been more obscure or less obscure, if it had been more common to be engaged in this operation with another account or if it had been less common – I wouldn't have given it another look.

It falls into just enough unusual that it drew my eye and it continues to be extremely unusual the more I, bit by bit, put together ideas about it.

If anyone has insight into what we see here, I invite you to share it with the rest of us. If anyone wants to reproduce my research, to go digging into the blockchain database and discover more about these accounts and what they're doing – that's why my code is here. Feel free to make use of it as you will.

In the meantime, I'm going to go make sure that my secure lab is secure against intrusion and I am unlikely to be murdered in my sleep. And I want everyone to know that if you kill my dog, I have seen [**John Wick** and I know what to do.](https://youtu.be/iGpZ9xaQLYQ)

Also, if you would like to employ me as your own private dick, remember that I don't come cheap and I don't take cases that don't entertain me. And all my dialogue is written by [Raymond Chandler.](https://en.wikiquote.org/wiki/Raymond_Chandler)

That's life in the big city.

## *Epilogue*

I feel like I ask a lot of existential questions in the course of my little explorations of the steem blockchain.

What does it mean to be? Are any of us real or are we all bots? Is it all a giant illusion, or are there things which are grounded reality but which elude our perception? Can any of us really, truly know anything?

In order: the continuation of existence; the first; yes, there is grounded reality; and knowledge is possible.

I'm glad I could help you with your existential crisis.

Remember, these regular doses of madness are brought to you by an obsessive need to know things and viewers like you. (The letter "Y" was unavailable for comment.)

### Tools

- [Python 3.6](http://python.org)
- [Jupyter Lab](http://jupyter.org/)
- [*SteemData* by @furion](https://steemdata.com/)
- [Pandas](https://pandas.pydata.org/)
- [Bokeh](https://bokeh.pydata.org/en/latest/)
- [Graphviz](https://www.graphviz.org/)