# An Explain Report

This is a POC for an explain report: given a number at time T1 and time T2, introspect the clocks involved to produce a simple linear breakdown
of what changed on various timelines to account for the change in the number between T1 and T2.

The report is an ordered sequence of clock changes: the sum of the changes should equal the total change. The order of application is important: change the
ordering, and the resultant values may change. It's not a Jacobian-type report with T1 deltas and no ordering.
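
To illustrate why ordering matters, here is a minimal standalone sketch that does not use `mand`. The names `explain`, `value`, `t1` and `t2` are hypothetical and exist only for this example. Inputs are moved from their T1 value to their T2 value one at a time, and each step is charged with the change it causes:

```python
def explain(fn, start, end, order):
    """Attribute fn(end) - fn(start) to inputs, moving them one at a time."""
    state = dict(start)
    prev = fn(**state)
    rows = []
    for name in order:
        state[name] = end[name]   # advance this one input to its T2 value
        curr = fn(**state)
        rows.append((name, curr - prev))
        prev = curr
    return rows


def value(quantity, price):
    # A deliberately non-linear valuation, so the inputs interact.
    return quantity * price


t1 = {'quantity': 100, 'price': 10}
t2 = {'quantity': 150, 'price': 12}

a = explain(value, t1, t2, ['quantity', 'price'])
b = explain(value, t1, t2, ['price', 'quantity'])
total = value(**t2) - value(**t1)

print(a)      # [('quantity', 500), ('price', 300)]
print(b)      # [('price', 200), ('quantity', 600)]
print(total)  # 800
```

Both orderings sum to the same total, but the amount attributed to each input differs, which is why the report fixes an order.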

The real reason for this workbook is to start working on the node class optimizations and metadata management.

## Footnotes

Footnotes are one of the simplest types of metadata:
* Any computation (even a simple read of a constant value) may declare it has footnotes
* Every computation gets the footnotes of all its inputs
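
A minimal sketch of these two rules, independent of `mand`: the `Noted` class and `compute` helper are hypothetical names invented for this example, and the numbers are illustrative only.

```python
class Noted(object):
    """A value together with the footnotes attached to it."""

    def __init__(self, value, footnotes=()):
        self.value = value
        self.footnotes = frozenset(footnotes)


def compute(fn, *inputs, **kwargs):
    """Apply fn to the input values; the result inherits every input's footnotes."""
    notes = set(kwargs.get('footnotes', ()))   # footnotes declared by this step
    for i in inputs:
        notes |= i.footnotes                   # footnotes inherited from inputs
    return Noted(fn(*[i.value for i in inputs]), notes)


spot = Noted(175.68)
cash = Noted(-17569.0, footnotes=['Inadequate cash discounting model used'])

position = compute(lambda s: 100 * s, spot)
npv = compute(lambda p, c: p + c, position, cash)

print(sorted(position.footnotes))  # []
print(sorted(npv.footnotes))       # ['Inadequate cash discounting model used']
```

The footnote declared on the cash leg reaches `npv` without any of the intermediate code having to know about it.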

Footnoting is based on a number of observations about large software systems (say, 1M LOC or more):
* Any complex report (i.e. computation output) will generally be somewhat wrong, misleading, or out of date.
* If you want 100% correctness and truth, your report will raise an exception every time you run it.
* If you just want it to run, you'll get a result, but some log file on some machine in some compute farm will have a message explaining why your result is wrong.
* And you won't see that message.
* And that message may have leaked information that bad actors can read.
* And you won't trust this system, so your own developers will copy the underlying data and write their own version of the report with all the same issues.
* Now you don't have a problem, but your company has two.

Footnotes summary:
* Are the documentation of problems from the producer's point of view
* Make no claim about usability of results, etc, for a specific consumer
* May be programmatically removed/condensed/replaced at controlled code points
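
As a rough sketch of condensing footnotes at a controlled code point: the `condense` function below is hypothetical, not part of `mand`, and the second book path is made up for the example.

```python
def condense(footnotes, prefix, summary):
    """Replace every footnote starting with prefix by a single summary line."""
    matched = [f for f in footnotes if f.startswith(prefix)]
    kept = [f for f in footnotes if not f.startswith(prefix)]
    if matched:
        kept.append(summary % len(matched))
    return kept


notes = [
    'Inadequate cash discounting model used',
    'Book appears multiple times: /Global/TradingBook/Eq-Inst0',
    'Book appears multiple times: /Global/TradingBook/FX0',  # made-up path
]

short = condense(notes, 'Book appears multiple times:',
                 '%d books appear multiple times')
print(short)
```

The producer still records every problem; only the code that assembles a specific report decides how much detail to show.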


In [1]:
import mand.core

from mand.core import Entity, node, Context

from mand.core import ObjectDb, _tr, Timestamp, getNode
from mand.core import ProfileMonitor, PrintMonitor
from mand.lib.extrefdata import ExternalRefData, dataField
from mand.lib.workflow import Workbook, WorkItemOpenEvent, WorkItem
from mand.lib.portfolio import Portfolio
from mand.core import displayDict, displayMarkdown, displayListOfDicts, displayHeader
from mand.core import num, find
import datetime


from mand.demos.trading import TradingBook, TradingPortfolio, MarketDataSource, MarketInterface

db = ObjectDb()

from mand.lib.dbsetup import setUpDb
setUpDb(db)
db.describe()

<mand.db.ObjectDb object at 0x10b48ee50>: 203, mem=True, ro=False: entities=9, map=2


In [2]:
def makeTree(names):
    ret = []
    for name in names:
        subs = [ TradingBook(name+str(i)) for i in range(10) ]
        p = TradingPortfolio(name).write()
        p.setChildren(subs)
        ret.append(p)
    return ret

with db:
    pAll = TradingPortfolio('TopOfTheHouse').write()
    subs = makeTree(['Eq-Prop', 'Eq-Inst', 'FX', 'Rates', 'Credit', 'Delta1', 'Loans', 'Commod', 'ETFs', 'Mtge'])
    pAll.setChildren(subs)
    
print pAll
print '# books:', len(pAll.books())
print '# children:', len(pAll.children())

<mand.demos.trading.TradingPortfolio object at 0x10c85f3d0>
# books: 100
# children: 10


In [3]:
with db:
    bExt  = _tr.TradingBook('Customer1')
    bExt2 = _tr.TradingBook('Customer2')
    
p1 = pAll.children()[0]
p2 = pAll.children()[1]
p3 = pAll.children()[2]
p4 = pAll.children()[3]

b1 = p1.children()[0]
b2 = p2.children()[0]
b3 = p3.children()[0]
b4 = p4.children()[0]

print bExt.meta.name()
print b1.meta.name()
print b2.meta.name()

Customer1
Eq-Prop0
Eq-Inst0


In [4]:
with db:
    s1_ibm  = MarketDataSource('source1.IBM')
    s1_goog = MarketDataSource('source1.GOOG')

s1_ibm.update(last=175.61)
s1_goog.update(last=852.12)

In [5]:
with db:
    ibm  = MarketInterface('IBM').write()
    goog = MarketInterface('GOOG').write()

# Book some trades

This time, we throw a few thousand in...

In [6]:
with db:
    TradeOpenEvent = _tr.TradeOpenEvent
    cf1 = _tr.ForwardCashflow()
    ins1 = _tr.Equity()
    ins2 = _tr.Equity(assetName='GOOG.Eq.1')
    
    ts0 = Timestamp()

    for i in range(10): # 1000
        ev0 = TradeOpenEvent(action='Buy',
                             item=ins2,
                             quantity=1,
                             premium=cf1,
                             unitPrice=852 + i/100.,
                             book1=b3,
                             book2=bExt2).write()
    
    for i in range(10):
        ev0 = TradeOpenEvent(action='Buy',
                             item=ins2,
                             quantity=1,
                             premium=cf1,
                             unitPrice=852 + i/100.,
                             book1=b4,
                             book2=bExt2).write()
    
    
    ts1 = Timestamp()
    
    ev1 = TradeOpenEvent(action='Buy',
                         item=ins1,
                         quantity=100,
                         premium=cf1,
                         unitPrice=175.65,
                         book1=b1,
                         book2=bExt).write()
    
    ts2 = Timestamp()
    
    s1_ibm.update(last=175.64)
    
    ts3 = Timestamp()
    
    ev2 = TradeOpenEvent(action='Buy',
                         item=ins2,
                         quantity=300,
                         premium=cf1,
                         unitPrice=852.12,
                         book1=b2,
                         book2=bExt).write()
    
    ev3 = TradeOpenEvent(action='Sell',
                         item=ins1,
                         quantity=100,
                         premium=cf1,
                         unitPrice=175.85,
                         book1=b2,
                         book2=bExt2).write()
    
    ts4 = Timestamp()
    
    s1_ibm.update(last=175.70)
    s1_goog.update(last=852.11)
    
    ts5 = Timestamp()
    
    s1_ibm.update(last=175.68)
    s1_goog.update(last=852.13)
    
    eod = Timestamp()
    
    ev4 = TradeOpenEvent(action='Buy',
                         item=ins1,
                         quantity=100,
                         premium=cf1,
                         unitPrice=175.69,
                         book1=b1,
                         book2=bExt,
                         amends=ev1,
                         message='Sorry, the broker says you actually paid 69. signed: the middle office'
                        ).write(validTime=ev1.meta._timestamp.validTime)
    
    s1_ibm.update(last=177.68)
    s1_goog.update(last=856.13)
    
    ts6 = Timestamp()
    

# An Explain Report

This is a mix of abstractions at the Core, DBA, BA, and User levels, but it does the job for now...

In [7]:
class Report(Entity):
    @node(stored=True)
    def valuable(self):
        return None
    
    @node(stored=True)
    def ts1(self):
        return None
    
    @node(stored=True)
    def ts2(self):
        return None
    
    @node
    def data(self):
        valuable = self.valuable()
        ts1 = self.ts1()
        ts2 = self.ts2()
        clock = valuable.getObj(_tr.RootClock, 'Main')
    
        def clocks(ts):
            def fn(node):
                obj = node.key[0]
                m = node.key[1].split(':')[-1]
                if isinstance(obj, _tr.Clock) and m == 'cutoffs':
                    return True
            with Context({clock.cutoffs: ts}, 'Clocks'):
                nodes = find(valuable.NPV, fn)
                return dict( [ (node.tweakPoint, node) for node in nodes ] )
    
        allNodes = clocks(ts1)
    
        allNodes.update(clocks(ts2))
        nodes = allNodes.values() 
    
        # IRL, we'd sort these according to some business req...
        # And our clocks might be arranged in an N-level tree...
        nodes = sorted(nodes, key = lambda node: node.key[0].meta.name())
    
        data = []
        curr = [0]
        def add(title, npv):
            pnl = npv - curr[0]
            curr[0] = npv
            data.append( {'Activity': title, 'PnL': pnl } )

        with Context({clock.cutoffs: ts1}, 'Start'):
            curr = [ valuable.NPV() ] # Starting balance
    
        tweaks = {}
        for n in nodes:
            tweaks[n.tweakPoint] = ts1
        with Context(tweaks, name='Start breaks'):
            start = valuable.NPV()
            add('Starting balance breaks', start)

        tsAmend = Timestamp(t=ts2.transactionTime, v=ts1.validTime)
        # XXX - modifying tweaks in place is a bit evil
        # This is only safe because I know Context() effectively copies, so this works
        # for now.
        for n in nodes:
            tweaks[n.tweakPoint] = tsAmend
            name = n.key[0].meta.name()
            with Context(tweaks, name='Amend %s' % name):
                add('prior day amends: %s' % name, valuable.NPV())
        for n in nodes:
            tweaks[n.tweakPoint] = ts2
            name = n.key[0].meta.name()
            with Context(tweaks, name='Activity %s' % name):
                add('activity: %s' % name, valuable.NPV())
    
        with Context({clock.cutoffs: ts2}, name='End'):
            end = valuable.NPV()
            add('Ending balance breaks', end)

        title = 'PnL explain for %s: %s' % (valuable.meta.name(), end-start)
        return data, title

    def run(self):
        data, title = self.data()
        node = getNode(self.data)
        footnotes = node.footnotes.values()
        displayHeader('%s' % title)
        if footnotes:
            displayMarkdown('**Caveat: this report encountered problems. See footnotes at bottom.**')
        displayListOfDicts(data, names=['Activity', 'PnL'] )
        if footnotes:
            displayMarkdown('## Footnotes:')
            txt = '\n'.join( [ '1. %s' % f for f in footnotes])
            displayMarkdown(txt)
    
r = Report(valuable=pAll, ts1=eod, ts2=ts6)
r.run()

# PnL explain for TopOfTheHouse: 1276.00

**Caveat: this report encountered problems. See footnotes at bottom.**

|Activity|PnL|
|-|-|
|Starting balance breaks|0.00
|prior day amends: MarketData|0.00
|prior day amends: Portfolio|0.00
|prior day amends: Trading|-4.00
|activity: MarketData|1280.00
|activity: Portfolio|0.00
|activity: Trading|0.00
|Ending balance breaks|0.00

## Footnotes:

1. Inadequate cash discounting model used

## Add some inconsistent data [Test]

Book b2 should appear multiple times in some portfolio trees and be flagged accordingly...

In [8]:
with db:
    p1.setChildren(p1.children() + [b2])

ts7 = Timestamp()

# Footnotes

Note that the report calculation has run, but has attached appropriate caveats to its output:

In [9]:
db3 = db.copy()
p = db3._get(pAll.meta.path())

with ProfileMonitor(mode='sum'): 
    r = Report(valuable=p, ts1=eod, ts2=ts7)
    r.run()

# PnL explain for TopOfTheHouse: 2296.00

**Caveat: this report encountered problems. See footnotes at bottom.**

|Activity|PnL|
|-|-|
|Starting balance breaks|0.00
|prior day amends: MarketData|0.00
|prior day amends: Portfolio|0.00
|prior day amends: Trading|-4.00
|activity: MarketData|1280.00
|activity: Portfolio|1020.00
|activity: Trading|0.00
|Ending balance breaks|0.00

## Footnotes:

1. Inadequate cash discounting model used
1. Book appears multiple times: /Global/TradingBook/Eq-Inst0


### Profile by nodes.
* times are in microseconds
* cumT is total time spent in function
* calcT is time spent in function, but not in a child node

|fn|n|cumT|calcT|cumT/call|sys|
|-|-|-|-|-|-|
|Report:data|1|5,908,102|13|5,908,102|GetValue
|Report:data|1|5,908,089|485|5,908,089|GetValue/Calc
|TradingContainer:NPV|11|5,900,050|116|536,368|GetValue
|TradingContainer:NPV|11|5,899,934|1,603|536,357|GetValue/Calc
|Portfolio:items|121|5,730,298|1,086|47,357|GetValue
|Portfolio:items|121|5,730,213|21,133|47,357|GetValue/Calc
|Workbook:items|1,104|4,700,039|9,227|4,257|GetValue
|Workbook:items|1,100|4,690,811|4,372,357|4,264|GetValue/Calc
|Root:Clocks|2|1,749,381|3,367|874,690|Context
|Portfolio:children|242|1,002,769|1,342|4,143|GetValue
|Portfolio:children|121|1,001,426|497,026|8,276|GetValue/Calc
|Root:Amend Trading|1|487,325|43|487,325|Context
|Root:End|1|474,363|41|474,363|Context
|Root:Start|1|468,348|21|468,348|Context
|Root:Amend MarketData|1|458,189|37|458,189|Context
|Root:Activity Portfolio|1|457,584|40|457,584|Context
|Root:Amend Portfolio|1|454,957|37|454,957|Context
|Root:Activity Trading|1|453,368|42|453,368|Context
|Root:Start breaks|1|450,551|39|450,551|Context
|Root:Activity MarketData|1|449,694|40|449,694|Context
|PortfolioUpdateEvent:children|121|426,395|2,473|3,523|GetValue
|TradingBook|102|393,415|393,415|3,857|Db.Get
|Equity:NPV|15|167,839|149|11,189|GetValue
|Equity:NPV|15|167,689|381|11,179|GetValue/Calc
|MarketInterface:spot|15|159,249|281|10,616|GetValue
|MarketInterface:spot|15|158,968|402|10,597|GetValue/Calc
|ExternalRefData:state|15|150,498|107|10,033|GetValue
|ExternalRefData:state|15|150,390|270|10,026|GetValue/Calc
|RefData:state|15|150,119|112|10,007|GetValue
|RefData:state|15|150,007|67,083|10,000|GetValue/Calc
|TradeOpenEvent|24|107,015|107,015|4,458|Db.Get
|Clock:cutoffs|2,480|104,318|7,069|42|GetValue
|Clock:cutoffs|20|97,308|484|4,865|GetValue/Calc
|TradeOpenEvent:ticket|506|96,922|2,348|191|GetValue
|Clock:parent|20|96,734|162|4,836|GetValue
|Clock:parent|20|96,571|80,291|4,828|GetValue/Calc
|TradingTicket|24|94,574|94,574|3,940|Db.Get
|PortfolioUpdateEvent|12|54,143|54,143|4,511|Db.Get
|TradingPortfolio|10|38,163|38,163|3,816|Db.Get
|RefDataUpdateEvent|9|36,760|36,760|4,084|Db.Get
|TradingBook:clock|2,200|35,994|18,090|16|GetValue
|Clock|5|19,469|19,469|3,893|Db.Get
|TradingBook:clock|1,100|17,903|14,490|16|GetValue/Calc
|_WorkItemEvent:book2|506|9,568|1,912|18|GetValue
|_WorkItemEvent:item|253|8,702|1,470|34|GetValue
|ClockEvent:parent|8|8,121|76|1,015|GetValue
|MarketInterface:source|15|8,067|104|537|GetValue
|Equity:refdata|15|8,058|132|537|GetValue
|MarketInterface:source|15|7,962|345|530|GetValue/Calc
|Equity:refdata|15|7,925|364|528|GetValue/Calc
|Portfolio:clock|242|7,531|2,041|31|GetValue
|ClockEvent|2|7,482|7,482|3,741|Db.Get
|MarketInterface|2|7,477|7,477|3,738|Db.Get
|MarketDataSource|2|7,449|7,449|3,724|Db.Get
|Equity|2|7,231|7,231|3,615|Db.Get
|Portfolio:books|231|5,610|1,412|24|GetValue
|Portfolio:clock|121|5,490|1,592|45|GetValue/Calc
|TradeOpenEvent:premium|253|5,002|1,347|19|GetValue
|MarketDataSource:clock|30|4,561|249|152|GetValue
|Portfolio:books|121|4,481|3,856|37|GetValue/Calc
|MarketDataSource:clock|15|4,312|198|287|GetValue/Calc
|RootClock|1|3,828|3,828|3,828|Db.Get
|ForwardCashflow|1|3,655|3,655|3,655|Db.Get
|Event:amends|450|2,420|2,420|5|GetValue
|_WorkItemEvent:book1|506|2,033|2,033|4|GetValue
|TradeOpenEvent:quantity|506|1,800|1,800|3|GetValue
|TradeOpenEvent:action|253|1,309|1,309|5|GetValue
|TradeOpenEvent:unitPrice|253|1,122|1,122|4|GetValue
|Entity:clock|40|505|310|12|GetValue
|RefDataUpdateEvent:data|58|224|224|3|GetValue
|Entity:clock|20|195|195|9|GetValue/Calc
|ForwardCashflow:NPV|11|193|116|17|GetValue
|MarketInterface:sourceName|15|167|123|11|GetValue
|RootClock:cutoffs|52|148|148|2|GetValue
|Equity:assetName|15|84|84|5|GetValue
|ForwardCashflow:NPV|11|76|76|6|GetValue/Calc
|MarketInterface:sourceName|15|44|44|2|GetValue/Calc
|Report:valuable|1|5|5|5|GetValue
|Report:ts1|1|4|4|4|GetValue
|Report:ts2|1|3|3|3|GetValue
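
The difference between the `cumT` and `calcT` columns above can be sketched with a small stack-based profiler. This is a hypothetical illustration, not the `ProfileMonitor` implementation; the `Profiler` class and its injectable `clock` are assumptions made for the example, and a fake clock is used so the numbers are deterministic.

```python
import time
from collections import defaultdict


class Profiler(object):
    def __init__(self, clock=time.time):
        self.clock = clock
        self.stack = []                  # one [name, start, child_time] per open call
        self.n = defaultdict(int)
        self.cumT = defaultdict(float)   # total time inside the call
        self.calcT = defaultdict(float)  # time inside the call, excluding children

    def enter(self, name):
        self.stack.append([name, self.clock(), 0.0])

    def leave(self):
        name, start, child = self.stack.pop()
        elapsed = self.clock() - start
        self.n[name] += 1
        self.cumT[name] += elapsed
        self.calcT[name] += elapsed - child
        if self.stack:
            # Charge this call's total time to its parent as child time.
            self.stack[-1][2] += elapsed


# Fake clock: NPV runs from t=0 to t=6 and calls items from t=1 to t=4.
ticks = iter([0, 1, 4, 6])
p = Profiler(clock=lambda: next(ticks))
p.enter('TradingContainer:NPV')
p.enter('Workbook:items')
p.leave()
p.leave()

print(p.cumT['TradingContainer:NPV'], p.calcT['TradingContainer:NPV'])  # 6.0 3.0
print(p.cumT['Workbook:items'], p.calcT['Workbook:items'])              # 3.0 3.0
```

This simple version double-counts `cumT` for recursive calls of the same name, which a real monitor would need to handle.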

# Caching/Reusing results

Until now, we have only checked whether the current context contains a value for the node we are asking for and, if so, reused it.

A better approach to caching bound *fn* on *object* in *context0* is:
    
1. Has fn been tweaked in context0? If so, return that
2. Is fn sufficiently trivial that we can avoid managing it as a node?
  * If trivial, just treat it as a pure python function and call it
  * Meta-data (inputs, outputs, etc) will be given to its caller and callees as appropriate
3. Ask object for context1
  * *context1* is a simplified version of *context0*
  * The default case is just to return *context0*
  * IRL, this could actually be a list of contexts or a pattern to match contexts against
4. Is fn cached in *context1*? If so, use that
5. Compute *fn* in *context1*
6. Construct *context2* from the inputs of the computed value
7. If *context2* is a subset of *context1*, cache *fn* in *context1*
8. Something odd happened
  * Footnote the problem as part of computation notes on the node's metadata
  * Maybe cache the node anyway in *context2*
    * Perhaps *object* will return *context2* as a simplification for future computations?
9. Return the value
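
The steps above can be sketched as a toy cache. This is a standalone illustration under simplifying assumptions, not the `mand` implementation: contexts are plain dicts of tweaks with hashable values, a node function receives a `read` callable for its inputs, and `NodeCache`, `get_value`, `simplify` and `trivial` are hypothetical names. The metadata hand-off in step 2 is omitted.

```python
class NodeCache(object):
    def __init__(self):
        self.values = {}     # (name, frozen context) -> cached value
        self.footnotes = []  # problems noticed while caching

    @staticmethod
    def _key(name, context):
        return (name, frozenset(context.items()))

    def get_value(self, name, fn, context0, simplify=None, trivial=False):
        if name in context0:                       # step 1: fn itself is tweaked
            return context0[name]
        if trivial:                                # step 2: plain function call
            return fn(context0.__getitem__)
        # Step 3: ask for context1; the default is context0 unchanged.
        context1 = simplify(context0) if simplify else dict(context0)
        key = self._key(name, context1)
        if key in self.values:                     # step 4: cached in context1
            return self.values[key]
        context2 = {}                              # inputs actually read

        def read(k):
            # Fall back to context0 if the simplification dropped an input.
            context2[k] = context1[k] if k in context1 else context0[k]
            return context2[k]

        value = fn(read)                           # step 5: compute
        missing = sorted(k for k in context2 if k not in context1)  # step 6
        if not missing:                            # step 7: context2 within context1
            self.values[key] = value
        else:                                      # step 8: something odd happened
            self.footnotes.append(
                '%s read inputs outside its simplified context: %s'
                % (name, ', '.join(missing)))
            self.values[self._key(name, context2)] = value
        return value                               # step 9


calls = []

def npv(read):
    calls.append(1)
    return read('spot') * read('quantity')

ctx = {'spot': 10, 'quantity': 3, 'unrelated': 99}
keep_inputs = lambda c: {k: v for k, v in c.items() if k in ('spot', 'quantity')}

cache = NodeCache()
v1 = cache.get_value('npv', npv, ctx, simplify=keep_inputs)
v2 = cache.get_value('npv', npv, dict(ctx, unrelated=100), simplify=keep_inputs)

odd = NodeCache()
v3 = odd.get_value('npv', npv, ctx, simplify=lambda c: {'spot': c['spot']})
```

Because `keep_inputs` drops the unrelated tweak, the second call is a cache hit even though `context0` changed. The overly aggressive simplification in the last call still produces a value, but leaves a footnote on `odd`.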

Notes:
* Parallel compute of nodes not considered yet
* Context simplification (step 3) and input simplification (step 6) are probably intimately related
* The split between calculation and caching is nice:
  * BAs can write business logic
  * Computer scientists can add caching logic where needed
* We can use a profiler-type object to gather runtime compute cost information
  * The resulting trace can be used as input to drive step 2 and step 3
    

In [10]:
valuable = b4
clock = valuable.getObj(_tr.RootClock, 'Main')
with Context({clock.cutoffs: ts7}):
    print valuable.NPV()
    node = getNode(valuable.NPV)
    node.printInputGraph()

40.85
 <TradingBook@10c8226d0/TradingContainer:NPV in Root:4427022232>
   <TradingBook@10c8226d0/Workbook:items in Root:4427022232>
     <TradeOpenEvent@10c827450/TradeOpenEvent:ticket in Root:4427022232>
     <TradeOpenEvent@10c827450/_WorkItemEvent:item in Root:4427022232>
     <TradeOpenEvent@10c827450/_WorkItemEvent:book1 in Root:4427022232>
     <TradeOpenEvent@10c827450/_WorkItemEvent:book2 in Root:4427022232>
     <TradeOpenEvent@10c827450/TradeOpenEvent:premium in Root:4427022232>
     <TradeOpenEvent@10c954a90/Event:amends in Root:4427022232>
     <TradeOpenEvent@10c91c3d0/TradeOpenEvent:unitPrice in Root:4427022232>
     <TradeOpenEvent@10c91c3d0/TradeOpenEvent:quantity in Root:4427022232>
     <TradeOpenEvent@10c91c3d0/TradeOpenEvent:ticket in Root:4427022232>
     <TradeOpenEvent@10c91c3d0/_WorkItemEvent:item in Root:4427022232>
     <TradeOpenEvent@10c91c3d0/_WorkItemEvent:book1 in Root:4427022232>
     <TradeOpenEvent@10c91c3d0/_WorkItemEvent:book2 in Root:4427022232>
   

In [11]:
# Input simplification



In [12]:
from mand.graph import DependencyManager, setDependencyManager
from mand.core import Event

class DM1(DependencyManager):
    
    def prn(self, input, txt):
        print txt
        isEvent = isinstance(input.object(), Event)
        print input.object(), input.methodId() 
        print isEvent, len(input.inputs)
        print input.tweakPoint
        print
        
    def addDep(self, input, output):
        if not input.tweakPoint:
            self.prn(input, 'No tweak point:')
        elif input.inputs:
            self.prn(input, 'Interesting:')
            
        output.inputs.add(input)
        input.outputs.add(output)

        
setDependencyManager(DM1())

db4 = db.copy()
valuable = db4.get(b4.meta.path())

clock = valuable.getObj(_tr.RootClock, 'Main')
with Context({clock.cutoffs: ts7}):
    print valuable.NPV()
    print valuable.items()
    node = getNode(valuable.NPV)
    node.printInputGraph()

Interesting:
<mand.clock.Clock object at 0x10d90a9d0> Clock:parent
False 4
<bound method Clock.fn of <mand.clock.Clock object at 0x10d90a9d0>>

Interesting:
<mand.clock.Clock object at 0x10d90af10> Clock:parent
False 2
<bound method Clock.fn of <mand.clock.Clock object at 0x10d90af10>>

Interesting:
<mand.clock.Clock object at 0x10d90af10> Clock:cutoffs
False 2
<bound method Clock.fn of <mand.clock.Clock object at 0x10d90af10>>

Interesting:
<mand.clock.Clock object at 0x10d90a9d0> Clock:cutoffs
False 2
<bound method Clock.fn of <mand.clock.Clock object at 0x10d90a9d0>>

Interesting:
<mand.clock.Clock object at 0x10d90a9d0> Clock:cutoffs
False 2
<bound method Clock.fn of <mand.clock.Clock object at 0x10d90a9d0>>

Interesting:
<mand.demos.trading.TradingBook object at 0x10b48d950> Workbook:items
False 92
<bound method TradingBook.fn of <mand.demos.trading.TradingBook object at 0x10b48d950>>

Interesting:
<mand.demos.trading.Equity object at 0x10d910f10> Equity:refdata
False 1
<bound met