# Dynamic Analysis

> Dynamic analyses collect information about the system as it executes

As opposed to **static analysis**:

> Static analyses analyze the system’s artifacts to obtain information that is valid for all possible executions (e.g., program structure or potential calls between different modules).
(View-Driven Software Architecture Reconstruction...)


```python
class Person(object):
    def have_fun(self):
        pass


class Student(Person):
    def have_fun(self):
        print("student kind of fun")


class SoftwareArchitect(Person):
    def have_fun(self):
        print("architect kind of fun")


# later in my code

def process(p: Person):
    p.have_fun()
```

##### Static Analysis limitations
- static analysis alone cannot tell which implementation of `have_fun()` runs inside `process`: that depends on the runtime type of `p`
- it provides no information about runtime properties (memory consumption, timing)
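
A minimal sketch of what the dynamic view adds to the example above: recording, at run time, which concrete class actually reaches `process`. The `observed_receivers` counter is a hypothetical helper introduced only for this illustration.

```python
from collections import Counter


class Person:
    def have_fun(self):
        pass


class Student(Person):
    def have_fun(self):
        print("student kind of fun")


class SoftwareArchitect(Person):
    def have_fun(self):
        print("architect kind of fun")


# Hypothetical helper: counts the concrete classes seen by process() at run time
observed_receivers = Counter()


def process(p: Person):
    observed_receivers[type(p).__name__] += 1  # record the dynamic type of p
    p.have_fun()


for person in [Student(), SoftwareArchitect(), Student()]:
    process(person)

print(dict(observed_receivers))  # {'Student': 2, 'SoftwareArchitect': 1}
```

The result is only valid for the executions we observed, which is exactly the trade-off compared to static analysis.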


### Uses of Dynamic Analysis in Architecture Recovery
In the Extraction phase: 
  - dependencies between components (e.g. `Model` -> `UI`)
  - properties of components (e.g. `Model` is never used, `Connection` is slow, etc.)
    - corroborate usage info with static analysis for, e.g. dead code detection

## Prerequisite: Running the System

- not as trivial as you might think
- challenges
  - configuration
  - dependencies
  - unwritten rules
  - some systems don't have a clear entry point (e.g. libraries)
- helpful practices
  - continuous integration
  - containerization
  - infrastructure as code
  
  

## How to Get the Program Running? 

- install the required 3rd party libraries
- deploy using containers
- ...

## Which Scenarios to Run from the System?

- Run the unit tests if they exist 
- Exercise "features" 

> A feature is a realized functional requirement of a system. [...] an observable unit of behavior of a system triggered by the user [Eisenbarth et al., 2003].

[Eisenbarth et al., 2003]. Thomas Eisenbarth, Rainer Koschke, and Daniel Simon. Locating features in source code. IEEE Transactions on Software Engineering, 29(3):210–224, March 2003.

## Approaches

- logging
- instrumentation
- traffic analysis


## Approach #1: Logging

- invasive - adding logging statements in the program
  - implies changing the program
- allows surgical precision - adding log statements only where relevant
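
As a rough illustration of the logging approach, here is a sketch using Python's standard `logging` module. The component name `shop.orders` and the function `place_order` are made up for the example; the point is that the log statement is added by hand, exactly where we want to observe the behavior.

```python
import logging

logging.basicConfig(level=logging.INFO,
                    format="%(name)s %(funcName)s: %(message)s")
log = logging.getLogger("shop.orders")  # hypothetical component name


def place_order(order_id, items):
    # The log statement is placed manually, only at the point we care about
    log.info("placing order %s with %d items", order_id, len(items))
    return {"id": order_id, "count": len(items)}


place_order(42, ["book", "pen"])
```

The format string includes the logger name and the function name, so each log line already says which component and which function produced it.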


## Approach #2: Instrumentation

What to instrument: 
- source code
  - using reflection, or code generation
- binaries
  - e.g. virtual machine instrumentation


### Instrumenting Binaries, e.g. Java

![](./images/java_instrumentation.png)

- Java programs are compiled into bytecode
- Bytecode is executed on the JVM
- You can provide a Java agent (via the `-javaagent` command-line argument) that modifies the bytecode before it is executed


Advantages:
  - no need to parse the source code
  - works for multiple languages (any language that compiles to the same bytecode, e.g. Kotlin or Scala on the JVM)



  
More on this topic:
- https://blog.sqreen.com/building-a-dynamic-instrumentation-agent-for-java/


### Instrumenting Source Code

- can be done using **reflection**

> Reflection is the ability of a program to manipulate as data something representing the state of the program during its own execution. 
> - **Introspection** is the ability for a program to observe and therefore reason about its own state. 
> - **Intercession** is the ability for a program to modify its own execution state or alter its own interpretation or meaning.

- in some languages it's easier to do (e.g. Ruby, Python, Java)  than in others



### Example: Introspection in Python

Goal: 
- a program that observes its own state (e.g. a class observing its own methods)


Python Specific: 
- use `cls_name.__dict__.items()` to get all the attributes of a class, and keep those that represent a method, i.e. those that have the `__call__` attribute

```python
# an object-oriented foobar example
class Foo(object):

    def __init__(self):
        self.x = 'foo'

    def do(self):
        print(self.x)


class Bar(object):

    def __init__(self, foo):
        self.foo = foo

    def do(self):
        self.foo.do()
```

```python
def methods_in_class(cls_name):
    """ list all methods in a class"""
    result = {}
    for method_name, value in cls_name.__dict__.items():
        if hasattr(value, '__call__'):
            result[method_name] = value
    return result
```

```python
methods_in_class(Foo)
```

```
{'__init__': <function __main__.Foo.__init__(self)>,
 'do': <function __main__.Foo.do(self)>}
```

Notes:
- it's the same program, even though it is split across three cells
  - `methods_in_class` could just as well have been defined as a method of the `Bar` class
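
An alternative sketch of the same kind of introspection, using the standard `inspect` module instead of walking `__dict__` by hand. The helper name `method_signatures` is introduced here for illustration only.

```python
import inspect


class Foo:
    def __init__(self):
        self.x = "foo"

    def do(self):
        print(self.x)


def method_signatures(cls):
    """Map each plain function found on the class to its signature string."""
    return {name: str(inspect.signature(fn))
            for name, fn in inspect.getmembers(cls, inspect.isfunction)}


print(method_signatures(Foo))  # {'__init__': '(self)', 'do': '(self)'}
```

One difference worth knowing: `inspect.getmembers` also reports functions inherited from base classes, whereas `cls.__dict__` only contains what the class itself defines.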

### Example: Intercession in Python

Goal: 
- let's have our program replace its methods on the fly
  - each with another method that prints a note when the function is called
  - we will thus trace the execution of the program!

Python specific: 
- We rely on `setattr( cls_name, key, replacement )` to replace the method found under the name `key` with `replacement`




```python
# same classes as before
class Foo(object):

    def __init__(self):
        self.x = 'foo'

    def do(self):
        print(self.x)


class Bar(object):

    def __init__(self, foo):
        self.foo = foo

    def do(self):
        self.foo.do()
```

```python
def replace_methods(cls_name, replacement):
    """ replace every method in class cls_name with the replacement method """
    for method_name, original_method in methods_in_class(cls_name).items():
        setattr(cls_name, method_name, replacement(original_method))

def wrapper(fn):
    def result(*args, **kwargs):
        print(f'entered {fn}')
        return fn(*args, **kwargs)
    return result
```


```python
replace_methods(Foo, wrapper)
Foo().do()
```

```
entered <function Foo.__init__ at 0x10d335830>
entered <function Foo.do at 0x10d3358c0>
foo
```

```python
replace_methods(Bar, wrapper)
```

```python
bar = Bar(Foo())
bar.do()
```

```
entered <function Foo.__init__ at 0x10d335830>
entered <function Bar.__init__ at 0x10d335a70>
entered <function Bar.do at 0x10d335b00>
entered <function Foo.do at 0x10d3358c0>
foo
```


##### Notes:
- this is easier in a dynamically typed language
- we have used **function wrappers**, a design pattern (pretty much the Proxy design pattern) where:
  - a function *wraps* another function in order to
    - perform some *prologue* and/or *epilogue* tasks
    - optimize (e.g. cache results )
  - while the *wrapper* is *fully* compatible with the wrapped function so it can be used instead



More on Function Wrappers: 
- https://wiki.python.org/moin/FunctionWrappers
- Wrappers to the Rescue: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.18.6550&rep=rep1&type=pdf

### Example: Tracing Method Calls with Function Wrappers

- we want a wrapper that prints, whenever a method is called:
  - the method name
  - who called it

- by deploying this in selected parts of our system we can trace all calls to the wrapped methods



```python
class Foo(object):

    def __init__(self):
        self.x = 'foo'

    def do(self):
        print(self.x)


class Bar(object):

    def __init__(self, foo):
        self.foo = foo

    def do(self):
        self.foo.do()
```

```python
import inspect

def trace_call(caller_method, called_method):
    print(caller_method + " -> " + called_method)

def tracing_wrapper(cls, fn):
    def result(*args, **kwargs):
        # the new stuff!!!
        # name of the function whose frame called the wrapped method
        caller_method = inspect.stack()[1].frame.f_code.co_name
        called_method = fn.__module__ + "." + str(cls) + "." + fn.__name__
        trace_call(caller_method, called_method)
        # up to here ^^^^
        return fn(*args, **kwargs)
    return result

def wrap_methods(cls, wrapper):
    """ replace every method in class cls with a wrapper method """
    # copy the items first, because setattr modifies the class while we iterate
    for key, value in list(cls.__dict__.items()):
        if hasattr(value, '__call__'):
            setattr(cls, key, wrapper(cls, value))
```


```python
wrap_methods(Foo, tracing_wrapper)
wrap_methods(Bar, tracing_wrapper)
Bar(Foo()).do()
```

```
<module> -> __main__.<class '__main__.Foo'>.__init__
<module> -> __main__.<class '__main__.Bar'>.__init__
<module> -> __main__.<class '__main__.Bar'>.do
do -> __main__.<class '__main__.Foo'>.do
foo
```


#### Case study


```python
%cd /Users/egh/Zeeguu-Core
```

```
/Users/egh/Zeeguu-Core
```

```python
import os
os.environ['ZEEGUU_CORE_CONFIG'] = '/Users/egh/Zeeguu-Core/default_core.cfg'
```

```python
from zeeguu_core.model import User
```

```python
from tools.past_exercises_for_user import past_exercises_for
past_exercises_for(534)
```

```
NoResultFound: No row was found for one()
```

```python
def all_classes_in(mod):
    """ return all the classes in a given module """
    import inspect, importlib
    result = []
    for name, thingy in inspect.getmembers(importlib.import_module(mod)):
        if inspect.ismodule(thingy):
            # recurse only into submodules of the module we started from
            if thingy.__name__.startswith(mod):
                result.extend(all_classes_in(thingy.__name__))

        elif inspect.isclass(thingy):
            # keep only the classes defined in this module, not the imported ones
            if thingy.__module__ == mod:
                result.append(thingy)
    return result
```


```python
import zeeguu_core
all_classes_in('zeeguu_core')
```

```
[zeeguu_core.language.difficulty_estimator_factory.DifficultyEstimatorFactory,
 zeeguu_core.language.difficulty_estimator_strategy.DifficultyEstimatorStrategy,
 zeeguu_core.language.strategies.default_difficulty_estimator.DefaultDifficultyEstimator,
 zeeguu_core.language.strategies.flesch_kincaid_difficulty_estimator.FleschKincaidDifficultyEstimator,
 zeeguu_core.model.SortedExerciseLog.SortedExerciseLog,
 zeeguu_core.model.article.Article,
 zeeguu_core.model.article_word.ArticleWord,
 zeeguu_core.model.articles_cache.ArticlesCache,
 zeeguu_core.model.bookmark.Bookmark,
 zeeguu_core.model.bookmark_priority_arts.BookmarkPriorityARTS,
 zeeguu_core.model.cohort.Cohort,
 zeeguu_core.model.cohort_article_map.CohortArticleMap,
 zeeguu_core.model.domain_name.DomainName,
 zeeguu_core.model.exercise.Exercise,
 zeeguu_core.model.exercise_outcome.ExerciseOutcome,
 zeeguu_core.model.exercise_source.ExerciseSource,
 zeeguu_core.model.feed.RSSFeed,
 zeeguu_core.model.feed_registrations.RSSFeedRegist
```

```python
for each in all_classes_in('zeeguu_core'):
    wrap_methods(each, tracing_wrapper)
```

```python
past_exercises_for(534)
```

```
NoResultFound: No row was found for one()
```

```python
tracefile = open("/Users/egh/tracing_calls_in_zeeguu_core.txt", "a")

def trace_call(caller_method, called_method):
    # one trace entry per line
    tracefile.write(caller_method + " -> " + called_method + "\n")
```

```python
past_exercises_for(534)
```

```
NoResultFound: No row was found for one()
```

```python
%cd tools
```

```
/Users/egh/Zeeguu-Core/tools
```


####  Challenges for you: Improve this if you can!
- report fully qualified names for the caller method
- measure the overhead
- extract the call graph from the unit tests
- instrument the running of the unit tests
  - **compare the dynamically extracted graph with the statically extracted graph**
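
One possible starting point for the graph-related challenges, as a sketch only: turning trace lines of the form `caller -> callee` into weighted edges. The `sample` lines are simplified, made-up entries, not actual output from the case study.

```python
from collections import Counter


def edges_from_trace(lines):
    """Count how often each 'caller -> callee' pair occurs in a trace."""
    edges = Counter()
    for line in lines:
        if " -> " not in line:
            continue  # skip blank or malformed lines
        caller, callee = line.strip().split(" -> ", 1)
        edges[(caller, callee)] += 1
    return edges


# Hypothetical, simplified trace entries
sample = [
    "<module> -> __main__.Bar.do",
    "do -> __main__.Foo.do",
    "do -> __main__.Foo.do",
]

for (caller, callee), count in edges_from_trace(sample).items():
    print(f"{caller} -> {callee} [{count}]")
```

The edge weights (call counts) are information that a statically extracted graph cannot provide.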



### Disadvantages of Wrappers
- they introduce an overhead (but then, so does all code-related tracing)
- they require you to obtain the **live** objects (must be in the same process as the instrumented code)
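
A rough way to get a feeling for the overhead, sketched with the standard `timeit` module. The wrapper below does no tracing at all, so it only measures the cost of the extra call; `work` is a placeholder function and the absolute numbers will differ from machine to machine.

```python
import timeit


def work(x):
    return x + 1


def wrapper(fn):
    def result(*args, **kwargs):
        return fn(*args, **kwargs)  # no tracing: pure call overhead
    return result


wrapped_work = wrapper(work)

plain = timeit.timeit(lambda: work(1), number=100_000)
wrapped = timeit.timeit(lambda: wrapped_work(1), number=100_000)
print(f"plain: {plain:.4f}s, wrapped: {wrapped:.4f}s")
```

A wrapper that also inspects the stack or writes to a file on every call can be expected to cost considerably more than this baseline.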


### Advantages of Wrappers

- still allows surgical precision 
- allow **even more surgical deployment and removal** of wrappers at runtime
  - e.g. FlaskMonitoringDashboard 
- as opposed to off-the-shelf tools that trace the entire execution of the program
  - compare with `python -m trace --trackcalls past_exercises_for_user.py` (executed from within the `tools` folder)
    
   
    
More on the `trace` module: https://docs.python.org/3/library/trace.html

## Approach #3: Traffic Analysis

- useful for service oriented architectures
- monitors the messages on the wire
- powerful approach for reverse engineering services


Read: https://danlebrero.com/2017/04/06/documenting-your-architecture-wireshark-plantuml-and-a-repl/
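
To give an idea of what can be done with observed messages, here is a sketch that renders them as a PlantUML sequence diagram. The capturing itself is out of scope here: the `captured` list, including the service names and requests, is invented for the illustration, and `to_plantuml` is a hypothetical helper.

```python
def to_plantuml(messages):
    """Render (source, destination, label) triples as a PlantUML sequence diagram."""
    lines = ["@startuml"]
    for source, destination, label in messages:
        lines.append(f"{source} -> {destination}: {label}")
    lines.append("@enduml")
    return "\n".join(lines)


# Hypothetical messages, e.g. as reconstructed from a packet capture
captured = [
    ("web", "api", "GET /user/534"),
    ("api", "db", "SELECT user"),
]

print(to_plantuml(captured))
```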

Further Possibilities:
- If somebody wants to work on this as their report, replicating this for Zeeguu-API / Zeeguu-Web would be great!
- If not for now, doing **something like this would be a great starting point for a thesis**



## Limitations

- limited by execution coverage 
  - if the program never reaches an execution point, there is no data for it (e.g. the printing code in Word, if the user never prints)
  
  
- can slow down the application considerably 


- can result in a large amount of data
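
The coverage limitation can be made concrete with a small sketch that compares what a class declares with what a scenario actually executes. `Model` and its methods are hypothetical; the recording wrapper follows the same idea as the function wrappers shown earlier.

```python
import inspect


class Model:  # hypothetical component
    def load(self):
        return "data"

    def save(self):
        return "saved"

    def export_legacy(self):
        return "old format"


def declared_methods(cls):
    """Static view: names of all functions defined in the class body."""
    return {name for name, value in vars(cls).items() if inspect.isfunction(value)}


executed = set()  # dynamic view, filled in while the scenario runs


def recording(name, fn):
    def result(*args, **kwargs):
        executed.add(name)
        return fn(*args, **kwargs)
    return result


for name in declared_methods(Model):
    setattr(Model, name, recording(name, getattr(Model, name)))

# The scenario we happened to run: it never exports anything
model = Model()
model.load()
model.save()

print(declared_methods(Model) - executed)  # {'export_legacy'}
```

`export_legacy` shows up as never executed, but that alone does not prove it is dead code: a different scenario might well reach it.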

  


## Uses Beyond Architecture Recovery

- Performance monitoring (e.g. the Flask Monitoring Dashboard, FMD)
- Intercepting and tracing specific calls
  - e.g. calls to the DB, calls across the network
  
- Quality control (e.g. test coverage tool)
- Dynamic optimizations 
- Logging Energy Usage (https://help.apple.com/instruments/mac/current/#/dev03a7149d)

### Further Reading

[Optional] Papers: 
- **Visualizing the Execution of Java Programs**. Wim De Pauw, Erik Jensen, Nick Mitchell, Gary Sevitsky, John Vlissides, Jeaha Yang

- **Correlating Features and Code Using a Compact Two-Sided Trace Analysis Approach**. Orla Greevy, Stephane Ducasse.
