# Phase 4 - Workbench substitutes

As the rules are extended and become more complex, the need for more sophisticated tooling also increases. The UIMA Ruta Workbench provides several features that are essential for developing large and complex scripts. These include an explanation of the rule execution (including inlined rules and conditions), profiling, and information about which rule created a specific annotation. These features are not (yet) available in IRuta, but can be substituted to a certain degree. This notebook provides some examples of how to approach common challenges during rule engineering.

First, we simply run the script developed in the previous notebooks and activate the debugging/explanation functionality of UIMA Ruta using the configuration parameters of the analysis engine. We load only one specific document, but the same could be done for all other documents.

In [None]:
%loadCas ./data-nlp/20878159.txt.xmi
%outputDir ./temp/debug-out
%displayMode NONE
%configParams --debug=true --debugWithMatches=true --debugAddToIndexes=true --createdBy=true --profile=true --statistics=true

%scriptDir temp/
%typeSystemDir typesystems/

SCRIPT Endpoints;
TYPESYSTEM EndpointsTypeSystem;
CALL(Endpoints);

The output now contains additional information about the rule execution, which we can investigate. As the script is extended and more rules are added during rapid prototyping, the execution of the script becomes slower. In this cell, we list all rules and inspect their runtime performance, i.e. how long each rule took to apply. This can be a useful pointer to rules that are too slow and should be rewritten.

In [None]:
%loadCas ./temp/debug-out/20878159.txt.xmi
%outputDir ./temp/trash
%displayMode CSV
%csvConfig -ProfiledRule element time

DECLARE ProfiledRule(STRING element, INT time);
ACTION Profiled(ANNOTATION dra) = CREATE(ProfiledRule, 
    "element"=dra.element,
    "time"=dra.time);
rule:DebugRuleApply{-> Profiled(rule)};

In the next cell, we use the created debug information to investigate the matching of the rules. The cell lists all rules with an UNMARK action (rules that remove an annotation), including the number of attempted (tried) and successful (applied) matches. This shows whether a rule matched at all and, if so, how often.

In [None]:
%loadCas ./temp/debug-out/20878159.txt.xmi
%outputDir ./temp/unused
%displayMode CSV
%csvConfig -DebugRule element applied tried

DECLARE DebugRule(STRING element, INT applied, INT tried);
ACTION Debug(ANNOTATION dra) = CREATE(DebugRule, 
    "element"=dra.element,
    "applied"=dra.applied,
    "tried"=dra.tried);

rule:DebugRuleApply{REGEXP(rule.element, ".*UNMARK.*")-> Debug(rule)};
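
A related question is which rules never matched at all, for example after a refactoring. The following cell is a minimal sketch of such a check, modeled on the cell above. The type `UnmatchedRule` and the action `Unmatched` are hypothetical names introduced here for illustration, and the sketch assumes that the numeric feature `applied` can be compared with `==` in the same way the string feature `element` is compared elsewhere in this notebook.

In [None]:
%loadCas ./temp/debug-out/20878159.txt.xmi
%outputDir ./temp/unused
%displayMode CSV
%csvConfig -UnmatchedRule element tried

// Hypothetical helper type: one row per rule without any successful match.
DECLARE UnmatchedRule(STRING element, INT tried);
ACTION Unmatched(ANNOTATION dra) = CREATE(UnmatchedRule, 
    "element"=dra.element,
    "tried"=dra.tried);

// Assumption: an implicit condition on the INT feature 'applied' is supported here.
rule:DebugRuleApply{rule.applied==0 -> Unmatched(rule)};

A rule listed here with a high `tried` count was attempted often but never matched successfully, which makes it a natural candidate for the position-level inspection that follows.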


The next step is usually to investigate where and why a rule matched or failed to match. The next cell highlights the positions where a specific rule failed to match (pink) or matched successfully (lightgreen).

In [None]:
%loadCas ./temp/debug-out/20878159.txt.xmi
%outputDir ./temp/unused
%displayMode RUTA_COLORING
DECLARE Matched,Failed;
apply:DebugRuleApply{apply.element=="v:Value{->UNMARK(v)} CIInd;"}->{
    match:apply.rules->{
        match.type==DebugMatchedRuleMatch{->Matched};
        match.type==DebugFailedRuleMatch{->Failed};
    };
};
COLOR(Matched, "lightgreen");
COLOR(Failed, "pink");