# Exercise 5: Undesirable Effect Relations in Tables

The goal of this exercise is to write a simple rule script that extracts undesirable effect information from tables. Declare a new annotation type named `UndesirableEffect` with three features of type `Annotation`.
The first feature, `class`, represents the symptom class of the undesirable effect. The second feature, `effect`, represents the effect itself. The third feature, `frequency`, represents the frequency of the effect.

First, we take a look at the two example tables.

In [None]:
%inputDir data/ex5_table1
%displayMode RUTA_COLORING

In [None]:
%inputDir data/ex5_table2
%displayMode RUTA_COLORING

Now, we write some rules to extract the triples.

In [None]:
%inputDir data/ex5_table1
%outputDir temp/
%displayMode CSV
%csvConfig UndesirableEffect frequency class 

// We write the output of this script in a temporary directory so that we can apply it to the other table, too
%writescript temp/UndesirableEffect.ruta
%saveTypeSystem temp/UndesirableEffectTypeSystem.xml

// We want to use the old DefaultSeeder for obtaining MARKUP annotations
%configParams --seeders=org.apache.uima.ruta.seed.DefaultSeeder

TYPESYSTEM org.apache.uima.ruta.engine.HtmlTypeSystem;
UIMAFIT org.apache.uima.ruta.engine.HtmlAnnotator;
EXEC(HtmlAnnotator, {TAG});

RETAINTYPE(WS, MARKUP);
TAG{-> TRIM(WS, MARKUP)};
RETAINTYPE;

// The targeted relation
DECLARE UndesirableEffect (Annotation class, Annotation effect, Annotation frequency);

// We define a macro action just for shorter rules later
ACTION UE(ANNOTATION class, ANNOTATION effect, ANNOTATION frequency) = 
    CREATE(UndesirableEffect, "class"= class, "effect" = effect, "frequency" = frequency) ;

// Annotate frequencies like "common" from an external Wordlist
DECLARE FrequencyInd;
WORDLIST FrequencyList = 'resources/Frequencies.txt';
MARKFAST(FrequencyInd, FrequencyList, true);
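// Within each FrequencyInd, remove any FrequencyInd that starts after its first token,
// so that nested matches are dropped in favor of the enclosing one
FrequencyInd->{ANY f:FrequencyInd{-> UNMARK(f)};};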

// Some useful annotations
INT index;
DECLARE Cell(INT column);
DECLARE FirstRow, FirstColumn, FrequencyCell;
TR{STARTSWITH(TABLE)-> FirstRow};
TD{STARTSWITH(TR)-> FirstColumn};
TD{CONTAINS(FrequencyInd)-> FrequencyCell};

// Create Cell annotations with an index representing the column number
TR{->index=0}->{
    TD{->index = index+1, CREATE(Cell,"column"=index)};
};

// Candidates for the effect
DECLARE Chunk;
TD{-CONTAINS(FrequencyInd), -PARTOF(FirstColumn), -REGEXP("-") -> Chunk};
Chunk{CONTAINS(COMMA)-> SPLIT(COMMA)};

DECLARE Header;
"System organ class"-> Header;

// The actual rules: a rule for format 1
c:TD{PARTOF(FirstColumn),-PARTOF(Header), -PARTOF(FrequencyCell)} 
    # f:FrequencyCell 
    # e:@Chunk{-PARTOF(UndesirableEffect) -> UE(c,e,f)};

// a rule for format 2
fc:Cell{PARTOF(FirstRow),PARTOF(FrequencyCell),fc.column==c.column}
    # cc:Cell{PARTOF(FirstColumn), -PARTOF(FrequencyCell)}
    # c:@Cell{CONTAINS(Chunk),-PARTOF(UndesirableEffect)}
    ->{e:@Chunk{-PARTOF(UndesirableEffect)-> UE(cc,e,fc)};};
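
The combination of labels (`c`, `e`, `f`) and the macro action `UE` can be looked at in isolation with a minimal sketch. The type `Link`, its features `source` and `target`, and the macro `MakeLink` are hypothetical names chosen only for illustration and are not part of the exercise. As with `UE` on the `Chunk` element above, the new annotation is expected to cover the rule element that carries the action, here the second capitalized word.

```ruta
// Hypothetical relation type with two features of type Annotation
DECLARE Link (Annotation source, Annotation target);

// A macro action gives a shorter name to a CREATE call with fixed feature assignments
ACTION MakeLink(ANNOTATION s, ANNOTATION t) =
    CREATE(Link, "source" = s, "target" = t);

// Label two consecutive capitalized words and hand both labels to the macro
a:CW b:CW{-> MakeLink(a, b)};
```

The macro does not change what the rule matches. It only keeps the feature assignments in one place, so that several rules can create the same kind of relation without repeating them.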

We now apply the saved script to the second example table.

In [None]:
%inputDir data/ex5_table2
%displayMode CSV
%scriptDir temp/
%typeSystemDir temp/
%csvConfig UndesirableEffect frequency class 

SCRIPT UndesirableEffect;
CALL(UndesirableEffect);