# Exercise 3: Undesirable Effect Relations in Tables

The goal of this exercise is to write a simple rule script that extracts undesirable effect information from tables. Declare a new annotation type named "UndesirableEffect" with three features of type Annotation.
The first feature, "class", represents the symptom class of the undesirable effect. The second feature, "effect", represents the actual effect in the relation. The third feature, "frequency", represents the frequency of the effect.
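
As a syntax hint, a minimal sketch of declaring a type with Annotation-valued features and filling them from labeled rule elements could look like the following. The type `Measurement` and its features `amount` and `unit` are hypothetical and only illustrate the pattern; they are not part of this exercise.

```ruta
// hypothetical type with two features of type Annotation
DECLARE Measurement (Annotation amount, Annotation unit);

// label a number and the word that follows it, then store both
// annotations in the features of a new Measurement annotation;
// the new annotation is created on the element that carries the action
n:NUM u:W{-> CREATE(Measurement, "amount" = n, "unit" = u)};
```

The same idea carries over to the exercise: label the rule elements that match the class, the effect, and the frequency, and pass the labels to the features of the new annotation.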

First, we take a look at the two example tables.

In [None]:
%inputDir input_table1
%displayMode RUTA_COLORING

In [None]:
%inputDir input_table2
%displayMode RUTA_COLORING

Now, we write some rules to extract the triples.

In [None]:
%inputDir input_table1
%outputDir output
%displayMode CSV
%csvConfig UndesirableEffect frequency class 
%writescript ./UndesirableEffect.ruta
%saveTypeSystem ./UndesirableEffectTypeSystem.xml
// we want to use the old DefaultSeeder for MARKUP annotations
%configParams --seeders=org.apache.uima.ruta.seed.DefaultSeeder

TYPESYSTEM org.apache.uima.ruta.engine.HtmlTypeSystem;
UIMAFIT org.apache.uima.ruta.engine.HtmlAnnotator;

EXEC(HtmlAnnotator, {TAG});
RETAINTYPE(WS, MARKUP);
TAG{-> TRIM(WS, MARKUP)};
RETAINTYPE;

// the targeted relation
DECLARE UndesirableEffect (Annotation class, Annotation effect, Annotation frequency);
// we define a macro action just for shorter rules later
ACTION UE(ANNOTATION class, ANNOTATION effect, ANNOTATION frequency) = 
    CREATE(UndesirableEffect, "class"= class, "effect" = effect, "frequency" = frequency) ;

// annotate frequencies like "common"
DECLARE FrequencyInd;
WORDLIST FrequencyList = 'Frequencies.txt';
MARKFAST(FrequencyInd, FrequencyList, true);
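// drop any indicator that starts after the first token inside another indicator (a nested, shorter match)
FrequencyInd->{ANY f:FrequencyInd{-> UNMARK(f)};};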

// some useful annotations
INT index;
DECLARE Cell(INT column);
DECLARE FirstRow, FirstColumn, FrequencyCell;
TR{STARTSWITH(TABLE)-> FirstRow};
TD{STARTSWITH(TR)-> FirstColumn};
TD{CONTAINS(FrequencyInd)-> FrequencyCell};
// create annotations with additional information about the column
TR{->index=0}->{
    TD{->index = index+1, CREATE(Cell,"column"=index)};
};

// candidates for the effect
DECLARE Chunk;
TD{-CONTAINS(FrequencyInd), -PARTOF(FirstColumn), -REGEXP("-") -> Chunk};
Chunk{CONTAINS(COMMA)-> SPLIT(COMMA)};

DECLARE Header;
"System organ class"-> Header;

// the actual rules

c:TD{PARTOF(FirstColumn),-PARTOF(Header), -PARTOF(FrequencyCell)} 
    # f:FrequencyCell 
    # e:@Chunk{-PARTOF(UndesirableEffect) -> UE(c,e,f)};

// a rule for format 2
fc:Cell{PARTOF(FirstRow),PARTOF(FrequencyCell),fc.column==c.column}
    # cc:Cell{PARTOF(FirstColumn), -PARTOF(FrequencyCell)}
    # c:@Cell{CONTAINS(Chunk),-PARTOF(UndesirableEffect)}
    ->{e:@Chunk{-PARTOF(UndesirableEffect)-> UE(cc,e,fc)};};


We now apply the rules above to the second example by calling the script saved with %writescript.

In [None]:
%inputDir input_table2
%displayMode CSV
%csvConfig UndesirableEffect frequency class 

SCRIPT UndesirableEffect;
CALL(UndesirableEffect);
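
The column numbering from the main script can also be tried in isolation. The following is a minimal sketch that applies the same counter pattern to plain words instead of table cells; the names `NumberedWord`, `position`, `SecondWord`, and `pos` are hypothetical.

```ruta
// hypothetical type that stores a running position as an INT feature
DECLARE NumberedWord(INT position);
DECLARE SecondWord;
INT pos;

// reset the counter once for the document, then number every word inside it
Document{-> pos = 0}->{
    W{-> pos = pos + 1, CREATE(NumberedWord, "position" = pos)};
};

// feature values can be compared in a condition, as with fc.column == c.column
w:NumberedWord{w.position == 2 -> SecondWord};
```

In the table script the same pattern is applied per `TR`, so the counter restarts in every row and `column` holds a cell's index within its row.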