# Exercise 4: Conditions and Actions

This notebook introduces a selection of conditions and actions that illustrate different ways of working with annotations.

#### Setup

In [None]:
%%documentText
The dog barked at the cat.
Dogs, cats and mice are mammals.
Zander and tuna are fishes.
Peter works for Frank.
10€ are less than 100$.

We add a (simplified) rule to detect sentences. These sentences are used later to illustrate different conditions and actions.

In [None]:
%displayMode CSV
%csvConfig Sentence
DECLARE Sentence;
// Create some simple sentences
(ANY+{-PARTOF(Sentence),-PARTOF(PERIOD)} PERIOD){-> Sentence};

### Conditions (`OR`, `CONTAINS`, `STARTSWITH`)

#### Annotating sentences that contain a number or a comma

We annotate all sentences that contain a number or a comma using a combination of conditions.

In [None]:
%csvConfig SentenceWithNumberOrComma
DECLARE SentenceWithNumberOrComma;
Sentence{OR(CONTAINS(NUM),CONTAINS(COMMA))-> SentenceWithNumberOrComma};

#### Annotating sentences that contain at least two capitalized words

In [None]:
%csvConfig SentenceWith2CapitalizedWords
DECLARE SentenceWith2CapitalizedWords;

// Right now we still need an upper bound, so we simply set it to a very high number
Sentence{CONTAINS(CW,2,1000)-> SentenceWith2CapitalizedWords};
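
The counting form of `CONTAINS` can be pictured as counting the annotations of one type inside the matched annotation and checking that count against a lower and an upper bound. The following is a minimal plain-Python sketch of that idea, not Ruta code and not the Ruta API: the `contains` helper and the `(type, begin, end)` tuples are hypothetical, and details such as Ruta's filtering of invisible annotations are not modelled.

```python
# Plain-Python model of a counting condition such as CONTAINS(CW, 2, 1000).
# Annotations are modelled as (type, begin, end) tuples; this is NOT the Ruta API.

def contains(annotations, type_name, window, min_count=1, max_count=float("inf")):
    """Return True if the number of `type_name` annotations lying completely
    inside `window` (begin, end) is within [min_count, max_count]."""
    w_begin, w_end = window
    count = sum(
        1
        for (t, begin, end) in annotations
        if t == type_name and w_begin <= begin and end <= w_end
    )
    return min_count <= count <= max_count


text = "Peter works for Frank."
# Hand-written token annotations (assumed): CW = capitalized word, SW = small word.
annotations = [("CW", 0, 5), ("SW", 6, 11), ("SW", 12, 15), ("CW", 16, 21)]
sentence = (0, len(text))

has_two_cw = contains(annotations, "CW", sentence, 2, 1000)    # two capitalized words
has_three_cw = contains(annotations, "CW", sentence, 3, 1000)  # fewer than three
```

In this model the sentence qualifies for a minimum of two capitalized words but not for a minimum of three, which mirrors the role of the second and third arguments in the rule above.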

#### Annotating sentences that start with an Animal annotation

In [None]:
%csvConfig SentenceStartsWithAnimal
DECLARE SentenceStartsWithAnimal;

DECLARE Animal;
WORDLIST AnimalList = 'resources/animals.txt';
MARKFAST(Animal, AnimalList, true);

Sentence{STARTSWITH(Animal)-> SentenceStartsWithAnimal};

### The `UNMARK` action

#### Removing all sentences that contain an Animal annotation
The `UNMARK` action removes annotations that earlier rules created, which is useful for disentangling rules. The `s` in the rule is a label that refers to the matched `Sentence` annotation.

In [None]:
%csvConfig Sentence
s:Sentence{CONTAINS(Animal)-> UNMARK(s)};

#### Removing all amounts of money whose value is less than 50 or whose currency is the dollar

Almost any boolean expression can be used as an implicit condition. Here, we simply define a condition on a feature value.

##### Setup

In [None]:
%displayMode RUTA_COLORING
// Simplified: Annotate amounts of money with currency (see exercise 3)
DECLARE MoneyAmount(INT amount, STRING currency);
INT value;
(NUM{PARSE(value)} c:SPECIAL){-> CREATE(MoneyAmount, "amount"=value, "currency"=c.ct)};

##### Rule 1: If the amount is less than 50, then we remove the annotation

In [None]:
ma:MoneyAmount{ma.amount<50 -> UNMARK(ma)};
COLOR(MoneyAmount, "lightgreen");

##### Rule 2: If the currency is the dollar, then we remove the annotation

In [None]:
ma:MoneyAmount{ma.currency=="$" -> UNMARK(ma)};
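
The effect of the two rules can be traced by hand on the sample document. The sketch below is a plain-Python model rather than Ruta code: the `MoneyAmount` dataclass is a hypothetical stand-in for the declared annotation type, and it assumes that the setup rule created one annotation for `10€` and one for `100$`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoneyAmount:
    """Hypothetical stand-in for the MoneyAmount annotation and its two features."""
    amount: int
    currency: str


# Assumed result of the setup rule on "10€ are less than 100$."
amounts = [MoneyAmount(10, "€"), MoneyAmount(100, "$")]

# Rule 1: the implicit condition `ma.amount<50` selects annotations to unmark.
after_rule_1 = [ma for ma in amounts if not ma.amount < 50]

# Rule 2: the implicit condition `ma.currency=="$"` selects annotations to unmark.
after_rule_2 = [ma for ma in after_rule_1 if not ma.currency == "$"]
```

Under these assumptions, rule 1 leaves only the `100$` amount and rule 2 then removes it as well, so no `MoneyAmount` remains after both rules have run.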

### Changing the offsets of an annotation using implicit actions

In the next example, we change the offsets of an existing annotation. We expand the `WorksFor` annotation to the complete document.

In [None]:
DECLARE Employer, Employee;
"Peter"-> Employee;
"Frank"-> Employer;

DECLARE WorksFor (Employee employee, Employer employer);
(e1:Employee # e2:Employer){-> wf:WorksFor, wf.employee=e1, wf.employer=e2};

// we can use the action SHIFT
//# @WorksFor{-> SHIFT(WorksFor,1,3)} #; 

// or we could also do this using implicit actions:
b:# wf:@WorksFor{-> wf.begin=b.begin, wf.end=e.end} e:#;

COLOR(WorksFor, "pink");

### The `TRIM` action

In [None]:
// Reset everything and start anew
%resetCas

In [None]:
%%documentText
The dog barked at the cat.
Dogs, cats and mice are mammals.
Zander and tuna are fishes.
Peter works for Frank.
10€ are less than 100$.

Now, we trim punctuation marks from the sentences using the `TRIM` action. `TRIM` changes the offsets of the matched annotations (`Sentence` in this case) by removing annotations of the given types (`PERIOD` in this case) from the beginning and end of each matched annotation.

In [None]:
%displayMode CSV
%csvConfig Sentence

DECLARE Sentence;
// Create some simple sentences
(ANY+{-PARTOF(Sentence),-PARTOF(PERIOD)} PERIOD){-> Sentence};

Sentence{->TRIM(PERIOD)};
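
To make the offset change concrete, here is a minimal plain-Python sketch of the trimming idea. It is not Ruta code: the `trim` helper and the `(type, begin, end)` tuples are hypothetical, and the sketch only models shrinking a span while an annotation of a listed type sits exactly at its begin or end.

```python
# Plain-Python model of Sentence{-> TRIM(PERIOD)}; NOT the Ruta API.

def trim(span, annotations, trim_types):
    """Shrink `span` (begin, end) while an annotation of one of `trim_types`
    starts exactly at its begin or stops exactly at its end."""
    begin, end = span
    changed = True
    while changed and begin < end:
        changed = False
        for (t, a_begin, a_end) in annotations:
            if t not in trim_types or a_end <= a_begin:
                continue
            if a_begin == begin and a_end <= end:
                begin = a_end          # cut the annotation off the front
                changed = True
            elif a_end == end and a_begin >= begin:
                end = a_begin          # cut the annotation off the back
                changed = True
    return (begin, end)


text = "The dog barked at the cat."
period = ("PERIOD", len(text) - 1, len(text))  # the final "."
sentence = (0, len(text))

trimmed = trim(sentence, [period], {"PERIOD"})
trimmed_text = text[trimmed[0]:trimmed[1]]
```

In this model the sentence span shrinks by one character at the end, so the trimmed text no longer includes the final period.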

### The `SPLIT` action

Next, we split the sentences at the word "are" using the `SPLIT` action.

In [None]:
DECLARE Split;
W{REGEXP("are") -> Split};

Sentence{-> SPLIT(Split)};
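
The splitting idea can be sketched in plain Python as well. This is a simplified model, not Ruta code: the `split` helper is hypothetical, it assumes that the `Split` annotation itself is dropped and the parts before and after it are kept, and whitespace is stripped only when displaying the pieces.

```python
# Plain-Python model of Sentence{-> SPLIT(Split)}; NOT the Ruta API.

def split(span, split_points):
    """Cut `span` (begin, end) at every (begin, end) pair in `split_points`
    and return the remaining non-empty parts in document order."""
    begin, end = span
    parts = []
    cursor = begin
    for (s_begin, s_end) in sorted(split_points):
        if s_begin < cursor or s_end > end:
            continue  # ignore split points outside the remaining span
        if s_begin > cursor:
            parts.append((cursor, s_begin))
        cursor = s_end
    if cursor < end:
        parts.append((cursor, end))
    return parts


text = "Dogs, cats and mice are mammals"   # sentence after trimming the period
sentence = (0, len(text))
are_begin = text.index("are")
split_points = [(are_begin, are_begin + len("are"))]

parts = split(sentence, split_points)
pieces = [text[b:e].strip() for (b, e) in parts]  # strip only for display
```

Under these assumptions the sentence yields two parts, one on each side of the word "are".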

There are many more useful actions and conditions. A complete list can be found in the [UIMA Ruta documentation](https://uima.apache.org/d/ruta-current/tools.ruta.book.html).