# Examples of how span visualisation can be used in the NLP pipeline

Span visualisation is the simplest form of visualisation for layers. In brief, it lets you specify how individual spans and their overlaps are coloured. This can be useful:
* for detecting unexpected overlaps among spans; 
* for visualisation of certain types of words or word classes;
* for checking that complex taggers do what they are intended to do.

Some of the code examples will be added as a separate module in future EstNLTK releases.

In [1]:
import re
from collections import defaultdict
from typing import Mapping, Any, Tuple, List, Sequence, Union

from estnltk import Text, Layer
from estnltk.taggers import TokensTagger
from estnltk.taggers import CompoundTokenTagger
from estnltk.visualisation.span_visualiser.fancy_span_visualisation import DisplaySpans

from quick_visualisation_fix import DisplayAmbiguousSpans

We illustrate concepts on the following example text.

In [2]:
t = Text('''Silver, kus on sinu julged teod? 
            Merepõhja noorus maha jäi koos laevaga.''').analyse('morphology')

## I. Visualisation of span overlaps and spans with ambiguous annotations

Many steps in the NLP pipeline can produce ambiguous output, which in turn can lead to unexpected results downstream.
For example, most analysers in the NLP pipeline assume that tokenisation is unique and that there are no overlapping spans. You might also want to inspect which words have several valid morphological annotations to see where disambiguation fails during morphological analysis.

### Corresponding span visualiser 

The following span visualiser allows you to control the background color of overlaps and ambiguous annotations.
You can specify a different color for each number of annotations so that words with very many annotations are easy to spot.
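
The lookup idea can be sketched without EstNLTK. A `defaultdict` maps a count to a colour and falls back to a single colour for every count that has no explicit entry. The helper `pick_background` and the plain-list stand-ins for spans are hypothetical and used only for illustration; the hex values are the defaults used by the visualiser class defined in this section.

```python
from collections import defaultdict

# Colour by number of overlapping spans; counts without an entry fall back to plain red.
span_coloring = defaultdict(lambda: '#FF0000')
span_coloring[2] = '#FF5050'

# Colour by number of annotations; a single annotation is fully transparent.
annotation_coloring = defaultdict(lambda: '#F59B00')
annotation_coloring[1] = '#FFA50000'
annotation_coloring[2] = '#FFA500'


def pick_background(span_indices, annotation_counts):
    """Hypothetical helper: choose a background colour for one text segment.

    span_indices      -- indices of the spans that cover the segment
    annotation_counts -- number of annotations of every span, by index
    """
    if len(span_indices) != 1:
        # Several spans cover the same segment: colour by the size of the overlap.
        return span_coloring[len(span_indices)]
    # Exactly one span: colour by the number of its annotations.
    return annotation_coloring[annotation_counts[span_indices[0]]]


counts = [1, 2, 5]
print(pick_background([0], counts))     # word with one annotation
print(pick_background([1], counts))     # word with two annotations
print(pick_background([2], counts))     # five annotations, fallback colour
print(pick_background([0, 1], counts))  # two overlapping spans
```

Because both mappings are ordinary dictionaries with a default, assigning a single entry changes the colour for one count only and leaves the fallback untouched.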

In [3]:
from estnltk.visualisation.core.span_visualiser import SpanVisualiser
import html


class DirectPlainSpanVisualiser(SpanVisualiser):
    """Class that visualises spans, arguments can be css elements.
    Arguments that can be changed are bg_mapping, colour_mapping, font_mapping, weight_mapping,
    italics_mapping, underline_mapping, size_mapping and tracking_mapping. These should
    be functions that take the segment and the list of spans as arguments and return
    a string that will be the value of the corresponding attribute in the css."""

    def __init__(self, colour_mapping=None, bg_mapping=None, font_mapping=None,
                 weight_mapping=None, italics_mapping=None, underline_mapping=None,
                 size_mapping=None, tracking_mapping=None, fill_empty_spans=False):

        self.bg_mapping = bg_mapping or self.default_bg_mapping
        self.colour_mapping = colour_mapping
        self.font_mapping = font_mapping
        self.weight_mapping = weight_mapping
        self.italics_mapping = italics_mapping
        self.underline_mapping = underline_mapping
        self.size_mapping = size_mapping
        self.tracking_mapping = tracking_mapping
        self.fill_empty_spans = fill_empty_spans

    def __call__(self, segment, spans):

        segment[0] = html.escape(segment[0])

        # Simple text no span to fill
        if not self.fill_empty_spans and self.is_pure_text(segment):
            return segment[0]

        # There is a span to decorate
        output = ['<span style=']
        if self.colour_mapping is not None:
            output.append('color:' + self.colour_mapping(segment, spans) + ";")
        if self.bg_mapping is not None:
            output.append('background:' + self.bg_mapping(segment, spans) + ";")
        if self.font_mapping is not None:
            output.append('font-family:' + self.font_mapping(segment, spans) + ";")
        if self.weight_mapping is not None:
            output.append('font-weight:' + self.weight_mapping(segment, spans) + ";")
        if self.italics_mapping is not None:
            output.append('font-style:' + self.italics_mapping(segment, spans) + ";")
        if self.underline_mapping is not None:
            output.append('text-decoration:' + self.underline_mapping(segment, spans) + ";")
        if self.size_mapping is not None:
            output.append('font-size:' + self.size_mapping(segment, spans) + ";")
        if self.tracking_mapping is not None:
            output.append('letter-spacing:' + self.tracking_mapping(segment, spans) + ";")
        if len(segment[1]) > 1:
            output.append(' class=overlapping-span ')
            rows = []
            for i in segment[1]:
                rows.append(spans[i].text)
            output.append(' span_info=' + html.escape(','.join(rows)))  # text of spans for javascript
        output.append('>')
        output.append(segment[0])
        output.append('</span>')
        return "".join(output)

In [4]:
class DisplayAmbiguousSpans(DisplaySpans):
    """
    Displays overlaps between spans and spans with multiple annotations
    
    Default background color scheme is as follows:
    * normal spans are transparent
    * overlapping spans are red
    * ambiguous spans are orange
    
    The opacity level indicates the number of overlaps and annotations
    
    Color scheme is controlled by two dictionary-like class attributes
    * span_coloring[int]
    * annotation_coloring[int]
    
    Assigning corresponding elements redefines the coloring for a particular
    number of annotations or spans, e.g. span_coloring[2] = 'blue'.
    
    Assigning corresponding attributes redefines the entire color scheme.
    The assigned object must support indexing with any int.
    """
    
    def __init__(self):
        super(DisplayAmbiguousSpans, self).__init__(styling="direct")
        
        # Hack to get it working by replacing a wrong base class
        self.span_decorator = DirectPlainSpanVisualiser()
        
        self.restore_defaults()
        self.span_decorator.bg_mapping = self.__bg_mapper
    
    
    def restore_defaults(self):
        """Restore default coloring scheme for overlaps and ambiguous spans"""
        
        # Define two shades of red for overlaps
        self.span_coloring = defaultdict(lambda:'#FF0000')
        self.span_coloring[2] = '#FF5050'

        # Define transparent + two shades of orange for ambiguous annotations
        self.annotation_coloring = defaultdict(lambda:'#F59B00')
        self.annotation_coloring[1] = '#FFA50000'
        self.annotation_coloring[2] = '#FFA500'


    def __bg_mapper(self, segment: Tuple[str, List[int]], spans) -> str:
        
        if len(segment[1]) != 1:
            # Index by the number of overlapping spans (a list is not a valid key)
            return self.span_coloring[len(segment[1])]
                
        return self.annotation_coloring[len(spans[segment[1][0]].annotations)]

### Example: Showing words with ambiguous morphological analysis

Let's first call the span visualiser with the default color scheme.

In [5]:
display_ambigous = DisplayAmbiguousSpans()
display_ambigous(t.morph_analysis)

Let's check whether the word 'on' has two different annotations by setting the color for two annotations to red and the color for three annotations to blue.

In [6]:
display_ambigous.annotation_coloring[2] = 'red'
display_ambigous.annotation_coloring[3] = 'blue'
display_ambigous(t.morph_analysis)

Now let's go back to the default coloring scheme.

In [7]:
display_ambigous.restore_defaults()
display_ambigous(t.morph_analysis)

## II. Visualisation of part-of-speech tag types

Sometimes it is useful to check the outcome of morphological analysis visually. In the following, we show how to visualise the most important part-of-speech tags. An analogous approach can be used for other attributes of the `morph_analysis` layer.

### Corresponding span visualiser

You can specify a different color for each part-of-speech tag and a special color for words that have multiple part-of-speech tags. You can also choose the color for the rare case where tokenization is not unique and tokens overlap. This can occur only if you have made non-standard modifications to the NLP pipeline.
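
The two-phase colouring can be sketched in plain Python, without EstNLTK. First the part-of-speech tags of all analyses of a word are collapsed into one label, then the label is looked up in a colour table. The function names `aggregate_pos` and `pos_background` are hypothetical, and the tag lists are stand-ins for the `partofspeech` values of a real span.

```python
def aggregate_pos(pos_tags):
    """Collapse the POS tags of all analyses of one word into a single label."""
    unique = set(pos_tags)
    if len(unique) == 1:
        return next(iter(unique))
    # More than one distinct tag: mark the word as ambiguous.
    return '*'


# A small colour table; '*' is the colour for ambiguous words.
pos_coloring = {'S': 'orange', 'V': 'lime', '*': 'gray'}


def pos_background(pos_tags, default='#ffffff00'):
    """Look up the colour of the aggregated label; unknown labels stay transparent."""
    return pos_coloring.get(aggregate_pos(pos_tags), default)


print(pos_background(['V']))        # lime
print(pos_background(['S', 'S']))   # orange, both analyses agree
print(pos_background(['S', 'V']))   # gray, analyses disagree
print(pos_background(['D']))        # '#ffffff00', no colour defined for this tag
```

Keeping aggregation and colour lookup separate means either part can be replaced on its own: a different aggregator changes what counts as ambiguous, and a different table changes only the colours.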

In [8]:
class DisplayPostagsSpans(DisplaySpans):
    """
    Visualises different part-of-speech tags in a text
    
    Provides default background colour scheme for EstMorf and GT tagsets.
    Color scheme is controlled by two dictionary-like class attributes
    * pos_coloring[str]
    * span_coloring[int]
    
    The first coloring controls how spans with different POS-tags are 
    colored. Default coloring can be changed by assigning appropriate
    entries, e.g. pos_coloring['V'] = 'black'.
    
    The second controls how span overlaps are colored. The tokenization 
    into the words can be ambiguous. By default, overlaps are colored
    by two shades of red. This can be changed by assigning appropriate
    entries, e.g. span_coloring[2] = 'blue'.
    
    To redefine the entire color scheme, the entire colouring attribute
    must be redefined. The assigned object must support indexing with 
    any string for pos_coloring and any int for span_coloring.
    
    As POS-tagging may be ambiguous, coloring is done in two phases:
    1. list of POS-tags is aggregated into a new string label
    2. POS-tag coloring is used to determine the background color
    
    The default aggregator marks all ambiguous labellings with '*'.
    It is possible to customise this by redefining ambiguity_resolver.
    """

    def __init__(self, layer:str='morph_analysis', tagset:str='EstMorf', ambiguity_resolver:callable=None):
        super(DisplayPostagsSpans, self).__init__(styling="direct")
        
        # Hack to get it working by replacing a wrong base class
        self.span_decorator = DirectPlainSpanVisualiser()

        self.morph_layer = layer
        self.tagset = tagset
        self.__default_ambiguity_resolver = ambiguity_resolver or self.__default_ambiguity_resolver
        self.span_decorator.bg_mapping = self.__bg_mapper
        self.restore_defaults()
        
        
    def restore_defaults(self): 
        """Restore default coloring scheme for part-of-speech tags and token overlaps and ambiguity resolver"""
        
        self.ambiguity_resolver = self.__default_ambiguity_resolver
        
        self.pos_coloring = {}
        if self.tagset == 'EstMorf' or self.tagset == 'GT':
            self.pos_coloring['S'] = 'orange'
            self.pos_coloring['H'] = 'orange'
            self.pos_coloring['A'] = 'yellow'
            self.pos_coloring['U'] = 'yellow'
            self.pos_coloring['C'] = 'yellow'
            self.pos_coloring['N'] = 'yellow'
            self.pos_coloring['O'] = 'yellow'
            self.pos_coloring['V'] = 'lime'
            self.pos_coloring['*'] = 'gray'
            
        # Define two shades of red for overlapping tokenization
        self.span_coloring = {2:'#FF5050'}
        
            
    def __call__(self, object:Union[Text, Layer]) -> str:
        if isinstance(object, Text):
            return super(DisplayPostagsSpans, self).__call__(object[self.morph_layer])
        elif isinstance(object, Layer):
            return super(DisplayPostagsSpans, self).__call__(object)
        else:
            raise ValueError('Invalid input')
            
            
    def __default_ambiguity_resolver(self, span) -> str:
        pos_tags = set(span['partofspeech'])
        if len(pos_tags) == 1:
            return next(iter(pos_tags))
        return '*'

    
    def __bg_mapper(self, segment: Tuple[str, List[int]], spans) -> str:
        
        if len(segment[1]) != 1:
            return self.span_coloring.get(len(segment[1]),'#FF0000')
            
        return self.pos_coloring.get(self.ambiguity_resolver(spans[segment[1][0]]),'#ffffff00')

### Example: Default coloring and how to change it

In [9]:
display_postags = DisplayPostagsSpans()
display_postags(t.morph_analysis)

Let's change the color of the verbs (V) to red and nouns (S) to green.   

In [10]:
display_postags.pos_coloring['V'] = 'red'
display_postags.pos_coloring['S'] = 'green'
display_postags(t)

This does not look good. Let's restore the original coloring.

In [11]:
display_postags.restore_defaults()
display_postags(t)

Let's redefine the way the span visualiser handles ambiguities.  

In [12]:
def amb_resolver(span):
    
    pos_tags = list(span['partofspeech'])
    if len(pos_tags) == 1:
        return next(iter(pos_tags))
    return '*'

display_postags.ambiguity_resolver = amb_resolver
display_postags(t)

Note that you can still go back to the default ambiguity resolver specified during initialisation.

In [13]:
display_postags.restore_defaults()
display_postags(t)

## III. Visualization of compound tokens 

The NLP pipeline starts with a tokenization phase. This phase has two sub-steps:

* first, text is split into smallest meaningful tokens
* then, some of these are combined back into larger units (e.g. dates and proper names)

The following span visualiser highlights identified compound tokens and their subtype.
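
The two sub-steps can be illustrated with a toy sketch. This is not the algorithm used by `TokensTagger` or `CompoundTokenTagger`; the functions `toy_tokens` and `toy_compound_tokens` and the single date pattern are assumptions made only for illustration. The type label `numeric_date` is one of the compound token types coloured by the visualiser in this section.

```python
import re


def toy_tokens(text):
    """First step: split text into (start, end) spans of words and punctuation marks."""
    return [m.span() for m in re.finditer(r'\w+|[^\w\s]', text)]


# Toy pattern for dates such as 02.02.2010
DATE = re.compile(r'\d{2}\.\d{2}\.\d{4}')


def toy_compound_tokens(text, tokens):
    """Second step: group the tokens that together form a date-like string."""
    compounds = []
    for m in DATE.finditer(text):
        # Collect all small tokens that lie inside the matched date.
        parts = [(s, e) for s, e in tokens if m.start() <= s and e <= m.end()]
        compounds.append({'type': 'numeric_date', 'text': m.group(), 'tokens': parts})
    return compounds


text = 'Mati tuli 02.02.2010 koju.'
tokens = toy_tokens(text)
compounds = toy_compound_tokens(text, tokens)

print([text[s:e] for s, e in tokens])
print(compounds)
```

Here the date is first split into five small tokens and then joined back into one compound token that carries a type label, which is the information the visualiser turns into a background colour.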

### Corresponding span visualiser

In [14]:
class DisplayCompoundTokens(DisplaySpans):
    
    """
    Visualizes subtypes of compound tokens in a text
    
    Color scheme is controlled by two dictionary-like class attributes
    * type_coloring[str]
    * span_coloring[int]
    
    The first coloring controls how spans with different compound tokens 
    are colored. Default coloring can be changed by assigning appropriate
    entries, e.g. type_coloring['numeric'] = 'black'.
    
    The second controls how span overlaps are colored. Tokenization 
    into words can be ambiguous. By default, overlaps are colored
    in two shades of red. This can be changed by assigning appropriate
    entries, e.g. span_coloring[2] = 'blue'.
    
    To redefine the entire color scheme, the entire colouring attribute
    must be redefined. The assigned object must support indexing with 
    any string for type_coloring and any int for span_coloring.
    
    As a compound can have multiple type labels, colouring is done in two phases:
    1. list of type tags is aggregated into a new string label
    2. type tag coloring is used to determine the background color
    
    The default aggregator removes the 'sign' type and marks all ambiguous 
    labellings with '*'. It is possible to customize this by redefining 
    ambiguity_resolver.
    """
    
    def __init__(self, layer:str='compound_tokens', ambiguity_resolver:callable=None):
        super(DisplayCompoundTokens, self).__init__(styling="direct")
        
        # Hack to get it working by replacing a wrong base class
        self.span_decorator = DirectPlainSpanVisualiser()
        
        self.compound_layer = layer
        self.__default_ambiguity_resolver = ambiguity_resolver or self.__default_ambiguity_resolver
        self.restore_defaults()
        
        
    def restore_defaults(self): 
        """Restore default coloring scheme for compound tokens and ambiguity resolver for compound subtypes"""

        
        self.ambiguity_resolver = self.__default_ambiguity_resolver
        self.span_decorator.bg_mapping = self.__bg_mapper
        
        # Define two shades of red for overlapping tokenisation
        self.span_coloring = defaultdict(lambda:'#FF0000')
        self.span_coloring[2] = '#FF5050'

        # Define coloring for different types of compounds 
        self.type_coloring = defaultdict(lambda:'#ffffff00')
        
        # Proper names
        self.type_coloring['name_with_initial'] = '#6CA390'
        self.type_coloring['case_ending'] = '#6CA390'

        # Rare wordforms
        self.type_coloring['hyphenation'] = '#306754'
        
        # Numbers and units
        self.type_coloring['numeric_date'] = '#9DC209'
        self.type_coloring['numeric'] = '#9DC209'
        self.type_coloring['percentage'] = '#9DC209'
        self.type_coloring['unit'] = '#759A00'
        
        # Abbreviations
        self.type_coloring['non_ending_abbreviation'] = '#BCE954'
        self.type_coloring['abbreviation'] = '#BCE954'
        
        # Web specific compounds
        self.type_coloring['xml_tag'] = '#5E5A80'
        self.type_coloring['email'] = '#908CB2'
        self.type_coloring['www_address'] = '#908CB2'
        
        # Emoticons
        self.type_coloring['emoticon'] = '#FFDB58'
        
        # All the rest
        self.type_coloring['*'] = '#FFA62F'
        
        
    def __call__(self, object:Union[Text, Layer]) -> str:
        if isinstance(object, Text):
            return super(DisplayCompoundTokens, self).__call__(object[self.compound_layer])
        elif isinstance(object, Layer):
            return super(DisplayCompoundTokens, self).__call__(object)
        else:
            raise ValueError('Invalid input')
            
            
    def __default_ambiguity_resolver(self, span) -> str:
        
        types = set(span.type)
        types.discard('sign')
        if len(types) == 1:
            return next(iter(types))
        return '*'

    
    def __bg_mapper(self,  segment: Tuple[str, List[int]], spans) -> str:
        
        if len(segment[1]) != 1:
            return self.span_coloring[len(segment[1])]
            
        return self.type_coloring[self.ambiguity_resolver(spans[segment[1][0]])]    

### Example: Default coloring and how to change it
For this example, we need a more specific text that contains compound tokens.

In [15]:
raw_text  = 'Mis aias sa-das 3me sorti s-saia?\n'
raw_text += '02.02.2010 22:55 Mati : saad sa mulle 100,50 asemel 10 000 laenata?\n'
raw_text += 'Mati : +100% kindel, et toon tagasi!!\n'
raw_text += 'Tänase seisuga tuleb ikka suur lohe vaiksema tuule (6-12 m/s) jaoks ja teine väiksem tormikaks (12-20 m/s) võtta…\n'
raw_text += '<u>Kirjavahemärgid, hingamiskohad</u>.\n'
raw_text += 'Saada need e-postiaadressile big@boss.com või tule sisesta lehelt www.iamboss.com\n'
raw_text += 'Maja on fantastiline :)) ja mõte on hea :-)\n'
raw_text += '(arhitektid M. Port, M. Meelak, O. Zhemtshugov, R.-L. Kivi)\n'
raw_text += 'Nt. hädas oli juba Vana-Hiina suurim ajaloolane Sima Qian (II—I saj. e. m. a.).\n'
raw_text += "10 000-st LinkedIn 'i kontaktist mitte üks ei hoolinud meie SKT -st, aga meie workshop ' e väisasid küll.\n"
text = TokensTagger().tag(Text(raw_text))
CompoundTokenTagger().tag(text)
display_compounds = DisplayCompoundTokens()
display_compounds(text.compound_tokens)

Let's change the colour of dates (numeric_date) to red and emoticons (emoticon) to blue.

In [16]:
display_compounds.type_coloring['numeric_date'] = 'red' 
display_compounds.type_coloring['emoticon'] = 'blue' 
display_compounds(text)

The default subtype resolver discards the `sign` subtype. Let's highlight the locations where it is present.

In [17]:
def subtype_resolver(span):
    if 'sign' in set(span.type):
        return 'sign'
    else:
        return '*'  
display_compounds.ambiguity_resolver = subtype_resolver
display_compounds.type_coloring['sign'] = 'red'
display_compounds.type_coloring['*'] = 'white'
display_compounds(text)

Again, you can restore the default state.

In [18]:
display_compounds.restore_defaults()
display_compounds(text)

## IV. Visualization of cases

Cases are among the most important output attributes of morphological analysis. They are used as inputs for many higher-level taggers, so it can be convenient to visualize them. This may reveal error patterns or incorrect assumptions made by the higher-level taggers.
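
Before colouring, the case labels of all analyses of a word have to be reduced to a single label. The sketch below shows one way to do this without EstNLTK: it drops the number prefix (`sg` or `pl`) from the form string and keeps only noun-like analyses. The function `aggregate_case` and the `(partofspeech, form)` tuples are hypothetical stand-ins for the annotations of a real span; form strings such as `'sg n'` follow the pattern that the regular expressions in this section expect.

```python
import re


def aggregate_case(analyses):
    """Reduce the analyses of one word to a single case label.

    analyses -- list of (partofspeech, form) pairs, one per analysis
    """
    cases = {re.sub(r'^(sg|pl) ', '', form)
             for pos, form in analyses
             if pos in {'S', 'H', 'P'}}   # nouns, proper names, pronouns
    if not cases:
        return '_'    # no noun-like analysis at all
    if len(cases) == 1:
        return next(iter(cases))
    return '*'        # analyses disagree on the case


print(aggregate_case([('S', 'sg n')]))                 # n
print(aggregate_case([('S', 'sg n'), ('S', 'pl n')]))  # n, number is ignored
print(aggregate_case([('S', 'sg g'), ('S', 'pl n')]))  # *
print(aggregate_case([('V', 'b')]))                    # _
```

Ignoring the number means that a word whose analyses differ only in singular versus plural is still treated as unambiguous with respect to case.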


### Corresponding span visualiser

In [19]:
class DisplayNounCases(DisplaySpans):
    
    """
    Visualizes case information in a text
    
    Color scheme is controlled by two dictionary-like class attributes
    * case_coloring[str]
    * span_coloring[int]
    
    The first coloring controls how spans with different cases are colored. 
    Default coloring can be changed by assigning appropriate entries, e.g. 
    case_coloring['kom'] = 'black'.
    
    The second controls how span overlaps are colored. The tokenization 
    into the words can be ambiguous. By default, overlaps are colored
    in two shades of red. This can be changed by assigning appropriate
    entries, e.g. span_coloring[2] = 'blue'.
    
    To redefine the entire color scheme, the entire colouring attribute
    must be redefined. The assigned object must support indexing with 
    any string for case_coloring and any int for span_coloring.
    
    As a word can have multiple analyses, coloring is done in two phases:
    1. list of case tags is aggregated into a new string label
    2. case tag coloring is used to determine the background color
    
    The default aggregator removes number information from the case label
    and marks all ambiguous labellings with '*'. It is possible to customise
    this by redefining ambiguity_resolver.
    """    
    
    def __init__(self, layer:str='morph_analysis', ambiguity_resolver:callable=None):
        super(DisplayNounCases, self).__init__(styling="direct")
        
        # Hack to get it working by replacing a wrong base class
        self.span_decorator = DirectPlainSpanVisualiser()
        
        self.morph_layer = layer
        # Store under the name that restore_defaults reads, so a custom resolver is not ignored
        self.__default_ambiguity_resolver = ambiguity_resolver or self.__default_ambiguity_resolver
        self.restore_defaults()
        
        
    def restore_defaults(self):
        
        self.ambiguity_resolver = self.__default_ambiguity_resolver
        self.span_decorator.bg_mapping = self.__bg_mapper
        
        # Define two shades of red for overlapping tokenization
        self.span_coloring = defaultdict(lambda:'#FF0000')
        self.span_coloring[2] = '#FF5050'

        # Define coloring for different cases
        self.case_coloring = defaultdict(lambda:'#ffffff00')
        
        # By default, we color only the first four cases
        self.case_coloring['n']   = 'orange'
        self.case_coloring['g']   = 'yellow'
        self.case_coloring['p']   = 'lightgreen'
        self.case_coloring['adt'] = 'pink'
        self.case_coloring['*']   = 'gray'
        self.case_coloring['_']   = '#ffffff00'

        
        
        
    def __call__(self, object:Union[Text, Layer]) -> str:
        if isinstance(object, Text):
            return super(DisplayNounCases, self).__call__(object[self.morph_layer])
        elif isinstance(object, Layer):
            return super(DisplayNounCases, self).__call__(object)
        else:
            raise ValueError('Invalid input')
            
            
    def __default_ambiguity_resolver(self, span) -> str:
        
        cases = {re.sub('^sg |^pl ', '', an.form) for an in span.annotations if an.partofspeech in {'S', 'H', 'P'}}
        
        if len(cases) == 0:
            return '_'
        
        if len(cases) == 1:
            return next(iter(cases))
        
        return '*'

    
    def __bg_mapper(self,  segment: Tuple[str, List[int]], spans) -> str:
        
        if len(segment[1]) != 1:
            return self.span_coloring[len(segment[1])]
                    
        return self.case_coloring[self.ambiguity_resolver(spans[segment[1][0]])]    

### Example: Default coloring and how to change it

In [20]:
display_cases = DisplayNounCases() 
display_cases(t)

Let's color only the number (singular or plural) part of the case label. This requires changing both the ambiguity resolver and the coloring.

In [21]:
def case_number_detector(span):
    cases = {re.match('^(?P<number>sg|pl) ', an.form).group('number') for an in span.annotations 
             if an.partofspeech in {'S', 'H', 'P'} and re.match('^(?P<number>sg|pl) ', an.form)}
        
    if len(cases) == 0:
        return '_'
        
    if len(cases) == 1:
        return next(iter(cases))
        
    return '*'

display_cases.ambiguity_resolver = case_number_detector
display_cases.case_coloring['sg'] = 'blue'
display_cases.case_coloring['pl'] = 'red'
display_cases(t)

Let's now restore default behaviour.

In [22]:
display_cases.restore_defaults()
display_cases(t)

## V. Visualisation of compound words
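
The visualiser in this section labels a word as a compound if at least one of its analyses has more than one root token. A minimal sketch of that rule in plain Python is given below. The function `word_kind` is hypothetical, and the root token lists are invented illustrations rather than real analyser output.

```python
def word_kind(root_token_lists):
    """Label a word as 'simple' or 'compound'.

    root_token_lists -- one list of root tokens per analysis of the word
    """
    longest = max(len(tokens) for tokens in root_token_lists)
    return 'simple' if longest == 1 else 'compound'


# Background colours: simple words stay transparent, compounds are pink.
word_coloring = {'simple': '#ffffff00', 'compound': '#ffb6c1'}

print(word_kind([['laev']]))                 # simple
print(word_kind([['mere', 'põhi']]))         # compound
print(word_kind([['abc'], ['ab', 'c']]))     # compound, one analysis is enough
print(word_coloring[word_kind([['laev']])])  # '#ffffff00'
```

Taking the maximum over all analyses is a deliberately permissive choice: a word is highlighted as soon as any analysis splits it into several roots.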

In [23]:
class DisplayCompoundWords(DisplaySpans):
    
    """
    Visualizes compound words in a text
    
    Color scheme is controlled by two dictionary-like class attributes
    * word_coloring[str]
    * span_coloring[int]
    
    The first coloring controls how simple and compound words are colored. 
    Default coloring can be changed by assigning appropriate entries, e.g. 
    word_coloring['simple'] = 'blue'.
    
    The second controls how span overlaps are colored. Tokenization 
    into words can be ambiguous. By default, overlaps are colored
    in two shades of red. This can be changed by assigning appropriate
    entries, e.g. span_coloring[2] = 'blue'.
    
    To redefine the entire color scheme, the entire colouring attribute
    must be redefined. The assigned object must support indexing with 
    any string for word_coloring and any int for span_coloring.
    
    As a word can have multiple analyses, coloring is done in two phases:
    1. lists of root tokens are aggregated into a new string label
    2. word coloring is used to determine the background color
    
    The default aggregator marks a word as a compound if it is a compound 
    according to at least one analysis. It is possible to customise
    this by redefining ambiguity_resolver.
    """

    
    def __init__(self, layer:str='morph_analysis', ambiguity_resolver:callable=None):
        super(DisplayCompoundWords, self).__init__(styling="direct")
        
        # Hack to get it working by replacing a wrong base class
        self.span_decorator = DirectPlainSpanVisualiser()
        
        self.morph_layer = layer
        # Store under the name that restore_defaults reads, so a custom resolver is not ignored
        self.__default_ambiguity_resolver = ambiguity_resolver or self.__default_ambiguity_resolver
        self.restore_defaults()
        
        
    def restore_defaults(self):
        
        self.ambiguity_resolver = self.__default_ambiguity_resolver
        self.span_decorator.bg_mapping = self.__bg_mapper
        
        # Define two shades of red for overlapping tokenisation
        self.span_coloring = defaultdict(lambda:'#FF0000')
        self.span_coloring[2] = '#FF5050'

        # Define coloring for simple and compound words
        self.word_coloring = defaultdict(lambda:'#ffffff00')
        
        # Colors compound words in millennial pink
        self.word_coloring['simple']    = '#ffffff00'
        self.word_coloring['compound']  = '#ffb6c1'
        
        
    def __call__(self, object:Union[Text, Layer]) -> str:
        if isinstance(object, Text):
            return super(DisplayCompoundWords, self).__call__(object[self.morph_layer])
        elif isinstance(object, Layer):
            return super(DisplayCompoundWords, self).__call__(object)
        else:
            raise ValueError('Invalid input')
            
            
    def __default_ambiguity_resolver(self, span) -> str:
        
        compound_counts = max(len(an.root_tokens) for an in span.annotations)
        
        if compound_counts == 1:
            return 'simple'
        else:
            return 'compound'

    
    def __bg_mapper(self,  segment: Tuple[str, List[int]], spans) -> str:
        
        if len(segment[1]) != 1:
            return self.span_coloring[len(segment[1])]
                    
        return self.word_coloring[self.ambiguity_resolver(spans[segment[1][0]])]    

In [24]:
display_compound_words = DisplayCompoundWords()
display_compound_words(t)