# Domain Language

The domain language defines an agreed-upon collection of terms and concepts that represent the domain of interest. These terms act as a bridge between the high-level vision of the project and its concrete implementation in code. The Domain-Driven Design text covers these concepts in more detail <span data-cite=evans2004ddd>(Evans, 2004)</span>.

List of domain terms:


- <a href='#Entity'>ENTITY</a>
- <a href='#Factory'>FACTORY</a>
- <a href='#Service'>SERVICE</a>
- <a href='#Token'>TOKEN</a>
- <a href='#Tokenizer'>TOKENIZER</a>
- <a href='#TokenSet'>TOKENSET</a>
- <a href='#Value-Object'>VALUE OBJECT</a>


## Entity

An <a href="#Entity">ENTITY</a> is a mutable data structure that is not defined by its internal data. Instead, it is defined and compared by a reference or identifier key that acts as an index for its current state at a given point in time. For example, instances of the list and dictionary classes behave like <a href="#Entity">ENTITY</a>s in Python: each instance keeps the same identity (as reported by `id()`) while its contents change, although `==` on these classes compares contents rather than identity. An <a href="#Entity">ENTITY</a> is defined in distinction to a <a href="#Value-Object">VALUE OBJECT</a>.
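
As a minimal sketch of identifier-based comparison, the hypothetical `Speaker` class below (not part of the project; the `name` field is an assumption for illustration) compares and hashes by its `speaker_id` only, so its other data can change without changing which <a href="#Entity">ENTITY</a> it is.

```python
class Speaker:
    """Hypothetical ENTITY: equality depends only on the identifier key."""

    def __init__(self, speaker_id: int, name: str) -> None:
        self.speaker_id = speaker_id  # identifier key
        self.name = name              # mutable state, ignored by comparisons

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Speaker):
            return NotImplemented
        return self.speaker_id == other.speaker_id

    def __hash__(self) -> int:
        return hash(self.speaker_id)


first = Speaker(1, "Ada")
renamed = Speaker(1, "Ada L.")
someone_else = Speaker(2, "Ada")

same_entity = first == renamed            # same key, different state
different_entity = first == someone_else  # same state, different key
```

Here `same_entity` is `True` and `different_entity` is `False`, which is the opposite of how a <a href="#Value-Object">VALUE OBJECT</a> would compare.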

## Factory

A <a href="#Factory">FACTORY</a> is a <a href="#Service">SERVICE</a> that is designed to construct domain objects based on different sets of parameters.
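
One way to picture a <a href="#Factory">FACTORY</a> is as a pair of module-level functions that build the same kind of object from different inputs. The sketch below is illustrative only: the `Utterance` type, both function names, and the `text` column are assumptions, not names from the project.

```python
from typing import Mapping, NamedTuple


class Utterance(NamedTuple):
    """Hypothetical domain object built by the factory functions below."""
    speaker_id: int
    text: str


def utterance_from_fields(speaker_id: int, text: str) -> Utterance:
    """Build an Utterance from already-typed fields."""
    return Utterance(speaker_id=speaker_id, text=text.strip())


def utterance_from_row(row: Mapping[str, str]) -> Utterance:
    """Build an Utterance from a CSV-style row of strings."""
    return utterance_from_fields(int(row["speaker_id"]), row["text"])


from_fields = utterance_from_fields(7, " hello ")
from_row = utterance_from_row({"speaker_id": "7", "text": "hello"})
```

Both calls produce equal `Utterance` values, and neither function belongs to an object, which is what makes this a <a href="#Service">SERVICE</a> in the sense used here.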

## Service

A <a href="#Service">SERVICE</a> is a collection of functions that do not belong to any object.

## Token

A <a href="#Token">TOKEN</a> is a numerical identifier for a word or part of speech.
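
For illustration, the sketch below assigns each distinct word an integer in order of first appearance. The numbering scheme and the `build_vocabulary` helper are assumptions; the project may assign identifiers differently.

```python
from typing import Dict, Iterable, List


def build_vocabulary(words: Iterable[str]) -> Dict[str, int]:
    """Map each distinct word to an integer id, in order of first appearance."""
    vocabulary: Dict[str, int] = {}
    for word in words:
        if word not in vocabulary:
            vocabulary[word] = len(vocabulary)
    return vocabulary


words: List[str] = ["the", "venture", "the", "pitch"]
vocabulary = build_vocabulary(words)
tokens = [vocabulary[word] for word in words]  # [0, 1, 0, 2]
```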

## Tokenizer

The <a href="#Tokenizer">TOKENIZER</a> is a singleton <a href="#Factory">FACTORY</a> that consumes pre-processed text data in the form of a CSV and outputs a <a href="#TokenSet">TOKENSET</a> of <a href="#Token">TOKEN</a>s. The key function of the <a href="#Tokenizer">TOKENIZER</a> is to manage the text attributes (e.g., venture\_id, session\_id, speaker\_id, cohort) so that the <a href="#Token">TOKEN</a>s can later be filtered by these attributes or segmented for model processing.
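
The flow from CSV rows to <a href="#Token">TOKEN</a>s with attached attributes might look roughly like the sketch below. It is not the project's implementation: the `tokenize_csv` function, the `text` column, whitespace splitting, and the returned structure are all assumptions; only the four attribute names come from the description above.

```python
import csv
import io
from typing import Dict, List, TextIO, Tuple

ATTRIBUTES = ("venture_id", "session_id", "speaker_id", "cohort")

Row = Tuple[Tuple[int, ...], Dict[str, str]]


def tokenize_csv(handle: TextIO) -> Tuple[List[Row], Dict[str, int]]:
    """Read CSV rows, turn each text into token ids, and keep its attributes."""
    vocabulary: Dict[str, int] = {}
    rows: List[Row] = []
    for record in csv.DictReader(handle):
        # Assumed column name "text"; words are split on whitespace.
        tokens = tuple(
            vocabulary.setdefault(word, len(vocabulary))
            for word in record["text"].split()
        )
        attributes = {name: record[name] for name in ATTRIBUTES}
        rows.append((tokens, attributes))
    return rows, vocabulary


# Made-up sample data, for illustration only.
sample = "\n".join([
    "venture_id,session_id,speaker_id,cohort,text",
    "v1,s1,p1,2020,we build robots",
    "v1,s1,p2,2020,we sell robots",
])
rows, vocabulary = tokenize_csv(io.StringIO(sample))
```

Keeping the attributes next to each token sequence is what makes later filtering by `speaker_id` or `cohort` possible without re-reading the CSV.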

## TokenSet

The <a href="#TokenSet">TOKENSET</a> is a <a href="#Value-Object">VALUE OBJECT</a> that acts as the primary interface for pulling data and labels, filtering data based on attributes (e.g., venture\_id, session\_id, speaker\_id, cohort), collapsing across an attribute, and imbalanced sampling. The <a href="#TokenSet">TOKENSET</a> has no mutable internal state. Consequently, any method that would modify the <a href="#TokenSet">TOKENSET</a> returns a new <a href="#TokenSet">TOKENSET</a> instead.
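
The "return a new object instead of modifying" pattern can be sketched with a frozen dataclass. The class name `TokenSetSketch`, its two fields, and the `filter_by_speaker` method are hypothetical and stand in for whatever the real <a href="#TokenSet">TOKENSET</a> exposes.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Tuple


@dataclass(frozen=True)
class TokenSetSketch:
    """Hypothetical immutable container of token sequences and speaker ids."""
    tokens: Tuple[Tuple[int, ...], ...]
    speaker_ids: Tuple[str, ...]

    def filter_by_speaker(self, speaker_id: str) -> "TokenSetSketch":
        """Return a new instance holding only the rows for one speaker."""
        keep = [i for i, s in enumerate(self.speaker_ids) if s == speaker_id]
        return TokenSetSketch(
            tokens=tuple(self.tokens[i] for i in keep),
            speaker_ids=tuple(self.speaker_ids[i] for i in keep),
        )


full = TokenSetSketch(tokens=((0, 1, 2), (0, 3, 2)), speaker_ids=("p1", "p2"))
only_p2 = full.filter_by_speaker("p2")  # `full` is left unchanged

try:
    full.tokens = ()  # attribute assignment is rejected on a frozen dataclass
    assignment_rejected = False
except FrozenInstanceError:
    assignment_rejected = True
```

Because filtering never touches the original, the same `full` object can safely feed several different filters or samplers.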

## Value Object

A <a href="#Value-Object">VALUE OBJECT</a> is an immutable data type that is defined and compared by the internal data that it is made out of. For example, the int and tuple classes are <a href="#Value-Object">VALUE OBJECT</a>s in Python. A <a href="#Value-Object">VALUE OBJECT</a> is defined in distinction to an <a href="#Entity">ENTITY</a>.
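
A short illustration using the built-in tuple: two tuples with the same contents compare equal, can stand in for each other as dictionary keys, and reject in-place modification.

```python
a = (1, 2)
b = (1, 2)

# Compared by the data they hold.
equal_by_value = a == b

# Equal values hash alike, so either one finds the same dictionary entry.
lookup = {a: "first"}
found = lookup[b]

# Immutable: item assignment raises TypeError.
try:
    a[0] = 99
    mutation_rejected = False
except TypeError:
    mutation_rejected = True
```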