# The Macronizer Class

A new `Macronizer` object accepts a number of initialization parameters, all of them optional. Only the first two are intended to be changed by the user:


- `macronize_everything=True`: whether to also mark vowels whose length can be inferred from accent rules (should be `False` for a human audience).
- `unicode=False`: whether the output uses human-friendly Unicode combining diacritics or machine-friendly non-combining carets and underscores (`^` for short, `_` for long). Evaluation methods are only available for the latter.
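
To make the relationship between the two output styles concrete, here is a minimal sketch of how the caret/underscore notation could be turned into combining diacritics. It is not the package's own implementation: the function name `markers_to_combining` is hypothetical, and it assumes that each marker sits directly after the vowel it belongs to, with `^` meaning short and `_` meaning long.

```python
import unicodedata

COMBINING_BREVE = "\u0306"   # marks a short vowel
COMBINING_MACRON = "\u0304"  # marks a long vowel

def markers_to_combining(text: str) -> str:
    """Hypothetical helper: turn '^' / '_' markers into combining diacritics."""
    converted = text.replace("^", COMBINING_BREVE).replace("_", COMBINING_MACRON)
    # NFC merges letter + mark into a precomposed code point where one exists
    return unicodedata.normalize("NFC", converted)

print(markers_to_combining("κα^λὸς"))
```

Carets or underscores that appear in the text for any other reason would be converted as well, so a helper like this only suits strings that use them solely as length markers.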

In [1]:
from class_macronizer import Macronizer

macronizer = Macronizer()

input = '''ἀάατος, ἀγαθὸς, καλὸς, ἀνήρ, νεανίας'''
output = macronizer.macronize_text(input)

print(f'Results: {output}')

νεα_νί^α_ς
[(-4, 'ἀ^'), (-3, 'ά_'), (-2, 'α^'), (-1, 'τος')]
Modified syllable positions: [(-4, 'ἀ'), (-3, 'ά'), (-2, 'α'), (-1, 'τος')]
New version: ἀάατος
[(-3, 'ἀ^'), (-2, 'γα^'), (-1, 'θὸς')]
Modified syllable positions: [(-3, 'ἀ'), (-2, 'γα'), (-1, 'θὸς')]
New version: ἀγαθὸς
[(-2, 'κα^'), (-1, 'λὸς')]
Modified syllable positions: [(-2, 'κα'), (-1, 'λὸς')]
New version: καλὸς
[(-2, 'ἀ^'), (-1, 'νήρ')]
Modified syllable positions: [(-2, 'ἀ'), (-1, 'νήρ')]
New version: ἀνήρ

Macronization took 1.99 seconds
Results: ἀ^ά_α^τος, ἀ^γα^θὸς, κα^λὸς, ἀ^νήρ


Let's try it with Unicode diacritics. This option is useful where fonts with so-called OpenType ligature instructions are available, such as [New Athena](https://github.com/SteelWagstaff/new-athena-unicode): fonts with precomposed glyphs for adding a macron or breve to Greek letters that already carry other diacritics.

**Important**: this option is only for printing results; to apply any of the evaluation methods, you will need to set `unicode=False`.

In [11]:
macronizer = Macronizer(unicode=True)

output = macronizer.macronize_text(input)
print(f'Results: {output}')

[(-3, 'ἀ^'), (-2, 'γα^'), (-1, 'θὸς')]
Modified syllable positions: [(-3, 'ἀ'), (-2, 'γα'), (-1, 'θὸς')]
New version: ἀγαθὸς
[(-2, 'κα^'), (-1, 'λὸς')]
Modified syllable positions: [(-2, 'κα'), (-1, 'λὸς')]
New version: καλὸς
[(-2, 'ἀ^'), (-1, 'νήρ')]
Modified syllable positions: [(-2, 'ἀ'), (-1, 'νήρ')]
New version: ἀνήρ

Macronization took 2.05 seconds
Results: ἀ̆γᾰθὸς, κᾰλὸς, ἀ̆νήρ


Let's try a longer text. We can evaluate the results with the method `macronization_ratio`, which prints some diagnostics and returns a ratio. Be patient if you include the evaluation: it is O(n) and will take about 130 seconds.
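
As a rough illustration of the kind of figure such an evaluation could report (not necessarily what `macronization_ratio` computes), the sketch below measures the share of dichrona (α, ι, υ) in a caret/underscore-annotated string that carry a length marker. The function `dichrona_coverage` is hypothetical and not part of the package, and it deliberately ignores diphthongs and vowels whose length is already evident from their accent.

```python
import unicodedata

DICHRONA = "αιυ"  # vowels whose length the spelling does not show
MARKERS = "^_"    # '^' = short, '_' = long

def dichrona_coverage(annotated: str) -> float:
    """Hypothetical metric: fraction of α/ι/υ directly followed by a length marker."""
    text = unicodedata.normalize("NFD", annotated)  # split letters from their diacritics
    total = marked = 0
    for i, char in enumerate(text):
        if char.lower() not in DICHRONA:
            continue
        total += 1
        j = i + 1
        # skip accents and breathings attached to this vowel
        while j < len(text) and unicodedata.combining(text[j]):
            j += 1
        if j < len(text) and text[j] in MARKERS:
            marked += 1
    return marked / total if total else 0.0

print(dichrona_coverage("κα^λὸς, ἀνήρ"))
```

A single pass like this is linear in the length of the text, so on its own it would not account for a long running time.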

In [1]:
from class_macronizer import Macronizer
from anabasis import anabasis

macronizer = Macronizer()

macronisandum = anabasis
macronisatum = macronizer.macronize_text(macronisandum)
with open('macronisatum.txt', 'w', encoding='utf-8') as f:
    f.write(macronisatum)

ratio = macronizer.macronization_ratio(macronisandum, macronisatum)
print(ratio)


[(-5, 'δα_'), (-4, 'ρε'), (-3, 'ί'), (-2, 'ο'), (-1, 'υ')]
υ: proparoxytone
Modified syllable positions: [(-5, 'δα'), (-4, 'ρε'), (-3, 'ί'), (-2, 'ο'), (-1, 'υ^')]
New version: δαρείου^
[(-5, 'πα'), (-4, 'ρυ'), (-3, 'σά'), (-2, 'τι'), (-1, 'δος')]
δος: proparoxytone
Modified syllable positions: [(-5, 'πα'), (-4, 'ρυ'), (-3, 'σά'), (-2, 'τι'), (-1, 'δο^ς')]
New version: παρυσάτιδο^ς
[(-4, 'γίγ'), (-3, 'νον'), (-2, 'τα'), (-1, 'ι')]
Modified syllable positions: [(-4, 'γίγ'), (-3, 'νον'), (-2, 'τα'), (-1, 'ι')]
New version: γίγνονται
[(-3, 'πα'), (-2, 'ῖ'), (-1, 'δες')]
δες: properispomenon
Modified syllable positions: [(-3, 'πα'), (-2, 'ῖ'), (-1, 'δε^ς')]
New version: παῖδε^ς
[(-2, 'δῠ́'), (-1, 'ο')]
Modified syllable positions: [(-2, 'δῠ́'), (-1, 'ο')]
New version: δῠ́ο
[(-4, 'πρεσ'), (-3, 'βῠ́'), (-2, 'τε'), (-1, 'ρος')]
Modified syllable positions: [(-4, 'πρεσ'), (-3, 'βῠ́'), (-2, 'τε'), (-1, 'ρος')]
New version: πρεσβῠ́τερος
[(-4, 'ἀρ'), (-3, 'τα'), (-2, 'ξέρ'), (-1, 'ξης')]
Modified