<center><h1>Restoring ancient text using deep learning</h1>
<h2>A case study on Greek epigraphy</h2>

Yannis Assael<sup>*</sup>, Thea Sommerschield<sup>*</sup>, Jonathan Prag
</center>

---

Ancient history relies on disciplines such as epigraphy, the study of ancient inscribed texts, for evidence of the recorded past. However, these texts, "inscriptions", are often damaged over the centuries, and illegible parts of the text must be restored by specialists, known as epigraphists. This work presents a novel assistive method for providing text restorations using deep neural networks. To the best of our knowledge, Pythia is the first ancient text restoration model that recovers missing characters from a damaged text input. Its architecture is carefully designed to handle long-term context information, and deal efficiently with missing or corrupted character and word representations. To train it, we wrote a non-trivial pipeline to convert PHI, the largest digital corpus of ancient Greek inscriptions, to machine actionable text, which we call PHI-ML. On PHI-ML, Pythia's predictions achieve a 30.1% character error rate, compared to the 57.3% of human epigraphists. Moreover, in 73.5% of cases the ground-truth sequence was among the Top-20 hypotheses of Pythia, which effectively demonstrates the impact of such an assistive method on the field of digital epigraphy, and sets the state-of-the-art in ancient text restoration.
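
The two metrics quoted above can be illustrated with a small sketch. It assumes that the character error rate is the Levenshtein edit distance divided by the length of the ground-truth sequence, and that a Top-20 hit means the exact ground-truth string appears among the first 20 hypotheses; the paper's exact evaluation code is not reproduced here, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(prediction, target):
    """Edit distance normalised by the ground-truth length (an assumption)."""
    if not target:
        raise ValueError('target must be non-empty')
    return edit_distance(prediction, target) / len(target)


def top_k_hit(hypotheses, target, k=20):
    """True if the ground truth is among the first k hypotheses."""
    return target in hypotheses[:k]
```

Under this definition a single wrong character in a four-character restoration gives an error rate of 0.25.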

### References

- [arXiv pre-print](https://arxiv.org/abs/1910.06262)
- [EMNLP-IJCNLP 2019](https://www.aclweb.org/anthology/D19-1668)

When using any of the source code of this project please cite:
```
@inproceedings{assael2019restoring,
  title={Restoring ancient text using deep learning: a case study on {Greek} epigraphy},
  author={Assael, Yannis and Sommerschield, Thea and Prag, Jonathan},
  booktitle={Empirical Methods in Natural Language Processing},
  pages={6369--6376},
  year={2019}
}
```

#### License

```
Copyright 2019 Google LLC, Thea Sommerschield, Jonathan Prag

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```
---

# Interactive notebook instructions

Χαῖρε (or welcome) to the Interactive Notebook of Pythia.

Please follow the instructions below to begin restoring ancient Greek inscriptions.

1. Create a copy of this notebook to allow editing (click "**Open in playground**" at the top left, or "File" -> "Save a copy in Drive").
2. Execute the cells below using Shift + Enter to prepare the download and initialise Pythia.
3. In the section **Imports and parameters**, the cells:
  - load the imports;
  - set the default character prediction parameters.
4. In the section **Download and load Pythia**, the cells:
  - download a pre-trained bi-word epigraphy model;
  - define the alphabet and the text processing scripts;
  - load the vocabulary;
  - create an instance of the model, and
  - define the auxiliary visualisation functions.
5. In the **Pythia** section, the interactive forms can be used to get the model's top predictions and visualise Pythia's attention weights. You can enter your own text in the "input text" fields, which are pre-filled with example texts.



In [1]:
#@title Imports (takes a while)

import argparse
import html
import importlib
import os
import re
import sys
import logging
import warnings

warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

from IPython.display import display, HTML
import numpy as np
import tensorflow as tf
np.seterr(all='ignore')
tf.get_logger().setLevel(logging.ERROR)
tf.logging.set_verbosity(tf.logging.ERROR)
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)
tf.compat.v1.disable_v2_behavior()
print(tf.__version__)



1.14.0


In [2]:
#@title Parameters
pred_char_min = 1 #@param {type:"slider", min:1, max:20, step:1}
pred_char_max = 10 #@param {type:"slider", min:1, max:20, step:1}

p = argparse.ArgumentParser(prog='Pythia', description='visualise')
# optional arguments (all have defaults):
p.add_argument('--dataset', default="greek_epigraphy_dict.p", type=str, help='dataset file')
p.add_argument('--load_checkpoint', default=os.getcwd() + '/model_biword_epigraphy/model.ckpt-830000', type=str, help='load from checkpoint')
p.add_argument('--batch_size', default=1, type=int, help='batch size')
p.add_argument('--learning_rate', default=1e-3, type=float, help='learning rate')
p.add_argument('--grad_clip', default=5., type=float, help='gradient norm clipping')
p.add_argument('--beam_width', default=100, type=int, help='beam search width')
p.add_argument("--log_samples", default=False, type=bool, metavar='N', help='log samples')
p.add_argument('--eval_samples', default=6400, type=int, help='number of evaluation samples')
p.add_argument('--test_iterations', default=1, type=int, help='number of test iterations')
p.add_argument('--context_char_min', default=-1, type=int, help='minimum context characters')
p.add_argument('--context_char_max', default=1000, type=int, help='maximum context characters')
p.add_argument('--pred_char_min', default=pred_char_min, type=int, help='minimum pred characters')
p.add_argument('--pred_char_max', default=pred_char_max, type=int, help='maximum pred characters')
p.add_argument('--missing_char_min', default=0, type=int, help='minimum missing characters')
p.add_argument('--missing_char_max', default=0, type=int, help='maximum missing characters')
p.add_argument('--pred_guess', default=False, type=bool, help='predict guessed characters')
p.add_argument('--loglevel', default='INFO', type=str, metavar='LEVEL',
               help=('Log level, will be overwritten by --debug. (DEBUG/INFO/WARN)'))
FLAGS = p.parse_args(args=[])
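
Because the cell above calls `parse_args(args=[])`, every flag takes its default value. The sketch below uses a small stand-alone parser (not the notebook's `p`) to show how values could be overridden by passing a list of strings, and why `type=bool` deserves care: `bool('False')` is `True`, so any non-empty string enables the flag. The `str2bool` helper and the flag names `--naive_flag` and `--safe_flag` are hypothetical.

```python
import argparse


def str2bool(value):
    """Parse common textual booleans (hypothetical helper, not in the notebook)."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in ('true', '1', 'yes'):
        return True
    if lowered in ('false', '0', 'no'):
        return False
    raise argparse.ArgumentTypeError('expected a boolean, got %r' % value)


demo = argparse.ArgumentParser(prog='demo')
demo.add_argument('--beam_width', default=100, type=int)
demo.add_argument('--naive_flag', default=False, type=bool)     # same pattern as --pred_guess
demo.add_argument('--safe_flag', default=False, type=str2bool)  # explicit parsing

# No arguments: all defaults apply, as in the notebook.
defaults = demo.parse_args(args=[])

# Overrides are passed as a flat list of strings.
overridden = demo.parse_args(
    args=['--beam_width', '20', '--naive_flag', 'False', '--safe_flag', 'False'])
```

Here `overridden.naive_flag` ends up `True` even though the string `'False'` was passed, while `overridden.safe_flag` is `False`. In the notebook this does not matter as long as the boolean flags keep their defaults.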

# Download and load Pythia

In [3]:
#@title Alphabet
class Alphabet():  # the parentheses would list base classes; none are needed here
  # keyword arguments with = define default values; __init__ initialises the instance
  def __init__(self, alphabet, numerals='0123456789', punctuation='.', space=' ', pred='_', missing='-',
               pad='#', unk='^', sos='<', eos='>', sog='[', eog=']', wordlist_path=None, wordlist_size=100000):
    self.alphabet = list(alphabet)  # alphabet characters
    self.numerals = list(numerals)  # numerals
    self.punctuation = list(punctuation)  # punctuation
    self.space = space  # spacing
    self.pred = pred  # character to be predicted
    self.missing = missing  # missing char
    self.pad = pad  # padding character (fills the right of the string)
    self.unk = unk  # unknown char
    self.sos = sos  # start of sentence
    self.eos = eos  # end of sentence
    self.sog = sog  # start of guess
    self.eog = eog  # end of guess

    # Define wordlist mapping
    if wordlist_path is None:
      self.idx2word = None
      self.word2idx = None
    else:
      with open(wordlist_path, "r") as f:
        self.idx2word = [self.sos,
                         self.eos,
                         self.pad,
                         self.unk] + [w_c.split('\t')[0] for w_c in f.read().split('\n')[:wordlist_size]]
      self.word2idx = {self.idx2word[i]: i for i in range(len(self.idx2word))}

    # Define vocab mapping
    self.idx2char = [self.sos,
                     self.eos,
                     self.pad,
                     self.unk] + self.alphabet + self.numerals + self.punctuation + [
                      self.space,
                      self.missing,
                      self.pred]
    self.char2idx = {self.idx2char[i]: i for i in range(len(self.idx2char))}

    # Define special character indices
    self.pad_idx = self.char2idx[pad]
    self.unk_idx = self.char2idx[unk]
    self.sos_idx = self.char2idx[sos]
    self.eos_idx = self.char2idx[eos]

  def filter(self, t):
    return t


class GreekAlphabet(Alphabet):
  def __init__(self):
    super().__init__(alphabet=
                     'ΐάέήίΰαβγδεζηθικλμνξοπρςστυφχψωϊϋόύώϙϛἀἁἂἃἄἅἆἇἐἑἒἓἔἕἠἡἢἣἤἥἦἧἰἱἲἳἴἵἶἷὀὁὂὃὄὅὐὑὒὓὔὕὖὗὠὡὢὣὤὥὦὧὰὲὴὶὸὺὼᾀᾁᾂᾃᾄᾅᾆᾇᾐᾑᾒᾓᾔᾕᾖᾗᾠᾡᾢᾣᾤᾥᾦᾧᾰᾱᾲᾳᾴᾶᾷῂῃῄῆῇῐῑῖῠῡῤῥῦῲῳῴῶῷ',
                     wordlist_path=os.getcwd() + "/datasets/greek_text_and_epigraphy_wordlist.txt",
                     wordlist_size=100000)  # ͱͳͷϝϟϡ϶ϸϻ
    self.tonos_to_oxia = {
      # tonos   : #oxia
      "\u0386": "\u1FBB",  # capital letter alpha
      "\u0388": "\u1FC9",  # capital letter epsilon
      "\u0389": "\u1FCB",  # capital letter eta
      "\u038C": "\u1FF9",  # capital letter omicron
      "\u038A": "\u1FDB",  # capital letter iota
      "\u038E": "\u1FEB",  # capital letter upsilon
      "\u038F": "\u1FFB",  # capital letter omega
      "\u03AC": "\u1F71",  # small letter alpha
      "\u03AD": "\u1F73",  # small letter epsilon
      "\u03AE": "\u1F75",  # small letter eta
      "\u0390": "\u1FD3",  # small letter iota with dialytika and tonos/oxia
      "\u03AF": "\u1F77",  # small letter iota
      "\u03CC": "\u1F79",  # small letter omicron
      "\u03B0": "\u1FE3",  # small letter upsilon with dialytika and tonos/oxia
      "\u03CD": "\u1F7B",  # small letter upsilon
      "\u03CE": "\u1F7D"  # small letter omega
    }
    self.oxia_to_tonos = {v: k for k, v in self.tonos_to_oxia.items()}

  def filter(self, t):  # override previous filter function
    # lowercase
    t = t.lower()

    # replace dot below
    t = t.replace(u'\u0323', '')

    # replace perispomeni
    t = t.replace(u'\u0342', '')

    # replace ending sigma
    t = re.sub(r'([\w\[\]])σ(?![\[\]])(\b)', r'\1ς\2', t)

    # replace oxia with tonos
    for oxia, tonos in self.oxia_to_tonos.items():
      t = t.replace(oxia, tonos)

    # replace h
    h_patterns = {
      # input: #target
      "ε": "ἑ",
      "ὲ": "ἓ",
      "έ": "ἕ",

      "α": "ἁ",
      "ὰ": "ἃ",
      "ά": "ἅ",
      "ᾶ": "ἇ",

      "ι": "ἱ",
      "ὶ": "ἳ",
      "ί": "ἵ",
      "ῖ": "ἷ",

      "ο": "ὁ",
      "ό": "ὅ",
      "ὸ": "ὃ",

      "υ": "ὑ",
      "ὺ": "ὓ",
      "ύ": "ὕ",
      "ῦ": "ὗ",

      "ὴ": "ἣ",
      "η": "ἡ",
      "ή": "ἥ",
      "ῆ": "ἧ",

      "ὼ": "ὣ",
      "ώ": "ὥ",
      "ω": "ὡ",
      "ῶ": "ὧ"
    }

    # iterate by keys
    for h_in, h_tar in h_patterns.items():
      # look up and replace h[ and h]
      t = re.sub(r'ℎ(\[?){}'.format(h_in), r'\1{}'.format(h_tar), t)
      t = re.sub(r'ℎ(\]?){}'.format(h_in), r'{}\1'.format(h_tar), t)

    # any h left is an ἡ
    t = re.sub(r'(\[?)ℎ(\]?)', r'\1ἡ\2', t)

    return t
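
The tonos/oxia table above exists because Unicode encodes the acute accent on Greek vowels twice: once in the basic Greek block (tonos) and once in the Greek Extended block (oxia). As a cross-check, the sketch below relies on the fact that the oxia code points decompose canonically to their tonos counterparts, so NFC normalisation from the standard library maps one onto the other. This is an alternative illustration, not the method used by `GreekAlphabet.filter`, and the helper name is hypothetical.

```python
import unicodedata

# A few (tonos, oxia) pairs copied from the table above.
pairs = {
    '\u03AC': '\u1F71',  # small letter alpha
    '\u03CE': '\u1F7D',  # small letter omega
    '\u0390': '\u1FD3',  # small letter iota with dialytika
    '\u0386': '\u1FBB',  # capital letter alpha
}


def oxia_to_tonos_nfc(text):
    """Map oxia code points to tonos via NFC normalisation."""
    return unicodedata.normalize('NFC', text)


# True when every sampled oxia character normalises to its tonos form.
all_match = all(oxia_to_tonos_nfc(oxia) == tonos for tonos, oxia in pairs.items())
```

Note that NFC also recomposes other combining sequences, so it is not a drop-in replacement for the explicit table when only the accent mapping is wanted.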


In [4]:
#@title Vocabulary
def text_to_idx(t, alphabet):
  return np.array([alphabet.char2idx[c] for c in t], dtype=np.int32)


def text_to_word_idx(t, alphabet):
  out = np.full(len(t), alphabet.word2idx[alphabet.unk], dtype=np.int32)

  for m in re.finditer(r'\w+', t):
    if m.group() in alphabet.word2idx:
      out[m.start():m.end()] = alphabet.word2idx[m.group()]

  return out


def idx_to_text(idxs, alphabet):
  idxs = np.array(idxs)
  out = ''
  for i in range(idxs.size):
    idx = idxs[i]
    if idx == alphabet.eos_idx:
      break
    elif idx not in [alphabet.sos_idx]:
      out += alphabet.idx2char[idx]

  return out
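
To make the two encodings above concrete, here is a toy sketch with a Latin alphabet and a two-word vocabulary, both invented for illustration. It mirrors the logic of `text_to_idx` and `text_to_word_idx`: every character position receives a character index, plus the index of the word it belongs to, or the unknown index when the word is not in the word list.

```python
import re

import numpy as np

# Toy vocabularies (assumptions for this example only).
chars = ['<', '>', '#', '^'] + list('abcdefghijklmnopqrstuvwxyz') + [' ', '-', '_']
char2idx = {c: i for i, c in enumerate(chars)}
words = ['<', '>', '#', '^', 'the', 'cat']
word2idx = {w: i for i, w in enumerate(words)}


def encode_chars(t):
    """One character index per position."""
    return np.array([char2idx[c] for c in t], dtype=np.int32)


def encode_words(t):
    """One word index per character position; unknown words map to '^'."""
    out = np.full(len(t), word2idx['^'], dtype=np.int32)
    for m in re.finditer(r'\w+', t):
        if m.group() in word2idx:
            out[m.start():m.end()] = word2idx[m.group()]
    return out


intact = encode_words('the cat sat')   # 'sat' and spaces are unknown
damaged = encode_words('the c_t')      # 'c_t' is one \w+ token, not in the list
```

Since `\w` matches the underscore, a word containing characters to be predicted is treated as a single out-of-vocabulary token, so only the character-level input carries information about it.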

In [5]:
#@title Dataset
class Text():

  def __init__(self, path=None, sentences=[]):
    self.path = path
    self.sentences = sentences[:]

  def __str__(self):
    return 'path: {}, sentences: {}'.format(self.path, len(self.sentences))


def generate_sample(config, alphabet, texts, idx=None, eos=False, pad=False, is_training=False):
  iter = 0
  x_idx, x_len, x_word_idx, x_word_len, y_idx, y_len = None, None, None, None, None, None

  # select random text until length is satisfied
  while True:
    iter += 1

    if idx is None:
      t = np.random.choice(texts)
    else:
      t = texts[idx]
      if iter > 1:
        break
    text = ' '.join([s + '.' for s in t.sentences])

    # remove guess signs
    if not config.pred_guess:
      text = text.replace(alphabet.sog, '').replace(alphabet.eog, '')

    # randomly select context size and number of characters to remove
    context_char_max = min(max(config.context_char_min, len(text)), config.context_char_max)
    if config.context_char_min > 0:
      context_char_num = np.random.randint(config.context_char_min, context_char_max)
    else:
      context_char_num = context_char_max

    if len(text.replace(alphabet.sog, '').replace(alphabet.eog, '')) < context_char_num:
      continue

    # compute start sentence
    text_idx_start = np.random.randint(0, len(text) - context_char_num) if len(text) > context_char_num else 0
    text_idx_end = text_idx_start + context_char_num

    # keep only current text
    text = text[text_idx_start:text_idx_end]

    if config.pred_guess and not is_training:
      # delete guess characters
      matches = []
      for m in re.finditer(r'%s([^%s%s]+)%s' % (
          re.escape(alphabet.sog),
          re.escape(alphabet.missing),
          re.escape(alphabet.eog), re.escape(alphabet.eog)), text):
        start = m.start() + 1
        end = m.end() - 2
        if config.pred_char_min <= end - start <= config.pred_char_max:
          matches.append((m.group(1), start, end))

      # skip if no matches found
      if len(matches) == 0:
        continue

      # pick a random match
      matches_idx = np.random.randint(len(matches))
      (y, pred_start, pred_end) = matches[matches_idx]
      x = list(text)
      for i in range(pred_start, pred_end + 1):
        x[i] = alphabet.pred

      # remove guess signs
      x = [c for c in x if c not in [alphabet.sog, alphabet.eog]]

    else:
      # delete pred characters
      if alphabet.pred not in text:
        pred_char_num = np.random.randint(config.pred_char_min, min(len(text), config.pred_char_max))
        if len(text) < pred_char_num:
          continue
        pred_char_idx = np.random.randint(0, len(text) - pred_char_num) if len(text) > pred_char_num else 0
        y = text[pred_char_idx:pred_char_idx + pred_char_num]
      else:
        y = ''

      # skip if it's a missing character
      if alphabet.missing in y:
        continue

      x = list(text)
      if alphabet.pred not in x:
        for i in range(pred_char_idx, pred_char_idx + pred_char_num):
          x[i] = alphabet.pred

      # hide random characters
      if config.missing_char_max > 0 and is_training:
        missing_char_num = np.random.randint(config.missing_char_min, min(len(text), config.missing_char_max))
        for i in np.random.randint(0, context_char_num, missing_char_num):
          if x[i] != alphabet.pred:
            x[i] = alphabet.missing

    # convert to string
    x = ''.join(x)

    # convert to indices
    x_idx = text_to_idx(x, alphabet)
    x_word_idx = text_to_word_idx(x, alphabet)
    y_idx = text_to_idx(y, alphabet)
    assert (len(x_idx) == len(x_word_idx))

    # append eos character
    if eos:
      y_idx = np.concatenate((y_idx, [alphabet.eos_idx]))

    # compute lengths
    x_len = np.int32(x_idx.shape[0])
    x_word_len = np.int32(x_word_idx.shape[0])
    y_len = np.int32(y_idx.shape[0])

    # pad sequences
    if pad:
      x_idx = np.pad(x_idx, (0, config.context_char_max - x_idx.size), 'constant',
                     constant_values=(0, alphabet.eos_idx))
      x_word_idx = np.pad(x_word_idx, (0, config.context_char_max - x_word_idx.size), 'constant',
                          constant_values=(0, alphabet.eos_idx))
      y_idx = np.pad(y_idx, (0, config.pred_char_max - y_idx.size + 1), 'constant',
                     constant_values=(0, alphabet.eos_idx))

    break

  return {'x': x_idx, 'x_len': x_len,
          'x_word': x_word_idx, 'x_word_len': x_word_len,
          'y': y_idx, 'y_len': y_len}
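
The core of `generate_sample`, when the text contains no prediction symbol, is to hide a random contiguous span and keep it as the target. A minimal stand-alone sketch of that one step follows; it leaves out context cropping, guess brackets, missing-character noise and padding, and the function name `mask_span` is hypothetical.

```python
import random


def mask_span(text, span_len, rng, pred='_'):
    """Replace a random span of `span_len` characters with `pred`.

    Returns the masked input x, the hidden target y and the span start.
    """
    if not 0 < span_len <= len(text):
        raise ValueError('span_len must be between 1 and len(text)')
    start = rng.randint(0, len(text) - span_len)  # inclusive bounds
    y = text[start:start + span_len]
    x = text[:start] + pred * span_len + text[start + span_len:]
    return x, y, start


rng = random.Random(0)  # seeded for reproducibility
x, y, start = mask_span('abcdefghij', 3, rng)
```

The masked string always has the same length as the original, so character positions in `x` stay aligned with the source text.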

In [6]:
#@title Attention visualisation
def softmax(x, axis=-1):
  # subtract the maximum before exponentiating for numerical stability
  e_x = np.exp(x - np.max(x))
  return e_x / e_x.sum(axis=axis, keepdims=True)


def scale(x):
  return (x - np.min(x)) / np.ptp(x)


def generate_alignment(x_, y_, y_pred_0, out_tensors):
  html_src = '''
  <html>
  <head>
  <link href="https://fonts.googleapis.com/css?family=Roboto+Mono" rel="stylesheet">
  <style>
  body { font-family: 'Roboto Mono', monospace; }
  table {
    width:100%%;
    table-layout: fixed;
    font-size: 11px;
    padding: 0;
  }
  table td{
    word-wrap: break-word;
    white-space: normal;
  }
  span {
    padding: 0;
  }
  </style>
  </head>
  <body>
  <p><b>y:</b> %s</p>
  <p><b>y_pred:</b> %s</p>''' % (y_, y_pred_0)

  for attn_layer in range(len(out_tensors.alignment_history)):
    text = np.array(list(x_))
    text_missing_idx = text == '_'
    html_src += '<h2>layer {}</h2>'.format(attn_layer)

    html_src += '<table>'
    for y_idx in range(len(y_pred_0)):
      html_src += '<tr><td width="20"><b>{}</b></td><td>'.format(y_pred_0[y_idx])
      attn = out_tensors.alignment_history[attn_layer].copy()[y_idx, 0, :len(x_)]

      # rescale
      attn[np.bitwise_not(text_missing_idx)] /= attn[np.bitwise_not(text_missing_idx)].max()
      if (text_missing_idx).any():
        attn[text_missing_idx] /= attn[text_missing_idx].max()

      for i in range(len(text)):
        if text[i] == '_':
          html_src += '<span style="background-color: rgba(139,195,74,{:.2f});">{}</span>'.format(attn[i], '?')
        else:
          html_src += '<span style="background-color: rgba(33,150,243,{:.2f});">{}</span>'.format(attn[i], html.escape(text[i]))
      html_src += '</td></tr>'
    html_src += '</table>'

  html_src += '</body></html>'

  return html_src
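
The visualisation above maps each attention weight to the alpha channel of a background colour after dividing by the row maximum. The sketch below isolates that step for a single row; it is a simplified illustration with a hypothetical function name, and it does not rescale predicted and known positions separately as `generate_alignment` does.

```python
import html

import numpy as np


def attention_row_html(chars, weights, rgb=(33, 150, 243)):
    """Render one row of characters shaded by their attention weights."""
    weights = np.asarray(weights, dtype=np.float64)
    peak = weights.max()
    # Rescale so that the strongest weight is fully opaque.
    alphas = weights / peak if peak > 0 else np.zeros_like(weights)
    spans = [
        '<span style="background-color: rgba({},{},{},{:.2f});">{}</span>'.format(
            rgb[0], rgb[1], rgb[2], alpha, html.escape(char))
        for char, alpha in zip(chars, alphas)]
    return ''.join(spans)


row = attention_row_html('a<b', [0.1, 0.2, 0.4])
```

Escaping each character with `html.escape` keeps symbols such as `<` and `>`, which the alphabet uses as start and end markers, from being interpreted as markup.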

In [7]:
#@title Model
import collections

import sonnet as snt
# import tensorflow as tf
# from tensor2tensor.models import transformer
# from tensor2tensor.utils.learning_rate import learning_rate_schedule


class Model(snt.AbstractModule):

  def __init__(self,
               config,
               alphabet,
               name="model_0010w"):
    super(Model, self).__init__(name=name)
    self._config = config
    self._vocab_size_char = len(alphabet.idx2char)
    self._vocab_size_word = len(alphabet.idx2word)
    self._pad_idx = alphabet.pad_idx
    self._unk_idx = alphabet.unk_idx
    self._sos_idx = alphabet.sos_idx
    self._eos_idx = alphabet.eos_idx

  # def get_learning_rate(self):
  #   hparams = transformer.transformer_base()
  #   return learning_rate_schedule(hparams)

  def _build(self,
             batch,
             keep_prob=0.8,
             sampling_probability=0.5,
             beam_width=0,
             is_training=False):

    # encoder
    with tf.variable_scope('encoder'):
      encoder_output, encoder_state = self._encoder(
        inputs=(tf.transpose(batch['x'], [1, 0]), tf.transpose(batch['x_word'], [1, 0])),
        lengths=batch['x_len'],
        target_size=(self._vocab_size_char, self._vocab_size_word),
        rnn_size_enc=512,
        rnn_size_dec=512,
        rnn_layers_enc=2,
        keep_prob=keep_prob,
        is_training=is_training)

    # decoder
    with tf.variable_scope('decoder'):
      decoder_output = self._seq2seq(encoder_output=encoder_output,
                                     encoder_state=encoder_state,
                                     encoder_lengths=batch['x_len'],
                                     target=batch['y'],
                                     target_lengths=batch['y_len'],
                                     target_size=self._vocab_size_char,
                                     sos_idx=self._sos_idx,
                                     eos_idx=self._eos_idx,
                                     rnn_size_dec=512,
                                     rnn_layers_dec=2,
                                     attention_fn=tf.contrib.seq2seq.LuongAttention,
                                     beam_width=beam_width,
                                     sampling_probability=sampling_probability,
                                     keep_prob=keep_prob,
                                     is_training=is_training)
    return decoder_output

  def _encoder(self,
               inputs,
               lengths,
               target_size,
               rnn_cell=tf.contrib.rnn.LSTMBlockCell,
               rnn_size_enc=128,
               rnn_size_dec=128,
               rnn_layers_enc=3,
               keep_prob=1.,
               is_training=False):

    with tf.variable_scope('encoder'):
      # character embedding
      self.embedding_encoder_char = snt.Embed(target_size[0], embed_dim=rnn_size_enc, name='embedding_encoder_char')
      inputs_char_emb = self.embedding_encoder_char(inputs[0])

      # word embedding
      self.embedding_encoder_word = snt.Embed(target_size[1], embed_dim=rnn_size_enc, name='embedding_encoder_word')
      inputs_word_emb = self.embedding_encoder_word(inputs[1])

      # combine embeddings
      inputs_emb = tf.concat([inputs_char_emb, inputs_word_emb], axis=-1)

      # rnn cells
      cells_fw = [tf.compat.v1.nn.rnn_cell.DropoutWrapper(
        rnn_cell(rnn_size_enc, reuse=tf.AUTO_REUSE,
                 name='rnn_fw_%d' % l),
        input_keep_prob=keep_prob if is_training else 1.,
        dtype=tf.float32) for l in range(rnn_layers_enc)]

      cells_bw = [tf.compat.v1.nn.rnn_cell.DropoutWrapper(
        rnn_cell(rnn_size_enc, reuse=tf.AUTO_REUSE,
                 name='rnn_bw_%d' % l),
        input_keep_prob=keep_prob if is_training else 1.,
        dtype=tf.float32) for l in range(rnn_layers_enc)]

      # run bidirectional rnn
      encoder_output, states_fw, states_bw = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
        cells_fw=cells_fw,
        cells_bw=cells_bw,
        inputs=inputs_emb,
        sequence_length=lengths,
        time_major=True,
        dtype=tf.float32)

      # concatenate the forward and backward states of each layer and project them to the decoder size
      encoder_states = []
      for l in range(rnn_layers_enc):
        state_c = tf.concat(
          values=(states_fw[l].c, states_bw[l].c),
          axis=1)
        state_c_bridge = tf.layers.dense(state_c, rnn_size_dec,
                                         trainable=is_training,
                                         name='state_c_bridge_%d' % l,
                                         reuse=tf.AUTO_REUSE)
        state_h = tf.concat(
          values=(states_fw[l].h, states_bw[l].h),
          axis=1)
        state_h_bridge = tf.layers.dense(state_h, rnn_size_dec,
                                         trainable=is_training,
                                         name='state_h_bridge_%d' % l,
                                         reuse=tf.AUTO_REUSE)
        encoder_states.append(tf.contrib.rnn.LSTMStateTuple(c=state_c_bridge, h=state_h_bridge))

    return tf.transpose(encoder_output, [1, 0, 2]), tuple(encoder_states)

  def _seq2seq(self,
               encoder_output,
               encoder_state,
               encoder_lengths,
               target,
               target_lengths,
               target_size,
               sos_idx,
               eos_idx,
               rnn_cell=tf.contrib.rnn.LSTMBlockCell,
               rnn_size_dec=128,
               rnn_layers_dec=3,
               attention_fn=tf.contrib.seq2seq.LuongAttention,
               beam_width=0,
               sampling_probability=0.,
               keep_prob=1.,
               is_training=False):

    with tf.variable_scope('decoder'):
      batch_size = tf.shape(encoder_output)[0]

      # decoder embedding
      self.embedding_decoder = snt.Embed(target_size, embed_dim=rnn_size_dec, name='embedding_decoder')
      self.sos_idx = sos_idx
      self.eos_idx = eos_idx
      self.sos_tokens = tf.fill([batch_size], self.sos_idx)

      # beam search
      if not is_training and beam_width > 0:
        encoder_output = tf.contrib.seq2seq.tile_batch(encoder_output,
                                                       multiplier=beam_width)
        encoder_state = tf.contrib.seq2seq.tile_batch(encoder_state,
                                                      multiplier=beam_width)
        encoder_lengths = tf.contrib.seq2seq.tile_batch(encoder_lengths,
                                                        multiplier=beam_width)

      # define attention
      self.attention_mechanism = attention_fn(rnn_size_dec,
                                              memory=encoder_output,
                                              memory_sequence_length=encoder_lengths,
                                              scale=True)

      # attention cell
      attention_cell = tf.contrib.rnn.MultiRNNCell([
        tf.compat.v1.nn.rnn_cell.DropoutWrapper(rnn_cell(rnn_size_dec, reuse=tf.AUTO_REUSE, name='rnn_%d' % l),
                                      input_keep_prob=keep_prob if is_training else 1.,
                                      dtype=tf.float32) for l in range(rnn_layers_dec)])

      # attention wrapper
      self.decoder_cell = tf.contrib.seq2seq.AttentionWrapper(attention_cell,
                                                              [self.attention_mechanism] * rnn_layers_dec,
                                                              attention_layer_size=[rnn_size_dec] * rnn_layers_dec,
                                                              alignment_history=(not is_training and beam_width == 0))

      # initial attention state
      if not is_training and beam_width > 0:
        bs = batch_size * beam_width
      else:
        bs = batch_size
      decoder_initial_state = self.decoder_cell.zero_state(bs, tf.float32).clone(
        cell_state=encoder_state)

      # projection layer
      self.projection_layer = tf.layers.Dense(target_size,
                                              use_bias=False,
                                              name='output_projection',
                                              trainable=is_training,
                                              _reuse=tf.AUTO_REUSE)

      # training and inference helpers
      if is_training:
        # left pad to add sos idx
        target_sos = tf.pad(target, [[0, 0], [1, 0]], constant_values=self.sos_idx)

        # helper
        target_emb_input = self.embedding_decoder(target_sos)
        if sampling_probability > 0:
          helper = tf.contrib.seq2seq.ScheduledEmbeddingTrainingHelper(
            target_emb_input,
            sequence_length=target_lengths,
            embedding=self.embedding_decoder,
            sampling_probability=sampling_probability,
            time_major=False)
        else:
          helper = tf.contrib.seq2seq.TrainingHelper(target_emb_input,
                                                     sequence_length=target_lengths,
                                                     time_major=False)
        # decoder
        decoder = tf.contrib.seq2seq.BasicDecoder(cell=self.decoder_cell,
                                                  helper=helper,
                                                  initial_state=decoder_initial_state,
                                                  output_layer=self.projection_layer)
        maximum_iterations = None
      else:
        # inference
        if beam_width > 0:
          decoder = tf.contrib.seq2seq.BeamSearchDecoder(
            cell=self.decoder_cell,
            embedding=self.embedding_decoder,
            start_tokens=self.sos_tokens,
            end_token=self.eos_idx,
            initial_state=decoder_initial_state,
            beam_width=beam_width,
            output_layer=self.projection_layer)
        else:
          helper = tf.contrib.seq2seq.GreedyEmbeddingHelper(
            embedding=self.embedding_decoder,
            start_tokens=self.sos_tokens,
            end_token=self.eos_idx)

          decoder = tf.contrib.seq2seq.BasicDecoder(cell=self.decoder_cell,
                                                    helper=helper,
                                                    initial_state=decoder_initial_state,
                                                    output_layer=self.projection_layer)

        maximum_iterations = tf.round(self._config.pred_char_max)

      (final_outputs, final_state, final_sequence_lengths) = tf.contrib.seq2seq.dynamic_decode(
        decoder,
        output_time_major=False,
        maximum_iterations=maximum_iterations,
        swap_memory=True)

      if is_training:
        logits = final_outputs.rnn_output
        sample = final_outputs.sample_id
        alignment_history = tf.no_op()
      else:
        logits = tf.no_op()
        if beam_width > 0:
          sample = final_outputs.predicted_ids
          alignment_history = tf.no_op()
        else:
          sample = final_outputs.sample_id

          alignment_history = []
          for history_array in final_state.alignment_history:
            alignment_history.append(history_array.stack())
          alignment_history = tuple(alignment_history)

      return collections.namedtuple('Outputs', 'logits sample alignment_history')(logits=logits, sample=sample,
                                                                                  alignment_history=alignment_history)
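
The decoder above attends over the encoder outputs with `LuongAttention` and `scale=True`. As a rough NumPy sketch of what multiplicative attention computes for one decoding step: the encoder outputs are projected to the decoder size, scored against the decoder query by a dot product, normalised with a softmax, and used to form a weighted sum. The names `w_memory` and `scale`, and the random weights, are assumptions for illustration; this is not the TensorFlow implementation.

```python
import numpy as np


def luong_attention(query, memory, w_memory, scale=1.0):
    """Single-step multiplicative attention.

    query:    [units]           decoder state
    memory:   [time, enc_dim]   encoder outputs
    w_memory: [enc_dim, units]  projection of the memory (learned in practice)
    """
    keys = memory @ w_memory           # [time, units]
    scores = scale * (keys @ query)    # one dot-product score per time step
    scores = scores - scores.max()     # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ memory         # weighted sum of encoder outputs
    return weights, context


rng = np.random.RandomState(0)
memory = rng.randn(5, 4)      # 5 time steps, encoder size 4
w_memory = rng.randn(4, 3)    # project to 3 decoder units
query = rng.randn(3)
weights, context = luong_attention(query, memory, w_memory)
```

The `weights` vector is what the attention visualisation displays: one value per input character for each predicted character.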


In [8]:
#@title Create graph
alphabet = GreekAlphabet()

tf.reset_default_graph()

# create input placeholder
batch_tf = {
  'x': tf.placeholder(tf.int32, shape=[1, FLAGS.context_char_max]),
  'x_len': tf.placeholder(tf.int32, shape=[1]),
  'x_word': tf.placeholder(tf.int32, shape=[1, FLAGS.context_char_max]),
  'x_word_len': tf.placeholder(tf.int32, shape=[1]),
  'y': tf.placeholder(tf.int32, shape=[1, FLAGS.pred_char_max]),
  'y_len': tf.placeholder(tf.int32, shape=[1])
}

# create the model
model = Model(config=FLAGS, alphabet=alphabet)

# evaluation model with beam search
graph_tensors = model(batch_tf,
                      keep_prob=1.,
                      sampling_probability=0,
                      beam_width=FLAGS.beam_width,
                      is_training=False)

# evaluation model greedy
graph_tensors_greedy = model(batch_tf,
                             keep_prob=1.,
                             sampling_probability=0,
                             beam_width=0,
                             is_training=False)

# configure a checkpoint saver.
saver = tf.compat.v1.train.Saver()

In [9]:
#@title Create session and restore parameters
# create a session
sess = tf.Session()

# restore model weights from previously saved model
saver.restore(sess, FLAGS.load_checkpoint)
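
`saver.restore` takes a checkpoint prefix rather than a single file, so a wrong path can be easy to miss. The following sketch checks for the files before restoring, assuming the standard TensorFlow checkpoint layout of `<prefix>.index` plus one or more `<prefix>.data-*` shards; the helper name is hypothetical and the demo uses empty placeholder files in a temporary directory.

```python
import os
import tempfile


def checkpoint_files_present(prefix):
    """True if `<prefix>.index` and at least one `<prefix>.data-*` file exist."""
    directory, base = os.path.split(prefix)
    directory = directory or '.'
    if not os.path.isdir(directory):
        return False
    names = os.listdir(directory)
    has_index = (base + '.index') in names
    has_data = any(name.startswith(base + '.data-') for name in names)
    return has_index and has_data


# Demo with placeholder files; the prefix mirrors the notebook's default name.
with tempfile.TemporaryDirectory() as tmp:
    prefix = os.path.join(tmp, 'model.ckpt-830000')
    missing = checkpoint_files_present(prefix)
    for suffix in ('.index', '.data-00000-of-00001'):
        open(prefix + suffix, 'w').close()
    present = checkpoint_files_present(prefix)
```

In the notebook, a call such as `checkpoint_files_present(FLAGS.load_checkpoint)` could be made before `saver.restore` to give a clearer error when the download step was skipped.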

# Pythia

- Use the "**input text**" field to enter text in ancient Greek.
- The missing characters are denoted by the "-" symbol and the characters to be predicted by the "_" symbol.
- For optimal usage and results avoid using additional special character notation.
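
Before running the forms, the input can be sanity-checked against the conventions listed above. The sketch below is a hypothetical helper, not part of the notebook. It assumes that the text should contain exactly one contiguous run of `_`, that the run length should lie within `pred_char_min` and `pred_char_max`, and that the text (plus the full stop appended by `generate_sample`) should fit in `context_char_max` characters so that it is not cropped.

```python
import re


def check_input(text, pred='_', pred_char_min=1, pred_char_max=10,
                context_char_max=1000):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    runs = re.findall(re.escape(pred) + '+', text)
    if len(runs) != 1:
        problems.append('expected exactly one run of "{}", found {}'.format(
            pred, len(runs)))
    elif not pred_char_min <= len(runs[0]) <= pred_char_max:
        problems.append('gap of {} characters is outside [{}, {}]'.format(
            len(runs[0]), pred_char_min, pred_char_max))
    # generate_sample appends '.' to the sentence before cropping.
    if len(text) + 1 > context_char_max:
        problems.append('text is longer than {} characters'.format(context_char_max))
    return problems
```

An empty list means the text follows these assumed conventions; it does not check that every character belongs to the model's alphabet.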

In [10]:
#@title Get top predictions

input_text = 'ἀπολλοδώρου εὐβοΐδος --ώνιος ἀττίνου εὐμενείας --- διονοσίου εὐμενείας --------------εστράτου ---εύς ------ιτουτων τῶν ἔξ ἄββου κώμης --- _________ου ---νος -----ου' #@param {type:"string"}

# Generate batch from text
texts = [Text(sentences=[input_text])]
batch = generate_sample(FLAGS, alphabet, texts, pad=True)

# Run graph
out_tensors = sess.run(graph_tensors, feed_dict={
  batch_tf['x']: batch['x'].reshape((1, -1)),
  batch_tf['x_len']: batch['x_len'].reshape((1)),
  batch_tf['x_word']: batch['x_word'].reshape((1, -1)),
  batch_tf['x_word_len']: batch['x_word_len'].reshape((1))
})

# Convert index outputs to strings
x = idx_to_text(batch['x'], alphabet)
print('input text: "{}"'.format(x))
print('predictions:')
for beam_i in range(min(20, FLAGS.beam_width)):
  y_pred = idx_to_text(out_tensors.sample[0, :, beam_i], alphabet)
  print('{}: "{}"'.format(beam_i, y_pred))

input text: "ἀπολλοδώρου εὐβοΐδος --ώνιος ἀττίνου εὐμενείας --- διονοσίου εὐμενείας --------------εστράτου ---εύς ------ιτουτων τῶν ἔξ ἄββου κώμης --- _________ου ---νος -----ου."
predictions:
0: "ἀπολλοδώρ"
1: "ἀρτεμιδώρ"
2: "ἀσκληπιάδ"
3: "ἀπολλωνίδ"
4: "ἀρολλοδώρ"
5: "ἀπολλωδώρ"
6: "ἀπολλοδόρ"
7: "ἀπτεμιδώρ"
8: "ἀπολλοδώδ"
9: "ἀπολλονίδ"
10: "ἀπολλονώρ"
11: "μενεστράτ"
12: "φιλοστράτ"
13: "ἀγολλοδώρ"
14: "νικοστράτ"
15: "ἀνολλοδώρ"
16: "ἀπολλοφάν"
17: "κπολλοδώρ"
18: "στρατονίκ"
19: "ἀπκληπιάδ"
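
Each hypothesis covers only the characters marked with "_", so reading it in context means substituting it back into the input. The sketch below shows one way to do that; `fill_prediction` is a hypothetical helper, not a function from the notebook, and it assumes the input contains exactly one run of "_" whose length matches the hypothesis.

```python
import re
import unicodedata


def fill_prediction(text, prediction):
    """Replace the single run of '_' in `text` with `prediction`."""
    # Normalise so that accented Greek letters count as one character each.
    text = unicodedata.normalize('NFC', text)
    prediction = unicodedata.normalize('NFC', prediction)
    runs = list(re.finditer(r'_+', text))
    if len(runs) != 1:
        raise ValueError('expected exactly one run of "_", found %d' % len(runs))
    start, end = runs[0].span()
    if end - start != len(prediction):
        raise ValueError('prediction length does not match the gap length')
    return text[:start] + prediction + text[end:]


# Insert the top-ranked hypothesis into an excerpt of the input.
print(fill_prediction('κώμης --- ' + '_' * 9 + 'ου', 'ἀπολλοδώρ'))
```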


In [11]:
#@title Visualise attention weight

input_text = '\u1F00\u03C0\u03BF\u03BB\u03BB\u03BF\u03B4\u03CE\u03C1\u03BF\u03C5 \u03B5\u1F50\u03B2\u03BF\u0390\u03B4\u03BF\u03C2 --\u03CE\u03BD\u03B9\u03BF\u03C2 \u1F00\u03C4\u03C4\u03AF\u03BD\u03BF\u03C5 \u03B5\u1F50\u03BC\u03B5\u03BD\u03B5\u03AF\u03B1\u03C2 --- \u03B4\u03B9\u03BF\u03BD\u03BF\u03C3\u03AF\u03BF\u03C5 \u03B5\u1F50\u03BC\u03B5\u03BD\u03B5\u03AF\u03B1\u03C2 --------------\u03B5\u03C3\u03C4\u03C1\u03AC\u03C4\u03BF\u03C5 ---\u03B5\u03CD\u03C2 ------\u03B9\u03C4\u03BF\u03C5\u03C4\u03C9\u03BD \u03C4\u1FF6\u03BD \u1F14\u03BE \u1F04\u03B2\u03B2\u03BF\u03C5 \u03BA\u03CE\u03BC\u03B7\u03C2 --- _________\u03BF\u03C5 ---\u03BD\u03BF\u03C2 -----\u03BF\u03C5' #@param {type:"string"}

# Generate batch from text
texts = [Text(sentences=[input_text])]
batch = generate_sample(FLAGS, alphabet, texts, pad=True)

# Run graph
out_tensors = sess.run(graph_tensors_greedy, feed_dict={
  batch_tf['x']: batch['x'].reshape((1, -1)),
  batch_tf['x_len']: batch['x_len'].reshape((1)),
  batch_tf['x_word']: batch['x_word'].reshape((1, -1)),
  batch_tf['x_word_len']: batch['x_word_len'].reshape((1))
})

# Convert index outputs to strings
x = idx_to_text(batch['x'], alphabet)
y_pred = idx_to_text(out_tensors.sample[0], alphabet)

# Display
html_src = generate_alignment(x, '', y_pred, out_tensors)
display(HTML(html_src))

0,1
ἀ,ἀπολλοδώρου εὐβοΐδος --ώνιος ἀττίνου εὐμενείας --- διονοσίου εὐμενείας --------------εστράτου ---εύς ------ιτουτων τῶν ἔξ ἄββου κώμης --- ?????????ου ---νος -----ου.
π,ἀπολλοδώρου εὐβοΐδος --ώνιος ἀττίνου εὐμενείας --- διονοσίου εὐμενείας --------------εστράτου ---εύς ------ιτουτων τῶν ἔξ ἄββου κώμης --- ?????????ου ---νος -----ου.
ο,ἀπολλοδώρου εὐβοΐδος --ώνιος ἀττίνου εὐμενείας --- διονοσίου εὐμενείας --------------εστράτου ---εύς ------ιτουτων τῶν ἔξ ἄββου κώμης --- ?????????ου ---νος -----ου.
λ,ἀπολλοδώρου εὐβοΐδος --ώνιος ἀττίνου εὐμενείας --- διονοσίου εὐμενείας --------------εστράτου ---εύς ------ιτουτων τῶν ἔξ ἄββου κώμης --- ?????????ου ---νος -----ου.
λ,ἀπολλοδώρου εὐβοΐδος --ώνιος ἀττίνου εὐμενείας --- διονοσίου εὐμενείας --------------εστράτου ---εύς ------ιτουτων τῶν ἔξ ἄββου κώμης --- ?????????ου ---νος -----ου.
ο,ἀπολλοδώρου εὐβοΐδος --ώνιος ἀττίνου εὐμενείας --- διονοσίου εὐμενείας --------------εστράτου ---εύς ------ιτουτων τῶν ἔξ ἄββου κώμης --- ?????????ου ---νος -----ου.
δ,ἀπολλοδώρου εὐβοΐδος --ώνιος ἀττίνου εὐμενείας --- διονοσίου εὐμενείας --------------εστράτου ---εύς ------ιτουτων τῶν ἔξ ἄββου κώμης --- ?????????ου ---νος -----ου.
ώ,ἀπολλοδώρου εὐβοΐδος --ώνιος ἀττίνου εὐμενείας --- διονοσίου εὐμενείας --------------εστράτου ---εύς ------ιτουτων τῶν ἔξ ἄββου κώμης --- ?????????ου ---νος -----ου.
ρ,ἀπολλοδώρου εὐβοΐδος --ώνιος ἀττίνου εὐμενείας --- διονοσίου εὐμενείας --------------εστράτου ---εύς ------ιτουτων τῶν ἔξ ἄββου κώμης --- ?????????ου ---νος -----ου.
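
The iterative cell that follows crops long inputs to a window of at most `max_len` characters around the gap currently being predicted. The sketch below isolates that cropping arithmetic as a standalone function so it can be checked without a TensorFlow session. The name `crop_window` and its signature are assumptions made for illustration; the arithmetic mirrors the notebook's loop.

```python
def crop_window(length, start, end, max_len=500):
    """Return (x_start, x_end) for a window of at most max_len around [start, end)."""
    if length <= max_len:
        # Short inputs are passed to the model unchanged.
        return 0, length
    half = max_len // 2
    gap_half = (end - start) // 2
    if start - half < 0:
        # Gap near the beginning: keep the first max_len characters.
        return 0, max_len
    if end + half > length:
        # Gap near the end: keep the last max_len characters.
        return length - max_len, length
    # Otherwise centre the window on the gap.
    x_start = start - half + gap_half
    x_end = end + half - gap_half - 1
    return x_start, x_end


# A 10-character gap in the middle of a 1000-character text.
print(crop_window(1000, 500, 510))
```

Centring the window keeps roughly equal left and right context for gaps in the middle of a long inscription, while gaps near either edge fall back to the first or last `max_len` characters.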


In [12]:
#@title Predict all missing parts iteratively

input_text = 'θε-ί ἐπὶ νικοφήμο ἄρχοντος συμμαχία ἀθηναίων καὶ θετταλῶν εἰς τὸν ἀεὶ χρόνον. ἔδοξεν τ-ι -ουλῆι κα- τῶι δήμωι λ-ωντὶς ἐπρυτάνευεν χαιρ-ων χαριναύ-ο φαληρεὺ- ἐγραμμάτευεν ἄρχιππος ἀμφ-τροπῆθε- ἐπεστάτει δωδεκάτει τῆς πρυτανείας ἐ-ηκεστίδης εἶπεν -ε-- ὧν λέγουσιν οἱ π-έσβεις τῶν θετταλῶ- ἐψηφίσθα- τῶι δ-μωι δέχεσθαι τὴν συμμαχίαν τύχ-ι ἀγαθῆι κ-θὰ ἐπ-νγέλλοντα- οἱ θετταλο-. εἶναι δὲ αὐ-ο-ς τὴ- συμμ-χίαν πρὸς ἀθηναίος εἰς -ὸν αἰεὶ χρόνον. εἶ-αι δὲ καὶ τοὺς ἀθηναίων συμμ-χ-ς ἅπαντας θετταλῶ- συμμ-χος καὶ τὸς -ετταλῶν ἀ--ναίων. ὀμόσαι δὲ ἀ--ναίων μὲν τὸς στρ---γὸς καὶ τ-ν βολὴν καὶ τὸς ἱππάρχος καὶ τὸς ἱππέ-ς τόνδε τὸν ὅρκον βοηθήσω π-ντὶ σθένει κατὰ τὸ δυνατόν ἐάν τι- ἴηι ἐπὶ τὸ κοινὸν τὸ θετταλῶν ἐπὶ πολ--ωι ἢ τὸν ἄρχοντα καταλύει ὃν εἵλοντο θετταλοί ἢ -ύραννον καθ-στῆι ἐν θετταλίαι ἐπομνύναι δὲ τὸν --μιμον ὅρκον. ὅπως δ -ν καὶ θετταλοὶ ὁμόσωσι τῆι π--ει ἑ-έσθα----ν δῆμον πέντε ἄν--ας ἐ- ἀθηναίων ἁπά-των οἵτινες ἀφικόμενοι εἰς θετταλία- ἐξορκώ-οσιν ἀγέλαο---ὸν ἄρχοντα καὶ τὸς -ολ-μά-χος καὶ τὸς ἱ-πάρχος καὶ τὸς ἱππέ-ς καὶ τὸ-----ο--ήμονας καὶ τοὺς ἄλλο- ἄρχοντας ὁπόσοι ὑπὲ- το κοινο το θετταλῶν ἄρχοσ-ν τόνδε τὸν ὅρκον βο-θ--ω παντὶ σθένει κατὰ τὸ δυνατόν ἐάν τις ἴ-ι ἐπὶ τὴν πόλιν τὴν ἀθ--αίων ἐπὶ πολέμωι ἢ τὸν δῆμον καταλύει τὸν ἀθηνα--- ὀμόσαι δὲ -αὶ τὸς πρέσβεις τὸς τῶν θετταλῶν ἐν τ-ι βολῆι τὸς ---δημο-τα- ἀθήνησιν τὸν αὐ-ὸ- ὁ-κο---ὸ- δὲ πόλεμον τὸν πρὸς ἀλέξανδρον τὸν μὴ -----α- κ----ύσασθαι ---- θετταλοῖς -νευ ἀθηναί------- ἀ---αίοις ἄ------ ἄρχοντος καὶ τοῦ κοινοῦ ------------. ἐπαιν-σα---- ἀγέλαον τὸν ἄρχοντα ------------- τῶν θετ---ῶν ὅτι εὖ κ-ὶ προθύμ-ς ἐπ----------- περὶ ὧν αὐ-ο-ς - πόλ-ς ἐ-η-γείλ--ο ἐπ------ι ------ τὸς πρέ----- τῶν -ετταλῶν τὸ----ον--- κ-- κ----αι αὐτὸς -----ένια -ἰς -----υτα--ῖον --- αὐρι------ν δὲ στ-λ-----ν πρὸ- ἀλ---νδ-ον --θελ--ν τὸς -----ς τῆς θεο -----ερ----ς -υμμαχία-. τοῖς δὲ πρέσ------οναι τὸν ----αν τ-ῦ ---ο εἰς ἐφόδια δδ δραχ--- ἑκάστωι. 
τὴ---- συμ--χί-- τή-δε ἀναγράψαι τὸν ---μ-ατέα τῆς β---ς ἐν -τ---- λιθίνη-------τῆσαι -ν ἀκ-ο-όλε- ε-ς -ὲ --ν -------ὴν τῆς -τ-λη- δονα- τὸν ταμίαν το δή-- 0 --α---ς εἶναι δὲ καὶ -ε--τητον τὸν ἐρχιέα ὡ- λέγο-τα --ιστα --ὶ --άττοντα ὅ -ι ἂν δύνηται ἀγα--ν τῶ-----ω- τῶι ἀ---α-ω----ὶ θετταλ-ῖς ἐν τῶι τεταγμέ-ωι.' #@param {type:"string"}
decoding = 'beam search' #@param ["beam search", "greedy"]

greedy = decoding == 'greedy'
x = np.array(list(input_text))

print(0, ''.join(x))

# Find all chunks of missing characters with max length of FLAGS.pred_char_max
for i, m in enumerate(re.finditer(r'-{1,%d}' % FLAGS.pred_char_max, input_text)):

  # Replace missing characters with prediction character
  x[m.span()[0]:m.span()[1]] = '_'

  # If x is longer than max_len characters, crop a window around the gap
  max_len = 500
  max_len_05 = int(max_len / 2)
  if len(x) > max_len:
    pred_len = m.span()[1] - m.span()[0]
    pred_len_05 = int(pred_len / 2)
    if m.span()[0] - max_len_05 < 0:
      x_start = 0
      x_end = max_len
    elif m.span()[1] + max_len_05 > len(x):
      x_start = len(x) - max_len
      x_end = len(x)
    else:
      x_start = max(0, m.span()[0] - max_len_05 + pred_len_05)
      x_end = min(len(x), m.span()[1] + max_len_05 - pred_len_05 - 1)
    assert x_end - x_start <= max_len
  else:
    x_start = 0
    x_end = len(x)

  # Generate batch from text
  texts = [Text(sentences=[''.join(x[x_start:x_end])])]
  batch = generate_sample(FLAGS, alphabet, texts, pad=True)

  # Run graph

  out_tensors = sess.run(
      graph_tensors_greedy if greedy else graph_tensors,
      feed_dict={
        batch_tf['x']: batch['x'].reshape((1, -1)),
        batch_tf['x_len']: batch['x_len'].reshape((1)),
        batch_tf['x_word']: batch['x_word'].reshape((1, -1)),
        batch_tf['x_word_len']: batch['x_word_len'].reshape((1))
      })

  # Convert index outputs to strings (keep only the top-ranked beam hypothesis)
  if greedy:
    y_pred = idx_to_text(out_tensors.sample[0, :], alphabet)
  else:
    y_pred = idx_to_text(out_tensors.sample[0, :, 0], alphabet)

  # Fill the missing part with prediction
  x[m.span()[0]:m.span()[1]] = list(y_pred)

  print(i + 1, ''.join(x))

0 θε-ί ἐπὶ νικοφήμο ἄρχοντος συμμαχία ἀθηναίων καὶ θετταλῶν εἰς τὸν ἀεὶ χρόνον. ἔδοξεν τ-ι -ουλῆι κα- τῶι δήμωι λ-ωντὶς ἐπρυτάνευεν χαιρ-ων χαριναύ-ο φαληρεὺ- ἐγραμμάτευεν ἄρχιππος ἀμφ-τροπῆθε- ἐπεστάτει δωδεκάτει τῆς πρυτανείας ἐ-ηκεστίδης εἶπεν -ε-- ὧν λέγουσιν οἱ π-έσβεις τῶν θετταλῶ- ἐψηφίσθα- τῶι δ-μωι δέχεσθαι τὴν συμμαχίαν τύχ-ι ἀγαθῆι κ-θὰ ἐπ-νγέλλοντα- οἱ θετταλο-. εἶναι δὲ αὐ-ο-ς τὴ- συμμ-χίαν πρὸς ἀθηναίος εἰς -ὸν αἰεὶ χρόνον. εἶ-αι δὲ καὶ τοὺς ἀθηναίων συμμ-χ-ς ἅπαντας θετταλῶ- συμμ-χος καὶ τὸς -ετταλῶν ἀ--ναίων. ὀμόσαι δὲ ἀ--ναίων μὲν τὸς στρ---γὸς καὶ τ-ν βολὴν καὶ τὸς ἱππάρχος καὶ τὸς ἱππέ-ς τόνδε τὸν ὅρκον βοηθήσω π-ντὶ σθένει κατὰ τὸ δυνατόν ἐάν τι- ἴηι ἐπὶ τὸ κοινὸν τὸ θετταλῶν ἐπὶ πολ--ωι ἢ τὸν ἄρχοντα καταλύει ὃν εἵλοντο θετταλοί ἢ -ύραννον καθ-στῆι ἐν θετταλίαι ἐπομνύναι δὲ τὸν --μιμον ὅρκον. ὅπως δ -ν καὶ θετταλοὶ ὁμόσωσι τῆι π--ει ἑ-έσθα----ν δῆμον πέντε ἄν--ας ἐ- ἀθηναίων ἁπά-των οἵτινες ἀφικόμενοι εἰς θετταλία- ἐξορκώ-οσιν ἀγέλαο---ὸν ἄρχοντα καὶ τὸς -ολ-μά-χος