## Homework

1. Read @Brezina2018 [ch. 2, pp. 41--65].
2. Choose 3 books of Pausanias and calculate the most common tokens, types, and lemmata for each. In a paragraph or so, describe your findings relative to the work we have done in class today.
3. Using your findings from 2., write a short (1-page) evaluation of one of the books of Pausanias that you have analyzed. Does your qualitative -- which is not to say "subjective" -- experience of reading the text cohere with your quantitative evaluation?

In [2]:
%pip install MyCapytain

Note: you may need to restart the kernel to use updated packages.


In [3]:
from MyCapytain.resources.texts.local.capitains.cts import CapitainsCtsText

with open("../tei/tlg0525.tlg001.perseus-eng2.xml") as f:
    text = CapitainsCtsText(urn="urn:cts:greekLit:tlg0525.tlg001.perseus-eng2", resource=f)

In [4]:

from lxml import etree
from MyCapytain.common.constants import Mimetypes

urns = []
raw_xmls = []
unannotated_strings = []

for ref in text.getReffs(level=len(text.citation)):
    urn = f"{text.urn}:{ref}"
    node = text.getTextualNode(ref)
    raw_xml = node.export(Mimetypes.XML.TEI)
    tree = node.export(Mimetypes.PYTHON.ETREE)
    s = etree.tostring(tree, encoding="unicode", method="text")

    urns.append(urn)
    raw_xmls.append(raw_xml)
    unannotated_strings.append(s)

In [5]:
# install the latest version of numpy 1, instead of pandas' numpy 2
%pip install numpy==1.26.4

%pip install pandas

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


In [6]:
import pandas as pd

d = {
    "urn": pd.Series(urns, dtype="string"),
    "raw_xml": raw_xmls,
    "unannotated_strings": pd.Series(unannotated_strings, dtype="string")
}
pausanias_df = pd.DataFrame(d)

In [7]:
# See https://pandas.pydata.org/docs/reference/api/pandas.Series.str.split.html for
# panda's string-splitting utilities; it splits on whitespace by default
pausanias_df['whitespaced_tokens'] = pausanias_df['unannotated_strings'].str.split()

pausanias_df

Unnamed: 0,urn,raw_xml,unannotated_strings,whitespaced_tokens
0,urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...,"<TEI xmlns=""http://www.tei-c.org/ns/1.0"" xmlns...",On the Greek mainland facing the Cyclades Isla...,"[On, the, Greek, mainland, facing, the, Cyclad..."
1,urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...,"<TEI xmlns=""http://www.tei-c.org/ns/1.0"" xmlns...","The Peiraeus was a parish from early times, th...","[The, Peiraeus, was, a, parish, from, early, t..."
2,urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...,"<TEI xmlns=""http://www.tei-c.org/ns/1.0"" xmlns...",The most noteworthy sight in the Peiraeus is a...,"[The, most, noteworthy, sight, in, the, Peirae..."
3,urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...,"<TEI xmlns=""http://www.tei-c.org/ns/1.0"" xmlns...","The Athenians have also another harbor, at Mun...","[The, Athenians, have, also, another, harbor,,..."
4,urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...,"<TEI xmlns=""http://www.tei-c.org/ns/1.0"" xmlns...",Twenty stades away is the Coliad promontory; o...,"[Twenty, stades, away, is, the, Coliad, promon..."
...,...,...,...,...
3165,urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...,"<TEI xmlns=""http://www.tei-c.org/ns/1.0"" xmlns...","These, then, live above Amphissa. On the coast...","[These,, then,, live, above, Amphissa., On, th..."
3166,urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...,"<TEI xmlns=""http://www.tei-c.org/ns/1.0"" xmlns...",I gather that the city got its name from a wom...,"[I, gather, that, the, city, got, its, name, f..."
3167,urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...,"<TEI xmlns=""http://www.tei-c.org/ns/1.0"" xmlns...",The epic poem called the Naupactia by the Gree...,"[The, epic, poem, called, the, Naupactia, by, ..."
3168,urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...,"<TEI xmlns=""http://www.tei-c.org/ns/1.0"" xmlns...",Here there is on the coast a temple of Poseido...,"[Here, there, is, on, the, coast, a, temple, o..."


In [8]:
def get_book_of_pausanias(df: pd.DataFrame, book_n: int):
    return df[df['urn'].str.startswith(f"urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:{book_n}")]

In [9]:
book7 = get_book_of_pausanias(pausanias_df, 7)
book1 = get_book_of_pausanias(pausanias_df, 1)
book3 = get_book_of_pausanias(pausanias_df, 3)

In [10]:
book7['whitespaced_tokens'].explode().count()

23779

In [11]:
len(book7['whitespaced_tokens'].explode().unique())

4935

In [12]:
from collections import Counter
types = book7['whitespaced_tokens'].explode()

type_counts = Counter(types)

print(type_counts.most_common(100))

[('the', 2288), ('of', 1210), ('to', 758), ('and', 720), ('a', 451), ('in', 367), ('is', 307), ('that', 278), ('was', 266), ('by', 249), ('from', 228), ('they', 198), ('at', 187), ('The', 183), ('with', 181), ('he', 171), ('on', 160), ('his', 152), ('their', 152), ('were', 151), ('for', 149), ('had', 138), ('who', 129), ('it', 129), ('but', 119), ('as', 115), ('Achaeans', 106), ('an', 92), ('not', 89), ('them', 87), ('are', 87), ('this', 87), ('be', 83), ('son', 80), ('which', 69), ('when', 65), ('sanctuary', 65), ('also', 62), ('against', 59), ('city', 59), ('have', 58), ('all', 56), ('But', 55), ('made', 55), ('been', 54), ('called', 53), ('into', 52), ('one', 51), ('there', 49), ('I', 49), ('image', 49), ('people', 47), ('no', 47), ('after', 46), ('time', 45), ('him', 44), ('other', 44), ('before', 41), ('because', 41), ('came', 40), ('When', 40), ('up', 40), ('Achaean', 40), ('Romans', 39), ('river', 38), ('name', 37), ('about', 37), ('under', 37), ('Lacedaemonians', 37), ('too', 3

In [None]:
from collections import Counter
types = book1['whitespaced_tokens'].explode()

type_counts = Counter(types)

print(type_counts.most_common(100))

In [None]:
from collections import Counter
types = book3['whitespaced_tokens'].explode()

type_counts = Counter(types)

print(type_counts.most_common(100))

In [13]:
%pip install spacy

Note: you may need to restart the kernel to use updated packages.


In [14]:
%run -m spacy download en
%run -m spacy download en_core_web_sm

[38;5;3m⚠ As of spaCy v3.0, shortcuts like 'en' are deprecated. Please use the
full pipeline package name 'en_core_web_sm' instead.[0m
Collecting en-core-web-sm==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m12.8/12.8 MB[0m [31m44.0 MB/s[0m eta [36m0:00:00[0m00:01[0m
[38;5;2m✔ Download and installation successful[0m
You can now load the package via spacy.load('en_core_web_sm')
[38;5;3m⚠ Restart to reload dependencies[0m
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.
Collecting en-core-web-sm==3.7.1
  Using cached https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
[38;5;2m✔ Downloa

In [15]:
import spacy

nlp = spacy.load("en_core_web_sm")

In [16]:
tokenizer = nlp.tokenizer

In [17]:
def commontypes(df: pd.DataFrame):
    from collections import Counter
    
    df['tokens'] = df['unannotated_strings'].apply(tokenizer)

    types = [t.text for t in df['tokens'].explode() if not t.is_stop and t.is_alpha]

    type_counts = Counter(types)

    return type_counts.most_common(100)
    

In [18]:
commontypes(book7)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['tokens'] = df['unannotated_strings'].apply(tokenizer)


[('Achaeans', 163),
 ('son', 85),
 ('city', 81),
 ('sanctuary', 71),
 ('time', 59),
 ('called', 58),
 ('Romans', 58),
 ('image', 57),
 ('people', 53),
 ('Lacedaemonians', 49),
 ('Ionians', 48),
 ('war', 43),
 ('Achaean', 42),
 ('land', 41),
 ('river', 41),
 ('Athenians', 40),
 ('came', 40),
 ('Philip', 37),
 ('Patrae', 36),
 ('sea', 35),
 ('men', 35),
 ('sent', 33),
 ('images', 32),
 ('Greece', 30),
 ('man', 29),
 ('Rome', 29),
 ('place', 28),
 ('cities', 28),
 ('Greeks', 27),
 ('said', 26),
 ('Roman', 26),
 ('took', 25),
 ('temple', 25),
 ('Athena', 24),
 ('senate', 24),
 ('king', 23),
 ('Helice', 23),
 ('ancient', 23),
 ('Sparta', 23),
 ('given', 23),
 ('Apollo', 23),
 ('god', 23),
 ('army', 22),
 ('Artemis', 22),
 ('water', 22),
 ('stades', 22),
 ('received', 21),
 ('set', 21),
 ('brought', 21),
 ('Poseidon', 21),
 ('love', 21),
 ('League', 21),
 ('sons', 20),
 ('Athens', 20),
 ('found', 20),
 ('women', 20),
 ('Pellene', 20),
 ('right', 20),
 ('Metellus', 20),
 ('Diaeus', 20),
 ('or

In [19]:
commontypes(book1)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['tokens'] = df['unannotated_strings'].apply(tokenizer)


[('son', 288),
 ('called', 204),
 ('Athenians', 182),
 ('city', 154),
 ('Apollo', 135),
 ('said', 109),
 ('men', 108),
 ('temple', 106),
 ('came', 106),
 ('Greeks', 105),
 ('sanctuary', 104),
 ('time', 100),
 ('war', 93),
 ('daughter', 92),
 ('Delphi', 86),
 ('Theseus', 82),
 ('bronze', 80),
 ('man', 80),
 ('dedicated', 79),
 ('Athena', 78),
 ('image', 76),
 ('killed', 76),
 ('god', 75),
 ('place', 74),
 ('Athens', 74),
 ('king', 73),
 ('Zeus', 71),
 ('Phocians', 70),
 ('army', 69),
 ('sea', 68),
 ('death', 67),
 ('statue', 67),
 ('day', 66),
 ('battle', 66),
 ('land', 64),
 ('says', 61),
 ('story', 61),
 ('water', 61),
 ('near', 59),
 ('people', 59),
 ('took', 58),
 ('sent', 57),
 ('account', 57),
 ('river', 54),
 ('Pyrrhus', 54),
 ('Ptolemy', 53),
 ('Artemis', 53),
 ('statues', 53),
 ('built', 52),
 ('Heracles', 52),
 ('road', 52),
 ('Athenian', 51),
 ('women', 48),
 ('gave', 48),
 ('old', 48),
 ('island', 47),
 ('children', 46),
 ('away', 46),
 ('Homer', 46),
 ('given', 46),
 ('havi

In [20]:
commontypes(book3)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['tokens'] = df['unannotated_strings'].apply(tokenizer)


[('son', 155),
 ('Lacedaemonians', 108),
 ('sanctuary', 94),
 ('called', 91),
 ('image', 71),
 ('place', 56),
 ('city', 46),
 ('temple', 46),
 ('king', 45),
 ('Heracles', 45),
 ('time', 45),
 ('Sparta', 44),
 ('stades', 42),
 ('Apollo', 37),
 ('said', 37),
 ('sea', 35),
 ('came', 35),
 ('Athena', 35),
 ('war', 34),
 ('sons', 33),
 ('Artemis', 33),
 ('throne', 32),
 ('Athenians', 29),
 ('Agesilaus', 28),
 ('land', 27),
 ('left', 27),
 ('people', 27),
 ('road', 27),
 ('Zeus', 26),
 ('took', 25),
 ('man', 25),
 ('Asclepius', 25),
 ('Cleomenes', 25),
 ('set', 24),
 ('daughter', 24),
 ('brought', 24),
 ('far', 24),
 ('Pausanias', 24),
 ('Agis', 23),
 ('statue', 23),
 ('hero', 22),
 ('Tyndareus', 21),
 ('oracle', 21),
 ('army', 21),
 ('old', 21),
 ('away', 20),
 ('tomb', 20),
 ('house', 20),
 ('battle', 20),
 ('Achilles', 20),
 ('Dionysus', 20),
 ('named', 19),
 ('death', 19),
 ('water', 19),
 ('Lacedaemon', 19),
 ('account', 19),
 ('Argives', 19),
 ('won', 19),
 ('god', 19),
 ('bronze', 19)

In [21]:
# raw_texts = [t for t in pausanias_df['unannotated_strings']]
# annotated_texts = nlp.pipe(raw_texts, batch_size=100)

# pausanias_df['nlp_docs'] = list(annotated_texts)

In [22]:
def commonlemmata(df: pd.DataFrame):
    from collections import Counter
    raw_texts = [t for t in df['unannotated_strings']]
    annotated_texts = nlp.pipe(raw_texts, batch_size=100)
    df['nlp_docs'] = list(annotated_texts)
    lemmata = [t.lemma_ for t in df['nlp_docs'].explode() if not t.is_stop and t.is_alpha]

    lemmata_counts = Counter(lemmata)

    return lemmata_counts.most_common(100)

In [23]:
commonlemmata(book7)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['nlp_docs'] = list(annotated_texts)


[('Achaeans', 163),
 ('city', 109),
 ('son', 105),
 ('image', 89),
 ('sanctuary', 81),
 ('come', 65),
 ('man', 64),
 ('call', 61),
 ('time', 60),
 ('Romans', 58),
 ('people', 54),
 ('Ionians', 48),
 ('Lacedaemonians', 47),
 ('war', 46),
 ('send', 44),
 ('land', 43),
 ('take', 43),
 ('river', 43),
 ('say', 41),
 ('Athenians', 40),
 ('give', 38),
 ('bring', 38),
 ('Philip', 37),
 ('Patrae', 36),
 ('sea', 35),
 ('place', 34),
 ('great', 33),
 ('god', 32),
 ('Greece', 30),
 ('Rome', 29),
 ('king', 28),
 ('Greeks', 27),
 ('woman', 27),
 ('hold', 27),
 ('temple', 27),
 ('find', 26),
 ('sacrifice', 26),
 ('stand', 25),
 ('Athena', 24),
 ('senate', 24),
 ('roman', 24),
 ('Helice', 23),
 ('ancient', 23),
 ('Sparta', 23),
 ('Apollo', 23),
 ('army', 22),
 ('receive', 22),
 ('set', 22),
 ('Artemis', 22),
 ('water', 22),
 ('love', 22),
 ('Achaean', 22),
 ('stade', 22),
 ('name', 21),
 ('oracle', 21),
 ('hand', 21),
 ('Poseidon', 21),
 ('spring', 21),
 ('League', 21),
 ('right', 21),
 ('Athens', 20)

In [24]:
commonlemmata(book1)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['nlp_docs'] = list(annotated_texts)


[('son', 316),
 ('call', 214),
 ('city', 194),
 ('man', 191),
 ('say', 184),
 ('Athenians', 182),
 ('come', 169),
 ('Apollo', 135),
 ('statue', 121),
 ('take', 120),
 ('image', 117),
 ('sanctuary', 115),
 ('time', 112),
 ('temple', 111),
 ('give', 111),
 ('god', 110),
 ('daughter', 107),
 ('Greeks', 105),
 ('war', 96),
 ('place', 96),
 ('king', 96),
 ('kill', 91),
 ('day', 86),
 ('Delphi', 86),
 ('woman', 84),
 ('Theseus', 82),
 ('land', 80),
 ('bronze', 80),
 ('near', 79),
 ('dedicate', 79),
 ('Athena', 78),
 ('send', 78),
 ('army', 74),
 ('Athens', 74),
 ('great', 73),
 ('stand', 72),
 ('Zeus', 71),
 ('hold', 71),
 ('see', 71),
 ('sea', 70),
 ('battle', 70),
 ('Phocians', 70),
 ('know', 68),
 ('death', 67),
 ('name', 66),
 ('account', 65),
 ('bring', 64),
 ('story', 64),
 ('oracle', 64),
 ('people', 64),
 ('water', 62),
 ('child', 61),
 ('work', 61),
 ('old', 61),
 ('build', 60),
 ('river', 58),
 ('far', 57),
 ('fight', 57),
 ('carry', 56),
 ('island', 55),
 ('think', 55),
 ('road', 

In [25]:
commonlemmata(book3)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['nlp_docs'] = list(annotated_texts)


[('son', 188),
 ('Lacedaemonians', 108),
 ('sanctuary', 102),
 ('call', 97),
 ('image', 86),
 ('place', 58),
 ('city', 58),
 ('come', 57),
 ('say', 52),
 ('king', 51),
 ('temple', 50),
 ('time', 48),
 ('Sparta', 44),
 ('stade', 44),
 ('man', 43),
 ('Heracles', 41),
 ('daughter', 38),
 ('war', 38),
 ('Apollo', 37),
 ('take', 36),
 ('sea', 35),
 ('bring', 35),
 ('Athena', 35),
 ('Artemis', 33),
 ('god', 33),
 ('far', 33),
 ('throne', 32),
 ('Athenians', 29),
 ('give', 28),
 ('sacrifice', 28),
 ('Agesilaus', 28),
 ('statue', 28),
 ('land', 27),
 ('people', 27),
 ('road', 27),
 ('Zeus', 26),
 ('old', 26),
 ('name', 25),
 ('set', 25),
 ('lead', 25),
 ('win', 25),
 ('Asclepius', 25),
 ('stand', 25),
 ('surname', 25),
 ('hero', 24),
 ('hold', 24),
 ('water', 23),
 ('tomb', 23),
 ('Agis', 23),
 ('Pausanias', 23),
 ('victory', 23),
 ('town', 22),
 ('house', 22),
 ('oracle', 22),
 ('Cleomenes', 22),
 ('army', 22),
 ('leave', 21),
 ('die', 21),
 ('Tyndareus', 21),
 ('go', 21),
 ('account', 21),
 

In [26]:
print(book1)

                                                    urn  \
0     urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...   
1     urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...   
2     urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...   
3     urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...   
4     urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...   
...                                                 ...   
3165  urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...   
3166  urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...   
3167  urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...   
3168  urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...   
3169  urn:cts:greekLit:tlg0525.tlg001.perseus-eng2:1...   

                                                raw_xml  \
0     <TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns...   
1     <TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns...   
2     <TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns...   
3     <TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns...

Findings in relation to work done in class:
I believe the transition from tokens to types to lemmata is extremely critical for quantitative textual analysis to be meaningful. For example, the words 'men' and 'man' are very similar and pretty much represent the same thing/ have the same implications. Thus, by combining their counts through lemmatization, we are able to get a much more accurate and insightful idea of the text to be studied. Especially in the context of these books where proper nouns take different forms (Athens/ Athenians, Greece/ Greek), the aforementioned process is helpful to correlate closely resembled words.

Question 3. Summary of Book 1:
In Book 1, Pausanias focuses on Attica, the region of Athens, one of the most famous and culturally significant parts of ancient Greece. Athenians (Athens), Greeks and Apollo are some of the most recurring lemmata in book 1, almost certainly iterating that the setting of the book is Athens, Greece. This is easily discernible given the frequency of appearances of these words in our quantitative analysis.

Pausanias provides an extensive description of the Acropolis, the citadel of Athens, and its significant temples and monuments. Pausanias details the art and sculptures housed in these temples, which we can clearly link to our quantitative analysis- given that temples, statues and sanctuaries were in the top 20 most common words used in the book. Pausanias describes the Athenian Agora, the social and political heart of the city, pointing out the various temples, statues, and stoas. Throughout Athens, Pausanias records numerous statues of gods, heroes, and important figures. 

Pausanias is particularly interested in the temples of Athens and surrounding Attica. The temple of Athena on the Acropolis, the temple of Olympian Zeus, and the sanctuary at Eleusis are key highlights of his description. Gods like Athena, Zeus, and Apollo are central to the religious life of the Athenians, as Pausanias illustrates by detailing the architecture and religious ceremonies.

Pausanias frequently refers to historical events such as the wars that shaped the history of Athens, including the Battle of Marathon. He also mentions various kings, both mythical and historical, like Theseus, who is closely tied to the legends of Athens. His discussions of war and leadership reflect the Athenians' military prowess and the legacy of their past victories. This is evident from our quantitative analysis where the words 'war', 'king', 'kill' appeared frequently- all together indicative of conquest. Thus, I can reiterate that my quantitative analysis matches the qualitative observation through reading about historical events.

The frequent references to family relationships, particularly sons and daughters, echo Pausanias' recounting of Greek myths and genealogies. For example, Athena (who is symbolically considered the daughter of Zeus) is revered in Athens, and mythological stories often emphasize the familial connections between gods and heroes. The sons and daughters of kings and gods feature prominently in the stories of the city. That is why our quantitative analysis of the most common lemmata featured sons and daughters- simply because of the way Greek mythological gods are referred to in society. 

Overall, Pausanias intersperses his description with historical accounts of wars, political events, and important figures that shaped Athens. He often contrasts Athens' present with its glorious past, lamenting how many of the city's wonders were destroyed or decayed over time. I believe the quantitative analysis conducted does give reasonable insight into the contents of book 1 without reading- as by grouping themed words together- we can assert the setting, history and other important events and landmarks.