<a href="https://colab.research.google.com/github/ryderwishart/biblical-machine-learning/blob/main/macula_data_overview.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Simple TSV Data Exploration Notebook
This notebook loads and explores TSV data files for Greek. Hebrew support is not yet implemented.

## Setup
First, import the necessary libraries.

In [10]:
import pandas as pd
import os
import random

## Download and Load Data
Download the TSV file with the `!wget` shell command, then load it with pandas.

Choose which language's data to load.

In [11]:
language = "Greek"  # Change this to "Hebrew" to load Hebrew data - not yet implemented

In [12]:
if language == "Greek":
    # Download the Greek TSV only if it is not already in the working directory
    if "macula-greek.tsv" not in os.listdir():
        !wget -O macula-greek.tsv https://raw.githubusercontent.com/Clear-Bible/macula-greek/main/Nestle1904/TSV/macula-greek.tsv
    tsv_file = "macula-greek.tsv"
elif language == "Hebrew":
    if "macula-hebrew.tsv" not in os.listdir():
        !wget -O macula-hebrew.tsv < URL_FOR_HEBREW_TSV > # FIXME: need to use GIT LFS for this one
    tsv_file = "macula-hebrew.tsv"
else:
    raise ValueError("Invalid language choice")
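Outside Colab, `wget` may not be installed. The following is a minimal sketch of the same download-if-missing logic in plain Python, assuming the standard library's `urllib.request.urlretrieve`. The helper name `download_if_missing` and its `fetch` parameter are illustrative and not part of the notebook.

```python
import os
from urllib.request import urlretrieve

# URL taken from the Greek branch of the download cell
GREEK_TSV_URL = (
    "https://raw.githubusercontent.com/Clear-Bible/macula-greek/"
    "main/Nestle1904/TSV/macula-greek.tsv"
)


def download_if_missing(url, dest, fetch=urlretrieve):
    """Download `url` to `dest` unless `dest` already exists; return `dest`.

    `fetch` is a hypothetical hook so the download function can be swapped
    out (for example in tests); it defaults to urllib's urlretrieve.
    """
    if not os.path.exists(dest):
        fetch(url, dest)
    return dest


# Usage (performs a network request the first time it runs):
# tsv_file = download_if_missing(GREEK_TSV_URL, "macula-greek.tsv")
```

Passing the download function as a parameter keeps the existence check testable without network access.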

Load the chosen TSV file

In [13]:
data = pd.read_csv(tsv_file, sep="\t")

## Column Information
The following sections describe each column header, the possible values, and their meanings.

In [14]:
# Create an empty list to store dictionaries for each column
column_info_list = []

# Iterate through each column in the data DataFrame
for column in data.columns:
    num_unique_values = data[column].nunique()
    unique_values = data[column].unique()
    column_info = {"column_name": column,
                   "num_unique_values": num_unique_values,
                   "unique_values": unique_values}
    column_info_list.append(column_info)

# Convert the list of dictionaries into a DataFrame
overview = pd.concat([pd.Series(d) for d in column_info_list], axis=1).T

# Create a dictionary with attribute descriptions
attribute_descriptions = {
    "after": "Encodes the following character, including a blank space.",
    "articular": "'true' if the word has an article (i.e., modified by the word 'the').",
    "case": "Grammatical case: nominative, genitive, dative, accusative, or vocative",
    "class": "On words, the class is the word's part of speech",
    "cltype": "Explicitly marks Verbless Clauses, Verb Elided Clauses, and Minor Clauses",
    "degree": "A derivative lexical category, indicating the degree of the adjective",
    "discontinuous": "'true' if the word is discontinuous with respect to sentence order due to reordering in the syntax tree",
    "domain": "Semantic domain information from the Semantic Dictionary of Biblical Greek (SDBG)",
    "frame": "Frames of verbs, refers to the arguments of the verb",
    "gender": "Grammatical gender values",
    "gloss": "SIL data, not Berean",
    "lemma": "Form of the word as it appears in a dictionary.",
    "ln": "Short for Louw-Nida, representing the semantic domain entry in Johannes P. Louw and Eugene Albert Nida, Greek-English Lexicon of the New Testament: Based on Semantic Domains (New York: United Bible Societies, 1996).",
    "mood": "Grammatical mood",
    "morph": "Morphological parsing codes",
    "normalized": "The normalized form of the token (i.e., no trailing or leading punctuation or accent shifting depending on context)",
    "number": "Grammatical number",
    "person": "Grammatical person",
    "ref": "Verse!word reference to this edition of the Nestle1904 text by USFM id",
    "referent": "The xml:id of the node to which a pronoun (i.e., 'he') refers. Note that some of these IDs are not word IDs but rather phrase or clause IDs.",
    "role": "The clause-level role of the word.",
    "strong": "Strong's number for the lemma",
    "subjref": "The xml:id of the node that is the implied subject of a verb (for verbs without an explicit subject). Note that some of these IDs are not word IDs but rather phrase or clause IDs.",
    "tense": "Grammatical tense form",
    "text": "Text content associated with the ID",
    "type": "Indicates different types of pronominals",
    "voice": "Grammatical voice",
    "xml:id": "XML ids occur on every word and encode the corpus ('n' for New Testament), the book (40 for Matthew), the chapter (001), verse (001), and word (001)."
}

# Add descriptions to the overview dataframe
overview["description"] = overview["column_name"].map(attribute_descriptions)

# Display the overview dataframe with descriptions
print('Note that empty or "nan" values occur when a column does not apply to a given word')
overview


Note that empty or "nan" values occur when a column does not apply to a given word


| | column_name | num_unique_values | unique_values | description |
|---|---|---|---|---|
| 0 | xml:id | 137779 | [n40001001001, n40001001002, n40001001003, n40... | XML ids occur on every word and encode the cor... |
| 1 | ref | 137779 | [MAT 1:1!1, MAT 1:1!2, MAT 1:1!3, MAT 1:1!4, M... | Verse!word reference to this edition of the Ne... |
| 2 | role | 10 | [nan, s, v, vc, p, adv, o, aux, io, o2, apposi... | The clause-level role of the word. |
| 3 | class | 11 | [noun, verb, det, conj, pron, prep, adj, adv, ... | On words, the class is the word's part of speech |
| 4 | type | 9 | [common, proper, nan, personal, relative, demo... | Indicates different types of pronominals |
| 5 | gloss | 20021 | [[The] book, of [the] genealogy, of Jesus, Chr... | SIL data, not Berean |
| 6 | text | 19477 | [Βίβλος, γενέσεως, Ἰησοῦ, Χριστοῦ, υἱοῦ, Δαυεὶ... | Text content associated with the ID |
| 7 | after | 17 | [ , ., ,, ·, ;, —, ὶ, ς, ε, χ, ὸ, ι, α, ί, ὁ, ... | Encodes the following character, including a b... |
| 8 | lemma | 5401 | [βίβλος, γένεσις, Ἰησοῦς, Χριστός, υἱός, Δαυίδ... | Form of the word as it appears in a dictionary. |
| 9 | normalized | 18480 | [Βίβλος, γενέσεως, Ἰησοῦ, Χριστοῦ, υἱοῦ, Δαυεί... | The normalized form of the token (i.e., no tra... |
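The `xml:id` and `ref` descriptions above imply a fixed structure that can be unpacked in code. The following is a minimal sketch, assuming the field widths seen in the sample values (one corpus letter, a two-digit book, then three digits each for chapter, verse, and word) and the `BOOK chapter:verse!word` shape of `ref`. The function names `parse_xml_id` and `parse_ref` are hypothetical.

```python
import re

# Assumed layout, inferred from ids such as "n40001001001"
XML_ID_PATTERN = re.compile(
    r"^(?P<corpus>[a-z])(?P<book>\d{2})(?P<chapter>\d{3})"
    r"(?P<verse>\d{3})(?P<word>\d{3})$"
)

# Assumed layout, inferred from refs such as "MAT 1:1!1"
REF_PATTERN = re.compile(
    r"^(?P<book>\S+) (?P<chapter>\d+):(?P<verse>\d+)!(?P<word>\d+)$"
)


def parse_xml_id(xml_id):
    """Split an xml:id into corpus, book, chapter, verse, and word."""
    match = XML_ID_PATTERN.match(xml_id)
    if match is None:
        raise ValueError(f"Unexpected xml:id format: {xml_id!r}")
    parts = match.groupdict()
    return {
        "corpus": parts["corpus"],
        "book": int(parts["book"]),
        "chapter": int(parts["chapter"]),
        "verse": int(parts["verse"]),
        "word": int(parts["word"]),
    }


def parse_ref(ref):
    """Split a ref into USFM book id, chapter, verse, and word."""
    match = REF_PATTERN.match(ref)
    if match is None:
        raise ValueError(f"Unexpected ref format: {ref!r}")
    parts = match.groupdict()
    return {
        "book": parts["book"],
        "chapter": int(parts["chapter"]),
        "verse": int(parts["verse"]),
        "word": int(parts["word"]),
    }


print(parse_xml_id("n40001001001"))
print(parse_ref("MAT 1:1!1"))
```

Both parsers raise `ValueError` on values that do not match the assumed layout, so any row that breaks the pattern surfaces immediately rather than being silently mis-parsed.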


## Inspect a column

Set `column_name` to the column you want to inspect, then view up to five random non-null rows from the dataset as examples.

In [15]:
# Specify the column name in a variable
column_name = "mood"  # Replace with your desired column name

# Filter out null values from the specified column
non_null_data = data[data[column_name].notnull()]

# If there are fewer than 5 non-null rows, adjust the sample size accordingly
num_non_null_rows = len(non_null_data)
sample_size = min(5, num_non_null_rows)

# Select 5 (or fewer) random non-null rows
random_rows = random.sample(range(num_non_null_rows), sample_size)
random_cells = non_null_data[[column_name, 'text']].iloc[random_rows]

# Get description from overview dataset
column_description = overview[overview['column_name'] == column_name]['description'].values[0]

print(f"COLUMN NAME: {column_name}\nCOLUMN DESCRIPTION: {column_description}\n")
print(f"Five (or fewer) random non-null cells from the '{column_name}' and 'text' columns:")
print(random_cells)


COLUMN NAME: mood
COLUMN DESCRIPTION: Grammatical mood

Five (or fewer) random non-null cells from the 'mood' and 'text' columns:
              mood        text
116139  indicative   πλανῶνται
35364   indicative         ἔχω
96308   indicative  ἐγείρονται
2568    indicative      ἔσεσθε
87103   indicative          εἶ
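The cell above draws positions with `random.sample` and then indexes with `iloc`. As an alternative, here is a minimal sketch that does the same job with pandas' own `DataFrame.sample`. The helper name `sample_non_null` and the toy DataFrame are illustrative only.

```python
import pandas as pd


def sample_non_null(df, column, n=5, extra_columns=("text",), random_state=None):
    """Return up to `n` random rows where `column` is not null."""
    non_null = df[df[column].notna()]
    # Never ask for more rows than are available
    size = min(n, len(non_null))
    return non_null[[column, *extra_columns]].sample(n=size, random_state=random_state)


# Toy data standing in for the Macula DataFrame
toy = pd.DataFrame(
    {
        "text": ["ἔχω", "καὶ", "εἶ", "ὁ"],
        "mood": ["indicative", None, "indicative", None],
    }
)
result = sample_non_null(toy, "mood", random_state=0)
print(result)
```

Passing `random_state` makes the sample reproducible, which the `random.sample` version only achieves after an explicit `random.seed` call.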


In [16]:
!pip install ydata-profiling




In [18]:
from IPython.display import HTML
from ydata_profiling import ProfileReport

# Profile a random sample: the full Macula dataset is large, so profiling every row is slow
profile = ProfileReport(data.sample(500, random_state=50))

# Save the report to an HTML file, then display it inline
profile.to_file("profile_report.html")
HTML(filename="profile_report.html")


0,1
Number of variables,25
Number of observations,500
Missing cells,4830
Missing cells (%),38.6%
Duplicate rows,0
Duplicate rows (%),0.0%
Total size in memory,101.6 KiB
Average record size in memory,208.0 B
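The missing-cell figures in this table can be checked by hand: 25 variables times 500 observations gives 12,500 cells, and 4,830 / 12,500 is about 38.6%. The following is a minimal sketch of the same calculation in pandas. The helper name `missing_summary` and the toy DataFrame are illustrative, and this is not necessarily how ydata-profiling computes the value internally.

```python
import numpy as np
import pandas as pd


def missing_summary(df):
    """Count missing cells and express them as a share of all cells."""
    total_cells = df.shape[0] * df.shape[1]
    missing_cells = int(df.isna().sum().sum())
    missing_pct = 100 * missing_cells / total_cells if total_cells else 0.0
    return {
        "missing_cells": missing_cells,
        "total_cells": total_cells,
        "missing_pct": missing_pct,
    }


# Toy data: 2 of 8 cells are missing
toy = pd.DataFrame(
    {
        "mood": ["indicative", np.nan, np.nan, "imperative"],
        "text": ["ἔχω", "καὶ", "ὁ", "ἴδε"],
    }
)
summary = missing_summary(toy)
print(summary)

# The same arithmetic applied to the figures reported above
report_pct = 100 * 4830 / (25 * 500)
print(round(report_pct, 1))
```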

0,1
Categorical,24
Numeric,1

0,1
xml:id has a high cardinality: 500 distinct values,High cardinality
ref has a high cardinality: 500 distinct values,High cardinality
gloss has a high cardinality: 306 distinct values,High cardinality
text has a high cardinality: 316 distinct values,High cardinality
lemma has a high cardinality: 229 distinct values,High cardinality
normalized has a high cardinality: 311 distinct values,High cardinality
morph has a high cardinality: 162 distinct values,High cardinality
domain has a high cardinality: 163 distinct values,High cardinality
ln has a high cardinality: 265 distinct values,High cardinality
frame has a high cardinality: 85 distinct values,High cardinality

0,1
Analysis started,2023-04-26 07:26:59.025839
Analysis finished,2023-04-26 07:27:08.406120
Duration,9.38 seconds
Software version,ydata-profiling v4.1.2
Download configuration,config.json

0,1
Distinct,500
Distinct (%),100.0%
Missing,0
Missing (%),0.0%
Memory size,7.8 KiB

0,1
n52002013023,1
n42008047006,1
n40017009011,1
n55001003010,1
n43006040017,1
Other values (495),495

0,1
Max length,12
Median length,12
Mean length,12
Min length,12

0,1
Total characters,6000
Distinct characters,11
Distinct categories,2
Distinct scripts,2
Distinct blocks,1

0,1
Unique,500
Unique (%),100.0%

0,1
1st row,n52002013023
2nd row,n44028019006
3rd row,n43006051007
4th row,n40024036017
5th row,n44004031007

Value,Count,Frequency (%)
n52002013023,1,0.2%
n42008047006,1,0.2%
n40017009011,1,0.2%
n55001003010,1,0.2%
n43006040017,1,0.2%
n42008051015,1,0.2%
n44005036011,1,0.2%
n43011037017,1,0.2%
n44020032004,1,0.2%
n43016021028,1,0.2%

Value,Count,Frequency (%)
n52002013023,1,0.2%
n66002018023,1,0.2%
n43006051007,1,0.2%
n40024036017,1,0.2%
n44004031007,1,0.2%
n48001001003,1,0.2%
n40018015009,1,0.2%
n52004016006,1,0.2%
n46001020015,1,0.2%
n43005009011,1,0.2%

Value,Count,Frequency (%)
0,2366,39.4%
1,759,12.7%
4,633,10.5%
n,500,8.3%
2,467,7.8%
3,292,4.9%
6,269,4.5%
5,263,4.4%
8,159,2.6%
7,147,2.5%

Value,Count,Frequency (%)
Decimal Number,5500,91.7%
Lowercase Letter,500,8.3%
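The category labels in this table ("Decimal Number", "Lowercase Letter") appear to correspond to Unicode general categories. The following is a minimal sketch, assuming Python's standard `unicodedata` module, of how such a per-category character count can be reproduced. The helper name `category_counts` is hypothetical, and the report's own implementation may differ.

```python
from collections import Counter
import unicodedata


def category_counts(values):
    """Count characters across all strings by Unicode general category code."""
    return Counter(
        unicodedata.category(ch) for value in values for ch in value
    )


# Two xml:id values from the sample: each has 1 letter and 11 digits
counts = category_counts(["n40001001001", "n52002013023"])
print(counts)  # "Nd" is Decimal Number, "Ll" is Lowercase Letter
```

With one lowercase letter and eleven digits per id, 500 ids give the 500 and 5,500 characters shown in the table.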

Value,Count,Frequency (%)
0,2366,43.0%
1,759,13.8%
4,633,11.5%
2,467,8.5%
3,292,5.3%
6,269,4.9%
5,263,4.8%
8,159,2.9%
7,147,2.7%
9,145,2.6%

Value,Count,Frequency (%)
n,500,100.0%

Value,Count,Frequency (%)
Common,5500,91.7%
Latin,500,8.3%

Value,Count,Frequency (%)
0,2366,43.0%
1,759,13.8%
4,633,11.5%
2,467,8.5%
3,292,5.3%
6,269,4.9%
5,263,4.8%
8,159,2.9%
7,147,2.7%
9,145,2.6%

Value,Count,Frequency (%)
n,500,100.0%

Value,Count,Frequency (%)
ASCII,6000,100.0%

Value,Count,Frequency (%)
0,2366,39.4%
1,759,12.7%
4,633,10.5%
n,500,8.3%
2,467,7.8%
3,292,4.9%
6,269,4.5%
5,263,4.4%
8,159,2.6%
7,147,2.5%

0,1
Distinct,500
Distinct (%),100.0%
Missing,0
Missing (%),0.0%
Memory size,7.8 KiB

0,1
1TH 2:13!23,1
LUK 8:47!6,1
MAT 17:9!11,1
2TI 1:3!10,1
JHN 6:40!17,1
Other values (495),495

0,1
Max length,12.0
Median length,11.0
Mean length,10.696
Min length,9.0

0,1
Total characters,5348
Distinct characters,33
Distinct categories,4
Distinct scripts,2
Distinct blocks,1

0,1
Unique,500
Unique (%),100.0%

0,1
1st row,1TH 2:13!23
2nd row,ACT 28:19!6
3rd row,JHN 6:51!7
4th row,MAT 24:36!17
5th row,ACT 4:31!7

Value,Count,Frequency (%)
1TH 2:13!23,1,0.2%
LUK 8:47!6,1,0.2%
MAT 17:9!11,1,0.2%
2TI 1:3!10,1,0.2%
JHN 6:40!17,1,0.2%
LUK 8:51!15,1,0.2%
ACT 5:36!11,1,0.2%
JHN 11:37!17,1,0.2%
ACT 20:32!4,1,0.2%
JHN 16:21!28,1,0.2%

Value,Count,Frequency (%)
mat,85,8.5%
act,66,6.6%
luk,58,5.8%
jhn,53,5.3%
rev,37,3.7%
mrk,35,3.5%
rom,31,3.1%
1co,25,2.5%
2co,15,1.5%
heb,12,1.2%

Value,Count,Frequency (%)
1,765,14.3%
,500,9.3%
:,500,9.3%
!,500,9.3%
2,422,7.9%
3,234,4.4%
T,181,3.4%
4,176,3.3%
5,170,3.2%
A,167,3.1%

Value,Count,Frequency (%)
Decimal Number,2436,45.5%
Uppercase Letter,1412,26.4%
Other Punctuation,1000,18.7%
Space Separator,500,9.3%

Value,Count,Frequency (%)
T,181,12.8%
A,167,11.8%
M,151,10.7%
C,113,8.0%
R,103,7.3%
K,93,6.6%
H,88,6.2%
O,78,5.5%
L,74,5.2%
J,72,5.1%

Value,Count,Frequency (%)
1,765,31.4%
2,422,17.3%
3,234,9.6%
4,176,7.2%
5,170,7.0%
6,145,6.0%
8,138,5.7%
7,132,5.4%
9,130,5.3%
0,124,5.1%

Value,Count,Frequency (%)
:,500,50.0%
!,500,50.0%

Value,Count,Frequency (%)
,500,100.0%

Value,Count,Frequency (%)
Common,3936,73.6%
Latin,1412,26.4%

Value,Count,Frequency (%)
T,181,12.8%
A,167,11.8%
M,151,10.7%
C,113,8.0%
R,103,7.3%
K,93,6.6%
H,88,6.2%
O,78,5.5%
L,74,5.2%
J,72,5.1%

Value,Count,Frequency (%)
1,765,19.4%
,500,12.7%
:,500,12.7%
!,500,12.7%
2,422,10.7%
3,234,5.9%
4,176,4.5%
5,170,4.3%
6,145,3.7%
8,138,3.5%

Value,Count,Frequency (%)
ASCII,5348,100.0%

Value,Count,Frequency (%)
1,765,14.3%
,500,9.3%
:,500,9.3%
!,500,9.3%
2,422,7.9%
3,234,4.4%
T,181,3.4%
4,176,3.3%
5,170,3.2%
A,167,3.1%

0,1
Distinct,8
Distinct (%),5.3%
Missing,349
Missing (%),69.8%
Memory size,7.8 KiB

0,1
v,83
o,21
adv,17
s,13
vc,8
Other values (3),9

0,1
Max length,3.0
Median length,1.0
Mean length,1.3311258
Min length,1.0

0,1
Total characters,201
Distinct characters,10
Distinct categories,1
Distinct scripts,1
Distinct blocks,1

0,1
Unique,1
Unique (%),0.7%

0,1
1st row,v
2nd row,o
3rd row,aux
4th row,o
5th row,adv

Value,Count,Frequency (%)
v,83,16.6%
o,21,4.2%
adv,17,3.4%
s,13,2.6%
vc,8,1.6%
io,6,1.2%
p,2,0.4%
aux,1,0.2%
(Missing),349,69.8%

Value,Count,Frequency (%)
v,83,55.0%
o,21,13.9%
adv,17,11.3%
s,13,8.6%
vc,8,5.3%
io,6,4.0%
p,2,1.3%
aux,1,0.7%
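The two frequency tables above differ only in their denominator: the first divides by all 500 sampled rows, including missing ones, and the second divides by the 151 non-missing rows. The following is a minimal sketch of both views using `Series.value_counts`, with the Series rebuilt from the counts shown above. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Counts copied from the tables above (151 non-missing values, 349 missing)
role_counts = {"v": 83, "o": 21, "adv": 17, "s": 13, "vc": 8, "io": 6, "p": 2, "aux": 1}
values = [role for role, n in role_counts.items() for _ in range(n)]
values += [np.nan] * 349
role = pd.Series(values, dtype="object")

# Share of all rows: missing values stay in the denominator
share_of_all = role.value_counts(dropna=False, normalize=True)

# Share of non-missing rows: value_counts drops NaN by default
share_of_non_null = role.value_counts(normalize=True)

print(round(share_of_all["v"] * 100, 1))
print(round(share_of_non_null["v"] * 100, 1))
```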

Value,Count,Frequency (%)
v,108,53.7%
o,27,13.4%
a,18,9.0%
d,17,8.5%
s,13,6.5%
c,8,4.0%
i,6,3.0%
p,2,1.0%
u,1,0.5%
x,1,0.5%

Value,Count,Frequency (%)
Lowercase Letter,201,100.0%

Value,Count,Frequency (%)
v,108,53.7%
o,27,13.4%
a,18,9.0%
d,17,8.5%
s,13,6.5%
c,8,4.0%
i,6,3.0%
p,2,1.0%
u,1,0.5%
x,1,0.5%

Value,Count,Frequency (%)
Latin,201,100.0%

Value,Count,Frequency (%)
v,108,53.7%
o,27,13.4%
a,18,9.0%
d,17,8.5%
s,13,6.5%
c,8,4.0%
i,6,3.0%
p,2,1.0%
u,1,0.5%
x,1,0.5%

Value,Count,Frequency (%)
ASCII,201,100.0%

Value,Count,Frequency (%)
v,108,53.7%
o,27,13.4%
a,18,9.0%
d,17,8.5%
s,13,6.5%
c,8,4.0%
i,6,3.0%
p,2,1.0%
u,1,0.5%
x,1,0.5%

0,1
Distinct,9
Distinct (%),1.8%
Missing,0
Missing (%),0.0%
Memory size,7.8 KiB

0,1
noun,97
verb,94
det,72
conj,71
pron,50
Other values (4),116

0,1
Max length,4.0
Median length,4.0
Mean length,3.716
Min length,3.0

0,1
Total characters,1858
Distinct characters,14
Distinct categories,1
Distinct scripts,1
Distinct blocks,1

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,conj
2nd row,verb
3rd row,det
4th row,noun
5th row,prep

Value,Count,Frequency (%)
noun,97,19.4%
verb,94,18.8%
det,72,14.4%
conj,71,14.2%
pron,50,10.0%
prep,46,9.2%
adj,46,9.2%
adv,21,4.2%
num,3,0.6%

Value,Count,Frequency (%)
noun,97,19.4%
verb,94,18.8%
det,72,14.4%
conj,71,14.2%
pron,50,10.0%
prep,46,9.2%
adj,46,9.2%
adv,21,4.2%
num,3,0.6%

Value,Count,Frequency (%)
n,318,17.1%
o,218,11.7%
e,212,11.4%
r,190,10.2%
p,142,7.6%
d,139,7.5%
j,117,6.3%
v,115,6.2%
u,100,5.4%
b,94,5.1%

Value,Count,Frequency (%)
Lowercase Letter,1858,100.0%

Value,Count,Frequency (%)
n,318,17.1%
o,218,11.7%
e,212,11.4%
r,190,10.2%
p,142,7.6%
d,139,7.5%
j,117,6.3%
v,115,6.2%
u,100,5.4%
b,94,5.1%

Value,Count,Frequency (%)
Latin,1858,100.0%

Value,Count,Frequency (%)
n,318,17.1%
o,218,11.7%
e,212,11.4%
r,190,10.2%
p,142,7.6%
d,139,7.5%
j,117,6.3%
v,115,6.2%
u,100,5.4%
b,94,5.1%

Value,Count,Frequency (%)
ASCII,1858,100.0%

Value,Count,Frequency (%)
n,318,17.1%
o,218,11.7%
e,212,11.4%
r,190,10.2%
p,142,7.6%
d,139,7.5%
j,117,6.3%
v,115,6.2%
u,100,5.4%
b,94,5.1%

0,1
Distinct,7
Distinct (%),4.8%
Missing,353
Missing (%),70.6%
Memory size,7.8 KiB

0,1
common,86
personal,34
proper,11
relative,6
interrogative,5
Other values (2),5

0,1
Max length,13.0
Median length,6.0
Mean length,6.9795918
Min length,6.0

0,1
Total characters,1026
Distinct characters,16
Distinct categories,1
Distinct scripts,1
Distinct blocks,1

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,common
2nd row,personal
3rd row,common
4th row,common
5th row,personal

Value,Count,Frequency (%)
common,86,17.2%
personal,34,6.8%
proper,11,2.2%
relative,6,1.2%
interrogative,5,1.0%
demonstrative,3,0.6%
indefinite,2,0.4%
(Missing),353,70.6%

Value,Count,Frequency (%)
common,86,58.5%
personal,34,23.1%
proper,11,7.5%
relative,6,4.1%
interrogative,5,3.4%
demonstrative,3,2.0%
indefinite,2,1.4%

Value,Count,Frequency (%)
o,225,21.9%
m,175,17.1%
n,132,12.9%
c,86,8.4%
e,77,7.5%
r,75,7.3%
p,56,5.5%
a,48,4.7%
l,40,3.9%
s,37,3.6%

Value,Count,Frequency (%)
Lowercase Letter,1026,100.0%

Value,Count,Frequency (%)
o,225,21.9%
m,175,17.1%
n,132,12.9%
c,86,8.4%
e,77,7.5%
r,75,7.3%
p,56,5.5%
a,48,4.7%
l,40,3.9%
s,37,3.6%

Value,Count,Frequency (%)
Latin,1026,100.0%

Value,Count,Frequency (%)
o,225,21.9%
m,175,17.1%
n,132,12.9%
c,86,8.4%
e,77,7.5%
r,75,7.3%
p,56,5.5%
a,48,4.7%
l,40,3.9%
s,37,3.6%

Value,Count,Frequency (%)
ASCII,1026,100.0%

Value,Count,Frequency (%)
o,225,21.9%
m,175,17.1%
n,132,12.9%
c,86,8.4%
e,77,7.5%
r,75,7.3%
p,56,5.5%
a,48,4.7%
l,40,3.9%
s,37,3.6%

0,1
Distinct,306
Distinct (%),61.4%
Missing,2
Missing (%),0.4%
Memory size,7.8 KiB

0,1
the,38
and,28
-,21
in,11
And,7
Other values (301),393

0,1
Max length,25.0
Median length,22.0
Mean length,5.9578313
Min length,1.0

0,1
Total characters,2967
Distinct characters,50
Distinct categories,7
Distinct scripts,2
Distinct blocks,2

0,1
Unique,255
Unique (%),51.2%

0,1
1st row,even as
2nd row,to appeal to
3rd row,-
4th row,Son
5th row,in

Value,Count,Frequency (%)
the,38,7.6%
and,28,5.6%
-,21,4.2%
in,11,2.2%
And,7,1.4%
then,7,1.4%
not,7,1.4%
of the,7,1.4%
for,6,1.2%
from,6,1.2%

Value,Count,Frequency (%)
the,56,8.2%
and,35,5.1%
of,26,3.8%
,21,3.1%
to,18,2.6%
in,15,2.2%
you,14,2.0%
will,10,1.5%
not,9,1.3%
all,9,1.3%

Value,Count,Frequency (%)
e,349,11.8%
t,232,7.8%
o,221,7.4%
n,215,7.2%
a,195,6.6%
,186,6.3%
h,177,6.0%
i,176,5.9%
s,147,5.0%
r,139,4.7%

Value,Count,Frequency (%)
Lowercase Letter,2656,89.5%
Space Separator,186,6.3%
Uppercase Letter,75,2.5%
Dash Punctuation,21,0.7%
Open Punctuation,14,0.5%
Close Punctuation,14,0.5%
Final Punctuation,1,< 0.1%

Value,Count,Frequency (%)
e,349,13.1%
t,232,8.7%
o,221,8.3%
n,215,8.1%
a,195,7.3%
h,177,6.7%
i,176,6.6%
s,147,5.5%
r,139,5.2%
l,118,4.4%

Value,Count,Frequency (%)
I,11,14.7%
H,10,13.3%
A,9,12.0%
J,6,8.0%
B,5,6.7%
O,4,5.3%
S,4,5.3%
G,3,4.0%
L,3,4.0%
T,3,4.0%

Value,Count,Frequency (%)
,186,100.0%

Value,Count,Frequency (%)
-,21,100.0%

Value,Count,Frequency (%)
[,14,100.0%

Value,Count,Frequency (%)
],14,100.0%

Value,Count,Frequency (%)
’,1,100.0%

Value,Count,Frequency (%)
Latin,2731,92.0%
Common,236,8.0%

Value,Count,Frequency (%)
e,349,12.8%
t,232,8.5%
o,221,8.1%
n,215,7.9%
a,195,7.1%
h,177,6.5%
i,176,6.4%
s,147,5.4%
r,139,5.1%
l,118,4.3%

Value,Count,Frequency (%)
,186,78.8%
-,21,8.9%
[,14,5.9%
],14,5.9%
’,1,0.4%

Value,Count,Frequency (%)
ASCII,2966,> 99.9%
Punctuation,1,< 0.1%

Value,Count,Frequency (%)
e,349,11.8%
t,232,7.8%
o,221,7.5%
n,215,7.2%
a,195,6.6%
,186,6.3%
h,177,6.0%
i,176,5.9%
s,147,5.0%
r,139,4.7%

Value,Count,Frequency (%)
’,1,100.0%

0,1
Distinct,316
Distinct (%),63.2%
Missing,0
Missing (%),0.0%
Memory size,7.8 KiB

0,1
καὶ,38
δὲ,9
τὸν,9
ὁ,8
ἐν,8
Other values (311),428

0,1
Max length,15.0
Median length,13.0
Mean length,4.92
Min length,1.0

0,1
Total characters,2460
Distinct characters,93
Distinct categories,3
Distinct scripts,2
Distinct blocks,3

0,1
Unique,256
Unique (%),51.2%

0,1
1st row,καθὼς
2nd row,ἐπικαλέσασθαι
3rd row,ὁ
4th row,Υἱός
5th row,ἐν

Value,Count,Frequency (%)
καὶ,38,7.6%
δὲ,9,1.8%
τὸν,9,1.8%
ὁ,8,1.6%
ἐν,8,1.6%
τὴν,7,1.4%
τοῦ,7,1.4%
εἰς,7,1.4%
τὸ,6,1.2%
τῆς,6,1.2%

Value,Count,Frequency (%)
καὶ,39,7.8%
δὲ,9,1.8%
τὸν,9,1.8%
ὁ,8,1.6%
ἐν,8,1.6%
τὴν,7,1.4%
τοῦ,7,1.4%
εἰς,7,1.4%
τῆς,6,1.2%
τὸ,6,1.2%

Value,Count,Frequency (%)
ν,217,8.8%
α,210,8.5%
τ,191,7.8%
ο,147,6.0%
ε,118,4.8%
ς,92,3.7%
κ,90,3.7%
ι,87,3.5%
σ,82,3.3%
π,81,3.3%

Value,Count,Frequency (%)
Lowercase Letter,2425,98.6%
Uppercase Letter,32,1.3%
Final Punctuation,3,0.1%

Value,Count,Frequency (%)
ν,217,8.9%
α,210,8.7%
τ,191,7.9%
ο,147,6.1%
ε,118,4.9%
ς,92,3.8%
κ,90,3.7%
ι,87,3.6%
σ,82,3.4%
π,81,3.3%

Value,Count,Frequency (%)
Π,6,18.8%
Ἰ,6,18.8%
Θ,3,9.4%
Κ,2,6.2%
Τ,2,6.2%
Ἐ,2,6.2%
Χ,2,6.2%
Ὃ,1,3.1%
Ο,1,3.1%
Ἤ,1,3.1%

Value,Count,Frequency (%)
’,3,100.0%

Value,Count,Frequency (%)
Greek,2457,99.9%
Common,3,0.1%

Value,Count,Frequency (%)
ν,217,8.8%
α,210,8.5%
τ,191,7.8%
ο,147,6.0%
ε,118,4.8%
ς,92,3.7%
κ,90,3.7%
ι,87,3.5%
σ,82,3.3%
π,81,3.3%

Value,Count,Frequency (%)
’,3,100.0%

Value,Count,Frequency (%)
,1991,80.9%
Greek Ext,466,18.9%
Punctuation,3,0.1%
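The block table above groups characters by Unicode block, which is why polytonic letters such as ὶ are counted separately from plain letters such as κ. The following is a minimal sketch of block classification by code-point range, assuming only the published Unicode ranges for Greek and Coptic (U+0370 to U+03FF), Greek Extended (U+1F00 to U+1FFF), and General Punctuation (U+2000 to U+206F). The helper names are hypothetical, and the labels need not match those used by the report.

```python
from collections import Counter

# (label, first code point, last code point) for the blocks of interest
BLOCK_RANGES = [
    ("Greek and Coptic", 0x0370, 0x03FF),
    ("Greek Extended", 0x1F00, 0x1FFF),
    ("General Punctuation", 0x2000, 0x206F),
]


def block_of(ch):
    """Return the label of the block containing `ch`, or "Other"."""
    code_point = ord(ch)
    for name, start, end in BLOCK_RANGES:
        if start <= code_point <= end:
            return name
    return "Other"


def block_counts(text):
    """Count the characters of `text` per Unicode block."""
    return Counter(block_of(ch) for ch in text)


# καὶ written with explicit escapes: kappa, alpha, iota with varia
counts = block_counts("\u03ba\u03b1\u1f76")
print(counts)
```

The right single quotation mark (U+2019), which marks elision in the Greek text, falls in the General Punctuation block under this scheme.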

Value,Count,Frequency (%)
ν,217,10.9%
α,210,10.5%
τ,191,9.6%
ο,147,7.4%
ε,118,5.9%
ς,92,4.6%
κ,90,4.5%
ι,87,4.4%
σ,82,4.1%
π,81,4.1%

Value,Count,Frequency (%)
ὶ,55,11.8%
ἐ,50,10.7%
ἀ,35,7.5%
ὐ,34,7.3%
ὸ,31,6.7%
ῦ,20,4.3%
ῖ,18,3.9%
ῶ,17,3.6%
ὰ,17,3.6%
ὴ,16,3.4%

Value,Count,Frequency (%)
’,3,100.0%

0,1
Distinct,5
Distinct (%),1.0%
Missing,0
Missing (%),0.0%
Memory size,7.8 KiB

0,1
,437
",",29
.,21
·,10
;,3

0,1
Max length,1
Median length,1
Mean length,1
Min length,1

0,1
Total characters,500
Distinct characters,5
Distinct categories,2
Distinct scripts,1
Distinct blocks,2

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,
2nd row,
3rd row,
4th row,","
5th row,

Value,Count,Frequency (%)
,437,87.4%
",",29,5.8%
.,21,4.2%
·,10,2.0%
;,3,0.6%

Value,Count,Frequency (%)
,53,84.1%
·,10,15.9%

Value,Count,Frequency (%)
,437,87.4%
",",29,5.8%
.,21,4.2%
·,10,2.0%
;,3,0.6%

Value,Count,Frequency (%)
Space Separator,437,87.4%
Other Punctuation,63,12.6%

Value,Count,Frequency (%)
",",29,46.0%
.,21,33.3%
·,10,15.9%
;,3,4.8%

Value,Count,Frequency (%)
,437,100.0%

Value,Count,Frequency (%)
Common,500,100.0%

Value,Count,Frequency (%)
,437,87.4%
",",29,5.8%
.,21,4.2%
·,10,2.0%
;,3,0.6%

Value,Count,Frequency (%)
ASCII,490,98.0%
,10,2.0%

Value,Count,Frequency (%)
,437,89.2%
",",29,5.9%
.,21,4.3%
;,3,0.6%

Value,Count,Frequency (%)
·,10,100.0%

0,1
Distinct,229
Distinct (%),45.8%
Missing,0
Missing (%),0.0%
Memory size,7.8 KiB

0,1
ὁ,72
καί,39
αὐτός,17
πᾶς,14
λέγω,11
Other values (224),347

0,1
Max length,14.0
Median length,11.0
Mean length,4.34
Min length,1.0

0,1
Total characters,2170
Distinct characters,71
Distinct categories,5
Distinct scripts,3
Distinct blocks,3

0,1
Unique,174
Unique (%),34.8%

0,1
1st row,καθώς
2nd row,ἐπικαλέω
3rd row,ὁ
4th row,υἱός
5th row,ἐν

Value,Count,Frequency (%)
ὁ,72,14.4%
καί,39,7.8%
αὐτός,17,3.4%
πᾶς,14,2.8%
λέγω,11,2.2%
δέ,9,1.8%
ἐκ,8,1.6%
ἐν,8,1.6%
εἰμί,8,1.6%
εἰς,7,1.4%

Value,Count,Frequency (%)
ὁ,72,14.4%
καί,39,7.8%
αὐτός,17,3.4%
πᾶς,14,2.8%
λέγω,11,2.2%
δέ,9,1.8%
ἐκ,8,1.6%
ἐν,8,1.6%
εἰμί,8,1.6%
εἰς,7,1.4%

Value,Count,Frequency (%)
α,173,8.0%
ς,140,6.5%
ο,116,5.3%
ί,98,4.5%
τ,98,4.5%
ν,93,4.3%
κ,91,4.2%
π,86,4.0%
ω,85,3.9%
μ,83,3.8%

Value,Count,Frequency (%)
Lowercase Letter,2154,99.3%
Uppercase Letter,13,0.6%
Open Punctuation,1,< 0.1%
Space Separator,1,< 0.1%
Close Punctuation,1,< 0.1%

Value,Count,Frequency (%)
α,173,8.0%
ς,140,6.5%
ο,116,5.4%
ί,98,4.5%
τ,98,4.5%
ν,93,4.3%
κ,91,4.2%
π,86,4.0%
ω,85,3.9%
μ,83,3.9%

Value,Count,Frequency (%)
Ἰ,6,46.2%
I,2,15.4%
Χ,2,15.4%
Π,1,7.7%
Δ,1,7.7%
Σ,1,7.7%

Value,Count,Frequency (%)
(,1,100.0%

Value,Count,Frequency (%)
,1,100.0%

Value,Count,Frequency (%)
),1,100.0%

Value,Count,Frequency (%)
Greek,2165,99.8%
Common,3,0.1%
Latin,2,0.1%

Value,Count,Frequency (%)
α,173,8.0%
ς,140,6.5%
ο,116,5.4%
ί,98,4.5%
τ,98,4.5%
ν,93,4.3%
κ,91,4.2%
π,86,4.0%
ω,85,3.9%
μ,83,3.8%

Value,Count,Frequency (%)
(,1,33.3%
,1,33.3%
),1,33.3%

Value,Count,Frequency (%)
I,2,100.0%

Value,Count,Frequency (%)
,1858,85.6%
Greek Ext,307,14.1%
ASCII,5,0.2%

Value,Count,Frequency (%)
α,173,9.3%
ς,140,7.5%
ο,116,6.2%
ί,98,5.3%
τ,98,5.3%
ν,93,5.0%
κ,91,4.9%
π,86,4.6%
ω,85,4.6%
μ,83,4.5%

Value,Count,Frequency (%)
ὁ,73,23.8%
ἐ,46,15.0%
ἀ,36,11.7%
ὐ,34,11.1%
ἰ,17,5.5%
ᾶ,14,4.6%
ὅ,11,3.6%
ῦ,8,2.6%
ἡ,7,2.3%
ἔ,6,2.0%

Value,Count,Frequency (%)
I,2,40.0%
(,1,20.0%
,1,20.0%
),1,20.0%

0,1
Distinct,311
Distinct (%),62.2%
Missing,0
Missing (%),0.0%
Memory size,7.8 KiB

0,1
καί,38
τόν,9
δέ,9
ὁ,8
ἐν,8
Other values (306),428

0,1
Max length,15.0
Median length,13.0
Mean length,4.92
Min length,1.0

0,1
Total characters,2460
Distinct characters,84
Distinct categories,3
Distinct scripts,2
Distinct blocks,3

0,1
Unique,249
Unique (%),49.8%

0,1
1st row,καθώς
2nd row,ἐπικαλέσασθαι
3rd row,ὁ
4th row,Υἱός
5th row,ἐν

Value,Count,Frequency (%)
καί,38,7.6%
τόν,9,1.8%
δέ,9,1.8%
ὁ,8,1.6%
ἐν,8,1.6%
τοῦ,7,1.4%
εἰς,7,1.4%
τήν,7,1.4%
τό,6,1.2%
τῆς,6,1.2%

Value,Count,Frequency (%)
καί,39,7.8%
δέ,9,1.8%
τόν,9,1.8%
ὁ,8,1.6%
ἐν,8,1.6%
τοῦ,7,1.4%
εἰς,7,1.4%
τήν,7,1.4%
τό,6,1.2%
τῆς,6,1.2%

Value,Count,Frequency (%)
ν,217,8.8%
α,210,8.5%
τ,191,7.8%
ο,148,6.0%
ε,119,4.8%
ς,92,3.7%
κ,90,3.7%
ί,90,3.7%
ι,88,3.6%
σ,82,3.3%

Value,Count,Frequency (%)
Lowercase Letter,2425,98.6%
Uppercase Letter,32,1.3%
Final Punctuation,3,0.1%

Value,Count,Frequency (%)
ν,217,8.9%
α,210,8.7%
τ,191,7.9%
ο,148,6.1%
ε,119,4.9%
ς,92,3.8%
κ,90,3.7%
ί,90,3.7%
ι,88,3.6%
σ,82,3.4%

Value,Count,Frequency (%)
Ἰ,6,18.8%
Π,6,18.8%
Θ,3,9.4%
Κ,2,6.2%
Ἐ,2,6.2%
Χ,2,6.2%
Τ,2,6.2%
Ἤ,1,3.1%
Υ,1,3.1%
Ο,1,3.1%

Value,Count,Frequency (%)
’,3,100.0%

Value,Count,Frequency (%)
Greek,2457,99.9%
Common,3,0.1%

Value,Count,Frequency (%)
ν,217,8.8%
α,210,8.5%
τ,191,7.8%
ο,148,6.0%
ε,119,4.8%
ς,92,3.7%
κ,90,3.7%
ί,90,3.7%
ι,88,3.6%
σ,82,3.3%

Value,Count,Frequency (%)
’,3,100.0%

Value,Count,Frequency (%)
,2130,86.6%
Greek Ext,327,13.3%
Punctuation,3,0.1%

Value,Count,Frequency (%)
ν,217,10.2%
α,210,9.9%
τ,191,9.0%
ο,148,6.9%
ε,119,5.6%
ς,92,4.3%
κ,90,4.2%
ί,90,4.2%
ι,88,4.1%
σ,82,3.8%

Value,Count,Frequency (%)
ἐ,50,15.3%
ἀ,35,10.7%
ὐ,34,10.4%
ῦ,20,6.1%
ῖ,18,5.5%
ῶ,17,5.2%
ἡ,15,4.6%
ἰ,11,3.4%
ῆ,10,3.1%
ὅ,10,3.1%

Value,Count,Frequency (%)
’,3,100.0%

0,1
Distinct,229
Distinct (%),45.8%
Missing,0
Missing (%),0.0%
Infinite,0
Infinite (%),0.0%
Mean,2681.33

0,1
Minimum,18
Maximum,5613
Zeros,0
Zeros (%),0.0%
Negative,0
Negative (%),0.0%
Memory size,7.8 KiB

0,1
Minimum,18.0
5-th percentile,242.85
Q1,1519.0
median,2576.0
Q3,3588.75
95-th percentile,5017.95
Maximum,5613.0
Range,5595.0
Interquartile range (IQR),2069.75

0,1
Standard deviation,1374.197
Coefficient of variation (CV),0.51250574
Kurtosis,-0.85365073
Mean,2681.33
Median Absolute Deviation (MAD),1057
Skewness,-0.12337782
Sum,1340665
Variance,1888417.4
Monotonicity,Not monotonic

Value,Count,Frequency (%)
3588,72,14.4%
2532,39,7.8%
846,17,3.4%
3956,14,2.8%
3004,11,2.2%
1161,9,1.8%
1537,8,1.6%
1722,8,1.6%
1510,8,1.6%
1519,7,1.4%

Value,Count,Frequency (%)
18,1,0.2%
26,2,0.4%
40,2,0.4%
79,1,0.2%
80,2,0.4%
106,1,0.2%
129,1,0.2%
142,1,0.2%
166,2,0.4%
176,1,0.2%

Value,Count,Frequency (%)
5613,1,0.2%
5550,1,0.2%
5547,2,0.4%
5511,1,0.2%
5457,1,0.2%
5346,1,0.2%
5343,1,0.2%
5279,1,0.2%
5232,1,0.2%
5222,1,0.2%

0,1
Distinct,162
Distinct (%),32.4%
Missing,0
Missing (%),0.0%
Memory size,7.8 KiB

0,1
CONJ,69
PREP,44
N-ASF,12
V-PAI-3S,11
ADV,11
Other values (157),353

0,1
Max length,12.0
Median length,5.0
Mean length,5.37
Min length,3.0

0,1
Total characters,2685
Distinct characters,23
Distinct categories,3
Distinct scripts,2
Distinct blocks,1

0,1
Unique,83
Unique (%),16.6%

0,1
1st row,ADV
2nd row,V-AMN
3rd row,T-NSM
4th row,N-NSM
5th row,PREP

Value,Count,Frequency (%)
CONJ,69,13.8%
PREP,44,8.8%
N-ASF,12,2.4%
V-PAI-3S,11,2.2%
ADV,11,2.2%
N-NSM,10,2.0%
N-GSF,10,2.0%
T-ASM,9,1.8%
PRT-N,8,1.6%
N-ASM,8,1.6%

Value,Count,Frequency (%)
conj,69,13.8%
prep,44,8.8%
n-asf,12,2.4%
v-pai-3s,11,2.2%
adv,11,2.2%
n-nsm,10,2.0%
n-gsf,10,2.0%
t-asm,9,1.8%
prt-n,8,1.6%
n-asm,8,1.6%

Value,Count,Frequency (%)
-,469,17.5%
N,345,12.8%
P,296,11.0%
S,268,10.0%
A,262,9.8%
M,145,5.4%
V,112,4.2%
T,84,3.1%
F,79,2.9%
C,73,2.7%

Value,Count,Frequency (%)
Uppercase Letter,2115,78.8%
Dash Punctuation,469,17.5%
Decimal Number,101,3.8%

Value,Count,Frequency (%)
N,345,16.3%
P,296,14.0%
S,268,12.7%
A,262,12.4%
M,145,6.9%
V,112,5.3%
T,84,4.0%
F,79,3.7%
C,73,3.5%
O,72,3.4%

Value,Count,Frequency (%)
3,49,48.5%
2,32,31.7%
1,20,19.8%

Value,Count,Frequency (%)
-,469,100.0%

Value,Count,Frequency (%)
Latin,2115,78.8%
Common,570,21.2%

Value,Count,Frequency (%)
N,345,16.3%
P,296,14.0%
S,268,12.7%
A,262,12.4%
M,145,6.9%
V,112,5.3%
T,84,4.0%
F,79,3.7%
C,73,3.5%
O,72,3.4%

Value,Count,Frequency (%)
-,469,82.3%
3,49,8.6%
2,32,5.6%
1,20,3.5%

Value,Count,Frequency (%)
ASCII,2685,100.0%

Value,Count,Frequency (%)
-,469,17.5%
N,345,12.8%
P,296,11.0%
S,268,10.0%
A,262,9.8%
M,145,5.4%
V,112,4.2%
T,84,3.1%
F,79,2.9%
C,73,2.7%

0,1
Distinct,3
Distinct (%),4.4%
Missing,432
Missing (%),86.4%
Memory size,7.8 KiB

0,1
third,47
first,11
second,10

0,1
Max length,6.0
Median length,5.0
Mean length,5.1470588
Min length,5.0

0,1
Total characters,350
Distinct characters,11
Distinct categories,1
Distinct scripts,1
Distinct blocks,1

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,third
2nd row,first
3rd row,third
4th row,third
5th row,third

Value,Count,Frequency (%)
third,47,9.4%
first,11,2.2%
second,10,2.0%
(Missing),432,86.4%

Value,Count,Frequency (%)
third,47,69.1%
first,11,16.2%
second,10,14.7%

Value,Count,Frequency (%)
t,58,16.6%
i,58,16.6%
r,58,16.6%
d,57,16.3%
h,47,13.4%
s,21,6.0%
f,11,3.1%
e,10,2.9%
c,10,2.9%
o,10,2.9%

Value,Count,Frequency (%)
Lowercase Letter,350,100.0%

Value,Count,Frequency (%)
t,58,16.6%
i,58,16.6%
r,58,16.6%
d,57,16.3%
h,47,13.4%
s,21,6.0%
f,11,3.1%
e,10,2.9%
c,10,2.9%
o,10,2.9%

Value,Count,Frequency (%)
Latin,350,100.0%

Value,Count,Frequency (%)
t,58,16.6%
i,58,16.6%
r,58,16.6%
d,57,16.3%
h,47,13.4%
s,21,6.0%
f,11,3.1%
e,10,2.9%
c,10,2.9%
o,10,2.9%

Value,Count,Frequency (%)
ASCII,350,100.0%

Value,Count,Frequency (%)
t,58,16.6%
i,58,16.6%
r,58,16.6%
d,57,16.3%
h,47,13.4%
s,21,6.0%
f,11,3.1%
e,10,2.9%
c,10,2.9%
o,10,2.9%

0,1
Distinct,2
Distinct (%),0.6%
Missing,146
Missing (%),29.2%
Memory size,7.8 KiB

0,1
singular,261
plural,93

0,1
Max length,8.0
Median length,8.0
Mean length,7.4745763
Min length,6.0

0,1
Total characters,2646
Distinct characters,9
Distinct categories,1
Distinct scripts,1
Distinct blocks,1

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,singular
2nd row,singular
3rd row,singular
4th row,singular
5th row,singular

Value,Count,Frequency (%)
singular,261,52.2%
plural,93,18.6%
(Missing),146,29.2%

Value,Count,Frequency (%)
singular,261,73.7%
plural,93,26.3%

Value,Count,Frequency (%)
l,447,16.9%
u,354,13.4%
a,354,13.4%
r,354,13.4%
s,261,9.9%
i,261,9.9%
n,261,9.9%
g,261,9.9%
p,93,3.5%

Value,Count,Frequency (%)
Lowercase Letter,2646,100.0%

Value,Count,Frequency (%)
l,447,16.9%
u,354,13.4%
a,354,13.4%
r,354,13.4%
s,261,9.9%
i,261,9.9%
n,261,9.9%
g,261,9.9%
p,93,3.5%

Value,Count,Frequency (%)
Latin,2646,100.0%

Value,Count,Frequency (%)
l,447,16.9%
u,354,13.4%
a,354,13.4%
r,354,13.4%
s,261,9.9%
i,261,9.9%
n,261,9.9%
g,261,9.9%
p,93,3.5%

Value,Count,Frequency (%)
ASCII,2646,100.0%

Value,Count,Frequency (%)
l,447,16.9%
u,354,13.4%
a,354,13.4%
r,354,13.4%
s,261,9.9%
i,261,9.9%
n,261,9.9%
g,261,9.9%
p,93,3.5%

0,1
Distinct,3
Distinct (%),1.1%
Missing,228
Missing (%),45.6%
Memory size,7.8 KiB

0,1
masculine,139
feminine,69
neuter,64

0,1
Max length,9.0
Median length,9.0
Mean length,8.0404412
Min length,6.0

0,1
Total characters,2187
Distinct characters,12
Distinct categories,1
Distinct scripts,1
Distinct blocks,1

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,masculine
2nd row,masculine
3rd row,masculine
4th row,neuter
5th row,feminine

Value,Count,Frequency (%)
masculine,139,27.8%
feminine,69,13.8%
neuter,64,12.8%
(Missing),228,45.6%

Value,Count,Frequency (%)
masculine,139,51.1%
feminine,69,25.4%
neuter,64,23.5%

Value,Count,Frequency (%)
e,405,18.5%
n,341,15.6%
i,277,12.7%
m,208,9.5%
u,203,9.3%
a,139,6.4%
s,139,6.4%
c,139,6.4%
l,139,6.4%
f,69,3.2%

Value,Count,Frequency (%)
Lowercase Letter,2187,100.0%

Value,Count,Frequency (%)
Latin,2187,100.0%

Value,Count,Frequency (%)
ASCII,2187,100.0%

0,1
Distinct,5
Distinct (%),1.7%
Missing,214
Missing (%),42.8%
Memory size,7.8 KiB

0,1
accusative,98
nominative,85
genitive,67
dative,34
vocative,2

0,1
Max length,10.0
Median length,10.0
Mean length,9.041958
Min length,6.0

0,1
Total characters,2586
Distinct characters,13
Distinct categories,1
Distinct scripts,1
Distinct blocks,1

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,nominative
2nd row,nominative
3rd row,accusative
4th row,dative
5th row,accusative

Value,Count,Frequency (%)
accusative,98,19.6%
nominative,85,17.0%
genitive,67,13.4%
dative,34,6.8%
vocative,2,0.4%
(Missing),214,42.8%

Value,Count,Frequency (%)
accusative,98,34.3%
nominative,85,29.7%
genitive,67,23.4%
dative,34,11.9%
vocative,2,0.7%

Value,Count,Frequency (%)
i,438,16.9%
e,353,13.7%
a,317,12.3%
v,288,11.1%
t,286,11.1%
n,237,9.2%
c,198,7.7%
u,98,3.8%
s,98,3.8%
o,87,3.4%

Value,Count,Frequency (%)
Lowercase Letter,2586,100.0%

Value,Count,Frequency (%)
Latin,2586,100.0%

Value,Count,Frequency (%)
ASCII,2586,100.0%

0,1
Distinct,5
Distinct (%),5.3%
Missing,406
Missing (%),81.2%
Memory size,7.8 KiB

0,1
present,41
aorist,38
future,8
imperfect,6
perfect,1

0,1
Max length,9.0
Median length,7.0
Mean length,6.6382979
Min length,6.0

0,1
Total characters,624
Distinct characters,13
Distinct categories,1
Distinct scripts,1
Distinct blocks,1

0,1
Unique,1
Unique (%),1.1%

0,1
1st row,aorist
2nd row,perfect
3rd row,present
4th row,aorist
5th row,present

Value,Count,Frequency (%)
present,41,8.2%
aorist,38,7.6%
future,8,1.6%
imperfect,6,1.2%
perfect,1,0.2%
(Missing),406,81.2%

Value,Count,Frequency (%)
present,41,43.6%
aorist,38,40.4%
future,8,8.5%
imperfect,6,6.4%
perfect,1,1.1%

Value,Count,Frequency (%)
e,104,16.7%
r,94,15.1%
t,94,15.1%
s,79,12.7%
p,48,7.7%
i,44,7.1%
n,41,6.6%
a,38,6.1%
o,38,6.1%
u,16,2.6%

Value,Count,Frequency (%)
Lowercase Letter,624,100.0%

Value,Count,Frequency (%)
Latin,624,100.0%

Value,Count,Frequency (%)
ASCII,624,100.0%

0,1
Distinct,4
Distinct (%),4.3%
Missing,406
Missing (%),81.2%
Memory size,7.8 KiB

0,1
active,67
passive,11
middle,10
middlepassive,6

0,1
Max length,13.0
Median length,6.0
Mean length,6.5638298
Min length,6.0

0,1
Total characters,617
Distinct characters,11
Distinct categories,1
Distinct scripts,1
Distinct blocks,1

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,middle
2nd row,middlepassive
3rd row,middlepassive
4th row,active
5th row,active

Value,Count,Frequency (%)
active,67,13.4%
passive,11,2.2%
middle,10,2.0%
middlepassive,6,1.2%
(Missing),406,81.2%

Value,Count,Frequency (%)
active,67,71.3%
passive,11,11.7%
middle,10,10.6%
middlepassive,6,6.4%

Value,Count,Frequency (%)
i,100,16.2%
e,100,16.2%
a,84,13.6%
v,84,13.6%
c,67,10.9%
t,67,10.9%
s,34,5.5%
d,32,5.2%
p,17,2.8%
m,16,2.6%

Value,Count,Frequency (%)
Lowercase Letter,617,100.0%

Value,Count,Frequency (%)
Latin,617,100.0%

Value,Count,Frequency (%)
ASCII,617,100.0%

0,1
Distinct,5
Distinct (%),5.3%
Missing,406
Missing (%),81.2%
Memory size,7.8 KiB

0,1
indicative,56
participle,21
subjunctive,8
infinitive,5
imperative,4

0,1
Max length,11.0
Median length,10.0
Mean length,10.085106
Min length,10.0

0,1
Total characters,948
Distinct characters,17
Distinct categories,1
Distinct scripts,1
Distinct blocks,1

0,1
Unique,0
Unique (%),0.0%

0,1
1st row,infinitive
2nd row,indicative
3rd row,indicative
4th row,indicative
5th row,indicative

Value,Count,Frequency (%)
indicative,56,11.2%
participle,21,4.2%
subjunctive,8,1.6%
infinitive,5,1.0%
imperative,4,0.8%
(Missing),406,81.2%

Value,Count,Frequency (%)
indicative,56,59.6%
participle,21,22.3%
subjunctive,8,8.5%
infinitive,5,5.3%
imperative,4,4.3%

Value,Count,Frequency (%)
i,246,25.9%
e,98,10.3%
t,94,9.9%
c,85,9.0%
a,81,8.5%
n,74,7.8%
v,73,7.7%
d,56,5.9%
p,46,4.9%
r,25,2.6%

Value,Count,Frequency (%)
Lowercase Letter,948,100.0%

Value,Count,Frequency (%)
Latin,948,100.0%

Value,Count,Frequency (%)
ASCII,948,100.0%

0,1
Distinct,2
Distinct (%),100.0%
Missing,498
Missing (%),99.6%
Memory size,7.8 KiB

0,1
superlative,1
comparative,1

0,1
Max length,11
Median length,11
Mean length,11
Min length,11

0,1
Total characters,22
Distinct characters,13
Distinct categories,1
Distinct scripts,1
Distinct blocks,1

0,1
Unique,2
Unique (%),100.0%

0,1
1st row,superlative
2nd row,comparative

Value,Count,Frequency (%)
superlative,1,0.2%
comparative,1,0.2%
(Missing),498,99.6%

Value,Count,Frequency (%)
superlative,1,50.0%
comparative,1,50.0%

Value,Count,Frequency (%)
e,3,13.6%
a,3,13.6%
p,2,9.1%
r,2,9.1%
t,2,9.1%
i,2,9.1%
v,2,9.1%
s,1,4.5%
u,1,4.5%
l,1,4.5%

Value,Count,Frequency (%)
Lowercase Letter,22,100.0%

Value,Count,Frequency (%)
Latin,22,100.0%

Value,Count,Frequency (%)
ASCII,22,100.0%

0,1
Distinct,163
Distinct (%),34.8%
Missing,31
Missing (%),6.2%
Memory size,7.8 KiB

0,1
092004,100
089017,17
089015,17
059003,15
033006,14
Other values (158),306

0,1
Max length,20.0
Median length,6.0
Mean length,6.1108742
Min length,3.0

0,1
Total characters,2866
Distinct characters,11
Distinct categories,2
Distinct scripts,1
Distinct blocks,1

0,1
Unique,95
Unique (%),20.3%

0,1
1st row,64
2nd row,56004
3rd row,92004
4th row,12001
5th row,83008

Value,Count,Frequency (%)
092004,100,20.0%
089017,17,3.4%
089015,17,3.4%
059003,15,3.0%
033006,14,2.8%
012001,10,2.0%
093001,9,1.8%
013001,7,1.4%
069002,6,1.2%
067006,6,1.2%

Value,Count,Frequency (%)
092004,100,20.9%
089017,18,3.8%
089015,17,3.5%
059003,15,3.1%
033006,14,2.9%
012001,10,2.1%
093001,9,1.9%
013001,7,1.5%
067005,6,1.3%
069002,6,1.3%

Value,Count,Frequency (%)
0,1395,48.7%
9,267,9.3%
2,230,8.0%
1,203,7.1%
3,191,6.7%
4,161,5.6%
8,133,4.6%
5,98,3.4%
6,90,3.1%
7,88,3.1%

Value,Count,Frequency (%)
Decimal Number,2856,99.7%
Space Separator,10,0.3%

Value,Count,Frequency (%)
0,1395,48.8%
9,267,9.3%
2,230,8.1%
1,203,7.1%
3,191,6.7%
4,161,5.6%
8,133,4.7%
5,98,3.4%
6,90,3.2%
7,88,3.1%

Value,Count,Frequency (%)
,10,100.0%

Value,Count,Frequency (%)
Common,2866,100.0%

Value,Count,Frequency (%)
ASCII,2866,100.0%

0,1
Distinct,265
Distinct (%),56.5%
Missing,31
Missing (%),6.2%
Memory size,7.8 KiB

0,1
92.24,72
89.87,17
92.11,16
33.69,13
59.23,12
Other values (260),339

0,1
Max length,23.0
Median length,5.0
Mean length,5.1321962
Min length,3.0

0,1
Total characters,2407
Distinct characters,18
Distinct categories,7
Distinct scripts,2
Distinct blocks,1

0,1
Unique,218
Unique (%),46.5%

0,1
1st row,64.14
2nd row,56.15
3rd row,92.24
4th row,12.15
5th row,83.47

Value,Count,Frequency (%)
92.24,72,14.4%
89.87,17,3.4%
92.11,16,3.2%
33.69,13,2.6%
59.23,12,2.4%
89.92,12,2.4%
69.3,6,1.2%
13.4,5,1.0%
92.27,5,1.0%
92.23,4,0.8%

Value,Count,Frequency (%)
92.24,72,15.0%
89.87,17,3.5%
92.11,16,3.3%
33.69,13,2.7%
59.23,12,2.5%
89.92,12,2.5%
69.3,6,1.2%
13.4,5,1.0%
92.27,5,1.0%
92.1,4,0.8%

Value,Count,Frequency (%)
.,480,19.9%
2,365,15.2%
9,315,13.1%
1,236,9.8%
3,235,9.8%
8,184,7.6%
4,166,6.9%
6,118,4.9%
7,115,4.8%
5,110,4.6%

Value,Count,Frequency (%)
Decimal Number,1904,79.1%
Other Punctuation,481,20.0%
Space Separator,11,0.5%
Lowercase Letter,8,0.3%
Open Punctuation,1,< 0.1%
Uppercase Letter,1,< 0.1%
Close Punctuation,1,< 0.1%

Value,Count,Frequency (%)
2,365,19.2%
9,315,16.5%
1,236,12.4%
3,235,12.3%
8,184,9.7%
4,166,8.7%
6,118,6.2%
7,115,6.0%
5,110,5.8%
0,60,3.2%

Value,Count,Frequency (%)
.,480,99.8%
:,1,0.2%

Value,Count,Frequency (%)
a,7,87.5%
b,1,12.5%

Value,Count,Frequency (%)
,11,100.0%

Value,Count,Frequency (%)
{,1,100.0%

Value,Count,Frequency (%)
N,1,100.0%

Value,Count,Frequency (%)
},1,100.0%

Value,Count,Frequency (%)
Common,2398,99.6%
Latin,9,0.4%

Value,Count,Frequency (%)
.,480,20.0%
2,365,15.2%
9,315,13.1%
1,236,9.8%
3,235,9.8%
8,184,7.7%
4,166,6.9%
6,118,4.9%
7,115,4.8%
5,110,4.6%

Value,Count,Frequency (%)
a,7,77.8%
N,1,11.1%
b,1,11.1%

Value,Count,Frequency (%)
ASCII,2407,100.0%

0,1
Distinct,85
Distinct (%),98.8%
Missing,414
Missing (%),82.8%
Memory size,7.8 KiB

0,1
A0:n00000000000,2
A0:n42018035009 A1:n42018036004,1
A1:n40013031012,1
A0:n61003003012,1
A0:n41003031004;n41003031008,1
Other values (80),80

0,1
Max length,60.0
Median length,57.0
Mean length,25.860465
Min length,15.0

0,1
Total characters,2224
Distinct characters,15
Distinct categories,5
Distinct scripts,2
Distinct blocks,1

0,1
Unique,84
Unique (%),97.7%

0,1
1st row,A0:n44028016008 A1:n44028019007
2nd row,A0:n44016010016 A1:n44016010017
3rd row,A0:n43014012021
4th row,A0:n66006013003
5th row,A0:n44008034013 A1:n44008034015

Value,Count,Frequency (%)
A0:n00000000000,2,0.4%
A0:n42018035009 A1:n42018036004,1,0.2%
A1:n40013031012,1,0.2%
A0:n61003003012,1,0.2%
A0:n41003031004;n41003031008,1,0.2%
A0:n40017009010,1,0.2%
A0:n44020016004 A1:n44020032005 A2:n44020032007;n44020032010,1,0.2%
A0:n00000000000 A1:n40010041003,1,0.2%
A0:n40005001015,1,0.2%
A0:n42020028008 A1:n42020028011,1,0.2%
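
The `frame` values listed above appear to pair argument labels (`A0`, `A1`, `A2`) with one or more word ids, using spaces between arguments and semicolons between ids. A minimal sketch of parsing that layout, assuming this reading of the format is correct; `parse_frame` is a hypothetical helper and not part of the notebook or of any MACULA tooling:

```python
def parse_frame(frame):
    """Split a frame string into {argument label: [word ids]}."""
    # Assumed layout, inferred from the values above:
    # "A0:<id>[;<id>...] A1:<id>..." with spaces between arguments.
    arguments = {}
    for part in frame.split():
        label, _, ids = part.partition(":")
        arguments[label] = ids.split(";")
    return arguments


parsed = parse_frame(
    "A0:n44020016004 A1:n44020032005 A2:n44020032007;n44020032010"
)
print(parsed)
```

Most rows have no frame at all, so missing values would need to be filtered out first (for example with `data["frame"].dropna()`) before applying a helper like this.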

Value,Count,Frequency (%)
a0:n00000000000,4,2.9%
a1:n00000000000,2,1.5%
a0:n46003022002,2,1.5%
a0:n59005001004,1,0.7%
a1:n44016010017,1,0.7%
a0:n44016010016,1,0.7%
a0:n43014012021,1,0.7%
a0:n66006013003,1,0.7%
a0:n41014068009,1,0.7%
a1:n44008034015,1,0.7%

Value,Count,Frequency (%)
0,796,35.8%
1,286,12.9%
4,210,9.4%
n,146,6.6%
A,138,6.2%
:,137,6.2%
2,133,6.0%
3,80,3.6%
5,67,3.0%
6,54,2.4%

Value,Count,Frequency (%)
Decimal Number,1743,78.4%
Lowercase Letter,146,6.6%
Other Punctuation,146,6.6%
Uppercase Letter,138,6.2%
Space Separator,51,2.3%

Value,Count,Frequency (%)
0,796,45.7%
1,286,16.4%
4,210,12.0%
2,133,7.6%
3,80,4.6%
5,67,3.8%
6,54,3.1%
7,44,2.5%
8,39,2.2%
9,34,2.0%

Value,Count,Frequency (%)
:,137,93.8%
;,9,6.2%

Value,Count,Frequency (%)
n,146,100.0%

Value,Count,Frequency (%)
A,138,100.0%

Value,Count,Frequency (%)
,51,100.0%

Value,Count,Frequency (%)
Common,1940,87.2%
Latin,284,12.8%

Value,Count,Frequency (%)
0,796,41.0%
1,286,14.7%
4,210,10.8%
:,137,7.1%
2,133,6.9%
3,80,4.1%
5,67,3.5%
6,54,2.8%
,51,2.6%
7,44,2.3%

Value,Count,Frequency (%)
n,146,51.4%
A,138,48.6%

Value,Count,Frequency (%)
ASCII,2224,100.0%

0,1
Distinct,45
Distinct (%),97.8%
Missing,454
Missing (%),90.8%
Memory size,7.8 KiB

0,1
n46003022002,2
n42018035009,1
n40026059003 n40026059006,1
n44021040005,1
n42020028008,1
Other values (40),40

0,1
Max length,38.0
Median length,12.0
Mean length,13.695652
Min length,12.0

0,1
Total characters,630
Distinct characters,12
Distinct categories,3
Distinct scripts,2
Distinct blocks,1

0,1
Unique,44
Unique (%),95.7%

0,1
1st row,n44028016008
2nd row,n62005001011
3rd row,n40019023003
4th row,n46003022002
5th row,n45008012003

Value,Count,Frequency (%)
n46003022002,2,0.4%
n42018035009,1,0.2%
n40026059003 n40026059006,1,0.2%
n44021040005,1,0.2%
n42020028008,1,0.2%
n40005001015,1,0.2%
n44020016004,1,0.2%
n40017009010,1,0.2%
n41003031004 n41003031008,1,0.2%
n40013031012,1,0.2%

Value,Count,Frequency (%)
n46003022002,2,3.8%
n42021034022,1,1.9%
n40019023003,1,1.9%
n45008012003,1,1.9%
n40008019007,1,1.9%
n40019025004,1,1.9%
n45001001001,1,1.9%
n59005001004,1,1.9%
n45009016012,1,1.9%
n40012029004,1,1.9%

Value,Count,Frequency (%)
0,253,40.2%
1,84,13.3%
4,78,12.4%
n,52,8.3%
2,52,8.3%
3,30,4.8%
5,22,3.5%
6,15,2.4%
9,15,2.4%
7,13,2.1%

Value,Count,Frequency (%)
Decimal Number,572,90.8%
Lowercase Letter,52,8.3%
Space Separator,6,1.0%

Value,Count,Frequency (%)
0,253,44.2%
1,84,14.7%
4,78,13.6%
2,52,9.1%
3,30,5.2%
5,22,3.8%
6,15,2.6%
9,15,2.6%
7,13,2.3%
8,10,1.7%

Value,Count,Frequency (%)
n,52,100.0%

Value,Count,Frequency (%)
,6,100.0%

Value,Count,Frequency (%)
Common,578,91.7%
Latin,52,8.3%

Value,Count,Frequency (%)
0,253,43.8%
1,84,14.5%
4,78,13.5%
2,52,9.0%
3,30,5.2%
5,22,3.8%
6,15,2.6%
9,15,2.6%
7,13,2.2%
8,10,1.7%

Value,Count,Frequency (%)
n,52,100.0%

Value,Count,Frequency (%)
ASCII,630,100.0%

0,1
Distinct,40
Distinct (%),100.0%
Missing,460
Missing (%),92.0%
Memory size,7.8 KiB

0,1
n42017001018,1
n40015014007 n40015014009,1
n40012018008,1
n40026051003,1
n41009019007,1
Other values (35),35

0,1
Max length,51.0
Median length,12.0
Mean length,16.55
Min length,12.0

0,1
Total characters,662
Distinct characters,12
Distinct categories,3
Distinct scripts,2
Distinct blocks,1

0,1
Unique,40
Unique (%),100.0%

0,1
1st row,n40018015005
2nd row,n43005009006
3rd row,n52001001003 n52003006004 n52002018008 n52005004003
4th row,n41003007003
5th row,n42019035017

Value,Count,Frequency (%)
n42017001018,1,0.2%
n40015014007 n40015014009,1,0.2%
n40012018008,1,0.2%
n40026051003,1,0.2%
n41009019007,1,0.2%
n43004033004,1,0.2%
n44005036007,1,0.2%
n42009043019,1,0.2%
n44019016015,1,0.2%
n43005005004,1,0.2%

Value,Count,Frequency (%)
n52001001003,2,3.7%
n42017001018,1,1.9%
n52003006004,1,1.9%
n52002018008,1,1.9%
n52005004003,1,1.9%
n41003007003,1,1.9%
n42019035017,1,1.9%
n40023012001,1,1.9%
n46004009007,1,1.9%
n44008018004,1,1.9%

Value,Count,Frequency (%)
0,274,41.4%
1,90,13.6%
4,59,8.9%
n,54,8.2%
5,45,6.8%
2,31,4.7%
3,29,4.4%
6,21,3.2%
9,18,2.7%
7,14,2.1%

Value,Count,Frequency (%)
Decimal Number,594,89.7%
Lowercase Letter,54,8.2%
Space Separator,14,2.1%

Value,Count,Frequency (%)
0,274,46.1%
1,90,15.2%
4,59,9.9%
5,45,7.6%
2,31,5.2%
3,29,4.9%
6,21,3.5%
9,18,3.0%
7,14,2.4%
8,13,2.2%

Value,Count,Frequency (%)
n,54,100.0%

Value,Count,Frequency (%)
,14,100.0%

Value,Count,Frequency (%)
Common,608,91.8%
Latin,54,8.2%

Value,Count,Frequency (%)
0,274,45.1%
1,90,14.8%
4,59,9.7%
5,45,7.4%
2,31,5.1%
3,29,4.8%
6,21,3.5%
9,18,3.0%
7,14,2.3%
,14,2.3%

Value,Count,Frequency (%)
n,54,100.0%

Value,Count,Frequency (%)
ASCII,662,100.0%

Unnamed: 0,strong,role,class,type,after,person,number,gender,case,tense,voice,mood,degree,frame,subjref,referent
strong,1.0,0.274,0.387,0.413,0.154,0.194,0.154,0.167,0.076,0.0,0.325,0.0,1.0,0.158,0.0,1.0
role,0.274,1.0,0.653,0.309,0.1,0.0,0.144,0.623,0.812,0.242,0.0,0.137,0.0,1.0,1.0,1.0
class,0.387,0.653,1.0,0.983,0.154,1.0,0.151,0.257,0.159,1.0,1.0,1.0,1.0,1.0,1.0,1.0
type,0.413,0.309,0.983,1.0,0.0,0.0,0.254,0.412,0.203,0.0,0.0,0.0,0.0,0.0,0.0,1.0
after,0.154,0.1,0.154,0.0,1.0,0.205,0.023,0.0,0.065,0.0,0.196,0.0,1.0,0.189,0.0,1.0
person,0.194,0.0,1.0,0.0,0.205,1.0,0.106,0.0,0.0,0.043,0.0,0.28,0.0,1.0,0.289,0.0
number,0.154,0.144,0.151,0.254,0.023,0.106,1.0,0.154,0.0,0.151,0.0,0.0,1.0,0.0,0.156,1.0
gender,0.167,0.623,0.257,0.412,0.0,0.0,0.154,1.0,0.176,1.0,1.0,1.0,1.0,1.0,1.0,1.0
case,0.076,0.812,0.159,0.203,0.065,0.0,0.0,0.176,1.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0
tense,0.0,0.242,1.0,0.0,0.0,0.043,0.151,1.0,0.0,1.0,0.259,0.061,0.0,0.142,0.134,0.0

Unnamed: 0,xml:id,ref,role,class,type,gloss,text,after,lemma,normalized,strong,morph,person,number,gender,case,tense,voice,mood,degree,domain,ln,frame,subjref,referent
109754,n52002013023,1TH 2:13!23,,conj,,even as,καθὼς,,καθώς,καθώς,2531,ADV,,,,,,,,,64,64.14,,,
82819,n44028019006,ACT 28:19!6,v,verb,,to appeal to,ἐπικαλέσασθαι,,ἐπικαλέω,ἐπικαλέσασθαι,1941,V-AMN,,,,,aorist,middle,infinitive,,56004,56.15,A0:n44028016008 A1:n44028019007,n44028016008,
53564,n43006051007,JHN 6:51!7,,det,,-,ὁ,,ὁ,ὁ,3588,T-NSM,,singular,masculine,nominative,,,,,92004,92.24,,,
14732,n40024036017,MAT 24:36!17,,noun,common,Son,Υἱός,",",υἱός,Υἱός,5207,N-NSM,,singular,masculine,nominative,,,,,12001,12.15,,,
67062,n44004031007,ACT 4:31!7,,prep,,in,ἐν,,ἐν,ἐν,1722,PREP,,,,,,,,,83008,83.47,,,
101459,n48001001003,GAL 1:1!3,,adv,,not,οὐκ,,οὐ,οὐκ,3756,PRT-N,,,,,,,,,69002,69.3,,,
10545,n40018015009,MAT 18:15!9,o,pron,personal,him,αὐτὸν,,αὐτός,αὐτόν,846,P-ASM,,singular,masculine,accusative,,,,,92004,92.11,,,n40018015005
110416,n52004016006,1TH 4:16!6,,noun,common,a loud command,κελεύσματι,",",κέλευσμα,κελεύσματι,2752,N-DSN,,singular,neuter,dative,,,,,33032,33.324,,,
90487,n46001020015,1CO 1:20!15,,noun,common,wisdom,σοφίαν,,σοφία,σοφίαν,4678,N-ASF,,singular,feminine,accusative,,,,,28001,28.8,,,
52020,n43005009011,JHN 5:9!11,,pron,personal,of him,αὐτοῦ,,αὐτός,αὐτοῦ,846,P-GSM,,singular,masculine,genitive,,,,,92004,92.11,,,n43005009006

Unnamed: 0,xml:id,ref,role,class,type,gloss,text,after,lemma,normalized,strong,morph,person,number,gender,case,tense,voice,mood,degree,domain,ln,frame,subjref,referent
73513,n44013046027,ACT 13:46!27,,adj,,of eternal,αἰωνίου,,αἰώνιος,αἰωνίου,166,A-GSF,,singular,feminine,genitive,,,,,67005.0,67.96,,,
1948,n40005020009,MAT 5:20!9,,det,,-,ἡ,,ὁ,ἡ,3588,T-NSF,,singular,feminine,nominative,,,,,92004.0,92.24,,,
43681,n42018008001,LUK 18:8!1,v,verb,,I say,λέγω,,λέγω,λέγω,3004,V-PAI-1S,first,singular,,,present,active,indicative,,33006.0,33.69,A0:n42018006004 A2:n42018008002,n42018006004,
48719,n42024032003,LUK 24:32!3,,prep,,to,πρὸς,,πρός,πρός,4314,PREP,,,,,,,,,90013.0,90.58,,,
75058,n44016015023,ACT 16:15!23,,conj,,And,καὶ,,καί,καί,2532,CONJ,,,,,,,,,89015.0,89.87,,,
137599,n66022013013,REV 22:13!13,,noun,common,Beginning,ἀρχὴ,,ἀρχή,ἀρχή,746,N-NSF,,singular,feminine,nominative,,,,,67003.0,67.65,,,
60024,n43014016004,JHN 14:16!4,,noun,common,Father,Πατέρα,,πατήρ,Πατέρα,3962,N-ASM,,singular,masculine,accusative,,,,,12001.0,12.12,,,
120166,n58013012009,HEB 13:12!9,,noun,common,blood,αἵματος,,αἷμα,αἵματος,129,N-GSN,,singular,neuter,genitive,,,,,8002.0,8.64,,,
94874,n46012022006,1CO 12:22!6,,noun,common,members,μέλη,,μέλος,μέλη,3196,N-NPN,,plural,neuter,nominative,,,,,8002.0,8.9,,,
10081,n40017019011,MAT 17:19!11,,pron,interrogative,why,τί,,τίς,τί,5101,I-ASN,,singular,neuter,accusative,,,,,,,,,
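
In the sampled rows above, each `xml:id` seems to encode the same location as `ref`: `n44028019006` sits beside `ACT 28:19!6`, which suggests two digits for the book followed by three each for chapter, verse, and word position. A rough sketch under that assumption (inferred from the samples, not from documentation); `split_word_id` is a hypothetical helper:

```python
def split_word_id(word_id):
    """Break an id such as 'n44028019006' into numeric parts."""
    # Assumed layout: 'n' + 2-digit book + 3-digit chapter
    # + 3-digit verse + 3-digit word position.
    digits = word_id[1:]
    return {
        "book": int(digits[0:2]),
        "chapter": int(digits[2:5]),
        "verse": int(digits[5:8]),
        "word": int(digits[8:11]),
    }


location = split_word_id("n44028019006")
print(location)
```

Because `frame`, `subjref`, and `referent` hold ids in the same shape, a helper like this could also be used to see which verse a referenced word belongs to.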
