# Aesop: Greek and Portuguese




This notebook takes a plain-text file containing the text of Aesop, *Fabulae*, 1–17, in the Greek edition of Helm (1872), and a new Portuguese translation by M.C. Dezotti (2020), and transforms it into a canonically-citable, CITE-compliant digital library serialized into [CEX format](http://cite-architecture.org/citedx/CEX-spec-3.0.1).

**This is not a generic script!** The input file is clean and well-structured plain-text, but in an idiosyncratic format. Because it is well-structured, we can work with it. Because it is idiosyncratic, this is an exercise in *some techniques* for moving legacy data into CEX.

## Configuring CITE libraries for almond kernel

First, we'll make a bintray repository with CITE libraries available to your almond kernel.

In [47]:
val myBT = coursierapi.MavenRepository.of("https://dl.bintray.com/neelsmith/maven")
interp.repositories() ++= Seq(myBT)

[36mmyBT[39m: [32mcoursierapi[39m.[32mMavenRepository[39m = MavenRepository(https://dl.bintray.com/neelsmith/maven)

Next, we bring in specific libraries from the new repository using almond's `$ivy` magic:

In [48]:
import $ivy.`edu.holycross.shot::ohco2:10.16.0`
import $ivy.`edu.holycross.shot.cite::xcite:4.1.1`
import $ivy.`edu.holycross.shot::scm:7.2.0`
import $ivy.`edu.holycross.shot::dse:5.2.2`
import $ivy.`edu.holycross.shot::citebinaryimage:3.1.1`
import $ivy.`edu.holycross.shot::citeobj:7.3.4`
import $ivy.`edu.holycross.shot::citerelations:2.5.2`
import $ivy.`edu.holycross.shot::cex:6.3.3`


[32mimport [39m[36m$ivy.$                                  
[39m
[32mimport [39m[36m$ivy.$                                     
[39m
[32mimport [39m[36m$ivy.$                              
[39m
[32mimport [39m[36m$ivy.$                              
[39m
[32mimport [39m[36m$ivy.$                                          
[39m
[32mimport [39m[36m$ivy.$                                  
[39m
[32mimport [39m[36m$ivy.$                                        
[39m
[32mimport [39m[36m$ivy.$                              
[39m

## Imports

From this point on, your notebook consists of completely generic Scala, with the CITE Libraries available to use.

In [49]:
// Import some CITE libraries
import edu.holycross.shot.cite._
import edu.holycross.shot.ohco2._
import edu.holycross.shot.scm._
import edu.holycross.shot.citeobj._
import edu.holycross.shot.citerelation._
import edu.holycross.shot.dse._
import edu.holycross.shot.citebinaryimage._
import edu.holycross.shot.ohco2._

import almond.display.UpdatableDisplay
import almond.interpreter.api.DisplayData.ContentType
import almond.interpreter.api.{DisplayData, OutputHandler}

import java.io.File
import java.io.PrintWriter

import scala.io.Source


[32mimport [39m[36medu.holycross.shot.cite._
[39m
[32mimport [39m[36medu.holycross.shot.ohco2._
[39m
[32mimport [39m[36medu.holycross.shot.scm._
[39m
[32mimport [39m[36medu.holycross.shot.citeobj._
[39m
[32mimport [39m[36medu.holycross.shot.citerelation._
[39m
[32mimport [39m[36medu.holycross.shot.dse._
[39m
[32mimport [39m[36medu.holycross.shot.citebinaryimage._
[39m
[32mimport [39m[36medu.holycross.shot.ohco2._

[39m
[32mimport [39m[36malmond.display.UpdatableDisplay
[39m
[32mimport [39m[36malmond.interpreter.api.DisplayData.ContentType
[39m
[32mimport [39m[36malmond.interpreter.api.{DisplayData, OutputHandler}

[39m
[32mimport [39m[36mjava.io.File
[39m
[32mimport [39m[36mjava.io.PrintWriter

[39m
[32mimport [39m[36mscala.io.Source
[39m

## Useful Functions

A function for saving a String to a file.

In [50]:
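def saveString(s: String, filePath: String = "", fileName: String = "temp.txt"): Unit = {
    // filePath is prepended as-is, so it must carry its own trailing "/" (e.g. "cex/")
    val writer = new PrintWriter(new File(s"${filePath}${fileName}"))
    try writer.write(s)
    finally writer.close() // close the file even if the write fails
}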

defined [32mfunction[39m [36msaveString[39m

A function to pretty-print lists and OHCO2 corpora.

In [51]:
def showMe(v: Any): Unit = {
  v match {
    // Histogram: one "count<TAB>string" row per entry
    case h: StringHistogram =>
      for (entry <- h.histogram) println(s"${entry.count}\t${entry.s}")
    // Corpus: passage reference, then the text of each node
    case c: Corpus =>
      for (n <- c.nodes) println(s"${n.urn.passageComponent}\t\t${n.text}")
    // Any other collection (Vectors included): one element per line
    case xs: Iterable[_] => println(s"""\n----\n${xs.mkString("\n")}\n----\n""")
    case _ => println(s"\n-----\n${v}\n----\n")
  }
}

defined [32mfunction[39m [36mshowMe[39m

## Load the Source File

Load the plain-text source file, split it into lines, and drop the empty ones:

In [52]:
val filePath = s"txt/aesop.txt"
val allLines: Vector[String] = {
    scala.io.Source.fromFile(filePath).mkString.split("\n").toVector.filter( _.size > 0 )
}

[36mfilePath[39m: [32mString[39m = [32m"txt/aesop.txt"[39m
[36mallLines[39m: [32mVector[39m[[32mString[39m] = [33mVector[39m(
  [32m"1. \u1f08\u03b3\u03b1\u03b8\u1f70 \u03ba\u03b1\u1f76 \u039a\u03b1\u03ba\u03ac Os bens e os males "[39m,
  [32m"\u1f08\u03b3\u03b1\u03b8\u1f70 \u03c0\u03ac\u03bd\u03c4\u03b1 \u1f51\u03c0\u1f78 \u03c4\u1ff6\u03bd \u03ba\u03b1\u03ba\u1ff6\u03bd \u1f10\u03b4\u03b9\u03ce\u03c7\u03b8\u03b7, \u1f61\u03c2 \u1f00\u03c3\u03b8\u03b5\u03bd\u1fc6 \u1f44\u03bd\u03c4\u03b1\u00b7 \u03b5\u1f30\u03c2 \u03bf\u1f50\u03c1\u03b1\u03bd\u1f78\u03bd \u03b4\u1f72 \u1f00\u03bd\u1fc6\u03bb\u03b8\u03bf\u03bd. \u039a\u03b1\u1f76 \u03c4\u1f00\u03b3\u03b1\u03b8\u1f70 \u1f20\u03c1\u03ce\u03c4\u03b7\u03c3\u03b1\u03bd \u03c4\u1f78\u03bd \u0394\u03af\u03b1, \u03c0\u1ff6\u03c2 \u03b5\u1f36\u03bd\u03b1\u03b9 \u03b4\u03b5\u1fd6 \u03bc\u03b5\u03c4\u1f70 \u1f00\u03bd\u03b8\u03c1\u03ce\u03c0\u03c9\u03bd. \u1f49 \u03b4\u1f72 \u03b5\u1f36\u03c0\u03b5, \u03bc\u1f74 \u03bc\u03b5\u03c

## Parse Data

We define a custom Class that is String + Index:

In [53]:
case class IndexedLine( text: String, index: Int)

defined [32mclass[39m [36mIndexedLine[39m

We want to separate heading-lines from the content-lines.

First, we attach an index number to each line of the text; the index stays with its line and will be useful later.
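
As a minimal sketch of this indexing step on made-up data (the `SampleLine` class and the sample strings are illustrative stand-ins, not part of the Aesop file), `zipWithIndex` pairs each element with its zero-based position:

```scala
// Hypothetical stand-in for the notebook's IndexedLine class.
case class SampleLine(text: String, index: Int)

// Invented sample lines: a heading, two content lines, and another heading.
val sampleLines: Vector[String] =
  Vector("1. Alpha One", "greek text", "portuguese text", "2. Beta Two")

// zipWithIndex yields (element, position) tuples; we wrap each in a SampleLine.
val sampleIndexed: Vector[SampleLine] =
  sampleLines.zipWithIndex.map { case (text, i) => SampleLine(text, i) }
```

Because the index travels with the line, a later filter can discard lines without losing track of where each survivor sat in the original sequence.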

In [54]:
val indexedLines: Vector[IndexedLine] = allLines.zipWithIndex.map ( l => {
    IndexedLine( l._1, l._2 )
})

[36mindexedLines[39m: [32mVector[39m[[32mIndexedLine[39m] = [33mVector[39m(
  [33mIndexedLine[39m(
    [32m"1. \u1f08\u03b3\u03b1\u03b8\u1f70 \u03ba\u03b1\u1f76 \u039a\u03b1\u03ba\u03ac Os bens e os males "[39m,
    [32m0[39m
  ),
  [33mIndexedLine[39m(
    [32m"\u1f08\u03b3\u03b1\u03b8\u1f70 \u03c0\u03ac\u03bd\u03c4\u03b1 \u1f51\u03c0\u1f78 \u03c4\u1ff6\u03bd \u03ba\u03b1\u03ba\u1ff6\u03bd \u1f10\u03b4\u03b9\u03ce\u03c7\u03b8\u03b7, \u1f61\u03c2 \u1f00\u03c3\u03b8\u03b5\u03bd\u1fc6 \u1f44\u03bd\u03c4\u03b1\u00b7 \u03b5\u1f30\u03c2 \u03bf\u1f50\u03c1\u03b1\u03bd\u1f78\u03bd \u03b4\u1f72 \u1f00\u03bd\u1fc6\u03bb\u03b8\u03bf\u03bd. \u039a\u03b1\u1f76 \u03c4\u1f00\u03b3\u03b1\u03b8\u1f70 \u1f20\u03c1\u03ce\u03c4\u03b7\u03c3\u03b1\u03bd \u03c4\u1f78\u03bd \u0394\u03af\u03b1, \u03c0\u1ff6\u03c2 \u03b5\u1f36\u03bd\u03b1\u03b9 \u03b4\u03b5\u1fd6 \u03bc\u03b5\u03c4\u1f70 \u1f00\u03bd\u03b8\u03c1\u03ce\u03c0\u03c9\u03bd. \u1f49 \u03b4\u1f72 \u03b5\u1f36\u03c0\u03b5, \u03bc\u1f7

We want to pull out just the lines that are headings. We start with a Regular Expression pattern that (we happen to know) will match all of these lines: lines beginning with Arabic numerals are our headings.

In [55]:
val pattern = "^[0-9]".r // note that .r after a String makes it into a RegEx

[36mpattern[39m: [32mscala[39m.[32mutil[39m.[32mmatching[39m.[32mRegex[39m = ^[0-9]

Now we use that regular expression, `pattern`, as a filter to get a Vector of just our heading-lines.
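
The pattern only checks that a line starts with a digit, which is enough for this particular file. As a hedged sketch on invented sample lines, here is how such a filter behaves next to a stricter variant that also requires the period after the number. The names `looseRx`, `strictRx`, and `demoLines` are assumptions for illustration and are not used elsewhere in this notebook.

```scala
// Same idea as the notebook's pattern: any line starting with a digit.
val looseRx = "^[0-9]".r

// Stricter (hypothetical) variant: digits followed by a period, as in "12. ...".
val strictRx = """^[0-9]+\.""".r

// Invented sample lines; the last one is body text that happens to start with a digit.
val demoLines = Vector("1. Title", "some body text", "17. Another", "2020 was a year")

// findFirstIn returns an Option[String]; isDefined tells us whether anything matched.
val looseHits = demoLines.filter(l => looseRx.findFirstIn(l).isDefined)
val strictHits = demoLines.filter(l => strictRx.findFirstIn(l).isDefined)
```

On this sample, the loose pattern also keeps the body line beginning with "2020", while the strict one keeps only the two numbered headings.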

In [56]:
val headingLines: Vector[IndexedLine] = indexedLines.filter( l => {
    pattern.findAllIn(l.text).size > 0
})

[36mheadingLines[39m: [32mVector[39m[[32mIndexedLine[39m] = [33mVector[39m(
  [33mIndexedLine[39m(
    [32m"1. \u1f08\u03b3\u03b1\u03b8\u1f70 \u03ba\u03b1\u1f76 \u039a\u03b1\u03ba\u03ac Os bens e os males "[39m,
    [32m0[39m
  ),
  [33mIndexedLine[39m(
    [32m"2. \u1f08\u03b3\u03b1\u03bb\u03bc\u03b1\u03c4\u03bf\u03c0\u03ce\u03bb\u03b7\u03c2 O vendedor de est\u00e1tuas "[39m,
    [32m3[39m
  ),
  [33mIndexedLine[39m(
    [32m"3. \u1f0c\u03b3\u03c1\u03bf\u03b9\u03ba\u03bf\u03c2 \u03ba\u03b1\u1f76 \u1f48\u03bd\u03ac\u03c1\u03b9\u03b1 O campon\u00eas e os burrinhos "[39m,
    [32m6[39m
  ),
  [33mIndexedLine[39m([32m"4. \u1f08\u03b5\u03c4\u03cc\u03c2 A \u00e1guia "[39m, [32m9[39m),
  [33mIndexedLine[39m(
    [32m"5. \u1f08\u03b5\u03c4\u1f78\u03c2 \u03ba\u03b1\u1f76 \u1f08\u03bb\u03ce\u03c0\u03b7\u03be A \u00e1guia e a raposa "[39m,
    [32m12[39m
  ),
  [33mIndexedLine[39m(
    [32m"6. \u1f08\u03b5\u03c4\u1f78\u03c2 \u03ba\u03b1\u1f76 \u1f0c\u03bd

#### Group Text

We want to group our text by section. The procedure will be:

- Identify the index number of one heading.
- Identify the index number of the *next* heading.
- Get all lines that fall between the two.
- Attach them to the first heading.

Scala's [`.sliding`](http://daily-scala.blogspot.com/2009/11/iteratorsliding.html) method is ideal for this. It will group all the headings into pairs.

Below, `headingPairs` is a Vector of Vectors of IndexedLine objects. Each inner Vector has two IndexedLines, each one a heading. The first pair consists of the first heading and the second; the second pair consists of the *second* heading (again) and the third.
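
To see the windowing on its own, here is a small sketch with invented values (`letters`, `letterPairs`, and `singleWindow` are illustrative names):

```scala
val letters = Vector("a", "b", "c", "d")

// Windows of size 2, advancing by 1: each element (except the ends) appears twice.
val letterPairs: Vector[Vector[String]] = letters.sliding(2, 1).toVector

// With fewer elements than the window size, sliding yields one short window.
val singleWindow: Vector[Vector[String]] = Vector("a").sliding(2, 1).toVector
```

The second case matters for any code that assumes every window holds exactly two headings: an input with a single heading would produce one window of length one.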

In [57]:
val headingPairs: Vector[Vector[IndexedLine]] = headingLines.sliding(2,1).toVector

[36mheadingPairs[39m: [32mVector[39m[[32mVector[39m[[32mIndexedLine[39m]] = [33mVector[39m(
  [33mVector[39m(
    [33mIndexedLine[39m(
      [32m"1. \u1f08\u03b3\u03b1\u03b8\u1f70 \u03ba\u03b1\u1f76 \u039a\u03b1\u03ba\u03ac Os bens e os males "[39m,
      [32m0[39m
    ),
    [33mIndexedLine[39m(
      [32m"2. \u1f08\u03b3\u03b1\u03bb\u03bc\u03b1\u03c4\u03bf\u03c0\u03ce\u03bb\u03b7\u03c2 O vendedor de est\u00e1tuas "[39m,
      [32m3[39m
    )
  ),
  [33mVector[39m(
    [33mIndexedLine[39m(
      [32m"2. \u1f08\u03b3\u03b1\u03bb\u03bc\u03b1\u03c4\u03bf\u03c0\u03ce\u03bb\u03b7\u03c2 O vendedor de est\u00e1tuas "[39m,
      [32m3[39m
    ),
    [33mIndexedLine[39m(
      [32m"3. \u1f0c\u03b3\u03c1\u03bf\u03b9\u03ba\u03bf\u03c2 \u03ba\u03b1\u1f76 \u1f48\u03bd\u03ac\u03c1\u03b9\u03b1 O campon\u00eas e os burrinhos "[39m,
      [32m6[39m
    )
  ),
  [33mVector[39m(
    [33mIndexedLine[39m(
      [32m"3. \u1f0c\u03b3\u03c1\u03bf\u03b9\u03ba\u03bf\u

We can map this Vector of pairs and get all the chapters except the last one. For the last one, we need a variant. 

> In other programming idioms, we would iterate through the pairs, with a check, each time, to see if we were on the last one, or beyond the last one. In Scala's Functional Programming Idiom, we "do something to everything", and know in advance that this will not include the last section, and treat that differently. This helps avoid "off by one" errors, among other things.
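
One functional variant of the same idea, sketched here on invented data, appends a sentinel index marking the end of the data, so that the last section needs no special case. The names `Ln`, `lns`, `heads`, `bounds`, and `grouped` are illustrative assumptions; this is an alternative sketch, not the approach taken in the cell below.

```scala
// Hypothetical stand-in for IndexedLine.
case class Ln(text: String, index: Int)

// Invented sample: two sections, each with a heading and two content lines.
val lns: Vector[Ln] = Vector(
  "1. First", "grc one", "por one",
  "2. Second", "grc two", "por two"
).zipWithIndex.map { case (t, i) => Ln(t, i) }

// Headings are lines that start with digits and a period.
val heads: Vector[Ln] = lns.filter(_.text.matches("""^[0-9]+\..*"""))

// Start index of every heading, plus a sentinel one past the last line.
val bounds: Vector[Int] = heads.map(_.index) :+ lns.size

// Pair each heading with the index where its section ends, then slice.
val grouped: Vector[(Ln, Vector[Ln])] =
  heads.zip(bounds.tail).map { case (h, end) =>
    (h, lns.slice(h.index + 1, end))
  }
```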

In [58]:
val mappedHeadings: Vector[( IndexedLine, Vector[IndexedLine])] = {
    
    // We use up all the pairs…
    val allButLast: Vector[( IndexedLine, Vector[IndexedLine])] = headingPairs.map( p => {
        val firstIndex: Int = p.head.index
        val lastIndex: Int = p.last.index
        val firstLine: IndexedLine = indexedLines(firstIndex)
        val allLines: Vector[IndexedLine] = indexedLines.filter( il => {
            ( il.index > firstIndex) && ( il.index < lastIndex )
        })
        ( firstLine, allLines )
    })
    
    // We go get the last section, which we know was not included…
    val lastSection: Vector[( IndexedLine, Vector[IndexedLine])] = {
        val firstIndex: Int = headingPairs.last.last.index
        val firstLine: IndexedLine = indexedLines(firstIndex)
        val allLines: Vector[IndexedLine] = indexedLines.filter( il => {
            ( il.index > firstIndex) 
        })
        val tup = ( firstLine, allLines )
        Vector[( IndexedLine, Vector[IndexedLine])](tup)
    }
    
    // We concatenate the two Vectors…
    allButLast ++ lastSection
}

[36mmappedHeadings[39m: [32mVector[39m[([32mIndexedLine[39m, [32mVector[39m[[32mIndexedLine[39m])] = [33mVector[39m(
  (
    [33mIndexedLine[39m(
      [32m"1. \u1f08\u03b3\u03b1\u03b8\u1f70 \u03ba\u03b1\u1f76 \u039a\u03b1\u03ba\u03ac Os bens e os males "[39m,
      [32m0[39m
    ),
    [33mVector[39m(
      [33mIndexedLine[39m(
        [32m"\u1f08\u03b3\u03b1\u03b8\u1f70 \u03c0\u03ac\u03bd\u03c4\u03b1 \u1f51\u03c0\u1f78 \u03c4\u1ff6\u03bd \u03ba\u03b1\u03ba\u1ff6\u03bd \u1f10\u03b4\u03b9\u03ce\u03c7\u03b8\u03b7, \u1f61\u03c2 \u1f00\u03c3\u03b8\u03b5\u03bd\u1fc6 \u1f44\u03bd\u03c4\u03b1\u00b7 \u03b5\u1f30\u03c2 \u03bf\u1f50\u03c1\u03b1\u03bd\u1f78\u03bd \u03b4\u1f72 \u1f00\u03bd\u1fc6\u03bb\u03b8\u03bf\u03bd. \u039a\u03b1\u1f76 \u03c4\u1f00\u03b3\u03b1\u03b8\u1f70 \u1f20\u03c1\u03ce\u03c4\u03b7\u03c3\u03b1\u03bd \u03c4\u1f78\u03bd \u0394\u03af\u03b1, \u03c0\u1ff6\u03c2 \u03b5\u1f36\u03bd\u03b1\u03b9 \u03b4\u03b5\u1fd6 \u03bc\u03b5\u03c4\u1f70 \u1f00\u03bd\u03b8\u

### A Useful Function for Title Lines

The title-line of this text consists of:

- An Arabic number (1–17), followed by a period.
- A Greek title
- The Portuguese title

In XML, *vel sim.*, all of these would be wrapped in some kind of markup. They are not, here, but we can still work with these three discrete sets of data, because the plain-text is clean and predictable.

We *could* do this in-line, but it is easier to see, and test, if we pull it out into a defined Function.

We grab the Heading-number (which we turn into a String, because it is merely a *label*), using a Regular Expression.

To split the Greek title from the Portuguese title, we do the following:

- Grab the chapter-label (some Arabic numerals) with a Regex
- Remove the chapter-label (and following period '.') before further processing: this is the String `val` called `twoTitles`
- Turn that into a Vector of `Char`.
- Filter out everything except `[A-Z]` (we know that the Greek title is first, and the Portuguese title begins with an upper-case Latin letter).
- The first element in the resulting list will be the start of the Portuguese title.
- Using Scala's [`.indexOf`](https://www.geeksforgeeks.org/scala-string-indexof-method-with-example/) method, we get the index of the first occurrence of the first `Char` of the Portuguese title in the `twoTitles` String.
- Using `.take` we grab the Greek title.
- Using `.takeRight` and some arithmetic we grab the Portuguese title.

The result will be a "3ple" (3-tuple) of Strings: chapter-label, Greek title, Portuguese title.
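
As a cross-check on the steps listed above, the same three-way split can be sketched with a single regular expression and capture groups. This is an alternative sketch, not the function this notebook uses; `TitleLine` and `splitTitleRx` are hypothetical names. It shares the same assumption: the Portuguese title starts with an unaccented upper-case Latin letter, so a title beginning with an accented capital would be split in the wrong place.

```scala
// Group 1: the number; group 2: the Greek title (lazy); group 3: everything
// from the first upper-case Latin letter that follows whitespace.
val TitleLine = """^(\d+)\.\s*(.*?)\s+([A-Z].*)$""".r

// Returns None instead of throwing when a line does not fit the expected shape.
def splitTitleRx(line: String): Option[(String, String, String)] =
  line.trim match {
    case TitleLine(id, grc, por) => Some((id, grc, por))
    case _                       => None
  }

val okSplit = splitTitleRx("12. αβγδ ABCD")
val noSplit = splitTitleRx("no number here")
```

Returning an `Option` makes a malformed heading visible as `None` rather than as an exception from `.head` on an empty collection.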

In [59]:
def splitTitle( testString: String ): (String, String, String) = {
    
    val chapterId: String = {
        val rx = "^[0-9]+".r
        val foundOption: Option[String] = rx.findFirstIn(testString)
        foundOption.getOrElse("NO_ID")
        
    }
    
    val twoTitles: String = testString.replaceAll("""^[0-9]+\.""", "").trim
    
    val charVec = twoTitles.toVector
    val filteredVec = charVec.filter( c => {
        val s = c.toString
        val rpl = s.replaceAll("[A-Z]", "")
        rpl == ""
    })
    val firstChar: Char = filteredVec.head.toChar
    val firstPorIndex: Int = charVec.indexOf(firstChar)
    val greekTitle: String = twoTitles.take(firstPorIndex - 1)
    val porTitle: String = twoTitles.takeRight( twoTitles.size - firstPorIndex )
    
    (chapterId, greekTitle, porTitle)
}

splitTitle("12. αβγδ ABCD")

defined [32mfunction[39m [36msplitTitle[39m
[36mres58_1[39m: ([32mString[39m, [32mString[39m, [32mString[39m) = ([32m"12"[39m, [32m"\u03b1\u03b2\u03b3\u03b4"[39m, [32m"ABCD"[39m)

### Make a CEX File!

We can make two CEX blocks, one for Greek and one for Portuguese. We happen to know that, for each section, there is a header-line, a Greek section (one line), and a Portuguese section (one line). 

**So this is not a generic script!** It only works with this file!
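
Each line in a text block pairs a passage-level URN with its text, separated by `#`. A plain-string sketch of that shape, without the CITE libraries, might look like this. The helper `cexLine` and the placeholder text are assumptions for illustration only; the `require` reflects the assumption that a stray `#` in the text would be read as an extra column delimiter.

```scala
// Hypothetical helper: build one "<urn>#<text>" line from plain strings.
def cexLine(versionUrn: String, passage: String, text: String): String = {
  // Assumption: "#" is the column delimiter, so it must not occur in the text.
  require(!text.contains("#"), "text must not contain the '#' delimiter")
  s"${versionUrn}${passage}#${text}"
}

// A version-level URN ends with ":"; the passage reference is appended after it.
val demoVersion = "urn:cts:greekLit:tlg0096.tlg002.mcdezotti:"

val demoBlock: Vector[String] = Vector(
  cexLine(demoVersion, "1.head", "Os bens e os males"),
  cexLine(demoVersion, "1.text", "(placeholder for the fable text)")
)
```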

First we define our URN-base:

In [60]:
val urnBase = CtsUrn("urn:cts:greekLit:tlg0096.tlg002:")

[36murnBase[39m: [32mCtsUrn[39m = [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002:"[39m)

We make a CEX block for Greek first…

In [61]:
val greekBlock: Vector[String] = mappedHeadings.map( h => {
    val heading: IndexedLine = h._1
    val section: IndexedLine = h._2.head
    val splitHeading = splitTitle(heading.text)
    val sectionId = splitHeading._1
    val sectionHeading = splitHeading._2
    val versionUrn = urnBase.addVersion("First1K-grc1")
    Vector(
        s"${versionUrn}${sectionId}.head#${sectionHeading}",
        s"${versionUrn}${sectionId}.text#${section.text}"
    )
}).flatten

[36mgreekBlock[39m: [32mVector[39m[[32mString[39m] = [33mVector[39m(
  [32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:1.head#\u1f08\u03b3\u03b1\u03b8\u1f70 \u03ba\u03b1\u1f76 \u039a\u03b1\u03ba\u03ac"[39m,
  [32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:1.text#\u1f08\u03b3\u03b1\u03b8\u1f70 \u03c0\u03ac\u03bd\u03c4\u03b1 \u1f51\u03c0\u1f78 \u03c4\u1ff6\u03bd \u03ba\u03b1\u03ba\u1ff6\u03bd \u1f10\u03b4\u03b9\u03ce\u03c7\u03b8\u03b7, \u1f61\u03c2 \u1f00\u03c3\u03b8\u03b5\u03bd\u1fc6 \u1f44\u03bd\u03c4\u03b1\u00b7 \u03b5\u1f30\u03c2 \u03bf\u1f50\u03c1\u03b1\u03bd\u1f78\u03bd \u03b4\u1f72 \u1f00\u03bd\u1fc6\u03bb\u03b8\u03bf\u03bd. \u039a\u03b1\u1f76 \u03c4\u1f00\u03b3\u03b1\u03b8\u1f70 \u1f20\u03c1\u03ce\u03c4\u03b7\u03c3\u03b1\u03bd \u03c4\u1f78\u03bd \u0394\u03af\u03b1, \u03c0\u1ff6\u03c2 \u03b5\u1f36\u03bd\u03b1\u03b9 \u03b4\u03b5\u1fd6 \u03bc\u03b5\u03c4\u1f70 \u1f00\u03bd\u03b8\u03c1\u03ce\u03c0\u03c9\u03bd. \u1f49 \u03b4\u1f72 \u03b5\u1f36\u03c0\u03b5, \u03bc\u1f74

Now a CEX block for Portuguese…

In [62]:
val portBlock: Vector[String] = mappedHeadings.map( h => {
    val heading: IndexedLine = h._1
    val section: IndexedLine = h._2.last
    val splitHeading = splitTitle(heading.text)
    val sectionId = splitHeading._1
    val sectionHeading = splitHeading._3
    val versionUrn = urnBase.addVersion("mcdezotti")
    Vector(
        s"${versionUrn}${sectionId}.head#${sectionHeading}",
        s"${versionUrn}${sectionId}.text#${section.text}"
    )
}).flatten

[36mportBlock[39m: [32mVector[39m[[32mString[39m] = [33mVector[39m(
  [32m"urn:cts:greekLit:tlg0096.tlg002.mcdezotti:1.head#Os bens e os males"[39m,
  [32m"urn:cts:greekLit:tlg0096.tlg002.mcdezotti:1.text#Os bens todos, por serem fr\u00e1geis, foram perseguidos pelos males. Ent\u00e3o, subiram ao c\u00e9u. E os bens perguntaram a Zeus como deviam comportar-se entre os homens. Ele ent\u00e3o falou para se acercarem dos homens, n\u00e3o todos em conjunto, mas um de cada vez. Por isso os males constantemente se acercam dos homens porque est\u00e3o por perto, enquanto os bens descem do c\u00e9u mais devagar. A f\u00e1bula mostra que ningu\u00e9m depara rapidamente com um bem, mas pelos males cada pessoa \u00e9 a cada momento atingida."[39m,
  [32m"urn:cts:greekLit:tlg0096.tlg002.mcdezotti:2.head#O vendedor de est\u00e1tuas"[39m,
  [32m"urn:cts:greekLit:tlg0096.tlg002.mcdezotti:2.text#Um homem fabricou um Hermes de madeira e o exp\u00f4s, tentando vender.  Como nenhum comprad

**Final Assembly**

We need to add the `#!ctsdata` header before each block, and of course the overall CEX header and CTS Catalog, which are conveniently saved in a separate template file.

> Concatenating, appending, and prepending things to Vectors in Scala is flexible, but the syntax is hard to remember. [This site](https://alvinalexander.com/scala/how-to-append-prepend-items-vector-seq-in-scala) is a handy reference.
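
A quick sketch of the three operators on invented values (all names here are illustrative). One way to remember them: the colon always sits on the side of the collection.

```scala
val body = Vector("b", "c")

// Prepend one element: the element is on the left, the colon faces the Vector.
val prepended = "a" +: body

// Append one element: the Vector is on the left, the element on the right.
val appended = body :+ "d"

// Concatenate two collections.
val joined = prepended ++ Vector("d", "e")
```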

First, we load the CEX Header:

In [63]:
val filePath = s"template/aesop_cex_header.txt"
val cexHeader: String = {
    scala.io.Source.fromFile(filePath).mkString.split("\n").toVector.filter( _.size > 0 ).mkString("\n")
}

[36mfilePath[39m: [32mString[39m = [32m"template/aesop_cex_header.txt"[39m
[36mcexHeader[39m: [32mString[39m = [32m"""#!cexversion
3.0
#!citelibrary
name#CEX library
urn#urn:cite2:cex:unesp_fu.v1:temp2
license#CC 3.0 NC-BY
#!ctscatalog
urn#citationScheme#groupName#workTitle#versionLabel#exemplarLabel#online#lang
urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:#fable#Aesop#Fabulae#Helm, Teubner, 1872##true#grc
urn:cts:greekLit:tlg0096.tlg002.mcdezotti:#fable#Aesop#Fabulae#M.C. Dezotti, trans., 2020##true#por"""[39m

Now give our blocks their proper headers:

In [64]:
val greekCex: String = {
    ( "#!ctsdata" +: greekBlock ).mkString("\n")
}

[36mgreekCex[39m: [32mString[39m = [32m"""#!ctsdata
urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:1.head#Ἀγαθὰ καὶ Κακά
urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:1.text#Ἀγαθὰ πάντα ὑπὸ τῶν κακῶν ἐδιώχθη, ὡς ἀσθενῆ ὄντα· εἰς οὐρανὸν δὲ ἀνῆλθον. Καὶ τἀγαθὰ ἠρώτησαν τὸν Δία, πῶς εἶναι δεῖ μετὰ ἀνθρώπων. Ὁ δὲ εἶπε, μὴ μετʼ ἀλλήλων πάντα, ἓν δὲ καθʼ ἓν τοῖς ἀνθρώποις ἐπέρχεσθαι. Διὰ τοῦτο τὰ μὲν κακὰ συνεχῆ τοῖς ἀνθρώποις, ὡς πλησίον ὄντα, ἐπέρχεται, τὰ δὲ ἀγαθὰ βράδιον ἐξ οὐρανοῦ κάτεισι. Ὁ λόγος δηλοῖ, ὅτι ἀγαθῷ μὲν οὐδεὶς ταχέως ἐπιτυγχάνει, ὑπὸ δὲ τῶν κακῶν ἕκαστος καθʼ ἑκάστην πλήττεται.
urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:2.head#Ἀγαλματοπώλης
urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:2.text#Ξύλινόν τις Ἑρμῆν κατασκευάσας, προσενεγκὼν ἐπώλει. Μηδενὸς δὲ ὠνητοῦ προσιόντος, ἐκκαλέσασθαί τινας βουλόμενος ἐβόα, ὡς ἀγαθοποιὸν δαίμονα καὶ κέρδους τηρητικὸν πιπράσκει. Τῶν δὲ παρατυχόντων τινὸς εἰπόντος πρὸς αὐτόν· ,,ὦ οὗτος, καὶ τί τοῦτον ὄντα τοιοῦτον πωλεῖς, δέον τῶν παρʼ

In [65]:
val portCex: String = {
    ( "#!ctsdata" +: portBlock ).mkString("\n")
}

[36mportCex[39m: [32mString[39m = [32m"""#!ctsdata
urn:cts:greekLit:tlg0096.tlg002.mcdezotti:1.head#Os bens e os males
urn:cts:greekLit:tlg0096.tlg002.mcdezotti:1.text#Os bens todos, por serem frágeis, foram perseguidos pelos males. Então, subiram ao céu. E os bens perguntaram a Zeus como deviam comportar-se entre os homens. Ele então falou para se acercarem dos homens, não todos em conjunto, mas um de cada vez. Por isso os males constantemente se acercam dos homens porque estão por perto, enquanto os bens descem do céu mais devagar. A fábula mostra que ninguém depara rapidamente com um bem, mas pelos males cada pessoa é a cada momento atingida.
urn:cts:greekLit:tlg0096.tlg002.mcdezotti:2.head#O vendedor de estátuas
urn:cts:greekLit:tlg0096.tlg002.mcdezotti:2.text#Um homem fabricou um Hermes de madeira e o expôs, tentando vender.  Como nenhum comprador se aproximava, ele, querendo atrair alguns, pôs-se a gritar que estava vendendo um deus benfeitor e guardião do lucro. Então uma d

Put the whole thing together:

In [66]:
val aesopCex: String = {
    cexHeader + "\n\n" + greekCex + "\n\n" + portCex + "\n"
}

[36maesopCex[39m: [32mString[39m = [32m"""#!cexversion
3.0
#!citelibrary
name#CEX library
urn#urn:cite2:cex:unesp_fu.v1:temp2
license#CC 3.0 NC-BY
#!ctscatalog
urn#citationScheme#groupName#workTitle#versionLabel#exemplarLabel#online#lang
urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:#fable#Aesop#Fabulae#Helm, Teubner, 1872##true#grc
urn:cts:greekLit:tlg0096.tlg002.mcdezotti:#fable#Aesop#Fabulae#M.C. Dezotti, trans., 2020##true#por

#!ctsdata
urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:1.head#Ἀγαθὰ καὶ Κακά
urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:1.text#Ἀγαθὰ πάντα ὑπὸ τῶν κακῶν ἐδιώχθη, ὡς ἀσθενῆ ὄντα· εἰς οὐρανὸν δὲ ἀνῆλθον. Καὶ τἀγαθὰ ἠρώτησαν τὸν Δία, πῶς εἶναι δεῖ μετὰ ἀνθρώπων. Ὁ δὲ εἶπε, μὴ μετʼ ἀλλήλων πάντα, ἓν δὲ καθʼ ἓν τοῖς ἀνθρώποις ἐπέρχεσθαι. Διὰ τοῦτο τὰ μὲν κακὰ συνεχῆ τοῖς ἀνθρώποις, ὡς πλησίον ὄντα, ἐπέρχεται, τὰ δὲ ἀγαθὰ βράδιον ἐξ οὐρανοῦ κάτεισι. Ὁ λόγος δηλοῖ, ὅτι ἀγαθῷ μὲν οὐδεὶς ταχέως ἐπιτυγχάνει, ὑπὸ δὲ τῶν κακῶν ἕκαστος καθʼ ἑκάστην πλήττεται.
urn:c

Save it…

In [67]:
saveString( aesopCex, "cex/", "aesop.cex")

## Test It!

### Load the Library

We can test the validity of our work by trying to load it into a [CiteLibrary](https://cite-architecture.github.io/cite-api-docs/).

In [68]:
val cexPath = "cex/aesop.cex"
val lib = CiteLibrary(scala.io.Source.fromFile(cexPath).mkString)

Jan 29, 2020 11:18:51 AM wvlet.log.Logger log
INFO: Building text repo from cex ...
Jan 29, 2020 11:18:51 AM wvlet.log.Logger log
INFO: Building collection repo from cex ...
Jan 29, 2020 11:18:51 AM wvlet.log.Logger log
INFO: Building relations from cex ...
Jan 29, 2020 11:18:51 AM wvlet.log.Logger log
INFO: All library components built.


[36mcexPath[39m: [32mString[39m = [32m"cex/aesop.cex"[39m
[36mlib[39m: [32mCiteLibrary[39m = [33mCiteLibrary[39m(
  [32m"CEX library"[39m,
  [33mCite2Urn[39m([32m"urn:cite2:cex:unesp_fu.v1:temp2"[39m),
  [32m"CC 3.0 NC-BY"[39m,
  [33mVector[39m(),
  [33mSome[39m(
    [33mTextRepository[39m(
      [33mCorpus[39m(
        [33mVector[39m(
          [33mCitableNode[39m(
            [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:1.head"[39m),
            [32m"\u1f08\u03b3\u03b1\u03b8\u1f70 \u03ba\u03b1\u1f76 \u039a\u03b1\u03ba\u03ac"[39m
          ),
          [33mCitableNode[39m(
            [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:1.text"[39m),
            [32m"\u1f08\u03b3\u03b1\u03b8\u1f70 \u03c0\u03ac\u03bd\u03c4\u03b1 \u1f51\u03c0\u1f78 \u03c4\u1ff6\u03bd \u03ba\u03b1\u03ba\u1ff6\u03bd \u1f10\u03b4\u03b9\u03ce\u03c7\u03b8\u03b7, \u1f61\u03c2 \u1f00\u03c3\u03b8\u03b5\u03bd\u1fc6 \u1f44\u03bd\u03c4\u0

If that worked (!??!), we can now try a little retrieval and analysis. 

A CITE Library has many possible components. The one we have just loaded is text-only, so let's assign its parts to values that are convenient to work with.

> A CiteLibrary possesses an `Option[TextRepository]`: there may or may not be a TextRepository in any given CiteLibrary, so the value of `lib.textRepository` is either a `Some[TextRepository]` or `None`. We can "get" the TR with `lib.textRepository.get`. If the value is actually `None`, this will throw an exception. But in that case something failed above, so there is no point doing elaborate checking.
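
For comparison, here is a small sketch of the two ways of handling an `Option`, using hypothetical stand-in classes (`DemoRepo` and `DemoLibrary` are not part of the CITE libraries):

```scala
// Hypothetical stand-ins for TextRepository and CiteLibrary.
case class DemoRepo(name: String)
case class DemoLibrary(repo: Option[DemoRepo])

val fullLib = DemoLibrary(Some(DemoRepo("texts")))
val emptyLib = DemoLibrary(None)

// Pattern matching handles both cases explicitly.
def describe(lib: DemoLibrary): String = lib.repo match {
  case Some(r) => s"Found repository: ${r.name}"
  case None    => "No repository in this library"
}

// .get is shorter and fine when a value is known to be present;
// emptyLib.repo.get would throw a NoSuchElementException.
val viaGet: DemoRepo = fullLib.repo.get
```

In a notebook like this one, where a missing repository means an earlier step has already failed, the direct `.get` is a reasonable shortcut.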

In [69]:
val tr: TextRepository = lib.textRepository.get // Go for it!

[36mtr[39m: [32mTextRepository[39m = [33mTextRepository[39m(
  [33mCorpus[39m(
    [33mVector[39m(
      [33mCitableNode[39m(
        [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:1.head"[39m),
        [32m"\u1f08\u03b3\u03b1\u03b8\u1f70 \u03ba\u03b1\u1f76 \u039a\u03b1\u03ba\u03ac"[39m
      ),
      [33mCitableNode[39m(
        [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:1.text"[39m),
        [32m"\u1f08\u03b3\u03b1\u03b8\u1f70 \u03c0\u03ac\u03bd\u03c4\u03b1 \u1f51\u03c0\u1f78 \u03c4\u1ff6\u03bd \u03ba\u03b1\u03ba\u1ff6\u03bd \u1f10\u03b4\u03b9\u03ce\u03c7\u03b8\u03b7, \u1f61\u03c2 \u1f00\u03c3\u03b8\u03b5\u03bd\u1fc6 \u1f44\u03bd\u03c4\u03b1\u00b7 \u03b5\u1f30\u03c2 \u03bf\u1f50\u03c1\u03b1\u03bd\u1f78\u03bd \u03b4\u1f72 \u1f00\u03bd\u1fc6\u03bb\u03b8\u03bf\u03bd. \u039a\u03b1\u1f76 \u03c4\u1f00\u03b3\u03b1\u03b8\u1f70 \u1f20\u03c1\u03ce\u03c4\u03b7\u03c3\u03b1\u03bd \u03c4\u1f78\u03bd \u0394\u03af\u03b1, \u03c0\u1ff6\u

A TextRepository **must** have both a `Catalog` and a `Corpus`. See [the API docs for the `OHCO2` library](https://cite-architecture.github.io/cite-api-docs/ohco2/api/edu/holycross/shot/ohco2/index.html).

In [70]:
val cat: Catalog = tr.catalog

val corp: Corpus = tr.corpus

[36mcat[39m: [32mCatalog[39m = [33mCatalog[39m(
  [33mVector[39m(
    [33mCatalogEntry[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:"[39m),
      [32m"fable"[39m,
      [32m"grc"[39m,
      [32m"Aesop"[39m,
      [32m"Fabulae"[39m,
      [33mSome[39m([32m"Helm, Teubner, 1872"[39m),
      [32mNone[39m,
      true
    ),
    [33mCatalogEntry[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.mcdezotti:"[39m),
      [32m"fable"[39m,
      [32m"por"[39m,
      [32m"Aesop"[39m,
      [32m"Fabulae"[39m,
      [33mSome[39m([32m"M.C. Dezotti, trans., 2020"[39m),
      [32mNone[39m,
      true
    )
  )
)
[36mcorp[39m: [32mCorpus[39m = [33mCorpus[39m(
  [33mVector[39m(
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:1.head"[39m),
      [32m"\u1f08\u03b3\u03b1\u03b8\u1f70 \u03ba\u03b1\u1f76 \u039a\u03b1\u03ba\u03ac"[39m
    ),
    [33mCitable

### Retrieval

For this exercise, we will define some URNs, and use them to retrieve passages of text. This will take advantage of the `showMe()` Function defined above.

In [71]:
// Urn to Aesop's Fabulae
val aesopUrn = CtsUrn("urn:cts:greekLit:tlg0096.tlg002:")

// Version ID for Greek
val greekVers = "First1K-grc1"

// Version ID for Portuguese
val portVers = "mcdezotti"

[36maesopUrn[39m: [32mCtsUrn[39m = [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002:"[39m)
[36mgreekVers[39m: [32mString[39m = [32m"First1K-grc1"[39m
[36mportVers[39m: [32mString[39m = [32m"mcdezotti"[39m

#### Retrieve Fables

One fable, in Greek:

In [72]:
val oneGreekCitation = aesopUrn.addVersion(greekVers).addPassage("3")

[36moneGreekCitation[39m: [32mCtsUrn[39m = [33mCtsUrn[39m(
  [32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:3"[39m
)

We use the `~~` method to retrieve a passage, based on a URN, from a Corpus.
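
The retrieval is hierarchical: a URN citing fable `3` brings back the nodes `3.head` and `3.text`. A toy model of that containment on plain passage strings is sketched below. It is an assumption-laden illustration of the idea, not the OHCO2 library's implementation; `passageContains` and `nodeRefs` are hypothetical names.

```scala
// Toy rule: a query contains a node if they are equal, or if the node's
// dot-separated reference starts with the query followed by a ".".
def passageContains(query: String, node: String): Boolean =
  node == query || node.startsWith(query + ".")

// Invented passage references; "30.head" checks that "3" does not match "30".
val nodeRefs = Vector("2.text", "3.head", "3.text", "4.head", "30.head")

val hits = nodeRefs.filter(n => passageContains("3", n))
```

Appending the `.` before comparing is what keeps fable 30 out of a request for fable 3.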

In [73]:
val oneGreekFable: Corpus = corp ~~ oneGreekCitation

println( s"Retrieving CTS-URN: ${oneGreekCitation}\n")

showMe(oneGreekFable)

Retrieving CTS-URN: urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:3

3.head		Ἄγροικος καὶ Ὀνάρια
3.text		Γεωργός τις ἐπʼ ἀγροῦ γεγηρακὼς, ἐπεὶ μηδέποτε εἰσῆλθεν εἰς ἄστυ, παρεκάλει τοὺς οἰκείους τοῦτο θεάσασθαι. Οἱ δὲ ζεύξαντες ὀνάρια καὶ ἐπὶ τῆς ἀπήνης αὐτὸν ἀναβιβασάμενοι, μόνον ἐκέλευσαν ἐλαύνειν. Ὁδεύοντι δὲ χειμῶνος καὶ θυέλλης τὸν ἀέρα καταλαβόντων καὶ ζόφου γενομένου, τὰ ὀνάρια τῆς ὁδοῦ πλανηθέντα εἴς τινα κρημνὸν ἐξετόπισαν τὸν πρεσβύτην. Ὁ δὲ μέλλων ἤδη κατακρημνίζεσθαι ,,ὦ Ζεῦ,“ εἶπε ,,τί ποτέ σε ἠδίκησα, ὅτι οὕτω παρὰ λόγον ἀπόλλυμαι, καὶ ταῦτα οὔθʼ ὑφʼ ἵππων γενναίων οὔθ’ ἡμιόνων ἀγαθῶν, ἀλλ’ ὀναρίων εὐτελεστάτων;“


[36moneGreekFable[39m: [32mCorpus[39m = [33mCorpus[39m(
  [33mVector[39m(
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:3.head"[39m),
      [32m"\u1f0c\u03b3\u03c1\u03bf\u03b9\u03ba\u03bf\u03c2 \u03ba\u03b1\u1f76 \u1f48\u03bd\u03ac\u03c1\u03b9\u03b1"[39m
    ),
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:3.text"[39m),
      [32m"\u0393\u03b5\u03c9\u03c1\u03b3\u03cc\u03c2 \u03c4\u03b9\u03c2 \u1f10\u03c0\u02bc \u1f00\u03b3\u03c1\u03bf\u1fe6 \u03b3\u03b5\u03b3\u03b7\u03c1\u03b1\u03ba\u1f7c\u03c2, \u1f10\u03c0\u03b5\u1f76 \u03bc\u03b7\u03b4\u03ad\u03c0\u03bf\u03c4\u03b5 \u03b5\u1f30\u03c3\u1fc6\u03bb\u03b8\u03b5\u03bd \u03b5\u1f30\u03c2 \u1f04\u03c3\u03c4\u03c5, \u03c0\u03b1\u03c1\u03b5\u03ba\u03ac\u03bb\u03b5\u03b9 \u03c4\u03bf\u1f7a\u03c2 \u03bf\u1f30\u03ba\u03b5\u03af\u03bf\u03c5\u03c2 \u03c4\u03bf\u1fe6\u03c4\u03bf \u03b8\u03b5\u03ac\u03c3\u03b1\u03c3\u03b8\u03b1

One fable, in Portuguese:

In [74]:
val onePortCitation = aesopUrn.addVersion(portVers).addPassage("3")

[36monePortCitation[39m: [32mCtsUrn[39m = [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.mcdezotti:3"[39m)

We use the `~~` method to retrieve a passage, based on a URN, from a Corpus.

In [75]:
val onePortFable: Corpus = corp ~~ onePortCitation

println( s"Retrieving CTS-URN: ${onePortCitation}\n")

showMe(onePortFable)

Retrieving CTS-URN: urn:cts:greekLit:tlg0096.tlg002.mcdezotti:3

3.head		O camponês e os burrinhos
3.text		Um camponês que chegou à velhice no campo, visto que nunca tinha ido à cidade, pedia com insistência aos familiares para vê-la. Então eles, após atrelarem burrinhos,  fizeram-no subir na carroça  e ordenaram que apenas tocasse adiante. Mas enquanto ele estava a caminho, uma tempestade e um vendaval apanharam de surpresa o tempo e ficou um breu. Os burrinhos perderam o rumo do caminho e desviaram o velho para um precipício. E ele, já prestes a despencar no precipício, disse: “Ó Zeus, o que alguma vez te fiz de errado para morrer assim de modo absurdo, e isso nem por obra de cavalos de raça, nem de boas mulas, mas de burrinhos da pior espécie!" 


[36monePortFable[39m: [32mCorpus[39m = [33mCorpus[39m(
  [33mVector[39m(
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.mcdezotti:3.head"[39m),
      [32m"O campon\u00eas e os burrinhos"[39m
    ),
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.mcdezotti:3.text"[39m),
      [32m"Um campon\u00eas que chegou \u00e0 velhice no campo, visto que nunca tinha ido \u00e0 cidade, pedia com insist\u00eancia aos familiares para v\u00ea-la. Ent\u00e3o eles, ap\u00f3s atrelarem burrinhos,  fizeram-no subir na carro\u00e7a  e ordenaram que apenas tocasse adiante. Mas enquanto ele estava a caminho, uma tempestade e um vendaval apanharam de surpresa o tempo e ficou um breu. Os burrinhos perderam o rumo do caminho e desviaram o velho para um precip\u00edcio. E ele, j\u00e1 prestes a despencar no precip\u00edcio, disse: \u201c\u00d3 Zeus, o que alguma vez te fiz de errado para morrer assim de modo absurdo, e i

Two fables, in Greek:

In [76]:
val twoGreekCitations = aesopUrn.addVersion(greekVers).addPassage("4-5")

twoGreekCitations: CtsUrn = CtsUrn(
  "urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:4-5"
)

We use the `~~` method to retrieve a passage, based on a URN, from a Corpus.
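
A range citation such as `4-5` asks for everything from the start of fable 4 through the end of fable 5. The sketch below is a simplified model of that idea in plain Scala, not the OHCO2 implementation: `within` and `expandRange` are hypothetical helpers, and the passage references are bare strings kept in document order.

```scala
// Hypothetical helper: true when `ref` is the passage `parent` itself or one of its parts.
def within(ref: String, parent: String): Boolean =
  ref == parent || ref.startsWith(parent + ".")

// Hypothetical helper: expand a range like "4-5" against references in document order.
// Anything that is not a two-part range yields an empty result in this toy model.
def expandRange(refs: Vector[String], range: String): Vector[String] =
  range.split("-", 2) match {
    case Array(from, to) =>
      val start = refs.indexWhere(r => within(r, from))   // first node of the opening passage
      val end   = refs.lastIndexWhere(r => within(r, to)) // last node of the closing passage
      if (start < 0 || end < start) Vector.empty else refs.slice(start, end + 1)
    case _ => Vector.empty
  }

// Leaf-node references, in the order they appear in the text.
val refs = Vector("3.head", "3.text", "4.head", "4.text", "5.head", "5.text", "6.head", "6.text")

val selected = expandRange(refs, "4-5")
println(selected.mkString(", "))
```

The selection depends on document order rather than on comparing the numbers themselves, which is why the references are held in a sequence and not a set.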

In [77]:
val twoGreekFables: Corpus = corp ~~ twoGreekCitations

println( s"Retrieving CTS-URN: ${twoGreekCitations}\n")

showMe(twoGreekFables)

Retrieving CTS-URN: urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:4-5

4.head		Ἀετός
4.text		Ὑπεράνωθεν πέτρας ἀετὸς ἐκαθέζετο, λαγωὸν θηρεῦσαι ζητῶν. Τοῦτον δέ τις ἔβαλε τοξεύσας· καὶ τὸ μὲν βέλος ἐντὸς αὐτοῦ εἰσῆλθεν, ἡ δὲ γλυφὶς σὺν τοῖς πτεροῖς πρὸ τῶν ὀφθαλμῶν εἱστήκει. Ὁ δὲ ἰδὼν ἔφη· „καὶ τοῦτό μοι ἑτέρα λύπη, τὸ τοῖς ἰδίοις πτεροῖς ἐναποθνήσκειν.“  Ὁ μῦθος δηλοῖ, ὅτι δεινόν ἐστιν, ὅταν τις ἐκ τῶν ἰδίων κινδυνεύσῃ.
5.head		Ἀετὸς καὶ Ἀλώπηξ
5.text		Ἀετὸς καὶ ἀλώπηξ φιλεῖν ἀλλήλους συνθέμενοι, πλησίον ἑαυτῶν οἰκεῖν διέγνωσαν, βεβαίωσιν φιλίας τὴν συνήθειαν ποιούμενοι. Καὶ δὴ ὁ μὲν ἀναβὰς ἐπί τι περίμηκες δένδρον ἐνεοττοποιήσατο· ἡ δὲ εἰσελθοῦσα εἰς τὸν ὑποκείμενον θάμνον ἔτεκεν. Ἐξελθούσης δέ ποτε αὐτῆς ἐπὶ νομὴν, ὁ ἀετὸς ἀπορῶν τροφῆς, καταπτὰς εἰς τὸν θάμνον καὶ τὰ γεννήματα ἀναρπάσας, μετὰ τῶν αὑτοῦ νεοττῶν κατεθοινήσατο. Ἡ δʼἀλώπηξ ἐπανελθοῦσα ὡς ἔγνω τὸ πραχθὲν, οὐ μᾶλλον ἐπὶ τῷ τῶν νεοττῶν θανάτῳ ἐλυπήθη, ὅσον ἐπὶ τῷ τῆς ἀμύνης ἀπόρῳ· χερσαία γὰρ οὖσα πτηνὸν διώκειν ἠδυνάτει. 

[36mtwoGreekFables[39m: [32mCorpus[39m = [33mCorpus[39m(
  [33mVector[39m(
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:4.head"[39m),
      [32m"\u1f08\u03b5\u03c4\u03cc\u03c2"[39m
    ),
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:4.text"[39m),
      [32m"\u1f59\u03c0\u03b5\u03c1\u03ac\u03bd\u03c9\u03b8\u03b5\u03bd \u03c0\u03ad\u03c4\u03c1\u03b1\u03c2 \u1f00\u03b5\u03c4\u1f78\u03c2 \u1f10\u03ba\u03b1\u03b8\u03ad\u03b6\u03b5\u03c4\u03bf, \u03bb\u03b1\u03b3\u03c9\u1f78\u03bd \u03b8\u03b7\u03c1\u03b5\u1fe6\u03c3\u03b1\u03b9 \u03b6\u03b7\u03c4\u1ff6\u03bd. \u03a4\u03bf\u1fe6\u03c4\u03bf\u03bd \u03b4\u03ad \u03c4\u03b9\u03c2 \u1f14\u03b2\u03b1\u03bb\u03b5 \u03c4\u03bf\u03be\u03b5\u03cd\u03c3\u03b1\u03c2\u00b7 \u03ba\u03b1\u1f76 \u03c4\u1f78 \u03bc\u1f72\u03bd \u03b2\u03ad\u03bb\u03bf\u03c2 \u1f10\u03bd\u03c4\u1f78\u03c2 \u03b1\u1f50\u03c4\u03bf\u1fe6 \u03b5\u1f30\u0

Two fables, in Portuguese:

In [78]:
val twoPortCitations = aesopUrn.addVersion(portVers).addPassage("4-5")

twoPortCitations: CtsUrn = CtsUrn(
  "urn:cts:greekLit:tlg0096.tlg002.mcdezotti:4-5"
)

We use the `~~` method to retrieve a passage, based on a URN, from a Corpus.

In [79]:
val twoPortFables: Corpus = corp ~~ twoPortCitations

println( s"Retrieving CTS-URN: ${twoPortCitations}\n")

showMe(twoPortFables)

Retrieving CTS-URN: urn:cts:greekLit:tlg0096.tlg002.mcdezotti:4-5

4.head		A águia
4.text		Uma águia pousou bem no alto de um rochedo, buscando caçar uma lebre. Então alguém desferiu o arco e atingiu-a. E a flecha penetrou nela, mas o chanfro com as penas estancou diante de seus olhos. Ao vê-lo, ela disse: “E isso para mim é outro desgosto: morrer em meio às próprias penas.” A fábula mostra que é terrível quando alguém corre perigo advindo de seus próprios recursos.
5.head		A águia e a raposa
5.text		Uma  águia e uma raposa, após pactuarem amizade mútua,  decidiram morar perto uma da outra, fazendo do convívio garantia de amizade. E então uma subiu numa árvore bem alta e fez o ninho, enquanto a outra penetrou na moita ao pé da árvore e deu cria. Tendo a raposa certa vez saído para caçar, a águia,  carecendo de alimento, desceu voando à moita, apanhou as crias da raposa e as devorou em companhia dos seus filhotes. E a raposa, ao retornar, quando se deu conta do fato, afligiu-se não tant

[36mtwoPortFables[39m: [32mCorpus[39m = [33mCorpus[39m(
  [33mVector[39m(
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.mcdezotti:4.head"[39m),
      [32m"A \u00e1guia"[39m
    ),
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.mcdezotti:4.text"[39m),
      [32m"Uma \u00e1guia pousou bem no alto de um rochedo, buscando ca\u00e7ar uma lebre. Ent\u00e3o algu\u00e9m desferiu o arco e atingiu-a. E a flecha penetrou nela, mas o chanfro com as penas estancou diante de seus olhos. Ao v\u00ea-lo, ela disse: \u201cE isso para mim \u00e9 outro desgosto: morrer em meio \u00e0s pr\u00f3prias penas.\u201d A f\u00e1bula mostra que \u00e9 terr\u00edvel quando algu\u00e9m corre perigo advindo de seus pr\u00f3prios recursos."[39m
    ),
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.mcdezotti:5.head"[39m),
      [32m"A \u00e1guia e a raposa"[39m
    ),
    [33mCit

#### Retrieve Parts of Fables

The examples above retrieve by canonical citation, that is, by fable. The library we defined also separates each fable's heading from its text, so either part can be identified and retrieved on its own, *if so desired*.
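
One way to picture this two-level scheme is to treat a passage reference as a list of levels, so that `5` contains both `5.head` and `5.text`. This is a minimal sketch in plain Scala under that assumption. `Citation` and `parseCitation` are illustrative names, not part of the CITE libraries, and `50.head` is a made-up reference included only to show that matching respects level boundaries.

```scala
// Illustrative model: "5.head" becomes Citation(Vector("5", "head")).
final case class Citation(levels: Vector[String]) {
  // A citation contains another when its levels are a leading prefix of the other's.
  def contains(other: Citation): Boolean = other.levels.startsWith(levels)
  override def toString: String = levels.mkString(".")
}

// Hypothetical parser: split the reference on its level separator.
def parseCitation(s: String): Citation = Citation(s.split('.').toVector)

val leaves = Vector("5.head", "5.text", "50.head").map(parseCitation)

val wholeFable  = leaves.filter(c => parseCitation("5").contains(c))
val headingOnly = leaves.filter(c => parseCitation("5.head").contains(c))

println(wholeFable.mkString(", "))  // 5.head, 5.text
println(headingOnly.mkString(", ")) // 5.head
```

Comparing level by level, rather than by raw string prefix, keeps `5` from matching `50.head`.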

In [80]:
val newUrn = aesopUrn.addVersion(greekVers).addPassage("5.head")

val fableFiveGreekHead: Corpus = {
    corp ~~ newUrn
}

println( s"Retrieving CTS-URN: ${newUrn}\n")

showMe( fableFiveGreekHead )

Retrieving CTS-URN: urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:5.head

5.head		Ἀετὸς καὶ Ἀλώπηξ


newUrn: CtsUrn = CtsUrn("urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:5.head")
fableFiveGreekHead: Corpus = Corpus(
  Vector(
    CitableNode(
      CtsUrn("urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:5.head"),
      "\u1f08\u03b5\u03c4\u1f78\u03c2 \u03ba\u03b1\u1f76 \u1f08\u03bb\u03ce\u03c0\u03b7\u03be"
    )
  )
)

In [81]:
// Request the text of fable 5 (not its heading) in the Greek version
val newUrn = aesopUrn.addVersion(greekVers).addPassage("5.text")

val fableFiveGreekText: Corpus = corp ~~ newUrn

println( s"Retrieving CTS-URN: ${newUrn}\n")

// Show the retrieved Corpus rather than the URN itself
showMe( fableFiveGreekText )

#### Retrieve Multitext Fables

Because the [CITE Architecture](http://cite-architecture.org) has always been developed in the context of the [Homer Multitext](http://www.homermultitext.org), its *raison d’être* has been **identification and retrieval** of passages of texts, by **canonical citation**, across versions. We can capitalize on this here by leaving the version out of the URN, which retrieves the passage from every version in the library.
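
The cross-version behaviour can be modelled as prefix matching on two hierarchies at once: the work component (`tlg0096.tlg002`, optionally followed by a version) and the passage component. The following is a simplified sketch in plain Scala, not the library's own logic. `SimpleUrn`, `parseUrn`, and `matches` are hypothetical names, and the parser assumes well-formed `urn:cts:` strings.

```scala
// Simplified model of a CTS URN: urn:cts:<namespace>:<work component>:<passage>
final case class SimpleUrn(namespace: String, work: Vector[String], passage: Vector[String])

// Hypothetical parser; a URN without a passage gets an empty passage hierarchy.
def parseUrn(s: String): SimpleUrn = {
  val parts = s.split(":")
  val passage = if (parts.length > 4) parts(4).split('.').toVector else Vector.empty[String]
  SimpleUrn(parts(2), parts(3).split('.').toVector, passage)
}

// Assumed matching rule: the query's work and passage hierarchies must both be
// leading prefixes of the node's.
def matches(query: SimpleUrn, node: SimpleUrn): Boolean =
  query.namespace == node.namespace &&
    node.work.startsWith(query.work) &&
    node.passage.startsWith(query.passage)

val nodeUrns = Vector(
  "urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:5.head",
  "urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:5.text",
  "urn:cts:greekLit:tlg0096.tlg002.mcdezotti:5.head",
  "urn:cts:greekLit:tlg0096.tlg002.mcdezotti:5.text"
).map(parseUrn)

// No version in the query: headings from both versions match.
val allVersions = nodeUrns.filter(n => matches(parseUrn("urn:cts:greekLit:tlg0096.tlg002:5.head"), n))
// Version given: only the Greek heading matches.
val greekOnly = nodeUrns.filter(n => matches(parseUrn("urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:5.head"), n))

println(allVersions.map(_.work.last).mkString(", ")) // First1K-grc1, mcdezotti
println(greekOnly.size)                              // 1
```

Under the same rule, a query with a version but no passage matches every node of that version.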

In [82]:
val newUrn = aesopUrn.addPassage("5.head")

val fableFiveHeadAll: Corpus = corp ~~ newUrn

println( s"Retrieving CTS-URN: ${newUrn}\n")

showMe(fableFiveHeadAll)

Retrieving CTS-URN: urn:cts:greekLit:tlg0096.tlg002:5.head

5.head		Ἀετὸς καὶ Ἀλώπηξ
5.head		A águia e a raposa


newUrn: CtsUrn = CtsUrn("urn:cts:greekLit:tlg0096.tlg002:5.head")
fableFiveHeadAll: Corpus = Corpus(
  Vector(
    CitableNode(
      CtsUrn("urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:5.head"),
      "\u1f08\u03b5\u03c4\u1f78\u03c2 \u03ba\u03b1\u1f76 \u1f08\u03bb\u03ce\u03c0\u03b7\u03be"
    ),
    CitableNode(
      CtsUrn("urn:cts:greekLit:tlg0096.tlg002.mcdezotti:5.head"),
      "A \u00e1guia e a raposa"
    )
  )
)

In [83]:
val newUrn = aesopUrn.addPassage("5")

val fableFiveAll: Corpus = corp ~~ newUrn

println( s"Retrieving CTS-URN: ${newUrn}\n")

showMe(fableFiveAll)

Retrieving CTS-URN: urn:cts:greekLit:tlg0096.tlg002:5

5.head		Ἀετὸς καὶ Ἀλώπηξ
5.text		Ἀετὸς καὶ ἀλώπηξ φιλεῖν ἀλλήλους συνθέμενοι, πλησίον ἑαυτῶν οἰκεῖν διέγνωσαν, βεβαίωσιν φιλίας τὴν συνήθειαν ποιούμενοι. Καὶ δὴ ὁ μὲν ἀναβὰς ἐπί τι περίμηκες δένδρον ἐνεοττοποιήσατο· ἡ δὲ εἰσελθοῦσα εἰς τὸν ὑποκείμενον θάμνον ἔτεκεν. Ἐξελθούσης δέ ποτε αὐτῆς ἐπὶ νομὴν, ὁ ἀετὸς ἀπορῶν τροφῆς, καταπτὰς εἰς τὸν θάμνον καὶ τὰ γεννήματα ἀναρπάσας, μετὰ τῶν αὑτοῦ νεοττῶν κατεθοινήσατο. Ἡ δʼἀλώπηξ ἐπανελθοῦσα ὡς ἔγνω τὸ πραχθὲν, οὐ μᾶλλον ἐπὶ τῷ τῶν νεοττῶν θανάτῳ ἐλυπήθη, ὅσον ἐπὶ τῷ τῆς ἀμύνης ἀπόρῳ· χερσαία γὰρ οὖσα πτηνὸν διώκειν ἠδυνάτει. Διὸ πόῤῥωθεν στᾶσα, ὃ μόνον τοῖς ἀσθενέσι καὶ ἀδυνάτοις ὑπολείπεται, τῷ ἐχθρῷ κατηρᾶτο. Συνέβη δʼ αὐτῷ τῆς εἰς τὴν φιλίαν ἀσεβείας οὐκ εἰς μακρὰν δίκην ὑπελθεῖν· θυόντων γάρ τινων αἶγα ἐπʼ ἀγροῦ, καταπτὰς ἀπὸ τοῦ βωμοῦ σπλάγχνον ἔμπυρον ἀνήνεγκεν· οὗ κομισθέντος εἰς τὴν καλιὰν, σφοδρὸς ἐμπεσὼν ἄνεμος ἐκ λεπτοῦ καὶ παλαιοῦ κάρφους λαμπρὰν φλόγα ἀνῆψε· καὶ διὰ τοῦτο κα

[36mnewUrn[39m: [32mCtsUrn[39m = [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002:5"[39m)
[36mfableFiveAll[39m: [32mCorpus[39m = [33mCorpus[39m(
  [33mVector[39m(
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:5.head"[39m),
      [32m"\u1f08\u03b5\u03c4\u1f78\u03c2 \u03ba\u03b1\u1f76 \u1f08\u03bb\u03ce\u03c0\u03b7\u03be"[39m
    ),
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:5.text"[39m),
      [32m"\u1f08\u03b5\u03c4\u1f78\u03c2 \u03ba\u03b1\u1f76 \u1f00\u03bb\u03ce\u03c0\u03b7\u03be \u03c6\u03b9\u03bb\u03b5\u1fd6\u03bd \u1f00\u03bb\u03bb\u03ae\u03bb\u03bf\u03c5\u03c2 \u03c3\u03c5\u03bd\u03b8\u03ad\u03bc\u03b5\u03bd\u03bf\u03b9, \u03c0\u03bb\u03b7\u03c3\u03af\u03bf\u03bd \u1f11\u03b1\u03c5\u03c4\u1ff6\u03bd \u03bf\u1f30\u03ba\u03b5\u1fd6\u03bd \u03b4\u03b9\u03ad\u03b3\u03bd\u03c9\u03c3\u03b1\u03bd, \u03b2\u03b5\u03b2\u03b1\u03af\u03c9\u03c3\u03b9\u03bd

### Analysis

For information about the OHCO2 library’s built-in analytical tools, see the [API documentation](https://cite-architecture.github.io/cite-api-docs/ohco2/api/edu/holycross/shot/ohco2/index.html). We can test our new library, though, with a quick linguistic analysis or two.

We can run a quick n-gram search on the Greek, on the Portuguese, or on the whole Corpus.

We start by defining Corpora for analysis.

**N.b.** The `val` named `corp`, the Corpus in our TextRepository, contains both Greek and Portuguese.

In [84]:
val greekCorpus: Corpus = corp ~~ aesopUrn.addVersion(greekVers)

val portCorpus: Corpus = corp ~~ aesopUrn.addVersion(portVers)

[36mgreekCorpus[39m: [32mCorpus[39m = [33mCorpus[39m(
  [33mVector[39m(
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:1.head"[39m),
      [32m"\u1f08\u03b3\u03b1\u03b8\u1f70 \u03ba\u03b1\u1f76 \u039a\u03b1\u03ba\u03ac"[39m
    ),
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0096.tlg002.First1K-grc1:1.text"[39m),
      [32m"\u1f08\u03b3\u03b1\u03b8\u1f70 \u03c0\u03ac\u03bd\u03c4\u03b1 \u1f51\u03c0\u1f78 \u03c4\u1ff6\u03bd \u03ba\u03b1\u03ba\u1ff6\u03bd \u1f10\u03b4\u03b9\u03ce\u03c7\u03b8\u03b7, \u1f61\u03c2 \u1f00\u03c3\u03b8\u03b5\u03bd\u1fc6 \u1f44\u03bd\u03c4\u03b1\u00b7 \u03b5\u1f30\u03c2 \u03bf\u1f50\u03c1\u03b1\u03bd\u1f78\u03bd \u03b4\u1f72 \u1f00\u03bd\u1fc6\u03bb\u03b8\u03bf\u03bd. \u039a\u03b1\u1f76 \u03c4\u1f00\u03b3\u03b1\u03b8\u1f70 \u1f20\u03c1\u03ce\u03c4\u03b7\u03c3\u03b1\u03bd \u03c4\u1f78\u03bd \u0394\u03af\u03b1, \u03c0\u1ff6\u03c2 \u03b5\u1f36\u03bd\u03b1\u03b9 \u03b4\

We ask for repeating patterns of 3 words that occur more than 2 times:

In [None]:
val threeGramsGreek = greekCorpus.ngramHisto(3, 2)

showMe( threeGramsGreek )
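
For readers who want to see what an n-gram histogram involves, here is a small sketch in plain Scala. It is not the library's `ngramHisto`: `ngramCounts` is a hypothetical helper, the sample sentence is invented for the demonstration, and the choice to keep n-grams occurring *at least* `minCount` times is an assumption of this sketch. The library's own threshold behaviour should be checked against its API documentation.

```scala
// Hypothetical helper: count runs of n consecutive words and keep those
// that occur at least minCount times, most frequent first.
def ngramCounts(text: String, n: Int, minCount: Int): Vector[(String, Int)] = {
  // Lower-case the text and split on anything that is not a letter.
  val words = text.toLowerCase.split("""[^\p{L}]+""").filter(_.nonEmpty).toVector
  words.sliding(n)
    .filter(_.size == n)                 // drop a short window when the text has fewer than n words
    .map(_.mkString(" "))
    .toVector
    .groupBy(identity)
    .map { case (gram, hits) => (gram, hits.size) }
    .filter { case (_, count) => count >= minCount }
    .toVector
    .sortBy { case (gram, count) => (-count, gram) }
}

// Invented sample text, used only to exercise the helper.
val sample = "a raposa e a águia; a raposa e a lebre; a águia e a raposa"

ngramCounts(sample, 3, 2).foreach { case (gram, count) => println(s"$count\t$gram") }
```

Splitting on non-letters discards punctuation, so a pattern that straddles a semicolon or a sentence boundary is still counted.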

Let's do the same for Portuguese:

In [None]:
val threeGramsPort = portCorpus.ngramHisto(3, 2)

showMe( threeGramsPort )

Let's do the same for both languages together:

In [None]:
val threeGramsAll = corp.ngramHisto(3, 2)

showMe( threeGramsAll )

# More…

Work like this is a collaboration among scholars and students, each of whom brings different experiences and skills. This work **cannot** advance without **dialog**.

Please send questions, suggestions, or reports of problems to: `christopher.blackwell@furman.edu`.