# OCRE with texts

This notebook illustrates how to load data from a series of files in CEX format to assemble a single `Ocre` object modelling more than 50,000 issues of Roman imperial coins, and work with a separate `Ohco2` corpus of coin legends.


## Organization of this notebook

This notebook uses Scala with the almond kernel (<https://almond.sh/>).  The section labelled "Notebook configuration" configures the almond kernel to find and import a series of custom libraries, using syntax specific to the Ammonite shell that almond uses.  This is analogous to declaring dependencies in a `build.sbt` file if you were using `sbt` to run Scala.

The section after it (labelled "Analysis with generic Scala") consists of completely generic Scala that could be used in any environment with access to the repositories and libraries configured in the section labelled "Notebook configuration".


## Notebook configuration

Set up the notebook for access to libraries.  For reasons I don't understand (but perhaps having to do with asynchronous loading), adding a Maven repository and using `$ivy` imports that rely on that repository have to happen in two separate notebook cells.

In [3]:
// 1. Add maven repository where we can find our libraries
val myBT = coursierapi.MavenRepository.of("https://dl.bintray.com/neelsmith/maven")
interp.repositories() ++= Seq(myBT)

myBT: coursierapi.MavenRepository = MavenRepository(https://dl.bintray.com/neelsmith/maven)

In [4]:
// 2. Make libraries available with `$ivy` imports:
import $ivy.`edu.holycross.shot::nomisma:0.4.0`
import $ivy.`edu.holycross.shot::histoutils:2.2.0`
import $ivy.`org.plotly-scala::plotly-almond:0.7.1`

import $ivy.`edu.holycross.shot::ohco2:10.16.0`
import $ivy.`edu.holycross.shot.cite::xcite:4.1.1`
import $ivy.`edu.holycross.shot::midvalidator:9.1.0`

import $ivy.`edu.holycross.shot::latphone:2.7.2`
import $ivy.`edu.holycross.shot::latincorpus:2.2.1`


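For comparison with the `build.sbt` analogy mentioned earlier, here is a rough sketch of what the same configuration might look like in an `sbt` project.  The library coordinates and repository URL are copied from the cells above (Ammonite's `::` corresponds to sbt's `%%`); the `scalaVersion` value and the resolver name are assumptions chosen for illustration.

```scala
// build.sbt (sketch only)
scalaVersion := "2.12.10" // assumption: some Scala 2.12.x release

// Resolver name is arbitrary; the URL is the repository added in the first cell.
resolvers += "neelsmith-maven" at "https://dl.bintray.com/neelsmith/maven"

libraryDependencies ++= Seq(
  "edu.holycross.shot" %% "nomisma" % "0.4.0",
  "edu.holycross.shot" %% "histoutils" % "2.2.0",
  "org.plotly-scala" %% "plotly-almond" % "0.7.1",
  "edu.holycross.shot" %% "ohco2" % "10.16.0",
  "edu.holycross.shot.cite" %% "xcite" % "4.1.1",
  "edu.holycross.shot" %% "midvalidator" % "9.1.0",
  "edu.holycross.shot" %% "latphone" % "2.7.2",
  "edu.holycross.shot" %% "latincorpus" % "2.2.1"
)
```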
## Analysis with generic Scala


All imports:

In [5]:
import edu.holycross.shot.nomisma._
import edu.holycross.shot.histoutils._

import edu.holycross.shot.cite._
import edu.holycross.shot.ohco2._

import edu.holycross.shot.mid.validator._


import edu.holycross.shot.latin._
import edu.holycross.shot.latincorpus._

import scala.io.Source
import plotly._, plotly.element._, plotly.layout._, plotly.Almond._


### Build an `Ocre` object


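Read the basic records for OCRE issues, along with legends, type descriptions and mint locations, then assemble them into a single `Ocre` object: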

In [6]:
// Basic issue records. The first line of the file is a header, so drop it.
val basicsUrl = "https://github.com/neelsmith/nomisma/raw/master/shared/src/test/resources/cex/ocre-basic-issues.cex"
val data = Source.fromURL(basicsUrl).getLines.toVector.tail

val basics = for (cex <- data) yield {
    BasicIssue(cex)
}
val basicIssues = basics.toVector

// Legends: parse each line, then flatten the per-line results into one Vector.
val legendUrl = "https://raw.githubusercontent.com/neelsmith/nomisma/master/shared/src/test/resources/cex/ocre-legends.cex"
val legendList = for (cex <- Source.fromURL(legendUrl).getLines) yield {
    edu.holycross.shot.nomisma.Legend(cex)
}
val legends = legendList.toVector.flatten

// Type descriptions: read the same way as the legends.
val typesUrl = "https://raw.githubusercontent.com/neelsmith/nomisma/master/shared/src/test/resources/cex/ocre-types.cex"
val typeList = for (cex <- Source.fromURL(typesUrl).getLines) yield {
    TypeDescription(cex)
}
val typeVector = typeList.toVector.flatten

// Geographic data for mints. Again, drop the first line.
val geoUrl  = "https://raw.githubusercontent.com/neelsmith/nomisma/master/shared/src/test/resources/cex/mintgeo.cex"
val geoLines = Source.fromURL(geoUrl).getLines.toVector.tail
val mintPoints = geoLines.map(MintPoint(_)).toVector

// Assemble the Ocre object. No portrait data is loaded here, so pass an empty Vector.
val ocre = Ocre(basicIssues, legends, typeVector, Vector.empty[Portrait], MintPointCollection(mintPoints))




basicsUrl: String = "https://github.com/neelsmith/nomisma/raw/master/shared/src/test/resources/cex/ocre-basic-issues.cex"
data: Vector[String] = Vector(
  "1_2.aug.10#RIC I (second edition) Augustus 10#denarius#ar#augustus#emerita#lusitania",
  "1_2.aug.100#RIC I (second edition) Augustus 100#denarius#ar#augustus#colonia_patricia#lusitania",
  "1_2.aug.101#RIC I (second edition) Augustus 101#denarius#ar#augustus#colonia_patricia#lusitania",
  "1_2.aug.102#RIC I (second edition) Augustus 102#denarius#ar#augustus#colonia_patricia#lusitania",
  "1_2.aug.103#RIC I (second edition) Augustus 103#denarius#ar#augustus#colonia_patricia#lusitania",
  "1_2.aug.104#RIC I (second edition) Augustus 104#aureus#av#augustus#colonia_patricia#lusitania",
  "1_2.aug.105A#RIC I (second edition) Augustus 105A#denarius#ar#augustus#colonia_patricia#lusitania",
  "1_2

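The records read above are `#`-delimited lines.  To make that format concrete, here is a minimal sketch in plain Scala that splits one of the sample lines into named columns.  The `IssueRecord` class and its field names are hypothetical, guessed from the look of the sample data; they are not the fields of the nomisma library's `BasicIssue`.

```scala
// Hypothetical record type: field names are guesses based on the sample line,
// not the actual structure of nomisma's BasicIssue.
final case class IssueRecord(
  id: String,
  label: String,
  denomination: String,
  metal: String,
  authority: String,
  mint: String,
  region: String
)

// Split one line on '#'. A limit of -1 keeps trailing empty columns,
// so a line with an empty final column still has seven parts.
def parseIssue(cex: String): Option[IssueRecord] =
  cex.split("#", -1) match {
    case Array(id, label, denom, metal, auth, mint, region) =>
      Some(IssueRecord(id, label, denom, metal, auth, mint, region))
    case _ => None // wrong number of columns
  }

val sample = "1_2.aug.10#RIC I (second edition) Augustus 10#denarius#ar#augustus#emerita#lusitania"
val parsed = parseIssue(sample)
```

Returning an `Option` means that a malformed line yields `None` rather than an exception, so a whole file can be parsed with `map` followed by `flatten`.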
### Load the Ohco2 Corpus

In [7]:
// FST output with morphological analyses for the tokens of the legends
val fstUrl = "https://raw.githubusercontent.com/neelsmith/hctexts/master/workfiles/ocre/ocre-fst.txt"
val fstLines = Source.fromURL(fstUrl).getLines.toVector

// Text of the legends in CEX format: skip the first line and any empty lines
val url = "https://raw.githubusercontent.com/neelsmith/hctexts/master/cex/ocre43k.cex"
val ctsLines = Source.fromURL(url).getLines.toVector.tail.filter(_.nonEmpty)

// Each remaining line pairs a CTS URN with a text, separated by '#'
val stringPairs = ctsLines.map(_.split("#"))
val citableNodes = stringPairs.map( arr => CitableNode(CtsUrn(arr(0)), arr(1)))
val corpus = Corpus(citableNodes)

fstUrl: String = "https://raw.githubusercontent.com/neelsmith/hctexts/master/workfiles/ocre/ocre-fst.txt"
fstLines: Vector[String] = Vector(
  "> avgvstvs",
  "<u>ocremorph.n4509</u><u>ls.n4509</u>avgvst<adj><us_a_um><div><us_a_um><adj>vs<masc><nom><sg><pos><u>ocremorph.us_a_um1</u>",
  "> pivs",
  "<u>ocremorph.n36487</u><u>ls.n36487</u>pi<adj><us_a_um><div><us_a_um><adj>vs<masc><nom><sg><pos><u>ocremorph.us_a_um1</u>",
  "> imperator",
  "<u>ocremorph.n21857</u><u>ls.n21857</u>imperator<noun><masc><0_is><div><0_is><noun><masc><nom><sg><u>ocremorph.0_is1</u>",
  "<u>ocremorph.n21857</u><u>ls.n21857</u>imperator<noun><masc><0_is><div><0_is><noun><masc><voc><sg><u>ocremorph.0_is11</u>",
  "> felix",
  "<u>ocremorph.n17887</u><u>ls.n17887</u>feli<adj><x_cis><div><x_cis><adj>x<masc><nom><sg><pos><u>livymorph.x_cis1</u>",
  "<u

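The cell above indexes `arr(1)` directly, which throws an `ArrayIndexOutOfBoundsException` on any line that has no `#`.  If that ever becomes a problem, one possible approach is sketched below in plain Scala.  `Node` is a stand-in for `CitableNode` so that the sketch runs without the ohco2 library, and the URNs in the sample are invented for illustration.

```scala
// Stand-in for ohco2's CitableNode, so this sketch has no library dependency.
final case class Node(urn: String, text: String)

// Split each non-empty line into at most two parts, so a '#' inside the text
// is preserved. Lines without a delimiter are collected instead of crashing.
def parseNodes(lines: Seq[String]): (Vector[Node], Vector[String]) = {
  val results = lines.toVector.filter(_.nonEmpty).map { line =>
    line.split("#", 2) match {
      case Array(urn, text) if urn.nonEmpty => Right(Node(urn, text))
      case _ => Left(line)
    }
  }
  val nodes = results.collect { case Right(node) => node }
  val rejected = results.collect { case Left(line) => line }
  (nodes, rejected)
}

// Invented sample lines: two well formed, one empty, one without a delimiter.
val sampleLines = Vector(
  "urn:cts:example:group.work:1#IMP CAESAR",
  "",
  "no delimiter here",
  "urn:cts:example:group.work:2#TEXT # WITH HASH"
)
val (nodes, rejected) = parseNodes(sampleLines)
```

Keeping the rejected lines makes it possible to report what was skipped instead of silently dropping data.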
In [None]:
// This step needs more memory than we get on mybinder...
//
//val lc = LatinCorpus.fromFstLines(corpus, Latin23Alphabet, fstLines, strict = false)
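
Where memory is too tight to build the full `LatinCorpus`, the FST lines can still be inspected with plain Scala.  The sketch below assumes, from the sample output shown earlier, that a line beginning with `> ` names a token and that the lines following it are analyses of that token.  This is an inference from the sample, not a documented description of the format, and `groupAnalyses` is a hypothetical helper, not part of the latincorpus library.

```scala
// Group analysis lines under the token line ("> token") that precedes them.
def groupAnalyses(lines: Seq[String]): Vector[(String, Vector[String])] =
  lines.foldLeft(Vector.empty[(String, Vector[String])]) { (acc, line) =>
    if (line.startsWith("> ")) {
      // Start a new group for this token
      acc :+ ((line.drop(2).trim, Vector.empty[String]))
    } else if (acc.isEmpty || line.trim.isEmpty) {
      // Skip blank lines and any analysis that appears before the first token
      acc
    } else {
      // Attach the analysis to the most recent token
      val (token, analyses) = acc.last
      acc.init :+ ((token, analyses :+ line))
    }
  }

// A few lines copied from the output shown earlier in this notebook
val sampleFst = Vector(
  "> pivs",
  "<u>ocremorph.n36487</u><u>ls.n36487</u>pi<adj><us_a_um><div><us_a_um><adj>vs<masc><nom><sg><pos><u>ocremorph.us_a_um1</u>",
  "> imperator",
  "<u>ocremorph.n21857</u><u>ls.n21857</u>imperator<noun><masc><0_is><div><0_is><noun><masc><nom><sg><u>ocremorph.0_is1</u>",
  "<u>ocremorph.n21857</u><u>ls.n21857</u>imperator<noun><masc><0_is><div><0_is><noun><masc><voc><sg><u>ocremorph.0_is11</u>"
)

val grouped = groupAnalyses(sampleFst)
// Number of analyses per token
val counts = grouped.map { case (token, analyses) => (token, analyses.size) }
```

Applied to the full `fstLines` vector, the same function would give a quick count of how many analyses each token received.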