# Finding Text-Bearing Surfaces

## Configuring CITE libraries for almond kernel

First, we make a Bintray repository that hosts the CITE libraries available to the almond kernel.

In [1]:
val myBT = coursierapi.MavenRepository.of("https://dl.bintray.com/neelsmith/maven")
interp.repositories() ++= Seq(myBT)

myBT: coursierapi.MavenRepository = MavenRepository(https://dl.bintray.com/neelsmith/maven)

Next, we bring in specific libraries from the new repository using almond's `$ivy` magic:

In [3]:
import $ivy.`edu.holycross.shot::ohco2:10.18.2`
import $ivy.`edu.holycross.shot.cite::xcite:4.1.1`
import $ivy.`edu.holycross.shot::scm:7.2.0`
import $ivy.`edu.holycross.shot::dse:6.0.4`
import $ivy.`edu.holycross.shot::citebinaryimage:3.1.1`
import $ivy.`edu.holycross.shot::citeobj:7.3.4`
import $ivy.`edu.holycross.shot::citerelations:2.5.2`
import $ivy.`edu.holycross.shot::cex:6.3.3`

## Imports

From this point on, the notebook is ordinary Scala with the CITE libraries available to use.

In [10]:
// Import some CITE libraries
import edu.holycross.shot.cite._
import edu.holycross.shot.ohco2._
import edu.holycross.shot.scm._
import edu.holycross.shot.citeobj._
import edu.holycross.shot.citerelation._
import edu.holycross.shot.dse._
import edu.holycross.shot.citebinaryimage._

import almond.display.UpdatableDisplay
import almond.interpreter.api.DisplayData.ContentType
import almond.interpreter.api.{DisplayData, OutputHandler}

## Load a CITE Library

In [11]:
val filePath = "https://raw.githubusercontent.com/Eumaeus/fuCiteDX/master/hmt/hmt_january_2020.cex"
val lib: CiteLibrary = CiteLibrarySource.fromUrl(filePath)

2020-01-11 16:21:34.241-0500  info [CiteLibrary] Building text repo from cex ...  - (CiteLibrary.scala:160)
2020-01-11 16:21:35.259-0500  info [CiteLibrary] Building collection repo from cex ...  - (CiteLibrary.scala:163)
2020-01-11 16:21:47.968-0500  info [CiteLibrary] Building relations from cex ...  - (CiteLibrary.scala:166)
2020-01-11 16:21:48.996-0500  info [CiteLibrary] All library components built.  - (CiteLibrary.scala:168)

filePath: String = "https://raw.githubusercontent.com/Eumaeus/fuCiteDX/master/hmt/hmt_january_2020.cex"
lib: CiteLibrary = CiteLibrary(
  "Homer Multitext project, release cwb_test",
  Cite2Urn("urn:cite2:hmt:publications.cex.cwb_test:all"),
  "Creative Commons Attribution, Non-Commercial 4.0 License <https://creativecommons.org/licenses/by-nc/4.0/>.",
  Vector(
    CiteNamespace("hmt", http://www.homermultitext.org/citens/hmt),
    CiteNamespace("greekLit", http://chs.harvard.edu/ctsns/greekLit)
  ),
  Some(
    TextRepository(
      Corpus(
        Vector(
          CitableNode(
            CtsUrn("urn:cts:greekLit:tlg5026.msA.hmt:1.1.lemma"),
            "\u03bc\u1fc6\u03bd\u03b9\u03bd \u1f04\u03b5\u03b9\u03b4\u03b5"
          ),
          CitableNode

Get parts of the CITE Library in convenient form:

In [6]:
lazy val tr: TextRepository = lib.textRepository.get
lazy val corp: Corpus = tr.corpus
lazy val cat: Catalog = tr.catalog
lazy val colls: CiteCollectionRepository = lib.collectionRepository.get
lazy val rels: CiteRelationSet = lib.relationSet.get
lazy val myDseVec: DseVector = DseVector.fromCiteLibrary(lib)
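
The `.get` calls above assume that every component of the library is present; calling `.get` on an `Option` that is `None` throws a `NoSuchElementException`. The following is a minimal, self-contained sketch of a more defensive pattern. `MiniLibrary` and `MiniTextRepo` are hypothetical stand-ins written for this example, not CITE library types; they only mimic the fact, visible in the output above, that a library holds its text repository inside an `Option`.

```scala
// Hypothetical stand-ins for illustration only; not part of the CITE libraries.
final case class MiniTextRepo(nodeCount: Int)
final case class MiniLibrary(name: String, textRepository: Option[MiniTextRepo])

// Describe the text repository without calling .get on a possibly empty Option.
def describeTexts(lib: MiniLibrary): String =
  lib.textRepository match {
    case Some(repo) => s"${lib.name}: ${repo.nodeCount} citable nodes"
    case None       => s"${lib.name}: no text repository"
  }

val withTexts    = MiniLibrary("demo", Some(MiniTextRepo(3)))
val withoutTexts = MiniLibrary("empty", None)

println(describeTexts(withTexts))
println(describeTexts(withoutTexts))
```

In a notebook that loads a known, complete library, `.get` is a reasonable shortcut; pattern matching is worth the extra lines when the CEX source may lack a component.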

In [7]:
myDseVec

res6: DseVector = DseVector(Vector())

## Get a Map of Text-Bearing Surfaces

If a CITE Library implements the "text bearing surface" model (`urn:cite2:cite:datamodels.v1:tbsmodel`), the library records that fact, and we can ask it which collections implement the model.

In [6]:
val tbsUrn = Cite2Urn("urn:cite2:cite:datamodels.v1:tbsmodel")
val tbsCollections: Vector[Cite2Urn] = lib.collectionsForModel(tbsUrn)

tbsUrn: Cite2Urn = Cite2Urn("urn:cite2:cite:datamodels.v1:tbsmodel")
tbsCollections: Vector[Cite2Urn] = Vector(
  Cite2Urn("urn:cite2:hmt:msA.v1:"),
  Cite2Urn("urn:cite2:hmt:msB.v1:")
)
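
One way to picture what this query does: assume the library keeps a list of pairings between a collection and the data model it implements, and filters that list by model. The sketch below uses plain strings and a hypothetical `ModelPairing` case class in place of `Cite2Urn` and the library's own types, so it illustrates the idea rather than the actual implementation of `collectionsForModel`. The `urn:cite2:demo:` entry is invented for the example.

```scala
// Hypothetical stand-in: pairs a collection URN with the data model it implements.
final case class ModelPairing(collection: String, model: String)

val pairings: Vector[ModelPairing] = Vector(
  ModelPairing("urn:cite2:hmt:msA.v1:", "urn:cite2:cite:datamodels.v1:tbsmodel"),
  ModelPairing("urn:cite2:hmt:msB.v1:", "urn:cite2:cite:datamodels.v1:tbsmodel"),
  // Invented pairing, included only so the filter has something to exclude.
  ModelPairing("urn:cite2:demo:other.v1:", "urn:cite2:demo:datamodels.v1:othermodel")
)

// Keep the collections whose pairing names the requested model.
def collectionsFor(model: String, all: Vector[ModelPairing]): Vector[String] =
  all.filter(_.model == model).map(_.collection)

val tbsModel = "urn:cite2:cite:datamodels.v1:tbsmodel"
val found = collectionsFor(tbsModel, pairings)
println(found.mkString("\n"))
```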

With this, we can build a map of text-bearing surfaces, keyed by collection.

In [7]:
val tbsMap: Map[Cite2Urn, Vector[Cite2Urn]] = {
    // Pair each text-bearing-surface collection with the URNs of its objects
    tbsCollections.map( tc => {
        val vec = (colls ~~ tc).map(_.urn)
        (tc -> vec)
    }).toMap
}

tbsMap: Map[Cite2Urn, Vector[Cite2Urn]] = Map(
  Cite2Urn("urn:cite2:hmt:msA.v1:") -> Vector(
    Cite2Urn("urn:cite2:hmt:msA.v1:insidefrontcover"),
    Cite2Urn("urn:cite2:hmt:msA.v1:ir"),
    Cite2Urn("urn:cite2:hmt:msA.v1:iv"),
    Cite2Urn("urn:cite2:hmt:msA.v1:1r"),
    Cite2Urn("urn:cite2:hmt:msA.v1:1v"),
    Cite2Urn("urn:cite2:hmt:msA.v1:2r"),
    Cite2Urn("urn:cite2:hmt:msA.v1:2v"),
    Cite2Urn("urn:cite2:hmt:msA.v1:3r"),
    Cite2Urn("urn:cite2:hmt:msA.v1:3v"),
    Cite2Urn("urn:cite2:hmt:msA.v1:4r"),
    Cite2Urn("urn:cite2:hmt:msA.v1:4v"),
    Cite2Urn("urn:cite2:hmt:msA.v1:5r"),
    Cite2Urn("urn:cite2:hmt:msA.v1:5v"),
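
The same keyed-by-collection shape can also be derived from a flat list of object URNs, which may help when reading the map above. This sketch works on plain strings rather than `Cite2Urn`, and `collectionOf` is a hypothetical helper written for this example. It assumes the five-part `urn:cite2:namespace:collection.version:object` layout seen in the output; the `urn:cite2:demo:` entry is invented.

```scala
// Hypothetical helper: drop the object selector, keeping "urn:cite2:ns:coll.version:".
def collectionOf(objectUrn: String): String =
  objectUrn.split(":").take(4).mkString(":") + ":"

val surfaces: Vector[String] = Vector(
  "urn:cite2:hmt:msA.v1:1r",
  "urn:cite2:hmt:msA.v1:1v",
  // Invented URN, present only to give the example a second collection.
  "urn:cite2:demo:codex.v1:1r"
)

// Group the object URNs under the collection each one belongs to.
val byCollection: Map[String, Vector[String]] = surfaces.groupBy(collectionOf)

byCollection.foreach { case (coll, members) =>
  println(s"$coll has ${members.size} surface(s)")
}
```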

## DSE Work

In [8]:
lib.dseVector

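cmd8.sc:1: value dseVector is not a member of edu.holycross.shot.scm.CiteLibrary
val res8 = lib.dseVector
               ^
Compilation Failed

`CiteLibrary` has no `dseVector` member, so the DSE records are built from the library with `DseVector.fromCiteLibrary` instead.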

In [9]:
DseVector.fromCiteLibrary(lib)

res8: DseVector = DseVector(
  Vector(
    DsePassage(
      Cite2Urn("urn:cite2:hmt:va_dse.v1:il2168"),
      "DSE record for Iliad 4.217",
      CtsUrn("urn:cts:greekLit:tlg0012.tlg001.msA:4.217"),
      Cite2Urn(
        "urn:cite2:hmt:vaimg.2017a:VA055VN_0557@0.4865,0.3644,0.3954,0.0391"
      ),
      Cite2Urn("urn:cite2:hmt:msA.v1:55v")
    ),
    DsePassage(
      Cite2Urn("urn:cite2:hmt:va_dse.v1:il11826"),
      "DSE record for Iliad 18.529",
      CtsUrn("urn:cts:greekLit:tlg0012.tlg001.msA:18.529"),
      Cite2Urn(
        "urn:cite2:hmt:vaimg.2017a:VA249RN_0420@0.19,0.6589,0.427,0.0331"
      ),
      Cite2Urn("urn:cite2:hmt:msA.v1:249r")
    ),
    DsePassage(
      Cite2Urn("urn:cite2:hmt:va_dse.v1:il6005")
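
Each record in this output ties together a text passage, a region of an image, and a text-bearing surface. As a rough sketch of how such records can answer the question "which passages appear on this surface?", the example below uses a hypothetical `MiniDse` case class with plain-string fields. Its field names are assumptions made for this example and are not taken from the `dse` library. The two records reuse values from the output above.

```scala
// Hypothetical stand-in for a DSE record; field names are assumptions for this sketch.
final case class MiniDse(label: String, passage: String, imageRoi: String, surface: String)

val records: Vector[MiniDse] = Vector(
  MiniDse(
    "DSE record for Iliad 4.217",
    "urn:cts:greekLit:tlg0012.tlg001.msA:4.217",
    "urn:cite2:hmt:vaimg.2017a:VA055VN_0557@0.4865,0.3644,0.3954,0.0391",
    "urn:cite2:hmt:msA.v1:55v"
  ),
  MiniDse(
    "DSE record for Iliad 18.529",
    "urn:cts:greekLit:tlg0012.tlg001.msA:18.529",
    "urn:cite2:hmt:vaimg.2017a:VA249RN_0420@0.19,0.6589,0.427,0.0331",
    "urn:cite2:hmt:msA.v1:249r"
  )
)

// Collect the text passages recorded for one surface.
def passagesOn(surface: String, all: Vector[MiniDse]): Vector[String] =
  all.filter(_.surface == surface).map(_.passage)

val on55v = passagesOn("urn:cite2:hmt:msA.v1:55v", records)
println(on55v.mkString("\n"))
```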

In [12]:
val ctsu1 = CtsUrn("urn:cts:greekLit:tlg0012.tlg001:1")
val ctsu2 = CtsUrn("urn:cts:greekLit:tlg0012.tlg001:1")
val citeu1 = Cite2Urn("urn:cite2:hmt:va_dse.v1:il16223")
val citeu2 = Cite2Urn("urn:cite2:hmt:va_dse.v1:il16223")

ctsu1: CtsUrn = CtsUrn("urn:cts:greekLit:tlg0012.tlg001:1")
ctsu2: CtsUrn = CtsUrn("urn:cts:greekLit:tlg0012.tlg001:1")
citeu1: Cite2Urn = Cite2Urn("urn:cite2:hmt:va_dse.v1:il16223")
citeu2: Cite2Urn = Cite2Urn("urn:cite2:hmt:va_dse.v1:il16223")

In [14]:
ctsu1.asInstanceOf[Urn]
ctsu2.asInstanceOf[Urn]

res13_0: Urn = CtsUrn("urn:cts:greekLit:tlg0012.tlg001:1")
res13_1: Urn = CtsUrn("urn:cts:greekLit:tlg0012.tlg001:1")

In [15]:
res13_0 == res13_1

res14: Boolean = true
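
The `true` result is what one would expect if the URN classes are case classes, as the printed form `CtsUrn(...)` suggests: case-class equality compares field values, and casting to a supertype changes only the static type, not the object being compared. The sketch below shows this behavior with hypothetical `MiniUrn` types; it is not the `xcite` implementation.

```scala
// Hypothetical stand-ins; the real Urn hierarchy lives in the xcite library.
sealed trait MiniUrn
final case class MiniCtsUrn(value: String) extends MiniUrn
final case class MiniCite2Urn(value: String) extends MiniUrn

val a: MiniUrn = MiniCtsUrn("urn:cts:greekLit:tlg0012.tlg001:1")
val b: MiniUrn = MiniCtsUrn("urn:cts:greekLit:tlg0012.tlg001:1")
val c: MiniUrn = MiniCite2Urn("urn:cts:greekLit:tlg0012.tlg001:1")

// Two distinct instances with equal fields compare equal, even when typed as the supertype.
println(a == b)
// Instances of different case classes do not compare equal, even with the same string.
println(a == c)
```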