# Search diplomatic text of HMT *scholia*


### How to use this notebook

1. First, run step 1 (e.g., by selecting the cell labelled **Step 1: load everything** and choosing "Run all below" from the "Cell" menu). This is slow, and how slow depends on the speed of your connection to the online resources the notebook downloads.
2. In the cell just below the heading **Step 2: search**, put your search string between the quotation marks as the argument to the function `search`. Then run the cell (e.g., by selecting it and choosing "Run cells" from the "Cell" menu).
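
A note on typing Greek search strings: the same accented letter can be entered in more than one Unicode form (precomposed, or base letter plus combining accent), and a substring search compares code points literally. The following is a minimal sketch, assuming the texts being searched are stored in Unicode NFC (an assumption worth checking against the release you load). `normalizeQuery` is a hypothetical helper, not part of this notebook or of the HMT libraries.

```scala
import java.text.Normalizer

// Hypothetical helper: put a query into Unicode NFC before searching.
// Assumption: the texts being searched are themselves stored in NFC.
def normalizeQuery(s: String): String =
  Normalizer.normalize(s, Normalizer.Form.NFC)

// Omicron followed by a combining acute accent (two code points) ...
val decomposed = "\u03bf\u0301"
// ... and the single precomposed letter it is canonically equivalent to.
val composed = "\u03cc"

// The two strings look alike but are not equal until normalized.
val equalBefore = decomposed == composed
val equalAfter = normalizeQuery(decomposed) == composed
```

With a helper like this defined, a search could be run as `search(normalizeQuery("κονομ"))`. A query with no accents or breathings, such as this one, passes through unchanged.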



# Step 2: search

In [7]:
search("κονομ")

2020-09-12 11:37:52.033Z  warn [DseVector] No text-bearing surface found for urn:cts:greekLit:tlg5026.msA.hmt:3.527  - (DseVector.scala:186)
2020-09-12 11:37:59.210Z  warn [DseVector] No text-bearing surface found for urn:cts:greekLit:tlg5026.msA.hmt:11.228  - (DseVector.scala:186)


# Step 1: load everything


The most recent release of the archive is always available from [this directory](https://github.com/homermultitext/hmt-archive/tree/master/releases-cex). Check there and, if a newer release is listed, update the release version in the following cell.

In [None]:
// Check for most recent release at
// https://github.com/homermultitext/hmt-archive/tree/master/releases-cex
// and change this value if needed:
val releaseId = "2020i"


## Configure Jupyter notebook

In [1]:
// 1. Add maven repository where we can find our libraries
val myBT = coursierapi.MavenRepository.of("https://dl.bintray.com/neelsmith/maven")
interp.repositories() ++= Seq(myBT)

myBT: coursierapi.MavenRepository = MavenRepository(https://dl.bintray.com/neelsmith/maven)

In [4]:
// 2. Make libraries available with `$ivy` imports:
import $ivy.`edu.holycross.shot::scm:7.4.0`
import $ivy.`edu.holycross.shot::ohco2:10.20.4`
import $ivy.`edu.holycross.shot.cite::xcite:4.3.0`
import $ivy.`edu.holycross.shot::dse:7.1.3`
import $ivy.`edu.holycross.shot::greek:9.0.0`


## Load HMT data

Data releases of the Homer Multitext project archive are published as CITE libraries, and committed to the `hmt-archive` GitHub repository in CEX format.
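
To give a rough idea of what the loader is reading, here is a minimal sketch of pulling text nodes out of a CEX string by hand. It assumes the usual CEX conventions of blocks introduced by `#!` labels (such as `#!ctsdata`) and `#`-delimited columns within a block. The helper `blocks` and the tiny sample are illustrative only; the notebook itself relies on `CiteLibrarySource.fromUrl` to do the real parsing.

```scala
// Illustrative sample in the assumed CEX shape: `#!`-labelled blocks,
// with `#`-delimited columns (URN, then text) inside the ctsdata block.
val sampleCex =
  """|#!cexversion
     |3.0
     |#!ctsdata
     |urn:cts:greekLit:tlg0012.tlg001.due_ebbott:10.1#Alongside the ships the other best men of the Panachaeans
     |urn:cts:greekLit:tlg0012.tlg001.due_ebbott:10.2#slept all night long, subdued by gentle sleep,
     |""".stripMargin

// Hypothetical helper: group the non-empty lines of a CEX string by block label.
def blocks(cex: String): Map[String, Vector[String]] = {
  val lines = cex.split("\n").toVector.map(_.trim).filter(_.nonEmpty)
  val start = (Option.empty[String], Map.empty[String, Vector[String]])
  lines.foldLeft(start) {
    // A `#!` line opens a new block
    case ((_, acc), ln) if ln.startsWith("#!") =>
      val label = ln.drop(2)
      (Some(label), acc.updated(label, acc.getOrElse(label, Vector.empty)))
    // Any other line belongs to the block currently open
    case ((Some(label), acc), ln) =>
      (Some(label), acc.updated(label, acc(label) :+ ln))
    // Ignore anything before the first label
    case ((None, acc), _) => (None, acc)
  }._2
}

// Split each ctsdata line at its first `#` into a (URN, text) pair.
val textNodes: Vector[(String, String)] =
  blocks(sampleCex)
    .getOrElse("ctsdata", Vector.empty)
    .map(_.split("#", 2))
    .collect { case Array(urn, text) => (urn, text) }
```

Splitting with a limit of 2 keeps any later `#` characters inside the text column rather than treating them as further delimiters.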



In [None]:
import edu.holycross.shot.scm._

val url = s"https://raw.githubusercontent.com/homermultitext/hmt-archive/master/releases-cex/hmt-${releaseId}.cex"
val lib = CiteLibrarySource.fromUrl(url)

2020-09-12 11:45:17.516Z  info [CiteLibrary] Building text repo from cex ...  - (CiteLibrary.scala:160)
2020-09-12 11:45:28.734Z  info [CiteLibrary] Building collection repo from cex ...  - (CiteLibrary.scala:163)


In [5]:
import edu.holycross.shot.ohco2._
import edu.holycross.shot.dse._
import edu.holycross.shot.greek._

val corpus = lib.textRepository.get.corpus
val dsev = DseVector.fromCiteLibrary(lib)
val scholia = corpus.nodes.filter(_.urn.textGroup == "tlg5026")

[32mimport [39m[36medu.holycross.shot.ohco2._
[39m
[32mimport [39m[36medu.holycross.shot.dse._
[39m
[32mimport [39m[36medu.holycross.shot.greek._

[39m
[36mcorpus[39m: [32mCorpus[39m = [33mCorpus[39m(
  [33mVector[39m(
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0012.tlg001.due_ebbott:10.1"[39m),
      [32m"Alongside the ships the other best men of the Panachaeans"[39m
    ),
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0012.tlg001.due_ebbott:10.2"[39m),
      [32m"slept all night long, subdued by gentle sleep,"[39m
    ),
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0012.tlg001.due_ebbott:10.3"[39m),
      [32m"but not the son of Atreus, Agamemnon, the shepherd of the warriors\u2014"[39m
    ),
    [33mCitableNode[39m(
      [33mCtsUrn[39m([32m"urn:cts:greekLit:tlg0012.tlg001.due_ebbott:10.4"[39m),
      [32m"sweet sleep did not hold him, as he pondered man

## Search and format results 

In [6]:

val pageBaseUrl = "http://www.homermultitext.org/facsimiles/venetus-a/"

def search(s: String) = {
  // Literal substring match against every scholion node
  val matchedPsgs = scholia.filter(_.text.contains(s))
  val pls = if (matchedPsgs.size == 1) { "" } else  { "s" }
  val hdr = s"<h2>Search for string ${s}</h2>" +
  s"<p>Found ${matchedPsgs.size} passage${pls}</p>"
  val results = for ( (urn, idx)  <- matchedPsgs.map(_.urn).zipWithIndex) yield {
    // Drop the last passage level to get the URN of the whole scholion
    val scholion = urn.collapsePassageBy(1)
    // All text nodes contained in that scholion
    val nodes = corpus.nodes.filter(n => scholion > n.urn)
    // Highlight with `replace` (literal) rather than `replaceAll` (regex),
    // so the highlighting agrees with the literal `contains` test above
    val text = nodes.map(n => "<blockquote>" + n.text.replace(s, "<strong>" + s + "</strong>") + "</blockquote>" )
    // Look up the page (text-bearing surface) indexed to this scholion, if any
    dsev.tbsForText(scholion) match  {
      case None => {
        s"<li> <strong>${idx + 1}/${matchedPsgs.size}</strong> ${scholion} (Sadly, no page indexed in DSE) "  + text.mkString("\n")  + "</li>"
      }
      case Some(pgUrn) => {
        val pg = pgUrn.objectComponent
        val url = pageBaseUrl + pg + "/"
        val link = "<a href=\"" + url + "\">facsimile</a>"
        s"<li> <strong>${idx + 1}/${matchedPsgs.size}</strong> ${scholion}, page ${pg} (${link})" + text.mkString("\n") + "</li>"
      }
    }
  }
  Html(hdr + results.mkString("\n"))
}


pageBaseUrl: String = "http://www.homermultitext.org/facsimiles/venetus-a/"
defined function search
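
Two steps inside `search` are plain string manipulation and can be tried without loading the archive: collapsing a passage reference by one level, and highlighting the query in the matched text. The sketch below uses hypothetical stand-ins (`collapsePassage`, `highlight`) that work on bare strings. They are not the `CtsUrn` API, and the leaf name `comment` in the sample URN is an assumption about how scholia nodes are cited.

```scala
// Hypothetical stand-in for collapsing a CTS URN by one passage level:
// drop the last dot-separated part of the passage component
// (the passage component is everything after the final colon).
def collapsePassage(urn: String): String = {
  val cut = urn.lastIndexOf(':')
  val work = urn.take(cut + 1)
  val passage = urn.drop(cut + 1)
  work + passage.split('.').dropRight(1).mkString(".")
}

// Hypothetical stand-in for the highlighting step: wrap every literal
// occurrence of the query in <strong> tags. `replace` treats the query
// as plain text, so characters like "." or "(" are not read as regex syntax.
def highlight(text: String, query: String): String =
  text.replace(query, "<strong>" + query + "</strong>")

// Sample URN: the `comment` leaf is an assumed naming convention.
val node = "urn:cts:greekLit:tlg5026.msA.hmt:3.527.comment"
val scholion = collapsePassage(node)

val marked = highlight("οἰκονομία", "κονομ")
val literalDot = highlight("a.b", ".")
```

Treating the query as a literal keeps the highlighting consistent with a literal `contains` test: a query such as `.` marks only actual full stops, whereas a regex-based replacement would mark every character.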