Merge pull request #709 from clulab/kwalcock/webapp
Add a webapp
kwalcock committed Feb 8, 2023
2 parents 91e7571 + 55dbfbb commit 2ffeab5
Showing 95 changed files with 14,695 additions and 3 deletions.
1 change: 1 addition & 0 deletions CHANGES.md
@@ -1,3 +1,4 @@
+ **8.5.4** - Add a webapp subproject
+ **8.5.3** - Allow document construction from text fragments containing multiple sentences
+ **8.5.3** - Build for Scala 3, use artifactory.clulab.org, streamline buffers
+ **8.5.3** - Improve number parser with words such as "dozen", "grand"
13 changes: 13 additions & 0 deletions build.sbt
@@ -26,3 +26,16 @@ lazy val corenlp = project

lazy val openie = project
.dependsOn(main % "compile -> compile; test -> test")

lazy val webapp = project
.enablePlugins(PlayScala)
.dependsOn(main % "compile -> compile; test -> test")
.settings(
// scala3 doesn't have play and is ruled out completely.
// scala213 dies at runtime thinking it needs something from scala211.
// scala212 works!
// scala211 isn't compiling and complains on twirlCompileTemplates.
crossScalaVersions := Seq(scala212)
)

addCommandAlias("dockerizeWebapp", ";webapp/docker:publishLocal")
Binary file added docs/webapp_full.png
4 changes: 2 additions & 2 deletions main/build.sbt
@@ -19,7 +19,7 @@ libraryDependencies ++= {
}
val combinatorsVersion = {
CrossVersion.partialVersion(scalaVersion.value) match {
-case Some((2, minor)) if minor <= 12 => "1.1.2" // Higher causes problems with libraries.
+case Some((2, minor)) if minor <= 13 => "1.1.2" // Higher causes problems with libraries.
case _ => "2.1.1" // up to 2.1.1
}
}
@@ -60,7 +60,7 @@ libraryDependencies ++= {
// Instead, all code makes use of the Java interface.
"org.slf4j" % "slf4j-api" % "1.7.32", // MIT
// Local logging is provided here but not published.
-"ch.qos.logback" % "logback-classic" % "1.2.8", // up to 1.2.8; less than 1.2 is vulnerable
+"ch.qos.logback" % "logback-classic" % "1.2.8", // up to 1.2.8; less than 1.2 is vulnerable
// testing
"org.scalatest" %% "scalatest" % "3.2.10" % Test, // Apache-2.0
// trained models for local ML models used in both main and corenlp
7 changes: 6 additions & 1 deletion project/build.properties
@@ -1 +1,6 @@
-sbt.version = 1.8.0
+# Version 1.8.x will cause problems when combined with the play plug-in used for the webapp!
+# [error] * org.scala-lang.modules:scala-xml_2.12:2.1.0 (early-semver) is selected over {1.2.0, 1.1.1}
+# [error] +- org.scala-lang:scala-compiler:2.12.17 (depends on 2.1.0)
+# [error] +- com.typesafe.sbt:sbt-native-packager:1.5.2 (scalaVersion=2.12, sbtVersion=1.0) (depends on 1.1.1)
+# [error] +- com.typesafe.play:twirl-api_2.12:1.5.1 (depends on 1.2.0)
+sbt.version = 1.7.2
2 changes: 2 additions & 0 deletions project/plugins.sbt
@@ -1,5 +1,7 @@
// Latest version numbers were updated on 2021 Mar 11.
addSbtPlugin("com.jsuereth" % "sbt-pgp" % "1.1.2-1") // up to 1.1.2-1 *
addSbtPlugin("org.xerial.sbt" % "sbt-sonatype" % "2.3") // up to 3.9.6 *
// Newer versions of play are not compatible with Scala 2.11. None works with Scala 3.
addSbtPlugin("com.typesafe.play" % "sbt-plugin" % "2.8.19") // up to 2.8.19
addSbtPlugin("com.github.gseitz" % "sbt-release" % "1.0.13") // up to 1.0.13
// * Held back out of an abundance of caution.
1 change: 1 addition & 0 deletions shell.bat
@@ -0,0 +1 @@
sbt "runMain org.clulab.processors.ProcessorShell"
9 changes: 9 additions & 0 deletions webapp/.gitignore
@@ -0,0 +1,9 @@
logs
target
/.idea
/.idea_modules
/.classpath
/.project
/.settings
/RUNNING_PID
cache/*
30 changes: 30 additions & 0 deletions webapp/README.md
@@ -0,0 +1,30 @@
# webapp

This subproject of processors implements a web page that displays the output
of processors as HTML. Diagrams and tables are combined into a display much like the image below.

![Webapp window with text](../docs/webapp_full.png?raw=True)

## Execution

One can start the webapp directly from within `sbt` in development mode with the command `webapp/run`. The web page will then be accessible at [http://localhost:9000](http://localhost:9000). If you need to debug the webapp, use `sbt -jvm-debug 5005` and then configure IntelliJ for "Remote JVM Debug". You should then be able to set breakpoints in `org.clulab.processors.webapp.controllers.HomeController`, for example.

## Configuration

The configuration for Odin used in the `HomeController` is based on the OdinStarter (`org.clulab.odinstarter.OdinStarter`) App. The NER and rule files for the App are configured in code. For the webapp, the same files are instead specified in the configuration file `processors.conf` under the keys `customLexiconNer` and `extractorEngine`. To customize the webapp, either change the filenames there or edit the contents of the Odin files in the directory `main/src/main/resources/org/clulab/odinstarter`.
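
As a rough illustration of the shape those two keys take, the sketch below parses an inline HOCON string with Typesafe Config and reads the same fields that `HomeController` maps onto its config beans (`kb`, `caseInsensitiveMatching`, and `rules`). The resource paths are hypothetical placeholders, not the files shipped with processors, and the real webapp loads `processors.conf` from the classpath rather than from a string.

```scala
import com.typesafe.config.{Config, ConfigFactory}
import scala.jdk.CollectionConverters._ // as in HomeController; Scala 2.12 needs scala-collection-compat for this import

object ProcessorsConfSketch {
  // Hypothetical stand-in for the contents of processors.conf.
  // The file names below are placeholders chosen for illustration only.
  val hocon: String =
    """
      |customLexiconNer = [
      |  {
      |    kb = "org/clulab/odinstarter/EXAMPLE_A.tsv"
      |    caseInsensitiveMatching = true
      |  }
      |  {
      |    kb = "org/clulab/odinstarter/EXAMPLE_B.tsv"
      |    caseInsensitiveMatching = false
      |  }
      |]
      |extractorEngine {
      |  rules = "/org/clulab/odinstarter/example.yml"
      |}
      |""".stripMargin

  val config: Config = ConfigFactory.parseString(hocon)

  // One entry per lexicon: the knowledge base resource and its matching mode.
  val nerConfigs: List[Config] = config.getConfigList("customLexiconNer").asScala.toList
  val kbs: List[String] = nerConfigs.map(_.getString("kb"))
  val caseInsensitiveMatchings: List[Boolean] = nerConfigs.map(_.getBoolean("caseInsensitiveMatching"))

  // A single resource path pointing at the Odin rule file.
  val rules: String = config.getConfig("extractorEngine").getString("rules")
}
```

The two lists are kept parallel because the controller passes them to `LexiconNER` as separate sequences, so each lexicon entry needs both fields.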


## Dockerization

In this subproject there is also a `docker.sbt` file which allows one to build an image from within `sbt`. A command alias `dockerizeWebapp` has been set up for it.

To run the resulting image, use a command like
```bash
docker run -d --env secret=<secret> -p 9000:9000 --restart unless-stopped processors-webapp:latest
```
The secret is the value for `play.http.secret.key` used in
[conf/application.conf](./conf/application.conf) to protect the application.

## Limitations

The webapp presently works only with Scala 2.12 because of library and plug-in conflicts. The Play framework itself is not ready for Scala 3. Scala 2.12 is the default version for processors, so things should just work for the most part. Because of this limitation, however, the webapp is not "aggregated" and will not be published or released with the other projects. To publish, make sure the version is set as desired and perform a `webapp/publish` for Artifactory or `webapp/publishSigned` and `webapp/sonatypeRelease` for Maven Central. The files `application.conf` and `routes` are reused in a [template repo](https://github.com/clulab/sbt-processors-small.g8), so changes should be propagated there.
@@ -0,0 +1,162 @@
package org.clulab.processors.webapp.controllers

import org.clulab.odin.{CrossSentenceMention, EventMention, ExtractorEngine, Mention, RelationMention, TextBoundMention}
import org.clulab.processors.Processor
import org.clulab.processors.clu.CluProcessor
import org.clulab.processors.webapp.serialization.WebSerializer
import org.clulab.sequences.LexiconNER
import org.clulab.utils.{FileUtils, Unordered}
import org.clulab.utils.Unordered.OrderingOrElseBy
import com.typesafe.config.{ConfigBeanFactory, ConfigFactory}
import play.api.mvc._
import play.api.mvc.Action

import javax.inject._
import scala.beans.BeanProperty
import scala.jdk.CollectionConverters._
import scala.util.Try

@Singleton
class HomeController @Inject()(cc: ControllerComponents) extends AbstractController(cc) {

def initialize(): (Processor, ExtractorEngine) = {
println("[processors] Initializing the processor ...")

val config = ConfigFactory.load("application")
.withFallback(ConfigFactory.load("processors"))
val customLexiconNerConfigs = config.getConfigList("customLexiconNer").asScala.map { config =>
ConfigBeanFactory.create(config, classOf[CustomLexiconNerConfig])
}
val extractorEngineConfig = ConfigBeanFactory.create(config.getConfig("extractorEngine"), classOf[ExtractorEngineConfig])

val processor = {
val kbs = customLexiconNerConfigs.map(_.kb)
val caseInsensitiveMatchings = customLexiconNerConfigs.map(_.caseInsensitiveMatching)
val customLexiconNer = LexiconNER(kbs, caseInsensitiveMatchings, None)
val processor = new CluProcessor(optionalNER = Some(customLexiconNer))

processor
}
val extractorEngine: ExtractorEngine = {
val rules = FileUtils.getTextFromResource(extractorEngineConfig.rules)
val extractorEngine = ExtractorEngine(rules)

extractorEngine
}

{
val document = processor.annotate("John eats cake.")
extractorEngine.extractFrom(document)
}
println("[processors] Completed Initialization ...")
(processor, extractorEngine)
}

implicit val mentionOrder = {
val mentionRank: Map[Class[_], Int] = Map(
classOf[TextBoundMention] -> 0,
classOf[EventMention] -> 1,
classOf[RelationMention] -> 2,
classOf[CrossSentenceMention] -> 3
)

Unordered[Mention]
.orElseBy(_.sentence)
.orElseBy { mention => mentionRank.getOrElse(mention.getClass, mentionRank.size) }
.orElseBy(_.getClass.getName)
.orElseBy(_.arguments.size)
.orElseBy(_.tokenInterval)
.orElse(-1)
}

def printMention(mention: Mention, nameOpt: Option[String] = None, depth: Int = 0): Unit = {
val sentence = mention.sentenceObj
val tokenInterval = mention.tokenInterval
val indent = " " * depth
val name = nameOpt.getOrElse("<none>")
val labels = mention.labels
val words = tokenInterval.map(sentence.words)
val tags = sentence.tags.map(tokenInterval.map(_)).getOrElse(Seq.empty)
val lemmas = sentence.lemmas.map(tokenInterval.map(_)).getOrElse(Seq.empty)
val entities = sentence.entities.map(tokenInterval.map(_)).getOrElse(Seq.empty)
val norms = sentence.norms.map(tokenInterval.map(_)).getOrElse(Seq.empty)
val chunks = sentence.chunks.map(tokenInterval.map(_)).getOrElse(Seq.empty)
val raws = tokenInterval.map(sentence.raw)

def toRow(field: String, text: String): Unit = println(s"$indent$field: $text")

def toRows(field: String, texts: Seq[String]): Unit = toRow(field, texts.mkString(" "))

toRow (" Name", name)
toRow (" Type", mention.getClass.getSimpleName)
toRow (" FoundBy", mention.foundBy)
toRow (" Sentence", mention.sentenceObj.getSentenceText)
toRows(" Labels", labels)
toRows(" Words", words)
toRows(" Tags", tags)
toRows(" Lemmas", lemmas)
toRows(" Entities", entities)
toRows(" Norms", norms)
toRows(" Chunks", chunks)
toRows(" Raw", raws)
toRows("Attachments", mention.attachments.toSeq.map(_.toString).sorted)

mention match {
case textBoundMention: TextBoundMention =>
case eventMention: EventMention =>
toRow(" Trigger", "")
printMention(eventMention.trigger, None, depth + 1)
case relationMention: RelationMention =>
case crossSentenceMention: CrossSentenceMention =>
case _ =>
}

if (mention.arguments.nonEmpty) {
toRow(" Arguments", "")
for (name <- mention.arguments.keys.toSeq.sorted; mention <- mention.arguments(name).sorted)
printMention(mention, Some(name), depth + 1)
}
println()
}

val webSerializer = new WebSerializer()
val (processor, extractorEngine) = initialize()

def index(): Action[AnyContent] = Action { implicit request: Request[AnyContent] =>
Ok(views.html.index())
}

def parseText(text: String): Action[AnyContent] = Action {
println("Text:")
println(text)
println()

val document = processor.annotate(text)

println("Sentences:")
document.sentences.foreach { sentence =>
println(sentence.getSentenceText)
}
println()

val mentions = extractorEngine.extractFrom(document).sorted

println("Mentions:")
mentions.foreach { mention =>
printMention(mention)
}
println()

val json = webSerializer.processDocument(text, document, mentions)

Ok(json)
}
}

case class CustomLexiconNerConfig(@BeanProperty var kb: String, @BeanProperty var caseInsensitiveMatching: Boolean) {
def this() = this("", false)
}

case class ExtractorEngineConfig(@BeanProperty var rules: String) {
def this() = this("")
}
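
The `mentionOrder` in `HomeController` above sorts mentions by a chain of keys, moving on to the next key only when the earlier ones tie: sentence index, a rank per mention class, class name, argument count, and token interval. The following is a minimal sketch of that tie-breaking idea using only the standard library. It is not the `Unordered` utility from processors; `Item` is a hypothetical stand-in for `Mention`, and a tuple ordering replaces the `orElseBy` chain.

```scala
object MentionOrderSketch {
  // Hypothetical stand-in for an Odin Mention, holding only the fields used for sorting.
  final case class Item(sentence: Int, kind: String, argCount: Int, start: Int)

  // Same ranking idea as in HomeController: known kinds first, unknown kinds last.
  val kindRank: Map[String, Int] = Map(
    "TextBoundMention" -> 0,
    "EventMention" -> 1,
    "RelationMention" -> 2,
    "CrossSentenceMention" -> 3
  )

  // Tuples compare lexicographically, so later fields only break ties in earlier ones.
  implicit val itemOrdering: Ordering[Item] =
    Ordering.by[Item, (Int, Int, String, Int, Int)] { item =>
      (item.sentence, kindRank.getOrElse(item.kind, kindRank.size), item.kind, item.argCount, item.start)
    }

  val items: Seq[Item] = Seq(
    Item(sentence = 1, kind = "TextBoundMention", argCount = 0, start = 0),
    Item(sentence = 0, kind = "EventMention", argCount = 2, start = 1),
    Item(sentence = 0, kind = "TextBoundMention", argCount = 0, start = 3),
    Item(sentence = 0, kind = "TextBoundMention", argCount = 0, start = 1)
  )

  // Sentence 0 comes first; within it, text-bound items precede the event, ordered by start.
  val sorted: Seq[Item] = items.sorted
}
```

A deterministic order like this keeps the printed mentions and the generated tables stable between runs for the same input text.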
@@ -0,0 +1,121 @@
package org.clulab.processors.webapp.serialization

import org.clulab.odin.{CrossSentenceMention, EventMention, Mention, RelationMention, TextBoundMention}

class MentionsObj(mentions: Seq[Mention]) {
val tableHeader = """
|<table style="margin-top: 0;">
|""".stripMargin
val tableTrailer = """
|</table>
|""".stripMargin
val leftTdHeader = """
|<tr>
| <td align="right">
|""".stripMargin
val rightTdHeader = """
|<tr>
| <td>
|""".stripMargin
val tdSeparator = """
| </td>
| <td>
|""".stripMargin
val tdTrailer = """
| </td>
|</tr>
|""".stripMargin

def getTrSeparator(wide: Boolean): String = {
val style = if (wide) """ style = "width: 100%;"""" else ""
s"""
|<tr>
| <th>Field</th>
| <th$style>Value</th>
|</tr>
|""".stripMargin
}

def getTd(field: String, text: String): String =
s"""
|$leftTdHeader
| ${xml.Utility.escape(field)}:&nbsp;
|$tdSeparator
| ${xml.Utility.escape(text)}
|$tdTrailer
|""".stripMargin

def getTds(field: String, strings: Seq[String]): String =
getTd(field, strings.mkString(", "))

def openTable(field: String): String = s"""
|$leftTdHeader
| ${xml.Utility.escape(field)}:&nbsp;
|$tdSeparator
| $tableHeader
|""".stripMargin

val closeTable: String = s"""
| $tableTrailer
|$tdTrailer
|""".stripMargin

def mkMentionsObj(mention: Mention, sb: StringBuilder, nameOpt: Option[String] = None, depth: Int = 0): Unit = {
val sentence = mention.sentenceObj
val tokenInterval = mention.tokenInterval
val name = nameOpt.getOrElse("<none>")
val labels = mention.labels
val words = tokenInterval.map(sentence.words)
val tags = sentence.tags.map(tokenInterval.map(_)).getOrElse(Seq.empty)
val lemmas = sentence.lemmas.map(tokenInterval.map(_)).getOrElse(Seq.empty)
val entities = sentence.entities.map(tokenInterval.map(_)).getOrElse(Seq.empty)
val norms = sentence.norms.map(tokenInterval.map(_)).getOrElse(Seq.empty)
val chunks = sentence.chunks.map(tokenInterval.map(_)).getOrElse(Seq.empty)
val raws = tokenInterval.map(sentence.raw)

sb
.append(getTrSeparator(depth != 0))
.append(getTd ("Sentence #", (mention.sentence + 1).toString))
.append(getTd ("Name", name))
.append(getTd ("Type", mention.getClass.getSimpleName))
.append(getTd ("FoundBy", mention.foundBy))
.append(getTd ("Sentence", mention.sentenceObj.getSentenceText))
.append(getTds("Labels", labels))
.append(getTds("Words", words))
.append(getTds("Tags", tags))
.append(getTds("Lemmas", lemmas))
.append(getTds("Entities", entities))
.append(getTds("Norms", norms))
.append(getTds("Chunks", chunks))
.append(getTds("Raw", raws))
.append(getTds("Attachments", mention.attachments.toSeq.map(_.toString).sorted))

mention match {
case textBoundMention: TextBoundMention =>
case eventMention: EventMention =>
sb.append(openTable("Trigger"))
mkMentionsObj(eventMention.trigger, sb, None, depth + 1)
sb.append(closeTable)
case relationMention: RelationMention =>
case crossSentenceMention: CrossSentenceMention =>
case _ =>
}

if (mention.arguments.nonEmpty) {
sb.append(openTable("Arguments"))
for (name <- mention.arguments.keys.toSeq.sorted; mention <- mention.arguments(name).sorted)
mkMentionsObj(mention, sb, Some(name), depth + 1)
sb.append(closeTable)
}
}

def mkHtml: String = {
val sb = new StringBuilder(tableHeader)

mentions.foreach { mention =>
mkMentionsObj(mention, sb)
}
sb.append(tableTrailer)
sb.toString
}
}
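
`MentionsObj` above builds its HTML by escaping each field and value before placing them into table cells, which matters because defaults such as `<none>` would otherwise be read as markup. The sketch below illustrates that pattern in a self-contained way. The `escapeHtml` function is a hand-rolled stand-in for `xml.Utility.escape`, and the `row` and `table` helpers are hypothetical simplifications rather than the methods from the commit.

```scala
object MentionTableSketch {
  // Stand-in for xml.Utility.escape: replaces the characters that are special in HTML text.
  def escapeHtml(text: String): String = text.map {
    case '&' => "&amp;"
    case '<' => "&lt;"
    case '>' => "&gt;"
    case '"' => "&quot;"
    case c   => c.toString
  }.mkString

  // One table row with a right-aligned field name and an escaped value.
  def row(field: String, value: String): String =
    s"""<tr><td align="right">${escapeHtml(field)}:&nbsp;</td><td>${escapeHtml(value)}</td></tr>"""

  // A whole table from (field, value) pairs, in the order given.
  def table(rows: Seq[(String, String)]): String =
    rows.map { case (field, value) => row(field, value) }.mkString("<table>", "", "</table>")
}
```

For example, `MentionTableSketch.table(Seq("Name" -> "<none>"))` yields a single row whose value cell contains `&lt;none&gt;`, so the browser shows the literal text instead of treating it as a tag.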
