Commit 86fd7c0: Updated README
thvitt committed Jun 7, 2023 (1 parent: 1a46bd1)
1 changed file: README.md (24 additions, 24 deletions)

The Faust-Edition web application builds on data that is automatically pre-generated from the original XML files. This project integrates the various generation and upload processes (currently except for the images).

## Basic Build

The recommended way to build the edition is to use Gradle:

```bash
git clone --recursive https://github.com/faustedition/faust-gen
cd faust-gen
./gradlew build
```

This will clone this repository and all of its subrepositories, download (almost) everything that is required, and run the build. You will find the generated web site in `build/www` and a lot of auxiliary and intermediate files in `build`.

For the build to run, you need to have _Java_ installed locally.


For the `deployRsync` task, you need a local rsync, and it must be possible to establish an SSH connection to the target server (e.g., by providing an SSH agent with an unlocked key).
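
A minimal sketch of such a session (the key path is just an example; use whatever key has access to the target server):

```bash
# Start an SSH agent and unlock a key so the build can rsync non-interactively.
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519
./gradlew deployRsync
```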

## eXist-based search

There is a preliminary eXist app that implements the search functionality. Deploying the app to the eXist server automatically from the build is not really tested; there is a hack without proper configuration in the `deployToExist` task. The recommended way, however, is to build the xar (`./gradlew build` or `./gradlew xar`) and to deploy it to eXist manually. For a fresh install, this can be done by putting the app into eXist’s `autodeploy` folder.
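
For a fresh install, the manual route might look like the following sketch. The xar location (`build/faust.xar`) is an assumption, and `$EXIST_HOME` stands for your eXist installation directory:

```bash
./gradlew xar
# Assumption: the build writes the package to build/faust.xar;
# adjust the path to wherever your build actually puts it.
cp build/faust.xar "$EXIST_HOME/autodeploy/"
# eXist installs everything found in autodeploy/ on its next startup.
```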


## Components

### Diplomatic transcripts: SVG generation

The diplomatic transcripts are rendered page by page using JavaScript in a simulated browser.

The code that does the actual rendering can be found in <svg_rendering/page>. This folder contains a simple web page, with font resources etc. pulled in from faust-web, plus the rendering code mainly developed by Moritz Wissenbach in <svg_rendering/page/js_gen>.

To create both the diplomatic transcript and the overlay transcript for a single page, <render-multi-pages.js> is called using Node.js. It uses [Puppeteer](https://pptr.dev/) to remote-control a headless Chromium browser in which each page is rendered as SVG. The SVGs are then extracted and stored so they can later be included in the edition’s UI. Finally, all pages of a complete witness are rendered into a downloadable PDF file.

The JS does not work directly with the XML transcripts. Instead, each page needs to be transformed into a JSON representation, which is done using code from https://github.com/faustedition/faust-app, pulled in as a Maven dependency. The Java program at <src/main/java/net/faustedition/gen/DiplomaticConversion.java> runs the actual pipeline, i.e., it iterates through the manuscripts and their pages, converts each page to JSON, and runs <rendersvgs.js> on each of these JSON files. Intermediate results (i.e., JSON files) and, if enabled, debugging data (e.g., PDFs) are written to the target directory.

The process runs in parallel on multiple cores, and it may still take quite a while (30–90 minutes).

### Textual transcripts, metadata, and overview data

Basically all other data that depends on the source data is generated using the XProc pipelines in the faust-gen-html subproject. Among other things, this includes:

* the JSON basis for the genesis bar graph
* various other JSON data used by the web application

This process takes around half an hour. You can run individual parts from within that project, but I still recommend using the Gradle tasks, since those will generate the required dependencies first.
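
To see which tasks are available and in which order they would run, the standard Gradle commands work here as anywhere else:

```bash
# List all tasks the build defines, including those from the subprojects.
./gradlew tasks --all

# Show the task execution plan without actually running anything.
./gradlew build --dry-run
```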

### Experimental Macrogenetic Analysis

The macrogenesis part will produce various things:

* a zip file with the internal analysis data
* `build/order.xml`, a file with an estimated sorting order
* various reports on details of the internal analysis in `build/www/macrogenesis`, which later become part of the web site
* rendered SVGs for a lot of graphs for the report

The analysis takes 10–15 minutes. Most of the reporting is quick, but rendering a few large graphs may take a long time (hours) or even fail. A timeout can be specified; see the macrogenesis project’s documentation.
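
After a run, the results listed above can be inspected directly, for example:

```bash
# Pretty-print the estimated sorting order (requires xmllint from libxml2).
xmllint --format build/order.xml | head

# The HTML reports and rendered SVGs for the web site:
ls build/www/macrogenesis
```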

### Web application, and putting it all together

The web application, i.e. all non-generated code, is pulled in from http://github.com/faustedition/faust-web as submodule `src/main/web`.

<build.gradle> contains the code that actually runs all of this and copies stuff around. Afterwards, the complete web site (except for the digitized manuscripts) can be found in `build/www`. If you run the `deployRsync` task, it is rsynced to the server.

Gradle will try to parallelize some parts of the workflow. The following (generated) diagram shows the dependencies of the various tasks for `gradle build`: red tasks come from the main project, green ones from macrogenesis, blue ones from the xproc stuff (faust-gen-html). The superscript numbers give the order in which the tasks are queued, and the gray arrows indicate dependencies.

![Build tasks and their dependencies](gradle-dependencies.svg)

## Semi-automatic stuff

There are two steps that involve pulling in data from the internal wiki:

* filling the eXist instance, see the scripts in faust-gen-html
* preparing the facsimiles, see convert.sh