# Get info from the XML

We check the validity of the XML and make an inventory of its elements and attributes.

We collect the page sequences, together with the scans and regions that the pages refer to.

We generate IIIF manifests for the scans.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from ti.info.tei import TEI
from ti.info.scans import Scans
from ti.info.iiif import IIIF

In [3]:
BASE = "../newPipeline/israels"

# Step 0: Scan ingest

To convert a directory of `.jpf` files to `.jpg` files, run:

```
cd directory
mogrify -format jpg *.jpf
```
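
If you would rather drive the conversion from Python than from the shell, it could be wrapped roughly as follows. This is a minimal sketch, not part of the `ti` package: the helper names `jpf_files`, `mogrify_command` and `convert` are hypothetical, and it assumes ImageMagick's `mogrify` is on the `PATH`.

```python
import subprocess
from pathlib import Path


def jpf_files(directory):
    """Names of the .jpf files in directory, sorted for a stable order."""
    return sorted(p.name for p in Path(directory).glob("*.jpf"))


def mogrify_command(names):
    """Argument list equivalent to `mogrify -format jpg *.jpf`.

    The file names are listed explicitly, so no shell globbing is needed.
    """
    return ["mogrify", "-format", "jpg", *names]


def convert(directory):
    """Run mogrify inside directory; return the number of files handed to it.

    Hypothetical helper: does nothing when the directory has no .jpf files.
    """
    names = jpf_files(directory)
    if names:
        subprocess.run(mogrify_command(names), cwd=directory, check=True)
    return len(names)
```

Calling `convert` on a scan directory then plays the role of the two shell commands above. `check=True` makes a failing `mogrify` run raise an exception instead of passing silently.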

In [4]:
SC = Scans("../scans", f"{BASE}/config/scans.yml", verbose=1, force=False)

Source dir = ../scans
scanprep settings read from ../newPipeline/israels/config/scans.yml


In [5]:
SC.process(f"{BASE}/scanInfo", force=None)

Initialized ../newPipeline/israels/scanInfo
srcSubDirs=('illustrations', 'logo', 'pages')
illustrations:
	Already present: sizes and colorspaces files (illustrations)
	scans: 73
logo:
pages:
	Already present: sizes and colorspaces files (pages)
	scans: 263


In [6]:
SC.process(f"{BASE}/scanInfo", force=True)

Initialized ../newPipeline/israels/scanInfo
srcSubDirs=('illustrations', 'logo', 'pages')
illustrations:
		Get attributes of 73 scans (illustrations)
			100% done
	scans: 73
logo:
pages:
		Get attributes of 264 scans (pages)
			 38% done
			 76% done
			100% done
	scans: 263


# Step 1: Inventory

In [11]:
Tei = TEI("../tei/2025-10-07", f"{BASE}/config/tei.yml", verbose=0)

section model is I


In [12]:
Tei.inventory("../schema", f"{BASE}/report", carryon=True, verbose=1)

TEI to TF checking: ../tei/2025-10-07 => ../newPipeline/israels/report
Processing instructions are treated
XML validation will be performed
Analysing ~/github/annotation/text-info/ti/tools/tei/tei_all.xsd
	round   1: 232 changes
Analysing ~/github/annotation/text-info/ti/tools/tei/tei_all.xsd
	round   1: 232 changes
Analysing ../schema/editem-about.xsd
	round   1:  52 changes
118 identical override(s)
  0 changing override(s)
INFO: Needs editem.xsd (exists)
Analysing ~/github/annotation/text-info/ti/tools/tei/tei_all.xsd
	round   1: 232 changes
Analysing ../schema/editem-letter.xsd
	round   1:  71 changes
172 identical override(s)
  2 changing override(s)
	artwork complex pure (added)
	eventName complex mixed (added)
INFO: Needs editem.xsd (exists)
Analysing ~/github/annotation/text-info/ti/tools/tei/tei_all.xsd
	round   1: 232 changes
Analysing ../schema/editem-artworklist.xsd
	round   1:  21 changes
	round   2:   2 changes
 29 identical override(s)
  3 changing override(s)
	artwork c

255 validation error(s) in 2 file(s) written to ../newPipeline/israels/report/errors.txt


0 pagebreaks without facs attribute.
602 pagebreaks encountered.
301 distinct scans referred to by pagebreaks.
260 surfaces declared
148 zones declared
408 scans declared and mapped.


99 unused surfaces
8 unused zones


0 processing instructions encountered.
72 tags of which 0 with multiple namespaces written to ../newPipeline/israels/report/namespaces.txt
397 info line(s) written to ../newPipeline/israels/report/elements.txt
Refs written to ../newPipeline/israels/report/refs.txt
	resolvable:  798 in  798
	dangling:    341 in 2561
	ALL:        1139 in 3359 
Ids written to ../newPipeline/israels/report/ids.txt
	referenced:  798 by  798
	non-unique:    0
	unused:     4444
	ALL:        5242 in 5242
lb-parent info written to ../newPipeline/israels/report/lb-parents.txt
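
The reference and id counts above can be read as a comparison between the ids declared in the corpus and the references that point at them. The sketch below illustrates that idea on toy data. It is an assumption about what the report means, not the code that `Tei.inventory` runs. In particular, reading `X in Y` as "X distinct targets in Y occurrences" is a guess, and `classify_refs` is a hypothetical name.

```python
from collections import Counter


def classify_refs(declared_ids, refs):
    """Split references into resolvable and dangling, and find unused ids.

    declared_ids: iterable of ids declared in the corpus
    refs: iterable of reference targets, one item per occurrence

    Returns (resolvable, dangling, unused). The first two map each distinct
    target to its number of occurrences; unused is the set of declared ids
    that no reference points at.
    """
    declared = set(declared_ids)
    counts = Counter(refs)
    resolvable = {t: n for t, n in counts.items() if t in declared}
    dangling = {t: n for t, n in counts.items() if t not in declared}
    unused = declared - set(counts)
    return resolvable, dangling, unused


def tally(mapping):
    """Return (distinct targets, total occurrences) for one category."""
    return len(mapping), sum(mapping.values())


# Toy data, not taken from the corpus.
ids = ["p1", "p2", "a1"]
refs = ["p1", "p1", "x9", "x9", "x9", "z3"]
resolvable, dangling, unused = classify_refs(ids, refs)
```

Under this reading, a dangling reference is not necessarily an error. It may point to an id declared outside the files that were inventoried.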


# Step 3: Generate IIIF manifests

In [13]:
II = IIIF(f"{BASE}/report", f"{BASE}/scanInfo", f"{BASE}/config/iiif.yml", verbose=1)

iiif settings read from ../newPipeline/israels/config/iiif.yml
Source information taken from ../newPipeline/israels/report
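
For orientation, here is a hedged sketch of the general shape of an IIIF Presentation 3 manifest with one image per canvas. It is not the output of `II.manifests`; the helper names `canvas` and `manifest` and all `example.org` URLs are made up for illustration, and the manifests generated below may carry more fields, such as provider and logo information.

```python
import json

PRESENTATION_CONTEXT = "http://iiif.io/api/presentation/3/context.json"


def canvas(canvas_id, image_url, width, height):
    """One canvas holding a single image that fills it."""
    return {
        "id": canvas_id,
        "type": "Canvas",
        "width": width,
        "height": height,
        "items": [
            {
                "id": f"{canvas_id}/page",
                "type": "AnnotationPage",
                "items": [
                    {
                        "id": f"{canvas_id}/page/image",
                        "type": "Annotation",
                        # "painting" means the body is the canvas content
                        "motivation": "painting",
                        "body": {
                            "id": image_url,
                            "type": "Image",
                            "format": "image/jpeg",
                            "width": width,
                            "height": height,
                        },
                        "target": canvas_id,
                    }
                ],
            }
        ],
    }


def manifest(manifest_id, label, canvases):
    """Wrap a sequence of canvases in a manifest."""
    return {
        "@context": PRESENTATION_CONTEXT,
        "id": manifest_id,
        "type": "Manifest",
        "label": {"en": [label]},
        "items": list(canvases),
    }


# Hypothetical identifiers, for illustration only.
example = manifest(
    "https://example.org/manifests/letter001.json",
    "letter001",
    [
        canvas(
            "https://example.org/manifests/letter001.json/canvas/p1",
            "https://example.org/iiif/3/p1.jpg/full/max/0/default.jpg",
            2000,
            3000,
        )
    ],
)
example_json = json.dumps(example, indent=2)
```

The width and height of each canvas are the kind of data that the scan step above collects as scan sizes.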


In [14]:
II.manifests(
    f"{BASE}/manifests",
    title="israels",
    baseUri="http://localhost:8040",
    iiifBaseUri="https://tt-iiif.dev.diginfra.org",
    verbose=1,
)

Parameters passed to manifest generation:
	baseUri    = http:/localhost:8040
	iiifBaseUri = https:/tt-iiif.dev.diginfra.org
	title      = israels
Values for the constants of the manifest generation:
	canvasUrl  = [[baseUri]]/files/[[title]]/static/manifests/{folder}/{file}.json/canvas/[[title]]/pages/{page}
	context    = http:/iiif.io/api/presentation/3/context.json
	ext        = jpg
	iiifParams = /full/max/0/default
	iiifPath   = [[title]]
	iiifPathEsc = [[title]]|
	iiifRoot   = [[iiifBaseUri]]/iiif/3/[[title]]|
	iiifserver = [[iiifBaseUri]]
	logoRoot   = [[baseUri]]/files/[[title]]/static/logo
	manifestRoot = [[baseUri]]/files/[[title]]/static/manifests
	manifestUrl = [[baseUri]]/files/[[title]]/static/manifests/{folder}/{file}.json
	pageUrl    = [[iiifBaseUri]]/iiif/3/[[title]]|pages|{page}.jpg/full/max/0/default.jpg
	profile    = http:/iiif.io/api/image/3/level1.json
	provHomepageId = https:/www.vangoghmuseum.nl/en
	provHomepageLabel = Homepage of the Van Gogh Museum
	provLogoHeigh