# Re-creating *VeBiDraCor*
VeBiDraCor assembles plays published on [DraCor](https://dracor.org) into a *Very Big Drama Corpus* containing some 2,900 plays.

The "original" Very Big Drama Corpus ("VeBiDraCor") used in the study ["Detecting Small Worlds in a Corpus of Thousands of Theater Plays – A DraCor Study in Comparative Literary Network Analysis"](https://github.com/dracor-org/small-world-paper) is available in the following repository: https://github.com/dracor-org/vebidracor

This notebook demonstrates how to use the [StableDraCor client](https://github.com/dracor-org/stabledracor) to re-create VeBiDraCor as a demonstration for the paper ["Dockerizing DraCor – A Container-based Approach to Reproducibility in Computational Literary Studies"](https://www.conftool.pro/dh2023/index.php?page=browseSessions&form_session=36#paperID358) presented at the conference [DH2023](https://dh2023.adho.org).

In [1]:
import logging
#logging.basicConfig(level=logging.DEBUG)
logging.basicConfig(level=logging.INFO)

In [2]:
import os

In [3]:
# Set the token in the shell before starting the notebook: export GITHUB_TOKEN={token}
# Tokens can be created at https://github.com/settings/tokens
github_token = os.environ.get("GITHUB_TOKEN")
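
If `GITHUB_TOKEN` is not set, `os.environ.get` silently returns `None`. The sketch below is one way to make that visible before the client is created. The helper name `read_github_token` is hypothetical and not part of StableDraCor. The check may be worth having because GitHub applies much stricter rate limits to unauthenticated API requests, which could matter when several thousand files are fetched.

```python
import logging
import os
from typing import Optional


def read_github_token(env_var: str = "GITHUB_TOKEN") -> Optional[str]:
    """Return the token stored in `env_var`, or None if it is unset or blank.

    Hypothetical helper for this notebook; not part of the StableDraCor client.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Warn instead of failing: the notebook can still be run without a token.
        logging.warning("%s is not set; GitHub requests will be unauthenticated.", env_var)
        return None
    return token


github_token = read_github_token()
```

The result can be passed to `StableDraCor(..., github_access_token=github_token)` exactly as in the cells of this notebook.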

In [4]:
from src.stabledracor.client import StableDraCor

In [5]:
vebidracor = StableDraCor(name="vebidracor",
                          description="DraCor system containing VeBiDraCor – a very big drama corpus with plays from several DraCor corpora",
                          github_access_token=github_token)

INFO:root:Initialized new StableDraCor instance: 'vebidracor' (ID: 0e10877d-aa33-4a84-b8d8-43d961d0c40e).
INFO:root:Docker is available.


In [6]:
vebidracor.run()

INFO:root:Fetched default compose file (configuration) from https://raw.githubusercontent.com/dracor-org/stabledracor/master/configurations/compose.fullstack.empty.yml.
 Network vebidracor_default  Creating
 Network vebidracor_default  Created
 Container vebidracor-fuseki-1  Creating
 Container vebidracor-metrics-1  Creating
 Container vebidracor-fuseki-1  Created
 Container vebidracor-metrics-1  Created
 Container vebidracor-api-1  Creating
 Container vebidracor-api-1  Created
 Container vebidracor-frontend-1  Creating
 Container vebidracor-frontend-1  Created
 Container vebidracor-metrics-1  Starting
 Container vebidracor-fuseki-1  Starting
 Container vebidracor-metrics-1  Started
 Container vebidracor-fuseki-1  Started
 Container vebidracor-api-1  Starting
 Container vebidracor-api-1  Started
 Container vebidracor-frontend-1  Starting
 Container vebidracor-frontend-1  Started
INFO:root:Started with downloaded docker compose file.
INFO:root:Found dracor/dracor-api container with ID f

True

In [7]:
corpus_metadata = {
    "name" : "vebi", 
    "title": "Very Big Drama Corpus",
    "description": "This corpus assembles plays from several DraCor corpora as a single Very Big Drama Corpus"
}

In [8]:
# The corpus has to be registered explicitly (register=True)
vebidracor.add_corpus(corpus_metadata=corpus_metadata, register=True)

INFO:root:Successfully created corpus vebi. All metadata is available and plays are available.


True

In [10]:
vebidracor.add_files_from_repo(corpusname="vebi", 
                              repository_name="alsdracor", 
                              commit="c87ea41aac9412e4bd84a28e9c7632c53904f77c")

INFO:root:Successfully added all 25 files of repository 'dracor-org/alsdracor' to corpus 'vebi'.


True

In [11]:
vebidracor.add_files_from_repo(corpusname="vebi", 
                              repository_name="bashdracor", 
                              commit="c16b58ef3726a63c431bb9575b682c165c9c0cbd")

INFO:root:Successfully added all 3 files of repository 'dracor-org/bashdracor' to corpus 'vebi'.


True

In [12]:
vebidracor.add_files_from_repo(corpusname="vebi", 
                              repository_name="caldracor", 
                              commit="6cb804d415051d5f18bc4841fa1ce4343a7f0ab5")

INFO:root:Successfully added all 205 files of repository 'dracor-org/caldracor' to corpus 'vebi'.


True

In [13]:
vebidracor.add_files_from_repo(corpusname="vebi", 
                              repository_name="greekdracor", 
                              commit="7397aafa1927c3e0a0720bf3c00bf367ab679f26")

INFO:root:Successfully added all 39 files of repository 'dracor-org/greekdracor' to corpus 'vebi'.


True

In [14]:
vebidracor.add_files_from_repo(corpusname="vebi", 
                              repository_name="hundracor", 
                              commit="57e64454a73ffd984ff5fcc1c9b7bc16f3a169f2")

INFO:root:Successfully added all 41 files of repository 'dracor-org/hundracor' to corpus 'vebi'.


True

In [15]:
vebidracor.add_files_from_repo(corpusname="vebi", 
                              repository_name="itadracor", 
                              commit="10c84b416d25a6cbfbb195b9f82f136e482a7093")

INFO:root:Successfully added all 139 files of repository 'dracor-org/itadracor' to corpus 'vebi'.


True

In [16]:
vebidracor.add_files_from_repo(corpusname="vebi", 
                              repository_name="romdracor", 
                              commit="20644eb44f59649721310c3a6d1fd1fe505653d5")

INFO:root:Successfully added all 36 files of repository 'dracor-org/romdracor' to corpus 'vebi'.


True

In [17]:
vebidracor.add_files_from_repo(corpusname="vebi", 
                              repository_name="rusdracor", 
                              commit="6d5b1e5549731538a48684a456006384da206e9a")

INFO:root:Successfully added all 212 files of repository 'dracor-org/rusdracor' to corpus 'vebi'.


True

In [19]:
vebidracor.add_files_from_repo(corpusname="vebi", 
                              repository_name="spandracor", 
                              commit="184ebf975ad9cd674ff37cab44a181fa7ed8d85f")

INFO:root:Successfully added all 25 files of repository 'dracor-org/spandracor' to corpus 'vebi'.


True

In [20]:
vebidracor.add_files_from_repo(corpusname="vebi", 
                              repository_name="swedracor", 
                              commit="0e73db9315c9c8ed64abff7d2053f84e76fcf7ec")

INFO:root:Successfully added all 68 files of repository 'dracor-org/swedracor' to corpus 'vebi'.


True

In [21]:
vebidracor.add_files_from_repo(corpusname="vebi", 
                              repository_name="tatdracor", 
                              commit="5c71364f39f6533baa3a2e04217fd39e0898c851")

INFO:root:Successfully added all 3 files of repository 'dracor-org/tatdracor' to corpus 'vebi'.


True

In [23]:
vebidracor.add_files_from_repo(corpusname="vebi", 
                              repository_name="gerdracor", 
                              commit="9135bd4598f54133f23df6edfc983b79f1616fb5",
                              exclude=["kraus-die-letzten-tage-der-menschheit"])

INFO:root:Successfully added all 590 files of repository 'dracor-org/gerdracor' to corpus 'vebi'.


True

In [24]:
vebidracor.add_files_from_repo(corpusname="vebi", 
                              repository_name="fredracor", 
                              commit="65e93f6ff632b367cdc7e16e3e390956856c4b98",
                              exclude=["anonyme-vende","arnould-heroine-americaine","audinot-dorothee","becque-mere"])

INFO:root:Successfully added all 1556 files of repository 'dracor-org/fredracor' to corpus 'vebi'.


True

In [25]:
vebidracor.add_files_from_repo(corpusname="vebi",
                               repository_owner="ingoboerner",
                               repository_name="shakedracor",
                               commit="3a420de7d253a505d1d3b8225e6bb6659577d82f")

INFO:root:Successfully added all 37 files of repository 'ingoboerner/shakedracor' to corpus 'vebi'.


True
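
The fourteen `add_files_from_repo` cells above differ only in their arguments, so they could also be driven from a single table. The following is a minimal sketch, not the notebook's actual procedure. The names `SOURCES` and `add_sources` are hypothetical, and the sketch assumes that a falsy return value signals a failed import (the cells above only show `True`). Repository names, commits, exclusions and file counts are copied from the cells and log lines above.

```python
from typing import Any

# (repository_name, commit, files reported in the log, extra keyword arguments)
SOURCES: list[tuple[str, str, int, dict[str, Any]]] = [
    ("alsdracor", "c87ea41aac9412e4bd84a28e9c7632c53904f77c", 25, {}),
    ("bashdracor", "c16b58ef3726a63c431bb9575b682c165c9c0cbd", 3, {}),
    ("caldracor", "6cb804d415051d5f18bc4841fa1ce4343a7f0ab5", 205, {}),
    ("greekdracor", "7397aafa1927c3e0a0720bf3c00bf367ab679f26", 39, {}),
    ("hundracor", "57e64454a73ffd984ff5fcc1c9b7bc16f3a169f2", 41, {}),
    ("itadracor", "10c84b416d25a6cbfbb195b9f82f136e482a7093", 139, {}),
    ("romdracor", "20644eb44f59649721310c3a6d1fd1fe505653d5", 36, {}),
    ("rusdracor", "6d5b1e5549731538a48684a456006384da206e9a", 212, {}),
    ("spandracor", "184ebf975ad9cd674ff37cab44a181fa7ed8d85f", 25, {}),
    ("swedracor", "0e73db9315c9c8ed64abff7d2053f84e76fcf7ec", 68, {}),
    ("tatdracor", "5c71364f39f6533baa3a2e04217fd39e0898c851", 3, {}),
    ("gerdracor", "9135bd4598f54133f23df6edfc983b79f1616fb5", 590,
     {"exclude": ["kraus-die-letzten-tage-der-menschheit"]}),
    ("fredracor", "65e93f6ff632b367cdc7e16e3e390956856c4b98", 1556,
     {"exclude": ["anonyme-vende", "arnould-heroine-americaine",
                  "audinot-dorothee", "becque-mere"]}),
    ("shakedracor", "3a420de7d253a505d1d3b8225e6bb6659577d82f", 37,
     {"repository_owner": "ingoboerner"}),
]


def add_sources(client: Any, corpusname: str, sources=SOURCES) -> list[str]:
    """Call client.add_files_from_repo once per source.

    Returns the names of repositories for which the call returned a falsy
    value (assumed here to mean failure).
    """
    failed = []
    for repository_name, commit, _files, extra in sources:
        ok = client.add_files_from_repo(corpusname=corpusname,
                                        repository_name=repository_name,
                                        commit=commit,
                                        **extra)
        if not ok:
            failed.append(repository_name)
    return failed


# Total number of files reported in the log lines above.
total_files = sum(files for _name, _commit, files, _extra in SOURCES)
print(total_files)
```

With the client from this notebook the call would be `add_sources(vebidracor, "vebi")`. The file counts logged above add up to 2979, so `total_files` can serve as a quick cross-check against the play count the system reports for the `vebi` corpus.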

In [42]:
vebidracor.create_docker_image_of_service(service="api", image_tag="vebidracor_v4_arm64")

INFO:root:Committed container fdac4b7478f5 as dracor/stable-dracor:vebidracor_v4_arm64. Image identifier sha256:02d2b02d6cbfd6d299b8ee7268216f3045686f4c23d13daad2c3c57a82071a69.


In [43]:
vebidracor.create_compose_file()

INFO:root:Stored configuration (docker-compose file) as compose.vebidracor.yml.


In [48]:
# Test whether the labels stored on the image can be transformed back into a manifest.
# When the notebook is re-run, the image ID might be different.
vebidracor.create_manifest(image="02d2b02d6cbf")

{'system': {'description': 'DraCor system containing VeBiDraCor – a very big drama corpus with plays from several DraCor corpora',
  'id': '0e10877d-aa33-4a84-b8d8-43d961d0c40e',
  'name': 'vebidracor',
  'timestamp': '2023-07-04T17:24:46.462276'},
 'services': {'api': {'base-image': 'dracor/stable-dracor:vebidracor_v4',
   'existdb': '6.0.1',
   'image': 'dracor/stable-dracor:vebidracor_v4_arm64',
   'version': '0.90.1-2-g19a3f46-dirty'},
  'frontend': {'image': 'dracor/dracor-frontend:v1.6.0-dirty'},
  'metrics': {'image': 'dracor/dracor-metrics:v1.2.0'},
  'triplestore': {'image': 'dracor/dracor-fuseki:v1.0.0'}},
 'corpora': {'vebi': {'corpusname': 'vebi',
   'num_of_plays': '2979',
   'sources': {'alsdracor': {'commit': 'c87ea41aac9412e4bd84a28e9c7632c53904f77c',
     'num-of-plays': '25',
     'timestamp': '2023-07-04T14:04:43.354536',
     'type': 'repository',
     'url': 'https://github.com/dracor-org/alsdracor'},
    'bashdracor': {'commit': 'c16b58ef3726a63c431bb9575b682c165c