In [85]:
import json
import pandas as pd

projects = [
    #'DigiBatMat',
    #'DIGITRUBBER',
    'DiProMag',
    #'DiStAl',
    'GlasDigital',
    #'iBain',
    #'KNOW-NOW',
    'KupferDigital',
    'LeBeDigital',
    'ODE_AM',
    'SensoTwin',
    'SmaDi'
]

data = {}

for ont in projects:
    with open(f'{ont}/{ont}.json', 'r', encoding='utf-8') as f:
        data[ont] = json.load(f)

## Used Top-Level-Ontologies
For each of the provided ontologies, the use of top-level ontologies (TLOs) was analyzed by counting `rdfs:subClassOf` and `rdfs:subPropertyOf` chains whose subject belongs to the project's namespace and whose object belongs to the TLO's namespace. For example, the SPARQL query for the usage of the PMD Core Ontology (v2.0.x) in the SensoTwin project reads:
```sparql
SELECT (COUNT(*) as ?subcount)
WHERE {
    ?ao rdfs:subClassOf+|rdfs:subPropertyOf+ ?tlo .
    FILTER( STRSTARTS( STR(?tlo), "https://w3id.org/pmd/co" ) ) .
    FILTER( STRSTARTS( STR(?ao), "http://w3id.org/sensotwin/applicationontology" ) ) .
}
```
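
To make explicit what this count measures: as far as the property-path semantics go, `COUNT(*)` here counts (project term, TLO term) pairs, so a project class that sits below several TLO classes contributes once per TLO ancestor. This would also explain why a TLO count can exceed the number of classes a project defines. The pure-Python sketch below mimics that behaviour on a toy hierarchy; the `ex:` and `tlo:` names and the helper functions are hypothetical and not taken from any project ontology.

```python
from collections import defaultdict

def ancestors(term, parents):
    """Return all terms reachable from `term` via one or more parent links."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen

def count_tlo_pairs(edges, project_ns, tlo_ns):
    """Count (project term, TLO ancestor) pairs, mimicking COUNT(*) over subClassOf+."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)
    return sum(
        1
        for term in list(parents)
        if term.startswith(project_ns)
        for anc in ancestors(term, parents)
        if anc.startswith(tlo_ns)
    )

# Hypothetical toy hierarchy: ex:B < ex:A < tlo:Y < tlo:X
edges = [
    ("ex:B", "ex:A"),
    ("ex:A", "tlo:Y"),
    ("tlo:Y", "tlo:X"),
]
pair_count = count_tlo_pairs(edges, "ex:", "tlo:")
print(pair_count)
```

In this toy case both project classes have two TLO ancestors each, so the sketch reports four pairs although only two project classes exist.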

In [86]:
tlos = {ont: item['tlos']['original'] for ont, item in data.items()}
pd.DataFrame(tlos).T

Project,pmdco-2.0.7,pmdco-v0.1-beta,emmo,cco,obo
DiProMag,123,0,0,0,0
GlasDigital,0,282,0,0,0
KupferDigital,577,0,0,0,0
LeBeDigital,112,0,0,0,0
ODE_AM,0,0,0,181,32
SensoTwin,242,0,0,0,0
SmaDi,0,0,0,0,0


## Overall defined concepts
The overall number of introduced concepts was analysed. For that, the project's ontology and the applicable PMD Core Ontology (pmdco) version were loaded into Protégé and a reasoner was run. On the resulting graph, the following query was executed (shown here for `owl:Class` definitions in SensoTwin):

```sparql
SELECT (COUNT(*) as ?classcount)
WHERE {
    ?class a owl:Class .
    FILTER STRSTARTS( STR(?class), "http://w3id.org/sensotwin/applicationontology" ) .
}
```
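
The query above is shown for `owl:Class` only; the table also reports `owl:ObjectProperty` and `owl:DatatypeProperty`, presumably counted with the same pattern. As a minimal sketch under that assumption, the helper below generates the query text for each concept type. The function name `build_count_query` and the variable names are hypothetical; `STR(?term)` is used because `STRSTARTS` expects string arguments rather than IRIs.

```python
CONCEPT_TYPES = ("owl:Class", "owl:ObjectProperty", "owl:DatatypeProperty")

def build_count_query(concept_type, namespace):
    """Return a SPARQL query counting terms of `concept_type` in `namespace`."""
    if concept_type not in CONCEPT_TYPES:
        raise ValueError(f"unsupported concept type: {concept_type}")
    return (
        "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n"
        "SELECT (COUNT(*) AS ?count)\n"
        "WHERE {\n"
        f"    ?term a {concept_type} .\n"
        f'    FILTER STRSTARTS( STR(?term), "{namespace}" ) .\n'
        "}"
    )

# Namespace taken from the SensoTwin example above.
namespace = "http://w3id.org/sensotwin/applicationontology"
queries = {ctype: build_count_query(ctype, namespace) for ctype in CONCEPT_TYPES}
print(queries["owl:ObjectProperty"])
```

Restricting `concept_type` to a fixed tuple keeps arbitrary text from being spliced into the query string.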

The table below shows the number of definitions found per project, together with the reasoner used.

In [87]:
concepts = {ont: {
    'owl:Class': item['definitioncounts']['owl:Class'],
    'owl:ObjectProperty': item['definitioncounts']['owl:ObjectProperty'],
    'owl:DatatypeProperty': item['definitioncounts']['owl:DatatypeProperty'],
    'Total': item['definitioncounts']['owl:Class']+item['definitioncounts']['owl:ObjectProperty']+item['definitioncounts']['owl:DatatypeProperty'],
    'Reasoner': f"{item['reasoner']['reasoner']}-{item['reasoner']['version']}"
} for ont, item in data.items()}
pd.DataFrame(concepts).T

Project,owl:Class,owl:ObjectProperty,owl:DatatypeProperty,Total,Reasoner
DiProMag,217,3,2,222,elk-0.5.0
GlasDigital,213,10,33,256,pellet-2.2.0
KupferDigital,293,0,0,293,pellet-2.2.0
LeBeDigital,114,0,0,114,pellet-2.2.0
ODE_AM,256,12,3,271,pellet-2.2.0
SensoTwin,193,18,12,223,pellet-2.2.0
SmaDi,105,12,8,125,pellet-2.2.0


## Number of ProcessingNodes, ValueObjects (pmdco-2.0.x) and ProcessNodes (pmdco-v0.1-beta)
To get an overview of how the PMD Core Ontology is used, the number of subclasses of ProcessingNode and ValueObject (ProcessNode for pmdco-v0.1-beta) was determined. For that, the project's ontology and the applicable pmdco version were loaded into Protégé and a reasoner was run. On the resulting graph, the following query was executed (shown here for subclasses of ProcessingNode in SensoTwin):

```sparql
SELECT ?classname
WHERE {
    ?x rdfs:subClassOf+ <https://w3id.org/pmd/co/ProcessingNode> .
    BIND(STR(?x) AS ?classname) .
    FILTER STRSTARTS( ?classname, "http://w3id.org/sensotwin/applicationontology" ) .
}
```

The table below shows the number of subclasses found per project, together with the reasoner used.

In [88]:
pmdusage = {ont: {
    'ProcessingNode (2.0.x)': item['processingnodes']['pmdco-2.0.7']['count'],
    'ValueObject (2.0.x)': item['valueobjects']['pmdco-2.0.7']['count'],
    'ProcessNode (v0.1-beta)': item['processingnodes']['pmdco-v0.1-beta']['count'],
    'Total': item['processingnodes']['pmdco-2.0.7']['count']+item['valueobjects']['pmdco-2.0.7']['count']+item['processingnodes']['pmdco-v0.1-beta']['count'],
    'Reasoner': f"{item['reasoner']['reasoner']}-{item['reasoner']['version']}"
} for ont, item in data.items()}
pd.DataFrame(pmdusage).T

Project,ProcessingNode (2.0.x),ValueObject (2.0.x),ProcessNode (v0.1-beta),Total,Reasoner
DiProMag,21,55,0,76,elk-0.5.0
GlasDigital,0,0,3,3,pellet-2.2.0
KupferDigital,28,196,0,224,pellet-2.2.0
LeBeDigital,9,42,0,51,pellet-2.2.0
ODE_AM,0,0,0,0,pellet-2.2.0
SensoTwin,140,82,0,222,pellet-2.2.0
SmaDi,0,0,0,0,pellet-2.2.0


## Used Licenses
The following table summarizes the licenses referenced by each ontology. The SPARQL query used to find this information reads:
```sparql
SELECT ?lic
WHERE {
    ?x <http://purl.org/dc/terms/license>|<http://purl.org/dc/elements/1.1/license> ?lic .
}
```

In [89]:
def license_cleanup(license):
    replacements = [
        ('https://creativecommons.org/licenses/by/4.0', 'CC-BY-4.0'),
        ('http://creativecommons.org/licenses/by/4.0', 'CC-BY-4.0'),
    ]
    license = license.replace('<', '').replace('>', '')
    for old, new in replacements:
        if license.startswith(old):
            return new
    return license

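# Normalize first, then deduplicate, so http/https variants collapse into one entry;
# sorting keeps the output stable across runs.
licenses = {ont: {'used_licenses': ', '.join(sorted({license_cleanup(lic) for lic in item['license']['items']}))} for ont, item in data.items()}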
pd.DataFrame(licenses).T

Project,used_licenses
DiProMag,CC-BY-4.0
GlasDigital,CC-BY-4.0
KupferDigital,
LeBeDigital,CC-BY-4.0
ODE_AM,CC-BY-4.0
SensoTwin,CC-BY-4.0
SmaDi,CC-BY-4.0


## Contributors

In [90]:
import re
import rdflib
from IPython.display import display, HTML

def pp(df):
    return display(HTML(df.to_html().replace('\\n', '<br>')))

def orcid_resolve(string):
    # The last character of an ORCID iD is a checksum and may be 'X'.
    m = re.match(r"<?(https://orcid\.org/(\d{4}-\d{4}-\d{4}-\d{3}[\dX]))>?", string)
    if m:
        orcid = m.group(1)
        # Fetch the RDF description of the ORCID record (requires network access).
        g = rdflib.Graph()
        g.parse(orcid)
        names = []
        # Collect given name(s) first, then family name(s).
        names.extend(str(row.gname) for row in g.query(
            f"""
                SELECT ?gname WHERE {{
                    <{orcid}> <http://xmlns.com/foaf/0.1/givenName> ?gname .
                }}
            """
        ))
        names.extend(str(row.fname) for row in g.query(
            f"""
                SELECT ?fname WHERE {{
                    <{orcid}> <http://xmlns.com/foaf/0.1/familyName> ?fname .
                }}
            """
        ))
        name = ' '.join(names)
        return f'{orcid} -> {name}'
    return string

contributors = {ont: {'creators_contributors': '\n'.join(map(orcid_resolve, set(item['creators_contributors']['items'])))} for ont, item in data.items()}
df = pd.DataFrame(contributors).T
pp(df)

Project,creators_contributors
DiProMag,Lennart Schwan Tapas Samanta Moritz Blum Christian Schröder Simon Bekemeier Alisa Chirkova Basil Ell Michael Feige Luana Caron Martin Wortmann Günter Reiss Sonja Schöning Philipp Cimiano Thomas Hilbig Inga Ennen Andreas Hütten
GlasDigital,Ya-Fan Chen (https://orcid.org/0000-0003-4295-7815) Simon Stier (https://orcid.org/0000-0003-0410-3616)
KupferDigital,Hossein Beygi Nasrabadi (www.orcid.org/0000-0002-3092-0532)
LeBeDigital,"https://orcid.org/0000-0003-0626-5002 -> Stephan Pirskawetz https://orcid.org/0009-0006-4524-9143 -> Melissa Telong https://orcid.org/0000-0003-2445-6734 -> Birgit Meng Mattheo Krüger, Melissa Telong Donfack, Aida Zoriyatkha, Birgit Meng, Stephan Pirskawetz https://orcid.org/0009-0004-9700-2439 -> Aida Zoriyatkha https://orcid.org/0009-0003-7121-0283 -> Mattheo Krüger"
ODE_AM,"Mohamed Kamal, Jan Reimann Thomas Bjarsch Mohamed Kamal, Heiko Beinersdorf"
SensoTwin,https://orcid.org/0009-0004-1208-3971 -> Ursula Pähler
SmaDi,https://orcid.org/0000-0003-1017-8921 -> Mena Leemhuis
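
The output above shows that some entries (for example those of GlasDigital and KupferDigital) carry an ORCID iD inside free text, or with a `www.` prefix, and are therefore not resolved: `re.match` only matches at the start of the string. As a minimal sketch of one possible way to handle this, the hypothetical helper `extract_orcid` below searches anywhere in the entry and returns a canonical `https://orcid.org/...` URL. It does not query ORCID itself; the result could be passed on to a resolver such as `orcid_resolve`.

```python
import re

# Matches an ORCID iD with or without scheme and "www." prefix; the final
# character may be the checksum letter X.
ORCID_PATTERN = re.compile(
    r"(?:https?://)?(?:www\.)?orcid\.org/(\d{4}-\d{4}-\d{4}-\d{3}[\dX])"
)

def extract_orcid(entry):
    """Return the canonical ORCID URL found anywhere in `entry`, or None."""
    match = ORCID_PATTERN.search(entry)
    if match is None:
        return None
    return f"https://orcid.org/{match.group(1)}"

# Entry formats as they appear in the contributor table above, plus the
# angle-bracketed form that the regular expression in the cell anticipates.
entries = [
    "Ya-Fan Chen (https://orcid.org/0000-0003-4295-7815)",
    "Hossein Beygi Nasrabadi (www.orcid.org/0000-0002-3092-0532)",
    "<https://orcid.org/0009-0004-1208-3971>",
    "Mohamed Kamal, Jan Reimann",
]
resolved = [extract_orcid(entry) for entry in entries]
for entry, url in zip(entries, resolved):
    print(f"{entry!r} -> {url}")
```

Entries without an ORCID iD yield `None`, so the caller can fall back to the original string as the notebook cell does.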
