# Generation of ShEx schemas

This example shows how to automatically generate ShEx schemas from an RDF repository. It is based on the [Zeri Photo Archive to Linked Open Data](http://data.fondazionezeri.unibo.it) dataset.

In [1]:
from shexer.shaper import Shaper
from shexer.consts import NT

### We will use the public SPARQL endpoint

In [2]:
url_endpoint = "http://data.fondazionezeri.unibo.it/sparql"

### We define the target classes to mine the RDF dataset

In [3]:
target_classes = [
     "http://www.cidoc-crm.org/cidoc-crm/E39_Actor",
     "http://www.cidoc-crm.org/cidoc-crm/E53_Place",
     "http://www.cidoc-crm.org/cidoc-crm/E65_Creation",
     "http://www.cidoc-crm.org/cidoc-crm/E35_Title",
     "http://www.cidoc-crm.org/cidoc-crm/E22_Man-Made_Object"
]

### We can provide the namespaces and their prefixes

In [4]:
namespaces_dict = {"http://www.w3.org/1999/02/22-rdf-syntax-ns#": "rdf",
                   "http://www.w3.org/2000/01/rdf-schema#": "rdfs", 
                   "http://weso.es/shapes/": "",
                   "http://www.w3.org/2001/XMLSchema#": "xsd",
                   "http://www.cidoc-crm.org/cidoc-crm/": "cidoc-crm",
                   "http://schema.org/": "schema",
                   "http://rdfs.org/ns/void#": "void",
                   "http://creativecommons.org/ns#": "creativeCommons",
                   "http://purl.org/dc/terms/": "dc-terms",
                   "http://xmlns.com/foaf/0.1/": "foaf",
                   "http://purl.org/ontology/bibo/": "bibo",
                   "http://www.w3.org/2004/02/skos/core#": "skos",
                   "http://purl.org/dc/elements/1.1/": "dc",
                   "http://www.w3.org/2008/05/skos-xl#": "skos-xl",
                   "http://data.nobelprize.org/terms/": "nobel"
                   }
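
To illustrate what the namespace dictionary is for, here is a minimal sketch of how a full IRI can be shortened to a prefixed name using a mapping of the same shape (namespace IRI to prefix). The helper `shorten_iri` and the reduced dictionary `example_namespaces` are hypothetical names introduced only for this illustration; they are not part of sheXer, and the library's own prefixing logic may differ.

```python
def shorten_iri(iri, namespaces):
    """Return 'prefix:local' using the longest matching namespace, else '<iri>'."""
    best = None
    for ns in namespaces:
        # Prefer the longest namespace that is a prefix of the IRI
        if iri.startswith(ns) and (best is None or len(ns) > len(best)):
            best = ns
    if best is None:
        # No known namespace: keep the full IRI in angle brackets
        return "<" + iri + ">"
    return namespaces[best] + ":" + iri[len(best):]


# Hypothetical, reduced version of the dictionary defined above
example_namespaces = {
    "http://www.cidoc-crm.org/cidoc-crm/": "cidoc-crm",
    "http://www.w3.org/2000/01/rdf-schema#": "rdfs",
}

print(shorten_iri("http://www.cidoc-crm.org/cidoc-crm/E39_Actor", example_namespaces))
print(shorten_iri("http://purl.org/spar/pro/holdsRoleInTime", example_namespaces))
```

An IRI whose namespace is not in the dictionary stays in its full form, which is consistent with a property such as `<http://purl.org/spar/pro/holdsRoleInTime>` appearing unabbreviated in a generated schema.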

### We use the Shaper class 

As an example, we use 5 instances from the dataset (parameter `limit_remote_instances`) to generate the ShEx schemas.

In [5]:
shaper = Shaper(target_classes=target_classes,
                url_endpoint=url_endpoint, 
                input_format=NT,
                limit_remote_instances=5,
                namespaces_dict=namespaces_dict,  # Default: no prefixes
                instantiation_property="http://www.w3.org/1999/02/22-rdf-syntax-ns#type")  # Default rdf:type

output_file = "shex_zeri_actor.shex"

shaper.shex_graph(output_file=output_file,
                  acceptance_threshold=0.8)

print("Done!")

Done!
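
The `acceptance_threshold=0.8` argument used above can be pictured with a small sketch. It assumes that the threshold is the minimum ratio of sampled instances that must conform to a candidate constraint for it to be kept in a shape; this reading is an assumption made for illustration, and `passes_threshold` is a hypothetical helper, not a sheXer function.

```python
def passes_threshold(conforming, total, threshold):
    """Return True if the ratio of conforming instances reaches the threshold."""
    if total <= 0:
        return False
    return conforming / total >= threshold


# With 5 sampled instances and a threshold of 0.8 (assumed semantics):
print(passes_threshold(5, 5, 0.8))  # all instances conform
print(passes_threshold(4, 5, 0.8))  # 4 of 5 conform, ratio 0.8
print(passes_threshold(3, 5, 0.8))  # 3 of 5 conform, ratio 0.6
```

Under this assumption, with 5 instances per run a constraint needs at least 4 conforming instances to be accepted.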


### As a result we obtain a ShEx schema based on the information provided by the dataset

```
weso-s:E39_Actor
{
   rdf:type  [cidoc-crm:E39_Actor]  ;                          # 100.0 %
   rdfs:label  rdf:langString  +;                              # 100.0 %
            # 80.0 % obj: rdf:langString. Cardinality: {1}
   <http://purl.org/spar/pro/holdsRoleInTime>  IRI  *;
            # 80.0 % obj: IRI. Cardinality: +
   cidoc-crm:P76_has_contact_point  IRI  *
            # 80.0 % obj: IRI. Cardinality: +
}


weso-s:E53_Place
{
   rdf:type  [cidoc-crm:E53_Place]  ;                          # 100.0 %
   rdfs:label  rdf:langString  +;                              # 100.0 %
            # 80.0 % obj: rdf:langString. Cardinality: {2}
   cidoc-crm:P89_falls_within  @weso-s:E53_Place  ?
            # 80.0 % obj: @weso-s:E53_Place. Cardinality: {1}
}


weso-s:E65_Creation
{
   rdf:type  [cidoc-crm:E65_Creation]  ;                       # 100.0 %
   rdfs:label  rdf:langString  {2};                            # 100.0 %
   cidoc-crm:P14_carried_out_by  IRI  ;                        # 100.0 %
   cidoc-crm:P4_has_time_span  IRI  ?
            # 80.0 % obj: IRI. Cardinality: {1}
}


weso-s:E35_Title
{
   rdf:type  [cidoc-crm:E35_Title]  ;                          # 100.0 %
   rdfs:label  rdf:langString  *
            # 80.0 % obj: rdf:langString. Cardinality: {2}
}


weso-s:E22_Man-Made_Object
{
   rdf:type  [cidoc-crm:E22_Man-Made_Object]  ;                # 100.0 %
   rdfs:label  rdf:langString  {2}                             # 100.0 %
}
```
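
As a quick sanity check on a generated file, the shape labels can be listed and compared against the target classes. The following is a minimal sketch that assumes the layout shown above (a label on its own line, followed by a line starting with `{`); `shape_labels` is a hypothetical helper and `sample` is a shortened copy of the output, used here in place of reading `shex_zeri_actor.shex`.

```python
import re


def shape_labels(shex_text):
    """Return shape labels: a non-indented token followed by a line starting with '{'."""
    return re.findall(r"^(\S+)[ \t]*\n\{", shex_text, flags=re.MULTILINE)


# Shortened copy of the generated schema, used as illustrative input
sample = """weso-s:E35_Title
{
   rdf:type  [cidoc-crm:E35_Title]  ;
   rdfs:label  rdf:langString  *
}


weso-s:E22_Man-Made_Object
{
   rdf:type  [cidoc-crm:E22_Man-Made_Object]  ;
   rdfs:label  rdf:langString  {2}
}
"""

print(shape_labels(sample))
```

This is a plain-text check only; it does not parse or validate ShEx syntax.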
