# Basic demo

This notebook assumes use on a local machine, where *both* the Postgres and Typesense containers have been started (using `scripts/start_postgres_container.py` and `scripts/start_typesense_container.py`).

Each start script prints instructions on which environment variables to set. The output should include something like:

```console
export PyST_typesense_url="http://localhost:56974"
export PyST_typesense_api_key="123abc"
export PyST_db_backend="postgres"
export PyST_db_user="test"
export PyST_db_pass="test"
export PyST_db_host="localhost"
export PyST_db_port="56967"
export PyST_db_name="test"
```

You **also need** to set the auth token. This notebook uses "abc123".

```console
export PyST_auth_token="abc123"
```
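
Before starting the server, it can help to confirm that these variables are actually visible to a Python process. The sketch below is a hypothetical helper, not part of the project. It assumes the variable names shown above are the full set your setup needs, and it only inspects the process it runs in, so the server must be started from a shell with the same exports.

```python
import os

# Names copied from the container start-up output and the auth token step above.
REQUIRED_SETTINGS = (
    "PyST_typesense_url",
    "PyST_typesense_api_key",
    "PyST_db_backend",
    "PyST_db_user",
    "PyST_db_pass",
    "PyST_db_host",
    "PyST_db_port",
    "PyST_db_name",
    "PyST_auth_token",
)


def missing_settings(env=None):
    """Return the required variable names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


missing = missing_settings()
if missing:
    print("Not set in this process:", ", ".join(missing))
else:
    print("All listed PyST settings are present")
```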

You can then run the server with:

```console
cd <repo-dir>
python src/py_semantic_taxonomy/app.py
```

## Setup

This notebook uses the REST server API, so it only needs an HTTP client library.

In [None]:
import httpx

Create a `Client` instance with the parameters that won't change between requests:

In [None]:
client = httpx.Client(
    base_url="http://localhost:8000",
    headers={"X-PyST-Auth-Token": "abc123"},
)

## Concept scheme

We need to start with a concept scheme, because every concept must reference a known concept scheme.

In [None]:
scheme = {
  "@id": "http://data.europa.eu/xsp/cn2024/cn2024",
  "@type": [
    "http://www.w3.org/2004/02/skos/core#ConceptScheme"
  ],
  "http://purl.org/dc/terms/created": [
    {
      "@type": "http://www.w3.org/2001/XMLSchema#dateTime",
      "@value": "2023-10-11T13:59:56"
    }
  ],
  "http://purl.org/dc/terms/creator": [
    {
      "@id": "http://publications.europa.eu/resource/authority/corporate-body/ESTAT"
    },
    {
      "@id": "http://publications.europa.eu/resource/authority/corporate-body/TAXUD"
    }
  ],
  "http://purl.org/ontology/bibo/status": [
    {
      "@id": "http://purl.org/ontology/bibo/status/accepted"
    }
  ],
  "http://rdf-vocabulary.ddialliance.org/xkos#follows": [
    {
      "@id": "http://data.europa.eu/xsp/cn2023/cn2023"
    }
  ],
  "http://schema.org/endDate": [
    {
      "@type": "http://www.w3.org/2001/XMLSchema#date",
      "@value": "2024-12-31"
    }
  ],
  "http://schema.org/startDate": [
    {
      "@type": "http://www.w3.org/2001/XMLSchema#date",
      "@value": "2024-01-01"
    }
  ],
  "http://www.w3.org/2002/07/owl#versionInfo": [
    {
      "@value": "2024"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#definition": [
    {
      "@language": "en",
      "@value": "The main classification for the European ITGS (International trade in goods statistics)  is the Combined Nomenclature (CN). This is the primary nomenclature as it is the one used by the EU Member States to collect detailed data on their trading of goods since 1988. Before the introduction of the CN, ITGS were based on a product classification called NIMEXE.  The CN is based on the Harmonised Commodity Description and Coding System (managed by the World Customs Organisation (WCO). The Harmonised System (HS) is an international classification at two, four and six-digit level which classifies goods according to their nature. It was introduced in 1988 and, since then, was revised six times: in 1996, 2002, 2007, 2012, 2017 and 2022. The CN corresponds to the HS plus a further breakdown at eight-digit level defined to meet EU needs. The CN is revised annually and, as a Council Regulation, is binding on the Member States. "
    }
  ],
  "http://www.w3.org/2004/02/skos/core#notation": [
    {
      "@value": "CN 2024"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#prefLabel": [
    {
      "@language": "da",
      "@value": "Den Kombinerede Nomenklatur, 2024 (KN 2024)"
    },
    {
      "@language": "de",
      "@value": "Kombinierte Nomenklatur, 2024 (KN 2024)"
    },
    {
      "@language": "en",
      "@value": "Combined Nomenclature, 2024 (CN 2024)"
    },
    {
      "@language": "es",
      "@value": "Nomenclatura combinada, 2024 (NC 2024)"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#scopeNote": [
    {
      "@type": "http://www.w3.org/2001/XMLSchema#anyURI",
      "@value": "http://publications.europa.eu/resource/oj/JOC_2019_119_R_0001"
    }
  ]
}

In [None]:
response = client.post("/concept_scheme/", json=scheme)
response.status_code
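
The cell above only displays the raw status code. As a minimal sketch, the hypothetical helpers below turn a code into a readable summary and flag anything outside the 2xx range. They use only the standard library and make no assumption about which specific code the server returns for a successful create.

```python
from http import HTTPStatus


def describe_status(status_code):
    """Return a readable summary such as '201 Created' for an HTTP status code."""
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        # Codes that are not registered in the standard library enum.
        phrase = "Unknown status"
    return f"{status_code} {phrase}"


def is_success(status_code):
    """True for any 2xx status code."""
    return 200 <= status_code < 300


print(describe_status(201), is_success(201))
print(describe_status(409), is_success(409))
```

In the notebook this could be called as `describe_status(response.status_code)`. If you would rather fail loudly, `response.raise_for_status()` from httpx raises an exception on 4xx and 5xx responses.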

## Concept

Now for our first concept. We start with a concept at the top of the hierarchy:

In [None]:
concept_top = {
  "@id": "http://data.europa.eu/xsp/cn2024/280011000090",
  "@type": [
    "http://www.w3.org/2004/02/skos/core#Concept"
  ],
  "http://purl.org/dc/elements/1.1/identifier": [
    {
      "@value": "280011000090"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#altLabel": [
    {
      "@language": "da",
      "@value": "AFSNIT VI - PRODUKTER FRA KEMISKE OG NÆRSTÅENDE INDUSTRIER"
    },
    {
      "@language": "de",
      "@value": "ABSCHNITT VI - ERZEUGNISSE DER CHEMISCHEN INDUSTRIE UND VERWANDTER INDUSTRIEN"
    },
    {
      "@language": "en",
      "@value": "SECTION VI - PRODUCTS OF THE CHEMICAL OR ALLIED INDUSTRIES"
    },
    {
      "@language": "es",
      "@value": "SECCIÓN VI - PRODUCTOS DE LAS INDUSTRIAS QUÍMICAS O DE LAS INDUSTRIAS CONEXAS"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#inScheme": [
    {
      "@id": "http://data.europa.eu/xsp/cn2024/cn2024"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#notation": [
    {
      "@value": "VI"
    }
  ],
  "http://purl.org/ontology/bibo/status": [
    {
      "@id": "http://purl.org/ontology/bibo/status/accepted"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#prefLabel": [
    {
      "@language": "da",
      "@value": "AFSNIT VI - PRODUKTER FRA KEMISKE OG NÆRSTÅENDE INDUSTRIER"
    },
    {
      "@language": "de",
      "@value": "ABSCHNITT VI - ERZEUGNISSE DER CHEMISCHEN INDUSTRIE UND VERWANDTER INDUSTRIEN"
    },
    {
      "@language": "en",
      "@value": "SECTION VI - PRODUCTS OF THE CHEMICAL OR ALLIED INDUSTRIES"
    },
    {
      "@language": "es",
      "@value": "SECCIÓN VI - PRODUCTOS DE LAS INDUSTRIAS QUÍMICAS O DE LAS INDUSTRIAS CONEXAS"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#scopeNote": [
    {
      "@language": "de",
      "@value": "ERZEUGNISSE DER CHEMISCHEN INDUSTRIE UND VERWANDTER INDUSTRIEN"
    },
    {
      "@language": "en",
      "@value": "PRODUCTS OF THE CHEMICAL OR ALLIED INDUSTRIES"
    },
    {
      "@language": "fr",
      "@value": "PRODUITS DES INDUSTRIES CHIMIQUES OU DES INDUSTRIES CONNEXES"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#topConceptOf": [
    {
      "@id": "http://data.europa.eu/xsp/cn2024/cn2024"
    }
  ]
}

In [None]:
response = client.post("/concept/", json=concept_top)
response.status_code
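
Because concepts must reference a known concept scheme, a quick local check before posting can catch a mistyped scheme IRI. The function names below are illustrative and not part of the server API. The sketch assumes the expanded JSON-LD layout used in this notebook, where `skos:inScheme` holds a list of `{"@id": ...}` objects.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"


def scheme_iris(concept):
    """Collect the IRIs listed under skos:inScheme in an expanded JSON-LD concept."""
    return [ref["@id"] for ref in concept.get(SKOS + "inScheme", [])]


def references_scheme(concept, scheme_iri):
    """True if the concept lists `scheme_iri` under skos:inScheme."""
    return scheme_iri in scheme_iris(concept)


# Trimmed copy of the top-level concept, keeping only the keys this check reads.
example = {
    "@id": "http://data.europa.eu/xsp/cn2024/280011000090",
    SKOS + "inScheme": [{"@id": "http://data.europa.eu/xsp/cn2024/cn2024"}],
}
print(references_scheme(example, "http://data.europa.eu/xsp/cn2024/cn2024"))
```

With the full objects from this notebook, the equivalent call would be `references_scheme(concept_top, scheme["@id"])`.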

In [None]:
concept_mid = {
  "@id": "http://data.europa.eu/xsp/cn2024/370021000090",
  "@type": [
    "http://www.w3.org/2004/02/skos/core#Concept"
  ],
  "http://purl.org/dc/elements/1.1/identifier": [
    {
      "@value": "370021000090"
    }
  ],
  "http://purl.org/ontology/bibo/status": [
    {
      "@id": "http://purl.org/ontology/bibo/status/accepted"
    }
  ],
  "http://rdf-vocabulary.ddialliance.org/xkos#depth": [
    {
      "@type": "http://www.w3.org/2001/XMLSchema#positiveInteger",
      "@value": "2"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#broader": [
    {
      "@id": "http://data.europa.eu/xsp/cn2024/280011000090"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#inScheme": [
    {
      "@id": "http://data.europa.eu/xsp/cn2024/cn2024"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#notation": [
    {
      "@value": "37"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#prefLabel": [
    {
      "@language": "da",
      "@value": "KAPITEL 37 - FOTOGRAFISKE OG KINEMATOGRAFISKE ARTIKLER"
    },
    {
      "@language": "de",
      "@value": "KAPITEL 37 - ERZEUGNISSE ZU FOTOGRAFISCHEN ODER KINEMATOGRAFISCHEN ZWECKEN"
    },
    {
      "@language": "en",
      "@value": "CHAPTER 37 - PHOTOGRAPHIC OR CINEMATOGRAPHIC GOODS"
    },
    {
      "@language": "es",
      "@value": "CAPÍTULO 37 - PRODUCTOS FOTOGRÁFICOS O CINEMATOGRÁFICOS"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#scopeNote": [
    {
      "@language": "da",
      "@value": "Kapitel 37, FOTOGRAFISKE OG KINEMATOGRAFISKE ARTIKLER"
    },
    {
      "@language": "de",
      "@value": "ERZEUGNISSE ZU FOTOGRAFISCHEN ODER KINEMATOGRAFISCHEN ZWECKEN"
    },
    {
      "@language": "en",
      "@value": "PHOTOGRAPHIC OR CINEMATOGRAPHIC GOODS"
    },
    {
      "@language": "es",
      "@value": "PRODUCTOS FOTOGRÁFICOS O CINEMATOGRÁFICOS"
    }
  ]
}

In [None]:
response = client.post("/concept/", json=concept_mid)
response.status_code
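
The concept documents carry one `skos:prefLabel` entry per language. A small, hypothetical helper for picking out a single language is sketched below. It assumes the expanded JSON-LD layout shown above, where each label is a dictionary with `@language` and `@value` keys.

```python
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"


def pref_label(resource, language, default=None):
    """Return the first skos:prefLabel value in `language`, or `default` if none exists."""
    for entry in resource.get(SKOS_PREF_LABEL, []):
        if entry.get("@language") == language:
            return entry["@value"]
    return default


# Trimmed copy of the chapter 37 concept, keeping two of its labels.
chapter = {
    "@id": "http://data.europa.eu/xsp/cn2024/370021000090",
    SKOS_PREF_LABEL: [
        {"@language": "en", "@value": "CHAPTER 37 - PHOTOGRAPHIC OR CINEMATOGRAPHIC GOODS"},
        {"@language": "es", "@value": "CAPÍTULO 37 - PRODUCTOS FOTOGRÁFICOS O CINEMATOGRÁFICOS"},
    ],
}
print(pref_label(chapter, "en"))
print(pref_label(chapter, "fr", default="(no French label)"))
```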

## Relationships

With two concepts, we can already query for hierarchical relationships:

In [None]:
response = client.get("/relationships/", params={"iri": "http://data.europa.eu/xsp/cn2024/370021000090"})
response.json()

We can also go down the hierarchy:

In [None]:
client.get("/concept/", params={"iri": "http://data.europa.eu/xsp/cn2024/280011000090"}).json()

In [None]:
response = client.get("/relationships/", params={"iri": "http://data.europa.eu/xsp/cn2024/280011000090", "target": "1"})
response.json()
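
The hierarchy in these documents is expressed through `skos:broader` links from the narrower concept to its parent. To make that direction explicit, here is a local sketch that inverts those links into a parent-to-children index. It is an illustration of the data model only, not the server's implementation, and `narrower_index` is a hypothetical name.

```python
from collections import defaultdict

SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"


def narrower_index(concepts):
    """Map each broader concept IRI to the IRIs of the concepts that point at it."""
    index = defaultdict(list)
    for concept in concepts:
        for parent in concept.get(SKOS_BROADER, []):
            index[parent["@id"]].append(concept["@id"])
    return dict(index)


# Trimmed copies of the two concepts posted so far.
top = {"@id": "http://data.europa.eu/xsp/cn2024/280011000090"}
mid = {
    "@id": "http://data.europa.eu/xsp/cn2024/370021000090",
    SKOS_BROADER: [{"@id": "http://data.europa.eu/xsp/cn2024/280011000090"}],
}
for parent, children in narrower_index([top, mid]).items():
    print(parent, "->", children)
```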

Let's add one more concept:

In [None]:
concept_low = {
  "@id": "http://data.europa.eu/xsp/cn2024/370400000080",
  "@type": [
    "http://www.w3.org/2004/02/skos/core#Concept"
  ],
  "http://purl.org/dc/elements/1.1/identifier": [
    {
      "@value": "370400000080"
    }
  ],
  "http://purl.org/ontology/bibo/status": [
    {
      "@id": "http://purl.org/ontology/bibo/status/accepted"
    }
  ],
  "http://rdf-vocabulary.ddialliance.org/xkos#depth": [
    {
      "@type": "http://www.w3.org/2001/XMLSchema#positiveInteger",
      "@value": "3"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#broader": [
    {
      "@id": "http://data.europa.eu/xsp/cn2024/370021000090"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#inScheme": [
    {
      "@id": "http://data.europa.eu/xsp/cn2024/cn2024"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#notation": [
    {
      "@value": "3704 00"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#prefLabel": [
    {
      "@language": "da",
      "@value": "Fotografiske plader, film, papir, pap og tekstilstof, eksponerede, men ikke fremkaldte"
    },
    {
      "@language": "de",
      "@value": "Fotografische Platten, Filme, Papiere, Pappen und Spinnstoffwaren, belichtet, jedoch nicht entwickelt"
    },
    {
      "@language": "en",
      "@value": "Photographic plates, film, paper, paperboard and textiles, exposed but not developed"
    },
    {
      "@language": "es",
      "@value": "Placas, películas, papel, cartón y textiles, fotográficos, impresionados pero sin revelar"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#scopeNote": [
    {
      "@language": "da",
      "@value": "Fotografiske plader, film, papir, pap og tekstilstof, eksponerede, men ikke fremkaldte"
    },
    {
      "@language": "de",
      "@value": "Platten, Filme, Papiere, Pappen und Spinnstoffwaren, fotografisch, belichtet, jedoch unentwickelt"
    },
    {
      "@language": "en",
      "@value": "Photographic plates, film, paper, paperboard and textiles, exposed but not developed"
    },
    {
      "@language": "es",
      "@value": "Placas, películas, papel, cartón y textiles, fotográficos, impresionados pero sin revelar"
    }
  ]
}

In [None]:
response = client.post("/concept/", json=concept_low)
response.status_code

## Search

We can already search as well. Semantic search is enabled by default, so a query doesn't have to be an exact (sub)string match:

In [None]:
response = client.get("/concept/search/", params={"query": "kodak", "language": "es"})
response.json()

We can also ask for suggestions based on prefixes. Because this looks at prefixes, it doesn't use vector search:

In [None]:
response = client.get("/concept/suggest/", params={"query": "pape", "language": "es"})
response.json()
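
To illustrate the difference from semantic search, the sketch below does plain prefix matching over a few labels in pure Python. This is only a toy model of the idea; it is not how Typesense or the server implements suggestions, and `suggest` here is a hypothetical local function rather than the `/concept/suggest/` endpoint.

```python
def suggest(labels, prefix):
    """Return labels in which at least one word starts with `prefix` (case-insensitive)."""
    needle = prefix.casefold()
    return [
        label
        for label in labels
        if any(word.casefold().startswith(needle) for word in label.split())
    ]


# Spanish prefLabels of the three concepts added so far.
labels_es = [
    "SECCIÓN VI - PRODUCTOS DE LAS INDUSTRIAS QUÍMICAS O DE LAS INDUSTRIAS CONEXAS",
    "CAPÍTULO 37 - PRODUCTOS FOTOGRÁFICOS O CINEMATOGRÁFICOS",
    "Placas, películas, papel, cartón y textiles, fotográficos, impresionados pero sin revelar",
]
print(suggest(labels_es, "pape"))
```

A query such as "kodak" returns nothing here because no word starts with it, which is exactly the gap that semantic search is meant to cover.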

## Correspondence

We can describe correspondences to concepts which are in our system, but correspondences can also reference concepts the system doesn't know about. Let's add one more concept:

In [None]:
concept_film = {
  "@id": "http://data.europa.eu/xsp/cn2024/370400100080",
  "@type": [
    "http://www.w3.org/2004/02/skos/core#Concept"
  ],
  "http://purl.org/dc/elements/1.1/identifier": [
    {
      "@value": "370400100080"
    }
  ],
  "http://purl.org/ontology/bibo/status": [
    {
      "@id": "http://purl.org/ontology/bibo/status/accepted"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#broader": [
    {
      "@id": "http://data.europa.eu/xsp/cn2024/370400000080"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#exactMatch": [
    {
      "@id": "http://data.europa.eu/xsp/cn2023/370400100080"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#inScheme": [
    {
      "@id": "http://data.europa.eu/xsp/cn2024/cn2024"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#notation": [
    {
      "@value": "3704 00 10"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#prefLabel": [
    {
      "@language": "da",
      "@value": "Plader og film"
    },
    {
      "@language": "de",
      "@value": "Platten und Filme"
    },
    {
      "@language": "en",
      "@value": "Plates and film"
    },
    {
      "@language": "es",
      "@value": "Placas y películas"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#relatedMatch": [
    {
      "@id": "http://data.europa.eu/1j7/nst2007/175"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#scopeNote": [
    {
      "@language": "da",
      "@value": "Fotografiske plader og film, eksponerede, men ikke fremkaldte, af andre materialer end papir, pap eller tekstilstof"
    },
    {
      "@language": "de",
      "@value": "Platten und Filme, fotografisch, belichtet, jedoch unentwickelt (ausg. aus Papier, Pappe oder Spinnstoff)"
    },
    {
      "@language": "en",
      "@value": "Photographic plates and film, exposed but not developed (excl. products made of paper, paperboard or textiles)"
    },
    {
      "@language": "es",
      "@value": "Placas y películas, fotográficas, impresionadas pero sin revelar (exc. de papel, de cartón o de textiles)"
    }
  ]
}

In [None]:
response = client.post("/concept/", json=concept_film)
response.status_code

We can now create a correspondence between CN 2024 and CPA 2.1:

In [None]:
correspondence = {
  "@id": "http://data.europa.eu/xsp/cn2024/CN2024_CPA21",
  "@type": [
    "http://rdf-vocabulary.ddialliance.org/xkos#Correspondence"
  ],
  "http://purl.org/dc/terms/created": [
    {
      "@type": "http://www.w3.org/2001/XMLSchema#dateTime",
      "@value": "2023-11-09T11:58:32"
    }
  ],
  "http://purl.org/dc/terms/creator": [
    {
      "@id": "http://publications.europa.eu/resource/authority/corporate-body/ESTAT"
    }
  ],
  "http://purl.org/dc/terms/modified": [
    {
      "@type": "http://www.w3.org/2001/XMLSchema#dateTime",
      "@value": "2024-02-07T08:44:47"
    }
  ],
  "http://www.w3.org/2002/07/owl#versionInfo": [
    {
      "@value": "2024"
    }
  ],
  "http://purl.org/ontology/bibo/status": [
    {
      "@id": "http://purl.org/ontology/bibo/status/accepted"
    }
  ],
  "http://rdf-vocabulary.ddialliance.org/xkos#compares": [
    {
      "@id": "http://data.europa.eu/ehl/cpa21/cpa21"
    },
    {
      "@id": "http://data.europa.eu/xsp/cn2024/cn2024"
    }
  ],
  "http://www.w3.org/2000/01/rdf-schema#seeAlso": [
    {
      "@id": "http://data.europa.eu/ehl/cpa21/CPA21_CN2024"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#prefLabel": [
    {
      "@language": "en",
      "@value": "Transposition between CN 2024 and CPA 2.1"
    }
  ],
  "http://www.w3.org/2004/02/skos/core#scopeNote": [
    {
      "@language": "en",
      "@value": "CN - CPA 2.1 correspondences: amendments for CN codes 0307 60 00 , 0802 92 00, 0810 20 90, 1905 90 80 and 2106 90 98."
    }
  ]
}

In [None]:
response = client.post("/correspondence/", json=correspondence)
response.status_code

We can now describe a concept association: a link from a concept in our source concept scheme to one or more concepts in another concept scheme.

In [None]:
association = {
  "@id": "http://data.europa.eu/xsp/cn2024/CN2024_CPA21_370400100080",
  "@type": [
    "http://rdf-vocabulary.ddialliance.org/xkos#ConceptAssociation"
  ],
  "http://rdf-vocabulary.ddialliance.org/xkos#sourceConcept": [
    {
      "@id": "http://data.europa.eu/xsp/cn2024/370400100080"
    }
  ],
  "http://rdf-vocabulary.ddialliance.org/xkos#targetConcept": [
    {
      "@id": "http://data.europa.eu/ehl/cpa21/742011"
    }
  ]
}

In [None]:
response = client.post("/association/", json=association)
response.status_code
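
Since an association can link one source concept to one or more targets, writing the JSON-LD by hand becomes repetitive. The hypothetical builder below reproduces the structure of the `association` dictionary above. It assumes that exactly these four keys are what you want to send and does no validation of its own.

```python
XKOS = "http://rdf-vocabulary.ddialliance.org/xkos#"


def make_association(association_iri, source_iri, target_iris):
    """Build an expanded JSON-LD xkos:ConceptAssociation from plain IRI strings."""
    return {
        "@id": association_iri,
        "@type": [XKOS + "ConceptAssociation"],
        XKOS + "sourceConcept": [{"@id": source_iri}],
        XKOS + "targetConcept": [{"@id": iri} for iri in target_iris],
    }


rebuilt = make_association(
    "http://data.europa.eu/xsp/cn2024/CN2024_CPA21_370400100080",
    "http://data.europa.eu/xsp/cn2024/370400100080",
    ["http://data.europa.eu/ehl/cpa21/742011"],
)
print(rebuilt)
```

The result compares equal to the hand-written `association` dictionary, so it could be posted the same way with `client.post("/association/", json=rebuilt)`.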

The XKOS definition for association verbs doesn't include any way to describe the *quality* of the associations, but we can add these *associative* relationships manually:

In [None]:
film_relationship = [
    {
        "@id": "http://data.europa.eu/xsp/cn2024/370400100080",
        "http://www.w3.org/2004/02/skos/core#closeMatch": [
            {
                "@id": "http://data.europa.eu/ehl/cpa21/742011",
            }
        ]
    }
]

In [None]:
response = client.post("/relationships/", json=film_relationship)
response.status_code

In [None]:
response = client.get("/relationships/", params={"iri": "http://data.europa.eu/xsp/cn2024/370400100080"})
response.json()
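
The payload posted to `/relationships/` above is a list of objects, each naming a source concept, a SKOS property, and a target. As a minimal sketch, the hypothetical helper below builds one such object and rejects property names that are not SKOS mapping properties. The set of properties the server itself accepts may differ, so treat the check as a guard against typos rather than a statement about the API.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# The five mapping properties defined by the SKOS vocabulary.
MAPPING_PROPERTIES = {
    "exactMatch",
    "closeMatch",
    "broadMatch",
    "narrowMatch",
    "relatedMatch",
}


def make_relationship(source_iri, mapping_property, target_iri):
    """Build one expanded JSON-LD object linking two concepts with a SKOS mapping property."""
    if mapping_property not in MAPPING_PROPERTIES:
        raise ValueError(f"Not a SKOS mapping property: {mapping_property}")
    return {"@id": source_iri, SKOS + mapping_property: [{"@id": target_iri}]}


payload = [
    make_relationship(
        "http://data.europa.eu/xsp/cn2024/370400100080",
        "closeMatch",
        "http://data.europa.eu/ehl/cpa21/742011",
    )
]
print(payload)
```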