# PyST Simple Client

`pyst_client.simple` is a Python client library that simplifies working with [PyST servers](https://docs.pyst.dev/), which use JSON-LD as their transport serialization format. This format is both verbose and bureaucratic, and `pyst_client.simple` does a lot of the paperwork for you.

In [1]:
from pyst_client.simple import *
import httpx, csv, io
from tqdm import tqdm

## Setup

We first need to set up the client by specifying the following:

* PyST server URL base path: `settings.set_server_url(<url>)`, e.g. `settings.set_server_url("https://vocab.bonsai.uno")`. Trailing slash is optional.
* PyST API authentication key: `settings.set_api_key(<api_key>)`, e.g. `settings.set_api_key("supersecret")`.
* Default creation language. This must be an [RFC 5646](https://datatracker.ietf.org/doc/html/rfc5646) language code, and should be one of the languages configured on the PyST server. All multilingual strings without language codes will use this language: `settings.set_language(<code>)`, e.g. `settings.set_language("es")`.
* Creation base URL. The URL used as a base path for your objects' IRIs when using automatic IRI generation: `settings.set_creation_base_url(<url>)`, e.g. `settings.set_creation_base_url("https://awesome.namespace.com")`.
* Default creator IRI. Will be used as a fallback default for all created objects when `creator` is not specified: `settings.set_creator(<my_url>)`, e.g. `settings.set_creator("https://valentin.stargazer")`.

## Data creation

We follow the [PyST data model](https://docs.pyst.dev/data-model/), and won't go into detail on the structure or individual attributes here.

The API code is pretty readable - refer to it for more information on each method. For most object classes, you can do the following:

* `object_class.create(args)`
* `object_class.get_one(args)`
* `object_class.get_many(args)`
* `object_instance.save()`
* `object_instance.delete()`

### Concept schemes

The `ConceptScheme.create` method will create a `ConceptScheme` in memory, but *won't persist it to the server*.

In [2]:
cs = ConceptScheme.create(
    pref_labels=["Central Product Classification"],
    version="2.1",
    notations=["CPCv2.1"],
    definitions=["CPC constitutes a comprehensive classification of all goods and services. CPC presents categories for all products that can be the object of domestic or international transactions or that can be entered into stocks.  It includes products that are an output of economic activity, including transportable goods, non-transportable goods and services.  CPC, as a standard central product classification, was developed to serve as an instrument for assembling and tabulating all kinds of statistics requiring product detail.  Such statistics may cover production, intermediate and final consumption, capital formation, foreign trade or prices.  They may refer to commodity flows, stocks or balances and may be compiled in the context of input/output tables, balance of payments and other analytical presentations. The CPC classifies products based on the physical characteristics of goods or on the nature of the services rendered. CPC was developed primarily to enhance harmonization among various fields of economic and related statistics and to strengthen the role of national accounts as an instrument for the coordination of economic statistics.  It provides a basis for recompiling basic statistics from their original classifications into a standard classification for analytical use."],
    creators=[{"@id": "https://unstats.un.org/"}]
)

Our IRI is automatically generated based on the creation base URL. My creation base URL was `https://ninja.space`:

In [18]:
cs.id_

'https://ninja.space/CPCv2.1'

If you want to create IRIs yourself, use `.create(..., id_=my_iri)`.
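The generated IRI above looks like the creation base URL joined to the first notation, and the concept IRIs later in this notebook look like the scheme IRI joined to the concept notation. A minimal sketch of that pattern, assuming this is indeed the rule: `join_iri` is a hypothetical helper, not part of `pyst_client.simple`, and the library's own logic may differ.

```python
def join_iri(base: str, notation: str) -> str:
    """Join a base IRI and a notation with exactly one slash between them.

    Hypothetical helper; it only mirrors the IRIs visible in this notebook.
    """
    return f"{base.rstrip('/')}/{notation}"


# A trailing slash on the base URL makes no difference to the result.
scheme_iri = join_iri("https://ninja.space/", "CPCv2.1")
concept_iri = join_iri(scheme_iri, "0")
print(scheme_iri)
print(concept_iri)
```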

The `.save()` method will try to create the object on the PyST server, and then switch to updating it if it already exists. If you know it exists already, you can use `.save(already_exists=True)` to save a bit of time.

In [3]:
cs.save()

2025-05-07 08:29:57 [info     ] Server URL http://192.168.1.137:8000 successfully loaded from secrets directory
2025-05-07 08:29:57 [info     ] Default language `en` successfully loaded from secrets directory
2025-05-07 08:29:57 [info     ] Server URL `http://192.168.1.137:8000` is healthy and reachable
2025-05-07 08:29:57 [info     ] API key successfully loaded from secrets directory
2025-05-07 08:29:57 [info     ] Creation base URL successfully loaded from secrets directory


<Response [200 OK]>
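The create-then-update behaviour of `.save()` can be pictured with a small in-memory stand-in. This is only a sketch of the idea: `FakeServer`, `AlreadyExists`, and this `save` function are hypothetical names, and the real client talks to the PyST server over HTTP rather than to a dictionary.

```python
class AlreadyExists(Exception):
    """Raised by the fake server when an IRI is already taken."""


class FakeServer:
    """In-memory stand-in for a server; for illustration only."""

    def __init__(self):
        self.objects = {}

    def create(self, iri, data):
        if iri in self.objects:
            raise AlreadyExists(iri)
        self.objects[iri] = dict(data)
        return "created"

    def update(self, iri, data):
        self.objects[iri] = dict(data)
        return "updated"


def save(server, iri, data, already_exists=False):
    """Create the object, falling back to an update if it already exists."""
    if already_exists:
        # Skip the create attempt, which saves one round trip.
        return server.update(iri, data)
    try:
        return server.create(iri, data)
    except AlreadyExists:
        return server.update(iri, data)


server = FakeServer()
iri = "https://ninja.space/CPCv2.1"
first = save(server, iri, {"version": "2.1"})
second = save(server, iri, {"version": "2.1"})
third = save(server, iri, {"version": "2.1"}, already_exists=True)
print(first, second, third)
```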

We can change the `ConceptScheme` data and save the changed object:

In [19]:
cs.notations = [
    {
        '@value': 'CPCv2.1',
        '@type': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#PlainLiteral'
    },
    {
        '@value': 'CPC-latest',
        '@type': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#PlainLiteral'  # Required for notations
    },
]

In [20]:
cs.save(already_exists=True)

<Response [200 OK]>
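The notation dictionaries above are JSON-LD literals. When plain strings are passed instead, as in the earlier `ConceptScheme.create` call, the client presumably expands them into this form, using the default language for labels. The sketch below shows one way such an expansion could look; `as_notation` and `as_label` are hypothetical helpers, not the library's API.

```python
PLAIN_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#PlainLiteral"


def as_notation(value):
    """Wrap a bare string as a typed literal; pass dictionaries through."""
    if isinstance(value, dict):
        return value
    return {"@value": value, "@type": PLAIN_LITERAL}


def as_label(value, default_language="en"):
    """Wrap a bare string as a language-tagged literal; pass dictionaries through."""
    if isinstance(value, dict):
        return value
    return {"@value": value, "@language": default_language}


notations = [as_notation(v) for v in ["CPCv2.1", "CPC-latest"]]
labels = [as_label("Central Product Classification")]
print(notations)
print(labels)
```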

### Concepts

We can now populate our `ConceptScheme` with `Concept` objects, following the same pattern as with `ConceptScheme`:

In [3]:
concept_0 = Concept.create(
    concept_scheme=cs,
    pref_labels=["Agriculture, forestry and fishery products"],
    notations=["0"],
    extra={"http://rdf-vocabulary.ddialliance.org/xkos#depth": 1},
    top_concept=True,
)
concept_0.id_

'https://ninja.space/CPCv2.1/0'

In [5]:
concept_0.save()

<Response [200 OK]>

In [4]:
concept_01 = Concept.create(
    concept_scheme=cs,
    pref_labels=["Products of agriculture, horticulture and market gardening"],
    notations=["01"],
    extra={"http://rdf-vocabulary.ddialliance.org/xkos#depth": 2}
)
concept_01.id_

'https://ninja.space/CPCv2.1/01'

In [7]:
concept_01.save()

<Response [200 OK]>

We can now ask the `ConceptScheme` about its concepts:

In [13]:
cs.concepts()

[Concept(id_='https://ninja.space/CPCv2.1/0', types=['http://www.w3.org/2004/02/skos/core#Concept'], pref_labels=[{'@value': 'Agriculture, forestry and fishery products', '@language': 'en'}], status=[{'@id': 'http://purl.org/ontology/bibo/status/accepted'}], notations=[{'@type': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#PlainLiteral', '@value': '0'}], definitions=[], change_notes=[], history_notes=[{'http://purl.org/dc/terms/issued': [{'@type': 'http://www.w3.org/2001/XMLSchema#date', '@value': '2025-05-07T00:00:00'}], 'http://purl.org/dc/terms/creator': [{'@id': 'https://chris.mutel.org/'}], 'http://www.w3.org/1999/02/22-rdf-syntax-ns#value': [{'@value': 'Created by pyst-client version 1', '@language': 'en'}]}], editorial_notes=[], extra={'http://rdf-vocabulary.ddialliance.org/xkos#depth': 1}, schemes=[{'@id': 'https://ninja.space/CPCv2.1'}], alt_labels=[], hidden_labels=[], top_concept_of=[{'@id': 'https://ninja.space/CPCv2.1'}]),
 Concept(id_='https://ninja.space/CPCv2.1/01', type

We can also see the data on the server:

In [14]:
cs.open_new_tab()

### Relationships

The two concepts we created are related - `concept_0` is broader than `concept_01`. We can create this relationship on the server.

Note that *relationships* are directly persisted to the server. Relationships can't be modified, only created and deleted.

In [9]:
Relationship.create_many(
    sources=[concept_01],
    targets=[concept_0],
    verbs=RelationshipVerbs.broader,
)

[Relationship(source='https://ninja.space/CPCv2.1/01', target='https://ninja.space/CPCv2.1/0', predicate=<RelationshipVerbs.broader: 'http://www.w3.org/2004/02/skos/core#broader'>)]

We can now see the relationship on the server:

In [21]:
concept_01.open_new_tab()

## Mass import

Let's do a mass data import. We will take data from the GTDR/BONSAI project. We can start with their data on CPC, as we already have the `ConceptScheme`.

In [3]:
url = "https://gitlab.com/bonsamurais/bonsai/util/classifications/-/raw/main/src/classifications/data/flow/flowobject/tree_cpc_2_1.csv?inline=false"
response = httpx.get(url)
data = list(csv.DictReader(io.StringIO(response.text)))
data[10]

{'code': '0113', 'name': 'Rice', 'parent_code': '011', 'level': '4'}
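To experiment with the row format without downloading the file, the same parsing can be run on an inline sample. The three rows below are reconstructed from values that appear in this notebook (the parent and level of the first two rows are inferred), so treat them as an illustration rather than an excerpt of the real file.

```python
import csv
import io

# Small sample in the shape of the CPC file; not a verbatim excerpt.
sample = """code,name,parent_code,level
0,"Agriculture, forestry and fishery products",total,1
01,"Products of agriculture, horticulture and market gardening",0,2
0113,Rice,011,4
"""

rows = list(csv.DictReader(io.StringIO(sample)))
by_code = {row["code"]: row for row in rows}

# Every value is a string, so numeric fields need an explicit conversion.
rice_level = int(by_code["0113"]["level"])
print(by_code["0113"], rice_level)
```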

We first create the `Concept` objects. The `pyst_client.simple` library is synchronous, and does one task at a time.

The `pyst_client` library is asynchronous and can have much better performance, but requires a bit more work as it has fewer helper classes. You would also need to work with the existing event loop if running code in a Jupyter notebook.
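To give a feel for the asynchronous approach, here is a generic `asyncio` sketch that runs several requests concurrently while capping how many are in flight. It does not use `pyst_client` at all: `fake_save` is a hypothetical stand-in that sleeps instead of calling a server.

```python
import asyncio


async def fake_save(code: str, limit: asyncio.Semaphore) -> str:
    """Stand-in for one network request; sleeps instead of calling a server."""
    async with limit:
        await asyncio.sleep(0.01)
        return code


async def save_all(codes, max_concurrent=10):
    # The semaphore caps how many requests run at the same time.
    limit = asyncio.Semaphore(max_concurrent)
    return await asyncio.gather(*(fake_save(code, limit) for code in codes))


# In a script, start the event loop explicitly. Inside Jupyter, write
# `saved = await save_all(...)` instead, because a loop is already running.
saved = asyncio.run(save_all(["0", "01", "011", "0113"]))
print(saved)
```

`asyncio.gather` returns results in the order of its inputs, so `saved` lines up with the list of codes even though the requests overlap in time.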

In [24]:
for row in tqdm(data):
    Concept.create(
        concept_scheme=cs,
        pref_labels=[row["name"]],
        notations=[row["code"]],
        extra={"http://rdf-vocabulary.ddialliance.org/xkos#depth": int(row["level"])},
        top_concept=row["level"] == "0",
    ).save()

100%|██████████████████████████████████████████████████████████| 4597/4597 [04:11<00:00, 18.26it/s]


We can bulk-create the relationships. We don't have all the `Concept` object instances, but don't need them - `Relationship.create_many` can take IRI strings.

We already defined one relationship from this data, but our helper function will remove this duplicate. Note that duplicate detection happens on the server, and each request only removes one duplicate, so this isn't efficient for many duplicate entries.

You could also chunk the input data for this function.
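A minimal sketch of such chunking, which keeps sources and targets aligned; `chunked_pairs` is a hypothetical helper, and the chunk size is something you would tune to your server and timeout.

```python
def chunked_pairs(sources, targets, size):
    """Yield (sources, targets) slices holding at most `size` pairs each."""
    if len(sources) != len(targets):
        raise ValueError("sources and targets must have the same length")
    for start in range(0, len(sources), size):
        yield sources[start:start + size], targets[start:start + size]


batches = list(
    chunked_pairs(["a", "b", "c", "d", "e"], ["v", "w", "x", "y", "z"], size=2)
)
print(batches)
```

Each `(sources, targets)` batch could then be passed to its own `Relationship.create_many` call with the same `verbs` argument.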

In [5]:
sources = [Concept.generate_iri(concept_scheme=cs, notations=[row["code"]]) for row in data if row["parent_code"]]
targets = [Concept.generate_iri(concept_scheme=cs, notations=[row["parent_code"]]) for row in data if row["parent_code"]]
    
responses = Relationship.create_many(
    sources=sources,
    targets=targets,
    verbs=RelationshipVerbs.broader,
    timeout=120
)
len(responses)

2025-05-07 09:48:44 [info     ] Skipping existing relationship between `https://ninja.space/CPCv2.1/0` and `https://ninja.space/CPCv2.1/total`
2025-05-07 09:48:52 [info     ] Skipping existing relationship between `https://ninja.space/CPCv2.1/01` and `https://ninja.space/CPCv2.1/0`
2025-05-07 09:49:00 [info     ] Skipping existing relationship between `https://ninja.space/CPCv2.1/011` and `https://ninja.space/CPCv2.1/01`
2025-05-07 09:49:07 [info     ] Skipping existing relationship between `https://ninja.space/CPCv2.1/0111` and `https://ninja.space/CPCv2.1/011`
2025-05-07 09:49:17 [info     ] Skipping existing relationship between `https://ninja.space/CPCv2.1/01111` and `https://ninja.space/CPCv2.1/0111`
2025-05-07 09:49:27 [info     ] Skipping existing relationship between `https://ninja.space/CPCv2.1/01121` and `htt

4560

We can do the same for another `ConceptScheme` - this time the set of BONSAI flow objects:

In [6]:
url = "https://gitlab.com/bonsamurais/bonsai/util/classifications/-/raw/main/src/classifications/data/flow/flowobject/tree_bonsai.csv?inline=false"
response = httpx.get(url)
data = list(csv.DictReader(io.StringIO(response.text)))
data[10]

{'code': 'fi_01121',
 'parent_code': 'fi_0112',
 'in_final_sut': '',
 'name': 'Maize (corn), seed',
 'level': '5',
 'default_unit': 'tonnes',
 'alias_code': '',
 'comment': ''}

This has additional attributes. We can just add these to the `extra` section. They could be defined with reference to a proper RDF ontology, but don't need to be.
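Because every CSV value arrives as a string, the extra attributes need a little type conversion before they are stored. The sketch below builds the `extra` dictionary for the sample row above; `build_extra` and its `drop_empty` option are hypothetical, and dropping empty strings is a choice you may or may not want.

```python
XKOS_DEPTH = "http://rdf-vocabulary.ddialliance.org/xkos#depth"

row = {
    "code": "fi_01121",
    "parent_code": "fi_0112",
    "in_final_sut": "",
    "name": "Maize (corn), seed",
    "level": "5",
    "default_unit": "tonnes",
    "alias_code": "",
    "comment": "",
}


def build_extra(row, drop_empty=False):
    """Convert CSV strings into typed values for the `extra` section."""
    extra = {
        XKOS_DEPTH: int(row["level"]),
        # Only the literal string "True" counts as true; "" becomes False.
        "in_final_sut": row["in_final_sut"] == "True",
        "default_unit": row["default_unit"],
        "alias_code": row["alias_code"],
        "comment": row["comment"],
    }
    if drop_empty:
        extra = {key: value for key, value in extra.items() if value != ""}
    return extra


full = build_extra(row)
compact = build_extra(row, drop_empty=True)
print(compact)
```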

In [5]:
cs_bonsai = ConceptScheme.create(
    pref_labels=["BONSAI Flow Object classification"],
    version="2025.1",
    notations=["BONSAI2025.1"],
    definitions=["BONSAI is a work output of the GTDR project. This classification extends EXIOBASE with much more detail on individual products. Original data from https://gitlab.com/bonsamurais/bonsai/util/classifications/-/blob/main/src/classifications/data/flow/flowobject/tree_bonsai.csv"],
    creators=[
        {"@id": "https://gitlab.com/mabudz"},
        {"@id": "https://gitlab.com/matdelpierre"},
        {"@id": "https://gitlab.com/Albertkwame"},
        {"@id": "https://gitlab.com/SanderNielen"},
    ]
)
cs_bonsai.save()
cs_bonsai.id_

2025-05-07 12:27:59 [info     ] Server URL http://192.168.1.137:8000 successfully loaded from secrets directory
2025-05-07 12:27:59 [info     ] Default language `en` successfully loaded from secrets directory
2025-05-07 12:27:59 [info     ] Server URL `http://192.168.1.137:8000` is healthy and reachable
2025-05-07 12:27:59 [info     ] API key successfully loaded from secrets directory
2025-05-07 12:27:59 [info     ] Creation base URL successfully loaded from secrets directory


'https://ninja.space/BONSAI2025.1'

In [9]:
from tqdm import tqdm

for row in tqdm(data):
    Concept.create(
        concept_scheme=cs_bonsai,
        pref_labels=[row["name"]],
        notations=[row["code"]],
        extra={
            "http://rdf-vocabulary.ddialliance.org/xkos#depth": int(row["level"]),
            "in_final_sut": row["in_final_sut"] == "True",
            "default_unit": row["default_unit"],
            "alias_code": row["alias_code"],
            "comment": row["comment"],
        },
        top_concept=row["level"] == "0",
    ).save()

100%|██████████████████████████████████████████████████████████| 7440/7440 [07:15<00:00, 17.10it/s]


In [11]:
sources = [Concept.generate_iri(concept_scheme=cs_bonsai, notations=[row["code"]]) for row in data if row["parent_code"]]
targets = [Concept.generate_iri(concept_scheme=cs_bonsai, notations=[row["parent_code"]]) for row in data if row["parent_code"]]
    
responses = Relationship.create_many(
    sources=sources,
    targets=targets,
    verbs=RelationshipVerbs.broader,
    timeout=240
)
len(responses)

7430

### Correspondence

We can now define a `Correspondence` between the two `ConceptScheme` objects.

In [4]:
correspondence = Correspondence.create(
    compares=[cs, cs_bonsai],
    pref_labels=["BONSAI Flow Object classification"],
    version="2025.1",
    definitions=[""],
    creators=[
        {"@id": "https://gitlab.com/mabudz"},
        {"@id": "https://gitlab.com/SanderNielen"},
    ]
)
correspondence.save()
correspondence.id_

'https://ninja.space/CPCv2.1-BONSAI2025.1'

### Associations

`Correspondence` objects are made up of `Association` objects between concepts. We can read the association data from the BONSAI GitLab:

In [6]:
import httpx, csv, io

In [7]:
url = "https://gitlab.com/bonsamurais/bonsai/util/classifications/-/raw/main/src/classifications/data/flow/flowobject/conc_bonsai_cpc_2_1.csv?inline=false"
response = httpx.get(url)
data = list(csv.DictReader(io.StringIO(response.text)))
row = data[10]
row

{'flowobject_from': 'fi_01121',
 'flowobject_to': '01121',
 'classification_from': 'bonsai',
 'classification_to': 'cpc_2_1',
 'comment': 'one-to-one correspondence',
 'skos_uri': 'http://www.w3.org/2004/02/skos/core#exactMatch'}

Let's create a single `Association` to show its pattern. Note that `Association` objects can't be updated once created - they need to be deleted and re-created instead.

In [8]:
assoc = Association.create(
    correspondence=correspondence,
    source_concepts=[
        # Can also be a `Concept` instance
        {"@id": Concept.generate_iri(concept_scheme=cs_bonsai, notations=[row["flowobject_from"]])}
    ], 
    target_concepts=[
        {"@id": Concept.generate_iri(concept_scheme=cs, notations=[row["flowobject_to"]])}
    ],
    extra={
        "mapping_type": row["skos_uri"],
        "comment": row["comment"],
    }
)
assoc.save()
assoc.id_

'https://ninja.space/CPCv2.1-BONSAI2025.1/6fbf961723fd4195a44b3149b6d16add'
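The example above assumes every row maps from BONSAI to CPC. The file also carries `classification_from` and `classification_to` columns, so a more defensive variant could pick the scheme from the row itself. This is a sketch under assumptions: `concept_ref` and the `SCHEMES` lookup are hypothetical, and the IRI pattern is inferred from the IRIs shown in this notebook.

```python
# Map the classification names used in the file to the scheme IRIs above.
SCHEMES = {
    "bonsai": "https://ninja.space/BONSAI2025.1",
    "cpc_2_1": "https://ninja.space/CPCv2.1",
}

row = {
    "flowobject_from": "fi_01121",
    "flowobject_to": "01121",
    "classification_from": "bonsai",
    "classification_to": "cpc_2_1",
    "comment": "one-to-one correspondence",
    "skos_uri": "http://www.w3.org/2004/02/skos/core#exactMatch",
}


def concept_ref(scheme_iri, notation):
    """Build a JSON-LD reference to a concept; the IRI pattern is assumed."""
    return {"@id": f"{scheme_iri.rstrip('/')}/{notation}"}


# A KeyError here flags a row that uses an unexpected classification.
source = concept_ref(SCHEMES[row["classification_from"]], row["flowobject_from"])
target = concept_ref(SCHEMES[row["classification_to"]], row["flowobject_to"])
print(source, target)
```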

Creating an `Association` adds it automatically to the corresponding `Correspondence` object. We need to reload this from the server as our version has stale data:

In [19]:
correspondence = Correspondence.get_one(correspondence.id_)
correspondence.made_ofs

[{'@id': 'https://ninja.space/CPCv2.1-BONSAI2025.1/3584a98600994a23a90f13c1651f2458'},
 {'@id': 'https://ninja.space/CPCv2.1-BONSAI2025.1/6fbf961723fd4195a44b3149b6d16add'}]

We can delete this association instance and mass import the file:

In [18]:
assoc.delete()

<Response [204 No Content]>

In [20]:
for row in tqdm(data):
    Association.create(
        correspondence=correspondence,
        source_concepts=[
            # Can also be a `Concept` instance
            {"@id": Concept.generate_iri(concept_scheme=cs_bonsai, notations=[row["flowobject_from"]])}
        ], 
        target_concepts=[
            {"@id": Concept.generate_iri(concept_scheme=cs, notations=[row["flowobject_to"]])}
        ],
        extra={
            "mapping_type": row["skos_uri"],
            "comment": row["comment"],
        }
    ).save()

100%|██████████████████████████████████████████████████████████| 6194/6194 [04:14<00:00, 24.32it/s]


## Using the server data

Once data has been uploaded to the server, we can query it in our scripts, and let our users browse the classification data. You might want to load the server webpage to see the data which is now there.

The main way we will interact with the server is via `Concept` objects. You can get a single `Concept` with:

In [6]:
my_concept = Concept.get_one(concept_01.id_)

2025-05-07 12:06:19 [info     ] Server URL http://192.168.1.137:8000 successfully loaded from secrets directory
2025-05-07 12:06:19 [info     ] Default language `en` successfully loaded from secrets directory
2025-05-07 12:06:19 [info     ] Server URL `http://192.168.1.137:8000` is healthy and reachable


We can ask our concepts about their hierarchical relationships:

In [24]:
my_concept.relationships()

[[Relationship(source='https://ninja.space/CPCv2.1/01', target='https://ninja.space/CPCv2.1/0', predicate=<RelationshipVerbs.broader: 'http://www.w3.org/2004/02/skos/core#broader'>)]]

By default, `.relationships()` only returns `Relationship` objects in which the originating `Concept` is the source. Pass `target=True` to also include those in which it is the target:

In [26]:
my_concept.relationships(target=True)

[[Relationship(source='https://ninja.space/CPCv2.1/01', target='https://ninja.space/CPCv2.1/0', predicate=<RelationshipVerbs.broader: 'http://www.w3.org/2004/02/skos/core#broader'>)],
 [Relationship(source='https://ninja.space/CPCv2.1/011', target='https://ninja.space/CPCv2.1/01', predicate=<RelationshipVerbs.broader: 'http://www.w3.org/2004/02/skos/core#broader'>)]]
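The result is a list of lists, so it usually needs flattening before use. The sketch below separates the parent from the children of a concept; `Rel` is a hypothetical stand-in with the three fields visible in the output above, not the library's `Relationship` class.

```python
from dataclasses import dataclass

BROADER = "http://www.w3.org/2004/02/skos/core#broader"


@dataclass(frozen=True)
class Rel:
    """Stand-in carrying the fields shown in the output above."""

    source: str
    target: str
    predicate: str


nested = [
    [Rel("https://ninja.space/CPCv2.1/01", "https://ninja.space/CPCv2.1/0", BROADER)],
    [Rel("https://ninja.space/CPCv2.1/011", "https://ninja.space/CPCv2.1/01", BROADER)],
]
me = "https://ninja.space/CPCv2.1/01"

# Flatten the nested lists, then split by the role this concept plays.
flat = [rel for group in nested for rel in group]
parents = [r.target for r in flat if r.source == me and r.predicate == BROADER]
children = [r.source for r in flat if r.target == me and r.predicate == BROADER]
print(parents, children)
```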

We can also ask about associations:

In [11]:
Concept.get_one(
    Concept.generate_iri(concept_scheme=cs_bonsai, notations=["fi_12"])
).associations(target=True)

Source: [Association(id_='https://ninja.space/CPCv2.1-BONSAI2025.1/88a1b03df77e4e769ce26c6967b72f55', types=['http://rdf-vocabulary.ddialliance.org/xkos#ConceptAssociation'], source_concepts=[{'@id': 'https://ninja.space/BONSAI2025.1/fi_12'}], target_concepts=[{'@id': 'https://ninja.space/CPCv2.1/12'}], kind=<AssociationKind.simple: 'simple'>, extra={'comment': 'one-to-one correspondence', 'mapping_type': 'http://www.w3.org/2004/02/skos/core#exactMatch', 'http://rdf-vocabulary.ddialliance.org/xkos#Correspondence': 'https://ninja.space/CPCv2.1-BONSAI2025.1'})]
Target: []


[Association(id_='https://ninja.space/CPCv2.1-BONSAI2025.1/88a1b03df77e4e769ce26c6967b72f55', types=['http://rdf-vocabulary.ddialliance.org/xkos#ConceptAssociation'], source_concepts=[{'@id': 'https://ninja.space/BONSAI2025.1/fi_12'}], target_concepts=[{'@id': 'https://ninja.space/CPCv2.1/12'}], kind=<AssociationKind.simple: 'simple'>, extra={'comment': 'one-to-one correspondence', 'mapping_type': 'http://www.w3.org/2004/02/skos/core#exactMatch', 'http://rdf-vocabulary.ddialliance.org/xkos#Correspondence': 'https://ninja.space/CPCv2.1-BONSAI2025.1'})]
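A typical use of this result is to translate a concept from one classification to the other. The sketch below follows exact matches from a BONSAI concept to CPC; it uses plain dictionaries that mirror the fields in the output above rather than real `Association` objects, and `exact_matches` is a hypothetical helper.

```python
EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"

# Plain-dictionary copy of the association shown above.
associations = [
    {
        "id_": "https://ninja.space/CPCv2.1-BONSAI2025.1/88a1b03df77e4e769ce26c6967b72f55",
        "source_concepts": [{"@id": "https://ninja.space/BONSAI2025.1/fi_12"}],
        "target_concepts": [{"@id": "https://ninja.space/CPCv2.1/12"}],
        "extra": {
            "comment": "one-to-one correspondence",
            "mapping_type": EXACT_MATCH,
        },
    }
]


def exact_matches(associations, concept_iri):
    """Return target IRIs of exact-match associations that start at `concept_iri`."""
    return [
        target["@id"]
        for assoc in associations
        if assoc["extra"].get("mapping_type") == EXACT_MATCH
        and any(src["@id"] == concept_iri for src in assoc["source_concepts"])
        for target in assoc["target_concepts"]
    ]


matches = exact_matches(associations, "https://ninja.space/BONSAI2025.1/fi_12")
print(matches)
```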