# Basic usage of the DSMS-Python-SDK

Before you run this tutorial, make sure that you have access to a DSMS-instance of your interest, that you have installed this package, and that you have copied the needed variables such as `DSMS_HOST_URL` and `DSMS_TOKEN` into an `.env`-file.

First of all, let us import the needed classes and functions for this tutorial.

In [1]:
from dsms import DSMS, KItem

Now source the environment variables from an `.env` file and start the DSMS-session.

In [2]:
dsms = DSMS(env="../.env")
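
If the session fails to start, a missing or empty variable in the `.env` file is a likely cause. The following is a minimal sketch for checking the file with the standard library only. The helpers `read_env_file` and `missing_variables` are hypothetical names and not part of the SDK, and the parser only understands simple `KEY=VALUE` lines.

```python
from pathlib import Path

# Variable names taken from the prerequisites of this tutorial.
REQUIRED = ("DSMS_HOST_URL", "DSMS_TOKEN")


def read_env_file(path):
    """Parse simple KEY=VALUE lines; skip blank lines and '#' comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_variables(path, required=REQUIRED):
    """Return the required variable names that are absent or empty."""
    values = read_env_file(path)
    return [name for name in required if not values.get(name)]
```

Calling `missing_variables("../.env")` returns an empty list when both variables are defined and non-empty.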

### 1: Introduction

We can see which kind of DSMS-object we own as a user:

We can investigate what a KItem needs in order to be created. KItems are entirely based on [`Pydantic`](https://docs.pydantic.dev/latest/)-Models (v2), hence the properties (in `Pydantic` called `Fields`) are automatically validated once we set them. 

The schema of the KItem itself is a JSON schema, which is machine-readable and can be directly incorporated into [Swagger](https://swagger.io/tools/swagger-ui/)-supported APIs such as [`FastAPI`](https://fastapi.tiangolo.com/).
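
To illustrate what validation on assignment and a machine-readable schema mean in Pydantic v2, here is a small sketch with a toy model. `ToyItem` is a hypothetical stand-in and not the SDK's `KItem`; it only assumes that Pydantic v2 is installed.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class ToyItem(BaseModel):
    """Hypothetical stand-in for a KItem-like model; not the SDK's class."""

    # Re-run validation whenever an attribute is assigned.
    model_config = ConfigDict(validate_assignment=True)

    name: str
    summary: Optional[str] = None


toy = ToyItem(name="foo123")

try:
    toy.name = ["not", "a", "string"]  # wrong type, rejected on assignment
except ValidationError as error:
    print("assignment rejected:", len(error.errors()), "error(s)")

# The JSON schema is a plain dict that other tools can consume.
schema = ToyItem.model_json_schema()
print(sorted(schema["properties"]))
```

The rejected assignment leaves `toy.name` unchanged, which is the behaviour we rely on when setting KItem properties.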

We can investigate the KTypes defined in the remote instance:

In [3]:
for ktype in dsms.ktypes:
    print(ktype)

KTypes.Organization
KTypes.App
KTypes.Dataset
KTypes.DatasetCatalog
KTypes.TestingMachine
KTypes.Expert


### 2: Create KItems

We can create new KItems by simply instantiating the class:

In [4]:
item = KItem(
    name="foo123",
    ktype_id=dsms.ktypes.Dataset,
    custom_properties={"foo": "bar"},
)

item

KItem(

	name = foo123, 

	id = 6e63e2f4-587b-46a2-871e-fa835d219ba8, 

	ktype_id = KTypes.Dataset, 

	in_backend = False, 

	slug = foo123-6e63e2f4, 

	annotations = [], 

	attachments = [], 

	linked_kitems = [], 

	affiliations = [], 

	authors = [], 

	avatar_exists = False, 

	contacts = [], 

	created_at = None, 

	updated_at = None, 

	external_links = [], 

	kitem_apps = [], 

	summary = None, 

	user_groups = [], 

	custom_properties = {
		foo: bar
	}, 

	hdf5 = None, 

	rdf_exists = False
)

Remember: changes are only synchronized with the DSMS when you call the `commit`-method:

In [5]:
dsms.commit()
item.url

'https://bue.materials-data.space/knowledge/dataset/foo123-6e63e2f4'
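
The idea that changes stay local until `commit` is called can be pictured with a toy session. This is only an illustration of the buffering pattern under our own assumptions; `ToySession` and its methods are hypothetical and say nothing about how the SDK is implemented internally.

```python
class ToySession:
    """Hypothetical session that buffers changes locally until commit()."""

    def __init__(self):
        self.backend = {}  # stands in for the remote DSMS
        self._buffer = {}  # local changes that have not been sent yet

    def add(self, key, value):
        """Record a change locally; the backend is not touched yet."""
        self._buffer[key] = value

    def commit(self):
        """Send all buffered changes to the backend and clear the buffer."""
        self.backend.update(self._buffer)
        self._buffer.clear()


session = ToySession()
session.add("foo123", {"custom_properties": {"foo": "bar"}})
in_backend_before = "foo123" in session.backend  # nothing has been sent yet
session.commit()
in_backend_after = "foo123" in session.backend  # commit pushed the change
```

This mirrors the `in_backend` flag of the KItem, which is `False` right after creation and `True` once the commit has succeeded.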

As we can see, the object we created before running the `commit`-method has automatically been updated, e.g. with the creation- and update-timestamp:

In [6]:
item

KItem(

	name = foo123, 

	id = 6e63e2f4-587b-46a2-871e-fa835d219ba8, 

	ktype_id = dataset, 

	in_backend = True, 

	slug = foo123-6e63e2f4, 

	annotations = [], 

	attachments = [], 

	linked_kitems = [], 

	affiliations = [], 

	authors = [
		{
			user_id: 29367f49-0562-43f9-b0cd-62b662446cc9
		}
	], 

	avatar_exists = False, 

	contacts = [], 

	created_at = 2024-05-13 11:52:26.331260, 

	updated_at = 2024-05-13 11:52:26.331260, 

	external_links = [], 

	kitem_apps = [], 

	summary = None, 

	user_groups = [], 

	custom_properties = {
		foo: bar
	}, 

	hdf5 = None, 

	rdf_exists = False
)

### 3: Update KItems

Now, we would like to update the properties of the KItem we created previously.

Depending on the schema of each property (see `KItem.model_json_schema()` in the **Introduction** of this tutorial), we can simply use the standard `list`-methods as we know them from basic Python (e.g. for `annotations`, `attachments`, `external_links`, etc).


Other properties which are not `list`-like can be simply set by attribute-assignment (e.g. `name`, `slug`, `ktype_id`, etc).

In [7]:
item.name = "foobar"
item.custom_properties.foobar = "foobar"
item.attachments.append("../README.md")
item.annotations.append("www.example.org/foo")
item.external_links.append(
    {"url": "http://example.org", "label": "example link"}
)
item.contacts.append({"name": "foo", "email": "foo@bar.mail"})
item.affiliations.append("foobar team")
item.user_groups.append({"name": "foogroup", "group_id": "123"})

Changes are sent to the DSMS through the `commit`-method again.

In [8]:
dsms.commit()

In [9]:
item

KItem(

	name = foobar, 

	id = 6e63e2f4-587b-46a2-871e-fa835d219ba8, 

	ktype_id = dataset, 

	in_backend = True, 

	slug = foo123-6e63e2f4, 

	annotations = [
		{
			iri: www.example.org/foo,
			name: foo,
			namespace: www.example.org
		}
	], 

	attachments = [
		{
			name: README.md
		}
	], 

	linked_kitems = [], 

	affiliations = [
		{
			name: foobar team
		}
	], 

	authors = [
		{
			user_id: 29367f49-0562-43f9-b0cd-62b662446cc9
		}
	], 

	avatar_exists = False, 

	contacts = [
		{
			name: foo,
			email: foo@bar.mail,
			user_id: None
		}
	], 

	created_at = 2024-05-13 11:52:26.331260, 

	updated_at = 2024-05-13 11:52:29.850682, 

	external_links = [
		{
			label: example link,
			url: http://example.org/
		}
	], 

	kitem_apps = [], 

	summary = None, 

	user_groups = [
		{
			name: foogroup,
			group_id: 123
		}
	], 

	custom_properties = {
		foo: bar
	}, 

	hdf5 = None, 

	rdf_exists = False
)

We can see now that e.g. the local system path of the attachment has been changed to a simple file name, which means that the upload was successful. Otherwise, an error would have been thrown during the `commit`.
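
The relation between what we passed in and what the committed KItem shows can be reproduced with plain Python. The sketch below only mirrors the values printed above (`../README.md` becomes `README.md`, and the annotation `www.example.org/foo` is shown with name `foo` and namespace `www.example.org`). The helpers `attachment_name` and `split_iri` are hypothetical, and splitting at the last `/` is our assumption, not a documented rule of the SDK.

```python
from pathlib import PurePosixPath


def attachment_name(local_path):
    """Return only the file name of a local path."""
    return PurePosixPath(local_path).name


def split_iri(iri):
    """Split an IRI at its last '/' into (namespace, name)."""
    namespace, _, name = iri.rpartition("/")
    return namespace, name


file_name = attachment_name("../README.md")
namespace, name = split_iri("www.example.org/foo")
```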

Furthermore, we can download the file we uploaded again:

In [10]:
for file in item.attachments:
    download = file.download()

    print("\t\t\t Downloaded file:", file)
    print("|------------------------------------Beginning of file------------------------------------|")
    print(download)
    print("|---------------------------------------End of file---------------------------------------|")

			 Downloaded file: {
			name: README.md
		}
|------------------------------------Beginning of file------------------------------------|
# DSMS-SDK
Python SDK core-package for interacting with the Dataspace Management System (DSMS)


## Authors

[Matthias Büschelberger](mailto:matthias.bueschelberger@iwm.fraunhofer.de) (Fraunhofer Institute for Mechanics of Materials IWM)

[Yoav Nahshon](mailto:yoav.nahshon@iwm.fraunhofer.de) (Fraunhofer Institute for Mechanics of Materials IWM)

[Pablo De Andres](mailto:pablo.de.andres@iwm.fraunhofer.de) (Fraunhofer Institute for Mechanics of Materials IWM)

## License

This project is licensed under the BSD 3-Clause. See the LICENSE file for more information.

## Usage

The SDK provides a general Python interface to a remote DSMS deployment, allowing users to access, store and link data in a DSMS instance easily and safely. The package provides the following main capabilities:

- Managing Knowledge-Items (KItems), which are data instances of an expl
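
Instead of printing the downloaded content, we may want to store it on disk. The following sketch assumes that `download()` returns the file content as text, as the printed output suggests. `save_download` is a hypothetical helper and not part of the SDK.

```python
from pathlib import Path


def save_download(content, target):
    """Write downloaded text to `target`, creating parent folders if needed."""
    path = Path(target)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return path
```

Inside the loop above, this could be called as `save_download(download, "downloads/README.md")`; the target path is an arbitrary example.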

### 4: Delete KItems and their properties

We can also remove properties from the KItem without deleting the KItem itself.

For the `list`-like properties, we can use the standard `list`-methods from basic Python again (e.g. `pop`, `remove`, etc. or the `del`-operator).

For the other, non-`list`-like properties, we can simply use the attribute-assignment again.
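
As a reminder of how these operations behave, here they are on a plain Python list of dictionaries. This is ordinary Python and independent of the SDK; the `list`-like KItem properties are meant to be used in the same way.

```python
groups = [{"name": "a"}, {"name": "b"}, {"name": "c"}, {"name": "d"}]

first = groups.pop(0)  # remove by position and return the removed element
groups.remove({"name": "c"})  # remove the first element equal to the argument
del groups[-1]  # delete by index without returning anything
```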

When we only want to remove single parts of a property in the KItem, we can do it like this:

In [11]:
item.attachments.pop(0)
item.annotations.pop(0)

{
			name: foogroup,
			group_id: 123
		}

However, we can also reset the entire property by setting it to e.g. an empty list again:

In [12]:
item.user_groups = []

Send the changes to the DSMS with the `commit`-method and inspect the updated KItem afterwards:

In [13]:
dsms.commit()

In [14]:
item

KItem(

	name = foobar, 

	id = 6e63e2f4-587b-46a2-871e-fa835d219ba8, 

	ktype_id = dataset, 

	in_backend = True, 

	slug = foo123-6e63e2f4, 

	annotations = [], 

	attachments = [], 

	linked_kitems = [], 

	affiliations = [
		{
			name: foobar team
		}
	], 

	authors = [
		{
			user_id: 29367f49-0562-43f9-b0cd-62b662446cc9
		}
	], 

	avatar_exists = False, 

	contacts = [
		{
			name: foo,
			email: foo@bar.mail,
			user_id: None
		}
	], 

	created_at = 2024-05-13 11:52:26.331260, 

	updated_at = 2024-05-13 11:52:29.850682, 

	external_links = [
		{
			label: example link,
			url: http://example.org/
		}
	], 

	kitem_apps = [], 

	summary = None, 

	user_groups = [], 

	custom_properties = {
		foo: bar
	}, 

	hdf5 = None, 

	rdf_exists = False
)

However, we can also delete the whole KItem from the DSMS by applying the `del`-operator to the `dsms`-object with the individual `KItem`-object:

In [15]:
del dsms[item]

Commit the changes:

In [16]:
dsms.commit()

### 5: Search for KItems

In the last unit of this tutorial, we would like to search for specific KItems we created in the DSMS.

For this purpose, we will first create some KItems and then apply the `search`-method on the `DSMS`-object in order to find them again in the DSMS.

We also want to demonstrate here that we can link KItems to each other in order to find e.g. a related item of type `DatasetCatalog`. To do so, we use the `linked_kitems`-attribute and the `id` of the item which we would like to link.

The procedure looks like this:

In [17]:
item = KItem(
    name="foo 1",
    ktype_id=dsms.ktypes.DatasetCatalog
)

item2 = KItem(
    name="foo 2",
    ktype_id=dsms.ktypes.Organization,
    linked_kitems=[item],
    annotations=["www.example.org/foo"]
)
item3 = KItem(
    name="foo 3", 
    ktype_id=dsms.ktypes.Organization
)
item4 = KItem(
    name="foo 4",
    ktype_id=dsms.ktypes.Organization,
    annotations=["www.example.org/bar"],
)

dsms.commit()

Now, we are able to search for e.g. KItems of type `DatasetCatalog`:

In [18]:
dsms.search(ktypes=[dsms.ktypes.DatasetCatalog])

[SearchResult(hit=KItem(
 
 	name = 3D-Blechmodelle2, 
 
 	id = 493a9075-c30c-48c2-b9d6-2da408f0ecda, 
 
 	ktype_id = dataset-catalog, 
 
 	in_backend = True, 
 
 	slug = 3d-blechmodelle2-493a9075, 
 
 	annotations = [], 
 
 	attachments = [], 
 
 	linked_kitems = [
 		{
 			id: aaabc3d4-d06e-4512-9fe3-0d01f815690e,
 			name: DX56_D_FZ2_WR00_43,
 			slug: dx56_d_fz2_wr00_43-aaabc3d4,
 			ktype_id: dataset,
 			summary: None,
 			avatar_exists: False,
 			annotations: [{
 			iri: https://w3id.org/steel/ProcessOntology/DX56D,
 			name: DX56D,
 			namespace: https://w3id.org/steel/ProcessOntology
 		}, {
 			iri: https://w3id.org/steel/ProcessOntology/TensileTest,
 			name: TensileTest,
 			namespace: https://w3id.org/steel/ProcessOntology
 		}],
 			linked_kitems: [{
 			id: 493a9075-c30c-48c2-b9d6-2da408f0ecda
 		}, {
 			id: 02db2d4a-e6a3-4187-95f3-923f058ded2c
 		}],
 			external_links: [],
 			contacts: [],
 			authors: [{
 			user_id: 29367f49-0562-43f9-b0cd-62b662446cc9
 		}],
 			

... and for all of type `Organization` and `DatasetCatalog`:

In [19]:
dsms.search(ktypes=[dsms.ktypes.Organization, dsms.ktypes.DatasetCatalog])

[SearchResult(hit=KItem(
 
 	name = 3D-Blechmodelle2, 
 
 	id = 493a9075-c30c-48c2-b9d6-2da408f0ecda, 
 
 	ktype_id = dataset-catalog, 
 
 	in_backend = True, 
 
 	slug = 3d-blechmodelle2-493a9075, 
 
 	annotations = [], 
 
 	attachments = [], 
 
 	linked_kitems = [
 		{
 			id: aaabc3d4-d06e-4512-9fe3-0d01f815690e,
 			name: DX56_D_FZ2_WR00_43,
 			slug: dx56_d_fz2_wr00_43-aaabc3d4,
 			ktype_id: dataset,
 			summary: None,
 			avatar_exists: False,
 			annotations: [{
 			iri: https://w3id.org/steel/ProcessOntology/DX56D,
 			name: DX56D,
 			namespace: https://w3id.org/steel/ProcessOntology
 		}, {
 			iri: https://w3id.org/steel/ProcessOntology/TensileTest,
 			name: TensileTest,
 			namespace: https://w3id.org/steel/ProcessOntology
 		}],
 			linked_kitems: [{
 			id: 493a9075-c30c-48c2-b9d6-2da408f0ecda
 		}, {
 			id: 02db2d4a-e6a3-4187-95f3-923f058ded2c
 		}],
 			external_links: [],
 			contacts: [],
 			authors: [{
 			user_id: 29367f49-0562-43f9-b0cd-62b662446cc9
 		}],
 			

... or for all of type `DatasetCatalog` with `foo` in the name:

In [20]:
dsms.search(query="foo", ktypes=[dsms.ktypes.DatasetCatalog])

[SearchResult(hit=KItem(
 
 	name = foo 1, 
 
 	id = c60cca3e-1d14-4e8e-ad2d-92bda7f975e5, 
 
 	ktype_id = dataset-catalog, 
 
 	in_backend = True, 
 
 	slug = foo1-c60cca3e, 
 
 	annotations = [], 
 
 	attachments = [], 
 
 	linked_kitems = [
 		{
 			id: 56791862-dd99-4fdb-984a-f9741c7cfdfe,
 			name: foo 2,
 			slug: foo2-56791862,
 			ktype_id: organization,
 			summary: None,
 			avatar_exists: False,
 			annotations: [{
 			iri: www.example.org/foo,
 			name: foo,
 			namespace: www.example.org
 		}],
 			linked_kitems: [{
 			id: c60cca3e-1d14-4e8e-ad2d-92bda7f975e5
 		}],
 			external_links: [],
 			contacts: [],
 			authors: [{
 			user_id: 29367f49-0562-43f9-b0cd-62b662446cc9
 		}],
 			linked_affiliations: [],
 			attachments: [],
 			user_groups: [],
 			custom_properties: None,
 			created_at: 2024-05-13T11:49:54.440708,
 			updated_at: 2024-05-13T11:49:54.440708
 		}
 	], 
 
 	affiliations = [], 
 
 	authors = [
 		{
 			user_id: 29367f49-0562-43f9-b0cd-62b662446cc9
 		}


... and for all of type `Organization` with the annotation `www.example.org/foo`:

In [21]:
dsms.search(
    ktypes=[dsms.ktypes.Organization],
    annotations=["www.example.org/foo"],
)

[SearchResult(hit=KItem(
 
 	name = foo 2, 
 
 	id = 56791862-dd99-4fdb-984a-f9741c7cfdfe, 
 
 	ktype_id = organization, 
 
 	in_backend = True, 
 
 	slug = foo2-56791862, 
 
 	annotations = [
 		{
 			iri: www.example.org/foo,
 			name: foo,
 			namespace: www.example.org
 		}
 	], 
 
 	attachments = [], 
 
 	linked_kitems = [
 		{
 			id: c60cca3e-1d14-4e8e-ad2d-92bda7f975e5,
 			name: foo 1,
 			slug: foo1-c60cca3e,
 			ktype_id: dataset-catalog,
 			summary: None,
 			avatar_exists: False,
 			annotations: [],
 			linked_kitems: [{
 			id: 56791862-dd99-4fdb-984a-f9741c7cfdfe
 		}],
 			external_links: [],
 			contacts: [],
 			authors: [{
 			user_id: 29367f49-0562-43f9-b0cd-62b662446cc9
 		}],
 			linked_affiliations: [],
 			attachments: [],
 			user_groups: [],
 			custom_properties: None,
 			created_at: 2024-05-13T11:49:53.852559,
 			updated_at: 2024-05-13T11:49:53.852559
 		}
 	], 
 
 	affiliations = [], 
 
 	authors = [
 		{
 			user_id: 29367f49-0562-43f9-b0cd-62b662446cc
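
Since the printed search results are long, it can be handy to reduce them to the KItems we are interested in. The printed form `SearchResult(hit=KItem(...))` suggests that each result exposes the KItem as `hit`. The sketch below relies on that and on the `name` attribute only. `hits_by_name` is a hypothetical helper, and the objects in `fake_results` are stand-ins so that the example runs without a DSMS connection.

```python
from types import SimpleNamespace


def hits_by_name(results):
    """Map each hit's name to the hit object, reading `result.hit.name`."""
    return {result.hit.name: result.hit for result in results}


# Stand-in objects shaped like the printed `SearchResult(hit=KItem(...))`.
fake_results = [
    SimpleNamespace(hit=SimpleNamespace(name="foo 1", slug="foo1-c60cca3e")),
    SimpleNamespace(hit=SimpleNamespace(name="foo 2", slug="foo2-56791862")),
]
found = hits_by_name(fake_results)
```

With a live session, the same helper could be applied to the return value of `dsms.search(...)`. Note that later results overwrite earlier ones if two hits share a name.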

Clean up the DSMS after the tutorial:

In [22]:
del dsms[item]
del dsms[item2]
del dsms[item3]
del dsms[item4]

dsms.commit()


### 6: Apps

We can investigate which apps are available through JupyterLab:

In [23]:
dsms.apps

[App(filename='dsms-app-initiator/csv_bulgetest/csv_bulgetest.ipynb', basename='csv_bulgetest.ipynb', folder='dsms-app-initiator/csv_bulgetest'),
 App(filename='dsms-app-initiator/csv_tensile_test/csv_tensile_test.ipynb', basename='csv_tensile_test.ipynb', folder='dsms-app-initiator/csv_tensile_test'),
 App(filename='dsms-app-initiator/csv_tensile_test_f2/csv_tensile_test_f2.ipynb', basename='csv_tensile_test_f2.ipynb', folder='dsms-app-initiator/csv_tensile_test_f2'),
 App(filename='dsms-app-initiator/excel_component_test/excel_component_test.ipynb', basename='excel_component_test.ipynb', folder='dsms-app-initiator/excel_component_test'),
 App(filename='dsms-app-initiator/excel_nakajima_test/excel_nakajima_test.ipynb', basename='excel_nakajima_test.ipynb', folder='dsms-app-initiator/excel_nakajima_test'),
 App(filename='dsms-app-initiator/excel_notch_test/excel_notch_tensile_test.ipynb', basename='excel_notch_tensile_test.ipynb', folder='dsms-app-initiator/excel_notch_test'),
 App(fil

### 7: HDF5

We are also able to upload dataframes or time series data and investigate them:

In [24]:
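# Two columns with 100 values each
data = {"a": list(range(100)), "b": list(range(1, 101))}

# Create the KItem with the column data attached and upload it
item = KItem(name="testdata1234", ktype_id=dsms.ktypes.DatasetCatalog, hdf5=data)
dsms.commit()

# Iterate over the uploaded columns one by one
print("Column-wise:")
for column in item.hdf5:
    print("column:", column.name, ",\n", "data:", column.get())

# Retrieve all columns at once as a data frame
df = item.hdf5.to_df()
print("\nAs data frame:")
print(df)

# Drop column "a" and overwrite the stored data with the reduced data frame
new_df = df.drop(["a"], axis=1)
item.hdf5 = new_df

dsms.commit()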

Column-wise:
column: a ,
 data: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
column: b ,
 data: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]

As data frame:
     a    b
0    0    1
1    1    2
2    2    3
3    3    4
4    4    5
..  ..  ...
95  95   96
96  96   97
97  97   98
98  98   99
99  99  100

[100
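
Before passing a dictionary of columns as `hdf5`, we may want to check that all columns have the same length, since tabular data is usually expected to be rectangular. That requirement is our assumption and is not stated by the SDK. `check_columns` is a hypothetical helper.

```python
def check_columns(data):
    """Return the common column length; raise ValueError if lengths differ."""
    lengths = {name: len(values) for name, values in data.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"columns differ in length: {lengths}")
    return next(iter(lengths.values()), 0)


data = {"a": list(range(100)), "b": list(range(1, 101))}
rows = check_columns(data)
```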

In [25]:
del dsms[item]

In [26]:
dsms.commit()