Quick start

Note that datasets placed on the testing storage locations used in this guide are globally readable by anyone, cannot be deleted once deposited, are not backed up, and may disappear at any time.

Reporting issues

Please report any issues with the dtool lookup GUI at https://github.com/livMatS/dtool-lookup-gui/issues, any issues with this documentation at https://github.com/livMatS/RDM-Wiki-public/issues, or send an e-mail to data@livmats.uni-freiburg.de.

When reporting an issue, please include information on your OS, the version information available in the GUI's About dialog, and debug log output if possible. The dtool-lookup-gui offers a logging window. Open it by clicking Logging in the "burger" menu. Switch the log level to DEBUG, repeat the actions that lead to the problem you want to report, save the logging output to a file, and attach that log when reporting the issue. Thank you.

Set-up

Navigate to https://github.com/livMatS/dtool-lookup-gui/releases and download the latest release zip file containing the dtool lookup GUI for your OS. Minimum requirements are macOS 10.15, Windows 10, Ubuntu 20.04, or a comparably recent Linux distribution. Unpack the zip archive and launch the application. When you launch the GUI for the first time, it may look quite empty:

Open the main menu by clicking the burger menu in the upper right corner and select settings.

Download the sample configuration for a testing server instance at dtool.json.

Import it via the import option in the settings dialog and select the downloaded file.

The imported settings will then appear in the dialog.

dtool uses cacheable tokens to facilitate authentication against the lookup server. Click on renew token to fetch such a token. Authenticate with your username and password.

For the testing configuration, it's testuser and test_password. The generated token appears in the settings dialog.

You won't have to authenticate again until the cached token loses its validity.

After importing the configuration and closing the settings dialog, you will find these settings stored within ~/.config/dtool/dtool.json below your user's home folder. The GUI will list two new base URIs on the left-hand side:

The prefix indicates the type of protocol used to communicate with the underlying storage infrastructure. s3 points to S3-compatible object storage like Amazon S3, smb points to network storage better known as Windows shares. The testing server instance offers s3://test-bucket and smb://test-share to play around with. Browse those locations by selecting them.

The first entry in the list plays a special role. Here you can see and search through all the datasets that have been indexed by the lookup server:

In the central column you see the list of datasets. On the right-hand side you see a few buttons, the Details, Manifest and Dependencies tabs, and below them the fixed administrative metadata and the editable descriptive metadata. The latter is shown as YAML-highlighted text.

Add a local base URI

Add a local folder to the list of base URIs by clicking on the folder icon in the upper left corner and selecting the desired location.

To distinguish them from other (remote) endpoints, local base URIs come with the file:// prefix.
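The role of the prefix can be illustrated with a few lines of Python. This is only a sketch using the standard library, not part of the GUI; the function name `storage_kind` and the local path `/home/user/datasets` are hypothetical and chosen for illustration.

```python
from urllib.parse import urlsplit

# Descriptions of the prefixes mentioned in this guide.
STORAGE_KINDS = {
    "s3": "S3-compatible object storage",
    "smb": "network share (Windows share)",
    "file": "local folder",
}


def storage_kind(base_uri: str) -> str:
    """Return a short description of the storage behind a base URI."""
    scheme = urlsplit(base_uri).scheme
    return STORAGE_KINDS.get(scheme, f"unknown scheme {scheme!r}")


# The first two URIs come from the testing configuration,
# the third one is a made-up local base URI.
for uri in ("s3://test-bucket", "smb://test-share", "file:///home/user/datasets"):
    print(uri, "->", storage_kind(uri))
```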

Copy a dataset from remote to local

Now, copy a dataset from a remote location to your local machine. Select a dataset on the s3-endpoint and download it by choosing your local folder from the copy-button's drop-down menu:

The dataset will appear at your local base URI.

Notice the dataset URI entry in the administrative metadata.

The Manifest lists all items contained within the dataset. Click on Show to explore the dataset with the local file system browser.

Create a dataset

Download the README.yml template.

Adapt it to your needs in a text editor.

Point the GUI to this template at the bottom of the settings dialog, or just open your ~/.config/dtool/dtool.json in a text editor and set the DTOOL_README_TEMPLATE_FPATH entry to point to your README.yml template.
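If you prefer a script over a text editor, the same change might look like the following sketch. It assumes that dtool.json is a flat JSON object of key-value pairs; the helper name `set_readme_template` and the template location are hypothetical. The demonstration works on a scratch copy in a temporary directory, so your real configuration stays untouched.

```python
import json
import tempfile
from pathlib import Path


def set_readme_template(config_path: Path, template_path: Path) -> dict:
    """Set DTOOL_README_TEMPLATE_FPATH in a dtool.json file, keeping all other entries."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config["DTOOL_README_TEMPLATE_FPATH"] = str(template_path)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demonstration on a scratch file instead of ~/.config/dtool/dtool.json.
with tempfile.TemporaryDirectory() as tmp:
    scratch_config = Path(tmp) / "dtool.json"
    template = Path(tmp) / "README.yml"  # hypothetical template location
    print(set_readme_template(scratch_config, template))
```

To apply it to your actual configuration, pass `Path.home() / ".config" / "dtool" / "dtool.json"` as the first argument.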

Specify your name and e-mail address as well. Next, create a new dataset by selecting your local base URI and clicking the '+' icon in the upper left corner:

Pick a name and notice the new entry in the list of datasets.

You see a new UUID in bold assigned to the freshly created dataset. This is an important concept. No matter how your dataset is stored, how it's moved around, or how many copies of it are created, this Universally Unique IDentifier will stay with your dataset over its whole lifetime. No other dataset will ever own the same UUID. It hence serves as a persistent identifier, an important building block for implementing the FAIR principles ¹.
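To get a feeling for what such an identifier looks like, the following sketch generates one with Python's standard library. It assumes a randomly generated (version 4) UUID, which this guide does not state explicitly; the variable name `dataset_uuid` is illustrative only.

```python
import uuid

# Generate a random (version 4) UUID, comparable to the identifier shown in the GUI.
dataset_uuid = uuid.uuid4()
print(dataset_uuid)

# The canonical text form has 36 characters: 32 hex digits in five hyphen-separated groups.
print(len(str(dataset_uuid)))

# The text form can be parsed back without loss, which is what makes it
# usable as a persistent identifier across copies and storage locations.
print(uuid.UUID(str(dataset_uuid)) == dataset_uuid)
```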

The UUID is prefixed by an asterisk '*' to mark it as a ProtoDataset. If configured correctly, the README.yml template should appear as descriptive metadata for the fresh dataset with some placeholders automatically filled in.

Enable the metadata editing switch at the bottom and fill in some more descriptive metadata in YAML format.

Add items to your dataset and freeze it, confirming the warning.

Freezing makes the dataset immutable. The ProtoDataset turns into a Dataset and the asterisk disappears. Altering the content is now forbidden. You may inspect the manifest and explore the contents with your file system browser.

Structure of a dataset

The dataset's top level holds the README.yml file as well as the data and .dtool directories:

The README.yml just contains what you have entered as descriptive metadata in YAML-formatted text:

The data directory holds all items:

The .dtool directory contains administrative and structural metadata distributed into several small files.

It is designed to be both machine-processable and human-readable. As such, it holds a README.txt describing the meaning of all items within:

The manifest.json records the size and checksum of every item at the time of freezing, making any illegitimate tampering with the items of the frozen dataset immediately noticeable:

For more information on the structure of a dataset, refer to the software authors' publication ².
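The idea behind recording sizes and checksums can be sketched in a few lines of Python. This is a simplified illustration, not dtool's implementation: the dictionary layout, the choice of SHA-256, and the function names `build_manifest` and `changed_items` are assumptions made for this example and do not reproduce the actual manifest.json schema.

```python
import hashlib
import tempfile
from pathlib import Path


def file_checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict:
    """Map each item's relative path to its size and checksum."""
    return {
        path.relative_to(data_dir).as_posix(): {
            "size_in_bytes": path.stat().st_size,
            "hash": file_checksum(path),
        }
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def changed_items(data_dir: Path, manifest: dict) -> list:
    """Return the relative paths that were added, removed, or modified."""
    current = build_manifest(data_dir)
    return sorted(
        relpath
        for relpath in set(manifest) | set(current)
        if manifest.get(relpath) != current.get(relpath)
    )


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    data_dir.mkdir()
    (data_dir / "results.txt").write_text("42\n", encoding="utf-8")

    manifest = build_manifest(data_dir)        # snapshot taken at "freeze" time
    print(changed_items(data_dir, manifest))   # prints []

    (data_dir / "results.txt").write_text("43\n", encoding="utf-8")
    print(changed_items(data_dir, manifest))   # prints ['results.txt']
```

Note that the modified file has the same size as before, so only the checksum reveals the change.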

Search for a dataset

Copy your frozen dataset to s3://test-bucket and confirm it's there.

Depending on the server's configuration, it will register new datasets immediately or only at certain time intervals. Once that has happened, the dataset will appear in the lookup server's dataset list.

The lookup server makes the dataset discoverable by its administrative and descriptive metadata. A search query may be plain text that targets any content of the README.yml, or it may be formulated more specifically to target certain fields of the README.yml.

To understand more about the possibilities for sophisticated querying, continue reading Finding a dataset.

Browse repository from anywhere

livMatS offers a simple web app to explore the content of s3://test-bucket in your browser. Visit https://livmats-data.vm.uni-freiburg.de:4443 and log in with testuser and test_password. You should see a few datasets, among them your creation:

Footnotes

¹ M. D. Wilkinson et al., “The FAIR Guiding Principles for scientific data management and stewardship,” Scientific Data, vol. 3, no. 1, Art. no. 1, Mar. 2016, doi: 10.1038/sdata.2016.18.
² T. S. G. Olsson and M. Hartley, “Lightweight data management with dtool,” PeerJ, vol. 7, p. e6562, Mar. 2019, doi: 10.7717/peerj.6562.
