Hierarchy, folders, authorization and nested (parent-child) relationships #18

Closed
@joepio

This is going to be a mess of various thoughts regarding hierarchies - sorry for the lack of structure

On desktops, we generally use folders and sub-folders for various goals:

  1. Identification: the path of a file is often also its ID - it tells the machine where the file can be found
  2. Authorization: setting rights - whether a user or program has read / write access
  3. Categorization: grouping related stuff together.
  4. Disk space management: calculating size / identifying data usage culprits.
  5. Navigation & Intuition: useful for humans for finding some file about some subject.

If Atomic Data is to be just as useful as a Unix filesystem or a Google Drive, we need to find solutions for the earlier mentioned five goals, too.

  1. Identification: The ID is obviously the Subject of the resource. This is the URL. There are no real restrictions on Atomic URLs - as long as they resolve using HTTP(S) or IPFS. If a resource moves and changes its URL, the owner must redirect to the new location.
  2. Authorization: Doing authorization without some hierarchical model seems almost impossible, although a multi-parent model could work fine (if it's additive).
  3. Categorization: We could use a tag-like system for this.
  4. Disk space management: When counting things
  5. Navigation & Intuition: useful for humans for finding some file about some subject.

So let's check out some approaches to hierarchy

Approaches to hierarchy

Classic Unix-style file/folder model

This is the one we're most familiar with. Each file has a path, and only one parent.
The five goals in the intro describe what we use these paths for.
That's a lot of responsibilities for a single string, but that has some merits:

  • Simple mental model. It's easy to reason about where a file is and who has rights.

But it also causes issues:

  • Changing IDs. If we move a file, we can no longer find it by its ID. That doesn't fare well on the web, where we want Cool URIs. Google Drive solves this by using UUIDs in URLs, instead of paths. This leads to ugly, nontransparent URLs, though.
  • Having a single parent limits how we can organize a file. For example, you might want to have a picture inside your personal vacation 2019 folder, but also on your public timeline. You could copy it, but that wastes space and makes it harder to manage items from one place.

Nested tagging

In this approach, resources can be linked to Tags. A Tag is like a parent folder, but it's a 1-n relationship: every resource can have multiple Tags, contrary to how folders work in most systems. Google Drive does use a tag-like model, though. A folder can be placed in multiple places.

Tags can be nested, like folders. Should circular tags be possible? If they are, then implementations need to be very much aware of this, to prevent getting stuck in some loop.
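To make the loop concern concrete, here is a minimal sketch of walking up nested tags while remembering which ones were already visited. The `PARENT_TAGS` mapping and the tag names are made up for illustration; nothing here is an existing Atomic Data API.

```python
from collections import deque

# Hypothetical data: each tag maps to its parent tags.
# "a" and "b" deliberately point at each other to form a cycle.
PARENT_TAGS = {
    "holiday-2019": ["personal", "photos"],
    "personal": ["root"],
    "photos": ["root"],
    "root": [],
    "a": ["b"],
    "b": ["a"],
}


def ancestor_tags(tag, parents=PARENT_TAGS):
    """Return every tag reachable via parent links, visiting each one once."""
    seen = set()
    queue = deque(parents.get(tag, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited: this is what stops a circular chain
        seen.add(current)
        queue.extend(parents.get(current, []))
    return seen
```

With the `seen` set, a circular pair like "a" and "b" terminates; note that in a cycle a tag ends up being its own ancestor, which an implementation would have to decide how to treat.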

These tags could be easily used for authorization. If you want to find out if an agent can read something, check the tag of the resource. Then, check the agents-read-access property (an array of Agents), and check if the requester is present in that array. If that is not the case, check the groups-read-access property (an array of Groups), and check if the requester is present in that group.
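The check described above could look roughly like this for a single tag. The `Tag` and `Group` classes are assumptions for the sake of the example; only the two property names (agents-read-access and groups-read-access) come from the text, and agents are represented as plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    # Hypothetical shape: a group is just a set of agent identifiers.
    members: set = field(default_factory=set)


@dataclass
class Tag:
    agents_read_access: list = field(default_factory=list)  # array of Agents
    groups_read_access: list = field(default_factory=list)  # array of Groups


def can_read(tag, requester):
    """Check the agent list first, then fall back to group membership."""
    if requester in tag.agents_read_access:
        return True
    return any(requester in group.members for group in tag.groups_read_access)
```

For example, with `Tag(agents_read_access=["bob"], groups_read_access=[Group(members={"alice"})])`, both "bob" and "alice" can read, while any other agent cannot.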

We could use an additive rights model for authorization. Check all the tags of a resource (and all parent tags of these tags) to find out whether the user has the correct rights. If the correct rights are present in any of the tags, you're good to go.
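A rough sketch of the additive model, under the assumption that every tag is a dict with a `parents` list and a flat `readers` set (groups are left out to keep it short). The data layout and function name are hypothetical.

```python
def has_read_access(resource_tags, requester, tags):
    """Additive check: access granted by ANY tag or ancestor tag is enough.

    `tags` maps a tag name to {"parents": [...], "readers": set(...)}.
    """
    seen = set()
    stack = list(resource_tags)
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # guards against circular tags
        seen.add(name)
        tag = tags[name]
        if requester in tag["readers"]:
            return True  # one grant anywhere is sufficient
        stack.extend(tag["parents"])
    return False
```

Because rights only ever add up, the search can stop at the first tag that grants access, and the order in which tags are visited does not change the result.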

Folders are often also used for calculating disk space. This might be a bit harder with tags - you don't want to count an item with multiple tags once in each parent. So how do you decide how big a tag is, in bytes? One solution is to designate a 'main' tag, which perhaps is simply the first tag. Only that one is counted.
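As a sketch of the 'main tag' idea, assuming each resource is a `(size_in_bytes, tags)` pair and the first tag in the list is the main one. It only sums direct sizes per tag and does not roll sizes up into parent tags.

```python
def tag_sizes(resources):
    """Attribute each resource's bytes to its first ('main') tag only."""
    sizes = {}
    for size, tags in resources:
        if not tags:
            continue  # untagged resources are not attributed to any tag
        main = tags[0]
        sizes[main] = sizes.get(main, 0) + size
    return sizes
```

A picture tagged with both a vacation folder and a public timeline is then counted once, under whichever tag comes first, so the per-tag totals add up to the real usage of all tagged resources.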
