This repository has been archived by the owner on Dec 6, 2022. It is now read-only.

Propose symlinks #16

Closed
wants to merge 1 commit into from

@warpfork warpfork commented Oct 4, 2018

UnixfsV2 should have a way to encode symlinks.

Symlinks are a very commonly used and critical part of most unixy filesystems.

Symlinks are conceptually simple: at heart, they're just a string. Kernels treat them that way: `readlink` yields a string, and creating a symlink simply takes a string. The content of the string is never validated in advance, only evaluated when the link is actually dereferenced, so we can do the same and treat it as an opaque string.
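
A minimal sketch of that behaviour, assuming a POSIX system where creating symlinks needs no special privileges. The target strings are made up for illustration: one is dangling and the others contain oddities that are stored without being normalized.

```python
import os
import tempfile

# Illustrative targets: a dangling path plus strings with un-normalized oddities.
targets = ["does/not/exist", ".//a/../b", "/absolute//path/"]

roundtrip = {}
with tempfile.TemporaryDirectory() as tmp:
    for i, target in enumerate(targets):
        link = os.path.join(tmp, f"link{i}")
        # Creating the link does not check that the target exists or is well formed.
        os.symlink(target, link)
        # Reading it back returns the stored string without resolving it.
        roundtrip[target] = os.readlink(link)
```

Each value in `roundtrip` should equal its key, since nothing between creation and `readlink` interprets the string.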

Permissions of symlinks are not yet mentioned in this text; it should probably be updated later to use whatever we choose for files and dirs (issue #14 discusses this).

@warpfork warpfork requested a review from mikeal October 4, 2018 19:09

A symlink object has the following fields:

- `type`: String with the value of `'sym'`.

Using the type field for differentiation was also suggested by @Stebalien at some point. I like this approach as well: it makes it easier to opt out of supporting special types that you haven't implemented.
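
A rough sketch of what dispatching on the `type` field could look like for a reader that only understands some node types. Only `type` with the value `'sym'` comes from the proposal text quoted above; the `target` field name and the `describe` function are hypothetical.

```python
def describe(node: dict) -> str:
    """Dispatch on the `type` field and reject anything not implemented."""
    kind = node.get("type")
    if kind == "sym":
        # `target` is a hypothetical field name for the opaque link string.
        return f"symlink -> {node['target']}"
    # A reader that has not implemented a type can bail out here without
    # having to inspect any of the node's other fields.
    raise NotImplementedError(f"unsupported node type: {kind!r}")
```

For example, `describe({"type": "sym", "target": "../x"})` returns a description, while a node with any other `type` raises `NotImplementedError`.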

kevina commented Oct 4, 2018

@warpfork @mikeal we might want to store some additional data with the symlink, such as the CID of the target. See ipfs/kubo#5421 (comment).

mikeal commented Oct 4, 2018

> @warpfork @mikeal we might want to store some additional data with the symlink, such as the CID of the target. See ipfs/kubo#5421 (comment).

How would we get the CID if it's a circular reference to a directory up the tree? If a parent directory is the target of the symlink, we can't actually create its block until we have the complete block for the subdirectory that contains the circular ref.

kevina commented Oct 4, 2018

@mikeal I would think a circular reference is something to be avoided. In that case if we want to support it we simply won't store the CID, I think. Please see the original discussion as I am not the one who proposed this.

mikeal commented Oct 4, 2018

@kevina read through the original thread and commented.

warpfork commented Oct 4, 2018

@mikeal I read and thumbs-up'd your comment on the 5421 thread as well.

Symlinks are symlinks. They're strings. We're spec'ing how to store unixy filesystems. Unixy filesystems store these as strings. This should be pretty open and shut.

People create dangling symlinks all the time. Often intentionally. Entangling CIDs with this would make all of these common occurrences harder to represent, not clearer. And I don't even want to touch the circularity idea. We can just not have any of that by not trying to make symlinks something that they aren't.
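
To illustrate why string-only symlinks sidestep the circularity problem, here is a toy sketch of bottom-up content hashing. It is not the real CID or IPLD encoding: `toy_hash`, the JSON serialization, and the `entries`/`target` field names are all assumptions made for the example.

```python
import hashlib
import json

def toy_hash(node: dict) -> str:
    """Hash a node bottom-up; directories embed their children's hashes."""
    if node["type"] == "dir":
        entries = {name: toy_hash(child) for name, child in node["entries"].items()}
        payload = {"type": "dir", "entries": entries}
    else:
        # Symlink: only the opaque string is hashed; the target is never looked up.
        payload = {"type": "sym", "target": node["target"]}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

tree = {
    "type": "dir",
    "entries": {
        "sub": {
            "type": "dir",
            "entries": {
                "up": {"type": "sym", "target": ".."},              # points at an ancestor
                "gone": {"type": "sym", "target": "no/such/file"},  # dangling
            },
        },
    },
}
root_hash = toy_hash(tree)
```

The link to `..` and the dangling link both hash without trouble because nothing needs the hash of whatever they point at. If the symlink had to embed its target's hash instead, `up` would need the hash of a directory that cannot be hashed until `up` itself is.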

warpfork commented Oct 4, 2018

For another point of reference: git can also store symlinks. When it does so, they're a string. It will faithfully preserve all oddities of that string: it won't normalize things, it won't have opinions about whether it starts with a slash or a dot or any other character, it won't trim "../", and it won't normalize ".//".

Pretty much every tool that's aware of symlinks to some degree seems to be united on this stance: transport the string and let the filesystem and the kernel (and anyone else who's aware of symlinks and reads them as a string) sort it out.
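
As a small illustration of the git point: git records a symlink as a blob whose content is the target string (the tree entry carries mode `120000`), so the object id depends on the exact bytes of that string. The sketch below computes SHA-1 blob ids by hand; `git_blob_id` is a name invented for this example.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Object id git assigns to a blob with this content (SHA-1 object format)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# Two link targets that a filesystem would resolve identically...
plain = git_blob_id(b"a/b")
doubled = git_blob_id(b"a//b")
# ...are still distinct objects, because the string is stored unmodified.
```

`plain` and `doubled` differ, which is exactly the "transport the string" behaviour described above.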

@Stebalien

So, considering @mikeal's comment, I tend to agree: just leave symlinks alone (although we do need to fix symlink resolution). Really, proper symlink resolution, along with encouraging users to use full paths instead of short-cutting `/ipfs/QmA/b/c/d` to `/ipfs/QmD`, would fix most of the issues here.

@AndreaCensi

@Stebalien Just a thought: I agree with @warpfork's assessment of how symlinks are treated in git and similar contexts (as opaque strings). However, in IPLD you can "get it right" instead.

I am thinking of something like JSON Pointer, which solves the problem "address members of a hierarchical structure, possibly in a circular way". The IPLD data model is compatible with JSON Pointer so it could be a simple drop in.

Otherwise we could end up with IPFS structures with symlinks represented in different ways (according to the conventions of the OS). Therefore we will have:

  1. Compatibility issues: Each IPFS tool will have to know about multiple conventions to represent symlinks.
  2. Loss of "canonicalization": There are now different hashes that represent the same structure.

@Stebalien

We effectively use JSON Pointers in IPLD (except we simply ban `/` in keys instead of providing an encoding scheme, for now).
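
For readers unfamiliar with JSON Pointer (RFC 6901), here is a minimal sketch of a resolver over plain Python dicts and lists. It shows the escaping scheme (`~1` for `/`, `~0` for `~`) that a format avoids needing if it bans `/` in keys. The `resolve_pointer` name and the sample document are made up for illustration, and this is not how IPLD path resolution is implemented.

```python
def resolve_pointer(doc, pointer: str):
    """Resolve an RFC 6901 JSON Pointer against nested dicts and lists."""
    if pointer == "":
        return doc  # the empty pointer names the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    node = doc
    for raw in pointer[1:].split("/"):
        # Unescape ~1 before ~0 so that "~01" decodes to "~1", not "/".
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            node = node[int(token)]
        else:
            node = node[token]
    return node

doc = {"dir": {"a/b": "escaped key", "files": ["x", "y"]}}
```

With this document, the pointer `/dir/a~1b` reaches the key that contains a slash, and `/dir/files/1` indexes into the list.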

For (1), we should probably lay out recommendations on how IPFS tooling should create symlinks (from scratch). But this issue is mostly about importing existing symlinks.

For (2), the content is actually different. Symlinks are, for better or worse, text files with a special flag. If we simply say "import the symlink unmodified", every implementation will import the exact same data. If we did any canonicalization, I'd actually expect implementations to end up with different hashes (nobody will get it quite right).

@Stebalien

Jumping on the "we need symlinks" bandwagon, this would give us easy mutability within IPFS (symlink to `/ipns/Qm.../something`). We could, alternatively, introduce a special "mutable mount" type, but symlinks are the most general approach.

@AndreaCensi

Interesting discussion...

One question: how much variety is there for symlinks in UNIX?

If we were to store a symlink as text that should be a UNIX file path, perhaps the "canonicalization" is easier than I thought. It would essentially amount to canonicalizing `a//b//c` as `a/b/c`, `/a/../b` as `/b`, etc.
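
A quick sketch of what such lexical canonicalization could look like, using Python's `posixpath.normpath` as a stand-in for whatever rule a spec might define. The sample targets and the `digest` helper are illustrative only. Note that `normpath` collapses `..` purely textually, which can change what a path means if an intermediate component is itself a symlink.

```python
import hashlib
import posixpath

raw_targets = ["a//b//c", "/a/../b", "./a/"]

# Purely lexical normalization; the filesystem is never consulted.
normalized = {t: posixpath.normpath(t) for t in raw_targets}

def digest(s: str) -> str:
    """Hash of the link string, standing in for a content address."""
    return hashlib.sha256(s.encode("utf-8")).hexdigest()

# Targets whose content address would change if an importer normalized them.
changed = [t for t in raw_targets if digest(t) != digest(normalized[t])]
```

All three sample targets end up in `changed`, so an importer that normalizes and one that stores the string unmodified would produce different hashes for the same on-disk link.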

@mikeal mikeal mentioned this pull request Aug 8, 2019
rvagg commented Dec 6, 2022

closing for archival

@rvagg rvagg closed this Dec 6, 2022
6 participants