Tar: configurable symbolic link handling

Continues from #74404.

The original issue asked for hardlink detection and for an option controlling whether hardlinks are stored as regular files (copies) when writing, with a similar choice during extraction. This was implemented in https://github.com/dotnet/runtime/pull/123874.

Similar functionality can be envisioned for symbolic links, and the following API shape was approved in https://github.com/dotnet/runtime/issues/74404#issuecomment-4120743237:

```csharp
namespace System.Formats.Tar
{
    public enum TarSymbolicLinkMode
    {
        PreserveLink,
        CopyContents,
        Skip,
    }

    public sealed partial class TarWriterOptions
    {
        public TarSymbolicLinkMode SymbolicLinkMode { get; set; } = TarSymbolicLinkMode.PreserveLink;
    }

    public sealed partial class TarExtractOptions
    {
        public TarSymbolicLinkMode SymbolicLinkMode { get; set; } = TarSymbolicLinkMode.PreserveLink;
    }
}
```

However, some problems and uncertainties arose during implementation; see https://github.com/dotnet/runtime/issues/74404#issuecomment-4143344219.

For files it is pretty straightforward; the headache starts with directory symlinks and `CopyContents`.

> GNU Tar will traverse directory symlinks as if they were normal directories (with the `-h` flag), but that is not straightforward to replicate in .NET because:
>
> - The `TarSymbolicLinkMode` is set at the `TarWriter` level, so it should apply during a call to `TarWriter.WriteEntry(fileName, entryName)`.
> - For non-symlinks, when we pass a directory to `TarWriter.WriteEntry`, we emit one entry (which results in an empty directory during extraction). Recursing here and writing multiple entries for a directory symlink introduces an inconsistency, and writing only a directory entry does not round-trip with respect to users' expectations when using `TarFile.CreateFromDirectory`.
> - The alternative is to implement this link traversal in `TarFile.CreateFromDirectory`, but then the `TarSymbolicLinkMode` should be on "TarCreationOptions" (which we decided to omit from the proposal).
> - What's more, during extraction, when we encounter a symlink entry, the target file or directory does not necessarily exist yet (it may appear later in the archive), so we don't know which contents to copy. We probably need to postpone these entries until all other entries are extracted.
>
> And don't even start with loops created by symbolic links. Detecting these during creation is not complicated, but doing so during extraction is another level (preventing infinite recursion vs. finding a loop in a directed graph).
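
To make the extraction-side concern concrete, here is a minimal sketch of how deferred symlink entries could be resolved once every other entry has been read. It is only an illustration under simplifying assumptions: `DeferredSymlinkResolver` and `ResolveFinalTarget` are hypothetical names that are not part of `System.Formats.Tar`, entry names and link targets are assumed to be already normalized to archive-relative paths, and links are matched by whole path only (a symlink appearing as an intermediate directory component is not handled).

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not an API of System.Formats.Tar.
public static class DeferredSymlinkResolver
{
    // links:          symlink entry name -> link target (both archive-relative, normalized).
    // regularEntries: names of the non-link entries that were already extracted.
    // Returns the non-link entry the chain ends at, or null when the chain
    // loops back on itself or ends at a name that is not in the archive.
    public static string? ResolveFinalTarget(
        string entryName,
        IReadOnlyDictionary<string, string> links,
        IReadOnlySet<string> regularEntries)
    {
        var visited = new HashSet<string>(StringComparer.Ordinal);
        string current = entryName;

        // Follow the chain of links one hop at a time.
        while (links.TryGetValue(current, out string? next))
        {
            // Seeing the same link twice means the chain is a cycle.
            if (!visited.Add(current))
            {
                return null;
            }

            current = next;
        }

        // The chain ended at a non-link name; it is usable only if that entry exists.
        return regularEntries.Contains(current) ? current : null;
    }
}
```

With this shape, an extractor in `CopyContents` mode would collect symlink entries into `links` while extracting everything else, then resolve each one at the end and copy from the returned entry. What to do when the result is `null` (skip, throw, or fall back to creating the link) is a policy question this sketch leaves open.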

Tar: configurable symbolic link handling #126404

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Tar: configurable symbolic link handling #126404

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions