Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Call tarfile.open with dereference=True to handle hard links. #91

Closed
wants to merge 1 commit into from

Conversation

@fbrosson
Copy link
Member

fbrosson commented Jun 30, 2016

  • Utils.py: Use dereference=True (instead of tarinfo=MyTarInfo) when calling tarfile.open to handle archives with hard links.
  • SourceFetcher.py: Use --hard-dereference when calling tar.
* Utils.py: Use dereference=True (instead of tarinfo=MyTarInfo)
  when calling tarfile.open to handle archives with hard links.
* SourceFetcher.py: Use --hard-dereference when calling tar.
@fbrosson
Copy link
Member Author

fbrosson commented Jun 30, 2016

Should I also revert the rest of the commit mentionned in #46 (comment) by dropping the bloc that defines class MyTarInfo?

@waddlesplash
Copy link
Member

waddlesplash commented Jun 30, 2016

I'm too uncertain of this code to merge this, sorry.

@fbrosson
Copy link
Member Author

fbrosson commented Jul 1, 2016

Works fine for me.
With the current haikuporter, when you unpack e.g. cdrtools-3.02a06 (which has a few hard links) you get them as symlinks but they are relative to work-3.02a06/sources/, so they are unusable. Hopefully these hard links are only readme files (and are not necessary for the compilation).
With this PR, you get copies instead of incorrect symlinks. If the conversion from hard link to symlink was correct then I wouldn't be suggesting this change. But hard links are not (currently) being correctly handled.
If you still prefer to not merge this PR, that's OK for me. I'll continue to use my locally modified haikuports.

@pulkomandy
Copy link
Member

pulkomandy commented Jul 1, 2016

Replacing hardlinks with copies may break things in the following situation:

  • a and b are hardlinks to the same data
  • compilation process modifies a
  • compilation process reads b and expects changes done in previous step to be visible

Replacing them with symlinks usually avoids this problem, and is the correct way to go about this. However, the symlinks should ideally use relative paths.

The old RelativizeSymLink tracker add-on could do this (and I have the source somewhere). But it should be done the Python way in this case, I guess.

@fbrosson
Copy link
Member Author

fbrosson commented Jul 1, 2016

OK, I got it. Let me add this, however, just for the record ;-)
The symlinks that the current version of HaikuPorter creates are not relative to the directory where they are created but to the directory where the tarball is extracted. So we cannot even expect them to point to the file they were supposed to be a hard link of. Here's what we get when we unpack cdrtools:

$ haikuporter -b cdrtools-3.02~a06

$ cd haikuports/app-cdr/cdrtools/work-3.02~a06/sources/cdrtools-3.02

$ ls -go BUILD COMPILE INSTALL README.compile 
lrwxrwxrwx 1    28 Jul  1 18:50 BUILD -> cdrtools-3.02/README.compile
lrwxrwxrwx 1    28 Jul  1 18:50 COMPILE -> cdrtools-3.02/README.compile
lrwxrwxrwx 1    28 Jul  1 18:50 INSTALL -> cdrtools-3.02/README.compile
-r--r--r-- 1 19660 Jul 28  2015 README.compile

$ cd cdrecord
$ ls -go COPYING LICENSE LIMITATIONS 
-r--r--r-- 1 2344 Jul  1 19:04 COPYING
lrwxrwxrwx 1   30 Jul  1 18:51 LICENSE -> cdrtools-3.02/cdrecord/COPYING
lrwxrwxrwx 1   30 Jul  1 18:51 LIMITATIONS -> cdrtools-3.02/cdrecord/COPYING

So, if there existed a package with hard links that were needed by the compilation, then it would probably fail to build due to the wrong relative links.
So the real reason why we should not merge this PR is that we would no longer be able to detect these cases. We would, as you have described, end up with packages built from files with incorrect content.
I therefore agree that we should not merge this PR. Shall I close it?

BTW, are there any kernel developers willing to add support for hard links in BFS?

@pulkomandy
Copy link
Member

pulkomandy commented Jul 1, 2016

hard links in BFS are not going to happen. This is not due to lazyness of our kernel programmers, it is a limitation in the design of on-disk structures for BFS making it not possible to have hard links. Fixing this would make "new" BFS incompatible with the current implementation, and possibly need major rework of the way the file system works.
We plan to replace the filesystem at some point (either with a new design or reusing an existing FS from another OS), but this probably wont happen anytime soon.

I think the way to go for this is symlinks with an absolute path. I'm not sure how we can achieve this, however.

@fbrosson
Copy link
Member Author

fbrosson commented Jul 1, 2016

OK, thanks for the info.
Regarding the symlinks, maybe it is possible to teach MyTarInfo how to fix the path of the target when it converts hard links to symlinks, but I wouln't know how to do that, so I'll close this PR for now.

@fbrosson fbrosson closed this Jul 1, 2016
@fbrosson fbrosson deleted the fbrosson:dereference branch Jul 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants
You can’t perform that action at this time.