avoid collisions when downloading source files #455

Closed
boegel opened this issue Feb 2, 2013 · 13 comments

Comments

@boegel
Member

boegel commented Feb 2, 2013

We need a proper way to handle software packages that feature non-versioned source files (e.g. mytool.tar.gz).

@fgeorgatos
Collaborator

I also came across the following one today: http://www.g95.org/downloads.shtml#V0.93
(ie http://ftp.g95.org/v0.93/g95_source.tgz)

could we perhaps raise a flag in the easyconfigs file, to prepend the filename with version or something like that?
(eg. 0.93_g95_source.tgz or even 0.93/g95_source.tgz - ie. subdir!)

alternatively, could we do this automatically whenever there is a collision?
(ie. an easyconfig searches both, say, 0.93_g95_source.tgz & g95_source.tgz)
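A minimal sketch of the lookup order suggested above, assuming a flat source directory; `find_source`, `candidate_source_paths` and the `allow_unversioned` flag are hypothetical names, not EasyBuild functions. The unversioned name is only tried on request, because that fallback is exactly where collisions come from.

```python
import os


def candidate_source_paths(source_dir, filename, version, allow_unversioned=False):
    """List places to look for a source file, most specific first."""
    candidates = [
        # subdirectory per version, e.g. 0.93/g95_source.tgz
        os.path.join(source_dir, version, filename),
        # version prefix, e.g. 0.93_g95_source.tgz
        os.path.join(source_dir, "%s_%s" % (version, filename)),
    ]
    if allow_unversioned:
        # plain upstream name, e.g. g95_source.tgz (may belong to another version!)
        candidates.append(os.path.join(source_dir, filename))
    return candidates


def find_source(source_dir, filename, version, allow_unversioned=False):
    """Return the first existing candidate path, or None if nothing is found."""
    for path in candidate_source_paths(source_dir, filename, version, allow_unversioned):
        if os.path.isfile(path):
            return path
    return None
```

With only `g95_source.tgz` on disk, `find_source(src_dir, 'g95_source.tgz', '0.93')` returns `None` unless `allow_unversioned=True` is passed.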

@boegel
Member Author

boegel commented Feb 13, 2013

I think doing this automagically should be possible...

We might need an easyconfig parameter though, to specify that the renaming should be done.

@stdweird
Contributor

atm, are the downloaded sources flagged read-only? that would at the very least prevent accidentally overwriting a source file (it should already detect that the file is there and not redownload it, but read-only also prevents admins from manually overwriting it ;).
nothing should happen automatically. it should look for one file and one file only as a valid source. setting an easyconfig parameter is the best way; the actual result can be either prefixing or a separate directory (i prefer the latter)

@JensTimmerman
Contributor

I propose to allow for a dictionary in the 'sources' field.
We currently have:

sources = [('abinit-%s_x86_64_linux_gnu4.5.bz2' % version, 'tar xfj %s')]
source_urls = ['http://ftp.abinit.org/']

We could keep supporting this, but also add support for something like what we have in the 'extension packages':

sources = [{'name': 'abinit-%s_x86_64_linux_gnu4.5.bz2' % version, 'unpack_cmd': 'tar xfj %s', 'rename_to': 'abinit-%s.tar.gz' % version, 'url': 'http://ftp.abinit.org/'}]

Note that actually renaming it would also remove the need for the unpack_cmd, but I left it here for illustrative purposes.

Switching to a dict would give us much more flexibility in the future, for example for adding 'git_id', 'branch', etc.,
because extending the tuple any further would lead to a big mess.
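One way to support the old forms next to the proposed dict is to normalize every entry to a dict up front. This is only a sketch: `normalize_source` is a hypothetical helper, and the keys (`name`, `unpack_cmd`, `rename_to`, `url`) are the ones from the proposal above, not a confirmed EasyBuild format.

```python
def normalize_source(spec, default_urls=()):
    """Turn a plain filename, a (filename, unpack_cmd) tuple or a dict into one dict."""
    if isinstance(spec, str):
        spec = {'name': spec}
    elif isinstance(spec, tuple):
        # existing tuple form: (filename, unpack command)
        name, unpack_cmd = spec
        spec = {'name': name, 'unpack_cmd': unpack_cmd}
    elif not isinstance(spec, dict):
        raise TypeError("unsupported source spec: %r" % (spec,))

    if 'name' not in spec:
        raise ValueError("source spec needs a 'name': %r" % (spec,))

    return {
        'name': spec['name'],
        'unpack_cmd': spec.get('unpack_cmd'),
        'rename_to': spec.get('rename_to'),
        # a per-source 'url' wins over the easyconfig-wide source_urls
        'urls': [spec['url']] if 'url' in spec else list(default_urls),
    }
```

New optional keys such as 'git_id' or 'branch' would then only touch this one function, while existing easyconfigs with strings or tuples keep working.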

@fgeorgatos
Collaborator

On Thu, Feb 14, 2013 at 11:42 AM, Jens Timmerman notifications@github.com wrote:

I propose to allow for a dictionary in the 'sources' field.

very interesting offer;

I propose a variation: because 'name' is a compulsory item,
and sometimes taking the product of sources x source_urls is necessary (see why
in #462 !!!),
let's rather make the format:

sources = [('abinit-%s_x86_64_linux_gnu4.5.bz2' % version, {'unpack_cmd': 'tar xfj %s', 'rename_to': 'abinit-%s.tar.gz' % version})]

ie. the dictionary contains only any extra optional specializations.

what do you think?

@JensTimmerman
Contributor

makes sense, but we could always add name-version.tar.gz as a default, and then get rid of the compulsory item altogether ;-)

@stdweird
Contributor

maybe change rename_to to rename, but i think the 'unpack' part should be forbidden.
i have no opinion on dict vs. a free-form optional 2nd element (i sort of prefer the 2nd, but i fully understand it's ugly ;)

i would also propose that all value strings in easyconfig are templated against

{'version': version,
 'versionlower': version.lower(),
 'name': name,
 'namelower': name.lower(),
}

and probably a few more (maybe do a short survey of existing configs to see other frequent ones, and some constants like github, googlecode, sourceforge for the homepage/source_url)

(the example should then become

sources = [('%(namelower)s-%(version)s_x86_64_linux_gnu4.5.bz2', {'rename': '%(namelower)s-%(version)s.tar.bz2'})]

and similar for things like url)
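A rough sketch of how such templating could be applied to every string value, using plain Python `%` formatting with a dict; `template_values` and `resolve_templates` are hypothetical names, and the version in the usage line is made up for illustration.

```python
def template_values(name, version):
    """Build the template dictionary proposed above."""
    return {
        'name': name,
        'namelower': name.lower(),
        'version': version,
        'versionlower': version.lower(),
    }


def resolve_templates(value, values):
    """Recursively apply %(key)s templates to strings nested in lists, tuples and dicts."""
    if isinstance(value, str):
        return value % values
    if isinstance(value, (list, tuple)):
        # keep the container type (list stays list, tuple stays tuple)
        return type(value)(resolve_templates(item, values) for item in value)
    if isinstance(value, dict):
        return {key: resolve_templates(item, values) for key, item in value.items()}
    # numbers, None, booleans, ... are passed through untouched
    return value


sources = [('%(namelower)s-%(version)s_x86_64_linux_gnu4.5.bz2',
            {'rename': '%(namelower)s-%(version)s.tar.bz2'})]
resolved = resolve_templates(sources, template_values('ABINIT', '7.0.5'))
```

One caveat of this approach: a literal `%s`, as in an unpack command like 'tar xfj %s', would have to be written as `%%s` to survive the templating step.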

@boegel
Member Author

boegel commented Feb 14, 2013

@stdweird: The unpack is there for a very good reason, i.e. sometimes source files are called something.tar while they are really .tar.gz files. We have an actual example of this somewhere...
It's very wrong by the people creating the tarball, but nevertheless, we need a way to deal with it.

Or are you saying that in that case we should use the rename feature, and then let extract_cmd figure things out correctly?

@stdweird
Contributor

i know it has a valid use case now, but if we have the rename possibility, we should rename to the proper extension or, even better, use some file-like tools to determine what is in the source, irrespective of the extension.

if we rename, the lookup to the local cache should be with the proper renamed name (or both should be tried)
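Detecting the real format from the file contents, in the spirit of the `file` tool, can be sketched by checking the well-known magic bytes at the start of the file. `sniff_compression` is a hypothetical helper, and only a handful of formats are covered.

```python
# (magic bytes at offset 0, format name)
MAGIC_NUMBERS = [
    (b'\x1f\x8b', 'gzip'),
    (b'BZh', 'bzip2'),
    (b'\xfd7zXZ\x00', 'xz'),
    (b'PK\x03\x04', 'zip'),
]


def sniff_compression(path):
    """Return the compression format based on file contents, ignoring the extension."""
    with open(path, 'rb') as handle:
        head = handle.read(8)
    for magic, kind in MAGIC_NUMBERS:
        if head.startswith(magic):
            return kind
    # unknown or uncompressed (e.g. a plain tar file)
    return None
```

A gzipped tarball that was misnamed something.tar would be reported as 'gzip' here. For tarballs specifically, Python's `tarfile.open(path)` in its default read mode also detects the compression transparently.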

@boegel
Member Author

boegel commented Feb 14, 2013

Point taken, if we have rename, there's no need for unpack anymore.

But we'll have to deprecate it (unless it's not a part of v1.1 yet, I'd have to check, but I think it is), not just remove it...

@boegel boegel modified the milestone: v1.X Jun 24, 2015
@boegel boegel added this to the 3.1.0 milestone Dec 20, 2016
@boegel boegel modified the milestones: 3.2.0, 3.1.0 Jan 14, 2017
@boegel boegel modified the milestones: 3.2.0, 3.3.0 May 2, 2017
@pneerincx

Just ran into this issue today with Picard. Older versions had a version number in the downloaded file, but as of version 2.5.0 it's distributed without a version number :(. Combined with the lack of a checksum for the sources (checksums are optional :o), this resulted in an easyconfig for version 2.9.x that happily found a previously downloaded version 2.7.x and installed that as 2.9.x: oops. Robust handling of unversioned sources would be welcome and, relatedly, I wouldn't mind if checksums became mandatory for easyconfigs ;)...
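The kind of check that would have caught this can be sketched with the standard library alone; `sha256_of` and `verify_source` are hypothetical names, not EasyBuild functions, and SHA-256 is an assumed choice of digest.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def verify_source(path, expected_sha256):
    """Raise ValueError if the file on disk is not the one the easyconfig expects."""
    actual = sha256_of(path)
    if actual != expected_sha256:
        raise ValueError("checksum mismatch for %s: expected %s, got %s"
                         % (path, expected_sha256, actual))
```

A stale picard download from another version would fail `verify_source` instead of being installed silently under the wrong version.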

@boegel
Member Author

boegel commented May 10, 2017

@pneerincx Making checksums mandatory is something we're considering, but it'll be quite an effort.

Also, even though it would have helped to avoid this problem occurring silently, it's still a good idea to rename sources that are not versioned. This is something we should also support in EasyBuild, maybe with a syntax like:

sources = ['picard.zip:picard-%(version)s.zip']

EasyBuild would then first go looking for picard-%(version)s.zip, and only download when it's missing.

Of course, this should be accompanied with a checksum to avoid downloading and renaming the wrong zip file...
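The behaviour described here could look roughly like the sketch below. It follows the `download_name:local_name` syntax floated in this comment, which is not necessarily what was eventually implemented; `parse_source_spec`, `locate_or_download` and the `download` callback are hypothetical.

```python
import os


def parse_source_spec(spec, version):
    """Split 'picard.zip:picard-%(version)s.zip' into (download name, local name)."""
    download_name, sep, local_name = spec.partition(':')
    if not sep:
        # no rename requested: keep the upstream filename
        local_name = download_name
    return download_name, local_name % {'version': version}


def locate_or_download(spec, version, source_dir, download):
    """Return the path of the renamed source, downloading only when it is missing.

    `download` is a caller-supplied function taking (download_name, target_path).
    """
    download_name, local_name = parse_source_spec(spec, version)
    target = os.path.join(source_dir, local_name)
    if not os.path.exists(target):
        download(download_name, target)
    return target
```

Because the lookup uses the versioned local name, an older picard.zip in the source directory is never picked up by mistake; a checksum check after the download would still be needed to confirm that the fetched file really is the expected version.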

@boegel
Member Author

boegel commented Jun 22, 2017

follow up in #2223 which adds support for renaming sources on download

@boegel boegel closed this as completed Jun 22, 2017