Suggestion: Require a Julia Header with Version in Shebang #29640

Closed
iwelch opened this issue Oct 14, 2018 · 17 comments

@iwelch

iwelch commented Oct 14, 2018

https://discourse.julialang.org/t/suggestion-julia-version-header-in-jl-files/16316

strongly suggest a header in .jl files that indicates the version under which the file is running. issue a warning if this is not the first line in the .jl file. I would suggest

#!julia 1.0.0 1.0.1

where the first version on which this .jl was tested is 1.0.0 and the last version is 1.0.1.

In the distant future, such a requirement will help you by making it easier for you to deprecate and break features without incurring the wrath of every user who has ever written a julia program. I understand that packages have some REQUIRE aspects already, but this is different. the shebang line should be in every .jl file, user or packaged.

a secondary minor advantage is easier recognition for unix file.

regards,

/iaw

@yuyichao
Contributor

> such a requirement will help you by making it easier for you to deprecate and break features without incurring the wrath of every user who has ever written a julia program

No. The majority of such changes will be impossible to support this way (or they won't/shouldn't be a breaking change in the first place). And any such backward compatibility can be provided by macro/function wrappers (similar to Compat, possibly in the other direction if you really want it) and there's nothing file versioning can provide that can't be done otherwise. A Julia file does not carry any semantics and doesn't need to, which is why,

> I understand that packages have some REQUIRE aspects already

is the correct solution, since packages are semantically significant.

> a secondary minor advantage is easier recognition for unix file.

Just use #!/usr/bin/julia or anything like this that you like.

@yuyichao
Contributor

And as an even bigger 👎, this is not what shebang is for, and you just made it impossible to write a runnable script.

@iwelch
Author

iwelch commented Oct 14, 2018

elaborate?

@yuyichao
Contributor

> elaborate

Which part?

@iwelch
Author

iwelch commented Oct 15, 2018

oh, I see. The rest of the shebang line is also interpreted. So, change this to #!julia #versions if need be, or any other way to do this.

but there is a real problem here.

I write a user program in 2018 to analyze my data. the program is not a package for use by others. In 2028, someone asks me for the program that produced my Nobel-prize winning thesis on astrology. I bet that julia programs that are written today will no longer run under julia 2028. I won't remember what the prevailing julia version was for my program. Did I write it under 0.6.3? Under 1.0.0? 1.0.1? Having a near-mandatory mechanism to designate the julia version under which a program is supposed to be able to run is very useful.

In LaTeX, they did this (belatedly) with documentclass and documentstyle. In Perl, they have something like 'use v5.6.1'.

I don't care about exactly how this is done, but without a standard mechanism to designate the code compatibility of code today, few users will put it into the code today. The julia developers will need to be more paranoid about breaking 1.0.0 compatibility for end users. Many users will curse any changes of julia in the future that might break their old programs (which they will not have touched in years, and do not want to check every time the version increases).

with a .jl file designation, end users could at least download old julia versions to run their programs.

there were fewer screams in latex2e, because latex is smart enough to understand that documentstyle code is supposed to run on a "relic" latex.

R has fewer problems, because it is now more stable than julia. Julia will still be changing quite a bit.

/iaw

@yuyichao
Contributor

> So, change this to #!julia #versions if need be, or any other way to do this.

No, that doesn't help. julia is already wrong and there's no comment in there.

> I bet that julia programs that are written today will no longer run under julia 2028.

It should still run under julia 1.0 in 2028.

> Did I write it under 0.6.3? Under 1.0.0? 1.0.1? Having a near-mandatory mechanism to designate the julia version under which a program is supposed to be able to run is very useful.

I never, ever find that an issue, with documentation and version control (e.g. saved command line (potentially in shebang) to run the script, date of creation, etc.).

> In LaTeX, they did this (belatedly) with documentclass and documentstyle. In Perl, they have something like 'use v5.6.1'.

As I said, that's just impossible for most features.

> I don't care about exactly how this is done

Then just download the old version of julia.

> The julia developers will need to be more paranoid about breaking 1.0.0 compatibility for end users.

This is very good.

> with a .jl file designation, end users could at least download old julia versions to run their programs.

Without it, end users can still download old julia versions.

> there were fewer screams in latex2e, because latex is smart enough to understand that documentstyle code is supposed to run on a "relic" latex.

Well, the two systems are completely different in complexity. They are not comparable at all.

> the program is not a package for use by others

Not being a package really doesn't stop you from doing anything. Also, everything you mentioned above is strictly for scripts that you run directly, without any consideration of multiple files. If that's your use case, just put #!/usr/bin/env julia-1.0 or something like that as your shebang. (And I do think it's a good idea to ship versioned executables in addition to the unversioned ones.) If you need to run your script on multiple versions, well, you are already putting in more than enough effort on your script that you should just use a check_version function.

I'll just repeat that you are proposing a solution to a non-problem, in a way that wouldn't work due to the complexity. It's wrong both in the object it is applied to (a file, rather than a module/package) and in that you are comparing a full programming language that is designed to be extensible to other languages that are much closer to data files than to a language in this metric. Data files, of course, can easily have version-dependent behavior, but any direct interaction between code in a programming language will just expose the version dependency to the user with no way to hide it at all.
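
A minimal sketch of the versioned-shebang plus check_version idea from this comment. The julia-1.0 launcher name is an assumption (it only works if an executable with that name is on PATH), and check_version is a hypothetical helper, not a function that ships with Julia:

```julia
#!/usr/bin/env julia-1.0
# The line above is the shebang suggested in the thread; `julia-1.0` is assumed
# to exist on PATH. When run as `julia script.jl` it is just a comment.

# Hypothetical helper: error out unless `current` lies within the tested range.
# `current` defaults to the running Julia version and is a keyword only so the
# check can be exercised without switching interpreters.
function check_version(lo::VersionNumber, hi::VersionNumber;
                       current::VersionNumber = VERSION)
    lo <= current <= hi ||
        error("script tested on Julia $lo through $hi, but running on $current")
    return current
end

# Passes: 1.0.1 is inside the tested range 1.0.0 to 1.0.1.
check_version(v"1.0.0", v"1.0.1"; current = v"1.0.1")
```

In a real script the call would be check_version(v"1.0.0", v"1.0.1") near the top, so that running it under any other release fails immediately with a clear message instead of somewhere deep in the analysis.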

@iwelch
Author

iwelch commented Oct 15, 2018

with respect, how often have you been asked to explain a paper that you wrote 30 years ago?

I think you underestimate the seriousness of the problem of preservation of scientific programs that were used only in your own project, and that a graduate student now wants to check, possibly even for integrity.

of course, nobody stops you (or me) from taking all necessary precautionary steps today. I am only suggesting a crutch.

some of the pain in moving from 0.6.3 to 1.0.0 in packages shows how difficult a small transition without indicated version numbers was, and this even for packages that were designed from the outset to be shared by others. now try the same thing for julia code written in 0.6.3 when we are in 5.0.0, for code that you wrote not for sharing, and often not even with great documentation, but to solve your problem as quickly as possible.

/iaw

@jgoldfar
Contributor

Isn't the solution to the problem of Julia being v5.0.0 and your code only working on v0.6.3 that of documenting the environment in which your script runs in your published work, and, as mentioned above, keeping the previous versions of Julia available? After all, many things about your environment can affect the output of any given script, but Julia v0.6.3 is as close at hand as your git repository and

git checkout tags/v0.6.3

@yuyichao
Contributor

> how often have you been asked to explain a paper that you wrote 30 years ago

Well, that was before I was born so none. However, I have had need to run code from years ago and with version control of my code, that is never an issue.

> I think you underestimate the seriousness of the problem of preservation of scientific programs that were used only in your own project, and that a graduate student now wants to check, possibly even for integrity.

Well, I fully appreciate that. And I don't think your statement of the problem is correct and neither is your solution. All papers I've seen that provide code also give a full description of all relevant info (version, system, even OS sometimes) to run the program. With any less than that, the code is useless anyway. For non-published code, that's why lab notebooks exist and that's exactly how we use them.

> some of the pain in moving from 0.6.3 to 1.0.0 in packages shows how difficult a small transition without indicated version numbers was, and this even for packages that were designed from the outset to be shared by others.

None of what you suggested will help with that. Breaking changes that are hard to fix won't be easier for the compiler. They are hard exactly because they can't be fixed automatically. So yes, the pain shows that what you suggested is not useful.

> now try the same thing for julia code written in 0.6.3 when we are in 5.0.0, for code that you wrote not for sharing, and often not even with great documentation, but to solve your problem as quickly as possible.

Again, running the code with a flag and expecting it to work under 5.0.0 is impossible, no matter what feature you add to the language. If you want it to be possible, then you are giving the developers much less freedom to introduce breaking changes. Exactly what you want to avoid, i.e.

> The julia developers will need to be more paranoid about breaking 1.0.0 compatibility for end users.

And yet again, downloading 0.6.3 should be possible. Since it's a pre-release, a pre-compiled binary might not be available for download in years, though as mentioned above, nothing you can do to the new version of the language will ever help.

@KristofferC
Sponsor Member

Reproducibility is done by providing a Project + Manifest file. This already takes care of recording the exact version of all the packages that are used which is just as important as the julia version. We just need to record the julia version as well. Feel free to open an issue on Pkg.jl about this.

@yuyichao
Contributor

yuyichao commented Oct 15, 2018

And as for why requiring a flag like this won't really help, your argument for making it mandatory is basically to combat human laziness, yet the solution isn't lazy-proof. I bet if this is required, people that don't feel like testing will just copy-and-paste 1.0.0 9.9.9 as the version bound everywhere, or possibly some other bound that just happens to include the version they are using at the time. (This has to work, by the way, because the julia 1.0.0 binary has no way to know if 9.9.9 has been released yet.)

I know this because this is exactly what happens in cmake. Having a cmake_minimum_required command is a really nice way for well organized projects to let the user know that their cmake version needs to be updated. However, making it a required command doesn't actually help. I have countless small projects with cmake_minimum_required(VERSION 2.8) (these days cmake_minimum_required(VERSION 3.0)) in them that are just copy-and-pasted and are in no way accurate. Cmake is at least simple enough that it can have new/old policy settings but, as mentioned above, there's nothing we can guarantee about this other than throwing an error.

@iwelch
Author

iwelch commented Oct 15, 2018

yes, they are all crutches and they all are designed to help combat human laziness. and they never succeed fully.

with respect, you have some good points; but you also display a confidence in knowing the correct perspective--not on technical matters but on scientific needs and matters--that is astounding. you have never had to deal with these problems and thus could not have developed so profound a perspective on the human tradeoffs with code. I cannot speak for all scientists, either, but I can speak for some of them in my neck of the woods. you may speak for some in yours, but not for all.

@yuyichao
Contributor

> not on technical matters but on scientific needs and matters
> you have never had to deal with these problems and thus could not have developed so profound a perspective on the human tradeoffs with code.

It IS a technical issue that I'm talking about. It's the selection of tool and using the right tool to do the right thing rather than partially duplicating functions everywhere.
I don't mean that the issue of losing track of things doesn't exist, and I know from people around me that it is a big issue with code due to their bad coding habits. However, as I have mentioned since the very beginning, version control (and documentation) are the right tools for this since that's how you properly take a snapshot of the local state and help others (including your future self) understand and use the code. You shouldn't expect everything you use to duplicate the function of everything else just because people don't want to use them.
That said, this is all based on the assumption that people forgetting the version is the only issue the proposal can solve. If letting a later julia version mimic an older behavior was technically doable (it's not, not a chance) then of course implementing such a feature will help with running old code in many other ways and it not being the right tool for book keeping wouldn't be an issue anymore since it wouldn't be introduced for that purpose.

> I cannot speak for all scientists, either, but I can speak for some of them in my neck of the woods. you may speak for some in your's, but not for all.

And I've given examples for all my claims. In case it's not clear, I've also acknowledged the issue with running old code and only have a problem with the tool you use in this case. If there's any case where a (fully) version-controlled and (partially) documented code base does not provide enough information to allow the user to easily guess what version of the language is being used, I'd be interested to know. (I said "guess" since I've used commit date to narrow down the version info to 1 or 2 a few times.)

@iwelch
Author

iwelch commented Oct 15, 2018

apologies. my tone was a little too sharp, too.

@StefanKarpinski
Sponsor Member

There's definitely a problem to be considered and a potential idea for a solution here. What you want to be able to do is to declare that a certain bit of code should be interpreted according to the rules of Julia 1.5 or something, and then be able to use it with code that's using other rules. One way to accomplish that is to have newer Julia versions (say Julia 2.1) know how to behave as if they were older versions of Julia like Julia 1.5. The massive downside to that is that you have to drag along all the code needed to implement the behaviors of older versions and test that entire combined language surface. We deleted some 16,000 lines of code when we got rid of the deprecations from 0.7 in 1.0. And that was only from a single release. Imagine how much extra junk we'd be carrying around across many different major versions. I really have no inclination to maintain that mess.

There are other potential ways to support this, however. If we had a separate compilation model—and this would require two different modules (or whatever unit we would want) being able to collaborate through some protocol to generate efficient compositional code—then, as long as two different Julia versions agree on their calling conventions and composition protocol, you could compile them with different versions of Julia and still get a correctly working result.

I think there are a few problems with this specific proposal:

  1. We're not going to require indicating the Julia version in files, although putting a comment could be a good convention and it might be good to have a standard format for such a comment.
  2. The shebang line is really not the right place for it—it has a very specific meaning and purpose in UNIX which we should not subvert.
  3. The file is not the right unit for this. Modules or packages may be but files definitely aren't.
  4. The maintenance cost even in a plausible form (modules, packages) is way too high.

I do think that saving the Julia version in the Manifest.toml file is a good idea, and eventually it would be great if Pkg can manage BinaryBuilder-provided versions of Julia itself as easily as any other binary dependency. One can even imagine the julia binary finding the right libjulia to load based on the contents of the relevant manifest file and thereby using the right version. That would only work on a per-process basis, of course, so it doesn't address the mix-and-match problem at all.
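
As a rough illustration of the per-process idea in this comment, the sketch below reads a Julia version from manifest text and compares it with the running interpreter. The top-level julia_version key and both function names are assumptions made for the example (the thread only proposes recording the version), and the TOML standard library it uses needs Julia 1.6 or later:

```julia
using TOML  # standard library since Julia 1.6

# Hypothetical: return the Julia version recorded in a manifest, or `nothing`
# when the assumed top-level `julia_version` key is missing.
function recorded_julia_version(manifest_text::AbstractString)
    data = TOML.parse(manifest_text)
    raw = get(data, "julia_version", nothing)
    return raw === nothing ? nothing : VersionNumber(raw)
end

# Hypothetical: true when the recorded version shares major.minor with `current`.
# `current` defaults to the running interpreter's version.
function matches_running_julia(manifest_text::AbstractString;
                               current::VersionNumber = VERSION)
    recorded = recorded_julia_version(manifest_text)
    recorded === nothing && return false
    return (recorded.major, recorded.minor) == (current.major, current.minor)
end

# A tiny stand-in for the contents of a Manifest.toml file.
manifest = """
julia_version = "1.0.1"

[deps]
"""
println(recorded_julia_version(manifest))
```

A launcher could use a check like this to pick or refuse an interpreter per process. As the comment says, that does nothing for mixing code written for different versions inside one process.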

@iwelch
Author

iwelch commented Oct 15, 2018

hi stefan---my specific suggestion (as yichuan pointed out) was indeed terrible. but the problem is real.

I would not drag along the old julia compiler, but keep the old julia compilers on the website. so, I would not keep this as a first-class feature, but as a second-class feature...presumably as a last resort when user code has broken a long time ago without anyone noticing. (continuously used code is typically updated fairly soon.)

again, this is not for me. I already have the checks in my own startup.jl, which I do with:

 confirm( [ "julia" => v"1.0.1", "DataFrames" => v"0.14.1", "GLM" => v"1.0.1", "Plots" => v"0.20.6" ] )

the whole idea is to plan ahead to make it easier for you to deprecate features in the future and face less wrath by your users, because they can backtrack.
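
The confirm helper called above is not shown in the thread. A minimal sketch of what such a check might look like, assuming exact version matching and using Pkg.dependencies() (available in Julia 1.4 and later) to list installed packages; installed_versions and the installed keyword are hypothetical names, and this is not the author's code:

```julia
using Pkg

# Hypothetical: collect name => version for the running Julia and for every
# package in the active environment that reports a version.
function installed_versions()
    versions = Dict{String,VersionNumber}("julia" => VERSION)
    for (_, info) in Pkg.dependencies()
        info.version === nothing || (versions[info.name] = info.version)
    end
    return versions
end

# Hypothetical `confirm`: error unless every expected name has exactly the
# expected version. `installed` can be passed explicitly, e.g. for testing.
function confirm(expected; installed = installed_versions())
    mismatches = String[]
    for (name, wanted) in expected
        have = get(installed, name, nothing)
        have == wanted && continue
        found = have === nothing ? "not installed" : string(have)
        push!(mismatches, "$name: expected $wanted, found $found")
    end
    isempty(mismatches) || error("version check failed:\n" * join(mismatches, "\n"))
    return true
end
```

Called as in the comment above, this would raise an error listing every entry whose installed version differs from the recorded one, which is the kind of early warning the startup.jl check is meant to give.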

@StefanKarpinski
Sponsor Member

If what one wants is per-process Julia versioning then I think that the manifest file is the right place to record this information. That's already where we store all the precise versioning information for packages, so it's a natural place.
