How fast will my open source code break? #1

moorepants · 2018-09-18T20:47:33Z

One of my biggest complaints about open source software is that APIs do not remain stable. If I write a research paper using a software stack, publish it, don't maintain the code, and then come back ~1 year later, it seems to take a day or more to update the software so that it works with the updated dependencies. One year isn't a long time in the research world. This isn't good for reproducibility, and I don't think we should have to ship a VM with a paper that freezes the entire stack. I've also noticed that my Matlab code that is 10+ years old tends to run just fine on new versions, leading me to believe that MathWorks takes this much more seriously.

I'm interested in characterizing:

how quickly changes in upstream dependencies break scientific software
a ranking of API stability for core software packages
how the API stability culture compares among languages, e.g. Python and R
how deep in the stack you have to go to get stable APIs (for example, the Linux kernel API is probably rock solid)

Hypothesis: On average, a given script or software package that relies on a high-level scientific computing software stack will break within a year due to unstable dependency APIs.

Prior art

Haven't found much yet.

Methods

Here is an idea for a method to do this:

Download a package or script at the top of (or near top of) the stack and log its release date
Install the dependencies specified at the time of release and ensure the software runs
Increment the dependency versions in chronological order and test whether the script/package still runs at every increment. You can detect whether it runs and also whether deprecation warnings are emitted. If a single dependency fails, you can pin it at the last working version and continue to increment the others until you reach the latest releases or all dependencies fail.
Record the dates that your software gets deprecation warnings and fails.
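
The steps above could be sketched roughly like this in Python. This is only a sketch under assumptions: `Release`, `step_through_releases`, and the `check` callback are hypothetical names, and `check` stands in for "install exactly these versions and run the script", which is the hard part and is not shown. It is assumed to return "ok", "warn" (deprecation warnings seen), or "fail".

```python
from datetime import date
from typing import Callable, Dict, List, NamedTuple, Tuple


class Release(NamedTuple):
    """One released version of a dependency (hypothetical record type)."""
    package: str
    version: str
    released: date


Event = Tuple[date, str, str, str]  # (release date, package, version, status)


def step_through_releases(
    releases: List[Release],
    initial: Dict[str, str],
    check: Callable[[Dict[str, str]], str],
) -> Tuple[Dict[str, str], List[Event]]:
    """Upgrade one dependency at a time, in chronological order.

    `initial` holds the versions pinned at the script's release date.
    `check(env)` is assumed to install `env` and run the script,
    returning "ok", "warn", or "fail".
    """
    pinned = dict(initial)
    frozen = set()  # dependencies held at their last working version
    events: List[Event] = []
    for rel in sorted(releases, key=lambda r: r.released):
        if len(frozen) == len(pinned):
            break  # every dependency has failed; nothing left to increment
        if rel.package not in pinned or rel.package in frozen:
            continue
        candidate = {**pinned, rel.package: rel.version}
        status = check(candidate)
        if status == "fail":
            # Keep the last working version and stop upgrading this package.
            frozen.add(rel.package)
            events.append((rel.released, rel.package, rel.version, "fail"))
            continue
        if status == "warn":
            events.append((rel.released, rel.package, rel.version, "warn"))
        pinned = candidate
    return pinned, events
```

The returned events give the dates of the first deprecation warnings and failures per dependency, which is the data the last step asks for.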

Another method:

Track a code base through its git commits and somehow measure the frequency and timing of deprecations and removals.
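
One crude way to start on this, as a sketch only: count the lines added per year that mention `DeprecationWarning` in the output of `git log -p`. The function names and the `COMMIT` marker in the log format are my own assumptions, and this proxy only sees deprecations that are raised with that exact identifier; it says nothing about removals.

```python
import subprocess
from collections import Counter
from typing import Dict

# Each commit header is printed as "COMMIT<TAB>sha<TAB>ISO author date".
GIT_LOG_CMD = ["git", "log", "-p", "--reverse", "--format=COMMIT%x09%H%x09%aI"]


def read_git_log(repo_path: str) -> str:
    """Return the full patch log of the repository at `repo_path`."""
    result = subprocess.run(
        GIT_LOG_CMD, cwd=repo_path, capture_output=True,
        text=True, errors="replace", check=True,
    )
    return result.stdout


def deprecations_added_per_year(
    log_text: str, needle: str = "DeprecationWarning"
) -> Dict[str, int]:
    """Count added diff lines containing `needle`, grouped by commit year."""
    counts: Counter = Counter()
    year = None
    for line in log_text.splitlines():
        if line.startswith("COMMIT\t"):
            _, _sha, iso_date = line.split("\t")
            year = iso_date[:4]
        elif (
            year is not None
            and line.startswith("+")
            and not line.startswith("+++")  # skip the "+++ b/file" header
            and needle in line
        ):
            counts[year] += 1
    return dict(counts)
```

Usage would be something like `deprecations_added_per_year(read_git_log("path/to/repo"))`, run on a local clone.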

We will have to find a reliable way to get old dependencies installed. Simply getting things installed as they were at some point in the past is often quite a painful process.

Another thought:

We could check how many tests of a prior version raise errors or deprecation warnings.
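
A minimal sketch of how one test could be classified, assuming each test can be wrapped as a zero-argument Python callable. `classify_run` and `tally` are hypothetical helper names; the `warnings` calls are from the standard library.

```python
import warnings
from collections import Counter
from typing import Callable, Dict, Iterable

DEPRECATION_CATEGORIES = (
    DeprecationWarning,
    PendingDeprecationWarning,
    FutureWarning,
)


def classify_run(func: Callable[[], object]) -> str:
    """Run `func` and return "fail", "warn", or "ok"."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including ones ignored by default.
        warnings.simplefilter("always")
        try:
            func()
        except Exception:
            return "fail"
    if any(issubclass(w.category, DEPRECATION_CATEGORIES) for w in caught):
        return "warn"
    return "ok"


def tally(tests: Iterable[Callable[[], object]]) -> Dict[str, int]:
    """Count how many tests fail, warn, or pass cleanly."""
    return dict(Counter(classify_run(t) for t in tests))
```

Running the old test suite against each newer dependency set and recording the tally would give a finer-grained signal than a single pass/fail per script.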

moorepants · 2020-09-18T11:45:32Z

I added this project idea here: https://mechmotum.github.io/jobs/msc/how-fast-will-open-source-break.html.

moorepants · 2020-09-21T14:31:39Z

A static analysis tool to identify deprecated Python code: https://github.com/QuantStack/memestra. Could be useful.

moorepants added the software label Sep 18, 2018

moorepants mentioned this issue Jul 6, 2021

Is countersteering required to change direction or can you "lean" to change direction without countersteering? #43
