Test more packages before publishing a new release #1530

lvjr · 2024-02-22T13:37:28Z

Brief outline of the enhancement

Recently some packages broke after a new l3kernel release came out. I therefore think it would be a good improvement to test more packages before publishing a new latex2e (or l3kernel) release.

Mr. Wright replied in lvjr/tabularray#474 (comment):

> We've considered that before but it's not workable: changes in packages would lead to breakage we can't track, etc.

But after doing some experiments I think it is doable. Island of TeX provides weekly updated Docker images for the latest full TeX Live. So we can run tests on them with GitHub Actions.

Step 0: Find all .sty and .cls files in the tex/latex folder and compile them with pdflatex on the current TeX Live.

% for somename.cls file
\documentclass{somename}
\begin{document}
TEST
\end{document}

% for somename.sty file
\documentclass{article}
\usepackage{somename}
\begin{document}
TEST
\end{document}

Then add every package file which fails the compilation to an ignorelist. (If you want to do more you could also consider tex/xelatex and tex/lualatex folders as well as xelatex and lualatex programs in the future.)
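As a minimal sketch of how the two test documents above could be generated per file, assuming a Python helper (the function name `make_test_document` is hypothetical and not taken from any tool mentioned in this thread):

```python
from pathlib import Path


def make_test_document(filename: str) -> str:
    """Return a minimal LaTeX document that loads the given .cls or .sty file."""
    path = Path(filename)
    name = path.stem  # "somename.sty" -> "somename"
    if path.suffix == ".cls":
        # A class file is tested by using it as the document class.
        preamble = [rf"\documentclass{{{name}}}"]
    elif path.suffix == ".sty":
        # A package file is tested by loading it on top of article.
        preamble = [r"\documentclass{article}", rf"\usepackage{{{name}}}"]
    else:
        raise ValueError(f"not a class or package file: {filename}")
    body = [r"\begin{document}", "TEST", r"\end{document}"]
    return "\n".join(preamble + body) + "\n"
```

Each generated document would then be compiled with pdflatex, and the file name added to the ignorelist if the compilation fails.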

We only need to do Step 0 once. Now every time a new latex release is out, we need to do the following two steps:

Step 1: Run the same tests as in Step 0, but skip the package files in the ignorelist and record every package file that compiles successfully in a passlist.

Step 2: Install the new LaTeX release into the TeX Live Docker image. This time we test only the files in the passlist. If a package file in the passlist fails the test, we can be almost sure that the failure is caused by the new LaTeX release.
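The logic of Steps 1 and 2 can be sketched roughly as follows. This is only an illustration: `run_tests` and `find_regressions` are hypothetical names, and the `compiles` callables stand in for "compile the test document with pdflatex and check whether it succeeded" on the current and on the updated installation.

```python
from typing import Callable, Iterable, List, Tuple


def run_tests(files: Iterable[str],
              compiles: Callable[[str], bool],
              skip: Iterable[str] = ()) -> Tuple[List[str], List[str]]:
    """Split files into (passed, failed), leaving out anything listed in skip."""
    skipped = set(skip)
    passed, failed = [], []
    for name in sorted(files):
        if name in skipped:
            continue
        (passed if compiles(name) else failed).append(name)
    return passed, failed


def find_regressions(files: Iterable[str],
                     ignorelist: Iterable[str],
                     compiles_current: Callable[[str], bool],
                     compiles_new: Callable[[str], bool]) -> List[str]:
    """Return files that pass on the current release but fail on the new one."""
    # Step 1: build the passlist on the current TeX Live, skipping the ignorelist.
    passlist, _ = run_tests(files, compiles_current, skip=ignorelist)
    # Step 2: rerun only the passlist with the new release installed.
    _, regressions = run_tests(passlist, compiles_new)
    return regressions
```

Because Step 2 only looks at files that passed in Step 1 on the same image, a package that was already broken for unrelated reasons is not reported as a regression.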


josephwright · 2024-02-22T13:42:02Z

> But after doing some experiments I think it is doable. Island of TeX provides weekly updated Docker images for the latest full TeX Live. So we can run tests on them with GitHub Actions.

That's not an issue: we build a minimal TeX Live for our tests anyway - it's just a question of what to include. (This means the install is updated every day from CTAN.)

> Step 0: Find all .sty or .cls files in the tex/latex folder and compile them with pdflatex on the current TeX Live.
>
> % for somename.cls file
> \documentclass{somename}
> \begin{document}
> TEST
> \end{document}
>
> % for somename.sty file
> \documentclass{article}
> \usepackage{somename}
> \begin{document}
> TEST
> \end{document}
>
> Then add every package file which fails the compilation to an ignorelist. (If you want to do more you could also consider tex/xelatex and tex/lualatex folders as well as xelatex and lualatex programs in the future.)

The problems here are

1. Testing package loading may or may not be useful - it's not really possible to know unless you are the package author (e.g. siunitx has about 30 test files, most with multiple tests inside)
2. If a package changes and causes a failure, the team then have to debug - and until that's done, any releases are blocked

Now, one might decide this is a reasonable balance, but at least when we looked before it felt fragile.

lvjr · 2024-02-22T14:02:13Z

> That's not an issue: we build a minimal TeX Live for our tests anyway - it's just a question of what to include. (This means the install is updated every day from CTAN.)

It is faster to install a Docker image than to install TeX Live and lots of packages manually. And the Docker image contains all the files needed.

Also, these tests could run at the same time as l3build. We only need to add a new .yml file to the workflows folder.

> Testing package loading may or may not be useful - it's not really possible to know unless you are the package author (e.g. siunitx has about 30 test files, most with multiple tests inside)

It is still useful even if only package loading is tested.

> If a package changes and causes a failure, the team then have to debug - and until that's done, any releases are blocked

Step 1 and Step 2 together mainly catch latex bugs.

And the team could still decide not to consider these tests as release blockers and upload new releases to CTAN as before.

josephwright · 2024-02-22T14:04:28Z

> > That's not an issue: we build a minimal TeX Live for our tests anyway - it's just a question of what to include. (This means the install is updated every day from CTAN.)

> It is faster to install a Docker image than to install TeX Live and lots of packages manually. And the Docker image contains all the files needed.

Rather, it contains too many: you need to be selective :)

> > Testing package loading may or may not be useful - it's not really possible to know unless you are the package author (e.g. siunitx has about 30 test files, most with multiple tests inside)

> It is still useful even if only package loading is tested.

> > If a package changes and causes a failure, the team then have to debug - and until that's done, any releases are blocked

> Step 1 and Step 2 together mainly catch latex bugs.

I forgot to add (3): one has to decide for every package involved if the issue is a LaTeX or a package bug.

> And the team could still decide not to consider these tests as release blockers and upload new releases to CTAN as before.

Nope: releases can only go if the test suite passes.

josephwright · 2024-02-22T14:05:37Z

> > > That's not an issue: we build a minimal TeX Live for our tests anyway - it's just a question of what to include. (This means the install is updated every day from CTAN.)

> > It is faster to install a Docker image than to install TeX Live and lots of packages manually. And the Docker image contains all the files needed.

> Rather, it contains too many: you need to be selective :)

I.e. the current cache size is only ~240 MB.

muzimuzhi · 2024-02-22T14:20:23Z

Previous discussion was in

Testing core third-party packages latex2e#235

More generally, people want as smooth an experience as possible when updating (La)TeX packages, because the package managers of TeX Live and MiKTeX both lack per-project package and version locking.

Maybe latex3 packages could have a set of -dev releases, for instance two weeks ahead of the production releases. Then package maintainers could configure CI to test against the -dev releases on a schedule (say, once a week).

josephwright · 2024-02-22T14:25:42Z

> Maybe latex3 packages could have a set of -dev releases, for instance two weeks ahead of the production releases. Then package maintainers could configure CI to test against the -dev releases on a schedule (say, once a week).

The team have been discussing this: there are some downsides but obviously potential upsides too. We are likely to talk 'in person' about it next week before reaching a conclusion.

muzimuzhi · 2024-02-22T14:37:09Z

Another idea: technically it's possible to write an action which fetches the latest files in the latex3/latex3 repo (or the latest commit which passed all the checks), installs them into a TDS tree, and updates the (dev-)formats. Then maintainers of packages hosted on GitHub and using GitHub Actions as their CI platform could use this action to test their packages against the latest latex3 components, on a schedule.

Just technically.

josephwright · 2024-02-22T14:49:44Z

> Another idea: technically it's possible to write an action which fetches the latest files in the latex3/latex3 repo (or the latest commit which passed all the checks), installs them into a TDS tree, and updates the (dev-)formats. Then maintainers of packages hosted on GitHub and using GitHub Actions as their CI platform could use this action to test their packages against the latest latex3 components, on a schedule.

Sure, but that most likely helps devs who are already relatively 'involved' - ones with for example an existing testing setup. My take on the original question is it's more aimed at people who are not in that position. (And yes, it's unfortunate that l3debug missed the issue that sparked this - now corrected but doesn't necessarily help.)

lvjr · 2024-02-23T12:20:05Z

I created a new repository https://github.com/lvjr/pkgstatus and wrote some code for experiments.

First I removed ignorelist.txt and ran the tool. The result was

number of ignored packages = 0
number of passed packages = 3953
number of failed packages = 944

Then I renamed faillist.txt to ignorelist.txt and ran the tool again. The result was:

number of ignored packages = 944
number of passed packages = 3953
number of failed packages = 0

u-fischer · 2024-02-23T13:40:09Z

> I created a new repository https://github.com/lvjr/pkgstatus and wrote some code for experiments.

Well, your own ignorelist shows the problem with this approach: it contains, for example, citation-style-language and acro (for different reasons), which means that your test would not catch it if we broke these packages. You exclude packages which require a specific engine, you exclude packages which expect a specific class, you exclude packages which expect that some other package is loaded before (like tagpdf, newpax or hypcap) etc. So to make such a testsuite more meaningful (and once it is there package authors will request to make it more meaningful) you need exceptions---you already started that by handling beamer and fontspec differently. This all requires manual maintenance for which we don't have the bandwidth.

Naturally nothing prevents you (or someone else) from setting up such a testsuite. You can pull in the newest latex sources, run the tests and notify us if you think that there is a problem with the format.
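To make the maintenance cost of such exceptions concrete, here is a rough sketch of what a per-package override table might look like. Everything in it is hypothetical: the `Override` type, the `OVERRIDES` table and the `plan` function are illustrative names, and the two entries (fontspec run with lualatex, hypcap loaded after hyperref) are assumed examples, not the actual handling in pkgstatus.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Override:
    """Hand-maintained exception for one package (all fields are assumptions)."""
    engine: str = "pdflatex"
    documentclass: str = "article"
    preload: List[str] = field(default_factory=list)


# Illustrative entries only; a real table would need one line per special case.
OVERRIDES: Dict[str, Override] = {
    "fontspec": Override(engine="lualatex"),   # needs a Unicode engine
    "hypcap": Override(preload=["hyperref"]),  # expects hyperref to be loaded first
}


def plan(package: str) -> Tuple[str, str]:
    """Return (engine, test document) for a package, applying any override."""
    o = OVERRIDES.get(package, Override())
    lines = [rf"\documentclass{{{o.documentclass}}}"]
    lines += [rf"\usepackage{{{p}}}" for p in o.preload]
    lines.append(rf"\usepackage{{{package}}}")
    lines += [r"\begin{document}", "TEST", r"\end{document}"]
    return o.engine, "\n".join(lines) + "\n"
```

Every entry in such a table has to be added and kept up to date by hand, which is the manual maintenance referred to above.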

lvjr · 2024-02-23T14:19:21Z

> Well, your own ignorelist shows the problem with this approach: it contains, for example, citation-style-language and acro (for different reasons), which means that your test would not catch it if we broke these packages. You exclude packages which require a specific engine, you exclude packages which expect a specific class, you exclude packages which expect that some other package is loaded before (like tagpdf, newpax or hypcap) etc. So to make such a testsuite more meaningful (and once it is there package authors will request to make it more meaningful) you need exceptions---you already started that by handling beamer and fontspec differently. This all requires manual maintenance for which we don't have the bandwidth.

Could you please read the title of this issue again? I am not talking about testing ALL packages but MORE packages. Even testing 3953 packages is a large improvement, and the team could decide which packages must be included or excluded.

> Naturally nothing prevents you (or someone else) from setting up such a testsuite. You can pull in the newest latex sources, run the tests and notify us if you think that there is a problem with the format.

I have heard this kind of sentence several times.

josephwright · 2024-02-23T14:24:31Z

> Could you please read the title of this issue again? I am not talking about testing ALL packages but MORE packages. Even testing 3953 packages is a large improvement, and the team could decide which packages must be included or excluded.

The issue is that if there's a breakage, one has to decide what to do. Much of the time, issues arise which are highlighted by a LaTeX change but are not (formally) 'caused' by it, i.e. something was already broken compared to the formal API, etc. One then has to decide how to handle things. Excluding a 'broken' package is easiest but doesn't help (other than as a record), asking a dev to fix it may or may not be successful, fixing via some firstaid approach is sometimes necessary but doesn't scale well, etc.

lvjr · 2024-02-24T11:27:08Z

> The issue is that if there's a breakage, one has to decide what to do. Much of the time, issues arise which are highlighted by a LaTeX change but are not (formally) 'caused' by it, i.e. something was already broken compared to the formal API, etc. One then has to decide how to handle things. Excluding a 'broken' package is easiest but doesn't help (other than as a record), asking a dev to fix it may or may not be successful, fixing via some firstaid approach is sometimes necessary but doesn't scale well, etc.

I do not mean to make the team responsible for broken packages. My key point is that it is always an improvement to learn about breakages sooner.

And I don't expect much effort from the team in this direction. Minimal effort is enough:

- Set up the test suite and run it only when a new release is about to go out, or only manually.
- The test suite only outputs the list of failed packages and does not return errors.
- The team can spend one minute glancing at the failed list before uploading a new release.
- The team need not debug the failed list, but can decide to do so if it looks abnormal.
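A report-only run of this kind might look like the sketch below. The function names `format_report` and `main` are hypothetical; the summary line merely imitates the "number of failed packages" output shown earlier in this thread.

```python
from typing import Iterable


def format_report(failed: Iterable[str]) -> str:
    """List the failed package files, one per line, followed by a count."""
    names = sorted(failed)
    return "\n".join(names + [f"number of failed packages = {len(names)}"]) + "\n"


def main(failed: Iterable[str]) -> int:
    """Print the report and return the process exit status."""
    print(format_report(failed), end="")
    # Report only: failures never turn into a non-zero exit status,
    # so this run cannot block a release by itself.
    return 0
```

With this shape the CI job always finishes successfully, and the decision about what to do with the list stays with whoever reads it.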

I think spending one more minute will not slow down the development of the project, and it is good for all.

josephwright · 2024-02-27T13:31:47Z

The team discussed the general issue at today's meeting. We have a few possible approaches in mind, each with positive arguments in favour. We are going to work on this a bit more before making a decision on changes for the future.

josephwright · 2024-04-09T11:28:11Z

We have decided to go with a main/dev split for expl3, as is now done for LaTeX2e: I will sort that out shortly.

josephwright · 2024-04-27T07:25:41Z

We have moved ahead with the dev/main plan for l3kernel: see https://www.latex-project.org/news/2024/04/24/l3prog-dev/. As such, I'm closing here.

josephwright self-assigned this Apr 9, 2024

josephwright transferred this issue from latex3/latex2e Apr 9, 2024

josephwright closed this as completed Apr 27, 2024
