
providing different number of checkruns in a test suite #59

Closed
FrankMittelbach opened this issue Mar 3, 2018 · 33 comments

@FrankMittelbach
Member

I just wrote a bunch of tests that require 2 or even 3 checkruns before they produce the right result. Thus in the config I have to set checkruns=3, with the result that if I put them into a larger suite of tests the processing time goes up drastically (as 99% of the tests would come out right after a single run).

Now on fast machines that is not so much of an issue, but on this rather old desktop, for example, the 2e suite already runs for 20 minutes, so I really don't need another 10 or 40 minutes.

Proposal:

checkruns =  <number> | *

If *, then do a loop in the check target (up to 3 times maybe), comparing the log against the .tlg and rerunning if there are differences. Fail after 3 runs. This way the majority of tests would finish after 1 run, and only real failures would be rerun 1 or 2 times unnecessarily (but that's the exceptional case).

Of course that also requires a change in the save target, either by always running it 3 times (if * is set above) or by offering something like

texlua build.lua save --checkruns=2 tlb-foo 

The same could in principle also be offered in check, for locally overriding the config value for testing purposes.
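
To make the idea concrete, here is a rough sketch of the loop the check target could run for checkruns = * (illustration only, in Lua; runSingleCheck and logDiffersFromTlg are placeholder names, not actual l3build internals):

local maxruns = 3  -- upper bound when checkruns = "*"
local function checkWithReruns(name)
  for run = 1, maxruns do
    runSingleCheck(name)                  -- one engine run plus log normalisation
    if not logDiffersFromTlg(name) then
      return 0                            -- stable result: stop early
    end
  end
  return 1                                -- still differs after maxruns: a real failure
end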

@josephwright
Member

Perhaps the easiest way is if I add the ability to pick .lvt files in a config. One can then have two configs which differ only in checkruns: easier than trying to extend what is at present a simple syntax ...
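
For illustration, the second config could look like this (a sketch only: the means of selecting test files per config did not exist at this point, so the variable name includetests below is an assumption, and checkruns = 1 is assumed to remain the default elsewhere):

-- config-multiruns.lua (sketch)
checkruns    = 3
includetests = {"tlb-foo"}  -- only the tests that genuinely need several runs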

@josephwright josephwright self-assigned this Mar 3, 2018
@wspr
Contributor

wspr commented Mar 3, 2018 via email

@josephwright
Member

@wspr Could just be an appropriately-named .lua file, doesn't have to have an 'odd' extension.

@wspr
Contributor

wspr commented Mar 3, 2018

@josephwright Sure; it makes little difference whether you have l3test01.lvt alongside l3test01.l3b or l3test01-l3b.lua, but having a separate extension kind of ‘fits’ nicely.

@FrankMittelbach
Member Author

FrankMittelbach commented Mar 3, 2018 via email

@josephwright
Member

@FrankMittelbach At present, our configs are simple .lua files with no odd parsing. In the current set-up, checkruns is an integer. If we want to use the suggested syntax, we'd have to parse the entire config file separately to pick out the value. Assuming we don't want to do that, we might make it a string so we can parse just the value itself, but that still feels awkward to me ...

@davidcarlisle
Member

davidcarlisle commented Mar 3, 2018 via email

@blefloch
Member

blefloch commented Mar 3, 2018 via email

@FrankMittelbach
Member Author

@josephwright hmm. I wonder whether this approach would then be valid (and not dangerous):

 checkruns = <max-number>

runs the checks up to that maximum number of times, comparing the result after each run. It stops if either the results match or we have tried <max-number> times. For save, always do the maximum number of runs.

Offhand I can't really see a case where that is going to fail, i.e. matching the first time but not the second or third. I know one can construct such cases, but that would have to be done rather deliberately in my opinion.

But I think that @davidcarlisle's suggestion is even better. 0 doesn't quite work, as you may want to say how often to try.
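
In config terms that proposal is simply (a sketch of the intended semantics, not of existing behaviour):

checkruns = 3  -- run each test at most 3 times; stop as soon as the log matches the .tlg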

@josephwright
Member

I'll have to adjust the code to let us 'break out' of the loop to get to the .tlg comparison stage. I'd like to do #50 first (it's a big-ish job), then I guess I can look at this and #6. I'll flag for TL'18.

@josephwright josephwright added this to the TL'18 milestone Mar 5, 2018
@wspr
Contributor

wspr commented Mar 5, 2018

Hmmmm, what if you were testing that a certain condition didn't converge after a number of runs? I'm thinking something cross-ref related where you might expect a value after two runs but due to some misbehaving package still have ??. Possibly too contrived an example...

Secondly, I guess this isn't a major problem but this does slow down all legitimately failing tests, right? (Especially if not running --halt.) Unless you subdivide all tests that need multiple runs, but if you were willing to do that you wouldn't need the variable number of runs to start with.

I'm sorry to be contrary but if it were up to a vote I think I'd prefer a simple mechanism to indicate how many runs a given test should use. (Even if it were embedded somehow in the .lvt file?)

@FrankMittelbach
Member Author

FrankMittelbach commented Mar 5, 2018 via email

@wspr
Contributor

wspr commented Mar 5, 2018 via email

@blefloch
Member

blefloch commented Mar 6, 2018 via email

@wspr
Contributor

wspr commented Mar 6, 2018 via email

@FrankMittelbach
Member Author

FrankMittelbach commented Mar 6, 2018 via email

@FrankMittelbach
Member Author

FrankMittelbach commented Mar 6, 2018 via email

josephwright added a commit that referenced this issue Mar 21, 2018
At present this just covers .tlg-based tests.
@josephwright
Member

I've added some code for this but not documented it yet. I wonder if we should just enable this approach all of the time: typically the TeX run is slower than the comparison, and we are normally running only one check run, so there would be no impact. That avoids the entire need for a specialised interface.

I also need to get it working with PDF-based tests, but to be honest that entire area needs re-doing so I've not worried at present.

@blefloch
Member

blefloch commented Mar 21, 2018 via email

@FrankMittelbach
Member Author

my approach would have been to use the max number of reruns at save time (ie no breaking out earlier)

@josephwright
Member

On saving, I think the current logic should be OK. When you save, there are two cases:

  • There is no existing .tlg file: the comparison will always fail and the maximum number of runs will be applied
  • There is an existing .tlg, which will only match if we can safely break out of the loop

That said, I've not tried this out just yet: I thought I'd first see if the entire plan sounded any good. I can look at forcing 'no break out', but it's a bit tricky as the runtest() function doesn't actually know if the required target is check or save ...
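
One way to handle that (purely a sketch; the real runtest() signature and internals may differ) would be to pass the decision down as an extra argument, with logMatchesTlg as a placeholder for the comparison step:

-- Hypothetical: the caller decides whether early break-out is allowed.
local function runtest(name, engine, breakout)
  for run = 1, checkruns do
    -- ... run the engine and normalise the log ...
    if breakout and logMatchesTlg(name, engine) then
      break  -- "check": the result is already stable, stop here
    end
  end
end

-- check target: runtest(name, engine, true)
-- save target:  runtest(name, engine, false)  -- always run in full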

@josephwright
Member

I've done some (simple) tests and I'm reasonably confident that there should be no issue with the save target ...

@blefloch
Member

blefloch commented Mar 21, 2018 via email

@josephwright
Member

@blefloch I'd not thought of that case: I guess I will need to work out how best to handle it. All doable of course.

More general thoughts on leaving the interface unchanged? I wondered if we should have a boolean to turn this on-and-off: optimisecheckruns or similar?

@FrankMittelbach
Member Author

FrankMittelbach commented Mar 21, 2018 via email

@blefloch
Member

blefloch commented Mar 21, 2018 via email

@FrankMittelbach
Member Author

FrankMittelbach commented Mar 21, 2018 via email

josephwright added a commit that referenced this issue Mar 22, 2018
This means that the loop always runs in full for "save".
@josephwright
Member

The latest change means that treating checkruns as a maximum only applies when checking, not when saving. If people could test it out, or at least have a look over the commits, that would be great! I'll document once it's clear this is the desired behaviour.
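
In other words, with checkruns = 3 in a config the two targets now behave differently (assuming the behaviour described above):

texlua build.lua check         -- each test is rerun only until its log matches the .tlg
texlua build.lua save tlb-foo  -- all 3 runs are always performed before the .tlg is written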

@josephwright
Member

I'm calling this fixed ...

@FrankMittelbach
Member Author

I think so ... but my machine is still too slow :-(

@josephwright
Member

@FrankMittelbach Comes down to how many tests one decides to run: we could only pick 'obvious' ones for multiple engines, but in the past it's been the non-obvious ones that have been problematic.

@FrankMittelbach
Member Author

No, it comes down to this being an inexpensive machine that by now is really old, so ... it is simply slow. It is always the non-obvious tests that show errors, so there is a good reason to run them all (I'm not complaining).

@josephwright
Member

@FrankMittelbach One reason people use a branch-and-pull-request workflow: you check stuff in on a branch, the CI does the tests, you only merge when they pass ;)
