Specified Regression CIs #1635

Closed
10 tasks
pitag-ha opened this issue Jun 26, 2023 · 1 comment
Comments

@pitag-ha
Member

pitag-ha commented Jun 26, 2023

The General Behavior CI (see #1634) needs some more thought before it can scale: it compares the whole Merlin output, and for some samples that output is very large. As lower-hanging fruit, we can first implement a concrete Behavior Regression CI.

Purpose

Some concrete changes in behavior can almost certainly be considered regressions when they occur. Two examples:

  • Merlin now errors where before it would succeed (error-regression CI).
  • The Merlin server now crashes where before it wouldn't (crash-regression CI).

So, in addition to the General Behavior CI #1634, we can add concrete Regression CI workflows that monitor these specific changes. Keeping the output this simple will make the CI easier to scale in terms of size and will help avoid missing concrete regressions in behavior.
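To make the two regression notions concrete, here is a minimal sketch of the comparison they imply. It is not how merl-an implements `error-regression`; the outcome labels (`"ok"`, `"error"`, `"crash"`), the `(file, sample id, query)` key, and both function names are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical key identifying one Merlin call: (file, sample id, query name).
SampleKey = Tuple[str, int, str]
# Hypothetical outcome labels: "ok", "error" or "crash".
Outcomes = Dict[SampleKey, str]


def error_regressions(before: Outcomes, after: Outcomes) -> List[SampleKey]:
    """Samples that succeeded before the change and return an error after it."""
    return sorted(
        key for key, old in before.items()
        if old == "ok" and after.get(key) == "error"
    )


def crash_regressions(before: Outcomes, after: Outcomes) -> List[SampleKey]:
    """Samples that did not crash before the change and crash after it."""
    return sorted(
        key for key, old in before.items()
        if old != "crash" and after.get(key) == "crash"
    )
```

The point of the sketch is that each sample is reduced to a single label, so the data to compare stays small no matter how large the full Merlin output is.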

Implementation

Let's focus on the error-regression CI for now.

merl-an already has a command merl-an error-regression for the error-regression CI. What's missing is the integration into a CI workflow.

Data to take into account

For the first and most direct PoC, @3Rafal has already gathered data:

  • Running 7 different queries, each on 1 sample per file, on the whole of Irmin (700-800 files), in CI: 38m 12s
  • Running 7 different queries, each on 30 samples per file, on the whole of Irmin (700-800 files), locally on a laptop with an 11th-gen Intel i7: 203m 8.524s

These timings are for the unoptimized PoC. We can reduce the CI time on several levels (see below).
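As a rough back-of-the-envelope reading of the two timings above, the following sketch converts each total into an average time per Merlin query. The file count of 750 is an assumption (the midpoint of the stated 700-800 range), and the function name is hypothetical. The two runs were made on different machines, and the totals may include set-up time, so the results are only indicative.

```python
def seconds_per_query(total_seconds: float, queries: int,
                      files: int, samples_per_file: int) -> float:
    """Average wall-clock time per Merlin call, including any set-up overhead."""
    return total_seconds / (queries * files * samples_per_file)


ASSUMED_FILES = 750  # assumption: midpoint of the reported 700-800 files

# CI run: 7 queries, 1 sample per file, 38m 12s in total.
ci_run = seconds_per_query(38 * 60 + 12, 7, ASSUMED_FILES, 1)

# Laptop run: 7 queries, 30 samples per file, 203m 8.524s in total.
laptop_run = seconds_per_query(203 * 60 + 8.524, 7, ASSUMED_FILES, 30)

print(f"CI run:     {ci_run:.3f} s per query")
print(f"Laptop run: {laptop_run:.3f} s per query")
```

Under these assumptions the averages come out at roughly 0.44 s per query for the CI run and roughly 0.08 s per query for the laptop run. The lower per-query cost with more samples per file is compatible with fixed costs being amortized over more queries, but the differing hardware means the numbers cannot establish that on their own.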

Next action points

@3Rafal has already written a PoC for the error-regression CI. There are several things we'd need to improve to make the CI more useful. The next action points are:

  • Discuss and decide if these CIs are really useful and worth the effort.
    • For the error-regression CI: Find out if there have been any changes in the past that made Merlin return an error where before it would return successfully. If so, which ones? And how likely is it that this will happen again in the future?
    • Similarly for the crash-regression CI: Have there been any changes in the past which made Merlin crash where before it wouldn't? If so, which ones?
  • If we think it might be worth it, optimize the CI in terms of time:
    • On the CI-side: e.g. cache the set-up.
    • On the merl-an-side: Make sure we use the Merlin cache as much as possible (e.g. if the traversal is done in query -> file order, do it in file -> query order instead).
    • On the sample-side:
      • For the Merlin cache, it is better to have many samples in few files than few samples in many files.
      • Do we want to run it on all mli-files as well?
  • Have a look at how long the optimized CI takes and decide a good sample set and workflow for it.
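The traversal-order point above can be illustrated with a toy model. The sketch below assumes a cache that only remembers the most recently processed file. That is a deliberate simplification for illustration, not a description of Merlin's actual caching, and the file names, query names, and function name are all hypothetical.

```python
from typing import Iterable, Optional, Tuple


def count_cache_misses(jobs: Iterable[Tuple[str, str]]) -> int:
    """Count how often the file changes between consecutive (file, query) jobs.

    Toy assumption: the cache holds exactly one file, the last one processed.
    """
    misses = 0
    cached_file: Optional[str] = None
    for file, _query in jobs:
        if file != cached_file:
            misses += 1
            cached_file = file
    return misses


files = ["a.ml", "b.ml", "c.ml"]  # hypothetical sample files
queries = ["q1", "q2"]            # hypothetical query names

# query -> file order: the outer loop runs over queries.
query_then_file = [(f, q) for q in queries for f in files]
# file -> query order: the outer loop runs over files.
file_then_query = [(f, q) for f in files for q in queries]

print(count_cache_misses(query_then_file))
print(count_cache_misses(file_then_query))
```

In this model, query -> file order switches files on every job, while file -> query order switches once per file. The same reasoning is why many samples in few files would be friendlier to a cache than few samples in many files.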
@pitag-ha pitag-ha changed the title Behavior-regression CI Concrete-regression CIs Jul 3, 2023
@pitag-ha pitag-ha changed the title Concrete-regression CIs Specified Regression CIs Jul 3, 2023
@pitag-ha
Member Author

Implemented in #1716
