Proposal: Lock module version in lookup.json #407
Comments
Yeah, moving targets are never fun. I'm +1 on this. |
@MylesBorins I'd be interested to know your thoughts on this. I'm +1 on this. If we can encourage module authors to update their modules here then that would be an added bonus. |
I'm still very -1 on this. The whole point of citgm is to find failures. We have no clear mechanism to upgrade modules, or know when they need to be upgraded. It is inconvenient when they break, but isn't that kind of the point? |
@MylesBorins ... how would you feel about an approach that used two lookup tables in CI... one with a last-known-good configuration of modules, and one with the current-set? Running differentials between those could help us track down regressions quite easily. |
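If it helps make that concrete, here is a minimal sketch of such a differential, assuming each CI run dumps its results as a JSON map of module name to "pass"/"fail" (the file format and script are hypothetical, not something citgm produces today):

```js
// diff-results.js - compare a last-known-good citgm run against the current one.
// Usage: node diff-results.js lkg-results.json current-results.json
// Both files are assumed (hypothetically) to map module names to "pass" | "fail".
'use strict';
const fs = require('fs');

const [lkgFile, currentFile] = process.argv.slice(2);
const lkg = JSON.parse(fs.readFileSync(lkgFile, 'utf8'));
const current = JSON.parse(fs.readFileSync(currentFile, 'utf8'));

for (const name of Object.keys(current)) {
  const before = lkg[name];
  const after = current[name];
  if (before === 'pass' && after === 'fail') {
    console.log(`REGRESSION: ${name} passed with the last-known-good set but fails with the current set`);
  } else if (before === 'fail' && after === 'pass') {
    console.log(`FIXED: ${name} failed with the last-known-good set but passes now`);
  } else if (before === undefined) {
    console.log(`NEW: ${name} has no last-known-good result`);
  }
}
```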
Tbqh I'm still not sure that is super useful, and it sounds like quite a bit of work to implement / maintain. We should likely focus our efforts on finding out why our CI infra has false negatives that don't exist on other machines. |
I am kind of with @MylesBorins on this: the whole point of citgm is that we catch module failures as soon as they are released. I agree that having two lookup tables could perhaps work, but it could get messy, as you may have different versions being compatible with different platforms? |
OK, so I'm thinking about this a bit more, and thinking specifically about the high-level problem: people don't want to use CITGM because of the number of false negatives, and because it is not intuitive to read the results.
In the past I had been more on top of keeping the flakiness of the lookup table up to date. I would run CITGM on each release line, review the results, and update the lookup as appropriate. An obvious solution would be to make this a task that people sign up for (or possibly an expected part of using CITGM if you experience flakiness).
Another problem is knowing whether something is truly flaky. @gdams had started work on a stress test feature for CITGM that would make it easier for us to run a module multiple times to figure out just how flaky it is. Perhaps we can even find a way to automate the above process, or make it far easier to verify whether a module is flaky (for example, multiple failures could auto-update the lookup).
@BethGriggs thanks for bringing this up! Obviously there is pushback from the collaborators on using this tool regularly, and we should definitely work on figuring out how to make it more user friendly. |
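The stress-test feature referenced above is not shown here; the following is only a rough sketch of the idea of estimating flakiness by repetition. The script and its interface are hypothetical; `citgm <module>` is the real CLI invocation.

```js
// flaky-check.js - rough sketch: run citgm against a single module several times
// and report how often it fails, as a crude flakiness estimate.
// Usage: node flaky-check.js <module-name> [runs]
'use strict';
const { spawnSync } = require('child_process');

const moduleName = process.argv[2];
const runs = Number(process.argv[3]) || 5;
let failures = 0;

for (let i = 1; i <= runs; i++) {
  // citgm exits non-zero when the module's test suite fails.
  const result = spawnSync('citgm', [moduleName], { stdio: 'inherit' });
  if (result.status !== 0) failures++;
  console.log(`run ${i}/${runs}: ${result.status === 0 ? 'pass' : 'fail'}`);
}

console.log(`${moduleName}: ${failures}/${runs} runs failed`);
// A mix of passes and failures suggests flakiness; consistent failures suggest a real break.
```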
I disagree pretty strongly with this. The whole point of CitGM is to smoke-test whether Node.js core changes break the community. The purpose of CitGM isn't to fix module issues; the modules we test have their own CI.
CitGM is not useful unless it's green. I don't think most breaking changes will be discovered only on the absolute latest module versions, so being a month or two (at most) behind the latest module should be fine (if anything, older module versions probably use more deprecated features).
Implementing should just be a version in the lookup.json.
Sure, but we have to do that anyway; this is just one thing we can do to try to insulate people running CitGM on their PRs in core from having to deal with transient module issues. What I tend to see is someone running CitGM, getting a bunch of failures, and then pinging @nodejs/citgm to ask whether they're expected, and I don't think we've been very good at answering those people. IMO the possible reasons we might not want to do this are:
|
When I run CITGM on a release I care that we are testing against what people will get when they run npm install. |
I would assume the vast majority of installs come from a module range specified in a package.json. |
Okay, but if it is broken we just skip it in the lookup. It's not like we hold the release until all the modules are fixed. If we had a mechanism to automatically update the lookup, raise a PR, and run CI on the latest release (e.g. |
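A sketch of what such an update mechanism could look like, assuming the proposed (hypothetical) version field existed in lookup.json; opening the PR and triggering CI are left out:

```js
// update-lookup-versions.js - sketch of automatically refreshing pinned versions:
// look up the latest published version of each module on the npm registry and
// write it into a hypothetical "version" field in lookup.json.
// (The "version" field is the proposal in this issue, not an existing citgm option.)
'use strict';
const fs = require('fs');
const { execFileSync } = require('child_process');

const lookupPath = './lookup.json';
const lookup = JSON.parse(fs.readFileSync(lookupPath, 'utf8'));

for (const name of Object.keys(lookup)) {
  // `npm view <pkg> version` prints the latest published version.
  const latest = execFileSync('npm', ['view', name, 'version'], { encoding: 'utf8' }).trim();
  if (lookup[name].version !== latest) {
    console.log(`${name}: ${lookup[name].version || '(unset)'} -> ${latest}`);
    lookup[name].version = latest;
  }
}

fs.writeFileSync(lookupPath, JSON.stringify(lookup, null, 2) + '\n');
// A real job would then commit the change, open a PR, and kick off a CI run.
```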
@gibfahn I think we can definitely introduce a more aggressive policy for skipping modules. Technically at this point any collaborator can make changes to the lookup without requiring any sign off. What we really need is regular audits of the release lines with CITGM, where we ignore flakes and keep the lookup up to date. This could potentially be automated |
#321 is a start to keeping our lookup up to date |
IMHO multiple tables make sense: a sparse one for the general case, and a locked-version one for each release line. This would allow us to differentiate our regressions from modules' regressions. |
I'm a huge -1 on multiple tables. lookup is the baseline and it is what we run off of. A separate table does not solve my concerns above regarding not keeping up to date... it simply doesn't enforce them in the main citgm. I think the wiki has been a really good first take fwiw |
Thanks! |
I personally think that CITGM would have a much higher value if it were green on average, even if the versions tested are not all up to date. The point is that they should still reflect most of the user base even if they are older. Let us get more modules in and have a better guarantee because of that, not because of the newest version. We should use a lookup table for the newest versions though, e.g. once every two months. |
I agree we need to figure out how to let more people run this usefully. To be useful it needs to be green most of the time when people need/want to run it. It would be interesting to see whether a version with fixed module versions would be more stable or not. Running something like a "stable" and a "canary", where new versions of modules get promoted from "canary" to "stable" once they have passed for some period of time, is a model that might balance the need to test new versions of modules while also allowing the tool to be more broadly used. |
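A rough sketch of that promotion rule, assuming a hypothetical record of how long each module's canary version has been passing (none of these data shapes exist in citgm; this only illustrates the policy):

```js
// promote.js - sketch of a "canary" -> "stable" promotion policy: the canary
// version becomes the pinned stable version once it has been passing
// continuously for a minimum number of days. The entry shape is hypothetical.
'use strict';

const PROMOTION_WINDOW_DAYS = 14;

function promote(entry, now = Date.now()) {
  // entry: { stableVersion, canaryVersion, canaryPassingSince } (timestamps in ms)
  const { canaryVersion, canaryPassingSince } = entry;
  if (!canaryVersion || !canaryPassingSince) return entry;

  const daysPassing = (now - canaryPassingSince) / (1000 * 60 * 60 * 24);
  return daysPassing >= PROMOTION_WINDOW_DAYS
    ? { ...entry, stableVersion: canaryVersion }
    : entry;
}

module.exports = { promote };
```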
I once again am going to push back on this. I am the person doing the majority of the work with CITGM and the majority of the updates to the lookup.json, and I am one of the primary people using the tool day to day. I am not convinced that we would keep the lookup table up to date, and am concerned that it would fall back on me to maintain it. We do get failures, and it is a mixed bag of reasons; currently the majority of failures have been due to flakes or infra issues on our side. In fact, the 6.x line is for the most part green on citgm. All of that being said, if someone is willing to do the work to implement installing specific versions I am willing to give it a try to see if it improves things; landing commits and testing the workflow is not a massive burden. One thing to keep in mind is that we should likely support tagged majors only, as a non-trivial number of npm modules are installed in a way that auto-updates to the latest minor. |
Yeah, this is definitely the core issue. I would say that I think the maintenance burden is likely to be the same either way (modules break pretty frequently in my experience), especially if we could get some automation. However we might also fall into a false sense of security, where we think everything's fine because some ancient version of a module is green, but actually latest is broken (and we just never update the lookup version because the latest one is broken).
Alternative proposal: how about we do something more along the lines of having an 'lkgr' version in the lookup. First run tests on latest, and if the tests fail there then fall back to the lkgr. That way we should be able to quickly see that:
- Fail, Pass => Module broken by module update
- Fail, Fail => Module broken by node update
Obviously it won't work if the module is flaky, but nothing works if the module is flaky, so 🤷♂️ . |
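A minimal sketch of that fallback decision, with runModule() standing in for however citgm would actually execute a module's tests at a given version (the function and its signature are assumptions, not citgm API):

```js
// lkgr-fallback.js - sketch of the proposed flow: test the latest version first,
// and only if that fails retry the pinned last-known-good (lkgr) version so the
// failure can be classified.
'use strict';

async function classify(name, lkgrVersion, runModule) {
  const latestPassed = await runModule(name, 'latest');
  if (latestPassed) {
    return 'pass';                  // nothing to investigate
  }
  const lkgrPassed = await runModule(name, lkgrVersion);
  return lkgrPassed
    ? 'module-regression'           // Fail, Pass => module broken by module update
    : 'node-regression';            // Fail, Fail => module broken by node update
}

module.exports = { classify };
```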
I like the proposal of making this a fallback rather than a default. How would you propose we keep the lookup updated? How do we handle lkgr being different across release lines and platforms? |
Ideally we'd have
For release lines we might need multiple versions sometimes. |
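For illustration, one shape the per-release-line pins could take; the lkgr block is hypothetical (not an existing lookup.json field), the npm flag mirrors existing entries, and the version numbers are made up:

```json
{
  "lodash": {
    "npm": true,
    "lkgr": {
      "v4.x": "4.16.6",
      "v6.x": "4.17.4",
      "v8.x": "4.17.4"
    }
  }
}
```

Per-platform pins would multiply this again, which is where the complexity concern in the next comment comes from.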
this leaves me with a feeling that keeping track of this stuff is going to explode in complexity |
Proposal:
There is much more to consider ( |
So fwiw I'm still not 100% sold on this, but I really like the approach. The LKGR approach is particularly interesting, but I do fear the complexity of adding platforms / versions and the lookup exploding in size. One idea I was kicking around was updating the CI job to run both the No-Build job and the Latest job and compare the differences. TBH most of the failures we see are flakes / timeouts that are specific to infra... it is really hard to lock those down |
As we test more modules in CitGM, triaging failures does not seem to be scaling.
With any failure we need to work out whether it was caused by a change in Node.js core, a change in the module (or its test suite), or flakiness/infrastructure issues.
To make life easier I would suggest that we lock each module's version in lookup.json (with appropriate flaky tags if necessary). This means that instead of constantly having to stay on top of all of the module / module test suite updates and regularly having to mark some of these tests as flaky, we could do this in batches at a regular interval.
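For reference, a sketch of what a locked entry might look like: the version field is the hypothetical addition proposed here, the npm flag mirrors existing lookup.json entries, and the module and version number are purely illustrative:

```json
{
  "express": {
    "npm": true,
    "version": "4.16.2"
  }
}
```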