Fix erroneous behaviour in LinuxMaintainers #41
Comments
Affected subsystem: ISDN/mISDN section |
I can try to solve it |
ping :-) |
Look into the mail thread here: https://lore.kernel.org/lkml/08d88848280f93c171e4003027644a35740a8e8e.camel@perches.com/ |
Uff, okay, I guess I understood the issue. Argh, yikes. Let me just summarise to see if I understood correctly: As MAINTAINERS puts it:
But it appears that this is not the whole truth. Assume the following entry:
arch is in fact a directory, and there's a file called arch/bla.c. get_maintainers.pl will assign that file to the TEST section, while PaStA won't. Oh boy, this is going to be ugly to implement. In particular, it is going to be real fun if this case is intermixed with wildcards. E.g.:
I just tested it: get_maintainers.pl will assign @bulwahn could you please verify this statement: Pia, let me explain the issue in detail in a short call... |
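The directory case described above can be sketched in Python. This is only a minimal model under stated assumptions, not the code of get_maintainers.pl or of PaStA's LinuxMaintainers.py: `matches`, `glob_to_regex` and `known_dirs` are hypothetical names, and the rule that a pattern naming an existing directory behaves as if it had a trailing slash is inferred from the behaviour reported in this thread.

```python
import re


def glob_to_regex(pattern):
    """Translate a MAINTAINERS-style glob into a regular expression string."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def matches(path, pattern, known_dirs):
    """Return True if `path` falls under the file pattern `pattern`.

    Assumption (inferred from the thread): a pattern without a trailing
    slash that names an existing directory is treated like "<pattern>/".
    `known_dirs` is a hypothetical set of directories in the tree.
    """
    if not pattern.endswith("/") and pattern in known_dirs:
        pattern += "/"

    if re.match(glob_to_regex(pattern), path) is None:
        return False

    if pattern.endswith("/"):
        # Directory pattern: everything below the directory matches.
        return True

    # File pattern: only match at the same directory depth.
    return path.count("/") == pattern.count("/")


# "arch" is a directory in the tree, so arch/bla.c belongs to the section.
print(matches("arch/bla.c", "arch", known_dirs={"arch"}))
# Without that knowledge, the same pattern only matches a file named "arch".
print(matches("arch/bla.c", "arch", known_dirs=set()))
```

The point of the sketch is that the result depends on the state of the tree, not only on the text of MAINTAINERS, which is why the same entry can be assigned differently by two implementations.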
And by the way, this kind of issue raises a whole new class of problems: how can we be sure that PaStA's implementation aligns with get_maintainers.pl for any patch at any point in time? We work on historical data, and people used get_maintainers.pl as it was at the time of writing their patch. I'm pretty sure that some changes to get_maintainers.pl changed the assignment of sections without any actual change to MAINTAINERS. Our current assumption is: get_maintainers.pl tells the truth, as it is the tool of choice for developers. I see no real possibility that PaStA's implementation will reflect get_maintainers.pl's output 100% for any patch at any point in time. But we need PaStA's implementation because we want to do bulk analysis, and get_maintainers.pl is horribly slow. This is why Pia's compare_getmaintainers.pl gains importance: if we want to use PaStA's LinuxMaintainers.py to argue about the behaviour of authors, we need to ensure that our output is "good enough" to reflect the reality of get_maintainers.pl |
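One way to put a number on "good enough" is an agreement rate between the two tools. The following is a rough sketch, not what compare_getmaintainers.pl actually does: `agreement_rate`, `disagreements`, the patch identifiers and the section names other than TEST are hypothetical, and the per-patch section sets are assumed to have been collected beforehand.

```python
def agreement_rate(reference, candidate):
    """Fraction of patches for which both tools report the same sections.

    `reference` and `candidate` map a patch identifier to a frozenset of
    section names (e.g. reference = get_maintainers.pl, candidate = PaStA).
    """
    if not reference:
        raise ValueError("no patches to compare")
    same = sum(
        1
        for patch, sections in reference.items()
        if candidate.get(patch, frozenset()) == sections
    )
    return same / len(reference)


def disagreements(reference, candidate):
    """Sorted list of patch identifiers on which the two tools differ."""
    return sorted(
        patch
        for patch, sections in reference.items()
        if candidate.get(patch, frozenset()) != sections
    )


# Hypothetical sample data, for illustration only.
reference = {
    "patch-1": frozenset({"TEST", "OTHER"}),
    "patch-2": frozenset({"OTHER"}),
    "patch-3": frozenset({"TEST"}),
    "patch-4": frozenset({"OTHER"}),
}
candidate = {
    "patch-1": frozenset({"OTHER"}),  # the directory case: TEST is missing
    "patch-2": frozenset({"OTHER"}),
    "patch-3": frozenset({"TEST"}),
    "patch-4": frozenset({"OTHER"}),
}

print(agreement_rate(reference, candidate))
print(disagreements(reference, candidate))
```

Listing the disagreeing patches alongside the rate matters more than the rate itself, since each one points at a concrete rule where the implementations diverge.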
Another hypothetical scenario: in the future, we might want to decide whether a maintainer was correctly addressed by a patch. We already tried this with Sebastian's work and realised that this question is hard to answer and strongly depends on how we define "correctly addressed"; we had a lot of discussions on that topic. At the moment, if we want to check whether a patch was correctly addressed, we take its date, try to assign it roughly to the kernel version "at that point in time", and compare recipients vs. maintainers. But who says that there were no changes to MAINTAINERS or get_maintainers.pl in the meantime? Those files evolve as well. We need to be extremely careful with the choice of metrics. For example, a question that is pretty easy to answer with high accuracy is: "Was a patch shot completely off target?" E.g., a mail patches a soundcard and was sent to KVM. The opposite question, "Was a patch correctly addressed?", is hard to answer, and we (probably) have bad accuracy due to the metrics. But we can answer "What share of patches were not completely off target?" with high accuracy if we ask the question "Was a patch completely off target?" for all patches. |
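The "completely off target" question could be sketched as follows. The definition used here is an assumption for illustration, not an agreed metric: a patch counts as off target when its recipients share no address with the addresses expected for the files it touches. `is_off_target`, `share_not_off_target` and all addresses are hypothetical.

```python
def is_off_target(recipients, expected):
    """True if no recipient is among the expected addresses.

    Assumed definition: "completely off target" means an empty overlap
    between actual recipients and expected maintainers/lists.
    """
    actual = {addr.lower() for addr in recipients}
    wanted = {addr.lower() for addr in expected}
    return not (actual & wanted)


def share_not_off_target(patches):
    """Fraction of (recipients, expected) pairs that hit at least one target."""
    if not patches:
        raise ValueError("no patches to evaluate")
    hits = sum(1 for rcpt, exp in patches if not is_off_target(rcpt, exp))
    return hits / len(patches)


# Hypothetical sample data, for illustration only.
patches = [
    # Sound patch sent to the sound list: on target.
    (["sound-list@example.org"], ["sound-list@example.org", "alice@example.org"]),
    # Sound patch sent only to the KVM list: completely off target.
    (["kvm-list@example.org"], ["sound-list@example.org", "alice@example.org"]),
    # Partially addressed patch: not off target under this definition.
    (["Bob@example.org"], ["bob@example.org", "net-list@example.org"]),
]

print(share_not_off_target(patches))
```

Because the check only requires a non-empty overlap, small differences between MAINTAINERS versions or tool versions rarely flip the result, which is what makes this question answerable with higher accuracy than "correctly addressed".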
Might be fixed with 40f11f2. Requires further testing. |
Lukas: