The AI help button is very good but it links to a feature that should not exist #9230
Comments
|
I just want to agree with this report wholeheartedly. The use of large language models to offset labor is problematic enough, but doing so when those LLMs do not even consistently produce reasonable or correct output is utterly unconscionable. MDN is supposed to be a simple, authoritative source for the web platform; with the introduction of "AI Help", you're throwing that reputation away. I never would have imagined I'd be recommending w3schools over MDN to new programmers, but that's where we are today. I'm a long-time Firefox user. I've worked with Mozillans in the past, including on the 2nd edition of Programming Rust. I know you're decent people; do the right thing and ditch the AI bullshit. |
|
To provide some context here about the relationship of OWD to MDN and about my own role in all this: OWD funds the work of a group of writers, whose names you can find at https://openwebdocs.org/team/#writers — and the funding for OWD itself is organized through an Open Collective, which has a formal Team, the names of whose members you can find under the Team tab at https://opencollective.com/open-web-docs#section-contributors. While I am among the 150+ individual people who have donated to OWD, I am neither formally one of the OWD writers nor formally one of the OWD Team members. To be clear on my actual role: I’m one of the core reviewers/maintainers who have push/merge access to the https://github.com/mdn/content/ repo (the content of MDN) doing reviews of incoming PRs and otherwise contributing to the repo. The set of core reviewers/maintainers includes the OWD writers, but it also includes some writers who work for Mozilla, and includes me and some others who are neither formally OWD writers nor writers from Mozilla. See https://github.com/orgs/mdn/teams?query=@sideshowbarker for the list of relevant GitHub teams I belong to, and https://github.com/mdn/content/pulls?q=reviewed-by:sideshowbarker for the reviews I’ve done (3858 so far) and https://github.com/mdn/content/graphs/contributors to see my own commits (and those of other contributors). And FWIW here I’ll mention that I actually also have push/merge access to the Yari repo at https://github.com/mdn/yari/ repo, which has the source code for the platform on which MDN runs — including code for things like the “AI Explain” button, but also code for all kinds of good things that aren’t controversial at all. I am not a core Yari reviewer/maintainer, but I have actually done reviews there (20 so far), as shown in https://github.com/mdn/yari/pulls?q=is:pr+reviewed-by:sideshowbarker — in cases where it has made sense for me to review — and commits (42 so far), as shown in https://github.com/mdn/yari/commits?author=sideshowbarker. |
|
I do not believe there is currently a theoretical framework for making statistical text generation distinguish the truth of a statement, so there is no likelihood of this being fixed by any anticipated development based on the current technology. |
This comment was marked as outdated.
This comment was marked as outdated.
|
I'm not sure if this is the right implementation, but the idea is on the right track. A better way to implement this with large language models, should that be desired, would be to have an AI generate a number of options for knowledgeable human technical writers to pick from and compose into an answer, ensuring the result is technically accurate while still using large language models to assist the creative flow. I think the current implementation is wishful thinking at best, and I am sad to see such a critical web resource fall prey to hype cycles that cut out the best part of MDN: the technical writers. Hopefully what I proposed is a viable middle path. |
As far as I can tell, the framework AI Help is using is described here. Basically, it feeds the posts to a search engine and then uses the search engine to make sure at least one relevant MDN doc can be surfaced to the LLM before it outputs anything. The idea seems to be "well, AIs are better at summarizing text and doing question/answer tasks about specific passages than they are at answering questions off the cuff," which I think is probably true. (Does this work? I don't know. When I tried the tool, I was trying to trick it, and it mostly just told me "I can't answer that," which I suspect means it was falling over at the search engine step.) I would say this is actually really close to the model AI Explain used, so I would expect it to produce similar mistakes. From talking to Augner, it sounds like Augner doesn't believe any examples taken from AI Explain are representative of likely weaknesses in AI Help, which is surprising to me, but that appears to be their current position. Overall, I think an affirmative case for "AI would be good at this task" is still missing. Augner wants an affirmative case that it won't work, I want an affirmative case that it will, so we're basically talking past each other. |
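To make that described flow concrete, here is a minimal, hypothetical sketch of a retrieval-gated pipeline of the kind that write-up describes; the `searchIndex` and `llm` objects and their methods are illustrative assumptions, not the actual Yari code:

```js
// Hypothetical sketch of a retrieval-gated answer flow.
async function aiHelp(question, searchIndex, llm) {
  // 1. Run the user's question through a search over the MDN corpus.
  const docs = await searchIndex.search(question, { limit: 3 });

  // 2. If no relevant MDN page can be surfaced, refuse to answer.
  if (docs.length === 0) {
    return { answer: "I can't answer that.", sources: [] };
  }

  // 3. Otherwise, hand the retrieved page text to the LLM as context and
  //    instruct it (in prose) to answer using only that context.
  const context = docs.map((d) => d.text).join("\n---\n");
  const answer = await llm.complete(
    `Answer the question using only the MDN content below.\n\n${context}\n\nQuestion: ${question}`
  );

  return { answer, sources: docs.map((d) => d.url) };
}
```

Note that nothing in step 3 verifies the output against the retrieved pages; the gate only controls whether an answer is attempted at all, which is consistent with the "falling over at the search engine step" behaviour described above.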
|
AI Help works very differently than AI Explain. We restrict the LLM to base its answers on the context we give it (which is actual, up-to-date MDN content), so you won't see the issues AI Explain was exhibiting. A basic helpful answer (to the following question) would be:
To detect if you are in offline mode, you can use the `navigator.onLine` property. Here is an example of how you can use `navigator.onLine`. In this example, if `navigator.onLine` is `false`, the user is offline. Please note that if the browser does not support the `navigator.onLine` property, … If you want to listen for changes in the network status, you can use the `online` and `offline` events.
By adding event listeners for the `online` and `offline` events, you can react whenever the network status changes. MDN content that I've consulted that you might want to check: |
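For concreteness, a minimal sketch of the approach that answer describes (an illustration, not the answer's own example code): check `navigator.onLine` for the current status and listen for the `online`/`offline` events for changes.

```js
// Minimal sketch: report the current network status and react to changes.
// Note: navigator.onLine === true only means the browser has a network
// connection, not that the internet (or MDN) is actually reachable.
function reportNetworkStatus() {
  if (navigator.onLine) {
    console.log("You appear to be online.");
  } else {
    console.log("You are in offline mode.");
  }
}

reportNetworkStatus();
window.addEventListener("online", reportNetworkStatus);
window.addEventListener("offline", reportNetworkStatus);
```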
|
@fiji-flo it looks like the links you've provided are broken. (maybe the urls were relative?) My main concern is that AI Help does not have a place in technical documentation. Yes, in theory it could help out a few people, but the target audience it seems to aim for (new developers, or someone unfamiliar with the concept they are trying to learn about), coupled with our current understanding and research about LLMs (in a nutshell: they can confidently present inaccurate information), seems to be a hugely concerning mismatch. You need someone to fact-check the response from an LLM; a four-eyes principle is often applied to technical docs (one writer and at least one reviewer), which is missing here. Therefore, there is a significantly increased risk that the LLM provides wrong information to someone not knowledgeable enough about the subject to tell whether the AI is confidently providing misinformation or is actually accurate. How does the team behind AI Explain hope to alleviate this concern, beyond plastering the user with warnings (which might be a hint that this is not a product-market fit)? |
|
Here's another helpful answer for the following question about a brand-new web feature:
The … While … In … Using … Here is an example of using the … In this example, the … Overall, … MDN content that I've consulted that you might want to check: |
|
@Zarthus Both this and that comment respond to @nyeogmi who requested positive examples of answers produced by AI Help:
|
@caugner: If that was the essence of what the contributors of AI Explain and AI Help have taken away from this issue, and is their official response to it, I shall pardon myself from this thread. |
|
This is honestly quite embarrassing. I've been a vocal proponent of Mozilla, their products, and MDN for quite a long time. Seeing the consistent non-acknowledgment of perfectly valid, calmly laid out reasoning against this feature in its current state is disheartening. If Mozilla is set on its current path and will refuse to bend to criticism on this feature, at least do the service of outright saying so - then we can all stop wasting our time. |
|
I really really didn't want to be part of this discussion. But if people are worried about this feature producing convincing but inaccurate/wrong/misleading output (which LLMs are known to do), providing examples of correct output will not convince them. That only proves that the LLM is capable of being correct and useful (which I don't think anyone has disputed). Not that it is likely to be correct most of the time. Nor that it will not provide really bad results some of the time. Nor does it address the issue that users may not be able to tell these cases apart. It's really easy to create an algorithm that produces correct output some of the time, or even most of the time, but that fails spectacularly in some (edge) cases. That may be acceptable if it's clear beforehand when it will fail, so that people can avoid the edge cases, or if it's easy to tell when it has failed. But algorithms are a lot more predictable than LLMs. You can usually at least prove they are correct under certain conditions. LLMs are much harder to predict. And we know that LLMs can "hallucinate" perfectly convincing but non-existent sources for their claims. Even if the LLM produces accurate, useful, output 99% of the time, can I know whether the output I'm currently getting is in fact accurate without fact-checking it every time? |
My understanding is that they were requesting an affirmative case to be made for it being structurally good at this task, rather than providing an individual question that it managed to answer sufficiently accurately (which does not say much about structural fitness for the task). |
|
@sideshowbarker Please stop hiding or deleting comments in this repository. Thank you! |
what? if a property isn't supported, it will always be `undefined`. most of that answer is complete fluff, and more importantly it does not really answer the original question — because of exactly the problem that it struggles to raise. if you want to know for sure that you're in offline mode, you would have to check …
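(for illustration, a rough sketch of what an actual check could look like, i.e. making a real request rather than trusting a flag; the probe URL here is a placeholder assumption:)

```js
// rough sketch: verify connectivity with a real request instead of only
// reading navigator.onLine. "/favicon.ico" is just a placeholder probe.
async function isActuallyOnline() {
  // false is a fairly safe "offline" signal...
  if (!navigator.onLine) return false;
  // ...but true only means some network interface is up, so confirm it.
  try {
    await fetch("/favicon.ico", { method: "HEAD", cache: "no-store" });
    return true;
  } catch {
    return false;
  }
}
```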
so are they the same colorspace or not? this seems like the crux of the question, but the bulk of the response is rambling that rephrases parts of the linked articles (including repeated mention of the cartesian/polar distinction, which i doubt will help someone who isn't already visualizing a colorspace in their head), rather than a direct answer. it's mostly explaining …. a good direct answer would probably say that you just want …, but an LLM can't give an answer like that, because it doesn't understand context, or what common sticking points might look like. or anything at all. all it can do is babble, and hopefully not babble something that's incorrect. but you can't ever be confident that it won't be wrong about some percentage of arbitrary questions. and if it is wrong, you can't directly correct it the way you might correct a static article. all you can do is keep feeding it more text and cross your fingers that it starts babbling more correctly, in an infinite game of whack-a-mole. it might seem like i'm being nitpicky here. and i am — because these examples were specifically cherry-picked to defend the existence of the feature itself. they are the best case scenario. and they are, charitably, mediocre. ultimately, if you create a chatbot (which you explicitly call "trusted"!) that can't really do much more than restate the contents of existing articles, and you're relying on the reader to sift through its rambling to find the actual information they asked for... then what was the point? they could just as well have sifted through the articles themselves to find the information they wanted, without risking that crucial details will get lost or twisted. |
I think the affirmative case for Augner should be "there are already many cited examples of it providing incorrect information." I'd like to read what precisely the proponents think it's going to help with. |
Lest anyone else here be led to believe I hid or deleted any comments nefariously or something: Allow me to be fully transparent about exactly what I did actually hide and delete — So, for the record here: The only comments I hid or deleted were completely innocuous cleanup of outdated comments related to updates that got made to the issue description. (See the remaining related comment at #9230 (comment).) Specifically: I had posted a comment correcting some things that had been in the issue description, and there were some follow-up comments from the OP and another commenter about that — and then the issue description was subsequently updated based on my corrections. That update of the issue description rendered all those comments outdated and no longer necessary, and they were therefore amicably deleted by agreement with the OP — the point being that keeping those comments around would just have been noise that distracted from the substance of the discussion here.
I tried to confirm this assertion by pasting some code into AI Help and asking it to explain the code. I used my first CSS example from issue 9208. (I do not have an account, so I don't want to use up my free checks for today.) For the example, I got this final paragraph (after the LLM explained each included property that visually hides the pseudo-content):
I italicized the part that seems questionable given the context it just provided (that the styles visually hide the content it claims visually indicates the start and end of an element). I agree that it seems less overtly wrong, but it is still wrong. In a more subtle way. |
|
@aardrian Can you please use the (new) "Report a problem with this answer on GitHub" link at the bottom of the AI Help answer, so that the team can follow up on the specific problem you're experiencing? Thanks! |
@aardrian's comment is valid in this thread. Encouraging users to report each incident separately seems like a "divide and conquer" tactic to obscure the true scale and prevalence of the problem. By chopping it up into smaller, specific blocks, they can be "addressed" with cherry-picked responses as attempted earlier in this thread, only with less context due to being isolated single Issues, not contributing to the overall picture. Like how @nyeogmi's previous issue was renamed to obfuscate the real problem being raised, and then closed without properly addressing said problem, prompting the creation of this Issue. And how #9208 was also renamed to obfuscate and downplay the very concerning issue being discussed.
No. First, I am already giving my free labor by engaging on this (versus swearing off MDN), and second, what @Ultrabenosaurus said.
Given that one of the comments that were deleted was mine, I'd like to further emphasize that what @sideshowbarker said in #9230 (comment) is in fact completely accurate: my comment (along with other deleted ones) related entirely and only to minor cleanup and did not need to be present after that was cleared up. I have no issue at all with the deletion of the comment and fully agree that leaving it there would just have cluttered things up. |
I don't think the value of good examples is literally zero. But if advocates of the feature are rejecting isolated examples of bad answers as evidence that the feature is bad, then I am reluctant to accept isolated examples of good answers as evidence that the feature is good. Specifically: if we accuse one side of cherry-picking without specific basis, we have to accuse both sides of cherry-picking and throw out all the examples. If we just take everyone's evidence at face value, we conclude that it produces both good and bad answers with roughly equal likelihood, which is more consistent with the case that it's bad.
Well, I think the fact that the side submitting "good examples" is actually submitting examples that seem superficially good but have large problems is also quite relevant.
It makes sense now, looking up the author of that post: https://blog.mozilla.org/en/mozilla/steve-teixeira-mozilla-new-chief-product-officer/.
I don't think the Yari devs can roll this back, even if they wanted to. |
|
Reading this issue is extremely frustrating, because the answers to it read like PR damage control, and do not address the core issue. So to restate it: A technical reference's most important attribute is to be accurate. An LLM cannot be guaranteed to be accurate. That's it. That's the core issue. That should be the end of this debate. It doesn't matter how much spin you put on it. It is irrelevant whether using an LLM is moral or not, if it respects copyright or not: even in a world where it was moral, didn't involve as much underpaid human labour, and wasn't a copyright nightmare, it would not address the core issue: the output of a LLM, especially one that the user does not control, cannot be guaranteed to be accurate, and a technical reference simply cannot tolerate such a large margin of error in accuracy. The fact that proponents of this feature seem to be willing to disregard this in order to push this feature suggests that they either wrongly believe that the LLM can be made accurate, or that it's okay to compromise the accuracy of the reference. I don't know which is worse. |
|
Just a confirmation that the concerns I raised in my closed duplicate issue (i.e. the loss of MDN's data authenticity due to this feature, and concerns regarding its deployment) are in fact quite well and faithfully represented here. |
|
In reply to https://blog.mozilla.org/en/products/mdn/responsibly-empowering-developers-with-ai-on-mdn/ : Steve, the author, seems like a person who values data-driven decision-making, so I'd like to point out some data. Issue #9208, at the time of writing, has 1287 upvotes to four downvotes. That's 1287 technical people who know about MDN and also know to read the GitHub issues - exactly MDN's target audience. I can also add 125 likes on this much younger issue at the time of writing. The dashboard image posted in the article shows 1017 likes on AI Explain and 129 likes on AI Help. Ignoring the fact that only ~3% of everyone who viewed that survey clicked like, if I add up both of those like counts, the total is still less than the number of people who see the existence of AI Explain and AI Help as an issue. And that's before counting the hundreds of dislikes. Even though MDN has a convenient link to the survey, and makes every effort to get people to click the like button if they like the product, the tiny ~3% of people who clicked like is still outweighed by the 1287 community members who came to this somewhat obscure repo to say this is a terrible idea. I'd recommend you listen to the data. |
|
it's a bit of a kick in the teeth to see the tone deaf blog post use the word "responsibly" in the title. i fear that by the gaslighting stage, there probably isn't a whole lot of trust left in the community to salvage. RIP one of the best technical references on the internet. |
it's deeply disheartening to see the feature described like this, because it's outright misleading and anyone who's been within spitting distance of machine learning knows it. this is implying that mozilla has its own model that it's trained, or at least finetuned. but it doesn't — as far as i can tell this feature just searches for relevant MDN articles, ships the entire article contents to ChatGPT, and asks it nicely in english prose to only refer to those articles to answer the question. that is not what "training" means in an ML context, and it is not "limited" in any serious way. maybe i'm missing something. if so, it would've been nice to explain whatever that is in the blog post about how responsible this feature is. |
Both of those statements could be true - when taken individually. But my reading of the post and those specific parts leads me to believe exactly what @eevee concluded, which is that the two statements are conflating things and/or glossing over the fact that the system wasn't trained on the MDN docs. |
|
It doesn't really matter what it was trained on; ChatGPT doesn't generate accurate or factual summaries of its training or input corpus, because that's not what it's designed for. It's designed to satisfy some interpretation of the Turing test, and thus to deceive humans. It generates texts that closely resemble the training corpus and are credible continuations of the input, texts that seem to the reader likely to be the result of reasoning. It frequently contradicts its training corpus as well as its own ongoing output. To be precise, it explicitly has no mechanism to suppress generated output that contradicts either class of input. |
|
For one more line of argument against AI Help's existence: can I ask for an explanation of how, exactly, this feature is supposed to improve over time? From what we know, it seems this feature is based on the currently existing MDN pages (maintained by the community), the regular OpenAI LLM (a black box maintained externally), and a prompt (the only thing directly controlled by Mozilla as the provider of the AI Help feature). So let's say 1000 AI Help users see an answer to a common question X and decide to submit feedback that it is clearly incorrect because of a reason Y. What will you do when you review the source MDN pages and it turns out they are correct, but the AI Help answers aren't? It seems the only option will be to change the prompt. So how do you plan to do that? Will you fine-tune the prompt until it gives a correct answer to question X specifically? How do you plan to do that without changing all the answers to all of the other questions? How will you ensure that overall answer quality went up after the change? Will you discard all the feedback that you have received to date every time you update the prompt, and gather it from scratch? Do you plan to write unit tests for thousands and thousands of common questions and answers? Or how exactly do you plan to make changes that you can show are actively improving the feature, not just slightly shifting in which cases it happens to be right or wrong? Bonus points for an explanation of how users are supposed to know whether to trust the answer they received yesterday or the answer they are given today, after somebody pushed a prompt update. With human-made changes to doc pages, it is clear that the more up-to-date version is expected to be the more accurate one. How are users supposed to decide which version is more accurate after a prompt update? More bonus points for an explanation of how you plan to prepare for changes to the base OpenAI LLM once they force you to use an updated version of it. Do you have any idea how such changes might affect the answer quality? Do you have any plan for what to do with the user feedback that you've gathered so far when that happens? Honestly, do you have any plan for anything at all that's related to improving the accuracy of AI Help answers? |
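To make one of those questions concrete, a regression suite over "golden" questions might look roughly like the sketch below; the example data, the `askAiHelp` function, and the pass criterion are all assumptions for illustration, not anything the team has described:

```js
// Hypothetical regression harness: "golden" questions paired with facts
// that any acceptable answer must mention, re-run after every prompt change.
const goldenQuestions = [
  {
    question: "How do I detect whether the browser is offline?",
    mustMention: ["navigator.onLine", "offline"],
  },
  // ...thousands more entries would be needed for meaningful coverage.
];

async function runRegression(askAiHelp) {
  let failures = 0;
  for (const { question, mustMention } of goldenQuestions) {
    const answer = await askAiHelp(question);
    const missing = mustMention.filter((term) => !answer.includes(term));
    if (missing.length > 0) {
      failures++;
      console.warn(`"${question}" is missing: ${missing.join(", ")}`);
    }
  }
  return failures;
}
```

Even a harness like this only checks for the presence of expected terms; it cannot catch a fluent answer that mentions the right identifiers while describing them incorrectly, which is the failure mode this thread is mostly about.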
|
One thing to say about the accuracy of AI Help: "subtly incorrect, rambling, ignoring nuance, and concerning" is also an accurate description of the kind of responses that merely "sound good" we're getting from the team on this. There is a place for AI to help you find useful information, but it's like people said about Wikipedia: it's not useful as a primary source, but it can be very useful for finding the primary source. If this is going to go through regardless of concerns, it should at least be presented as a search engine. |
This is a great point, and it mirrors a conversation I had in the MDN Matrix channel yesterday. Personally, while I still view OpenAI as a pretty strange bedfellow for Mozilla due to their horrendous labor practices, I could definitely support a limited version of this system that only surfaces links to MDN articles, provided:
A chat agent that could surface relevant articles would be really useful, unlike this system. |
|
I think it would be great if we submitted some of the questions raised in this thread to their call. |
And here I was hoping you simply had not seen my reply in which I give evidence that these "horrendous labor practices" aren't exactly what you claim them to be, hence the lack of reply that followed. I must therefore conclude, given also the number of downvotes those responses got, that in this part of the internet facts are of no use if they contradict one's preconceived, libel-bordering view. Quite ironic, since that's precisely what the MDN folks are being accused of - that they do not care about the evidence. |
|
I reserve my right to disagree with you about morality. That's not libel. If you're going to threaten me with legal action, even obliquely, I don't see any particular reason to continue to engage with your opinions. For those following along, though, let's remember that there are three major issues here:
Any one of these should be enough to condemn AI Help and AI Explain as a bad idea; that you think I'm wrong about what is perhaps the most subjective of the three doesn't change the overall validity of the argument that AI Help and AI Explain are bad ideas implemented poorly. |
Unfortunately, using LLMs to find primary sources has the same problems as using them as a primary source. Since the output is not (and cannot be) guaranteed to be accurate, only contextually plausible, they will readily generate citations of sources that look superficially appropriate, but in actuality:
Sometimes they also generate relevant citations that support the citing text, but they cannot be relied upon to do so consistently. In practice this generally makes them worse than a traditional search engine, which is still prone to problems 1 through 3 but usually manages to avoid 4, and does so using a tiny fraction of the computing power. |
Oh, you can disagree all you want, but you didn't. No reply from you on the issue, in spite of the evidence presented. One can disagree with opinions, certainly not with facts. 2+2=4, any disagreement about that?
You are literally throwing mud at a whole company and the people working in it, who might even take pride in doing so, without solid evidence to support your claims. That's precisely what libel is.
Do you feel threatened by the fact somebody with no ties with OpenAI makes you aware of the fact you are trashing them without solid evidence? Or, funnily, you are thinking I am in some ways representative of OpenAI - which I am not? Either way, I am pretty sure you've engaged enough. |
In general, I think this is true and a good criticism. However, in this case, I think the problem could be mitigated by:
Of course, that doesn't ensure that those links are actually relevant, and I would reach for a search engine long before an LLM, but I do think this use case is at least theoretically reasonable. |
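As a rough illustration of the "only surface links, and only links that verifiably exist" idea: candidate URLs cited by the model could be filtered against a set of known MDN pages before anything is shown to the user. The `knownMdnUrls` set (presumably built from MDN's own sitemap) and the shape of the model output are assumptions here:

```js
// Rough sketch: keep only cited links that are real MDN pages.
// `knownMdnUrls` would hold pathnames taken from MDN's sitemap.
function filterCitedLinks(citedUrls, knownMdnUrls) {
  return citedUrls.filter((url) => {
    try {
      const { origin, pathname } = new URL(url);
      return (
        origin === "https://developer.mozilla.org" &&
        knownMdnUrls.has(pathname)
      );
    } catch {
      return false; // not even a parseable URL
    }
  });
}
```

As noted above, this only guarantees the links exist, not that they are relevant to the question.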
|
Can people please stop cluttering the thread and filling our inboxes with petty immaturity? Thank you. Valid critiques have been raised, no amount of loyalty or defensiveness to a company or technology will improve the conversation. |
i don't think "outsourcing to somewhere with dramatically lower wages so you can pay slightly more than average and look moderately impressive there, while still paying nowhere near the wages in your own region" is, uh, great, exactly. and "labor practices" are more than just wages. not to take a strong stance about OpenAI's use of labor; only to say that it's odd to act as though you hit a home run here. your comments seem to follow a pattern of glossing over details in order to pluck out one that's convenient, then presuming victory. i don't see how this is constructive. |
And a dramatically lower cost of living. I know, an easy detail to forget, right?
According to the figures I have presented, it's way more than average.
Nobody in this conversation has claimed there's any "greatness" to it. On the other hand, on more than one occasion, some of you have used adjectives at the opposite end of the scale from "greatness". I have asked what alternative source of income you would suggest to those Kenyans but, unsurprisingly at this point, no answer has been provided. If you cared to follow one of the links I gave, you'd read Kenyans' opinion about it, which greatly differs from yours. But certainly you know better than the people involved, I am sure.
Which other practices are you thus talking about?
Did I? Well, that was not meant to seem like it, but if it did, may I suggest you take your time to ponder why?
Tell me about it: I have provided plenty of details myself - the aforementioned evidence - which were literally ignored. |
|
No matter what your opinion of OpenAI, the issue we're here to discuss is LLM integration in MDN. Let's not get derailed from the true point. |
|
As a regular user of MDN, I think it's irresponsible for Mozilla to incorporate an LLM into the service. Thanks to those who are trying to escalate this issue with them. |
|
@falemagn hey bud, can you just be chill for a bit and accept that this situation you're arguing is purely an agree-to-disagree situation that will likely never be resolved in a satisfying manner? because it's starting to get a bit ridiculous, and I would argue that at this point it is starting to become off topic to the actual issue and discussion at hand. you've said your piece, everyone has understood it, regardless of whether they agree or not; whether you have more to say or not, let's just leave it for now. this isn't really the time or place to be debating these things. |
|
This message appeared in the Mozilla discord: This is one of the good ways to make info available to Steve Teixeira, but I frankly think he will ignore us unless there is a level of backlash that is externally visible. His comments suggest he knows this is unpopular but doesn't think the public knows that, so I think it would be better if the public knows that. I think this is clearly a governance issue and that it needs to be brought to a space other than mdn/yari. This issue is only visible to people who deliberately click into it. It should be left open because at this point press has noticed it (including The Register, here) I would like a petition or something, not because that obligates anyone to respond but because it's easier to boost on social media and it's less likely to be deleted or closed. (in particular: falemagn is engaging in provocateur-style antics that are likely to get this issue locked as "too heated") I will make this later this weekend if no one else does. I would prefer that someone who is a member of a Mozilla project make or put their name on the website. @sideshowbarker You're in contact with mousetail, who made the stackoverflow petition. I have just contacted vantablack who ran fedipact on Mastodon. They say I can steal the text from their campaign if anything is useful, and they sent me some pragmatic details about how they verified people and stuff. @eevee I think you have the best posts in this thread and the people upvoting you obviously agree. Do you want to write something for the Mozilla conversation or for a webpage complaining about this more publicly? General question: does anyone know either [1] content creators who can loudly complain about this or [2] who can provide actual engineering contacts inside of Mozilla who might be disgruntled? I am hoping this can be discussed on Mozilla Foundation's Slack, which is where the cryptocurrency donations issue was litigated. (For my part re [1]: I DMed fasterthanlime, who was involved in the Rust Project situation -- I suspect the Rust team and Mozilla are overlapping. I am reluctant to DM ThePrimeagen, who I am sure would complain about this publicly but whose fans have occasionally harassed people.) |
I'd like to second this methodological concern. Can we take these metrics at face value? Is there any way to account for this potential bias? It seems reasonable to speculate that the users most likely to use this feature are precisely the users least likely to spot inaccuracies, which would cast doubt on whether an immediate subjective impression of "helpful" really means it's a high-quality, non-misleading answer. I might propose that, rather than a user's first impression which may not include validation of correctness, a more relevant measure for the specific concerns raised here would be to fact-check a random sampling of the AI responses and find an overall error rate. If this rate is above what MDN tolerates under its editorial standards, then obviously the feature itself fails to live up to those standards; but if errors occur at a tolerable rate, then proponents of this feature will have a direct quantitative rebuttal to the concerns raised here. Either way, this will settle the question of how prevalent the issue of misinformation is, which is crucial information that hand-selected anecdotal examples have proven inadequate to settle in either direction. |
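To sketch what that measurement could look like in practice (the sample numbers below are invented purely for illustration):

```js
// Estimate the answer error rate from a hand-checked random sample,
// with a 95% Wilson score interval so a small sample isn't over-read.
function errorRate(sampleSize, errorsFound, z = 1.96) {
  const p = errorsFound / sampleSize;
  const denom = 1 + (z * z) / sampleSize;
  const center = (p + (z * z) / (2 * sampleSize)) / denom;
  const margin =
    (z *
      Math.sqrt(
        (p * (1 - p)) / sampleSize + (z * z) / (4 * sampleSize * sampleSize)
      )) /
    denom;
  return { estimate: p, low: center - margin, high: center + margin };
}

// e.g. 30 factually wrong answers found in a 200-answer sample:
console.log(errorRate(200, 30)); // ~15%, roughly 11%-21% at 95% confidence
```

Comparing the resulting interval against whatever error rate MDN's editorial standards tolerate would give a quantitative answer either way.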
|
Related to my previous comment: here's an infopage I wrote. I want to use it elsewhere. You all have Comment access if you want to make the text less bad. (I'm not a writer.) https://docs.google.com/document/d/1fKfCy83SvHP3zMtmPkTNPfjMb8koqCqEFQn1qpBGrz8/edit?usp=sharing |
Way to ruin an "open conversation" by locking an open issue with legitimate concerns about OpenAI. Well, I don't question your power position, dear collaborator: no-no-no, you can do whatever, it's your repository.
Citing @colin-p-hill, as his is a great summary of how most AI features need an assessment outside the very bubble of users that use them. |

Summary
I made a previous issue pointing out that the AI Help feature lies to people and should not exist because of potential harm to novices.
This was renamed by @caugner to "AI Help is linked on all pages." AI Help being linked on all pages is the intended behavior of the feature, and @caugner therefore pointed out that the button looks good and works even better, which I agree with -- it is a fantastic button and when I look at all the buttons on MDN, the AI Help button clearly stands out to me as the radiant star of the show.
The issue was therefore closed without being substantively addressed. (because the button is so good, which I agree with)
I think there are several reasons the feature shouldn't exist which have been observed across multiple threads on platforms Mozilla does not control. Actually, the response has been universally negative, except on GitHub where the ability to have a universally negative response was quietly disabled Monday morning.
Here is a quick summary of some of those reasons.
One: the AI model is frequently wrong. Mozilla claims it intends to fix this, but Mozilla doesn't employ any GPT-3.5 developers and OpenAI has been promising to fix it for months. It's unlikely this will actually happen.
Two: contrary to @caugner's opinion, it's very often wrong about core web topics, including trivial information where there is no obvious excuse. Here are some examples:
… `<portal>` in ways that the page text explicitly tells the user they must not use it. Even examples posted by people who support the existence of the AI contain significant errors:
(I say examples, but note: this is the only usage example provided by a person who supported the existence of the feature, and it contained an error.)
This is identical to one of the categories of problem seen on StackExchange when StackExchange introduced its generative AI assistant based on the same model, and it led to Stack removing the assistant because it was generating bizarre garbage.
Three: it's not clear that any documentation contributors were involved in developing the feature. Actually, it's still unclear who outside of @fiji-flo and @caugner was involved in the feature. Some contributors, including @sideshowbarker, have now objected, and the process has produced a default outcome, which is that AI Explain was voluntarily rolled back and AI Help remains in the product.
It is probably OK for those contributors to review each other's code, but they're also managing the response to the backlash. After a bunch of people have already signaled "hey, I have an active interest in this feature" by engaging with a relevant issue, excluding those people reflects a ruling of "actually, you do not have an active interest!", and it's not clear on what basis that ruling would have been reached.
Four: the existence of this feature suggests that product decisions are being made by people who don't understand the technology or who don't think I understand it.
Overall, the change tells the story that MDN doesn't know who their average user is, but assumes that the average user is (1) highly dissimilar to the GitHub users who were involved in the backlash (2) easy to sell to.
The fact is that in one day, measured in upvotes, you attracted comparable backlash to what the entire StackOverflow strike attracted in a month. It would be a mistake to think only a small group of people are concerned. This attitude would be wishful thinking.
It seems like the fork in the road for MDN is:
If option 1 isn't sustainable, then between option 2 and option 3, option 3 is obviously better for humanity in the long-run and I would encourage MDN to make plans for its own destruction.
In the worst possible world, the attitude is correct and the users are easy to sell to. Well, in that case, you've created another product company and in doing so you've metaphorically elected to serve both God and money -- and as is evidenced by the recent implosions of every siloed social media company, that is always a great idea.
Again, the AI Help button is absolutely gorgeous and functions as intended. This issue is not about the AI Help button and therefore should not be closed as a button-related wontfix, or renamed by @caugner into a description of the behavior of the button.
URL
#9208
#9214
Reproduction steps
Pivot to a more aggressive funding model, then engage in a mix of panic and corporate groupthink.
Expected behavior
I think the button is amazing and you are doing a great job.
Actual behavior
The AI help feature should not exist.
Device
Desktop
Browser
Chrome
Browser version
Stable
Operating system
Windows
Screenshot
Anything else?
No response