The AI help button is very good but it links to a feature that should not exist #9230
Comments
I just want to agree with this report wholeheartedly. The use of large language models to offset labor is problematic enough, but doing so when those LLMs do not even consistently produce reasonable or correct output is utterly unconscionable. MDN is supposed to be a simple, authoritative source for the web platform; with the introduction of "AI Help", you're throwing that reputation away. I never would have imagined I'd be recommending w3schools over MDN to new programmers, but that's where we are today. I'm a long-time Firefox user. I've worked with Mozillians in the past, including on the 2nd edition of Programming Rust. I know you're decent people; do the right thing and ditch the AI bullshit.
To provide some context here about the relationship of OWD to MDN and about my own role in all this: OWD funds the work of a group of writers, whose names you can find at https://openwebdocs.org/team/#writers — and the funding for OWD itself is organized through an Open Collective, which has a formal Team, the names of whose members you can find under the Team tab at https://opencollective.com/open-web-docs#section-contributors. While I am among the 150+ individual people who have donated to OWD, I am neither formally one of the OWD writers nor formally one of the OWD Team members. To be clear on my actual role: I'm one of the core reviewers/maintainers who have push/merge access to the https://github.com/mdn/content/ repo (the content of MDN), doing reviews of incoming PRs and otherwise contributing to the repo. The set of core reviewers/maintainers includes the OWD writers, but it also includes some writers who work for Mozilla, and includes me and some others who are neither formally OWD writers nor writers from Mozilla. See https://github.com/orgs/mdn/teams?query=@sideshowbarker for the list of relevant GitHub teams I belong to, and https://github.com/mdn/content/pulls?q=reviewed-by:sideshowbarker for the reviews I've done (3858 so far) and https://github.com/mdn/content/graphs/contributors to see my own commits (and those of other contributors). And FWIW here I'll mention that I actually also have push/merge access to the Yari repo at https://github.com/mdn/yari/, which has the source code for the platform on which MDN runs — including code for things like the "AI Explain" button, but also code for all kinds of good things that aren't controversial at all. I am not a core Yari reviewer/maintainer, but I have actually done reviews there (20 so far), as shown in https://github.com/mdn/yari/pulls?q=is:pr+reviewed-by:sideshowbarker — in cases where it has made sense for me to review — and commits (42 so far), as shown in https://github.com/mdn/yari/commits?author=sideshowbarker.
I do not believe there is currently a theoretical framework for making statistical text generation distinguish the truth of a statement, so there is no likelihood of this being fixed with any anticipated development based on the current technology.
This comment was marked as outdated.
This comment was marked as outdated.
I'm not sure if this is the right implementation, but the idea is on the right track. A better way to implement this with large language models, should that be desired, is to have an AI generate a bunch of options for knowledgeable human technical writers to pick from and then form a composite, ensuring that the answers are technically accurate while using large language models to assist the creative flow. I think that the current implementation is wishful thinking at best and I am sad to see such a critical web resource fall prey to hype cycles that cut out the best part of MDN: the technical writers. Hopefully what I proposed is a viable middle path.
As far as I can tell, the framework AI Help is using is described here. Basically, it feeds the posts to a search engine and then uses the search engine to make sure at least one relevant MDN doc can be surfaced to the LLM before it outputs anything. The idea seems to be "well, AIs are better at summarizing text and doing question/answer tasks about specific passages than they are at answering questions off the cuff," which I think is probably true. (Does this work? I don't know. When I tried the tool, I was trying to trick it, and it mostly just told me "I can't answer that," which I suspect means it was falling over at the search engine step.) I would say this is actually really close to the model AI Explain used, so I would expect it to produce similar mistakes. From talking to Augner, it sounds like Augner doesn't believe any examples taken from AI Explain are representative of likely weaknesses in AI Help, which is surprising to me, but that appears to be their current position. Overall, I think an affirmative case for "AI would be good at this task" is still missing. Augner wants an affirmative case that it won't work, I want an affirmative case that it will, so we're basically talking past each other. |
AI Help works very differently than AI Explain. We restrict the LLM to base its answers on the context we give it (which is actual, up-to-date MDN content). So you won't see the issues AI Explain was exhibiting. A basic helpful answer (to a question about how to detect whether the browser is in offline mode) would be:
To detect if you are in offline mode, you can use the `navigator.onLine` property. Here is an example of how you can use it: [code example]
In this example, if `navigator.onLine` is `false`, the page is treated as offline. Please note that if the browser does not support the `navigator.onLine` property, this approach may not work as expected. If you want to listen for changes in the network status, you can use the `online` and `offline` events: [code example]
By adding event listeners for the `online` and `offline` events, you can react whenever the network status changes.
MDN content that I've consulted that you might want to check:
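A minimal sketch of the approach that answer describes, assuming (as its wording suggests) it was built around the `navigator.onLine` property and the `online`/`offline` events on `window`:

```js
// One-off check. navigator.onLine only reports whether the browser believes
// it has some kind of network connection; it cannot guarantee that the
// wider internet is actually reachable.
if (!navigator.onLine) {
  console.log("You appear to be offline.");
}

// React to changes in network status.
window.addEventListener("online", () => {
  console.log("Back online.");
});
window.addEventListener("offline", () => {
  console.log("Connection lost.");
});
```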
@fiji-flo it looks like the links you've provided are broken. (maybe the urls were relative?) My main concern is that AI Help does not have a place in technical documentation. Yes, in theory it could help out a few people, but the target audience it seems to aim for (new developers or someone unfamiliar with the concept it is trying to learn about), coupled with our current understanding and research about LLMs (in a nutshell: they can confidently present inaccurate information), seems to be a hugely concerning mismatch. You need someone to fact-check the response from an LLM; a four-eyes principle is often applied to technical docs (one writer, and at least one reviewer), and that is missing from the LLM. Therefore, there is a significantly increased risk that the LLM provides wrong information to someone not knowledgeable enough about the subject to tell whether the AI is confidently providing misinformation or is actually accurate. How does the team behind AI Explain hope to alleviate this concern, beyond plastering the user with warnings (which might be a hint that this is not a product-market fit)?
Here's another helpful answer, this time for a question about a brand-new web feature (evidently one of the new CSS color functions). The quoted answer walked through what the feature does, including the cartesian/polar distinction, gave an example of using it, and closed with the usual list of "MDN content that I've consulted that you might want to check" links.
@Zarthus Both this and that comment respond to @nyeogmi, who requested positive examples of answers produced by AI Help.
@caugner: If that was the essence of what the contributors of AI Explain and AI Help have taken away from this issue, and is their official response to it, I shall pardon myself from this thread.
This is honestly quite embarrassing. I've been a vocal proponent of Mozilla, their products, and MDN for quite a long time. Seeing the consistent non-acknowledgment of perfectly valid, calmly laid out reasoning against this feature in its current state is disheartening. If Mozilla is set on its current path and will refuse to bend to criticism on this feature, at least do the service of outright saying so - then we can all stop wasting our time.
I really really didn't want to be part of this discussion. But if people are worried about this feature producing convincing but inaccurate/wrong/misleading output (which LLMs are known to do), providing examples of correct output will not convince them. That only proves that the LLM is capable of being correct and useful (which I don't think anyone has disputed). Not that it is likely to be correct most of the time. Nor that it will not provide really bad results some of the time. Nor does it address the issue that users may not be able to tell these cases apart. It's really easy to create an algorithm that produces correct output some of the time, or even most of the time, but that fails spectacularly in some (edge) cases. That may be acceptable if it's clear beforehand when it will fail, so that people can avoid the edge cases, or if it's easy to tell when it has failed. But algorithms are a lot more predictable than LLMs. You can usually at least prove they are correct under certain conditions. LLMs are much harder to predict. And we know that LLMs can "hallucinate" perfectly convincing but non-existent sources for their claims. Even if the LLM produces accurate, useful, output 99% of the time, can I know whether the output I'm currently getting is in fact accurate without fact-checking it every time? |
My understanding is that they were requesting an affirmative case to be made for it being structurally good at this task, rather than providing an individual question that it managed to answer sufficiently accurately (which does not say much about structural fitness for the task).
@sideshowbarker Please stop hiding or deleting comments in this repository. Thank you!
what? if a property isn't supported, it will always be undefined. most of that answer is complete fluff, and more importantly it does not really answer the original question — because of exactly the problem that it struggles to raise. if you want to know for sure that you're in offline mode, you would have to check whether an actual network request succeeds; navigator.onLine only tells you whether the browser thinks it has some kind of network connection.
so are they the same colorspace or not? this seems like the crux of the question, but the bulk of the response is rambling that rephrases parts of the linked articles (including repeated mention of the cartesian/polar distinction, which i doubt will help someone who isn't already visualizing a colorspace in their head), rather than a direct answer. it's mostly explaining rather than answering. a good direct answer would probably say which one you actually want and why, but an LLM can't give an answer like that, because it doesn't understand context, or what common sticking points might look like. or anything at all. all it can do is babble, and hopefully not babble something that's incorrect. but you can't ever be confident that it won't be wrong about some percentage of arbitrary questions. and if it is wrong, you can't directly correct it the way you might correct a static article. all you can do is keep feeding it more text and cross your fingers that it starts babbling more correctly, in an infinite game of whack-a-mole. it might seem like i'm being nitpicky here. and i am — because these examples were specifically cherry-picked to defend the existence of the feature itself. they are the best case scenario. and they are, charitably, mediocre. ultimately, if you create a chatbot (which you explicitly call "trusted"!) that can't really do much more than restate the contents of existing articles, and you're relying on the reader to sift through its rambling to find the actual information they asked for... then what was the point? they could just as well have sifted through the articles themselves to find the information they wanted, without risking that crucial details will get lost or twisted.
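For anyone trying to reconstruct the question: it evidently involved the CSS `lab()`/`lch()`/`oklab()`/`oklch()` family. A direct answer, sketched here as an assumption rather than as a quote of the original response, is that `lab()` and `lch()` are cartesian and polar coordinates for the same space (CIE Lab), while `oklab()` and `oklch()` are the corresponding pair for a different, newer space (Oklab):

```js
// Same space, different coordinates: lab() (cartesian) vs. lch() (polar).
// Different space altogether: oklab()/oklch(), which are built on Oklab.
const samples = {
  "lab()   (CIE Lab, cartesian)": "lab(52% 40 30)",
  "lch()   (CIE Lab, polar)": "lch(52% 50 37)",
  "oklab() (Oklab, cartesian)": "oklab(63% 0.1 0.08)",
  "oklch() (Oklab, polar)": "oklch(63% 0.13 39)",
};

for (const [label, color] of Object.entries(samples)) {
  const swatch = document.createElement("div");
  swatch.textContent = `${label}: ${color}`;
  swatch.style.backgroundColor = color; // ignored by browsers without support
  document.body.append(swatch);
}
```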
I think the affirmative case for Augner should be "there are already many cited examples of incorrect information being provided." I'd like to read what precisely the proponents think it's going to help with.
Lest anyone else here be led to believe I hid or deleted any comments nefariously or something: Allow me to be fully transparent about exactly what I did actually hide and delete — So, for the record here: The only comments I hid or deleted were completely innocuous cleanup of outdated comments related to updates that got made to the issue description. (See the remaining related comment at #9230 (comment).) Specifically: I had posted a comment correcting some things that had been in the issue description, and there were some follow-up comments from the OP and another commenter about that — and then the issue description was subsequently updated based on my corrections. So that update of the issue description rendered all those comments outdated and no longer necessary, and they were therefore amicably deleted by agreement with the OP — with the point being that keeping those comments hanging around would have just been noise that distracted from the substance of the discussion here.
I tried to confirm this assertion by pasting some code into AI Help and asking it to explain the code. I used my first CSS example from issue 9208 (I do not have an account, so I don't want to use up my free checks for today). For that example, after the LLM explained each included property that visually hides the pseudo-content, its final paragraph claimed that the content visually indicates the start and end of an element.
That claim is the part that seems questionable given the context it just provided (the styles visually hide the very content it claims visually indicates the start and end of an element). I agree that it seems less overtly wrong, but it is still wrong. In a more subtle way.
@aardrian Can you please use the (new) "Report a problem with this answer on GitHub" link at the bottom of the AI Help answer, so that the team can follow up on the specific problem you're experiencing? Thanks! 🙏
@aardrian's comment is valid in this thread. Encouraging users to report each incident separately seems like "divide and conquer" tactics to obscure the true scale and prevalence of the problem. By chopping it up into smaller, specific blocks they can be "addressed" with cherry-picked responses as attempted earlier in this thread, only with less context due to being isolated single Issues, not contributing to the overall picture. Like how @nyeogmi's previous issue was renamed to obfuscate the real problem being raised, and then closed without addressing said problem properly, prompting the creation of this Issue. And how #9208 was also renamed to obfuscate and downplay the very concerning issue being discussed.
No. First, I am already giving my free labor by engaging on this (versus swearing off MDN), and second, what @Ultrabenosaurus said.
Given that one of the comments that were deleted was mine, I'd like to further emphasize that what @sideshowbarker said in #9230 (comment) is in fact completely accurate: my comment (along with other deleted ones) related entirely and only to minor cleanup and did not need to be present after that was cleared up. I have no issue at all with the deletion of the comment and fully agree that leaving it there would just have cluttered things up.
I don't think the value of good examples is literally zero. But if advocates of the feature are rejecting isolated examples of bad answers as evidence that the feature is bad, then I am reluctant to accept isolated examples of good answers as evidence that the feature is good. Specifically: if we accuse one side of cherry-picking without specific basis, we have to accuse both sides of cherry-picking and throw out all the examples. If we just take everyone's evidence at face value, we conclude that it produces both good and bad answers with roughly equal likelihood, which is more consistent with the case that it's bad.
Well, I also think it's quite relevant that the side submitting "good examples" is actually submitting examples that seem superficially good but have large problems.
This whole debacle is making W3Schools more useful than MDN. That's embarrassing.
Since we apparently want "AI" in everything, here is why ChatGPT-3.5 thinks it would be a bad idea to use ChatGPT-4 to interpret or produce technical documentation [*]: As a seasoned web developer, I can provide several reasons why relying solely on a language model like ChatGPT-4 for technical documentation is not a good idea: [list of generated reasons omitted]
While language models like ChatGPT-4 can be useful for generating text and providing general information, they should not be relied upon as the sole source of technical documentation. [*] -- warning, this was produced by an LLM and may not be accurate.
I agree. These wall-of-text posts aren't going to get them to change. So, take it out of their control. If I did web development more than on rare occasions, I'd already be using the content and code to set up a fork of MDN on a new domain with all this LLM nonsense removed.
Sorry to be another voice chiming in, but it seems the thread is lacking examples of bad responses from AI Help, leading to a dismissal of the issues as being only related to AI Explain. So here's a very bad response I got from AI Help about reacting to element size changes: it suggested watching for changes to the element's style attribute with a MutationObserver.
This is obviously wrong in every way, and I'd be surprised to see someone defend it as inaccurate but useful. Full disclosure, I deliberately tricked the LLM by asking how to use MutationObserver for this purpose. But IMO that's a question a confused beginner is likely to ask, and the documentation should correct them rather than hallucinate a world in which they are correct.
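For context, the tool the documentation should steer that beginner toward is ResizeObserver, not MutationObserver; a minimal sketch (the `#panel` selector is just an illustrative placeholder):

```js
// ResizeObserver reports actual size changes, no matter what caused them
// (inline styles, class changes, viewport resizes, content reflow, ...).
const target = document.querySelector("#panel");

const observer = new ResizeObserver((entries) => {
  for (const entry of entries) {
    const { width, height } = entry.contentRect;
    console.log(`Element resized to ${width} x ${height}`);
  }
});

observer.observe(target);
```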
What is even the point of this feature? If I wanted to ask ChatGPT for explanations, I would just... you know... go on OpenAI's website and do it. With all the precautions it comes with, like staying skeptical of all output because of the non-negligible risk of it being confidently wrong. But so far I never even felt the need to ask an LLM about MDN documentation, because it is well written and sufficient. So, at best this AI help button is useless, and at worst it is harmful, because of the risk that someone might end up misinformed by the output. Also the idea that "incorrect information can still be helpful" is asinine. This is technical documentation, not Fox News.
So let me get this straight… the objections to the original post were that the answers it outputs are correct, for the most part?
This is actually a generic problem with ChatGPT: if you ask it something that is impossible, it simply cannot tell you that what you ask is impossible; instead, it will hallucinate a world wherein the thing you ask for is in fact possible and then come up with an overly elaborate answer, with full code examples and everything, but it will never work because it's not possible and it does not have the ability to tell you this. I don't know whether this is a generic problem with LLMs or a specific problem with ChatGPT, but on all the interactions that I've had with it, I've never seen it tell me that a thing is impossible, and believe me, this was not for lack of trying. In other words, ChatGPT is an XY problem amplifier. You want to do something with an API that wasn't made to do the something, you ask the tool in MDN how to do that, it will hallucinate some gibberish for you that makes it sound like it's possible, and now you're stuck even further in your XY problem. This is not something MDN should be doing, ever, but it does, both with AI Help and with AI Explain. |
@faintbeep Thanks for being honest, and glad to hear you had to trick AI Help to get a seemingly incorrect answer. Could you please report the answer using the "Report a problem with this answer on GitHub" link to create a (public) GitHub issue for it? That issue will then contain both the question(s) you asked and the answer you received, which makes it easier to reproduce and follow up. (So far we have received only 5 issue reports - all valid - since we added the link.) It's important to mention that had you asked if you can detect size changes using MutationObserver instead (e.g. "Can I detect size changes with MutationObserver?"), AI Help would have told you that you cannot and pointed you to ResizeObserver. And my question "How can I detect size changes with MutationObserver?" was just rejected by AI Help. So I'm curious how you phrased that question. It seems you insisted specifically on a solution with MutationObserver, and AI Help gave you what seems to me like a possibly valid solution to a subset of size changes (namely through style attribute changes, which may effectively change the size of an element), without mentioning this limitation though. Luckily there are the two links that allow the beginner (who, kudos, already heard about MutationObserver) to double-check, deepen their knowledge about MutationObserver and discover ResizeObserver through the "See also" section. Even if you don't find this helpful, maybe we can agree that there is some helpfulness in this? But seriously, if you actually report this as an issue, we can look into what improvements can avoid this kind of scenario. For example, we could update the MutationObserver page to better explain its differences to ResizeObserver, or an overview page for all the observers with their respective use cases could help (maybe it already exists; then we could look into why it wasn't deemed relevant enough, and ensure it's passed as context). And last but not least, it's an option to update our system instructions to prevent GPT-3.5 from suggesting solutions using unsuitable features, even if the user specifically asked for it. PS: Just to make this clear once and for all, we are aware of the limitations of LLMs, and we know that the LLM doesn't understand the question or these instructions, and only uses statistics to come up with the next words. However, the crux is that it works surprisingly well, which is the reason why LLMs can provide value for users, why AI Help's answers are mostly helpful, and why we experiment with an LLM as part of this beta feature. The success of this experiment is yet to be evaluated, and all feedback is going to be taken into consideration.
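To make the limitation described above concrete: a MutationObserver can only watch for a cause (here, style attribute changes), not for the effect (the element's size actually changing), so it misses resizes that come from anywhere else. A sketch, again with a hypothetical `#panel` element:

```js
// This only fires for inline-style changes. Size changes caused by CSS
// classes, media queries, viewport resizes, or content reflow never
// trigger it -- which is why ResizeObserver is the right tool here.
const target = document.querySelector("#panel");

const observer = new MutationObserver(() => {
  console.log("style attribute changed; size *may* have changed:",
              target.getBoundingClientRect());
});

observer.observe(target, { attributes: true, attributeFilter: ["style"] });
```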
You've asserted this but not supported the claim. Even if we ignore the inaccuracies, the positive examples provided have mostly been disorganized and turgid, so I think the better way to convince people would be by having real human testimonials: survey learners in the target audience and see how helpful they found it for solving real problems.
Definitely a better approach than asking LLMs to evaluate each other! Perhaps this could be improved further: divide the target audience into two groups and give them all the same (short) task. One group gets to use only MDN for help and the other gets to use MDN + "AI Help". Have professionals evaluate the quality of the results from both groups.
The sad part is that old Mozilla could have had volunteers to do this if they were training an open LLM and approached this as a research project without a predetermined outcome. As a former donor and contributor, "help OpenAI pro bono" is just not as compelling a pitch.
I decided to test the "ask it for something impossible and it will answer as if it was possible" thing above by asking a question I've had myself many times over the years: How do I use CSS selectors to select an element only if it contains a specific child element? The AI response not only gets the asked question backwards (answering how to select a child inside a given parent), but then it answers that rewritten question, which misses the entire point. A Google query for the same question turns up the correct answer immediately. For curiosity's sake I decided to reformat the question and try again; by this point I know it won't give me an accurate, correct answer, but once again it manages to get basic details wrong. The only "trick" involved in this was asking it a question I already knew the answer to.
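For reference, that question now has a direct answer in the `:has()` relational pseudo-class (still only newly supported at the time of this thread); a quick sketch, using a hypothetical article/video pairing:

```js
// Select the parent only if it contains a specific child.
// CSS equivalent:  article:has(video) { ... }
// Works in querySelectorAll() in browsers that support :has().
const articlesWithVideo = document.querySelectorAll("article:has(video)");

articlesWithVideo.forEach((article) => {
  article.classList.add("has-video"); // e.g. flag them for styling
});
```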
How can anyone validate the information provided by an AI assistant if the sites they were supposed to validate that information against are the ones providing that "AI assistance"? How do they know who to trust? This problem most severely affects those who do not have an abundance of spare time, energy, and knowledge to validate the output of AI tools, which are the people who most need assistance from things like MDN. MDN's AI Help establishes the baseline of trustworthiness of help from MDN, because it is so much lower than the rest of the site, and if it is trusted as a vector of information, there is no reason to believe such information has not been incorporated elsewhere on the site in less obviously perceived ways. No one is auditing the edit history of every single article here, and the obvious next step is "the AI starts making edits". Now that you've made it clear you are happy to incorporate this tool into the text displayed for individual articles via "AI Explain", it's not enough to roll things back to "AI Help". The entire thing has to go, otherwise I have no reason to assume you're not just going to reimplement AI Explain later, when things quiet down, as everyone tends to. Thus in order for MDN to be useful, I will have to start auditing the edit history of every article, which is harder for people to do now that it's a git history (git has notoriously poor UX). Defensive maneuvers against misinformation should not cost more than the misinformation costs to generate. Otherwise the misinformation wins. Checking to see if "AI Editing" was enabled while I was away every time I reference or cite MDN is not cost-efficient. So the only defensive maneuver that makes sense is to assume you've abandoned your responsibility to provide reliable and accurate information, as that is the easiest explanation for why a tool that does not provide reliable and accurate information was incorporated into a website that does provide reliable and accurate information. "It generates value" is not enough if it raises the cost of using the resources on MDN overall.
(Periodic reminder: this thread has literally no multiplier effect and the devs aren't listening to you. If you want anything to happen, post about it on a platform that has a multiplier effect.)
Is there a timeline for this? When can we expect answers and/or the transcript to be posted?
I keep seeing the proponents of this conflate seeming to be helpful with actually being helpful, and assume that there is no meaningful difference between inaccurate information provided by well-meaning people (e.g. on Stack Overflow) and the kind of inaccurate information that an LLM can produce. See my comment here.
MDN AI drama: Archive and cite reputable journalism sources, such as The Register. Links to the GitHub issues: mdn/yari#9208 mdn/yari#9230
There is no world in which "tell me when inline styles change" or even "tell me when size-related attributes change" could ever be an adequate answer to "tell me when the size of a typical element changes." The latter is asking about an effect; the former focuses only on one cause among so, so very many. (And it's overbroad in its wrongness, too: it doesn't even double-check whether the attribute change actually changed the element's size.)
Explain how.
You say this, but it directly contradicts your last remark. You can "update your system instructions" to overcome the fundamental nature of LLMs? You're acknowledging the limitations of LLMs but refusing to actually consider them, and this is evident in everything you've been saying: it's evident in you projecting confidence that with the right prompt, the right prayer to the toaster oracle, you can get it to reliably correct mistakes; it's evident in you assuming that someone definitely has to be acting in bad faith and insisting that your genius machine provide a wrong answer, for the machine to do so. (The LLM provided a correct answer when you asked it, so clearly, it "knows" the answer, right? If it gave someone else a wrong answer, it must be because shenanigans are afoot. It can't be that innocent enough variations in wording or phrasing -- variations you simply haven't thought of and tested -- might trip up a program that reacts entirely and blindly to wording with no mental model of what words actually mean.) And let's not forget the context of you failing to actually demonstrate the awareness you say you have: multiple GitHub issues with hundreds upon hundreds of comments' worth of explanations of LLMs' limitations, presented and explained in just about every way possible, in some cases with examples pulled from MDN itself. At best, assuming good faith as hard as I can, you've shown an appalling level of myopia that should immediately disqualify someone from making or in any way being involved in any noteworthy decisions about how one of the web's most critical developer documentation sites should be run; but it's becoming increasingly difficult to believe that this is the thoughtlessness it looks like. |
I feel it's worth pointing out what one of the community call answers had to say: https://github.com/orgs/mdn/discussions/414#discussioncomment-6541058
We're just "an extremely vocal small" minority, apparently, because anyone who simply hasn't responded clearly finds AI integration to be a flawless addition.
I'm pretty sure we've actually expressed a lot of concern that adding more incorrect information to MDN will not help those "not yet capable of finding the correct information" instead of forgetting about them; I said as much earlier in this thread.
See, you say that, but then your very next words are "However, the crux is that it works surprisingly well".
No, it doesn’t. It appears to work surprisingly well, but you can never be certain whether you’ve gotten the one true book containing your life’s story or one of the ones that’s just 60,000 q’s in a row from the infinite library of every combination of words ever made, and that is fundamentally the problem. As for “incorrect answers can be helpful,” I’d like to go on record as saying that I find incorrect answers given to me by a tool that is supposed to give me correct information to be nothing but infuriating. I don’t even like getting wrong information from Stack Overflow answers because now I’m having to waste more of my time trying to figure out why it’s not working as expected. I’m sure we’re all more than familiar with adapting Stack Overflow answers that sort of answer the same question we’re trying to ask, but that, too, is a fundamentally different process than “ask the magic answer box my exact question and get an exact answer that should work”. Finally, I think if you really wanted to impress upon your users the limitations of these tools, you wouldn’t call them “AI” anything. You’d call them “LLM Help” and “LLM Explain”. “AI” has so many sci-fi implications about sentience and reasoning and understanding embedded in it that expecting people to see “AI” in the name of a tool and think “box that makes convincing-sounding sentences” is, frankly, laughable. Despite disclaimers plastered every which way, people are still using ChatGPT to do things like write translations and write legal briefs full of hallucinated court case citations. People will not use these tools the way you expect them to, doubly so if you keep insisting on calling them something they very blatantly are not: artificial intelligence. |
In a fair world the people who introduced these programs by referring to them as AI would have burst into black flames for the sheer hubris of it all. They are parody generators. Nothing more.
I am aware that management has long moved on and am not expecting a response, here, but I wanted to raise this nevertheless just in case someone who can effect change sees it by chance. The paper Who Answers It Better? An In-Depth Analysis of ChatGPT and Stack Overflow Answers to Software Engineering Questions, Kabir et al, 2023 (preprint) delivers exactly what its title suggests. It finds that ChatGPT answers for software engineering questions are wrong 52 per cent of the time - to within a margin of error the same as tossing a coin. But it goes deeper than that. Because ChatGPT and other LLMs write very, very convincingly, their answers are often preferred over human equivalents (from Stack Overflow, in the case of Kabir et al) - 39.34 per cent of the time, in this case. Of the preferred answers, over 77 per cent were wrong. So, given MDN is using the same technology, I believe it would not be unreasonable to assume the same holds true: of those users clicking the button to report an answer as "helpful," as many as 77 per cent may have done so on an answer which is wrong. But, because they're unfamiliar with the subject matter and ChatGPT's output is designed to sound helpful, they have no idea they're being led up the garden path. |
In my professional opinion, LLMs have no place being included on MDN, where developers come looking for trustworthy technical information. As someone who has used ChatGPT for technical questions numerous times, I know from experience that although it can be quite useful sometimes, it very frequently spews out misinformation and leads you down a rabbit hole of plausible-looking garbage. Often it can take more time trying to get ChatGPT to arrive at a working solution than it would to just use a trustworthy source of documentation (like MDN is supposed to be). This is very confusing and frustrating, especially for newer developers. The things that LLMs can actually answer accurately (most of the time) are simple, well-known things that a quick Google search would have sufficed for. There is a reason why ChatGPT answers are banned on StackOverflow.
I also find it very concerning that newer developers turn to ChatGPT and AI in general as a source of guidance. It is too easy for developers to use it as a crutch. This is dangerous because unlike a calculator being used in mathematics, LLMs/ChatGPT do not always present factually accurate outputs. While using a calculator will always provide an accurate answer for the problem entered, LLMs have no such guarantee. Using GPT is not just detrimental to developers because it reduces their ability to do their own work, but also because it introduces a higher probability of error and often can waste a lot of time. TL;DR: LLMs are not a good source of factual information, and as such MDN shouldn't expect to be considered a reliable source while they have it included on their website.
I know that no action is going to be taken on this. But I would be remiss if I didn't provide this link (not written by me): https://www.zdnet.com/article/third-party-ai-tools-are-responsible-for-55-of-ai-failures-in-business/
Yes! I just made this exact comparison to someone recently. So often the applications people are pushing LLMs for already have solutions (keyword searches, math calculations, boilerplates/templates, etc). And those solutions aren't using an insane amount of processing to get results, sapping communities of potable water, requiring a precarious data training labor pool, etc. The externalities of "AI" and LLMs are massive and it's so frustrating that people hand-wave these important factors away on top of the technology itself being demonstrably worse than things we already have.
Summary
I made a previous issue pointing out that the AI Help feature lies to people and should not exist because of potential harm to novices.
This was renamed by @caugner to "AI Help is linked on all pages." AI Help being linked on all pages is the intended behavior of the feature, and @caugner therefore pointed out that the button looks good and works even better, which I agree with -- it is a fantastic button and when I look at all the buttons on MDN, the AI Help button clearly stands out to me as the radiant star of the show.
The issue was therefore closed without being substantively addressed. (because the button is so good, which I agree with)
I think there are several reasons the feature shouldn't exist which have been observed across multiple threads on platforms Mozilla does not control. Actually, the response has been universally negative, except on GitHub where the ability to have a universally negative response was quietly disabled Monday morning.
Here is a quick summary of some of those reasons.
One, the AI model is frequently wrong. Mozilla claims it intends to fix this, but Mozilla doesn't contain any GPT-3.5 developers and OpenAI has been promising to fix it for months. It's unlikely this will actually happen.
Two: contrary to @caugner's opinion, it's very often wrong about core web topics, including trivial information where there is no obvious excuse. Here are some examples:
- it recommends using <portal> in ways that the page text explicitly tells the user they must not use it.
Even examples posted by people who support the existence of the AI contain significant errors.
(I say examples, but note: this is the only usage example provided by a person who supported the existence of the feature, and it contained an error.)
This is identical to one of the categories of problem seen on StackExchange when StackExchange introduced its generative AI assistant based on the same model, and it led to Stack removing the assistant because it was generating bizarre garbage.
Three: it's not clear that any documentation contributors were involved in developing the feature. Actually, it's still unclear who outside of @fiji-flo and @caugner was involved in the feature. Some contributors including @sideshowbarker have now objected and the process has produced a default outcome, which is that AI Explain was voluntarily rolled back and AI Help remains in the product.
It is probably OK for those contributors to review each other's code, but they're also managing the response to the backlash. After a bunch of people have already signaled "hey, I have an active interest in this feature" by engaging with a relevant issue, excluding those people reflects that a ruling of "actually, you do not have an active interest!" has been reached, and it's not clear what basis that ruling would have been reached on.
Four: the existence of this feature suggests that product decisions are being made by people who don't understand the technology or who don't think I understand it.
Overall, the change tells the story that MDN doesn't know who their average user is, but assumes that the average user is (1) highly dissimilar to the GitHub users who were involved in the backlash (2) easy to sell to.
The fact is that in one day, measured in upvotes, you attracted comparable backlash to what the entire StackOverflow strike attracted in a month. It would be a mistake to think only a small group of people are concerned. This attitude would be wishful thinking.
It seems like the fork in the road for MDN is:
If option 1 isn't sustainable, then between option 2 and option 3, option 3 is obviously better for humanity in the long-run and I would encourage MDN to make plans for its own destruction.
In the worst possible world, the attitude is correct and the users are easy to sell to. Well, in that case, you've created another product company and in doing so you've metaphorically elected to serve both God and money -- and as is evidenced by the recent implosions of every siloed social media company, that is always a great idea.
Again, the AI Help button is absolutely gorgeous and functions as intended. This issue is not about the AI Help button and therefore should not be closed as a button-related wontfix, or renamed by @caugner into a description of the behavior of the button.
URL
#9208
#9214
Reproduction steps
Pivot to a more aggressive funding model, then engage in a mix of panic and corporate groupthink.
Expected behavior
I think the button is amazing and you are doing a great job.
Actual behavior
The AI help feature should not exist.
Device
Desktop
Browser
Chrome
Browser version
Stable
Operating system
Windows
Screenshot
Anything else?
No response