Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks, Veniamin Veselovsky+, N/A, arXiv'23 #780

AkihikoWatanabe · 2023-07-03T12:02:13Z

URL

https://arxiv.org/abs/2306.07899

Affiliations

Veniamin Veselovsky, N/A
Manoel Horta Ribeiro, N/A
Robert West, N/A

Abstract

Large language models (LLMs) are remarkable data annotators. They can be used to generate high-fidelity supervised training data, as well as survey and experimental data. With the widespread adoption of LLMs, human gold-standard annotations are key to understanding the capabilities of LLMs and the validity of their results. However, crowdsourcing, an important, inexpensive way to obtain human annotations, may itself be impacted by LLMs, as crowd workers have financial incentives to use LLMs to increase their productivity and income. To investigate this concern, we conducted a case study on the prevalence of LLM usage by crowd workers. We reran an abstract summarization task from the literature on Amazon Mechanical Turk and, through a combination of keystroke detection and synthetic text classification, estimate that 33-46% of crowd workers used LLMs when completing the task. Although generalization to other, less LLM-friendly tasks is unclear, our results call for platforms, researchers, and crowd workers to find new ways to ensure that human data remain human, perhaps using the methodology proposed here as a stepping stone.
Code/data: https://github.com/epfl-dlab/GPTurk

Translation (by gpt-3.5-turbo)

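Large language models (LLMs) are remarkable data annotation tools. They can be used to generate high-quality supervised training data, as well as survey and experimental data. With the widespread adoption of LLMs, human gold-standard annotations are important for understanding the capabilities of LLMs and the validity of their results. However, crowdsourcing, an important and inexpensive way to obtain human annotations, may itself be affected by LLMs, because crowd workers have a financial incentive to use LLMs to increase their productivity and income. To investigate this concern, we conducted a case study on the prevalence of LLM use by crowd workers. We reran an abstract summarization task on Amazon Mechanical Turk and, through a combination of keystroke detection and synthetic text classification, estimated that 33-46% of crowd workers used LLMs when completing the task. Although generalization to other, less LLM-friendly tasks is unclear, our results call for platforms, researchers, and crowd workers to find new ways to ensure that human data remain human. The methodology proposed here could serve as a stepping stone.
Code/data: https://github.com/epfl-dlab/GPTurk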

Summary (by gpt-3.5-turbo)

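To investigate how prevalent large language models (LLMs) are among crowd workers, the authors conducted a case study of LLM use on a crowdsourced task. They estimate that 33-46% of crowd workers used LLMs when completing the task. This suggests that new methods are needed to ensure that human data remain human.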

AkihikoWatanabe · 2023-07-03T12:03:21Z

The paper finds that an estimated 33-46% of Turkers used LLMs on an MTurk text generation task.
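
As a rough illustration of how a prevalence figure like this can be derived from classifier output, here is a minimal sketch. It is not the authors' code (see the GPTurk repository for that). The `Submission` record, the 0.5 threshold, and the sensitivity/specificity values are all hypothetical, and the correction step uses the standard Rogan-Gladen estimator, which is swapped in here as an assumption rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    """Hypothetical per-worker record; field names are assumptions."""
    pasted: bool      # keystroke logging saw a paste event
    clf_score: float  # synthetic-text classifier probability in [0, 1]


def raw_flag_rate(subs, threshold=0.5):
    """Share of submissions whose classifier score reaches the threshold."""
    if not subs:
        raise ValueError("need at least one submission")
    flagged = sum(1 for s in subs if s.clf_score >= threshold)
    return flagged / len(subs)


def flag_rate_among_pasted(subs, threshold=0.5):
    """Same share, restricted to submissions where a paste was observed."""
    pasted = [s for s in subs if s.pasted]
    return raw_flag_rate(pasted, threshold)


def corrected_prevalence(flag_rate, sensitivity, specificity):
    """Rogan-Gladen correction of a raw flag rate for classifier error.

    Returns the estimate clipped to [0, 1].
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("classifier must be better than chance")
    estimate = (flag_rate + specificity - 1.0) / denom
    return min(1.0, max(0.0, estimate))


# Toy data, made up for illustration only.
subs = [
    Submission(pasted=True, clf_score=0.9),
    Submission(pasted=True, clf_score=0.8),
    Submission(pasted=True, clf_score=0.2),
    Submission(pasted=False, clf_score=0.1),
]
rate = raw_flag_rate(subs)
estimate = corrected_prevalence(rate, sensitivity=0.9, specificity=0.9)
```

The keystroke signal is kept separate from the classifier score here so that the two sources of evidence can be inspected side by side. How they are actually combined in the paper is not described in this issue.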

AkihikoWatanabe added the Pocket label Jul 3, 2023

AkihikoWatanabe changed the title あ Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks, Veniamin Veselovsky+, N/A, arXiv'23 Jul 3, 2023

AkihikoWatanabe added Dataset LanguageModel Evaluation labels Jul 3, 2023

AkihikoWatanabe added the NLP label Oct 18, 2023
