From 780b269255f18e3afe0b01e0c9c597fd050170bb Mon Sep 17 00:00:00 2001
From: Riccardo Gualtieri <77571958+Rai-Rock@users.noreply.github.com>
Date: Thu, 28 Dec 2023 12:16:47 +0100
Subject: [PATCH] =?UTF-8?q?Create=20Posts=20=E2=80=9Ccome-ho-creato-un-sit?=
 =?UTF-8?q?o-con-chat-gpt=E2=80=9D?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 .../posts/come-ho-creato-un-sito-con-chat-gpt.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
 create mode 100644 src/content/posts/come-ho-creato-un-sito-con-chat-gpt.md

diff --git a/src/content/posts/come-ho-creato-un-sito-con-chat-gpt.md b/src/content/posts/come-ho-creato-un-sito-con-chat-gpt.md
new file mode 100644
index 0000000..6008f02
--- /dev/null
+++ b/src/content/posts/come-ho-creato-un-sito-con-chat-gpt.md
@@ -0,0 +1,14 @@
+---
+title: How I built a website with ChatGPT
+description: On the support that software can give to a writer's work
+publishedAt: 2023-12-28T11:15:49.384Z
+isPublish: true
+isDraft: false
+---
+
+
+#### Abstract
+
+Human feedback can prevent overtly harmful utterances in conversational models, but may not automatically mitigate subtle problematic behaviors such as a stated desire for self-preservation or power. Constitutional AI offers an alternative, replacing human feedback with feedback from AI models conditioned only on a list of written principles. We find this approach effectively prevents the expression of such behaviors. The success of simple principles motivates us to ask: can models learn general ethical behaviors from only a single written principle? To test this, we run experiments using a principle roughly stated as "do what's best for humanity." We find that the largest dialogue models can generalize from this short constitution, resulting in harmless assistants with no stated interest in specific motivations like power. A general principle may thus partially avoid the need for a long list of constitutions targeting potentially harmful behaviors. However, more detailed constitutions still improve fine-grained control over specific types of harms. This suggests both general and specific principles have value for steering AI safely.
+
+ 
\ No newline at end of file