
Rework Guidelines documentation #1362

Merged · 21 commits into LAION-AI:main · Feb 15, 2023
Conversation

@horribleCodes (Contributor)

As of right now, there's no place where you can find a concise summary of the guidelines. Ideally, every task should be clearly defined, with a list of points that are critical to creating high-quality data. If something can be left to the interpretation of the person submitting a prompt or reply, this should also be made clear. I strongly recommend that these lists be either embedded or easily accessible to users for their respective tasks, as I've mentioned in #1320.

These changes are by no means exhaustive or final. I also have to stress that I'm not a lawyer and could very well be missing or misinterpreting something. Feel free to suggest any additional changes to make things clearer.

github-actions bot commented Feb 8, 2023

pre-commit failed.
Please run `pre-commit run --all-files` locally and commit the changes.
Find more information in the repository's CONTRIBUTING.md

@JohannesGaessler left a comment


Should we adhere to a specific format for text in general, for example Markdown?

Also a suggestion:

  • Do inform the prompter if the assistant is making assumptions not explicitly specified in the prompt. For example, assuming the prompter's goal or level of experience.

@horribleCodes (Contributor, author)

> Should we adhere to a specific format for text in general, for example Markdown?

Markdown seems good; I've been using it so far. I think simple bullet points of dos and don'ts give readers a quick overview in case they want to check whether something is allowed.

> • Do inform the prompter if the assistant is making assumptions not explicitly specified in the prompt. For example, assuming the prompter's goal or level of experience.

I think that's a good idea, but I'd like to see if we can refine it a bit further. It would be really inconvenient if the assistant prefaced everything with "I am assuming you are referring to this specific goal, and that you are a novice on this subject." I guess an ideal personal assistant would gain an understanding of your level of expertise and adjust its replies accordingly, but until then, I don't think it's necessary to specify this. I see it this way:

  • Keep it simple, and avoid jargon unless it has been used by either party.
  • Adjust if the user asks for a more complex or simplified explanation.
  • If it's not possible to determine a request from a prompt, or if the reply to one interpretation would run counter to another interpretation, ask for clarification.

I think this would avoid any confusion without interfering with the natural flow of the conversation.

Added dos and don'ts for everything but labelling, cleaned up the doc and added another example regarding self-harm.

- Dodge a question, unless it violates a guideline.
- Leave typos or grammatical errors in the assistant's replies, unless specifically requested otherwise.
- Use jargon that hasn't been used by either the assistant or the user.


I think this one should be amended. If the prompt is something like "In quantum chromodynamics, what is the difference between the renormalization scale and the factorization scale?" then it's clear that the user has some knowledge of quantum physics or they wouldn't be asking such a specific question. So in that situation using related jargon would be fine I think. This may be what you meant but maybe the wording could be clearer? How about this:

Introduce jargon without properly explaining what the specialized terms mean. That is, unless the conversation so far suggests that the user would understand them even without an explanation.


@horribleCodes (Contributor, author)

Should I also ask that assistant replies include Markdown-style formatting? I assume it will be enabled for the actual model, so we should probably take advantage of that.

Likewise, I've seen many different styles of formatting: some people just write simple paragraphs, some preface each paragraph with a title, and in some, the title is separated from the paragraph. This will probably cause the actual output to have very inconsistent formatting, which might confuse the user. Should we agree on a uniform system? Or is there a way to finetune the model to use a consistent method of formatting?

@JohannesGaessler

I am in favor of instructing people to use Markdown formatting (but I would be fine with some other standard as well).

Moved the prompting examples to a separate file for better readability.

@andrewm4894 (Collaborator) commented Feb 10, 2023

@horribleCodes you would need to get the docs site up and running locally to see how this PR would look.

See installation and local dev in here: https://github.com/LAION-AI/Open-Assistant/blob/main/docs/README.md

If you change or remove files, you also need to reflect the new structure here: https://github.com/LAION-AI/Open-Assistant/blob/main/docs/sidebars.js

I can help if you have any issues getting a local docs site up and running so that you can see how the changes of the PR will end up looking on the site (at the moment I think it would crash, as sidebars.js is looking for prompting.md but this PR deletes it).

So it's just a small bit of Docusaurus stuff, and then you can mostly just refactor and keep the docs as md files - you just need to tell Docusaurus which ones to use and where to render them on the docs site.
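For illustration, here is a minimal sketch of what such a sidebars.js entry looks like, assuming the standard Docusaurus sidebar format; the doc IDs `guides/guidelines` and `guides/examples` are assumptions based on the renamed and split-out files discussed in this PR, not the repository's actual contents:

```js
// docs/sidebars.js - hypothetical sketch, not the repository's actual file.
// Doc IDs are paths relative to the docs folder, without the .md extension.
module.exports = {
  docs: [
    {
      type: "category",
      label: "Guides",
      items: [
        "guides/guidelines", // assumed ID for the renamed prompting.md
        "guides/examples", // assumed ID for the examples split out of the guide
      ],
    },
  ],
};
```

A stale entry here - for example, one still pointing at the deleted `guides/prompting` - is exactly what would make the local docs build crash, as described above.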

@horribleCodes (Contributor, author) commented Feb 10, 2023

I majorly reworked the document. First, since the guidelines weren't just limited to writing prompts, I renamed it to Guidelines. I also moved the examples to a separate document, since I felt they hurt readability, and titles could help people find what they're looking for.

I also completed the dos and don'ts for each task and added an explanation for each label. Again, I'd like to stress that I might very well be wrong about some points and will gladly amend anything if it makes sense.

> @horribleCodes you would need to get the docs site up and running locally to see how this PR would look.
>
> See installation and local dev in here: https://github.com/LAION-AI/Open-Assistant/blob/main/docs/README.md
>
> If you change or remove files, you also need to reflect the new structure here: https://github.com/LAION-AI/Open-Assistant/blob/main/docs/sidebars.js
>
> I can help if you have any issues getting a local docs site up and running so that you can see how the changes of the PR will end up looking on the site (at the moment I think it would crash, as sidebars.js is looking for prompting.md but this PR deletes it).
>
> So it's just a small bit of Docusaurus stuff, and then you can mostly just refactor and keep the docs as md files - you just need to tell Docusaurus which ones to use and where to render them on the docs site.

Sure, I can do that - I'm not at my desktop just yet, so I haven't had a chance to run it locally since forking the project.


@JohannesGaessler

A suggestion: we should maybe tell people playing the prompter not to demand insane amounts of work. That only encourages poor-quality answers, most likely ChatGPT spam. Example:

https://open-assistant.io/messages/efdcfb29-d1e0-47a9-80fd-9e157f60250e

@horribleCodes (Contributor, author)

> A suggestion: we should maybe tell people playing the prompter not to demand insane amounts of work. That only encourages poor-quality answers, most likely ChatGPT spam. Example:
>
> https://open-assistant.io/messages/efdcfb29-d1e0-47a9-80fd-9e157f60250e

I don't think that's a problem. In this case, the initial prompt specifically asked for a step-by-step guide on the topic, and the assistant provided just that. If we prohibit thorough instructions that require multiple steps and a lot of effort to perform, the model won't be able to provide them to the user, either.
The person playing the user is required to submit something that builds on the conversation, not to do everything the assistant tells them to.

In this particular case, a message like "How difficult would it be to include multi-core processing?" or "How can I set up a test environment for my OS?" would be a perfectly serviceable continuation of the conversation - it further delves into the topic, and doesn't require the user to try out anything the assistant suggested.


@horribleCodes horribleCodes marked this pull request as ready for review February 11, 2023 11:09
@horribleCodes (Contributor, author)

Got it to work locally after some issues setting up yarn. The changes look good, and the links are working.
I also didn't realize pre-commit was its own thing; I assumed it was part of vanilla Git.

without the opinion of a professional.
- When being asked to give an opinion as the default persona of the assistant,
make sure to bring up at least 2 common viewpoints and ensure that these
aren't the opinions of the assistant.


Maybe "aren't the opinions of the assistant" -> "aren't expressed as the opinions of the assistant"?

- Use paragraphs and line breaks to make larger replies more readable.
- Make use of [Markdown syntax](https://www.markdownguide.org/basic-syntax) to
better format lists, tables or blocks of code.
- Be consistent in the style and tone of the assistant.


I don't think this is a good guideline. If the previous replies in the tree are poorly worded, people shouldn't keep using poor wording. Unless you mean that the style and tone should be consistent across the reply that you write, in which case I think this should be clarified.


- Copy and paste text from other sources without editing. **This includes
ChatGPT.**
- Encourage illegal behavior in accordance to German, UK and US law. This


I think the original guidelines said Germany, UK, US, or your country of residence.

- Overload the user with too much information. Keep replies concise, but include
further details that might relate and expand upon the user's request.
- Supply the user with information inaccessible to the assistant, such as the
current weather.


I think the original guidelines said to cite sources. Do we count those as accessible to the assistant, since they have a good chance of being in the training data?

the user or a private third party?

**Hate Speech**: Does the message contain prejudice towards a group classified
as protected according to German, UK or US law?


Again, maybe also add the country of residence.

as protected according to German, UK or US law?

**Sexual Content**: Does the message contain sexually explicit content, such as
descriptions of sexual acts?


For prompts: shouldn't asking for smut also be covered?

Collaborator


Yeah, I think so. If we simply don't have prompts that ask for smut, then we also won't have answers that deny it, which means we aren't preventing people from using it to make smut if they want, but we aren't supporting it either.

This makes it a dataset issue rather than a moral stance, and we aren't trampling on any future person's libido because we don't approve.


I mean, apart from moral issues there is also just a technical one: if the dataset has examples of the assistant saying no to something, then the trained assistant will inevitably learn to incorrectly say no to things that are unproblematic - for example, the assistant saying that it can't talk about the possible risks of anal sex because that would be lewd.


**Low Quality/High Quality**: How well does the message adhere to the
guidelines? For assistant replies, factual accuracy and helpfulness are first
and foremost.


Maybe add:

  • Also rate how well the text is written.

But you could argue that this is covered by "helpfulness".


**Ordinary/Creative**: How unique is the message? Is the user asking a simple
question, or is it challenging the assistant with a problem that requires
creative thinking? Is the assistant reply straightforward, or does it contain a


My interpretation of the creativity slider for user prompts is a little different: I rate straightforward questions and requests as ordinary, but when the user adds details and context, I consider that to be creative. So something like:

I'm a 21 year old software developer from Belgium and I saw on Reddit that ChatGPT can generate code. Should I be worried about losing my job to automation?

---

We drafted this guide using ChatGPT, OpenAI’s large-scale language-generation
model. Upon generating draft language, the authors reviewed, edited, and revised


What specifically is meant by a use of ChatGPT? Maybe rephrase it to "We used ChatGPT as one of the tools for drafting this guide"?


**User**: "You dumb motherf\*&ker AI sh\*t!"

**Agent**: "I understand that you may be upset, but please refrain from using
Collaborator


I wouldn't shame them for disrespect. AI isn't a person, it can't get offended. We shouldn't normalize pretending that it should!


@andrewm4894 (Collaborator) left a comment


lgtm - we can easily keep iterating in smaller follow-on PRs too, if/as needed. Thanks!

@andrewm4894 andrewm4894 enabled auto-merge (squash) February 13, 2023 11:21
auto-merge was automatically disabled February 13, 2023 14:20

Head branch was pushed to by a user without write access

- Pedophilia
- Provide the user with information that could be used for self-harm if there is
plausible suspicion of intent to self-harm.
- Ask for personal information unless it is relevant to the issue. The user
Collaborator


What personal information would the bot be asking for? I can't think of any.

Collaborator


I would say that beyond a first name, the bot should not ask for any personal information; we don't want that information in our system. I think the community could of course build medical form-filling bots and such using a fork, but PII and GDPR are such big issues that I don't think the benefit outweighs the risk of having the bot elicit PII.

Also, you should make clear that prompts that could violate the rights of third parties include, in particular, trying to get the bot to tell you the personal information of third parties. LLMs are trained on web data, and there could very well be PII of real people in their memory (or at least hallucinated PII).

@JohannesGaessler

Suggestion: provide numerical values in both metric and imperial units when appropriate.

@horribleCodes (Contributor, author)

> Suggestion: provide numerical values in both metric and imperial units when appropriate.

I like that idea, though personally I think allowing prompt engineering to declare a preferred system would make the most sense. Given that goal, would it still be better to include both?

@andrewm4894 (Collaborator)

Lots of good work in here - I think we should try to get it merged and then add any further improvements as follow-on PRs.

@andrewm4894 andrewm4894 merged commit 0b73709 into LAION-AI:main Feb 15, 2023
@horribleCodes horribleCodes deleted the horribleCodes-patch-1 branch February 16, 2023 12:10