
Diversity in building the model and training it LLM03 #114

Closed
ManishYadu7 opened this issue Aug 5, 2023 · 1 comment
Labels
- enhancement: Changes/additions to the Top 10; e.g. clarifications, examples, links to external resources, etc.
- llm-03: Relates to LLM Top-10 entry #3
- wontfix: Indicates a deliberate decision has been made not to fix the issue

Comments

@ManishYadu7

- Have a team with diverse backgrounds and solicit broad input. Diverse perspectives are needed to characterize and address how language models will operate in the diversity of the real world, where, if unchecked, they may reinforce biases or fail to work for some groups.
- Over-reliance could cause bias based on color, gender, and physical appearance.

  1. https://hbr.org/2019/10/what-do-we-do-about-the-biases-in-ai
rot169 added the enhancement and llm-03 labels on Aug 6, 2023
GangGreenTemperTatum (Collaborator) commented Aug 7, 2023

Hey Manish,

Many thanks for reaching out and I appreciate the suggestion.

Whilst I understand your hypothesis, I do not feel this is a significant enough risk to explicitly call out as a vulnerability under Training Data Poisoning. As LLM application developers, we do care about safety- and harms-related risks such as bias and judgement. Ultimately, we should be catering for this in other avenues, such as the sources and supply chain of the foundation training data, fine-tuning, and benchmarking. In terms of these risks, the current LLM03: Training Data Poisoning entry already lists ways to mitigate against high-risk data sources:

[Screenshot: the LLM03: Training Data Poisoning entry's mitigation guidance for high-risk data sources]
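For illustration only (not taken from the Top 10 entry or this thread), here is a minimal sketch of the kind of training-data supply-chain check alluded to above, with hypothetical file paths and placeholder digests: each data source is pinned to a known SHA-256 and anything that does not match is rejected before fine-tuning.

```python
import hashlib
from pathlib import Path

# Hypothetical allow-list of vetted training data sources; the digest
# below is a placeholder, not a real value.
PINNED_SOURCES = {
    "data/curated_corpus.jsonl": "0" * 64,
}

def sha256_of(path: str) -> str:
    """Compute the SHA-256 digest of a file on disk."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

# Verify provenance before the data ever reaches fine-tuning.
for path, expected in PINNED_SOURCES.items():
    if sha256_of(path) != expected:
        raise RuntimeError(f"Training data source failed verification: {path}")
```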

I will close this one out in ~ a week if I don't get a response on this.

GangGreenTemperTatum added and removed the wontfix and notapplicable labels on Aug 7, 2023