Using Linguistic Steganography in Open Assistant #792

Closed
SummerSigh opened this issue Jan 17, 2023 · 5 comments
@SummerSigh
Collaborator

This issue pertains to the ideas around watermarking Open Assistant generations via linguistic steganography. Linguistic steganography is an area of active research, but a few methods have emerged that make me confident it can be used effectively to identify Open Assistant generations. This issue isn't going to detail those methods, but rather to discuss the ethical debate around using linguistic steganography.

In my opinion, these are the main benefits:

  1. Preventing plagiarism: Watermarking can help prevent others from using the generated text without proper attribution, which is especially important in academic and professional settings.
  2. Identifying false information: Watermarking can also be used to identify fake or misleading information that is generated by Open Assistant, which can help combat misinformation and disinformation.

However, I also have the following concerns:

  1. Limited applicability: Watermarking may not be effective in certain contexts, such as when the text is heavily edited or translated.
  2. Decreasing efficiency: Watermarking can add an extra step to the text generation process, which can slow down the process and make it less efficient.
  3. Complicating collaboration: Watermarking can make it more difficult for multiple users to work together on a project. For example, if part of a program is written using Open Assistant, it becomes more difficult to detect which parts of the code were generated by the model.

I think this is a subject that merits real discussion, and it requires multiple viewpoints so we can properly evaluate whether this is something we should pursue further.

@huu4ontocord
Collaborator

huu4ontocord commented Jan 17, 2023

Thank you for bringing this important issue up, @SummerSigh.

This is a hard issue, and I admit I haven't thought about it very much. My main focus is to have people use our tool to do better work and to help people learn.

From the education perspective, I think cheating on school essays could be a problem. However, I don't know whether we should build watermarking in upfront, or instead put in API hooks that let, say, a school-focused open-source community plug such a tool into the generator, so those communities can decide for themselves.

As for false information, I know this is a real problem. But the danger is that whatever solution we devise to limit fake news by tracking (with a switch turned on by default, for example, that people can turn off if they are researching this kind of thing) is still a tracking technology, and I don't want to propagate tracking technologies. I don't want orgs or governments to be able to track people unless they follow the process set up in their jurisdictions.

Another way to limit fake news is to make our models more factual, which we are trying to do. But we also don't want to stop people from creating fiction. An alternative history where Hitler won the war and everyone in America lives under Nazi rule (see, e.g., The Man in the High Castle) would have news articles very different in tone from our current news articles, for example. I can imagine people could, and should, be able to use OA to generate such fiction. Could that be used for fake news?

I look forward to the discussion here. Also, in our discussion we should propose solutions that are actually implementable with tech we have now or can reasonably create. For example, I think watermarking could be done via some sort of decoder skewing that might not change the semantics much, but we would need to research this.
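To make that concrete, here is a minimal sketch of what decoder skewing could look like, in the spirit of green-list/red-list watermarking: at each step a pseudo-random subset of the vocabulary, seeded by the previous token, gets a small logit bonus, and a detector with the same seed can later test for that bias. All names and constants below are illustrative assumptions, not an agreed-upon OA design.

```python
import torch

VOCAB_SIZE = 32000     # assumption: model vocabulary size
GREEN_FRACTION = 0.5   # fraction of the vocab favored at each step
BIAS = 2.0             # logit bonus added to "green" tokens

def green_ids(prev_token_id: int) -> torch.Tensor:
    """Deterministically pick the favored ("green") token ids for this step."""
    g = torch.Generator().manual_seed(prev_token_id)
    perm = torch.randperm(VOCAB_SIZE, generator=g)
    return perm[: int(GREEN_FRACTION * VOCAB_SIZE)]

def skew_logits(logits: torch.Tensor, prev_token_id: int) -> torch.Tensor:
    """Add a small bonus to the green-list logits before sampling."""
    biased = logits.clone()
    biased[green_ids(prev_token_id)] += BIAS
    return biased

# Detection (sketch): count how many tokens of a text fall in the green list
# for their position and run a z-test against GREEN_FRACTION; watermarked
# text shows a statistically significant excess of green tokens.
```

The bias only nudges sampling, so the semantics should change little; how robust this stays under heavy editing or translation is exactly the kind of thing we'd need to research.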

@SummerSigh
Collaborator Author

@ontocord here are some implementations I have tested and know to work:

https://github.com/falcondai/lm-steganography
https://github.com/ku-nlp/steganography-with-masked-lm
https://github.com/mickeysjm/StegaText
https://github.com/jumon/himitsu

I personally like steganography-with-masked-lm, so I put it up in this Kaggle notebook:

https://www.kaggle.com/code/summerbreeze11/text-steganograpthy/edit
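For anyone who hasn't looked at these repos, here is a rough sketch of the general masked-LM steganography idea they build on. It illustrates the principle only and does not use the actual API of any repo above; the model name, the k=2 bit width, and the embed_bits helper are all illustrative assumptions.

```python
from transformers import pipeline  # assumes the transformers library is installed

unmasker = pipeline("fill-mask", model="bert-base-uncased")

def embed_bits(sentence_with_mask: str, bits: str, k: int = 2) -> str:
    """Hide k secret bits by choosing among the top 2**k fill-mask candidates."""
    candidates = unmasker(sentence_with_mask, top_k=2 ** k)
    rank = int(bits[:k], 2)               # e.g. "10" -> candidate at index 2
    return candidates[rank]["sequence"]   # cover sentence with the chosen word

# The receiver reruns the same model on the masked sentence, finds the rank of
# the observed word among the candidates, and recovers the hidden bits.
print(embed_bits("Open Assistant is a [MASK] project.", "10"))
```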

@huu4ontocord
Collaborator

Thank you for sharing these links!

@huu4ontocord
Collaborator

Following up on this: is anyone interested in exploring this more? Maybe we could create an API hook into the safety pipeline to steer decoding, and any org could put its own method/callback in there. This could be useful for privately run bots; a rough sketch is below.
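A hypothetical sketch of such a hook, just to make the proposal concrete; none of these names exist in the OA codebase, and the actual integration point would need to be decided.

```python
from typing import Callable, Optional
import torch

# A hook takes the running token ids and the next-token logits and returns
# (possibly skewed) logits. Watermarking, steganography, or any other
# decoder-steering method can be packaged this way.
DecodingHook = Callable[[torch.Tensor, torch.Tensor], torch.Tensor]

_registered_hook: Optional[DecodingHook] = None

def register_decoding_hook(hook: DecodingHook) -> None:
    """Called once at startup by a deployment that wants to steer decoding."""
    global _registered_hook
    _registered_hook = hook

def apply_decoding_hook(input_ids: torch.Tensor, logits: torch.Tensor) -> torch.Tensor:
    """Called by the generator at every step; a no-op if nothing is registered."""
    return _registered_hook(input_ids, logits) if _registered_hook else logits
```

Each org running a private bot would register its own callback, and a default install would stay unwatermarked.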

@SummerSigh
Collaborator Author

I’ll take a look at it further. I’ll put my updates here as I go.
