Privacy and copyright? #8

Open
randomm opened this issue Sep 27, 2023 · 7 comments

randomm commented Sep 27, 2023

I am very keen to try this plugin, but what about the copyright of private repo code? Will the code be accessed by you, or saved by OpenAI?

Contributor

dsomok commented Sep 27, 2023

Hi @randomm,

Thank you for bringing this up, it's a very valid concern.

The plugin does not collect or store source code from the repositories. It logs neither file contents nor parts of them. All interactions with the code are transient and happen solely in memory. When the plugin processes a repository link, it first sends ChatGPT the list of files in that repository. During the chat, ChatGPT may request specific files, prompting the plugin to retrieve their contents and pass them along. Importantly, once the content has been relayed to ChatGPT, it is immediately purged from the plugin's memory, so file contents are never cached or stored.

Regarding OpenAI's data handling, by default, they might use conversations for model training. You can find more details on this here. However, there's an option to disable this. Simply navigate to Settings -> Data Controls and deactivate "Chat history & training". More on this can be found here.

I hope this addresses your concerns. Should you have any more questions or require further clarification, I'll be glad to help!

Author

randomm commented Sep 27, 2023

Thank you for your prompt reply!

This sounds good. I was also aware of OpenAI’s policy and of the option to turn off history in their settings.

Your plugin code is not publicly viewable, is it? Has its function been verified by OpenAI?

I do not want to sound like I am looking for faults here. I would be very keen to try the plugin with our organisation’s repos, but I would like to have some guarantees in place that you are not able to access our code. Even letting OpenAI potentially have access is a calculated risk, but as I understand it you are not a company (or other legal entity), and there are no terms and conditions for our lawyer to pore through. If I tell them that I would like to grant you access to our codebase, I should come armed with some solid backup.

Contributor

dsomok commented Sep 27, 2023

Your concerns make perfect sense to me. I have worked as a consultant helping companies integrate GPT with their data, and they were very cautious about OpenAI's privacy policies. So, when it comes to a smaller plugin like this, all these concerns are absolutely valid and understandable.

To address your questions:

  • I am currently the sole developer of this plugin, so there isn't a legal entity behind it yet.
  • For GitHub OAuth, I use PluginLab. This means I don't store the GitHub access tokens on my end. Instead, the tokens are stored on the PluginLab side, and I retrieve them from there.
  • I do cache this token in memory, but it's only for a short duration of 5 minutes. This cache is not persisted to any type of storage and exists solely in memory.
  • The plugin's codebase is indeed in a private repository.
  • To clarify, OpenAI does not review the codebases of plugins. When the plugin is submitted to the Plugin Store, OpenAI reviews only its behavior. They test it to ensure the plugin works as intended and provides only relevant information.
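
For illustration only, since the plugin's real code is private: a short-lived, in-memory token cache of the kind described in the list above could look roughly like the sketch below. The class name `TokenCache`, the `fetch` callback (a stand-in for retrieving the token from the OAuth provider), and the injectable `clock` are all assumptions made for this example, not the plugin's actual implementation.

```python
import time
from typing import Callable, Dict, Tuple

TTL_SECONDS = 5 * 60  # the 5-minute lifetime mentioned above


class TokenCache:
    """Hypothetical in-memory cache; nothing is ever written to storage."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl: float = TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # assumed stand-in for the call to the OAuth provider
        self._ttl = ttl
        self._clock = clock
        # Maps user id -> (token, expiry time); lives only in process memory.
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, user_id: str) -> str:
        now = self._clock()
        entry = self._entries.get(user_id)
        if entry is not None and now < entry[1]:
            return entry[0]  # cached token is still fresh
        token = self._fetch(user_id)  # missing or expired: fetch again
        self._entries[user_id] = (token, now + self._ttl)
        return token
```

Because the dictionary lives only in the process, restarting the service drops every cached token, which matches the "not persisted to any type of storage" description.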

I completely understand and respect your concerns. While developing the plugin, my priority was always to ensure that users' repository file contents could not leak in any manner, and that those contents would remain inaccessible to me as the developer.

Contributor

dsomok commented Sep 28, 2023

Hi @randomm,

In addition to my last comment, I'd like to note that requests sent to the plugin are logged. This greatly helps with support and with resolving issues.

Responses are not logged in any manner, but file names in the repository are logged, because they are part of the content retrieval requests. If you consider file names private information, this could indeed be a concern.

NotCoffee418 commented Nov 18, 2023

The permissions requested by the authorization are entirely too permissive, no matter how good the intent.
This is what it currently requests permission for:

Read and Write on all public and private repositories.

  • Code
  • Issues
  • Pull requests
  • Wikis
  • Settings
  • Webhooks and services
  • Deploy keys
  • Collaboration invites

Ideally you should be able to select the repositories you wish to grant it read-only access to, and only permissions essential for the plugin to function:

  • Code
  • Issues
  • Pull Requests (?)
  • Wikis

Contributor

dsomok commented Nov 18, 2023

Hi @NotCoffee418,

Your concerns about the permissions requested by the authorization are completely valid. However, due to the granularity of permissions for GitHub OAuth apps, it's not possible to limit them in the way you've suggested. I discussed this limitation in more detail in another issue, which you can read about here: #3 (comment).

The repo scope is the only one that grants access to the code in private repositories, allowing users to use the plugin with their own repos, not just public ones.
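
To make the scope discussion concrete, here is a minimal sketch of how an OAuth app asks for scopes when it sends a user to GitHub's authorization page. The client ID, redirect URL, and state value are placeholders, and the helper names are hypothetical; this is not the plugin's code. For OAuth apps, `repo` covers read and write access to public and private repositories, while `public_repo` is limited to public ones.

```python
from typing import List
from urllib.parse import parse_qs, urlencode, urlparse

AUTHORIZE_URL = "https://github.com/login/oauth/authorize"


def build_authorize_url(
    client_id: str, redirect_uri: str, scopes: List[str], state: str
) -> str:
    """Build the URL the user is redirected to in order to grant access."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),  # GitHub expects a space-delimited list
        "state": state,             # random value to protect against CSRF
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


def requested_scopes(url: str) -> List[str]:
    """Read the scopes back out of an authorization URL."""
    return parse_qs(urlparse(url).query)["scope"][0].split()


# Placeholder values, for illustration only.
private_url = build_authorize_url(
    "CLIENT_ID", "https://example.com/callback", ["repo"], "random-state"
)
public_url = build_authorize_url(
    "CLIENT_ID", "https://example.com/callback", ["public_repo"], "random-state"
)
```

Requesting `public_repo` instead would narrow the grant, but the plugin could then no longer read private repositories.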

@NotCoffee418

Thanks for your response. I understand that your OAuth provider does not have this feature, but GitHub's native OAuth does allow for selective and optional scoping.

I don't know to what extent your OAuth provider is a requirement, since the plugin appears to be closed source, but I would again encourage the use of an alternative approach if possible.
Unfortunately, granting this much access is a dealbreaker for me, and I would assume for others too.

Best of luck with your project!
