Create plan for private-by-default anonymous reporting flow #3124

Closed
miketaylr opened this issue Jan 3, 2020 · 15 comments
@miketaylr
Member

@miketaylr miketaylr commented Jan 3, 2020

The rough idea:

  • anonymous reports are filed in a private repo
  • once triaged, they are moved to public web-bugs repo

Challenges:

  • Getting people to not post private information
  • Anonymous users won't know URL to re-visit issue
  • The usual technical challenges of doing the work
@karlcow

Contributor

@karlcow karlcow commented Jan 5, 2020

Points of discussion:

  1. When reviewers are assessing the legal nature of the content, they might expose themselves too, for example by having their browser cache polluted with the illegal content.
  2. Reviewing content that people would prefer not to see can have serious psychological consequences. This is hard to solve, but should not be ignored.
@miketaylr

Member Author

@miketaylr miketaylr commented Jan 6, 2020

Some notes from Transferring issues:

To transfer an open issue to another repository, you must have write permissions on the repository the issue is in and the repository you're transferring the issue to.

It doesn't seem like there's an API for this yet, so this would have to be done via github.com. (Note that there is an API to transfer a repository.) Maybe if there's an API in the future we could write a small GitHub app to respond to a keyword in a comment or something.

You can only transfer issues between repositories owned by the same user or organization account.

No problem.

You can't transfer an issue from a private repository to a public repository.

😱 Wait, what? I will contact support and ask if there are any possible exceptions to this rule. FWIW, I 100% understand why you would do this. Going from private -> public is full of problematic possibilities (especially if people learn that issues are private by default and expect to give sensitive information).

When you transfer an issue, comments and assignees are retained. The issue's labels and milestones are not retained.

This means we would need to write some code to clone labels... or manually copy them over (which seems tedious).

The original URL redirects to the new issue's URL.

This is cool. It means we could present a URL to anonymous users to bookmark and they could revisit (at some undetermined future point) to see if their (hopefully valid) issue was made public. Annoying, but better than a black hole.

I guess the work flow would be something like:

  1. Anon user files private issue
  2. Triage issue
  3. Manually move (valid) issue to public repo (assuming GitHub exposes a way for us to do this...)
  4. Copy over labels
  5. Assign the correct milestone

Issue: Do we make invalid issues public? I think we probably want all issues that are not abusive or illegal to be public.

Another possibility which would get around the need for special access from GitHub for transferring from private to public would be the following:

  1. Anon user files private issue
  2. Triage issue
  3. Have a magic comment a GitHub app could recognize to do the following, for non-harmful issues:
    1. Clone the content of the original report and file a new (public) issue in web-bugs
    2. Clone the labels
    3. Set the milestone (you could tell it which to set via some magic comment)

We would lose the GitHub URL redirect, though. It might be possible for us to provide our own private to (new) public URL mapping, but this would only work for webcompat.com URLs, not GitHub URLs.
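The "magic comment" variant above could be sketched roughly as follows. This is only an illustration under assumptions: the `/publish` command, its `milestone=` argument, and the function names are hypothetical, and the code only builds the data for the new public issue without calling GitHub.

```python
import re
from typing import Optional

# Hypothetical command: "/publish" on its own line, optionally followed by
# "milestone=<name>". Nothing in the thread fixes this syntax.
MAGIC_COMMENT = re.compile(
    r"^/publish(?:[ \t]+milestone=(?P<milestone>\S+))?[ \t]*$",
    re.MULTILINE,
)


def parse_magic_comment(comment_body: str) -> Optional[dict]:
    """Return publish instructions if the comment contains the command."""
    match = MAGIC_COMMENT.search(comment_body)
    if match is None:
        return None
    return {"milestone": match.group("milestone")}


def build_public_issue(private_issue: dict, instructions: dict) -> dict:
    """Clone title, body and label names from the private issue."""
    payload = {
        "title": private_issue["title"],
        "body": private_issue["body"],
        "labels": [label["name"] for label in private_issue.get("labels", [])],
    }
    if instructions.get("milestone"):
        # Kept as a name here; it would still need to be mapped to a
        # milestone number in the public repo before filing the issue.
        payload["milestone_name"] = instructions["milestone"]
    return payload
```

A triager would then comment something like `/publish milestone=needsdiagnosis` on the private issue, and the app would file the resulting payload in web-bugs.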

@miketaylr

Member Author

@miketaylr miketaylr commented Jan 6, 2020

As for how to implement reporting to a different repo, I would probably start here:

https://github.com/webcompat/webcompat.com/blob/master/webcompat/issues.py#L27-L34

proxy_request should maybe take a flag to indicate private=True (or there should be a private_proxy_request method): https://github.com/webcompat/webcompat.com/blob/master/webcompat/helpers.py#L446-L465. Depending on that, we would pick a different REPO_URI: https://github.com/webcompat/webcompat.com/blob/master/webcompat/helpers.py#L43
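A minimal sketch of the flag idea, assuming a `private` boolean decides which repository receives the report. It does not reproduce the real `proxy_request` signature; the helper names, the URI strings, and the private repo name are placeholders for illustration.

```python
# Placeholder URIs: the real values live in the application's config, and
# "web-bugs-private" is an assumed name for the new private repo.
PUBLIC_REPO_URI = "repos/webcompat/web-bugs/issues"
PRIVATE_REPO_URI = "repos/webcompat/web-bugs-private/issues"


def repo_uri_for(private: bool = False) -> str:
    """Pick the target repository for a new report."""
    return PRIVATE_REPO_URI if private else PUBLIC_REPO_URI


def build_report_request(title: str, body: str, authenticated: bool) -> dict:
    """Describe the request to send; anonymous reports go private by default."""
    private = not authenticated
    return {
        "method": "post",
        "uri": repo_uri_for(private),
        "data": {"title": title, "body": body},
    }
```

Keeping the choice in one small function means the "thanks" page and the webhook code could ask the same question instead of each hard-coding a repo.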

We also need to take into account the "thanks" message. Currently it expects a number that corresponds to a public issue: https://github.com/webcompat/webcompat.com/blob/master/webcompat/templates/issue/thanks.jst. We may or may not want to include that, depending on what the UX flow is.

We'll need to add a webhook on the new private repo and point it at https://github.com/webcompat/webcompat.com/blob/master/webcompat/webhooks/__init__.py.

I think that's enough to get private issues in. I may be forgetting a few details.

Getting them out is TBD, depending on what GitHub support tells me (I emailed them a few hours ago).

@karlcow

Contributor

@karlcow karlcow commented Jan 6, 2020

@miketaylr two thoughts

Going from private → public is full of problematic possibilities (especially if people learn that issues are private by default and expect to give sensitive information).

Except if you make it clear that all issues filed are public. Aka there is no notion of private/public. There is a notion of moderation. This is a bit like a blog post comment with moderation on by default. The comment is not accessible until it has been reviewed.

You can't transfer an issue from a private repository to a public repository.

Another possible way to work this out: why do we need a private GitHub repo? We could collect all anonymous issues in a different DB, with all the structured information we need. Once they have been reviewed by the triage team, a public flag is assigned and a script automatically publishes them (with webcompat-bot OAuth) into the webcompat web-bugs repo.

There is no benefit to putting this through GitHub, and we avoid the liability of illegal content ever hitting GitHub (even if private).
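To make the DB idea concrete, here is a rough sketch using SQLite from the Python standard library. The table layout, column names, and the `post_issue` callable (standing in for whatever posts to web-bugs as webcompat-bot) are all assumptions for illustration.

```python
import sqlite3


def create_queue(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) moderation queue table."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS anonymous_reports (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               title TEXT NOT NULL,
               body TEXT NOT NULL,
               is_public INTEGER NOT NULL DEFAULT 0,
               public_issue_number INTEGER
           )"""
    )


def add_report(conn: sqlite3.Connection, title: str, body: str) -> int:
    """Store an anonymous report; it stays non-public until reviewed."""
    cur = conn.execute(
        "INSERT INTO anonymous_reports (title, body) VALUES (?, ?)",
        (title, body),
    )
    conn.commit()
    return cur.lastrowid


def approve(conn: sqlite3.Connection, report_id: int) -> None:
    """Set the public flag after the triage team has reviewed the report."""
    conn.execute(
        "UPDATE anonymous_reports SET is_public = 1 WHERE id = ?", (report_id,)
    )
    conn.commit()


def publish_approved(conn: sqlite3.Connection, post_issue) -> list:
    """Publish approved reports once; post_issue(title, body) returns a number."""
    rows = conn.execute(
        "SELECT id, title, body FROM anonymous_reports "
        "WHERE is_public = 1 AND public_issue_number IS NULL ORDER BY id"
    ).fetchall()
    published = []
    for report_id, title, body in rows:
        number = post_issue(title, body)
        conn.execute(
            "UPDATE anonymous_reports SET public_issue_number = ? WHERE id = ?",
            (number, report_id),
        )
        published.append(number)
    conn.commit()
    return published
```

Recording `public_issue_number` keeps the script safe to run repeatedly, and it is also the piece of data a later URL mapping would need.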

@miketaylr

Member Author

@miketaylr miketaylr commented Jan 6, 2020

we could collect all anonymous issues in a different DB, with all the structured information we need.

Yep, totally a possibility. I want to explore all the options and their problems before we decide on a plan. Can you come up with a (rough) plan including challenges as you see them for this scenario @karlcow?

@karlcow

Contributor

@karlcow karlcow commented Jan 6, 2020

Also another thought/question: Do we want to publicly publish invalid, anonymously reported issues? Pros/cons to dig into further.

@karlcow

Contributor

@karlcow karlcow commented Jan 6, 2020

Can you come up with a (rough) plan including challenges as you see them for this scenario @karlcow?

yup.

@miketaylr

Member Author

@miketaylr miketaylr commented Jan 6, 2020

Issue: Do we make invalid issues public?

I think we probably want all issues that are not abusive or illegal to be public.

I had the same question... publishing everything non-harmful is good for transparency, and good for anonymous users who might try to follow up with their issue (assuming we can solve the URL problem). But, it's also noisy and time consuming (perhaps, we don't know what it will all look like).

@karlcow

Contributor

@karlcow karlcow commented Jan 7, 2020

@miketaylr

Member Author

@miketaylr miketaylr commented Jan 7, 2020

Just for the record: there is no way to get an exception from GitHub to move an issue from a private repo to a public one, according to support. There is also no planned Issue transfer API. There is, however, a draft Issue Importing API that GitHub pointed me to: https://gist.github.com/jonmagic/5282384165e0f86ef105

So in this scenario, we would need to file a private issue, triage it, then using OAuth, read the details from the private issue and import it into web-bugs. This isn't very different from my idea of cloning an issue, but it looks like it retains history. However, all comments would be attributed to the OAuth user, so that's annoying (but not different from the cloned issue idea).


A thought on what to do with invalid issues filed privately, and the UX of URLs.

A user files a bug, we display a "thanks" page that gives them a unique URL w/ a hash, e.g., webcompat.com/issues/hotdog (trust me, hotdog is a hash) that they can re-visit. "Please revisit this URL in the near future to follow-up on the status of your bug report".

  • Until it has been triaged /issues/hotdog redirects to a landing page that says "hang tight, we're working to triage your issue".
  • If the issue is valid and made public, we maintain a redirect from /issues/hotdog to /issues/NNNNN (whatever it ends up being).
  • If it's INVALID, we flag it in the database and /issues/hotdog will just redirect to a landing page that says "your issue is invalid, here's a link for info on filing valid reports, etc." Then we don't have to expose abuse, spam, or invalid garbage.

invalid, but not harmful -> redirect to landing page w/o content?
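A small sketch of the token-and-lookup part of this idea, assuming an in-memory dict stands in for the database. The function names, the `status` values, and the record layout are made up for illustration; `secrets.token_urlsafe` is from the Python standard library.

```python
import secrets


def new_report_token() -> str:
    """Generate an unguessable token to use in the bookmarkable URL."""
    return secrets.token_urlsafe(16)


def thanks_url(token: str, base: str = "https://webcompat.com") -> str:
    """URL shown on the "thanks" page for the reporter to revisit."""
    return f"{base}/issues/{token}"


def resolve(token: str, reports: dict) -> tuple:
    """Map a token to (action, target) based on the report's triage status."""
    record = reports.get(token)
    if record is None:
        return ("not_found", None)
    if record["status"] == "public":
        # Valid and published: send the user to the real issue.
        return ("redirect", f"/issues/{record['issue_number']}")
    if record["status"] == "invalid":
        # Never expose the content, just explain how to file valid reports.
        return ("landing", "invalid")
    # Anything else is treated as not yet triaged.
    return ("landing", "pending")
```

A random token (rather than a sequential number) means a visitor cannot enumerate other people's pending reports by guessing URLs.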

@karlcow

Contributor

@karlcow karlcow commented Jan 8, 2020

The redirection for handling URL is a good idea. (Repeating what @miketaylr wrote but in a structure I better understand. Talking to my inner duck 🦆 ).

  1. Issue reported (moderation queue)
  2. URL sent to reporter (could be a hash based on time; the arrow of time is unidirectional, so hashes are unique. It could be a simple integer too: mNNNNN 😁 with m for moderation.)
  3. URL non moderated -> content: This issue is x days old, and has not been moderated yet.
  4. Moderation happens by humans and/or bot
    • If positive 💚:
      1. system posts issue on /web-bugs/
      2. system collects returned issues number to the DB
      3. system associates hash-url to 301 with real new issue number.
    • If negative 💔:
      1. system associates the hash-url to a 410 Gone with content saying the issue was deleted because it was not in line with our criteria for acceptability, with links to these criteria.
      2. system deletes the issue content from the DB? (Maybe we need to keep a comment for the reason it has been deleted. Also, the content of the issue doesn't necessarily need to be in a DB; it can be a flat file.)
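The status codes in the steps above could map to responses along these lines. This is a framework-free sketch: the `state` values, the record fields, and the message wording are assumptions, and a real view would return the framework's own response objects.

```python
from datetime import datetime, timezone


def moderation_response(record: dict, now: datetime) -> dict:
    """Build a response description for a moderation (hash) URL."""
    state = record["state"]
    if state == "accepted":
        # Permanent redirect to the issue that was posted on web-bugs.
        return {
            "status": 301,
            "headers": {"Location": f"/issues/{record['issue_number']}"},
            "body": "",
        }
    if state == "rejected":
        return {
            "status": 410,
            "headers": {},
            "body": "This issue was deleted because it was not in line "
                    "with our criteria for acceptability.",
        }
    # Not moderated yet: report how long the issue has been waiting.
    age_days = (now - record["reported_at"]).days
    return {
        "status": 200,
        "headers": {},
        "body": f"This issue is {age_days} days old, "
                "and has not been moderated yet.",
    }
```

Because a 301 is cacheable by default, it fits the accepted case where the mapping never changes; the pending page should stay a plain 200 so browsers re-check it.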

Some questions:

  • Moderation can be a source of pre-triage. We could imagine a form there, where the comment is posted under the moderator's name into web-bugs when moderation is positive. Pros: easier for the triager? Cons: handling of OAuth is probably more complicated; another layer of requirements.
  • OR moderation should just be moderation, and when done it goes to triage. Pros: less complicated to develop. Cons: more delays, more human actions?
  • Should all issues, even authenticated ones, be recorded in the issues DB (even if posted directly to web-bugs when authenticated)? (Opportunity for building our own DB as a backup?) Does that create a liability?
@karlcow

Contributor

@karlcow karlcow commented Jan 8, 2020

As we are fleshing out ideas. Another one. We can't DELETE an issue programmatically, but we can PATCH it.

  1. Issue reported by anonymous user on webcompat.com
  2. system POSTs to /web-bugs/, but with a "moderated content" placeholder message, and saves the real issue in a DB for moderation
  3. user receives the real URL to share with others
  4. The issue is being moderated by a human.
    • If positive 💚:
      1. system PATCHes the issue with the real content
    • If negative 💔:
      1. system PATCHes the issue with content saying that the report was not in line with our criteria for acceptability, with links to these criteria.

No need to maintain a URL redirection DB in this case, just a system for moderation with a pool of actions which can be done. Potential ☢️ content never hits GitHub before we moderate it, but URLs are created. Fewer pipes.
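The POST-then-PATCH flow could look something like the sketch below. It only builds request descriptions against GitHub's REST endpoints for creating and updating an issue (`POST /repos/{owner}/{repo}/issues` and `PATCH /repos/{owner}/{repo}/issues/{issue_number}`) and sends nothing; the placeholder texts and function names are assumptions, and the title is held back as well since it could carry unwanted content too.

```python
API_ROOT = "https://api.github.com"
REPO = "webcompat/web-bugs"

# Hypothetical placeholder texts shown until a human has moderated the report.
PLACEHOLDER_TITLE = "In the moderation process"
PLACEHOLDER_BODY = "This issue has been put in the moderation queue."
REJECTED_TITLE = "Issue rejected"
REJECTED_BODY = "The content was not in line with our acceptability criteria."


def placeholder_request() -> dict:
    """Request creating the public issue without any reporter content."""
    return {
        "method": "POST",
        "url": f"{API_ROOT}/repos/{REPO}/issues",
        "json": {"title": PLACEHOLDER_TITLE, "body": PLACEHOLDER_BODY},
    }


def moderation_request(issue_number: int, accepted: bool, real: dict) -> dict:
    """Request replacing the placeholder once moderation has happened."""
    if accepted:
        content = {"title": real["title"], "body": real["body"]}
    else:
        content = {"title": REJECTED_TITLE, "body": REJECTED_BODY}
    return {
        "method": "PATCH",
        "url": f"{API_ROOT}/repos/{REPO}/issues/{issue_number}",
        "json": content,
    }
```

The issue number returned by the initial POST is the only thing that has to be stored next to the real content, which is what removes the need for a redirect table.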

@karlcow

Contributor

@karlcow karlcow commented Jan 8, 2020

This is a very naive diagram of my previous comment

BLUE = Before moderation
MAGENTA = After moderation

[Diagram: anonymous-reporting]

@miketaylr

Member Author

@miketaylr miketaylr commented Jan 9, 2020

I think we have a plan. I'm going to file bugs and let's start executing on the plan next week:

https://webcompat-meet.herokuapp.com/S58INgfrQzO3oEfiWEmiSQ#

@miketaylr

Member Author

@miketaylr miketaylr commented Jan 9, 2020

(going to close, but if there are fundamental objections to the plan, please re-open)

@miketaylr miketaylr closed this Jan 9, 2020
Webcompat Belt On automation moved this from To do to Done Jan 9, 2020
2 participants