Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Databased Backed Blacklists #2396

Merged
merged 16 commits into from Sep 17, 2017
Merged

Databased Backed Blacklists #2396

merged 16 commits into from Sep 17, 2017

Conversation

@dstufft
Copy link
Member

dstufft commented Sep 16, 2017

This removes the blacklist that was hardcoded, and instead replaces it with one that is stored in the database. It also provides an admin UI to view and add new blacklists.

This is a somewhat of a dangerous workflow, because blacklisting a name also involves removing all files, releases, etc that exist for that (essentially the same thing as hitting remove package on legacy PyPI). Due to the danger, adding a blacklist requires confirmation.

This still requires a few things:

  • Add the ability to remove existing blacklists to allow users to claim the name again.
  • Make the confirmation page harder to blindly click though.
    • This might be requiring typing the name of the project that would be deleted into an input box?
    • This might require typing in a random string?
  • Write tests

Throwing this up here so people can take a look at it before I finish it.

@dstufft dstufft requested a review from ewdurbin Sep 16, 2017
@dstufft dstufft changed the title [WIP] Databased Backed Blacklists Databased Backed Blacklists Sep 16, 2017
dstufft added 3 commits Sep 16, 2017
<p>
Blacklisting {{ blacklist.project }} will irreversibly delete
the {{ existing.project.name }} project along with
{{ existing.releases|length() }} releases and

This comment has been minimized.

@ewdurbin

ewdurbin Sep 16, 2017 Member

worth linking to the releases view?

{% if existing.project %}
<p>
Blacklisting {{ blacklist.project }} will irreversibly delete
the {{ existing.project.name }} project along with

This comment has been minimized.

@ewdurbin

ewdurbin Sep 16, 2017 Member

worth linking to the project detail view?

below:

<ul>
{% for user in existing.users %}

This comment has been minimized.

@ewdurbin

ewdurbin Sep 16, 2017 Member

i don't think it'd be trivial but exposing the role_name here would be nice.

)
return HTTPMovedPermanently(request.current_route_path())
elif canonicalize_name(confirm) != canonicalize_name(project_name):
request.session.flash(

This comment has been minimized.

@ewdurbin

ewdurbin Sep 16, 2017 Member

the flash produced is pretty much identical in color to the header in admin, not a huge deal but threw me off when reviewing.
screen shot 2017-09-16 at 7 34 29 pm

request.db.add(
JournalEntry(
name=project.name,
action="remove",

This comment has been minimized.

@ewdurbin

ewdurbin Sep 16, 2017 Member

should the action here be blacklisted?

This comment has been minimized.

@dstufft

dstufft Sep 17, 2017 Author Member

I'm going to leave this alone, because other things might be reading our journal entries and inferring things from them, and we want them to still go ahead and remove these projects if they are. It's not really super important to keep in the log that a project was blacklisted either, since it's a binary state and admins can see when it was and who did it (as well as any relevant comments).

queue="success",
)

return HTTPMovedPermanently(request.route_path("admin.blacklist.list"))

This comment has been minimized.

@ewdurbin

ewdurbin Sep 16, 2017 Member

maybe just a redirect?

Copy link
Member

ewdurbin left a comment

No blockers. I would like for us to work on a method of bulk blacklisting at some point.

In recent reports, there have been dozens of package names affected that would be somewhat cumbersome to blacklist one-by-one

@dstufft
Copy link
Member Author

dstufft commented Sep 17, 2017

Feedback handled. I'm a little nervous about the idea of a bulk blacklisting, since blacklisting means deleting packages and is generally irreversible, perhaps once we get some more experience with the single entrypoint we can explore that?

dstufft added 2 commits Sep 17, 2017
fix
@dstufft dstufft merged commit 52b2471 into pypa:master Sep 17, 2017
2 checks passed
2 checks passed
codecov/project 100% (target 100%)
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@dstufft dstufft deleted the dstufft:db-backed-blacklist branch Sep 17, 2017
@ncoghlan
Copy link
Member

ncoghlan commented Sep 17, 2017

As far as a bulk feature goes, that may be more appropriate to tackle as a "bulk quarantine" (i.e. shuffle all the affected artifacts off to a different S3 bucket, without deleting them outright) rather than as bulk automation of the existing removal behaviour (quarantine also has the benefit of preserving the artifacts for security analysis, similar to the way anti-virus quarantining works).

@dstufft
Copy link
Member Author

dstufft commented Sep 17, 2017

We don't actually delete the files from S3 even with bulk delete. The cost of storing the files is minuscule it's actually serving the files that actually costs the bulk of our S3 "bill". This also makes it easier for us to deal with transactionally deleting these projects since it's all contained inside of the database.

So deleting here is really just deleting the data from the database (although maybe at some point we'll have some process to garbage collect data from S3 that is no longer referenced in the DB).

@ncoghlan
Copy link
Member

ncoghlan commented Sep 17, 2017

OK, so even today the database entries for "deleted" releases could potentially be backfilled from the artifacts retained in S3. I think that would already mitigate a lot of the risk associated with removal errors, even if the scripts for state reconstruction don't exist yet.

func.normalize_pep426_name(form.name.data))).scalar():
raise _exc_with_message(
HTTPBadRequest,
"The name {!r} is not allowed.".format(form.name.data),

This comment has been minimized.

@benjaoming

benjaoming Sep 17, 2017

Hi @dstufft - glad you started something!

Would suggest the message to be a little more clear?

For concerns of security (namesquatting and typosquatting), '{!r}' is not an allowed name for a project. Please use a different name.

This comment has been minimized.

@dstufft

dstufft Sep 17, 2017 Author Member

I don't want to get into specifics like that because there are a lot of reasons a name might be prohibited, not just for security purposes. If we want to provide details about why a name is blocked, we'll want to either expose the comment field or (more likely) provide categories to the blacklist.

This comment has been minimized.

@ncoghlan

ncoghlan Sep 18, 2017 Member

I added that question to #2401 (comment) (as I think it's related to the general UX of how we communicate invalid name declarations to end users)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.