[DISCUSS] Black/White Listing for Extensions #7933

Closed
echarles opened this issue Feb 26, 2020 · 30 comments
Labels
status:resolved-locked Closed issues are locked after 30 days inactivity. Please open a new issue for related discussion.

@echarles
Member

echarles commented Feb 26, 2020

The intent of this issue is to collect requirements on the following initial proposal for black/white listing of extensions.

Problem

We want to bring more safety to JupyterLab users when they install and use extensions.

Goals

  • Users should have ways to easily identify and block extensions considered as malicious.
  • Users should be made aware that the extensions they are installing can run arbitrary, possibly unsafe and harmful code.
  • Users should be protected from extensions that are known to be malicious.

Solution

Both the Whitelist and Blacklist approaches can be implemented to offer more safety. These paradigms can be seen as complementary. The default JupyterLab setup defines which paradigm is active, which could be Blacklist. Users can switch paradigms via their settings.

Blacklist

Extensions can be freely downloaded without going through a vetting process. However, users can add malicious extensions to a blacklist. Extensions on the blacklist will be hidden from public view and possibly disabled automatically by the extension manager.

The extension manager should show all extensions except for those that have been explicitly added to the blacklist. The blacklist method is therefore more permissive than the whitelist method.

An extension that is in neither the Blacklist nor the Whitelist is considered to be in the Greylist.

Whitelist

Maintain a whitelist of approved extensions that users can freely search and download. Extensions need to go through some sort of vetting process before they are added to the whitelist.

When using a whitelist, the extension manager should only show extensions that have been explicitly added to the whitelist.

Considerations

  • Who is primarily responsible for maintaining these lists? The Jupyter organization or the community? Possibly a joint effort?
  • What are the criteria used to white/black list an extension?
  • What should the terms of service be?
  • How are the ~500 existing extensions affected?
  • Should we allow or prohibit the installation of an extension in the Blacklist?

User Interface

We will add UI controls to the extension manager, along with a warning modal.

The user should be able to switch between using the blacklist and whitelist using a toggle switch underneath the search bar. The switch should be off by default. If the user switches it, the use_whitelist setting should be changed so that the user’s preference is persisted across sessions. The switch should also have a tooltip which explains to the user what the switch does.

To alert the user that installing extensions (being in the Blacklist) could be potentially dangerous, the extension manager should display a modal with text describing the dangers every time it is opened. The user must explicitly change a setting to not show the modal every time the extension manager is opened, as indicated by the show_warning_modal setting below.

The user must explicitly check the Don’t show again checkbox for the modal to be hidden the next time the extension manager is opened. Checking the checkbox will change the show_warning_modal setting listed in the Settings Specs below.

  • Should the modal list installed extensions that are in the Blacklist?
  • Should the modal list installed extensions that are in the Greylist?

Listing Specs

We have to come up with a format for the listings file: lines? regex? JSON? list of names? versioning format?

Settings Specs

{
    // URL for Whitelist.
    "whitelist": "https://github.com/jupyterlab/listings/whitelist.json",
    // URL for Blacklist.
    "blacklist": "https://github.com/jupyterlab/listings/blacklist.json",
    // Whether or not to use the whitelist. If this setting is false,
    // then the blacklist should be used instead.
    "use_whitelist": false,
    // Whether or not to show the warning modal.
    "show_warning_modal": true
}
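A minimal sketch of how the use_whitelist setting could drive what the extension manager shows, assuming the two listings have already been fetched and reduced to sets of package names (the actual listing format is still open, see Listing Specs). The function name `visible_extensions` and the sample package names are hypothetical, not part of the proposal.

```python
from typing import Iterable, List, Set


def visible_extensions(
    candidates: Iterable[str],
    whitelist: Set[str],
    blacklist: Set[str],
    use_whitelist: bool,
) -> List[str]:
    """Return the extension names the manager would show (hypothetical helper)."""
    if use_whitelist:
        # Whitelist mode: only explicitly approved extensions are shown.
        return [name for name in candidates if name in whitelist]
    # Blacklist mode: everything is shown except explicitly blocked names.
    return [name for name in candidates if name not in blacklist]


# Hypothetical sample data, for illustration only.
found = ["ext-a", "ext-b", "ext-c"]
print(visible_extensions(found, {"ext-a"}, {"ext-c"}, use_whitelist=False))
print(visible_extensions(found, {"ext-a"}, {"ext-c"}, use_whitelist=True))
```

With the same search results, blacklist mode is the more permissive of the two: it hides only what is explicitly blocked, while whitelist mode hides everything that is not explicitly approved.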

Stakeholder

  • All JupyterLab users and developers.
  • Meet with Binder about requirements.

Tasks

Initial tasks for this are defined on [BOARD] Improve extension manager (the board also includes activities for the splash screen feature which is not part of this discussion).

  • Research viability of storing tags in npm vs in another database (with eye to how this would work on pypi/conda).
  • Design format of blacklist/whitelist files.
  • Research what warning text other extension systems show in their extension UI, and write our own warning to add.
  • Create a repo for the extension listings.
  • Create Sketch UI.
  • Implement listings hosted on JupyterLab repo and document them.
  • Add to JupyterLab the config settings (switch to another URL, switch between blacklist and whitelist, show modal).
  • Implement and test changes to JupyterLab with default Blacklist and configurable options.
@saulshanabrook saulshanabrook moved this from TODO to In Progress in Improve extension manager Feb 26, 2020
@meeseeksmachine
Contributor

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/security-of-jupyterlab-extension-manager/3444/1

@blink1073
Member

I think the whitelist and the blacklist should allow for regex patterns. This would allow one to whitelist all @jupyterlab/* packages for example.
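As a rough illustration of pattern-based listing, the sketch below matches package names against regular expressions. Note that `@jupyterlab/*` reads like a glob; written as a regex it would be `@jupyterlab/.*`. The helper name `is_listed` and the sample names are assumptions made for this example, and full-string matching is one possible design choice, not something decided in this thread.

```python
import re
from typing import Iterable


def is_listed(name: str, patterns: Iterable[str]) -> bool:
    """Return True if the package name fully matches any listing pattern."""
    # fullmatch avoids accidental partial hits: a pattern "foo" should not
    # match a package called "foo-evil".
    return any(re.fullmatch(pattern, name) for pattern in patterns)


# Hypothetical whitelist covering every package in the @jupyterlab scope.
whitelist_patterns = [r"@jupyterlab/.*"]

print(is_listed("@jupyterlab/git", whitelist_patterns))
print(is_listed("@jupyterlab-fake/git", whitelist_patterns))
print(is_listed("some-other-extension", whitelist_patterns))
```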

@saulshanabrook
Member

Thanks for opening this @echarles! A few thoughts off the top of my head:

  1. We should be clear that this only affects the extension manager UI, not the CLI, which would still be able to install whatever it likes.
  2. Not sure if we need UI for whitelist/blacklist enabling. I thought the idea was that this would be set up more like for a whole enterprise or deployment, so would be present maybe in the settings and not readily accessible for users to change.
  3. I believe we also talked about maybe having some persistent warning text in the extension UI itself about warnings, instead of a one time modal.

@echarles
Member Author

I think the whitelist and the blacklist should allow for regex patterns. This would allow one to whitelist all @jupyterlab/* packages for example.

Makes sense. We can support this whatever format (JSON...) we use.

@echarles
Member Author

echarles commented Feb 26, 2020

We should be clear that this only affects the extension manager UI, not the CLI, which would still be able to install whatever it likes.

True, only the UI is in scope. This implies that the safeguard could be bypassed by a power user who has access to the jlab env via the CLI. Hosted users on Hubs (jupyterhub, binderhub...) would not have the ability to bypass it.

Not sure if we need UI for whitelist/blacklist enabling. I thought the idea was that this would be set up more like for a whole enterprise or deployment, so would be present maybe in the settings and not readily accessible for users to change.

We can implement the setting without exposing the toggle in the UI, that is not an issue.

I believe we also talked about maybe having some persistent warning text in the extension UI itself about warnings, instead of a one time modal.

The content of the modal still needs to be narrowed down. Is it simply a notice reminding the user that they need to know what they are doing, or should it contain more logic, with a list of potentially harmful extensions? Also, if the user decides not to see the modal anymore, should it still pop up in case of alerts (I am thinking of a case where a non-blacklisted extension is installed and after some time appears in the blacklist)?

@saulshanabrook
Member

The content of the modal still needs to be narrowed down. Is it simply a notice reminding the user that they need to know what they are doing, or should it contain more logic, with a list of potentially harmful extensions? Also, if the user decides not to see the modal anymore, should it still pop up in case of alerts (I am thinking of a case where a non-blacklisted extension is installed and after some time appears in the blacklist)?

We might also want to link users to the public blacklist location (if they are using the default) and say something like "Did you find an unsafe extension? Add a PR to this list!"

@minrk
Contributor

minrk commented Feb 26, 2020

What does the blacklist accomplish? How are greylist packages discoverable? I'm not sure what's accomplished by maintaining the blacklist compared to its absence.

pip doesn't have a package blacklist, nor does npm (the installer). But both PyPI and npmjs.com (the host) have been known to accept takedown requests when malicious packages are found. Would the blacklist use case be handled well enough by notifying e.g. npmjs.com that it's hosting malicious code?

If there is an index that needs to be maintained, I would suggest admitting ~everything by default (with very minimal validation) and having a "report abuse" mechanism for prompting to take things out of the index. I guess this is the whitelist method.

@jasongrout
Contributor

pip doesn't have a package blacklist, nor does npm (the installer). But both PyPI and npmjs.com (the host) have been known to accept takedown requests when malicious packages are found. Would the blacklist use case be handled well enough by notifying e.g. npmjs.com that it's hosting malicious code?

Possibly. Maintaining our own blacklist that we control allows us to not depend on a third-party to honor our takedown requests, even if we are relying on them for hosting.

@jasongrout
Contributor

One nice thing about building in blacklist/whitelist support is that custom institutional deployments of JupyterLab can use these to customize what packages show up in the extension manager.

@saulshanabrook
Member

saulshanabrook commented Feb 26, 2020

pip doesn't have a package blacklist, nor does npm (the installer). But both PyPI and npmjs.com (the host) have been known to accept takedown requests when malicious packages are found. Would the blacklist use case be handled well enough by notifying e.g. npmjs.com that it's hosting malicious code?

Hm, yeah it makes me think maybe we shouldn't add all this complicated logic until it's actually been shown we need it.

One nice thing about building in blacklist/whitelist support is that custom institutional deployments of JupyterLab can use these to customize what packages show up in the extension manager.

The institutional case is separate I think and would be nice to support, but doesn't seem like a dealbreaker for getting the extension manager UI enabled by default? Like maybe we wait until an institution is actually rolling this out and build the requirements around some more actual user stories instead of trying to imagine what they might be?

@jasongrout
Contributor

The institutional case is separate I think and would be nice to support, but doesn't seem like a dealbreaker for getting the extension manager UI enabled by default? Like maybe we wait until an institution is actually rolling this out and build the requirements around some more actual user stories instead of trying to imagine what they might be?

True, good point. At least we can design the architecture at this point so that adding blacklist/whitelist support in the future is easier.

@jasongrout
Contributor

a dealbreaker for getting the extension manager UI enabled by default?

I think for me personally, the biggest blocker is that there is some sort of warning to the user like what we show now in a modal when it is enabled. Perhaps that warning can be in the extension manager sidebar itself, though, and perhaps it can be modal inside that sidebar (i.e., the extension manager does not show any content until the user acknowledges the warning), or perhaps it could just be a warning sticky to the top of the extension manager until the user dismisses it.

@echarles
Member Author

What does the blacklist accomplish? How are greylist packages discoverable? I'm not sure what's accomplished by maintaining the blacklist compared to its absence.

The blacklist is just a list of extensions considered malicious. The list can be updated with a PR or by a user reporting a suspicious extension. The extension manager then has an easy job: consult the blacklist and inform the user that they should not install a blacklisted extension. Without that blacklist, you cannot warn the user about potential risks.

I introduced the concept of a greylist, which is actually for extensions not listed at all (neither in the whitelist nor in the blacklist). This is an abuse of language on my side and it may introduce more confusion than help. I propose to drop that wording.

pip doesn't have a package blacklist, nor does npm (the installer). But both PyPI and npmjs.com (the host) have been known to accept takedown requests when malicious packages are found. Would the blacklist use case be handled well enough by notifying e.g. npmjs.com that it's hosting malicious code?

As Jason said, this could be an option, but it is less efficient and less safe than having full control with our own lists. Having our own mechanism also gives corporations the opportunity to host and control their own list by changing the list URL in the user settings.

If there is an index that needs to be maintained, I would suggest admitting ~everything by default (with very minimal validation) and having a "report abuse" mechanism for prompting to take things out of the index. I guess this is the whitelist method.

In the whitelist paradigm, things are a bit different. Extension writers would have to go through a vetting process in which some kind of committee would review the code, run validation tests against the JupyterLab versions, ensure there is no performance degradation, check that usability is good... This is not the model we target in the first instance, but we want to build the foundation that supports it. It would demand investment from the organization maintaining the whitelist, which I guess Jupyter does not have today. To abuse language once more: you could see the whitelist as the Apple marketplace, where you apply and, if your app is validated, you are listed in the market and can be consumed by end users.

@echarles
Member Author

I think for me personally, the biggest blocker is that there is some sort of warning to the user like what we show now in a modal when it is enabled. Perhaps that warning can be in the extension manager sidebar itself, though, and perhaps it can be modal inside that sidebar (i.e., the extension manager does not show any content until the user acknowledges the warning), or perhaps it could just be a warning sticky to the top of the extension manager until the user dismisses it.

I have mixed feelings about the modal vs the content and messages in the sidebar itself. Has JupyterLab already run user tests with mock screens to get feedback and choose the best UI?

@echarles
Member Author

At yesterday's meeting we discussed the number of HTTP requests and the load time, and said this should be a point of attention. With this proposal, we will create additional HTTP requests to get the listings. We have not discussed this yet, but we could enforce a check of the blacklist at startup and, if an installed extension is in the blacklist, inform the user that they are running a risk. This is useful for extensions installed via the CLI, or for extensions that were not in the blacklist at installation time but were added to it afterwards.

Maybe this is overkill? Maybe we do not want that? But if we do, we need to make sure we do not slow down the startup. This could be achieved with correct async loading.
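To make the async loading idea concrete, here is a small sketch in which the blacklist check runs as a background task so that startup does not wait on it. Everything here is a stand-in: `fetch_blacklist` simulates the HTTP request with a short sleep, and the function names, event labels and package names are hypothetical rather than JupyterLab APIs.

```python
import asyncio
from typing import List, Set, Tuple


async def fetch_blacklist() -> Set[str]:
    """Stand-in for the HTTP request that would download the blacklist."""
    await asyncio.sleep(0.05)  # simulated network latency
    return {"bad-extension"}


async def check_installed(installed: List[str], events: List[Tuple]) -> None:
    """Record a warning event for installed extensions found in the blacklist."""
    blacklist = await fetch_blacklist()
    flagged = sorted(set(installed) & blacklist)
    events.append(("warn", flagged))


async def startup(installed: List[str]) -> List[Tuple]:
    events: List[Tuple] = []
    # Schedule the check without awaiting it, so startup is not blocked.
    task = asyncio.create_task(check_installed(installed, events))
    events.append(("ready", None))  # the application is usable at this point
    await task  # awaited here only so the sketch can return the full event log
    return events


print(asyncio.run(startup(["bad-extension", "good-extension"])))
```

The "ready" event is recorded before the warning because the check only starts running once the startup coroutine yields control.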

@jasongrout
Contributor

Has JupyterLab already run user tests with mock screens to get feedback and choose the best UI?

Not that I am aware of.

@tgeorgeux
Contributor

tgeorgeux commented Feb 28, 2020

I have mixed feelings about the modal vs the content and messages in the sidebar itself. Has JupyterLab already run user tests with mock screens to get feedback and choose the best UI?

There has been no user testing around the extension manager.

Summarizing what I got from reading through this issue so far:

  • We're looking at creating a whitelist and a blacklist functionality for extensions.
    • We're not creating a greylist; this terminology represents any extension not found on either list above.
  • Institutions can use these to manage what extensions their users can or cannot install based on their own preferences. This will require custom solutions, but we're providing infrastructure to allow this.
  • We're not sure yet how we're going to manage those for non-institutional users. We may:
    1. Allow users to toggle whitelist and blacklist independently of one another.
    2. Only expose whitelist extensions ever.
    3. Allow users to toggle between whitelist only, or show all available non-blacklist extensions.
    4. Allow users to toggle between whitelist only, or show all available extensions, and not offer any blacklist functionality at this point.
    5. Show all extensions and offer a generic warning about running arbitrary code.

The simplest option forward for the UI is #5. We could move forward with a mockup of a persistent warning and a mockup of a modal warning, we can test both and see if one performs better.

I think the blocker on the whitelist/blacklist is the work to curate that list. If we want to go down that route, we should make the repo and see if there's community involvement surrounding it. If we are maintaining a whitelist, we can leverage it; in the meantime, institutional users can maintain their own.

@vidartf
Member

vidartf commented Feb 28, 2020

Two points worth noting:

  • For large institutions, there are often internal npm/pypi repositories.
  • No matter what solution we end up using, we should take care not to do an external fetch before the user actively engages with the extension manager, to avoid having all clients "phone home" by default.
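One way to read the second point is that the listing should be fetched lazily, on first use, and then cached. The sketch below assumes a hypothetical `LazyListing` wrapper with an injected fetch function; the URI and the fake fetcher are placeholders, and no real network call is made.

```python
from typing import Callable, List, Optional, Set


class LazyListing:
    """Fetch a listing only when it is first needed, then cache it."""

    def __init__(self, uri: str, fetch: Callable[[str], Set[str]]) -> None:
        self._uri = uri
        self._fetch = fetch
        self._names: Optional[Set[str]] = None  # nothing fetched at construction

    def names(self) -> Set[str]:
        # Intended to be called when the user opens the extension manager.
        if self._names is None:
            self._names = self._fetch(self._uri)
        return self._names


requested: List[str] = []


def fake_fetch(uri: str) -> Set[str]:
    """Placeholder for an HTTP client; records which URIs were requested."""
    requested.append(uri)
    return {"bad-extension"}


listing = LazyListing("https://example.invalid/blacklist.json", fake_fetch)
print(len(requested))   # no request has been made at application startup
print(listing.names())  # first real use triggers the single fetch
print(len(requested))
```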

@echarles
Member Author

For large institutions, there are often internal npm/pypi repositories.

True and they can control what they put in there. However the user is often still able/allowed to install packages/extensions from the internet.

No matter what solution we end up using, we should take care not to do an external fetch before the user actively engages with the extension manager, to avoid having all clients "phone home" by default.

+1

@echarles
Member Author

echarles commented Mar 3, 2020

@mlucool
Contributor

mlucool commented Mar 3, 2020

For large institutions, there are often internal npm/pypi repositories.

I'll also note there is often a strong desire to take the community standard and then apply a policy on top (whitelist internal products, blacklist ones that don't have the right licences).

@echarles
Member Author

echarles commented Mar 7, 2020

During the last weekly meeting, @vidartf raised a point regarding the need to bring more security and control to the URIs serving the lists. The current proposal foresees that the user can change those URIs via their settings, which is not ideal from a security point of view, as it would be very easy for the user to bypass the blacklist and install malicious extensions.

I have looked at ways to define ReadOnly settings, but this does not seem to be supported ATM by JupyterLab (correct me if I am wrong).

Therefore, I propose to move, as discussed, to URIs defined at the server level. The administrator could rely on the default ones or override them via server traits.

@echarles
Member Author

echarles commented Mar 7, 2020

FYI Implementation of the discussed features is happening in:

@mlucool
Contributor

mlucool commented Mar 7, 2020

In general, my experience with these is that these lists often need some color to be maintained easily (i.e. comment a line with the issue number). I'd recommend not using JSON but something else, such as JSON5 or YAML or ...

@tgeorgeux
Contributor

After various discussions over a video with @echarles and other members of the JupyterLab team, I think it makes sense to treat the white and blacklist modes as follows:

In blacklist mode:

  • Do not show blacklisted extensions at all.
  • When an extension is installed and later becomes blacklisted, it is highlighted in red, and a warning is given to the user suggesting they disable and uninstall the extension immediately.
  • Do not notify users that blacklisted extensions exist.
    • There's also an option to tell the user that blacklisted extensions exist, but not show them. I am personally against this but open to debate. This makes sense to me in the case where a single user is referencing a blacklist made by another user; this presents problems when an institution has a blacklist and users are presented with options they are not allowed to use.

In whitelist mode:

  • Show only whitelisted extensions.
  • In the event an extension is taken off the whitelist, warn users by highlighting it in red with a tooltip that asks them to disable and uninstall it.
    • Use this Icon for warning tooltip:
      baseline_help_outline_black_18dp

Copy for warning messages:

For blacklist:
This extension has been blacklisted since install. Please uninstall immediately and contact your blacklist administrator.

For whitelist:
This extension has been removed from the whitelist since installation. Please uninstall immediately and contact your whitelist administrator.
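The rules above can be sketched as a single decision function. This is only an illustration under stated assumptions: the function name `extension_status` and the "show"/"hide"/"warn" labels are hypothetical, and in whitelist mode the sketch treats any installed extension that is not on the whitelist as if it had been removed from it.

```python
from typing import Set


def extension_status(name: str, installed: Set[str], mode: str, listing: Set[str]) -> str:
    """Return 'show', 'hide' or 'warn' for one extension.

    mode is 'blacklist' or 'whitelist'; listing is the active list of names.
    """
    listed = name in listing
    # Whitelist mode allows only listed names; blacklist mode allows the rest.
    allowed = listed if mode == "whitelist" else not listed
    if allowed:
        return "show"
    # Not allowed: warn if it is already installed, otherwise hide it silently.
    return "warn" if name in installed else "hide"


# Hypothetical data, for illustration only.
installed = {"ext-a", "ext-b"}
for ext in ["ext-a", "ext-b", "ext-c", "ext-d"]:
    print(ext, extension_status(ext, installed, "blacklist", {"ext-b", "ext-c"}))
```

An extension that is not allowed and not installed is simply hidden, which matches the decision not to notify users that blacklisted extensions exist.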

@echarles
Member Author

@tgeorgeux Thanks a lot for writing up the discussions here. I have already updated the current PRs and they implement the logic you have described. I will just update the icon and the warning message. I will also remove the text that tells the user that blacklisted extensions exist.

@saulshanabrook
Member

This sounds great, yeah thanks @tgeorgeux for writing this up and @echarles for implementing.

Do not notify users that blacklisted extensions exist.

👍 I am on board with this simpler option, not to notify users of a blacklisted extension if they search for it. We can always add more nuanced logic later if the need comes up.

@echarles
Member Author

Just opened #8050 to enable the extension manager by default once the 2 PRs for listings are merged.

@echarles
Member Author

echarles commented Mar 23, 2020

Snippet to install and try the listing branches.

conda create -y -n jlab-listings \
    -c conda-forge \
    python=3.7 nodejs=12.14.1 yarn=1.22.0 && \
  conda activate jlab-listings && \
  git clone https://github.com/datalayer-contrib/jupyterlab-server --branch bw-list --depth 1 && \
  pip install -e ./jupyterlab-server && \
  git clone https://github.com/datalayer-contrib/jupyterlab --branch bw-list --depth 1 && \
  pip install -e ./jupyterlab && \
  cd jupyterlab && \
  yarn build && \
  cd packages/extensionmanager-extension/examples/listings && \
  make dev # edit Makefile to define the listings URIs.

@saulshanabrook
Member

Closed in #7989 and jupyterlab/jupyterlab_server#82

@saulshanabrook saulshanabrook added this to the 2.1 milestone Mar 29, 2020
@lock lock bot added the status:resolved-locked Closed issues are locked after 30 days inactivity. Please open a new issue for related discussion. label May 5, 2020
@lock lock bot locked as resolved and limited conversation to collaborators May 5, 2020
@tgeorgeux tgeorgeux moved this from In Progress to Done in Improve extension manager May 27, 2020

9 participants