This repository has been archived by the owner on Aug 30, 2022. It is now read-only.

[DEPRECATED] Deployed Dashboard Threat Analysis

Peter Parente edited this page Aug 1, 2018 · 1 revision

Threat Analysis Update (2016-05-24)

This section contains an analysis of the threats against dashboards deployed using the latest versions of the jupyter_dashboards, jupyter_dashboards_bundlers, jupyter-dashboards-server, and kernel_gateway components. These components were designed and implemented with the security threats from the original analysis in mind. The goal of this section is to identify remaining threats against deployed dashboards.

System Overview

Application Versions

  • jupyter_dashboards 0.5.x
  • jupyter_dashboards_bundlers 0.7.x
  • jupyter-dashboards-server 0.6.x
  • jupyter_kernel_gateway 0.5.x

Application Description

A deployed Jupyter dashboard is the result of converting a Jupyter Notebook into a form that can be accessed as a standalone interactive web application. Every deployed dashboard consists of at least the following components:

  • Dashboard server frontend
  • Dashboard server backend
  • Kernel gateway
  • Kernel

A deployed dashboard may touch arbitrary APIs, data, libraries, etc. in addition to the above. This fact is a consequence of the flexibility of the notebooks from which they originate, and probably the biggest challenge for securing deployed dashboards.

Note: We drop the concept of a kernel cluster from this update. Rather than calling out a separate cluster component, we analyze the threats against one or more instances of a kernel gateway.

Additional Info

None.

Major Components

  • Dashboard server frontend - The HTML/CSS/JavaScript that runs in the dashboard user's web browser
  • Dashboard server backend - The web server that returns the web app frontend and is responsible for controlling communication between it and a kernel.
  • Kernel gateway - The web service that provides an API for requesting a kernel and communicating with it.
  • Kernel - The compute engine that executes dashboard backend code and publishes responses to listeners.

Dependent Components

  • Dashboard server frontend - jupyter-dashboards-server frontend node modules (jQuery, jupyter-js-services, jupyter-js-widgets, jupyter-js-notebook, ...)
  • Dashboard server backend - jupyter-dashboards-server backend node modules (express, http-proxy, websocket, ...)
  • Kernel gateway - jupyter/kernel_gateway
  • Kernel - Python, R, Scala, Julia, or any other Jupyter language kernel

System Assumptions / External Dependencies

  • The dashboard server frontend runs in a browser environment that properly secures cookies, SSL connections, etc.
  • The dashboard server backend and kernel gateway run on one or more cloud hosts that properly isolate tenants, secure network connections, etc.
  • The kernel executes in an environment having all of the necessary libraries for a particular dashboard.

Security Objectives

  • Only authorized users should be allowed to load and interact with the dashboard UX.
  • Authorized users should be able to use any interactive widgets the dashboard exposes.
  • Only code from the original notebook should be allowed to execute on the kernel. The dashboard user cannot send arbitrary additional code to the kernel.
  • Only authorized dashboard applications should be allowed to request and communicate with kernels on a given cluster.
  • Only users who have access to the original notebook and the source of the deployed dashboard backend can see the notebook / dashboard code.

Data Flow

  1. An anonymous user visits the URL of a deployed dashboard on a dashboard server configured with an authentication provider.
  2. The dashboard server frontend shows a login form to the user.
  3. The user enters their credentials.
  4. The dashboard server backend validates the user credentials with the auth provider.
  5. The dashboard server backend responds with the dashboard frontend HTML containing HTML for layout, configuration, and pointers to CSS/JS. The response also includes an encrypted client session cookie.
  6. The user's browser loads the dashboard frontend CSS/JS.
  7. The JS sends an HTTP POST to the dashboard server backend API.
  8. The dashboard server reads the notebook file associated with the dashboard from its storage provider.
  9. The dashboard server sends an HTTP POST to the kernel gateway including necessary metadata from the notebook (e.g., kernel language) as well as the necessary kernel gateway API token.
  10. The kernel gateway validates the API token.
  11. The kernel gateway launches a kernel and responds with metadata about it.
  12. The dashboard server backend associates the kernel ID with the client and notebook source.
  13. The dashboard server forwards the kernel UUID to its frontend.
  14. The frontend establishes a Websocket connection with the dashboard server backend resource for the kernel UUID.
  15. The dashboard server filters and proxies Websocket messages to and from the kernel gateway for the kernel UUID.
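
Steps 9 and 10 can be sketched as follows. This is a minimal illustration, not the dashboard server's actual implementation (which is a Node.js application). It assumes the kernel gateway creates kernels at `/api/kernels` and accepts the token in an `Authorization: token <value>` header. The function name, gateway URL, and kernel name are hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request


def build_kernel_request(gateway_url, api_token, kernel_name):
    """Build (but do not send) a POST asking the gateway for a new kernel."""
    body = json.dumps({"name": kernel_name}).encode("utf-8")
    return urllib.request.Request(
        url=gateway_url.rstrip("/") + "/api/kernels",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed header format; the token stays on the backend and is
            # never included in anything returned to the browser.
            "Authorization": "token " + api_token,
        },
    )
```

Because the backend builds this request, the frontend only ever learns the kernel UUID forwarded in step 13, never the gateway token.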

Entry Points

  • UI of the dashboard server
  • REST API of the dashboard server
  • REST API of the kernel gateway
  • Websocket connection between the dashboard server backend and the kernel gateway
  • ZeroMQ connections between the kernel gateway and kernels
  • SSH daemons and other administrator access points to the infrastructure hosting

Assets

  • The dashboard UX
  • Credentials for accessing the REST API of the dashboard server
  • Credentials for accessing the REST API of the kernel gateway
  • Access to a running kernel
  • Any algorithms expressed in or used by the notebook code
  • Any data accessed by the notebook code
  • Any secrets used by the notebook code

Threats

  • A user gains unauthorized access to the dashboard server frontend UI
  • A user gains unauthorized access to the dashboard server backend API
  • A user requests a kernel from a kernel gateway for arbitrary use
  • A user sends malicious code to the kernel to delete data, to steal data, to hijack other kernels, to use up cluster resources, etc.
  • A user discovers and connects to a running kernel to view and/or change messages going to/from it
  • A user requests kernels until cluster resources are exhausted

Desired Trust Levels

  • Anonymous user
  • Authenticated dashboard user
  • Notebook-dashboard author
  • Administrator of the dashboard server
  • Administrator of the kernel gateway

Addressed Vulnerabilities

  • Any user can load the dashboard frontend
    • Administrators may configure the dashboard server to require user authentication.
    • Anonymous users must log in using the configured authentication provider before a dashboard frontend will load.
  • Any user can request a kernel
    • Administrators may configure the kernel gateway to require an API token for all kernel management requests.
    • Administrators may restrict kernel gateway access to the dashboard server backend alone.
  • Any user can send commands to any kernel
    • The kernel gateway prevents discovery of running kernels by default.
    • Administrators may configure the kernel gateway to require an API token for all kernel management requests.
    • Administrators may restrict kernel gateway access to the dashboard server backend alone.
  • Any user can retrieve code and secrets from the dashboard HTML
    • The dashboard server stores all notebook source code in the backend away from users who only have access to the frontend.
    • The dashboard server filters code from execute_input kernel responses to prevent it from reaching the frontend.
    • Administrators may configure the dashboard server to require an API token for all notebook-dashboard management requests.
  • Any user can send arbitrary code to a kernel
    • The dashboard server prevents arbitrary execute_request messages from the dashboard frontend from reaching a kernel.
    • See above about restricting direct communication with a kernel gateway / kernel.
  • Any user can query a kernel gateway for running kernel IDs
    • The kernel gateway prevents discovery of running kernels by default.
  • Any user can request an unlimited number of kernels
    • Administrators may configure the maximum number of kernels that a kernel gateway can launch.
    • The dashboard server backend cleans up kernels that no longer have frontend clients associated with them.
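
The two message filtering mitigations listed above (blocking execute_request from the frontend and removing code from execute_input responses) can be sketched as a pair of functions over Jupyter protocol messages. This is a simplified illustration, not the dashboard server's code. The allow-list contents and the function names are assumptions, and messages are modeled as plain dictionaries with `header.msg_type` and `content` fields.

```python
import copy

# Assumption: the only message types the frontend may send to a kernel.
ALLOWED_FROM_FRONTEND = {"comm_open", "comm_msg", "comm_close"}


def filter_to_kernel(msg):
    """Return the message if its type is allow-listed, otherwise None."""
    msg_type = msg.get("header", {}).get("msg_type")
    return msg if msg_type in ALLOWED_FROM_FRONTEND else None


def filter_to_frontend(msg):
    """Blank the source code echoed in execute_input before forwarding."""
    if msg.get("header", {}).get("msg_type") != "execute_input":
        return msg
    cleaned = copy.deepcopy(msg)  # leave the original message untouched
    cleaned["content"]["code"] = ""
    return cleaned
```

An allow-list is used rather than a block-list so that message types added to the protocol later are rejected until someone explicitly permits them.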

New and Remaining Vulnerabilities

  • Unauthorized users may guess weak dashboard credentials, API keys, administrator credentials, etc.
  • Notebook-dashboard authors may write code that grants unintentional permissions to dashboard users (e.g., write access to data stores).
  • Notebook-dashboard authors may use widget libraries that allow arbitrary code execution, intentionally or unintentionally.
  • Notebook-dashboard authors may write malicious code to execute on kernels.
  • Administrator access to dashboards and kernels is virtually unlimited.
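
Of the items above, guessable API keys are the one an administrator can reduce directly when configuring a deployment. The following is a minimal sketch, assuming tokens are generated once at deployment time and compared on the server. The function names are hypothetical and are not part of any of the components analyzed here.

```python
import hmac
import secrets


def new_api_token(num_bytes=32):
    """Return a URL-safe random token drawn from the OS CSPRNG."""
    return secrets.token_urlsafe(num_bytes)


def token_matches(expected, presented):
    """Compare tokens in constant time to avoid leaking a matching prefix."""
    return hmac.compare_digest(expected.encode("utf-8"), presented.encode("utf-8"))
```

A 32-byte random token is impractical to guess, which leaves the remaining items (author code, widget libraries, administrator access) as the harder problems.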

Planned Countermeasures

  • Ensure that kernel processes cannot pull secrets from the kernel gateway environment.
  • Continue to restrict the set of Jupyter protocol messages allowed between the dashboard frontend and kernel backend.
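
The first countermeasure can be illustrated by removing secret-bearing variables from the environment before a child process is started. This is a sketch under stated assumptions, not the kernel gateway's implementation. It assumes gateway secrets live in environment variables with a `KG_` prefix, and the function names are hypothetical.

```python
import os
import subprocess
import sys

# Assumption: prefixes of environment variables that carry gateway secrets.
SECRET_PREFIXES = ("KG_",)


def scrubbed_env(env, secret_prefixes=SECRET_PREFIXES):
    """Copy env, dropping any variable whose name starts with a secret prefix."""
    return {k: v for k, v in env.items() if not k.startswith(secret_prefixes)}


def run_child(args, env):
    """Run a child process that sees only the scrubbed environment."""
    return subprocess.run(
        args, env=scrubbed_env(env), capture_output=True, text=True, check=True
    )
```

Scrubbing the inherited environment is only a partial measure. On Linux, a kernel running as the same operating system user as the gateway may still be able to read the gateway's environment through `/proc`, so running kernels under a separate user or in a container is a likely complement.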

Baseline Threat Analysis (2015-12-08)

This section contains an analysis of the threats against dashboards deployed using the 0.1.0.dev version of the jupyter_dashboards extension from https://github.com/jupyter-incubator/dashboards. This early version of the project is built atop components designed to allow anonymous, public access to interactive dashboards. As such, numerous threats and vulnerabilities are to be expected. The goal is to document these in order to properly design and implement the first security mechanisms for private dashboards.

System Overview

Application Version

0.1.x.x

Application Description

A deployed Jupyter dashboard is the result of converting a Jupyter Notebook into a form that can be accessed as a standalone interactive web application. Every deployed dashboard consists of at least the following components:

  • web app frontend
  • web app backend
  • kernel cluster
  • kernel gateway
  • kernel

In addition to these components, a deployed dashboard may touch arbitrary APIs, data, libraries, etc. This is a consequence of the flexibility of the notebooks from which they originate, and probably the biggest challenge for securing deployed dashboards.

Additional Info

None.

Implementation Overview

Major Components

  • Web app frontend - The HTML/CSS/JavaScript that runs in the dashboard user's web browser and contains the code from the original notebook in <pre> elements.
  • Web app backend - The web server that returns the dashboard HTML containing the notebook code along with configuration information about the kernel cluster and/or kernel gateway.
  • Kernel cluster - The web service that provides an API for requesting a kernel gateway.
  • Kernel gateway - The web service that provides an API for requesting a kernel and communicating with it.
  • Kernel - The compute engine that executes dashboard backend code and publishes responses to listeners.

Dependent Components

  • Web app frontend - Thebe, Gridstack, jQuery, lodash
  • Web app backend - Web server capable of executing PHP and/or serving static web assets
  • Kernel cluster - tmpnb or another component implementing its API
  • Kernel gateway - jupyter/notebook or jupyter-incubator/kernel_gateway
  • Kernel - Python, Scala, Julia, or any other Jupyter language kernel

System Assumptions / External Dependencies

  • The web app backend runs on a web server capable of serving the content over SSL. The web server runs on a secure cloud host.
  • The kernel cluster manager runs on one or more secure cloud hosts. Its API is accessible over SSL as are the APIs of the kernel gateways it spawns.
  • The kernel executes in an environment having all of the necessary libraries for a particular dashboard.

Security Objectives

  • Only authorized users should be allowed to load and interact with the dashboard UX.
  • Authorized users should be able to use any interactive widgets the dashboard exposes.
  • Only code from the original notebook should be allowed to execute on the kernel. The dashboard user cannot send arbitrary additional code to the kernel.
  • Only authorized dashboard applications should be allowed to request and communicate with kernels on a given cluster.
  • Only users who have access to the original notebook and the source of the deployed dashboard backend can see the notebook / dashboard code.

Data Flow

  1. An anonymous user visits the URL of a deployed dashboard.
  2. The dashboard backend responds with the dashboard frontend HTML containing configuration, code, and pointers to CSS/JS.
  3. The user's browser loads the frontend CSS/JS.
  4. The JS sends an HTTP POST to the configured kernel cluster API.
  5. The kernel cluster launches a kernel gateway and responds with a URL referring to it.
  6. The JS sends an HTTP POST to the kernel gateway API.
  7. The kernel gateway launches a kernel and responds with metadata about it.
  8. The JS establishes a Websocket connection to the kernel gateway resource representing the communication channel with the launched kernel.

Entry Points

  • Web server hosting the dashboard backend
  • REST API of the kernel cluster
  • REST API of the kernel gateway
  • Websocket connection between the dashboard frontend and the kernel gateway
  • ZeroMQ connections between the kernel gateway and kernel
  • SSH daemons and other administrator access points on the dashboard backend and kernel cluster

Assets

  • The dashboard UX
  • Credentials for requesting a kernel gateway from a kernel cluster
  • Credentials for requesting a kernel from a kernel gateway
  • Access to a running kernel
  • Any algorithms expressed in or used by the notebook code
  • Any data accessed by the notebook code
  • Any secrets used by the notebook code

Threats

  • A user gains unauthorized access to the dashboard frontend by visiting its URL
  • A user requests a kernel gateway from the kernel cluster for arbitrary use
  • A user requests a kernel from a kernel gateway for arbitrary use
  • A user sends malicious code to the kernel to delete data, to steal data, to hijack other kernels, to use up cluster resources, etc.
  • A user discovers and connects to a running kernel to view and/or change messages going to/from it
  • A user requests kernel gateways and/or kernels until cluster resources are exhausted

Current Vulnerabilities

  • Any user can load the dashboard frontend
  • Any user can request a kernel gateway
  • Any user can request a kernel
  • Any user can send commands to any kernel
  • Any user can retrieve code and secrets from the dashboard HTML
  • Any user can send arbitrary code to a kernel
  • Any user can query the kernel cluster for running kernel gateway URLs
  • Any user can query a kernel gateway for running kernel IDs
  • Any user can request an unlimited number of kernel gateways or kernels

Desired Trust Levels

  • Anonymous user
  • Authenticated dashboard user
  • Developer with kernel cluster / kernel gateway API keys
  • Administrator of the dashboard web host
  • Administrator of the kernel cluster

Initial Planned Countermeasures

  • The dashboard backend requires user authentication before it responds to any other browser request. The dashboard backend uses a session cookie to track authentication.
  • The dashboard frontend does not contain the code from the notebook. Rather, the dashboard backend holds the code and acts as a proxy between the frontend and a kernel.
  • The dashboard backend requests a kernel gateway from a kernel cluster using an API key configured by the developer at deployment time.
  • The dashboard backend requests a kernel from a kernel gateway using an API key configured by the developer at deployment time.
  • Each kernel gateway and kernel has a unique URL that is not easily guessed by other cluster / gateway users.
  • Neither the kernel cluster nor a kernel gateway exposes a list of all running gateway or kernel instances to any user.
  • The dashboard backend only allows widget comm messages on the shell channel from the frontend to the kernel (e.g., from widgets to their backend counterparts).
  • All services have the ability to run with HTTPS enabled to account for situations when they are not on a secure network or not running behind a proxy with SSL termination.

Remaining Vulnerabilities

  • Unauthorized users can guess weak dashboard credentials, API keys, administrator credentials, etc.
  • Developers with API keys can spawn and use kernels for any purpose, even beyond the original notebook code.
  • Notebook code may grant unintentional permissions to dashboard users (e.g., write access to data stores).
  • Kernel-side processing of widget comm channel messages may have unknown vulnerabilities.
  • Kernels running within a shared cluster may execute code that discovers other kernels and snoops on their ZeroMQ traffic.
  • Administrator access is virtually unlimited.

References