DRAT Proteus Requirements Page

Chris Mattmann edited this page Aug 15, 2020 · 4 revisions

Steven Francus
Jun Suh Lee
Daler Asrorov
Zhi Fei (Emily) Liu

Project Proteus – Requirements Document

"A GUI for DRAT"

Introduction

Proteus is an extensible addition to the current DRAT project (https://github.com/chrismattmann/drat) that adds an optional graphical user interface (GUI) for interacting with and getting information from DRAT. The GUI should only be loosely coupled with DRAT, and should allow usage of the project with or without it.

The GUI for the current command-line interface is motivated by a desire to improve developer productivity and control when using DRAT. One common issue we hope to address is the limited messaging visible while DRAT performs its automated license analyses: developers should know what DRAT is doing, since an analysis can take minutes or hours for large codebases. Additionally, a GUI wrapper for DRAT can allow less-technical contributors to perform the necessary release audit analysis, helping shorten release cycles.

Definitions

  • DRAT refers to the Distributed Release Audit Tool, an open-source project allowing distributed checking of software licensing in large codebases.
  • OODT refers to Apache OODT: a way to integrate and archive your processes, your data, and its metadata. It facilitates the generation, processing, management, distribution and analysis of data management, data archiving, and data analytics systems allowing for the integration of data, computation, visualization and other components.
  • Proteus refers to the current project, a graphical user interface (GUI) module that allows end users to view the status of DRAT operations and to control DRAT in lieu of the current command-line interface.
  • Level 0 refers to the most important requirements to be implemented with highest priority. Every following level (1, 2, …) should be considered lower priority than the level before it.

System High-Level Overview

Our current proposal for the GUI design separates it into three modular, loosely coupled components: a front-end, which exposes the GUI to the end user; a mid-tier, which acts as a conduit between the front-end and back-end; and a back-end, which communicates directly with DRAT to gather status updates and pass relevant information on to the end user.

The front-end will be developed as an in-browser client, using HTML, CSS, and JavaScript for interactivity. The mid-tier will be a Java-based server that communicates with the front-end via a REST HTTP API (see attached API Design Documentation). Lastly, the back-end will use the existing DRAT API to communicate with the necessary components, and will expose an API for the mid-tier.

Proposed Schedule

The production schedule for Proteus is attached in Deliverable #1, the Schedule document. This specifies the roles of each team member, the proposed schedule of design and implementation, and which deliverables will be presented at each checkpoint.

Requirements

User Interaction/System Behavior

User Interaction Requirements

For user interaction with Proteus, we aim to mimic the standard interactions available in the DRAT CLI: go (automated method, running all four steps in order), crawl, index, map, and reduce. Therefore the following requirements should be implemented in Proteus:

  • [Level 0] The end user should be able to run “go” (the automated method, which runs the four remaining methods in order) from Proteus.
  • [Level 0] The end user should be able to select a local file directory on their computer with an in-browser file/directory picker.
  • [Level 0] The end user should be able to re-run or run a different command on the selected repository without having to re-select it (local) or re-download it (non-local).
  • [Level 1] The end user should be able to input a URI containing a valid git or svn repository into an input box, and Proteus should:
      • Download the repository into a temporary location.
      • Run the specified drat command on the repository.
      • [Level 2] Give the option to save the repository into a non-temporary location.
  • [Level 1] The end user should be able to kill running processes.
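
As an illustration of the “go” requirement, here is a minimal Python sketch; it is not Proteus or DRAT code. It assumes a hypothetical run_step callback that launches a single DRAT command and a hypothetical should_stop callback that reports whether the user asked to cancel. Neither name comes from DRAT. The step order follows the list given above.

```python
from typing import Callable, List

# Order of the four DRAT steps that "go" runs, as listed in this document.
GO_STEPS = ("crawl", "index", "map", "reduce")


def run_go(
    run_step: Callable[[str, str], None],
    repo_path: str,
    should_stop: Callable[[], bool] = lambda: False,
) -> List[str]:
    """Run each step in order against repo_path and return the steps completed.

    run_step is a hypothetical callback that launches one DRAT command.
    should_stop is a hypothetical callback polled before each step.
    """
    completed = []
    for step in GO_STEPS:
        if should_stop():
            break  # the user cancelled; skip the remaining steps
        run_step(step, repo_path)
        completed.append(step)
    return completed
```

Because the repository path is passed in on each call, the same selection can be reused for a different command without re-selecting or re-downloading it. This sketch only cancels between steps; killing a step that is already running would need process-level control, which is not shown.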

System Behavior Requirements

After the end user selects a command to run and a repository to run DRAT against, the system should present an easy-to-understand graphical interface updating in real time as DRAT audits the repository. Therefore the following requirements should be implemented for use while DRAT is working (on one or all of the tasks listed above):

  • [Level 0] Proteus should show a visualization of the MIME types (the media type of a file) present in the repository. This can be represented as a graph (pie, histogram, etc.) showing the number of each MIME type present in the repository, and should update in real-time as DRAT is running (with a reset similar to the progress bar).
  • [Level 0] Proteus should show the user how many instances of RAT are running. DRAT is a distributed wrapper around Apache RAT, so this count can help with analysis when working on distributed systems.
  • [Level 1] Proteus should expose an analyzed version of each running RAT log and show the end user which stage each RAT wrapper is on (if automated running is selected).
  • [Level 1] Proteus should show a running feed of the last 20 files scanned. This can be exposed via an RSS feed from Apache OODT.
  • [Level 1] The in-browser GUI should show the user what percentage of the repository codebase has been analyzed. This will reset during each step (i.e. if the automated run is selected, the progress bar will reset at the conclusion of each of crawl, index, map, and reduce). Currently the percentage of the repository codebase refers to the number of files already visited by DRAT divided by the total number of files present in the repository.

After DRAT has finished analyzing a repository, the following requirements should be implemented in the Proteus workflow:

  • [Level 0] Proteus should show a data visualization of all license types present in the repository, what relative percentage of the repository each license type is present at, and what unapproved licenses are found and where.
  • [Level 0] All data visualizations shown during DRAT’s runtime should remain available when DRAT finishes running. Because they update in real time, at that point they should show information for the entire repository.
  • [Level 1] Proteus should map the aggregate logs of each RAT instance to determine which RAT instance found problematic licenses, and display it to the end-user.
  • [Level 2] Proteus should display the size of the entire repository if possible (in memory size and number of files).
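
The bookkeeping behind several of the requirements above (MIME type counts, percentage analyzed, the feed of the last 20 files, and license shares) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Proteus design. ScanTracker and license_shares are hypothetical names, and the standard-library mimetypes module, which guesses from the file extension only, stands in for whatever MIME detection DRAT actually reports.

```python
import mimetypes
from collections import Counter, deque
from typing import Dict


class ScanTracker:
    """Hypothetical in-memory tally of the files scanned during one DRAT step."""

    def __init__(self, total_files: int, feed_size: int = 20):
        self.total_files = total_files
        self.visited = 0
        self.mime_counts = Counter()
        # A bounded deque silently drops the oldest entry once it is full.
        self.recent = deque(maxlen=feed_size)

    def record(self, path: str) -> None:
        """Register one scanned file."""
        mime, _ = mimetypes.guess_type(path)  # extension-based guess (assumption)
        self.mime_counts[mime or "unknown"] += 1
        self.recent.append(path)
        self.visited += 1

    def percent_done(self) -> float:
        """Files visited divided by total files, expressed as a percentage."""
        if self.total_files == 0:
            return 100.0
        return 100.0 * self.visited / self.total_files

    def reset_progress(self) -> None:
        """Reset the progress bar and MIME chart when a step concludes."""
        self.visited = 0
        self.mime_counts.clear()


def license_shares(license_counts: Dict[str, int]) -> Dict[str, float]:
    """Convert per-license file counts into relative percentages."""
    total = sum(license_counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in license_counts.items()}
```

For example, after recording three of four files, percent_done() gives 75.0, and license_shares({"Apache-2.0": 3, "MIT": 1}) gives 75.0 and 25.0 respectively. Whether the recent-files feed should also reset between steps is left open here; the sketch keeps it.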

Error Handling Requirements

Proteus should expose any significant errors that occur during DRAT’s operation and that:

  • Are severe enough to impact the operation of DRAT or harm the validity of its results
  • Cause DRAT to behave in a manner unexpected by Proteus
  • Cause DRAT to crash or exit unexpectedly

These errors should be displayed to the user in a way that allows easy debugging (giving more information rather than less, e.g. an expandable box that hides information when not debugging). Warnings can be displayed either during DRAT’s operation (similar to the current CLI) or saved in a log that can be viewed or downloaded at the end.

Security

Although Proteus is an in-browser GUI, security concerns should be minimal, as the server back-end and mid-tier will not be accessible via a network connection. As such, our only current security requirement is to ensure that the in-browser client is not vulnerable to cross-site scripting (XSS) attacks. Any other security flaws should be considered inherent to DRAT or the result of the host machine being compromised.
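
One common XSS defence is to escape any repository-derived text, such as file names, before it is placed into HTML. The following is a minimal Python sketch of that principle using the standard-library html.escape function. The function name render_file_entry is hypothetical, and the actual Proteus front-end is planned in JavaScript, where the equivalent would be to assign text rather than markup to page elements.

```python
import html


def render_file_entry(path: str) -> str:
    """Return an HTML list item with the file path escaped for safe display."""
    # html.escape replaces &, <, > and (with quote=True) both quote characters.
    return "<li>" + html.escape(path, quote=True) + "</li>"
```

A file named `<script>alert(1)</script>.txt` is then rendered as inert text (`&lt;script&gt;alert(1)&lt;/script&gt;.txt`) instead of being executed by the browser.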

Scalability

Since Proteus is a GUI tool for transmitting and receiving data from DRAT (in lieu of the existing CLI), we do not foresee any scalability challenges in building Proteus. Proteus is designed to work from one computer (the same computer running DRAT), and should not be impacted by the distributed nature of DRAT. In order to lower the memory footprint of Proteus, we will pay special attention to optimizing our use of the Java Virtual Machine (including hosting the back-end on the same server as DRAT instead of spinning up a new instance) as well as the JavaScript and images on the Proteus front-end.

System

As with the scalability concerns, Proteus should aim to limit the memory footprint required for operation in order to work optimally. This means practicing proper coding techniques both for Java server development (back-end and mid-tier) and JavaScript (front-end). Additionally, at this time we do not aim to support responsive design for the Proteus front-end, and only guarantee best visual results with a standard desktop/laptop set-up. Moreover, while we aim to provide the best results on all systems, the optimal environment for using Proteus will be macOS running Google Chrome for the in-browser UI (however, we will make sure that Firefox, Safari, and Internet Explorer/Edge (Level 2) are also supported).

Testing Plan

All components of Proteus will be tested individually (with dependency injection and mocked data as necessary) and as a cohesive unit (integration and manual testing). Each lead will be responsible for developing a testing plan for their module.
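
To illustrate what testing with dependency injection and mocked data can look like, here is a minimal Python sketch. All names in it (StatusService, FakeDratClient, file_counts) are hypothetical and do not correspond to any existing DRAT or Proteus API; the actual mid-tier and back-end are planned in Java, where the same pattern applies.

```python
import unittest


class StatusService:
    """Hypothetical component that reports DRAT progress to the front-end."""

    def __init__(self, drat_client):
        # The client is injected so that tests can substitute a fake one.
        self.drat_client = drat_client

    def progress_message(self) -> str:
        visited, total = self.drat_client.file_counts()
        return f"{visited} of {total} files analyzed"


class FakeDratClient:
    """Mocked data source standing in for a running DRAT instance."""

    def file_counts(self):
        return (5, 20)


class StatusServiceTest(unittest.TestCase):
    def test_progress_message(self):
        service = StatusService(FakeDratClient())
        self.assertEqual(service.progress_message(), "5 of 20 files analyzed")
```

Because the service never constructs its own client, the unit test runs without DRAT or OODT being installed.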

Back-end

The back-end will be quality assured with a combination of unit testing and integration testing for the common workflow (as described in the requirements section). Our integration testing will presume that DRAT and OODT work as expected.

Mid-tier

The mid-tier will be tested via integration tests when combined with the back-end and front-end. Additionally, it will be unit tested as described above.

Front-end

The front-end will be tested primarily via manual testing, with limited unit testing for the JavaScript portions.

Full System

The full system will be tested manually, following a test plan developed for common use cases.