
Software Design

Gregory Gould edited this page Apr 5, 2018 · 10 revisions

This document outlines the design of the software: which classes will be used and how responsibilities will be delegated among them, as well as the physical components, their relations to each other, and which software components each will handle. The diagrams below are ordered by increasing level of detail.

High Level Diagram

Please find the High Level Architecture diagram file on this page.

UML Component Diagram

Please find the UML Component diagram file on this page.

Components

  • Website - This subsystem handles everything related to the website: delivering web pages, retrieving user interactions on the website, and producing the visual components that the end user will see.
    • DjangoWebserver - This component delivers web pages and receives user requests (such as searches), passing them to the appropriate component to handle.
    • VueInterfaces - This component coordinates searches that have been submitted to the site. It obtains the search string and passes it on, handles the result sets that are returned, and provides each result set to the Visualizer to obtain visualizations before returning the complete results page, formatted and ready for the webserver to deliver.
    • Visualizer - This component creates visualizations from a search result set. It produces the appropriate visualization or, when multiple options are available, the requested visualization type, and then returns the visualization in a web-ready format.
  • AnalysisTools - This subsystem deals with controlling and analyzing data that has been collected elsewhere in the system.
    • NaturalLanguageProcessor - This component detects and tags entities and keywords found in the unstructured text data provided to it. It returns these tags and metadata in a structured format for other components to use.
    • ElasticSearch - This component handles the storage and searching of keywords and metadata that have been collected by the system.
  • IngestionTools - This subsystem handles the intake of raw data from the governance data dumps, including both unstructured and structured data.
    • PDFScraper - This component obtains text from PDF files and separates individual PDF documents when several are grouped together. It uses the NaturalLanguageProcessor to process the strings of text it finds, and returns structured data and organized, split PDFs.
    • DataDumpParser - This component obtains a data dump from a user, determines where to send the data and when to call other components to transcribe or tag it, and sends the results to be stored.
    • DatabaseManager - This component handles the storage and retrieval of all collected data.
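
As a rough illustration of how the IngestionTools responsibilities above might be delegated, the following is a minimal Python sketch, not the project's actual code. Only the class names come from the component list; every method name (`tag`, `scrape`, `store`, `retrieve`, `ingest`), the `TaggedDocument` type, and the capitalized-word tagging rule are assumptions made for the example. Real PDF extraction and entity detection are replaced by in-memory stand-ins.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TaggedDocument:
    """Hypothetical structured result: source name, raw text, detected keywords."""
    source: str
    text: str
    keywords: list[str] = field(default_factory=list)


class NaturalLanguageProcessor:
    """Stand-in tagger. Placeholder rule: capitalized words count as keywords."""

    def tag(self, text: str) -> list[str]:
        seen: list[str] = []
        for word in text.split():
            cleaned = word.strip(".,;:!?()")
            if cleaned[:1].isupper() and cleaned not in seen:
                seen.append(cleaned)
        return seen


class PDFScraper:
    """Stand-in scraper. Takes already-extracted text and delegates tagging."""

    def __init__(self, nlp: NaturalLanguageProcessor) -> None:
        self.nlp = nlp

    def scrape(self, source: str, extracted_text: str) -> TaggedDocument:
        return TaggedDocument(source, extracted_text, self.nlp.tag(extracted_text))


class DatabaseManager:
    """Stand-in store. Keeps documents in memory, keyed by source name."""

    def __init__(self) -> None:
        self.documents: dict[str, TaggedDocument] = {}

    def store(self, document: TaggedDocument) -> None:
        self.documents[document.source] = document

    def retrieve(self, source: str) -> TaggedDocument:
        return self.documents[source]


class DataDumpParser:
    """Routes each item of a data dump through the scraper, then stores it."""

    def __init__(self, scraper: PDFScraper, database: DatabaseManager) -> None:
        self.scraper = scraper
        self.database = database

    def ingest(self, dump: dict[str, str]) -> int:
        # The dump is modeled as {source name: extracted text} for simplicity.
        for source, text in dump.items():
            self.database.store(self.scraper.scrape(source, text))
        return len(dump)


if __name__ == "__main__":
    database = DatabaseManager()
    parser = DataDumpParser(PDFScraper(NaturalLanguageProcessor()), database)
    parser.ingest({"minutes.pdf": "The Council approved the Budget."})
    print(database.retrieve("minutes.pdf").keywords)
```

Passing each collaborator in through the constructor keeps the components loosely coupled, so a stand-in like the tagger here could be swapped for a real implementation without changing DataDumpParser.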

UML Class Diagrams

Back end

Please find the UML Class diagram for the back end on this page.

Scraping and Analysis

Please find the UML Class diagram for scraping and analysis on this page.

UML Front End

Please find the UML front end diagram on this page.
