Visualization for Intelligence Analysis

VIA: Visualization for Intelligence Analysis


Team members:

• Mai Dahshan

• Shengzhe Xu

• Yali Bian


Problem topic: Visualization for Intelligence Analysis

Our information visualization project, Visualization for Intelligence Analysis (VIA), will help analysts make sense of a collection of documents and form new insights or conclusions that aid in making decisions, solving problems, or simply understanding a situation better.

From the information visualization perspective, VIA should display at least 100 events at the same time, extracted from a large collection of documents, together with their complicated relationships. Events are the basic glyph elements in VIA. Each event has five attributes (the 5W): persons (who took part in the event), time (when the event happened), location (where the event happened), content (what happened during the event), and reason (why it happened). The relationship between two events can be expressed through their attributes: the same person took part in both events, both events happened in the same place, and so on.
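The event model described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Event` class, its field names, the `shared_attributes` and `relationships` helpers, and the sample events are all hypothetical and are not taken from the VIA source. The sketch stores the 5W attributes on each event and derives a relationship between two events whenever they share a person, a place, or a time.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Event:
    """One event glyph with its 5W attributes (field names are illustrative)."""
    name: str
    who: frozenset   # persons who took part in the event
    when: str        # time of the event, here a plain date string
    where: str       # location of the event
    what: str        # content: what happened
    why: str = ""    # reason, if known


def shared_attributes(a, b):
    """Return the attributes that two events have in common."""
    shared = {}
    common_people = a.who & b.who
    if common_people:
        shared["who"] = sorted(common_people)
    if a.where == b.where:
        shared["where"] = a.where
    if a.when == b.when:
        shared["when"] = a.when
    return shared


def relationships(events):
    """Yield (name, name, shared) for every pair sharing at least one attribute."""
    for a, b in combinations(events, 2):
        shared = shared_attributes(a, b)
        if shared:
            yield a.name, b.name, shared


# Made-up sample events, for illustration only.
events = [
    Event("meeting", frozenset({"Alice", "Bob"}), "2016-10-01", "Cafe", "Two people meet"),
    Event("transfer", frozenset({"Bob"}), "2016-10-03", "Bank", "Money is wired"),
    Event("pickup", frozenset({"Carol"}), "2016-10-03", "Cafe", "A package is collected"),
]
links = list(relationships(events))
```

Here `links` holds one entry per related pair; for example, "meeting" and "transfer" are linked because Bob took part in both. Comparing every pair is quadratic in the number of events, which is acceptable for the roughly 100 events the view targets, but a real implementation might index events by attribute instead.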

From the data analysis perspective, an analyst faces a large dataset with a huge number of documents and may spend weeks or even months working out the relationships between documents before reaching useful insights. Analysts therefore usually need a fact-hypothesis analysis tool: one that helps combine elements into facts, shows the relationships between facts, supports making sense of them as hypotheses, and lets users interactively add new facts and hypotheses. We plan to make VIA such a tool, to help analysts gain full insight into the dataset they are interested in.

From the sense-making perspective, VIA could help analysts make sense of large amounts of data not only by augmenting users’ foraging loop and synthesis loop, but also by improving dual-search efficiency. VIA has an event-based timeline overview that displays the events extracted from a large collection of documents, and their relationships, which gives a clearly structured view of the documents' contents and fosters the foraging loop. VIA also provides a hypothesis-generation interaction: users can add hypotheses on the timeline overview and connect events to newly generated hypotheses, mimicking the synthesis loop. In addition, VIA will provide further interactions to search, explore, and highlight related events that support hypotheses.


People (users)

VIA is mainly intended to help data analysts analyze and make sense of a large dataset in order to form new insights that aid in making decisions, solving problems, or simply understanding a situation better. It is for anyone who needs a fact-hypothesis analysis tool: one that helps combine elements into facts, shows the relationships between facts, supports making sense of them as hypotheses, and lets users interactively add new facts and hypotheses. VIA could help users understand the relationships between events (facts, evidence) and hypotheses (inferences, conclusions) drawn from a large collection of documents.

Users could therefore be analysts from any field, as long as they need to gain insights from the complicated relationships among entities in a large collection of documents. For example, users could be law enforcement or intelligence analysts (FBI, CIA) fighting crime, doctors curing diseases, or researchers exploring a new field.


Tasks (kinds of questions)

VIA will help analysts make sense of a collection of documents and form new insights or conclusions that aid in making decisions, solving problems, or simply understanding a situation better.

The main task is to help the analyst explore and make sense of the data, gaining better insight into large collections of data when the user may not know where to begin, which parts are important, or how concepts and events are related. The main task can be divided into three levels: data analysis, visual encoding and interaction idiom, and sense-making encoding and interaction idiom.

Data analysis

• How to make full use of a large collection of documents: how to design an effective data model that displays valuable information such as events (facts, evidence) and their connections, so that the main contents of the documents are represented fully and clearly.

• How to use the right granularity to represent each event element in the data model: use different levels of granularity to display events in a hierarchical structure.

• How to express all kinds of complicated connections between event elements at every level, with appropriate attributes describing each connection.

Visual encoding and interaction idiom

• How to build a multi-view visualization with high cognitive fit. For example, how to build an integrated multi-view visualization that helps the analyst understand the relationship between the information presentation and the problem-solving task?

• How to implement a dynamic visualization tool that shows the flow of facts and lets users place suitable hypothesis items on the visualization.

• How to provide appropriate views, combined with interaction, that help users find the contents they are interested in and that form a single integrated view giving the user better insight into the relationships between entities.

• How to use views to display the documents' contents at all levels: document collection level, text level, event level, keyword level, etc.

• How to manage the trade-off between a good document representation and the users' sense-making process, and how to fit a large collection of events or documents onto a laptop screen. Some visualization tools offer good means of document representation, but as users explore more documents the screen fills up with nodes (documents), leading to visual clutter.

Sense-making algorithm

• How to help analysts build hypotheses using relationship diagrams and grouping of information, and how to provide them with evidence to support or reject their hypotheses?

• How to build an interactive visualization that helps the user annotate, mark, and change the representations.

• How to help users improve their foraging and synthesis loops: how to help users find the contents that are important to them, how to help users organize the related contents into a schema, and how to let them record basic insights in VIA, so that they can form a higher-level insight or conclusion.
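The hypothesis-building tasks above can be illustrated with a small sketch of how a user-added hypothesis might be linked to events as evidence. Everything here is an assumption made for illustration: the `Hypothesis` class, its `link` and `related_events` methods, and the event identifiers are hypothetical and do not describe VIA's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    """A hypothesis placed by the user, with events linked as evidence."""
    statement: str
    supporting: set = field(default_factory=set)  # ids of events that support it
    rejecting: set = field(default_factory=set)   # ids of events that reject it

    def link(self, event_id, supports=True):
        """Connect an event to this hypothesis as supporting or rejecting evidence."""
        target, other = (
            (self.supporting, self.rejecting) if supports
            else (self.rejecting, self.supporting)
        )
        other.discard(event_id)  # an event is kept on one side only
        target.add(event_id)

    def related_events(self):
        """Event ids a view could highlight when this hypothesis is selected."""
        return sorted(self.supporting | self.rejecting)


# Made-up hypothesis and event ids, for illustration only.
h = Hypothesis("Bob moved the money")
h.link("meeting")
h.link("transfer")
h.link("pickup", supports=False)
h.link("meeting", supports=False)  # the analyst revises an earlier link
```

Keeping supporting and rejecting evidence in separate sets lets the user change a link, as in the last line, without leaving the same event on both sides of the hypothesis.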


Relevant data sources

VIA could be applied to visualize all kinds of documents, as long as their contents can be structured as basic sense-making elements. VIA is designed to work best with collections of many documents that include events, facts, stories, or evidence, since more real-world entities help form more understandable visual elements in the view. The crescent dataset, which records when, where, and what, is a good example: each document in the crescent dataset describes one or several events, stating that at a certain time someone did something at a certain place, and all of these real-world entities can be used as attributes of events. For the crescent dataset, the collection of documents could therefore be mapped to a list of events, where each event includes the 5W attributes and sub-events (the smaller sub-stories that compose the event).
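The mapping from documents to events with sub-events can be sketched as a small tree structure. This is a minimal sketch under assumptions: the `EventNode` class, the `at_depth` helper, and the sample events are hypothetical (the sample data is invented and is not taken from the crescent dataset). It shows one way to display the same events at a coarse or a fine level of granularity.

```python
from dataclasses import dataclass, field


@dataclass
class EventNode:
    """An event with some of its 5W attributes and optional sub-events."""
    who: str
    when: str
    where: str
    what: str
    sub_events: list = field(default_factory=list)


def at_depth(nodes, depth):
    """Return the events visible at a given granularity.

    depth 0 shows only the given events; each extra level replaces an
    event with its sub-events. Events without sub-events stay visible.
    """
    if depth == 0:
        return list(nodes)
    visible = []
    for node in nodes:
        if node.sub_events:
            visible.extend(at_depth(node.sub_events, depth - 1))
        else:
            visible.append(node)
    return visible


# Invented events, for illustration only.
events = [
    EventNode("Alice", "2016-10-01", "Cafe", "Alice meets a contact", sub_events=[
        EventNode("Alice", "2016-10-01 09:00", "Cafe", "Alice arrives"),
        EventNode("Alice", "2016-10-01 09:30", "Cafe", "Alice hands over a note"),
    ]),
    EventNode("Bob", "2016-10-03", "Bank", "Bob wires money"),
]
coarse = [e.what for e in at_depth(events, 0)]
fine = [e.what for e in at_depth(events, 1)]
```

At depth 0 the view would show two events; at depth 1 the first event is replaced by its two sub-events while the second, which has none, remains as it is.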


Existing solutions

VIA combines the fields of intelligence analysis, sense-making, intelligence and forecasting, law enforcement, analytical provenance, and text visual analytics. There are many strong solutions from those fields, such as the Wigmore chart, Jigsaw, and charting techniques for criminal intelligence.

Wigmore chart

Wigmore’s ‘The Problem of Proof’ was a path-breaking attempt to systematize the process of drawing inferences from trial evidence. A Wigmore chart [1] is a graphical method for the analysis of legal evidence in trials, developed by John Henry Wigmore. It is an early form of the modern belief network. Its main disadvantage is inflexibility. In contrast, this project lets users think freely, label their thoughts directly on the visualization, and see the logical probability and trend, so that they do not need to keep every relationship and detail in their heads.

Jigsaw

Jigsaw [2] is a visual analytics tool that visualizes relations among entities such as people, organizations, places, and times, which helps analysts understand large collections of text documents. It offers a number of visualization views, such as the List View, the Document View, the Document Cluster View, and the Graph View, that help the analyst better understand the underlying data set. Jigsaw offers a convenient environment for visualizing data, but falls short in the following ways:

• A large initial time investment is needed to understand the tool and set it up.

• Users need to refresh their understanding of the semantic connections whenever they switch between views (low cognitive fit). This project therefore shows one main view with an interaction for switching between detailed and brief information, supported by a clear information display.

• The Jigsaw main view only shows linked nodes with no direction, which makes it hard to pick out useful information. This project therefore shows clearer logical relationships through workflow, relationship flow, and logic trend.

• Filtering options are very limited: users cannot easily select a subset of the data and have to work on the whole data set.

Charting techniques

The analytical process[3] in criminal intelligence is aimed at the use and development of intelligence to direct law enforcement objectives, both for short-term operational aims and for long-term strategic reasons. The scope of analysis and its overall credibility depends on the level and accuracy of acquired information, combined with the skills of the analyst. Analysis is a cyclical process, which can be performed to assist with all types of law enforcement objectives. Different types of crimes and criminal operations require different scenarios, but in all cases the information used should not be pre-filtered through an artificially and arbitrarily imposed selective grid.

Data integration is the first phase of the analytical process. It involves combining information from different sources in preparation for the formulation of inferences. Various techniques may be used to display this information, the most common being the use of charting techniques.

• Link charting—to show relationships among entities featuring in the investigation

• Event charting—to show chronological relationships among entities or sequences of events

• Commodity flow charting—to explore the movement of money, narcotics, stolen goods or other commodities

• Activity charting—to identify activities involved in a criminal operation

• Financial profiling—to identify concealed income of individuals or business entities and to identify indicators of economic crime

• Frequency charting—to organize, summarize and interpret quantitative information

• Data correlation—to illustrate relationships between different variables

The drawback of this series of charts is that they do not help users combine all these views into one: users have to lay out all the charts side by side to get a full understanding of the contents.


References:

[1] Goodwin, Jean. "Wigmore's chart method." Informal Logic 20.3 (2000)

[2] “Jigsaw - Visual Analytics for Investigative Analysis”. Retrieved October 21, 2016, from http://www.cc.gatech.edu/gvu/ii/jigsaw/

[3] Criminal Intelligence Manual for Analysts. https://www.unodc.org/documents/organized-crime/Law-Enforcement/Criminal_Intelligence_for_Analysts.pdf