Purpose and Audience
This section contains a walk-through of the Threat Investigation analyst view. The intended audience is Security Analysts responsible for reviewing the results for potential threats. The Threat Investigation notebook provides a way to perform a more detailed analysis of the connections previously scored as high risk. Users will select a day to investigate, starting at the Suspicious Connects section to later get to the detailed analysis performed with a Threat Investigation Jupyter notebook.
### Walk-through
Access the analyst view for suspicious connects: http://"server-ip":8889/files/ui/flow/suspicious.html
Select the date that you want to review. Your screen should now look like this:
The analyst must score the suspicious connections before moving into the Threat Investigation view. Please refer to the Suspicious Connects Analyst View walk-through.
Select Flows > Threat Investigation from the Open Network Insight menu.
The Threat Investigation web page will open and load the embedded Jupyter notebook.
You can select any IP from the list and click "Search" to view specific details about it. A query is then executed against the flow table, searching the raw data originally collected for all communication between this IP and any other IP address during the day, and collecting additional information such as:
- max & avg number of bytes sent/received
- max & avg number of packets sent/received
- destination port
- source port
- first & last connection time
- count of connections
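The per-connection figures listed above could be computed along these lines. This is a minimal sketch over in-memory records, not the query the notebook actually runs; the `summarize_flows` helper and the field names (`sip`, `dip`, `ibyt`, `ipkt`, `tstart`) are assumptions made for illustration.

```python
from collections import defaultdict

def summarize_flows(flows, anchor_ip):
    """Group flows involving anchor_ip by (source, destination) and summarize each group."""
    groups = defaultdict(list)
    for f in flows:
        if anchor_ip in (f["sip"], f["dip"]):
            groups[(f["sip"], f["dip"])].append(f)

    summary = {}
    for pair, rows in groups.items():
        byts = [r["ibyt"] for r in rows]
        pkts = [r["ipkt"] for r in rows]
        # Timestamps are "YYYY-MM-DD HH:MM:SS" strings, which sort chronologically.
        times = [r["tstart"] for r in rows]
        summary[pair] = {
            "conns": len(rows),
            "max_bytes": max(byts),
            "avg_bytes": sum(byts) / len(byts),
            "max_pkts": max(pkts),
            "avg_pkts": sum(pkts) / len(pkts),
            "first_seen": min(times),
            "last_seen": max(times),
        }
    return summary

# Hypothetical sample data; the last record does not involve the anchor IP.
flows = [
    {"sip": "10.0.0.5", "dip": "192.0.2.7", "ibyt": 100, "ipkt": 2, "tstart": "2016-07-08 10:00:00"},
    {"sip": "10.0.0.5", "dip": "192.0.2.7", "ibyt": 300, "ipkt": 4, "tstart": "2016-07-08 10:05:00"},
    {"sip": "192.0.2.9", "dip": "10.0.0.5", "ibyt": 50, "ipkt": 1, "tstart": "2016-07-08 11:00:00"},
    {"sip": "192.0.2.9", "dip": "192.0.2.7", "ibyt": 10, "ipkt": 1, "tstart": "2016-07-08 12:00:00"},
]
summary = summarize_flows(flows, "10.0.0.5")
```

Grouping by the ordered (source, destination) pair keeps the direction of each connection, which the later inbound/outbound split relies on.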
The full output of this query is stored in the ir-<ip>.csv file. If an expanded search was previously executed on this IP, the system will extract the results from the preexisting file to reduce the execution time by avoiding another query to the table.
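The reuse of a preexisting results file can be sketched as a simple file-backed cache. This is an illustrative sketch only: `load_or_query`, `data_dir`, and the `run_query` callable (standing in for the Hive or Impala query) are hypothetical names, not part of the notebook.

```python
import csv
import os

def load_or_query(ip, data_dir, run_query):
    """Return rows for ip, reusing ir-<ip>.csv when it already exists."""
    path = os.path.join(data_dir, "ir-{0}.csv".format(ip))
    if os.path.exists(path):
        # A previous expanded search left its results on disk: skip the query.
        with open(path, newline="") as fh:
            return list(csv.reader(fh))
    # First search for this IP: run the (slow) query and store its output.
    rows = [[str(value) for value in row] for row in run_query(ip)]
    with open(path, "w", newline="") as fh:
        csv.writer(fh).writerows(rows)
    return rows
```

Values are converted to strings before being returned so that a first call and a later cached call yield the same rows.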
Query execution time is long and will vary depending on whether Hive or Impala is being used.
Based on the results in this file, the following functions will be executed:
- get_in_out_and_twoway_conns()
- add_geospatial_info()
- add_network_context()
The system will create three dictionaries, one for each of the following:
- Inbound connections (when the suspicious IP acts only as destination)
- Outbound connections (when the suspicious IP acts only as source)
- 2Way Connections (when the suspicious IP acts as both source and destination)
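The three-way split above can be expressed with set operations. This is a hedged sketch, not the body of get_in_out_and_twoway_conns; the `split_connections` helper and its input format (a list of `(source, destination)` tuples) are assumptions.

```python
def split_connections(pairs, anchor_ip):
    """Split the peers of anchor_ip into inbound, outbound and two-way dictionaries."""
    # Peers that connected to the anchor (the anchor acted as destination).
    sources = {s for s, d in pairs if d == anchor_ip and s != anchor_ip}
    # Peers the anchor connected to (the anchor acted as source).
    destinations = {d for s, d in pairs if s == anchor_ip and d != anchor_ip}

    twoway = sources & destinations
    inbound = sources - twoway
    outbound = destinations - twoway

    # Empty per-peer dictionaries leave room for later enrichment
    # (for example geolocation or network context).
    as_dict = lambda ips: {ip: {} for ip in sorted(ips)}
    return as_dict(inbound), as_dict(outbound), as_dict(twoway)

pairs = [
    ("192.0.2.1", "10.0.0.5"),   # inbound only
    ("10.0.0.5", "192.0.2.2"),   # outbound only
    ("192.0.2.3", "10.0.0.5"),   # two-way (both directions seen)
    ("10.0.0.5", "192.0.2.3"),
    ("192.0.2.8", "192.0.2.9"),  # unrelated to the anchor
]
inbound, outbound, twoway = split_connections(pairs, "10.0.0.5")
```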
If the iploc.csv file is available, each dictionary will be updated with the geolocation data for each IP.
If the network_context_1.txt file is available, a description for each identified node will also be added to each dictionary.
The connections dictionary will be separated into two smaller dictionaries, containing respectively:
- Top 'n' IPs per number of connections.
- Top 'n' IPs per bytes transferred.
The number of results stored in the dictionaries (n) can be set by updating the value of the top_results variable.
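A top-'n' selection of this kind might look as follows. The sketch assumes each dictionary maps an IP to a record with `conns` and `bytes` fields; that layout, the `top_n` helper, and the name `top_inbound_c` are hypothetical (only `top_results` and `top_inbound_b` appear in this walk-through).

```python
import heapq

top_results = 2  # number of entries to keep; illustrative value

def top_n(conns, key, n):
    """Return a dict holding the n entries of conns with the largest value for key."""
    best = heapq.nlargest(n, conns.items(), key=lambda item: item[1][key])
    return dict(best)

# Hypothetical inbound dictionary.
inbound = {
    "192.0.2.1": {"conns": 12, "bytes": 900},
    "192.0.2.2": {"conns": 40, "bytes": 100},
    "192.0.2.3": {"conns": 3, "bytes": 5000},
}
top_inbound_c = top_n(inbound, "conns", top_results)  # ranked by connection count
top_inbound_b = top_n(inbound, "bytes", top_results)  # ranked by bytes transferred
```

Ranking by connections and by bytes separately matters because a peer with few connections can still account for most of the traffic volume.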
In addition, a web form is displayed under the title of 'Threat summary', where the analyst can enter a Title & Description on the kind of attack/behavior described by the particular IP address that is under investigation.
Click on the Save button after entering the data to write it into a CSV file, which eventually will be used in the Storyboard Analyst View.
After creating the csv file with the analysis description, the following functions will generate all graphs and diagrams related to the IP under investigation, to populate the Storyboard Analyst view.
- generate_attack_map_file(anchor_ip, top_inbound_b, outbound, twoway)
- generate_stats(anchor_ip, top_inbound_b, outbound, twoway, threat_name)
- generate_dendro(anchor_ip, top_inbound_b, outbound, twoway, date)
- details_inbound(anchor_ip, top_inbound_b)
generate_attack_map_file() - This function creates a globe map indicating the trajectory of the connections based on their geolocation. It depends on having geolocation data for each IP. If you haven't set up a geolocation database file, the map file won't be generated.
generate_stats() - This function creates the horizontal bar graph for the Impact Analysis, which represents the number of inbound, outbound and twoway connections found.
generate_dendro() - This function creates a file linking all the different IPs that have connected to the IP under investigation. The result is displayed in the Storyboard under the Incident Progression panel as a dendrogram.
If no network context file is included, the dendrogram will only be 1 level deep, but if a network context file is included, additional levels will be added to the dendrogram to break down the threat activity.
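The difference between a one-level and a multi-level dendrogram can be illustrated with a nested tree. This is only a sketch: the `name`/`children` layout, the `build_dendro` helper, and the IP-to-description `context` mapping are assumptions, and the real dendro-<ip>.json format may differ.

```python
import json

def build_dendro(anchor_ip, peer_ips, context=None):
    """Build a nested name/children tree rooted at anchor_ip."""
    if not context:
        # No network context: every peer hangs directly off the root (one level).
        children = [{"name": ip} for ip in sorted(peer_ips)]
    else:
        # With network context: group peers under their description (extra level).
        groups = {}
        for ip in sorted(peer_ips):
            groups.setdefault(context.get(ip, "unknown"), []).append({"name": ip})
        children = [{"name": label, "children": leaves}
                    for label, leaves in sorted(groups.items())]
    return {"name": anchor_ip, "children": children}

peers = ["192.0.2.2", "192.0.2.1"]
flat = build_dendro("10.0.0.5", peers)
deep = build_dendro("10.0.0.5", peers, {"192.0.2.1": "DMZ"})
print(json.dumps(deep, indent=2))
```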
details_inbound() - This function queries the flow table for additional details on the IP under investigation and its connections, and groups them by time. The result is a graph showing the number of connections occurring in a customizable timeframe.
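Grouping connections by time amounts to flooring each timestamp to the start of its window and counting. The sketch below assumes timestamps formatted as `YYYY-MM-DD HH:MM:SS` and a window length that divides an hour evenly; the `connections_per_window` helper is hypothetical and does not reproduce the query used by details_inbound().

```python
from collections import Counter
from datetime import datetime, timedelta

def connections_per_window(timestamps, minutes=15):
    """Count connections per fixed-size time window (window start -> count)."""
    counts = Counter()
    for ts in timestamps:
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        # Floor the timestamp to the start of its window within the hour.
        floored = t - timedelta(minutes=t.minute % minutes, seconds=t.second)
        counts[floored.strftime("%Y-%m-%d %H:%M")] += 1
    return dict(sorted(counts.items()))

stamps = [
    "2016-07-08 10:01:10",
    "2016-07-08 10:14:59",
    "2016-07-08 10:15:00",
    "2016-07-08 11:59:59",
]
buckets = connections_per_window(stamps, minutes=15)
```

Changing `minutes` is what makes the timeframe customizable: a smaller window gives a finer-grained graph at the cost of more points.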
add_threat() - This function updates/creates the threats.csv file, appending a new line for every threat analyzed. This file will serve as an index for the Storyboard and is displayed in the 'Executive Threat Briefing' panel.
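The create-or-append behavior described for threats.csv can be sketched with the standard `csv` module. The `append_threat` helper and the column layout (`ip`, `title`, `description`) are assumptions for illustration; the actual file format is not specified here.

```python
import csv
import os

def append_threat(path, ip, title, description):
    """Append one row to the index file, writing a header first if the file is new."""
    is_new = not os.path.exists(path)
    with open(path, "a", newline="") as fh:
        writer = csv.writer(fh)
        if is_new:
            writer.writerow(["ip", "title", "description"])  # hypothetical columns
        writer.writerow([ip, title, description])
```

Opening the file in append mode means earlier entries are preserved, so the file accumulates one line per analyzed threat.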
Each function will print a message to let you know if its output file was successfully updated.
Continue to the Storyboard
Once you have saved comments on any suspicious IP, you can continue to the Storyboard to check the results.
Input files
- flow_scores.csv
- iploc.csv
- network_context_1.txt
Output files
- /oni-oa/data/flow/<date>/threats.csv
- /oni-oa/data/flow/<date>/threat_<ip>.csv
- /oni-oa/data/flow/<date>/sbdet-<ip>.tsv
- /oni-oa/data/flow/<date>/globe_<ip>.json
- /oni-oa/data/flow/<date>/stats-<ip>.json
- /oni-oa/data/flow/<date>/dendro-<ip>.json
HDFS tables consumed