# CBTH Search, Feed & Watchlist Short Demo

This Jupyter Notebook provides a brief walkthrough of the search, feed, and watchlist functionality in CB ThreatHunter. For a more detailed walkthrough of what is possible via the API, see the other notebooks available in this repo.

## Prerequisites

There are two prerequisites for using this code: first, you need credentials to log into the API for your
Cb PSC organization; and second, you need the `cbapi` bindings to use this Python code directly. If you
want to use another language, or to call the REST API endpoints manually, you won't need to install `cbapi`.

### API Credentials

The first step is to create a connector in your Cb PSC organization. Log into the console and follow the
instructions at https://developer.carbonblack.com/reference/cb-defense/authentication/ to create an `API` type connector.

Once you have your connector, you'll need the following information:

1. URL endpoint (e.g. `defense-prod05.conferdeploy.net`) for the APIs. This is the same URL you would use for the PSC Web UI.
2. Connector ID and API key for the API connector.
3. "Org key": a unique identifier for your org, displayed at the top of the API Keys page.

### Install cbapi

The second step is only needed if you want to run this code directly. This Python code uses the `cbapi`
module. The support for ThreatHunter in `cbapi` is being actively developed in a fork available from
https://github.com/trailofbits/cbapi-python/tree/tob-cbth. To run this code as-is, you need to `git clone`
that repository, check out the `tob-cbth` branch, and install `cbapi` in a virtualenv.

`cbapi` uses a credential file to read the API secret keys. Whenever you write scripts to interact with the
Cb APIs (or any API for that matter) you should **always** keep your API secret keys separate from your script.
If your script is ever exposed, either intentionally (by sharing it), or accidentally, then your API token
could be compromised if it were embedded inside your script.

To learn more about credential files and `cbapi`, see the docs at https://cbapi.readthedocs.io/en/latest/#api-credentials.
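
As a rough illustration of what such a credential file holds, the sketch below parses INI-style text with Python's standard `configparser`. The key names (`url`, `token`, `org_key`, `ssl_verify`) and the `API_KEY/CONNECTOR_ID` token layout are assumptions based on common `cbapi` usage, and every value is a placeholder; check the linked docs for the authoritative format and file location. The `load_profile` helper is hypothetical and is not part of `cbapi`.

```python
import configparser

# Hypothetical credential file contents; all values are placeholders.
# The profile name "devday" mirrors the profile used later in this notebook.
SAMPLE_CREDENTIALS = """
[devday]
url=https://defense-prod05.conferdeploy.net
token=API_KEY/CONNECTOR_ID
org_key=ORG_KEY
ssl_verify=True
"""


def load_profile(text, profile):
    """Parse INI-style credential text and return one profile as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(profile):
        raise KeyError(f"profile {profile!r} not found")
    return dict(parser[profile])


creds = load_profile(SAMPLE_CREDENTIALS, "devday")
print(sorted(creds))  # key names only, never print the secrets themselves
```

Keeping the secrets in a separate file like this means the notebook itself can be shared without exposing them.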

## Documentation

More information on configuring `cbapi`:
https://cbapi.readthedocs.io/en/latest/installation.html

Documentation for the ThreatHunter APIs is now available on the Developer Network website at: https://developer.carbonblack.com/reference/cb-threathunter/

## Setup

First, let's add a paragraph to get the initial configuration of the `cbapi` objects ready to go.
If desired, you can enable debug logging, which prints the underlying REST API calls that are made to the backend.

In [16]:
from cbapi.psc.threathunter import *
# pretty printer for formatting the JSON responses
import pprint
import time

# for debug logging import the logging module, and configure `cbapi` to DEBUG logging level
#import logging
#logging.basicConfig()
#logging.getLogger("cbapi").setLevel(logging.DEBUG)

# The following will fail if you have not yet set up your credentials file
th = CbThreatHunterAPI(profile="devday")  # profile is the name of the config block in your credentials file
orgkey = th.credentials['org_key']  # makes for shorter URLs in future paragraphs

## Search

Now that we are set up, let's take a quick look at how search works for CB ThreatHunter. ThreatHunter is based on a scalable, multi-tenant architecture that is capable of ingesting tens of millions of events per second. Searches are built to scale and perform.
Searches in ThreatHunter are asynchronous. This gives both API and UI users a better experience, since a search can be initiated quickly and its results gathered later and incrementally. Another advantage is that results can be referenced over and over without re-running the search.
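
The submit-then-gather pattern described above can be sketched with a small in-memory stand-in. This is only an illustration of the idea, not how ThreatHunter or `cbapi` implement it: `FakeSearchBackend`, `run_search`, and the record fields are all hypothetical names invented for this example.

```python
import itertools


class FakeSearchBackend:
    """In-memory stand-in for an asynchronous search service (hypothetical)."""

    def __init__(self, records):
        self._records = records
        self._jobs = {}
        self._ids = itertools.count(1)

    def submit(self, predicate):
        """Register a search and return a job id immediately."""
        job_id = next(self._ids)
        matches = [r for r in self._records if predicate(r)]
        self._jobs[job_id] = {"matches": matches, "ready": 0}
        return job_id

    def poll(self, job_id, batch=2):
        """Pretend more results became available; return (ready, total)."""
        job = self._jobs[job_id]
        job["ready"] = min(job["ready"] + batch, len(job["matches"]))
        return job["ready"], len(job["matches"])

    def results(self, job_id, start=0):
        """Return the results that are ready so far, from offset `start`."""
        job = self._jobs[job_id]
        return job["matches"][start:job["ready"]]


def run_search(backend, predicate):
    """Submit once, then gather results incrementally until complete."""
    job_id = backend.submit(predicate)
    collected = []
    while True:
        ready, total = backend.poll(job_id)
        collected.extend(backend.results(job_id, start=len(collected)))
        if ready >= total:
            return job_id, collected


records = [
    {"process_name": "browser_broker.exe", "pid": 1},
    {"process_name": "notepad.exe", "pid": 2},
    {"process_name": "browser_broker.exe", "pid": 3},
    {"process_name": "browser_broker.exe", "pid": 4},
]
backend = FakeSearchBackend(records)
job_id, hits = run_search(backend, lambda r: r["process_name"] == "browser_broker.exe")

# The stored job can be read again without re-running the search.
same_hits = backend.results(job_id)
print(len(hits), hits == same_hits)
```

The point of the design is that the job id, not the query, becomes the handle for the results, so a caller can come back later or page through them at its own pace.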

For the purposes of today's demo, we are going to skip over how to access these APIs directly, either with something like `curl` or by making raw requests using the `cbapi` supporting functions. Instead, we will jump right to the full `cbapi` supported objects.

**NOTE** For more information on the search API, including architectural diagrams, see the [Search Notebook](search_demo.ipynb).

In [15]:
query = th.select(Process).where("process_name:browser_broker.exe")
query_results = list(query) # get all results from this query into a list
pprint.pprint(query_results)

[<cbapi.psc.threathunter.models.Process: id N4LFP2KN-00bbb744-000021a4-000000e9-1d552cd22a2604d> @ https://defense-prod05.conferdeploy.net]


## Results Segmentation
Like Cb Response, each query result actually represents a process "segment", that is, a set of events
associated with a process.
If you issue a default search request, all the segments will be returned, which can produce many duplicate results for long-lived processes.

For more information about process segments, see https://developer.carbonblack.com/reference/enterprise-response/6.1/process-api-changes/#new-immutable-model.

One way around this is to use the `collapse` function available in Solr. In that case, CBTH will try its best to coalesce these results by process when possible, returning only the latest segment.
You might still get duplicates in two cases:
1. If segments come from different indices.
2. If there are too many results for Solr to group in each index (the threshold for this is currently configured at 1M results).

**To create a unique list of process IDs (process_guid) we will create a map ourselves**

In [18]:
unique_processes = {r.process_guid:r for r in query_results}
pprint.pprint(unique_processes)

{'N4LFP2KN-00bbb744-000021a4-000000e9-1d552cd22a2604d': <cbapi.psc.threathunter.models.Process: id N4LFP2KN-00bbb744-000021a4-000000e9-1d552cd22a2604d> @ https://defense-prod05.conferdeploy.net}
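
The dict comprehension above keeps whichever segment happens to come last in the result list for each `process_guid`. If you want to keep the most recent segment explicitly, a minimal sketch is shown below. It uses plain dicts as hypothetical stand-ins for result objects and assumes every segment carries a `backend_timestamp` in the same ISO 8601 format seen in the process output, so that string comparison orders them correctly. The GUIDs are made up for the example.

```python
def latest_segment_per_process(segments):
    """Map each process_guid to its segment with the newest backend_timestamp."""
    latest = {}
    for segment in segments:
        guid = segment["process_guid"]
        current = latest.get(guid)
        # Same-format ISO 8601 UTC strings sort chronologically as plain strings.
        if current is None or segment["backend_timestamp"] > current["backend_timestamp"]:
            latest[guid] = segment
    return latest


# Hypothetical segments: two for process "A", one for process "B".
segments = [
    {"process_guid": "A", "backend_timestamp": "2019-09-17T20:01:02.004Z"},
    {"process_guid": "B", "backend_timestamp": "2019-09-17T19:30:00.000Z"},
    {"process_guid": "A", "backend_timestamp": "2019-09-17T19:59:54.959Z"},
]

newest = latest_segment_per_process(segments)
print(sorted(newest))
```

With real result objects you would read the same fields as attributes (for example `r.process_guid`) rather than as dict keys.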


Now let's take a deeper look at one of the process objects to see what sort of information has been retrieved.

In [20]:
interesting_process_guid='N4LFP2KN-00bbb744-000021a4-000000e9-1d552cd22a2604d'
process = unique_processes[interesting_process_guid]
print(process)

Process object, bound to https://defense-prod05.conferdeploy.net.
-------------------------------------------------------------------------------

       backend_timestamp: 2019-09-17T20:01:02.004Z
         childproc_count: 0
         crossproc_count: 0
      device_external_ip: 
               device_id: 12302148
      device_internal_ip: 96.234.213.61
             device_name: pdftestspectacular
               device_os: WINDOWS
        device_policy_id: 12202
        device_timestamp: 2019-09-17T19:59:54.959Z
                enriched: True
     enriched_event_type: NETWORK
       event_description: The application "<share><link hash="609a3a73a98...
                event_id: b8cfbaa3d98511e9b0507b3efec0350c
              event_type: netconn
           filemod_count: 0
             index_class: default
                  legacy: True
           modload_count: 0
           netconn_count: 0
                  org_id: N4LFP2KN
            partition_id: 0
         process_cmdline: ['C:\\Win

## Feeds and Watchlists

We are now going to transition over to setting up some example reports, watchlists and feeds that are related to the searches that we just performed.

**NOTE** For more detailed information on the feeds and watchlists data model, please take a look at the [Watchlist Demo](watchlist_demo.ipynb).

The first thing that we will do is create a new **query based report**.

In [59]:
report_dict = {
    "id":"randomidentifier",
    "timestamp": int(time.time()),
    "link": "https://devday2019.carbonblack.com/notepadReport",
    "title": "Notepad spawning processes",
    "description": "Detected an instance of notepad.exe creating one or more child processes",
    "severity": 8,
    "iocs_v2": [
        {
            "id":"notepad_child_proc",
            "match_type":"query",
            "values":["process_name:notepad.exe childproc_count:[1 TO *]"],
            "link": "https://devday2019.carbonblack.com/notepadReport"
        }
    ]
}

report_obj = th.create(Report,report_dict)
report_obj.save_watchlist()
print(report_obj)

DEBUG:cbapi.connection:Sending HTTP POST /threathunter/watchlistmgr/v3/orgs/N4LFP2KN/reports with {"description": "Detected an instance of notepad.exe creating one or more child processes", "id": "randomidentifier", "iocs_v2": [{"id": "notepad_child_proc", "link": "https://devday2019.carbonblack.com/notepadReport", "match_type": "query", "values": ["process_name:notepad.exe childproc_count:[1 TO *]"]}], "link": "https://devday2019.carbonblack.com/notepadReport", "severity": 8, "timestamp": 1569529469, "title": "Notepad spawning processes"}
DEBUG:cbapi.connection:HTTP POST /threathunter/watchlistmgr/v3/orgs/N4LFP2KN/reports took 0.244s (response 200)


Report object, bound to https://defense-prod05.conferdeploy.net.
-------------------------------------------------------------------------------

             description: Detected an instance of notepad.exe creating on...
                      id: OnwHYUMISdmR0KAGET994g
                    iocs: None
                 iocs_v2: [{'id': 'notepad_child_proc', 'match_type': 'qu...
                    link: https://devday2019.carbonblack.com/notepadReport
                severity: 8
                    tags: None
               timestamp: 1569529469
                   title: Notepad spawning processes
              visibility: None


Now that we have a report, we can connect it to a visible watchlist. The watchlist will run this report against data as it comes into the system.

In [23]:
ts = int(time.time())
watchlist_dict = {
    "create_timestamp": ts,
    "last_update_timestamp":ts,
    "name": "DevDay Test Watchlist",
    "description": "Pretty cool, its a watchlist",
    "tags_enabled": True,
    "alerts_enabled": False,
    "report_ids": [report_obj.id]
}

watchlist_obj = th.create(Watchlist,watchlist_dict)
watchlist_obj.save()

<cbapi.psc.threathunter.models.Watchlist: id FhQUSoh6S0G0WGqyh1UxjA> @ https://defense-prod05.conferdeploy.net

And let's take a look to make sure the report got added correctly.

In [70]:
for report in watchlist_obj.reports:
    print(report)

DEBUG:cbapi.connection:HTTP GET /threathunter/watchlistmgr/v3/orgs/N4LFP2KN/reports/OnwHYUMISdmR0KAGET994g took 0.094s (response 200)


Report object, bound to https://defense-prod05.conferdeploy.net.
-------------------------------------------------------------------------------

             description: Detected an instance of notepad.exe creating on...
                      id: OnwHYUMISdmR0KAGET994g
                    iocs: None
                 iocs_v2: [{'id': 'notepad_child_proc', 'match_type': 'qu...
                    link: https://devday2019.carbonblack.com/notepadReport
                severity: 8
                    tags: None
               timestamp: 1569529469
                   title: Notepad spawning processes
              visibility: None


## Create an Ingress Based Report

Ingress reports are evaluated as the event data is processed and indexed. Each one must be created against a single field, but each field supports all of the "match types" that are available in the query based reports.

In [77]:
ingress_report_dict = {
    "id":"randomidentifier",
    "timestamp": int(time.time()),
    "link": "https://devday2019.carbonblack.com/notepadReport",
    "title": "Misspelled Notepad",
    "description": "Detected an instance of suspicious notepad.exe",
    "severity": 8,
    "iocs_v2": [
        {
            "id":"notepad_name_ioc",
            "match_type":"regex",
            "field":"process_name",
            "values": [".+/notep@d.exe"],
            "link": "https://devday2019.carbonblack.com/notNotepad"
        }
    ]
}
ingress_report_obj = th.create(Report,ingress_report_dict)
ingress_report_obj.save_watchlist()
print(ingress_report_obj)

DEBUG:cbapi.connection:Sending HTTP POST /threathunter/watchlistmgr/v3/orgs/N4LFP2KN/reports with {"description": "Detected an instance of notepad.exe creating one or more child processes", "id": "randomidentifier", "iocs_v2": [{"field": "process_name", "id": "notepad_name_ioc", "link": "https://devday2019.carbonblack.com/notNotepad", "match_type": "regex", "values": [".+/notep@d.exe"]}], "link": "https://devday2019.carbonblack.com/notepadReport", "severity": 8, "timestamp": 1569532167, "title": "Notepad spawning processes"}
DEBUG:cbapi.connection:HTTP POST /threathunter/watchlistmgr/v3/orgs/N4LFP2KN/reports took 0.133s (response 200)


Report object, bound to https://defense-prod05.conferdeploy.net.
-------------------------------------------------------------------------------

             description: Detected an instance of notepad.exe creating on...
                      id: a275KJ2zQaCqPyJwp9QB0Q
                    iocs: None
                 iocs_v2: [{'id': 'notepad_name_ioc', 'match_type': 'rege...
                    link: https://devday2019.carbonblack.com/notepadReport
                severity: 8
                    tags: None
               timestamp: 1569532167
                   title: Notepad spawning processes
              visibility: None
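
Before saving a regex IOC, it can help to try the pattern locally. The sketch below uses Python's `re` module against made-up process paths. It assumes the backend's regex dialect and full-match behaviour are close enough to Python's for a sanity check, which this notebook does not confirm. The `matching_names` helper and the sample paths are hypothetical.

```python
import re


def matching_names(pattern, process_names):
    """Return the names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [name for name in process_names if compiled.fullmatch(name)]


# Made-up sample values, written with forward slashes to follow the pattern above.
samples = [
    "c:/windows/system32/notep@d.exe",
    "c:/windows/system32/notepad.exe",
    "c:/temp/notep@d_exe",
]

# The pattern as used in the report: the unescaped "." matches any character.
loose = matching_names(r".+/notep@d.exe", samples)

# Escaping the dot restricts the match to a literal ".exe" suffix.
strict = matching_names(r".+/notep@d\.exe", samples)

print(loose)
print(strict)
```

In this local test the unescaped pattern also accepts `notep@d_exe`, so escaping the dot is worth considering if only a literal `.exe` suffix should match.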


Now we are going to update our watchlist to include our ingress report as well. Note that currently with `cbapi` you need to call `update`, which overwrites the existing reports that are tied to this watchlist.
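
Because the update replaces the whole list, one way to avoid dropping a report by accident is to merge the current IDs with the new ones before calling it. The helper below is a hypothetical convenience, not part of `cbapi`. The usage shown in the trailing comment assumes the watchlist object exposes its current IDs as `report_ids`, which is inferred from the request payload rather than confirmed here.

```python
def merged_report_ids(existing_ids, new_ids):
    """Return existing ids followed by any unseen new ids, preserving order."""
    merged = list(existing_ids)
    seen = set(merged)
    for report_id in new_ids:
        if report_id not in seen:
            merged.append(report_id)
            seen.add(report_id)
    return merged


# Placeholder ids for illustration only.
current = ["report-1", "report-2"]
combined = merged_report_ids(current, ["report-2", "report-3"])
print(combined)

# Possible usage with the objects in this notebook (attribute name is an assumption):
# watchlist_obj.update(
#     report_ids=merged_report_ids(watchlist_obj.report_ids, [ingress_report_obj.id])
# )
```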

In [78]:
watchlist_obj.id
watchlist_obj.update(report_ids=[report_obj.id,ingress_report_obj.id])
for report in watchlist_obj.reports:
    print(report)

DEBUG:cbapi.connection:Sending HTTP PUT /threathunter/watchlistmgr/v3/orgs/N4LFP2KN/watchlists/FhQUSoh6S0G0WGqyh1UxjA with {"alerts_enabled": false, "classifier": null, "create_timestamp": 1569524235, "description": "Pretty cool, its a watchlist", "id": "FhQUSoh6S0G0WGqyh1UxjA", "last_update_timestamp": 1569530336, "name": "DevDay Test Watchlist", "report_ids": ["vAhoXP36ST6e4guEgChWeA", "a275KJ2zQaCqPyJwp9QB0Q"], "tags_enabled": true}
DEBUG:cbapi.connection:HTTP PUT /threathunter/watchlistmgr/v3/orgs/N4LFP2KN/watchlists/FhQUSoh6S0G0WGqyh1UxjA took 0.397s (response 200)
DEBUG:cbapi.connection:HTTP GET /threathunter/watchlistmgr/v3/orgs/N4LFP2KN/reports/a275KJ2zQaCqPyJwp9QB0Q took 0.144s (response 200)
DEBUG:cbapi.connection:HTTP GET /threathunter/watchlistmgr/v3/orgs/N4LFP2KN/reports/a275KJ2zQaCqPyJwp9QB0Q took 0.079s (response 200)


Report object, bound to https://defense-prod05.conferdeploy.net.
-------------------------------------------------------------------------------

             description: Detected an instance of notepad.exe creating on...
                      id: a275KJ2zQaCqPyJwp9QB0Q
                    iocs: None
                 iocs_v2: [{'id': 'notepad_name_ioc', 'match_type': 'rege...
                    link: https://devday2019.carbonblack.com/notepadReport
                severity: 8
                    tags: None
               timestamp: 1569532167
                   title: Notepad spawning processes
              visibility: None
Report object, bound to https://defense-prod05.conferdeploy.net.
-------------------------------------------------------------------------------

             description: Detected an instance of notepad.exe creating on...
                      id: a275KJ2zQaCqPyJwp9QB0Q
                    iocs: None
                 iocs_v2: [{'id': 'notepad_name_ioc', 'match_t