Ianhelle/velociraptor provider 2023 05 19 (#668)
* Adding Velociraptor provider for local logs

* Format of cluster name has changed in new KustoClient. Fixing test cases to allow for old and new format.

* Minor updates for DataProv-Velociraptor.rst

* Fixing comments in PR.

Fixed bug in azure_kusto_driver and test_azure_kusto_driver
Fixed some doc references.

* Adding acknowledgement of Blue Team Village data
ianhelle committed Jul 3, 2023
1 parent 9316568 commit 2908083
Showing 24 changed files with 607 additions and 6 deletions.
1 change: 1 addition & 0 deletions docs/source/DataAcquisition.rst
@@ -29,6 +29,7 @@ Individual Data Environments
   data_acquisition/DataProv-Kusto-New
   data_acquisition/DataProv-Cybereason
   data_acquisition/DataProv-OSQuery
   data_acquisition/DataProv-Velociraptor


Built-in Data Queries
@@ -0,0 +1,7 @@
msticpy.data.drivers.local\_velociraptor\_driver module
=======================================================

.. automodule:: msticpy.data.drivers.local_velociraptor_driver
   :members:
   :undoc-members:
   :show-inheritance:
1 change: 1 addition & 0 deletions docs/source/api/msticpy.data.drivers.rst
@@ -21,6 +21,7 @@ Submodules
   msticpy.data.drivers.kusto_driver
   msticpy.data.drivers.local_data_driver
   msticpy.data.drivers.local_osquery_driver
   msticpy.data.drivers.local_velociraptor_driver
   msticpy.data.drivers.mdatp_driver
   msticpy.data.drivers.mordor_driver
   msticpy.data.drivers.odata_driver
7 changes: 7 additions & 0 deletions docs/source/api/msticpy.init.mp_plugins.rst
@@ -0,0 +1,7 @@
msticpy.init.mp\_plugins module
===============================

.. automodule:: msticpy.init.mp_plugins
   :members:
   :undoc-members:
   :show-inheritance:
4 changes: 2 additions & 2 deletions docs/source/data_acquisition/DataProv-OSQuery.rst
@@ -7,7 +7,7 @@ The ``OSQuery`` data provider can read OSQuery log files
and provide convenient query functions for each OSQuery "table"
(or event type) contained in the logs.

The provide can read in one or more log files, or multiple log files
The provider can read in one or more log files, or multiple log files
in multiple folders. The files are read, converted to pandas
DataFrames and grouped by table/event. In addition, date fields
within the data are converted to pandas Timestamp format.
@@ -16,7 +16,7 @@ within the data are converted to pandas Timestamp format.
    qry_prov = mp.QueryProvider("OSQueryLogs", data_paths=["~/my_logs"])
    qry_prov.connect()
    df_processes = qry_prov.processes()
    df_processes = qry_prov.os_query.processes()
The query provider query functions will ignore parameters and do
no further filtering. You can use pandas to do additional filtering
167 changes: 167 additions & 0 deletions docs/source/data_acquisition/DataProv-Velociraptor.rst
@@ -0,0 +1,167 @@
The Velociraptor provider
=========================

:py:mod:`Velociraptor driver documentation<msticpy.data.drivers.local_velociraptor_driver>`

The ``Velociraptor`` data provider can read Velociraptor
offline collection log files (see
`Velociraptor Offline Collection <https://docs.velociraptor.app/docs/offline_triage/#offline-collections>`__)
and provide convenient query functions for each data set
in the output logs.

The provider can read files from one or more hosts, stored in
separate folders. The files are read, converted to pandas
DataFrames and grouped by table/event. Multiple log files of the
same type (when reading in data from multiple hosts) are concatenated
into a single DataFrame.

.. code:: ipython3

    import msticpy as mp

    qry_prov = mp.QueryProvider("Velociraptor", data_paths=["~/my_logs"])
    qry_prov.connect()
    df_processes = qry_prov.velociraptor.Windows_Forensics_ProcessInfo()

The provider's query functions ignore any parameters and do
no further filtering. You can use pandas to do additional filtering
and sorting of the data, or use it directly with other MSTICPy
functionality.
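As a rough illustration of the pandas post-filtering mentioned above, the sketch below filters and sorts a small, hand-built DataFrame. The rows are hypothetical stand-ins for the output of ``Windows_Forensics_ProcessInfo()``; only the column names ``Name``, ``Pid`` and ``CommandLine`` are taken from the example output in this document.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by
# qry_prov.velociraptor.Windows_Forensics_ProcessInfo()
df_processes = pd.DataFrame(
    {
        "Name": ["LogonUI.exe", "dwm.exe", "svchost.exe", "svchost.exe"],
        "Pid": [804, 848, 872, 912],
        "CommandLine": [
            '"LogonUI.exe" /flags:0x2',
            '"dwm.exe"',
            r"C:\Windows\System32\svchost.exe -k termsvcs",
            r"C:\Windows\System32\svchost.exe -k LocalServiceNetworkRestricted",
        ],
    }
)

# Keep only svchost processes, sorted by process ID (highest first)
svchost_df = (
    df_processes[df_processes["Name"].str.lower() == "svchost.exe"]
    .sort_values("Pid", ascending=False)
    .reset_index(drop=True)
)
print(svchost_df[["Name", "Pid"]])
```

Because the provider returns the full table, any filtering of this kind happens in memory after the logs have been read.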

.. note:: The examples used in this document are from data
   provided by Blue Team Village at Defcon 30. You can find
   this data at the
   `Project-Obsidian-DC30 GitHub <https://github.com/blueteamvillage/Project-Obsidian-DC30>`__
   repository and read more about
   `Project Obsidian <https://media.blueteamvillage.org/DC30/Obsidian/>`__.

Velociraptor Configuration
--------------------------

You can (optionally) store the log folder paths in *msticpyconfig.yaml*,
instead of supplying the ``data_paths`` parameter to
the ``QueryProvider`` class.

For more information on using and configuring *msticpyconfig.yaml* see
:doc:`msticpy Package Configuration <../getting_started/msticpyconfig>`
and :doc:`MSTICPy Settings Editor<../getting_started/SettingsEditor>`.

The Velociraptor settings in the file should look like the following:

.. code:: yaml

    DataProviders:
        ...
        Velociraptor:
            data_paths:
                - /home/user1/sample_data
                - /home/shared/sample_data

Expected log file format
------------------------

The log file must be a text file of JSON records, one record per line.
An example is shown below:

.. parsed-literal::

    {"Pid":1664,"Ppid":540,"Name":"spoolsv.exe","Path":"C:\\Windows\\System32\\spoolsv.exe","CommandLine":"C:\\Windows\\System32\\spoolsv.exe","Hash":{"MD5":"c111e3d38c71808a8289b0e49db40c96","SHA1":"e56df979d776fe9e8c3b84e6fef8559d6811898d","SHA256":"0ed0c6f4ddc620039f05719d783585d69f03d950be97b49149d4addf23609902"},"Username":"NT AUTHORITY\\SYSTEM","Authenticode":{"Filename":"C:\\Windows\\System32\\spoolsv.exe","ProgramName":"Microsoft Windows","PublisherLink":null,"MoreInfoLink":"http://www.microsoft.com/windows","SerialNumber":"33000002ed2c45e4c145cf48440000000002ed","IssuerName":"C=US, ST=Washington, L=Redmond, O=Microsoft Corporation, CN=Microsoft Windows Production PCA 2011","SubjectName":"C=US, ST=Washington, L=Redmond, O=Microsoft Corporation, CN=Microsoft Windows","Timestamp":null,"Trusted":"trusted","_ExtraInfo":{"Catalog":"C:\\Windows\\system32\\CatRoot\\{F750E6C3-38EE-11D1-85E5-00C04FC295EE}\\Package_6350_for_KB5007192~31bf3856ad364e35~amd64~~10.0.1.8.cat"}},"Family":"IPv4","Type":"TCP","Status":"LISTEN","Laddr.IP":"0.0.0.0","Laddr.Port":49697,"Raddr.IP":"0.0.0.0","Raddr.Port":0,"Timestamp":"2022-02-12T19:35:45Z"}
    {"Pid":548,"Ppid":416,"Name":"lsass.exe","Path":"C:\\Windows\\System32\\lsass.exe","CommandLine":"C:\\Windows\\system32\\lsass.exe","Hash":{"MD5":"93212fd52a9cd5addad2fd2a779355d2","SHA1":"49a814f72292082a1cfdf602b5e4689b0f942703","SHA256":"95888daefd187fac9c979387f75ff3628548e7ddf5d70ad489cf996b9cad7193"},"Username":"NT AUTHORITY\\SYSTEM","Authenticode":{"Filename":"C:\\Windows\\System32\\lsass.exe","ProgramName":"Microsoft Windows","PublisherLink":null,"MoreInfoLink":"http://www.microsoft.com/windows","SerialNumber":"33000002f49e469c54137b85e00000000002f4","IssuerName":"C=US, ST=Washington, L=Redmond, O=Microsoft Corporation, CN=Microsoft Windows Production PCA 2011","SubjectName":"C=US, ST=Washington, L=Redmond, O=Microsoft Corporation, CN=Microsoft Windows Publisher","Timestamp":null,"Trusted":"trusted","_ExtraInfo":null},"Family":"IPv4","Type":"TCP","Status":"LISTEN","Laddr.IP":"0.0.0.0","Laddr.Port":49722,"Raddr.IP":"0.0.0.0","Raddr.Port":0,"Timestamp":"2022-02-12T19:35:54Z"}
    {"Pid":540,"Ppid":416,"Name":"services.exe","Path":"C:\\Windows\\System32\\services.exe","CommandLine":"C:\\Windows\\system32\\services.exe","Hash":{"MD5":"fefc26105685c70d7260170489b5b520","SHA1":"d9b2cb9bf9d4789636b5fcdef0fdbb9d8bc0fb52","SHA256":"930f44f9a599937bdb23cf0c7ea4d158991b837d2a0975c15686cdd4198808e8"},"Username":"NT AUTHORITY\\SYSTEM","Authenticode":{"Filename":"C:\\Windows\\System32\\services.exe","ProgramName":"Microsoft Windows","PublisherLink":null,"MoreInfoLink":"http://www.microsoft.com/windows","SerialNumber":"33000002a5e1a081b7c895c0ed0000000002a5","IssuerName":"C=US, ST=Washington, L=Redmond, O=Microsoft Corporation, CN=Microsoft Windows Production PCA 2011","SubjectName":"C=US, ST=Washington, L=Redmond, O=Microsoft Corporation, CN=Microsoft Windows Publisher","Timestamp":null,"Trusted":"trusted","_ExtraInfo":null},"Family":"IPv4","Type":"TCP","Status":"LISTEN","Laddr.IP":"0.0.0.0","Laddr.Port":49728,"Raddr.IP":"0.0.0.0","Raddr.Port":0,"Timestamp":"2022-02-12T19:35:57Z"}

The top-level keys in each JSON record are used to create the pandas DataFrame columns.
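To make the record-to-column mapping concrete, here is a minimal sketch that parses two shortened records with the standard ``json`` module and builds a DataFrame from them. This is an illustration only and is not necessarily how the driver itself reads the files; ``sample_log`` is a hypothetical variable holding trimmed versions of the records shown above.

```python
import json

import pandas as pd

# Two shortened records in the same one-JSON-object-per-line layout
# (hypothetical sample, trimmed from the example above)
sample_log = (
    '{"Pid": 1664, "Ppid": 540, "Name": "spoolsv.exe", '
    '"Hash": {"MD5": "c111e3d38c71808a8289b0e49db40c96"}, '
    '"Timestamp": "2022-02-12T19:35:45Z"}\n'
    '{"Pid": 548, "Ppid": 416, "Name": "lsass.exe", '
    '"Hash": {"MD5": "93212fd52a9cd5addad2fd2a779355d2"}, '
    '"Timestamp": "2022-02-12T19:35:54Z"}\n'
)

# Parse each non-empty line as one JSON record
records = [json.loads(line) for line in sample_log.splitlines() if line.strip()]

# Top-level keys become DataFrame columns
df = pd.DataFrame(records)
print(list(df.columns))

# Nested objects such as "Hash" stay as Python dicts inside a single column
print(df.loc[0, "Hash"]["MD5"])
```

In this sketch nested objects are left as dictionaries; ``pandas.json_normalize`` is one way to flatten them into separate columns if that is more convenient.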


Using the Velociraptor provider
-------------------------------

To use the Velociraptor provider you need to create a ``QueryProvider``
instance, passing the string "VelociraptorLogs" (or its alias "Velociraptor")
as the ``data_environment`` parameter. If you have not configured
``data_paths`` in msticpyconfig.yaml, you also need to add the ``data_paths``
parameter to specify the folders or files that you want to read.

.. code:: ipython3

    qry_prov = mp.QueryProvider("VelociraptorLogs", data_paths=["~/my_logs"])

Calling the ``connect`` method triggers the provider to register the paths of the
log files to be read (although the log files are not read and parsed
until the related query is run - see below).

.. code:: ipython3

    qry_prov.connect()

Listing Velociraptor tables
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Until you run ``connect`` no queries will be available. After running
``connect`` you can list the available queries using the ``list_queries``
method.

.. code:: ipython3

    qry_prov.list_queries()

.. parsed-literal::

    ['velociraptor.Custom_Windows_NetBIOS',
     'velociraptor.Custom_Windows_Patches',
     'velociraptor.Custom_Windows_Sysinternals_PSInfo',
     'velociraptor.Custom_Windows_Sysinternals_PSLoggedOn',
     'velociraptor.Custom_Windows_System_Services',
     'velociraptor.Windows_Applications_Chrome_Cookies',
     'velociraptor.Windows_Applications_Chrome_Extensions',
     'velociraptor.Windows_Applications_Chrome_History',
     'velociraptor.Windows_Applications_Edge_History',
     'velociraptor.Windows_Forensics_Lnk',
     'velociraptor.Windows_Forensics_Prefetch',
     'velociraptor.Windows_Forensics_ProcessInfo',
     'velociraptor.Windows_Forensics_Usn',
     ...]

Querying Velociraptor table schema
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The schema of the log tables is built by sampling the first record
from each log file type, so is relatively fast to retrieve even
if you have large numbers and sizes of logs.

.. code:: ipython3

    qry_prov.schema["Windows_Network_InterfaceAddresses"]

.. parsed-literal::

    {'Index': 'int64',
     'MTU': 'int64',
     'Name': 'object',
     'HardwareAddr': 'object',
     'Flags': 'int64',
     'IP': 'object',
     'Mask': 'object'}
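The first-record sampling described above can be approximated with plain pandas. The following is a sketch under stated assumptions, not the driver's actual code: ``sample_log`` is a hypothetical in-memory file with invented values, and the column names are borrowed from the schema output shown in this section.

```python
import io
import json

import pandas as pd

# Hypothetical log content: one JSON record per line
sample_log = io.StringIO(
    '{"Index": 1, "MTU": 1500, "Name": "Ethernet0", "IP": "10.0.0.5"}\n'
    '{"Index": 2, "MTU": 1500, "Name": "Loopback", "IP": "127.0.0.1"}\n'
)

# Read only the first line instead of parsing the whole file
first_record = json.loads(sample_log.readline())

# Build a one-row DataFrame and report each column's dtype as a string
schema = pd.DataFrame([first_record]).dtypes.astype(str).to_dict()
print(schema)
```

Reading a single line keeps the cost independent of file size, which is why this approach stays fast for large logs. The trade-off is that a field missing from the first record will not appear in the sampled schema.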
Running a Velociraptor query
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Each query returns a pandas DataFrame retrieved
from the logs of that type (potentially containing records from
multiple hosts depending on the ``data_paths`` you specified).

.. code:: ipython3

    qry_prov.velociraptor.Windows_Forensics_ProcessInfo()

==== =========== ================ ===== =============================== ================================================================ ==================== ===================================
..   Name        PebBaseAddress   Pid   ImagePathName                   CommandLine                                                      CurrentDirectory     Env
==== =========== ================ ===== =============================== ================================================================ ==================== ===================================
10   LogonUI.exe 0x95bd3d2000     804   C:\Windows\system32\LogonUI.exe "LogonUI.exe" /flags:0x2 /state0:0xa3b92855 /state1:0x41c64e6d   C:\Windows\system32\ {'ALLUSERSPROFILE': 'C:\\ProgramD..
11   dwm.exe     0x6cf4351000     848   C:\Windows\system32\dwm.exe     "dwm.exe"                                                        C:\Windows\system32\ {'ALLUSERSPROFILE': 'C:\\ProgramD..
12   svchost.exe 0x6cd64d000      872   C:\Windows\System32\svchost.exe C:\Windows\System32\svchost.exe -k termsvcs                      C:\Windows\system32\ {'ALLUSERSPROFILE': 'C:\\ProgramD..
13   svchost.exe 0x7d18e99000     912   C:\Windows\System32\svchost.exe C:\Windows\System32\svchost.exe -k LocalServiceNetworkRestricted C:\Windows\system32\ {'ALLUSERSPROFILE': 'C:\\ProgramD..
14   svchost.exe 0x5c762eb000     920   C:\Windows\system32\svchost.exe C:\Windows\system32\svchost.exe -k LocalService                  C:\Windows\system32\ {'ALLUSERSPROFILE': 'C:\\ProgramD..
==== =========== ================ ===== =============================== ================================================================ ==================== ===================================
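When ``data_paths`` covers several hosts, logs of the same type end up in a single DataFrame. A minimal sketch of that kind of concatenation with plain pandas is shown below. The two per-host frames are hypothetical, and the ``SourceHost`` column is an illustrative addition, not something the provider is documented to create.

```python
import pandas as pd

# Hypothetical per-host results for the same log type
host_a = pd.DataFrame({"Name": ["lsass.exe"], "Pid": [548]})
host_b = pd.DataFrame({"Name": ["services.exe", "spoolsv.exe"], "Pid": [540, 1664]})

# Tag each frame with an illustrative host label, then stack the rows
combined = pd.concat(
    [host_a.assign(SourceHost="host-a"), host_b.assign(SourceHost="host-b")],
    ignore_index=True,
)
print(combined)
```

Using ``ignore_index=True`` gives the combined frame a fresh, continuous index rather than repeating each host's row numbers.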
6 changes: 3 additions & 3 deletions docs/source/data_acquisition/DataProviders.rst
@@ -85,7 +85,7 @@ would only use this parameter if you were building your own
data driver backend, which is not common.

2. You can choose to import additional queries from a custom
query directory (see `Creating new queries`_ for more
query directory (see :doc:`../extending/Queries` for more
details) with:

.. code:: ipython3
@@ -494,7 +494,7 @@ for Timedelta in the
.. warning:: There are some important caveats to this feature.

1. It currently only works with pre-defined queries (including ones
that you may create and add yourself, see `Creating new queries`_
that you may create and add yourself, see :doc:`../extending/Queries`
below). It does not work with `Running an ad hoc query`_
2. If the query contains joins, the joins will only happen within
the time ranges of each subquery.
@@ -512,7 +512,7 @@ Dynamically adding new queries
You can use the :py:meth:`msticpy.data.core.data_providers.QueryProvider.add_query`
method to add parameterized queries from a notebook or script. This
lets you use temporary parameterized queries without having to
add them to a YAML file (as described in `Creating new queries`_).
add them to a YAML file (as described in :doc:`../extending/Queries`).

get_host_events

4 changes: 4 additions & 0 deletions msticpy/data/core/query_defns.py
@@ -91,6 +91,7 @@ class DataEnvironment(Enum):
    AzureSentinel = 1  # alias of LogAnalytics
    LogAnalytics = 1
    Kusto = 2
    AzureDataExplorer = 2  # alias of Kusto
    AzureSecurityCenter = 3
    MSGraph = 4
    SecurityGraph = 4
@@ -106,8 +107,11 @@ class DataEnvironment(Enum):
    Cybereason = 12
    Elastic = 14
    OSQueryLogs = 15
    OSQuery = 15
    MSSentinel_New = 16
    Kusto_New = 17
    VelociraptorLogs = 18
    Velociraptor = 18

    @classmethod
    def parse(cls, value: Union[str, int]) -> "DataEnvironment":
4 changes: 4 additions & 0 deletions msticpy/data/drivers/__init__.py
@@ -34,6 +34,10 @@
    DataEnvironment.Elastic: ("elastic_driver", "ElasticDriver"),
    DataEnvironment.MSSentinel_New: ("azure_monitor_driver", "AzureMonitorDriver"),
    DataEnvironment.Kusto_New: ("azure_kusto_driver", "AzureKustoDriver"),
    DataEnvironment.VelociraptorLogs: (
        "local_velociraptor_driver",
        "VelociraptorLogDriver",
    ),
}

CUSTOM_PROVIDERS: Dict[str, type] = {}
15 changes: 14 additions & 1 deletion msticpy/data/drivers/azure_kusto_driver.py
@@ -162,7 +162,7 @@ def __init__(self, connection_str: Optional[str] = None, **kwargs):
        self._strict_query_match = kwargs.get("strict_query_match", False)
        self._kusto_settings: Dict[str, Dict[str, KustoConfig]] = _get_kusto_settings()
        self._default_database: Optional[str] = None
        self.current_connection: Optional[str] = connection_str
        self._current_connection: Optional[str] = connection_str
        self._current_config: Optional[KustoConfig] = None
        self.client: Optional[KustoClient] = None
        self._az_auth_types: Optional[List[str]] = None
@@ -189,6 +189,18 @@ def _set_public_attribs(self):
            "set_database": self.set_database,
        }

    @property
    def current_connection(self) -> Optional[str]:
        """Return current connection string or URI."""
        if self._current_connection:
            return self._current_connection
        return self.cluster_uri

    @current_connection.setter
    def current_connection(self, value: str):
        """Set current connection string or URI."""
        self._current_connection = value

    @property
    def cluster_uri(self) -> str:
        """Return current cluster URI."""
@@ -318,6 +330,7 @@ def connect(self, connection_str: Optional[str] = None, **kwargs):
            kusto_cs = self._get_connection_string_for_cluster(self._current_config)
        else:
            logger.info("Using connection string %s", connection_str)
            self._current_connection = connection_str
            kusto_cs = connection_str

        self.client = KustoClient(kusto_cs)