# Example Notebook - Using Microsoft Sentinel Search Jobs

[Search in Microsoft Sentinel](https://docs.microsoft.com/azure/sentinel/investigate-large-datasets) is built on top of Search jobs. Search jobs are asynchronous queries that fetch records.<br>
The results are returned to a search table that's created in your Log Analytics workspace after you start the Search job. <br>
The search job uses parallel processing to run the search across long time spans, in extremely large datasets. 

Using [MSTICPy](https://github.com/microsoft/msticpy) you can create Search jobs from a notebook, check when the requested logs are ready and then query the returned data.
In this notebook we take you through an example of doing just this.

## Setup

The first step is to install and configure MSTICPy so that its features are available in the notebook.

In [3]:
from IPython.display import display, HTML

%pip install --upgrade msticpy
display(HTML("<h3>Starting Notebook setup...</h3>"))

import msticpy

msticpy.init_notebook(
    namespace=globals(),
    verbosity=0,
);

The next step is to connect to Microsoft Sentinel. If you have set up a [MSTICPy configuration file](https://msticpy.readthedocs.io/en/latest/getting_started/msticpyconfig.html), you can use the workspace details configured there. Otherwise, you can pass in your workspace details when initializing `MicrosoftSentinel`.

In [4]:
from msticpy.data.azure import MicrosoftSentinel
microsoft_sentinel = MicrosoftSentinel()
microsoft_sentinel.connect()

Once connected we can now start a search with `create_search`. To this function we need to pass in a KQL query to run for the search.<br>
Log queries in a search job are intended to scan very large sets of data. <br>
To support distribution and segmentation, the queries can only search one data source at a time and can only use a subset of KQL, including the operators:
 - where
 - extend
 - project
 - project-away
 - project-keep
 - project-rename
 - project-reorder
 - parse
 - parse-where

More details on the limitations can be found in the [documentation](https://docs.microsoft.com/azure/sentinel/investigate-large-datasets#limitations-of-a-search-job).
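As a rough pre-flight check, the operator list above can be turned into a small validator that flags query stages a search job is unlikely to accept. This is a minimal sketch: `find_unsupported_operators` is a hypothetical helper, not part of MSTICPy. The allowed set contains only the operators named above, which may not be exhaustive, and the split on `|` is naive, so pipes inside string literals will be misread.

```python
import re

# Operators listed above as usable in a search job query.
# The list is introduced with "including", so it may not be exhaustive.
ALLOWED_OPERATORS = {
    "where", "extend", "project", "project-away", "project-keep",
    "project-rename", "project-reorder", "parse", "parse-where",
}


def find_unsupported_operators(query):
    """Return the operators in `query` that are not in ALLOWED_OPERATORS.

    Hypothetical helper: splits on '|' without parsing string literals.
    """
    # The first segment is the table name, so only later stages are checked.
    stages = [stage.strip() for stage in query.split("|")[1:]]
    unsupported = []
    for stage in stages:
        # An operator is the leading run of letters and hyphens in the stage.
        match = re.match(r"[A-Za-z][A-Za-z-]*", stage)
        operator = match.group(0) if match else stage
        if operator not in ALLOWED_OPERATORS:
            unsupported.append(operator)
    return unsupported


ok_query = "SecurityEvent | where * has 'infected.exe'"
bad_query = "SecurityEvent | where EventID == 4688 | summarize count() by Computer"

print(find_unsupported_operators(ok_query))   # []
print(find_unsupported_operators(bad_query))  # ['summarize']
```

An empty list only means that no stage starts with an operator outside the set above. The service remains the final authority on whether a query is accepted.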

In addition to the query we can also provide the following optional parameters:
 - start and end times for the query - if these aren't provided, the search defaults to the last 90 days
 - a name for the search - if not provided, a random GUID is generated
 - a limit on the number of results to return - by default this is 1000

<b>Note:</b> It can take some time to create a search job.

In [6]:
microsoft_sentinel.create_search(query="SecurityEvent | where * has 'infected.exe'", search_name="examplesearch")

Search job created with for examplesearch_SRCH.
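The optional parameters described above can be collected in one place before the search is created. The sketch below is illustrative only: `build_search_args` is a hypothetical helper, and the keyword names `start`, `end` and `limit` are assumptions about the `create_search` signature that should be checked against the installed MSTICPy version, as should its handling of timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone


def build_search_args(query, days_back=90, search_name=None, limit=1000, now=None):
    """Collect keyword arguments for a search job (hypothetical helper).

    The defaults mirror those described above: last 90 days, 1000 results.
    """
    end = now or datetime.now(timezone.utc)
    args = {
        "query": query,
        "start": end - timedelta(days=days_back),
        "end": end,
        "limit": limit,
    }
    # Leave the name out entirely so a GUID can be generated for the search.
    if search_name is not None:
        args["search_name"] = search_name
    return args


search_args = build_search_args(
    "SecurityEvent | where * has 'infected.exe'",
    days_back=30,
    search_name="examplesearch",
    limit=500,
)

# With a connected MicrosoftSentinel instance, and assuming the keyword
# names above are correct, the call would then be:
# microsoft_sentinel.create_search(**search_args)
```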


Once a search job is created it is not immediately ready for querying. It can take some time to run the search and return the data.<br>
We can check the status of our search job by passing our search name to `check_search_status`.<br>
This prints the search job's current status. Once the status is 'Succeeded', the data is ready for querying and the function returns True.

In [7]:
import time
while not microsoft_sentinel.check_search_status("examplesearch"):
    time.sleep(10)

examplesearch_SRCH status is 'Updating'
examplesearch_SRCH status is 'Updating'
examplesearch_SRCH status is 'InProgress'
examplesearch_SRCH status is 'InProgress'
examplesearch_SRCH status is 'Succeeded'
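The loop above polls until the job succeeds, so it never exits if the job fails or stalls. A bounded variant might look like the sketch below. `wait_until_ready` is a hypothetical helper, not an MSTICPy function. It accepts any zero-argument callable that returns True when the data is ready, and a stand-in callable is used here so the example runs without a workspace connection.

```python
import time


def wait_until_ready(check, timeout=600.0, interval=10.0):
    """Poll `check()` until it returns True or `timeout` seconds elapse.

    Returns True if the check succeeded, False if the timeout was reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for the status check: not ready twice, then ready.
_responses = iter([False, False, True])
ready = wait_until_ready(lambda: next(_responses), interval=0)
print(ready)  # True

# Against a real workspace the check would wrap the call used above:
# wait_until_ready(lambda: microsoft_sentinel.check_search_status("examplesearch"))
```

A False result means the timeout passed without the check succeeding, at which point the job status can be inspected manually instead of waiting indefinitely.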


Once the search job is ready we can use MSTICPy's [QueryProvider](https://msticpy.readthedocs.io/en/latest/data_acquisition/DataProv-MSSentinel.html) feature to run a query against the search's dataset and see the results of the search.<br>
The name of the table to query is the name of the search job with _SRCH appended - this is output when you run `create_search` or `check_search_status`.

In [10]:
qry_prov = QueryProvider("MSSentinel")
qry_prov.connect(WorkspaceConfig())
search_results = qry_prov.exec_query("examplesearch_SRCH | take 10")
search_results

Connecting... 

connected


Unnamed: 0,TimeGenerated,SourceSystem,Account,AccountType,Computer,EventSourceName,Channel,Task,Level,EventData,EventID,Activity,PartitionKey,RowKey,StorageAccount,AzureDeploymentID,AzureTableName,AccessList,AccessMask,AccessReason,AccountDomain,AccountExpires,AccountName,AccountSessionIdentifier,AdditionalInfo,...,TemplateOID,TemplateSchemaVersion,TemplateVersion,TokenElevationType,TransmittedServices,UserAccountControl,UserParameters,UserPrincipalName,UserWorkstations,VirtualAccount,VendorIds,Workstation,WorkstationName,SourceComputerId,EventOriginId,MG,TimeCollected,ManagementGroupName,Id,_OriginalType,_OriginalItemId,_OriginalTimeGenerated,TenantId,Type,_ResourceId
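Because the results table is named after the search job with `_SRCH` appended, the table name and a sample query can be derived from the search name instead of being typed by hand. Both functions below are hypothetical helpers written for illustration, not part of MSTICPy.

```python
SEARCH_TABLE_SUFFIX = "_SRCH"


def search_table_name(search_name):
    """Return the results table name for a search job name."""
    # Avoid doubling the suffix if a full table name is passed in.
    if search_name.endswith(SEARCH_TABLE_SUFFIX):
        return search_name
    return search_name + SEARCH_TABLE_SUFFIX


def sample_query(search_name, rows=10):
    """Build a KQL query that takes the first `rows` rows of the results."""
    return f"{search_table_name(search_name)} | take {rows}"


print(sample_query("examplesearch"))  # examplesearch_SRCH | take 10
```

The resulting string can be passed to `qry_prov.exec_query` in the same way as the literal query above.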


Once a search job is complete and the data no longer needed we can delete the job and its associated data. <br>
This can be done with `delete_search` and again passing it the search name. <br>
As with search job creation, the deletion can take some time, but no further action is required once the deletion has started.

In [5]:
microsoft_sentinel.delete_search("examplesearch")


examplesearch_SRCH set for deletion.


More details about these features can be found at:
 - [Microsoft Sentinel Search Jobs](https://docs.microsoft.com/azure/sentinel/investigate-large-datasets)
 - [MSTICPy's Sentinel features](https://msticpy.readthedocs.io/en/latest/data_acquisition/Sentinel.html)