# Get Started with KQL and Notebooks

__Notebook Version:__ 1.1<br>
__Python Version:__ Python 3.6 (including Python 3.6 - AzureML)<br>
__Required Packages:__ Kqlmagic 0.1.90<br>
__Platforms Supported:__<br>
    -  Azure Notebooks Free Compute
    -  Azure Notebooks DSVM
__Data Source Required:__<br>
    -  Log Analytics - At least one table with data from the last 30 days.
    
### Description
This notebook introduces the basics of using Kusto Query Language (KQL) in Microsoft Sentinel Notebooks.


<font color=red>When you switch between Azure Notebooks Free Compute and Data Science Virtual Machine (DSVM), you may need to select Python version: please select Python 3.6 for Free Compute, and Python 3.6 - AzureML for DSVM.</font>

## Other resources

This notebook provides a very simple introduction to what can be done with KQL and Notebooks. There are many more things that notebooks can do.
For a more comprehensive guide to getting started with Microsoft Sentinel and Notebooks, please refer to the following resources:

[Azure Notebooks Documentation](https://docs.microsoft.com/azure/sentinel/notebooks)

The majority of the pre-built Microsoft Sentinel Notebooks use [msticpy](https://github.com/Microsoft/msticpy), a Python library we have created to support notebook usage. Documentation on this library can be found [here](https://msticpy.readthedocs.io/en/latest/), and the [Configuration guide notebook](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/ConfiguringNotebookEnvironment.ipynb) provides additional support.


For more details on configuring your Azure Notebooks Project review this notebook:
[AzureNotebooks-Configure Python Version](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/HowTos/AzureNotebooks-ConfigurePythonVersion.ipynb)


For help troubleshooting problems with notebooks use this notebook:
[Troubleshooting Microsoft Sentinel Notebooks](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/TroubleShootingNotebooks.ipynb)

This [blog](https://techcommunity.microsoft.com/t5/azure-sentinel/security-investigation-with-azure-sentinel-and-jupyter-notebooks/ba-p/432921) provides a great introduction to Notebooks for security investigation.

### Installation and Imports
At the start of any notebook we need to make sure that we have the requisite packages installed and imported into our notebook environment. This is a Python-based notebook, so we can use [pip](https://pypi.org/project/pip/) to install the packages needed. If you are using the free tier of Azure Notebooks you may need to install these packages every time. If you use a local Jupyter server or a DSVM in Azure Notebooks, you should only need to do this installation once.

In [None]:
!pip install Azure-Sentinel-Utilities --upgrade 
!pip install msticpy --upgrade 

We have also created some utilities that can help you check that you have the required packages installed and the correct version of Python for your notebook enabled. For more details on this checker please refer to https://github.com/Azure/Azure-Sentinel/tree/master/Notebooks/SentinelUtilities/SentinelUtils.

In [None]:
import SentinelUtils
# checking Python version
check = SentinelUtils.version_management.ModuleVersionCheck()
py_check = check.validate_python('3.6.0')
if not py_check.requirement_met:
    print('Please select Python 3.6 or Python 3.6 - AzureML at the upper right corner')
else:
    print('All OK, please continue')

In this notebook we will be using the [Kqlmagic library](https://pypi.org/project/Kqlmagic/) to query data from our Microsoft Sentinel instance, so we need to check that a suitable version of it is installed.

In [None]:
# checking required packages
mods_check = check.validate_installed_modules(['Kqlmagic>=0.1.105'])
for mod_info in mods_check:
    if not mod_info.requirement_met:
        print('Please install {} {}.'.format(mod_info.name, mod_info.required_version))
    else:
        print("All required packages installed. Please continue.")
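
A version requirement such as `Kqlmagic>=0.1.105` has to be compared numerically rather than as text. The sketch below is a minimal illustration of that idea, not the way `SentinelUtils` implements its check. The helper names `version_tuple` and `meets_minimum` are hypothetical, and the code assumes purely numeric dotted versions (no pre-release suffixes).

```python
def version_tuple(version):
    """Turn a dotted version such as '0.1.105' into a tuple of ints: (0, 1, 105)."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, required):
    """Return True when the installed version is at least the required version."""
    return version_tuple(installed) >= version_tuple(required)


# Compared as plain strings, "0.1.90" sorts after "0.1.105" because "9" > "1".
print("0.1.90" >= "0.1.105")

# Compared numerically, 90 is lower than 105, so the requirement is not met.
print(meets_minimum("0.1.90", "0.1.105"))
print(meets_minimum("0.1.105", "0.1.105"))
```

Comparing tuples of integers avoids the trap where a text comparison treats 0.1.90 as newer than 0.1.105.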

If the preceding cell asked you to install certain packages, you can add a cell to do this. Click the + icon at the top of the page to add a new cell below the one you have currently selected. Then, to install a package with pip, type `!pip install <package_name> --upgrade` into the cell and run it. This will install the latest version of the package you specify.
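
If you prefer to install from Python code rather than with a `!pip` line, one common approach is to call pip through the interpreter that is running the notebook kernel, so the package lands in the same environment. The sketch below only builds and prints the command. The helper name `pip_install_command` is hypothetical, and the actual installation step is left commented out.

```python
import sys


def pip_install_command(package_name):
    """Build a pip command that targets the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", "install", package_name, "--upgrade"]


command = pip_install_command("Kqlmagic")
print(" ".join(command))

# To actually run the installation, uncomment the two lines below.
# import subprocess
# subprocess.check_call(command)
```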

Once we have checked that we have all the needed elements installed we can import the modules we are going to be using in this notebook.

In [None]:
import ipywidgets as widgets
from IPython.display import display
import pandas as pd
import numpy as np
from msticpy.nbtools.wsconfig import WorkspaceConfig
from msticpy.data.data_providers import QueryProvider
from msticpy.nbtools import nbwidgets
print("Imports complete.")

Now that we have set up all the elements we need we can look at how to connect to a Microsoft Sentinel workspace, query it and interact with the output.

The first stage of this is to connect to our Microsoft Sentinel workspace. To do this we need to provide the workspace ID and tenant ID of the workspace we wish to connect to. These can be provided either via a [config file](https://msticpy.readthedocs.io/en/latest/getting_started/msticpyconfig.html) or interactively in the cell output.

In [None]:
#Update to WorkspaceConfig(workspace="WORKSPACE_NAME") to get alerts from a Workspace other than your default one.
ws_config = WorkspaceConfig()
try:
    ws_id = ws_config['workspace_id']
    ten_id = ws_config['tenant_id']
    print("Workspace details collected from config file")
except Exception:
    print('Please go to your Log Analytics workspace, copy the workspace ID and/or tenant Id and paste here to enable connection to the workspace and querying of it.')
    ws_id = nbwidgets.GetEnvironmentKey(env_var='WORKSPACE_ID',
                                        prompt='Please enter your Log Analytics Workspace Id:', auto_display=True)
    ten_id = nbwidgets.GetEnvironmentKey(env_var='TENANT_ID',
                                         prompt='Please enter your Log Analytics Tenant Id:', auto_display=True)
    ws_id = ws_id.value
    ten_id = ten_id.value
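
Workspace and tenant IDs are GUIDs, so a quick format check can catch a mistyped or partially pasted value before we try to connect. The sketch below uses only the Python standard library. The helper name `is_guid` is hypothetical, and the sample ID is made up for illustration.

```python
import uuid


def is_guid(value):
    """Return True if value is a GUID in the hyphenated 8-4-4-4-12 form."""
    try:
        candidate = value.strip().lower()
        # uuid.UUID also accepts other spellings (for example without hyphens),
        # so we compare against the canonical form to require hyphens.
        return str(uuid.UUID(candidate)) == candidate
    except (ValueError, AttributeError, TypeError):
        return False


print(is_guid("52b1ab41-869e-4138-9e40-2a4457f09bf0"))  # made-up example ID
print(is_guid("my-workspace"))
```

A check like this could be applied to `ws_id` and `ten_id` after the cell above has run.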

Now that we have collected information on the workspace to connect to, we can go ahead and authenticate using Kqlmagic. This uses a device logon process where you are required to authenticate your device using a set of credentials you provide via a browser window.

In [None]:
# You must run this cell to log into Log Analytics to continue
# Make sure you have 0.1.90 or above, if not, run Kqlmagic installation again
%reload_ext Kqlmagic
%kql loganalytics://code;workspace=ws_id;tenant=ten_id;alias="Sentinel" 

Once we have connected we can run KQL queries against our workspace by using `%kql` followed by our query. Below we get a list of the tables that contain data, along with the number of rows in each.

In [None]:
%kql search * | summarize count() by Type
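
To make clear what `summarize count() by Type` computes, here is a rough plain-Python equivalent that groups rows by their `Type` value and counts each group. The rows are made-up sample data, not output from a real workspace.

```python
from collections import Counter

# Made-up sample rows; a real workspace returns many more columns per row.
rows = [
    {"Type": "SecurityAlert"},
    {"Type": "SigninLogs"},
    {"Type": "SecurityAlert"},
    {"Type": "Heartbeat"},
    {"Type": "SecurityAlert"},
]

# Equivalent of `summarize count() by Type`: one count per distinct Type value.
counts = Counter(row["Type"] for row in rows)

for table_name, row_count in counts.most_common():
    print(f"{table_name}: {row_count}")
```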

Now that we have a list of tables we can build some elements to interact with the data and focus on a specific table. Below we retrieve the list of tables from the workspace schema and display it in a drop-down list, allowing us to select a table to focus on.

In [None]:
dbSchema = %kql --schema "Sentinel@loganalytics"
tables = list(dbSchema.keys())
selected_table = widgets.Dropdown(options=tables, value=tables[1],description='Data Table:')
display(selected_table)

Now that we have selected a table we can collect data from that table and store it in a [Pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html).

In [None]:
%kql {selected_table.value} | where TimeGenerated >= ago(30d) | take 1000
if len(_kql_raw_result_) > 0:
    df = _kql_raw_result_.to_dataframe()
    print(f"Data collected from {selected_table.value}")
else:
    df = None
    print(f'No data found for {selected_table.value} in the last 30 days.')

To further focus our search in the data we can now get a list of columns in our table and select one of them.

In [None]:
columns = list(dbSchema[selected_table.value])
columns.sort()
selected_column = widgets.Dropdown(options=columns,description='Column:')
display(selected_column)

Focusing even further, we can use the features of Pandas to get a list of unique values in the selected column.

In [None]:
if isinstance(df, pd.DataFrame) and not df.empty:
    #Get a unique list of values in our column
    unique_values = df[selected_column.value].replace('', np.nan).dropna().drop_duplicates().sort_values()
    if len(unique_values.index) > 0:
        data_point = widgets.Dropdown(options=unique_values,description='Data value:')
        display(data_point)
else:
    print(f"No data available for {selected_table.value}, please try another table.")
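
The chain of Pandas calls used to build the list of unique values can be easier to follow on a small example. The sketch below applies the same steps to a made-up column of values. It assumes `pandas` and `numpy` are installed, as they are for the rest of this notebook.

```python
import numpy as np
import pandas as pd

# Made-up column values, including an empty string, a missing value and a repeat.
sample = pd.Series(["Medium", "", "High", "Medium", None, "Low"])

unique_values = (
    sample.replace("", np.nan)  # treat empty strings as missing
    .dropna()                   # drop missing values
    .drop_duplicates()          # keep the first occurrence of each value
    .sort_values()              # sort alphabetically for the drop-down
)

print(unique_values.tolist())
```

Empty strings are converted to missing values first so that a single `dropna()` call removes both kinds of blank entry.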

Now that we have selected a table, a column, and a specific data value, we can pass these variables into a new KQL query in order to get just the data we want.

In [None]:
# scope to a table and a column
%kql {selected_table.value} | where {selected_column.value} contains '{data_point.value}' | take 5
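
When a selected value is substituted into a query inside single quotes, a value that itself contains a single quote or a backslash can break the query. The sketch below shows one way to guard against that, assuming KQL's backslash escaping rules for string literals. The helper name `kql_string_literal` is hypothetical and is not part of Kqlmagic or msticpy.

```python
def kql_string_literal(value):
    """Wrap value in single quotes, escaping backslashes and single quotes."""
    # Escape backslashes first so the backslashes added for quotes are not doubled.
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


print(kql_string_literal("O'Brien"))
print(kql_string_literal("C:\\Temp"))
```

The returned text already includes the surrounding quotes, so it would replace the whole `'{data_point.value}'` part of a query.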

MSTICpy also includes a way to query Microsoft Sentinel via KQL that can be simpler than using native Kqlmagic. Below is an example of using MSTICpy to create a query, run it using MSTICpy and return the results in a Pandas DataFrame.

In [None]:
qry_prov = QueryProvider('LogAnalytics')
la_connection_string = f'loganalytics://code().tenant("{ten_id}").workspace("{ws_id}")'
qry_prov.connect(connection_str=la_connection_string)

query = """
    SecurityAlert
    | where TimeGenerated > ago(30d)
    | take 10
    """

qry_prov.exec_query(query)
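
If we want to reuse this query shape for other tables or time windows, the query text can be built from parameters before it is passed to `qry_prov.exec_query`. The sketch below is one minimal way to do that. The helper name `build_recent_rows_query` is hypothetical, and the table name check is a deliberately strict assumption to keep unexpected text out of the query.

```python
import re


def build_recent_rows_query(table, days=30, limit=10):
    """Return a KQL query for up to `limit` rows from the last `days` days."""
    # Only allow simple identifiers as table names.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"Unexpected table name: {table!r}")
    return (
        f"{table}\n"
        f"| where TimeGenerated > ago({int(days)}d)\n"
        f"| take {int(limit)}"
    )


query = build_recent_rows_query("SecurityAlert", days=30, limit=10)
print(query)
```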