# Root Cause Search in Android Dataset

This notebook presents an example run of the root cause search algorithm. The test uses 10 reference errors within a segment of an Android log dataset comprising 10 million lines and was executed on a MacBook Air M1 with 16 GB of RAM. It is meant to show what the code does and how it behaves on this dataset and hardware.

In [1]:
import sys
sys.path.append("../root_cause")
from search import RootCauseSearch
from settings import SearchStrategy
from settings import SearchSettings
from output import DisplayNotebookOutput

# General Settings

In [2]:
search_strategy = {
    'intersection_occurrences_col': 'content',
    'intersection_col': 'service_template_id',
    'hidden_occurrences_col': 'service_template_id',
    'uniqueness_col': 'content',
    'max_noise': 1,
    'window_seconds': 2
}

general_settings = {
    'dataset_name': 'ba',
    'storage_dir': '../storage',
    'source_csv_file': '../storage/ba.source.csv',
    'drain_config_file': '../drain3.ini',
    'service_filter': ['PowerManagerService', 'HicomNfTimerManager', 'HwAmbientLuxFilterAlgo', 'chatty'],
    'content_filter': ['ApiCount', r'^Date: '],
    'duplicate_filter_col': 'service_template_id',
    'parallel_processing': True,
    'output': DisplayNotebookOutput()
}
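As a rough illustration of what `service_filter` and `content_filter` might do, the sketch below drops rows whose service is listed or whose content matches one of the patterns. The row layout, the `keep_row` helper, and the filter semantics are assumptions made for this example; the actual behavior is defined inside `SearchSettings` and `RootCauseSearch`, which are not shown in this notebook.

```python
import re

# Hypothetical log rows; the real column layout of the CSV is not shown here.
rows = [
    {"service": "PowerManagerService", "content": "Waking up from sleep"},
    {"service": "wpa_supplicant", "content": "ApiCount: 12"},
    {"service": "wpa_supplicant", "content": "Date: 2020-12-02"},
    {"service": "wpa_supplicant", "content": "P2P: group removed"},
]

service_filter = ["PowerManagerService", "HicomNfTimerManager",
                  "HwAmbientLuxFilterAlgo", "chatty"]
content_filter = ["ApiCount", r"^Date: "]


def keep_row(row, services, patterns):
    """Return True if the row passes both filters (assumed semantics)."""
    if row["service"] in services:
        return False
    # Treat every content filter entry as a regular expression.
    return not any(re.search(p, row["content"]) for p in patterns)


kept = [r for r in rows if keep_row(r, service_filter, content_filter)]
print(kept)  # only the "P2P: group removed" row remains
```

Treating each content filter entry as a regular expression is suggested by the raw-string pattern `r'^Date: '` in the settings, but it remains an assumption.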

# Search

In [3]:
search_strategies = [SearchStrategy(**search_strategy)]
search_settings = SearchSettings(strategies=search_strategies, **general_settings)

root_cause = RootCauseSearch(search_settings)

Loading dataset from CSV file and preparing it:
↻ Preparing dataset for template clustering ...
↻ Creating template clusters ...
↻ Assigning the templates to their log messages ...
✓ Dataset loaded and prepared.


## Error 1

In [4]:
rc1 = root_cause.search(542657)

Trying search strategy "content|service_template_id|service_template_id|content|1":
ℹ 2 error occurrences found. They are used to create a intersection of all time windows before the error.
ℹ 109 values in intersection of time windows found.
ℹ 25 error occurrences found. They are used to mark the time windows that are skipped in the uniqueness check for root cause candidates.
✓ 22 lines added to root cause.

Results:

Line: 538728
Timestamp: 2020-12-02 14:31:45.708000
Service: hievent_jni
Template: Write to dev len:<:NUM:>
Content: Write to dev len:283
Found with strategies:
- content|service_template_id|service_template_id|content|1|0

Line: 539398
Timestamp: 2020-12-02 14:31:45.797000
Service: nStackXDMsg
Template: <:*:> <:*:> <:*:> <:*:> = <:NUM:>, <:*:> = <:NUM:>


✅ The root cause can be derived from the returned result: A sudden disconnection from an external device (presumably VR technology) with a screen and input capability led to a failed TCP transmission of media content.
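The search output mentions an "intersection of all time windows before the error". A minimal sketch of that idea, assuming each window covers the `window_seconds` immediately before an error occurrence and that values are compared by equality, could look like this. The function names and the toy log are hypothetical and are not taken from the `root_cause` package.

```python
from datetime import datetime, timedelta


def window_values(log, error_time, window_seconds):
    """Collect the values logged in the window right before one error."""
    start = error_time - timedelta(seconds=window_seconds)
    return {value for ts, value in log if start <= ts < error_time}


def intersect_windows(log, error_times, window_seconds):
    """Keep only values that appear before every error occurrence."""
    windows = [window_values(log, t, window_seconds) for t in error_times]
    return set.intersection(*windows) if windows else set()


base = datetime(2020, 12, 2, 14, 0, 0)
# Toy log of (timestamp, template id) pairs.
log = [
    (base + timedelta(seconds=0.5), "t1"),
    (base + timedelta(seconds=1.0), "t2"),
    (base + timedelta(seconds=10.5), "t1"),
    (base + timedelta(seconds=11.0), "t3"),
]
error_times = [base + timedelta(seconds=2), base + timedelta(seconds=12)]

common = intersect_windows(log, error_times, window_seconds=2)
print(common)  # {'t1'}
```

Only `t1` precedes both error occurrences within two seconds, so it is the only value left in the intersection.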

## Error 2

In [5]:
rc2 = root_cause.search(546260)

Reloading dataset from CSV file:
✓ Dataset loaded.
Trying search strategy "content|service_template_id|service_template_id|content|1":
ℹ 8 error occurrences found. They are used to create a intersection of all time windows before the error.
ℹ 3 values in intersection of time windows found.
ℹ 94 error occurrences found. They are used to mark the time windows that are skipped in the uniqueness check for root cause candidates.
✓ 1 lines added to root cause.

Results:

Line: 546259
Timestamp: 2020-12-02 14:31:48.407000
Service: wpa_supplicant
Template: Notifying <:*:> <:*:> to hidl control: <:NUM:>
Content: Notifying disconnect reason to hidl control: -3
Found with strategies:
- content|service_template_id|service_template_id|content|1|1

Line: 546260
Timestamp: 2020-12-02 14:31:48.407000
Serv

❌ The root cause cannot be derived from the returned result.
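The search output also refers to a "uniqueness check for root cause candidates" that skips the time windows around known error occurrences. One plausible reading, sketched below, is that a candidate is kept only if it appears at most `max_noise` times outside those skipped windows. This interpretation, the `unique_candidates` helper, and the toy data are assumptions for illustration; the package may implement the check differently.

```python
def unique_candidates(log, candidates, skipped_windows, max_noise):
    """Keep candidates seen at most max_noise times outside the skipped windows."""
    kept = []
    for candidate in candidates:
        outside = sum(
            1
            for ts, value in log
            if value == candidate
            and not any(start <= ts < end for start, end in skipped_windows)
        )
        if outside <= max_noise:
            kept.append(candidate)
    return kept


# Toy log of (seconds, template id) pairs.
log = [(0.5, "t1"), (1.0, "t2"), (5.0, "t2"), (6.0, "t2"), (10.5, "t1"), (20.0, "t1")]
skipped_windows = [(0, 2), (10, 12)]  # windows before two error occurrences

result = unique_candidates(log, ["t1", "t2"], skipped_windows, max_noise=1)
print(result)  # ['t1']
```

Here `t2` occurs twice outside the skipped windows and is rejected as too common, while `t1` occurs only once outside them and is kept.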

## Error 3

In [6]:
rc3 = root_cause.search(596489)

Reloading dataset from CSV file:
✓ Dataset loaded.
Trying search strategy "content|service_template_id|service_template_id|content|1":
ℹ 81 error occurrences found. They are used to create a intersection of all time windows before the error.
ℹ 2 values in intersection of time windows found.
ℹ 140 error occurrences found. They are used to mark the time windows that are skipped in the uniqueness check for root cause candidates.
✓ 1 lines added to root cause.

Results:

Line: 596488
Timestamp: 2020-12-02 14:32:29.178000
Service: wpa_supplicant
Template: nl80211: Drv Event <:NUM:> <:*:> received for <:*:>
Content: nl80211: Drv Event 48 (NL80211_CMD_DISCONNECT) received for p2p-p2p0-2
Found with strategies:
- content|service_template_id|service_template_id|content|1|0

Line: 596489
Timestamp: 2020-12-02 1

✅ The root cause can be derived from the returned result: The Wi-Fi connection to a peer-to-peer interface of the mobile phone cannot be established. A brief Google search for the driver event "NL80211_CMD_DISCONNECT" also indicates that the reason is a lack of support for 5 GHz Wi-Fi networks.

## Error 4

In [7]:
rc4 = root_cause.search(596536)

Reloading dataset from CSV file:
✓ Dataset loaded.
Trying search strategy "content|service_template_id|service_template_id|content|1":
ℹ 8 error occurrences found. They are used to create a intersection of all time windows before the error.
ℹ 9 values in intersection of time windows found.
ℹ 94 error occurrences found. They are used to mark the time windows that are skipped in the uniqueness check for root cause candidates.
✓ 5 lines added to root cause.

Results:

Line: 596493
Timestamp: 2020-12-02 14:32:29.178000
Service: wpa_supplicant
Template: P2P: wpas_p2p_group_delete, interface name:p2p-p2p0-<:NUM:>.
Content: P2P: wpas_p2p_group_delete, interface name:p2p-p2p0-2.
Found with strategies:
- content|service_template_id|service_template_id|content|1|0

Line: 596496
Timestamp: 2020-12-02 14:32:29.1

✅ The root cause can be derived from the returned result: The virtual peer-to-peer Wi-Fi interface was deleted, leading to the deauthentication of all connected devices.

## Error 5

In [8]:
rc5 = root_cause.search(620427)

Reloading dataset from CSV file:
✓ Dataset loaded.
Trying search strategy "content|service_template_id|service_template_id|content|1":
ℹ 5 error occurrences found. They are used to create a intersection of all time windows before the error.
ℹ 92 values in intersection of time windows found.
ℹ 25 error occurrences found. They are used to mark the time windows that are skipped in the uniqueness check for root cause candidates.
✓ 12 lines added to root cause.

Results:

Line: 620180
Timestamp: 2020-12-02 14:32:45.554000
Service: HiSight-RTSP-Net
Template: [ NetSessionImpl.java: <:*:> <:*:> error, socket closed, SocketException: java.net.SocketException: <:*:> <:*:>
Content: [ NetSessionImpl.java: 293]rtsp:Receive data error, socket closed, SocketException: java.net.SocketException: Connection reset
Found with strategies:
- content|service_tem

✅ The root cause can be derived from the returned result: A sudden disconnection from an external device (presumably VR technology) with a screen and input capability led to a failed TCP transmission of media content.

## Error 6

In [9]:
rc6 = root_cause.search(623042)

Reloading dataset from CSV file:
✓ Dataset loaded.
Trying search strategy "content|service_template_id|service_template_id|content|1":
ℹ 5 error occurrences found. They are used to create a intersection of all time windows before the error.
ℹ 73 values in intersection of time windows found.
ℹ 94 error occurrences found. They are used to mark the time windows that are skipped in the uniqueness check for root cause candidates.
✓ 7 lines added to root cause.

Results:

Line: 616241
Timestamp: 2020-12-02 14:32:44.572000
Service: DMSDP
Template: <:*:> <:*:> mDeviceType=<:NUM:>, <:*:> <:*:> mBtName='', <:*:> <:*:> mBusinessId=<:NUM:>', mChannelType=<:NUM:>}
Content: DMSDPDeviceManager:checkRemainEventReport device:InnerDMSDPDevice{mDeviceId='CD****35', mDeviceType=7, mDeviceName='', mBluetoothMac='', mBtName='', mLocalIp='19****.1', mRemoteIp='19****13', mBusine

❌ The root cause cannot be derived from the returned result.

## Error 7

In [10]:
rc7 = root_cause.search(643184)

Reloading dataset from CSV file:
✓ Dataset loaded.
Trying search strategy "content|service_template_id|service_template_id|content|1":
ℹ 81 error occurrences found. They are used to create a intersection of all time windows before the error.
ℹ 2 values in intersection of time windows found.
ℹ 140 error occurrences found. They are used to mark the time windows that are skipped in the uniqueness check for root cause candidates.
✓ 1 lines added to root cause.

Results:

Line: 643183
Timestamp: 2020-12-02 14:33:00.639000
Service: wpa_supplicant
Template: nl80211: Drv Event <:NUM:> <:*:> received for <:*:>
Content: nl80211: Drv Event 48 (NL80211_CMD_DISCONNECT) received for p2p-p2p0-4
Found with strategies:
- content|service_template_id|service_template_id|content|1|0

Line: 643184
Timestamp: 2020-12-02 1

✅ The root cause can be derived from the returned result: The Wi-Fi connection to a peer-to-peer interface of the mobile phone cannot be established. A brief Google search for the driver event "NL80211_CMD_DISCONNECT" also indicates that the reason is a lack of support for 5 GHz Wi-Fi networks.

## Error 8

In [11]:
rc8 = root_cause.search(643203)

Reloading dataset from CSV file:
✓ Dataset loaded.
Trying search strategy "content|service_template_id|service_template_id|content|1":
ℹ 4 error occurrences found. They are used to create a intersection of all time windows before the error.
ℹ 15 values in intersection of time windows found.
ℹ 94 error occurrences found. They are used to mark the time windows that are skipped in the uniqueness check for root cause candidates.
✓ 8 lines added to root cause.

Results:

Line: 643188
Timestamp: 2020-12-02 14:33:00.640000
Service: wpa_supplicant
Template: P2P: wpas_p2p_group_delete, interface name:p2p-p2p0-<:NUM:>.
Content: P2P: wpas_p2p_group_delete, interface name:p2p-p2p0-4.
Found with strategies:
- content|service_template_id|service_template_id|content|1|0

Line: 643190
Timestamp: 2020-12-02 14:33:00.

✅ The root cause can be derived from the returned result: The virtual peer-to-peer Wi-Fi interface was deleted, leading to the deauthentication of all connected devices.

## Error 9

In [12]:
rc9 = root_cause.search(884269)

Reloading dataset from CSV file:
✓ Dataset loaded.
Trying search strategy "content|service_template_id|service_template_id|content|1":
ℹ 5 error occurrences found. They are used to create a intersection of all time windows before the error.
ℹ 92 values in intersection of time windows found.
ℹ 25 error occurrences found. They are used to mark the time windows that are skipped in the uniqueness check for root cause candidates.
✓ 13 lines added to root cause.

Results:

Line: 883749
Timestamp: 2020-12-02 14:38:58.018000
Service: HiSight-RTSP-Net
Template: [ NetSessionImpl.java: <:*:> <:*:> error, socket closed, SocketException: java.net.SocketException: <:*:> <:*:>
Content: [ NetSessionImpl.java: 293]rtsp:Receive data error, socket closed, SocketException: java.net.SocketException: Connection reset
Found with strategies:
- content|service_tem

✅ The root cause can be derived from the returned result: A sudden disconnection from an external device (presumably VR technology) with a screen and input capability led to a failed TCP transmission of media content.

## Error 10

In [13]:
rc10 = root_cause.search(889546)

Reloading dataset from CSV file:
✓ Dataset loaded.
Trying search strategy "content|service_template_id|service_template_id|content|1":
ℹ 2 error occurrences found. They are used to create a intersection of all time windows before the error.
ℹ 12 values in intersection of time windows found.
ℹ 94 error occurrences found. They are used to mark the time windows that are skipped in the uniqueness check for root cause candidates.
✓ 2 lines added to root cause.

Results:

Line: 887806
Timestamp: 2020-12-02 14:39:00.022000
Service: WificondControl
Template: Noise: <:NUM:>, Snr: <:NUM:>, Chload: <:NUM:>, rssi: <:NUM:>, txBitrate: <:NUM:>, rxBitrate: <:NUM:>, frequency: <:NUM:>, UlDelay: <:NUM:>, currentTxBytes: <:*:> currentTxPackets: <:NUM:>, currentTxFailed: <:NUM:>, currentRxBytes: <:*:> currentRxPackets: <:NUM:>
Content: Noise: -100, Snr: 24, Chload: 162, rssi: -

❌ The root cause cannot be derived from the returned result.
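
The verdicts recorded for the ten reference errors can be tallied with a few lines of Python. The `outcomes` mapping below is simply transcribed from the ✅ and ❌ marks in this notebook; it is not produced by the `root_cause` package.

```python
# True means the root cause could be derived from the returned result.
outcomes = {
    1: True, 2: False, 3: True, 4: True, 5: True,
    6: False, 7: True, 8: True, 9: True, 10: False,
}

derived = sum(outcomes.values())
success_rate = derived / len(outcomes)
failed = sorted(error for error, ok in outcomes.items() if not ok)

print(f"{derived} of {len(outcomes)} root causes derived ({success_rate:.0%})")
print(f"Not derived for errors: {failed}")
```

With this strategy and these settings, the root cause could be derived for 7 of the 10 reference errors; errors 2, 6, and 10 remain unexplained.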