Data stored in SQL is getting deleted #33

Open · nikhiltalati opened this issue Aug 22, 2022 · 6 comments

nikhiltalati commented Aug 22, 2022
Hello,

I have saved the output to SQL, but I see that data is getting deleted from the tables. Also, no last_run file is being created under Windows. I have scheduled a task to run the script every hour.

Thanks

nikhiltalati commented Aug 23, 2022

Hello,

I am now facing this issue:
Starting run @ 2022-08-23 14:38:48.625658. Content: deque(['Audit.General', 'Audit.AzureActiveDirectory', 'Audit.Exchange', 'Audit.SharePoint', 'DLP.All']).
Traceback (most recent call last):
File "AuditLogCollector.py", line 712, in
File "AuditLogCollector.py", line 71, in run
File "AuditLogCollector.py", line 84, in run_once
File "AuditLogCollector.py", line 125, in receive_results_from_rust_engine
File "AuditLogCollector.py", line 448, in _handle_retrieved_content
TypeError: string indices must be integers
[6256] Failed to execute script 'AuditLogCollector' due to unhandled exception!
thread '<unnamed>' panicked at 'called Result::unwrap() on an Err value: SendError { .. }', src\api_connection.rs:234:57
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
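
For context on the Python side of this trace: "TypeError: string indices must be integers" usually means code that expects a list of dicts received a plain string instead, for example when the API returns an error payload rather than log content. Below is a minimal sketch of that failure mode and a defensive shape check, assuming a JSON response body; the function and field names are illustrative, not the project's actual code.

import json

def handle_retrieved_content(response_text):
    content = json.loads(response_text)
    # Expected payload: a list of log dicts, e.g. [{"Id": "...", ...}, ...].
    # If the API instead returns an error object such as
    # {"error": {"code": ..., "message": ...}}, iterating over that dict
    # yields its keys (plain strings), and log["Id"] then raises
    # "TypeError: string indices must be integers".
    if not isinstance(content, list):
        raise ValueError(f"unexpected payload shape: {content!r}")
    for log in content:
        print(log.get("Id"))

handle_retrieved_content('[{"Id": "abc-123"}]')        # prints abc-123
try:
    handle_retrieved_content('{"error": {"code": 401}}')
except ValueError as e:
    print("rejected:", e)                              # clear error, no TypeError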

ddbnl (Owner) commented Aug 23, 2022

I'll take a look at this issue; it seems to be somewhere in the Rust engine. Could you post your config file so I can try to reproduce this?

Also, just to confirm: are you using the latest release?

nikhiltalati commented Aug 25, 2022

Yes, I am using the latest release. The config file is below:

log:  # Log settings. Debug will severely decrease performance
  path: 'collector.log'
  debug: True
collect:  # Settings determining which audit logs to collect and how to do it
  workingDir: ./  # Directory to save cache files in (known_logs, known_content, last_run). Default is dir where executable is located
  contentTypes:
    Audit.General: True
    Audit.AzureActiveDirectory: False
    Audit.Exchange: False
    Audit.SharePoint: True
    DLP.All: False
  rustEngine: True  # Use False to revert to the old Python engine. If running from python instead of executable, make sure to install the Rust engine python wheel in the RustEngineWheels folder
  schedule: 0 1 0  # How often to run in days/hours/minutes. Program will never exit and run on the schedule. Uncomment to use.
  maxThreads: 50  # Maximum number of simultaneous threads retrieving logs
  globalTimeout: 59  # Number of minutes before the process is forced to exit if still running (0 = no timeout). If you run e.g. every hour you could set this to 59, ensuring there will only be 1 active process.
  retries: 3  # Times to retry retrieving a content blob if it fails
  retryCooldown: 3  # Seconds to wait before retrying retrieving a content blob
  autoSubscribe: True  # Automatically subscribe to collected content types. Never unsubscribes from anything.
  skipKnownLogs: True  # Remember retrieved log ID's, don't collect them twice
  resume: False  # DEPRECATED, recommended to keep 'false'. Remember last run time, resume collecting from there next run
  hoursToCollect: 72  # Look back this many hours for audit logs (can be overwritten by resume)
  filter:  # Only logs that match ALL filters for a content type are collected. Leave empty to collect all
    Audit.General:
    Audit.AzureActiveDirectory:
    Audit.Exchange:
    Audit.SharePoint:
    DLP.All:
output:
  file:  # CSV output
    enabled: False
    separateByContentType: True  # Creates a separate CSV file for each content type, using file name from 'path' as a prefix
    path: 'output.csv'
    separator: ';'
    cacheSize: 500000  # Amount of logs to cache until each CSV commit, larger=faster but eats more memory
  azureLogAnalytics:
    enabled: False
    workspaceId:
    sharedKey:
    maxThreads: 50  # Maximum simultaneous threads sending logs to workspace
  azureTable:  # Provide connection string to executable at runtime with --table-string
    enabled: False
    tableName: AuditLogs  # Name of the table inside the storage account
    maxThreads: 10  # Maximum simultaneous threads sending logs to Table
  azureBlob:  # Write CSV to a blob container. Provide connection string to executable at runtime with --blob-string
    enabled: False
    containerName: AuditLogs  # Name of the container inside storage account
    blobName: AuditLog  # When separateByContentType is true, this is used as file prefix and becomes e.g. AuditLog_AuditExchange.csv
    tempPath: './output'
    separateByContentType: True
    separator: ';'
    cacheSize: 500000  # Amount of logs to cache until each CSV commit, larger=faster but eats more memory
  sql:  # Provide connection string to executable at runtime with --sql-string
    enabled: True
    cacheSize: 500000  # Amount of logs to cache until each SQL commit, larger=faster but eats more memory
    chunkSize: 500  # Amount of rows to write simultaneously to SQL, in most cases just set it as high as your DB allows. COUNT errors = too high
  graylog:
    enabled: False
    address:
    port:
  prtg:
    enabled: False
    channels:
  fluentd:
    enabled: False
    tenantName:
    address:
    port:
The error is now as below, with debug enabled:

Starting new HTTPS connection (1): login.microsoftonline.com:443
https://login.microsoftonline.com:443 "POST /xxxxxxxxxxxxxxxxxx/oauth2/token HTTP/1.1" 200 1510
Logged in
Starting new HTTPS connection (1): manage.office.com:443
https://manage.office.com:443 "GET /api/v1.0/xxxxxxxxxxxxxxxxxxxxxxxxx/activity/feed/subscriptions/list HTTP/1.1" 200 342
Starting run @ 2022-08-25 13:07:38.330209. Content: deque(['Audit.General', 'Audit.SharePoint']).
Exception in thread Thread-4:
Traceback (most recent call last):
File "threading.py", line 932, in _bootstrap_inner
File "threading.py", line 870, in run
File "Interfaces/SqlInterface.py", line 198, in _process_cache
File "pandas/core/frame.py", line 721, in init
File "pandas/core/internals/construction.py", line 519, in nested_data_to_arrays
File "pandas/core/internals/construction.py", line 875, in to_arrays
File "pandas/core/internals/construction.py", line 960, in _list_of_dict_to_arrays
File "pandas/_libs/lib.pyx", line 403, in pandas._libs.lib.fast_unique_multiple_list_gen
File "pandas/core/internals/construction.py", line 958, in
RuntimeError: deque mutated during iteration
Interfaces/SqlInterface.py:101: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
Traceback (most recent call last):
File "AuditLogCollector.py", line 712, in
File "AuditLogCollector.py", line 71, in run
File "AuditLogCollector.py", line 84, in run_once
File "AuditLogCollector.py", line 125, in receive_results_from_rust_engine
File "AuditLogCollector.py", line 448, in _handle_retrieved_content
TypeError: string indices must be integers
[1755938] Failed to execute script 'AuditLogCollector' due to unhandled exception!
thread '<unnamed>' panicked at 'called Result::unwrap() on an Err value: SendError { .. }', src/api_connection.rs:254:57
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
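
"RuntimeError: deque mutated during iteration" points at a thread-safety problem: pandas iterates the cache to build the DataFrame while another thread is still appending to it. Below is a minimal sketch of the race and one way to avoid it, under the assumption that the cache is a shared collections.deque; the names are illustrative, not the project's actual code.

import collections
import threading
import pandas as pd

cache = collections.deque()

def producer():
    # stands in for the retriever threads feeding logs into the cache
    for i in range(200000):
        cache.append({"Id": i, "Workload": "Exchange"})

def process_cache_unsafe():
    # pandas iterates 'cache' to infer columns; if producer() appends
    # concurrently, that iteration raises "deque mutated during iteration"
    return pd.DataFrame(cache)

def process_cache_safe():
    # drain with popleft() (atomic on deques) so pandas sees a stable list;
    # a lock shared with the producer would work as well
    snapshot = [cache.popleft() for _ in range(len(cache))]
    return pd.DataFrame(snapshot)

t = threading.Thread(target=producer)
t.start()
df = process_cache_safe()
t.join()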

ddbnl (Owner) commented Aug 27, 2022

I've located and fixed the crashing issue in the Rust engine.

I'll set up a SQL DB this weekend to try and reproduce that issue. Are you working with an Azure SQL instance or running your own server?

nikhiltalati commented
I have my own server.

nikhiltalati commented

Hello,

Any update on when you will release the patched version?

Thanks
