# Golden SAML AD FS Mail Access Hunt

#### Dataset
https://securitydatasets.com/notebooks/compound/GoldenSAMLADFSMailAccess.html
- WindowsEvents.Zip
- Microsoft365DefenderEvents.Zip
- AADAuditEvents.Zip
- OfficeActivityEvents.Zip

#### Simple Data Ingestion

All data is unzipped and loaded into two SQLite stores (simulating a SQL-backed lakehouse) using [kestrel-tool::mkdb](https://github.com/opencybersecurityalliance/kestrel-lang/blob/develop/packages/kestrel_tool/src/kestrel_tool/mkdb.py).
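
As a rough illustration of what such a load step involves, the sketch below writes JSON-lines records into a SQLite table using only the Python standard library. It is not the `mkdb` implementation: the function `load_events`, the sample field names, and the idea that the datastore index (`windows`) is used as the table name are all assumptions made for this example.

```python
import json
import sqlite3


def load_events(conn, table, records):
    """Create `table` with one TEXT column per key seen in `records`, then insert the rows.

    Assumes keys contain no double quotes, since they are used as quoted column names.
    """
    # Union of keys across all records, in first-seen order
    columns = []
    for rec in records:
        for key in rec:
            if key not in columns:
                columns.append(key)

    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_defs})')

    # Missing keys become NULL; everything else is stored as text
    placeholders = ", ".join("?" for _ in columns)
    rows = [
        tuple(None if rec.get(c) is None else str(rec[c]) for c in columns)
        for rec in records
    ]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)


# Toy lines standing in for an unzipped log file (hypothetical fields and values)
lines = [
    '{"EventID": 3, "Hostname": "host-a"}',
    '{"EventID": 18, "Hostname": "host-a", "PipeName": "example_pipe"}',
]
records = [json.loads(line) for line in lines]

# An in-memory database keeps the example self-contained;
# a file path such as "onpremise.db" would persist the store instead
conn = sqlite3.connect(":memory:")
inserted = load_events(conn, "windows", records)
```

Storing every value as text keeps the sketch short; a real loader would more likely infer column types and flatten nested fields.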

#### Simulated On-premise Datastore `sqlite:///onpremise.db`

| Data File | Datastore Index | Kestrel Datasource |
| :------- | :------ | :------ |
| WindowsEvents.Zip | windows | GoldenSAML-WindowsEvents |

#### Simulated Cloud Datastore `sqlite:///cloud.db`

| Data File | Datastore Index | Kestrel Datasource |
| :------- | :------ | :------ |
| Microsoft365DefenderEvents.Zip | msdefender | GoldenSAML-Microsoft365DefenderEvents |
| AADAuditEvents.Zip | aad | GoldenSAML-AADAuditEvents |
| OfficeActivityEvents.Zip | office | GoldenSAML-OfficeActivityEvents |

#### Ready-to-Use Analytics Beyond Data Retrieval

- `rare-event-detection`: a Python analytics function that finds rare events in a pool of events. It shows how a hunt step can invoke Turing-complete logic beyond SQL to support computation such as clustering and classification.

- `ask-AI`: a simulated LLM cyber agent that answers questions about specified fields in a Kestrel variable. The answer is added to the Kestrel variable as a new attribute (an unmapped OCSF field), `unmapped.gen_ai`. A real OpenAI-API-based Kestrel analytics example can be found in the [Kestrel analytics repo](https://github.com/opencybersecurityalliance/kestrel-analytics/tree/release/analytics/openai-suspicious-processes).
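
To make the counting idea behind `rare-event-detection` concrete, here is a minimal sketch in plain Python. The parameter names `field` and `threshold` mirror the commented-out `WITH` clause later in this huntbook, but the exact rule used here (keep events whose value occurs fewer than `threshold` times) and the function name `rare_events` are assumptions, not the analytics' actual code.

```python
from collections import Counter


def rare_events(events, field="type_uid", threshold=5):
    """Keep events whose `field` value occurs fewer than `threshold` times (assumed rule)."""
    counts = Counter(event[field] for event in events)
    return [event for event in events if counts[event[field]] < threshold]


# Toy pool: one frequent placeholder type (0) and two types that appear once each
pool = [{"type_uid": 0} for _ in range(6)]
pool.append({"type_uid": 400101})
pool.append({"type_uid": 100114})

rare = rare_events(pool, field="type_uid", threshold=5)
```

With this toy pool, only the two single-occurrence events survive the filter. A frequency count is the simplest possible notion of rarity; clustering or classification, as mentioned above, would replace the `Counter` step.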

## Steps in This Huntbook

1. Start from Windows events to locate suspicious events (already written in this huntbook)
2. Think about potential campaigns, e.g., *GoldenSAML*, based on the most suspicious Windows event
3. Read the [GoldenSAML blog](https://www.cyberark.com/resources/threat-research-blog/golden-saml-newly-discovered-attack-technique-forges-authentication-to-cloud-apps) to understand the generic attack flow
4. Develop the threat hypothesis around the [SimuLand GoldenSAML attack simulation](https://simulandlabs.com/labs/GoldenSAML/README.html)
5. Move across multiple data sources to verify different phases of the attack from multiple angles (already written in this huntbook)
6. **Execute this huntbook and report your findings**
7. Further drill down
    - Add new cells in this huntbook to explore other paths of the hunt
    - Explore other aspects of the attack with any *goldensaml-quiz* huntbook
    - Check the *goldensaml-explain-kestrel* huntbook to get a basic idea of the Kestrel abstraction

In [None]:
# start from zero: detect rare events from a single day's Windows logs

all_events = GET event FROM sqlalchemy://GoldenSAML-WindowsEvents
             WHERE device.hostname LIKE '%'
             START 2021-08-02T00:00:00.000Z STOP 2021-08-03T00:00:00.000Z

In [None]:
DISP all_events ATTR time, type_uid, type_name, device.hostname SORT BY time ASC

In [None]:
rare_events = all_events

# this example analytics (a Python function) keeps only the rare events, identified by counting
APPLY python://rare-event-detection ON rare_events

# more details can be provided to the analytics as parameters
# APPLY python://rare-event-detection ON rare_events WITH field='type_uid', threshold=5

DISP rare_events ATTR time, type_uid, type_name, device.hostname SORT BY time ASC

In [None]:
# let's first explore the (single) network connection event

conn_event = rare_events WHERE type_uid = 400101

src = FIND network_endpoint ORIGINATED conn_event
DISP src

dst = FIND network_endpoint RESPONDED conn_event
DISP dst

In [None]:
# just a connection to localhost:80, nothing suspicious at the current moment

In [None]:
# next let's check the pipe event

pipe_event = rare_events WHERE type_uid = 100114

pipe = FIND file RESPONDED pipe_event
DISP pipe

In [None]:
# leverage an LLM's knowledge to make sense of the named pipe

APPLY python://ask-AI ON pipe WITH prompt='What is the following pipe in Windows?', field='name'

DISP pipe ATTR unmapped.gen_ai
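
The enrichment pattern used by `ask-AI` can be sketched in plain Python as follows. This is an illustration only: `ask_ai`, `canned_agent`, and the sample pipe name are hypothetical, the "agent" merely echoes the question instead of calling a model, and the sketch returns copies rather than updating a Kestrel variable.

```python
def ask_ai(records, prompt, field, answer_fn):
    """Attach answer_fn(prompt, value) to each record under 'unmapped.gen_ai'."""
    enriched = []
    for rec in records:
        new_rec = dict(rec)  # copy so the input records stay untouched
        new_rec["unmapped.gen_ai"] = answer_fn(prompt, rec[field])
        enriched.append(new_rec)
    return enriched


def canned_agent(prompt, value):
    # Stand-in for a model call: echoes the question rather than answering it
    return f"(simulated answer) {prompt} {value}"


pipes = [{"name": "example_pipe"}]  # hypothetical pipe name
enriched = ask_ai(pipes, "What is the following pipe in Windows?", "name", canned_agent)
```

Passing the agent in as `answer_fn` keeps the enrichment step independent of where the answer comes from, so a real model client could be swapped in without touching `ask_ai`.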

In [None]:
# which process created the pipe

pipe_proc = FIND process CREATED pipe

DISP pipe_proc

In [None]:
# this could be a GoldenSAML attack

# now it's time to read
# - https://www.cyberark.com/resources/threat-research-blog/golden-saml-newly-discovered-attack-technique-forges-authentication-to-cloud-apps
# - https://simulandlabs.com/labs/GoldenSAML/README.html

# questions:
# 1. do you find any relation between the pipe event and the connection event?
# 2. do we have more logs of the PowerShell process from other angles/sources?

In [None]:
# let's check the suspicious process across another datasource: Microsoft365Defender

mde_events = GET event FROM sqlalchemy://GoldenSAML-Microsoft365DefenderEvents
             WHERE actor.process.pid = pipe_proc.pid
               AND actor.process.endpoint.hostname = pipe_proc.endpoint.hostname

DISP mde_events ATTR time, type_uid, type_name, device.hostname SORT BY time ASC
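
The pivot in this cell, from a process seen in one datasource to events about the same process in another, can be pictured with the toy Python sketch below. It matches on (pid, hostname) pairs; how Kestrel itself resolves the `pipe_proc` references is not shown in this huntbook, so pairwise matching is an assumption, as are the function name `match_process` and the flat field names.

```python
def match_process(events, procs):
    """Return events whose actor process matches any (pid, hostname) pair in `procs`."""
    keys = {(p["pid"], p["hostname"]) for p in procs}
    return [e for e in events if (e["actor_pid"], e["actor_hostname"]) in keys]


# Hypothetical process found in the first datasource
procs = [{"pid": 1234, "hostname": "host-a"}]

# Hypothetical events from the second datasource
events = [
    {"id": 1, "actor_pid": 1234, "actor_hostname": "host-a"},
    {"id": 2, "actor_pid": 1234, "actor_hostname": "host-b"},  # same pid, different host
    {"id": 3, "actor_pid": 5678, "actor_hostname": "host-a"},  # same host, different pid
]

matched = match_process(events, procs)
```

Only the first event matches. Including the hostname matters because a pid is unique only within one machine, and even then only for the lifetime of the process, so adding a time window would tighten the match further.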

In [None]:
# so what was searched for in the LDAP queries?

queries = FIND query_info RESPONDED mde_events

DISP queries

In [None]:
# we found ________!!!

In [None]:
# looks like the Defender logs have lots of useful information; let's check other activities logged here

mde_all_events = GET event FROM sqlalchemy://GoldenSAML-Microsoft365DefenderEvents
                 WHERE type_name LIKE '%'
                 START 2021-08-02T00:00:00.000Z STOP 2021-08-03T00:00:00.000Z

DISP mde_all_events ATTR time, type_uid, type_name SORT BY time ASC

In [None]:
# attack impact analysis

delegation_event = mde_all_events WHERE type_uid = 300501

DISP delegation_event

In [None]:
mailaccess_event = mde_all_events WHERE type_uid = 300403

accessed_mail = FIND managed_entity RESPONDED mailaccess_event

DISP accessed_mail

In [None]:
# attack impact analysis and confirmation with data from other datasources

aad_events = GET event FROM sqlalchemy://GoldenSAML-AADAuditEvents
             WHERE type_name LIKE '%'
             START 2021-08-02T00:00:00.000Z STOP 2021-08-03T00:00:00.000Z

DISP aad_events ATTR time, type_uid, type_name SORT BY time ASC

In [None]:
DISP aad_events SORT BY time ASC

In [None]:
# attack impact analysis and confirmation with data from other datasources

office_events = GET event FROM sqlalchemy://GoldenSAML-OfficeActivityEvents
                WHERE type_name = mailaccess_event.type_name
                START 2021-08-02T00:00:00.000Z STOP 2021-08-03T00:00:00.000Z

DISP office_events