# Automation and integration

We have a running Lookyloo instance and can use it to correlate captures, and parts of captures, with each other. That is not enough, for two reasons:

1. Manually feeding URLs to Lookyloo through the web interface is nice, but it doesn't scale when we have a large number of URLs to process.
2. We have a lot of indicators in a Lookyloo instance, but we want to share them with colleagues and partners, and correlate them with other data sources.

In this section, we will use the APIs of Lookyloo, and discover the MISP API in order to make our work more efficient.

Note that what we're doing here is pretty much the best-case scenario: Python modules exist for both services, so we do not have to reverse engineer the protocol of either service and then write a Python module ourselves.

## Requirements

* Access to a MISP instance
  
  **Recommendation**: [MISP virtual machine](https://www.misp-project.org/download/#virtual-images), on Virtualbox, running locally.

* Access to a Lookyloo instance

  **Recommendation**: Lookyloo installed locally following [this procedure](https://www.lookyloo.eu/docs/main/install-lookyloo.html)

* Python 3.8+ development environment

  **Recommendation**: This Jupyter notebook, running locally on Ubuntu 22.04 or more recent, will do the trick.
  
# Automation 

## Lookyloo / PyLookyloo

[PyLookyloo](https://github.com/Lookyloo/pylookyloo) is a very simple python module that makes it easier to interact with a Lookyloo instance.

**Task**: use it to automatically push a batch of URLs to a Lookyloo instance. The list of URLs can come from [phishtank](http://phishtank.org/phish_search.php?valid=y&active=y&Search=Search), or from other sources you know about.
You can also get a list of URLs from the spam folder of your mailbox, for example.

The code below is an example from the PyLookyloo [example directory](https://github.com/Lookyloo/PyLookyloo/blob/main/examples/enqueue_list.py). You can use it directly, and improve it later on to fit your needs.

**Note**: you need the Python module `pylookyloo`. If you're running the notebook as recommended using `poetry`, the dependency is already installed. Otherwise, you can install it by starting a shell in the virtual environment and installing it with `pip`:

```bash
poetry shell
pip install pylookyloo
```
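
To confirm that the module is importable from the kernel the notebook runs in, a quick check along these lines can help. This is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of any of the modules used here.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if the current interpreter can locate the module."""
    return find_spec(module_name) is not None


# pylookyloo is needed now, pymisp later in this notebook
for name in ("pylookyloo", "pymisp"):
    print(name, "available" if is_installed(name) else "missing")
```

If a module is reported as missing, the kernel is probably not running inside the virtual environment where you installed it.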

Note the path to the list file in the example below, which you may want to change. The URL points to a local instance, so you may want to change that too.

**Important**: do not start by pushing hundreds of URLs; 5 to 10 will be enough at first.
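
One way to keep the first run small is to cap the number of URLs read from the list file. The helper below is a sketch under that assumption; `load_urls` and its `limit` parameter are hypothetical names, not part of PyLookyloo.

```python
from pathlib import Path
from typing import List


def load_urls(list_file: Path, limit: int = 10) -> List[str]:
    """Read one URL per line, skip blanks and duplicates, keep at most `limit`."""
    seen = {}  # a dict preserves insertion order, so the file order is kept
    with list_file.open() as f:
        for line in f:
            url = line.strip()
            if url:
                seen.setdefault(url, None)
    return list(seen)[:limit]
```

Calling `load_urls(Path('list.txt'), limit=5)` would then return at most five distinct URLs, in the order they appear in the file.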

In [None]:
from pathlib import Path
from sys import exit

import requests

from pylookyloo import Lookyloo

"""
Get all the URLs from a file, check if they are still working (return code <400), push them to lookyloo.
"""

list_urls_file = Path('list.txt')
lookyloo_url = "https://127.0.0.1:5100/"


if not list_urls_file.exists():
    print(list_urls_file, 'does not exist')
    exit()

with list_urls_file.open() as f:
    # Strip whitespace, skip blank lines, and deduplicate
    urls = {line.strip() for line in f if line.strip()}

print('To process:', len(urls))

lookyloo = Lookyloo(lookyloo_url)

for url in urls:
    try:
        print(url)
        # Skip URLs that are down or return an HTTP error (status >= 400)
        response = requests.head(url, allow_redirects=True, timeout=3)
        response.raise_for_status()
        # With quiet=True, enqueue returns the capture UUID instead of the full permaurl
        uuid = lookyloo.enqueue(url, listing=True, quiet=True)
        print(f'Enqueued: {url} - UUID: {uuid}')
    except Exception as e:
        print(f"{url} is down: {e}")


When the script is done running, have a look at the captures on the Lookyloo instance and make sure everything worked as expected: at least some of the captures should be present (we skip the dead URLs). Find more URLs if needed.

Keep in mind that the captures are processed by the async module sequentially, and each will take 30s to a minute to finish, so it may take a bit of time before all the URLs you want to capture are done.
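
Because captures finish asynchronously, a script usually has to poll until they are done. The sketch below shows one way to do that with a timeout. `wait_for_captures` is a hypothetical helper, and the status check is passed in as a callable so the function does not depend on any specific client.

```python
import time
from typing import Callable, Dict, Iterable


def wait_for_captures(uuids: Iterable[str],
                      is_done: Callable[[str], bool],
                      interval: float = 10.0,
                      timeout: float = 600.0) -> Dict[str, bool]:
    """Poll until every capture is done or the timeout expires.

    Returns a mapping UUID -> True (done) or False (still pending at timeout).
    """
    done = {uuid: False for uuid in uuids}
    deadline = time.monotonic() + timeout
    while True:
        # Only ask again for the captures that were not done last time
        for uuid, finished in done.items():
            if not finished:
                done[uuid] = is_done(uuid)
        if all(done.values()) or time.monotonic() >= deadline:
            return done
        time.sleep(interval)
```

With PyLookyloo, `is_done` could be `lambda u: lookyloo.get_status(u)['status_code'] == 1`, which is the check used in the Phishtank example later in this notebook.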

The complete documentation for the Lookyloo API is available on [the demo instance](https://lookyloo.circl.lu/doc/).

# MISP / PyMISP / Configuration

Lookyloo v1.4.0 officially [supports MISP](https://github.com/Lookyloo/lookyloo/releases/tag/v1.4.0) in a few different ways:
1. Export in MISP JSON format on `http://127.0.0.1:5100/json/<tree_UUID>/misp_export` - can be used by anyone with access to the platform
2. Push to a preconfigured MISP instance **with** user interaction - requires authentication
3. Push to a preconfigured MISP instance **without** user interaction - requires authentication


The 1st option doesn't require any configuration and can be used out of the box as soon as you have a capture. But none of it is automated, so we will look at it last.
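
For reference, the export URL of the 1st option can be built and fetched with the standard library alone. This is a sketch that only assumes the URL pattern listed above; `misp_export_url` and `fetch_misp_export` are hypothetical helper names, and the fetch needs a reachable instance and an existing capture.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def misp_export_url(lookyloo_url: str, tree_uuid: str) -> str:
    """Build the export URL following the pattern /json/<tree_UUID>/misp_export."""
    base = lookyloo_url if lookyloo_url.endswith("/") else lookyloo_url + "/"
    return urljoin(base, f"json/{tree_uuid}/misp_export")


def fetch_misp_export(lookyloo_url: str, tree_uuid: str):
    """Download the export and parse the JSON body."""
    with urlopen(misp_export_url(lookyloo_url, tree_uuid), timeout=10) as response:
        return json.load(response)


# Only builds the URL, no network access needed
print(misp_export_url("http://127.0.0.1:5100", "0e41b66f-11f1-46fa-82c2-1f7a0e4e3fba"))
```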

The next two options require some configuration. The following assumes you have access to a system where a Lookyloo instance is running, and access to a MISP instance (web interface):
1. Configure the [authentication on Lookyloo](https://www.lookyloo.eu/docs/main/lookyloo-auth.html)
2. Create a [dedicated user on MISP](https://www.circl.lu/doc/misp/administration/#adding-a-new-user) - the user should be set up [this way](https://www.lookyloo.eu/docs/main/lookyloo-integration.html#_recommended_setup_on_misp_side)
3. Configure the [MISP module on Lookyloo](https://www.lookyloo.eu/docs/main/lookyloo-integration.html#_misp) accordingly

## Integration with user interaction via the UI

In order to test the setup, we will start with the 2nd approach: pushing a capture to MISP **with** user interaction:

1. Initialize an authenticated session on `http://127.0.0.1:5100/login`, and enter the login and password you configured
2. Open a capture and you should see an entry `Prepare push to MISP` in the menu on the left, click on it
     
     If the entry is missing, it is either because you're not authenticated, the MISP module isn't enabled, or the MISP instance is unreachable (look in the logs).

3. Select tags if needed, push to MISP
4. Look at the event on MISP

## Integration without user interaction via PyLookyloo

If you want to push a large number of captures to a MISP instance, doing it by hand through the UI does not scale. You will want to automate it, and that is where PyLookyloo comes in:

In [None]:
from pylookyloo import Lookyloo
import json

lookyloo_url = "http://127.0.0.1:5100"
                                                                  
lookyloo = Lookyloo(lookyloo_url)
lookyloo.init_apikey(username='admin', password='hacklu2023')
event = lookyloo.misp_push('0e41b66f-11f1-46fa-82c2-1f7a0e4e3fba')

print(json.dumps(event, indent=2))

As on the web interface, you need to be authenticated in order to push an event to MISP, and the authenticated calls to Lookyloo require an authentication key that will be passed in the headers.

To make your life easier, the method `init_apikey` will take care of initializing the proper things and allow you to do the authenticated calls.

**Task**: write a script that pushes to MISP all the captures with more than 1 redirect (`get_redirects` in PyLookyloo will help you there). You probably want to reuse the initial script and merge it with the one above.
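
As a starting point for the task, the filtering step can be written independently of the network calls. The sketch below **assumes** that `get_redirects` returns a dictionary shaped like `{'response': {'url': ..., 'redirects': [...]}}`. That shape is an assumption, so inspect a real response from your instance and adapt the keys if it differs. `redirect_count` and `captures_to_push` are hypothetical helper names.

```python
from typing import Any, Callable, Dict, Iterable, List


def redirect_count(payload: Dict[str, Any]) -> int:
    """Count the redirects in a payload shaped like {'response': {'redirects': [...]}}.

    ASSUMPTION: the payload shape is a guess, check it against a real response.
    Anything that does not match the expected shape counts as zero redirects.
    """
    response = payload.get("response")
    if not isinstance(response, dict):
        return 0
    return len(response.get("redirects") or [])


def captures_to_push(uuids: Iterable[str],
                     get_redirects: Callable[[str], Dict[str, Any]],
                     min_redirects: int = 2) -> List[str]:
    """Keep the UUIDs whose capture has at least `min_redirects` redirects."""
    return [u for u in uuids if redirect_count(get_redirects(u)) >= min_redirects]


# Hypothetical usage with the objects from the cells above:
# for uuid in captures_to_push(uuids, lookyloo.get_redirects):
#     print(lookyloo.misp_push(uuid))
```

"More than 1 redirect" translates to `min_redirects=2` here. Passing `get_redirects` as a callable keeps the filter easy to try out with canned data before pointing it at a live instance.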

## Manual integration

With many services, you either need to do the integration manually, or you may want more flexibility than what the built-in integration allows.

In that case, you will do something along these lines:

In [None]:
from pylookyloo import Lookyloo
from pymisp import MISPEvent, PyMISP
import json
                                                                   
lookyloo_url = "http://127.0.0.1:5100"
                                                                  
lookyloo = Lookyloo(lookyloo_url)
e = lookyloo.misp_export('6ae2afdc-4d90-41ce-9cae-510daf1e6577')

event = MISPEvent()
event.load(e[0])
event.add_attribute('text', 'This is my event, I changed it.')

misp = PyMISP(url="https://127.0.0.1:8443", key="d6OmdDFvU3Seau3UjwvHS1y3tFQbaRNhJhDX0tjh", ssl=False)
new_event = misp.add_event(event, pythonify=True)

print(new_event.objects)

**Notes**:
* Doesn't require authentication on the Lookyloo side
* Allows you to modify the data before you push it to MISP

**Tasks**:
* Look at the [PyMISP documentation](https://pymisp.readthedocs.io/en/latest/index.html)
* Modify the event you just got from Lookyloo with more details. 

**Example 1:** The URL you captured comes from a mail
1. Create a [MISPObject of type email](https://github.com/MISP/PyMISP/blob/main/examples/add_email_object.py)
2. Attach it to the event
3. Push the whole thing to MISP

In [None]:
import re
from pymisp.tools import EMailObject

email_obj = EMailObject('email.txt')

lookyloo = Lookyloo(lookyloo_url)

uuids = []
for url in re.findall(r'(https?://\S+)', email_obj.get_attributes_by_relation('email-body')[0].value):
    try:
        print(url)
        response = requests.head(url, allow_redirects=True, timeout=3)
        response.raise_for_status()
        uuid = lookyloo.enqueue(url, listing=True, quiet=True)
        uuids.append(uuid)
        print(f'Enqueued: {uuid}')
    except Exception as e:
        print(f"{url} is down: {e}")
    

In [None]:
uuids

In [None]:
events = []

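for uuid in uuids:
    print(uuid)
    # misp_export returns a list of events (the earlier example loads e[0]),
    # so flatten the results into a single list of events
    events.extend(lookyloo.misp_export(uuid))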

master_event = MISPEvent()
master_event.info = 'Master event from mail'

for e in events: 
    event = MISPEvent()
    event.load(e)
    print(event.info)
    for a in event.attributes:
        print(a)
        master_event.add_attribute(**a)
    for o in event.objects:
        master_event.add_object(**o)

master_event.add_object(**email_obj)

In [None]:
misp = PyMISP(url="https://127.0.0.1:8443", key="d6OmdDFvU3Seau3UjwvHS1y3tFQbaRNhJhDX0tjh", ssl=False)
new_event = misp.add_event(master_event, pythonify=True)

**Example 2:** Do the same thing, but with multiple emails, and multiple URLs per email
1. Export emails from your mailbox
2. extract one or more URL(s) from the mails
3. capture them with lookyloo
4. get the event(s) from Lookyloo
5. (if necessary) merge the events 
6. attach the email to the final event
7. push it to MISP
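
Steps 1 and 2 can be sketched with the standard library alone. The function below parses a raw email and collects the URLs found in its text parts. `urls_from_email` and the regular expression are illustrative choices, not part of PyMISP or PyLookyloo, and real-world mails (HTML-only bodies, obfuscated links) may need more care.

```python
import re
from email import message_from_string, policy
from typing import List

# Deliberately simple pattern: http(s) URLs up to whitespace, quotes, or angle brackets
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def urls_from_email(raw_email: str) -> List[str]:
    """Return the unique URLs found in the text parts of a raw email, in order."""
    message = message_from_string(raw_email, policy=policy.default)
    found: List[str] = []
    for part in message.walk():
        # Skip multipart containers and attachments such as images
        if part.get_content_maintype() != "text":
            continue
        for url in URL_RE.findall(part.get_content()):
            url = url.rstrip(".,;:)")  # drop trailing punctuation from the sentence
            if url not in found:
                found.append(url)
    return found
```

Looping over exported `.eml` files and calling `urls_from_email(path.read_text(errors='replace'))` on each one gives the per-email URL lists to enqueue in step 3.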

# Phishtank Lookup Integration

[Phishtank Lookup](https://phishtankapi.circl.lu/) is a simple tool to query the current phishing URLs on [Phishtank](https://phishtank.org/).

In [None]:
import requests
import time

from pylookyloo import Lookyloo
from pymisp import MISPEvent
from pyphishtanklookup import PhishtankLookup

lookyloo_url = 'http://127.0.0.1:5100'

lookyloo = Lookyloo(lookyloo_url)
lookyloo.init_apikey(username='admin', password='hacklu2023')

phishtank = PhishtankLookup()
urls_in_Luxembourg = phishtank.get_urls_by_cc('LU')

uuids = {}
for url in urls_in_Luxembourg:
    try:
        print(url)
        response = requests.head(url, allow_redirects=True, timeout=3)
        response.raise_for_status()
        uuid = lookyloo.enqueue(url, listing=True, quiet=True)
        uuids[uuid] = False
        print(f'Enqueued: {url} - UUID: {uuid}')
        break  # stop after the first URL that is up; remove this line to enqueue them all
    except Exception as e:
        print(f"{url} is down: {e}")

# all the URLs are enqueued
        
while not all(uuids.values()):
    print(uuids)
    for uuid in uuids.keys():
        uuids[uuid] = lookyloo.get_status(uuid)['status_code'] == 1
    time.sleep(10)
    
# All the captures are done

for uuid in uuids.keys():
    misp_event = lookyloo.misp_push(uuid)
    if 'error' in misp_event:
        print(uuid, misp_event)
    else:
        for event in misp_event:
            me = MISPEvent()
            me.from_json(event)
            print(uuid, me.info, me.id)
    

# Static file analysis with Pandora

Currently missing features in the python module: https://github.com/pandora-analysis/pypandora/issues/33

Your organisation and/or friends receive all kinds of random files, generally by email, from trusted or untrusted sources, and they either have to or want to open them.

In practice, most of the time they simply want to see what's in the file, and a screenshot will do.

In [None]:
from pypandora import PyPandora

p = PyPandora("https://pandora-demo.yoyodyne-it.eu")

In [None]:
p.is_up

In [None]:
p.init_apikey('admin', 'hacklu2023')

In [None]:
p.apikey

In [None]:
p.submit_from_disk("./BigPicture.png", seed_expire=3600)

In [None]:
p.get_stats()

# PyLacus / LacusCore

* **PyLacus**: Use this library to submit a capture to a Lacus webservice

In [None]:
import time

from pylacus import PyLacus, CaptureStatus, CaptureResponse

lacus = PyLacus("https://lacus-demo.yoyodyne-it.eu/")
if lacus.is_up:
    print("Lacus is up and running")

uuid = lacus.enqueue(url="wort.lu")

while lacus.get_capture_status(uuid) != CaptureStatus.DONE:
    print(f"Capture {uuid} not done yet.")
    time.sleep(15)

print(f"Capture {uuid} done.")

result = lacus.get_capture(uuid)

In [None]:
result.keys()

* **LacusCore**: Use this library to trigger the capture directly, without using the Lacus web service

As LacusCore triggers the capture directly, you need to run the commands below to install the required dependencies and browsers:

```bash
poetry run playwright install-deps
poetry run playwright install
```

You will also need a Redis instance running locally. It can either be installed via `apt install redis`, or cloned from the repository and installed manually. The code below assumes the default hostname and port; change them accordingly.

It also expects a Tor proxy to be running; feel free to remove the `tor_proxy` line if you don't need it.
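
Before running the code below, it can save some debugging time to check that both services are actually listening. This is a minimal sketch using plain TCP connections. `port_open` is a hypothetical helper, and the ports are the Redis default and the Tor SOCKS port used in the code below.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Default Redis port and the Tor SOCKS port from the tor_proxy setting
for name, port in (("redis", 6379), ("tor", 9050)):
    print(name, "reachable" if port_open("127.0.0.1", port) else "not reachable")
```

A successful connection only shows that something is listening on the port, not that it is the expected service.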

In [None]:
import time

from redis import Redis

from lacuscore import LacusCore, CaptureStatus, CaptureResponse

redis = Redis('127.0.0.1', 6379)

lacus = LacusCore(redis, 
                  tor_proxy="socks5://127.0.0.1:9050", 
                  max_capture_time=300, 
                  only_global_lookups=False)
uuid = lacus.enqueue(url="rtl.lu")

# This loop can run in another process
consumed = False
for capture_task in lacus.consume_queue(10):
    consumed = True
    print(f'Waiting for Task {capture_task.get_name()} to finish.')
    await capture_task
    print(f'Task {capture_task.get_name()} done.')
if not consumed:
    # A for/else branch runs after every loop that ends without a break, so track it explicitly
    print('Nothing to consume, the capture might already be cached.')

# If the loop above runs in another process, you need to check the status

while lacus.get_capture_status(uuid) != CaptureStatus.DONE:
    print(f"Capture {uuid} not done yet.")
    time.sleep(15)
print(f"Capture {uuid} done (from status call).")

result = lacus.get_capture(uuid)

In [None]:
result.keys()