# Common OSINT Model Examples
This notebook shows how to use the common model. In order to do this "live", you need access to BinaryEdge, Censys
and/or Shodan. You can also use data in the [test_data](test_data) directory.

**Note:** The modules used in this notebook are not installed by default with common-osint-model because they are not
direct project requirements. To run all of the example code, you need to install the following with pip:
  - shodan
  - pybinaryedge
  - censys

If you don't have Jupyter Notebooks installed, you also need to install the module `jupyterlab`.
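
Before running the cells, it can help to check which of the optional modules are importable in the current environment. The following is a minimal sketch using only the standard library. The helper name `missing_modules` is made up for this example, and the import names are assumed to match the package names listed above.

```python
from importlib.util import find_spec

# Assumption: the import names equal the pip package names listed above.
OPTIONAL_MODULES = ["shodan", "pybinaryedge", "censys", "jupyterlab"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(OPTIONAL_MODULES)
if missing:
    print(f"Missing modules, install with: pip install {' '.join(missing)}")
else:
    print("All optional modules are available.")
```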

# Demo with preloaded data files
This section uses the data files in the `test_data` directory and converts the host data into common model objects to
show some of the features.

The following snippet initializes the host objects with data originally loaded from the specific APIs and saved as JSON
files.

In [4]:
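import json
from pathlib import Path
from common_osint_model import Host

# Each file holds the raw API response for the same host, saved as JSON.
# Path.read_text() opens and closes the file, so no file handle is left open.
binaryedge_host = Host.from_binaryedge(json.loads(Path("test_data/140.82.121.4_binaryedge.json").read_text()))
shodan_host = Host.from_shodan(json.loads(Path("test_data/140.82.121.4_shodan.json").read_text()))
censys_host = Host.from_censys(json.loads(Path("test_data/140.82.121.4_censys_v2.json").read_text()))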

INFO:SSHComponentKey:Censys data does not contain the key as raw data. The public key can be constructed with given data, however, currently this is only supported for RSA keys.
INFO:TLSComponentCertificate:Censys does not provide raw certificate data, to hashes must be taken from the data and cannot be calculated.


Generally, available services are stored as `Service` objects in a list under the `services` attribute. To use the
model similarly to the pre-v4.0 model, a dict `service_dict` is available which lists the services keyed by their
ports.
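
The difference between the list form and the port-keyed form can be sketched without the library. `FakeService` below is a hypothetical stand-in for the model's `Service` object (only the `port` attribute matters here), and the banner strings are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class FakeService:
    """Hypothetical stand-in for a Service object; not the library's class."""
    port: int
    banner: str = ""


services = [FakeService(443, "HTTP/1.1 301"), FakeService(22, "SSH-2.0"), FakeService(80)]

# List form: finding a specific port requires a search.
ssh = next(s for s in services if s.port == 22)

# Dict form: services are indexed directly by their port.
by_port = {s.port: s for s in services}

print(sorted(by_port))     # [22, 80, 443]
print(by_port[22].banner)  # SSH-2.0
```

The dict form is convenient for lookups, while the list form keeps the order in which the source reported the services.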

The following snippet prints the listening ports of the host according to the data given by the various sources. After
that, we loop over all available services and print available HTTP headers and TLS certificate common names.

In [2]:
print(f"Listening ports for {binaryedge_host.ip} according to the various data sources:")
print(f"\tBinaryEdge: {binaryedge_host.ports}")
print(f"\tCensys: {censys_host.ports}")
print(f"\tShodan: {shodan_host.ports}")

print("\n\nHTTP Headers and TLS certificate common names available per service and data source:")
print("\tBinaryEdge:")
for service in binaryedge_host.services:
    if service.http:
        print(f"\tPort {service.port} HTTP headers: {service.http.headers}")
    if service.tls:
        print(f"\tPort {service.port} TLS certificate common name: {service.tls.certificate.subject.common_name}")

print("\n\tCensys:")
for service in censys_host.services:
    if service.http:
        print(f"\tPort {service.port} HTTP headers: {service.http.headers}")
    if service.tls:
        print(f"\tPort {service.port} TLS certificate common name: {service.tls.certificate.subject.common_name}")

print("\n\tShodan:")
for service in shodan_host.services:
    if service.http:
        print(f"\tPort {service.port} HTTP headers: {service.http.headers}")
    if service.tls:
        print(f"\tPort {service.port} TLS certificate common name: {service.tls.certificate.subject.common_name}")

Listening ports for 140.82.121.4 according to the various data sources:
	BinaryEdge: [443, 22, 80]
	Censys: [22, 80, 443]
	Shodan: [443, 80, 9418, 22]


HTTP Headers and TLS certificate common names available per service and data source:
	BinaryEdge:
	Port 443 HTTP headers: {'permissions-policy': 'interest-cohort=()', 'x-frame-options': 'deny', 'server': 'GitHub.com', 'vary': 'X-PJAX, Accept-Language, Accept-Encoding, Accept, X-Requested-With', 'content-type': 'text/html; charset=utf-8', 'strict-transport-security': 'max-age=31536000; includeSubdomains; preload', 'date': 'Sat, 07 Aug 2021 12:57:29 GMT', 'x-content-type-options': 'nosniff', 'referrer-policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', 'expect-ct': 'max-age=2592000, report-uri="https://api.github.com/_private/browser/errors"', 'content-security-policy': "default-src 'none'; base-uri 'self'; block-all-mixed-content; connect-src 'self' uploads.github.com www.githubstatus.com collector.githubapp.com api.gi
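
The port lists above differ between sources, so comparing them is a natural next step. The sketch below uses plain sets and the port lists copied from the output above. With live objects, the `ports` attributes of the three hosts would take the place of the literals.

```python
# Port lists copied from the output above.
ports = {
    "BinaryEdge": [443, 22, 80],
    "Censys": [22, 80, 443],
    "Shodan": [443, 80, 9418, 22],
}

port_sets = {source: set(found) for source, found in ports.items()}
common = set.intersection(*port_sets.values())
seen_anywhere = set.union(*port_sets.values())

print(f"Reported by every source: {sorted(common)}")
for source, found in port_sets.items():
    print(f"Missing from {source}: {sorted(seen_anywhere - found)}")
```

Here only Shodan reports port 9418, so it shows up as missing from the other two sources.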

In addition to the raw data obtained, a goal of this model is to add common hash types wherever raw data is available.
This is not implemented for every SSH key type in the Censys data, though.

The following snippet shows different hash values for a favicon available in the BinaryEdge raw data. The
`shodan_murmur` hash follows the method Shodan uses for hashing favicons: the hash is calculated on the base64-encoded
data instead of the binary data.

In [7]:
print("Favicon hashes based on BinaryEdge raw data:")
for service in binaryedge_host.services:
    if service.http and service.http.content.favicon:
        print(f"HTTP on port {service.port}")
        print(f"\tMD5: {service.http.content.favicon.md5}")
        print(f"\tSHA1: {service.http.content.favicon.sha1}")
        print(f"\tSHA256: {service.http.content.favicon.sha256}")
        print(f"\tMurmur: {service.http.content.favicon.murmur}")
        print(f"\tShodan Murmur: {service.http.content.favicon.shodan_murmur}\n")

Favicon hashes based on BinaryEdge raw data:
HTTP on port 443
	MD5: 7f969f62ee272a3be19966806fff4ad5
	SHA1: 07ed688be6d6288a669778f65f7eccdd96770925
	SHA256: 2ee43237d196100210f1786e7b73b57cd140f6013c072c70dbdffd9e9bc695f8
	Murmur: -640077903
	Shodan Murmur: -456454785

HTTP on port 80
	MD5: 7f969f62ee272a3be19966806fff4ad5
	SHA1: 07ed688be6d6288a669778f65f7eccdd96770925
	SHA256: 2ee43237d196100210f1786e7b73b57cd140f6013c072c70dbdffd9e9bc695f8
	Murmur: -640077903
	Shodan Murmur: -456454785
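
Why the two Murmur values differ can be illustrated with a small sketch. It assumes the commonly described Shodan convention: MurmurHash3 (32-bit, seed 0) over the base64 text as produced by `base64.encodebytes`, which inserts line breaks. Whether common-osint-model computes `shodan_murmur` in exactly this way is an assumption. The pure-Python `murmur3_32` function is a stand-in written for this example so that it runs without third-party packages (in practice the `mmh3` package is the usual choice), and `favicon_bytes` is made-up sample data rather than a real favicon.

```python
import base64

MASK = 0xFFFFFFFF


def _rotl32(value, shift):
    """Rotate a 32-bit integer left by `shift` bits."""
    return ((value << shift) | (value >> (32 - shift))) & MASK


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 x86 32-bit, returned as a signed integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK
    full = len(data) & ~3  # number of bytes covered by complete 4-byte blocks
    for i in range(0, full, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (_rotl32((k * c1) & MASK, 15) * c2) & MASK
        h = (_rotl32(h ^ k, 13) * 5 + 0xE6546B64) & MASK
    tail = data[full:]
    if tail:  # remaining 1 to 3 bytes
        k = int.from_bytes(tail, "little")
        h ^= (_rotl32((k * c1) & MASK, 15) * c2) & MASK
    # Final avalanche mix.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK
    h ^= h >> 16
    return h - (1 << 32) if h & 0x80000000 else h


# Made-up bytes standing in for a downloaded favicon.
favicon_bytes = b"\x00\x00\x01\x00" + bytes(range(64))

raw_hash = murmur3_32(favicon_bytes)
# Assumed Shodan convention: hash the base64 text (with line breaks), not the raw bytes.
shodan_style = murmur3_32(base64.encodebytes(favicon_bytes))

print(f"Murmur over raw bytes:    {raw_hash}")
print(f"Murmur over base64 text:  {shodan_style}")
```

Because the hashed input differs, the two values are unrelated, which is why the model exposes both.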



Additional hashes might help to track hosts across other services. For example, the SHA256 hash of the favicon can be
used to search for websites on [URLScan.io](https://urlscan.io).
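
As a sketch of such a lookup, the snippet below only builds a search URL and sends no request. The endpoint path `/api/v1/search/` and the `hash:` query field are assumptions about the URLScan.io search API and should be checked against its documentation. The helper name `urlscan_search_url` is made up for this example.

```python
from urllib.parse import urlencode


def urlscan_search_url(sha256: str) -> str:
    """Build a URLScan.io search URL for a resource hash (assumed query syntax)."""
    # Assumption: the search API accepts a `hash:<sha256>` query in the `q` parameter.
    return "https://urlscan.io/api/v1/search/?" + urlencode({"q": f"hash:{sha256}"})


# SHA256 of the favicon taken from the output above.
favicon_sha256 = "2ee43237d196100210f1786e7b73b57cd140f6013c072c70dbdffd9e9bc695f8"
print(urlscan_search_url(favicon_sha256))
```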

In case you want to dump the host data into Elasticsearch or similar, you might want to avoid arrays of objects. In this
case you can export a JSON object of the data which uses the listening ports as keys. The following snippet shows an
example of how to do that. In this model, this is called "flattened JSON".

In [5]:
print("Dumping Shodan host object to JSON:")
print(shodan_host.flattened_json())

Dumping Shodan host object to JSON:
{
  "443.port": 443,
  "443.banner": "HTTP/1.1 301 Moved Permanently\r\nContent-Length: 0\r\nLocation: https://github.com/\r\n\r\n",
  "443.md5": "d402a6212741f3690b4fa1e46d9bd8b6",
  "443.sha1": "a24eb4ba0332776d38050a7b41d0366742dbf262",
  "443.sha256": "5fbe0315395986d131e4948888987319e88e3e1da6c5460e8a0bf8b7a1e639f0",
  "443.murmur": "-1655207803",
  "443.first_seen": "2021-08-27T13:51:30.841319",
  "443.last_seen": "2021-08-27T13:51:30.841321",
  "443.timestamp": "2021-08-16T20:42:40.325940",
  "443.http.headers.Content-Length": "0",
  "443.http.headers.Location": "https://github.com/",
  "443.http.content.raw": "",
  "443.http.content.length": 0,
  "443.http.content.md5": "d41d8cd98f00b204e9800998ecf8427e",
  "443.http.content.sha1": "da39a3ee5e6b4b0d3255bfef95601890afd80709",
  "443.http.content.sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
  "443.http.content.murmur": "0",
  "443.tls.certificate.issuer.dn": "C=U

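The key layout in the output above (port first, then dotted attribute paths) can be reproduced with a small recursive helper. This is a minimal sketch of the idea, not the library's implementation. The function name `flatten` is hypothetical, and the nested input is a hand-built excerpt using values from the output above.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict with dot-separated keys."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Hand-built excerpt, keyed by listening port.
nested = {
    443: {
        "port": 443,
        "http": {
            "headers": {"Content-Length": "0", "Location": "https://github.com/"},
            "content": {"length": 0},
        },
    }
}

print(json.dumps(flatten(nested), indent=2))
```

Note that this sketch drops empty nested dicts and leaves lists untouched; how the model handles those cases is not shown in this notebook.
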
# Interactive Demo
This section aims to be an interactive demo where you grab live results from scanning services and directly convert the
results into the common model in order to play around with the data yourself.

**Note:** This is still a work in progress and does not work right now.

## Initialize API clients
The following cells set up the API clients in order to grab live data from BinaryEdge, Censys and Shodan.

In [None]:
# Leave a prompt empty to skip the corresponding service.
shodan_key = input("Shodan API-Key (empty to skip downloading data from Shodan)")
be_key = input("BinaryEdge API-Key")
censys_uid = input("Censys UID")
censys_secret = input("Censys Secret")
clients = {}
# Each client library is imported only when its credentials were entered.
if shodan_key:
    import shodan
    clients["shodan"] = shodan.Shodan(shodan_key)

if be_key:
    import pybinaryedge
    clients["binaryedge"] = pybinaryedge.BinaryEdge(be_key)

# Censys needs both the UID and the secret.
if censys_uid and censys_secret:
    from censys.search.v2 import CensysHosts
    clients["censys"] = CensysHosts(censys_uid, censys_secret)

print(f"Initialized {len(clients)} API(s).")
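
With a `clients` dict like the one above, fetching raw data for one IP address could look roughly like the following sketch. The helper name `fetch_raw` is hypothetical, and the lookup methods (`host` for Shodan and BinaryEdge, `view` for Censys) are assumed from the respective client libraries. `StubClient` is an offline stand-in so that the example runs without API keys. Whether the returned structures can be passed directly to `Host.from_shodan`, `Host.from_binaryedge` and `Host.from_censys` is not confirmed here.

```python
def fetch_raw(clients, ip):
    """Query every configured client for `ip` and return raw responses by source name."""
    # Assumed method names: shodan.Shodan.host, pybinaryedge.BinaryEdge.host,
    # censys.search.v2.CensysHosts.view.
    lookups = {
        "shodan": lambda client: client.host(ip),
        "binaryedge": lambda client: client.host(ip),
        "censys": lambda client: client.view(ip),
    }
    raw = {}
    for name, client in clients.items():
        try:
            raw[name] = lookups[name](client)
        except Exception as error:  # error types differ per library; skip and continue
            print(f"{name}: lookup failed ({error})")
    return raw


class StubClient:
    """Hypothetical offline stand-in for a real API client."""

    def host(self, ip):
        return {"ip": ip}


print(fetch_raw({"shodan": StubClient()}, "140.82.121.4"))
```

Catching errors per source keeps one failing or rate-limited service from blocking the others.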