## Domain Recon For Presales

Wouldn't it be great if Presales could do recon as part of the presales engagement and showcase the output in assets?

What this script does:

1. Pulls a list of subdomains from two certificate databases:
   - crt.sh
   - certspotter
2. Enriches that information using headers and server responses:
   - webtech
   - hackertarget
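
To make step 1 concrete, here is a minimal pure-Python sketch of how records from the two sources can be flattened into one de-duplicated list. It relies on the same fields the notebook reads (`name_value` from crt.sh, which is newline-separated, and `dns_names` from certspotter, which is a list). The function name `merge_subdomains`, the sample records, and the sorted output are assumptions made for illustration; the notebook itself does this step with pandas.

```python
def merge_subdomains(crt_records, certspotter_records):
    """Flatten both sources into one sorted, de-duplicated list of hostnames."""
    names = set()
    for record in crt_records:
        # crt.sh: 'name_value' holds one or more names separated by newlines
        names.update(record.get("name_value", "").split("\n"))
    for record in certspotter_records:
        # certspotter: 'dns_names' is already a list of names
        names.update(record.get("dns_names", []))
    # Drop empty strings and wildcard entries such as '*.example.com'
    return sorted(n for n in names if n and "*." not in n)


# Hypothetical sample records, shaped like the fields used in this notebook
crt_sample = [
    {"name_value": "a.example.com\n*.example.com"},
    {"name_value": "b.example.com"},
]
certspotter_sample = [{"dns_names": ["a.example.com", "c.example.com"]}]

print(merge_subdomains(crt_sample, certspotter_sample))
# ['a.example.com', 'b.example.com', 'c.example.com']
```

A set is used here only because ordering does not matter for the final asset list; the pandas version below keeps the first occurrence instead.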

Enter the target domain and scan type below. A `full` scan makes a request to each subdomain, so it is not strictly passive.


Release notes and further work:
- crt.sh and certspotter sit behind load balancers that limit requests, which may restrict the output. Error handling for this still needs to be added.
- hackertarget limits API calls to 50 a day, so another source may be needed.
- Next step is to upload directly to H1.
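
One possible shape for the missing error handling around rate-limited sources is a small retry helper with exponential backoff. This is a sketch under stated assumptions, not part of the current script: the name `fetch_json_with_retry`, its parameters, and the retry counts are all hypothetical, and it treats any non-200 response as a failed attempt rather than inspecting how crt.sh or certspotter actually signal a limit.

```python
import time


def fetch_json_with_retry(fetch, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a 200 response with a JSON body.

    fetch is any zero-argument callable returning an object with a
    status_code attribute and a json() method (e.g. a requests.Response).
    Returns the parsed JSON, or None if every attempt fails.
    """
    for attempt in range(retries):
        try:
            response = fetch()
            if response.status_code == 200:
                return response.json()
        except (OSError, ValueError):
            # requests' exceptions derive from OSError; JSON decode
            # errors derive from ValueError. Treat both as a failed attempt.
            pass
        if attempt < retries - 1:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    return None
```

With `requests` imported, a call might look like `fetch_json_with_retry(lambda: requests.get(url, timeout=30))`. Returning `None` on failure lets the caller decide whether to skip that source or stop the run.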

In [1]:
'''imports'''
import requests
import pandas as pd
from pandas import json_normalize
import webtech
from bs4 import BeautifulSoup
pd.options.mode.chained_assignment = None  # default='warn'
wt = webtech.WebTech(options={'json': True})


#===================================================#

## Target domain, scan options = passive, full

target = "example.com"
scan_type = 'passive'
# Org ID = ****
# API Key = ****

#===================================================#

## Cert Search

## Source: crt.sh
def crt(domain):
    # Pass the query via params so the '%' wildcard is URL-encoded correctly
    req = requests.get('https://crt.sh/', params={'q': f'%.{domain}', 'output': 'json'}, timeout=60)
    return req.json()

## Source: certspotter
def certspotter(domain):
    req = requests.get(
        'https://api.certspotter.com/v1/issuances',
        params={'domain': domain, 'include_subdomains': 'true', 'expand': 'dns_names'},
        timeout=60,
    )
    return req.json()

    
#===================================================#

## Find the tech stack for each domain, NOTE: This is not passive
def techmeta(domain):
    try:
        report = wt.start_from_url('https://' + domain)
        # Keep only the technology names from the webtech report
        return [tech['name'] for tech in report['tech']]
    except webtech.utils.ConnectionException:
        return "Not Reachable"

## Find server information, NOTE: This is not passive
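def server(domain):
    try:
        req = requests.get(f'https://api.hackertarget.com/dnslookup/?q={domain}', timeout=30)
        # Name the parser explicitly to avoid BeautifulSoup's "no parser specified" warning
        soup = BeautifulSoup(req.content, 'html.parser')
        serverlist = soup.get_text().split("\n")
        if serverlist == ['API count exceeded - Increase Quota with Membership']:
            return "API Count Limited"
        else:
            return serverlist
    except requests.RequestException:
        # Catch only network/HTTP errors instead of using a bare except
        return "Not Reachable"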

#===================================================#

## Transform output from crt.sh
dfcrt = json_normalize(crt(target))
dfcrt2 = pd.DataFrame()
dfcrt2['Location'] = dfcrt['issuer_name']
dfcrt2['subdomain'] = dfcrt['name_value']
dfcrt2['subdomain'] = dfcrt2['subdomain'].str.split("\n", expand = False)

## Transform output from certspotter

dfcert = json_normalize(certspotter(target))
dfcert2 = pd.DataFrame()
if len(dfcert.index) < 2:
    dfcert2['subdomain'] = ''
else:
    dfcert2['subdomain'] = dfcert['dns_names']

#===================================================#

## Combine, explode and de-dupe
# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
dfcertcombined = pd.concat([dfcrt2, dfcert2])
dfcertcombined2 = dfcertcombined.explode('subdomain').drop_duplicates(subset='subdomain', keep="first", inplace=False)
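# Drop wildcard entries such as '*.example.com' (literal match, not a regex);
# na=True also drops rows with a missing subdomain. copy() avoids chained-assignment issues later.
dfcertcombinedcleaned = dfcertcombined2[~dfcertcombined2['subdomain'].str.contains('*.', regex=False, na=True)].copy()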

#===================================================#

## Find supporting technologies and server responses
if scan_type == 'full':
    dfcertcombinedcleaned['technology'] = dfcertcombinedcleaned['subdomain'].apply(techmeta)
    #currently turned off due to API limits at hackertarget
    #dfcertcombinedcleaned.loc[dfcertcombinedcleaned['subdomain'] == target, 'Main_Domain_Server_Response'] = server(target)

dfcertcombinedcleaned.to_csv(target+'_dns_output.csv')
print("Recon is complete.")

Recon is complete.


### This section converts the output into something you can directly upload to HackerOne

In [2]:
output = pd.DataFrame()
output['identifier'] = dfcertcombinedcleaned['subdomain']
output['asset_type'] = 'Domain'
output['technologies'] = 'Tag1'

output.to_csv(target+'_semioutput.csv', sep=';', index=False)