# RiskIQ PassiveTotal Python Library

#### *Projects and Monitors*

## Getting Started

This notebook leverages the RiskIQ Illuminate / PassiveTotal API through the `passivetotal` Python library. 

Documentation for the library, including how to install it and configure API keys, is available here:
https://passivetotal.readthedocs.io/en/latest/getting-started.html

You will need API credentials to authenticate with the API server that provide access to the datasets queried in this notebook. Ask your RiskIQ contact for details or visit https://info.riskiq.net/ to contact the support team.

### Optional Dependencies

This notebook uses the `pandas` Python library primarily to improve the visual output of data tables retrieved from the API. You will need to install that library in your Python (virtual) environment (`pip install pandas`) or change the code examples to return a Python dictionary instead of a dataframe. Simply change `.as_df` to `.as_dict`.

Some examples may use special features in `pandas` to filter or aggregate data, but these can also be implemented in pure Python.
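
As a rough illustration of the pure-Python route, the sketch below filters and counts records shaped like the `records` list found in an `as_dict` result. The sample records are invented for illustration and are not API output.

```python
from collections import Counter

# Hypothetical records, shaped like the 'records' list in an as_dict result.
records = [
    {'name': 'riskiq.net', 'type': 'domain', 'is_monitored': False},
    {'name': '8.8.8.8', 'type': 'ip', 'is_monitored': True},
    {'name': 'example.org', 'type': 'domain', 'is_monitored': True},
]

# Filter: keep only monitored records (comparable to a boolean mask in pandas).
monitored = [rec for rec in records if rec['is_monitored']]

# Aggregate: count records per type (comparable to groupby(...).size()).
counts_by_type = Counter(rec['type'] for rec in records)

print([rec['name'] for rec in monitored])
print(dict(counts_by_type))
```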

By default, `pandas` will only show a subset of rows in notebooks. To display more, set the `max_rows` option to a higher value.

In [2]:
import pandas as pd
pd.options.display.max_rows=100

### Product Context

[PassiveTotal Projects](https://info.riskiq.net/hc/en-us/articles/1500000017121-PassiveTotal-Projects-Overview)
are used by analysts to group related indicators of compromise (IOCs) together in the course of an investigation and (optionally) share those indicators with other users in their organization. IOCs are stored as "artifacts" in a project and may include domains, IPs, keywords, SSL certificate hashes, and other types.

Most artifact types can be monitored for changes or new keyword matches using [PassiveTotal Monitors](https://info.riskiq.net/hc/en-us/articles/360057825114-PassiveTotal-Monitors).

Alerts are typically sent via email, but they can also be retrieved programmatically via the API. This notebook demonstrates how to create a project, store indicators in the project, and retrieve new alerts for those indicators.


### Setup Notebook
*If this returns errors, ensure you have followed the Getting Started document linked above to install necessary dependencies and configure your API keys.*

In [3]:
from passivetotal import analyzer
analyzer.init()

### Table of Contents

* [Set Active Project](#Set-Active-Project): Use the `analyzer` module to quickly set a project context.
* [Get Active Project](#Get-Active-Project): Get details about the current project including the IOCs previously added to it.
* [Find Specific Project](#Find-Specific-Project): Find a specific project by ID or name.
* [Working With Artifacts](#Working-With-Artifacts): List artifacts in a project and activate monitoring.
* [Artifact Monitoring](#Artifact-Monitoring): Get daily alerts for monitored artifacts.
* [Filter Alerts](#Filter-Alerts): Enrich alerts with data from other PassiveTotal APIs to enable filtering and focused analysis.

---
## Projects

### Set Active Project
The easiest way to get started with a project is to use the `analyzer` module-level `set_project` method to set the default project for all subsequent actions in your notebook session.

In [3]:
analyzer.set_project('My Sample Project')

Projects can be made visible to only you, to your team, or to everyone; "analyst" visibility is the default.

To set other options, [consult the documentation](https://passivetotal.readthedocs.io/en/latest/analyzer.html) or use the built-in help function inside this notebook:

In [3]:
analyzer.set_project?

Signature:
analyzer.set_project(
    name_or_guid,
    visibility='analyst',
    description='',
    tags=None,
    create_if_missing=True,
)
Docstring:
Set the active Illuminate Project for this investigation. 

Used by Analyzer objects to persist results to projects. Performs an API query to determine if project
exists, create it if it is missing, and obtain necessary details.

:param name_or_guid: Project name or project GUID.
:param visibility: Who can see the project: public, private or analyst (optional, defaults to 'analyst').
:param description: Description of the project (

After setting the active project, you can easily add Hostname or IPAddress artifact types directly from the analyzer.

In [4]:
analyzer.Hostname('riskiq.net').save_to_project()
analyzer.IPAddress('8.8.8.8').save_to_project()

> To save other artifact types, use the UI or access the underlying API libraries directly.

---
### Get Active Project
Retrieve the current project as an object to change settings or list artifacts in the project.

In [5]:
project = analyzer.get_project()
project

<Project 3c7f7ed1-15eb-41bd-93ac-9e1b36a41244 'My Sample Project'

In [6]:
project.as_dict

{'project_guid': '3c7f7ed1-15eb-41bd-93ac-9e1b36a41244',
 'name': 'My Sample Project',
 'description': '',
 'visibility': 'analyst',
 'is_featured': False,
 'tags': [],
 'owner': 'riskiq',
 'creator': 'user@host.com',
 'organization': 'riskiq',
 'link': None,
 'collaborators': [],
 'links': {'self': '/v2/project?project=3c7f7ed1-15eb-41bd-93ac-9e1b36a41244',
  'artifact': '/v2/artifact?project=3c7f7ed1-15eb-41bd-93ac-9e1b36a41244',
  'tag': '/v2/project/tag?project=3c7f7ed1-15eb-41bd-93ac-9e1b36a41244'},
 'subscribers': [],
 'can_edit': True,
 'created': '2021-10-13 18:02:14.551000+00:00'}

In [7]:
for artifact in project.artifacts:
    print(artifact)

riskiq.net
8.8.8.8


---
### Find Specific Project
If you already know the project you want to retrieve, you can obtain it directly without using the convenience methods provided by the analyzer.

The most efficient way to obtain a project is with the project's GUID, which you can access from the `project_guid` property of a `Project` object, or from the PassiveTotal UI by navigating to the project and looking at the last part of the URL.

In [8]:
project = analyzer.Project.find('3c7f7ed1-15eb-41bd-93ac-9e1b36a41244')
project.name

'My Sample Project'

Alternatively, you can find a project by name, but be aware this performs a search against the API, which may take some time.

In [9]:
project = analyzer.Project.find('My Sample Project')
project.name

'My Sample Project'

> Project names are not unique in the PassiveTotal app. You will get an error if your result returns more than one project. To obtain a list of projects, use `analyzer.ProjectList.find` instead.

> By default, the scope of the search is projects with visibility="analyst". If you aren't finding the project you expect, use the GUID instead or set a different visibility.

---
### Working With Artifacts

Projects contain lists of artifacts of various types. To obtain the list, load a project and access the `artifacts` property.

In [10]:
project.artifacts

<passivetotal.analyzer.projects.ArtifactList at 0x11bdcaac0>

Artifacts are returned in a list-like analyzer object of type `ArtifactList`. Like other analyzer objects, it can be iterated, sorted, filtered, and displayed in various ways.

In [11]:
project.artifacts.as_dict

{'totalrecords': 2,
 'records': [{'type': 'domain',
   'project_guid': '3c7f7ed1-15eb-41bd-93ac-9e1b36a41244',
   'artifact_guid': 'd0b69764-7dac-45c3-8248-366ff8fb181b',
   'is_monitored': False,
   'is_monitorable': True,
   'organization': 'riskiq',
   'links': {'tag': '/v2/artifact/tag?artifact=d0b69764-7dac-45c3-8248-366ff8fb181b',
    'project': '/v2/project?project=3c7f7ed1-15eb-41bd-93ac-9e1b36a41244',
    'self': '/v2/artifact?artifact=d0b69764-7dac-45c3-8248-366ff8fb181b'},
   'owner': 'riskiq',
   'name': 'riskiq.net',
   'creator': 'user@host.com',
   'tags_meta': {'test': {'creator': 'user@host.com',
     'created_at': '2021-10-13T19:26:35.237000'}},
   'tags_global': None,
   'tags_system': [],
   'tags_user': ['test'],
   'created': '2021-10-13 18:09:58.241000'},
  {'type': 'ip',
   'project_guid': '3c7f7ed1-15eb-41bd-93ac-9e1b36a41244',
   'artifact_guid': 'a11c40b7-6bf4-45b0-9e77-a30b66e20be3',
   'is_monitored': True,
   'is_monitorable': True,
   'organization': 'ris

In [12]:
project.artifacts.filter_in(type='domain,certificate').as_dict

{'totalrecords': 1,
 'records': [{'type': 'domain',
   'project_guid': '3c7f7ed1-15eb-41bd-93ac-9e1b36a41244',
   'artifact_guid': 'd0b69764-7dac-45c3-8248-366ff8fb181b',
   'is_monitored': False,
   'is_monitorable': True,
   'organization': 'riskiq',
   'links': {'tag': '/v2/artifact/tag?artifact=d0b69764-7dac-45c3-8248-366ff8fb181b',
    'project': '/v2/project?project=3c7f7ed1-15eb-41bd-93ac-9e1b36a41244',
    'self': '/v2/artifact?artifact=d0b69764-7dac-45c3-8248-366ff8fb181b'},
   'owner': 'riskiq',
   'name': 'riskiq.net',
   'creator': 'user@host.com',
   'tags_meta': {'test': {'creator': 'user@host.com',
     'created_at': '2021-10-13T19:26:35.237000'}},
   'tags_global': None,
   'tags_system': [],
   'tags_user': ['test'],
   'created': '2021-10-13 18:09:58.241000'}]}
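
If you prefer to work with the plain dictionary, a comparable filter can be written in pure Python. This is only a sketch over an abbreviated, illustrative copy of the structure shown above; `filter_records_in` is a hypothetical helper and not part of the `passivetotal` library.

```python
def filter_records_in(result, field, values):
    """Return records whose `field` equals one of the comma-separated values."""
    allowed = {value.strip() for value in values.split(',')}
    return [rec for rec in result['records'] if rec.get(field) in allowed]

# Abbreviated, illustrative copy of the as_dict structure.
artifacts = {
    'totalrecords': 2,
    'records': [
        {'type': 'domain', 'name': 'riskiq.net', 'is_monitored': False},
        {'type': 'ip', 'name': '8.8.8.8', 'is_monitored': True},
    ],
}

matches = filter_records_in(artifacts, 'type', 'domain,certificate')
print([rec['name'] for rec in matches])
```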

-----
Each artifact has a set of properties and a few methods to modify the tags or activate monitoring.

In [15]:
art = project.artifacts[0]
art.tags_user

['']

In [14]:
art.update_tags('Test')

True

In [38]:
art.enable_monitoring()

True

Or, access the `as_dict` or `as_df` property to see all the fields.

In [None]:
art.as_dict

---
Hostname and IP `analyzer` objects become artifacts when they are added to a project. 

The objects may exist as artifacts in multiple projects. You can obtain the list by accessing the `projects` property of a `Hostname` or `IPAddress` object.

In [None]:
analyzer.Hostname('riskiq.net').projects.as_df

Alternatively, access the `artifacts` property to get their representation as an `Artifact` object. This gives you access to all the attributes of an `Artifact` including the `alerts` property, which will contain a list of any monitor results if the object is being monitored.

In [None]:
analyzer.Hostname('riskiq.net').artifacts.as_df

### Artifact Monitoring

Many artifact types can be "Monitored" for changes or new results (the specific behavior depends on the type of artifact). Once monitoring is enabled for an artifact, new alerts will be generated each day and emailed to the artifact owner. The same alerts are available through the API, and are made accessible by the `analyzer` in the `alerts` property of an `Artifact`.

Alert queries are date-bounded. When using the `analyzer` to fetch alerts, be sure to set the date range of the `analyzer` module before making your queries. Here, we set our date range to the last 30 days.

In [33]:
analyzer.set_date_range(days_back=30)
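
To make the idea of a date-bounded query concrete, this sketch computes a start and end date from a number of days back using only the standard library. It illustrates the general shape of such a window and is an assumption, not the library's internal implementation; `date_window` is a hypothetical helper.

```python
from datetime import date, timedelta

def date_window(days_back, today=None):
    """Return (start, end) ISO date strings covering the last `days_back` days."""
    end = today or date.today()
    start = end - timedelta(days=days_back)
    return start.isoformat(), end.isoformat()

# A fixed "today" keeps the example reproducible.
start, end = date_window(30, today=date(2021, 10, 20))
print(start, end)
```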

As with all `analyzer` objects, you can access the `as_dict` or `as_df` property of this object or iterate through it like a list. Either look for a specific artifact in the list, or act programmatically on all artifacts with monitoring enabled.

Next, load a project by GUID or name, then access the list of artifacts on the project.

In [17]:
alert_project = analyzer.Project.find('3ee99899-19d4-49a2-b7f7-236002f6a382')
alert_project.artifacts

<passivetotal.analyzer.projects.ArtifactList at 0x10bf14490>

> This example query finds a project with a specific GUID that you likely do not have access to. Replace it with the GUID or name of a project you can access to avoid errors or empty artifact lists.

The `ArtifactList` object you get back when accessing the `artifacts` property behaves like other `analyzer` record lists. Use the `as_df` or `as_dict` properties to view the list, or iterate through it programmatically as you would any other Python object. Use the `filter` method to consider only artifacts that are currently being monitored.

In [18]:
for art in alert_project.artifacts.filter(is_monitored=True):
    print(f'artifact: {art} has {art.alerts_available} alert(s) available')

artifact: syncun.com has 17 alert(s) available


> The `alerts_available` property makes a query to the API to retrieve one page of results, and returns the "totalrecords" field from that (likely abbreviated) recordlist. It is a convenient way to get the alert count but in some cases may not be optimal, especially if you're expecting results every day, in which case it is likely better to fetch the results directly.


To fetch the alerts, focus on a single artifact, either by iterating through the list of artifacts or by filtering for a specific artifact. Then access the `alerts` property.

In [35]:
focus_artifact = alert_project.artifacts.filter_substring(name='syncun')[0]
alerts = focus_artifact.alerts
alerts

<passivetotal.analyzer.projects.ArtifactAlerts at 0x11be65d00>

Alerts are returned as a list-like `ArtifactAlerts` object. Use the `as_dict` property of the object to get the list as a dictionary, or iterate through the list directly:

In [37]:
for alert in alerts:
    print(f'{alert.artifact} change {alert.change} to {alert.result} on {alert.firstseen}')

syncun.com change registrarUpdatedAt to 1557971143000 on 2021-10-03 00:00:00
syncun.com change contact_country to india on 2021-10-03 00:00:00
syncun.com change registrant_organization to conceptualise on 2021-10-03 00:00:00
syncun.com change admin_state to haryana on 2021-10-03 00:00:00
syncun.com change admin_organization to conceptualise on 2021-10-03 00:00:00
syncun.com change registrar_phone to 480-624-2505 on 2021-10-03 00:00:00
syncun.com change nameserver to ns141.iabhost.com on 2021-10-03 00:00:00
syncun.com change nameserver to ns140.iabhost.com on 2021-10-03 00:00:00
syncun.com change new resolution to ns140.iabhost.com. admin.hiisecuredns.com. 2021091504 3600 1800 1209600 86400 on 2021-09-23 00:00:00
syncun.com change new resolution to 10 inbound-smtp.us-east-1.amazonaws.com. on 2021-09-22 00:00:00


## Filter Alerts

In these examples, we focused primarily on IPs and hostnames, but the projects feature in RiskIQ PassiveTotal can track more than a dozen different types of artifacts, and many of these can be monitored. 

One popular artifact type is "Keyword PDNS", which enables discovery of newly observed hostnames that contain a brand, phishing lure, or threat actor indicator. Depending on the keyword you choose, the system can generate a significant number of alerts.

Data in the PassiveTotal API, combined with capabilities in the `analyzer` module of the `passivetotal` Python library, provide ways to enrich IP and hostname alerts with attributes for filtering and deeper research.

---
First, set a narrow date range at the module level:

In [22]:
analyzer.set_date_range(days_back=1)

Next, load the project that contains the artifacts you are monitoring. The most direct way is by project GUID which you can obtain from the URL of the project in the PassiveTotal UI, but you can also load a project by name.

In [4]:
alert_project = analyzer.Project.find('6a7ea8b1-9582-4343-a364-2822bf764b2d')

> This will likely raise an error if you run it without changing the project GUID, because you won't have access to that specific project from your account.

Locate the artifact you want to monitor. Here, we list all the artifacts, then filter them by the "query" field.

In [5]:
alert_project.artifacts.as_df

| | query | type | project_guid | artifact_guid | is_monitored | is_monitorable | organization | links | owner | name | creator | tags_meta | tags_global | tags_system | tags_user | created |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | verification | pdns_keyword | 6a7ea8b1-9582-4343-a364-2822bf764b2d | 8f9faca5-5248-431b-99a0-08b3f42f5e43 | True | True | riskiq | {'tag': '/v2/artifact/tag?artifact=8f9faca5-52... | riskiq | verification | benjamin.powell@riskiq.net | {} | | [] | [] | 2021-09-23 23:47:05.605 |
| 1 | unverified | pdns_keyword | 6a7ea8b1-9582-4343-a364-2822bf764b2d | 134239cc-ff92-4f58-a089-849509a819ca | True | True | riskiq | {'tag': '/v2/artifact/tag?artifact=134239cc-ff... | riskiq | unverified | benjamin.powell@riskiq.net | {} | | [] | [] | 2021-09-23 23:47:21.511 |
| 2 | verified | pdns_keyword | 6a7ea8b1-9582-4343-a364-2822bf764b2d | 69e2e4fd-76fd-40be-9f8b-7db1d2aaab94 | True | True | riskiq | {'tag': '/v2/artifact/tag?artifact=69e2e4fd-76... | riskiq | verified | benjamin.powell@riskiq.net | {} | | [] | [] | 2021-09-23 23:47:37.686 |
| 3 | verifiy | pdns_keyword | 6a7ea8b1-9582-4343-a364-2822bf764b2d | 1c1c79b1-6eb5-43be-87be-b887f8346d53 | True | True | riskiq | {'tag': '/v2/artifact/tag?artifact=1c1c79b1-6e... | riskiq | verifiy | benjamin.powell@riskiq.net | {} | | [] | [] | 2021-09-23 23:47:52.453 |
| 4 | communication | pdns_keyword | 6a7ea8b1-9582-4343-a364-2822bf764b2d | cc22a6aa-f4c3-4385-8e2e-4eef42f0306f | True | True | riskiq | {'tag': '/v2/artifact/tag?artifact=cc22a6aa-f4... | riskiq | communication | benjamin.powell@riskiq.net | {} | | [] | [] | 2021-09-24 16:07:34.749 |
| 5 | nerdpol.ovh | pdns_keyword | 6a7ea8b1-9582-4343-a364-2822bf764b2d | 53d4c82d-c2e0-4da2-999f-51e094186bfd | True | True | riskiq | {'tag': '/v2/artifact/tag?artifact=53d4c82d-c2... | riskiq | nerdpol.ovh | benjamin.powell@riskiq.net | {} | | [] | [] | 2021-09-24 16:08:07.878 |


In [31]:
focus_artifact = alert_project.artifacts.filter(query='unverified')[0]
focus_artifact.as_dict

{'type': 'pdns_keyword',
 'project_guid': '6a7ea8b1-9582-4343-a364-2822bf764b2d',
 'artifact_guid': '134239cc-ff92-4f58-a089-849509a819ca',
 'is_monitored': True,
 'is_monitorable': True,
 'organization': 'riskiq',
 'links': {'tag': '/v2/artifact/tag?artifact=134239cc-ff92-4f58-a089-849509a819ca',
  'project': '/v2/project?project=6a7ea8b1-9582-4343-a364-2822bf764b2d',
  'self': '/v2/artifact?artifact=134239cc-ff92-4f58-a089-849509a819ca'},
 'owner': 'riskiq',
 'name': 'unverified',
 'creator': 'benjamin.powell@riskiq.net',
 'tags_meta': {},
 'tags_global': None,
 'tags_system': [],
 'tags_user': [],
 'created': '2021-09-23 23:47:21.511000'}

> We use the `filter` method of the `artifacts` list to find artifacts with a `query` property set to `unverified`. This returns a list, but we are expecting only one match, so we use the `[0]` syntax to select the first item in the list.
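
Indexing with `[0]` silently picks the first match and raises a bare `IndexError` when nothing matches. If you would rather fail loudly in both cases, a small guard like the one sketched below can help. `exactly_one` is a hypothetical helper, not part of the `passivetotal` library, and it is shown here with plain strings rather than artifact objects.

```python
def exactly_one(matches, description='item'):
    """Return the single element of `matches`, or raise ValueError otherwise."""
    matches = list(matches)
    if len(matches) != 1:
        raise ValueError(f'expected exactly one {description}, found {len(matches)}')
    return matches[0]

# Stand-in values for the "query" field of several artifacts.
queries = ['verification', 'unverified', 'verified']
picked = exactly_one([q for q in queries if q == 'unverified'], 'artifact')
print(picked)
```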

As covered above, you can check the `alerts_available` property to see how many alerts are available, but be mindful that this makes a query to the API to obtain the first page of results. If you're planning to work with the results anyway, consider skipping this and going directly to the alerts.

In [32]:
focus_artifact.alerts_available

166

In [33]:
focus_artifact.alerts.as_df

| | type | change | query | result | firstseen | project_name | project_guid |
|---|---|---|---|---|---|---|---|
| 0 | pdns_keyword | keyword_match | unverified | 0.r1.unverified-forwarding.projectbaseline.com | 2021-10-19 | potential phishing domains | 6a7ea8b1-9582-4343-a364-2822bf764b2d |
| 1 | pdns_keyword | keyword_match | unverified | 100.r1.unverified-forwarding.verily.com | 2021-10-19 | potential phishing domains | 6a7ea8b1-9582-4343-a364-2822bf764b2d |
| 2 | pdns_keyword | keyword_match | unverified | 102.r3.unverified-forwarding.projectbaseline.com | 2021-10-19 | potential phishing domains | 6a7ea8b1-9582-4343-a364-2822bf764b2d |
| 3 | pdns_keyword | keyword_match | unverified | 102.r3.unverified-forwarding.verily.com | 2021-10-19 | potential phishing domains | 6a7ea8b1-9582-4343-a364-2822bf764b2d |
| 4 | pdns_keyword | keyword_match | unverified | 103.r2.unverified-forwarding.spotifyforbrands.com | 2021-10-19 | potential phishing domains | 6a7ea8b1-9582-4343-a364-2822bf764b2d |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 161 | pdns_keyword | keyword_match | unverified | 87.r3.unverified-forwarding.spotifyforbrands.com | 2021-10-18 | potential phishing domains | 6a7ea8b1-9582-4343-a364-2822bf764b2d |
| 162 | pdns_keyword | keyword_match | unverified | 95.r3.unverified-forwarding.spotifyforbrands.com | 2021-10-18 | potential phishing domains | 6a7ea8b1-9582-4343-a364-2822bf764b2d |
| 163 | pdns_keyword | keyword_match | unverified | attach-an-unverified-funding-source.rechargeap... | 2021-10-18 | potential phishing domains | 6a7ea8b1-9582-4343-a364-2822bf764b2d |
| 164 | pdns_keyword | keyword_match | unverified | attach-an-unverified-funding-source.just-eat.ie | 2021-10-18 | potential phishing domains | 6a7ea8b1-9582-4343-a364-2822bf764b2d |


We used the `as_df` property to see the alerts in a convenient dataframe that works well in a notebook, but you will likely set up automated processes to feed these results into downstream systems. Remember that the `alerts` property returns a list-like `analyzer` object you can slice, iterate, and filter like other Python lists.
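
One simple hand-off format for downstream systems is JSON Lines, with one alert per line. The sketch below serializes plain dictionaries that mirror some of the columns shown above; `to_json_lines` is a hypothetical helper, and with live data you would build the dictionaries from each alert's attributes.

```python
import json

def to_json_lines(records):
    """Serialize an iterable of dictionaries as newline-delimited JSON."""
    return '\n'.join(json.dumps(rec, sort_keys=True) for rec in records)

# Two alerts copied from the table above, reduced to plain dictionaries.
sample_alerts = [
    {'type': 'pdns_keyword', 'change': 'keyword_match', 'query': 'unverified',
     'result': '100.r1.unverified-forwarding.verily.com', 'firstseen': '2021-10-19'},
    {'type': 'pdns_keyword', 'change': 'keyword_match', 'query': 'unverified',
     'result': 'attach-an-unverified-funding-source.just-eat.ie', 'firstseen': '2021-10-18'},
]

payload = to_json_lines(sample_alerts)
print(payload)
```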

*IMPORTANT*
The enrichment code that follows assumes results are from a `pdns_keyword` type monitor, and that the alert results are hostnames. You will need to make adjustments if your project has a different type of alert.

In [76]:
for alert in focus_artifact.alerts[0:10]:
    if alert.change != 'keyword_match':
        continue # because this code assumes alert results are hostnames
    alert_host = analyzer.Hostname(alert.result)
    print(alert_host, alert_host.whois.registrant_name)

0.r1.unverified-forwarding.projectbaseline.com Domain Administrator
100.r1.unverified-forwarding.verily.com 
102.r3.unverified-forwarding.projectbaseline.com Domain Administrator
102.r3.unverified-forwarding.verily.com 
103.r2.unverified-forwarding.spotifyforbrands.com 
109.r4.unverified-forwarding.projectbaseline.com Domain Administrator
110.r3.unverified-forwarding.spotifyforbrands.com 
112.r3.unverified-forwarding.verily.com 
112.r4.unverified-forwarding.verily.com 
116.r2.unverified-forwarding.projectbaseline.com Domain Administrator


In this example, since we expect alerts to be hostnames, we can assign the alert result to `analyzer.Hostname` objects and then access properties from other PassiveTotal datasets. 

A more sophisticated approach may be to iterate through all the alerts and generate a list of Python dictionaries containing both Whois details and Illuminate Reputation Scores for each hostname.

In [71]:
alert_records = []
for alert in focus_artifact.alerts:
    alert_host = analyzer.Hostname(alert.result)
    record = {
        'host': alert_host,
        'whois_registrar': str(alert_host.whois.registrar),
        'whois_registrant_org': str(alert_host.whois.registrant_org),
        'whois_registrant_name': str(alert_host.whois.registrant_name),
        'whois_registrant_email': str(alert_host.whois.registrant_email),
        'whois_age': alert_host.whois.age
    }
    try:
        record.update({
            'riskiq_score': alert_host.reputation.score,
            'riskiq_classification': alert_host.reputation.classification
        })
    except analyzer.AnalyzerAPIError:
        pass
    alert_records.append(record)

After this code completes, you should have a list of `alert_records` you can feed into downstream systems. Here, we leverage the `pandas` library to create a `DataFrame`, then view the 10 results with the highest risk score.

In [73]:
alert_df = pd.DataFrame.from_records(alert_records)
alert_df.nlargest(10, 'riskiq_score')

| | host | whois_registrar | whois_registrant_org | whois_registrant_name | whois_registrant_email | whois_age | riskiq_score | riskiq_classification |
|---|---|---|---|---|---|---|---|---|
| 127 | productreviews-unverifiedattachments.frp.zooma... | Gandi SAS | | | | 133 | 69 | SUSPICIOUS |
| 163 | attach-an-unverified-funding-source.rechargeap... | NAMECHEAP INC | Redacted for Privacy Purposes | Redacted for Privacy Purposes | select contact domain holder link at https://w... | 2589 | 1 | UNKNOWN |
| 0 | 0.r1.unverified-forwarding.projectbaseline.com | MarkMonitor Inc. | DNStination Inc. | Domain Administrator | admin@dnstinations.com | 3991 | 0 | UNKNOWN |
| 1 | 100.r1.unverified-forwarding.verily.com | MarkMonitor Inc. | Google LLC | | | 7614 | 0 | UNKNOWN |
| 2 | 102.r3.unverified-forwarding.projectbaseline.com | MarkMonitor Inc. | DNStination Inc. | Domain Administrator | admin@dnstinations.com | 3991 | 0 | UNKNOWN |
| 3 | 102.r3.unverified-forwarding.verily.com | MarkMonitor Inc. | Google LLC | | | 7614 | 0 | UNKNOWN |
| 4 | 103.r2.unverified-forwarding.spotifyforbrands.com | Ports Group AB | Spotify AB | | abuse@portsgroup.se | 2693 | 0 | UNKNOWN |
| 5 | 109.r4.unverified-forwarding.projectbaseline.com | MarkMonitor Inc. | DNStination Inc. | Domain Administrator | admin@dnstinations.com | 3991 | 0 | UNKNOWN |
| 6 | 110.r3.unverified-forwarding.spotifyforbrands.com | Ports Group AB | Spotify AB | | abuse@portsgroup.se | 2693 | 0 | UNKNOWN |
| 7 | 112.r3.unverified-forwarding.verily.com | MarkMonitor Inc. | Google LLC | | | 7614 | 0 | UNKNOWN |
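
The same top-N selection can be done without `pandas`. Because the reputation lookup is wrapped in a `try` block, some records may lack a `riskiq_score` key, so this sketch treats a missing score as lower than any real score. The records are trimmed copies of rows shown above (hostnames truncated as displayed), plus one invented record without a score.

```python
import heapq

sample_records = [
    {'host': '0.r1.unverified-forwarding.projectbaseline.com', 'riskiq_score': 0},
    {'host': 'productreviews-unverifiedattachments.frp.zooma...', 'riskiq_score': 69},
    {'host': 'attach-an-unverified-funding-source.rechargeap...', 'riskiq_score': 1},
    {'host': 'no-score.example'},  # hypothetical record whose reputation lookup failed
]

# Pick the two highest-scoring records; a missing score sorts below zero.
top = heapq.nlargest(2, sample_records, key=lambda rec: rec.get('riskiq_score', -1))
for rec in top:
    print(rec['host'], rec['riskiq_score'])
```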


`pandas` also provides ways of grouping results that help us spot outliers. For example, we can group by the Whois registrant organization or Whois registrar, both of which may help filter out defensive registrations by legitimate companies.

In [70]:
alert_df.groupby(by='whois_registrant_org').size()

whois_registrant_org
                                     3
Bitsmedia Pte Ltd                    4
DNStination Inc.                    58
Domain Protection Services, Inc.     1
Google LLC                          43
Knock Knock WHOIS Not There, LLC     2
Redacted for Privacy Purposes        1
Spotify AB                          54
dtype: int64

In [74]:
alert_df.groupby(by='whois_registrar').size()

whois_registrar
Automattic Inc.                                          2
CSC Corporate Domains, Inc [Tag = CSC-CORP-DOMAINS]      1
CSC Domains Inc                                          1
Gandi SAS                                                1
MarkMonitor Inc.                                       101
NAMECHEAP INC                                            1
Name.com, Inc.                                           1
OVH, SAS                                                 4
Ports Group AB                                          54
dtype: int64
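
Building on these groupings, one way to reduce noise is to drop alerts whose registrar appears on a list you consider defensive. The sketch below does this in pure Python. The contents of the allow-list are an assumption chosen purely for illustration, and the records are invented samples shaped like the entries in `alert_records`.

```python
# Assumption: registrars you have decided to treat as defensive registrations.
DEFENSIVE_REGISTRARS = {'MarkMonitor Inc.', 'CSC Domains Inc'}

# Invented sample records shaped like the entries in alert_records.
sample_records = [
    {'host': 'a.example', 'whois_registrar': 'MarkMonitor Inc.'},
    {'host': 'b.example', 'whois_registrar': 'NAMECHEAP INC'},
    {'host': 'c.example', 'whois_registrar': 'Gandi SAS'},
]

# Keep only the records that are not explained by a defensive registrar.
needs_review = [
    rec for rec in sample_records
    if rec['whois_registrar'] not in DEFENSIVE_REGISTRARS
]
print([rec['host'] for rec in needs_review])
```

An exact string match is deliberately strict here; registrar names vary in spelling across Whois records, so a real allow-list would likely need some normalization.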