# RiskIQ PassiveTotal Python Library

#### *Illuminate Attack Surface Intelligence (ASI)* including *Vulnerability Intel*

## Getting Started

This notebook leverages the RiskIQ Illuminate / PassiveTotal API through the `passivetotal` Python library. 

Documentation for the library, including how to install it and configure API keys, is available here:
https://passivetotal.readthedocs.io/en/latest/getting-started.html

You will need API credentials to authenticate with the API server and to access the datasets queried in this notebook. Ask your RiskIQ contact for details or visit https://info.riskiq.net/ to contact the support team.

### Optional Dependencies

This notebook uses the `pandas` Python library primarily to improve the visual output of data tables retrieved from the API. You will need to install that library in your Python (virtual) environment (`pip install pandas`) or change the code examples to return a Python dictionary instead of a dataframe. Simply change `.as_df` to `.as_dict`.

Note that some examples may use special features in `pandas` to filter or aggregate data, but these can also be implemented in pure Python.
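As a minimal sketch of that pure-Python route, the example below filters and aggregates a small list of dictionaries. The records and their field names (`name`, `priority`, `observations`) are hypothetical stand-ins, not the schema returned by the API.

```python
from collections import Counter

# Hypothetical records standing in for rows returned by the API;
# the field names are illustrative, not the library's schema.
records = [
    {"name": "insight-a", "priority": "high", "observations": 12},
    {"name": "insight-b", "priority": "medium", "observations": 0},
    {"name": "insight-c", "priority": "high", "observations": 3},
]

# Filter: keep only records that have at least one observation.
active = [r for r in records if r["observations"] > 0]

# Aggregate: total observations per priority level.
totals = Counter()
for r in active:
    totals[r["priority"]] += r["observations"]

print(dict(totals))
```

A list comprehension plays the role of a dataframe filter, and `collections.Counter` covers simple group-and-sum aggregation.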

### Product Context

https://www.riskiq.com/solutions/attack-surface-intelligence/

### Setup Notebook
*If this returns errors, ensure you have followed the Getting Started document linked above to install necessary dependencies and configure your API keys.*

In [None]:
from passivetotal import analyzer
analyzer.init()

## Attack Surface Intelligence

### Your Attack Surface

Define a variable to store your organization's attack surface.

In [None]:
my_asi = analyzer.AttackSurface()
my_asi

The `my_asi` variable now stores an instance of the `AttackSurface` object. To learn what you can do with this object, place your cursor after the variable name, add a dot (.), and press the (tab) key. You'll see a menu of options.

The complete list of properties is available in the [reference docs](
https://passivetotal.readthedocs.io/en/latest/illuminate.html#passivetotal.analyzer.illuminate.AttackSurface).

---
RiskIQ assesses your Attack Surface by analyzing a set of insights and testing whether the discovered assets in your Attack Surface are impacted by each insight. These impacted assets are listed as observations, and the insights are grouped into three priority levels: high, medium, and low.

To obtain the list of impacted assets, first enumerate the insights, either by a specific priority or across all priority levels. The most direct route is the `all_active_insights` property.

In [None]:
my_asi.all_active_insights.as_dict

> This property is filtered to only the insights with observations, but the API provides all insights, even those without observations. To see them, use the `all_insights` property instead.

The `all_active_insights` property returns an object of type `AttackSurfaceInsights`. Complete details on the capabilities of this object are available [in the reference docs](https://passivetotal.readthedocs.io/en/latest/illuminate.html#passivetotal.analyzer.illuminate.AttackSurfaceInsights); it offers the same options as most list-like Analyzer objects.

To get started, loop through the `all_active_insights` property as if it were a Python list.

In [None]:
for insight in my_asi.all_active_insights:
    print(insight)

Each item in the `all_active_insights` list is an object of type `AttackSurfaceInsight`, which can be printed like a string but also offers additional properties. Use tab-completion here in Jupyter on one insight or consult [the docs](https://passivetotal.readthedocs.io/en/latest/illuminate.html#passivetotal.analyzer.illuminate.AttackSurfaceInsight).

For example, we can sort the high-priority insights by reverse order of observations, select the first insight in the list, and look at the observations for that insight.

In [None]:
my_asi.high_priority_insights.sorted_by('observation_count', True)[0].observations

Observations are of type `AttackSurfaceObservations`, which is also list-like in its behavior. Complete details are in the [reference docs](https://passivetotal.readthedocs.io/en/latest/illuminate.html#passivetotal.analyzer.illuminate.AttackSurfaceObservations), but again, the easiest way to start is to simply iterate the list.

In [None]:
for obs in my_asi.high_priority_insights.sorted_by('observation_count', True)[0].observations:
    print(obs)

Each observation is of type `AttackSurfaceObservation` and when printed simply shows the asset name, although many more details are available in [other properties](https://passivetotal.readthedocs.io/en/latest/illuminate.html#passivetotal.analyzer.illuminate.AttackSurfaceObservation) including the dates when the observation was last seen.

---
Consider using pandas DataFrames if you are working with ASI interactively in a notebook. Virtually every object offers an `as_df` property which is especially useful for lists.

In [None]:
my_asi.high_priority_insights.as_df

In [None]:
my_asi.high_priority_insights.only_active_insights[0].observations.as_df

> Notice the use of `only_active_insights` here to filter the list of insights to only those with observations. If you skip this step you may get an API error when you query for observations if none are available for that insight.

### Third-Party (Vendor) Attack Surfaces

#### Load all third-party ASIs

Define a variable to store all third-party attack surfaces and load them from the API.

In [None]:
vendor_asi = analyzer.illuminate.AttackSurfaces.load()
vendor_asi

> The list of third-party vendors is defined in your account settings in consultation with your RiskIQ account team. There are no options to change the composition of the list in the API.

The object returned is of type `AttackSurfaces` - this can be treated as a list, filtered, or displayed in several ways. Full details are in the [reference docs](
https://passivetotal.readthedocs.io/en/latest/illuminate.html#passivetotal.analyzer.illuminate.AttackSurfaces).

If you have a very large list of third-party vendors, the API will return the data one page at a time, but that will be handled automatically by the Python library.
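To make the idea of automatic paging concrete, here is a generic sketch of a loop that keeps requesting pages until none remain. It is not the library's implementation: `fetch_all`, `fake_fetch_page`, and the `(batch, has_more)` return shape are hypothetical names chosen for illustration, and no API call is made.

```python
def fetch_all(fetch_page):
    """Collect records from every page until the source reports no more pages."""
    records = []
    page = 0
    while True:
        batch, has_more = fetch_page(page)
        records.extend(batch)
        if not has_more:
            return records
        page += 1

# Hypothetical stand-in for an API call: three pages of vendor names.
PAGES = [["vendor-a", "vendor-b"], ["vendor-c", "vendor-d"], ["vendor-e"]]

def fake_fetch_page(page):
    # Return the requested page and whether another page follows it.
    return PAGES[page], page < len(PAGES) - 1

all_vendors = fetch_all(fake_fetch_page)
print(all_vendors)
```

When you use the library, you never write this loop yourself; the sketch only shows why a large vendor list may take several requests to load.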

This will return a list of third-party vendors (associated with the Third-Party Intelligence module) along with other third-party metadata: the attack surface ID, the name of the vendor, whether the organization is your own, whether the attack surface is a third-party vendor, and the number of active high-priority, medium-priority, and low-priority assets linked to observations detected by insights.

You can iterate through the list as with any Python list, or if you have `pandas` installed, use the `as_df` property to see a dataframe in your notebook.

In [None]:
vendor_asi.as_df

___
#### Load a specific vendor

To easily load an attack surface for a specific vendor, use the same shortcut method on the `analyzer` module you used to load your own attack surface, but supply a RiskIQ-assigned vendor ID instead.

In [None]:
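# Use a separate name so the `vendor_asi` list loaded above stays available for later cells
one_vendor_asi = analyzer.AttackSurface(51620)
one_vendor_asi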

Alternatively, pass a string to the `analyzer.AttackSurface()` method. This will load all the third-party Attack Surfaces for your account and perform a case-insensitive substring search on their names. You'll get an error message if no vendors match or if more than one vendor matched.

In [None]:
analyzer.AttackSurface('tencent')

> Although passing a name is a handy way to find an attack surface interactively, it is not recommended for use in automated scripts because of the extra round-trip to the API server to get the list of attack surfaces, the extra memory to contain them, and the extra CPU cycles to filter them. Instead, consider storing vendor IDs in your own system and iterating through them explicitly.
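The name lookup described above can be pictured with a small pure-Python sketch: a case-insensitive substring search that raises an error when zero or several names match. The `find_one` helper and the vendor records are hypothetical and only illustrate the behavior; they are not the library's code.

```python
def find_one(surfaces, query):
    """Return the single surface whose name contains query, ignoring case."""
    needle = query.lower()
    matches = [s for s in surfaces if needle in s["name"].lower()]
    if not matches:
        raise LookupError(f"no attack surface matches {query!r}")
    if len(matches) > 1:
        names = ", ".join(s["name"] for s in matches)
        raise LookupError(f"{query!r} is ambiguous: {names}")
    return matches[0]

# Hypothetical vendor records; real IDs and names come from your account.
surfaces = [
    {"id": 101, "name": "Example Cloud Hosting"},
    {"id": 102, "name": "Example Payments"},
    {"id": 103, "name": "Sample Logistics"},
]

print(find_one(surfaces, "LOGISTICS")["id"])
```

Because every lookup scans the full list, a broad search term such as `"example"` is rejected as ambiguous here, which mirrors the error you can expect when more than one vendor matches.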

The unique identifier for a third-party vendor is available in the `id` property. Use that ID to load the attack surface directly.

In [None]:
analyzer.AttackSurface('tencent').id

In [None]:
focus_asi = analyzer.AttackSurface(71172)
focus_asi
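For automated scripts, one way to iterate through stored vendor IDs is sketched below. The `load_surfaces` helper, the `STORED_VENDOR_IDS` list, and `fake_loader` are hypothetical; the loader is passed in as a parameter so the sketch runs without API credentials.

```python
def load_surfaces(vendor_ids, loader):
    """Load one attack surface per stored ID, collecting failures separately."""
    loaded, failed = {}, {}
    for vendor_id in vendor_ids:
        try:
            loaded[vendor_id] = loader(vendor_id)
        except Exception as exc:  # keep going if one vendor fails to load
            failed[vendor_id] = str(exc)
    return loaded, failed

# Hypothetical IDs kept in your own system (a config file or database).
STORED_VENDOR_IDS = [1001, 1002, 1003]

def fake_loader(vendor_id):
    # Stand-in for analyzer.AttackSurface(vendor_id); no API call is made here.
    if vendor_id == 1002:
        raise ValueError("vendor not found")
    return f"attack-surface-{vendor_id}"

loaded, failed = load_surfaces(STORED_VENDOR_IDS, fake_loader)
print(sorted(loaded), failed)
```

In a notebook with the library configured, you could pass `analyzer.AttackSurface` as the `loader` argument, since it accepts a vendor ID as shown in the cells above.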

Once you load the third-party attack surface, you can interact with it exactly as you did for your own attack surface earlier in this notebook. Here, we display a vendor's attack surface as a dataframe to see a quick snapshot of meaningful insights.

In [None]:
focus_asi.as_df.T

> The `T` property of pandas dataframes transposes the table (rows become columns), which improves formatting when you only have one row of data.

We can return all active insights with the `all_active_insights` property.

In [None]:
focus_asi.all_active_insights.as_df

> Remember, we are using the `as_df` property to improve usability in a notebook context, but you can easily access the underlying objects, either by iterating through the `focus_asi.all_active_insights` property, or using the `as_dict` property instead of `as_df` to get the data as a regular Python dictionary.

Insights can be treated like strings to make printing them easier, but remember there are more fields available on each insight.

In [None]:
for insight in focus_asi.all_active_insights:
    print(insight)

---
Using simple string matching, we can search a vendor's attack surface for a specific insight.

In [None]:
for insight in focus_asi.all_active_insights:
    if insight.name == 'ASI: REvil Ransomware Actors Exploit Kaseya VSA Software in Broad Supply Chain Attack':
        for obs in insight.observations:
            print(obs)

The `all_active_insights` property of an `AttackSurface` object offers a number of filtering options, including `filter_substring`, which performs a case-insensitive match on any string field in the objects in that list. This method is available on most `RecordList` type objects in the Analyzer.

In [None]:
for insight in focus_asi.all_active_insights.filter_substring(name='kaseya'):
    for obs in insight.observations:
        print(obs)
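If you prefer to work with plain Python objects, a case-insensitive substring filter can be sketched as follows. The `Insight` dataclass and the `substring_filter` helper are hypothetical stand-ins written for this example; they imitate the behavior described above and are not the library's implementation.

```python
from dataclasses import dataclass

@dataclass
class Insight:
    # Hypothetical stand-in for an insight; only `name` mirrors the notebook.
    name: str
    observation_count: int

def substring_filter(records, **criteria):
    """Keep records whose named attributes contain the given text, ignoring case."""
    def matches(record):
        return all(
            needle.lower() in str(getattr(record, field, "")).lower()
            for field, needle in criteria.items()
        )
    return [r for r in records if matches(r)]

insights = [
    Insight("ASI: REvil Ransomware Actors Exploit Kaseya VSA Software in Broad Supply Chain Attack", 4),
    Insight("Hypothetical insight about expired certificates", 9),
]

hits = substring_filter(insights, name="kaseya")
print([i.name for i in hits])
```

Lowercasing both sides before the `in` test is what makes the match case-insensitive, so `'kaseya'` finds `Kaseya`.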

We can apply the same technique to search across all vendor attack surfaces. Here, we iterate (loop through) the `vendor_asi` variable we stored earlier that contains all third-party attack surfaces, and then store the length of the insight list that matches our keyword.

In [None]:
for vendor in vendor_asi:
    kaseya_insights = len(vendor.all_active_insights.filter_substring(name='kaseya'))
    print(vendor.name, kaseya_insights) 
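When the vendor list is long, it can help to keep the counts in a dictionary and report only the affected vendors. The sketch below uses a hypothetical `vendor_insights` mapping of made-up vendor and insight names in place of live API data.

```python
# Hypothetical data: vendor name -> names of that vendor's active insights.
vendor_insights = {
    "Vendor A": ["Kaseya VSA exposure", "Expired certificates"],
    "Vendor B": ["Expired certificates"],
    "Vendor C": ["KASEYA agent detected", "Kaseya VSA exposure"],
}

keyword = "kaseya"

# Count matching insights per vendor with a case-insensitive substring test.
match_counts = {
    vendor: sum(keyword in name.lower() for name in names)
    for vendor, names in vendor_insights.items()
}

# Keep only affected vendors, most matches first.
affected = sorted(
    ((vendor, count) for vendor, count in match_counts.items() if count),
    key=lambda pair: pair[1],
    reverse=True,
)

for vendor, count in affected:
    print(vendor, count)
```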

## Vulnerability Intelligence

RiskIQ's Vulnerability Intelligence (Vuln Intel) provides a practical picture of vulnerability risk, focused on a specific Attack Surface (your own or a third-party vendor). 

---
### CVEs for your Attack Surface

In the `analyzer` module, Vuln Intel is offered primarily through the `cves` property of an Attack Surface. 

In [None]:
analyzer.AttackSurface().cves

The returned object is of type `AttackSurfaceCVEs` that can be iterated just like any other `analyzer` record list.

In [None]:
for cve in analyzer.AttackSurface().cves:
    print(cve)

Consider using `pandas` dataframes for a friendlier view.

In [None]:
analyzer.AttackSurface().cves.as_df

---
### CVEs for third-party vendors

The `cves` property of `AttackSurface` objects also works with third-party (vendor) attack surfaces to discover vulnerabilities and impacted assets within other attack surfaces.

In [None]:
analyzer.AttackSurface('rhythm').cves.as_df

---
### CVE Observations

An Observation is a discovered asset (e.g., a hostname or IP address) within a specific Attack Surface that is impacted by a vulnerability. Access the `observations` property of a specific CVE to get the complete list.

In [None]:
focus_cve = (
    analyzer
    .AttackSurface()
    .cves
    .filter_fn(lambda c: c.score > 80 and c.observation_count > 50)
    .sorted_by('score', reverse=True)[0]
)
focus_cve.as_df.T

> Here, we used the `filter_fn` method available on all analyzer RecordLists to apply a custom test to each CVE, keeping only those with a score greater than 80 and more than 50 observations. The results are then sorted by score in descending order, and the `[0]` syntax gives us the first item on the list.
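The same filter, sort, and select steps can be written in pure Python. In this sketch the `Cve` dataclass and its sample values are hypothetical; only the attribute names `score` and `observation_count` follow the example above.

```python
from dataclasses import dataclass

@dataclass
class Cve:
    # Hypothetical stand-in; the attribute names follow the notebook's example.
    cve_id: str
    score: int
    observation_count: int

cves = [
    Cve("CVE-0000-0001", 92, 120),
    Cve("CVE-0000-0002", 85, 10),
    Cve("CVE-0000-0003", 97, 75),
    Cve("CVE-0000-0004", 60, 300),
]

# Filter step: the same test as the lambda passed to filter_fn.
candidates = [c for c in cves if c.score > 80 and c.observation_count > 50]

# Sort step: highest score first, then take the first item.
ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
top = ranked[0] if ranked else None  # guard against an empty result

print(top.cve_id if top else "no CVE matched")
```

The guard on `ranked` matters because indexing an empty list with `[0]` raises an `IndexError`; consider a similar check if your thresholds might exclude every CVE.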

Obtain the list of impacted assets with the `observations` property of the CVE.

In [None]:
focus_cve.observations

In [None]:
for obs in focus_cve.observations:
    print(obs)

Each observation can be printed as a string, which directly accesses the identifier for the impacted asset (the IP address or the hostname, for example), but the `obs` object has more properties available. 

In [None]:
focus_cve.observations[0].firstseen, focus_cve.observations[0].lastseen, focus_cve.observations[0].type

Or, view the entire list of observations as a dataframe.

In [None]:
focus_cve.observations.as_df

---
### Vulnerability Intelligence Articles

Complete details on a vulnerability are available through the top-level `VulnArticle` objects. You can access them from the `article` property of a CVE or instantiate them directly if you already know the CVE identifier.

#### Load a vuln article by ID

In [None]:
article = analyzer.illuminate.VulnArticle.load('CVE-2021-23017')
print(article.description)

In [None]:
article.as_df.T

#### Access a vuln article from a CVE

In [None]:
focus_article = focus_cve.article
focus_article.to_dataframe()

> The `to_dataframe()` method usually operates in the background when you access the `as_df` property on `analyzer` objects, but for some objects it provides additional functionality. We can use the `view` param to access other lists in the article.

In [None]:
focus_article.to_dataframe(view='references')

In [None]:
focus_article.to_dataframe(view='components')

In [None]:
focus_article.to_dataframe(view='impacts')

#### Impacted Assets

One of the most powerful features of the Vulnerability Intelligence module is the ability to quickly assess whether your attack surface is impacted by a vulnerability, and also whether any third-party (vendor) attack surfaces may be impacted. You can obtain this information from the `cves` property of a given attack surface, as described above, but you can also access it directly from an article.

The article provides `observation_count` and `observations` properties that are focused on your own attack surface.

In [None]:
focus_article.observation_count

In [None]:
len(focus_article.observations)

> We are showing the length of the observations list, but of course you can also access the list directly.

To view which third-party vendors have assets impacted by the vulnerability, access the `attack_surfaces` property. These are returned as a list-like `VulnArticleImpacts` object; each item provides the vendor's name and the count of impacted assets.

In [None]:
focus_article.attack_surfaces

In [None]:
for vendor in focus_article.attack_surfaces:
    print(vendor)

The list of impacted assets (observations) is available on a given vendor's `VulnArticleImpact` object in the `observation_count` and `observations` properties, just like with our own attack surface.

In [None]:
impacted_vendor = focus_article.attack_surfaces.filter_substring(vendor_name='union')[0]
impacted_vendor.observation_count

In [None]:
impacted_vendor.observations.as_df

#### Access related OSINT Articles

Many CVEs are related to open-source threat intelligence articles collected and published by RiskIQ analysts. These articles are available through the `analyzer.AllArticles` object, but you can also obtain related articles directly from a Vulnerability Article.

In [None]:
focus_article.osint_articles.as_df