# VirusTotal Lookup
VTLookupV3 is a module in the MSTICPy library that provides integration with the VirusTotal API for querying file and URL reputation data. The module includes a `VTLookupV3` class that can be used to submit queries to the VirusTotal API and retrieve response data.

`VTLookupV3` is aimed at security researchers and analysts who need to query VirusTotal for reputation data on files and URLs, and it is one of many modules in the MSTICPy library.

This notebook demonstrates the capabilities of `VTLookupV3` through an example investigation inspired by [F5: Attackers Use New, Sophisticated Ways to Install Cryptominers](https://www.f5.com/labs/articles/threat-intelligence/attackers-use-new--sophisticated-ways-to-install-cryptominers). Some queries, such as `search`, require a [VirusTotal Enterprise license](https://support.virustotal.com/hc/en-us/articles/360001387057-VirusTotal-Intelligence-Introduction).


## Features
The capabilities of `VTLookupV3` are encapsulated in a single class. Most of its methods come in two forms: one that takes a single IOC and one that takes a DataFrame of IOCs.

| Single IOC methods | Multiple IOCs methods | Description |
|--------------------|-----------------------|-------------|
| VTLookupV3.lookup_ioc | VTLookupV3.lookup_iocs | Queries VT API for detection and details of provided IOC(s) |
| VTLookupV3.lookup_ioc_relationships | VTLookupV3.lookup_iocs_relationships | Queries VT API for a specific relationship type for provided IOC(s) |
| VTLookupV3.get_object | VTLookupV3.search | Queries VT API for full information about IOC(s) |
| VTLookupV3.get_file_behavior | N/A | Queries VT API for sandbox / detonation information about IOC(s) |

### Initializing MSTICPy

In [None]:
# Built-in Libraries
import re
from urllib3 import get_host

# MSTICPy
import msticpy as mp
from msticpy.context.vtlookupv3 import VTLookupV3

# Third-Party Libraries
import pandas as pd


# Initialize MSTICPy
mp.init_notebook()

### Configuration File
MSTICPy offers flexible configuration options for selecting credentials and pre-building queries. For more details, see the [MSTICPy Package Configuration](https://msticpy.readthedocs.io/en/latest/getting_started/msticpyconfig.html) documentation. The YAML below outlines the alternative `AuthKey` forms for supplying a VirusTotal API key.


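```yaml
TIProviders:
  VirusTotal:
    Args:
      # Use exactly one AuthKey entry; YAML does not allow duplicate keys.
      # Option 1: secret stored in Azure Key Vault
      AuthKey:
        KeyVault: MyKeyVault/MySecret
      # Option 2: key read from an environment variable
      # AuthKey:
      #   EnvironmentVar: "VIRUSTOTAL_AUTH"
      # Option 3: key in plaintext
      # AuthKey: "MY_API_KEY_IN_PLAINTEXT"
```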
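To make the difference between the plaintext and `EnvironmentVar` forms concrete, the sketch below mimics how such a setting might be resolved to a key. It is an illustration only: `resolve_auth_key` is a hypothetical helper, not MSTICPy's settings loader, and Key Vault lookups are deliberately left out.

```python
import os
from typing import Mapping, Union


def resolve_auth_key(
    setting: Union[str, Mapping[str, str]],
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Resolve an AuthKey-style setting to a key string.

    Hypothetical helper for illustration; not part of MSTICPy.
    """
    # A bare string is treated as the key itself (the plaintext form).
    if isinstance(setting, str):
        return setting
    # A mapping with EnvironmentVar names the variable that holds the key.
    if "EnvironmentVar" in setting:
        var_name = setting["EnvironmentVar"]
        value = environ.get(var_name)
        if not value:
            raise KeyError(f"environment variable {var_name!r} is not set")
        return value
    # Key Vault access needs Azure credentials and is out of scope here.
    raise NotImplementedError("only plaintext and EnvironmentVar are sketched")


plain_key = resolve_auth_key("MY_API_KEY_IN_PLAINTEXT")
env_key = resolve_auth_key(
    {"EnvironmentVar": "VIRUSTOTAL_AUTH"},
    environ={"VIRUSTOTAL_AUTH": "key-from-env"},
)
```

Passing `environ` explicitly keeps the sketch testable without touching the real process environment.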

In [2]:
# uses settings from msticpyconfig.yaml
vt_client = VTLookupV3()

### Hunting for Configuration Files related to Cryptomining
VirusTotal Intelligence Search offers 40+ search modifiers for building fine-tuned queries that narrow down an investigation.

| Modifier | Description |
|----------|-------------|
| engines  | Malware family identified by Anti-Virus |
| tag      | Additional entity descriptor |
| have     | Checks if the entity contains a specific attribute |


```
Query: engines:miner AND tag:json AND have:itw
Description: Look for malware that is classified as a 'miner', is tagged with 'json', and has in-the-wild ('itw') data
```
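A query like this is a plain string of `modifier:value` terms joined with `AND`, so it can also be assembled programmatically. The sketch below is illustrative only: `build_vt_query` is a hypothetical helper, not part of MSTICPy or the VirusTotal client.

```python
from typing import Iterable, Tuple


def build_vt_query(terms: Iterable[Tuple[str, str]]) -> str:
    """Join (modifier, value) pairs into an AND-ed search string.

    Hypothetical helper for illustration; not a MSTICPy function.
    """
    parts = []
    for modifier, value in terms:
        if not modifier or not value:
            raise ValueError("each term needs a modifier and a value")
        parts.append(f"{modifier}:{value}")
    return " AND ".join(parts)


query = build_vt_query([("engines", "miner"), ("tag", "json"), ("have", "itw")])
print(query)  # engines:miner AND tag:json AND have:itw
```

The resulting string can be passed to `vt_client.search` in place of a hand-written literal.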

In [3]:
# perform lookup and limit results to 100 or fewer
cryptominer_df = vt_client.search('engines:miner AND tag:json AND have:itw', 100)

# filter down to necessary columns
cryptominer_df = cryptominer_df[ ['id', 'type', 'names'] ]

# expand out names list, and filter down to JSON file names
# (na=False treats missing names, e.g. from empty lists, as non-matches)
cryptominer_df = cryptominer_df.explode('names')
cryptominer_df = cryptominer_df[ cryptominer_df['names'].str.endswith('.json', na=False) ]

display(cryptominer_df.head())

Unnamed: 0,id,type,names
0,d8d6ff48c3d2df3c8289e3c519e15b9a2110d08975eaddef995d95e4c0d970cd,file,config.json
0,d8d6ff48c3d2df3c8289e3c519e15b9a2110d08975eaddef995d95e4c0d970cd,file,zbetcheckin_tracker_config.json
0,a80a424f11f3a55cd427011485b95f1f93b7f8492a1cf5dbe5ada1df4ca0f5ad,file,server2.0.0.json
0,a80a424f11f3a55cd427011485b95f1f93b7f8492a1cf5dbe5ada1df4ca0f5ad,file,C:\Users\<USER>\Downloads\server.json
0,a80a424f11f3a55cd427011485b95f1f93b7f8492a1cf5dbe5ada1df4ca0f5ad,file,C:\Users\user\Desktop\server.json
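
The `explode` and filter steps can be seen in isolation on a small, made-up DataFrame (the IDs and file names below are placeholders, not VirusTotal results). This is a minimal sketch, assuming pandas is installed. It passes `na=False` because `explode` turns an empty `names` list into a missing value, which would otherwise make the boolean mask invalid.

```python
import pandas as pd

# Placeholder data: one row per file, each with a list of observed names.
toy_df = pd.DataFrame(
    {
        "id": ["hash-a", "hash-b", "hash-c"],
        "type": ["file", "file", "file"],
        "names": [["config.json", "miner.exe"], [], ["pool.json"]],
    }
)

# One row per (file, name); the empty list becomes a single NaN row.
exploded = toy_df.explode("names")

# Keep only names ending in .json; NaN rows are treated as non-matches.
json_only = exploded[exploded["names"].str.endswith(".json", na=False)]
json_names = json_only["names"].tolist()
print(json_only)
```

Note that the original index is repeated for every exploded row, which is why the output above shows the same index value on several lines.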


### Search for In-The-Wild (ITW) information
"In-The-Wild" (ITW) is a term used in computer security to describe malware that is actively spreading and infecting systems in real-world situations, as opposed to being studied in a controlled laboratory environment.

In the context of VirusTotal, ["In-The-Wild" (ITW)](https://support.virustotal.com/hc/en-us/articles/360001385897-File-search-modifiers) data records the URLs from which a file has been observed being downloaded. The `have:itw` modifier used in the search above restricts results to files that carry this data, and the `itw_urls` relationship queried below returns those URLs. They show where a sample is hosted and distributed, which helps assess how widely it is being served.

In [4]:
# look up in-the-wild URLs associated with files in cryptominer_df
cryptominer_itw_df = vt_client.lookup_iocs_relationships(cryptominer_df, 'itw_urls', 'id', 'type', all_props=True)

# filter and reorder columns 
cryptominer_itw_df = cryptominer_itw_df[ ['source', 'source_type', 'target', 'target_type', 'relationship_type'] ]

# join itw data on cryptominer_df
cryptominer_itw_df = cryptominer_itw_df.merge(cryptominer_df, how='inner', left_on='source', right_on='id')

# rename and reorder columns
cryptominer_itw_df.rename(columns={'names': 'source.name'}, inplace=True)
cryptominer_itw_df = cryptominer_itw_df[ ['source', 'source.name', 'source_type', 'target', 'target_type', 'relationship_type'] ]
cryptominer_itw_df.drop_duplicates(inplace=True)

display(cryptominer_itw_df.head())

Unnamed: 0,source,source.name,source_type,target,target_type,relationship_type
0,d8d6ff48c3d2df3c8289e3c519e15b9a2110d08975eaddef995d95e4c0d970cd,config.json,file,81734d74cdbc7a2b58442e8bae0f5b0352717346107e30d37e8e041248fe503d,url,itw_urls
1,d8d6ff48c3d2df3c8289e3c519e15b9a2110d08975eaddef995d95e4c0d970cd,zbetcheckin_tracker_config.json,file,81734d74cdbc7a2b58442e8bae0f5b0352717346107e30d37e8e041248fe503d,url,itw_urls
4,a80a424f11f3a55cd427011485b95f1f93b7f8492a1cf5dbe5ada1df4ca0f5ad,server2.0.0.json,file,ac441d52afe44184afbf85baab9722c94859dbf834b95359e3c16a26fe70538c,url,itw_urls
5,a80a424f11f3a55cd427011485b95f1f93b7f8492a1cf5dbe5ada1df4ca0f5ad,C:\Users\<USER>\Downloads\server.json,file,ac441d52afe44184afbf85baab9722c94859dbf834b95359e3c16a26fe70538c,url,itw_urls
6,a80a424f11f3a55cd427011485b95f1f93b7f8492a1cf5dbe5ada1df4ca0f5ad,C:\Users\user\Desktop\server.json,file,ac441d52afe44184afbf85baab9722c94859dbf834b95359e3c16a26fe70538c,url,itw_urls


### Enrich ITW Information
By default, `VTLookupV3` returns relationship data as 'id' and 'type' pairs. To obtain human-readable information, such as the URL behind each ID, enrichment is needed.

In [5]:
# gather metadata about target (itw url)
itw_enrichment = vt_client.lookup_iocs(cryptominer_itw_df, 'target', 'target_type', all_props=True)
# copy the column subset so the assignments below do not write to a view
itw_enrichment = itw_enrichment[ ['id', 'url'] ].copy()
# get_host returns (scheme, host, port); keep the host
itw_enrichment['host'] = itw_enrichment.apply(lambda x: get_host(x.url)[1], axis=1)
# hosts containing any ASCII letter are treated as domains, the rest as IP addresses
itw_enrichment['host.type'] = itw_enrichment.apply(lambda x: ('ip_address', 'domain')[len(re.findall(r"[A-Za-z]", x.host)) > 0], axis=1)
itw_enrichment.drop_duplicates(inplace=True)

display(itw_enrichment.head())

Unnamed: 0,id,url,host,host.type
0,81734d74cdbc7a2b58442e8bae0f5b0352717346107e30d37e8e041248fe503d,http://27.1.1.34:8080/docs/config.json,27.1.1.34,ip_address
0,ac441d52afe44184afbf85baab9722c94859dbf834b95359e3c16a26fe70538c,https://minerjson.oss-cn-beijing.aliyuncs.com/server2.0.0.json?t=638058298707060546,minerjson.oss-cn-beijing.aliyuncs.com,domain
0,4a97690d7813c24629ee4ac539503fb1f5a57ed98ded075dc8d785f570b50a6f,https://minerjson.oss-cn-beijing.aliyuncs.com/server2.0.0.json?t=637704108706432000,minerjson.oss-cn-beijing.aliyuncs.com,domain
0,578e3967424a350fcebfc7389c814bba5f71d8c62f03649fd39101fe6de69029,https://minerjson.oss-cn-beijing.aliyuncs.com/server2.0.0.json?t=638124776855263671,minerjson.oss-cn-beijing.aliyuncs.com,domain
0,b1868edae176ede468699c1cb094f611d4a1fef70f7f507736a03f14247f8666,https://minerjson.oss-cn-beijing.aliyuncs.com/server2.0.0.json?t=637703350378568000,minerjson.oss-cn-beijing.aliyuncs.com,domain
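
Classifying a host by whether it contains letters works for the data above but would mislabel IPv6 literals, which contain hexadecimal letters. As an alternative sketch (not what this notebook does), the standard library can parse the URL and test the host directly. `classify_host` is a hypothetical helper name.

```python
import ipaddress
from typing import Tuple
from urllib.parse import urlsplit


def classify_host(url: str) -> Tuple[str, str]:
    """Return (host, host_type) for a URL.

    host_type is 'ip_address', 'domain', or 'unknown' when no host is found.
    Hypothetical helper for illustration.
    """
    host = urlsplit(url).hostname or ""
    if not host:
        return "", "unknown"
    try:
        # Succeeds for both IPv4 and IPv6 literals.
        ipaddress.ip_address(host)
    except ValueError:
        return host, "domain"
    return host, "ip_address"


print(classify_host("http://27.1.1.34:8080/docs/config.json"))
print(classify_host("https://minerjson.oss-cn-beijing.aliyuncs.com/server2.0.0.json"))
```

Applied to the enrichment data, a call such as `itw_enrichment['url'].map(classify_host)` would yield one `(host, host_type)` tuple per row.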


### Merging and Summarizing Findings
The final step is to merge, format, and summarize the findings by using methods native to pandas.

In [6]:
# merge, reorder, and rename dataframe
cryptominer_itw_df = cryptominer_itw_df.merge(itw_enrichment, how='left', left_on='target', right_on='id')
cryptominer_itw_df = cryptominer_itw_df[ ['source', 'source.name', 'source_type', 'target', 'url', 'target_type', 'host', 'relationship_type', ] ]
cryptominer_itw_df.rename(columns={"source_type": "source.type", "url": "target.name", 'host':'target.host'}, inplace=True)

display(cryptominer_itw_df.head())

display(cryptominer_itw_df.groupby('target.host').count()[['source']].sort_values('source', ascending=False))

Unnamed: 0,source,source.name,source.type,target,target.name,target_type,target.host,relationship_type
0,d8d6ff48c3d2df3c8289e3c519e15b9a2110d08975eaddef995d95e4c0d970cd,config.json,file,81734d74cdbc7a2b58442e8bae0f5b0352717346107e30d37e8e041248fe503d,http://27.1.1.34:8080/docs/config.json,url,27.1.1.34,itw_urls
1,d8d6ff48c3d2df3c8289e3c519e15b9a2110d08975eaddef995d95e4c0d970cd,zbetcheckin_tracker_config.json,file,81734d74cdbc7a2b58442e8bae0f5b0352717346107e30d37e8e041248fe503d,http://27.1.1.34:8080/docs/config.json,url,27.1.1.34,itw_urls
2,a80a424f11f3a55cd427011485b95f1f93b7f8492a1cf5dbe5ada1df4ca0f5ad,server2.0.0.json,file,ac441d52afe44184afbf85baab9722c94859dbf834b95359e3c16a26fe70538c,https://minerjson.oss-cn-beijing.aliyuncs.com/server2.0.0.json?t=638058298707060546,url,minerjson.oss-cn-beijing.aliyuncs.com,itw_urls
3,a80a424f11f3a55cd427011485b95f1f93b7f8492a1cf5dbe5ada1df4ca0f5ad,C:\Users\<USER>\Downloads\server.json,file,ac441d52afe44184afbf85baab9722c94859dbf834b95359e3c16a26fe70538c,https://minerjson.oss-cn-beijing.aliyuncs.com/server2.0.0.json?t=638058298707060546,url,minerjson.oss-cn-beijing.aliyuncs.com,itw_urls
4,a80a424f11f3a55cd427011485b95f1f93b7f8492a1cf5dbe5ada1df4ca0f5ad,C:\Users\user\Desktop\server.json,file,ac441d52afe44184afbf85baab9722c94859dbf834b95359e3c16a26fe70538c,https://minerjson.oss-cn-beijing.aliyuncs.com/server2.0.0.json?t=638058298707060546,url,minerjson.oss-cn-beijing.aliyuncs.com,itw_urls


target.host,source
45.90.220.62,45
cdn.discordapp.com,45
raw.githubusercontent.com,45
minerjson.oss-cn-beijing.aliyuncs.com,15
27.1.1.34,2
k2ygoods.top,2
main.cloudfronts.net,2
50.63.143.208,1
hk.kuai-go.com,1
safe.kuai-go.com,1
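
Because `cryptominer_itw_df` holds one row per file name, `count()` tallies rows rather than distinct files, so a file seen under several names is counted several times. A minimal sketch on made-up data (placeholder hashes and hosts), assuming pandas is installed, contrasts this with `nunique()`:

```python
import pandas as pd

# Placeholder data: hash-a appears twice because it has two file names.
toy_df = pd.DataFrame(
    {
        "source": ["hash-a", "hash-a", "hash-b", "hash-c"],
        "source.name": ["config.json", "cfg_copy.json", "server.json", "pool.json"],
        "target.host": [
            "198.51.100.7",
            "198.51.100.7",
            "files.example.com",
            "files.example.com",
        ],
    }
)

by_host = toy_df.groupby("target.host")["source"]
row_counts = by_host.count()      # rows per host
file_counts = by_host.nunique()   # distinct file hashes per host

summary = pd.DataFrame({"rows": row_counts, "distinct_files": file_counts})
print(summary.sort_values("distinct_files", ascending=False))
```

Which figure is more useful depends on the question: row counts reflect how many name and URL combinations point at a host, while distinct counts reflect how many different samples it serves.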

