# New Malware AV Scans and Dynamic Analysis
This notebook contains an example of fetching newly discovered malware and comparing its AV scans with the dynamic analysis report.

### Used TitaniumCloud classes
- **NewMalwareFilesFeed** TCF-0101
- **FilesWithDetectionChanges** TCF-0109
- **AVScanners** TCA-0103
- **DynamicAnalysis** TCA-0106
- **FileDownload** TCA-0201

### Credentials
Credentials are loaded from a local file instead of being written here in plain text.
To learn how to create the credentials file, see the **Storing and using the credentials** section in the [README file](./README.md).

### 1. Fetching newly discovered malware and samples with classification changes
First, we import our credentials and create the API objects. Instead of passing the same arguments to each object explicitly,
we store them in a dictionary and unpack it as keyword arguments (`**config`).
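
To illustrate the dictionary-as-kwargs approach described above, here is a minimal sketch. `DemoClient` is a hypothetical stand-in, not an SDK class, and all values are placeholders; it only mirrors the four constructor arguments used in this notebook.

```python
class DemoClient:
    """Hypothetical stand-in for an SDK class; not part of ReversingLabs.SDK."""

    def __init__(self, host, username, password, user_agent):
        self.host = host
        self.username = username
        self.password = password
        self.user_agent = user_agent


# Placeholder values for illustration only
demo_config = {
    "host": "https://example.invalid",
    "username": "demo-user",
    "password": "demo-pass",
    "user_agent": "demo-agent",
}

# Passing every argument explicitly...
explicit = DemoClient(
    host="https://example.invalid",
    username="demo-user",
    password="demo-pass",
    user_agent="demo-agent",
)

# ...builds the same object as unpacking the dictionary
unpacked = DemoClient(**demo_config)

print(vars(explicit) == vars(unpacked))
```

The dictionary keys must match the constructor's parameter names, otherwise Python raises a `TypeError`.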

Next, we define a one-minute time window and fetch up to 50 newly discovered malware samples and up to 50 samples with classification changes.
The merged list will therefore contain at most 100 samples.
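
The time window and the size limits can be sketched without calling any API. The helpers `lookback_timestamp` and `merge_capped` below are hypothetical names introduced for illustration, and the sketch uses a timezone-aware `datetime` as an assumption so the result does not depend on the local clock settings.

```python
from datetime import datetime, timedelta, timezone


def lookback_timestamp(minutes, now=None):
    """Return the Unix timestamp (as a string) for `minutes` before `now`."""
    now = now or datetime.now(timezone.utc)
    return str(int((now - timedelta(minutes=minutes)).timestamp()))


def merge_capped(first, second, limit=50):
    """Take at most `limit` items from each list and return them merged."""
    return first[:limit] + second[:limit]


# A fixed reference time keeps the demonstration reproducible
fixed_now = datetime(2024, 1, 1, 0, 1, tzinfo=timezone.utc)
print(lookback_timestamp(1, now=fixed_now))

# 80 and 30 input items: the first list is cut to 50, the second is kept whole
print(len(merge_capped(list(range(80)), list(range(30)))))
```

Slicing with `[:limit]` is safe on lists shorter than the limit, so no length check is needed before merging.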

In [22]:
import json
from datetime import datetime, timedelta
from ReversingLabs.SDK.ticloud import NewMalwareFilesFeed, FilesWithDetectionChanges, FileAnalysis, AVScanners, DynamicAnalysis, FileDownload
from ReversingLabs.SDK.helper import NotFoundError


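# Load the credentials and the user agent string from local JSON files
with open('credentials.json') as credentials_file:
    CREDENTIALS = json.load(credentials_file)
USERNAME = CREDENTIALS.get("ticloud").get("username")
PASSWORD = CREDENTIALS.get("ticloud").get("password")
with open('../user_agent.json') as user_agent_file:
    USER_AGENT = json.load(user_agent_file)["user_agent"]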

config = {
    "host": "https://data.reversinglabs.com",
    "username": USERNAME,
    "password": PASSWORD,
    "user_agent": USER_AGENT
}

new_malware = NewMalwareFilesFeed(**config)
detection_changes = FilesWithDetectionChanges(**config)
rldata = FileAnalysis(**config)
xref = AVScanners(**config)
da = DynamicAnalysis(**config)
spex = FileDownload(**config)

In [23]:
# Start of the one-minute time window, as a Unix timestamp string
minute_back = datetime.now() - timedelta(minutes=1)
timestamp = str(int(minute_back.timestamp()))

# Fetch newly discovered malware
new_malware.set_start(
    time_format="timestamp",
    time_value=timestamp
)

new_entries = []
while not new_entries:
    resp = new_malware.pull(
        sample_available=True,
        record_limit=50
    )

    new_entries = resp.json().get("rl", {}).get("malware_detection_feed", {}).get("entries")

# Keep at most 50 entries; slicing a shorter list simply returns all of it
new_results = new_entries[:50]
    
# Fetch samples with classification changes
detection_changes.start_query(
    time_format="timestamp",
    time_value=timestamp
)

change_entries = []
while not change_entries:
    resp = detection_changes.pull_query(
        sample_available=True,
        limit=50
    )
    
    change_entries = resp.json().get("rl", {}).get("malware_scan_change_feed", {}).get("entries")
    
# Keep at most 50 entries; slicing a shorter list simply returns all of it
change_results = change_entries[:50]

# Merge both lists into new_results
new_results.extend(change_results)
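
Because the two feeds are queried independently, the same sample could in principle appear in both lists; whether that happens in practice is an assumption here. A minimal sketch of removing duplicates by `sha1` follows. `dedupe_by_sha1` is a hypothetical helper and the entries are made-up data.

```python
def dedupe_by_sha1(entries):
    """Return entries with unique "sha1" values, keeping first occurrences."""
    seen = set()
    unique = []
    for entry in entries:
        sha1 = entry.get("sha1")
        # Skip entries without a hash and hashes that were already collected
        if sha1 is None or sha1 in seen:
            continue
        seen.add(sha1)
        unique.append(entry)
    return unique


# Made-up entries; real feed entries carry more fields
sample_entries = [
    {"sha1": "aaa", "source": "new"},
    {"sha1": "bbb", "source": "new"},
    {"sha1": "aaa", "source": "changed"},
]

print(dedupe_by_sha1(sample_entries))
```

Applied to the merged list, this would avoid requesting the same Dynamic Analysis report twice.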

### 2. Grouping AV Scanner hit percentage together with Dynamic Analysis classification
After gathering up to 50 newly discovered malware samples and up to 50 samples with classification changes, we will fetch the AV Scanner hit percentage and Dynamic Analysis
classification for each of them.
The resulting list is called `verdicts` and it carries the SHA1 hash, AV Scanner hit percentage and the Dynamic Analysis classification for each sample.

Example of an object in the `verdicts` list:

```json
{
  "sha1": "0a4dfd16565083dd110cbcd24f8c75d176133ddf",
  "av_percentage": 95.83333333333334,
  "classification": "MALICIOUS",
  "risk_score": 8
}
```
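
The `av_percentage` value is the share of AV scanners that flagged the sample, computed from the scanner match count and the total scanner count. A minimal sketch of that calculation follows; `hit_percentage` is a hypothetical helper, and returning `0.0` when no scanners are reported is an assumption chosen to avoid a division by zero.

```python
def hit_percentage(scanner_match, scanner_count):
    """Return the percentage of scanners that detected the sample."""
    # Assumption: treat "no scanners reported" as a 0% hit rate
    if not scanner_count:
        return 0.0
    return (scanner_match / scanner_count) * 100


# 23 detections out of 24 scanners is roughly 95.8%
print(round(hit_percentage(23, 24), 1))
print(hit_percentage(12, 24))
print(hit_percentage(0, 0))
```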

**NOTE:** Not all gathered samples have an existing Dynamic Analysis report. Those that don't are stored in the `da_no_report` list so they can be detonated later.

In [None]:
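def analyze_xref(report):
    """Return the percentage of AV scanners that detected the sample."""
    # Use the first entry of the xref list
    xref_results = report.get("rl", {}).get("sample", {}).get("xref", [])[0]

    scanner_count = xref_results.get("scanner_count")
    scanner_match = xref_results.get("scanner_match")

    # Avoid dividing by zero when no scanners are reported
    if not scanner_count:
        return 0.0

    return (scanner_match / scanner_count) * 100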


def resolve_da_status(sample_hash, reports, no_report, processing):
    try:
        da_resp = da.get_dynamic_analysis_results(sample_hash=sample_hash, latest=True).json()
        reports.append({sample_hash: da_resp})
        
    except NotFoundError as e:
        # Test each marker against the error message separately
        message = str(e)
        if "report_not_found" in message or "error" in message:
            no_report.append(sample_hash)
        else:
            processing.append(sample_hash)
    
da_reports = []
da_no_report = []
da_processing = []

for sample in new_results:
    sha1 = sample.get("sha1")
            
    resolve_da_status(
        sample_hash=sha1,
        reports=da_reports,
        no_report=da_no_report,
        processing=da_processing
    )
    
if not da_reports:
    print("EXISTING REPORTS - da_reports list: There are no samples with existing Dynamic Analysis reports. Consider detonating some of the samples from the da_no_report list.")
    
if da_no_report:
    print(f"NON-EXISTENT REPORTS - da_no_report list: There are {len(da_no_report)} samples that have no existing Dynamic Analysis reports. Consider detonating them.")
    
if da_processing:
    print(f"PROCESSING REPORTS - da_processing list: There are {len(da_processing)} samples whose Dynamic Analysis reports are still being processed. Try fetching them later.")

# The verdicts list will carry the SHA1 hash, AV Scanner hit percentage and the Dynamic Analysis classification for each sample.
verdicts = []
    
for sample in da_reports:
    for k, v in sample.items():
        xref_report = xref.get_scan_results(hash_input=k).json()
        xref_percentage = analyze_xref(report=xref_report)
                
        da_classification = v.get("rl", {}).get("report", {}).get("classification")
        da_risk_score = v.get("rl", {}).get("report", {}).get("risk_score")
        
        verdicts.append(
            {
                "sha1": k,
                "av_percentage": xref_percentage,
                "classification": da_classification,
                "risk_score": da_risk_score
            }
        )
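
The chained `.get(..., {})` calls used above can be wrapped in a small helper. This is only a sketch: `dig` is a hypothetical name, and `fake_report` is a made-up structure that mimics the fields read in this notebook.

```python
def dig(data, *keys, default=None):
    """Walk nested dictionaries by key, returning `default` on any miss."""
    current = data
    for key in keys:
        # Stop as soon as a level is missing or is not a dictionary
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Made-up report containing only the fields used for the verdicts list
fake_report = {"rl": {"report": {"classification": "MALICIOUS", "risk_score": 8}}}

print(dig(fake_report, "rl", "report", "classification"))
print(dig(fake_report, "rl", "report", "missing"))
```

Unlike chained `.get()` calls, this helper also returns the default when an intermediate value is `None` instead of raising an `AttributeError`.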

### 3. Downloading selected malicious samples
Our final step in this scenario will be deciding which samples to download and then, of course, downloading them.  
Why would we want to download malicious samples? - To study them and use them in further research.  
How do we decide which samples to download? - That is up to you to decide. However, we can suggest the following approach:  

Since all of these samples are classified as either malicious or suspicious, we can download
the ones that have a risk score greater than 7 or an AV Scanner hit percentage higher than 50%.
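
The selection rule can be expressed as a small predicate and tried on made-up verdicts before any download is requested. `should_download` is a hypothetical helper; treating a missing risk score as 0 is an assumption made so the comparison cannot fail on `None`.

```python
def should_download(verdict, min_av_percentage=50, min_risk_score=7):
    """Return True if the verdict exceeds either threshold."""
    # Assumption: missing values count as 0
    av_percentage = verdict.get("av_percentage") or 0
    risk_score = verdict.get("risk_score") or 0
    return av_percentage > min_av_percentage or risk_score > min_risk_score


# Made-up verdicts for illustration
examples = [
    {"sha1": "aaa", "av_percentage": 95.8, "risk_score": 8},
    {"sha1": "bbb", "av_percentage": 20.0, "risk_score": None},
    {"sha1": "ccc", "av_percentage": 10.0, "risk_score": 9},
]

selected = [v["sha1"] for v in examples if should_download(v)]
print(selected)
```

Keeping the thresholds as parameters makes it easy to tighten or relax the rule without touching the download loop.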

In [None]:
for sample in verdicts:
    sha1 = sample.get("sha1")
    
    # A missing risk score is treated as 0 so the comparison cannot fail on None
    if sample.get("av_percentage") > 50 or (sample.get("risk_score") or 0) > 7:
        print(sample)
        
        resp = spex.download_sample(hash_input=sha1)
        
        with open(sha1, "wb") as file_handle:
            file_handle.write(resp.content)
            
        print(f"Downloaded sample {sha1}.")

The code block above downloads the selected samples into the current working directory (the one this notebook is running from).
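
If you would rather keep downloaded samples apart from other files, one option is to write them into a dedicated directory. The sketch below uses a hypothetical helper `save_sample` and harmless placeholder bytes in a temporary directory; the directory name is an assumption.

```python
import tempfile
from pathlib import Path


def save_sample(content, sha1, directory):
    """Write raw sample bytes to <directory>/<sha1> and return the path."""
    target_dir = Path(directory)
    # Create the directory (and any parents) if it does not exist yet
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / sha1
    target.write_bytes(content)
    return target


# Demonstration with placeholder bytes instead of a real sample
with tempfile.TemporaryDirectory() as tmp:
    saved = save_sample(b"placeholder", "0" * 40, Path(tmp) / "samples")
    print(saved.name, saved.stat().st_size)
```

In the download loop, the call could look like `save_sample(resp.content, sha1, "samples")`.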

### 4. Detonating samples (optional)
If no Dynamic Analysis reports exist yet, or we simply want more reports, we can detonate additional samples with the following code.

**WARNING**: Dynamic Analysis detonation is a heavy and lengthy action. It is wise **not to flood the Dynamic Analysis API with large numbers of samples**. For this reason,
we will detonate only the first 5 samples from the `da_no_report` list as an example.

In [None]:
for sample in da_no_report[:5]:
    da.detonate_sample(
        sample_hash=sample,
        platform="windows10"
    )
    
    print(f"Detonation of sample {sample} requested.")

After requesting detonation of these samples, we can wait for the analyses to finish and then re-run the code block that fetches Dynamic Analysis reports and creates the `verdicts` list.
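
Instead of re-running the cell by hand, the waiting step could be automated with a bounded polling loop. The sketch below is an illustration under assumptions: `poll_until` is a hypothetical helper, the attempt count and delay are arbitrary, and `fake_fetch` stands in for a real report request.

```python
import time


def poll_until(fetch, max_attempts=5, delay_seconds=60, sleep=time.sleep):
    """Call `fetch` until it returns a non-None value or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if result is not None:
            return result
        # Wait between attempts, but not after the last one
        if attempt < max_attempts:
            sleep(delay_seconds)
    return None


# Demonstration with a fake fetcher that succeeds on the third call
calls = {"count": 0}


def fake_fetch():
    calls["count"] += 1
    return {"status": "done"} if calls["count"] >= 3 else None


print(poll_until(fake_fetch, delay_seconds=0))
```

In this notebook, `fetch` could wrap `da.get_dynamic_analysis_results(...)` and return `None` when a `NotFoundError` is raised. Capping the number of attempts keeps the loop from running indefinitely if a report never becomes available.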