# アマヤラ Lab

> by tsumarios

The AMAYARA (Android Malware Analysis YARA) Lab project provides a ready-to-use Jupyter Lab environment for analysing Android malware with YARA rules.

### Prerequisites

Before proceeding, you need to install the dependencies, import the required modules, and define a few helper functions and settings.


#### Dependencies

The YARA Python library (`yara-python`) is required. The `requests` package imported below must also be available in your environment.

In [None]:
!pip3 install yara-python

#### Imports

In [2]:
import os
import json
import requests
import yara
from hashlib import md5, sha1, sha256
from zipfile import ZipFile

#### Utils

Define helper functions for querying VirusTotal and for computing file digests.

*Please remember to export your API key as an environment variable:* `export VT_API_KEY=<your_API_key>`.
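
If the variable is not exported, `os.environ.get` returns `None` and the problem only surfaces later, when the first request is sent. A minimal sketch of a fail-fast lookup follows; the helper name `require_env` and the throwaway variable `AMAYARA_DEMO_KEY` are assumptions made for illustration and are not part of the lab.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f'{name} is not set; export it before starting Jupyter Lab')
    return value


# Demo with a throwaway variable so the example runs without a real key
os.environ['AMAYARA_DEMO_KEY'] = 'not-a-real-key'
print(require_env('AMAYARA_DEMO_KEY'))  # not-a-real-key
```

In the lab, `VT_API_KEY = require_env('VT_API_KEY')` could stand in for the plain lookup if an early error is preferred.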

In [3]:
# VirusTotal setup
VT_API_KEY = os.environ.get('VT_API_KEY')


def get_virus_total_stats(file, md5_digest):
    """
    Return VirusTotal statistics for the file.
    """
    result = {}

    # Get file report (if exists)
    url_report = f'https://www.virustotal.com/api/v3/files/{md5_digest}'
    headers_report = {'Accept': 'application/json', 'x-apikey': VT_API_KEY}
    response = requests.get(url_report, headers=headers_report)

    # If the file report does not exist, upload the file and analyse it
    if response.status_code == 404:
        url_upload = 'https://www.virustotal.com/api/v3/files'
        headers_upload = {'x-apikey': VT_API_KEY}
        # Use a context manager so the file handle is closed after the upload
        with open(file, 'rb') as f:
            requests.post(url_upload, files={'file': (file, f)}, headers=headers_upload)
        # Then get file report (it may not be available yet while the analysis is queued)
        response = requests.get(url_report, headers=headers_report)

    # Return desired info from the file object (values are None when no report is available)
    attributes = response.json().get('data', {}).get('attributes', {})
    result['report_url'] = f'https://www.virustotal.com/gui/file/{md5_digest}'
    result['suggested_threat_label'] = attributes.get('popular_threat_classification', {}).get('suggested_threat_label')
    result['last_analysis_stats'] = attributes.get('last_analysis_stats')
    return result


def get_file_digests(file):
    """
    Return the md5, sha1 and sha256 digests for a file.
    """
    md5_digest, sha1_digest, sha256_digest = md5(), sha1(), sha256()

    with open(file, 'rb') as f:
        # Read and update hash in chunks of 4K
        for byte_block in iter(lambda: f.read(4096), b''):
            md5_digest.update(byte_block)
            sha1_digest.update(byte_block)
            sha256_digest.update(byte_block)

    return md5_digest, sha1_digest, sha256_digest
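
As a quick sanity check of the chunked approach used in `get_file_digests`, the sketch below hashes a temporary file in 4 KiB blocks and compares the result with hashing the same bytes in a single call. The helper name `chunked_sha256` and the sample payload are illustrative only and not part of the lab.

```python
import os
import tempfile
from hashlib import sha256


def chunked_sha256(path, block_size=4096):
    """Hash a file block by block so large APKs never need to fit in memory."""
    digest = sha256()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(block_size), b''):
            digest.update(block)
    return digest.hexdigest()


# Illustrative payload: larger than one block and not a multiple of the block size
payload = b'AMAYARA' * 1000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)

try:
    same = chunked_sha256(tmp.name) == sha256(payload).hexdigest()
finally:
    os.remove(tmp.name)

print(same)  # True
```

Feeding the blocks to `update` one after another is equivalent to hashing their concatenation, which is why the three digests in `get_file_digests` can share a single pass over the file.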

## Include files and YARA rules

Add the APK files that you want to analyse to the `files` folder.
YARA rules (`.yar` files) need to be added to the `rules` folder.

*Note that you can also place them in subfolders, as the script walks the `files` and `rules` folders recursively.*

### Settings

Run the following cell to collect the file and rule paths and to compile the rules.
Please remember to re-run it every time you add or delete files or rules.

In [4]:
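def get_paths(folder, extension):
    """
    Return a dict mapping each matching file (path relative to folder) to its full path.
    """
    paths = {}
    for root, dirs, files in os.walk(folder, topdown=False):
        for name in files:
            if name.endswith(extension):
                path = os.path.join(root, name)
                # Key by relative path so that same-named files in different subfolders do not overwrite each other
                paths[os.path.relpath(path, folder)] = path

    return paths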


# Retrieve paths for apk file(s) and YARA rule(s)
apk_files = get_paths('./files', '.apk')
rules_paths = get_paths('./rules', '.yar')
# Compile rules
rules = yara.compile(filepaths=rules_paths)
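
When rules or samples are organised in subfolders, it can help to see which file names occur more than once, since a mapping keyed by bare file name can hold only one path per name. The following is a minimal sketch of such a check; the helper `find_duplicate_names` and the folder names in the demo tree are hypothetical.

```python
import os
import tempfile
from collections import defaultdict


def find_duplicate_names(folder, extension):
    """Map each file name that occurs more than once under `folder` to its paths."""
    by_name = defaultdict(list)
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.endswith(extension):
                by_name[name].append(os.path.join(root, name))
    return {name: sorted(paths) for name, paths in by_name.items() if len(paths) > 1}


# Illustrative tree: two rule files share a name, one is unique
with tempfile.TemporaryDirectory() as rules_dir:
    for rel in ('joker/payload.yar', 'flubot/payload.yar', 'common.yar'):
        path = os.path.join(rules_dir, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, 'w').close()
    duplicates = find_duplicate_names(rules_dir, '.yar')

print(sorted(duplicates))  # ['payload.yar']
```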

### Core

This is the core of the lab: a few functions that scan each APK file and its contents with the included YARA rules.

In [5]:
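def rules_scanner(file):
    """
    Scan a file using the YARA rules in the /rules folder (including subfolders).
    """
    results = {}
    for match in rules.match(file):
        strings_list = []
        for data in match.strings:
            if isinstance(data, tuple):
                # yara-python < 4.3.0: each entry is a tuple (Location, Identifier, String)
                matched = [data[2]]
            else:
                # yara-python >= 4.3.0: each entry is a StringMatch object holding its instances
                matched = [instance.matched_data for instance in data.instances]
            for raw in matched:
                # Replace undecodable bytes instead of raising on binary matches
                string = raw.decode('utf-8', errors='replace')
                if string not in strings_list:
                    strings_list.append(string)
        results[match.rule] = strings_list

    return results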


def analyse_files_in_apk(apk_file):
    """
    Analyse all the files in an APK.
    """
    results = {}
    # Extract the APK file into a temporary directory
    with ZipFile(apk_file, 'r') as zipObj:
        zipObj.extractall('tmp')
    # Walk the extracted tree bottom-up, so that directories are empty by the time they are removed
    for root, dirs, files in os.walk('tmp', topdown=False):
        for name in files:
            file_path = os.path.join(root, name)
            # Get results for the current file
            result = rules_scanner(file_path)
            if result:
                results[name] = result
            # Cleanup: remove the file once it has been scanned
            os.remove(file_path)
        for name in dirs:
            os.rmdir(os.path.join(root, name))
    os.rmdir('tmp')

    return results


def analyse_apk(apk_file):
    """
    Analyse an APK file and store the results in a JSON file under the /results folder.
    """
    # Initialise results with file info, digests and VT stats
    md5_digest, sha1_digest, sha256_digest = get_file_digests(apk_file)
    vt_stats = get_virus_total_stats(apk_file, md5_digest.hexdigest())
    results = {'file_name': os.path.basename(apk_file),
               'digests': {'md5': md5_digest.hexdigest(), 'sha1': sha1_digest.hexdigest(), 'sha256': sha256_digest.hexdigest()},
               'vt_stats': vt_stats, 'pithus_report_url': f'https://beta.pithus.org/report/{sha256_digest.hexdigest()}',
               'yara_results': {}}

    # Scan APK and its content
    results['yara_results']['apk'] = rules_scanner(apk_file) or {}
    results['yara_results']['apk_content'] = analyse_files_in_apk(apk_file) or {}

    # Save and print results (create the results folder if it does not exist yet)
    os.makedirs('./results', exist_ok=True)
    with open(f'./results/result_{md5_digest.hexdigest()}.json', 'w') as json_file:
        json.dump(results, json_file)
    print(results)


def main():
    for apk_file in apk_files.values():
        analyse_apk(apk_file)
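
The extract, scan and clean up steps in `analyse_files_in_apk` can also be sketched with `tempfile.TemporaryDirectory`, which removes the extracted files even when a scan raises an exception. This is an alternative under stated assumptions, not the lab's implementation: `scan_archive` and `fake_scanner` are hypothetical names, the scanner is passed in as a callable instead of using the compiled YARA rules, and results are keyed by the path inside the archive instead of the bare file name.

```python
import os
import tempfile
from zipfile import ZipFile


def scan_archive(archive_path, scanner):
    """Extract an archive to a temporary directory and apply `scanner` to every file.

    The temporary directory is deleted when the `with` block exits,
    whether the scan finished normally or raised.
    """
    results = {}
    with tempfile.TemporaryDirectory() as tmp_dir:
        with ZipFile(archive_path) as archive:
            archive.extractall(tmp_dir)
        for root, _dirs, files in os.walk(tmp_dir):
            for name in files:
                file_path = os.path.join(root, name)
                result = scanner(file_path)
                if result:
                    # Key by path inside the archive, with forward slashes on every platform
                    key = os.path.relpath(file_path, tmp_dir).replace(os.sep, '/')
                    results[key] = result
    return results


# Stand-in scanner for the demo: flags files that contain the bytes b'evil'
def fake_scanner(path):
    with open(path, 'rb') as f:
        return {'Demo_Rule': ['evil']} if b'evil' in f.read() else {}


with tempfile.TemporaryDirectory() as work_dir:
    apk_path = os.path.join(work_dir, 'sample.apk')
    with ZipFile(apk_path, 'w') as archive:
        archive.writestr('assets/payload', b'some evil bytes')
        archive.writestr('res/clean.txt', b'nothing here')
    found = scan_archive(apk_path, fake_scanner)

print(found)  # {'assets/payload': {'Demo_Rule': ['evil']}}
```

Passing `rules_scanner` as the `scanner` argument would reproduce the lab's behaviour, with the difference that two files sharing a name in different folders of the APK keep separate entries.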

## Usage

The script can be run by calling the `main()` function. The results are displayed below and stored as JSON files in the `results` folder, one per APK, with the following name format: `results/result_<apk_md5_digest>.json`.

In [6]:
main()

{'file_name': 'Battery Charging Animation Bubble Effects.apk', 'digests': {'md5': 'f47b1ccd4d1ecee1f71f301b10f8ae9a', 'sha1': 'dbed4917fa7cc0e3d9df6a21a222c3156f1394b9', 'sha256': 'd3810acc806c4123b6b41ff85e29bf8b5b823be3e4f4ce5a8d76cff3dfd92e4f'}, 'vt_stats': {'report_url': 'https://www.virustotal.com/gui/file/f47b1ccd4d1ecee1f71f301b10f8ae9a', 'suggested_threat_label': 'trojan.joker/jocker', 'last_analysis_stats': {'harmless': 0, 'type-unsupported': 10, 'suspicious': 0, 'confirmed-timeout': 0, 'timeout': 0, 'failure': 0, 'malicious': 25, 'undetected': 39}}, 'pithus_report_url': 'https://beta.pithus.org/report/d3810acc806c4123b6b41ff85e29bf8b5b823be3e4f4ce5a8d76cff3dfd92e4f', 'yara_results': {'apk': {'Joker_Payload2': ['assets/62vrr5qqq6']}, 'apk_content': {'62vrr5qqq6': {'Joker_Payload2': ['MF8zXzEgbGlrZSBNYWMgT1MgWCkgQXBwbGVXZWJLaXQvNjAzLjEuMzAgKEtIVE1MLCBs']}}}}
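
To get a quick overview from a results dictionary like the one above, the matched rule names can be collected from both the `apk` and `apk_content` entries. A minimal sketch follows, assuming the structure shown in the output; the helper name `matched_rules` is hypothetical and the sample is a trimmed copy of that output with the matched string shortened.

```python
def matched_rules(results):
    """Return the sorted, de-duplicated rule names found in a results dictionary."""
    yara_results = results.get('yara_results', {})
    rules_found = set(yara_results.get('apk', {}))
    for file_matches in yara_results.get('apk_content', {}).values():
        rules_found.update(file_matches)
    return sorted(rules_found)


# Trimmed copy of the output above (matched string shortened for readability)
sample = {
    'file_name': 'Battery Charging Animation Bubble Effects.apk',
    'yara_results': {
        'apk': {'Joker_Payload2': ['assets/62vrr5qqq6']},
        'apk_content': {'62vrr5qqq6': {'Joker_Payload2': ['...']}},
    },
}
print(matched_rules(sample))  # ['Joker_Payload2']
```

The same function can be applied to a stored result by loading the file with `json.load` first.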
