# COVID-19 Indicator feed notebook

Author: @JohnLaTwC

This notebook uses the COVID-19 threat indicator feed from Microsoft. It emits a YARA rule that matches the file hashes listed in that feed. You can add this rule as a ruleset on VirusTotal (VT) to get hunting notifications.

It can also search VirusTotal for files that match the indicators.

References:
* https://www.microsoft.com/security/blog/2020/05/14/open-sourcing-covid-threat-intelligence/
* https://aka.ms/msft-covid19-Indicators

In [1]:
feedurl = 'https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/Sample%20Data/Feeds/Microsoft.Covid19.Indicators.csv'

In [2]:
def get_feed(url):
    import requests
    import pandas as pd
    from io import StringIO
    results = None
    r = requests.get(url)
    if r.status_code == 404:
        print('feed does not exist')
    else:
        # Raise on any other HTTP error instead of parsing an error page as CSV.
        r.raise_for_status()
        csv = StringIO(r.content.decode())
        # names= supplies the column names, so the first CSV row is read as data.
        results = pd.read_csv(csv, sep=',', names=["Timestamp", "sha256", "IndicatorType", "TLP", "Product", "ThreatType", "Description"])
    return results

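def create_rule(hsh):
    # One YARA condition clause: true when the SHA-256 of the whole file equals hsh.
    return f'hash.sha256(0, filesize) == "{hsh}"\n'

def print_vt_rule(df, feedurl):
    import datetime
    timestamp = datetime.datetime.now()

    # Build one clause per indicator and join them with "or", indented to fit the template.
    clauses = map(create_rule, df['sha256'].tolist())
    formatted_lst = '            or '.join(clauses)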
    s = f'''
import "hash"
rule covid19_indicator_match
{{
    meta:
        feed = "{feedurl}"
        created_on = "{timestamp}"
        total_indicators = {len(df)}
    condition:
        filesize < 1MB and
        (
            {formatted_lst}
        )
}}

    ''' 
    print(s)
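
The feed values are interpolated into the rule text as-is, so a malformed or missing value in the `sha256` column would end up inside the rule. The sketch below shows one way to filter the list before building the rule. `is_sha256` and `clean_hashes` are hypothetical helper names that are not part of the original notebook. The values are lowercased because YARA's `hash` module returns lowercase hex strings.

```python
import re

# Hypothetical helpers, not part of the original notebook.
_SHA256_RE = re.compile(r'[0-9a-f]{64}')

def is_sha256(value):
    """Return True if value is a string that looks like a hex SHA-256 digest."""
    if not isinstance(value, str):
        return False  # e.g. NaN from an empty CSV cell
    return _SHA256_RE.fullmatch(value.strip().lower()) is not None

def clean_hashes(values):
    """Lowercase the hashes, drop malformed entries and duplicates, keep order."""
    seen = set()
    cleaned = []
    for v in values:
        if not is_sha256(v):
            continue
        h = v.strip().lower()
        if h not in seen:
            seen.add(h)
            cleaned.append(h)
    return cleaned
```

With the notebook's dataframe this could be used as `clean_hashes(df['sha256'].tolist())` before the clauses are built.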

## Fetch the indicators and list some of them

In [3]:
import pandas as pd

df = get_feed(feedurl)
i = 5
print(f"Feed contains {len(df)} indicators. Listing {i} example indicators:")

df['Timestamp'] = pd.to_datetime(df['Timestamp'])
df.sort_values(by='Timestamp', ascending=True).tail(i)

Feed contains 309 indicators. Listing 5 example indicators:


| | Timestamp | sha256 | IndicatorType | TLP | Product | ThreatType | Description |
|---|---|---|---|---|---|---|---|
| 218 | 2020-05-13 02:00:00 | 977820df24af8c83c1377290457cf7c0bab4f5634b1172... | sha256 | white | Azure Sentinel | Malware | Microsoft COVID-19 Threat Indicators |
| 219 | 2020-05-13 02:23:41 | 66ec588584fa4145e846df879dd3385a14f7a445b11299... | sha256 | white | Azure Sentinel | Malware | Microsoft COVID-19 Threat Indicators |
| 220 | 2020-05-13 02:35:44 | 262329dbb9e927ceaa1fb2eda4dd5ecd6d53f79664f412... | sha256 | white | Azure Sentinel | Malware | Microsoft COVID-19 Threat Indicators |
| 1 | 2020-05-14 05:41:38 | c4a6245679676f18ab309dc7ca39ad7e70806bac16dd31... | sha256 | white | Azure Sentinel | Malware | Microsoft COVID-19 Threat Indicators |
| 304 | 2020-05-14 08:50:42 | ccb9e262904aa7dd6cb3a529a2b77bebcd1d9dc51c9f2c... | sha256 | white | Azure Sentinel | Malware | Microsoft COVID-19 Threat Indicators |


## Construct a YARA rule that matches the indicators

In [4]:
print_vt_rule(df, feedurl)


import "hash"
rule covid19_indicator_match
{
    meta:
        feed = "https://raw.githubusercontent.com/Azure/Azure-Sentinel/master/Sample%20Data/Feeds/Microsoft.Covid19.Indicators.csv"
        created_on = "2020-05-18 11:04:57.103130"
        total_indicators = 309
    condition:
        filesize < 1MB and
        (
            hash.sha256(0, filesize) == "d6332d4b5b5984ebb39685164428ad0f1f1b04e82b14cd5d773bbdd0d4ad05dc"
            or hash.sha256(0, filesize) == "c4a6245679676f18ab309dc7ca39ad7e70806bac16dd31af1f769bca84044f47"
            or hash.sha256(0, filesize) == "1789493dcb90022e86b162654f82c15e83553689e0810c2758ed49ab1b8e5611"
            or hash.sha256(0, filesize) == "fa9463f9970d2b83ebe1eb734e8f04e540253e9be25761cdd600a30da22fffe5"
            or hash.sha256(0, filesize) == "5b37cc85fd190a6b4726ea57f2588b5a74acc2c51e2917363c226b73ac79118f"
            or hash.sha256(0, filesize) == "5e3e874b7f87124567d4716a6f0e8d696bae261550b399649a9fb3a85f2e0d5a"
            or hash.sh
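
`print_vt_rule` prints the rule instead of returning it. To save the rule to a file, one option is to capture standard output while the function runs. This is a minimal sketch: `capture_printed` is a hypothetical helper and the file name in the usage comment is an assumption.

```python
import contextlib
import io

def capture_printed(func, *args, **kwargs):
    """Call func and return everything it printed to stdout as one string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()

# Usage with the notebook's own function (commented out so the sketch runs standalone):
# rule_text = capture_printed(print_vt_rule, df, feedurl)
# with open('covid19_indicators.yar', 'w') as fh:  # file name is an assumption
#     fh.write(rule_text)
```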

## Search VirusTotal for any matching indicators (requires API key)

In [5]:
# if you want to query VT, define your API KEY
VT_APIKEY = '<insert api key here>'
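
Hardcoding the key in a cell makes it easy to leak when the notebook is shared. As an alternative, the key could be read from an environment variable. This sketch assumes a variable named `VT_APIKEY`. Both that name and the `load_api_key` helper are assumptions, not part of the original notebook. It falls back to the same placeholder, so a check for `'<'` in the key still detects an unset key.

```python
import os

PLACEHOLDER = '<insert api key here>'

def load_api_key(env_var='VT_APIKEY'):
    """Return the key from the environment, or the placeholder if it is unset."""
    return os.environ.get(env_var, PLACEHOLDER)

VT_APIKEY = load_api_key()
```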

In [6]:
def vt_search(qry):
    import requests
    import json

    # Skip the lookup while the placeholder key is still in place.
    if '<' in VT_APIKEY:
        print('Define your API key in VT_APIKEY variable')
        return []

    lst = []
    url = "https://www.virustotal.com/api/v3/intelligence/search"
    params = {'descriptors_only': True,
              'query': qry,
              'limit': 200}
    headers = {'x-apikey': VT_APIKEY,
               'Accept': 'application/json'}

    r = requests.get(url, params=params, headers=headers)
    if r.status_code == 404:
        print('no results on VT for query: %s' % qry)
    else:
        results = json.loads(r.content)
        # Collect the id of every returned object; a missing or empty 'data' yields no ids.
        for x in results.get('data', []):
            lst.append(x['id'])
    # De-duplicate before returning.
    return list(set(lst))

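def query_for_indicators(lst):
    if not lst:
        return []
    # Split the indicators into batches of roughly 25 hashes per query.
    # max(1, ...) keeps one batch when there are fewer than 25 indicators;
    # without it the list of batches would be empty and nothing would be searched.
    partitions = max(1, len(lst) // 25)
    subs = [lst[i::partitions] for i in range(partitions)]
    r = []
    for sub in subs:
        # One hash per line in the query string.
        r = r + vt_search('\n'.join(sub))
    return r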
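
`query_for_indicators` splits the hash list with strided slices (`lst[i::partitions]`), so the batch size depends on the list length and can be slightly more than 25. If a fixed upper bound per query is preferred, contiguous batching is an alternative. This is a sketch only: `chunked` is a hypothetical helper, and the size of 25 mirrors the constant used above rather than any documented limit.

```python
def chunked(items, size=25):
    """Return consecutive slices of items, each holding at most size elements."""
    if size < 1:
        raise ValueError('size must be at least 1')
    return [items[i:i + size] for i in range(0, len(items), size)]
```

An empty list yields no batches, and a list shorter than `size` yields a single batch.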

In [8]:
lst = df['sha256'].tolist()
matches = query_for_indicators(lst)
print(len(matches))

33
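
The run above reports 33 matches for a feed of 309 indicators. To see which indicators were found and which were not, the two lists can be compared. The helper name `split_by_match` is hypothetical, and the sketch assumes the returned ids are SHA-256 strings comparable to the feed values after lowercasing.

```python
def split_by_match(indicators, matches):
    """Split indicators into (found, missing) based on the ids in matches."""
    matched = {m.lower() for m in matches}
    found = [h for h in indicators if h.lower() in matched]
    missing = [h for h in indicators if h.lower() not in matched]
    return found, missing

# Usage with the notebook's variables (commented out so the sketch runs standalone):
# found, missing = split_by_match(lst, matches)
# print(len(found), len(missing))
```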


In [9]:
matches

['bbb0f2855d1444cae835700f58acb51b6a6fd2f48046e94850982753cb4a7268',
 'd6332d4b5b5984ebb39685164428ad0f1f1b04e82b14cd5d773bbdd0d4ad05dc',
 'd42a1fd60954189c93509bc921f188b7583a7c5849c7f2436922ad4babd5b374',
 '866aad642771c000c61eb3f4990432f63bbaec2eb194f80ef4f45967b6d7ff32',
 'e84b6245f19f01b8b9467e14501d114a19f5dbbd26f92f7823ec7f02cff6dda5',
 'c4a6245679676f18ab309dc7ca39ad7e70806bac16dd31af1f769bca84044f47',
 '3f6aca2b590e7fa767d8e85bf814688db1f449a7afac10994221b6a82c85a73a',
 'bfbada8b0ecc1f711dd4869c7fc97f658b88dc497a415a2e912eff9245fe9c9b',
 '1789493dcb90022e86b162654f82c15e83553689e0810c2758ed49ab1b8e5611',
 '1078d27f2873ddec4203062b5eca87a4b63917f1f970b3878fcfb31ecc16869c',
 '72566077693ec85bd5867e70d0c7f118d592e93498a152a51ea55fed4fdb2684',
 '76888b745714b1d0db8cd883eaac756c560b052462cae240c3917c441c07d611',
 '5b37cc85fd190a6b4726ea57f2588b5a74acc2c51e2917363c226b73ac79118f',
 'ccb9e262904aa7dd6cb3a529a2b77bebcd1d9dc51c9f2c1d8ededfc1a3c2040d',
 '5e3e874b7f87124567d4716a6f0e8d69