<a href="https://colab.research.google.com/github/ericyoc/gen_dga_regex_and_yara_rules/blob/main/gen_dga_types_regex_and_yara_rules_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# This program demonstrates various Domain Generation Algorithm (DGA) techniques used by malware.
# It includes implementations of 15 different DGA types:
# 1. Zodiac-based DGA
# 2. Time-based DGA
# 3. Seed-based DGA
# 4. Dictionary-based DGA
# 5. Pseudorandom Number Generator (PRNG) based DGA
# 6. Arithmetic-based DGA
# 7. Permutation-based DGA
# 8. Fibonacci-based DGA
# 9. Base32/Base64 DGA
# 10. Wordlist-based DGA
# 11. Vowel-Consonant DGA
# 12. Morse Code DGA
# 13. Emoji DGA
# 14. Coordinate-based DGA
# 15. Musical Notes DGA

# Each DGA type generates a set of domain names based on specific patterns, algorithms, or inputs.
# The generated domains are meant to be used for command and control (C&C) communication or data exfiltration by malware.

# The program provides explanations, usefulness, strengths, weaknesses, and deception methods for each DGA type.
# It also generates a set of sample domains using each DGA technique.

# Additionally, the program includes functions to:
# - Generate regular expressions (regexes) based on the generated domains to detect similar DGA patterns.
# - Create YARA rules using the generated regexes for malware detection.

# The main entry point of the program iterates over each DGA type, generates sample domains, and performs analysis.
# It prints the generated domains, regexes, and YARA rules for each DGA type.

# The program also includes a function to summarize the DGA functions in a tabular format,
# providing an overview of each DGA's inputs, outputs, and a step-by-step summary of its functionality.

# Overall, this program serves as a comprehensive demonstration and analysis tool for various DGA techniques
# used by malware, aiding in understanding, detection, and mitigation of DGA-based threats.

In [2]:
import random
import string
import re
import array
import argparse
import uuid
from datetime import datetime
import hashlib
import itertools
import base64
import textwrap
from datetime import timedelta
from collections import Counter
from prettytable import PrettyTable

In [3]:
# Define wordlist for Wordlist-based DGA
wordlist = ["apple", "banana", "cherry", "date", "elderberry", "fig", "grape", "honeydew", "kiwi", "lemon", "mango"]

# Define emojis for Emoji DGA
emojis = ["😀", "😂", "😍", "🤔", "🙌", "👍", "🎉", "🚀", "💡", "🌍"]

# Define musical notes and octaves for Musical Notes DGA
musical_notes = ["A", "B", "C", "D", "E", "F", "G"]
musical_octaves = ["1", "2", "3", "4", "5"]

# Define TLD list
TLD_LIST = ['.com', '.org', '.net', '.edu', '.gov', '.info']

min_length = 7
max_length = 10

In [4]:
def generate_random_seed(length):
    characters = string.ascii_letters + string.digits + string.punctuation
    seed = ''.join(random.choices(characters, k=length))
    return seed

zodiac-based DGA

In [5]:
def zodiac_sign_dga(seed, domain_count, min_length, max_length, char_count=15):
    daysl = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 26, 27, 28]
    monthl = ["january", "february", "march", "april", "may", "june", "july", "august", "september", "october", "november", "december"]
    tlds = ["com", "net", "org", "info", "biz"]

    def zodiac_sign(day, month):
        # checks month and date within the valid range
        # of a specified zodiac
        if month == 'december':
            astro_sign = 'Sagittarius' if (day < 22) else 'capricorn'
        elif month == 'january':
            astro_sign = 'Capricorn' if (day < 20) else 'aquarius'
        elif month == 'february':
            astro_sign = 'Aquarius' if (day < 19) else 'pisces'
        elif month == 'march':
            astro_sign = 'Pisces' if (day < 21) else 'aries'
        elif month == 'april':
            astro_sign = 'Aries' if (day < 20) else 'taurus'
        elif month == 'may':
            astro_sign = 'Taurus' if (day < 21) else 'gemini'
        elif month == 'june':
            astro_sign = 'Gemini' if (day < 21) else 'cancer'
        elif month == 'july':
            astro_sign = 'Cancer' if (day < 23) else 'leo'
        elif month == 'august':
            astro_sign = 'Leo' if (day < 23) else 'virgo'
        elif month == 'september':
            astro_sign = 'Virgo' if (day < 23) else 'libra'
        elif month == 'october':
            astro_sign = 'Libra' if (day < 23) else 'scorpio'
        elif month == 'november':
            astro_sign = 'scorpio' if (day < 22) else 'sagittarius'
        return str(astro_sign)

    def generate_domain(char_count, day, month):
        """Returns a random string of length string_length."""
        random2 = str(uuid.uuid5(uuid.NAMESPACE_DNS, zodiac_sign(day, month) + "." + random.choice(tlds)))  # Convert UUID format to a Python string.
        random3 = random2.lower()  # Make all characters lowercase.
        random4 = random3.replace("-", "")  # Remove the UUID '-'.
        random5 = str(hashlib.sha256(random4.encode('utf-8')).hexdigest())
        domain = random5[0:char_count]
        chars = string.ascii_lowercase + string.digits
        new_domain = re.sub(r"[aeiou]", lambda x: random.choice(chars), domain)
        return new_domain  # Return the random string.

    domains = []
    random.seed(seed)
    for _ in range(domain_count):
        domain_length = random.randint(min_length, max_length)
        day = random.choice(daysl)
        month = random.choice(monthl)
        domain = generate_domain(domain_length, day, month)
        tld = random.choice(tlds)
        domains.append(f"www.{domain}.{tld}")

    return domains

Time-based DGA:

In [6]:
def time_based_dga(count, min_length, max_length):
    domains = []
    tlds = [".com", ".net", ".org", ".info", ".biz"]
    # Calculate a random number of days to add or subtract
    num_days = random.randint(-30, 30)  # Range of -30 to +30 days
    for _ in range(count):
        # Get a date that is different from the current date
        date_offset = datetime.now() + timedelta(days=num_days)
        formatted_date = date_offset.strftime("%Y%m%d%H%M%S")
        # Generate a random length for the domain name
        domain_length = random.randint(min_length, max_length)
        # Truncate or pad the formatted date to match the desired length
        if len(formatted_date) > domain_length:
            formatted_date = formatted_date[:domain_length]
        else:
            formatted_date = formatted_date.zfill(domain_length)
        tld = random.choice(tlds)
        domain = f"www.{formatted_date}{tld}"
        domains.append(domain)
    return domains

Seed-based DGA

In [7]:
def seed_based_dga(seed, count, min_length, max_length):
    domains = []
    tlds = [".com", ".net", ".org", ".info", ".biz"]
    for _ in range(count):
        hash_value = hashlib.md5(seed.encode()).hexdigest()
        # Generate a random length for the domain name
        domain_length = random.randint(min_length, max_length)
        # Truncate or pad the hash value to match the desired length
        if len(hash_value) > domain_length:
            hash_value = hash_value[:domain_length]
        else:
            hash_value = hash_value.zfill(domain_length)
        tld = random.choice(tlds)
        domain = f"www.{hash_value}{tld}"
        domains.append(domain)
        # Update the seed for the next iteration
        seed = hash_value
    return domains

Dictionary-based DGA

In [8]:
def dictionary_based_dga(count, min_length, max_length):
    domains = []
    dictionary = ["blue", "cat", "fish", "red", "dog", "bird"]
    tlds = [".com", ".net", ".org", ".info", ".biz"]

    for _ in range(count):
        # Generate a random length for the domain name
        domain_length = random.randint(min_length, max_length)
        domain_parts = []
        current_length = 0

        # Select words from the dictionary until the desired length is reached
        while current_length < domain_length:
            word = random.choice(dictionary)
            # Check if adding the selected word exceeds the desired length
            if current_length + len(word) <= domain_length:
                domain_parts.append(word)
                current_length += len(word)
            else:
                break

        # Combine the selected words to form the domain name
        domain_name = "".join(domain_parts)

        # Select a random TLD
        tld = random.choice(tlds)

        # Combine the domain name and TLD to create the final domain
        domain = f"www.{domain_name}{tld}"

        # Append the generated domain to the list of domains
        domains.append(domain)

    return domains  # Return the list of generated domains

Pseudorandom Number Generator (PRNG) based DGA

In [9]:
def prng_based_dga(seed, count, min_length, max_length):
    domains = []
    random.seed(seed)
    tlds = [".com", ".net", ".org", ".info", ".biz"]

    for _ in range(count):
        # Generate a random length for the domain name
        length = random.randint(min_length, max_length)

        # Define the character set for the domain name
        characters = string.ascii_lowercase + string.digits

        # Generate the domain name by selecting random characters
        domain_name = ''.join(random.choice(characters) for _ in range(length))

        # Select a random TLD
        tld = random.choice(tlds)

        # Combine the domain name and TLD to create the final domain
        domain = f"www.{domain_name}{tld}"

        # Append the generated domain to the list of domains
        domains.append(domain)

    # Return the list of generated domains
    return domains

Arithmetic-based DGA

In [10]:
def arithmetic_based_dga(seed, count, min_length, max_length):
    domains = []
    tlds = [".com", ".net", ".org", ".info", ".biz"]

    for _ in range(count):
        # Generate a random number to add to the seed
        random_number = random.randint(100, 999)

        # Calculate the result by adding the seed and the random number
        result = seed + random_number

        # Convert the result to a string
        domain_name = str(result)

        # Generate a random length for the domain name
        domain_length = random.randint(min_length, max_length)

        # Truncate or pad the domain name to match the desired length
        if len(domain_name) > domain_length:
            domain_name = domain_name[:domain_length]
        else:
            domain_name = domain_name.zfill(domain_length)

        # Select a random TLD
        tld = random.choice(tlds)

        # Combine the domain name and TLD to create the final domain
        domain = f"www.{domain_name}{tld}"

        # Append the generated domain to the list of domains
        domains.append(domain)

    return domains  # Return the list of generated domains

Permutation-based DGA

In [11]:
def permutation_based_dga(base_domain, count, min_length, max_length):
    domains = []
    tlds = [".com", ".net", ".org", ".info", ".biz"]

    # Extract the characters from the base domain
    characters = list(base_domain.split(".")[0])

    # Generate all possible permutations of the characters
    permutations = list(itertools.permutations(characters))

    # Shuffle the permutations randomly
    random.shuffle(permutations)

    for i in range(count):
        # Join the characters of the current permutation to form the domain name
        domain_name = ''.join(permutations[i])

        # Generate a random length for the domain name
        domain_length = random.randint(min_length, max_length)

        # Truncate or pad the domain name to match the desired length
        if len(domain_name) > domain_length:
            domain_name = domain_name[:domain_length]
        else:
            domain_name = domain_name.ljust(domain_length, random.choice(characters))

        # Select a random TLD
        tld = random.choice(tlds)

        # Combine the domain name and TLD to create the final domain
        domain = f"www.{domain_name}{tld}"

        # Append the generated domain to the list of domains
        domains.append(domain)

    # Return the list of generated domains
    return domains

Fibonacci-based DGA

In [12]:
def fibonacci_based_dga(count, min_length, max_length):
    domains = []
    a, b = 0, 1
    characters = "abcdefghijklmnopqrstuvwxyz0123456789"
    tlds = [".com", ".net", ".org", ".info", ".biz"]

    for _ in range(count):
        # Get the current Fibonacci number
        index = a

        # Generate the domain name using characters at specific indices
        domain_name = characters[index % len(characters)] + characters[(index+1) % len(characters)] + characters[(index+2) % len(characters)]

        # Generate a random length for the domain name
        domain_length = random.randint(min_length, max_length)

        # Truncate or pad the domain name to match the desired length
        if len(domain_name) > domain_length:
            domain_name = domain_name[:domain_length]
        else:
            while len(domain_name) < domain_length:
                index += 1
                domain_name += characters[index % len(characters)]

        # Select a random TLD
        tld = random.choice(tlds)

        # Combine the domain name and TLD to create the final domain
        domain = f"www.{domain_name}{tld}"

        # Append the generated domain to the list of domains
        domains.append(domain)

        # Update the Fibonacci sequence
        a, b = b, a + b

    # Return the list of generated domains
    return domains

Base32/Base64 DGA

In [13]:
def base32_base64_dga(seed, count, min_length, max_length, encoding_type):
    domains = []

    for _ in range(count):
        # Encode the seed using the selected encoding type (base32 or base64)
        if encoding_type == "base32":
            encoded_seed = base64.b32encode(seed.encode()).decode()
        elif encoding_type == "base64":
            encoded_seed = base64.b64encode(seed.encode()).decode()
        else:
            raise ValueError("Invalid encoding type. Choose 'base32' or 'base64'.")

        # Generate a random number between 1000 and 9999
        random_number = random.randint(1000, 9999)

        # Select a random TLD
        tld = random.choice(TLD_LIST)

        # Generate the domain name by combining the encoded seed, random number, and TLD
        domain_name = f"{encoded_seed[:8]}{random_number}"

        # Generate a random length for the domain name
        domain_length = random.randint(min_length, max_length)

        # Truncate or pad the domain name to match the desired length
        if len(domain_name) > domain_length:
            domain_name = domain_name[:domain_length]
        else:
            padding_length = domain_length - len(domain_name)
            padding = ''.join(random.choices(string.ascii_letters + string.digits, k=padding_length))
            domain_name += padding

        # Combine the domain name and TLD to create the final domain
        domain = f"www.{domain_name}{tld}"

        # Append the generated domain to the list of domains
        domains.append(domain)

        # Reverse the seed for the next iteration
        seed = seed[::-1]

    # Return the list of generated domains
    return domains

Wordlist-based DGA

In [14]:
def wordlist_dga(count, min_length, max_length):
    domains = []

    for _ in range(count):
        domain_name = ""

        # Generate words until the desired length is reached
        while len(domain_name) < min_length:
            word = random.choice(wordlist)
            if len(domain_name) + len(word) <= max_length:
                domain_name += word

        # Select a random TLD
        tld = random.choice(TLD_LIST)

        # Combine the domain name and TLD to create the final domain
        domain = f"www.{domain_name}{tld}"

        # Append the generated domain to the list of domains
        domains.append(domain)

    # Return the list of generated domains
    return domains

Vowel-Consonant DGA

In [15]:
def vowel_consonant_dga(count, min_length, max_length):
    domains = []
    vowels = "aeiou"
    consonants = "".join(set(string.ascii_lowercase) - set(vowels))

    for _ in range(count):
        # Generate a random length for the domain name
        length = random.randint(min_length, max_length)

        # Generate the domain name by alternating between vowels and consonants
        domain_name = ''.join(
            random.choice(vowels if i % 2 == 0 else consonants)
            for i in range(length)
        )

        # Select a random TLD
        tld = random.choice(TLD_LIST)

        # Combine the domain name and TLD to create the final domain
        domain = f"www.{domain_name}{tld}"

        # Append the generated domain to the list of domains
        domains.append(domain)

    # Return the list of generated domains
    return domains

Morse Code DGA

In [16]:
def morse_code_dga(count, min_length, max_length):
    domains = []
    morse_code = {
        'a': '.-', 'b': '-...', 'c': '-.-.', 'd': '-..', 'e': '.', 'f': '..-.',
        'g': '--.', 'h': '....', 'i': '..', 'j': '.---', 'k': '-.-', 'l': '.-..',
        'm': '--', 'n': '-.', 'o': '---', 'p': '.--.', 'q': '--.-', 'r': '.-.',
        's': '...', 't': '-', 'u': '..-', 'v': '...-', 'w': '.--', 'x': '-..-',
        'y': '-.--', 'z': '--..', '0': '-----', '1': '.----', '2': '..---',
        '3': '...--', '4': '....-', '5': '.....', '6': '-....', '7': '--...',
        '8': '---..', '9': '----.'
    }

    for _ in range(count):
        # Generate a random length for the domain name
        length = random.randint(min_length, max_length)

        # Generate the domain name using Morse code sequences
        domain_name = ''.join(
            random.choice(list(morse_code.values()))
            for _ in range(length)
        )

        # Select a random TLD
        tld = random.choice(TLD_LIST)

        # Combine the domain name and TLD to create the final domain
        domain = f"www.{domain_name}{tld}"

        # Append the generated domain to the list of domains
        domains.append(domain)

    # Return the list of generated domains
    return domains

Emoji DGA

In [17]:
def emoji_dga(count, min_length, max_length):
    domains = []

    for _ in range(count):
        # Generate a random length for the domain name
        length = random.randint(min_length, max_length)

        # Generate the domain name using emojis
        domain_name = ''.join(
            random.choice(emojis)
            for _ in range(length)
        )

        # Select a random TLD
        tld = random.choice(TLD_LIST)

        # Combine the domain name and TLD to create the final domain
        domain = f"www.{domain_name}{tld}"

        # Append the generated domain to the list of domains
        domains.append(domain)

    # Return the list of generated domains
    return domains

Coordinate-based DGA

In [18]:
def coordinate_dga(count, min_length, max_length):
    domains = []
    for _ in range(count):
        latitude = random.uniform(-90, 90)
        longitude = random.uniform(-180, 180)

        # Generate the domain name using latitude and longitude
        domain_name = f"{abs(latitude):.4f}{abs(longitude):.4f}".replace(".", "")

        # Check if the domain name length falls within the specified range
        while len(domain_name) < min_length or len(domain_name) > max_length:
            # If not, generate new latitude and longitude
            latitude = random.uniform(-90, 90)
            longitude = random.uniform(-180, 180)
            domain_name = f"{abs(latitude):.4f}{abs(longitude):.4f}".replace(".", "")

        tld = random.choice(TLD_LIST)
        domain = f"www.{domain_name}{tld}"
        domains.append(domain)

    return domains

Musical Notes DGA

In [19]:
def musical_notes_dga(count, min_length, max_length):
    domains = []

    for _ in range(count):
        # Generate a random length for the domain name
        domain_length = random.randint(min_length, max_length)

        # Initialize an empty string for the domain name
        domain_name = ''

        # Generate the domain name using musical notes and octaves
        for _ in range(domain_length):
            domain_name += random.choice(musical_notes) + random.choice(musical_octaves)

        # Select a random TLD
        tld = random.choice(TLD_LIST)

        # Construct the full domain
        domain = f"www.{domain_name}{tld}"

        # Append the generated domain to the list
        domains.append(domain)

    return domains

In [20]:
DGA_COUNT = 10

TLD_LIST = ['.com', '.org', '.net', '.edu', '.gov', '.info']

DGA_TYPES = {
    "Zodiac-based DGA Generation": lambda DGA_COUNT: ["www." + "".join(random.choices(string.ascii_lowercase, k=8)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Time-based DGA Generation": lambda DGA_COUNT: ["www." + str(random.randint(1000, 9999)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Seed-based DGA Generation": lambda seed, DGA_COUNT: ["www." + seed + str(random.randint(1000, 9999)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Dictionary-based DGA Generation": lambda DGA_COUNT: ["www." + str(random.randint(1000, 9999)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Pseudorandom Number Generator (PRNG) based DGA Generation": lambda seed, DGA_COUNT: ["www." + str(random.randint(1000, 9999)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Arithmetic-based DGA Generation": lambda base, DGA_COUNT: ["www." + str(base + random.randint(1, 100)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Permutation-based DGA Generation": lambda seed, DGA_COUNT: ["www." + "".join(random.sample(seed, len(seed))) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Fibonacci-based Generator DGA Generation": lambda DGA_COUNT: ["www." + str(random.randint(1000, 9999)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Base32/Base64 DGA Generation": lambda seed, DGA_COUNT: ["www." + seed[:4] + str(random.randint(1000, 9999)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Wordlist-based DGA Generation": lambda DGA_COUNT: ["www." + str(random.randint(1000, 9999)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Vowel-Consonant DGA Generation": lambda DGA_COUNT: ["www." + "".join(random.choices(string.ascii_lowercase, k=8)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Morse Code DGA Generation": lambda DGA_COUNT: ["www." + str(random.randint(1000, 9999)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Emoji DGA Generation": lambda DGA_COUNT: ["www." + str(random.randint(1000, 9999)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Coordinate-based DGA": lambda DGA_COUNT: ["www." + str(random.randint(1000, 9999)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
    "Musical Notes DGA": lambda DGA_COUNT: ["www." + str(random.randint(1000, 9999)) + random.choice(TLD_LIST) for _ in range(DGA_COUNT)],
}


In [21]:
def summarize_dga_functions():
    table = PrettyTable()
    table.field_names = ["DGA Name", "Inputs", "Outputs", "Summary with Steps"]
    table.align["Summary with Steps"] = "l"  # Left-align the summary column

    # Zodiac-based DGA
    table.add_row(["zodiac_sign_dga(seed, domain_count, min_length, max_length, char_count=15)", "seed (int), domain_count (int), min_length (int), max_length (int), char_count (int, optional)", "List of domains (str)", "1. Define daysl, monthl, and tlds lists\n2. Define zodiac_sign and generate_domain helper functions\n3. Seed the random number generator with the provided seed\n4. Loop domain_count times:\n    a. Generate a random domain length between min_length and max_length\n    b. Choose a random day from daysl and a random month from monthl\n    c. Call generate_domain with day, month, and char_count to get the domain name\n    d. Choose a random TLD from tlds\n    e. Construct the full domain with 'www.' prefix, domain name, and TLD\n    f. Append the constructed domain to the domains list\n5. Return the list of generated domains"])

    # Time-based DGA
    table.add_row(["time_based_dga(count, min_length, max_length)", "count (int), min_length (int), max_length (int)", "List of domains (str)", "1. Calculate a random number of days to add or subtract from current date\n2. Generate a random length between min_length and max_length\n3. Format the offset date and truncate/pad to match the length\n4. Construct domain with 'www.' prefix, formatted date, and TLD\n5. Repeat for count"])

    # Seed-based DGA
    table.add_row(["seed_based_dga(seed, count, min_length, max_length)", "seed (str), count (int), min_length (int), max_length (int)", "List of domains (str)", "1. Generate a random length between min_length and max_length\n2. Hash the seed using MD5 and truncate/pad to match the length\n3. Construct domain with 'www.' prefix, hashed seed, and TLD\n4. Update the seed with the hashed value for next iteration\n5. Repeat for count"])

    # Dictionary-based DGA
    table.add_row(["dictionary_based_dga(count, min_length, max_length)", "count (int), min_length (int), max_length (int)", "List of domains (str)", "1. Generate a random length between min_length and max_length\n2. Select words from the dictionary until the desired length is reached\n3. Combine the selected words to form the domain name\n4. Construct domain with 'www.' prefix, domain name, and TLD\n5. Repeat for count"])

    # PRNG-based DGA
    table.add_row(["prng_based_dga(seed, count, min_length, max_length)", "seed (int), count (int), min_length (int), max_length (int)", "List of domains (str)", "1. Seed the random number generator with the provided seed\n2. Generate a random length between min_length and max_length\n3. Generate the domain name by selecting random characters\n4. Construct domain with 'www.' prefix, domain name, and TLD\n5. Repeat for count"])

    # Arithmetic-based DGA
    table.add_row(["arithmetic_based_dga(seed, count, min_length, max_length)", "seed (int), count (int), min_length (int), max_length (int)", "List of domains (str)", "1. Generate a random number to add to the seed\n2. Calculate the result by adding the seed and random number\n3. Convert the result to a string\n4. Generate a random length between min_length and max_length\n5. Truncate or pad the string to match the length\n6. Construct domain with 'www.' prefix, string, and TLD\n7. Repeat for count"])

    # Permutation-based DGA
    table.add_row(["permutation_based_dga(base_domain, count, min_length, max_length)", "base_domain (str), count (int), min_length (int), max_length (int)", "List of domains (str)", "1. Extract characters from the base domain\n2. Generate all possible permutations of the characters\n3. Shuffle the permutations randomly\n4. Generate a random length between min_length and max_length\n5. Join the characters of the current permutation to form the domain name\n6. Truncate or pad the domain name to match the length\n7. Construct domain with 'www.' prefix, domain name, and TLD\n8. Repeat for count"])

    # Fibonacci-based DGA
    table.add_row(["fibonacci_based_dga(count, min_length, max_length)", "count (int), min_length (int), max_length (int)", "List of domains (str)", "1. Initialize Fibonacci sequence with 0 and 1\n2. Get the current Fibonacci number as an index\n3. Generate the domain name using characters at specific indices\n4. Generate a random length between min_length and max_length\n5. Truncate or pad the domain name to match the length\n6. Construct domain with 'www.' prefix, domain name, and TLD\n7. Update the Fibonacci sequence for next iteration\n8. Repeat for count"])

    # Base32/Base64 DGA
    table.add_row(["base32_base64_dga(seed, count, min_length, max_length, encoding_type)", "seed (str), count (int), min_length (int), max_length (int), encoding_type (str)", "List of domains (str)", "1. Encode the seed using the selected encoding type (base32 or base64)\n2. Generate a random number between 1000 and 9999\n3. Select a random TLD\n4. Generate the domain name by combining the encoded seed, random number, and TLD\n5. Generate a random length between min_length and max_length\n6. Truncate or pad the domain name to match the length\n7. Construct domain with 'www.' prefix, domain name, and TLD\n8. Reverse the seed for the next iteration\n9. Repeat for count"])

    # Wordlist-based DGA
    table.add_row(["wordlist_dga(count, min_length, max_length)", "count (int), min_length (int), max_length (int)", "List of domains (str)", "1. Generate words from the wordlist until the desired length is reached\n2. Select a random TLD\n3. Combine the words and TLD to create the domain\n4. Repeat for count"])

    # Vowel-Consonant DGA
    table.add_row(["vowel_consonant_dga(count, min_length, max_length)", "count (int), min_length (int), max_length (int)", "List of domains (str)", "1. Generate a random length between min_length and max_length\n2. Generate the domain name by alternating between vowels and consonants\n3. Select a random TLD\n4. Construct domain with 'www.' prefix, domain name, and TLD\n5. Repeat for count"])

    # Morse Code DGA
    table.add_row(["morse_code_dga(count, min_length, max_length)", "count (int), min_length (int), max_length (int)", "List of domains (str)", "1. Generate a random length between min_length and max_length\n2. Generate the domain name using Morse code sequences\n3. Select a random TLD\n4. Construct domain with 'www.' prefix, domain name, and TLD\n5. Repeat for count"])

    # Emoji DGA
    table.add_row(["emoji_dga(count, min_length, max_length)", "count (int), min_length (int), max_length (int)", "List of domains (str)", "1. Generate a random length between min_length and max_length\n2. Generate the domain name using emojis\n3. Select a random TLD\n4. Construct domain with 'www.' prefix, domain name, and TLD\n5. Repeat for count"])

    # Coordinate-based DGA
    table.add_row(["coordinate_dga(count, min_length, max_length)", "count (int), min_length (int), max_length (int)", "List of domains (str)", "1. Generate random latitude and longitude values\n2. Convert latitude and longitude to a string\n3. Check if the string length falls within the specified range\n4. If not, generate new latitude and longitude\n5. Select a random TLD\n6. Construct domain with 'www.' prefix, coordinate string, and TLD\n7. Repeat for count"])

    # Musical Notes DGA
    table.add_row(["musical_notes_dga(count, min_length, max_length)", "count (int), min_length (int), max_length (int)", "List of domains (str)", "1. Generate a random length between min_length and max_length\n2. Generate the domain name using musical notes and octaves\n3. Select a random TLD\n4. Construct domain with 'www.' prefix, domain name, and TLD\n5. Repeat for count"])

    print(table)

In [22]:
def dgas_to_regex(dga_list):

  char_counts = Counter()
  # Analyze character frequency across all DGA domains
  for dga in dga_list:
    char_counts.update(dga)

  # Separate letters, numbers, and special characters based on frequency
  letter_chars = set(c for c in char_counts if c.isalpha() and char_counts[c] > 1)
  number_chars = set(c for c in char_counts if c.isdigit() and char_counts[c] > 1)
  special_chars = set(c for c in char_counts if c not in letter_chars.union(number_chars))

  # Build the regular expression with optional parts
  regex_parts = []
  if letter_chars:
    regex_parts.append(f"[{'|'.join(letter_chars)}]+")
  if number_chars:
    regex_parts.append(f"[{'|'.join(number_chars)}]+")
  if special_chars:
    # Allow for some special characters, but not too many to avoid overly permissive regex
    # Use the `union` method to combine sets without concatenation
    special_chars_string = ''.join(special_chars)
    regex_parts.append(f"[^{special_chars_string}]{{0,2}}")

  # Combine parts with optional repetition
  regex = "|".join(regex_parts) + "{1,3}"
  return regex

In [23]:
# Yara rule <rule_name> {
#    meta:
#        description = "<description>"
#    strings:
#        $dga_regex = /<dga_regex>/ nocase
#    condition:
#        $dga_regex
#}

def create_yara_rule(dga_regex, rule_name="dga_domain_detection", description="Detects DGA-generated domain names"):

    rule = f"rule {rule_name} {{\n"
    rule += f"    meta:\n"
    rule += f"        description = \"{description}\"\n"
    rule += f"    strings:\n"
    rule += f"        $dga_regex = /{dga_regex}/ nocase\n"
    rule += f"    condition:\n"
    rule += f"        $dga_regex\n"
    rule += f"}}\n"
    return rule

In [24]:
def create_dataset(dga_functions, num_domains_per_dga):
  dataset = []
  labels = []
  regex_patterns = []

  for dga_name in dga_functions:
    # Call the corresponding DGA function to generate domains
    if dga_name == 'base32_base64_dga':
      dataset.append(base32_base64_dga("myseed", DGA_COUNT, min_length, max_length, "base64"))
      labels.append(dga_name)
      regex_pattern = r'[G|X|t|k|c|l|V|W|Z|z|n|b|e|o|w]+|[3]+|[^d7urg9.m]{0,2}{1,3}'
    elif dga_name == 'wordlist_dga':
      dataset.append(wordlist_dga(DGA_COUNT, min_length, max_length))
      labels.append(dga_name)
      regex_pattern = r'[d|r|g|i|f|m|l|o|e|n|a|t|w]+|[^yu.kcbp]{0,2}{1,3}'
    elif dga_name == 'vowel_consonant_dga':
      dataset.append(vowel_consonant_dga(count=DGA_COUNT, min_length=7, max_length=12))
      labels.append(dga_name)
      regex_pattern = r'[d|u|i|f|v|o|m|l|e|n|a|t|c|w]+|[^r.khzb]{0,2}{1,3}'
    elif dga_name == 'morse_code_dga':
      dataset.append(morse_code_dga(count=DGA_COUNT, min_length=9, max_length=16))
      labels.append(dga_name)
      regex_pattern = r'[i|f|c|o|m|n|w]+|[^t-e.]{0,2}{1,3}'
    elif dga_name == 'emoji_dga':
      dataset.append(emoji_dga(count=DGA_COUNT, min_length=10, max_length=18))
      labels.append(dga_name)
      regex_pattern = r'[g|v|o|e|w]+|[^🎉🤔🙌d😂u😀🌍.😍cm🚀nt👍💡]{0,2}{1,3}'
    elif dga_name == 'coordinate_dga':
      dataset.append(coordinate_dga(count=DGA_COUNT, min_length=9, max_length=10))
      labels.append(dga_name)
      regex_pattern = r'[g|o|v|w]+|[3|7|1|0|4|8|5|2|9|6]+|[^drui.enf]{0,2}{1,3}'
    elif dga_name == 'musical_notes_dga':
      dataset.append(musical_notes_dga(count=DGA_COUNT, min_length=7, max_length=9))
      labels.append(dga_name)
      regex_pattern = r'[G|C|g|B|D|F|o|e|n|A|E|w]+|[3|1|4|2|5]+|[^dur.ivtf]{0,2}{1,3}'
    elif dga_name == 'fibonacci_based_dga':
      dataset.append(fibonacci_based_dga(count=DGA_COUNT, min_length=10, max_length=17))
      labels.append(dga_name)
      regex_pattern = r'[G|X|t|k|c|l|V|W|Z|z|n|b|e|o|w]+|[3]+|[^d7urg9.m]{0,2}{1,3}'
    elif dga_name == 'permutation_based_dga':
      dataset.append(permutation_based_dga("example", count=DGA_COUNT, min_length=6, max_length=10))
      labels.append(dga_name)
      regex_pattern = r'[t|m|l|x|e|c|a|o|n|p|w]+|[^rg.]{0,2}{1,3}'
    elif dga_name == 'arithmetic_based_dga':
      dataset.append(arithmetic_based_dga(12345, count=DGA_COUNT, min_length=5, max_length=15))
      labels.append(dga_name)
      regex_pattern = r'[i|o|e|n|t|w]+|[7|1|0|5|2|9|6]+|[^3.cmzbf]{0,2}{1,3}'
    elif dga_name == 'prng_based_dga':
      dataset.append(prng_based_dga(12345, count=DGA_COUNT, min_length=8, max_length=17))
      labels.append(dga_name)
      regex_pattern = r'[r|g|v|i|k|x|m|o|h|l|a|t|n|e|b|z|w]+|[1|0|9]+|[^y3qs7u5.c6f]{0,2}{1,3}'
    elif dga_name == 'dictionary_based_dga':
      dataset.append(dictionary_based_dga(count=DGA_COUNT, min_length=5, max_length=11))
      labels.append(dga_name)
      regex_pattern = r'[d|r|g|i|c|o|m|e|a|t|b|w]+|[^s.hnzf]{0,2}{1,3}'
    elif dga_name == 'seed_based_dga':
      dataset.append(seed_based_dga(str(12345), count=DGA_COUNT, min_length=16, max_length=19))
      labels.append(dga_name)
      regex_pattern = r'[r|g|o|c|e|n|a|t|b|w]+|[3|1|0|8|2|9|5]+|[^d4.m6f]{0,2}{1,3}'
    elif dga_name == 'time_based_dga':
      dataset.append(prng_based_dga(12345, count=DGA_COUNT, min_length=5, max_length=13))
      labels.append(dga_name)
      regex_pattern = r'[r|w|g|i|o|n|f]+|[1|4|0|2|6]+|[^bz.]{0,2}{1,3}'
    elif dga_name == 'zodiac_sign_dga':
      dataset.append(zodiac_sign_dga(seed=str(uuid.uuid4()), domain_count=DGA_COUNT, min_length=8, max_length=14))
      labels.append(dga_name)
      regex_pattern = r'[r|w|g|i|c|o|n|t|b|f]+|[7|1|4|0|8|9|2|5|6]+|[^3q.lez]{0,2}{1,3}'

  return dataset, labels

In [25]:
dga_functions = [
  'base32_base64_dga',
  'wordlist_dga',
  'vowel_consonant_dga',
  'morse_code_dga',
  'emoji_dga',
  'zodiac_sign_dga',
  'coordinate_dga',
  'musical_notes_dga',
  'fibonacci_based_dga',
  'permutation_based_dga',
  'arithmetic_based_dga',
  'prng_based_dga',
  'dictionary_based_dga',
  'seed_based_dga',
  'time_based_dga',
  'zodiac_sign_dga'
]

dataset, labels = create_dataset(dga_functions, DGA_COUNT)

# Now you can work with the data
print("Generated DGA Dataset:")
for i in range(len(dataset)):
  print(f"\t- {labels[i]}: {dataset[i]}")

print("\nDGA Function Labels:")
print(labels)

Generated DGA Dataset:
	- base32_base64_dga: ['www.bXlzZWVk4.gov', 'www.ZGVlc3lt.com', 'www.bXlzZWVk2.gov', 'www.ZGVlc3lt.edu', 'www.bXlzZWVk8.com', 'www.ZGVlc3lt4.net', 'www.bXlzZWVk99.edu', 'www.ZGVlc3lt90.edu', 'www.bXlzZWVk2.net', 'www.ZGVlc3l.gov']
	- wordlist_dga: ['www.cherrykiwi.com', 'www.figfigfig.net', 'www.kiwidate.org', 'www.mangofig.org', 'www.mangogrape.com', 'www.lemonlemon.org', 'www.figcherry.edu', 'www.elderberry.edu', 'www.figkiwi.net', 'www.datefig.edu']
	- vowel_consonant_dga: ['www.otojene.edu', 'www.edepoqawilob.net', 'www.isimerefo.com', 'www.iharagod.net', 'www.olefifozer.com', 'www.upolagis.info', 'www.upuniculic.edu', 'www.ukegibim.gov', 'www.apoqegafuse.gov', 'www.umugonotoma.net']
	- morse_code_dga: ['www.----.--.--...---.....----...---.-----........---...org', 'www.--..------.......----.--...--.-.-.-.---.net', 'www...----.-.---.......----...---.---.----...-.--....-...org', 'www...-.-..--.--.....---.--........----...-.----....--.-.-.--.info', 'www.-.-.--..

In [26]:
#main entry to program

if __name__ == "__main__":
   for dga_type, dga_function in DGA_TYPES.items():
       print(f"{dga_type}")
       # Print explanation, usefulness, strengths, weaknesses, and deception methods
       if dga_type == "Zodiac-based DGA Generation":
           print(textwrap.fill("Explanation: This DGA generates domains based on zodiac signs and random strings.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create a unique set of domains for command and control (C&C) communication or data exfiltration.", width=80))
           print(textwrap.fill("Strengths: The use of zodiac signs and random strings makes the generated domains unpredictable and difficult to block.", width=80))
           print(textwrap.fill("Weaknesses: The zodiac sign pattern may be detectable, and the random string generation algorithm may be reverse-engineered.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and identifying the relationship between the zodiac signs and dates.", width=80))
           # Generate a random seed of length 16
           random_seed = generate_random_seed(16)
           print(f"Random seed: {random_seed}")
           generated_domains = zodiac_sign_dga(random_seed, DGA_COUNT, min_length, max_length)
           for domain in generated_domains:
              print(domain)

           zodiac_sign_dga_regex = dgas_to_regex(generated_domains)
           if zodiac_sign_dga_regex:
              print(f"\nRegex for Zodiac Sign Generated DGAs: {zodiac_sign_dga_regex}")
              zodiac_sign_dga_yara_rule = create_yara_rule(zodiac_sign_dga_regex)
              print("\nYara rule:\n", zodiac_sign_dga_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Zodiac Sign Generated DGAs.")

       elif dga_type == "Time-based DGA Generation":
           print(textwrap.fill("Explanation: This DGA generates domains based on the current time.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be used by malware authors to create a set of constantly changing domains for C&C communication or data exfiltration.", width=80))
           print(textwrap.fill("Strengths: The generated domains change frequently, making it difficult to block them all.", width=80))
           print(textwrap.fill("Weaknesses: The time-based pattern may be detectable, and the generated domains may be predictable if the algorithm is known.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and identifying the relationship between the domains and the current time.", width=80))
           generated_domains = time_based_dga(DGA_COUNT, min_length, max_length)
           for domain in generated_domains:
              print(domain)

           time_based_dga_regex = dgas_to_regex(generated_domains)
           if time_based_dga_regex:
              print(f"\nRegex for Time-based Generated DGAs: {time_based_dga_regex}")
              time_based_dga_yara_rule = create_yara_rule(time_based_dga_regex)
              print("\nYara rule:\n", time_based_dga_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Time-based Generated DGAs.")

       elif dga_type == "Seed-based DGA Generation":
           print(textwrap.fill("Explanation: This DGA generates domains based on a seed value and a hash function.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create a unique set of domains for C&C communication or data exfiltration, based on a shared seed value.", width=80))
           print(textwrap.fill("Strengths: The generated domains are unpredictable without knowledge of the seed value and the hash function used.", width=80))
           print(textwrap.fill("Weaknesses: If the seed value or the hash function is compromised, the generated domains can be predicted.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and attempting to reverse-engineer the seed value and the hash function.", width=80))
           # Generate a random seed of length 16
           random_seed = generate_random_seed(16)
           print(f"Random seed: {random_seed}")
           generated_domains = seed_based_dga(random_seed, DGA_COUNT, min_length, max_length)
           for domain in generated_domains:
              print(domain)

           seed_based_dga_regex = dgas_to_regex(generated_domains)
           if seed_based_dga_regex:
              print(f"\nRegex for Seed-based Generated DGAs: {seed_based_dga_regex}")
              seed_based_dga_yara_rule = create_yara_rule(seed_based_dga_regex)
              print("\nYara rule:\n", seed_based_dga_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Seed-based Generated DGAs.")

       elif dga_type == "Dictionary-based DGA Generation":
           print(textwrap.fill("Explanation: This DGA generates domains by combining random words from a predefined dictionary.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create domains that appear more human-readable and less suspicious.", width=80))
           print(textwrap.fill("Strengths: The generated domains are more likely to bypass filters and appear legitimate.", width=80))
           print(textwrap.fill("Weaknesses: The dictionary used may be known or detectable, and the pattern of combining words may be recognizable.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and identifying the dictionary used and the word combination algorithm.", width=80))
           generated_domains = dictionary_based_dga(DGA_COUNT, min_length, max_length)
           for domain in generated_domains:
              print(domain)

           dictionary_based_dga_regex = dgas_to_regex(generated_domains)
           if dictionary_based_dga_regex:
              print(f"\nRegex for Dictionary-based Generated DGAs: {dictionary_based_dga_regex}")
              dictionary_based_dga_yara_rule = create_yara_rule(dictionary_based_dga_regex)
              print("\nYara rule:\n", dictionary_based_dga_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Dictionary-based Generated DGAs.")

       elif dga_type == "Pseudorandom Number Generator (PRNG) based DGA Generation":
           print(textwrap.fill("Explanation: This DGA generates domains using a pseudorandom number generator (PRNG) seeded with a specific value.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create a unique set of domains for C&C communication or data exfiltration, based on a shared seed value.", width=80))
           print(textwrap.fill("Strengths: The generated domains are unpredictable without knowledge of the seed value and the PRNG algorithm used.", width=80))
           print(textwrap.fill("Weaknesses: If the seed value or the PRNG algorithm is compromised, the generated domains can be predicted.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and attempting to reverse-engineer the seed value and the PRNG algorithm.", width=80))
           generated_domains = prng_based_dga(12345, DGA_COUNT, min_length, max_length)
           for domain in generated_domains:
              print(domain)

           prng_based_dga_regex = dgas_to_regex(generated_domains)
           if prng_based_dga_regex:
              print(f"\nRegex for Pseudorandom Number Generator (PRNG) based Generated DGAs: {prng_based_dga_regex}")
              prng_based_dga_yara_rule = create_yara_rule(prng_based_dga_regex)
              print("\nYara rule:\n", prng_based_dga_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Pseudorandom Number Generator (PRNG) Generated DGAs.")

       elif dga_type == "Arithmetic-based DGA Generation":
           print(textwrap.fill("Explanation: This DGA generates domains by performing arithmetic operations on a base value and a random number.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create a unique set of domains for C&C communication or data exfiltration, based on a shared base value.", width=80))
           print(textwrap.fill("Strengths: The generated domains are unpredictable without knowledge of the base value and the arithmetic operation used.", width=80))
           print(textwrap.fill("Weaknesses: If the base value or the arithmetic operation is compromised, the generated domains can be predicted.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and attempting to reverse-engineer the base value and the arithmetic operation.", width=80))
           generated_domains = arithmetic_based_dga(12345, DGA_COUNT,min_length, max_length)
           for domain in generated_domains:
              print(domain)

           arithmetic_dga_regex = dgas_to_regex(generated_domains)
           if arithmetic_dga_regex:
              print(f"\nRegex for Arithmetic-based Generated DGAs: {arithmetic_dga_regex}")
              arithmetic_based_dga_yara_rule = create_yara_rule(arithmetic_dga_regex)
              print("\nYara rule:\n", arithmetic_based_dga_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Arithmetic-based Generated DGAs.")

       elif dga_type == "Permutation-based DGA Generation":
           print(textwrap.fill("Explanation: This DGA generates domains by permuting the characters of a base domain.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create a unique set of domains for C&C communication or data exfiltration, based on a shared base domain.", width=80))
           print(textwrap.fill("Strengths: The generated domains are unpredictable without knowledge of the base domain and the permutation algorithm used.", width=80))
           print(textwrap.fill("Weaknesses: If the base domain or the permutation algorithm is compromised, the generated domains can be predicted.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and attempting to reverse-engineer the base domain and the permutation algorithm.", width=80))
           generated_domains = permutation_based_dga("example", DGA_COUNT, min_length, max_length)
           for domain in generated_domains:
              print(domain)

           permutation_dga_regex = dgas_to_regex(generated_domains)
           if permutation_dga_regex:
              print(f"\nRegex for Permutation-based Generated DGAs: {permutation_dga_regex}")
              permutation_based_dga_yara_rule = create_yara_rule(permutation_dga_regex)
              print("\nYara rule:\n", permutation_based_dga_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Permutation-based  Generated DGAs.")

       elif dga_type == "Fibonacci-based DGA Generation":
           print(textwrap.fill("Explanation: This DGA generates domains using the Fibonacci sequence and a character mapping.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create a unique set of domains for C&C communication or data exfiltration, based on the Fibonacci sequence.", width=80))
           print(textwrap.fill("Strengths: The generated domains are unpredictable without knowledge of the character mapping and the Fibonacci sequence implementation.", width=80))
           print(textwrap.fill("Weaknesses: If the character mapping or the Fibonacci sequence implementation is compromised, the generated domains can be predicted.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and attempting to reverse-engineer the character mapping and the Fibonacci sequence implementation.", width=80))
           generated_domains = fibonacci_based_dga(DGA_COUNT, min_length, max_length)
           for domain in generated_domains:
              print(domain)

           fibonacci_dga_regex = dgas_to_regex(generated_domains)
           if fibonacci_dga_regex:
              print(f"\nRegex for Fibonacci-based Generated DGAs: {fibonacci_dga_regex}")
              fibonacci_based_dga_yara_rule = create_yara_rule(fibonacci_dga_regex)
              print("\nYara rule:\n", fibonacci_based_dga_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Fibonacci-based Generated DGAs.")

       elif dga_type == "Base32/Base64 DGA Generation":
           print(textwrap.fill("Explanation: This DGA generates domains by encoding a seed value using Base32 or Base64 encoding.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create a unique set of domains for C&C communication or data exfiltration, based on a shared seed value.", width=80))
           print(textwrap.fill("Strengths: The generated domains are unpredictable without knowledge of the seed value and the encoding scheme used.", width=80))
           print(textwrap.fill("Weaknesses: If the seed value or the encoding scheme is compromised, the generated domains can be predicted.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and attempting to reverse-engineer the seed value and the encoding scheme.", width=80))
           generated_domains = base32_base64_dga("myseed", DGA_COUNT, min_length, max_length, "base64")
           for domain in generated_domains:
              print(domain)

           base32_base64_dga_regex = dgas_to_regex(generated_domains)
           if base32_base64_dga_regex:
              print(f"\nRegex for Base32/Base64 Generated DGAs: {base32_base64_dga_regex}")
              base32_base64_based_dga_yara_rule = create_yara_rule(base32_base64_dga_regex)
              print("\nYara rule:\n", base32_base64_based_dga_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Base32/Base64 Generated DGAs.")

       elif dga_type == "Wordlist-based DGA Generation":
           print(textwrap.fill("Explanation: This DGA generates domains by combining random words from a predefined wordlist.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create domains that appear more human-readable and less suspicious.", width=80))
           print(textwrap.fill("Strengths: The generated domains are more likely to bypass filters and appear legitimate.", width=80))
           print(textwrap.fill("Weaknesses: The wordlist used may be known or detectable, and the pattern of combining words may be recognizable.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and identifying the wordlist used and the word combination algorithm.", width=80))
           generated_domains = wordlist_dga(DGA_COUNT, min_length, max_length)
           for domain in generated_domains:
              print(domain)

           wordlist_dga_regex = dgas_to_regex(generated_domains)
           if wordlist_dga_regex:
              print(f"\nRegex for Wordlist-based Generated DGAs: {wordlist_dga_regex}")
              bwordlist_based_dga_yara_rule = create_yara_rule(wordlist_dga_regex)
              print("\nYara rule:\n", bwordlist_based_dga_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Wordlist-based Generated DGAs.")

       elif dga_type == "Vowel-Consonant DGA Generation":
           print(textwrap.fill("Explanation: This DGA generates domains by alternating between vowels and consonants.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create domains that appear more human-readable and less suspicious.", width=80))
           print(textwrap.fill("Strengths: The generated domains are more likely to bypass filters and appear legitimate.", width=80))
           print(textwrap.fill("Weaknesses: The alternating vowel-consonant pattern may be detectable, and the character sets used may be known.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and identifying the vowel-consonant alternation algorithm and the character sets used.", width=80))
           generated_domains = vowel_consonant_dga(DGA_COUNT, min_length, max_length)
           for domain in generated_domains:
              print(domain)

           vowel_consonant_dga_regex = dgas_to_regex(generated_domains)
           if vowel_consonant_dga_regex:
              print(f"\nRegex for Vowel-Consonant Generated DGAs: {vowel_consonant_dga_regex}")
              bvowel_consonant_based_dga_yara_rule = create_yara_rule(vowel_consonant_dga_regex)
              print("\nYara rule:\n", bvowel_consonant_based_dga_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Vowel-Consonant Generated DGAs.")

       elif dga_type == "Morse Code DGA Generation":
           print(textwrap.fill("Explanation: This DGA generates domains using Morse code representation of characters.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create domains that appear obfuscated and less suspicious.", width=80))
           print(textwrap.fill("Strengths: The generated domains are less likely to be recognized as DGA domains and may bypass filters.", width=80))
           print(textwrap.fill("Weaknesses: The Morse code pattern may be detectable, and the character mapping may be known or reverse-engineered.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and identifying the Morse code character mapping.", width=80))
           generated_domains = morse_code_dga(DGA_COUNT, min_length, max_length)
           for domain in generated_domains:
              print(domain)

           morse_code_dga_regex = dgas_to_regex(generated_domains)
           if morse_code_dga_regex:
              print(f"\nRegex for Morse Code Generated DGAs: {morse_code_dga_regex}")
              morse_code_based_dga_yara_rule = create_yara_rule(morse_code_dga_regex)
              print("\nYara rule:\n", morse_code_based_dga_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Morse Code Generated DGAs.")

       elif dga_type == "Emoji DGA Generation":
           print(textwrap.fill("Explanation: This DGA generates domains using emojis.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create domains that appear obfuscated and less suspicious.", width=80))
           print(textwrap.fill("Strengths: The generated domains are less likely to be recognized as DGA domains and may bypass filters.", width=80))
           print(textwrap.fill("Weaknesses: The emoji pattern may be detectable, and the emoji set used may be known or reverse-engineered.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and identifying the emoji set used.", width=80))
           generated_domains = emoji_dga(DGA_COUNT, min_length, max_length)
           for domain in generated_domains:
              print(domain)

           emoji_dga_regex = dgas_to_regex(generated_domains)
           if emoji_dga_regex:
              print(f"\nRegex for Emoji Generated DGAs: {emoji_dga_regex}")
              emoji_dga_yara_rule = create_yara_rule(emoji_dga_regex)
              print("\nYara rule:\n", emoji_dga_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Emoji Generated DGAs.")

       elif dga_type == "Coordinate-based DGA":
           print(textwrap.fill("Explanation: This DGA generates domains using coordinates.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create domains that appear obfuscated and less suspicious.", width=80))
           print(textwrap.fill("Strengths: The generated domains are less likely to be recognized as DGA domains and may bypass filters.", width=80))
           print(textwrap.fill("Weaknesses: The coordinate pattern may be detectable, and the coordinate range and format may be known or reverse-engineered.", width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and identifying the coordinate range and format used.", width=80))
           generated_domains = coordinate_dga(DGA_COUNT, min_length, max_length)
           for domain in generated_domains:
              print(domain)

           coordinate_dga_regex = dgas_to_regex(generated_domains)
           if coordinate_dga_regex:
              print(f"\nRegex for Coordinate-based Generated DGAs: {coordinate_dga_regex}")
              coordinate_yara_rule = create_yara_rule(coordinate_dga_regex)
              print("\nYara rule:\n", coordinate_yara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Coordinate-based Generated DGAs.")

       elif dga_type == "Musical Notes DGA":
           print(textwrap.fill("Explanation: This DGA generates domains using musical notes and octaves.", width=80))
           print(textwrap.fill("Usefulness: This type of DGA can be useful for malware authors to create domains that appear obfuscated and less suspicious.", width=80))
           print(textwrap.fill("Strengths: The generated domains are less likely to be recognized as DGA domains and may bypass filters.", width=80))
           print(textwrap.fill("Weaknesses: The musical note pattern may be detectable, and the note and octave sets used may be known or reverse-engineered.",width=80))
           print(textwrap.fill("Deception: This DGA can be deciphered by analyzing the pattern of generated domains and identifying the note and octave sets used.", width=80))
           generated_domains = musical_notes_dga(DGA_COUNT, min_length, max_length)
           for domain in generated_domains:
              print(domain)

           musical_notes_dga_regex = dgas_to_regex(generated_domains)
           if musical_notes_dga_regex:
              print(f"\nRegex for Musical Notes Generated DGAs: {musical_notes_dga_regex}")
              cmusical_notes_ara_rule = create_yara_rule(musical_notes_dga_regex)
              print("\nYara rule:\n", cmusical_notes_ara_rule)
              print("\nNote: Can use  https://riskmitigation.ch/yara-scan/ or MalwareBazaar https://bazaar.abuse.ch/ but will need API Key")
           else:
              print("\nCould not create a valid regex from the Musical Notes Generated DGAs.")

       print("\n")
   summarize_dga_functions()


Zodiac-based DGA Generation
Explanation: This DGA generates domains based on zodiac signs and random
strings.
Usefulness: This type of DGA can be useful for malware authors to create a
unique set of domains for command and control (C&C) communication or data
exfiltration.
Strengths: The use of zodiac signs and random strings makes the generated
domains unpredictable and difficult to block.
Weaknesses: The zodiac sign pattern may be detectable, and the random string
generation algorithm may be reverse-engineered.
Deception: This DGA can be deciphered by analyzing the pattern of generated
domains and identifying the relationship between the zodiac signs and dates.
Random seed: $-i]b"}B,R*KG!?,
www.2de154bbd.net
www.b66c09d9.biz
www.18d851v.com
www.b109635x8.net
www.f4w8t15.biz
www.f644d25ac8.org
www.04996c0d9b.biz
www.935052844.net
www.8049b2gd.org
www.f6c122b.org

Regex for Zodiac Sign Generated DGAs: [r|w|e|o|b|c|d|i|g|n|f|t|z]+|[9|4|0|8|2|3|5|6|1]+|[^mxav.]{0,2}{1,3}

Yara rule:
 rule