# Title: msticpy - Base64 Decoder
## Description:
This module allows you to extract base64 encoded content from a string or columns of a Pandas DataFrame.
The library returns the following information:
- decoded string (if decodable to utf-8 or utf-16)
- hashes of the decoded segment (MD5, SHA1, SHA256)
- string of printable byte values (e.g. for submission to a disassembler)
- the detected decoded file type (limited)

If the results of the decoding contain further encoded strings these will be decoded recursively. If the encoded string appears to be a zip, gzip or tar archive, the contents will be decompressed after decoding. In the case of zip and tar, the contents of the archive will also be checked for base64 encoded content and decoded/decompressed if possible.
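
The kind of information listed above can be illustrated with the Python standard library alone. This is a minimal sketch, not msticpy's implementation: the helper name `describe_b64` and the dictionary keys are assumptions chosen to mirror the output described in this notebook.

```python
import base64
import hashlib


def describe_b64(encoded: str) -> dict:
    """Decode one Base64 string and summarize the decoded bytes (hypothetical helper)."""
    raw = base64.b64decode(encoded, validate=True)
    return {
        "input_bytes": raw,
        "md5": hashlib.md5(raw).hexdigest(),
        "sha1": hashlib.sha1(raw).hexdigest(),
        "sha256": hashlib.sha256(raw).hexdigest(),
        # Space-separated hex pairs, similar in spirit to a printable byte string
        "printable_bytes": " ".join(f"{byte:02x}" for byte in raw),
    }


info = describe_b64("aGVsbG8gd29ybGQ=")  # Base64 of b"hello world"
print(info["input_bytes"])
print(info["printable_bytes"])
```

Hashing the decoded bytes rather than the encoded text means the hashes can be compared against file hashes from other sources.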


<a id='contents'></a>
## Table of Contents
- [Decoding a String](#decoding_string)
- [Interpreting the DataFrame output](#dataframe_colums)
- [Using a DataFrame as input](#dataframeinput)
  - [Merging output with your DF input](#mergeresults)
- [Nested Encodings](#nested_encodings)
- [ToDo Items](#todos)


In [1]:
# Imports
import sys
MIN_REQ_PYTHON = (3,6)
if sys.version_info < MIN_REQ_PYTHON:
    print('Check the Kernel->Change Kernel menu and ensure that Python 3.6')
    print('or later is selected as the active kernel.')
    sys.exit("Python %s.%s or later is required.\n" % MIN_REQ_PYTHON)


from IPython.display import display
import pandas as pd

# Import Base64 module
from msticpy.sectools import b64

In [3]:
# Load test data
process_tree = pd.read_csv('data/process_tree.csv')
process_tree[['CommandLine']]

Unnamed: 0,CommandLine
0,.\ftp -s:C:\RECYCLER\xxppyy.exe
1,.\reg not /domain:everything that /sid:shines is /krbtgt:golden !
2,"cmd /c ""systeminfo && systeminfo"""
3,.\rundll32 /C 12345.exe
4,.\rundll32 /C c:\users\MSTICAdmin\12345.exe
5,.\rundll32 /C 1234.exe
6,.\rundll32 /C c:\users\MSTICAdmin\1234.exe
7,.\rundll32 /C reg.exe
8,.\reg.exe add \hkcu\software\microsoft\some\key\Run /v abadvalue
9,c:\Diagnostics\UserTmp\tsetup.1.exe C:\Users\MSTICAdmin\AppData\Local\Temp\2\is-01DD7.tmp\tsetu...


<a id='decoding_string'></a>[Contents](#contents)
## Decoding Base64 String

Base64 decode an input string or multiple strings taken from a pandas dataframe column.

```
    unpack_items(input_string: str = None, 
                 data: pd.DataFrame = None, 
                 column: str = None,
                 trace: bool = False):

    Base64 decode an input string or multiple strings taken from a pandas dataframe column.

    Keyword Arguments:
        input_string {str} -- single string to decode (default: {None})
        data {pd.DataFrame} -- dataframe containing column to decode (default: {None})
        column {str} -- Name of dataframe text column (default: {None})
        trace {bool} -- Show additional status (default: {False})

    Returns:
        tuple{string, pd.DataFrame} OR
        pd.DataFrame
```

In [4]:
# get a commandline from our data set
cmdline = process_tree['CommandLine'].loc[39]
cmdline

'.\\powershell  -enc JAB0ACAAPQAgACcAZABpAHIAJwA7AA0ACgAmACAAKAAnAEkAbgB2AG8AawBlACcAKwAnAC0ARQB4AHAAcgBlAHMAcwBpAG8AbgAnACkAIAAkAHQA'

In [8]:
# Decode the string
base64_dec_str,df = b64.unpack_items(input_string=cmdline)

# Print decoded string
print(base64_dec_str)

# And the returned dataframe
display(df)

.\powershell  -enc <decoded type='string' name='[None]' index='1' depth='1'>$ t   =   ' d i r ' ;  
 &   ( ' I n v o k e ' + ' - E x p r e s s i o n ' )   $ t </decoded>


Unnamed: 0,reference,original_string,file_name,file_type,input_bytes,decoded_string,encoding_type,file_hashes,md5,sha1,sha256,printable_bytes
0,1.1,JAB0ACAAPQAgACcAZABpAHIAJwA7AA0ACgAmACAAKAAnAEkAbgB2AG8AawBlACcAKwAnAC0ARQB4AHAAcgBlAHMAcwBpAG8A...,unknown,,"b""$\x00t\x00 \x00=\x00 \x00'\x00d\x00i\x00r\x00'\x00;\x00\r\x00\n\x00&\x00 \x00(\x00'\x00I\x00n\...",$�t� �=� �'�d�i�r�'�;�\r�\n�&� �(�'�I�n�v�o�k�e�'�+�'�-�E�x�p�r�e�s�s�i�o�n�'�)� �$�t�,utf-8,"{'md5': '6cd1486db221e532cc2011c9beeb4ffc', 'sha1': '6e485467d7e06502046b7c84a8ef067cfe1512ad', ...",6cd1486db221e532cc2011c9beeb4ffc,6e485467d7e06502046b7c84a8ef067cfe1512ad,d3291dab1ae552b91e6b50d7460ceaa39f6f92b2cda4335dd77e28d25c62ce34,24 00 74 00 20 00 3d 00 20 00 27 00 64 00 69 00 72 00 27 00 3b 00 0d 00 0a 00 26 00 20 00 28 00 ...


<a id='dataframe_colums'></a>[Contents](#contents)
## Interpreting the DataFrame output
For simple strings the Base64 decoded output is straightforward. For nested encodings, however, the output can become complex and difficult to represent in a tabular format.

**Columns**
 - reference - The index of the row item in dotted notation in depth.seq pairs (e.g. 1.2.2.3 would be the 3rd item at depth 3 that is a child of the 2nd item found at depth 1). This may not always be an accurate notation. It is mainly used to allow you to associate an individual row with the reference value contained in the full_decoded_string column of the topmost item.
 - original_string - the original string before decoding.
 - file_name - filename, if any (only if this is an item in zip or tar file).
 - file_type - a guess at the file type (this is currently elementary and only includes a few file types).
 - input_bytes - the decoded bytes as a Python bytes string.
 - decoded_string - the decoded string if it can be decoded as a UTF-8 or UTF-16 string. Note: binary sequences may often successfully decode as UTF-16 strings but, in these cases, the decodings are meaningless.
 - encoding_type - encoding type (UTF-8 or UTF-16) if a decoding was possible, otherwise 'binary'.
 - file_hashes - collection of file hashes for any decoded item.
 - md5 - md5 hash as a separate column.
 - sha1 - sha1 hash as a separate column.
 - sha256 - sha256 hash as a separate column.
 - printable_bytes - printable version of input_bytes as a string of space-separated hex byte values (e.g. 24 00 74 00).
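
The note on decoded_string and encoding_type can be seen with the sample command line used earlier in this notebook. The sketch below uses only the standard library and is not how msticpy chooses an encoding. It shows that the same bytes decode without error as UTF-8 (leaving a NUL character after every letter) and decode cleanly as UTF-16LE, which is the encoding PowerShell's `-enc` option expects.

```python
import base64

# Encoded PowerShell command taken from the example earlier in this notebook
encoded = (
    "JAB0ACAAPQAgACcAZABpAHIAJwA7AA0ACgAmACAAKAAnAEkAbgB2AG8AawBlACcAKwAnAC0A"
    "RQB4AHAAcgBlAHMAcwBpAG8AbgAnACkAIAAkAHQA"
)
raw = base64.b64decode(encoded)

as_utf8 = raw.decode("utf-8")       # succeeds, but keeps a NUL after every character
as_utf16 = raw.decode("utf-16-le")  # two bytes per character, no stray NULs

print(repr(as_utf8[:8]))
print(as_utf16)
```

A successful decode is therefore not proof that the chosen encoding is the right one, so it is worth checking decoded_string by eye when encoding_type looks surprising.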


In [11]:
df.T

Unnamed: 0,0
reference,1.1
original_string,JAB0ACAAPQAgACcAZABpAHIAJwA7AA0ACgAmACAAKAAnAEkAbgB2AG8AawBlACcAKwAnAC0ARQB4AHAAcgBlAHMAcwBpAG8A...
file_name,unknown
file_type,
input_bytes,"b""$\x00t\x00 \x00=\x00 \x00'\x00d\x00i\x00r\x00'\x00;\x00\r\x00\n\x00&\x00 \x00(\x00'\x00I\x00n\..."
decoded_string,$�t� �=� �'�d�i�r�'�;�\r�\n�&� �(�'�I�n�v�o�k�e�'�+�'�-�E�x�p�r�e�s�s�i�o�n�'�)� �$�t�
encoding_type,utf-8
file_hashes,"{'md5': '6cd1486db221e532cc2011c9beeb4ffc', 'sha1': '6e485467d7e06502046b7c84a8ef067cfe1512ad', ..."
md5,6cd1486db221e532cc2011c9beeb4ffc
sha1,6e485467d7e06502046b7c84a8ef067cfe1512ad


<a id='dataframeinput'></a>[Contents](#contents)
## Using a DataFrame as Input
To decode a DataFrame, pass it to b64.unpack_items() using the ```data``` parameter in place of ```input_string```.
Use the ```column``` parameter to specify which column to process.

In the case of DataFrame input, the output DataFrame contains these additional columns:
 - src_index - the index of the row in the input dataframe from which the data came.
 - full_decoded_string - the full decoded string with any decoded replacements. This is only really useful for top-level items, since nested items will only show the 'full' string representing the child fragment.

In [37]:
# specify the data and column parameters
dec_df = b64.unpack_items(data=process_tree, column='CommandLine')
dec_df

Unnamed: 0,reference,original_string,file_name,file_type,input_bytes,decoded_string,encoding_type,file_hashes,md5,sha1,sha256,printable_bytes,src_index,full_decoded_string
0,1.1,JAB0ACAAPQAgACcAZABpAHIAJwA7AA0ACgAmACAAKAAnAEkAbgB2AG8AawBlACcAKwAnAC0ARQB4AHAAcgBlAHMAcwBpAG8A...,unknown,,"b""$\x00t\x00 \x00=\x00 \x00'\x00d\x00i\x00r\x00'\x00;\x00\r\x00\n\x00&\x00 \x00(\x00'\x00I\x00n\...",$�t� �=� �'�d�i�r�'�;�\r�\n�&� �(�'�I�n�v�o�k�e�'�+�'�-�E�x�p�r�e�s�s�i�o�n�'�)� �$�t�,utf-8,"{'md5': '6cd1486db221e532cc2011c9beeb4ffc', 'sha1': '6e485467d7e06502046b7c84a8ef067cfe1512ad', ...",6cd1486db221e532cc2011c9beeb4ffc,6e485467d7e06502046b7c84a8ef067cfe1512ad,d3291dab1ae552b91e6b50d7460ceaa39f6f92b2cda4335dd77e28d25c62ce34,24 00 74 00 20 00 3d 00 20 00 27 00 64 00 69 00 72 00 27 00 3b 00 0d 00 0a 00 26 00 20 00 28 00 ...,39,.\powershell -enc <decoded type='string' name='[None]' index='1' depth='1'>$�t� �=� �'�d�i�r�'�...
1,1.1,aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,unknown,,b'i\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9a',ꙩ榚骦ꙩ榚骦ꙩ榚骦ꙩ榚骦,utf-16,"{'md5': '9a45b2520e930dc9186f6d93a7798a13', 'sha1': 'f526c90fa0744e3a63d84421ff25e3f5a3d697cb', ...",9a45b2520e930dc9186f6d93a7798a13,f526c90fa0744e3a63d84421ff25e3f5a3d697cb,c1f6c05bdbe28a58557a9477cd0fa96fbc5e7c54ceb6057ec15eca4c664c4239,69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a,40,"cmd /c ""echo # <decoded type='string' name='[None]' index='1' depth='1'>ꙩ榚骦ꙩ榚骦ꙩ榚骦ꙩ榚骦</decoded> ..."
2,1.1,aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,unknown,,b'i\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9a',ꙩ榚骦ꙩ榚骦ꙩ榚骦ꙩ榚骦,utf-16,"{'md5': '9a45b2520e930dc9186f6d93a7798a13', 'sha1': 'f526c90fa0744e3a63d84421ff25e3f5a3d697cb', ...",9a45b2520e930dc9186f6d93a7798a13,f526c90fa0744e3a63d84421ff25e3f5a3d697cb,c1f6c05bdbe28a58557a9477cd0fa96fbc5e7c54ceb6057ec15eca4c664c4239,69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a,41,"cmd /c ""echo # <decoded type='string' name='[None]' index='1' depth='1'>ꙩ榚骦ꙩ榚骦ꙩ榚骦ꙩ榚骦</decoded> ..."
3,1.1,81ed03caf6901e444c72ac67d192fb9c,unknown,,b'\xf3W\x9d\xd3w\x1a\x7f\xaft\xd5\xee8\xe1\xce\xf6i\xce\xbbw_v}\xbf\\',埳펝᩷꽿해㣮컡槶믎彷絶岿,utf-16,"{'md5': '1c8cc6299bd654bbcd85710968d6a87c', 'sha1': '55377391141f59a2ff5ae4765d9f0b4438adfd73', ...",1c8cc6299bd654bbcd85710968d6a87c,55377391141f59a2ff5ae4765d9f0b4438adfd73,fd80ceba7cfb49d296886c10d9a3497d63c89a589587cda7d818cb4644842660,f3 57 9d d3 77 1a 7f af 74 d5 ee 38 e1 ce f6 69 ce bb 77 5f 76 7d bf 5c,44,implant.exe <decoded type='string' name='[None]' index='1' depth='1'>埳펝᩷꽿해㣮컡槶믎彷絶岿</decoded>


<a id='mergeresults'></a>[Contents](#contents)
### SourceIndex column allows you to merge the results with the input DataFrame
Where an input row has multiple Base64 matches, the output of this merge will contain duplicate rows from the input (one per match). The previous index is preserved in the second column (and in the SourceIndex column).

Note: you will need to set the type of the SourceIndex column. In the example below we are matching with the default numeric index, so we force the type to be numeric. If your index has a different dtype, you will need to convert the SourceIndex (dtype=object) to match the type of your index column.

In [53]:
merged_df.dropna(subset=['original_string'])

Unnamed: 0_level_0,TenantId,Account,EventID,TimeGenerated,Computer,SubjectUserSid,SubjectUserName,SubjectDomainName,SubjectLogonId,NewProcessId,NewProcessName,TokenElevationType,ProcessId,CommandLine,ParentProcessName,TargetLogonId,SourceComputerId,TimeCreatedUtc,NodeRole,Level,ProcessId1,NewProcessId1,reference,original_string,file_name,file_type,input_bytes,decoded_string,encoding_type,file_hashes,md5,sha1,sha256,printable_bytes,src_index,full_decoded_string
SourceIndex,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1
39,802d39e1-9d70-404d-832c-2de5e2478eda,MSTICAlertsWin1\MSTICAdmin,4688,2019-01-15 05:15:13.567,MSTICAlertsWin1,S-1-5-21-996632719-2361334927-4038480536-500,MSTICAdmin,MSTICAlertsWin1,0xfaac27,0x1684,C:\Diagnostics\UserTmp\powershell.exe,%%1936,0xbc8,.\powershell -enc JAB0ACAAPQAgACcAZABpAHIAJwA7AA0ACgAmACAAKAAnAEkAbgB2AG8AawBlACcAKwAnAC0ARQB4A...,C:\Windows\System32\cmd.exe,0x0,46fe7078-61bb-4bed-9430-7ac01d91c273,2019-01-15 05:15:13.567,sibling,1,,,1.1,JAB0ACAAPQAgACcAZABpAHIAJwA7AA0ACgAmACAAKAAnAEkAbgB2AG8AawBlACcAKwAnAC0ARQB4AHAAcgBlAHMAcwBpAG8A...,unknown,,"b""$\x00t\x00 \x00=\x00 \x00'\x00d\x00i\x00r\x00'\x00;\x00\r\x00\n\x00&\x00 \x00(\x00'\x00I\x00n\...",$�t� �=� �'�d�i�r�'�;�\r�\n�&� �(�'�I�n�v�o�k�e�'�+�'�-�E�x�p�r�e�s�s�i�o�n�'�)� �$�t�,utf-8,"{'md5': '6cd1486db221e532cc2011c9beeb4ffc', 'sha1': '6e485467d7e06502046b7c84a8ef067cfe1512ad', ...",6cd1486db221e532cc2011c9beeb4ffc,6e485467d7e06502046b7c84a8ef067cfe1512ad,d3291dab1ae552b91e6b50d7460ceaa39f6f92b2cda4335dd77e28d25c62ce34,24 00 74 00 20 00 3d 00 20 00 27 00 64 00 69 00 72 00 27 00 3b 00 0d 00 0a 00 26 00 20 00 28 00 ...,39.0,.\powershell -enc <decoded type='string' name='[None]' index='1' depth='1'>$�t� �=� �'�d�i�r�'�...
40,802d39e1-9d70-404d-832c-2de5e2478eda,MSTICAlertsWin1\MSTICAdmin,4688,2019-01-15 05:15:13.683,MSTICAlertsWin1,S-1-5-21-996632719-2361334927-4038480536-500,MSTICAdmin,MSTICAlertsWin1,0xfaac27,0x16b8,C:\Diagnostics\UserTmp\cmd.exe,%%1936,0xbc8,"cmd /c ""echo # aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa >> blah.ps1""",C:\Windows\System32\cmd.exe,0x0,46fe7078-61bb-4bed-9430-7ac01d91c273,2019-01-15 05:15:13.683,sibling,1,,,1.1,aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,unknown,,b'i\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9a',ꙩ榚骦ꙩ榚骦ꙩ榚骦ꙩ榚骦,utf-16,"{'md5': '9a45b2520e930dc9186f6d93a7798a13', 'sha1': 'f526c90fa0744e3a63d84421ff25e3f5a3d697cb', ...",9a45b2520e930dc9186f6d93a7798a13,f526c90fa0744e3a63d84421ff25e3f5a3d697cb,c1f6c05bdbe28a58557a9477cd0fa96fbc5e7c54ceb6057ec15eca4c664c4239,69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a,40.0,"cmd /c ""echo # <decoded type='string' name='[None]' index='1' depth='1'>ꙩ榚骦ꙩ榚骦ꙩ榚骦ꙩ榚骦</decoded> ..."
41,802d39e1-9d70-404d-832c-2de5e2478eda,MSTICAlertsWin1\MSTICAdmin,4688,2019-01-15 05:15:13.793,MSTICAlertsWin1,S-1-5-21-996632719-2361334927-4038480536-500,MSTICAdmin,MSTICAlertsWin1,0xfaac27,0x16ec,C:\Diagnostics\UserTmp\cmd.exe,%%1936,0xbc8,"cmd /c ""echo # aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa >> blah.ps1""",C:\Windows\System32\cmd.exe,0x0,46fe7078-61bb-4bed-9430-7ac01d91c273,2019-01-15 05:15:13.793,sibling,1,,,1.1,aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,unknown,,b'i\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9ai\xa6\x9a',ꙩ榚骦ꙩ榚骦ꙩ榚骦ꙩ榚骦,utf-16,"{'md5': '9a45b2520e930dc9186f6d93a7798a13', 'sha1': 'f526c90fa0744e3a63d84421ff25e3f5a3d697cb', ...",9a45b2520e930dc9186f6d93a7798a13,f526c90fa0744e3a63d84421ff25e3f5a3d697cb,c1f6c05bdbe28a58557a9477cd0fa96fbc5e7c54ceb6057ec15eca4c664c4239,69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a 69 a6 9a,41.0,"cmd /c ""echo # <decoded type='string' name='[None]' index='1' depth='1'>ꙩ榚骦ꙩ榚骦ꙩ榚骦ꙩ榚骦</decoded> ..."
44,802d39e1-9d70-404d-832c-2de5e2478eda,MSTICAlertsWin1\MSTICAdmin,4688,2019-01-15 05:15:12.003,MSTICAlertsWin1,S-1-5-21-996632719-2361334927-4038480536-500,MSTICAdmin,MSTICAlertsWin1,0xfaac27,0x1250,C:\Diagnostics\UserTmp\implant.exe,%%1936,0xbc8,implant.exe 81ed03caf6901e444c72ac67d192fb9c,C:\Windows\System32\cmd.exe,0x0,46fe7078-61bb-4bed-9430-7ac01d91c273,2019-01-15 05:15:12.003,sibling,1,,,1.1,81ed03caf6901e444c72ac67d192fb9c,unknown,,b'\xf3W\x9d\xd3w\x1a\x7f\xaft\xd5\xee8\xe1\xce\xf6i\xce\xbbw_v}\xbf\\',埳펝᩷꽿해㣮컡槶믎彷絶岿,utf-16,"{'md5': '1c8cc6299bd654bbcd85710968d6a87c', 'sha1': '55377391141f59a2ff5ae4765d9f0b4438adfd73', ...",1c8cc6299bd654bbcd85710968d6a87c,55377391141f59a2ff5ae4765d9f0b4438adfd73,fd80ceba7cfb49d296886c10d9a3497d63c89a589587cda7d818cb4644842660,f3 57 9d d3 77 1a 7f af 74 d5 ee 38 e1 ce f6 69 ce bb 77 5f 76 7d bf 5c,44.0,implant.exe <decoded type='string' name='[None]' index='1' depth='1'>埳펝᩷꽿해㣮컡槶믎彷絶岿</decoded>


In [48]:
# Set the type of the SourceIndex column. 
dec_df['SourceIndex'] = pd.to_numeric(dec_df['src_index'])
merged_df = (process_tree
             .merge(right=dec_df, how='left', left_index=True, right_on='SourceIndex')
             .drop(columns=['Unnamed: 0'])
             .set_index('SourceIndex'))

# Show the result of the merge (only those rows that have a value in original_string)
merged_df.dropna(subset=['original_string'])

# Note the output of unpack_items() may have multiple rows (for nested encodings) 
# In this case merged DF will have duplicate rows from the source.


<a id='nested_encodings'></a>[Contents](#contents)
## Decoding Nested Base64/Archives
The module will try to follow nested encodings. It uses the following algorithm:
1. Search for a pattern in the input that looks like a Base64 encoded string
2. If not a known undecodable_string, try to decode the matched pattern.
   - If the Base64 string matches a known archive type (zip, tar, gzip), also decompress or unpack it
     - For multi-item archives (zip, tar), process each contained item recursively (i.e. go to step 1 with the child item as input)
   - For anything that decodes to a UTF-8 or UTF-16 string replace the input pattern with the decoded string
   - Recurse over resultant output (i.e. submit decoded/replaced string to 1.)
3. If decoding fails, add to list of undecodable_strings (prevents infinite looping over something that looks like a base64 string but isn't)
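
The steps above can be sketched with the standard library alone. This is a simplified illustration, not msticpy's implementation: the regular expression, the `MAX_DEPTH` limit, the plain `<decoded>` wrapper and the function names `try_text` and `unpack` are all assumptions. Only zip archives are handled, and line-wrapped Base64 input is not joined before matching.

```python
import base64
import binascii
import io
import re
import zipfile

# Assumption: a candidate is a run of 20+ Base64 characters plus optional padding.
B64_PATTERN = re.compile(r"[A-Za-z0-9+/]{20,}={0,2}")
MAX_DEPTH = 5  # assumed safety limit on nesting


def try_text(raw: bytes):
    """Return the bytes decoded as UTF-8 or UTF-16, or None if neither works."""
    for encoding in ("utf-8", "utf-16"):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return None


def unpack(text: str, undecodable: set, depth: int = 1) -> str:
    """Replace every decodable Base64 match in text, recursing into the results."""
    if depth > MAX_DEPTH:
        return text

    def replace(match):
        candidate = match.group(0)
        if candidate in undecodable:  # step 2: skip known failures
            return candidate
        try:
            raw = base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            undecodable.add(candidate)  # step 3: remember the failure
            return candidate
        if raw[:4] == b"PK\x03\x04":  # zip local file header
            try:
                with zipfile.ZipFile(io.BytesIO(raw)) as archive:
                    items = [(n, try_text(archive.read(n))) for n in archive.namelist()]
            except zipfile.BadZipFile:
                undecodable.add(candidate)
                return candidate
            # Each archive member is itself searched for Base64 content
            parts = [
                f"[{name}] {unpack(item, undecodable, depth + 1)}"
                for name, item in items
                if item is not None
            ]
            return "<decoded>" + "; ".join(parts) + "</decoded>"
        decoded = try_text(raw)
        if decoded is None:
            undecodable.add(candidate)
            return candidate
        # Recurse over the decoded text to follow nested encodings
        return "<decoded>" + unpack(decoded, undecodable, depth + 1) + "</decoded>"

    return B64_PATTERN.sub(replace, text)


inner = base64.b64encode(b"echo hello from the inner layer").decode()
outer = base64.b64encode(f"run {inner}".encode()).decode()
print(unpack(f"cmd {outer}", set()))
```

The `undecodable` set is shared across all recursion levels, so a string that looks like Base64 but fails to decode is only attempted once.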

In [47]:
encoded_cmd = '''
powershell.exe  -nop -w hidden -encodedcommand 
UEsDBBQAAAAIAGBXkk3LfdszdwAAAIoAAAAJAAAAUGVEbGwuZGxss6v+sj/A0diA
UXmufa/PFcYNcRwX7I/wMC4oZAjgUJyzTEgqrdHbfuWyy/OCExqUGJkZGBoYoEDi
QPO3P4wJuqsQgGvVKimphoUIIa1Fgr9OMLyoZ0z4y37gP2vDfxDp8J/RjWEzs4NG
+8TMMoYTCouZGRSShAFQSwMEFAAAAAAAYYJrThx8YzUhAAAAIQAAAAwAAABiNjRp
bnppcC5mb29CYXNlNjQgZW5jb2RlZCBzdHJpbmcgaW4gemlwIGZpbGVQSwMEFAAA
AAAAi4JrTvMfsJUaAAAAGgAAABIAAABQbGFpblRleHRJblppcC5kbGxVbmVuY29k
ZWQgdGV4dCBmaWxlIGluIHppcFBLAQIUABQAAAAIAGBXkk3LfdszdwAAAIoAAAAJ
AAAAAAAAAAAAIAAAAAAAAABQZURsbC5kbGxQSwECFAAUAAAAAABhgmtOHHxjNSEA
AAAhAAAADAAAAAAAAAABACAAAACeAAAAYjY0aW56aXAuZm9vUEsBAhQAFAAAAAAA
i4JrTvMfsJUaAAAAGgAAABIAAAAAAAAAAQAgAAAA6QAAAFBsYWluVGV4dEluWmlw
LmRsbFBLBQYAAAAAAwADALEAAAAzAQAAAAA='''

# Use a separate variable name so that the dec_df created earlier is not overwritten
dec_string, nested_df = b64.unpack_items(input_string=encoded_cmd)
print(dec_string.replace('<decoded', '\n<decoded'))


powershell.exe  -nop -w hidden -encodedcommand 
<decoded value='multiple binary' type='multiple'  index='1'>
<decoded type='string' name='[zip] Filename: PeDll.dll' index='1.1' depth='2'>笾뿴䅐〳⌁㾝䲍ǔ庰퀈쐿Č熠倀℈ꚜᨒ腦㽋ꚩ黓恓⊀́    ᠀菀ﳶ态ꨭꪪꪪꪪꪪꪪꨊ᪪耚ⶡꪪꪪꪪꪪ⪪ᆢ죺ſﵠ쀇׿ﾀ쀇׿｀䘁대䀃蜨榑v⃈Σ ።</decoded>
<decoded type='string' name='[zip] Filename: b64inzip.foo' index='1.2' depth='2'>Base64 encoded string in zip file</decoded>
<decoded type='string' name='[zip] Filename: PlainTextInZip.dll' index='1.3' depth='2'>Unencoded text file in zip</decoded></decoded>


<a id='todos'></a>[Contents](#contents)
## To-Do Items
- Use more comprehensive list of binary magic numbers and match on byte values after decoding to get better file typing
- Output nested decodings in a more readable format
- Add a pandas pipe() partial function to allow inline decoding in a pandas pipeline. E.g.

`my_df = pd.read_csv('input.csv').b64decode(column='CommandLine').drop_duplicates().some_func()`