Gmail URL Decoder
Gmail URL Decoder is an Open Source Python tool that can be used against plaintext or arbitrary raw data files in order to find, extract, and decode information from Gmail URLs related to both the new and legacy Gmail interfaces.
Run with python3 (properly tested on 3.6.7 version):
usage: GmailURLDecoder.py [-h] (-t | -r) [-n | -l] -i PATH -o PATH [-v] [-c] Extract and decode information from Gmail URLs optional arguments: -h, --help show this help message and exit -t, --text set input as a plain text file with an URL per line -r, --raw set input as a raw data file -n, --new only look for new URLs -l, --legacy only look for legacy URLs -v, --verbose print the results as they are found -c, --compact write compact output required arguments: -i PATH, --input PATH path of the input file -o PATH, --output PATH path of the output file
Plaintext input vs Raw data input
You can run the program against to different types of input:
- plaintext file cointaining an URL per line (extracted beforehand)
- raw data file
The former is recommended in case you have access to non-corrupted files storing URLs in a known way that can be extracted easily (mainly history files) as it will make the search faster (due to the ability to load into RAM a line each time when dealing with plaintext files) and less prone to parsing inaccuracy (due to some heuristics that have to bee applied into raw data file parsing to properly separate URLs' data from other adjacent data).
Legacy vs New URL format
By default, the program will look for URLs from both legacy and new gmail interface. In case you are totally convinced that you won't find any URL from the old (or new) format, you can look only for one of those specific patterns to save a little time when running it against huge files.
The output file generated consists of json formatted data that can be easily consumed by any other application or programming language.
You can use the compact '-c' argument for the data to be written without any indentation (that might be useful if you will only consume the json output with another applicaion/language and not human reading it directly.
Moreover, if you want to print in terminal results as they are being found, you can use the verbose '-v' argument
Run the program against a raw data file
python3 GmailURLDecoder.py -r -i /path/to/input/raw.dat -o /path/to/output/rawdata_results.json
Run the program against a plaintext file in verbose mode
python3 GmailURLDecoder.py -t -i /path/to/input/plain.txt -o /path/to/output/plaintext_results.json -v
Run the program against a raw data file looking only for legacy URLs with compact output
python3 GmailURLDecoder.py -r -l -i /path/to/input/raw.dat -o /path/to/output/rawdata_results.json -c
Run the program against a plaintext file looking only for new URLs in verbose mode and with compact output
python3 GmailURLDecoder.py -t -n -i /path/to/input/plain.txt -o /path/to/output/plaintext_results.json -v -c
Contributions and improvements to the code are welcomed.
Distributed under the MIT License. See License.md for details.