Converts output files from Bulk Downloader for Reddit (BDFR) into pretty text files.
Issues and PRs are welcome.
$ git clone https://github.com/DownrightNifty/bdfr2text.git
$ cd bdfr2text
$ python3 bdfr2text.py INPUT_DIR OUTPUT_DIR
INPUT_DIR is the output dir of bdfr archive. See python3 bdfr2text.py -h.
Only JSON or YAML (not XML) output from BDFR is supported. If converting YAML files, PyYAML is necessary (but this should already have been installed by BDFR). Otherwise, no dependencies.
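To illustrate the dependency note above, here is a minimal sketch of loading a single archive file so that PyYAML is only imported when a YAML file is actually read. It is not bdfr2text's own code: the function name load_archive_file and the file suffixes it checks are assumptions made for the example.

```python
import json
from pathlib import Path


def load_archive_file(path):
    """Load one archive file as Python data (hypothetical helper)."""
    path = Path(path)
    suffix = path.suffix.lower()  # assumed suffixes: .json, .yaml, .yml

    if suffix == ".json":
        # JSON needs only the standard library.
        return json.loads(path.read_text(encoding="utf-8"))

    if suffix in (".yaml", ".yml"):
        try:
            import yaml  # PyYAML, imported lazily so JSON users never need it
        except ImportError as exc:
            raise RuntimeError("PyYAML is required to read YAML files") from exc
        return yaml.safe_load(path.read_text(encoding="utf-8"))

    # XML (and anything else) is rejected, matching the note above.
    raise ValueError(f"unsupported file type: {path.name}")
```

Importing yaml inside the YAML branch keeps the JSON path free of third-party dependencies.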
The --parsable-out (-p) option produces parsable output by escaping any delimiters used by bdfr2text that also appear in the Reddit posts. It replaces [ and ] with visually similar bracket characters, and --- with ┄.
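The escaping idea can be sketched in a few lines of Python. This is an illustration only, not bdfr2text's implementation: the --- to ┄ mapping comes from the description above, while the fullwidth brackets used here are assumed stand-ins and may differ from the characters the tool really emits. The names ESCAPES and escape_delimiters are hypothetical.

```python
# Delimiter -> replacement. Only the "---" entry is taken from the README;
# the fullwidth brackets (U+FF3B, U+FF3D) are assumed stand-ins.
ESCAPES = {
    "[": "\uFF3B",
    "]": "\uFF3D",
    "---": "┄",
}


def escape_delimiters(text, escapes=ESCAPES):
    """Replace delimiter strings found in post text with look-alikes."""
    for raw, safe in escapes.items():
        text = text.replace(raw, safe)
    return text


if __name__ == "__main__":
    print(escape_delimiters("see [this link] --- it works"))
```

After escaping, any literal [, ], or --- left in an output file can only have been written by the converter itself, which is what makes the files safe to parse.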
--parsable-out makes the output text files searchable with your favorite programs. Personally, I use Sublime Text, which can search entire folders and supports regex. For example, you could use the following regex to search for the string "query" within Reddit comments (excluding metadata blocks): query(?=[^\]]+\[)
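The same regex also works outside an editor. Below is a rough Python sketch that applies it to a string and to every text file under a folder. The helper names, the .txt extension, and the layout of SAMPLE (metadata in square brackets, comment text between blocks) are assumptions for illustration; real bdfr2text output may look different.

```python
import re
from pathlib import Path


def comment_pattern(term):
    # README regex with the search term escaped: TERM(?=[^\]]+\[)
    # The lookahead requires a "[" to appear before any "]" after the term.
    return re.compile(re.escape(term) + r"(?=[^\]]+\[)")


def find_in_comments(text, term):
    """Return the start offset of each match outside bracketed blocks."""
    return [m.start() for m in comment_pattern(term).finditer(text)]


def search_dir(output_dir, term):
    """Count matches per file under output_dir (assumes .txt output files)."""
    hits = {}
    for path in Path(output_dir).rglob("*.txt"):
        count = len(find_in_comments(path.read_text(encoding="utf-8"), term))
        if count:
            hits[str(path)] = count
    return hits


# Hypothetical sample: the first "query" sits inside a metadata block.
SAMPLE = (
    "[alice | query-tag]\n"
    "this comment mentions query here\n"
    "[bob | 3 points]\n"
    "reply text\n"
)

if __name__ == "__main__":
    for pos in find_in_comments(SAMPLE, "query"):
        print(pos, repr(SAMPLE[pos:pos + 16]))
```

Because the lookahead needs a later [ to exist, a match in the final comment of a file is missed unless another bracketed block follows it.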