RedditArchiver-standalone is the standalone version of RedditArchiver.
"Standalone" means that you do not need a web server: the function was reduced to a simple Python script.
For more information on RedditArchiver itself, see the main repository.
(Replace python3
by py -3
if you are on Windows)
Install dependencies:
python3 -m pip install -r requirements.txt
Copy config.yml.example
to config.yml
and put all the information needed to connect (client ID, client secret and refresh token). If you have no clue about what these options are, follow these steps:
- Go here and create an app. Use the "script" type, and put
http://localhost:8080
as redirect URI (the other options do not matter). - Finish creating the app. Take note of your client ID and client secret (the client ID is below the 'personal use script' label, the secret is beside the 'secret' label).
- Run the script
authentication.py
to get your refresh token. - Edit the config file to put the client ID, client secret and refresh token in it. (You can avoid putting the refresh token in the config file; see Shell Function for details.)
You can select the submissions you want to download with several methods:
-i
: specify a submission ID or an URL to download it.-i
can be written several times to download several submissions, like that:
python3 RedditArchiver.py \
-i https://www.reddit.com/r/Superbowl/comments/14hczkk/elf_owl_enjoying_our_pond/ \
-i https://www.reddit.com/r/Superbowl/comments/14gozc4/adult_and_hungry_juvenile_great_horned_owl_norcal/ \
-i 14iard6
-
-s
: Download all the submissions you saved. If you want to include as well the submissions which you saved a comment from, pass-S
instead. -
-u
: Download all the submissions you upvoted. -
-a
: Download all the submissions you posted. If you want to include as well the submissions which you posted a comment in, pass-A
instead.
You can also specify a name to download the submissions from another redditor. Here, you will download the submissions posted by you and by u/iamthatis:
python3 RedditArchiver.py -a -a iamthatis
You can combine these options to download a lot of things at once:
python3 RedditArchiver.py \
-i https://www.reddit.com/r/Superbowl/comments/14hczkk/elf_owl_enjoying_our_pond/ \
-i https://www.reddit.com/r/Superbowl/comments/14gozc4/adult_and_hungry_juvenile_great_horned_owl_norcal/ \
-i 14iard6 \
-s -u -A -l 10
RedditArchiver has more options to control its behavior (limit of submissions retrieved, config file...). To see more, display help with the -h
option.
The file reddit-archive.sh
defines a shell function that makes it simpler to use RedditArchiver for a single post. It also adds support for customization with environment variables, and better security for the refresh token.
You can add this function to your .bashrc or .zshrc with:
source ~/RedditArchiver-standalone/reddit-archive.sh
(Use the directory you cloned the repository if it's not in your home directory.)
reddit_archive https://www.reddit.com/r/Superbowl/comments/14hczkk/elf_owl_enjoying_our_pond/
reddit_archive 14iard6
Unlike the RedditArchiver.py script, you can only specify one submission ID or URL.
There are several environment variables to allow customization of paths/files:
REDDIT_ARCHIVER_SCRIPT_DIR
: Directory where RedditArchiver.py lives. If not set, this defaults to the location of thereddit-archive.sh
script (zsh/bash).REDDIT_ARCHIVER_SCRIPT_NAME
: Name of the script itself, normallyRedditArchiver.py
.REDDIT_ARCHIVER_VENV_PATH
: Path to the Python virtual environment, if one is in useREDDIT_ARCHIVER_OUTPUT_DIR
: Directory where the post will be saved. This defaults to$HOME/RedditPosts
.REDDIT_ARCHIVER_CONFIG_FILE
: Path and name of config file. This defaults to the location of the script plusconfig.yml
.REDDIT_ARCHIVER_REFRESH_TOKEN_SECRET_FILE
: Path and name of file containing encrypted refresh token. By default this is not set. If set, Actually Good Encryption (age) is used to decrypt the refresh token using a passkey. See https://github.com/FiloSottile/age for more. If you use something else such as gpg, make appropriate changes to the shell script. See [Security](### Security) for more.
The combination of a custom filepath to the config file and filepath to an encrypted refresh token provides for more security than the out-of-the-box RedditArchiver.py provides. After running authentication.py
to get the client ID, secret, and refresh token:
- Copy the
config.yml.example
file to e.g.~/.secrets/RedditArchiver.config.yml
. SetREDDIT_ARCHIVER_CONFIG_FILE
to this filepath. Do not edit the file in its original location. - Edit this file and set the client ID and secret values from your Reddit app. (If you aren't sure, the client ID is the string under the 'personal use script' label, and the secret is next to the 'secret' label.)
- Assuming you're using age as your encryption tool, you can use the
reddit_archiver_encrypt_refresh_token
function to encrypt your refresh token (path is an example):
reddit_archiver_encrypt_refresh_token ~/.secrets/RedditArchiver.refreshtoken.age
Then in your .bashrc
or .zshrc
:
export REDDIT_ARCHIVER_REFRESH_TOKEN_SECRET_FILE=~/.secrets/RedditArchiver.refreshtoken.age
This keeps your refresh token out of the filesystem. Additionally, the reddit_archive function passes the decrypted refresh token to the Python script as an environment variable, to keep it off the command line.
This software is licensed with MIT license.