Export your personal Reddit data: saves, upvotes, submissions etc. as JSON.

Setting up

  1. The easiest way is pip3 install --user git+https://github.com/karlicoss/rexport.

    Alternatively, use git clone --recursive, or git pull && git submodule update --init. After that, run pip3 install --editable . from the checkout.

  2. To use the API, you need to register a custom personal script app and get client_id and client_secret parameters.

    See more here.

  3. To access a user's personal data (e.g. saved posts/comments), the Reddit API also requires username and password.

    Yes, unfortunately it wants your plaintext Reddit password; you can read more about it here.

Exporting

Usage:

Recommended: create secrets.py with your API parameters, e.g.:

username = "USERNAME"
password = "PASSWORD"
client_id = "CLIENT_ID"
client_secret = "CLIENT_SECRET"

If you have two-factor authentication enabled, append the six-digit 2FA token to the password, separated by a colon:

password = "PASSWORD:343642"

The token will, however, be short-lived.

After that, use:

python3 -m rexport.export --secrets /path/to/secrets.py

That way you type less and have control over where you keep your plaintext secrets.

Alternatively, you can pass parameters directly, e.g.

python3 -m rexport.export --username <username> --password <password> --client_id <client_id> --client_secret <client_secret>

However, this is verbose and prone to leaking your keys/tokens/passwords in shell history.

You can also import export.py as a module and call the get_json function directly to get the raw JSON.

I highly recommend checking exported files at least once just to make sure they contain everything you expect from your export. If not, please feel free to ask or raise an issue!

API limitations

WARNING: the Reddit API limits your queries to 1000 entries.

I highly recommend backing up regularly and keeping old exports. An easy way to achieve that is a command like this:

python3 -m rexport.export --secrets /path/to/secrets.py >"export-$(date -I).json"

Or, you can use arctee that automates this.
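For unattended backups, the dated-export command above can go straight into cron. This is a sketch: the schedule, secrets path, and output directory are all placeholders you'd adjust.

```shell
# Hypothetical crontab entry: run the export daily at 06:00,
# writing a dated JSON file (the target directory must exist).
0 6 * * * python3 -m rexport.export --secrets /path/to/secrets.py > "$HOME/backups/reddit/export-$(date -I).json"
```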

Check out these links if you're interested in getting older data that's inaccessible via the API:

Example output

See example-output.json; it contains some example data you might find in your export. I've cleaned it up a bit, since it has lots of different fields, many of which are probably not relevant.

However, this is pretty API-dependent and changes all the time, so better to check against the Reddit API if you are looking for something specific.

Using the data

You can use rexport.dal (stands for Data Access/Abstraction Layer) to access your exported data, even offline. I elaborate on the motivation behind it here.

  • the main use case is importing it as a Python module to allow programmatic access to your data.

    You can find some inspiration in the my. package that I'm using as an API to all my personal data.

  • to test it against your export, simply run: python3 -m rexport.dal --source /path/to/export
  • you can also try it interactively: python3 -m rexport.dal --source /path/to/export --interactive

Example output:

Your most saved subreddits:
[('orgmode', 50),
 ('emacs', 36),
 ('QuantifiedSelf', 33),
 ('AskReddit', 33),
 ('selfhosted', 29)]
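A summary like the one above can also be reproduced from a raw JSON export with a few lines of stdlib Python. The field names here are assumptions (a top-level "saved" list whose items carry a "subreddit" key); check them against your own export, as they may differ between API versions.

```python
import json
from collections import Counter

def top_saved_subreddits(export: dict, n: int = 5) -> list[tuple[str, int]]:
    # Assumed layout: export["saved"] is a list of dicts,
    # each with a "subreddit" field -- verify against your export.
    counts = Counter(item["subreddit"] for item in export.get("saved", []))
    return counts.most_common(n)

# Tiny synthetic example standing in for a real export file:
data = {"saved": [{"subreddit": "orgmode"},
                  {"subreddit": "emacs"},
                  {"subreddit": "orgmode"}]}
print(top_saved_subreddits(data))  # [('orgmode', 2), ('emacs', 1)]
```

With a real export you'd load the file first, e.g. `data = json.load(open("export-2020-01-01.json"))`.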
