Interest in some changes I made to support AWS Lambda #63

Closed
scott-hand opened this issue Jul 16, 2016 · 8 comments

Comments

@scott-hand
Collaborator

I really like this project and wanted an easy way to schedule it, so I created a version that can be called from AWS Lambda scheduled events. The best part is that it should stay well under the limit for the Lambda Free Tier, meaning that users will be able to run it for free without managing any infrastructure.

The catch is that, to make it easy and clean to use with AWS's Python deployment approach, I restructured things into a more standard Python package layout with a shred module that can be called programmatically. This also made it easier to let users specify the praw.ini file explicitly (either multiple praw.ini files or multiple site entries in one file will allow multiple users to be handled). It also creates a nice "shreddit" script in the virtualenv's bin/ folder.
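A rough sketch of what a Lambda entry point around a programmatically callable shred function might look like. This is not the fork's actual code: `make_handler`, the `shred_fn(praw_ini)` signature, and the `SHREDDIT_PRAW_INI` environment variable are all assumptions made up for illustration.

```python
import os


def make_handler(shred_fn):
    """Wrap a shred callable (hypothetical signature: shred_fn(praw_ini) -> int)."""

    def handler(event, context):
        # Lambda passes an event dict and a context object to the entry point.
        # The environment variable name below is an assumption, not a real setting.
        praw_ini = os.environ.get("SHREDDIT_PRAW_INI", "praw.ini")
        deleted = shred_fn(praw_ini)
        return {"source": event.get("source"), "deleted": deleted}

    return handler


# Local dry run with a fake shred function standing in for the real module.
handler = make_handler(lambda praw_ini: 3)
result = handler({"source": "aws.events"}, None)
print(result)  # {'source': 'aws.events', 'deleted': 3}
```

Injecting the shred callable keeps the handler testable locally without Reddit credentials or an AWS account.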

Here's the fork: https://github.com/scott-hand/Shreddit

I know it's a lot of structural changes, so I just wanted to see if you'd be interested in integrating them. If so, I'll make sure things are cleaned up and submit a PR. Otherwise that's totally cool. I can always spin off my own and cite you and your license file as contributors, and that would be ideal if you didn't want all the structural change introduced.

@scott-hand scott-hand changed the title Interest in some changes I made to support Lambda Interest in some changes I made to support AWS Lambda Jul 16, 2016
@x89
Owner

x89 commented Jul 16, 2016

Nicely done @scott-hand! I'd actually meant to give it an overhaul for a while, which included modularising it and tidying up the code, and I absolutely love the idea behind that. Running it through Lambda, while perhaps not newbie friendly, is definitely a great option for those without a Linux VPS somewhere.

One thing I would like to maintain is backwards compatibility though; I don't want to break my own and other people's cron jobs. So a shreddit.py that utilises the module would completely suffice.

That and some testing and I'm more than happy to merge a PR. Good work!

@scott-hand
Collaborator Author

Awesome. The "shreddit" script that is installed with setup.py is 100% compatible with shreddit.py; that was one of my primary goals. A shreddit.py that calls it wouldn't be too hard to put together. Your oauth check script is in there as well, with the "-t" or "--test-oauth" option.
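For illustration, a thin shreddit.py-style wrapper could look roughly like the sketch below. Only the "-t" / "--test-oauth" flag comes from the description above; `build_parser`, `main`, and the injected `shred` and `check_oauth` callables are hypothetical stand-ins, not the package's real functions.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="shreddit")
    parser.add_argument(
        "-t", "--test-oauth", action="store_true",
        help="check the OAuth setup instead of shredding",
    )
    return parser


def main(argv=None, shred=None, check_oauth=None):
    """Dispatch to injected callables (stand-ins for the packaged module)."""
    args = build_parser().parse_args(argv)
    if args.test_oauth:
        return check_oauth()
    return shred()


# Dry run with placeholder callables instead of the real module functions.
print(main(["-t"], shred=lambda: "shred", check_oauth=lambda: "oauth"))  # oauth
print(main([], shred=lambda: "shred", check_oauth=lambda: "oauth"))      # shred
```

Existing cron jobs that invoke shreddit.py would keep working as long as the wrapper accepts the same arguments as before.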

And yeah Lambda can be a pain. I have thought about creating a very detailed guide on how to set it up, as there are quite a few steps to getting Lambda running. Another big obstacle is how much of a pain in the ass the OAuth refresh token is to get when you're not running it on your desktop computer. I might look into that as well.

Finally, I noticed that when I was deleting one of my relatively big (100s of comments) accounts, it would randomly stop and I'd have to relaunch shreddit.py. I'll look into that and come up with either a PR or a detailed issue about it.

@x89
Owner

x89 commented Jul 17, 2016

I hear you, the OAuth is pretty much just a nightmare. I've had to answer a lot of questions on how it works and how to use it. And I don't blame people at all, it's a pain in the ass.

I noticed that issue years ago; it seems to be something to do with how Reddit caches comments. Usually if you have more than 1000 in your comment history you can't browse back past that stage. But if you delete up to 1000 and wait a while, maybe when Reddit re-indexes your comment history you can browse it again? It only happened with my original Reddit account though. Now I've had Shreddit running hourly for years and I only have 10-20 comments to deal with! I couldn't find a reliable way to delete all the comments from accounts with thousands of comments at the time and my best advice was to wait… Let me know if you find a working solution!
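The "delete what is visible, wait, try again" advice can be sketched as a loop over repeated passes. This only simulates the idea with an in-memory list; whether Reddit actually re-exposes older comments after a wait is the open question above, and `shred_in_passes`, `fetch_batch`, and `delete` are hypothetical names rather than Shreddit or PRAW functions.

```python
import time


def shred_in_passes(fetch_batch, delete, max_passes=5, wait_seconds=0.0):
    """Delete whatever fetch_batch() exposes, pause, and repeat until empty."""
    total = 0
    for _ in range(max_passes):
        batch = list(fetch_batch())
        if not batch:
            break  # nothing visible any more
        for item in batch:
            delete(item)
        total += len(batch)
        time.sleep(wait_seconds)  # give the listing time to refresh
    return total


# Simulation: 25 items in the backlog, but only 10 visible per listing.
backlog = list(range(25))
visible_limit = 10
deleted_total = shred_in_passes(
    fetch_batch=lambda: backlog[:visible_limit],
    delete=backlog.remove,
)
print(deleted_total, len(backlog))  # 25 0
```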

@scott-hand
Collaborator Author

It seemed like it was every 50 - 100 comments, as it would stop after a minute or two and I'd immediately start it back up again. I have another one to shred with a bunch of comments so I'll do some debugging.

One thing I noticed as I'm debugging is that there was some confusion about the comment timestamp timezone ("Seems to be in users's timezone. Unclear."). This has burned me before. The timestamp itself is indeed UTC, but datetime.fromtimestamp() oh-so-helpfully adjusts epoch timestamps into your current system's timezone. Wonderful, right? I'll just swap it out with some arrow code, as it's super clear and easy to work with regarding timestamps.
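To make the timezone pitfall concrete, here is a small standard-library illustration (arrow is not used here, and the epoch value is a made-up sample rather than a real comment timestamp):

```python
from datetime import datetime, timezone

# Sample epoch value standing in for a comment's UTC creation time.
created_utc = 1468627200  # 2016-07-16 00:00:00 UTC

# Naive datetime, shifted into whatever timezone the local system uses.
local_naive = datetime.fromtimestamp(created_utc)

# Timezone-aware datetime, pinned to UTC regardless of the local system.
utc_aware = datetime.fromtimestamp(created_utc, tz=timezone.utc)

print(utc_aware.isoformat())  # 2016-07-16T00:00:00+00:00
```

Comparing `local_naive` against a UTC-based cutoff is what produces off-by-hours behaviour; keeping everything timezone-aware avoids it.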

@scott-hand
Collaborator Author

Scratch that, on trying it a bit on my other account, it does indeed seem like it was the 1000 comment limit. Man I comment too much. I'll spend a bit more time checking that it works like it should (testing out the whitelists and whatnot) and then PR it.

I'm being reminded of how much I hate PRAW. Their global config object is obnoxious and I have to mutate it to get around their (as far as I can tell) refusal to allow the praw.ini filename to be specified programmatically. They even call del on the function that generates it so I can't monkeypatch it. The documentation is basically just reading docstrings as well.

Finally, testing the whitelists makes me think it might be worth the time to throw together a package that generates stub data for PRAW, which would, if the PRAW dependencies were injected appropriately, allow easy testing of whitelists and the like. I don't think it's worth the time to stub out everything, but just Comment and Submission ones would be good for this.
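A minimal sketch of that stub idea, assuming the whitelist logic only needs an item's id and subreddit name. `StubComment`, `StubSubmission`, and `items_to_shred` are hypothetical names invented here, and the stubs simplify PRAW (the real objects expose a Subreddit object, not a plain string).

```python
from dataclasses import dataclass


@dataclass
class StubComment:
    """Stand-in for a PRAW Comment with only the fields the filter reads."""
    id: str
    subreddit: str  # simplified: PRAW exposes a Subreddit object here
    body: str = ""


@dataclass
class StubSubmission:
    """Stand-in for a PRAW Submission with only the fields the filter reads."""
    id: str
    subreddit: str
    title: str = ""


def items_to_shred(items, whitelist_subreddits=(), whitelist_ids=()):
    """Return the items that are not protected by either whitelist."""
    protected_subs = {name.lower() for name in whitelist_subreddits}
    protected_ids = set(whitelist_ids)
    return [
        item for item in items
        if item.subreddit.lower() not in protected_subs
        and item.id not in protected_ids
    ]


history = [
    StubComment(id="c1", subreddit="Python"),
    StubComment(id="c2", subreddit="AskReddit"),
    StubSubmission(id="s1", subreddit="aws"),
]
remaining = items_to_shred(
    history, whitelist_subreddits=["python"], whitelist_ids=["s1"]
)
print([item.id for item in remaining])  # ['c2']
```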

@x89
Owner

x89 commented Nov 21, 2016

How're your updates going @scott-hand? I was recently thinking about giving it the once over that I believe you were working on, packaging it up and so on.

@scott-hand
Collaborator Author

I'll try to get it documented and submit a PR this week. It stalled out a bit because I'm still having issues with it just giving up and silently quitting when having to handle lots of comments. I'll get it documented enough to use and then we can look into PRAW's mysterious quitting issue. The monologue-bot repo I made should help us out with that a bit.

@scott-hand
Collaborator Author

@x89

PR created. I just tested a clean installation with pip on Windows and it works well. Some of the to-do items are mentioned in the PR, and I also plan to write a guide on running it for free on AWS Lambda. I would also like to make a command to auto-generate a new config from a scaffold like my monologue-bot does here.
