der_gentleman is an Instagram comment scraper and Twitter bot, built as an art installation. The scraper can also be used on its own and may provide useful insights.
Instagram removed its Activity tab a while ago, which made it nearly impossible to keep track of all the comments a particular account writes and the posts it likes.
der_gentleman solves this problem by going through every comment of every post of every account the target account is following. If a comment from the target account is found, it is added to the database. Along the way, we also check whether the target account has liked the post. This check is somewhat unreliable, however, because we parse the "top liker" string to avoid heavy rate limiting on the "liked by" endpoint.
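The traversal described above can be sketched roughly as follows. This is a minimal illustration over in-memory data, not the actual der_gentleman code: the types `Account`, `Post`, `Comment` and `Hit`, the `TopLikers` field, and the functions `findComments` and `likedByTarget` are all hypothetical names chosen for this example, and no network or goinsta calls are made.

```go
package main

import (
	"fmt"
	"strings"
)

// Comment, Post and Account are hypothetical stand-ins for scraped data.
// They are assumptions for this sketch, not types from der_gentleman or goinsta.
type Comment struct {
	Author string
	Text   string
}

type Post struct {
	ID        string
	Comments  []Comment
	TopLikers []string // names taken from the "top liker" string, if any
}

type Account struct {
	Username string
	Posts    []Post
}

// Hit records one comment written by the target account.
type Hit struct {
	Owner  string // account that owns the post
	PostID string
	Text   string
}

// findComments walks every comment of every post of every followed account
// and collects the comments written by target.
func findComments(target string, following []Account) []Hit {
	var hits []Hit
	for _, acc := range following {
		for _, post := range acc.Posts {
			for _, c := range post.Comments {
				if strings.EqualFold(c.Author, target) {
					hits = append(hits, Hit{Owner: acc.Username, PostID: post.ID, Text: c.Text})
				}
			}
		}
	}
	return hits
}

// likedByTarget reports whether target appears among the top likers.
// A false result is inconclusive: the target may have liked the post
// without being listed there, which is why this check is unreliable.
func likedByTarget(target string, post Post) bool {
	for _, name := range post.TopLikers {
		if strings.EqualFold(name, target) {
			return true
		}
	}
	return false
}

func main() {
	following := []Account{
		{Username: "alice", Posts: []Post{
			{ID: "p1", Comments: []Comment{{"target", "nice"}, {"bob", "hi"}}, TopLikers: []string{"target"}},
			{ID: "p2", Comments: []Comment{{"bob", "hello"}}},
		}},
	}
	for _, h := range findComments("target", following) {
		fmt.Printf("%s/%s: %s\n", h.Owner, h.PostID, h.Text)
	}
	fmt.Println(likedByTarget("target", following[0].Posts[0]))
}
```

In a real scraper the nested loops would be driven by paginated API requests rather than slices, and each hit would be written to the database instead of being collected in memory.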
This code was written with a very specific use case in mind but can easily be adapted to fit one’s needs.
Quick stats: Rechecking the last 16 posts (= one result page) of ~1300 accounts takes about one hour. The initial scrape takes longer, roughly seven hours.
go get -u github.com/buckket/der_gentleman
- Edit config.toml
- Scrape all the data
- ???
- Profit
The goinsta library has a bug that has not been fixed upstream, so this code uses my fork, which fixes it.
GNU GPLv3+