Last Updated on 2024-09-30
This bot has been retired now that Works in Progress is officially on Bluesky: @worksinprogress.bsky.social
I enjoy reading Works in Progress, but each article is long (a 20 to 40 minute read) and I often forget about an issue once it’s published. Each issue has 6 or 7 articles.
So to help me keep up, I created a Bluesky bot that each week posts a link, a description and a screenshot of one of the 115 articles (at the time of writing) from the Works in Progress archive.
Bluesky bot: @wkipbot.bsky.social (now retired)
I used rvest in R and the inspector tool in Firefox to get the CSS selectors to scrape all the articles.
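A minimal sketch of what such a scrape might look like with rvest. The CSS selectors `a.article-card` and `h3` and the function name `scrape_issue()` are placeholders for illustration only; the real selectors came from the Firefox inspector and are not reproduced here. The demo runs against a small inline HTML snippet, so it works offline.

```r
library(rvest)

# Hypothetical selectors: swap in the ones found with the browser inspector.
scrape_issue <- function(page, card_css = "a.article-card", title_css = "h3") {
  cards <- html_elements(page, card_css)
  data.frame(
    card_slug = html_attr(cards, "href"),                       # link target of each card
    title     = html_text2(html_element(cards, title_css)),     # one title per card
    card_text = html_text2(cards),                              # full visible card text
    stringsAsFactors = FALSE
  )
}

# Offline demo on a tiny made-up page with the same assumed structure.
demo_page <- minimal_html('
  <a class="article-card" href="issue/epidemic-disease-and-the-state/">
    <span>Issue 01</span><h3>Epidemic disease and the state</h3>
  </a>')
demo_articles <- scrape_issue(demo_page)

# For a live issue page, something like:
# scrape_issue(read_html("https://worksinprogress.co/issue-01/"))
```

Running the function over each issue page and binding the results with `rbind()` would give one row per article.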
This yielded a table with 115 articles from 17 issues (one of them a
special issue). The table wkip-articles-2024-09-28.csv has 10 columns
with one row for each article.
Rows: 115
Columns: 10
$ url <chr> "https://worksinprogress.co", "https://…
$ issue_num <chr> "issue-01", "issue-01", "issue-01", "is…
$ card_text <chr> "SpotlightIssue 01Epidemic disease and …
$ card_slug <chr> "issue/epidemic-disease-and-the-state/"…
$ title <chr> "Epidemic disease and the state", "Buil…
$ id <dbl> 16, 111, 112, 113, 114, 115, 15, 104, 1…
$ post_week <date> 2025-07-01, 2024-10-01, 2025-11-11, 20…
$ post_article_url <chr> "https://worksinprogress.co/issue/epide…
$ post_issue_url <chr> "https://worksinprogress.co/issue-01/",…
$ screen_shot_prefix <chr> "issue-01-1", "issue-01-2", "issue-01-3…
The table is used by the posting bot for the post text and alt-text, and to
create the YAML for taking screenshots. I wanted to post one article a
week at random from the archive, so I created an article id and a
post_week date variable that was randomly assigned to each id.
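The random assignment can be sketched in base R roughly as follows. This is an illustration under assumptions, not the code used for the bot: the start date of 2024-10-01 is taken from the sample output above, and the function names `assign_post_weeks()` and `article_for()` and the `seed` argument are made up for the example.

```r
# Give each article id a distinct posting week, in random order.
# Note: set.seed() resets the global RNG so the schedule is reproducible.
assign_post_weeks <- function(ids, start = as.Date("2024-10-01"), seed = 2024) {
  set.seed(seed)
  weeks <- seq(start, by = "week", length.out = length(ids))  # one week per id
  data.frame(id = ids, post_week = weeks[sample.int(length(ids))])
}

schedule <- assign_post_weeks(1:115)

# Look up the article whose posting week contains a given date.
article_for <- function(schedule, today = Sys.Date()) {
  in_week <- schedule$post_week <= today & today < schedule$post_week + 7
  schedule[in_week, ]
}

this_week <- article_for(schedule, as.Date("2024-10-03"))
```

Because the weeks are shuffled rather than drawn with replacement, every article gets exactly one week and no week has two articles.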
I’ve added get_post_uri_by_date() and repost() functions to enable
reposting the article during the posting week.
The initial post is done with bsky.R, and the reposts use the
bsky_tue.R and bsky_fri.R scripts, each with an associated GitHub Actions
yml file.
I used Simon Willison’s
shot-scraper to create
images for each article to use with the post using the
screen_shot_prefix in the articles table.
To do this I needed a YAML file for each issue to input to
shot-scraper, so with help from Claude I wrote some more R to create
the 17 YAML files.
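A rough base R sketch of that step, assuming the articles table has the issue_num, screen_shot_prefix and post_article_url columns shown above. The function names, the `.png` extension and the example.com URLs in the demo are assumptions for illustration; the `output:` and `url:` keys are the ones shot-scraper's `multi` command reads.

```r
# Build the YAML text for one issue: one output/url entry per article.
issue_yaml <- function(articles, issue) {
  rows <- articles[articles$issue_num == issue, ]
  entries <- sprintf("- output: %s.png\n  url: %s",
                     rows$screen_shot_prefix, rows$post_article_url)
  paste(entries, collapse = "\n")
}

# Write one <issue_num>.yml file per issue into `dir`.
write_issue_yamls <- function(articles, dir = ".") {
  for (issue in unique(articles$issue_num)) {
    writeLines(issue_yaml(articles, issue),
               file.path(dir, paste0(issue, ".yml")))
  }
}

# Demo with placeholder rows (not real article URLs).
demo <- data.frame(
  issue_num          = c("issue-01", "issue-01"),
  screen_shot_prefix = c("issue-01-1", "issue-01-2"),
  post_article_url   = c("https://example.com/a/", "https://example.com/b/"),
  stringsAsFactors   = FALSE
)
demo_yaml <- issue_yaml(demo, "issue-01")
cat(demo_yaml, "\n")
```

Keeping the string-building separate from the file-writing makes it easy to inspect the YAML for one issue before generating all the files.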
Out of laziness, I ran a short bash script of the form:
for i in {1..16}; do shot-scraper multi issue-$i.yml --retina; done
on my computer to generate the 115 screenshots.
See my Our World in Data bot and Literature bot for details of how the bot is set up using the R package atrrr and GitHub Actions to post to Bluesky.
To post and repost, I’ve created three GitHub Actions workflows, each
run by its own cron job: bot.yml, bot_tue.yml, bot_fri.yml.
