100 years ago twitter bot & review page #642

Closed
4 tasks done
rlskoeser opened this issue May 22, 2020 · 12 comments
@rlskoeser
Contributor

rlskoeser commented May 22, 2020

I'm working on this as R&D, but have some questions and notes and thought it would be easiest to create an issue to track them and test the work.

6/27/2020 still todo

  • handle multiple authors
  • include editor(s) if no authors
  • twitter credentials in local settings
  • unit tests
@rlskoeser rlskoeser self-assigned this May 22, 2020
@rlskoeser rlskoeser changed the title Code for 100 years ago twitter bot & review 100 years ago twitter bot & review page May 22, 2020
@rlskoeser
Contributor Author

@jkotin @clmahoney some notes and questions on my 100 years ago twitter implementation

I'm building a login-only view where you'll be able to see upcoming tweets and check if there are any problems with the data or the twitterbot code — proposing we use the review for a while without automatically tweeting to check for any errors.

I propose we link to specific lending cards when possible (i.e. for borrows & purchases that are footnoted and linked to an image).

Questions:

  1. How important do you think it is to consolidate events for the same member? ("So and so joined & borrowed x") This is feasible, just a little harder.
  2. I noticed from the current account that periodicals are listed as 'name borrowed an issue of "title"'; should we use known issues/volumes if possible? Can you suggest the format, and what are the earliest events with known issues/volumes? (i.e. how soon will this come up)
  3. How should we handle multiple authors?
  4. Should we include editor(s) if no author?
  5. Are there still notations in work or event notes that need to be skipped for twitter? The guidelines document I found indicates ERROR in the event notes or "GENERIC" or "PROBLEM" in the work notes.
  6. How far ahead would you want to review these and where would you want to start? I can do two weeks like you've been doing, but it isn't hard to do more. Currently I have it set to start with events from 100 years ago today and go out two weeks, but I don't think you'd actually want to review the current day's tweets.

Here's a screenshot of what the review page currently looks like:
[Image: Screen Shot 2020-05-22 at 5 27 18 PM]

@jkotin

jkotin commented May 23, 2020

@rlskoeser This looks great. Responses:

  1. Re: "How important do you think it is to consolidate events for the same member?" Not important, if you mean that there will be separate tweets when a member joins and borrows books on the same day. That's totally fine, maybe even preferable.

  2. Re: periodicals, I'm happy with either option: "an issue of" or giving the exact issue number or date. (Different periodicals will have different systems: "issue number two of," "the May 1935 issue of," "volume 3 and issue 4 of," etc.) I'll ask @clmahoney to let you know when this comes up next. We haven't discussed periodical numbering in depth yet: some cards specify a number, but a lot can be inferred. Cate and I need to work out a system.

  3. Re: multiple authors—ideally it would be: Francis Beaumont and John Fletcher's "Plays." If there are more than two authors you could have: John Smith et al.'s "Book of Gardening." I'm happy to follow CMS on this if it suggests a good system.

  4. Re: editor—yes, I would prefer to include an editor if there's no author. I would like the format to be: "Book of Gardening," edited by John Smith (DATE). If that format is too much trouble, better to exclude the editor.

  5. By the end of the month, there won't be any tags to skip, except "UNCERTAINTYICON." You should probably skip that, but maybe include it at first so we can see what the tweets look like?

  6. Maybe a month out? I'm flexible.
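The editor format requested in item 4 could be sketched roughly as follows. This is only an illustration, not the project's code: the function name `format_edited_work` and its parameters are hypothetical, and joining several editors with "and" is an assumption the thread does not settle.

```python
def format_edited_work(title, editors, year=None):
    """Build a label like: "Book of Gardening," edited by John Smith (1920).

    Hypothetical helper for works that have editors but no author.
    """
    # Assumption: multiple editors are simply joined with "and".
    names = " and ".join(editors)
    # The comma sits inside the closing quotation mark, as in the requested format.
    label = f'"{title}," edited by {names}'
    if year is not None:
        label += f" ({year})"
    return label


print(format_edited_work("Book of Gardening", ["John Smith"], 1920))
```

If a format like this turns out to be too awkward to fit in a tweet, the fallback stated above is to leave the editor out entirely.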

@rlskoeser
Contributor Author

@jkotin thanks for the responses.

  1. Great — much easier to do separate tweets. Maybe we could schedule them near each other, if we want.
  2. Given the variety of periodical and volume formats, I'm inclined to just stick with "an issue of"
  3. I like your proposal for multiple authors. Looks like CMS doesn't switch to et al. until there are more than three authors, but three names seems like a lot for a tweet. OK to use "and" for two names, and et al. for three or more?
  4. I'll see if I can include editors as you suggest without too much trouble. Nice to provide more information if possible.
  5. Seems logical to me.
  6. I'll switch to 4 weeks instead of 2 so we can try that.
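For reference, the review window described in item 6 could be computed along these lines. This is a sketch under stated assumptions: `hundred_years_before` and `review_window` are hypothetical names, and falling back to February 28 when the date 100 years earlier does not exist is a choice made here, not something decided in this thread.

```python
from datetime import date, timedelta


def hundred_years_before(day):
    """Return the same calendar day 100 years earlier."""
    try:
        return day.replace(year=day.year - 100)
    except ValueError:
        # Feb 29 with no counterpart a century earlier (e.g. 2000 -> 1900).
        # Assumption: fall back to Feb 28.
        return day.replace(year=day.year - 100, day=28)


def review_window(today, weeks=4):
    """Return (start, end) dates of events to show on the review page."""
    start = hundred_years_before(today)
    end = start + timedelta(weeks=weeks)
    return start, end


start, end = review_window(date(2020, 5, 22))
print(start, end)
```

Whether the window should start on the current day or the day after is still an open question from the list above; shifting `start` by one day would cover the second option.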

@jkotin

jkotin commented May 26, 2020

Perfect. Re: 3 -- I agree that >2 should get an et al. It will be rare anyway.
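The author rule agreed on here (one name as-is, two names joined with "and", "et al." after the first name for three or more) might look something like this. The function names are hypothetical and the straight quotes are only for illustration.

```python
def author_label(authors):
    """Collapse a list of author names into a short label for a tweet."""
    if not authors:
        return ""
    if len(authors) == 1:
        return authors[0]
    if len(authors) == 2:
        return f"{authors[0]} and {authors[1]}"
    # Three or more authors: first name plus "et al."
    return f"{authors[0]} et al."


def possessive_work(authors, title):
    """Build a phrase like: John Smith et al.'s "Book of Gardening"."""
    label = author_label(authors)
    if not label:
        # No authors: return just the quoted title.
        return f'"{title}"'
    return f"{label}'s \"{title}\""


print(possessive_work(["Francis Beaumont", "John Fletcher"], "Plays"))
print(possessive_work(["John Smith", "Jane Doe", "Ann Lee"], "Book of Gardening"))
```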

@rlskoeser
Contributor Author

@jkotin I'd like to revise the language we're using for subscription events — "joined the library" isn't always accurate for subscription events in the database, since sometimes what are effectively renewals (following a preceding subscription) were logged and documented as subscriptions, and there are plenty of people who let their subscription end but then subscribe again later. (In theory I could check if there are preceding events for the same account, but that could introduce errors and I'd rather not make this code any more complicated than it already is!)

Could we use the same language as the renewals, with subscribed instead of renewed? ("renewed for 1 month at 1 volume per month", "renewed for 2 months"). I know you liked explicitly mentioning the "Shakespeare and Company lending library" — if you don't like making them the same as renewals, please suggest alternate revised wording for subscriptions.

@jkotin

jkotin commented Jun 27, 2020

Some notes on the tweets on the test site:

  1. Initial “smart” quotation marks are angled in the wrong direction.
  2. Update test site, if drawing from test site—some info is out of date.
  3. Exclude items with uncertainty icons or books (not periodicals) without dates?
  4. We need to fix author names and decide where best to pull author names from—e.g. Jean Riviere borrowed a book by “Edward John Moreton Drax Plunkett Dunsany.” That should be Lord Dunsany.
  5. Punctuation should be inside quotation marks—relevant for items without dates.

@jkotin

jkotin commented Jun 27, 2020

Re: language for posts -- I'm OK with using "subscribed" and "renewed."

Would it be possible to include "Shakespeare and Company" like this:

100YearsAgoToday on Saturday, June 27 at the Shakespeare and Company lending library, John Smith subscribed for 1 month at 1 volume per month.

100YearsAgoToday on Saturday, June 27 at the Shakespeare and Company lending library, John Smith renewed for 1 month at 1 volume per month.

100YearsAgoToday on Saturday, June 27 at the Shakespeare and Company lending library, John Smith borrowed "Ulysses" (1922).

An alternative would be:

100YearsAgoToday on Saturday, June 27 at Shakespeare and Company, John Smith renewed for 1 month at 1 volume per month.

Indeed, just saying "Shakespeare and Company" instead of "the Shakespeare and Company lending library" may be better.
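A rough sketch of how the subscription and renewal wording above could be generated, using the shorter "at Shakespeare and Company" opening. All function names here are hypothetical, the hashtag is written with a leading `#` as in the command line report later in this thread, and the weekday is taken from the historical date, which is an assumption.

```python
from datetime import date


def plural(count, noun):
    """Return e.g. '1 month' or '2 months'."""
    return f"{count} {noun}" if count == 1 else f"{count} {noun}s"


def membership_text(verb, months, volumes=None):
    """Return e.g. 'renewed for 1 month at 1 volume per month'."""
    text = f"{verb} for {plural(months, 'month')}"
    if volumes is not None:
        text += f" at {plural(volumes, 'volume')} per month"
    return text


def membership_tweet(day, member, verb, months, volumes=None):
    """Compose the full tweet text for a subscription or renewal event."""
    # day.day avoids the zero padding that %d would add (e.g. "June 07").
    prolog = f"#100YearsAgoToday on {day:%A, %B} {day.day} at Shakespeare and Company"
    return f"{prolog}, {member} {membership_text(verb, months, volumes)}."


print(membership_tweet(date(1920, 6, 30), "John Smith", "subscribed", 1, 1))
print(membership_tweet(date(1920, 6, 30), "John Smith", "renewed", 2))
```

Using one function for both verbs keeps "subscribed" and "renewed" tweets identical apart from the verb, which matches the wording proposed above.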

@rlskoeser
Contributor Author

@jkotin thanks for all the feedback. I like the revised language for subscriptions & including S&co in the prolog of every tweet.

  • I updated the test site with production data from this morning
  • I revised to include S&co in opening of all tweets
  • I revised tweet language for subscriptions
  • Excluded events connected to books with UNCERTAINTYICON in the notes
  • I looked at available name fields. I'm using name for authors, which looks like the best option, but it seems that some author names haven't been cleaned up as well as others (like Dunsany). Which name field would you expect/prefer me to use?
  • This did prompt me to switch the member names to use the lastname_first field that we're using for the heading on the member detail page; I think this is better for the cases where it matters.
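To illustrate the exclusion mentioned in the list above, here is a minimal sketch that drops events whose work notes contain UNCERTAINTYICON. The event dictionaries and the `work_notes` key are hypothetical stand-ins for whatever the database actually provides.

```python
# Tags in work notes that should keep an event out of the tweets.
SKIP_TAGS = ("UNCERTAINTYICON",)


def is_tweetable(event):
    """Return False if the event's work notes contain any skip tag."""
    notes = event.get("work_notes") or ""
    return not any(tag in notes for tag in SKIP_TAGS)


def tweetable_events(events):
    """Filter a list of event dicts down to those safe to tweet."""
    return [event for event in events if is_tweetable(event)]


events = [
    {"member": "Marquis Pagan", "work_notes": ""},
    {"member": "Jean Riviere", "work_notes": "UNCERTAINTYICON title unclear"},
]
print(tweetable_events(events))
```

Keeping the tags in one tuple would make it easy to add others later if more notations need to be skipped.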

The smart quotes aren't as readable in the font used on the site (or the one twitter uses, either 😕 ). Here's how it looks in my command line report:

#100YearsAgoToday on Wednesday, June 30, 1920 at Shakespeare and Company, Marquis Pagan returned George Bernard Shaw’s “The Philanderer” (1905).
https://shakespeareandco.princeton.edu/members/pagan/cards/6d80f641-d632-4123-9a75-fc8ce40fc62c/#e23116

#100YearsAgoToday on Wednesday, June 30, 1920 at Shakespeare and Company, Marquis Pagan borrowed Rudyard Kipling’s “The Light That Failed” (1891).
https://shakespeareandco.princeton.edu/members/pagan/cards/6d80f641-d632-4123-9a75-fc8ce40fc62c/#e23117

I don't know if the GitHub font is any better. If you're still not convinced, would you try cutting and pasting the text into a document so you can see the smart quotes more clearly?
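As an illustration of the quoting rules, the sketch below builds a book-event tweet with explicit Unicode code points, so the opening mark (U+201C) and closing mark (U+201D) cannot end up reversed. The function names are hypothetical. Placing the final period inside the closing quote for undated titles follows the punctuation note earlier in this thread.

```python
from datetime import date

LEFT_QUOTE = "\u201c"   # opening double quotation mark
RIGHT_QUOTE = "\u201d"  # closing double quotation mark
APOSTROPHE = "\u2019"   # typographic apostrophe


def work_phrase(author, title, year=None):
    """Return the author/title part of a tweet, including the final period."""
    prefix = f"{author}{APOSTROPHE}s " if author else ""
    if year is not None:
        return f"{prefix}{LEFT_QUOTE}{title}{RIGHT_QUOTE} ({year})."
    # No date: the sentence-ending period goes inside the closing quote.
    return f"{prefix}{LEFT_QUOTE}{title}.{RIGHT_QUOTE}"


def book_tweet(day, member, verb, author, title, year, card_url):
    """Compose a borrow/return tweet followed by the lending card link."""
    prolog = f"#100YearsAgoToday on {day:%A, %B} {day.day}, {day.year} at Shakespeare and Company"
    return f"{prolog}, {member} {verb} {work_phrase(author, title, year)}\n{card_url}"


print(book_tweet(
    date(1920, 6, 30), "Marquis Pagan", "returned",
    "George Bernard Shaw", "The Philanderer", 1905,
    "https://shakespeareandco.princeton.edu/members/pagan/cards/6d80f641-d632-4123-9a75-fc8ce40fc62c/#e23116",
))
```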

Revisions are available on the test site.

@jkotin

jkotin commented Jun 30, 2020

This all looks good to me! If Ian gets the AR position, we'll fix the author names in the database to follow the same rules as the member names. At that point, you'll want to use the sort name, according to the same rules as the front end.

@rlskoeser
Contributor Author

@jkotin this functionality is now on the test site with all the changes we discussed.

Please check the review page to confirm the format of the tweets and that it will be sufficient for you to review the tweets ahead of time before they are posted.
https://test-shakespeareandco.cdh.princeton.edu/events/100-years-review/

If there are scenarios you want to check that aren't showing up in the next four weeks (i.e., multiple authors, editors but no author, titles with no year) please provide a few dates for events you'd like to check — I can run a command line report that will display the tweets for any specified date and share the output with you.

The test site is configured to tweet using a test bot account that I've set to private. If you request to follow I will grant access so you can see how the new tweets look, and more easily compare them with the existing account. https://twitter.com/ShakesCo100test

@jkotin

jkotin commented Jul 2, 2020

This looks good. I can't think of scenarios to test. Two sets of questions:

  1. What will be the workflow for this moving forward? When will it replace the current 100years twitter, and what will be the review process? Would it make sense to review a batch (three months?) of tweets just in case there are odd scenarios that arise? I don't know the data well enough to predict odd events.

  2. Does it make sense to think about a controlled list of hashtags to add to the tweets? E.g. if one of these 20 authors is mentioned, use hashtag -firstnamelastname?

@rlskoeser
Contributor Author

@jkotin I manually configured the test site to show 3 months of data instead of 4 weeks so you can look and see if there are any odd cases that aren't being handled correctly. (The report is currently including borrow returns that are outside of that date range; let me know if that's confusing; I can clean it up, but wasn't sure it mattered). I don't see any examples with multiple authors or no authors and editors — but I did include those in my unit tests following your guidelines.

I thought it would be slower with that much more data, but it doesn't seem too bad — so we could change it to show more than 4 weeks if you prefer. Let me know what you'd like.

To answer your questions:

  1. My plan is for this code to go into production with the other small updates sometime this week. I'd like to run the test twitter account from production for a few days first to make sure everything is working (it took me a few tries to get it configured properly on the test server and I want to leave time for any problems). Once it's tweeting reliably to the test twitter account, it's a simple configuration change to switch it to the main 100 years account. This means we could switch over as soon as sometime mid/late next week, but we don't have to switch that soon (we should be in touch with Ara about our plans before August, though!). As for review: I built the review page to try to mimic the current process. My thought was that you should review regularly and have a workflow to address any errors. I'm hoping that problems found in review will largely be fixable via data corrections (e.g. the Lord Dunsany thing you pointed out). If there are occasional bug fixes in the twitter code due to oddities or cases we didn't account for, it should be OK as long as you identify them far enough in advance for me to address.

  2. I'd prefer not to complicate the code any further at this point, and I'm not sure how much hashtags help anyway — famous people are pretty findable by search without them. I would consider adding them later if someone has evidence that they make a significant impact. (Personally, I suspect/hope that my updated tweet content, which links to lending cards for book events instead of just the member page, and thus includes the card preview, will make a bigger impact.)
