Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Bluesky backfeed support #1523

Merged
merged 16 commits into from
Oct 25, 2023
Merged

Add Bluesky backfeed support #1523

merged 16 commits into from
Oct 25, 2023

Conversation

JoelOtter
Copy link
Contributor

This is the bridgy portion of the Bluesky work. It's rather simple compared to the Granary work, which this depends on along with the oauth-dropins work.

Most significantly it adds the app password submission page, which currently looks like this:

Tests to come.

@snarfed
Copy link
Owner

snarfed commented Jul 29, 2023

Argh, just realized I never sent my review comments. Sorry for the delay!

from urllib.parse import quote

logger = logging.getLogger(__name__)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very minor style nit, two line breaks.

@@ -76,7 +77,7 @@ class Item(View):
"""
source = None

VALID_ID = re.compile(r'^[\w.+:@=<>-]+$')
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to keep the :, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything should be URL encoded so there shouldn't be colons - where would they come from?

Copy link
Owner

@snarfed snarfed Aug 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really try hard to guarantee URL-encoding. Eg here's the code that generates webmention source URLs, which you :

bridgy/tasks.py

Lines 782 to 798 in 1d1fc44

# generate source URL
id = activity['id']
parsed = util.parse_tag_uri(id)
post_id = parsed[1] if parsed else id
parts = [self.entity.type, g.source.SHORT_NAME, g.source.key.string_id(), post_id]
if self.entity.type != 'post':
# parse and add response id. (we know Response key ids are always tag URIs)
_, response_id = util.parse_tag_uri(self.entity.key.string_id())
reaction_id = response_id
if self.entity.type in ('like', 'react', 'repost', 'rsvp'):
response_id = response_id.split('_')[-1] # extract responder user id
parts.append(response_id)
if self.entity.type == 'react':
parts.append(reaction_id)
return util.host_url('/'.join(parts))

URL encoding is nice, but I'm not that much of a stickler when it's not necessary for correctness. I sometimes like leaving in some characters like : when possible, even if only for aesthetics. As an example, when Bridgy supported Facebook, many of its ids included colons: https://github.com/snarfed/granary/blob/4f13ce738fabfdda11d59f96a3f6878df2233390/granary/facebook.py#L2427-L2447

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see your format_for_source_url change. OK! We may still get colons in other silos' ids, but at least not Bluesky's.

bluesky.py Outdated
return self.gr_source.user_url(self.name)

def format_for_source_url(self, key):
"""Bluesky keys (AT URIs) contain slashes, so must be double-encoded."""
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, why double encoded? What breaks if this is just once with safe=''?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is due to an issue in Flask (via WSGI) wherein URL-encoded slashes are treated as slashes, which breaks routing pallets/flask#900

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woowww that's unfortunate. Could you add that explanation and link to the docstring, including that we don't have to decode them correspondingly?

handlers.py Outdated
@@ -222,6 +223,10 @@ class Post(Item):
def get_item(self, post_id):
posts = None

# Bluesky IDs need to be URL-decoded.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor, could these all be unified into dispatch_request, eg maybe in the existing loop over kwargs on line 136?

models.py Outdated
@@ -483,6 +486,19 @@ def edit_template_url(self):
"""
raise NotImplementedError()

def format_for_source_url(self, key):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit, s/key/id/ here and below?

<br />
<p class="row big">Log into Bluesky with your app password.</p>

<p>App passwords can be generated from your Bluesky settings, and revoked at any time.</p>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor, center these with class="row"?

also, ugh, they don't really have user-facing docs for app passwords. mind linking this to https://bsky.app/settings/app-passwords ?

Suggested change
<p>App passwords can be generated from your Bluesky settings, and revoked at any time.</p>
<p class="row"><a href="https://bsky.app/settings/app-passwords">App passwords can be generated from your Bluesky settings</a>, and revoked at any time.</p>

<p>You should generate a new app password just for Bridgy.</p>

<form id="app-password" class="row big" action="/bluesky/callback" method="post">
<input type="text" name="username" placeholder="username" />
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind if we change username to handle everywhere user-facing? Hopefully avoids confusion for people with custom domain handles.

@snarfed
Copy link
Owner

snarfed commented Aug 2, 2023

(this is re: #1453)

@JoelOtter
Copy link
Contributor Author

@snarfed Think I've addressed all comments - just needs tests added and the Granary change merged for it to pass on CI. I'm having a couple of tests fail locally and not sure why, would you be able to offer guidance?

======================================================================
FAIL: test_social_user_page_mf2 (tests.test_pages.PagesTest)
Check the custom mf2 we render on social user pages.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/joel/Code/bridgy/local/lib/python3.9/site-packages/mox3/mox.py", line 2139, in new_method
    func(self, *args, **kwargs)
  File "/Users/joel/Code/bridgy/tests/test_pages.py", line 328, in test_social_user_page_mf2
    self.assertEqual(200, response.status_code)
AssertionError: 200 != 500

======================================================================
FAIL: test_user_page_escapes_html_chars (tests.test_pages.PagesTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/joel/Code/bridgy/local/lib/python3.9/site-packages/mox3/mox.py", line 2139, in new_method
    func(self, *args, **kwargs)
  File "/Users/joel/Code/bridgy/tests/test_pages.py", line 418, in test_user_page_escapes_html_chars
    self.assertEqual(200, resp.status_code)
AssertionError: 200 != 500

@snarfed
Copy link
Owner

snarfed commented Sep 4, 2023

Awesome! I'll look tomorrow morning. For the tests, try running just one of them with -v, eg python -m unittest -v -k test_social_user_page_mf2, you should see the stack trace where it's crashing along with full logging.

@JoelOtter
Copy link
Contributor Author

Awesome! I'll look tomorrow morning. For the tests, try running just one of them with -v, eg python -m unittest -v -k test_social_user_page_mf2, you should see the stack trace where it's crashing along with full logging.

This worked great, thanks!

@snarfed
Copy link
Owner

snarfed commented Sep 4, 2023

This and the granary change are both looking great! You're right, the last remaining thing I'd probably look for is new tests. Not necessarily the most fun part, but I pretty much always find more bugs when I write them. Most new tests would probably go into granary; some here too, but mostly just logic, eg the URL-encoding/decoding in handlers.py and PropagateResponse.source_url, no need to worry too much about trying to test UI stuff. Feel free to copy and paste liberally from existing tests, obviously!

@JoelOtter
Copy link
Contributor Author

@snarfed How's this for tests? More needed? I kind of followed the Mastodon tests

@JoelOtter
Copy link
Contributor Author

CI errors appear to be hitting that old version of lexrpc again 🙃

snarfed added a commit that referenced this pull request Oct 24, 2023
@snarfed
Copy link
Owner

snarfed commented Oct 24, 2023

CI errors appear to be hitting that old version of lexrpc again 🙃

Oops true, sorry about that! Hadn't fixed that in this repo yet. I've done that now, feel free to rebase and try again.

@JoelOtter
Copy link
Contributor Author

Hmm, seem to be getting the same thing:

TypeError: Client.__init__() missing 1 required positional argument: 'lexicons'

@snarfed
Copy link
Owner

snarfed commented Oct 25, 2023

Huh, true, that's odd. Looking.

@snarfed
Copy link
Owner

snarfed commented Oct 25, 2023

Ah, it wasn't installing lexrpc from github. Fixed that! I'll look at the PR itself now...

Copy link
Owner

@snarfed snarfed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking pretty ready! Couple minor nits, but nothing big. Ready to merge after those and CI pass!

bluesky.py Outdated
gr_source = gr_bluesky.Bluesky(user.get('handle'), app_password=app_password)
actor = gr_source.user_to_actor(user)
return Bluesky(id=auth_entity.key_id(),
username=auth_entity.key_id(),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should probably be handle, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying the value for username should be the Bluesky handle rather than the DID, or I should be passing a kwarg handle into the constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm being stupid here and you definitely mean the former, will change

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! 😁

bluesky.py Outdated
return Bluesky(id=auth_entity.key_id(),
username=auth_entity.key_id(),
auth_entity=auth_entity.key,
name=user.get('handle'),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and then this should be displayName?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be but I think that would actually be inconsistent with other silos? Bridgy current seems to use username rather than display name for this in Mastodon and Twitter (RIP):

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, unless this is something that changed recently? Looking at the source code it appears to be as you say, but that isn't reflected in my actual brid.gy account

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, Bridgy shows silo-specific usernames/handles in the UI, but those come from label_name(), not from the name property. I think we want name to be the human-readable name.

<span class="p-name">{{ source.label_name() }}</span>

name = ndb.StringProperty() # full human-readable name

Checking in a few other silos...yup, name should be human name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I'm a bit baffled by this, running it locally gives the same result, but the name field is clearly my display name, it just doesn't get used in the display, but in Bluesky it does!

image

I guess the problem is somewhere else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so I should change the implementation of label_name()?

Copy link
Owner

@snarfed snarfed Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh sorry, the missing piece is that Bluesky itself should override label_name() with whatever we want there. Other examples:

bridgy/mastodon.py

Lines 119 to 121 in 33b4a71

def label_name(self):
"""Returns the fully qualified address."""
return self.username

bridgy/github.py

Lines 74 to 76 in 33b4a71

def label_name(self):
"""Returns the username."""
return self.username

so yeah I think we want to do that and return handle.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish GitHub had auto-refresh 😆 Will get that changed!

OAUTH_START = oauth_bluesky.Start
AUTH_MODEL = oauth_bluesky.BlueskyAuth
URL_CANONICALIZER = util.UrlCanonicalizer(
domain=GR_CLASS.DOMAIN,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes me think we need to support multiple domains in UrlCanonicalizer, so we can canonicalize staging.bsky.app => bsky.app. Obviously not needed in this PR though!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh yeah federation/multiple PDSs is just not something I've thought about yet, likely to be a headache but hoping we can use a lot of the same stuff Mastodon does?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh this isn't federation, that's just the main official PSS's old hostname, there are still syndication links to it out there in the wild, eg on my site 😁

Agreed though, let's definitely punt on federation itself for now!

bluesky.py Outdated

@classmethod
def button_html(cls, feature):
"""Override oauth-dropins's button_html() to not show the instance text box."""
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit: update docstring? no instance text box, right?

@snarfed
Copy link
Owner

snarfed commented Oct 25, 2023

Oh, one last small thing, could you center the handle + app password text boxes on the login page?

@snarfed
Copy link
Owner

snarfed commented Oct 25, 2023

Oh btw, I tried running this locally, and login and poll worked great! Haven't tried propagate yet, but still, good sign!

@JoelOtter JoelOtter requested a review from snarfed October 25, 2023 20:17
@JoelOtter
Copy link
Contributor Author

Should be all addressed now!

@snarfed
Copy link
Owner

snarfed commented Oct 25, 2023

Woo, looks great! Thanks for your patience. Merging and deploying! Can't wait to try it in prod!

Next step, docs? Not much, only a few sections on https://brid.gy/about and in the README that call out specific silos. Then maybe a blog post? You should get to do the honors!

@snarfed snarfed merged commit 9e7e489 into snarfed:main Oct 25, 2023
4 checks passed
@snarfed
Copy link
Owner

snarfed commented Oct 25, 2023

Background service isn't starting up ok. 😕 Not sure why, I'm guessing not related to this PR, maybe a bad new dependency that we upgraded since deploying last a couple weeks ago. Looking now...

@snarfed
Copy link
Owner

snarfed commented Oct 25, 2023

Startup command is gunicorn --workers 1 --threads 10 --timeout 600 -b :$PORT background:app, that runs fine locally for me on macOS. Hrm.

@snarfed
Copy link
Owner

snarfed commented Oct 25, 2023

Same build config for both services' containers, and the default (ie frontend) service is up and serving fine. Can't tell why background isn't. And I'm not seeing any useful logs, it just closes the /_ah/start request's connection after 5s, which makes the watchdog terminate it. Hrm.

@snarfed
Copy link
Owner

snarfed commented Oct 25, 2023

Rolling back for now, still investigating.

@snarfed
Copy link
Owner

snarfed commented Oct 25, 2023

Aha, background.yaml is still on runtime: python39, missed upgrading that to 3.11 a while back. Fingers crossed that fixes it...

@snarfed
Copy link
Owner

snarfed commented Oct 25, 2023

It's alive, it's aliiiive!

@JoelOtter
Copy link
Contributor Author

Next step, docs? Not much, only a few sections on https://brid.gy/about and in the README that call out specific silos. Then maybe a blog post? You should get to do the honors!

Very happy to write some docs. :) What would a blog post look like, do you mean on my own blog?

@snarfed
Copy link
Owner

snarfed commented Oct 26, 2023

Sure, at least if you want! Just announcing the launch, and maybe syndicated to https://news.indieweb.org/en . Totally optional, of course.

@JoelOtter
Copy link
Contributor Author

Cool! I'm guessing the indie web way ™️ would be to write it on my own site, do a u-syndication to news.indieweb, do a u-mention-of to yourself, then send the webmentions accordingly?

@snarfed
Copy link
Owner

snarfed commented Oct 26, 2023

u-syndication to IndieNews, yes! u-mention-of to...me? Not necessary but OK if you want! Except I don't think u-mention-of exists, mentions are just links, no need to mark up with any specific mf2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants