Bluesky: support at:// synd links #1579

Closed
snarfed opened this issue Oct 25, 2023 · 19 comments · Fixed by #1588

Comments

@snarfed
Owner

snarfed commented Oct 25, 2023

Interesting surprise, two of the early beta testers who signed up to try the new Bluesky support use at:// URIs as their synd links, not bsky.app links: #1453 (comment) . @JoelOtter those should be pretty straightforward to support too, right?

@snarfed
Owner Author

snarfed commented Oct 25, 2023

Specifically I think we just need to implement Bluesky.canonicalize_url to accept and return at:// URIs. Example:

bridgy/flickr.py

Lines 87 to 95 in 71225d9

def canonicalize_url(self, url, activity=None, **kwargs):
  if not url.endswith('/'):
    url = url + '/'
  if self.username:
    url = url.replace(f'flickr.com/photos/{self.username}/',
                      f'flickr.com/photos/{self.key_id()}/')
    url = url.replace(f'flickr.com/people/{self.username}/',
                      f'flickr.com/people/{self.key_id()}/')
  return super().canonicalize_url(url, **kwargs)

Called from here:

for hentry in (item for item in mf2['items']
               if 'h-entry' in item['type']):
  usynd = hentry.get('properties', {}).get('syndication', [])
  if usynd:
    logger.debug(f'u-syndication links: {usynd}')
  syndication_urls.update(url for url in usynd
                          if isinstance(url, str))
results = _process_syndication_urls(
  source, permalink, syndication_urls, preexisting)
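
As a rough, self-contained illustration of the loop above: the sketch below runs the same filtering over made-up mf2 data, to show that an at:// URI in u-syndication passes the string check just like an https:// link would. The function name `collect_syndication_urls` and the sample values are assumptions for illustration only, not Bridgy's actual code.

```python
# Sample parsed mf2. The shape follows the snippet above; the values are made up.
mf2 = {
    'items': [
        {
            'type': ['h-entry'],
            'properties': {
                'syndication': [
                    'at://did:plc:example/app.bsky.feed.post/abc123',
                    {'value': 'not a plain string, so it is skipped'},
                ],
            },
        },
        # Not an h-entry, so the loop ignores it.
        {'type': ['h-card'], 'properties': {'name': ['Someone']}},
    ],
}


def collect_syndication_urls(mf2):
    """Return the set of string u-syndication values across all h-entries."""
    urls = set()
    for hentry in (item for item in mf2['items'] if 'h-entry' in item['type']):
        usynd = hentry.get('properties', {}).get('syndication', [])
        urls.update(url for url in usynd if isinstance(url, str))
    return urls


syndication_urls = collect_syndication_urls(mf2)
print(syndication_urls)
```

So the at:// links are collected fine at this stage; whether they are usable afterwards depends on how they are canonicalized.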

@JoelOtter
Contributor

Yep should be doable easily enough! Will do some experimentation

@JoelOtter
Contributor

JoelOtter commented Oct 28, 2023

Hmm. Have done the canonicalisation and it seems to pick up the Bluesky content OK using the discover endpoint, but doesn't identify any webmention targets.

@snarfed
Owner Author

snarfed commented Oct 28, 2023

Feel free to look at the SyndicatedPost entities in the datastore to see if they're what you expect! https://console.cloud.google.com/datastore/entities/query?project=brid-gy

@JoelOtter
Contributor

I’m running locally alas! Is there a GUI for the emulator at all?

@snarfed
Owner Author

snarfed commented Oct 28, 2023

Not any more, sadly, but you can look at them in a python shell:

# in virtualenv
env APP_ID=brid-gy python

from oauth_dropins.webutil.appengine_config import ndb_client
from bluesky import Bluesky

context = ndb_client.context()
context.__enter__()
snarfed = Bluesky.get_by_id('did:plc:fdme4gb7mu7zrie7peay7tst')
print(snarfed)

@JoelOtter
Contributor

The problem appears to be that the SyndicatedPosts get inserted with their syndication field as the at:// one, but when doing original post discovery for the backfeed it looks them up based on the post URL from Bluesky, which is the http:// one.

I'm not sure what to do here. If we were in a clean environment I guess we could just always canonicalise everything to an at:// URL, but that would presumably break all existing relationships in the DB. We could do it the other way, which wouldn't break any (working) existing data, but that strikes me as very non-future-proof.
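
To make the mismatch described above concrete, here is a minimal sketch. It uses a plain dict as a hypothetical stand-in for the stored SyndicatedPost entities (the real ones live in the datastore), and the DID, record key, and original URL are made up.

```python
# Hypothetical stand-in for stored SyndicatedPost rows:
# syndication URL -> original post URL.
stored = {
    'at://did:plc:example/app.bsky.feed.post/abc123': 'https://example.com/my-post',
}


def find_original(syndication_url):
    """Look up the original post by exact syndication URL, as a string match."""
    return stored.get(syndication_url)


# Backfeed looks the post up by the URL Bluesky reports, which is the web one.
post_url_from_bluesky = 'https://bsky.app/profile/did:plc:example/post/abc123'
miss = find_original(post_url_from_bluesky)

# The same post is only found under the form it was stored with.
hit = find_original('at://did:plc:example/app.bsky.feed.post/abc123')

print(miss, hit)
```

Both strings name the same post, but the lookup is an exact match, so the two sides have to agree on one canonical form.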

@snarfed
Owner Author

snarfed commented Oct 29, 2023

Understood, that makes sense.

On the one hand, this is a Bluesky integration, not an ATProto integration, so I'm reluctant to go too deep into ATProto itself. On the other hand, at:// URIs are probably the way to go. Even after federation, we can switch to talking to the AppView and still expect to get data across all/most PDSes.

We could backfill existing SyndicatedPosts and convert their URLs, or we could just let OPD create all new ones and ignore the old bsky.app ones. I'm fine either way.

@JoelOtter
Contributor

If this would fix itself without intervention on the next crawl, I’d be fine with that. The issue, I guess, is whether it would cause duplicate webmentions to be fired.

@snarfed
Owner Author

snarfed commented Oct 29, 2023

Yes! It would effectively fix itself, by storing and using new SyndicatedPost entities with at:// URIs. It might indeed send dupe wms, but the source and target URLs should be the same, so that should be fine.

@snarfed
Owner Author

snarfed commented Oct 29, 2023

The fix might be as easy as changing Bluesky.URL_CANONICALIZER to accept both bsky.app and at:// URIs and always emit at:// URIs.

One catch is that we probably still want to use bsky.app URLs in the underlying Response.response_json AS1 objects' url property, since that ends up in human visible links that people see and click on. I haven't thought through how easy it will be to keep those different from the syndication URLs that we do OPD on. Maybe easy?...maybe not.

@JoelOtter
Contributor

Yeah, was going to say, on reflection the at:// URIs would be useless to a backfeed receiver. It would actually be pretty easy to just canonicalise everything to a bsky.app URL for now; we could use DIDs rather than handles in them, so they should be pretty solid well into the future. Federation is obviously its own fairly large problem, but I feel like we’ll have a lot to solve all in one go when that comes in anyway? Alternatively, we would possibly need to look into separating out a “silo view” and “user view” of the silo URL, but that’s a refactor I wouldn’t be comfortable doing myself.
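
A minimal sketch of what canonicalising to a DID-based bsky.app URL could look like, assuming post URIs of the form at://DID/app.bsky.feed.post/RKEY. The function name `to_bsky_app_url` and both regexes are hypothetical and written for illustration; this is not the code that was merged. It also does not resolve handles to DIDs, since that needs a network lookup, so a bsky.app URL that contains a handle keeps that handle.

```python
import re

# at://<DID>/app.bsky.feed.post/<record key>
AT_POST_RE = re.compile(
    r'^at://(?P<repo>did:[a-z]+:[A-Za-z0-9._:%-]+)'
    r'/app\.bsky\.feed\.post/(?P<rkey>[A-Za-z0-9._~:-]+)$')

# https://bsky.app/profile/<DID or handle>/post/<record key>
BSKY_POST_RE = re.compile(
    r'^https://bsky\.app/profile/(?P<repo>[^/]+)/post/(?P<rkey>[^/?#]+)/?$')


def to_bsky_app_url(url):
    """Return a bsky.app post URL for an at:// post URI or a bsky.app post URL.

    Returns None for anything else.
    """
    match = AT_POST_RE.match(url) or BSKY_POST_RE.match(url)
    if not match:
        return None
    return f"https://bsky.app/profile/{match['repo']}/post/{match['rkey']}"


print(to_bsky_app_url('at://did:plc:example/app.bsky.feed.post/abc123'))
print(to_bsky_app_url('https://bsky.app/profile/did:plc:example/post/abc123/'))
print(to_bsky_app_url('https://example.com/'))
```

With both inputs mapped to the same string, a syndication link written as at:// and the post URL reported by Bluesky end up matching, while links shown to people stay clickable.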

@snarfed
Owner Author

snarfed commented Oct 29, 2023

Canonicalize to bsky.app works for me!

I actually think we may be pretty ok as is for federation without any big changes, assuming we can switch all of our API requests over to the AppView (api.bsky.social) and they'll Just Work? Not 100% sure on that, but fairly confident. We'll see.

@snarfed
Owner Author

snarfed commented Oct 31, 2023

@JoelOtter
Contributor

I'm unable to get it to work for this post now that it's deployed. Bridgy appears to find the relationship now, but the responses to the post on Bluesky don't seem to trigger webmentions. Tried doing a recrawl/repoll/etc, nothing.

@JoelOtter
Contributor

@snarfed
Owner Author

snarfed commented Oct 31, 2023

Hmm! Looks like this poll found it and canonicalized the URL to bsky.app correctly: https://brid.gy/log?module=background&start_time=1698761253&key=agdicmlkLWd5ci0LEgdCbHVlc2t5IiBkaWQ6cGxjOmlvejR6dGdoZnpueDRzNXM0anhxaXF1bgw

@snarfed
Owner Author

snarfed commented Oct 31, 2023

Looks like there was only one response from someone else, a like. I clicked on its retry button in Bridgy and that finally did it. 🤷‍♂️

@JoelOtter
Contributor

I uh...forgot about that button 🤦
