Bluesky: publishing #1580

Closed
JoelOtter opened this issue Oct 26, 2023 · 62 comments
@JoelOtter
Contributor

We only have backfeed for bsky currently - will use this issue to track publishing work.

@JoelOtter
Contributor Author

Trying to work out how to actually turn this on. Setting CAN_PUBLISH = True and adding Bluesky to the list of sources in publish.py doesn't seem to be enough. I think it's something to do with adding 'publish' to the source's features field but I'm not sure when to do that when Bluesky doesn't actually use auth scopes at all. Do you think it's OK to just add this feature to the source at initial login, and if so, where's the best place to do that? Is it in bridgy or in oauth-dropins?

@snarfed
Owner

snarfed commented Oct 31, 2023

Yes! publish in features is it. And good question re scopes. I guess we could make it a Bridgy-internal toggle that users can opt into, but I'm also fine with always enabling it, at least for now. I think you'd just hard code feature here to 'listen,publish':

return super().button_html(feature, form_method='get', **kwargs)

More:

bridgy/models.py

Lines 525 to 535 in 87038d3

@classmethod
def button_html(cls, feature, **kwargs):
  """Returns an HTML string with a login form and button for this site.

  Mostly just passes through to
  :meth:`oauth_dropins.handlers.Start.button_html`.

  Returns:
    str: HTML
  """
  assert set(feature.split(',')) <= set(cls.FEATURES)

Btw backfeed seems pretty stable, congrats again! I'm ready to announce it if you are. Want to do the honors?

@JoelOtter
Contributor Author

Great, thanks, I'll have a go at this :)

> Btw backfeed seems pretty stable, congrats again! I'm ready to announce it if you are. Want to do the honors?

Sure! Will write that soon- prob not today as I'm a bit under the weather but in the next few days for sure

@JoelOtter
Contributor Author

Found some energy from somewhere! Hope this is OK. https://www.joelotter.com/posts/2023/10/bridgy-bluesky/

@JoelOtter
Contributor Author

Random note: Bluesky still doesn't have proper embed support, and bsky.link appears to be dead, so I'm skipping post embeds in reply previews.

@JoelOtter
Contributor Author

Bluesky has link cards (previews), which unlike on other networks are optional, and it's up to the user which link in a post becomes the link card. I propose that to start with we just handle the last link in a post as the card, as that feels like the most common usage. Later, we might want it to be adjustable using some specific bridgy property. What do you think?

@JoelOtter
Contributor Author

Though interestingly Mastodon does actually choose the first link. I still think last is the more desired behaviour...

@snarfed
Owner

snarfed commented Nov 14, 2023

Last link sounds good! Or even skip embeds altogether to start. I'm all for starting with the simplest thing that works end to end, then expanding.
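
A rough illustration of the "last link becomes the card" idea discussed above. This is only a sketch: the function name `pick_card_link` and the deliberately simple URL regex are assumptions made for the example, not anything from Bridgy or granary.

```python
import re

# Deliberately simple URL pattern for illustration only; real linkification
# has to handle many more edge cases.
URL_RE = re.compile(r'https?://[^\s<>"]+')


def pick_card_link(text):
    """Return the last URL found in text, or None if there is none.

    Hypothetical helper: strips common trailing punctuation that the simple
    regex above would otherwise include in the match.
    """
    urls = [u.rstrip('.,;:!?)') for u in URL_RE.findall(text)]
    return urls[-1] if urls else None


post = 'Read https://example.com/a first, then https://example.com/b.'
print(pick_card_link(post))
```

Choosing the first link instead (the Mastodon behaviour mentioned above) would just mean returning `urls[0]`.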

@JoelOtter
Contributor Author

Think I've got preview pretty much working for the basic feature set. I'm a bit confused how to actually do stuff against Bluesky though. Do I need to write raw ATproto repo-write queries? Does lexrpc include a way to do this? Had a look at how bridgy-fed does it and it seems very low-level!

@snarfed
Owner

snarfed commented Nov 15, 2023

Hah, yes! I think you just call com.atproto.repo.createRecord. It's a little hidden because it's in the com.atproto lexicons, not app.bsky. https://atproto.com/blog/create-post
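
For reference, a minimal sketch of the JSON body that a `com.atproto.repo.createRecord` call carries for a plain-text post, following the shape described in the linked blog post. The helper name and the example DID are made up for illustration, and the sketch only builds the body; it would still need to be POSTed to the user's PDS at `/xrpc/com.atproto.repo.createRecord` with a bearer token from the session.

```python
import json
from datetime import datetime, timezone


def build_create_record_body(repo, text, created_at=None):
    """Build the request body for com.atproto.repo.createRecord.

    Hypothetical helper for illustration. `repo` is the account's DID (or
    handle); `created_at` defaults to the current time in UTC.
    """
    created_at = created_at or datetime.now(timezone.utc)
    return {
        'repo': repo,
        'collection': 'app.bsky.feed.post',
        'record': {
            '$type': 'app.bsky.feed.post',
            'text': text,
            # ISO 8601 timestamp with a trailing Z for UTC
            'createdAt': created_at.isoformat().replace('+00:00', 'Z'),
        },
    }


body = build_create_record_body(
    'did:plc:example',  # placeholder DID, not a real account
    'Hello from Bridgy Publish!',
    created_at=datetime(2023, 11, 15, 12, 0, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```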

@JoelOtter
Contributor Author

Hmm I'm not sure what I need to add in terms of the endpoint handling for e.g. /bluesky/publish/start, it seems very tied to the OAuth dropins in a way that I find a bit confusing given Bluesky's OAuth is fake. Any advice?

@snarfed
Owner

snarfed commented Nov 16, 2023

That is totally true. 😕 I had the same problem when I added delete for Bluesky a bit ago, maybe follow the way that works?

@JoelOtter
Contributor Author

So I'm still mildly baffled by the flow here but I think what I'm missing is functionality for the OAuth Dropins Bluesky Callback's dispatch_request function to be able to read state out of the request, and use the source ID in that to retrieve an existing user auth, if it exists. Does that sound correct? Is that actually safe to do?

@snarfed
Owner

snarfed commented Nov 17, 2023

Sounds right, but you can probably get much of it for free, let me come back when I have a bit more time to sketch something. Btw feel free to skip interactive publish/preview entirely for now if you want, I'm happy to ship just webmention-based publish, and then add on interactive afterward!

@JoelOtter
Contributor Author

Oh, I didn't clock that those were separate flows.

@snarfed
Owner

snarfed commented Nov 17, 2023

Sorry the auth stuff is so difficult btw! Obviously the non-OAuth part here is awkward, but also the auth code on the Bridgy side in general is probably over-abstracted.

@JoelOtter
Contributor Author

Random discovery is that Bluesky (presumably intentionally given a repo is supposed to be user-owned?) does not validate dates. We can publish posts at the time the source was written. https://bsky.app/profile/joelotter.com/post/3kefvqwq6g424

@JoelOtter
Contributor Author

(I published that post with Bridgy!)

@snarfed
Owner

snarfed commented Nov 17, 2023

Awesome! Congrats!

And yeah people like @DavidBuchanan314 have been all over that. The team says indexedAt is the more-trusted timestamp since it's from the relay/appview. Still, amusing!

@JoelOtter
Contributor Author

Am I right in thinking lexrpc doesn't currently support non-JSON request bodies?

@JoelOtter
Contributor Author

Alright, I've got it uploading images!

image

The next thing is facets for links and mentions. I can see that from_as1 actually has this, but it a) only does facet logic when the HTML content is identical to the text, which I don't see ever being true and b) appears to expect links and mentions to be present as tags on the AS1 object, which I don't think we currently do in vanilla bridgy?

@snarfed
Owner

snarfed commented Nov 18, 2023

So cool! Congrats! And yeah good point, lexrpc supports binary output and input (less well tested) server side, but not client side, sorry about that. Did you have to add it? (Thank you if so!)

@snarfed
Owner

snarfed commented Nov 18, 2023

Re facets, afaik we've never actually used the AS1 => Bluesky facet code for anything real yet, so yeah I can believe it's untested (at least in real situations) and likely broken in some ways.

For HTML content though, yes, that's intentional, I probably should have added a comment. I did the conversion from AS1 tags since it was easy, but for HTML, I didn't want to try yet since it seemed harder, and I didn't have a use case yet. I guess we do now!

We figured out lots of these details for publishing to Twitter, hopefully those decisions work here too. Eg we parse out mf2 and use it for explicit structural elements like media, and otherwise just run source.html_to_text and use its output as is. For Bluesky facets specifically, I don't actually think we have any precedent in Bridgy publish for @-mentions or links. I guess we could try! I'd be inclined to think about it after we ship a first version with just plain text though.

(Fwiw we do end up with AS1-style tags with indices in granary/Bridgy when we convert from some silos, eg tweets from Twitter with mentions, links, etc. Doesn't really matter here though!)

@JoelOtter
Contributor Author

I ended up just implementing upload with requests directly - I find lexrpc quite hard to reason about internally due to the codegen stuff so I might not be the best person to try and add a whole additional class of input data!

RE facets, that makes sense. I think the linkify stuff we use for preview might be of some help here, though there will assuredly be some weird nuance due to unicode shenanigans.

Are we able to ship the micropub stuff without showing any publishing UI? If so that might work well, otherwise I'd worry about users seeing a broken-ish feature.
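
On the "unicode shenanigans" point: Bluesky facets index into the UTF-8 encoded bytes of the post text, not into characters, so offsets drift as soon as the text contains non-ASCII characters. A minimal sketch of link facets under that assumption, with a simplistic URL regex and a hypothetical function name (this is not granary's implementation):

```python
import re

# Simplistic URL pattern, for illustration only.
URL_RE = re.compile(r'https?://[^\s<>"]+')


def link_facets(text):
    """Return app.bsky.richtext.facet objects for each URL in text.

    Hypothetical helper. byteStart/byteEnd are offsets into the UTF-8
    encoding of text, which differ from str indices for non-ASCII text.
    """
    facets = []
    for match in URL_RE.finditer(text):
        start = len(text[:match.start()].encode('utf-8'))
        end = start + len(match.group(0).encode('utf-8'))
        facets.append({
            '$type': 'app.bsky.richtext.facet',
            'index': {'byteStart': start, 'byteEnd': end},
            'features': [{
                '$type': 'app.bsky.richtext.facet#link',
                'uri': match.group(0),
            }],
        })
    return facets


text = 'Caf\u00e9 menu: https://example.com/menu'
facet = link_facets(text)[0]
# The URL starts at character 11 but at byte 12, because "é" is two bytes.
print(text.index('https'), facet['index'])
```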

@JoelOtter
Contributor Author

nvm I got over my fear snarfed/lexrpc#5

@snarfed
Owner

snarfed commented Nov 18, 2023

> Are we able to ship the micropub stuff without showing any publishing UI? If so that might work well, otherwise I'd worry about users seeing a broken-ish feature.

Good point! I guess I more meant merge than ship, ie we could merge this with just webmention support first, to keep the PR(s) manageable. We can definitely ship that first too, but yeah we'd want to hide the interactive form part of the publish UI if we do.

@JoelOtter
Contributor Author

Apologies @snarfed, need to focus on some non-Bridgy stuff for a bit that I've been neglecting! I've put my WIP in the bsky-publish branch on my fork of Granary, which is where basically all of the work is: https://github.com/JoelOtter/granary/tree/bsky-publish

I mean to come back to this in two or three weeks but if you'd like to race ahead please don't hold back on my account!

@snarfed
Owner

snarfed commented Nov 27, 2023

No worries! Totally ok, no obligation. I probably won't work on it much myself in the meantime, but I'll keep you posted. Best of luck with the rest!

@snarfed
Owner

snarfed commented Jan 22, 2024

@JoelOtter just FYI I'm starting to look at this again. Let me know if you have any time freed up and you're interested in working on it with me!

@JoelOtter
Contributor Author

Hello! I'm still a bit swamped - we're bringing our game out of Early Access at the end of February so I have a few things pulling my attention until then. From March on I'll have a lot more time to get back to this, but if you want to go ahead please do! Happy to help out with reviewing code in the interim.

@snarfed
Owner

snarfed commented Feb 2, 2024

Ahh yeah that makes sense. I caved a while back and made publish accept "external" replies like that as normal notes, so I guess we should do that here too.

@JoelOtter
Contributor Author

So if the reply is to a note we check its syndication links and reply to the matching syndicated post on the silo? It works for Masto so I guess I can just follow its implementation :)

@snarfed
Owner

snarfed commented Feb 2, 2024

Oh that already happens, in publish.py, before it gets to any Bluesky-specific code. I meant if the in-reply-to wasn't POSSEd to Bluesky, or if it's on a different silo, etc. We originally didn't allow that, but enough people wanted it that I eventually allowed it and just converted to a normal note, no in-reply-to.

@JoelOtter
Contributor Author

Ahh do you think in this case then it's because the in-reply-to is a bsky.app URL, rather than an AT URI?

@snarfed
Owner

snarfed commented Feb 2, 2024

Hmm no, bsky.app URLs should work too, I tested with those!
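
To make the bsky.app URL versus AT URI distinction concrete, here is a small sketch of mapping one to the other. The function name is hypothetical and this is not how granary does it. Note that the profile segment may be a handle rather than a DID, which would still need resolving before it could be used in a strong ref.

```python
from urllib.parse import urlparse


def bsky_url_to_at_uri(url):
    """Convert a bsky.app post URL to an at:// URI, or return None.

    Hypothetical helper. Expects URLs of the form
    https://bsky.app/profile/<handle-or-did>/post/<rkey>.
    """
    parsed = urlparse(url)
    parts = parsed.path.strip('/').split('/')
    if (parsed.netloc != 'bsky.app' or len(parts) != 4
            or parts[0] != 'profile' or parts[2] != 'post'):
        return None
    _, actor, _, rkey = parts
    return f'at://{actor}/app.bsky.feed.post/{rkey}'


print(bsky_url_to_at_uri(
    'https://bsky.app/profile/joelotter.com/post/3kkpgl7pjel2o'))
```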

@JoelOtter
Contributor Author

Ah sorry I mis-wrote - the in-reply-to is a note URL, but the syndication link on that note is a bsky.app URL.

@JoelOtter
Contributor Author

This is the note it fails for - it's got a bsky link on it now but that's cos I added it manually https://www.joelotter.com/notes/2024/02/02-celeste2/

@snarfed
Owner

snarfed commented Feb 2, 2024

Ah ok! Yeah it's very possible that the in-reply-to synd link handling isn't working for Bluesky. It's here:

bridgy/publish.py

Lines 470 to 528 in f3c0a0d

def expand_target_urls(self, activity):
  """Expand the inReplyTo or object fields of an ActivityStreams object
  by fetching the original and looking for rel=syndication URLs.

  This method modifies the dict in place.

  Args:
    activity (dict): ActivityStreams activity being published
  """
  for field in ('inReplyTo', 'object'):
    # microformats2.json_to_object de-dupes, no need to do it here
    urls = util.dedupe_urls(o.get('url') or o.get('id')
                            for o in as1.get_objects(activity, field))
    augmented = list(urls)
    for url in urls:
      parsed = urllib.parse.urlparse(url)
      # ignore home pages. https://github.com/snarfed/bridgy/issues/760
      if parsed.path in ('', '/'):
        continue
      # get_webmention_target weeds out silos and non-HTML targets
      # that we wouldn't want to download and parse
      url, _, ok = util.get_webmention_target(url)
      if not ok:
        continue
      logger.debug(f'expand_target_urls fetching field={field}, url={url}')
      try:
        mf2 = util.fetch_mf2(url)
      except AssertionError:
        raise  # for unit tests
      except HTTPException:
        # raised by us, probably via self.error()
        raise
      except BaseException:
        # it's not a big deal if we can't fetch an in-reply-to url
        logger.info(f'expand_target_urls could not fetch field={field}, url={url}', exc_info=True)
        continue
      synd_urls = mf2['rels'].get('syndication', [])
      # look for syndication urls in the first h-entry
      queue = collections.deque(mf2.get('items', []))
      while queue:
        item = queue.popleft()
        item_types = set(item.get('type', []))
        if 'h-feed' in item_types and 'h-entry' not in item_types:
          queue.extend(item.get('children', []))
          continue
        # these can be urls or h-cites
        synd_urls += microformats2.get_string_urls(
          item.get('properties', {}).get('syndication', []))
      logger.debug(f'expand_target_urls found rel=syndication for url={url} : {synd_urls!r}')
      augmented += synd_urls
    if augmented:
      activity[field] = [{'url': u} for u in augmented]
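
The syndication lookup in the excerpt above can be tried out in isolation with a simplified, standalone sketch that runs on an already-parsed mf2 dict. The function name and sample data are assumptions for illustration, and the h-cite handling is a stand-in for what `microformats2.get_string_urls` does in the real code.

```python
import collections


def syndication_urls(mf2):
    """Collect syndication URLs from a parsed mf2 document.

    Simplified stand-in for the loop in expand_target_urls: gathers
    rel=syndication links plus u-syndication properties, descending into
    h-feed children.
    """
    urls = list(mf2.get('rels', {}).get('syndication', []))
    queue = collections.deque(mf2.get('items', []))
    while queue:
        item = queue.popleft()
        types = set(item.get('type', []))
        if 'h-feed' in types and 'h-entry' not in types:
            queue.extend(item.get('children', []))
            continue
        for synd in item.get('properties', {}).get('syndication', []):
            if isinstance(synd, str):
                urls.append(synd)
            elif isinstance(synd, dict):
                # embedded h-cite: use its url property
                urls.extend(synd.get('properties', {}).get('url', []))
    return urls


sample = {
    'rels': {},
    'items': [{
        'type': ['h-entry'],
        'properties': {
            'syndication': ['https://bsky.app/profile/example.com/post/abc'],
        },
    }],
}
print(syndication_urls(sample))
```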

@JoelOtter
Contributor Author

Fab, thanks! I'll have a crack at that this weekend, will be a good way to ease back into the swing of things. :)

@JoelOtter
Contributor Author

So, I think the naive solution to support the URLs is to make this change in Granary:

diff --git a/granary/bluesky.py b/granary/bluesky.py
index 4e199afb..140b2045 100644
--- a/granary/bluesky.py
+++ b/granary/bluesky.py
@@ -571,7 +571,7 @@ def from_as1(obj, out_type=None, blobs=None, client=None):
 
     # in reply to
     reply = None
-    in_reply_to = as1.get_object(obj, 'inReplyTo')
+    in_reply_to = client.base_object(obj)
     if in_reply_to:
       parent_ref = from_as1_to_strong_ref(in_reply_to, client=client)
       reply = {

However, this kind of makes from_as1 require client, rather than it be an option. I'm not sure how best to proceed:

  • Make base_object a free function rather than on the Bluesky class, as it never uses its self parameter. I'm not sure this can work though as it's overriding a method on Source?
  • Make client a required parameter or, even scarier, put from_as1 on the Client itself.
  • Just pass the client along everywhere we can (naive solution but feels fragile)
  • Make base_object a free function and have the class method call it

I do worry I'm stumbling into doing something silly here so if you have advice I'd appreciate it!
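
The last option in the list above (a free function, with the class method delegating to it) might look roughly like this. The class bodies and the `find_base_object` name are stand-ins for illustration, not granary's actual code, and the sketch assumes `inReplyTo` is a list of `{'url': ...}` dicts.

```python
class Source:
    """Stand-in for granary's Source base class (illustration only)."""

    def base_object(self, obj):
        raise NotImplementedError()


def find_base_object(obj):
    """Free function: return the first Bluesky inReplyTo target, else {}.

    Hypothetical name. Needs no instance, so from_as1-style converters can
    call it without being handed a client.
    """
    for target in obj.get('inReplyTo', []):
        url = target.get('url', '')
        if url.startswith(('https://bsky.app/', 'at://')):
            return target
    return {}


class Bluesky(Source):
    def base_object(self, obj):
        # Still overrides Source.base_object, but just delegates.
        return find_base_object(obj)


reply = {'inReplyTo': [
    {'url': 'https://www.example.com/notes/1'},
    {'url': 'https://bsky.app/profile/example.com/post/abc'},
]}
print(Bluesky().base_object(reply))
```

The override on the class keeps existing callers working, while code without a client can call the free function directly.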

@JoelOtter
Contributor Author

I've made a PR snarfed/granary#673 with the last solution on this list, which does seem to work from my local testing.

@snarfed
Owner

snarfed commented Feb 3, 2024

Nice! Pulling from the base object instead of raw inReplyTo looks like exactly the right idea. I'll take a look soon.

@snarfed
Owner

snarfed commented Feb 5, 2024

This work ^ for incorporating syndication links is in. A few other things are remaining, eg #1661, but I'm ready to call v1 here done. I've posted an announcement at https://snarfed.org/2024-02-05_52058 . Congrats and thanks again @JoelOtter!

snarfed closed this as completed Feb 5, 2024
@JoelOtter
Contributor Author

Hmm did the syndication change roll out? I'm seeing the same error message, perhaps my fix didn't work after all...

@JoelOtter
Contributor Author

Still working locally! Not sure what's up

@snarfed
Owner

snarfed commented Feb 5, 2024

Hmm! I thought I deployed it, but maybe not? I've deployed again, feel free to retry now.

snarfed reopened this Feb 5, 2024
@JoelOtter
Contributor Author

Working now! Thank you!
https://www.joelotter.com/notes/2024/02/05-pacific-drive3/
https://bsky.app/profile/joelotter.com/post/3kkpgl7pjel2o

snarfed closed this as completed Feb 5, 2024
@miklb

miklb commented Feb 6, 2024

Congrats!

I confirmed I can post to Bluesky if I manually send the webmention using the curl provided with a preview on brid.gy but if I include https://brid.gy/publish/bluesky in a note it doesn't post. I don't have access to any logs on the current method I'm using for sending webmentions but it works fine with fed.brid.gy

Anything I'm missing? I'm really excited about this, thanks again for all of the hard work.

snarfed added a commit that referenced this issue Feb 6, 2024
@snarfed
Owner

snarfed commented Feb 6, 2024

@miklb ooh thanks for the catch! Fixed, https://brid.gy/publish/bluesky is now serving GETs, feel free to try again.

I'll also update docs soon.

@miklb

miklb commented Feb 6, 2024

@snarfed I thought a post went through last night but I posted one today and it's not syndicating. Posted fine through fed.brid.gy

Apologies, I'm sure I'm missing a step.

https://michaelbishop.me/note/1707189216

@snarfed
Owner

snarfed commented Feb 6, 2024

Hmm! I don't see a webmention in Bridgy's logs from https://michaelbishop.me/note/1707189216 yet. I do see the link to https://brid.gy/publish/bluesky on that post, maybe try sending the wm again?

https://brid.gy/bluesky/did:plc:ulxk572dnu5v56yssqyupulk#publishes shows that Bridgy Publish has posted two of your other posts to Bluesky ok.

@miklb

miklb commented Feb 6, 2024

Thanks. I'm using a Netlify plugin/function for sending webmentions so I'm flying blind.

For some reason it's only picking up the first link in the note to send a webmention for. If I only include an empty link to https://brid.gy/publish/bluesky, it sends the WM and Bridgy Publish posts to Bluesky.

I've had on my to-do list to roll-my-own sending of WM so I can get the response and update the post with syndication link so this will expedite the need. Thanks for helping me debug and cheers again on all the work.

[edit] ok there is a config on that plugin that I missed that defaults to setting the limit to first link for a url. I've bumped that up and all of my notes published to Bluesky.
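
For anyone rolling their own sender, a webmention is just a form-encoded POST with `source` and `target` parameters. A minimal sketch that builds, but does not send, such a request is below. The endpoint `https://brid.gy/publish/webmention` is an assumption based on Bridgy's publish docs rather than anything stated in this thread, and the function name is made up.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint (from Bridgy's publish docs, not from this thread).
PUBLISH_ENDPOINT = 'https://brid.gy/publish/webmention'
BLUESKY_TARGET = 'https://brid.gy/publish/bluesky'


def build_publish_webmention(source_url, target_url=BLUESKY_TARGET):
    """Build a webmention POST request asking Bridgy to publish source_url.

    Hypothetical helper. The source page itself must also contain a link to
    target_url for the webmention to be accepted.
    """
    body = urlencode({'source': source_url, 'target': target_url})
    return Request(
        PUBLISH_ENDPOINT,
        data=body.encode('ascii'),
        method='POST',
        headers={'Content-Type': 'application/x-www-form-urlencoded'},
    )


req = build_publish_webmention('https://example.com/note/1')
print(req.get_method(), req.full_url)
print(req.data.decode('ascii'))
```

Passing the request to `urllib.request.urlopen` would send it; reading the response is what makes it possible to write the syndication link back to the post.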

@miklb

miklb commented Feb 12, 2024

@snarfed would you prefer new issues for Bluesky related "issues"? My current question is about truncation. From the brid.gy log:

Invalid app.bsky.feed.post record: Record/text must not be longer than 300 graphemes"}

I'm currently using <data class="p-bridgy-omit-link" value="maybe" /> as I was debugging sending webmentions and wanted to eliminate the URL param as the culprit.
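
One conservative way to stay under the 300-grapheme limit quoted in that error, sketched with the standard library only: every grapheme cluster contains at least one code point, so text cut to 300 code points can never exceed 300 graphemes. This is an illustrative approach with a made-up function name, not what Bridgy or granary actually does. It can cut shorter than strictly necessary and may split a multi-code-point emoji.

```python
def truncate_for_bluesky(text, limit=300, ellipsis='\u2026'):
    """Truncate text to at most `limit` code points, adding an ellipsis.

    Hypothetical helper. Counting code points over-counts graphemes, so the
    result is always within a limit expressed in graphemes.
    """
    if len(text) <= limit:
        return text
    return text[:limit - len(ellipsis)].rstrip() + ellipsis


long_note = 'word ' * 100  # 500 code points
short = truncate_for_bluesky(long_note)
print(len(short), repr(short[-10:]))
```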

@snarfed
Owner

snarfed commented Feb 12, 2024

@miklb yes, new issues please! And thanks for the report!
