No webmentions to original URLs that include emojis #870

Closed
chrisaldrich opened this issue May 29, 2019 · 3 comments

@chrisaldrich

commented May 29, 2019

I've found a few instances in which Bridgy apparently fails to send a webmention (and/or fails to find a target) when the original URL contains one or more emoji. I suspect it's an encoding quirk of some sort. I'm fairly sure I've seen this issue before with Instagram, where it's probably more likely as a result of emoji in Instagram "titles" when using PESOS methods.

When I subsequently remove the emoji from the permalink and reprocess, Bridgy has no problem finding the URL and sending the webmention. So there's at least a "fix" on the user's side for those experiencing this issue, but only if they're aware the issue exists and have the means to apply it.

Example of failed webmention:

(I'll note that it's also got a fragment # in the URL, but don't think this is a part of the issue)
Original: https://boffosocko.com/2019/04/29/%F0%9F%93%85-virtual-homebrew-website-club-meetup-on-may-15-2019/?replytocom=262215#respond

Syndicated copy that was liked: https://twitter.com/ChrisAldrich/status/1129124049068498944#favorited-by-14591484

Bridgy Log: https://brid.gy/log?start_time=1558056830&key=aglzfmJyaWQtZ3lyTAsSCFJlc3BvbnNlIj50YWc6dHdpdHRlci5jb20sMjAxMzoxMTI5MTI0MDQ5MDY4NDk4OTQ0X2Zhdm9yaXRlZF9ieV8xNDU5MTQ4NAw

Example of previously failed webmention that ultimately went through following emoji removal:

Original: https://boffosocko.com/2019/04/29/%F0%9F%93%85-virtual-homebrew-website-club-meetup-on-may-15-2019/?replytocom=262215#respond

Syndicated copy: https://twitter.com/ChrisAldrich/status/1129124049068498944#favorited-by-19844672

Bridgy Log: https://brid.gy/log?start_time=1558714459&key=aglzfmJyaWQtZ3lyTAsSCFJlc3BvbnNlIj50YWc6dHdpdHRlci5jb20sMjAxMzoxMTI5MTI0MDQ5MDY4NDk4OTQ0X2Zhdm9yaXRlZF9ieV8xOTg0NDY3Mgw

Another potential example from Instagram

Posted via PESOS from Instagram; I'm sure it missed webmentions (though it's too far back to find the specific logs): https://boffosocko.com/2017/10/15/docteur-jerry-et-mister-love-%E2%9D%A4%EF%B8%8F%E2%9A%97%EF%B8%8F%F0%9F%91%93%F0%9F%8E%ACi-found-this-original-french-one-sheet-47-x-63-after-the-move-will-have-to-get-it-mounted-and-fram/

(Originally posted at https://boffosocko.com/2019/05/29/no-webmentions-to-original-urls-that-include-emojis/)

@snarfed

Owner

commented May 29, 2019

hah, funny. i wonder how it would handle https://🐷🔥.ws/ . thanks for filing! i'll look.

@snarfed snarfed added the now label May 29, 2019

@snarfed

Owner

commented May 29, 2019

@chrisaldrich looking at these examples, the webmention target and the bridgy page's mf2 u-like-of are both the same escaped URL, https://boffosocko.com/2019/04/29/%f0%9f%93%85-virtual-homebrew-website-club-meetup-on-may-15-2019/?replytocom=262215#respond, which seems correct.

your site is responding to the webmention POST with HTTP 400 and this JSON body:

{
  "message": "Cannot find target link",
  "code": "...",
  "data": {
    "status": 400,
    "data": {
      "comment_type": "webmention",
      "comment_date_gmt": "2019-05-17 00:00:42",
      "comment_author_IP": "107.178.194.210",
      "comment_post_ID": 55750344,
      "comment_date": "2019-05-16 17:00:42",
      "comment_approved": 0,
      "target": "https://boffosocko.com/2019/04/29/\ud83d\udcc5-virtual-homebrew-website-club-meetup-on-may-15-2019/?replytocom=262215#respond",
      "comment_author_url": "https://brid-gy.appspot.com/post/twitter/ChrisAldrich/1129168343049613317",
      "comment_meta": {
        "webmention_source_url": "https://brid-gy.appspot.com/post/twitter/ChrisAldrich/1129168343049613317",
        "webmention_target_fragment": "respond",
        "webmention_created_at": "2019-05-17 00:00:42",
        "webmention_target_url": "https://boffosocko.com/2019/04/29/\ud83d\udcc5-virtual-homebrew-website-club-meetup-on-may-15-2019/?replytocom=262215#respond"
      },
      "source": "https://brid-gy.appspot.com/post/twitter/ChrisAldrich/1129168343049613317",
      "comment_agent": "Bridgy (https://brid.gy/about) AppEngine-Google; (+http://code.google.com/appengine; appid: s~brid-gy)",
      "comment_parent": "262215",
      "comment_author_email": ""
    }
  }
}

looking at comment_post_ID: 55750344 field, the wordpress post id is indeed 55750344, so your site found the post ok. based on the Cannot find target link message and the https://boffosocko.com/2019/04/29/\ud83d\udcc5-virtual-homebrew-website-club-meetup-on-may-15-2019/?replytocom=262215#respond target URL, it seems like your site is validating the webmention by looking for that URL with \ud83d\udcc5 in the source page, but the actual URL in bridgy's source page has the escaped %f0%9f%93%85 URL instead.
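to make the mismatch concrete, here's a minimal python sketch (standard library only, not code from bridgy or the wordpress plugin) showing that the %f0%9f%93%85 in the URL and the \ud83d\udcc5 in the JSON are the same character, U+1F4C5, just in two different spellings. the variable names are made up for illustration.

```python
import json
from urllib.parse import quote, unquote

# Percent-encoded UTF-8 bytes, as they appear in the URL bridgy sent.
escaped = "%f0%9f%93%85"

# JSON string escape (a UTF-16 surrogate pair), as it appears in the
# "target" field of the 400 response body.
json_escaped = '"\\ud83d\\udcc5"'

decoded_from_url = unquote(escaped)          # percent-decoding, UTF-8 by default
decoded_from_json = json.loads(json_escaped)  # surrogate pair -> one code point

# Both spellings decode to the same single character.
same_character = decoded_from_url == decoded_from_json == "\U0001F4C5"

# Re-encoding gives uppercase hex, so a naive string comparison against the
# lowercase form would still fail even though the URLs are equivalent.
reencoded = quote(decoded_from_url)
```

so a plain substring search for the decoded form will never match a page that only contains the percent-encoded form, and vice versa.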

cc @dshanske. when there are escaped chars in a webmention target URL, the wordpress plugin should probably validate by searching for either the escaped or the unescaped URL, or for the same version of the URL that it got from the webmention client, right?
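a rough sketch of that idea in python, purely as an illustration: the wordpress plugin is PHP and this is not its code. `url_variants` and `source_links_to` are hypothetical helpers, and the substring check stands in for whatever link parsing a real receiver does.

```python
import re
from urllib.parse import quote, unquote

# Characters that delimit URL components and should stay unescaped.
SAFE = ":/?#[]@!$&'()*+,;="


def url_variants(url):
    """Return the decoded form plus upper- and lowercase percent-encoded forms."""
    decoded = unquote(url)
    encoded_upper = quote(decoded, safe=SAFE)
    encoded_lower = re.sub(
        r"%[0-9A-F]{2}", lambda m: m.group(0).lower(), encoded_upper
    )
    return {url, decoded, encoded_upper, encoded_lower}


def source_links_to(source_html, target):
    """Hypothetical check: does the source mention any variant of the target?"""
    return any(variant in source_html for variant in url_variants(target))


# The target as the receiver decoded it, and a source page that only
# contains the lowercase percent-encoded form (as in this issue).
target = (
    "https://boffosocko.com/2019/04/29/\U0001F4C5-virtual-homebrew-website-club"
    "-meetup-on-may-15-2019/?replytocom=262215#respond"
)
source_html = (
    '<a class="u-like-of" href="https://boffosocko.com/2019/04/29/'
    "%f0%9f%93%85-virtual-homebrew-website-club-meetup-on-may-15-2019/"
    '?replytocom=262215#respond"></a>'
)
found = source_links_to(source_html, target)
```

the other option mentioned above, keeping the target exactly as the client sent it and searching for that string, avoids the normalization step entirely.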

@snarfed

Owner

commented May 31, 2019

filed pfefferle/wordpress-webmention#199 . tentatively closing; feel free to reopen.

@snarfed snarfed closed this May 31, 2019
