HackerNews links don't work #393

Closed
edent opened this issue May 13, 2023 · 21 comments · Fixed by #451

Comments

@edent
Contributor

edent commented May 13, 2023

Tried to submit https://news.ycombinator.com/item?id=35927509 to https://shkspr.mobi/blog/2023/05/the-new-zip-tld-is-going-to-cause-some-problems/

I got the error

Not enough data available

@dshanske
Collaborator

dshanske commented Jun 7, 2023

You are right, but there isn't enough data. We may have to figure out how to find some.

@edent
Contributor Author

edent commented Jun 12, 2023

Same issue on Lemmy - for example https://lemmy.one/post/78196

@pfefferle
Owner

This is the same issue as with Hacker News. Lemmy does not really use web semantics: http://php.microformats.io/?url=https%3A%2F%2Flemmy.one%2Fpost%2F78196

@dshanske
Collaborator

Anyone want to read the HTML and see if there is something we can extract of any type?

@pfefferle
Owner

@edent do lemmy and hackernews support Webmentions or have you simply tried to send them manually?

@edent
Contributor Author

edent commented Jun 12, 2023

@pfefferle I was submitting them manually to my site. That's where I saw the error.

Looking through the HTML of both HN & Lemmy, I can see the links to my site. Is there something specific I should be looking for?

@pfefferle
Owner

There is a bug in 5.0.0 that does not parse the meta-headers correctly. Maybe we wait until 5.1.0 is released to re-check.

@dshanske
Collaborator

It's still a question of what data we could extract to render a preview.

@edent
Contributor Author

edent commented Jun 13, 2023

So, if I've understood correctly, the problem is that

public function verify() {

Either can't find an author or can't find a summary.

Changing verify() so the author always returns true gets me this blank comment:
Screenshot of a blank comment

For a basic WebMention, is it necessary to have a summary? What I'd like to appear on my site is:

This article was mentioned on [WebSite Name](http://example.com)

Could the name come from the <title> or perhaps the domain name? I think that would be enough.
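The fallback suggested above can be sketched in a few lines. This is only an illustration in Python, not the plugin's PHP code: `TitleParser`, `fallback_name`, and `mention_line` are hypothetical names, and the sketch assumes the source page's HTML has already been fetched.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fallback_name(source_url, html):
    """Return the page <title> if present, otherwise the URL's host name."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    title = " ".join(parser.title.split())  # collapse runs of whitespace
    return title or urlparse(source_url).hostname


def mention_line(source_url, html):
    """Build the one-line mention text proposed in the comment above."""
    name = fallback_name(source_url, html)
    return f"This article was mentioned on [{name}]({source_url})"
```

With an empty or title-less page, `mention_line("http://example.com", "")` falls back to the host name, so the mention never ends up blank.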

@janboddez
Contributor

janboddez commented Jun 13, 2023

Sorry to chime in like this. :-S That's what IndieBlocks does, too. Set a default name (either "Anonymous" or the page's host name) and content. Then if there are microformats, we turn those into something more meaningful.

Shouldn't be too hard to do something similar here. I'd be happy to look into it, and maybe submit a PR.

What I do for "default" mentions (and replies, etc.) is, where possible, actually use the code that WP uses for pingbacks, set the comment to something like "[...] some text linking back to this article [...]" and otherwise fall back to "mentioned this" or something.

Not saying Webmention should do the same, but that could perhaps be an option, too.
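The pingback-style default described above might look roughly like the following. It is a Python sketch under stated assumptions, not the code IndieBlocks or WordPress actually uses: `LinkContextParser`, `default_comment`, and the `radius` of 60 characters are all made up for illustration, and the link is matched by exact `href` only.

```python
from html.parser import HTMLParser


class LinkContextParser(HTMLParser):
    """Flattens HTML to text and remembers where the link to `target` starts."""

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.chunks = []
        self.length = 0
        self.link_offset = None
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a" and self.link_offset is None:
            if dict(attrs).get("href") == self.target:
                self.link_offset = self.length

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)
            self.length += len(data)


def default_comment(html, target, radius=60):
    """Return "[...] text around the link [...]" or a generic fallback."""
    parser = LinkContextParser(target)
    parser.feed(html)
    parser.close()
    if parser.link_offset is None:
        return "mentioned this"
    text = "".join(parser.chunks)
    start = max(0, parser.link_offset - radius)
    end = min(len(text), parser.link_offset + radius)
    excerpt = " ".join(text[start:end].split())
    return f"[...] {excerpt} [...]" if excerpt else "mentioned this"
```

The excerpt is cut by character count, so it can start or end mid-word; a real implementation would probably trim to word boundaries.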

@dshanske
Collaborator

I think we can add something simple. I was just trying to avoid Anonymous, as it looks like less useful info. I think Hackernews may be worth a custom rule to find things.

@pfefferle
Owner

Maybe use the domain host?

@edent
Contributor Author

edent commented Jun 15, 2023

Also, LinkedIn (urgh!). Take a URL like this - https://www.linkedin.com/posts/terenceeden_why-im-using-mx-as-a-title-activity-7074702026084884480-jO0C?utm_source=share&utm_medium=member_desktop

It doesn't contain the direct link in the HTML, but it is in the JSON-LD as "sharedContent":

<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "SocialMediaPosting",
  "@id": "https://www.linkedin.com/posts/terenceeden_why-im-using-mx-as-a-title-activity-7074702026084884480-jO0C",
  "author": {
    "name": "Terence Eden",
    "image": {
      "url": "https://media.licdn.com/dms/image/C4E03AQEX11qBnAo43A/profile-displayphoto-shrink_400_400/0/1517677542848?e=1692230400&v=beta&t=-AHOoj__Ehm_p24dQ6McPbqNj9gQ2UsjvzOCIU16IGs",
      "@type": "ImageObject"
    },
    "url": "https://uk.linkedin.com/in/terenceeden",
    "@type": "Person"
  },
  "datePublished": "2023-06-14T11:00:04.600Z",
  "image": {
    "url": "https://media.licdn.com/dms/image/sync/D5627AQHWF4nuGqlfzQ/articleshare-shrink_800/0/1686397225711?e=1687428000&v=beta&t=SsQ_1b8M3xAGXnhUvID0976rpItlTLdC2Gz9RG3sXwI",
    "@type": "ImageObject"
  },
  "sharedContent": {
    "@type": "WebPage",
    "headline": "Why I'm using \"Mx\" as a title",
    "url": "https://shkspr.mobi/blog/2023/06/why-im-using-mx-as-a-title/"
  },
  "articleBody": "What title do you use when filling in forms? Mr? Mrs? Ms? Or something newer and more incluside?\n\nWhy do companies want to know my title? What are they going to do with my data?\n\nHere's why I'm using \"Mx\" as my title.\nhttps://lnkd.in/eCNH5sqp",
  "isAccessibleForFree": false,
  "hasPart": {
    "@type": "WebPageElement",
    "isAccessibleForFree": false,
    "cssSelector": ".details"
  }
}
</script>
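To illustrate how a receiver could verify a mention from a block like this one, here is a minimal Python sketch. It is not the plugin's JSON-LD parser: `JsonLdParser` and `find_mention` are hypothetical names, and it only looks at a top-level `sharedContent.url`, which is the shape shown in the LinkedIn example above.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._buffer = None
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append("".join(self._buffer))
            self._buffer = None


def find_mention(html, target):
    """Return author/headline info if any JSON-LD block shares `target`."""
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # ignore malformed blocks
        if not isinstance(data, dict):
            continue
        shared = data.get("sharedContent")
        if isinstance(shared, dict) and shared.get("url") == target:
            author = data.get("author")
            return {
                "author": author.get("name") if isinstance(author, dict) else None,
                "headline": shared.get("headline"),
                "type": data.get("@type"),
            }
    return None
```

Returning `None` when no block references the target keeps the "link must be present" check of a Webmention intact even though the link never appears as an `<a href>` in the page.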

pfefferle added a commit that referenced this issue Jul 4, 2023
@pfefferle
Owner

The JSON-LD parser had a bug, but should work with the latest PR.

@edent
Contributor Author

edent commented Jul 4, 2023

Cool! When there's a new release I'll give it a go and report back.

Thanks for all your work on this.

@edent
Contributor Author

edent commented Jul 6, 2023

@pfefferle LinkedIn is now working - thanks :-)

@pfefferle
Owner

Yeah, Hacker News is a bit tricky, because there is almost no information we can use.

@dshanske
Collaborator

There's always the API. https://github.com/HackerNews/API
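For reference, a site-specific rule built on that API could be as small as the sketch below. It is a Python illustration only: `api_url_for` and `mention_from_item` are hypothetical helpers, the sketch assumes the item endpoint documented in the linked repository (`/v0/item/<id>.json`, with `by`, `title`, and `url` fields), and the network fetch itself is left out.

```python
from urllib.parse import urlparse, parse_qs

# Item endpoint as documented in the HackerNews/API repository.
API_ITEM = "https://hacker-news.firebaseio.com/v0/item/{id}.json"


def api_url_for(item_url):
    """Map a news.ycombinator.com item URL to its API endpoint, or None."""
    parts = urlparse(item_url)
    if parts.hostname != "news.ycombinator.com" or parts.path != "/item":
        return None
    ids = parse_qs(parts.query).get("id", [])
    if not ids or not ids[0].isdigit():
        return None
    return API_ITEM.format(id=ids[0])


def mention_from_item(item_url, item):
    """Turn a decoded API item (a dict) into minimal mention fields."""
    host = urlparse(item_url).hostname
    return {
        "url": item_url,
        "author": item.get("by") or host,    # fall back to the host name
        "name": item.get("title") or host,
        "linked_url": item.get("url"),       # the story's outbound link, if any
    }
```

For the URL from this issue, `api_url_for("https://news.ycombinator.com/item?id=35927509")` yields the matching `/v0/item/35927509.json` endpoint; any other host returns `None`, so the rule stays out of the way for generic sources.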

@pfefferle
Owner

But I wouldn't see that as part of the plugin! That's too specific! 😳

Any ideas what we could add to make Hackernews at least pingable and not looking too ugly?

@edent
Contributor Author

edent commented Jul 27, 2023

Grabbing the text from the <title> element seems like the easiest solution to me. Or use the domain name as a fallback?

pfefferle added a commit that referenced this issue Jul 27, 2023
pfefferle added a commit that referenced this issue Dec 28, 2023
if properties are not set and to avoid "anonymous".

fix #393
@pfefferle
Owner

Result is now

{
  "published": {
    "date": "2023-12-28 19:41:45.864580",
    "timezone_type": 3,
    "timezone": "UTC"
  },
  "url": "https://news.ycombinator.com/item?id=35927509",
  "author": {
    "type": "card",
    "name": "news.ycombinator.com"
  },
  "name": "The new .zip TLD is going to cause some problems | Hacker News",
  "site_name": "news.ycombinator.com",
  "content": "The new .zip TLD is going to cause some problems | Hacker News",
  "response_type": "mention",
  "raw": {
    "referrer": "origin",
    "viewport": "width=device-width, initial-scale=1.0"
  }
}
