Publish to Facebook from WordPress (theme: "oriental") refinements #563

Closed
tantek opened this issue Dec 3, 2015 · 13 comments · Fixed by #567

@tantek
Contributor

tantek commented Dec 3, 2015

When you use Bridgy Publish to post from a WordPress blog (with theme: "oriental") to Facebook, it posts the title, date/by line, content, "This entry was posted in ...", and the "originally published" link, like this:

Person-tag experiment
Posted on December 3, 2015 by BrynLove
So this is me tagging Alex(t,f) and let’s try something else, by tagging Tantek(t,f)
This entry was posted in Uncategorized.
(Originally published at: http://loveiswhoweare.com/?p=552)

This could be improved in two ways. FIRST, only publish the title, content, and originally published link (drop the date/by line and "This entry was posted in"):

Person-tag experiment
So this is me tagging Alex(t,f) and let’s try something else, by tagging Tantek(t,f)
(Originally published at: http://loveiswhoweare.com/?p=552)

SECOND, bold the title to distinguish it from content body:

Person-tag experiment
So this is me tagging Alex(t,f) and let’s try something else, by tagging Tantek(t,f)
(Originally published at: http://loveiswhoweare.com/?p=552)

Thanks to @BrynWolf for the suggestions!
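
For illustration, the first suggestion could look roughly like the sketch below. It assumes the post has already been parsed into a microformats2 dict; `compose_post` is a hypothetical helper, not Bridgy's actual code, and it simply ignores everything except the name, the content, and the source URL.

```python
def compose_post(entry, source_url):
    """Build post text from an h-entry's title and content only.

    `entry` is one parsed mf2 item (a dict with a "properties" key).
    compose_post is a hypothetical helper, not part of Bridgy.
    """
    props = entry.get("properties", {})
    lines = []

    # Title: first p-name value, if any.
    names = props.get("name", [])
    if names:
        lines.append(names[0].strip())

    # Body: first non-empty content value. Content entries are either
    # plain strings or dicts of the form {"html": ..., "value": ...}.
    for content in props.get("content", []):
        text = content.get("value", "") if isinstance(content, dict) else content
        if text.strip():
            lines.append(text.strip())
            break

    lines.append("(Originally published at: %s)" % source_url)
    return "\n".join(lines)


entry = {
    "type": ["h-entry"],
    "properties": {
        "name": ["Person-tag experiment"],
        "content": [{
            "html": "<p>So this is me tagging Alex(t,f)</p>",
            "value": "So this is me tagging Alex(t,f)",
        }],
    },
}
post_text = compose_post(entry, "http://loveiswhoweare.com/?p=552")
print(post_text)
```

The date/by line and the "This entry was posted in" footer never appear because nothing outside the name and content properties is read.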

@snarfed
Owner

snarfed commented Dec 4, 2015

thanks for filing! definitely worth improving.

@snarfed
Owner

snarfed commented Dec 5, 2015

you all won't be surprised to hear this, but there are tons of wordpress(.com), blogger, and tumblr themes, and they often have radically different markup and microformats. i try to make the default themes work well enough, but i don't make many guarantees about the rest, like oriental.

i can definitely take a look though! here's a snippet of the parsed mf2 from http://pin13.net/mf2/?url=http%3A%2F%2Floveiswhoweare.com%2F%3Fp%3D552 . the thing that immediately jumps out at me is that "This entry was posted in Uncategorized." is only in entry["value"], not in entry["properties"]["content"]. hmm.

{"items": [{
  "type": ["h-feed"],
  "children": [{
    "type": ["h-entry"],
    "properties": {
      "name": ["Person-tag experiment"],
      "content": [{
        "html": "&#xD;\n\t\t<p>So this is me tagging <span class=\"u-category h-card\"><a class=\"u-url p-name\" href=\"http:\/\/alexthejourno.com\/\">Alex<\/a>(<a class=\"u-url\" href=\"https:\/\/twitter.com\/alexthejourno\">t<\/a>,<a class=\"u-url\" href=\"https:\/\/www.facebook.com\/alex.campbell.usmc\">f<\/a>)<\/span> and let\u2019s try something else, by tagging <span class=\"u-category h-card\"><a class=\"u-url p-name\" href=\"http:\/\/tantek.com\/\">Tantek<\/a>(<a class=\"u-url\" href=\"https:\/\/twitter.com\/t\">t<\/a>,<a class=\"u-url\" href=\"https:\/\/www.facebook.com\/tantek.celik\">f<\/a>)<br><\/br><\/span><\/p>\n\t\t\t",
        "value": "So this is me tagging Alex(t,f) and let\u2019s try something else, by tagging Tantek(t,f)"
      }]
    },
    "value": "Person-tag experiment\r\n\t\t\t\t\r\n\t\t\tPosted on December 3, 2015 by BrynLove\t\t\r\n\t\r\n\t\tSo this is me tagging Alex(t,f) and let\u2019s try something else, by tagging Tantek(t,f)\n\t\t\t\r\n\r\n\t\r\n\t\tThis entry was posted in Uncategorized."
  }]
}],
...
}
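
One way to see which text lands only in the entry's "value" is to subtract the name and content text from it. This is a rough sketch under the assumption that the parsed entry has the shape shown above; `unmapped_text` and `normalize` are hypothetical helpers written for this example.

```python
def normalize(text):
    """Collapse all runs of whitespace into single spaces."""
    return " ".join(text.split())


def unmapped_text(entry):
    """Return text in entry["value"] that no name/content property accounts for."""
    remaining = normalize(entry.get("value", ""))
    props = entry.get("properties", {})

    known = list(props.get("name", []))
    for content in props.get("content", []):
        known.append(content["value"] if isinstance(content, dict) else content)

    # Remove each known piece once; whatever is left came from other markup.
    for piece in known:
        remaining = remaining.replace(normalize(piece), " ", 1)
    return normalize(remaining)


body = ("So this is me tagging Alex(t,f) and let\u2019s try something else, "
        "by tagging Tantek(t,f)")
entry = {
    "type": ["h-entry"],
    "properties": {
        "name": ["Person-tag experiment"],
        "content": [{"html": "<p>...</p>", "value": body}],
    },
    "value": ("Person-tag experiment\r\n\t\t\t\t\r\n\t\t\t"
              "Posted on December 3, 2015 by BrynLove\t\t\r\n\t\r\n\t\t"
              + body +
              "\n\t\t\t\r\n\r\n\t\r\n\t\tThis entry was posted in Uncategorized."),
}
leftover = unmapped_text(entry)
print(leftover)
```

For this entry the leftover is the date/by line plus the "This entry was posted in" footer, which is exactly the text the issue asks to drop.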

@snarfed
Owner

snarfed commented Dec 5, 2015

ooh, actually, mf2py parses this way differently from php-mf2 (above)! cc @kartikprabhu @kylewm. here are the two logs. the parsed mf2 doesn't actually have entry["properties"]["content"] at all, just entry["properties"]["name"], which looks like it matches php-mf2's.

{"items": [{
  "type": ["h-entry"],
  "properties": {
    "name": ["New person-tag test\n\r\n\t\t\tPosted on December 4, 2015 by BrynLove \n\n\nWhen at first you don\u2019t like the way it looks, bugTantek \u00c7elik(t,f) and Ryan Barret(t).\n\n\r\n\t\tThis entry was posted in Uncategorized."]
  },
  ...
}

here's the HTML:

<article id="post-552" class="post-552 post type-post status-publish format-standard hentry category-uncategorized kind-">
  <header class="entry-header">
    <h1 class="entry-title">Person-tag experiment</h1>
        <div class="entry-meta">
      Posted on <a href="http://loveiswhoweare.com/?p=552" title="7:43 pm" rel="bookmark"><time class="entry-date" datetime="2015-12-03T19:43:59+00:00" pubdate>December 3, 2015</time></a><span class="byline"> by <span class="author vcard"><a class="url fn n" href="http://loveiswhoweare.com/?author=1" title="View all posts by BrynLove" rel="author">BrynLove</a></span></span>    </div><!-- .entry-meta -->
  </header><!-- .entry-header -->

  <div class="entry-content">
    <p>So this is me tagging <span class="u-category h-card"><a class="u-url p-name" href="http://alexthejourno.com/">Alex</a>(<a class="u-url" href="https://twitter.com/alexthejourno">t</a>,<a class="u-url"
href="https://www.facebook.com/alex.campbell.usmc">f</a>)</span> and let&#8217;s try something else, by tagging <span class="u-category h-card"><a class="u-url p-name" href="http://tantek.com/">Tantek</a>(<a class="u-url" href="https://twitter.com/t">t</a>,<a class="u-url"
href="https://www.facebook.com/tantek.celik">f</a>)<br />
</span></p>
      </div><!-- .entry-content -->

  <footer class="entry-meta">
    This entry was posted in <a href="http://loveiswhoweare.com/?cat=1" rel="category">Uncategorized</a>.  </footer><!-- .entry-meta -->
</article><!-- #post-552 -->

@snarfed
Owner

snarfed commented Dec 5, 2015

bridgy is on mf2py 0.2.6, from 2015-05-06. the changes since then look pretty minor: https://github.com/tommorris/mf2py/blob/master/CHANGELOG.md

@snarfed
Owner

snarfed commented Dec 5, 2015

i think we have enough now to punt to the mf2py people, either a new bug or reopening kartikprabhu/mf2py#45.

thanks again for the report, @tantek and @BrynWolf!

@kylewm
Contributor

kylewm commented Dec 5, 2015

@snarfed can you do me a favor and run

import requests, mf2py, pprint
r = requests.get("http://loveiswhoweare.com/?p=552", headers={"user-agent": "bridgy-testing"})
p = mf2py.parse(url="http://loveiswhoweare.com/?p=552", doc=r.text)
pprint.pprint(p)

in your virtualenv? I just did it here and it's definitely getting name, url, and content... 😕

@snarfed
Owner

snarfed commented Dec 5, 2015

aha, you're right. i get the same (correct) thing in my virtualenv. thanks! narrowing down...

@BrynWolf

BrynWolf commented Dec 6, 2015

Hey all,
thanks for including me on this, as it's great for me to learn (even though I now need to figure out what several of the words you guys typed mean... like "punt", which sounds like a football term...).
Hopefully I can try to implement some of your ideas and solutions myself.

Have a great Sunday. =)
Bryn

@armingrewe

the only question being, American football or real football ;-)

@BrynWolf

BrynWolf commented Dec 7, 2015

definitely American football. =)

@kylewm
Contributor

kylewm commented Dec 8, 2015

It looks like the culprit is this bit https://github.com/snarfed/bridgy/blob/master/webmention.py#L83 where it has a special-case for Tumblr's markup to augment it with mf2 classes.

It happens that this WordPress theme also matches #content > .post, so the special case replaces hentry with h-entry, which defeats backcompat parsing on the rest of the entry.
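
To make the failure mode concrete, here is a simplified model of the rule a microformats2 parser applies to a root element: if it has an `h-*` class it is parsed as mf2 only, and the classic (mf1) classes such as `hentry`, `entry-title`, and `entry-content` are considered only when no `h-*` class is present. `parsing_mode` is a hypothetical helper for this sketch, not an mf2py function, and the class swap is modeled the way it is described above.

```python
def parsing_mode(classes):
    """Return how an mf2 parser would treat an element with these classes."""
    if any(c.startswith("h-") for c in classes):
        # mf2 root: only p-*, u-*, dt-*, e-* properties inside are read.
        return "mf2"
    if "hentry" in classes:
        # Classic root: entry-title, entry-content, etc. are mapped to mf2.
        return "backcompat"
    return None


# Class list of the <article> element from the oriental theme.
original = ("post-552 post type-post status-publish format-standard "
            "hentry category-uncategorized kind-").split()

# Model of the Tumblr special case as described: hentry becomes h-entry.
augmented = ["h-entry" if c == "hentry" else c for c in original]

print(parsing_mode(original))   # backcompat
print(parsing_mode(augmented))  # mf2
```

Once the root is treated as mf2, the theme's `entry-title` and `entry-content` classes are ignored, so the parser falls back to an implied name built from all of the entry's text.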

Some ideas:

  • Maybe there is a unique class in Tumblr pages that we could use to differentiate them?
  • Instead of .post search for .post:not(.hentry,.h-entry)
  • Only run the augment step if we first run mf2py and it doesn't find any entries on the page
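
The third idea could be sketched as below. This is a minimal illustration with stand-in functions, not the actual change: `parse_with_fallback`, `fake_parse`, and `fake_augment` are hypothetical names, and a real version would pass in mf2py-based parsing and the Tumblr special-case code instead.

```python
def parse_with_fallback(html, parse, augment):
    """Parse html as-is; augment and re-parse only if no items were found.

    parse(html)   -> dict with an "items" list
    augment(html) -> html with extra mf2 classes added
    """
    parsed = parse(html)
    if parsed.get("items"):
        return parsed
    return parse(augment(html))


def fake_parse(html):
    # Stand-in for a real parser: reports one item if a root class is present.
    found = "h-entry" in html or "hentry" in html
    return {"items": [{"type": ["h-entry"]}] if found else []}


def fake_augment(html):
    # Stand-in for the Tumblr special case: tag .post elements as h-entry.
    return html.replace('class="post"', 'class="post h-entry"')


augment_calls = []

def counting_augment(html):
    # Record each call so we can see when the fallback actually runs.
    augment_calls.append(html)
    return fake_augment(html)


wordpress_html = '<div id="content"><article class="post hentry">...</article></div>'
tumblr_html = '<div id="content"><div class="post">...</div></div>'

wordpress = parse_with_fallback(wordpress_html, fake_parse, counting_augment)
calls_after_wordpress = len(augment_calls)
tumblr = parse_with_fallback(tumblr_html, fake_parse, counting_augment)
```

With this ordering, a page that already parses (like the WordPress one) is never touched, and the augment step runs only for pages where the first parse comes back empty.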

@snarfed
Owner

snarfed commented Dec 8, 2015

oh wow. nice sleuthing!

i don't feel strongly about how to fix this. the last idea seems like maybe the most robust? not sure.

kylewm added a commit to kylewm/bridgy that referenced this issue Dec 8, 2015
- fetch_mf2 has a special case for Tumblr that converts #content > .post
  markup to mf2
- unfortunately, this markup also occurs outside Tumblr (e.g., WordPress "oriental" theme),
  and the augmented classes override mf1 classes that WP themes have
- so do the special-case stuff iff the parser fails to find any items
  without it.

fixes snarfed#563
@kylewm
Contributor

kylewm commented Dec 8, 2015

> the last idea seems like maybe the most robust?

oh nice, that's where I was leaning too. sounds like a winner.
