IA version of post doesn't update after being resaved #878

Closed
jerclarke opened this Issue Feb 13, 2018 · 23 comments

6 participants
@jerclarke

jerclarke commented Feb 13, 2018

Steps required to reproduce the problem

(UPDATED: Previously I described the old procedure where posts were automatically sent to FB at publish time, because there was no announcement that this had been removed)

  1. Publish a post and share it to Facebook so the IA is published and shows up in Page Manager etc.
  2. Edit the post and change content
  3. Re-save the post and check the IA version

Expected Result

Previously, this worked as it should, and the post's IA version would be updated and you'd see the new content in Page Manager and in the Facebook mobile app.

You could also see it had been updated in the "Production Articles" section of "Publishing Tools" in the FB page admin.

This meant that critical updates and corrections to articles would be reflected in the IA version, and Facebook users saw the truth.

Actual Result

IA version of the post is never updated. In Page Manager/FB mobile app the content never updates.

Additionally, in "Production Articles" section of "Publishing Tools" I can see that the Last Updated time never updates.

Needless to say this is wildly horrifying as a publisher. Our authors update posts with the expectation that their updates will be seen by all visitors, and in this case very likely most of our visitors will see the uncorrected IA version.

Please confirm that this is a problem specific to us, because if it affects all users it is a disaster.

@see WP Forum Thread from Dec 2017 where a user has this same problem of IA not updating

Version Info

Please state exact versions, not latest or new.

  • Plugin version: 4.0.6
  • WordPress version: 4.9.4
  • PHP version: 7.0.23
@jerclarke

jerclarke commented Feb 15, 2018

Would love some feedback, including Works for me! from anyone. Even This never worked, what are you talking about? would let me breathe a little easier that I'm not crazy.

@mburak

Collaborator

mburak commented Feb 15, 2018

Hi @jerclarke, unfortunately automatic updates are not supported anymore out of the box because of some changes in our APIs. In order to update the article you need to use the sharing debugger and re-scrape it, otherwise the FB crawler will fetch it every 30 days. If you rely on the automatic import please use the RSS feed of this plugin instead. Thanks.

@mburak mburak closed this Feb 15, 2018

@jerclarke

jerclarke commented Feb 15, 2018

Wow. Was this mentioned in the changelog and I missed it? This is totally unacceptable to us as a deleted feature.

This plugin needs to replace this feature with something else (e.g. pinging the API) or have a warning after publishing that edits to the post won't be reflected in IA. Right now the UI strongly implies that the IA version will be up to date!

@jerclarke

jerclarke commented Feb 15, 2018

This ticket should be re-opened and addressed please. It describes an extremely important flaw in the plugin, whether the cause was derived from the plugin or from Facebook's API changes.

Bare minimum the resolution of this ticket is adding a note in the changelog.


Okay, I see now that the FAQ has been updated, and it indirectly hints at this change. And now I see how the changelog item about changing the UI text to mention sharing is related to this disaster.

But I can't imagine how no one saw this automatic updating issue as a huge problem that needed to be addressed. Do you seriously expect large publishers to tell their authors "every time you change a post, go use the scraper"?


Just looked through the Instant Articles Blog and of course there was no mention of this change.

@jerclarke

jerclarke commented Feb 15, 2018

Okay, so I found the change as it relates to this plugin, #798, which was expressed in the changelog like this:

4.0.6 (2017/12/04 00:26 +00:00)
...

  • #798 remove article rescrape code (@timjacobi)
    ...
  • #792 Change meta box and readme so user doesn’t think IAs are actually submitted to Facebook (@timjacobi)

So I guess I shouldn't have freaked out, and should have understood that this meant you had removed an enormous feature from the plugin? This seems a stretch to me. I strongly think this changelog should have been a lot more explicit about the many changes "remove article rescrape code" brings with it.

The real problem is undercommunication about the changes to the Graph API that necessitated "remove article rescrape code". Whatever that change was (I will go looking for it next) it constitutes a crippling of the IA system compared to how it worked previously, and as someone who has been using IA for years (and even did a WordCamp talk promoting the system) the echoes of this change to the Graph API are definitely something I needed to know about.

The fact that #798 didn't even include a link to the Graph API change, or any kind of justification for why this plugin no longer needed the rescrape code, is just disturbing.

My suspicion is that you didn't think of this as your problem, because the feature was already broken, right? Even though I actually hadn't updated Instant Articles for WordPress, my setup was already crippled because the API had silently stopped working behind the scenes, and as a result all our IA posts were already out of date.

So when you removed the rescrape code in #798 all you were doing was "cleaning up some useless code". But people, this is not what you were doing, you were conceding defeat on an extremely important feature without even discussing it (AFAICT) or communicating it to your users.

"remove article rescrape code" makes it sound like that code was redundant, and had been replaced earlier by something else. It sounds like you knew in advance about the Graph API changes, had two versions of your code ready, and had updated the plugin so it would gracefully switch to the new version at the right time. Then, later, you removed the old code now that the transition was complete.

Seems like it was something else: The Graph API change broke your plugin, you noticed after it was broken, then you quietly deleted the code that no longer solved the problem it was intended to solve.

This plugin is being developed by Facebook right? The Graph API change is your problem too, when it breaks this plugin, that's something you have to deal with and explain and communicate to your users.

My understanding is that Facebook has a bit of a "right hand doesn't know what the left hand is doing" problem (as do all large orgs), but this is a very disturbing example. You need to be communicating with the Graph API team about changes that will make your plugin radically change how it functions, and how many steps are required to publish up-to-date news with it.


In a comment thread on an unrelated post on the Instant Articles Open Source group where I asked about how the OG/URL Debugger system gets updated @everton-rosario said the following:

The OG cache refreshes every 7 days, so does our ingestion based on rules.
You can perform the invalidation through the Share Debugger or through a Graph API call if you need to force the cache invalidation (on the case of having an update to that article)

We later confirmed that it's every 30 days, not every 7 (that's worse), but what I want to ask about is "You can perform the invalidation through the Share Debugger or through a Graph API call if you need to force the cache invalidation."

Now, that sounds like what was happening before, but I guess this would be a different call? If the call @everton-rosario is talking about does exist, then it should be part of this plugin, and should be called whenever a published IA post is updated.

Asking users to use the URL Debugger as part of their workflow is obnoxious and inappropriate, in addition to being too complicated. Doing it automatically is the only way that Facebook can make sure you have the most up to date version of each post, and that is the only way you can avoid being a source of fake news, since news articles missing corrections and updates are a surefire way to spread false information.


Please, re-open this ticket and find a way to address the problem.

At minimum a blog post explaining the changes to how the plugin works would be totally appropriate. Put a positive spin on it if you can, but explaining in plain language how the changes affect people is the only way you can ensure people understand that it doesn't work how it used to.

If the "cache invalidation" API call mentioned by @everton-rosario doesn't exist, and we're really trapped with the URL debugger as the only option, then we need to update this plugin to state that clearly in the IA metabox. Something like "Updates to this post won't be reflected unless you go to URL Debugger and force a re-scrape"

@jerclarke

jerclarke commented Feb 15, 2018

(Comment crossposted from #798 because this is the ticket where the development should happen)

So, now I have looked through all the Graph API changes in the changelog, going back to 2016, and none of them imply that this should be broken, that the scraping endpoint is deprecated, or that this feature should have been removed from this plugin.

So it looks like v. 2.10 - Released July 18, 2017 is the only sensible announcement that @timjacobi could have been referring to.

Here is the section that relates to scraping

URLs
POST /{url} - This node now requires a valid access token.

GET /{url} - og_object is no longer returned by default but must be included in the fields parameter.

/{url} Scraping - GET requests will no longer scrape the URL. POST /{url} must now be used to scrape the URL.

Now, this was all in the "90-Day Breaking Changes" section, which, because the release date was July 18 2017, means that this change to the /{url} node would have taken effect on October 16, 2017. This would then, logically, be the date when the code in this plugin stopped working, and about a month later the "fix" of #798 was applied to remove the code which no longer functioned. That all makes sense as a timeline, though it's still disturbing that the developers of this plugin didn't pre-empt the problem, having been given a 90 day warning, and working for the same company. Still, I won't hound you for that, I'm looking for a fix, not to blame people for doing their best.
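As a quick check of that date arithmetic, here is a minimal sketch that uses only the release date and the 90-day window quoted above. The variable names are illustrative.

```python
from datetime import date, timedelta

# Release date of Graph API v2.10, as quoted from the changelog above.
release = date(2017, 7, 18)

# "90-Day Breaking Changes" take effect 90 days after the release.
effective = release + timedelta(days=90)

print(effective.isoformat())
```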

But how does that API change mean that we can't have scraping in this plugin?!?

I can see that it broke, but why can't you fix it?

Based on reading the code, it doesn't seem like the GET/POST change is the culprit, as the code deleted from the plugin was already using POST. This leads me to believe that what broke the plugin was "This node now requires a valid access token."
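To make the suspicion concrete, here is a rough sketch of what a token-authenticated scrape request might look like. It is not the code that was removed from the plugin. The `id` and `scrape=true` parameters follow my reading of Facebook's sharing documentation rather than anything confirmed in this thread, and the function name and unversioned endpoint are assumptions. It only builds the request and does not send it.

```python
from typing import Tuple
from urllib.parse import urlencode

# Assumption: the unversioned Graph API root accepts scrape requests.
GRAPH_ENDPOINT = "https://graph.facebook.com/"


def build_scrape_request(article_url: str, access_token: str) -> Tuple[str, str, bytes]:
    """Return (method, url, body) for a POST asking Facebook to re-scrape article_url.

    Hypothetical helper: it refuses to build a request without a token,
    since the changelog says the node now requires one.
    """
    if not access_token:
        raise ValueError("a valid access token is required to scrape via POST")
    body = urlencode(
        {"id": article_url, "scrape": "true", "access_token": access_token}
    )
    return "POST", GRAPH_ENDPOINT, body.encode("ascii")
```

The returned tuple could be handed to any HTTP client. Keeping request construction separate from sending makes the token requirement easy to test without touching the network.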

So I don't even know exactly what that implies or entails, but I know that now I want to have an access token, and I want to use it to keep my posts up to date automatically the way this plugin kept them up to date before.

What is involved in getting an access token, and why can't this plugin accomplish it for us?

Was this problem discussed in a meeting that we don't have access to? Is it impossible?

It seems like this is the specific question that should have been discussed in #798, before the code was removed, or at least there should have been an explanation of the change to /{url}, along with a link to the changelog.

Looks to me like the API-based rescraping feature is still there, and this is something that can be fixed.

@pestevez

Collaborator

pestevez commented Feb 16, 2018

Hey @jerclarke - I hear you loud and clear. We are actively working on finding a solution.

@vkama

Collaborator

vkama commented Feb 16, 2018

Reopening as we are in discussion for a solution. Thanks for your input @jerclarke.

@vkama vkama reopened this Feb 16, 2018

@jerclarke

jerclarke commented Feb 20, 2018

What are the possibilities for a solution?

@everton-rosario

Collaborator

everton-rosario commented Feb 20, 2018

@jerclarke we are working on a solution, and in the next few days you should receive a PR from @diegoquinteiro.

@diegoquinteiro

Collaborator

diegoquinteiro commented Feb 21, 2018

Hi @jerclarke, thanks for raising this issue and apologies for the trouble it caused to you.

The current workaround requires you to manually invalidate the scrape for every post you update, and I acknowledge that this is not ideal. Admittedly, also, the communication on the plugin is not clear about this behavior.

Here's how I plan to fix this problem:

  • Add an advanced option on the settings screen allowing you to paste a long-lived Access Token.
  • Automatically call the scrape invalidation API using such Access Token for every post update.
  • Display a message on updates warning users who have not configured the Access Token that the scrape is stale.
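
The three points above could be sketched roughly as follows. This is only an illustration of the planned behaviour, not the eventual pull request: the function name, the `invalidate` callback and the warning text are all hypothetical, and Python is used purely for brevity.

```python
from typing import Callable, Optional

# Hypothetical wording; the real message would be decided in the PR.
STALE_SCRAPE_WARNING = (
    "No Access Token configured: Facebook may keep showing the previous "
    "version of this article until it is re-scraped."
)


def handle_post_update(
    article_url: str,
    access_token: Optional[str],
    invalidate: Callable[[str, str], None],
) -> Optional[str]:
    """Run on every post update.

    If a token is configured, call the scrape invalidation callback and
    return None. Otherwise return a warning for the editor to see.
    """
    if access_token:
        invalidate(article_url, access_token)
        return None
    return STALE_SCRAPE_WARNING
```

Passing the invalidation call in as a callback keeps the decision logic separate from the actual API request, so either part can change without affecting the other.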

I'll work on a Pull-Request this week and add you as one of the reviewers so you can follow it up.

Thanks one more time for the detailed diagnostic and for helping us improve our plugin.

@jerclarke

jerclarke commented Feb 22, 2018

Thank you @diegoquinteiro, this sounds like a good solution! Will read up on the long-lived access tokens tomorrow 👍

@jerclarke

jerclarke commented Feb 23, 2018

Alright, I read the article about access tokens. I have two main questions:

How will plugin users get the long-lived access tokens if it is a text field? The article implies it would be done through the server, so it seems like something the plugin could and should handle for us in the background.

Can this be done using a single FB app for multiple sites? The old system for this plugin required a FB app for every site, which was a very big pain. If it's possible for this setup to enable an organization to use one app for a variety of related setups (the same way we can use one FB page for multiple integrations) that would be much better.

@jerclarke

jerclarke commented Mar 5, 2018

Excited to hear any updates about this.

@jerclarke

jerclarke commented Mar 6, 2018

For anyone with authors that need to be informed about this bug, this article I wrote for our editors should be helpful so they can understand how the Sharing Debugger can overcome the limitation in the plugin.

Facebook Instant Articles: Thou shalt re-scrape thine articles after important edits

@everton-rosario

Collaborator

everton-rosario commented Mar 14, 2018

@jerclarke PR #904 provides a functional setup that you can start using; please let us know how your tests go.

It is still a work in progress: I need to handle misconfigured setups better, because at the moment no message is shown for that kind of setup problem. If you configure it with the correct app ID, app secret and page access token, it works perfectly.

You can generate your page access token from the Graph API explorer (so you don't need to implement nor have the full login flow into your CMS system):
https://developers.facebook.com/tools/explorer/

@jerclarke

jerclarke commented Mar 14, 2018

Okay I will test this and see what happens.

Can you describe the expected workflow a normal user would follow for this system?

Specific instructions for what to do with the explorer?

What should I use as my "Privacy Policy URL" for my app?

How do I keep 50 sites working using this system? Are you expecting me to hand-create codes for each site whenever they go down?

@jerclarke

jerclarke commented Mar 14, 2018

Before this can be tested properly, we need a full set of instructions on how to set it up correctly, as well as your code that will actually validate the data. As-is there's 100 things that could make it broken.

To be fair, this problem totally existed with the old app-driven version of this plugin, but it needed addressing then too.

Please, step-by-step instructions for someone who hasn't used FB apps or access tokens before.

@jerclarke

jerclarke commented Mar 28, 2018

Alright, I now have the plugin set up locally with the fix/auto-update branch, so I can see the Facebook App ID, Facebook App Secret and Page Access Token fields.

Can you please reply with instructions on how to use them to get it working?

@jerclarke

jerclarke commented Apr 10, 2018

@everton-rosario Still waiting for the instructions outlining the steps that are expected to work with this patch.

In the meantime it looks like FB have changed the way tokens work:

Can you please figure out and explain here how this will affect the patch and proposed system for having up to date IA versions of posts?

How long will our tokens be able to last without intervention?

Who will have to intervene when/if the token dies?

How will users know if their token has died?

What can the plugin do to ensure the tokens die as rarely as possible? (could a wp-cron based request keep it alive?)

@everton-rosario

Collaborator

everton-rosario commented Apr 11, 2018

Sorry @jerclarke, I misunderstood and thought you were already unblocked after the last time we were in contact.
If a token stops working, a response with "Expired access token" will be returned, and you should then generate another one.

To generate one you can use the Graph API Explorer, or implement the calls yourself.
Here is an example of the flow in the Graph API Explorer:

  1. Access the tool: https://developers.facebook.com/tools/explorer/
  2. Select the application using the first button at the top right (the same application whose ID you are using in the plugin)
  3. Click the "Get token" button and select the sub-option: Page Access
  4. This should open a popup window, where you continue as your user
  5. It will now fill in the access token in the form field: "Access Token"
  6. Copy this access token and go to the Token Debugger: https://developers.facebook.com/tools/debug/accesstoken/
  7. Click on extend token
  8. Copy and paste this token back into the Graph API Explorer
  9. Call GET: /me/accounts
  10. Find your page access token
  11. Check this last copied token in the Token Debugger to confirm its expiry date.
  12. Paste this long-lived token into the Token field on the settings screen.
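
For anyone who would rather "implement the calls" than click through the Explorer, steps 7 and 9 correspond roughly to two Graph API requests. The sketch below only builds the URLs and sends nothing. The `fb_exchange_token` parameters are taken from my understanding of Facebook's access token documentation and are not confirmed in this thread; the function names and the unversioned host are assumptions.

```python
from urllib.parse import urlencode

# Assumption: unversioned Graph API host.
GRAPH = "https://graph.facebook.com"


def exchange_token_url(app_id: str, app_secret: str, short_lived_token: str) -> str:
    """URL for swapping a short-lived user token for a long-lived one (step 7)."""
    query = urlencode(
        {
            "grant_type": "fb_exchange_token",
            "client_id": app_id,
            "client_secret": app_secret,
            "fb_exchange_token": short_lived_token,
        }
    )
    return f"{GRAPH}/oauth/access_token?{query}"


def page_tokens_url(long_lived_user_token: str) -> str:
    """URL for listing pages and their page access tokens (step 9)."""
    query = urlencode({"access_token": long_lived_user_token})
    return f"{GRAPH}/me/accounts?{query}"
```

Both requests are plain GETs. Because the exchange call includes the app secret, it belongs on a server and not in a browser.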
@jerclarke

jerclarke commented Apr 11, 2018

Thanks @everton-rosario for this answer to how to get a token to put into the plugin.

Do you think this is the right number of steps for plugin users to follow?

I would appreciate answers to my other questions I asked, both for general information and in light of the recent changes to the API:

Can you please figure out and explain here how this will affect the patch and proposed system for having up to date IA versions of posts?

How long will our tokens be able to last without intervention?

How will WP authors know if the token for their site has died?

Who will have to intervene when/if the token dies?

What can the plugin do to ensure the tokens die as rarely as possible? (could a wp-cron based request keep it alive?)

@everton-rosario

Collaborator

everton-rosario commented Apr 12, 2018

I don't think this is the best solution or the right number of steps. This is just a temporary solution; we are working on the final one, which will require far fewer steps, but it is more complex and unfortunately taking longer to get done.

How long will our tokens be able to last without intervention?

The tokens you generate this way should never expire, unless anything else comes on top of this.

How will WP authors know if the token for their site has died?

This is still in progress in the PR, which is why it is still WIP. We are planning to add some warnings during publishing.
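
As a rough idea of how such a warning could be triggered, the sketch below checks a Graph API response body for an invalid or expired token. It is not the code in the PR. It assumes the standard Graph API error envelope, where token problems are reported as an `OAuthException` with code 190. The function name is hypothetical, and the exact error message may differ from the "Expired access token" wording mentioned earlier.

```python
import json


def token_needs_replacing(response_body: str) -> bool:
    """Return True if the response looks like an invalid/expired token error.

    Assumes the usual Graph API error shape:
    {"error": {"message": "...", "type": "OAuthException", "code": 190}}
    """
    try:
        payload = json.loads(response_body)
    except ValueError:
        # Not JSON at all, so we cannot conclude anything about the token.
        return False
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return False
    return error.get("type") == "OAuthException" and error.get("code") == 190
```

A publishing hook could call this on the scrape response and show the author a notice asking an admin to generate a new token.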

Who will have to intervene when/if the token dies?

Any developer from the app who has page_admin rights to the page and has admin rights to the WP environment.

What can the plugin do to ensure the tokens die as rarely as possible? (could a wp-cron based request keep it alive?)

Not needed, since a token generated this way is marked "never expires".
