Zenphoto 1.6 using invalid date format for sitemaps #1369

Closed
JesseHC opened this issue Jan 5, 2023 · 13 comments

JesseHC commented Jan 5, 2023

I noticed after upgrading to 1.6 that Google Search Console was reporting errors for all URL entries, and that the problem was with <lastmod> having invalid dates.

[Screenshot: sitemap errors in Google Search Console]

[Screenshot: sitemap invalid date errors with lastmod]

It looks like the last four digits are missing a colon between them.

So
<lastmod>2021-03-24T18:38:17+0000</lastmod>
should be
<lastmod>2021-03-24T18:38:17+00:00</lastmod>

More info-
https://support.google.com/webmasters/answer/7451001#zippy=%2Cerror-list

Invalid date

Your sitemap contains one or more invalid dates. This error could be because a date is in the incorrect format, or the date itself is not valid. Dates must use W3C Datetime encoding, although you can omit the time portion. Make sure your dates match one of the following W3C Datetime formats:

2005-02-21 
2005-02-21T18:00:15+00:00

Specifying time is optional (the time defaults to 00:00:00Z), but if you do specify a time, you must also specify a time zone.

From here-
https://webmasters.stackexchange.com/questions/50440/why-is-this-date-in-my-sitemap-invalid-according-to-google
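
As a rough illustration of the rule quoted above, here is a minimal sketch of a check for the two W3C Datetime shapes Google lists. The function name `isValidLastmod` is hypothetical and not part of Zenphoto, and the check deliberately covers only the date-only form and the full date, time and zone form, not every variant W3C Datetime allows.

```php
<?php
// Hypothetical helper (an assumption for illustration, not Zenphoto code):
// accepts only the two W3C Datetime shapes quoted in the Google help text.
function isValidLastmod(string $value): bool
{
    // Date only: YYYY-MM-DD
    $dateOnly = '/\A\d{4}-\d{2}-\d{2}\z/';
    // Date and time with a mandatory zone: "Z" or an offset written as +hh:mm / -hh:mm
    $dateTime = '/\A\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:Z|[+-]\d{2}:\d{2})\z/';

    return preg_match($dateOnly, $value) === 1
        || preg_match($dateTime, $value) === 1;
}

$samples = [
    '2021-03-24T18:38:17+0000',  // offset without colon, as emitted by 1.6
    '2021-03-24T18:38:17+00:00', // offset with colon
    '2021-03-24T18:38:17Z',      // zone designator
    '2005-02-21',                // date only
];

foreach ($samples as $sample) {
    printf("%-28s %s\n", $sample, isValidLastmod($sample) ? 'valid' : 'invalid');
}
```

Only the first sample is rejected, which matches the error reported for the `+0000` form.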

Member

acrylian commented Jan 5, 2023

Thanks, I have to look, I was sure I used the DATE_ATOM constant (we have a report for wrong dates in RSS, too…): https://www.php.net/manual/en/class.datetimeinterface.php

Perhaps it was missed/lost with all the extra work when the PHP guys made locale based date time more complicated than necessary… There is also a DATE_W3C constant matching that format I think, but I don't know when this was introduced…
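
For reference, a small sketch comparing the two constants mentioned here in plain PHP, without any locale-aware formatting involved. The timestamp is taken from the report above; the UTC time zone is an assumption made for the example.

```php
<?php
// In plain PHP both constants hold the same date()-style pattern.
var_dump(DATE_ATOM === DATE_W3C);

// Assumed sample value: the timestamp from the report, interpreted as UTC.
$dt = new DateTimeImmutable('2021-03-24 18:38:17', new DateTimeZone('UTC'));

$atom = $dt->format(DateTimeInterface::ATOM);
$w3c  = $dt->format(DateTimeInterface::W3C);

echo $atom, "\n"; // 2021-03-24T18:38:17+00:00
echo $w3c, "\n";  // 2021-03-24T18:38:17+00:00
```

Both calls produce the offset with a colon, so a missing colon points to something other than these constants being applied through `DateTimeInterface::format()`.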

Member

acrylian commented Jan 5, 2023

I just quickly looked and I am using the correct constant within the sitemap plugin actually:
[Screenshot 2023-01-05 at 16:35:36]

I can only assume that the locale stuff again breaks things unexpectedly…

Member

acrylian commented Jan 5, 2023

I just looked at our own sitemaps and I am a bit confused. Google does not complain, since we only have the YYYY-MM-DD format, which is also valid:
[Screenshot 2023-01-05 at 16:50:28]

We do have the Intl extension regarding locale aware dates.

Member

acrylian commented Jan 5, 2023

Sorry, those were the news articles. For images we do have the same format you reported, but Google does not complain there either:

[Screenshot 2023-01-05 at 16:54:13]

Author

JesseHC commented Jan 5, 2023

I know it’s unlikely, but maybe Google hasn’t read the sitemap yet? They also didn’t send me any emails about the errors or notifications about them in Google Search Console. It was only when I checked under Indexing-Sitemaps in Search Console that I saw the issue.

My date format under General was set to Custom: locale_preferreddate_notime (I believe I’ve never changed that before updating to 1.6).

Changing it to “February 25, 2008” and regenerating the sitemaps keeps it with the same invalid date.

Changing it to “Preferred date representation” also keeps the date the same.

My current server settings:

Current locale setting: en_US.UTF8
PHP version: 7.4.28

The intl PHP extension is enabled:

intl
Internationalization support enabled
ICU version 69.1
ICU Data version 69.1
ICU TZData version 2021a
ICU Unicode version 13.0

Member

acrylian commented Jan 5, 2023

Google reports that all our sitemaps were read successfully: all green and no errors anywhere. All read dates were from the end of December or the beginning of January. I have now regenerated them, but that should not make any difference since they still use the same "wrong" date format.

Are you sure your dates themselves are correct?

The date format settings are for the frontend only and don't apply to the sitemap. Btw, I also checked the RSS feeds using that same format, and the W3C validator does not complain either…

Member

acrylian commented Jan 5, 2023

But note that we focus on PHP 8.1 now and no longer actively test on PHP 7. Please consider upgrading anyway, in case it makes a difference…

Author

JesseHC commented Jan 6, 2023

The errors are happening for all file entries in the sitemaps, and I know that most of the dates are correct. Some may be wrong, but not all of them. Also, this wasn’t a problem with 1.5.9, and as you can see from the Google Search Console help file I linked, the dates are indeed considered invalid. Maybe Google is not checking it in all regions yet? I know they can roll things out like that.

I checked and the date format also changed in the sitemaps from 1.5.9 to 1.6, where it changed from using the time zone designator to the four digits without the colon.

1.5.9
<lastmod>2021-03-24T18:38:17Z</lastmod>

1.6
<lastmod>2021-03-24T18:38:17+0000</lastmod>

I looked at the pubDate and lastBuildDate used by the Zenphoto RSS feed, and they follow a different specification than the sitemap. The RSS feed sticks to RFC 822, which does not appear to require a colon in the four-digit zone offset at the end.

https://cyber.harvard.edu/rss/rss.html#optionalChannelElements

https://www.ietf.org/rfc/rfc822.txt

The RSS feed is also using a different date format than the sitemap (though it didn’t change between 1.5.9 and 1.6).

The date format in the RSS feeds:

<pubDate>Mon, 02 Jan 2023 02:36:17 -0800</pubDate>
<lastBuildDate>Mon, 02 Jan 2023 02:36:17 -0800</lastBuildDate>
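
To make the difference between the two specifications concrete, here is a short sketch that formats the feed timestamp above with three of PHP's predefined constants. The fixed `-08:00` offset is taken from the feed example; which constants Zenphoto actually uses in each place is not confirmed by this thread.

```php
<?php
// Sample value taken from the feed example above, with its -08:00 offset.
$dt = new DateTimeImmutable('2023-01-02 02:36:17', new DateTimeZone('-08:00'));

$rss  = $dt->format(DATE_RSS);     // RFC 822 style: offset without colon
$atom = $dt->format(DATE_ATOM);    // W3C Datetime style: offset with colon
$iso  = $dt->format(DATE_ISO8601); // PHP's ISO8601 constant: offset without colon

echo $rss, "\n";  // Mon, 02 Jan 2023 02:36:17 -0800
echo $atom, "\n"; // 2023-01-02T02:36:17-08:00
echo $iso, "\n";  // 2023-01-02T02:36:17-0800
```

Note that `DATE_ISO8601` yields the same colon-less shape that Google rejects in `<lastmod>`, so it is not a drop-in substitute for `DATE_ATOM` or `DATE_W3C` in a sitemap.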

PHP 7.4.28 is the latest my hosting company offers. I know about it now being out-of-date, but they tend to drag their feet with updating.

Member

acrylian commented Jan 6, 2023

Yes, it was changed from 1.5.9 because of all the date changes needed for the PHP 8.1 deprecations. I am still confused why DateTimeInterface::ATOM does not create the date format with the colon, although the docs say it does. Google is quite slow at re-reading sitemaps, but I now see the same error as well. So we will investigate this.

And RSS should probably use the same constant, which was probably just forgotten. We'll review that too…

Member

acrylian commented Jan 6, 2023

It seems the Intl part is what breaks the date formatting. The format string is `Y-m-d\TH:i:sP`, where the `\` escapes the `T` so it is not interpreted as a date format character. The Intl formatting uses a totally different pattern syntax and does not understand this: there, characters that should be escaped must be wrapped in single quotes. This is currently not converted properly, and I am not sure it can be. So I will probably just use plain `date()` formatting here, since we don't need translated day or month names in this format anyway.
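
The two pattern syntaxes can be put side by side in a short sketch. The ICU pattern `yyyy-MM-dd'T'HH:mm:ssxxx` is an assumption about what an equivalent Intl pattern could look like, not the conversion Zenphoto performs; the Intl part only runs when the `intl` extension is loaded.

```php
<?php
// Sample value from the report, assumed to be UTC.
$dt = new DateTimeImmutable('2021-03-24 18:38:17', new DateTimeZone('UTC'));

// date()-style pattern: the backslash makes "T" a literal, "P" is the offset with a colon.
$phpStyle = $dt->format('Y-m-d\TH:i:sP');
echo $phpStyle, "\n"; // 2021-03-24T18:38:17+00:00

// ICU-style pattern (assumed equivalent): literals go in single quotes,
// and "xxx" is the offset written as +hh:mm.
$icuStyle = null;
if (extension_loaded('intl')) {
    $formatter = new IntlDateFormatter(
        'en_US',
        IntlDateFormatter::NONE,
        IntlDateFormatter::NONE,
        'UTC',
        IntlDateFormatter::GREGORIAN,
        "yyyy-MM-dd'T'HH:mm:ssxxx"
    );
    $icuStyle = $formatter->format($dt);
    echo $icuStyle, "\n";
}
```

Because this machine-readable format needs no translated day or month names, bypassing the locale-aware path and formatting with the `date()`-style pattern directly avoids the conversion problem altogether.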

Member

acrylian commented Jan 6, 2023

Should be fixed now. Besides the issue with the new locale-aware date function (not fixed, just bypassed by not using it here), there was another call that actually used the wrong date format constant.

Author

JesseHC commented Jan 18, 2023

Tried the updated sitemaps extension on a couple installs and it appears to work fine.

The important thing is that Google Search Console likes it…
[Screenshot 2023-01-17: Google Search Console, sitemaps read, status Success]

Not sure, at the end of these, when things are working, whether you want us to close the issue or whether it’s something you’d like to do yourself?

@acrylian
Member

Thanks for the confirmation; I have tested it myself as well. Google did read the sitemaps with the wrong dates, it just was not happy about them. Depending on the size of your site, you may not even need a sitemap, e.g. on quite small sites where Google finds all pages via the site itself.

I will just close the ticket since it is fixed now.
