xhtml doctype - lighthouse #9660

Closed
jcrm70 opened this issue Sep 12, 2019 · 7 comments

Comments

@jcrm70

jcrm70 commented Sep 12, 2019

Hi there!

When I check an XHTML page that starts with '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">', Lighthouse reports that the doctype is missing (something like 'doctype disappeared' or 'you don't have a doctype').

Then I read that the document should be served with the type 'application/xhtml+xml', so I configured Apache to serve that type, but Lighthouse still says that the document has no doctype.

I need the document to be XHTML so that I can use iso-8859-1, since HTML entities do not serve my purpose. I am also a little confused, because according to the W3C, XHTML is just as valid as HTML5.

Is there a known solution?
Thanks for reading.

@patrickhulce
Collaborator

The audit is specifically checking that you have a modern, HTML5 doctype while yours is XHTML 1.0. Passing this audit requires <!DOCTYPE html>.

If you have a specific use case for not using a modern doctype, then you'll want to just ignore this audit.

See https://www.w3.org/QA/2002/04/valid-dtd-list.html for a list of doctypes.
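
As a rough illustration of the distinction this audit draws, here is a minimal Python sketch. It is not Lighthouse's implementation; the `classify_doctype` helper and its three labels are hypothetical, and it simply treats anything beyond the bare name `html` as a legacy doctype.

```python
import re

# Hypothetical helper for illustration only; not Lighthouse's actual audit code.
_DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+([^>]*)>", re.IGNORECASE)


def classify_doctype(markup: str) -> str:
    """Return 'html5', 'legacy', or 'missing' for the first doctype found."""
    match = _DOCTYPE_RE.search(markup)
    if match is None:
        return "missing"
    # Assumption: any PUBLIC/SYSTEM identifier after the name marks a legacy doctype.
    body = match.group(1).strip()
    return "html5" if body.lower() == "html" else "legacy"


xhtml_strict = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)
print(classify_doctype("<!DOCTYPE html><html></html>"))
print(classify_doctype(xhtml_strict))
```

Under these assumptions the XHTML 1.0 Strict doctype from the original report lands in the legacy bucket, which is the case the audit flags.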

@jcrm70
Author

jcrm70 commented Sep 12, 2019

But who defines what is modern? You?

The W3C says that it is modern and valid, and you say that, according to your own standard, it is not, and that HTML5 is better (let's see if you can explain why it is so much better that the true standard gets skipped).

Modern? Haha, what's modern? Is it better just for having had less development time? You need to live a little longer. The W3C says it's just as good as HTML5, and you say which one is more modern? LOL

Here is the case.

There are more than 5 billion people who are not English speakers, and whose languages usually have characters not included in utf-8.

Let me explain:

I am Spanish, and in Spanish (my language) á, é, í, ó, ú and ñ are necessary to understand unambiguously what is being read.

avión <- is correct
avion <- it's a mistake

But if you search on Google, you will see that the results differ; that is, search engines distinguish between them, and with the approach you propose, search engines reward those who do NOT meet the Lighthouse requirements.

To write 'avión' well, using utf-8 you can choose to write 'avi&oacute;n', ok, but Google gives different results for 'avión', 'avion', 'avi&oacute;n', 'avi&#243;n', etc. I don't think there is a single person looking for HTML entities on Google; no one will search for 'avi&oacute;n'. The less educated will type 'avion', most will type 'avión', and nobody, not a single person, will search for 'avi&oacute;n', which is what the search engine finds when the document is written in utf-8.

This is a case where your concept of modernity is a hindrance to the rest of humanity rather than an advance, and I already mentioned that the Google search engine, for non-English languages, rewards those who use iso-8859-1, against the Lighthouse criterion. It also rewards the less educated who write without accents and tildes (which is very confusing in other languages), because their pages comply with utf-8 and so avoid the problem.

So your concept of modernity is far from that of the founders of Google, who were more useful and practical. You instead insist on getting in the way.

And yes, I know that I can use or stop using the Lighthouse audits as I see fit. It is my PC and my time, and I do not need you to remind me that I do what I want with it. As for you, you can recognize that eliminating the possibility of using iso-8859-1 is nonsense, because it has no serious or clear disadvantage compared to utf-8 (or at least none that justifies fucking everyone over), or you can refuse to recognize it and press on regardless; you can do that too.

@patrickhulce
Collaborator

Apologies, I did not intend to offend you @jcrm70, but there seems to be some confusion.

HTML5 came roughly a decade after XHTML 1.0 and, combined with utf-8, offers a significantly improved experience for non-English languages. That list of doctypes does not recommend using the old XHTML 1.0 Strict at all. In fact, W3C strongly recommends the use of <!DOCTYPE html> and utf-8 for internationalization. The purpose of this audit is in fact to ensure consistent rendering and parsing of your document around the globe and across browsers, not to hinder humanity with English-only websites.

To write 'avión' well, using utf-8 you can choose to write 'avi&oacute;n', ok

Perhaps there's a misunderstanding here about utf-8. The point of utf-8 is that you no longer have to use HTML entities to express richer character sets like this; the characters are simply part of the encoding.
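
To make that concrete, here is a small Python sketch (the variable names are only illustrative). It shows that utf-8 stores 'ó' directly as bytes, and that the entity spellings are just an alternative way of writing the same character, not something utf-8 requires.

```python
import html

# "avi\u00f3n" is 'avión' with ó written as an escape so the source stays ASCII.
word = "avi\u00f3n"

# utf-8 stores ó directly as a two-byte sequence; no entity is involved.
utf8_bytes = word.encode("utf-8")

# iso-8859-1 stores the same character as a single byte.
latin1_bytes = word.encode("iso-8859-1")

# Named and numeric entities are alternative spellings that decode to the same text.
from_named_entity = html.unescape("avi&oacute;n")
from_numeric_entity = html.unescape("avi&#243;n")

print(utf8_bytes, latin1_bytes)
print(from_named_entity == word, from_numeric_entity == word)
```

A page saved as utf-8 can therefore contain the literal word, and a search engine indexing it sees the same text it would see on an iso-8859-1 page.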

the Google search engine, for non-English languages, rewards those who use iso-8859-1

This isn't true. Where did you hear this? The Google webmaster guides instruct site owners to use utf-8 wherever possible.

@jcrm70
Author

jcrm70 commented Sep 12, 2019

Yes, I know that XHTML is 20 years old; it says '1999' in the DTD URL. And ok, HTML5 dates from about 2009, about 10 years later. I think HTML5 is fine, I think it's a breakthrough, and my personal congratulations to everyone who developed it. But there are other languages that need a few more bytes to be understood, because without them they stop being understood. You can't modernize that.

But if HTML5 does not allow you to write real text in other human languages, there is the alternative of XHTML. (You say that the W3C recommends HTML5, ok, but it does not require it [you require it]; it does not say that XHTML is invalid. On the contrary, it clearly specifies that it is in use, active, and not deprecated.) So of course, I agree that HTML5 is better, ok, but this is a disadvantage with search engines for non-English languages.

The question is not whether it is XHTML or something else. The question is that you do not offer an alternative, and you eliminate, without serious arguments, the only one there was. I think it is unreasonable to invalidate a valid solution that works fine without giving any option in return, leaving the world in the dark where before there was light. To my mind, that is going backward instead of moving forward.

Do you think someone is going to search for avi&oacute;n? People will type 'avión' into the Google input, and the search engine will return the results for 'avión' that use iso-8859-1. If you use HTML entities, then 'avión' does not work: whoever searches for 'avión' will get pages in iso-8859-1 (or HTML5 pages that fail the W3C validator), and no one using utf-8 will show up in a Google search for the correct word 'avión'. You can verify this.

I am not going to come here and write you this much text just to lie to you. I am attaching an image so you can see for yourself. Note that those who use 'avi&oacute;n' (those who use utf-8) get no preference in the other results, and will never come up in a search by a normal user, who will type 'avión' or 'avion' but never 'avi&oacute;n'.

[Image: avion]

@patrickhulce
Collaborator

patrickhulce commented Sep 13, 2019

I'm not really sure how to respond @jcrm70.

  1. I don't know why you still believe that utf-8 forces you to author using HTML entities. It doesn't. Most of the search results for avión and avion are utf-8.
  2. Obviously no one searches for avi&oacute;n. Ironically, most of the search results I get for avi&oacute;n are mistakes by website developers who didn't know how to use a proper encoding.
  3. None of this discussion has actually raised an issue with following the doctype recommendation at all. If you have such a problem with utf-8, you could previously override the utf-8 default in HTML5 with <meta charset="iso-8859-1"> and/or the content-type header. Edit: the HTML Living Standard no longer allows for this, as only case-insensitive matches for utf-8 are allowed in the charset declaration.
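
The kind of encoding mistake mentioned in point 2 can be sketched in a few lines of Python. This is only an illustration, with a hypothetical `decode_as` helper: it shows what happens when the encoding a page declares does not match the bytes it actually contains.

```python
def decode_as(data: bytes, declared: str) -> str:
    """Decode bytes using the declared encoding, replacing undecodable bytes."""
    return data.decode(declared, errors="replace")


# The same word saved in two different encodings ("\u00f3" is ó).
utf8_page = "avi\u00f3n".encode("utf-8")
latin1_page = "avi\u00f3n".encode("iso-8859-1")

# Declaration matches the bytes: the word comes back intact.
matched = decode_as(utf8_page, "utf-8")

# utf-8 bytes read as iso-8859-1: ó turns into two unrelated characters.
utf8_read_as_latin1 = decode_as(utf8_page, "iso-8859-1")

# iso-8859-1 bytes read as utf-8: the lone byte is invalid and gets replaced.
latin1_read_as_utf8 = decode_as(latin1_page, "utf-8")

print(ascii(matched), ascii(utf8_read_as_latin1), ascii(latin1_read_as_utf8))
```

Either encoding can represent the word; the garbling only appears when the declaration and the saved bytes disagree.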

We won't be removing our audit to check for an HTML5 doctype. The entire point of Lighthouse is enforcing best practices to build better websites. All the documentation from all major bodies encourages this practice. If anything, your screenshots have shown why it's important to enforce, and it won't be changing.

Best of luck solving your issue.

@jcrm70
Author

jcrm70 commented Sep 13, 2019

After reading your reply, I may have misunderstood something. You say I can use meta charset iso-8859-1 with HTML5, so I must be doing something wrong. Please tell me how I can use it without errors. And why does this tell me that the only value allowed is utf-8?

[Image: iso88591_an_html5]

@nayuki

nayuki commented Jul 29, 2021

why does this tell me that the only value allowed is utf-8?

If you use <meta charset="...">, the value must be UTF-8. See MDN.

For non-UTF-8 character encodings, you can use the old syntax like this:

<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
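
To check which encoding a page actually declares, in either of the two syntaxes above, something like the following Python sketch could be used. The `CharsetFinder` class and `declared_charset` function are hypothetical names for illustration; the sketch only reads the first declaration it finds and does not replicate a browser's full encoding detection.

```python
import re
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Record the first charset declared by a <meta> element."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if attributes.get("charset"):
            # Short form: <meta charset="...">
            self.charset = attributes["charset"].strip().lower()
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # Old form: <meta http-equiv="Content-Type" content="...; charset=...">
            match = re.search(
                r"charset\s*=\s*([^\s;\"']+)",
                attributes.get("content") or "",
                re.IGNORECASE,
            )
            if match:
                self.charset = match.group(1).lower()


def declared_charset(markup: str):
    """Return the lowercased declared charset, or None if there is none."""
    finder = CharsetFinder()
    finder.feed(markup)
    return finder.charset


old_syntax = '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'
print(declared_charset('<meta charset="UTF-8">'))
print(declared_charset(old_syntax))
```

Comparing the result against "utf-8" gives a quick way to spot pages that still declare a legacy encoding.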
