Add type: Thing > CreativeWork > WebPage > HomePage ? #2062

Closed
EnnexMB opened this issue Sep 18, 2018 · 50 comments

@EnnexMB EnnexMB commented Sep 18, 2018

The type "WebPage" has 12 subtypes (including in extensions). They include about pages, contact pages, and FAQ pages. I'm really surprised that there's no type for a homepage! What is the proper type to use for a website's homepage? Is the homepage supposed to just take on the type of the person, organization, or subject matter that the website is about? But isn't a homepage a legitimate kind of page, just like an about page or contact page? Should there be a new type for HomePage?

Pardon me if this has been discussed and rejected before. I searched issues for "homepage" and didn't find anything relevant. There was a proposal to add homepage as a property, but that's a different matter.

@WeaverStever WeaverStever commented Sep 22, 2018

@EnnexMB You might try something like this,

{
"@context": "https://schema.org",
"@type": "WebPage",
"@id": "https://example.com/home.html",
"url": "https://example.com/home.html",
"additionalType": "http://www.productontology.org/id/Home_page",
"name": "Home"
}

Probably not necessary if your landing page resolves to the root address.

Author

@EnnexMB EnnexMB commented Oct 9, 2018

The Product Ontology additionalType Home page is not relevant to my question at all. It is for use by a company that is selling a home page as a product or service (e.g., designing a home page for a customer). The webpage linked above for the additionalType Home page states:

Usage
The following shows how to model that you offer to sell [a/an/some] Home page for $ 19.99.

I am asking for a type to specify that a webpage is the homepage of a website.

Contributor

@thadguidry thadguidry commented Oct 9, 2018

@EnnexMB There is no Schema.org Type for designating a landing page (homepage) because it's typically not needed. Parsers look for this pattern
"https://" + "example.com" + "/"
and then derive the HTML content returned as the homepage or landing page.
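
As a rough illustration of the pattern described above, the following Python sketch derives the root URL that such a parser would treat as the homepage. The helper name `root_url` is hypothetical, and real crawlers may apply further rules (redirects, canonical links) that are not modelled here.

```python
from urllib.parse import urlsplit, urlunsplit


def root_url(page_url: str) -> str:
    """Return scheme + host + "/" for any page URL (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; drop the path, query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))


print(root_url("https://example.com/about/team.html?x=1"))
```

Under this assumption every page on a host maps to the same root, which is exactly the behaviour the rest of the thread questions.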

Author

@EnnexMB EnnexMB commented Oct 9, 2018

A homepage is not always the root of a domain. For example, http://yahoo.com/website could be a homepage.

If I use subtypes of WebPage to designate a website's about page, contact page, and FAQ page, how do I designate the site's homepage?

Contributor

@thadguidry thadguidry commented Oct 10, 2018

With Apache Web Server, this is DocumentRoot at the site domain. With Cherokee Web Server, this is also Document Root at the site domain as a default.

Backing out of this discussion...I honestly don't have time to teach the history of the web. But my point still stands, in that it is the responsibility of a web server configuration and website owner to control what designates the "homepage" of a site. For consumers & search engines, it is the index.html file content returned from a web site that designates the home page of the web site.

Author

@EnnexMB EnnexMB commented Oct 10, 2018

I didn't come here for a history lesson, but to apply structured data to a website. You seem to be saying that I should not use structured data to specify that a page is the home page because search engines can figure that out on their own. Search engines can figure out a lot of things without structured data, but my understanding is that the point of structured data is to feed them detailed information so they don't have to figure things out.

@WeaverStever WeaverStever commented Oct 10, 2018

@EnnexMB

All pages are presumed to be webpages, so it would follow that "/" is the homepage, whether the file is named home.html, index.html, etc.

The example I gave earlier is of the type "WebPage", not "Product", so it is not incorrect.
{
"@context": "https://schema.org",
"@type": "WebPage",
"@id": "https://example.com/home.html",
"url": "https://example.com/home.html",
"additionalType": "http://www.productontology.org/id/Home_page",
"name": "Home"
}

The schema documentation cites productontology.org as a source for additionalType values, but the Wikipedia article would serve just as well as an authoritative source in additionalType.

{
"@context": "https://schema.org",
"@type": "WebPage",
"@id": "https://example.com/home.html",
"url": "https://example.com/home.html",
"additionalType": "https://en.wikipedia.org/wiki/Home_page",
"name": "Home"
}

There is nothing magic about productontology.org: if you use any Wikipedia title, you will get the same outline with a different context. For example, Christine Blasey Ford is obviously not a service or product.

The following shows how to model that you offer to sell [a/an/some] Christine Blasey Ford for $ 19.99.

http://www.productontology.org/doc/Christine_Blasey_Ford

@jono-alderson jono-alderson commented Oct 9, 2019

I'd also be keen to be able to specifically and explicitly identify a homepage, as distinct from (as a subtype of) a normal WebPage.

For context...
The graphs we construct in Yoast SEO are typically built 'up' from a relationship between the WebPage, the WebSite and the Organization (in the context of the publisher of a website). E.g., https://search.google.com/structured-data/testing-tool/u/0/#url=yoast.com.

On the homepage, we apply a fudge to try and articulate that this isn't just any-old-webpage, and that, in its capacity as the homepage, it in some sense represents the website/brand; so we add an about property referencing the Organization. This feels awkward, and it's done entirely because we don't have a concept of 'homepage'.

This is also muddied by the fact that the WebPage, WebSite and Organization all share the same URL. I'd like to be able to add some clarity to what's going on!

I'd love to alter this approach and to explicitly mark up / identify the homepage, in order to be completely explicit.
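
To make the workaround described above concrete, here is a minimal Python sketch that builds such a graph and prints it as JSON-LD. The domain, the organization name and the fragment identifiers (`#organization`, `#website`, `#webpage`) are placeholders chosen for illustration; they are not taken from this thread and are not a statement of how any particular plugin names its nodes.

```python
import json

SITE = "https://example.com/"  # placeholder domain (assumption)

graph = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Organization",
            "@id": SITE + "#organization",
            "name": "Example Org",  # placeholder name (assumption)
            "url": SITE,
        },
        {
            "@type": "WebSite",
            "@id": SITE + "#website",
            "url": SITE,
            "publisher": {"@id": SITE + "#organization"},
        },
        {
            "@type": "WebPage",
            "@id": SITE + "#webpage",
            "url": SITE,
            "isPartOf": {"@id": SITE + "#website"},
            # The workaround: the homepage is marked as being "about" the Organization.
            "about": {"@id": SITE + "#organization"},
        },
    ],
}

print(json.dumps(graph, indent=2))
```

Note that all three nodes carry the same `url`, which is the ambiguity mentioned above: nothing in the markup itself says that this WebPage is the homepage.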

Contributor

@RichardWallis RichardWallis commented Oct 9, 2019

Considering the set of subtypes for WebPage, HomePage does look a little conspicuous by its absence.

I would also reference issue #2358 "/WebContent ? Introduce a common supertype of WebSite, WebPage, WebPageElement" which could be the basis for a little bit of a tidy up in this area.

Contributor

@danbri danbri commented Oct 16, 2019

I don't see anything here that isn't covered by a WebSite's URL. We would just be adding yet another way of saying more or less the same thing, wouldn't we?

@jvandriel jvandriel commented Oct 16, 2019

Personally I agree with what @danbri said. ☝️☝️

@WeaverStever WeaverStever commented Oct 16, 2019

I haven't tested this exact scenario, but couldn't the association be made by using an identical @id?
The WebSite script would go in the head section, the WebPage script in the body.

For example: this might make sense where you have an html redirect as an entry point, but the rest of the site is being migrated to PHP elsewhere.

{
    "@context": "http://schema.org",
    "@type": "WebSite",
    "@id": "https://example.com/#",
    "url": "https://example.com/",
    ...
}

and

{
    "@context": "http://schema.org",
    "@type": "WebPage",
    "@id": "https://example.com/#",
    "url": "https://newserver.example.com/index.php",
    ...
}

Contributor

@RichardWallis RichardWallis commented Oct 16, 2019

Is it me, or are we jumping through various hoops to avoid a simple solution?

@danbri's comment I would suggest is 'mostly' true.

I'm sure, however, that I have seen home page URLs that look like http://example.com/pages/home.htm

OK, an intuitive crawl-bot can probably work out that a page is a HomePage, but not every structured data publisher would naturally assume that would be the case.

@jono-alderson jono-alderson commented Oct 16, 2019

Agreed, there are many scenarios where a website's homepage doesn't live at the root URL. Assuming that the website URL and homepage URL are equivalent is going to make a mess.
It's also common that a site will have multiple homepages; e.g., a multilingual site without a splash page may have /en/ and /fr/, which should both be considered to be homepages.

More generally, I'm not keen on relying on external systems to infer what/where the homepage is, from a set of WebPage objects where one happens to (usually) have the same url value as the WebSite. In fact, if we're expecting systems to determine this through inference, that's greater rationale in my mind that we ought to be declaring it explicitly.

RE: @WeaverStever, I'm not keen on them sharing the same ID, as the merged properties don't necessarily describe both (or either) of the nodes accurately, and there are definitely cases where that's going to make a mess.
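
The concern about a shared ID can be sketched in a few lines of Python. The `merge_by_id` function below is a deliberately naive stand-in for a consumer that treats `@id` as a node identifier; it is an assumption for illustration, not a full JSON-LD processor. The two input nodes reuse the values from the WebSite/WebPage example earlier in the thread.

```python
def merge_by_id(nodes):
    """Naively merge node objects that share an @id (not a full JSON-LD processor)."""
    merged = {}
    for node in nodes:
        target = merged.setdefault(node["@id"], {"@id": node["@id"]})
        for key, value in node.items():
            if key == "@id":
                continue
            # Collect every distinct value seen for this property.
            values = target.setdefault(key, [])
            if value not in values:
                values.append(value)
    return merged


website = {"@id": "https://example.com/#", "@type": "WebSite",
           "url": "https://example.com/"}
webpage = {"@id": "https://example.com/#", "@type": "WebPage",
           "url": "https://newserver.example.com/index.php"}

node = merge_by_id([website, webpage])["https://example.com/#"]
print(node["@type"])  # both types now sit on a single node
print(node["url"])    # both URLs too, with nothing linking each URL to its type
```

After the merge there is one node that is both a WebSite and a WebPage and has two URLs, so a consumer can no longer tell which URL belonged to which description.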

@WeaverStever WeaverStever commented Oct 16, 2019

@jono-alderson
Yes, I agree. My interest here is the use case and how it might be handled now. Perhaps I'm not understanding why the HomePage definition is being requested. I have no opinion for or against creating the HomePage type, but some other utility page definitions might be useful: TOSpage, PrivacyPage, etc.

It seems to me a path like //example.com/ is the homepage.

@WeaverStever WeaverStever commented Oct 17, 2019

@jono-alderson
Possible hreflang workaround for your review:

{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "url": "https://www.example.com/",
  "potentialAction": {
    "@type": "SearchAction",
    "target": "https://query.example.com/search?q={search_term_string}",
    "query-input": "required name=search_term_string"
  },
  "mainEntity":{
    "@type": "Person",
    "@id": "https://www.example.com/#",
    "url": "https://www.example.com/",
    "name": "Leroy Brown",
    "inLanguage": "en",
    "description": "baddest man in the whole damn town"
  },
  "about":[
    {
      "@type": "CreativeWork",
      "inLanguage": "es",
      "@id": "https://www.example.com/es/biography/#",
      "about": {
        "@type": "WebPage",
        "url": "https://example.com/es/biography/index.html",
        "mainEntityOfPage": {
          "@type": "CreativeWork",
          "url": "https://example.com/es/biography/headshot.jpg"
        }
      }
    },{
      "@type": "CreativeWork",
      "inLanguage": "fr",
      "@id": "https://www.example.com/fr/biography/#",
      "about": {
        "@type": "WebPage",
        "url": "https://example.com/fr/biography/index.html",
        "mainEntityOfPage": {
          "@type": "CreativeWork",
          "url": "https://example.com/fr/biography/headshot.jpg"
        }
      }
    }
  ]
}

@jono-alderson jono-alderson commented Oct 17, 2019

Why should the WebSite be 'about' the WebPages, in this case? Semantically, that's unpleasant.

I understand the current options and possible approaches, but none of them are as 'clean' as explicitly identifying the homepage, nor do they solve the problems that make that necessary.

@WeaverStever WeaverStever commented Oct 17, 2019

@jono-alderson

In the hreflang example, where you propose a homepage based upon the sub-directory, I suggest creating the type DefaultPage for version specific instances.

The term homePage is semantically incorrect because it is commonly understood to be the file that the webserver executes at the root of the website. I think the term "homepage" should be reserved for this specific file. (I.e. is the file (/index.html) a redirect with no body content? At one time, Facebook og: meta content had to be loaded in the head of the homepage - prior to the redirect.)

The home page is located in the root directory of a website. Most web servers allow the home page to have one of several different filenames. Examples include index.html, index.htm, index.shtml, index.php, default.html, and home.html. The default filename of a website's home page can be customized on both Apache and IIS servers. Since the home page file is loaded automatically from the root directory, the home page URL does not need to include the filename. https://techterms.com/definition/home_page

A DefaultPage is a sub-level hierarchical landing page located in a sub-directory, where the mainEntity (topic) of the WebSite is presented in an alternate format, such as alternate language versions, or specific versions for display-resolution sizes for hand-held or tablet devices.

@jono-alderson jono-alderson commented Oct 17, 2019

We're trying to describe the roles and relationship of the content/pages, not the server architecture. The names, locations and files of various technical setups shouldn't determine how we describe websites for external consumers.

I maintain that my approach is more useful, semantic, and sensible. We're trying to describe what/where/which the 'home page' of a site is, not an artefact of the filesystem architecture.

@WeaverStever WeaverStever commented Oct 17, 2019

We should not create confusing definitions that are contrary to the commonly understood and traditional nomenclature. Homepage = //example.com/

Not asking you to change your approach, suggesting a more correct nomenclature that will be less likely to be misunderstood. The vast majority will not understand HomePage to be a reusable object.

What you are describing is commonly known as the "Default Document."

A default document is the file that is sent by the web server when it receives a request for a URL that does not specify a file name. A default document can be the home page of a web site, or an index page that displays a hypertext listing of the contents of the site or folder. https://kb.intermedia.net/article/304

So rather than creating DefaultDocument, DefaultPage would maintain the naming consistency of the other subtypes of WebPage. (However, DefaultDocument may be more widely understood.)
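
For readers unfamiliar with the term, the "default document" behaviour quoted above can be sketched roughly as follows. This is a simplified Python model, not the logic of any real web server: the `resolve` function, the in-memory `files` set and the order of `DEFAULT_NAMES` are assumptions for illustration (the filenames themselves are the ones listed in the TechTerms quote earlier in the thread).

```python
DEFAULT_NAMES = ("index.html", "index.htm", "index.shtml",
                 "index.php", "default.html", "home.html")


def resolve(request_path, files, default_names=DEFAULT_NAMES):
    """Map a request path to a stored file, loosely imitating a web server."""
    if not request_path.endswith("/"):
        # The request names a file explicitly.
        return request_path if request_path in files else None
    # The request names a folder: try each default document name in order.
    for name in default_names:
        candidate = request_path + name
        if candidate in files:
            return candidate
    return None


files = {"/index.html", "/fr/index.php", "/about.html"}
print(resolve("/", files))
print(resolve("/fr/", files))
```

In this model every folder can have its own default document, which is why "/fr/" resolves to a file just as "/" does.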

@jono-alderson jono-alderson commented Oct 17, 2019

I disagree. It's very rare for a website to be accessed and represented by a literal, 'physical' filesystem. Almost all websites run on some degree of abstraction, powered by a database and URL rewriting system. Aligning to the nomenclature of underlying system architecture, as opposed to describing the user/consumer-facing representation of the content, is the wrong approach (especially when, in many cases, it's not in fact how the system works at all).

There is no scenario where I'd want to, for example, represent a WordPress site's homepage or 'default document' as or via index.php, despite that being the file which the system uses to process requests to the root URL.

Author

@EnnexMB EnnexMB commented Oct 17, 2019

@WeaverStever WeaverStever commented Oct 17, 2019

@jono-alderson

The exercise here is to describe the user facing content to the system architecture(s). The user facing content is between two architectures. Abandoning a set of well-known defined terms is the wrong approach. You brought up the hardcoded path rendering with your //example.com/fr/ example and the "default document" declaration eliminates the need to provide index.php in the path in any case.

Even if you are rewriting the URL, //example.com/ is still your WebSite's HomePage. The HomePage is your configuration management page (language, viewport, etc.), even if it simply tells the browser to use PHP. I've never seen a Home button go anywhere other than the default document at the root.

//example.com/saltedpagename would be more correctly typed as an ItemPage, CollectionPage, or ProfilePage. Or perhaps some other needed sub-type yet to be determined.

Not saying you don't have a valid use-case, just saying that HomePage is not the term to describe it.

If you want to declare a bunch of homepages on your website (https://www.wikidata.org/wiki/Q11439), I'd suggest looking at the https://schema.org/additionalType property.

@WeaverStever WeaverStever commented Oct 17, 2019

@EnnexMB

  1. If the user's content is in a subdomain, then (s)he has a homepage at //subdomain.example.com/, if the content is in a subdirectory (or dynamically generated), it is more accurately called a ProfilePage.

  2. The "default document" is the file that can be called without naming the file in the path.

  3. I'm sure that a lot of end-users also refer to their sub-directory hosted profile as their "website" too; neither website nor homepage is precise enough for our purposes. Your Facebook page is neither your homepage nor your website; it is your profile page.

We are not here to communicate with end-users, we are here to communicate with a database.

Restated:
If you want to declare a bunch of homepages on your website (https://www.wikidata.org/wiki/Q11439), I'd suggest looking at the https://schema.org/additionalType property.

For instance, you could nest additionalType (HomePage) declaration within a ProfilePage type.

@daneshcamp daneshcamp commented Oct 18, 2019

Hi,
this is my website: telegram members

and I have a query-input issue.

How can I fix this?

@jono-alderson jono-alderson commented Oct 18, 2019

I think we're at risk of going round in circles over some disagreements about the technical and conceptual difference between what a 'homepage' and a 'default document' is, and which of those routes we should be exploring (if either).

In the hopes of closing that loop, I'm going to go long-form, starting by rewinding a little and building up some definitions.

Definitions

What is a "WebSite"?

  • In most cases, a WebSite correlates 1-to-1 with a hostname. i.e., www.example.com is a WebSite. subdomain.example.com is a different WebSite.
  • There are some exceptions to this, where sections/components of a website are served via a subdomain for arbitrary/technical/legal reasons (such as login.example.com), where it'd be silly to consider that to be a different WebSite. The specifics of these scenarios vary beyond our ability to predictably document, so should be solved by individual implementors on a case-by-case basis with a dose of common sense.
  • Whilst content in a subfolder of a given hostname could in some cases technically be considered to be a distinct WebSite (e.g., where www.example.com/cats/ is a pet store, and www.example.com/cakes/ is a bakery), this rarely happens in the wild, and is poor enough practice (commercially, technically) that we shouldn't over-engineer schema.org to support it.
  • This explicitly means that 'profiles' on sites which allow you to create your own page(s), such as Facebook (www.facebook.com/example-profile/) should not be considered to be WebSites in their own right. We should, however, recognize we're making an arbitrary distinction here.

TL;DR, a WebSite is (the content contained on) one or more hostnames.

What is a "Homepage"?

  • Almost all websites have a "home page", which acts as a gateway to site content. Whilst it might also act as a CollectionPage (e.g., if it's providing sets of links into deeper resources), it is distinct in its role and function. It is often visually, structurally, functionally different from other pages on the site.
  • In most cases, this page resides at the website root (i.e., its URL is www.example.com). However, it's not uncommon for homepages to reside elsewhere, such as at www.example.com/home.html, or www.example.com/x/home/. In these cases, requests to the site root URL typically redirect to the homepage.
  • Some websites have multiple versions of their "home page", for the purposes of localisation. E.g., www.example.com/fr/ and www.example.com/de/ may for all intents and purposes be considered to be 'the homepage', but served in different languages (assuming that no generic homepage exists at the site root).
  • Whilst this is the most common scenario, it's not the only one. E.g., www.example.com/?logged_in=true vs www.example.com/?logged_in=false (or even, www.example.com/logged-in/ vs www.example.com/logged-out/) may both serve the same "home page", but with minor content changes based on the user's login status.

TL;DR, a "Homepage" is the front/entrance page of a website (which may have localisation variants across multiple URLs).

On homepages vs default documents & document roots

  • Most web servers (and similar systems) are configured to serve a specific file when requests are made to the filesystem root (or a given folder). E.g., when I request www.example.com, behind the scenes, the website is configured to return the content of an index.html file.
  • The configuration and behaviour of this can vary wildly, but the principle remains the same; there's a "behind the scenes" process which maps my request(s) to a specific file.
  • In some technical parlance, this file may be referred to as the default document (for that particular folder).
  • For our purposes, we have no interest in understanding or describing this "behind the scenes" behaviour. As an external consumer, we have no insight into the logic which the system uses to serve pages-on-URLs (especially as it's common for systems to deliberately add abstraction between their URLs, pages and filesystems).
  • Our purposes are to describe the structure and nature of the WebSite to external consumers, in a way which adds clarity and precision of labelling. Confusing the technical architecture of a website's back-end with how it presents its pages and URLs is the wrong approach.
  • Furthermore, the more we try to represent and understand the technical behaviour of the homepage, the more grey areas we run into. Should www.example.com/?lang=fr (a query to the default document) be treated differently to www.example.com/fr/ - which may trigger exactly the same (rewritten) request, or represent a physical filesystem with a default document configuration? None of this matters - they're both "the homepage".
  • We have no interest in describing what the "default document" is/isn't.

TL;DR, the arbitrary and opaque filesystem structure and behaviour 'behind' a website should have no bearing on how external consumers understand what/where its "homepage" is.

The problem we're trying to solve

Helping external consumers to understand what/where the homepage is could add significant clarity to WebPage data structures (and by extension, WebSite graphs).

The schema.org standard already outlines different types of WebPage for this reason (including non-functional definitions, like AboutPage), but HomePage isn't yet a standard.

That means that:

  • When a homepage URL isn't equivalent to the hostname's root URL, we lack a mechanism to identify and describe a website's homepage.
  • If external consumers don't have a way of identifying the homepage, potentially:
    • They'll confuse the (content at the) root URL with the homepage, which may be incorrect and have ramifications.
    • When the root URL redirects to a different URL and/or homepage variant, they may think that there is no homepage.
    • They'll pick an arbitrary variant (or all variants), or make an arbitrary decision on what/where the homepage is ("We'll follow this 302 redirect, and consider this page to be the homepage").

In fact, there is currently no feasible way to declare a relationship between a homepage WebPage and its parent WebSite. Systems currently rely on parsing a page's graph to 'notice' that a WebPage node happens to share the same URL as a WebSite node, and is likely, therefore, the homepage. And, as we've discussed, that's not always (or even particularly often) the case.

That level of ambiguity and arbitrary interpretation - and the problems it causes - is exactly the problem which schema.org was created to address.
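
The inference described above, and the case where it breaks down, can be sketched in Python. The `infer_homepage` function is a hypothetical stand-in for what a consumer might do; it is not taken from any real parser, and the graphs are reduced to the two properties that matter here.

```python
def infer_homepage(graph):
    """Guess the homepage as the WebPage whose url equals the WebSite's url."""
    site_urls = {n.get("url") for n in graph if n.get("@type") == "WebSite"}
    for node in graph:
        if node.get("@type") == "WebPage" and node.get("url") in site_urls:
            return node
    return None  # no WebPage shares the WebSite URL, so the guess fails


root_home = [
    {"@type": "WebSite", "url": "https://www.example.com/"},
    {"@type": "WebPage", "url": "https://www.example.com/"},
]
localised_home = [
    {"@type": "WebSite", "url": "https://www.example.com/"},
    {"@type": "WebPage", "url": "https://www.example.com/fr/"},
]

print(infer_homepage(root_home))
print(infer_homepage(localised_home))
```

The first graph yields a match only because the two URLs happen to coincide; the second, where the homepage lives at /fr/, yields nothing at all.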

This is easy to solve

We can solve all of this mess by:

  • Defining a HomePage subtype of WebPage.

cc @RichardWallis

That means that:

  • Sites can explicitly set their WebPage as HomePage when on the 'homepage'.
  • Sites with homepage language/localisation/similar variants can mark each of those as a/the HomePage (with suitable URL and lang properties).
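
As a sketch of what the proposal could look like in practice, the Python below builds a graph in which two localised pages are each typed as HomePage. Note that HomePage is the type proposed in this thread and does not exist in schema.org at the time of writing; the `homepage_node` helper, the `#website` and `#webpage` identifiers and the example URLs are likewise assumptions for illustration.

```python
import json


def homepage_node(url, language, site_id):
    """Build a page node typed with the proposed (not yet existing) HomePage."""
    return {
        "@type": "HomePage",  # hypothetical subtype of WebPage proposed in this thread
        "@id": url + "#webpage",
        "url": url,
        "inLanguage": language,
        "isPartOf": {"@id": site_id},
    }


site_id = "https://www.example.com/#website"
graph = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "WebSite", "@id": site_id, "url": "https://www.example.com/"},
        homepage_node("https://www.example.com/fr/", "fr", site_id),
        homepage_node("https://www.example.com/de/", "de", site_id),
    ],
}

print(json.dumps(graph, indent=2))
```

With an explicit type, a consumer would no longer need to compare URLs: it could select every node typed HomePage that is part of the WebSite.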

Contributor

@RichardWallis RichardWallis commented Oct 18, 2019

A well-described summary of the aspects of this issue from @jono-alderson, which for me clarifies things a bit.

As I mentioned before, in the set of subtypes for WebPage, HomePage does look a little conspicuous by its absence. If HomePage was already part of that list, I don't believe it would appear out of place.

I believe that, in the spirit of the oft-mentioned principle that making things simpler for [Schema.org] publishers outweighs making them simpler for consumers, the most pragmatic and effective solution would be to create a HomePage subtype of WebPage.

@WeaverStever WeaverStever commented Oct 18, 2019

@RichardWallis,

My problem with this proposal is that @jono-alderson is proposing to create a non-traditional definition of HomePage, where a WebSite would have an unlimited number of HomePages. IMHO this scenario would be more understandable (to most with a programming background), with a name such as a DefaultPage, LandingPage or similarly named page type.

All of the examples I've seen in the last 25 years indicate that the definition of HomePage is //example.com/ and/or //subdomain.example.com/. At this point in the discussion, I think we should formalize it in the Schema with this definition.

A home page is a webpage that serves as the starting point of a website. It is the default webpage that loads when you visit a web address that only contains a domain name. For example, visiting https://techterms.com will display the Tech Terms home page. The home page is located in the root directory of a website.

Home Page Definition - TechTerms (Aug 1, 2015)
https://techterms.com/definition/home_page

@jono-alderson is incorrect in saying that we cannot currently describe such a nonstandard architecture; it can be described with breadcrumbs and additionalType definitions. Additionally, it is the duty of the HomePage to detect language and display preferences -- so "the" HomePage doubles as the domain's configuration management page.

Finally, simply giving a WebPage a subtype of HomePage in the Schema alone is not going to clarify anything for the user's understanding of the page, or the website's ability to unpack a permalink. This should be done in META, and I suggest taking the discussion to https://ogp.me. With Facebook as a creator of the OG protocol, they specialize in rooting these types of profilePages. Reading a meta tag is much easier than unpacking a JSON-LD, Microdata, or RDFa script that may or may not exist.

@WeaverStever WeaverStever commented Oct 19, 2019

@jono-alderson

I think I have a solution for your use-case. My understanding is that you want to create a collection of LandingPages (HomePages) that are not subservient to DNS, aka aliases which are specific to your domain. Instead of a subtype of WebPage, these identities would be a subtype of Thing:

More specific Types
Action
CreativeWork
Event
Intangible
MedicalEntity
Organization
Person
Place
Product

This identical Thing is also going to exist in other domains and social media platforms with a similar naming convention -- perhaps there is an authoritative domain in DNS -- perhaps not.

I suggest looking at incorporating Folksonomy into your use case. Invariably the hashtag (#) and at-sign (@) aliases will not be identical across all platforms, so we would declare your (homepage) alias (as the url=) and the sameAs aliases for the topic at the same time.

<script type="application/ld+json">
{ "@context" : "http://schema.org",
  "@type" : "Organization",
  "name": "yoast",
  "url" : "http://yoast.com/",
  "sameAs": "https://www.wikidata.org/wiki/Q68342360",
  "Folksonomy" :
 { "@type": "ProfilePage",
    "name": ["@yoast", "#yoast", "@TheRealYoast"], 
    "url": "http://example.com/yoast",
    "sameAs": ["https://facebook.com/yoast", "https://twitter.com/yoast", "https://someotherplatform.com/therealyoast", "https://www.linkedin.com/company/yoast-com/" ]
 }
}
</script>

Now we have associated your alias, a collection of matching aliases, and their customer facing URLs with the authoritative domain. When an end-user searches for a folksonomy tag, we have already declared the associations (to the Knowledge Graph) and the authoritative domain for "this" Thing.

Contributor

@RichardWallis RichardWallis commented Oct 24, 2019

My understanding is that you want to create a collection of LandingPages (HomePages) that are not subservient to DNS, aka aliases which are specific to your domain.

Reading back on the thread, I don't believe that is either @jono-alderson purpose, or the original motivation behind @EnnexMB raising this issue in the first place.

Going back to @EnnexMB's original...

I'm really surprised that there's no type for a homepage! What is the proper type to use for a website's homepage?

That is exactly the sort of question I expect from a large proportion of the community of web designers/developers/SEOs etc. that I meet on my travels. That question, and the subsequent proposal to create a HomePage subtype of WebPage, are designed to satisfy the needs of such people, who have little concern about, and potentially little knowledge of, web server architecture and assumptions.

Regardless of the assumptions and inference that a structured data consumer, such as a search engine, can/could derive from the URL of a page, the fact that it is also considered to be a HomePage may well be a useful additional signal.

For me the bottom line is that creating a HomePage type will not break anything yet will contribute to the ease of understanding and use of Schema.org (by those that are not immersed in its apparent idiosyncrasies and the technical implementation of the web).

As to the name of such a new type: HomePage is such a widely understood term across all aspects of web development design and implementation, even if some may prefer something else. I believe no other name than HomePage would convey meaning so widely.

@WeaverStever WeaverStever commented Oct 24, 2019

In the schema, the domain we are describing is the Internet. In any domain, there will be a "Home" directory for an entity, in some cases that directory will contain a "HomePage". On a TLD or subdomain, HomePage is understood to mean the default document in the root directory, on social media and blogging platforms this is not the case. In a subdirectory hosted situation, the content creator is usually investing in a folksonomy tag similar to, //wordpress.org/@myMainPublishingPlatform_ThisContentIsPushedToOtherPlatforms.

Search engines (generally) determine Domain Authority, Domain Trust and name disambiguation from the Wikipedia and Wikidata "Official URL." To induce the search engines to create a Graph for an otherwise unknown identity (no dedicated domain), the entity's numerous internet presence locations should agree that one URL is the HomePage (Official URL) for this entity.

  • Today, a lot of end users are not purchasing a domain, and instead are simply claiming a namespace(s) on multiple free platforms like Facebook, Twitter, Wordpress.org, Blogger etc. I think a signal we could send to the search engines is which one of these profiles the entity considers "their" HomePage or Official URL. (These profile pages will never be listed on Wikipedia, thus the need to create a top level to the structure -- declare the Official URL as HomePage.)

HomePage: Less trusted than WebSite, HomePage is for instances where the entity's Official URL is hosted on an unaffiliated domain (such as Facebook). This "home" URL should be in agreement with any authoritative URL(s) found on sites like Wikidata, Google Places or other Knowledge Graph Panel authorities.
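
With the vocabulary as it stands, part of this can already be expressed. A minimal sketch, assuming a hypothetical person with no domain of their own and placeholder profile URLs on reserved .example hosts: url names the profile the person treats as official, and sameAs lists the others. Whether any consumer treats url as more authoritative than sameAs is an assumption here, not something this thread establishes.

```json
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Jane Example",
  "url": "https://blog-platform.example/janeexample",
  "sameAs": [
    "https://social-one.example/janeexample",
    "https://social-two.example/TheRealJaneExample"
  ]
}
```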

@RichardWallis

Contributor

@RichardWallis RichardWallis commented Oct 24, 2019

In the schema, the domain we are describing is the Internet. In any domain, there will be a "Home" directory for an entity, in some cases that directory will contain a "HomePage". On a TLD or subdomain, HomePage is understood to mean the default document in the root directory, on social media and blogging platforms this is not the case. In a subdirectory hosted situation, the content creator is usually investing in a folksonomy tag similar to, //wordpress.org/@myMainPublishingPlatform_ThisContentIsPushedToOtherPlatforms.

@WeaverStever I have the distinct impression that we are talking past each other.

Although I understand what you are saying here, most of it is not relevant to a basic website manager who is looking to provide simple structured data for the collection of pages on his/her site. They do so with little or no consideration of URL or directory structures (which in many browsers/devices are not visible nowadays); understanding of what is meant by investing in a folksonomy tag; or reference to a social media account, other than possibly providing a sameAs value.

An example creator may well have before them a set of pages to mark up with Schema.org: an about page and a contact page to describe their organisation; item pages to describe their products and a checkout page to purchase them from. They may even have a FAQ and search results pages to be helpful to their customers.

Put yourself in their place, when delving into the intricacies of applying Schema.org, to find that every one of those page types has its own specific type to apply: easy and simple! Yet the home page, an important page in most webmasters' eyes, has no such type, and as a nearest equivalent has to be defined as just a generic 'WebPage'.

It is hardly surprising that this leads to the question, why is there not a HomePage type? A question I find difficult to answer in a way that would make sense to such a webmaster.
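
A minimal sketch of that situation, assuming a hypothetical site at https://example.com with placeholder page names: the about, contact, and FAQ pages each get a dedicated WebPage subtype, while the home page can only be marked up as a plain WebPage.

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "AboutPage",
      "@id": "https://example.com/about",
      "url": "https://example.com/about",
      "name": "About us"
    },
    {
      "@type": "ContactPage",
      "@id": "https://example.com/contact",
      "url": "https://example.com/contact",
      "name": "Contact"
    },
    {
      "@type": "FAQPage",
      "@id": "https://example.com/faq",
      "url": "https://example.com/faq",
      "name": "Frequently asked questions"
    },
    {
      "@type": "WebPage",
      "@id": "https://example.com/",
      "url": "https://example.com/",
      "name": "Home"
    }
  ]
}
```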

@WeaverStever Other than in some circumstances potentially duplicating what might be inferred from inspecting URL/directory paths, hence sometimes being superfluous, what would be the negative effects of using the HomePage type on a site (assuming it was added to the vocabulary)?

@WeaverStever

@WeaverStever WeaverStever commented Oct 24, 2019

@RichardWallis

Short Answer: Because we would be missing the opportunity to declare an absolute HomePage (Official URL), where the entity has not purchased a domain. So if the user wants to use a service instead of a domain, the absolute HomePage could be a location where the search engines have more confidence (trust) in our structured-data scripts.

Every "basic website manager" I've encountered has had no problem with the term LandingPage (or profile page) and we avoid using the term HomePage unless we are discussing the default document at the root of the domain.

understanding of what is meant by investing in a folksonomy tag; or reference to a social media account, other than possibly providing a sameAs value.

The folksonomy tag is a non-unique internet tag, as opposed to a domain, but they are similar in search. For instance, if you plug @Yoast into your browser, it should return the tag from many domains.

Now, let's say that yoast was already taken on a platform, so the company chose @TheRealYoast. Now we have a use-case where several "names" are related to several URLs, so we should probably create a property telling the search engines what to expect (many names to many URLs).

what would be the negative effects of using the HomePage type on a site (assuming it was added to the vocabulary)?

Basically, a misinterpretation of what HomePage means to different people. In Unix, home is an absolute location cd $HOME. If we allow the HomePage alias to be used on millions of pages, we miss the opportunity to declare the user's absolute HomePage (authoritative). The folksonomy tag / url relationships could be declared on the absolute HomePage with other identifying information (e.g. disambiguation).

Establishing trust and identification has been a HUGE PROBLEM because Wikipedia has become an almost exclusive gateway to the Knowledge Graph. When the user has a domain, money has changed hands, so the search engines are a little more confident. When the page is hosted on an unverified LandingPage in a sub-directory, it is just (ambiguous) noise to the search engines, unless it is an Official URL on Wikidata.

LandingPage: A sub-type of WebPage, a LandingPage is the root page for an entity hosted on an unaffiliated domain (such as Facebook) and having a sub-directory style of URL addressing (//example.com/topic). Typically, these entities are also assigned Folksonomy tags (tags prefixed with @ or #) which are specific to the domain and available to search engines.

Folksonomy: A Property with an expected type of Thing. For grouping Folksonomy @tags that are related to identifiers for the current URL and suggesting sameAs URLs having related Folksonomy tags of related interest.

@Tiggerito

@Tiggerito Tiggerito commented Jan 1, 2020

I'd not go with LandingPage. In the advertising world this means a page specifically made to point adverts to. In many cases pages hidden deep in a website.

I'm wondering if another step back will help. What is the need here?

Are we talking about the need to define one (or even several) entrance pages to a WebSite? If so, would it make more sense to define an entrancePage property in WebSite?

Or is it about defining a page that will become the home for information about an entity such as an Organization? I think good cross-referencing would be better suited for that, i.e. whenever you reference an Organization, make sure you define which URL its full details are on.

@EnnexMB

Author

@EnnexMB EnnexMB commented Jan 1, 2020

I'm wondering if another step back will help. What is the need here?

Since I'm the one who posted the issue, perhaps that's a question for me. It's been somewhat humorous and somewhat disconcerting to watch people talk all around the issue without getting at the very simple, basic need. So thank you for your question.

Let me answer by simply quoting what I originally posted back in September 2018:

The type "WebPage" has 12 subtypes (including in extensions). They include about pages, contact pages, and FAQ pages. I'm really surprised that there's no type for a homepage! What is the proper type to use for a website's homepage? Is the homepage supposed to just take on the type of the person, organization, or subject matter that the website is about? But isn't a homepage a legitimate kind of page, just like an about page or contact page? Should there be a new type for HomePage?

My answer to that question, even more now after watching this discussion for over a year, is Yes!

@EnnexMB

Author

@EnnexMB EnnexMB commented Jan 1, 2020

And I would refer people back to the two comments above by @RichardWallis on Oct 24, 2019, for some rational, clear thinking on the subject.

(I don't think it's only because he agrees with me that I find his comments to be rational.)

@WeaverStever

@WeaverStever WeaverStever commented Jan 1, 2020

@Tiggerito I'm not married to the term LandingPage, but a Facebook page, for instance, is buried within a website and when the same entity has several social media profiles, they are generally trying to drive traffic to a specific platform / profile. AKA "follow me on Twitter" or "Instagram influencer." This basically is advertising.

Supposedly, the new decade kicks off Web 4.0. In Web 3.0, the hashtag became a popular referrer. I would hate to see us adopt a definition for HomePage with a Web 2.0 mindset.

An Organization will almost always purchase a domain and therefore have a unique HomePage by default (https://schema.org/WebSite). Conversely, an individual entity will also want to have a unique HomePage, but without having to purchase a domain and hire staff to maintain a website. (AKA the entity may have several social media profiles, but only one of them is the preferred HomePage -- the authoritative and unique internet presence.)

IMHO the term HomePage should be reserved to define the top of the pyramid of all Internet presence(s) for the entity, regardless of what platform, CMS or social media site the entity is hosted on. Additionally, we need to explore how folksonomy tags (hashtags / at-sign prefixes) figure into this Web 3.0 version of entity hosting.

@Tiggerito

@Tiggerito Tiggerito commented Jan 2, 2020

@WeaverStever I like the idea of defining a "Home Page" for an entity, and therefore move away from the strict rule that a home page is for a WebSite.

Would url already cover it, as the official URL for the entity?

That would not cover things like organizations with multiple language websites. Or can it?

@EnnexMB

Author

@EnnexMB EnnexMB commented Jan 2, 2020

Even without multiple languages, that doesn't work. There is not a 1:1 correspondence between entities and websites. I myself have four websites. Are you going to tell me I have to decide which one of them has a home page?

Did you go through all these mental gymnastics to define about pages, contact pages, and FAQ pages? Some websites have multiple about pages, multiple contact pages, and multiple FAQ pages for different audiences or different circumstances. Did you worry about defining what counts as an about page, a contact page, or a FAQ page? Why can't you just let the website developers decide for themselves what page or pages on their sites are home pages? I've seen sites with an index.htm and an index2.htm, and maybe the developer would say those are both home pages. Do you care? Is it up to you to dictate what people call a home page or not?

I opened this topic because I was looking into putting structured data in my websites. More than a year later, I can designate pages on my sites as about pages, contact pages, and FAQ pages, but I still can't designate the most important pages on the sites -- their home pages.

@jono-alderson

@jono-alderson jono-alderson commented Jan 2, 2020

Quite. We're massively overcomplicating this.

@RichardWallis

Contributor

@RichardWallis RichardWallis commented Jan 2, 2020

Totally agree: We're massively overcomplicating this by trying to define a multiplicity of potential circumstances and how data consumers may or may not interpret them.

I believe that @EnnexMB and many others will be satisfied by the addition of a HomePage type with a simple definition along the lines of:
A Home or Landing page for a website. Potentially, because of multiple languages and other criteria, a site may have more than one page defined as a HomePage.

If someone then wanted to write an advisory document on best use of this and other WebPage subtypes (in a similar manner to the Hotels documentation), I am sure it would be welcomed by the community.
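
To illustrate how the proposed definition might be applied, here is a sketch only. HomePage is the type proposed in this thread and is not part of the Schema.org vocabulary as discussed here, so consumers would not recognise it; the site, its URLs, and the two language variants are hypothetical. WebSite, inLanguage, and isPartOf are existing terms.

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "WebSite",
      "@id": "https://example.com/#website",
      "url": "https://example.com/",
      "name": "Example"
    },
    {
      "@type": "HomePage",
      "@id": "https://example.com/en/",
      "url": "https://example.com/en/",
      "inLanguage": "en",
      "isPartOf": { "@id": "https://example.com/#website" }
    },
    {
      "@type": "HomePage",
      "@id": "https://example.com/es/",
      "url": "https://example.com/es/",
      "inLanguage": "es",
      "isPartOf": { "@id": "https://example.com/#website" }
    }
  ]
}
```

Both language variants carry the proposed type, matching the allowance in the definition for more than one HomePage per site.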

@jono-alderson

@jono-alderson jono-alderson commented Jan 2, 2020

Happy to write something up along the lines of the Hotels docs sometime in Jan, with some concrete examples!

@WeaverStever

@WeaverStever WeaverStever commented Jan 7, 2020

@EnnexMB

What I'm saying is that HomePage and WebSite would be at the same level, with HomePage not a subtype of WebPage. WebSites already have a home page designated as "/"; it doesn't matter if the file is named home.html or index.php, it still resolves to "/".

So if you want to name all of your websites as HomePages, that would be your business; however, there may be instances where an entity would prefer to create a pyramid linking style to direct traffic to the preferred CMS host that operates under a domain they do not control -- i.e., a writer on Blogger might prefer his readers read the Wordpress.org version/content for example.

The separate language example sort of proves my point. Languages are handled by the browser with meta tags. The page that contains the meta redirect would be the homePage, not the page after following the redirect. AKA //example.com/myblog/es really is not a homePage, //example.com/myblog/ is the homePage on a CMS hosting farm type site.

So for HomePage, we would be creating a special case where the content root page is not at "/" of the hosting domain. An entity could declare themselves to have one preferred "HomePage" or they could choose to make their internet presence ambiguous and name all of their social media pages HomePage.

@jono-alderson

@jono-alderson jono-alderson commented Jan 7, 2020

I'm not even going to attempt to deconstruct this, it's incredibly muddled, and you're taking us around in circles.

This is incredibly simple. Homepages are a type of webpage on a website. We're trying to describe webpages on websites. Whether or not a user considers their Facebook profile to be their 'homepage' is irrelevant in this case.

That said, I do think that the idea that I might want to designate a 'home page', i.e., a URL which I consider to be the definitive location of an entity, is interesting. But that's what the url property is for, and the ambiguity between the URL of a website, an organization, and that organization's homepage is exactly the problem we're trying to solve.
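
To make that ambiguity concrete, here is a sketch using only existing types and properties, with https://example.com/ as a placeholder. The same URL serves as the url of the Organization, the WebSite, and the page itself; only the @id fragments (a convention assumed here, not something Schema.org mandates) keep the three nodes apart.

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://example.com/#organization",
      "name": "Example Org",
      "url": "https://example.com/"
    },
    {
      "@type": "WebSite",
      "@id": "https://example.com/#website",
      "url": "https://example.com/",
      "publisher": { "@id": "https://example.com/#organization" }
    },
    {
      "@type": "WebPage",
      "@id": "https://example.com/#webpage",
      "url": "https://example.com/",
      "isPartOf": { "@id": "https://example.com/#website" },
      "about": { "@id": "https://example.com/#organization" }
    }
  ]
}
```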

@WeaverStever

@WeaverStever WeaverStever commented Jan 7, 2020

@jono-alderson

I'm not seeing the problem with Organization / WebSite: the organization purchases a domain, so they have a HomePage that is unique across the entire Internet.

For an entity that does not have a domain, we could purpose HomePage to mean the entity's $HOME on the Internet. I.e., the landing page that contains the official structured data and other details about the entity.

Identity is the next frontier.

@EnnexMB

Author

@EnnexMB EnnexMB commented Jan 7, 2020

Do decisions ever get made here, or is this just a forum for incessant discussion of matters over which we have no control?

I opened this topic as a request for WebPage to have a new type called HomePage. Where does a decision on such a request get made? Is the body that makes such a decision even aware of this request?

@WeaverStever

@WeaverStever WeaverStever commented Jan 7, 2020

HomePage as a subtype of WebPage = my vote is No

LandingPage (or any other name) as a subtype of WebPage = Yes.

Reasoning: HomePage is a unique identity within a domain -- our domain is the Internet.

@EnnexMB

Author

@EnnexMB EnnexMB commented Jan 7, 2020

Who gets a vote, and does a vote here mean anything?

Does someone here know the answer to these two questions:

  • Where does a decision on a request for a new subtype get made?
  • Is the body that makes such a decision aware of the request made here for WebPage to have the subtype HomePage?

@WeaverStever

@WeaverStever WeaverStever commented Jan 7, 2020

@EnnexMB

@RichardWallis is reading this thread; he is a consultant for Google. In his post (above) he mentions HomePage and Landing page. My contention is that HomePage should be reserved for a broader and more well thought out purpose.

Ultimately, Google and the other search engines will have the say on how the name/value pair is implemented and consumed, with the advice of senior members here.

Read about Richard Wallis.
https://www.thedrum.com/opinion/2019/07/03/richard-wallis-his-involvement-with-google-and-schemaorg

@danbri

Contributor

@danbri danbri commented Jan 7, 2020

We don't vote here, we discuss here. Proposals with a serious likelihood of being consumed, i.e. used, are our main focus. Without that constraint, Schema.org would balloon into something 100x bigger and be unusable.

Do decisions ever get made here

As far as I see, nobody in this thread is proposing to actually use a HomePage or LandingPage type, or to add much value beyond what we already have via "WebPage", "WebSite", "url", and nearby. Let's archive the proposal for future reference by closing the issue and discussion, until such time as someone comes forward with the intent of consuming it.

9 participants