A subtype of WebPage is needed for policy pages like terms and privacy #59

Open
robrwo opened this issue Sep 24, 2019 · 14 comments

@robrwo

robrwo commented Sep 24, 2019

There should be subtypes of WebPage for various policies and terms and conditions:

  • PolicyPage should indicate that this web page contains website policies.
    • TermsAndConditions is a subtype of PolicyPage, for terms and conditions of use.
    • PrivacyPolicy is a subtype of PolicyPage, for privacy policies.
    • ModerationPolicy is a subtype of PolicyPage, for content moderation policies.
    • SubmissionPolicy is a subtype of PolicyPage, for content submission policies.
    • SecurityPolicy is a subtype of PolicyPage, for security policies.

The about property would indicate which entity these policies apply to. If it is omitted, the policies would presumably be assumed to apply to the site in general.

See also schemaorg/schemaorg#1308 which mentions a TermsOfServicePage and PrivacyPolicyPage.
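
As a rough illustration of the hierarchy listed above, the sketch below records each proposed type with its proposed parent and builds JSON-LD for one policy page. None of these types exist in schema.org; the type names come from this proposal, while the helper names (`ancestors`, `policy_markup`), the example URLs, and the organisation name are assumptions made up for the example.

```python
import json

# Proposed (hypothetical) type names mapped to their proposed parent type.
# These are taken from the proposal text, not from the schema.org vocabulary.
PROPOSED_PARENTS = {
    "PolicyPage": "WebPage",
    "TermsAndConditions": "PolicyPage",
    "PrivacyPolicy": "PolicyPage",
    "ModerationPolicy": "PolicyPage",
    "SubmissionPolicy": "PolicyPage",
    "SecurityPolicy": "PolicyPage",
}


def ancestors(page_type):
    """Return the chain of proposed parent types, nearest first."""
    chain = []
    while page_type in PROPOSED_PARENTS:
        page_type = PROPOSED_PARENTS[page_type]
        chain.append(page_type)
    return chain


def policy_markup(page_type, url, name, about=None):
    """Build a JSON-LD dict for a policy page of one of the proposed types."""
    if page_type not in PROPOSED_PARENTS:
        raise ValueError(f"not a proposed policy type: {page_type}")
    doc = {
        "@context": "https://schema.org",
        "@type": page_type,
        "@id": url,
        "name": name,
    }
    # Omitting "about" stands for "applies to the site in general".
    if about is not None:
        doc["about"] = about
    return doc


# Hypothetical example values.
privacy = policy_markup(
    "PrivacyPolicy",
    "https://example.com/privacy",
    "Example.com Privacy Policy",
    about={"@type": "Organization", "name": "Example Ltd"},
)
print(json.dumps(privacy, indent=2))
```

A consumer that only understands the general type could use `ancestors("PrivacyPolicy")` to fall back to PolicyPage or WebPage.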

@WeaverStever

I generally don't agree with specific subpages for items like localBusinesses, but I tend to agree with this suggestion. There is really nothing of interest on these pages, so declaring them with the fewest lines of code possible would be cool. They are certainly not pages we want taking up real estate in a rich snippet.

@jvandriel

What I wonder about this proposal (and schemaorg/schemaorg#1308) is why do you want such subtypes of WebPage and what do you expect to achieve by having them?

If schema.org's sponsors had a need for such subtypes, I'm pretty sure they would have proposed them long ago, so is there a reason beyond doing well in search engines that you would like to add them?

@WeaverStever

There are already quite a few of them. They could act as shortcuts for known types.

For instance, ContactPage is a shortcut for
WebPage > additionalType > https://en.wikipedia.org/wiki/Contact_page

More specific Types
AboutPage
CheckoutPage
CollectionPage
ContactPage
FAQPage
ItemPage
MedicalWebPage
ProfilePage
QAPage
SearchResultsPage
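
To make the "shortcut" reading above concrete, here is a small sketch that rewrites a specific page type into the generic WebPage plus additionalType form. The ContactPage mapping is the one given in this comment; treating the two forms as equivalent is the commenter's interpretation rather than something schema.org defines, and the `SHORTCUTS` table and `expand_shortcut` helper are hypothetical names.

```python
import json

# Mapping from a specific page type to the additionalType URL it is read as
# a shortcut for. Only the ContactPage entry comes from the discussion.
SHORTCUTS = {
    "ContactPage": "https://en.wikipedia.org/wiki/Contact_page",
}


def expand_shortcut(doc):
    """Return a copy of doc with a known shortcut type expanded.

    Documents whose @type is not in SHORTCUTS are returned unchanged.
    """
    expanded = dict(doc)
    page_type = doc.get("@type")
    if page_type in SHORTCUTS:
        expanded["@type"] = "WebPage"
        expanded["additionalType"] = SHORTCUTS[page_type]
    return expanded


# Hypothetical example page.
contact = {
    "@context": "https://schema.org",
    "@type": "ContactPage",
    "@id": "https://example.com/contact",
}
print(json.dumps(expand_shortcut(contact), indent=2))
```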

@robrwo
Author

robrwo commented Oct 2, 2019

@jvandriel asked:

why do you want such subtypes of WebPage and what do you expect to achieve by having them?

Search engines will be aware of these pages and index them accordingly. They are important pages for a site, but they are not the main content of the site.

@jonoalderson

I'm strongly in favour of this.

With the increasing requirement that sites have formalised structures for privacy / cookie / compliance content, this makes sense.

E.g., the California Consumer Privacy Act requires a “Do Not Sell My Personal Information” link on the home page. We should be able to identify what/where/which this page is, as a special subtype of WebPage.

These kinds of requirements - where a particular type of page is required by law - are likely to continue to expand. We need a mechanism to reliably identify them.

This also feels like an excellent opportunity to address the fact that we currently have no way of representing what/where/which a homepage is.

@WeaverStever

WeaverStever commented Dec 2, 2019

I agree with the proposal for TOU and Privacy page sub-types.

I take a different view from @jono-alderson on the definition of HomePage.

I feel that we should treat the Internet like any other domain and define HomePage as the absolute authoritative URL (aka $HOME) for entities that use social media type pages instead of purchasing a domain. I.e., the url where the entity/user is most active. (HomePage would be the alternative equivalent to WebSite in terms of sending signals to the Knowledge Graph -- one per entity -- similar to the way "Official Website" works on Wikipedia / WikiData.)

Here is my alternate position on the definition of HomePage.

@robrwo
Author

robrwo commented Dec 4, 2019

Ok, so what is the next phase? Do I need to work on a PR, or wait until there's approval from certain entities?

@danbri
Contributor

danbri commented Jan 2, 2020

Thanks for the proposal. At this stage I am wary of adding another one of these "kinds of WebPage" types that doesn't come with a consuming application that needs it. We have several (e.g. AboutPage CheckoutPage CollectionPage) which afaik remain under-utilized.

In the News area we also have a number of similar properties that indicate specific kinds of page via a relationship/property rather than type. See http://schema.org/publishingPrinciples and its subproperties, i.e. actionableFeedbackPolicy, correctionsPolicy, diversityStaffingReport, masthead, missionCoveragePrioritiesPolicy, noBylinesPolicy, ownershipFundingInfo, unnamedSourcesPolicy, verificationFactCheckingPolicy. These all came from the Trust Project and are mostly rather specific. They are also, currently, under-utilized.

Another issue is that many of these issues are often documented on the same page. The https://schema.org/WebContent type may partially address that. My suggestion would be that we consider reworking these (and others) as properties, whose value could be 'WebContent' (i.e. hiding the distinction between WebSite, WebPage, WebPageElement). That might work better (read more naturally) on other pages pointing to the policies than on the page itself.
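
A minimal sketch of the property-based alternative described above, assuming a publisher whose policies all live on one page and are addressed by fragment. `publishingPrinciples` and `correctionsPolicy` are existing schema.org properties; `privacyPolicy` is a hypothetical property name used only for illustration, as are the helper name, the organisation, and the URLs.

```python
import json


def policies_by_property(org_name, org_url, policies):
    """Describe an organisation and point to its policies via properties.

    policies maps a property name to the URL of the policy text. Each value
    is wrapped as WebContent, so it does not matter whether the target is a
    whole site, a page, or a section of a page.
    """
    doc = {
        "@context": "https://schema.org",
        "@type": "NewsMediaOrganization",
        "name": org_name,
        "url": org_url,
    }
    for prop, target_url in policies.items():
        doc[prop] = {"@type": "WebContent", "url": target_url}
    return doc


# Hypothetical example: three policies documented on the same page.
example = policies_by_property(
    "Example News",
    "https://example.com/",
    {
        "publishingPrinciples": "https://example.com/policies#principles",
        "correctionsPolicy": "https://example.com/policies#corrections",
        # Hypothetical property, not part of schema.org.
        "privacyPolicy": "https://example.com/policies#privacy",
    },
)
print(json.dumps(example, indent=2))
```

In this shape the markup sits on a page that points to the policies, such as the home page or an about page, rather than on the policy page itself.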

@danbri
Contributor

danbri commented Jan 2, 2020

To answer the question from schemaorg/schemaorg#2376 I am not going to fast track this for v6, because it touches on the matter of search engines understanding terms, conditions, policies etc., where we historically tend to tiptoe to avoid introducing confusion or undermining the understanding of more infrastructural things like robots.txt.

@thezedwards

howdy - this is a great discussion, thanks ya'll for linking to all those extra resources, it's been helpful for me to understand best practices for submissions.

I wanted to provide some quick feedback on this proposal -- in my opinion, the importance of every domain having 100% clarity on where their privacy policy/TOS exists and any "do not sell my information" pages/prompts is growing.

There are a large number of enterprise organizations that are using "Do not sell" services from companies like onetrust, which ends up creating a situation where their submission forms are hosted on separate domains - thousands of U.S. enterprises pushing users to forms on domains like: https://privacyportal-cdn.onetrust.com/

I've been auditing how organizations are doing this, and the vast majority merely link to the forms from their footers -- it's sorta become the de facto standard. BUT, this has created some real challenges for any organization trying to catalogue all the privacy policies and these "do not sell" URLs without custom crawling across basically every domain.

I've heard one potential solution to this (suggested by the brilliant privacy strategist, Yoav Aviram, who is associated with @ yourdigitalrights.org) could be something like encouraging all domains to create a new domain.com/privacy.txt file that would be similar to a domain.com/ads.txt or robots.txt and this "privacy.txt" file would contain the schema/links that would be potentially generated through this discussion.

I tend to agree with the feedback someone wrote here that including privacy policy markup on every page would be way excessive and it would lead to low adoption rates, but I also know that many enterprise organizations are trying to figure out how to get cleaner data access/deletion/modification reports and the 3rd party industry supporting these submissions is growing VERY fast. Those enterprise orgs would benefit from dropping their CCPA links in a known-location, which would then encourage new industries that tap into that schema/markup to support consumer data access/deletion requests.

Would pushing this privacy/TOS schema into a known /privacy.txt file be something that could speed up adoption of this?

Does anyone have any feedback or ideas for how the CCPA/GDPR/TOS/privacy policy links can be better documented across all domains in a standardized way?

Thanks for any feedback and ya'lls work.

Sincerely,
Zach

@robrwo
Author

robrwo commented May 7, 2020

@thezedwards wrote:

I tend to agree with the feedback someone wrote here that including privacy policy markup on every page would be way excessive and it would lead to low adoption rates,

I think you misunderstand this. It is a discussion of markup that needs to be on at most one page for each type of document (privacy, TOS, etc).

How organisations identify it in a sitemap schema is a separate issue.

@WeaverStever

WeaverStever commented May 7, 2020

Because some of these types have already been enumerated, it is sort of glaring that PolicyPage is not included. The PolicyPage subtype could be the container for all of these TOU pages and then be sub-typed with external ontologies if the website's policies are spread over several pages.

(Glancing at the list of WebPage "More specific Types", I am seeing that RealEstateListing is listed as pending. Bizarre that this would be adopted before a type that is relevant to almost all websites.)

<script type="application/ld+json">
{
    "@context": "http://schema.org",
    "@type": "PolicyPage",
    "@id": "https://example.com/privacypolicy.html",
    "additionalType": ["https://en.wikipedia.org/wiki/Privacy_policy", "https://www.wikidata.org/wiki/Q1999831"],
    "name": "Example.com Privacy Policy"
}
</script>
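
For a sense of what a consumer of this markup might do, the sketch below pulls JSON-LD blocks out of an HTML page with Python's standard `html.parser` and keeps those typed as PolicyPage. It assumes the proposed PolicyPage type were in use; `JsonLdExtractor`, `find_policy_pages`, and the sample HTML are hypothetical, and a real crawler would also need to handle `@graph`, type arrays, and malformed JSON.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script text may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.documents.append(json.loads("".join(self._buffer)))


def find_policy_pages(html):
    """Return JSON-LD objects in html whose @type is the proposed PolicyPage."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [
        doc for doc in parser.documents
        if isinstance(doc, dict) and doc.get("@type") == "PolicyPage"
    ]


# Sample page reusing the markup from the comment above.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{
    "@context": "http://schema.org",
    "@type": "PolicyPage",
    "@id": "https://example.com/privacypolicy.html",
    "additionalType": ["https://en.wikipedia.org/wiki/Privacy_policy",
                       "https://www.wikidata.org/wiki/Q1999831"],
    "name": "Example.com Privacy Policy"
}
</script>
</head><body></body></html>
"""

for page in find_policy_pages(SAMPLE_HTML):
    print(page["@id"], page["additionalType"])
```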

@thezedwards

thezedwards commented May 9, 2020

thanks very much @robrwo - i totally agree with you. My one usecase where I was thinking that it needed to be embedded into every page was for external consent dialog flows where websites start to want to provide an easily-exposed endpoint on every page that will trigger their consent dialogs for data flow.

For now, I think it's better to focus on the process that @WeaverStever is articulating with the PolicyPage subtype and TOU pages and i'd 10000000% agree that this needs to move forward sometime soon -- I would strongly argue that big organizations should be advocating for this change NOW, ahead of the June/July CCPA guidance from the California Attorney General that could require a lot more collaboration between data controllers/processors and provide new types of exposure to publishers and advertising/analytics companies.

As it stands right now, the lack of consistent schema for "Do not sell" "Privacy Policy" "TOS" "cookie policy" and other types of pages like this is a glaring hole for collaboration and growing startups in this data privacy space, and just a small schema change to make it easier for people to communicate this to each other would go a long way towards starting this process.

I'm more than happy to create a mini website or whatever is needed to advocate for this change so that more people understand how the CCPA is going to impact collaboration with data privacy and how the current lack of schema standards around some core data privacy fields/pages is currently holding back innovation and creating some edge-case legal exposure. I know getting these issues through and launched is sometimes tough but this is an issue i'll be focusing on, so anyone feel free to ping me if you feel this needs more advocacy - or if I can help push it forward in any way.

thanks everyone,

Sincerely,
Zach

@RichardWallis RichardWallis transferred this issue from schemaorg/schemaorg Jul 13, 2020
@RichardWallis

See issue #7 for the context of the move from the main Schema.org issue tracker to this repository.
