Consider renaming generic types and properties #417

Closed
vholland opened this Issue Apr 8, 2015 · 36 comments
@vholland
Contributor
vholland commented Apr 8, 2015

As schema.org grows, generic names for types and properties lead to unfortunate collisions. The issue will become more problematic when extensions are made a part of the flat namespace.

I am opening this issue as a place to gather ideas for types and properties which should be considered for renaming:

Types

  • Series
  • Season
  • Seat
  • Taxi
  • Joint
  • Vessel
  • Artery
  • Diet
  • Code
  • AlignmentObject
  • Audience
  • Permit
  • Ticket

Properties

  • affectedBy
  • algorithm
  • alignmentType
  • antagonist
  • application
  • area
  • aspect
  • assembly
  • audience
  • background
  • benefits
  • branch
  • branchOf
  • carrier
  • catalog
  • code
  • collection
  • configuration
  • comment
  • cost
  • course
  • damages
  • device
  • diet
  • doors
  • engine
  • episode
  • event
  • followup
  • free
  • frequency
  • function
  • functionalClass
  • gears
  • guideline
  • incentives
  • indication
  • instrument
  • intensity
  • material
  • menu
  • object
  • option
  • origin
  • outcome
  • overview
  • phase
  • playerType
  • population
  • position
  • preparation
  • procedure
  • producer
  • produces
  • purpose
  • requirements
  • responsibilities
  • result
  • risks
  • season
  • significance
  • skills
  • source
  • specialty
  • stage
  • status
  • study
  • subtype
  • surface
  • target
  • temporal
  • text
  • timezone
  • title
  • transmission
  • tributary
  • workLoad
@rvguha
Contributor
rvguha commented Apr 8, 2015

Vicki,

A huge thanks for taking on this task. These badly named types and
relations have been an embarrassment for a long time and I for one will be
glad to see them go.

guha

@elf-pavlik
Contributor

Maybe worth converting this list into a table and tracking proposed new names directly in it?

Types

| Current Name | Proposed Name |
|---|---|
| Series | TvSeries |

Properties

| Current Name | Proposed Name |
|---|---|
| target | entryPoint |

I also wonder how you plan to manage this renaming.

@danbri
Contributor
danbri commented Apr 15, 2015

Vicki - this is great. It would be good to prioritize, and - as @elf-pavlik says - collect proposed successor names. Sometimes we are just cursed with there not being a natural or obvious phrase.

I have just broadened supersededBy to support its use with classes and enumerations.

Regarding Series, which is a supertype of TVSeries, RadioSeries, MovieSeries, BookSeries, Periodical and VideoGameSeries: could this be called MediaSeries, ContentSeries, or CreativeWorkSeries? Of those, CreativeWorkSeries seems the easiest choice.

@thadguidry

+1 for CreativeWorkSeries - we are cursed sometimes, true. :)

@Dataliberate
Contributor

CreativeWorkSeries +1

~Richard

@vholland
Contributor

It may be easiest to keep some of these. I was trying to cast a wide net when I generated the list.

@danbri
Contributor
danbri commented Apr 15, 2015

I would also have included 'warranty' here, in that it actually meant "warranty promise". But we've just cleaned that up http://sdo-gozer.appspot.com/warrantyPromise #316

@tmarshbing

If we're creating a table, can we also include whatever usage statistics we have on the current names so we understand what kind of burden we would be putting on publishers to rename each one?

@vholland
Contributor

@tmarshbing good point. I did not look at usage when generating the list.

@vholland vholland was assigned by danbri Apr 15, 2015
@danbri danbri added this to the sdo-gozer release milestone Apr 15, 2015
@danbri danbri referenced this issue Apr 15, 2015
Closed

Meta bug for sdo-gozer release - vocab issues #418

19 of 36 tasks complete
@danbri
Contributor
danbri commented Apr 15, 2015

@vholland @rvguha @dbs @Dataliberate

We should also discuss whatever principles underlie these decisions in a form that is helpful to groups making extensions.

https://lists.w3.org/Archives/Public/public-schemabibex/2015Apr/0000.html has a rough-cut draft of a proposal from BibExtend for a bib: extension. I would expect Agent, Meeting, Newspaper, Thesis and others to perhaps raise an eyebrow from @vholland; at least they seem to have a similar level of tension between the broad interpretations that the terms suggest and the specific meanings offered.

@vholland
Contributor

+1 to discussing principles used.

The original list was generated by grepping the sdo-gozer release on that date and selecting terms I could not define on sight (AlignmentObject) or for which I could imagine competing definitions (e.g., is "Season" for TV/Radio or Sports?).

Given the initial selection criteria, there are probably terms I missed and terms which should not be deprecated and replaced.

@vholland
Contributor

I have taken the original list, broken it down, and added my recommendations. Comments are welcome.

Properties to deprecate (fewer than 100 domains)

  • assembly
  • catalog
  • course
  • diet
  • free
  • surface
  • temporal

Items to rename

| Label | Count | Proposed Name |
|---|---|---|
| Series | 100 - 1,000 domains | CreativeWorkSeries |
| Season | 100 - 1,000 domains | CreativeWorkSeason |
| Taxi | 100 - 1,000 domains | TaxiService |
| Code | 100 - 1,000 domains | SoftwareSourceCode |
| application | 10 - 100 domains | actionApplication |
| area | N/A | serviceArea |
| benefits | 100 - 1,000 domains | jobBenefits |
| branchOf | 100 - 1,000 domains | parentCorporation, parentOrganization |
| collection | 10 - 100 domains | targetCollection |
| configuration | NEW | vehicleConfiguration |
| damages | NEW | knownVehicleDamages |
| device | 100 - 1,000 domains | availableOnDevice |
| doors | NEW | numberOfDoors |
| engine | NEW | vehicleEngine |
| gears | NEW | numberOfForwardGears |
| incentives | 100 - 1,000 domains | incentiveCompensation |
| material | 100 - 1,000 domains | artMedium |
| option | 10 - 100 domains | actionOption |
| produces | 100 - 1,000 domains | serviceOutput |
| requirements | 100 - 1,000 domains | softwareRequirements |
| season | 100 - 1,000 domains | mediaSeason |
| timezone | NEW | broadcastTimezone |
| transmission | NEW | vehicleTransmission |

Items to leave as-is

These types/properties are either used by more than 1,000 domains, are aligned with another vocabulary, or I could not think of a better term.

| Label | Count | Proposed Name | Comments |
|---|---|---|---|
| Seat | N/A | Seat | |
| AlignmentObject | 10 - 100 domains | AlignmentObject | Propose keeping to align with LRMI |
| Audience | 100 - 1,000 domains | Audience | Seems OK? |
| Permit | 10 - 100 domains | Permit | Can't think of an inclusive name |
| Ticket | 10 - 100 domains | Ticket | Seems OK? |
| alignmentType | 10 - 100 domains | alignmentType | Propose keeping to align with LRMI |
| audience | 1,000 - 10,000 domains | audience | |
| comment | 50,000 - 100,000 domains | comment | Pretty heavily used |
| episode | 100 - 1,000 domains | episode | Seems OK? |
| event | 1,000 - 10,000 domains | event | |
| instrument | 100 - 1,000 domains | instrument | Intentionally abstract |
| menu | 10,000 - 50,000 domains | menu | Heavily used |
| object | 100 - 1,000 domains | object | Intentionally abstract |
| playerType | 1,000 - 10,000 domains | playerType | |
| position | 1,000 - 10,000 domains | position | |
| producer | 1,000 - 10,000 domains | producer | |
| result | 100 - 1,000 domains | result | Intentionally abstract |
| skills | 1,000 - 10,000 domains | skills | |
| specialty | 1,000 - 10,000 domains | specialty | |
| target | Over 1,000,000 domains | target | Heavily used |
| text | 250,000 - 500,000 domains | text | Heavily used |
| title | 50,000 - 100,000 domains | title | Heavily used |

Medical labels

The medical schema should be reconsidered. The following types/properties are part of that.

  • Joint
  • Vessel
  • Artery
  • Diet
  • affectedBy
  • algorithm
  • antagonist
  • aspect
  • background
  • branch
  • carrier
  • code
  • cost
  • followup
  • frequency
  • function
  • functionalClass
  • guideline
  • indication
  • intensity
  • origin
  • outcome
  • overview
  • phase
  • population
  • preparation
  • procedure
  • purpose
  • risks
  • significance
  • source
  • specialty
  • stage
  • status
  • study
  • subtype
  • tributary
  • workload
@rvguha
Contributor
rvguha commented Apr 22, 2015

Thanks Vicki.

Is there any medical term (in your list) that is used on more than 100
domains?

guha

@realworldobject

Is there some place I can go to see which terms are frequently used across domains?

Jeff

@vholland
Contributor

@rvguha

Only "specialty" was used by more than 1,000 domains.

The "aspect" and "indication" properties were used on between 100 and 1,000 domains.

The properties "code", "source", and "status" were used on between 100 and 1,000 domains, but were misused on non-medical types more often than they were used on medical types, suggesting we should consider repurposing these property names.

@vholland
Contributor

Jeff,

I'm not sure what you mean by across domains. I used the stats provided by
Guha.

- Vicki

Vicki Tardif Holland | Ontologist | vtardif@google.com

@realworldobject

I see a script to generate stats, but maybe I'm overlooking the stats themselves.

15d5158

Sorry if I got caught napping and missed them, but I'm not too proud to admit it. :-/

@danbri
Contributor
danbri commented Apr 22, 2015

The stats are read from this file, https://github.com/schemaorg/schemaorg/blob/sdo-gozer/data/2015-04-vocab_counts.txt, which buckets each term into one of ten categories (roughly, 10 = found on lots of sites and 1 = minimal; see the script/site for a bit more detail).
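To make the bucketing idea concrete, here is a minimal sketch of how a raw domain count could be mapped to a bucket from 1 to 10 and to a range label like those used in the tables in this thread. The boundaries marked "assumed" do not appear anywhere in this thread, the function names are hypothetical, and the format of the counts file itself is not modelled here.

```python
from bisect import bisect_right

# Lower bounds of buckets 2..10. A count equal to a bound falls into the
# higher bucket, except that exactly 1,000,000 stays in bucket 9 so that
# bucket 10 really means "over 1,000,000". These rules are assumptions.
BOUNDARIES = [10, 100, 1_000, 10_000, 50_000, 100_000, 250_000, 500_000, 1_000_001]

LABELS = [
    "fewer than 10 domains",        # assumed, not seen in the thread
    "10 - 100 domains",
    "100 - 1,000 domains",
    "1,000 - 10,000 domains",
    "10,000 - 50,000 domains",
    "50,000 - 100,000 domains",
    "100,000 - 250,000 domains",    # assumed, not seen in the thread
    "250,000 - 500,000 domains",
    "500,000 - 1,000,000 domains",  # assumed, not seen in the thread
    "over 1,000,000 domains",
]


def bucket(domain_count: int) -> int:
    """Return a bucket number from 1 (minimal usage) to 10 (heavy usage)."""
    if domain_count < 0:
        raise ValueError("domain_count must be non-negative")
    return bisect_right(BOUNDARIES, domain_count) + 1


def label(domain_count: int) -> str:
    """Return the human-readable range label for a domain count."""
    return LABELS[bucket(domain_count) - 1]
```

Publishing only the bucket, and not the raw count, keeps the usage data coarse while still showing how much burden a rename would put on publishers.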

@mfhepp mfhepp added a commit to mfhepp/schemaorg that referenced this issue Apr 23, 2015
@mfhepp mfhepp changed property names for configuration, damages, doors, engine, gears, and transmission as per @vholland's suggestion in issue #417
0face5f
@tmarshbing

Why would we deprecate the infrequently used properties (assembly, catalog, course, etc.)? I feel like we should rename them instead, unless we have a general policy of deprecating properties that don't get usage after some period of time.

Otherwise, the changes look good to me.

@chaals
Contributor
chaals commented Apr 26, 2015

Agree that medical should be reconsidered, and I wonder if we should be more ruthless about some of the things in the as-is list such as audience -> intendedAudience...

Data on rates of change in response to e.g. deprecating things would be useful too - if people notice and fix stuff, we could be happier about making changes than if they don't. Although we might get "drag" from tools that are unmaintained over time…

@mfhepp
Contributor
mfhepp commented Apr 27, 2015

A general comment:

> Why would we deprecate the infrequently used properties (assembly, catalog, course, etc.)? I feel like we should rename them instead, unless we have a general policy of deprecating properties that don't get usage after some period of time.

I think we should not deprecate conceptual elements solely based on lack of usage, because that might cripple the conceptual model. Rather, the sponsors should check whether they can better motivate developers to use properties and types that are underutilized. I guess the majority of adoption of certain elements comes from their appearance in some examples or instructions in the Google/Bing/Yahoo/Yandex developer documentation.

@thadguidry

@mfhepp I agree, and regardless of crippling a model, if after 4 years no one is using our hasDonkeyTailFromHell property, then we probably need to ask "Why not?", "Does the definition limit too much or not provide a clear understanding of its use?" and "Have we asked folks in the domain for enough external reviews to uncover hidden issues we might have missed?"

@vholland Medical needs to be reconsidered, but it is a two-part problem. You have Doctors and Research mixed in with Consumer Usage. We can have both, but I really feel it needs work from more than just us. I would rather see some of the work that Leeza Rodriguez has done in this area (from a Consumer point of view) that aligns with the work that Marc Twagirumukiza with Agfa (Doctors, Research, Insurance, Billing) has done already #11

For Medical, I would like to see the Consumer side dealt with first.

@danbri
Contributor
danbri commented Apr 28, 2015

@vholland can you summarize where you think we are on this, w.r.t. a proposal for sdo-gozer in which:

  • some terms are marked as supersededBy other (possibly new) terms
  • some terms get new descriptions and examples that clarify their meaning
@chaals
Contributor
chaals commented Apr 28, 2015

I don't think that we deprecate things just based on lack of usage. Although at some point we probably should. Rather, we are more reluctant to deprecate things that we should - e.g. because they are badly named for Schema.org leading to confusion or difficulty in introducing important new stuff - if there is a lot of usage.

@danbri
Contributor
danbri commented Apr 28, 2015

@chaals I agree that lack of usage is not the best motivation. Lack of usage alone (especially for new terms or those that are important but niche or concentrated onto a small number of sites) isn't sufficient. It is useful information to have in hand, and in public, but it is one factor amongst many.

The most common form of deprecation on schema.org is renaming, which we indicate as 'supersededBy' and document in a machine readable form. This is something that consuming apps could make more use of, and something we could make easier to find. Here fwiw is a quick summary of the state of supersededBy relationships from the sdo-gozer repo:

rdfa data/schema.rdfa | grep -i supersededBy

Results are in https://gist.githubusercontent.com/danbri/2f91faae84fbc8fffa76/raw/b02e93fd774e3b50139ebcb8b7ea0b06ee093e18/gistfile1.txt and are dominated by the early change we made to remove plurality from repeatable property names: https://www.w3.org/wiki/WebSchemas/Singularity
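As a rough illustration of how a consuming app could use machine-readable supersededBy data, here is a minimal sketch. The mapping below is a hypothetical excerpt built from names proposed in this thread, not the actual contents of data/schema.rdfa, and the function names are invented for the example. It also assumes flat items whose @type is a single string.

```python
# Hypothetical supersededBy pairs taken from proposals in this thread;
# the released vocabulary may differ.
SUPERSEDED_BY = {
    "Series": "CreativeWorkSeries",
    "Taxi": "TaxiService",
    "Code": "SoftwareSourceCode",
    "benefits": "jobBenefits",
    "doors": "numberOfDoors",
}


def resolve(term, superseded_by=SUPERSEDED_BY):
    """Follow supersededBy links until reaching a term with no successor."""
    seen = {term}
    while term in superseded_by:
        term = superseded_by[term]
        if term in seen:
            # Guard against a malformed mapping that loops back on itself.
            raise ValueError(f"supersededBy cycle at {term!r}")
        seen.add(term)
    return term


def modernize(item):
    """Return a copy of a flat JSON-LD-style dict using current names."""
    out = {}
    for key, value in item.items():
        if key == "@type":
            out[key] = resolve(value)    # rename the type itself
        else:
            out[resolve(key)] = value    # rename the property, keep the value
    return out
```

For example, `modernize({"@type": "Series", "name": "Example"})` yields an item typed as `CreativeWorkSeries`, while terms with no successor pass through unchanged.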

@vholland
Contributor

@mfhepp has already updated the vehicle-related names. Thanks, Martin!

The conversation seems to be around the properties to remove (deprecate without a corresponding 'supersededBy').

Given that, I propose renaming them as follows:

Items to rename

| Label | Count | Proposed Name |
|---|---|---|
| assembly | < 100 domains | executableLibraryName |
| catalog | < 100 domains | hasDataCatalog |
| course | < 100 domains | exerciseCourse |
| diet | < 100 domains | exerciseRelatedDiet |
| free | < 100 domains | isAccessibleForFree |
| surface | < 100 domains | artworkSurface |
| temporal | < 100 domains | datasetTimeInterval |

Comments on these names or the previously proposed names are appreciated. I'll create a pull request reflecting these names in the next couple of days.

@thadguidry

@vholland As a best practice going forward, I would suggest not prefixing a property with "has" or "is" unless it is a boolean type; otherwise it causes a bit of confusion by making it feel like a true/false or flag-like property. My suggestion is to rename catalog -> includedDataCatalog
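The naming rule suggested here could be checked mechanically. The following is only a sketch: the function name is hypothetical, and the range types passed in (for example, that isAccessibleForFree expects Boolean and hasDataCatalog expects DataCatalog) are assumptions for illustration, not taken from the vocabulary files.

```python
import re

# Matches a leading "has" or "is" only when followed by an uppercase letter,
# so that names such as "issn" or "hash" are not flagged.
BOOLEAN_PREFIX = re.compile(r"^(has|is)(?=[A-Z])")


def violates_prefix_rule(name: str, range_types: set) -> bool:
    """True if the name reads like a flag but its range is not purely Boolean."""
    looks_boolean = bool(BOOLEAN_PREFIX.match(name))
    return looks_boolean and range_types != {"Boolean"}
```

Under those assumed ranges, `hasDataCatalog` would be flagged, while `isAccessibleForFree` and `includedDataCatalog` would pass.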

@rvguha
Contributor
rvguha commented Apr 30, 2015

I am good with these. I also like Thad's suggestion.

guha

@mcglabs
mcglabs commented Apr 30, 2015

As per the original spirit of this thread (conflicts/generalization of properties) - I think there has to be a more elegant solution than arbitrarily extending the property names to be more descriptive and forgoing rational linear relationships.

Problem

  • Prepending the type to a property name (e.g. Vehicle[engine] -> vehicleEngine) adds context that can already be inferred by traversing the tree alone.
  • Including the valueType as part of the property name (e.g. doors -> numberOfDoors) is arbitrary, and its adoption is left to chance.

Both methods are a slippery slope to long/dependent property names.

Solutions

A possible solution may be to interpret the schema in graph form: types -> nodes; properties -> edges. Chaining the Type (node) with the property (edge) and gleaning context from ancestral lineage would enable properties to be more contextual and extendable. The problem here is that we lose the extensibility of a general vocabulary, which can be gleaned by other means, such as traversing the graph to the root to infer full context.

It seems like an uphill battle to try to walk the line between more descriptive and more general property names. We either do one or the other, not both.

Either:

a) Figure out how to apply general property names to multiple Types using case rules for Property(valueTypes, description), allowing context to be inferred from the Type name (i.e. traversal of graph relationships to determine context, and therefore meaning).
b) Intensify the effort to combine property names with type names in an axiomatic way, while restricting the scope to the nearest neighbor (e.g. Vehicle[engine] -> Vehicle[vehicleEngine], Vehicle[doors] -> Vehicle[vehicleDoors]). Strictly no superfluous nouns/verbs/adjectives/etc.

If the goal is to make Schema extensible, I feel we have to preserve context while remaining flexible, and allow for common-sense reasoning for expansion.
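
As a rough illustration of option (a), the graph reading (types as nodes, properties as edges, context inferred by walking towards the root) might look like the sketch below. It assumes a toy hierarchy: the `SCHEMA` table, the `resolve` helper, the property descriptions, and the `Game` type with its own `engine` property are all hypothetical and are not taken from schema.org.

```python
# Hypothetical toy schema: type name -> (parent type, {property name: description}).
SCHEMA = {
    "Thing": (None, {"name": "The name of the item."}),
    "Product": ("Thing", {}),
    "Vehicle": ("Product", {"engine": "The engine of the vehicle."}),
    "Car": ("Vehicle", {}),
    "Game": ("Thing", {"engine": "The software engine the game runs on."}),
}


def resolve(type_name, prop):
    """Walk from type_name towards the root and return
    (defining type, description) for prop, or None if no ancestor defines it."""
    current = type_name
    while current is not None:
        parent, props = SCHEMA[current]
        if prop in props:
            return current, props[prop]
        current = parent
    return None


# The same short name resolves differently depending on the starting node.
print(resolve("Car", "engine"))
print(resolve("Game", "engine"))
print(resolve("Product", "engine"))
```

In this sketch a short name such as engine can stay generic because its meaning comes from the type it is reached from, at the cost of every consumer needing the type hierarchy in order to interpret a property.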

@mfhepp
Contributor
mfhepp commented Apr 30, 2015

In the past, I have suggested breaking with the principle of global property names and IDs, because doing so reduces the risk of name clashes and allows for shorter property names. But this comes at the cost of adding a lot of complexity for managing and understanding schema.org, so I am fine with the current direction of adding context to the names / IDs of properties.

@danbri danbri referenced this issue Apr 30, 2015
Closed

Clean up Comment vs UserComments vs CommentAction #170

1 of 4 tasks complete
@vholland vholland added a commit to vholland/schemaorg that referenced this issue Apr 30, 2015
@vholland vholland Issue #417: Implemented new names as suggested. This commit does not touch anything under MedicalEntity.
38a9c75
@vholland
Contributor

Pull request #464 contains the names listed above, including @thadguidry's suggestion of "includedDataCatalog".

I did not remove any properties. All changes have a supersededBy statement. I also did not touch anything in MedicalEntity.
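
For publishers with existing markup, the effect of these supersededBy statements could be approximated as below. This is only a sketch under stated assumptions: the `SUPERSEDED_BY` table is transcribed from the names proposed in this thread (with includedDataCatalog in place of hasDataCatalog) and is not read from the pull request itself, and the `migrate` helper is hypothetical. It also treats items as flat dictionaries and ignores the item's type, which a real migration would need to check because the old names are generic.

```python
# Hypothetical old-name -> new-name table, transcribed from this thread.
SUPERSEDED_BY = {
    "assembly": "executableLibraryName",
    "catalog": "includedDataCatalog",
    "course": "exerciseCourse",
    "diet": "exerciseRelatedDiet",
    "free": "isAccessibleForFree",
    "surface": "artworkSurface",
    "temporal": "datasetTimeInterval",
}


def migrate(item: dict) -> dict:
    """Return a copy of a flat property dict with superseded names replaced."""
    # Keep every property that is not superseded.
    migrated = {k: v for k, v in item.items() if k not in SUPERSEDED_BY}
    for old, new in SUPERSEDED_BY.items():
        # A value already published under the new name takes precedence.
        if old in item and new not in migrated:
            migrated[new] = item[old]
    return migrated


dataset = {
    "@type": "Dataset",
    "name": "Rainfall",
    "catalog": "Open Data Portal",
    "temporal": "2014/2015",
}
print(migrate(dataset))
```

Because the old properties are kept and only marked as superseded, nothing forces publishers to run such a migration.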

@danbri
Contributor
danbri commented May 1, 2015

Looks plausible - but this is a substantive change. Everyone do please take a look over the updates proposed: https://github.com/schemaorg/schemaorg/pull/464/files

/cc @pmika @tmarshbing @chaals @ajax-als @tilid @scor

@danbri
Contributor
danbri commented May 6, 2015

I'm merging in #464 but do invite careful review of this as it is a substantive and potentially impactful change.

@danbri
Contributor
danbri commented May 6, 2015

Merging. Thanks for including a release.html entry for this, @vholland - much appreciated :)

@boanuge
boanuge commented May 12, 2015

Hi, I am curious about the renaming of the schema.org types and properties.
As I understand it, the schema.org site is managed by 4 sponsors (Google, MS, Yahoo and Yandex).
Does this mean that the naming of types and properties in schema.org needs to be agreed by all the sponsors?
I would appreciate it if anyone could explain the update (or any modification) process for the schema.org site.

@danbri
Contributor
danbri commented May 12, 2015

@boanuge reps from the schema.org partners discuss details here in github (e.g. @tmarshbing @chaals @pmika @vholland), as well as in the W3C Schema.org Community Group (e.g. https://lists.w3.org/Archives/Public/public-schemaorg/2015May/0000.html). We are in the process of setting up a new (publicly archived) Steering Group to bring more visible structure to all this.

@danbri danbri closed this May 12, 2015