Use of ids:idsValue when referencing portions of IFC schema #39

Closed
CBenghi opened this issue Jan 26, 2022 · 16 comments

Comments

@CBenghi
Contributor

CBenghi commented Jan 26, 2022

Hello,

after the group call this week, I've been pondering the use of ids:idsValue when pointing to named parts of the schema, as in:

<xs:complexType name="propertyType">
  <xs:sequence>
    <xs:element name="propertySet" type="ids:idsValue"/>
    <xs:element name="name" type="ids:idsValue"/>

@NickNisbet makes the point that this allows the inclusion of people that cannot quite control the placement of information in the schema. Which is, of course, valuable for all of us.

But the solution proposed generates two major drawbacks, in my view.

  1. I think that the point of IDS is to provide some level of confidence that the data we need can be identified with certainty, and my concern is that the use of multiple options reading several parts of the model results in a high risk of combinatorial explosion of dissimilar models.
    In my view an IDS will have to be written on the agreement between consumers and providers, and by that time a specific constraint on PropertySet names, or Classification system names (just as two examples) should be feasible.

  2. The previous state (i.e. fixed names) would allow the automation of editing helpers (e.g. production of a propertySet template) that scaffolds the data input features to enable users to fix models and help them pass the requirements.
    This becomes much more arbitrary if we think of all possible combinations of content in the idsValue structure.

At this stage I think this might be a serious misstep for IDS. I would urge the group to reconsider this choice.

As an alternative solution, I would much rather have two or three IDS options of clear behaviours that different providers must comply with, than retain the need to adopt complex extracting behaviour across all reading operations.

Thanks,
Claudio

@berlotti
Member

ids:idsValue is a choice between either a simple (fixed) value or an xs:restriction (which can be minvalue, maxvalue, pattern, enumeration, etc.).
By using existing, well used and well defined standards like the XSD restrictions, we enforce consistency in implementations.
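
To make the "choice" concrete, here is a minimal Python sketch of how a checker might evaluate an idsValue-like constraint. The `SimpleValue`, `Pattern`, and `Enumeration` classes and the `matches` function are hypothetical names used for illustration only; they are not part of IDS or of any library. Python's `re` syntax is assumed as a stand-in for XSD regular expressions, and `fullmatch` is used because XSD patterns must match the whole string.

```python
import re
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class SimpleValue:
    """A fixed value: only an exact match passes."""
    value: str


@dataclass(frozen=True)
class Pattern:
    """Stand-in for an xs:pattern restriction (Python regex syntax assumed)."""
    regex: str


@dataclass(frozen=True)
class Enumeration:
    """Stand-in for a set of xs:enumeration restrictions."""
    options: Tuple[str, ...]


Constraint = Union[SimpleValue, Pattern, Enumeration]


def matches(constraint: Constraint, candidate: str) -> bool:
    """Return True if the candidate string satisfies the constraint."""
    if isinstance(constraint, SimpleValue):
        return candidate == constraint.value
    if isinstance(constraint, Pattern):
        # XSD patterns must match the whole string, so use fullmatch.
        return re.fullmatch(constraint.regex, candidate) is not None
    if isinstance(constraint, Enumeration):
        return candidate in constraint.options
    raise TypeError(f"unsupported constraint: {constraint!r}")
```

A simple value accepts exactly one name, while a pattern or enumeration accepts a set of names, which is the property the rest of this thread debates.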

@CBenghi
Contributor Author

CBenghi commented Jan 26, 2022

I don't dispute using idsValue, that makes perfect sense.

The issue is with using it in places where a plain string (no regex or enums) offers stricter and more useful information, such as in L130 and L131.

@berlotti
Member

Again, idsValue is a choice.
You can also use simpleValue:

<ids:value>
  <ids:simpleValue>L130</ids:simpleValue>
</ids:value>

@CBenghi
Contributor Author

CBenghi commented Jan 26, 2022

yes, I fully understand that.
When I said 'L130' I meant Line 130 in the snippet above of the schema.

<xs:element name="propertySet" type="ids:idsValue"/> 

My argument is that having the option to set a regex to define the property set name is detrimental to the infrastructure of IDS. (see points 1 and 2 in the initial post).

@berlotti
Member

We discussed this yesterday in the meeting as well. There are a lot of identified use-cases that require this: https://github.com/buildingSMART/IDS/tree/master/Use-cases

@pasi-paasiala
Contributor

For BIM authoring tools we need to have an explicit value for property set and property names. With this information the authoring tools can configure the property sets for their users to fill in.
If there's a need to have a pattern for these, we could then have optional PropertySetPattern and PropertySetNamePattern to store the allowed variance.
Of course, it'd be better not to have variance in these. It is much easier for downstream applications to work with property set and property names that are fixed.
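
A rough sketch of why explicit names help authoring tools: with fixed strings, a property set template can be derived mechanically from the requirements. The `Requirement` tuple and `build_template` function below are hypothetical, and the requirement data is made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Requirement(NamedTuple):
    """A property requirement with explicit (non-pattern) names."""
    property_set: str
    name: str


def build_template(requirements: List[Requirement]) -> Dict[str, List[str]]:
    """Group required property names under their property set."""
    template: Dict[str, List[str]] = defaultdict(list)
    for req in requirements:
        # Skip duplicates so each property appears once per property set.
        if req.name not in template[req.property_set]:
            template[req.property_set].append(req.name)
    return dict(template)


requirements = [
    Requirement("Pset_WallCommon", "LoadBearing"),
    Requirement("Pset_WallCommon", "FireRating"),
    Requirement("Pset_SlabCommon", "LoadBearing"),
]
template = build_template(requirements)
```

If a requirement carried a pattern such as `Pset_(Wall|Slab)Common` instead of an explicit name, there would be no single key to create in the template, which is the difficulty described above.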

@berlotti
Member

berlotti commented Feb 2, 2022

When sending an IDS to BIM authors, the sender needs to make sure the values for properties are simple values. Many other use-cases for checking require some kind of intelligence.
IDS should support multiple use-cases to get interoperability.

@CBenghi
Contributor Author

CBenghi commented Feb 2, 2022

> It is much easier for downstream applications to work with property set and property names that are fixed.

> Many other use-cases for checking require some kind of intelligence.

Hi @berlotti,
Sure, intelligence is always welcome, but the data management workflow downstream cannot rely on that. If the IDS schema allows the use of a pattern, every implementer will have to develop a data extraction algorithm to choose which value to extract when several values match a single requirement.

Say that we have a propertyName pattern that accepts both H or height and the file contains both, what is the value that we should consider for the property?
This might be more likely than we think: since both names are in the IDS, after a couple of exchanges between parties, I'm sure that some element will have both.
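
The ambiguity can be sketched in a few lines of Python. The element data below is invented for illustration, and Python's `re` stands in for XSD pattern matching.

```python
import re

# Hypothetical properties found on a single element after a few exchanges.
element_properties = {"H": "2.40", "height": "2.50", "width": "0.90"}

# A property-name pattern that accepts either spelling.
name_pattern = "H|height"

# Collect every property whose name satisfies the pattern.
candidates = {
    name: value
    for name, value in element_properties.items()
    if re.fullmatch(name_pattern, name)
}

# Two properties satisfy the name constraint and their values disagree,
# so the requirement alone does not say which one a consumer should read.
is_ambiguous = len(set(candidates.values())) > 1
```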

If we want to guarantee consistency, the same exact selection behaviours will have to be replicated in all applications downstream; I see this as unlikely to happen.

I don't see how it benefits the IDS to allow multiple (possibly conflicting) ways to store the same data, particularly since we accept that IDS is to serve specific atomic transactions.

Can you point us to use cases that need this feature, so that we can think of an alternative solution?
Thanks,
Claudio

@aothms

aothms commented Feb 2, 2022

> Say that we have a propertyName pattern that accepts both H or height and the file contains both, what is the value that we should consider for the property?

I think that's a separate topic. It could also be the case that a psetname is not specified and the prop is found multiple times. The same goes for layered materials. It is very important that this behaviour is clearly defined, but there are many ways by which a constraint will be mapped to multiple values from the model.

> If we want to guarantee consistency, the same exact selection behaviours will have to be replicated in all applications downstream; I see this as unlikely to happen.

I'm more optimistic. This has to happen. We need to have a formal unambiguous understanding of what we encode in an IDS.

@CBenghi
Contributor Author

CBenghi commented Feb 3, 2022

@aothms,
thanks for your comment. You do have a good point with regards to the material facet.

If a difference between the two cases exists, it might be that in the case of materials the process can be defined strictly within IDS behaviours, while for patterns it might also have to consider the order of values that the user writes in the specs, which might be less intuitive for final users to understand.

@aothms

aothms commented Feb 4, 2022

We should indeed eliminate any behaviour that depends on order. Both any/exists and all predicates are order independent, but I see your point about the confusion that would arise from using any in the case of multiple properties: we're checking that any of the properties with a certain name matches, but we don't inform the downstream application, which will actually consume the information, which property to use.

I see two ways around it:

  • using all in this case, all properties matching the name query should match the value constraint. So it doesn't matter which of them in the end is picked by the downstream application.
  • using any but also standardizing a way so that downstream applications (from the BCF report?) know which property to take. This doesn't sound very realistic.

I see two cases for patterns in psets/props:

  • Pset(Wall|Slab)Common.LoadBearing. You can reuse the same phrase, "Element is load bearing", regardless of the applicability. Could be handled in the IDS authoring software as well if it has enough knowledge on psets, or if you use some substitution syntax in your editor so that phrases can reference each other.
  • Known incompatibilities between authoring tools. If you know one tool can only deliver X, and the other can only deliver Y. In this case you also know you'll never get both X and Y. So it might make sense to specify X|Y if your downstream app can handle both.

Both can be handled I think if we settle on all for the handling of multiple properties matching the name/pset pattern.
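
The difference between the two predicates can be sketched as follows. This only illustrates the idea and is not how any IDS checker is confirmed to work; `check_property` and the sample data are hypothetical, and value constraints are simplified to exact string comparison.

```python
import re
from typing import Callable, Dict, Iterable


def check_property(
    properties: Dict[str, str],
    name_pattern: str,
    expected_value: str,
    predicate: Callable[[Iterable[bool]], bool],
) -> bool:
    """Check the value constraint on every property whose name matches.

    `predicate` is the built-in `all` or `any`. At least one property must
    match the name pattern, otherwise the requirement fails.
    """
    results = [
        value == expected_value
        for name, value in properties.items()
        if re.fullmatch(name_pattern, name)
    ]
    return bool(results) and predicate(results)


# Hypothetical elements carrying both alternative property names.
consistent = {"X": "TRUE", "Y": "TRUE"}
conflicting = {"X": "TRUE", "Y": "FALSE"}
```

With `any`, `conflicting` passes even though a downstream application that happens to read `Y` gets a value that violates the constraint. With `all`, it fails, so whichever matching property is picked is known to satisfy the constraint.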

@pasi-paasiala
Contributor

Discussion on April 5th: Property set and property names should be expressed as explicit strings. Agreement between @pasi-paasiala @CBenghi @rubendel @MatthiasWeise and Robin.

@pasi-paasiala
Contributor

pasi-paasiala commented Apr 5, 2022

> Discussion on April 5th: Property set and property names should be expressed as explicit strings. Agreement between @pasi-paasiala @CBenghi @rubendel @MatthiasWeise and Robin.

This should also apply to entity types and attribute names.

@berlotti
Member

There does not seem to be consensus yet on this issue. Hoping to get some more input here.

@Moult
Contributor

Moult commented May 31, 2022

> This should also apply to entity types and attribute names.

If applied to entity names, then it becomes impossible to say that a list (enumeration) of IfcClasses have a requirement. This is a pretty fundamental usecase of IDS. So I am against this.

If applied to attribute names, that makes it much harder to handle schema differences where attributes have been renamed between versions (in particular Identification, ItemId, ...). So I am also against this.

Addressing the two original concerns:

> ... my concern is that the use of multiple options reading several parts of the model results in a high risk of combinatorial explosion of dissimilar models.

Here are some test cases covering the combinatorial explosion you speak of: https://github.com/IfcOpenShell/IfcOpenShell/blob/0dfe1d995f25ce6b519a82e57e62361cea108b21/src/ifcopenshell-python/test/test_ids.py#L767-L800 - the assumption here is that "all" shall pass.

> The previous state (i.e. fixed names) would allow the automation of editing helpers (e.g. production of a propertySet template) that scaffolds the data input features to enable users to fix models and help them pass the requirements. This becomes much more arbitrary if we think of all possible combinations of content in the idsValue structure.

If the recipient has information requirements that state they need exact pset names and property names, then they should not use the restrictions (pattern / enum) in their IDS.

If the recipient simply wants to check a pset or prop naming scheme, prefix convention (e.g. no custom psets, all must be Pset_ or Qto_), etc, these restrictions are valid. Common for a company standard, or government standard.

If the recipient needs to check multiple Pset_[a-z]+Common psets (LoadBearing, FireRating, AcousticRating, Status especially) or base quantities, these restrictions are valid.

If the recipient is expecting generated psets with variable properties to be processed by their system, these restrictions are valid.

If these five usecases (entity names, attribute names, namespaces, *common, and generated props) are invalid, then by all means drop the feature. But I don't think they are all invalid.
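
As an illustration of the naming-scheme and *Common cases, a pattern check might look like the sketch below. The pset names are sample data, and `re.fullmatch` approximates XSD's whole-string pattern matching. The character class is widened to `[A-Za-z]` on the assumption that the intended names (e.g. `Pset_WallCommon`) contain capital letters, which a literal `[a-z]+` would not match.

```python
import re

# Prefix convention: no custom psets, every name must start with Pset_ or Qto_.
PREFIX_PATTERN = r"(Pset_|Qto_).+"
# The family of *Common psets (character class widened, see the note above).
COMMON_PATTERN = r"Pset_[A-Za-z]+Common"

# Hypothetical property set names found in a model.
pset_names = [
    "Pset_WallCommon",
    "Pset_SlabCommon",
    "Qto_WallBaseQuantities",
    "ACME_Custom",
]

# Names that break the prefix convention.
violations = [n for n in pset_names if not re.fullmatch(PREFIX_PATTERN, n)]
# Names that belong to the *Common family.
common_psets = [n for n in pset_names if re.fullmatch(COMMON_PATTERN, n)]
```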

@Moult
Contributor

Moult commented May 31, 2022

I would like to note that I personally have just written about 70 IDS specs (and counting, still a WIP) and I have yet to need xs:restrictions on attribute names, pset names, and prop names. So at least right now, it seems as though I haven't needed to use them yet - though I suspect the usecase of "applicable to things with a fire rating" or "applicable to anything of a status of demolished / new / whatever" is probably needed. I'm also curious on what usecases @berlotti is referring to here: #39 (comment)

@berlotti berlotti closed this as completed Aug 8, 2022