Multiple Birth/Death/Gender "Conclusions" #117

Closed
EssyGreen opened this issue Feb 5, 2012 · 20 comments

@EssyGreen

My understanding of the Conclusion Model is that it will allow multiple births, deaths and genders (in a similar way to the existing GEDCOM 5.x). However, this is illogical for a "Conclusion" ... If we conclude that Person ABC was born at a certain date/place then we cannot possibly also conclude that they were born at a different date/place. We can conclude that a bundle of evidence suggests one or the other, and we can prefer one or other of these evidence bundles, but a person cannot possibly be born more than once! Similarly, a Person cannot die more than once (although it may be possible for their heart to stop more than once) and can only have a single gender at birth. I would recommend that trans-sexual events would be better dealt with as bespoke facts. Cremations are also single events, and the vast majority of burials would be singletons (although I accept that it is physically possible to be buried more than once).

If we are to support multiple Births/Deaths/Genders etc in the Conclusion model then a better description would be a "Hypothesis" rather than "Conclusion". And a Hypothesis could include multiple generic facts (but only a single birth, death and gender) which form a specific scenario being tested out by the researcher.

Conversely, if the Conclusion Model is really intended to reflect Conclusions then it should not permit impossible situations and hence there need to be two types of fact: repeatable facts and non-repeatable facts (singletons within the Person).
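
The split between repeatable and non-repeatable facts can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Fact` and `Person` classes, the `SINGLETON_TYPES` set, and the choice to treat gender as just another fact type are hypothetical and are not taken from the GEDCOM X model.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List

# Assumption: which fact types count as singletons is a policy choice,
# not something defined by GEDCOM X.
SINGLETON_TYPES = {"Birth", "Death", "Gender", "Cremation"}


@dataclass
class Fact:
    type: str
    value: str


@dataclass
class Person:
    name: str
    facts: List[Fact] = field(default_factory=list)


def singleton_violations(person: Person) -> Dict[str, int]:
    """Return each singleton fact type that appears more than once, with its count."""
    counts = Counter(fact.type for fact in person.facts)
    return {t: n for t, n in counts.items() if t in SINGLETON_TYPES and n > 1}


p1 = Person("P1", [
    Fact("Birth", "1850, London"),
    Fact("Birth", "1851, Leeds"),
    Fact("Residence", "1861, London"),
    Fact("Residence", "1871, York"),
])
print(singleton_violations(p1))  # {'Birth': 2}
```

The two residences pass because "Residence" is repeatable, while the second birth is reported as a conflict.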

@stoicflame
Member

Fair enough.

I do think that there are various ways where it makes sense to have multiple conclusions for a "non-repeatable fact".

The first is that there can be distinct conclusions made by two different contributors.

The second is that there can be distinct conclusions of varying degrees of confidence. We could argue semantics and say that anything less than a specific level of confidence isn't a "conclusion", but the question for this thread is what's the best way to create a model to support these concepts.

So I like the model as just a list of facts because it's simpler and it gives implementations the choice on how to apply their own business rules. If an implementor doesn't want to allow more than one birth conclusion on a person, then they can do that (the FamilySearch FamilyTree will be implemented like that). If an implementor wants to allow the notion of a "hypothesis", then that can be provided as another conclusion with low confidence.
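
A rough sketch of how one shared list of conclusions could serve implementations with different business rules. The `Conclusion` class, the numeric `confidence` scale, and both policy functions are hypothetical and only illustrate the idea; they are not the GEDCOM X API or the FamilySearch implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Conclusion:
    type: str
    value: str
    contributor: str
    confidence: float  # assumed 0.0-1.0 scale, purely illustrative


def add_strict(conclusions: List[Conclusion], new: Conclusion) -> List[Conclusion]:
    """Policy A: refuse a second conclusion of the same type."""
    if any(c.type == new.type for c in conclusions):
        raise ValueError(f"{new.type} conclusion already present")
    return conclusions + [new]


def preferred(conclusions: List[Conclusion], fact_type: str) -> Optional[Conclusion]:
    """Policy B: keep every conclusion and surface the most confident one."""
    candidates = [c for c in conclusions if c.type == fact_type]
    return max(candidates, key=lambda c: c.confidence, default=None)


shared = [
    Conclusion("Birth", "1850, London", "alice", 0.9),
    Conclusion("Birth", "1851, Leeds", "bob", 0.3),  # a low-confidence "hypothesis"
]
print(preferred(shared, "Birth").value)  # 1850, London
```

Both policies read the same exchanged data; only the application-level rule differs.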

@EssyGreen
Author

I do see your point but I'm afraid I don't agree ...

"distinct conclusions made by two different contributors." - in this case surely the solution is to branch off into separate trees, each with his/her own hypothesis. To allow two contributors to continue to build a single tree along conflicting lines would be mayhem (both to them and to their readers).

"there can be distinct conclusions of varying degrees of confidence" - certainly, but in this case is it not easier to leave the conflicting areas out of the fact altogether (or, if you prefer one over the other, give the details of the preferred one) but reference both positive and negative/conflicting sources with a textual evaluation in the "Proof Statement"?

I believe that if we are calling it a Conclusion Model then logically we should not encourage scenarios which are impossible in the real world. Also we should be aiming to make the process of arriving at a good "Proof Statement" easier.

For example, suppose I find two possible birth records for Person P1: one where the father is F1 and the other where the father is F2. I now look for other records to work out which is better, and say I find two census records which could be P1 with F1 and P1 with F2 ...

Using the "duplicate facts" approach I now have P1 with 2 births, 2 fathers and 2 censuses (muddled in with all the other facts I already know about P1). I may also have F1 and F2 both having the same P1 child. I look for more records to prove one way or another and P1 (not to mention F1 and F2) just gets to look messier and messier.

Using a "hypothesis" approach I would have 2 branches (in separate trees/files) covering both possibilities, each with a different set of Persons: H1.P2, H1.F1 and H2.P3, H2.F2. The original P1 remains in the original tree/file and the birth fact references both H1 and H2 files as sources with an evaluation of which is preferred (and which is not) and why in the "Proof Statement". All of these can now be researched independently without confusing which fact relates to which hypothesis. If/when a clear conclusion is reached then I can merge the research from H1 or H2 into my original file.
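
The hypothesis workflow described above might look roughly like the following sketch. The `Tree` and `BirthFact` classes and the `conclude` function are hypothetical names invented for illustration, and the research notes are placeholder strings; nothing here comes from GEDCOM X itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Tree:
    """One tree/file: person ids mapped to the research notes held for them."""
    name: str
    persons: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class BirthFact:
    """The single birth fact on P1, citing hypothesis trees as its sources."""
    sources: List[str]
    preferred: Optional[str] = None
    proof_statement: str = ""


def conclude(fact, hypothesis, reason, main_tree, subject, target):
    """Record the preferred hypothesis, then merge its research into the main tree."""
    if hypothesis.name not in fact.sources:
        raise ValueError(f"{hypothesis.name} is not a source of this fact")
    fact.preferred = hypothesis.name
    fact.proof_statement = reason
    main_tree.persons[target].extend(hypothesis.persons[subject])


main = Tree("main", {"P1": []})
h1 = Tree("H1", {"P2": ["birth record naming F1", "census with F1"], "F1": []})
h2 = Tree("H2", {"P3": ["birth record naming F2", "census with F2"], "F2": []})
birth = BirthFact(sources=["H1", "H2"])

# Once the evidence favours H1, record why and merge H1.P2's research into P1.
conclude(birth, h1, "Census age and parish match the F1 record.", main, "P2", "P1")
print(birth.preferred, main.persons["P1"])
```

Until `conclude` is called, P1 carries one birth fact that references both hypotheses, and each hypothesis tree can grow without mixing its records into the other.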

@stoicflame
Member

Very cool. Sounds like you've got a great product in mind.

I'm trying to understand how defining the model as a simple list of conclusions is preventing you from implementing your product as you describe. I can understand how a different model would be easier to work with in the product you describe, but GEDCOM X isn't really about making all products work the same way. Instead, it's about enabling products that work differently to share and exchange data with each other.

You've done your own analysis of the needs of your users and come to some good conclusions about how you want your product to behave. But different people will want their products to behave differently according to the preferences of their own users.

So I'm trying to look for how the model as it's defined today in GEDCOM X is insufficient for the purposes of data exchange. Can you specify?

@EssyGreen
Author

I totally agree and understand that GEDCOMX should allow multiple implementations/applications etc., but I feel that the current model makes linking a single Person to a single Persona extremely difficult (if not impossible), yet this link is fundamental (not just to my model but to any genealogical research: all the facts etc. depend on the Person/Persona link being accurate).

  • the Conclusion is only an abstract class for Facts, Gender and Names. There is no provision to "conclude" anything else, and hence no way to "conclude" that a Person is (or is not) a Persona.
  • I can see that as the Person is a GenealogicalResource it can have an Attribution (and hence a proof statement) but I can't see how I could link Persona(s) into that
  • Since the Person is a GenealogicalEntity I could add the Personas as Sources but then where would I put the proof statements?

If I'm missing something obvious please let me know :)

@EssyGreen
Author

Actually I apologise ... the problem I just described in answer to your post is actually more related to Issue #120.

You are quite right in that I can just flag up multiple births, genders, mothers and fathers as errors/problems but I am disappointed that the new GEDCOMX will encourage people to come to impossible conclusions. At least in GEDCOM 5 there was no attempt to call it a "Conclusion".

@stoicflame
Member

> You are quite right in that I can just flag up multiple births, genders, mothers and fathers as errors/problems but I am disappointed that the new GEDCOMX will encourage people to come to impossible conclusions. At least in GEDCOM 5 there was no attempt to call it a "Conclusion".

Fair enough.

Personally, I don't have a problem with using the term "conclusion" because I think it makes sense that there can be multiple equally-valid-but-conflicting conclusions about e.g. someone's birth. What term would you use to describe a "thing that is a conclusion and/or a hypothesis"? Maybe "assertion"? There's a lot of baggage with that term...

@EssyGreen
Author

As you may have guessed from reading my other posts I tend to like the word "Interpretation" - for me this makes it clear that it has been derived from something.

But given issue #138 this may now be a moot point?

@jralls
Contributor

jralls commented Feb 16, 2012

I don't think adopting #138 would make it moot, but I think #134 addresses it rather elegantly.

@EssyGreen
Author

> #134 addresses it rather elegantly.

Agreed.

However, my argument for the single birth/gender/parents remains unresolved.

@jralls
Contributor

jralls commented Feb 16, 2012

> However, my argument for the single birth/gender/parents remains unresolved.

I thought that Tom addressed that in #134, but if you're still unsatisfied with that model, perhaps better to address it there.

@EssyGreen
Author

I may have missed something but I don't see anything in #134 which ensures that each Person conforms to a real situation (i.e. a single mother, father, gender, death, birth).

@jralls
Contributor

jralls commented Feb 16, 2012

Hmm. Perhaps I misunderstood what you're looking for here. Do you want a uniqueness constraint on certain types of fact, event, and relationship? How would you implement that in a data model (as opposed to application-level validation)? Would your implementation work in @ttwetmore's n-level model?

@EssyGreen
Author

I can't really lay down the details until we've sorted out the Roles vs Relationships and whether we're having a Conclusion and a Record Model or not but basically I would have the Gender, Mother and Father as properties of the Person and derive special types of Fact/Event for the Birth, Death.

I believe it will fit fine with the N-tier approach - providing that we also allow the N-tier source approach (see #136 and my last comment in #80) which allows for multiple hypotheses to be defined within a single source/context.

I am aware that I am probably in the minority here since the old way of allowing duplicates is firmly entrenched but I feel that encouraging people to think about using Hypotheses as a means of reaching a Conclusion rather than just lumping conflicting Facts together (without any need to explain the conflicts) would be worthwhile.
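
One way to picture single-valued gender/mother/father properties alongside special Birth and Death events is the sketch below. The `Event` and `Person` classes and all field names are hypothetical, and the mother and father are shown as plain ids only because their exact type (Person, Role or Relationship) is left open in this discussion.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    type: str
    date: str = ""
    place: str = ""


@dataclass
class Person:
    name: str
    # Single-valued slots: at most one of each by construction.
    gender: Optional[str] = None
    birth: Optional[Event] = None
    death: Optional[Event] = None
    mother: Optional[str] = None  # assumption: an id; could equally be a role/relationship
    father: Optional[str] = None
    # Repeatable facts stay in an ordinary list.
    other_facts: List[Event] = field(default_factory=list)


p1 = Person("P1")
p1.birth = Event("Birth", "1850", "London")
p1.birth = Event("Birth", "1851", "Leeds")  # replaces the earlier value, never adds a second
p1.other_facts.append(Event("Residence", "1861", "London"))
p1.other_facts.append(Event("Residence", "1871", "York"))
print(p1.birth.date, len(p1.other_facts))  # 1851 2
```

With this shape a second birth cannot be stored next to the first, so a competing birth has to be kept somewhere else, for example in a separate hypothesis.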

@jralls
Contributor

jralls commented Feb 18, 2012

> I would have the Gender, Mother and Father as properties of the Person and derive special types of Fact/Event for the Birth, Death

I'm OK with sex (gender is for nouns) as a property (attribute in UML-speak), but not parents. Parents, or rather the parental relationship, needs to be a first-class GenealogicalEntity so that it can carry the attribution and other stuff. You've argued elsewhere for having a single class for events and relationships, so I guess you could have attributes for mother, father, birth, and death of type event with constraints limiting the person's role (and the other person's role and gender in the case of parents) as appropriate for each.

The implementing code would be called business logic in a normal database application. It's not really something with which one would want to burden the serialization code.

@EssyGreen
Author

Sex is an activity - gender is your genes :)

> the parental relationship, needs to be a first-class GenealogicalEntity so that it can carry the attribution and other stuff.

I specifically avoided saying what a "Mother" was because of the uncertainty about whether we're having Relationships or Roles or whatever ... you could have Mother as a Role/Relationship rather than a Person.

> You've argued elsewhere for having a single class for events and relationships

Yes but that can be abstract and derivatives more specific (e.g. Mother, Parent or whatever)

> The implementing code would be called business logic

The validation is in the business logic but the data is in the data model.

At the data level we are happy to specify a pointer property for say a "Contributor" in an Attribution, so why the reluctance to have a pointer property for a mother/father/birth etc?

@jralls
Contributor

jralls commented Feb 18, 2012

> Sex is an activity - gender is your genes :)

Better go have another look at your OED. "Gender... 3. Sex. Now only jocular" (their emphasis). Language is 2.

> I specifically avoided saying what a "Mother" was because of the uncertainty about whether we're having Relationships or Roles or whatever ... you could have Mother as a Role/Relationship rather than a Person.

I'm saying that the "mother" and "father" attributes must be a Role/Relationship and must not be a person, because a direct pointer to a person wouldn't have a place for the attribution attribute.

> At the data level we are happy to specify a pointer property for say a "Contributor" in an Attribution, so why the reluctance to have a pointer property for a mother/father/birth etc?

No reluctance as long as the pointer is to an arc, carrying an attribution, between the people. The contributor attribute on an attribution isn't a genealogical statement needing a proof, but rather the modern researcher providing the proof on the genealogical statement to which it's attached. An assertion of parentage, on the other hand, is the most basic genealogical statement that we make.
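
A minimal sketch of a parent pointer that targets an arc rather than a person, so that the assertion of parentage can carry its own attribution. The classes `Attribution`, `ParentChild` and `Person`, and the helper `set_parent`, are hypothetical and greatly simplified; they are not the GEDCOM X types of the same or similar names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attribution:
    contributor: str
    proof_statement: str


@dataclass
class ParentChild:
    """The arc between two people; the genealogical statement needing proof."""
    parent_id: str
    child_id: str
    role: str  # "mother" or "father"
    attribution: Attribution


@dataclass
class Person:
    id: str
    mother: Optional[ParentChild] = None  # points at the arc, not at a person
    father: Optional[ParentChild] = None


def set_parent(person: Person, rel: ParentChild) -> None:
    """Attach the arc to the matching single-valued slot on the child."""
    if rel.child_id != person.id:
        raise ValueError("relationship is for a different child")
    if rel.role not in ("mother", "father"):
        raise ValueError(f"unknown role: {rel.role}")
    setattr(person, rel.role, rel)


p1 = Person("P1")
set_parent(p1, ParentChild("F1", "P1", "father",
                           Attribution("researcher-1", "Named as father in the birth record.")))
print(p1.father.parent_id, "-", p1.father.attribution.proof_statement)
```

The slot stays single-valued, which gives the uniqueness asked for, while the proof lives on the relationship rather than on either person.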

@EssyGreen
Author

> Better go have another look at your OED

LOL let's not go there!

> the "mother" and "father" attributes must be a Role/Relationship and must not be a person, because a direct pointer to a person wouldn't have a place for the attribution attribute.

Like I keep saying, the detail of exactly how they are defined is a step further down the road and not worth going into unless the basic need for it is established. You seem to be arguing that the reason for not doing it is because it isn't possible. I'm just trying to illustrate that it is possible but obviously we would need to further define the details of how it was implemented.

I think I will make a separate post regarding hypotheses since this underpins my argument but seems to be absent in both the model and the forum discussions (except when I'm wittering on about it!)

@jralls
Contributor

jralls commented Feb 18, 2012

> Like I keep saying, the detail of exactly how they are defined is a step further down the road and not worth going into unless the basic need for it is established.
>
> You seem to be arguing that the reason for not doing it is because it isn't possible.

Not a bit. I have no real issue with the top-level requirement that class Person have some sort of uniqueness constraint at some level on sex, parentage, birth, and death.

I do wonder if it's necessary to enforce that constraint in the interchange file format, and I'm quite concerned that we might wind up with a way of enforcing it that is either ugly or harmful.

@EssyGreen
Author

Fair enough - in that case can we pause this post until there is clarity on the Roles/Relationship issue (and also preferably on the 1 vs 2 model issue)?

@jralls
Contributor

jralls commented Feb 18, 2012

> in that case can we pause this post until there is clarity on the Roles/Relationship issue

OK
