Relationships are N-Squared: Provide for Shared Events #134

6 participants

@ttwetmore

This is a comment on the current Record Model.

If the record being extracted holds a multi-role event (for example, say the record is a family in a census), then there typically are relationships between each pair of persons mentioned in the record. This is an N-squared situation. A family group of 5 persons requires potentially 20 relationships, a group of 6 persons potentially requires 30 relationships and so on. I believe that where these potential N-squared situations arise in a model, it behooves one to do something about it, especially if it is trivial to do so, as it is in this case.
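To make the counts above concrete, here is a minimal Python sketch. It assumes that each ordered pair of persons counts as one potential relationship (A-to-B and B-to-A are distinct), which is the reading that gives 20 for a group of 5. The function names are illustrative only.

```python
def pairwise_relationships(n: int) -> int:
    """Ordered person-to-person pairs in a group of n people: n * (n - 1)."""
    return n * (n - 1)


def role_pointers(n: int) -> int:
    """One role-typed pointer per person when all share a single event record."""
    return n


# Compare the two growth rates for a few household sizes.
for n in (5, 6, 10):
    print(n, pairwise_relationships(n), role_pointers(n))
```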

I have always advocated not using the relationship concept directly at this point in a record level model. Instead, I believe it is better to think of the record object in this case as an "event" record that points to the persona records, with each pointer specifying the role the person has with respect to the event. In all important cases, knowing the roles that two different persons have with respect to a single event allows an easy inference of their relationship with respect to each other.

This converts the extraction of information from multi-role genealogical records from an O(N-squared) process into an O(N) process. And in the process it prepares the data for easier data processing.

The Record object in the current model could contain these "role-typed" persona pointers, in which case the Record object describes the event in more actual fidelity, and the Relationship object is not needed.
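A rough sketch of what a Record with "role-typed" persona pointers might look like, in Python. The class name `EventRecord`, the role vocabulary, and the inference table are all hypothetical; none of them come from the GEDCOMX model. The point is only that N pointers are stored and a person-to-person relationship is looked up from a pair of roles when it is needed.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EventRecord:
    """A record-level event whose pointers to personas are typed by role."""
    event_type: str
    roles: dict[str, str] = field(default_factory=dict)  # persona id -> role


# Hypothetical table: (role of A, role of B) -> relationship of A to B.
ROLE_PAIR_TO_RELATIONSHIP = {
    ("bride", "groom"): "wife of",
    ("groom", "bride"): "husband of",
    ("child", "father"): "child of",
    ("child", "mother"): "child of",
    ("father", "child"): "father of",
    ("mother", "child"): "mother of",
}


def infer_relationship(event: EventRecord, a: str, b: str) -> Optional[str]:
    """Infer A's relationship to B from their roles; None if no inference applies."""
    return ROLE_PAIR_TO_RELATIONSHIP.get((event.roles[a], event.roles[b]))


baptism = EventRecord("baptism", {"p1": "child", "p2": "father", "p3": "mother"})
print(infer_relationship(baptism, "p2", "p1"))
```

Pairs with no entry in the table (here, father and mother) yield `None` rather than a guessed relationship.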

I don't believe the Relationship object should be removed from the overall model however, since later, when conclusions are being built, it can be very important to be able to conclude that a particular relationship existed between two people (say they were cousins of an as yet undiscovered type).

@EssyGreen

I would prefer the Relationship object to be removed in favour of the Roles discussed here and in Issue #118 because I think this makes for a cleaner and simpler model. But I'd be happy to compromise providing we got the Roles.

@ttwetmore

EssyGreen,

If you do have a chance to check the deadends model, you will see that relationships are accommodated. They are not separate objects, however. Each of the two persons in the relationship refers to the other with a relationship reference that also states the relationship role. A relationship is therefore just a two way pointer between two persons that has the info needed to understand the roles and any other facts that are INTRINSIC TO THE RELATIONSHIP but not intrinsic to the persons per se.

And if you know your GEDCOM by heart you'll recognize that the old ASSO tag has much the same semantic intent.

So sure, keep the relationship, but add those roles!

@EssyGreen

Hehe yes I do know my GEDCOM by heart :)

The problem I always had with ASSO tho' was that there was no guarantee that the link was two-way and since multiple ASSOs for the same person/person were completely possible there was no way to understand which linked to which. So from that point of view I would rather have an object that holds them all together in context ... and as we've agreed I think we both see that as an Event with multiple Roles.

@jralls

Yes, relationships are often N^2, but by-and-large most of them can be derived rather than stored. For example, sibling relationships can be derived from shared parental relationships.
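As a small illustration of deriving rather than storing, the Python sketch below computes sibling pairs from stored child-to-parent links. The ids and the function name are made up for the example, and "sibling" here is assumed to mean sharing at least one parent, so half-siblings are included.

```python
from collections import defaultdict
from itertools import combinations

# Stored (child, parent) links; the ids are hypothetical.
parent_links = [
    ("anna", "john"), ("anna", "mary"),
    ("ben", "john"), ("ben", "mary"),
    ("carl", "john"),
]


def derive_siblings(links):
    """Return unordered pairs of children who share at least one parent."""
    children_of = defaultdict(set)
    for child, parent in links:
        children_of[parent].add(child)
    pairs = set()
    for children in children_of.values():
        # Sorting gives each pair a canonical order, so duplicates collapse in the set.
        pairs.update(combinations(sorted(children), 2))
    return pairs


print(sorted(derive_siblings(parent_links)))
```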

As for roles vs. relationships in the Record section, ISTM both are necessary. For example, in the US census from 1880, the relationship of each enumerate to the nominal "head of household" is recorded. The role for each persona in that case is "enumerated" and the relationship should be recorded separately. Note that at the Record level, the implied relationships should not be recorded. Inferences belong in the Conclusions section.

@EssyGreen

I agree re the siblings, ancestors, descendants etc in the Conclusion model but the Record Model should enable all the relationships to be stored as they were recorded e.g. exactly like you've specified in your census example. The shared events with Roles seems to me to be the most effective way of doing this. see Issue #118

However, I also believe that in creating a Record the very act of breaking the original record down into these man-made objects is a process of Interpretation (one step further than a Transcription or Translation which is already a basic Interpretation) .... albeit one constrained by the bounds of the Record. Bearing this in mind I have no problem with Interpreting events which are not explicit in the original but are implied. For example, if an age and date of event are given I don't have a problem Interpreting an approximate date of birth; if a woman is recorded in the original as a widow I see no problem in Interpreting a previous marriage event to an unknown man etc. I would however love to have a field which enabled me to record whether the event/role was explicit or implicit but I can always build that into my own model.

As I mentioned elsewhere (can't find it right now) I also believe that there should be the ability for a Record to have multiple interpretations. For example, in the past a "step" relation was often used in the same way that the term "in-law" is used today - with totally different consequences to the relationships mentioned. The ability to document both interpretations and then follow through to a single conclusion would be advantageous in these situations.

@ttwetmore

Relationships between persons (conventionally called relationships) and relationships between persons and events (conventionally called roles) are both key concepts in a genealogical data model. At times people will argue that you could get rid of one or the other, but you can’t; they are two different concepts, each required in different contexts.

One can then ask how relationships and roles are to be represented in a data model or a database. In a relational database implementation, both relationships and roles would almost certainly be implemented (“normalized”) as two simple tables. This doesn't imply that they would be separate objects in the model itself. In my models I have never "blessed" the relationship or the role concepts to be full-bodied objects; I have not made them "first class" citizens. I feel strongly about this, but others feel as strongly the other way (Sarah argued earlier that relationships are first-class citizens).

Back up. Consider the three forms of genealogical data. First the model, which really isn't data. Second an archive or transmission file format for the data. The data in the file should correspond exactly with the concepts in the model (otherwise there is no point in having a model!). If the model has person objects then the file format should have person records, and so on. Then there is the database format; there need be no direct relationship between the data model/external format and the database format. If the database is an RDBMS then the external data (file format) would presumably be normalized into tables during import, and unnormalized back into "objects" on the way out. Personally I like what used to be called network databases, and which these days seem to be called document-based databases (e.g., Mongo) for genealogical databases. With these databases you can keep the same object structure in the database as you keep in the model and the file. This feels right to me so I have always done it that way.

My N-squared argument is deep in here. The Record Model does not include events. The result of this is that there is no role concept in the record model (obviously, because the concept of a role depends on the concept of an event). The result of this is that all relationships (meaning the superclass concept; that is, relationships AND roles) must all be treated as relationships between persons. This is inappropriate for some records, and I gave the example of a census record. As another example consider a marriage record that names two witnesses and the preacher-man. Witnesses especially are very important clues, so you want personas for them (I wouldn’t argue that you really need a persona for the preacher). Given you want the witnesses as personas, how, without roles, are you going to hook them up to any other records in the database in a meaningful way? You clearly need roles for this. The GEDCOMX Record Model does not have roles. It should. They convert an N-squared representation to a more meaningful N representation in many situations. They are Very Good Things.

@EssyGreen

Sarah argued earlier that relationships are first-class citizens

Did I? Oops I thought I was arguing for Roles! (Or maybe that was Roles as Relationships or Relationships as Roles - eek!). Your comments on the differences (and hence the need for both) are interesting ... but I'm still not sure both are necessary. If a person is a "wife" the implication is that there was a marriage event to a "husband"; conversely, if the event is a marriage with roles of "bride" and "groom" the implication is that their relationship was "wife" and "husband" after the event. I personally don't have a problem interpreting between the two but I find it easier to link multiples (more than 2 people) together via an event e.g. say a Census specifies a Householder, Wife, Sister and Son. With one Event and 4 Roles we've got the lot. If instead we have to define the relationships then we have the N-squared scenario or loss of data/context (e.g. Householder<->Wife, Householder<->Sister, Householder<->Son gives us 3 Relationships but tells us nothing about the relationship between "Sister" and "Son" without comparing/evaluating Householder<->Sister and Householder<->Son). For a Record where you want the data to be as transparent as possible I think this makes it less clear hence my argument for Roles.

Conversely in the Conclusion Model I would argue for the simplicity of each Person having a (single) Mother and a (single) Father and anything else being up to the user to specify if they wish. See Issue #131.

Like I said before I think we're in agreement that we need Roles and Relationships - but not necessarily in the same context/model!?

@EssyGreen

Actually I think I'm going to scupper my own argument here ... If you wanted to draw a tree based on inheritance law (at least in the UK) then you would need to specify a date which the tree represented and show different parents if a child was adopted depending on that date. So if a child was born in 1900 with parents A+B but was adopted in 1910 to parents X+Y; the tree for 1900 would show a completely different set of ancestors to one which represented the situation in 1910 or after. This is something I'd love to see done but not sure I would personally have the time to develop it. Anyway, in order to do it the application would have to store the parents as Roles with the event (much like the FAMC links in old GEDCOM BIRThs, ADOPtions etc).
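The date-dependent tree described above can be sketched in Python as follows. This assumes that the most recent birth or adoption event on or before the chosen date determines the parents shown; that rule is a modelling assumption for the example, not a statement of inheritance law. The class name, the function name, and the exact days are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ParentageEvent:
    kind: str                  # "birth" or "adoption"
    when: date
    child: str
    parents: tuple[str, str]   # the parent roles attached to this event


def parents_as_of(events, child: str, as_of: date) -> Optional[tuple[str, str]]:
    """Return the parents from the latest parentage event on or before as_of."""
    applicable = [e for e in events if e.child == child and e.when <= as_of]
    if not applicable:
        return None
    return max(applicable, key=lambda e: e.when).parents


events = [
    ParentageEvent("birth", date(1900, 1, 1), "child", ("A", "B")),
    ParentageEvent("adoption", date(1910, 1, 1), "child", ("X", "Y")),
]
print(parents_as_of(events, "child", date(1905, 6, 1)))
print(parents_as_of(events, "child", date(1912, 6, 1)))
```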

For this reason I would argue in favour of Roles in both Record and Conclusion models.

I would however, maintain my argument that there cannot be multiple births (or deaths) in the Conclusion Model.

@jralls

I agree with Tom about the need to have separate objects/fields/elements/whatever for roles and relationships, and disagree strongly with Sarah that the relationship to the "head of household" is a role in a census event.

Tom, the record model does include events, they're just bundled up in Fact. If you look at the enum FactType in gedcomx-types/.../FactType.java, you'll see that most of the values are events with only a few characteristics mixed in. There's a big argument about doing this in #84 and #85.

@ttwetmore

My apologies, Sarah, for misunderstanding your points.

One comment about your point of not having multiple birth and death events in the Conclusion Model person record.

In the DeadEnds model, which has what is called a multi-level or an n-tier person model, person records (for the same individual) can be constructed into trees, with the higher nodes representing conclusions based on evidence and conclusions from the lower nodes, and the leaves being what GEDCOMX calls personas. All the nodes are simply examples of person records. They are brought together into trees because the user decides they all represent records that mention the same real individual. So the higher level persons inherit the birth and death events of the person nodes below them. This makes sense since the tree represents the bringing together of all the information the user believes applies to individual individuals (smiley). But we know that there is a high likelihood of errors and inconsistency in the various raw records of the events that we find during research.

When you get to the “top of the person node tree,” which is what in the GEDCOMX terminology is a person record in the Conclusion Model, you have a record that may have many sub-person and persona records hanging below it. If those lower down person records have a fully consistent story to tell about birth and/or death, then no birth or death info need be added to the top person. The top person “inherits” the information from below. However, if there are inconsistencies, then the user of this wonder system will have to specify which of the birth/death events from below should apply to the individual as a whole. Of course this doesn’t have to be the same as any of the exact events from below, but an event constructed from info from any of the events below. The choosing of an event, or the construction of a composite event, is a “conclusion” made by the user, so must be justified in the root, individual level person. And of course this conclusion making can actually be made at any level of the tree as it grows.
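The "inherit unless inconsistent" behaviour described above might be sketched like this in Python. It is not DeadEnds code; `PersonNode` and its methods are hypothetical, and birth information is reduced to a single string for brevity. Lower nodes are held by reference only.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PersonNode:
    name: str
    birth: Optional[str] = None   # value stated (persona) or concluded at this node
    below: list["PersonNode"] = field(default_factory=list)  # referenced lower nodes

    def effective_births(self) -> set[str]:
        """This node's own birth value if set, else everything inherited from below."""
        if self.birth is not None:
            return {self.birth}
        found: set[str] = set()
        for node in self.below:
            found |= node.effective_births()
        return found

    def needs_conclusion(self) -> bool:
        """True when the nodes below disagree and no conclusion has been recorded here."""
        return len(self.effective_births()) > 1


census = PersonNode("census persona", birth="1850")
baptism = PersonNode("baptism persona", birth="1854")
individual = PersonNode("individual", below=[census, baptism])
print(individual.needs_conclusion())
individual.birth = "abt 1852"   # the user's conclusion overrides what is inherited
print(individual.effective_births())
```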

Imagine the “two-tier” model of New Family Search trees. In that system, persona records (they are never officially called that there, though that’s what they are) are gathered from many sources, including users. Then users of the system can join these personas together into individuals, and they can specify which of the facts from which of the personas they want to be displayed at the individual level. Other users can later rearrange the personas into different sets of individuals, and can change which events from which of the personas to display for the individuals.

So the New Family Search system is a two-tier person system. I believe it more logically should be changed to an n-tier system. As a two-tier system it is logical to talk about a Record Model and a Conclusion Model. But once you “graduate” to an n-tier system it becomes impossible to maintain the illusion that there should be two fixed layers in a genealogical data model.

@EssyGreen

No apologies needed @ttwetmore - my fault for not being clear!

When you get to the “top of the person node tree,” which is what in the GEDCOMX terminology is a person record in the Conclusion Model, you have a record that may have many sub-person and persona records hanging below it.

I don't agree that a Person has "sub-Person(a)s" - they may reference a bunch of Personas but I don't believe that a concluded Person is or should be a component of a Person represented elsewhere.

When different "Personas" are referenced and linked to a single "Person" I don't believe that the Concluded Person is the same instance of the Persona (since it is being interpreted in a different context - maybe in a different hypothesis) nor do I believe that the Concluded Person should "inherit from" any of the instances of the Personas. However, the Facts, Roles, Relationships or whatever recorded as linked to the Persona are available (indirectly) for the researcher to assess. For example, say I have 2 records P1 and P2 with conflicting birth events 1850 Bedminster and 1854 Bishport ... I create a P3 with a new concluded birth event say Abt 1852 Bedminster which takes into account (and cites) P1 and P2. The new event is not an instance of nor inherited from P1 or P2 but belongs to P3. However, because P3 links to P1 and P2 (taking aside negative complexities for a moment) then if the application wishes to represent the un-concluded data they can simply "show through" the data from the cross-referenced P1 and P2. But from a data model point of view it would be incorrect to hold the events of P1 and P2 against P3.

@EssyGreen

@jralls - going back to the Roles vs Relationships debate, I think this one is interesting so can we thrash it out a bit? I think ...

  • a Role is the part a single person plays in a single event or relationship

  • a Relationship is a state of "connection" between two or more people which may be bound by time (and possibly other factors I can't think of!)

Because a Relationship can be bound by time (and possibly place e.g. having a holiday romance maybe?) it is simply an Event. For example, a husband/wife relationship = a marriage event (NB - not the same as the wedding or the marrying event!)

@EssyGreen

This probably explains why my Interpretations are wider than the event being recorded e.g. the Census event for Date D1 above with P1 (householder), P2 (wife), P3 (sister) and P4 (son), I would interpret as:

E1: Residence with Roles P1 (householder), P2, P3, P4 on D1
E2: Marriage: P1 (husband), P2 (wife)
E3: Birth: P4 (child), P1 (father), P2 (mother)
E4: Birth: P1 (child), P5 (father), P6 (mother)
E5: Birth: P3 (child), P5 (father), P6 (mother)
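The interpretation listed above could be written down as data along the following lines. This Python sketch adds an `explicit` flag of the kind wished for earlier in the thread (recording whether an event was stated in the record or inferred). The `Event` class, the "resident" role given to P2, P3 and P4 in E1, and the treatment of P5 and P6 as placeholders for unnamed parents are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    label: str
    kind: str
    roles: dict[str, str]        # persona id -> role in this event
    explicit: bool               # True if stated in the record, False if inferred
    date: Optional[str] = None


def interpret_census(census_date: str) -> list[Event]:
    """Return the E1..E5 interpretation listed above for the four-person household."""
    return [
        Event("E1", "Residence",
              {"P1": "householder", "P2": "resident", "P3": "resident", "P4": "resident"},
              explicit=True, date=census_date),
        Event("E2", "Marriage", {"P1": "husband", "P2": "wife"}, explicit=False),
        Event("E3", "Birth", {"P4": "child", "P1": "father", "P2": "mother"}, explicit=False),
        Event("E4", "Birth", {"P1": "child", "P5": "father", "P6": "mother"}, explicit=False),
        Event("E5", "Birth", {"P3": "child", "P5": "father", "P6": "mother"}, explicit=False),
    ]


for event in interpret_census("D1"):
    print(event.label, event.kind, "explicit" if event.explicit else "inferred")
```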

@ttwetmore

I hope I don't misinterpret. The person trees I mention are only built up via references. No record is ever a component of another. Trees are built, trimmed, and rearranged by manipulating pointers or ids. Compose does not imply contain.

Sounds like you might prefer 2-tier systems over n-tier ones, but I'm not sure. The key concept in a 2-tier system is individuals are composed (doesn't imply a containment) from all the persona records that hold evidence about them. It is the genealogist's job to figure that out and establish the proper relationships between the persons and the personas. In most current systems there are no persona records; genealogists just add facts to conclusion person records, with each fact referencing the specific item of evidence/source it was taken from. Woe be to the users when they discover they added facts from another real person to one of their conclusion persons.

New Family Search is a 2-tier system so it seems natural that GEDCOMX would therefore be 2-tier. However, I haven't figured out in the GEDCOMX model how the two models connect. How does a genealogist, when putting a person record into the conclusion model, refer to the persona records from the record model that justify the person record? The answer is probably very simple; I just don't see it yet.

Obviously I believe the n-tier system is best. Much of this belief comes from software I have written in the past that automatically compares records and then automatically links them together. Algorithms like this can be very complex and have to be organized into phases that use various statistical techniques. Because there are phases, the linking of records occurs in stages where each stage conceptually takes some persona records and adds them to some of the person records that are being built up by the process. By using an n-tier system one can fully track the operation of the phases, and therefore reconstruct the full history, the full set of operations that linked the persons together. This is important -- with 2-tier systems you lose the history of conclusion making, in fact you lose the details of ALL your conclusions -- with n-tier systems you maintain the full history of conclusion making, and that history is fully reversible, that is, you can cleanly "unmake" any of your conclusions. This one property, for me, demands the n-tier system. In one application I wrote the linking as a 2-tier system first, and later was forced to convert it to an n-tier system in order to be able to present and make any sense of what the linking really means. What started as simply a "debugging" aid led me to understand the necessary properties of an n-tier system in any application where the number of records reaches into the 100s of 1,000s or beyond. The applications I worked on had record counts in the 100s of 1,000,000s.
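One way to picture the reversible history is the small Python sketch below. It is not the software described above; the `LinkLog` class and its methods are hypothetical. It assumes each linking step creates a new higher node over nodes that are currently roots, and that steps are undone in reverse order.

```python
class LinkLog:
    """Records each linking decision as a new parent node so it can be undone."""

    def __init__(self):
        self.parent = {}    # node id -> id of the conclusion node directly above it
        self.history = []   # stack of (new node id, member ids, phase name)

    def link(self, new_id, member_ids, phase):
        """Group existing root nodes under a new higher-level node."""
        for member in member_ids:
            self.parent[member] = new_id
        self.history.append((new_id, tuple(member_ids), phase))

    def undo_last(self):
        """Cleanly unmake the most recent linking decision."""
        new_id, members, _phase = self.history.pop()
        for member in members:
            del self.parent[member]
        return new_id

    def root_of(self, node_id):
        """Follow references upward to the current top-level person."""
        while node_id in self.parent:
            node_id = self.parent[node_id]
        return node_id


log = LinkLog()
log.link("n1", ["persona_a", "persona_b"], phase="exact-name match")
log.link("n2", ["n1", "persona_c"], phase="statistical match")
print(log.root_of("persona_a"))
log.undo_last()
print(log.root_of("persona_a"))
```

Because every step is kept on the stack, the intermediate node `n1` survives when `n2` is undone, which is the information a 2-tier merge would have discarded.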

When I say inherit I mean the conclusion persons inherit the properties of their persona records (limiting thinking to a 2-tier system). All I mean is that there is no need to copy any properties/facts from the persona records, into their conclusion record, if you believe the information in the persona is correct and there is no conflicting information in any of the other persona records. If you believe, by whatever means, that either the information in the personas is incorrect, or there is conflicting information in the personas, then you must add information at the person level to resolve the issue. And of course, the adding of this information has to be seen and treated as making a conclusion.

Some of what you have written still confuses me some, but I sense that there is much we hold in agreement.

@jralls

E3: Birth: P4 (child), P1 (father), P2 (mother)
E4: Birth: P1 (child), P5 (father), P6 (mother)
E5: Birth: P3 (child), P5 (father), P6 (mother)

That's going way out on a limb. Censuses seldom provide more than one child-parent relationship, and unless the census also documents the length of the marriage one cannot assume that the non-HoH (usually the wife) is the other parent of the children.

@EssyGreen

@ttwetmore

Sounds like you might prefer 2-tier systems over n-tier ones

No, I totally agree with you re the N-tier - I was trying to keep it simple in the discussion :)

How does a genealogist, when putting a person record into the conclusion model, refer to the persona records from the record model that justify the person record?

I agree this needs clarifying - one for Ryan :)

there is no need to copy any properties/facts from the persona records, into their conclusion record

I totally agree - I think I just got confused by your use of the word "inherit"

Some of what you have written still confuses me some, but I sense that there is much we hold in agreement.

I agree - I think we have a bit of a semantics problem - mostly seem to be saying the same things in different ways :)

@jralls

  • a Role is the part a single person plays in a single event or relationship

  • a Relationship is a state of "connection" between two or more people which may be bound by time (and possibly other factors I can't think of!)

Because a Relationship can be bound by time (and possibly place e.g. having a holiday romance maybe?) it is simply an Event. For example, a husband/wife relationship = a marriage event (NB - not the same as the wedding or the marrying event!)

OK. That's pretty much the same argument Ryan makes in #84 when he justifies collapsing events and characteristics into facts. From a data modelling standpoint relationships need to be a separate class so we can use them to link our descendency graphs.

I'm not really keen on atomizing evidence into a bunch of separate records, though I see the attraction for FamilySearch with their built-in trees. It's very difficult to get all of the evidence, especially the contextual evidence (great-great-grampa's wife was Mary. Oh, look, in the census before they got married, there's a Mary of the right age two houses over. Hmm, what can we find out about her?)

@EssyGreen

@jralls

That's going way out on a limb

Why? How would you interpret it?

Censuses seldom provide more than one child-parent relationship

Surely not? There are many instances where there are grandparents, sisters, cousins etc etc. In my opinion these are better Interpreted within the scope of the record before being linked in the Conclusion model and that's the benefit of derivatives .... I get a (derived) photocopy of the original; I might Transcribe it (into another derived record), if in say Latin then I might also Translate (derived from the Transcription) then I Interpret (into Facts/Roles etc) from any one of these into another derived record. If the record was ambiguous then I might have more than one Interpretation. Finally I can create a Concluded record and link to the derived interpretation. If anything goes astray I can trace all the way back up to see the level of detail I need and determine where it went wrong.
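The chain of derivatives described above can be modelled as records that each point at what they were derived from. A minimal Python sketch, with a hypothetical `Derivative` class and `trace_back` helper:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Derivative:
    kind: str
    derived_from: Optional["Derivative"] = None


def trace_back(record: Optional[Derivative]) -> list[str]:
    """Walk from a record back through everything it was derived from."""
    chain = []
    while record is not None:
        chain.append(record.kind)
        record = record.derived_from
    return chain


original = Derivative("original")
photocopy = Derivative("photocopy", original)
transcription = Derivative("transcription", photocopy)
translation = Derivative("translation", transcription)
interpretation = Derivative("interpretation", translation)
conclusion = Derivative("conclusion", interpretation)

print(trace_back(conclusion))
```

An ambiguous record would simply have two `interpretation` derivatives pointing at the same translation or transcription.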

unless the census also documents the length of the marriage one cannot assume that the non-HoH (usually the wife) is the other parent of the children.

You can Interpret it to mean that and/or create multiple interpretations for the different possibilities and/or you can leave it out. It is up to the individual researcher. Providing you keep the layers which the interpretation was taken from there is no problem. I don't see this as any more problematic than say Ancestry parsing the data into Name/Date/Place fields (and hence incurring errors such as mis-reading handwriting or mis-understanding where the places are).

@EssyGreen

@jralls

From a data modelling standpoint relationships need to be a separate class so we can use them to link our descendency graphs

Can you explain? I don't understand

@ttwetmore

I pretty much agree with Sarah's view that census events imply other events, and that we should infer them. She may be going out on a limb but it is a statistically sound limb. There will be errors introduced, but there will be far more birth events to work with.

(I have some census-handling software that takes census records and generates the residence and birth events exactly as Sarah has summarized them. Note on these constructed birth events it is usually possible to add the birth places of the parents, even if their names are not known. Plus we can often estimate the date of the marriage of the head of household and spouse [thus eliminating John's concern in some cases], and we often know extra info based on the "number of children/number of living children" fields for married women.)

I think that when we leave the comfortable world of easy person-based genealogy and we are far enough back that we have to enter the more challenging world of records-based genealogy, our work becomes ridden with errors and inconsistencies that we must learn to make sense of. I don't know how (or even if) it is possible to make a convincing statistical argument for this, but I believe that the value of the addition of vast numbers of inferred birth events (with parents assigned) from census records as outlined by Sarah, will greatly outweigh the problems introduced by sometimes having one or the other or even both of the parents wrong.

All records-based genealogy involves learning to cope with errors and inconsistencies in the records. We cannot reject a significant source of valuable records just because we know they introduce a certain percentage of errors.

@EssyGreen

Exactly so :)

On a side-line my great fear at the moment is the propagation of highly inaccurate data through sites which (for commercial reasons) are combining social networking and genealogy. I fervently hope that GEDCOMX can lay down some standards that will encourage users to think (e.g. interpret, analyse, hypothesise and conclude) rather than just bulk copy (which usually breaches data integrity, privacy and copyright in one fell swoop!).

@ttwetmore

John,

Thanks very much for straightening me out on where the event concept resides in the record model. I'll think about that and maybe comment later.

@jralls

E3: Birth: P4 (child), P1 (father), P2 (mother)
E4: Birth: P1 (child), P5 (father), P6 (mother)
E5: Birth: P3 (child), P5 (father), P6 (mother)

That's going way out on a limb.

I pretty much agree with Sarah's view that census events imply other events, and that we should infer them. She may be going out on a limb but it is a statistically sound limb. There will be errors introduced, but there will be far more birth events to work with.

Yes, facts recorded in census records can imply other events. Where Sarah goes out on a very weak limb is the assumption that P2 is the mother of P4 and that P1 and P3 share both parents. Neither of those relationships is implied by the census record at hand: There are far too many cases where a man's current wife is not the mother of (all) his children, and to assume otherwise is very bad practice.

Censuses seldom provide more than one child-parent relationship, and unless the census also documents the length of the marriage one cannot assume that the non-HoH (usually the wife) is the other parent of the children.

Surely not? There are many instances where there are grandparents, sisters, cousins etc etc. In my opinion these are better Interpreted within the scope of the record before being linked in the Conclusion model and that's the benefit of derivatives

In all of those cases the census documents a single relationship, that to the HoH. I have no problem with inferring an event that's directly supported by a document: Obviously, if a person is enumerated, that person was very likely born some time before the census (I haven't yet found any cases where a census taker enumerated someone who didn't exist, but I wouldn't completely rule it out, either) -- maybe even at the time indicated by the recorded age (though my great-grandmother Ralls is 43 in both the 1910 and 1920 censuses). If the person in question is listed as the HoH's son, well we can conclude that he probably is, but that tells us nothing about who was his mother. If the son is 10 and the census records that the parents have been married 12 years, we might, in the absence of other evidence, infer that the wife is indeed the son's mother, but we shouldn't have too much confidence in that until we find some corroborating evidence.
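The age-versus-marriage-length reasoning above amounts to a weak plausibility check, sketched here in Python. The function name is hypothetical, and treating a child's age equal to the length of the marriage as consistent is an assumption. A `True` result is not proof of anything; it only says the recorded numbers do not contradict the inference.

```python
from typing import Optional


def wife_may_be_mother(child_age: int, years_married: Optional[int]) -> Optional[bool]:
    """Weak check: could the child have been born within the current marriage?

    Returns None when the census does not record the length of the marriage,
    because then the record offers no basis for the inference either way.
    """
    if years_married is None:
        return None
    return child_age <= years_married


print(wife_may_be_mother(10, 12))    # son is 10, parents married 12 years
print(wife_may_be_mother(14, 12))    # child predates the recorded marriage
print(wife_may_be_mother(10, None))  # marriage length not recorded
```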

@ttwetmore

John,

I understand what you are saying. You are more concerned with avoiding errors, and I am more concerned with adding data. This argument boils down to where along the spectrum of more data versus an increasing error ratio one is comfortable living. As you point out you can never guarantee that the spouse in a household record is the other parent of the children. If the children are young the probability goes up. If the record says how long the spouses have been married (inferable in many records) and the children fit that time period then the probability goes up. But you can never know for sure. As you can never know for sure that the head is one of the parents either. There is nothing absolutely true in genealogical data other than direct eye witness evidence. I know exactly when each of my parents died because I was at their bedsides each time. My Dad's grave marker has the wrong year.

The question here is what is more valuable -- data with a known ratio of errors -- or no data at all -- or what error rates are acceptable to use the data?

There are objective criteria that can be used to determine what ratio of errors is acceptable. That is, we don't have to just argue about it on theoretical, "errors are bad," terms. Errors can be measured, and the effect of those errors can be measured against the accuracy of the overall conclusions made in establishing family groups and pedigrees. I can't give you a formula for doing these tests now, but they are certainly possible, and certainly have to be completed before we can truly find the error ratio we should be willing to accept.

@jralls

Tom,

Genealogy isn't about data. We have artifacts, including documents, that provide evidence. We search diligently to find all of the artifacts about a person that we can, we carefully analyze the artifacts to extract the evidence that the artifacts provide directly, and we write a proof argument (or a set of proof arguments) about that evidence in which we discuss the quality of the artifacts that we've found, the credibility of the informants, the proximity of the artifacts to the events that they record, and so on. We draw conclusions from the evidence, explaining why we prefer some evidence to other, conflicting evidence when that arises. We synthesize the results into a biographical sketch of the subject, and either write the sketch in prose or slice it up into little pieces and enter it into our genealogy programs. It would be nice if those programs were written to help with the process or even just to document it instead of just recording the final conclusions, but none of them are.

The only evidence directly contained in the census record in Sarah's example is the relationship of each enumeratee to the HoH. I said earlier that I'm not keen on slicing up evidence, but obviously FS and Ancestry have to do so in order to allow us to search the records. If they assign P2 as P4's mom, though, and she isn't, that's going to hurt searchability for researchers who already know who P4's mom is and include the correct information on their search forms.

It's fine -- necessary, even -- to infer events for which there is no direct evidence. The place for recording those inferences is in the proof argument, not in the record of evidence.

The error you're introducing here is not the kind that works well with statistical analysis of what's an acceptable error rate. That applies to randomly distributed errors like typing errors or misreading handwriting. Well designed processes are aimed at minimizing errors. For example, FS has two people index records independently, and any differences are reviewed by a third independent arbitrator.

The other counter argument to your "acceptable error" is garbage in, garbage out. There's enough garbage in the original records. It's stupid to add more.

@jralls

From a data modelling standpoint relationships need to be a separate class so we can use them to link our descendency graphs

Can you explain? I don't understand

If I said "family tree" instead of "descendancy graph", would it be clearer?

It's easier to illustrate the construction of the tree/graph in the model if the objects used to form the arcs on the graph are in their own class rather than a subset of some other class. That could be accomplished by making Relationship a subclass of Fact or an unrelated class, whatever makes the model easier to understand.

That doesn't mean that it needs to be a separate class in the implementation. The goal is different there, to balance performance, storage density, and code maintainability (not necessarily in that order! ;-) ).

@ttwetmore

John,

Thanks for your detailed response. You make excellent and conventional arguments, but we see things from different perspectives. For me genealogy is all about data, and I am very interested in bringing computing techniques to bear upon the problem of discovering links between persons mentioned in different record sets, in a rational manner based on firm mathematical and statistical principles, finding algorithmic techniques that recognize errors in a statistically significant way. Most genealogists have no faith that such computing techniques would ever be able to provide such a valuable service, or are not even conscious that sophisticated computing techniques might exist to help them. For the processes that I imagine, one must accept the presence of errors, because they are there, so one devises statistically sound methods to minimize their impact. When generating birth events from census records I would want to generate records both with only the head of household as a single parent and with head and spouse as two parents, and run detailed experiments to find out which set of data does a better job of family recognition and pedigree generation, that is, whether the advantages of having two parents outweigh the disadvantages of the additional errors. You simply cannot state by fiat which will be better; one must find out. For me it is an engineering decision, not a theoretical one. I want to figure out the best way that genealogical computing can help recognize the same human beings as they are manifested in different record sets.

@jralls

Tom,

OK, that's a completely different, and very interesting, approach. Considering that very little genealogical evidence is available in digital form and that the contextual data needed to properly evaluate the evidence is seldom encoded, I suspect you're pushing the envelope just a bit. Good luck with it though.

That said, I don't think that a long-term research project a galaxy or two away from mainstream genealogy should drive a standard interchange format.

@ttwetmore

John, Thanks. I also agree that long-term research projects should not drive the transport/archive formats. Being able to support the n-tier person model is all that my ideas need, however, and there is already some pressure to add this support from other areas. Adding that capability is as simple as allowing person references to occur in person records. No real impact on the data model (other than adding one line!).

@jralls

So just

index dae11cc..0069d67 100644
--- a/gedcomx-conclusion/src/main/java/org/gedcomx/conclusion/Person.java
+++ b/gedcomx-conclusion/src/main/java/org/gedcomx/conclusion/Person.java
@@ -45,6 +45,7 @@ public class Person extends GenealogicalEntity implements Pers
   private List genders;
   private List names;
   private List facts;
+  private List componentPersons;

   /**
    * Living status of the person as treated by the system. The value of this pr

(along with a getter and setter, of course, since there aren't constructors, and some (de)serialization code)?

Or do you mean the equivalent in Persona.java? I don't see a PersonReference type anywhere, so I'm guessing that you mean (in C++) &Person.

I think as I understand this thing right now, it would make more sense in Record than Conclusion, but I still don't grok how the two fit together, so I might well change my mind about that. Other than that I have no issue with it.

@ttwetmore

John, I interpret your last as a question to me. I am sorry but I have no knowledge of the existing GEDCOMX code. I am behind the ball. If I were asked I would say it is too early to be writing code to implement a model that has just been opened up for public comment and discussion. It seems like putting the cart before the horse. But I wasn't asked.

In XML I would give the example thus:

<person id="fdsakjlfdskljalkj">
  <name> ...</name>
  ...
  ...
  <personRef id="adafadfadfadfaf"/>
  <personRef id="lklkjfdadlkflkjlkj"/>
</person>

@jralls

OK, but for now the code is the model. See #114.

There's no "personRef" in the model, so I take it in this case the id attribute contains the value of some other person element's id attribute.

@ttwetmore

John, How naive of me. I thought the model was the two diagrams and the descriptions of its objects! You are right about the id's in my example.

@EssyGreen

It's easier to illustrate the construction of the tree/graph in the model if the objects used to form the arcs on the graph are in their own class rather than a subset of some other class.

Easier, yes, but an application can get round that by building its own index. The indexing method needed by each individual application should not be part of the standard, since different applications will need to model the data in different ways to achieve optimum performance etc., i.e. we should be thinking here of the logical data model, not the physical.

@EssyGreen

I think the main area of disagreement here is whether an "Interpretation" should be in the Record or Conclusion model and hence would be resolved by addressing issue #138.

My own view is that there is no difference whatsoever since there is no "truth" out there. Think of a Venn diagram with the real person's life in one circle and everything that has been recorded about it in the other circle (whether or not it actually happened). The bit in the middle is a very small bit representing what actually happened and was accurately recorded (and preserved). My (personal) preference as a genealogist is to expand this middle bit to give a deep view of the person (I think this is also what @ttwetmore was saying). All records carry within them the interpretations and prejudices and experiences of the people involved in creating them and the social context in which they were set. To assume that the original record is a "truth" is naive. To convert it into different interpretation(s) based on knowledge of history, society, the context etc. is to illuminate the reader (and makes it a heck of a lot more interesting too!).

My only requirements for this is that the model allows for multiple "Interpretations" of the same source and the ability to link a Person (in one interpretation) with a Person (from another interpretation) together with a justification/explanation of why they are thought to be the same person.

@ttwetmore

My recommendation is to remove the distinction between the Record Model and the Conclusion Model, and replace it with the N-tier Model. This both simplifies the overall model, by removing analogous objects from the two models, and increases the power and applicability of the overall model. The Record and Conclusion Models "drop out" of the N-tier model by simply using the N-tier model strictly in 2-tier mode. The N-tier model removes the requirement to force all information about persons into pure record versus pure conclusion form.

Alternatively only the Conclusion Model could be made N-tier, which enables a better capturing of the conclusion making process. Here's an example.

You have three personas, P1, P2 and P3, from, say, three city directories naming a person with the same name living at the same address with the same occupation over a period of five years; the P3 record also states that the person removed to another city. You reasonably conclude that these personas refer to the same person, so you create a conclusion person, CP1, that joins the three personas. Later in your research you find a census record from the city where P3 removed to that seems to be the same person, and you create persona P4 for this person. Say that you eventually conclude that P4 is another record of the same person as in P1, P2 and P3. In the current Record and Conclusion Models world, you would have to add P4 to CP1, and you would have to modify the proof statement or the conclusion statement, whatever you choose to call it, to justify this. Note that there is only one proof statement for CP1 so the modified proof statement becomes lengthy and complex, since it must justify bringing the directory records together, and justify adding the census record to them. However, in an N-tiered system you could do the more logical thing of creating a new conclusion person, CP2, and have it refer to CP1 and P4. The proof statement in CP1 remains fully intact, as it only refers to the city directories, and the proof statement in CP2 does only what is important for it, which is to justify that the removed person is the same as the person in the original city.

Note these two very important features of the N-tier system in this example.

  1. The full history of your conclusion-making is inherent in the structure of the tree of records; and the full history remains retrievable forever.
  2. Reversing a decision (deciding that one or more of your conclusions were wrong) is trivial -- all you do is remove one conclusion person somewhere in the tree. Reversing decisions in the 2-tier Record and Conclusion model can be much messier. In the example just given, where the 2-tier system would have put all four of P1, P2, P3 and P4 in the single CP1, you would have to change the contents of CP1, removing the reference to P4, but you would also have to REWRITE your proof statement; you would also have to physically remove any references to the sources and citations for P4, if any of them stuck around after you removed the reference to P4.

This example was based on the idea of a single level of personas and a multi-level of conclusion persons. But it's much simpler and cleaner to get rid entirely of the distinction, and call them all persons. Yeah, it's much more likely that the persons at the leaves of the trees are personas, but the model doesn't need to make that distinction. Persona level objects will refer to their sources, which will make it clear where they come from, and conclusion objects will have conclusions or proof-statements, which will declare what they are.
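To make the city-directory example concrete, here is a minimal sketch of an N-tier person tree. The `Person` class, its field names, and the source and proof strings are hypothetical illustrations, not part of the GEDCOM X model or code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    """One node in an N-tier person tree (hypothetical structure)."""
    id: str
    source: Optional[str] = None   # filled in on persona-level (leaf) persons
    proof: Optional[str] = None    # filled in on conclusion-level persons
    components: List["Person"] = field(default_factory=list)

    def personas(self) -> List["Person"]:
        """Return every leaf persona reachable from this person."""
        if not self.components:
            return [self]
        leaves: List["Person"] = []
        for child in self.components:
            leaves.extend(child.personas())
        return leaves


# Three directory personas and one census persona (placeholder sources).
p1 = Person("P1", source="city directory, first year")
p2 = Person("P2", source="city directory, second year")
p3 = Person("P3", source="city directory, third year")
p4 = Person("P4", source="census, second city")

# CP1 joins the directory personas; CP2 joins CP1 and the census persona.
cp1 = Person("CP1", proof="Same name, address and occupation.",
             components=[p1, p2, p3])
cp2 = Person("CP2", proof="The removed person matches the census person.",
             components=[cp1, p4])

print([p.id for p in cp2.personas()])  # ['P1', 'P2', 'P3', 'P4']
# Reversing the later decision means discarding CP2; CP1 is untouched.
print([p.id for p in cp1.personas()])  # ['P1', 'P2', 'P3']
```

In this sketch the history of the conclusions is the shape of the tree, and each proof statement only has to justify its own join.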

@EssyGreen

+1 which leads us back to #138

However I would prefer to link the persons one at a time e.g. P2=P1 then P3=P2 (assuming I found them in order of P1, P2, P3) since this is easier to model in a relational implementation (ie it's just a link table with two Ps and a proof statement).
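A rough sketch of that pairwise alternative, assuming a link table whose rows are two person ids plus a proof statement. The `clusters` helper and the proof text are made up for illustration; it groups persons that the links declare to be the same, using a small union-find.

```python
from typing import Dict, List, Set, Tuple

# Each row mirrors a link-table record: (person_a, person_b, proof statement).
links: List[Tuple[str, str, str]] = [
    ("P2", "P1", "Same name, address and occupation."),
    ("P3", "P2", "Same name, address and occupation."),
]


def clusters(person_ids: List[str],
             links: List[Tuple[str, str, str]]) -> List[List[str]]:
    """Group person ids connected by 'same person' links."""
    parent: Dict[str, str] = {pid: pid for pid in person_ids}

    def find(x: str) -> str:
        # Follow parent pointers to the group representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, _proof in links:
        parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for pid in person_ids:
        groups.setdefault(find(pid), set()).add(pid)
    return sorted(sorted(g) for g in groups.values())


print(clusters(["P1", "P2", "P3", "P4"], links))
# [['P1', 'P2', 'P3'], ['P4']]
```

Removing one row from `links` undoes one decision, which is the pairwise counterpart of removing one conclusion person from a tree.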

Also ...

Persona level objects will refer to their sources, which will make it clear where they come from, and conclusion objects will have conclusions or proof-statements, which will declare what they are.

It sounds like you mean that concluded persons won't refer to the source(s) they came from but I believe that all Persons would provide a reference (either directly, or indirectly via their Facts/Events/Roles or implicitly by being contained/embedded in one) to the source in which they are contained/described. The source for the concluded person is just the project/file/tree/hypothesis (or whatever we want to call it) that the researcher has created it in.

I would also vote for leaving an optional proof statement in each Person - personally I find it useful to document the complex trail at times just as an aide memoire tho' I understand that not all users would want to.

@ttwetmore

It sounds like you mean that concluded persons won't refer to the source(s) they came from but I believe that all Persons would need a cross-reference (either directly, or indirectly via their Facts/Events/Roles) to the source in which they are contained/described. The source for the concluded person is just the project/file/tree/hypothesis (or whatever we want to call it) that the researcher has created it in.

Each persona level person refers to its sources; always; the personas are always there and their sources are always there. Each higher level (in a tree) person refers to its lower level persons and personas and to its own conclusion/proof statements. Therefore every higher level person refers indirectly to all the sources of all the lower level personas. So all sources are available at all times by any kind of software processing the data.

An issue arises when lower level person nodes contain conflicting data that you wish to resolve in a higher level person. The best way, IMHO, to do this is through an inherit and override method, pretty much as done by New Family Search, which uses a 2-tier record and conclusion model. In other words you don't have to put any attributes of persons into higher level person objects unless you need to either resolve or correct inconsistencies or errors in the information found lower in the person tree. You always have to put a proof statement in a higher level person, but you only have to put other "fields" in them if you need to resolve or integrate or correct information from deeper in the tree. For example you might want to give the conclusion person a birth event with the date taken from one of the lower level persons and a place taken from another. The concluded and added birth event would still be taking its sources from the sources of the person records supplying the data. It works by recursion "all the way down."
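The inherit-and-override idea can be sketched as a recursive lookup. This is only an illustration under assumed names (`Person`, `facts`, `resolve`) with invented fact values; it is not how New Family Search or GEDCOM X implements it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Person:
    """A person node that may override facts found lower in its tree."""
    id: str
    facts: Dict[str, str] = field(default_factory=dict)
    components: List["Person"] = field(default_factory=list)

    def resolve(self, key: str) -> List[str]:
        """Return this node's own value if set, else the distinct inherited ones."""
        if key in self.facts:
            return [self.facts[key]]      # override wins
        values: List[str] = []
        for child in self.components:     # otherwise recurse "all the way down"
            for value in child.resolve(key):
                if value not in values:
                    values.append(value)
        return values


# Two lower-level persons with conflicting birth dates (invented values).
p1 = Person("P1", facts={"birth_date": "about 1910"})
p2 = Person("P2", facts={"birth_date": "1909", "birth_place": "place B"})

unresolved = Person("CP0", components=[p1, p2])
resolved = Person("CP1", facts={"birth_date": "about 1910"},
                  components=[p1, p2])

print(unresolved.resolve("birth_date"))  # ['about 1910', '1909']
print(resolved.resolve("birth_date"))    # ['about 1910']
print(resolved.resolve("birth_place"))   # ['place B']
```

The higher-level person stores only the field it needs to settle; the place is still inherited from the lower-level person that supplied it.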

Too bad this stuff sounds so complicated. It is actually very simple, clean and easy to handle. It basically adds the research dimension to genealogical data models. And you get this capability by unifying the two person object concepts and then allowing person objects to refer to a list of lower level objects. No hacks.

@EssyGreen

I think I understand and if so agree with you ... I'd find it easier to be sure if we had an updated model ... hint hint, Ryan? :)

@jralls

Just to try to clarify ... here's my take on it: http://ourfamily.brighterworking.com/multimedia/brightergenealogy.jpg

Hmm, that's not well-formed UML. What is it?

Could you label the associations to reveal their intent? E.g., I don't understand how a Person would have multiple Roles independent of events or facts, or why Sources have two recursive 1..* associations.

Evidence appears to be a linking object between layers of persons, joining to lower-level persons. If that's right, one of the associations should be 1 Evidence -> 2 Persons. (I don't like calling that "evidence". "Conclusion" would be clearer, I think.)

What's a Contact?

@jralls

Too bad this stuff sounds so complicated. It is actually very simple, clean and easy to handle. It basically adds the research dimension to genealogical data models. And you get this capability by unifying the two person object concepts and then allowing person objects to refer to a list of lower level objects. No hacks.

It doesn't sound too complicated, and I like it. It seems quite expressive.

@EssyGreen

that's not well-formed UML. What is it?

It's just a data model. It was just a quick illustration I hoped would clarify but obviously not. I don't really want to get into a competition with doing a full blown model alongside GEDCOMX - was just trying to illustrate the relationships of some of the objects we've been discussing.

Evidence appears to be a linking object between layers of persons, joining to lower-level persons

Evidence links between any two Persons anywhere.

What's a Contact?

A person of interest to the researcher e.g. SUBMitter, REPOsitory in old GEDCOM speak.

Would anyone like to offer a diagrammatic alternative?

@jralls

that's not well-formed UML. What is it?

It's just a data model. It was just a quick illustration I hoped would clarify but obviously not. I don't really want to get into a competition with doing a full blown model alongside GEDCOMX - was just trying to illustrate the relationships of some of the objects we've been discussing.

I asked because the notation is unfamiliar. I think I see how the associations work, I just had to let go of my UML-based preconceptions. ;-)

@stoicflame
Owner

Phew! Finally getting around to consuming this interesting thread.

Personally, I don't mind modeling a first-class Event object to accommodate the "N-squared" problem. I'll take this issue and put together a pull request that you can all comment on after we settle #138.

One question, particularly for @ttwetmore: Is there anything equivalent in the legacy GEDCOM format? I.e. would you ever create an Event object when mapping from legacy GEDCOM? It seems to me like this is a new concept, no? All the events in legacy GEDCOM are "intrinsic" to the person or relationship, right?

@ttwetmore

Ryan,

The FAM record is the only legacy structure that bears a relationship to an event in GEDCOM. It encapsulates marriage and divorce events, but then again, so much more.

Note that Commsoft proposed "Event GEDCOM" in 1994; it added an EVEN record type to GEDCOM that was multi-role as it is sometimes discussed today. My own LifeLines program, first released around 1990, before Event GEDCOM, allows EVEN records (the LifeLines database consists of GEDCOM records). I did a little research with Dovy Paukstys and we discovered, if I remember correctly, that two desktop systems of today have multi-role events in their models, and a few others have pseudo-multi-role events in theirs. It is an idea whose time has come.

The "record" record in the GEDCOMX Record Model is actually very close to an event record. Instead of having simple person references pointing to persons, all it needs are slightly more complex role pointers with enough structure to them to mention the role and to give the event-based, non-intrinsic (e.g., age) attributes of the persons wrt the event. And of course, this bears exactly on the original purpose of this thread. By using role pointers in the "record"/event record, instead of simple person references, you get rid of the need for relationship records (in this context anyway -- there are other contexts, IMHO, where they are needed).

I can put the EventGEDCOM proposal up on a web site, though I expect you might be able to find it through google.

@ttwetmore

Ryan,

I reread your question. On exporting GEDCOM to a format with "first class" events, you might decide to convert the MARR (and other related) events to first class events. This is because MARR events (sensu GEDCOM) are the only things in GEDCOM that correspond to multi-role, first-class events.

@EssyGreen

@stoicflame - any GEDCOM 5 event could be considered to be the stub of an Event which has not yet been expanded beyond the principal role.

@ttwetmore

This is the question of whether single-role vital events (birth, death, ...) as encoded in GEDCOM really are events in the true sense of the word. In nearly all cases the best you can say about a BIRT or DEAT structure in GEDCOM is that it states the conclusion that a person was born at a particular date and place, but it is not actual evidence taken from the record of the actual birth event. Because GEDCOM is a conclusion-only system, the BIRT and DEAT structures can really only be considered the final conclusion the genealogist made about the true fact. This may be subtle but is key in understanding the difference between evidence and conclusions and in understanding how to handle the research process.

In my models I have always allowed "vital events" inside person records, to handle the case where no backup evidence is actually provided along with the fact of the vital event, and I use fully fledged event records when that evidence is available, and especially when more than one role-player is mentioned in the evidence.

My approach is criticized therefore for allowing the same information to be encoded in two different ways in the same model, which is considered by purists to be a Bad Thing. I argue that the critics don't yet see the subtleties.

@EssyGreen

@stoicflame - I will refrain from confusing the issue further since your question was directed at @ttwetmore but I suspect this is not the end of the matter :) In any event (excuse the unintended pun!), I think I probably need to see the next version of the model (as you say after resolution of #138) before I can comment further on this one.

@EssyGreen

Where are we on this one? ie how is GEDCOM X going to handle Shared Events/Roles?

@stoicflame
Owner

Where are we on this one? ie how is GEDCOM X going to handle Shared Events/Roles?

Still not addressed yet.

@EssyGreen

It seems like loads of issues are being closed and pointed at this one for a solution ... yet there is no progress on it.

@stoicflame
Owner

It seems like loads of issues are being closed and pointed at this one for a solution ... yet there is no progress on it.

Acknowledged.

@DallanQ

I'm wondering if an example would clarify this proposal. Suppose I have the following 1940 census record:

Papa Smith Age 30
Mama Smith Age 29
Baby Smith Age 8

How would the three residence events, the relationships, and the three approximate-birth events be represented under this proposal?

@ttwetmore

This is my preferred solution to this example, which I use frequently in my own database:

<event id="e1" type="census">
  <date> 1940 </date> <place> ... </place>
  <role id="p1" type="head"> <age> 30 years </age> </role>
  <role id="p2" type="wife"> <age> 29 years </age> </role>
  <role id="p3" type="child"> <age> 8 years </age> </role>
</event>

<person id="p1">
  <name> Papa Smith </name>
  <sex> M </sex>
  <birth> <date> about 1910 </date> </birth>
  <residence> <date> 1940 </date> <place> ... </place> </residence>
</person>

<person id="p2">
  <name> Mama Smith </name>
  <sex> F </sex>
  <birth> <date> about 1911 </date> </birth>
  <residence> <date> 1940 </date> <place> ... </place> </residence>
</person>

<person id="p3">
  <name> Baby Smith </name>
  <sex> U </sex>
  <birth> <date> about 1932 </date> </birth>
  <residence> <date> 1940 </date> <place> ... </place> </residence>
</person>

The three residence events go with the persons. They are redundant, but probably this is best.

The three computed birth events go with the persons. These (and the residence events) are uni-role events so they are subsumed within the person records. When I create these birth events in my own database, I always put <note> computed from age at census </note> under the date element.

The three ages, which you didn't ask about, go with the role references (this is because you must distinguish properties that are intrinsic to persons versus properties that are only transitory and dependent on the event.) The three relationships are not called out specifically because they are inherent in the three different "combinations of the three roles taken two at a time" (if you remember your combinatorics class), that is, the head-wife, the head-child, and the wife-child relationships.
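As a sketch of how the pairings fall out of the role list, the following parses a trimmed copy of the event above (the place element is omitted) and enumerates the roles taken two at a time. The element and attribute names are the ones from the example; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

EVENT_XML = """
<event id="e1" type="census">
  <date> 1940 </date>
  <role id="p1" type="head"> <age> 30 years </age> </role>
  <role id="p2" type="wife"> <age> 29 years </age> </role>
  <role id="p3" type="child"> <age> 8 years </age> </role>
</event>
"""

event = ET.fromstring(EVENT_XML.strip())

# One (person id, role type) pair per role pointer: O(N) items stored.
roles = [(r.get("id"), r.get("type")) for r in event.findall("role")]

# The implied pairings are derived on demand rather than stored as records.
pairs = [f"{a_type}-{b_type}"
         for (_, a_type), (_, b_type) in combinations(roles, 2)]

print(pairs)  # ['head-wife', 'head-child', 'wife-child']
```

The sketch only lists which pairs share the event. What each pair implies about an actual relationship is still an inference the researcher has to make.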

The current GEDCOM-X format wants the three relationships as independent records. I say "fie on you three things, you are not needed."

Tom

@EssyGreen

I agree with slight variations, for example, I personally don't use Census as an event (for me it's a type of source document), so I would have the same thing but would have the shared event of type "residence" and would omit the 3 residences against the 3 people. But I think that is more to do with how different researchers prefer slightly different methods depending on their focus etc. The fundamentals are the same.

@ttwetmore

Thinking of the census as a residence event is a great idea. It really makes the three residence events in my "solution" redundant, and I hate redundant things.

I didn't say this, but in my database I maintain two-way pointers, that is, the person records also point back to the multi-role events that they play roles in. This makes my residence events even more hopelessly redundant.

@jralls

I'm also in the camp of places-as-top-level-objects. Otherwise I agree with keeping everything together in one record (assuming a semi-structured database as Tom uses for his illustration -- can't do that with an RDBMS without really ugly contortions). I'd mark the evidence bits somehow to differentiate the direct evidence (e.g., age) from inferred (e.g., approximate birth date), maybe just with an attribute on the element (using the XML terms).
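One way to picture that marking, as a sketch only: keep a flag on each assertion saying whether it was read from the record or computed. The `Assertion` class and `birth_from_census_age` helper are hypothetical names; the arithmetic is the same census-year-minus-age estimate used in the example above.

```python
from dataclasses import dataclass


@dataclass
class Assertion:
    """A single claim about a person, flagged as direct or inferred."""
    name: str
    value: str
    inferred: bool  # False: stated in the record; True: computed by the researcher


def birth_from_census_age(census_year: int, age_years: int) -> Assertion:
    """Estimate a birth year from an age given at a census."""
    return Assertion("birth", f"about {census_year - age_years}", inferred=True)


age = Assertion("age", "30 years", inferred=False)   # read off the record
birth = birth_from_census_age(1940, 30)              # derived from it

print(age)    # Assertion(name='age', value='30 years', inferred=False)
print(birth)  # Assertion(name='birth', value='about 1910', inferred=True)
```

In XML terms the flag would simply be an attribute on the element, as suggested above.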

Be careful with that wife-child relationship. It's a rookie mistake to assume that the wife is the mother of all (or even any) of the children.

@EssyGreen

I'd mark the evidence bits somehow to differentiate the direct evidence (e.g., age) from inferred (e.g., approximate birth date), maybe just with an attribute on the element

Totally agree plus loads more (e.g. type of derivative, proof statement etc etc) - I assume Tom left out citation details for simplicity

@EssyGreen

Be careful with that wife-child relationship. It's a rookie mistake to assume that the wife is the mother of all (or even any) of the children

Ditto the father for that matter and come to that ditto the wife's surname but we're getting into detail not really relevant to the shared events issue so I'm assuming we're omitting for clarity and simplicity.

@ttwetmore

RDBMS. Ugh. Use MongoDB or any database that handles top level XML elements or JSON structures directly. Designing genealogical data models to be easy to implement with relational tables is the oldest living mistake still being made in this field. See GenTech.

I agree on the need to differentiate the rawest form of the evidence (e.g., age) and the al dente form (e.g., estimated birth year). The GEDCOMX model also has the formal concept for using raw expressions and standard expressions. Personally I am sloppy about this. I usually record my evidence with the standard expressions with notes that mention the raw forms if I believe there is a reason to. After all, others or I can always go back and inspect the actual evidence for the raw information in the .00000001% of the cases where there is a need.

It would be an interesting discussion point to ask whether extracted record objects (e.g., personas) should hold data that is as absolutely close to the form of the data on the physical records as possible, or whether, as extracted data, and already on the train of objects leading to conclusions, it is okay to use standard date forms, standard place hierarchies, and so forth.

Yeah, and the head-child relationship ain't the best either. One must embrace the devil that all the data we collect comes with an inherent level of error. But we should never NOT collect data because of this, but we should learn how to live in a world where we know errors exist. The child relationships in census records are the source of many errors in assumptions about biological parentage, but they are often the only physical evidence there is for the relationships. One cannot insist on only allowing 100% true evidence into databases, because such evidence does not exist. One must get comfortable with the notion that databases contain a certain level of crap, which leaves one with the often likely possibility of making crappy conclusions. Your goal, if you are willing to accept it, is to cut the crap. This message will self destruct in 5 seconds. . . . . . . pooof.

@ttwetmore

Yes, Sarah, I usually add the source references and the source records when I make examples like this, but left them out to keep the solution uncluttered. Thanks for sticking up for me!

@EssyGreen

RDBMS. Ugh. Use MongoDB or any database that handles top level XML elements or JSON structures directly.

That's a rather limiting view. GEDCOM X model should be independent of physical data storage implementations.

@ttwetmore

That's a rather limiting view. GEDCOM X model should be independent of physical data storage implementations.

I am underlining John's point that a model using the kinds of event and person records being demonstrated here would not map easily to an RDBMS, and that this is something that we should not be worrying about. And if we were worrying about it then we would be repeating the oldest mistake in our industry.

On the other hand JSON is not limiting. It liberates designers to exercise complete freedom in data storage implementations. Anything you can do in RDBMS you can do in JSON, but there is a universe of things you can do with JSON that you can't do with RDBMS. Think Cantor's theorem that proved that the infinity of the real numbers is greater than the infinity of the rational numbers. Think of RDBMS as the set of rational numbers, and JSON as the set of real numbers.

(MongoDB is a database that manages records in JSON format, and provides indexing and querying that is just as efficient as that provided by an RDBMS.)

@EssyGreen

I am underlining John's point that a model using the kinds of event and person records being demonstrated here would not map easily to an RDBMS, and that this is something that we should not be worrying about.

I agree that mapping to a relational database is not something GEDCOM X should be worried about per se. However, we would be foolhardy to create a model which was so prescriptively structured that it could only be imported into a specific data storage implementation.

I don't want to get into a debate about the pros and cons of RDBMS (or the alternatives) - we'd be here forever! - I just want to keep GEDCOM X flexible enough to allow developers to use whichever data storage they see fit.

I don't see any major problems in transforming the above into an RDBMS format but maybe I missed John's point.

@jralls

It wasn't a point, it was an observation that the way Tom structured his record takes advantage of a semi-structured database architecture. GedcomX is represented in XML or JSON, which are both semi-structured, and I think for an interchange format that makes sense. It's up to each application to translate between the GedcomX representation and the application's internal representation.
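For what it's worth, one possible translation of the census event into relational tables might look like the following sketch. The table and column names are assumptions made up for illustration, and an application could just as well choose a different internal layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per event, one row per role pointer (hypothetical schema).
conn.executescript("""
CREATE TABLE event (id TEXT PRIMARY KEY, type TEXT, date TEXT);
CREATE TABLE role (
    event_id  TEXT REFERENCES event(id),
    person_id TEXT,
    type      TEXT,
    age       TEXT
);
""")

conn.execute("INSERT INTO event VALUES (?, ?, ?)", ("e1", "census", "1940"))
conn.executemany(
    "INSERT INTO role VALUES (?, ?, ?, ?)",
    [
        ("e1", "p1", "head", "30 years"),
        ("e1", "p2", "wife", "29 years"),
        ("e1", "p3", "child", "8 years"),
    ],
)

rows = conn.execute(
    "SELECT person_id, type FROM role WHERE event_id = ? ORDER BY person_id",
    ("e1",),
).fetchall()
print(rows)  # [('p1', 'head'), ('p2', 'wife'), ('p3', 'child')]
```

The nested role elements become child rows keyed by the event id, which is the kind of translation each application would own.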

There's nothing here to argue about. Move along, please. ;-)

@EssyGreen

Fair enough ... moving along ... @stoicflame is this sufficient to get this requirement bedded in now?

@DallanQ

@ttwetmore given your example, it seems to me that you're describing what I'll call a record, similar to HistoricalRecord at historical-data.org.

I think that including the concept of a record as a first-class object in GedcomX is a terrific idea.

As a first-class object, people could list all of the records in their tree, like a digital shoebox. The list could even include records that weren't yet tied to any one individual, and Gedcom files could finally contain information that wasn't strictly conclusion-based. Allowing GedcomX files to contain records would go a long way in my opinion to encourage genealogists to be more careful researchers.

I'd suggest:

  • Add record as a first-class object to the object model
  • Records can contain properties for date, place, text, date-found, a source-reference, an optional image link?, and one or more roles.
  • A role contains a list of attribute/value data pairs extracted from the record, including role type (husband/wife/child/etc.), age, etc., plus an optional id of a person in the tree. If it is not yet known whether a record actually pertains to anyone in the tree, the id may be empty. In my opinion this is key to encouraging people to record their research, and why records should be first-class objects.
  • Name and fact entities on people in the tree can contain record-references as an alternative to, or in addition to, source-references. That is, a fact may reference a record, and that record then references the source (census, physical artifact, etc.) in which the record was found.
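
The suggestions above can be sketched as a pair of Python dataclasses. This is only an illustration under stated assumptions: the class names `Record` and `Role`, every field name, and the sample values are hypothetical and are not taken from any GedcomX specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Role:
    role_type: str                                   # e.g. "husband", "wife", "child"
    attributes: Dict[str, str] = field(default_factory=dict)  # extracted pairs such as age
    person_id: Optional[str] = None                  # stays empty until linked to a tree person


@dataclass
class Record:
    date: str
    place: str
    text: str
    source_reference: str
    roles: List[Role] = field(default_factory=list)
    date_found: Optional[str] = None
    image_link: Optional[str] = None

    def unlinked_roles(self) -> List[Role]:
        """Roles that have not yet been tied to anyone in the tree."""
        return [r for r in self.roles if r.person_id is None]


# Made-up sample data for illustration only.
record = Record(
    date="1930-04-01",
    place="Springfield",
    text="transcribed text of the entry",
    source_reference="source-1",
    roles=[
        Role("husband", {"age": "40"}, person_id="I1"),
        Role("wife", {"age": "38"}),
    ],
)
```

Leaving `person_id` optional is what lets a record sit in the "digital shoebox" before the researcher has decided whom it refers to.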
@ttwetmore

@DallanQ,

The GEDCOMX record model diagram at the website includes the "record" record as a first class object. I prefer calling it an "event" (specifically, a multi-role evidence event) because each physical record of interest, that we actually are willing to sit down and extract data from, seems to me to document an event. I won't quibble about the name, however, except to point out that the word "record" is already overloaded.

I agree with all your points.

@DallanQ

By making record (call it event or evidence if you want) a first-class object in a GedcomX file, you're making evidence-based genealogy possible. In my opinion, evidence-based genealogy is the right way to go, but because most genealogy programs don't support it, most people end up tracking their evidences by hand and recording only their conclusions in the tree. Conclusion-based genealogy makes it more difficult for people to collaborate, because others are unable to see all of the information that was used (or rejected) when the author generated the conclusions.

@DallanQ

@ttwetmore I'm still pretty new to GedcomX. I looked for record but couldn't find it. The best overview I could find is here: http://www.gedcomx.org/models.html but it doesn't list record. It does list source, but I view record as having a one-to-many relationship with source. That is, I could have multiple records from the same source (e.g., 1930 census) in my tree. Can you point me to where record is defined?

@ttwetmore

@DallanQ, Here is the link to the diagram of the record model:

http://record.gedcomx.org/record-uml.png

@ttwetmore

By making record (call it event or evidence if you want) a first-class object in a GedcomX file, you're making evidence-based genealogy possible. In my opinion, evidence-based genealogy is the right way to go, but because most genealogy programs don't support it, most people end up tracking their evidences by hand and recording only their conclusions in the tree. Conclusion-based genealogy makes it more difficult for people to collaborate, because others are unable to see all of the information that was used (or rejected) when the author generated the conclusions.

I have been making this argument for twenty years. Software cannot fully support the full evidence to conclusion research process until evidence is also extracted into records that are manipulated in concert with the conclusion records. I fully agree with you on all your points here. The record record as it stands now does not do a good enough job of specifying roles and role-dependent properties of persons, but that is fixable. The DeadEnds model demonstrates a way to do it.

@stoicflame
Owner

I hope to have a pull request for this change that we can talk against by tomorrow or Friday.

I'm going to start by calling it Event (not Record) and go from there.

@ttwetmore

Ryan,

It was just pointed out on the Better GEDCOM wiki that the definition of the record record specifies it as a container object, that is, it wholly contains its personas and relationships. I had been making the assumption all along that it would contain pointers/IDs to separate persona and relationship entities.

I would prefer that they be separate, but the way you have it is certainly something to consider. What I was most concerned about when I criticized the idea was that the roles that the persons play in the records/events, and the properties of those persons that are not intrinsic, didn't have a place. However, with the personas inside the container, the role can easily be made a property of the persona, and the non-intrinsic properties of the person (e.g., the age at the time of the event, marital status at the time of the event, even the name at the time of the event) make a lot of sense being put right in the persona structure normally.

So I only have one real concern with this "persona inside" model. And that is that the personas, from the point of view of software, must be searchable as a set of objects unto themselves. That is, software must be able to run a search on all personas with, say, the name of "John Doe", and they must all be found from where they are living inside the container objects. I believe, maybe incorrectly, that it is a bit harder to find and index internal structures than it is to search for and index top-level objects.
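
To make the searchability concern concrete, here is a small Python sketch of a secondary index built over personas that live inside container events. The dictionary layout and the helper name `build_name_index` are assumptions for illustration, not part of any GedcomX API.

```python
from collections import defaultdict

# Hypothetical container events, each wholly containing its personas.
events = [
    {"id": "e1", "personas": [{"id": "p1", "name": "John Doe"},
                              {"id": "p2", "name": "Mary Doe"}]},
    {"id": "e2", "personas": [{"id": "p3", "name": "John Doe"}]},
]


def build_name_index(events):
    """Map a lower-cased name to every (event id, persona id) where it appears."""
    index = defaultdict(list)
    for event in events:
        for persona in event["personas"]:
            index[persona["name"].lower()].append((event["id"], persona["id"]))
    return index


name_index = build_name_index(events)
matches = name_index["john doe"]
```

The extra cost compared with top-level persona objects is one pass over every container to build the index, plus keeping the index current whenever a container changes.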

Of course, what I really want you to do is to combine the conclusion and record models together, making the same person object work for both, and then let person objects exist on a wide evidentiary scale, from raw raw evidence, to pure pure conclusion, and every gray layer in between. The handwriting is on the wall on this one though. I don't believe GEDCOMX would ever go that way. But I do hope you will think about the problems that having only two levels of person information may have on the ability of genealogical software to fully support the research process. (Oh, and I have read your white paper on why you think this combining thing is a terrible thing to do, so don't bother to send the link!! The paper did not convince me.)

I apologize for any confusion I might have caused by persisting so long in my misunderstanding of an important nuance of the record model.

Tom

@DallanQ

@ttwetmore thank-you for the link!

@stoicflame if the personas are inside the event (I'm not worried about searchability - that should be workable), would the personas inside the event contain only information from the event, along with links to separate personas in the tree in those cases when I've decided that the person in the event is the same as the person in my tree? That works for me.

@ttwetmore

@DallanQ asks

would the personas inside the event contain ... links to separate personas in the tree in those cases when I've decided that the person in the event is the same as the person in my tree?

Those links can't exist in the record model. There would be a person in the conclusion model derived from the persona, and another person in the conclusion model derived from a persona from some other record, and it would be those two conclusion level persons who link together. In the current conclusion model they would be linked together by being the two members of a relationship.

@EssyGreen

the Better GEDCOM wiki that the definition of the record record specifies it as a container object, that is, it wholly contains its personas and relationships. I had been making the assumption all along that it would contain pointers/IDs to separate persona and relationship entities. [...] I would prefer that they be separate

I would most definitely agree that they should be separate/pointers since this gives much greater flexibility. I can't see how Persons/Relationships can be meaningfully contained within an Event in the conclusion model.

what I really want you to do is to combine the conclusion and record models together, making the same person object work for both

+1

@EssyGreen

would the personas inside the event contain ... links to separate personas in the tree in those cases when I've decided that the person in the event is the same as the person in my tree?

Those links can't exist in the record model

I think they could actually tho' I think it has already been decided not to do this when the record model was frozen.

@jralls

The record model was more withdrawn than frozen. See #138. Dallan, you probably missed that one (it was closed 4 months ago with the withdrawal of the Record model) but the discussion covered most of the ground we've touched on here in the last few days. You might enjoy reading through them.

Ryan, a diff of prose will be excruciatingly hard to read unless you manage to get git to give you replacement of whole sections at a time. Maybe a wiki page with an alternate version of the spec would work better?

@EssyGreen

The record model was more withdrawn than frozen

Withdrawn from this forum, yes, but my understanding is that it is being implemented already by FS/FindMyPast for the 1940 Census hence I presume it is pretty much non-changeable (at least by the likes of us).

@jralls

No, it's over here. If you're interested in online indexing, join the fun there.

I think those of us here think that GedcomX needs a record/event/evidence component. For my part, I agree that the requirements for that component are different from those for online indexing, but I would have preferred that the record model had been modified and integrated as was proposed in #134 instead of axed when FamilySearch recognized that and separated the two projects.

@EssyGreen

If you're interested in online indexing, join the fun

Aha! Many thanks @jralls ... I'm interested but got the impression the discussion was closed.

GedcomX needs a record/event/evidence component

Event yes - evidence I see as something completely different - still needed, yes (tho' I have lost hope of getting it) but definitely not the same as an Event.

I would have preferred that the record model had been modified and integrated as was proposed in #134 instead of axed

I agree but I understand that FS have a commitment to proceed with FindMyPast and balancing that against getting buy-in and consensus from us lot is a hard ask.

@jralls

I would have preferred that the record model had been modified and integrated as was proposed in #134 instead of axed

I agree but I understand that FS have a commitment to proceed with FindMyPast and balancing that against getting buy-in and consensus from us lot is a hard ask.

Sorry, I guess I wasn't clear: I agree that the online indexing project should be separate. But Ryan could have just copied the record model over to the new repo without removing it from gedcomx.

@EssyGreen

ah i c - indeed, yes it would have been good to still have it available for reference

@jralls

Oh, it's still there for reference, thanks to the magic of Git. I want the Record information to still be part of GedcomX, merged similar to Josh Hansen's proposal.

@EssyGreen

Yeah me too but like I said I can understand the constraints

@stoicflame
Owner

Okay, folks.

This issue has been converted into a pull request and the proposed changes at 6a281b1 are attached.

As @jralls mentioned, the github diff viewer might be new to some of you, but I didn't get the time to put together a wiki page as he suggested. And I guess I'd like to see if the diff viewer isn't enough before I take the time to do that, anyway.

@EssyGreen

In-law is ambiguous with different meanings over time

So do you think it should be removed? Or are you suggesting a clarification?

Remove - can be specified under "Other" - or alternatively allow any word with the above just being recommended standard values

@EssyGreen

Thx @stoicflame - Is there any diagrammatic/model representation anywhere or do we have to go through the code line by line?

@stoicflame
Owner

Is there any diagrammatic/model representation anywhere or do we have to go through the code line by line?

The most useful thing I've got so far for evaluation is the new section added to the conceptual model starting here and here.

I agree it's limited. My question is whether it's sufficient for a reasonable evaluation from you guys.

@DallanQ

@jralls Thanks for the link to gedcomx-record and @stoicflame thanks for all your work on the pull request. It's a lot to digest; I'll look at it over the weekend.

@DallanQ

@stoicflame looks great! Just a few questions:

  1. is there a way to add text (a transcription) to an Event? Perhaps as a Note?
  2. is there a way to attach a URL (of a record image) to an Event? I assume this can be done using a SourceReference?
  3. is there a way to reference an Event as a source of a Fact? Could a SourceReference refer to an Event?
@EssyGreen

So do you think it should be removed? Or are you suggesting a clarification?

Remove it - you can always use "Other" ... as you can probably guess from recent posts elsewhere, I'm not in favour of all these codes unless they give a distinct advantage to automated analysis. The only role codes I think are essential are parent, child and spouse. Everything else can be better done via free-text.

@EssyGreen

The most useful thing I've got so far for evaluation is the new section added to the conceptual model starting here and here. I agree it's limited. My question is whether it's sufficient for a reasonable evaluation from you guys.

I understand the time constraints but personally I think the code-first/code-only approach is a mistake. I don't have the time to scrutinise the code at this level and it is much harder to see the overall design and answer questions such as those posed above without a diagrammatic model. Do you not have a dev environment which can generate uml diagrams?

specifications/json-format-specification.md
@@ -995,14 +1025,58 @@ notes | Contributed notes about the relationship. | notes | array of [`Note`](#n
},
"person2" : {
"resource" : "http://identifier/for/person/2"
+ },
+ "facts" : [ { ... }, { ... } ],
+ "sources" : [ { ... }, { ... } ],
+ "notes" : [ { ... }, { ... } ]
+}
+```
+
+
+<a id="event"/>
+
+# 8. The Event
+
+This section defines the `Event` XML type corresponding to the `Event` data type
@jralls
jralls added a note

You're in the json spec here. It should be JSON type, not XML type.

specifications/json-format-specification.md
((6 lines not shown))
+ "sources" : [ { ... }, { ... } ],
+ "notes" : [ { ... }, { ... } ]
+}
+```
+
+
+<a id="event"/>
+
+# 8. The Event
+
+This section defines the `Event` XML type corresponding to the `Event` data type
+specified by the section titled "The Event" of the conceptual model specification.
+
+## 8.1 The "Event" Data Type
+
+The `gxc:Event` is used to (de)serialize the `http://gedcomx.org/conclusion/v1/Event`
@jralls
jralls added a note

I don't see namespace prefixes anywhere else in the JSON spec. Does it really belong here?

@jralls

It's not code-only. Scroll down to the bottom of the diff and look at the changes to the foo.md files. It would have been easier if Ryan had split the change into two commits, one for the code and one for the spec, but he didn't.

@EssyGreen

Ah! Gotcha! thx @jralls :) ... OK so next question ... is there a way to see the diffs in the text of the md files or do I have to guess where the 89 changes are?

@EssyGreen

nm I guess I have to read the pseudo text stuff

@EssyGreen

@stoicflame -

  1. Most of the known roles are superfluous - grandparent, grandchild, ancestor, descendant etc ... it's another infinite list of enums for the sake of it - same argument I'm having re Event Types. I seem to remember GEDCOM 4 went down this route and then deprecated them all in version 5 ... are we not in danger of repeating the experience? In contrast I can't seem to find a free-form text role which is absolutely essential

  2. Standard "Fact Types" list is way too long and has a whole mix of "Events" and "Characteristics" and IDs and Relationship types and ... just about Uncle Tom Cobley and All in there! - see #161

  3. This is prolly an old one I missed but what the heck is a "death name"? Is there a culture which renames people when they die?

  4. I'm gonna repeat that plea for a diagram - this is hell to validate/verify

@jralls

Most of the known roles are superfluous

Beyond that, they're silly. In what events is "Grandparent", much less "Grandchild" a participant? The occasional will?
Roles should be something like Primary, Participant, Official, and Witness. Examples might be:

Birth event:

  • Primary: the child
  • Participants: The mother & father
  • Official: the obstetrician or midwife
  • Witnesses: Anyone else who happens to be in the room

Marriage:

  • Primary: The happy couple
  • Participants: Maybe nobody, maybe the bride's father, the ring-bearer, others who have a role in the ceremony.
  • Official: The minister/priest/justice of the peace/etc.
  • Witnesses: There are often official witnesses, but there may be others, such as wedding guests
@jralls

I'm gonna repeat that plea for a diagram

Agreed, because the changeset isn't complete without updating the files in gedcomx/specifications/support.

@jralls

There are a bunch of detail issues with the proposed Event and EventRole classes, but at this point I think we need to pay more attention to the big picture.

The two classes reflect pretty closely what Dallan asked for a few days ago, but somehow seem to miss the mark of moving toward evidence-based research. It does add a very important data type which was missing from the conclusion model, and I think it should stay, with some adjustments that I'll address in another comment. It models quite well the similar event classes in Gramps, TMG, and doubtless many other programs with which I'm not familiar, and it does allow a somewhat closer link between the evidence in the source and the conclusions pertaining to a particular person. It does not restore the original Record model's ability to represent the "shadow persons" of a document and to link them into "constructed persons" in the conclusion model, and Tom's original N-Squared proposal was for a way to have that support in the Conclusion Model, as was #138, which was closed in favor of this discussion.

@jralls

The following applies to GedcomX's data model and its correspondence to good genealogical practice. It does not reflect the structure or behavior of any genealogy database program that I'm aware of, so in some respects it's antithetical to a good data exchange mechanism. On the other hand, no program that I'm aware of matches the '''current''' data model in these respects either.

First of all, as Tom pointed out yesterday, having a single Attribution and thus a single proof argument per GenealogicalResource is going to get really cumbersome. Similarly, an Event doesn't need an attribution unless it's inferred rather than directly documented. Since inferred evidence is the subject of #120, I'll not elaborate here.

Second, there's too much duplication of SourceReferences and they're attached to inappropriate objects. In real research, there's a one-to-many relationship between sources and evidence statements (i.e., each Source produces many evidence statements), a many-to-one relationship between evidence statements and conclusions, and a many-to-one relationship between conclusions and the person or relationship. There is no direct relationship between sources and persons or relationships, and requiring that sources be directly attached to persons and relationships makes no sense, adds work, and is likely to cause trouble when on further analysis an evidence statement is found not to apply to the person or relationship to which the user has attached it. This problem is compounded with the new Events, because the Event has Sources, the EventRole has Sources (because it's a Conclusion, the Sources must be attached also to the containing Person or Relationship -- but wait, the EventRole doesn't '''have''' a '''containing''' Person, it '''contains''' the Person! What does that do to the constraint?) This leads to a bunch of circular references that is bound to cause trouble in code. Meanwhile Places and Dates are conclusion objects too, and therefore carry their own Sources, though they might be part of a Fact, a Relationship, or an Event. But the conclusional nature of a date or a place separate from the object (Fact/Event/Relationship) is the interpretation of it into a modern setting: Converting a date from Old Style to Gregorian, for example, or recognizing that a town name in two counties is the same town, that the county boundaries changed. Those sources need to be documented, of course, but they don't have anything to do with the Fact, Event, or Relationship.
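
The chain described in this second point (each source yields many evidence statements, many evidence statements support one conclusion, many conclusions attach to one person) can be sketched in Python as follows. The class names and fields are hypothetical and do not correspond to the GedcomX classes; the sample titles and claims are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Source:
    title: str


@dataclass
class EvidenceStatement:
    source: Source   # each statement is drawn from exactly one source
    text: str


@dataclass
class Conclusion:
    claim: str
    evidence: List[EvidenceStatement] = field(default_factory=list)


@dataclass
class Person:
    name: str
    conclusions: List[Conclusion] = field(default_factory=list)

    def sources(self) -> List[Source]:
        """Sources are reached only through conclusions and their evidence."""
        seen: List[Source] = []
        for conclusion in self.conclusions:
            for statement in conclusion.evidence:
                if statement.source not in seen:
                    seen.append(statement.source)
        return seen


register = Source("birth register entry")
census = Source("census page")
birth = Conclusion("born 1890", [
    EvidenceStatement(register, "date of birth given as 1890"),
    EvidenceStatement(census, "age 40 in 1930"),
])
person = Person("John Doe", [birth])
```

In this shape no source is attached directly to the person, so detaching a misapplied evidence statement touches only the conclusion that used it.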

Third, Name variations arise from sources, usually from different sources using different names for one person, though occasionally a single source will name an individual in more than one way. The present structure separates the Name variation from the evidence, whereas a name variation is really an item of conflicting evidence that needs to be resolved in a proof statement.

The structural changes I'd make to resolve these problems are:

  • Make Relationship a subclass of Conclusion instead of GenealogicalResource
  • Move the Attribution property from GenealogicalResource to Fact, EventRole, and Relationship conclusions. If you want to take OO decomposition all the way, insert an intermediate subclass "ConclusionWithAttribution".
  • Change the Name property of Person to a single instance instead of a collection. Attach a Name property to Fact and EventRole.
@EssyGreen

Roles should be something like Primary, Participant, Official, and Witness

I agree with the principle but think even these are arguable ... often someone is many of these, e.g. a mother (participant) who is the informant (official) for the birth of her child (you could argue that the registration of the birth should be separate from the event itself but I can't see many people doing that just to separate out the roles).

I would vote for Primary (as you specified) and Secondary (anything else) but the important point is that the user is allowed to enter free-form text to specify exactly what the role is (e.g. "Step-Father", "Informant", "Employer", "Head of Household", "Headmaster", "Administrator", "Executor", "Widow" etc etc)
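
A minimal Python sketch of that Primary/Secondary split with a free-form description might look like this. The names `RoleKind` and `EventRole` and their fields are assumptions for illustration, not the classes in the pull request.

```python
from dataclasses import dataclass
from enum import Enum


class RoleKind(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


@dataclass
class EventRole:
    person_id: str
    kind: RoleKind
    description: str = ""   # free-form text, e.g. "Informant" or "Head of Household"


# Hypothetical roles for a birth event.
roles = [
    EventRole("I3", RoleKind.PRIMARY, "Child"),
    EventRole("I2", RoleKind.SECONDARY, "Mother and informant"),
    EventRole("I1", RoleKind.SECONDARY, "Father"),
]

primary_ids = [r.person_id for r in roles if r.kind is RoleKind.PRIMARY]
```

Software can act on `kind` automatically, while `description` carries the detail that only a human reader is expected to interpret.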

@jralls

I would vote for Primary (as you specified) and Secondary (anything else) but the important point is that the user is allowed to enter free-form text to specify exactly what the role is (e.g. "Step-Father", "Informant", "Employer", "Head of Household", "Headmaster", "Administrator", "Executor", "Widow" etc etc)

OK. Minimalist as always. ;-) I can't think of anything automatic to do with roles aside from Primary, so it seems workable as well.

a mother (participant) who is the informant (official) for the birth of her child

I think "informant" is a property of a Source, not of an Event.

@EssyGreen

LOL :)

I think "informant" is a property of a Source, not of an Event.

I think it's up to the researcher/situation ... I remember one Census entry where the enumerator was also recording his own relatives and speculation or not some of the entries looked decidedly dubious ... hence the fact that the "Head of household" was also the "Enumerator" was highly relevant. If I left it within the Source it wouldn't have been prominent. Same applies to things like recording illegitimate births etc ... the social context of the recorder/informant and their relationship to the participants is often highly relevant.

@EssyGreen

an Event doesn't need an attribution unless it's inferred rather than directly documented

Not sure I agree with you here ... you could manage the proof at many levels ... we (as in this bunch of people contributing to this forum) cannot agree on where it should go ergo it goes everywhere - which prolly reflects the different approaches to research and the granularity that different people want to go to.

The problem with allowing it everywhere is that it increases complexity when sharing ... one app wants to use it at the event level; another at the role level; another at the person level ... but 'cos GEDCOM X says any of these might apply they all have to cater for all possibilities. In my mind this is not a standard - the standard should specify the minimum permitted which all apps must comply with and anything else is for the app to add if they wish. In this case the minimum standard is the Person level ie proof/rationale that the person in the Source is (or is not) the person in the tree.

@EssyGreen

In real research, there's a one-to-many relationship between sources and evidence statements, a many-to-one relationship between evidence statements and conclusions, and a many-to-one relationship between conclusions and the person or relationship

I'm not sure what an "evidence statement" is (do you mean "attribution"?) ... in my view in real research there is a Hypothesis (a collection of Person(s), Relationship(s) and Event(s) related to each other in a specific way) which is supported by relevant Evidence (a number of Sources) and rationale/deduction/argued reasoning which aims to prove or disprove the Hypothesis. The Hypothesis is itself a Source which can be used for other Hypotheses (see @ttwetmore's N-tier).

The closest thing to the Hypothesis in GEDCOM X is the whole file so the only way to make it a viable evidence-based model is to chop everything up into mini-files :(

@EssyGreen

the EventRole doesn't '''have''' a '''containing''' Person, it '''contains''' the Person!

Surely not! Maybe I read it wrong but I assumed it referenced the Person? It cannot possibly be the container for a person!

@EssyGreen

@jralls - I agree with all your main points above - there are so many of them it's hard to digest all at once but I'm with you on the general gist of the structure not being right and the need to look at the big picture.

@jralls

I'm not sure what an "evidence statement" is (do you mean "attribution"?)

Sorry, I wasn't very clear about that. By "evidence statement" I mean each abstract factoid that can be directly gleaned from a Source. For example, a birth record will directly provide:

  • Date of the birth
  • Name and sex of the child
  • Names of the nominal father (nominal because of possible mis-paternity) and the mother
  • Registration jurisdiction

There might be others, of course, but you get the idea. One might also be able to infer some other things: For example, a British civil registration entry gives the mother's maiden name if she's married, so one can infer that the marriage occurred before the birth.

The important part is that these "evidence statements" aren't recorded directly in the current GedcomX model, they're created in your head as you analyze the actual source and recorded in notes somewhere. After doing your reasonably exhaustive search and careful analysis of all the sources you find, you form a set of hypotheses, which you test by trying to find more evidence, by considering the possibility that you've muddled up more than one person, and so on. Finally you write a proof statement summarizing the whole process and try to wedge that single round peg into the several square holes of your genealogy database program, which may eventually generate a GedcomX file so that you can share your findings with someone else using a different program.

@jralls

In this case the minimum standard is the Person level ie proof/rationale that the person in the Source is (or is not) the person in the tree.

The problem with that is that the result is a collection of dozens of proof statements in a giant text field with no connection to either the sources or the individual conclusions. A person could, with a bit of work, untangle it, but it would be nearly impossible for a program to do so given the current state of the art.

@jralls

the EventRole doesn't '''have''' a '''containing''' Person, it '''contains''' the Person!

Surely not! Maybe I read it wrong but I assumed it referenced the Person? It cannot possibly be the container for a person!

Event does reference the Person indirectly. For the other conclusions, the Person references the Conclusion. Therefore, a (e.g.) Conclusion::Fact has a '''containing''' Person, because the reference is '''from''' the Person '''to''' the Conclusion::Fact. But an Event has a container of EventRoles, each of which has a reference '''from''' the EventRole '''to''' the Person but no Person has a container of Events or EventRoles, so while an Event contains "zero to many" Persons, it has no "containing" Person.
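
The direction of the references can be shown with a small Python sketch. The dictionaries below are a hypothetical stand-in for the model, and `events_for_person` is an illustrative helper rather than anything defined in the pull request.

```python
# Person -> Fact: the person holds the reference, so the fact has a containing person.
persons = {"I1": {"name": "John Doe", "facts": ["f1"]}}
facts = {"f1": {"type": "occupation", "value": "farmer"}}

# EventRole -> Person: the role holds the reference, so no person contains the event.
events = [
    {"id": "e1", "type": "marriage",
     "roles": [{"person": "I1", "role": "husband"},
               {"person": "I2", "role": "wife"}]},
    {"id": "e2", "type": "census",
     "roles": [{"person": "I2", "role": "head"}]},
]


def events_for_person(person_id, events):
    """A person does not list its events, so every event has to be scanned."""
    return [e["id"] for e in events
            if any(r["person"] == person_id for r in e["roles"])]


john_events = events_for_person("I1", events)
```

Going from a person to its facts is a direct lookup, while going from a person to its events is a search over all events.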

@jralls

I think "informant" is a property of a Source, not of an Event.

I think it's up to the researcher/situation ... I remember one Census entry where the enumerator was also recording his own relatives and speculation or not some of the entries looked decidedly dubious ... hence the fact that the "Head of household" was also the "Enumerator" was highly relevant. If I left it within the Source it wouldn't have been prominent. Same applies to things like recording illegitimate births etc ... the social context of the recorder/informant and their relationship to the participants is often highly relevant.

Your example smells like source analysis to me. I absolutely agree that good source analysis is critical, and that it is an essential part of a proof statement. One of the major defects of most extant genealogy database programs is that they force one to strip away all of the stuff that goes between the collection of sources behind a conclusion and the conclusion itself -- at best shuffling it off to a note on the side, at worst making no provision for it at all.

This goes back to #144, and is drifting away from the discussion of the new Event class, so I'll continue over there.

@jralls jralls referenced this pull request
Closed

In need of a Source object #144

@ttwetmore

I wish to go on the record as being one of those who believe that the tags for event types and role types should be a reasonably large enumerated type with the ability to have an "other" with additional information added by the user to clarify.

Is there anything seriously wrong with "father", "mother", and "child" being tags for roles in a birth event? Do we seriously want "primary" for the child and "secondary" for the father and mother? What is the advantage of that? How will software know who is who? Do we have to wait until we personally make conclusions and add relationships to the conclusion model before the software will be able to infer the relationships between people? If good role information is available at the record model level, software that supports research can provide considerably more help by being able to show all the possible hypothetical relationships. Great support for all kinds of "what if" questions. Maybe I'm the only one who is thinking about GEDCOMX with respect to software that fully supports genealogical research?

If this is an issue, please think about the relationship conclusion object. Are we going to limit the kinds of relationship objects in the same manner? Is the intention to limit relationship objects to only spouse/spouse or parent/child? Should the relationship objects be of specific types (e.g., spouse/spouse, parent/child) or should the person pointers in the relationship object be turned into role references? (Answer: they must be role references.)

Along the same lines, where is the actual pedigree to be found in the conclusion model? I assume it must be in parent/child relationships? Each conclusion person could have a special father and mother person reference. Just saying.

As another aside, where is the family in the conclusion model? It seems that one has to search through the entirety of the relationship objects, looking for any that mentions either of two parents in a parent role (you have to search them all because persons don't point to their relationships), to get their natural children regardless of other parent. Do we keep only the children of the two parents together, or include the children with only one of the parents? And of course this misses adopted children. It is rather hard to believe that the FS, based on its strong family concept, is abandoning the family to the winds of software inference.
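To make the cost of that search concrete, here is a minimal Java sketch. The `ParentChild` class and `childrenOfCouple` method are hypothetical names for illustration, not part of the GEDCOM X model. It assumes relationships are stored as a flat list with no back-pointers from persons, and it picks the "children of both parents" answer to the question above; the union would be the other choice.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class FamilyScanSketch {

    /** Hypothetical stand-in for a parent/child relationship object. */
    static final class ParentChild {
        final String parentId;
        final String childId;

        ParentChild(String parentId, String childId) {
            this.parentId = parentId;
            this.childId = childId;
        }
    }

    /**
     * Returns the children linked to BOTH parents. Every relationship is
     * visited once, because persons hold no pointers to their relationships.
     */
    static Set<String> childrenOfCouple(List<ParentChild> all, String parentA, String parentB) {
        Set<String> ofA = new TreeSet<>();
        Set<String> ofB = new TreeSet<>();
        for (ParentChild r : all) {
            if (r.parentId.equals(parentA)) ofA.add(r.childId);
            if (r.parentId.equals(parentB)) ofB.add(r.childId);
        }
        ofA.retainAll(ofB); // intersection: children shared by the two parents
        return ofA;
    }

    public static void main(String[] args) {
        List<ParentChild> all = Arrays.asList(
            new ParentChild("john", "amy"),
            new ParentChild("mary", "amy"),
            new ParentChild("john", "bob"),   // bob has only one of the two parents
            new ParentChild("other", "carl"));
        System.out.println(childrenOfCouple(all, "john", "mary"));
    }
}
```

Adopted children would need an extra relationship type or flag, which this sketch does not model.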

With fixed event and role tags software can infer relationships. With non-specific roles software is blind.
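A rough Java sketch of the inference being argued for: the role names `FATHER`, `MOTHER`, `CHILD`, `WITNESS` and `OFFICIATOR` come from this comment, while `RoleRef`, `inferParentChild` and the output format are hypothetical. The results are hypotheses drawn from a single record, not conclusions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RoleInferenceSketch {

    /** Specific role tags; OTHER is clarified by free text in RoleRef.detail. */
    enum Role { FATHER, MOTHER, CHILD, WITNESS, OFFICIATOR, OTHER }

    /** One role-typed pointer from an event record to a persona (hypothetical). */
    static final class RoleRef {
        final String personaId;
        final Role role;
        final String detail; // only meaningful when role == OTHER

        RoleRef(String personaId, Role role, String detail) {
            this.personaId = personaId;
            this.role = role;
            this.detail = detail;
        }
    }

    /** Derives hypothetical parent/child pairs from the roles of one birth event. */
    static List<String> inferParentChild(List<RoleRef> birthRoles) {
        List<String> result = new ArrayList<>();
        for (RoleRef parent : birthRoles) {
            if (parent.role != Role.FATHER && parent.role != Role.MOTHER) continue;
            for (RoleRef child : birthRoles) {
                if (child.role == Role.CHILD) {
                    result.add(parent.personaId + " parentOf " + child.personaId);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<RoleRef> birth = Arrays.asList(
            new RoleRef("p1", Role.FATHER, null),
            new RoleRef("p2", Role.MOTHER, null),
            new RoleRef("p3", Role.CHILD, null),
            new RoleRef("p4", Role.OTHER, "midwife"));
        for (String line : inferParentChild(birth)) {
            System.out.println(line);
        }
    }
}
```

With only `primary` and `secondary` tags, the same loop would have nothing to key on, which is the point being made here.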

Are both husband and wife primary in a marriage event, or are they both secondary? What's wrong with "husband" and "wife". What's wrong with "witness" as a role or "officiator" as a role?

Is there some problem because a set gets large? What is that problem? Is it just the limits of the human brain in dealing with a large number of things, or is there a more fundamental problem? Must we accept the 7 +/- 2 psychological concept of chunking here? http://en.wikipedia.org/wiki/Chunking_(psychology)

In the LifeLines program I carried this into the definition of records. The program supports person, family, event and source records directly (an enumerated type for records), but one could then add records of "other" type by giving them type fields with any value. If one wants to add a "ship" or a "regiment" or a "congregation" go for it.

@jralls

Do we have to wait until we personally make conclusions and add relationships to the conclusion model before the software will be able to infer the relationships between people?

Yes. And only after a reasonably exhaustive search and skilled analysis of the evidence. I don't know of any software capable of skilled analysis of '''anything'''.

Are both husband and wife primary in a marriage event, or are they both secondary?

In my view, yes, husband and wife are both primary.

What's wrong with "husband" and "wife" [?]

It's prejudicial against same-sex marriages. We actually had a bug report against Gramps for this.

What's wrong with "witness" as a role or "officiator" as a role?

Nothing, but what is a program (as opposed to a researcher) likely to do with it?

Is there some problem because a set gets large? What is that problem? Is it just the limits of the human brain in dealing with a large number of things, or is there a more fundamental problem?

No, the problem is 1) that it can't ever grow to cover all of the possible roles, or event types, or whatever, and 2) that only the enums that are used '''by a program''' are actually useful. Since we're doing Family History, the program needs to capture family relationships (conjugal, parental, and adoptive) for programmatic use. Other relationships can be captured, but little is gained from structuring them because their interpretation must be done by a human. Little because there '''is''' an advantage to a controlled vocabulary when constructing searches.

@jralls

As another aside, where is the family in the conclusion model? It seems that one has to search through the entirety of the relationship objects, looking for any that mentions either of two parents in a parent role (you have to search them all because persons don't point to their relationships), to get their natural children regardless of other parent. Do we keep only the children of the two parents together, or include the children with only one of the parents? And of course this misses adopted children. It is rather hard to believe that the FS, based on its strong family concept, is abandoning the family to the winds of software inference.

I think this amplifies my earlier post about the structure of the data model.

@EssyGreen

In this case the minimum standard is the Person level ie proof/rationale that the person in the Source is (or is not) the person in the tree.

The problem with that is that the result is a collection of dozens of proof statements in a giant text field with no connection to either the sources or the individual conclusions.

I don't agree here ... that giant text field is the researcher's analysis ... that stuff that you said was just in your head. Because it is text it's in English (or whatever) so anyone who can read should be able to understand it. To attempt to code this up and/or shorten it is in my mind madness.

@ttwetmore

John,

I disagree with nearly all of your responses in your penultimate post, but I was only interested in making sure at least one person stood up for the idea of large enumerated sets for events and role types. So I won't proceed further. I appreciate your response.

@EssyGreen

Loads of points floating around here but underlying it all seems to be the same old debate about how much to put in the GEDCOM X standard vs how much freedom to allow different apps. to do what they want/need.

This comes back to what the standard is intended to do - see #141 which still hasn't been resolved. Until we all agree what we are trying to achieve we won't agree the detail.

The more we put into the standard the more we constrain apps. Are we trying to write the all-singing-all-dancing genealogy app here? Or are we trying to develop a fundamental/essential standard which all ("good") genealogy apps should adhere to?

To do the latter we need to focus on simplicity and on what is essential not on bright ideas or "nice-to-have" features which might be great for one app but a nightmare for another to comply with.

@ttwetmore

GEDCOMX needs the features to fully support the genealogical research process. To me this means the current conclusion, source metadata and record models, with a number of changes to smooth them out, simplify them, remove a few things (e.g. attributions), unify a few things (e.g., personas and persons; yeah, wishful thinking), and so on.

I suppose I might be accused of being one of those who want GEDCOMX to support a rich software domain (including the all-singing-all-dancing genealogy app), but the irony is, the DeadEnds model, my design for supporting that app and the rest of the genealogical software domain, is simpler than the combined conclusion, source metadata, and record models.

@EssyGreen

GEDCOMX needs the features to fully support the genealogical research process

Agreed - see #141

I might be accused of being one of those who want GEDCOMX to support a rich software domain

I'm not accusing :) but I am asking Ryan to confirm what the remit is. In #141, Ryan said:

a "minimalist" approach for this first version. But it still needs to be flexible enough to provide for future standards

We seem to be going way beyond minimalist and in doing so losing flexibility.

@ennoborg

Well, I'm with Tom here. Been reading a lot, and don't see a reason to make things more complicated than needed, nor to throw away well known roles, event types, etc., which already exist in existing models and applications.

@ttwetmore

We seem to be going way beyond minimalist and in doing so losing flexibility.

I believe we are going beyond minimalist in at least two ways:

Making the model too complex. Having semi-independent sub-models is a symptom.

Making the archival format too complex. Defining an export file where each object is a file, and where there is heavy use of namespaces and RDF paraphernalia, is a symptom.

I would prefer the GEDCOMX goal, instead of being a minimalist model, to be the simplest complete model! Possibly difficult in the short term.

I wish to stress again the idea of combining the conclusion and record models, allowing a multi-tier approach to evidence and conclusions. This has the exemplary effects of simplifying the model while increasing its expressibility and flexibility. (I have read the white paper that explains why combining the models is wrong, and I agree with much said there.)

@EssyGreen

I agree on all counts :)

@jralls

I wish to stress again the idea of combining the conclusion and record models, allowing a multi-tier approach to evidence and conclusions. This has the exemplary effects of simplifying the model while increasing its expressibility and flexibility. (I have read the white paper that explains why combining the models is wrong, and I agree with much said there.)

First we have to get Ryan to bring back the record model.

But remember that it was separated because it exists for a different problem domain, that of supporting online indexing. ISTM GedcomX is suffering from FamilySearch trying to support two problem domains with one spec:

  • Exchange of complete research databases between different applications
  • Serving some not-well-defined "swaths" (see Ryan's comment in #165) from online databases.

That second problem domain adds a lot of complexity that wouldn't be needed for the first one. Should the two be separated, perhaps by having a single-document format with ID attributes to provide the links and an RDF-free DublinCore-derived (Source)Description? For this discussion let's accept that the document could be XML or Gedcom5-like nested name-value pairs.

@EssyGreen

First we have to get Ryan to bring back the record model

Or we have to use the Record Model (as is) for the Conclusion Model too and if necessary ask Ryan to make some amendments to accommodate things that CM needs which RM doesn't ... I think this is more likely to be feasible than "bringing back" the Record Model when FS and Bright Solid have an urgent need to implement it.

@jralls

Sorry, I didn't mean that the Record Model for online indexing should again be part of GedcomX. That's clearly a very different problem domain and needs its own model, developed separately.

I meant that GedcomX also needs a record model integrated with the existing conclusion model. We can develop it from scratch (which your last comment on #144 begins to do, I think) or Ryan can revert the parts of the change which removed it from GedcomX and we can go from there. Looking over that change, I think perhaps that record model is overly complicated, so perhaps redeveloping from scratch is a better choice.

@EssyGreen

In that case, yes I think we agree :) Tho' "scratch" sounds a bit drastic and prolly won't wear

@ennoborg

H'm, OK John, so how would you simplify the current record model, and why would you do that exactly? I can understand that a simplified model may be easier to implement in our desktop (or web based) software, but maintaining two models may also be a source of new maintenance problems.

OTOH, I can understand that FS and other institutions want to develop a special record model for their own purposes, and if there is a clear interface between those and the standardized user record model, it saves us from adapting our software to any internal change they want to make.

So, thinking out loud, the answer is yes: I fully support a record model that can be integrated with our software.

@ttwetmore

First we have to get Ryan to bring back the record model.

But remember that it was separated because it exists for a different problem domain, that of supporting online indexing. ISTM GedcomX is suffering from FamilySearch trying to support two problem domains with one spec:

Exchange of complete research databases between different applications
Serving some not-well-defined "swaths" (see Ryan's comment in #165) from online databases.

That second problem domain adds a lot of complexity that wouldn't be needed for the first one. Should the two be separated, perhaps by having a single-document format with ID attributes to provide the links and an RDF-free DublinCore-derived (Source)Description? For this discussion let's accept that the document could be XML or Gedcom5-like nested name-value pairs.

I agree. It was separated because FS sees the two extreme ends of the universe of genealogical applications:

  1. The massive on-line conclusion tree -- conclusion model with Dublin core type support.
  2. The massive collection of indexed records -- record model with Dublin core type support.

FS is somewhat narrowly focused on these needs, and not paying as close attention to the rest of the industry. Here is where my Microsoft-like insult comes in. FS is simply so big and so engaged with their own massive activities, that it is easy for them to assume that what's right for them is right for all. And their administrative hierarchy is so demanding that they meet their own needs, that even if the gurus wanted to take the whole genealogical industry under their wing it is impossible.

My concern for years has been that the genealogical software industry evolve so the better applications can fully support the genealogical research process. The genealogical desktop applications of today do a good job of storing our conclusions, and if we are willing to put in the effort, they do a good job of storing the sources for those conclusions at a Dublin Core type level.

What is missing from these conclusion/source systems is a way to explicitly handle the genealogical evidence that can be extracted from the sources.

There is confusion about what the term evidence means, and for good reason. By evidence I simply mean the information that you find in the sources that is interesting to you. In a conclusion/source model, the evidence doesn't make it into your database explicitly. The important facts from the evidence will show up as dates and places and other attributes of conclusion records, and images of the evidence might make it onto your computer's hard drive. But that is the extent of evidence support today (some systems will keep track of where you store image files). It is up to your brain to keep track of where to go to refresh your memory of the facts that you might later decide to add to conclusion records. Yes, your source records can point you in the right direction, and the images on your hard drive, if you can remember where you put them and what you named them, can refresh your memory. The facts themselves, until they are explicitly added to a conclusion object, are only indirectly available to you.

A genealogical application that supports the research process needs a more complete representation at the evidence level. The best metaphor for understanding this need is to think about the manual, index card technique for conducting research. Here a researcher fills out index cards for sources, penciling in Dublin Core properties on that set of cards, and then the researcher fills out index cards with the facts found in those sources that he/she believes may later help in reaching his/her historical conclusions. A major advantage of the index card approach is that the researcher can shuffle them around, bringing together disparate facts from disparate sources, allowing them to see relationships that would be hard to grasp by simply trying to remember what they had read, and allowing them to play around with what-if scenarios.

The source metadata model (really the records created that conform to it) is the computer analog of the index cards holding the source information.

The record model is the computer analog of the index cards that hold the important facts extracted from the sources, that the researcher is going to have to reason carefully about in order to make his/her conclusions.

With these pieces in place I hope you can see what is needed for computer-based support for the genealogical research process. You need the three models we've been mentioning: source metadata, conclusion and record. And I don't think any of them is so important or stand-alone that they need to be separate models. Call them separate parts of an overall model. To support research the three subparts must work together as a single framework, so it is important that they fit together seamlessly.

Now combining the conclusion and record models together is a separate issue. I believe it should be done, because it simplifies the model and because it enables the multi-tier person tree concept that I also believe is key to providing full support for the research process, but it is a somewhat separate issue from that of simply combining the models into a cohesive overall model.

In their NewFamilySearch tree system, FS uses a persona-like concept (the individual person records submitted by contributors) and the person level concept (conclusion level person records) in a way that hides their differences. In that application your "job" is to create and correct family trees by adding new "persona" records, or rearranging existing "persona" records (the index card analogy works perfectly in imagining what this rearranging is like) into a new or corrected set of person (conclusion) records. Except for the fact that the "persona" records are not really evidence level person records; they are very frustrating, very sloppy and very poorly researched records added by generally careless genealogists. I think that both the metaphor presented here of modeling the online conclusion tree, and the horrors of the data sources being used, provide a screaming demand to merge the two models and to allow a multi-tier approach to persons. The NewFamilySearch application is actually a great germ for the next generation of systems that support research. It has already explored many of the problems of linking two levels of information. The very important lesson of NFS is how a two-tiered system can be managed. Evolving into a multi-level system is something I believe to be key in moving to systems that fully support research, though there is real disagreement about this. Separate record and conclusion models only make sense if the world of genealogical information can be explicitly partitioned into pure evidence and pure conclusion. I am wholly convinced that this cannot be done. I believe that evidence and conclusions exist at many levels, and that a single model is best for handling a very gray world.
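One way to picture the multi-tier idea is a single person type whose records may point at lower-tier records of the same type. The Java sketch below is only an illustration under that assumption; `PersonNode`, `basedOn` and `personasBehind` are hypothetical names and are not taken from GEDCOM X, NFS or DeadEnds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MultiTierPersonSketch {

    /** A person record at any tier; a record with no lower tier acts as a persona. */
    static final class PersonNode {
        final String id;
        final List<PersonNode> basedOn = new ArrayList<>();

        PersonNode(String id, PersonNode... lower) {
            this.id = id;
            this.basedOn.addAll(Arrays.asList(lower));
        }

        boolean isPersona() {
            return basedOn.isEmpty();
        }
    }

    /** Walks down the tiers and lists the evidence-level personas behind a record. */
    static List<String> personasBehind(PersonNode node) {
        List<String> out = new ArrayList<>();
        collect(node, out);
        return out;
    }

    private static void collect(PersonNode node, List<String> out) {
        if (node.isPersona()) {
            out.add(node.id);
            return;
        }
        for (PersonNode lower : node.basedOn) {
            collect(lower, out);
        }
    }

    public static void main(String[] args) {
        PersonNode census = new PersonNode("census-1850-persona");
        PersonNode marriage = new PersonNode("marriage-1848-persona");
        PersonNode death = new PersonNode("death-1890-persona");
        // An intermediate hypothesis groups two personas; the top record adds a third.
        PersonNode youngAdult = new PersonNode("hypothesis-young-adult", census, marriage);
        PersonNode conclusion = new PersonNode("conclusion-person", youngAdult, death);
        System.out.println(personasBehind(conclusion));
    }
}
```

Because every tier uses the same type, an intermediate hypothesis can be regrouped or discarded without touching the personas beneath it.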

Think how wonderful the world would be if the NewFamilySearch tree had started off, not with the almost useless junk that it did start off with, but with billions of record model persona records extracted from the FS's immense store of evidence. Then the job of NFS users would be truly to construct the ancestry tree of humankind. And I guarantee the users would appreciate the multi-tier approach in order to bring order and understanding to the task.

@stoicflame
Owner

Hi all.

Just wanted to say thanks for all the time that everybody is putting into helping this move forward. There are some great points that are being made on this thread, and I'm still trying to digest everything down into some concrete action items.

I've made some adjustments to account for a simpler list of event roles at 1707e96, aligning with @jralls suggestions.

I've applied @EssyGreen's suggestion that the event types be separated from the fact types at 6df4f98.

I think there are a lot of other action items that need to be pulled out of this thread and addressed in separate issues. I'll do that next, but don't wait for me if you've a good grasp over what needs to be opened.

As a sidenote, I've gotta get some of my colleagues taking ownership of some of these issues, 'cause I'm getting buried. They're scared of you guys. :-)

Anyway, what other changes are specifically needed for this issue wrt shared events? I understand there are other issues that have been spawned from this one, but what do you see in this changeset that needs to be adjusted?

@stoicflame
Owner
  1. is there a way to add text (a transcription) to an Event? Perhaps as a Note?

Not yet. We're taking that on at #121

  1. is there a way to attach a URL (of a record image) to an Event? I assume this can be done using a SourceReference?

Correct.

  1. is there a way to reference an Event as a source of a Fact? Could a SourceReference refer to an Event?

Yes.

@EssyGreen

As a matter of interest what would the deceased father of a birth be? Not principal since this is the child; not participant or witness (unless their corpse was present at the birth!). Official is the only one which seems to make sense but doesn't feel right.

Lots of programs use the Witness role to mean Other. That's always rubbed me a bit, so maybe we should just explicitly call it "Other".

Need clear and unambiguous definitions of each of these and clear directive on what to do if none of the "known" roles are appropriate.

@EssyGreen

I believe Arrival and Departure are ambiguous and mean different things to different people - see discussion in #161

What on earth is "Move"?

"Education" seems to have been replaced with "Scholastic Achievement". I believe the former is a better term for an Event since there may not have been any "achievement" to record; whereas "Scholastic Achievement" is more like a characteristic of the Person (ie a Fact). For example: Education:"She went to XYZ university between 1931 and 1934" vs Scholastic Achievement:"She attained a PhD in theoretical physics". In genealogy I believe we tend to find more info on education than achievements (e.g. many Censuses specify "scholar" for children)

@EssyGreen

is there a way to add text (a transcription) to an Event? Perhaps as a Note?

Not yet. We're taking that on at #121

#121 is specifically about transcribing a source ... We also need a simple (not Note structured) text field for any sort of narrative the researcher might want to add (e.g. stuff which can't be coded up; verbatim version of the event; explanations and illustrations etc etc). I believe this must allow for html coding (ie must be CDATA)

@EssyGreen

[...] is there a way to reference an Event as a source of a Fact? Could a SourceReference refer to an Event?

Yes.

Are you saying for example that a Birth Cert (Source Reference) which recorded a Birth (Event) mentioning say Sex (Fact) of the child (Person) then the Sex would have some sort of pointer to the Birth? If so, what is the pointer? If not, could you explain/give an example of what is meant here?

@jralls

Anyway, what other changes are specifically needed for this issue wrt shared events? I understand there are other issues that have been spawned from this one, but what do you see in this changeset that needs to be adjusted?

  1. Add Event as a containing entity for sources in the properties description of paragraph 5.1

  2. Modify specifications/sources/gedcomx.zargo to reflect the new Event and EventRole classes

  3. Add a "Detail" string parameter to EventRole for making the role more specific.

  4. Consider changing "Witness" to "Other", as noted in an in-line comment.
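As a sketch of item 3, assuming the four role types from this changeset (`Principal`, `Participant`, `Official`, `Witness`) and a hypothetical `detail` field; the class layout and the `describe` method are illustrative only, not the actual `EventRole` implementation:

```java
public class EventRoleDetailSketch {

    /** The four role types introduced in this changeset. */
    enum EventRoleType { Principal, Participant, Official, Witness }

    /** Hypothetical EventRole with a free-text detail that narrows the coarse type. */
    static final class EventRole {
        final String personId;
        final EventRoleType type; // may be null when no known role fits
        final String detail;      // e.g. "father", "godparent"; may be null

        EventRole(String personId, EventRoleType type, String detail) {
            this.personId = personId;
            this.type = type;
            this.detail = detail;
        }

        /** Human-readable label: the coarse type, plus the detail when present. */
        String describe() {
            String base = (type == null) ? "unspecified role" : type.name();
            if (detail == null || detail.isEmpty()) {
                return base;
            }
            return base + " (" + detail + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(new EventRole("p1", EventRoleType.Principal, null).describe());
        System.out.println(new EventRole("p2", EventRoleType.Participant, "father").describe());
    }
}
```

Software can still search on the coarse type, while the detail string stays available to a human reader.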

@jralls

Are you saying for example that a Birth Cert (Source Reference) which recorded a Birth (Event) mentioning say Sex (Fact) of the child (Person) then the Sex would have some sort of pointer to the Birth? If so, what is the pointer? If not, could you explain/give an example of what is meant here?

We discussed this before, in #136 -- and probably in other places, too. A SourceReference is a URI to an RDF graph, and that is totally unconstrained. It's entirely up to the creator of the RDF graph and of the SourceReference to ensure that it has meaning. I asked a couple of days ago if that's really Ryan's intent, but he hasn't responded.

@jralls

The file size issue has gotten a bit of a life of its own, so I did a bit of analysis on it and opened #173 with the results. Let's move that part of the discussion over there.

@stoicflame
Owner

Sorry for the delay on this.

See changes at 8927d5c, 8927d5c, and 8927d5c.

Now what? Anything else before I merge?

@stoicflame
Owner

What on earth is "Move"?

Maybe we should rename it to Relocation?

@jralls

Now what? Anything else before I merge?

Paragraph 5.1 still requires that "The sources of a conclusion MUST also be sources of the conclusion's containing entity (i.e. Person or Relationship)." That needs to be reworked to handle Events.

The catch-all value for EventRole is still "Witness". Concordant with many extant programs, but doesn't always express the correct meaning.

@jralls

What on earth is "Move"?

American for "Removal".

Maybe we should rename it to Relocation?

Can't we just include it in the overloads of "Departure" and "Arrival"?

@stoicflame
Owner

Can't we just include it in the overloads of "Departure" and "Arrival"?

Hmm... maybe. Tracking at #186

@stoicflame
Owner

That needs to be reworked to handle Events.

How? Events don't contain any conclusions, at least not on the branch we're collaborating on here. I'm aware of the type refactor being coordinated across other threads, but that's separate from this discussion.

The catch-all value for EventRole is still "Witness".

Actually, the catch-all value is null. Just leave it empty if none of the known roles fit. Or put in your own custom type.

That concept is not unique to EventRole, that's applicable to wherever we're maintaining a controlled vocabulary. If that needs to be clarified, let's open up a separate issue to address all those places.
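A self-contained Java sketch of that convention, for illustration only: an absent type stays `null`, a recognized URI maps to its enum constant, and any custom URI falls through to `OTHER`. It does not use the Enunciate `XmlQNameEnumUtil` helper from the diff, and the namespace constant `NS` is a made-up placeholder rather than the real GEDCOM X namespace.

```java
public class RoleTypeLookupSketch {

    /** Known role types plus a marker for unrecognized (custom) URIs. */
    enum EventRoleType { Principal, Participant, Official, Witness, OTHER }

    /** Hypothetical namespace for the controlled vocabulary (an assumption). */
    static final String NS = "http://example.org/role/";

    /**
     * Maps a role-type URI to the controlled vocabulary.
     * null in, null out: "just leave it empty if none of the known roles fit".
     */
    static EventRoleType fromUri(String uri) {
        if (uri == null) {
            return null;
        }
        if (uri.startsWith(NS)) {
            String local = uri.substring(NS.length());
            for (EventRoleType t : EventRoleType.values()) {
                if (t != EventRoleType.OTHER && t.name().equals(local)) {
                    return t;
                }
            }
        }
        return EventRoleType.OTHER; // custom type: keep the original URI alongside it
    }

    public static void main(String[] args) {
        System.out.println(fromUri(NS + "Witness"));
        System.out.println(fromUri("http://example.com/custom#Midwife"));
        System.out.println(fromUri(null));
    }
}
```

A caller that gets `OTHER` would need to retain the original URI string, since the enum alone no longer says which custom role was meant.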

@jralls

Events don't contain any conclusions

Ah, sorry, I thought for some reason that EventRole extended Conclusion rather than GenealogicalResource.

let's open up a separate issue to address all those places.

#187

@EssyGreen

Why did the ordering get taken out? I realise the order was questioned by @jralls and I guess I missed the opportunity to defend it so am doing so now by answering: the order which the researcher thought was important!

  • accounts, emails, phones etc ... order would probably be most used
  • addresses - e.g. local branch might come before head office
  • sources - might be the order of discovery
  • alternate forms - might be ordered by probability/frequency of use
  • facts - might be order in which they occurred where dates are approximate

etc etc

The point being that the order was determined by the originating researcher and/or application. The message to any importing application should be: Don't mess with it unless you have to!

@EssyGreen

What on earth is "Move"?

Maybe we should rename it to Relocation?

We had this discussion before ... Move/Relocation has to be either away from or to somewhere so it needs 2 fact types/events like immigration/emigration in case peeps die enroute etc

@jralls

The point being that the order was determined by the originating researcher and/or application.

My original question was "ordered by what?". That's a programming-domain question, not a conceptual-domain one. If the answer is "ordered list in the sense of xsd:sequence" (i.e., implementations should preserve document order), OK, that works for me.

@EssyGreen

The point being that the order was determined by the originating researcher and/or application.
My original question was "ordered by what?". That's a programming-domain question, not a conceptual-domain one.

Yes I get that .. so I should have said "change ordered list to a list which maintains its original order" :) The exact term depends on the language you're using.

@stoicflame
Owner

If the answer is "ordered list in the sense of xsd:sequence" (i.e., implementations should preserve document order), OK, that works for me.

That was the intent, yes.

Doesn't the concept of a "list" imply that it "maintains its original order"? It does in Java, anyway. So that's why I removed the "ordered" qualifier... it was redundant and (based on the fact that it was brought up) it seems to cause confusion.
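For the Java case specifically, a small sketch of the distinction under discussion: a `java.util.List` copy keeps document order, while passing the same values through a sorted collection does not. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class ListOrderSketch {

    /** Copies the values into a List; iteration order equals document order. */
    static List<String> keepDocumentOrder(List<String> values) {
        return new ArrayList<>(values);
    }

    /** Runs the values through a sorted set; the researcher's order is lost. */
    static List<String> reorderedBySortedSet(List<String> values) {
        return new ArrayList<>(new TreeSet<>(values));
    }

    public static void main(String[] args) {
        List<String> addresses = Arrays.asList("local branch", "head office", "archive");
        System.out.println(keepDocumentOrder(addresses));
        System.out.println(reorderedBySortedSet(addresses));
    }
}
```

An importing application that wants to honor "don't mess with it unless you have to" would stay with the first form end to end.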

@ttwetmore

I agree that lists are implicitly ordered. So there is one item of ambiguity concerning lists that should be specified up front.

A genealogical object may have many properties of the same type (e.g., a name, a birth event, etc.) When multiple properties of the same type are found in an object, it must be clear which is the one that is the "preferred" value of the property, that is, the one to be shown in displays or to be treated as the most important, or the one to be used in age or other genealogical algorithms.

Some people choose the first and some choose the last. It should be made clear from the beginning by having it defined into the specs.

In the record model this issue would generally not arise, as most records only include one value for each property. However, in the conclusion model one might want to list all the names found on all the items of evidence that the researcher has decided refer to the same person.

[Aside: In the LifeLines program I wrote eons ago I chose the "first is best" interpretation. After putting the program into the public domain someone willy nilly changed that interpretation in a few spots to be "last is best." In the current release you have to experiment on a case by case basis to discover which is which. Names are first is best. Deaths are last is best. Not a happy situation.]
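If the spec were to adopt "first is best" (an assumption for this sketch; the thread has not settled the question), the rule could be stated once and reused for every multi-valued property. The `preferred` helper below is a hypothetical name.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class PreferredValueSketch {

    /** "First is best": the preferred value is the first element, if there is one. */
    static <T> Optional<T> preferred(List<T> values) {
        if (values == null || values.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(values.get(0));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Thomas Wetmore", "Tom Wetmore", "T. Wetmore");
        System.out.println(preferred(names).orElse("(none)"));
        System.out.println(preferred(Collections.<String>emptyList()).orElse("(none)"));
    }
}
```

Using one helper for names, births and deaths alike avoids the mixed first-is-best and last-is-best behavior described in the aside.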

@stoicflame
Owner

it must be clear which is the one that is the "preferred" value of the property

See discussion at #176

@EssyGreen

Doesn't the concept of a "list" imply that it "maintains its original order"?

No there are many different types of list :)

@stoicflame
Owner

Okay. See d234f8b for the wording to clarify that the order of lists is preserved.

I'm going to give a day or so for further comments. Assuming no big objections, I'm going to merge.

@stoicflame stoicflame Merge branch 'master' into shared-events
Conflicts:
	specifications/conceptual-model-specification.md
	specifications/json-format-specification.md
	specifications/support/gedcomx.zargo
14c1a54
@stoicflame stoicflame merged commit 05a6e22 into from
@stoicflame
Owner

Merged at 05a6e22

@jralls

Ryan, can you convert this back into an issue and re-open it? There are a ton of things discussed here (and which you consolidated from other issues) which aren't covered by shared-events.

@EssyGreen

Maybe it would be easier to have new issues to discuss the other stuff .... it's a bit of a mammoth thread already

@stoicflame
Owner

Maybe it would be easier to have new issues to discuss the other stuff .... it's a bit of a mammoth thread already

Yes, please. If there are issues that this thread spawned, let's open them separately rather than making people trudge through this one to get context.

Commits on Jun 14, 2012
  1. @stoicflame
Commits on Jun 19, 2012
  1. @stoicflame
  2. @stoicflame
  3. @stoicflame
  4. @stoicflame
  5. @stoicflame
Commits on Jul 10, 2012
  1. @stoicflame
  2. @stoicflame
  3. @stoicflame
Commits on Jul 12, 2012
  1. @stoicflame
Commits on Jul 16, 2012
  1. @stoicflame

    Merge branch 'master' into shared-events

    stoicflame authored
    Conflicts:
    	specifications/conceptual-model-specification.md
    	specifications/json-format-specification.md
    	specifications/support/gedcomx.zargo
21 ...main/java/org/gedcomx/types/RelationshipRole.java → ...rc/main/java/org/gedcomx/types/EventRoleType.java
@@ -20,22 +20,17 @@
import org.gedcomx.common.URI;
/**
- * Enumeration of standard relationship roles.
+ * Enumeration of standard event roles.
*/
@XmlQNameEnum (
base = XmlQNameEnum.BaseType.URI
)
-public enum RelationshipRole {
+public enum EventRoleType {
- Spouse,
- Parent,
- Child,
- Grandparent,
- Grandchild,
- Ancestor,
- Descendant,
- Cousin,
- InLaw,
+ Principal,
+ Participant,
+ Official,
+ Witness,
@XmlUnknownQNameEnumValue
OTHER;
@@ -54,8 +49,8 @@ public URI toQNameURI() {
* @param qname The qname.
* @return The enumeration.
*/
- public static RelationshipRole fromQNameURI(URI qname) {
- return org.codehaus.enunciate.XmlQNameEnumUtil.fromURIValue(qname.toString(), RelationshipRole.class);
+ public static EventRoleType fromQNameURI(URI qname) {
+ return org.codehaus.enunciate.XmlQNameEnumUtil.fromURIValue(qname.toString(), EventRoleType.class);
}
}
90 gedcomx-common/src/main/java/org/gedcomx/types/EventType.java
@@ -0,0 +1,90 @@
+/**
+ * Copyright 2011-2012 Intellectual Reserve, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.gedcomx.types;
+
+import org.codehaus.enunciate.qname.XmlQNameEnum;
+import org.codehaus.enunciate.qname.XmlUnknownQNameEnumValue;
+import org.gedcomx.common.URI;
+
+/**
+ * Enumeration of standard event types.
+ */
+@XmlQNameEnum (
+ base = XmlQNameEnum.BaseType.URI
+)
+public enum EventType {
+
+ Adoption,
+ AdultChristening,
+ Annulment,
+ Arrival,
+ Baptism,
+ BarMitzvah,
+ BatMitzvah,
+ Birth,
+ Blessing,
+ Burial,
+ Census,
+ Christening,
+ Circumcision,
+ Confirmation,
+ Cremation,
+ Death,
+ Departure,
+ Divorce,
+ DivorceFiling,
+ Education,
+ Engagement,
+ Emigration,
+ Excommunication,
+ FirstCommunion,
+ Funeral,
+ Graduation,
+ Immigration,
+ Interment,
+ Marriage,
+ MilitaryAward,
+ MilitaryDischarge,
+ Mission,
+ Move,
+ Ordinance,
+ Ordination,
+ Retirement,
+
+
+ @XmlUnknownQNameEnumValue
+ OTHER;
+
+ /**
+ * Return the QName value for this enum.
+ *
+ * @return The QName value for this enum.
+ */
+ public URI toQNameURI() {
+ return URI.create(org.codehaus.enunciate.XmlQNameEnumUtil.toURIValue(this));
+ }
+
+ /**
+ * Get the enumeration from the QName.
+ *
+ * @param qname The qname.
+ * @return The enumeration.
+ */
+ public static EventType fromQNameURI(URI qname) {
+ return org.codehaus.enunciate.XmlQNameEnumUtil.fromURIValue(qname.toString(), EventType.class);
+ }
+
+}
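For readers following the diff without the enunciate library at hand, the URI round trip that `toQNameURI()` and `fromQNameURI()` provide can be approximated as below. This is only a sketch under stated assumptions: `EventTypeSketch`, its `NAMESPACE` constant, and the handling of `null` and `OTHER` are invented for illustration, and `java.net.URI` is used in place of `org.gedcomx.common.URI`. The real enum delegates to `XmlQNameEnumUtil` rather than building URIs by hand.

```java
import java.net.URI;

// Hypothetical stand-in for EventType; only a few of the constants are repeated here.
enum EventTypeSketch {
  Birth, Census, Marriage, OTHER;

  // Assumed namespace, matching the known event type URIs listed in the specification.
  static final String NAMESPACE = "http://gedcomx.org/";

  // Known constants map to NAMESPACE + name; OTHER has no URI of its own in this sketch.
  URI toQNameURI() {
    return this == OTHER ? null : URI.create(NAMESPACE + name());
  }

  // Unrecognized (or null) URIs fall back to OTHER instead of failing.
  static EventTypeSketch fromQNameURI(URI qname) {
    if (qname != null) {
      for (EventTypeSketch type : values()) {
        if (type != OTHER && qname.toString().equals(NAMESPACE + type.name())) {
          return type;
        }
      }
    }
    return OTHER;
  }

  public static void main(String[] args) {
    // A known URI resolves to its constant; a custom URI resolves to OTHER.
    System.out.println(fromQNameURI(URI.create("http://gedcomx.org/Marriage")));
    System.out.println(fromQNameURI(URI.create("http://example.org/CustomEvent")));
  }
}
```

The fallback to `OTHER` mirrors the role of the `@XmlUnknownQNameEnumValue` constant in the diff: a custom event type URI still deserializes, and callers can read the raw URI from `getType()` when `getKnownType()` is not specific enough.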
2  gedcomx-common/src/test/java/org/gedcomx/types/TypesTest.java
@@ -23,7 +23,7 @@ public void testToQNameURI() throws Exception {
NameType.Formal.toQNameURI();
PlacePartType.Address.toQNameURI();
RecordType.Census.getBaseType();
- RelationshipRole.Ancestor.toQNameURI();
+ EventRoleType.Participant.toQNameURI();
RelationshipType.Couple.toQNameURI();
}
}
221 gedcomx-conclusion/src/main/java/org/gedcomx/conclusion/Event.java
@@ -0,0 +1,221 @@
+/**
+ * Copyright 2011-2012 Intellectual Reserve, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.gedcomx.conclusion;
+
+import org.codehaus.enunciate.json.JsonName;
+import org.codehaus.jackson.annotate.JsonIgnore;
+import org.codehaus.jackson.annotate.JsonProperty;
+import org.codehaus.jackson.annotate.JsonTypeInfo;
+import org.codehaus.jackson.map.annotate.JsonTypeIdResolver;
+import org.gedcomx.common.GenealogicalResource;
+import org.gedcomx.common.URI;
+import org.gedcomx.rt.CommonModels;
+import org.gedcomx.rt.JsonElementWrapper;
+import org.gedcomx.rt.XmlTypeIdResolver;
+import org.gedcomx.types.EventType;
+import org.gedcomx.types.TypeReference;
+
+import javax.xml.bind.annotation.XmlElement;
+import javax.xml.bind.annotation.XmlRootElement;
+import javax.xml.bind.annotation.XmlTransient;
+import javax.xml.bind.annotation.XmlType;
+import java.util.ArrayList;
+import java.util.List;
+
+/**
+ * A historical event.
+ *
+ * @author Ryan Heaton
+ */
+@XmlRootElement
+@JsonElementWrapper (name = "events")
+@JsonTypeInfo ( use = JsonTypeInfo.Id.CUSTOM, property = XmlTypeIdResolver.TYPE_PROPERTY_NAME)
+@JsonTypeIdResolver ( XmlTypeIdResolver.class )
+@XmlType ( name = "Event", propOrder = { "type", "date", "place", "roles", "sources" } )
+public class Event extends GenealogicalResource implements ReferencesSources {
+
+ @XmlElement (namespace = CommonModels.RDF_NAMESPACE)
+ @JsonProperty
+ private TypeReference<EventType> type;
+ private Date date;
+ private Place place;
+ private List<EventRole> roles;
+ private List<SourceReference> sources;
+
+ /**
+ * Create an event.
+ */
+ public Event() {
+ }
+
+ /**
+ * Create an event with the passed-in type.
+ *
+ * @param eventType The event type.
+ */
+ public Event(EventType eventType) {
+ setKnownType(eventType);
+ }
+
+ /**
+ * Create a date/place event with the passed-in type and values.
+ *
+ * @param eventType The event type.
+ * @param date The date of applicability of this event.
+ * @param place The place of applicability of this event.
+ */
+ public Event(EventType eventType, Date date, Place place) {
+ setKnownType(eventType);
+ setDate(date);
+ setPlace(place);
+ }
+
+ /**
+ * The type of the event.
+ *
+ * @return The type of the event.
+ */
+ @XmlTransient
+ @JsonIgnore
+ public URI getType() {
+ return this.type == null ? null : this.type.getType();
+ }
+
+ /**
+ * The type of the event.
+ *
+ * @param type The type of the event.
+ */
+ @JsonIgnore
+ public void setType(URI type) {
+ this.type = type == null ? null : new TypeReference<EventType>(type);
+ }
+
+ /**
+ * The enum referencing the known type of the event, or {@link org.gedcomx.types.EventType#OTHER} if not known.
+ *
+ * @return The enum referencing the known type of the event, or {@link org.gedcomx.types.EventType#OTHER} if not known.
+ */
+ @XmlTransient
+ @JsonIgnore
+ public org.gedcomx.types.EventType getKnownType() {
+ return this.type == null ? null : EventType.fromQNameURI(this.type.getType());
+ }
+
+ /**
+ * Set the type of this event from a known enumeration of event types.
+ *
+ * @param knownType the event type.
+ */
+ @JsonIgnore
+ public void setKnownType(org.gedcomx.types.EventType knownType) {
+ this.type = knownType == null ? null : new TypeReference<EventType>(knownType);
+ }
+
+ /**
+ * The date of this event.
+ *
+ * @return The date of this event.
+ */
+ public Date getDate() {
+ return date;
+ }
+
+ /**
+ * The date of this event.
+ *
+ * @param date The date of this event.
+ */
+ public void setDate(Date date) {
+ this.date = date;
+ }
+
+ /**
+ * The place of this event.
+ *
+ * @return The place of this event.
+ */
+ public Place getPlace() {
+ return place;
+ }
+
+ /**
+ * The place of this event.
+ *
+ * @param place The place of this event.
+ */
+ public void setPlace(Place place) {
+ this.place = place;
+ }
+
+ /**
+ * The roles played in this event.
+ *
+ * @return The roles played in this event.
+ */
+ @XmlElement (name="role")
+ @JsonProperty ("roles")
+ @JsonName ("roles")
+ public List<EventRole> getRoles() {
+ return roles;
+ }
+
+ /**
+ * The roles played in this event.
+ *
+ * @param roles The roles played in this event.
+ */
+ public void setRoles(List<EventRole> roles) {
+ this.roles = roles;
+ }
+
+ /**
+ * The source references for this event.
+ *
+ * @return The source references for this event.
+ */
+ @XmlElement (name="source")
+ @JsonProperty ("sources")
+ @JsonName ("sources")
+ public List<SourceReference> getSources() {
+ return sources;
+ }
+
+ /**
+ * The source references for this event.
+ *
+ * @param sources The source references for this event.
+ */
+ @JsonProperty("sources")
+ public void setSources(List<SourceReference> sources) {
+ this.sources = sources;
+ }
+
+ /**
+ * Add a sourceReference.
+ *
+ * @param sourceReference The sourceReference to be added.
+ */
+ public void addSource(SourceReference sourceReference) {
+ if (sourceReference != null) {
+ if (sources == null) {
+ sources = new ArrayList<SourceReference>();
+ }
+ sources.add(sourceReference);
+ }
+ }
+
+}
123 gedcomx-conclusion/src/main/java/org/gedcomx/conclusion/EventRole.java
@@ -0,0 +1,123 @@
+/**
+ * Copyright 2011-2012 Intellectual Reserve, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.gedcomx.conclusion;
+
+import org.codehaus.jackson.annotate.JsonIgnore;
+import org.gedcomx.common.GenealogicalResource;
+import org.gedcomx.common.ResourceReference;
+import org.gedcomx.rt.RDFRange;
+import org.gedcomx.types.EventRoleType;
+import org.gedcomx.types.TypeReference;
+
+import javax.xml.bind.annotation.XmlTransient;
+import javax.xml.bind.annotation.XmlType;
+
+/**
+ * A role that a specific person plays in an event.
+ *
+ * @author Ryan Heaton
+ */
+@XmlType ( name = "EventRole", propOrder = { "person", "role", "details" } )
+public class EventRole extends GenealogicalResource {
+
+ private ResourceReference person;
+ private TypeReference<EventRoleType> role;
+ private String details;
+
+ /**
+ * Reference to the person playing the role in the event.
+ *
+ * @return Reference to the person playing the role in the event.
+ */
+ @RDFRange (Person.class)
+ public ResourceReference getPerson() {
+ return person;
+ }
+
+ /**
+ * Reference to the person playing the role in the event.
+ *
+ * @param person Reference to the person playing the role in the event.
+ */
+ public void setPerson(ResourceReference person) {
+ this.person = person;
+ }
+
+ /**
+ * The role the person plays in the event.
+ *
+ * @return The role the person plays in the event.
+ */
+ public TypeReference<EventRoleType> getRole() {
+ return role;
+ }
+
+ /**
+ * The role the person plays in the event.
+ *
+ * @param role The role the person plays in the event.
+ */
+ public void setRole(TypeReference<EventRoleType> role) {
+ this.role = role;
+ }
+
+ /**
+ * Details about the role of the person in the event.
+ *
+ * @return Details about the role of the person in the event.
+ */
+ public String getDetails() {
+ return details;
+ }
+
+ /**
+ * Details about the role of the person in the event.
+ *
+ * @param details Details about the role of the person in the event.
+ */
+ public void setDetails(String details) {
+ this.details = details;
+ }
+
+ /**
+ * The enum referencing the known role, or {@link org.gedcomx.types.EventRoleType#OTHER} if not known.
+ *
+ * @return The enum referencing the known role, or {@link org.gedcomx.types.EventRoleType#OTHER} if not known.
+ */
+ @XmlTransient
+ @JsonIgnore
+ public EventRoleType getKnownRole() {
+ return getRole() == null ? null : EventRoleType.fromQNameURI(getRole().getType());
+ }
+
+ /**
+ * Set the role from a known enumeration of roles.
+ *
+ * @param role The known role.
+ */
+ @JsonIgnore
+ public void setKnownRole(EventRoleType role) {
+ setRole(role == null ? null : new TypeReference<EventRoleType>(role));
+ }
+
+ /**
+ * Provide a simple toString() method.
+ */
+ @Override
+ public String toString() {
+ return (person == null) ? "" : person.toString();
+ }
+}
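The motivation given at the top of this thread is that a shared event needs one role entry per person, whereas recording a relationship between every pair of persons grows as N squared. A minimal sketch of that difference follows. The names `SharedEventSketch`, `Role`, `rolesFor`, and `orderedPairs` are hypothetical and use plain strings instead of the `ResourceReference` and `TypeReference` types in the diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; Role stands in for EventRole and uses plain strings.
class SharedEventSketch {

  static final class Role {
    final String person; // reference to the person, e.g. a URI string
    final String role;   // e.g. "Principal", "Participant", "Official", "Witness"

    Role(String person, String role) {
      this.person = person;
      this.role = role;
    }
  }

  // Shared-event approach: one role entry per person, so N entries for N persons.
  static List<Role> rolesFor(List<String> persons, String role) {
    List<Role> roles = new ArrayList<Role>();
    for (String person : persons) {
      roles.add(new Role(person, role));
    }
    return roles;
  }

  // Pairwise approach: every ordered pair of distinct persons, so N * (N - 1) entries.
  static List<String[]> orderedPairs(List<Role> roles) {
    List<String[]> pairs = new ArrayList<String[]>();
    for (Role from : roles) {
      for (Role to : roles) {
        if (from != to) {
          pairs.add(new String[] { from.person, to.person });
        }
      }
    }
    return pairs;
  }

  public static void main(String[] args) {
    List<String> family = Arrays.asList("urn:p1", "urn:p2", "urn:p3", "urn:p4", "urn:p5");
    List<Role> roles = rolesFor(family, "Participant");
    // Prints "5 role entries, 20 derivable pairs".
    System.out.println(roles.size() + " role entries, " + orderedPairs(roles).size() + " derivable pairs");
  }
}
```

Only the role entries need to be stored. The pairs can be derived from them when a pairwise view is wanted, which is why the extraction cost stays linear in the number of persons on the record.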
1  gedcomx-conclusion/src/main/java/org/gedcomx/conclusion/Fact.java
@@ -62,7 +62,6 @@ public Fact() {
public Fact(FactType factType, String original) {
setKnownType(factType);
setOriginal(original);
- setFormal(formal);
}
/**
72 gedcomx-conclusion/src/test/java/org/gedcomx/conclusion/EventTest.java
@@ -0,0 +1,72 @@
+package org.gedcomx.conclusion;
+
+import org.gedcomx.common.Attribution;
+import org.gedcomx.common.ResourceReference;
+import org.gedcomx.common.URI;
+import org.gedcomx.types.EventRoleType;
+import org.gedcomx.types.EventType;
+import org.testng.annotations.Test;
+
+import java.util.ArrayList;
+
+import static org.gedcomx.rt.SerializationUtil.processThroughJson;
+import static org.gedcomx.rt.SerializationUtil.processThroughXml;
+import static org.testng.AssertJUnit.assertEquals;
+
+/**
+ * @author Ryan Heaton
+ */
+@Test
+public class EventTest {
+
+ /**
+ * tests processing an event through xml...
+ */
+ public void testEventXml() throws Exception {
+ Event event = createTestEvent();
+ event = processThroughXml(event);
+ assertTestEvent(event);
+ }
+
+ /**
+ * tests processing an event through json...
+ */
+ public void testEventJson() throws Exception {
+ Event event = createTestEvent();
+ event = processThroughJson(event);
+ assertTestEvent(event);
+ }
+
+ private Event createTestEvent() {
+ Event event = new Event();
+ event.setKnownType(EventType.Marriage);
+ event.setAttribution(new Attribution());
+ event.getAttribution().setProofStatement("explanation");
+ event.setDate(new Date());
+ event.getDate().setOriginal("date");
+ event.setPlace(new Place());
+ event.getPlace().setOriginal("place");
+ event.setRoles(new ArrayList<EventRole>());
+ EventRole role = new EventRole();
+ role.setKnownRole(EventRoleType.Official);
+ role.setPerson(new ResourceReference());
+ role.getPerson().setResource(URI.create("urn:person"));
+ event.getRoles().add(role);
+ SourceReference sourceReference = new SourceReference();
+ sourceReference.setId("source-ref");
+ event.addSource(sourceReference);
+ return event;
+ }
+
+ private void assertTestEvent(Event event) {
+ assertEquals(EventType.Marriage, event.getKnownType());
+ assertEquals("explanation", event.getAttribution().getProofStatement());
+ assertEquals("date", event.getDate().getOriginal());
+ assertEquals("place", event.getPlace().getOriginal());
+ assertEquals(1, event.getRoles().size());
+ assertEquals(EventRoleType.Official, event.getRoles().get(0).getKnownRole());
+ assertEquals("urn:person", event.getRoles().get(0).getPerson().getResource().toString());
+ assertEquals("source-ref", event.getSources().get(0).getId());
+ }
+
+}
161 specifications/conceptual-model-specification.md
@@ -460,10 +460,10 @@ name | description | data type
name | The name of the person or organization. | [`http://www.w3.org/2000/01/rdf-schema#Literal`](#rdf-literal)
homepage | The homepage of the person or organization. | [`http://www.w3.org/2000/01/rdf-schema#Literal`](#rdf-literal)
openid | The [openid](http://openid.net/) of the person or organization. | [`http://www.w3.org/2000/01/rdf-schema#Literal`](#rdf-literal)
-accounts | The online accounts of the person or organization. | Ordered list of [`http://xmlns.com/foaf/0.1/OnlineAccount`](#online-account)
-emails | The email addresses of the person or organization. | Ordered list of [URI](#uri) - MUST resolve to a valid e-mail address (e.g. "mailto:someone@gedcomx.org")
-phones | The phones (voice, fax, mobile) of the person or organization. | Ordered list of [URI](#uri) - MUST resolve to a valid phone number (e.g. "tel:+1-201-555-0123")
-addresses | The addresses of the person or organization. | Ordered list of [`http://www.w3.org/2000/10/swap/pim/contact#Address`](#address)
+accounts | The online accounts of the person or organization. | List of [`http://xmlns.com/foaf/0.1/OnlineAccount`](#online-account). Order is preserved.
+emails | The email addresses of the person or organization. | List of [URI](#uri) - MUST resolve to a valid e-mail address (e.g. "mailto:someone@gedcomx.org"). Order is preserved.
+phones | The phones (voice, fax, mobile) of the person or organization. | List of [URI](#uri) - MUST resolve to a valid phone number (e.g. "tel:+1-201-555-0123"). Order is preserved.
+addresses | The addresses of the person or organization. | List of [`http://www.w3.org/2000/10/swap/pim/contact#Address`](#address). Order is preserved.
<a id="organization"/>
@@ -543,7 +543,7 @@ This data type extends the following data type:
name | description | data type
-----|-------------|----------
-sources | The list of references to the sources of the conclusion. The sources of a conclusion MUST also be sources of the conclusion's containing entity (i.e. [`Person`](#person) or [`Relationship`](#relationship) ).| Ordered list of [`http://gedcomx.org/conclusion/v1/SourceReference`](#source-reference).
+sources | The list of references to the sources of the conclusion. The sources of a conclusion MUST also be sources of the conclusion's containing entity (i.e. [`Person`](#person) or [`Relationship`](#relationship) ).| List of [`http://gedcomx.org/conclusion/v1/SourceReference`](#source-reference). Order is preserved.
<a id="conclusion-date"/>
@@ -592,9 +592,49 @@ name | description | data type
original | The original value of the place as supplied by the contributor. | string
formal | The formal value of the place. | [`http://gedcomx.org/FormalValue`](#formal-value)
+
+<a id="conclusion-event-role"/>
+
+## 5.4 The "EventRole" Data Type
+
+The `EventRole` data type defines a role played in an event by a person.
+
+### identifier
+
+The identifier for the `EventRole` data type is:
+
+`http://gedcomx.org/conclusion/v1/EventRole`
+
+### extension
+
+This data type extends the following data type:
+
+`http://gedcomx.org/GenealogicalResource`
+
+### properties
+
+name | description | data type
+-----|-------------|----------
+person | Reference to the person playing the role in the event. | [`URI`](#uri) - MUST resolve to an instance of [`http://gedcomx.org/conclusion/v1/Person`](#person)
+role | Reference to the role. | [`URI`](#uri) - MUST resolve to a role. Refer to the list of [known roles](#known-roles).
+details | Details about the role of the person in the event. | string
+
+<a id="known-roles"/>
+
+### known roles
+
+The following roles are defined by GEDCOM X:
+
+URI | description
+----|------------
+`http://gedcomx.org/Principal`|
+`http://gedcomx.org/Participant`|
+`http://gedcomx.org/Official`|
+`http://gedcomx.org/Witness`|
+
<a id="fact-conclusion"/>
-## 5.4 The "Fact" Data Type
+## 5.5 The "Fact" Data Type
The `Fact` data type defines a conclusion about a fact of the life of a person or
the nature of a relationship. The `Fact` data type extends the `Conclusion` data type.
@@ -725,7 +765,7 @@ URI | description | scope
<a id="gender-conclusion"/>
-## 5.5 The "Gender" Data Type
+## 5.6 The "Gender" Data Type
The `Gender` data type defines a conclusion about the gender of a person. The `Gender` data type
extends the `Conclusion` data type.
@@ -762,7 +802,7 @@ URI | description
<a id="name-part"/>
-## 5.6 The "NamePart" Data Type
+## 5.7 The "NamePart" Data Type
The `NamePart` data type defines a part of a name of a person.
@@ -792,8 +832,9 @@ URI | description
`http://gedcomx.org/Given`|
`http://gedcomx.org/Surname`|
+<a id="name-form"/>
-## 5.7 The "NameForm" Data Type
+## 5.8 The "NameForm" Data Type
The `NameForm` data type defines a form of a name of a person.
@@ -808,11 +849,11 @@ The identifier for the `NameForm` data type is:
name | description | data type
-----|-------------|----------
fullText | The full text of the name form. | string
-parts | The parts of the name form. | Ordered list of [`http://gedcomx.org/conclusion/v1/NamePart`](#name-part)
+parts | The parts of the name form. | List of [`http://gedcomx.org/conclusion/v1/NamePart`](#name-part). Order is preserved.
<a id="name-conclusion"/>
-## 5.8 The "Name" Data Type
+## 5.9 The "Name" Data Type
The `Name` data type defines a conclusion about a name of a person. The `Name` data type
extends the `Conclusion` data type.
@@ -835,7 +876,7 @@ name | description | data type
-----|-------------|----------
type | URI identifying the type of the name. | [URI](#uri) - MUST resolve to a name type. Refer to the list of [known name types](#known-name-types).
primaryForm | The primary form of the name. | `http://gedcomx.org/conclusion/v1/NameForm`
-alternateForms | The alternate forms of the name. | Ordered list of `http://gedcomx.org/conclusion/v1/NameForm`
+alternateForms | The alternate forms of the name. | List of [`http://gedcomx.org/conclusion/v1/NameForm`](#name-form). Order is preserved.
preferred | Whether this name is preferred above the other names of a person. | boolean
<a id="known-name-types"/>
@@ -884,13 +925,13 @@ This data type extends the following data type:
name | description | data type
-----|-------------|----------
-identifiers | Identifiers for the person. | Ordered list of [`http://gedcomx.org/Identifier`](#identifier-type)
+identifiers | Identifiers for the person. | List of [`http://gedcomx.org/Identifier`](#identifier-type). Order is preserved.
living | Whether the person is considered living. | boolean
gender | The conclusion about the gender of the person. | [`http://gedcomx.org/conclusion/v1/Gender`](#gender)
-names | The conclusions about the names of the person. | Ordered list of [`http://gedcomx.org/conclusion/v1/Name`](#name-conclusion)
-facts | The conclusions about the facts of the life of the person. | Ordered list of [`http://gedcomx.org/conclusion/v1/Fact`](#fact-conclusion)
-sources | The list of references to the evidence of the person. | Ordered list of [`http://gedcomx.org/conclusion/v1/SourceReference`](#source-reference)
-notes | Contributed notes about the person. | Ordered list of [`http://gedcomx.org/Note`](#note)
+names | The conclusions about the names of the person. | List of [`http://gedcomx.org/conclusion/v1/Name`](#name-conclusion). Order is preserved.
+facts | The conclusions about the facts of the life of the person. | List of [`http://gedcomx.org/conclusion/v1/Fact`](#fact-conclusion). Order is preserved.
+sources | The list of references to the evidence of the person. | List of [`http://gedcomx.org/conclusion/v1/SourceReference`](#source-reference). Order is preserved.
+notes | Contributed notes about the person. | List of [`http://gedcomx.org/Note`](#note). Order is preserved.
<a id="relationship"/>
@@ -922,9 +963,9 @@ name | description | data type
type | URI identifying the type of the relationship. | [URI](#uri) - MUST resolve to a relationship type. Refer to the list of [known relationship types](#known-relationship-types)
person1 | Reference to the first person in the relationship. | [URI](#uri) - MUST resolve to an instance of [`http://gedcomx.org/conclusion/v1/Person`](#person)
person2 | Reference to the second person in the relationship. | [URI](#uri) - MUST resolve to an instance of [`http://gedcomx.org/conclusion/v1/Person`](#person)
-facts | The conclusions about the facts of the life of the relationship. | Ordered list of [`http://gedcomx.org/conclusion/v1/Fact`](#fact-conclusion)
-sources | The list of references to the evidence of the relationship. | Ordered list of [`http://gedcomx.org/conclusion/v1/SourceReference`](#source-reference)
-notes | Contributed notes about the relationship. | Ordered list of [`http://gedcomx.org/Note`](#note)
+facts | The conclusions about the facts of the life of the relationship. | List of [`http://gedcomx.org/conclusion/v1/Fact`](#fact-conclusion). Order is preserved.
+sources | The list of references to the evidence of the relationship. | List of [`http://gedcomx.org/conclusion/v1/SourceReference`](#source-reference). Order is preserved.
+notes | Contributed notes about the relationship. | List of [`http://gedcomx.org/Note`](#note). Order is preserved.
Note: when a relationship type implies direction, the relationship is said to
be *from* person1 *to* person2. For example, in a parent-child relationship, the
@@ -943,7 +984,83 @@ URI | description
`http://gedcomx.org/ParentChild`|
-# 8. Extensibility
+# 8. The Event
+
+An event describes a historical event.
+
+## 8.1 The "Event" Data Type
+
+The `Event` data type defines a description of a historical event. The `Event` data type
+extends the `GenealogicalResource` data type.
+
+### identifier
+
+The identifier for the `Event` data type is:
+
+`http://gedcomx.org/conclusion/v1/Event`
+
+### extension
+
+This data type extends the following data type:
+
+`http://gedcomx.org/GenealogicalResource`
+
+### properties
+
+name | description | data type
+-----|-------------|----------
+type | URI identifying the type of the event. | [URI](#uri). MUST resolve to an event type. Refer to the list of [known event types](#known-event-types)
+date | The date of the event. | [`http://gedcomx.org/conclusion/v1/Date`](#conclusion-date)
+place | The place of the event. | [`http://gedcomx.org/conclusion/v1/Place`](#conclusion-place)
+roles | The roles of the persons in the event. | List of [`http://gedcomx.org/conclusion/v1/EventRole`](#conclusion-event-role). Order is preserved.
+sources | The list of references to the evidence of the event. | List of [`http://gedcomx.org/conclusion/v1/SourceReference`](#source-reference). Order is preserved.
+
+<a id="known-event-types"/>
+
+### known event types
+
+The following event types are defined by GEDCOM X:
+
+URI | description
+----|------------
+`http://gedcomx.org/Adoption`|
+`http://gedcomx.org/AdultChristening`|
+`http://gedcomx.org/Annulment`|
+`http://gedcomx.org/Arrival`|
+`http://gedcomx.org/Baptism`|
+`http://gedcomx.org/BarMitzvah`|
+`http://gedcomx.org/BatMitzvah`|
+`http://gedcomx.org/Birth`|
+`http://gedcomx.org/Blessing`|
+`http://gedcomx.org/Burial`|
+`http://gedcomx.org/Census`|
+`http://gedcomx.org/Christening`|
+`http://gedcomx.org/Circumcision`|
+`http://gedcomx.org/Confirmation`|
+`http://gedcomx.org/Cremation`|
+`http://gedcomx.org/Death`|
+`http://gedcomx.org/Departure`|
+`http://gedcomx.org/Divorce`|
+`http://gedcomx.org/DivorceFiling`|
+`http://gedcomx.org/Education`|
+`http://gedcomx.org/Engagement`|
+`http://gedcomx.org/Emigration`|
+`http://gedcomx.org/Excommunication`|
+`http://gedcomx.org/FirstCommunion`|
+`http://gedcomx.org/Funeral`|
+`http://gedcomx.org/Graduation`|
+`http://gedcomx.org/Immigration`|
+`http://gedcomx.org/Interment`|
+`http://gedcomx.org/Marriage`|
+`http://gedcomx.org/MilitaryAward`|
+`http://gedcomx.org/MilitaryDischarge`|
+`http://gedcomx.org/Mission`|
+`http://gedcomx.org/Move`|
+`http://gedcomx.org/Ordinance`|
+`http://gedcomx.org/Ordination`|
+`http://gedcomx.org/Retirement`|
+
+# 9. Extensibility
## Extensions from Non-GEDCOM X Vocabularies
@@ -1003,4 +1120,4 @@ a known data type, GEDCOM X recognizes the data URI scheme as defined by
todo: add details about which properties are required.
-todo: supply details about how GEDCOM X defines its evidence model.
+todo: supply details about how GEDCOM X defines its evidence model.
91 specifications/json-format-specification.md
@@ -684,9 +684,41 @@ formal | The formal value of the place. | formal | [`FormalValue`](#formal-value
}
```
+<a id="conclusion-event-role"/>
+
+## 5.4 The "EventRole" Data Type
+
+The JSON object used to (de)serialize the `http://gedcomx.org/conclusion/v1/EventRole`
+data type is defined as follows:
+
+### properties
+
+name | description | JSON member | JSON object type
+-----|-------------|--------------|---------
+person | Reference to the person playing the role in the event. | person | [`URI`](#uri)
+role | Reference to the role. | role | [`URI`](#uri)
+details | Details about the role of the person in the event. | details | string
+
+### examples
+
+```json
+{
+ "@type" : "http://gedcomx.org/conclusion/v1/EventRole",
+ "id" : "local_id",
+ "person" : {
+ "resource" : "http://identifier/for/person/1"
+ },
+ "role" : {
+ "resource" : "http://gedcomx.org/Witness"
+ },
+ "details" : "..."
+}
+```
+
+
<a id="fact-conclusion"/>
-## 5.4 The "Fact" Data Type
+## 5.5 The "Fact" Data Type
The JSON object used to (de)serialize the `http://gedcomx.org/conclusion/v1/Fact` data type is defined as follows:
@@ -722,7 +754,7 @@ formal | The formal value of the fact. | formal | [`FormalValue`](#formal-value)
<a id="gender-conclusion"/>
-## 5.5 The "Gender" Data Type
+## 5.6 The "Gender" Data Type
The JSON object used to (de)serialize the `http://gedcomx.org/conclusion/v1/Gender` data type is defined as follows:
@@ -743,7 +775,7 @@ type | URI identifying the type of the gender. | type | [`URI`](#uri)
<a id="name-part"/>
-## 5.6 The "NamePart" Data Type
+## 5.7 The "NamePart" Data Type
The JSON object used to (de)serialize the `http://gedcomx.org/conclusion/v1/NamePart` data type is defined as follows:
@@ -763,7 +795,7 @@ text | The text of the name part. | text | string
}
```
-## 5.7 The "NameForm" Data Type
+## 5.8 The "NameForm" Data Type
The JSON object used to (de)serialize the `http://gedcomx.org/conclusion/v1/NameForm` data type is defined as follows:
@@ -785,7 +817,7 @@ parts | The parts of the name form. | parts | array of [`NamePart`](#name-part)
<a id="name-conclusion"/>
-## 5.8 The "Name" Data Type
+## 5.9 The "Name" Data Type
The JSON object used to (de)serialize the `http://gedcomx.org/conclusion/v1/Name` data type is defined as follows:
@@ -842,9 +874,9 @@ notes | Contributed notes about the person. | notes | array of [`Note`](#note)
"gender" : {
...
},
- "names" : [ { ... }, { ... } ]
- "facts" : [ { ... }, { ... } ]
- "sources" : [ { ... }, { ... } ]
+ "names" : [ { ... }, { ... } ],
+ "facts" : [ { ... }, { ... } ],
+ "sources" : [ { ... }, { ... } ],
"notes" : [ { ... }, { ... } ]
}
```
@@ -885,7 +917,48 @@ notes | Contributed notes about the relationship. | notes | array of [`Note`](#n
}
```
-# 8. Known JSON Extension Members
+
+<a id="event"/>
+
+# 8. The Event
+
+This section defines the `Event` JSON type corresponding to the `Event` data type
+specified by the section titled "The Event" of the conceptual model specification.
+
+## 8.1 The "Event" Data Type
+
+The JSON object used to (de)serialize the `http://gedcomx.org/conclusion/v1/Event` data type
+is defined as follows:
+
+### properties
+
+name | description | JSON member | JSON object type
+-----|-------------|--------------|---------
+type | URI identifying the type of the event. | type | [`URI`](#uri)
+date | The date of the event. | date | [`Date`](#conclusion-date)
+place | The place of the event. | place | [`Place`](#conclusion-place)
+roles | The roles of the persons in the event. | roles | array of [`EventRole`](#conclusion-event-role)
+sources | The list of references to the evidence of the event. | sources | array of [`SourceReference`](#source-reference)
+
+### examples
+
+```json
+{
+ "id" : "local_id",
+ "type" : "http://gedcomx.org/Marriage",
+ "date" : {
+ ...
+ },
+ "place" : {
+ ...
+ },
+ "roles" : [ { ... }, { ... } ],
+ "sources" : [ { ... }, { ... } ]
+}
+```
+
+
+# 9. Known JSON Extension Members
GEDCOM X defines the notion of extension properties, and the JSON serialization
supports the extensibility requirements detailed in the GEDCOM X conceptual model
2  specifications/support/conceptual-model-conclusion.bnf
@@ -1,5 +1,6 @@
Person ::= ( identifier )* ( living | ) ( gender | ) ( name )* ( fact )* ( source )* ( note )*
Relationship ::= type person1 person2 ( fact )* ( source )* ( note )*
+Event ::= type date place ( role )* ( source )*
identifier ::= type resource
living ::= ( <true> | <false> )
gender ::= conclusion_properties type
@@ -12,6 +13,7 @@ primaryForm ::= nameForm
alternateForm ::= nameForm
nameForm ::= ( fullText | ) ( namePart )*
namePart ::= type text
+role ::= person1 type
preferred ::= ( <true> | <false> )
conclusion_properties ::= genealogical_resource_properties ( source )*
genealogical_resource_properties ::= ( id | ) attribution
BIN  specifications/support/gedcomx.zargo
Binary file not shown
90 specifications/xml-format-specification.md
@@ -668,9 +668,35 @@ formal | The formal value of the place. | gxc:formal | [`gx:FormalValue`](#forma
</...>
```
+<a id="conclusion-event-role"/>
+
+## 5.4 The "EventRole" Data Type
+
+The `gxc:EventRole` XML type is used to (de)serialize the `http://gedcomx.org/conclusion/v1/EventRole`
+data type.
+
+### properties
+
+name | description | XML property | XML type
+-----|-------------|--------------|---------
+person | Reference to the person playing the role in the event. | gxc:person | [`rdf:ResourceReference`](#resource-reference)
+role | Reference to the role. | gxc:role | [`rdf:ResourceReference`](#resource-reference)
+details | Details about the role of the person in the event. | gxc:details | xsd:string
+
+### examples
+
+```xml
+ <... rdf:ID="local_id">
+ <gxc:person rdf:resource="http://identifier/for/person/1"/>
+ <gxc:role rdf:resource="http://gedcomx.org/Witness"/>
+ <gxc:details>...</gxc:details>
+ </...>
+```
+
+
<a id="fact-conclusion"/>
-## 5.4 The "Fact" Data Type
+## 5.5 The "Fact" Data Type
The `gxc:Fact` XML type is used to (de)serialize the `http://gedcomx.org/conclusion/v1/Fact`
data type.
@@ -706,7 +732,7 @@ formal | The formal value of the fact. | gxc:formal | [`gxc:FormalValue`](#forma
<a id="gender-conclusion"/>
-## 5.5 The "Gender" Data Type
+## 5.6 The "Gender" Data Type
The `gxc:Gender` XML type is used to (de)serialize the `http://gedcomx.org/conclusion/v1/Gender`
data type.
@@ -727,7 +753,7 @@ type | URI identifying the type of the gender. | gxc:type | [`rdf:ResourceRefere
<a id="name-part"/>
-## 5.6 The "NamePart" Data Type
+## 5.7 The "NamePart" Data Type
The `gxc:NamePart` XML type is used to (de)serialize the `http://gedcomx.org/conclusion/v1/NamePart`
data type.
@@ -748,7 +774,7 @@ text | The text of the name part. | gxc:text | xsd:string
</...>
```
-## 5.7 The "NameForm" Data Type
+## 5.8 The "NameForm" Data Type
The `NameForm` XML type is used to (de)serialize the `http://gedcomx.org/conclusion/v1/NameForm`
data type.
@@ -779,7 +805,7 @@ parts | The parts of the name form. | gxc:part | [`gxc:NamePart`](#name-part)
<a id="name-conclusion"/>
-## 5.8 The "Name" Data Type
+## 5.9 The "Name" Data Type
The `gxc:Name` XML type is used to (de)serialize the `http://gedcomx.org/conclusion/v1/Name`
data type.
@@ -882,7 +908,7 @@ specified by the section titled "The Relationship" of the conceptual model speci
## 7.1 The "Relationship" Data Type
-The `Relationship` XML type is used to (de)serialize the `http://gedcomx.org/conclusion/v1/Relationship`
+The `gxc:Relationship` XML type is used to (de)serialize the `http://gedcomx.org/conclusion/v1/Relationship`
data type.
### properties
@@ -930,7 +956,57 @@ notes | Contributed notes about the relationship. | gxc:note | [`gxc:Note`](#not
</...>
```
-# 8. XML Elements
+
+<a id="event"/>
+
+# 8. The Event
+
+This section defines the `Event` XML type corresponding to the `Event` data type
+specified by the section titled "The Event" of the conceptual model specification.
+
+## 8.1 The "Event" Data Type
+
+The `gxc:Event` XML type is used to (de)serialize the `http://gedcomx.org/conclusion/v1/Event`
+data type.
+
+### properties
+
+name | description | XML property | XML type
+-----|-------------|--------------|---------
+type | URI identifying the type of the event. | gxc:type | [`rdf:ResourceReference`](#resource-reference)
+date | The date of the event. | gxc:date | [`gxc:Date`](#conclusion-date)
+place | The place of the event. | gxc:place | [`gxc:Place`](#conclusion-place)
+roles | The roles of the persons in the event. | gxc:role | [`gxc:EventRole`](#conclusion-event-role)
+sources | The list of references to the evidence of the event. | gxc:source | [`gxc:SourceReference`](#source-reference)
+
+### examples
+
+```xml
+ <... rdf:ID="local_id">
+ <gxc:type rdf:resource="http://gedcomx.org/Marriage"/>
+ <gxc:date>
+ ...
+ </gxc:date>
+ <gxc:place>
+ ...
+ </gxc:place>
+ <gxc:role>
+ ...
+ </gxc:role>
+ <gxc:role>
+ ...
+ </gxc:role>
+ <gxc:source>
+ ...
+ </gxc:source>
+ <gxc:source>
+ ...
+ </gxc:source>
+ </...>
+```
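
A consumer could read such an element roughly as follows. This is a sketch under stated assumptions, not part of the specification: the root element name `gxc:event` and the namespace URI bound to the `gxc` prefix are guesses (the example above elides the element name), while the `rdf` namespace is the standard RDF syntax namespace.

```python
import xml.etree.ElementTree as ET

GXC = "http://gedcomx.org/conclusion/v1/"  # assumed namespace URI for the gxc prefix
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical document; "gxc:event" is an assumed element name.
doc = f"""
<gxc:event xmlns:gxc="{GXC}" xmlns:rdf="{RDF}" rdf:ID="local_id">
  <gxc:type rdf:resource="http://gedcomx.org/Marriage"/>
  <gxc:role>
    <gxc:person rdf:resource="http://identifier/for/person/1"/>
    <gxc:role rdf:resource="http://gedcomx.org/Witness"/>
  </gxc:role>
</gxc:event>
""".strip()

root = ET.fromstring(doc)

# The event type is carried in the rdf:resource attribute of gxc:type.
event_type = root.find(f"{{{GXC}}}type").get(f"{{{RDF}}}resource")

# findall() matches direct children only, so the inner gxc:role element
# (the role reference inside each EventRole) is not mistaken for an EventRole.
roles = []
for role_el in root.findall(f"{{{GXC}}}role"):
    person = role_el.find(f"{{{GXC}}}person").get(f"{{{RDF}}}resource")
    role = role_el.find(f"{{{GXC}}}role").get(f"{{{RDF}}}resource")
    roles.append((person, role))

print(event_type, roles)
```

Note that `gxc:role` appears at two levels here, once as the Event property holding an `EventRole` and once as the `EventRole` property referencing the role itself, so a parser has to distinguish them by depth.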
+
+
+# 9. XML Elements
XML types are not enough to provide an XML serialization format for a data model. XML requires elements to be defined
that can be used as the "root" of an XML document. XML elements are also used to identify any extension properties that