Tree metadata tool and related changes #1403

Open: cdhorn wants to merge 3 commits into master


cdhorn
Contributor

@cdhorn cdhorn commented Oct 26, 2022

This PR implements the following changes:

  • Add a Tree object to encapsulate information about the tree and save it in the database similar to the current Researcher entity. Properties of a Tree are uuid, name, description, copyright, license, and contributors. The tree uuid represents the dataset that comprises the logical tree.
  • Maintain a change timestamp for changes to the Tree object as a whole.
  • Provide a Tree Editor Tool for managing the tree related data.
  • Generate a unique uuid to represent the current backing/persistent store for the tree data.
  • Maintain a last transaction timestamp for the last database transaction.
  • Maintain a change timestamp for changes to the Researcher object as a whole.
  • Add support for associating the Researcher object with a Person in the tree.
  • Update the Owner Editor to enable adding/removing the Researcher <> Person association.
  • Update the Gramps Xml export header to include the new information being captured. A sample is as follows:
  <header>
    <created date="2022-10-25" version="5.2.0"/>
    <storage uuid="b2be8956-4c3c-4ac5-8f01-692034026c2d" change="1666738821"/>
    <tree uuid="c7db6c8a-a300-4218-93d6-039b71c9decd" change="1666736955">
      <name>Tree Testing</name>
      <copyright>Copyright 2022 Christopher Horn</copyright>
      <license>None</license>
      <description>A Gramps example tree used for testing the tree metadata editor and other changes.</description>
      <contributors>None</contributors>
    </tree>
    <researcher change="1666739578" handle="_f28adbe86b15cd123e7bd84e043">
      <resname>Alex Roitman</resname>
    </researcher>
    <mediapath>{GRAMPS_RESOURCES}/doc/gramps/example/gramps</mediapath>
  </header>
  • Add support for including a user export note or comment in the Gramps Xml header. Note that the exporter GUI has not yet been updated to let the user enter one.
  • Update the Gramps Xml import to import some of the new information when it is an import for an empty tree.
  • Update the Gedcom export to use the Tree copyright information if it is available.
  • Add get_addon_metadata and set_addon_metadata methods so add-ons can choose to save state in the database. This state is not currently backed up or exported, and values must be of type str.
  • Other additional methods include get_database_uuid, get_last_transaction_time, get_tree, set_tree, set_tree_change, set_researcher_change, get_researcher_handle, get_researcher_person, set_researcher_handle, and get_tree_metadata
  • Update the DTD and RNG files and bump the Gramps Xml version to 1.7.2
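
For anyone who wants to see what a consumer of the new header might look like, here is a minimal sketch that reads the sample above with Python's standard library. It is only an illustration: `read_header` is a hypothetical helper, not part of this PR, and the fragment is parsed without the XML namespace that a full Gramps XML export would carry.

```python
import xml.etree.ElementTree as ET

# Header fragment copied from the sample above, trimmed to the elements read below.
HEADER = """
<header>
  <created date="2022-10-25" version="5.2.0"/>
  <storage uuid="b2be8956-4c3c-4ac5-8f01-692034026c2d" change="1666738821"/>
  <tree uuid="c7db6c8a-a300-4218-93d6-039b71c9decd" change="1666736955">
    <name>Tree Testing</name>
    <copyright>Copyright 2022 Christopher Horn</copyright>
    <license>None</license>
    <contributors>None</contributors>
  </tree>
  <researcher change="1666739578" handle="_f28adbe86b15cd123e7bd84e043">
    <resname>Alex Roitman</resname>
  </researcher>
</header>
"""


def read_header(xml_text):
    """Hypothetical helper: flatten tree, storage and researcher metadata into a dict."""
    header = ET.fromstring(xml_text.strip())
    tree = header.find("tree")
    storage = header.find("storage")
    researcher = header.find("researcher")
    return {
        "tree_uuid": tree.get("uuid"),
        "tree_change": int(tree.get("change")),
        "tree_name": tree.findtext("name"),
        "copyright": tree.findtext("copyright"),
        "storage_uuid": storage.get("uuid"),
        "storage_change": int(storage.get("change")),
        "researcher_handle": researcher.get("handle"),
        "researcher_change": int(researcher.get("change")),
    }


info = read_header(HEADER)
print(info["tree_name"], info["tree_uuid"])
```

The tree uuid and the storage uuid are read separately because, as described above, the first identifies the logical dataset and the second the current backing store.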

@cdhorn cdhorn marked this pull request as draft October 26, 2022 03:56
@cdhorn
Contributor Author

cdhorn commented Oct 26, 2022

Need to look closer at the failing tests yet I guess.

@DavidMStraub
Member

This looks great. Is anything missing apart from fixing the tests?

@emyoulation
Contributor

emyoulation commented Feb 7, 2023

Is the 'owner' similar to the 'Researcher' ? Where they can be associated with a Person in the Tree?

Are the Address and Internet Gramps XML data structures in the Tree the only way to store Tree contact data: phone, postal address, email, website?

I suppose that Users could make such info accessible at the metadata level as a user entered comment. And it would be less likely to be accidentally included. (Although the "probably alive" export filters are likely to invalidate the Researcher handle more often than not.)

@cdhorn
Contributor Author

cdhorn commented Feb 7, 2023

Is anything missing apart from fixing the tests?

I don't think so, but being the one who wrote it, I'm not sure whether I overlooked something.

I got stuck trying to track down the issue with the tests, set it aside, and of course got sidetracked and have not revisited it.

Is the 'owner' similar to the 'Researcher' ? Where they can be associated with a Person in the Tree?

Yes, you have Researcher in preferences and you also have a Researcher object stored in the database in the Researcher entity. The one in the database you can edit with the Database Owner Editor under Tools -> Family Tree Processing and the editor lets you sync with the one in the preferences. The one in the database is the one included in XML exports. So I added something so you can associate the database owner/researcher stored in the database with a Person in the tree as it seemed logical to do so.

@emyoulation
Contributor

So I added something so you can associate the database owner/researcher stored in the database with a Person in the tree as it seemed logical to do so.

I agree completely that associating is logical.

But my question is if there is a SEPARATE association for owner.

Rather than OnePlace Studies, I have about four different OneSource studies. These revolve around creating Trees to transport each fact presented within a single Reference Book. (An OCR'd PDF of the Reference Book is attached as a Media Object.)

The researcher for the 1924 booklet (which incidentally roped me into this obsession) died in 1948. He is in the Tree and is indisputably the Researcher. His research only comes as close to me as a paternal great-grandmother.

If I distribute the Tree, I would be the owner. Yet my existence as a 'Person' connected to that Tree is only through a pair of Associations (to my great-grandmother, and to the 6th great-grandfather who is the progenitor in the research), plus a Note detailing my direct descent from the progenitor.

We don't want to taint the Tree built from his original research with other sources. (Yes, there are significant errors.)

And I'm talking with a 6th cousin (who built a website around the book) who is connected to his closest ancestors in that book with 2 Associations and a Note. He is considering distributing that Tree via the website. So happily, HE might become stuck as the owner soon.

The problem is that discovering either of the 2 possible owners via browsing the Tree is virtually impossible ... unless there is a separate GrampsID/handle for Owner from that of the Researcher.

@cdhorn
Contributor Author

cdhorn commented Feb 8, 2023

But my question is if there is a SEPARATE association for owner.

Oh I am sorry I did not understand. No there is not.

I think ideally you would eliminate the concept of a singleton researcher. I'd just model it as a list of contacts and the tree owners and researchers/collaborators are just different types of contacts. I use plural because with Gramps.js everyone should probably be thinking multi-user in anything done going forward.
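
To make the idea concrete, a rough sketch of what a list-of-contacts model could look like follows. None of this exists in the PR: `ContactRole`, `Contact`, `TreeContacts`, and the handle strings are hypothetical names invented purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ContactRole(Enum):
    """Hypothetical contact types; the real set would need discussion."""
    OWNER = "owner"
    RESEARCHER = "researcher"
    COLLABORATOR = "collaborator"


@dataclass
class Contact:
    name: str
    role: ContactRole
    # Optional handle of a Person in the tree; None if the contact is not in the tree.
    person_handle: Optional[str] = None


@dataclass
class TreeContacts:
    contacts: List[Contact] = field(default_factory=list)

    def with_role(self, role: ContactRole) -> List[Contact]:
        """Return every contact holding the given role."""
        return [c for c in self.contacts if c.role is role]


# Placeholder data: one researcher and two owners, each linked to a Person.
tree_contacts = TreeContacts([
    Contact("Original researcher", ContactRole.RESEARCHER, "_hypothetical_handle_1"),
    Contact("First distributor", ContactRole.OWNER, "_hypothetical_handle_2"),
    Contact("Second distributor", ContactRole.OWNER, "_hypothetical_handle_3"),
])
owners = tree_contacts.with_role(ContactRole.OWNER)
print([c.name for c in owners])
```

With a model like this, the owner and the researcher each get their own Person association, and more than one of either is possible.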

@cdhorn cdhorn force-pushed the tree-data branch 2 times, most recently from f43451c to a960caf Compare March 9, 2023 01:46
@cdhorn cdhorn marked this pull request as ready for review March 9, 2023 01:55
@cdhorn
Contributor Author

cdhorn commented Mar 12, 2023

@DavidMStraub Nick asked that I pull out the db and tree uuids so just confirming you were not looking to use them before I refactor this.

@DavidMStraub
Member

One question, I noticed the tree_change (which I'm particularly looking forward to as it will make caching for Web API much easier) is set as int(time.time()). Is there any reason to drop sub-second precision? In a multi-user/server context, there might be multiple transactions per second. Actually, even on a Desktop system, there might be cases of an add-on or batch script running through a loop and processing a transaction in every iteration. So I think it would be much safer to store the timestamp float directly without the int().
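
A small sketch of the concern, using fixed timestamps rather than real `time.time()` calls so the result is reproducible. The `is_stale` check is a hypothetical stand-in for whatever comparison a cache would actually make.

```python
# Five simulated transactions that all fall inside the same second.
base = 1690640156.0
float_stamps = [base + 0.1 * i for i in range(5)]

# What would be stored if each value were wrapped in int().
int_stamps = [int(stamp) for stamp in float_stamps]


def is_stale(cached_stamp, current_stamp):
    """Hypothetical cache check: stale if the data changed after it was cached."""
    return current_stamp > cached_stamp


distinct_floats = len(set(float_stamps))
distinct_ints = len(set(int_stamps))
print(distinct_floats, distinct_ints)

# A cache filled after the first transaction, checked after the second one.
print(is_stale(float_stamps[0], float_stamps[1]))  # change is detected
print(is_stale(int_stamps[0], int_stamps[1]))      # change is missed
```

With the truncated values, all five transactions share one timestamp, so a cache keyed on it cannot tell them apart.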

@cdhorn
Contributor Author

cdhorn commented Apr 2, 2023

@Nick-Hall @DavidMStraub this has been refactored now. I pulled out the uuids as well as some other things to try to make it a little more focused. Please let me know if you see any other changes you think are needed.

@Nick-Hall
Member

@cdhorn Thank you. I'll look at this soon. I've just arrived back from travelling where I had limited internet access.

@DavidMStraub
Member

Concerning my previous comment, do I understand correctly that it's not actually tree_change that is relevant but db_last_transaction and this is stored as float?

@cdhorn
Contributor Author

cdhorn commented Apr 3, 2023

Concerning my previous comment, do I understand correctly that it's not actually tree_change that is relevant but db_last_transaction and this is stored as float?

Correct.

The tree_change tracks when the Tree object with the description, copyright, etc. was last changed. I also added owner_change for the Researcher object. While they're not formal table objects, it seems desirable to know when they were last modified.

@emyoulation
Contributor

The OS may be important for the Media path and archive restoration.

Media restores fail terminally when destination paths are not compatible. (I've seen overwrite permission problems when the destination was the same machine too. And the Tree data was not imported either in that case.) Will originating OS be important for resolving (or crosswalking path structures for) such problems?

@Nick-Hall Nick-Hall added the string Requires string changes label Jul 14, 2023
@emyoulation
Contributor

To make the header dates and media paths be more parsable by humans, should the date format CCYY-MM-DD and OS be explicit?

@cdhorn
Contributor Author

cdhorn commented Jul 29, 2023

To make the header dates and media paths be more parsable by humans, should the date format CCYY-MM-DD and OS be explicit?

Unsure I follow this question. The media path attribute is untouched by this. The created date is already in that format.

No information about the OS the export was generated on or database engine it was generated from is captured, though that could be added to the <provenance> section.

Currently this adds the following, as an example, to the <header> in addition to change timestamp for the <researcher> object:

    <provenance>
      <database-id>5d73241c</database-id>
      <last-transaction-timestamp>1690640156.5227873</last-transaction-timestamp>
      <export-note>An optional export note</export-note>
    </provenance>
    <tree change="1690638919">
      <name>Family Tree 2</name>
      <copyright>2001-2006 Donald N. Allingham, 2007-2023 Gramps Project</copyright>
      <license>Creative Commons Attribution-ShareAlike 2.5</license>
      <description>Example Gramps database</description>
      <contributors>Gramps Developers</contributors>
    </tree>

The change timestamp on the <tree> object is not the change time stamp for the database. That is in the <provenance> section at the moment as you can see above.
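
To illustrate the distinction, here is a minimal sketch that reads both values from the sample above. The two fragments are wrapped in a `<header>` root so they parse as one document, and no XML namespace is used, which a full export would have.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Sample from above, wrapped in <header> and trimmed to the fields read below.
HEADER = """<header>
  <provenance>
    <database-id>5d73241c</database-id>
    <last-transaction-timestamp>1690640156.5227873</last-transaction-timestamp>
    <export-note>An optional export note</export-note>
  </provenance>
  <tree change="1690638919">
    <name>Family Tree 2</name>
  </tree>
</header>"""

header = ET.fromstring(HEADER)

# Database-level change time: a float, so sub-second precision is kept.
last_transaction = float(header.findtext("provenance/last-transaction-timestamp"))

# Change time of the Tree metadata object only: whole seconds.
tree_change = int(header.find("tree").get("change"))

database_id = header.findtext("provenance/database-id")

# Seconds between the last metadata edit and the last database transaction.
gap = last_transaction - tree_change

tree_changed_at = datetime.fromtimestamp(tree_change, tz=timezone.utc)
print(database_id, tree_changed_at.isoformat(), round(gap, 3))
```

Anything that needs to know whether the data changed should compare `last_transaction`, not `tree_change`.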

Support for the <export-note> was added, but I did not update the exporter dialog to expose it; that would be a separate item at some point.

I just realized I need to also include an update to example.gramps with the above as well now.

I also just noticed that the example.gramps under data/tests directory differs slightly from the one under the example/gramps directory so they seem to have diverged. I am unsure if that was deliberate.

@Nick-Hall when you have a few cycles, could you look this over again and let me know of any further changes you would like to see here?

@emyoulation
Contributor

I looked at the sample header posted. Notice it had an August 8th date. In that case, whether the format is CCYY-MM-DD or CCYY-DD-MM is irrelevant, since the month and day numbers were the same. But since this is the first occurrence of DATE in the file, it would boost confidence to have the DATE format explicit.

The media path had an environment variable in curly brackets and used forward-slash folder level delimiters. This reminded me that identifying the OS may be helpful if paths have to be converted to a different OS.

I also noted that the version was 5.2. Is this XML header element supposed to ID the version of Gramps or the version of the XML schema?

@cdhorn
Contributor Author

cdhorn commented Jul 29, 2023

I looked at the sample header posted. Notice it had an August 8th date.

2017-08-08 is the date in the current example.gramps, not the one in the example I posted when I opened the PR, which was 2022-10-25. The date format is always explicit.

Path conversion is already handled in the existing code, and OS information is not needed in the XML for that.

The created element identifies the version of Gramps; the XML schema version is in the document type declaration itself.
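
As a sketch of where each version lives, the snippet below pulls the schema version from the DOCTYPE and the Gramps version and date from the created element. The document is a made-up minimal example: it assumes the DOCTYPE and namespace follow the same pattern as existing Gramps XML exports, with 1.7.2 substituted as the version this PR bumps to. `schema_version` and `created_info` are hypothetical helpers.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date

# Assumed layout, modelled on existing Gramps XML exports; not taken from this PR.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE database PUBLIC "-//Gramps//DTD Gramps XML 1.7.2//EN"
"http://gramps-project.org/xml/1.7.2/grampsxml.dtd">
<database xmlns="http://gramps-project.org/xml/1.7.2/">
  <header>
    <created date="2022-10-25" version="5.2.0"/>
  </header>
</database>"""


def schema_version(xml_text):
    """Return the XML schema version named in the DOCTYPE, or None if absent."""
    match = re.search(r'DTD Gramps XML ([0-9.]+)//EN', xml_text)
    return match.group(1) if match else None


def created_info(xml_text):
    """Return (creation date, Gramps version) from the created element."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    # Reuse whatever namespace the root element declares, if any.
    ns = root.tag.split("}")[0] + "}" if root.tag.startswith("{") else ""
    created = root.find(f"{ns}header/{ns}created")
    # The date is ISO 8601 (CCYY-MM-DD), so fromisoformat reads it unambiguously.
    return date.fromisoformat(created.get("date")), created.get("version")


xml_schema = schema_version(SAMPLE)
created_on, gramps_version = created_info(SAMPLE)
print(xml_schema, created_on.isoformat(), gramps_version)
```

Parsing the date with `date.fromisoformat` also shows the month and day are not interchangeable: 2022-10-25 could not be read as day 10 of month 25.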

@cdhorn
Contributor Author

cdhorn commented Jul 31, 2023

Rebased.

@Nick-Hall Nick-Hall removed the string Requires string changes label Aug 1, 2023
4 participants