checking syntax when using Gedcom Validator #12

Open
mariannevanharten opened this issue Jan 7, 2024 · 11 comments
Comments

@mariannevanharten

When exporting a GEDCOM file and checking the syntax using Gedcom Validator, I see several warnings that the GEDCOM is not compatible with GEDCOM 7...

@Jefferson49
Owner

Thank you for reporting the issue!

To understand the background, it would help to get some information about the related GEDCOM data and errors. Could you post some examples of the warning messages of Gedcom Validator? Could you also post some examples of GEDCOM snippets, which are related to the warnings?

If privacy allows, you could also send me your exported GEDCOM file (or the parts which cause errors) by email to webmaster(at)familienforschung-hemprich.de

@mariannevanharten
Author

I think the easiest way to check the syntax of the GEDCOM file is to install Gedcom Validator from https://chronoplexsoftware.com/gedcomvalidator/
Some error messages:
0 @M502087@ OBJE
1 FILE import/photo.emf
2 FORM EMF -> should be: 2 FORM image/emf
3 TYPE PHOTO

Also
2 FORM GIF
or
2 FORM HTM
are not valid.

All custom Gedcom tags (starting with "_") should be declared:
1 SCHMA
2 TAG _AKA https://website.com/_AKA.html

For now it's too much to give you more examples, maybe later.

Best regards,
Marianne

@Jefferson49
Owner

Regarding the image type, currently, the following types are supported:
bmp|BMP, gif|GIF, jpg|JPG, tif|TIF, pdf|PDF

Therefore, GIF should already work.

It is no problem to add further types, like EMF or HTML. The allowed media types in GEDCOM 7 can be found in the following source: https://www.iana.org/assignments/media-types/media-types.xhtml

Just post a list with the missing types to include, e.g.
EMF => image/emf
HTM|HTML => text/html

It would be great if you could also check in the IANA list linked above whether each media type is registered.
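
The mapping discussed above can be sketched as a simple lookup. This is a minimal illustration in Python, not the module's code (the module is written in PHP). The table name `FORM_TO_MEDIA_TYPE` and the function `convert_form_line` are hypothetical, and the table only covers the formats named in this thread.

```python
# Hypothetical lookup table: GEDCOM 5.5.1 FORM value -> IANA media type.
# Only the formats mentioned in this thread are listed.
FORM_TO_MEDIA_TYPE = {
    "bmp": "image/bmp",
    "gif": "image/gif",
    "jpg": "image/jpeg",
    "tif": "image/tiff",
    "pdf": "application/pdf",
    "emf": "image/emf",
    "htm": "text/html",
    "html": "text/html",
}


def convert_form_line(line: str) -> str:
    """Rewrite a '<level> FORM <value>' line to use a media type, if known."""
    parts = line.split(" ", 2)
    if len(parts) == 3 and parts[1] == "FORM":
        media_type = FORM_TO_MEDIA_TYPE.get(parts[2].strip().lower())
        if media_type is not None:
            return f"{parts[0]} FORM {media_type}"
    # Non-FORM lines and unknown values pass through unchanged.
    return line


print(convert_form_line("2 FORM EMF"))
```

The lookup is case-insensitive, so `EMF` and `emf` are treated the same. Values that are not in the table are left untouched so they can be reported and added later.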

@Jefferson49
Owner

> All custom Gedcom tags (starting with "_") should be declared:
> 1 SCHMA
> 2 TAG _AKA https://website.com/_AKA.html

This is more difficult, since I do not have a list of all custom tags and schema definitions and I do not even know if such a list exists.

What I could do is scan the whole GEDCOM text for occurrences of each custom tag, and include the schema whenever the related custom tag is found somewhere.

If you can provide me a list with custom tags and schemas, I can try to include this. I would need something like the following examples from GEDCOM-L:
'_GODP' => 'https://genealogy.net/GEDCOM/',
'_GOV' => 'https://genealogy.net/GEDCOM/',
'_GOVTYPE' => 'https://genealogy.net/GEDCOM/',
'_LOC' => 'https://genealogy.net/GEDCOM/',
'_NAME' => 'https://genealogy.net/GEDCOM/',
'_POST' => 'https://genealogy.net/GEDCOM/',
'_RUFNAME' => 'https://genealogy.net/GEDCOM/',
'_STAT' => 'https://genealogy.net/GEDCOM/',
'_UID' => 'https://genealogy.net/GEDCOM/',
'_WITN' => 'https://genealogy.net/GEDCOM/',
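
The scanning idea described above could look roughly like the following. This is a minimal sketch in Python for illustration only, not the module's PHP implementation. The names `CUSTOM_TAG_URIS`, `find_custom_tags` and `build_schma_lines` are hypothetical, and the table is just a subset of the examples listed above.

```python
import re

# Illustrative subset of a tag -> URI table, taken from the examples above.
CUSTOM_TAG_URIS = {
    "_GOV": "https://genealogy.net/GEDCOM/",
    "_LOC": "https://genealogy.net/GEDCOM/",
    "_UID": "https://genealogy.net/GEDCOM/",
}

# A GEDCOM line is: level, optional @xref@, tag, optional value.
# Only the tag position is inspected, so "_" inside a value is ignored.
LINE_PATTERN = re.compile(r"^\s*\d+\s+(?:@[^@]+@\s+)?(_[A-Za-z0-9_]+)(?:\s|$)")


def find_custom_tags(gedcom_text):
    """Collect every tag starting with '_' that occurs in the text."""
    tags = set()
    for line in gedcom_text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            tags.add(match.group(1))
    return tags


def build_schma_lines(tags, tag_uris):
    """Build SCHMA lines for the found tags that have a known URI."""
    known = sorted(tag for tag in tags if tag in tag_uris)
    if not known:
        return []
    return ["1 SCHMA"] + [f"2 TAG {tag} {tag_uris[tag]}" for tag in known]
```

Tags that occur in the file but are missing from the table are simply skipped by `build_schma_lines`, which matches the approach of only declaring tags for which a URI is known.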

@Jefferson49
Owner

Jefferson49 commented Jan 8, 2024

During testing, I identified that GIF and JPG are not always exported correctly. I fixed this issue and also added an export for emf, htm, html.

In the attachment, I added an updated file for GedcomSevenExportService.php. You can unzip it, replace this file in your installation, and check whether the export now works for emf, gif etc.

GedcomSevenExportService.zip

@Jefferson49
Owner

> All custom Gedcom tags (starting with "_") should be declared:

At https://wiki.genealogy.net/GEDCOM/_Nutzerdef-Tag#Tabelle_1, I found a list of GEDCOM custom tags, which seems to cover a lot of the known custom tags.

In the latest code of the module, I generate a SCHMA structure based on this list of custom tags. If a custom tag from this list is detected during download, a SCHMA entry is included in the export:

Example:

1 SCHMA
2 TAG _NOTH https://wiki.genealogy.net/GEDCOM/_Nutzerdef-Tag#Tabelle_1

In the attachment, you can find a module version which includes this functionality.

Can you test whether it works for your purposes?

download_gedcom_with_url_v3.2.3_238f1278.zip

@mariannevanharten
Author

mariannevanharten commented Feb 23, 2024 via email

@Jefferson49
Owner

Jefferson49 commented Feb 24, 2024

> In my tree are several custom tags. A lot of these tags were not found by your module. Do you want to receive a list of these tags?

Since the simple approach with the custom tag list from https://wiki.genealogy.net/GEDCOM/_Nutzerdef-Tag#Tabelle_1 did not cover all of your custom tags, I started to rethink this issue.

I read the GEDCOM 7 specification for extensions and found the following:

  • "Each extTag is either a documented extension tag or an undocumented extension tag"
  • "An extension tag that is not given a URI in the schema structure is called an undocumented extension tag. The meaning of an undocumented extension tag is identified by its superstructure type and its tag."

In my opinion, the specification text implies that undocumented extension tags are also part of the standard and may be included in a GEDCOM file.

Therefore, GEDCOM Validator is too strict about the SCHMA structure and the error messages should be warnings or information.

I created an issue at GEDCOM Validator to change the error. Hopefully, this will be changed.

Regarding the DownloadGedcomWithURL module, I will wait to see what happens with the issue at GEDCOM Validator. My summary for the moment is that I will only create SCHMA structures for custom tags for which a dedicated URI with a specific description of the custom tag is available. This seems to be in line with the intention of the GEDCOM 7 specification.

@Jefferson49
Owner

> Also I duplicated some tags by using both the regular tag and Gedcom 7 tag, e.g. RELA and ROLE.
>
> Your module changed the tag RELA to ROLE, so when using your module and check the syntax of the Gedcom with GedcomValidator, it reports an error of duplicate ROLE.

Well, GEDCOM 7 eliminated the RELA tags and all related structures need to be converted to ROLE.

Can you provide me with a GEDCOM snippet (from webtrees, i.e. GEDCOM 5.5.1) showing your usage of RELA/ROLE that creates an error after conversion to GEDCOM 7? I will check if I can change the code to support a conversion.
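
To find candidate snippets, the structures that would end up with a duplicate ROLE can be located before conversion. The following is a minimal Python sketch for illustration, not part of the module. The function name `find_rela_role_conflicts` is hypothetical, and the sketch assumes well-formed input in which levels never skip (a level 2 line always follows a level 1 line).

```python
def find_rela_role_conflicts(gedcom_text):
    """Return 1-based line numbers of RELA lines that have a sibling ROLE.

    Renaming such a RELA to ROLE would leave two ROLE lines under the
    same parent structure.
    """
    parents = {}   # level -> line number of the most recent line at that level
    children = {}  # parent line number -> {tag: [line numbers]}
    for number, line in enumerate(gedcom_text.splitlines(), start=1):
        parts = line.split()
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        level = int(parts[0])
        # Skip an optional cross-reference identifier such as @I1@.
        if parts[1].startswith("@") and len(parts) > 2:
            tag = parts[2]
        else:
            tag = parts[1]
        parents[level] = number
        parent = parents.get(level - 1)
        if parent is not None:
            children.setdefault(parent, {}).setdefault(tag, []).append(number)

    conflicts = []
    for tags in children.values():
        if "RELA" in tags and "ROLE" in tags:
            conflicts.extend(tags["RELA"])
    return sorted(conflicts)
```

The returned line numbers point at the RELA lines in the 5.5.1 export, which makes it easier to cut out a small snippet for a bug report.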

@Jefferson49
Owner

> I created an issue at GEDCOM Validator to change the error. Hopefully, this will be changed.

GEDCOM Validator declined to change the validation.

> At https://wiki.genealogy.net/GEDCOM/_Nutzerdef-Tag#Tabelle_1, I found a list of GEDCOM custom tags, which seems to cover a lot of the known custom tags.

The latest release generates all the schemas for custom tags from the list above. As described above, it is not an error to have further custom tags without a schema. If you want to add a schema for those tags, you might want to add the following GEDCOM lines for each of your custom tags without a schema. The idea is to refer to the GEDCOM 7 specification if no other URL is available that describes the custom tag.

1 SCHMA
2 TAG _TAG https://gedcom.io/specifications/FamilySearchGEDCOMv7.pdf
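
If there are many such tags, the lines above could be added by a small script rather than by hand. The following Python sketch is only an illustration of that idea and not part of the module. The function name `add_fallback_schma` is hypothetical, and the sketch assumes that the HEAD record does not already contain a SCHMA structure.

```python
FALLBACK_URI = "https://gedcom.io/specifications/FamilySearchGEDCOMv7.pdf"


def add_fallback_schma(lines, undeclared_tags, fallback_uri=FALLBACK_URI):
    """Return a copy of lines with a SCHMA block appended to the HEAD record.

    Assumption: HEAD has no SCHMA yet; merging into an existing one is
    not handled here.
    """
    if not undeclared_tags:
        return list(lines)
    schma = ["1 SCHMA"] + [
        f"2 TAG {tag} {fallback_uri}" for tag in sorted(undeclared_tags)
    ]
    result = []
    inserted = False
    in_head = False
    for line in lines:
        is_record_start = line.startswith("0 ")
        # The first level-0 line after HEAD marks the end of the header.
        if in_head and is_record_start and not inserted:
            result.extend(schma)
            inserted = True
        if is_record_start:
            in_head = line.strip() == "0 HEAD"
        result.append(line)
    # Handle a file that ends while still inside HEAD.
    if in_head and not inserted:
        result.extend(schma)
    return result
```

The block is placed at the end of the header, just before the first record that follows HEAD. If the file has no HEAD record at all, the lines are returned unchanged.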

@Jefferson49
Owner

Release 3.2.4 addresses most of the issues reported above.
