Additional version/date attributes for gaiji description #2132
Comments
From: #1805 (comment) |
Makes perfect sense to me that multiple version numbers would need to be recorded. But will it ever be necessary to specify anything other than
If not, I am wondering (aloud) if just defining The original comment on this ticket is kind enough to show an example of a Also wondering what elements need att.datable and extended version capability. List of candidates: from
|
@sydb Thank you, this is a point related to what I wrote above that "Whether the version-based and date-based attributes can coexist is subject to discussion." Actually, in reference to what we might need in our project's environment, adding to However, I can easily come up with a case that a change in mapping involves transitioning between multiple versioning schemes. In the following scenario,
Here, all PUA, single code point, and IVS mapping of the glyph are equally a Unicode representation, so only one of them should be valid at a certain moment. Versioning schemes of the legacy set, Unicode (core spec), and IVD are all independent of each other, which means if you try to delimit them by "versions", the start and end attributes are described in terms of different frameworks. This could be very complicated compared to logging the changes by date. (Alternatively, you can update the version number of the legacy set whenever the Unicode mapping has changed, but do you have to fork it to keep up with Unicode?) I assume that there are possible use cases that
This seems a great idea for the version range notation, with a little drawback I think that you will not be able to mark the
What I have in my mind is the analogy to
the
I didn't think thoroughly beyond elements I showed in the original post, but on second thought, we might need (extended) versioning for |
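The constraint described above, that only one Unicode representation of a glyph should be valid at any given moment, is easy to check once the changes are logged by date. The following is a minimal sketch, not part of any proposal in this thread: the function name, labels, and dates are all hypothetical.

```python
from datetime import date


def find_overlaps(ranges):
    """Report adjacent date ranges that overlap.

    `ranges` is a list of (label, start, end) tuples; end=None means
    "still valid" (open-ended). Both end points are treated as inclusive.
    Ranges are sorted by start date, so any overlap in the input shows up
    as a clash between two neighbours.
    """
    ordered = sorted(ranges, key=lambda r: r[1])
    clashes = []
    for (a, _a_start, a_end), (b, b_start, _b_end) in zip(ordered, ordered[1:]):
        if a_end is None or b_start <= a_end:
            clashes.append((a, b))
    return clashes


# Hypothetical history: PUA, then a single code point, then an IVS.
history = [
    ("PUA", date(2012, 1, 1), date(2018, 3, 31)),
    ("single code point", date(2018, 4, 1), date(2019, 10, 15)),
    ("IVS", date(2019, 10, 16), None),
]
print(find_overlaps(history))
```

Because all three mappings share one timeline, the check does not need to know anything about the legacy set, Unicode, or IVD version numbers.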
Just jotting down some quick notes. To keep in line with version ranges in other standards I would propose to use When I wrote the updates, my assumption was that all references to unihan or Unicode properties would be tied to a single unambiguous version. (Usually the latest at the time of publishing an edition). Automatic validation would assist and alert users to changes if they occur. This could be made more explicit by defining a single |
To move the discussion forward, below is my tentative spec design based on the comments so far:
PS: For anyone who might be confused by the semantics between changes in |
I wonder if there is any further discussion ongoing. |
No, @747, I have to admit at least I have not thought about or discussed this ticket since last summer; so thanks for the ping! As for versioning attributes you proposed on 05 Jul 21, two thoughts jump to mind.
|
@sydb Thank you for your advice!
It is a very good point that involves the semantics of As I re-read the guidelines, I was actually able to find description on
Yes, it will be very welcome and efficient if possible. |
@747 — Council discussed this today, and we are wondering if the following would address your needs?
Thus your original example would look something like
<char xml:id="myChar">
<localProp name="Name" value="A LOCAL GAIJI" />
<unihanProp name="kIRG_USource" value="U-012345" version="1X.0 1Y.0" />
<unihanProp name="kIRG_SSource" value="S-567890" version=">=1Z.0" />
<mapping type="internal">0xABCD</mapping>
<mapping type="PUA" from="2012-01-01" to="2018-03-31">U+FXXX</mapping>
<mapping type="standard" from="2018-04-01" to="2019-10-15">U+YYYYY</mapping>
<mapping type="standard" from="2019-10-16">U+YYYYY U+E0100</mapping>
</char>
This has the slight disadvantages that a) you would not get the lovely drop-down list of Unicode version numbers that you do now, you would have to type it by hand (gasp!), and b) if you wanted to express a character property that was drawn from both Unicode and some other standard you would have to use two separate elements. Note[1] This might be done by creating a new teidata.semanticVersion datatype which would adopt the syntax, but not all of the semantics, of the semantic versioning system. (See #1993 and associated.) Thus values like (maybe) |
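To illustrate how a processor might consume the example above, here is a minimal Python sketch. It rests on assumptions not settled in this thread: a missing `from` or `to` is read as an open-ended range with inclusive end points, and `version` is read as either a whitespace-separated list of versions or a single `>=V` lower bound. The function names are hypothetical, and the numeric version strings in the usage lines are made up, because the `1X.0`-style placeholders in the example cannot be compared.

```python
import xml.etree.ElementTree as ET
from datetime import date

# The <mapping> elements from the example, placeholders kept as written.
SAMPLE = """<char xml:id="myChar">
  <mapping type="internal">0xABCD</mapping>
  <mapping type="PUA" from="2012-01-01" to="2018-03-31">U+FXXX</mapping>
  <mapping type="standard" from="2018-04-01" to="2019-10-15">U+YYYYY</mapping>
  <mapping type="standard" from="2019-10-16">U+YYYYY U+E0100</mapping>
</char>"""

CHAR = ET.fromstring(SAMPLE)


def mappings_on(char, day):
    """Return the text of every <mapping> whose from/to range contains `day`.

    Assumption: an absent @from or @to leaves that side of the range open.
    """
    hits = []
    for m in char.iter("mapping"):
        start, end = m.get("from"), m.get("to")
        if start is not None and day < date.fromisoformat(start):
            continue
        if end is not None and day > date.fromisoformat(end):
            continue
        hits.append(m.text)
    return hits


def parse_version(text):
    """Turn a dotted numeric version such as '15.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def version_applies(spec, version):
    """Assumed reading of @version: '>=V', or a space-separated list."""
    target = parse_version(version)
    spec = spec.strip()
    if spec.startswith(">="):
        return target >= parse_version(spec[2:])
    return any(parse_version(v) == target for v in spec.split())


print(mappings_on(CHAR, date(2015, 6, 1)))
print(version_applies(">=15.0", "15.1"))
```

The undated internal mapping is returned for every date, which matches the idea that only the Unicode-side mappings change over time.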
Thank you for your update and continued support. The described spec seems enough to cover our use cases. Other than |
Note: related to #1993 |
Yeah, I meant year, that’s it. Seriously, @747, the hold-up here is that Council cannot make up its mind about version numbering in general (e.g., #1993), which sort of makes progress on the version number part of this hard. So once I realized an entire year has gone by, I did the other two bullet points, but not the 1st one (the version number stuff). The results are available (only in English) on my basement server; see the Guidelines and the schemas there. Council will be meeting again in Paderborn in a few weeks, and I am pretty sure version numbering will be on the agenda. |
@sydb Hi, yes I understand that overriding |
Sorry, @747, turns out I did not make it to Paderborn. |
Hi @sydb thank you for your response. Do you mean |
Oh, yes! That was a slip of the brain. |
Noting that part of this ticket is addressed in #2511 for the Guidelines 4.8.0, but the rest (re |
Due to the gradual and time-consuming procedure of Han character standardization into Unicode, an unencoded Han gaiji will likely have multiple identities and undergo property changes during and after the standardization process. To maximize the stability of a text body that uses the gaiji module, the character/glyph description needs the capacity to record its update history for traceable collation.
Thus we suggest:
Extended versioning: the existing @version is limited to the Unicode Standard version and insufficient to support modifiable properties. We will need attributes to delimit the start and end points in Unicode version (perhaps as @verFrom and @verTo). Versioning systems outside Unicode should better be supported as well for those regional or specialized character sets that are (still) widely used.
Datable elements: to further support non-version-based change items, we should allow att.datable attributes to subelements of <char>/<glyph>. Whether the version-based and date-based attributes can coexist is subject to discussion.
An example is like (<mapping>s contain numeral notations instead of real code points for visibility):
A hypothetical sample for the local version attribute:
It could be also dated with start and end e.g. using @localVerFrom and @localVerTo.
By-Question: How do we handle IVD (UTS #37) versions and properties, which are not linked to the Unicode proper's in any ways?