Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PubID creation for Stages and Amendments #102

Closed
9 tasks done
ronaldtse opened this issue Sep 16, 2022 · 30 comments
Closed
9 tasks done

PubID creation for Stages and Amendments #102

ronaldtse opened this issue Sep 16, 2022 · 30 comments
Assignees
Labels
bug Something isn't working

Comments

@ronaldtse
Copy link
Contributor

ronaldtse commented Sep 16, 2022

These use cases have to work. All of them don't work -- I am merely typing out the expected output.

Case 1

# Generates Amd pubid despite not providing Amendment Class as input
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, amendments: [{number: 1, stage: "NP"}]).to_s
=> "ISO 17301-1:2016/NP Amd 1(en)"

# Generates Amd pubid with year despite not providing Amendment Class as input
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, amendments: [{number: 1, stage: "NP", year: 2017}]).to_s
=> "ISO 17301-1:2016/NP Amd 1:2017(en)"

Case 2

# Generates "DAmd"
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, amendments: [{number: 1, stage: "DIS"}]).to_s
=> "ISO 17301-1:2016/DAmd 1(en)"

Case 3

# Amd at IS generates pubid
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, amendments: [{number: 1, stage: "IS"}]).to_s
=> "ISO 17301-1:2016/Amd 1(en)"

# Amd at IS generates URN
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, amendments: [{number: 1, stage: "IS"}]).urn
=> "urn:iso:std:iso:17301:-1:stage-60.00:amd:1:en"

# Amd with IS but no year
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, edition: 1, amendments: [{number: 1, stage: "IS"}]).urn
=> "urn:iso:std:iso:17301:-1:ed-1:amd:1:en"

# Amd with IS and year
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, edition: 1, amendments: [{number: 1, stage: "IS", year: 2017}]).urn
=> "urn:iso:std:iso:17301:-1:ed-1:amd:1:en"

# Amd with IS, year and edition
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, edition: 1, amendments: [{number: 1, stage: "IS", year: 2017, edition: 1}]).urn
=> "urn:iso:std:iso:17301:-1:ed-1:amd:1:en"

# Amd without edition, generates pubid
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", year: 2016, amendments: [{number: 1, year: 2017}]).to_s
=> "ISO 17301-1:2016/Amd 1:2017"

# Amd with edition, generates pubid
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, edition: 1, amendments: [{number: 1, year: 2017}]).urn
=> "ISO 17301-1:2016/Amd 1:2017"

# URN succeeds when Amd is created with edition of base document
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, edition: 1, amendments: [{number: 1}]).urn
=> "urn:iso:std:iso:17301:-1:ed-1:amd:1:en"

# URN fails when Amd is created without edition of base document
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, amendments: [{number: 1}]).urn
=> raise Error because base document must have edition

Case 4

> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, amendments: [{number: 1, stage: "IS", iteration: 2}])
=> raise Error because IS stage does not support iterations.

Case 5

# Generates DTS
> Pubid::Iso::Identifier.new(number: 17301, part: "1-1", publisher: "ISO", language: "en", year: 2016, type: "TS", stage: "DIS", copublisher: ["IEC","IEEE"]).to_s
=> "ISO/IEC/IEEE DTS 17301-1-1:2016(en)"

Case 6

# Generates FDTS
> Pubid::Iso::Identifier.new(number: 17301, part: "1-1", publisher: "ISO", language: "en", year: 2016, type: "TS", stage: "FDIS", copublisher: ["IEC","IEEE"]).to_s
=> "ISO/IEC/IEEE FDTS 17301-1-1:2016(en)"

Task list

@ronaldtse ronaldtse added the bug Something isn't working label Sep 16, 2022
@ronaldtse
Copy link
Contributor Author

Please add these cases to the RSpec tests.

@ronaldtse ronaldtse mentioned this issue Sep 16, 2022
9 tasks
@mico
Copy link
Contributor

mico commented Sep 19, 2022

# URN fails when Amd is created without edition of base document
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, amendments: [{number: 1}]).urn
=> raise Error because base document must have edition

"year" is an edition here, it will generate "ISO 17301-1:2016" as a base document id.

@ronaldtse
Copy link
Contributor Author

@mico what I mean is, without an edition, there cannot be an amendment URN. When there is no edition number, it is possible to create a PubID in string. However, RFC 5141 URNs for amendments require existence of an edition number. Therefore an error should be raised.

@mico
Copy link
Contributor

mico commented Sep 19, 2022

# Generates "DAmd"
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, amendments: [{number: 1, stage: "DIS"}]).to_s
=> "ISO 17301-1:2016/DAmd 1(en)"

Already have test for that https://github.com/metanorma/pubid-iso/blob/main/spec/pubid_iso/identifier_spec.rb#L868

@mico
Copy link
Contributor

mico commented Sep 19, 2022

@ronaldtse We have PubID identifiers with edition included in PubID, examples:

  • ISO/IEC 17025:2005/Cor.1:2006 ED1(fr)
  • ISO/IEC 30142 ED1
  • ISO 22610:2006 Ed
  • ISO 11553-1 Ed.2

Now we also include edition to generated PubID when we have one of these identifiers as input.
Should we keep this behaviour?
If yes, it's conflicting with this rule:

# Amd with edition, generates pubid
> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", language: "en", year: 2016, edition: 1, amendments: [{number: 1, year: 2017}]).urn
=> "ISO 17301-1:2016/Amd 1:2017"

Where edition is not included in PubID output but included to URN.

@ronaldtse
Copy link
Contributor Author

Good point @mico. Can we provide a PubID rendering option to whether show Edition or not?

@opoudjis
Copy link

Pubid::Iso::Errors::SupplementWithoutYearError:
  Cannot apply supplement to document without edition year

We have been generating undated identifiers for all ISO documents, including amendments. So both ISO 123:2001 and ISO 123, and therefore both ISO 123:2001/Amd 1:2002 and ISO 123/Amd 1.

However, I don't think you are wrong in raising this error. ISO 123 is much more meaningful than ISO 123/Amd 1 is: ISO 123 refers to the set of all editions of ISO 123; ISO 123/Amd 1 is much less well-defined.

So for now, I'm going to not generate undated identifiers for amendments. @ronaldtse @Intelligent2013 let me know if this raises concerns.

@ronaldtse
Copy link
Contributor Author

So for now, I'm going to not generate undated identifiers for amendments.

Yes this error is correct indeed.

We are now disallowing the PubID "ISO 123/Amd 1" because a supplement must be based on a base publication with either edition of year.

We should not generate undated identifiers for amendments.

@mico
Copy link
Contributor

mico commented Sep 20, 2022

Good point @mico. Can we provide a PubID rendering option to whether show Edition or not?

@ronaldtse we already have this option (with_edition):

> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", year: 2016, amendments: [{number: 1, year: 2017}]).to_s(with_edition: false)
=> "ISO 17301-1:2016/Amd 1:2017"

but without this option, result will be:

> Pubid::Iso::Identifier.new(number: 17301, part: 1, publisher: "ISO", year: 2016, amendments: [{number: 1, year: 2017}]).to_s
=> "ISO 17301-1:2016/Amd 1:2017 ED1"

Is it enough for now or we should change something?

@opoudjis
Copy link

opoudjis commented Sep 20, 2022

Lots of minor discrepancies between what I'd implemented and the current code, now that I have the code working. Posting to confirm:

Before: ISO 17301-1:2016/PreNP Amd 1. After: ISO 17301-1:2016/NP Amd 1.3:2017

  • We are ignoring draft status (Pre), and we are not ignoring iterations in amendments.

Before ISO 17301-1:2030/CD Cor.3. After: ISO 17301-1:2030/CD Cor 3

  • We are not abbreviating Corrigendum as "Cor.", we are putting space between "Cor" and iteration

Before: ISO/PreWD3 1000-1. After: ISO WD 1000-1.3

  • We are appending iterations, not making them part of the status. We are not delimiting ISO from status with a slash.

Before: IEC/IETF/ISO/TR 1000-1-1:2001. After: IEC/IETF/ISO TR 1000-1-1:2001

  • We are not delimiting publishers from document type with a slash.

Before: ISO 1000. After: ISO PRF 1000

  • We are not ignoring stage PRF in identifier rendering.

@ronaldtse
Copy link
Contributor Author

Before: ISO 17301-1:2016/PreNP Amd 1. After: ISO 17301-1:2016/NP Amd 1.3:2017

  • We are ignoring draft status (Pre)

This is a question to be discussed in #76

we are not ignoring iterations in amendments.

This is correct.

Before ISO 17301-1:2030/CD Cor.3. After: ISO 17301-1:2030/CD Cor 3

  • We are not abbreviating Corrigendum as "Cor.", we are putting space between "Cor" and iteration

Correct. "Cor." is legacy practice that has been replaced with "Cor" (no period).

Before: ISO/PreWD3 1000-1. After: ISO WD 1000-1.3

  • We are appending iterations, not making them part of the status.

Correct. Iteration is after the document/part number prefixed with the full stop.

We are not delimiting ISO from status with a slash.

This is wrong. It should have been "ISO/WD 1000-1.3".

Before: IEC/IETF/ISO/TR 1000-1-1:2001. After: IEC/IETF/ISO TR 1000-1-1:2001

  • We are not delimiting publishers from document type with a slash.

This is correct. The previous IEC/IETF/ISO/TR pattern is wrong. If there are copublishers, copublishers are separated by slashes, and the document type appears after the copublishers and a space. If there are no copublishers, the document type is placed after a slash after "ISO".

Before: ISO 1000. After: ISO PRF 1000

  • We are not ignoring stage PRF in identifier rendering.

This is usage dependent:

  • In rendering a PRF document, the PRF document type is ignored.
  • In referencing a PRF document, the PRF document type is rendered.

This means we need to have some option of specifying behavior here.

@opoudjis
Copy link

I would still rather we be able to specify stage as numbers rather than codes; as it is, the stage parameters are abstract anyway (we specify "FDIS" to get "FDTS" for tech specs, so we might as well just specify "50".)

OK, a couple of things to fix still, but this is substantially improved.

@ronaldtse
Copy link
Contributor Author

@ronaldtse we already have this option (with_edition):

Can we reverse the default behavior i.e. with_edition: false by default, only render edition when with_edition: true? Thanks!

@ronaldtse
Copy link
Contributor Author

I would still rather we be able to specify stage as numbers rather than codes; as it is, the stage parameters are abstract anyway (we specify "FDIS" to get "FDTS" for tech specs, so we might as well just specify "50".)

I agree that we should also allow specifying stages via stage codes (and substage codes).

When using it though understand that there will be inconsistencies: stage codes for PRF and FDIS are identical, so anything 50.00 will be rendered by default as FDIS unless you specify the PRF stage. (@mico can you put this in the README? Thanks.)

@opoudjis
Copy link

opoudjis commented Sep 20, 2022

Because the pubid-iso gem is now going to act as my oracle for stage abbreviations (why should the lookup tables for abbreviations be implemented twice?),

I will need to be able to look up the actual abbreviation for a stage, as applied to the particular document type. So if the style FDIS is applied to a TS document, I need a function that will return me "FDTS" given the initialised identifier. id.stage.abbr still returns me "FDIS". I request it be updated to return FTDS. Similarly, I'd like for it to return "IS" for 60.60, even though we suppress display of the "IS" stage identifier.

This will allow me to remove my own mapping to stage abbreviations, so that there is only one source of truth for abbreviations.

@opoudjis
Copy link

Pubid::Iso::Identifier.new(**{:number=>"1000", :language=>"en", :type=>"DIR", :publisher=>"ISO", :copublisher=>[], :urn_stage=>"60.60"}).to_s(with_language_code: :single)

outputs

"ISO DIR 1000"

and not the expected

"ISO DIR 1000(E)"

@opoudjis
Copy link

We are not delimiting ISO from status with a slash.

This is wrong. It should have been "ISO/WD 1000-1.3".

But note that this entire venture was triggered by metanorma/metanorma-iso#657 :

In ISO, the document identifier pattern for "copublished" documents is "IEC/ISO {stage}" not "IEC/ISO/{stage}", e.g. https://www.iso.org/standard/56572.html.

@mico
Copy link
Contributor

mico commented Sep 20, 2022

Pubid::Iso::Identifier.new(**{:number=>"1000", :language=>"en", :type=>"DIR", :publisher=>"ISO", :copublisher=>[], :urn_stage=>"60.60"}).to_s(with_language_code: :single)

outputs

"ISO DIR 1000"

and not the expected

"ISO DIR 1000(E)"

We don't have DIR documents with language code. (https://github.com/metanorma/pubid-iso/blob/main/spec/fixtures/iso-pubid-directives.txt)
Should we support it? @ronaldtse

@ronaldtse
Copy link
Contributor Author

But note that this entire venture was triggered by metanorma/metanorma-iso#657 :

In ISO, the document identifier pattern for "copublished" documents is "IEC/ISO {stage}" not "IEC/ISO/{stage}", e.g. https://www.iso.org/standard/56572.html.

See this, this is correct behavior:

This is correct. The previous IEC/IETF/ISO/TR pattern is wrong. If there are copublishers, copublishers are separated by slashes, and the document type appears after the copublishers and a space. If there are no copublishers, the document type is placed after a slash after "ISO".

We don't have DIR documents with language code.

@mico sure, let's support DIR with language codes as well. Thanks.

@ronaldtse
Copy link
Contributor Author

I will need to be able to look up the actual abbreviation for a stage, as applied to the particular document type. So if the style FDIS is applied to a TS document, I need a function that will return me "FDTS" given the initialised identifier. id.stage.abbr still returns me "FDIS". I request it be updated to return FTDS. Similarly, I'd like for it to return "IS" for 60.60, even though we suppress display of the "IS" stage identifier.

@opoudjis I need a bit more context here. The return value requested is inconsistent in these two cases:

  • The stage abbreviation for ISO "FDIS" is "FDIS". "FDTS" is an abbreviation of the "doctype + stage" of a TR in FDIS. We don't currently have a name for this but we probably should.
  • There is no stage abbreviation for ISO 60.60. "IS" is the abbreviation of the document type.

Maybe the "doctype + stage" abbreviation can be called the "document number prefix", but notice that there is a challenge:

  • ISO/TR => "TR" ?
  • ISO/FDTR => "FDTR"?
  • ISO/IEC TR => "TR"?
  • ISO/IEC FDTR => "FDTR"?

What should these return? Unless you include the "ISO..." bit the values will be inconsistent.

@opoudjis
Copy link

opoudjis commented Sep 22, 2022

At the moment, these are being shoved into stage/@abbreviation in markup, and they are being used by @Intelligent2013 in the process of creating the cover page for PDF. I was indeed giving

ISO/TR => "TR"
ISO/FDTR => "FDTR"
ISO/IEC TR => "TR"
ISO/IEC FDTR => "FDTR"

.... I don't see that consistency is that important. I think it is far more important that we not destroy a PDF workflow that has been stable for the past two years out of pedantry. I also don't particularly see where else this information should be stored: it is a stage abbreviation, albeit specific to a document type. So?

If your concern is what pubid-iso calls this, as opposed to what Metanorma XML does with it: what pubid-iso calls this data element of course is up to pubid-iso, as long as I can obtain it to pass it on. You can call it type_stage_abbrev if you like.

The inconsistency just means it is specific to the type/stage combination, but that does not impact anything in identifier generation anyway. I just want the string, so that Alex can put the string in the PDF front cover.

@ronaldtse
Copy link
Contributor Author

ronaldtse commented Sep 22, 2022

ISO/TR => "TR"
ISO/FDTR => "FDTR"
ISO/IEC TR => "TR"
ISO/IEC FDTR => "FDTR"
I just want the string, so that Alex can put the string in the PDF front cover.

This inconsistency is a real issue because it impacts @Intelligent2013 as well. The PDF workflow should not need to manually reconstruct these prefixes.

What I am stating here is: the publisher, document type and stage abbreviations are LINKED and dependent on each other. They need to be represented as one string.

Ping @Intelligent2013 on how this works for you?

@Intelligent2013
Copy link

I don't clear understand the changes. I.e.
instead of
<stage abbreviation="TR" language=""
will be
<stage abbreviation="ISO/TR" language="" or <stage abbreviation="ISO/IEC TR" language=""?
Then I can extract TR, no problem.
Currently, stage/@abbreviation is using to determination the PDF layout for cover page (there are 4 layouts) and some initial pages.

@Intelligent2013
Copy link

The text representation of abbreviation XSLT reads from stage with non-empty @language:

<stage abbreviation="DIS" language="en">Draft International Standard</stage>

@opoudjis
Copy link

opoudjis commented Sep 22, 2022

This inconsistency is a real issue because it impacts @Intelligent2013 as well. The PDF workflow should not need to manually reconstruct these prefixes.

You are completely misunderstanding the requirement, @ronaldtse. This is nothing to do with document identifiers. I am providing the stage abbreviation, for him (and me) to display on the coverpage for unpublished documents: IT SAYS, on the cover, "FDTR stage" in the warning box. That string is unlinked to the publisher and the document type: this is not the document identifier. I am asking for this gem to provide me that abbreviation, so I don't have to reconstruct it myself any more, and maintain a separate source of truth on those type-specific stage abbreviations.

opoudjis added a commit to metanorma/metanorma-iso that referenced this issue Sep 22, 2022
@opoudjis
Copy link

opoudjis commented Sep 25, 2022

I still need the stage output. Moreover, whatever the incompatibility is between pubid-iso and relaton-iso needs to be resolved. relaton/relaton-iso@5b6a684 @andrew2net

This is now holding up tomorrow's release.

The stage output, I can wait on. The incompatibility between pubid-iso and relaton-iso, I cannot.

@ronaldtse
Copy link
Contributor Author

@opoudjis thanks for the explanation. We could provide the "FDTR" abbreviation and full name for a pubid. The same happens for TR, PAS.

How about pubid.typed_stage_abbev for "FDTR" and pubid.typed_stage_name for "Final Draft Technical Report"?

@ronaldtse
Copy link
Contributor Author

@mico can we implement this soon? thanks.

@ronaldtse
Copy link
Contributor Author

The remaining issues have been dealt with by using the TypedStages concept.

Once pubid-iso is released this can be used by relaton-iso.

@ronaldtse
Copy link
Contributor Author

Closing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants