[localize-tools] Validation of existing translation should not fail on changed expression #4502

augustjk · 2024-01-18T00:36:25Z

Which package(s) are affected?

Localize (@lit/localize)

Description

Reported by @kensternberg-authentik in #4489

https://lit.dev/docs/localization/overview/#message-ids states the rules for message ID generation and how IDs are used to deduplicate translation targets. One reading of those rules is that changing the code inside expressions in your source should not cause later extract and build calls to fail. However, there appears to be validation that compares the value of the equiv-text attribute.

Reproduction

See linked discussion above for sample and error message.

Workaround

Manually update the xliff to match the equiv-text

Is this a regression?

No or unsure. This never worked, or I haven't tried before.

Affected versions

@lit/localize-tools@0.7.1

Browser/OS/Node environment

n/a


augustjk · 2024-01-18T00:41:07Z

According to the comment here

lit/packages/localize-tools/src/messages.ts

Lines 108 to 119 in 546d75d

    /**
     * Check that for every localized message, the set of placeholders in the
     * localized version is equal to the set of placeholders in the source version
     * (no more, no less, no changes, but order can change).
     *
     * It is important to validate this condition because placeholders can contain
     * arbitrary HTML and JavaScript template literal placeholder expressions, will
     * be substituted back into generated executable source code. A well behaving
     * localization process/tool would not allow any modification of these
     * placeholders, but we can't assume that to be the case, so it is a potential
     * source of bugs and attacks and must be validated.
     */

placeholders don't always contain just the expression code; they can contain HTML too, so ignoring their content might not be the best idea.

But perhaps this assumption doesn't hold when we consider that multiple source msg() calls can map to a single trans-unit.

Need to investigate and consider this more carefully.

augustjk · 2024-01-23T01:40:30Z

Found some more context here: #2405 (comment)

The churn in extraction was anticipated, but it looks like we didn't really handle it at the time; we only addressed the more pressing problem of making sure the correct expressions are used in transform mode.

In the PR above, we decided to treat same id with the same description as canonically the same translation unit.

Firstly, this should be made clear in the docs in lit.dev.

Secondly, I think we need to update the validation function

lit/packages/localize-tools/src/messages.ts

Line 120 in 546d75d

export function validateLocalizedPlaceholders(

to match this behavior, so that we only check the placeholder count. Might we also need to check the description in a <note from="lit-localize">? If so, changing the description, or adding a new one, might cause errors and churn.

Another concern with this is that we would no longer check equiv-text for discrepancies. I think the comment above might be outdated with regard to placeholders being "substituted back into generated executable source code", so perhaps there's no real need to check that. Need to confirm this.

If we make these pass validation, I think the extract will end up overwriting the previous equiv-text value with whatever the analysis first encountered. Need to check if that is the case and whether it does it for both <source> and <target>.

augustjk · 2024-01-26T02:27:50Z

Got around to trying out some things and (re)discovering how lit-localize actually works. Thought dump here so I don't forget it all later.

When running extract, it'll always overwrite <source> of the translation unit with analysis from source code.

The validation and potential error happens when build is run, as it validates the placeholder in the translated <target> against the source code.

For runtime mode, it looks like localize currently uses the placeholder text from the <target> tag as it generates code. This is problematic if <target> tags have been modified and not verified against source.

For transform mode, localize only uses the index of the placeholder from <target> for positioning, but the actual placeholder content comes from source code.

A potential path forward: we can make runtime mode behave like transform mode, so that it always takes placeholder text from source code. This removes the concern raised in the original comment above about arbitrary code injection from the translation targets. The only useful validation we'd really need then is the number of placeholders, so we know where to place them.
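To make the idea concrete, here is a minimal sketch of "position from the target, content from the source". The `TargetPart` shape and the `assembleFromSource` function are hypothetical names made up for illustration; they are not the actual lit-localize types or code. It assumes each placeholder in the translated target carries only an index into the list of placeholders found in source code.

```typescript
// Hypothetical shape for illustration only; not the real lit-localize types.
// A translated target is a list of literal strings and placeholder slots.
type TargetPart = string | {placeholderIndex: number};

// Build the output using only the *position* of each placeholder from the
// target. The placeholder *content* always comes from the source code, so
// nothing written in the translation file ends up in generated code.
function assembleFromSource(
  targetParts: TargetPart[],
  sourcePlaceholders: string[]
): string {
  // The only validation needed in this scheme: the placeholder count.
  const used = targetParts.filter((p) => typeof p !== 'string').length;
  if (used !== sourcePlaceholders.length) {
    throw new Error(
      `Expected ${sourcePlaceholders.length} placeholders, found ${used}`
    );
  }
  return targetParts
    .map((part) => {
      if (typeof part === 'string') return part;
      const text = sourcePlaceholders[part.placeholderIndex];
      if (text === undefined) {
        throw new Error(`No source placeholder at index ${part.placeholderIndex}`);
      }
      return text;
    })
    .join('');
}
```

With this sketch, a target of `['Hola ', {placeholderIndex: 0}, '!']` combined with source placeholders `['${user.name}']` yields `Hola ${user.name}!`, regardless of what equiv-text the translation file recorded.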

But without content validation, we won't detect any drift from source code to translation target for cases where the ID was set manually. Would that actually matter? (This kind of drift wouldn't really happen with an auto-generated ID, as the changed message would be treated as a new translation unit.)

We could make the validation specifically ignore the expression (everything inside ${}) from the placeholder text too but still check for other things like html tags to catch cases with fixed custom ids.
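A rough sketch of what such a comparison could look like follows. The helper names `stripExpressions` and `placeholdersMatch` are hypothetical and this is not the code from the eventual fix. It assumes placeholders are available as plain strings, and it uses simple brace-depth counting, so an unbalanced brace inside a string literal within an expression would confuse it.

```typescript
// Hypothetical helper: blank out the body of every `${...}` in a placeholder,
// leaving `${}` as a marker so the surrounding HTML is still compared.
// Brace-depth counting handles nested braces such as `${fn({a: 1})}`.
function stripExpressions(placeholder: string): string {
  let out = '';
  let i = 0;
  while (i < placeholder.length) {
    if (placeholder.startsWith('${', i)) {
      let depth = 1;
      let j = i + 2;
      // Advance until the brace that closes this expression.
      while (j < placeholder.length && depth > 0) {
        const ch = placeholder[j];
        if (ch === '{') depth++;
        else if (ch === '}') depth--;
        j++;
      }
      out += '${}';
      i = j;
    } else {
      out += placeholder[i];
      i++;
    }
  }
  return out;
}

// Hypothetical check: same placeholders once expressions are ignored.
// Order may differ between source and target, so compare sorted copies.
function placeholdersMatch(source: string[], target: string[]): boolean {
  if (source.length !== target.length) return false;
  const normalize = (list: string[]) => list.map(stripExpressions).sort();
  const a = normalize(source);
  const b = normalize(target);
  return a.every((text, idx) => text === b[idx]);
}
```

Under these assumptions, `<b>${user.name}</b>` and `<b>${this.user.displayName}</b>` compare as equal, while `<b>${x}</b>` and `<i>${x}</i>` do not, which is the distinction that matters for fixed custom IDs.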

augustjk · 2024-01-30T02:23:25Z

Opted to implement this option from the previous comment:

"We could make the validation specifically ignore the expression (everything inside ${}) from the placeholder text too but still check for other things like html tags to catch cases with fixed custom ids."

This required the smallest change while still giving the correct behavior.

kensternberg-authentik mentioned this issue Jan 20, 2024

Web/20240118 monorepo2 goauthentik/authentik#8242

Closed

7 tasks

augustjk changed the title ~~[localize-tools] extract validation of existing translation should not fail on changed expression~~ [localize-tools] Validation of existing translation should not fail on changed expression Jan 26, 2024

augustjk mentioned this issue Jan 30, 2024

[localize] Make placeholder validation pass with different expressions #4530

Merged

augustjk mentioned this issue Jan 31, 2024

[docs/localization] Update info on messages that map to the same id lit/lit.dev#1301

Merged

augustjk closed this as completed in #4530 Jan 31, 2024
