Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Translation aliasing #3249

Merged
merged 28 commits into from
Jan 2, 2024
Merged

Translation aliasing #3249

merged 28 commits into from
Jan 2, 2024

Conversation

demiankatz
Copy link
Member

@demiankatz demiankatz commented Dec 6, 2023

TODO

  • Account for aliases appropriately in the devtools/language tool
  • Investigate the possibility of cross-TextDomain aliasing
  • Investigate edge case of inheriting from another TextDomain, when that TextDomain contains aliases
  • Add test coverage
  • Use aliases to resolve VUFIND-1233 as a demonstration of the new functionality
  • Close VUFIND-1233 when merging

@demiankatz demiankatz marked this pull request as draft December 6, 2023 14:24
@demiankatz demiankatz marked this pull request as ready for review December 6, 2023 21:13
@demiankatz
Copy link
Member Author

I think this is a reasonable start on a translation aliasing feature. Here's what I've done:

1.) Added support for an aliases.ini file as proposed by @ThoWagen on #413.

2.) Adjusted the language development tool to account for aliases when comparing files, and to include a report on duplicated values (since this will help us identify candidates for aliasing).

3.) Reduced duplication in the CreatorRoles text domain using the new alias mechanism, to demonstrate its usefulness and to resolve VUFIND-1233.

This PR is not yet ready to merge, as I still have to do some work adding test coverage. However, before I do those finishing touches, I'd be interested in a first opinion from @ThoWagen (whose ideas inspired this PR) and/or @EreMaijala. Please don't feel obligated to do a really deep review yet, but let me know if you have any high-level concerns about this approach. If it looks generally okay, I'll go ahead and add tests and fix any edge case bugs that I find along the way.

@demiankatz
Copy link
Member Author

I should also note that the current build failures seem to be related to a GitHub Actions issue (Composer is misbehaving in weird ways) rather than an actual problem with this PR. I'll re-run the build at some point in the future to hopefully straighten that out.

@demiankatz demiankatz mentioned this pull request Dec 7, 2023
6 tasks
Copy link
Contributor

@EreMaijala EreMaijala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@ThoWagen
Copy link
Contributor

ThoWagen commented Dec 8, 2023

Besides the idea here #3251 (comment) this looks good to me. I have not done any testing and am not familiar with the dev tools though.

@demiankatz
Copy link
Member Author

The idea @ThoWagen alludes to above is the ability to alias across TextDomains, which is not possible with the implementation currently presented here. I think it should be possible to accomplish this, though it may come with some costs.

In general, I think there are two possible ways to approach the aliasing problem:

Option 1: Reconcile aliases when loading language files

Option 2: Reconcile aliases when applying translations

I chose Option 1 for my initial implementation, since this only requires us to apply alias processing once, and then the results are cached. This has higher memory/cache storage requirements, but greater runtime efficiency. It also allows us to isolate the aliasing logic to a single class, and it has a very low risk of behavioral side effects elsewhere in the code.

I think we could add cross-TextDomain aliasing while still leaning into Option 1 if we want to -- the trade-off is that we may need to load and parse some language files multiple times in order to do it. This might not be a huge problem since the results of this work will be cached anyway. I'll try this type of implementation the next time I have time to work on this PR and see how it goes. If we want to allow simple aliasing to existing translation strings, that's probably not very hard, but if we want to cover all the edge cases of aliases to aliases, aliases to inherited/fallback strings, etc., it may get messy and lead to pitfalls with circular dependencies, etc. I'll see how it goes when I try it!

If for some reason that approach doesn't prove to be workable, there might be some kind of hybrid solution possible where we use the current code to load most of the aliases, but we have some kind of placeholder for cross-domain aliases, and we apply those at runtime. But that's obviously a much less elegant approach, and it would likely impose a significant runtime burden, since it would add extra logic to every single translation call, of which there are many on most pages.

As I noted on #3251, one potential workaround for cross-domain aliasing could be to inherit from the other domain before applying aliases, which simplifies the logic but increases the storage/memory overhead. This also raises questions about some other edge cases, like what should happen if you inherit from a language file in a different text domain, and that text domain defines aliases? I'll give these factors some thought as well when I work on the next iteration of the code.

@demiankatz
Copy link
Member Author

@ThoWagen / @EreMaijala, what do you think of commit 6d9cb34? This adds the ability to create aliases across TextDomains, at the cost of some redundant file loading.

Two important notes:

1.) You cannot alias an alias with this mechanism. That's a limitation that I'm willing to live with, but I imagine it's a problem that could be solved if we felt it was worth the effort.

2.) Right now, aliases.ini is set up so that if no TextDomain is specified, it defaults to the current TextDomain. I think this is the least verbose and most intuitive behavior. However, an argument could be made for defaulting to the default, top-level TextDomain instead. I'm open to discussing further if anyone prefers that behavior.

@EreMaijala
Copy link
Contributor

@ThoWagen / @EreMaijala, what do you think of commit 6d9cb34? This adds the ability to create aliases across TextDomains, at the cost of some redundant file loading.

Looks fine to me, but then I was happy with the initial implementation as well, so I'll leave the final judgement to @ThoWagen. :)

@ThoWagen
Copy link
Contributor

I still didn't have the time to test this but it looks good to me.

Regarding 1.) Being able to alias aliases would be something I would expect to work. But I can't tell if it is worth the effort to implement it.

Regarding 2.) I like the current approach.

@demiankatz
Copy link
Member Author

Thanks, @ThoWagen! I probably won't have time to do more work on this until the new year, but the next step is to build tests. While I'm building the test suite, I'll also see if I can set up some "alias to an alias" scenarios and determine whether it's feasible to support that without a lot of overhead. I agree that it's a feature that would be nice to have (and may potentially prevent some future confusion); I'm just not sure how expensive it will be -- but I'll try to find out!

@demiankatz
Copy link
Member Author

Okay, I believe this is done now -- I have added test coverage for basic alias functionality and some obvious edge cases, and I have extended the logic so that aliases of aliases (of aliases, etc., etc...) should work, while also preventing the possibility of infinite loops.

If @EreMaijala and/or @ThoWagen have time for one more look at this, I'd appreciate a final confirmation that my logic and tests make sense and appear complete. Once we're all happy, I'll get this merged and begin moving some of the other i18n-related PRs forward.

@demiankatz demiankatz added this to the 10.0 milestone Dec 21, 2023
Copy link
Contributor

@ThoWagen ThoWagen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one small suggestion below. Other than that this still looks good to me :)

// Circular alias infinite loop prevention:
$breadcrumbKey = "$domain::$key";
if (in_array($breadcrumbKey, $breadcrumbs)) {
return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to throw an exception here? Otherwise finding the error might be hard.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, very good point. I have changed the behavior in 721c4a9.

@demiankatz
Copy link
Member Author

Since this already has two approvals, and the only subsequent change was to apply @ThoWagen's suggested improvement, I'm going to go ahead and merge this now. Now we should merge this work into #3200, #3239 and #3251 and make further progress on those. If @ThoWagen is able to look at #3239, I'm willing to handle the other two when time permits.

@demiankatz demiankatz merged commit 7006967 into vufind-org:dev Jan 2, 2024
7 checks passed
@demiankatz demiankatz deleted the vufind-1233 branch January 2, 2024 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants