New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recursively merge language specific `site.data`. #59

Merged
merged 2 commits into from Jun 28, 2017

Conversation

Projects
None yet
2 participants
@pserwylo
Contributor

pserwylo commented Jun 26, 2017

Previously this would only overwrite the entire data files contents with
that of a language specific version. Now it merges the language specific
data file with the default, overriding whenever a new value is found in
the language specific version. This ensures that untranslated strings
revert back to their default value if a data file is only partially
translated.

Note that this doesn't try to do anything smart with arrays. If both the original data file and the language specific version contain an array, then the array from the language specific version should get chosen.

Fixes #58.

@pserwylo

This comment has been minimized.

Show comment
Hide comment
@pserwylo

pserwylo Jun 26, 2017

Contributor

You can see this addressing the problem in #58 in my partially-translated--fixed branch. It takes the commit from #58 which reproduces the problem, and rebases it on this branch.

Contributor

pserwylo commented Jun 26, 2017

You can see this addressing the problem in #58 in my partially-translated--fixed branch. It takes the commit from #58 which reproduces the problem, and rebases it on this branch.

@untra

This comment has been minimized.

Show comment
Hide comment
@untra

untra Jun 26, 2017

Owner

Thanks for the PR @pserwylo ! I'll take a look at it later tonight. I have a few other changes I've been procrastinating on, so I'd like to get those in along with any other changes before potentially cutting a new release 👍

Owner

untra commented Jun 26, 2017

Thanks for the PR @pserwylo ! I'll take a look at it later tonight. I have a few other changes I've been procrastinating on, so I'd like to get those in along with any other changes before potentially cutting a new release 👍

@pserwylo

This comment has been minimized.

Show comment
Hide comment
@pserwylo

pserwylo Jun 27, 2017

Contributor

No worries @untra, thanks for letting me know.

Contributor

pserwylo commented Jun 27, 2017

No worries @untra, thanks for letting me know.

@untra untra referenced this pull request Jun 27, 2017

Merged

beta changes and cleanup #62

untra added a commit that referenced this pull request Jun 27, 2017

beta changes and cleanup (#62)
* cleaned up the `.rubocop.yml` file
* cleaned up the `gemfile`
* updated the `.travis.yml` target ruby versions
* staged changes to the `.gemspec` to target a future `1.3.0` release
* changes to coordinate to allow proper testing of recursive merge #59 
* changes to i18nHeaders to prevent #20
@untra

I made changes last night that will assist with developing a 1.3.0 branch. Among these is getting tests going for coordinate. Please modify your changes to the coordinate.rb file, and add in a test for the coordinate_spec.rb file

@untra

This comment has been minimized.

Show comment
Hide comment
@untra

untra Jun 27, 2017

Owner

I'll make adding the test super easy: right here, add:

it 'should recursively merge the site.data.active_lang to the site.data' do
    @site.active_lang = 'fr'
    hook_coordinate(@site)
    expect(@site.data['strings']['ice cream']).to eq('crème glacée')
    expect(@site.data['strings']['apple']).to eq('apple')
end
Owner

untra commented Jun 27, 2017

I'll make adding the test super easy: right here, add:

it 'should recursively merge the site.data.active_lang to the site.data' do
    @site.active_lang = 'fr'
    hook_coordinate(@site)
    expect(@site.data['strings']['ice cream']).to eq('crème glacée')
    expect(@site.data['strings']['apple']).to eq('apple')
end
@pserwylo

This comment has been minimized.

Show comment
Hide comment
@pserwylo

pserwylo Jun 27, 2017

Contributor

Oh how of course writing tests makes me realise my broken assumptions.

It turns out that this is the structure we are using in our data files:

  • _data/
    • strings.json (English strings, from before I applied the "no default_lang" hack documented in #60.
    • fr/strings.json
    • de/strings.json

With our blank default_lang it conincidentally works with this MR, but not by design. What happens is:

  • If the active_lang is en, then it tries to merge en/strings.json into strings.json. The en/strings.json file doesn't exist, but coincidentally the correct (English) values are in en/strings.json anyway, so it just works.
  • If the active_lang is fr, then it tries to merge fr/strings.json into strings.json and works. It works for us because we don't have a default language

@untra: Is the intent that you expect people to put their default strings directly in their data/ directory, rather than in data/default_lang/?

I can change the merging logic to do the following:

  • Start with non-language specific site data (if it exists)
  • Merge the default language site data in (if it exists)
  • Then merge the active language site data in (if it exists)

Right now it only:

  • Starts with non-language specific site data (if it exists)
  • Merge the active language site data in (if it exists)

Without that, then the test above wont pass because site.data['en'] never gets merged in when the active lang is 'fr'.

Contributor

pserwylo commented Jun 27, 2017

Oh how of course writing tests makes me realise my broken assumptions.

It turns out that this is the structure we are using in our data files:

  • _data/
    • strings.json (English strings, from before I applied the "no default_lang" hack documented in #60.
    • fr/strings.json
    • de/strings.json

With our blank default_lang it conincidentally works with this MR, but not by design. What happens is:

  • If the active_lang is en, then it tries to merge en/strings.json into strings.json. The en/strings.json file doesn't exist, but coincidentally the correct (English) values are in en/strings.json anyway, so it just works.
  • If the active_lang is fr, then it tries to merge fr/strings.json into strings.json and works. It works for us because we don't have a default language

@untra: Is the intent that you expect people to put their default strings directly in their data/ directory, rather than in data/default_lang/?

I can change the merging logic to do the following:

  • Start with non-language specific site data (if it exists)
  • Merge the default language site data in (if it exists)
  • Then merge the active language site data in (if it exists)

Right now it only:

  • Starts with non-language specific site data (if it exists)
  • Merge the active language site data in (if it exists)

Without that, then the test above wont pass because site.data['en'] never gets merged in when the active lang is 'fr'.

@pserwylo

This comment has been minimized.

Show comment
Hide comment
@pserwylo

pserwylo Jun 27, 2017

Contributor

Specifically, I'd change it to this:

def hook_coordinate(site)
  # Copy the language specific data, by recursively merging it with the default data.
  # See: https://www.ruby-forum.com/topic/142809
  merger = proc { |key, v1, v2| Hash === v1 && Hash === v2 ? v1.merge(v2, &merger) : v2 }
  if site.data.include?(site.default_lang)
    site.data = site.data.merge(site.data[site.default_lang], &merger)
  end
  if site.data.include?(site.active_lang)
    site.data = site.data.merge(site.data[site.active_lang], &merger)
  end

  site.collections.each do |_, collection|
    collection.docs = site.coordinate_documents(collection.docs)
  end
  site.pages = site.coordinate_documents(site.pages)
end

Which makes the following test pass:

  it 'should fall back to the default_lang when using translated site data' do
    @site.active_lang = 'fr'
    hook_coordinate(@site)
    expect(@site.data['foo']).to eq('frbar')
    expect(@site.data['baz']).to eq('databaz')
    expect(@site.data['strings']['ice cream']).to eq('crème glacée')
    expect(@site.data['strings']['apple']).to eq('apple') # This is populated from @site.data['strings'][@site.default_lang]['apple']
  end
Contributor

pserwylo commented Jun 27, 2017

Specifically, I'd change it to this:

def hook_coordinate(site)
  # Copy the language specific data, by recursively merging it with the default data.
  # See: https://www.ruby-forum.com/topic/142809
  merger = proc { |key, v1, v2| Hash === v1 && Hash === v2 ? v1.merge(v2, &merger) : v2 }
  if site.data.include?(site.default_lang)
    site.data = site.data.merge(site.data[site.default_lang], &merger)
  end
  if site.data.include?(site.active_lang)
    site.data = site.data.merge(site.data[site.active_lang], &merger)
  end

  site.collections.each do |_, collection|
    collection.docs = site.coordinate_documents(collection.docs)
  end
  site.pages = site.coordinate_documents(site.pages)
end

Which makes the following test pass:

  it 'should fall back to the default_lang when using translated site data' do
    @site.active_lang = 'fr'
    hook_coordinate(@site)
    expect(@site.data['foo']).to eq('frbar')
    expect(@site.data['baz']).to eq('databaz')
    expect(@site.data['strings']['ice cream']).to eq('crème glacée')
    expect(@site.data['strings']['apple']).to eq('apple') # This is populated from @site.data['strings'][@site.default_lang]['apple']
  end
@untra

This comment has been minimized.

Show comment
Hide comment
@untra

untra Jun 27, 2017

Owner

Good new solution. I like that you're looking out for the edge cases. With your default strings in the data directory you can still achieve the best of both worlds. I'll need to write some documentation about how to properly lay out strings in the data directory.

@untra: Is the intent that you expect people to put their default strings directly in their data/ directory, rather than in data/default_lang/?

I would like to accommodate both approaches, however I do believe that combined with your #60 approach to all language subdirectories, this new solution should have a proper configuration that will ensure fallback to a default set of strings, even when a default_lang is not specified.

  • Start with non-language specific site data (if it exists)
  • Merge the default language site data in (if it exists)
  • Then merge the active language site data in (if it exists)

I think that is certainly the correct approach. Because it merges recursively, it will favor active_lang > default_lang > non-specific strings in that order.

Thanks for your help with this! Make the changes to the PR at your conveience, add the test(s), and I'll give it the 👍

Owner

untra commented Jun 27, 2017

Good new solution. I like that you're looking out for the edge cases. With your default strings in the data directory you can still achieve the best of both worlds. I'll need to write some documentation about how to properly lay out strings in the data directory.

@untra: Is the intent that you expect people to put their default strings directly in their data/ directory, rather than in data/default_lang/?

I would like to accommodate both approaches, however I do believe that combined with your #60 approach to all language subdirectories, this new solution should have a proper configuration that will ensure fallback to a default set of strings, even when a default_lang is not specified.

  • Start with non-language specific site data (if it exists)
  • Merge the default language site data in (if it exists)
  • Then merge the active language site data in (if it exists)

I think that is certainly the correct approach. Because it merges recursively, it will favor active_lang > default_lang > non-specific strings in that order.

Thanks for your help with this! Make the changes to the PR at your conveience, add the test(s), and I'll give it the 👍

pserwylo added some commits Jun 26, 2017

Recursively merge language specific `site.data`.
Previously this would only overwrite the entire data files contents with
that of a language specific version. Now it merges the language specific
data file with the default, overriding whenever a new value is found in
the language specific version. This ensures that untranslated strings
revert back to their default value if a data file is only partially
translated.
Added test coverage for merging language-specific and default site data
Supports first prefering language specific strings for the active_lang, then falling back to those from default_lang (if they exist). Finally, if neither exist, it will try to fall back to the non-language specific default strings.
@pserwylo

This comment has been minimized.

Show comment
Hide comment
@pserwylo

pserwylo Jun 27, 2017

Contributor

Alright, I've ammended the first commit to support defaulting to active_lang, falling back to default_lang, then non-language specific default strings. The test has been beefed up slightly to support all of these in the top commit. Specifically, there is now a non-language specific string in @site.data['strings']['banana'] which both tests fall back to correctly. Cheers!

Contributor

pserwylo commented Jun 27, 2017

Alright, I've ammended the first commit to support defaulting to active_lang, falling back to default_lang, then non-language specific default strings. The test has been beefed up slightly to support all of these in the top commit. Specifically, there is now a non-language specific string in @site.data['strings']['banana'] which both tests fall back to correctly. Cheers!

@untra

untra approved these changes Jun 28, 2017

@untra untra merged commit 7442c7f into untra:master Jun 28, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@pserwylo pserwylo deleted the pserwylo:merge-translations branch Jun 29, 2017

@pserwylo pserwylo restored the pserwylo:merge-translations branch Jun 29, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment