Refactor `Catalogs.extract()` to implement more consistent behaviour #231
…zable strings & update helpers to allow these pods to be used easily
True by default, as that's the current & most common behaviour
… the correct locale catalogs
…talog.save()` with `include_obsolete` arg for `Catalog.update()` and `Catalog.update_using_catalog()`. This is more in line with `babel.Catalog` behaviour, but better suits Grow's needs of wanting to include obsolete terms _without marking them obsolete_ (which is what Babel would do). `Catalogs.extract()` can then lean on this behaviour and hence be simpler.
Awesome, thank you for putting all the time into the change and also for the excellent test coverage. I'll start the review with some style-related stuff just to keep consistency with the rest of the project, and then I'll dig in after to see if I have any comments on the overall design. I gave a cursory look at it yesterday and it looks pretty great.
```python
# Extract from root of /content/:
for path in self.pod.list_dir('/content/', recursive=False):
    if path.endswith('.yaml') or path.endswith('.yml'):
```
`endswith` accepts tuples, so you can do `path.endswith(('.yaml', '.yml'))`.
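For illustration, the tuple form of the built-in `str.endswith` collapses the two checks into one call (the sample path is made up):

```python
# str.endswith accepts a tuple of suffixes and returns True if the
# string ends with any of them.
path = '/content/pages/about.yaml'
matches = path.endswith(('.yaml', '.yml'))

# Equivalent to the original two-condition form:
assert matches == (path.endswith('.yaml') or path.endswith('.yml'))
```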
This I didn’t know! Will update.
In response to this comment, "No longer extracts tagged terms in ‘global’ locations", does that behavior only kick in when
Well, sorta N/A:

* I like this term. I might use it more.

Aside: maybe with
Makes sense, thanks for explaining. The term "locale context" sounds good. ;) I'm OK with the breaking change of ignoring

ALSO: This should definitely be a separate feature/PR, but I'd also like to add a new configuration option to
That would simplify translation management across multiple developers and help with any continuous translation initiative we work on.
You can still put stuff in
Agree that the config option would be nice, but low priority IMO. I kinda like teams to know what the commands they're running are doing.
Ha, well, if we provide some built-in optional commit hooks (such as a pre-commit
👍 We’ve played with a “feature finished” script too, which runs linting (we decided pre/post-commit was a bit much).
Just a note: I ran this against a project I work on and everything works as intended!

I noticed this exposes the reason behind that

As now intended, in the absence of that identifier, and in the absence of any
Also, there is no

I think the change in behavior introduced by this PR is correct -- to not extract this content at all. To get the previous behavior, you'd want the equivalent of

After merging this PR in, I'm thinking about making a change to the default behavior of these partial files that do not have a
Therefore, in order to get the previous behavior back and have this file extracted too, I would have to do nothing (since

Thoughts on that?
I'm pretty sure with the new implementation strings in that file will still be extracted, but it looks like they're not tagged for translation? If it were this:

```yaml
---
foo@: bar
---
$locale: de
foo@: baz
```

I'd expect
```python
super(Catalog, self).update(
    catalog_to_merge, no_fuzzy_matching=(not use_fuzzy_matching))
# Don't use gettext's obsolete functionality as it pollutes files: merge
```
@jeremydw Meant to ask: do you agree with this? Looks like the current implementation doesn't use gettext's functionality for obsolete strings, so I avoided it too: if you pass `include_obsolete` it'll just include obsolete terms as if they weren't obsolete. Seemed safest, in case obsolete syntax isn't widely supported by translation tools (dunno if this is the case, just a guess it might not be!)
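To illustrate the behaviour being described, here's a toy sketch over plain dicts (not Grow's or Babel's actual `Catalog` API; the function and message names are made up):

```python
def update(catalog, template_ids, include_obsolete=False):
    """Toy merge of an old catalog against a new template.

    Terms missing from the template are "obsolete". With include_obsolete
    they are folded back in as ordinary entries, with no #~ marking,
    mirroring the behaviour described above.
    """
    merged = {msgid: catalog.get(msgid, '') for msgid in template_ids}
    if include_obsolete:
        for msgid, string in catalog.items():
            # Re-add obsolete terms as if they were never obsolete.
            merged.setdefault(msgid, string)
    return merged

old = {'kept': 'behalten', 'stale': 'veraltet'}
with_obsolete = update(old, ['kept', 'new'], include_obsolete=True)
without = update(old, ['kept', 'new'])
```

Here `with_obsolete` retains `'stale'` as a regular entry, while `without` drops it entirely.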
Oh, maybe. I see what you've done, but I don't know how the default functionality pollutes files other than leaving the terms in. What does it do exactly that's bad? I think obsolete terms in PO files are just comments preceded with `#~`. IMO if we could support that default obsolete behavior/syntax (when `--include-obsolete` is used) that'd be good? Feel free to make that outside the scope of this PR though.
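For reference, this is the gettext convention being referred to: when a message drops out of the template, `msgmerge` keeps it at the end of the PO file with each line prefixed by `#~` (the strings below are invented examples):

```po
#~ msgid "An old string no longer in the template"
#~ msgstr "Its stale translation"
```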
Raised #232 to discuss. Will leave as-is on this PR.
Bad explanation on my part, the strings are indeed tagged, e.g.

You can take a peek but the reason they're not extracted is because without the

This logging statement shows up when running

As I mentioned I think this is correct behavior as-is (aside from the logging statement). The presence of
```python
self.save(include_header=include_header)

def merge_obsolete(self):
    """ Copy obsolete terms into the main catalog """
```
Just a tiny style nit, but docstrings should end in punctuation and not be padded with space characters. ;) This isn't a blocker but I just wanted to raise this for consistency for future code you write!
ACK
Yep that's a bug then; my intention was for
This wasn't a problem before adding validation for doc parts in c8bb8ab.
Force-pushed from d3b4b44 to f13e71c
Thanks for adding those extra tests, now it looks like they're (correctly) failing, and that matches up with the behavior that I reported when using

Other than that I think we're good with all these changes. If you're happy with this I think we can merge it in as soon as the fix to infer

I think the change for that should be made here (https://github.com/stucox/pygrow/blob/c8bb8ab51de401459816db141bbb40ae5da17a8f/grow/pods/documents.py#L260), with some pseudo code like:
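As a hedged illustration of that inference (the names `fields` and `parts` are stand-ins for this sketch, not Grow's actual `documents.py` API):

```python
def is_localized(fields, parts):
    """Infer whether a document should be treated as localized.

    `fields` is the document's main front matter; `parts` is the list of
    front-matter dicts for each subsequent YAML part of the file.
    """
    # Explicit localization config on the document itself.
    if '$localization' in fields or '$locale' in fields:
        return True
    # Otherwise, a part that declares its own $locale implies the document
    # is localized, even without a $localization identifier.
    return any('$locale' in part for part in parts)

# A file with no $localization identifier, but a part pinned to a locale,
# would be treated as localized:
localized = is_localized({}, [{'$locale': 'de', 'foo@': 'baz'}])
```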
That'd just be one way to do it. Am open to another way or a simpler way of inferring whether a document is localized. LMK if you have any questions or want to pass this over to me for merging.
Yep I'll fix those cases tomorrow. Thanks for the pseudocode.
Merged by #239
Fixes #204, #157 & #206.
Pretty much by definition this is a breaking change — see #204 and the aim & the included tests for details of cases I’ve covered. Not sure the best way to communicate this to users.
Note that in this version `/data/` isn’t treated as a special directory anymore.

I’ve tested this on a very large project I work on and it does what I’d expect:
This last point caused some work for us: we had a “globals.yaml” file which assumed everything was relevant to every locale (which was wrong: not every locale used those) and hence didn’t declare any locales… which worked until now. Solved by either listing all relevant locales in that file, or listing them all in podspec.yaml (the list there only contained primary locales, which might be better defined at collection level).
Note: would appreciate it if people could test this on some other projects.
I’ll try a few others of ours while @jeremydw is checking out the code.