Filter data by draft status #124
    # Some parts (`ldml`, `ldmlBCP47` and `supplementalData`) of CLDR data require that you merge all the
    # files with the same root element before doing lookups.
    # Ref: https://www.unicode.org/reports/tr35/tr35.html#XML_Format
    #
    # The return of this method is a merged XML Nokogiri document.
    # Note that it technically is no longer compliant with the CLDR `ldml.dtd`, since:
    # * it has repeated elements
    # * the <identity> elements no longer refer to the filename
    #
    # However, this is not an issue, since #select will find all of the matches from each of the repeated elements,
    # and the <identity> elements are not important to us / make no sense when combined together.
Most of this comment has been moved to `DataFile#merge`.
The method itself has been rewritten to use `reduce`, which is cleaner IMO.
    def export(options = {}, &block)
      locales = options[:locales] || Data.locales
      components = options[:components] || Data.components
      self.minimum_draft_status = options[:minimum_draft_status] if options[:minimum_draft_status]
I'm wondering if this should set a default if not provided, since it will throw an exception if not set.
I think I prefer for it to explode, since it indicates that something is not working as expected. 🤷
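To make the trade-off in this thread concrete, here is a minimal sketch of a reader that raises when the setting has not been assigned, rather than falling back to a default. `ExportSketch` and `UnsetDraftStatusError` are hypothetical names used for illustration; this is not the code in the PR.

```ruby
# Hypothetical stand-in for the exporter; names are illustrative only.
module ExportSketch
  class UnsetDraftStatusError < StandardError; end

  class << self
    # Writer used by callers (for example, when parsing CLI options).
    attr_writer :minimum_draft_status

    # Reader that fails loudly instead of silently applying a default.
    def minimum_draft_status
      if @minimum_draft_status.nil?
        raise UnsetDraftStatusError, "minimum_draft_status has not been set"
      end

      @minimum_draft_status
    end

    # Mirrors the option handling in the diff: only assign when a value is given.
    def export(options = {})
      self.minimum_draft_status = options[:minimum_draft_status] if options[:minimum_draft_status]
      minimum_draft_status
    end
  end
end
```

With this shape, any code path that reads the setting before it is assigned fails immediately, which is the "explode" behaviour preferred above.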
    def setup
      Cldr::Export.minimum_draft_status = Cldr::DraftStatus::CONTRIBUTED
You might want to not set the class variable explicitly by default, so you can see when an exception is thrown in tests when it's being unexpectedly used before it's been set.
You could set it explicitly for tests that expect it to already be set, or mock it and explicitly assert that it was used.
I think I want it to default for everything except for the tests that exercise the draft status filtering code. I've broken the tests in data_file_test.rb into two TestCases, and I override the defaulting in the one that is testing the draft status filtering. 👍
Provides a view into the data that is filtered by draft level
It is already handled by `DataFile`
What are you trying to accomplish?

Fixes #73.

Throughout `ruby-cldr`, there were `draft?` calls in some places, but not others.

What approach did you choose and why?

I added a new `--draft-status` CLI option that allows users to specify the minimum draft status that they want all exported data to have.

Instead of each area of the codebase needing to know to check `draft?` all the time, all access to the data is done through a new `DataFile` class, which filters the data by draft status transparently. This ensures that we are not missing places where we need to be doing the filtering.

The default minimum draft status of `contributed` matches the status needed for inclusion into Unicode's ICU (and consequently what `cldr-json` exports).

What should reviewers focus on?

- `DraftStatus` is effectively an enum. Is there a better way to define these in pure Ruby?
- `minimum_draft_status` is set at the class level (`Cldr::Export.minimum_draft_status`). It was a lot nicer than passing a new `minimum_draft_status` parameter around everywhere. I feel mostly OK about it.

There is a similar issue with the `alt` attribute, but that will be handled as part of #125.

The impact of these changes

Users can now control the minimum draft status of the data they accept.

Less data is exported as you increase your minimum draft status. Export sizes were compared for:

- `main`
- `--draft-status=unconfirmed`
- `--draft-status=provisional`
- `--draft-status=contributed`
- `--draft-status=approved`

(Computed using `find . -type f -exec ls -l {} \; | awk '{sum += $5} END {print sum}'`, since macOS' `du` command doesn't have a bytes flag)

Testing

Find an entry in the `vendor` directory that has `draft=provisional`, for example in `vendor/cldr/common/main/el.xml`.

Export with different minimum draft statuses and compare the results.

And see that the diffs are the `draft=provisional` entries.
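On the reviewer question about defining an enum in pure Ruby, one possible shape is a small `Comparable` value class with one frozen constant per level. The sketch below is only an illustration under assumptions: `DraftLevel`, `fetch`, and `filter_by_draft` are hypothetical names, and plain hashes stand in for XML elements; it is not the `DraftStatus` or `DataFile` implementation from this PR. The ordering and the rule that a missing `draft` attribute means approved follow the CLDR specification.

```ruby
# Illustrative only: names are assumptions, not ruby-cldr's actual code.
class DraftLevel
  include Comparable

  attr_reader :name, :rank

  def initialize(name, rank)
    @name = name
    @rank = rank
    freeze
  end

  # Levels compare by rank; comparing with anything else is undefined (nil).
  def <=>(other)
    other.is_a?(DraftLevel) ? rank <=> other.rank : nil
  end

  def to_s
    name
  end

  # Ordered from least to most vetted.
  ALL = %w[unconfirmed provisional contributed approved]
    .each_with_index.map { |name, rank| new(name, rank) }.freeze

  UNCONFIRMED, PROVISIONAL, CONTRIBUTED, APPROVED = ALL

  # An element with no draft attribute counts as approved.
  def self.fetch(name)
    return APPROVED if name.nil?

    ALL.find { |level| level.name == name } ||
      raise(ArgumentError, "unknown draft status: #{name.inspect}")
  end
end

# Keep only the entries whose draft level meets the minimum.
def filter_by_draft(entries, minimum)
  entries.select { |entry| DraftLevel.fetch(entry[:draft]) >= minimum }
end

entries = [
  { value: "a", draft: nil },
  { value: "b", draft: "provisional" },
  { value: "c", draft: "contributed" },
]

filter_by_draft(entries, DraftLevel::CONTRIBUTED).map { |e| e[:value] } # => ["a", "c"]
```

Because the levels are `Comparable`, the filter reduces to a single `>=` check, so callers never need to know the ordering themselves.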