added scripts and data retrieved from unicode CLDR #321
Codecov Report
@@            Coverage Diff             @@
##           master     #321      +/-   ##
==========================================
- Coverage   97.61%   94.28%   -3.33%
==========================================
  Files          20      299     +279
  Lines        1674     1836     +162
==========================================
+ Hits         1634     1731      +97
- Misses         40      105      +65
…d other changes to make parsing more efficient
… added language_order
…with same translations
Hi! I'm leaving some comments regarding data download scripts.
scripts/get_cldr_data.py
@@ -0,0 +1,480 @@
import requests
Any new modules (requests, orderedset) must be added to requirements.
But these are imported only in the scripts, which won't be used by users. Do we still need to add them to requirements?
scripts/get_cldr_data.py
from utils import get_dict_difference
from order_languages import language_locale_dict


OAuth_Access_Token = 'OAuth_Access_Token'  # Add OAuth_Access_Token here
I believe we could replace the direct access to GitHub through requests + auth by using:
Repo.clone_from('https://github.com/unicode-cldr/cldr-dates-full', 'path')
and then working on a temporary cloned repo.
I didn't know about that, will make changes soon.
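The cloning approach suggested above could look roughly like the following. This is only a sketch, assuming GitPython is installed and that the scripts want to visit every JSON file in the clone; `iter_json_files`, `process_cldr_dates` and the `handle_file` callback are hypothetical names, not part of this PR.

```python
import tempfile
from pathlib import Path

CLDR_DATES_URL = "https://github.com/unicode-cldr/cldr-dates-full"


def iter_json_files(root):
    """Yield every .json file below root, in sorted order."""
    yield from sorted(Path(root).rglob("*.json"))


def process_cldr_dates(handle_file, url=CLDR_DATES_URL):
    """Clone the repo into a temporary directory and visit each JSON file.

    handle_file is a hypothetical callback that receives a pathlib.Path.
    """
    # Imported lazily so the helper above works without GitPython installed.
    from git import Repo

    with tempfile.TemporaryDirectory() as tmp_dir:
        # depth=1 is forwarded to `git clone --depth=1`; history is not needed.
        Repo.clone_from(url, tmp_dir, depth=1)
        for json_path in iter_json_files(tmp_dir):
            handle_file(json_path)
```

With this shape no OAuth token is needed in the script, and the clone is deleted automatically when the `with` block exits.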
scripts/get_cldr_data.py
redundant_keys = []
for key, value in json_dict.items():
    if not value:
        redundant_keys.append(key)
It would probably be more efficient to do this with filter(). I mean something like:
    def filter_func(keyvalue):
        key, value = keyvalue
        # etc... the conditions for filtering
        return bool(value and not value.isdigit())

    json_dict = dict(filter(filter_func, json_dict.items()))
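For comparison, the same cleanup can be written as a dict comprehension, which avoids building a separate `redundant_keys` list. This is a minimal sketch that assumes the only condition is "drop entries with an empty value"; `drop_empty_values`, `keep_entry` and the sample data are made up for illustration.

```python
def drop_empty_values(json_dict):
    """Return a new dict without entries whose value is empty or falsy."""
    return {key: value for key, value in json_dict.items() if value}


def keep_entry(keyvalue):
    """Predicate for filter(): keep (key, value) pairs with a non-empty value."""
    _key, value = keyvalue
    return bool(value)


# Illustrative data only; the real dicts come from the CLDR JSON files.
sample = {"january": ["January", "Jan"], "era": [], "am": "AM", "skip": ""}
cleaned = drop_empty_values(sample)

# The filter() form suggested in the review gives the same result.
assert cleaned == dict(filter(keep_entry, sample.items()))
```

Either form returns a new dict instead of mutating `json_dict` while iterating over it.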
scripts/get_cldr_data.py
json_dict["date_order"] = DATE_ORDER_PATTERN.sub(
    r'\1\2\3', DATE_ORDER_PATTERN.search(date_format_string).group())

json_dict["january"] = [gregorian_dict["months"]["stand-alone"]["wide"]["1"],
This dict creation part could be made a lot shorter with loops.
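One way the loop-based version might look is sketched below. It assumes the `months` -> context -> width -> month-number layout visible in the snippet above; which contexts and widths the script actually collects is not shown in this thread, so the defaults here are assumptions, and `month_translations` is a hypothetical helper name.

```python
MONTH_NAMES = [
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
]


def month_translations(gregorian_dict,
                       contexts=("stand-alone", "format"),
                       widths=("wide", "abbreviated")):
    """Build {"january": [...], ...} from a CLDR-style gregorian dict.

    The default contexts and widths are assumptions for illustration.
    """
    months = gregorian_dict["months"]
    result = {}
    for number, name in enumerate(MONTH_NAMES, start=1):
        forms = []
        for context in contexts:
            for width in widths:
                form = months[context][width][str(number)]
                if form not in forms:  # keep order, skip duplicate forms
                    forms.append(form)
        result[name] = forms
    return result
```

The result can be merged into the existing dict with `json_dict.update(month_translations(gregorian_dict))`, and the same pattern would apply to weekday names.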
Thanks for the review @asadurski. I have modified the scripts and they work much faster now.
…icient data, made necessary changes
… more tests for translation
Added the scripts used to retrieve and store the data in the desired format, along with the retrieved data. Support for numbering systems and numerals still needs to be added, and some other issues remain to be dealt with.