add support for Portuguese (Portugal) (PT) #198

Merged
erozqba merged 12 commits into savoirfairelinux:master from Olos:master on Oct 20, 2018

Conversation

Olos
Contributor

@Olos Olos commented Sep 21, 2018

Fixes # by...

Changes proposed in this pull request:

  • Added support for Portuguese from Portugal

Status

  • READY
  • HOLD
  • WIP (Work-In-Progress)

How to verify this change

Additional notes

  • I noticed there was already one pull request for this, but I changed a few more things relative to Brazilian Portuguese, namely:
  • 19 changes from 'dezEnove' to 'dezAnove'; the same logic applies to 16 and 17
  • The rules for "and" ("e") differ after some words: 1200 is "mil e duzentos", but 1210 is "mil duzentos e dez" (no "e" before "duzentos")
  • Uses the long scale instead of the short scale
  • Currency: adds "de" when "milhões" comes right before "euros"
  • Removed the comma separator (from pt_BR) for big composite numbers
  • First PR ever... be gentle
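
The currency rule in the list above (insert "de" when "milhões" comes right before "euros") can be illustrated with a small sketch. The helper name `join_currency` is hypothetical and is not part of num2words; treating the singular "milhão" the same way is an assumption added here for illustration.

```python
def join_currency(amount_words, unit="euros"):
    """Join a spelled-out amount with a currency unit, pt-PT style.

    Hypothetical helper for illustration only; not the num2words API.
    """
    words = amount_words.split()
    # "de" is inserted only when the amount ends in "milhão"/"milhões"
    # (handling the singular form is an assumption of this sketch).
    if words and words[-1] in ("milhão", "milhões"):
        return f"{amount_words} de {unit}"
    return f"{amount_words} {unit}"


print(join_currency("dois milhões"))
print(join_currency("mil e duzentos"))
```

With these inputs the first call adds "de" before the unit and the second does not.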

@coveralls

Coverage decreased (-4.4%) to 87.49% when pulling d1370be on Olos:master into c71d99c on savoirfairelinux:master.

.gitignore Outdated
@@ -4,3 +4,4 @@ dist
.idea/
*.egg-info
/.tox
venv/
Contributor

This should be part of your .git/info/exclude, not in tree.

Contributor Author

Oh sorry, I'll remove it


if self.MEGA_SUFFIX:
    self.cards[10 ** (n - 3)] = word + self.MEGA_SUFFIX
    if self.cards[10 ** (n - 3)] == 'milião':
Contributor

@shulcsm shulcsm Sep 21, 2018

This doesn't make any sense. If you are implementing set_high_numwords, just set the suffixes yourself; there is no need to use class props.

Contributor Author

I just used the set_high_numwords from lang_EU but I needed to correct the "milião".
Is it better if I correct it in merge? Like:

if ntext == 'milião':
    ntext = 'milhão'

@shulcsm
Contributor

shulcsm commented Sep 21, 2018

This seems really bad d1370be#diff-48c05b2e89d4eb8299b0f502c3707165R224
But is probably out of scope for this pull request.

removed venv/ from .gitignore.
correction from "milião" to "milhão" done on merge function.
removed hack for negword, creating a backup of the original one.
@Olos
Contributor Author

Olos commented Sep 24, 2018

Regarding the "hack", I did not understand whether I need to correct anything.
I can set and then restore negword in the to_cardinal function to avoid doing the replace, but I would rather not re-implement the whole function (unless I have to) just because of the space handling.
Thanks

@coveralls

coveralls commented Sep 24, 2018

Pull Request Test Coverage Report for Build 520

  • 91 of 96 (94.79%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.1%) to 89.808%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
num2words/lang_PT.py | 84 | 89 | 94.38%

Totals
Change from base Build 514: 0.1%
Covered Lines: 2529
Relevant Lines: 2816

💛 - Coveralls

Collaborator

@erozqba erozqba left a comment

@Olos Thanks a lot for taking the time to make this contribution.

There is a lot of duplicated code between pt and pt_BR. I don't know Portuguese, so I have two questions:

  • What are the differences in how pt and pt_BR write numbers?
  • If there are differences, could the pt_BR class be refactored so that it extends pt and we avoid code duplication?

Please also check https://github.com/savoirfairelinux/num2words/pull/99/files. It was a PR to add the same feature, but without any tests, so it was never merged.

.gitignore Outdated
@@ -3,4 +3,4 @@ build
dist
.idea/
*.egg-info
/.tox
Collaborator

Why did you remove the empty line at the end of the file?

Contributor Author

Must have been when I deleted the venv/ line.
Sorry. I will add it again.

@Olos
Contributor Author

Olos commented Sep 26, 2018

Regarding the differences between how pt-pt and pt-br write numbers:

  • In pt-pt, 10^9 is not written 'um bilião' but 'mil milhões' (we use the long scale instead of the short scale);
  • in pt-pt we do not write 19 ('dezEnove'), 17 ('dezEssete'), 16 ('dezEsseis') but 'dezAnove', 'dezAssete', 'dezAsseis';
  • in pt-pt we do not say 'bilhão', 'trilhão', etc. but 'bilião', 'trilião', etc.;
  • I did not put a comma after thousand, million, etc. as a separator (I have never used one in my life), but that is a detail;
  • in pt-br there is no 'e' connector in 200200 ('duzentos mil, duzentos'), but in pt-pt we say 'duzentos mil e duzentos' (pt-br only has it with 200100);
  • I have not had the opportunity to change the ordinal code (yet?); according to my research, we also say ordinals differently (in theory, because we are talking about big numbers here).

Do you think it can be extended given these differences?

If I analysed it correctly, the other pt-pt PR does not address the issues I listed above.
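
The long-scale versus short-scale difference listed above can be made concrete with a small lookup. This is only an illustrative sketch: the table and function names are hypothetical, and it does not reflect how num2words builds its number words internally.

```python
# Illustrative tables only; names are hypothetical, not num2words internals.
SHORT_SCALE_PT_BR = {
    10**6: "um milhão",
    10**9: "um bilhão",     # short scale: 10**9 gets its own word
    10**12: "um trilhão",
}
LONG_SCALE_PT = {
    10**6: "um milhão",
    10**9: "mil milhões",   # long scale: a thousand millions
    10**12: "um bilião",    # "bilião" only appears at 10**12
}


def name_power(value, variant="pt"):
    """Return the spelled-out name of a power of ten for the given variant."""
    table = LONG_SCALE_PT if variant == "pt" else SHORT_SCALE_PT_BR
    return table[value]


print(name_power(10**9, "pt"))
print(name_power(10**9, "pt_BR"))
```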

@shulcsm
Contributor

shulcsm commented Sep 26, 2018

@Olos You don't necessarily have to use the pt-pt class as the inheritance base; just refactor so you can reuse what is common and get rid of the obvious duplication.

@Olos
Contributor Author

Olos commented Sep 26, 2018

@shulcsm @erozqba
I understand that but in this case:

  • setup has (?) to be different
  • merge has some different specifications so I don't see how it can be done (not saying it can't be done, just that I don't see it)
  • same for to_cardinal: the separator is different; the big numbers are different; the regex is different
  • to_currency uses the base class implementation (so already refactored?)

If my previous assumptions are correct I can use pt-br for:

  • to_ordinal
  • to_ordinal_num
  • to_year

As they are all the same verbatim.

Is this it?

Thank you.

@shulcsm
Contributor

shulcsm commented Sep 26, 2018

@Olos yes, but you keep redefining huge dictionaries in it. For a start, move them into module constants and import them in your implementation.
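
One way to read the suggestion above is to keep the common entries in a single module-level constant and let each variant override only what differs. The sketch below uses hypothetical names and a deliberately tiny word table; in a real layout the shared constant would sit in one module and be imported by both language modules.

```python
# Hypothetical shared constant: entries that are identical in both variants.
COMMON_WORDS = {10: "dez", 11: "onze", 12: "doze", 18: "dezoito"}

# Each variant extends the shared table with only the entries that differ.
PT_BR_WORDS = {**COMMON_WORDS, 16: "dezesseis", 17: "dezessete", 19: "dezenove"}
PT_WORDS = {**COMMON_WORDS, 16: "dezasseis", 17: "dezassete", 19: "dezanove"}

print(PT_WORDS[19], PT_BR_WORDS[19])
```

This keeps the large dictionaries defined once, so a correction to a shared entry does not have to be repeated per variant.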

@erozqba erozqba mentioned this pull request Oct 3, 2018
3 tasks
@Olos
Contributor Author

Olos commented Oct 10, 2018

Sorry for not updating this yet, but some things got in the way.
When I have a little time, I'll define the dictionaries outside both PT language modules and import them into each one (with the changes).
Should I do the same with the to_ordinal function, or does that make no sense?
Thank you

@Olos
Contributor Author

Olos commented Oct 10, 2018

Could I just use PT_pt as the base for PT_br? I know pt-br was created first here, but it makes more sense (at least to me) to do it this way.
I would change the dictionary, delete the ordinal parts, and keep merge, to_cardinal (calling Num2Word_EU.to_cardinal instead of super) and to_currency.
Does that make sense?

@erozqba
Collaborator

erozqba commented Oct 11, 2018

I'm OK with it @Olos, go ahead unless someone else has an objection.

@Olos
Contributor Author

Olos commented Oct 11, 2018

@erozqba
Changed it.
Waiting on feedback!

erozqba
erozqba previously approved these changes Oct 13, 2018
Collaborator

@erozqba erozqba left a comment

@Olos thanks a lot! This looks good to me.
I have one question, could you clarify?

@@ -132,7 +72,7 @@ def merge(self, curr, next):
         return (ctext + ntext, cnum * nnum)

     def to_cardinal(self, value):
-        result = super(Num2Word_PT_BR, self).to_cardinal(value)
+        result = lang_PT.Num2Word_EU.to_cardinal(self, value)

Collaborator

Is it really better to use the to_cardinal() implementation on Num2Word_EU instead of the one on Num2Word_PT? Are the cardinal numbers that different between pt and pt_BR?

Contributor Author

Both of them use the Num2Word_EU implementation and then change the output based on a regex.
I kept the original regex on pt_BR, but the regex on pt is a little different.
For everything to keep working, I thought it was better to continue using Num2Word_EU for both of them.
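
The pattern described above, where a subclass calls the shared base implementation directly so that its parent's post-processing is skipped, can be sketched with toy classes. `BaseEU`, `PT` and `PT_BR` below are hypothetical stand-ins rather than the actual num2words classes, and the bracketed suffixes stand in for each variant's regex step.

```python
class BaseEU:
    """Toy stand-in for the shared base class (hypothetical)."""

    def to_cardinal(self, value):
        return "base(%d)" % value


class PT(BaseEU):
    def to_cardinal(self, value):
        result = super().to_cardinal(value)
        # Placeholder for the pt-specific regex post-processing.
        return result + " [pt rules]"


class PT_BR(PT):
    def to_cardinal(self, value):
        # Call the grandparent explicitly so PT's post-processing is bypassed;
        # super() here would run PT.to_cardinal and apply the pt rules first.
        result = BaseEU.to_cardinal(self, value)
        # Placeholder for the pt_BR-specific regex post-processing.
        return result + " [pt_BR rules]"


print(PT().to_cardinal(5))
print(PT_BR().to_cardinal(5))
```

The trade-off is that `PT_BR` inherits everything else from `PT` while keeping its own cardinal post-processing independent of the parent's.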

Collaborator

@Olos that works for me then! I see you still have the PR marked as Work In Progress. If you have finished, please change the status. Also, I would really appreciate it if you could check the lines the tests do not cover and add some tests for them, so we bring coverage as high as possible: https://coveralls.io/builds/19469219/source?filename=num2words%2Flang_PT.py#L134.

Contributor Author

@erozqba Added new tests for those lines.

@coveralls

coveralls commented Oct 19, 2018

Pull Request Test Coverage Report for Build 546

  • 94 of 96 (97.92%) changed or added relevant lines in 3 files are covered.
  • 4 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.2%) to 90.291%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
num2words/lang_PT.py | 87 | 89 | 97.75%

Files with Coverage Reduction | New Missed Lines | %
num2words/compat.py | 4 | 100.0%

Totals
Change from base Build 545: 0.2%
Covered Lines: 2604
Relevant Lines: 2884

💛 - Coveralls

@erozqba erozqba merged commit 0f63859 into savoirfairelinux:master Oct 20, 2018
@erozqba erozqba mentioned this pull request Oct 25, 2018
4 participants