Skip to content

Conversation

nxshock
Copy link

@nxshock nxshock commented Apr 8, 2018

No description provided.

@data-man
Copy link
Contributor

data-man commented Apr 8, 2018

In your PR, only ASCII digits are added. It isn't correct.

proc isDigit*(c: Rune): bool {.rtl, extern: "nuc$1", procvar.} =
## Returns true iff ``c`` is a digit character
var c = RuneImpl(c)
return c in digitRanges
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like I can't check for digit like this... I need to rewrite this line.

@data-man
Copy link
Contributor

data-man commented Apr 9, 2018

@nxshock

Which Unicode version you used?
Version 10.0 contains 590 characters (in Nd category).

See also nim-unicodedb and nim-unicodeplus libraries.
nim-unicodeplus contains these procs:

isDecimal
isDigit
isNumeric

Just for compatibility reasons.

@nxshock
Copy link
Author

nxshock commented Apr 9, 2018

digitRanges contains 590 characters (59 groups * 10 chars) from Unicode 10.

What kind of compatibility you are mean? As described, nim-unicodeplus is not drop-in replacement, so it must be used like this:

import unicode except isDigit # like README.md example
import unicodeplus

I think if we have Rune.isAlpha, we need to have Rune.isDigit in std lib. :)

@data-man
Copy link
Contributor

data-man commented Apr 9, 2018

@nxshock

I mean that it would be nice to have isDecimal and isNumeric. :)

@nxshock
Copy link
Author

nxshock commented Apr 9, 2018

You are right, but unfortunately I'm neither Unicode guru nor English guru.

As described here, there are:

  • 865 numeric chars
  • 128 digit chars
  • 590 decimal chars

Should we use same naming rules or not? If yes, I just implemented Rune.isDecimal, not Rune.isDigit...

@data-man
Copy link
Contributor

data-man commented Apr 9, 2018

@nxshock

You implemented checking for a decimal digits.
So isDecimalDigit will be better name, IMO.

LGTM. Let's wait for other opinions.

nor English guru

Me too :).

Some useful links:
Category 'Number, Decimal Digit'
Category 'Number, Letter'
Category 'Number, Other'

@nxshock nxshock changed the title Add unicode.Rune.isDigit proc Add unicode.Rune.isNumeric, isDigit, isDecimal Apr 9, 2018
@Varriount
Copy link
Contributor

The only thing I have to add to the table is that it looks like all those new procedures could share a common template implementation.

@Araq
Copy link
Member

Araq commented Apr 13, 2018

Nice job, but all of these tables need to be generated from the Unicode spec. Otherwise it's a maintenance burden. I know the current code does it this way too, but I don't want to add more of the same bad stuff.

@ghost
Copy link

ghost commented Apr 13, 2018

Like this library - https://github.com/nitely/nim-unicodedb ? It generates all info from UCD

@Araq
Copy link
Member

Araq commented Apr 13, 2018

@Yardanico Probably.

@Araq
Copy link
Member

Araq commented Oct 9, 2018

No feedback, closing.

@Araq Araq closed this Oct 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants