expand set of codepoints returned #21

Closed
Ichimonji10 opened this issue May 16, 2014 · 6 comments

@Ichimonji10
Contributor

When asked to generate UTF8 characters, the FauxFactory.generate_string class method returns only characters in the CJK character set. Although CJK characters are valid UTF8 characters, they do not cover the entire range of valid UTF8 characters. By the same logic, generate_string could return only ASCII characters when asked for UTF8 characters and, because ASCII is a subset of UTF8, it would technically be acting correctly.

It would be better if the generate_string method returned a fuller set of UTF8 characters when asked to generate UTF8 characters.

The lower limit of valid UTF8 code points is 0x0, and I'm not sure what the upper limit is. According to RFC 3629, the surrogate code points 0xD800–0xDFFF are off-limits, and the byte values 0xC0, 0xC1 and 0xF5–0xFF never appear in a valid UTF8 byte sequence.
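
The RFC 3629 restrictions can be checked with Python's built-in codecs. The following is only a sketch, assuming Python 3; `is_encodable` and `contains_forbidden_byte` are hypothetical helper names and are not part of fauxfactory.

```python
# Byte values that RFC 3629 says never appear in a UTF-8 byte sequence.
FORBIDDEN_BYTES = {0xC0, 0xC1} | set(range(0xF5, 0x100))


def is_encodable(code_point):
    """Return True if the code point can be encoded as UTF-8."""
    try:
        chr(code_point).encode('utf-8')
        return True
    except UnicodeEncodeError:
        # Raised for the surrogate code points 0xD800-0xDFFF.
        return False


def contains_forbidden_byte(text):
    """Return True if the UTF-8 encoding of text uses a forbidden byte."""
    return any(byte in FORBIDDEN_BYTES for byte in text.encode('utf-8'))
```

Note that a code point such as 0xC0 is still perfectly valid; only the raw byte 0xC0 is excluded from encoded output.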

@Ichimonji10
Contributor Author

The upper limit (inclusive) of UTF8 is 0x10FFFF.
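
With both limits known, drawing characters from the whole range could look roughly like the sketch below. It assumes Python 3, and `random_utf8_string` is a hypothetical name for illustration, not the actual generate_string implementation. It only skips the surrogate range and makes no attempt to filter unassigned or control code points.

```python
import random

MAX_CODE_POINT = 0x10FFFF          # inclusive upper limit
SURROGATES = range(0xD800, 0xE000)  # 0xD800-0xDFFF, not encodable as UTF-8


def random_utf8_string(length, rng=random):
    """Return a string of `length` random, UTF-8-encodable code points."""
    chars = []
    while len(chars) < length:
        code_point = rng.randint(0, MAX_CODE_POINT)
        if code_point in SURROGATES:
            continue  # reject and draw again
        chars.append(chr(code_point))
    return ''.join(chars)
```

For example, `random_utf8_string(10).encode('utf-8')` should always succeed, because surrogates are rejected before the string is built.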

@omaciel
Owner

omaciel commented May 17, 2014

The tricky part is determining all the valid ranges and avoiding characters that are not "real", but I totally agree with you that generate_string can ultimately be improved. I need to spend a bit more time researching the topic so that I can take a stab at it.

@Ichimonji10
Contributor Author

I'm working on this. You'll have a pull request soon.

@omaciel
Owner

omaciel commented May 17, 2014

Sweet!!!

@Ichimonji10
Contributor Author

The current pull request fixes this issue.

@omaciel
Owner

omaciel commented May 17, 2014

Merged! You should update README.rst, HISTORY.rst and add yourself to AUTHORS.rst :)

@omaciel closed this as completed May 17, 2014