Add unicode letters generator #70

Merged (1 commit) on Feb 16, 2015

@elyezer elyezer commented Feb 13, 2015

This generator is a helper for the gen_utf8 function; it provides the
list of unicode letters supported by the system. This avoids generating
unicode strings with control characters and other non-letter characters.

Also adds tests for the generator in order to ensure it is not
generating unwanted characters.

Closes #69
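
A minimal sketch of what such a letters generator might look like, assuming Python 3 and the standard `unicodedata` module. The name `iter_unicode_letters` and its `limit` parameter are hypothetical and chosen for illustration; this is not necessarily the code merged in this pull request.

```python
import sys
import unicodedata


def iter_unicode_letters(limit=sys.maxunicode):
    """Yield each character up to ``limit`` whose General Category is a letter.

    Hypothetical helper: letter categories (Lu, Ll, Lt, Lm, Lo) all start
    with "L", so control characters, digits and symbols are skipped.
    """
    for code_point in range(limit + 1):
        char = chr(code_point)
        if unicodedata.category(char).startswith("L"):
            yield char


# Restrict to Latin-1 here only to keep the example small.
letters = list(iter_unicode_letters(0xFF))
```

A function like gen_utf8 could then draw random characters from such a list instead of from the full code point range.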

@coveralls

Coverage Status

Coverage decreased (-2.37%) to 97.33% when pulling 4178f1b on elyezer:unicode-letters into 3ca13df on omaciel:master.

"""
if sys.version_info.major == 2:
    chr_function = unichr
    range_function = xrange

Contributor review comment:

chr_function = unichr  # pylint:disable=undefined-variable
range_function = xrange  # pylint:disable=undefined-variable
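
For context, the version switch with the suggested pylint markers could be completed roughly as follows. The `else` branch is an assumption, since it is not visible in the excerpt above; `first_letters` is only a hypothetical usage check.

```python
import sys

if sys.version_info.major == 2:
    # These names exist only on Python 2, hence the pylint markers.
    chr_function = unichr  # pylint:disable=undefined-variable
    range_function = xrange  # pylint:disable=undefined-variable
else:
    # Assumed Python 3 branch (not shown in the diff excerpt).
    chr_function = chr
    range_function = range

# Both aliases behave the same way on either major version.
first_letters = [chr_function(i) for i in range_function(97, 100)]
```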

@Ichimonji10
Contributor

ACK pending comments

@coveralls

Coverage Status

Coverage increased (+0.0%) to 99.7% when pulling 204485f on elyezer:unicode-letters into 3ca13df on omaciel:master.

@Ichimonji10
Contributor

Unicode Standard Annex 44: Unicode Character Database, section 5.5.1 General Category Values explains the meaning of values such as "Lu" or "No".
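
As a quick illustration of those values, Python's standard `unicodedata.category` function returns the two-letter General Category code for a character. The sample characters below are chosen for this example and do not come from the pull request.

```python
import unicodedata

samples = ["A", "a", "\u00bd", "5", "$", "\n"]
categories = {char: unicodedata.category(char) for char in samples}

# "A"      -> "Lu" (Uppercase_Letter)
# "a"      -> "Ll" (Lowercase_Letter)
# "\u00bd" -> "No" (Other_Number, the vulgar fraction one half)
# "5"      -> "Nd" (Decimal_Number)
# "$"      -> "Sc" (Currency_Symbol)
# "\n"     -> "Cc" (Control)
for char, category in categories.items():
    print(repr(char), category)
```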

@Ichimonji10
Contributor

Do you think the types of characters emitted by gen_utf8 are appropriate? Character categories such as "Decimal_Number" and "Currency_Symbol" are omitted. And, as Ke$ha demonstrates, characters from those categories actually are used in some weird places. We can make a judgment call and omit them. But in that case, the docstring for gen_utf8 should be updated appropriately.
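
To make the trade-off concrete, here is a small sketch of which characters a letters-only filter would leave out of a given string. The helper name `non_letters` is hypothetical.

```python
import unicodedata


def non_letters(text):
    """Return (character, category) pairs for characters that are not letters."""
    return [
        (char, unicodedata.category(char))
        for char in text
        if not unicodedata.category(char).startswith("L")
    ]


# The dollar sign is a Currency_Symbol ("Sc"), so a letters-only
# generator can never produce a name like this one.
rejected = non_letters("Ke$ha")  # [('$', 'Sc')]
```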

@elyezer
Contributor Author

elyezer commented Feb 13, 2015

@Ichimonji10 I have updated the docstring. For testing purposes I think letters alone should suffice, as we will probably get at least one UTF-8 glyph of 2 or more bytes. We can certainly add more categories, but I think it is better to "stay safe": a single multi-byte character is enough to tell whether the system is capable of handling UTF-8 strings.
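
The multi-byte point can be checked directly by encoding characters to UTF-8 and counting bytes. This is only an illustrative sketch; `utf8_length` is a hypothetical helper and the sample letters are arbitrary.

```python
def utf8_length(char):
    """Return the number of bytes needed to encode ``char`` as UTF-8."""
    return len(char.encode("utf-8"))


# An ASCII letter, a Latin-1 letter (e with acute) and a CJK ideograph.
lengths = {char: utf8_length(char) for char in ("a", "\u00e9", "\u65e5")}
# {'a': 1, '\u00e9': 2, '\u65e5': 3}

# A generated string exercises multi-byte handling as soon as it
# contains at least one character longer than one byte.
has_multibyte = any(length > 1 for length in lengths.values())
```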

@coveralls

Coverage Status

Coverage increased (+0.0%) to 99.7% when pulling 8bbf290 on elyezer:unicode-letters into 3ca13df on omaciel:master.

@elyezer
Contributor Author

elyezer commented Feb 13, 2015

On my system I got a total of 48270 unicode letters. That is 73.66% of the maxunicode value, 65535.
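
A rough way to reproduce this kind of count, assuming Python 3. The result depends on the Unicode database version bundled with the interpreter, so it may differ from 48270. Note also that Python 3.3 and later always report `sys.maxunicode` as 1114111, so the example pins the limit to 0xFFFF to mirror the value quoted above. `count_letters` is a hypothetical helper.

```python
import unicodedata


def count_letters(limit):
    """Count code points in [0, limit] whose General Category is a letter."""
    return sum(
        1
        for code_point in range(limit + 1)
        if unicodedata.category(chr(code_point)).startswith("L")
    )


# Letters within the Basic Multilingual Plane (value varies by Unicode version).
bmp_letters = count_letters(0xFFFF)

# The share quoted in the comment above: 48270 out of 65535.
reported_share = round(48270 / 65535 * 100, 2)  # 73.66
```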

@Ichimonji10
Contributor

ACK

@elyezer
Contributor Author

elyezer commented Feb 15, 2015

I have done some practical testing; here are the results.

With these changes:

$ py.test tests/foreman/api/test_user.py -k test_create
================================ test session starts ================================
platform darwin -- Python 2.7.6 -- py-1.4.26 -- pytest-2.6.4
collected 23 items

tests/foreman/api/test_user.py ...

====================== 20 tests deselected by '-ktest_create' =======================
===================== 3 passed, 20 deselected in 13.24 seconds ======================

$ py.test tests/foreman/api/test_user.py -k test_create
================================ test session starts ================================
platform darwin -- Python 2.7.6 -- py-1.4.26 -- pytest-2.6.4
collected 23 items

tests/foreman/api/test_user.py ...

====================== 20 tests deselected by '-ktest_create' =======================
====================== 3 passed, 20 deselected in 7.42 seconds ======================

$ py.test tests/foreman/api/test_user.py -k test_create
================================ test session starts ================================
platform darwin -- Python 2.7.6 -- py-1.4.26 -- pytest-2.6.4
collected 23 items

tests/foreman/api/test_user.py ...

====================== 20 tests deselected by '-ktest_create' =======================
====================== 3 passed, 20 deselected in 8.17 seconds ======================

$ py.test tests/foreman/api/test_user.py -k test_create
================================ test session starts ================================
platform darwin -- Python 2.7.6 -- py-1.4.26 -- pytest-2.6.4
collected 23 items

tests/foreman/api/test_user.py ...

====================== 20 tests deselected by '-ktest_create' =======================
===================== 3 passed, 20 deselected in 13.62 seconds ======================

$ py.test tests/foreman/api/test_user.py -k test_create
================================ test session starts ================================
platform darwin -- Python 2.7.6 -- py-1.4.26 -- pytest-2.6.4
collected 23 items

tests/foreman/api/test_user.py ...

====================== 20 tests deselected by '-ktest_create' =======================
===================== 3 passed, 20 deselected in 28.24 seconds ======================

Without the changes:

$ py.test tests/foreman/api/test_user.py -k test_create
================================ test session starts ================================
platform darwin -- Python 2.7.6 -- py-1.4.26 -- pytest-2.6.4
collected 23 items

tests/foreman/api/test_user.py FFF

===================================== FAILURES ======================================
2015-02-15 11:39:37 - nailgun.client - WARNING - Received HTTP 422 response: {
  "error": {"id":null,"errors":{"firstname":["is invalid"]},"full_messages":["First name is invalid"]}
}

2015-02-15 11:39:38 - nailgun.client - WARNING - Received HTTP 422 response: {
  "error": {"id":null,"errors":{"lastname":["is invalid"]},"full_messages":["Surname is invalid"]}
}

2015-02-15 11:39:39 - nailgun.client - WARNING - Received HTTP 422 response: {
  "error": {"id":null,"errors":{"login":["is invalid"]},"full_messages":["Username is invalid"]}
}

====================== 20 tests deselected by '-ktest_create' =======================
====================== 3 failed, 20 deselected in 4.93 seconds ======================

$ py.test tests/foreman/api/test_user.py -k test_create
================================ test session starts ================================
platform darwin -- Python 2.7.6 -- py-1.4.26 -- pytest-2.6.4
collected 23 items

tests/foreman/api/test_user.py FFF

===================================== FAILURES ======================================
2015-02-15 11:39:42 - nailgun.client - WARNING - Received HTTP 422 response: {
  "error": {"id":null,"errors":{"firstname":["is invalid"]},"full_messages":["First name is invalid"]}
}

2015-02-15 11:39:43 - nailgun.client - WARNING - Received HTTP 422 response: {
  "error": {"id":null,"errors":{"lastname":["is invalid"]},"full_messages":["Surname is invalid"]}
}

2015-02-15 11:39:44 - nailgun.client - WARNING - Received HTTP 422 response: {
  "error": {"id":null,"errors":{"login":["is invalid"]},"full_messages":["Username is invalid"]}
}

====================== 20 tests deselected by '-ktest_create' =======================
====================== 3 failed, 20 deselected in 3.81 seconds ======================

$ py.test tests/foreman/api/test_user.py -k test_create
================================ test session starts ================================
platform darwin -- Python 2.7.6 -- py-1.4.26 -- pytest-2.6.4
collected 23 items

tests/foreman/api/test_user.py FFF

===================================== FAILURES ======================================
2015-02-15 11:39:47 - nailgun.client - WARNING - Received HTTP 422 response: {
  "error": {"id":null,"errors":{"firstname":["is invalid"]},"full_messages":["First name is invalid"]}
}

2015-02-15 11:39:48 - nailgun.client - WARNING - Received HTTP 422 response: {
  "error": {"id":null,"errors":{"lastname":["is invalid"]},"full_messages":["Surname is invalid"]}
}

2015-02-15 11:39:49 - nailgun.client - WARNING - Received HTTP 422 response: {
  "error": {"id":null,"errors":{"login":["is invalid"]},"full_messages":["Username is invalid"]}
}

====================== 20 tests deselected by '-ktest_create' =======================
====================== 3 failed, 20 deselected in 3.47 seconds ======================

For this test I commented out all non-UTF-8 test data. I also trimmed most of the failure output, leaving just the "is invalid" messages, and ran the tests more than once to get different random values.

@omaciel
Owner

omaciel commented Feb 16, 2015

ACK

omaciel added a commit that referenced this pull request Feb 16, 2015
Add unicode letters generator
@omaciel omaciel merged commit 7af06b3 into omaciel:master Feb 16, 2015
@elyezer elyezer deleted the unicode-letters branch February 24, 2015 18:27
Successfully merging this pull request may close these issues.

Random UTF-8 string generation must generate valid string
4 participants