-
Notifications
You must be signed in to change notification settings - Fork 558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xsd:hexBinary added to XSDToPython #386
Conversation
The test-failure is probably unrelated, but what is the use of this? if we should support hexBinary, should not have some way of "serializing/unserializing" hex-strings to bytes/str objects? |
Good question. My use case is assigning a RSA modulus (like this one http://www.w3.org/wiki/WebID#Adding_a_certificate). The status quo of using the same string in Python and in RDF is good for this use case. modulus = 'b0dfc17e0969569a813d97b3bc42b434b35ecdf45e786d6db1462604f2f8d800b4d61b56c494cf02667dccdb2a227f259b3b5d99b50553d50a13328b7a34a46f0d1d68713ddfef6ee02da87db6ee794781cddcb61946943359b23c1ce15a7eb10f4dedf1e00ed2ee62986c42bd1ba6bc9174d2869ac78801b7c2b1dc8a1f73ac99f94aa0ae5bbcb7494dc3ff15af01fdc92b98574854518bbc58601c56fc62771f843acdd495e602875729ad0813485620ede2d83050e3607d9a92b47c67c8adf341b5ab0b9926388c33dd3c60117b8aebea46f655f049e871acfbac57aec61f6a66b250ae8e9e05aba2b3a8379012122a5b6e5310d26ce1bc3e2a4c80168aa5'
assert(modulus == format(int(modulus, 16), "x")
>>> True than the following representation which starts with '0x': print hex(int(modulus, 16))
>>> '0xb0dfc17e0969569a813d97b3bc42b434b35ecdf45e786d6db1462604f2f8d800b4d61b56c494cf02667dccdb2a227f259b3b5d99b50553d50a13328b7a34a46f0d1d68713ddfef6ee02da87db6ee794781cddcb61946943359b23c1ce15a7eb10f4dedf1e00ed2ee62986c42bd1ba6bc9174d2869ac78801b7c2b1dc8a1f73ac99f94aa0ae5bbcb7494dc3ff15af01fdc92b98574854518bbc58601c56fc62771f843acdd495e602875729ad0813485620ede2d83050e3607d9a92b47c67c8adf341b5ab0b9926388c33dd3c60117b8aebea46f655f049e871acfbac57aec61f6a66b250ae8e9e05aba2b3a8379012122a5b6e5310d26ce1bc3e2a4c80168aa5L' NB: In Python 2.7.6 there is a final L character that makes this string be an invalid hexadecimal value. modulus2 = hex(int(modulus, 16))
int(modulus2, 16)
>>>ValueError: invalid literal for int() with base 16: In this use case, I have no interest in using the decimal value. In my patch, it returns a unicode, not a str like hex() and format(). Do you think this is a problem? I am not used to use bytes(). What would be its interest? This patch only concerns the translation XSD -> Python. In the other direction, it could be interesting to prevent users from serializing xsd:hexBinary values starting with "0x", which is not valid. |
nb: restarted the travis job, passed in my opinion there are two ways to treat this: hexbinary means that it's actually a hex representation of some binary, so we could either return those bytes (after a call to binascii.unhexlify ) or we could just return a hexstring of those bytes. l = Literal('6ac3b6726e', datatype=XSD.hexBinary)
l.toPython() == 'j\xc3\xb6rn' == u'jörn'.encode('utf-8') # this would be bytes
l.toPython() == '6ac3b6726e' # this would be hexstring I'd favor of bytes option as otherwise one could just call In [3]: str(l)
Out[3]: '6ac3b6726e'
In [4]: unicode(l)
Out[4]: u'6ac3b6726e'
In [5]: binascii.unhexlify(str(l))
Out[5]: 'j\xc3\xb6rn' we should have a testcase for this as well... |
Which version of Python did you used? I cannot reproduce your results for str(l) and unicode(l) in Python 2.7.6 and Python 3.4.0. In 2.7.6: l = u'jörn'.encode('utf-8')
str(l)
>>> 'j\xc3\xb6rn'
unicode(l)
>>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128) In 3.4.0: l = u'jörn'.encode('utf-8')
l
>>> b'j\xc3\xb6rn'
str(l)
>>> "b'j\\xc3\\xb6rn'" I would argue in favor of returning a hexstring of those bytes.
value = '6ac3b6726e'
l = Literal(value, datatype=XSD.hexBinary)
l.toPython() == value
>>> True |
i used python 2.7.6 but your problem is that you defined my point is: with your code, why not just use value = '6ac3b6726e'
l = Literal(value, datatype=XSD.hexBinary)
str(l) == value
>>> True |
Oh, sorry for that. My preference in using I started to prototype an ORM (Object RDF Mapper) that use this generic method. See for instance: Am I using toPython() correctly or am I expecting something unintended? Here my test example using a hexstring value: |
Thank you for your comments, you convince me. I have opened a new pull request (#388) where I propose an implementation for the bytes option. I will close this pull request. |
Literal("b234", datatype=(XSD + "hexBinary")).toPython() now returns a string and not a literal.