Quench nt test userwarn #1500

ghost · 2021-12-12T17:57:16Z

Fed up with seeing Serializer UserWarnings in the test output. Is the latin-1 encoding testing anything specific? Can it be simply swapped out for utf-8 or am I missing something? (I'm not strong on the parse/serializer side of things).

aucampia · 2021-12-13T15:35:48Z

> Can it be simply swapped out for utf-8 or am I missing something? (I'm not strong on the parse/serializer side of things).

If we are not specifically testing character set support and the handling of invalid charsets, then I think a valid charset (e.g. utf-8 in this case) is more appropriate than charsets that cause warnings (e.g. ascii/latin-1 in this case). So I think this change makes a lot of sense.

There should be tests that check how invalid charsets are handled, but they should live in a separate place and not be mixed in with tests for something different, and they could (should?) even go as far as expecting the warnings. I also think we should rather error out in some of these cases, especially if the internal handling is to just use utf-8. I guess there are different approaches here, but I think erroring out when an unsupported charset is supplied will result in fewer surprises for users than a warning would. This is something we can consider as a separate issue, though, and potentially discuss for a future major version, as it will break compatibility and warrant a major version bump.
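To make the two options concrete, here is a minimal sketch of a dedicated test that expects the warning, plus a strict mode that errors out instead. Note that `serialize_nt` and its `strict` flag are hypothetical stand-ins written for illustration only; they are not RDFLib's API, and the warning text is an assumption.

```python
import warnings


def serialize_nt(lines, encoding="utf-8", strict=False):
    """Hypothetical stand-in for an N-Triples serializer (not RDFLib's API)."""
    if encoding != "utf-8":
        message = f"serializer always uses UTF-8; given encoding was: {encoding}"
        if strict:
            # The "error out" option: refuse the unsupported charset outright.
            raise ValueError(message)
        # The warning option: tell the caller, then fall back to UTF-8.
        warnings.warn(message, UserWarning)
    return "\n".join(lines).encode("utf-8")


def test_unsupported_charset_warns():
    # Capture the warning so it is asserted on rather than leaked into test output.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        data = serialize_nt(["<urn:a> <urn:b> <urn:c> ."], encoding="latin-1")
    assert data == b"<urn:a> <urn:b> <urn:c> ."
    assert len(caught) == 1
    assert issubclass(caught[0].category, UserWarning)


def test_unsupported_charset_errors_in_strict_mode():
    try:
        serialize_nt(["<urn:a> <urn:b> <urn:c> ."], encoding="latin-1", strict=True)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for an unsupported charset")
```

With pytest, the first test could equally use `pytest.warns(UserWarning)`; the standard-library form is shown here so the sketch has no dependencies.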

I made a bunch of round trip tests that I still have to break off from #1418 that cover character sets, and tests for handling of invalid charsets could maybe go into something like that.

nicholascar · 2021-12-15T11:45:21Z

In all the 1.1 versions of the W3C Specifications we see UTF-8 mandated, e.g.:

N-Triples: https://www.w3.org/TR/n-triples/#h2_n-triples-changes
Turtle: https://www.w3.org/TR/turtle/#h3_sec-mime

So I think we can always use UTF-8. I wondered if there were any good reasons this wasn't already the case but presumably it's just for the historical reasons of different character support in Python < 3 and perhaps in W3C Specs < 1.1.
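As a small illustration of why UTF-8 is the safe default, the sketch below encodes an N-Triples-style line containing characters outside Latin-1. The sample triple is made up for the example; only standard-library string methods are used.

```python
# A made-up N-Triples-style line whose literal contains "é" and "€".
text = '<urn:s> <urn:p> "caf\u00e9 \u20ac" .'

# UTF-8 can encode every Unicode character and round-trips losslessly.
utf8_bytes = text.encode("utf-8")
round_trips = utf8_bytes.decode("utf-8") == text

# Latin-1 has no code point for "€" (U+20AC), so encoding fails.
try:
    text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
```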

Graham Higgins added 2 commits December 12, 2021 17:39

modernize warning string construction

404e1be

provide explicit serialization format, quench test warnings

41c07fa

aucampia approved these changes Dec 12, 2021

Merge branch 'RDFLib:master' into quench-nt-test-userwarn

7685ad0

Merge remote-tracking branch 'origin/master' into quench-nt-test-userwarn

6b4ada1

nicholascar approved these changes Dec 15, 2021

nicholascar merged commit eb151d3 into RDFLib:master Dec 15, 2021

ghost deleted the quench-nt-test-userwarn branch December 28, 2021 12:36
