fix for issue 1484 raised and solved by Graham Klyne: #1490

Merged
merged 2 commits into from Dec 7, 2021
15 changes: 11 additions & 4 deletions rdflib/plugins/shared/jsonld/util.py
@@ -17,26 +17,33 @@

 from urllib.parse import urljoin, urlsplit, urlunsplit

-from rdflib.parser import create_input_source, PythonInputSource
+from rdflib.parser import create_input_source, PythonInputSource, StringInputSource

-from io import StringIO
+from io import TextIOBase, TextIOWrapper


 def source_to_json(source):

     if isinstance(source, PythonInputSource):
         return source.data

+    if isinstance(source, StringInputSource):
+        return json.load(source.getCharacterStream())
+
     # TODO: conneg for JSON (fix support in rdflib's URLInputSource!)
     source = create_input_source(source, format="json-ld")

     stream = source.getByteStream()
     try:
-        return json.load(StringIO(stream.read().decode("utf-8")))
+        # Use character stream as-is, or interpret byte stream as UTF-8
+        if isinstance(stream, TextIOBase):
Review comment (Member):

I'm actually unsure if this condition will be true given this code from rdflib.parser.FileInputSource:

rdflib/rdflib/parser.py, lines 239 to 244 in 9379a69:

    if isinstance(file, TextIOBase):  # Python3 unicode fp
        self.setCharacterStream(file)
        self.setEncoding(file.encoding)
        try:
            b = file.buffer
            self.setByteStream(b)

@gklyne, Dec 4, 2021:

FWIW, I was trying to home in on the error triggered by parsing the output of json.dump: in that case the stream returned by source.getByteStream() is a character stream. I agree that does not seem quite right, and the problem may lie deeper in the logic that works out which data stream to parse.

But I was also trying to identify a minimal fix that could solve my immediate problem. Hopefully if someone chooses to delve deeper into the create_input_source logic, the test cases provided will still be helpful.
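
The exchange above turns on how a text stream can end up where a byte stream is expected. As a minimal standard-library sketch (not rdflib's code; the helper name `byte_stream_or_self` is hypothetical), the snippet below mimics the quoted `file.buffer` lookup. It assumes a fallback to the text object itself when no underlying buffer exists, which is an assumption about the surrounding logic rather than something shown in the quoted lines:

```python
import io


def byte_stream_or_self(file):
    """Hypothetical helper: prefer the underlying binary buffer of a text
    stream, and fall back to the stream itself when there is none.
    The fallback is an assumption for illustration, not rdflib's code."""
    if isinstance(file, io.TextIOBase):
        try:
            return file.buffer
        except AttributeError:
            return file
    return file


# A text wrapper around bytes exposes its binary buffer.
raw = io.BytesIO(b'{"a": 1}')
wrapped = io.TextIOWrapper(raw, encoding="utf-8")
wrapped_result = byte_stream_or_self(wrapped)

# An in-memory text stream has no underlying buffer at all.
text = io.StringIO('{"a": 1}')
text_result = byte_stream_or_self(text)

print(isinstance(wrapped_result, io.TextIOBase))
print(isinstance(text_result, io.TextIOBase))
```

Under that assumption, an `io.StringIO` passed as a file-like source would satisfy the `isinstance(stream, TextIOBase)` check in the patch, which is consistent with the behaviour @gklyne describes.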

+            use_stream = stream
+        else:
+            use_stream = TextIOWrapper(stream, encoding='utf-8')
+        return json.load(use_stream)
     finally:
         stream.close()


VOCAB_DELIMS = ("#", "/", ":")
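
The stream handling in the patched `source_to_json` can be tried out in isolation. The sketch below uses only the standard library; `load_json_stream` is a hypothetical name, and the rdflib input-source machinery is deliberately left out:

```python
import io
import json
from io import TextIOBase, TextIOWrapper


def load_json_stream(stream):
    """Hypothetical stand-alone version of the patched logic: read JSON from
    either a character stream or a byte stream holding UTF-8 text."""
    try:
        if isinstance(stream, TextIOBase):
            use_stream = stream  # already yields str, use as-is
        else:
            use_stream = TextIOWrapper(stream, encoding="utf-8")
        return json.load(use_stream)
    finally:
        stream.close()


doc = '{"@id": "http://example.org/s", "label": "caf\u00e9"}'
from_text = load_json_stream(io.StringIO(doc))
from_bytes = load_json_stream(io.BytesIO(doc.encode("utf-8")))
print(from_text == from_bytes)
```

The `isinstance` check matters because the removed line called `.decode("utf-8")` on whatever `stream.read()` returned; for a text stream that is a `str`, which has no `decode` method in Python 3.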


65 changes: 65 additions & 0 deletions test/test_issue1484.py
@@ -0,0 +1,65 @@
+import unittest
+import io
+import json
+from rdflib import Graph, RDF, RDFS, Namespace
+
+
+class TestIssue1484_json(unittest.TestCase):
+    def test_issue_1484_json(self):
+        """
+        Test JSON-LD parsing of result from json.dump
+        """
+        n = Namespace("http://example.org/")
+        jsondata = {"@id": n.s, "@type": [n.t], n.p: {"@id": n.o}}
+
+        s = io.StringIO()
+        json.dump(jsondata, s, indent=2, separators=(",", ": "))
+        s.seek(0)
+
+        DEBUG = False
+        if DEBUG:
+            print("S: ", s.read())
+            s.seek(0)
+
+        b = n.base
+        g = Graph()
+        g.bind("rdf", RDF)
+        g.bind("rdfs", RDFS)
+        g.parse(source=s, publicID=b, format="json-ld")
+
+        assert (n.s, RDF.type, n.t) in g
+        assert (n.s, n.p, n.o) in g
+
+
+class TestIssue1484_str(unittest.TestCase):
+    def test_issue_1484_str(self):
+        """
+        Test JSON-LD parsing of result from string (used by round tripping tests)
+
+        (Previously passes, but broken by earlier fix for above.)
+        """
+        n = Namespace("http://example.org/")
+        jsonstr = """
+            {
+              "@id": "http://example.org/s",
+              "@type": [
+                "http://example.org/t"
+              ],
+              "http://example.org/p": {
+                "@id": "http://example.org/o"
+              }
+            }
+        """
+
+        b = n.base
+        g = Graph()
+        g.bind("rdf", RDF)
+        g.bind("rdfs", RDFS)
+        g.parse(data=jsonstr, publicID=b, format="json-ld")
+
+        assert((n.s, RDF.type, n.t) in g)
+        assert((n.s, n.p, n.o) in g)
+
+
+if __name__ == "__main__":
+    unittest.main()
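
The first test depends on two details that are easy to miss: `json.dump` accepts rdflib terms as keys and values because `URIRef` subclasses `str`, and the `StringIO` must be rewound before it is read back. A standard-library sketch of both points, using a hypothetical `FakeURIRef` class as a stand-in so that rdflib is not required:

```python
import io
import json


class FakeURIRef(str):
    """Hypothetical stand-in for an rdflib term: a plain str subclass."""


s_term = FakeURIRef("http://example.org/s")
p_term = FakeURIRef("http://example.org/p")
o_term = FakeURIRef("http://example.org/o")

# str subclasses are serialised as ordinary JSON strings, as keys and values.
jsondata = {"@id": s_term, p_term: {"@id": o_term}}

buf = io.StringIO()
json.dump(jsondata, buf, indent=2, separators=(",", ": "))

# After dumping, the position is at the end of the buffer, so reading
# without rewinding yields an empty string and json.load fails.
try:
    json.load(buf)
    failed_without_rewind = False
except json.JSONDecodeError:
    failed_without_rewind = True

buf.seek(0)  # rewind, as the test does with s.seek(0)
roundtrip = json.load(buf)
print(failed_without_rewind, roundtrip["@id"])
```

This only mirrors the input-preparation half of the test; the `g.parse(source=s, ...)` call itself still needs rdflib and the patched `source_to_json`.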