Document.append does not handle utf-8 encoding #17
Here's the change I think we can make for approach two above:

diff --git a/sbol/document.py b/sbol/document.py
index 4eb9bf5..6191954 100644
--- a/sbol/document.py
+++ b/sbol/document.py
@@ -372,18 +372,17 @@ class Document(Identified):
         :return: None
         """
         self.logger.debug("Appending data from file: " + filename)
-        with open(filename, 'r') as f:
-            if not self.graph:
-                self.graph = rdflib.Graph()
-            # Save any changes we've made to the graph.
-            self.update_graph()
-            # Use rdflib to automatically merge the graphs together
-            self.graph.parse(f, format="application/rdf+xml")
-            # Clean up our internal data structures.
-            # (There's probably a more efficient way to merge.)
-            self.clear(clear_graph=False)
-            # Base our internal representation on the new graph.
-            self.parse_all()
+        if not self.graph:
+            self.graph = rdflib.Graph()
+        # Save any changes we've made to the graph.
+        self.update_graph()
+        # Use rdflib to automatically merge the graphs together
+        self.graph.parse(filename, format="application/rdf+xml")
+        # Clean up our internal data structures.
+        # (There's probably a more efficient way to merge.)
+        self.clear(clear_graph=False)
+        # Base our internal representation on the new graph.
+        self.parse_all()

     def parse_all(self):
         # Parse namespaces
This issue only happens when the environment variable LANG is unset, as it is in a Docker environment. When LANG=en_US.utf8, the UTF-8 document is read properly. Since the recommended fix works in both environments, it is probably desirable. It wouldn't be surprising to see the SBOL module used in a Docker environment.
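To illustrate the locale dependence described here, a minimal sketch using only the standard library. In text mode, `open()` without an `encoding` argument falls back to the locale's preferred encoding, which may be ASCII when LANG is unset (this depends on the Python version and platform). The sample string and the helper names (`write_sample`, `read_text`, `read_bytes`, `ascii_read_fails`) are hypothetical, and the ASCII case is forced explicitly rather than relying on the environment.

```python
import locale
import os
import tempfile

# Hypothetical sample content standing in for an SBOL file with non-ASCII text.
SAMPLE = "<title>caf\u00e9 \u00b5M</title>\n"


def default_text_encoding():
    """Return the encoding that open() uses when none is given."""
    return locale.getpreferredencoding(False)


def write_sample():
    """Write SAMPLE as UTF-8 bytes to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    with os.fdopen(fd, "wb") as f:
        f.write(SAMPLE.encode("utf-8"))
    return path


def read_text(path, encoding):
    """Read the file in text mode with an explicit encoding."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


def read_bytes(path):
    """Read the file in binary mode; no decoding happens here."""
    with open(path, "rb") as f:
        return f.read()


def ascii_read_fails(path):
    """Simulate a text-mode read under an ASCII default encoding."""
    try:
        read_text(path, "ascii")
    except UnicodeDecodeError:
        return True
    return False
```

Calling `default_text_encoding()` inside the container and on the host is a quick way to check whether the two environments actually differ. Reading bytes, or naming the encoding explicitly, gives the same result in both.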
SBOL is definitely used in Docker environments, but this can probably be
pushed back to 1.0 if you want.
Thanks,
-Jake
(sent from my phone)
On Thu, Nov 14, 2019, 2:08 PM Tom Mitchell wrote:
This issue only happens when the environment variable LANG is unset, as it
is in a docker environment. When LANG=en_US.utf8, the UTF8 document is read
properly.
Since the recommended fix works in both environments, it is probably
desirable. It wouldn't be surprising to see the SBOL module used in a
docker environment.
Saw this error via Document.append().

The file sbol/test/SBOLTestSuite/SBOL2/pICSL50014.xml has non-ASCII characters (see lines 136 and 137). This file passes SBOL validation.

Two alternatives seem to work. One of them is to open the file in binary mode (with open(filename, 'rb')); this is what RDFLib does (see parser.py).

Both approaches pass the tests in test_roundtrip.py.
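A rough sketch of why handing a parser either a binary file handle or a filename avoids the locale problem: the parser receives raw bytes and decodes them itself from the XML declaration. This uses the standard library's `xml.etree.ElementTree` as a stand-in for RDFLib, so it is an analogy and not the code in `sbol/document.py`. The sample document and the function names are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical UTF-8 XML document containing non-ASCII characters.
XML_BYTES = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<doc><title>\u00b5-plasmid caf\u00e9</title></doc>\n"
).encode("utf-8")


def write_sample():
    """Write XML_BYTES to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    with os.fdopen(fd, "wb") as f:
        f.write(XML_BYTES)
    return path


def title_via_binary_handle(path):
    """Open the file in binary mode and let the parser decode the bytes."""
    with open(path, "rb") as f:
        return ET.parse(f).getroot().findtext("title")


def title_via_filename(path):
    """Pass the path and let the parser open the file itself."""
    return ET.parse(path).getroot().findtext("title")
```

Neither function decodes the file with the locale's default encoding, so both should behave the same whether or not LANG is set.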