Exceptions reading some SVG files and Batik #442
Comments
Hi Hosé, Thanks for the detailed report! I believe this issue is a Batik problem. Have a look if you can open the SVGs in the official Batik products. If you can't, then there's nothing I can do to fix it directly (other than passing your info along to the Batik team). It would be helpful if you could test, and report the problem directly to them. Re. Batik 1.9 vs 1.10, I think it should be pretty safe. The only thing is, I haven't run the test cases using 1.10, so there's no warranty. ;-) Feel free to submit an issue/patch, upgrading to Batik 1.10 (it should be as easy as changing some version numbers in a POM file)! Best regards, -- |
Well, now that I'm looking into it (I had only tested with half a dozen SVG files before), I'm finding more and more problems with Batik. I can't even open the SVG samples in the Batik folder with my viewer ( ; _ ; ) The problems are many:
an internet connection is established and it takes forever to load the file... I mean, way too long to be acceptable.
And to make me even more depressed, I can't even open the official Batik app "squiggle" as it gives the exception:
and gets forever stuck... (JVM task must be killed). If you have any ideas, I'd love to hear them. |
Okay, I suggest you roll back to Batik 1.9. At least that works fine for me. For the references to work properly, you probably need to read using an A double-hyphen inside a comment is actually illegal in SGML (and thus in XML/HTML etc), so I think Batik is correct on that part (https://www.w3.org/TR/REC-xml/#sec-comments). I really don't think Batik should allow an SVG to start roaming the internet for random information linked from an XML file. If it does by default, that should be reported as a bug, IMHO. Hope that helps, -- |
Well, it certainly did help, thank you very much!! ( ^ _ ^ ) b However, using both Batik 1.9 and 1.10, it still takes an eternity when the SVG has the doctype. As you can see here in my logs:
Doing the math, 15 seconds to load a simple image is far from ok. If I remove the doctype from the SVG file, it loads in a flash. You say that you don't have this problem when you load these SVG files? What could I be doing wrong? Maybe if I download the dtd files and place them somewhere, it helps? About the "--" maybe that was a bad example, as firefox also complains about it and does not show the image, but for example, when a "use tag" reference fails, firefox still manages to display the image. But that's fine, now only the 15s lag is annoying me. About the security, thanks a lot for the link, I'm much more reassured now as it says:
|
I think using (from the link above):
...should make sure no external requests are made. Still, I don't understand how fetching the resource could be in compliance with the default security setting, unless you actually tried to display the SVG directly from the w3.org site? Perhaps it happens outside of Batik's control, and is a setting in the XML parser library? You might have a look, to see if there's a setting you can change, to avoid trying to resolve external resources. I used to know these things, but it's been a while since I worked with these XML APIs... Best regards, -- |
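For anyone hitting the same delay: a minimal sketch of the "setting in the XML parser" idea, assuming the SVG is pre-parsed with the JDK's standard JAXP `DocumentBuilder` and that the slow part is the parser fetching the DTD named in the DOCTYPE (the thread does not confirm which call was responsible). `OfflineSvgParser` is a hypothetical helper name, not part of Batik or TwelveMonkeys. It answers every external entity lookup with an empty document, so no network request is made.

```java
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.io.StringReader;

// Hypothetical helper: parses SVG bytes without touching the network.
final class OfflineSvgParser {

    static Document parse(byte[] svgBytes) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // SVG elements live in a namespace
        factory.setValidating(false);    // we do not want DTD validation

        DocumentBuilder builder = factory.newDocumentBuilder();
        // Resolve every external entity (including the DOCTYPE's DTD)
        // to an empty stream instead of opening a connection.
        builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));

        return builder.parse(new ByteArrayInputStream(svgBytes));
    }
}
```

The trade-off is that entities declared in the real DTD are no longer available, so a file that relies on them may fail to parse.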
Hi, thanks a lot for the tips, really appreciated 👍 The SVG files are all on my local hard drive. My viewer (Meew) can only read files on local drives (disk, USB pen...) or network shared folders. In fact, I want to keep it that way; I'm not too happy about having my software open internet connections like that. So I don't know what the problem might be. I have to admit that I'm a bit lost about setting Batik's security options (I really intend to disallow external resources), but I'll try to read everything more carefully and search Google : ) Sadly, it doesn't help that I'm flooded with work until November, and I can't spend much time on this hobby of mine that is Meew. |
With the little time I had, I managed to find that:
The only solution might be tweaking the java.lang.SecurityManager and the policy file to block the connections. I have never done this before, but I'll look into it when I have time, and if I find a solution I'll post it here in case someone else is interested. |
Thanks for digging into this, and updating the issue! If you find something that can be done, let me know here, or by creating a PR, and I'll look into it! :-) Best regards, -- |
Ok, my bad. It was me not looking into it properly and being dumb (what else is new?). The 15s lag was not caused by Batik at all. Before calling TwelveMonkeys' plugin, I use the
Still, about the security, it seems that using a policy file is the only way to run libraries inside a sandbox, though it's quite the headache for more than one reason. Anyway, I still have a problem with the references to CSS files, even though I properly set the base URI. Here is the exception (even though the image is still retrieved):
The file is barChart.svg in Batik's samples folder. Do you have this problem as well? |
Hi, Sorry for not following up on this. No I haven't seen this problem, but I'm not using the Batik plugins, so I don't do much testing apart from running the test cases. Maybe you could try adding a test case with an external CSS reference, so that we get better coverage on that? I don't think there's much I can do but if there is, I'd like to look into it! Glad you found the fix for the timeout though! :-) -- |
Okay, So I made a test case, and it seems you are right. I am able to decode and display the document ( It doesn't change the fact that the CSS is not picked up, and I really don't get why. As setting the baseURI clearly does solve the problem for embedded resources... Best regards, -- |
Hi, thank you for looking into this :) It seems there are several classes where we can set the base URI/URL... The guy in the link above suggests this: However, that class was removed and that might be where the problem is (https://stackoverflow.com/questions/30092651/where-has-org-apache-batik-dom-svg-svgdomimplementation-gone/30250306). Now there is org.apache.batik.anim.dom.SVGOMDocument. In fact, I am converting a org.w3c.dom.Document object to bytes and passing it to the TwelveMonkeys reader. I do set the URI with the setDocumentURI method before passing it to the reader (as well as using the SVGReadParam) but it does not help. I do not use the SVGOMDocument class though, as I want to keep my code blind to plug-ins as much as possible. |
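To make the round trip described above concrete, here is a minimal sketch of turning an `org.w3c.dom.Document` into bytes using only the JDK's identity `Transformer`. It is an assumption about how such a conversion might be done, not the code actually used in Meew, and `DomToBytes` is a hypothetical name. Note that the serialized bytes carry no document URI, which is why the base URI has to be handed to the reader separately.

```java
import org.w3c.dom.Document;

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import java.io.ByteArrayOutputStream;

// Hypothetical helper: serializes a DOM document to a byte array.
final class DomToBytes {

    static byte[] serialize(Document document) throws TransformerException {
        // A Transformer created without a stylesheet copies its input as-is.
        Transformer identity = TransformerFactory.newInstance().newTransformer();

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        identity.transform(new DOMSource(document), new StreamResult(out));
        return out.toByteArray();
    }
}
```

The resulting array can be wrapped in a `ByteArrayInputStream` and passed to `ImageIO.createImageInputStream(...)` before calling `reader.setInput(...)`.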
Hi, If you look at the source code of the It even works, kind of. Without it, the link to the "squiggle" ( Batik just doesn't use the same mechanism for loading the CSS, which is very confusing/frustrating. Using a relative or absolute URL doesn't seem to make a difference (this is up to you though, as I just pass whatever you claim to be the base URI along). The fact that the message from Batik contains " Best regards, -- |
Using an absolute link in the XML like:
actually makes the exception go away. However, and I only noticed it just now, the style does not influence the final image, which means the text size of the legend stays the same whether or not it finds the CSS. If it worked, I could make a workaround that edits the XML to find the <?xml-stylesheet> tags and read their href. If the link contains the char ":" (because even on Linux, without "drive:", we have "file:"), it probably means that the link is already absolute; if not, I would inject the baseURI. However, I'm not sure it is this simple to check whether a path is absolute, and if the CSS ends up ignored anyway, it is pretty much useless. I just wish I had more time to spend on this, but until at least November it will be complicated. |
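On the "is it absolute?" question: rather than searching for ":", `java.net.URI` can make the decision. A minimal sketch, assuming the href is a syntactically valid URI reference; `StylesheetHref` and its method names are hypothetical and not part of Batik or TwelveMonkeys.

```java
import java.net.URI;

// Hypothetical helper for the workaround described above.
final class StylesheetHref {

    // True when the href carries its own scheme (file:, http:, ...).
    static boolean isAbsolute(String href) {
        return URI.create(href).isAbsolute();
    }

    // Returns the href unchanged if it is absolute,
    // otherwise resolves it against the document's base URI.
    static String resolve(String baseUri, String href) {
        URI reference = URI.create(href);
        if (reference.isAbsolute()) {
            return reference.toString();
        }
        return URI.create(baseUri).resolve(reference).toString();
    }
}
```

For example, `StylesheetHref.resolve("file:/home/user/samples/barChart.svg", "tests/resources/style/test.css")` yields `file:/home/user/samples/tests/resources/style/test.css`. One caveat: a raw Windows path with backslashes is not a valid URI and makes `URI.create` throw, so such values would need converting first (for instance via `File.toURI()`).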
In my tests it actually does influence the image...? The only thing I make it affect though, is the font size. Anyway, I made a fix to the problem, so that it now works. Even the CSS. It was my bad all along, because I set it too late... :-P There's still a quirk though, that is, if you try to query the Perhaps I should add a custom -- |
I just pushed an improvement, feel free to try it out. It should completely remove the noisy output to Thank you for your feedback so far! -- |
That's great to hear! However, whatever I do, I can't see the effects of the CSS on the image. I even tried this:
just in case it was me needing glasses (happens sometimes), but there is no effect on the image displayed by my application, only in firefox. I re-checked my code, since I do call the methods getFormatName, getImageTypes, getWidth and getHeight, but I don't think I'm doing it in case of SVG images. The only thing I call on the reader in my code when I have an SVG image is (unless I'm not looking properly):
Would any of these methods break the CSS? What could I be doing wrong? |
Here's what the rendering looks like in my case, using Batik 1.9.1: Default style (as in the test case): Modified to 160/100pt: Clearly a difference. And no, none of the things you do should be a problem (I do exactly the same):

    ImageReader reader = readers.next();
    reader.addIIOReadWarningListener(...);
    reader.addIIOReadProgressListener(...);
    reader.setInput(input);
    try {
        ImageReadParam param = reader.getDefaultReadParam();
        // To avoid explicit dependencies....
        if (param.getClass().getName().equals("com.twelvemonkeys.imageio.plugins.svg.SVGReadParam")) {
            Method setBaseURI = param.getClass().getMethod("setBaseURI", String.class);
            String uri = file.getAbsoluteFile().toURI().toString();
            setBaseURI.invoke(param, uri);
        }
        int numImages = reader.getNumImages(true);
        for (int imageNo = 0; imageNo < numImages; imageNo++) {
            try {
                BufferedImage image = reader.read(imageNo, param);
                IIOMetadata metadata = reader.getImageMetadata(imageNo);
                // Display image/metadata...
            }
            catch (Throwable t) {
                System.err.println(file + " image " + imageNo + " can't be read:");
                t.printStackTrace();
            }
        }
    }
    finally {
        input.close();
        reader.dispose();
    }

Is basically what I do (see -- |
Ok, then my problem must be somehow related to the fact that I first read the SVG file into a Document object and I do some tweaking in the XML before transforming everything into a byte array stream. I'm totally unable to do something for the time being (no time at all), but now that I'm sure it must be my problem, I'll look into it more carefully in the near future. As always, BIG THANK YOU for your work! You're a genius, twelvemonkeys is simply great ; ) |
You're welcome! :-) And thank you for taking the time to explain and suggest improvements! Much appreciated. PS: It might be possible with some light tweaks to allow the Something like:

    if (imageInput != null) {
        // ...
    }
    else if (pInput instanceof Document) {
        Document document = (Document) pInput;
        TranscoderInput input = new TranscoderInput(document);
        input.setURI(document.getBaseURI());
        rasterizer.setInput(input);
    }

Anyway, I'm closing this issue for now. Best regards, -- |
Ouch, I think I found out what the problem is that is causing CSS to be ignored. When I checked the byte array of the XML I was passing to the reader, I found out that there were no line breaks before the svg tag. So, it looks something like this:
If I put the line Now, passing a SVG without the line breaks works well in Firefox, and I don't think there is a rule saying that line breaks are mandatory, so I don't know why Batik should have any problem with this. And preserving the line breaks of the original XML file (or injecting them if they were missing to begin with) is, for more than one reason, not something that easy to do... : ( |
I can't see anything in the spec about them having to be on a specific line... But you need to report that bug to the Batik guys, as I don't control the parsing. As I mentioned above, I think a better approach (than re-serializing to a byte array) would be to just pass the pre-parsed document to the reader. Did you try that out yet? -- |
Not yet, but I will!! I promise to come back with the results soon. |
First of all, I'm really happy with this 3.4.1 release, thank you for a great job! All the problems that I had encountered have been solved, except with some SVG files.
Now, there is a very high chance that this is not a TwelveMonkeys problem. I bet it is an issue within Batik (and I do know that Batik does not support everything), or I'm doing something wrong (like missing some jar files), but I felt it was worth mentioning.
The problem happens only with some SVG images, like the one on the Wikipedia page "SVG exported from KOMPAS-Graphic".
When I try to read this file, I get this:
I can't find any info about this drawing-type tag. I edited the XML of the file to remove that tag to test it. The image is displayed exactly the same in Firefox, with or without the tag. When I try to load the image after removing the <drawing-type>1</drawing-type> tag, the exception changes to:
Do you know anything about this? This also happened in the previous 3.3.2 version. I'm using Batik version 1.10 now, but the problem happened with version 1.9 as well. By the way, I saw in the TwelveMonkeys release notes that "Batik dependencies updated to 1.9", so is it safe if I use 1.10?