Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] SVG support #30

Closed
wants to merge 1 commit into from
Closed

[WIP] SVG support #30

wants to merge 1 commit into from

Conversation

borodean
Copy link
Contributor

@borodean borodean commented Nov 2, 2015

Solves #29

This PR:

  • parses SVGs which have both dimensions specified;
  • with one of the dimensions and a view box;
  • with view box only;
  • handles cases when nothing is specified and ignores percentage dimensions.

However, there are stuff to improve performance-wise and very lame format detection.

};

function isSVG (buffer) {
return ('<' === utils.ascii(buffer, 0, 1));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file format detection is really naive and should be improved. For instance, it wouldn’t work for SVGs that have whitespace at the beginning. I would rather test the whole file for regExps.rootTag and assume its an SVG if it matches.

@borodean
Copy link
Contributor Author

borodean commented Nov 2, 2015

@mgartner what do you think? Any advice on how to improve it?

@mgartner
Copy link
Member

mgartner commented Nov 2, 2015

I'm not too familiar with the SVG format, but it would be nice to not have to read the entire file to detect whether or not it is an SVG and to measure it. I'm not sure that's possible. I need to research the SVG format more.

If a full read on the file is necessary, I'm not sure supporting SVG would make sense for Calipers, and maybe a separate SVG parser should be used. If a good case can be made for Calipers to include it, I think that at the very least, the API should be updated so that supported file types can be whitelisted so that someone who has no need to detect SVG won't see a performance hit from a full file read.

@borodean
Copy link
Contributor Author

borodean commented Nov 2, 2015

@mgartner here is an article describing the basic logic for determining SVG size: https://developer.mozilla.org/en-US/docs/Web/CSS/Scaling_of_SVG_backgrounds.

I guess I have some ideas on how to not load the whole file to get the dimensions: since the only thing needed to get those is the opening <svg> tag which is supposed to be somewhere at the beggining of file, we may read file chunk by chunk until this tag is fully read and then stop.

As for the file type detection it could be agreed that Calipers understands only SVGs where that tag was introduced in the first 32 bytes — otherwise it can't detect it (but the file could still be generally valid) and SVG should be fixed. This could be improved in the future if some other kind of detection is implemented, like looking at the file extension or manually passing the type. What do you think?

As for the reasons of why I am eager to help getting SVG work is my postcss-assets lib, which among other things helps to automatically insert image dimensions into CSS. Since SVG is a very popular Web format I care about supporting it. Currently the lib is using image-size, but SVG support there has some serious bugs and the project itself looks abandonned, so I'm looking at Calipers to migrate to it.

@borodean
Copy link
Contributor Author

borodean commented Nov 4, 2015

@mgartner I've refactored the code attempting to make it more readable and made the file type detection more sane. However, dimension calculation still needs to be improved to not read the whole file.

I need to find a way to search for the first <svg...> match and then stop reading.

@borodean
Copy link
Contributor Author

borodean commented Nov 4, 2015

@mgartner the other option could be using third-party XML parsers that parse streams and are able to stop when the object of interest is found. I've quickly searched for such libs and noticed these two:

  • xml-stream, which is very fast, but depends on libexpat library;
  • node-xml, which is rather slow, but is purely JavaScript.

I guess I'll try to update the code to use those and see what would happen.

return pread(fd, new Buffer(BUFFER_SIZE), 0, BUFFER_SIZE, offset)
.spread(function (bytesRead, buffer) {
string += utils.ascii(buffer, 0, bytesRead);
var rootMatches = string.match(ROOT_MATCHER);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RegExp match is done for the whole part read from the file on every fs.read pass.

@borodean
Copy link
Contributor Author

borodean commented Nov 6, 2015

Okay, that was a tough one. I've tried some XML parsers, but eventually I felt that those are an overkill for such a minor task, so I kept the parsing with regular expressions. However, files are no longer read entirely with no need - they are read chunk by chunk until the RegExp matches and then it stops.

@mgartner could you review the code and share your thoughts and advice?

@borodean
Copy link
Contributor Author

borodean commented Nov 7, 2015

Okay, I've did some benchmark and it looks slower than the others:

PDF calipers: 83ms
PDF pdfinfo: 877ms
PNG calipers: 44ms
PNG identify: 905ms
JPEG calipers: 66ms
JPEG identify: 960ms
SVG calipers: 112ms

@mgartner
Copy link
Member

mgartner commented Nov 7, 2015

That's actually much more performant than I thought it would be. Have you
found any SVG files that this can't measure because there is extra
whitespace at the beginning or something?

Let me run some more benchmarks myself, but I'm feeling more confident this
could be added now.

I also am planning on allowing an optional parameter in measure that
whitelists file types, so certain formats can be avoided. The benefit would
be that Calipers won't waste time measuring files that a consumer app
doesn't support. I think this will allow much easier adoption of new
formats without having to worry about degrading performance of Calipers as
a whole.

On Saturday, November 7, 2015, Vadym Borodin notifications@github.com
wrote:

Okay, I've did some benchmark and it looks slower than the others:

PDF calipers: 83ms
PDF pdfinfo: 877ms
PNG calipers: 44ms
PNG identify: 905ms
JPEG calipers: 66ms
JPEG identify: 960ms
SVG calipers: 112ms


Reply to this email directly or view it on GitHub
#30 (comment).

@mgartner mgartner changed the title SVG support [WIP] SVG support Nov 19, 2015
@borodean
Copy link
Contributor Author

borodean commented Dec 4, 2015

Since we're going to introduce modules, I've extracted the code from this PR to a separate repo: https://github.com/calipersjs/calipers-svg

@borodean borodean closed this Dec 4, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants