Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(#3379) - prefer readAsArrayBuffer to BinaryString #3379

Closed
wants to merge 2 commits into from

Conversation

nolanlawson
Copy link
Member

I have no evidence that this produces better
performance, but my intuition tells me so.

It just seems silly to convert a Blob to a binary string
just for the purposes of MD5 hashing. Presumably
ArrayBuffers are better at memory management.

@nolanlawson
Copy link
Member Author

Note that spark-md5 supports binary strings and array buffers, but not blobs.

@nolanlawson
Copy link
Member Author

Neat, looks like I found a bug in spark-md5.

@nolanlawson
Copy link
Member Author

Welp, it'll take me awhile to debug spark-md5. Adding WIP to the title.

@nolanlawson nolanlawson changed the title (#3379) - prefer readAsArrayBuffer to BinaryString (#3379) - WIP- prefer readAsArrayBuffer to BinaryString Jan 12, 2015
@nolanlawson
Copy link
Member Author

Ah jeez, it's a bug in our sliceShim. Got a fix now.

nolanlawson added a commit that referenced this pull request Jan 17, 2015
I have no evidence that this produces better
performance, but my intuition tells me so.

It just seems silly to convert a Blob to a binary string
just for the purposes of MD5 hashing. Presumably
ArrayBuffers are better at memory management.
@@ -7,7 +7,7 @@ var MD5_CHUNK_SIZE = 32768;

function sliceShim(arrayBuffer, begin, end) {
if (typeof arrayBuffer.slice === 'function') {
if (!begin) {
if (!begin && !end) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ this was the sliceShim bug. if you pass in 0 for begin and non-zero for end, then it will just call slice(), which is wrong.

nolanlawson added a commit that referenced this pull request Jan 17, 2015
I have no evidence that this produces better
performance, but my intuition tells me so.

It just seems silly to convert a Blob to a binary string
just for the purposes of MD5 hashing. Presumably
ArrayBuffers are better at memory management.
@nolanlawson
Copy link
Member Author

OK, this commit should be good now.

@nolanlawson nolanlawson changed the title (#3379) - WIP- prefer readAsArrayBuffer to BinaryString (#3379) - prefer readAsArrayBuffer to BinaryString Jan 17, 2015
nolanlawson added a commit that referenced this pull request Jan 17, 2015
I have no evidence that this produces better
performance, but my intuition tells me so.

It just seems silly to convert a Blob to a binary string
just for the purposes of MD5 hashing. Presumably
ArrayBuffers are better at memory management.
@daleharvey
Copy link
Member

We should have a performance test that indicates this is some type of improvement, I ran the performance test and it ddint indicate any improvement, nor would I expect it since we dont have any attachment tests

@nolanlawson
Copy link
Member Author

To me it's obvious for a few reasons:

  1. In IE it doesn't even have readAsBinayString, so it has to readAsArrayBuffer and then convert it
  2. ArrayBuffers are more performant than huge strings for binary data, right?

Although I realize the onus is on me to provide a performance test justifying this. I'll try to get one soon.

@daleharvey
Copy link
Member

Yeh, I agree its obviously better, but its similiar to previous improvements we need to get the infrastructure in place not as much for this patch, but for future improvements to be measurable.

I am focussed on your android patch right now, once I have done that I will make another go at the improvements I plan for the performance suite

nolanlawson added a commit that referenced this pull request Jan 25, 2015
I have no evidence that this produces better
performance, but my intuition tells me so.

It just seems silly to convert a Blob to a binary string
just for the purposes of MD5 hashing. Presumably
ArrayBuffers are better at memory management.
nolanlawson added a commit that referenced this pull request Jan 25, 2015
@nolanlawson
Copy link
Member Author

Added a basic attachment perf test. It actually shows that the arraybuffer method is slightly slower:

  • Chrome 40: 3595ms -> 3677ms
  • Firefox 37: 5855ms -> 5890ms

These might be flukes, though. I should probably add more iterations and try to get a finer granularity.

Also, if the memory usage is improved, these numbers won't capture that.

@daleharvey
Copy link
Member

Just a jshint failure here, I would be fine merging this to get the code fix and at least the performance tests in there, +1 when green

nolanlawson added a commit that referenced this pull request Jan 27, 2015
I have no evidence that this produces better
performance, but my intuition tells me so.

It just seems silly to convert a Blob to a binary string
just for the purposes of MD5 hashing. Presumably
ArrayBuffers are better at memory management.
nolanlawson added a commit that referenced this pull request Jan 27, 2015
@nolanlawson
Copy link
Member Author

OK, fair enough! jshint should be fixed now.

@daleharvey
Copy link
Member

Failing in node

@nolanlawson
Copy link
Member Author

Oh God, if I want this canvas stuff to work in Node, we're going to need the canvas package, which has all these dependencies.

I dunno. Maybe I should just check a large image into source control and load that?

@nolanlawson
Copy link
Member Author

or just make a bigass text file

I have no evidence that this produces better
performance, but my intuition tells me so.

It just seems silly to convert a Blob to a binary string
just for the purposes of MD5 hashing. Presumably
ArrayBuffers are better at memory management.
@nolanlawson
Copy link
Member Author

OK, I think we have a pretty decent test now. Just directly makes some Buffers/Blobs of arbitrary size. Makes more sense than my fancy canvas stuff.

@daleharvey
Copy link
Member

I like this test much better, a lot simpler

nolanlawson added a commit that referenced this pull request Jan 30, 2015
I have no evidence that this produces better
performance, but my intuition tells me so.

It just seems silly to convert a Blob to a binary string
just for the purposes of MD5 hashing. Presumably
ArrayBuffers are better at memory management.
nolanlawson added a commit that referenced this pull request Jan 30, 2015
@daleharvey
Copy link
Member

Merged in 98a1623

@daleharvey daleharvey closed this Jan 30, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants