-
Notifications
You must be signed in to change notification settings - Fork 213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[tool] Add performance test for parsing time #340
Conversation
performance_test.js
Outdated
const promises = arr.map(() => getParsingTime(VastParser, inlineXml)); | ||
|
||
Promise.all(promises).then(results => { | ||
const average = getAverage(results); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Problem: The higher the numberOfRuns
the higher the average becomes. I have changed from Date.now()
to performance.now()
and it didn't solve the issue. Any inputs here would be helpful :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solution: Each loop runs the test in a separate process so they don't share resources with other tests and become slower with time. Promise.all
is also not reliable because it runs all the promises, and only after it has executed all the promises, will it begin to resolve them (so the timing isn't right). Now we wait for each run to be resolved before going to the next one. Downside is that the tests are a bit slow now.
fb3cc4a
to
8127339
Compare
for(let i = 0; i < numberOfRuns; i++) { | ||
try { | ||
const time = await new Promise((resolve, reject) => { | ||
const proc = spawn("node", [testFilePath]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Candid question, do we really need to go throught a new child process every time ? Just calling a promise that returns the value was not enough ? I recall you talked about the more you run the test the more it took, I guess it's related to that ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe now that we don't do Promise.all
it isn't needed anymore but I will need to test it out. This is safer because we know that each test is not impacting other runs (by sharing the same resources)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But since you're doing a await they actually running one by one so it won't share anything right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested it without the child processes, and the results changed in correlation to how many runs was performed. So it seems it is still needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Rumesh or Jakub have an explaination for that I would be happy to know it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TL;DR; Yes spawn new process will be faster than doing it in the same execution thread.
The child_process.spawn() method spawns the child process asynchronously, without blocking the Node.js event loop
https://nodejs.org/api/child_process.html#child_process_child_process_spawn_command_args_options
So first using spawn
will not block the node event loop.
And NodeJS is single thread but using spawn
let the capabilities to use ale the cores in the CPU
a new process will be created and managed by your OS. That new process can be executed in parallel to the main process as long as your computer has at least 2 virtual CPU cores
https://stackoverflow.com/questions/33381796/does-node-child-process-spawn-run-on-multi-core-cpu
So yes spawn
will create clean and new thread to execute this test file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spawning node
for each test feels heavy. Maybe https://nodejs.org/api/vm.html can be useful and more lightweight, although I never used it personally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked into it and tried it, but I had some errors (new vm.Script('node performance/get_time.js')
was throwing an error). As the waiting time for the performance tests to run is "not great, not terrible", and this PR isn't a priority, I will leave it as it is for now, because at least it works 🤷♀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
performance/get_time.js
Outdated
const path = require('path'); | ||
const { VASTClient } = require('../dist/vast-client-node.min.js'); | ||
|
||
const url = path.join('file://', __dirname, '..', 'spec', 'samples', 'inline-linear.xml'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just FYI, when using path.join
you can also just do '../spec/samples/inline-linear.xml'
and it will work cross-platform just fine. No need to supply N separate arguments. (the output will be the same)
(the output will contain \
on Windows though, so it can be used when you want to open a file, but not always in other situations)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However the problem is that path
was created for dealing with filesystem disk paths, not with any generic paths. In particular it does not work as expected on Windows when dealing with URLs because URLs have /
while all path
operations output \
on Windows, and also on Windows you have the C:\
stuff which complicates things.
When I run this on WIndows with bash, I get:
- with
file://
in the beginning:
[Error: ENOENT: no such file or directory, open 'C:\C:\git\vast-client-js\spec\samples\inline-linear.xml'] {
- without
file://
:
TypeError [ERR_INVALID_PROTOCOL]: Protocol "c:" not supported. Expected "http:"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to make it work with https://www.npmjs.com/package/file-url (https://github.com/sindresorhus/file-url/blob/master/index.js)
const fileUrl = require('file-url');
const url = fileUrl(path.join(__dirname, '../spec/samples/inline-linear.xml'));
but I still get the ENOENT. Need to dig further, maybe there's some bash-on-windows specific stuff, or an issue with vast-client
🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW the output from file-url
is like file:///C:/git/vast-client-js/spec/samples/inline-linear.xml
which when loaded in any browser on Windows works just fine.
Unclear to me why node wants to load C:\C:\
It is either issue in vast-client
or the lib which it uses under the hood to issue requests, or some peculiary of msys/mingw like this one
BTW check also this (added in node 10.12)
https://nodejs.org/api/url.html#url_url_pathtofileurl_path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to using pathtofileurl
. I wasn't able to test it on windows though, so I'm not sure it will work. I also removed the keyword __dirname
as it wasn't necessary after all
performance/get_time.js
Outdated
|
||
const VastClient = new VASTClient(); | ||
|
||
const timeBefore = Date.now(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps you could use process.hrtime()
although it's more complex
https://nodejs.org/api/process.html#process_process_hrtime_time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
for(let i = 0; i < numberOfRuns; i++) { | ||
try { | ||
const time = await new Promise((resolve, reject) => { | ||
const proc = spawn("node", [testFilePath]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spawning node
for each test feels heavy. Maybe https://nodejs.org/api/vm.html can be useful and more lightweight, although I never used it personally.
Alright so I found the issue on Windows: it's this nodejs/node#24223 (comment) To fix this, this line: |
performance/get_time.js
Outdated
VastClient.get(url) | ||
.then(() => { | ||
console.log(Date.now() - timeBefore); | ||
const res = process.hrtime(timeBefore); | ||
console.log((res[0] * 1e3 + res[1] * 1e-6)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be 1e6
not 1e-6
? (same in the other file)
1e6 === 1000000
1e-6 === 0.000001
Maybe more readable to just write 1000 / 1000000 instead :)
Or even better, since recent versions of node you can write integers with _
separators for readability:
1_000
1_000_000
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
res[1]
is in nanoseconds so I'm multiplying it by 1e-6, I could also divide it by 1e6, but it is correct as it is. I can change it to be more explicit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah well, the node docs suck because they look like the arr[1]
is a small value while in reality it's huge.
And now I see that you use ms
while they use ns
in output. All good 👍
@jakub-g thanks for the input, I didn't think to look at node_url_handler for some reason, I thought the problem was in the performance tests. |
Description
Script to calculate the parsing speed of VAST client (fetch and parse) and VAST parser (just parse). Each test is run 100 times, to get the average.
Current alpha 8 results:
Type