Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asyncify: support *-matching in whitelist and blacklist #2344

Merged
merged 5 commits into from Sep 23, 2019

Conversation

@Beuc
Copy link
Contributor

commented Sep 18, 2019

See emscripten-core/emscripten#9381 for rationale.

The impacted code seem to lack proper testing framework, ideally we'd unit-test wildcardMatch, and check what functions precisely were instrumented in the final output. I'd like feedback/guidance on this.

Also currently the performance appears significantly slower (~+30% time). I'm not sure if this is purely related to the matching algorithm, in which case we may need a larger beast like http://developforperformance.com/MatchingWildcards_AnImprovedAlgorithmForBigData.html
EDIT: +30% time of my full link (+15s from initial 45s), so probably X00% increase for the pass itself.

Copy link
Member

left a comment

Thanks for the PR!

Overall looks good aside from the comments below. Testing looks reasonable. Speed will be slower, yeah, but probably this is fine for now.

@@ -503,7 +508,12 @@ class ModuleAnalyzer {
// Only the functions in the whitelist can change the state.
for (auto& func : module.functions) {
if (!func->imported()) {
map[func.get()].canChangeState = whitelist.count(func->name) > 0;

This comment has been minimized.

Copy link
@kripken

kripken Sep 18, 2019

Member

This looks like the behavior might have changed - before we always assigned to canChangeState, but in this PR it becomes only if it we turn it true. Maybe this explains the test failure?

This comment has been minimized.

Copy link
@Beuc

Beuc Sep 19, 2019

Author Contributor

Exactly. I didn't realize asyncify would apply auto-detection only to be overwritten by the whitelist. Perharps this can be optimized out, unless there's something left to do with imported functions.

@@ -96,7 +96,10 @@ inline bool wildcardMatch(const std::string& pattern,
return false;

This comment has been minimized.

Copy link
@kripken

kripken Sep 18, 2019

Member

The comment on line 90-91 is now obsolete, please update it.

This comment has been minimized.

Copy link
@Beuc

Beuc Sep 19, 2019

Author Contributor

Done.

return wildcardMatch(pattern.substr(i + 1, std::string::npos),
value.substr(i, std::string::npos)) ||
wildcardMatch(pattern.substr(i, std::string::npos),
value.substr(i + 1, std::string::npos));

This comment has been minimized.

Copy link
@kripken

kripken Sep 18, 2019

Member

the npos parameters are not needed (they are the default value)

This comment has been minimized.

Copy link
@Beuc

Beuc Sep 19, 2019

Author Contributor

Done.

@Beuc

This comment has been minimized.

Copy link
Contributor Author

commented Sep 19, 2019

  • Separate direct and pattern matching to regain perfs
  • Issue warning when pattern didn't match anything
  • Match empty "", e.g. "func" matches "func" too
src/passes/Asyncify.cpp Outdated Show resolved Hide resolved

// The lists contain human-readable strings. Turn them into the internal
// escaped names for later comparisons
auto processList = [&module](const String::Split& list,

This comment has been minimized.

Copy link
@kripken

kripken Sep 19, 2019

Member

I think this is a good idea, but it would be clearer as a standalone class. What I mean is something like class PatternMatcher whose constructor takes a list of strings (that may have * in some of them), does all the work of processList in the constructor, and then provides a match function to test a value.

This comment has been minimized.

Copy link
@Beuc

Beuc Sep 19, 2019

Author Contributor

It seems this will make the reporting harder.
I believe I kept the level of complexity/clarity on-par with the initial code, so that sounds like a code clean-up/refactoring that is distinct and can be done later, what do you think?

This comment has been minimized.

Copy link
@kripken

kripken Sep 20, 2019

Member

It could be done separately, I agree.

But I do worry about increasing the code complexity in this PR without a timeline of when we can do this refactoring to get things in good shape again. When do you think you'd have time for it?

This comment has been minimized.

Copy link
@Beuc

Beuc Sep 23, 2019

Author Contributor

Done.

test/unit/test_asyncify.py Show resolved Hide resolved
Beuc added 2 commits Sep 19, 2019
Copy link
Member

left a comment

Excellent, thanks @Beuc!

@kripken kripken merged commit f5f53bb into WebAssembly:master Sep 23, 2019
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@kripken

This comment has been minimized.

Copy link
Member

commented Sep 23, 2019

Looks like this is failing on an emscripten test in our integration CI, see the bottom of https://logs.chromium.org/logs/emscripten-releases/buildbucket/cr-buildbucket.appspot.com/8901492015302662944/+/steps/Emscripten_testsuite__upstream__other_/0/stdout

======================================================================
FAIL: test_asyncify_response_file (test_other.other)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\b\s\w\ir\cipd_bin_packages\cpython\bin\Lib\unittest\case.py", line 329, in run
    testMethod()
  File "C:\b\s\w\ir\k\install\emscripten\tests\runner.py", line 116, in decorated
    func(self, *args, **kwargs)
  File "C:\b\s\w\ir\k\install\emscripten\tests\test_other.py", line 8962, in test_asyncify_response_file
    self.assertContained('Asyncify whitelist contained a non-existing function name: DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool)', proc.stderr)
  File "C:\b\s\w\ir\k\install\emscripten\tests\runner.py", line 846, in assertContained
    additional_info
  File "C:\b\s\w\ir\cipd_bin_packages\cpython\bin\Lib\unittest\case.py", line 410, in fail
    raise self.failureException(msg)
AssertionError: Expected to find 'Asyncify whitelist contained a non-existing function name: DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool)' in 'warning: Asyncify whitelist contained a non-matching pattern: DOS_ReadFile\28unsigned\20short\2c\20unsigned\20char*\2c\20unsigned\20short*\2c\20bool\29
', diff:

--- expected
+++ actual
@@ -1 +1,2 @@
-Asyncify whitelist contained a non-existing function name: DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool)
+warning: Asyncify whitelist contained a non-matching pattern: DOS_ReadFile\28unsigned\20short\2c\20unsigned\20char*\2c\20unsigned\20short*\2c\20bool\29
+

Looks like the warning prints escaped names now, which I think wasn't intended here?

(Until this is resolved no binaryen changes, including this one, can roll into emsdk releases.)

@kripken

This comment has been minimized.

Copy link
Member

commented Sep 23, 2019

Oh, this does look bad: the function name is DOS_ReadFile(unsigned short, unsigned char*, unsigned short*, bool) so it thinks it's a pattern due to the *... that's always been a problem I guess, but as long as we only matched them at the end, it wasn't that noticeable.

A proper fix might be to use a different character for wildcards, but that's annoying.

We do need to fix the warning here a little, I have a PR in progress.

kripken added a commit that referenced this pull request Sep 24, 2019
This is part of the fix for

https://logs.chromium.org/logs/emscripten-releases/buildbucket/cr-buildbucket.appspot.com/8901492015302662960/+/steps/Emscripten_testsuite__upstream__other_/0/stdout

Specifically it fixes that the name shown there should not be escaped.

Followup for #2344
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.