-
Couldn't load subscription status.
- Fork 3.4k
[empath-split] Support multi-paths modules #25577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
e307e14
e53e2db
8a8dd99
ac1d245
97258c6
7eedf6d
70a22a5
b481574
c7af449
3806250
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,13 +21,26 @@ | |
| $ emcc -g2 -gsource-map a.o b.o -o result.js | ||
| See https://emscripten.org/docs/porting/Debugging.html for more details. | ||
|
|
||
| This takes a wasm file and a paths file, which is a text file containing a list | ||
| of paths as inputs. The paths file should contain a single path per line. A | ||
| single split module will be generated per specified path. If a specified path | ||
| contains another specified path, functions contained in the inner path will be | ||
| split as the inner path's module, and the rest of the functions will be split as | ||
| the outer path's module. Functions that do not belong to any of the specified | ||
| paths will remain in the primary module. | ||
| This takes a wasm file and a paths file as inputs. The paths file defines how | ||
| to split modules. The format is similar to the manifest file for wasm-split, but | ||
| with paths instead of function names. A module is defined by a name on a line, | ||
| followed by paths on subsequent lines. Modules are separated by empty lines. | ||
| For example: | ||
| module1 | ||
| path/to/a | ||
| path/to/b | ||
|
|
||
| module2 | ||
| path/to/c | ||
|
|
||
| This will create two modules, 'module1' and 'module2'. 'module1' will contain | ||
| functions from source files under path/to/a and path/to/b. 'module2' will | ||
| contain functions from source files under path/to/c. | ||
|
|
||
| If a specified path contains another specified path, functions contained in the | ||
| inner path will be split as the inner path's module, and the rest of the | ||
| functions will be split as the outer path's module. Functions that do not belong | ||
| to any of the specified paths will remain in the primary module. | ||
|
|
||
| The paths in the paths file can be either absolute or relative, but they should | ||
| match those of 'sources' field in the source map file. Sometimes a source map's | ||
|
|
@@ -238,6 +251,50 @@ def is_synthesized_func(func): | |
| return path_to_funcs | ||
|
|
||
|
|
||
| # 1. Strip whitespaces | ||
| # 2. Normalize separators | ||
| # 3. Make /a/b/c and /a/b/c/ equivalent | ||
| def normalize_path(path): | ||
| return utils.normalize_path(path.strip()).rstrip(os.sep) | ||
|
|
||
|
|
||
| def parse_paths_file(paths_file_content): | ||
| module_to_paths = {} | ||
| path_to_module = {} | ||
| cur_module = None | ||
| cur_paths = [] | ||
|
|
||
| for line in paths_file_content.splitlines(): | ||
| line = line.strip() | ||
| if not line: | ||
| if cur_module: | ||
| if not cur_paths: | ||
| diagnostics.warn(f"Module '{cur_module}' has no paths specified.") | ||
| module_to_paths[cur_module] = cur_paths | ||
| cur_module = None | ||
| cur_paths = [] | ||
| continue | ||
|
|
||
| if not cur_module: | ||
| cur_module = line | ||
| else: | ||
| path = normalize_path(line) | ||
| if path in path_to_module: | ||
| exit_with_error("Path '{path}' cannot be assigned to module '{cur_module}; it is already assigned to module '{path_to_module[path]}'") | ||
| cur_paths.append(path) | ||
| path_to_module[path] = cur_module | ||
|
|
||
| if cur_module: | ||
| if not cur_paths: | ||
| diagnostics.warn(f"Module '{cur_module}' has no paths specified.") | ||
| module_to_paths[cur_module] = cur_paths | ||
|
|
||
| if not module_to_paths: | ||
| exit_with_error('The paths file is empty or invalid.') | ||
|
|
||
| return module_to_paths | ||
|
|
||
|
|
||
| def main(): | ||
| args, forwarded_args = parse_args() | ||
| check_errors(args) | ||
|
|
@@ -247,32 +304,41 @@ def main(): | |
| print_sources(sourcemap) | ||
| return | ||
|
|
||
| paths = utils.read_file(args.paths_file).splitlines() | ||
| paths = [utils.normalize_path(path.strip()) for path in paths if path.strip()] | ||
| # To make /a/b/c and /a/b/c/ equivalent | ||
| paths = [path.rstrip(os.sep) for path in paths] | ||
| # Remove duplicates | ||
| paths = list(dict.fromkeys(paths)) | ||
| content = utils.read_file(args.paths_file) | ||
| module_to_paths = parse_paths_file(content) | ||
|
|
||
| # Compute {path: list of functions} map | ||
| path_to_funcs = get_path_to_functions_map(args.wasm, sourcemap, paths) | ||
| all_paths = [] | ||
| for paths in module_to_paths.values(): | ||
| all_paths.extend(paths) | ||
| path_to_funcs = get_path_to_functions_map(args.wasm, sourcemap, all_paths) | ||
|
|
||
| # Write .manifest file | ||
| with tempfile.NamedTemporaryFile(suffix=".manifest", mode='w+', delete=args.preserve_manifest) as f: | ||
| manifest = f.name | ||
| for i, path in enumerate(paths): | ||
| f.write(f'{i}\n') | ||
| if not path_to_funcs[path]: | ||
| diagnostics.warn(f'{path} does not match any functions') | ||
| for i, (module, paths) in enumerate(module_to_paths.items()): | ||
| if i != 0: # Unless we are the first entry add a newline separator | ||
| f.write('\n') | ||
| funcs = [] | ||
| for path in paths: | ||
| if not path_to_funcs[path]: | ||
| diagnostics.warn(f'{path} does not match any functions') | ||
| funcs += path_to_funcs[path] | ||
| if not funcs: | ||
| diagnostics.warn(f"Module '{module}' does not match any functions") | ||
|
|
||
| if args.verbose: | ||
| print(f'{path}: {len(path_to_funcs[path])} functions') | ||
| for func in path_to_funcs[path]: | ||
| print(' ' + func) | ||
| print(f'{module}: {len(funcs)} functions') | ||
| for path in paths: | ||
| if path in path_to_funcs: | ||
| print(f' {path}: {len(path_to_funcs[path])} functions') | ||
| for func in path_to_funcs[path]: | ||
| print(' ' + func) | ||
| print() | ||
| for func in path_to_funcs[path]: | ||
|
|
||
| f.write(f'{module}\n') | ||
| for func in funcs: | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Annoyingly If you want this to be deterministic then I think you need to make this a dict rather than set. The There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changed it and also one more place from |
||
| f.write(func + '\n') | ||
| if i < len(paths) - 1: | ||
| f.write('\n') | ||
| f.flush() | ||
|
|
||
| cmd = [args.wasm_split, '--multi-split', args.wasm, '--manifest', manifest] | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can you end these lines with
:maybe? I feel like it would make them easier to read if I could clearly see the different between module names and filenames.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to make the file format similar to that of wasm-split's multi split manifest, which does not have colons: https://github.com/WebAssembly/binaryen/blob/main/test/lit/wasm-split/multi-split.wast.manifest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, maybe we could update them both one day?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind much, but wasm-split's manifest file already has some users. If we decide to do it, I think we can give some internal clients a heads-up though. WDYT? @tlively
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will land this PR for now then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making improvements to the manifest file format sounds good to me. I imagine we might want to make more changes in the future as well, for example to include information about dependencies between modules.