Merged
Commits
96 commits
bd00b88
Add support for pipe refs
gliargovas Aug 29, 2023
6cd9325
Remove assertions for command nodes due to PipeNodes
gliargovas Aug 29, 2023
db4ed66
Merge branch 'support-for-local-assignments' into support-for-pipes
gliargovas Aug 29, 2023
ade6fbd
Remove redundant logs
gliargovas Aug 29, 2023
867a88a
Check PipeNodes independently
gliargovas Aug 29, 2023
52b3936
Add first version of report script
gliargovas Aug 30, 2023
2ebcedc
Fix typo in ast PipeNode assertion check
gliargovas Aug 31, 2023
dac2afa
Make plotting individual and improve format
gliargovas Aug 31, 2023
5d4d9da
Add first dgsh benchmark
gliargovas Aug 31, 2023
2c1f699
Improve structure of benchmark report script
gliargovas Aug 31, 2023
735cc97
Merge pull request #57 from binpash/support-for-pipes
angelhof Aug 31, 2023
32ea150
Adjust benchmark report script to work in different directories
gliargovas Aug 31, 2023
1a4031c
Add 2nd dgsh (no func) script
gliargovas Aug 31, 2023
ae4fbd6
Add 2_no_func.sh
gliargovas Aug 31, 2023
d78e3f7
Use difflib to print line diffs
gliargovas Aug 31, 2023
ab1e225
Merge remote-tracking branch 'origin/future' into improve-benchmarks-…
gliargovas Sep 1, 2023
a3c1449
Add timestamp logging functions
gliargovas Sep 1, 2023
e8a051d
Add support for more complex timestamp logging
gliargovas Sep 1, 2023
f932a38
Add timed logs for main scheduler workflow
gliargovas Sep 1, 2023
416bfdd
Improve log messages and add log to loop unrolling
gliargovas Sep 4, 2023
738f48f
Change scheduler debug level
gliargovas Sep 4, 2023
1601a5d
Change config parsing to handle complex commands
gliargovas Sep 4, 2023
44a18fb
Use env for benchmarks instead of input file args
gliargovas Sep 4, 2023
ddb8a52
Call the orch script correctly
gliargovas Sep 4, 2023
21b477f
Count how many times a command was executed along with time
gliargovas Sep 4, 2023
4ef1ab2
Add a gantt chart of benchmark execution
gliargovas Sep 4, 2023
e927c29
Print time lost and other execution statistics
gliargovas Sep 5, 2023
ed095a6
Fix awk parsing bug
gliargovas Sep 6, 2023
0d72535
Add 3.sh benchmark
gliargovas Sep 6, 2023
af0fcee
Uncomment awk commands
gliargovas Sep 6, 2023
98a8078
Add 4.sh benchmark
gliargovas Sep 6, 2023
d208132
Add 5.sh dgsh benchmark
gliargovas Sep 6, 2023
190afbd
Add benchmark config for dgsh scripts 3-5
gliargovas Sep 6, 2023
f04d945
Add 7.sh dgsh benchmark without function calls
gliargovas Sep 7, 2023
eeaa938
Comment-out 7.sh awk command
gliargovas Sep 7, 2023
b39ff74
Add latest benchmark config for 6.sh and 7.sh
gliargovas Sep 7, 2023
743c8f2
Rename benchmark 7 to 8
gliargovas Sep 7, 2023
9880559
Add 9.sh
gliargovas Sep 7, 2023
e5ca215
Add more env variables to keep track of
gliargovas Sep 7, 2023
265909e
Split pre exec cmd outside of run command
gliargovas Sep 7, 2023
4a41309
Remove LC_ALL var export from benchmarks
gliargovas Sep 7, 2023
c3286ed
Adjust gantt plot dimensions dynamically
gliargovas Sep 7, 2023
727eed6
Add 7.sh
gliargovas Sep 8, 2023
9275db1
Correctly resolve all terminal nodes
gliargovas Sep 8, 2023
b441971
Add 17.sh
gliargovas Sep 8, 2023
4452cee
Add 16.sh
gliargovas Sep 8, 2023
00f9b19
Rename benchmark scripts
gliargovas Sep 8, 2023
01faf8f
Update the operation of some scheduling components
gliargovas Sep 12, 2023
aafd376
Add scheduler optimization args
gliargovas Sep 12, 2023
60dc74f
Optional sandbox killing based on arg
gliargovas Sep 12, 2023
63b2acb
Update logs while killing
gliargovas Sep 12, 2023
d4dc63c
Add early rerun first (buggy) implementation
gliargovas Sep 13, 2023
11657b9
Fix early resolution correctness issues
gliargovas Sep 13, 2023
fb4c863
Save orch logs while running benchmarks
gliargovas Sep 13, 2023
ac0720e
Add options to save results
gliargovas Sep 14, 2023
2a7e3e4
Add sandbox killing arg again
gliargovas Sep 14, 2023
00198bd
Add correct argument splitting
gliargovas Sep 14, 2023
42c5a50
Rewrite proc killing commands using psutil
gliargovas Sep 15, 2023
6b3e678
Use most recent process killing methods
gliargovas Sep 15, 2023
492c3ad
Fix typos in 7.sh
gliargovas Sep 15, 2023
a996e51
Fix minor proc killing bug
gliargovas Sep 16, 2023
168c713
Remove redundant pre-commit resolution checks
gliargovas Sep 19, 2023
f240d69
Use recent pash version
gliargovas Sep 19, 2023
4e98790
Add matplotlib dependency for reporting
gliargovas Sep 19, 2023
af0d3f6
Fix trace parsing for mkdir -p
gliargovas Sep 19, 2023
dd5365a
Add the option to run the trace parser independently
gliargovas Sep 19, 2023
7b68326
Add README.md for benchmarks
gliargovas Sep 19, 2023
d6f7f79
Add improved top-level README
gliargovas Sep 19, 2023
53c5ea5
Update README.md
gliargovas Sep 20, 2023
f4880ee
Fix typo in installation script
gliargovas Sep 21, 2023
db75a7c
Remove redundant reporting script
gliargovas Sep 21, 2023
5aa5834
Add option to start speculation on first wait
gliargovas Sep 21, 2023
0bcf463
Add extra check when setting most recent env
gliargovas Sep 21, 2023
f96e20a
Add early on-wait-received env resolution check
gliargovas Sep 21, 2023
b7d6077
Add check for initial env on non-po nodes
gliargovas Sep 21, 2023
eb218df
move env check on wait after loop unrolling
gliargovas Sep 21, 2023
62ae3d8
Simplify Node 0 env initialization
gliargovas Sep 21, 2023
749a77f
Refactor env resolution to improve quality
gliargovas Sep 21, 2023
541873f
Remove sleep while killing processes
gliargovas Sep 21, 2023
ea169bf
Use most recent pash instance
gliargovas Sep 21, 2023
97dea9f
Add a reversed instance of 1.sh
gliargovas Sep 25, 2023
8ba9653
Add improved config for running dgsh benchmarks
gliargovas Sep 25, 2023
74de062
Use correct speedup calculation
gliargovas Sep 25, 2023
09f3579
Move env resolution after loop unrolling but before po progression
gliargovas Sep 25, 2023
25c854d
Make update_and_restart_nodes work with the transitive closure of the…
gliargovas Sep 25, 2023
353101f
Clarify hs is not a shell
gliargovas Sep 26, 2023
f9fb245
Capitalize config global variables
gliargovas Sep 26, 2023
54ec042
Remove redundant commented-out assertion
gliargovas Sep 26, 2023
5c7ef20
Add explanatory comments and future full PO assertions
gliargovas Sep 26, 2023
06c1e07
Refactor dependency resolution by abstracting early and late resolution
gliargovas Sep 26, 2023
2f9372c
Remove redundant arg
gliargovas Sep 26, 2023
929070b
Use most recent pash changes
gliargovas Sep 26, 2023
509b6aa
Merge pull request #59 from binpash/improve-benchmarks-and-reporting
gliargovas Sep 26, 2023
52b0d90
Merge branch 'improve-benchmarks-and-reporting' into future
gliargovas Sep 26, 2023
5a73bd6
Merge branch 'main' into future
gliargovas Sep 26, 2023
8935a84
Resolve merge conflicts with main correctly
gliargovas Sep 26, 2023
90 changes: 51 additions & 39 deletions README.md
@@ -1,66 +1,78 @@
## Dynamic Parallelizer
## hs README

A dynamic parallelizer that optimistically/speculatively executes everything in a script in parallel, and ensures correctness by tracing the execution and re-executing the parts that were erroneous.
### Overview

## Installing
`hs` is a system for executing shell scripts out of order. It achieves this by tracing the script's execution; if an error arises due to speculative execution, `hs` re-executes the necessary parts to ensure correct outcomes. The project aims to boost the parallel execution of shell scripts, reducing their runtime and enhancing efficiency.

```sh
./scripts/install_deps_ubuntu20.sh
```
### Structure

## Tests
The project's top-level directory contains the following:

To run the tests:
```sh
cd test
./test_orch.sh
```
- `deps`: Dependencies required by `hs`.
- `docs`: Documentation and architectural diagrams.
- `model-checking`: Tools and utilities for model checking.
- `parallel-orch`: Main orchestration components.
- `pash-spec.sh`: Entry script to initiate the `hs` process.
- `README.md`: This documentation file.
- `report`: Generated reports related to test runs and performance metrics.
- `requirements.txt`: List of Python dependencies.
- `Rikerfile`: Configuration file for Riker.

### TODO Items
### Installation

#### Complete control flow and complex script support
Install `hs` on your Linux-based machine by following these steps:

Extend the architecture to support complete scripts and not just partial order graphs of commands.
**Note:** Currently works with `Ubuntu 20.04` or later

A potential solution is shown below:
1. Navigate to the project directory:
```sh
cd path_to/dynamic-parallelizer
```

![Architecture Diagram](/docs/handdrawn_architecture.jpeg)
2. Run the installation script:
```sh
./scripts/install_deps_ubuntu20.sh
```

This solution includes a preprocessor that creates two executable artifacts:
- the preprocessed/instrumented script (similar to what the PaSh-JIT preprocessor produces)
- the partial program order graph (a graph of commands that will be speculated and executed with tracing from the orchestrator)
This script will handle all the necessary installations, including dependencies, `try`, Riker, and PaSh.

The graph might contain unexpanded commands, so the orchestrator should support unexpanded strings.
For these commands, the orchestrator can speculate on the values of the unexpanded strings; when they become the frontier (i.e., the preprocessed script has reached them), their actual values are known, and the speculation can be confirmed or aborted.
### Running `hs`

The two executors communicate with each other and progress through the script execution in tandem. The JIT executor (left) also needs to trace execution to inform the orchestrator about changes in the environment.
The main entry script to initiate `hs` is `pash-spec.sh`. This script sets up the necessary environment and invokes the orchestrator in `parallel-orch/orch.py`. It's designed to accept a variety of arguments to customize its behavior, such as setting debug levels or specifying log files.

#### Orchestrator: Partial Program Order Graph
Example of running the script:

**Note:** we have moved to a continuous scheduling implementation. An example explaining its operation can be found [here](/docs/example.md).
```bash
./pash-spec.sh [arguments] script_to_speculatively_run.sh
```

The orchestrator needs to support arbitrary partial program order graphs (instead of just sequences of instructions), to figure out the precise real program order dependencies.
**Arguments**:

- `-d, --debug-level`: Set the debugging level. Default is `0`.
- `-f, --log_file`: Define the logging output file. By default, logs are printed to stdout.
- `--sandbox-killing`: Kill any running overlay instances before committing to the lower layer.
- `--env-check-all-nodes-on-wait`: On a wait, check for environment changes between the current node and all other waiting nodes. (not fully functional yet!)

An instance of a graph is shown below:
### Testing

![Example Partial Program Order Graph](/docs/handdrawn_partial_program_order.jpeg)
To run the provided tests:

One important characteristic of the graph (and the speculative execution algorithm) is that there is a committed, prefix-closed part that has already executed and cannot be affected.
The rest of the graph is uncommitted and therefore might or might not have completed execution. The uncommitted frontier, the part of the graph adjacent to the prefix, is guaranteed to execute and complete without speculation (since we have both the environment and the variables resolved), and this is part of the argument for the termination of the algorithm. At every step the orchestration takes, it can always commit the uncommitted frontier, and therefore the committed prefix grows until it covers the whole graph.
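The committed-prefix argument can be made concrete with a small sketch. The following is a hypothetical illustration (not the hs implementation) of one commit step over a partial program order graph, where `edges` maps each node to its set of predecessors:

```python
# Hypothetical sketch: committing the frontier of a partial-order graph.
# A node may be committed once it has completed execution and all of its
# predecessors are already committed.
def commit_step(edges, committed, completed):
    newly = {n for n, preds in edges.items()
             if n not in committed
             and n in completed
             and preds <= committed}
    return committed | newly

edges = {'a': set(), 'b': {'a'}, 'c': {'a'}, 'd': {'b', 'c'}}
completed = {'a', 'b', 'c', 'd'}
committed = set()
# Iterating commit_step grows the committed prefix until it covers the graph.
for _ in range(len(edges)):
    committed = commit_step(edges, committed, completed)
```

Because the frontier is always committable, the prefix grows monotonically, which mirrors the termination argument above.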
```bash
./test/test_orch.sh
```

#### Orchestrator: Backward dependencies and Execution Isolation/Aborting/Reverting
For in-depth analysis, set the `DEBUG` environment variable to `2` for detailed logs and redirect logs to a file:

How do we resolve backward dependencies? For example:
```sh
grep foo in1 > out1
grep bar in0 > in1 ## Its write might affect the first command exec.
```

```bash
DEBUG=2 ./test/test_orch.sh 2>logs.txt
```

One solution would be to run the non-frontier (non-root) commands in an isolated environment and only at the end of their execution commit their results. This might have significant overhead, except if we can just write to temporary files and then move them? Or let them work in a temporary directory?
### Contributing and Further Development

Contributions are always welcome! The project roadmap includes extending the architecture to support complete scripts, optimizing the scheduler for better performance, etc.

Another way would be to dynamically track writes of non-frontier commands and stop them when they try to write to something that might be a read dependency of the first, but there are timing issues here that I don't see how to resolve.
For a detailed description of possible optimizations, see the [related issues](https://github.com/binpash/dynamic-parallelizer/issues?q=is%3Aopen+is%3Aissue+label%3Aoptimization).

#### Commands that change current directory
### License

Can we actually trace that and not run these commands? Is that simply a change of an environment variable? They will run in a forked version anyway, but we want to see their results.
`hs` is licensed under the MIT License. See the `LICENSE` file for more information.
2 changes: 1 addition & 1 deletion deps/pash
Submodule pash updated 2 files
+9 −0 compiler/pash.py
+2 −1 requirements.txt
64 changes: 42 additions & 22 deletions parallel-orch/analysis.py
@@ -29,47 +29,67 @@ def parse_shell_to_asts(input_script_path) -> "list[AstNode]":
    except libdash.parser.ParsingException as e:
        logging.error(f'Parsing error: {e}')
        exit(1)


def validate_node(ast) -> bool:
    assert(isinstance(ast, (CommandNode, PipeNode)))
    if isinstance(ast, CommandNode):
        return True
    else:
        for cmd in ast.items:
            assert isinstance(cmd, CommandNode)
        return True

## Returns true if the script is safe to speculate and execute outside
## of the original shell context.
##
## The script is not safe if it might contain a shell primitive. Therefore
## the analysis checks if the command in question is one of the underlying
## shell's primitives (in our case bash) and if so returns False
def safe_to_execute(asts: "list[AstNode]", variables: dict) -> bool:
    ## There should always be a single AST per node and it must be a command
    assert(len(asts) == 1)
    ast = asts[0]
    assert(isinstance(ast, CommandNode))
    logging.debug(f'Ast in question: {ast}')

def is_node_safe(node: CommandNode, variables: dict) -> bool:
    ## Expand and check whether the asts contain
    ## a command substitution or a primitive.
    ## If so, then we need to tell the original script to execute the command.

    ## Expand the command argument
    cmd_arg = node.arguments[0]
    exp_state = expand.ExpansionState(variables)
    ## TODO: Catch exceptions around here
    expanded_cmd_arg = expand.expand_arg(cmd_arg, exp_state)
    cmd_str = string_of_arg(expanded_cmd_arg)
    logging.debug(f'Expanded command argument: {expanded_cmd_arg} (str: "{cmd_str}")')

    ## TODO: Determine if the ast contains a command substitution and if so
    ## run it in the original script.
    ## In the future, we should be able to perform stateful expansion too,
    ## and properly execute and trace command substitutions.

    ## KK 2023-05-26 We need to keep in mind that whenever we execute something
    ##               in the original shell, then we cannot speculate anything
    ##               after it, because we cannot track read-write dependencies
    ##               in the original shell.

    if cmd_str in BASH_PRIMITIVES:
        return False

    return True


def is_pipe_node_safe_to_execute(node: PipeNode, variables: dict) -> bool:
    for cmd in node.items:
        logging.debug(f'Ast in question: {cmd}')
        if not is_node_safe(cmd, variables):
            return False
    return True

## Returns true if the script is safe to speculate and execute outside
## of the original shell context.
##
## The script is not safe if it might contain a shell primitive. Therefore
## the analysis checks if the command in question is one of the underlying
## shell's primitives (in our case bash) and if so returns False
def safe_to_execute(asts: "list[AstNode]", variables: dict) -> bool:
    ## There should always be a single AST per node and it must be a command
    assert(len(asts) == 1)
    if isinstance(asts[0], PipeNode):
        return is_pipe_node_safe_to_execute(asts[0], variables)
    else:
        assert(isinstance(asts[0], CommandNode))
        logging.debug(f'Ast in question: {asts[0]}')
        return is_node_safe(asts[0], variables)

## TODO: Determine if the ast contains a command substitution and if so
## run it in the original script.
## In the future, we should be able to perform stateful expansion too,
## and properly execute and trace command substitutions.


BASH_PRIMITIVES = ["break",
                   "continue",
                   "return"]
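At its core, the safety analysis above reduces to a set-membership test on the expanded command name. A minimal stand-alone sketch, assuming expansion has already produced a plain command string (`is_command_string_safe` is a hypothetical name, not part of this PR):

```python
# Hypothetical sketch of the primitive check in analysis.py.
# Control-flow primitives must run in the original shell, so a command
# whose expanded name is one of them is not safe to speculate on.
BASH_PRIMITIVES = ["break", "continue", "return"]

def is_command_string_safe(cmd_str: str) -> bool:
    return cmd_str not in BASH_PRIMITIVES
```

For a pipeline, this check is simply applied to every command in the pipe, which is what `is_pipe_node_safe_to_execute` does in the real code.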
13 changes: 9 additions & 4 deletions parallel-orch/config.py
@@ -1,6 +1,7 @@
import os
import subprocess
import logging
import time


## TODO: Figure out how logging here plays out together with the log() in PaSh
@@ -27,17 +28,21 @@ def log_root(msg, *args, **kwargs):


## Ensure that PASH_SPEC_TMP_PREFIX is set by pa.sh
assert(not os.getenv('PASH_SPEC_TMP_PREFIX') is None)
PASH_SPEC_TMP_PREFIX = os.getenv('PASH_SPEC_TMP_PREFIX')

SOCKET_BUF_SIZE = 8192

SCHEDULER_SOCKET = os.getenv('PASH_SPEC_SCHEDULER_SOCKET')

MAX_KILL_ATTEMPTS = 10 # Define a maximum number of kill attempts for each process in the partial program order

INSIGNIFICANT_VARS = {'PWD', 'OLDPWD', 'SHLVL', 'PASH_SPEC_TMP_PREFIX', 'PASH_SPEC_SCHEDULER_SOCKET', 'PASH_SPEC_TOP',
'PASH_TOP', 'PASH_TOP_LEVEL','RANDOM', 'LOGNAME', 'MACHTYPE', 'MOTD_SHOWN', 'OPTERR', 'OPTIND',
'PPID', 'PROMPT_COMMAND', 'PS4', 'SHELL', 'SHELLOPTS', 'SHLVL', 'TERM', 'UID', 'USER', 'XDG_SESSION_ID'}

SIGNIFICANT_VARS = {'foo', 'bar', 'baz'}
SIGNIFICANT_VARS = {'foo', 'bar', 'baz', 'file1', 'file2', 'file3', 'file4', 'file5', 'LC_ALL', 'nchars', 'filename'}

START_TIME = time.time()

NAMED_TIMESTAMPS = {}

SANDBOX_KILLING = False
SPECULATE_IMMEDIATELY = False
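The `INSIGNIFICANT_VARS` set suggests how environment diffs can be filtered when deciding whether a node's environment has meaningfully changed. A minimal sketch of such filtering (the helper name is hypothetical, not part of this PR):

```python
# Hypothetical sketch: compute environment changes, ignoring variables
# (like PWD or SHLVL) whose changes should not trigger re-execution.
INSIGNIFICANT_VARS = {'PWD', 'OLDPWD', 'SHLVL', 'RANDOM'}

def significant_env_changes(before: dict, after: dict) -> dict:
    return {k: v for k, v in after.items()
            if before.get(k) != v and k not in INSIGNIFICANT_VARS}
```

An empty result means the two environments agree on everything the scheduler cares about, so speculated work downstream can stand.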
14 changes: 7 additions & 7 deletions parallel-orch/executor.py
@@ -12,18 +12,18 @@ def async_run_and_trace_command_return_trace(command, node_id, latest_env_file,
    trace_file = util.ptempfile()
    stdout_file = util.ptempfile()
    stderr_file = util.ptempfile()
    post_execution_env_file = util.ptempfile()
    logging.debug(f'Scheduler: Stdout file for: {node_id} is: {stdout_file}')
    logging.debug(f'Scheduler: Stderr file for: {node_id} is: {stderr_file}')
    logging.debug(f'Scheduler: Trace file for: {node_id}: {trace_file}')
    process = async_run_and_trace_command_return_trace_in_sandbox(command, trace_file, node_id, stdout_file, stderr_file, latest_env_file, post_execution_env_file, speculate_mode)
    return process, trace_file, stdout_file, stderr_file, post_execution_env_file

def async_run_and_trace_command_return_trace_in_sandbox_speculate(command, node_id, latest_env_file):
    process, trace_file, stdout_file, stderr_file, post_execution_env_file = async_run_and_trace_command_return_trace(command, node_id, latest_env_file, speculate_mode=True)
    return process, trace_file, stdout_file, stderr_file, post_execution_env_file

def async_run_and_trace_command_return_trace_in_sandbox(command, trace_file, node_id, stdout_file, stderr_file, latest_env_file, post_execution_env_file, speculate_mode=False):
    ## Call Riker to execute the command
    run_script = f'{config.PASH_SPEC_TOP}/parallel-orch/run_command.sh'
    args = ["/bin/bash", run_script, command, trace_file, stdout_file, latest_env_file]
@@ -32,7 +32,7 @@ def async_run_and_trace_command_return_trace_in_sandbox(command, trace_file, nod
    else:
        args.append("standard")
    args.append(str(node_id))
    args.append(post_execution_env_file)
    # Save output to temporary files to not saturate the memory
    logging.debug(args)
    process = subprocess.Popen(args, stdout=None, stderr=None)
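The executor's core pattern, launching a command asynchronously with its streams redirected to temporary files so large outputs do not sit in memory, can be sketched independently of Riker (all names here are hypothetical):

```python
import subprocess
import sys
import tempfile

# Hypothetical sketch of the executor pattern: launch a process without
# blocking, sending stdout/stderr to temp files instead of pipes.
def async_run_to_tempfiles(args):
    out = tempfile.NamedTemporaryFile(delete=False)
    err = tempfile.NamedTemporaryFile(delete=False)
    proc = subprocess.Popen(args, stdout=out, stderr=err)
    return proc, out.name, err.name

proc, out_path, err_path = async_run_to_tempfiles(
    [sys.executable, "-c", "print('hello')"])
proc.wait()
with open(out_path) as f:
    output = f.read().strip()
```

The caller keeps only the process handle and file paths, which is also what lets the scheduler later inspect or kill speculated work without holding its output.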