Ilya Sher edited this page Feb 5, 2026 · 13 revisions

Recording Design - WIP

TODO: link here from other pages.

Background

TODO: NGS, UI main concepts (including timeline formed from interactions), links to everything.

Definitions

The definitions below apply only to this page.

"Idea"

Items marked "idea" are not part of the design but rather side notes, mainly for the author.

Timeline

Roughly corresponds to a session, see Timeline Design.

Timeline Item

An element in a timeline. For an external program, for example, it would include the (parsed) output of that program whenever possible.

Object

An Object is a data structure typically consisting of several fields, representing an entity. Objects provide information about the entities they represent and they allow operating on these entities. Entities could be resources or processes in the broad sense.

Objects reside in Timeline Items.

Examples:

  • Resources:
    • file
    • directory
    • rsync program
    • EC2 instance (fields: AWS account, AWS region, InstanceId, tags, security groups, etc)
    • ECS container
    • ECS task definition
    • IP address
  • Processes:
    • rsync ... process (typically running but can also be paused, finished, etc)
    • AWS CodeBuild build
    • AWS CodePipeline execution
    • AWS CloudFormation stack operation (creating, updating, deleting the stack)
    • GitHub Actions workflow run

Fields of an object can be "plain" information about the object (such as name or description) or reference other Objects (such as EC2 Subnet belongs to and references VPC).
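A minimal sketch of the distinction between "plain" fields and reference fields, in Python for illustration only. The `Obj` class, its `references` method, and the field names are hypothetical and not part of the design.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Obj:
    """Hypothetical stand-in for an Object: a type name plus named fields."""
    type: str
    fields: Dict[str, Any] = field(default_factory=dict)

    def references(self) -> Dict[str, "Obj"]:
        """Return only the fields whose values are other Objects."""
        return {k: v for k, v in self.fields.items() if isinstance(v, Obj)}


# An EC2 Subnet belongs to and references a VPC; its id is a "plain" field.
vpc = Obj("Vpc", {"VpcId": "vpc-1", "Name": "main"})
subnet = Obj("Subnet", {"SubnetId": "subnet-1", "Vpc": vpc})
```

Here `subnet.references()` contains only the `Vpc` field, while `SubnetId` remains plain information.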

Interactive Element

Interactive Element is a user interface element that supports any kind of interaction. Interactive Elements facilitate interacting with Objects.

Typically, an Object will have several corresponding Interactive Elements, each representing a field. Relatively simple Objects, such as IP addresses, will most likely be represented as a single Interactive Element.

Loosely speaking, Objects appear in the outputs of programs, and Objects have Interactive Elements.

Interaction Record

More precisely, a "Structured Interaction Record" is a structured data representation of a single user interaction event with the shell. Examples:

  • Entering a command
  • Mouse click on an Interactive Element
  • Hitting Enter or Space key to interact with selected Interactive Element
  • Opening context menu and selecting a command from the menu

Recording

A Recording is a group of Interaction Records. In its purpose, a Recording is similar to a script: it provides reproducibility. Recordings are created automatically as the user interacts with the shell and can be edited later, including:

  • addition of assertions
  • addition of comments
  • parametrization
  • editing of Object selection criteria (when guessed wrong), within limits

Compared to writing a regular script, the effort of creating a Recording is much smaller.

Replay

Replay executes a Recording by playing back its Interaction Records, like a user repeating the same actions. This is similar to replaying macros in other systems.

Assertion

An Assertion is a condition articulated by the user (or guessed by the shell) that must hold true at a particular point during Replay. When that point is reached and the condition does not evaluate to true, Replay cannot continue and should either stop or switch to debugging/editing mode.

TODO: example(s)
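As a rough illustration of an Assertion, here is a Python sketch that follows the shape described in the Requirements (actual value, a pattern to compare to, an error message). The names `replay_assert`, `matches` and `ReplayAssertionError` are hypothetical, and the pattern rules (regex, predicate, or equality) are an assumption, not the NGS assert() semantics.

```python
import re
from typing import Any


class ReplayAssertionError(Exception):
    """Raised when a condition does not hold at a given point of Replay."""


def matches(value: Any, pattern: Any) -> bool:
    """Assumed pattern rules: compiled regex, predicate, or plain equality."""
    if isinstance(pattern, re.Pattern):
        return isinstance(value, str) and pattern.search(value) is not None
    if callable(pattern):
        return bool(pattern(value))
    return value == pattern


def replay_assert(actual: Any, pattern: Any, message: str) -> Any:
    """Return the value on success so that the check can be chained."""
    if not matches(actual, pattern):
        raise ReplayAssertionError(f"{message} (got {actual!r})")
    return actual
```

During Replay, the caller would catch `ReplayAssertionError` and either stop or switch to debugging/editing mode.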

Adapter

An Adapter is a (presumably small) program that takes the output of an external program and extracts Objects from it.

Objects are then represented as Interactive Elements, allowing interaction with the Object. The interaction with an Interactive Element could be the default action or one of the operations in the context menu. Example: an output contains a textual representation of an IP address; it is parsed as an IP Object and presented as an Interactive Element after processing. Such an Interactive Element should offer operations for the IP Object data type, such as ping, whois, tcpdump, etc., in the context menu.

Side note. There are two big parts for an Adapter to implement: parsing the output (syntax) and giving semantic meaning to the output, that is, extracting Objects. Parsing the output of many utilities is already solved by the jc utility.
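The IP example could look roughly like the Python sketch below. `IpObject` and `extract_ip_objects` are hypothetical names, the sketch handles only IPv4, and a real Adapter would more likely start from already parsed output than from raw text.

```python
import ipaddress
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class IpObject:
    """Hypothetical IP Object, with the text span it was found in."""
    address: str
    start: int
    end: int


# Syntax: anything that looks like a dotted quad is a candidate.
_CANDIDATE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def extract_ip_objects(output: str) -> List[IpObject]:
    """Semantics: keep only the candidates that are valid IPv4 addresses."""
    found = []
    for m in _CANDIDATE.finditer(output):
        try:
            ipaddress.IPv4Address(m.group())
        except ValueError:
            continue  # looks like an IP but is not one, e.g. 999.1.1.1
        found.append(IpObject(m.group(), m.start(), m.end()))
    return found
```

The recorded span (`start`, `end`) is what would let the UI turn that part of the text into an Interactive Element.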

Action Provider

An Action Provider is a program that provides a set of commands for particular Object type(s). Actions from Action Providers are shown in the UI in relation to Objects of compatible types.

Challenges

To be usable for Replay, an Interaction Record must capture the intent (semantics) of the interaction. For example, if there is a list of EC2 instances and the user decides to stop one of them, the user is not interacting with the instance with id i-123456 as such. The user might be interacting with the instance that has the Name tag dev, or the most recently launched of the machines that have tag Role with value staging, or an instance chosen by any other unforeseen criteria. Unfortunately, these semantics are not directly available: we don't know why the user chose to interact with one Interactive Element or another.

While we want to capture the precise semantics of the interaction, it is also inappropriate to stop and ask the user. The shell must not get in the way of performing tasks. This is especially true when the user performs a one-off task (or what seems to be one-off at the moment) and doesn't care about recording and reproducibility.

TODO: The dual nature of a field value that is a reference to another Object (explain about the UI issue here).

TODO: Multiple selection

Requirements

  • Capturing interaction intent (semantics) must include:
    • Which command output was interacted with? Typically that would be the last command that was run but not always.
    • Which element of the output was interacted with?
      • For rows in a table that includes which row was selected and why and which column/field was interacted with.
      • An attempt should be made to find generic (likely recursive) heuristics for different data structures. For example, Tags field with key-value tags must be supported: interacting with a row because it has Tags column/field with tag Name with value db-1 should be supported when capturing the intent.
    • What was the command that user performed? Typically "default action" or a command from a context menu.
  • Replay (and therefore Interaction Records) should work despite wide variation of context. Example:
    • Running under another AWS account, in different region, etc
    • Run from other user's machine (after sharing)
  • Replay must support both "normal" run and step-by-step execution, where each step is confirmed before execution. (Thought: maybe distinguish read-only steps and don't require confirmation for those.)
  • Recording must support parametrization during recording and after the fact. For example: if the user performed a chain of UI interactions starting with ECS task MyTask1, it should be possible to introduce parameter task_name, use it instead of the value, and provide the value for the parameter at the beginning of Replay.
    • Support adding assertions. Ideally, assertions in the UI should be consistent with assert() implementation in NGS: it takes actual value, a pattern to compare to, and an error message to show when the value does not match the pattern.
    • Support adding loops or other means to express "these Interaction Records apply to these objects".
    • Support editing of Interaction Record guesses. Only allow editing of criteria when the resulting items or set of items is identical compared to using the guessed criteria.
      • Which program output was interacted with?
      • Which element of the output was interacted with?
  • Outputs from external programs must either:
    • Define the Objects to interact with. For this, a format or a protocol must be defined later.
    • Have a suitable Adapter, provided by either NGS or a third party, that parses the output to extract Objects.
    • In the absence of the above two, the output would be displayed as plain text, which I consider to be a communication paradigm that traces back to the telegraph.
  • Recording must be serializable.
  • Recording must be shareable.
  • Recording must support editing.
  • Tooling
    • Tools to manipulate Recordings - TBD
    • Diff-ability of Recordings - TBD
      • Idea: Pre-diff replace unique identifiers that don't matter for the comparison.
  • Clear separation between what an Object is and which operations are possible for that Object type.
    • Pluggable Adapters for parsing outputs of programs into Objects
    • Pluggable operations (TODO: need name) by Object types
  • TBD: work with collections of Objects (homogeneous or heterogeneous)
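The parametrization requirement above (MyTask1 replaced by task_name) can be sketched in Python as follows. The `{"$param": name}` placeholder and the functions `parametrize` and `bind` are assumptions made for illustration; the record layout is also hypothetical.

```python
from typing import Any, Dict


def parametrize(node: Any, value: Any, name: str) -> Any:
    """Replace every occurrence of a recorded value with a named placeholder."""
    if node == value:
        return {"$param": name}
    if isinstance(node, dict):
        return {k: parametrize(v, value, name) for k, v in node.items()}
    if isinstance(node, list):
        return [parametrize(v, value, name) for v in node]
    return node


def bind(node: Any, params: Dict[str, Any]) -> Any:
    """Substitute placeholders with values given at the beginning of Replay."""
    if isinstance(node, dict):
        if set(node) == {"$param"}:
            return params[node["$param"]]
        return {k: bind(v, params) for k, v in node.items()}
    if isinstance(node, list):
        return [bind(v, params) for v in node]
    return node


records = [{"command": "show",
            "args": {"dst": {"type": "EcsTask", "name": "MyTask1"}}}]
template = parametrize(records, "MyTask1", "task_name")
replayed = bind(template, {"task_name": "MyTask2"})
```

Because the transformation works on structured data, it can be applied after the fact to an existing Recording as well as during recording.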

Design

WIP

Recording Design

WIP

Idea: Subset of recordings can potentially be converted to / shown as IaC such as AWS CDK or AWS CloudFormation. The intended use of this shell is ad hoc operations and debugging and not to replace IaC tools though.

Idea: exporting as script?

Idea: is there any advantage to recording vs IaC?

  • TODO: Step by step replay.
  • TODO: Parametrization.

Interaction Record Design

WIP

  • The criteria by which the user selected a particular Object during the interaction are guessed when creating a new Interaction Record.
    • The assumption here is that a few relatively simple heuristics would "guess" the criteria correctly in the majority of cases. Example: among table rows, a unique value in a particular column is a good candidate (excluding a random unique ID column).
    • An Interaction Record is modifiable after the fact. The modification is limited, though, to sets of criteria that would produce the same result. TODO: elaborate.
  • TODO: Guessing Command/output selection criteria.
  • TODO: Assertions, including automatic (ex: single match), stopping/debugging on failed assertion
  • TODO: Multi-selection
  • TODO: Why it's OK to ask the user to interact with fields when selecting an Object
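The "same result" limit on editing criteria could be checked roughly as in the Python sketch below. It assumes that the output the user interacted with was kept, and it models criteria as plain predicates over rows; `select` and `edit_allowed` are hypothetical names.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Criteria = Callable[[Row], bool]


def select(rows: List[Row], criteria: Criteria) -> List[int]:
    """Indexes of the rows that satisfy the criteria."""
    return [i for i, row in enumerate(rows) if criteria(row)]


def edit_allowed(rows: List[Row], guessed: Criteria, edited: Criteria) -> bool:
    """Accept an edit only if it selects exactly what the guess selected."""
    return select(rows, guessed) == select(rows, edited)


rows = [
    {"InstanceId": "i-1", "Name": "dev", "Role": "web"},
    {"InstanceId": "i-2", "Name": "staging", "Role": "web"},
]
guessed = lambda r: r["InstanceId"] == "i-1"   # what the shell guessed
by_name = lambda r: r["Name"] == "dev"         # same row: acceptable edit
by_role = lambda r: r["Role"] == "web"         # more rows: rejected edit
```

With these rows, `by_name` is an acceptable replacement for the guess, while `by_role` is not because it selects a different set.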

Menus - Mini Design

WIP

This section is a candidate to be moved somewhere else (maybe its own page)

This section is more of a scratchpad at the moment

For the Criteria Guessing Algorithm below, a menu data model needs to be designed, at least at a high level.

Questions/Answers

  • What is the data model for each menu item?
    • Each menu item should reference a command to run when that menu item is selected.
      • Can a menu or an item be parametrized?
        • If a menu is on a resource, we already have the target parameter, referencing the Object/field on which the menu was triggered.
        • Further parametrization? Operating on multiple resources?
          • Can be from previously bookmarked resources matching the type?
          • Can be from recently shown resources matching the type?
          • Idea: for multiple arguments - click multiple nouns on the screen while composing the command
    • Can a menu item reference a resource to navigate to when that menu item is selected?
  • Differentiate between two types of menus? Navigation menu such as for an AWS service. Context menu.

General

Here we need to see whether the idea of plugins defining operations on particular data types can be made to fit menus and, if so, what the correct data model for that is. This idea, consistent with the language, is multiple dispatch. Each plugin would define my_command_N(data1: Type1, data2: Type2, ..., dataN: TypeN) (very likely with additional metadata). An Interactive Element will usually belong to an Object of type TypeN, and all commands that can handle this type (as any input argument) will be listed in the context menu for that Interactive Element.

Side note: handle when a particular type can be used as different inputs for the same command. There should be a way to specify which input this particular Object is.

Side note: type hierarchy needs to be implemented and more specific types (subtypes) should be accepted as arguments.
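A small Python sketch of building a context menu from typed commands. It is not multiple dispatch proper, only the lookup of "which commands accept this Object in which argument". All names (`Resource`, `Ec2Instance`, `SecurityGroup`, `command`, `context_menu`, and the two commands) are hypothetical.

```python
import inspect
from typing import Callable, List, Tuple


class Resource: ...
class Ec2Instance(Resource): ...
class SecurityGroup(Resource): ...


COMMANDS: List[Callable] = []


def command(fn: Callable) -> Callable:
    """Register a plugin-provided command; its annotations are the metadata."""
    COMMANDS.append(fn)
    return fn


@command
def show(dst: Resource):
    return f"show {type(dst).__name__}"


@command
def attach_security_group(instance: Ec2Instance, group: SecurityGroup):
    return "attach"


def context_menu(obj: object) -> List[Tuple[str, str]]:
    """List (command, argument) pairs where obj fits; subtypes are accepted."""
    items = []
    for fn in COMMANDS:
        for name, p in inspect.signature(fn).parameters.items():
            if isinstance(p.annotation, type) and isinstance(obj, p.annotation):
                items.append((fn.__name__, name))
    return items
```

Returning the argument name along with the command is one possible answer to the first side note: it says which input the Object would fill. The `isinstance` check covers the second side note, since an `Ec2Instance` is accepted wherever a `Resource` is expected.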

Let's assume that a menu is a particular data type. For example, MainAwsMenu.

Sorting, grouping, searching and enabled/disabled status of menu items - TBD.

Command palette - TBD.

Ideas

  • Menu should be referenceable, therefore an Object.

Alternative A:

Following the plugins idea, each plugin that wants to add an item to the menu should add my_command_n(data:MainAwsMenu).

This model does not seem to fit:

  • It's not obvious what goes into data. An empty Object just to have the right type? That's a bit odd.
  • The default action of a menu should be to display its items, whereas under our model commands would appear in the context menu.

Alternative B:

For menus, there will be a predefined method, menu_items(m:MyTypeOfMenu), for the moment. It will be called as the default action for menus and return menu item(s), which the shell will arrange and display as a menu. The issue of what goes into m is still not solved. menu_items might take an additional argument, a context of some kind.
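Alternative B might look like the following Python sketch, which uses `functools.singledispatch` as a stand-in for dispatch on the menu type. The item shape (`label`, `command`, `args`) is an assumption, and `dst` is left as `None` because how to reference the target is still open.

```python
from functools import singledispatch
from typing import Any, Dict, List


class Menu: ...
class MainAwsMenu(Menu): ...


@singledispatch
def menu_items(m: Any) -> List[Dict[str, Any]]:
    """Default: a menu type nobody contributed to has no items."""
    return []


@menu_items.register
def _(m: MainAwsMenu) -> List[Dict[str, Any]]:
    # "dst" is unresolved in the design, so it stays a placeholder here.
    return [
        {"label": service, "command": "show", "args": {"dst": None}}
        for service in ("EC2", "ECS", "RDS")
    ]
```

Note that single dispatch allows one implementation per menu type, so letting several plugins contribute items to the same menu would need something more, which is roughly where Alternative C starts.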

Side note: Maybe grouping by parameter type?

It does seem that providing menu facilities by the shell is the right thing to do.

Sub-alternative B.1 is that plugins provide structured data that defines items for menus instead of a method being called. While it has some advantages, it is way less flexible and I expect this structured data to become a mini-language by itself to accommodate different features - not a good thing.

Alternative C

Composite commands. A menu's default action calls a composite action, to which plugins contribute menu items.

Criteria Guessing Algorithm

WIP

This algorithm will be applied after a UI event. The UI event will tell which Interactive Element on the screen was interacted with. This algorithm will try to guess the criteria that the user used to select that particular Object.

For the purposes of this algorithm:

  • A Table is equivalent to an array of objects where all objects are of the same type and have the same fields. (TODO: maybe support heterogeneous tables where only some fields overlap. This can help somewhere else.)

  • A Row is equivalent to an element of an array

  • A Column is a particular field of all the elements in the array

  • A Cell is a particular field of a particular element of the array

  • Properties is a set of Key/Value pairs, typically accessed by Key.

  • E - The Interactive Element the user interacted with, providing a hint as to why the corresponding Object was selected.

  • O - The object corresponding to E.

  • F - The field corresponding to E (on the Object O).

Table

  • Guess: Field type of F is unique in the Column
  • Guess: Field value of F is unique in the Column
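The two Table guesses can be sketched in Python as below. The function name, the shape of the returned criteria, and the choice to try the broader guess (type) before the narrower one (value) are assumptions; "field type" is approximated by the Python type name.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def guess_row_criteria(rows: List[Row], row_index: int,
                       field: str) -> Optional[Dict[str, Any]]:
    """Guess why the row was selected, given the field F that was clicked."""
    column = [row[field] for row in rows]
    value = column[row_index]
    type_name = type(value).__name__

    # Guess: the field type of F is unique in the Column.
    if sum(1 for v in column if type(v).__name__ == type_name) == 1:
        return {"field": field, "type": type_name}

    # Guess: the field value of F is unique in the Column.
    if sum(1 for v in column if v == value) == 1:
        return {"field": field, "value": value}

    return None  # no guess; other heuristics would have to take over


instances = [
    {"Name": "dev", "State": "running"},
    {"Name": "staging", "State": "running"},
]
results = [{"result": 1}, {"result": ValueError("boom")}, {"result": 2}]
```

For `instances`, clicking Name in the first row yields a unique-value guess and clicking State yields no guess. For `results`, clicking the second row yields a unique-type guess: the row was probably chosen because it holds an error, not because of the particular message.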

TODO: Elaborate about what field type means here (Error the type vs the error message)

Non-Table Array

TODO: See if it can be treated as single column table.

Properties

(notes)

TODO: Make sure selecting an object by tag with given Key/Value is covered.
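One possible way to cover selection by tag, sketched in Python: look for Key/Value pairs of the selected Object that no other Object in the array has. `guess_tag_criteria` and the criteria shape are hypothetical, and the field is assumed to hold a flat Key/Value mapping.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def guess_tag_criteria(rows: List[Row], row_index: int,
                       field: str = "Tags") -> List[Dict[str, Any]]:
    """Return Key/Value pairs that single out the selected row."""
    selected = rows[row_index].get(field, {})
    guesses = []
    for key, value in selected.items():
        holders = [r for r in rows if r.get(field, {}).get(key) == value]
        if len(holders) == 1:
            guesses.append({"field": field, "key": key, "value": value})
    return guesses


databases = [
    {"InstanceId": "i-1", "Tags": {"Name": "db-1", "Role": "db"}},
    {"InstanceId": "i-2", "Tags": {"Name": "db-2", "Role": "db"}},
]
```

For the first row, only the tag Name with value db-1 is returned; Role is shared by both rows and therefore does not explain the selection.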

Thinking over Time

WIP

TODO: explain why changed from textual representation of interaction (as code) to structured data.

Ideas to be Reviewed Later

  • UI - whole Row handle for selecting multiple criteria or other special operations on the Object
  • Broader/narrower guesses (as with unique type vs unique value)
  • When assertion fails:
    • edit and re-run last few commands that led to this situation.
    • add new commands above assertion to fix the issue.

Scratchpad (Ideas)

  • What is the "same" object in context of record/replay?
  • Data Model for menus

Data Model for Interaction Record

attention to reproducibility

  • id - unique (per timeline) identifier
  • (maybe) ui - copy or reference to what the user was seeing and what they interacted with. The problem with reference is that Recording would need to bring Timeline with it. The problem with copying is volume.
    • data
    • path
  • command - Specifics TBD but should be a shell command, not a low level exec().
    • Commonly, it would be show (or alike) to reflect simple default action of left-click or hitting enter on a field.
  • args - named arguments for the command. Names of the arguments are determined by the command.
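The fields above, together with the requirement that a Recording be serializable, could be sketched in Python like this. JSON as the format, the class names, and the `dumps`/`loads` helpers are assumptions; the `ui` part is optional, matching the "(maybe)" above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class UiSnapshot:
    data: Any        # copy of what the user was seeing
    path: List[Any]  # where in that data the interaction happened


@dataclass
class InteractionRecord:
    id: str                                              # unique per timeline
    command: str                                         # shell command, e.g. "show"
    args: Dict[str, Any] = field(default_factory=dict)   # named arguments
    ui: Optional[UiSnapshot] = None


def dumps(record: InteractionRecord) -> str:
    return json.dumps(asdict(record), sort_keys=True)


def loads(text: str) -> InteractionRecord:
    raw = json.loads(text)
    ui = raw.pop("ui")
    return InteractionRecord(ui=UiSnapshot(**ui) if ui else None, **raw)


rec = InteractionRecord("ir-1", "show", {"dst": "EC2"},
                        UiSnapshot({"EC2": {}}, ["EC2"]))
```

Sorting keys on output is a small step toward diff-ability: two equal records always serialize to the same text.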

Current Situation

The UI sends an event to the shell: {"cur": {"ti": ID, "path": [...]}}. All IDs and paths/indexes need to be mapped to semantically meaningful references to fit the record/replay paradigm.

Note that using ti + path means we cannot arbitrarily update timeline items; we can only append. We need to consider giving a unique id to every Object in Timeline Items.
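To make the fragility of ti + path concrete, here is a Python sketch of resolving such an event. The timeline layout (a dict of timeline items keyed by ID) and the `resolve` function are assumptions for illustration.

```python
from typing import Any, Dict


def resolve(timeline: Dict[str, Any], event: Dict[str, Any]) -> Any:
    """Follow the event's path inside the referenced timeline item."""
    cur = event["cur"]
    node = timeline[cur["ti"]]
    for step in cur["path"]:
        node = node[step]  # dict key or list index
    return node


timeline = {"ti-7": {"Instances": [{"InstanceId": "i-1"},
                                   {"InstanceId": "i-2"}]}}
event = {"cur": {"ti": "ti-7", "path": ["Instances", 1, "InstanceId"]}}
before = resolve(timeline, event)

# A non-append update shifts the indexes, so the same event
# now points at a different Object.
timeline["ti-7"]["Instances"].insert(0, {"InstanceId": "i-0"})
after = resolve(timeline, event)
```

The positional path says where the user clicked, not what was clicked, which is why it has to be mapped to semantic criteria before it goes into an Interaction Record.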

Misc

Idea: A menu can be represented as a single Object, where each field represents a menu item.

Sample AWS services menu.

Object:

{
	"EC2": {
		"command": "show",
		"args": {
			"dst": ???
		}
	},
	"ECS": ...,
	"RDS": ...
}

# path: ['EC2']

Idea: guessing for ['STRING'] would be just that string. Make sure this decision would fit well with recursive guessing.

Sample context menu on EC2 instance, on VPC field:

{
	"show": {
		"command": "show",
		"args": {
			"dst": (TBD, the referenced VPC somehow)
		}
	},
	# what else?
}

Sample context menu on EC2 instance, security groups field:

# Note it's on all security groups together field
{
	"add": {
		"command": "show",
		"args": {
			"dst": (TBD, add security group wizard/menu) # mmm.. supposed to select on screen, start composing command
		}
	},
	"edit": {

	}

}

Context menu vs "regular" navigational menu.

Idea: Dealing with the timeline item ID can be postponed for now; the reference can be "last command blah" (which would require some standardized way of referencing commands).

Idea: later, also add which timeline was interacted with.

Idea: to allow later editing of criteria, the whole output and path must be kept so that newly manually provided criteria could be validated (confirming they would produce the same result: choosing the same object(s)).

Idea: what happens if it's not kept?

Commands

All arguments are named.

A limited number of verbs (which makes everything predictable), mapped to lower-level commands. This would, for example, unify the mess of AWS CLI commands for different services.

Commands will have associated (JSON?) schemas.

Having the code as a data structure allows many things and is reminiscent of Lisp macros.

List of commands, with parameters

  • show
    • dst - references the target.
  • create
    • parent - create under which parent container/resource/service
    • type - type of resource to create, appropriate for the parent
    • name (if applicable for the type)
  • run
    • abstract programs, including cloud processes such as build
    • as - identity to run as (ex: in cloud could be IAM identity)
    • (maybe) cwd
    • argv
    • env (for Unix processes that would be directory and environment variables)
  • kill
  • pause
  • resume
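Since all arguments are named and commands are to have schemas, a very reduced form of validation can be sketched in Python as below. The table only lists argument names taken from the list above; the `validate` function and the error wording are hypothetical, and a real schema (JSON Schema or other) would also describe types and required arguments.

```python
from typing import Any, Dict, List, Set

# Argument names per verb, as listed above. kill, pause and resume have no
# parameters listed yet, so they are left empty here (an assumption).
COMMAND_ARGS: Dict[str, Set[str]] = {
    "show": {"dst"},
    "create": {"parent", "type", "name"},
    "run": {"as", "cwd", "argv", "env"},
    "kill": set(),
    "pause": set(),
    "resume": set(),
}


def validate(command: str, args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the call looks valid."""
    if command not in COMMAND_ARGS:
        return [f"unknown command: {command}"]
    unknown = sorted(set(args) - COMMAND_ARGS[command])
    return [f"unknown argument for {command}: {a}" for a in unknown]
```

Keeping the verb list this small is what would make such a table practical to maintain.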

Arguments' values

TBD
