Commit

split up readme
untitaker committed Jun 25, 2023
1 parent 11b7686 commit 0682125
Showing 4 changed files with 134 additions and 120 deletions.
123 changes: 4 additions & 119 deletions README.md
@@ -29,130 +29,15 @@ Check [the website](https://untitaker.github.io/spacemod/) for installation opti

</div>

## Matching modes
## Matching Modes

By default, your pattern is interpreted as a regular RE2-style regex using the
[regex](https://docs.rs/regex/) crate.
By default, you use regexes to replace text. See [Matching
modes](./docs/matching.md) for the alternative modes that `spacemod` supports.

`spacemod` provides two flags to change that:

* `-S` to use parenthesis-matching and implicit whitespace, a.k.a "space mode".
* `-F` to interpret the pattern as a literal string to find, same as `-F` in
fastmod.

`-S` requires further explanation. Let's motivate it with an example:

```rust
vec!["foo".to_string(), "bar".to_string(), "baz".to_string()]
```

You get the terrible idea of changing all of your `x.to_string()` calls to
`String::from(x)` instead.

In `fastmod`, or any regex-based search-and-replace, you'd write:

```bash
fastmod '"(.*)"\.to_string\(\)' 'String::from("$1")'
```

You forgot about the greediness of `.*` and should've maybe written `.*?`
instead. Now your code looks like:

```rust
vec![String::from(String::from(String::from("foo"), "bar"), "baz")]
```

Let's try that again. `spacemod` lets you write:

```bash
spacemod -S '" (.*) " \.to_string ( )' 'String::from("$1")'
```

The correct end result looks like:

```rust
vec![String::from("foo"), String::from("bar"), String::from("baz")]
```

Spacing out parentheses and quotes tells spacemod that those tokens, `""` and `()`:

* must be balanced wherever they appear in a match
* can have surrounding whitespace (including newlines)
* are literals (fewer backslashes!)

As a result of implicit whitespace, spacemod would have also found all matches
in the following code:

```rust
"
foo".to_string(
)

"foo"
.to_string()
```

### Syntax reference

`spacemod`'s pattern syntax consists of tokens delimited by a single space each.
A token can either be a parenthesis/quote, or a substring of a regex:

```
{ regex1 } regex2 { regex3 }
```

If you need to use a literal space anywhere in a regex substring, escape it as
`\ `. That also means escaping a regex like `\ ` as `\\ `. Backslashes followed
by anything but a space within a regex do not need to be escaped.

`spacemod` knows the following parentheses/quotes:

* `{}`
* `[]`
* `()`
* `<>`
* `"", ``, ''`. Since start and end are the same, this basically
devolves into asserting there is an even number of tokens.

You can extend this list with `-p ab` where `a` is the opening parenthesis, and
`b` the closing counterpart. See `--help` for more information.

## Alternatives

You may find the following tools useful if spacemod is doing too much or too
little for you. The primary focus of this list is on editor/IDE-independent
tools, and preferably on those which can be composed into more complex
shell-scripts.

* `spacemod` is heavily inspired by
[fastmod](https://github.com/facebookincubator/fastmod) and was specifically
built to deal with shortcomings of regex. `fastmod` is much faster.

* [`ast-grep`](https://ast-grep.github.io/) is a very easy to use AST-based
code search tool.

* [`comby`](https://comby.dev/) actually has grammars built-in for various
filetypes to understand what wildcards are supposed to match in which
contexts. It appears to be "a step up" from `spacemod` the same way
`spacemod` syntax is a step up from regex. That goes for both expressiveness
and complexity.

* [`codemod`](https://github.com/facebook/codemod) is another search-and-replace tool that has
language-specific knowledge. It supports both basic regex replacements and
more sophisticated transformations written in Python.

* [`spatch`](https://github.com/facebookarchive/pfff/wiki/Spatch) from the
PFFF-suite appears to be very similar to `comby`.

### Other tools in the same space

* [Beyond Grep](https://beyondgrep.com/more-tools/) has a table of
(regex-based) *text search* tools. The best one IMO is
[`ripgrep`](https://github.com/BurntSushi/ripgrep).

* [`semgrep`](https://github.com/returntocorp/semgrep) is a search tool with a
bit of semantic knowledge, but also general text matching abilities that go
beyond regular expressions.
There are many tools like `spacemod`, some of which may suit your needs better. Take a look at [Alternatives](./docs/alternatives.md).

<div class="oranda-hide">

38 changes: 38 additions & 0 deletions docs/alternatives.md
@@ -0,0 +1,38 @@

# Alternatives

You may find the following tools useful if spacemod is doing too much or too
little for you. The primary focus of this list is on editor/IDE-independent
tools, and preferably on those which can be composed into more complex
shell-scripts.

* `spacemod` is heavily inspired by
[fastmod](https://github.com/facebookincubator/fastmod) and was specifically
built to deal with shortcomings of regex. `fastmod` is much faster.

* [`ast-grep`](https://ast-grep.github.io/) is a very easy to use AST-based
code search tool.

* [`comby`](https://comby.dev/) actually has grammars built-in for various
filetypes to understand what wildcards are supposed to match in which
contexts. It appears to be "a step up" from `spacemod` the same way
`spacemod` syntax is a step up from regex. That goes for both expressiveness
and complexity.

* [`codemod`](https://github.com/facebook/codemod) is another search-and-replace tool that has
language-specific knowledge. It supports both basic regex replacements and
more sophisticated transformations written in Python.

* [`spatch`](https://github.com/facebookarchive/pfff/wiki/Spatch) from the
PFFF-suite appears to be very similar to `comby`.

## Other tools in the same space

* [Beyond Grep](https://beyondgrep.com/more-tools/) has a table of
(regex-based) *text search* tools. The best one IMO is
[`ripgrep`](https://github.com/BurntSushi/ripgrep).

* [`semgrep`](https://github.com/returntocorp/semgrep) is a search tool with a
bit of semantic knowledge, but also general text matching abilities that go
beyond regular expressions.

87 changes: 87 additions & 0 deletions docs/matching.md
@@ -0,0 +1,87 @@
# Matching modes

By default, your pattern is interpreted as a regular RE2-style regex using the
[regex](https://docs.rs/regex/) crate.

`spacemod` provides two flags to change that:

* `-S` to use parenthesis-matching and implicit whitespace, a.k.a "space mode".
* `-F` to interpret the pattern as a literal string to find, same as `-F` in
fastmod.

`-S` requires further explanation. Let's motivate it with an example:

```rust
vec!["foo".to_string(), "bar".to_string(), "baz".to_string()]
```

You get the terrible idea of changing all of your `x.to_string()` calls to
`String::from(x)` instead.

In `fastmod`, or any regex-based search-and-replace, you'd write:

```bash
fastmod '"(.*)"\.to_string\(\)' 'String::from("$1")'
```

You forgot about the greediness of `.*` and should've maybe written `.*?`
instead. Now your code looks like:

```rust
vec![String::from(String::from(String::from("foo"), "bar"), "baz")]
```
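
To see where the greedy match goes wrong, here is a minimal sketch that mimics the capture group of both patterns using only plain string search from the Rust standard library. It is not how `fastmod` or the `regex` crate work internally; the helper names `greedy_capture` and `lazy_capture` are made up for illustration, and the input is assumed to be a single line.

```rust
/// The literal text that must follow the captured group: `"` + `.to_string()`.
const SUFFIX: &str = "\".to_string()";

/// Mimics the capture of `"(.*)"\.to_string\(\)`:
/// from the first quote up to the LAST occurrence of the suffix.
fn greedy_capture(line: &str) -> Option<&str> {
    let open = line.find('"')?;
    let close = line.rfind(SUFFIX)?;
    if close > open {
        Some(&line[open + 1..close])
    } else {
        None
    }
}

/// Mimics the capture of `"(.*?)"\.to_string\(\)`:
/// from the first quote up to the NEAREST occurrence of the suffix.
fn lazy_capture(line: &str) -> Option<&str> {
    let open = line.find('"')?;
    let close = open + 1 + line[open + 1..].find(SUFFIX)?;
    Some(&line[open + 1..close])
}

fn main() {
    let line = r#"vec!["foo".to_string(), "bar".to_string(), "baz".to_string()]"#;
    // The greedy capture swallows everything between the first and last string.
    println!("greedy: {:?}", greedy_capture(line));
    // The lazy capture stops at the first closing quote followed by the suffix.
    println!("lazy:   {:?}", lazy_capture(line));
}
```

The greedy capture spans two complete `.to_string()` calls, which is why the replacement ends up wrapping more than one string literal.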

Let's try that again. `spacemod` lets you write:

```bash
spacemod -S '" (.*) " \.to_string ( )' 'String::from("$1")'
```

The correct end result looks like:

```rust
vec![String::from("foo"), String::from("bar"), String::from("baz")]
```

Spacing out parentheses and quotes tells spacemod that those tokens, `""` and `()`:

* must be balanced wherever they appear in a match
* can have surrounding whitespace (including newlines)
* are literals (fewer backslashes!)

As a result of implicit whitespace, spacemod would have also found all matches
in the following code:

```rust
"
foo".to_string(
)

"foo"
.to_string()
```

## Syntax reference

`spacemod`'s pattern syntax consists of tokens delimited by a single space each.
A token can either be a parenthesis/quote, or a substring of a regex:

```
{ regex1 } regex2 { regex3 }
```

If you need to use a literal space anywhere in a regex substring, escape it as
`\ `. That also means escaping a regex like `\ ` as `\\ `. Backslashes followed
by anything but a space within a regex do not need to be escaped.
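
The escaping rule above can be sketched as a small tokenizer. This is an illustration of the rule as described here, assuming that a backslash directly followed by a space is the only escape that matters; it is not `spacemod`'s actual parser, and the function name `tokenize` is hypothetical.

```rust
/// Split a space-mode pattern into tokens on unescaped spaces.
/// A backslash directly followed by a space yields a literal space;
/// every other backslash is passed through unchanged.
fn tokenize(pattern: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    let mut chars = pattern.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\\' if chars.peek() == Some(&' ') => {
                chars.next(); // consume the escaped space
                current.push(' ');
            }
            // An unescaped space ends the current token.
            ' ' => tokens.push(std::mem::take(&mut current)),
            _ => current.push(c),
        }
    }
    tokens.push(current);
    tokens
}

fn main() {
    // The pattern from the example earlier in this document.
    for token in tokenize(r#"" (.*) " \.to_string ( )"#) {
        println!("{token:?}");
    }
}
```

Under this reading, `\\ ` in a pattern becomes the regex `\ `: the first backslash is not followed by a space and is kept, while the second one escapes the space.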

`spacemod` knows the following parentheses/quotes:

* `{}`
* `[]`
* `()`
* `<>`
* `"", ``, ''`. Since start and end are the same, this basically
devolves into asserting there is an even number of tokens.

You can extend this list with `-p ab` where `a` is the opening parenthesis, and
`b` the closing counterpart. See `--help` for more information.
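
As a rough illustration of what "balanced" means for the pairs listed above, the following sketch checks a piece of text against a configurable list of pairs. It is an assumption-laden approximation, not `spacemod`'s implementation: `is_balanced` and `DEFAULT_PAIRS` are hypothetical names, asymmetric pairs are tracked with a stack, and symmetric pairs such as quotes are only checked for an even count, as described above.

```rust
use std::collections::HashMap;

/// The pairs listed in this document, as (open, close).
const DEFAULT_PAIRS: [(char, char); 7] = [
    ('{', '}'),
    ('[', ']'),
    ('(', ')'),
    ('<', '>'),
    ('"', '"'),
    ('`', '`'),
    ('\'', '\''),
];

/// Returns true if every asymmetric pair nests correctly and every
/// symmetric pair (open == close) occurs an even number of times.
fn is_balanced(text: &str, pairs: &[(char, char)]) -> bool {
    let mut expected_closers = Vec::new();
    let mut symmetric_counts: HashMap<char, usize> = HashMap::new();
    for c in text.chars() {
        for &(open, close) in pairs {
            if open == close {
                if c == open {
                    *symmetric_counts.entry(c).or_insert(0) += 1;
                }
            } else if c == open {
                expected_closers.push(close);
            } else if c == close && expected_closers.pop() != Some(c) {
                return false; // closer without a matching opener
            }
        }
    }
    expected_closers.is_empty() && symmetric_counts.values().all(|n| n % 2 == 0)
}

fn main() {
    println!("{}", is_balanced(r#"vec!["foo".to_string()]"#, &DEFAULT_PAIRS));
    // Extending the list, loosely analogous to passing an extra pair via `-p`.
    let mut pairs = DEFAULT_PAIRS.to_vec();
    pairs.push(('«', '»'));
    println!("{}", is_balanced("«x»", &pairs));
}
```

Passing the pair list as a parameter mirrors the idea of extending the defaults, without claiming anything about how `-p` is parsed.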
6 changes: 5 additions & 1 deletion oranda.json
@@ -11,7 +11,11 @@
}
},
"build": {
"path_prefix": "spacemod"
"path_prefix": "spacemod",
"additional_pages": {
"Matching modes": "./docs/matching.md",
"Alternatives": "./docs/alternatives.md"
}
},
"styles": {
"theme": "hacker",