Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli: add basic context insensitive tab completion for the CLI #45186

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

rohany
Copy link
Contributor

@rohany rohany commented Feb 18, 2020

This PR adds basic context insensitive tab completion for
the CLI. It maintains a store of keywords, variables
and identifiers and uses these to match against unfinished
tokens in the input stream.

Work with @Azhng

Release note (cli change): Add basic tab completion for the SQL CLI.

@rohany rohany requested a review from knz February 18, 2020 23:04
@rohany rohany requested a review from a team as a code owner February 18, 2020 23:04
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rohany
Copy link
Contributor Author

rohany commented Feb 19, 2020

Alright, I'm having some trouble adding the dependency I want to vendor... Any tips?

I've added the package to gopkg.toml/gopkg.lock, and I've made a commit in vendored that has the depedency. How do I point cockroach at that commit?

@knz
Copy link
Contributor

knz commented Feb 19, 2020

I've added the package to gopkg.toml/gopkg.lock, and I've made a commit in vendored that has the depedency. How do I point cockroach at that commit?

  1. give your commit in vendored a branch name and push it upstream
  2. git add vendor in the crdb repo to link it

@rohany
Copy link
Contributor Author

rohany commented Feb 19, 2020

Awesome, thanks!

@rohany rohany force-pushed the tab-complete branch 5 times, most recently from c209ba9 to 917847c Compare February 19, 2020 18:18
@rohany
Copy link
Contributor Author

rohany commented Feb 19, 2020

I was able to get tab suggestions to work when there were multiple completions, but only on linux (not in this PR). Does it make sense to merge something that would only add the feature on linux not mac?

@knz
Copy link
Contributor

knz commented Feb 19, 2020

If you don't mind I'd prefer that hackathon PRs do not get merged until they go the full cycle of review (and after we iron out the most glaring UX issues).

@rohany
Copy link
Contributor Author

rohany commented Feb 19, 2020

Yes of course -- I've been polishing up for review. I just wanted to know what the philosophy was on differing support on different OS's.

(also, not saying this is ready to be merged -- theres most likely lots of room for improvement).

@knz
Copy link
Contributor

knz commented Feb 19, 2020

It's OK if a feature doesn't work in environment X while it works fine on environment Y; however consider:

  • usually X = windows and Y = everything else
  • we like Y to be "the things that most Developers use"

Considering that macOS is a pretty popular dev platform, that means you might want to either work on the user messaging (a copious release note) or iterate on the code a bit more.

@rohany
Copy link
Contributor Author

rohany commented Feb 19, 2020

That makes sense.

Considering that macOS is a pretty popular dev platform, that means you might want to either work on the user messaging (a copious release note) or iterate on the code a bit more.

We can spend some more time, but attempts to sidestep the libedit bug on mac haven't led to much success.

This PR adds basic context insensitive tab completion for
the CLI. It maintains a store of keywords, variables
and identifiers and uses these to match against unfinished
tokens in the input stream.

Release note (cli change): Add basic tab completion for the SQL CLI.
@rohany
Copy link
Contributor Author

rohany commented Mar 9, 2020

cc @knz can you take a look at this and estimate what it would need added if we wanted to get this into 20.1? (if it should be included at all)

@knz
Copy link
Contributor

knz commented Mar 14, 2020

Sorry for dropping the ball on this.
I think it is too late in the 20.1 cycle to add this, also we'd probably want to increase the test coverage.

Let's pick it back up in a month from now.

@knz knz added this to To do in DB Server & Security via automation Mar 14, 2020
@knz knz moved this from To do to 20.2 draft in DB Server & Security Mar 14, 2020
@awoods187
Copy link
Contributor

@knz out of curiosity, what makes it too late in the cycle? is this risky?

@knz
Copy link
Contributor

knz commented Mar 16, 2020

Yes there is risk.

  • The feature does not work consistently across supported platforms. This will raise eyebrows, and more importantly, questions in support because we don't have a good way to explain/document this difference yet.

  • The feature fails to detect new table names after CREATE TABLE or new column names after ALTER TABLE ADD COLUMN; nor does it properly recognize names that disappear after DROP. This will raise more eyebrows (= support burden).

  • It works by querying then loading all the SQL schema of the current db client-side in RAM. This would cause hard-to-troubleshoot slowness and possibly OOM situations when an operator unsuspectedly connects the SQL shell to a large db.

In an ideal world we'd operate as follows:

  • iterate on this a bit more to work harder at feature parity across supported platforms.

  • make the algorithm work properly with a subset of the data:

    1. only populate the data in RAM the first time it's needed, instead of unconditionally.
    2. make the thing detect DDL properly and refresh names as necessary. This could leverage the newly added support for the notice protocol, using dedicated messages sent by the server to the client upon request by our CLI shell.
    3. when entering select <tab> with no table known, auto-complete a table name from the current db first, make the user choose a table, before engaging with column name completion (so that the code does not need to load all column names in RAM upfront, and only the table names from the current db)
    4. optionally, look up tables from other db if the tab key is pressed after a "db." prefix and "db" is a valid database name.
    5. if the query is already syntactically well-formed with an hypothetical identifier at the current position, ask the query planner for a list of valid identifiers in that context.

Of these latter steps, steps (1) - (4) are the required polish for the feature as-presented. Only step (5) can be seen as an architectural step further towards a more feature-complete tab completion and thus could be pushed to a v2.

@awoods187
Copy link
Contributor

Thanks @knz! Let's push this to early 20.2 then!

@knz knz added this to In progress in SQL CLI (demo + sql + workload + dump) via automation Apr 8, 2020
@knz knz removed this from 20.2 draft in DB Server & Security Apr 8, 2020
@rohany
Copy link
Contributor Author

rohany commented May 6, 2020

I don't have the bandwidth right now to fix all of these comments, but @RichardJCai had an idea: as a starting step, would it be useful to just have tab completion of our keywords?

@knz
Copy link
Contributor

knz commented Jun 12, 2020

would it be useful to just have tab completion of our keywords?

That certainly would be a step forward. Maybe a PM would have an educated opinion?

@tbg tbg added the X-noremind Bots won't notify about PRs with X-noremind label May 6, 2021
@knz knz removed the X-noremind Bots won't notify about PRs with X-noremind label May 6, 2021
@knz knz marked this pull request as draft May 6, 2021 10:43
@awoods187
Copy link
Contributor

That certainly would be a step forward. Maybe a PM would have an educated opinion?

Yes, I think that would be valuable as an intermediary step. It would be great to build toward full tab completion.

@fantapop
Copy link
Collaborator

fantapop commented Nov 10, 2021

Any movement on this would be great but my humble opinion is that keywords are the thing I need tab completion on the least because as a developer using the sql console I already know them all and they're usually a lot shorter than table names.

@RichardJCai
Copy link
Contributor

Any movement on this would be great but my humble opinion is that keywords are the thing I need tab completion on the least because as a developer using the sql console I already know them all and they're usually a lot shorter than table names.

Might be reviving this or maybe reviving it using https://microsoft.github.io/language-server-protocol/ for breather week.

craig bot pushed a commit that referenced this pull request Dec 6, 2022
87606: sql: new heuristic-based completion engine r=rytaft,ZhouXing19 a=knz

This PR extends the server-side completion logic available under SHOW COMPLETIONS.

Fixes #37355.
Intended for use together with #86457.

The statement now returns 5 columns: completion, category, description, start, end.
This change is backward-compatible with the CLI code in previous versions, which was not checking the number of columns.

It works roughly as follows:

1. first the input is scanned into a token array.

2. then the token array is translated to a *sketch*: a string with the
   same length as the token array, where each character corresponds to an
   input token and an extra character indicating where the completion was
   requested.

3. then the completion rules are applied in sequence. Each rule
   inspects the sketch (and/or the token sequence) and decides whether
   to do nothing (skip to next rule), or to run some SQL which will
   return a batch of completion candidates.

   Each method operates exclusively on the token sequence, and does
   not require the overall SQL syntax to be valid or complete.

Also, the completion engine executes in a streaming fashion, so that
there is no need to accumulate all the completion candidates in RAM
before returning them to the client. This prevents excessive
memory usage server-side, and offers the client an option to cancel
the search once enough suggestions have been received.

Code organization:

- new package `sql/compengine`: the core completion engine, also
  where sketches are computed.

  Suggested reading: `api.go` in the new package.

- new package `sql/comprules`: the actual completion
  methods/heuristics, where sketches are mapped into SQL queries
  that provide the completion candidates.

  Suggested reading: the test cases under `testdata`.

Some more words about sketches:

For example, `SHOW COMPLETIONS AT OFFSET 10 FOR 'select myc from
sc.mytable;'` produces the sketch `"ii'ii.i;"` where `i` indicates an
identifier-like token and `'` indicates the cursor position.

The purpose of the sketch is to simplify the pattern matching
performed by the completion heuristics, and enables the application of
regular expressions on the token sequence.

For example, a heuristic to complete schema-qualified relations in
the current database, or a database-qualified relation, would use
the regexp `[^.;]i\.(['_]|i')`: an identifier not following a
period or a semicolon, followed by a period, followed by the
completion cursor (with the cursor either directly after the period
or on an identifier after the period).

Supersedes #45186.



93158: dev: fix bep temporary dir path r=rickystewart a=healthy-pod

Release note: None
Epic: none

Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Co-authored-by: healthy-pod <ahmad@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants