
Shell Complete - WIP

Sequential Keras models for command-line misprint correction and next-command prediction. The RNNs are trained on datasets of bash/zsh/fish history files gathered from GitHub.


Installation

pip3 install git+

Run the data pipeline

  • Get the list of repositories

To use the GitHub API, you need to generate a personal access token; see the GitHub help. Then run:

shcomplete repos -t token -o output.txt
  • Get the history files using Scrapy
scrapy runspider
  • Clean the dataset
shcomplete filtering -d shcomplete/data
  • Build a vocabulary of command line prefixes based on TF-DF score

Store command line prefixes in a trie data structure, using google/pygtrie. Compute the Term-Frequency Document-Frequency (TF-DF) score of each prefix and prune the trie based on this statistic to keep only the relevant prefixes. The level of noise in the resulting vocabulary depends on the threshold parameter.

shcomplete tfdf -d shcomplete/data -o vocabulary.txt
  • Build the corpus, used as input when generating batches of data
shcomplete corpus -d shcomplete/srcd -o output.txt
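The TF-DF pruning step above can be sketched as follows. This is a simplified illustration, not the shcomplete implementation: the real pipeline stores prefixes in a google/pygtrie trie, whereas this sketch uses a plain dict, and the scoring formula (term frequency times document frequency) is an assumption for demonstration.

```python
from collections import defaultdict

def prefix_scores(histories, max_len=2):
    """Score command-line prefixes across a set of history files.

    TF = total occurrences of the prefix across all histories,
    DF = number of history files containing it. The score used
    here is TF * DF (an illustrative choice; the actual
    shcomplete scoring may differ).
    """
    tf = defaultdict(int)
    df = defaultdict(set)
    for doc_id, history in enumerate(histories):
        for line in history:
            tokens = line.split()
            for n in range(1, min(max_len, len(tokens)) + 1):
                prefix = " ".join(tokens[:n])
                tf[prefix] += 1
                df[prefix].add(doc_id)
    return {p: tf[p] * len(df[p]) for p in tf}

def prune(scores, threshold):
    """Keep only prefixes whose score reaches the threshold."""
    return {p for p, s in scores.items() if s >= threshold}

histories = [
    ["git status", "git commit -m fix", "ls -la"],
    ["git status", "cd /tmp", "git push origin master"],
]
scores = prefix_scores(histories)
# "git" occurs 4 times across 2 files -> score 8
vocab = prune(scores, threshold=4)
```

Raising the threshold shrinks the vocabulary to only the most widely shared prefixes, which is the noise/coverage trade-off the threshold parameter controls.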

Train the sequential Keras models

Use the following command line interface to train the RNNs for both misprint correction and next command prediction, on the previous dataset of command line histories.

shcomplete model2correct --help
shcomplete model2predict --help
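To illustrate the next-command prediction task itself, independently of the RNN, here is a minimal bigram-frequency baseline: it counts which command line follows which in a history and suggests the most frequent follower. This is a non-neural stand-in for demonstration, not the model shcomplete trains.

```python
from collections import Counter, defaultdict

def train_bigrams(history):
    """Count, for each command line, which line follows it in the history."""
    follows = defaultdict(Counter)
    for prev, nxt in zip(history, history[1:]):
        follows[prev][nxt] += 1
    return follows

def predict_next(follows, last_command):
    """Return the most frequent follower of the last command, or None if unseen."""
    counts = follows.get(last_command)
    return counts.most_common(1)[0][0] if counts else None

history = ["git add .", "git commit", "git push",
           "git add .", "git commit", "git push"]
model = train_bigrams(history)
print(predict_next(model, "git add ."))   # -> "git commit"
```

An RNN improves on this baseline by conditioning on a longer window of previous commands rather than only the last one.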

For misprint correction, a sequential model that reached 99% accuracy on more than 1000 basic command line prefixes after 100 epochs on 4 GPUs is provided in /saved_models. If you want it to take your aliases or specific commands into account, we recommend training this model on your own history.
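The correction task can be illustrated with a simple fuzzy-matching baseline using Python's standard difflib: map a mistyped line to the closest prefix in a vocabulary. The vocabulary below is hypothetical (in practice it would come from `shcomplete tfdf`), and this edit-similarity approach is a stand-in for the trained RNN, not its implementation.

```python
import difflib

# Hypothetical vocabulary of known command prefixes.
VOCAB = ["git status", "git stash", "git push", "ls", "cd", "make"]

def correct(line, cutoff=0.6):
    """Return the closest known prefix to a mistyped command line,
    or the line unchanged if nothing is similar enough."""
    matches = difflib.get_close_matches(line, VOCAB, n=1, cutoff=cutoff)
    return matches[0] if matches else line

print(correct("git statsu"))   # -> "git status"
```

Unlike this baseline, a trained character-level model can also generalize to misprints of commands it has seen in context, which is why retraining on your own history helps with aliases.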

Usage - WIP


Apache 2.0
