Rainbow CSV has 2 main features:
- Highlights csv columns in different rainbow colors.
- Provides SELECT and UPDATE queries in RBQL: SQL-like transprogramming query language.
There are 2 ways to enable csv columns highlighting:
- CSV autodetection based on file content. File extension doesn't have to be .csv or .tsv
- Manual CSV delimiter selection with :RainbowDelim command (So you can use rainbow_csv for non-table files, e.g. to highlight function arguments in different colors)
To run an RBQL query either press F5 or enter the query in vim command line e.g. :Select a1, a2
The demo table is demo/movies.tsv. There are also some other test datasets in python/test_datasets. In this demo python expressions were used, but JavaScript is also available.
RBQL is a technology which provides SQL-like language that supports SELECT and UPDATE queries with Python or JavaScript expressions.
- Use Python or Java Script expressions inside SELECT, UPDATE, WHERE and ORDER BY statements
- Output entries appear in the same order as in input unless ORDER BY is provided.
- Input csv/tsv table may contain varying number of entries (but select query must be written in a way that prevents output of missing values)
- Result set of any query immediately becomes a first-class table on it's own.
- SELECT [ TOP N ] [ DISTINCT [ COUNT ] ]
- UPDATE [ SET ]
- WHERE
- ORDER BY ... [ DESC | ASC ]
- [ [ STRICT ] LEFT | INNER ] JOIN
All keywords have the same meaning as in SQL queries. You can check them online But there are also two new keywords: DISTINCT COUNT and STRICT LEFT JOIN:
- DISTINCT COUNT is like DISTINCT, but adds a new column to the "distinct" result set: number of occurences of the entry, similar to uniq -c unix command.
- STRICT LEFT JOIN is like LEFT JOIN, but generates an error if any key in left table "A" doesn't have exactly one matching key in the right table "B".
Some other rules:
- UPDATE SET is synonym to UPDATE, because in RBQL there is no need to specify the source table.
- UPDATE has the same semantic as in SQL, but it is actually a special type of SELECT query.
- JOIN statements must have the following form: <JOIN_KEYWORD> (/path/to/table.tsv | table_name ) ON ai == bj
Variable Name | Variable Type | Variable Description |
---|---|---|
a1, a2,..., a{N} | string | Value of i-th column |
b1, b2,..., b{N} | string | Value of i-th column in join table B |
NR | integer | Line number (1-based) |
NF | integer | Number of fields in line |
select top 100 a1, int(a2) * 10, len(a4) where a1 == "Buy" order by int(a2)
select * order by random.random()
- random sort, this is an equivalent of bash command sort -R
select top 100 a1, a2 * 10, a4.length where a1 == "Buy" order by parseInt(a2)
select * order by Math.random()
- random sort, this is an equivalent of bash command sort -R
You can use rainbow highlighting and RBQL even for non-csv/tsv files. E.g. you can highlight records in log files, one-line xmls and other delimited records. You can even highlight function arguments in your programming language using comma as a delimiter for :RainbowDelim command. And you can always turn off the rainbow highlighting using :NoRainbowDelim command.
Here is an example of how to extract some fields from a bunch of uniform single-line xmls:
Key | Action |
---|---|
<Leader>d (\d) | Print info about current column (under the cursor) |
F5 | Start query editing for the current csv file |
F5 | Execute currently edited query |
Allows to enter RBQL select query as vim command. e.g. :Select a1, a2 order by a1
Allows to enter RBQL update query as vim command. e.g. :Update a1 = a1 + " " + a2
Mark current file as a table and highlight it's columns in rainbow colors. Character under the cursor will be used as a delimiter. The delimiter will be saved in a config file for future vim sessions.
You can also use this command for non-csv files, e.g. to highlight function arguments in source code in different colors. To return back to original syntax highlighting run :NoRainbowDelim
Same as :RainbowDelim but allows delimiters inside fields if the field is double quoted by rules of Excel / RFC 4180
Mark the current file as rainbow table with a single column without delimiters. You will be able to run RBQL queries on it using a1 column variable.
This command will disable rainbow columns highlighting for the current file. Use it to cancel :RainbowDelim, :RainbowDelimQuoted and :RainbowMonoColumn effects or when autodection mechanism has failed and marked non-table file as a table
Assign any name to the current table. You can use this name in join operation instead of the table path. E.g.
JOIN customers ON a1 == b1
intead of:
JOIN /path/to/my/customers/table ON a1 == b1
Default: tsv Allowed values: tsv, csv
Format of RBQL result set tables.
- tsv format doesn't allow quoted tabs inside fields.
- csv is Excel-compatible and allows quoted commas.
Essentially format here is a pair: delimiter + quoting policy. This setting for example can be used to convert files between tsv and csv format:
- To convert csv to tsv: 1. open csv file. 2.
:let g:rbql_output_format='tsv'
3.:Select *
- To convert tsv to csv: 1. open tsv file. 2.
:let g:rbql_output_format='csv'
3.:Select *
Default: python
Scripting language to use in RBQL expression. Either 'js' or 'python' To use JavaScript add let g:rbql_meta_language = 'js' to .vimrc
Default: ["\t", ","]
By default plugin checks only TAB and comma characters during autodetection stage. You can override this variable to autodetect tables with other separators. e.g. let g:rcsv_delimiters = ["\t", ",", ";"]
csv autodetection mechanism can be disabled by setting this variable value to 1. Manual delimiter selection would still be possible.
Default: 30
Autodetection will fail if buffer has more than g:rcsv_max_columns columns. You can increase or decrease this limit.
Rainbow csv allows you to create a special "header" file for any of your table files. It must have the same name as the table file but with ".header" suffix (e.g. for "table.tsv" table the header file is "table.tsv.header"). The only purpose of header file is to provide csv column names for \d key.
It is also possible to use :RainbowSetHeader <file_name>
command to set a differently named file as a header for the current table.
Install with your favorite plugin manager.
If you want to use RBQL with JavaScript expressions, make sure you have Node.js installed
Python module rbql.py parses RBQL query, creates a new python worker module, then imports and executes it.
select top 20 len(a1) / 10, a2 where a2 in ["car", "plane", "boat"]
- use Python's "in" to emulate SQL's "in"update set a3 = 'US' where a3.find('of America') != -1
select * where NR <= 10
- this is an equivalent of bash command "head -n 10", NR is 1-based')select a1, a4
- this is an equivalent of bash command "cut -f 1,4"select * order by int(a2) desc
- this is an equivalent of bash command "sort -k2,2 -r -n"select NR, *
- enumerate lines, NR is 1-basedselect * where re.match(".*ab.*", a1) is not None
- select entries where first column has "ab" patternselect a1, b1, b2 inner join ./countries.txt on a2 == b1 order by a1, a3
- an example of join queryselect distinct count len(a1) where a2 != 'US'
select top 20 a1.length / 10, a2 where ["car", "plane", "boat"].indexOf(a2) > -1
update set a3 = 'US' where a3.indexOf('of America') != -1
select * where NR <= 10
- this is an equivalent of bash command "head -n 10", NR is 1-based')select a1, a4
- this is an equivalent of bash command "cut -f 1,4"select * order by parseInt(a2) desc
- this is an equivalent of bash command "sort -k2,2 -r -n"select * order by Math.random()
- random sort, this is an equivalent of bash command "sort -R"select NR, *
- enumerate lines, NR is 1-basedselect a1, b1, b2 inner join ./countries.txt on a2 == b1 order by a1, a3
- an example of join queryselect distinct count a1.length where a2 != 'US'
rainbow_csv comes with cli_rbql.py script which is located in ~/.vim extension folder.
You can use it in standalone mode to execute RBQL queries from command line. Example:
./cli_rbql.py --query "select a1, a2 order by a1" < input.tsv
To find out more about cli_rbql.py and available options, execute:
./cli_rbql.py -h