🧚 pxi-dsv is a delimiter-separated values plugin for pxi (pixie), the small, fast, and magical command-line data processor.
See the pxi GitHub repository for more details!
👌 pxi-dsv comes preinstalled in pxi. No installation is necessary. If you still want to install it, proceed as described below.
pxi-dsv is installed in ~/.pxi/ as follows:
npm install pxi-dsv
The plugin is included in ~/.pxi/index.js as follows:
const dsv = require('pxi-dsv')
module.exports = {
  plugins: [dsv],
  context: {},
  defaults: {}
}
For a much more detailed description, see the .pxi module documentation.
This plugin comes with the following pxi extensions:
Extension | Description |
---|---|
dsv deserializer | Deserializes delimiter-separated values files. The delimiter, quote, and escape characters, as well as several other options, make it very flexible. |
csv deserializer | Deserializes comma-separated values files. Follows RFC4180 for the most part. Uses dsv internally and accepts the same options. |
tsv deserializer | Deserializes tab-separated values files. Useful for processing tabular database and spreadsheet data. Uses dsv internally and accepts the same options. |
ssv deserializer | Deserializes space-separated values files. Useful for processing command line output from ls, ps, and the like. Uses dsv internally and accepts the same options. |
csv serializer | Serializes JSON into CSV format. |
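To give an idea of what deserializing delimiter-separated values involves, here is a minimal sketch in plain JavaScript. It is not the plugin's actual implementation: `parseLine` and `toObject` are hypothetical helpers, and the sketch assumes that a doubled quote inside a quoted value stands for a literal quote (as in RFC4180) and that each data line becomes one JSON object keyed by the header.

```javascript
// Hypothetical helper, NOT part of pxi-dsv: splits one line into string values.
// Assumption: a doubled quote ("") inside a quoted value is a literal quote.
function parseLine (line, delimiter = ',', quote = '"') {
  const values = []
  let value = ''
  let inQuotes = false

  for (let i = 0; i < line.length; i++) {
    const ch = line[i]

    if (inQuotes) {
      if (ch === quote && line[i + 1] === quote) {
        value += quote // doubled quote becomes a literal quote
        i++
      } else if (ch === quote) {
        inQuotes = false // closing quote
      } else {
        value += ch // delimiters inside quotes are kept as data
      }
    } else if (ch === quote) {
      inQuotes = true // opening quote
    } else if (ch === delimiter) {
      values.push(value)
      value = ''
    } else {
      value += ch
    }
  }

  values.push(value)
  return values
}

// Hypothetical helper: combines a header line and a data line into one object.
function toObject (headerLine, dataLine, delimiter) {
  const keys = parseLine(headerLine, delimiter)
  const values = parseLine(dataLine, delimiter)
  const obj = {}
  keys.forEach((key, i) => { obj[key] = values[i] })
  return obj
}

console.log(toObject('id,name', '007,"Bond, James"', ','))
// { id: '007', name: 'Bond, James' }
```

Swapping the delimiter for `'\t'` or `' '` gives the tab- and space-separated variants, which mirrors how the csv, tsv, and ssv deserializers are described as reusing dsv with different settings.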
This plugin has the following limitations:
- No type casting: The deserializers do not cast strings to other data types, like numbers or booleans. This is intentional. Since different use cases need different data types, and some use cases need their integers to be strings, e.g. in case of IDs, there is no way to know for sure when to cast a string to another type. If you need different types, you may cast strings by using functions.
- Integer header order: Headers that are integers are always printed before other headers. This is an implementation detail of the way JavaScript orders object keys internally. Although this is an inconvenience, this behaviour will stay for now, since changing it would reduce performance. If you have a good way to solve this and retain performance, please let me know.
- Non-optimal tsv (de-)serializer implementations: The tsv deserializer is implemented in terms of the dsv deserializer and thus supports quotes and escaping tabs. Other implementations of tsv deserializers do not allow tabs in values and thus have no need for quotes and escapes. This means that the current tsv implementation works just fine, but an implementation without quotes should be faster. Such an implementation may come at some point in the future.
- No multi-line CSV files: The csv deserializer does not support multi-line values, i.e. values with line breaks inside quotes. In fact, no pxi deserializer could support this feature on its own, since chunking data is the chunkers' responsibility. Currently there is no dedicated chunker that supports chunking multi-line csv files, but there may be one in the future.
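The first two limitations can be illustrated in plain JavaScript. The sketch below is only an illustration: `record` and `row` are made-up objects shaped like deserialized rows, and `castFields` is a hypothetical helper, not something pxi or pxi-dsv provides. It shows one way to cast selected fields while leaving IDs as strings, and why integer headers end up first.

```javascript
// Made-up record, shaped like a deserializer's output: all values are strings.
const record = { id: '007', age: '42', active: 'true' }

// Hypothetical helper: casts only the listed fields, leaving all others untouched.
function castFields (obj, casts) {
  const result = { ...obj }
  for (const [key, cast] of Object.entries(casts)) {
    if (key in result) result[key] = cast(result[key])
  }
  return result
}

// The id stays a string, so its leading zeros survive.
const typed = castFields(record, {
  age: Number,
  active: value => value === 'true'
})
console.log(typed)
// { id: '007', age: 42, active: true }

// JavaScript enumerates integer-like keys first, in ascending order,
// regardless of the order in which they were inserted.
const row = {}
row.name = 'pixie'
row[2020] = 'b'
row[1999] = 'a'
console.log(Object.keys(row))
// [ '1999', '2020', 'name' ]
```

Casting explicitly per field keeps the decision with the user, which is the reasoning given above for not casting automatically.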
Please report issues in the tracker!
pxi-dsv is MIT licensed.