Add DKVPX file format #2002

Merged: johnkerl merged 6 commits into main from johnkerl/dkvpx on Mar 3, 2026

johnkerl (Owner) commented Mar 3, 2026

Resolves issue #266.

This is like Miller's original DKVP except it handles quoted keys and/or values, like CSV does.

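DKVP does a naïve split on = and , with no regard for quotes, so quoted fields are cut apart and any piece without an = is keyed by its field position: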

$ echo 'a=1,"b,c"=2,d="3,4",e' | mlr -i dkvp -o json cat
[
{
  "a": 1,
  "2": "\"b",
  "c\"": 2,
  "d": "\"3",
  "5": "4\"",
  "6": "e"
}
]

But DKVPX handles these:

$ echo 'a=1,"b,c"=2,d="3,4",e' | mlr -i dkvpx -o json cat
[
{
  "a": 1,
  "b,c": 2,
  "d": "3,4",
  "4": "e"
}
]
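
The difference between the two outputs above can be sketched in a few lines of Python. This is an illustrative sketch only, not Miller's actual (Go) implementation: the function names `split_naive`, `split_quote_aware`, `split_outside_quotes`, and `unquote` are hypothetical, type inference is omitted (so values stay strings), and escaped quotes inside quoted fields are not handled. It assumes, as the outputs above suggest, that a field without a = is keyed by its 1-based position.

```python
def split_naive(line, ifs=",", ips="="):
    """DKVP-style split: cut on every separator, ignoring quotes."""
    record = {}
    for position, field in enumerate(line.split(ifs), start=1):
        key, sep, value = field.partition(ips)
        if sep:
            record[key] = value
        else:
            # No pair separator: key the field by its 1-based position.
            record[str(position)] = field
    return record


def split_outside_quotes(text, sep):
    """Split text on sep, but only where sep is outside double quotes."""
    pieces, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == sep and not in_quotes:
            pieces.append("".join(current))
            current = []
        else:
            current.append(ch)
    pieces.append("".join(current))
    return pieces


def unquote(s):
    """Strip one pair of enclosing double quotes, if present."""
    if len(s) >= 2 and s[0] == '"' and s[-1] == '"':
        return s[1:-1]
    return s


def split_quote_aware(line, ifs=",", ips="="):
    """DKVPX-style split (sketch): separators inside quotes are literal."""
    record = {}
    for position, field in enumerate(split_outside_quotes(line, ifs), start=1):
        parts = split_outside_quotes(field, ips)
        if len(parts) == 1:
            record[str(position)] = unquote(parts[0])
        else:
            # Anything after the first unquoted pair separator is the value.
            record[unquote(parts[0])] = unquote(ips.join(parts[1:]))
    return record


line = 'a=1,"b,c"=2,d="3,4",e'
print(split_naive(line))
print(split_quote_aware(line))
```

The quote-aware version has to walk the line character by character and track quote state, which is one plausible reason a quote-aware reader costs more than a plain split.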

Performance

Performance numbers, and why I'm not making DKVPX replace the existing DKVP:

repeat 10 justtime mlr -i dkvp  nothing ~/data/big.dkvp
repeat 10 justtime mlr -i dkvp  cat     ~/data/big.dkvp
repeat 10 justtime mlr -i dkvpx nothing ~/data/big.dkvp
repeat 10 justtime mlr -i dkvpx cat     ~/data/big.dkvp

where big.dkvp is a million-line DKVP file with no quoting on keys or values.

#dkvp-r dkvp-rw dkvpx-r dkvpx-rw
1.102   1.165   1.567   1.650
1.103   1.150   1.579   1.652
1.093   1.151   1.577   1.677
1.096   1.150   1.707   1.660
1.146   1.152   1.602   1.659
1.170   1.139   1.591   1.665
1.106   1.151   1.590   1.665
1.101   1.146   1.583   1.665
1.112   1.145   1.587   1.717
1.113   1.144   1.584   1.677
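
To summarize the table, here is a small Python sketch that averages each column and computes the DKVPX-to-DKVP ratios. The timings are copied from the table above and are presumably seconds per run; the helper name `column_means` is hypothetical. On these numbers, DKVPX comes out roughly 40 to 45 percent slower than DKVP on unquoted input.

```python
from statistics import mean

# Timings copied verbatim from the table above (one row per run).
TIMINGS = """\
#dkvp-r dkvp-rw dkvpx-r dkvpx-rw
1.102   1.165   1.567   1.650
1.103   1.150   1.579   1.652
1.093   1.151   1.577   1.677
1.096   1.150   1.707   1.660
1.146   1.152   1.602   1.659
1.170   1.139   1.591   1.665
1.106   1.151   1.590   1.665
1.101   1.146   1.583   1.665
1.112   1.145   1.587   1.717
1.113   1.144   1.584   1.677
"""


def column_means(table):
    """Return {column name: mean of that column} for a whitespace table."""
    lines = table.strip().splitlines()
    names = lines[0].lstrip("#").split()
    rows = [[float(x) for x in row.split()] for row in lines[1:]]
    return {name: mean(col) for name, col in zip(names, zip(*rows))}


means = column_means(TIMINGS)
read_ratio = means["dkvpx-r"] / means["dkvp-r"]          # read only ("nothing")
read_write_ratio = means["dkvpx-rw"] / means["dkvp-rw"]  # read and write ("cat")

for name, value in means.items():
    print(f"{name:9s} mean {value:.3f}")
print(f"read ratio       {read_ratio:.2f}x")
print(f"read+write ratio {read_write_ratio:.2f}x")
```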

johnkerl merged commit 102c624 into main on Mar 3, 2026 (7 checks passed)
johnkerl deleted the johnkerl/dkvpx branch on March 3, 2026 03:35
@aborruso
Contributor

Thank you very much!
