
tsv-pretty auto preamble #218

Merged: jondegenhardt merged 4 commits into eBay:master from jondegenhardt:tsv-pretty-auto-preamble on Aug 18, 2019

@jondegenhardt jondegenhardt commented Aug 18, 2019

This PR adds an option to tsv-pretty to auto-detect preamble lines. The option is --a|auto-preamble. The short form of the --preamble NUM option was changed from 'a' to 'b'.

Auto-detection of preamble lines uses a very simple heuristic: lines at the start of a file that do not contain field delimiters are considered part of the preamble. This works well when the field delimiter is TAB (the default for TSV) and the file has two or more fields, so that every data line contains at least one delimiter.
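The heuristic described above can be sketched in a few lines. This is a minimal Python illustration, not the D code in tsv-pretty.d; the function names `count_preamble_lines` and `split_preamble` are hypothetical, and the sample lines are reconstructed from the example output in this PR, assuming TAB-separated fields.

```python
def count_preamble_lines(lines, delimiter="\t"):
    """Count the leading lines that contain no field delimiter."""
    count = 0
    for line in lines:
        if delimiter in line:
            break  # first line with a delimiter ends the preamble
        count += 1
    return count


def split_preamble(lines, delimiter="\t"):
    """Return (preamble_lines, remaining_lines) using the heuristic above."""
    n = count_preamble_lines(lines, delimiter)
    return lines[:n], lines[n:]


# Assumed contents of sample.tsv, reconstructed from the PR's example output.
sample = [
    "# This file contains 4 fields: Color, Count, Height (Ht), and Weight (Wt).",
    "# Color is an alphabetic, the others are numeric.",
    "",
    "Color\tCount\tHt\tWt",
    "Brown\t106\t202.2\t1.5",
    "Canary Yellow\t7\t106\t0.761",
]

preamble, body = split_preamble(sample)
print(len(preamble))  # number of detected preamble lines
print(body[0])        # first line passed on to header detection
```

Note that a blank line counts as preamble here because it contains no delimiter, and that a single-field file would be treated entirely as preamble, which is why the heuristic suits files with two or more fields.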

Without preamble support, the initial lines are typically interpreted as a single field, which interferes with header detection and with correct interpretation of field types and widths. For example:

$ tsv-pretty -f sample.tsv
# This file contains 4 fields: Color, Count, Height (Ht), and Weight (Wt).
# Color is an alphabetic, the others are numeric.

Color                                     Count  Ht      Wt
Brown                                     106    202.2   1.5
Canary Yellow                             7      106     0.761
Chartreuse                                1139   77.02   6.22
Fluorescent Orange                        422    1141.7  7.921
Grey                                      19     140.3   1.03

sample.tsv has three preamble lines, two starting with '#' and one blank line. Turning preamble detection on corrects the output:

$ tsv-pretty -f --auto-preamble sample.tsv
# This file contains 4 fields: Color, Count, Height (Ht), and Weight (Wt).
# Color is an alphabetic, the others are numeric.

Color               Count       Ht     Wt
Brown                 106   202.20  1.500
Canary Yellow           7   106.00  0.761
Chartreuse           1139    77.02  6.220
Fluorescent Orange    422  1141.70  7.921
Grey                   19   140.30  1.030
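The effect of preamble lines on column width can be reproduced with a small sketch. The Python below uses a naive per-column maximum, which is an assumption made for illustration and not tsv-pretty's actual width or type logic; `column_widths` is a hypothetical helper, and the sample lines are reconstructed from the outputs above, assuming TAB-separated fields.

```python
def column_widths(lines, delimiter="\t"):
    """Naive per-column maximum field width over all given lines."""
    widths = []
    for line in lines:
        for i, field in enumerate(line.split(delimiter)):
            if i == len(widths):
                widths.append(0)  # first time this column index is seen
            widths[i] = max(widths[i], len(field))
    return widths


# Assumed contents of sample.tsv, reconstructed from the outputs shown above.
sample = [
    "# This file contains 4 fields: Color, Count, Height (Ht), and Weight (Wt).",
    "# Color is an alphabetic, the others are numeric.",
    "",
    "Color\tCount\tHt\tWt",
    "Brown\t106\t202.2\t1.5",
    "Canary Yellow\t7\t106\t0.761",
    "Chartreuse\t1139\t77.02\t6.22",
    "Fluorescent Orange\t422\t1141.7\t7.921",
    "Grey\t19\t140.3\t1.03",
]

with_preamble = column_widths(sample)         # preamble lines treated as data
without_preamble = column_widths(sample[3:])  # three preamble lines skipped

print(with_preamble)
print(without_preamble)
```

When the preamble lines are treated as data, each comment line is one long single field, so the first column is sized to the longest comment line. Once those lines are skipped, the first column is only as wide as the longest color name.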

The same result could already be obtained with the --preamble NUM option, which would be --preamble 3 in the example above. Auto-detection removes the need to know the number of preamble lines in advance.

@jondegenhardt jondegenhardt merged commit 1509750 into eBay:master Aug 18, 2019
@jondegenhardt jondegenhardt deleted the tsv-pretty-auto-preamble branch August 18, 2019 23:32
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.21%. Comparing base (6502a52) to head (a2311b0).
⚠️ Report is 143 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #218   +/-   ##
=======================================
  Coverage   99.21%   99.21%           
=======================================
  Files          16       16           
  Lines        4957     4972   +15     
=======================================
+ Hits         4918     4933   +15     
  Misses         39       39           
| Files with missing lines | Coverage Δ |
|---|---|
| tsv-pretty/src/tsv_utils/tsv-pretty.d | 99.32% <100.00%> (+0.02%) ⬆️ |