Add concat option "--del-header". #258

Closed
4 tasks done
derekmahar opened this issue Nov 6, 2023 · 10 comments

Comments

@derekmahar

derekmahar commented Nov 6, 2023

Please consider adding option --del-header (or similar) to csvtk concat. Consider the scenario of using xargs to concatenate a large number of CSV files:

find . -type f -name "*.csv" |
  sort |
  head -n 10000 |
  (read -r first
   csvtk concat "$first"
   xargs csvtk del-header) |
  (read -r first
   echo "$first"
   count=1
   while read -r line
   do
     if ((count <= 10 || count == 10000))
     then
       echo "$line"
     fi
     count=$((count + 1))
   done)

Output:

Column
1
2
3
4
5
6
7
8
9
10
10000
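
The trailing while loop in the pipeline above only samples the result: it keeps the header, the first ten data rows, and data row 10000. A shorter sketch of that same sampling step, assuming a sed that accepts multiple -e expressions; the seq input is a stand-in for the concatenated CSV and is not part of the original report:

```shell
# Stand-in for the concatenated CSV: one header line plus 10000 data rows.
# File lines 1-11 are the header and data rows 1-10; file line 10001 is data row 10000.
sample=$( (echo Column; seq 10000) | sed -n -e '1,11p' -e '10001p' )
printf '%s\n' "$sample"
```

This prints the same twelve lines as the output listed above.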

If csvtk concat had option --del-header, we could replace xargs csvtk del-header with xargs csvtk concat --del-header:

find . -type f -name "*.csv" |
  sort |
  head -n 10000 |
  (read -r first
   csvtk concat "$first"
   xargs csvtk concat --del-header) |
  (read -r first
   echo "$first"
   count=1
   while read -r line
   do
     if ((count <= 10 || count == 10000))
     then
       echo "$line"
     fi
     count=$((count + 1))
   done)

Desired output:

Column
1
2
3
4
5
6
7
8
9
10
10000

Prerequisites

  • make sure you are using the latest version, as reported by csvtk version
$ csvtk version
0.28.1

Describe your issue

  • describe the problem
  • provide a reproducible example
@shenwei356
Owner

Actually, I do not fully understand the purpose of these commands. Can we just run:

find ./ -name "*.csv" \
    | csvtk concat --infile-list - \
    | csvtk del-header -o result.csv

@derekmahar
Author

Yes, but I think csvtk concat better describes the intent of the operation. Command csvtk concat --del-header would simply be a synonym of csvtk del-header. Another idea might be to implement csvtk --del-header which would apply to every subcommand.

@derekmahar
Author

derekmahar commented Nov 6, 2023

Reading your example again, I realised that csvtk concat --infile-list=- doesn't require xargs and csvtk del-header at all. My scenario overlooked option --infile-list because I was comparing csvtk concat to other tools like xsv and qsv that don't (yet) have an option similar to --infile-list.

@shenwei356
Owner

The option --infile-list is very useful in cases where the input file list is long.

@derekmahar
Author

Yes, I agree. This is why I've asked the maintainers of mlr and qsv to add a similar option to each of those tools.

@derekmahar
Author

I'm closing this issue because it duplicates existing behaviour.

Actually, I do not fully understand the purpose of these commands. Can we just run:

find ./ -name "*.csv" \
    | csvtk concat --infile-list - \
    | csvtk del-header -o result.csv

@mbhall88

mbhall88 commented Jan 9, 2024

I would just like to second that having a global option to not output a header would be great. del-header obviously does this, but a global option to suppress the header would simplify many of my pipelines by removing one command from each.

@shenwei356
Owner

Sounds reasonable, and it might also be helpful for others. Please create a new issue in case this one gets overlooked.

@shenwei356
Owner

Gosh, it's added. It was a lot of work. @derekmahar @mbhall88

  • add a new global flag -U, --delete-header for disabling output of the header row. Supported commands: concat, csv2tab/tab2csv, csv2xlsx/xlsx2csv, cut, filter, filter2, freq, fold/unfold, gather, fmtdate, grep, head, join, mutate, mutate2, replace, round, sample.
$ (echo a; seq 3) | csvtk head -n 3
a
1
2
3

$ (echo a; seq 3) | csvtk head -n 3 -U
1
2
3
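
Since concat is among the supported commands, the original request can presumably be met in one step by combining -U with --infile-list. A minimal sketch, assuming a csvtk release that includes the -U flag; the temporary directory and the files part1.csv and part2.csv are hypothetical sample data, not from this thread:

```shell
# Hypothetical sample files; names and contents are illustrative only.
workdir=$(mktemp -d)
printf 'a\n1\n2\n' > "$workdir/part1.csv"
printf 'a\n3\n4\n' > "$workdir/part2.csv"

if command -v csvtk >/dev/null 2>&1; then
  # Pass the sorted file list on stdin and suppress the header row with -U.
  result=$(find "$workdir" -type f -name '*.csv' | sort | csvtk concat -U --infile-list -)
else
  # Fallback so the sketch still runs on machines without csvtk.
  result='csvtk not installed'
fi
printf '%s\n' "$result"
rm -r "$workdir"
```

Sorting the list keeps the row order deterministic, as in the find | sort pipeline at the top of the thread.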

@derekmahar
Author

Thank you for implementing this feature!
