warn log ALL invalid file paths in .infile-list files
so users who use .infile-list can inspect the log to see where all the invalid file paths are and not do it piecemeal
jqnatividad committed Mar 9, 2024
1 parent 20a45c8 commit 2650930
Showing 2 changed files with 30 additions and 3 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -259,7 +259,7 @@ Click [here](https://docs.rs/file-format/latest/file_format/#reader-features) fo

The `cat`, `headers`, `sqlp` & `to` commands have extended input support (🗄️). If the input is `-` or empty, the command will try to use stdin as input. If it's not, it will check if its a directory, and if so, add all the files in the directory as input files.

-If its a file, it will first check if it has an `.infile-list` extension. If it does, it will load the text file and parse each line as an input file path. This is a much faster and convenient way to process a large number of input files, without having to pass them all as separate command-line arguments. Further, the file paths can be anywhere in the file system, even on separate volumes. If an input file path is not fully qualified, it will be treated as relative to the current working directory. Empty lines and lines starting with `#` are ignored.
+If its a file, it will first check if it has an `.infile-list` extension. If it does, it will load the text file and parse each line as an input file path. This is a much faster and convenient way to process a large number of input files, without having to pass them all as separate command-line arguments. Further, the file paths can be anywhere in the file system, even on separate volumes. If an input file path is not fully qualified, it will be treated as relative to the current working directory. Empty lines and lines starting with `#` are ignored. Invalid file paths will be logged as warnings and skipped.

For both directory and `.infile-list` input, snappy compressed files with a `.sz` extension will be automatically decompressed.
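
The `.infile-list` rules described in the README hunk above (skip empty lines and `#` comments, treat relative paths as relative to the current working directory, skip paths that do not exist) can be illustrated with a minimal standalone sketch. The function name `partition_infile_list` is hypothetical and is not part of the commit; it returns the missing paths instead of logging them, so the behaviour is easy to inspect.

```rust
use std::path::PathBuf;

/// Hypothetical helper, not taken from the commit: splits the text of an
/// `.infile-list` file into (existing paths, missing paths).
/// Empty lines and lines starting with `#` are ignored. Relative paths are
/// checked against the current working directory, as `Path::exists` does.
fn partition_infile_list(contents: &str) -> (Vec<PathBuf>, Vec<PathBuf>) {
    contents
        .lines()
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(PathBuf::from)
        // `partition` keeps every entry: `true` goes left, `false` goes right
        .partition(|path| path.exists())
}
```

A caller could then warn once per entry in the second vector and pass only the first vector on as input files.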

Expand Down
31 changes: 29 additions & 2 deletions src/util.rs
@@ -1496,12 +1496,39 @@ pub fn process_input(
 {
     let mut input_file = std::fs::File::open(input_path)?;
     let mut input_file_contents = String::new();
+    let mut canonical_invalid_path = PathBuf::new();
+    let mut invalid_files = 0_u32;
     input_file.read_to_string(&mut input_file_contents)?;
-    input_file_contents
+    let infile_list_vec = input_file_contents
         .lines()
         .filter(|line| !line.is_empty() && !line.starts_with('#'))
         .map(PathBuf::from)
-        .collect::<Vec<_>>()
+        .filter_map(|path| {
+            if path.exists() {
+                Some(path)
+            } else {
+                // note that we're warn logging if files do not exist for
+                // each line in the infile-list file
+                // even though we're returning an error on the FIRST file that
+                // doesn't exist in the next section. This is because
+                // we want to log ALL the invalid file paths in the infile-list
+                // file, not just the first one.
+                invalid_files += 1;
+                canonical_invalid_path = path.canonicalize().unwrap_or_default();
+                log::warn!(
+                    ".infile-list file '{}': '{}' does not exist",
+                    path.display(),
+                    canonical_invalid_path.display()
+                );
+                None
+            }
+        })
+        .collect::<Vec<_>>();
+    log::info!(
+        ".infile-list file parsed. Filecount - valid:{} invalid:{invalid_files}",
+        infile_list_vec.len()
+    );
+    infile_list_vec
 } else {
     // if the input is not an ".infile-list" file, add the file to the input
     arg_input
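
One detail of the hunk above is worth a note. `Path::canonicalize` resolves a path against the file system and returns an error when the path does not exist, so inside the `else` branch `unwrap_or_default()` will most likely produce an empty `PathBuf`, and the second `'{}'` in the warning would print as empty. If the intent is to show where the missing file was expected, one possible alternative is to join the path onto the current working directory without touching the file system. This is only a sketch: the helper name `resolve_for_log` is hypothetical and does not appear in the commit.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper, not taken from the commit: builds an absolute path
/// for a log message without requiring the file to exist.
/// `PathBuf::join` returns `path` unchanged when `path` is already absolute;
/// no symlink resolution or `..` normalisation is attempted.
fn resolve_for_log(path: &Path) -> io::Result<PathBuf> {
    Ok(std::env::current_dir()?.join(path))
}
```

The trade-off is that the result is not normalised the way a canonical path would be, but it stays meaningful for files that are missing.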