Design and refactoring / general code quality #382
First off, I think I can poke at the What I'm unsure about is the performance aspect. A quick look around makes me suspect that, although the virtual call penalty is negligible on modern desktop CPUs with indirect branch predictors, smaller CPUs, including Atom, don't have one. So until it can be measured (hopefully on several CPUs; I only have access to some i7s), I think the defensive choice is to go with regular jumps inside a function, perhaps partitioning out the different sections for readability.
Thank you for creating this ticket!
Exactly what I was thinking!
I thought about it as well, but I don't think it will be a problem. Yes, these calls will be in the "hot" part of the code, but my guess would be that we are IO-limited anyway and the virtual function call overhead will hopefully be negligible. But we should definitely measure it, yes 👍.
That would definitely also be an improvement. Anything that reduces the size of that monolithic worker thread logic.
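For reference, the "regular jumps inside a function" alternative discussed above could look roughly like the sketch below: each check is an enum variant and dispatch happens through a `match` instead of a vtable. All names here (`Entry`, `Check`, `passes_all`) are hypothetical and not taken from the fd code base, and whether this is measurably faster than trait objects would still need to be benchmarked.

```rust
// Hypothetical, simplified stand-in for a directory entry.
pub struct Entry {
    pub depth: usize,
    pub size: u64,
}

// Each variant is one check; the names are illustrative only.
pub enum Check {
    MinDepth(usize),
    MinSize(u64),
}

impl Check {
    // Dispatch is resolved by a `match` on the variant, not by a virtual call.
    pub fn passes(&self, entry: &Entry) -> bool {
        match *self {
            Check::MinDepth(min) => entry.depth >= min,
            Check::MinSize(min) => entry.size >= min,
        }
    }
}

// Returns true only if every check accepts the entry; `all` stops at the
// first failing check.
pub fn passes_all(checks: &[Check], entry: &Entry) -> bool {
    checks.iter().all(|check| check.passes(entry))
}
```

The trade-off is that adding a new check means touching the enum and the `match`, whereas a trait-based design lets each filter live in its own type.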
I did a lot of refactoring for v8.0. The whole module structure has changed, things have been moved around and renamed. The error handling is completely new. The main points listed in the first comment are still not addressed, but it should be much more pleasant to work with.
I think this would be an excellent place for new contributors to start. There are many places where some refactoring work could improve the code quality a lot. If you want to work on this, PLEASE submit small PRs that can be reviewed easily.
Another thing that would need some refactoring is the tests. It would be great to split these into multiple files. We could also think about using |
I like this idea. I'll take a stab at refactoring some tests via assert_cmd 😄
Awesome! Maybe really do just a few first. To see if that works out as a concept.
I think we can tidy up Perhaps we could refactor it to look something like this (code might not be 100% correct, but you get the general idea). In a separate file, we define our traits and filter structs:
trait Filter {
fn filter(entry: &DirEntry) -> Option<ignore::WalkState>;
}
trait MetadataFilter {
fn filter(metadata: &Metadata) -> Option<ignore::WalkState>;
}
// structs and their trait impls also go here
Then, we refactor this part of
let entry = ...;
let filters: Vec<Box<dyn Filter>> = vec![
MinDepth::new(config.min_depth),
// other filters here
];
let result = filters
.iter()
.map(|x| x.filter(entry))
.find_map(|x| x);
if let result = Some(x) {
return x;
}
let metadata = entry.metadata();
let metadata_filters: Vec<Box<dyn MetadataFilter>> = vec![
OwnerConstraint::new(config.owner_constraint),
SizeConstraint::new(config.size_constraints),
// other metadata filters here
]
return metadata_filters
.iter()
.map(|x| x.filter(entry))
.find_map(|x| x)
    .unwrap_or(ignore::WalkState::Continue);
Basically, the goal is to unify the filtering interface (like mentioned above), so that we can use the right iterator tool to find the first failing filter and exit early. What do you think?
I like that approach, and I don't think the performance issues with indirect calls mentioned above will be significant. (You could avoid
let result = filters
.iter()
.find_map(|x| x.filter(entry)); // No need for .map().find_map()
if let Some(x) = result { // This was backwards
return x;
}
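Putting the proposal and the correction together, a self-contained version of the idea might look like the sketch below. It is only an illustration under several assumptions: `WalkState` and `Entry` are local stand-ins for `ignore::WalkState` and `ignore::DirEntry` so that no external crates are needed, `MaxDepth` and `build_filters` are hypothetical, and the trait method takes `&self` (unlike the snippet above) because a method without a receiver cannot be called through `Box<dyn Filter>`.

```rust
// Local stand-in for `ignore::WalkState` (assumption, keeps the sketch std-only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WalkState {
    Continue,
    Skip,
}

// Hypothetical, simplified stand-in for `ignore::DirEntry`.
pub struct Entry {
    pub depth: usize,
}

// A filter returns `Some(state)` when it rejects the entry, `None` when it passes.
pub trait Filter {
    fn filter(&self, entry: &Entry) -> Option<WalkState>;
}

pub struct MinDepth(pub usize);
pub struct MaxDepth(pub usize); // hypothetical second filter for illustration

impl Filter for MinDepth {
    fn filter(&self, entry: &Entry) -> Option<WalkState> {
        // Too shallow: ignore this entry but keep walking.
        if entry.depth < self.0 { Some(WalkState::Continue) } else { None }
    }
}

impl Filter for MaxDepth {
    fn filter(&self, entry: &Entry) -> Option<WalkState> {
        // Too deep: do not descend any further.
        if entry.depth > self.0 { Some(WalkState::Skip) } else { None }
    }
}

// Builds the filter list; the explicit cast fixes the element type of the vector.
pub fn build_filters(min_depth: usize, max_depth: usize) -> Vec<Box<dyn Filter>> {
    vec![
        Box::new(MinDepth(min_depth)) as Box<dyn Filter>,
        Box::new(MaxDepth(max_depth)),
    ]
}

// `find_map` stops at the first filter that returns `Some`, i.e. the first rejection.
pub fn first_rejection(filters: &[Box<dyn Filter>], entry: &Entry) -> Option<WalkState> {
    filters.iter().find_map(|f| f.filter(entry))
}
```

A caller would then write `if let Some(state) = first_rejection(&filters, &entry) { return state; }` and fall through to the metadata filters otherwise.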
I can never remember the In any case, can I start working on a PR or should we wait for more opinions?
If you want to cook up a PR that would be great! That way we can see what the improvement really looks like.
Is this still open? If so, I can start working on it. 😄
I believe it is |
Has anybody considered the idea of refactoring this into being essentially two different projects:
The reason I ask is that, prior to finding out that this existed, I was working on my own multi-threaded copy tool. Part of its implementation was a multi-threaded file scanning capability, along with globbing, path expansion, etc., but the only part I've completed is a very primitive version of the scan function, with nothing else. Now that I've seen this, it would make more sense to fork it and just implement the copy functionality (along with progress tracking) on top. My current work is here: https://github.com/1Dragoon/fcp (It's a slow-moving project, and was my motivation for recently extending indicatif's functionality to include stateful tracking.) I was going to do something similar for move, delete, etc., though with the added twist of detecting file locks and prompting the user to either close the programs that have those files open, or offering to close them for the user, prompting for interactive privilege escalation if needed (on Windows at least, where credentials can be entered out of band, thus obviating security concerns). That work can be found here if anybody is interested:
This was brought up before in #203. I don't think it's completely out of the question, but I'd like to understand better what the value of fd as a library has over using the |
This is intended as a tracking issue for general code quality and design discussions. @sharkdp feel free to close this when you think quality has reached an overall satisfactory level.
To start, I'll quote a recent comment by @sharkdp from a discussion on a PR thread: