[API Proposal]: The Except and ExceptBy methods #110121
Comments
Tagging subscribers to this area: @dotnet/area-system-linq |
nit: For the rare cases where something like this is needed, most people can probably use the following extension method:
public static IEnumerable<TSource> WhereNotIn<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second)
{
    // Build the lookup set once, then filter without de-duplicating `first`.
    var set = new HashSet<TSource>(second);
    return first.Where(e => !set.Contains(e));
}
|
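To make the difference concrete, here is a small sketch that runs the built-in `Except` next to the `WhereNotIn` helper from the comment above. The input arrays and the `EnumerableExtensions` class name are assumptions chosen for illustration; only `WhereNotIn` itself comes from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] first = { 2, 3, 4, 5, 2, 5 };   // assumed sample data with repeats
int[] second = { 1 };

// Except is a set operation: repeated elements of `first` are collapsed.
Console.WriteLine(string.Join(",", first.Except(second)));      // 2,3,4,5

// WhereNotIn only filters, so repeated elements survive.
Console.WriteLine(string.Join(",", first.WhereNotIn(second)));  // 2,3,4,5,2,5

public static class EnumerableExtensions
{
    public static IEnumerable<TSource> WhereNotIn<TSource>(
        this IEnumerable<TSource> first, IEnumerable<TSource> second)
    {
        // The set is built eagerly; the filtering itself stays lazy.
        var set = new HashSet<TSource>(second);
        return first.Where(e => !set.Contains(e));
    }
}
```

Both calls walk `first` once and probe a hash set, so the cost is roughly the same; the only behavioral difference is whether duplicates are dropped.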
@eiriktsarpalis - This isn't complaining about the existing behavior, so not a strict duplicate? |
I’m not suggesting changing the logic of the existing method, but rather adding an overload where you can choose whether to remove duplicates. For example, it could be implemented like this:
` |
Adding boolean flags that fundamentally change the semantics of a function isn't considered good practice, and is definitely not something we've done before in LINQ. For better or for worse |
Well, yes, the SQL method is more understandable, but for C#, it's not immediately clear how to use this method. In SQL, you can use a LEFT JOIN to quickly find the difference between two tables, but in C#, it's quite difficult to achieve, especially in terms of performance. It would be great if there were some method that could work between two collections and leverage all possible optimizations to speed up the process, similar to what SQL offers. |
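For comparison, the SQL pattern of `LEFT JOIN ... WHERE right.key IS NULL` can be approximated with the existing `GroupJoin` operator. The sketch below is one possible shape, not a proposed API: `LeftAntiJoin` and `JoinExtensions` are hypothetical names, and the sample data is assumed. `GroupJoin` indexes the second sequence in a hash-based lookup once, so each element of the first sequence is matched by key rather than by scanning.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] left = { 2, 3, 4, 5, 2, 5 };   // assumed sample data
int[] right = { 3, 4 };

var onlyInLeft = left.LeftAntiJoin(right, x => x, y => y);
Console.WriteLine(string.Join(",", onlyInLeft)); // 2,5,2,5

public static class JoinExtensions
{
    // Hypothetical helper: keeps every left element that has no match on the right.
    public static IEnumerable<TLeft> LeftAntiJoin<TLeft, TRight, TKey>(
        this IEnumerable<TLeft> left,
        IEnumerable<TRight> right,
        Func<TLeft, TKey> leftKey,
        Func<TRight, TKey> rightKey)
    {
        return left
            // Pair each left element with the (possibly empty) group of right matches.
            .GroupJoin(right, leftKey, rightKey,
                (l, matches) => new { Item = l, HasMatch = matches.Any() })
            // An empty group plays the role of the NULL row in a SQL LEFT JOIN.
            .Where(x => !x.HasMatch)
            .Select(x => x.Item);
    }
}
```

Because nothing here calls `Distinct`, repeated elements of `left` are preserved in their original order.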
Without losing extra data |
This is mostly because RDBMSs put a lot more work into their dynamic optimizers than would be reasonable for C#. If you're dealing with a large enough dataset to actually affect program performance, stick it into an actual database (especially because chances are you're going to want to do more things with it).
... I really wish (the iSeries version of) DB2's |
It would be good to implement such an algorithm in C#, and then use it to separate one collection from another. It seems like a complex algorithm at first glance. https://en.wikipedia.org/wiki/Sort-merge_join

ChatGPT:

Merge Join (Merging Join)
The Merge Join is an algorithm used for efficiently executing a JOIN operation, especially when both tables are already sorted by the column(s) involved in the join. Its key advantage lies in its linear traversal of rows, making it highly performant for large, sorted datasets.

How Does Merge Join Work?
Input Requirements:
Data Comparison:

Example
Table A:

ID | Name
-- | --
1 | John
2 | Alice
3 | Bob

This behavior is crucial for accurate join results but may increase the size of the output significantly.

Advantages of Merge Join
Limitations
|
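The merge idea above can be adapted to a difference rather than a join. The following is a minimal sketch under the assumption that both inputs are already sorted by the same default ordering; `SortedDifference` and `ExceptSorted` are hypothetical names, not existing LINQ members. It walks both sequences once, needs no hash set, and keeps duplicates from the first sequence.

```csharp
using System;
using System.Collections.Generic;

int[] first = { 2, 2, 3, 4, 5, 5 };   // assumed to be sorted already
int[] second = { 3, 4 };               // assumed to be sorted already

Console.WriteLine(string.Join(",", SortedDifference.ExceptSorted(first, second))); // 2,2,5,5

public static class SortedDifference
{
    // Hypothetical helper: yields elements of `first` that do not occur in `second`.
    // Results are only meaningful when both inputs are sorted ascending.
    public static IEnumerable<T> ExceptSorted<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        var comparer = Comparer<T>.Default;
        using var right = second.GetEnumerator();
        bool hasRight = right.MoveNext();

        foreach (var item in first)
        {
            // Skip right-hand values that are smaller than the current item.
            while (hasRight && comparer.Compare(right.Current, item) < 0)
                hasRight = right.MoveNext();

            // Emit the item unless the right side is positioned on an equal value.
            if (!hasRight || comparer.Compare(right.Current, item) != 0)
                yield return item;
        }
    }
}
```

If the inputs are not sorted, the sorting step dominates the cost, and a hash-based filter is usually the simpler choice.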
Background and motivation
I propose adding an overload to the Except and ExceptBy methods that removes the elements of the second sequence from the first without also removing duplicates from the result.
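For the key-based case, the same idea can be sketched as a helper that mirrors the parameter order of `ExceptBy` (available since .NET 6) but skips the de-duplication step. `WhereNotInBy`, `KeyedExtensions`, and the sample records are assumptions for illustration, not the proposed API surface.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var people = new[]
{
    (Id: 2, Name: "Alice"),
    (Id: 3, Name: "Bob"),
    (Id: 2, Name: "Alice"),   // repeated key
    (Id: 1, Name: "John"),
};
int[] excludedIds = { 1 };

// ExceptBy yields at most one element per distinct key.
Console.WriteLine(people.ExceptBy(excludedIds, p => p.Id).Count());      // 2

// The hypothetical helper keeps every element whose key is not excluded.
Console.WriteLine(people.WhereNotInBy(excludedIds, p => p.Id).Count());  // 3

public static class KeyedExtensions
{
    public static IEnumerable<TSource> WhereNotInBy<TSource, TKey>(
        this IEnumerable<TSource> first,
        IEnumerable<TKey> second,
        Func<TSource, TKey> keySelector)
    {
        // Keys to exclude are collected once; `first` is filtered lazily.
        var excluded = new HashSet<TKey>(second);
        return first.Where(e => !excluded.Contains(keySelector(e)));
    }
}
```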
API Proposal
API Usage
Array:2,3,4,5
Array:2,3,4,5
Array:2,3,4,5,2,5
Alternative Designs
No response
Risks
No response