Missing uniq command #29
@tiago4orion @katcipis @cadicallegari I have never really used GNU uniq as it is. Whenever I wanted to find unique or duplicate lines, I used awk or grep. From the GNU uniq manual:
Adjacent lines? I think that's misguided. I may be the one who is wrong here. :-) For that to actually work it demands to use From the GNU sort manual:
Note that in either case, whether comparing all occurrences or only adjacent lines, the complexity will be O(n). Let me know what you think. |
Ok, but I do not use |
Ok, using
$ cat file.txt | uniq
1
2
1
3
1
It seems to me like a However, its duplicate option seems more logical.
$ cat file.txt | uniq -d
1
3
Only the
$ cat file.txt | uniq -D
1
1
1
3
3 |
cat file.txt | sort -u
1
2
3
In that case |
@tiago4orion
See below. |
@tiago4orion I have figured out now (I was sleepy yesterday) that you're expecting a behavior as What I think is, whether So, actually, #27 is broken from my point of view and #28 is OK. I'm sorry for the mess. I suggest we discuss everything here before moving on to code. I'm postponing any PR fixes. |
@geyslan Sorry for the late reply. Yes, I've used Can you describe, with examples, how you would like the tool to behave? Show input and output, like I did in the issue description. |
@tiago4orion OK, it's better to give examples so we can get it right. But first I want to paste a few definitions to support my point. unique from Oxford Dictionary:
unique from Online Etymology Dictionary
So we can accept that unique means something that is the only one (alone) in a set, right?
# Same input as above
$ cat > file.txt
1
1
1
2
1
3
3
1
# by default, print only unique/single/alone entries.
# empty lines are ignored
$ cat file.txt | uniq
2
# -dup print only lines that are duplicated in the input
$ cat file.txt | uniq -dup
1
3
# -every print one representation of each distinct line in the whole input.
$ cat file.txt | uniq -every
1
2
3
# -empty print empty lines also
$ cat file.txt | uniq -empty
2
$ It's a possibility to change the Cheers. |
Examples for real cases that came to mind.
|
Ok. I get your point, but we use a single word to describe a set of features related to that word; it doesn't need to be so strict. I liked your examples except the first. I cannot think of a use case for it. Can you provide a real-world example? My point is that features that don't have real-world usage (right now) should be dropped from the first design and implementation. What do you think? I'm not against developing it in the future if needed, but I think that adding complexity with no advantage at the beginning won't help. |
Ok, no problem. Can we start the implementation then, or are we missing some detail? |
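Nice. Before going back to the implementation, I would like to hear from you about this understanding: #28 (comment) and #28 (comment). Last suggestion:
// Line holds one distinct input line and where it occurs.
type Line struct {
	Text    *string
	Numbers []int // input line numbers of each occurrence
}
...
inputMap := make(map[string]Line) // keyed by line text
linesOrdered := []*Line{} |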
You can ask me why that $ cat file.txt | uniq -num
[4] 2
$ cat file.txt | uniq -dup -num
[1 2 3 6 9] 1
[7 8] 3 |
@tiago4orion @katcipis Hello guys, I made the changes that we discussed and I implemented the
Input
λ> cat input
hello
world
hello
世界
世界
世
1
3
4
日本語
4
1
Unique lines and all empty lines
λ> cat input | ./uniq -empty
world
世
3
日本語
Duplicate lines and all empty lines
λ> cat input | ./uniq -dup -empty
hello
世界
1
4
Every line representation and all empty lines
λ> cat input | ./uniq -every -empty
hello
world
世界
世
1
3
4
日本語
With
So, do you think that |
I think it should behave like any other character: one representation. |
@tiago4orion, tks. I'll change it soon. 👍 |
@tiago4orion Done! |
We need a uniq command, but how should it behave?
@katcipis @geyslan @cadicallegari
Unlike Plan 9 and GNU uniq, it should apply uniq to the entire input buffer, not only to adjacent input lines. The current implementations by @geyslan already work this way (#27, #28). Below are some test cases I expect to work:
A third option to show line numbers could be added if it does not complicate the tool.
@geyslan The current implementation does not honor these cases. I know it's not like GNU uniq, but what do you think? Does it make sense?