Removed the duplicates builtin

commit 08c57bc05c790d545412d675a3a048a6f22b2889 1 parent 4139a99
Chris O'Hara authored March 08, 2011
37  builtin/README.md
@@ -5,9 +5,9 @@ Node.io comes with several built-in modules which can be accessed through the co
 To run a built-in module, run
 
     $ node.io [MODULE] [ARGS]
-    
+
 To see usage details, run
-    
+
     $ node.io [MODULE] help
 
 ### digest
@@ -15,7 +15,7 @@ To see usage details, run
 This module calculates the hash/checksum of each element of input. Available hashes are [md5, crc32, sha1, sha256, sha512, ...]
 
 Example 1 - find the MD5 hash of a string
-    
+
     $ echo "this is a string" | node.io digest md5
        => b37e16c620c055cf8207b999e3270e9b
 
@@ -25,15 +25,15 @@ This module checks a URL's Google pagerank (rate limits apply)
 
 Example 1 - find the pagerank of mastercard.com
 
-    $ echo "mastercard.com" | node.io pagerank    
+    $ echo "mastercard.com" | node.io pagerank
        => mastercard.com,7
-       
+
 ### resolve
 
 This module provides DNS resolution utilities
 
 Example 1 - resolve domains and output "domain,ip"
-    
+
     $ node.io resolve < domains.txt
 
 Example 2 - return domains that do not resolve (potentially available)
@@ -45,9 +45,9 @@ Example 3 - return domains that do resolve
     $ node.io resolve found < domains.txt
 
 Example 4 - return unique IPs
-    
+
     $ node.io resolve ips < domains.txt
-    
+
 ### statuscode
 
 Makes a HEAD request to each URL of input and returns the status code
@@ -87,17 +87,17 @@ Example 1 - remove lines that **do not** match a filter
 Example 2 - output lines that do not match a filter (remove valid lines)
 
     $ node.io validate not [FILTER] < list.txt
-    
+
 ### eval
 
 This module evaluates an expression on each line of input and emits the result (unless the result is null)
 
 Example 1 - remove empty lines
-    
+
     $ node.io -s eval "input.trim() != '' ? input : null" < input.txt > modified.txt
-    
+
 Example 2 - convert a TSV (tab separated file) to CSV
-       
+
     $ node.io -s eval "input.split('\t').join(',')" < data.tsv > data.csv
 
 ### word_count
@@ -105,16 +105,3 @@ Example 2 - convert a TSV (tab separated file) to CSV
 This module uses map/reduce to count word occurrences in a file
 
     $ node.io word_count < input.txt
-    
-### duplicates
-
-This module can find or remove duplicates from a list
-
-Example 1 - remove duplicates from a list and output unique lines
-    
-    $ node.io duplicates < list.txt
-
-Example 2 - to output duplicate lines
-    
-    $ node.io duplicates find < list.txt
-    
43  builtin/duplicates.coffee
@@ -1,43 +0,0 @@
-usage = '''
-This module can find/remove duplicates in a list
-
-   1. To remove duplicates from a list and output unique lines:
-       $ node.io duplicates < list.txt
-
-   2. To output lines that appear more than once:
-       $ node.io duplicates find < list.txt
-'''
-
-nodeio = require 'node.io'
-
-seen_lines = []
-emitted_lines = []
-
-class RemoveDuplicates extends nodeio.JobClass
-    reduce: (lines) ->
-        for line in lines
-            if not line in seen_lines
-                @emit line
-                seen_lines.push line
-
-class FindDuplicates extends nodeio.JobClass
-    reduce: (lines) ->
-        for line in lines
-            if line in seen_lines
-                if not line in emitted_lines
-                    @emit line
-                    emitted_lines.push line
-                else
-                    seen_lines.push line
-
-class UsageDetails extends nodeio.JobClass
-    input: ->
-        @status usage
-        @exit()
-
-@class = RemoveDuplicates
-@job = {
-    remove: new RemoveDuplicates()
-    find: new FindDuplicates()
-    help: new UsageDetails()
-}
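
Note: the two behaviours the removed usage text describes (output unique lines, and output lines that appear more than once) can be approximated with standard tools. The sketch below is an assumption-laden stand-in, not part of node.io: the function names `remove_duplicates` and `find_duplicates` are hypothetical, and it relies on POSIX `awk` being available.

```shell
# Hypothetical stand-ins for the removed `duplicates` builtin (not node.io code).
# `seen` is an awk associative array keyed by the whole input line ($0).

# Like `node.io duplicates < list.txt`: print a line only the first time it appears.
remove_duplicates() {
    awk '!seen[$0]++'
}

# Like `node.io duplicates find < list.txt`: print each repeated line once,
# at its second appearance.
find_duplicates() {
    awk 'seen[$0]++ == 1'
}
```

Usage would look like `remove_duplicates < list.txt` or `find_duplicates < list.txt`. Both keep every distinct line in memory, so they suit lists that fit comfortably in RAM.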
