Commit
* Enable follow tailing
Showing 10 changed files with 185 additions and 12 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,100 @@ | ||
# Input | ||
|
||
*rare* reads the supplied inputs in parallel rather than sequentially, | ||
one file after another. In most cases, you won't need to do anything | ||
other than specify what to read. In some cases, you may want to | ||
tweak some parameters. | ||
|
||
## Input Methods | ||
|
||
### Read File(s) | ||
|
||
The simplest way to read files is to specify one or more filenames: | ||
|
||
`rare <aggregator> file1 file2 file3...` | ||
|
||
You can also use simple expansions, such as: | ||
|
||
`rare <aggregator> path/**/*.log` | ||
|
||
In this case, all `*.log` files in any nested directory under `path/` will be read. | ||
|
||
Or you can use recursion, which will read all plain files in the path: | ||
|
||
`rare <aggregator> -R path/` | ||
|
||
#### gzip | ||
|
||
If the files *may* be gzip'd, you can specify `-z`, and they will be gunzip'd if possible. If a | ||
file can't be opened as a gzip file, a warning will be logged, and it will be interpreted | ||
as a raw file. | ||
|
||
`rare <aggregator> -z *.log.gz` | ||
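
The fallback described above can be sketched in Go as follows. This is a minimal illustration, not *rare*'s actual implementation: `openMaybeGzip`, `gzipBytes`, and `readAll` are hypothetical names, and the sketch assumes a seekable input so it can rewind after a failed gzip header check.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"log"
)

// openMaybeGzip tries to read rs as gzip. If the gzip header cannot be
// parsed, it rewinds to the start and returns the raw stream instead.
// Hypothetical helper for illustration only.
func openMaybeGzip(rs io.ReadSeeker) (io.Reader, bool, error) {
	zr, err := gzip.NewReader(rs)
	if err == nil {
		return zr, true, nil
	}
	// Not gzip: the header check consumed bytes, so seek back to the start.
	if _, serr := rs.Seek(0, io.SeekStart); serr != nil {
		return nil, false, serr
	}
	return rs, false, nil
}

// gzipBytes compresses s in memory so the example needs no files on disk.
func gzipBytes(s string) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(s))
	zw.Close()
	return buf.Bytes()
}

// readAll decodes data via openMaybeGzip and reports whether it was gzip.
func readAll(data []byte) (string, bool) {
	r, wasGzip, err := openMaybeGzip(bytes.NewReader(data))
	if err != nil {
		log.Fatal(err)
	}
	if !wasGzip {
		log.Println("warning: input is not gzip, reading it as a raw file")
	}
	out, err := io.ReadAll(r)
	if err != nil {
		log.Fatal(err)
	}
	return string(out), wasGzip
}

func main() {
	inputs := [][]byte{gzipBytes("compressed line\n"), []byte("plain line\n")}
	for _, data := range inputs {
		text, wasGzip := readAll(data)
		fmt.Printf("gzip=%v content=%q\n", wasGzip, text)
	}
}
```

Rewinding is only possible on seekable inputs such as regular files; a pipe would need the header bytes buffered instead.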
|
||
### Following File(s) | ||
|
||
Like `tail -f`, following files allows you to watch files actively being written to. This is | ||
useful, for example, to read a log of an actively running application. | ||
|
||
**Note:** When following files, all files are opened at once, and the maximum number of readers (`--readers`) is ignored. | ||
|
||
`rare <aggregator> -f app.log` | ||
|
||
If the file may be deleted and recreated, such as during log rotation, you can follow with re-open: | ||
|
||
`rare <aggregator> -F app.log` | ||
|
||
#### Polling (Instead of blocking) | ||
|
||
By default, following a file uses `fsnotify`, which monitors files for changes. This should | ||
work on most major operating systems. If it does not, you can enable polling to watch for changes | ||
instead with `--poll`. | ||
|
||
#### Tailing | ||
|
||
If you wish to start reading only at the end of the file (e.g. to look only at newer entries), | ||
you can specify `-t` or `--tail` to start following at the end. | ||
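
Conceptually, starting at the end amounts to moving the read offset to the end of the file before following it. The Go sketch below shows that idea under stated assumptions: `openForFollow` and `demo` are hypothetical names, and this is not necessarily how *rare* implements `--tail`.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"os"
)

// openForFollow opens path for reading. With tail set, the read offset is
// moved to the end so only data written afterwards is seen.
// Hypothetical helper for illustration only.
func openForFollow(path string, tail bool) (*os.File, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	if tail {
		if _, err := f.Seek(0, io.SeekEnd); err != nil {
			f.Close()
			return nil, err
		}
	}
	return f, nil
}

// demo writes "old\n", opens a reader, appends "new\n", and returns
// everything the reader can see.
func demo(tail bool) string {
	tmp, err := os.CreateTemp("", "tail-sketch-")
	if err != nil {
		log.Fatal(err)
	}
	defer os.Remove(tmp.Name())
	defer tmp.Close()
	tmp.WriteString("old\n")

	f, err := openForFollow(tmp.Name(), tail)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	tmp.WriteString("new\n")
	data, err := io.ReadAll(f)
	if err != nil {
		log.Fatal(err)
	}
	return string(data)
}

func main() {
	fmt.Printf("without tail: %q\n", demo(false))
	fmt.Printf("with tail:    %q\n", demo(true))
}
```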
|
||
### Stdin/Pipe | ||
|
||
There are two ways to read from a pipe: implicit and explicit. | ||
|
||
Implicitly, if *rare* detects that its stdin is a pipe, it will read from it when no input arguments are provided: | ||
|
||
`cat file.log | rare <aggregator>` or `rare <aggregator> < file.log` | ||
|
||
Explicitly, you can pass a single input argument of `-` (dash) to force reading from stdin: | ||
|
||
`cat file.log | rare <aggregator> -` | ||
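
The implicit and explicit cases can be sketched in Go as below. This is an illustrative assumption about how such detection is commonly done (checking whether stdin is a character device), not a description of *rare*'s internals; `isPipedMode` and `shouldReadStdin` are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
)

// isPipedMode reports whether a file mode looks like piped or redirected
// input rather than an interactive terminal (a character device).
func isPipedMode(mode os.FileMode) bool {
	return mode&os.ModeCharDevice == 0
}

// shouldReadStdin decides whether input comes from stdin.
// Hypothetical decision logic for illustration only.
func shouldReadStdin(args []string, stdinIsPiped bool) bool {
	if len(args) == 1 && args[0] == "-" {
		return true // explicit: a single "-" argument
	}
	return len(args) == 0 && stdinIsPiped // implicit: no inputs given
}

func main() {
	piped := false
	if info, err := os.Stdin.Stat(); err == nil {
		piped = isPipedMode(info.Mode())
	}
	fmt.Println("read from stdin:", shouldReadStdin(os.Args[1:], piped))
}
```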
|
||
## Tweaking the Batcher | ||
|
||
There are already some heuristics that optimize how files are read which | ||
should work for most cases. If you do find you need to modify how *rare* | ||
is reading, you can tweak two things: | ||
|
||
* concurrency -- How many files are read at once | ||
* batch size -- How many lines read from a given file are "batched" to send to the expression stage | ||
|
||
### Concurrency | ||
|
||
Concurrency specifies how many files are opened at once (in a normal case). It | ||
defaults to `3`, but is ignored if following files. | ||
|
||
Specify with: | ||
|
||
`rare <aggregator> --readers=1 file1 file2 file3...` | ||
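
A reader limit like this is often implemented with a counting semaphore. The Go sketch below shows the general technique only; `readAllLimited` is a hypothetical name and the code is not taken from *rare*.

```go
package main

import (
	"fmt"
	"sync"
)

// readAllLimited calls read for every name while keeping at most
// maxReaders calls in flight. It returns the peak concurrency observed.
// Hypothetical helper for illustration only.
func readAllLimited(names []string, maxReaders int, read func(string)) int {
	sem := make(chan struct{}, maxReaders) // counting semaphore
	var wg sync.WaitGroup
	var mu sync.Mutex
	active, peak := 0, 0

	for _, name := range names {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxReaders reads are running
		go func(name string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot for the next file

			mu.Lock()
			active++
			if active > peak {
				peak = active
			}
			mu.Unlock()

			read(name)

			mu.Lock()
			active--
			mu.Unlock()
		}(name)
	}
	wg.Wait()
	return peak
}

func main() {
	files := []string{"file1", "file2", "file3", "file4"}
	peak := readAllLimited(files, 2, func(name string) {
		fmt.Println("reading", name)
	})
	fmt.Println("peak concurrent readers:", peak)
}
```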
|
||
### Batch Sizes | ||
|
||
By default, *rare* reads 1000 lines from a file into a batch before providing it | ||
to the extractor stage. This significantly speeds up processing, but comes | ||
at the cost of being less real-time if input generation is slow. | ||
|
||
To counteract this, in the *follow* or *stdin* cases, there's also a flush timeout of | ||
250ms. This means that if a new line has been received and the timeout has elapsed, | ||
the batch will be processed regardless of its current size. | ||
|
||
You can tweak the batch size with `--batch`: | ||
|
||
`rare <aggregator> --batch=10 ...` |
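
To make the interaction between batch size and flush timeout concrete, here is a minimal Go sketch of size-or-timeout batching. It is an illustration under assumptions, not *rare*'s batcher: `batchLines` is a hypothetical name, and the timer here starts when the first line of a batch arrives.

```go
package main

import (
	"fmt"
	"time"
)

// batchLines groups lines into batches of up to batchSize. A partial batch
// is emitted once flushAfter has elapsed since its first line arrived.
// Hypothetical helper for illustration only.
func batchLines(lines <-chan string, batchSize int, flushAfter time.Duration) <-chan []string {
	out := make(chan []string)
	go func() {
		defer close(out)
		var batch []string
		var timeout <-chan time.Time // nil while the batch is empty, so it never fires
		flush := func() {
			if len(batch) > 0 {
				out <- batch
				batch = nil
			}
			timeout = nil
		}
		for {
			select {
			case line, ok := <-lines:
				if !ok {
					flush() // input closed: emit whatever is left
					return
				}
				if len(batch) == 0 {
					timeout = time.After(flushAfter)
				}
				batch = append(batch, line)
				if len(batch) >= batchSize {
					flush()
				}
			case <-timeout:
				flush() // slow input: emit the partial batch
			}
		}
	}()
	return out
}

func main() {
	lines := make(chan string)
	go func() {
		defer close(lines)
		for i := 1; i <= 5; i++ {
			lines <- fmt.Sprintf("line %d", i)
		}
	}()
	for batch := range batchLines(lines, 2, 250*time.Millisecond) {
		fmt.Println(len(batch), batch)
	}
}
```

A smaller batch size or shorter timeout makes output more responsive at the cost of more per-batch overhead.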
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,19 +1,61 @@ | ||
package batchers | ||
|
||
import ( | ||
"io/ioutil" | ||
"os" | ||
"testing" | ||
"time" | ||
|
||
"github.com/stretchr/testify/assert" | ||
) | ||
|
||
-func TestBatchTailFile(t *testing.T) { | ||
+func TestBatchFollowFile(t *testing.T) { | ||
filenames := make(chan string, 1) | ||
filenames <- "tailBatcher_test.go" // me | ||
|
||
-batcher := TailFilesToChan(filenames, 5, false, false) | ||
+batcher := TailFilesToChan(filenames, 5, false, false, false) | ||
|
||
-batch := <-batcher.c | ||
+batch := <-batcher.BatchChan() | ||
assert.Equal(t, "tailBatcher_test.go", batch.Source) | ||
assert.Len(t, batch.Batch, 5) | ||
assert.NotZero(t, batcher.ReadBytes()) | ||
} | ||
|
||
func TestBatchFollowTailFile(t *testing.T) { | ||
tmp, err := ioutil.TempFile("", "followtest-") | ||
if err != nil { | ||
panic(err) | ||
} | ||
defer tmp.Close() | ||
defer os.Remove(tmp.Name()) | ||
|
||
// Add test data | ||
for i := 0; i < 10; i++ { | ||
tmp.WriteString("abc\n") | ||
} | ||
|
||
// Now tail the file | ||
filenames := make(chan string, 1) | ||
filenames <- tmp.Name() | ||
|
||
batcher := TailFilesToChan(filenames, 1, false, false, true) | ||
|
||
time.Sleep(300 * time.Millisecond) // Uhg hack cause auto-flushing | ||
|
||
// And write some more data | ||
const testLines = 5 | ||
for i := 0; i < testLines; i++ { | ||
tmp.WriteString("abc\n") | ||
} | ||
|
||
// And finally assert we got what we wanted | ||
for i := 0; i < testLines; i++ { | ||
batch, ok := <-batcher.BatchChan() | ||
assert.True(t, ok) | ||
if ok { | ||
assert.Equal(t, tmp.Name(), batch.Source) | ||
assert.Equal(t, uint64(i+1), batch.BatchStart) | ||
assert.Len(t, batch.Batch, 1) | ||
} | ||
} | ||
} |