Implement topics for store nodes #120

fabxc · 2018-04-05T14:18:45Z

This is more of a preview.

fabxc · 2018-04-05T14:20:02Z

pkg/store/log.go

+	newLog func(string) (Log, error)
+	mtx    sync.Mutex
+	m      map[string]Log
+}


This essentially adds coordination across actors, which is not great.
If we move the lock file out of FileLog and use a single one for the top-level data dir, I think we can get rid of that again.

fabxc · 2018-04-05T14:20:38Z

pkg/store/demux_test.go

+func BenchmarkDemux(b *testing.B) {
+	for name, newFsys := range map[string]func() fs.Filesystem{
+		"virtual": fs.NewVirtualFilesystem,
+		"real":    fs.NewRealFilesystem,


BenchmarkDemux/virtual-8    2000000     562 ns/op    1079 B/op    3 allocs/op
BenchmarkDemux/real-8       2000000    1015 ns/op     408 B/op    3 allocs/op
PASS

That seems good enough for the time being.

The number for the "real" filesystem actually depends strongly on how big our bufio.Writer buffers are when fanning out to per-topic segments. Cranking the buffer size up makes the result look nice, but the benchmark probably doesn't quite capture the right thing here yet?

Should we change it to actually benchmark against small fixed-sized staging segments?
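To make the buffer-size effect concrete, here is a minimal sketch (not code from this PR) that counts how many writes reach the underlying writer for different bufio.Writer sizes. The names countingWriter and underlyingWrites are hypothetical, and the record size and count are arbitrary choices for illustration.

```go
package main

import (
	"bufio"
	"fmt"
)

// countingWriter is a hypothetical stand-in for a segment file: it discards
// the data and only counts how often Write is called on it.
type countingWriter struct{ calls int }

func (w *countingWriter) Write(p []byte) (int, error) {
	w.calls++
	return len(p), nil
}

// underlyingWrites pushes `records` records of `recordSize` bytes through a
// bufio.Writer with the given buffer size and reports how many writes
// reached the underlying writer, including the final Flush.
func underlyingWrites(bufSize, records, recordSize int) (int, error) {
	cw := &countingWriter{}
	bw := bufio.NewWriterSize(cw, bufSize)
	rec := make([]byte, recordSize)
	for i := 0; i < records; i++ {
		if _, err := bw.Write(rec); err != nil {
			return 0, err
		}
	}
	if err := bw.Flush(); err != nil {
		return 0, err
	}
	return cw.calls, nil
}

func main() {
	for _, size := range []int{16, 512, 4096} {
		n, err := underlyingWrites(size, 100, 10)
		if err != nil {
			panic(err)
		}
		fmt.Printf("buffer=%d bytes -> %d underlying writes\n", size, n)
	}
}
```

With a buffer larger than the total payload, everything collapses into the single write issued by Flush, which is one way a large buffer can flatter the "real" number. Small, fixed-size staging segments would presumably bound that effect.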

fabxc · 2018-04-05T14:24:37Z

cmd/oklog/ingeststore.go

+					// Run each compacters' next stage.
+					for _, c := range tc {
+						c.Next()
+					}


Using Run/Stop is not really feasible here, I think, as we may end up with 1000 compactors competing for IO. Some parallelism might be okay, though, if this cannot keep up.

I generally prefer handling lifecycle entirely in main over blackbox-y Run/Stop methods, which works especially well with run.Group.
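A minimal sketch of the sequential approach, where one loop owned by the caller advances every compacter by one stage per round so they never compete for IO. The stepper interface, countingCompacter, and runRounds are hypothetical names. The diff only shows that c.Next() exists, so its signature here is an assumption, and a plain stop channel stands in for whatever lifecycle mechanism main actually uses.

```go
package main

import "fmt"

// stepper is an assumed interface; only the call c.Next() appears in the diff.
type stepper interface {
	Next()
}

// countingCompacter is a dummy that records how many stages it has run.
type countingCompacter struct{ steps int }

func (c *countingCompacter) Next() { c.steps++ }

// runRounds advances every compacter by one stage per round, strictly one
// after another, until stop is closed or maxRounds rounds have completed.
// It returns the number of completed rounds.
func runRounds(stop <-chan struct{}, cs []stepper, maxRounds int) int {
	rounds := 0
	for rounds < maxRounds {
		select {
		case <-stop:
			return rounds
		default:
		}
		for _, c := range cs {
			c.Next()
		}
		rounds++
	}
	return rounds
}

func main() {
	a, b := &countingCompacter{}, &countingCompacter{}
	stop := make(chan struct{})
	rounds := runRounds(stop, []stepper{a, b}, 3)
	fmt.Println(rounds, a.steps, b.steps)
}
```

Because the loop lives in the caller, adding bounded parallelism later (for example a small worker pool over cs) would not require changing the compacters themselves.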

fabxc · 2018-04-05T14:29:07Z

pkg/store/demux.go

+		return "", id, nil, err
+	}
+	// Execution of return arguments is not ordered. We must cast the topic
+	// to a string separately.


This really surprised me as I assumed left-to-right order.

From https://golang.org/ref/spec#Order_of_evaluation:

when evaluating the operands of an expression, assignment, or return statement, all function calls, method calls, and communication operations are evaluated in lexical left-to-right order.

Does the "lexical" bit mean that return string(p[1]), append(b[... would be reordered because str... > app...?
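On the "lexical" question: as far as I can tell, it refers to the order in which the calls appear in the source text, not to alphabetical order of identifiers. The spec only orders function calls, method calls, and communication operations relative to each other. A conversion such as string(p[1]) is none of those, so when it runs relative to a call in the same statement is left unspecified. A minimal sketch of the hazard and the workaround, using hypothetical names (scrub, splitUnordered, splitSafe) that are not from the PR:

```go
package main

import "fmt"

// scrub is a hypothetical function call that overwrites p's backing array
// in place, similar in effect to an append that reuses a shared buffer.
func scrub(p []byte) []byte {
	for i := range p {
		p[i] = 'x'
	}
	return p
}

// splitUnordered shows the risky shape: the spec does not say whether the
// conversion string(p[:3]) happens before or after the call scrub(p), so
// the returned topic may or may not see the overwritten bytes.
func splitUnordered(p []byte) (string, []byte) {
	return string(p[:3]), scrub(p)
}

// splitSafe copies the topic out of p in a separate statement, so the copy
// is guaranteed to happen before scrub touches the buffer.
func splitSafe(p []byte) (string, []byte) {
	topic := string(p[:3])
	return topic, scrub(p)
}

func main() {
	topic, rest := splitSafe([]byte("foobar"))
	fmt.Println(topic, string(rest))
}
```

Only splitSafe has a result that is guaranteed by the language. Whatever splitUnordered happens to return with a given compiler should not be relied on.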

fabxc · 2018-04-05T14:39:53Z

pkg/fs/real.go

@@ -54,6 +55,10 @@ func (realFilesystem) Exists(path string) bool {
 	return !os.IsNotExist(err)
 }

+func (realFilesystem) ReadDir(dirname string) ([]os.FileInfo, error) {
+	return ioutil.ReadDir(dirname)
+}


I noticed you used Walk everywhere. Is that because of the annoying os.File.Readdir{,names} semantics?
I went with the ioutil.ReadDir signature instead, unless we strictly want to match os.File.
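For comparison, a minimal sketch of what the ioutil.ReadDir signature gives a caller: one call returns the direct children of a directory, sorted by filename, without descending into subdirectories as Walk does. The helpers listNames and demoListing and the directory layout (LOCK, topic-a, topic-b) are hypothetical and only serve the illustration.

```go
package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"path/filepath"
)

// listNames returns the names of the direct children of dir.
// ioutil.ReadDir returns entries sorted by filename and does not recurse.
func listNames(dir string) ([]string, error) {
	infos, err := ioutil.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(infos))
	for _, fi := range infos {
		names = append(names, fi.Name())
	}
	return names, nil
}

// demoListing builds a throwaway directory tree and lists its top level.
func demoListing() ([]string, error) {
	dir, err := ioutil.TempDir("", "readdir-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	// Assumed layout for illustration: two topic directories and a lock file.
	for _, d := range []string{"topic-b", "topic-a"} {
		if err := os.MkdirAll(filepath.Join(dir, d), 0755); err != nil {
			return nil, err
		}
	}
	files := []string{
		filepath.Join(dir, "LOCK"),
		filepath.Join(dir, "topic-a", "segment"), // nested, must not be listed
	}
	for _, f := range files {
		if err := ioutil.WriteFile(f, nil, 0644); err != nil {
			return nil, err
		}
	}
	return listNames(dir)
}

func main() {
	names, err := demoListing()
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

Newer Go releases (1.16 and later) also offer os.ReadDir, which returns []os.DirEntry. The sketch sticks to the []os.FileInfo form used in the diff.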

fabxc · 2018-04-06T08:38:08Z

Moved file locking to main, which allowed things to get cleaned up a bit.
If we agree on the general approach, I'll start to add/fix tests.

fabxc · 2018-04-06T08:43:22Z

pkg/store/file_log.go

+		}
+		m[t] = l
+	}
+	return m, nil


Always instantiating new logs seems wasteful (allocs and syscalls) – but compared to when we read/write logs it probably won't matter.

tsenart · 2018-04-11T16:10:13Z

@fabxc: Let us know when this is ready for a first review. Perhaps adding a WIP label would be good for this sort of PR.

fabxc · 2018-04-14T09:35:30Z

Right, prefix added.
I was hoping to get a quick glance over the general approach before cleaning this up.

fabxc · 2018-04-14T12:51:49Z

@peterbourgon I ran into several issues in the virtual FS implementation, missing features and bugs alike, e.g. exists(dir) is missing and file.Name() is not updated on rename.
Needless to say, it's annoying to debug, and I'd expect this to come up more often.

When I worked on something similar, I remember that I mostly dropped it because getting actual FS semantics was incredibly tedious and virtually impossible to get entirely right. I wonder whether it would be better to replace the virtual FS with one backed by a chrooted tmpfs for tests and such.
Unless there is a clear benefit in cutting syscalls out of benchmarks? Also, the VFS obfuscates the memory profile in benchmarks.

fabxc · 2018-04-14T12:57:44Z

I think this is generally good for review.

I wonder how the staging log behaves in practice. If it gets evicted sufficiently fast, I'd hope the staging segments mostly never make it beyond the page cache in a busy server, thus not impacting the overall throughput limit as given by the disk.

fs: add ReadDir method
store: add topic demuxer
store: handle topics in store
store: remove per-log lock, refactor
store: fix tests for topics
Resolved oklog#120

fabxc force-pushed the topics branch 4 times, most recently from 3d1ecd6 to 29f19d3 Compare April 6, 2018 07:04

fabxc force-pushed the topics2 branch from c693d27 to 8511ce8 Compare April 6, 2018 07:05

fabxc added 3 commits April 6, 2018 09:08

fs: add ReadDir method

f123934

store: add topic demuxer

2d373dc

store: handle topics in store

8e3ef3a

fabxc force-pushed the topics2 branch from 8511ce8 to 8e3ef3a Compare April 6, 2018 07:09

store: remove per-log lock, refactor

776cacc

fabxc changed the title ~~Implement topics for store nodes~~ [WIP] Implement topics for store nodes Apr 14, 2018

store: fix tests for topics

6ce2ca9

minor cleanups

2791660

fabxc changed the title ~~[WIP] Implement topics for store nodes~~ Implement topics for store nodes Apr 24, 2018

fabxc mentioned this pull request Apr 24, 2018

[WIP] Trigram based indexer sidecar #123

Open

dmitry-guryanov mentioned this pull request Jul 13, 2018

Add ability to query by topic #136

Open

denji pushed a commit to denji/oklog that referenced this pull request Dec 10, 2018

Implement topics for store nodes

a7b6c78

denji pushed a commit to denji/oklog that referenced this pull request Dec 10, 2018

Implement topics for store nodes

3654da8

denji pushed a commit to denji/oklog that referenced this pull request Dec 10, 2018

Implement topics for store nodes

c08aeab

denji pushed a commit to denji/oklog that referenced this pull request Dec 10, 2018

Implement topics for store nodes

1b3aea4

denji mentioned this pull request Dec 10, 2018

Implement topics for store nodes denji/oklog#4

Closed

denji pushed a commit to denji/oklog that referenced this pull request Dec 10, 2018

Implement topics for store nodes

5636e5e

denji mentioned this pull request Dec 10, 2018

Implement topics for store nodes denji/oklog#5

Open
