Skip to content
This repository has been archived by the owner on Sep 8, 2018. It is now read-only.

Implement topics for store nodes #120

Open
wants to merge 6 commits into
base: topics
Choose a base branch
from
Open

Implement topics for store nodes #120

wants to merge 6 commits into from

Conversation

fabxc
Copy link
Member

@fabxc fabxc commented Apr 5, 2018

This is more of a preview.

pkg/store/log.go Outdated
newLog func(string) (Log, error)
mtx sync.Mutex
m map[string]Log
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This essentially adds coordination across actors, which is not great.
If we remove the lock file from the FileLog to a single one for the top level data dir we can get rid of that again I think.

func BenchmarkDemux(b *testing.B) {
for name, newFsys := range map[string]func() fs.Filesystem{
"virtual": fs.NewVirtualFilesystem,
"real": fs.NewRealFilesystem,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BenchmarkDemux/virtual-8         	 2000000	       562 ns/op	    1079 B/op	       3 allocs/op
BenchmarkDemux/real-8            	 2000000	      1015 ns/op	     408 B/op	       3 allocs/op
PASS

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems good enough for the time being.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The real number actually strongly depends on how big our bufio.Writer buffers are when fanning out to per-topic segments. Cranking it to a higher number makes it look nice, but probably the benchmark doesn't quite capture the right thing here yet?

Should we change it to actually benchmark against small fixed-sized staging segments?

// Run each compacters' next stage.
for _, c := range tc {
c.Next()
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using Run/Stop is not really feasible here I think as we may end up with 1000 compactors competing for IO. Some parallelism might be okay though if this cannot keep up.

I generally prefer handling lifecycle entirely in main over blackbox-y Run/Stop methods – which works especially great with run.Group.

return "", id, nil, err
}
// Execution of return arguments is not ordered. We must cast the topic
// to a string separately.
Copy link
Member Author

@fabxc fabxc Apr 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really surprised me as I assumed left-to-right order.

From https://golang.org/ref/spec#Order_of_evaluation:

when evaluating the operands of an expression, assignment, or return statement, all
function calls, method calls, and communication operations are evaluated in lexical
left-to-right order.

Does the "lexical" bit mean that return string(p[1]), append(b[... would be reordered because str... > app...?

@@ -54,6 +55,10 @@ func (realFilesystem) Exists(path string) bool {
return !os.IsNotExist(err)
}

func (realFilesystem) ReadDir(dirname string) ([]os.FileInfo, error) {
return ioutil.ReadDir(dirname)
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed you used Walk everywhere. Because of the annoying os.File.Readdir{,names} semantics?
Went with the ioutil.ReadDir signature instead unless we strictly want to match os.File.

@fabxc
Copy link
Member Author

fabxc commented Apr 6, 2018

Moved file locking to main, which allowed things to get cleaned up a bit.
If we agree on the general approach, I'll start to add/fix tests.

}
m[t] = l
}
return m, nil
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always instantiating new logs seems wasteful (allocs and syscalls) – but compared to when we read/write logs it probably won't matter.

@tsenart
Copy link
Contributor

tsenart commented Apr 11, 2018

@fabxc: Let us know when this is ready for first review. Perhaps adding a WIP label would be good this sort of PRs.

@fabxc
Copy link
Member Author

fabxc commented Apr 14, 2018

Right, prefix added.
I was hoping to get a quick glance over the general approach before cleaning this up.

@fabxc fabxc changed the title Implement topics for store nodes [WIP] Implement topics for store nodes Apr 14, 2018
@fabxc
Copy link
Member Author

fabxc commented Apr 14, 2018

@peterbourgon I ran into several issues in the virtual FS implementation, missing features and bugs alike, e.g. exists(dir) missing, file.Name() not updated on rename.
Needless to say it's annoying to debug and I'd expect this to come up more often.

When I worked on something similar, I remember that I mostly dropped it because getting actual FS semantics was incredibly tedious and virtually impossible to get entirely right. I wonder whether it would be better to replace the virtual FS to be backed by a chrooted tmpfs for tests and such.
Unless there is a clear benefit in having syscalls be cut out from benchmarks? Also the VFS obfuscates the memory profile in benchmarks.

@fabxc
Copy link
Member Author

fabxc commented Apr 14, 2018

I think this is generally good for review.

I wonder how the staging log behaves in practice. If it gets evicted sufficiently fast, I'd hope the staging segments mostly never make it beyond the page cache in a busy server, thus not impacting the overall throughput limit as given by the disk.

@fabxc fabxc changed the title [WIP] Implement topics for store nodes Implement topics for store nodes Apr 24, 2018
denji pushed a commit to denji/oklog that referenced this pull request Dec 10, 2018
fs: add ReadDir method
store: add topic demuxer
store: handle topics in store
store: remove per-log lock, refactor
store: fix tests for topics
Resolved oklog#120
denji pushed a commit to denji/oklog that referenced this pull request Dec 10, 2018
fs: add ReadDir method
store: add topic demuxer
store: handle topics in store
store: remove per-log lock, refactor
store: fix tests for topics
Resolved oklog#120
denji pushed a commit to denji/oklog that referenced this pull request Dec 10, 2018
fs: add ReadDir method
store: add topic demuxer
store: handle topics in store
store: remove per-log lock, refactor
store: fix tests for topics
Resolved oklog#120
denji pushed a commit to denji/oklog that referenced this pull request Dec 10, 2018
fs: add ReadDir method
store: add topic demuxer
store: handle topics in store
store: remove per-log lock, refactor
store: fix tests for topics

Resolved oklog#120
denji pushed a commit to denji/oklog that referenced this pull request Dec 10, 2018
fs: add ReadDir method
store: add topic demuxer
store: handle topics in store
store: remove per-log lock, refactor
store: fix tests for topics

Resolved oklog#120
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants