Skip to content

Commit

Permalink
Merge pull request #1843 from semi-technologies/gh-1837-fix-segfault
Browse files Browse the repository at this point in the history
gh-1837 WEAVIATE-2 fix segfault by copying memory
  • Loading branch information
etiennedi committed Mar 8, 2022
2 parents cc50214 + 9280066 commit 8182935
Show file tree
Hide file tree
Showing 2 changed files with 28 additions and 9 deletions.
18 changes: 17 additions & 1 deletion adapters/repos/db/lsmkv/segment_collection_strategy.go
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,23 @@ func (i *segment) getCollection(key []byte) ([]value, error) {
}
}

return i.collectionStratParseData(i.contents[node.Start:node.End])
// We need to copy the data we read from the segment exactly once in this
// place. This means that future processing can share this memory as much as
// it wants to, as it can now be considered immutable. If we didn't copy in
// this place it would only be safe to hold this data while still under the
// protection of the segmentGroup.maintenanceLock. This lock makes sure that
// no compaction is started during an ongoing read. However, as we could show
// as part of https://github.com/semi-technologies/weaviate/issues/1837
// further processing, such as map-decoding and eventually map-merging would
// happen inside the bucket.MapList() method. This scope has its own lock,
// but that lock can only protecting against flushing (i.e. changing the
// active/flushing memtable), not against removing the disk segment. If a
// compaction completes and the old segment is removed, we would be accessing
// invalid memory without the copy, thus leading to a SEGFAULT.
contentsCopy := make([]byte, node.End-node.Start)
copy(contentsCopy, i.contents[node.Start:node.End])

return i.collectionStratParseData(contentsCopy)
}

func (i *segment) collectionStratParseData(in []byte) ([]value, error) {
Expand Down
19 changes: 11 additions & 8 deletions adapters/repos/db/lsmkv/strategies_map.go
Original file line number Diff line number Diff line change
Expand Up @@ -212,11 +212,20 @@ func (kv MapPair) Bytes() ([]byte, error) {
func (kv *MapPair) FromBytes(in []byte, keyOnly bool) error {
var read uint16

// NOTE: A previous implementation was using copy statements in here to avoid
// sharing the memory. The general idea of that is good (protect against the
// mmaped memory being removed from a completed compaction), however this is
// the wrong place. By the time we are in this method, we can no longer
// control the memory safety of the "in" argument. Thus, such a copy must
// happen at a much earlier scope when a lock is held that protects against
// removing the segment. Such an implmentation can now be found in
// segment_collection_strategy.go as part of the *segment.getCollection
// method. As a result all memory used here can now be considered read-only
// and is safe to be used indefinitely.

keyLen := binary.LittleEndian.Uint16(in[:2])
read += 2 // uint16 -> 2 bytes

// kv.Key = make([]byte, keyLen)
// copy(kv.Key, in[read:read+keyLen])
kv.Key = in[read : read+keyLen]
read += keyLen

Expand All @@ -227,12 +236,6 @@ func (kv *MapPair) FromBytes(in []byte, keyOnly bool) error {
valueLen := binary.LittleEndian.Uint16(in[read : read+2])
read += 2

// if int(valueLen) > cap(kv.Value) {
// kv.Value = make([]byte, valueLen)
// } else {
// kv.Value = kv.Value[:valueLen]
// }
// copy(kv.Value, in[read:read+valueLen])
kv.Value = in[read : read+valueLen]
read += valueLen

Expand Down

0 comments on commit 8182935

Please sign in to comment.