Optimize memory usage of table.TableIndex #1335

@jarifibrahim

Badger keeps the index used by each table in memory:
https://github.com/dgraph-io/badger/blob/62b7a10a949e77b33d43cd7438979833c32cc865/table/table.go#L94-L102
As the number of SSTs grows, so does the number of in-memory index blocks, so memory usage grows linearly with the number of tables. We should cache the index blocks in ristretto instead.

This issue was also seen on hypermodeinc/dgraph#5361 (comment)

// HEAP //
File: dgraph
Type: inuse_space
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 44487.54MB, 99.36% of 44774.33MB total
Dropped 234 nodes (cum <= 223.87MB)
Showing top 10 nodes out of 60
      flat  flat%   sum%        cum   cum%
20127.60MB 44.95% 44.95% 20127.60MB 44.95%  github.com/dgraph-io/ristretto/z.(*Bloom).Size
15454.99MB 34.52% 79.47% 23157.80MB 51.72%  github.com/dgraph-io/badger/v2/pb.(*TableIndex).Unmarshal
 7702.81MB 17.20% 96.67%  7702.81MB 17.20%  github.com/dgraph-io/badger/v2/pb.(*BlockOffset).Unmarshal
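
In the profile above, `pb.(*TableIndex).Unmarshal` accounts for 23157.80MB cumulative (51.72% of the heap), and all of it stays resident because every table holds on to its decoded index. To illustrate the proposal, here is a minimal sketch of a byte-budgeted cache that loads a table's index on demand and evicts the least recently used one when over budget. It is only an illustration under stated assumptions: `indexCache`, `entry`, `newIndexCache`, and `get` are hypothetical names, the `[]byte` value stands in for a decoded `pb.TableIndex`, and a plain standard-library LRU is used in place of ristretto (which has its own admission and eviction policy and is safe for concurrent use, unlike this sketch).

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is one cached index. The []byte is a stand-in for a decoded
// pb.TableIndex; its length is used as the entry's cost in bytes.
type entry struct {
	tableID uint64
	index   []byte
}

// indexCache is a hypothetical byte-budgeted LRU cache for table indexes.
// It is NOT Badger's or ristretto's implementation and is not goroutine-safe.
type indexCache struct {
	budget int64                    // maximum total cost in bytes
	used   int64                    // current total cost in bytes
	order  *list.List               // front = most recently used
	items  map[uint64]*list.Element // table ID -> element in order
}

func newIndexCache(budget int64) *indexCache {
	return &indexCache{
		budget: budget,
		order:  list.New(),
		items:  make(map[uint64]*list.Element),
	}
}

// get returns the cached index for tableID. On a miss it calls load
// (for example: read the index from the SST file and unmarshal it),
// stores the result, and evicts old entries until the budget is met.
func (c *indexCache) get(tableID uint64, load func() ([]byte, error)) ([]byte, error) {
	if el, ok := c.items[tableID]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*entry).index, nil
	}
	idx, err := load()
	if err != nil {
		return nil, err
	}
	c.items[tableID] = c.order.PushFront(&entry{tableID, idx})
	c.used += int64(len(idx))
	// Evict from the cold end, but always keep the entry just inserted.
	for c.used > c.budget && c.order.Len() > 1 {
		e := c.order.Remove(c.order.Back()).(*entry)
		delete(c.items, e.tableID)
		c.used -= int64(len(e.index))
	}
	return idx, nil
}

func main() {
	c := newIndexCache(250) // room for two 100-byte indexes
	loads := 0
	loader := func() ([]byte, error) { loads++; return make([]byte, 100), nil }
	for _, id := range []uint64{1, 2, 1, 3, 1} {
		if _, err := c.get(id, loader); err != nil {
			panic(err)
		}
	}
	fmt.Println("loads:", loads, "resident bytes:", c.used)
}
```

With a cache like this, resident index memory is bounded by the configured budget rather than by the number of tables; the cost is an extra read and unmarshal whenever a cold table's index is needed again.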
