Skip to content
This repository has been archived by the owner on Jan 28, 2021. It is now read-only.

sql: implement memory management system for caches #802

Merged
merged 1 commit into from
Aug 14, 2019

Conversation

erizocosmico
Copy link
Contributor

This PR implements a memory management system, intended to have
control over the allocated memory for caches so that they can be
freed at any moment and we can avoid out of memory errors.

The main changes are the following:

  • MemoryManager in the sql package, which is just the component that
    tracks all caches. Memory of all freeable caches can be freed
    using the Free method of this component. The only way to instantiate
    new caches is using the NewXXXCache methods.
  • Rows, history and LRU cache implementations, accessed using the
    NewXXXCache methods of MemoryManager.
  • Reporters, which is a component that reports the maximum amount
    of memory the program is allowed to use and the currently used memory.
    This interface is meant for making testing easier. There is a default
    ProcessMemory reporter that returns the memory used by the process and
    the maximum memory defined in the MAX_MEMORY environment variable.
  • MemoryManager is passed down to every component through *sql.Context,
    which meant a little more boilerplate on the server SessionBuilder.
  • GroupBy, Sort, Distinct and Join now use the provided APIs of memory
    and cache management for their in-memory computations.

Caveats:

  • We need to think of a good default so that memory usage won't grow
    forever and crash eventually, which is the behaviour when MAX_MEMORY is 0.

Signed-off-by: Miguel Molina miguel@erizocosmi.co

@erizocosmico erizocosmico requested a review from a team August 12, 2019 15:06
sql/cache_test.go Outdated Show resolved Hide resolved
sql/cache_test.go Outdated Show resolved Hide resolved
sql/cache_test.go Outdated Show resolved Hide resolved
Copy link
Contributor

@agarciamontoro agarciamontoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some comments and suggestions below. Some of them more questions than comments.

Also, I need to learn more about this to be able to comment on the changes on the join, groupby and friends.

sql/memory.go Outdated Show resolved Hide resolved
sql/memory.go Outdated Show resolved Hide resolved
sql/memory.go Outdated Show resolved Hide resolved
sql/memory.go Outdated Show resolved Hide resolved
sql/cache.go Show resolved Hide resolved
sql/cache.go Outdated Show resolved Hide resolved
sql/cache.go Outdated Show resolved Hide resolved
sql/cache.go Show resolved Hide resolved
sql/cache_test.go Show resolved Hide resolved
@erizocosmico erizocosmico force-pushed the feature/memory branch 2 times, most recently from 3cf6a25 to f4d9a5b Compare August 13, 2019 07:49
@erizocosmico
Copy link
Contributor Author

Fixed all the comments @agarciamontoro

Copy link
Contributor

@agarciamontoro agarciamontoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two more tests with the unchecked freed variable. Sorry that I wasn't explicit in my comment before.

sql/cache_test.go Show resolved Hide resolved
sql/cache_test.go Show resolved Hide resolved
@erizocosmico
Copy link
Contributor Author

erizocosmico commented Aug 13, 2019

Fixed. I'm also looking into the problem in the ruby integration test, which is now failing for some reason

@erizocosmico erizocosmico force-pushed the feature/memory branch 2 times, most recently from 96c12a3 to 6131062 Compare August 13, 2019 08:17
@erizocosmico
Copy link
Contributor Author

CI fixed. Cleaned up a few things in the travis file that were not needed anymore

sql/cache.go Show resolved Hide resolved
sql/memory.go Outdated
type KeyValueCache interface {
// Put a new value in the cache. If there is no more memory and the cache is
// not Freeable it will try to free some memory from other freeable caches.
// If there's still no more memory available, it will fail and erase all the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be the case in all the implementations. Per example, a warmed cache can improve a lot consecutive queries execution over more or less the same data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will remove that from the doc


// NewLRUCache returns an empty LRU cache and a function to dispose it when it's
// no longer needed.
func (m *MemoryManager) NewLRUCache(size uint) (KeyValueCache, DisposeFunc) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intended to be used per query or while the SQL engine is running?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the same question for all the cache implementations

Copy link
Contributor Author

@erizocosmico erizocosmico Aug 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it's running. You can get a cache from any place you have a sql.Context.

This PR implements a memory management system, intended to have
control over the allocated memory for caches so that they can be
freed at any moment and we can avoid out of memory errors.

The main changes are the following:

- MemoryManager in the sql package, which is just the component that
  tracks all caches. Memory of all freeable caches can be freed
  using the Free method of this component. The only way to instantiate
  new caches is using the NewXXXCache methods.
- Rows, history and LRU cache implementations, accessed using the
  NewXXXCache methods of MemoryManager.
- Reporters, which is a component that reports the maximum amount
  of memory the program is allowed to use and the currently used memory.
  This interface is meant for making testing easier. There is a default
  ProcessMemory reporter that returns the memory used by the process and
  the maximum memory defined in the `MAX_MEMORY` environment variable.
- MemoryManager is passed down to every component through *sql.Context,
  which meant a little more boilerplate on the server SessionBuilder.
- GroupBy, Sort, Distinct and Join now use the provided APIs of memory
  and cache management for their in-memory computations.

Caveats:
- We need to think of a good default so that memory usage won't grow
  forever and crash eventually, which is the behaviour when MAX_MEMORY is 0.

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>
@ajnavarro ajnavarro merged commit 2e9abc7 into src-d:master Aug 14, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants