Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(bigtable/bttest): make table gc release memory #3930

Merged
merged 14 commits into from
Feb 15, 2024

Conversation

philpearl
Copy link
Contributor

@philpearl philpearl commented Apr 14, 2021

How does the emulator store rows?
Each table in emulator contains a Btree which stores the rows:

type table struct {
	rows        *btree.BTree             // indexed by row key
        ....
}

Each object in this btree is of type row. A row has a families map. A family has a cells map keyed by column name

A sample row would be:

{
	families: {
		"col-family-1": {
			cells: {
				"col-1": [{
					"value": "val-1"
				}],
				"col-2": [{
					"value": "val-2"
				}],
				"col-3": [{
					"value": "val-3"
				}]
			},
		},
		"col-family-2": ....
	}
}

Many fields have not been shown in above sample for brevity.

Now, when cells col-1 and col-2 are deleted, the emulator just sets "col-1" and "col-2" to empty array i.e. the row becomes:

{
	families: {
		"col-family-1": {
			cells: {
				"col-1": [],
				"col-2": [],
				"col-3": [{
					"value": "val-3"
				}]
			},
		},
		"col-family-2": ....
	}
}

In this PR, this behaviour is being changed to remove the key itself i.e. the row will become:

{
	families: {
		"col-family-1": {
			cells: {
				"col-3": [{
					"value": "val-3"
				}]
			},
		},
		"col-family-2": ....
	}
}

Similarly, when all the families are deleted, the empty row will be deleted as well.

Closes: #6102

Description by PR and issue original author:
We had a problem where we were unable to run the BigTable emulator for extended periods in dev environments without giving it huge amounts of RAM, even if we had quite aggressive GC policies applied to the tables. This fixes that by making GC in the emulator actually delete empty rows and columns.

@product-auto-label product-auto-label bot added the api: bigtable Issues related to the Bigtable API. label Apr 14, 2021
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Apr 14, 2021
We had a problem where we were unable to run the emulator for extended periods in dev environments without giving it huge amounts of RAM, even if we had quite aggressive GC policies applied to the tables.
This fixes that by making GC delete empty rows and columns.
@philpearl philpearl marked this pull request as ready for review April 14, 2021 11:31
@philpearl philpearl requested review from crwilcox, kolea2, tritone and a team as code owners April 14, 2021 11:31
@tritone tritone added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 14, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 14, 2021
bigtable/bttest/inmem.go Outdated Show resolved Hide resolved
@philpearl philpearl requested a review from tritone April 14, 2021 16:25
@tritone tritone added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 14, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 14, 2021
@philpearl
Copy link
Contributor Author

Ah, actually, scratch that. It's still not safe. I'm not convinced it was safe before as GC was changing the cell slices, but this has certainly made it more obviously unsafe.

The easy option would be to hold the write lock over the GC cycle, whether it changes or not. A harder option would be to do the whole thing in two phases

@philpearl
Copy link
Contributor Author

Ah, no, wrong again. I think this version is good. The row lock protects us.

@philpearl
Copy link
Contributor Author

Wrong a third time! We need to re-check the rows as we delete them with a lock held. It could be that cells have been added to the row in the meantime. Fix incoming

@tritone tritone added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 14, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Apr 14, 2021
Copy link
Contributor

@tritone tritone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me now. @kolea2 @crwilcox could you take a look?

@crwilcox
Copy link
Contributor

crwilcox commented May 3, 2021

This looks good to me also. I have asked some other folks that are more familiar with the emulator to take a look also, in case there is a hidden gotcha.

@philpearl
Copy link
Contributor Author

Any chance of getting this merged?

Copy link
Contributor

@jimfulton jimfulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but needs at least a basic test.

bigtable/bttest/inmem.go Show resolved Hide resolved
@philpearl
Copy link
Contributor Author

@jimfulton do you have any suggestions about what the test would look like and what it would test? Presumably there are already tests that the GC operates correctly in terms of what's visible in the database. There are already tests that catch issues with --race (I know because I broke them!). Are you suggesting a test that counts memory allocations?

@meredithslota meredithslota added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 11, 2023
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label May 11, 2023
@bhshkh
Copy link
Contributor

bhshkh commented Jan 31, 2024

Reviewing

@bhshkh
Copy link
Contributor

bhshkh commented Feb 1, 2024

The changes LGTM

I've raised an issue. #6102. I'm afraid I didn't understand what type of test jim fulton was expecting and he didn't answer my query.

You can add a test with simple gc rule such as maxNumVersions and verify that empty rows, cells, cell families did get deleted.

@philpearl philpearl requested review from a team as code owners February 10, 2024 00:12
@product-auto-label product-auto-label bot added size: m Pull request size is medium. and removed size: s Pull request size is small. labels Feb 10, 2024
@bhshkh
Copy link
Contributor

bhshkh commented Feb 10, 2024

Added the tests

@bhshkh bhshkh enabled auto-merge (squash) February 12, 2024 19:55
@bhshkh bhshkh added kokoro:force-run Add this label to force Kokoro to re-run the tests. kokoro:run Add this label to force Kokoro to re-run the tests. labels Feb 15, 2024
@kokoro-team kokoro-team removed kokoro:run Add this label to force Kokoro to re-run the tests. kokoro:force-run Add this label to force Kokoro to re-run the tests. labels Feb 15, 2024
@bhshkh bhshkh self-assigned this Feb 15, 2024
@bhshkh bhshkh added kokoro:force-run Add this label to force Kokoro to re-run the tests. kokoro:run Add this label to force Kokoro to re-run the tests. labels Feb 15, 2024
@kokoro-team kokoro-team removed kokoro:run Add this label to force Kokoro to re-run the tests. kokoro:force-run Add this label to force Kokoro to re-run the tests. labels Feb 15, 2024
@bhshkh bhshkh merged commit 7d6ff39 into googleapis:main Feb 15, 2024
9 of 12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigtable Issues related to the Bigtable API. cla: yes This human has signed the Contributor License Agreement. size: m Pull request size is medium. stale: extraold Pull request is critically old and needs prioritization.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bigtable/bttest: GC'd memory is not released
10 participants