Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GC targets without deserializing Target model #7026

Merged
merged 7 commits into from
Nov 30, 2020

Conversation

schmidt-sebastian
Copy link
Contributor

@schmidt-sebastian schmidt-sebastian commented Nov 25, 2020

This works around a crash during LRU by not converting the Target Proto into Target Models, and instead only using the Target ID and the LastListenSequenceNumber for serialization. It is the more complicated version of #7026, but it doesn't orphan data on disk.

Fix: #6721

}
}

int MemoryTargetCache::RemoveTargets(
uint64_t MemoryTargetCache::RemoveTargets(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: consider returning size_t (size_t is an alias for an unsigned integer type that can store the largest possible object size. Strictly speaking, containers define their own size_type aliases, but they're usually the same as size_t whereas size_t is shorter to type and not tied to a particular container).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember size_t from my good old days on App Engine. Updated.

ASSERT_TRUE(target_data.target_id() > 20 ||
target_data.target_id() % 2 == 1);
});
int detectedRemoval = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: should be snake_case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

});

ASSERT_EQ(10, detectedRemoval);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: don't use Yoda-style comparisons, please switch the argument order. Note: a counterargument would be consistency with existing comparisons in this file. Personally, I wouldn't worry about it, but feel free to reject this on those grounds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed (this instance alone). I did get my inspiration from other occurrences in this file though.

/**
* Parses the given bytes as a `firestore_client_Target` protocol buffer and
* then converts to the equivalent target data.
*/
TargetData DecodeTarget(absl::string_view encoded);

/** Removes the given targets from the query to target mapping */
void RemoveQueryTargetKeyForTargets(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultranit: please add a full stop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -269,6 +291,24 @@ void LevelDbTargetCache::RemoveAllKeysForTarget(TargetId target_id) {
}
}

void LevelDbTargetCache::RemoveQueryTargetKeyForTargets(
std::unordered_set<TargetId> target_ids) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This argument should be passed by const reference (reference and not value because this function doesn't take ownership of it, const because it only calls const functions on it).


int resultCount = 0;
cache_->EnumerateSequenceNumbers([&](ListenSequenceNumber sequence_number) {
std::cout << sequence_number;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like leftover debug code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. Thanks for catching.

int resultCount = 0;
cache_->EnumerateSequenceNumbers([&](ListenSequenceNumber sequence_number) {
std::cout << sequence_number;
ASSERT_TRUE(sequence_numbers.find(sequence_number) !=
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: consider using GMock matchers:

EXPECT_THAT(sequence_numbers, Contains(sequence_number));

(matchers may provide more informative messages upon failure)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. Done.

TargetData target_data2 = MakeTargetData(testutil::Query("b"));
cache_->AddTarget(target_data2);

std::unordered_map<model::TargetId, TargetData> live_targets;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: please include <unordered_map>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done... but removed as per the suggestion below.

TargetData target_data2 = MakeTargetData(testutil::Query("b"));
cache_->AddTarget(target_data2);

std::unordered_map<model::TargetId, TargetData> live_targets;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: no need to specify the model:: namespace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

cache_->AddTarget(target_data2);

std::unordered_map<model::TargetId, TargetData> live_targets;
cache_->RemoveTargets(target_data2.sequence_number(), live_targets);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: you could use empty braces to simply create the empty map on the fly:

cache_->RemoveTargets(target_data2.sequence_number(), {});

Applies below as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

index_iterator->Seek(index_prefix);

LevelDbQueryTargetKey row_key;
for (; index_iterator->Valid(); index_iterator->Next()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pessimizing target deletion considerably, making it O(targets) where it was previously O(targets-to-delete). Previously, RemoveTarget would construct the query-target key from the target data but now we're scanning all query-target rows.

It seems like we could try to construct the query-target key to delete based on the proto data and do like RemoveTarget always did.

Indeed I'm surprised that this this PR didn't just add a RemoveTarget overload that operated on a const message reference instead of changing this around to reimplement deletion.

Ultimately the workaround we're adding here is for a problem where the query proto has some invalid data in there. We haven't proven that this is happening because of general data corruption issues so I'm disinclined to rewrite all of this code in a way that doesn't trust the relationships built into the data. We should just not validate the Query part of the TargetData when removing a target.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with your concerns, but I went with this approach for two reasons:

  • We don't actually store the Canonical ID. Instead, we compute it from the Target model. That means that we have to fully deserialize into the model. If we don't validate, we end up with a Model that has an invalid field ordering, which we would then use to create a Canonical ID that is different from the original ID (unless they happen to both be corrupt). Hence, we don't know whether we would be able to delete the row from Level DB.

  • The LRU runs is already O(n) as it has to read all targets to collect the sequence numbers.

That's all I got for now :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow I thought we stored the canonical ID, but you're right: the only place it exists is in the keys of the query-target mapping.

Your point re O(n) is valid, but it's a different "table" so it's more I/O. If there were a way to avoid this I'd push for it, but I can see how you arrived here.

I'm still concerned that regular deletion and GC deletion follow different code paths but it's definitely not worth pessimizing single target removal to make this line up, nor is it worth writing a migration to force these to line up. I guess this is as good as it's going to get.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regular deletion is actually not used for LevelDB, so we could use the new code path for RemoveTarget without much harm. Let me know if you'd like me to do so.

Copy link
Contributor

@var-const var-const left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but see Gil's comment.

Copy link
Contributor

@wilhuff wilhuff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -1,5 +1,11 @@
# Unreleased
- [fixed] The SDK no longer fully deserializes queries during garbage,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra comma after garbage.

Also consider rephrasing to be less technically detailed. Something like: "Fixed a crash that could happen when the SDK encountered invalid data during garbage collection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

index_iterator->Seek(index_prefix);

LevelDbQueryTargetKey row_key;
for (; index_iterator->Valid(); index_iterator->Next()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow I thought we stored the canonical ID, but you're right: the only place it exists is in the keys of the query-target mapping.

Your point re O(n) is valid, but it's a different "table" so it's more I/O. If there were a way to avoid this I'd push for it, but I can see how you arrived here.

I'm still concerned that regular deletion and GC deletion follow different code paths but it's definitely not worth pessimizing single target removal to make this line up, nor is it worth writing a migration to force these to line up. I guess this is as good as it's going to get.

@schmidt-sebastian schmidt-sebastian merged commit f576aa1 into master Nov 30, 2020
@firebase firebase locked and limited conversation to collaborators Dec 31, 2020
@paulb777 paulb777 deleted the mrschmidt/targetremoval branch January 10, 2021 00:42
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Empty orderBy field path when retrieving cached query from LevelDB
4 participants