OAK-10657: shrink in-DB documents after updates fail due to 16MB limit #1314
base: trunk
Changes from 4 commits
1e06134
4866a37
23a37bf
37658dd
94f56dd
5bc3172
@@ -26,6 +26,7 @@
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Date;
import java.util.Iterator;
@@ -46,6 +47,7 @@
import org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo;
import org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfoDocument;
import org.apache.jackrabbit.oak.plugins.document.Collection;
import org.apache.jackrabbit.oak.plugins.document.Document;
import org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBuilder;
import org.apache.jackrabbit.oak.plugins.document.DocumentStore;
import org.apache.jackrabbit.oak.plugins.document.DocumentStoreException;
@@ -54,6 +56,8 @@
import org.apache.jackrabbit.oak.plugins.document.Revision;
import org.apache.jackrabbit.oak.plugins.document.RevisionVector;
import org.apache.jackrabbit.oak.plugins.document.StableRevisionComparator;
import org.apache.jackrabbit.oak.plugins.document.UpdateOp;
import org.apache.jackrabbit.oak.plugins.document.UpdateOp.Key;
import org.apache.jackrabbit.oak.spi.toggle.Feature;
import org.apache.jackrabbit.oak.stats.Clock;
import org.jetbrains.annotations.NotNull;
@@ -274,6 +278,73 @@ private static String diagsForEntry(Map.Entry<String, PropertyStats> member) {
        }
    }

    /**
     * @return cluster id from first revision found in op, {@code -1} otherwise
     */
    public static int extractClusterId(UpdateOp op) {
        for (Key key : op.getChanges().keySet()) {
            if (key.getRevision() != null) {
                return key.getRevision().getClusterId();
            }
        }
        return -1;
    }
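
For readers following the diff, here is a minimal, self-contained sketch of the first-match lookup that `extractClusterId` performs. `Rev` and `Key` are hypothetical stand-ins for Oak's `Revision` and `UpdateOp.Key`, and the change set is reduced to a plain `Map`; none of these types are the Oak API.

```java
import java.util.Map;

class ClusterIdSketch {

    /** Hypothetical stand-in for Oak's Revision. */
    static final class Rev {
        final long timestamp;
        final int counter;
        final int clusterId;

        Rev(long timestamp, int counter, int clusterId) {
            this.timestamp = timestamp;
            this.counter = counter;
            this.clusterId = clusterId;
        }
    }

    /** Hypothetical stand-in for UpdateOp.Key: a property name plus an optional revision. */
    static final class Key {
        final String name;
        final Rev revision; // null for changes that are not keyed by revision

        Key(String name, Rev revision) {
            this.name = name;
            this.revision = revision;
        }
    }

    /** Returns the cluster id of the first revision-keyed change, or -1 if there is none. */
    static int extractClusterId(Map<Key, String> changes) {
        for (Key key : changes.keySet()) {
            if (key.revision != null) {
                return key.revision.clusterId;
            }
        }
        return -1;
    }
}
```

In this sketch, "first" means first in the map's iteration order, so the result is only well defined when every revision-keyed change in the map carries the same cluster id or the map has a stable order.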

    /**
     * Produce an {@link UpdateOp} suitable for shrinking branch revision entries for given property in {@link Document}, {@code null} otherwise.
     *
     * @param doc document to inspect for repeated branch commits
     * @param propertName property to check for
     * @param revisionChecker filter for revisions (for instance, to check for cluster id)
     * @return {@link UpdateOp} suitable for shrinking document, {@code null} otherwise
     */
    public static @Nullable UpdateOp getShrinkOp(Document doc, String propertyName, Predicate<Revision> revisionChecker) {

If

I think we can do that once we use that from NodeDocumentStore, not DocumentStore...

@mbaedke - this is where we could check the feature flag for now...

        Object t_bc = doc.get("_bc");
        Object t_property = doc.get(propertyName);

I would use camelCase.

        if (t_bc instanceof Map && t_property instanceof Map) {
            @SuppressWarnings("unchecked")
            Map<Revision, String> _bc = (Map<Revision, String>)t_bc;
            @SuppressWarnings("unchecked")
            Map<Revision, String> pMap = (Map<Revision, String>)t_property;
            List<Revision> revs = new ArrayList<>();
            for (Map.Entry<Revision, String> en : pMap.entrySet()) {
                Revision r = en.getKey();
                if (revisionChecker.apply(r)) {
                    String bcv = _bc.get(r);
                    if ("true".equals(bcv)) {
                        revs.add(r);
                    }
                }
            }
            // sort by age
            Collections.sort(revs, new Comparator<Revision>() {

wondering if there isn't such a

Not in Revision, as far as I can tell. I wanted a comparator that sorts by clusterId first; we may not need this if we always filter by cluster id though.

                @Override
                public int compare(Revision r1, Revision r2) {
                    if (r1.getClusterId() != r2.getClusterId()) {
                        return r1.getClusterId() - r2.getClusterId();
                    } else if (r1.getTimestamp() != r2.getTimestamp()) {
                        return r1.getTimestamp() > r2.getTimestamp() ? 1 : -1;
                    } else {
                        return r1.getCounter() - r2.getCounter();
                    }
                }});

Comment on lines +320 to +330
Suggested change

            UpdateOp clean = new UpdateOp(doc.getId(), false);
            Revision last = null;
            for (Revision r : revs) {
                if (last != null) {
                    if (last.getClusterId() == r.getClusterId()) {
                        clean.removeMapEntry(propertyName, last);
                    }
                }
                last = r;
            }

Comment on lines +334 to +341

PS:

some ideas:

those 3 cases could be .. test cases .. :)

regarding
that might actually be a tricky thing to achieve - and I believe we might not have done that properly in the

... or maybe not a physical checkpoint, but a root revision that corresponds to reading 24h ago: which we might substitute with corresponding revisions (with timestamp 24h minus 1 millisecond) for each known clusterId ... or something like that ...

... taking that back .. the difference between

            return clean.hasChanges() ? clean : null;
        } else {
            return null;
        }
    }
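
To make the selection rule in `getShrinkOp` easier to follow, here is a minimal sketch of the same logic on plain Java collections. It is an illustration under stated assumptions, not the PR's code: `Rev` is a hypothetical stand-in for Oak's `Revision`, the document is reduced to two maps, the predicate is `java.util.function.Predicate`, and the method returns the revisions that would be removed instead of building an `UpdateOp`. The ordering (cluster id, then timestamp, then counter) is expressed with `Comparator` chaining, which is assumed to be equivalent to the anonymous comparator in the diff.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Predicate;

class ShrinkSketch {

    /** Hypothetical stand-in for Oak's Revision (timestamp, counter, cluster id). */
    static final class Rev {
        final long timestamp;
        final int counter;
        final int clusterId;

        Rev(long timestamp, int counter, int clusterId) {
            this.timestamp = timestamp;
            this.counter = counter;
            this.clusterId = clusterId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Rev)) {
                return false;
            }
            Rev other = (Rev) o;
            return timestamp == other.timestamp
                    && counter == other.counter
                    && clusterId == other.clusterId;
        }

        @Override
        public int hashCode() {
            return Objects.hash(timestamp, counter, clusterId);
        }
    }

    /** Orders by cluster id first, then by age (timestamp, then counter). */
    static final Comparator<Rev> BY_CLUSTER_THEN_AGE =
            Comparator.comparingInt((Rev r) -> r.clusterId)
                    .thenComparingLong(r -> r.timestamp)
                    .thenComparingInt(r -> r.counter);

    /**
     * Returns the property revisions that the shrink step would remove:
     * for each cluster id, every branch-commit entry except the newest one.
     */
    static List<Rev> entriesToRemove(Map<Rev, String> bc,
                                     Map<Rev, String> property,
                                     Predicate<Rev> revisionChecker) {
        List<Rev> revs = new ArrayList<>();
        for (Rev r : property.keySet()) {
            // keep only revisions accepted by the filter and marked as branch commits
            if (revisionChecker.test(r) && "true".equals(bc.get(r))) {
                revs.add(r);
            }
        }
        revs.sort(BY_CLUSTER_THEN_AGE);

        List<Rev> toRemove = new ArrayList<>();
        Rev last = null;
        for (Rev r : revs) {
            // a newer entry for the same cluster id makes the previous one removable
            if (last != null && last.clusterId == r.clusterId) {
                toRemove.add(last);
            }
            last = r;
        }
        return toRemove;
    }
}
```

With three branch-commit entries for cluster id 1 and one for cluster id 2, the sketch selects the two older cluster-1 entries and leaves the newest entry of each cluster in place. Entries without a `"true"` marker in the `_bc`-style map are never selected.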

    /**
     * List of property names that are system-defined by JCR and thus do not
     * need to be redacted (to be expanded later)

Make it @param propertyName, please.