Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MOD-5181: Bug fix - Update numeric inverted index numEntries #3606

Merged
merged 6 commits into from May 28, 2023

Conversation

meiravgri
Copy link
Collaborator

@meiravgri meiravgri commented May 28, 2023

As the inverted index numEntries was never updated, we kept splitting the numeric tree. This caused the memory to grow constantly.

In addition, the numRecords field in the spec statistics was decreased by numEntries upon removing a range, but since the numEntries kept growing, and wasn't aligned with the actual number of entries, numRecords decreased below 0 and overflowed.

In this PR we update numEntries in the GC according to the number of removed entries collected.

related issue -MOD-5181

@codecov
Copy link

codecov bot commented May 28, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.06 ⚠️

Comparison is base (ed988bd) 82.44% compared to head (bfdd1b3) 82.38%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3606      +/-   ##
==========================================
- Coverage   82.44%   82.38%   -0.06%     
==========================================
  Files         190      190              
  Lines       31139    31140       +1     
==========================================
- Hits        25671    25655      -16     
- Misses       5468     5485      +17     
Impacted Files Coverage Δ
src/fork_gc.c 56.54% <100.00%> (-0.66%) ⬇️

... and 5 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

added get tree max depth to pytests
@meiravgri meiravgri requested a review from DvirDukhan May 28, 2023 10:32


def dump_numeric_index_tree_root(env, idx, numeric_field):
tree_root_stats = dump_numeric_index_tree(env,idx,numeric_field)['root']
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2% of developers fix this issue

E231: missing whitespace after ','

❗❗ 3 similar findings have been found in this PR

🔎 Expand here to view all instances of this finding
File Path Line Number
tests/pytests/common.py 175
tests/pytests/test_numbers.py 11
tests/pytests/test_numbers.py 28

Visit the Lift Web Console to find more details in your report.


ℹ️ Expand to see all @sonatype-lift commands

You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.

Command Usage
@sonatype-lift ignore Leave out the above finding from this PR
@sonatype-lift ignoreall Leave out all the existing findings from this PR
@sonatype-lift exclude <file|issue|path|tool> Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file

Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.

tests/pytests/common.py Outdated Show resolved Hide resolved
@@ -7,6 +7,33 @@
from RLTest import Env
import math

def testNumEntries(env):
env.expect('ft.config', 'set', 'FORK_GC_CLEAN_THRESHOLD',0).equal('OK')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1% of developers fix this issue

W191: indentation contains tabs

❗❗ 19 similar findings have been found in this PR

🔎 Expand here to view all instances of this finding
File Path Line Number
tests/pytests/test_numbers.py 12
tests/pytests/test_numbers.py 13
tests/pytests/test_numbers.py 15
tests/pytests/test_numbers.py 16
tests/pytests/test_numbers.py 17
tests/pytests/test_numbers.py 18
tests/pytests/test_numbers.py 19
tests/pytests/test_numbers.py 20
tests/pytests/test_numbers.py 21
tests/pytests/test_numbers.py 23

Showing 10 of 19 findings. Visit the Lift Web Console to see all.


ℹ️ Expand to see all @sonatype-lift commands

You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.

Command Usage
@sonatype-lift ignore Leave out the above finding from this PR
@sonatype-lift ignoreall Leave out all the existing findings from this PR
@sonatype-lift exclude <file|issue|path|tool> Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file

Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.

env.cmd('hset', f'{i}', 'num', f'{i}')

# explicitly run gc to update spec stats and the inverted index number of entries.
forceInvokeGC(env, 'idx')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

11% of developers fix this issue

E0602: Undefined variable 'forceInvokeGC'

❗❗ 2 similar findings have been found in this PR

🔎 Expand here to view all instances of this finding
File Path Line Number
tests/pytests/test_numbers.py 27
tests/pytests/test_numbers.py 31

Visit the Lift Web Console to find more details in your report.


ℹ️ Expand to see all @sonatype-lift commands

You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.

Command Usage
@sonatype-lift ignore Leave out the above finding from this PR
@sonatype-lift ignoreall Leave out all the existing findings from this PR
@sonatype-lift exclude <file|issue|path|tool> Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file

Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.


loops = 15
hashes_number = 10_000
expected_depth = 17
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1% of developers fix this issue

W191: indentation contains tabs


ℹ️ Expand to see all @sonatype-lift commands

You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.

Command Usage
@sonatype-lift ignore Leave out the above finding from this PR
@sonatype-lift ignoreall Leave out all the existing findings from this PR
@sonatype-lift exclude <file|issue|path|tool> Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file

Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.

expected_depth = 17
for i in range(loops):
# In each loop re-index 0, 1,...,`hashes_number`-1 entries with increasing values
for i in range(hashes_number):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1% of developers fix this issue

W191: indentation contains tabs


ℹ️ Expand to see all @sonatype-lift commands

You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.

Command Usage
@sonatype-lift ignore Leave out the above finding from this PR
@sonatype-lift ignoreall Leave out all the existing findings from this PR
@sonatype-lift exclude <file|issue|path|tool> Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file

Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.

DvirDukhan
DvirDukhan previously approved these changes May 28, 2023
@@ -165,6 +165,20 @@ def index_info(env, idx):
res = {res[i]: res[i + 1] for i in range(0, len(res), 2)}
return res


def dump_numeric_index_tree(env, idx, numeric_field):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dump is only at the top level. Consider renaming

'maxDepth']
env.assertGreaterEqual(expected_depth, numeric_tree_depth)

env.cmd('flushall')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove this

@oshadmi oshadmi changed the title Bug fix - Update numeric inverted index numEntries MOD-5181: Bug fix - Update numeric inverted index numEntries May 28, 2023
@meiravgri meiravgri merged commit 84dbca1 into master May 28, 2023
14 of 15 checks passed
@meiravgri meiravgri deleted the meiravg_bugfix_update_numeric_numEntries branch May 28, 2023 15:31
rafie pushed a commit that referenced this pull request Jun 8, 2023
* Update numeric inverted index numEntries

* added numeric tree test
added get tree max depth to pytests

(cherry picked from commit 84dbca1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants