
Ref count record #663

Merged
merged 16 commits into master from ref-count-record on May 13, 2024

Conversation

@swilly22 (Contributor) commented Apr 27, 2024

fix #664

Summary by CodeRabbit

  • Refactor

    • Enhanced memory management and reference handling in the Execution Plan modules for improved performance.
    • Streamlined record deletion processes for better memory utilization and stability during operations.
  • Bug Fixes

    • Addressed issues related to record handling to prevent double frees and enhance system reliability.

@swilly22 requested a review from AviAvni April 27, 2024 16:43

coderabbitai bot commented Apr 27, 2024

Walkthrough

The updates focus on enhancing memory management and function handling across various modules. Key changes include improved record deletion based on reference counts, the introduction of new functions for merging records, and adjustments to function signatures for better memory handling. These modifications aim to optimize performance and clarity in the codebase.
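
For readers outside the codebase, here is a minimal, self-contained sketch (illustrative names only, not the actual FalkorDB API) of the pass-a-pointer, free-at-zero pattern the walkthrough describes, and of how it prevents the double frees mentioned in the summary:

#include <assert.h>
#include <stdlib.h>

typedef struct {
    unsigned ref_count;  // number of live owners of this record
    void *entries;       // placeholder for the record's payload
} RefRecord;

static RefRecord *RefRecord_New(void) {
    RefRecord *r = calloc(1, sizeof(RefRecord));
    r->ref_count = 1;                 // creator holds the first reference
    return r;
}

static RefRecord *RefRecord_Share(RefRecord *r) {
    r->ref_count++;                   // another operation co-owns the record
    return r;
}

// taking RefRecord** lets the callee clear the caller's handle, so a second
// "delete" of the same handle becomes a harmless no-op instead of a double free
static void RefRecord_Delete(RefRecord **r) {
    if (r == NULL || *r == NULL) return;
    assert((*r)->ref_count > 0);
    if (--(*r)->ref_count == 0) {
        free(*r);                     // last owner releases the memory
    }
    *r = NULL;
}

int main(void) {
    RefRecord *a = RefRecord_New();
    RefRecord *b = RefRecord_Share(a);  // two owners, one allocation
    RefRecord_Delete(&a);               // decrement only; b is still valid
    RefRecord_Delete(&b);               // count reaches zero, memory freed
    RefRecord_Delete(&b);               // no-op: handle already NULL
    return 0;
}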

Changes

Files | Summary
op.c, op.h, op_aggregate.c, op_apply.c, op_call_subquery.c, op_value_hash_join.c, op_cartesian_product.c, op_cartesian_product.h, op_foreach.c | Updated OpBase_DeleteRecord function calls to handle pointers and introduced OpBase_MergeRecords. Adjusted function signatures and memory management strategies.
record.c, record.h | Enhanced record handling with new fields and functions, improved memory management, and updated function names for clarity.
grouping/group.c, grouping/group.h | Modified Group_New function to use Record instead of SIValue *keys and updated related logic.
tests/flow/test_bfs.py, tests/flow/test_comprehension_functions.py, tests/flow/test_foreach.py | Minor updates to test files for better readability and updated node properties in graph tests.
arithmetic/comprehension_funcs/comprehension_funcs.c | Refactored functions for clarity and consistency in comments.

Poem

🐰🌟
Once upon a code so bright,
Changes made with all our might.
Records merge, and pointers shift,
In our code, no memory rift.
Hop and code, day and night,
With every line, we make it right!
🌙📜



coderabbitai bot left a comment

Actionable comments posted: 38

src/execution_plan/ops/op_call_subquery.c (resolved)
src/execution_plan/ops/op_call_subquery.c (resolved)
src/execution_plan/ops/op_merge_create.c (resolved)
src/execution_plan/ops/op_merge_create.c (outdated, resolved)
src/execution_plan/ops/op_aggregate.c (resolved)
src/execution_plan/ops/op_aggregate.c (outdated, resolved)
src/execution_plan/ops/op_aggregate.c (resolved)
@swilly22 linked an issue May 1, 2024 that may be closed by this pull request
src/execution_plan/ops/op_create.c (outdated, resolved)
src/execution_plan/ops/op_delete.c (outdated, resolved)
src/execution_plan/ops/op_update.c (outdated, resolved)
@@ -93,7 +93,7 @@ static Record _handoff
) {
// if there is a new value ready, return it
if(op->listIdx < op->listLen) {
Record r = OpBase_DeepCloneRecord(op->currentRecord);
Record r = OpBase_CloneRecord(op->currentRecord);
SIValue v = SIArray_Get(op->list, op->listIdx);

if(!(SI_TYPE(v) & SI_GRAPHENTITY)) {
swilly22 (Contributor, Author) commented:

Check if we really need to persist?
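
A self-contained illustration (plain C, not the engine's types; all names below are made up for the example) of the trade-off behind this question: a shallow clone shares heap-allocated scalars with the original, so either the value must be persisted (copied) before the original is freed, or ownership must be tracked, which is what the reference counting in this PR is meant to do.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct { char *s; } Entry;

// shares the same heap string as the source entry
static Entry shallow_clone(const Entry *e) {
    return *e;
}

// owns an independent copy of the string ("persisted")
static Entry deep_clone(const Entry *e) {
    Entry c = { strdup(e->s) };
    return c;
}

int main(void) {
    Entry a       = { strdup("value") };
    Entry shallow = shallow_clone(&a);
    Entry deep    = deep_clone(&a);

    free(a.s);                    // the original owner frees its scalar
    printf("%s\n", deep.s);       // safe: independent allocation
    // printf("%s\n", shallow.s); // would be a use-after-free: shared pointer
    (void)shallow;
    free(deep.s);
    return 0;
}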

src/execution_plan/ops/op_merge_create.c (outdated, resolved)
src/util/object_pool/object_pool.c (outdated, resolved)
src/execution_plan/record.h (outdated, resolved)
Comment on lines 41 to 42
void *owner; // owner of record
rax *mapping; // mapping between alias to record entry
swilly22 (Contributor, Author) commented:

Get mapping from owner.

src/execution_plan/record.h (outdated, resolved)
src/execution_plan/record.c (resolved)
coderabbitai bot left a comment

Actionable comments posted: 1


// Zero-set the item being returned.
// zero-set the item being returned

Replace memset with memset_s to ensure safe memory operations and prevent potential buffer overflows.

- memset(item, 0, block->itemSize - HEADER_SIZE);
+ memset_s(item, block->itemSize - HEADER_SIZE, 0, block->itemSize - HEADER_SIZE);
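
One caveat worth flagging about this suggestion: memset_s belongs to the optional C11 Annex K bounds-checking interfaces, which glibc does not provide, so the replacement only compiles where an implementation ships it. A hedged sketch of a guarded form (the helper name is illustrative, not project code):

#define __STDC_WANT_LIB_EXT1__ 1   // ask for Annex K declarations if present
#include <stddef.h>
#include <string.h>

static void zero_item(void *item, size_t n) {
#if defined(__STDC_LIB_EXT1__)
    memset_s(item, n, 0, n);       // bounds-checked variant where available
#else
    memset(item, 0, n);            // portable fallback: Annex K is optional
#endif
}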


coderabbitai bot left a comment

Actionable comments posted: 0

Out of diff range and nitpick comments (3)
src/arithmetic/list_funcs/list_funcs.c (3)

Line range hint 1-1: Ensure the copyright notice is up-to-date.


[!TIP]
Codebase Verification

The verification of the ListReduceCtx_Clone function reveals that the function does not perform deep cloning for complex data structures. Specifically, the record field is set to NULL in the cloned context, which indicates that any complex data associated with this field is not cloned. This could potentially lead to issues if the cloned context is expected to have an independent copy of these data structures.

  • The record field, which could potentially be a complex data structure, is not cloned but set to NULL.
Analysis chain

Line range hint 31-41: Check for deep cloning in ListReduceCtx_Clone to prevent potential memory issues.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify deep cloning in ListReduceCtx_Clone to ensure no shallow copies are made that could lead to double free errors.

# Search for usages of ListReduceCtx_Clone and analyze the cloning mechanism.
rg 'ListReduceCtx_Clone' --files-with-matches | xargs ast-grep --lang c --pattern 'ListReduceCtx_Clone($_)'

Length of output: 107



Script:

#!/bin/bash
# Description: Directly examine the implementation of ListReduceCtx_Clone to ensure deep cloning.

# Search for the definition of ListReduceCtx_Clone and display its implementation.
rg --multiline 'ListReduceCtx_Clone\s*\([^)]*\)\s*\{' -C 10 src/arithmetic/list_funcs/list_funcs.c

Length of output: 704


Line range hint 43-63: Ensure robust error handling and memory management in _PopulateReduceCtx.

- raxTryInsert(record_map, (unsigned char *)ctx->variable, strlen(ctx->variable), (void *)id, NULL);
- raxTryInsert(record_map, (unsigned char *)ctx->accumulator, strlen(ctx->accumulator), (void *)id, NULL);
+ if (!raxTryInsert(record_map, (unsigned char *)ctx->variable, strlen(ctx->variable), (void *)id, NULL)) {
+     // Handle error: insertion failed
+ }
+ if (!raxTryInsert(record_map, (unsigned char *)ctx->accumulator, strlen(ctx->accumulator), (void *)id, NULL)) {
+     // Handle error: insertion failed
+ }

coderabbitai bot left a comment

Actionable comments posted: 1

Out of diff range and nitpick comments (3)
tests/flow/test_foreach.py (3)

Line range hint 1-1: Consider replacing from common import * with specific imports to avoid namespace pollution and improve code clarity.


Line range hint 2-2: Remove the unused import from collections import Counter to clean up the code.


Line range hint 263-263: The local variable res is assigned but never used in this context. Consider removing it if it's unnecessary.

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits: Files that changed from the base of the PR and between 87faebd and f300956.
Files selected for processing (2)
  • src/execution_plan/ops/op_aggregate.c (5 hunks)
  • tests/flow/test_foreach.py (3 hunks)
Files skipped from review as they are similar to previous changes (1)
  • src/execution_plan/ops/op_aggregate.c
Additional Context Used
Ruff (20)
tests/flow/test_foreach.py (20)

1-1: from common import * used; unable to detect undefined names


2-2: collections.Counter imported but unused


8-8: Env may be undefined, or defined from star imports


263-263: Local variable res is assigned to but never used


278-278: redis may be undefined, or defined from star imports


293-293: redis may be undefined, or defined from star imports


326-326: Node may be undefined, or defined from star imports


333-333: Node may be undefined, or defined from star imports


364-364: Node may be undefined, or defined from star imports


366-366: Node may be undefined, or defined from star imports


407-407: Node may be undefined, or defined from star imports


410-410: Node may be undefined, or defined from star imports


413-413: Node may be undefined, or defined from star imports


414-414: Node may be undefined, or defined from star imports


428-428: redis may be undefined, or defined from star imports


440-440: redis may be undefined, or defined from star imports


447-447: redis may be undefined, or defined from star imports


604-604: Node may be undefined, or defined from star imports


620-620: redis may be undefined, or defined from star imports


627-627: redis may be undefined, or defined from star imports

Additional comments not posted (16)
tests/flow/test_foreach.py (16)

Line range hint 8-8: Ensure that Env is defined somewhere in the imported modules, or import it explicitly to avoid runtime errors.


Line range hint 278-278: Ensure that the redis module is imported or available in the global namespace to avoid runtime errors.

Verification successful

The verification process has confirmed that the redis module is indeed imported in multiple files within the codebase, including those within the tests directory. This ensures that the usage of redis.exceptions.ResponseError in the test_foreach.py file is supported by the necessary import statements elsewhere in the test suite, mitigating the risk of runtime errors due to missing imports.

  • File Locations with redis Import:
    • tests/disposableredis/__init__.py
    • tests/flow/common.py
    • demo/imdb/imdb_demo.py
    • demo/client/client.py
    • demo/imdb/disposableredis/__init__.py
    • demo/social/social_demo.py
    • demo/social/disposableredis/__init__.py
Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the import of `redis`.

# Test: Search for the import of `redis`. Expect: At least one occurrence.
rg --type py 'import redis'

Length of output: 325


Line range hint 293-293: Ensure that the redis module is imported or available in the global namespace to avoid runtime errors.


Line range hint 333-333: Ensure that Node is defined somewhere in the imported modules, or import it explicitly to avoid runtime errors.


Line range hint 364-364: Ensure that Node is defined somewhere in the imported modules, or import it explicitly to avoid runtime errors.


Line range hint 366-366: Ensure that Node is defined somewhere in the imported modules, or import it explicitly to avoid runtime errors.


Line range hint 407-407: Ensure that Node is defined somewhere in the imported modules, or import it explicitly to avoid runtime errors.


Line range hint 410-410: Ensure that Node is defined somewhere in the imported modules, or import it explicitly to avoid runtime errors.


Line range hint 413-413: Ensure that Node is defined somewhere in the imported modules, or import it explicitly to avoid runtime errors.


Line range hint 414-414: Ensure that Node is defined somewhere in the imported modules, or import it explicitly to avoid runtime errors.


Line range hint 428-428: Ensure that the redis module is imported or available in the global namespace to avoid runtime errors.


Line range hint 440-440: Ensure that the redis module is imported or available in the global namespace to avoid runtime errors.


Line range hint 447-447: Ensure that the redis module is imported or available in the global namespace to avoid runtime errors.


Line range hint 604-604: Ensure that Node is defined somewhere in the imported modules, or import it explicitly to avoid runtime errors.


Line range hint 620-620: Ensure that the redis module is imported or available in the global namespace to avoid runtime errors.


Line range hint 627-627: Ensure that the redis module is imported or available in the global namespace to avoid runtime errors.

coderabbitai bot left a comment

Actionable comments posted: 0

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits: Files that changed from the base of the PR and between f300956 and 43565d9.
Files selected for processing (6)
  • src/execution_plan/ops/op_conditional_traverse.c (5 hunks)
  • src/execution_plan/ops/op_create.c (2 hunks)
  • src/execution_plan/ops/op_delete.c (2 hunks)
  • src/execution_plan/ops/op_foreach.c (4 hunks)
  • src/execution_plan/record.c (4 hunks)
  • src/execution_plan/record.h (4 hunks)
Files skipped from review as they are similar to previous changes (6)
  • src/execution_plan/ops/op_conditional_traverse.c
  • src/execution_plan/ops/op_create.c
  • src/execution_plan/ops/op_delete.c
  • src/execution_plan/ops/op_foreach.c
  • src/execution_plan/record.c
  • src/execution_plan/record.h

coderabbitai bot left a comment

Actionable comments posted: 0

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits: Files that changed from the base of the PR and between 43565d9 and b682a46.
Files selected for processing (8)
  • src/execution_plan/ops/op.c (2 hunks)
  • src/execution_plan/ops/op.h (1 hunks)
  • src/execution_plan/ops/op_aggregate.c (5 hunks)
  • src/execution_plan/ops/op_apply.c (4 hunks)
  • src/execution_plan/ops/op_call_subquery.c (7 hunks)
  • src/execution_plan/ops/op_value_hash_join.c (7 hunks)
  • src/execution_plan/record.c (4 hunks)
  • src/execution_plan/record.h (4 hunks)
Files skipped from review as they are similar to previous changes (8)
  • src/execution_plan/ops/op.c
  • src/execution_plan/ops/op.h
  • src/execution_plan/ops/op_aggregate.c
  • src/execution_plan/ops/op_apply.c
  • src/execution_plan/ops/op_call_subquery.c
  • src/execution_plan/ops/op_value_hash_join.c
  • src/execution_plan/record.c
  • src/execution_plan/record.h

coderabbitai bot left a comment

Actionable comments posted: 4

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits: Files that changed from the base of the PR and between b682a46 and 4ca0a28.
Files selected for processing (3)
  • src/execution_plan/ops/op_aggregate.c (7 hunks)
  • src/grouping/group.c (2 hunks)
  • src/grouping/group.h (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • src/execution_plan/ops/op_aggregate.c

uint func_count; // number of aggregation functions
Record r; // representative record

Change Record r to Record *r to align with memory management improvements.

AR_ExpNode **agg, // aggregation functions
uint func_count // number of aggregation functions
uint func_count, // number of aggregation functions
Record r // representative record

Change the parameter from Record r to Record *r in the Group_New function declaration to enhance memory management.

) {
Group *g = rm_malloc(sizeof(Group));

g->keys = keys;
g->r = r;

If Record r is changed to Record *r in the Group_New function declaration, update the assignment to g->r = r; accordingly.

Comment on lines +39 to +40
if(g->r != NULL) {
OpBase_DeleteRecord(&(g->r));

Remove the redundant NULL check before deleting the record. Ensure OpBase_DeleteRecord handles NULL pointers internally.

coderabbitai bot left a comment

Actionable comments posted: 0

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits: Files that changed from the base of the PR and between 4ca0a28 and 4a3ff09.
Files selected for processing (1)
  • src/execution_plan/ops/op_aggregate.c (7 hunks)
Files skipped from review as they are similar to previous changes (1)
  • src/execution_plan/ops/op_aggregate.c

}
Group *g = (Group*)HashTableGetVal(entry);
Record r = g->r;
g->r = NULL;

// compute the final value of all aggregate expressions and add to Record
for(uint i = 0; i < op->aggregate_count; i++) {
int rec_idx = op->record_offsets[i + op->key_count];
AR_ExpNode *exp = g->agg[i];

SIValue agg = AR_EXP_FinalizeAggregations(exp, r);
Contributor commented:

finalize aggregation one time value returning delete persist

swilly22 (Contributor, Author) replied:

Proved to be more difficult than expected. For the most part (e.g. count, max, min) this change is meaningless; the only place where it can show a benefit is with the collect function, which causes a redundant clone of the array.

src/execution_plan/ops/op_call_subquery.c (outdated, resolved)
src/execution_plan/ops/op_foreach.c (outdated, resolved)
typedef _Record *Record;

typedef struct _Record {
Contributor commented:

remove typedef as it is already declared in line 35

swilly22 (Contributor, Author) replied:

Can't remove it; it causes a compile error.


codecov bot commented May 13, 2024

Codecov Report

Attention: Patch coverage is 87.39496%, with 30 lines in your changes missing coverage. Please review.

Project coverage is 81.00%. Comparing base (fc0dba1) to head (79b8567).

Files | Patch % | Lines
src/execution_plan/ops/op_sort.c | 16.66% | 5 Missing ⚠️
src/execution_plan/ops/op_expand_into.c | 42.85% | 4 Missing ⚠️
src/execution_plan/ops/op_apply_multiplexer.c | 71.42% | 2 Missing ⚠️
src/execution_plan/ops/op_argument.c | 0.00% | 2 Missing ⚠️
src/execution_plan/ops/op_call_subquery.c | 71.42% | 2 Missing ⚠️
src/execution_plan/ops/op_merge.c | 33.33% | 2 Missing ⚠️
src/execution_plan/ops/op_node_by_id_seek.c | 50.00% | 2 Missing ⚠️
src/execution_plan/ops/op_node_by_label_scan.c | 66.66% | 2 Missing ⚠️
src/execution_plan/ops/op_semi_apply.c | 66.66% | 2 Missing ⚠️
src/execution_plan/ops/op_all_node_scan.c | 66.66% | 1 Missing ⚠️
... and 6 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #663      +/-   ##
==========================================
+ Coverage   80.97%   81.00%   +0.02%     
==========================================
  Files         313      313              
  Lines       31887    31853      -34     
  Branches     5115     5118       +3     
==========================================
- Hits        25822    25801      -21     
+ Misses       5229     5218      -11     
+ Partials      836      834       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

coderabbitai bot left a comment

Actionable comments posted: 8

Out of diff range and nitpick comments (7)
tests/flow/test_foreach.py (7)

Line range hint 1-1: Avoid using wildcard imports to prevent namespace pollution and improve code clarity.

- from common import *
+ from common import Env  # Import only the necessary components

Line range hint 2-2: Remove unused imports to clean up the code.

- from collections import Counter

[!TIP]
Codebase Verification

The verification process has revealed that the function Env is indeed defined in tests/flow/common.py, but there is no evidence of this module being imported in tests/flow/test_foreach.py. This suggests that the usage of Env in test_foreach.py might lead to a runtime error unless there is an implicit import mechanism not captured by the search or a modification in the import statements not reflected in the provided scripts.

Action Required:

  • Please ensure that tests/flow/common.py is properly imported in tests/flow/test_foreach.py to avoid runtime errors related to the use of Env.
Analysis chain

Line range hint 8-8: Ensure that Env is defined in the imported modules, or import it explicitly to avoid runtime errors.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the definition of `Env`.

# Test: Search for the definition of `Env`. Expect: At least one occurrence.
rg --type py 'class Env|def Env'

Length of output: 135



Script:

#!/bin/bash
# Description: Check if `common.py` is imported in `test_foreach.py`.

# Test: Search for import statements related to `common.py` in `test_foreach.py`.
rg --type py 'from .common import|import .common' tests/flow/test_foreach.py

Length of output: 78


Line range hint 278-278: Handle exceptions properly to ensure that the application can gracefully recover from errors.

+ from redis.exceptions import ResponseError

Line range hint 263-263: The local variable res is assigned but never used, which could indicate dead code or a missing implementation.


[!TIP]
Codebase Verification

Several files in the codebase still use the old function signature.

  • The usage of Node in tests/flow/test_foreach.py might lead to runtime errors as it is not imported from demo/client/client.py or defined locally in the test file.
Analysis chain

Line range hint 604-604: Ensure that Node is defined somewhere in the imported modules, or import it explicitly to avoid runtime errors.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the definition of `Node`.

# Test: Search for the definition of `Node`. Expect: At least one occurrence.
rg --type py 'class Node|def Node'

Length of output: 78



Script:

#!/bin/bash
# Description: Check if `Node` from `demo/client/client.py` is imported in `tests/flow/test_foreach.py`.

# Test: Search for import statements related to `Node` in `tests/flow/test_foreach.py`.
rg --type py 'from .*client.* import .*Node|import .*client.*' tests/flow/test_foreach.py

Length of output: 91


Line range hint 620-620: Handle exceptions properly to ensure that the application can gracefully recover from errors.

+ from redis.exceptions import ResponseError
Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits: Files that changed from the base of the PR and between 4a3ff09 and 62c300e.
Files selected for processing (10)
  • src/arithmetic/comprehension_funcs/comprehension_funcs.c (4 hunks)
  • src/execution_plan/ops/op_call_subquery.c (7 hunks)
  • src/execution_plan/ops/op_cartesian_product.c (1 hunks)
  • src/execution_plan/ops/op_cartesian_product.h (1 hunks)
  • src/execution_plan/ops/op_foreach.c (5 hunks)
  • src/execution_plan/record.c (7 hunks)
  • src/execution_plan/record.h (4 hunks)
  • tests/flow/test_bfs.py (1 hunks)
  • tests/flow/test_comprehension_functions.py (1 hunks)
  • tests/flow/test_foreach.py (4 hunks)
Files skipped from review as they are similar to previous changes (5)
  • src/execution_plan/ops/op_call_subquery.c
  • src/execution_plan/ops/op_cartesian_product.c
  • src/execution_plan/ops/op_cartesian_product.h
  • src/execution_plan/ops/op_foreach.c
  • src/execution_plan/record.h
Additional Context Used
Ruff (41)
tests/flow/test_bfs.py (12)

1-1: from common import * used; unable to detect undefined names


7-7: FlowTestsBase may be undefined, or defined from star imports


9-9: Env may be undefined, or defined from star imports


18-18: Node may be undefined, or defined from star imports


19-19: Node may be undefined, or defined from star imports


20-20: Node may be undefined, or defined from star imports


21-21: Node may be undefined, or defined from star imports


22-22: Node may be undefined, or defined from star imports


31-31: Edge may be undefined, or defined from star imports


32-32: Edge may be undefined, or defined from star imports


33-33: Edge may be undefined, or defined from star imports


34-34: Edge may be undefined, or defined from star imports

tests/flow/test_comprehension_functions.py (9)

1-1: from common import * used; unable to detect undefined names


6-6: ExecutionPlan may be undefined, or defined from star imports


13-13: FlowTestsBase may be undefined, or defined from star imports


15-15: Env may be undefined, or defined from star imports


26-26: Node may be undefined, or defined from star imports


30-30: Edge may be undefined, or defined from star imports


31-31: Edge may be undefined, or defined from star imports


165-165: redis may be undefined, or defined from star imports


172-172: redis may be undefined, or defined from star imports

tests/flow/test_foreach.py (20)

1-1: from common import * used; unable to detect undefined names


2-2: collections.Counter imported but unused


8-8: Env may be undefined, or defined from star imports


263-263: Local variable res is assigned to but never used


278-278: redis may be undefined, or defined from star imports


293-293: redis may be undefined, or defined from star imports


326-326: Node may be undefined, or defined from star imports


333-333: Node may be undefined, or defined from star imports


364-364: Node may be undefined, or defined from star imports


366-366: Node may be undefined, or defined from star imports


407-407: Node may be undefined, or defined from star imports


410-410: Node may be undefined, or defined from star imports


413-413: Node may be undefined, or defined from star imports


414-414: Node may be undefined, or defined from star imports


428-428: redis may be undefined, or defined from star imports


440-440: redis may be undefined, or defined from star imports


447-447: redis may be undefined, or defined from star imports


604-604: Node may be undefined, or defined from star imports


620-620: redis may be undefined, or defined from star imports


627-627: redis may be undefined, or defined from star imports

Record Record_New
(
rax *mapping
) {
ASSERT(mapping);
ASSERT(mapping != NULL);

Ensure proper error handling for ASSERT to avoid crashes in production.

- ASSERT(mapping != NULL);
+ if (mapping == NULL) {
+     ErrorCtx_RaiseRuntimeException("Mapping cannot be NULL");
+     return NULL;
+ }
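
Background on why the bot flags the bare ASSERT (a generic illustration, not project code): assert() is compiled out when NDEBUG is defined, so a release build would skip the NULL check entirely, which is the gap the explicit runtime guard above is meant to close.

#include <assert.h>
#include <stdio.h>

static const char *check_mapping(const void *mapping) {
    assert(mapping != NULL);   // removed entirely when built with -DNDEBUG
    if (mapping == NULL) {     // explicit guard that survives release builds
        return "mapping cannot be NULL";
    }
    return "ok";
}

int main(void) {
    int dummy = 0;
    printf("%s\n", check_mapping(&dummy));   // "ok" in any build
    // check_mapping(NULL) aborts in a debug build but returns the error
    // string under NDEBUG, illustrating why the review prefers an explicit
    // runtime exception over the assertion alone
    return 0;
}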


Comment on lines +60 to +98
const restrict Record r,
restrict Record clone
) {
int entry_count = Record_length(r);
size_t required_record_size = sizeof(Entry) * entry_count;
int entry_count = Record_length(clone);
// r and clone share the same record mapping
if(likely(r->owner == clone->owner)) {
size_t required_record_size = sizeof(Entry) * entry_count;
memcpy(clone->entries, r->entries, required_record_size);

// foreach scalar entry in cloned record, make sure it is not freed
// it is the original record owner responsibility to free the record
// and its internal scalar as a result
//
} else {
// r and clone don't share the same mappings
// scan through each entry within r
// locate coresponding entry in clone
// if such exists shallow clone it
raxIterator it;
raxStart(&it, clone->mapping);
raxSeek(&it, "^", NULL, 0);

while(raxNext(&it)) {
it.key[it.key_len] = '\0';
uint src_idx = Record_GetEntryIdx(r, (const char*)it.key);

if(src_idx == INVALID_INDEX) continue;
if(Record_GetType(r, src_idx) == REC_TYPE_UNKNOWN) continue;

intptr_t target_idx = (intptr_t)it.data;
Record_Add(clone, target_idx, Record_Get(r, src_idx));
}

memcpy(clone->entries, r->entries, required_record_size);
raxStop(&it);
}

/* Foreach scalar entry in cloned record, make sure it is not freed.
* it is the original record owner responsibility to free the record
* and its internal scalar as a result.
*
* TODO: I wish we wouldn't have to perform this loop as it is a major performance hit
* with the introduction of a garbage collection this should be removed. */
// TODO: i wish we wouldn't have to perform this loop
// as it is a major performance hit
// with the introduction of a garbage collection this should be removed

Optimize record cloning to reduce performance overhead.

- // TODO: i wish we wouldn't have to perform this loop
- // as it is a major performance hit
- // with the introduction of a garbage collection this should be removed
- for(int i = 0; i < entry_count; i++) {
-     if(Record_GetType(clone, i) == REC_TYPE_SCALAR) {
-         SIValue_MakeVolatile(&clone->entries[i].value.s);
-     }
- }
+ // Consider implementing a more efficient mechanism or using a different approach to manage scalar values.


@@ -319,12 +319,14 @@ def test08_complex(self):

# validate that the end state is correct
# there should be one node in the graph, with label N,
# properties: {v: 2, name: 'RAZmon', li: [1] and 4 nodes with label M
# properties: {v: 1, name: 'RAZ', li: [1,2,3,4]
# and 4 nodes with label M
# and properties: {v: 1, name: 'raz'} and li from 1 to 4
res = self.graph.query("MATCH (n:N) return n")
n = Node(labels='N', properties={'v': 1, 'name': 'RAZ', 'li': [1, 2, 3, 4]})

[!TIP]
Codebase Verification

The Node class is defined in demo/client/client.py but is not imported in tests/flow/test_foreach.py where it is used. This could lead to runtime errors as the class is not accessible in the test context.

  • Ensure that Node is imported from demo/client/client.py in tests/flow/test_foreach.py to avoid runtime errors.
Analysis chain

Ensure that Node is defined somewhere in the imported modules, or import it explicitly to avoid runtime errors.

Scripts Executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the definition of `Node`.

# Test: Search for the definition of `Node`. Expect: At least one occurrence.
rg --type py 'class Node|def Node'

Length of output: 78



Script:

#!/bin/bash
# Description: Check import statements in the test file to verify if `Node` is imported.

# Test: Search for import statements related to `Node` in `tests/flow/test_foreach.py`.
rg --type py 'from .* import .*Node|import .*Node' tests/flow/test_foreach.py

Length of output: 79

Comment on lines +106 to +130
// merge src record into dest
void Record_Merge
(
Record a,
const Record b
restrict Record dest, // dest record
const restrict Record src // src record
) {
ASSERT(a->owner == b->owner);
uint len = Record_length(a);
ASSERT(src->owner == dest->owner);

uint len = Record_length(src);
for(uint i = 0; i < len; i++) {
RecordEntryType a_type = a->entries[i].type;
RecordEntryType b_type = b->entries[i].type;

if(a_type == REC_TYPE_UNKNOWN && b_type != REC_TYPE_UNKNOWN) {
_RecordPropagateEntry(a, b, i, true);
RecordEntryType src_type = src->entries[i].type;
RecordEntryType dest_type = dest->entries[i].type;

if(src_type != REC_TYPE_UNKNOWN && dest_type == REC_TYPE_UNKNOWN) {
// copy entry from src to dest
dest->entries[i] = src->entries[i];

// protect heap allocated values
if(src->entries[i].type == REC_TYPE_SCALAR) {
if(SI_ALLOCATION(&(src->entries[i].value.s)) == M_SELF) {
SIValue_MakeVolatile(&(src->entries[i].value.s));
} else {
SIValue_Persist(&(dest->entries[i].value.s));
}
}

Ensure memory safety during record merging.

+ // Ensure that the destination record has enough space to accommodate the source entries.
+ if (Record_length(dest) < len) {
+     ErrorCtx_RaiseRuntimeException("Destination record does not have enough space.");
+     return;
+ }


Comment on lines +365 to +367
ASSERT(r != NULL);
ASSERT(r->ref_count == 0);


Validate record reference count before freeing to prevent double free errors.

+ if (r->ref_count != 0) {
+     ErrorCtx_RaiseRuntimeException("Attempting to free a record with non-zero reference count.");
+     return;
+ }


Comment on lines +163 to +168
query = """MATCH (a {v: 'a'})
CALL algo.BFS(a, 1, NULL) YIELD nodes as n1
MATCH (b {v: 'd'})
CALL algo.BFS(b, 1, NULL) YIELD nodes as n2
RETURN [n IN n1 | n.v] AS x, [n IN n2 | n.v] AS y
ORDER BY x.v"""

Ensure correct variable naming in the BFS test to improve readability.

-                   CALL algo.BFS(a, 1, NULL) YIELD nodes as n1
-                   MATCH (b {v: 'd'})
-                   CALL algo.BFS(b, 1, NULL) YIELD nodes as n2
-                   RETURN [n IN n1 | n.v] AS x, [n IN n2 | n.v] AS y
-                   ORDER BY x.v
+                   CALL algo.BFS(a, 1, NULL) YIELD nodes AS nodes_a
+                   MATCH (b {v: 'd'})
+                   CALL algo.BFS(b, 1, NULL) YIELD nodes AS nodes_b
+                   RETURN [node IN nodes_a | node.v] AS values_a, [node IN nodes_b | node.v] AS values_b
+                   ORDER BY values_a.v


Comment on lines +22 to +25
// if this comprehension has a filter tree, free it
if(ctx->ft) FilterTree_Free(ctx->ft);
// If this comprehension has an eval routine, free it.

// if this comprehension has an eval routine, free it

Ensure proper memory management when freeing comprehension contexts.

- if(ctx->ft) FilterTree_Free(ctx->ft);
- if(ctx->eval_exp) AR_EXP_Free(ctx->eval_exp);
+ if(ctx->ft) {
+     FilterTree_Free(ctx->ft);
+     ctx->ft = NULL;
+ }
+ if(ctx->eval_exp) {
+     AR_EXP_Free(ctx->eval_exp);
+     ctx->eval_exp = NULL;
+ }


Comment on lines +368 to +369
query = """MATCH x=(a {val: 'v3'})
RETURN [x IN nodes(x) | [elem IN range(1, 2) | size(({val: 'v2'})-[]->(x))]]"""

Clarify the use of variable x in nested list comprehensions to avoid confusion.

-        query = """MATCH x=(a {val: 'v3'})
-                   RETURN [x IN nodes(x) | [elem IN range(1, 2) | size(({val: 'v2'})-[]->(x))]]"""
+        query = """MATCH path=(a {val: 'v3'})
+                   RETURN [node IN nodes(path) | [elem IN range(1, 2) | size(({val: 'v2'})-[]->(node))]]"""


@swilly22 merged commit b7d5719 into master May 13, 2024
17 checks passed
@swilly22 deleted the ref-count-record branch May 13, 2024 16:59
swilly22 added a commit that referenced this pull request May 13, 2024
* 600 add browser usage for the docker image (#601)

* add instructions to run browser UI

* set hostname to 0.0.0.0

* fix crash in rewrite call subquery (#604)

* fix crash in rewrite call subquery

* simplified test query

---------

Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>

* use debug assert in rust (#606)

* fix merge init (#609)

* fix merge init

* remove unused function, simplified branch assignment logic

* clean

---------

Co-authored-by: Roi Lipman <roilipman@gmail.com>

* Unwind persist (#613)

* persist unwind record

* deep clone base record

* Disable jit (#612)

* persist unwind record

* disable GraphBLAS JIT

* Update module api (#617)

* update RediSearch submodule

* bump RediSearch version

* updated redis module api header file

* make sure there's a record to emit (#623)

* update RediSearch (#640)

* Fix #634 Add Cloud link to README.md (#635)

* fix #645 Add objective to README (#646)

* Add objective to README

* Update wordlist.txt

---------

Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>

* fix union validation with call subquery (#648)

* Seek by id runtime optimization (#643)

* wip runtime optimization

* fix leak

* fix node by label scan

* fix leak

* fix leak

* address review

* add tests

* address review

* review

* review

* review

* add memory hook to roaring bitmap

* introduce bitmap range

---------

Co-authored-by: Roi Lipman <roilipman@gmail.com>
Co-authored-by: Roi Lipman <swilly22@users.noreply.github.com>

* zero dim vector save/load (#654)

* update redisearch, do not escape TAG field (#667)

* Code coverage (#674)

* add coverage build

* rename

* fix

* fix cache

* install redis in coverage

* ignore some path in code coverage (#678)

* Ref count record (#663)

* ref count record

* nullify record on deletion

* handle record ref count within execution plan return record

* address PR comments

* remove head & tail calls to persist

* trying to simplify and make value passing safer and simpler

* free aggregated group once yield

* removed Record_PersistScalars

* removed override flag

* remove extra persists

* group no longer holds a keys array

* only persists keys which are not graph entities

* allow for record cloning from different execution plans

---------

Co-authored-by: Dudi <16744955+dudizimber@users.noreply.github.com>
Co-authored-by: Avi Avni <avi.avni@gmail.com>
Co-authored-by: Guy Korland <gkorland@gmail.com>

Successfully merging this pull request may close these issues.

Crash while enforcing unique constraint on string property of nodes.
2 participants