
fix: allow alternative vector db engine to be used #106

Merged
merged 32 commits on Jun 12, 2024

Conversation

borisarzentar
Contributor

@borisarzentar borisarzentar commented Jun 6, 2024

Summary by CodeRabbit

  • New Features

    • Added instructions for using Networkx and Graphistry for visualization.
    • Introduced error handling to display messages during data exploration failures.
  • Bug Fixes

    • Improved error handling for missing Graphistry credentials in the API.
  • Style

    • Added a new CSS variable for textarea default color.
  • Documentation

    • Updated quickstart.md with new visualization instructions and search query parameters.
  • Chores

    • Updated dependencies in pyproject.toml.

@borisarzentar borisarzentar self-assigned this Jun 6, 2024
Contributor

coderabbitai bot commented Jun 6, 2024

Warning

Review failed

The pull request is closed.

Walkthrough

The recent changes enhance the cognee application by refining search queries, improving error handling, and optimizing UI components. Key updates include modifying the cognee.search method to focus on NLP queries, adding visualization instructions in the documentation, and improving CSS styling. Additionally, error handling in various components has been bolstered, and the pyproject.toml dependencies have been updated.

Changes

Files/Paths Change Summary
README.md, docs/quickstart.md Updated search query to focus on NLP and added visualization instructions using Networkx and Graphistry.
cognee-frontend/src/app/globals.css Added new CSS variable --textarea-default-color for default text area color styling.
cognee-frontend/src/app/wizard/CognifyStep/... Optimized cognifyPromise handling using useRef and conditional logic. Removed redundant imports.
cognee-frontend/src/modules/exploration/... Enhanced error handling in getExplorationGraphUrl function to manage non-200 status responses.
cognee-frontend/src/ui/Partials/Explorer/... Introduced error handling logic for data exploration and displayed error messages in the UI.
cognee/api/client.py Added error handling for missing Graphistry credentials and updated paths for vector_db_url and graph_file_path.
cognee/api/v1/config/config.py Updated vector database path setting logic based on vector engine provider.
cognee/infrastructure/databases/graph/... Added new attributes to GraphConfig and updated to_dict method.
cognee/tests/test_neo4j.py Added logic to set up Neo4j graph database, add data, and run cognitive operations using cognee.
pyproject.toml Updated dlt version to "0.4.12" and changed dependency to langchain-text-splitters version "^0.2.1".

🐰✨

In the code's enchanted grove,
Queries now for NLP rove.
Errors caught with gentle care,
Styles refined, no glitch to spare.
Graphs and texts, a seamless blend,
Our journey's magic knows no end.

🌟🔍



@Vasilije1990 Vasilije1990 marked this pull request as ready for review June 9, 2024 17:54
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

Outside diff range and nitpick comments (5)
cognee-frontend/src/app/page.tsx (1)

Line range hint 7-7: Consider renaming the DataView variable to avoid shadowing the global name.

- import DataView, { Data } from '@/modules/ingestion/DataView';
+ import CustomDataView, { Data } from '@/modules/ingestion/DataView';
evals/simple_rag_vs_cognee_eval.py (2)

Line range hint 76-76: Remove unused variable to clean up the code.

-    graph = await cognify("initial_test")

This variable is assigned but never used, which could lead to confusion and unnecessary memory usage.

Tools
Ruff

83-83: Module level import not at top of file (E402)


84-84: Module level import not at top of file (E402)


85-85: Module level import not at top of file (E402)


Line range hint 10-10: Consider reorganizing imports to improve readability and maintainability.

It's a common Python best practice to place all module-level imports at the top of the file unless there's a specific reason (like avoiding circular dependencies). This helps in understanding dependencies of the module at a glance.

Also applies to: 43-43, 83-83, 84-84, 85-85

Tools
Ruff

83-83: Module level import not at top of file (E402)


84-84: Module level import not at top of file (E402)


85-85: Module level import not at top of file (E402)

cognee/modules/cognify/graph/add_cognitive_layer_graphs.py (1)

Line range hint 21-146: Ensure consistent error handling and logging.

While the integration of the vector engine is well executed, consider adding more robust error handling around the vector engine's operations. This could include logging specific errors or retry mechanisms in case of transient failures.
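The kind of resilience the comment asks for can be sketched with a generic retry helper. This is an illustrative pattern only — `with_retries` is a hypothetical name, not part of cognee's vector engine API:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

async def with_retries(operation, max_attempts=3, base_delay=0.5):
    # Retry an async callable with exponential backoff; re-raise once
    # the attempt budget is exhausted so callers still see the error.
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except Exception as error:
            if attempt == max_attempts:
                logger.error("giving up after %d attempts: %s", attempt, error)
                raise
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning("attempt %d failed (%s), retrying in %.2fs", attempt, error, delay)
            await asyncio.sleep(delay)
```

A vector engine call such as a collection search could then be wrapped as `await with_retries(lambda: engine.search(...))`, keeping transient failures out of the happy path.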

cognee/api/v1/cognify/cognify.py (1)

Line range hint 212-212: Avoid using bare except statements.

Using a bare except can catch unexpected exceptions and obscure underlying issues. Specify the exception types to improve error handling and maintainability.

-    except:
+    except SpecificExceptionType:
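As a self-contained sketch of the pattern being recommended (hypothetical function; not code from this PR), catching only the exception types a call is documented to raise keeps unrelated bugs visible:

```python
import json
import logging

logger = logging.getLogger(__name__)

def parse_payload(raw: str) -> dict:
    # Catch only the failures json.loads is documented to raise
    # (JSONDecodeError is a ValueError subclass); anything else
    # propagates and stays visible during debugging.
    try:
        return json.loads(raw)
    except (ValueError, TypeError) as error:
        logger.warning("Could not parse payload: %s", error)
        return {}
```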
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 472d1a0 and f79631d.

Files selected for processing (23)
  • cognee-frontend/src/app/page.tsx (1 hunks)
  • cognee/api/client.py (3 hunks)
  • cognee/api/v1/cognify/cognify.py (1 hunks)
  • cognee/api/v1/config/config.py (1 hunks)
  • cognee/infrastructure/databases/vector/init.py (1 hunks)
  • cognee/infrastructure/databases/vector/config.py (1 hunks)
  • cognee/infrastructure/databases/vector/create_vector_engine.py (1 hunks)
  • cognee/infrastructure/databases/vector/embeddings/FastembedEmbeddingEngine.py (1 hunks)
  • cognee/infrastructure/databases/vector/embeddings/LiteLLMEmbeddingEngine.py (1 hunks)
  • cognee/infrastructure/databases/vector/embeddings/init.py (1 hunks)
  • cognee/infrastructure/databases/vector/embeddings/config.py (1 hunks)
  • cognee/infrastructure/databases/vector/embeddings/get_embedding_engine.py (1 hunks)
  • cognee/infrastructure/databases/vector/get_vector_engine.py (1 hunks)
  • cognee/modules/cognify/graph/add_cognitive_layer_graphs.py (4 hunks)
  • cognee/modules/cognify/graph/add_data_chunks.py (6 hunks)
  • cognee/modules/cognify/graph/add_label_nodes.py (2 hunks)
  • cognee/modules/cognify/llm/resolve_cross_graph_references.py (2 hunks)
  • cognee/modules/data/deletion/prune_system.py (2 hunks)
  • cognee/modules/search/vector/search_similarity.py (2 hunks)
  • cognee/modules/settings/get_settings.py (1 hunks)
  • cognee/modules/settings/save_vector_db_config.py (1 hunks)
  • cognee/modules/tasks/get_task_status.py (2 hunks)
  • evals/simple_rag_vs_cognee_eval.py (1 hunks)
Files skipped from review due to trivial changes (3)
  • cognee/infrastructure/databases/vector/create_vector_engine.py
  • cognee/infrastructure/databases/vector/embeddings/init.py
  • cognee/modules/settings/save_vector_db_config.py
Additional context used
Biome
cognee-frontend/src/app/page.tsx

[error] 7-7: Do not shadow the global "DataView" property. (lint/suspicious/noShadowRestrictedNames)

Consider renaming this variable. It's easy to confuse the origin of variables when they're named after a known global.

Ruff
evals/simple_rag_vs_cognee_eval.py

10-10: Module level import not at top of file (E402)


43-43: Module level import not at top of file (E402)


76-76: Local variable graph is assigned to but never used (F841)

Remove assignment to unused variable graph


83-83: Module level import not at top of file (E402)


84-84: Module level import not at top of file (E402)


85-85: Module level import not at top of file (E402)

cognee/api/client.py

85-85: Do not use bare except (E722)

cognee/api/v1/cognify/cognify.py

212-212: Do not use bare except (E722)

Additional comments not posted (14)
cognee/infrastructure/databases/vector/get_vector_engine.py (1)

5-6: LGTM! The function get_vector_engine effectively combines vector DB configurations and embedding engines.

cognee/infrastructure/databases/vector/__init__.py (1)

6-6: The import of get_vector_engine is appropriate and aligns with the refactoring goals to centralize vector engine access.

cognee/modules/data/deletion/prune_system.py (1)

12-13: The integration of get_vector_engine in the prune_system function is correctly implemented, ensuring that vector-related deletions are handled through the centralized engine access.

cognee/modules/tasks/get_task_status.py (2)

19-19: Refactor to improve readability of dictionary comprehension.

- return { dataset["data_id"]: dataset["status"] for dataset in datasets_statuses }
+ return {dataset["data_id"]: dataset["status"] for dataset in datasets_statuses}

Likely invalid or redundant comment.


9-9: Ensure the SQL query is optimized for performance.

cognee/infrastructure/databases/vector/embeddings/config.py (1)

9-9: Clarify the reason for commenting out DefaultEmbeddingEngine.

cognee/modules/cognify/llm/resolve_cross_graph_references.py (1)

19-19: LGTM! Ensure that get_vector_engine is properly integrated and tested across the system.

cognee/modules/cognify/graph/add_label_nodes.py (1)

8-8: LGTM! Ensure that get_vector_engine is properly integrated and tested across the system.

cognee/modules/settings/get_settings.py (1)

43-43: Enhanced API key masking logic for improved security.

Verification successful

The enhanced API key masking logic has been successfully verified. The code correctly masks the last 10 characters of the API key.

  • cognee/modules/settings/get_settings.py: The API key is masked as expected.
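The verified rule can be written as a small helper for illustration (hypothetical function name; the repository inlines the expression rather than defining a function):

```python
def mask_api_key(api_key: str) -> str:
    # Mirror of the described rule: drop the last 10 characters
    # and replace them with asterisks so the key tail is never shown.
    if len(api_key) <= 10:
        return "*" * len(api_key)  # too short to keep any visible prefix
    return api_key[:-10] + "*" * 10
```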
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the API key masking logic by checking if the last 10 characters are masked.

# Test: Search for API key usage in the codebase. Expect: All API keys should be masked.
rg --type python $'llm_api_key[:-10] + "**********"'

Length of output: 196



Script:

#!/bin/bash
# Description: Verify the API key masking logic by checking if the last 10 characters are masked.

# Test: Search for API key usage in the codebase. Expect: All API keys should be masked.
rg --type python 'llm_api_key\[:-10\] \+ "**********"'

Length of output: 87



Script:

#!/bin/bash
# Description: Verify the API key masking logic by checking if the last 10 characters are masked.

# Test: Search for API key usage in the codebase. Expect: All API keys should be masked.
rg 'llm_api_key\[:-10\] \+ "**********"'

Length of output: 187

cognee/modules/cognify/graph/add_data_chunks.py (1)

12-12: Refactored to use get_vector_engine for obtaining vector engine instances.

Also applies to: 54-54

Verification successful

The integration of get_vector_engine appears consistent across multiple modules in the codebase.

  • evals/simple_rag_vs_cognee_eval.py
  • cognee/modules/search/vector/search_similarity.py
  • cognee/modules/data/deletion/prune_system.py
  • cognee/modules/cognify/graph/add_cognitive_layer_graphs.py
  • cognee/modules/cognify/graph/add_label_nodes.py
  • cognee/modules/cognify/graph/add_data_chunks.py
  • cognee/modules/cognify/llm/resolve_cross_graph_references.py
  • cognee/infrastructure/databases/vector/get_vector_engine.py
  • cognee/infrastructure/databases/vector/__init__.py

Refactored to use get_vector_engine for obtaining vector engine instances.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the integration of `get_vector_engine` in the system.

# Test: Search for `get_vector_engine` usage in the codebase. Expect: Consistent usage across all modules.
rg --type python $'get_vector_engine'

Length of output: 67



Script:

#!/bin/bash
# Description: Verify the integration of `get_vector_engine` in the system.

# Test: Search for `get_vector_engine` usage in the codebase. Expect: Consistent usage across all modules.
rg --type py 'get_vector_engine'

Length of output: 1801

cognee/api/v1/config/config.py (1)

20-21: Configured specific URL for lancedb vector engine provider.

Verification successful

The verification of the vector_db_url setting logic shows that it is consistently used across multiple files in the codebase. The URL is being set correctly for the lancedb provider and is utilized in various configurations and settings.

  • cognee/api/v1/config/config.py
  • cognee/api/client.py
  • cognee/modules/settings/get_settings.py
  • cognee/infrastructure/databases/vector/config.py
  • cognee/modules/settings/save_vector_db_config.py
  • cognee/infrastructure/databases/vector/create_vector_engine.py

The logic for setting and using vector_db_url appears to be correctly implemented.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the URL setting logic for `lancedb`.

# Test: Search for `vector_db_url` usage in the codebase. Expect: Correct URL format and usage.
rg --type python $'vector_db_url'

Length of output: 63



Script:

#!/bin/bash
# Description: Verify the URL setting logic for `lancedb`.

# Test: Search for `vector_db_url` usage in the codebase. Expect: Correct URL format and usage.
rg --type py 'vector_db_url'

Length of output: 1324



Script:

#!/bin/bash
# Description: Extract relevant code snippets for `vector_db_url` usage.

# Extract lines around the usage of `vector_db_url` in the identified files.
rg --type py --context 5 'vector_db_url' cognee/infrastructure/databases/vector/create_vector_engine.py cognee/infrastructure/databases/vector/config.py cognee/modules/settings/get_settings.py cognee/modules/settings/save_vector_db_config.py cognee/api/client.py cognee/api/v1/config/config.py

Length of output: 8616

cognee-frontend/src/app/page.tsx (1)

115-115: Added nowrap attribute to improve notification text display.

cognee/modules/cognify/graph/add_cognitive_layer_graphs.py (1)

5-5: Good use of imports to support the new vector engine functionality.

The inclusion of get_vector_engine from the vector module aligns with the changes across the project to centralize vector engine access, enhancing modularity and maintainability.

Also applies to: 7-7

cognee/api/client.py (1)

273-273: Ensure the new vector database URL configuration is robust.

This change is crucial for the system's operation. Verify that it's consistently used across all relevant parts of the application.

Verification successful

The search results indicate that the vector_db_url configuration is used in multiple files across the system. This suggests that the new vector database URL configuration is indeed applied consistently.

  • cognee/api/client.py
  • cognee/api/v1/config/config.py
  • cognee/infrastructure/databases/vector/config.py
  • cognee/infrastructure/databases/vector/create_vector_engine.py
  • cognee/modules/settings/save_vector_db_config.py
  • cognee/modules/settings/get_settings.py

These references confirm that the configuration change is propagated throughout the relevant parts of the application.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the vector database URL configuration.
# Test: Check if the URL configuration is applied correctly across the system.
rg --type python 'vector_db_url'

Length of output: 63



Script:

#!/bin/bash
# Description: Verify the vector database URL configuration.
# Test: Check if the URL configuration is applied correctly across the system.
rg --type py 'vector_db_url'

Length of output: 1324

Comment on lines +5 to +7
def get_embedding_engine() -> EmbeddingEngine:
    llm_config = get_llm_config()
    return LiteLLMEmbeddingEngine(api_key = llm_config.llm_api_key)

Consider enhancing get_embedding_engine to support multiple types of embedding engines based on configuration, improving flexibility.

def get_embedding_engine() -> EmbeddingEngine:
    llm_config = get_llm_config()
    if llm_config.engine_type == 'LiteLLM':
        return LiteLLMEmbeddingEngine(api_key=llm_config.llm_api_key)
    # Add other conditions for different engines as needed

Comment on lines +7 to +10
vector_db_url: str = os.path.join(
    os.path.join(get_absolute_path(".cognee_system"), "databases"),
    "cognee.lancedb"
)

Simplify nested os.path.join calls.

- vector_db_url: str = os.path.join(
-     os.path.join(get_absolute_path(".cognee_system"), "databases"),
-     "cognee.lancedb"
- )
+ vector_db_url: str = os.path.join(get_absolute_path(".cognee_system"), "databases", "cognee.lancedb")

Comment on lines +9 to +39
class LiteLLMEmbeddingEngine(EmbeddingEngine):
    api_key: str
    embedding_model: str
    embedding_dimensions: int

    def __init__(
        self,
        embedding_model: Optional[str] = "text-embedding-3-large",
        embedding_dimensions: Optional[int] = 3072,
        api_key: str = None,
    ):
        self.api_key = api_key
        self.embedding_model = embedding_model
        self.embedding_dimensions = embedding_dimensions

    async def embed_text(self, text: List[str]) -> List[List[float]]:
        async def get_embedding(text_):
            response = await aembedding(
                self.embedding_model,
                input = text_,
                api_key = self.api_key
            )

            return response.data[0]["embedding"]

        tasks = [get_embedding(text_) for text_ in text]
        result = await asyncio.gather(*tasks)
        return result

    def get_vector_size(self) -> int:
        return self.embedding_dimensions

Great addition with LiteLLMEmbeddingEngine. Ensure proper error handling in asynchronous operations and consider adding logging for better traceability.

+ import logging
+ logger = logging.getLogger(__name__)

  async def embed_text(self, text: List[str]) -> List[List[float]]:
      async def get_embedding(text_):
          try:
              response = await aembedding(
                  self.embedding_model,
                  input = text_,
                  api_key = self.api_key
              )
+             logger.info(f"Embedding successful for text: {text_}")
              return response.data[0]["embedding"]
+         except Exception as e:
+             logger.error(f"Failed to embed text: {text_}, Error: {str(e)}")
+             raise

Comment on lines +85 to +95
from cognee.infrastructure.databases.vector import get_vector_engine

async def cognify_search_base_rag(content:str, context:str):
    base_config = get_base_config()

    cognee_directory_path = os.path.abspath(".cognee_system")
    base_config.system_root_directory = cognee_directory_path

-   vector_config = get_vectordb_config()
-   vector_client = vector_config.vector_engine
+   vector_engine = get_vector_engine()

-   return_ = await vector_client.search(collection_name="basic_rag", query_text=content, limit=10)
+   return_ = await vector_engine.search(collection_name="basic_rag", query_text=content, limit=10)

Refactor to use the new vector engine and ensure proper error handling.

The change to use get_vector_engine() instead of a direct vector client is a good move towards modularity. However, ensure that the error handling is robust, especially when the vector engine might throw exceptions that are not caught here.

Tools
Ruff

85-85: Module level import not at top of file (E402)



def generate_node_id(node_id: str) -> str:
-   return node_id.upper().replace(' ', '_').replace("'", "")
+   return node_id.upper().replace(" ", "_").replace("'", "")

Optimize ID generation for efficiency.

The method generate_node_id could be optimized or simplified if it's a common operation that might impact performance. Consider caching results if IDs are regenerated often with the same inputs.
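The caching idea floated above can be sketched with the standard library; this is an illustrative suggestion, not a change the PR makes:

```python
from functools import lru_cache

@lru_cache(maxsize=4096)
def generate_node_id(node_id: str) -> str:
    # The normalization is a pure function of its input, so repeated
    # calls with the same ID can be served straight from the cache.
    return node_id.upper().replace(" ", "_").replace("'", "")
```

`lru_cache` only helps if the same IDs really do recur often; for one-off inputs it adds memory overhead without saving work.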

@@ -62,9 +59,9 @@ class PayloadSchema(BaseModel):
    ]

    try:
-       await vector_client.create_collection(parent_node_id, payload_schema = PayloadSchema)
+       await vector_engine.create_collection(parent_node_id, payload_schema = PayloadSchema)

Consider handling specific exceptions when creating collections to improve robustness.

- except Exception as e:
+ except CollectionCreationError as e:

Committable suggestion was skipped due to low confidence.

Comment on lines +1 to +25
from typing import List, Optional
from fastembed import TextEmbedding
from cognee.root_dir import get_absolute_path
from cognee.infrastructure.databases.vector.embeddings.EmbeddingEngine import EmbeddingEngine

class FastembedEmbeddingEngine(EmbeddingEngine):
    embedding_model: str
    embedding_dimensions: int

    def __init__(
        self,
        embedding_model: Optional[str] = "BAAI/bge-large-en-v1.5",
        embedding_dimensions: Optional[int] = 1024,
    ):
        self.embedding_model = embedding_model
        self.embedding_dimensions = embedding_dimensions

    async def embed_text(self, text: List[str]) -> List[float]:
        embedding_model = TextEmbedding(model_name = self.embedding_model, cache_dir = get_absolute_path("cache/embeddings"))
        embeddings_list = list(map(lambda embedding: embedding.tolist(), embedding_model.embed(text)))

        return embeddings_list

    def get_vector_size(self) -> int:
        return self.embedding_dimensions

Ensure proper error handling in embed_text method.

- embeddings_list = list(map(lambda embedding: embedding.tolist(), embedding_model.embed(text)))
+ try:
+     embeddings_list = list(map(lambda embedding: embedding.tolist(), embedding_model.embed(text)))
+ except Exception as e:
+     # Handle specific exceptions related to embedding failures
+     raise e

Committable suggestion was skipped due to low confidence.


Tip

Codebase Verification

The search results indicate that the optional parameters embedding_model and embedding_dimensions are used directly without any additional validation for None values. However, since default values are provided in the constructor, these parameters will not be None unless explicitly set to None by the caller.

Given this context, the default values should suffice for typical usage, but it might be prudent to add validation to handle cases where these parameters might be explicitly set to None.

  • File: cognee/infrastructure/databases/vector/embeddings/FastembedEmbeddingEngine.py
  • Lines: 1-25

Consider adding validation checks in the constructor to handle cases where embedding_model or embedding_dimensions might be explicitly set to None.
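The suggested guard could look like the sketch below (a standalone illustration with a hypothetical class name, using the default values quoted in the snippet above; fastembed itself is not imported here):

```python
from typing import Optional

class FastembedEmbeddingEngineSketch:
    """Sketch: fall back to defaults when optional params are passed as None."""

    def __init__(
        self,
        embedding_model: Optional[str] = "BAAI/bge-large-en-v1.5",
        embedding_dimensions: Optional[int] = 1024,
    ):
        # A caller can defeat Python default values by passing None
        # explicitly, so restore the defaults in that case.
        if embedding_model is None:
            embedding_model = "BAAI/bge-large-en-v1.5"
        if embedding_dimensions is None:
            embedding_dimensions = 1024
        self.embedding_model = embedding_model
        self.embedding_dimensions = embedding_dimensions
```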

Analysis chain

Validate the optional parameters in the constructor.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the handling of optional parameters in `FastembedEmbeddingEngine`.

# Test: Check if there are any checks for `None` values for optional parameters.
rg --type python $'if embedding_model is None' --context 5 src/

Length of output: 95



Script:

#!/bin/bash
# Description: Verify the handling of optional parameters in `FastembedEmbeddingEngine`.

# Test: Check if there are any checks for `None` values for optional parameters.
rg --type py $'if embedding_model is None' --context 5

Length of output: 55



Script:

#!/bin/bash
# Description: Verify the usage and handling of `embedding_model` and `embedding_dimensions` in the codebase.

# Test: Search for the usage of `embedding_model` and `embedding_dimensions` in the codebase.
rg 'embedding_model' --type py --context 5
rg 'embedding_dimensions' --type py --context 5

Length of output: 16036

Comment on lines 75 to 89
    try:
        graph_config = get_graph_config()
        graph_engine = graph_config.graph_engine
        graph_client = await get_graph_client(graph_engine)
        graph_url = await render_graph(graph_client.graph)

        return JSONResponse(
            status_code = 200,
            content = str(graph_url),
        )
    except:
        return JSONResponse(
            status_code = 409,
            content = "Graphistry credentials are not set. Please set them in your .env file.",
        )

Improve error handling by specifying exception types.

Using a bare except is generally discouraged as it can catch unexpected exceptions and make debugging more difficult. Specify the exception types to improve clarity and maintainability.

-    except:
+    except SpecificExceptionType:

Committable suggestion was skipped due to low confidence.

Tools
Ruff

85-85: Do not use bare except (E722)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Outside diff range and nitpick comments (2)
cognee/infrastructure/data/chunking/LangchainChunkingEngine.py (1)

Line range hint 9-66: Consider moving the imports for langchain_text_splitters to the top of the file to improve performance and maintainability.

+ from langchain_text_splitters import (
+     Language,
+     RecursiveCharacterTextSplitter,
+ )
-        from langchain_text_splitters import (
-            Language,
-            RecursiveCharacterTextSplitter,
-        )

The LangchainChunkEngine class is well-designed, with clear methods for chunking data based on different strategies.

cognee/api/v1/cognify/cognify.py (1)

Line range hint 216-216: Avoid using a bare except statement.

Using a bare except can catch unexpected exceptions and obscure underlying bugs. Specify exception types to improve error handling clarity and maintainability.

- except:
+ except ExpectedExceptionType:
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between f79631d and 11231b7.

Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
Files selected for processing (10)
  • cognee/api/v1/cognify/cognify.py (5 hunks)
  • cognee/api/v1/config/config.py (2 hunks)
  • cognee/infrastructure/data/chunking/DefaultChunkEngine.py (5 hunks)
  • cognee/infrastructure/data/chunking/HaystackChunkEngine.py (1 hunks)
  • cognee/infrastructure/data/chunking/LangchainChunkingEngine.py (3 hunks)
  • cognee/infrastructure/data/chunking/config.py (1 hunks)
  • cognee/infrastructure/data/chunking/create_chunking_engine.py (1 hunks)
  • cognee/infrastructure/data/chunking/get_chunking_engine.py (1 hunks)
  • cognee/modules/ingestion/save_data_to_file.py (2 hunks)
  • pyproject.toml (2 hunks)
Files skipped from review as they are similar to previous changes (1)
  • cognee/api/v1/config/config.py
Additional context used
Ruff
cognee/api/v1/cognify/cognify.py

216-216: Do not use bare except (E722)


288-288: Local variable graph is assigned to but never used (F841)

Remove assignment to unused variable graph

Additional comments not posted (11)
cognee/infrastructure/data/chunking/get_chunking_engine.py (1)

5-6: LGTM! Consider adding error handling for configuration retrieval and engine creation.

cognee/infrastructure/data/chunking/HaystackChunkEngine.py (1)

2-6: Constructor implementation looks good. Consider adding docstrings to explain the parameters and their default values.

cognee/infrastructure/data/chunking/config.py (1)

10-12: Change in chunk_overlap value looks good. Ensure that this new value is appropriately handled in all relevant parts of the application.

Verification successful

The new chunk_overlap value is indeed used in various parts of the codebase, including configuration settings and chunking engine implementations. This confirms that the new value is being handled appropriately.

  • Files and lines where chunk_overlap is used:
    • cognee/api/v1/cognify/cognify.py: Line 1
    • cognee/api/v1/config/config.py: Lines 1-2
    • cognee/infrastructure/data/chunking/create_chunking_engine.py: Lines 1-3
    • cognee/infrastructure/data/chunking/DefaultChunkEngine.py: Multiple lines
    • cognee/infrastructure/data/chunking/config.py: Line 1
    • cognee/infrastructure/data/chunking/LangchainChunkingEngine.py: Multiple lines
    • cognee/infrastructure/data/chunking/HaystackChunkEngine.py: Line 1
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify that the new `chunk_overlap` value is handled correctly.

# Test: Search for usage of `chunk_overlap`. Expect: Proper handling of the new value.
rg --type python $'chunk_overlap'

Length of output: 63



Script:

#!/bin/bash
# Description: Verify that the new `chunk_overlap` value is handled correctly.

# Test: Search for usage of `chunk_overlap`. Expect: Proper handling of the new value.
rg 'chunk_overlap'

Length of output: 4470

cognee/modules/ingestion/save_data_to_file.py (1)

18-21: Implementation of random file name generation looks good. Consider adding a comment explaining why a random file name is generated when the 'name' is missing or None.
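The fallback could be documented roughly like this — a sketch, not cognee's actual naming scheme; the `text_<random>.txt` prefix and extension are assumptions:

```python
import random
import string

def ensure_file_name(name=None):
    # When the caller supplies no name (or None), generate a random one so
    # the data can still be written to disk. The "text_<random>.txt" scheme
    # here is illustrative, not necessarily what save_data_to_file uses.
    if not name:
        suffix = "".join(random.choices(string.ascii_lowercase + string.digits, k=8))
        name = f"text_{suffix}.txt"
    return name
```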

cognee/infrastructure/data/chunking/create_chunking_engine.py (1)

6-9: The ChunkingConfig class is well-defined and aligns with the need to store configuration settings for different chunking engines.

pyproject.toml (1)

41-41: The updates to dependencies are appropriate for the new functionalities introduced in the PR.

Also applies to: 75-75

cognee/infrastructure/data/chunking/DefaultChunkEngine.py (1)

Line range hint 8-94: The DefaultChunkEngine class is well-designed, with clear methods for chunking data based on different strategies.

cognee/api/v1/cognify/cognify.py (4)

9-12: Imports added for chunking and embedding engines.

These imports align with the PR's objective to enhance the system's flexibility in handling different data chunking and embedding strategies.


30-30: Import added for KnowledgeGraph and ChunkStrategy.

This import is necessary for the new functionalities introduced in the system, ensuring that the types and strategies are correctly utilized in the processing functions.


74-79: Enhanced error handling in the cognify function.

The addition of a try-except block around the cognify call for each dataset is a robust way to handle potential errors during dataset processing, ensuring that the system can gracefully handle failures and log them appropriately.


101-101: Initialization of chunk_engine using the get_chunk_engine function.

This change supports the dynamic selection of chunking engines based on configuration, which is a flexible design choice that aligns with the system's extensibility goals.

Comment on lines 11 to 36
def create_chunking_engine(config: ChunkingConfig):
if config["chunk_engine"] == "langchainchunkengine":

return LangchainChunkEngine(
chunk_size=config["chunk_size"],
chunk_overlap=config["chunk_overlap"],
chunk_strategy=config["chunk_strategy"],


)
elif config["chunk_engine"] == "defaultchunkengine":
from .DefaultChunkEngine import DefaultChunkEngine

return DefaultChunkEngine(
chunk_size=config["chunk_size"],
chunk_overlap=config["chunk_overlap"],
chunk_strategy=config["chunk_strategy"],
)
elif config["chunk_engine"] == "haystackchunkengine":
from .HaystackChunkEngine import HaystackChunkEngine

return HaystackChunkEngine(
chunk_size=config["chunk_size"],
chunk_overlap=config["chunk_overlap"],
chunk_strategy=config["chunk_strategy"],
)
Contributor

Consider moving the import statements to the top of the file to improve performance and maintainability.

+ from .DefaultChunkEngine import DefaultChunkEngine
+ from .HaystackChunkEngine import HaystackChunkEngine
-        from .DefaultChunkEngine import DefaultChunkEngine
-        from .HaystackChunkEngine import HaystackChunkEngine

The logic for creating chunking engines based on configuration is clear and modular.

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def create_chunking_engine(config: ChunkingConfig):
if config["chunk_engine"] == "langchainchunkengine":
return LangchainChunkEngine(
chunk_size=config["chunk_size"],
chunk_overlap=config["chunk_overlap"],
chunk_strategy=config["chunk_strategy"],
)
elif config["chunk_engine"] == "defaultchunkengine":
from .DefaultChunkEngine import DefaultChunkEngine
return DefaultChunkEngine(
chunk_size=config["chunk_size"],
chunk_overlap=config["chunk_overlap"],
chunk_strategy=config["chunk_strategy"],
)
elif config["chunk_engine"] == "haystackchunkengine":
from .HaystackChunkEngine import HaystackChunkEngine
return HaystackChunkEngine(
chunk_size=config["chunk_size"],
chunk_overlap=config["chunk_overlap"],
chunk_strategy=config["chunk_strategy"],
)
from .DefaultChunkEngine import DefaultChunkEngine
from .HaystackChunkEngine import HaystackChunkEngine
def create_chunking_engine(config: ChunkingConfig):
if config["chunk_engine"] == "langchainchunkengine":
return LangchainChunkEngine(
chunk_size=config["chunk_size"],
chunk_overlap=config["chunk_overlap"],
chunk_strategy=config["chunk_strategy"],
)
elif config["chunk_engine"] == "defaultchunkengine":
return DefaultChunkEngine(
chunk_size=config["chunk_size"],
chunk_overlap=config["chunk_overlap"],
chunk_strategy=config["chunk_strategy"],
)
elif config["chunk_engine"] == "haystackchunkengine":
return HaystackChunkEngine(
chunk_size=config["chunk_size"],
chunk_overlap=config["chunk_overlap"],
chunk_strategy=config["chunk_strategy"],
)
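One way to keep the imports at the top while making the dispatch readable is a registry dict. The engine classes below are simplified stand-ins for cognee's real chunk engines, which take more setup:

```python
# Dispatch-table variant of the factory above (illustrative stand-ins
# for cognee's DefaultChunkEngine / LangchainChunkEngine / HaystackChunkEngine).
class DefaultChunkEngine:
    def __init__(self, chunk_size, chunk_overlap, chunk_strategy):
        self.chunk_size = chunk_size
        self.chunk_overlap = chunk_overlap
        self.chunk_strategy = chunk_strategy

class LangchainChunkEngine(DefaultChunkEngine):
    pass

class HaystackChunkEngine(DefaultChunkEngine):
    pass

CHUNK_ENGINES = {
    "defaultchunkengine": DefaultChunkEngine,
    "langchainchunkengine": LangchainChunkEngine,
    "haystackchunkengine": HaystackChunkEngine,
}

def create_chunking_engine(config):
    try:
        engine_cls = CHUNK_ENGINES[config["chunk_engine"]]
    except KeyError:
        raise ValueError(f"Unknown chunk engine: {config['chunk_engine']!r}")
    return engine_cls(
        chunk_size=config["chunk_size"],
        chunk_overlap=config["chunk_overlap"],
        chunk_strategy=config["chunk_strategy"],
    )
```

An unknown engine name then fails loudly with `ValueError`, where the if/elif chain would silently return `None`.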

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Outside diff range and nitpick comments (1)
cognee/api/v1/cognify/cognify.py (1)

Line range hint 216-216: Avoid using a bare except statement.

Using a bare except can catch unexpected exceptions and obscure bugs. Specify the exception type to improve error handling.

- except:
+ except ExpectedExceptionType:
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 11231b7 and a23fc40.

Files selected for processing (4)
  • cognee/api/v1/cognify/cognify.py (5 hunks)
  • cognee/infrastructure/data/chunking/config.py (2 hunks)
  • cognee/infrastructure/data/chunking/create_chunking_engine.py (1 hunks)
  • cognee/shared/data_models.py (1 hunks)
Files skipped from review due to trivial changes (1)
  • cognee/shared/data_models.py
Files skipped from review as they are similar to previous changes (2)
  • cognee/infrastructure/data/chunking/config.py
  • cognee/infrastructure/data/chunking/create_chunking_engine.py
Additional context used
Ruff
cognee/api/v1/cognify/cognify.py

216-216: Do not use bare except (E722)


288-288: Local variable graph is assigned to but never used (F841)

Remove assignment to unused variable graph

Additional comments not posted (3)
cognee/api/v1/cognify/cognify.py (3)

9-10: Imports from LangchainChunkingEngine and get_chunk_engine have been added.

These imports align with the PR's objective to enhance chunking engine functionality. Ensure that these modules are properly tested.


12-12: Import from LiteLLMEmbeddingEngine has been added.

This import supports the integration of new embedding engines, which is consistent with the PR objectives. Verify that LiteLLMEmbeddingEngine is implemented according to specifications.


30-30: Import of KnowledgeGraph, ChunkStrategy, and ChunkEngine from cognee.shared.data_models.

This import is necessary for the newly introduced chunking and graph functionalities. Ensure that these data models are used appropriately throughout the code.

@@ -1,28 +1,195 @@
import os
import glob
# import csv
Contributor Author

Let's delete commented code if we don't need it.

Contributor

Done

async with aiofiles.open(file_path, mode='r') as f:
data = await f.read()
return json.loads(data)
elif file_path.endswith('.csv'):
Contributor Author

Does it make sense to have topology in csv format?

Contributor

WhyHow does seem to support it, and people tend to use it. As for structured data, most of it will be Excel.

from cognee.base_config import get_base_config

# Define models
class RelationshipModel(BaseModel):
Contributor Author

Why do we have RelationshipModel and Relationship below? Both are pydantic models with similar fields.

Contributor

They are different models. One is for the GitHub repo logic, the other is for the base CSV/JSON case.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 8

Outside diff range and nitpick comments (2)
cognee/modules/cognify/graph/add_node_connections.py (1)

Line range hint 61-61: Remove the unused variable result.

- result = await graph.query(f"""MATCH (a), (b) WHERE a.unique_id = '{relationship['searched_node_id']}' AND b.unique_id = '{relationship['original_id_for_search']}'
-           CREATE (a)-[:SEMANTIC_CONNECTION {{weight:{relationship['score']}}}]->(b)""")
- await graph.close()
+ await graph.query(f"""MATCH (a), (b) WHERE a.unique_id = '{relationship['searched_node_id']}' AND b.unique_id = '{relationship['original_id_for_search']}'
+           CREATE (a)-[:SEMANTIC_CONNECTION {{weight:{relationship['score']}}}]->(b)""")
+ await graph.close()
cognee/api/v1/cognify/cognify.py (1)

Line range hint 218-218: Avoid using a bare except statement. Specify the exception type to improve clarity and maintainability.

-    except:
+    except SpecificExceptionType:
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between a23fc40 and 4f76c46.

Files selected for processing (13)
  • .gitignore (1 hunks)
  • cognee/api/client.py (3 hunks)
  • cognee/api/v1/cognify/cognify.py (8 hunks)
  • cognee/infrastructure/databases/graph/config.py (1 hunks)
  • cognee/infrastructure/databases/graph/get_graph_client.py (2 hunks)
  • cognee/modules/cognify/graph/add_node_connections.py (1 hunks)
  • cognee/modules/data/deletion/prune_system.py (1 hunks)
  • cognee/modules/search/vector/search_similarity.py (2 hunks)
  • cognee/modules/topology/example_data.json (1 hunks)
  • cognee/modules/topology/infer_data_topology.py (1 hunks)
  • cognee/modules/topology/topology.py (3 hunks)
  • docs/blog/posts/llmops-and-knowledge-graphs.md (1 hunks)
  • docs/blog/rag/rag_explained.md (1 hunks)
Files not summarized due to errors (4)
  • .gitignore: Error: Server error. Please try again later.
  • cognee/modules/topology/topology.py: Error: Server error. Please try again later.
  • docs/blog/rag/rag_explained.md: Error: Server error. Please try again later.
  • cognee/modules/data/deletion/prune_system.py: Error: Server error. Please try again later.
Files skipped from review due to trivial changes (1)
  • cognee/modules/topology/example_data.json
Additional context used
Markdownlint
docs/blog/posts/llmops-and-knowledge-graphs.md

15-15: null (MD018, no-missing-space-atx)
No space after hash on atx style heading


15-15: null (MD047, single-trailing-newline)
Files should end with a single newline character

docs/blog/rag/rag_explained.md

15-15: null (MD018, no-missing-space-atx)
No space after hash on atx style heading


15-15: null (MD047, single-trailing-newline)
Files should end with a single newline character

Ruff
cognee/modules/data/deletion/prune_system.py

7-7: Local variable graph_config is assigned to but never used (F841)

Remove assignment to unused variable graph_config

cognee/infrastructure/databases/graph/get_graph_client.py

22-22: Do not use bare except (E722)


35-35: Do not use bare except (E722)

cognee/modules/cognify/graph/add_node_connections.py

61-61: Local variable result is assigned to but never used (F841)

Remove assignment to unused variable result

cognee/api/client.py

85-85: Do not use bare except (E722)

cognee/modules/topology/topology.py

281-281: Module level import not at top of file (E402)

cognee/api/v1/cognify/cognify.py

218-218: Do not use bare except (E722)


290-290: Local variable graph is assigned to but never used (F841)

Remove assignment to unused variable graph

Additional comments not posted (5)
docs/blog/posts/llmops-and-knowledge-graphs.md (1)

1-13: Metadata looks well-structured and correctly formatted.

docs/blog/rag/rag_explained.md (1)

1-13: Metadata looks well-structured and correctly formatted.

cognee/modules/topology/infer_data_topology.py (1)

Line range hint 1-17: The changes in the function infer_data_topology are consistent and improve clarity.

cognee/infrastructure/databases/graph/config.py (1)

22-22: The addition of graph_topology aligns with the refactoring efforts to enhance clarity in graph-related configurations.

.gitignore (1)

13-16: The additions to the .gitignore file are appropriate for excluding notebooks and evaluation scripts, which aligns with best practices to keep the repository clean.

"""Factory function to get the appropriate graph client based on the graph type."""
config = get_graph_config()

if graph_type == GraphDBType.NEO4J:
if config.graph_engine == GraphDBType.NEO4J:
Contributor

Avoid using bare except clauses. It's better to handle specific exceptions or at least log the exception to aid in debugging.

-        except:
+        except Exception as e:
+            logger.error(f"Failed to initialize Neo4jAdapter: {e}")
-        except:
+        except Exception as e:
+            logger.error(f"Failed to initialize FalcorDBAdapter: {e}")

Also applies to: 25-25

Committable suggestion


Suggested change
if config.graph_engine == GraphDBType.NEO4J:
if config.graph_engine == GraphDBType.NEO4J:
try:
return Neo4jAdapter(config)
except Exception as e:
logger.error(f"Failed to initialize Neo4jAdapter: {e}")
elif config.graph_engine == GraphDBType.FALCOR:
try:
return FalcorDBAdapter(config)
except Exception as e:
logger.error(f"Failed to initialize FalcorDBAdapter: {e}")
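The per-backend try/except with a logged, specific failure could be sketched as below. The adapter classes are stand-ins for cognee's `Neo4jAdapter`/`FalcorDBAdapter`, and re-raising after logging is one design choice — swallowing the exception would make the factory return `None` and push the failure downstream:

```python
import logging

logger = logging.getLogger("cognee.graph")

# Stand-in adapters; the real ones open database connections and can fail.
class Neo4jAdapter:
    def __init__(self, config):
        self.config = config

class FalcorDBAdapter:
    def __init__(self, config):
        self.config = config

def get_graph_client(engine, config):
    if engine == "neo4j":
        try:
            return Neo4jAdapter(config)
        except Exception as e:
            # Log with context, then re-raise so callers see the real error.
            logger.error("Failed to initialize Neo4jAdapter: %s", e)
            raise
    elif engine == "falkordb":
        try:
            return FalcorDBAdapter(config)
        except Exception as e:
            logger.error("Failed to initialize FalcorDBAdapter: %s", e)
            raise
    raise ValueError(f"Unsupported graph engine: {engine}")
```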

---

#tbd
Contributor

Correct Markdown formatting issues.

- #tbd
+ # tbd

Ensure the file ends with a single newline character.

Tools
Markdownlint

15-15: null (MD018, no-missing-space-atx)
No space after hash on atx style heading


15-15: null (MD047, single-trailing-newline)
Files should end with a single newline character

for layer_id in unique_layer_uuids:
vector_engine = vector_config.vector_engine
vector_engine = get_vector_engine()
Contributor

As previously suggested, replace debug prints with proper logging mechanisms to maintain production code quality.

- print("results", results)
- print("len_rs", len(results))
+ logger.debug(f"Search results: {results}")
+ logger.debug(f"Results length: {len(results)}")
Committable suggestion


Suggested change
vector_engine = get_vector_engine()
vector_engine = get_vector_engine()
# Assuming the context where these print statements are located
# Replace the following print statements with logger.debug
logger.debug(f"Search results: {results}")
logger.debug(f"Results length: {len(results)}")
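A minimal sketch of the logger-based replacement; the logger name and the search stub are illustrative, not cognee's actual module layout:

```python
import logging

logger = logging.getLogger("cognee.search")

def search_similarity(layer_ids):
    # Stand-in for the real vector search; each layer yields one result here.
    results = [f"result-for-{layer_id}" for layer_id in layer_ids]
    # logger.debug replaces ad-hoc print() calls: the messages show up in
    # development and can be silenced in production via logging config.
    logger.debug("Search results: %s", results)
    logger.debug("Results length: %d", len(results))
    return results
```

Using `%s`-style lazy formatting (rather than f-strings) also avoids building the message when the debug level is disabled.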

---

#tbd
Contributor

Correct Markdown formatting issues.

- #tbd
+ # tbd

Ensure the file ends with a single newline character.

Committable suggestion was skipped due to low confidence.

Tools
Markdownlint

15-15: null (MD018, no-missing-space-atx)
No space after hash on atx style heading


15-15: null (MD047, single-trailing-newline)
Files should end with a single newline character

status_code = 200,
content = str(graph_url),
)
except:
Contributor

Specify the exception type to improve error handling.

-    except:
+    except SpecificExceptionType:

Committable suggestion was skipped due to low confidence.

Tools
Ruff

85-85: Do not use bare except (E722)

raise RuntimeError(f"Failed to add graph topology from {file_path}: {e}")

# Example Main Function:
import asyncio
Contributor

Move the module-level import to the top of the file.

- import asyncio
+ # Move this import to the top of the file

Committable suggestion was skipped due to low confidence.

Tools
Ruff

281-281: Module level import not at top of file (E402)
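The layout Ruff's E402 asks for can be sketched like this — imports first, then the example entry point behind a `__main__` guard; the function bodies are stand-ins for the real `TopologyEngine` calls:

```python
# Module-level imports belong at the top of the file (fixes E402).
import asyncio

async def add_graph_topology(file_path):
    # Stand-in for the real topology loading; illustrative only.
    return f"loaded {file_path}"

async def main():
    return await add_graph_topology("example_data.json")

# The example entry point stays at the bottom, guarded so importing the
# module (e.g. from cognify) does not run it.
if __name__ == "__main__":
    print(asyncio.run(main()))
```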

from cognee.infrastructure.databases.graph.config import get_graph_config
from cognee.infrastructure.databases.vector.config import get_vectordb_config
from cognee.infrastructure.databases.graph.get_graph_client import get_graph_client

async def prune_system(graph = True, vector = True):
if graph:
graph_config = get_graph_config()
Contributor

Remove the unused variable graph_config.

- graph_config = get_graph_config()
Committable suggestion


Suggested change
graph_config = get_graph_config()
Tools
Ruff

7-7: Local variable graph_config is assigned to but never used (F841)

Remove assignment to unused variable graph_config

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 4f76c46 and 9fd542c.

Files selected for processing (2)
  • cognee/modules/topology/topology.py (1 hunks)
  • cognee/modules/topology/topology_data_models.py (1 hunks)
Additional comments not posted (11)
cognee/modules/topology/topology_data_models.py (6)

7-11: The Relationship class is well-defined with clear descriptions for each field.


13-17: The Document class effectively uses the Relationship class to ensure type safety and consistency.


19-25: The DirectoryModel class is well-structured, using lists with default values to avoid null issues.


29-36: The DirMetadata class provides a comprehensive view of a directory's contents, effectively capturing both documents and subdirectories.


38-41: The GitHubRepositoryModel class is clearly structured to represent a repository's metadata and root directory effectively.


50-54: The NodeModel class is flexible in its representation of children, allowing for both dictionary and NodeModel types.

cognee/modules/topology/topology.py (5)

22-32: The flatten_model method is well-documented and effectively uses the NodeModel class to flatten the model to a dictionary.


34-46: The recursive_flatten method is well-documented and appropriately handles nested structures using a recursive approach.


48-63: The load_data method is robust, with comprehensive error handling and support for multiple file formats.


65-82: The add_graph_topology method effectively integrates data loading, flattening, and graph client interactions, with robust error handling.


86-93: The main function serves as a simple and effective entry point for the module, demonstrating the use of the TopologyEngine.
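The flatten pattern the review describes can be sketched as below, with a stand-in dataclass rather than cognee's actual pydantic `NodeModel` — the field names are assumptions:

```python
from dataclasses import dataclass, field
from typing import List

# Stand-in for cognee's NodeModel: an id plus nested children, flattened
# depth-first into rows a graph client could ingest.
@dataclass
class NodeModel:
    node_id: str
    children: List["NodeModel"] = field(default_factory=list)

def flatten_model(node, parent_id=None):
    """Flatten a single node to a dictionary."""
    return {"node_id": node.node_id, "parent_id": parent_id}

def recursive_flatten(node, parent_id=None):
    """Flatten a node and all of its descendants, depth-first."""
    rows = [flatten_model(node, parent_id)]
    for child in node.children:
        rows.extend(recursive_flatten(child, node.node_id))
    return rows
```

Each row records its parent's id, which is what lets the flat list be reassembled into graph edges later.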

@Vasilije1990 Vasilije1990 mentioned this pull request Jun 12, 2024
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Outside diff range and nitpick comments (3)
cognee/modules/topology/explanations/062c22df-d99b-599f-90cd-2d325c8bcf69.txt (2)

3-3: Consider revising "with respect to input size scaling" to a more concise alternative like "relative to input size" to reduce wordiness.

Tools
LanguageTool

[style] ~3-~3: ‘with respect to’ might be wordy. Consider a shorter alternative. (EN_WORDINESS_PREMIUM_WITH_RESPECT_TO)
Context: ...some calculations exponentially faster (with respect to input size scaling) than any modern "cl...


4-4: Consider adding a comma after "quickly" for better readability.

Tools
LanguageTool

[uncategorized] ~4-~4: Possible missing comma found. (AI_HYDRA_LEO_MISSING_COMMA)
Context: ...m calculations efficiently and quickly. Physically engineering high-quality qubits has pro...

cognee/api/v1/cognify/cognify.py (1)

Line range hint 232-232: Avoid using bare except statements.

- except:
+ except Exception as e:
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 9fd542c and d0939b9.

Files selected for processing (12)
  • cognee/api/v1/cognify/cognify.py (9 hunks)
  • cognee/infrastructure/data/chunking/DefaultChunkEngine.py (5 hunks)
  • cognee/infrastructure/data/chunking/LangchainChunkingEngine.py (3 hunks)
  • cognee/infrastructure/databases/graph/config.py (2 hunks)
  • cognee/infrastructure/llm/prompts/extract_topology.txt (1 hunks)
  • cognee/modules/topology/example_data.json (1 hunks)
  • cognee/modules/topology/explanations/062c22df-d99b-599f-90cd-2d325c8bcf69.txt (1 hunks)
  • cognee/modules/topology/explanations/6dfe01b6-07d2-5b77-83c8-1d6c11ce2aa7.txt (1 hunks)
  • cognee/modules/topology/explanations/Natural language processing.txt (1 hunks)
  • cognee/modules/topology/explanations/bab90046-1d9b-598c-8711-dab30f501915.txt (1 hunks)
  • cognee/modules/topology/infer_data_topology.py (1 hunks)
  • cognee/modules/topology/topology.py (1 hunks)
Files skipped from review due to trivial changes (4)
  • cognee/infrastructure/llm/prompts/extract_topology.txt
  • cognee/modules/topology/example_data.json
  • cognee/modules/topology/explanations/Natural language processing.txt
  • cognee/modules/topology/explanations/bab90046-1d9b-598c-8711-dab30f501915.txt
Files skipped from review as they are similar to previous changes (1)
  • cognee/infrastructure/data/chunking/DefaultChunkEngine.py
Additional context used
LanguageTool
cognee/modules/topology/explanations/062c22df-d99b-599f-90cd-2d325c8bcf69.txt

[style] ~3-~3: ‘with respect to’ might be wordy. Consider a shorter alternative. (EN_WORDINESS_PREMIUM_WITH_RESPECT_TO)
Context: ...some calculations exponentially faster (with respect to input size scaling) than any modern "cl...


[uncategorized] ~4-~4: Possible missing comma found. (AI_HYDRA_LEO_MISSING_COMMA)
Context: ...m calculations efficiently and quickly. Physically engineering high-quality qubits has pro...

cognee/modules/topology/explanations/6dfe01b6-07d2-5b77-83c8-1d6c11ce2aa7.txt

[typographical] ~4-~4: Nepárový symbol: zdá sa, že chýba '“' (UNPAIRED_BRACKETS)
Context: ...unikacije među nižim i višim majmunima: „U velikoj većini, celokupni sistem komun...

Ruff
cognee/modules/topology/topology.py

190-190: Local variable graph is assigned to but never used (F841)

Remove assignment to unused variable graph

cognee/api/v1/cognify/cognify.py

150-150: Local variable graph_topology is assigned to but never used (F841)

Remove assignment to unused variable graph_topology


232-232: Do not use bare except (E722)


304-304: Local variable graph is assigned to but never used (F841)

Remove assignment to unused variable graph

Additional comments not posted (6)
cognee/modules/topology/infer_data_topology.py (2)

10-10: Ensure that graph_topology is correctly initialized from graph_config.graph_topology as intended.


19-19: The updated error message is more specific, which improves clarity and debugging. Good job on this change.

cognee/infrastructure/databases/graph/config.py (2)

41-41: Including infer_graph_topology in the dictionary returned by to_dict ensures that this setting can be serialized and used in configurations. This is a good practice for maintaining configuration consistency.


22-27: Adding properties related to graph topology enhances the configurability of the graph database. Ensure that these new properties are utilized appropriately in the rest of the codebase.

Verification successful

The new properties related to graph topology are indeed utilized in various parts of the codebase, confirming their integration and relevance.

  • graph_topology_task, graph_topology, infer_graph_topology, and topology_file_path are used in cognee/modules/topology/topology.py, cognee/modules/topology/infer_data_topology.py, cognee/infrastructure/databases/graph/config.py, and cognee/api/v1/cognify/cognify.py.

This confirms that the new properties are appropriately utilized in the codebase.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the usage of new graph topology properties in the codebase.

# Test: Search for the usage of new properties. Expect: Proper usage in relevant modules.
rg --type python $'graph_topology_task|graph_topology|infer_graph_topology|topology_file_path'

Length of output: 126



Script:

#!/bin/bash
# Description: Verify the usage of new graph topology properties in the codebase.

# Test: Search for the usage of new properties. Expect: Proper usage in relevant modules.
rg --type py $'graph_topology_task|graph_topology|infer_graph_topology|topology_file_path'

Length of output: 2343

cognee/infrastructure/data/chunking/LangchainChunkingEngine.py (1)

9-18: The introduction of LangchainChunkEngine with customizable chunking strategies is a significant enhancement. Ensure that the parameters provided in the constructor are used consistently throughout the class methods.

cognee/modules/topology/topology.py (1)

22-22: Move the module-level import asyncio to the top of the file.

- import asyncio
+ # Move this import to the top of the file

Likely invalid or redundant comment.

U analizi komunikacionih sistema kod životinja zadržaćemo se samo na semiotičkim problemima – postoje li u pojedinim sistemima njihove komunikacije ZNACI, semiotički SISTEMI i neke semiotičke OPERACIJE, u onom smislu kako su ti pojmovi definisani i utvrđeni kod ljudi. Analiziraćemo sličnosti i razlike između komunikacije kod životinja i kod ljudi, posebno semiotičke komunikacije kod čoveka.
Kada se ima u vidu bogatstvo oblika komunikativnih veza među životinjama: sva raznolikost signala u pogledu fizičkih svojstava – hemijski, oflaktivni (mirisni), akustički (uključiv i ultrazvukove), električni, motorički (kinezički), proksemički (položaj u prostoru), vizuelni i drugi, zatim – raznovrsnost kanala (sredina) kroz koje se ostvaruje veza, kao i raznovrsnost funkcija koje imaju komunikativni sistemi, pitanje je koliko je uopšte opravdano govoriti o komunikaciji životinja u celini.
Međutim, kada se pristupi semiotičkoj analizi sistema komunikacije među životinjama, iza raznolikosti nalazi se prilična jednoličnost, čak tolika da se ne može utvrditi postoji li nekakvo usavršavanje sistema komunikacije duž evolucione lestvice.
Pogledajmo najpre kakve FUNKCIJE opslužuju sistemi komunikacija kod životinja. Poznati istraživač ovih problema, Marler, ovako rezimira analizu komunikacije među nižim i višim majmunima: „U velikoj većini, celokupni sistem komunikacije izgleda postoji radi organizacije socijalnog ponašanja grupe, regulacije dominantnosti i subordinacije, održanja mira i kohezije u grupi, kao i radi reprodukcije i brige o mladima (Marleu, 1967). Pomenute funkcije mogle bi se, nešto raščlanjenije, ovako opisati:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix the unpaired bracket issue.

- „U velikoj većini, celokupni sistem komunikacije izgleda postoji radi organizacije socijalnog ponašanja grupe, regulacije dominantnosti i subordinacije, održanja mira i kohezije u grupi, kao i radi reprodukcije i brige o mladima (Marleu, 1967).
+ „U velikoj većini, celokupni sistem komunikacije izgleda postoji radi organizacije socijalnog ponašanja grupe, regulacije dominantnosti i subordinacije, održanja mira i kohezije u grupi, kao i radi reprodukcije i brige o mladima“ (Marleu, 1967).
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Pogledajmo najpre kakve FUNKCIJE opslužuju sistemi komunikacija kod životinja. Poznati istraživač ovih problema, Marler, ovako rezimira analizu komunikacije među nižim i višim majmunima: „U velikoj većini, celokupni sistem komunikacije izgleda postoji radi organizacije socijalnog ponašanja grupe, regulacije dominantnosti i subordinacije, održanja mira i kohezije u grupi, kao i radi reprodukcije i brige o mladima (Marleu, 1967). Pomenute funkcije mogle bi se, nešto raščlanjenije, ovako opisati:
Pogledajmo najpre kakve FUNKCIJE opslužuju sistemi komunikacija kod životinja. Poznati istraživač ovih problema, Marler, ovako rezimira analizu komunikacije među nižim i višim majmunima: „U velikoj većini, celokupni sistem komunikacije izgleda postoji radi organizacije socijalnog ponašanja grupe, regulacije dominantnosti i subordinacije, održanja mira i kohezije u grupi, kao i radi reprodukcije i brige o mladima“ (Marleu, 1967). Pomenute funkcije mogle bi se, nešto raščlanjenije, ovako opisati:
Tools
LanguageTool

[typographical] ~4-~4: Nepárový symbol: zdá sa, že chýba '“' (UNPAIRED_BRACKETS)
Context: ...unikacije među nižim i višim majmunima: „U velikoj većini, celokupni sistem komun...

file_path = 'example_data.json' # or 'example_data.csv'
#
# # Adding graph topology
graph = await topology_engine.add_graph_topology(file_path, dataset_files=dataset_files)
Copy link
Contributor


Remove assignment to unused variable graph.

- graph = await topology_engine.add_graph_topology(file_path, dataset_files=dataset_files)
+ await topology_engine.add_graph_topology(file_path, dataset_files=dataset_files)
Committable suggestion


Suggested change
graph = await topology_engine.add_graph_topology(file_path, dataset_files=dataset_files)
await topology_engine.add_graph_topology(file_path, dataset_files=dataset_files)
Tools
Ruff

190-190: Local variable graph is assigned to but never used (F841)

Remove assignment to unused variable graph

Comment on lines +38 to +44
+         chunked_data, chunk_number = self.chunk_data_by_code(source_data,self.chunk_size, self.chunk_overlap)
      elif chunk_strategy == ChunkStrategy.LANGCHAIN_CHARACTER:
-         chunked_data = LangchainChunkEngine.chunk_data_by_character(source_data,chunk_size, chunk_overlap)
+         chunked_data, chunk_number = self.chunk_data_by_character(source_data,self.chunk_size, self.chunk_overlap)
      else:
-         chunked_data = DefaultChunkEngine.chunk_data_by_paragraph(source_data,chunk_size, chunk_overlap)
-         return chunked_data
+         chunked_data, chunk_number = "Invalid chunk strategy.", [0, "Invalid chunk strategy."]
+     return chunked_data, chunk_number
Copy link
Contributor


The implementation of different chunking strategies based on the chunk_strategy parameter is well-handled. However, consider handling the case where an invalid strategy is passed more gracefully, perhaps by raising a specific exception rather than returning a string error.

- chunked_data, chunk_number  = "Invalid chunk strategy.", [0, "Invalid chunk strategy."]
+ raise ValueError("Invalid chunk strategy provided.")
Committable suggestion


Suggested change
chunked_data, chunk_number = self.chunk_data_by_code(source_data,self.chunk_size, self.chunk_overlap)
elif chunk_strategy == ChunkStrategy.LANGCHAIN_CHARACTER:
chunked_data = LangchainChunkEngine.chunk_data_by_character(source_data,chunk_size, chunk_overlap)
chunked_data, chunk_number = self.chunk_data_by_character(source_data,self.chunk_size, self.chunk_overlap)
else:
chunked_data = DefaultChunkEngine.chunk_data_by_paragraph(source_data,chunk_size, chunk_overlap)
return chunked_data
chunked_data, chunk_number = "Invalid chunk strategy.", [0, "Invalid chunk strategy."]
return chunked_data, chunk_number
chunked_data, chunk_number = self.chunk_data_by_code(source_data,self.chunk_size, self.chunk_overlap)
elif chunk_strategy == ChunkStrategy.LANGCHAIN_CHARACTER:
chunked_data, chunk_number = self.chunk_data_by_character(source_data,self.chunk_size, self.chunk_overlap)
else:
raise ValueError("Invalid chunk strategy provided.")
return chunked_data, chunk_number
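The `ValueError` approach can be taken one step further with a dispatch table, which makes the set of valid strategies explicit and fails fast on anything else. This is only a self-contained sketch — the enum members, the `chunk` function, and the stand-in chunker are hypothetical, not cognee's actual implementation.

```python
from enum import Enum

class ChunkStrategy(Enum):
    CODE = "code"
    LANGCHAIN_CHARACTER = "langchain_character"

def _chunk_by_character(text, size, overlap):
    # Fixed-size windows with overlap, standing in for the real chunkers.
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def chunk(text, strategy, size=8, overlap=2):
    handlers = {
        ChunkStrategy.CODE: _chunk_by_character,
        ChunkStrategy.LANGCHAIN_CHARACTER: _chunk_by_character,
    }
    try:
        handler = handlers[strategy]
    except KeyError:
        # Fail fast instead of returning an "Invalid chunk strategy." string.
        raise ValueError(f"Invalid chunk strategy: {strategy!r}")
    return handler(text, size, overlap)
```

A lookup failure surfaces immediately as an exception the caller can handle, rather than a sentinel string that silently flows into downstream code.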

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between d0939b9 and 409d3c7.

Files selected for processing (1)
  • cognee/infrastructure/databases/graph/config.py (2 hunks)
Files skipped from review as they are similar to previous changes (1)
  • cognee/infrastructure/databases/graph/config.py

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Outside diff range and nitpick comments (1)
cognee/api/v1/cognify/cognify.py (1)

Line range hint 232-232: Avoid using bare except statements as they can catch unexpected exceptions and hide programming errors.

- except:
+ except Exception as e:
+     logger.error(f"An error occurred: {str(e)}")
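The pattern in that diff, as a runnable standalone sketch — the function name and parsing logic are hypothetical, chosen only to show the shape of the fix:

```python
import logging

logger = logging.getLogger(__name__)

def to_int(raw):
    """Parse raw into an int, logging and returning None on failure."""
    try:
        return int(raw)
    except Exception as e:  # narrower than a bare except; the cause is logged
        logger.error(f"An error occurred: {str(e)}")
        return None
```

Unlike a bare `except:`, this keeps the traceback information available for logging and will not swallow exits such as `KeyboardInterrupt`, which derive from `BaseException` rather than `Exception`.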
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 409d3c7 and a197177.

Files selected for processing (3)
  • README.md (1 hunks)
  • cognee/api/v1/cognify/cognify.py (10 hunks)
  • cognee/api/v1/config/config.py (2 hunks)
Files skipped from review as they are similar to previous changes (1)
  • cognee/api/v1/config/config.py
Additional context used
LanguageTool
README.md

[locale-violation] ~35-~35: In American English, “take a look” is more commonly used. (HAVE_A_LOOK)
Context: ...-iB9gpYfu?usp=sharing">notebook or have a look at our <a href="https://topoteretes.git...


[style] ~128-~128: The phrase “a variety of” may be wordy. To make your writing clearer, consider replacing it. (A_VARIETY_OF)
Context: ...ieval, Graphs and LLMs Cognee supports a variety of tools and services for different operat...

Markdownlint
README.md

72-72: Expected: 0 or 2; Actual: 1 (MD009, no-trailing-spaces)
Trailing spaces


5-5: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


12-12: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


34-34: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


37-37: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


38-38: Expected: 1; Actual: 3 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


39-39: Expected: 1; Actual: 4 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


42-42: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


43-43: Expected: 1; Actual: 3 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


44-44: Expected: 1; Actual: 4 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


45-45: Expected: 1; Actual: 5 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


54-54: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


61-61: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


142-142: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


143-143: Expected: 1; Actual: 3 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


146-146: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


147-147: Expected: 1; Actual: 3 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


148-148: Expected: 1; Actual: 4 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


149-149: Expected: 1; Actual: 5 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


152-152: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


153-153: Expected: 1; Actual: 3 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


154-154: Expected: 1; Actual: 4 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


155-155: Expected: 1; Actual: 5 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


158-158: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


161-161: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


71-71: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


73-73: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


79-79: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


81-81: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


102-102: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


104-104: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


106-106: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


108-108: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


66-66: null (MD040, fenced-code-language)
Fenced code blocks should have a language specified


73-73: null (MD040, fenced-code-language)
Fenced code blocks should have a language specified


79-79: null (MD040, fenced-code-language)
Fenced code blocks should have a language specified


88-88: null (MD040, fenced-code-language)
Fenced code blocks should have a language specified


104-104: null (MD040, fenced-code-language)
Fenced code blocks should have a language specified


108-108: null (MD040, fenced-code-language)
Fenced code blocks should have a language specified


144-144: null (MD045, no-alt-text)
Images should have alternate text (alt text)

Ruff
cognee/api/v1/cognify/cognify.py

150-150: Local variable graph_topology is assigned to but never used (F841)

Remove assignment to unused variable graph_topology


232-232: Do not use bare except (E722)


304-304: Local variable graph is assigned to but never used (F841)

Remove assignment to unused variable graph

Additional comments not posted (1)
cognee/api/v1/cognify/cognify.py (1)

9-12: Ensure that the newly added imports are utilized effectively in the code.

Comment on lines +98 to +100
search_results = cognee.search("SIMILARITY", {'query': 'Tell me about NLP'}) # Query cognee for the knowledge

for result_text in search_results[0]:
print(result_text)
print(search_results)
Copy link
Contributor


Consider adding error handling for the search method to ensure robustness.

- search_results = cognee.search("SIMILARITY", {'query': 'Tell me about NLP'}) # Query cognee for the knowledge
+ try:
+     search_results = cognee.search("SIMILARITY", {'query': 'Tell me about NLP'}) # Query cognee for the knowledge
+ except Exception as e:
+     print(f"Error during search: {str(e)}")
+     search_results = None
Committable suggestion


Suggested change
search_results = cognee.search("SIMILARITY", {'query': 'Tell me about NLP'}) # Query cognee for the knowledge
for result_text in search_results[0]:
print(result_text)
print(search_results)
try:
search_results = cognee.search("SIMILARITY", {'query': 'Tell me about NLP'}) # Query cognee for the knowledge
except Exception as e:
print(f"Error during search: {str(e)}")
search_results = None
print(search_results)

@@ -136,21 +145,37 @@ async def process_batch(files_batch):
batch_size = 20
file_count = 0
files_batch = []
from cognee.infrastructure.databases.graph.config import get_graph_config
graph_config = get_graph_config()
graph_topology = graph_config.graph_model
Copy link
Contributor


The variable graph_topology is declared but not used, which could lead to unnecessary memory usage.

- graph_topology = graph_config.graph_model
Committable suggestion


Suggested change
graph_topology = graph_config.graph_model
Tools
Ruff

150-150: Local variable graph_topology is assigned to but never used (F841)

Remove assignment to unused variable graph_topology

config.set_chunk_strategy(ChunkStrategy.LANGCHAIN_CHARACTER)
config.embedding_engine = LiteLLMEmbeddingEngine()

graph = await cognify()
Copy link
Contributor


The variable graph is assigned but never used. Consider removing it if it's not needed.

- graph = await cognify()
Committable suggestion


Suggested change
graph = await cognify()
await cognify()
Tools
Ruff

304-304: Local variable graph is assigned to but never used (F841)

Remove assignment to unused variable graph

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between a197177 and 0d230c9.

Files selected for processing (2)
  • .github/workflows/test_common.yml (1 hunks)
  • cognee/tests/test_qdrant.py (1 hunks)
Additional context used
Ruff
cognee/tests/test_qdrant.py

36-36: Local variable text_1 is assigned to but never used (F841)

Remove assignment to unused variable text_1


44-44: Local variable text_2 is assigned to but never used (F841)

Remove assignment to unused variable text_2

Additional comments not posted (5)
.github/workflows/test_common.yml (2)

86-86: The step to run the default basic pipeline is well-configured.


93-100: The configuration for running the Qdrant test is consistent and well-defined.

cognee/tests/test_qdrant.py (3)

22-22: Setting the vector engine provider to "qdrant" aligns with the PR objectives.


29-30: The method for adding data to the cognee module is correctly implemented.


52-66: The implementation of various search operations is correct and consistent.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

Outside diff range and nitpick comments (7)
cognee/modules/search/graph/search_neighbour.py (1)

Line range hint 28-39: The function implementation for both NetworkX and Neo4j is correct. However, consider handling exceptions explicitly to improve robustness.

try:
    # existing code
except Exception as e:
    logging.error(f"Failed to search neighbours: {str(e)}")
    raise
cognee/modules/search/graph/search_summary.py (1)

Line range hint 46-51: Remove the unnecessary f prefix from the string as it does not contain any placeholders.

-        cypher_query = f"""
+        cypher_query = """
        MATCH (n)
        WHERE n.id CONTAINS $query AND EXISTS(n.summary)
        RETURN n.id AS nodeId, n.summary AS summary
        """
cognee/modules/cognify/graph/add_node_connections.py (2)

Line range hint 44-57: Refactored to support multiple graph database providers ("NETWORKX" and "neo4j"). Ensure that the placeholder for Neo4j is replaced with actual implementation logic before merging.


Line range hint 61-61: The local variable result is assigned but never used. Consider removing or utilizing it to avoid potential confusion and improve code clarity.

- result = await graph.query(f"""MATCH (a), (b) WHERE a.unique_id = '{relationship['searched_node_id']}' AND b.unique_id = '{relationship['original_id_for_search']}'
-           CREATE (a)-[:SEMANTIC_CONNECTION {{weight:{relationship['score']}}}]->(b)""")
+ await graph.query(f"""MATCH (a), (b) WHERE a.unique_id = '{relationship['searched_node_id']}' AND b.unique_id = '{relationship['original_id_for_search']}'
+           CREATE (a)-[:SEMANTIC_CONNECTION {{weight:{relationship['score']}}}]->(b)""")
  await graph.close()
cognee/modules/cognify/graph/create.py (3)

Line range hint 158-158: Remove unused variable created_node_id.

- created_node_id = await add_node(graph_client, parent_id, node_id, node_data)
+ _ = await add_node(graph_client, parent_id, node_id, node_data)

Line range hint 218-218: Remove unused variable relationship_data.

- relationship_data = {}

Line range hint 253-253: Remove unused variable ids.

- ids = await process_attribute(graph_client, root_id, attribute_name, attribute_value)
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 0d230c9 and 39b346d.

Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
Files selected for processing (24)
  • .github/workflows/test_common.yml (2 hunks)
  • .github/workflows/test_neo4j.yml (1 hunks)
  • .github/workflows/test_qdrant.yml (1 hunks)
  • .github/workflows/test_weaviate.yml (1 hunks)
  • cognee/api/v1/add/add.py (1 hunks)
  • cognee/api/v1/config/config.py (3 hunks)
  • cognee/api/v1/search/search.py (1 hunks)
  • cognee/infrastructure/databases/graph/config.py (2 hunks)
  • cognee/infrastructure/databases/graph/get_graph_client.py (2 hunks)
  • cognee/infrastructure/databases/vector/create_vector_engine.py (1 hunks)
  • cognee/modules/cognify/graph/add_node_connections.py (3 hunks)
  • cognee/modules/cognify/graph/create.py (1 hunks)
  • cognee/modules/data/extraction/extract_categories.py (1 hunks)
  • cognee/modules/search/graph/search_adjacent.py (1 hunks)
  • cognee/modules/search/graph/search_categories.py (2 hunks)
  • cognee/modules/search/graph/search_cypher.py (1 hunks)
  • cognee/modules/search/graph/search_neighbour.py (2 hunks)
  • cognee/modules/search/graph/search_summary.py (2 hunks)
  • cognee/modules/search/vector/search_similarity.py (2 hunks)
  • cognee/tests/test_neo4j.py (1 hunks)
  • cognee/tests/test_qdrant.py (1 hunks)
  • cognee/tests/test_weaviate.py (1 hunks)
  • docs/api_reference.md (1 hunks)
  • tests/import_test.py (1 hunks)
Files skipped from review due to trivial changes (3)
  • cognee/api/v1/add/add.py
  • cognee/modules/data/extraction/extract_categories.py
  • tests/import_test.py
Files skipped from review as they are similar to previous changes (4)
  • cognee/api/v1/config/config.py
  • cognee/infrastructure/databases/graph/config.py
  • cognee/infrastructure/databases/vector/create_vector_engine.py
  • cognee/modules/search/vector/search_similarity.py
Additional context used
Ruff
cognee/infrastructure/databases/graph/get_graph_client.py

22-22: Do not use bare except (E722)


35-35: Do not use bare except (E722)

cognee/modules/search/graph/search_summary.py

47-51: f-string without any placeholders (F541)

Remove extraneous f prefix

cognee/modules/cognify/graph/add_node_connections.py

61-61: Local variable result is assigned to but never used (F841)

Remove assignment to unused variable result

cognee/tests/test_qdrant.py

36-36: Local variable text_1 is assigned to but never used (F841)

Remove assignment to unused variable text_1


44-44: Local variable text_2 is assigned to but never used (F841)

Remove assignment to unused variable text_2

cognee/tests/test_neo4j.py

36-36: Local variable text_1 is assigned to but never used (F841)

Remove assignment to unused variable text_1


44-44: Local variable text_2 is assigned to but never used (F841)

Remove assignment to unused variable text_2

cognee/modules/cognify/graph/create.py

158-158: Local variable created_node_id is assigned to but never used (F841)

Remove assignment to unused variable created_node_id


218-218: Local variable relationship_data is assigned to but never used (F841)

Remove assignment to unused variable relationship_data


253-253: Local variable ids is assigned to but never used (F841)

Remove assignment to unused variable ids

LanguageTool
docs/api_reference.md

[uncategorized] ~27-~27: Possible missing comma found. (AI_HYDRA_LEO_MISSING_COMMA)
Context: ...ry: str) Sets the root directory of the system where essential system files and operat...

Markdownlint
docs/api_reference.md

9-9: Expected: 0 or 2; Actual: 1 (MD009, no-trailing-spaces)
Trailing spaces


16-16: Expected: 0 or 2; Actual: 1 (MD009, no-trailing-spaces)
Trailing spaces


77-77: Expected: 0 or 2; Actual: 1 (MD009, no-trailing-spaces)
Trailing spaces


5-5: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


11-11: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


22-22: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


25-25: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


87-87: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


88-88: Expected: 1; Actual: 3 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


91-91: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


92-92: Expected: 1; Actual: 3 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


95-95: Expected: 1; Actual: 2 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


96-96: Expected: 1; Actual: 3 (MD012, no-multiple-blanks)
Multiple consecutive blank lines


19-19: Expected: 1; Actual: 0; Below (MD022, blanks-around-headings)
Headings should be surrounded by blank lines


26-26: Expected: 1; Actual: 0; Below (MD022, blanks-around-headings)
Headings should be surrounded by blank lines


34-34: Expected: 1; Actual: 0; Below (MD022, blanks-around-headings)
Headings should be surrounded by blank lines


45-45: Expected: 1; Actual: 0; Below (MD022, blanks-around-headings)
Headings should be surrounded by blank lines


67-67: Expected: 1; Actual: 0; Below (MD022, blanks-around-headings)
Headings should be surrounded by blank lines


76-76: Expected: 1; Actual: 0; Below (MD022, blanks-around-headings)
Headings should be surrounded by blank lines


97-97: Expected: 1; Actual: 0; Below (MD022, blanks-around-headings)
Headings should be surrounded by blank lines


103-103: Expected: 1; Actual: 0; Below (MD022, blanks-around-headings)
Headings should be surrounded by blank lines


110-110: Expected: 1; Actual: 0; Below (MD022, blanks-around-headings)
Headings should be surrounded by blank lines


115-115: Expected: 1; Actual: 0; Above (MD022, blanks-around-headings)
Headings should be surrounded by blank lines


115-115: Expected: 1; Actual: 0; Below (MD022, blanks-around-headings)
Headings should be surrounded by blank lines


14-14: null (MD025, single-title, single-h1)
Multiple top-level headings in the same document


30-30: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


40-40: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


51-51: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


62-62: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


71-71: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


81-81: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


104-104: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


116-116: null (MD031, blanks-around-fences)
Fenced code blocks should be surrounded by blank lines


98-98: null (MD032, blanks-around-lists)
Lists should be surrounded by blank lines


111-111: null (MD032, blanks-around-lists)
Lists should be surrounded by blank lines


114-114: null (MD032, blanks-around-lists)
Lists should be surrounded by blank lines

Additional comments not posted (7)
.github/workflows/test_qdrant.yml (1)

1-62: The workflow is well-structured and follows best practices for CI/CD pipelines.

.github/workflows/test_weaviate.yml (1)

1-62: The workflow is well-structured and follows best practices for CI/CD pipelines.

cognee/modules/search/graph/search_summary.py (1)

Line range hint 29-46: The logic for handling different graph database providers looks good and is well-structured.

Also applies to: 53-53

cognee/modules/search/graph/search_categories.py (1)

Line range hint 37-53: The logic for handling different graph database providers in search_categories is well-implemented and clear.

docs/api_reference.md (1)

76-79: The documentation update to reflect the new function name graph_database_provider is accurate and clear.

Tools
Markdownlint

77-77: Expected: 0 or 2; Actual: 1 (MD009, no-trailing-spaces)
Trailing spaces


76-76: Expected: 1; Actual: 0; Below (MD022, blanks-around-headings)
Headings should be surrounded by blank lines

cognee/api/v1/search/search.py (1)

53-53: The update to use the new graph database provider configuration in the search function is correctly implemented.

.github/workflows/test_common.yml (1)

1-1: Renamed workflow to "test | environments" to reflect the broader scope of testing environments.

@@ -45,7 +45,7 @@ async def add_node(client, parent_id: Optional[str], node_id: str, node_data: di
graph_config = get_graph_config()

# Add an edge if a parent ID is provided and the graph engine is NETWORKX
if parent_id and "default_relationship" in node_data and graph_config.graph_engine == GraphDBType.NETWORKX:
if parent_id and "default_relationship" in node_data and graph_config.graph_database_provider == "NETWORKX":
Copy link
Contributor


Consider handling exceptions more robustly in edge addition.

The current implementation prints the error and passes. This might not be sufficient for production environments where error tracking and handling are critical. Consider using a logging framework or rethrowing the exception after logging.


Refactor to use a logging framework.

Instead of using print statements for error logging, consider integrating a logging framework that supports different log levels and is more suitable for production environments.

- print(f"Error adding edge: {e}")
+ import logging
+ logger = logging.getLogger(__name__)
+ logger.error(f"Error adding edge: {e}")

Committable suggestion was skipped due to low confidence.


Tip

Codebase Verification

Ensure consistent error handling by replacing print statements with a logging framework and considering the use of exception handling where appropriate.

  • cognee/modules/cognify/graph/create.py: Lines containing print(f"Error adding edge: {e}") and print(f"Error adding node or edge: {e}").
Analysis chain

Ensure consistency in error handling.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify consistent error handling across the module.

# Test: Search for print statements used for error logging. Expect: No occurrences after refactoring.
rg --type py 'print\(f"Error' cognee/modules/cognify/graph/

Length of output: 245



Script:

#!/bin/bash
# Description: Verify consistent error handling across the module.

# Test: Search for logging statements used for error handling.
rg --type py 'logging\.error' cognee/modules/cognify/graph/

# Test: Search for exception handling.
rg --type py 'raise ' cognee/modules/cognify/graph/

Length of output: 307

Comment on lines +1 to +63
name: test neo4j

on:
  pull_request:
    branches:
      - main
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

env:
  RUNTIME__LOG_LEVEL: ERROR

jobs:
  get_docs_changes:
    name: docs changes
    uses: ./.github/workflows/get_docs_changes.yml

  run_common:
    name: test
    needs: get_docs_changes
    if: needs.get_docs_changes.outputs.changes_outside_docs == 'true'
    runs-on: macos-latest

    defaults:
      run:
        shell: bash

    steps:
      - name: Check out
        uses: actions/checkout@v2

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: 3.11

      - name: Install Poetry
        uses: snok/install-poetry@v1.3.2
        with:
          virtualenvs-create: true
          virtualenvs-in-project: true
          installer-parallel: true

      - name: Install dependencies
        run: poetry install --no-interaction

      - name: Create .cognee_system directory and print path
        run: |
          mkdir .cognee_system
          echo $(pwd)/.cognee_system

      - name: Run default Neo4j
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          LLM_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GRAPH_DATABASE_URL: ${{ secrets.NEO4J_API_URL }}
          GRAPH_DATABASE_PASSWORD: ${{ secrets.NEO4J_API_KEY }}
          GRAPH_DATABASE_USERNAME: "neo4j"
          ENV: 'dev'
        run: poetry run python ./cognee/tests/test_neo4j.py
Copy link
Contributor


The workflow configuration for Neo4j testing appears comprehensive and well-structured. However, consider adding a step to cache dependencies to speed up the build process.

- name: Cache Poetry virtual environment
  uses: actions/cache@v2
  with:
    path: ~/.cache/pypoetry/virtualenvs
    key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
    restore-keys: |
      ${{ runner.os }}-poetry-

Comment on lines +1 to +27
import asyncio

async def test_weaviate_integration():
    from cognee import config, prune, add, cognify, search

    config.set_vector_engine_provider("weaviate")
    # config.set_vector_db_url("TEST_URL")
    # config.set_vector_db_key("TEST_KEY")

    prune.prune_system()

    text = """
    Incapillo is a Pleistocene-age caldera (a depression formed by the collapse of a volcano) in the La Rioja Province of Argentina. It is the southernmost volcanic centre in the Andean Central Volcanic Zone (CVZ) that erupted during the Pleistocene. Incapillo is one of several ignimbrite[a] or caldera systems that, along with 44 active stratovolcanoes, are part of the CVZ.
    Subduction of the Nazca Plate beneath the South American Plate is responsible for most of the volcanism in the CVZ. After activity in the volcanic arc of the western Maricunga Belt ceased six million years ago, volcanism commenced in the Incapillo region, forming the high volcanic edifices Monte Pissis, Cerro Bonete Chico and Sierra de Veladero. Later, a number of lava domes were emplaced between these volcanoes.
    Incapillo is the source of the Incapillo ignimbrite, a medium-sized deposit comparable to the Katmai ignimbrite. The Incapillo ignimbrite was erupted 0.52 ± 0.03 and 0.51 ± 0.04 million years ago and has a volume of about 20.4 cubic kilometres (4.9 cu mi). A caldera with dimensions of 5 by 6 kilometres (3.1 mi × 3.7 mi) formed during the eruption. Later volcanism generated more lava domes within the caldera and a debris flow in the Sierra de Veladero. The lake within the caldera may overlie an area of ongoing hydrothermal activity.
    """

    await add(text)

    await cognify()

    result = await search("SIMILARITY", { "query": "volcanic eruption" })

    print(result)

if __name__ == "__main__":
    asyncio.run(test_weaviate_integration())

The test function for Weaviate integration is well-implemented. However, consider adding more detailed assertions to verify the correctness of each operation rather than just running them.

assert result is not None, "Search results should not be None"
assert isinstance(result, list), "Search results should be a list"

Comment on lines +29 to +37
if graph_config.graph_database_provider == "NETWORKX":
    if node_id not in graph:
        return {}

    neighbors = list(graph.neighbors(node_id))
    neighbor_descriptions = {neighbor: graph.nodes[neighbor].get('description') for neighbor in neighbors}
    return neighbor_descriptions

-elif graph_config.graph_engine == GraphDBType.NEO4J:
+elif graph_config.graph_database_provider == "neo4j":

The conditional handling for different graph database providers is implemented correctly. However, consider adding logging for better traceability and debugging.

import logging
logging.info(f"Using graph database provider: {graph_config.graph_database_provider}")
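
The NETWORKX branch above can be exercised in isolation with a toy graph; the node names and descriptions below are illustrative, not taken from the repository:

```python
import networkx as nx

def get_neighbor_descriptions(graph: nx.Graph, node_id: str) -> dict:
    # Mirrors the NETWORKX branch: a missing node yields an empty dict.
    if node_id not in graph:
        return {}
    return {
        neighbor: graph.nodes[neighbor].get("description")
        for neighbor in graph.neighbors(node_id)
    }

graph = nx.Graph()
graph.add_node("caldera", description="volcanic depression")
graph.add_node("ignimbrite", description="pyroclastic deposit")
graph.add_edge("caldera", "ignimbrite")

print(get_neighbor_descriptions(graph, "caldera"))
# {'ignimbrite': 'pyroclastic deposit'}
```

Because the lookup uses `.get('description')`, nodes without that attribute map to `None` rather than raising, which is worth asserting in tests.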

# python -m nltk.downloader averaged_perceptron_tagger

-    - name: Run test script
+    - name: Run default basic pipeline

Added a new step "Run default basic pipeline" with specific environment variables. Ensure that the OPENAI_API_KEY and LLM_API_KEY are securely managed and not exposed.

# # data_directory_path is defined above
# await cognee.add("data://" + "/Users/runner/work/cognee/cognee/./cognee/tests", dataset_name)

text_1 = """A quantum computer is a computer that takes advantage of quantum mechanical phenomena.

Remove unused variables text_1 and text_2 to clean up the code.

- text_1 = """A quantum computer is a computer that takes advantage of quantum mechanical phenomena.
- ...
- """
- text_2 = """A large language model (LLM) is a language model notable for its ability to achieve general-purpose language generation and other natural language processing tasks such as classification. LLMs acquire these abilities by learning statistical relationships from text documents during a computationally intensive self-supervised and semi-supervised training process. LLMs can be used for text generation, a form of generative AI, by taking an input text and repeatedly predicting the next token or word.
- ...
- """

Also applies to: 44-44

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
text_1 = """A quantum computer is a computer that takes advantage of quantum mechanical phenomena.
Tools
Ruff

36-36: Local variable text_1 is assigned to but never used (F841)

Remove assignment to unused variable text_1

@@ -10,7 +10,7 @@ async def search_cypher(query:str, graph: Union[nx.Graph, any]):
     """
     graph_config = get_graph_config()

-    if graph_config.graph_engine == GraphDBType.NEO4J:
+    if graph_config.graph_database_provider == "neo4j":

Tip

Codebase Verification

The search_cypher function currently only handles the "neo4j" graph database provider and raises a ValueError for any other provider. This means that other supported providers like "NETWORKX" and "falkorb" are not handled in this function.

  • cognee/modules/search/graph/search_cypher.py:13: Handles "neo4j".
  • cognee/modules/search/graph/search_cypher.py:17: Raises ValueError for unsupported providers.

To ensure comprehensive support for all graph database providers, additional handling for "NETWORKX" and "falkorb" should be implemented in the search_cypher function.
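
One way to close that gap is an explicit dispatch on the provider string. The sketch below is hypothetical — the provider names and adapter methods (`run`, `query`) are assumptions for illustration, not the repository's actual API:

```python
async def search_cypher(query: str, graph, graph_database_provider: str):
    # Hypothetical dispatch covering every provider the config allows.
    if graph_database_provider == "neo4j":
        # Forward raw Cypher to the Neo4j driver (assumed interface).
        return await graph.run(query)
    if graph_database_provider == "falkordb":
        # FalkorDB also accepts Cypher, so the query is forwarded as-is.
        return await graph.query(query)
    if graph_database_provider == "NETWORKX":
        # NetworkX has no Cypher engine; fail loudly instead of silently.
        raise NotImplementedError("Cypher search is not supported for NetworkX graphs")
    raise ValueError(f"Unsupported graph database provider: {graph_database_provider}")
```

Keeping the final `ValueError` preserves the existing behaviour for genuinely unknown providers, while the explicit `NotImplementedError` distinguishes "known but unsupported" from "unknown".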

Analysis chain

Ensure that all supported graph database providers are handled appropriately.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify that all supported graph database providers are handled in the function.

# Test: Search for the function usage. Expect: Handling for all supported providers.
rg --type python $'graph_database_provider'

Length of output: 73



Script:

#!/bin/bash
# Description: Verify that all supported graph database providers are handled in the function.

# Test: Search for the function usage. Expect: Handling for all supported providers.
rg --type py 'graph_database_provider'

Length of output: 2220



Script:

#!/bin/bash
# Description: Extract the full implementation of the `search_cypher` function to verify the handling of all graph database providers.

# Test: Extract the `search_cypher` function implementation.
ast-grep --lang python --pattern $'def search_cypher($_, $_):\n    $$$' cognee/modules/search/graph/search_cypher.py

Length of output: 1064

@@ -22,7 +22,7 @@ async def get_graph_client(graph_type: GraphDBType, graph_file_name: str = None)
except:

Replace bare except clauses with specific exceptions or at least log the exception.

-        except:
+        except Exception as e:
+            logger.error(f"Failed to initialize Neo4jAdapter: {e}")
-        except:
+        except Exception as e:
+            logger.error(f"Failed to initialize FalcorDBAdapter: {e}")

Also applies to: 35-35

Committable suggestion was skipped due to low confidence.

Tools
Ruff

22-22: Do not use bare except (E722)

@borisarzentar borisarzentar merged commit 3224b49 into main Jun 12, 2024
19 checks passed
@borisarzentar borisarzentar deleted the fix/setting-alternative-vector-db branch June 12, 2024 21:55