@AayushTyagi1
Contributor

No description provided.

@AayushTyagi1 AayushTyagi1 requested a review from nithishr October 21, 2025 22:16
@AayushTyagi1 AayushTyagi1 self-assigned this Oct 21, 2025
@gemini-code-assist
Contributor

Summary of Changes

Hello @AayushTyagi1, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the toolset by introducing the ability to programmatically list and inspect Couchbase indexes. It provides a clear, structured output of index names, whether they are primary, and their complete N1QL definition, which is crucial for understanding and managing the database's indexing strategy. This addition improves observability and simplifies index management by allowing users to filter indexes based on their hierarchical location within the cluster.

Highlights

  • New Feature: Index Listing: Introduced a new capability to list all indexes within a Couchbase cluster, including their names, primary status, and full CREATE INDEX definitions. This feature supports optional filtering by bucket, scope, and collection.
  • New Module for Index Operations: A new module, src/tools/index.py, has been added to encapsulate all index-related functionalities, starting with the list_indexes function.
  • Index Definition Utility: A new utility file, src/utils/index_utils.py, was created to provide helper functions, specifically generate_index_definition, which constructs the CREATE INDEX statement for GSI indexes based on their metadata.
  • Documentation Update: The README.md file has been updated to reflect the newly added functionality of listing indexes.
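The optional bucket, scope, and collection filtering described in the highlights could be sketched roughly as below. This is an illustration only, not the PR's actual list_indexes code: build_list_indexes_query is a hypothetical helper, and the field names bucket_id, scope_id, and keyspace_id are assumed from the mapping that appears later in this review.

```python
from __future__ import annotations

from typing import Any


def build_list_indexes_query(
    bucket: str | None = None,
    scope: str | None = None,
    collection: str | None = None,
) -> tuple[str, dict[str, Any]]:
    """Build a parameterized query against system:all_indexes (hypothetical helper)."""
    query = "SELECT idx.* FROM system:all_indexes AS idx"
    filters: list[str] = []
    params: dict[str, Any] = {}

    # Assumed field names: bucket_id, scope_id, keyspace_id (the collection).
    candidates = (
        ("bucket_id", bucket),
        ("scope_id", scope),
        ("keyspace_id", collection),
    )
    for field, value in candidates:
        if value:
            # Use named parameters so user input is never spliced into the query text.
            filters.append(f"idx.{field} = ${field}")
            params[field] = value

    if filters:
        query += " WHERE " + " AND ".join(filters)
    return query, params


query, params = build_list_indexes_query(bucket="travel-sample", scope="inventory")
```

Returning the query text and the named parameters separately keeps the filter values out of the statement itself, which avoids quoting problems with bucket names such as travel-sample.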
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new feature to list Couchbase indexes, which is a valuable addition. The implementation is well-structured, with new modules for index-related functionality. I've identified a couple of areas for improvement: one is a minor code cleanup to remove a redundant variable, and the other is a more important fix to add validation in the index definition generation logic. This will make the function more robust against unexpected data and prevent potential errors or invalid SQL generation. Overall, great work on this feature.

Comment on lines +33 to +75
    # Only generate definition for GSI indexes
    if index_data.get("using") != "gsi":
        return None

    try:
        # Start building the definition
        if index_data.get("is_primary"):
            query_definition = "CREATE PRIMARY INDEX"
        else:
            query_definition = "CREATE INDEX"

        # Add index name
        query_definition += f" `{index_data['name']}`"

        # Add bucket name
        query_definition += f" ON `{index_data['bucket']}`"

        # Add scope and collection if they exist
        scope = index_data.get("scope")
        collection = index_data.get("collection")
        if scope and collection:
            query_definition += f".`{scope}`.`{collection}`"

        # Add index keys for non-primary indexes
        index_keys = index_data.get("index_key", [])
        if index_keys and len(index_keys) > 0:
            keys_str = ", ".join(str(key) for key in index_keys)
            query_definition += f"({keys_str})"

        # Add WHERE condition if exists
        condition = index_data.get("condition")
        if condition:
            query_definition += f" WHERE {condition}"

        # Add PARTITION BY if exists
        partition = index_data.get("partition")
        if partition:
            query_definition += f" PARTITION BY {partition}"

        return query_definition
    except Exception as e:
        logger.warning(f"Error generating index definition: {e}")
        return None
Contributor

high

The function could generate an invalid CREATE INDEX statement if name or bucket are missing or None in index_data. This can happen if the query to system:all_indexes returns incomplete data. It's safer to add explicit validation for these required fields at the beginning of the function to prevent generating invalid SQL and to make the function more robust. While the try...except block catches some errors, proactive validation is a better practice.

    # Only generate definition for GSI indexes
    if index_data.get("using") != "gsi":
        return None

    name = index_data.get("name")
    bucket = index_data.get("bucket")

    if not name or not bucket:
        logger.warning(
            f"Cannot generate index definition due to missing name or bucket. Data: {index_data}"
        )
        return None

    try:
        # Start building the definition
        if index_data.get("is_primary"):
            query_definition = "CREATE PRIMARY INDEX"
        else:
            query_definition = "CREATE INDEX"

        # Add index name
        query_definition += f" `{name}`"

        # Add bucket name
        query_definition += f" ON `{bucket}`"

        # Add scope and collection if they exist
        scope = index_data.get("scope")
        collection = index_data.get("collection")
        if scope and collection:
            query_definition += f".`{scope}`.`{collection}`"

        # Add index keys for non-primary indexes
        index_keys = index_data.get("index_key", [])
        if index_keys:
            keys_str = ", ".join(str(key) for key in index_keys)
            query_definition += f"({keys_str})"

        # Add WHERE condition if exists
        condition = index_data.get("condition")
        if condition:
            query_definition += f" WHERE {condition}"

        # Add PARTITION BY if exists
        partition = index_data.get("partition")
        if partition:
            query_definition += f" PARTITION BY {partition}"

        return query_definition
    except Exception as e:
        logger.warning(f"Error generating index definition: {e}")
        return None
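
To make the effect of the early validation concrete, here is a condensed, self-contained variant of the suggestion with one complete and one incomplete input. It is a sketch for illustration: the try/except wrapper is omitted for brevity, and the logger name and sample metadata values are made up rather than taken from the PR.

```python
from __future__ import annotations

import logging
from typing import Any

logger = logging.getLogger("index_utils_example")  # hypothetical logger name


def generate_index_definition(index_data: dict[str, Any]) -> str | None:
    """Condensed variant of the suggested function (try/except omitted)."""
    # Only generate a definition for GSI indexes.
    if index_data.get("using") != "gsi":
        return None

    # Validate required fields before building any SQL text.
    name = index_data.get("name")
    bucket = index_data.get("bucket")
    if not name or not bucket:
        logger.warning("Cannot generate index definition: missing name or bucket")
        return None

    prefix = "CREATE PRIMARY INDEX" if index_data.get("is_primary") else "CREATE INDEX"
    definition = f"{prefix} `{name}` ON `{bucket}`"

    # Scope and collection are only appended when both are present.
    scope = index_data.get("scope")
    collection = index_data.get("collection")
    if scope and collection:
        definition += f".`{scope}`.`{collection}`"

    index_keys = index_data.get("index_key", [])
    if index_keys:
        definition += "(" + ", ".join(str(key) for key in index_keys) + ")"

    if index_data.get("condition"):
        definition += f" WHERE {index_data['condition']}"
    if index_data.get("partition"):
        definition += f" PARTITION BY {index_data['partition']}"
    return definition


# Made-up sample metadata, shaped like the dictionary the function expects.
sample = {
    "using": "gsi",
    "name": "idx_airline_country",
    "bucket": "travel-sample",
    "scope": "inventory",
    "collection": "airline",
    "is_primary": False,
    "index_key": ["`country`"],
    "condition": '(`type` = "airline")',
}

complete = generate_index_definition(sample)
# complete == 'CREATE INDEX `idx_airline_country` ON `travel-sample`.`inventory`.`airline`(`country`) WHERE (`type` = "airline")'

incomplete = generate_index_definition({"using": "gsi", "bucket": "travel-sample"})
# incomplete is None, and a warning is logged instead of emitting "CREATE INDEX `None` ...".
```

Without the check, the original version would raise a KeyError on a missing name (caught by the broad except) but would happily format a present-but-None name into the statement, which is the invalid SQL case the comment describes.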

Comment on lines +83 to +94
            temp_data = {
                "name": index_data.get("name"),
                "bucket": index_data.get("bucket_id"),
                "scope": index_data.get("scope_id"),
                "collection": index_data.get("keyspace_id"),
                "index_type": index_data.get("using", "gsi"),
                "is_primary": index_data.get("is_primary", False),
                "index_key": index_data.get("index_key", []),
                "condition": index_data.get("condition"),
                "partition": index_data.get("partition"),
                "using": index_data.get("using", "gsi"),
            }
Contributor

medium

The temp_data dictionary includes a redundant index_type key. The using key is already present with the same value, and the generate_index_definition function only uses the using key. Removing the index_type key will improve code clarity.

            temp_data = {
                "name": index_data.get("name"),
                "bucket": index_data.get("bucket_id"),
                "scope": index_data.get("scope_id"),
                "collection": index_data.get("keyspace_id"),
                "is_primary": index_data.get("is_primary", False),
                "index_key": index_data.get("index_key", []),
                "condition": index_data.get("condition"),
                "partition": index_data.get("partition"),
                "using": index_data.get("using", "gsi"),
            }

Contributor

Can we rename the using key to type and then use that to generate the index definition?

logger = logging.getLogger(f"{MCP_SERVER_NAME}.utils.index_utils")


def generate_index_definition(index_data: dict[str, Any]) -> str | None:
Contributor

Is there an easier way to get this data? The concern is that an index may have more parameters than the ones we parse, such as WITH in the case of vector search.
Did you try the management API in the SDK to see if it offers something out of the box?
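
One way to make this concern visible at runtime, sketched under assumptions, is to report metadata keys that the definition builder does not read. HANDLED_KEYS and find_unhandled_keys are hypothetical names, and the "with" entry in the sample is an assumed field used only to illustrate the idea, not a confirmed part of the index metadata.

```python
from __future__ import annotations

from typing import Any

# Keys the definition builder in this PR reads; anything else is silently ignored by it.
HANDLED_KEYS = frozenset(
    {
        "name",
        "bucket",
        "scope",
        "collection",
        "is_primary",
        "index_key",
        "condition",
        "partition",
        "using",
    }
)


def find_unhandled_keys(index_data: dict[str, Any]) -> list[str]:
    """Return non-empty metadata keys that the builder would drop (hypothetical helper)."""
    return sorted(
        key
        for key, value in index_data.items()
        if key not in HANDLED_KEYS and value not in (None, "", [], {})
    )


# Assumed sample: a "with" entry stands in for any parameter the builder does not parse.
sample = {
    "name": "idx_vec",
    "bucket": "travel-sample",
    "using": "gsi",
    "with": {"dimension": 128},
}
unhandled = find_unhandled_keys(sample)
# unhandled == ["with"]
```

Logging the result next to each generated definition would flag statements that may be incomplete, without changing how the definition itself is built.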

Contributor Author

I don't think anything in the SDK returns the index definition through the management API. We use the same approach in VS Code and JetBrains for this reason. I will recheck whether anything has been added since.

Contributor

I guess you will have a merge conflict with #58.

Contributor Author

Yes, I am aware of that. I will merge main once the other one gets merged.

@AayushTyagi1 AayushTyagi1 merged commit 6714563 into main Oct 22, 2025
6 checks passed
3 participants