feat: add embeddings package #9
Conversation
Deploying apps-script with Cloudflare Pages

| | |
|---|---|
| Latest commit: | 7ac9f69 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://f7c72fbe.apps-script.pages.dev |
| Branch Preview URL: | https://feat-embeddings-package.apps-script.pages.dev |
Walkthrough

This pull request introduces a new package for generating text embeddings using Vertex AI. It adds documentation, package configuration, implementation of embedding functionalities (including cosine similarity and emoji representation functions), comprehensive unit tests, and a TypeScript configuration file. The changes establish public APIs for generating embeddings and validating them via test cases within a Google Apps Script environment.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User as Client
    participant BE as batchedEmbeddings
    participant Props as ScriptProperties
    participant API as API Endpoint
    User->>BE: Call getBatchedEmbeddings(instances, options)
    BE->>Props: Retrieve project ID
    Props-->>BE: Return project ID
    BE->>API: Send HTTP POST with instances and options
    API-->>BE: Return embeddings or error response
    BE->>User: Return processed embeddings
```
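The "Send HTTP POST" step in the diagram can be sketched as a standalone request builder. This is a hypothetical helper written for illustration (the function name `buildEmbeddingRequest` and its defaults are assumptions, not the PR's API); the endpoint URL and payload shape mirror what the review diffs later in this thread show.

```typescript
// Hypothetical helper illustrating how the POST request in the diagram
// might be assembled. Not the PR's actual code.
interface EmbeddingRequest {
  url: string;
  method: "post";
  headers: Record<string, string>;
  payload: string;
}

function buildEmbeddingRequest(
  content: string,
  projectId: string,
  token: string,
  model = "text-embedding-005",
  region = "us-central1",
): EmbeddingRequest {
  return {
    // Vertex AI regional predict endpoint for the publisher model
    url: `https://${region}-aiplatform.googleapis.com/v1/projects/${projectId}/locations/${region}/publishers/google/models/${model}:predict`,
    method: "post",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    // Each request carries one instance with the text to embed
    payload: JSON.stringify({ instances: [{ content }], parameters: {} }),
  };
}
```

In Apps Script, an array of such request objects would then be passed to `UrlFetchApp.fetchAll`, as described in the walkthrough.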
Hello @jpoehnelt, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
This pull request introduces a new package called @repository/embeddings. This package provides functionality for generating embeddings using Vertex AI and calculating the similarity between embeddings within Apps Script. It includes functions for batch processing of text into embeddings, calculating cosine similarity between vectors, and representing similarity values with emojis. The package also includes comprehensive unit tests to ensure the reliability of the implemented functions.
Highlights
- New Package: @repository/embeddings: Introduces a new package for generating and comparing embeddings using Vertex AI within Apps Script.
- batchedEmbeddings Function: Adds a function to generate embeddings for text using the Vertex AI `text-embedding-005` model, including options for customization.
- Similarity Calculation: Implements functions to calculate cosine similarity between embeddings and represent the similarity using emojis.
- Unit Tests: Includes unit tests for all major functions, ensuring correct calculations and error handling.
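The similarity highlights above can be sketched concretely. The following is a minimal reading of what cosine similarity plus emoji bucketing might look like, using the thresholds the reviews report (0.9, 0.7, 0.5, 0.3); it is an illustration, not the PR's exact implementation.

```typescript
// Sketch of the similarity math described above; not the PR's actual code.
function similarity(x: number[], y: number[]): number {
  if (x.length !== y.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let magX = 0;
  let magY = 0;
  for (let i = 0; i < x.length; i += 1) {
    dot += x[i] * y[i];
    magX += x[i] ** 2;
    magY += y[i] ** 2;
  }
  // Cosine similarity: dot product divided by the product of magnitudes
  return dot / (Math.sqrt(magX) * Math.sqrt(magY));
}

// Thresholds taken from the review comments below
function similarityEmoji(value: number): string {
  if (value >= 0.9) return "🔥"; // very high similarity
  if (value >= 0.7) return "✅"; // high similarity
  if (value >= 0.5) return "👍"; // medium similarity
  if (value >= 0.3) return "🤔"; // low similarity
  return "❌"; // very low similarity
}

console.log(similarity([1, 0], [1, 0])); // prints 1 for parallel vectors
```

Parallel vectors give 1, orthogonal vectors give 0, and opposite vectors give -1, which is exactly what the unit tests described later in this thread assert.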
Changelog
- packages/embeddings/README.md
  - Added a README file to provide a brief description of the embeddings package.
- packages/embeddings/package.json
  - Created a package.json file with the necessary metadata, dependencies (including `@types/google-apps-script` and `vitest`), and scripts for the embeddings package.
- packages/embeddings/src/index.test.ts
  - Added unit tests for `similarity`, `similarityEmoji`, and `batchedEmbeddings` functions.
  - Mocked Google Apps Script global objects for testing purposes.
  - Verified correct cosine similarity calculations, emoji representations, and API request handling.
- packages/embeddings/src/index.ts
  - Implemented the `batchedEmbeddings` function to generate embeddings using Vertex AI.
  - Implemented the `similarity` function to calculate cosine similarity between two vectors.
  - Implemented the `similarityEmoji` function to represent similarity values with emojis.
  - Added error handling for API requests and vector length mismatches.
- packages/embeddings/tsconfig.json
  - Configured the TypeScript compiler options for the embeddings package, including module type, target, and library settings.
- pnpm-lock.yaml
  - Updated the pnpm lockfile to include the new `@repository/embeddings` package and its dependencies, such as `@types/google-apps-script` and `vitest`.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Trivia time!
What is the dimensionality of the embeddings generated by the text-embedding-005 model?
Click here for the answer
The `text-embedding-005` model generates embeddings with a dimensionality of 1536.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
Code Review
The pull request introduces a new embeddings package with functions for generating embeddings using Vertex AI and calculating similarity between embeddings. The code is generally well-structured and includes unit tests. However, there are some areas that could be improved, such as addressing the TODO comment, improving error handling, and ensuring consistent variable declaration.
Summary of Findings
- TODO comment in batchedEmbeddings function: The `batchedEmbeddings` function contains a TODO comment regarding chunking the input in instances of 5. This should be addressed to improve the function's efficiency and prevent potential issues with large inputs.
- Missing error handling for non-200 responses: The `batchedEmbeddings` function only throws an error when the response code is not 200. It should also handle other error codes and provide more informative error messages.
- Inconsistent use of const and let: The code uses both `const` and `let` for variable declarations. It should consistently use `const` for variables that do not change and `let` for variables that are reassigned.
Merge Readiness
The pull request is not yet ready for merging. The TODO comment in the batchedEmbeddings function should be addressed, and the error handling should be improved. The inconsistent use of const and let should also be addressed. I am unable to approve this pull request, and recommend that the pull request not be merged until these issues are addressed (at a minimum).
Actionable comments posted: 1
🧹 Nitpick comments (7)
packages/embeddings/README.md (1)
1-4: Enhance the README with more comprehensive documentation

While the introduction provides a basic overview, the README would benefit from more detailed documentation to help users effectively use this new package. Consider adding:
- Installation instructions
- Usage examples for the main functions (`batchedEmbeddings`, `similarity`, `similarityEmoji`)
- API reference with parameters and return values
- Any necessary configuration steps (e.g., Vertex AI setup)
- Requirements or prerequisites
This would make it easier for new users to understand and implement the embeddings functionality.
packages/embeddings/package.json (1)
16-17: Consider updating main and types fields to point to compiled output

Both "main" and "types" currently point to the source TypeScript file rather than compiled output. Typically:
- "main" should point to the compiled JavaScript (e.g., "./dist/index.js")
- "types" should point to the TypeScript declaration file (e.g., "./dist/index.d.ts")
While this might be intentional for a private package in a monorepo setup, it could cause issues if other packages try to import this one directly.
packages/embeddings/src/index.ts (5)
1-3: Consider making model constants configurable

These hardcoded constants might be more maintainable if they were configurable via environment variables or script properties, especially if different environments might use different models or regions.

```diff
-const MODEL_ID = "text-embedding-005";
-const REGION = "us-central1";
+const MODEL_ID = PropertiesService.getScriptProperties().getProperty("MODEL_ID") || "text-embedding-005";
+const REGION = PropertiesService.getScriptProperties().getProperty("REGION") || "us-central1";
```
4-7: Add JSDoc comments to Parameters interface

Consider adding JSDoc comments to the Parameters interface similar to what you've done for the Options interface to improve code documentation.

```diff
+/**
+ * Additional parameters for the embedding model.
+ */
 interface Parameters {
+  /**
+   * Whether to automatically truncate text that exceeds the model's maximum input length.
+   */
   autoTruncate?: boolean;
+  /**
+   * The dimensionality of the output embeddings.
+   */
   outputDimensionality?: number;
 }
```
43-51: Add JSDoc and error handling indication

The getProjectId function lacks JSDoc comments and doesn't indicate it can throw an error, which could cause issues for callers.

```diff
+/**
+ * Retrieves the project ID from script properties.
+ * @throws {Error} If PROJECT_ID is not found in script properties
+ * @returns The project ID
+ */
 const getProjectId = (): string => {
   const projectId =
     PropertiesService.getScriptProperties().getProperty("PROJECT_ID");

   if (!projectId) {
     throw new Error("PROJECT_ID not found in script properties");
   }

   return projectId;
 };
```
117-123: Add JSDoc return value and handle edge cases

The magnitude function is missing a JSDoc @returns tag and doesn't handle empty arrays explicitly.

```diff
 /**
  * Calculates the magnitude of a vector.
  * @param x - The vector.
+ * @returns The magnitude (Euclidean norm) of the vector
  */
 function magnitude(x: number[]): number {
+  if (x.length === 0) {
+    return 0;
+  }
   let result = 0;
   for (let i = 0, l = x.length; i < l; i += 1) {
     result += x[i] ** 2;
   }
   return Math.sqrt(result);
 }
```
142-148: Handle edge cases in similarityEmoji

The function doesn't handle edge cases like values outside the expected cosine similarity range [-1, 1].

```diff
 export const similarityEmoji = (value: number): string => {
+  // Ensure value is within expected cosine similarity range [-1, 1]
+  if (value < -1 || value > 1) {
+    throw new Error(`Similarity value ${value} is outside the expected range [-1, 1]`);
+  }
+
   if (value >= 0.9) return "🔥"; // Very high similarity
   if (value >= 0.7) return "✅"; // High similarity
   if (value >= 0.5) return "👍"; // Medium similarity
   if (value >= 0.3) return "🤔"; // Low similarity
   return "❌"; // Very low similarity
 };
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- `pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (5)
- packages/embeddings/README.md (1 hunks)
- packages/embeddings/package.json (1 hunks)
- packages/embeddings/src/index.test.ts (1 hunks)
- packages/embeddings/src/index.ts (1 hunks)
- packages/embeddings/tsconfig.json (1 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
packages/embeddings/src/index.test.ts (1)
packages/embeddings/src/index.ts (3)
- similarity (131-136)
- similarityEmoji (142-148)
- batchedEmbeddings (59-98)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Cloudflare Pages
🔇 Additional comments (9)
packages/embeddings/tsconfig.json (1)
1-14: LGTM! TypeScript configuration is well set up for the project

The TypeScript configuration is appropriate for a package that will be used with Google Apps Script. It correctly:
- Uses modern module system and target
- Enables strict type checking
- Includes Google Apps Script types
- Sets up proper file inclusion/exclusion
No issues found in this configuration.
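For readers who don't have the diff open, a configuration matching the points above might look roughly like the following. This is a reconstruction for illustration; the exact option values are assumptions, not the PR's actual file.

```json
{
  "compilerOptions": {
    "module": "ESNext",
    "target": "ES2022",
    "strict": true,
    "types": ["google-apps-script"],
    "lib": ["ES2022"]
  },
  "include": ["src/**/*.ts"],
  "exclude": ["**/*.test.ts"]
}
```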
packages/embeddings/src/index.test.ts (5)
21-38: LGTM! Comprehensive testing of the similarity function

The tests for the similarity function are well-structured and cover important cases:
- Parallel vectors (similarity = 1.0)
- Orthogonal vectors (similarity = 0.0)
- Opposite vectors (similarity = -1.0)
- Error handling for vectors of different lengths
Good job ensuring that the cosine similarity calculations are accurate.
40-48: LGTM! Good coverage of similarity emoji thresholds

The tests for the similarityEmoji function verify that the correct emoji is returned for each similarity value threshold:
- Very high (≥0.9): 🔥
- High (≥0.7): ✅
- Medium (≥0.5): 👍
- Low (≥0.3): 🤔
- Very low (<0.3): ❌
All thresholds are properly tested.
50-105: LGTM! Thorough testing of batchedEmbeddings for single and array inputs

The tests for batchedEmbeddings are comprehensive, covering:
- Single string input handling
- Array of strings input handling
- Verification of request payload structure
- Proper parsing of response data
The mocking approach is clean and effective.
107-129: LGTM! Good testing of custom parameters and error handling

The tests properly verify:
- Custom parameters (model, projectId, region) are correctly used in API requests
- Error handling when API returns non-200 status codes
This ensures the function is robust against various configuration options and failure scenarios.
1-20: LGTM! Clean mock setup for Google Apps Script environment

The mocking approach for Google Apps Script global objects is clean and effective:
- ScriptApp.getOAuthToken is mocked to return a test token
- PropertiesService is configured to simulate project ID retrieval
- UrlFetchApp.fetchAll is mocked with vi.fn() for response simulation
This setup allows for testing Apps Script code in a Node.js environment.
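The stubbing idea described above can be sketched without any test framework. The PR's tests use vitest mocks (`vi.fn()`), but the following dependency-free version shows the same shape: assign minimal fakes to `globalThis` so code that calls `ScriptApp`, `PropertiesService`, and `UrlFetchApp` runs under Node. The property values here are stand-ins, not the PR's actual fixtures.

```typescript
// Dependency-free sketch of stubbing Apps Script globals for Node-based tests.
// The PR's real tests use vitest mocks; values below are illustrative.
const g = globalThis as Record<string, unknown>;

g.ScriptApp = {
  getOAuthToken: () => "test-token", // stand-in OAuth token
};

g.PropertiesService = {
  getScriptProperties: () => ({
    // Simulates PROJECT_ID retrieval from script properties
    getProperty: (key: string) => (key === "PROJECT_ID" ? "test-project" : null),
  }),
};

// Minimal fetchAll stub returning canned 200 responses shaped like the
// Vertex AI predict response described elsewhere in this review
g.UrlFetchApp = {
  fetchAll: (requests: unknown[]) =>
    requests.map(() => ({
      getResponseCode: () => 200,
      getContentText: () =>
        JSON.stringify({ predictions: [{ embeddings: { values: [0.1, 0.2] } }] }),
    })),
};
```

With these globals in place, the module under test can be imported and exercised as ordinary TypeScript.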
packages/embeddings/src/index.ts (3)
12-41: Well-documented interface

The Options interface is well-documented with clear JSDoc comments for each property, including default values.
131-136: Consider optimizing the similarity function

The similarity function checks for vector length equality, which is good. However, since we're now adding the same check to dotProduct_, you could optimize this further.
59-98: ✅ Verification successful
Address TODOs and add batching implementation
The function has an unimplemented TODO about chunking requests in batches of 5, which should be addressed to handle rate limits and large input arrays.
The `UrlFetchApp.fetchAll()` method may have limitations on the number of concurrent requests. According to Google Apps Script documentation, there are quotas that could affect large batches of requests.

Here's an improved implementation with batching:
```diff
 export function batchedEmbeddings(
   text: string | string[],
   {
     parameters = {},
     model = MODEL_ID,
     projectId = getProjectId(),
     region = REGION,
     token = ScriptApp.getOAuthToken(),
   }: Options = {},
 ): number[][] {
   const inputs = !Array.isArray(text) ? [text] : text;

-  // TODO chunk in instances of 5
-  const requests = inputs.map((content) => ({
+  // Process inputs in batches of 5 to respect API limits
+  const BATCH_SIZE = 5;
+  const results: any[] = [];
+
+  for (let i = 0; i < inputs.length; i += BATCH_SIZE) {
+    const batch = inputs.slice(i, i + BATCH_SIZE);
+    const requests = batch.map((content) => ({
       url: `https://${region}-aiplatform.googleapis.com/v1/projects/${projectId}/locations/${region}/publishers/google/models/${model}:predict`,
       method: "post" as const,
       headers: {
         Authorization: `Bearer ${token}`,
         "Content-Type": "application/json",
       },
       muteHttpExceptions: true,
       contentType: "application/json",
       payload: JSON.stringify({
         instances: [{ content }],
         parameters,
       }),
-  }));
+    }));

-  const responses = UrlFetchApp.fetchAll(requests);
+    const responses = UrlFetchApp.fetchAll(requests);

-  const results = responses.map((response) => {
-    if (response.getResponseCode() !== 200) {
-      throw new Error(response.getContentText());
-    }
+    const batchResults = responses.map((response) => {
+      const statusCode = response.getResponseCode();
+      const responseText = response.getContentText();
+
+      if (statusCode !== 200) {
+        throw new Error(`API request failed with status ${statusCode}: ${responseText}`);
+      }

-    return JSON.parse(response.getContentText());
-  });
+      return JSON.parse(responseText);
+    });
+
+    results.push(...batchResults);
+  }

   return results.map((result) => result.predictions[0].embeddings.values);
 }
```
🌐 Web query:
What are the limitations of UrlFetchApp.fetchAll in Google Apps Script?

💡 Result:
UrlFetchApp's `fetchAll` method in Google Apps Script offers significant performance advantages over sequential `fetch` calls but comes with several important limitations:

Performance vs. Quota Limits
Daily Request Quota
All URL Fetch requests (including each individual call within `fetchAll`) count toward the daily limit of 20,000 requests/user/day[4][12]. A single `fetchAll` with 100 URLs consumes 100 requests from this quota.

Execution Time Limits
Scripts must complete within 6 minutes[12], which constrains large-scale `fetchAll` operations.

Technical Constraints

Response Handling
- Maximum response size per request: 50 MB[12]
- Maximum URL length: 2,082 characters[11][12]

Header Limits
Up to 100 headers per request, each under 8 KB[12]

Operational Considerations

Asynchronous Nature
Responses are reordered to match the original request sequence, even though requests execute out of order[1]. Actual fetch order varies per execution[1].

Server Rate Limits
Aggressive use of `fetchAll` may trigger anti-DDOS protections on target servers, potentially blocking requests[6].

Best Practices

Benchmarked Performance
Processing 1,000 URLs with `fetchAll` took 5.5 seconds vs. 27 seconds for sequential `fetch` calls in testing[1]. However, no documented maximum batch size exists[1][6].

Caching Strategies
Use `CacheService` to avoid redundant requests and preserve quota[3][5].

Error Handling
Enable `muteHttpExceptions: true` to prevent script failures from individual request errors[8][12].

Key Differences from `fetch`:

| Feature | fetchAll | Sequential fetch |
|---|---|---|
| Execution Method | Asynchronous parallel processing | Synchronous linear processing |
| Response Order | Enforced request order[1] | Natural execution order |
| Quota Usage | 1 request per URL | 1 request per URL |
| Typical Use Case | Batch operations (100-1,000 URLs) | Single requests |

For mission-critical applications, implement retry logic and monitor quota usage via Apps Script Dashboard[4][12].
Citations:
- 1: https://gist.github.com/tanaikech/c0f383034045ab63c19604139ecb0728
- 2: https://support.google.com/docs/thread/239279708/you-do-not-have-permission-to-call-urlfetchapp-fetch
- 3: https://groups.google.com/g/google-apps-script-community/c/_JBtIw_Ok6Q
- 4: https://groups.google.com/g/adwords-scripts/c/WJsRAxTUSN0
- 5: https://developers.google.com/apps-script/guides/support/best-practices
- 6: https://groups.google.com/g/google-apps-script-community/c/GFovmo1PSTQ
- 7: https://community.canvaslms.com/t5/Canvas-Developers-Group/Google-App-Scripts-URLFetchApp-API-Endpoint-Requests-other-than/m-p/530461
- 8: https://spreadsheet.dev/comprehensive-guide-urlfetchapp-apps-script
- 9: https://issuetracker.google.com/issues/64235231
- 10: https://script.gs/using-urlfetchapp-fetch-vs-fetchall-in-google-apps-script/
- 11: https://developers.google.com/apps-script/reference/url-fetch/url-fetch-app
- 12: https://developers.google.com/apps-script/guides/services/quotas
BATCHING AND ERROR HANDLING VERIFIED:
Your updated implementation correctly replaces the TODO with batching logic by processing inputs in groups of 5. The batching mechanism not only mitigates potential rate limit issues with `UrlFetchApp.fetchAll` (as per Google Apps Script's quotas and execution time constraints) but also enhances error reporting by including the status code in the error message.
- Key improvements:
- Inputs are now processed in batches of 5 instead of issuing a single large batch.
- Enhanced error handling: errors include both the response code and response text.
- Overall design aligns well with best practices for managing quotas and rate limits in Apps Script.
This refactored approach addresses the original concerns effectively.
Actionable comments posted: 0
🧹 Nitpick comments (4)
packages/embeddings/src/index.ts (4)
109-123: Consider validating the instances array

The function should validate that the instances array is not empty before processing.

```diff
 export function getBatchedEmbeddings(
   instances: Instance[],
   {
     parameters = {},
     model = MODEL_ID,
     projectId = getProjectId(),
     region = REGION,
     token = ScriptApp.getOAuthToken(),
   }: Options = {},
 ): number[][] {
+  if (!instances || instances.length === 0) {
+    throw new Error("No instances provided for embedding generation");
+  }
+
   const chunks = chunkArray(instances, 5);

   const requests = chunks.map((instances) => ({
```
135-141: Add error handling for JSON parsing

The function currently doesn't handle potential JSON parsing errors, which could occur if the API returns malformed JSON.

```diff
 const results = responses.map((response) => {
   if (response.getResponseCode() !== 200) {
     throw new Error(response.getContentText());
   }

-  return JSON.parse(response.getContentText());
+  try {
+    return JSON.parse(response.getContentText());
+  } catch (error) {
+    throw new Error(`Failed to parse API response: ${error.message}`);
+  }
 });
```
160-166: Consider optimizing the magnitude calculation

The `magnitude` function implementation is correct but could be optimized for readability and potentially performance.

```diff
 function magnitude(x: number[]): number {
-  let result = 0;
-  for (let i = 0, l = x.length; i < l; i += 1) {
-    result += x[i] ** 2;
-  }
-  return Math.sqrt(result);
+  return Math.sqrt(x.reduce((sum, val) => sum + val ** 2, 0));
 }
```
109-123: ❓ Verification inconclusive
Consider adding rate limiting for large batches
For handling large batches, it might be worth considering a rate limiting mechanism to avoid overwhelming the API and potential rate limits.
```diff
 export function getBatchedEmbeddings(
   instances: Instance[],
   {
     parameters = {},
     model = MODEL_ID,
     projectId = getProjectId(),
     region = REGION,
     token = ScriptApp.getOAuthToken(),
+    batchSize = 5,
+    delayBetweenBatches = 1000, // milliseconds
   }: Options = {},
 ): number[][] {
   if (!instances || instances.length === 0) {
     throw new Error("No instances provided for embedding generation");
   }

-  const chunks = chunkArray(instances, 5);
+  const chunks = chunkArray(instances, batchSize);

   const requests = chunks.map((instances) => ({
```

You would then need to update the `Options` interface to include these new parameters:

```typescript
interface Options {
  // existing properties...

  /**
   * The size of each batch when processing multiple instances.
   * @default 5
   */
  batchSize?: number;

  /**
   * The delay between batches in milliseconds.
   * @default 1000
   */
  delayBetweenBatches?: number;
}
```

And modify the `fetchAll` to process batches with delay:

```typescript
const results: any[] = [];
for (const requestBatch of chunkArray(requests, 10)) {
  const responses = UrlFetchApp.fetchAll(requestBatch);
  // Process responses...
  // Add processed results to results array

  if (delayBetweenBatches > 0 && requestBatch.length < requests.length) {
    Utilities.sleep(delayBetweenBatches);
  }
}
```
🌐 Web query:
What are the rate limits for the Vertex AI embeddings API?

💡 Result:
Vertex AI's embeddings API enforces rate limits that vary by model, region, and project configuration. Here's a detailed breakdown:
Core Rate Limits
textembedding-gecko models
Defaults to 1,500 requests per minute (RPM) in regions likeus-central1[10][12]. However:
- Input is capped at 250 text entries per request in
us-central1(5 in other regions) [2][4]- Token limit: 20,000 tokens per request, with truncation at 2,048 tokens for most models (8,192 for
text-embedding-large-exp-03-07) [2][10]Experimental embedding models
gemini-embedding-exp-03-07has stricter limits:
- 5 RPM under free tiers [7]
- 10 RPM for paid Tier 1 projects [7]
Quota Management
Default limits
Set per-project and per-region, shared across all applications/IPs using the Firebase/Vertex AI project [1][10].Dynamic adjustments
Limits can increase automatically as your project usage grows through Google's usage tiers:
- Free Tier: Strict RPM/TPM caps [7]
- Tier 1 (billing-enabled): Higher limits (e.g., 1,500 RPM for textembedding-gecko) [10]
Specialized Models
Model Feature Limit Multimodal embeddings ( multimodalembedding@001)Region-specific restrictions may block requests if quota is exhausted [5] Batch processing Up to 4 concurrent jobs [10] To view or adjust quotas:
- Navigate to Google Cloud Console → IAM & Admin → Quotas
- Filter for "Online prediction requests per base model per minute" [3]
- Submit quota increase requests for validated business needs [1][3]
Key Considerations
- Combine local caching with batch processing to reduce API calls [8]
- Use Firebase App Check to prevent abuse of client-side embedding calls [9]
- Errors like 429 ("Quota Exceeded") or 400 ("Payload Size Limit") indicate breached limits [14][5]
Citations:
- 1: https://firebase.google.com/docs/vertex-ai/quotas
- 2: https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings
- 3: https://www.googlecloudcommunity.com/gc/AI-ML/where-to-find-the-RPM-for-text-embedding-api/m-p/883419/highlight/true
- 4: https://www.googlecloudcommunity.com/gc/AI-ML/textembedding-gecko-Quota/m-p/625976
- 5: https://www.googlecloudcommunity.com/gc/AI-ML/Vertex-API-Multimodal-Embedding-can-be-blocked-if-too-much/m-p/788337
- 6: https://platform.openai.com/docs/guides/rate-limits
- 7: https://ai.google.dev/gemini-api/docs/rate-limits
- 8: https://www.restack.io/p/embeddings-answer-understanding-applying-text-embeddings-vertex-ai-cat-ai
- 9: https://firebase.google.com/docs/vertex-ai
- 10: https://cloud.google.com/vertex-ai/generative-ai/docs/quotas
- 11: https://www.youtube.com/watch?v=RC3JOzN2F1w
- 12: https://cloud.google.com/vertex-ai/docs/quotas
- 13: https://ai.google.dev/gemini-api/terms
- 14: https://discuss.ai.google.dev/t/payload-size-limit-error-with-embed-content-api/44342
- 15: https://ai.google.dev/gemini-api/docs/embeddings
- 16: https://www.googlecloudcommunity.com/gc/AI-ML/Requests-for-Increasing-Vertex-AI-API-Quota-for-Personal-Account/m-p/721526
Rate Limiting Enhancement for Batch Requests
The current implementation spreads requests over fixed chunks but does not account for API rate limits, which can vary significantly by model and region. Given Vertex AI’s documented limits (e.g., up to 1,500 RPM for textembedding-gecko models and even stricter limits for experimental models), it is advisable to expose rate limiting parameters. Consider updating the function as follows:
Expose parameters:
Add optional `batchSize` (default: 5) and `delayBetweenBatches` (default: 1000 ms) to the `Options` interface so that users can adjust based on the API's constraints.

Dynamic chunking:
Replace the hard-coded `chunkArray(instances, 5)` with `chunkArray(instances, batchSize)`.

Delay between requests:
In your loop that processes batches (e.g., using `UrlFetchApp.fetchAll`), insert a delay (using `Utilities.sleep(delayBetweenBatches)`) between consecutive batches to ensure that you don't inadvertently exceed the rate limits.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- packages/embeddings/package.json (1 hunks)
- packages/embeddings/src/index.test.ts (1 hunks)
- packages/embeddings/src/index.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/embeddings/package.json
- packages/embeddings/src/index.test.ts
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Cloudflare Pages
🔇 Additional comments (10)
packages/embeddings/src/index.ts (10)
4-21: Well-defined interfaces with comprehensive task types

The interfaces for `Parameters` and `Instance` are well-structured with appropriate optional fields. The task types are comprehensive and follow the Vertex AI embedding model capabilities.
23-55: Good documentation in Options interface

The `Options` interface is well-documented with clear JSDoc comments for each parameter, including default values. This makes the API easier to understand and use.
57-65: Appropriate error handling in getProjectId function

The function correctly checks for the presence of the required property and throws a descriptive error message when it's not found.
76-88: Clean implementation of getTextEmbeddings

The function correctly handles both single string and array inputs by converting to a standard format before calling `getBatchedEmbeddings`.
99-107: Good use of default parameters

The function correctly uses destructuring with default values, making the function more flexible while maintaining good defaults.
128-130: Improve error handling for non-200 response codes

The current error handling is minimal and does not provide enough context about the specific failure.
Consider handling different status codes more specifically and including the response body in the error message for better diagnostics:

```diff
 if (response.getResponseCode() !== 200) {
-  throw new Error(response.getContentText());
+  const responseCode = response.getResponseCode();
+  const responseText = response.getContentText();
+  let errorMessage = `Request failed with status code ${responseCode}: ${responseText}`;
+  if (responseCode === 400) {
+    errorMessage = `Bad Request: ${responseText}`;
+  } else if (responseCode === 500) {
+    errorMessage = `Internal Server Error: ${responseText}`;
+  }
+  throw new Error(errorMessage);
 }
```
148-154: Fix potential silent errors in dotProduct_ function

The `dotProduct_` function uses `Math.min(x.length, y.length)`, which silently ignores extra elements if vectors have different lengths. This could lead to subtle bugs.

```diff
+/**
+ * Calculates the dot product of two vectors.
+ * @param x - The first vector.
+ * @param y - The second vector.
+ * @throws {Error} If vectors have different lengths
+ * @returns The dot product value
+ */
 function dotProduct_(x: number[], y: number[]): number {
+  if (x.length !== y.length) {
+    throw new Error("Vectors must have the same length");
+  }
   let result = 0;
-  for (let i = 0, l = Math.min(x.length, y.length); i < l; i += 1) {
+  for (let i = 0, l = x.length; i < l; i += 1) {
     result += x[i] * y[i];
   }
   return result;
 }
```
174-179: Well-implemented similarity function with proper validation

Good implementation of the cosine similarity function with appropriate validation for vector length mismatch.
185-191: Good emoji mapping for similarity values

The emoji mapping for different similarity thresholds is well-defined and provides a nice visual representation of similarity values.
193-199: Good implementation of chunkArray utility

The utility function correctly chunks an array into pieces of a specified size, which is useful for batch processing.