
Conversation

@jpoehnelt jpoehnelt commented Mar 28, 2025

Summary by CodeRabbit

  • New Features
    • Introduced package functionality for generating text embeddings along with similarity evaluation and emoji-based feedback for similarity levels.
  • Documentation
    • Added comprehensive documentation outlining package usage and configuration.
  • Tests
    • Implemented extensive tests to validate functionality and ensure robust error handling.
  • Chores
    • Updated package metadata and development configurations to streamline the build and development process.


cloudflare-workers-and-pages bot commented Mar 28, 2025

Deploying apps-script with Cloudflare Pages

Latest commit: 7ac9f69
Status: ✅  Deploy successful!
Preview URL: https://f7c72fbe.apps-script.pages.dev
Branch Preview URL: https://feat-embeddings-package.apps-script.pages.dev



coderabbitai bot commented Mar 28, 2025

Walkthrough

This pull request introduces a new package for generating text embeddings using Vertex AI. It adds documentation, package configuration, implementation of embedding functionalities (including cosine similarity and emoji representation functions), comprehensive unit tests, and a TypeScript configuration file. The changes establish public APIs for generating embeddings and validating them via test cases within a Google Apps Script environment.

Changes

  • packages/embeddings/README.md: New README detailing the embeddings package functionality.
  • packages/embeddings/package.json: New package declaration for @repository/embeddings with scripts for type-checking and testing, author/license info, and module configuration.
  • packages/embeddings/tsconfig.json: New TypeScript configuration specifying compiler options (NodeNext, ES2022, strict type-checking) and Google Apps Script types.
  • packages/embeddings/src/index.ts: New functions getTextEmbeddings, getBatchedEmbeddings, similarity, and similarityEmoji for generating embeddings, calculating cosine similarity, and mapping similarity to emojis.
  • packages/embeddings/src/index.test.ts: New unit tests for the embedding functions using Vitest, including error handling and parameter validation.

Sequence Diagram(s)

sequenceDiagram
    participant User as Client
    participant BE as batchedEmbeddings
    participant Props as ScriptProperties
    participant API as API Endpoint

    User->>BE: Call getBatchedEmbeddings(instances, options)
    BE->>Props: Retrieve project ID
    Props-->>BE: Return project ID
    BE->>API: Send HTTP POST with instances and options
    API-->>BE: Return embeddings or error response
    BE->>User: Return processed embeddings

Possibly related PRs

  • feat: vector embedding demo #2: Implements similar embedding generation and similarity calculation functionality, albeit with different naming conventions and code structure.

Poem

I'm a hopping rabbit with code so neat,
Embeddings and tests in every beat.
From Vertex AI to scripts so spry,
Our functions and tests reach for the sky.
With each new line, I give a happy hop—carrot power nonstop!
🥕✨


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.


@gemini-code-assist gemini-code-assist bot left a comment


Hello @jpoehnelt, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request introduces a new package called @repository/embeddings. This package provides functionality for generating embeddings using Vertex AI and calculating the similarity between embeddings within Apps Script. It includes functions for batch processing of text into embeddings, calculating cosine similarity between vectors, and representing similarity values with emojis. The package also includes comprehensive unit tests to ensure the reliability of the implemented functions.

Highlights

  • New Package: @repository/embeddings: Introduces a new package for generating and comparing embeddings using Vertex AI within Apps Script.
  • batchedEmbeddings Function: Adds a function to generate embeddings for text using the Vertex AI text-embedding-005 model, including options for customization.
  • Similarity Calculation: Implements functions to calculate cosine similarity between embeddings and represent the similarity using emojis.
  • Unit Tests: Includes unit tests for all major functions, ensuring correct calculations and error handling.

Changelog

  • packages/embeddings/README.md
    • Added a README file to provide a brief description of the embeddings package.
  • packages/embeddings/package.json
    • Created a package.json file with the necessary metadata, dependencies (including @types/google-apps-script and vitest), and scripts for the embeddings package.
  • packages/embeddings/src/index.test.ts
    • Added unit tests for similarity, similarityEmoji, and batchedEmbeddings functions.
    • Mocked Google Apps Script global objects for testing purposes.
    • Verified correct cosine similarity calculations, emoji representations, and API request handling.
  • packages/embeddings/src/index.ts
    • Implemented the batchedEmbeddings function to generate embeddings using Vertex AI.
    • Implemented the similarity function to calculate cosine similarity between two vectors.
    • Implemented the similarityEmoji function to represent similarity values with emojis.
    • Added error handling for API requests and vector length mismatches.
  • packages/embeddings/tsconfig.json
    • Configured the TypeScript compiler options for the embeddings package, including module type, target, and library settings.
  • pnpm-lock.yaml
    • Updated the pnpm lockfile to include the new @repository/embeddings package and its dependencies, such as @types/google-apps-script and vitest.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in issue comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


Trivia time!

What is the dimensionality of the embeddings generated by the text-embedding-005 model?

The `text-embedding-005` model generates embeddings with a dimensionality of 768 by default (adjustable via the outputDimensionality parameter).

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request introduces a new embeddings package with functions for generating embeddings using Vertex AI and calculating similarity between embeddings. The code is generally well-structured and includes unit tests. However, there are some areas that could be improved, such as addressing the TODO comment, improving error handling, and ensuring consistent variable declaration.

Summary of Findings

  • TODO comment in batchedEmbeddings function: The batchedEmbeddings function contains a TODO comment regarding chunking the input in instances of 5. This should be addressed to improve the function's efficiency and prevent potential issues with large inputs.
  • Missing error handling for non-200 responses: The batchedEmbeddings function only throws an error when the response code is not 200. It should also handle other error codes and provide more informative error messages.
  • Inconsistent use of const and let: The code uses both const and let for variable declarations. It should consistently use const for variables that do not change and let for variables that are reassigned.

Merge Readiness

The pull request is not yet ready for merging. The TODO comment in the batchedEmbeddings function should be addressed, and the error handling should be improved. The inconsistent use of const and let should also be addressed. I am unable to approve this pull request, and recommend that the pull request not be merged until these issues are addressed (at a minimum).


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (7)
packages/embeddings/README.md (1)

1-4: Enhance the README with more comprehensive documentation

While the introduction provides a basic overview, the README would benefit from more detailed documentation to help users effectively use this new package. Consider adding:

  1. Installation instructions
  2. Usage examples for the main functions (batchedEmbeddings, similarity, similarityEmoji)
  3. API reference with parameters and return values
  4. Any necessary configuration steps (e.g., Vertex AI setup)
  5. Requirements or prerequisites

This would make it easier for new users to understand and implement the embeddings functionality.

packages/embeddings/package.json (1)

16-17: Consider updating main and types fields to point to compiled output

Both "main" and "types" currently point to the source TypeScript file rather than compiled output. Typically:

  • "main" should point to the compiled JavaScript (e.g., "./dist/index.js")
  • "types" should point to the TypeScript declaration file (e.g., "./dist/index.d.ts")

While this might be intentional for a private package in a monorepo setup, it could cause issues if other packages try to import this one directly.
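For illustration, a dist-oriented layout might look like the following (a hypothetical snippet; the build script and output paths are assumptions, not taken from the PR):

```json
{
  "name": "@repository/embeddings",
  "main": "./dist/index.js",
  "types": "./dist/index.d.ts",
  "scripts": {
    "build": "tsc -p tsconfig.json"
  }
}
```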

packages/embeddings/src/index.ts (5)

1-3: Consider making model constants configurable

These hardcoded constants might be more maintainable if they were configurable via environment variables or script properties, especially if different environments might use different models or regions.

-const MODEL_ID = "text-embedding-005";
-const REGION = "us-central1";
+const MODEL_ID = PropertiesService.getScriptProperties().getProperty("MODEL_ID") || "text-embedding-005";
+const REGION = PropertiesService.getScriptProperties().getProperty("REGION") || "us-central1";

4-7: Add JSDoc comments to Parameters interface

Consider adding JSDoc comments to the Parameters interface similar to what you've done for the Options interface to improve code documentation.

+/**
+ * Additional parameters for the embedding model.
+ */
 interface Parameters {
+  /**
+   * Whether to automatically truncate text that exceeds the model's maximum input length.
+   */
 	autoTruncate?: boolean;
+  /**
+   * The dimensionality of the output embeddings.
+   */
 	outputDimensionality?: number;
 }

43-51: Add JSDoc and error handling indication

The getProjectId function lacks JSDoc comments and doesn't indicate it can throw an error, which could cause issues for callers.

+/**
+ * Retrieves the project ID from script properties.
+ * @throws {Error} If PROJECT_ID is not found in script properties
+ * @returns The project ID
+ */
 const getProjectId = (): string => {
 	const projectId =
 		PropertiesService.getScriptProperties().getProperty("PROJECT_ID");
 	if (!projectId) {
 		throw new Error("PROJECT_ID not found in script properties");
 	}

 	return projectId;
 };

117-123: Add JSDoc return value and handle edge cases

The magnitude function is missing a JSDoc @returns tag and doesn't handle empty arrays explicitly.

 /**
  * Calculates the magnitude of a vector.
  * @param x - The vector.
+ * @returns The magnitude (Euclidean norm) of the vector
  */
 function magnitude(x: number[]): number {
+	if (x.length === 0) {
+		return 0;
+	}
 	let result = 0;
 	for (let i = 0, l = x.length; i < l; i += 1) {
 		result += x[i] ** 2;
 	}
 	return Math.sqrt(result);
 }

142-148: Handle edge cases in similarityEmoji

The function doesn't handle edge cases like values outside the expected cosine similarity range [-1, 1].

 export const similarityEmoji = (value: number): string => {
+	// Ensure value is within expected cosine similarity range [-1, 1]
+	if (value < -1 || value > 1) {
+		throw new Error(`Similarity value ${value} is outside the expected range [-1, 1]`);
+	}
+
 	if (value >= 0.9) return "🔥"; // Very high similarity
 	if (value >= 0.7) return "✅"; // High similarity
 	if (value >= 0.5) return "👍"; // Medium similarity
 	if (value >= 0.3) return "🤔"; // Low similarity
 	return "❌"; // Very low similarity
 };
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2426a68 and 61a8202.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (5)
  • packages/embeddings/README.md (1 hunks)
  • packages/embeddings/package.json (1 hunks)
  • packages/embeddings/src/index.test.ts (1 hunks)
  • packages/embeddings/src/index.ts (1 hunks)
  • packages/embeddings/tsconfig.json (1 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
packages/embeddings/src/index.test.ts (1)
packages/embeddings/src/index.ts (3)
  • similarity (131-136)
  • similarityEmoji (142-148)
  • batchedEmbeddings (59-98)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Cloudflare Pages
🔇 Additional comments (9)
packages/embeddings/tsconfig.json (1)

1-14: LGTM! TypeScript configuration is well set up for the project

The TypeScript configuration is appropriate for a package that will be used with Google Apps Script. It correctly:

  • Uses modern module system and target
  • Enables strict type checking
  • Includes Google Apps Script types
  • Sets up proper file inclusion/exclusion

No issues found in this configuration.

packages/embeddings/src/index.test.ts (5)

21-38: LGTM! Comprehensive testing of the similarity function

The tests for the similarity function are well-structured and cover important cases:

  • Parallel vectors (similarity = 1.0)
  • Orthogonal vectors (similarity = 0.0)
  • Opposite vectors (similarity = -1.0)
  • Error handling for vectors of different lengths

Good job ensuring that the cosine similarity calculations are accurate.


40-48: LGTM! Good coverage of similarity emoji thresholds

The tests for the similarityEmoji function verify that the correct emoji is returned for each similarity value threshold:

  • Very high (≥0.9): 🔥
  • High (≥0.7): ✅
  • Medium (≥0.5): 👍
  • Low (≥0.3): 🤔
  • Very low (<0.3): ❌

All thresholds are properly tested.
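For reference, the two pure functions exercised by these tests can be sketched as follows. This is a reconstruction from the behavior described in this review (signatures and the error message are assumptions), not the PR's actual source:

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), throwing on length mismatch.
function similarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] ** 2;
    magB += b[i] ** 2;
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Maps a similarity value to an emoji using the thresholds listed above.
function similarityEmoji(value: number): string {
  if (value >= 0.9) return "🔥"; // Very high similarity
  if (value >= 0.7) return "✅"; // High similarity
  if (value >= 0.5) return "👍"; // Medium similarity
  if (value >= 0.3) return "🤔"; // Low similarity
  return "❌"; // Very low similarity
}

const parallel = similarity([1, 0], [1, 0]); // 1 for parallel unit vectors
const orthogonal = similarity([1, 0], [0, 1]); // 0 for orthogonal vectors
```

Because the test vectors are unit vectors, the products of magnitudes are exactly 1, so the strict-equality assertions on 1.0, 0.0, and -1.0 are numerically safe.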


50-105: LGTM! Thorough testing of batchedEmbeddings for single and array inputs

The tests for batchedEmbeddings are comprehensive, covering:

  • Single string input handling
  • Array of strings input handling
  • Verification of request payload structure
  • Proper parsing of response data

The mocking approach is clean and effective.


107-129: LGTM! Good testing of custom parameters and error handling

The tests properly verify:

  • Custom parameters (model, projectId, region) are correctly used in API requests
  • Error handling when API returns non-200 status codes

This ensures the function is robust against various configuration options and failure scenarios.


1-20: LGTM! Clean mock setup for Google Apps Script environment

The mocking approach for Google Apps Script global objects is clean and effective:

  • ScriptApp.getOAuthToken is mocked to return a test token
  • PropertiesService is configured to simulate project ID retrieval
  • UrlFetchApp.fetchAll is mocked with vi.fn() for response simulation

This setup allows for testing Apps Script code in a Node.js environment.
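Outside of Vitest, the same mocking strategy can be sketched with plain global stubs (a hypothetical reduction of the setup described above, not the PR's actual test code):

```typescript
// Minimal stand-ins for the Apps Script globals, assigned onto globalThis so
// code written against ScriptApp/PropertiesService runs under plain Node.
(globalThis as any).ScriptApp = {
  getOAuthToken: () => "test-token",
};
(globalThis as any).PropertiesService = {
  getScriptProperties: () => ({
    getProperty: (key: string) =>
      key === "PROJECT_ID" ? "test-project" : null,
  }),
};

// A getProjectId-style helper (as described in this review) now works in Node:
const getProjectId = (): string => {
  const projectId = (globalThis as any).PropertiesService
    .getScriptProperties()
    .getProperty("PROJECT_ID");
  if (!projectId) {
    throw new Error("PROJECT_ID not found in script properties");
  }
  return projectId;
};
```

Vitest's vi.fn() adds call tracking on top of this, which is what lets the tests assert on the request payloads passed to UrlFetchApp.fetchAll.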

packages/embeddings/src/index.ts (3)

12-41: Well-documented interface

The Options interface is well-documented with clear JSDoc comments for each property, including default values.


131-136: Consider optimizing the similarity function

The similarity function checks for vector length equality, which is good. However, since we're now adding the same check to dotProduct_, you could optimize this further.


59-98:

✅ Verification successful

Address TODOs and add batching implementation

The function has an unimplemented TODO about chunking requests in batches of 5, which should be addressed to handle rate limits and large input arrays.

The UrlFetchApp.fetchAll() method may have limitations on the number of concurrent requests. According to Google Apps Script documentation, there are quotas that could affect large batches of requests.

Here's an improved implementation with batching:

 export function batchedEmbeddings(
 	text: string | string[],
 	{
 		parameters = {},
 		model = MODEL_ID,
 		projectId = getProjectId(),
 		region = REGION,
 		token = ScriptApp.getOAuthToken(),
 	}: Options = {},
 ): number[][] {
 	const inputs = !Array.isArray(text) ? [text] : text;

-	// TODO chunk in instances of 5
-	const requests = inputs.map((content) => ({
+	// Process inputs in batches of 5 to respect API limits
+	const BATCH_SIZE = 5;
+	const results: any[] = [];
+
+	for (let i = 0; i < inputs.length; i += BATCH_SIZE) {
+		const batch = inputs.slice(i, i + BATCH_SIZE);
+		const requests = batch.map((content) => ({
 		url: `https://${region}-aiplatform.googleapis.com/v1/projects/${projectId}/locations/${region}/publishers/google/models/${model}:predict`,
 		method: "post" as const,
 		headers: {
 			Authorization: `Bearer ${token}`,
 			"Content-Type": "application/json",
 		},
 		muteHttpExceptions: true,
 		contentType: "application/json",
 		payload: JSON.stringify({
 			instances: [{ content }],
 			parameters,
 		}),
-	}));
+		}));
 
-	const responses = UrlFetchApp.fetchAll(requests);
+		const responses = UrlFetchApp.fetchAll(requests);
 
-	const results = responses.map((response) => {
-		if (response.getResponseCode() !== 200) {
-			throw new Error(response.getContentText());
-		}
+		const batchResults = responses.map((response) => {
+			const statusCode = response.getResponseCode();
+			const responseText = response.getContentText();
+			
+			if (statusCode !== 200) {
+				throw new Error(`API request failed with status ${statusCode}: ${responseText}`);
+			}
 
-		return JSON.parse(response.getContentText());
-	});
+			return JSON.parse(responseText);
+		});
+		
+		results.push(...batchResults);
+	}

 	return results.map((result) => result.predictions[0].embeddings.values);
 }

🌐 Web query:

What are the limitations of UrlFetchApp.fetchAll in Google Apps Script?

💡 Result:

UrlFetchApp's fetchAll method in Google Apps Script offers significant performance advantages over sequential fetch calls but comes with several important limitations:

Performance vs. Quota Limits

  1. Daily Request Quota
    All URL Fetch requests (including each individual call within fetchAll) count toward the daily limit of 20,000 requests/user/day[4][12]. A single fetchAll with 100 URLs consumes 100 requests from this quota.

  2. Execution Time Limits
    Scripts must complete within 6 minutes[12], which constrains large-scale fetchAll operations.

Technical Constraints

  • Response Handling
    Maximum response size per request: 50 MB[12]
    Maximum URL length: 2,082 characters[11][12]

  • Header Limits
    Up to 100 headers per request, each under 8 KB[12]

Operational Considerations

  • Asynchronous Nature
    Responses are reordered to match the original request sequence, even though requests execute out of order[1]. Actual fetch order varies per execution[1].

  • Server Rate Limits
    Aggressive use of fetchAll may trigger anti-DDOS protections on target servers, potentially blocking requests[6].

Best Practices

  • Benchmarked Performance
    Processing 1,000 URLs with fetchAll took 5.5 seconds vs. 27 seconds for sequential fetch calls in testing[1]. However, no documented maximum batch size exists[1][6].

  • Caching Strategies
    Use CacheService to avoid redundant requests and preserve quota[3][5].

  • Error Handling
    Enable muteHttpExceptions: true to prevent script failures from individual request errors[8][12].

Key Differences from fetch:

  • Execution method: fetchAll uses asynchronous parallel processing; sequential fetch is synchronous and linear.
  • Response order: fetchAll enforces the original request order[1]; sequential fetch returns in natural execution order.
  • Quota usage: both count 1 request per URL.
  • Typical use case: fetchAll for batch operations (100-1,000 URLs); fetch for single requests.

For mission-critical applications, implement retry logic and monitor quota usage via Apps Script Dashboard[4][12].



BATCHING AND ERROR HANDLING VERIFIED:
Your updated implementation correctly replaces the TODO with batching logic by processing inputs in groups of 5. The batching mechanism not only mitigates potential rate limit issues with UrlFetchApp.fetchAll (as per Google Apps Script’s quotas and execution time constraints) but also enhances error reporting by including the status code in the error message.

  • Key improvements:
    • Inputs are now processed in batches of 5 instead of issuing a single large batch.
    • Enhanced error handling: errors include both the response code and response text.
    • Overall design aligns well with best practices for managing quotas and rate limits in Apps Script.

This refactored approach addresses the original concerns effectively.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (4)
packages/embeddings/src/index.ts (4)

109-123: Consider validating the instances array

The function should validate that the instances array is not empty before processing.

export function getBatchedEmbeddings(
	instances: Instance[],
	{
		parameters = {},
		model = MODEL_ID,
		projectId = getProjectId(),
		region = REGION,
		token = ScriptApp.getOAuthToken(),
	}: Options = {},
): number[][] {
+	if (!instances || instances.length === 0) {
+		throw new Error("No instances provided for embedding generation");
+	}
	const chunks = chunkArray(instances, 5);
	const requests = chunks.map((instances) => ({

135-141: Add error handling for JSON parsing

The function currently doesn't handle potential JSON parsing errors, which could occur if the API returns malformed JSON.

const results = responses.map((response) => {
	if (response.getResponseCode() !== 200) {
		throw new Error(response.getContentText());
	}

-	return JSON.parse(response.getContentText());
+	try {
+		return JSON.parse(response.getContentText());
+	} catch (error) {
+		throw new Error(`Failed to parse API response: ${error.message}`);
+	}
});

160-166: Consider optimizing the magnitude calculation

The magnitude function implementation is correct but could be optimized for readability and potentially performance.

function magnitude(x: number[]): number {
-	let result = 0;
-	for (let i = 0, l = x.length; i < l; i += 1) {
-		result += x[i] ** 2;
-	}
-	return Math.sqrt(result);
+	return Math.sqrt(x.reduce((sum, val) => sum + val ** 2, 0));
}

109-123:

❓ Verification inconclusive

Consider adding rate limiting for large batches

For handling large batches, it might be worth considering a rate limiting mechanism to avoid overwhelming the API and potential rate limits.

export function getBatchedEmbeddings(
	instances: Instance[],
	{
		parameters = {},
		model = MODEL_ID,
		projectId = getProjectId(),
		region = REGION,
		token = ScriptApp.getOAuthToken(),
+		batchSize = 5,
+		delayBetweenBatches = 1000, // milliseconds
	}: Options = {},
): number[][] {
	if (!instances || instances.length === 0) {
		throw new Error("No instances provided for embedding generation");
	}
-	const chunks = chunkArray(instances, 5);
+	const chunks = chunkArray(instances, batchSize);
	const requests = chunks.map((instances) => ({

You would then need to update the Options interface to include these new parameters:

interface Options {
	// existing properties...

	/**
	 * The size of each batch when processing multiple instances.
	 * @default 5
	 */
	batchSize?: number;

	/**
	 * The delay between batches in milliseconds.
	 * @default 1000
	 */
	delayBetweenBatches?: number;
}

And modify the fetchAll to process batches with delay:

const results: any[] = [];
for (const requestBatch of chunkArray(requests, 10)) {
	const responses = UrlFetchApp.fetchAll(requestBatch);
	// Process responses...
	// Add processed results to results array
	
	if (delayBetweenBatches > 0 && requestBatch.length < requests.length) {
		Utilities.sleep(delayBetweenBatches);
	}
}

🌐 Web query:

What are the rate limits for the Vertex AI embeddings API?

💡 Result:

Vertex AI's embeddings API enforces rate limits that vary by model, region, and project configuration. Here's a detailed breakdown:

Core Rate Limits

  • textembedding-gecko models
    Defaults to 1,500 requests per minute (RPM) in regions like us-central1 [10][12]. However:

    • Input is capped at 250 text entries per request in us-central1 (5 in other regions) [2][4]
    • Token limit: 20,000 tokens per request, with truncation at 2,048 tokens for most models (8,192 for text-embedding-large-exp-03-07) [2][10]
  • Experimental embedding models
    gemini-embedding-exp-03-07 has stricter limits:

    • 5 RPM under free tiers [7]
    • 10 RPM for paid Tier 1 projects [7]

Quota Management

  1. Default limits
    Set per-project and per-region, shared across all applications/IPs using the Firebase/Vertex AI project [1][10].

  2. Dynamic adjustments
    Limits can increase automatically as your project usage grows through Google's usage tiers:

    • Free Tier: Strict RPM/TPM caps [7]
    • Tier 1 (billing-enabled): Higher limits (e.g., 1,500 RPM for textembedding-gecko) [10]

Specialized Models

  • Multimodal embeddings (multimodalembedding@001): region-specific restrictions may block requests if quota is exhausted [5]
  • Batch processing: up to 4 concurrent jobs [10]

To view or adjust quotas:

  1. Navigate to Google Cloud Console → IAM & Admin → Quotas
  2. Filter for "Online prediction requests per base model per minute" [3]
  3. Submit quota increase requests for validated business needs [1][3]

Key Considerations

  • Combine local caching with batch processing to reduce API calls [8]
  • Use Firebase App Check to prevent abuse of client-side embedding calls [9]
  • Errors like 429 ("Quota Exceeded") or 400 ("Payload Size Limit") indicate breached limits [14][5]



Rate Limiting Enhancement for Batch Requests

The current implementation spreads requests over fixed chunks but does not account for API rate limits, which can vary significantly by model and region. Given Vertex AI’s documented limits (e.g., up to 1,500 RPM for textembedding-gecko models and even stricter limits for experimental models), it is advisable to expose rate limiting parameters. Consider updating the function as follows:

  • Expose parameters:
    Add optional batchSize (default: 5) and delayBetweenBatches (default: 1000 ms) to the Options interface so that users can adjust based on the API’s constraints.

  • Dynamic chunking:
    Replace the hard-coded chunkArray(instances, 5) with chunkArray(instances, batchSize).

  • Delay between requests:
    In your loop that processes batches (e.g., using UrlFetchApp.fetchAll), insert a delay (using Utilities.sleep(delayBetweenBatches)) between consecutive batches to ensure that you don’t inadvertently exceed the rate limits.

This approach gives developers the flexibility to tune the request flow based on their specific project quotas and the model being used.
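The three suggestions above can be combined into a sketch like the following. It is a minimal, hedged example: `processInBatches`, `fetchBatch`, and `sleep` are hypothetical stand-ins (in Apps Script, `fetchBatch` would wrap `UrlFetchApp.fetchAll` and `sleep` would be `Utilities.sleep`).

```typescript
// Hypothetical sketch of rate-limited batching with the proposed options.
interface BatchOptions {
	batchSize?: number; // instances per request batch (suggested default: 5)
	delayBetweenBatches?: number; // ms to wait between batches (suggested default: 1000)
}

function processInBatches<T, R>(
	instances: T[],
	fetchBatch: (batch: T[]) => R[],
	sleep: (ms: number) => void,
	options: BatchOptions = {},
): R[] {
	const { batchSize = 5, delayBetweenBatches = 1000 } = options;
	const results: R[] = [];
	for (let i = 0; i < instances.length; i += batchSize) {
		const batch = instances.slice(i, i + batchSize);
		results.push(...fetchBatch(batch));
		// Sleep only between batches, not after the final one.
		if (i + batchSize < instances.length) sleep(delayBetweenBatches);
	}
	return results;
}
```

Injecting `sleep` rather than calling it directly keeps the helper testable and lets callers tune the pause to their project's quota tier.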

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 61a8202 and 7ac9f69.

📒 Files selected for processing (3)
  • packages/embeddings/package.json (1 hunks)
  • packages/embeddings/src/index.test.ts (1 hunks)
  • packages/embeddings/src/index.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/embeddings/package.json
  • packages/embeddings/src/index.test.ts
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Cloudflare Pages
🔇 Additional comments (10)
packages/embeddings/src/index.ts (10)

4-21: Well-defined interfaces with comprehensive task types

The interfaces for Parameters and Instance are well-structured with appropriate optional fields. The task types are comprehensive and follow the Vertex AI embedding model capabilities.


23-55: Good documentation in Options interface

The Options interface is well-documented with clear JSDoc comments for each parameter, including default values. This makes the API easier to understand and use.


57-65: Appropriate error handling in getProjectId function

The function correctly checks for the presence of the required property and throws a descriptive error message when it's not found.


76-88: Clean implementation of getTextEmbeddings

The function correctly handles both single string and array inputs by converting to a standard format before calling getBatchedEmbeddings.


99-107: Good use of default parameters

The function correctly uses destructuring with default values, making the function more flexible while maintaining good defaults.


128-130: Improve error handling for non-200 response codes

The current error handling is minimal and does not provide enough context about the specific failure.

Consider handling different status codes more specifically and including the response body in the error message for better diagnostics:

if (response.getResponseCode() !== 200) {
-	throw new Error(response.getContentText());
+	const responseCode = response.getResponseCode();
+	const responseText = response.getContentText();
+	let errorMessage = `Request failed with status code ${responseCode}: ${responseText}`;
+	if (responseCode === 400) {
+		errorMessage = `Bad Request: ${responseText}`;
+	} else if (responseCode === 500) {
+		errorMessage = `Internal Server Error: ${responseText}`;
+	}
+	throw new Error(errorMessage);
}

148-154: Fix potential silent errors in dotProduct_ function

The dotProduct_ function uses Math.min(x.length, y.length) which silently ignores extra elements if vectors have different lengths. This could lead to subtle bugs.

-function dotProduct_(x: number[], y: number[]): number {
+/**
+ * Calculates the dot product of two vectors.
+ * @param x - The first vector.
+ * @param y - The second vector.
+ * @throws {Error} If vectors have different lengths
+ * @returns The dot product value
+ */
+function dotProduct_(x: number[], y: number[]): number {
+	if (x.length !== y.length) {
+		throw new Error("Vectors must have the same length");
+	}
 	let result = 0;
-	for (let i = 0, l = Math.min(x.length, y.length); i < l; i += 1) {
+	for (let i = 0, l = x.length; i < l; i += 1) {
 		result += x[i] * y[i];
 	}
 	return result;
 }

174-179: Well-implemented similarity function with proper validation

Good implementation of the cosine similarity function with appropriate validation for vector length mismatch.
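The cosine-similarity behavior praised here can be illustrated with a self-contained sketch. This is not the package's code, just the standard formula with the same length validation the review describes.

```typescript
// Sketch of cosine similarity: dot(x, y) / (|x| * |y|), with a length check.
function dotProduct(x: number[], y: number[]): number {
	if (x.length !== y.length) throw new Error("Vectors must have the same length");
	return x.reduce((sum, xi, i) => sum + xi * y[i], 0);
}

function magnitude(x: number[]): number {
	return Math.sqrt(dotProduct(x, x));
}

function similarity(x: number[], y: number[]): number {
	return dotProduct(x, y) / (magnitude(x) * magnitude(y));
}
```

Identical directions yield 1, orthogonal vectors yield 0, and scaling a vector leaves the result unchanged, which is why cosine similarity suits embedding comparison.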


185-191: Good emoji mapping for similarity values

The emoji mapping for different similarity thresholds is well-defined and provides a nice visual representation of similarity values.
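A mapping of that shape might look like the sketch below. The specific thresholds and emoji here are illustrative assumptions; the package's actual cut-offs may differ.

```typescript
// Hypothetical threshold-to-emoji mapping; the real values in the package
// are not reproduced here, only the overall pattern.
function similarityEmoji(value: number): string {
	if (value >= 0.9) return "🔥";
	if (value >= 0.75) return "😊";
	if (value >= 0.5) return "🙂";
	if (value >= 0.25) return "😐";
	return "🥶";
}
```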


193-199: Good implementation of chunkArray utility

The utility function correctly chunks an array into pieces of a specified size, which is useful for batch processing.
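A standalone sketch of such a chunking utility (the package's actual `chunkArray` may differ in details):

```typescript
// Split an array into consecutive chunks of at most `size` elements;
// the final chunk may be shorter.
function chunkArray<T>(items: T[], size: number): T[][] {
	const chunks: T[][] = [];
	for (let i = 0; i < items.length; i += size) {
		chunks.push(items.slice(i, i + size));
	}
	return chunks;
}
```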

@jpoehnelt jpoehnelt merged commit 2f6afbb into main Mar 28, 2025
5 checks passed
@jpoehnelt jpoehnelt deleted the feat/embeddings-package branch March 28, 2025 19:17
