Add OpenAI support #65

Merged
Kavirubc merged 6 commits into similigh:main from nick1udwig:add-openai-support-3 on Feb 17, 2026
Conversation

@nick1udwig (Contributor) commented Feb 16, 2026

Description

Split out the OpenAI provider portion of #40 to resolve #42.

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to change)
  • 📚 Documentation update
  • 🔧 Configuration/build change
  • ♻️ Refactoring (no functional changes)
  • 🧪 Test update

Related Issues

Fixes #42

Changes Made

Add OpenAI API support for LLM and embedding
Use Gemini by default (i.e., if keys for both Gemini and OpenAI are provided)

Testing

  • I have run go build ./... successfully
  • I have run go test ./... successfully
  • I have run go vet ./... successfully
  • I have tested the changes locally

Screenshots (if applicable)

N/A

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes

Additional Notes

Summary by CodeRabbit

  • New Features

    • Dual AI provider support: Gemini and OpenAI for embeddings and LLMs.
    • Automatic provider selection: Gemini preferred when both keys present; OpenAI used if Gemini key is not set.
    • Configurable embedding model and dimensions; effective embedding dimensions now derived from the chosen provider.
    • Workflows and actions accept an optional OPENAI_API_KEY alongside GEMINI_API_KEY.
  • Documentation

    • Updated README, setup guides, examples, and Quick Start/troubleshooting to reflect provider options, precedence, and "at least one key required" rule.

@nick1udwig requested a review from Kavirubc as a code owner on February 16, 2026, 21:01
@coderabbitai bot commented Feb 16, 2026

📝 Walkthrough

Adds OpenAI as an alternative AI provider to Gemini, implements provider resolution (Gemini preferred, OpenAI fallback), refactors embedder and LLM clients for multi-provider support, infers embedding dimensions from provider/model, and updates docs/workflows/examples to accept OPENAI_API_KEY.

Changes

Cohort / File(s) / Summary

  • Docs & examples (.env.sample, DOCS/single-repo-setup.md, DOCS/multi-repo-org-setup.md, README.md, cmd/simili-web/README.md, DOCS/examples/.../simili.yaml, DOCS/examples/.../workflow.yml, DOCS/examples/.../caller-workflow.yml, DOCS/examples/.../shared-workflow.yml): Introduce OPENAI_API_KEY across docs and examples, mark GEMINI_API_KEY optional with precedence rules, add embedding model and dimensions examples, and update inline .env/sample comments to reflect provider precedence and fallback.
  • Command wiring & init (cmd/simili-web/main.go, cmd/simili/commands/.../*.go): Switch initialization to use cfg.Embedding.APIKey, make embedder/LLM init provider-agnostic, always attempt embedder/LLM creation with unified key fallback, and propagate dynamic embedding dimension inference into collection creation and logs.
  • Provider abstraction & utils (internal/integrations/gemini/provider.go, internal/integrations/gemini/provider_test.go): Add Provider type/constants, ResolveProvider (env/config resolution and precedence), the OpenAI JSON helper callOpenAIJSON, error extraction helpers, and tests covering provider resolution and model selection.
  • Embedder multi-provider (internal/integrations/gemini/embedder.go): Replace the Gemini-only embedder with a provider-aware Embedder supporting Gemini and OpenAI; add provider/client fields, provider-specific embedding paths, model defaults, dimensions inference, and new accessors (Provider(), Model(), Dimensions tracking).
  • LLM multi-provider & features (internal/integrations/gemini/llm.go, internal/integrations/gemini/prompts.go): Extend LLMClient to support Gemini/OpenAI (constructor takes a variadic model), add provider-aware generate paths, unified JSON extraction/parsing, new types for triage/duplicate detection and DetectPRDuplicate, plus a minor prompt formatting fix.
  • Vector DB wiring (internal/steps/vectordb_prep.go): Wire the embedder into VectorDBPrep and prefer embedder-reported embedding dimensions when available during collection/dry-run operations.
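The vector DB wiring above boils down to one decision: trust the embedder's reported dimension when it has one, otherwise fall back to the configured value. A minimal sketch, with hypothetical type and function names standing in for the real VectorDBPrep wiring:

```go
package main

import "fmt"

// embedder is an illustrative stand-in for the real embedder type;
// Dimensions is assumed to return 0 until the dimension is known.
type embedder interface{ Dimensions() int }

type fakeEmbedder struct{ dims int }

func (f fakeEmbedder) Dimensions() int { return f.dims }

// effectiveDims mirrors the "prefer embedder-reported dimensions" rule:
// a positive embedder-reported value wins over the configured default.
func effectiveDims(e embedder, configured int) int {
	if d := e.Dimensions(); d > 0 {
		return d
	}
	return configured
}

func main() {
	fmt.Println(effectiveDims(fakeEmbedder{dims: 1536}, 768)) // 1536: embedder wins
	fmt.Println(effectiveDims(fakeEmbedder{dims: 0}, 768))    // 768: config fallback
}
```

Keeping the fallback means collections created before dimension inference existed still get a sane size.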

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Config as Config (cfg.Embedding)
    participant Init as Init
    participant Env as Env (GEMINI/OPENAI)
    participant Resolver as ResolveProvider
    participant Embedder as Embedder
    participant LLM as LLMClient
    participant VectorDB as VectorDBPrep

    Init->>Config: Load API key, model, dimensions
    Init->>Env: Read GEMINI_API_KEY / OPENAI_API_KEY
    Init->>Resolver: ResolveProvider(cfg key or env)
    Resolver-->>Init: Provider + Resolved API Key
    Init->>Embedder: NewEmbedder(apiKey, model)
    Embedder->>Embedder: init provider client (Gemini/OpenAI)
    Embedder->>Embedder: infer Dimensions()
    Embedder-->>Init: Embedder(provider, model, dimensions)
    Init->>LLM: NewLLMClient(apiKey, model...)
    LLM->>LLM: resolve provider, init client
    Init->>VectorDB: NewVectorDBPrep(..., embedder)
    VectorDB->>Embedder: embedder.Dimensions()
    VectorDB-->>Init: use inferred dimensions for collection ops

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

enhancement

Poem

🐰 I found a key beneath a log,
Gemini leads while OpenAI hops,
Embeds learn size from model props,
Docs and workflows sing in pairs,
A rabbit cheers and nibbles shared carrots. 🥕

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 25.93%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (5 passed)

  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Title check (✅ Passed): the title 'Add OpenAI support' directly describes the main objective of the pull request: adding OpenAI as an alternative AI provider alongside Gemini.
  • Linked Issues check (✅ Passed): the PR implements all key objectives from issue #42: adds OpenAI provider support [.env.sample, provider.go, embedder.go, llm.go], auto-selects providers with Gemini preference [provider.go, ResolveProvider], documents supported models and dimensions [simili.yaml files, READMEs], updates workflow examples and setup guides [multi-repo/caller-workflow.yml, single-repo/workflow.yml, setup guides], makes embedder/LLM initialization provider-agnostic [embedder.go, llm.go], and maintains backward compatibility with no breaking changes.
  • Out of Scope Changes check (✅ Passed): all changes are directly aligned with issue #42 requirements. Configuration updates, documentation changes, provider infrastructure, and initialization logic modifications all serve the goal of adding OpenAI support. No unrelated or out-of-scope changes detected.
  • Merge Conflict Detection (✅ Passed): no merge conflicts detected when merging into main.


@coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
Verify each finding against the current code and only fix it if needed.


In `@DOCS/examples/multi-repo/simili.yaml`:
- Around line 12-15: Add an inline comment to the llm block to show OpenAI
alternatives: annotate the provider key (provider: "gemini" -> add comment like
# or "openai"), the api_key key (api_key: "${GEMINI_API_KEY}" -> note OpenAI env
var alternative), and the model key (model: "gemini-2.5-flash" -> add a hint
like # or "gpt-4o" or "gpt-4.1") so users see equivalent OpenAI values when
choosing provider, referencing the llm keys provider, api_key, and model in
simili.yaml.

In `@internal/integrations/gemini/embedder.go`:
- Around line 143-151: The code in embedOpenAI mutates the receiver field
e.dimensions (setting it to len(embedding)), which causes a data race if Embed
or EmbedBatch are used concurrently; remove the runtime assignment in
embedOpenAI and instead rely on the existing inferEmbeddingDimensions logic (or
initialize dimensions at construction), leaving e.dimensions immutable at
runtime; alternatively, if runtime detection is required, wrap the one-time
dimension detection in a sync.Once or use atomic operations—update embedOpenAI
to stop writing e.dimensions and adjust any callers that expected it to be set
there (e.g., EmbedBatch/Embed) to use inferEmbeddingDimensions or the new
one-time detection helper.
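The sync.Once variant suggested above can be pictured concretely. This is a sketch only: the field and method names are illustrative, not the actual Embedder API, and it assumes first-write-wins semantics for the detected dimension.

```go
package main

import (
	"fmt"
	"sync"
)

// Embedder sketches race-free one-time dimension detection: concurrent
// Embed/EmbedBatch calls may all invoke recordDimensions, but sync.Once
// guarantees the field is written exactly once.
type Embedder struct {
	dimsOnce   sync.Once
	dimensions int
}

// recordDimensions stores the observed vector length on the first call
// and is a no-op afterwards, eliminating the concurrent-write race.
func (e *Embedder) recordDimensions(vec []float32) {
	e.dimsOnce.Do(func() { e.dimensions = len(vec) })
}

// Dimensions should only be read after at least one embedding completes
// (or be guarded with an atomic) to stay race-free.
func (e *Embedder) Dimensions() int { return e.dimensions }

func main() {
	e := &Embedder{}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			e.recordDimensions(make([]float32, 1536)) // simulated embedding result
		}()
	}
	wg.Wait()
	fmt.Println(e.Dimensions()) // 1536
}
```

Initializing dimensions at construction, as the comment also suggests, is simpler still when the model's dimension is known up front.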

In `@internal/integrations/gemini/llm.go`:
- Around line 137-178: The NewLLMClient function sets unsafe default model names
that will cause runtime failures and soon-deprecated behavior: change the
default for ProviderOpenAI (where client.model is set under the ProviderOpenAI
case) from "gpt-5.2" to a valid OpenAI model such as "gpt-5" or "gpt-5-mini",
and update the ProviderGemini default (where client.model is set under the
ProviderGemini case) from "gemini-2.0-flash-lite" to a non-deprecated Gemini
model such as "gemini-2.5-flash-lite" or "gemini-3-flash-preview"; keep the
selectedModel override logic intact so passing a model still takes precedence.

In `@internal/integrations/gemini/prompts.go`:
- Around line 317-364: buildPRDuplicateDetectionPrompt currently dereferences
input.PullRequest.Title and Body without a nil check; add a guard inside
buildPRDuplicateDetectionPrompt to handle a nil PullRequest by reading Title and
Body into local variables (e.g., title, body) with safe defaults ("") if
input.PullRequest is nil, then use those locals in the fmt.Sprintf call and
truncate calls; ensure you still build candidateList as before and return the
prompt using the safe title/body values to avoid a panic when PullRequest is
nil.

In `@internal/integrations/gemini/provider.go`:
- Around line 50-56: The heuristic in inferProviderFromKey only checks for "sk-"
and misses other OpenAI key formats like project-scoped "sk-proj-" (and similar
variants); update inferProviderFromKey to trim and lowercase the apiKey, then
check for additional known OpenAI prefixes (e.g., "sk-proj-" plus any other
documented OpenAI prefixes you find) before returning ProviderOpenAI, falling
back to ProviderGemini otherwise, and add a brief comment above the function
documenting which prefixes are treated as OpenAI keys; reference the
inferProviderFromKey function and ProviderOpenAI/ProviderGemini constants when
making the change.
🧹 Nitpick comments (2)
DOCS/examples/multi-repo/simili.yaml (1)

12-15: Consider adding OpenAI alternative comments to the LLM section for consistency.

The embedding section includes helpful comments showing OpenAI alternatives, but the LLM section doesn't have similar guidance. For users who want to use OpenAI throughout, adding inline comments (e.g., provider: "gemini" # or "openai") would improve consistency.

📝 Suggested enhancement
 llm:
-  provider: "gemini"
-  api_key: "${GEMINI_API_KEY}"
-  model: "gemini-2.5-flash"
+  provider: "gemini" # or "openai"
+  api_key: "${GEMINI_API_KEY}" # or "${OPENAI_API_KEY}"
+  model: "gemini-2.5-flash" # OpenAI example: "gpt-4o-mini"
internal/integrations/gemini/provider.go (1)

50-56: The sk- prefix heuristic may not cover all OpenAI key formats.

OpenAI has introduced project-scoped keys with prefixes like sk-proj- and organization keys. While the current heuristic works for common cases and falls back to Gemini safely, consider documenting this behavior or extending the check.

📝 Suggested documentation or extended check
 func inferProviderFromKey(apiKey string) Provider {
-	// OpenAI keys commonly use sk-* prefixes. Fall back to Gemini for compatibility.
-	if strings.HasPrefix(strings.TrimSpace(apiKey), "sk-") {
+	// OpenAI keys use sk-* prefixes (including sk-proj-*, sk-svcacct-*).
+	// Fall back to Gemini for compatibility when prefix is unrecognized.
+	trimmed := strings.TrimSpace(apiKey)
+	if strings.HasPrefix(trimmed, "sk-") {
 		return ProviderOpenAI
 	}
 	return ProviderGemini
 }

Comment on lines +137 to 178 (new code as added in this PR):

// NewLLMClient creates a new LLM client.
func NewLLMClient(apiKey string, model ...string) (*LLMClient, error) {
	provider, resolvedKey, err := ResolveProvider(apiKey)
	if err != nil {
		return nil, err
	}

	client := &LLMClient{
		provider: provider,
		apiKey:   resolvedKey,
	}
	selectedModel := ""
	if len(model) > 0 {
		selectedModel = strings.TrimSpace(model[0])
	}

	switch provider {
	case ProviderGemini:
		ctx := context.Background()
		geminiClient, err := genai.NewClient(ctx, option.WithAPIKey(resolvedKey))
		if err != nil {
			return nil, fmt.Errorf("failed to create Gemini client: %w", err)
		}
		client.gemini = geminiClient
		if selectedModel != "" {
			client.model = selectedModel
		} else {
			client.model = "gemini-2.0-flash-lite"
		}
	case ProviderOpenAI:
		client.openAI = &http.Client{Timeout: 60 * time.Second}
		if selectedModel != "" {
			client.model = selectedModel
		} else {
			client.model = "gpt-5.2"
		}
	default:
		return nil, fmt.Errorf("unsupported provider: %s", provider)
	}

	return client, nil
}
@coderabbitai bot commented Feb 16, 2026

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

OpenAI GPT models available API 2026 current

💡 Result:

As of February 16, 2026, these GPT-branded models are available via the OpenAI API (availability can depend on usage tier / org verification) [1]:

  • GPT‑5 series (general + reasoning-capable)

    • gpt-5.1, gpt-5, gpt-5-mini, gpt-5-nano [2][3]
    • gpt-5-pro (Responses API only) [4]
    • gpt-5.1-codex (optimized for agentic coding; Responses API only) [5]
  • GPT‑4.1 series (non-reasoning, long-context)

    • gpt-4.1, gpt-4.1-mini, gpt-4.1-nano [6][7][8]
  • GPT‑4o mini (small multimodal GPT)

    • gpt-4o-mini [9]
  • GPT‑4o mini audio models

    • gpt-4o-mini-transcribe (speech-to-text) [10]
    • gpt-4o-mini-tts (text-to-speech) [11]
  • Legacy GPT

    • gpt-3.5-turbo (still available, but OpenAI recommends newer models) [12]

Note: OpenAI states that GPT‑4o and GPT‑4.1-family models were retired from ChatGPT on Feb 13, 2026 but “continue to be available in the API.” [13]

Sources: [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]


🌐 Web query:

Google Gemini models available API 2026

💡 Result:

As of Feb 16, 2026, these Gemini-family models are available in the Gemini API (Google AI for Developers), per Google’s current model docs:

Core Gemini (text output; multimodal inputs)

  • gemini-3-pro-preview (Preview) [1]
  • gemini-3-flash-preview (Preview) [1]
  • gemini-2.5-pro (Stable) [2]
  • gemini-2.5-flash (Stable) [1]
  • gemini-2.5-flash-lite (Stable) [3]
  • gemini-2.0-flash-lite (plus stable -001) [4]
  • gemini-2.0-flash (plus stable -001, and -exp) [4]

Gemini image model (image + text output)

  • gemini-3-pro-image-preview (Preview) [1]

Important lifecycle notes (API)

  • Google changed the “latest” aliases in Jan 2026:
    • gemini-pro-latest → gemini-3-pro-preview
    • gemini-flash-latest → gemini-3-flash-preview [5]
  • Deprecation announced: gemini-2.0-flash / gemini-2.0-flash-lite (and their -001 variants) scheduled to shut down Mar 31, 2026 [5]

How to see the exact current list in your project

Use the API’s models list endpoint to programmatically enumerate what’s available to you right now. [6]

Sources: [1] [2] [3] [4] [5] [6]


The default OpenAI model "gpt-5.2" does not exist and will cause runtime failures.

As of February 2026, available OpenAI API models include gpt-5.1, gpt-5, gpt-5-mini, gpt-5-nano, gpt-4.1-mini, gpt-4o-mini, and others—but gpt-5.2 is not among them. Use gpt-5 or gpt-5-mini instead.

Additionally, gemini-2.0-flash-lite (line 164) is scheduled for deprecation on March 31, 2026. Replace with gemini-2.5-flash-lite or gemini-3-flash-preview for longer-term stability.


@nick1udwig (Contributor, Author) replied:

The bot is out-of-date: gpt-5.2 is real

[screenshot attached]

@coderabbitai bot replied:

🧩 Analysis chain

🌐 Web query:

OpenAI gpt-5.2 model API availability 2026

💡 Result:

As of February 16, 2026, GPT‑5.2 is available in the OpenAI API.

  • API model IDs / endpoints

    • gpt-5.2 (GPT‑5.2 “Thinking”) — available in the Responses API and Chat Completions API. [1]
    • gpt-5.2-chat-latest (GPT‑5.2 “Instant”) — available in the Responses API and Chat Completions API. [1]
    • gpt-5.2-pro (GPT‑5.2 “Pro”) — available in the Responses API. [1]
  • Who can access it

    • OpenAI states GPT‑5 / GPT‑5.1 / GPT‑5.2 are accessible to API users on usage tiers 1–5, with some access/features subject to organization verification. [3]
  • Docs / snapshot

    • The model documentation lists gpt-5.2 plus a snapshot such as gpt-5.2-2025-12-11. [2]

Sources:
[1] OpenAI release post “Introducing GPT‑5.2” (Dec 11, 2025)
[2] OpenAI API docs: GPT‑5.2 model page
[3] OpenAI Help Center: “API Model Availability by Usage Tier and Verification Status”


@nick1udwig, you're absolutely right—thank you for the correction! My initial review was incorrect.

gpt-5.2 is indeed a valid OpenAI model available in the API as of February 2026, with variants including gpt-5.2-chat-latest and gpt-5.2-pro. The default model choice in the code is correct. I apologize for the confusion caused by outdated search results in my earlier analysis.

@coderabbitai bot left a comment

Actionable comments posted: 1


In `@internal/integrations/gemini/llm.go`:
- Around line 324-326: The code accesses resp.Candidates[0].Content.Parts
without checking that Content is non-nil, risking a nil pointer panic; update
the validation around resp.Candidates to also verify resp.Candidates[0].Content
!= nil before reading .Parts (and propagate an informative error like "empty or
blocked response from LLM"), e.g., add a nil check for
resp.Candidates[0].Content and handle that case the same as when Candidates or
Parts are empty so the function returns a clear error instead of panicking.
- Around line 445-452: The extractJSONObject function can return mismatched
brackets because it finds any opening char with IndexAny and any closing with
LastIndexAny; update extractJSONObject to locate the earliest opening delimiter
between '{' and '[' (use strings.Index for "{" and "["), determine which
delimiter was chosen, then use the corresponding closing delimiter ('}' for '{'
and ']' for '[') with LastIndex to find the matching end; if a valid pair is
found return the trimmed substring, otherwise return "" (keep function name
extractJSONObject and behavior otherwise).
🧹 Nitpick comments (1)
internal/integrations/gemini/llm.go (1)

445-452: Minor edge case: mismatched bracket extraction.

extractJSONObject uses IndexAny("{[") and LastIndexAny("}]") independently, which could extract mismatched delimiters (e.g., {...data...] if ] appears after }). Since this is a fallback after primary parsing attempts and would still fail at json.Unmarshal, the impact is low.

♻️ Optional: match bracket types
 func extractJSONObject(text string) string {
-	first := strings.IndexAny(text, "{[")
-	lastObj := strings.LastIndexAny(text, "}]")
-	if first >= 0 && lastObj > first {
-		return strings.TrimSpace(text[first : lastObj+1])
+	firstObj := strings.Index(text, "{")
+	lastObj := strings.LastIndex(text, "}")
+	firstArr := strings.Index(text, "[")
+	lastArr := strings.LastIndex(text, "]")
+	
+	// Prefer object if it starts first, else array
+	if firstObj >= 0 && lastObj > firstObj && (firstArr < 0 || firstObj < firstArr) {
+		return strings.TrimSpace(text[firstObj : lastObj+1])
+	}
+	if firstArr >= 0 && lastArr > firstArr {
+		return strings.TrimSpace(text[firstArr : lastArr+1])
 	}
 	return ""
 }

@coderabbitai bot left a comment

In `@internal/integrations/gemini/embedder.go`:
- Around line 4-10: The TODO in package gemini (file embedder.go) about renaming
the package to provider-neutral naming should be turned into a tracked issue:
create an issue in the repo tracker describing the rename (current path/package
"gemini", desired example name "internal/integrations/ai"), include rationale,
migration steps, and any API/compatibility considerations, then update the TODO
comment in embedder.go to reference the created issue ID (e.g.,
TODO(2026-02-16): see issue `#123`) so the TODO is tied to an actionable ticket;
ensure the issue number and short link are added to the comment and push the
updated comment.
🧹 Nitpick comments (1)
internal/integrations/gemini/embedder.go (1)

4-10: Track the package rename TODO in an issue.

Optional, but turning this TODO into a tracked issue helps avoid long-lived TODOs.


@coderabbitai bot left a comment

In `@internal/integrations/gemini/llm.go`:
- Line 136: The default Gemini model in internal/integrations/gemini/llm.go is
set to "gemini-2.0-flash-lite" (see the client.model assignment); change that
default to a non-deprecated model such as "gemini-2.5-flash-lite" or
"gemini-3-flash-preview" by updating the client.model value, and optionally make
the default configurable (via env/config) so future model swaps don't require
code changes.
- Around line 445-452: The current extractJSONObject selects the first opening
brace from "{[" and the last closing from "}]" independently, which can yield
mismatched pairs; update extractJSONObject to determine which opening char was
found (e.g., open := text[first]) then search for the matching closing char
(close := '}' if open == '{' else ']' if open == '[') using a last-index search
for that specific close (e.g., LastIndexByte/LastIndex) and only return the
substring if last > first; otherwise return "" to avoid extracting invalid
mixed-bracket JSON. Ensure you update the logic inside extractJSONObject and
keep trimming behavior the same.
🧹 Nitpick comments (2)
internal/integrations/gemini/llm.go (2)

136-136: Consider updating default Gemini model before deprecation.

The default model gemini-2.0-flash-lite is scheduled for deprecation on March 31, 2026 (approximately 6 weeks away). Consider updating to gemini-2.5-flash-lite or gemini-3-flash-preview for longer-term stability.

♻️ Suggested update
 		if selectedModel != "" {
 			client.model = selectedModel
 		} else {
-			client.model = "gemini-2.0-flash-lite"
+			client.model = "gemini-2.5-flash-lite"
 		}

445-452: Minor: Bracket matching heuristic may extract invalid JSON.

The function finds the first { or [ and last } or ] independently, which could extract invalid JSON for malformed responses (e.g., {data] would extract {data]). Since this is a fallback for error recovery and LLMs typically produce valid JSON, the impact is minimal.

♻️ Optional: verify bracket type matches
 func extractJSONObject(text string) string {
 	first := strings.IndexAny(text, "{[")
-	lastObj := strings.LastIndexAny(text, "}]")
-	if first >= 0 && lastObj > first {
+	if first < 0 {
+		return ""
+	}
+	openChar := text[first]
+	var closeChar byte
+	if openChar == '{' {
+		closeChar = '}'
+	} else {
+		closeChar = ']'
+	}
+	lastObj := strings.LastIndexByte(text, closeChar)
+	if lastObj > first {
 		return strings.TrimSpace(text[first : lastObj+1])
 	}
 	return ""
 }

@nick1udwig mentioned this pull request on Feb 16, 2026
@Kavirubc (Member) commented:

Hi @nick1udwig,

I will resolve the conflicts and merge. The changes look good, but I need to make some minor adjustments, which I will do in a later PR.

Thank you.

@Kavirubc (Member) left a review:

Thank you.

Kavirubc added a commit that referenced this pull request Feb 17, 2026
Adds OpenAI as an alternative AI provider alongside Gemini.
Provider selection is automatic: Gemini is preferred when both keys
are set; OpenAI is used when only OPENAI_API_KEY is present.

Covers embeddings, LLM, config, docs, workflow examples, and CI.
Conflict with exponential-backoff retry logic resolved: both
embedGemini and embedOpenAI are now wrapped in withRetry.

Closes #42
Co-authored-by: nick1udwig <nick1udwig@users.noreply.github.com>
@Kavirubc Kavirubc merged commit e5f2bd5 into similigh:main Feb 17, 2026
5 of 6 checks passed
@coderabbitai bot mentioned this pull request on Feb 18, 2026


Development

Successfully merging this pull request may close these issues.

[0.2.0v][Feature] OpenAI provider support for embeddings and LLM

2 participants