Skip to content

Model Setup Guide

yaojingang edited this page May 24, 2026 · 3 revisions

Model Setup Guide

This page explains the safest way to connect AI models to GEOFlow and make chat, embedding, RAG, and knowledge chunking work together.

1. Where Model Setup Lives

Admin path:

AI Configurator -> AI Model Settings

This area handles:

  • adding models
  • editing models
  • setting model type
  • configuring API URL and model ID
  • enabling or disabling models
  • testing model connections
  • setting failover priority
  • setting the default embedding model
  • configuring knowledge chunking strategy

2. Supported Provider Types

GEOFlow supports two main integration styles:

  • OpenAI-compatible providers: DeepSeek, OpenAI proxies, Zhipu, Volcengine Ark, MiniMax, Alibaba DashScope, and similar providers.
  • Native Gemini endpoints: for Gemini chat / embedding models without pretending they are OpenAI-compatible routes.

If the provider clearly offers an OpenAI-compatible endpoint, configure it that way. If you use Gemini, use the native Gemini configuration.

3. What You Need to Fill In

In most cases you need at least:

  • model name
  • provider / API URL
  • model ID
  • Bearer Token / API Key
  • model type: chat or embedding

The most common starting point is a chat model. If you want knowledge-base RAG, you also need an embedding model.

4. API URL Rules

GEOFlow supports both:

  • provider base URLs
  • full endpoint URLs

It expands capability-specific paths automatically when the provider only gives a base URL:

  • chat defaults to /v1/chat/completions
  • embedding defaults to /v1/embeddings

It also handles versioned base paths such as:

  • Zhipu /api/paas/v4
  • Volcengine Ark /api/v3

Practical examples:

  • DeepSeek: https://api.deepseek.com
  • OpenAI-compatible proxy: https://example.com/v1
  • Zhipu: https://open.bigmodel.cn/api/paas/v4
  • Volcengine Ark: https://ark.cn-beijing.volces.com/api/v3
  • Gemini: follow the Gemini provider hint in the admin form for API key and model ID

Avoid mixing a full chat endpoint with a provider base URL unless you have verified the provider documentation.

5. Model-selection Advice

At the beginning, it is better to:

  • connect one stable chat model first
  • prove title generation and body generation
  • add an embedding model after the core flow works
  • only then add smart failover and complex provider combinations

Title generation usually benefits from fast models. Long-form writing, complex strategies, and semantic chunk planning can use models selected for cost and context length.

6. Embedding-model Considerations

Embedding models are used for knowledge-base vectorization and RAG retrieval.

Check these points:

  • model type is embedding
  • API key is valid
  • endpoint supports embeddings
  • dimensions are compatible with current vector storage
  • the model is selected as the default embedding model

If the knowledge preview shows chunks but zero vectors, the knowledge base has been split, but embeddings were not written.

7. Knowledge Chunking Strategy

The AI Models page also configures knowledge chunking:

  • Structured rule chunking: recommended default, stable and low-cost.
  • Automatic strategy: GEOFlow chooses from configured behavior.
  • LLM semantic planning: a chat model plans boundaries, then GEOFlow rebuilds final chunks from source text.

Semantic planning is useful for long or structurally complex documents. Use a fast, low-cost model with enough context. If planning fails, GEOFlow falls back to rule chunking.

8. Smart Model Failover

GEOFlow supports two task modes:

  • fixed
  • smart_failover

Use fixed when you want predictable output and cost. Use smart_failover when uptime matters more than keeping every article on the same provider.

Where:

  • fixed always uses the primary model
  • smart_failover tries the next available chat model by priority when the primary one fails

Recommendation: stabilize the main model first, then enable failover, and make priorities explicit.

9. Minimum Post-setup Validation

You should verify at least:

  1. the model can be saved
  2. connection test succeeds
  3. title generation works
  4. body generation works
  5. the embedding model can write knowledge vectors
  6. semantic chunking falls back to rules when planning fails
  7. if failover is enabled, fallback actually works

10. Common Issues

404 from the model endpoint

Common causes:

  • wrong base URL
  • provider does not actually use a /v1-style path
  • wrong model ID
  • Gemini configured through an OpenAI-compatible route by mistake

Timeout

Common causes:

  • reasoning model is too slow for the request settings
  • timeout strategy is too short
  • provider-side network instability
  • semantic chunk planning document is too long

Model saves successfully but tasks still fail

Common causes:

  • wrong model type
  • invalid token
  • quota limits
  • provider response shape mismatch

11. Guiding Principle

In one sentence:

Start with one stable model, validate the workflow, then add complexity.

The value of model integration is not how many providers exist in the UI, but how reliably the workflow runs.

Clone this wiki locally