
[help]: Pipeline error: The length of the values Array needs to be a multiple  #2266

@b1gcat

Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

### Describe the issue

It's a great project, but I have run into a problem and would appreciate some help.

After 'init', I am using the following chat and embedding models:

NAME                    ID              SIZE      MODIFIED
qwen3.5:cloud           a7bf6f7891c3    -         3 hours ago
qwen3-embedding:0.6b    ac6da0dfba84    639 MB    6 hours ago

Next, I ran markitdown xxxx.docx -o input/report.md to convert the document, and then ran:

graphrag index --verbose


The run failed at the start of the generate_text_embeddings workflow:

Starting workflow: create_community_reports
24 / 24 ..........................................................................................
Workflow complete: create_community_reports
WorkflowFunctionOutput(result= id human_readable_id community ... full_content_json period size
0 313abae31643604bbdae091eabbde0358ed3285425868a... 22 22 ... {\n "title": "Cloud Computing Platform and ... 2026-03-05 6
1 4baf1c686ca534cdbd588100f72312081c8604641a238f... 23 23 ... {\n "title": "Cloud Service Customer and Cl... 2026-03-05 5
2 90399b5b0eeff524152242ffba57f631d06e613e49205d... 8 8 ... {\n "title": "Access Control and Business A... 2026-03-05 4
3 851858884b531363dc6cb68287503988301a7e168d76f6... 9 9 ... {\n "title": "User Identity Authentication ... 2026-03-05 3
4 afa2a54f3124b1fe45804dccf12c3030b421e05d9dc400... 10 10 ... {\n "title": "Test Object and Data Leakage ... 2026-03-05 8
5 bc04d1054d54d3f81b802df2dc4dcfaf3e5dd8d70446c3... 12 12 ... {\n "title": "Cloud Service Customer and Co... 2026-03-05 11
6 9a7bacde60b499079b535fbd6959156304a67b14b826a5... 13 13 ... {\n "title": "Cloud Service Provider and Se... 2026-03-05 2
7 8a4ea1998da3eb55f4f8e254a4c8e3a49c3be2e8cc2d8e... 14 14 ... {\n "title": "Network Attack and Cloud Secu... 2026-03-05 4
8 a4eb34f7187e0391eaf6879357c0669facf8a37abe9da1... 15 15 ... {\n "title": "ATTACKER and ADMINISTRATOR Co... 2026-03-05 2
9 dc0b62f79106e536539cbb24ba3483c2a57d0a8c1a1eaf... 16 16 ... {\n "title": "XXX System Security Assessmen... 2026-03-05 6
10 74b77e558aab7035937e76826503bdbb81599e6a8661fc... 17 17 ... {\n "title": "Internal Network and Lateral ... 2026-03-05 2
11 555a214dc05fe3ff1822a2e5992e8aad5a22f33c9fece7... 19 19 ... {\n "title": "Big Data Application and Thir... 2026-03-05 2
12 25d56bb34d7c29e5f16fead576fa3b94a7cfffc05ee97c... 20 20 ... {\n "title": "Big Data Platform and Network... 2026-03-05 6
13 d3b9f5ee75d5b4887b5909162cf7d708e1306801536173... 21 21 ... {\n "title": "Level Assessment Project and ... 2026-03-05 4
14 ef803d0b763a46b49100e26345fcda83ab3ba1ec072a5b... 0 0 ... {\n "title": "Test Object Security Assessme... 2026-03-05 17
15 f1d586879d81036c634696479e6c1eb43ad83a7ff0ff36... 1 1 ... {\n "title": "Cloud Service Customer and Co... 2026-03-05 17
16 4488006bfdd32ec4fb0c4f542e2d514ebedd82902b4a57... 2 2 ... {\n "title": "Security Level Assessment and... 2026-03-05 5
17 67808773301a407e71796d18a1ec0a47000c0a65cac345... 3 3 ... {\n "title": "Assessment Agency and Single ... 2026-03-05 7
18 3aecdbb3428e06eda06062693e497b903e487c12393c33... 4 4 ... {\n "title": "Assessment Institution and Se... 2026-03-05 5
19 0b7c74a9c19b72021d63beb91779b3b52950852c9d2e36... 5 5 ... {\n "title": "XXX System Security Assessmen... 2026-03-05 13
20 ce95d2fea415f99cc1804bcf5d8ff130db3209b457490f... 6 6 ... {\n "title": "Big Data Platform Security As... 2026-03-05 12
21 42212a6744bd770346b291e710804a7681a4efde69c80d... 7 7 ... {\n "title": "Unauthorized Personnel and Id... 2026-03-05 2

[22 rows x 15 columns], stop=False)
Starting workflow: generate_text_embeddings
Pipeline error: The length of the values Array needs to be a multiple of the list_size..............
Pipeline complete
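
The wording of the error appears to match the check Arrow applies when a flat array of values is split into fixed-size lists. The sketch below reproduces that arithmetic in plain Python. It rests on an assumption the log does not confirm: that the vector store expects one vector length while the embedding model returns another. The function name `split_fixed_size` and the sizes 1024 and 1536 are hypothetical and chosen only for illustration; the row count 22 is taken from the community reports table above.

```python
def split_fixed_size(values, list_size):
    """Split a flat list into rows of exactly list_size items.

    Raises ValueError when the flat length is not a multiple of list_size,
    which is the condition named in the error message.
    """
    if list_size <= 0:
        raise ValueError("list_size must be positive")
    if len(values) % list_size != 0:
        raise ValueError(
            f"length {len(values)} is not a multiple of list_size {list_size}"
        )
    return [values[i:i + list_size] for i in range(0, len(values), list_size)]


ROWS = 22            # rows reported for create_community_reports in the log
MODEL_DIM = 1024     # hypothetical length of each embedding vector
EXPECTED_DIM = 1536  # hypothetical vector length the store was created with

# All embeddings flattened into one list, as a columnar store would hold them.
flat = [0.0] * (ROWS * MODEL_DIM)

# Splitting by the model's own dimension recovers one row per embedding.
rows_ok = split_fixed_size(flat, MODEL_DIM)

# Splitting by a different expected dimension fails the multiple-of check.
try:
    split_fixed_size(flat, EXPECTED_DIM)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

print(len(rows_ok))
print(mismatch_error)
```

If this assumption holds, the fix would be to make the two lengths agree rather than to change the input documents. Note that the check can also pass by accident when the row count happens to make the total a multiple of the expected size, in which case the rows would be silently misaligned.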


### Steps to reproduce

_No response_

### GraphRAG Config Used

```yaml
# Paste your config here
### This config file contains required core defaults that must be set, along with a handful of common optional settings.
### For a full list of available settings, see https://microsoft.github.io/graphrag/config/yaml/

### LLM settings ###
## There are a number of settings to tune the threading and token limits for LLM calls - check the docs.

completion_models:
  default_completion_model:
    model_provider: openai
    model: qwen3.5:cloud
    api_base: http://localhost:11434/v1
    auth_method: api_key # or azure_managed_identity
    api_key: ${GRAPHRAG_API_KEY} # set this in the generated .env file, or remove if managed identity
    retry:
      type: exponential_backoff

embedding_models:
  default_embedding_model:
    model_provider: openai
    model: qwen3-embedding:0.6b
    api_base: http://localhost:11434/v1
    auth_method: api_key
    api_key: ${GRAPHRAG_API_KEY}
    retry:
      type: exponential_backoff

### Document processing settings ###

input:
  type: text # [csv, text, json, jsonl]

chunking:
  type: tokens
  size: 1200
  overlap: 100
  encoding_model: o200k_base

### Storage settings ###
## If blob storage is specified in the following four sections,
## connection_string and container_name must be provided

input_storage:
  type: file # [file, blob, cosmosdb]
  base_dir: "input"

output_storage:
  type: file # [file, blob, cosmosdb]
  base_dir: "output"

reporting:
  type: file # [file, blob]
  base_dir: "logs"

cache:
  type: json # [json, memory, none]
  storage:
    type: file # [file, blob, cosmosdb]
    base_dir: "cache"

vector_store:
  type: lancedb
  db_uri: output/lancedb

### Workflow settings ###

embed_text:
  embedding_model_id: default_embedding_model

extract_graph:
  completion_model_id: default_completion_model
  prompt: "prompts/extract_graph.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 1

summarize_descriptions:
  completion_model_id: default_completion_model
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

extract_graph_nlp:
  text_analyzer:
    extractor_type: regex_english # [regex_english, syntactic_parser, cfg]

cluster_graph:
  max_cluster_size: 10

extract_claims:
  enabled: false
  completion_model_id: default_completion_model
  prompt: "prompts/extract_claims.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 1

community_reports:
  completion_model_id: default_completion_model
  graph_prompt: "prompts/community_report_graph.txt"
  text_prompt: "prompts/community_report_text.txt"
  max_length: 2000
  max_input_length: 8000

snapshots:
  graphml: false
  embeddings: false

### Query settings ###
## The prompt locations are required here, but each search method has a number of optional knobs that can be tuned.
## See the config docs: https://microsoft.github.io/graphrag/config/yaml/#query

local_search:
  completion_model_id: default_completion_model
  embedding_model_id: default_embedding_model
  prompt: "prompts/local_search_system_prompt.txt"

global_search:
  completion_model_id: default_completion_model
  map_prompt: "prompts/global_search_map_system_prompt.txt"
  reduce_prompt: "prompts/global_search_reduce_system_prompt.txt"
  knowledge_prompt: "prompts/global_search_knowledge_system_prompt.txt"

drift_search:
  completion_model_id: default_completion_model
  embedding_model_id: default_embedding_model
  prompt: "prompts/drift_search_system_prompt.txt"
  reduce_prompt: "prompts/drift_search_reduce_prompt.txt"

basic_search:
  completion_model_id: default_completion_model
  embedding_model_id: default_embedding_model
  prompt: "prompts/basic_search_system_prompt.txt"
```
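
The `embedding_models` entry in the config above points at an OpenAI-compatible endpoint, so one way to narrow the problem down is to ask that endpoint for a single embedding and count its elements. The following is a minimal sketch using only the standard library. The helper names are hypothetical, the placeholder API key `"ollama"` is an assumption, and the response shape (`data[0].embedding`) is the one used by the OpenAI embeddings API, which the local server is assumed to follow.

```python
import json
import urllib.request


def build_embedding_request(api_base, model, text, api_key="ollama"):
    """Build a POST request for an OpenAI-compatible /embeddings endpoint."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url=api_base.rstrip("/") + "/embeddings",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def embedding_dimension(response_body):
    """Return the vector length from an OpenAI-style embeddings response."""
    return len(response_body["data"][0]["embedding"])


def probe_dimension(api_base, model, api_key="ollama"):
    """Send one short input to the server and return the vector length.

    Requires the embedding server to be running; not called in this sketch.
    """
    request = build_embedding_request(api_base, model, "dimension probe", api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return embedding_dimension(json.load(response))
```

With the server running, `probe_dimension("http://localhost:11434/v1", "qwen3-embedding:0.6b")` would return the length of the vectors the model produces. That number can then be compared with the vector length the vector store is configured to hold.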

### Logs and screenshots

_No response_

### Additional Information

  • GraphRAG Version:
  • Operating System:
  • Python Version:
  • Related Issues:
