Quickwit cannot work using EKS Pod identity #5796

@gn-viet-n

Hi,

I'm deploying Quickwit 0.8.2 using the following Helm chart:

namespace: quickwit
chartURL: https://helm.quickwit.io
chartName: quickwit
chartVersion: 0.7.16

The target cluster is EKS 1.32.

Here are my Helm values:

fullnameOverride: quickwit

serviceAccount:
  create: true
  name: quickwit

searcher:
  replicaCount: 1 # 3

indexer:
  replicaCount: 1 # 3
  serviceAnnotations:
    service.kubernetes.io/topology-mode: auto
    service.kubernetes.io/topology-aware-hints: auto

metastore:
  replicaCount: 1

janitor:
  enabled: true

control_plane:
  enabled: true

environment:
  # Remove ANSI colors.
  NO_COLOR: 1

# Quickwit configuration
config:
  storage:
    # No metastore configuration.
    # By default, metadata is stored on the local disk of the metastore instance.
    # Everything will be lost after a metastore restart.
    s3:
      region: us-west-2
  # Store indexes on S3 instead of the local file system.
  default_index_root_uri: s3://lz-obs-quickwit-stg-fi8s/indexes

  # Indexer settings
  indexer:
    # By activating the OTEL service, Quickwit will be able
    # to receive gRPC requests from OTEL collectors.
    enable_otlp_endpoint: true

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      preference:
        matchExpressions:
        - key: kubernetes.io/arch
          operator: In
          values:
          - arm64
  • I made sure that the serviceaccount quickwit in namespace quickwit is mapped to Pod Identity properly (for bucket arn:aws:s3:::lz-obs-quickwit-stg-fi8s). The role has s3:* permission on that bucket. I manually created a debug pod using this serviceaccount and checked with the AWS CLI, and everything is OK. Pod Identity also works well in this cluster in general (we have other services using it).
  • I also checked that all Quickwit pods use the serviceaccount quickwit, and I can see that EKS Pod Identity has injected its environment variables and mount into the pods.
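
The check in the second bullet can be sketched as a small Python script run inside one of the pods. This assumes Python is available in the container image, and that the variable names below are the ones EKS Pod Identity injects (worth verifying against the AWS documentation). The helper name `pod_identity_env` is hypothetical.

```python
import os
from typing import Mapping, Optional

# Variable names that EKS Pod Identity is expected to inject into the pod
# (an assumption; verify against the AWS documentation).
POD_IDENTITY_VARS = (
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
    "AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE",
)


def pod_identity_env(env: Mapping[str, str]) -> dict[str, Optional[str]]:
    """Return each Pod Identity variable with its value, or None if it is missing."""
    return {name: env.get(name) for name in POD_IDENTITY_VARS}


if __name__ == "__main__":
    # Print what the current process sees, so missing variables stand out.
    for name, value in pod_identity_env(os.environ).items():
        print(f"{name}={value if value is not None else '<not set>'}")
```

If both variables are set, the injection itself worked, and the problem is more likely in how the client consumes them.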

But it doesn't work:

$ kubectl logs -n quickwit quickwit-indexer-0 --since=2m | grep -E "(credentials|upload|s3|Failed)" | tail -10

2025-06-11T09:09:51.050Z ERROR quickwit_actors::actor_handle: actor-exit-without-success actor="quickwit_indexing::actors::sequencer::Sequencer<quickwit_indexing::actors::publisher::Publisher>-dark-ls3R"
2025-06-11T09:09:51.050Z ERROR quickwit_indexing::actors::indexing_pipeline: Indexing pipeline failure. pipeline_id=IndexingPipelineId { node_id: "quickwit-indexer-0", index_uid: IndexUid { index_id: "otel-logs-v0_7", incarnation_id: Ulid(2115176200197033472036477868220519089) }, source_id: "_ingest-api-source", pipeline_uid: Pipeline(01JXF3T3CB32A5R2B3KR4P55HD) } generation=8 healthy_actors=["SourceActor-silent-wQzj", "quickwit_indexing::actors::doc_processor::DocProcessor-divine-xmjw", "Indexer-wispy-vl0Y", "quickwit_indexing::actors::index_serializer::IndexSerializer-solitary-iUbz", "Packager-cold-1IM0", "IndexUploader-cold-ggBH"] failed_or_unhealthy_actors=["quickwit_indexing::actors::sequencer::Sequencer<quickwit_indexing::actors::publisher::Publisher>-dark-ls3R", "Publisher-silent-koym"] success_actors=[]
2025-06-11T09:09:57.104Z  WARN index-doc-batches{index_id=otel-logs-v0_7 source_id=_ingest-api-source pipeline_uid=01JXF3T3CB32A5R2B3KR4P55HD workbench_id=01JXF41GT51BPDG0ABBK3AQ8S7}:uploader: quickwit_indexing::actors::uploader: Failed to upload split. Killing! cause=failed uploading key 01JXF41GT6MTDTPE3QC6QEJ5PR.split in bucket s3://lz-obs-quickwit-stg-fi8s/indexes/otel-logs-v0_7
    0: storage error(kind=Internal, source=failed to construct request: failed to load credentials from the credentials cache: the credentials provider was not properly configured: invalid full URI for ECS provider (URI did not refer to the loopback interface): http://169.254.170.23/v1/credentials (ConstructionFailure(ConstructionFailure { source: CredentialsStageError { source: InvalidConfiguration(InvalidConfiguration { source: "invalid full URI for ECS provider (URI did not refer to the loopback interface): http://169.254.170.23/v1/credentials" }) } })))
    1: failed to construct request: failed to load credentials from the credentials cache: the credentials provider was not properly configured: invalid full URI for ECS provider (URI did not refer to the loopback interface): http://169.254.170.23/v1/credentials (ConstructionFailure(ConstructionFailure { source: CredentialsStageError { source: InvalidConfiguration(InvalidConfiguration { source: "invalid full URI for ECS provider (URI did not refer to the loopback interface): http://169.254.170.23/v1/credentials" }) } })) split_id="01JXF41GT6MTDTPE3QC6QEJ5PR"
2025-06-11T09:09:57.104Z ERROR quickwit_actors::spawn_builder: actor-failure cause=failed to receive command from uploader
    channel closed exit_status=Failure(failed to receive command from uploader
2025-06-11T09:09:57.104Z  INFO quickwit_actors::spawn_builder: actor-exit actor_id=quickwit_indexing::actors::sequencer::Sequencer<quickwit_indexing::actors::publisher::Publisher>-billowing-mll7 exit_status=failure(cause=failed to receive command from uploader
2025-06-11T09:09:57.104Z ERROR quickwit_actors::actor_context: exit activating-kill-switch actor=quickwit_indexing::actors::sequencer::Sequencer<quickwit_indexing::actors::publisher::Publisher>-billowing-mll7 exit_status=Failure(failed to receive command from uploader
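
The error text says the credentials URI "did not refer to the loopback interface". A minimal Python sketch of the kind of rule that message describes is shown here. It is only an illustration of a loopback check, not the AWS SDK's actual code, and the function name `refers_to_loopback` is hypothetical.

```python
import ipaddress
from urllib.parse import urlparse


def refers_to_loopback(uri: str) -> bool:
    """Return True if the URI's host is a loopback address such as 127.0.0.1 or ::1."""
    host = urlparse(uri).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Other hostnames would need DNS resolution; this sketch treats them as non-loopback.
        return False


if __name__ == "__main__":
    # The endpoint from the log above is link-local, not loopback.
    print(refers_to_loopback("http://169.254.170.23/v1/credentials"))
    print(refers_to_loopback("http://127.0.0.1/v1/credentials"))
```

Under such a rule the endpoint from the log, 169.254.170.23, is rejected because it is a link-local address rather than a loopback one, which would match the failure above even though the variables are injected correctly.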

After many attempted fixes that didn't work, I tried an AWS access key with the same policy (keeping all of the config above and only adding the access key and secret key), and it works.

Does anyone have an idea about this?
Thank you so much.

Best Regards,
VietNC

Labels: bug (Something isn't working)
