Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clickhouse Crashes When Trying To Read google/protobuf/struct.proto #62467

Closed
jonny7 opened this issue Apr 10, 2024 · 9 comments · Fixed by #62506
Closed

Clickhouse Crashes When Trying To Read google/protobuf/struct.proto #62467

jonny7 opened this issue Apr 10, 2024 · 9 comments · Fixed by #62506
Assignees
Labels
bug Confirmed user-visible misbehaviour in official release crash Crash / segfault / abort

Comments

@jonny7
Copy link

jonny7 commented Apr 10, 2024

Hi

I am using Clickhouse with Kafka & Protobuf, but it seems that the google/protobuf/struct.proto causes Clickhouse to crash. This is happening on Docker in MacOS on ARM and also in Kubernetes in GKE, also on ARM. I have tried this on both the 24.3.2.23 and latest clickhouse versions. I am also aware that these were relatively recently added to Clickhouse through this #56741, which includes support for struct.proto. But it seems that Clickhouse doesn't like it. I can outline the steps and reproducible steps.

Create Dockerfile

version: '3.7'

volumes:
  clickhouse: null

services:
  clickhouse:
    container_name: clickhouse
    image: clickhouse/clickhouse-server:24.3.2.23
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    ports:
      - "18123:8123"
      - "9000:9000"
    volumes:
      - clickhouse:/var/lib/clickhouse/
      - ./a.proto:/var/lib/clickhouse/format_schemas/a.proto
      - ./b.proto:/var/lib/clickhouse/format_schemas/b.proto

Add these schemas and mount them
a.proto

syntax = "proto3";

package events;
option go_package = "/generated";

import "google/protobuf/struct.proto";

message Request {
  string id = 1;
  optional google.protobuf.Struct ext = 2;
}

b.proto

syntax = "proto3";

import "a.proto";

package analytics;
option go_package = "/analytics";

message AnotherEvent {
  message Metadata {
    string id = 1;
  }
  events.Request embeded = 2;
}

Try and describe the schema:

DESC file('nonexist', 'Protobuf') SETTINGS format_schema='b.proto:AnotherEvent';

The system will crash

2024.04.10 00:53:23.312580 [ 47 ] {} <Trace> TCP-Session: 9039c114-34cf-4a2b-8ede-8b9124996864 Creating query context from session context, user_id: 94309d50-4f52-5250-31bd-74fecac179db, parent context user: default
2024.04.10 00:53:23.312845 [ 47 ] {0ef9e17a-0f1b-4f43-8422-d0853b60ad70} <Debug> executeQuery: (from [::1]:47784) DESC file('nonexist', 'Protobuf') SETTINGS format_schema='b.proto:AnotherEvent'; (stage: Complete)
2024.04.10 00:53:37.936198 [ 1 ] {} <Information> SentryWriter: Sending crash reports is disabled
2024.04.10 00:53:37.936245 [ 1 ] {} <Trace> Pipe: Pipe capacity is 1.00 MiB
2024.04.10 00:53:37.989110 [ 1 ] {} <Information> Application: Starting ClickHouse 24.3.2.23 (revision: 54484, git hash: 8b7d910960cc2c6a0db07991fe2576a67fe98146, build id: 5B5C43F049E2D5125E894547577CDA8EEC25B3C7), PID 1
2024.04.10 00:53:37.989232 [ 1 ] {} <Information> Application: starting up

Expected behavior
That Clickhouse parses the schema correctly and at least doesn't crash.

@jonny7 jonny7 added the potential bug To be reviewed by developers and confirmed/rejected. label Apr 10, 2024
@Algunenano
Copy link
Member

Can you share the logs of the crash?

All I get is:

DESCRIBE TABLE file('nonexist', 'Protobuf')
SETTINGS format_schema = 'b.proto:AnotherEvent'

Query id: 8c9c5419-a7e4-41bb-9a0f-bdc90a622ea1


Elapsed: 0.344 sec. 

Received exception:
Code: 434. DB::Exception: Cannot parse 'b.proto' file, found an error at line 11, column 2, "events.Request" is not defined.: The table structure cannot be extracted from a Protobuf format file. You can specify the structure manually. (CANNOT_PARSE_PROTOBUF_SCHEMA)

Which is not a crash.

@Algunenano Algunenano added the st-need-info We need extra data to continue (waiting for response) label Apr 10, 2024
@Algunenano
Copy link
Member

Ok, I was missing the default protos. With them it reproduces an infinite loop in getNameAndDataTypeFromField

@Algunenano Algunenano added bug Confirmed user-visible misbehaviour in official release crash Crash / segfault / abort and removed st-need-info We need extra data to continue (waiting for response) potential bug To be reviewed by developers and confirmed/rejected. labels Apr 10, 2024
@Algunenano
Copy link
Member

The problem comes from how struct.proto is defined and how we get data types:

message Struct {
  // Unordered map of dynamically typed values.
  map<string, Value> fields = 1;
}

// `Value` represents a dynamically typed value which can be either
// null, a number, a string, a boolean, a recursive struct value, or a
// list of values. A producer of value is expected to set one of these
// variants. Absence of any variant indicates an error.
//
// The JSON representation for `Value` is JSON value.
message Value {
  // The kind of value.
  oneof kind {
    // Represents a null value.
    NullValue null_value = 1;
    // Represents a double value.
    double number_value = 2;
    // Represents a string value.
    string string_value = 3;
    // Represents a boolean value.
    bool bool_value = 4;
    // Represents a structured value.
    Struct struct_value = 5;
    // Represents a repeated `Value`.
    ListValue list_value = 6;
  }
}

// `NullValue` is a singleton enumeration to represent the null value for the
// `Value` type union.
//
// The JSON representation for `NullValue` is JSON `null`.
enum NullValue {
  // Null value.
  NULL_VALUE = 0;
}

// `ListValue` is a wrapper around a repeated field of values.
//
// The JSON representation for `ListValue` is JSON array.
message ListValue {
  // Repeated field of dynamically typed values.
  repeated Value values = 1;
}

When reviewing the types we are doing: Value -> ListValue -> Value -> ListValue -> Value eternally (until we run out of stack)

@Algunenano
Copy link
Member

I'm not sure we can support recursive structures in the current CH type system.

Maybe we could detect if recursion is found and throw in those cases. WDYT @Avogar?

@Avogar
Copy link
Member

Avogar commented Apr 10, 2024

I'm not sure we can support recursive structures in the current CH type system.

I also think so.

Maybe we could detect if recursion is found and throw in those cases.

Yes, let's do it.

@jonny7
Copy link
Author

jonny7 commented Apr 10, 2024

The problem comes from how struct.proto is defined and how we get data types:

message Struct {
  // Unordered map of dynamically typed values.
  map<string, Value> fields = 1;
}

// `Value` represents a dynamically typed value which can be either
// null, a number, a string, a boolean, a recursive struct value, or a
// list of values. A producer of value is expected to set one of these
// variants. Absence of any variant indicates an error.
//
// The JSON representation for `Value` is JSON value.
message Value {
  // The kind of value.
  oneof kind {
    // Represents a null value.
    NullValue null_value = 1;
    // Represents a double value.
    double number_value = 2;
    // Represents a string value.
    string string_value = 3;
    // Represents a boolean value.
    bool bool_value = 4;
    // Represents a structured value.
    Struct struct_value = 5;
    // Represents a repeated `Value`.
    ListValue list_value = 6;
  }
}

// `NullValue` is a singleton enumeration to represent the null value for the
// `Value` type union.
//
// The JSON representation for `NullValue` is JSON `null`.
enum NullValue {
  // Null value.
  NULL_VALUE = 0;
}

// `ListValue` is a wrapper around a repeated field of values.
//
// The JSON representation for `ListValue` is JSON array.
message ListValue {
  // Repeated field of dynamically typed values.
  repeated Value values = 1;
}

When reviewing the types we are doing: Value -> ListValue -> Value -> ListValue -> Value eternally (until we run out of stack)

Thanks @Algunenano for looking at this, glad that my instructions meant you could reproduce the issue 🙂 and for the PR fix

@jonny7
Copy link
Author

jonny7 commented May 1, 2024

@Algunenano I was trying this in the new 24.x version released yesterday. I'm not sure how Clickhouse does it's releasing, should this be fixed within this new v24.x? I'm still seeing in docker 24.2.3.70

@den-crane
Copy link
Contributor

@jonny7

v24.4.1.2088-stable.md:* Avoid crash when reading protobuf with recursive types #62506 (Raúl Marín).

fixed only in 24.4.1.2088
not backported into 24.2

@jonny7
Copy link
Author

jonny7 commented May 1, 2024

ahh, thanks @den-crane

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Confirmed user-visible misbehaviour in official release crash Crash / segfault / abort
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants