[ERROR] Numeric columns do not support multivalue rows #7086

Closed

manjudr opened this issue Feb 18, 2019 · 6 comments

manjudr commented Feb 18, 2019

Druid fails to ingest nested object values via Kafka with the ingestion spec and test input data below.

Sample input data:

{
    "edata": {
        "visits": [
            {
                "objid": "312583946656169984213975",
                "index": 0
            },
            {
                "objid": "312583929625190400114333",
                "index": 1
            },
            {
                "objid": "31261937011242598421918",
                "index": 2
            }
        ]
    }
}

Kafka ingestion spec:

{
  "type": "kafka",
  "dataSchema": {
    "dataSource": "datasource-1",
    "parser": {
      "type": "string",
      "parseSpec": {
        "format": "json",
        "flattenSpec": {
          "useFieldDiscovery": false,
          "fields": [
            {
              "type": "path",
              "name": "edata_visits_index",
              "expr": "$.edata.visits[*].index"
            }
          ]
        },
        "dimensionsSpec": {
          "dimensions": [
            {
              "type": "long",
              "name": "edata_visits_index"
            }
          ],
          "dimensionsExclusions": []
        },
        "timestampSpec": {
          "column": "ets",
          "format": "auto"
        }
      }
    },
    "metricsSpec": [],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "day",
      "queryGranularity": "none",
      "rollup": false
    }
  },
  "ioConfig": {
    "topic": "telemetry",
    "consumerProperties": {
      "bootstrap.servers": "localhost:9092"
    },
    "taskCount": 2,
    "replicas": 1,
    "taskDuration": "PT100S",
    "useEarliestOffset": true
  },
  "tuningConfig": {
    "type": "kafka",
    "reportParseExceptions": false
  }
}

Error output:

2019-02-11T12:05:50,450 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.kafka.IncrementalPublishingKafkaIndexTaskRunner - Encountered exception while running task.
java.lang.UnsupportedOperationException: Numeric columns do not support multivalue rows.
    at org.apache.druid.segment.LongDimensionIndexer.processRowValsToUnsortedEncodedKeyComponent(LongDimensionIndexer.java:45) ~[druid-processing-0.13.0-incubating.jar:0.13.0-incubating]
    at org.apache.druid.segment.LongDimensionIndexer.processRowValsToUnsortedEncodedKeyComponent(LongDimensionIndexer.java:37) ~[druid-processing-0.13.0-incubating.jar:0.13.0-incubating]
    at org.apache.druid.segment.incremental.IncrementalIndex.toIncrementalIndexRow(IncrementalIndex.java:674) ~[druid-processing-0.13.0-incubating.jar:0.13.0-incubating]
    at org.apache.druid.segment.incremental.IncrementalIndex.add(IncrementalIndex.java:609) ~[druid-processing-0.13.0-incubating.jar:0.13.0-incubating]
    at org.apache.druid.segment.realtime.plumber.Sink.add(Sink.java:181) ~[druid-server-0.13.0-incubating.jar:0.13.0-incubating]
    at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.add(AppenderatorImpl.java:246) ~[druid-server-0.13.0-incubating.jar:0.13.0-incubating]
    at org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver.append(BaseAppenderatorDriver.java:403) ~[druid-server-0.13.0-incubating.jar:0.13.0-incubating]
    at org.apache.druid.segment.realtime.appenderator.StreamAppenderatorDriver.add(StreamAppenderatorDriver.java:180) ~[druid-server-0.13.0-incubating.jar:0.13.0-incubating]

Is something wrong with my ingestion spec? Why are we getting this error?

manjudr commented Feb 18, 2019

Things I have tried:

  1. When I changed $.edata.visits[*].index to $.edata.visits[0].index, I was able to index the data into Druid without any error. But what if I want to index all values from the visits list, without referring to a specific index of the list?

     "fields": [
       {
         "type": "path",
         "name": "edata_visits_index",
         "expr": "$.edata.visits[0].index"
       }
     ]
  2. When I changed the data type from long to string, I was able to index all visits list values into Druid, but in string format. What if I want to index all visits values as a long/integer data type?

     "dimensions": [
       {
         "type": "string",
         "name": "edata_visits_index"
       }
     ]

But if I try to index all visits values as the long data type, it throws the java.lang.UnsupportedOperationException: Numeric columns do not support multivalue rows error.

Is there any workaround to index all visits values in long/integer format?
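(A possible workaround, sketched here rather than taken from the thread: if the visits array has a small, bounded length, each position can be flattened into its own long column, building on the [0]-index result above. The edata_visits_index_0/_1/_2 names are hypothetical, and rows with fewer visits would presumably leave the higher-numbered columns empty.)

     "flattenSpec": {
       "useFieldDiscovery": false,
       "fields": [
         { "type": "path", "name": "edata_visits_index_0", "expr": "$.edata.visits[0].index" },
         { "type": "path", "name": "edata_visits_index_1", "expr": "$.edata.visits[1].index" },
         { "type": "path", "name": "edata_visits_index_2", "expr": "$.edata.visits[2].index" }
       ]
     },
     "dimensionsSpec": {
       "dimensions": [
         { "type": "long", "name": "edata_visits_index_0" },
         { "type": "long", "name": "edata_visits_index_1" },
         { "type": "long", "name": "edata_visits_index_2" }
       ]
     }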

manjudr changed the title from "Numeric columns do not support multivalue rows" to "[ERROR] Numeric columns do not support multivalue rows" on Feb 19, 2019
gianm commented Feb 20, 2019

@manjudr, it is true that numeric columns do not support multivalue rows at this time. Storing them as strings is probably your best bet; alternatively, you could take up the effort to propose and drive multivalue numeric column support.

anandp504 commented

@gianm Thanks for your reply. We tried using a transformation on the multi-valued field to convert it to numeric values: cast(expr, 'LONG'). Is that the right way to do it? Or should we use an extractionFn with JavaScript code to convert the values to numeric?
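(For reference, the extractionFn route mentioned above would look roughly like the query-time dimension spec below. This is a sketch, not something from the thread: the output name visits_index is hypothetical, JavaScript functions have to be enabled with druid.javascript.enabled=true, and extraction functions return strings, so this still would not give a long-typed column.)

     "dimensions": [
       {
         "type": "extraction",
         "dimension": "edata_visits_index",
         "outputName": "visits_index",
         "extractionFn": {
           "type": "javascript",
           "function": "function(value) { return parseInt(value, 10); }"
         }
       }
     ]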

manjudr commented Feb 26, 2019

@gianm
Though we tried a transformSpec using the cast function, we still couldn't achieve it.

"transformSpec": {
        "transforms": [
          {
            "type": "expression",
            "name": "edata_visits_index",
            "expression": "cast(edata_visits_index,'LONG')"
          }
        ]
}

Output:

[
  {
    "segmentId": "datasource-events-33_2019-11-07T00:00:00.000Z_2019-11-08T00:00:00.000Z_2019-02-26T07:42:38.349Z",
    "columns": [
      "edata_visits_index"
    ],
    "events": [
      {
        "__time": 1573146630927,
        "edata_visits_index": "0"
      }
    ]
  }
]

Is there any other alternative solution to convert the string value into the long type?

stale bot commented Dec 3, 2019

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

stale bot added the stale label on Dec 3, 2019
stale bot commented Dec 31, 2019

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.

stale bot closed this as completed on Dec 31, 2019