
Support for on-demand capacity #86

Closed
perezpaya opened this issue Dec 24, 2018 · 6 comments

Comments

@perezpaya

I've been trying to export data from a DynamoDB table using Hive.

Running a simple Hive script, select * from tableName, I encounter a read capacity error. The connector calculates its read capacity as a percentage of the table's provisioned throughput.

This script works great when the DynamoDB table is in provisioned capacity mode, but when the table is switched to on-demand mode I encounter the capacity error.
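
For context, here is a minimal sketch of why a percent-of-provisioned calculation breaks for on-demand tables (names are illustrative, not the connector's actual code): DescribeTable reports a provisioned read capacity of 0 for an on-demand table, so any percentage of it is 0, which trips the less-than-1 check.

    // Illustrative sketch, not the connector's code: percent-of-provisioned
    // read-rate calculation, which yields 0 for on-demand tables.
    public final class ReadRateSketch {

        static long calculateReadRate(long provisionedReadCapacity, double readPercent) {
            // On-demand tables report 0 provisioned read capacity via DescribeTable,
            // so provisioned * percent is 0 regardless of the configured percentage.
            long allocated = (long) (provisionedReadCapacity * readPercent);
            if (allocated < 1) {
                throw new RuntimeException(
                    "Read throughput should not be less than 1. Read throughput percent: " + allocated);
            }
            return allocated;
        }

        public static void main(String[] args) {
            System.out.println(calculateReadRate(100, 0.5)); // provisioned table: prints 50
            calculateReadRate(0, 0.5);                       // on-demand table: throws
        }
    }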

Could you add a configuration setting that supports tables with on-demand capacity and lets us specify the maximum throughput the EMR cluster should use?

Thank you

@perezpaya
Author

#85

@feichashao

feichashao commented Dec 29, 2018

Could I know the current status of this issue? I see PR #85 is closed but not merged.

My customer and I are hitting this issue as well. A query against an on-demand DynamoDB table fails with the exception below.

    > SELECT DISTINCT feature_class
    > FROM ddb_features_od
    > ORDER BY feature_class;

Status: Failed
Vertex failed, vertexName=Map 1, vertexId=vertex_1545727109987_0008_3_00, diagnostics=[Vertex vertex_1545727109987_0008_3_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: ddb_features_od initializer failed, vertex=vertex_1545727109987_0008_3_00 [Map 1], java.lang.RuntimeException: Read throughput should not be less than 1. Read throughput percent: 0.0

        at org.apache.hadoop.dynamodb.read.AbstractDynamoDBInputFormat.getSplits(AbstractDynamoDBInputFormat.java:66)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:442)
        at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:561)
        at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:196)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

@feichashao
Copy link

Also, the exception message itself is misleading.

throw new RuntimeException("Read throughput should not be less than 1. Read throughput "
    + "percent: " + maxReadThroughputAllocated);

The value maxReadThroughputAllocated is read from dynamodb.throughput.read, which is an absolute capacity value, not a 'percent'.
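
For comparison, a sketch of how the check could report the right unit; dynamodb.throughput.read is the connector's setting, but the default value and surrounding code here are illustrative:

    // Illustrative sketch: read the absolute throughput setting and report it
    // as an absolute value, not a "percent", when validation fails.
    import org.apache.hadoop.conf.Configuration;

    public final class ThroughputConfigSketch {

        static long readThroughput(Configuration conf) {
            // Absolute read capacity units, not a percentage.
            long maxReadThroughputAllocated = conf.getLong("dynamodb.throughput.read", 100L);
            if (maxReadThroughputAllocated < 1) {
                throw new RuntimeException("Read throughput should not be less than 1. "
                    + "Read throughput: " + maxReadThroughputAllocated);
            }
            return maxReadThroughputAllocated;
        }
    }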

@anandhs

anandhs commented Jan 2, 2019

+1, we have been waiting for this feature to be released too.

@danielhaviv

It is being worked on in PR #88.

@taklwu
Contributor

taklwu commented Jan 29, 2019

#88 has been merged as dca6c75, so resolving this issue as well.
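
Roughly, the fix detects on-demand billing and falls back to a default read capacity instead of taking a percentage of zero. Here is a sketch of that approach; the default value and names are illustrative, not necessarily what dca6c75 uses:

    import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
    import com.amazonaws.services.dynamodbv2.model.BillingModeSummary;
    import com.amazonaws.services.dynamodbv2.model.TableDescription;

    public final class OnDemandFallbackSketch {

        // Illustrative default; the actual constant in the merged fix may differ.
        private static final long DEFAULT_READ_CAPACITY_FOR_ON_DEMAND = 40_000L;

        static long effectiveReadCapacity(AmazonDynamoDB client, String tableName) {
            TableDescription table = client.describeTable(tableName).getTable();
            BillingModeSummary billing = table.getBillingModeSummary();
            if (billing != null && "PAY_PER_REQUEST".equals(billing.getBillingMode())) {
                // On-demand: there is no provisioned capacity to take a
                // percentage of, so fall back to a fixed default instead of failing.
                return DEFAULT_READ_CAPACITY_FOR_ON_DEMAND;
            }
            return table.getProvisionedThroughput().getReadCapacityUnits();
        }
    }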

@taklwu taklwu closed this as completed Jan 29, 2019