Allow to configure insert into Hive partition via configuration property #4999
a373929 to bec6d28
@electrum Would you please take a look?
55fa327 to 92b56bc
@PostConstruct
public void validate()
{
    InsertExistingPartitionsBehavior.validate(insertExistingPartitionsBehavior, immutablePartitions);
Will we deprecate the immutable partitions property?
No. hive.immutable-partitions=true is what prevents hive.insert-existing-partitions-behavior from being set to APPEND via session properties.
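The guard described in this reply could look roughly like the following. This is a minimal sketch, not the code under review: the enum name PartitionInsertBehavior and the isAllowed/validate helpers are hypothetical, and it assumes only what the thread states, namely that APPEND is the one value immutable partitions must reject.

```java
// Hypothetical stand-in for InsertExistingPartitionsBehavior; names are illustrative only.
enum PartitionInsertBehavior
{
    ERROR, APPEND, OVERWRITE;

    // Assumption: with immutable partitions, only APPEND is rejected.
    static boolean isAllowed(PartitionInsertBehavior value, boolean immutablePartitions)
    {
        return !(immutablePartitions && value == APPEND);
    }

    // Returns the value unchanged when it is allowed, otherwise fails fast.
    static PartitionInsertBehavior validate(PartitionInsertBehavior value, boolean immutablePartitions)
    {
        if (!isAllowed(value, immutablePartitions)) {
            throw new IllegalArgumentException(
                    value + " is not allowed when partitions are immutable");
        }
        return value;
    }
}
```

Under this assumption the same check can serve both the config-level validation and the session property decoder, so a session cannot relax what the catalog configuration forbids.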
You could extend hive.insert-existing-partitions-behavior with FORCE_ERROR (name subject to discussion), which behaves like ERROR but cannot be overridden by a session property.
Though personally I feel that having separate properties is easier to understand.
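For illustration only, the FORCE_ERROR idea might be sketched as below. Everything here is hypothetical: the enum name ProposedBehavior, the sessionOverridable flag, and the resolve helper are not part of the PR, and rejecting the session value with an exception (rather than ignoring it) is an assumption.

```java
// Hypothetical variant of the enum; FORCE_ERROR is only a name floated in this thread.
enum ProposedBehavior
{
    ERROR(true),
    APPEND(true),
    OVERWRITE(true),
    FORCE_ERROR(false);

    // Whether a session property may replace this configured value.
    private final boolean sessionOverridable;

    ProposedBehavior(boolean sessionOverridable)
    {
        this.sessionOverridable = sessionOverridable;
    }

    // FORCE_ERROR behaves like ERROR when an insert hits an existing partition.
    boolean failsOnExistingPartition()
    {
        return this == ERROR || this == FORCE_ERROR;
    }

    // sessionValue is null when the session did not set the property.
    static ProposedBehavior resolve(ProposedBehavior configured, ProposedBehavior sessionValue)
    {
        if (sessionValue == null) {
            return configured;
        }
        if (!configured.sessionOverridable) {
            throw new IllegalArgumentException(configured + " cannot be overridden by a session property");
        }
        return sessionValue;
    }
}
```

The sketch folds two concerns (what happens on insert, and who may change it) into one enum, which is the trade-off against keeping separate properties.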
For some reason, hive.immutable-partitions=true still allows setting insert-existing-partitions-behavior=OVERWRITE, which is why that is not an option. Apart from that, deprecating the property breaks backward compatibility and requires a longer discussion.
Won't do.
If I read the code correctly, if we now specify hive.immutable-partitions=true and do NOT specify hive.insert-existing-partitions-behavior, Presto will fail to start.
This is a valid setup today. To be consistent with what we have right now, we should default hive.insert-existing-partitions-behavior to ERROR if hive.immutable-partitions is true.
Also, can you add a test for the above?
Good point!
I've tried to keep the old logic for calculating insert-existing-partitions-behavior when hive.insert-existing-partitions-behavior is NOT specified:
- Calculate default value during @PostConstruct phase
private Optional<InsertExistingPartitionsBehavior> insertExistingPartitionsBehavior = Optional.empty();

public InsertExistingPartitionsBehavior getInsertExistingPartitionsBehavior()
{
    return insertExistingPartitionsBehavior.orElseThrow(() -> new NoSuchElementException("insertExistingPartitionsBehavior was not set during presto initialization"));
}

@Config("hive.insert-existing-partitions-behavior")
@ConfigDescription("Default value for insert existing partitions behavior")
public HiveConfig setInsertExistingPartitionsBehavior(InsertExistingPartitionsBehavior insertExistingPartitionsBehavior)
{
    this.insertExistingPartitionsBehavior = Optional.of(requireNonNull(insertExistingPartitionsBehavior, "insertExistingPartitionsBehavior is null"));
    return this;
}

@PostConstruct
public void validate()
{
    if (insertExistingPartitionsBehavior.isEmpty()) {
        insertExistingPartitionsBehavior = getInsertExistingPartitionsBehaviorForBackwardCompatibility();
    }
    else {
        InsertExistingPartitionsBehavior.validate(insertExistingPartitionsBehavior.get(), immutablePartitions);
    }
}

private Optional<InsertExistingPartitionsBehavior> getInsertExistingPartitionsBehaviorForBackwardCompatibility()
{
    return Optional.of(immutablePartitions ? ERROR : APPEND);
}
But this does not seem like a good idea from the framework's point of view (even setting aside the more complicated logic), because the tests are not set up to handle it: TestHiveConfig#testDefaults() does not invoke the @PostConstruct method.
- However, it is possible to move the logic from @PostConstruct to the getter (though the code becomes less clear).
private Optional<InsertExistingPartitionsBehavior> insertExistingPartitionsBehavior = Optional.empty();

public InsertExistingPartitionsBehavior getInsertExistingPartitionsBehavior()
{
    if (insertExistingPartitionsBehavior.isEmpty()) {
        insertExistingPartitionsBehavior = getInsertExistingPartitionsBehaviorForBackwardCompatibility();
    }
    return insertExistingPartitionsBehavior.orElseThrow(() -> new NoSuchElementException("insertExistingPartitionsBehavior was not set during presto initialization"));
}

private Optional<InsertExistingPartitionsBehavior> getInsertExistingPartitionsBehaviorForBackwardCompatibility()
{
    return Optional.of(immutablePartitions ? ERROR : APPEND);
}

@Config("hive.insert-existing-partitions-behavior")
@ConfigDescription("Default value for insert existing partitions behavior")
public HiveConfig setInsertExistingPartitionsBehavior(InsertExistingPartitionsBehavior insertExistingPartitionsBehavior)
{
    this.insertExistingPartitionsBehavior = Optional.of(requireNonNull(insertExistingPartitionsBehavior, "insertExistingPartitionsBehavior is null"));
    return this;
}

@PostConstruct
public void validate()
{
    insertExistingPartitionsBehavior.ifPresent(v -> InsertExistingPartitionsBehavior.validate(v, immutablePartitions));
}
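The difference between the two variants can be reproduced in miniature. The sketch below is a simplified stand-in, not the real HiveConfig: the class names, the SketchBehavior enum, and the postConstruct() method (standing in for the @PostConstruct hook, which only a lifecycle-aware container would call) are all hypothetical. Unlike the getter variant above, the second class derives the default on each call instead of caching it, which is a simplification.

```java
import java.util.NoSuchElementException;
import java.util.Optional;

// Hypothetical miniature of the two variants; not the real HiveConfig.
enum SketchBehavior { ERROR, APPEND, OVERWRITE }

// Variant 1: the default is filled in by a lifecycle hook.
class LifecycleDefaultConfig
{
    boolean immutablePartitions;
    private Optional<SketchBehavior> behavior = Optional.empty();

    // Stands in for @PostConstruct; a test that builds the object with `new` never triggers it.
    void postConstruct()
    {
        if (behavior.isEmpty()) {
            behavior = Optional.of(immutablePartitions ? SketchBehavior.ERROR : SketchBehavior.APPEND);
        }
    }

    SketchBehavior getBehavior()
    {
        return behavior.orElseThrow(() -> new NoSuchElementException("behavior was not initialized"));
    }
}

// Variant 2: the getter derives the default on demand, so no hook is needed.
class GetterDefaultConfig
{
    boolean immutablePartitions;
    private Optional<SketchBehavior> behavior = Optional.empty();

    SketchBehavior getBehavior()
    {
        return behavior.orElse(immutablePartitions ? SketchBehavior.ERROR : SketchBehavior.APPEND);
    }
}
```

A defaults test that only constructs the object and reads the getter fails with variant 1 and passes with variant 2.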
@kokosing I see three possible solutions:
1. Apply the solution proposed by @losipiuk, which changes the default value of hive.insert-existing-partitions-behavior to ERROR. This breaks backward compatibility when immutablePartitions=false (in that case insert-existing-partitions-behavior used to be set to APPEND).
2. Break backward compatibility for one setup that is valid today: hive.immutable-partitions=true with hive.insert-existing-partitions-behavior NOT specified.
3. Implement the logic in the getter.
Which solution would be better?
I'll implement solution 3.
92b56bc to 7ff1101
@zaz968m there are conflicts now. Can you please rebase?
7ff1101 to 9f6c316
9da6b0f to 25c2056
@electrum can you review the docs?
25c2056 to e345b66
Taking back LGTM for now until the last piece of discussion is resolved.
e345b66 to 5848ca6
    assertEquals(APPEND, actual);
}

private Object getDefaultValueInsertExistingPartitionsBehavior(Connector connector)
nit: you can use a concrete type here
It requires an unnecessary explicit cast. I don't think it is worth the additional characters in this case.
However, I would be glad to replace the method's body with a shorter and more understandable implementation, but I can't find one.
5848ca6 to 0b043e3
@electrum all comments are addressed.
0b043e3 to c9ec327
c9ec327 to 5534628
Allow to set default value for hive.insert_existing_partitions_behavior property