
[LIVY-572] Avoid usage of spark classes in ColumnBuffer #162

Status: Closed. 2 commits.

Conversation

mgaido91 (Contributor):

What changes were proposed in this pull request?

ColumnBuffers can be created both inside Spark jobs and in the Livy server. The latter happens when operation logs are returned, and before this patch that case failed, because after the refactor in LIVY-503 this code path uses Spark classes. Unfortunately, we have no test coverage for operation-log retrieval, which is why this wasn't spotted earlier.

Since beeline retrieves operation logs for each query, every query run through beeline fails unless livy.server.thrift.logging.operation.enabled is set to false.

How was this patch tested?

Manual tests using beeline.

@mgaido91 mgaido91 requested a review from vanzin March 23, 2019 12:46
```diff
@@ -43,7 +63,13 @@ public void addRow(Object[] fields) {
     }

     for (int i = 0; i < fields.length; i++) {
-      columns[i].add(fields[i]);
+      Object value;
+      if (columnIsString[i]) {
```
Reviewer:
Checking columns[i].getType().equals(DataType.STRING) here avoids the extra boolean array.

(It's also an enum so you could use ==.)

mgaido91 (author):

Yes, I did it that way to avoid repeating the check for every field of each row, and hence to avoid adding overhead to a performance-critical part of the code. That said, if you feel strongly about it I can change it: the overhead is probably very small compared to everything else we need to do, e.g. transferring the data over the network.

Reviewer:

Using == there's actually no extra overhead at all. In fact, given CPU caches and the like, it's probably faster, since there's less risk of a cache miss.
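For context, Java enum constants are unique objects, so comparing them with == is a plain reference check (and Enum.equals is itself a reference comparison). A small illustration, using a stand-in DataType enum rather than Livy's actual class:

```java
public class EnumCompare {
  // Simplified stand-in enum; not Livy's real DataType.
  enum DataType { BOOLEAN, INTEGER, STRING }

  public static void main(String[] args) {
    DataType t = DataType.STRING;
    // == is a single reference comparison; equals() on an enum
    // delegates to the same reference comparison, so both agree.
    System.out.println(t == DataType.STRING);       // true
    System.out.println(t.equals(DataType.STRING));  // true
    // Unlike equals(), == is null-safe and type-checked at compile time.
    DataType none = null;
    System.out.println(none == DataType.STRING);    // false, no NPE
  }
}
```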

mgaido91 (author):

ok, done, thanks.
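The approach the thread settled on can be sketched as follows. This is a minimal illustration, not Livy's actual code: the class and method names here (ColumnBufferSketch, Column, DataType) are simplified stand-ins for the real ColumnBuffer API.

```java
import java.util.ArrayList;
import java.util.List;

public class ColumnBufferSketch {

  // Simplified stand-in for Livy's DataType enum.
  enum DataType { BOOLEAN, INTEGER, STRING }

  // Minimal column: remembers its type and collects values.
  static class Column {
    private final DataType type;
    private final List<Object> values = new ArrayList<>();

    Column(DataType type) { this.type = type; }
    DataType getType() { return type; }
    void add(Object v) { values.add(v); }
    List<Object> values() { return values; }
  }

  private final Column[] columns;

  public ColumnBufferSketch(DataType[] types) {
    columns = new Column[types.length];
    for (int i = 0; i < types.length; i++) {
      columns[i] = new Column(types[i]);
    }
  }

  public Column column(int i) { return columns[i]; }

  // As settled in the review: compare the enum with == per field instead of
  // precomputing a boolean[] of "is string" flags; an enum comparison is a
  // single reference check, so there is no extra per-row cost.
  public void addRow(Object[] fields) {
    for (int i = 0; i < fields.length; i++) {
      Object value;
      if (columns[i].getType() == DataType.STRING && fields[i] != null) {
        // Normalize to java.lang.String without touching any Spark classes.
        value = fields[i].toString();
      } else {
        value = fields[i];
      }
      columns[i].add(value);
    }
  }
}
```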

codecov-io commented Mar 26, 2019

Codecov Report

Merging #162 into master will increase coverage by 0.14%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #162      +/-   ##
============================================
+ Coverage     68.59%   68.73%   +0.14%     
+ Complexity      905      904       -1     
============================================
  Files           100      100              
  Lines          5662     5662              
  Branches        848      848              
============================================
+ Hits           3884     3892       +8     
+ Misses         1225     1216       -9     
- Partials        553      554       +1
Impacted Files                                           Coverage Δ            Complexity Δ
...ain/java/org/apache/livy/rsc/driver/RSCDriver.java    77.96% <0%> (-0.85%)  41% <0%> (-1%)
.../scala/org/apache/livy/sessions/SessionState.scala    61.11% <0%> (+2.77%)  2% <0%> (ø)
...ain/scala/org/apache/livy/utils/SparkYarnApp.scala    79.57% <0%> (+6.33%)  33% <0%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 07d216d...0c8400f.


vanzin commented Mar 27, 2019

Merging to master.

@asfgit asfgit closed this in 1c914a1 Mar 27, 2019
mgaido91 (author):

Thanks @vanzin! What about backporting to 0.6?


vanzin commented Mar 27, 2019

If there's a need for another RC I'll do it, but we've never released a .z release before; I'd rather just do 0.7 instead.
