
[SPARK-28930][SQL] Last Access Time value shall display 'UNKNOWN' in all clients #25720

Closed
wants to merge 3 commits into from

Conversation

sujith71955
Contributor

@sujith71955 sujith71955 commented Sep 8, 2019

What changes were proposed in this pull request?
Issue 1: modifications are not required, as these are different display formats for the same information. In the case of a Spark DataFrame, null is correct.

Issue 2 mentioned in the JIRA: Spark SQL "desc formatted tablename" does not show the header # col_name,data_type,comment. It seems the header was removed intentionally as part of SPARK-20954.

Issue 3:
Corrected the Last Access time: the value shall display 'UNKNOWN', as the system currently does not support last access time evaluation. Hive was setting the last access time to '0' in the metastore even though Spark's CatalogTable last access time was set to -1. This broke the validation logic for lastAccessTime, where Spark shows 'UNKNOWN' if the last access time is -1 (meaning not evaluated).
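The validation rule described in Issue 3 can be sketched as follows (a minimal standalone sketch, not the actual Spark code path; `formatLastAccess` is a hypothetical helper name, and it treats both sentinel values, -1 and 0, as "not evaluated"):

```scala
import java.util.Date

// Hypothetical helper mirroring the rule described above:
// a non-positive last-access timestamp means the value was
// never evaluated, so render "UNKNOWN" instead of the epoch date.
def formatLastAccess(lastAccessTime: Long): String =
  if (lastAccessTime <= 0) "UNKNOWN" else new Date(lastAccessTime).toString

// -1 (Spark's old default) and 0 (what Hive stores) both map to UNKNOWN.
println(formatLastAccess(-1L)) // prints UNKNOWN
println(formatLastAccess(0L))  // prints UNKNOWN
```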

Does this PR introduce any user-facing change?
No

How was this patch tested?
Tested locally and corrected a unit test.
Attaching the test report below
SPARK-28930

@sujith71955 sujith71955 changed the title [SPARK-28930][SQL] Last Access Time value shall display 'UNKNOWN' as currently system cannot evaluate the last access time, and 'null' values will be shown in its capital form 'NULL' for SQL client to make the display format similar to spark-shell. [SPARK-28930][SQL] Last Access Time value shall display 'UNKNOWN' and 'null' values will be shown in its capital form 'NULL' Sep 8, 2019
@SparkQA

SparkQA commented Sep 8, 2019

Test build #110299 has finished for PR 25720 at commit 87e3f64.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sujith71955 sujith71955 changed the title [SPARK-28930][SQL] Last Access Time value shall display 'UNKNOWN' and 'null' values will be shown in its capital form 'NULL' [SPARK-28930][SQL] Last Access Time value shall display 'UNKNOWN' and 'null' values will be shown in its capitals 'NULL' for all clients Sep 8, 2019
@SparkQA

SparkQA commented Sep 8, 2019

Test build #110300 has finished for PR 25720 at commit 6051f0b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 9, 2019

Test build #110353 has finished for PR 25720 at commit 51def4f.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

…currently system cannot evaluate the last access time, and 'null' values will be shown in its capital form 'NULL' for SQL client to make the

display format similar to spark-shell.

What changes were proposed in this pull request?
If there is no comment, the Spark Scala shell shows "null" in lowercase, but everywhere else (Hive beeline/Spark beeline/Spark SQL) it is shown in capitals as "NULL". In this patch it is shown in its capital form 'NULL' for the SQL client, to make the display format match Hive beeline/Spark beeline/Spark SQL. Also corrected the Last Access time: the value shall display 'UNKNOWN', as the system currently does not support last access time evaluation.
Issue 2 mentioned in the JIRA: Spark SQL "desc formatted tablename" does not show the header # col_name,data_type,comment. It seems the header was removed intentionally as part of SPARK-20954.

Does this PR introduce any user-facing change?
No

How was this patch tested?
Tested locally and corrected a unit test.
@SparkQA

SparkQA commented Sep 9, 2019

Test build #110362 has finished for PR 25720 at commit 2568d50.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sujith71955
Contributor Author

retest this please

@SparkQA

SparkQA commented Sep 10, 2019

Test build #110418 has finished for PR 25720 at commit 2568d50.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Sep 10, 2019

I'm not sure this is a problem. These are different formats for the same info. In the case of a Spark DataFrame, null is correct, not the string "NULL". I also don't see why last access time -1 needs to be 0?

@sujith71955
Contributor Author

sujith71955 commented Sep 10, 2019

I'm not sure this is a problem. These are different formats for the same info. In the case of a Spark DataFrame, null is correct, not the string "NULL". - Fine, I got your point, Sean.

I also don't see why last access time -1 needs to be 0? - You are right.
When we set -1 as the default value, the logic below changes the last access time to '0' while saving it in the Hive metastore; because of that conversion, the last access time condition check fails.
[screenshot of the HiveClientImpl.toHiveTable() conversion]

We cannot add a condition check in the above logic, since Hive multiplies the lastAccessTime by 1000, so the -1 default value won't work in this case. It is better to update the default value of lastAccessTime to 0.

Let me know for any inputs. Thanks for your suggestions, Sean.
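The round trip described above can be sketched as follows (a hedged sketch: the seconds/millis conversion factor is assumed from the discussion, since Hive's Table object stores whole seconds in an Int while Spark's CatalogTable keeps milliseconds; the helper names are hypothetical):

```scala
// Spark millis -> Hive seconds (integer truncation) -- assumed conversion.
def toHiveSeconds(sparkMillis: Long): Int = (sparkMillis / 1000).toInt

// Hive seconds -> Spark millis, as when reading the table back.
def toSparkMillis(hiveSeconds: Int): Long = hiveSeconds.toLong * 1000

// A default of -1 millis truncates to 0 seconds in the metastore
// and reads back as 0 millis, so a `-1 == lastAccessTime` check fails.
println(toSparkMillis(toHiveSeconds(-1L))) // prints 0
```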

@sujith71955
Contributor Author

@srowen
We cannot add a condition check in the logic shown in the snapshot, since Hive multiplies the lastAccessTime by 1000, so the -1 default value won't work in this case. It is better to update the default value of lastAccessTime to 0.

Let me know for any inputs. Thanks for your suggestions, Sean.

@SparkQA

SparkQA commented Sep 11, 2019

Test build #110465 has finished for PR 25720 at commit dc11f3a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sujith71955
Contributor Author

@srowen @dongjoon-hyun - Handled the comments, let me know for any further inputs. Thanks

@sujith71955 sujith71955 changed the title [SPARK-28930][SQL] Last Access Time value shall display 'UNKNOWN' and 'null' values will be shown in its capitals 'NULL' for all clients [SPARK-28930][SQL] Last Access Time value shall display 'UNKNOWN' in all clients Sep 12, 2019
@sujith71955
Contributor Author

sujith71955 commented Sep 15, 2019

Gentle ping @srowen @dongjoon-hyun

@srowen
Member

srowen commented Sep 15, 2019

I still don't quite understand it, are you trying to fix a cosmetic issue or a bug? I don't see a test that now passes with this change but didn't before. But I am also not clear that "UNKNOWN" or "NULL" is the desired output, either.

@sujith71955
Contributor Author

sujith71955 commented Sep 16, 2019

I still don't quite understand it, are you trying to fix a cosmetic issue or a bug? I don't see a test that now passes with this change but didn't before. But I am also not clear that "UNKNOWN" or "NULL" is the desired output, either.

After fixing JIRA SPARK-24812, the expectation is that 'Last Access' shall be displayed as 'UNKNOWN', which is not happening right now: the value is shown as [Last Access,Thu Jan 01 08:00:00 CST 1970].

On analysis I found the current logic, if (-1 == lastAccessTime) "UNKNOWN" else new Date(lastAccessTime).toString, fails because the Last Access value comes back as '0'.
This happens because of the logic executed in the HiveClientImpl.toHiveTable() method.
[screenshot of the HiveClientImpl.toHiveTable() logic]. So I corrected it.

It is basically a cosmetic bug. Even in Hive (verified in 1.2) the last access is shown as 'UNKNOWN', because there is currently no mechanism for evaluating last access.
Hope the background is clear now. Thanks.
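The failing check quoted above can be reproduced in isolation (a minimal sketch; `oldFormat` is a hypothetical wrapper around the quoted expression, not a Spark method):

```scala
import java.util.Date

// The quoted check: only exactly -1 is treated as "not evaluated".
def oldFormat(lastAccessTime: Long): String =
  if (-1 == lastAccessTime) "UNKNOWN" else new Date(lastAccessTime).toString

// Hive hands back 0, not -1, so the epoch date is printed
// instead of "UNKNOWN" -- the cosmetic bug described above.
println(oldFormat(0L))
```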

@srowen
Member

srowen commented Sep 16, 2019

OK, I see that we already intend to display "UNKNOWN" to match Hive in some cases. This does not change?

Are you saying that Spark will use -1 and Hive will use 0 in the metastore for unknown values? Then why change Spark's value? You could just treat <= 0 as unknown?

@sujith71955
Contributor Author

sujith71955 commented Sep 16, 2019 via email

@sujith71955
Contributor Author

sujith71955 commented Sep 16, 2019 via email

@srowen
Member

srowen commented Sep 16, 2019

Given the review of #21775, do you have an opinion @gatorsmile?

@SparkQA

SparkQA commented Sep 16, 2019

Test build #110647 has finished for PR 25720 at commit b9f20ba.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sujith71955
Contributor Author

retest this please

@SparkQA

SparkQA commented Sep 17, 2019

Test build #110736 has finished for PR 25720 at commit b9f20ba.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sujith71955
Contributor Author

retest this please

@SparkQA

SparkQA commented Sep 17, 2019

Test build #110767 has finished for PR 25720 at commit b9f20ba.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sujith71955
Contributor Author

The failures seem unrelated to my changes; will trigger the tests once again :)

@sujith71955
Contributor Author

retest this please

@SparkQA

SparkQA commented Sep 17, 2019

Test build #110776 has finished for PR 25720 at commit b9f20ba.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

Merged to master.
