[SPARK-21067][SQL] Fix Thrift Server - CTAS fail with Unable to move source #25058

Deegue wants to merge 1 commit into apache:master from
Conversation
This is a known issue (due to the design of the Hadoop FileSystem interface); you can set the Hadoop configuration explicitly to disable the cache, either via Spark or via the Hadoop conf file. I don't think it is necessary to add another configuration. Besides, this only works for HDFS.
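For reference, a sketch of the two configuration routes mentioned above; the property names are the standard Hadoop/Spark ones, while the file placement shown is an assumption about a typical deployment:

```
# Route 1 - via Spark (spark-defaults.conf, or --conf on spark-submit):
# any spark.hadoop.* property is forwarded into the Hadoop Configuration.
spark.hadoop.fs.hdfs.impl.disable.cache  true

# Route 2 - via the Hadoop conf file (e.g. core-site.xml / hdfs-site.xml):
#   <property>
#     <name>fs.hdfs.impl.disable.cache</name>
#     <value>true</value>
#   </property>
```

Note the property is per-scheme (`fs.<scheme>.impl.disable.cache`), which is why this route only helps for HDFS, as pointed out above.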
Thanks for your review, what about change
I don't think it is necessary to make code changes for this; maybe adding docs is enough.
@jerryshao I don't know if I understand the problem correctly. I submitted a patch under that JIRA, based on Spark 2.3.2, several months ago. My understanding is that it is the working thread that creates the FS. This FS client is closed when the working thread closes. If we let the main thread create the FS, the problem could be resolved.
Yes, you could carefully track which thread creates the FS and guarantee that the same thread closes it; then this problem could be solved, but it is a little hard to track.
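To make the failure mode above concrete, here is a minimal pure-Java analogue of a shared handle cache (all names here — `Handle`, `SharedHandleDemo`, the `hdfs://nn:8020` key — are hypothetical, standing in for Hadoop's real `FileSystem` cache, which is keyed by scheme, authority, and user). One thread closing the cached handle invalidates it for every other thread holding the same instance:

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SharedHandleDemo {
    // Stand-in for a cached FileSystem client.
    static class Handle {
        private volatile boolean open = true;
        void close() { open = false; }
        void use() throws IOException {
            // Mirrors DFSClient.checkOpen throwing "Filesystem closed".
            if (!open) throw new IOException("Filesystem closed");
        }
    }

    // Stand-in for the FileSystem cache: one shared instance per key.
    private static final Map<String, Handle> CACHE = new ConcurrentHashMap<>();

    static Handle get(String key) {
        return CACHE.computeIfAbsent(key, k -> new Handle());
    }

    // Returns true if one thread's close() broke the handle another thread holds.
    public static boolean demo() {
        Handle held = get("hdfs://nn:8020");          // e.g. held for a later moveFile
        Thread session = new Thread(() -> get("hdfs://nn:8020").close()); // session cleanup
        session.start();
        try {
            session.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            held.use();                               // the later operation now fails
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("broken after close: " + demo());
    }
}
```

This is why "track which thread created the FS" is the crux: the cache hands the same object to everyone, so close() has global effect.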
Can one of the admins verify this patch?
A doc fix seems to have arrived. Per #25058 (comment), I suspect we need a different approach. I am closing this due to inactivity for now.
What changes were proposed in this pull request?
When we close a session of the Spark Thrift Server (STS), the `FileSystem` instance for HDFS (cached by default) is closed as well. If we then run a CTAS statement that triggers `Hive.moveFile`, `DFSClient` runs `checkOpen` and throws `java.io.IOException: Filesystem closed`. So we add a Spark config `spark.sql.thriftserver.hdfs.disable.cache` to control whether to disable the FileSystem connection cache of HDFS.

How was this patch tested?
Manually tested.