When connecting to HDFS from a ProcessPoolExecutor worker, the first call raises an exception and the function never returns (potential deadlock?). The same code works as expected with a ThreadPoolExecutor.
Sample code that reproduces the problem follows:
import pyarrow as pa
from concurrent.futures import (
    ThreadPoolExecutor,
    ProcessPoolExecutor,
    wait,
    ALL_COMPLETED)

def ls():
    fs = pa.hdfs.connect('hdfs://host')
    print(fs.ls('/'))

# This works as expected
ls()

# Running in parallel
thread_pool = ThreadPoolExecutor(max_workers=4)
process_pool = ProcessPoolExecutor(max_workers=4)

def run(pool):
    futures = [pool.submit(ls) for _ in range(5)]
    wait(futures, return_when=ALL_COMPLETED)

# The thread_pool works as expected
run(thread_pool)

# The process_pool raises an exception
run(process_pool)
The following exception is raised:
java.lang.ClassFormatError: Incompatible magic value 1347093252 in class file org/xml/sax/helpers/LocatorImpl
at java.lang.ClassLoader.findBootstrapClass(Native Method)
at java.lang.ClassLoader.findBootstrapClassOrNull(ClassLoader.java:1015)
at java.lang.ClassLoader.loadClass(ClassLoader.java:413)
at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2684)
at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2672)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2746)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2696)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2579)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:1091)
at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:404)
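
A small decoding sketch may help when reading the trace above. The "magic value" 1347093252 in the ClassFormatError is the first four bytes the JVM found where it expected class file data. Interpreted as big-endian bytes, that value is the ZIP local file header signature, whereas a valid class file starts with 0xCAFEBABE. The variable names below are illustrative and not part of the original report.

```python
import struct

# Value reported in the ClassFormatError message.
magic = 1347093252

# The JVM reads the class file magic as a big-endian 32-bit integer.
raw = struct.pack(">I", magic)

JAVA_CLASS_MAGIC = 0xCAFEBABE          # what a valid .class file starts with
ZIP_LOCAL_HEADER = b"PK\x03\x04"       # what a ZIP/JAR archive starts with

print(hex(magic))                      # 0x504b0304
print(raw == ZIP_LOCAL_HEADER)         # True
print(magic == JAVA_CLASS_MAGIC)       # False
```

One possible reading, which is an assumption and not something confirmed in this report, is that the JVM in the worker process read from the start of a JAR archive instead of from the class entry inside it.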
Reporter: Panagiotis Nezis
Related issues:
Note: This issue was originally created as ARROW-7451. Please see the migration documentation for further details.