[SW-539] Fix bug when pysparkling is executed in parallel on the same node #393
Conversation
py/pysparkling/initializer.py
Outdated
    return sw_jar
cache_path = get_cache_path(zip_filename)
cached_jar = os.path.abspath("{}/sparkling_water/sparkling_water_assembly.jar".format(cache_path))
if os.path.exists(cached_jar) and os.path.isfile(cached_jar):
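The lookup in the diff above can be read as a small standalone helper: build the expected cache path and return it only if the JAR is actually there. This is a minimal sketch; the function name `find_cached_jar` is hypothetical, only the directory layout and file name come from the diff.

```python
import os


def find_cached_jar(cache_path):
    """Return the cached assembly JAR path if it exists, else None.

    Sketch of the check from the diff above; the helper name is an
    assumption, not part of the pysparkling API.
    """
    cached_jar = os.path.abspath(
        "{}/sparkling_water/sparkling_water_assembly.jar".format(cache_path))
    # Guard against the path existing but being a directory
    if os.path.exists(cached_jar) and os.path.isfile(cached_jar):
        return cached_jar
    return None
```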
Thinking about it more, can we disable caching entirely? It could be dangerous if we have a cached old version of the JAR and an upgraded Sparkling Water.
I'm all for not using the cache, as the cache was also a cause of problems before. We can disable it in this case since we need to extract the JAR anyway, but we can extract it to a temporary directory which will be cleaned up at the end of H2OContext.
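The suggestion above, extract into a fresh temporary directory and clean it up when the context ends, could look roughly like this. The `extract` callable stands in for the real unzip step and is an assumption; in Sparkling Water the cleanup would be tied to H2OContext shutdown rather than `atexit`.

```python
import atexit
import shutil
import tempfile


def extract_jar_to_temp(extract):
    """Extract the assembly JAR into a fresh temp dir and register cleanup.

    `extract(tmp_dir)` is a hypothetical callable that writes the JAR
    into tmp_dir and returns its path.
    """
    tmp_dir = tempfile.mkdtemp(prefix="sparkling_water_")
    jar_path = extract(tmp_dir)
    # Remove the directory at process exit; ignore_errors keeps this
    # safe if the directory was already removed.
    atexit.register(shutil.rmtree, tmp_dir, ignore_errors=True)
    return jar_path
```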
Force-pushed f8d295c to 8669a69
Force-pushed 8669a69 to 3f0df23
@@ -139,12 +139,14 @@ def getOrCreate(spark, conf=None, **kwargs):

    def stop_with_jvm(self):
        Initializer.clean_temp_dir()
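For `clean_temp_dir` to be callable from a shutdown hook like `stop_with_jvm`, it helps if it is idempotent and tolerant of partial shutdown. A sketch under that assumption; everything beyond the `Initializer.clean_temp_dir` name in the diff is hypothetical.

```python
import shutil


class Initializer(object):
    """Sketch of the cleanup hook referenced in the diff above."""

    # Set when the JAR/egg was extracted; None means nothing to clean.
    _temp_dir = None

    @staticmethod
    def clean_temp_dir():
        # ignore_errors makes the call safe even if shutdown already
        # removed the directory, so repeated invocations are harmless.
        if Initializer._temp_dir is not None:
            shutil.rmtree(Initializer._temp_dir, ignore_errors=True)
            Initializer._temp_dir = None
```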
Update: this seems to cause stack traces to be printed during Spark shutdown. We cannot expect that it will be fully executed. The better solution is to simply skip the cleanup.
Good point! I'll work on this tomorrow.
This bug fix reintroduces the cache which @mmalohlava wrote a while ago. However, it also changes the Python egg cache path for tests, so the tests always use the correct latest artifact (this was a source of issues and also the reason why the cache was removed).
Also, when testing SNAPSHOT versions locally against different Sparkling Water builds, we make sure to use a temporary cache for Python eggs, again to be sure we run on the latest code.
The cache is fine if it's used by users on released versions.
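The policy described above, a stable cache for released versions but a throwaway location for SNAPSHOT builds, could be sketched as below. All names here are hypothetical illustrations, not the actual pysparkling layout.

```python
import os
import tempfile


def choose_cache_dir(base_cache, version):
    """Pick a cache directory based on the Sparkling Water version.

    Released versions get a stable, version-keyed directory (so an
    upgrade never reuses a stale artifact), while SNAPSHOT builds get
    a fresh temporary directory so tests always run the latest code.
    """
    if "SNAPSHOT" in version:
        # Never reuse cached eggs/JARs for snapshot builds.
        return tempfile.mkdtemp(prefix="sw_snapshot_")
    return os.path.join(base_cache, "sparkling_water-{}".format(version))
```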