File delete problem on Windows #102

Closed
wants to merge 17 commits into from

dkincaid

We ran into a problem on a Windows system when Storm is deleting files from the tmp directory. We should be using the canonical path from the File object. This is especially important on Windows systems in my experience. Here is the exception it was throwing.

java.io.IOException: Unable to delete file: C:\Users\KALYOS~1\AppData\Local\Temp\8cfaaa2d-a34f-4a9f-a30a-ef74a0721260\version-2\log.1
at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:1390)
at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1044)
at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:977)
at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:1381)
at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1044)
at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:977)
at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:1381)
at backtype.storm.util$rmr.invoke(util.clj:277)
at backtype.storm.testing$kill_local_storm_cluster.invoke(testing.clj:156)
at backtype.storm.LocalCluster$_shutdown.invoke(LocalCluster.clj:21)
at backtype.storm.LocalCluster.shutdown(Unknown Source)
at com.idexx.data.etl.storm.DataServicesETLTopology.shutdownLocalTopology(DataServicesETLTopology.java:167)
at integration.com.idexx.data.etl.AbstractDataEltHsqldbTopologyTests.afterSuite(AbstractDataEltHsqldbTopologyTests.java:56)

This doesn't completely fix the problem we're having, but eliminates a potential problem. Now the problem appears to be that there may be another process or thread holding a lock on that file.
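As an illustration of the canonical-path idea, here is a minimal Java sketch. It is not Storm's actual `rmr` implementation (that lives in `util.clj`), and the class and method names are hypothetical. It resolves the target once with `getCanonicalFile()`, which on Windows should expand short names such as `KALYOS~1`, and then deletes the tree from the bottom up. As noted above, it will still fail if some handle keeps a file open.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical helper, for illustration only; not part of Storm or commons-io.
class CanonicalDelete {

    /** Canonicalizes the root once, then deletes it and everything below it. */
    static void deleteRecursively(File target) {
        try {
            deleteTree(target.getCanonicalFile());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void deleteTree(File file) throws IOException {
        // Do not descend into symbolic links; only the link itself is removed.
        if (file.isDirectory() && !Files.isSymbolicLink(file.toPath())) {
            File[] children = file.listFiles();
            if (children != null) {
                for (File child : children) {
                    deleteTree(child);
                }
            }
        }
        // delete() returns false for a missing file too, so check exists() before failing.
        if (!file.delete() && file.exists()) {
            throw new IOException("Unable to delete file: " + file);
        }
    }
}
```

Calling `CanonicalDelete.deleteRecursively(new File(tmpDir))` would replace a direct delete on the non-canonical path; `tmpDir` here stands for whatever temporary directory the caller created.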

@nathanmarz
Owner

If there are still problems with running on Windows, I'll wait until those are resolved before merging this in.

@nathanmarz
Owner

Still happy to accept a pull request once the issue is fully resolved, but going to close this one since it's been open so long.

@nathanmarz nathanmarz closed this Apr 14, 2012
@kichik

kichik commented Apr 3, 2013

This issue still exists in 0.8.2.

I checked with Process Explorer and the file can't be deleted because it's still in use by the same Java process trying to delete it. I tried looking into the source code, but couldn't find anything too obvious. The only thing that seemed suspicious was the comment about a possible race condition between killing all workers and killing the supervisor. If it is indeed the worker that creates the version-2/log.1 file, this might just be it.

This wouldn't be too bad if the log files weren't so big. After about 30 minutes of playing around with Storm, my TEMP folder grew an extra 1GB. I will probably have to write a script to automatically clean those up after each run.
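A cleanup step like the one described could be sketched in Java as follows. This is only an assumption-laden sketch: it assumes the leftover directories have UUID names like the `8cfaaa2d-a34f-4a9f-a30a-ef74a0721260` directory in the stack trace above, the class and method names are hypothetical, and it would also remove UUID-named directories left by other programs, so the base directory should be chosen with care. Directories that are still locked are skipped rather than treated as errors.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical cleanup helper, for illustration only; not part of Storm.
class TempCleanup {

    /** True if the last path segment looks like a UUID, e.g. 8cfaaa2d-a34f-4a9f-a30a-ef74a0721260. */
    static boolean isUuidName(Path path) {
        String name = path.getFileName().toString();
        if (name.length() != 36) {
            return false;
        }
        try {
            UUID.fromString(name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    /** Deletes UUID-named directories directly under base; returns how many were fully removed. */
    static int cleanUuidDirs(Path base) {
        List<Path> candidates;
        try (Stream<Path> entries = Files.list(base)) {
            candidates = entries
                    .filter(Files::isDirectory)
                    .filter(TempCleanup::isUuidName)
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int removed = 0;
        for (Path dir : candidates) {
            if (deleteTree(dir)) {
                removed++;
            }
        }
        return removed;
    }

    /** Best-effort recursive delete; returns false if anything could not be removed (e.g. still locked). */
    static boolean deleteTree(Path dir) {
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parents.
            List<Path> deepestFirst = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            for (Path path : deepestFirst) {
                Files.delete(path);
            }
            return true;
        } catch (IOException | UncheckedIOException e) {
            return false;
        }
    }
}
```

It could be run after each test suite with something like `TempCleanup.cleanUuidDirs(Paths.get(System.getProperty("java.io.tmpdir")))`.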

@vladokr

vladokr commented May 15, 2013

Yes the problem is still here.
I am just starting with the Storm framework on Windows, and this was the first issue I ran into.
What do I need to do? Can I configure Storm to skip deleting these log files?

@ftc

ftc commented May 21, 2013

I asked about this on Stack Overflow: http://stackoverflow.com/questions/16658779/twitter-storm-example-running-in-local-mode-cannot-delete-file. The answer I got was pretty much to comment out the kill topology and shutdown lines. Unfortunately, I don't know what the negative consequences of doing this are.

@brendanator

I suspect this error is caused by the file handle not being released because the garbage collector has not yet reclaimed the stream that holds it.

There is a suggested solution on SO to this problem, which people have been seeing in Java: null out the output stream and then call the garbage collector. See http://stackoverflow.com/a/4213208/402441

If someone can point me to the place in the code where /version-2/log.1 is written, I will have a go at fixing this.
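For comparison with the null-and-GC workaround, a more deterministic option is to close the stream explicitly before deleting. The sketch below is not Storm code, and the thread has not yet identified which component keeps `log.1` open. The class and method names are hypothetical, and the file name is borrowed from the stack trace purely for illustration. It uses try-with-resources so the handle is released before the delete is attempted.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example, for illustration only; not part of Storm.
class CloseBeforeDelete {

    /** Writes a small file in dir, closes it, then deletes it; returns true if the file was deleted. */
    static boolean writeThenDelete(Path dir) {
        Path log = dir.resolve("log.1");
        try {
            try (OutputStream out = Files.newOutputStream(log)) {
                out.write("example".getBytes(StandardCharsets.UTF_8));
            } // the stream, and with it the OS file handle, is closed here

            // With no open handle left in this process, the delete is not blocked by our own stream.
            return Files.deleteIfExists(log);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the owner of the handle is a long-lived object rather than a local stream, the equivalent step would be to call its `close()` method during shutdown before the directory is removed.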

revans2 pushed a commit to revans2/storm that referenced this pull request Mar 6, 2014
…-generates-zkdigestpayload

Always generate zk digest payload, better validation
@jutkko

jutkko commented Aug 11, 2014

I am very interested to know as well: which part of the program is writing these logs? Since they are not human-readable and have no size limit (they grow huge after a while), what's the point of keeping them?
