[1.0.3.x] Problem: batch download speed degradation #17

Closed
reyaz006 opened this issue Apr 7, 2014 · 20 comments

@reyaz006

reyaz006 commented Apr 7, 2014

I'm not sure if this bug appeared in 1.0.3.0 or 1.0.3.1, but with 1.0.2.0 I never had this problem, as far as I can tell.

I have >400 member entries in the batch job, with concurrent jobs = 6. At the start, execution is very fast, with ~1300 lines per minute being logged. Some time later it drops to ~200 lines/min, then 80 lines/min. During this time, random pauses of up to 240 seconds appear between log lines. It really looks like the app isn't doing anything during these periods - judging by how slowly the batch table UI updates, you would think it was downloading 50-100 MB files.

This is certainly not a problem with my connection speed - if I restart the app, it goes back to full speed, with 1000-1300 lines/min logged.

I can't understand why this is happening randomly - sometimes it does all 400 entries without problems (around 30 minutes maybe, since 99% of files are already downloaded), sometimes it runs for a few hours and I then find that it hasn't even finished half of its jobs.

@Nandaka
Owner

Nandaka commented Apr 7, 2014

What is the memory consumption? Is your HDD thrashing? Can you try disabling the DB and logging? Worst case, nijie (or their CDN) is rate-limiting your connection.

@reyaz006
Author

reyaz006 commented Apr 9, 2014

So this just happened again.

RAM usage is nothing special, ~55 MB. The HDD is not thrashing - Win8 Task Manager shows only 30% spikes, which is normal in other cases too. But CPU usage went up to 66%-70% on my CPU with 8 logical cores. Usually CPU usage is <2% when the app is operating normally.

[Trace DB to log] was already disabled, so now I've disabled [Save info to DB] too. I'll see how it goes.

> nijie (or their CDN) is rate-limiting your connection

No, like I said, it goes fine if I restart the app.

@reyaz006
Author

Happened again just now, at ~30% of the job list. ~55% CPU usage, sometimes up to 80%. Both DB options are already disabled. In ProcMon I see that it still writes something to Database.sdf, though. I don't think this should happen.

When I pressed Stop, it took about 1 minute to cancel all jobs and display a notification (CPU usage went down to 0%). Then I pressed Start, and a few seconds later it went up to 55%~80% CPU again.

Then I stopped it again and tried moving Database.sdf out of the folder while the app was still running. After pressing Start, this caused it to display Errors for all remaining member jobs and to create HTML dumps; I tried it twice.

Then I moved Database.sdf back to the folder and pressed Start. It has been working fine for 5 minutes now, with <2% CPU. Will try to just Stop/Start next time this happens.

@reyaz006
Author

Happened again. Stop/Start fixed it this time, on the first try.

@reyaz006
Author

Happened again. Stop/Start helped after the third try.

@Nandaka
Owner

Nandaka commented Apr 19, 2014

Please give me your actual batchjob.xml.

@reyaz006
Author

@Nandaka
Owner

Nandaka commented Apr 19, 2014

After running it, I think I know the problem with the stalled jobs (Status always showing as Running).

When a job encounters an error, it fails to update its state, but the task thread has already terminated and another task is fired.

So if I have 4 worker threads and one of them fails to update the state because of an error, it will be shown as 5 jobs running.

Let's see what I can do.

@reyaz006
Author

Did you also get the high CPU usage?

I think it has something to do with variables that you use globally across several threads. Some variable isn't updated in time, or some data isn't fully written in time - this can happen occasionally if the implementation doesn't restrict a variable to a single thread. One option would be to check whether a variable or piece of data is still in use by another thread and not start processing it until it's free.

I may be wrong. A similar thing happened to me a few years ago with a small project, and I fixed it by making sure that a new thread does not start working with the same global variable until all other threads have finished using it.

@Nandaka
Owner

Nandaka commented Apr 21, 2014

Not sure if related to global var, as the status is updated directly to the individual job and I'm expecting WPF data binding to automagically update the UI 😃

Anyway, I've added a try/catch/finally block that sets the final status to either Completed or Error depending on whether an exception occurred, and it looks like it's working in 1.0.4.0.
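For readers following along, here is a minimal sketch of the try/catch/finally idea described above. It is written in Python rather than the app's own language, and `Job`, `Status`, `run_job`, and `failing_download` are hypothetical names made up for illustration; this is not the actual code from 1.0.4.0.

```python
from enum import Enum


class Status(Enum):
    QUEUED = "Queued"
    RUNNING = "Running"
    COMPLETED = "Completed"
    ERROR = "Error"


class Job:
    """Hypothetical stand-in for one batch entry; not the app's real class."""

    def __init__(self, member_id):
        self.member_id = member_id
        self.status = Status.QUEUED
        self.message = ""


def run_job(job, download):
    """Run one job and always leave it in a final state."""
    job.status = Status.RUNNING
    final_status = Status.ERROR  # assume failure until the work succeeds
    try:
        download(job.member_id)
        final_status = Status.COMPLETED
    except Exception as exc:
        job.message = str(exc)  # keep the reason visible to the user
    finally:
        # Runs on success and on failure, so the job never stays "Running".
        job.status = final_status


def failing_download(member_id):
    # Hypothetical download that always fails, to exercise the error path.
    raise TimeoutError("timeout")


ok_job, bad_job = Job(1), Job(2)
run_job(ok_job, lambda member_id: None)
run_job(bad_job, failing_download)
print(ok_job.status.value, bad_job.status.value)
```

The point of the `finally` clause is that the status update no longer depends on the download finishing cleanly, which is the failure mode described earlier in the thread (a worker ends but its job still shows as Running).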

Nandaka closed this as completed Apr 24, 2014

@reyaz006
Author

I guess it's still not fixed in 1.0.4.0.

Today, after some time, half of my job list stopped with an error (I can't remember the description now). Trying to start again made it eat up to 60% CPU again, twice. Again, I had to wait around 1-2 minutes for the Stop button to finish what it's supposed to do. I decided to restart the app after that.

I see many timeout events (Error when downloading: .!nijie files ==> timeout) in the log but no error/exception messages.

Also duplicate entries like
2014-04-24 09:05:56,296 WARN [ 8] - File Exists: xxx, Identical size: xxxxxx, skipping...
2014-04-24 09:05:56,296 ERROR [ 8] - File Exists: xxx, Identical size: xxxxxx, skipping...
But this is probably not related to the issue.

I'll see if I can provide a clear log and error description next time.

@Nandaka
Owner

Nandaka commented Apr 24, 2014

Network Timeout

The timeouts come from the server side (CDN issue?); I'm affected by them too. I can't do much about it, unless you want to enable retry for network issues?

Log...

The identical-size message is not really an error.

> Trying to start again made it eat up to 60% CPU again, twice

I didn't encounter the CPU issue using your batch file, weird. Stopping might take time, as it uses the task cancellation pattern (graceful cancel) rather than Thread.Abort(). So if it's still in the middle of a download, it won't stop immediately.
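A small sketch of why graceful cancellation does not stop right away, under stated assumptions: it is in Python rather than the app's own language, a `threading.Event` stands in for the cancellation token, and `time.sleep` stands in for a blocking download step. The names `worker`, `cancel`, and `done` are hypothetical.

```python
import threading
import time


def worker(cancel, chunks, chunk_seconds, done):
    """Process chunks, checking for cancellation only between chunks."""
    for _ in range(chunks):
        if cancel.is_set():        # cooperative check; nothing is force-killed
            return
        time.sleep(chunk_seconds)  # stands in for a blocking read with a timeout
        done.append(1)


cancel = threading.Event()
done = []
t = threading.Thread(target=worker, args=(cancel, 50, 0.05, done))
t.start()
time.sleep(0.12)   # let the worker get into the middle of a chunk
cancel.set()       # "Stop" pressed
t.join()           # returns only once the current chunk has finished
print(f"chunks finished before stopping: {len(done)}")
```

The worker only notices the request at its next check, so the delay after Stop is bounded by the longest blocking step, not by how quickly the request was made.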

@reyaz006
Author

Indeed, I don't think those timeouts have anything to do with the CPU usage issue.

> So if it's still in the middle of a download, it won't stop immediately.

I'm pretty sure it can't spend that much time for downloading any image. There is likely something else that is using so much CPU that it slows down most of internal logic, I'd say.

Thanks for your replies.

@Nandaka
Owner

Nandaka commented Apr 25, 2014

> I'm pretty sure it can't spend that much time for downloading any image. There is likely something else that is using so much CPU that it slows down most of internal logic, I'd say.

Depending on your timeout value, the download thread might still be waiting until it times out.

@reyaz006
Author

Is there a way to check that value? Or do you mean an internal setting for my ISP/connection which decides when to show "can't open a website" message?

@Nandaka
Owner

Nandaka commented Apr 25, 2014

It is not exposed in the UI, but you can modify the value in app.config:

<Nandaka.Common.Properties.Settings>
  <setting name="Timeout" serializeAs="String">
    <value>60000</value>
  </setting>
</Nandaka.Common.Properties.Settings>
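As a rough way to check the configured value without opening the file by hand, the sketch below parses a copy of the fragment above with Python's standard library. The closing tag is added here only so that the fragment parses on its own; in a real app.config the section sits inside a larger file, so the lookup path would need adjusting. The thread does not state the unit of the value; if it is milliseconds, 60000 would mean 60 seconds.

```python
import xml.etree.ElementTree as ET

# The fragment quoted above, closed so that it parses on its own (assumption).
FRAGMENT = """
<Nandaka.Common.Properties.Settings>
  <setting name="Timeout" serializeAs="String">
    <value>60000</value>
  </setting>
</Nandaka.Common.Properties.Settings>
"""

root = ET.fromstring(FRAGMENT)
# Find the <setting> element whose name attribute is "Timeout".
setting = root.find("./setting[@name='Timeout']")
timeout_value = int(setting.find("value").text)
print(timeout_value)
```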

@reyaz006
Author

reyaz006 commented May 2, 2014

Happened again today with 1.0.4.0. This time I noticed the errors in the batch download tab:
Object reference not set to an instance of an object.
I can see this printed after cancelling, for items that were active exactly when speed degradation / CPU overloading started.

Strangely the log does not contain these errors.

@Nandaka
Owner

Nandaka commented May 2, 2014

Any stack trace? Try upgrading to v1.0.5.0 and applying the patch for logging: http://www.mediafire.com/download/wlo0t3vy3lcbeeo/nijiedownloader.1.0.5.1-patch.7z

@reyaz006
Author

reyaz006 commented May 2, 2014

No, Trace DB to log was disabled, if that's what you mean (should I enable it?), and there were no options to show any debug info. I'll try 1.0.5.1 next time then.

@reyaz006
Author

For your information, ever since upgrading to 1.0.5.1 I have never had this problem again, even though I'm using it almost every day. I guess the real fix was in 1.0.5.0.

Thank you again.
