[1.0.3.x] Problem: batch download speed degradation #17

reyaz006 · 2014-04-07T07:02:17Z

I'm not sure if this bug appeared in 1.0.3.0 or 1.0.3.1, but with 1.0.2.0 I never had this problem, as far as I can tell.

I have >400 member entries in the batch job, with concurrent jobs = 6. At the start, execution goes very fast, with ~1300 lines per minute being logged. Some time later, it drops as low as ~200 lines/m, then ~80 lines/m. During this, random pauses between lines as long as 2~40 seconds appear. It really looks like the app isn't doing anything during these periods; the batch table UI updates so slowly that it looks like it is downloading 50-100 MB files.

This is certainly not a problem with my connection speed - if I restart the app it does the same thing again, with 1000-1300 lines/m logged.

I can't understand why this is happening randomly. Sometimes it does all 400 entries without problems (around 30 minutes maybe, since 99% of the files are already downloaded); sometimes it takes a few hours and I then find it hasn't even finished half of its jobs.

Nandaka · 2014-04-07T07:04:34Z

What is the memory consumption? Is your HDD thrashing? Can you try disabling the DB and logging? Worst case, nijie (or their CDN) is rate-limiting your connection.

reyaz006 · 2014-04-09T05:42:56Z

So this just happened again.

RAM usage is nothing special, ~55 MB. The HDD is not thrashing: Win8 Task Manager shows only 30% spikes, which is normal for other cases. But CPU usage went to 66%-70% on my CPU with 8 logical cores. Usually CPU usage is <2% when the app is operating normally.

[Trace DB to log] was already disabled, so now I've disabled [Save info to DB] too. I'll see how it goes.

> nijie (or their CDN) is rate-limiting your connection

No, like I said, it goes fine if I restart the app.

reyaz006 · 2014-04-11T06:46:30Z

Happened again now, while being at ~30% of the joblist. ~55% CPU usage, sometimes up to 80%. Both DB options are disabled already. In ProcMon I see that it still writes something into Database.sdf though. This shouldn't happen I think.

When I pressed Stop, it took about 1 minute to cancel all jobs and display a notification (CPU usage went down to 0%). Then I pressed Start, and a few seconds later it went up to 55%~80% CPU again.

Then I stopped again and tried moving Database.sdf out of the folder while the app was still running. After pressing Start, this caused it to display Errors for all remaining member jobs and to create html dumps; I tried it 2 times.

Then I moved Database.sdf back to the folder and pressed Start. It's working fine for 5 minutes now, with <2% CPU. Will try to just Stop/Start next time this happens.

reyaz006 · 2014-04-13T20:31:23Z

Happened again. Stop/Start fixed it this time, from the first try.

reyaz006 · 2014-04-18T06:22:48Z

Happened again. Stop/Start helped after the third try.

Nandaka · 2014-04-19T05:03:28Z

give me your actual batchjob.xml

reyaz006 · 2014-04-19T10:12:01Z

http://pastebin.com/pHE9wjhU

Nandaka · 2014-04-19T15:06:55Z

After running it, I think I know the problem with the stalled jobs (Status always showing as Running).

When a job encounters an error, it fails to update its state, but the task thread has already terminated and fired another task.

So if I have 4 worker threads and one of them fails to update the state because of an error, it will be shown as 5 jobs running.

Let me see what I can do.

reyaz006 · 2014-04-20T17:24:15Z

Did you also get the high CPU usage?

I think it has something to do with variables that are shared globally between several threads: a variable not being updated in time, or some data not being fully written in time. This can happen, rarely, if access to a shared variable isn't restricted per thread. One option would be to check whether a variable or piece of data is still in use by another thread and not start processing it until it is free.

I may be wrong. A similar thing happened to me a few years ago in a small project, and I fixed it by making sure that a new thread does not start working with the same global variable until all other threads have finished using it.

Nandaka · 2014-04-21T00:50:58Z

Not sure if it's related to global variables, as the status is updated directly on the individual job and I'm expecting WPF data binding to automagically update the UI 😃

Anyway, I've added a try/catch/finally block that sets the final status to either completed or error depending on the exception, and it looks like it's working in 1.0.4.0.
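
For illustration, here is a minimal sketch of that try/catch/finally idea. It is written in Python rather than the app's own code, and `Job`, `run_job`, and the status strings are hypothetical names, not taken from the project. The point is only that the final status is assigned in `finally`, so a job cannot stay in "Running" after its worker has moved on.

```python
from concurrent.futures import ThreadPoolExecutor


class Job:
    """Hypothetical stand-in for one batch entry; not the app's real class."""

    def __init__(self, name, should_fail=False):
        self.name = name
        self.should_fail = should_fail
        self.status = "Queued"
        self.error = None


def run_job(job):
    """Run one job and always leave it in a final state."""
    job.status = "Running"
    failed = False
    try:
        if job.should_fail:
            raise RuntimeError("simulated download error")
    except Exception as exc:
        failed = True
        job.error = str(exc)
    finally:
        # Runs on both the success and the error path,
        # so the status never stays at "Running".
        job.status = "Error" if failed else "Completed"


jobs = [Job("a"), Job("b", should_fail=True), Job("c")]
with ThreadPoolExecutor(max_workers=2) as pool:
    list(pool.map(run_job, jobs))

statuses = {job.name: job.status for job in jobs}
print(statuses)
```

With this shape, the count of jobs shown as running should match the number of workers even when one of them hits an error.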

reyaz006 · 2014-04-24T14:59:07Z

I guess it's still not fixed in 1.0.4.0.

Today, after some time, half of my joblist stopped with an error (I can't remember the description now). Trying to start again made it eat up to 60% CPU again, twice. Again, I had to wait around 1-2 minutes until the Stop button finished what it's supposed to do. I decided to restart the app after that.

I see many timeout events (Error when downloading: .!nijie files ==> timeout) in the log but no error/exception messages.

Also duplicate entries like
2014-04-24 09:05:56,296 WARN [ 8] - File Exists: xxx, Identical size: xxxxxx, skipping...
2014-04-24 09:05:56,296 ERROR [ 8] - File Exists: xxx, Identical size: xxxxxx, skipping...
But this is probably not related to the issue.

I'll see if I can provide a clear log and error description next time.

Nandaka · 2014-04-24T15:27:41Z

Network Timeout

The timeouts come from the server side (CDN issue?); I'm affected by them too. I can't do anything about that, unless you want to enable retry for network issues?
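
A retry for network timeouts could look roughly like the sketch below. It is Python, not the app's code, and `download_with_retry`, `flaky_fetch`, the retry count, and the delay are all made-up names and values for illustration only.

```python
import time


def download_with_retry(fetch, url, retries=3, delay_seconds=0.01):
    """Call fetch(url); on TimeoutError, retry up to `retries` more times."""
    attempt = 0
    while True:
        try:
            return fetch(url)
        except TimeoutError:
            if attempt >= retries:
                raise  # give up and let the caller mark the job as failed
            attempt += 1
            time.sleep(delay_seconds * attempt)  # simple linear backoff


# Fake fetch that times out twice and then succeeds.
calls = {"count": 0}


def flaky_fetch(url):
    calls["count"] += 1
    if calls["count"] <= 2:
        raise TimeoutError("simulated timeout")
    return b"image-bytes"


data = download_with_retry(flaky_fetch, "http://example.invalid/image.jpg")
print(calls["count"], data)
```

Only timeouts are retried here; any other exception propagates immediately, so real errors still surface.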

Log...

the identical size is not really an error.

> Trying to start again made it eat up to 60% CPU again, twice

I didn't encounter the CPU issue using your batch file, weird. Stopping might take time, as it uses the task cancellation pattern (graceful cancel) rather than Thread.Abort(). So if it is still in the middle of a download, it won't stop immediately.
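
To show why a graceful cancel is not instant, here is a small Python sketch of cooperative cancellation. It is not the app's implementation: `download_worker` and the chunk timings are invented for the example. The worker only notices the stop request between blocking steps, so Stop takes at least as long as the step that is currently in flight.

```python
import threading
import time


def download_worker(cancel, log, chunk_seconds=0.05, chunks=100):
    """Simulated download that checks for cancellation only between chunks."""
    for i in range(chunks):
        if cancel.is_set():
            log.append(f"cancelled before chunk {i}")
            return
        time.sleep(chunk_seconds)  # stands in for one blocking network read
    log.append("finished")


cancel = threading.Event()
log = []
worker = threading.Thread(target=download_worker, args=(cancel, log))
worker.start()

time.sleep(0.12)        # let a few chunks complete
cancel.set()            # request a stop; nothing is forcibly killed
worker.join(timeout=5)  # the worker exits at its next cancellation check

print(log)
```

If a single blocking step can last as long as the network timeout, the delay after pressing Stop can be that long too.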

reyaz006 · 2014-04-24T20:21:48Z

Indeed, I don't think those timeouts have anything to do with CPU usage issue.

> So if it is still in the middle of a download, it won't stop immediately.

I'm pretty sure it can't spend that much time for downloading any image. There is likely something else that is using so much CPU that it slows down most of internal logic, I'd say.

Thanks for your replies.

Nandaka · 2014-04-25T00:56:04Z

> I'm pretty sure it can't spend that much time for downloading any image. There is likely something else that is using so much CPU that it slows down most of internal logic, I'd say.

Depending on your timeout value, the download thread might still be waiting until it times out.

reyaz006 · 2014-04-25T08:14:00Z

Is there a way to check that value? Or do you mean an internal setting for my ISP/connection which decides when to show "can't open a website" message?

Nandaka · 2014-04-25T08:19:30Z

It is not exposed in the UI, but you can modify the value in app.config:

<Nandaka.Common.Properties.Settings>
  <setting name="Timeout" serializeAs="String">
    <value>60000</value>
  </setting>
</Nandaka.Common.Properties.Settings>
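
As a quick way to check the value, the Python sketch below parses a copy of that fragment and prints the setting. The closing tag is added only so the fragment parses on its own, `read_timeout_ms` is a hypothetical helper, and treating the number as milliseconds is an assumption; the thread does not state the unit.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the comment above; the closing tag is added
# here only so that it parses as a standalone document.
FRAGMENT = """
<Nandaka.Common.Properties.Settings>
  <setting name="Timeout" serializeAs="String">
    <value>60000</value>
  </setting>
</Nandaka.Common.Properties.Settings>
"""


def read_timeout_ms(xml_text):
    """Return the Timeout setting as an integer (unit assumed to be ms)."""
    root = ET.fromstring(xml_text)
    value = root.find("./setting[@name='Timeout']/value")
    if value is None or value.text is None:
        raise KeyError("Timeout setting not found")
    return int(value.text.strip())


timeout_ms = read_timeout_ms(FRAGMENT)
print(timeout_ms, "ms =", timeout_ms / 1000, "s")
```

If the unit really is milliseconds, the default would be 60 seconds, which would be roughly in line with the 1-2 minute wait after pressing Stop reported earlier.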

reyaz006 · 2014-05-02T07:44:16Z

Happened again today with 1.0.4.0. This time I noticed the errors in the batch download tab:
Object reference not set to an instance of an object.
I can see this printed after cancelling, for items that were active exactly when speed degradation / CPU overloading started.

Strangely the log does not contain these errors.

Nandaka · 2014-05-02T08:01:57Z

Any stack trace? Try upgrading to v1.0.5.0 and applying the patch for logging: http://www.mediafire.com/download/wlo0t3vy3lcbeeo/nijiedownloader.1.0.5.1-patch.7z

reyaz006 · 2014-05-02T09:35:35Z

No, Trace DB to log was disabled, if that's what you mean (should I enable it?), and there were no options to show any debug info. I'll try 1.0.5.1 next time then.

reyaz006 · 2014-05-29T06:56:48Z

For your information, ever since upgrading to 1.0.5.1 I have never had this problem again, even though I'm using it almost every day. I guess the real fix was in 1.0.5.0.

Thank you again.

reyaz006 mentioned this issue Apr 16, 2014

[1.0.3.1] Problem: Overwrite only if different in size not working correctly anymore #18

Open

Nandaka closed this as completed Apr 24, 2014
