
Purge Scheduled History fails with timeout error on 2 instances of DNN 9.4.1 #3205

Closed

thabaum opened this issue Oct 26, 2019 · 32 comments
@thabaum
Contributor

thabaum commented Oct 26, 2019

Description of bug

Purge Scheduled History fails with timeout error on 2 instances of DNN 9.4.1

Error log

General Exception

AbsoluteURL: DefaultDataProvider: DotNetNuke.Data.SqlDataProvider, DotNetNuke
Message:Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.

StackTrace:
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at System.Data.SqlClient.SqlCommand.RunExecuteNonQueryTds(String methodName, Boolean async, Int32 timeout, Boolean asyncWrite)
at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
at PetaPoco.Database.ExecuteInternal(CommandType commandType, String sql, Object[] args)
at DotNetNuke.Data.PetaPoco.PetaPocoHelper.ExecuteNonQuery(String connectionString, CommandType type, Int32 timeoutSec, String sql, Object[] args)
at DotNetNuke.Data.SqlDataProvider.ExecuteNonQuery(Int32 timeoutSec, String procedureName, Object[] commandParameters)
at DotNetNuke.Services.Scheduling.DNNScheduler.PurgeScheduleHistory()
at DotNetNuke.Services.Scheduling.PurgeScheduleHistory.DoWork()

InnerMessage:The wait operation timed out
Source:.Net SqlClient Data Provider

Additional context

both are upgraded versions of DNN CE

Affected version

  • [ ] 10.0.0 alpha build
  • [ ] 9.4.2 alpha build
  • [x] 9.4.1 latest supported release
@sleupold
Contributor

how many rows are contained in your ScheduleHistory table?

@thabaum
Contributor Author

thabaum commented Oct 27, 2019

I am thinking a lot of rows. I need to dig up your SQL commands... Is there a way to build a one-click script into DNN to truncate these tables for emergency maintenance? These issues seem very common, and I end up doing this a lot. I have the commands saved in my notes and will find them quickly; I just thought I would share that this kind of thing happens and ask whether there could be a faster way of doing it.

I am also wondering why this happens in the first place, when DNN has a Purge Schedule History scheduler task.

And if there is a reason, is there a solution that could be created to cure this?

@thabaum
Contributor Author

thabaum commented Oct 29, 2019

A couple of pages are related to this issue:

At the bottom of Mitchel's post it states:

Standard Schedule

By default, this task is ENABLED by DotNetNuke and will execute once per 2 hours.
When to Change

You can disable this task if you are working with "Memory" caching (which I highly recommend).

I am using memory caching, so maybe I need to turn this off, but I still need to purge DNN records as needed to keep the database tuned up, so any help with what to execute for DNN 9 would be greatly appreciated.

@thabaum
Contributor Author

thabaum commented Nov 3, 2019

https://github.com/dnnwerk/truncateEventlog/blob/master/TruncateEventlog2.sql

and

https://github.com/dnnwerk/truncateEventlog/blob/master/TruncateEventlog2.sql

The last one points to TurboScripts. I have been tempted to use those, but have not, since having to reapply them around upgrades seemed like more to worry about. The first solution is pretty straightforward.

What is the recommended solution here?

And after purging, since I am running memory caching, should I turn off this task? Would that make the database grow in size, or can/should I turn off whatever is creating these entries so I don't need to worry about it?

As for TurboScripts, if I remember correctly, I need to uninstall and then re-install them between DNN upgrades. Could a module that needs this have a script added to the DNN installation upgrade process? Or is this even an issue anymore? If it is, a re-install flag could be set and the package put in a special folder for re-installing during upgrades (uninstall, upgrade DNN, install). Just an idea to help something like TurboScripts be more DNN friendly, if needed.

Thank you.

@sleupold
Contributor

sleupold commented Nov 3, 2019

@thabaum,
if you want to purge the schedule history, you should just execute
TRUNCATE TABLE {databaseOwner}[{objectQualifier}ScheduleHistory]
in PersonaBar > Settings > SQL console.

Purging the Eventlog is a different task, and due to some changes in DNN 7.4, you may use my TruncateEventlog script (which does not require TurboScripts to be installed). My TurboScripts, however, try to provide better automated maintenance of the log tables, but require proper application of the scripts (and I am still working on the version for DNN 9.4.x).

@valadas
Contributor

valadas commented Nov 4, 2019

Yep, please try @sleupold's solution. I have done that many times; it works great and does not leave anything that needs to be removed/reapplied like the TurboScripts. Let us know...

@thabaum
Contributor Author

thabaum commented Nov 11, 2019

Are the event log and schedule history the same thing, with the Admin Log a separate log? I have had to clear the admin log the hard way in the past.

Thank you for sharing the script. Maybe it would be nice to get this into the DNN docs as a common troubleshooting and maintenance reference that explains these things and can be consulted easily.

@mitchelsellers
Contributor

Schedule History = The items in the ScheduleHistory table, a history of executed scheduled jobs.

Event Log = Admin Log functionality for errors, login activity, etc.

I think the biggest thing is that the tasks for purge need to run more frequently than scheduled to prevent growth.

@thabaum
Contributor Author

thabaum commented Nov 13, 2019

@mitchelsellers https://dnncommunity.org/forums/aft/532#2040

I think this thread is similar. Thanks for explaining those.

I suppose adding modules that also write to the schedule history log is probably what did it...

@thabaum
Contributor Author

thabaum commented Nov 13, 2019

I suppose I was just hoping to make it so I never have to do this.

I was thinking of running it more often; however, if I use Memory caching, is this an issue?

@mitchelsellers
Contributor

There are many root causes to this issue, and individual environments can exacerbate the problem, as we have a few things that make this happen.

  • Execution Frequency compared to Growth Frequency - If too many exceptions come in, and it grows too fast things can get out of hand.
  • Execution Timeout - DNN has a 30-second timeout, that is pretty much impossible to get past
  • Table Structures - Personally, I think that the structure of the eventLog needs some updates.

@thabaum
Contributor Author

thabaum commented Nov 13, 2019

OK, but how about Schedule History?

@thabaum
Contributor Author

thabaum commented Nov 13, 2019

I suppose I am talking about the Schedule History log more than the event log, as they should be different tables, right?

@mitchelsellers
Contributor

Schedule history suffers from the same limitations. If you have a bunch of third-party modules that run jobs every X minutes, they insert a lot more records. With the job running once per day, the table can get too large quickly.

For example, there are a few 3rd party modules out there that run a job every minute; in a given day that is growth of at least 1,440 records per job. Do that for a few jobs with verbose entries and the table gets much bigger. Then run the purge on a small Azure SQL instance and the query can take more than 30 seconds.
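To see which jobs are driving that growth on a particular install, a query along these lines can be run in the SQL console. This is a sketch that assumes the stock Schedule/ScheduleHistory schema (a FriendlyName column on Schedule, joined via ScheduleID); verify the column names against your own database before running it:

```sql
-- Sketch: count history rows per scheduled job, largest first.
-- Assumes the stock DNN Schedule / ScheduleHistory tables; verify
-- the column names (ScheduleID, FriendlyName) on your installation.
SELECT s.FriendlyName,
       COUNT_BIG(*) AS HistoryRows
FROM   {databaseOwner}[{objectQualifier}ScheduleHistory] h
JOIN   {databaseOwner}[{objectQualifier}Schedule] s
       ON s.ScheduleID = h.ScheduleID
GROUP BY s.FriendlyName
ORDER BY HistoryRows DESC;
```

A job that fires every minute should show up at the top with roughly 1,440 new rows per day, matching the arithmetic above.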

@thabaum
Contributor Author

thabaum commented Nov 13, 2019

This is exactly what I think my issue was... I have a lot of tasks from modules, I noticed, so this makes sense.

Maybe a Maintenance section in a knowledge base (dnndocs or similar) could explain these issues, so someone can be referred to all this helpful info down the road.

@thabaum
Contributor Author

thabaum commented Nov 13, 2019

thanks again for the help!

@thabaum
Contributor Author

thabaum commented Nov 13, 2019

Maybe the scheduler could be a smart purger: calculate when it will be time to purge, or dump records automatically once the table reaches a certain size, or delete an old record before inserting the next one once the limit is reached.

It could send a warning when the table grows too big, or you could configure it to auto-purge by size instead of by time.

@EPTamminga
Contributor

We could change the DELETE ... WHERE into a procedure that deletes only a certain number of records at a time (oldest first) and loops until the date limit (number of days to keep) is reached and no more records have to be deleted.
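Sketched in T-SQL, such a batched purge might look like the following. This is hypothetical: the batch size, the retention window, and the use of the StartDate column are illustrative assumptions, not the actual stored procedure:

```sql
-- Sketch of a batched purge: delete in small chunks so no single
-- statement runs long enough to hit DNN's 30-second command timeout.
DECLARE @BatchSize INT = 5000;   -- rows per DELETE (assumed value)
DECLARE @KeepDays  INT = 30;     -- retention window (assumed value)
DECLARE @Deleted   INT = 1;

WHILE @Deleted > 0
BEGIN
    -- TOP with no ORDER BY removes an arbitrary batch of matching rows;
    -- the WHERE clause guarantees only expired rows are ever touched.
    DELETE TOP (@BatchSize)
    FROM {databaseOwner}[{objectQualifier}ScheduleHistory]
    WHERE StartDate < DATEADD(DAY, -@KeepDays, GETDATE());

    SET @Deleted = @@ROWCOUNT;   -- loop until nothing is left to delete
END
```

Because each DELETE commits a small batch, locks are held briefly and a timeout mid-run loses at most one batch of progress rather than the whole purge.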

@thabaum
Contributor Author

thabaum commented Nov 13, 2019

This sounds good. A way to set the number of days to keep, and possibly a total number of records, so if there is a huge number of them one limit or the other will catch and clean up the table.

Something like this should help cure it. It probably needs to be a setting controlling both the number of records and the number of days to keep, so it can be configured to each site's needs.

There is a limit on the number of schedule history entries before the DNN admin log starts to error. Then the admin log fills up... then issues with DNN begin, if you leave a site running without regular attention.

It would be nice to find the sweet spot for when these records need to be cleaned out by DNN's purge tasks, before those known limits are hit.

It makes backups and the database huge for those of us who don't have time, or don't even know, to watch these.

How much can those DNN tables handle before they start to error or cause buggy behavior on the website? I am sure each environment has its own limits, but it sure feels like I upgraded to a bigger and faster environment than ever and still see results similar to a slower server.

@EPTamminga
Contributor

@sleupold Sebastian: is the SPROC with a loop something for you, as SQL wizard, to create?
There is an index on LogCreateDate, so a DELETE TOP (N) with a DESC order and a WHERE clause on the date should work?

@sleupold
Contributor

There are many root causes to this issue, and individual environments can exacerbate the problem, as we have a few things that make this happen.

  • Execution Frequency compared to Growth Frequency - If too many exceptions come in, and it grows too fast things can get out of hand.
  • Execution Timeout - DNN has a 30-second timeout, that is pretty much impossible to get past
  • Table Structures - Personally, I think that the structure of the eventLog needs some updates.

This is correct for the EventLog, but not necessarily for ScheduleHistory, which is a rather simple single table with usually no more than a few thousand entries (a tiny number for SQL Server), since each job defines the number of items to keep (usually 10, at most 500).

A problem occurs if either
a) the cleanup job "Purge Schedule History" was disabled for a notable time, or
b) the number of items to keep was set to unlimited by an admin for at least one job (AFAIK all have a limited number defined by default) and no other maintenance was implemented by this admin.

To be honest, in 98% of the log maintenance issues I encountered, the cause was the Event Log, not ScheduleHistory; in the remaining cases, there were issues with other long-running scheduler tasks.

@sleupold
Contributor

@thabaum

how many rows are contained in your ScheduleHistory table?
You may query the number by running the following statement from PersonaBar > Settings > SQL console:
SELECT COUNT_BIG(*) N FROM {databaseOwner}[{objectQualifier}ScheduleHistory]

@sleupold
Contributor

@sleupold Sebastian: is the SPROC with a loop something for you, as SQL wizard, to create?
There is an index on LogCreateDate, so a DELETE TOP (N) with a DESC order and a WHERE clause on the date should work?

@EPTamminga,
The latest version of the sproc PurgeScheduleHistory already limits the number of items to be purged to 5,000 (my TurboScript version limits it to 50,000). However, this does not solve the issues I mentioned before.

In general, DNN log tables are not implemented ideally, as they combine multiple purposes.
A proper log table does not have any indexes (a so-called heap) and is optimized for storing items as quickly as possible, so as not to interfere with the main process. A separate background process then reads the items from the table in batches (ideally all at once) and moves them to analysis tables, which are in a properly optimized format, where they can be queried without placing read locks that interfere with the process being logged.
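The pattern described above could be sketched roughly as follows. All table and column names here are illustrative only, not DNN's actual schema: an unindexed heap receives inserts, and a background job drains it into an indexed analysis table in one atomic statement.

```sql
-- Sketch: heap ingest table (no indexes, fast inserts) plus an
-- indexed analysis table. Names are hypothetical, not DNN's schema.
CREATE TABLE LogIngest (
    LogDate   DATETIME2     NOT NULL,
    LogType   NVARCHAR(50)  NOT NULL,
    LogProps  NVARCHAR(MAX) NULL
);  -- no PK, no indexes: a heap, so inserts never wait on index maintenance

CREATE TABLE LogAnalysis (
    LogID     BIGINT IDENTITY(1,1) PRIMARY KEY,
    LogDate   DATETIME2     NOT NULL,
    LogType   NVARCHAR(50)  NOT NULL,
    LogProps  NVARCHAR(MAX) NULL
);
CREATE INDEX IX_LogAnalysis_LogDate ON LogAnalysis (LogDate);

-- Background job: move everything from the heap into the analysis
-- table in a single atomic statement, so queries against LogAnalysis
-- never place read locks on the logging path.
DELETE FROM LogIngest WITH (TABLOCK)
OUTPUT deleted.LogDate, deleted.LogType, deleted.LogProps
INTO LogAnalysis (LogDate, LogType, LogProps);
```

The DELETE ... OUTPUT INTO form moves the rows and clears the heap in one statement, so a crash between "copy" and "delete" cannot duplicate or lose entries.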

@EPTamminga
Contributor

@sleupold You are absolutely right, but... the ideal situation almost never exists, so we have to work with a non-optimized situation.

Can we start working on a better solution for the cleanup of the eventlog?

@thabaum
Contributor Author

thabaum commented Nov 14, 2019

It looks like I have about 2,000 rows after about 3-4 hours on one site.

@sleupold your scripts help big time. I had foreign key issues, used your script, and just saved it to the SQL console for later.

I just did the same with this one you gave me...

I am curious about one item in the DB that I don't think you can set:

PurgeEventLogBatchCount is 150,000 by default

I don't think you can set this in the UI, but what does it do exactly? Does it relate at all?

@sleupold
Contributor

@EPTamminga

Can we start working on a better solution for the cleanup of the eventlog?
The EventLog table is not the biggest issue (unless you edit log types to keep all entries).
But the Exceptions table tends to grow without limit. I added a solution for both issues in my TurboScripts; I am currently working on an update for DNN 9.4.2.

@sleupold
Contributor

sleupold commented Nov 14, 2019

@thabaum

It looks like I have about 2,000 rows after about 3-4 hours on one site.

That is not a big number and shouldn't be an issue, if PurgeEventlog runs frequently as well and deletes the oldest entries properly. If this sproc times out, there is an issue with your database; I would start by reorganizing/rebuilding the indexes.

@sleupold your scripts help big time. I had foreign key issues, used your script, and just saved it to the SQL console for later.

Foreign key issues should never appear in PurgeScheduleHistory; this table should be safe for TRUNCATE TABLE. Only the Eventlog in DNN 7.4.0 and up faces issues, due to the FK to ExceptionEvents.

PurgeEventLogBatchCount is 150,000 by default
I don't think you can set this in the UI, but what does it do exactly? Does it relate at all?

This is the batch size used when purging the EventLog table. There is no UI for this value, but you may modify it by running the following SQL statement:

UPDATE {databaseOwner}[{objectQualifier}HostSettings] 
SET  SettingValue = 10000 -- or any value you prefer
WHERE SettingName Like N'PurgeEventLogBatchCount';

@valadas
Contributor

valadas commented Nov 18, 2019

Just to triage this issue: is there something actionable to do in DNN, or is this something we can close?

@thabaum
Contributor Author

thabaum commented Nov 18, 2019

I would like that database setting to be configurable. And maybe there is something else we can do to keep these tables from growing so large in the first place.

Some good stuff was discussed here; I am just not sure anything will happen soon.

@valadas
Contributor

valadas commented Nov 18, 2019

Some good stuff discussed here just not sure if anything soon will happen

That's kind of where I am going with my question. I, for one, never had issues unless there was a rogue module, or I picked up a very old site with a ton of problems, or the scheduled task was disabled, or there was some environmental problem of some sort... and those needed more than a button to clear the logs to fix the root cause anyway. We already have a UI to clear the log; the issue is that when it gets really huge, the SQL connection times out. So is there even a possible solution inside DNN here, knowing we cannot just drop the table due to its relations?

So does someone intend to build something for this issue, or is it specific to this site or that site, with nothing we can do in DNN itself?

@thabaum
Contributor Author

thabaum commented Nov 21, 2019

@EPTamminga

Can we start working on a better solution for the cleanup of the eventlog?
The EventLog table is not the biggest issue (unless you edit log types to keep all entries).
But the Exceptions table tends to grow without limit. I added a solution for both issues in my TurboScripts; I am currently working on an update for DNN 9.4.2.

@thabaum
Contributor Author

thabaum commented Nov 23, 2019

For now, I suppose the help and support collected here could be written up, and others referred to it:
DNN Docs Troubleshooting TS Error SQL Timeout

I will close this issue unless someone wants to reopen it; just open a new issue if you come up with a solution that could work. Or check out the TurboScripts GitHub page as another place for a possible solution to this issue.

@thabaum thabaum closed this as completed Nov 23, 2019