Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Oracle "Pooled connection request timed out" that requires process restart #397

Closed
object opened this issue Apr 24, 2017 · 11 comments
Closed

Comments

@object
Copy link
Contributor

object commented Apr 24, 2017

Description

We need sometimes to restart our service that uses SQLProvider with Oracle after it start throwing exception "Pooled connection request timed out". Prior to that there is a number of "Connection request time out" exception.

Repro steps

The service may run many hours before this happens. It looks like occasional connection request timeout cause some db resource leak that eventually result in pooled connection request exception, and after that the process need to be restarted. Restarting the process fixes the problem.

I checked SQLProvider source code, and it seems that clients don't need to do any db resource cleanup - none of exposed types is even disposable, so apparently all db connections must be opened and closed internally by SQLProvider. So I wonder what might cause this issue.

Related information

We use latest version of SQLProvider and ODP.NET

@Thorium
Copy link
Member

Thorium commented Apr 24, 2017

Do you have Oracle in a separate machine? Does network outages affect to this somehow?

@object
Copy link
Contributor Author

object commented Apr 24, 2017

Yes, Oracle runs on its own server. Network outage might happen. What I am trying to find out is how we can resolve this without restarting the service. Would enabling and collecting Oracle performance counters help with the investigation?

@Thorium
Copy link
Member

Thorium commented Apr 24, 2017

I can completely mislead you to a wrong direction, maybe Oracle can also leak connection pool connections somehow, I don't know. :-) But I had this kind of problems myself:

Do you do GetDataContext() for every single query, or do you have only one context through out the application? I imagine if you have only one context and there would be e.g. restart of the Oracle server or a network outage, then the connection goes to a broken state, how SQLProvider could know to restore it? SQLProvider is expecting ADO .Open() will fix all the connection problems, but does it?

I'm currently doing each write context in a separate context (as I use transactions and I want them to be small) and I'm currently doing my read-queries in one context. But the read-context can corrupt if there is a network outage or if there is windows update in my servers and my service is faster up than my database, so my program is starting in a condition where database is unavailable.

So I have done some ugly error-handling to my read-context:

type TypeProviderConnection = SqlDataProvider<...>
let cstr = "my runtime connection string here"

let internal createDbManagementContext() =
    let rec createCon x =
        try
            if cstr = null then TypeProviderConnection.GetDataContext()
            else TypeProviderConnection.GetDataContext cstr
        with // deal with network errors with a small retry:
        | :? System.Data.SqlClient.SqlException as ex when x < 3 -> 
            Console.WriteLine ex.Message
            System.Threading.Thread.Sleep 50
            createCon (x+1)
        | :? System.Data.SqlClient.SqlException as ex when x < 5 ->
            let l = logger.Force()
            Console.WriteLine ex.Message
            System.Threading.Thread.Sleep 1500
            createCon (x+1)
    createCon 0

// Create a new DataContext only when previous is disposed or not-existing for some reason:
let mutable internal contextHolder = Unchecked.defaultof<Lazy<DataContext>>
let dbReadContext() =
    if contextHolder = null || not (contextHolder.IsValueCreated) || box(contextHolder.Value) = null  then
        try
            let itm = lazy(createDbManagementContext())
            contextHolder <- itm
        with
        | e -> 
            Console.WriteLine ex.Message
    contextHolder.Force()

let qry =
            query {
                for item in dbReadContext(). ...
            }

I have created #368 to describe this problem. I don't know how should we deal with these kind of problems but I'm open to suggestions.

@object
Copy link
Contributor Author

object commented Apr 24, 2017

Thank you for so comprehensive response. In the past we created context only once, but that didn't work well in case of network outage, so now we are obtaining the context for each request. I believe my issue is of the same nature, so it will be interesting to find a common approach to it. Since we renew db context for each request, we can't fix this issue outside SQLProvider code. But I will investigate further how we can respond to Oracle exceptions inside SQLProvider implementation to recover connectivity after the network outage.

@Thorium
Copy link
Member

Thorium commented Apr 24, 2017

Thanks, this can be actually common to other providers also.

@object
Copy link
Contributor Author

object commented Apr 24, 2017

BTW, do you remember what is the meaning of two calls to con.Open?

        con.Open()

        use scope = TransactionUtils.ensureTransaction transactionOptions
        try
            // close the connection first otherwise it won't get enlisted into the transaction
            if con.State = ConnectionState.Open then con.Close()
            con.Open()
            ...
        finally
            con.Close()

There are two Open() but one of them is outside the try block. Can this become the source of my troubles?

@Thorium
Copy link
Member

Thorium commented Apr 24, 2017

The purpose of close the connection is that the if you create a transaction inside an open connection, the connection won't be part of the transaction. So the connection has to be closed before creating a transaction. Apart from Microsoft Access, where closing connection will cause file to be locked and un-usable.

@object
Copy link
Contributor Author

object commented Apr 24, 2017

Yes, but extra con.Open() on the top that is not closed? And if you compare with async version of the method, it doesn't have it.

@object
Copy link
Contributor Author

object commented Apr 26, 2017

I removed the extra con.Open(), tested and opened a new PR.

@object object closed this as completed Apr 26, 2017
@Thorium
Copy link
Member

Thorium commented Apr 26, 2017

Thanks!

Very weird that calling con.Open() twice would open two different connections, but ok...

@object
Copy link
Contributor Author

object commented Apr 26, 2017

I am not sure it did. Tried to set up Oracle performance counters but gave up unsuccesfully. However, this morning for the first time in a while our service didn't lose Oracle connectivity. But this statement was not needed in any case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants