Inserting over ~3000 rows concurrently causes SQLSTATE=57011 SQLCODE=-904 #24
Comments
Original comment by Xu Meng (Bitbucket: mengxumx, GitHub: dmabupt). @KerimG , I tried to insert 5000 rows but cannot recreate the issue.
|
Original comment by Kerim Gueney (Bitbucket: KerimG, GitHub: KerimG). @dmabupt , I'm inserting ~60 values per insert. I checked out the spoolfile and got the following:
Maybe that helps a bit more. EDIT: wrong error code |
Original comment by Kerim Gueney (Bitbucket: KerimG, GitHub: KerimG). Is it possible that I need one of these two PTFs? http://www-01.ibm.com/support/docview.wss?uid=nas3SI44448 http://www-01.ibm.com/support/docview.wss?uid=nas3SI30791 http://www-01.ibm.com/support/docview.wss?uid=nas2SE49099 I don't really understand whether the problems these PTFs fix match the problem I seem to have. |
Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns). There are many possibilities ... Perhaps check the joblog while it runs and see which limit you hit (below). Good luck. BTW - I am assuming you understand that your node program can hold the connection open while you observe the attached QSQSRVR joblog (aka, don't close the connection).
|
Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns). Oops ... sorry ... I see you already found it ... "When I checked the job log, it said that it was resource number 14." |
Original comment by Kerim Gueney (Bitbucket: KerimG, GitHub: KerimG). Hey @rangercairns , thank you. How did you find the recovery text? That's actually a bit helpful. What exactly is a descriptor? Is it possible to increase the maximum number of descriptors I can allocate? Is my code not deallocating? Or not deallocating fast enough? |
Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).
Well, CLI is really simple. That is, 32K 'handles/descriptors' are allowed in a process. Each time you open a 'connection', 'statement', etc., one of the slots in the DB2 CLI array is marked 'in use' (a technical simplification, but it really is an array). BTW -- I can't recall whether 32K is still the exact limit in modern DB2 (may be old info). Point is, you should never have that many actively 'in use' handles in a single process anyway (unless you are Google at millions of hits/second).
No.
In this case, 'resource number 14' would indicate you used them all up. Speculation, by the simple answer: you allocated statements but never closed them. Aka, assuming this is not a PTF case where DB2 itself is at fault (less likely, but possible).
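As a toy illustration of the handle-table idea above (a deliberate simplification, not the real DB2 CLI code), a fixed array of slots marked 'in use' behaves like this: allocate without ever freeing, and eventually every slot is taken and the next allocation fails, which is the shape of the SQLSTATE=57011 symptom.

```javascript
// Toy model of a CLI handle table: a fixed-size array of slots,
// each marked "in use" when a handle is allocated. The class name
// and limit here are illustrative, not actual DB2 internals.
class HandleTable {
  constructor(maxHandles) {
    this.slots = new Array(maxHandles).fill(false);
  }
  alloc() {
    const i = this.slots.indexOf(false); // find a free slot
    if (i === -1) throw new Error("no free handles (cf. SQLSTATE=57011)");
    this.slots[i] = true;
    return i; // the "handle" is just the slot index
  }
  free(handle) {
    this.slots[handle] = false; // closing a statement frees its slot
  }
  inUse() {
    return this.slots.filter(Boolean).length;
  }
}
```

The point of the model: handles only come back when something calls `free` (i.e. closes the statement), so a loop that allocates statements without closing them must eventually exhaust the table.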
BTW -- Technically, on the prepare/bind-params side: if each write block of records is always the same row size (fixed-size columns) and the same number of rows (record count), then the blocks are exactly the same. You should be able to simply re-use a single handle and re-bind parameters with the next set of records until the last insert (a different number of rows means a different block size).
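A sketch of that re-use pattern: prepare the blocked INSERT once, re-bind each full block against the same statement, and use one extra statement only for the final partial block. Here `prepareInsert`, `bindAndExecute`, and `close` are stand-in names for whatever the driver provides, not the actual db2a API.

```javascript
// Re-use one prepared statement for every full-size block of rows;
// the trailing partial block gets its own statement because its row
// count (and hence block size) differs. All driver names are hypothetical.
function insertInBlocks(rows, blockSize, prepareInsert) {
  const fullStmt = prepareInsert(blockSize); // one handle, reused
  let i = 0;
  for (; i + blockSize <= rows.length; i += blockSize) {
    fullStmt.bindAndExecute(rows.slice(i, i + blockSize));
  }
  if (i < rows.length) {
    // last block has a different row count, so it needs its own statement
    const tailStmt = prepareInsert(rows.length - i);
    tailStmt.bindAndExecute(rows.slice(i));
    tailStmt.close();
  }
  fullStmt.close();
}
```

With this shape, inserting thousands of rows consumes at most two statement handles instead of one per block.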
BTW -- Only helping, not trying to take over issue. China folks are probably sleeping by now (other side world). |
Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns). Another thought on debugging ... you could dump the stmt number as your program runs along to see if it is climbing up the bean stalk. |
Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).
I suggest we offer two stmt close APIs.
|
Original comment by Kerim Gueney (Bitbucket: KerimG, GitHub: KerimG).
Good idea. I did that and found out that the code doesn't exactly run asynchronously. At least not in the way I expected. My output looks something like
I expected something more like:
Where several stmt objects are released before others are created or executed. |
Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).
The exact async fear of mine is that all the stmts get created (3000) before any bind/insert operations run. That is, the async call writeDb2(data) creates statements like crazy (at the speed of light), followed by essentially a covered-wagon operation to insert each block of records (git along, ol' mule). Also, no amount of yelling at the author of the insert mule (block size) will actually fix a relative timing problem. The real problem is 'traditional database' API thinking itself (see below). Note: Worse, in a web daemon, multiple requesters (browsers) may multiply the problem: 3000 stmts * 3000 requests = 'way too many handles'.
The following is only my opinion (years of experience). Do not read it unless you want to understand the truth and issues of the current design using the existing fine-grain CLI APIs.

Problem: Async 'operations' only work when the granularity of the operations is significant. That is, fine-grained CLI operations don't really work, for instance like

Answer: You need composite async 'APIs' that actually do aggregate operations (connect/stmt/prepare/bind/execute/fetch). That is, aggregate new CLI APIs, for instance like the litmis/db2sock toolkit db2 driver SQL400JsonAsync(injson,outjson). Wherein the entire essence of the whole operation occurs in one async call.
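The fan-out problem described here can be shown in plain JavaScript, with no db2a at all: launching every operation at once versus capping how many are in flight. `runLimited` below is a minimal worker-style scheduler of my own naming, not part of any driver.

```javascript
// Cap concurrent async operations: instead of firing all 3000 tasks
// at once (allocating a handle for each), run at most `limit` at a
// time. Each "worker" pulls the next task index when it finishes one.
async function runLimited(tasks, limit) {
  const results = [];
  let next = 0;
  async function worker() {
    while (next < tasks.length) {
      const i = next++;          // claim the next task (single-threaded JS)
      results[i] = await tasks[i]();
    }
  }
  const n = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: n }, worker));
  return results;
}
```

With `limit` set near the number of statements (or pooled connections) you are willing to hold open, the statement count can never climb the bean stalk.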
|
Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns). Oh, oh! The world got silent. I am only trying to help you understand async programming. Well, I was trying to be nice. Your design is very bad with writeDb2(data). I have the heart of a teacher, so I will try to help you. So, let me get to the rest of the issues with writeDb2(data).
Note: Better news ... litmis/db2sock project (new libdb400.a), has a mutex at connection level for each stmt use, so this may work with new driver. Aka, stmts will synchronize one at a time due to the lock with db2sock.
Assuming you want to stay with the current (faulty) technology, you will need to use a connection pool to work around the missing mutex in the old PASE driver (node db2a -> libdb400.a). Like this ... you will be doing inserts with a connection pool (not remove-by-id as below, but the same net 'pooled' effect for inserts). That is, you will spread the insert blocks across many connection/QSQSRVR jobs. Essentially, you will have one connect + db2a.dbstmt + bindAndExecute in each of the connections. Aka, you will 'trick' DB2 into thinking this is simply a bunch of separate jobs all inserting block data at the same time (no worry about thread safety here). Note: The bear pool design is posted to litmis/nodejs examples/express_api_bears_node6. You do not have to follow it literally, but it presents the idea of 'connection pools'. This is what is missing from your writeDb2(data) design.
Graphically, the many-connection design looks like below. Each connect + stmt + bindAndExecute set will run on a different QSQSRVR job.
Note: Of course there will be a performance wall to hit when you overrun the pre-started QSQSRVR jobs. That is, it will run great for a while until stress causes many, many, many QSQSRVR jobs to be created to match demand ...
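A minimal checkout/checkin pool along these lines might look like the following sketch. `factory` stands in for a real db2a connect; the class and method names are illustrative, not the litmis bear-example API.

```javascript
// Minimal connection pool: a fixed set of connections created up
// front, handed out on checkout, and returned on checkin. When the
// pool is empty, checkout waits until another caller checks one in.
class Pool {
  constructor(size, factory) {
    this.idle = Array.from({ length: size }, factory);
    this.waiters = [];
  }
  checkout() {
    if (this.idle.length > 0) {
      return Promise.resolve(this.idle.pop());
    }
    // no idle connection: queue up until one is checked back in
    return new Promise(resolve => this.waiters.push(resolve));
  }
  checkin(conn) {
    const waiter = this.waiters.shift();
    if (waiter) waiter(conn);     // hand directly to the next waiter
    else this.idle.push(conn);    // otherwise back to the idle list
  }
}
```

Usage would follow the usual shape: `const conn = await pool.checkout();` then the insert work, then `pool.checkin(conn)` in a `finally` so stress cannot leak connections.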
Well, this is tough, because conventional database CLI thinking is not overly good for 'async'. I already described one possible single-call technique using SQL400JsonAsync. Of course, people MAY want to 'go faster' than a json design, so a 'similar binary' interface may be needed. However, on the other hand, it seems node programs always want to reduce everything 'data' to json anyway (cut out the db2a middle man). But I digress ... back to the current driver. I have already described in a previous post the idea of stmt.reset(). However, for that 'slimmer resource' design to work you will still need some sort of checkin/checkout connection pool design, wherein excess stress usage of the 'inserts' is synchronized (probably the dreaded 'promises').
I love the elegance of JavaScript async callback design. I love nested callbacks that fire willy nilly with no obvious order. This may be the best language ever created due to 'async'. Aka, I consider frameworks like 'promises' or 'await' to be for people that can't write or don't like writing JavaScript. However, these frameworks are very popular with some developers. If I may be bold, most casual web design folks should stay with python or php, as these languages really cater to 'procedural' or 'traditional' database designs. Node 'async' can be difficult for many, especially when trying to map onto a non-async database like db2 (traditional CLI APIs). Again, not picking on you per se, just trying to help you understand 'async' (bad design, function writeDb2). You may choose to ignore my teaching as you wish. |
Original comment by Kerim Gueney (Bitbucket: KerimG, GitHub: KerimG).
Haha @rangercairns, don't worry. I'm not offended or anything. I appreciate your help, I'm on a business trip until Friday (and then again until Tuesday) so I don't really have much time to work on this issue. All is good mate, thanks a lot :-) I'll check out what you wrote in detail as soon as I can. |
Original comment by Kerim Gueney (Bitbucket: KerimG, GitHub: KerimG). @rangercairns I have only one request. Please write more succinctly and clearly. Avoid writing stuff like
or
or your garage story. Those things add no value to your otherwise great responses and make your responses less approachable. So, connection pooling it is. Thanks! |
Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).
Cool.
One more bit of geek business about 'async' APIs in the db2a node driver. Currently db2a 'async' operations (db2a function c code) use libuv (see uv_work_t in the c code). This is a common node technique to let 'stuff' that is not normally threaded (the 1st version of db2a was not async) participate in 'async' proper (the current version of db2a).
Most important is how many worker threads are available to handle db2a 'async' operations at one moment (it's not 3000). Basically, if you read the link above ... "Its default size is 4, but it can be changed at startup time by setting the UV_THREADPOOL_SIZE environment variable to any value (the absolute maximum is 128)." So uv worker threads play a factor in your connection pool design, should you choose to follow my suggestion. I suspect the pool connection count may match UV_THREADPOOL_SIZE (only a theory). For more info ... a link with a fellow exploring UV_THREADPOOL_SIZE.
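For reference, raising the libuv pool is just an environment variable set before node starts, per the libuv/node documentation quoted above (the script name here is a placeholder):

```shell
# Raise libuv's worker-thread pool before starting node.
# Default is 4; the documented hard maximum is 128. The value must be
# in the environment before the node process starts -- it cannot be
# changed after startup.
export UV_THREADPOOL_SIZE=16
# then launch the app as usual, e.g.:  node app.js
```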
Yes or no, 'promises' may be the next db2a API. We will still have to deal with 'async' in a rational way. To wit, I don't know whether node's current use of libuv will stay in node-author style/vogue for 'async' db2a (the next generation). Certainly litmis/db2sock's natural aggregate 'async' interfaces may have some influence (or not). Another wild card: it may turn out that using the json-style interface SQL400Json from litmis/db2sock for both toolkit calls and db2 work is a fine answer. Somebody (not me) should write a little combined json toolkit/db2 driver to see if it makes life much better for everyone node scripting on IBM i. Maybe talk with Jesse G. if your intuition reflects mine on 'pure json' being the answer for both db2 and toolkit. Again, only a wild guess here. Ok, happy coding. |
Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns). Yes, wit can be singular. Old geeks do forget humor is eye of beholder. Alas, I will retire and younger folks follow. Later. |
Original comment by Kerim Gueney (Bitbucket: KerimG, GitHub: KerimG). @rangercairns Hah, I didn't mean to come across rude or anything. You coming to the POW3R conference in Hannover? I'll buy you a German beer :-) |
Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).
I dunno. I explained your situation clearly to the IBM i co-op working on the next idb2 (next door to mine). He understands the idea that a single-connection QSQSRVR job can be overrun doing many async stmt operations. Anyway, the bear project is simply an example introducing two concepts. First, connection pooling (everybody is doing this wrong on IBM i). Second, result caching for better performance (a controversial argument that the truth of data can be a white lie for performance). Feel free to investigate (or not). However, design-wise, one can always argue about the line between the responsibility of the user script vs. the middleware (db2a driver). That is, who bears responsibility for creating a connection pool. I suggest you open a direct dialogue on the new idb issues. I mean, I only work part time, so best to talk with the author of this new driver on his issues. (I am regretting involvement a bit already.) |
Original comment by Tony Cairns (Bitbucket: rangercairns, GitHub: rangercairns).
We are cool. No, i don't travel. Also, like many quirky geeks, i don't drink. However, I can assure you, I am trustworthy (... more humour). |
Original comment by Danny Roessner (Bitbucket: droessner, GitHub: droessner).
@ThePrez Can you comment? We very much like the pure json approach of db2sock and would like to know if any plans have been made for any official implementation. |
Original comment by Jesse G (Bitbucket: ThePrez, GitHub: ThePrez). @droessner, I agree that the JSON interface should be pursued. It is on the radar, but there are currently no dates to share. |
Original comment by Xu Meng (Bitbucket: mengxumx, GitHub: dmabupt). Per Tony's suggestion @2018-03-13, from v1.0.9 I added the new API stmt.reset() to support resetting param bindings. |
Original comment by Kerim Gueney (Bitbucket: KerimG, GitHub: KerimG). I think this issue can be closed. Unfortunately, I can't really find the time to test the suggested solution but I trust it works. Thank you. @rangercairns Thank you, as well. This discussion was very educational, I really appreciate it. |
Original comment by Kerim Gueney (Bitbucket: KerimG, GitHub: KerimG). Hey @rangercairns , I tried to contact you via email via adc@us.ibm.com but I get a user unknown error. I wanted to tell you a little something something about http://yips.idevcloud.com/wiki/index.php/PASE/PASENutshell and stuff. You got any other way to email you? Cheers! |
Original report by Kerim Gueney (Bitbucket: KerimG, GitHub: KerimG).
I am attempting to insert a few thousand rows of data concurrently with the following code:
when I do, I get
When I checked the job log, it said that it was resource number 14.
How can I avoid that? Should I insert all values with a single insert statement?