Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is it possible to insert multiple rows in one go? #70

Open
ghost opened this issue Nov 9, 2015 · 6 comments
Open

Is it possible to insert multiple rows in one go? #70

ghost opened this issue Nov 9, 2015 · 6 comments
Labels

Comments

@ghost
Copy link

ghost commented Nov 9, 2015

I noticed that I when I want to insert bulk data to a single table the insert query is built for every row of the bulk data separately and ran individually. Is it possible to combine these or does restsql already support this?

Something like the multirow VALUES syntax of postgres:

INSERT INTO films (code, title, did, date_prod, kind) VALUES
('B6717', 'Tampopo', 110, '1985-02-10', 'Comedy'),
('HG120', 'The Dinner Game', 140, DEFAULT, 'Comedy');

@restsql
Copy link
Owner

restsql commented Nov 10, 2015

You can send a single bulk request to restSQL (a post with an xml or json body of multiple rows), but each row will be a separate sql insert.

Are you concerned about performance?

Mark

@ghost
Copy link
Author

ghost commented Nov 11, 2015

yes, specially If I have large POST request my request takes more time then my webserver's user session idle timeout which is 30 mins by default on apache shiro

@honzabilek4
Copy link

Hi,
is there any progress on this? Has anything changed in latest releases? If not, would it be possible to extend the framework in order to achieve this functionality?
I'd like to use the framework in one of my projects, however I'm expecting pretty high insert rate and this would be a huge bottleneck I guess.

@restsql
Copy link
Owner

restsql commented Nov 8, 2017

Sorry it hasn't been implemented. Please be my guest to extend the framework. It's the liberal MIT license, so you can keep it yourself, or contribute it back to the community.

I will take a look soon and send some hints on where to start.

Mark

@honzabilek4
Copy link

Thanks, I'd really appreciate any hints on this.
JB

@restsql
Copy link
Owner

restsql commented Nov 11, 2017

I see a few of areas needing extensions.

The org.restsql.core.Request currently represents one sql resource operation (read or write). One API request can contain multiple of Requests. The RequestDeserializers demux and create one Request per contained json object/xml element, calling write on the sql resource.

We could go two ways. Change Request to handle multiple requests. That looks challenging to me looking at its interface. I'd go the other way and create a batch write method on sql resource. Change the deserializers to send a collection of them into the sql resource.

Next question, is batch writes the norm, so it tries optimistically to do this? Or is it an explicitly new path. Then we create new deserializers and create a new api method, or parameterize it.

Also, things get complicated when accounting for hierarchical sql resources. You may want to avoid it altogether and only support flat sql resources initially.

First, I'd start with adding a new write method overload on SqlResource/SqlResourceImpl that accepts Set. The current one only takes one Request.

Second, you need to extend the SqlBuilder/AbstractSqlBuilder to handle batched writes.

Third, the RequestDeserializer implementations. Start with JSON or XML in the org.restsql.core.impl.serial package, whichever suits you. Both have Handler inner classes that parse the content and execute a write operation on the sql resource for each object in the posted array/element set. On endObject()/endElement() it calls executeRequest() which calls sql resource write operation. Change it to put the request objects into a set, and after parsing send it into the batch write. Not sure if it's easier to create a parallel BatchXxxRequestDeserializer or paramerize the current ones with the mode, batch or non-batch.

If we think it safer to only batch if indicated, i.e. we need a new insert/POST method or an indicator on the request. Not sure if we want to explore another HTTP method. Can use a new header. Or insert some flag in the body. But I think probably the best option is a query parameter, e.g. _mode=batch. So then you need a new post method on ResResource that takes a mode parameter. And then pass that into the RequestDeserializer factory.

Alternatively, batching could be the normal attempt. Doesn't every database support multi row inserts? I think that has to be SQL 92 standard. Then we don't need a new api operation, nor new deserializers or parameter to the existing serializers.

Mark

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants