Enhancement: DataTable or IDataReader support for Bulk Insert #737
Comments
To do it 'right' I'm thinking a subclass of … In my own code I just dump everything into a massive MemoryStream, which isn't great for performance.
Are you talking about the MySqlBulkLoader class? It seems like a helper class could be written that sets up a MySqlBulkLoader.
Essentially yes. As for IDataReader, it makes for a useful abstraction. For SQL Server I'll wrap a collection of objects, an enumeration of objects, or a file parser in an IDataReader and stream it in instead of building a fat DataTable. In fact, DataTable can expose an IDataReader, so we'd only need one implementation.
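For example, a DataTable can hand back an IDataReader without copying its rows; a minimal sketch (the schema and values here are purely illustrative):

```csharp
using System;
using System.Data;

// Illustrative source data.
var table = new DataTable();
table.Columns.Add("id", typeof(int));
table.Columns.Add("name", typeof(string));
table.Rows.Add(1, "alpha");
table.Rows.Add(2, "beta");

// DataTable.CreateDataReader() returns a DataTableReader, which implements
// IDataReader, so one IDataReader-based code path can serve both inputs.
using IDataReader reader = table.CreateDataReader();
while (reader.Read())
{
    Console.WriteLine($"{reader.GetInt32(0)}: {reader.GetString(1)}");
}
```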
Whether it is part of MySqlBulkLoader or a separate extension method doesn't matter to me, though it should be easily discoverable.
I've developed a prototype version of this at e0a2de5. Current API (subject to change):

```csharp
using var connection = new MySqlConnection("...;AllowLoadLocalInfile=true");
var bulkLoader = new MySqlBulkLoader(connection, dataTableOrDataReader)
{
    TableName = "destination_table",
};
bulkLoader.Load();
```
That's not at all what I expected it to look like. It's certainly a lot cleaner than what I could have done from outside the library.
Updated API proposal: 98db02d#diff-e5e5c76f457fba4a79beedb1995a9f5f

```csharp
using (var connection = new MySqlConnection("...;AllowLoadLocalInfile=True"))
{
    await connection.OpenAsync();
    var bulkCopy = new MySqlBulkCopy(connection);
    bulkCopy.DestinationTableName = "some_table_name";
    await bulkCopy.WriteToServerAsync(dataTable);
}
```

@Grauenwolf how important/useful to you would it be to have the capability to load from …?
Do you mean just passing in stuff like …? If so, I don't need it personally, but I think it would be useful to others.
Yes, I was just referring to the latter (which would be a simple extension to the current code); I didn't have any plans to add reflection, matching property names with field names, etc.
I've been testing this and it's working nicely except for text columns that contain newlines: values with an embedded newline are not copied to the server correctly.
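A minimal repro along these lines (hypothetical table name and data, not the original testcase) shows the symptom:

```csharp
using System.Data;
using MySqlConnector;

using var connection = new MySqlConnection("...;AllowLoadLocalInfile=True");
await connection.OpenAsync();

// Hypothetical destination table: CREATE TABLE bulk_copy_test (id INT, value TEXT);
var table = new DataTable();
table.Columns.Add("id", typeof(int));
table.Columns.Add("value", typeof(string));
table.Rows.Add(1, "line one\nline two"); // embedded newline in a text column

var bulkCopy = new MySqlBulkCopy(connection) { DestinationTableName = "bulk_copy_test" };
await bulkCopy.WriteToServerAsync(table);
// Before the fix, the unescaped newline was treated as a row separator,
// so the value was split instead of being stored intact.
```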
The string-escaping routine is missing a check for `\n`. Thanks for pointing this out; a fix will be in the next release.
@dennis-gr 0.62.0-beta3 is available and should fix this problem; thanks for reporting it!
Thank you both for this!
Thanks for the quick fix, working as expected now.
There are a few more useful items missing (compared to SqlBulkCopy), such as BatchSize and NotifyAfter: …
For SqlClient, this is documented as "Number of rows in each batch. At the end of each batch, the rows in the batch are sent to the server. ... Zero (the default) indicates that each WriteToServer operation is a single batch." The current implementation streams all the rows to the MySQL server, although there are implicit batches formed every 16MiB since that's the maximum size of a single MySQL network packet. Is there any specific outcome you want to accomplish by setting a BatchSize?
😱
It seems reasonable that a … could be added.
Also seems reasonable to implement (along with the SqlRowsCopied event).
Accepting …
BatchSize is important for SQL Server because it has a measurable impact on performance. If that's not the case for MySQL, I think it's safe to omit it.
I think BatchSize in SQL Server will limit memory/network usage. I guess that 16MB parameter would play a similar role, but is it configurable? One more item to consider is the ColumnMappings property.
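For comparison, this is roughly how those knobs look on SqlClient's SqlBulkCopy (real SqlClient API, but the table, column names, and values here are illustrative):

```csharp
using System;
using System.Data;
using Microsoft.Data.SqlClient;

// Illustrative source data.
var dataTable = new DataTable();
dataTable.Columns.Add("source_id", typeof(int));
dataTable.Columns.Add("source_name", typeof(string));
dataTable.Rows.Add(1, "alpha");

using var connection = new SqlConnection("...");
connection.Open();

using var bulkCopy = new SqlBulkCopy(connection)
{
    DestinationTableName = "dbo.Destination",
    BatchSize = 10_000,   // rows sent to the server per batch
    NotifyAfter = 50_000, // raise SqlRowsCopied after every N rows
};
bulkCopy.SqlRowsCopied += (sender, e) => Console.WriteLine($"{e.RowsCopied} rows copied");

// Map source columns to differently named destination columns.
bulkCopy.ColumnMappings.Add("source_id", "DestinationId");
bulkCopy.ColumnMappings.Add("source_name", "DestinationName");

bulkCopy.WriteToServer(dataTable);
```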
One more issue that I've run into: if the data size exceeds the maximum network packet size of 16 MB, the copy operation fails.
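A sketch of the kind of test that can hit this (hypothetical table; the only requirement is that the payload exceeds one 16 MiB packet):

```csharp
using System.Data;
using MySqlConnector;

using var connection = new MySqlConnection("...;AllowLoadLocalInfile=True");
await connection.OpenAsync();

// Hypothetical destination table: CREATE TABLE packet_test (id INT, value LONGTEXT);
var table = new DataTable();
table.Columns.Add("id", typeof(int));
table.Columns.Add("value", typeof(string));

// Roughly 32 MB of row data in total, i.e. more than a single 16 MiB MySQL packet.
string payload = new string('x', 1_000_000);
for (int i = 0; i < 32; i++)
    table.Rows.Add(i, payload);

var bulkCopy = new MySqlBulkCopy(connection) { DestinationTableName = "packet_test" };
await bulkCopy.WriteToServerAsync(table); // fails once the data crosses the packet boundary
```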
Yes; this is not currently supported.
I've been giving this some more thought. If we support batch size, then we could also support notifications. In SQL Server, you can ask it to trigger an event after every N records are uploaded.
Quick question: will this support transactions? I can't find any information on it one way or the other. (Back in 2006 it would silently commit any pending transactions, but that was a long time ago.)
Transaction support would be up to MySQL Server. I haven't found any definitive documentation about it, so I would assume …
You cannot currently send a single string that is longer than 16 MiB (this is a known limitation). However, if you were reporting that any string that crosses the 16 MiB packet boundary fails to be sent, this is indeed a bug that should be fixed.
"v1" of this feature is implemented in 0.62.0 (documentation: https://mysqlconnector.net/api/mysql-bulk-copy/); please open new issues for bugs or feature requests to |
It would be really nice if we could just pass in a DataTable or IDataReader like we do with SQL Server and let the driver handle the CSV conversion and stream management.
This way we don't have people guessing about things like how to encode strings, dates, and nulls.