Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New feature request: table.rows.clear() #1087

Closed
ttemple06 opened this issue Aug 18, 2020 · 1 comment
Closed

New feature request: table.rows.clear() #1087

ttemple06 opened this issue Aug 18, 2020 · 1 comment

Comments

@ttemple06
Copy link
Contributor

ttemple06 commented Aug 18, 2020

Would like to have a new method to clear out the table.rows array so that rows can be bulk inserted in batches of n rows at a time. This is particularly helpful when trying to bulk insert millions of rows from one MSSQL db to another, and wanting to do so in an offset of n records at a time. Trying to bulk insert millions of rows, even when pausing the request, row events are arriving faster than they can be processed, as described inside this SO article:
https://stackoverflow.com/questions/48645466/how-to-stream-2-million-rows-from-sql-server-without-crashing-node

Expected behaviour:

Having a table.rows.clear() method which simply clears the table.rows array would resolve this issue.

Actual behaviour:

Without a table.rows.clear() method, it is not possible to table.rows.add() and bulk insert in batches. What ends up happening is that the table.rows get bulk inserted the first time (e.g. let's say 10,000 is the offset), and the next time, those 10,000 plus another 10,000 get inserted, and so on, making the end result compounded.

Configuration:

Having a table.rows.clear() method which simply clears the table.rows array would resolve this issue. This method can be defined inside tables.js

Object.defineProperty(this.rows, 'clear', { value () { return this.splice(0, this.length) } })

Software versions

  • NodeJS: v12.16.1
  • node-mssql: ^6.2.0
  • SQL Server: MSSQL Server 2012 (SP4-GDR)
@ttemple06
Copy link
Contributor Author

ttemple06 commented Aug 19, 2020

Here is a code sample showing the new table.rows.clear() method being called from inside processRows:

async function doSomething () {

    const sql_table = await methodToCreateDestinationTableAndAddColumns()

    const request = new sql.Request();
    request.stream = true;
    request.query('select * from dbo.SourceTable');
    let rowsToProcess = [];
    const num_of_loops = 100
    let loop_count = 1;

    request.on('row', row => {
        // Emitted for each row in a recordset
        rowsToProcess.push(row);
        if (rowsToProcess.length >= 10000) {
            request.pause();
            await processRows();
          }
     });

    request.on('error', err => {
        // May be emitted multiple times
        console.log(err);
    });

    request.on('done', result => {
        // Always emitted as the last one
        await processRows();
        // close connections...

    });

    async function processRows () {
        console.log(`processing loop number ${loop_count} of ${num_of_loops}`)
        // here is the new method:
        sql_table.rows.clear()
        await methodToAddRowsToDestinationTable()
        await methodToDoBulkInsert()
        rowsToProcess = [];
        loop_count++
        request.resume();
      }
}

ttemple06 added a commit to ttemple06/node-mssql that referenced this issue Aug 26, 2020
dhensby pushed a commit to ttemple06/node-mssql that referenced this issue Nov 19, 2020
dhensby pushed a commit to ttemple06/node-mssql that referenced this issue Nov 19, 2020
dhensby pushed a commit to ttemple06/node-mssql that referenced this issue Nov 22, 2021
@dhensby dhensby added this to the v8.0.0 milestone Nov 22, 2021
@dhensby dhensby closed this as completed Nov 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants