Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple Insert or Batch Processing #22

Closed
harikt opened this issue Aug 15, 2012 · 11 comments
Closed

Multiple Insert or Batch Processing #22

harikt opened this issue Aug 15, 2012 · 11 comments
Assignees

Comments

@harikt
Copy link
Member

harikt commented Aug 15, 2012

Hi Paul ,

I am in need to issue multi query . Is there a way we can make it ?

I was looking at http://stackoverflow.com/a/4559320/487878

Do you have any suggestions how we can make it ?

@harikt
Copy link
Member Author

harikt commented Aug 15, 2012

I did some experiments to make this happen. But the problem is we are using ValuesTrait which has values stored as a single dimension via col and cols method. So the array of values will never become multidimensional .
I will continue with more ways if I get some idea.
But this is a required one , else if I have 1000 values to insert I feel its a bad idea.

@ghost ghost assigned pmjones Aug 15, 2012
@harikt
Copy link
Member Author

harikt commented Aug 15, 2012

I was trying something like

public function __toString()
{
    if ( count($this->values) ==  count($this->values, COUNT_RECURSIVE) ) {
        $values = '(' . $this->indentCsv(array_values($this->values)) . ')';
        $columns = array_keys($this->values);
    } else {
        // for first one we need key
        $columns = array_keys($this->values[0]);
        $multi = [];
        foreach ($this->values as $value) {
            $multi[] = '(' . $this->indentCsv(array_values($value)) . ')';
        }
        $values = implode(',', $multi);
    }

    $string = 'INSERT INTO ' . $this->table . ' ('
         . $this->indentCsv($columns)
         . ') VALUES '
         . $values;
    return $string;
}

in Aura\Sql\Query\Insert and

public function multiCol(array $columns)
{
    $this->values = [];
    var_export($columns);
    foreach ($columns as $k => $cols) {
        $value = [];
        foreach ($cols as $y => $col) {
            if ($k == 0) {
                $c = $y;
            } else {
                $c = $k . 'c' .$y;
            }
            $key = $this->sql->quoteName($c);
            $value[$key] = ":{$c}";
        }
        $this->values[] = $value;
    }
}

in Aura\Sql\Query\ValuesTrait and Aura\Sql\Adapter\AbstractAdapter

public function insert($table, array $cols)
{
    $insert = $this->newInsert();
    $insert->into($table);
    if ( count($cols) ==  count($cols, COUNT_RECURSIVE) ) {
        $insert->cols(array_keys($cols));
    } else {
        $insert->multiCol($cols);
    }
    $stmt = $this->query($insert, $cols);
    return $stmt->rowCount();
}

Now the problem is in prepare statement and also 4 tests fails .

public function prepare($text, array $data)
{
    // need the PDO object regardless
    $pdo = $this->getPdo();

    // was data passed for binding?
    if (! $data) {
        return $pdo->prepare($text);
    }

    // a list of placeholders to bind at the end
    $bind = array();

    // find all text parts not inside quotes or backslashed-quotes
    $apos = "'";
    $quot = '"';
    $parts = preg_split(
        "/(($apos+|$quot+|\\$apos+|\\$quot+).*?)\\2/m",
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    // loop through the non-quoted parts (0, 3, 6, 9, etc.)
    $k = count($parts);
    for ($i = 0; $i <= $k; $i += 3) {

        // get the part as a reference so it can be modified in place
        $part =& $parts[$i];

        // find all :placeholder matches in the part
        preg_match_all(
            "/\W:([a-zA-Z_][a-zA-Z0-9_]*)/m",
            $part . PHP_EOL,
            $matches
        );

        // for each of the :placeholder matches ...
        foreach ($matches[1] as $key) {
            // is the corresponding data element an array?
            if (isset($data[$key]) && is_array($data[$key])) {
                // quote and replace it directly, because PDO won't bind
                // an array.
                $find = "/(\W)(:$key)(\W)/m";
                $repl = '${1}' . $this->quote($data[$key]) . '${3}';
                $part = preg_replace($find, $repl, $part);
            } else {
                // not an array, retain the placeholder name for later
                $bind[] = $key;
            }
        }
    }

    // bring the parts back together in case they were modified
    $text = implode('', $parts);

    // prepare the statement
    $stmt = $pdo->prepare($text);
    if (count($data) == count($data, COUNT_RECURSIVE)) {

        // for the placeholders we found, bind the corresponding data values
        foreach ($bind as $key) {
            $stmt->bindValue($key, $data[$key]);
        }
    } else {
        var_export($data);
        foreach ($data as $k => $cols) {
            foreach ($cols as $y => $col) {
                if ($k == 0) {
                    $key = $y;
                } else {
                    $key = $k . 'c' .$y;
                }
                echo ' Key ' . $key . $data[$k][$y] . PHP_EOL;
                $stmt->bindValue($key, $data[$k][$y]);
            }
        }
    }

    // done!
    return $stmt;
}

Its not finished, nor in a working condition .

The test failure is due to the check for multi query in prepare which happens to be a multi dimensional array for the query IN ( ) and where condition.

It will be also good if we can just execute the query when we say execute than executing by default .

@pmjones
Copy link
Member

pmjones commented Aug 15, 2012

Executing multiple inserts in a single statement is probably the fastest. However, something that is also very fast is executing multiple inserts inside a transaction. I've done that before and the speed difference from doing it outside a transaction is dramatic.

@harikt
Copy link
Member Author

harikt commented Aug 16, 2012

@pmjones What about MyISAM ?

@pmjones
Copy link
Member

pmjones commented Aug 16, 2012

"Don't" ? ;-)

@pmjones
Copy link
Member

pmjones commented Aug 16, 2012

On a more serious note, it sounds like a multiple insert is different-enough from a single insert that it might do better as a class of its own. I have to wonder how well prepared statements will work with multiple placeholders, though. Would need some way of numbering them properly.

@harikt
Copy link
Member Author

harikt commented Aug 16, 2012

@pmjones So may be I misunderstood. So a simple question :) .

1 ) According to my understanding transactions doesn't commit on the first time and for first sql. It only does in the final stage and when there is an issue it can be reverted back by rollBack .

So the basic question is in MyISAM even the transaction begins its inserting, and it will not wait for the last commit ( yes / no ).
If yes its same as single insert ? No ?

According to a comment http://www.php.net/manual/en/pdo.begintransaction.php#82033 . But anyway I am not getting error even if its MyISAM as in the comment.

$sql->beginTransaction();
try {
    $cols = ['count' => 215, 'order' => 341];
    $sql->insert('count', $cols);
    $cols = ['count' => 251, 'order' => 314];
    $sql->insert('count', $cols);
    $cols = ['author_id' => 42, 'title' => 'Hello transaction title', 'body' => 'Hello transaction Body'];
    $sql->insert('posts', $cols);
    $cols = ['count' => 2531, 'order' => 31334];
    $sql->insert('count', $cols);
    $sql->commit();
} catch ( Exception $e ) {
    var_export($e);
    $sql->rollBack();
}

@harikt
Copy link
Member Author

harikt commented Aug 16, 2012

Also according to my knowledge in a transaction if we never issued a commit() , nothing is inserted. Is it wrong in my understanding?

@harikt
Copy link
Member Author

harikt commented Aug 16, 2012

@pmjones about numbering my idea was to for eg :

$cols = [
    ['author_id' => 2, 'title' => 'Hellotitle', 'body' => 'HelloBody'],
    [3, 'Hello3', 'HelloBody3'],
    [43, 'Hellotitle43', 'HelloBody43'],
    [34, 'Hellotitle34', 'HelloBody34'],
    [35, 'Hellotitle35', 'HelloBody35'],
    [36, 'Hellotitle36', 'HelloBody36']
];

For the $cols[0] we keep the tables column name. For the other we can or we don't need to keep .
on the query level using

0identifier0, 0identifier1, 0identifier2 or :author_id, :title, :body, // for first

1identifier0, 1identifier1, 1identifier2,

2identifier0, 2identifier1,2identifier2

Not sure whether this is a good idea though.

@pmjones
Copy link
Member

pmjones commented Jan 17, 2013

Is this still something you think is necessary functionality? It seems relatively complex as it is. I'd like to close this ticket in order to clean up the issues listing, but if you feel it's really needed, I'd be happy to see a pull request in place of the ticket.

@harikt
Copy link
Member Author

harikt commented Jan 17, 2013

@pmjones I love to give a PR. But as you mentioned it is hard . But I will think about it and will come up when I have something. You can close it for the time if needed.

I will add in my todo list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants