Large Memory Leak #52

Closed
robholmes opened this Issue · 9 comments

5 participants

@robholmes

Hi,

I'm using the following piece of code to process a large amount of data. The function is designed to insert or update (upsert) data; however, when looping through thousands of records, memory usage grows by about 7MB every few seconds, non-stop!

public function iouProgramme($data, DateTime $dateTime, DocumentManager $documentManager) {
    /* @var $qb \Doctrine\MongoDB\Query\Builder */
    $qb = $documentManager->getRepository('StrawberrypearTvBundle:Programme')->createQueryBuilder();
    $qb->update()->upsert()
            ->field('channel')->equals($data['channel']->getId())
            ->field('start_time')->equals($dateTime->modify($data['start_time']->format('c')));

    if (isset($data['end_time']))       $qb->field('end_time')->set($data['end_time']);
    if (isset($data['duration']))       $qb->field('duration')->set($data['duration']);
    if (isset($data['title']))          $qb->field('title')->set($data['title']);
    if (isset($data['sub_title']))      $qb->field('sub_title')->set($data['sub_title']);
    if (isset($data['description']))    $qb->field('description')->set($data['description']);
    if (isset($data['category']))       $qb->field('category')->set($data['category']);
    if (isset($data['episode']))        $qb->field('episode')->set($data['episode']->toArray());
    if (isset($data['quality']))        $qb->field('quality')->set($data['quality']);
    if (isset($data['subtitles']))      $qb->field('subtitles')->set($data['subtitles']);

    /* @var $query \Doctrine\MongoDB\Query\Query */
    $query = $qb->getQuery();
    $query->execute();

    unset($query, $qb, $data);

    return true;
}

I can stop the memory leak by simply commenting out the $query->execute(); line, but obviously this kills the point of everything. Also worth mentioning that I've commented out every part of the query until I'm not actually finding or updating anything, and there is still a memory leak.

I believe the leak must be somewhere within the Doctrine/MongoDB code.

Can someone please help me shed some light on this?

Thanks,

Rob
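
One way to put numbers on growth like this is to sample memory_get_usage() at fixed intervals while the loop runs. The following is a minimal sketch, not code from this thread: profileMemory() is a hypothetical helper, and the $leaky closure is a stand-in workload that deliberately retains data so the growth is visible.

```php
<?php
// Hypothetical helper (not part of Doctrine): runs $work once per record and
// samples memory_get_usage() every $sampleEvery iterations.
function profileMemory(array $records, callable $work, $sampleEvery = 1000)
{
    $samples = array();
    $i = 0;
    foreach ($records as $record) {
        $work($record);
        if (++$i % $sampleEvery === 0) {
            // Collect reference cycles first so they are not mistaken for a leak.
            gc_collect_cycles();
            $samples[$i] = memory_get_usage();
        }
    }
    return $samples;
}

// Stand-in workload that keeps 1KB per record alive, mimicking a leak.
$retained = array();
$leaky = function ($record) use (&$retained) {
    $retained[] = str_repeat('x', 1024);
};

$samples = profileMemory(range(1, 5000), $leaky, 1000);
foreach ($samples as $iteration => $bytes) {
    printf("%5d iterations: %.2f MB\n", $iteration, $bytes / 1048576);
}
```

Swapping the stand-in closure for the real per-record call would show whether memory keeps climbing between samples or levels off.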

@henrikbjorn

Does the upsert also create documents that the manager knows about? If so, you can call $documentManager->clear() every 1000 iterations or so; see http://www.doctrine-project.org/blog/doctrine2-batch-processing.html

But again that is just a theory
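
The batching pattern suggested here could look roughly like the sketch below. FakeManager is a stand-in class written for this example so that it runs on its own; it only mirrors the persist(), flush() and clear() calls, and processInBatches() is a hypothetical helper name.

```php
<?php
// Stand-in for a document manager; only the call pattern matters here.
class FakeManager
{
    public $managed = array();
    public $clearCalls = 0;

    public function persist($doc)
    {
        $this->managed[] = $doc;
    }

    public function flush()
    {
        // A real manager would send pending writes to the database here.
    }

    public function clear()
    {
        // Drop all references to managed documents so they can be freed.
        $this->managed = array();
        $this->clearCalls++;
    }
}

// Hypothetical helper: flush and clear after every $batchSize records.
function processInBatches(array $records, $manager, $batchSize = 1000)
{
    $i = 0;
    foreach ($records as $record) {
        $manager->persist($record);
        if (++$i % $batchSize === 0) {
            $manager->flush();
            $manager->clear();
        }
    }
    // Handle the final, possibly partial, batch.
    $manager->flush();
    $manager->clear();
}

$manager = new FakeManager();
processInBatches(range(1, 2500), $manager, 1000);
```

This only helps when the manager is actually holding references to documents, which is the open question in this case.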

@kriswallsmith
Collaborator

Does it leak when you use the PECL classes without Doctrine?

@robholmes

@henrikbjorn Although the DocumentManager isn't aware of any objects, I'm calling flush() and clear() on the DocumentManager every 500 iterations, and the leak still occurs. Previously I was finding and hydrating the Programme document if it existed and then updating it, or inserting a new document if it didn't.

@kriswallsmith I'm not aware how to use the PECL classes at the moment, but I'm going off to look now (unless you could provide a quick example off the top of your head).

@jmikola
Owner

I don't think this relates to managed documents (no doubt a cause of some memory leaks), as the query builder isn't hydrating anything. I'd wager this might be related to query logging. Can you confirm if you're using that?
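
To illustrate why query logging can look like a leak in a long-running loop, here is a small self-contained sketch. CollectingLogger is a hypothetical class written for this example, not Doctrine's logger; it simply keeps every logged query in an array, so memory grows with each call.

```php
<?php
// Hypothetical collecting logger: retains every logged query in memory.
class CollectingLogger
{
    public $queries = array();

    public function log(array $query)
    {
        $this->queries[] = $query;
    }
}

$logger = new CollectingLogger();
$before = memory_get_usage();

for ($i = 0; $i < 10000; $i++) {
    // Made-up query payload; each entry stays referenced by the logger.
    $logger->log(array(
        'update' => true,
        'upsert' => true,
        'query'  => array('channel' => $i),
        'newObj' => array('title' => str_repeat('t', 200)),
    ));
}

$after = memory_get_usage();
printf("Logged %d queries, memory grew by %.2f MB\n",
    count($logger->queries), ($after - $before) / 1048576);
```

If a logger like this is attached for the whole run, its array is never released, no matter how often the document manager is cleared.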

@robholmes: You can replicate this with the PECL driver using the \MongoCollection::update() method. See these docs. If you poke around in the Doctrine wrapper classes, you'll see how to fetch the wrapped PECL object (e.g. Doctrine's Collection wrapper class has a getMongoCollection() method).

@robholmes

@jmikola You're right, it is logging every single query! I didn't actually know it did that. I will now stop var_dumping so much ;-)

Unfortunately, when I ran the command in the prod environment I still got exactly the same memory leak.

@robholmes

@kriswallsmith I've reworked the function I originally posted to use the PECL classes and there is no memory leak! Here's the code I'm using:

public function iouProgramme($data, DateTime $dateTime, $collection) {
    $criteria = array(
        'channel' => $data['channel']->getId(),
        'start_time' => new \MongoDate(strtotime($data['start_time']->format('c'))),
    );

    $update = array(
        'channel' => $data['channel']->getId(),
        'start_time' => new \MongoDate(strtotime($data['start_time']->format('c'))),
    );
    if (isset($data['end_time']))       $update['end_time'] = new \MongoDate(strtotime($data['end_time']->format('c')));
    if (isset($data['duration']))       $update['duration'] = $data['duration'];
    if (isset($data['title']))          $update['title'] = $data['title'];
    if (isset($data['sub_title']))      $update['sub_title'] = $data['sub_title'];
    if (isset($data['description']))    $update['description'] = $data['description'];
    if (isset($data['category']))       $update['category'] = $data['category'];
    if (isset($data['episode']))        $update['episode'] = $data['episode']->toArray();
    if (isset($data['quality']))        $update['quality'] = $data['quality'];
    if (isset($data['subtitles']))      $update['subtitles'] = $data['subtitles'];

    // Do an upsert by passing the option in the options array (third parameter)
    $collection->update($criteria, $update, array('upsert' => true));

    return true;
}

@jmikola
Owner

@robholmes: If you have time, would you mind refactoring once more to use the Query object, but not the QueryBuilder? If the leak persists even then, that might narrow it down. I'd definitely like to get to the bottom of this over the coming weeks as I get a chance (with some real debugging and code profiling).

@jmikola
Owner

I'm going to cross-reference this with doctrine/mongodb-odm#415, which is likely discussing another ODM memory leak. Reviewing the posts above, though, I don't think this is related to doctrine/mongodb.

@jmikola closed this