
Possible memory issue #944

Closed
MartinP7r opened this issue Mar 21, 2021 · 4 comments

@MartinP7r (Collaborator)

First of all thanks for this awesome library.
I have an issue

What did you do?

Trying to persist/update a sizable amount of records (~180,000).

What did you expect to happen?

Records being persisted/updated.

What happened instead?

Device crash at memory usage of 2+ GB

Environment

GRDB flavor(s):
GRDB.swift with FTS5, via CocoaPods:

pod 'GRDB.swift', :git => 'https://github.com/groue/GRDB.swift.git'
  
  post_install do |installer|
    installer.pods_project.targets.select { |target| target.name == "GRDB.swift" }.each do |target|
      target.build_configurations.each do |config|
        config.build_settings['OTHER_SWIFT_FLAGS'] = "$(inherited) -D SQLITE_ENABLE_FTS5"
      end
    end
  end

GRDB version: 5.2.0 (I've since updated to 5.6.0 and will re-run the Instruments measurements)
Installation method: pods, see above
Xcode version: 12.2
Swift version: 5.2
Platform(s) running GRDB: iOS
macOS version running Xcode: 10.15.7

Demo Project

I'm sorry I can't upload a demo project at the moment, but if you could spare the time to give me your two cents on the approach below, that would already be really helpful.

Basically all I'm doing is

func persist(items: [T]) {
    l.info("persisting items for Type \(T.self)")
    do {
        for item in items {
            // One transaction per item
            try db.dbQueue.write { db in
                try item.save(db)
            }
            db.dbQueue.releaseMemory()
        }
    } catch {
        l.error(error)
    }
}

for an array of ~180,000 items in one table.

The items are parsed beforehand from an XML file (~100 MB), which is already quite hefty, and they take up about 800 MB of memory.

I've been wondering if I should just persist the data in smaller chunks. Keeping all the parsed structs in memory is probably just too much, so splitting the XML data into smaller pieces and going from there is probably my only option?
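For illustration, here is a minimal sketch of the chunking idea (the `chunked` helper and the batch size of 5,000 are my own assumptions, not GRDB API):

```swift
import Foundation

// Hypothetical sketch: split the parsed records into fixed-size chunks and
// persist one chunk per transaction, so peak memory is bounded by the chunk
// size rather than by all ~180,000 records at once.
func chunked<T>(_ items: [T], size: Int) -> [[T]] {
    stride(from: 0, to: items.count, by: size).map { start in
        Array(items[start..<min(start + size, items.count)])
    }
}

// Usage sketch, assuming `dbQueue` and items conforming to PersistableRecord:
// for chunk in chunked(items, size: 5_000) {
//     try dbQueue.write { db in
//         for item in chunk { try item.save(db) }
//     }
// }
```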

@groue (Owner) commented Mar 21, 2021

Hello @MartinP7r,

In my experience, the following loop consumes a stable amount of memory:

let dbQueue = try DatabaseQueue(path: ...)

try dbQueue.write { db in
    try db.create(table: "player") { t in
        t.autoIncrementedPrimaryKey("id")
        t.column("name", .text).notNull()
        t.column("score", .integer).notNull()
    }
}

struct Player: Encodable, PersistableRecord {
    var id: Int64
    var name: String
    var score: Int
}

for i in 1..<180000 {
    let player = Player(id: Int64(i), name: "Player \(i)", score: i)
    try dbQueue.write { db in
        try player.save(db)
    }
}

So you'll have to look elsewhere for the memory issue, or provide a minimal reproducible example if you think the issue lies in GRDB.


Note: the loop above is very slow, because it forces SQLite to synchronize with the file system on each step of the loop.

To get a much faster loop, perform all updates in a single transaction:

try dbQueue.write { db in
    for i in 1..<180000 {
        let player = Player(id: Int64(i), name: "Player \(i)", score: i)
        try player.save(db)
    }
}

And for the best performance, see #926 (comment)

@groue added the "needs user input" label Mar 21, 2021
@MartinP7r (Collaborator, Author)

Thanks a lot for your input and suggestions! I will try to pinpoint the problem and get back to you if it does seem to be in GRDB. But I'd rather expect it to be somewhere else, so I'll close this and reopen as needed.

Thanks again!

@MartinP7r (Collaborator, Author)

Sorry for bringing this up again, but I might have found an issue after all.
Here is a simple example that uses GRDB's JSON column encoding for one property of the persisted type. It makes the memory footprint go past 2 GB before finishing the 180k iterations.

let dbQueue = try DatabaseQueue(path: ...)

try dbQueue.write { db in
    try db.create(table: "player") { t in
        t.autoIncrementedPrimaryKey("id")
        t.column("team", .text).notNull()
    }
}

struct Player: Encodable, PersistableRecord {
    var id: Int64
    var team: Team
}

struct Team: Encodable {
    var id: Int64
    var name: String
}

try dbQueue.write { db in
    for i in 1..<180_000 {
        let player = Player(id: Int64(i), team: Team(id: Int64(i), name: "Team \(i)"))
        try player.save(db)
    }
}
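For context: if I understand correctly, GRDB serializes a nested Encodable property to a JSON string for storage in the text column. The Foundation-only sketch below approximates what each row's `team` column holds (the exact encoder configuration GRDB uses is an assumption):

```swift
import Foundation

struct Team: Encodable {
    var id: Int64
    var name: String
}

// Rough approximation of the `team` column contents: the nested Encodable
// value serialized to a JSON string. This runs once here; in the reproducer
// above it happens 180,000 times, which is where the memory accumulated.
let data = try! JSONEncoder().encode(Team(id: 1, name: "Team 1"))
let json = String(data: data, encoding: .utf8)!
// json is e.g. {"id":1,"name":"Team 1"} (key order is not guaranteed)
```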

I think I found a possible cause (and maybe a solution) and opened PR #950, if that's OK.

@MartinP7r reopened this Apr 1, 2021
@groue (Owner) commented Apr 2, 2021

#950 has fixed the memory issue with JSON columns. This issue will be closed with the next GRDB version. Thank you @MartinP7r!

@groue added the "bug" label and removed the "needs user input" label Apr 2, 2021