Fetching performance #12
I agree that this code isn't optimal. Some things are already cached: struct (type) details are fetched only once. But fetching data involves many operations (especially for big structs). The current version is written to be easy to maintain, not to be the fastest.

Before building godb I thought about using unsafe code and calculating memory offsets only once, but that's not the way I chose. It could be an optimization later. Before doing things like that, I prefer writing the fastest safe code I can. It's perhaps not the fastest, but it's fast. Really blazing fast compared to other tools I used in other ecosystems. So changing this is not my priority, and I won't do anything without taking the time to be sure it's correct, really faster, and easy to maintain (maintainability is a really important part of godb).

I made a test with SQLite (because it's easy to set up, except on Windows); the full code is below, along with a flamegraph. It's not a full analysis but a good start for me to show what takes time. I only made a CPU profile, but it's obvious from the graph that many memory operations are involved. The graph could look different on another OS, or with another database, but it's enough for me to see where the CPU is used. And to be honest it's not a surprise.

Perhaps for godb 2, written with Go 2 (joke inside). It's not my priority but I'll take time on this "some day". It's an interesting subject ;)

EDIT: the flamegraph is done with the latest version of pprof: https://github.com/google/pprof
```go
package main

import (
	"fmt"
	"log"
	"os"
	"runtime/pprof"

	"github.com/samonzeweb/godb"
	"github.com/samonzeweb/godb/adapters/sqlite"
)

type Record struct {
	ID     int    `db:"id,key,auto"`
	Dummy1 string `db:"dummy1"`
	Dummy2 string `db:"dummy2"`
	Dummy3 string `db:"dummy3"`
	Dummy4 string `db:"dummy4"`
	Dummy5 string `db:"dummy5"`
}

const dataSize int = 100000

func main() {
	db, err := godb.Open(sqlite.Adapter, ":memory:")
	panicIfErr(err)

	_, err = db.CurrentDB().Exec(`
		create table Record (
			id integer not null primary key autoincrement,
			dummy1 text not null,
			dummy2 text not null,
			dummy3 text not null,
			dummy4 text not null,
			dummy5 text not null)
	`)
	panicIfErr(err)

	massiveInsert(db)
	count, err := db.Select(&Record{}).Count()
	panicIfErr(err)
	fmt.Println("Inserted : ", count)

	readAll(db)
}

func massiveInsert(db *godb.DB) {
	db.Begin()
	bulkSize := 100
	records := make([]Record, 0, bulkSize)
	for i := 0; i < dataSize; i++ {
		record := Record{
			Dummy1: "dummy",
			Dummy2: "dummy",
			Dummy3: "dummy",
			Dummy4: "dummy",
			Dummy5: "dummy",
		}
		records = append(records, record)
		if len(records) >= bulkSize {
			err := db.BulkInsert(&records).Do()
			panicIfErr(err)
			records = records[:0]
		}
	}
	if len(records) > 0 {
		err := db.BulkInsert(&records).Do()
		panicIfErr(err)
	}
	db.Commit()
}

func readAll(db *godb.DB) {
	f, err := os.Create("cpu.prof")
	if err != nil {
		log.Fatal("could not create CPU profile: ", err)
	}
	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal("could not start CPU profile: ", err)
	}
	defer pprof.StopCPUProfile()

	all := make([]Record, 0, dataSize)
	err = db.Select(&all).Do()
	panicIfErr(err)
	fmt.Println("Read : ", len(all))
}

func panicIfErr(err error) {
	if err != nil {
		panic(err)
	}
}
```
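(As a side note, assuming a recent pprof, the `cpu.prof` written by `readAll` can be explored interactively with `go tool pprof -http=:8080 cpu.prof`, or `pprof -http=:8080 cpu.prof` with the standalone tool linked above; the flamegraph is one of the views in that web UI, and the port is just an example.)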
Thanks for the detailed answer. I agree that it is fast enough (for me, especially after using Python's SQLAlchemy for years), Go is the way to go. :-) I had a chance to check the code of other Go ORM/ODM/SQL builders while developing the gobenchorm benchmarks, and I have to say that I've done tests with adding this code before exit:

```go
runtime.GC()
memProfile, err := os.Create("/tmp/godb.mprof")
if err != nil {
	log.Fatal(err)
}
defer memProfile.Close()
if err := pprof.WriteHeapProfile(memProfile); err != nil {
	log.Fatal(err)
}
```

And reported with:

```
go tool pprof --alloc_space mem /tmp/godb.mprof
```

It shows that we leave a lot of work for the GC to do. That is why CPU usage is high. In my first message I gave an example (by editing the issue later, maybe you didn't see it) of pg; please check the link, I think it is because of this. I'm looking forward to your patches on this subject; if you need help please don't hesitate to ask. Not to scare people with this issue, I'm closing it.
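To put numbers on that GC pressure, one option is a standard Go benchmark with allocation reporting. Below is a minimal sketch assuming the same schema and godb calls as the example above (file and identifier names are just placeholders); saved as a `_test.go` file it runs with `go test -bench=FetchAll -benchmem`.

```go
package main

import (
	"testing"

	"github.com/samonzeweb/godb"
	"github.com/samonzeweb/godb/adapters/sqlite"
)

type Record struct {
	ID     int    `db:"id,key,auto"`
	Dummy1 string `db:"dummy1"`
}

func BenchmarkFetchAll(b *testing.B) {
	db, err := godb.Open(sqlite.Adapter, ":memory:")
	if err != nil {
		b.Fatal(err)
	}
	_, err = db.CurrentDB().Exec(`create table Record (
		id integer not null primary key autoincrement,
		dummy1 text not null)`)
	if err != nil {
		b.Fatal(err)
	}

	// Seed a fixed number of rows once, outside the timed loop.
	records := make([]Record, 500)
	for i := range records {
		records[i] = Record{Dummy1: "dummy"}
	}
	if err := db.BulkInsert(&records).Do(); err != nil {
		b.Fatal(err)
	}

	b.ReportAllocs() // adds allocs/op and B/op to the benchmark output
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		var all []Record
		if err := db.Select(&all).Do(); err != nil {
			b.Fatal(err)
		}
	}
}
```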
There is an obvious way to improve performance. It's so obvious that I'm ashamed of not having done it from the start. See this commit: b0df693

I'll take a little more time to analyze performance. For now it's on a separate branch; I'll see if there are other simple things to fix after doing an escape analysis. If I have no time or don't see anything simple, I'll merge it almost as-is (just fixing some comments). I made a bench targeting:

Before:

After:
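(Side note: for comparing before/after runs like this, benchstat from golang.org/x/perf/cmd/benchstat is handy, e.g. `go test -bench=. -count=10 > old.txt` on master, the same on the branch into `new.txt`, then `benchstat old.txt new.txt`; the file names here are just examples.)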
Wow! Thanks, that is nice!
I didn't see anything else as obvious (or simple to do in a short time). I did some cleaning, rebased, and merged it to master: 18ac3a5 (the commit in my previous message could become unavailable, as I'll remove the corresponding branch).
Hi Samuel,

I did some benchmarking & profiling. Fetching spends time and memory on calling the `(smd *structMappingDetails) traverseTree` method. It is critical especially while scanning a high number of rows from results (for reporting). Full results of the benchmarking are in this repo. As every fetch of rows (single or multiple, and also after inserts and updates) makes godb call `traverseTree`, that puts godb into the middle of the report. I think it will get better if a caching mechanism is added instead of running reflection each time. Caching should be implemented specific to the db session instance. For example, the pg ORM makes use of this kind of caching.

I've checked whether some kind of caching for the column mapping is possible, but unfortunately I couldn't find a place in the code to implement it. Can you also check this?
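To sketch what I mean (just an illustration, not godb's actual internals; `structMapping`, `buildMapping`, and the package-level cache are made-up names, and whether the cache lives per DB session or globally is an open design choice), a cache keyed by `reflect.Type` lets the reflection walk run once per struct type instead of once per row:

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
)

// structMapping holds the reflection results we want to compute only once
// per struct type.
type structMapping struct {
	tags    []string // raw `db` tag values (column name plus options)
	indexes [][]int  // field index paths usable with reflect.Value.FieldByIndex
}

var mappingCache sync.Map // map[reflect.Type]*structMapping

// mappingFor returns the cached mapping for t, building it on first use.
func mappingFor(t reflect.Type) *structMapping {
	if m, ok := mappingCache.Load(t); ok {
		return m.(*structMapping)
	}
	m := buildMapping(t) // the expensive reflection walk, done once per type
	actual, _ := mappingCache.LoadOrStore(t, m)
	return actual.(*structMapping)
}

// buildMapping walks the struct fields and records every `db`-tagged field.
func buildMapping(t reflect.Type) *structMapping {
	m := &structMapping{}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if tag, ok := f.Tag.Lookup("db"); ok {
			m.tags = append(m.tags, tag)
			m.indexes = append(m.indexes, f.Index)
		}
	}
	return m
}

func main() {
	type Record struct {
		ID     int    `db:"id,key,auto"`
		Dummy1 string `db:"dummy1"`
	}
	m := mappingFor(reflect.TypeOf(Record{}))
	fmt.Println(m.tags) // [id,key,auto dummy1]
}
```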
Here is the profiling output for scanning rows of SQL with `LIMIT 10000`:

Sample SQL is: