Break out of In-Table Revisions? #10

Open
skgma opened this issue Dec 11, 2009 · 4 comments

Comments

skgma commented Dec 11, 2009

I like the idea of keeping the revisions inside a single table, but after so many revisions, things get slow. Is there a simple way to have it store the revisions in a separate model/table? I'm still not very good at figuring out where things happen in Ruby code. Would this be easy for me to do, or is it pretty thoroughly built into how AAR works?

Owner

rich commented Dec 11, 2009

The single-table approach is pretty ingrained into how AAR works.

I'm curious whether you can give some details on the performance issues you're having, though. Can you give me some examples of what is slow and what volume of data you're working with?

Author

skgma commented Dec 18, 2009

I'm sorry for the delayed response; I haven't signed in for a few days. Part of the issue is my design, which InnoDB's lack of FULLTEXT makes pretty painful. And Sphinx can be quirky, so I don't use it during my ETL.

http://pastie.org/private/dyq2votczhekmtiuyz28sa

Before you call me an idiot: part of why things are SO slow is my own fault. I was learning Rails when I began this thing (converting from a mini-Perl project I wrote before), so I've done all kinds of crazy STI tricks and polymorphic stuff. The slowdown isn't so much because of AAR as because the indexes grow very large.

The Accounts table's Data Length is 21.6MB and the Index size is 30.1MB.

My Connects table (has two polymorphic foreign keys, which are used by STI models to represent different types of connects between my tables) is huge:

Connect.count
=> 201002
ConnectRevision.count
=> 39378

and its design is bad. I'm breaking it out into separate tables and getting away from my polymorphic nonsense (except in one case). This will speed it up, and AAR is really fine. I just liked the idea of putting the revisions into another table to minimize the size of the indexes over time. But even then, it's not a big deal--I just need to trim old/useless records.
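For illustration only, trimming old revisions could look something like the plain-Ruby sketch below. It assumes a policy of keeping the N most recent revisions per record; the `Revision` struct, its fields, and `prunable_revisions` are hypothetical names and do not reflect AAR's actual schema or API.

```ruby
# Hypothetical stand-in for a revision row; not AAR's actual schema.
Revision = Struct.new(:id, :record_id, :created_at)

# Return the revisions that fall outside the `keep` most recent ones
# for each record. Nothing is deleted here; the caller decides what to do.
def prunable_revisions(revisions, keep)
  revisions
    .group_by(&:record_id)
    .flat_map do |_record_id, revs|
      # Newest first, then drop the ones we want to keep.
      revs.sort_by(&:created_at).reverse.drop(keep)
    end
end

revisions = [
  Revision.new(1, 10, Time.utc(2009, 1, 1)),
  Revision.new(2, 10, Time.utc(2009, 6, 1)),
  Revision.new(3, 10, Time.utc(2009, 12, 1)),
  Revision.new(4, 20, Time.utc(2009, 3, 1))
]

stale = prunable_revisions(revisions, 2)
puts stale.map(&:id).inspect
```

In a real application the selected ids would feed a batched delete against the table, which is the step that shrinks the indexes.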

One thing I did notice is something I'm going to open a ticket for, because I'd be shocked if you've read this far.

Owner

rich commented Dec 18, 2009

Those table sizes aren't particularly large and nothing else you've described sounds so bad. Can you give me some idea what queries are slow? Maybe the output of "show indexes from <table_name>" for accounts and connects?

Author

skgma commented Dec 18, 2009

Here it is:

http://pastie.org/private/pjab3aiwx7ivzidyypfdzw

Thank you for looking. What is slow is my ETL, where I do an UPDATE on each record. I don't want you to think AAR is especially slow; it's mostly my design and ActiveRecord, IMO. It's hard for me to tell just how slow it may or may not be with the ~30,000 extra revisions, but it doesn't seem to be as big a hog as I thought it was.
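One common way to lighten an ETL that issues an UPDATE per record is to skip rows whose incoming values match what is already stored. This is not something proposed in the thread, just a minimal plain-Ruby sketch of the idea; `changed_attributes` and the sample hashes are hypothetical, and whether it helps depends on how many records actually change per run.

```ruby
# Return only the attributes whose incoming value differs from the stored one.
# Both arguments are plain hashes; the names are illustrative only.
def changed_attributes(stored, incoming)
  incoming.reject { |key, value| stored[key] == value }
end

stored   = { name: "Acme", city: "Boston" }
incoming = { name: "Acme", city: "Chicago" }

diff = changed_attributes(stored, incoming)

if diff.empty?
  puts "unchanged, skipping UPDATE"
else
  puts "would update: #{diff.inspect}"
end
```

Skipping unchanged rows would also avoid writing a revision for a save that changed nothing, assuming the revisioning is triggered on save.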


2 participants