Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests require a running MySQL instance #36

Closed
monitorjbl opened this issue Aug 13, 2014 · 19 comments
Closed

Tests require a running MySQL instance #36

monitorjbl opened this issue Aug 13, 2014 · 19 comments

Comments

@monitorjbl
Copy link
Contributor

I'm not sure if this was intended or not, but the tests require you to have a separate MySQL instance. Mamute is written with Hibernate, so it should (hopefully) be able to run the same tests using an embedded DB like H2 without too much effort.

@leocwolter
Copy link
Contributor

Yes, it is intended. We want to keep the test environment as close as possible to the production environment, that's why we use MySQL on them.

@monitorjbl
Copy link
Contributor Author

Yeah I can see why that would be useful for development, but it's hiding some issues. I did a quick check to see how well the tests would fare on a different database. For the most part, the unit tests passed, but there are several places in the the DAO layer that have some implicit dependencies on MySQL's quirks. For example, the ReputationEventDAO.karmaByContextQuery() method is dependent on the fact that MySQL lets you get away with not including all of the SELECT columns in the GROUP BY clause. As far as I know, MySQL is the only DB out there that lets you get away with it; it's contrary to the SQL99 spec.

I don't know what your plans are to use other databases in the future, but I was hoping that might be in the cards. We're kinda forced to use Oracle where I work (grumble), and we'd like to use Mamute here.

@leocwolter
Copy link
Contributor

You are right, mamute will only work at MySQL for now. In that case I believe that the most adequate ticket to open would be "Make mamute work with other databases".

Even if it works with other dbs like oracle, I don't believe using a H2, HSQLDB or something like that to run the DAO tests would be enough to assure that it is working on oracle and mysql, right? We would have to be testing it in all dbs we choose to support.

@monitorjbl
Copy link
Contributor Author

Haha yeah, I admit that was a fairly roundabout way to get there. I was evaluating the code and I was the first thing that I noticed. I can open up another ticket under that name.

You're also right that you would need to test on Oracle, which may be beyond your means at the moment (Oracle ain't cheap). However testing on an additional embedded database will at least suss out some of the problems that you'll have in the future. It wouldn't be too hard to have two hibernate configs so you can run all the DAO tests on both.

Would that be a good route for you guys?

@csokol
Copy link
Contributor

csokol commented Aug 13, 2014

I think that adding a configuration to run the tests against another db
will improve or test battery (we would get to know that SQL issue later).
Maybe from a jvm property, something like:

mvn test -Dmamute.database= -> use an hibernate..cfg.xml
to run the tests

What do you think?

Besides that, I think we should open another issue to solve that SQL
incompatibility.

Chico Sokol

On Wed, Aug 13, 2014 at 6:31 PM, monitorjbl notifications@github.com
wrote:

Haha yeah, I admit that was a fairly roundabout way to get there. I was
evaluating the code and I was the first thing that I noticed. I can open up
another ticket under that name.

You're also right that you would need to test on Oracle, which may be
beyond your means at the moment (Oracle ain't cheap). However testing on an
additional embedded database will at least suss out some of the problems
that you'll have in the future. It wouldn't be too hard to have two
hibernate configs so you can run all the DAO tests on both.

Would that be a good route for you guys?


Reply to this email directly or view it on GitHub
#36 (comment).

@monitorjbl
Copy link
Contributor Author

That's exactly what I had in mind :). I opened issue #37 for the general support of other databases, but if you'd like I could take a stab at implementing that test as part of this issue. Since it'd be a Maven property, having the embedded config wouldn't event affect your current builds.

monitorjbl pushed a commit to monitorjbl/mamute that referenced this issue Aug 13, 2014
monitorjbl pushed a commit to monitorjbl/mamute that referenced this issue Aug 14, 2014
monitorjbl added a commit to monitorjbl/mamute that referenced this issue Aug 14, 2014
monitorjbl added a commit to monitorjbl/mamute that referenced this issue Aug 18, 2014
leocwolter added a commit that referenced this issue Aug 19, 2014
adding system property for databases as part of #36
@leocwolter
Copy link
Contributor

This is solved, right?

@monitorjbl
Copy link
Contributor Author

Not entirely. There are three or four tests that require MySQL still. I know at least one depends on a MySQL quirk in dealing with aggregation.

monitorjbl added a commit to monitorjbl/mamute that referenced this issue Sep 4, 2014
@monitorjbl
Copy link
Contributor Author

I've updated code to fix most of the issues that require MySQL. There is, unfortunately, still that ReputationEventDAO.karmaByContextQuery() method that requires a specific quirk of MySQL to function in the same way.

In this particular query, a child object is being aggregated and the root object is not being returned at all. In Hibernate, this is very problematic because the ID field of the root object (ReputationEvent) must be returned in the generated query. With MySQL, the fact that this field isn't in the GROUP BY clause causes no issues. In all other databases, an exception will be thrown.

I'm aware of three potential solutions here:

  1. Do aggregation in Java. If the query groups by the root ID, there will be duplicate rows and the DAO code can add the sums together.
  2. Do it in two sets of queries: one to get the aggregate data and the context ID value and another set to pull out each ReputationEventContext object.
  3. Try to rewrite this with JPQL criteria. I don't know for sure if this will work, but there is more control over what fields are aggregated.

My personal preference is to do 1), as it's been my experience that ORMs are just not very good at analytic functions in general. Any thoughts?

@monitorjbl
Copy link
Contributor Author

Oh, one other possibility. It might be feasible to do this in raw SQL. That way the aggregation could still be done DB-side, saving a lot potentially large amount of memory/CPU for Mamute. This would of course be entirely specific to each database and therefore need to be implemented multiple times, but it might not be a bad idea to have some explicit handlers for certain functions on different DBs.

@leocwolter
Copy link
Contributor

That's weird, hibernate will try to return the id of the root object don't matter if I want or not? Can you put this exception at gist for me please?

@csokol
Copy link
Contributor

csokol commented Sep 5, 2014

Strange, I don't see why hibernate would return the id of root object.

Maybe hibernate is fetching the parent object because we are selecting the whole child object, what if we fetch only its id? @monitorjbl, can you try the following:

select e.context.id, sum(e.karmaReward), e.date from ReputationEvent e
group by e.context, day(e.date) 
order by e.date desc

After that, we could get the objects in a second query...

@monitorjbl
Copy link
Contributor Author

Yeah, that works, it was the basis for option 2). The trouble is that is the scaling of that list. There'd be N number of IDs to query for.

@csokol
Copy link
Contributor

csokol commented Sep 5, 2014

Then you could do something like: from ReputationContext where id in (:ids)

But now I'm thinking that this won't work since ReputationCOntext is an interface, can you try that?

@monitorjbl
Copy link
Contributor Author

Nah, doesn't work. I think you'd need to configure the Hibernate inheritance scheme in the model somehow.

@monitorjbl
Copy link
Contributor Author

Ah, think I've found a solution. It's possible to pull out the child object class in the query, so you can do this:

select e.context.class, e.context.id, sum(e.karmaReward), day(e.date) 
    from ReputationEvent e 
where e.user=:user and e.date > :after 
group by e.context.class, e.context.id, day(e.date) 
order by day(e.date) desc

and then in the result it's possible to sort out the different tables to query using a MultiMap or a Map<String,List<Long>>.

@monitorjbl
Copy link
Contributor Author

One other issue has cropped up as a result of all this. It's somewhat minor, but it again goes back to MySQL just being weird. The original query groups by day(e.date) but returns and orders by e.date. This can't logically be done because the date isn't grouped by, but MySQL makes some (dangerous) assumptions and allows it anyway.

The current ReputationEventDAOTest#should_group_reputation_karma_reward_by_post test case assumes the dates are ordered down to the millisecond. Since this can't be done anywhere but MySQL (and even then, it probably shouldn't be), I updated the test to not test for that kind of ordering.

@monitorjbl
Copy link
Contributor Author

I believe this issue can be closed now, you can build Mamute using H2 instead of MySQL as of PR #59.

@leocwolter
Copy link
Contributor

Nice! Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants