[WIP] More robust Updates verification #2536
knewmanTE wants to merge 17 commits into TechEmpower:master from
Conversation
Initial Travis test seems to indicate that
Okay, so of the tests failing on Travis, I'm seeing the following:
One problem that does seem a bit hard to address is what we're seeing in JavaScript/express. Because most of the queries are only for 1 or 2 items, even a single instance of a row being set to its same randomNumber will cause a FAIL, because my verification test will see 100% of the items for that endpoint as not properly updated. I'm open to any feedback or thoughts on how to handle this. I could prevent FAILs for requests that return only 1 or 2 items and instead just throw a WARN. The downside is that the only test with any real weight in determining whether a framework passes or fails would be the 501-item one, but maybe that's a good thing?
I think that's fine, actually, to only do the check with the 5% margin of error on the 500-updates test. With the amount of testing we do with Travis, we probably don't want the occasional error on the smaller updates. Thinking ahead to continuous benchmarking, we don't want false failures there either. Ping @msmith-techempower
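The thresholds under discussion can be sketched as a small classifier. This is a hypothetical helper, not the toolset's actual code; the margins (roughly 1% for PASS, 5% for WARN) and the idea of capping 1- or 2-item requests at WARN are taken from the thread above:

```python
def classify_updates(total, updated, pass_margin=0.01, warn_margin=0.05):
    """Classify an Updates verification result.

    PASS if at least (1 - pass_margin) of the rows changed, WARN if at
    least (1 - warn_margin) changed, FAIL otherwise. Requests for only
    1 or 2 rows are capped at WARN, since a single row randomly keeping
    its old value would otherwise force a FAIL.
    """
    if total <= 2:
        return "PASS" if updated == total else "WARN"
    ratio = updated / float(total)
    if ratio >= 1.0 - pass_margin:
        return "PASS"
    if ratio >= 1.0 - warn_margin:
        return "WARN"
    return "FAIL"

print(classify_updates(500, 493))  # within the 5% margin -> WARN
print(classify_updates(2, 1))      # small request capped at WARN
```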
Correct me if I'm wrong, @knewmanTE, but it looks like you're verifying whether or not the updates were done by looking at the JSON response from the framework. Since you've already built the
… on 501 query test
Just pushed an update that does a few things differently:
Hmm, it looks like a lot of MongoDB tests are failing now, saying that the DB wasn't changed. I will investigate. Does anyone know of any issues or delays MongoDB might have with updating the DB immediately?
@knewmanTE I just checked the build log https://travis-ci.org/TechEmpower/FrameworkBenchmarks/jobs/197405309 and it looks like actframework's mongo test also failed. I will check on my local environment.
@knewmanTE I did some tests on my local environment and found the data has been changed. Here is what I've done:

So it proved that the MongoDB data has been updated. I think it might be related to the testing procedure.
@greenlaw110 yeah, my guess is something got a bit goofed up when I switched from comparing the response body to comparing the database itself. Hopefully I'll have a concrete answer and fix to share soon.
Okay, after a bit of testing, I'm noticing two potentially related, but maybe separate, issues:

First, the after-the-update database that I am loading isn't updating properly. It happens mostly on MongoDB tests, but I am registering no changes between the before and after databases. However, this is only happening on some tests. For example,

The second issue is that

As a sanity check against a race condition occurring between the frameworks setting values in the database and my script reading the values, I set a manual 20-second sleep before fetching the after-the-update database and am still seeing both issues. So that's where I'm at right now. I'll continue investigating to see if I can discern anything else.
Aha! I believe I've at least figured out the Dropwizard issue. For whatever reason, some of our MongoDB tests use lowercase table names (world, fortune) while others use uppercase table names (World, Fortune). My test is only checking against the lowercase world table, but my guess is that all of the breaking frameworks are ones that expect an uppercase table name. Switching my test to use World for dropwizard gives me the same result as the hapi test (only detecting 493/500 updates). Unfortunately, as far as I know, there isn't an easy way to determine whether a framework is using the world or World table outside of the framework's implementation itself.
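Since MongoDB collection names are case-sensitive, the verifier has to decide which table a framework actually populated. A minimal sketch, assuming the set of existing table/collection names is already available; the helper name is illustrative, not part of the toolset:

```python
def pick_world_table(available_names):
    """Pick the World table/collection name a framework populated.

    Some frameworks write to "World" and others to "world", and MongoDB
    treats those as distinct collections. Prefer the capitalized name if
    present, otherwise fall back to lowercase.
    """
    for name in ("World", "world"):
        if name in available_names:
            return name
    raise LookupError("no World table found")

print(pick_world_table({"world", "fortune"}))  # -> world
```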
Okay, I think I've gotten to the bottom of the other issue. I think it might just have to do with weak pseudo-random number generators. For example, in a recent test of dropwizard-mongodb, even though the /update?queries=501 response body contained 500 entries, that's because it returns a list of JSON objects, which doesn't account for duplicates. If you convert this list into a single JSON object, the duplicates generated by the framework merge together, and the resulting object only had 488 unique items.

If this is the case (and I believe it is), my other concern is how to handle WARN vs. PASS responses. If pretty much every test is going to fail to update 500 unique elements, no test will receive a PASS for Update, given that I am only issuing a PASS when there are at or very close to 500 successful updates (<1% margin of error). That being said, as long as the results are within the 5% margin of error (which they all seem to be), they will receive a WARN, which counts as an overall Travis pass. So we could handle this a few different ways:
@nbrady-techempower I'm curious to hear your thoughts on the matter. Also, if you have any ideas on a reasonably efficient way to test against both MongoDB tables that doesn't involve just fetching all 10,000 items from both the world and World tables, I'd love to hear them.
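As a back-of-the-envelope check on the duplicates explanation: even a perfectly uniform generator drawing 500 ids from 10,000 rows is expected to collide with itself a few times. The expected number of distinct ids is N(1 - (1 - 1/N)^k), which lands very close to the 488 unique items observed above:

```python
# Expected number of distinct row ids after k uniform random draws
# from a table of N rows (each draw independent, with replacement).
N, k = 10_000, 500
expected_unique = N * (1 - (1 - 1 / N) ** k)
print(round(expected_unique, 1))  # roughly 487.7 distinct ids per 500 draws
```

So an update count in the high 480s is what a correct implementation should produce, and the <1% PASS margin would flag nearly everything.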
@knewmanTE Good spot! I think it might be worth specifying the names of the tables/collections; say, we make all table/collection names lowercase, or vice versa.
@greenlaw110 I think we tried to remove one set of the tables with #2237, but ran into some sort of issue where a number of MongoDB ORMs were automatically changing a table name to lower or uppercase.
Opened #2545 to hopefully address the MongoDB table issue.
@greenlaw110 I don't know if you were working on the dropwizard tests, but it also appears that it's using mongo's

Edit: Maybe I'm wrong. It looks like it's
@nbrady-techempower I checked the dropwizard MongoDB implementation code and confirmed it is using mongo's

The action handler code:

So it will use one trip to complete both the find and the modify operation. This technique is also used in Act's code:

However, I am not the author of the dropwizard test set. You'll need to contact them.
@greenlaw110 Thanks for looking into that!
@nbrady-techempower I think the best way to enforce two-trip updates is to have the server send a response as:

It is not possible to get the
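For illustration, the two-trip access pattern being asked for (a read of the current row, then a separate write of the new value, instead of a single findAndModify round trip) looks roughly like this. A plain dict stands in for the World table here, so this is a sketch of the pattern, not a driver-specific implementation:

```python
import random

def update_worlds_two_trips(db, query_count):
    """Run the Updates test with two trips per row: read the current
    randomNumber, then issue a separate write of the new one. `db` is a
    dict of {id: randomNumber} standing in for the World table."""
    results = []
    for _ in range(query_count):
        wid = random.randint(1, 10_000)
        row = {"id": wid, "randomNumber": db[wid]}   # trip 1: find
        row["randomNumber"] = random.randint(1, 10_000)
        db[wid] = row["randomNumber"]                # trip 2: update
        results.append(row)
    return results

world = {i: random.randint(1, 10_000) for i in range(1, 10_001)}
updated = update_worlds_two_trips(world, 20)
print(len(updated))
```

A findAndModify-style implementation collapses both trips into one server round trip, which is what makes it detectable only by inspecting the framework's source.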
@knewmanTE I've restarted the perl frameworks because it looked like their package manager resource was temporarily unavailable. Everything with those seems fine now.
Starting up a new run now that #2545 has been merged into master.
Alright, still getting a bunch of fails. I just pushed some commits that fix a few things:
Pushed a fix that should at least address a couple JavaScript bugs.
So, most of the fails we're seeing at the moment are with the Update test. The big challenge will be determining which of the following cases we're dealing with:
I might end up getting kind of busy over the next week or two, so any help checking on broken test implementations or looking through the verification test code is greatly appreciated!
@knewmanTE I have checked the only Java failed case:
@greenlaw110 thanks for the extra pair of eyes! I had started a discussion with the Permeagility maintainer about his framework, but it doesn't look like he's properly added OrientDB to our suite yet. Currently, it builds an OrientDB database on the same machine as the app server, rather than isolated on the DB machine, so I'm okay with that one failing, since its database implementation needs an overhaul in general. My Update verification currently only works for Postgres, MySQL, and MongoDB, but it shouldn't be too hard to add additional databases as they are added to the suite.
This has been open over a year... let's close the PR for now and if someone has time, they can pick it up later.
Hopefully one day we can get this working.
* remove unused test directory * move toolset print statements to functions * include changes from #2536 * During verify updates, check for `world` or `World` table updates (postgres) * make sure postgres is checking both world tables









Currently, our verification for the /updates endpoint is the same as for the /queries endpoint, so it checks for properly formatted JSON but doesn't actually confirm that the database itself is being updated.
This pull request, which currently works with MySQL, Postgres, and MongoDB, pulls a current copy of the relevant World table and compares the updated values with their previously stored values. It throws a FAIL if none of them have been updated, a PASS if all of them have been updated, and a WARN if more than 5% of them remain unchanged. The reason for this 5% is to allow for the 1 in 10,000 chance that an entry is updated to the same value it previously held.