tests: Add test for open vs invalidation race #345
Conversation
CI is failing with an error which should be unrelated to my patch.
Currently loadBefore and prefetch spawn an async protocol.load_before task and, after waking up on its completion, populate the cache with the received data. But between the time the protocol.load_before handler runs and the time the protocol.load_before caller wakes up, there is a window in which the event loop might be running some other code, including the code that handles invalidateTransaction messages from the server.

This means that cache updates and cache invalidations can be processed on the client in a different order than the server sent them, and such a difference in order can lead to data corruption. For example, if the server sent

```
<- loadBefore oid serial=tid1 next_serial=None
<- invalidateTransaction tid2 oid
```

and the client processed it as

```
invalidateTransaction tid2 oid
cache.store(oid, tid1, next_serial=None)
```

then the end effect is that the invalidation for oid@tid2 is not applied to the cache.

The fix is simple: perform cache updates right after the loadBefore reply is received.

Fixes: zopefoundation#155

The fix is based on analysis and an initial patch by @jmuchemb: zopefoundation#155 (comment)

For tests, similarly to zopefoundation/ZODB#345, I wanted to include a general test for this issue into ZODB, so that all storages - not just ZEO - are exercised for this race scenario. However, the ZODB test infrastructure currently has no established general way to open several client storage connections to one storage server. So the test for this issue currently lives in the wendelin.core repository (and exercises both NEO and ZEO there): https://lab.nexedi.com/nexedi/wendelin.core/commit/c37a989d

I understand there is room for improvement.

For the reference, my original ZEO-specific data corruption reproducer is here: zopefoundation#155 (comment) https://lab.nexedi.com/kirr/wendelin.core/blob/ecd0e7f0/zloadrace5.py

/cc @d-maurer, @jamadden, @dataflake, @jimfulton
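The ordering bug above can be modeled in a few lines. This is a hypothetical illustration, not ZEO's real cache API: `Cache` and the message tuples are made up, but the effect is the same - when the invalidation is replayed before the cache store, it hits an empty cache and is lost.

```python
# Minimal model of the loadBefore vs invalidateTransaction reordering.
class Cache:
    def __init__(self):
        self.data = {}  # oid -> (serial, next_serial)

    def store(self, oid, serial, next_serial):
        self.data[oid] = (serial, next_serial)

    def invalidate(self, oid, tid):
        # mark the currently cached revision as superseded at tid
        if oid in self.data:
            serial, _ = self.data[oid]
            self.data[oid] = (serial, tid)

def replay(cache, messages):
    for msg in messages:
        if msg[0] == 'loadBefore':
            _, oid, serial, next_serial = msg
            cache.store(oid, serial, next_serial)
        else:  # invalidateTransaction
            _, tid, oid = msg
            cache.invalidate(oid, tid)

# Order in which the server sent the messages:
server_order = [('loadBefore', 'oid', 'tid1', None),
                ('invalidateTransaction', 'tid2', 'oid')]
# Reordered processing on the client - invalidation handled first:
client_order = [server_order[1], server_order[0]]

good, bad = Cache(), Cache()
replay(good, server_order)
replay(bad, client_order)
print(good.data)  # {'oid': ('tid1', 'tid2')} - invalidation applied
print(bad.data)   # {'oid': ('tid1', None)}  - invalidation lost
```

The `bad` cache ends up claiming that `tid1` is still the current revision of `oid`, exactly the corruption described in #155.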
Ouch, that would be the longest ZODB test. I run them for NEO on my laptop:

So ~15s.
We can reduce N somewhat, but the timings highlight that NEO/SQLite is 2x slower than ZEO here. And for ZEO with N=1000 it is only 2 seconds on my machine (ZEO 5.2.2-1-g3d90ed42 + zopefoundation/ZEO#167; NEO v1.12-77-gde0feb4e):

```console
# stable with 3 repeats
kirr@deco:~/src/wendelin/wendelin.core/lib/tests/testprog$ time WENDELIN_CORE_TEST_DB='<zeo>' ./zopenrace.py
OK

real	0m2,340s
user	0m2,561s
sys	0m0,505s

# stable with 3 repeats
kirr@deco:~/src/wendelin/wendelin.core/lib/tests/testprog$ time WENDELIN_CORE_TEST_DB='<neo>' ./zopenrace.py
Using temp directory /tmp/neo_a8sq1P
OK

real	0m4,435s
user	0m4,238s
sys	0m1,026s
```

Reducing N will reduce the probability to catch problems and so will make CI less useful. I would say that spending even 10 seconds to reliably catch data corruption bugs is worth it.

I will reduce N to 500. On my machine from 2016 (5 years ago) this will make the test run for ~1s with ZEO and ~2s with NEO. I believe this is reasonable.

Kirill
@jmuchemb reports that this test is noticeably slow on his laptop: zopefoundation#345 (comment)
Travis CI failures are, again, completely unrelated to my patch.
For those who wonder, the ZODB tests in NEO run by default in a single-process mode with a special scheduler. Compared to a normal NEO DB:
In normal mode, the ZODB tests of NEO take 100.97s vs 109.18s, so the added test is about twice as fast, which would be about the same as ZEO.
With which value can you almost always reproduce all the bugs that have been found? In fact, my previous comment was mainly informative, with little hope that something could be done. More generally, a "classic" test suite is not really the place for this kind of test. In NEO, we put such tests in a stress test suite that we choose to run for, for example, 1 hour. It would be much more work to do the same in ZODB. Depending on the value of a useful N, this PR can be fine for now.
@jmuchemb, in #345 (comment) I ran zopenrace.py, not the ZODB test itself (which is modelled after it). For the reference, here is how the ZEO test cluster is set up when
N=1000. Even 500 is somewhat a compromise, because I remember that sometimes some tests were failing around iter=700 or 800.
It is a reasonable approach when it takes a non-small time to reproduce problems. But if it is just several seconds to reproduce, the place for such tests is in the regular testsuite in my view. The synergy of those two approaches could be that such tests live in the regular testsuite and use a reasonably small N by default, so that the run takes seconds, but when the test suite is run with something like
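One way such an N override could be wired up is via an environment variable, so a stress run can raise the iteration count without touching the code. This is a sketch only; the variable name `ZODB_TEST_RACE_N` is made up for illustration:

```python
import os

def race_iterations(default=500):
    # Hypothetical knob: keep the default N small so a regular test run
    # stays fast, but let a stress run raise it via the environment.
    return int(os.environ.get('ZODB_TEST_RACE_N', default))

N = race_iterations()
```

A stress run would then be invoked as e.g. `ZODB_TEST_RACE_N=100000 zope-testrunner ...`.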
I'm fine with N=1000; I'm somewhat ok with N=500. In my view the test becomes less useful if N is reduced further.

Kirill
…neric _new_storage_client()

This allows ZODB tests to recognize ZEO as a client-server storage and so make the "load vs external invalidate" test added in zopefoundation/ZODB#345 reproduce the data corruption problem reported at zopefoundation#155. For the reference: that problem should be fixed by zopefoundation#169.

We drop the note

```
# It's hard to find the actual address.
# The rpc mgr addr attribute is a list.  Each element in the
# list is a socket domain (AF_INET, AF_UNIX, etc.) and an
# address.
```

because at the time it was added (81f586c) it came with

```
addr = self._storage._rpc_mgr.addr[0][1]
```

but nowadays, after 0386718, getting to the server address is just via ClientStorage._addr.
I have updated this pull request with another test that exercises the "load vs external invalidation" data corruption discovered in zopefoundation/ZEO#155. Quoting 9601756:

---- 8< ----

For ZEO this data corruption bug was reported at zopefoundation/ZEO#155. Without that fix the failure shows e.g. as follows when running the ZEO test suite.
Even if the added test is somewhat similar to check_race_loadopen_vs_local_invalidate, it is added anew without trying to unify code.

For the test to work, test infrastructure is amended with a ._new_storage_client() method that complements the ._storage attribute. For ZEO, ._new_storage_client() is added by zopefoundation/ZEO#170. Other client-server storages can follow to implement ._new_storage_client().

Contrary to the test for "load vs local invalidate", N is set to a lower value (100).

/cc @d-maurer, @jamadden, @jmuchemb

So now this pull request, together with zopefoundation/ZEO#170, adds tests for all data corruption bugs in the ZODB5 stack discovered so far.
Travis CI failures - all with an error installing sphinx - are, again, completely unrelated to my patch.
Earlier, we observed that race conditions are much harder to detect in Python 3 than in Python 2. The experience from zopefoundation/ZEO#168 (comment) explains this observation: obviously, race condition probability increases with the context switch frequency; by default Python 3 switches contexts every 5 ms, compared to every 100 interpreter instructions (about 0.01 ms on modern hardware) for Python 2; thus, under normal settings, Python 3 is almost sequential compared to Python 2. We may want to control the context switch frequency for race condition tests.
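The numbers above can be checked directly on Python 3. The per-instruction cost used below is the ballpark figure from this discussion, not a measurement:

```python
import sys

# CPython 3's default context switch interval is 5 ms:
print(sys.getswitchinterval())   # 0.005 on a default build

# Python 2 switched every 100 bytecode instructions; at roughly 1e-7 s
# per instruction (an assumed ballpark) that is ~1e-5 s per switch,
# i.e. Python 3's default switches about 500x less often:
print(0.005 / (100 * 1e-7))
```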
@d-maurer, thanks, this might be a good idea. I see the corresponding changes on https://github.com/zopefoundation/ZEO/pull/167/files/2c98d943d7e191abd64cdac53876267dffac9c74..4f59226553e651e5cc8597e734a084f63501c16e and I've tried to see whether it indeed makes the test fail with python3. Unfortunately I still cannot trigger the test to fail, even with something like

```diff
--- a/lib/tests/testprog/zopenrace.py
+++ b/lib/tests/testprog/zopenrace.py
@@ -106,6 +106,10 @@ def __init__(self, i):
 @func
 def main():
+    import sys
+    #sys.setcheckinterval(10)
+    sys.setswitchinterval(5*1e-5)
+
     tdb = getTestDB()
     tdb.setup()
     defer(tdb.teardown)
```

For the reference, "100 instructions" on my machine is something in between 1e-6 and 1e-5 seconds. I also see CPython itself has https://github.com/python/cpython/blob/68ba0c67cac10c2545ea3b62d8b161e5b3594edd/Lib/test/test_concurrent_futures.py#L687-L694 However, on python2, if I lower the "check interval" from the default 100 to 1 (again the lowest possible value), We might indeed move race tests to a layer (as in your changes), or mark them with something like
Kirill Smelkov wrote at 2021-4-19 11:37 -0700:
...
it is still not completely clear to me what to do in those settings on python3. And python2 "works" out of the box with defaults.
Not yet fully understood race conditions will likely never be
detected reliably by tests because the race happens only under quite
specific conditions. For example, the ZEO#167 race conditions tests
all succeed in my local environment while some fail (reliably) on Travis.
Even when the race condition is fully understood, special instrumentation
may be necessary to enforce that some things happen at precise moments.
IMO we should use heuristics and try to improve them when experience
gives some hints. In ZEO#167, I used the heuristic of emulating
Python 2's context switch frequency in Python 3, because experience
demonstrated that some race condition tests revealed problems in Python 2
but not in Python 3. Another heuristic could be to learn from
Python's own race condition tests. I do not know what is better.
@d-maurer, thanks. I have marked my race-reproducing tests as discussed. Would you please have a look? Thanks beforehand,
Add entry to changelog. Note: the fix now has a corresponding test, which should be coming in through zopefoundation#170 and zopefoundation/ZODB#345.
CI is ok, modulo already-seen unrelated "version conflict on install sphinx" kinds of problems.
""" | ||
@functools.wraps(f) | ||
def _(*argv, **kw): | ||
if six.PY3: |
Can you move to an upper level so we can have:

```python
previous = sys.getswitchinterval()
try:
    sys.setswitchinterval(5e-6)
    return f(*argv, **kw)
finally:
    sys.setswitchinterval(previous)
```

and

```python
previous = sys.getcheckinterval()
try:
    sys.setcheckinterval(100)
    return f(*argv, **kw)
finally:
    sys.setcheckinterval(previous)
```
It makes the code less clear to me. I prefer to keep my original version.
One could argue about the duplication of the try/return/finally lines, even if for me it's so small that I'd call it "overfactoring". But less clear? There's nothing more pythonic than the following pattern:

```
save state
try:
    mess up   # (and ideally, do it after the try - see PEP 419)
    ...
finally:
    restore
```
No need to "name" the restore part by moving it a function.
When we drop support for Python 2, that's anyway how it would be rewritten.
> One could argue about the duplication of the try/return/finally lines

Yes, that duplication of the f call sites makes it less clear in my view.
> When we drop support for Python 2, that's anyway how it would be rewritten.

So let's rewrite it that way if/when Python 2 support is dropped. And until it is dropped, let's keep the code clearly structured.
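As an aside, a context manager is one more way to write the save/mess-up/restore pattern once while also avoiding duplicated `f(*argv, **kw)` call sites. This is a sketch assuming Python 3 only (on Python 2 one would substitute `sys.getcheckinterval`/`sys.setcheckinterval`); the names are illustrative:

```python
import sys
from contextlib import contextmanager

@contextmanager
def switch_interval(interval):
    # save state, mess up, restore - written exactly once
    previous = sys.getswitchinterval()
    sys.setswitchinterval(interval)
    try:
        yield
    finally:
        sys.setswitchinterval(previous)

def call_with_fast_switching(f, *argv, **kw):
    # single call site for f, wrapped in the temporary interval change
    with switch_interval(5e-6):
        return f(*argv, **kw)
```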
```python
        break
    except:
        failed.set()
        raise
```
I prefer:

- move _modify/_verify out of modify/verify
- merge modify & verify into a single function (you could still have a nested function with code that is common to _modify/_verify)
- rename _modify/_verify to modify/verify
- make threading.Thread pass an extra argument that is either modify or verify
- for the commented print, you can use `__name__` of the passed function

On the other side, no need for threading.Thread to pass N.
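The suggested structure might look roughly like this. It is a sketch: the `modify`/`verify` bodies are placeholders for the real test workers, and `N` is passed explicitly here:

```python
import threading

N = 100  # iteration count

def modify(i):
    pass  # placeholder for: commit a change to the object under test

def verify(i):
    pass  # placeholder for: check that two loads of the object agree

def T(work, n):
    # 'work' is either modify or verify; work.__name__ serves for the
    # commented-out debug print
    for i in range(n):
        # print('%s: iteration %d' % (work.__name__, i))
        work(i)

threads = [threading.Thread(name=work.__name__, target=T, args=(work, N))
           for work in (modify, verify)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```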
ok, let's have it this way
> On the other side, no need for threading.Thread to pass N.
except this - I prefer to pass parameters explicitly instead of implicitly referring to shared globals.
Since you capitalized it, it's semantically not a variable but a constant.
It's just a major parameter.
```python
for x in range(nwork):
    t = threading.Thread(name='T%d' % x, target=T, args=(x, N))
    t.start()
    tg.append(t)
```
what about

```python
tg = [threading.Thread(name='T%d' % x, target=T, args=(x, N))
      for x in range(nwork)]
for t in tg:
    t.start()
```
it is a bit less clear to me, so I prefer to leave my version of this code.
Besides style (here, I won't insist), I wonder if it makes a difference to minimize the difference in start time between threads, and if it would have an impact on reproducing the failure. Or maybe it's completely negligible.
I've tried this change and ran the tests on unfixed ZEO. The failure is reproducible with both versions. The iteration number at which the failure is reproduced is scattered between 1-5-6-30-... in both versions. So I believe this change is completely orthogonal to the test semantics.
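If minimizing start-time skew ever did matter, a `threading.Barrier` (Python 3 only) would release all workers at nearly the same moment, independent of how and when the Thread objects were created and started. A sketch, not part of the patch:

```python
import threading

nwork = 8
start = threading.Barrier(nwork)
hits = []

def worker(i):
    start.wait()      # block until all nwork workers have arrived
    hits.append(i)    # list.append is atomic under the GIL

tg = [threading.Thread(name='T%d' % i, target=worker, args=(i,))
      for i in range(nwork)]
for t in tg:
    t.start()
for t in tg:
    t.join()
print(sorted(hits))   # [0, 1, 2, 3, 4, 5, 6, 7]
```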
Address review feedback provided by @jmuchemb:

1. move _modify/_verify out of modify/verify
2. merge modify & verify into a single function (you could still have a nested function with code that is common to _modify/_verify)
3. rename _modify/_verify to modify/verify
4. make threading.Thread pass an extra argument that is either modify or verify
5. for the commented print, use `__name__` of the passed function

zopefoundation#345 (comment)
Address review feedback: move nwork up so that there is no longer a misleading failure=[] coming before failure=[None]*nwork. Also add a space in `[None] * nwork` as suggested. zopefoundation#345 (comment) zopefoundation#345 (comment)
…between threads Keep imports sorted as suggested by @jmuchemb: zopefoundation#345 (comment)
#346 was merged. I synced the changes in this pull request with master to get to a green CI state. Let's see how it goes this time.
Currently loadBefore and prefetch spawn an async protocol.load_before task and, after waking up on its completion, populate the cache with the received data. But between the time the protocol.load_before handler runs and the time the protocol.load_before caller wakes up, there is a window in which the event loop might be running some other code, including the code that handles invalidateTransaction messages from the server.

This means that cache updates and cache invalidations can be processed on the client in a different order than the server sent them, and such a difference in order can lead to data corruption: if e.g. the server sent

```
<- loadBefore oid serial=tid1 next_serial=None
<- invalidateTransaction tid2 oid
```

and the client processed it as

```
invalidateTransaction tid2 oid
cache.store(oid, tid1, next_serial=None)
```

then the end effect is that the invalidation for oid@tid2 is not applied to the cache.

The fix is simple: perform cache updates right after the loadBefore reply is received.

Fixes: zopefoundation#155

The fix is based on analysis and an initial patch by @jmuchemb: zopefoundation#155 (comment)

A test corresponding to the fix is coming through zopefoundation#170 and zopefoundation/ZODB#345.

For the reference, my original ZEO-specific data corruption reproducer is here: zopefoundation#155 (comment) https://lab.nexedi.com/kirr/wendelin.core/blob/ecd0e7f0/zloadrace5.py

/cc @jamadden, @dataflake, @jimfulton
/reviewed-by @jmuchemb, @d-maurer
/reviewed-on zopefoundation#169
CI is green.
Add a test that exercises the open vs invalidation race condition that, if it happens, leads to data corruption. We are seeing such a race happening on the storage level in ZEO (zopefoundation/ZEO#166), and previously we've seen it also happen on the Connection level (zopefoundation#290). By adding this test to be exercised wrt all storages we make sure that all storages stay free from this race.

And it paid out. Besides catching the original problems from zopefoundation#290 and zopefoundation/ZEO#166, this test also discovered a concurrency bug in MVCCMappingStorage:

```
Failure in test check_race_open_vs_invalidate (ZODB.tests.testMVCCMappingStorage.MVCCMappingStorageTests)
Traceback (most recent call last):
  File "/usr/lib/python2.7/unittest/case.py", line 329, in run
    testMethod()
  File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/tests/BasicStorage.py", line 492, in check_race_open_vs_invalidate
    self.fail(failure[0])
  File "/usr/lib/python2.7/unittest/case.py", line 410, in fail
    raise self.failureException(msg)
AssertionError: T1: obj1.value (24) != obj2.value (23)
```

The problem with MVCCMappingStorage was that instance.poll_invalidations was correctly taking main_lock with the intention to make sure main data is not mutated during analysis, but instance.tpc_finish and instance.tpc_abort did _not_ take the main lock, which was leading to committed data propagating into the main storage in a non-atomic way. This bug was also observable if both obj1 and obj2 in the added test were always loaded from the storage (adding obj2._p_invalidate after obj1._p_invalidate).

-> Fix MVCCMappingStorage by correctly locking the main MVCCMappingStorage instance when processing transaction completion.

/cc @d-maurer, @jamadden, @jmuchemb
/reviewed-on zopefoundation#345
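The shape of that locking bug and its fix can be sketched generically. The `Store` class below is an illustration only, not MVCCMappingStorage; the method names are borrowed loosely from the commit message. The point is that the commit path must hold the same lock the readers take, so a multi-object commit is atomic from a reader's point of view:

```python
import threading

class Store:
    def __init__(self):
        self.main_lock = threading.Lock()
        self.data = {'obj1': 0, 'obj2': 0}

    def tpc_finish(self, tid):
        with self.main_lock:          # without this lock a reader could
            self.data['obj1'] = tid   # observe obj1 at tid while obj2 is
            self.data['obj2'] = tid   # still at the previous transaction

    def poll_snapshot(self):
        with self.main_lock:
            return dict(self.data)

store = Store()
done = []

def committer():
    for tid in range(1, 1001):
        store.tpc_finish(tid)
    done.append(True)

t = threading.Thread(target=committer)
t.start()
while not done:
    snap = store.poll_snapshot()
    assert snap['obj1'] == snap['obj2'], snap   # always a consistent view
t.join()
```

Dropping the `with self.main_lock:` in `tpc_finish` reintroduces the window in which a reader sees obj1 and obj2 from different transactions, which is exactly what the added test's `obj1.value != obj2.value` failure reports.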
For ZEO this data corruption bug was reported at zopefoundation/ZEO#155 and fixed at zopefoundation/ZEO#169. Without that fix the failure shows e.g. as follows when running the ZEO test suite:

```
Failure in test check_race_load_vs_external_invalidate (ZEO.tests.testZEO.BlobAdaptedFileStorageTests)
Traceback (most recent call last):
  File "/usr/lib/python2.7/unittest/case.py", line 329, in run
    testMethod()
  File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/tests/BasicStorage.py", line 621, in check_race_load_vs_external_invalidate
    self.fail([_ for _ in failure if _])
  File "/usr/lib/python2.7/unittest/case.py", line 410, in fail
    raise self.failureException(msg)
AssertionError: ['T1: obj1.value (7) != obj2.value (8)']
```

Even if the added test is somewhat similar to check_race_loadopen_vs_local_invalidate, it is added anew without trying to unify code. The reason is that the probability to catch the load vs external invalidation race is significantly reduced when there are only 1 modify and 1 verify workers, and unification while preserving both tests' semantics would make the test for "load vs local invalidate" harder to follow. Sometimes a little copying is better than trying to unify too much.

For the test to work, the test infrastructure is amended with a ._new_storage_client() method that complements the ._storage attribute: client-server storages like ZEO, NEO and RelStorage allow several storage clients to be connected to a single storage server. For client-server storages, test subclasses should implement _new_storage_client to return a new storage client connected to the same storage server self._storage is connected to.

For ZEO, ._new_storage_client() is added by zopefoundation/ZEO#170. Other client-server storages can follow to implement ._new_storage_client() and this way automatically activate this "load vs external invalidation" test when their testsuite is run.

Contrary to the test for "load vs local invalidate", N is set to a lower value (100), because with 8 workers the bug is usually reproduced at a not-so-high iteration number (5-10-20).

/cc @d-maurer, @jamadden, @jmuchemb
/reviewed-on zopefoundation#345
… threads As suggested by @d-maurer: zopefoundation#345 (comment) zopefoundation/ZEO#168 (comment) /reviewed-on zopefoundation#345
I believe it is time to merge this.
(patches rebased to master, with fixups squashed into corresponding commits)