events.* tables think they're stored after resetting the cache, but Query.get_stored() disagrees #832

Closed
jc-harrison opened this issue May 21, 2019 · 1 comment · Fixed by #840
Labels: bug (Something isn't working), FlowMachine (Issues related to FlowMachine)

Comments

@jc-harrison
Member

Describe the bug
After a call to flowmachine.core.cache.reset_cache(), Query.get_stored() returns no queries. However, the is_stored property of the events.{calls,mds,sms} table objects is still True. As a result, these tables still appear in the set returned by a query's _get_stored_dependencies() method, and calling query.store().result() raises an error:

psycopg2.errors.ForeignKeyViolation: insert or update on table "dependencies" violates foreign key constraint "cache_dependency_id"
DETAIL:  Key (depends_on)=(70a2d392f73aedcc251a6a504f17008a) is not present in table "cached".
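
The mechanism behind this error can be sketched without FlowDB. The following is a toy illustration using an in-memory SQLite database, not the real cache schema: only the names cached, dependencies and depends_on come from the error message, and the query_id columns and the ON DELETE CASCADE clause are assumptions made for the example.

```python
import sqlite3

# Toy stand-in for the cache schema. Only "cached", "dependencies" and
# "depends_on" come from the error message; everything else is assumed.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE cached (query_id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE dependencies ("
    " query_id TEXT,"
    " depends_on TEXT REFERENCES cached (query_id) ON DELETE CASCADE)"
)

events_calls_id = "70a2d392f73aedcc251a6a504f17008a"
daily_location_id = "ea5f29df58ec0411a640cec967840913"

# Before the reset: the events table is registered, so the dependency row is accepted.
conn.execute("INSERT INTO cached VALUES (?)", (events_calls_id,))
conn.execute(
    "INSERT INTO dependencies VALUES (?, ?)", (daily_location_id, events_calls_id)
)

# The reset empties "cached", including the row for the events table.
conn.execute("DELETE FROM cached")

# After the reset: the same insert now refers to a parent row that no longer exists.
try:
    conn.execute(
        "INSERT INTO dependencies VALUES (?, ?)", (daily_location_id, events_calls_id)
    )
    violation = None
except sqlite3.IntegrityError as exc:
    violation = str(exc)

print(violation)
```

The second insert fails for the same reason as the psycopg2 error above: the depends_on value points at a row that the reset removed from cached.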

To Reproduce

>>> import flowmachine
>>> from flowmachine.features import daily_location
>>> from flowmachine.core.cache import reset_cache
>>> flowmachine.connect()
FlowMachine version: 0+unknown
Flowdb running on: localhost:9000/flowdb (connecting user: flowmachine)
<flowmachine.core.connection.Connection object at 0x11b14b080>

>>> list(flowmachine.core.Query.get_stored())
[]

>>> dl_query = daily_location(date="2016-01-03", level="admin3", method="last")

>>> list(flowmachine.core.Query.get_stored())
[<Table: 'events.calls', query_id: '057addedac04dbeb1dcbbb6b524b43f0'>,
 <Table: 'events.calls', query_id: '70a2d392f73aedcc251a6a504f17008a'>,
 <Table: 'events.calls_20160101', query_id: '41702c7c062a29932c738de26117a12f'>,
 <Table: 'events.calls_20160102', query_id: 'ec4a35b5b695aa67ec3e074949074b7c'>,
 <Table: 'events.calls_20160103', query_id: '425848a6a1113dfd9015eca6fc30f16a'>,
 <Table: 'events.calls_20160104', query_id: 'fd85443cad5fd529536dd36a88ac6a55'>,
 <Table: 'events.calls_20160105', query_id: '50b67ea6b52b9d1e58d40c0ceb2b2d59'>,
 <Table: 'events.calls_20160106', query_id: '7b70d1ade9970da2c4929e643d3ee736'>,
 <Table: 'events.calls_20160107', query_id: 'c18720a0e2bb9db77393ad928071936a'>,
 <Table: 'events.calls_20160108', query_id: '1f449d564be3716cbffedfc04a1593ef'>,
 <Table: 'events.calls_20160109', query_id: 'b3b3188e554f0e0319035f920f9400e4'>,
 <Table: 'events.calls_20160110', query_id: '80747ef52ac47669d0b6ae7011379be8'>,
 <Table: 'events.sms', query_id: '7a7f27978925c385bc44a5ec5667d7b3'>,
 <Table: 'events.sms', query_id: '9de507e882f1fb6b0cfcedd324e27839'>,
 <Table: 'events.sms_20160101', query_id: '01136f2c505733415afa233f26092403'>,
 <Table: 'events.sms_20160102', query_id: '5a52bf5e64fa3ab2c6e876eea645fdf7'>,
 <Table: 'events.sms_20160103', query_id: 'b2691ce1659275ea127bc8ebdc8207f0'>,
 <Table: 'events.sms_20160104', query_id: '27604f440068e8734f92c051e21cc740'>,
 <Table: 'events.sms_20160105', query_id: '7701809cf8e83c82eef5deb0a94cf5f5'>,
 <Table: 'events.sms_20160106', query_id: '5f1741baac4544a30052ed8c8ca3ae66'>,
 <Table: 'events.sms_20160107', query_id: '7f42676c72c12ae61995be405b1d4bed'>,
 <Table: 'events.mds', query_id: '64ba935fee023d48dad2ec1d41ccc2e0'>,
 <Table: 'events.mds', query_id: 'a7a51b1e9c9cabb84a525ac6510ea612'>,
 <Table: 'events.mds_20160101', query_id: 'c4a046663c753982886e60e213e4e986'>,
 <Table: 'events.mds_20160102', query_id: '226c07e80786031459c72df4d067658c'>,
 <Table: 'events.mds_20160103', query_id: '5ba4dbdcd00c35e121a2e18bdc0a7230'>,
 <Table: 'events.mds_20160104', query_id: 'b9477c988b924edad8da7520b55b9522'>,
 <Table: 'events.mds_20160105', query_id: '4e316cccebfad0b2fc8e5fc1081352eb'>,
 <Table: 'events.mds_20160106', query_id: '3c987b555205ff2bf994722788baaddc'>,
 <Table: 'events.mds_20160107', query_id: '88d36ee986a16b14f5f9fd8c87554ece'>]

>>> dl_query._get_stored_dependencies()
{<Table: 'events.calls', query_id: '70a2d392f73aedcc251a6a504f17008a'>,
 <Table: 'events.mds', query_id: 'a7a51b1e9c9cabb84a525ac6510ea612'>,
 <Table: 'events.sms', query_id: '9de507e882f1fb6b0cfcedd324e27839'>}

>>> reset_cache(flowmachine.core.Query.connection, flowmachine.core.Query.redis)
>>> list(flowmachine.core.Query.get_stored())
[]
>>> dl_query._get_stored_dependencies()
{<Table: 'events.calls', query_id: '70a2d392f73aedcc251a6a504f17008a'>,
 <Table: 'events.mds', query_id: 'a7a51b1e9c9cabb84a525ac6510ea612'>,
 <Table: 'events.sms', query_id: '9de507e882f1fb6b0cfcedd324e27839'>}

So Query.get_stored() doesn't include the events tables any more, but dl_query._get_stored_dependencies() still does. If we now try to store the daily location query, we get an error:

>>> dl_query.store().result()
Traceback (most recent call last):
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.ForeignKeyViolation: insert or update on table "dependencies" violates foreign key constraint "cache_dependency_id"
DETAIL:  Key (depends_on)=(70a2d392f73aedcc251a6a504f17008a) is not present in table "cached".


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jamesharrison/.pyenv/versions/3.7.0/lib/python3.7/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/Users/jamesharrison/.pyenv/versions/3.7.0/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/Users/jamesharrison/.pyenv/versions/3.7.0/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/flowmachine/core/cache.py", line 113, in write_query_to_cache
    write_cache_metadata(connection, query, compute_time=plan_time)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/flowmachine/core/cache.py", line 193, in write_cache_metadata
    (query.md5, dep.md5),
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2166, in execute
    return connection.execute(statement, *multiparams, **params)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 982, in execute
    return self._execute_text(object_, multiparams, params)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1155, in _execute_text
    parameters,
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    e, statement, parameters, cursor, context
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 383, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 128, in reraise
    raise value.with_traceback(tb)
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/Users/jamesharrison/.local/share/virtualenvs/query_timing-QV37Obvu/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.ForeignKeyViolation) insert or update on table "dependencies" violates foreign key constraint "cache_dependency_id"
DETAIL:  Key (depends_on)=(70a2d392f73aedcc251a6a504f17008a) is not present in table "cached".

[SQL: INSERT INTO cache.dependencies values (%s, %s) ON CONFLICT DO NOTHING]
[parameters: ('ea5f29df58ec0411a640cec967840913', '70a2d392f73aedcc251a6a504f17008a')]
(Background on this error at: http://sqlalche.me/e/gkpj)
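
The mismatch shown in this session can be checked before calling store(). The helper below is a hypothetical diagnostic, not part of flowmachine; it assumes only that query and table objects expose an md5 attribute, as the traceback's (query.md5, dep.md5) suggests. SimpleNamespace objects stand in for the real Table objects so the sketch runs without a database.

```python
from types import SimpleNamespace


def missing_from_cache(dependencies, stored):
    """Return the md5s of dependencies that are absent from the stored queries."""
    stored_ids = {query.md5 for query in stored}
    return sorted(dep.md5 for dep in dependencies if dep.md5 not in stored_ids)


# Stand-ins for the three Table objects returned by _get_stored_dependencies().
dependencies = [
    SimpleNamespace(md5="70a2d392f73aedcc251a6a504f17008a"),  # events.calls
    SimpleNamespace(md5="a7a51b1e9c9cabb84a525ac6510ea612"),  # events.mds
    SimpleNamespace(md5="9de507e882f1fb6b0cfcedd324e27839"),  # events.sms
]

# Before the reset the tables are registered; afterwards get_stored() is empty.
before_reset = missing_from_cache(dependencies, stored=dependencies)
after_reset = missing_from_cache(dependencies, stored=[])

print(before_reset)
print(after_reset)
```

With a live connection, the equivalent call would presumably be missing_from_cache(dl_query._get_stored_dependencies(), flowmachine.core.Query.get_stored()); any id it returns is one that would trigger the foreign key violation.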

Expected behavior
Presumably we don't want to remove the events.{calls,mds,sms} tables when resetting the cache, so cache.cached should still know about these tables.

@jc-harrison added the bug (Something isn't working) and FlowMachine (Issues related to FlowMachine) labels on May 21, 2019
@greenape
Member

Mm. Ok, this arises because the is_stored property of all Table objects is True, but they only call self._db_store_cache_metadata() in their constructor. So here, because the Table objects already exist, they won't put themselves back in cache.
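
A toy model of that behaviour, for illustration only: ToyCache and ToyTable are hypothetical classes, not flowmachine code, and they borrow only the is_stored and _db_store_cache_metadata names from the description above.

```python
class ToyCache:
    """Stands in for cache.cached: a set of registered names."""

    def __init__(self):
        self.cached = set()

    def reset(self):
        # Equivalent in spirit to truncating the cache metadata table.
        self.cached.clear()


class ToyTable:
    """Registers itself with the cache once, at construction time."""

    def __init__(self, name, cache):
        self.name = name
        self.cache = cache
        self._db_store_cache_metadata()  # the only place registration happens

    def _db_store_cache_metadata(self):
        self.cache.cached.add(self.name)

    @property
    def is_stored(self):
        # Always True, regardless of what the cache currently contains.
        return True


cache = ToyCache()
calls = ToyTable("events.calls", cache)
registered_before = calls.name in cache.cached

cache.reset()  # the existing object is not told about the reset
registered_after = calls.name in cache.cached

print(registered_before, registered_after, calls.is_stored)
```

After the reset the existing object still reports is_stored as True while the cache no longer knows about it, which is the disagreement described in this issue.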

Fix is, as suggested, to remove

trans.execute("TRUNCATE cache.cached CASCADE")

And I guess DELETE (with CASCADE) each cached table's entry from cache.cached while iterating through.
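
A rough sketch of that idea, again against an in-memory SQLite toy schema rather than FlowDB. The kind column, the reset_toy_cache function and the ON DELETE CASCADE foreign keys are assumptions made for the example; this is not the actual change made in the fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
# "kind" is a hypothetical column marking which rows are plain Table objects.
conn.execute("CREATE TABLE cached (query_id TEXT PRIMARY KEY, kind TEXT)")
conn.execute(
    "CREATE TABLE dependencies ("
    " query_id TEXT REFERENCES cached (query_id) ON DELETE CASCADE,"
    " depends_on TEXT REFERENCES cached (query_id) ON DELETE CASCADE)"
)

events_calls_id = "70a2d392f73aedcc251a6a504f17008a"
daily_location_id = "ea5f29df58ec0411a640cec967840913"

conn.executemany(
    "INSERT INTO cached VALUES (?, ?)",
    [(events_calls_id, "Table"), (daily_location_id, "Query")],
)
conn.execute(
    "INSERT INTO dependencies VALUES (?, ?)", (daily_location_id, events_calls_id)
)


def reset_toy_cache(conn):
    """Delete cached queries one at a time, leaving Table entries registered."""
    rows = conn.execute(
        "SELECT query_id FROM cached WHERE kind != 'Table'"
    ).fetchall()
    for (query_id,) in rows:
        # A real reset would also drop the cached result table at this point.
        conn.execute("DELETE FROM cached WHERE query_id = ?", (query_id,))


reset_toy_cache(conn)
remaining = [row[0] for row in conn.execute("SELECT query_id FROM cached")]
dependency_rows = conn.execute("SELECT count(*) FROM dependencies").fetchone()[0]

# Storing the query again now succeeds, because its dependency is still registered.
conn.execute("INSERT INTO cached VALUES (?, ?)", (daily_location_id, "Query"))
conn.execute(
    "INSERT INTO dependencies VALUES (?, ?)", (daily_location_id, events_calls_id)
)
restored_rows = conn.execute("SELECT count(*) FROM dependencies").fetchone()[0]

print(remaining, dependency_rows, restored_rows)
```

Deleting row by row lets the foreign key cascade clean up the dependencies of each removed query, while the events table entries stay in cached for later queries to reference.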
