redis cache on insert #373
Conversation
the env variable to set the TTL for the cache has to be documented. default is 60 sec.

# in a following query call (below ttl) the same cache can be called despite there is no data.
# a possible solution is to create a cache based on query parameters that would cache all the results
self.cursor.execute(stmt, [table_name])
res = self.cursor.fetchall()
we tested using the cache here, but with the current "delete" approach this causes issues. the scenario triggering the issue is: an entity is created, then delete is used to remove all its values. what happens is that the table is empty, but the metadata are still there, so caching the query with
res = self._execute_query_via_cache(table_name, "metadata", stmt, [table_name], self.default_ttl)
actually creates an entry in the cache under (table_name, "metadata"). in a following query call (below the TTL) the same cache entry can be returned despite there being no data. a possible solution is to create a cache based on query parameters that would cache all the results
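The fix proposed above — keying the cache on the query itself rather than on a shared (table, "metadata") tuple — could be sketched as follows. This is a hedged sketch, not the PR's actual code; `make_cache_key` and its signature are hypothetical names for illustration.

```python
import hashlib
import json


def make_cache_key(table_name, stmt, parameters=None):
    """Build a cache key from the full query, not just (table, "metadata").

    Keying on the statement and its parameters means an empty result set
    after a DELETE is cached for that exact query only, instead of
    poisoning a shared metadata entry for all callers.
    """
    payload = json.dumps({"stmt": stmt, "params": parameters},
                         sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return "{}:{}".format(table_name, digest)
```

Two calls with identical statement and parameters map to the same key, while any change to the parameters produces a different key, so stale entries cannot shadow unrelated queries.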
@@ -1060,6 +1105,50 @@ def drop_table(self, etype, fiware_service=None):
        except Exception as e:
            logging.error("{}".format(e))

    def query_entity_types(self, fiware_service=None, fiware_servicepath=None):
        """
baseline to be extended to provide /v2/types endpoint ;)
@@ -1105,3 +1197,68 @@ def _get_entity_type(self, entity_id, fiware_service):

    def _compute_type(self, attr_t, attr):
        raise NotImplementedError

    def _execute_query_via_cache(self, table_name, key, stmt, parameters=None,
                                 ex=None):
this function provides a way to check if a query is cached under a certain (key, table) tuple; if not, it executes the query.
        return res

    def _is_query_in_cache(self, table_name, key):
        if self.cache:
check if a (key, table) tuple is in the cache
        return False

    def _cache(self, table_name, key, value=None, ex=None):
        if self.cache:
cache a certain value under a (table, key) tuple for a given number of seconds
            logging.warning(e.__str__())

    def _remove_from_cache(self, table_name, key):
        if self.cache:
remove a (table, key) tuple from the cache
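Taken together, the four cache helpers discussed in this review could look roughly like the sketch below. This is an illustrative reconstruction under the assumption that `self.cache` is a redis-py-style client (`get`/`set`/`delete`) and `self.cursor` a DB-API cursor; it is not the PR's literal code.

```python
import logging
import pickle


class CachedTranslatorMixin:
    """Sketch of the cache helpers: look up a (table, key) tuple in the
    cache, fall back to the database on a miss, and store the result
    with a TTL of `ex` seconds."""

    def _execute_query_via_cache(self, table_name, key, stmt,
                                 parameters=None, ex=None):
        # Serve from cache when the (table, key) tuple is present.
        res = self._is_query_in_cache(table_name, key)
        if res is not False:
            return res
        # Cache miss: hit the database and cache the result.
        self.cursor.execute(stmt, parameters or [])
        res = self.cursor.fetchall()
        self._cache(table_name, key, res, ex)
        return res

    def _is_query_in_cache(self, table_name, key):
        if self.cache:
            value = self.cache.get(table_name + ":" + key)
            if value is not None:
                return pickle.loads(value)
        return False

    def _cache(self, table_name, key, value=None, ex=None):
        if self.cache:
            try:
                self.cache.set(table_name + ":" + key,
                               pickle.dumps(value), ex=ex)
            except Exception as e:
                logging.warning(str(e))

    def _remove_from_cache(self, table_name, key):
        if self.cache:
            self.cache.delete(table_name + ":" + key)
```

Note the delete-scenario caveat discussed above still applies: after a manual DELETE, `_remove_from_cache` must be called (or the TTL must expire) before the cached metadata entry reflects reality.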
src/translators/sql_translator.py (outdated)

class QueryCacheManager(Borg):
singleton so we don't recreate the connection to the cache at every request
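The Borg pattern referenced here shares one state dictionary across all instances, so every request sees the same cache connection without a classic singleton. A minimal sketch under that assumption (`_connect` is a hypothetical placeholder for the real Redis client factory):

```python
class Borg:
    """All instances share the same state dictionary (Alex Martelli's
    alternative to a classic singleton)."""
    _shared_state = {}

    def __init__(self):
        self.__dict__ = self._shared_state


class QueryCacheManager(Borg):
    """Create the cache connection once; reuse it for every request."""

    def __init__(self):
        super().__init__()
        if not hasattr(self, "cache"):
            # Hypothetical factory; the PR wires in a real Redis client.
            self.cache = self._connect()

    def _connect(self):
        return object()  # placeholder for e.g. redis.Redis(...)

    def get_query_cache(self):
        return self.cache
```

Constructing `QueryCacheManager()` anywhere in the request path then yields the same underlying connection object each time.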
src/translators/sql_translator.py (outdated)

        # Sort out data table.
-       self._prepare_data_table(table_name, table, fiware_service)
+       if modified:
+           self._prepare_data_table(table_name, table, fiware_service)
now _update_metadata_table returns 0 if there is no difference between the existing metadata and the new ones, and 1 if there is a difference; this allows avoiding useless calls to _prepare_data_table
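The diff check described here can be sketched as a small helper, assuming metadata is a dict mapping attribute names to types (the function name and shape are hypothetical, not the PR's code):

```python
def diff_metadata(persisted, new):
    """Return (modified, merged): modified is 1 when `new` adds or
    changes any attribute relative to `persisted`, 0 otherwise;
    merged is the union to persist back.

    A 0 result lets the caller skip _prepare_data_table entirely,
    which is the whole point of the change.
    """
    merged = dict(persisted)
    merged.update(new)
    modified = 0 if merged == persisted else 1
    return modified, merged
```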
force-pushed from 018a164 to 1da02f8
force-pushed from 8dbafe0 to f8da9a5
class ConnectionManager(Borg):
Basically this is a pool manager. Pools are not offered by the basic Crate and psql drivers. It would be good to explore a proper db pool.
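Since neither driver offers pooling natively, the Borg-based manager presumably just hands out one shared connection per backend, created lazily. A sketch under that assumption (the `connect` callable and method names are hypothetical):

```python
class ConnectionManager:
    """Not a real pool: all instances share one connection per backend
    (Borg-style shared state via a class-level dict), creating each
    connection lazily on first use."""

    # Shared across all instances; mutated, never rebound.
    _connections = {}

    def get_connection(self, backend, connect):
        if backend not in self._connections:
            self._connections[backend] = connect()
        return self._connections[backend]

    def reset_connection(self, backend):
        # Drop a broken connection so the next caller reconnects.
        self._connections.pop(backend, None)
```

A proper pool (with checkout/checkin and liveness checks) would replace this; as the comment says, that remains to be explored.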
@@ -48,6 +48,14 @@ services:
    volumes:
      - redisdata:/data

  redis-commander:
    image: rediscommander/redis-commander:latest
so we can monitor what is cached
@@ -66,7 +66,7 @@
# Logging config section.
#

-loglevel = 'debug'
+loglevel = 'error'
debug is too verbose
@@ -11,6 +11,7 @@ install:

before_script:
  - pipenv install
  - sudo service postgresql stop
so we don't need weird ports
force-pushed from 5b4a9fc to 84286aa
* implement basic redis cache for metadata (reducing queries to the database)
* provide a reusable connection to the database (neither the crate driver nor pg8000 seems to support pooling natively)
* improve logs and use an orion-like format
* add support for fiware-correlator (facilitates tracking data chains between agents, orion and quantumleap)
* add missing configuration variables to the doc
* fix postgres issue in travis tests
* minor doc updates
* split create table from update table logic
* use alter table for adding columns to crate
* change crate table column_policy from dynamic to strict
* support injection of NGSI-LD payloads in an NGSI-V2 backward-compatible way
* when a type is not an NGSI type, its type is derived from the value (supports #383)
force-pushed from 1d94ee1 to 15bdfa6
@@ -24,7 +25,8 @@ def geometry_value(self) -> Optional[str]:
        return self._location.get(_VALUE_ATTR_NAME, None)

    def is_geojson(self):
-        return self.geometry_type() == _GEOJSON_TYPE
+        return self.geometry_type() == _GEOJSON_TYPE or \
+            self.geometry_type() == _GEOJSON_LD_TYPE
to support proper management also of GeoProperty if an NGSI-LD-like payload is sent
@pytest.mark.parametrize("service", services)
def test_issue_382(service, notification):
    # entity with one Null value and no type
can we store correctly 'relationship' type?
@pytest.mark.parametrize("service", services)
def test_json_ld(service, notification):
    # entity with one Null value and no type
can we store correctly an NGSI-LD payload?
                lat, lon = e[attr]['value'].split(',')
                value = [float(lon), float(lat)]
            elif 'type' in e[attr] and e[attr]['type'] == 'Property' \
                    and 'value' in e[attr] \
manage DateTime in NGSI-LD
                    and isinstance(e[attr]['value'], dict) \
                    and '@type' in e[attr]['value'] \
                    and e[attr]['value']['@type'] == 'DateTime':
                mapped_value = e[attr]['value']['@value']
manage DateTime in NGSI-LD
            elif 'type' in e[attr] and e[attr][
                    'type'] == 'Relationship':
                mapped_value = e[attr].get('value', None) or \
                    e[attr].get('object', None)
manage Relationship in NGSI-LD
            elif 'type' in e[attr] and e[attr]['type'] == 'Relationship':
                value = e[attr].get('value', None) or \
                    e[attr].get('object', None)
manage Relationship in NGSI-LD
@pytest.mark.parametrize("service", services)
def test_json_ld(service, notification):
    # entity with one Null value and no type
    notification['data'][0] = {
@jason-fox @kzangeli is this ngsi-ld payload representative enough? is there any tricky attribute type we should take into account? is DateTime the only date format that is specified via @value and @type?
Attributes in NGSI-LD (at least right now) can have only three values in their type:
- Property
- Relationship
- GeoProperty
[ Two new attribute types are on their way: "LanguageProperty" and "VocabProperty", but those two are still under discussion in ETSI ISG CIM, so, never mind for now ]
A "Property" attribute has a "value" that can be of any JSON type.
A "Relationship" attribute has an "object" (instead of a "value") that is a string containing a valid URI.
A "GeoProperty" has a "value" that is GeoJSON, e.g.: { "type": "Point", "coordinates": [] }
There are a few special attributes, like observedAt, observationSpace, etc. For those, see the NGSI-LD spec.
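The three attribute types described above can be illustrated with a minimal example entity and the corresponding value extraction. This is a sketch for illustration only (the entity and `extract_value` are made up, not code from the PR):

```python
def extract_value(attr):
    """Pull the plain value out of an NGSI-LD attribute, per type."""
    if attr.get("type") == "Relationship":
        # Relationships carry an "object" (a URI string), not a "value".
        return attr.get("object")
    # Property and GeoProperty both carry a "value".
    return attr.get("value")


entity = {
    "id": "urn:ngsi-ld:Vehicle:A4567",
    "type": "Vehicle",
    "speed": {"type": "Property", "value": 80},
    "isParkedIn": {"type": "Relationship",
                   "object": "urn:ngsi-ld:OffStreetParking:Downtown1"},
    "location": {"type": "GeoProperty",
                 "value": {"type": "Point",
                           "coordinates": [-8.5, 41.2]}},
}
```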
thx @kzangeli
I think the current PR would generally cover correctly:
- Property
- Relationship
- GeoProperty
The issue is more for the case when the type of a Property is not a native JSON type. So far we implemented support for "DateTime" as per the examples, but I could not find other similar non-native types, so it is hard to think of a generalisation. This is important especially if we want to provide backward and forward compatibility (as we are trying to do), where we mean the following:
- backward compatibility: data notified as NGSI-LD should be retrievable as proper NGSI-v2
- forward compatibility: data notified as NGSI-v2 should be retrievable as NGSI-LD
thx for any pointer you can provide on these tricky "custom" data type cases.
There's actually a fourth type of attribute/property - TemporalProperty, but that is kind of hidden.
Users aren't allowed to create a property of the type "TemporalProperty", but there is one "timestamp" that can be set by users.
NGSI-LD admits three different Temporal Properties:
- modifiedAt
- createdAt
- observedAt
The first two are built-ins, set by the broker, and only observedAt is user-defined.
Its format is on a key-value basis:
"observedAt": "2020-11-15T10:00:00.000Z"
and the broker will convert the timestamp into a floating point value, internally.
This is to be able to easily compare the timestamp with other timestamps (I imagine QL does the same).
Note though that the timestamps are converted back to ISO 8601 form in queries and notifications.
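The internal round-trip described here (ISO 8601 timestamp to a float for cheap comparisons, and back for queries and notifications) can be sketched with the standard library; this is an illustration of the idea, not any broker's actual code:

```python
from datetime import datetime, timezone


def iso_to_epoch(iso_ts):
    """'2020-11-15T10:00:00.000Z' -> seconds since the epoch (float)."""
    # fromisoformat does not accept a trailing 'Z' before Python 3.11,
    # so normalise it to an explicit UTC offset first.
    dt = datetime.fromisoformat(iso_ts.replace("Z", "+00:00"))
    return dt.timestamp()


def epoch_to_iso(epoch):
    """Convert back to the ISO 8601 millisecond form used above."""
    dt = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
```

Floats compare and sort trivially, which is exactly why the broker (and presumably QL) stores timestamps this way internally.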
About compound property values with "@type" and "@value", Orion-LD doesn't modify the value but keeps it as is.
This is actually something I've been thinking of taking up with ETSI ISG CIM but I haven't had any time for that still.
For example:
{
  "T1": {
    "type": "Property",
    "value": {
      "@type": "DateTime",
      "@value": "2020-..."
    }
  }
}
T1 could be saved as a floating point for simpler and faster comparisons.
However, in the notifications that QL receives from the broker, T1 would always be converted back to its original form.
I believe this should be transparent to the subscriber (i.e. QL, Cygnus et al).
About compatibility questions, the easy answer is that:
- if NGSI-LD, simply compact entity types and attr names according to the current @context
- If NGSIv2, send the longnames without compaction.
The "not so easy" answer is that some features (datasetId) are simply not supported by NGSIv2 and there's probably no solution for this (well, if we really want to, there's always a way ...).
An attribute with datasetIds (or: multi-instances) is represented as an array in NGSI-LD, while that would provoke an error in NGSIv2:
"P1": [
  {
    "type": "Property",
    "value": 4,
    "datasetId": "urn:ngsi-ld:xxx:north"
  },
  {
    "type": "Property",
    "value": 9,
    "datasetId": "urn:ngsi-ld:xxx:south"
  }
]
My personal opinion is that this concept of notifying in NGSIv2 is a temporary thing that we may use with some constraints until we implement support for NGSI-LD.
BTW, the whole datasetId concept is quite complex and only partially implemented in Orion-LD as of today.
Right now I can't think of any other problems we may find.
thx @kzangeli
docs/manuals/admin/configuration.md (outdated)

| `QL_CONFIG` | Pathname for tenant configuration |
| `LOGLEVEL` | define the log level for all services (`DEBUG`, `INFO`, `WARNING`, `ERROR`) |
| `QL_DEFAULT_DB` | Default backend: `timescale` or `crate` |
| `USE_FLASK` | `True` or `False` to use flask server or gunicorn. Default to `False` |
comment that this is for development and testing, not for prod! (maybe rename)
docs/manuals/user/index.md (outdated)

If the type of any of the received attributes is not present in the column
*NGSI Type* of the previous table, the value of such attribute will be treated
internally as a string. So, if you use `Float` for your attribute type (not
-valid), your attribute will be stored as a `string`.
+valid), your attribute will be stored as a `text`.
if the type is not an NGSI type, the type is inferred, with a few exceptions to provide NGSI-LD forward compatibility
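The inference described here — deriving the backend column type from the value when the declared type is not a valid NGSI type — might look like the sketch below. The mapping table is an assumption loosely following CrateDB type names, not the PR's actual table:

```python
def infer_db_type(attr_type, value):
    """Map a declared NGSI type to a backend column type; if the
    declared type is unknown, fall back to inferring from the value."""
    ngsi_to_db = {
        "Text": "text",
        "Number": "float",
        "Boolean": "boolean",
        "DateTime": "timestamptz",
        "geo:json": "geo_shape",
    }
    if attr_type in ngsi_to_db:
        return ngsi_to_db[attr_type]
    # Unknown declared type (e.g. "Float"): infer from the JSON value.
    # bool must be checked before int/float, since bool is a subclass
    # of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "float"
    if isinstance(value, dict):
        return "object"
    return "text"
```

So an attribute declared as the invalid type `Float` with a numeric value still lands in a numeric column, while one with a string value falls back to `text`, matching the doc change above.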
        return error, 400
    # Add TIME_INDEX attribute
    custom_index = request.headers.get(TIME_INDEX_HEADER_NAME, None)
    entity[TIME_INDEX_NAME] = \
        select_time_index_value_as_iso(custom_index, entity)
    # Add GEO-DATE if enabled
    add_geodata(entity)
    if not entity.get(LOCATION_ATTR_NAME, None):
check if it's a geo:json or something
@chicco785 good stuff! I think we can close #382 and #383 too, can't we?
quantumleap heavily relies on crate (or timescale) to query and store the data model configuration; this implies a high number of queries on the backend during inserts, even though the data model should rarely change between one insert and another of the same entity type.
as tested, caching reduces the number of such queries and thus increases throughput. a previous test used an in-memory cache, but that's not ideal in a concurrent environment, so in this new test we experiment with the usage of redis.
experiments show that:
caveats: if you use caching, a manual delete can (of course) make queries return stale results unless you also clear the cache or wait for the cache to expire
System used for testing:
Mac2016 with 3.1 GhZ i7 (i.e. 4 cores) and 16GB Ram
Baseline (latest image as of 2020/10/02): 44 req/s - 600 ms avg response time - (crate queries peak: select 111 q/sec, insert 54 q/sec)
Redis caching: 60 req/s - (no data on response time - have to check old image)
In memory caching: 95 req/s - (no data on response time - have to check old image)
Redis caching + connection re-usage: 100 req/s - 200 ms avg response time (crate queries peak: select 7 q/sec, insert 142 q/sec)
Connection re-usage (without any cache): 55 req/s - 400 ms avg response time (crate queries peak: select 177 q/sec, insert 86 q/sec)
Redis caching + connection re-usage data: