Generate collection names using a hash function #84
}
// The maximum number of bytes accepted by MongoDB for namespaces is 120 bytes.
var limit = 119 - bytesCounter.count(databaseName) - bytesCounter.count(sthConfig.COLLECTION_PREFIX) -
bytesCounter.count('.aggr');
Although this is a method for obtaining the collection name for raw data, I guess it is necessary to take into account the number of bytes for ".aggr", since the hash-based collection name must be the same for both raw and aggregated data, right?
That's right :) That's the reason for the `- bytesCounter.count('.aggr')` part :) Am I missing anything here? :p
No, it is perfect, just wondering if I understood it well :) NTC
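The byte budget discussed in this thread can be sketched roughly as follows. This is only an illustration, not the code in the PR: `hashByteBudget` is a hypothetical name, Node's `Buffer.byteLength` stands in for the project's `bytesCounter.count`, and the reading of 119 as "120 minus one byte for the `.` between database and collection name" is an assumption.

```javascript
'use strict';

// 119 is taken from the diff under review; presumably it is the 120-byte
// namespace limit minus the '.' separating database and collection names
// (assumption, the PR does not say so explicitly).
var USABLE_NAMESPACE_BYTES = 119;
var AGGR_SUFFIX = '.aggr';

// Hypothetical helper: returns the number of bytes left for the hash part of
// a collection name. Buffer.byteLength stands in for bytesCounter.count.
function hashByteBudget(databaseName, collectionPrefix) {
  return USABLE_NAMESPACE_BYTES -
    Buffer.byteLength(databaseName, 'utf8') -
    Buffer.byteLength(collectionPrefix, 'utf8') -
    // Reserved even for raw data collections, so that the raw and the
    // aggregated collection can share the same hash.
    Buffer.byteLength(AGGR_SUFFIX, 'utf8');
}
```

With an illustrative database name of `sth_default` (11 bytes) and a prefix of `sth_` (4 bytes), the budget would be 119 - 11 - 4 - 5 = 99 bytes. Counting bytes rather than characters matters because a multi-byte character such as `ñ` takes 2 bytes in UTF-8.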
@@ -345,6 +345,16 @@ the attribute type does not have any special semantic or effect currently.
As already mentioned, all these configuration parameters can also be adjusted using the
[`config.js`](https://github.com/telefonicaid/IoT-STH/blob/develop/config.js) file whose contents are self-explanatory.

It is important to note that there is a limitation on the bytes a MongoDB namespace (concatenation of the database name and
By introducing this paragraph it should be assumed that the data model is fixed to `collection-per-entity`, right? I mean, we are still maintaining the possibility to switch to another data model, and AFAIK this hashing mechanism only addresses the `collection-per-entity` data model, doesn't it?
I've been thinking about fixing the data model and so on, and I think we can hide from the documentation any reference to the `data_model` parameter, defaulting it to `collection-per-entity` but maintaining the code that allows changing the data model for internal testing purposes (I'm afraid we decide to fix the data model and in the future the bosses decide on another one :)).
No, no, the included code can be used with any of the 3 supported data models. The hash is generated from the service path (if collection-per-service-path), the concatenation of service path, entityId and entityType (if collection-per-entity) or the concatenation of service path, entityId, entityType and attributeName (if collection-per-attribute). See https://github.com/telefonicaid/IoT-STH/pull/84/files#diff-db503fe45ab607476c902dd760bd0defR203 and how the `collectionName4Events` is obtained ;)
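For readers without the diff at hand, the selection described in this comment could look roughly like the sketch below. It is not the code from the PR: `hashInput`, `generateHash` and `collectionNames` are hypothetical names, and the hex digest encoding and the separator-free concatenation are assumptions.

```javascript
'use strict';

var crypto = require('crypto');

// Data model identifiers as named in this thread (the constant is hypothetical).
var DATA_MODELS = {
  COLLECTION_PER_SERVICE_PATH: 'collection-per-service-path',
  COLLECTION_PER_ENTITY: 'collection-per-entity',
  COLLECTION_PER_ATTRIBUTE: 'collection-per-attribute'
};

// Builds the string to be hashed for the given data model.
// Assumption: the parts are concatenated without any separator.
function hashInput(dataModel, servicePath, entityId, entityType, attributeName) {
  switch (dataModel) {
    case DATA_MODELS.COLLECTION_PER_SERVICE_PATH:
      return servicePath;
    case DATA_MODELS.COLLECTION_PER_ENTITY:
      return servicePath + entityId + entityType;
    case DATA_MODELS.COLLECTION_PER_ATTRIBUTE:
      return servicePath + entityId + entityType + attributeName;
    default:
      throw new Error('Unsupported data model: ' + dataModel);
  }
}

// Hashes the input with SHA-512 and keeps at most `limit` characters.
// A hex digest is plain ASCII, so one character is one byte (hex is an assumption).
function generateHash(input, limit) {
  return crypto.createHash('sha512').update(input, 'utf8').digest('hex').substring(0, limit);
}

// Returns the raw and aggregated collection names, which share the same hash.
function collectionNames(prefix, dataModel, limit, servicePath, entityId, entityType, attributeName) {
  var hash = generateHash(hashInput(dataModel, servicePath, entityId, entityType, attributeName), limit);
  return { raw: prefix + hash, aggregated: prefix + hash + '.aggr' };
}
```

Since the truncation limit already reserves the bytes for `.aggr`, the aggregated name is simply the raw name plus that suffix, whichever data model is selected.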
Consequently, we can keep the option to select the data model if we want to since it is supported by the hashing mechanism :)
You are right, I did not notice the switch, I only saw the last case... Tip: do not review PRs with fever xD NTC
In addition, the last case is not about `collection-per-entity` but `collection-per-attribute`... so I reviewed it completely wrong xD
c3ff1bb to 63c914b
1. <u>Plain text</u>: In case the `SHOULD_HASH` configuration parameter is set to 'false', the collection names are
generated as a concatenation of the `COLLECTION_PREFIX` plus the service path (in case of the collection-per-service-path
data model) plus the entity id plus the entity type (in case of the collection-per-entity data model) plus the attribute name
We agreed to eliminate all information related to data models
But not in this PR, right? :) I wanted to include everything needed and in a new PR delete all the information just to have that information together in case someone asks us to get it back... :p
+1 not in this PR
LGTM, well done @gtorodelvalle !!
…names-using-a-hash-function Generate collection names using a hash function
COLLECTION_PREFIX + generatedHash
COLLECTION_PREFIX + generatedHash + '.aggr'
where `generatedHash` is generated using the SHA-512 algorithm, truncating the hash so that the whole namespace fits within MongoDB's 120-byte limit after reserving the bytes needed for the database name, the `COLLECTION_PREFIX` and the `.aggr` suffix for aggregates.
The relation between the collection names and the combination of service, service path, entity and attribute is stored in a new collection named `COLLECTION_PREFIX + 'collection_names'`.
Example entries of this collection are the following documents: