Hash size in bytes must be bigger than 20 bytes
gtorodelvalle committed May 21, 2015
1 parent 2c6a1ed commit c3ff1bb
Showing 2 changed files with 62 additions and 14 deletions.
30 changes: 21 additions & 9 deletions README.md
@@ -349,15 +349,27 @@ the attribute type does not have any special semantic or effect currently.
As already mentioned, all these configuration parameters can also be adjusted using the
[`config.js`](https://github.com/telefonicaid/IoT-STH/blob/develop/config.js) file whose contents are self-explanatory.

It is important to note that there is a limitation on the bytes a MongoDB namespace (concatenation of the database name and
collection name) may have (see <a href="http://docs.mongodb.org/manual/reference/limits/#namespaces" target="_blank">http://docs.mongodb.org/manual/reference/limits/#namespaces</a>
for further information). This forces us to generate hashes as part of the names of the created collections based on the combination
of the concrete service, service path, entity and attribute. The hash function
used is SHA-512.

To let the user or developer easily recover this information, a collection named ```DB_COLLECTION_PREFIX + _collection_name```
is created and fed with information regarding the mapping of the collection names and the combination of concrete services,
service paths, entities and attributes.
It is important to note that there is a limitation of 120 bytes for the namespaces (concatenation of the database name and
collection names) in MongoDB (see <a href="http://docs.mongodb.org/manual/reference/limits/#namespaces" target="_blank">http://docs.mongodb.org/manual/reference/limits/#namespaces</a>
for further information). Related to this, the STH generates the collection names using 2 possible mechanisms:

1. <u>Plain text</u>: If the `SHOULD_HASH` configuration parameter is set to 'false', the collection names are
generated as a concatenation of the `COLLECTION_PREFIX` plus the service path (in case of the collection-per-service-path
data model) plus the entity id plus the entity type (in case of the collection-per-entity data model) plus the attribute name
(in case of the collection-per-attribute data model) plus '.aggr' for the collections of the aggregated data. The length
of the collection name plus the `DB_PREFIX` plus the database name (or service) must not exceed 120 bytes in UTF-8
encoding; otherwise MongoDB will refuse to create the collection and, consequently, the STH will not store any data.

2. <u>Hash based</u>: If the `SHOULD_HASH` option is set to something other than 'false' (the default option), the
collection names are generated as a concatenation of the `COLLECTION_PREFIX` plus a generated hash plus '.aggr' for the
collections of the aggregated data. To avoid collisions in the generation of these hashes, they are required to be at
least 20 bytes long. Once again, the length of the collection name plus the `DB_PREFIX` plus the database name (or service)
must not exceed 120 bytes in UTF-8 encoding; otherwise MongoDB will refuse to create the collection and, consequently,
the STH will not store any data. The hash function used is SHA-512.

When hashes are used as part of the collection names, a collection named ```DB_COLLECTION_PREFIX + _collection_name```
is created and fed with the mapping between the collection names and the combination of concrete services, service paths,
entities and attributes, so that the user or developer can easily recover this information.
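
As a rough illustration of the byte budget described above, the following sketch computes how many bytes remain for the hash and builds a hashed collection name from them. The two constants come from this commit; the function names (`availableHashBytes`, `truncatedHash`, `hashedCollectionName`) are hypothetical, `Buffer.byteLength` stands in for the `bytes-counter` module, and truncating a hexadecimal SHA-512 digest is an assumption, since the actual `generateHash` implementation is not part of this change.

```javascript
'use strict';

var crypto = require('crypto');

// Limits taken from the commit; everything else in this sketch is illustrative.
var MAX_NAMESPACE_SIZE_IN_BYTES = 120;
var MIN_HASH_SIZE_IN_BYTES = 20;

// Bytes left for the hash once the database name, the collection prefix and
// the '.aggr' suffix are accounted for. One extra byte is subtracted, as in
// the commit (presumably for the '.' between database and collection names).
function availableHashBytes(databaseName, collectionPrefix) {
  return MAX_NAMESPACE_SIZE_IN_BYTES -
    Buffer.byteLength(databaseName, 'utf8') -
    Buffer.byteLength(collectionPrefix, 'utf8') -
    Buffer.byteLength('.aggr', 'utf8') - 1;
}

// Hypothetical helper: hex SHA-512 digest cut to at most `limit` characters.
// Hex digits are ASCII, so one character occupies one byte.
function truncatedHash(input, limit) {
  return crypto.createHash('sha512').update(input).digest('hex').substring(0, limit);
}

// Returns the hashed collection name, or null when fewer than
// MIN_HASH_SIZE_IN_BYTES bytes are available for the hash.
function hashedCollectionName(databaseName, collectionPrefix, plainName) {
  var limit = availableHashBytes(databaseName, collectionPrefix);
  if (limit < MIN_HASH_SIZE_IN_BYTES) {
    return null;
  }
  return collectionPrefix + truncatedHash(plainName, limit);
}
```

With a database name of 11 bytes and a prefix of 4 bytes, 99 bytes remain for the hash, so the database name, the separator, the collection name and the '.aggr' suffix add up to exactly 120 bytes.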

[Top](#section0)

46 changes: 41 additions & 5 deletions src/sth_database.js
@@ -4,11 +4,15 @@
"use strict";

var mongoose = require('mongoose');
var boom = require('boom');
var crypto = require('crypto');
var bytesCounter = require('bytes-counter');

var sthConfig, sthLogger, sthHelper, connectionURL, eventSchema, aggregatedSchema;

var MAX_NAMESPACE_SIZE_IN_BYTES = 120,
MIN_HASH_SIZE_IN_BYTES = 20;

/**
* Declares the Mongoose schemas.
*/
@@ -198,15 +202,37 @@
break;
}
if (sthConfig.SHOULD_HASH) {
// The maximum number of bytes accepted by MongoDB for namespaces is 120 bytes.
var limit = 119 - bytesCounter.count(databaseName) - bytesCounter.count(sthConfig.COLLECTION_PREFIX) -
bytesCounter.count('.aggr');
var limit = getHashSizeInBytes(databaseName);
if (limit < MIN_HASH_SIZE_IN_BYTES) {
sthLogger.warn('The available bytes for the hashes to be used as part of the collection names is not big enough (' +
'at least ' + MIN_HASH_SIZE_IN_BYTES + ' bytes are needed), ' +
'please reduce the size of the DB_PREFIX ("' + sthConfig.DB_PREFIX + '" = ' + bytesCounter.count(sthConfig.DB_PREFIX) + ' bytes), ' +
'the service ("' + databaseName.substring(sthConfig.DB_PREFIX.length, databaseName.length) + '" = ' + bytesCounter.count(databaseName) +
' bytes) and/or the COLLECTION_PREFIX ("' + sthConfig.COLLECTION_PREFIX + '" = ' + bytesCounter.count(sthConfig.COLLECTION_PREFIX) +
' bytes)',
{
operationType: sthConfig.OPERATION_TYPE.DB_LOG
}
);
return null;
}
return sthConfig.COLLECTION_PREFIX + generateHash(collectionName4Events, limit);
} else {
return sthConfig.COLLECTION_PREFIX + collectionName4Events;
}
}

/**
* Returns the available hash size in bytes to be used as part of the collection names
* based on the database name, database name prefix and collection name prefix
* @param databaseName The database name
* @return {number} The size of the hash in bytes
*/
function getHashSizeInBytes(databaseName) {
return MAX_NAMESPACE_SIZE_IN_BYTES - bytesCounter.count(databaseName) -
bytesCounter.count(sthConfig.COLLECTION_PREFIX) - bytesCounter.count('.aggr') - 1;
}

/**
* Return the name of the collection which will store the aggregated data
* @param {string} databaseName The database name
@@ -218,8 +244,13 @@
*/
function getCollectionName4Aggregated(databaseName, servicePath, entityId, entityType,
attrName) {
return getCollectionName4Events(
databaseName, servicePath, entityId, entityType, attrName) + '.aggr';
var collectionName4Events = getCollectionName4Events(
databaseName, servicePath, entityId, entityType, attrName);
if (collectionName4Events) {
return collectionName4Events + '.aggr';
} else {
return null;
}
}

/**
@@ -250,6 +281,11 @@
params.attrName);
}

if (!collectionName) {
var error = boom.badRequest('The collection name could not be generated');
return process.nextTick(callback.bind(null, error));
}

// Switch to the right database
var connection = mongoose.connection.useDb(databaseName);
