Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update with new versioning #228

Merged
merged 3 commits into from
Apr 1, 2017
Merged

update with new versioning #228

merged 3 commits into from
Apr 1, 2017

Conversation

ttvi
Copy link
Contributor

@ttvi ttvi commented Feb 27, 2017

What changed:

  • update listing algorithms delimiterMaster and delimiterVersions to reflect new versioning in metadata (with metadata's functional tests to verify)
  • remove unused constants
  • change versioning tools to version class
  • update unit tests

@ghost
Copy link

ghost commented Feb 27, 2017

What about S3 functional tests ? The listing algorithm is used for the file backend.

@ttvi
Copy link
Contributor Author

ttvi commented Feb 27, 2017

There should be no impact. The listing algorithms delimiterMaster and delimiterVersions are specific to versioning only.

@ttvi ttvi force-pushed the ft/vsp branch 9 times, most recently from a2bb286 to ec68ccf Compare March 21, 2017 08:41
* < 0: entry is not accepted, listing should finish
*/
filter(entry) {
return entry ? 0 : 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You always return 0 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the base class.

@@ -3,6 +3,11 @@
const checkLimit = require('./tools').checkLimit;
const DEFAULT_MAX_KEYS = 1000;

function inc(str) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is present in 3 files, can you make it into a separate file ?

this.res = [];
if (parameters) {
this.maxKeys = checkLimit(parameters.maxKeys, DEFAULT_MAX_KEYS);
const maxKeys = parameters.maxKeys;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do this in two steps ?


/**
* Get the listing resutls. Format depends on derivatives' specific logic.
* @return {Array} - The listed elements
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this.res is always undefined.

@@ -34,11 +54,11 @@ class List {
filter(elem) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You return a Number now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Updated.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird.

@@ -94,30 +127,12 @@ class Delimiter {
*/
addContents(key, value) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same about the return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@@ -3,6 +3,11 @@
const checkLimit = require('./tools').checkLimit;
const DEFAULT_MAX_KEYS = 1000;

function inc(str) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ttvi You can add this method in lib/algos/list/tools.js as a utility function and import here. Please add a short jsdoc description on top for the function.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

genMDParams() {
const params = {};
if (this.params.keyMarker) {
params.gt = `overview${this.params.splitter}` +
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please abstract the key prefix generation into a method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only use it once, I don't see the need for doing that. It's better to keep the code as is to see what it does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep it as-is.

const checkLimit = require('./tools').checkLimit;
const DEFAULT_MAX_KEYS = 10000;

/**
* Class of an extension doing the simple listing
*/
class List {
class List extends Extension {
Copy link
Collaborator

@rahulreddy rahulreddy Mar 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extends super currently performs poor in pure ES6 implementation. You can choose to use the class directly instead or introduce base class in a PR later when performance regression is fixed in node 6.

Source: https://kpdecker.github.io/six-speed/ for reference.

Copy link

@ghost ghost Mar 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't that part of the "class" tests ? They report "same" as I see it, or I may not be looking at the right row ?
EDIT: Or maybe we're talking about super which is yet again different ? If so, I'm not sure that's a real issue here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still an issue here considering the time nodejs spends to do "super" compared to the listing process itself? The number 4x is significant but in the context of listing I do not really see that as an issue.
Definitely that needs to be improved though. What is your suggestion here: removing super and duplicating the code or is there any cool approach you can think of?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's an issue. The benchmark compares to the ES5 implementation, so it's 4x slower than the ES5 implementation but fast enough for us since the code would spend nowhere near time number of times making as many super calls as the benchmark.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep it as-is. @rahulreddy let me know if you want to open an issue for this.

@ghost ghost self-assigned this Mar 22, 2017
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of good things in there, + a few questions about doc, explanations/clarifications, and design improvement suggestions

'use strict'; // eslint-disable-line strict

/**
* Base class of listing extensions.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally like this :D

'use strict'; // eslint-disable-line strict

/**
* Base class of listing extensions.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for this abstraction to clarify things :)

@@ -3,6 +3,11 @@
const checkLimit = require('./tools').checkLimit;
const DEFAULT_MAX_KEYS = 1000;

function inc(str) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

*
* @return {string} - the insight; a common prefix or a master key
*/
skipping() {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to return a predefined constant here.
For instance:

const SKIP_REASON_NONE = '';
const SKIP_REASON_WHATEVER = 'whatever';
...
class Extension {
    ...
    skipping() {
        return SKIP_REASON_NONE;
    }

This scheme could be reused by future extensions to only return known and checkable reasons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked.

creationDate: tmp.creationDate,
},
});
this.Contents.push({ key, value });
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double-check, why are we changing this whole bit ? Is it because value is now expected to contain versionning information + the former Object within ?

Does it have an impact on MD or S3 ? or Both ?

IF it is both, it's becoming a REAL issue, as we won't be able to make progress until the whole work is ready...

Also, where is this formatting done now ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make S3 to do the computation instead of the leader node. S3 needs to do the parsing scality/cloudserver@4415788#diff-d1fea59d8501a89ee1ce02464ecdcd16; no metadata impact.
To make them consistent across extensions (w.r.t Basic and Versions originally).

* @return {boolean} - whether this is a PHD version
*/
static isPHD(value) {
// check the length of the value before parsing
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the length ? but we're only checking if the value contains "isPHD" before actually parsing the value for a real check -> mismatch between code and comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the description for the previous version. Will be removed.

try {
return Version.from(value).isPHDVersion();
} catch (exception) { // eslint-disable-line strict
return false; // nice, Vault
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh ? Vault ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vault is sending plain text data not in JSON format. Vault should never contain "isPHD" in the data it sends, but if it does, we have this covered.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, nice defensive coding then :)

* @return {string} - the formated versionId string
*/
function generateVersionId(info) {
// Need to wait for the millisecond slot got "flushed". We wait for
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure to understand this bit. I get that you wait for one millisecond, but not why ?
Is it because you need to ensure that it starts fresh on a "new" millisecond, due to the inner workings relying on millisecond + sequence number ? Are you relying on the fact that Node.JS computation is monothreaded + this mechanism, to make sure that any two instantiations of this class cannot start on the same millisecond ?
If yes, please fix doc to reflect this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Epochs are at millisecond level. Wait one millisecond to make sure the first epoch when this module started is always the latest, regardless of the sequence number.

* @param {string} str - the versionId to encrypt
* @return {string} - the encrypted versionId
*/
function encrypt(str) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the encrypt/decrypt part is not used for MD.
It should have gone to another PR, depending on this one.

No matter it's too late, but please keep this in mind, as such huge reviews are a hassle for everyone of us.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arsenal is in the middle to provide common knowledge to both S3 and MetaData. Since it's related to versioning, it belongs to this specific PR.

@@ -1,179 +0,0 @@
'use strict'; // eslint-disable-line strict
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have no tests for this one ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been merged with the tests for delimiter.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh ok

@ttvi ttvi force-pushed the ft/vsp branch 2 times, most recently from 90386ba to aaf31d7 Compare March 24, 2017 08:03
@ghost
Copy link

ghost commented Mar 24, 2017

LGTM now, Good work o/

@@ -34,11 +54,11 @@ class List {
filter(elem) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet :)

const TEMPLATE_ST = new Array(LENGTH_ST + 1).join(' ');

// site identifier, like PARIS, TOKYO; will be trimmed if exceeding max length
const SITE_ID = `${process.env.SITE_ID}${TEMPLATE_ST}`.slice(0, LENGTH_ST);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could process.env.SITE_ID be undefined? If yes, should it be process.env.SITE_ID || ${a_default_name}?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. SITE_ID is a new required variable to be available to both S3 and MetaData https://github.com/scality/Federation/pull/937.

VID_CRYPTO_ENCODING_TXT, VID_CRYPTO_ENCODING);
encrypted += cipher.final(VID_CRYPTO_ENCODING);
let length = encrypted.length;
while (length && encrypted.charAt(length - 1) === '=') {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why '=' must be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To beautify the base64 versionId string without impacting the correctness.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A regexp could be used to make things slightly clearer than a loop IMO, say, this should be equivalent (considering that base64 reserves = only for the trailer):

encrypted = encrypted.replace(/=/g,'')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it was not carefully crafted. Some other alternatives can be used like a simple indexOf, but regex seems concise and is a good solution.

@ttvi
Copy link
Contributor Author

ttvi commented Mar 24, 2017

Integration with versioning OK: http://ci.ironmann.io/gh/scality/Integration/11082.
Trying integration without versioning: http://ci.ironmann.io/gh/scality/Integration/11089.

@ttvi
Copy link
Contributor Author

ttvi commented Mar 25, 2017

Ah, I just realized that this has completely changed the format of listing so this branch of Arsenal cannot be tested against the non-versioning branch of the other repos. @LaurenSpiegel @rahulreddy please merge this all together with S3's scality/cloudserver#651 and MetaData's https://github.com/scality/MetaData/pull/1009.

* < 0: entry is not accepted, listing should finish
*/
filter(entry) {
return entry ? FILTER_SKIP : FILTER_SKIP;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this is considered an abstract base class with default implementations of some methods. Here it looks like you simply want to return FILTER_SKIP. Or does it make more sense to return FILTER_ACCEPT for a default implementation, which would be no filtering at all? Or maybe just remove the default implementation if it makes no sense for subclasses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this does not look nice. As this is an interface, I wanted to give some specific description about each function can so that we can get the general idea of the derivatives. About the double FILTER_SKIPs there, they exist for some purposes: first, to give an example on what to return (matching what described in JSDoc), and second, to avoid the linting warning of not using the input value entry (while I still want to keep the entry there).

* version when doing master version listing.
*
* @return {string} - the insight: a common prefix or a master key,
* or undefined if there is no insight
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the comment should be updated, since the function returns SKIP_NONE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

params.gt = `overview${this.params.splitter}` +
`${this.params.keyMarker}${this.params.splitter}`;
if (this.params.uploadIdMarker) {
params.gt += `${this.params.uploadIdMarker}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably don't need the formatting markers here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this.params.uploadIdMarker could be a number. As with the param.gt, I copied it from the old code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the + operator on a string automatically converts its second argument into a string.

}
// advance so that lower bound does not include the supplied
// markers
params.gt = inc(params.gt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you want to start the listing from the marker instead of the "incremented" marker? (and gt would not include the marker itself unlike gte).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*/
filter(obj) {
// Check first in case of maxkeys = 0
if (this.keys >= this.maxKeys) {
// In cases of maxKeys <= 0 => IsTruncated = false
this.IsTruncated = this.maxKeys > 0;
return false;
return -1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should now use FILTER_END and family

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used

return diff[0] * 1e9 + diff[1];
}
const start = process.hrtime();
while (getspan(process.hrtime(start)) < span);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind putting the ; on the next line, or instead have an empty instruction block? So that it becomes clearer that it's a busy loop doing nothing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good

const VID_CRYPTO_PASSWORD = process.env.VID_CRYPTO_PASSWORD || 'hangngon';

// internal state of the module
let prvts = 0; // epoch of the last versionId
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would opt for longer, more descriptive names for those variables, especially since they are global to the module.

const VID = require('../../../lib/versioning/VersionID.js');
const assert = require('assert');

function randkey(length = 15) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very minor: you may remove the default value for length and let the caller specify it explicitly each time, because I'm not sure what the default value is representing (i.e which kind of particular key) and if there is a meaningful default value here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the comment I like; it touches the right issues + having convincing reasoning for its suggestions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, about the answer, it's there to represent the approximated length of a versionId string. I will move it to the caller.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved

VID_CRYPTO_ENCODING_TXT, VID_CRYPTO_ENCODING);
encrypted += cipher.final(VID_CRYPTO_ENCODING);
let length = encrypted.length;
while (length && encrypted.charAt(length - 1) === '=') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A regexp could be used to make things slightly clearer than a loop IMO, say, this should be equivalent (considering that base64 reserves = only for the trailer):

encrypted = encrypted.replace(/=/g,'')

const MAX_SQ = Math.pow(10, LENGTH_SQ) - 1; // good for 1 billion ops

// the earliest versionId, used for versions before versioning
const VID_INF = `${TEMPLATE_TS}${MAX_TS}`.slice(-LENGTH_TS) +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could consider using helper functions to do the proper padding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants