You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Mar 7, 2021. It is now read-only.
There are a few bugs with removeFetchCondition and removeDownloadCondition. I'm submitting a pull request with this issue that resolves those bugs. Feel free to ignore it or change it or ask me to change it.
Typo
First, removeFetchCondition has a typo on this line. The relevant block:
if(indexinstanceofFunction){varitemIndex=crawler._downloadConditions.indexOf(index);returnBoolean(crawler._fetchConditions.splice(itemIndex,1));}elseif(typeofindex==="number"){returnBoolean(crawler._fetchConditions.splice(index,1));}thrownewError("Unable to find indexed fetch condition");
That second line should obviously be _fetchConditions not _downloadConditions.
Errors never throw
But that begs the question "Why are the tests passing?" That's due to a few logic errors in that block (and the corresponding block of removeDownloadCondition).
In the test, itemIndex gets a value of -1 from indexOf (since the function is not found in _downloadConditions) which is then passed to splice which interprets it as "one less than the end of the array" which is coincidentally the correct element (since it's a one element array). It then removes it successfully and returns, providing a false positive.
The remove methods' JSDocs both say:
If the removal was successful, the method will return true. Otherwise, it will throw an error.
Assuming that this is the contract you want to keep, the remove methods are not functioning correctly. The methods will never throw an error if passed a function. They will also never throw an error if passed a number. This is because splice will always successfully return an array, even if it's splicing on an empty array or if the index is negative. To keep the contract, index bounds checks need to be added to both the function parameter block and the index parameter block.
I propose the following for removeFetchCondition (and similar for removeDownloadCondition):
if(indexinstanceofFunction){varitemIndex=crawler._fetchConditions.indexOf(index);if(itemIndex!==-1){crawler._fetchConditions.splice(itemIndex,1);returntrue;}}elseif(typeofindex==="number"){if(index>=0&&index<crawler._fetchConditions.length){crawler._fetchConditions.splice(index,1);returntrue;}}thrownewError("Unable to find indexed fetch condition");
Now, if removeFetchCondition is passed a function which is not a fetch condition, or if it is passed a number which is outside of the bounds of _fetchConditions, it will throw an error. (I broke up the index if clause to make it more logically clear what's going on.)
Note that splice will always return an array, which will always evaluate as truthy. There's no need for a Boolean coercion, since that will always be true, which is what the method contract states anyways.
I've added four new mocha tests to check that errors are getting thrown. They fail for the current versions of removeFetchCondition and removeDownloadCondition but pass for these versions. I'm not sure I've styled the tests how you would prefer; feel free to correct them or have me change them.
Volatile conditionIDs
There is a separate logic problem here: Due to the use of splice, the "conditionID" which is returned by addFetchCondition (and addDownloadCondition) is an index number which may or may not remain correct after a call to removeFetchCondition. Calls to removeFetchCondition invalidate existing conditionIDs for conditions that had been added after the one that just got removed had been added. For example, the following code should work logically but won't:
varconditionID1=myCrawler.addFetchCondition(function(){});// = 0varconditionID2=myCrawler.addFetchCondition(function(){});// = 1myCrawler.removeFetchCondition(conditionID1);// SUCCESS: removing @ 0, now length is 1myCrawler.removeFetchCondition(conditionID2);// ERROR THROWN: removing @ 1, which doesn't exist
If you reversed the calls to removeFetchCondition it would succeed, which may be a counter-intuitive gotcha for users. Additionally, with more than two conditions you could easily receive a success from removeFetchCondition while trying to remove one condition when it actually removes the wrong condition. For example:
varconditionID1=myCrawler.addFetchCondition(function(){});// = 0varconditionID2=myCrawler.addFetchCondition(function(){});// = 1varconditionID3=myCrawler.addFetchCondition(function(){});// = 2myCrawler.removeFetchCondition(conditionID1);// SUCCESS: removing @ 0, now length is 2myCrawler.removeFetchCondition(conditionID2);// SUCCESS BUT WRONG: removing @ 1, which is actually condition3// myCrawler._fetchConditions now only contains condition2, which isn't what we wanted
Also notice that a subsequent call to addFetchCondition would return a conditionID of 1, which we already used. This is a common problem which can be easily fixed by using a "removed condition" placeholder object in the underlying arrays to preserve the accuracy of other conditionIDs, and then testing for (and ignoring) those placeholders in the async.every calls. Proposed change to removeFetchCondition:
if(indexinstanceofFunction){varitemIndex=crawler._fetchConditions.indexOf(index);if(itemIndex!==-1){crawler._fetchConditions[itemIndex]=REMOVED_CONDITION;returntrue;}}elseif(typeofindex==="number"){if(index>=0&&index<crawler._fetchConditions.length){crawler._fetchConditions[index]=REMOVED_CONDITION;returntrue;}}thrownewError("Unable to find indexed fetch condition");
Where REMOVED_CONDITION is defined once in the file to be a unique object for identification purposes. Proposed change to the actual fetch condition calls:
This method does make the async.every calls very slightly slower by adding overhead when conditions have been removed. If you're willing to add two new data structures with each crawler object, you could do this without the overhead while still maintaining a full list of all previously-assigned conditionIDs (which is the underlying problem). I wanted to restrain myself from modifying the codebase too much. If you'd like to see that implementation, I can provide that.
There is a slight complication with fixing it this way: the length of the underlying _fetchConditions does not change, which affects the other mocha tests for removing conditions because they are accessing the underlying array directly. I've changed those tests to no longer rely on the length of the arrays but instead on their actual contents. I've also added four new tests to check that multiple conditions can be removed correctly, both by ID and by reference.
If you eventually want to add a getFetchConditionCount() method, you would want to either keep track of how many removals are called per crawler object and subtract that from _fetchConditions.length or iterate through the array and count REMOVED_CONDITION objects, the first option being O(1) and probably preferable.
Multiple removals
Attempting to remove the same function twice will throw an error, but attempting to remove the same conditionID twice will not. This was not a problem before I made the above changes, since the methods would never error (as long as the input was a function or number). To maintain the original contract, I changed removeFetchCondition to throw if the conditionID has already been removed:
if(indexinstanceofFunction){varitemIndex=crawler._fetchConditions.indexOf(index);if(itemIndex!==-1){crawler._fetchConditions[itemIndex]=REMOVED_CONDITION;returntrue;}}elseif(typeofindex==="number"){if(index>=0&&index<crawler._downloadConditions.length){if(crawler._downloadConditions[index]!==REMOVED_CONDITION){crawler._downloadConditions[index]=REMOVED_CONDITION;returntrue;}}}thrownewError("Unable to find indexed fetch condition");
And similar for removeDownloadCondition.
Again, I nested if clauses because I didn't know how you would want to style what would be a really long line, and it makes it more obvious what is happening. This also gives you space to add line comments explaining why each clause exists. Really, you could replace the var crawler = this; statement at the beginning of the method with var conditions = this._fetchConditions; and shorten the rest of the code in the method, but that's your call.
I added four tests for checking that it throws when removing a condition twice. At this point, I'd added so many tests, I created sub-sections within the tests to make it easier to read the output.
There are more tests you could write, like adding->removing->adding and then testing if the new conditionID is a repeat of the first, but that just seems like overkill.
Documentation
Unrelated to these changes, the JSDoc for the "index" parameters should probably also change to reflect the fact that they can accept either a condition reference or an index. It would also probably make sense to change the argument name from index to something that more accurately reflects its dual purpose; but, really, this is a minor point. I have not done either of these things.
The text was updated successfully, but these errors were encountered:
There are a few bugs with
removeFetchCondition
andremoveDownloadCondition
. I'm submitting a pull request with this issue that resolves those bugs. Feel free to ignore it or change it or ask me to change it.Typo
First,
removeFetchCondition
has a typo on this line. The relevant block:That second line should obviously be
_fetchConditions
not_downloadConditions
.Errors never throw
But that begs the question "Why are the tests passing?" That's due to a few logic errors in that block (and the corresponding block of
removeDownloadCondition
).In the test,
itemIndex
gets a value of-1
fromindexOf
(since the function is not found in_downloadConditions
) which is then passed tosplice
which interprets it as "one less than the end of the array" which is coincidentally the correct element (since it's a one element array). It then removes it successfully and returns, providing a false positive.The remove methods' JSDocs both say:
Assuming that this is the contract you want to keep, the remove methods are not functioning correctly. The methods will never throw an error if passed a function. They will also never throw an error if passed a number. This is because
splice
will always successfully return an array, even if it's splicing on an empty array or if the index is negative. To keep the contract, index bounds checks need to be added to both the function parameter block and the index parameter block.I propose the following for
removeFetchCondition
(and similar forremoveDownloadCondition
):Now, if
removeFetchCondition
is passed a function which is not a fetch condition, or if it is passed a number which is outside of the bounds of_fetchConditions
, it will throw an error. (I broke up the indexif
clause to make it more logically clear what's going on.)Note that
splice
will always return an array, which will always evaluate as truthy. There's no need for aBoolean
coercion, since that will always betrue
, which is what the method contract states anyways.I've added four new mocha tests to check that errors are getting thrown. They fail for the current versions of
removeFetchCondition
andremoveDownloadCondition
but pass for these versions. I'm not sure I've styled the tests how you would prefer; feel free to correct them or have me change them.Volatile conditionIDs
There is a separate logic problem here: Due to the use of
splice
, the "conditionID" which is returned byaddFetchCondition
(andaddDownloadCondition
) is an index number which may or may not remain correct after a call toremoveFetchCondition
. Calls toremoveFetchCondition
invalidate existing conditionIDs for conditions that had been added after the one that just got removed had been added. For example, the following code should work logically but won't:If you reversed the calls to
removeFetchCondition
it would succeed, which may be a counter-intuitive gotcha for users. Additionally, with more than two conditions you could easily receive a success fromremoveFetchCondition
while trying to remove one condition when it actually removes the wrong condition. For example:Also notice that a subsequent call to
addFetchCondition
would return a conditionID of1
, which we already used. This is a common problem which can be easily fixed by using a "removed condition" placeholder object in the underlying arrays to preserve the accuracy of other conditionIDs, and then testing for (and ignoring) those placeholders in theasync.every
calls. Proposed change toremoveFetchCondition
:Where
REMOVED_CONDITION
is defined once in the file to be a unique object for identification purposes. Proposed change to the actual fetch condition calls:And similar for download conditions.
This method does make the
async.every
calls very slightly slower by adding overhead when conditions have been removed. If you're willing to add two new data structures with each crawler object, you could do this without the overhead while still maintaining a full list of all previously-assigned conditionIDs (which is the underlying problem). I wanted to restrain myself from modifying the codebase too much. If you'd like to see that implementation, I can provide that.There is a slight complication with fixing it this way: the length of the underlying
_fetchConditions
does not change, which affects the other mocha tests for removing conditions because they are accessing the underlying array directly. I've changed those tests to no longer rely on the length of the arrays but instead on their actual contents. I've also added four new tests to check that multiple conditions can be removed correctly, both by ID and by reference.If you eventually want to add a
getFetchConditionCount()
method, you would want to either keep track of how many removals are called per crawler object and subtract that from_fetchConditions.length
or iterate through the array and countREMOVED_CONDITION
objects, the first option being O(1) and probably preferable.Multiple removals
Attempting to remove the same function twice will throw an error, but attempting to remove the same conditionID twice will not. This was not a problem before I made the above changes, since the methods would never error (as long as the input was a function or number). To maintain the original contract, I changed
removeFetchCondition
to throw if the conditionID has already been removed:And similar for
removeDownloadCondition
.Again, I nested
if
clauses because I didn't know how you would want to style what would be a really long line, and it makes it more obvious what is happening. This also gives you space to add line comments explaining why each clause exists. Really, you could replace thevar crawler = this;
statement at the beginning of the method withvar conditions = this._fetchConditions;
and shorten the rest of the code in the method, but that's your call.I added four tests for checking that it throws when removing a condition twice. At this point, I'd added so many tests, I created sub-sections within the tests to make it easier to read the output.
There are more tests you could write, like adding->removing->adding and then testing if the new conditionID is a repeat of the first, but that just seems like overkill.
Documentation
Unrelated to these changes, the JSDoc for the "index" parameters should probably also change to reflect the fact that they can accept either a condition reference or an index. It would also probably make sense to change the argument name from
index
to something that more accurately reflects its dual purpose; but, really, this is a minor point. I have not done either of these things.The text was updated successfully, but these errors were encountered: