This repository has been archived by the owner on Aug 23, 2020. It is now read-only.

Tip Selection - memory efficient algorithm for computing cumulative weights #558

Merged
Merged 10 commits on May 16, 2018

Conversation

GalRogozinski
Contributor

@GalRogozinski GalRogozinski commented Feb 26, 2018

Context: The whitepaper describes a need to perform cumulative weight calculations on the transactions in order to perform the MCMC algorithm.

Problem: There are two versions of the algorithm implemented in the code:

In use - a time- and memory-efficient algorithm that performs a different calculation than the one described in the WP. Each transaction sums the weights of its direct ancestors and adds itself. This way indirect ancestors may be counted more than once, which can cause the weight to grow exponentially.

Solution: A space- and time-efficient algorithm similar to algorithm (2): if you traverse the subtangle in topological order, you can dispose of the sets you have already used. See https://github.com/alongalky/iota-docs/blob/master/cumulative.md.
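
For readers outside the review, here is a minimal, self-contained sketch of that idea in plain Java (names such as approversOf and CumulativeWeightSketch are hypothetical, this is not the PR's TipsManager code): walk the future cone of the entry point to get a topological order, then consume that order while building each transaction's approver set; the release of spent sets is only noted in a comment.

import java.util.*;

public class CumulativeWeightSketch {

    // approversOf maps a transaction hash to the hashes that directly approve it;
    // in IRI this lookup would come from the tangle storage.
    public static Map<String, Integer> cumulativeWeights(String entryPoint,
                                                         Map<String, List<String>> approversOf) {
        // 1) Topological order of the future cone via iterative DFS:
        //    a transaction is emitted only after all of its approvers.
        Deque<String> stack = new ArrayDeque<>();
        LinkedHashSet<String> order = new LinkedHashSet<>();
        stack.push(entryPoint);
        while (!stack.isEmpty()) {
            String tx = stack.peek();
            String unvisited = null;
            for (String approver : approversOf.getOrDefault(tx, Collections.emptyList())) {
                if (!order.contains(approver)) {
                    unvisited = approver;
                    break;
                }
            }
            if (unvisited != null) {
                stack.push(unvisited);
            } else {
                order.add(stack.pop());
            }
        }

        // 2) Consume the order: each transaction's approver set is the union of its
        //    direct approvers and their already-computed sets; weight = |set| + 1.
        Map<String, Set<String>> approverSets = new HashMap<>();
        Map<String, Integer> weights = new HashMap<>();
        for (String tx : order) {
            Set<String> set = new HashSet<>();
            for (String approver : approversOf.getOrDefault(tx, Collections.emptyList())) {
                set.add(approver);
                set.addAll(approverSets.getOrDefault(approver, Collections.emptySet()));
            }
            approverSets.put(tx, set);
            weights.put(tx, set.size() + 1); // +1 for the transaction itself
            // A real implementation would now dispose of sets that no remaining
            // transaction needs, which is where the memory saving comes from.
        }
        return weights;
    }
}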


public class TipsManager {

public static final int MAX_ANCESTORS_SIZE = 1000;


Where does this number come from? Why 1000?

Contributor Author


This is the maximum size of the ancestor set. It is limited to reduce the chance of an OutOfMemory error.
@alon-e and I agreed on it.

Is it the correct number to use?
Frankly I don't know.
If @paulhandy or @th0br0 have something to say about this figure then I am all ears.


Is there a memory consumption estimate? This will tell us at least whether we are in the ballpark or being overly permissive or restrictive.
It could even be a back-of-the-envelope type calculation:
1000 transactions *
size of the subhash *
expected walk depth *
reasonable number of transactions between milestones (say with 100 tx/sec)


@alongalky alongalky Feb 26, 2018


In addition, what happens when the sets are full? It seems silly to keep allocating ancestor sets; you should just somehow pass "MAX" to save memory and time.

Contributor Author


Hmmm,

Let's say the size of a subhash is 32 bytes. 1000 txs => ~32 KB per set.
The total amount of memory is 32 KB * num_of_unreleased_sets.
This depends on how wide the tangle is...
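
Written out, that estimate is roughly the following (numUnreleasedSets is a hypothetical stand-in for however many ancestor sets are held at once):

long bytesPerSet   = 32L * MAX_ANCESTORS_SIZE;        // 32 B * 1000 ≈ 32 KB per set
long totalEstimate = bytesPerSet * numUnreleasedSets; // grows with how wide the tangle is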

Contributor Author


Will check on testnet how often we hit the hard limit + make it configurable

log.debug("Topological sort done. Start traversing on txs in order and calculate weight");
Map<Buffer, Integer> cumulativeWeights = calculateCwInOrder(txHashesToRate, myApprovedHashes, confirmLeftBehind,
analyzedTips);
log.debug("Cumulative weights calculation done in {} ms", System.currentTimeMillis() - start);


Is there a better way to measure times in Java? Maybe the logs come with built-in timestamps anyway? You probably want profiling of every function call, not only the cumulative weight calculation.

Contributor Author


This is the simplest way.
Real profiling will be done with tools like JProfiler or YourKit.
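
For what it's worth, a monotonic-clock variant of the excerpt above would look like this (a sketch, not part of the PR; System.nanoTime() is unaffected by wall-clock adjustments):

long start = System.nanoTime();
Map<Buffer, Integer> cumulativeWeights = calculateCwInOrder(txHashesToRate, myApprovedHashes,
        confirmLeftBehind, analyzedTips);
log.debug("Cumulative weights calculation done in {} ms",
        (System.nanoTime() - start) / 1_000_000);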

Map<Hash, Collection<Hash>> txToDirectApprovers = new HashMap<>();

stack.push(startTx);
while (CollectionUtils.isNotEmpty(stack)) {


I think !stack.isEmpty() is cleaner, and more consistent with the rest of the calls

Contributor Author


The idea of CollectionUtils is to be defensive and not fall for a NullPointerException.

You can say that in this specific code segment stack will never be null, and even if it could be, the line above (stack.push()) would throw an exception first.

However, the reason I use it is purely out of habit, which I believe to be a good one.
Wherever it is fine to treat a null collection like an empty collection, one should use a null-safe method so we get fewer pesky NullPointerExceptions.
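
For readers unfamiliar with the utility: CollectionUtils.isNotEmpty(c) from Apache Commons Collections behaves like c != null && !c.isEmpty(), so it never throws on null. A minimal illustration (assuming the commons-collections4 artifact):

import org.apache.commons.collections4.CollectionUtils;
import java.util.Collection;

public class NullSafeCheck {
    public static void main(String[] args) {
        Collection<String> missing = null;
        System.out.println(CollectionUtils.isNotEmpty(missing)); // false, no exception
        // System.out.println(!missing.isEmpty());               // would throw NullPointerException
    }
}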

return cumulativeWeights;
}

private LinkedHashSet<Hash> sortTransactionsInTopologicalOrder(Hash startTx) throws Exception {


What algorithm are you using? Can you add a comment like // based on DFS algorithm, taken from: https://en.wikipedia.org/wiki/Topological_sorting#Depth-first_search?


Also, maybe it makes sense to take this function out of the module so you can write unit tests for it

Contributor Author


I can add a comment.

A unit test may be a good idea, but currently it is hard to write.
This is because the topological order is not deterministic...

Thinking about this a little more now,
I can do a diamond-shaped graph with 4 vertices. It will have only two possible orders, and I can test whether one of them occurs (see the sketch below).
If you have an idea of how to test this better, please tell me.
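
A sketch of what that assertion could look like, kept independent of IRI's test helpers (names are hypothetical): for a diamond root <- a, root <- b, {a, b} <- tip, a sort starting from root has exactly two valid orders, assuming the sort lists approvers before the transactions they approve.

List<String> order = new ArrayList<>(sortSubtangleAbove("root")); // hypothetical call under test
List<List<String>> validOrders = Arrays.asList(
        Arrays.asList("tip", "a", "b", "root"),
        Arrays.asList("tip", "b", "a", "root"));
Assert.assertTrue("expected one of the two valid diamond orders but got " + order,
        validOrders.contains(order));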

Another caveat (unrelated to your comment) to think about is the use of LinkedHashSet.
There was an earlier version of the code where the continue in L#356 was erroneously omitted. The result was multiple calls to add. In that early version a List and a Set were used instead of a LinkedHashSet, so the bug was easily found. However, if that continue is deleted now, the bug will not be so easy to see: the method will return the correct output but will be slower due to excessive add calls.

The advantage of using a LinkedHashSet is that it is more memory-efficient than two separate objects. This method, by the way, is a memory bottleneck.

Hash txHash = stack.peek();
if (!sortedTxs.contains(txHash)) {
Collection<Hash> appHashes = getTxDirectApproversHashes(txHash, txToDirectApprovers);
if (CollectionUtils.isNotEmpty(appHashes)) {


!appHashes.isEmpty()

Contributor Author


defensive programming

transaction1.store(tangle);
transaction2.store(tangle);
transaction3.store(tangle);
log.debug("printing transaction in diamond shape \n {} \n{} {}\n {}",


Do we need this log in a UT?

Contributor Author


I think it is helpful to log. If others agree with you, I am fine with removing it.

transaction4 = new TransactionViewModel(getRandomTransactionWithTrunkAndBranch(transaction2.getHash(), transaction3.getHash()), getRandomTransactionHash());
transaction1 = new TransactionViewModel(getRandomTransactionWithTrunkAndBranch(
transaction.getHash(), transaction.getHash()), getRandomTransactionHash());
transaction2 = new TransactionViewModel(getRandomTransactionWithTrunkAndBranch(


Maybe it makes sense to factor these out to a function, something like:

generateTangle([[1, 0], [2, 1], [3, 2]])

which returns a store with the three different edges and vertices. The code appears in every test and is hard to read, which makes it difficult to understand the scenario that is being tested
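
For illustration, such a helper could be sketched on top of the helpers already visible in this diff (the signature and index convention here are assumptions, not code from the PR):

// edges[i] = {trunkIndex, branchIndex} for transaction i + 1; index 0 is the given root.
private TransactionViewModel[] generateTangle(TransactionViewModel root, int[][] edges) throws Exception {
    TransactionViewModel[] txs = new TransactionViewModel[edges.length + 1];
    txs[0] = root;
    root.store(tangle);
    for (int i = 0; i < edges.length; i++) {
        txs[i + 1] = new TransactionViewModel(
                getRandomTransactionWithTrunkAndBranch(txs[edges[i][0]].getHash(), txs[edges[i][1]].getHash()),
                getRandomTransactionHash());
        txs[i + 1].store(tangle);
    }
    return txs;
}

Usage would then read something like generateTangle(root, new int[][]{{0, 0}, {1, 1}, {2, 2}}) for a simple chain, which is closer to the reviewer's [[1, 0], [2, 1], [3, 2]] notation than repeated constructor calls.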

Contributor Author


At some point in the future I will rewrite the test to not use random hashes (it is this way now because this is how it was before). Then I will also add this fix.

Assert.assertEquals(ratings.get(transaction1.getHash()).size(),4);
Assert.assertEquals(ratings.get(transaction2.getHash()).size(), 3);

log.info(String.format("Linear ordered hashes from tip %.4s, %.4s, %.4s, %.4s, %.4s", transaction4.getHash(),


log in test?

Contributor Author


I think it helps when the test fails

}

@Test
public void updateRatings2TestWorks() throws Exception {
TransactionViewModel transaction, transaction1, transaction2, transaction3, transaction4;
public void testCalculateCumulativeWeightAlon() throws Exception {


Which Alon? :)
I think the name needs to say a bit more

Contributor Author


It is a unit test with a tangle @alon-e made up to understand the PR better. I have no idea what name to give it, honestly.

Any name you can recommend will be fine.


Something to describe what it does.

}

//@Test
// @Test


I recommend removing changes to this function, if it's only formatting and comments

//transition probability = ((Hx-Hy)^-3)/maxRating
walkRatings[i] = Math.pow(tipRating - ratings.getOrDefault(tips[i],0L), -3);
walkRatings[i] = Math.pow(tipRating - cumulativeWeights.getOrDefault(subHash,0), -3);


Performance could be improved by a factor of 20+. See #535


I recommend leaving this change out of this PR, it's big enough as it is. It's an unrelated issue.

Contributor Author


We are supposed to switch to the formula described in the whitepaper (with the alpha). If that doesn't happen soon, then we will merge your PR.
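
For reference, the whitepaper's alpha-weighted rule replaces the inverse cube with an exponential. A hedged sketch reusing the names from the excerpt above (alpha is a new parameter that does not exist in this code yet):

// transition probability ~ exp(-alpha * (Hx - Hy)), as described in the whitepaper
walkRatings[i] = Math.exp(-alpha * (tipRating - cumulativeWeights.getOrDefault(subHash, 0)));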

Contributor

@paulhandy paulhandy left a comment


💯

@GalRogozinski GalRogozinski added P1 - Critical Priority - Do this as soon as you can S2 - High High Severity labels Mar 19, 2018
@iotasyncbot iotasyncbot changed the title Tip Selection - memory efficient algorithm for computing cumulative weights IRI-301 ⁃ Tip Selection - memory efficient algorithm for computing cumulative weights Apr 17, 2018
@anyong anyong changed the title IRI-301 ⁃ Tip Selection - memory efficient algorithm for computing cumulative weights Tip Selection - memory efficient algorithm for computing cumulative weights Apr 22, 2018
@GalRogozinski GalRogozinski deleted the cw-algo branch May 16, 2018 18:25
Labels: C-Memory, P1 - Critical (Do this as soon as you can), S2 - High (High Severity), T-Enhancement