This repository has been archived by the owner on May 2, 2023. It is now read-only.
Looking back at the Travis logs, prescription_already_processed errors appear a significant number of times, which means that the IDs generated by the clients are colliding too often. This is a bit more serious since the Travis tests run on a single machine, and the situation is likely to be magnified in distributed deployments of the clients.
While searching for a solution to this problem I read this post, and I believe we should try to mimic some of the techniques described there. Firstly, the Twitter Snowflake approach is not ideal since it is a networked service. This could add latency to client requests, and we would need to measure that latency accurately.
An optimal solution seems to be one that does not require ID generators to know about each other. One generator per node seems ideal, but we need to minimize the chance of collision between the different generators. The previously mentioned post describes an interesting approach that builds an ID from 3 different parts:
{timestamp, per_generator_counter, per_generator_unique_value}
The per-generator unique value is probably easy to get with a function call like erlang:phash2(node()), but some other source of randomness should be added just in case.
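As a rough illustration of the three-part ID discussed in this issue, here is a minimal sketch of a per-node generator. The module name id_gen_sketch, the gen record, and the choice of 8 random bytes from crypto:strong_rand_bytes/1 as the extra source of randomness are all assumptions made for the example; none of this is the client's actual code.

```erlang
-module(id_gen_sketch).
-export([new/0, next_id/1]).

%% Hypothetical generator state: a per-generator unique value plus a counter.
-record(gen, {unique :: binary(), counter = 0 :: non_neg_integer()}).

%% Create a generator. The unique value combines a hash of the node name
%% with random bytes, so generators running on identically named nodes
%% (e.g. bench@127.0.0.1) still get different values with high probability.
new() ->
    NodeHash = erlang:phash2(node()),          % 0..2^27-1, fits in 32 bits
    Random = crypto:strong_rand_bytes(8),      % assumption: 8 bytes is enough
    #gen{unique = <<NodeHash:32, Random/binary>>}.

%% Return {Id, NewState}, where Id = {Timestamp, Counter, UniqueValue}.
%% The counter separates IDs created within the same millisecond.
next_id(#gen{unique = Unique, counter = Counter} = Gen) ->
    Timestamp = erlang:system_time(millisecond),
    Id = {Timestamp, Counter, Unique},
    {Id, Gen#gen{counter = Counter + 1}}.
```

A caller would keep the returned state and thread it through successive calls, e.g. G0 = id_gen_sketch:new(), {Id, G1} = id_gen_sketch:next_id(G0). No coordination between generators is needed, which is the property the issue is after.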
Once this question is properly addressed and the probability of collision minimised, the behaviour on collision should also be specified. For instance, if a collision does happen, maybe the operation should just be counted as successful, or the reason behind the failure should at least be logged (e.g. {error, prescription_id_collision}).
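The two collision behaviours floated here could be sketched as a small result-mapping function. The module name collision_sketch, the Policy argument, and the assumption that the server reports the error as {error, prescription_already_processed} are hypothetical; only the atoms themselves come from this issue.

```erlang
-module(collision_sketch).
-export([handle_result/2]).

%% Policy is a hypothetical switch between the two options discussed:
%%   count_as_success - treat a collision as a successful operation
%%   report           - log the collision and return a dedicated error
handle_result(ok, _Policy) ->
    ok;
handle_result({error, prescription_already_processed}, count_as_success) ->
    ok;
handle_result({error, prescription_already_processed}, report) ->
    error_logger:warning_msg("prescription ID collision detected~n"),
    {error, prescription_id_collision};
handle_result({error, _Reason} = Error, _Policy) ->
    %% Any other error is passed through unchanged.
    Error.
```

Whichever policy is chosen, mapping the error in one place keeps the benchmark's count of unsuccessful operations consistent with what is logged.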
This will be fixed with an update to the client, which may pass unnoticed in this repository. Must remember to come back here when changing lasp-bench, or add the client to the repository.
goncalotomas changed the title from "Client generated IDs collide too often" to "Client generated IDs sometimes collide during benchmark" on Nov 11, 2017
Revisiting this issue to clarify some things. Despite this erroneous behavior seen in the benchmarks, the frequency with which these errors are generated is negligible. This can easily be seen in the benchmark graphs as they also record the number of unsuccessful operations, and there is no evidence of a significant (or even noticeable) percentage of operations resulting in this error.
However, in order to fix this issue we will need to implement something akin to Twitter Snowflake.
erlang:phash2(node()) will not return a unique value for all clients since all of the clients will be bound to bench@127.0.0.1. It may be used as long as we make sure that the client node IDs are generated with the IP address they are using, but this might have to be achieved with a makefile target when running benchmarks.
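To make the point about node names concrete, the following sketch hashes node names directly instead of calling node(). The module name node_hash_sketch and the example names bench@10.0.0.1 and bench@10.0.0.2 are assumptions chosen for illustration.

```erlang
-module(node_hash_sketch).
-export([generator_value/1, demo/0]).

%% Same computation as erlang:phash2(node()), but with the node name
%% passed in so different names can be compared on a single machine.
generator_value(NodeName) when is_atom(NodeName) ->
    erlang:phash2(NodeName).

%% Returns {SameNameSameHash, DistinctNamesDifferentHash}.
demo() ->
    %% Every client bound to bench@127.0.0.1 gets exactly the same value.
    Same = generator_value('bench@127.0.0.1')
           =:= generator_value('bench@127.0.0.1'),
    %% Names built from distinct IP addresses are very likely, though not
    %% guaranteed, to hash differently, since phash2 is not collision-free.
    Differ = generator_value('bench@10.0.0.1')
             =/= generator_value('bench@10.0.0.2'),
    {Same, Differ}.
```

The first element of the result is always true, which is exactly the problem described above: identical node names contribute nothing to uniqueness.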