-
Notifications
You must be signed in to change notification settings - Fork 3.8k
CASSANDRA-17677: Fix BulkLoader to load entireSSTableThrottle and entireSSTableInterDcThrottle #1701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good and thank you for adding all those tests. I found only one issue that is easy to miss if you haven't before.
Precision problem when we set the old deprecated throttle in megabits. More info in the comments
Also, I almost forgot we need to add News.txt entry probably under Deprecation
| File config = new File(Paths.get(".", "test", "conf", "cassandra.yaml").normalize()); | ||
| String[] args = { "--throttle", "200", "--inter-dc-throttle", "400", "-d", "127.9.9.1", "-f", config.absolutePath(), sstableDirName("legacy_sstables", "legacy_ma_simple") }; | ||
| LoaderOptions options = LoaderOptions.builder().parseArgs(args).build(); | ||
| // converts from megabytes to mebibytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // converts from megabytes to mebibytes | |
| // converts from megabits to mebibytes |
| File config = new File(Paths.get(".", "test", "conf", "cassandra.yaml").normalize()); | ||
| String[] args = { "-tmib", "200", "-idctmib", "400", "-d", "127.9.9.1", "-f", config.absolutePath(), sstableDirName("legacy_sstables", "legacy_ma_simple") }; | ||
| LoaderOptions options = LoaderOptions.builder().parseArgs(args).build(); | ||
| // converts from megabytes to mebibytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // converts from megabytes to mebibytes | |
| // converts from megabits to mebibytes |
| File config = new File(Paths.get(".", "test", "conf", "cassandra.yaml").normalize()); | ||
| String[] args = { "--throttle-mib", "200", "--inter-dc-throttle-mib", "400", "-d", "127.9.9.1", "-f", config.absolutePath(), sstableDirName("legacy_sstables", "legacy_ma_simple") }; | ||
| LoaderOptions options = LoaderOptions.builder().parseArgs(args).build(); | ||
| // converts from megabytes to mebibytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // converts from megabytes to mebibytes | |
| // converts from megabits to mebibytes |
| File config = new File(Paths.get(".", "test", "conf", "cassandra.yaml").normalize()); | ||
| String[] args = { "-t", "200", "-idct", "400", "-d", "127.9.9.1", "-f", config.absolutePath(), sstableDirName("legacy_sstables", "legacy_ma_simple") }; | ||
| LoaderOptions options = LoaderOptions.builder().parseArgs(args).build(); | ||
| // converts from megabytes to mebibytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // converts from megabytes to mebibytes | |
| // converts from megabits to mebibytes |
| if (cmd.hasOption(THROTTLE_MEGABITS) && cmd.hasOption(THROTTLE_MEBIBYTES)) | ||
| { | ||
| errorMsg(String.format("Both '%s' and '%s' were provided. Please only provide one of the two options", THROTTLE_MEGABITS, THROTTLE_MEBIBYTES), options); | ||
| } | ||
|
|
||
| if (cmd.hasOption(INTER_DC_THROTTLE_MEGABITS) && cmd.hasOption(INTER_DC_THROTTLE_MEBIBYTES)) | ||
| { | ||
| errorMsg(String.format("Both '%s' and '%s' were provided. Please only provide one of the two options", INTER_DC_THROTTLE_MEGABITS, INTER_DC_THROTTLE_MEBIBYTES), options); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call! :-)
| if (cmd.hasOption(THROTTLE_MEGABITS)) | ||
| { | ||
| throttleMebibytes = DataRateSpec.IntMebibytesPerSecondBound.megabitsPerSecondInMebibytesPerSecond(Integer.parseInt(cmd.getOptionValue(THROTTLE_MEGABITS))) | ||
| .toMebibytesPerSecondAsInt(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this part has an issue and breaks the compatibility. The conversions between megabits and mebibytes are double numbers internally. Something very annoying to keep backward compatibility, etc.
So the issue with this approach here is that someone will set megabits but by using .toMebibytesPerSecondAsInt() to set the throttling we actually round the number. This can be caught if you try to add --throttle being 5megabits in testBulkLoader_WithArgs6. Then if you do assertEquals(5, DatabaseDescriptor.getStreamThroughputOutboundMegabitsPerSec()); you actually get 8.
As it is a tool and the difference doesn't seem super big might be ok.
The bad part is that if someone tries super small number just for testing or something, they will actually get 0 and unthrottle... That is how I identified that annoying issue when we worked on the new config framework and we switched to double in DataRateSpec to be precise as the RateLimiter gets double anyway. At the same time, of course, we do not allow the users to input double numbers, just int. The SetGetStreamThroughputTest tests small numbers but we didn't have this for the tool and it is very tiny detail easy to miss. :-(
I am thinking maybe we can introduce double variable to ensure precision and no surprises to people who continue using the megabits version? WDYT?
In general probably we can just document for people this behavior and leave it as is I guess but if we want to keep the compatibility and be precise everywhere, we will need to introduce double. I guess this shouldn't be hard so at this point of the release I guess I would prefer just to work our the precision to be correct.
CC @yifan-c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me think about this. I have some ideas, but one of the options is to use a double type instead of int. Another idea is to keep two variables, one for mibs and one for MiBs. I will probably end up going with the solution that produces the more accurate result in the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as the user sees only ints and input matches output - I am good :-) Probably both versions should work
| { | ||
| LoaderOptions options = LoaderOptions.builder().build(); | ||
| assertEquals(0, options.throttleMebibytes); | ||
| assertEquals(0, options.interDcThrottleMebibytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: shall we add assertions also for the other two?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have another test for entire sstable default settings testEntireSSTableDefaultSettings. Do we want to combine them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I missed that one it seems. Maybe we can just combine them, it's up to you.
| // converts from megabytes to mebibytes | ||
| assertEquals(24, options.throttleMebibytes); | ||
| assertEquals(48, options.interDcThrottleMebibytes); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After we fix the precision issue, I think it will be correct to check the megabits here
| LoaderOptions options = LoaderOptions.builder().parseArgs(args).build(); | ||
| // converts from megabytes to mebibytes | ||
| assertEquals(24, options.throttleMebibytes); | ||
| assertEquals(48, options.interDcThrottleMebibytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After we fix the precision issue, I think it will be correct to check the megabits here
| @@ -296,7 +296,7 @@ public IntMebibytesPerSecondBound(long mebibytesPerSecond) | |||
| // which were in megabits per second in 4.0. Do not start using it for any new properties | |||
| public static IntMebibytesPerSecondBound megabitsPerSecondInMebibytesPerSecond(long megabitsPerSecond) | |||
| { | |||
| final double MEBIBYTES_PER_MEGABIT = 0.119209289550781; | |||
| final double MEBIBYTES_PER_MEGABIT = 15625D / 131072D; | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made these conversions more explicit, let me know if this is okay @ekaterinadimitrova2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is hard to see the reasoning behind the equations.
How about define the constants for MEGABIT, MEBIBYTES, etc. in bits and derive the conversion from the constants? For example,
static final double MEBIBYTES_IN_BITS = Math.pow(2, 20) * 8;
static final double MEGABIT_IN_BITS = Math.pow(10, 6);
static final double MEBIBYTES_PER_MEGABIT = MEGABIT_IN_BITS / MEBIBYTES_IN_BITS;
... more for the other size units...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, I've updated the PR. The only difference is that I've defined MEBIBYTES_IN_BITS as:
static final double MEBIBYTES_IN_BITS = Math.pow(2, 23);There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering what is the benefit from introducing these calculations? We can also add just a comment, at the same time the name already states - MEBIBYTES/MEGABIT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is only to make the reasoning more clear since the procedure is coded. Adding a comment with the calculations also works. Meanwhile a few constants should not introduce any overkill. I think either way is OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, while you are right I prefer we leave it in the comments and don't do any more changes in this area
| @@ -395,7 +395,7 @@ public double convert(double source, DataRateUnit sourceUnit) | |||
| }; | |||
|
|
|||
| static final double MAX = Long.MAX_VALUE; | |||
| static final double MEGABITS_PER_MEBIBYTE = 8.388608; | |||
| static final double MEGABITS_PER_MEBIBYTE = 131_072D / 15_625D; | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another explicit conversion
| @@ -38,6 +38,10 @@ | |||
| * The Regexp used to parse the rate provided as String in cassandra.yaml. | |||
| */ | |||
| private static final Pattern UNITS_PATTERN = Pattern.compile("^(\\d+)(MiB/s|KiB/s|B/s)$"); | |||
| static final double MEBIBYTES_IN_BITS = Math.pow(2, 23); | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| static final double MEBIBYTES_IN_BITS = Math.pow(2, 23); | |
| static final double MEBIBYTE_IN_BITS = Math.pow(2, 23); |
| return conf.stream_throughput_outbound.toMebibytesPerSecondAsInt(); | ||
| } | ||
|
|
||
| public static void setStreamThroughputOutboundMegabitsPerSec(long value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think value should stay int
| return conf.inter_dc_stream_throughput_outbound.toMebibytesPerSecondAsInt(); | ||
| } | ||
|
|
||
| public static void setInterDCStreamThroughputOutboundMegabitsPerSec(long value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think value should stay int
| DatabaseDescriptor.setStreamThroughputOutboundMegabitsPerSec(Double.valueOf(options.throttleMegabits).longValue()); | ||
| DatabaseDescriptor.setInterDCStreamThroughputOutboundMegabitsPerSec(Double.valueOf(options.interDcThrottleMegabits).longValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we just leave the option in double? it is later casted to double anyway in megabitsPerSecondInMebibytesPerSecond?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DatabaseDescriptor.setStreamThroughputOutboundMegabitsPerSec and DatabaseDescriptor.setInterDCStreamThroughputOutboundMegabitsPerSec take int as a param. So we need to cast it. In your previous comment you mentioned that in public static void setStreamThroughputOutboundMegabitsPerSec(int value) value should stay as int
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, that one is called directly in nodetool and we should keep it int as otherwise we will introduce a bug by extending to long or double.
| @@ -224,27 +238,27 @@ public Builder authProvider(AuthProvider authProvider) | |||
| return this; | |||
| } | |||
|
|
|||
| public Builder throttle(int throttle) | |||
| public Builder throttleMegabits(int throttleMegabits) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering whether we should also add builder methods for the mebibytes version we added?
Also, considering all latest discussions around tools, are we allowed change the names of these Builder methods? Maybe better to deprecate the old name and point to the new ones?
Something like:
public Builder throttleMegabits(int throttleMegabits)
{
this.throttleMegabits = throttleMegabits;
return this;
}
@Deprecated
public Builder throttle(int throttle)
{
return throttleMegabits(throttle);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am kind of worried about the similarity of the names... It could be prone to oversights.
For example, code auto-completion may put throttleMegabits, while the actual intent is to have throttleMebibytes.
How about not introducing a new method with the similar name, but adding java docs to indicate the input throttle (better rename it to throttleMegabits) is in the megabits, and the method is deprecated.
@ekaterinadimitrova2 , what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh so you suggest we remove the throttleMegabits and end up only with the old deprecated throttle and the new throttleMebibytes
That also should work. Honestly, people again can make the same mistake and ignore throttle was in megabits but I guess less methods is better. Also, if we decide to get rid of the megabits version for those properties on a major version.
better rename it to throttleMegabits --> I think any renaming here will be breaking change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better rename it to throttleMegabits
I was not clear on what is "it". I mean to say that it is better to just rename the input parameter from throttle to throttleMegabits. Instead, we have
@Deprecated
public Builder throttle(int throttleMegabits)
{
...
}|
|
||
| if (cmd.hasOption(THROTTLE_MEBIBYTES)) | ||
| { | ||
| throttleMegabits = DataRateSpec.DataRateUnit.MEBIBYTES_PER_SECOND.toMegabitsPerSecond(Integer.parseInt(cmd.getOptionValue(THROTTLE_MEBIBYTES))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe new can add static import:
import static org.apache.cassandra.config.DataRateSpec.DataRateUnit.MEBIBYTES_PER_SECOND;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
throttleMegabits ---> Internally we save in mebibytes in DataRateSpec and megabits is used just on the parsing level, backward compatibility. So I feel we need to do this the other way around. - store in throttleMebibytes and if someone wants to use the megabits, that is converted to mebibytes. Otherwise we will do more calculations - the user gives mebibytes, we make it megabits here, but then we use those megabits to set it internally in mebibytes again in DataRateSpec
| { | ||
| interDcThrottle = Integer.parseInt(cmd.getOptionValue(INTER_DC_THROTTLE_MBITS)); | ||
| interDcThrottleMegabits = DataRateSpec.DataRateUnit.MEBIBYTES_PER_SECOND.toMegabitsPerSecond(Integer.parseInt(cmd.getOptionValue(INTER_DC_THROTTLE_MEBIBYTES))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as for throttleMegabits
| @@ -61,11 +61,11 @@ public static void setupClass() | |||
| public void testUpdateStreamThroughput() | |||
| { | |||
| // Initialized value check | |||
| assertEquals(defaultStreamThroughputMebibytesPerSec * BYTES_PER_MEBIBYTE, StreamRateLimiter.getRateLimiterRateInBytes(), 0); | |||
| assertEquals(defaultStreamThroughputMebibytesPerSec * BYTES_PER_MEBIBYTE, StreamRateLimiter.getRateLimiterRateInBytes(), 0.01); | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I believe these changes can be skipped and we can be consistent if we change in the test class the constant MEBIBYTES_PER_MEGABIT to (same as DataRateSpec)
private static final double MEBIBYTES_IN_BITS = Math.pow(2, 23);
private static final double MEGABIT_IN_BITS = Math.pow(10, 6);
private static final double MEBIBYTES_PER_MEGABIT = MEGABIT_IN_BITS / MEBIBYTES_IN_BITS;
| @@ -296,7 +296,7 @@ public IntMebibytesPerSecondBound(long mebibytesPerSecond) | |||
| // which were in megabits per second in 4.0. Do not start using it for any new properties | |||
| public static IntMebibytesPerSecondBound megabitsPerSecondInMebibytesPerSecond(long megabitsPerSecond) | |||
| { | |||
| final double MEBIBYTES_PER_MEGABIT = 0.119209289550781; | |||
| final double MEBIBYTES_PER_MEGABIT = 15625D / 131072D; | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering what is the benefit from introducing these calculations? We can also add just a comment, at the same time the name already states - MEBIBYTES/MEGABIT.
| assertEquals(0, options.throttleMegabits, 0.01); | ||
| assertEquals(0, options.interDcThrottleMegabits, 0.01); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say 0 should be exactly 0, we should never deviate as it has special meaning - unthrottle.
I think it is safe to leave the delta being 0 and be sure this never regress
| "-d", | ||
| "127.9.9.1:9041", | ||
| "--throttle", | ||
| "10", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to set something crazy like 3 which is less than 0.5 in mebibytes to ensure no one ever makes a change this to be rounded to 0 and unthrottle. Highly unlikely, but someone might be testing something, I like correctness :-)
ekaterinadimitrova2
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I think I am +1, I left just a nit and we need to update also NEWS.txt for the new flags and methods and mark as deprecated the old ones. I have only one comment that I will leave to @yifan-c to express final preference in the BulkLoader class, line 76-77
| public static final String ENTIRE_SSTABLE_THROTTLE_MBITS = "entire-sstable-throttle"; | ||
| public static final String ENTIRE_SSTABLE_INTER_DC_THROTTLE_MBITS = "entire-sstable-inter-dc-throttle"; | ||
| @Deprecated | ||
| public static final String THROTTLE_MEGABITS = "throttle"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My OCD around the tools is on alert these days and I was wondering about changing this public static final String names but I guess people should be using the actual string, not the constants? @yifan-c ? Should we deprecate them with the old names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call!
Cassandra is not provided as a library, but people do use it for the purpose. Compatibility is best effort. Those constants can be perfectly as "private", but declared as "public". See https://issues.apache.org/jira/browse/CASSANDRA-10637
I think the changes of the constants names are not strictly required. We can deprecate the original names and add a comment saying "MBITS" are "MEGAMITS".
WDYT? @frankgh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's best to keep the original name, since Cassandra could sometimes be used as a library. I will add the comment and I will retain the @Deprecated annotation as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sharing that ticket @yifan-c , I think we can even add it to the comments for other people like us, tempted to make changes here :-)
| public final int entireSSTableThrottle; | ||
| public final int entireSSTableInterDcThrottle; | ||
| public final double throttleMegabits; | ||
| public final double interDcThrottleMegabits; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had for a moment a concern whether we change how and where we demand only ints from the users for those two but testing in LoaderOptionsTest with double numbers seems it triggers the same NumberFormatException for all 4 of them so it seems we are good
| assertEquals(200, options.throttleMegabits, 0.01); | ||
| assertEquals(400, options.interDcThrottleMegabits, 0.01); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| assertEquals(200, options.throttleMegabits, 0.01); | |
| assertEquals(400, options.interDcThrottleMegabits, 0.01); | |
| assertEquals(200, options.throttleMegabits, 0); | |
| assertEquals(400, options.interDcThrottleMegabits, 0); |
| assertEquals(200, options.throttleMegabits, 0.01); | ||
| assertEquals(400, options.interDcThrottleMegabits, 0.01); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| assertEquals(200, options.throttleMegabits, 0.01); | |
| assertEquals(400, options.interDcThrottleMegabits, 0.01); | |
| assertEquals(200, options.throttleMegabits, 0); | |
| assertEquals(400, options.interDcThrottleMegabits, 0); |
| DatabaseDescriptor.setInterDCStreamThroughputOutboundMegabitsPerSec(options.interDcThrottle); | ||
| StreamResultFuture future = null; | ||
| DatabaseDescriptor.setStreamThroughputOutboundMegabitsPerSec(Double.valueOf(options.throttleMegabits).intValue()); | ||
| DatabaseDescriptor.setInterDCStreamThroughputOutboundMegabitsPerSec(Double.valueOf(options.interDcThrottleMegabits).intValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it is not a big deal but I wanted to mention again that if we deprecate the megabits options and we go for mebibytes internally in DataRateSpec, probably better to add a setter in mebibytes and use mebibytes and do only one calculation megabits to mebibytes in the parking area for legacy reason in case someone wants to use the old ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added two setters setStreamThroughputOutboundMebibytesPerSec, setInterDCStreamThroughputOutboundMebibytesPerSec. However, I've decided to keep throttle and interDcThrottle in megabits, because the conversion from mebibytes to megabits produces a larger number, and it is less prone to loss of precision.
For example, if we have 0.1 and 0.3 mebibytes, the rounded conversion to megabits will be 1, and 3 megabits respectively. Because we cast these numbers to int in the setter, we get more precision in megabits. If we go the other way around, we lose precision, 3 and 1 both end up as 0 mebibytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering you kept throttleMegabits and interDcThrottleMegabits being double, I guess this comment is outdated.
I spent some time thinking and I think we should just make internally those 4 properties to be operated in bytes and change also throttleMegabits and interDcThrottleMegabits to being bytes. This will save a lot of calculations and headache.
If we step back and look at the StreaManager.StreamRateLimiter, we will see that the rate limiter expects bytes.
Why this was actually not done when we were switching to the new types - because the whole effort was huge and switching internally parameters to smallest possible unit was left for follow up improvement tickets. But in this case I think this is at this point more like a bug considering how many twists and calculations the bulk loader will trigger. Also, division is more expensive operation so it is good if we can get rid of it. WDYT? CC @yifan-c
|
Thanks for the review @ekaterinadimitrova2 and @yifan-c . I think this PR is in a good place for merging. |
…ireSSTableInterDcThrottle `entire_sstable_stream_throughput_outbound` and `entire_sstable_inter_dc_stream_throughput_outbound` were introduced in CASSANDRA-17065. They were added to the LoaderOptions class but they are not being used in the BulkLoader class. In this commit, we make use of the entire_sstable streaming options in BulkLoader. In addittion, the `throttle` and `interDcThrottle` are being passed as MB/s but they should be MiB/s. This commit fixes the verbiage as well as setting the correct value for those properties in BulkLoader.
76cda2b to
b39abf4
Compare
ekaterinadimitrova2
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left only one comment about the description, otherwise I am +1. we need only NEWS.txt (deprecation section and to explain we have changed the new flags for entire-stable an this is not compatible with alpha1 version)
| options.addOption("idct", INTER_DC_THROTTLE_MBITS, "inter-dc-throttle", "inter-datacenter throttle speed in Mbits (default unlimited)"); | ||
| options.addOption("e", ENTIRE_SSTABLE_THROTTLE_MBITS, "entire-sstable-throttle", "entire SSTable throttle speed in Mbits (default unlimited)"); | ||
| options.addOption("eidct", ENTIRE_SSTABLE_INTER_DC_THROTTLE_MBITS, "entire-sstable-inter-dc-throttle", "entire SSTable inter-datacenter throttle speed in Mbits (default unlimited)"); | ||
| options.addOption("t", THROTTLE_MBITS, "throttle", "throttle speed in Mbps (default 0 for unlimited), this option is deprecated, use \"throttle-mib\" instead"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use tmib, while throttle-mib explains it all, the actual option is different
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for the inter-dc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed offline and decided to keep the long name, and additionally, we decided to remove the short names that we're going to be introduced in this PR. I have updated the code, documentation, and tests to reflect the removal of the short names
| -idct,--inter-dc-throttle <inter-dc-throttle> Inter-datacenter throttle speed in Mbits (default unlimited) | ||
| -idct,--inter-dc-throttle <inter-dc-throttle> (deprecated) Inter-datacenter throttle speed in Mbits (default 0 for unlimited) | ||
| Use -idctmib / --inter-dc-throttle-mib instead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove -idctmib
| -t,--throttle <throttle> Throttle | ||
| speed in Mbits (default unlimited) | ||
| -t,--throttle <throttle> (deprecated) Throttle speed in Mbits (default 0 for unlimited) | ||
| Use -tmib / --throttle-mib instead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove -tmib
| import java.lang.reflect.Constructor; | ||
| import java.lang.reflect.InvocationTargetException; | ||
| import java.net.*; | ||
| import java.net.InetAddress; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not against the changes. Those are good and more in line with the code style, but unrelated changes of the patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should I revert these changes?
| @@ -224,27 +238,27 @@ public Builder authProvider(AuthProvider authProvider) | |||
| return this; | |||
| } | |||
|
|
|||
| public Builder throttle(int throttle) | |||
| public Builder throttleMegabits(int throttleMegabits) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am kind of worried about the similarity of the names... It could be prone to oversights.
For example, code auto-completion may put throttleMegabits, while the actual intent is to have throttleMebibytes.
How about not introducing a new method with the similar name, but adding java docs to indicate the input throttle (better rename it to throttleMegabits) is in the megabits, and the method is deprecated.
@ekaterinadimitrova2 , what do you think?
### What is the issue This is a follow up test for riptano/cndb#13696. ### What does this PR fix and why was it fixed The fix was already merged, but this confirms that without the fix, collections are affected by the NoSuchElementException bug.
### What is the issue This is a follow up test for riptano/cndb#13696. ### What does this PR fix and why was it fixed The fix was already merged, but this confirms that without the fix, collections are affected by the NoSuchElementException bug.
entire_sstable_stream_throughput_outboundandentire_sstable_inter_dc_stream_throughput_outboundwereintroduced in CASSANDRA-17065. They were added to the LoaderOptions class but they are not being used in
the BulkLoader class. In this commit, we make use of the entire_sstable streaming options in BulkLoader.
In addittion, the
throttleandinterDcThrottleare being passed as MB/s but they should be MiB/s.This commit fixes the verbiage as well as setting the correct value for those properties in BulkLoader.