New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SOLR-14702: Remove oppressive language (part1) #1712
SOLR-14702: Remove oppressive language (part1) #1712
Conversation
taking this out of Draft status. 👨🏾💻 |
...taimporthandler/src/test/org/apache/solr/handler/dataimport/TestContentStreamDataSource.java
Outdated
Show resolved
Hide resolved
PR description says:
Actual replacement is primary/secondary. I don't see the renamed |
solr/core/src/java/org/apache/solr/security/JWTVerificationkeyResolver.java
Outdated
Show resolved
Hide resolved
} | ||
@SuppressWarnings({"rawtypes"}) | ||
NamedList master = (NamedList) initArgs.get("master"); | ||
boolean enableMaster = isEnabled( master ); | ||
NamedList primary = (NamedList) initArgs.get("primary"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this a communication between different Solr nodes?
If the current node got updated to new vocabulary primary/secondary but the remote node did not and did send data using master/slave, we might run into issues.
It might be desirable (if the above is true) to support both primary and master as return values (and secondary and slave) until a subsequent version of Solr not expected to work with nodes running code using master/slave.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also same issue in the other direction: if an updated node sends "primary" to a non updated node expecting "master", the non updated node will fail. Maybe we need to send the same data under both "primary" and "master" and the same data under both "secondary" and "slave", and accept either one on the receiving end.
I'm not familiar with how this code exactly works so might be off here, but this rename is not internal so must use caution (and test with renamed nodes talking to non renamed nodes, on both roles).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to look through it. Hopefully I can grok it. Not sure, though. If I can find a more backcompat approach, I will drop it in the PR. Open to suggestions from others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initArgs.containsKey("primary") ? initArgs.get("primary") : initArgs.get("master")
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @chatman commented in the general discussion, maybe these changes happening in a new major version branch does no require them to be backward/forward compatible?
If 8x and 9.0 servers are expected to talk and understand each other, the suggestion looks good on the receiving end. We need something on the sending end as well (send both values, because "primary" being sent to a 8x server expecting only "master" will not be understood).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
except for the backcompat issue (which I don't think matters at all).
What's the real issue? Let's accept as input "master" or "primary" in one case and "slave" or "secondary" in the other, as well as send both values over the wire as output.
This will make new code and old code compatible (old talking to new and new talking to old). Once only new code is running, we can safely remove sending or accepting the offending terms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this discussion is resolved?
What's the real issue? Let's accept as input "master" or "primary" in one case and "slave" or "secondary" in the other, as well as send both values over the wire as output.
This will make new code and old code compatible (old talking to new and new talking to old). Once only new code is running, we can safely remove sending or accepting the offending terms.
+1, this should allow us to be backward compatible (and I'd even suggest we keep that for the life of 9 to make upgrades easy)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I added I thought @tflobbe Can you let me know where there other places where I should anticipate receiving slave
or master
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rolling upgrade across major versions isn't supported, and I don't see any reason to bend over backwards to support that.
Not to derail this effort at all, which I think is very important, but we do claim to support rolling upgrades across major versions, see for example https://lucene.apache.org/solr/guide/8_6/major-changes-in-solr-8.html#rolling-upgrades-with-solr-8. The page on upgrading a Solr cluster is really only meant for SolrCloud users (that's what was meant by "cluster" when it was written, which is actually incorrect since you can have a "cluster" without using SolrCloud), but it doesn't say not to do a rolling upgrade across major versions because we have other docs that tell you how to do exactly that.
I don't think it's a major issue though, we just need to clearly spell out the proper upgrade steps for 9.0 (which doesn't have to be part of this PR). There is going to be a lot for people to do (if they're using removed features and need to install plugins), this isn't going to add much IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
except for the backcompat issue (which I don't think matters at all).
What's the real issue? Let's accept as input "master" or "primary" in one case and "slave" or "secondary" in the other, as well as send both values over the wire as output.
This will make new code and old code compatible (old talking to new and new talking to old). Once only new code is running, we can safely remove sending or accepting the offending terms.
The real issue here is that the terms should've been removed years ago, or never used. They are super stressful to interact with and every time I combed through my changes here it was really frustrating. I was in college when the project was released. These terms were not used at Grinnell College. I could give a history lesson if you'd like to understand, but I think we should align with the standards put forth by the IETF and expressed in the Apache Code of Conduct at a minimum. We should also remember to prioritize community over code. We shouldn't even be looking to preserve rolling upgrades in favor of keeping such stressful nomenclature. There's so much changing in 9 already that is backward incompatible.
I've fixed the description. That was a vestige of my initial PR where I ran into problems when I tried to run/understand the changes.
On the PNG's, I've fixed them. I was aware of them and forgot to address. I think subsequent PRs will follow, but haven't determined the scope yet. I want to move away from surface level fixes that have deeply emotional implications for code readers and developers on a social level to surface level fixes that have deeply emotional implications for code readers on a technical level. Lots of words there. Better developer experience. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, changes look good. Please make sure all tests pass.
Also, since there are things that are backward incompatible here, let us target this issue for a 9.0 release only, and add upgrade notes in the relevant ref guide page.
After all the recent changes, making sure the tests pass |
I ran it again and shard split tests didn't fail so there seems to be some inconsistency there. ReplicationHandler tests still failed. I'm digging as to what they test to see if I can determine why they don't fail when isolated but do fail when run together. See below:
So strange |
Looks like In other words, moving on to check the others. I'll batch my comments accordingly. |
I cannot get the additional three tests to fail with consistency when isolated, but I cannot say with any certainty that they are not an issue. A second (maybe third) set of eyes would be really helpful here. It's really tricky for me to understand why they are failing the test config files look good afaik. let me know if this warrants further investigation. |
Did you run these tests with same scrutiny before your changes? Maybe behavior is similar? |
@chatman . We should get things correct before merging it. This has to be vetted properly |
I had not. I am checking now. |
I looked at the Ref Guide changes, +1 on those from me (but I can't make it a blanket +1 since I can't comment on all the other code changes). |
in my most recent test, the previously failing tests all passed. Thanks @ErickErickson. However, a new set of tests failed. Tests with failures [seed: 358D229C22BF813]:
|
Then they passed with the same seed. BUILD SUCCESSFUL
|
all tests pass: Can we agree that I can add the back-compat changes to support rolling upgrades in a subsequent PR @risdenk @tflobbe @ctargett @thelabdude @murblanc @noblepaul @elyograg @janhoy @chatman |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked at the PR only, do I don't know if there are missing references (other than those right next to some change). I left some comments, mostly regarding back compatibility.
NamedList<Object> secondaryConfig = new NamedList<>(); | ||
secondaryConfig.add("fetchFromLeader", Boolean.TRUE); | ||
secondaryConfig.add(ReplicationHandler.SKIP_COMMIT_ON_MASTER_VERSION_ZERO, switchTransactionLog); | ||
secondaryConfig.add("pollInterval", pollIntervalStr); | ||
NamedList<Object> replicationConfig = new NamedList<>(); | ||
replicationConfig.add("slave", slaveConfig); | ||
replicationConfig.add("secondary", secondaryConfig); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this is in the context of SolrCloud, so "secondary" doesn't apply and should be instead follower
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks
@@ -239,7 +239,7 @@ final private void replicate(String nodeName, SolrCore core, ZkNodeProps leaderp | |||
} | |||
|
|||
ModifiableSolrParams solrParams = new ModifiableSolrParams(); | |||
solrParams.set(ReplicationHandler.MASTER_URL, leaderUrl); | |||
solrParams.set(ReplicationHandler.PRIMARY_URL, leaderUrl); | |||
solrParams.set(ReplicationHandler.SKIP_COMMIT_ON_MASTER_VERSION_ZERO, replicaType == Replica.Type.TLOG); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing master
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oof. good catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, this is a reference to the replication handler. One of the few links between the old way and the new one.
@@ -733,12 +733,12 @@ private void handleBootstrapStatus(SolrQueryRequest req, SolrQueryResponse rsp) | |||
} | |||
|
|||
static class BootstrapCallable implements Callable<Boolean>, Closeable { | |||
private final String masterUrl; | |||
private final String primaryUrl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in the context of SolrCloud, should be "leader"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this whole file should go away, but that's a PR for another night...but in progress! 😍
Object skipCommitOnMasterVersionZero = initArgs.get(SKIP_COMMIT_ON_MASTER_VERSION_ZERO); | ||
if (skipCommitOnMasterVersionZero != null && skipCommitOnMasterVersionZero instanceof Boolean) { | ||
this.skipCommitOnMasterVersionZero = (boolean) skipCommitOnMasterVersionZero; | ||
Object skipCommitOnPrimaryVersionZero = initArgs.get(SKIP_COMMIT_ON_MASTER_VERSION_ZERO); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing an update here? Also, since you changed the parameter used, this needs to handle back compatibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the only one that needs to change, possibly. Still stitching it all together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't have to handle back-compat in this PR. I can commit to it prior to 9.0's release and even prioritize the work before opening PRs on the other work I am doing.
log.info("Primary's generation: {}", latestGeneration); | ||
log.info("Primary's version: {}", latestVersion); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To my point, all these log in IndexFetcher talk about Master/Primary even when running in SolrCloud mode. This is likely adding to the confusion that Shawn talked about in the Jira. If these were "leader"/"Follower", we'd have solve it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gothca
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shawn suggested that we shouldn't use the same terminology for the legacy architecture. I've updated the first PR in any event. At this time, keeping the same terminology seems to be the right way to go unless I rewrite the replication handler. There's an argument for that because I'm sure there are areas where we could improve, but it won't happen in this PR.
@@ -23,7 +23,7 @@ This screen provides status information about each collection & node in your clu | |||
.Only Visible When using SolrCloud | |||
[NOTE] | |||
==== | |||
The "Cloud" menu option is only available on Solr instances running in <<getting-started-with-solrcloud.adoc#getting-started-with-solrcloud,SolrCloud mode>>. Single node or master/slave replication instances of Solr will not display this option. | |||
The "Cloud" menu option is only available on Solr instances running in <<getting-started-with-solrcloud.adoc#getting-started-with-solrcloud,SolrCloud mode>>. Single node or primary/secondary replication instances of Solr will not display this option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we say "legacy replication" instead? (especially if we decide to go with leader/follower)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are the only one truly adamant about changing to leader follower that I have seen thus far, so I'm not sure we will go that route. Unless, of course, you are willing to make the changes in that branch. You have a had start. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not?. We can do a sweeping change after we are done with this PR and before merging. Shouldn't be too bad. Also, my comment is about the "mode", which has some other arguments also in that thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd never seen that email
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I'm on that list.
@@ -24,7 +24,7 @@ In the left-hand navigation bar, you will see a pull-down menu titled "Collectio | |||
==== | |||
The "Collection Selector" pull-down menu is only available on Solr instances running in <<solrcloud.adoc#solrcloud,SolrCloud mode>>. | |||
|
|||
Single node or master/slave replication instances of Solr will not display this menu, instead the Collection specific UI pages described in this section will be available in the <<core-specific-tools.adoc#core-specific-tools,Core Selector pull-down menu>>. | |||
Single node or primary/secondary replication instances of Solr will not display this menu, instead the Collection specific UI pages described in this section will be available in the <<core-specific-tools.adoc#core-specific-tools,Core Selector pull-down menu>>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
@@ -19,7 +19,7 @@ | |||
|
|||
The Core Admin API is primarily used under the covers by the <<collections-api.adoc#collections-api,Collections API>> when running a <<solrcloud.adoc#solrcloud,SolrCloud>> cluster. | |||
|
|||
SolrCloud users should not typically use the CoreAdmin API directly, but the API may be useful for users of single-node or master/slave Solr installations for core maintenance operations. | |||
SolrCloud users should not typically use the CoreAdmin API directly, but the API may be useful for users of single-node or primary/secondary Solr installations for core maintenance operations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
@@ -61,7 +61,7 @@ The following components support distributed search: | |||
|
|||
=== Shards Whitelist | |||
|
|||
The nodes allowed in the `shards` parameter is configurable through the `shardsWhitelist` property in `solr.xml`. This whitelist is automatically configured for SolrCloud but needs explicit configuration for master/slave mode. Read more details in the section <<distributed-requests.adoc#configuring-the-shardhandlerfactory,Configuring the ShardHandlerFactory>>. | |||
The nodes allowed in the `shards` parameter is configurable through the `shardsWhitelist` property in `solr.xml`. This whitelist is automatically configured for SolrCloud but needs explicit configuration for primary/secondary mode. Read more details in the section <<distributed-requests.adoc#configuring-the-shardhandlerfactory,Configuring the ShardHandlerFactory>>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
@@ -16,10 +16,10 @@ | |||
// specific language governing permissions and limitations | |||
// under the License. | |||
|
|||
The Replication screen shows you the current replication state for the core you have specified. <<solrcloud.adoc#solrcloud,SolrCloud>> has supplanted much of this functionality, but if you are still using Master-Slave index replication, you can use this screen to: | |||
The Replication screen shows you the current replication state for the core you have specified. <<solrcloud.adoc#solrcloud,SolrCloud>> has supplanted much of this functionality, but if you are still using Primary-Secondary index replication, you can use this screen to: | |||
|
|||
. View the replicatable index state. (on a master node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed what here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"View the replicatable index state. (on a master node)"
I suppose I should convert this back to draft. I don't think it all the comments makes sense, but there is one place I think it does make sense to preserve the name: |
One way to get back compatibility and rolling upgrades could be to make 9.x code be able to read previous formats, but write new format, and make 8.x (since 8.7) read new and old, but write old? Anyone wanting to do a rolling upgrade to 9 would have to be on at least 8.7. Rolling upgrades to 8.7 would still work. |
+1 to @tflobbe We should make the change not break rolling restarts. I like the suggestion |
PLEASE DO NOT MERGE — I've reverted back to the initial approach of using Leader/Follower because of various dependencies. |
Closing this PR in favor of the other: #1711 |
@tflobbe I like the suggestion. Can we save it for the subsequent PR? Will be. Much easier to read a review in that case. I'll set up a standalone 8_x cluster and upgrade some nodes on to 9_x provided rolling upgrades work today. That's the first test. Second test will be to add changes for handling legacy nomenclature to see what happens. |
About non-reproducible test failures. See: http://fucit.org/solr-jenkins-reports/failure-report.html
Hoss rolls up the last 7 day’s failures from various Jenkins jobs. If a non-reproducible failure
shows up on that list as this one does it’s often an indication of a flaky test.
This is actually an embarrassing situation to be in, I have some hope that Mark Miller’s work
will get incorporated sometime and we’ll have many fewer of these since they’re a PITA.
Erick
… On Aug 4, 2020, at 4:35 AM, Marcus ***@***.***> wrote:
@tflobbe I like the suggestion. Can we save it for the subsequent PR?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
Marcus:
Also, that _particular_ test started failing a lot more recently, see: https://issues.apache.org/jira/browse/SOLR-14684
So I wouldn’t worry about it...
… On Aug 3, 2020, at 9:10 PM, Marcus ***@***.***> wrote:
in my most recent test, the previously failing tests all passed. Thanks @ErickErickson.
However, a new set of tests failed. Tests with failures [seed: 358D229C22BF813]:
[junit4] - org.apache.solr.cloud.CloudExitableDirectoryReaderTest.test
[junit4] - org.apache.solr.cloud.CloudExitableDirectoryReaderTest.testWhitebox
[junit4] - org.apache.solr.cloud.CloudExitableDirectoryReaderTest.testCreepThenBite {seed=[358D229C22BF813:D4908E36E9A995B2]}
[junit4] - org.apache.solr.cloud.CloudExitableDirectoryReaderTest.testCreepThenBite {seed=[358D229C22BF813:60C632CADD6B5E9E]}
[junit4] - org.apache.solr.cloud.CloudExitableDirectoryReaderTest.testCreepThenBite {seed=[358D229C22BF813:EE2FA4168CAF1655]}
[junit4] - org.apache.solr.cloud.CloudExitableDirectoryReaderTest.testCreepThenBite {seed=[358D229C22BF813:DFC10FF3E0514D7C]}
[junit4] - org.apache.solr.cloud.CloudExitableDirectoryReaderTest.testCreepThenBite {seed=[358D229C22BF813:93008A5E41598DC7]}
[junit4] - org.apache.solr.schema.TestBulkSchemaConcurrent (suite)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
closing this in favor of the finished product: #1711 |
Description
Please refer to this PR: #1711
Solution
Replaces master with primary and slave with secondary.
Tests
Run standard tests.
Checklist
Please review the following and check all that apply:
master
branch.ant precommit
and the appropriate test suite.