Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can we rename COMPlus_ to DOTNET_ for environment variables? #47283

Closed
stephentoub opened this issue Jan 21, 2021 · 47 comments · Fixed by #50507
Closed

Can we rename COMPlus_ to DOTNET_ for environment variables? #47283

stephentoub opened this issue Jan 21, 2021 · 47 comments · Fixed by #50507
Assignees
Milestone

Comments

@stephentoub
Copy link
Member

We have a lot of configuration possible via environment variables that begin with the prefix "COMPlus", e.g. all of the values in https://github.com/dotnet/runtime/blob/16c48d6a45d4d706c327fc840e1a5d450fb189ae/src/coreclr/inc/clrconfigvalues.h.

This is 20-year old legacy finding its way into user-visible configuration for the platform.

Can we rename this to "DOTNET"?

If there are concerns about tools that have hardcoded "COMPlus", we could support both for a few releases and then delete the "COMPlus" support when the majority of hardcodings have switched over.

@stephentoub stephentoub added this to the 6.0.0 milestone Jan 21, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jan 21, 2021
@dotnet dotnet deleted a comment from dotnet-issue-labeler bot Jan 21, 2021
@huoyaoyuan
Copy link
Member

COMPLUS used in code can also be renamed, for example COMPLUSTHROW. This can have less impact.

@danmoseley
Copy link
Member

danmoseley commented Jan 21, 2021

I think I suggested this a while back as we have some DOTNET_ already eg DOTNET_DiagnosticPorts. Presumably it would be easy to support both for several releases, or even indefinitely for some selected ones that are widely used.

I think it comes down to whether these are temporary things that aren't really documented, where names don't matter much, or they are supported dials and it seems increasingly some of them are de-facto supported and it's confusing.

@gfoidl
Copy link
Member

gfoidl commented Jan 21, 2021

In case of the rename it should be double-checked if / how this interferes with Generic Host configuration. Just to be on the safe side.

@danmoseley
Copy link
Member

I think it comes down to whether these are temporary things that aren't really documented, where names don't matter much, or they are supported dials and it seems increasingly some of them are de-facto supported and it's confusing.

It also occurs to me, is it clear enough which are supported and which are not? Is it simply - which ones are documented? Eg., COMPlus_GCRetainVM is documented even though in the code it is UNSUPPORTED_GCRetainVM

@vanbukin
Copy link
Contributor

Maybe CLR_ instead of DOTNET_?

@AraHaan
Copy link
Member

AraHaan commented Jan 22, 2021

How about create dummy ones that loop back to the renamed ones but mark them as obsolete to warn the users against using them (and have it state the renamed copy)?

@mangod9 mangod9 removed the untriaged New issue has not been triaged by the area owner label Jan 24, 2021
@AaronRobinsonMSFT
Copy link
Member

AaronRobinsonMSFT commented Mar 23, 2021

Seems like something that might make the most for an LTS release - so I like this is slated for .NET 6. If this is a priority we should get this in a preview ASAP.

The LTS vs non-LTS fact shouldn't be considered here.

@danmoseley
Copy link
Member

Maybe CLR_ instead of DOTNET_?

there may be cases where both CoreCLR and Mono have the same flag, particularly as we share more code. I would expect DOTNET_ may be a better prefix in that case.

@stephentoub
Copy link
Member Author

@jkotas, this primarily impacts usage in the runtime. How do you feel about it? Great idea and we should make it happen for .NET 6, terrible idea that's not worth pursuing any further, something in the middle?

@jkotas
Copy link
Member

jkotas commented Mar 24, 2021

It sounds good to me, assuming that we also look for COMPlus for the time being. @AaronRobinsonMSFT Do you plan to work on this?

Also, there is still the code that reads the config from the .NET Framework registry hive. We should get rid of that while we are on it, without replacement. I am not aware of anybody using it for .NET Core config, and we have never documented it as such.

@AaronRobinsonMSFT
Copy link
Member

@AaronRobinsonMSFT Do you plan to work on this?

Yep. I am going to add this to my list. I will put out a proposal for how to do this in a few steps. We have a decent number of partners that will need to be informed.

Also, there is still the code that reads the config from the .NET Framework registry hive.

Agree.

@AaronRobinsonMSFT
Copy link
Member

@jkotas and @stephentoub Proposal for this work is here.

@danmoseley
Copy link
Member

If an environment variable with COMPlus_ is detected a message to stdout and debugger output is written indicating the environment variable name should be updated to DOTNET_.

This would be problematic if a customer is using the same scripts or environment or container for both .NET Framework (or pre NET 6) workloads. I suggest this logging not happen if there are matching old and new variables set and they have the same value.

@stephentoub
Copy link
Member Author

stephentoub commented Mar 24, 2021

Proposal for this work is here.

Thanks!

Changing this would take place over the next 3 .NET releases.

I'm ok with that if it's deemed really necessary, but personally I think this is overly conservative. I was hoping instead it could be two releases: .NET 6 respects both, .NET 7 just respects DOTNET_. Supporting both in a long-term servicing release gives folks ample time to update without being forced to accept a breaking change, which should only impact a relatively very small, advanced, hopefully-well-informed set of users.

I suggest this logging not happen if there are matching old and new variables set and they have the same value.

We should only fall through to look for COMPlus_XYZ at all if we looked for DOTNET_XYZ and didn't find it.

If an environment variable with COMPlus_ is detected a message to stdout and debugger output is written indicating the environment variable name should be updated to DOTNET_.

I imagine writing to stdout could be an even larger breaking change ;-)

The following is a list of potential stakeholders for this work

You might want to sync with @marklio. We should be able to figure out what, if any, nuget packages might be setting some of these so we can follow-up with those owners more proactively. I expect the number to be super low, since these mostly affect new processes started, so it would end up being something probably with Process.Start.

Part of VS diagnostics you already called out, but just highlighting that your old stomping ground in the VS .NET allocation profiler exposes an option on by default for disabling R2R. It should likely be updated to just set both for the foreseeable future; I expect most tools that aren't tied to a particular .NET release will need to do the same.

@jkotas
Copy link
Member

jkotas commented Mar 24, 2021

.NET development teams

+CodeGen team. The env variables are heavily used for testing the JIT itself, and also recommended way to exercise hardware-intrinsics specific code-paths for library developers.

It is likely that every more substantial .NET project out there is going to have COMPlus_ in some corner.

a message to stdout and debugger output is written indicating the environment variable name should be updated to DOTNET_

I agree with Dan that introducing any logging for this sounds problematic. I think we should just look for DOTNET_ first and COMPlus second, and do not ever log anything.

Changing this would take place over the next 3 .NET releases.

The extra code to check for COMPlus in addition to DOTNET is <10 lines. It does not seem like a energy well spent to chase down all places that use COMPlus to be able to delete these 10 lines. I would actually not mind if keep the COMPlus check forever as legacy quirk and spend time on things with better ROI instead.

@stephentoub
Copy link
Member Author

It does not seem like a energy well spent to chase down all places that use COMPlus to be able to delete these 10 lines. I would actually not mind if keep the COMPlus check forever as legacy quirk and spend time on things with better ROI instead.

I think there's value in finding and fixing most occurrences. I don't want us to be in a https://imgs.xkcd.com/comics/standards.png position where two different variants of these are sprinkled everywhere and folks don't know why you'd use which when.

That said, if there's minimal overhead impact from doubling the number of env var checks we make in the common case (since most of the env vars will never be set), and if we can stomp out the vast majority of the uses of the old ones folks will ever see (in dotnet/runtime, in docs, etc.), I'd be ok for the foreseeable future having COMPlus always there as a fallback stopgap.

any logging for this sounds problematic.

Yes, at least any logging to something like stdout that is frequently programmatically consumed, or even visible to an end user. I wouldn't have qualms with logging to debugger output if it was deemed important, but I don't think it's that important.

@AraHaan
Copy link
Member

AraHaan commented Mar 24, 2021

It's only important when running in debug but that would inform developers of the change so then they would set the newer one.

@jkotas
Copy link
Member

jkotas commented Mar 24, 2021

I think there's value in finding and fixing most occurrences.

I agree. I think it is very similar to APIs. We have number of places where we have introduced new APIs that are replacements for existing APIs. We have analyzers to help you to move to the new APIs, etc. But we are very reluctant to remove any APIs since chasing down and fixing all places is not worth the energy spent.

if there's minimal overhead impact from doubling the number of env var checks

Yes, there is very minimal overhead from checking for both. The runtime indexes all env variables during startup using a bloom filter.

@AaronRobinsonMSFT
Copy link
Member

I suggest this logging not happen if there are matching old and new variables set and they have the same value.
I imagine writing to stdout could be an even larger breaking change ;-)
I agree with Dan that introducing any logging for this sounds problematic. I think we should just look for DOTNET_ first and COMPlus second, and do not ever log anything.

@danmoseley @stephentoub @jkotas I hear all of you, but don't agree. Pushing users to the correct approach is important here and I don't see the cost being too high for them which is why it was stated for stdout. The log is designed to make something painful enough to change but not so painful that people gripe - clearly that will never happen but in this case I don't see the struggle.

Basically we are saying that users can install .NET 6, set an environment variable with the COMPlus_ prefix, but setting one more environment variable is too onerous. The previous COMPlus_ variable can remain forever with the design but needs to be paired with a DOTNET_ variable for .NET 6. Post-.NET 6 we ignore if COMPlus_ is set and DOTNET_ isn't. So can someone help me understand how users that install .NET 6 and set COMPlus_ variables are in a position where seeing a message like the following and responding is too much of a burden.

Using COMPlus_ environment variables to configure the runtime is deprecated. Please prefix with DOTNET_ instead.

+CodeGen team. The env variables are heavily used for testing the JIT itself, and also recommended way to exercise hardware-intrinsics specific code-paths for library developers.
You might want to sync with @marklio. We should be able to figure out what, if any, nuget packages might be setting some of these so we can follow-up with those owners more proactively.
It should likely be updated to just set both for the foreseeable future;

Updated.

I'd be ok for the foreseeable future having COMPlus always there as a fallback stopgap.

That is reasonable and I've no qualms with that.

@stephentoub
Copy link
Member Author

stephentoub commented Mar 24, 2021

So can someone help me understand how users that install .NET 6 and set COMPlus_ variables are in a position where seeing a message like the following and responding is too much of a burden.

e.g. When the user doesn't own the code / binary that's setting the environment variables. Imagine for example app A launches app B with stdout redirected, parsing the response, and it launches app B with the COMPlus environment variable. That app breaks, the user has no idea why.

@jkotas
Copy link
Member

jkotas commented Mar 24, 2021

So can someone help me understand how users that install .NET 6 and set COMPlus_ variables are in a position where seeing a message like the following and responding is too much of a burden.

Or when you are a first party customer (it is fairly common to first parties to use COMPlus settings), running on .NET 5 by default, and want to flight .NET 6 (something we are encouraging first parties to do). Flighting .NET 6 is not just about flipping the runtime anymore, you have to also figure out how to change your deployment scripts to set COMPlus vs. DOTNET, or deal with the message appearing in all logs.

@tannergooding
Copy link
Member

In particular COMPlus_Enable*=0 are "public" and recommended to users for testing HWIntrinsic and SIMD (that is System.Numerics.Vector as well) configurations (e.g. disabling AVX so you can test the SSE4.1 or software fallback path).

@stephentoub
Copy link
Member Author

Does this alleviate any concerns? I will update the proposal if this makes more sense. In fact I think that sentence is enough of a proposal :)

LGTM

@marklio
Copy link

marklio commented Mar 24, 2021

Does this alleviate any concerns?

Certainly, removing the breaking aspect of the change alleviates my concerns for this release. Is the proposal that we will continue to support COMPlus_ in future releases? Or are will still thinking of pulling support at some point?

@AaronRobinsonMSFT
Copy link
Member

Is the proposal that we will continue to support COMPlus_ in future releases?

From the code I've looked at the cost to search for DOTNET_ and then COMPlus_ doesn't seem all that bad. @jkotas seemed to confirm that in #47283 (comment). I don't think we would be in a particularly bad position if we kept respecting it, other than being a bit too generous in my opinion.

Or are will still thinking of pulling support at some point?

Personally, I think we should pull it at some point - .NET 8 or .NET 9? I will say it would be a low priority thing given the cost to search for both so perpetually below the cut line for the core runtime team. That being said, do you have a suggestion on how we could make eventual removal easy when the resourcing for the effort is found?

@karpinsn
Copy link

From a tooling side I would be pretty against logging an error message. Currently in the Visual Studio profiler we set COMPLUS* variables for our .NET Allocation tool. To support new and old runtimes we would probably just set both since that is the easiest thing to do and doesn't require a runtime version check which can be error prone for us. Positing an error message is pretty much guaranteed to confuse some folks as to why they get an error message with the tool that they don't without.

@AaronRobinsonMSFT
Copy link
Member

AaronRobinsonMSFT commented Mar 24, 2021

From a tooling side I would be pretty against logging an error message.

I am really confused here. This would literally never happen if a DOTNET_ was supplied. I agree this could impact previous version of VS and that is fair, but proactively duplicating all COMPlus_ variables to have a DOTNET_ would avoid this issue. There would be no impact to any Microsoft tooling because it could all be handled well before GA. I genuinely think this message concern is overblown, but at this point it seems we've accepted using COMPlus_ into the foreseeable future so is largely moot.

@karpinsn
Copy link

Oh sorry, looks like I miss understood. After reading:

If an environment variable with COMPlus_ is detected a message to stdout and debugger output is written indicating the environment variable name should be updated to DOTNET_.

This would be problematic if a customer is using the same scripts or environment or container for both .NET Framework (or pre NET 6) workloads. I suggest this logging not happen if there are matching old and new variables set and they have the same value.

I was under the impression that the presence of a COMPlus variable would generate the stdout. If we set them to the same value or the presence of a DOTNET variable suppresses the error we are good.

@marklio
Copy link

marklio commented Mar 24, 2021

That being said, do you have a suggestion on how we could make eventual removal easy when the resourcing for the effort is found?

If we felt the value in getting people off of COMPlus_ was high and we needed a successful migration, I'd suggest our approach should look like:

  • Add standard information about an upcoming breaking change to BCNs, document, evangelize, etc.
  • Actively search out and fix as much as we have visibility into
  • Add auditing/error/compat modes that would allow production environments to be audited for use, and test environments could turn on runtime startup errors if they wanted to be strict, and compat mode would start as the default

I still think we'd need to have "compat" mode in perpetuity.

Any other tooling that's not runtime is unlikely to be successful due to the "ambient" nature of environment variables. It's quite likely that infrastructure that manages environment variables is never presented to the SDK in any meaningful way (ex. in a management portal somewhere)

Due to the mixed Framework/Core environments that are likely to exist for a long time, I think defaulting to error mode would likely not be achievable, so the "most complete" outcome would be to start ignoring COMPlus_ at some point.

@AaronRobinsonMSFT
Copy link
Member

so the "most complete" outcome would be to start ignoring COMPlus_ at some point.

Yep. I think this makes sense for a .NET 8 or beyond change.

@danmoseley
Copy link
Member

Not directly related, but I am interested in the thoughts about the question I asked above about which variables are supported.

It also occurs to me, is it clear enough which are supported and which are not? Is it simply - which ones are documented? Eg., COMPlus_GCRetainVM is documented even though in the code it is UNSUPPORTED_GCRetainVM

We have a situation where some are fully supported and some are not supported at all and/or don't work, and it's not clear (even in the code, apparently) which is which. Maybe there are three buckets: (1) fully supported, expected to continue to work, documented in some form, can be used in production (2) not supported but widely used eg for diagnostic purposes and we would likely fix if they broke (3) not supported/may not even work.

@jkotas
Copy link
Member

jkotas commented Mar 24, 2021

I am interested in the thoughts about the question I asked above about which variables are supported.

I think it should be: If the setting is documented on http://docs.microsoft.com/, it is supported. Otherwise, it is unsupported.

@danmoseley
Copy link
Member

That needs auditing, I think. Eg. I picked COMPlus_EnableHWIntrinsic and COMPlus_EnableAVX at random and bing says they're not in docs.microsoft.com.

@vargaz
Copy link
Contributor

vargaz commented Mar 26, 2021

On unix platforms at least, env variables are traditionally uppercase, so it might be useful to allow uppercase variants as well.

@vitek-karas
Copy link
Member

We'll also have to be careful to not overload the same variable. The DOTNET_ prefix is already used by the host/CLI (DOTNET_ROOT, DOTNET_ROLLFOWARD, and so on). I think there's also a variable called DOTNET_HOST use by vstest and related tools.

And then there's the already mentioned generic host by @gfoidl. This one is particularly nasty - it takes all environment variables prefixed with DOTNET_ and injects them into the configuration. It's hard to say how/when/where people use this.

@BruceForstall
Copy link
Member

If we don't (ever) remove the COMPlus_ variables, is there still value to add additional DOTNET_ "mirrors" of the variables? They may look odd, but is it a real problem?

What if we only checked for both prefixes for non-DEBUG (i.e., Release build available) variables, and forced all DEBUG variable usage to change to the new form? As a transition concern, I'd be really worried we stop looking for COMPlus_ variables, forget to update some test or testing system, and end up not testing some stress mode that we were previously testing.

I'm slightly worried about the overhead of checking for both. The JIT currently eagerly checks for all its variables at startup.

fwiw, the JIT uses the same config mechanism, but its variables are in https://github.com/dotnet/runtime/blob/main/src/coreclr/jit/jitconfigvalues.h, not in clrconfigvalues.h.

@danmoseley
Copy link
Member

They may look odd, but is it a real problem?

For those not aware, it is not obvious that COMPlus_ has anything to do with .NET. If anything it's a Windowsism.

@danmoseley
Copy link
Member

As a smaller issue, docs sometimes assume that it's case insensitive (as it is on Windows). Eg., in this topic it lists variables like COMPLUS_GCHeapHardLimitSOHPercent which as far as I know won't work on Linux. It may be easier to be consistent with DOTNET.

@marklio
Copy link

marklio commented Mar 26, 2021

For those not aware, it is not obvious that COMPlus_ has anything to do with .NET.

Of course, this has been the case for over 2 decades, and I'm not aware that this has ever caused any real problem for .NET customers. At this point, COMPlus_ has more to do with .NET than anything else. ;)

@jkotas
Copy link
Member

jkotas commented Mar 26, 2021

Right, COMPlus_ is not a real problem for customers. It is a problem for platform image.

We have been fixing non-sensical (Windows-specific) names in other places too, for example #46843 (comment) .

@danmoseley
Copy link
Member

@marklio right, but for someone new to the platform, it cannot be inferred..

@danmoseley
Copy link
Member

And I hope to reduce our use of Win32Exception

@AaronRobinsonMSFT AaronRobinsonMSFT self-assigned this Mar 30, 2021
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Mar 31, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Apr 1, 2021
@ghost ghost locked as resolved and limited conversation to collaborators May 1, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.