Upgraded Nuget now web app hangs #253

Closed
ethos-tim opened this issue May 6, 2024 · 14 comments
Labels
bug, module/logging, p1, queued

Comments

@ethos-tim

Describe the bug

Everything was working for our developers and in our QA site. After we upgraded the NuGet packages, the app hung at startup during ASP.NET initialization on any developer machine that did not have the AWS CLI configured. We worked around that by configuring the AWS CLI for those users. We then deployed the upgrade to QA, where everything had been working and logging perfectly before the package upgrade. The app now hangs and never starts. This all worked fine with the previous versions. A dump of w3wp.exe shows it hung at:

SYMBOL_NAME:  w3wphost!AppHostInitialize+14c
MODULE_NAME: w3wphost
IMAGE_NAME:  w3wphost.dll
FAILURE_BUCKET_ID:  BREAKPOINT_80000003_w3wphost.dll!AppHostInitialize

Old Packages:

<package id="AWS.Logger.Core" version="3.0.0" targetFramework="net472" />
<package id="AWS.Logger.Log4net" version="3.2.1" targetFramework="net472" />
<package id="AWSSDK.CloudFront" version="3.7.0" targetFramework="net472" />
<package id="AWSSDK.CloudWatchLogs" version="3.7.0.5" targetFramework="net472" />
<package id="AWSSDK.Core" version="3.7.0.6" targetFramework="net472" />

New Packages:

<package id="AWS.Logger.Core" version="3.3.1" targetFramework="net48" />
<package id="AWS.Logger.Log4net" version="3.5.1" targetFramework="net48" />
<package id="AWSSDK.CloudFront" version="3.7.302.10" targetFramework="net48" />
<package id="AWSSDK.CloudWatchLogs" version="3.7.305.25" targetFramework="net48" />
<package id="AWSSDK.Core" version="3.7.303.24" targetFramework="net48" />

Is there a new config setting we need to set?

    <appender name="AWS" type="AWS.Logger.Log4net.AWSAppender,AWS.Logger.Log4net">
      <filter type="log4net.Filter.LevelMatchFilter">
        <levelToMatch value="CHUNKINFO" />
      </filter>
      <filter type="log4net.Filter.DenyAllFilter" />
      <LogGroup>WebAppLogger-dev</LogGroup>
      <Region>us-east-1</Region>
      <layout type="log4net.Layout.PatternLayout">
        <conversionPattern value="%-4timestamp [%thread] %-5level %logger %ndc - %message%newline" />
      </layout>
    </appender>

Expected Behavior

The app starts. If the logger cannot start, it fails quietly and simply does NOT log. It should work the way it used to with the exact same settings and configuration.

Current Behavior

It hangs the entire web application and prevents it from starting.

Reproduction Steps

Update the NuGet packages from the previous versions in an ASP.NET web application and watch it all break.

Possible Solution

No response

Additional Information/Context

No response

AWS .NET SDK and/or Package version used

<package id="AWS.Logger.Core" version="3.3.1" targetFramework="net48" />
<package id="AWS.Logger.Log4net" version="3.5.1" targetFramework="net48" />
<package id="AWSSDK.CloudFront" version="3.7.302.10" targetFramework="net48" />
<package id="AWSSDK.CloudWatchLogs" version="3.7.305.25" targetFramework="net48" />
<package id="AWSSDK.Core" version="3.7.303.24" targetFramework="net48" />

Targeted .NET Platform

.NET Framework 4.8

Operating System and version

Windows 10, 11 and Server 2016

ethos-tim added the bug and needs-triage labels on May 6, 2024
@ethos-tim
Author

ethos-tim commented May 6, 2024

If we remove the AWS Log4net section from the web.config, everything works as expected, except that the logs are no longer shipped to AWS CloudWatch.

@ashishdhingra
Contributor

@ethos-tim Good morning. Thanks for reporting the issue. Could you please share the below:

  • What is the execution environment for your ASP.NET web application? Is it local IIS or an EC2 instance?
  • How are AWS credentials configured for your application? The credentials are needed to push logs to CloudWatch (see Credential and profile resolution).
  • Do the configured AWS credentials have the required IAM permissions (see Required IAM Permissions)?
  • Does the web application eventually time out (after the initial hang)?
  • Could you please share a minimal reproducible code sample to troubleshoot the issue?
  • Additionally, you may enable verbose logging to emit detailed logs (see Configuring Other Application Parameters).

Thanks,
Ashish

ashishdhingra added the response-requested label and removed the needs-triage label on May 7, 2024
ashishdhingra self-assigned this on May 7, 2024
@ers-jporche

ers-jporche commented May 7, 2024

Potential duplicate of aws/aws-sdk-net#1968?

@ashishdhingra - to answer some of your questions that I'm aware of, the issue occurs in an EC2 instance, which is managed by us, and our application runs within IIS on an application pool which does NOT load the user profile.

The EC2 instance has attached to it an EC2 IAM role which affords the permissions required by the logger (we use the same role to perform other things like S3 operations, etc. so the SDK should be able to access the creds through metadata).

I have validated that the link-local metadata address resolves the correct credentials at: http://169.254.169.254/latest/meta-data/iam/security-credentials/our-ec2-iam-role

When the issue occurs, the application doesn't send anything to the front end. The initial GET request to the application remains pending indefinitely. On the server, the web.config remains locked by the w3wp.exe process and cannot be modified.

Thank you for the resource to enable verbose logging, I would like to configure that to determine what's actually going wrong.
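For anyone following along, a minimal sketch of what verbose SDK logging could look like in web.config, assuming the SDK's `aws` configuration section is what the linked page describes. The listener name `sdkTrace` and the log file path are hypothetical placeholders, not values from this thread.

```xml
<configuration>
  <configSections>
    <!-- Registers the AWS SDK for .NET configuration section -->
    <section name="aws" type="Amazon.AWSSection, AWSSDK.Core" />
  </configSections>

  <aws>
    <!-- Send SDK diagnostics to System.Diagnostics instead of log4net -->
    <logging logTo="SystemDiagnostics" logResponses="Always" logMetrics="true" />
  </aws>

  <system.diagnostics>
    <trace autoflush="true" />
    <sources>
      <source name="Amazon" switchValue="Verbose">
        <listeners>
          <!-- Hypothetical listener name and path; the app pool identity needs write access -->
          <add name="sdkTrace"
               type="System.Diagnostics.TextWriterTraceListener"
               initializeData="C:\logs\aws-sdk-trace.log" />
        </listeners>
      </source>
    </sources>
  </system.diagnostics>
</configuration>
```

Routing to System.Diagnostics rather than log4net is a choice made for this sketch, so that the SDK's own diagnostics do not pass through the appender being investigated.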

@ethos-tim
Author

Note: this was all working and logging fine before the nuget update.

@ethos-tim
Author

It also happens in local development with IIS Express. It worked before the NuGet update without any AWS setup or credentials on the developer's machine. We updated the NuGet packages and it all locks up. From my understanding, log4net is supposed to fail gracefully and not BLOCK if something is configured wrong.

@ethos-tim
Author

All the proper permissions were configured and working with the previous version of the nuget. I verified them using the simulator in AWS yesterday. My guess is that something changed around how it located the credentials?

@ethos-tim
Author

For a sample app you can see this: aws/aws-sdk-net#1968 (comment)

@ers-jporche

ers-jporche commented May 7, 2024

Update:

I tested in a SEPARATE environment which did NOT have the required IAM permissions. After adding LibraryLogFileName to the appender, I observed the expected log generated by the library, stating that I was missing the required permissions.

I then granted the correct IAM permissions and restarted the app and its pool (this requires killing the app's process directly after stopping the pool). The application still locks up, but now no error is logged locally.

So, I can't determine exactly what is going wrong, because neither the SDK nor the .NET logger is logging anything about its own failure (obviously no events are logged to the log stream either).
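The LibraryLogFileName change described in this comment might look roughly like the following, based on the appender posted at the top of the issue. The file path is a hypothetical example, and the filters from the original appender are omitted for brevity.

```xml
<appender name="AWS" type="AWS.Logger.Log4net.AWSAppender,AWS.Logger.Log4net">
  <LogGroup>WebAppLogger-dev</LogGroup>
  <Region>us-east-1</Region>
  <!-- Hypothetical path: file where the logging library writes its own errors -->
  <LibraryLogFileName>C:\logs\aws-logger-errors.txt</LibraryLogFileName>
  <layout type="log4net.Layout.PatternLayout">
    <conversionPattern value="%-4timestamp [%thread] %-5level %logger %ndc - %message%newline" />
  </layout>
</appender>
```

As the comment notes, this file only helps when the library gets far enough to report an error. It stayed empty once the hang occurred.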

@ashishdhingra
Contributor

@ers-jporche Thanks for your analysis. We would need to investigate what is going wrong. I will discuss this with the team.

github-actions bot removed the response-requested label on May 8, 2024
@ethos-tim
Author

I did some more troubleshooting with the SDK source and the Logger source. It looks like the issue is in the SDK. See here: aws/aws-sdk-net#1968 (comment)

This is where it locks up in the Logger but that only appears to be due to the SDK hanging:
image

\aws-logging-dotnet\src\AWS.Logger.Core\Core\AWSLoggerCore.cs Line 101

@ethos-tim
Author

I think the issue is with the SDK, and this ticket can be merged into the SDK ticket. See the latest there: aws/aws-sdk-net#1968 (comment)

ashishdhingra added the needs-review, p1, and queued labels and removed the needs-review label on May 10, 2024
ashishdhingra removed their assignment on May 10, 2024
@ashishdhingra
Contributor

I discussed this issue and the related one, aws/aws-sdk-net#1968, with the team. The investigation will be done as part of aws/aws-sdk-net#1968.

@ethos-tim Thanks for reporting your findings on aws/aws-sdk-net#1968.

@ashovlin
Member

We've released AWS.Logger.Core v3.3.3 today, which now lazily initializes the internal CloudWatch Logs client. This should avoid the race condition with the SDK's own logging that was leading to the deadlock.

You can either upgrade your pinned version of AWS.Logger.Core, or upgrade to AWS.Logger.Log4net v3.5.3 to pull in the latest core.

Let us know if that doesn't mitigate the issue for you, thanks.
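For a packages.config project like the one in the original report, the upgrade might look like the sketch below. The `net48` target framework is carried over from the report, and the other AWSSDK packages are left out because this thread does not say whether they also need to change.

```xml
<!-- packages.config entries after the upgrade; versions taken from the release note above -->
<package id="AWS.Logger.Core" version="3.3.3" targetFramework="net48" />
<package id="AWS.Logger.Log4net" version="3.5.3" targetFramework="net48" />
```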

