"Input byte array has wrong 4-byte ending unit" error causing CPU overload #1681

Closed
mmoreno509 opened this issue Apr 12, 2022 · 8 comments

mmoreno509 commented Apr 12, 2022

"Input byte array has wrong 4-byte ending unit" error causing CPU overload

Description

A service provider we were integrating with our FusionAuth service sent a malformed SAMLRequest, after which the CPU load on our FusionAuth server spiked.

Affects versions

1.30.2

Steps to reproduce

Create a SAML application with the following configuration:
image

Send this request (replace host and application-id):

https://{host}/samlv2/login/{application-id}?SAMLRequest=jZFBT4QwEIX%2FCum9lE6hSAMkm%2BxlE72o8eDFdEsREmiRKcafbxc9uBezhznMTL4372Vq1PMEizpsYXCP9mOzGJKveXKofjYN2VanvMYRldOzRRWMejo83CtIM7WsPnjjJ%2FKX%2BR%2FRiHYNo3ckOR0b8lZCB7LQhoI8C5qbvqMVz4CKXuizLCBWR5IXu2JkGhIlIoi42ZPDoF2IowyAZjnl8MxzVZQqh7S4K15JcoxpRqfDTg4hLKgY0zFquhkarBmcn%2Fz7aDE1fmYX85%2FALhPHRC8zKWxOhSijNLeGapCWZppL2XNRSVmRtt4Dq93P2t564Rbtml1J%2F7bXj2q%2FAQAAAA%3D%3D&RelayState=enc_c2p5VE5MQ3lBMlllVlV4cnIxV21rSGg4d3h0ZkZEVjUrZ096cXRRbkdyUUxzT3BobnZIbkdXdlhJQ3hIK2VEVCs3VklvZ3I2cGNyV1YvSmlnbmRpcWlVTWpYSHhRK0RVa1hTQXdlZGtlMTVmYnZIREdkV1QvUkJobXhiVGZDV3crVE1zdVdqb21aUVlWcUxZbEpOMURSWTdjdHdhM1ZRUjlUUm9jOWt6eEZYSHA1SHducTlRdXcrUHFFZDBjNGdrZDBlc0dwd1M5dzhxdEFYenNvTzN0dWJnL1IvRFdscHFIWjQ1cnc3bU9aYmlZcmxDNDVIRUo2OXh0czNNNVMweHJva0pXUmhKZXpnTHJMSVBSbUlpS2t4cnFqOTFWWjRHM2lwbitjS3JlNDgzNGszNzJ5bjM0NitDa2FFemwySm1DaysvWjZETloxdXJWZEJXQ2xnTFVnPT0=
Then check the FusionAuth server load.

Expected behavior

A response to the service provider saying the samlRequest was invalid or malformed, and the CPU should not overload after this request.
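
A minimal sketch of the fail-fast behavior described above, assuming the `SAMLRequest` parameter has already been URL-decoded. The names `SamlRequestGuard` and `tryDecode` are hypothetical and are not part of FusionAuth. The sketch relies only on `java.util.Base64`, whose basic decoder throws `IllegalArgumentException` on malformed padding, so a caller can return an "invalid request" response right away instead of continuing with bad input.

```java
import java.util.Base64;
import java.util.Optional;

class SamlRequestGuard {
    // Hypothetical helper: returns the decoded bytes, or empty when the
    // Base64 text is missing or malformed (bad padding, illegal characters).
    public static Optional<byte[]> tryDecode(String base64) {
        if (base64 == null || base64.isEmpty()) {
            return Optional.empty();
        }
        try {
            return Optional.of(Base64.getDecoder().decode(base64));
        } catch (IllegalArgumentException e) {
            // Malformed input: give up immediately so the caller can
            // answer the service provider with an error response.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryDecode("QUJD").isPresent()); // well-formed input
        System.out.println(tryDecode("QQ=").isPresent());  // final '=' is missing
    }
}
```

Returning an empty `Optional` keeps the error path cheap; a real handler would presumably also log the failure and map it to a SAML error response.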

Platform


  • Device: AWS EC2 instance
  • OS: Ubuntu Linux 21
  • Browser: Chrome 100.0.4896.75

Additional context

The server is still available, and other service providers can still authenticate through FusionAuth, but the load increase after this failed request is abnormal.

Screenshots

image

Here is the debug log for the given request:

FusionAuth encountered an exception while processing the SAML v2 AuthnRequest.
The request originated from: null

Exception:
io.fusionauth.samlv2.domain.SAMLException: Invalid AuthnRequest. Inflating the bytes failed.
	at io.fusionauth.samlv2.util.SAMLTools.decodeAndInflate(SAMLTools.java:159)
	at io.fusionauth.samlv2.service.DefaultSAMLv2Service.parseRequestRedirectBinding(DefaultSAMLv2Service.java:577)
	at io.fusionauth.api.service.samlv2.DefaultSAMLv2ProviderService.parseAuthNRedirectRequest(DefaultSAMLv2ProviderService.java:298)
	at io.fusionauth.app.action.samlv2.LoginAction.lambda$get$0(LoginAction.java:73)
	at io.fusionauth.app.action.samlv2.BaseSAMLAction.handleSAMLException(BaseSAMLAction.java:114)
	at io.fusionauth.app.action.samlv2.LoginAction.get(LoginAction.java:70)
	at jdk.internal.reflect.GeneratedMethodAccessor597.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
	at org.primeframework.mvc.util.ReflectionUtils.invoke(ReflectionUtils.java:414)
	at org.primeframework.mvc.action.DefaultActionInvocationWorkflow.execute(DefaultActionInvocationWorkflow.java:79)
	at org.primeframework.mvc.action.DefaultActionInvocationWorkflow.perform(DefaultActionInvocationWorkflow.java:62)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at org.primeframework.mvc.validation.DefaultValidationWorkflow.perform(DefaultValidationWorkflow.java:47)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at org.primeframework.mvc.security.DefaultSecurityWorkflow.perform(DefaultSecurityWorkflow.java:60)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at org.primeframework.mvc.parameter.DefaultPostParameterWorkflow.perform(DefaultPostParameterWorkflow.java:50)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at org.primeframework.mvc.content.DefaultContentWorkflow.perform(DefaultContentWorkflow.java:52)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at org.primeframework.mvc.parameter.DefaultParameterWorkflow.perform(DefaultParameterWorkflow.java:57)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at org.primeframework.mvc.parameter.DefaultURIParameterWorkflow.perform(DefaultURIParameterWorkflow.java:102)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at org.primeframework.mvc.scope.DefaultScopeRetrievalWorkflow.perform(DefaultScopeRetrievalWorkflow.java:58)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at org.primeframework.mvc.message.DefaultMessageWorkflow.perform(DefaultMessageWorkflow.java:44)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at org.primeframework.mvc.action.DefaultActionMappingWorkflow.perform(DefaultActionMappingWorkflow.java:126)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at org.primeframework.mvc.workflow.StaticResourceWorkflow.perform(StaticResourceWorkflow.java:97)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at org.primeframework.mvc.parameter.RequestBodyWorkflow.perform(RequestBodyWorkflow.java:91)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at org.primeframework.mvc.security.DefaultSavedRequestWorkflow.perform(DefaultSavedRequestWorkflow.java:64)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at io.fusionauth.app.primeframework.CORSFilter.doFilter(CORSFilter.java:262)
	at io.fusionauth.app.primeframework.CORSRequestWorkflow.perform(CORSRequestWorkflow.java:49)
	at org.primeframework.mvc.workflow.SubWorkflowChain.continueWorkflow(SubWorkflowChain.java:51)
	at io.fusionauth.app.primeframework.FusionAuthMVCWorkflow.perform(FusionAuthMVCWorkflow.java:86)
	at org.primeframework.mvc.workflow.DefaultWorkflowChain.continueWorkflow(DefaultWorkflowChain.java:44)
	at org.primeframework.mvc.servlet.FilterWorkflowChain.continueWorkflow(FilterWorkflowChain.java:50)
	at org.primeframework.mvc.servlet.PrimeFilter.doFilter(PrimeFilter.java:78)
	at com.inversoft.maintenance.servlet.MaintenanceModePrimeFilter.doFilter(MaintenanceModePrimeFilter.java:63)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at com.inversoft.servlet.UTF8Filter.doFilter(UTF8Filter.java:27)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:196)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:542)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:135)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:364)
	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:624)
	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:831)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1650)
	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
	at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
	at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
	at java.base/java.lang.Thread.run(Thread.java:832)

The SAMLRequest above Base64-decodes to the following bytes (shown before inflating, which is the step that fails):

b"ìæAOä0\x10à \nÚ¢öNíH\x03$øýe\x13¢¿±Ó┼tK\x11\x12hæ)ãƒo\x17=©\x17│ç9╠L¥7´ejȾ\x04ï:lapÅ÷c│\x18Æ»yr¿~6\r┘Vº╝ã\x11òË│E\x15îz:<▄+H3Á¼>xÒ'‗ù¨\x1FÐêv\rúw$9\x1D\e‗
VB\x07▓ðåé<\vÜø¥ú\x15¤Çè^Þ│, VGÆ\x17╗bd\x1A\x12%"ê©┘ô├á]êú\fÇf9Õ­╠sUö*ç┤©+^IrîiFº├N\x0E!,
¿\x18Ë1j║\x19\x1A¼\x19£ƒ³¹h15~f\x17¾ƒ└.\x13ÃD/3)lNà(ú4ÀåjÉûfÜK┘sQIYæÂÌ\x03½¦¤┌Ìzß\x16ÝÜ]I ÂÎÅj┐\x01\0\0\0

All signs point to a service provider issue that they are correcting (including these '\x01\0\0\0' characters), but I still don't think the load on our FusionAuth server should increase so dramatically.
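
For reference, a rough sketch of the decode path for the HTTP-Redirect binding that the stack trace passes through: Base64-decode, then inflate a raw DEFLATE stream. This is not FusionAuth's `SAMLTools` code. The names `SamlInflateSketch`, `decodeAndInflate`, `deflateAndEncode`, and the 1 MiB cap are assumptions for illustration, and the input is assumed to be URL-decoded already. It illustrates one well-known way an inflate loop can spin: `Inflater.inflate` returns 0 when the input is truncated, so a loop that only tests `finished()` never exits. The sketch stops as soon as no progress is made.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class SamlInflateSketch {
    static final int MAX_OUTPUT = 1 << 20; // assumed cap on the inflated size

    public static String decodeAndInflate(String base64) throws DataFormatException {
        // Throws IllegalArgumentException if the Base64 text is malformed.
        byte[] compressed = Base64.getDecoder().decode(base64);
        Inflater inflater = new Inflater(true); // true = raw DEFLATE, no zlib header
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && !inflater.finished()) {
                    // No output and not finished: the stream is truncated or
                    // needs a dictionary. Stop instead of looping forever.
                    throw new DataFormatException("Truncated or invalid DEFLATE stream");
                }
                out.write(buffer, 0, n);
                if (out.size() > MAX_OUTPUT) {
                    throw new DataFormatException("Inflated request exceeds size cap");
                }
            }
        } finally {
            inflater.end(); // release native resources
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Demo helper only: raw-deflate a string and Base64-encode the result.
    public static String deflateAndEncode(String xml) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        deflater.setInput(xml.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }
}
```

With this shape, a truncated request surfaces as a `DataFormatException` that can be turned into an error response, and the worker thread is released either way.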

Release Notes

Fixed edge case handling in SAMLTools.

@robotdan
Member

robotdan commented Apr 15, 2022

Assuming this is a duplicate of this issue:

If so, that was fixed in 1.30.2 https://fusionauth.io/docs/v1/tech/release-notes#version-1-30-2

If you have upgraded to version 1.30.2 or later and still see the issue, let us know.

@robotdan robotdan added the duplicate and already fixed labels Apr 15, 2022
@mmoreno509
Author

Hi @robotdan,

Thanks for checking this out for us. It turns out the memory leak was caused by the service provider setting the Issuer value in the SAML AuthnRequest to the IdP's entity ID instead of the service provider's entity ID.

Here is the problematic request:

<saml2p:AuthnRequest xmlns:saml2p="urn:oasis:names:tc:SAML:2.0:protocol" xmlns:saml2="urn:oasis:names:tc:SAML:2.0:assertion" ID="_72d265ac-26b3-4cfd-9102-3f3ab652b65d" Version="2.0" IssueInstant="2022-04-12T14:57:42.585Z" Destination="https://auth.uc-technologies.com/samlv2/login/3f6063e4-3374-11ec-a26e-0a166f139669"><saml2:Issuer>[https://auth.uc-technologies.com/samlv2/3f6063e4-3374-11ec-a26e-0a166f139669</saml2:Issuer></saml2p:AuthnRequest>https://auth.uc-technologies.com/samlv2/3f6063e4-3374-11ec-a26e-0a166f139669</saml2:Issuer></saml2p:AuthnRequest>
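
A small sketch of how such a request could be flagged before any further processing, assuming the AuthnRequest has already been decoded and inflated to well-formed XML. The names `IssuerCheck`, `issuerOf`, and `isSelfIssued` are hypothetical, and the `idp.example.com` entity ID in the demo is a placeholder. Nothing here reflects how FusionAuth actually validates the Issuer.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class IssuerCheck {
    static final String SAML_ASSERTION_NS = "urn:oasis:names:tc:SAML:2.0:assertion";

    // Returns the text of the first saml2:Issuer element, or null if absent.
    public static String issuerOf(String authnRequestXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations to avoid XXE-style attacks.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document doc = factory.newDocumentBuilder()
            .parse(new ByteArrayInputStream(authnRequestXml.getBytes(StandardCharsets.UTF_8)));
        NodeList issuers = doc.getElementsByTagNameNS(SAML_ASSERTION_NS, "Issuer");
        return issuers.getLength() == 0 ? null : issuers.item(0).getTextContent().trim();
    }

    // True when the request names the IdP itself as the issuer, which is
    // the misconfiguration described in this comment.
    public static boolean isSelfIssued(String authnRequestXml, String idpEntityId) throws Exception {
        return idpEntityId.equals(issuerOf(authnRequestXml));
    }

    public static void main(String[] args) throws Exception {
        String idp = "https://idp.example.com/samlv2/app-id"; // placeholder entity ID
        String xml = "<saml2p:AuthnRequest"
            + " xmlns:saml2p=\"urn:oasis:names:tc:SAML:2.0:protocol\""
            + " xmlns:saml2=\"urn:oasis:names:tc:SAML:2.0:assertion\""
            + " ID=\"_1\" Version=\"2.0\">"
            + "<saml2:Issuer>" + idp + "</saml2:Issuer>"
            + "</saml2p:AuthnRequest>";
        System.out.println(isSelfIssued(xml, idp));
    }
}
```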

We are planning to update our FusionAuth version soon. I will update this issue if I am able to reproduce the problem after the update.

@mmoreno509
Author

@robotdan I can confirm the issue still exists in 1.36. It seems that if a service provider sends an authentication request with the FusionAuth URL as the issuer, the CPU load increases dramatically. Here is a screenshot of a recent spike that showed up while one of our service providers was testing their integration:

image

I know this is a user error (our service provider sent an incorrect auth request), but it does cause problems for us: our instance's CPU spikes and stays high until we restart the FusionAuth service.

Also important to note: the memory leak issue you pointed out refers to FusionAuth instances that send email, and our installation has email sending disabled.

@mmoreno509 mmoreno509 reopened this Apr 22, 2022
@robotdan
Member

Thanks for the additional detail @mmoreno509, we'll take a look.

@robotdan
Member

I tried parsing the request you mentioned in this comment #1681 (comment) and compared it to other requests, and found no difference in performance.

When you see that CPU spike, is it sustained, or does that graph represent a stream of AuthN requests?

In your example above, is the [https:// a typo, or does the URL you have in there actually start with an open square bracket?

@mmoreno509
Author

The CPU spike is sustained until we restart FusionAuth. It does not affect the usability of the service: authentication from other services was still possible. However, the request that triggered it ends in a 502 Bad Gateway error for the service provider.

  • The [ may just be leftover GitHub markdown that I didn't entirely remove.

@robotdan
Member

@mmoreno509 have you been able to try reproducing this on a recent version? If it still occurs, let me know and I can take another crack at it.

@bhalsey

bhalsey commented Feb 27, 2024

Found a case of a CPU spike with a SAML request that was different from the truncated request in

@bhalsey bhalsey reopened this Feb 27, 2024
@bhalsey bhalsey self-assigned this Feb 27, 2024
@bhalsey bhalsey added this to the 1.49.0 milestone Feb 27, 2024
@andrewpai andrewpai added this to In progress in FusionAuth Issues Feb 28, 2024
@andrewpai andrewpai moved this from In progress to Delivered in FusionAuth Issues Feb 28, 2024
@bhalsey bhalsey added the bug label Mar 1, 2024