Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TimeOut occurred when calling ParseTools.subCompileExpression function #348

Closed
PoppingSnack opened this issue Dec 7, 2023 · 16 comments
Closed

Comments

@PoppingSnack
Copy link

TimeOut occurred when calling ParseTools.subCompileExpression function

Description

An TimeOut error exists in the ParseTools.subCompileExpression method in mvel2 2.5.0.Final.

PoC

        <dependency>
            <groupId>org.mvel</groupId>
            <artifactId>mvel2</artifactId>
            <version>2.5.0.Final</version>
        </dependency>
import org.junit.Test;
import org.mvel2.util.*;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.Serializable;
import java.lang.ref.WeakReference;
import java.lang.reflect.Constructor;
import java.lang.reflect.Executable;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;
import org.mvel2.CompileException;
import org.mvel2.DataTypes;
import org.mvel2.MVEL;
import org.mvel2.Operator;
import org.mvel2.OptimizationFailure;
import org.mvel2.ParserContext;
import org.mvel2.ast.ASTNode;
import org.mvel2.compiler.AbstractParser;
import org.mvel2.compiler.BlankLiteral;
import org.mvel2.compiler.CompiledExpression;
import org.mvel2.compiler.ExecutableAccessor;
import org.mvel2.compiler.ExecutableAccessorSafe;
import org.mvel2.compiler.ExecutableLiteral;
import org.mvel2.compiler.ExpressionCompiler;
import org.mvel2.integration.VariableResolverFactory;
import org.mvel2.integration.impl.ClassImportResolverFactory;
import org.mvel2.math.MathProcessor;
import static java.lang.Class.forName;
import static java.lang.Double.parseDouble;
import static java.lang.String.valueOf;
import static java.lang.System.arraycopy;
import static java.lang.Thread.currentThread;
import static java.nio.ByteBuffer.allocateDirect;
import static org.mvel2.DataConversion.canConvert;
import static org.mvel2.DataTypes.*;
import static org.mvel2.MVEL.getDebuggingOutputFileName;
import static org.mvel2.compiler.AbstractParser.LITERALS;
import static org.mvel2.integration.ResolverTools.appendFactory;
public class ParseToolsFuzzerSubCompileExpression {

    // 超时
    @Test
    public void subCompileExpressionFuzzerTest2() {
        try {
            Serializable result = ParseTools.subCompileExpression("_|=A\uD040>_^=\u063E_F.WWW^=\u7DF7\u7DF7\u7DF7\u7DF7\u7DF7\u7DF7\uADF7\u7DF7\u7DF7\u763E_\u0F7E_^=\u062E4.W\u0000^=\u063E_\u0000\u0000\u0800\u0000|=>_^=\u0740\u0000\u0000\u0000z$\u0005\u0000\u0010\u0000Z\u7DF7\u7DF7\u7DF7\u763E_\u0F7E_^=\u062E4.\u7DF7\u7DF7\u7DF7\u7DF7\u7DF7\u7DF7\u7DD8>_\u0F7E_^=\u062E8.W\u0000^=\u063E_\u0000\u0000\u0800\u0000|=>_^=\u0637\u7DF7\u7DF7\u7DF7\u7DF7\u7DF7\u7DF7\u7DD8>_\u0F7E_^=\u062E4.W\u0000^=\u063E_\u0000\u0000\u0010Zar\u0000*=z.\u0005e.\u0005\u5000\u0000\u0000e\u0280*=\u07FF\u07F7\u07C8\u0000r\u0000\u0000Zer\u0000*=z.\u0005e\u0000\u0000\u0000\u0000e\u0280*=r\u0000*=\u0000\u0000Zer\u0000*z.\u0005e\u0000\u0000\u0000\u0000e\u0010\u0000Zer\u0000*=\u0740\u0000\u0000\u0000\u0000\u0000Zer\u0000*z.\u0005e\u0000\u0000\u0000\u0000e\u0280*=\u0000\u0010Zar\u0000*=z.\u0005e.\u0005\u5000\u0000\u0000e\u0280*=\u07FF\u07F7\u07C8\u0000r\u0000\u0000Zer\u0000*=z.\u0005e\u0000\u0000\u0000\u0000e\u0280*=r\u0000*=\u0740\u0000\u0000\u0000\u0000\u0740\u0000z.\u0005\u0000\u0010\bVer\u0000*=\u0740\u0000\u0000\u0000\u0000\u0000\u0000\u0006\u0000er\u00002222e66e7272e6e66e22e66e7272e672222e66e72766e22e66ee66e7272e6e66e2*\u0005z.e\u0000\u0000\u0000\u0000e\u0280*=\u07FF\u07FF\u07C8\u0000\u0000\u0000\b\u0000\u0000\u0000Z\u7DF7\u063E_\u0F7E_^=\u062E4.W\u0000^=\u063E_\u0000\u0000\u0800\u0000|=>_^=\u0740\u0000\u0000\u0000z$\u0005\u0000\u0010\u0000Z\u7DF7\u7DF7\u7DF7\u763E_\u0F7E_^=\u062E4.\u7DF7\u7DF7\u7DF7\u7DF7\u7DF7\u7DF7\u7DD8>_\u0F7E_^=\u062E8.W\u0000^=\u063E_\u0000\u0000\u0800\u0000|=>_^=\u0637\u7DF7\u7DF7\u7DF7\u7DF7\u7DF7\u7DF7\u7DD8>_\u0F7E_^=\u062E4.W\u0000^=\u063E_\u0000\u0000\u0010Zar\u0000*=z.\u0005e.\u0005\u5000\u0000\u0000e\u0280*=\u07FF\u07F7\u07C8\u0000r\u0000\u0000Zer\u0000*=z.\u0005e\u0000\u0000\u0000\u0000e\u0280*=r\u0000*=\u0000\u0000Zer\u0000*z.\u0005e\u0000\u0000\u0000\u0000e\u0010\u0000Zer\u0000*=\u0740\u0000\u0000\u0000\u0000\u0000Zer\u0000*z.\u0005e\u0000\u0000\u0000\u0000e\u0280*=\u0000\u0010Zar\u0000*=z.\u0005e.\u0005\u5000\u0000\u0000e\u0280*=\u07FF\u07F7\u07C8\u0000r\u0000\u0000Zer\u0000*=z.\u0005e\u0000\u0000\u0000\u0000e\u0280*=r\u0000*=\u0740\u0000\u0000\u0000\u0000\u0740\u0000z.\u0005\u0000\u0010\bVer\u0000*=\u0740\u0000\u0000\u0000\u0000\u0000\u0000\u0006\u0000er\u00002222e66e7272e6e66e22e66e7272e6722eR\u79E7\u79E7\u79E7\u79E7\u79E7\u79EE672222e66e7272e6722e\u0000\u0800\u0000|=>_^=\u063EV".toCharArray());
        } catch (Exception e) {
        }
    }
}
@mariofusco
Copy link
Member

I ran locally the reproducer you sent and it seems that it takes forever and never terminates. I cannot reproduce any TimeOut, do you have a general time out configured in your test environment? I still haven't understood if it makes MVEL falling into an infinite loop, but I will investigate further and let you know.

That said consider that the ParseTools.subCompileExpression is only "incidentally" public, but in reality it has to be considered internal to MVEL. Normally all the usages of the MVEL library go through the static methods exposed by the MVEL class. May I know why you're directly using that method? Do you have any specific use case?

@mariofusco
Copy link
Member

I further investigated this issue and now I can say for sure that the mvel parser is NOT falling in any infinite loop and it is working as expected. The string to be parsed is very long and convoluted and what is worse is that it obliges mvel to attempt tons of useless but unavoidable classes look up to check if any of the names in that string is an existing Java class. If I comment out those class loading attempts here and here (something that doesn't affect in any way the control flow of the parser since none of those classes exist) the parsing terminates successfully on my machine in about 15 minutes. This means that it would terminate successfully also with that class loading enabled even though I expect that it could take hours if not days.

Probably we could do some profiling and improve performances in that area (and for sure we will give it a try) but I guess this is not the point: you could always create an even longer and odd string to be parsed that could override all those performance improvements. The point here is that if you try to parse a crazy long and convoluted string the only thing that you could expect is that the parser will take a crazy amount of time to complete its task. I honestly don't see how it could work differently than this and more important how this could represent a security risk, but any suggestion is welcome.

@murainwood
Copy link

I further investigated this issue and now I can say for sure that the mvel parser is NOT falling in any infinite loop and it is working as expected. The string to be parsed is very long and convoluted and what is worse is that it obliges mvel to attempt tons of useless but unavoidable classes look up to check if any of the names in that string is an existing Java class. If I comment out those class loading attempts here and here (something that doesn't affect in any way the control flow of the parser since none of those classes exist) the parsing terminates successfully on my machine in about 15 minutes. This means that it would terminate successfully also with that class loading enabled even though I expect that it could take hours if not days.↳

Probably we could do some profiling and improve performances in that area (and for sure we will give it a try) but I guess this is not the point: you could always create an even longer and odd string to be parsed that could override all those performance improvements. The point here is that if you try to parse a crazy long and convoluted string the only thing that you could expect is that the parser will take a crazy amount of time to complete its task. I honestly don't see how it could work differently than this and more important how this could represent a security risk, but any suggestion is welcome.↳

Is it possible to add a configuration option to limit the length of parsed strings? For instance, by default, it's 8 megabytes. Would it be permissible to parse longer strings, allowing developers to modify the configuration to decide?

@mariofusco
Copy link
Member

Is it possible to add a configuration option to limit the length of parsed strings? For instance, by default, it's 8 megabytes. Would it be permissible to parse longer strings, allowing developers to modify the configuration to decide?

The problem in the String used in this test case is not much in its length (around 2k or just a bit more), but in how it is written. It is a long concatenation of meaningless assignments, also using Chinese, Arabic, Korean and other character sets. This is also suspicious because some of these characters trigger the unicode bidi algorithm (thanks @MalcolmOdd for pointing this out) and this kind of thing has been used to hide malicious code in open source PRs, even though I don't believe this is the case in this situation and @PoppingSnack was simply "having fun" with some pseudo random sequence of characters.

Also for this reason I don't think it will be useful to pollute mvel code with limitations on the length of the string to be parsed, the kind of characters it contains or the time taken to perform this parsing. Mvel is simply a library and if this sort of limitations are necessary for a specific use case they can be easily implemented by the user of the library immediately before calling it.

@murainwood
Copy link

Is it possible to add a configuration option to limit the length of parsed strings? For instance, by default, it's 8 megabytes. Would it be permissible to parse longer strings, allowing developers to modify the configuration to decide?↳

The problem in the String used in this test case is not much in its length (around 2k or just a bit more), but in how it is written. It is a long concatenation of meaningless assignments, also using Chinese, Arabic, Korean and other character sets. This is also suspicious because some of these characters trigger the unicode bidi algorithm (thanks @MalcolmOdd for pointing this out) and this kind of thing has been used to hide malicious code in open source PRs, even though I don't believe this is the case in this situation and @PoppingSnack was simply "having fun" with some pseudo random sequence of characters.↳

Also for this reason I don't think it will be useful to pollute mvel code with limitations on the length of the string to be parsed, the kind of characters it contains or the time taken to perform this parsing. Mvel is simply a library and if this sort of limitations are necessary for a specific use case they can be easily implemented by the user of the library immediately before calling it.↳

Emm ...Agree with your idea. The limitations of string length should be handled by external code or components.

@mpassell
Copy link

mpassell commented Jan 4, 2024

Does anyone here know why a CVE was created for this? (See https://nvd.nist.gov/vuln/detail/CVE-2023-51079) If you send weird, complex inputs to a library, it's not really surprising that it takes it a long time to come back with an answer. Should I create a CVE for a DB if I can come up with a query that takes a long time to run? (Sorry, I'm mostly venting. There are so many CVEs of the form, "If you use X to process untrusted inputs, bad things can happen. 😞 )

@GregDThomas
Copy link

Does anyone here know why a CVE was created for this?

I'm on the "security@" mailing list for an OSS project. I see quite regularly people raising "issues" that are no such thing - e.g. along the lines of: "If you delete all the access controls, anyone can access the system". I can only think that they are trying to beef up their CV by trying to show how many "vulnerabilities" they have identified. Given anyone can raise a CVE I wonder if that's what has happened here.

I suspect it may be necessary to raise a dispute. As you say - it's not surprising that a complex thing takes a long time to calculate.

https://www.cve.org/Resources/General/Policies/CVE-Record-Dispute-Policy.pdf

@GregDThomas
Copy link

Co-incidentally, I was made aware of https://daniel.haxx.se/blog/2024/01/02/the-i-in-llm-stands-for-intelligence/ today which is also relevant.

@mariofusco
Copy link
Member

It seems that there is a general agreement on the fact that this isn't a bug and for sure not a CVE (not to mention to legit doubts on why this thing has been filed as a CVE and on the process at large), so I'm closing this issue.

@mareknovotny
Copy link

@mariofusco i agree with the closing, just as notice [mariofusco](https://github.com/mariofusco) closed this as **completed** i think this should be closed with different status as now it shows that it was fixed, which is not true

@mariofusco mariofusco closed this as not planned Won't fix, can't repro, duplicate, stale Jan 8, 2024
@mpassell
Copy link

mpassell commented Jan 8, 2024

I'm on the "security@" mailing list for an OSS project. I see quite regularly people raising "issues" that are no such thing - e.g. along the lines of: "If you delete all the access controls, anyone can access the system". I can only think that they are trying to beef up their CV by trying to show how many "vulnerabilities" they have identified. Given anyone can raise a CVE I wonder if that's what has happened here.

I suspect it may be necessary to raise a dispute. As you say - it's not surprising that a complex thing takes a long time to calculate.

https://www.cve.org/Resources/General/Policies/CVE-Record-Dispute-Policy.pdf

Yeah, I suspect that you're right. I understand that one wants to keep the friction low for reporting legitimate issues, but I still wish that the process put a bit more of the burden of justifying a report on the person doing the reporting. I'd be happy to help kick off the CVE dispute process, but I don't see how you're supposed to identify the CNA (to which the dispute is submitted) from the CVE entry. Once again, putting the onus on the project. 😡

@attritionorg
Copy link

@mpassell Viewing it on MITRE's site shows the CNA, in this case MITRE themselves:
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-51079

@mpassell
Copy link

mpassell commented Jan 19, 2024

Oh, cool. I see that this is now disputed. Thanks to whomever filed the paperwork!

@mariofusco
Copy link
Member

Oh, cool. I see that this is now disputed. Thanks to whomever filed the paperwork!

Seriously? @PoppingSnack wouldn't be time to admit that this doesn't make any sense and give up?

@GregDThomas
Copy link

I think "Disputed" is a good thing, it's the CVE that is disputed, (with link to this issue) so anyone digging in to it can see that it's not really a vulnerability.

@attritionorg
Copy link

You want to ensure that the CVE is more than just flagged "DISPUTED". MITRE can add a reply from the vendor, if brief, that explains why it is not a vulnerability. Otherwise many organizations may ignore the tag not having further context unless they dig into references. These days, getting MITRE to do that may be an uphill battle, said based on reading other GitHub issues with developers talking about their experience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants