New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Raise ValueError if Symbol does not print to valid C variable name. #26360
Conversation
✅ Hi, I am the SymPy bot. I'm here to help you write a release notes entry. Please read the guide on how to write release notes. Your release notes are in good order. Here is what the release notes will look like:
This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.13. Click here to see the pull request description that was parsed.
Update The release notes on the wiki have been updated. |
I would prefer compatibility with To have it documented as a reference, here is where the actual replacement of
|
Replacing the curly braces is a fix, but we also have to replace all symbols but the valid symbols for the complete fix. The Python printer replace statement is a bandaid for just that invalid symbol. I can do that but it doesn't really solve the full issue. |
I would leave the burden on the user to either create simple variable names that they hand-tune later or else use names they know will work in C. I think what you have here is fine. |
As long as we support |
Benchmark results from GitHub Actions Lower numbers are good, higher numbers are bad. A ratio less than 1 Significantly changed benchmark results (PR vs master) Significantly changed benchmark results (master vs previous release) | Change | Before [a00718ba] | After [4fd4517a] | Ratio | Benchmark (Parameter) |
|----------|----------------------|---------------------|---------|----------------------------------------------------------------------|
| - | 69.6±1ms | 43.8±0.2ms | 0.63 | integrate.TimeIntegrationRisch02.time_doit(10) |
| - | 68.4±0.5ms | 43.4±0.2ms | 0.63 | integrate.TimeIntegrationRisch02.time_doit_risch(10) |
| + | 18.4±0.3μs | 29.7±0.1μs | 1.62 | integrate.TimeIntegrationRisch03.time_doit(1) |
| - | 5.35±0.02ms | 2.84±0.01ms | 0.53 | logic.LogicSuite.time_load_file |
| - | 73.1±0.9ms | 28.4±0.2ms | 0.39 | polys.TimeGCD_GaussInt.time_op(1, 'dense') |
| - | 25.7±0.1ms | 16.9±0.05ms | 0.66 | polys.TimeGCD_GaussInt.time_op(1, 'expr') |
| - | 73.7±0.6ms | 28.6±0.1ms | 0.39 | polys.TimeGCD_GaussInt.time_op(1, 'sparse') |
| - | 257±2ms | 124±0.3ms | 0.48 | polys.TimeGCD_GaussInt.time_op(2, 'dense') |
| - | 255±0.7ms | 123±0.2ms | 0.48 | polys.TimeGCD_GaussInt.time_op(2, 'sparse') |
| - | 658±3ms | 369±2ms | 0.56 | polys.TimeGCD_GaussInt.time_op(3, 'dense') |
| - | 657±3ms | 369±3ms | 0.56 | polys.TimeGCD_GaussInt.time_op(3, 'sparse') |
| - | 495±2μs | 283±1μs | 0.57 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(1, 'dense') |
| - | 1.76±0.01ms | 1.06±0.01ms | 0.6 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(2, 'dense') |
| - | 5.77±0.03ms | 3.09±0.01ms | 0.53 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(3, 'dense') |
| - | 447±1μs | 227±0.6μs | 0.51 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(1, 'dense') |
| - | 1.49±0.01ms | 681±2μs | 0.46 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(2, 'dense') |
| - | 4.92±0.05ms | 1.66±0.02ms | 0.34 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(3, 'dense') |
| - | 377±1μs | 206±2μs | 0.55 | polys.TimeGCD_SparseGCDHighDegree.time_op(1, 'dense') |
| - | 2.45±0.02ms | 1.24±0ms | 0.51 | polys.TimeGCD_SparseGCDHighDegree.time_op(3, 'dense') |
| - | 10.1±0.07ms | 4.30±0.01ms | 0.42 | polys.TimeGCD_SparseGCDHighDegree.time_op(5, 'dense') |
| - | 361±2μs | 169±2μs | 0.47 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(1, 'dense') |
| - | 2.49±0.02ms | 912±5μs | 0.37 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(3, 'dense') |
| - | 9.67±0.03ms | 2.65±0.03ms | 0.27 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(5, 'dense') |
| - | 1.06±0.01ms | 431±4μs | 0.41 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'dense') |
| - | 1.76±0.01ms | 506±2μs | 0.29 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'sparse') |
| - | 5.89±0.1ms | 1.79±0.01ms | 0.3 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'dense') |
| - | 8.40±0.1ms | 1.53±0.01ms | 0.18 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'sparse') |
| - | 288±1μs | 64.5±0.3μs | 0.22 | polys.TimePREM_QuadraticNonMonicGCD.time_op(1, 'sparse') |
| - | 3.49±0.03ms | 392±3μs | 0.11 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'dense') |
| - | 4.01±0.06ms | 282±2μs | 0.07 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'sparse') |
| - | 7.07±0.09ms | 1.27±0.01ms | 0.18 | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'dense') |
| - | 8.75±0.08ms | 854±3μs | 0.1 | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'sparse') |
| - | 5.04±0.07ms | 2.99±0ms | 0.59 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(2, 'sparse') |
| - | 11.9±0.05ms | 6.60±0.03ms | 0.55 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'dense') |
| - | 22.3±0.2ms | 9.09±0.02ms | 0.41 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'sparse') |
| - | 5.26±0.02ms | 865±3μs | 0.16 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(1, 'sparse') |
| - | 12.6±0.05ms | 7.04±0.02ms | 0.56 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(2, 'sparse') |
| - | 103±2ms | 26.0±0.1ms | 0.25 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'dense') |
| - | 168±0.4ms | 54.0±0.2ms | 0.32 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'sparse') |
| - | 360±3μs | 214±0.7μs | 0.6 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(1, 'sparse') |
| - | 4.38±0.04ms | 838±2μs | 0.19 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'dense') |
| - | 5.35±0.05ms | 382±2μs | 0.07 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'sparse') |
| - | 20.4±0.4ms | 2.78±0.03ms | 0.14 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'dense') |
| - | 22.9±0.3ms | 635±6μs | 0.03 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'sparse') |
| - | 511±30μs | 136±0.7μs | 0.27 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(1, 'sparse') |
| - | 4.86±0.04ms | 608±2μs | 0.13 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'dense') |
| - | 5.35±0.02ms | 140±0.8μs | 0.03 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'sparse') |
| - | 13.7±0.2ms | 1.26±0.01ms | 0.09 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'dense') |
| - | 14.3±0.2ms | 141±1μs | 0.01 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'sparse') |
| - | 134±0.8μs | 76.9±0.4μs | 0.57 | solve.TimeMatrixOperations.time_rref(3, 0) |
| - | 248±1μs | 88.8±0.4μs | 0.36 | solve.TimeMatrixOperations.time_rref(4, 0) |
| - | 24.2±0.06ms | 10.2±0.2ms | 0.42 | solve.TimeSolveLinSys189x49.time_solve_lin_sys |
| - | 28.2±0.1ms | 15.3±0.1ms | 0.54 | solve.TimeSparseSystem.time_linsolve_Aaug(20) |
| - | 54.3±0.2ms | 24.7±0.08ms | 0.45 | solve.TimeSparseSystem.time_linsolve_Aaug(30) |
| - | 27.9±0.2ms | 15.1±0.06ms | 0.54 | solve.TimeSparseSystem.time_linsolve_Ab(20) |
| - | 54.2±0.2ms | 24.3±0.09ms | 0.45 | solve.TimeSparseSystem.time_linsolve_Ab(30) |
Full benchmark results can be found as artifacts in GitHub Actions |
Looks great! I'm fine to merge it at this point. If I'm playing the devils advocate: perhaps we should allow for names such as |
Ok, this should be good unless there is more feedback. Thanks folks. |
Thanks Jason! |
I have concerns regarding the current implementation of this patch, particularly due to its reliance on regular expressions for parsing the |
@zachetienne I agree with you. I completely forgot about struct fields. Both @moorepants how do you want to move forward? |
@bjodah : I appreciate your understanding and the acknowledgment of the complexities involved with struct fields and the nuances of C++ syntax, such as the use of the call operator in libraries like Eigen. My concern, however, extends beyond just accommodating these syntaxes. Specifically, it revolves around the principle that it should not be within sympy's scope to enforce or validate the correctness of C/C++ syntax. This responsibility ideally lies with the C/C++ compilers, which are in part designed for this purpose. Implementing a fully robust validation routine within sympy to handle the myriad of syntactical variations and edge cases in C/C++ would not only be a significant undertaking but also introduce considerable overhead. This could potentially impact the performance and maintainability of sympy, diverting resources away from its core functionalities. Therefore, while I understand the motivation behind ensuring syntactical correctness at the sympy level, I advocate for a more streamlined approach that relies on the compilers to perform these checks. This would allow sympy to focus on its strengths, reducing complexity and maintaining efficiency. I believe exploring alternatives that achieve a balance between usability and performance, without overstepping the responsibilities of sympy, would be in our best interest. |
I am fine if we revert this and look for a better solution, but I disagree that the solution is to rely on solely the compiler with no validation at the SymPy level. Our entire C code printing and code generation tooling already attempts to pass valid C code to the compiler so there are no compiler errors. For example, we throw errors if you use reserved words, which is no different than throwing errors if you use invalid variable names. There is also no issue with using regular expressions as long as we design them properly around any edge cases. A typical SymPy user will not know how to deal with compiler errors (even if those even properly surface for them to read). There is no machinery in SymPy to catch compiler errors and translate them into something a SymPy user would understand. An advanced user can relax or change any rule we put into place or write their own printer. |
I think that a user who is directly generating C-code using That would also reduce the risk of breaking back-comp. Regarding regular expressions: they are quite costly, and I notice that we actually compile it on every invocation in this PR. I recently added a similar printing option in gh-25913. Would something similar be a way forward here you think? |
That should be changed for sure. |
Yes, I'm open for any ways to do this, but would like checks at this this level (optionally or not). |
How about this: moorepants@bd3b15b |
It looks like we have |
@moorepants: Thanks for your work in revising the patch. [edit: remove duplicated comments abt the patch] @bjodah, @moorepants: Can we please revert the merged patch, while @moorepants focuses on refining a new one? [edit: remove immaterial comment abt CI] Thanks in advance! |
@zachetienne this is at least the second time that you've reported issues on SymPy asking us to fix things so that your CI runs as desired. It is your choice to have a Github "green" tied to SymPy dev, but not ours to keep your CI green. I recommend either not tying your CI green light to SymPy dev or to accept that you are testing against an unreleased SymPy. We will revert the patch when and if we decide to do so. If you catch bugs while testing against SymPy development versions, it's easiest if you report the exact errors and examples that are causing failures, which we can then investigate. I personally become less motivated to help you given your current feedback approach. |
@moorepants: Please accept my sincere apology. I deeply appreciate the hard work and dedication that goes into maintaining and developing SymPy. The tool is invaluable to my work, and I hold the utmost respect for the community that supports its growth and evolution. Indeed, this is not the place to mention my CI failing. Moving forward, I will not mention it, and instead focus on just reporting exact errors/example as you suggest. I agree this would be most productive. :) |
Thanks. We'll find a workable solution. |
References to other Issues or PRs
Brought up in opty at csu-hmc/opty#139
This somewhat addresses:
#7767
Brief description of what is fixed or changed
In C code generation processes there can be compile errors because some SymPy symbols do not print to valid C variable names. By the time the C code is compiled the error may not be clearly propagated and explained to the user. One option is to tell people up front at code printing time that they have an invalid symbol name. This will at least give a clear error message instead of a possibly obfuscated or hidden compile error.
Maybe the desired solution is to convert a
Symbol.name
into a valid C variable name instead of raising an error. I suppose that people could be printing invalid C code and then fixing it post printing before they compile the C code. If so, this "fix" could break their code. I'll throw this out here to see what people think, but maybe it is not a good solution.Other comments
Release Notes