New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use the strict setting for invalid name checks and deal with other C accessors. #26395
base: master
Are you sure you want to change the base?
Conversation
✅ Hi, I am the SymPy bot. I'm here to help you write a release notes entry. Please read the guide on how to write release notes.
Click here to see the pull request description that was parsed.
|
Benchmark results from GitHub Actions Lower numbers are good, higher numbers are bad. A ratio less than 1 Significantly changed benchmark results (PR vs master) Significantly changed benchmark results (master vs previous release) | Change | Before [a00718ba] | After [475815de] | Ratio | Benchmark (Parameter) |
|----------|----------------------|---------------------|---------|----------------------------------------------------------------------|
| - | 68.5±0.9ms | 44.2±0.4ms | 0.64 | integrate.TimeIntegrationRisch02.time_doit(10) |
| - | 68.5±1ms | 43.8±0.2ms | 0.64 | integrate.TimeIntegrationRisch02.time_doit_risch(10) |
| + | 18.2±0.2μs | 30.0±0.4μs | 1.65 | integrate.TimeIntegrationRisch03.time_doit(1) |
| - | 5.36±0.02ms | 2.87±0.01ms | 0.54 | logic.LogicSuite.time_load_file |
| - | 72.5±0.2ms | 28.5±0.2ms | 0.39 | polys.TimeGCD_GaussInt.time_op(1, 'dense') |
| - | 25.7±0.05ms | 17.1±0.08ms | 0.66 | polys.TimeGCD_GaussInt.time_op(1, 'expr') |
| - | 73.3±0.7ms | 28.7±0.2ms | 0.39 | polys.TimeGCD_GaussInt.time_op(1, 'sparse') |
| - | 253±0.7ms | 125±0.1ms | 0.49 | polys.TimeGCD_GaussInt.time_op(2, 'dense') |
| - | 252±1ms | 125±0.3ms | 0.49 | polys.TimeGCD_GaussInt.time_op(2, 'sparse') |
| - | 655±2ms | 371±2ms | 0.57 | polys.TimeGCD_GaussInt.time_op(3, 'dense') |
| - | 653±2ms | 375±0.8ms | 0.57 | polys.TimeGCD_GaussInt.time_op(3, 'sparse') |
| - | 493±2μs | 282±2μs | 0.57 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(1, 'dense') |
| - | 1.83±0.01ms | 1.06±0ms | 0.58 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(2, 'dense') |
| - | 5.83±0.08ms | 3.09±0.02ms | 0.53 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(3, 'dense') |
| - | 443±2μs | 227±2μs | 0.51 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(1, 'dense') |
| - | 1.47±0.01ms | 670±2μs | 0.45 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(2, 'dense') |
| - | 4.92±0.07ms | 1.65±0.02ms | 0.33 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(3, 'dense') |
| - | 377±2μs | 202±2μs | 0.54 | polys.TimeGCD_SparseGCDHighDegree.time_op(1, 'dense') |
| - | 2.42±0.01ms | 1.21±0.01ms | 0.5 | polys.TimeGCD_SparseGCDHighDegree.time_op(3, 'dense') |
| - | 10.2±0.09ms | 4.34±0.03ms | 0.42 | polys.TimeGCD_SparseGCDHighDegree.time_op(5, 'dense') |
| - | 363±2μs | 166±0.5μs | 0.46 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(1, 'dense') |
| - | 2.54±0.02ms | 894±7μs | 0.35 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(3, 'dense') |
| - | 9.53±0.03ms | 2.63±0.03ms | 0.28 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(5, 'dense') |
| - | 1.05±0ms | 431±8μs | 0.41 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'dense') |
| - | 1.80±0.02ms | 496±1μs | 0.28 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'sparse') |
| - | 5.82±0.04ms | 1.81±0.01ms | 0.31 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'dense') |
| - | 8.51±0.04ms | 1.47±0.01ms | 0.17 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'sparse') |
| - | 292±0.9μs | 64.4±0.7μs | 0.22 | polys.TimePREM_QuadraticNonMonicGCD.time_op(1, 'sparse') |
| - | 3.43±0.04ms | 388±2μs | 0.11 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'dense') |
| - | 4.07±0.03ms | 274±1μs | 0.07 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'sparse') |
| - | 7.08±0.05ms | 1.25±0.01ms | 0.18 | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'dense') |
| - | 8.52±0.02ms | 849±20μs | 0.1 | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'sparse') |
| - | 2.36±0.01ms | 1.55±0ms | 0.66 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(2, 'dense') |
| - | 5.34±0.05ms | 2.94±0.02ms | 0.55 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(2, 'sparse') |
| - | 12.5±0.03ms | 6.63±0.06ms | 0.53 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'dense') |
| - | 22.6±0.1ms | 8.94±0.06ms | 0.4 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'sparse') |
| - | 5.43±0.03ms | 864±6μs | 0.16 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(1, 'sparse') |
| - | 13.1±0.05ms | 6.97±0.03ms | 0.53 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(2, 'sparse') |
| - | 105±0.8ms | 25.9±0.1ms | 0.25 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'dense') |
| - | 166±1ms | 53.3±0.4ms | 0.32 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'sparse') |
| - | 175±0.5μs | 111±0.7μs | 0.63 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(1, 'dense') |
| - | 363±2μs | 212±1μs | 0.58 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(1, 'sparse') |
| - | 4.30±0.03ms | 829±4μs | 0.19 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'dense') |
| - | 5.39±0.04ms | 376±1μs | 0.07 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'sparse') |
| - | 20.3±0.3ms | 2.77±0.03ms | 0.14 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'dense') |
| - | 23.0±0.1ms | 621±2μs | 0.03 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'sparse') |
| - | 485±2μs | 133±1μs | 0.27 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(1, 'sparse') |
| - | 4.71±0.01ms | 608±6μs | 0.13 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'dense') |
| - | 5.38±0.02ms | 137±0.8μs | 0.03 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'sparse') |
| - | 13.6±0.08ms | 1.26±0.01ms | 0.09 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'dense') |
| - | 14.0±0.2ms | 141±0.9μs | 0.01 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'sparse') |
| - | 136±2μs | 75.1±0.4μs | 0.55 | solve.TimeMatrixOperations.time_rref(3, 0) |
| - | 254±1μs | 87.0±0.2μs | 0.34 | solve.TimeMatrixOperations.time_rref(4, 0) |
| - | 24.1±0.1ms | 10.2±0.04ms | 0.42 | solve.TimeSolveLinSys189x49.time_solve_lin_sys |
| - | 28.7±0.1ms | 15.4±0.08ms | 0.54 | solve.TimeSparseSystem.time_linsolve_Aaug(20) |
| - | 55.6±0.3ms | 25.0±0.08ms | 0.45 | solve.TimeSparseSystem.time_linsolve_Aaug(30) |
| - | 28.7±0.3ms | 15.4±0.1ms | 0.54 | solve.TimeSparseSystem.time_linsolve_Ab(20) |
| - | 55.2±0.3ms | 24.5±0.1ms | 0.44 | solve.TimeSparseSystem.time_linsolve_Ab(30) |
Full benchmark results can be found as artifacts in GitHub Actions |
Speedy work! (what are weekends for, if not open source 😝) This (new) printer setting (if we go with my suggestion above) could become useful for other languages as well I suspect? E.g. python allows unicode "letter" symbols, fortran77 has a quite strict max number of character rule, while Julia allows for tons of unicode characters last time I checked. I don't suggest that we implement any of those checks in this PR, only that the printer setting be moved to |
@@ -396,8 +397,11 @@ def _print_MatrixElement(self, expr): | |||
|
|||
def _print_Symbol(self, expr): | |||
name = super()._print_Symbol(expr) | |||
# if name is an indexed, e.g. a[0], only check the variable name | |||
self._check_valid_c_variable_name(name.split('[')[0]) | |||
if self._settings['strict']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I could have sworn I wrote a comment here, github or I messed up, here I go again)
I would prefer if we introduced a new setting for checking the names of the symbols, since the type of checking is somewhat orthogonal to what our current strict
setting does. What about strict_symbol_names
or strict_names
? (The exact naming is not that important to me at least, only that we give users fine grained control).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would also differ with respect to default: we made strict
default to True
because it was generating non-functional code before. For this new option, that does not hold, so the default will need to be False
I think...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I can do that. Although it isn't so clear what 'strict' means in the current code. I couldn't really find an explanation. There is also a special setting for the reserved words check. I'll revisit this when I have some time and make a new proposal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's my fault. I need to write some documentation for what it is doing, and figure out where to put it (if we have a central place where we document printer options for the code printer). Currently it is simply mentioned in passing:
sympy/doc/src/modules/printing.rst
Line 269 in bb7a8d9
``strict=False``. The code then contains a comment informing the user of the issue: |
@@ -89,6 +89,8 @@ | |||
|
|||
reserved_words_c99 = ['inline', 'restrict'] | |||
|
|||
valid_var_name_pattern = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]{0,30}$") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will fail on variable names that are more than 31 characters long, which are permitted in C and C++.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea here is that we will have an opt-in "strict_names_flag" or something similar, in this mode, users should not use Symbol names for encode arbitrary syntax, but rather use e.g. Indexed
or instances in .codegen.ast
to reference struct fields etc.
As for the limit of 31 characters, it looks like there is some justification to use this limit for at least our C89CodePrinter
(may also the C99
version, I haven't checked...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. I have no objections to introducing a 'strict_names_flag' option. My primary concern is with the potential constraints it might introduce, especially if it were to be enabled by default. Implementing such restrictions could result in a learning curve for users unfamiliar with Indexed
or .codegen.ast
.
if self._settings['strict']: | ||
# if name is an accessor, e.g. a[0], a.b a->b, only check the first | ||
# variable name | ||
first = name.split('[')[0].split('.')[0].split('->')[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When applied to a struct, this references the struct name and not the variable name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my earlier comment applies here as well..
References to other Issues or PRs
Addition to #26360
Brief description of what is fixed or changed
Fixed some issues raised in the original PR after merging.
Other comments
Release Notes
NO ENTRY