Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the strict setting for invalid name checks and deal with other C accessors. #26395

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

moorepants
Copy link
Member

References to other Issues or PRs

Addition to #26360

Brief description of what is fixed or changed

Fixed some issues raised in the original PR after merging.

Other comments

Release Notes

NO ENTRY

@sympy-bot
Copy link

Hi, I am the SymPy bot. I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

  • No release notes entry will be added for this pull request.
Click here to see the pull request description that was parsed.
<!-- Your title above should be a short description of what
was changed. Do not include the issue number in the title. -->

#### References to other Issues or PRs
<!-- If this pull request fixes an issue, write "Fixes #NNNN" in that exact
format, e.g. "Fixes #1234" (see
https://tinyurl.com/auto-closing for more information). Also, please
write a comment on that issue linking back to this pull request once it is
open. -->

Addition to #26360 


#### Brief description of what is fixed or changed

Fixed some issues raised in the original PR after merging.

#### Other comments


#### Release Notes

<!-- Write the release notes for this release below between the BEGIN and END
statements. The basic format is a bulleted list with the name of the subpackage
and the release note for this PR. For example:

* solvers
  * Added a new solver for logarithmic equations.

* functions
  * Fixed a bug with log of integers. Formerly, `log(-x)` incorrectly gave `-log(x)`.

* physics.units
  * Corrected a semantical error in the conversion between volt and statvolt which
    reported the volt as being larger than the statvolt.

or if no release note(s) should be included use:

NO ENTRY

See https://github.com/sympy/sympy/wiki/Writing-Release-Notes for more
information on how to write release notes. The bot will check your release
notes automatically to see if they are formatted correctly. -->

<!-- BEGIN RELEASE NOTES -->

NO ENTRY

<!-- END RELEASE NOTES -->

Copy link

github-actions bot commented Mar 24, 2024

Benchmark results from GitHub Actions

Lower numbers are good, higher numbers are bad. A ratio less than 1
means a speed up and greater than 1 means a slowdown. Green lines
beginning with + are slowdowns (the PR is slower then master or
master is slower than the previous release). Red lines beginning
with - are speedups.

Significantly changed benchmark results (PR vs master)

Significantly changed benchmark results (master vs previous release)

| Change   | Before [a00718ba]    | After [475815de]    |   Ratio | Benchmark (Parameter)                                                |
|----------|----------------------|---------------------|---------|----------------------------------------------------------------------|
| -        | 68.5±0.9ms           | 44.2±0.4ms          |    0.64 | integrate.TimeIntegrationRisch02.time_doit(10)                       |
| -        | 68.5±1ms             | 43.8±0.2ms          |    0.64 | integrate.TimeIntegrationRisch02.time_doit_risch(10)                 |
| +        | 18.2±0.2μs           | 30.0±0.4μs          |    1.65 | integrate.TimeIntegrationRisch03.time_doit(1)                        |
| -        | 5.36±0.02ms          | 2.87±0.01ms         |    0.54 | logic.LogicSuite.time_load_file                                      |
| -        | 72.5±0.2ms           | 28.5±0.2ms          |    0.39 | polys.TimeGCD_GaussInt.time_op(1, 'dense')                           |
| -        | 25.7±0.05ms          | 17.1±0.08ms         |    0.66 | polys.TimeGCD_GaussInt.time_op(1, 'expr')                            |
| -        | 73.3±0.7ms           | 28.7±0.2ms          |    0.39 | polys.TimeGCD_GaussInt.time_op(1, 'sparse')                          |
| -        | 253±0.7ms            | 125±0.1ms           |    0.49 | polys.TimeGCD_GaussInt.time_op(2, 'dense')                           |
| -        | 252±1ms              | 125±0.3ms           |    0.49 | polys.TimeGCD_GaussInt.time_op(2, 'sparse')                          |
| -        | 655±2ms              | 371±2ms             |    0.57 | polys.TimeGCD_GaussInt.time_op(3, 'dense')                           |
| -        | 653±2ms              | 375±0.8ms           |    0.57 | polys.TimeGCD_GaussInt.time_op(3, 'sparse')                          |
| -        | 493±2μs              | 282±2μs             |    0.57 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(1, 'dense')            |
| -        | 1.83±0.01ms          | 1.06±0ms            |    0.58 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(2, 'dense')            |
| -        | 5.83±0.08ms          | 3.09±0.02ms         |    0.53 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(3, 'dense')            |
| -        | 443±2μs              | 227±2μs             |    0.51 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(1, 'dense')               |
| -        | 1.47±0.01ms          | 670±2μs             |    0.45 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(2, 'dense')               |
| -        | 4.92±0.07ms          | 1.65±0.02ms         |    0.33 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(3, 'dense')               |
| -        | 377±2μs              | 202±2μs             |    0.54 | polys.TimeGCD_SparseGCDHighDegree.time_op(1, 'dense')                |
| -        | 2.42±0.01ms          | 1.21±0.01ms         |    0.5  | polys.TimeGCD_SparseGCDHighDegree.time_op(3, 'dense')                |
| -        | 10.2±0.09ms          | 4.34±0.03ms         |    0.42 | polys.TimeGCD_SparseGCDHighDegree.time_op(5, 'dense')                |
| -        | 363±2μs              | 166±0.5μs           |    0.46 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(1, 'dense')            |
| -        | 2.54±0.02ms          | 894±7μs             |    0.35 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(3, 'dense')            |
| -        | 9.53±0.03ms          | 2.63±0.03ms         |    0.28 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(5, 'dense')            |
| -        | 1.05±0ms             | 431±8μs             |    0.41 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'dense')           |
| -        | 1.80±0.02ms          | 496±1μs             |    0.28 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'sparse')          |
| -        | 5.82±0.04ms          | 1.81±0.01ms         |    0.31 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'dense')           |
| -        | 8.51±0.04ms          | 1.47±0.01ms         |    0.17 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'sparse')          |
| -        | 292±0.9μs            | 64.4±0.7μs          |    0.22 | polys.TimePREM_QuadraticNonMonicGCD.time_op(1, 'sparse')             |
| -        | 3.43±0.04ms          | 388±2μs             |    0.11 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'dense')              |
| -        | 4.07±0.03ms          | 274±1μs             |    0.07 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'sparse')             |
| -        | 7.08±0.05ms          | 1.25±0.01ms         |    0.18 | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'dense')              |
| -        | 8.52±0.02ms          | 849±20μs            |    0.1  | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'sparse')             |
| -        | 2.36±0.01ms          | 1.55±0ms            |    0.66 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(2, 'dense')  |
| -        | 5.34±0.05ms          | 2.94±0.02ms         |    0.55 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(2, 'sparse') |
| -        | 12.5±0.03ms          | 6.63±0.06ms         |    0.53 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'dense')  |
| -        | 22.6±0.1ms           | 8.94±0.06ms         |    0.4  | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'sparse') |
| -        | 5.43±0.03ms          | 864±6μs             |    0.16 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(1, 'sparse')    |
| -        | 13.1±0.05ms          | 6.97±0.03ms         |    0.53 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(2, 'sparse')    |
| -        | 105±0.8ms            | 25.9±0.1ms          |    0.25 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'dense')     |
| -        | 166±1ms              | 53.3±0.4ms          |    0.32 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'sparse')    |
| -        | 175±0.5μs            | 111±0.7μs           |    0.63 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(1, 'dense')      |
| -        | 363±2μs              | 212±1μs             |    0.58 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(1, 'sparse')     |
| -        | 4.30±0.03ms          | 829±4μs             |    0.19 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'dense')      |
| -        | 5.39±0.04ms          | 376±1μs             |    0.07 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'sparse')     |
| -        | 20.3±0.3ms           | 2.77±0.03ms         |    0.14 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'dense')      |
| -        | 23.0±0.1ms           | 621±2μs             |    0.03 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'sparse')     |
| -        | 485±2μs              | 133±1μs             |    0.27 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(1, 'sparse') |
| -        | 4.71±0.01ms          | 608±6μs             |    0.13 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'dense')  |
| -        | 5.38±0.02ms          | 137±0.8μs           |    0.03 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'sparse') |
| -        | 13.6±0.08ms          | 1.26±0.01ms         |    0.09 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'dense')  |
| -        | 14.0±0.2ms           | 141±0.9μs           |    0.01 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'sparse') |
| -        | 136±2μs              | 75.1±0.4μs          |    0.55 | solve.TimeMatrixOperations.time_rref(3, 0)                           |
| -        | 254±1μs              | 87.0±0.2μs          |    0.34 | solve.TimeMatrixOperations.time_rref(4, 0)                           |
| -        | 24.1±0.1ms           | 10.2±0.04ms         |    0.42 | solve.TimeSolveLinSys189x49.time_solve_lin_sys                       |
| -        | 28.7±0.1ms           | 15.4±0.08ms         |    0.54 | solve.TimeSparseSystem.time_linsolve_Aaug(20)                        |
| -        | 55.6±0.3ms           | 25.0±0.08ms         |    0.45 | solve.TimeSparseSystem.time_linsolve_Aaug(30)                        |
| -        | 28.7±0.3ms           | 15.4±0.1ms          |    0.54 | solve.TimeSparseSystem.time_linsolve_Ab(20)                          |
| -        | 55.2±0.3ms           | 24.5±0.1ms          |    0.44 | solve.TimeSparseSystem.time_linsolve_Ab(30)                          |

Full benchmark results can be found as artifacts in GitHub Actions
(click on checks at the top of the PR).

@bjodah
Copy link
Member

bjodah commented Mar 24, 2024

Speedy work! (what are weekends for, if not open source 😝)

This (new) printer setting (if we go with my suggestion above) could become useful for other languages as well I suspect? E.g. python allows unicode "letter" symbols, fortran77 has a quite strict max number of character rule, while Julia allows for tons of unicode characters last time I checked.

I don't suggest that we implement any of those checks in this PR, only that the printer setting be moved to CodePrinter.

@@ -396,8 +397,11 @@ def _print_MatrixElement(self, expr):

def _print_Symbol(self, expr):
name = super()._print_Symbol(expr)
# if name is an indexed, e.g. a[0], only check the variable name
self._check_valid_c_variable_name(name.split('[')[0])
if self._settings['strict']:
Copy link
Member

@bjodah bjodah Mar 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I could have sworn I wrote a comment here, github or I messed up, here I go again)
I would prefer if we introduced a new setting for checking the names of the symbols, since the type of checking is somewhat orthogonal to what our current strict setting does. What about strict_symbol_names or strict_names? (The exact naming is not that important to me at least, only that we give users fine grained control).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would also differ with respect to default: we made strict default to True because it was generating non-functional code before. For this new option, that does not hold, so the default will need to be False I think...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I can do that. Although it isn't so clear what 'strict' means in the current code. I couldn't really find an explanation. There is also a special setting for the reserved words check. I'll revisit this when I have some time and make a new proposal.

Copy link
Member

@bjodah bjodah Mar 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's my fault. I need to write some documentation for what it is doing, and figure out where to put it (if we have a central place where we document printer options for the code printer). Currently it is simply mentioned in passing:

``strict=False``. The code then contains a comment informing the user of the issue:

@@ -89,6 +89,8 @@

reserved_words_c99 = ['inline', 'restrict']

valid_var_name_pattern = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]{0,30}$")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will fail on variable names that are more than 31 characters long, which are permitted in C and C++.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea here is that we will have an opt-in "strict_names_flag" or something similar, in this mode, users should not use Symbol names for encode arbitrary syntax, but rather use e.g. Indexed or instances in .codegen.ast to reference struct fields etc.

As for the limit of 31 characters, it looks like there is some justification to use this limit for at least our C89CodePrinter (may also the C99 version, I haven't checked...).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. I have no objections to introducing a 'strict_names_flag' option. My primary concern is with the potential constraints it might introduce, especially if it were to be enabled by default. Implementing such restrictions could result in a learning curve for users unfamiliar with Indexed or .codegen.ast.

if self._settings['strict']:
# if name is an accessor, e.g. a[0], a.b a->b, only check the first
# variable name
first = name.split('[')[0].split('.')[0].split('->')[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When applied to a struct, this references the struct name and not the variable name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my earlier comment applies here as well..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants