Inconsistent Fortran printing for Indexed, ArraySymbol and MatrixSymbol #23998
Comments
For loops also seem inconsistent:
from sympy import fcode, Range, IndexedBase, Idx
from sympy.codegen.ast import For
i = Idx('i')
print(fcode(For(i, Range(5), [IndexedBase('A')[i]])))
do i = 0, 5, 1
A(i)
end do
The above fortran code (if I print
This is the c code btw:
for (i = 0; i < 5; i += 1) {
A[i];
}
|
I wonder if maybe the fortran printer should have a way to set which starting value for arrays you use (the default is that it starts at 1, but you in principle could define an array which starts at any number, including negative numbers I believe) |
Some more interesting things:
Maybe it is better to have one central array-like printing method, which itself uses printer specific methods to figure out what the correct format is. |
Digging into this makes it worse and worse tbh. e.g.
from sympy import IndexedBase, Idx, ccode, simplify
A = IndexedBase('A', (5,5), strides = 'F')
i,j = Idx('i'), Idx('j')
print(ccode(A[i,j]))
print(ccode(A.func(*A.args)[i,j]))
print(ccode(simplify(A)[i,j]))
Shows that the strides information is lost when recreating the object. That makes this feature completely bugged, since the above can arbitrarily happen during any of SymPy's operations (e.g. with I wonder if strides should really be a part Currently |
Oh, that's a lot of bugs. I think it's great that you take the time to hunt them down. I'm probably the one to blame here, I overlooked the need for explicitly adding The I'm fine with deprecating features which are inherently broken. To be honest, I've seen very few people using some of these features besides me, which I guess says something about the ergonomics of them. But it would be useful to have a "multi-indexed" object (N-dimensional array) which can be printed as e.g. Part of the reason for why the situation looks like this is historic: |
For the translation of indices in the printer, I think we should probably make that configurable by a printer setting? For example: if we always add 1 to indices in the |
I think you should also always add +1 to symbolic indices, consider e.g.:
c = 1
A[c+1]
Now when you do a loop and want to explicitly go from 0..4 to 1..5, you should at that point substitute the iteration variable:
do i = 0, 4
A[i + 1]
end do
to
do i = 1, 5
A[(i - 1) + 1]
end do
Note that sympy's Edit: Second edit: |
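To make the substitution argument concrete, here is a minimal sketch in plain Python (no SymPy involved) that checks that both loops touch the same array positions. The helper name `visited` is hypothetical and only serves this illustration; the loop bounds are treated as inclusive, as in a Fortran `do` loop.

```python
def visited(start, stop, index_expr):
    """Return the array positions touched by an inclusive loop start..stop."""
    return [index_expr(i) for i in range(start, stop + 1)]

# do i = 0, 4 with access A(i + 1)
zero_based = visited(0, 4, lambda i: i + 1)

# do i = 1, 5 with the loop variable substituted: A((i - 1) + 1)
one_based = visited(1, 5, lambda i: (i - 1) + 1)

print(zero_based == one_based)
```

Shifting the loop bounds and substituting `i - 1` for `i` in the body cancel each other out, which is why the shift has to be applied to both at the same time.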
Yes, |
Sometimes people (well, at least me) do code-generation not of a full loop, but just a fragment. Say that I want a fortran code snippet for this expression:
>>> from sympy import IndexedBase, Idx
>>> from sympy.calculus import apply_finite_diff
>>> x, y = map(IndexedBase, 'xy')
>>> i = Idx('i')
>>> x_list, y_list = zip(*[(x[i+j], y[i+j]) for j in range(-1,2)])
>>> expr = apply_finite_diff(1, x_list, y_list, x[i])
>>> from sympy.printing.fortran import fcode
>>> print(fcode(expr, source_format="free"))
((x(i + 1) - x(i))/(-x(i - 1) + x(i)) - 1)*y(i)/(x(i + 1) - x(i)) - (x(i &
+ 1) - x(i))*y(i - 1)/((x(i + 1) - x(i - 1))*(-x(i - 1) + x(i))) &
+ (-x(i - 1) + x(i))*y(i + 1)/((x(i + 1) - x(i - 1))*(x(i + 1) - &
x(i)))
Here I would rather not have any |
Yes, that's what I'm afraid of, but in my opinion you are currently working around a bug. The same sympy expression evaluates to something different in c and in fortran. You will now need a different value for |
Well, I politely disagree, and I would argue that what we have now is the expected behaviour (but not for our generated loops of course). 🙂 In modern Fortran we can't even make the assumption that the array index starts at 1 (cf. the |
So then how to keep the current behaviour, but also be able to consistently evaluate static and looped indices that results in the same behaviour when it is written to Fortran and C? |
That's the hard part, and I don't claim to have a silver bullet here. I think it's hard to reconcile with our current printers. Perhaps having more than one type of printer is a viable approach: A "low-level" one which makes close to zero assumptions, and then printers (maybe it should be called something else then) which carries state and are used to generate e.g. whole functions with consistent indexing for the specific language etc. |
It's hard to foresee all potential issues with competing designs, perhaps building such new facilities with a few disparate use cases in mind and putting them under: That way, we're not held back by back-comp. issues, and if all existing use cases can eventually be addressed with the new tools, then one can start looking at slowly deprecating the old and half-broken tools we have. But as usual, it mostly comes down to manpower, if you have the bandwidth for spearheading this, then that's awesome! I would try to make time to provide some (hopefully) useful input. |
I could work with that if the current printers were truly low-level. A function which would apply such substitutions is perfectly fine with me, e.g. if So either we always start at 0, always start at 1 or we have a per class or even per object definition of where they start. But the current situation is bugged and changing it either way is going to bug someone's current code. If someone is currently using |
I agree that the current printers do too many things. From my point of view a printer should mostly be concerned about parenthesization, syntax and translation of the SymPy tree into the natural representation of the targeted language (e.g. I'm speculating, but: I think we could break out those "non-controversial" methods into base classes for the respective language, which then our current backwards-compatible printers inherit from. That way you can reuse the "good parts" of existing printers, while not being hindered by arbitrary choices? (the |
Yes, the Defining a Also, just to reiterate the insanity of the current situation:
from sympy import fcode, MatrixSymbol, IndexedBase, Idx, symbols
i,j = symbols('i,j', cls = Idx, range = 5)
A, B = symbols('A,B', cls = IndexedBase, real = True)
M = MatrixSymbol('M', 5,5)
print(fcode(B[i,j] + M[i,j], assign_to=A[i,j]))
do j = 1, 5
do i = 1, 5
A(i, j) = B(i, j) + M(i + 1, j + 1)
end do
end do
|
I'm also not really sure if backwards compatibility is really an argument here. To me "backwards compatibility" is not an argument in and of itself. The reason you want to maintain backwards compatibility is that you don't want someone's current script to suddenly fail, or worse, its result to change without explicit warnings. And while that is certainly the situation if we were to change something, it is also true that currently someone's script might not be functioning correctly without them knowing either. I would consider that a worse situation. I'd also be happy with saying that |
I think your multi-tiered approach of getting start indices sounds promising (first object, then printer, not sure about global though, but I guess it's the immutable default of the printer you're referring to?). I agree that the current situation is quite insane indeed, but people have adapted, and we try very hard not to break backwards compatibility. In general, people do test their calling code, and adapt to our idiosyncrasies. I know it's frustrating, but in this case, I don't think we can change the output of But the fortran-printing of |
I think I did not explain the tiered system correctly. When printing an object with indices, we first look in a dictionary which contains object so e.g. if we print My problem with modifying the existing code is the amount of effort it takes to keep the current inconsistencies that I am not interested in keeping in the first place. But I also don't think that SymPy needs yet another "new standard" of code generation. Maybe what I am looking for (being able to write algorithms based on SymPy expressions) is better suited to being hosted as its own project. Not that it counters your general argument, but in the given example, it uses |
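One possible reading of the tiered lookup discussed here, sketched in plain Python. The details of the proposal are cut off in the thread, so the tier order (per-object dictionary, then printer setting, then a fixed default) and every name below (`GLOBAL_DEFAULT_START`, `start_index`, `shift`) are assumptions made for illustration, not an existing SymPy API.

```python
# Hypothetical last-resort default: keep SymPy-style 0-based indices.
GLOBAL_DEFAULT_START = 0

def start_index(name, per_object, printer_default=None):
    """Resolve the first valid index of array `name`, most specific tier first."""
    if name in per_object:           # tier 1: per-object override
        return per_object[name]
    if printer_default is not None:  # tier 2: printer-wide setting
        return printer_default
    return GLOBAL_DEFAULT_START      # tier 3: fixed fallback

def shift(index, name, per_object, printer_default=None):
    """Translate a 0-based index into the target array's own numbering."""
    return index + start_index(name, per_object, printer_default)

# Assumed setup: M is declared to start at -2, the Fortran printer defaults to 1.
per_object = {"M": -2}
print(shift(0, "A", per_object, printer_default=1))
print(shift(0, "M", per_object, printer_default=1))
print(shift(0, "A", {}))
```

With a lookup like this, a printer would only need to call one function per index, and the question of which convention applies is answered in a single place.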
As for the |
OK, I think I see what you mean with the tiered system. That approach also sounds good. A separate project is of course easier to maintain since one gets to decide on deprecation rules etc. And it's always easier to incorporate a project into the SymPy codebase (once it has stabilized) than vice versa. For visibility you can add a link to it from SymPy's webpage (once it's somewhat stable). Yes, you're probably right about the |
I still think that |
Note that |
Several of the less controversial problems brought up in sympy#23998 are addressed.
I'm actually starting to like this idea a bit more. Of course, the question then is what is "controversial". Maybe we should consider any processing as controversial. So that would include e.g. unrolling indexes for C. But also The above operations could easily be coded in simple functions that return a new It's not clear to me how to separate the classes though, since a lot of the extra logic in |
Perhaps just new classes overriding
>>> rewritten = rewrite(exprs, [einstein_sum_contraction, unroll_assignments, expand_integer_powers])
>>> actual_code = fcode(rewritten)
I find that the Python community often suffers from feature creep in ever growing lists of accepted keyword arguments to swiss-army-knife-like functions (I'm looking at you |
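The pipeline idea can be sketched in a few lines of plain Python. This assumes `rewrite` is a helper that would still have to be written; the two passes below are toy stand-ins operating on strings, not the SymPy transformations named in the snippet.

```python
from functools import reduce

def rewrite(exprs, passes):
    """Apply each pass, in order, to every expression and return the results."""
    return [reduce(lambda expr, p: p(expr), passes, e) for e in exprs]

# Toy passes (assumptions for illustration only).
def expand_square(expr):
    """Replace the literal text 'x**2' with an explicit product."""
    return expr.replace("x**2", "x*x")

def strip_spaces(expr):
    """Remove all spaces from the expression text."""
    return expr.replace(" ", "")

rewritten = rewrite(["x**2 + 1", "y - x**2"], [expand_square, strip_spaces])
print(rewritten)
```

Each pass stays a small function with one job, and the caller decides explicitly which ones run and in what order, instead of toggling them through printer keyword arguments.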
That's exactly what I have in mind, except that currently the printers are doing some of those things, so it would be difficult to get no clashes if you can't turn off the current behaviour. The problem with creating a new class that overrides the "controversial" methods in the current classes would be that there are good odds that someone in the future is going to create a new "controversial" method in the current classes, which then propagates through to the classes that are supposed to be "clean" and undoing that would mean breaking backwards compatibility when it is finally detected as an issue. What did you have in mind for |
You're right, I can't think of any guard against that short of manual review of future pull requests.
I was just referring to create_expand_pow_optimization. |
ArraySymbol is just wrong using the bracket notation. I think Indexed should print as I(1,1) like MatrixSymbol does, considering that would be the standard in Fortran. Changing it would potentially break currently working code. (If anyone is using it).