Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ccode: Add capability to _print_Indexed() to use user-defined strided schemes and offsets #12490

Merged
merged 7 commits into from Apr 27, 2017

Conversation

ArifAhmed1995
Copy link
Contributor

@ArifAhmed1995 ArifAhmed1995 commented Apr 4, 2017

As mentioned in #12464 , ccode unrolled 2D Indexed objects in row-major format. This PR adds support for unrolling in column-major format as well.

Example :

>>> from sympy import *
>>> A = IndexedBase('A', shape=(5,3))
>>> i, j = Idx('i'), Idx('j')
>>> ccode(A[i,j])
'A[3*i + j]'
>>> A = IndexedBase('A', shape=(5,3), ordering=1)
>>> ccode(A[i,j])
'A[i + 5*j]'

UPDATE : Now support for strided schemes and offset have been added

Fixes #12464

@bjodah
Copy link
Member

bjodah commented Apr 5, 2017

Values of 0 & 1 for ordering strike me as somewhat opaque. Perhaps we should have a strides property instead and give strides='C' and strides='F' special meaning in __init__ which would lead to strides being calculated from shape (names chosen in analogy with NumPy naming). If we go with strides we should also add offset to make it complete. That would allow the user to generate code which works with e.g. dgbtrf.

EDIT: the default should then be strides=None which the C & Fortran printers would interpret as row-major and column-major order respectively.

@ArifAhmed1995
Copy link
Contributor Author

@bjodah Sorry for the late reply. Last week was quite hectic.
Can you show me an example output such that it's compatible with dgbtrf ?
I've searched a lot on the net but can't find a simple example.

@bjodah
Copy link
Member

bjodah commented Apr 15, 2017

@ArifAhmed1995 no worries. You don't need to worry about dgbtrf in particular, but what we need is strides and offset, something along these lines (mock-up code):

>>> s, o = symbols('s o', integer=True)  # stride and offset
>>> A = IndexedBase('A', shape=(29,29), strides=(1, s), offset=o)
>>> i, j = Idx('i'), Idx('j')
>>> ccode(A[i,j])
'A[i + j*s + o]'

or more general:

>>> l, m, n, o = symbols('l m n o', integer=True)  # strides and offset
>>> A = IndexedBase('A', strides=(l, m, n), offset=o)
>>> i, j, k = Idx('i'), Idx('j'), Idx('k')
>>> ccode(A[i, j, k])
'A[i*l + j*m + k*n + o]'

@ArifAhmed1995 ArifAhmed1995 changed the title ccode: Add capability to _print_Indexed() to use column-major ordering ccode: Add capability to _print_Indexed() to use user-defined strided schemes and offsets Apr 16, 2017
@ArifAhmed1995
Copy link
Contributor Author

@bjodah Can you please review again ?

Copy link
Member

@bjodah bjodah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is starting to look really good. I left some minor comments.

if offset is None:
offset = ""
return "%s[%s]" % (self._print(expr.base.label),
str(sum([x[0]*x[1]
Copy link
Member

@bjodah bjodah Apr 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of str you can use self._print here too (it is most probably safe to use str, but better safe than sorry if the user calculates indices in some fancy way)


if offset is None:
offset = ""
return "%s[%s]" % (self._print(expr.base.label),
Copy link
Member

@bjodah bjodah Apr 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of having a separate return statement we can calculate the missing strides here and let the method use the same logic further down as when strides is given explicitly.

self._print(elem) + str(offset))
elif isinstance(strides, tuple):
if offset is None:
offset = ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can let offset be 0 by default and add it to the sum, i.e. sum(...) + offset

@ArifAhmed1995
Copy link
Contributor Author

@bjodah I've made the required changes.

@bjodah
Copy link
Member

bjodah commented Apr 18, 2017

Great, it looks good on my end.

Pinging @Upabjojr since he might want to add something (he has worked on Indexed extensively).
Also pinging @asmeurer since he is in general interested in codegen related work.

@Upabjojr
Copy link
Contributor

Oh, I was just suggesting to abolish Idx and IndexedBase on another discussion, apparently IndexedBase keeps getting expanded.

@bjodah
Copy link
Member

bjodah commented Apr 18, 2017

@Upabjojr I see. (#12533)

I agree that IndexedBase is a bit odd, it doesn't follow SymPy conventions, e.g. it doesn't keep all its data in args. Would you suggest implementing this somewhere else? NDArray? Would adding strides and offset to it be intrusive?

What I do like about Indexed is that it makes very few assumptions.
Since Indexed isn't going anywhere anytime soon I think this PR is valuable as is.

@Upabjojr
Copy link
Contributor

Would you suggest implementing this somewhere else? NDArray? Would adding strides and offset to it be intrusive?

I was actually thinking about it. They could be either stored in Indexed directly, or be given as parameters to the code printer.

Take into account that Indexed can also contain an array now (it's first args is not always an IndexedBase), so this PR definitely needs a fix.

If you give a symbolic index to an Array, you'll get an Indexed object, for example.

@bjodah
Copy link
Member

bjodah commented Apr 18, 2017

@Upabjojr I didn't know that Indexed could contain an array. We are trying to move away from passing info to the printers via side-channels. Implementing this in both IndexedBase and NDArray doesn't feel right (code-duplication). Any chance of NDArray and Indexed both inheriting from some baseclass where these attributes may be (optionally) stored?

Strides and offset are really specific to code-generation so perhaps the functionality should reside in a Indexed-like class in sympy.codegen.ast. But that would introduce yet another sympy class with indexing functionality...

@Upabjojr
Copy link
Contributor

Printing code of an Indexed with Array is currently not tested.

I think that Array could be expanded to support offset, but I don't see how strided could be useful.

Maybe just a global attribute dictionary? I don't like having so many specific arguments in the single classes. Also, I wouldn't create a class and inheritance-of-such-class just for these two parameters.

All other CAS's have attribute support for their objects, SymPy does not.

@bjodah
Copy link
Member

bjodah commented Apr 19, 2017

@Upabjojr for high performance code strides are used all over the place. Mainly for one of two reasons:

  1. Avoid copies (lazy transpose)
  2. Enhance cache-alignment for optimized performance (lapack takes a "leading dimension" argument to all functions accepting matrices, that is a stride)

Attribute support for objects would definitely be one way to solve these kinds of problems.

@Upabjojr
Copy link
Contributor

Attribute support for objects would definitely be one way to solve these kinds of problems.

+1, what about a global attribute dictionary in sympy.core? Atom and/or AtomicExpr could be equipped with a .set_attribute(a) method.

@bjodah
Copy link
Member

bjodah commented Apr 19, 2017

In that case I would actually like to see some mechanism to set attributes already during initialization. e.g. intercepting an "attributes" kwarg from assumptions:

>>> A = IndexedBase('A', attributes=dict(strides=(3, 5, 7), offset=3))

or

>>> x = Symbol('x', attributes=dict(domain_specific_knowledge=42), real=True, positive=True)

This would be major change so perhaps we should open a separate issue for it and discuss it on the mailing list?

@Upabjojr
Copy link
Contributor

Yes, I'd suggest the forum.

@ArifAhmed1995
Copy link
Contributor Author

@bjodah @Upabjojr So, I should include support for Arrays as well ?

@Upabjojr
Copy link
Contributor

So, I should include support for Arrays as well ?

Just wait. Let's ask on the mailing list about attributes.

@Upabjojr
Copy link
Contributor

https://groups.google.com/forum/#!topic/sympy/VVaDc-yFtlQ

I asked the opinion on introducing attributes.

@Upabjojr
Copy link
Contributor

Also, I have a recurrent thought about experimenting on Indexed with RandomSymbol. It would be nice to model stochastic processes. In that case strided and offset would not mean anything. But maybe a new class is more suitable.

@Upabjojr
Copy link
Contributor

Upabjojr commented Apr 22, 2017

What about just having some static dict in Indexed to point to the instances offsets and strided?

Something like:

class Indexed(...):
    ...
    _instance_offsets = {}
    _instance_strided = {}

    @property
    def offset(self):
        return self._instance_offsets.get(self, 0)

    ...

    def set_offset(self, offset):
        self._instance_offsets[self] = offset

So we can set the value after creation without clumbering the arguments too much.

@bjodah
Copy link
Member

bjodah commented Apr 23, 2017

I fail to see why this is better than what we had from the start (simple attributes), @Upabjojr could you explain the benefits?

EDIT: @ArifAhmed1995: 8f1e5f2 is not quite what Francesco meant, but I think you can wait with adding changes until we agree on a good way forward.

EDIT2: @Upabjojr I now see how this would allow us to add this information to Indexed instead of IndexedBase but I don't think that is quite what we want, consider this suggested API:

>>> A = IndexedBase('A', strides=(l, m, n), offset=o)
>>> i, j, k = Idx('i'), Idx('j'), Idx('k')
>>> ccode(A[i, j, k])
'A[i*l + j*m + k*n + o]'
>>> ccode(A[3, j, 5])
'A[3*l + j*m + 5*n + o]'

Adding strides to Indexed would not support that functionality.

@Upabjojr
Copy link
Contributor

I fail to see why this is better than what we had from the start (simple attributes), @Upabjojr could you explain the benefits?

We don't put those arguments in the defining arguments of the object, which is good because they are not defining arguments. Plus, we solve the issue of having Indexed depending on non-IndexedBase, as Indexed can have other objects in it.

is not quite what Francesco meant, but I think you can wait with adding changes until we agree on a good way forward.

Yes, I meant for the static dictionaries to be in Indexed, not in IndexedBase. We need to get rid of the dependence on IndexedBase, because Indexed is not guaranteed to have an IndexedBase (it could have an array, and in the future I would like to extend RandomSymbol to be supported in Indexed).

Though, I am favorable to restore the strides and offset elements in the class constructor of Indexed, just don't store them into args, but rather inside a static member of Indexed.

@bjodah
Copy link
Member

bjodah commented Apr 25, 2017

@Upabjojr I disagree. Adding this to Indexed does not help (see my "EDIT2" comment above).

@Upabjojr
Copy link
Contributor

Adding strides to Indexed would not support that functionality.

I mean, add strides to IndexedBase but store the strides info in Indexed. My previous code snippet is a bad example. The strides and offset properties in Indexed should not interact with IndexedBase but find the info in Indexed.

@bjodah
Copy link
Member

bjodah commented Apr 25, 2017

@Upabjojr oh, I see. Yes that would work. Thanks for explaining. @ArifAhmed1995 do you follow?

@bjodah
Copy link
Member

bjodah commented Apr 25, 2017

@Upabjojr what do you think about e79cc4b?

@Upabjojr
Copy link
Contributor

In e79cc4b expressions may change behavior if rebuilt or if they undergo rewriting rules, as strides and offset are not stored in a persistent manner. It's the reason why I proposed a static dict.

@bjodah
Copy link
Member

bjodah commented Apr 25, 2017

I see. Could you provide a minimal example where this might occur? (I think we should add a test to guard against it)

@Upabjojr
Copy link
Contributor

In [1]: A = IndexedBase("A")

In [2]: A.test = "This is a test"

In [3]: As = A.subs(Symbol("A"), Symbol("B")).subs(Symbol("B"), Symbol("A"))

In [4]: A.test
Out[4]: 'This is a test'

In [7]: As.test
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-3680e7e751f7> in <module>()
----> 1 As .test

AttributeError: 'IndexedBase' object has no attribute 'test'

@bjodah
Copy link
Member

bjodah commented Apr 26, 2017

Thanks, since it works for shape:

In [1]: from sympy import IndexedBase, Symbol

In [2]: A = IndexedBase("A", shape=(42,43))

In [3]: A.shape
Out[3]: (42, 43)

In [4]: As = A.subs(Symbol("A"), Symbol("B")).subs(Symbol("B"), Symbol("A"))

In [5]: A.shape
Out[5]: (42, 43)

In [6]: As.shape
Out[6]: (42, 43)

I was thinking that we could add strides and offset in analogy to shape (since it is also an optional attribute even though one might argue that it is a more important attribute than e.g. strides, but that really depends on context) in this PR and maybe address the whole attribute question in a subsequent PR?

But to my surprise I can't get that to work (see: 57ae527):

In [1]: from sympy import IndexedBase, Symbol; A = IndexedBase("A", strides=(3,1)); A.strides
Out[1]: (3, 1)

In [2]: As = A.subs(Symbol("A"), Symbol("B")).subs(Symbol("B"), Symbol("A")); A.strides
Out[2]: (3, 1)

In [3]: As.strides

In [4]: 

I've compared how shape and strides are handled in that commit of mine line by line, but I fail to see how they differ, any clue?

One of the strengths of SymPy (and Python projects in general) is that the source code is rather accessible, but with these things there is quite a bit of "magic" going on, so that so called "local reasoning" fails.

@Upabjojr
Copy link
Contributor

I've compared how shape and strides are handled in that commit of mine line by line, but I fail to see how they differ, any clue?

That commit ( 57ae527 ) does not preserve the positional order of args. That is, when the object is reconstructed, the arguments may be inserted in a different order.

One of the strengths of SymPy (and Python projects in general) is that the source code is rather accessible, but with these things there is quite a bit of "magic" going on, so that so called "local reasoning" fails.

That's why I'm a bit skeptical about putting too many arguments in a single object. I think we should just put the essential information the the args and leave extra info somewhere else.

@bjodah
Copy link
Member

bjodah commented Apr 26, 2017

Thank you.
After having experimented a bit more with storing everything in args, where optionals would be None I'm also convinced that it is not the way forward (9e7af5a)

I started sketching out an API for attributes, but I still need to make the super class (Basic?) look for attributes: 8546ffa

This is trickier than I expected.

@Upabjojr
Copy link
Contributor

I believe we shouldn't make this PR too long. Modifying Basic should be done with care and possibly discussed on the mailing list.

Also, attributes don't need to be subclasses of Basic if they are not stored in the args.

@bjodah
Copy link
Member

bjodah commented Apr 26, 2017

Agreed (that's why I didn't push those commits to this branch).
The static dictionary approach doesn't help with objects losing e.g. strides when applying subs, at least I can't see how to make that work in a clean way (i.e. allow this test to pass).

So is e79cc4b good enough for this PR?

@Upabjojr
Copy link
Contributor

The static dictionary approach doesn't help with objects losing e.g. strides when applying subs, at least I can't see how to make that work in a clean way (i.e. allow this test to pass).

I introduced the static dictionary approach exactly to solve this issue. The static dictionary is a permanent data structure that is not affected by the rewriting rules.

So is e79cc4b good enough for this PR?

We can merge if you want.

Maybe we can open another issue to remember to fix the reconstruction problem. Anyway, I would like to eventually get rid of IndexedBase (we don't really need that object, we could just use a Symbol with some attributes).

@bjodah bjodah merged commit 0577d28 into sympy:master Apr 27, 2017
@bjodah
Copy link
Member

bjodah commented Apr 27, 2017

@Upabjojr OK, so I opened #12593.
If you can support that test using static dictionaries I think it is worth a PR.

EDIT: I'm fine with giving up on IndexedBase if we can support the same features using Symbol (which attributes could very well do, the question is just where to document the different attributes used throughout the code base, the current approach offers a nice place to put docstrings.

@ArifAhmed1995
Copy link
Contributor Author

This discussion was quite valuable. I learnt quite a bit more about Python.

@ArifAhmed1995 ArifAhmed1995 deleted the print_indexed branch April 27, 2017 13:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CCodePrinter._print_Indexed assumes row-major ordering
3 participants