Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Special-case pol*term, term*pol for generic polynomials #25317

Closed
mezzarobba opened this issue May 9, 2018 · 12 comments
Closed

Special-case pol*term, term*pol for generic polynomials #25317

mezzarobba opened this issue May 9, 2018 · 12 comments

Comments

@mezzarobba
Copy link
Member

sage: Pol0.<t> = ZZ[]
....: Pol1.<x> = Pol0[]
....: p = ((t^2 - 3*t)*x^10 + (-3*t^2 - 1)*x^9 + (-t^2 - t + 1)*x^8 + (-t^2 - 1)*x^7 + (-4*t^2 + 14*t - 4)*x^6 + (2*t^2 + 2)*x^5 + (-t^2 + 2*t + 2)*x^4 + (3*t^2 - t + 3)*x^3 + (t^2 - t + 156)*x^2 + (7*t - 3)*x - t^2 + 2*t)
....: q = 2*x

8.3.beta0:

sage: %timeit p*q
The slowest run took 2129.02 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 8.02 µs per loop

This ticket:

The slowest run took 8.63 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.21 µs per loop

Component: basic arithmetic

Author: Marc Mezzarobba, Travis Scrimshaw

Branch/Commit: d37db8d

Reviewer: Travis Scrimshaw, Marc Mezzarobba

Issue created by migration from https://trac.sagemath.org/ticket/25317

@tscrim
Copy link
Collaborator

tscrim commented May 10, 2018

comment:3

Perhaps it would be better to have a specialized cdef scale_and_shift method that can be overwritten by the specific data structures in order to avoid creating an intermediate object? In fact, the generic _lmul_ and _rmul_ just lift the coefficient to the polynomial ring and multiply. I would also rewrite _mul_ and _rmul_ to use this scale_and_shift with a shift of 0 to avoid code duplication.

@mezzarobba
Copy link
Member Author

comment:4

Replying to @tscrim:

Perhaps it would be better to have a specialized cdef scale_and_shift method that can be overwritten by the specific data structures in order to avoid creating an intermediate object? In fact, the generic _lmul_ and _rmul_ just lift the coefficient to the polynomial ring and multiply. I would also rewrite _mul_ and _rmul_ to use this scale_and_shift with a shift of 0 to avoid code duplication.

Yes, why not... I'm not sure it is worth the pain now—I'm not really trying to make this operation as fast as possible, just to avoid the worst of the overhead (especially in the Karatsuba range). So I'm not going to implement your suggestion now, but perhaps later, and in any case I can review the implementation if you want to do it yourself.

Thanks for you comments in any case!

@tscrim
Copy link
Collaborator

tscrim commented May 10, 2018

comment:5

Replying to @mezzarobba:

Replying to @tscrim:

Perhaps it would be better to have a specialized cdef scale_and_shift method that can be overwritten by the specific data structures in order to avoid creating an intermediate object? In fact, the generic _lmul_ and _rmul_ just lift the coefficient to the polynomial ring and multiply. I would also rewrite _mul_ and _rmul_ to use this scale_and_shift with a shift of 0 to avoid code duplication.

Yes, why not... I'm not sure it is worth the pain now—I'm not really trying to make this operation as fast as possible, just to avoid the worst of the overhead (especially in the Karatsuba range). So I'm not going to implement your suggestion now, but perhaps later, and in any case I can review the implementation if you want to do it yourself.

I will do it today then.

@tscrim
Copy link
Collaborator

tscrim commented May 11, 2018

Reviewer: Travis Scrimshaw

@tscrim
Copy link
Collaborator

tscrim commented May 11, 2018

comment:6

Running the same test as in the description.

8.3.beta0:

sage: %timeit p*q
The slowest run took 2129.02 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 8.02 µs per loop

Your branch:

sage: %timeit p*q
The slowest run took 9.36 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.78 µs per loop

My branch:

The slowest run took 8.63 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.21 µs per loop

So my changes get us an extra ~20%.


New commits:

4a9c966Merge branch 'u/mmezzarobba/generic_pol_times_term' of git://trac.sagemath.org/sage into u/tscrim/generic_pol_times_term
d37db8dImproving the speed of single term multiplication.

@tscrim
Copy link
Collaborator

tscrim commented May 11, 2018

@tscrim
Copy link
Collaborator

tscrim commented May 11, 2018

Changed commit from 4203441 to d37db8d

@mezzarobba
Copy link
Member Author

Changed reviewer from Travis Scrimshaw to Travis Scrimshaw, Marc Mezzarobba

@mezzarobba

This comment has been minimized.

@mezzarobba
Copy link
Member Author

Changed author from Marc Mezzarobba to Marc Mezzarobba, Travis Scrimshaw

@mezzarobba
Copy link
Member Author

comment:7

Great, thanks!

@vbraun
Copy link
Member

vbraun commented May 15, 2018

Changed branch from u/tscrim/generic_pol_times_term-25317 to d37db8d

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants