Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

a much faster longest_common_prefix for words #19322

Closed
videlec opened this issue Oct 1, 2015 · 11 comments
Closed

a much faster longest_common_prefix for words #19322

videlec opened this issue Oct 1, 2015 · 11 comments

Comments

@videlec
Copy link
Contributor

videlec commented Oct 1, 2015

I had to do some computations of the following kind... which are damn slow

sage: w = words.FibonacciWord()[:10000]
sage: %time L = [[len(w[n:].longest_common_prefix(w[n+fibonacci(i):])) for i in range(5,20)] for n in range(1,1000)]
CPU times: user 20.6 s, sys: 44 ms, total: 20.7 s
Wall time: 20.5 s

sage: w = Words([0,1])(list(words.FibonacciWord()[:10000]))
sage: %time L = [[len(w[n:].longest_common_prefix(w[n+fibonacci(i):])) for i in range(5,20)] for n in range(1,1000)]
CPU times: user 7.99 s, sys: 56 ms, total: 8.04 s
Wall time: 7.93 s

and with the branch

sage: w = Words([0,1])(list(words.FibonacciWord()[:10000]))
sage: %time L = [[len(w[n:].longest_common_prefix(w[n+fibonacci(i):])) for i in range(5,20)] for n in range(1,1000)]
CPU times: user 172 ms, sys: 0 ns, total: 172 ms
Wall time: 168 ms

We also implement longest_common_suffix and fix the following annoying feature of has_prefix

sage: W = Words([0,1,2])
sage: w = W([0,1,0,2])
sage: w.has_prefix(words.FibonacciWord())
False

CC: @seblabbe

Component: combinatorics

Author: Vincent Delecroix

Branch/Commit: ebbc28d

Reviewer: Nathann Cohen

Issue created by migration from https://trac.sagemath.org/ticket/19322

@videlec videlec added this to the sage-6.9 milestone Oct 1, 2015
@videlec
Copy link
Contributor Author

videlec commented Oct 1, 2015

Commit: fe306f9

@videlec
Copy link
Contributor Author

videlec commented Oct 1, 2015

Branch: u/vdelecroix/19322

@videlec
Copy link
Contributor Author

videlec commented Oct 1, 2015

New commits:

fe306f9Trac 19322: faster longest_common_prefix

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Oct 2, 2015

comment:2

Hello Vincent,

Looks good. A couple of remarks:

  1. You can use min/max in Cython code. Contrary to Python :-P

  2. longest_suffix/Python case: what about 'caching' len(other) instead of recomputing it at every test?

  3. longest prefix/Python case: the code generated by python for this 'slice' is scary. Isn't it possible to reimplement it without it? Plus if you need to import 'islice' it is maybe better to do so at module level, it's not thaaat bad either and it may happen that the prefix be one character long, in which case importing stuff could be comparatively more expensive (?).

  4. Have you considered returning a 'new word' (through new_c) even when that new word is equal to one of self and other? It would simplify the code, and I don't know if it matters as it does not allocate new memory anyway?

  5. I do not understand your 'not able to initialize a word from [..]'. The code does not try to initialize a word, does it? It's more something like "unsupported type"?..

Nathann

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Oct 4, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

ebbc28dTrac 19322: reviewer comments

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Oct 4, 2015

Changed commit from fe306f9 to ebbc28d

@videlec
Copy link
Contributor Author

videlec commented Oct 4, 2015

comment:4

Hello Nathann,

I implemented your remarks excepted 4. It does allocate memory to call _new_c: the one you need for a Python object. But I don't know whether it is justified or not (ie the ratio between "simple code" versus "efficient code").

Vincent

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Oct 5, 2015

Reviewer: Nathann Cohen

@nathanncohen
Copy link
Mannequin

nathanncohen mannequin commented Oct 5, 2015

comment:5

Yoooooooooo !

I implemented your remarks excepted 4. It does allocate memory to call _new_c: the one you need for a Python object. But I don't know whether it is justified or not (ie the ratio between "simple code" versus "efficient code").

Okay okay, you decide, just felt like bringing it up when I reviewed this code.

Stamped, and good to go.

Nathann

@videlec
Copy link
Contributor Author

videlec commented Oct 5, 2015

comment:6

Thanks!

@vbraun
Copy link
Member

vbraun commented Oct 12, 2015

Changed branch from u/vdelecroix/19322 to ebbc28d

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants