Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Factor iterator in suffix tree of word #25526

Closed
sagetrac-evandomme mannequin opened this issue Jun 7, 2018 · 10 comments
Closed

Factor iterator in suffix tree of word #25526

sagetrac-evandomme mannequin opened this issue Jun 7, 2018 · 10 comments

Comments

@sagetrac-evandomme
Copy link
Mannequin

sagetrac-evandomme mannequin commented Jun 7, 2018

We improve the algorithm computing the factor iterator in the implicit suffix tree of a word.

BEFORE:

sage: w = words.FibonacciWord([0,1])
sage: it = w[:10000].suffix_tree().factor_iterator()
sage: %time L = list(it)
Processus arrêté

AFTER:

sage: w = words.FibonacciWord([0,1])
sage: it = w[:10000].suffix_tree().factor_iterator()
sage: %time L = list(it)
CPU times: user 14 s, sys: 504 ms, total: 14.5 s
Wall time: 14.5 s
sage: len(L)
24337601

CC: @videlec @seblabbe

Component: combinatorics

Keywords: thursdaysbdx

Author: Vincent Delecroix, Élise Vandomme

Branch/Commit: d7bcfdf

Reviewer: Sébastien Labbé

Issue created by migration from https://trac.sagemath.org/ticket/25526

@sagetrac-evandomme sagetrac-evandomme mannequin added this to the sage-8.3 milestone Jun 7, 2018
@sagetrac-evandomme
Copy link
Mannequin Author

sagetrac-evandomme mannequin commented Jun 7, 2018

@seblabbe
Copy link
Contributor

Commit: d7bcfdf

@seblabbe
Copy link
Contributor

comment:2

Does the branch needs review?


New commits:

d7bcfdfBetter implmentation of the factor iterator in suffix tree of words

@seblabbe
Copy link
Contributor

comment:3

Can you provide an example in the description of the ticket that shows how better the code has become with the improvement you propose?

@videlec
Copy link
Contributor

videlec commented Jun 21, 2018

comment:5

Replying to @seblabbe:

Can you provide an example in the description of the ticket that shows how better the code has become with the improvement you propose?

No. The reason is that there is no guarantee that iterating through a dictionary via iteritems you always get the same order (you can check that in the doctests there are sorted everywhere). But before this ticket, it was clearly not a DFS

sage: for w in Word("cacao").suffix_tree().factor_iterator(): print w
c
a
ac
aca
ao
acao
o
ca
cac
caca
cao
cacao

With this ticket it is

sage: for w in Word("cacao").suffix_tree().factor_iterator(): print w
a
ao
ac
aca
acao
o
c
ca
cao
cac
caca
cacao

@seblabbe
Copy link
Contributor

comment:6

Replying to @videlec:

Replying to @seblabbe:

Can you provide an example in the description of the ticket that shows how better the code has become with the improvement you propose?

No.

I was asking in the description of the ticket not in a doctest.

@seblabbe
Copy link
Contributor

comment:7

Before this ticket, this just eats all the memory and CPU:

sage: w = words.FibonacciWord([0,1])
sage: it = w[:10000].suffix_tree().factor_iterator()
sage: %time L = list(it)
Processus arrêté

With the branch of this ticket, I get:

sage: w = words.FibonacciWord([0,1])
sage: it = w[:10000].suffix_tree().factor_iterator()
sage: %time L = list(it)
CPU times: user 14 s, sys: 504 ms, total: 14.5 s
Wall time: 14.5 s
sage: len(L)
24337601

So, it is definitely an improvement.

@seblabbe
Copy link
Contributor

Reviewer: Sébastien Labbé

@seblabbe

This comment has been minimized.

@vbraun
Copy link
Member

vbraun commented Jun 22, 2018

Changed branch from u/evandomme/factor_iterator_in_suffix_tree_of_word to d7bcfdf

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants