Repeated phrases should be done on corpora #5

Closed
benel opened this issue Jun 6, 2011 · 5 comments

@benel
Member

benel commented Jun 6, 2011

... because getting them for the whole database can take far too long.

Tests done in Liège: more than one minute for a 50 MB phrase list.

@christophe-lejeune
Member

Moreover, in its current state, the 50 MB phrase list is completely reloaded each time "repeated phrases" is hit (even when nothing has changed in the database).

@benel
Member Author

benel commented Oct 5, 2011

Yes, I don't understand why it's not cached.
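
One possible way to avoid re-downloading an unchanged list is an HTTP conditional request. The sketch below assumes the server returns an ETag for the view, which this thread does not confirm; makeCachedGet and the request transport are hypothetical names introduced only for illustration.

```javascript
// Minimal client-side cache keyed by URL and revalidated with If-None-Match.
// `request(url, headers)` is a hypothetical transport that returns a promise
// of { status, etag, body }; plug in whatever HTTP client the page uses.
function makeCachedGet(request) {
  const cache = new Map(); // url -> { etag, body }
  return async function (url) {
    const hit = cache.get(url);
    const headers = hit ? { "If-None-Match": hit.etag } : {};
    const res = await request(url, headers);
    // 304 Not Modified: the server confirms our stored copy is still valid.
    if (res.status === 304 && hit) return hit.body;
    if (res.status !== 200) throw new Error("Unexpected status " + res.status);
    if (res.etag) cache.set(url, { etag: res.etag, body: res.body });
    return res.body;
  };
}

// Stand-alone demo with a stub server (no network access needed).
let demoFullTransfers = 0;
const demoRequest = async (url, headers) => {
  if (headers["If-None-Match"] === "rev-1") return { status: 304 };
  demoFullTransfers += 1;
  return { status: 200, etag: "rev-1", body: "phrase list" };
};
const demoGet = makeCachedGet(demoRequest);
demoGet("/phrases")
  .then(() => demoGet("/phrases"))
  .then(() => console.log("full transfers:", demoFullTransfers));
```

With this approach the second hit still costs one round trip, but the body is only sent again when the data has changed.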

@benel
Member Author

benel commented Oct 5, 2011

A complementary optimization would be to call this view asynchronously (with jQuery's get, for example).
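
A rough sketch of what asynchronous loading could look like, so the page stays usable while the list is fetched. The function name loadRepeatedPhrases and the injected httpGet and render helpers are assumptions for illustration, not existing code in this project.

```javascript
// Load the phrase list in the background and report progress to the user.
// `httpGet(url)` must return a promise (or thenable) of the response body;
// `render(text)` displays a message or the result. Both are hypothetical,
// injected helpers so the sketch does not depend on a particular library.
async function loadRepeatedPhrases(url, httpGet, render) {
  render("Loading repeated phrases...");
  try {
    render(await httpGet(url));
  } catch (err) {
    render("Could not load repeated phrases.");
  }
}

// Stand-alone demo with a stub transport (no network access needed).
const stubGet = (url) => Promise.resolve("phrase list for " + url);
loadRepeatedPhrases("/phrases", stubGet, (text) => console.log(text));
```

In a page that already loads jQuery, the transport could be something like (url) => $.get(url), and render could write into a placeholder element; the URL and the element are left to the application.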

@benel
Member Author

benel commented Oct 5, 2011

The cache seems to work on smaller corpora. The problem could be due to the size of the view...

@ghost assigned benel Oct 6, 2011
@benel
Member Author

benel commented Oct 7, 2011

When it is restricted to a single corpus (e.g. 2009_ordinateur), the cost is 5.4 MB in 6 s.

If a list filters phrases to keep only repeated ones, the cost is 730 kB in 32 s: a much smaller payload, but a much longer wait.
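
The filtering step itself can be sketched as follows. The row shape (phrase as key, occurrence count as value), the name keepRepeated, and the sample rows are assumptions made up for illustration, not the project's actual list code or data.

```javascript
// Keep only phrases that occur at least `minCount` times.
// Hypothetical row shape: { key: <phrase>, value: <occurrence count> }.
function keepRepeated(rows, minCount = 2) {
  return rows.filter((row) => row.value >= minCount);
}

// Made-up sample rows, not taken from a real corpus.
const sampleRows = [
  { key: "personal computer", value: 3 },
  { key: "operating system", value: 1 },
  { key: "hard drive", value: 2 },
];
console.log(keepRepeated(sampleRows).map((row) => row.key));
```

Filtering on the server shrinks what goes over the wire, which matters less on localhost than it would on a slower link.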

Note: Tests done

  • with curl, because REST Client has problems with the raw view,
  • on localhost (it would be interesting to repeat them against a remote server, where the payload size matters more).

@benel closed this as completed Oct 27, 2011

2 participants