Scale: No. of Hits on Host, Aggregate and Clustering Results #19

Open
PsypherPunk opened this issue Dec 16, 2013 · 3 comments

Comments

@PsypherPunk
Contributor

Scale: # hits on host, aggregate and clustering results.

Spreadsheet line: 2
For whom? BNF, BL, DN, IA
Notes: New CDX server should enable this.
Est. Milestone: 2.x.x

@anjackson
Member

I believe this concerned ensuring that the performance and user experience were acceptable when a particular page or host had a very large number of instances.

@saraaubry

The question of scaling to a large number of hits came up at BnF while we were studying our largest domains. lemonde.fr has over 4 million hits.
http://web.archive.org/web/*/www.lemonde.fr/* works fine
http://web.archive.org/web/*/www.google.com/* doesn't work
We currently have "maxRecords" set to 100 000, and the way Wayback iterates over the CDX files, in the order in which they were configured, keeps it from displaying all results. This may not be a Wayback-only issue; it goes together with the management of large and multiple CDX files.
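
To make the truncation concrete, here is a minimal sketch of the behaviour described above. It is not Wayback's actual code: the record layout, the `CDX_FILES` data, and the function names `query_sequential` and `hits_per_year` are all hypothetical, and the cap is set to 3 instead of 100 000 so the effect shows up on a handful of records. The second function is only one possible reading of "aggregate results": it returns one count per year instead of every capture.

```python
from collections import Counter

# Hypothetical stand-ins for CDX files, listed in configured order.
# Each record is (url_key, 14-digit timestamp); real CDX lines carry more fields.
CDX_FILES = [
    [("fr,lemonde)/", "20110101000000"), ("fr,lemonde)/", "20110601000000")],
    [("fr,lemonde)/", "20120101000000"), ("fr,lemonde)/", "20120601000000")],
    [("fr,lemonde)/", "20130101000000")],
]


def query_sequential(files, url_key, max_records):
    """Scan files in configured order and stop once max_records is reached."""
    results = []
    for records in files:
        for record in records:
            if record[0] == url_key:
                if len(results) >= max_records:
                    return results
                results.append(record)
    return results


def hits_per_year(files, url_key):
    """Aggregate matching captures into one count per year instead of listing them."""
    counts = Counter()
    for records in files:
        for key, timestamp in records:
            if key == url_key:
                counts[timestamp[:4]] += 1
    return dict(counts)


# Assumed cap of 3 for illustration only.
capped = query_sequential(CDX_FILES, "fr,lemonde)/", max_records=3)
print(sorted({ts[:4] for _, ts in capped}))      # years visible under the cap
print(hits_per_year(CDX_FILES, "fr,lemonde)/"))  # years and counts across all files
```

With the cap at 3, the capped query returns captures from the first two files only, so the 2013 capture in the last configured file never appears, while the per-year aggregate still reports it.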

kngenie added a commit to kngenie/wayback that referenced this issue Apr 23, 2014
Add ability to do custom rewrite of JS files.
@anjackson
Member

Should we bake CDX-Server in as a default, and deprecate XML Query?
