List of robots should be in separate file #311

Open
jesusbagpuss opened this issue Apr 24, 2015 · 1 comment

jesusbagpuss commented Apr 24, 2015

The current list of robots that are excluded from download logging is:

  1. out of date
  2. hard-coded into a Perl module.

Ideally the list of robots would be in its own file (possibly configurable on a per-archive basis?) somewhere in ~/lib/, and could be updated by a cron job, an Event, or a Bazaar plugin.
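To illustrate the separate-file idea, here is a minimal sketch. It is written in Python for brevity rather than as a patch to `LogHandler.pm`, and it assumes a plain-text file with one regular expression per line, where blank lines and lines starting with `#` are ignored. The file format and the function names `load_robot_patterns` and `is_robot` are hypothetical, not part of EPrints.

```python
import re
from pathlib import Path


def load_robot_patterns(path):
    """Load user-agent patterns from a text file.

    Assumed format (hypothetical): one regular expression per line;
    blank lines and lines starting with '#' are skipped.
    """
    patterns = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Match case-insensitively, since user-agent casing varies.
        patterns.append(re.compile(line, re.IGNORECASE))
    return patterns


def is_robot(user_agent, patterns):
    """Return True if any loaded pattern matches the user-agent string."""
    return any(p.search(user_agent) for p in patterns)
```

With a layout like this, updating the list means replacing one data file, and a per-archive override could simply point `load_robot_patterns` at a different path.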

Comparing: 3.3 vs master
https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/Apache/LogHandler.pm#L61-L98
https://github.com/eprints/eprints/blob/master/perl_lib/EPrints/Apache/LogHandler.pm#L251-L474

The current COUNTER list is http://www.projectcounter.org/r4/COUNTER_Robots_list_Jan2014.txt

  • I'm trying to find a stable URL that will return the most recent COUNTER list.

jesusbagpuss added this to the 3.3.15 milestone Apr 24, 2015

sebastfr commented Apr 24, 2015

One potential issue is that a script may have to clean legacy data (the access table) when new robots are added to the list. If that is done, the derived data (IRStats 1 and 2) will have to be regenerated from scratch. It is also worth noting that IRStats 1 and 2 each have their own robots definition.
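The legacy cleanup described above could look roughly like the following sketch. It is Python rather than EPrints Perl, and it assumes each access record is available as a dictionary with a `user_agent` key. That key, the `accessid` key in the usage note, and the function name `split_access_records` are illustrative assumptions, not the real access-table schema or an EPrints API.

```python
import re


def split_access_records(records, patterns):
    """Partition access records into (kept, flagged) lists.

    `records` is an iterable of dicts with a hypothetical 'user_agent' key.
    `patterns` is a list of compiled regular expressions for robots.
    Records whose user agent matches any pattern are flagged for removal;
    records with a missing user agent are kept.
    """
    kept, flagged = [], []
    for rec in records:
        user_agent = rec.get("user_agent") or ""
        if any(p.search(user_agent) for p in patterns):
            flagged.append(rec)
        else:
            kept.append(rec)
    return kept, flagged
```

For example, calling it with `[re.compile("bot", re.IGNORECASE)]` and a record whose `user_agent` is `"Googlebot/2.1"` places that record in the flagged list. Only when the flagged list is non-empty would the derived statistics need regenerating.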

Lots of processing in sight.... :-/

IMO, COUNTER/PIRUS should clean the data upstream.
