-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
global: invenio-queues refactoring #3
global: invenio-queues refactoring #3
Conversation
6589e02
to
907a0e6
Compare
Signed-off-by: Dinos Kousidis <konstantinos.kousidis@cern.ch>
acc970c
to
8b10999
Compare
tests/test_invenio_stats.py
Outdated
@@ -52,17 +52,10 @@ def test_init(): | |||
assert 'invenio-stats' in app.extensions | |||
|
|||
|
|||
def test_event_queues_declare(app, event_queues_entrypoints): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test should be moved to invenio-queues
invenio_stats/ext.py
Outdated
'{1}'.format(cfg['event_type'], ep.name)) | ||
|
||
result[cfg['event_type']] = cfg | ||
contrib_dev = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added temporarily to register the 2 events coming from records-ui and files-rest from the contrib
c7f0d13
to
34258fe
Compare
- ADDS config variable to select enabled events - ADDS event registration from contrib for development Signed-off-by: Dinos Kousidis <konstantinos.kousidis@cern.ch>
34258fe
to
bd4bf19
Compare
CELERY_BEAT_SCHEDULE = { | ||
'indexer': { | ||
'task': 'invenio_stats.tasks.index_events', | ||
'schedule': timedelta(seconds=5), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you replace the 5 seconds with 5 minutes? Otherwise we might have issues as Lars mentioned in some other PR.
Note: This means that we should not make the aggregations at midnight or we might miss some events, they should be done a few hours later just to be sure.
} | ||
} | ||
} | ||
}, | ||
"aliases": { | ||
"events-filedownload": {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why removing the alias? How do you search the events without it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However it should be named events-stats_file_download
just to be consistent with the above template name
@@ -1,5 +1,5 @@ | |||
{ | |||
"template": "events-recordview-*", | |||
"template": "events-stats_record_view-*", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you rename the alias to match the template name too, i.e. events-stats_record_view
res = runner.invoke(cli.delete_queue, [], obj=script_info) | ||
assert 0 == res.exit_code | ||
class UnknownEventError(Exception): | ||
"""Error raised when a duplicate event is detected.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this is the right docstring.
def events(self): | ||
EventConfig = namedtuple('EventConfig', | ||
['queue', 'config', 'processor']) | ||
# import iter_entry_points here so that it can be mocked in tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duplicate comment? see _events_config
) | ||
return result | ||
|
||
# def __init__(self, app, entry_point_group): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dead code. Please remove it.
|
||
self.event_types[event_type] = IndexTemplate( | ||
event_type, name, package_name) | ||
# def register_eventtype(self, event_type, package_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dead code.
import click | ||
from flask.cli import with_appcontext | ||
from invenio_search.cli import index | ||
"""Celery background tasks.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstring of the module doesn't match its content.
# When: | ||
timestamp=datetime.utcnow().isoformat(), | ||
# What: | ||
bucket=str(obj.bucket_id), | ||
filename=obj.key, | ||
# labels=record.get('communities', []), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dead code
'filedownload = invenio_stats.estemplates', | ||
'recordview = invenio_stats.estemplates', | ||
] | ||
# 'invenio_stats.estemplates': [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dead code
|
||
def register_templates(): | ||
"""Register elasticsearch templates for events.""" | ||
return ['contrib/{0}'.format(event['event_type']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you move the templates in a contrib/<event-type>/stats/
directory. That way they would be prefixed with stats
.
from sqlalchemy_utils.functions import create_database, database_exists | ||
|
||
from invenio_stats import EventQueue, InvenioStats, current_stats | ||
# from invenio_stats import EventQueue, InvenioStats, current_stats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dead code
result = {} | ||
contrib_dev = False | ||
|
||
if contrib_dev: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dead code. contrib_dev
is always False.
] | ||
# 'invenio_stats.estemplates': [ | ||
# 'file_download = invenio_stats.indexer:EventsIndexer', | ||
# 'record_view = invenio_stats.indexer:EventsIndexer', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a function for 'invenio_stats.events'
listing the contrib events.
current_stats.publish('filedownload', [dict(data='val')]) | ||
index_events.delay('filedownload') | ||
current_stats.publish('event_0', [dict(data='val')]) | ||
process_events.delay(['event_0']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check that the event is indexed properly?
Could you test with multiple events just to be sure.
Also could you test the indexer with file download events and check that you can find them in elasticsearch.
Signed-off-by: Dinos Kousidis konstantinos.kousidis@cern.ch