Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

global: invenio-queues refactoring #3

Merged
merged 2 commits into from
Jul 7, 2017

Conversation

nharraud
Copy link
Member

Signed-off-by: Dinos Kousidis konstantinos.kousidis@cern.ch

@dinosk dinosk force-pushed the invenio-queue-refactoring branch from 6589e02 to 907a0e6 Compare June 26, 2017 12:33
@dinosk dinosk self-assigned this Jun 26, 2017
Signed-off-by: Dinos Kousidis <konstantinos.kousidis@cern.ch>
@dinosk dinosk force-pushed the invenio-queue-refactoring branch 2 times, most recently from acc970c to 8b10999 Compare June 29, 2017 09:23
@@ -52,17 +52,10 @@ def test_init():
assert 'invenio-stats' in app.extensions


def test_event_queues_declare(app, event_queues_entrypoints):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should be moved to invenio-queues

'{1}'.format(cfg['event_type'], ep.name))

result[cfg['event_type']] = cfg
contrib_dev = False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added temporarily to register the 2 events coming from records-ui and files-rest from the contrib

@dinosk dinosk force-pushed the invenio-queue-refactoring branch 14 times, most recently from c7f0d13 to 34258fe Compare July 3, 2017 16:10
- ADDS config variable to select enabled events

- ADDS event registration from contrib for development

Signed-off-by: Dinos Kousidis <konstantinos.kousidis@cern.ch>
@dinosk dinosk force-pushed the invenio-queue-refactoring branch from 34258fe to bd4bf19 Compare July 3, 2017 16:27
@dinosk dinosk changed the title WIP global: invenio-queues refactoring global: invenio-queues refactoring Jul 4, 2017
CELERY_BEAT_SCHEDULE = {
'indexer': {
'task': 'invenio_stats.tasks.index_events',
'schedule': timedelta(seconds=5),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you replace the 5 seconds with 5 minutes? Otherwise we might have issues as Lars mentioned in some other PR.

Note: This means that we should not make the aggregations at midnight or we might miss some events, they should be done a few hours later just to be sure.

}
}
}
},
"aliases": {
"events-filedownload": {}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removing the alias? How do you search the events without it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However it should be named events-stats_file_download just to be consistent with the above template name

@@ -1,5 +1,5 @@
{
"template": "events-recordview-*",
"template": "events-stats_record_view-*",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you rename the alias to match the template name too, i.e. events-stats_record_view

res = runner.invoke(cli.delete_queue, [], obj=script_info)
assert 0 == res.exit_code
class UnknownEventError(Exception):
"""Error raised when a duplicate event is detected."""
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that this is the right docstring.

def events(self):
EventConfig = namedtuple('EventConfig',
['queue', 'config', 'processor'])
# import iter_entry_points here so that it can be mocked in tests
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate comment? see _events_config

)
return result

# def __init__(self, app, entry_point_group):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dead code. Please remove it.


self.event_types[event_type] = IndexTemplate(
event_type, name, package_name)
# def register_eventtype(self, event_type, package_name):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dead code.

import click
from flask.cli import with_appcontext
from invenio_search.cli import index
"""Celery background tasks."""
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring of the module doesn't match its content.

# When:
timestamp=datetime.utcnow().isoformat(),
# What:
bucket=str(obj.bucket_id),
filename=obj.key,
# labels=record.get('communities', []),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dead code

'filedownload = invenio_stats.estemplates',
'recordview = invenio_stats.estemplates',
]
# 'invenio_stats.estemplates': [
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dead code


def register_templates():
"""Register elasticsearch templates for events."""
return ['contrib/{0}'.format(event['event_type'])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move the templates in a contrib/<event-type>/stats/ directory. That way they would be prefixed with stats.

from sqlalchemy_utils.functions import create_database, database_exists

from invenio_stats import EventQueue, InvenioStats, current_stats
# from invenio_stats import EventQueue, InvenioStats, current_stats
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dead code

result = {}
contrib_dev = False

if contrib_dev:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dead code. contrib_dev is always False.

]
# 'invenio_stats.estemplates': [
# 'file_download = invenio_stats.indexer:EventsIndexer',
# 'record_view = invenio_stats.indexer:EventsIndexer',
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a function for 'invenio_stats.events' listing the contrib events.

current_stats.publish('filedownload', [dict(data='val')])
index_events.delay('filedownload')
current_stats.publish('event_0', [dict(data='val')])
process_events.delay(['event_0'])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check that the event is indexed properly?
Could you test with multiple events just to be sure.

Also could you test the indexer with file download events and check that you can find them in elasticsearch.

@nharraud nharraud merged commit ae5e86a into inveniosoftware:master Jul 7, 2017
@nharraud nharraud deleted the invenio-queue-refactoring branch July 7, 2017 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants