-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
global: add record/collection permissions #1589
Conversation
Addresses #1577 One problem I see at the moment is that when I enable invenio-collections I am getting ES timeouts locally when migrating records (possibly due to some extra computation happening in |
I see from https://github.com/inveniosoftware/invenio-collections/blob/master/invenio_collections/config.py#L24 that by adjusting |
Another side effect I just realised. As
|
As discussed IRL we can drop |
3658a72
to
38850ff
Compare
This problem is now solved by not using percolators ( |
Second commit fixes #1589 (comment) |
TODO:
|
5978350
to
4212625
Compare
02eb99e
to
ada053f
Compare
if request: | ||
collection = request.values.get('cc', 'Literature') | ||
|
||
all_restricted_collections = set( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be better cached to avoid doing a DB query on every request.
723b96e
to
10af50d
Compare
Ready for comments. Now that Travis is passing, will add some extra tests regarding permissions/collections. |
Last commit ( |
{ | ||
"dbquery": "collections.primary:HEPNAMES", | ||
"name": "HepNames" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this list is not up-to-date.
a.argument for a in ActionUsers.query.filter_by( | ||
action='view-restricted-collection').all() | ||
] | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be computed only once, for performance reason. We could cache it in Redis for a minute or so.
'view_restricted_collection' | ||
' = inspirehep.modules.records.permissions:' | ||
'action_view_restricted_collection', | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems extreme indentation 💃
10af50d
to
bd9e5e5
Compare
07a6ac4
to
629c2b1
Compare
This PR is now ready to be reviewed. The functionality added is covered by tests. |
@@ -125,6 +125,114 @@ def _(x): | |||
USERPROFILES_EXTEND_SECURITY_FORMS = False | |||
USERPROFILES_SETTINGS_TEMPLATE = 'inspirehep_theme/accounts/settings/profile.html' | |||
|
|||
# Collections | |||
# =========== | |||
COLLECTIONS_DELETED_RECORDS = '{dbquery} AND NOT collections.primary:"DELETED"' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, after @spirosdelviniotis work the DELETED
information corresponds to a flag being set to True in deleted
.
I am not sure how to formulate this filter in Invenio query syntax though. (i.e. deleted != True
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spirosdelviniotis will test this out.
Maybe something like:
COLLECTIONS_DELETED_RECORDS = '{dbquery} AND NOT deleted:True'
works.
}, | ||
"lowercase_analyzer": { | ||
"filter": "lowercase", | ||
"tokenizer": "keyword" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: since you ignore case in matching collection in ES, are you anyway then presenting them in the HTML and URLs in their canonical form or you are reusing the case that was provided as input by the user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use them in the UI in the canonical form, like Authors
. But then you can do /search?cc=Authors
or /search?cc=authors
if restricted_collections: | ||
cache.set( | ||
'restricted_collections', | ||
pickle.dumps(restricted_collections), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use protocol
-1
to have the most efficient protocol being chosen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will use #1622 so will not do pickling in the end myself.
query = Q('match', _collections=collection) | ||
|
||
for collection in list(all_restricted_collections - user_coll): | ||
query = query & ~Q('match', _collections=collection) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tried using actually ES filtering?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The query returned here is added to a ES filter query. You can see the output in this test https://github.com/inspirehep/inspire-next/pull/1589/files#diff-8eedb14f28e016fe0e021789e25878abR336
After #1622 is merged I will change the way I read/write into Redis as Flask-cache allows for writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍, just some nits.
COLLECTIONS_QUERY_WALKERS = [ | ||
'inspirehep.modules.search.walkers.pypeg_to_ast:PypegConverter', | ||
] | ||
"""Modules to create the query AST.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: space here.
COLLECTIONS_USE_PERCOLATOR = False | ||
"""Define which percolator you want to use. | ||
|
||
Default value is `False` to use the internal percolator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's confusing! Maybe we should change upstream to have COLLECTIONS_USE_ES_PERCOLATOR
instead? Of course, not something to be fixed in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This docstring is copy/pasted from invenio-collections
. Maybe the phrase is a bit confusing since they called it 'internal percolator'. If we think of percolator
just as the ES one, then the config makes sense COLLECTIONS_USE_PERCOLATOR
True
or False
RECORDS_UI_ENDPOINTS = dict( | ||
literature=dict( | ||
pid_type='literature', | ||
route='/literature/<pid_value>', | ||
template='inspirehep_theme/format/record/' | ||
'Inspire_Default_HTML_detailed.tpl', | ||
record_class='inspirehep.modules.records.wrappers:LiteratureRecord', | ||
permission_factory_imp='invenio_records_rest.utils:allow_all', | ||
record_class='inspirehep.modules.records.wrappers:LiteratureRecord' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: keep the trailing commas here, the diffs will look nicer now and in the future.
@@ -585,63 +694,59 @@ def _(x): | |||
), | |||
) | |||
|
|||
|
|||
RECORDS_UI_DEFAULT_PERMISSION_FACTORY = \ | |||
"inspirehep.modules.records.permissions:record_read_permission_factory" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know, I was the one who added the permission_factory_imp
everywhere...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:)
@@ -20,7 +20,7 @@ | |||
"numeric_detection": false, | |||
"properties": { | |||
"_collections": { | |||
"index": "not_analyzed", | |||
"analyzer": "lowercase_analyzer", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the rationale for using this particular one for _collections
? Should we prefer the lowercase_analyzer
to not_analyzed
in all cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to be forgiving to the user. Legacy INSPIRE/Invenio has always been case-insensitive for any field.
I wonder maybe if for e.g. DOI we have to be careful though.
from inspirehep.modules.pidstore.minters import inspire_recid_minter | ||
from inspirehep.modules.search.api import LiteratureSearch | ||
from inspirehep.utils.cache import cache | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: ...one extra space here.
login_user_via_session(client, email=user_info['email']) | ||
|
||
result = client.get("/literature/123") | ||
assert result.status_code == status |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you are not going to use result
further IMO assert client.get('/literature/123').status_code == status
is enough.
login_user_via_session(api_client, email=user_info['email']) | ||
|
||
result = api_client.get("/literature/123") | ||
assert result.status_code == status |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing here.
login_user_via_session(client, email=user_info['email']) | ||
|
||
result = client.get("/literature/222") | ||
assert result.status_code == status |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here.
if user_info: | ||
login_user_via_session(api_client, email=user_info['email']) | ||
result = api_client.get("/literature/222") | ||
assert result.status_code == status |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here.
* When the test client makes a call to /api `current_app` becomes the API app. This was causing problems since invenio-collections is not registered in the API. Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
* The task is not needed since enhancements should happen before record insert/index. And it was causing problem since a dictionary was being passed to receivers instead of a Record object. Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
* Enhances queries with _collections field based on URL cc parameter. * By default Literature will only show records that had HEP collection in legacy. Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
* Prevents circular dependency problem when doing imports in config.py. Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
* Allow cataloger to access the Holding Pen. Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
* Adds read permission factory for Literature API. Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
* To avoid adding the default filter, use get_source(). Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
629c2b1
to
b4e9ad2
Compare
Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
* Creates receiver for user_logged_in signal to populate user restricted collections upon login. Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
Signed-off-by: Javier Martin Montull <javier.martin.montull@cern.ch>
b4e9ad2
to
342cb3a
Compare
Comments addressed and changed Please, review one more or time and ready to 🚢 |
Signed-off-by: Javier Martin Montull javier.martin.montull@cern.ch
Closes #919