Keywords #115
Borrowing some ideas from experience with the eTraveler version of label support, I'd suggest
Is there a benefit to keeping inactive keyword entry links in the database, as opposed to letting a user delete a keyword from a dataset so that the entry is simply gone? I also don't think keywords should be modifiable, as that would have a knock-on effect on everyone's datasets tagged with that keyword.
@JoanneBogart, because we don't have access to the table classes which we create the schema with (in Coming across this because of the keywords many-to-many relationship, which would be nice to directly query "get me all datasets with this keyword" or vice versa. But the So to query keywords we'd have to do it a bit more manual in the code, through the Can discuss more during meeting
@stuartmcalpine |
I might have more suggestions (or maybe not) but it may be a few days before I can find time to finish looking at this. It seemed best to let you know what I found so far.
@@ -116,6 +117,12 @@ def modify(self, entry_id, modify_fields):

        # Loop over each column to be modified
        for key, v in modify_fields.items():
A base class shouldn't have to know about its subclasses. And modifying keywords seems different from modifying anything else. I think this functionality belongs in a separate routine or routines, e.g. modify_keywords(self, entry_id, kwds); or, maybe more usefully, add_keywords(self, entry_id, kwds) and remove_keywords(self, entry_id, kwds). That (or they) could go in DatasetTable, which I think is the best option currently. When and if we support assigning keywords to executions and if the implementation is going to be similar, then it could perhaps move back to BaseTable. If so, anything called, like _validate_keywords, should also be declared in BaseTable but with an empty implementation if the real implementation is specific to the subclass:

    def _validate_keywords(self, kwds):
        pass

or maybe better something like

    def _validate_keywords(self, kwds):
        raise Exception("Subclasses should provide implementation")

Alternatively the routine or routines could go in some file for utilities where the table is passed in as an argument. Either way there would need to be a check that the table is an instance of an appropriate type.
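The layout suggested in this comment can be sketched roughly as follows. This is only an illustration, not the project's code: the in-memory dictionary stands in for the database, the constructor argument `allowed_keywords` and the helper `keywords_for` are hypothetical, and `NotImplementedError` is used in place of the bare `Exception` above because it is the conventional Python signal for "subclass must override".

```python
class BaseTable:
    """Hypothetical stand-in for the base table class discussed above."""

    def _validate_keywords(self, kwds):
        # Declared here so callers can rely on it; subclasses that
        # support keywords must override it.
        raise NotImplementedError("Subclasses should provide implementation")


class DatasetTable(BaseTable):
    """Hypothetical dataset table that owns the keyword routines."""

    def __init__(self, allowed_keywords):
        # allowed_keywords is an assumption: the set of registered keywords.
        self._allowed = set(allowed_keywords)
        # entry_id -> set of keywords (in-memory stand-in for the database).
        self._keywords = {}

    def _validate_keywords(self, kwds):
        unknown = [k for k in kwds if k not in self._allowed]
        if unknown:
            raise ValueError(f"Unknown keywords: {unknown}")

    def add_keywords(self, entry_id, kwds):
        self._validate_keywords(kwds)
        self._keywords.setdefault(entry_id, set()).update(kwds)

    def remove_keywords(self, entry_id, kwds):
        self._validate_keywords(kwds)
        self._keywords.get(entry_id, set()).difference_update(kwds)

    def keywords_for(self, entry_id):
        # Hypothetical helper, used here only to inspect the result.
        return sorted(self._keywords.get(entry_id, set()))
```

With this split, BaseTable never refers to DatasetTable, and calling a keyword routine on a table type that does not support keywords fails loudly rather than silently doing nothing.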
Agreed, I've moved the keyword-specific functions to the dataset class. There is now an add_keyword and (for now empty) delete_keyword function that can be used to manipulate keywords post-registration.
This looks fine. We might make it a little easier for users to figure out how to query for keywords. You have a nice example in the tutorial. The docstring for find_datasets could mention the possibility of querying for keywords and refer to the tutorial.
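For readers unfamiliar with the underlying idea, a keyword query of this kind boils down to a join through the dataset_keyword association table. The sketch below uses the standard-library sqlite3 module rather than the project's own schema or query code, so it shows the join only, not the find_datasets API. The dataset table's `name` column, the sample rows, and the `datasets_with_keyword` function are assumptions made for illustration.

```python
import sqlite3

# In-memory database with a simplified, hypothetical version of the tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dataset (dataset_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE keyword (keyword_id INTEGER PRIMARY KEY, keyword TEXT UNIQUE);
    CREATE TABLE dataset_keyword (
        dataset_keyword_id INTEGER PRIMARY KEY,
        dataset_id INTEGER REFERENCES dataset(dataset_id),
        keyword_id INTEGER REFERENCES keyword(keyword_id)
    );
    INSERT INTO dataset VALUES (1, 'ds_a'), (2, 'ds_b');
    INSERT INTO keyword VALUES (1, 'simulation'), (2, 'observation');
    INSERT INTO dataset_keyword (dataset_id, keyword_id)
        VALUES (1, 1), (2, 1), (2, 2);
    """
)


def datasets_with_keyword(conn, keyword):
    """Return the names of all datasets tagged with the given keyword."""
    rows = conn.execute(
        """
        SELECT d.name
        FROM dataset AS d
        JOIN dataset_keyword AS dk ON dk.dataset_id = d.dataset_id
        JOIN keyword AS k ON k.keyword_id = dk.keyword_id
        WHERE k.keyword = ?
        ORDER BY d.name
        """,
        (keyword,),
    ).fetchall()
    return [row[0] for row in rows]


print(datasets_with_keyword(conn, "simulation"))
```

The reverse lookup ("which keywords does this dataset have") is the same join filtered on `d.dataset_id` instead of `k.keyword`.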
Adding keywords
Add a keywords table so that datasets can be tagged with multiple keywords.
Changes
keyword table columns:
    keyword_id: int
    keyword: str
    system: bool, True for a preset keyword, False for a user-defined keyword
    activate: bool, True if the keyword is active and usable
    creator_uid: str
    creation_date: datetime
dataset_keyword association table columns:
    dataset_keyword_id: int
    dataset_id: int
    keyword_id: int
src/dataregistry/schema/keywords.yaml, which populates the keyword table during database creation.
datareg.Registrar.dataset.get_keywords() function to return the list of currently registered keywords.
When the keyword table is queried, an automatic join is made with the dataset_keyword association table, so the user can, for example, query for all datasets with a given keyword.
dregs show keywords from the CLI to display all pre-registered keywords.
Notes