-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added first iteration of easy_publish #153
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great job Isaac, I think the easy_publish()
method will be a huge step in making model publishing even more accessible to the user! I have several questions about code logic, etc. Also, I think allowing the user to set datacite metadata is important.
"pytorch", | ||
"tensorflow", | ||
"sklearn") | ||
serv_options (dict): the servable_type specific arguments that are necessary for publishing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add more context for the user on what the servable options are -- ideally, link to documentation
except Exception as e: | ||
help_err = HelpMessage(f"Help can be found here:\nhttps://dlhub-sdk.readthedocs.io/en/latest/source/dlhub_sdk.models.servables.html#" | ||
f"{model.__module__}.{model.__name__}.create_model") | ||
raise help_err from e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to clarify, this help_err won't eat the original exception message, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. specifying from e
means the traceback will show the original error and say something like "the following error is a direct result of the previous error" and then show the help message. docs on from Exception
here.
|
||
# raise an error if the provided servable_type is invalid | ||
model = models.get(servable_type) | ||
if model is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you wanted to shorten this a bit, you could do if servable_type not in models
-- slightly more pythonic imo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would, but I also need to store the model
variable for later. I suppose lookups are quick, but it's a little more visual clutter. up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷♀️ I like "not in models" but not enough to change this here.
dlhub_sdk/client.py
Outdated
|
||
# set the required datacite fields | ||
model_info.set_title(title) | ||
creators = [creators] if isinstance(creators, str) else creators |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, so this is handling the case where the user inputs creators as a string instead of a list? please add a small inline comment saying as much (something like # handle creator as string or list
or something similar)
dlhub_sdk/client.py
Outdated
# set the required datacite fields | ||
model_info.set_title(title) | ||
creators = [creators] if isinstance(creators, str) else creators | ||
model_info.set_creators(creators, []) # we do not ask for affiliations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should allow the user to still set affiliations if they'd like to
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be happy to talk about what parameters we want to provide for this function. When Logan and I discussed it, we opted to go for the absolute minimum and leave it to the publisher to edit later. All depends on where we want to draw the complexity line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do at least affiliations as a breakout argument.
Args: | ||
f (object): A function pointer | ||
autobatch (bool): Whether to run function on an iterable of entries | ||
function_kwargs (dict): Any default options for this function | ||
""" | ||
return cls.create_model(f.__module__, f.__name__, autobatch, function_kwargs) | ||
return cls.create_model(f=f, autobatch=autobatch, function_kwargs=function_kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what to do about autobatch
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that I understand..
dlhub_sdk/tests/test_dlhub_client.py
Outdated
def test_submit(dl): | ||
# @mark.skipif(not is_gha, reason='Avoid running this test except on larger-scale tests of the system') | ||
# @mark.skip | ||
def test_submit(dl, mocker): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice job with mocking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we go-ahead and remove the commented-out lines?
dlhub_sdk/utils/validation.py
Outdated
@@ -36,7 +36,7 @@ def _type_name_to_type(name: str) -> Union[type, Tuple[type]]: | |||
try: | |||
return type_table.get(name) or __builtins__[name] # getattr(__builtins__, name) if __builtins__ is a module | |||
except KeyError: # AttributeError if __builtins__ is a module | |||
raise ValueError(f"found an unknown type name in servable metadata: {name}") | |||
raise ValueError(f"found an unknown type name in servable metadata: {name}") from None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens if you raise from None
? I am not familiar with this pattern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will raise
the ValueError
without mentioning that it was the result of a KeyError
(the user does not need to know that) docs for exception chaining are here if you're interested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same behavior will happen without the "from None," (i.e., what we had before), if I understand right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the docs: "Exception chaining happens automatically when an exception is raised inside an except
or finally
section. This can be disabled by using from None
idiom." Since the ValueError
is raised in an except
chaining will happen automatically unless it is disabled (if I understand correctly).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shows that I know. Let's keep yours
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're ready after the minor changes here
dlhub_sdk/client.py
Outdated
@@ -336,6 +347,59 @@ def get_result(self, task_id, verbose=False): | |||
result = result[0] | |||
return result | |||
|
|||
def easy_publish(self, title: str, creators: Union[str, list[str]], short_name: str, servable_type: str, serv_options: dict[str, Any]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree! We could still keep the "title" and "creators" as other arguments (that take precedent) so that people don't have to know datacite exists.
|
||
# raise an error if the provided servable_type is invalid | ||
model = models.get(servable_type) | ||
if model is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷♀️ I like "not in models" but not enough to change this here.
dlhub_sdk/client.py
Outdated
# set the required datacite fields | ||
model_info.set_title(title) | ||
creators = [creators] if isinstance(creators, str) else creators | ||
model_info.set_creators(creators, []) # we do not ask for affiliations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do at least affiliations as a breakout argument.
dlhub_sdk/tests/test_dlhub_client.py
Outdated
def test_submit(dl): | ||
# @mark.skipif(not is_gha, reason='Avoid running this test except on larger-scale tests of the system') | ||
# @mark.skip | ||
def test_submit(dl, mocker): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we go-ahead and remove the commented-out lines?
dlhub_sdk/utils/validation.py
Outdated
@@ -36,7 +36,7 @@ def _type_name_to_type(name: str) -> Union[type, Tuple[type]]: | |||
try: | |||
return type_table.get(name) or __builtins__[name] # getattr(__builtins__, name) if __builtins__ is a module | |||
except KeyError: # AttributeError if __builtins__ is a module | |||
raise ValueError(f"found an unknown type name in servable metadata: {name}") | |||
raise ValueError(f"found an unknown type name in servable metadata: {name}") from None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same behavior will happen without the "from None," (i.e., what we had before), if I understand right
The changes you mentioned should be added now (aside from the datacite parameter, which I think gets at the overall ease of use/usefulness issue and warrants further consideration). |
Thanks! I'm happy then. Leaving off datacite for now is a good idea. We can add other fields in smaller PRs Maybe another for now though: paper_doi so we can add a paper as an associated identifier |
I gathered what I could from the docs, please let me know if this is not what you were looking for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's simplify the "paper_doi" even more
The relation is now fixed! |
Pull Request Test Coverage Report for Build 3101287839
💛 - Coveralls |
Looks fantastic, excellent work Isaac! Super pumped for these changes for the user |
DO-NOT-MERGE is added while additional features are considered and testing is conducted