-
Notifications
You must be signed in to change notification settings - Fork 446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Removed extra conditions in _auto_identify_connection_string
method
#2038
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
import re | ||
import typing as t | ||
from re import match as re_match | ||
|
||
from superduperdb.base.configs import CFG | ||
|
||
|
@@ -24,26 +24,16 @@ def superduper(item: t.Optional[t.Any] = None, **kwargs) -> t.Any: | |
|
||
|
||
def _auto_identify_connection_string(item: str, **kwargs) -> t.Any: | ||
from superduperdb.base.build import build_datalayer | ||
|
||
if item.startswith('mongomock://'): | ||
kwargs['data_backend'] = item | ||
if re_match(r'^[a-zA-Z0-9]+://', item) is None: | ||
raise ValueError(f'{item} is not a valid connection string') | ||
|
||
elif item.startswith('mongodb://'): | ||
kwargs['data_backend'] = item | ||
if item.endswith('.csv') and CFG.cluster.cdc.uri is not None: | ||
raise TypeError('Pandas is not supported in cluster mode!') | ||
|
||
elif item.startswith('mongodb+srv://') and 'mongodb.net' in item: | ||
kwargs['data_backend'] = item | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hey @ChandanChainani , we need to check this pattern, because There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jieguangzhou for this type of conditions we can improve by creating single condition which validates all the supported url types. For example: supported_url_types = ["mongodb+srv://", "mongodb://", "mongomock://"]
if not any(item.startswith(url_type) for url_type in supported_url_types):
raise Exception("url format not supported") There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, could you update the pr with these changes? |
||
kwargs['data_backend'] = item | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @ChandanChainani thanks for the PR. Are you sure that these can be dropped like this? Do you get passing tests? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @blythed, Yes, except for llm and doc_string related test all the other tests are validated Command
Output
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have update the code and there is one additional condition that I think can be added which is to make sure the user only supply supported db/file uri format please suggest (for supported db/file format we can maintain constants) |
||
elif item.endswith('.csv'): | ||
if CFG.cluster.cdc.uri is not None: | ||
raise TypeError('Pandas is not supported in cluster mode!') | ||
kwargs['data_backend'] = item | ||
from superduperdb.base.build import build_datalayer | ||
|
||
else: | ||
if re.match(r'^[a-zA-Z0-9]+://', item) is None: | ||
raise ValueError(f'{item} is not a valid connection string') | ||
kwargs['data_backend'] = item | ||
return build_datalayer(CFG, **kwargs) | ||
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use
re.match
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jieguangzhou it is
re.match
method I have created alias which is being used since we are only using single method from there
package.