Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Implement database creator #18

Merged
merged 8 commits into from
Dec 5, 2017

Conversation

tsirif
Copy link
Member

@tsirif tsirif commented Nov 30, 2017

No description provided.

@codecov-io
Copy link

codecov-io commented Nov 30, 2017

Codecov Report

Merging #18 into master will increase coverage by 5.96%.
The diff coverage is 97.26%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #18      +/-   ##
==========================================
+ Coverage   60.36%   66.32%   +5.96%     
==========================================
  Files           9       11       +2     
  Lines         333      389      +56     
  Branches       33       44      +11     
==========================================
+ Hits          201      258      +57     
- Misses        130      131       +1     
+ Partials        2        0       -2
Impacted Files Coverage Δ
src/metaopt/resolve_config.py 0% <ø> (ø) ⬆️
src/metaopt/cli.py 0% <0%> (ø) ⬆️
src/metaopt/io/database/mongodb.py 100% <100%> (+3.22%) ⬆️
tests/core/unittests/mongodb_test.py 100% <100%> (ø) ⬆️
tests/core/unittests/database_test.py 100% <100%> (ø)
tests/core/unittests/conftest.py 100% <100%> (ø)
src/metaopt/io/database/__init__.py 100% <100%> (ø) ⬆️
src/metaopt/utils/__init__.py 100% <100%> (ø) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 746ec56...9c27d05. Read the comment docs.

Copy link
Member

@bouthilx bouthilx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done. There is only the question of ignoring the arguments when the Database is already instantiated that bugs me. The rest of my comments are open questions.


"""

all = [AbstractDB, MongoDB]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we make AbstractDB register automatically any children? In that case all could be replaced by something like AbstractDB.types. Lists like all are annoying to maintain. Test at line 195 would not be needed anymore.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could make something dirty here:

  1. globbing for all Python files in database directory
  2. try to import them
  3. for each public name in imported module, check whether it is a subclasses of AbstractDB,
    append to list of types if it is.

Copy link
Member

@bouthilx bouthilx Dec 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's too dirty. 😛

Man, I just learned something so nice. 😮 Did you know about SomeClass.__subclasses__()? I just found that from here

>>> class A(object):
...     pass
... 
>>> class B(A):
...     pass
... 
>>> A.__subclasses__()
[<class '__main__.B'>]

Magic! 🎉

To be sure it works, all databases need to be imported in src/metaopt/io/database/__init__.py otherwise __subclasses__() might miss some. To avoid import circle, I would move base/abstract stuff to base.py and import it in __init__.py


name -- Experiment's name.

If you provide a past experiment**s name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why experiment**s?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo.

all = [AbstractDB, MongoDB]
typenames = list(map(lambda x: x.__name__, all))

def __call__(cls, dbtype='AbstractDB', *args, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be dbtype=None? I don't think we want AbstractDB to be ever instantiated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It cannot be, eitherways 😛

Copy link
Member

@bouthilx bouthilx Dec 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So why AbstractDB by default if it can't be instantiated? None seems less confusing to me, but I might miss something.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None raises an exception somewhere 'random'. In particular when doing dbtype.lower(). Passing 'AbstractDB' raises a more natural and informative error.

username='user', password='pass')

assert Database() is database
assert Database('fire', [], {'doesnt_matter': 'it\'s singleton'}) is database
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems error-prone. In my opinion we should allow Database() after first call to ease reuse of the database without knowing the details (type), but trying to get something different should raise an error.

>>> database = Database("mongodb")
>>> database = Database("mysql")
>>> print type(database)
"MongoDB"

🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. This is easy to implement.

@@ -60,12 +60,14 @@ def test_bad_connection(self):
MongoDB(host='asdfada', port=123, dbname='metaopt',
username='uasdfaf', password='paasdfss')
assert "Connection" in str(exc_info.value)
MongoDB.instance = None # Set singular instance to None for independent tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there something similar to TestCast.tearDown() in pytest?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is, I fix this in the next PR. About experiment and trials.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not fixing it here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am grouping "session" setup/teardowns in a default file conftest.py by pytest. I did that in another branch.

@tsirif
Copy link
Member Author

tsirif commented Dec 2, 2017

@bouthilx @dendisuhubdy Call for final review (not needed immediately now, but I believe that it is finalized)

@@ -20,7 +20,7 @@

import six

from metaopt.utils import AbstractSingletonType
from metaopt.utils import (AbstractSingletonType, SingletonFactory)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is SingletonFactory?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Factory is a metaclass to create the Database class, a factory pattern aggregating within its creation both the argument denoting its implementation database type (see of_type argument), and the arguments to be passed for the creation/initialization of an implementation of the interface, defined by the base class of the Factory class to-be-created (in our case is the AbstractDB). See also the tests for how it is used.

SingletonFactory is the Factory metaclass wrapped by the Singleton metaclass, so that it is allowed to return only one instantiation of the created class.

@@ -106,14 +106,14 @@ def write(self, collection_name, data,
pass

@abstractmethod
def read(self, collection_name, query, selection=None):
def read(self, collection_name, query=None, selection=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So a call on read using an empty query means that we retrieve all of the documents inside? [but yeah we did agree on that]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh damn I am not sure why I did that... see bad commit names. 😛 I am removing/stashing the commit tsirif/orion@4570cd6 until it becomes necessary. @bouthilx agree?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what is the problem here. The behavior db.read("collection_name") being similar to db.read("collection_name", {}) seems fine the me.

@@ -148,4 +148,11 @@ class DatabaseError(RuntimeError):
pass


from metaopt.io.database.mongodb import MongoDB # noqa
@six.add_metaclass(SingletonFactory) # pylint: disable=too-few-public-methods,abstract-method
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha here

('METAOPT_DB_NAME', 'name', 'MetaOpt'),
('METAOPT_DB_TYPE', 'type', 'MongoDB'),
('METAOPT_DB_ADDRESS', 'address', socket.gethostbyname(socket.gethostname()))
('METAOPT_DB_NAME', 'dbname', 'MetaOpt'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes these variable names are much more clear


resources -- {resource_alias: (entry_address, scheduler, scheduler_ops)} (optional)

algorithm -- {optimizer module name : method-specific configuration}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat 👍

from metaopt.io.database import Database
from metaopt.io.database.mongodb import MongoDB

TEST_DIR = os.path.dirname(os.path.abspath(__file__))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double check where does this point to?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the directory where experiment.yaml can be found, so that it can be loaded and used for the tests.

def test_empty_first_call(self):
"""Should not be able to make first call without any arguments.

Hegelian Ontology Primer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hegelian Ontology Primer :)


def test_instatiation_and_singleton(self):
"""Test create just one object, that object persists between calls."""
database = Database(of_type='MongoDB', dbname='metaopt_test',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cute we have an of_type now, so we could switch to PostgreSQL I assume

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Basically if everything is alright then we have a wrapper officially.

moptdb = MongoDB(username='user', password='pass', dbname='metaopt_test')
return moptdb


@pytest.mark.usefixtures("null_db_instances")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last question, what is null_db_instances?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a setup fixture that resets Singleton-type classes internal singleton instance object to None, so that we can test in the same run every possible invocation of a Singleton's creation. We assume independency between two different test functions.

@bouthilx
Copy link
Member

bouthilx commented Dec 3, 2017

@tsirif I will look at it Monday morning.

Copy link
Member

@bouthilx bouthilx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed for making now a general Factory class for singletons since we will need many in the short term (Database, experiment, worker, producer, etc). I'm not sure however about the way children of the abstract singletons are fetched (see comment)


modules = []
base = import_module(cls.__base__.__module__)
py_files = map(lambda x: '.' + x.split('.')[0].split('/')[-1],
Copy link
Member

@bouthilx bouthilx Dec 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will be painful to test. Why not depending on imports for making available the subclasses? It makes sense to import all classes of a given type in the same __init__.py file, database/__init__.py for instance. Also, it makes it easier to drop subclasses if they have unmet dependencies for instance. That way Database(of_type="PostgreSQL") would not be possible if the dependencies are unmet (it's not imported). Maybe dropping them for that is not a good idea but it's an example of how more flexible it is. The downside is making obligatory imports in specific __init__.py but that is a common practice anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it like because if it's pain to append a class name to a types list, then I assume it would be also a pain to require importing every new class to __init__.py. This works. You can checks the test code. It assumes that all implementations lie in files that reside in the same directory as the base implementation. What else should be the case, in order to be considered that it is tested properly?

(Not too much) Later this will be augmented with searching through the pkg_resources for registered entry_points also, in order to include the case where the interface is exported as a plugin (I consider the database interface to be an internal though).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was worried that we would try looking all around so there could be many corner cases to test, but what you explain here is simple. Let's try it that way.

@@ -47,10 +36,12 @@ def clean_db(database, exp_config):
@pytest.fixture(scope='module')
def moptdb():
"""Return MongoDB wrapper instance initiated with test opts."""
MongoDB.instance = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it needed here because moptdb() is not a test so you cannot use null_db_instances fixture?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that null_db_instances has 'function' scope, while moptdb has 'module' scope. If there's a need to create a 'module' scope one I will do it, but currently there's no other needs for it. (It's kinda annoying, I know..)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. Thanks for the explanation.

@bouthilx
Copy link
Member

bouthilx commented Dec 4, 2017

@tsirif Alright! Needs rebase on master and ready to merge.

- Set `MongoDB.instance` to None after tests of instantiation to keep
tests independent of one another.
- Create and return the same instance of a specific available
implementation of AbstractDB
- Document possible arguments for a Database @ `merge_mopt_config`
- SingletonFactory inherits from AbstractSingletonType and Factory:
to be used for creating singleton's out of an interface.
- `types` attribute of Factory classes (like `Database`) is dynamically
populated by using **glob** to find implementation files and import
them, and then using `cls.__base__.__subclasses__()` to create list with
the visible names that are subclassing cls.__base__ (i.e. `AbstractDB`
for `Database`).
- Invoking with arguments after the first call on a SingletonFactory
class would result in raising a ValueError.
@tsirif
Copy link
Member Author

tsirif commented Dec 4, 2017

@bouthilx Done!

@dendisuhubdy
Copy link
Contributor

Is this ok to merge?

@tsirif
Copy link
Member Author

tsirif commented Dec 5, 2017

@dendisuhubdy Yes!

@bouthilx bouthilx merged commit 9f57125 into Epistimio:master Dec 5, 2017
@tsirif tsirif deleted the feature/db/creator branch December 7, 2017 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants