
MongoDB - class and tests #8

Merged 1 commit on Nov 30, 2017

Conversation

dendisuhubdy
Contributor

No description provided.

from abstractdb import AbstractDB

try:
from StringIO import StringIO
Member

In which case is StringIO not available?

Contributor Author

In Python 3.0 StringIO and cStringIO modules are gone. See https://stackoverflow.com/questions/11914472/stringio-in-python3
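The usual compatibility shim looks like this; a minimal sketch, assuming only the class itself is needed (on Python 3 the class lives in io):

```python
try:
    # Python 2: prefer the C-accelerated version when present
    from cStringIO import StringIO
except ImportError:
    try:
        from StringIO import StringIO  # pure-Python Python 2 fallback
    except ImportError:
        from io import StringIO  # Python 3: StringIO moved into io

buf = StringIO()
buf.write(u"hello")
assert buf.getvalue() == u"hello"
```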

Member

Ah! You are indeed speaking to someone ignorant of Python 3.

Contributor Author

Gotcha 👍

Member

@tsirif tsirif Nov 17, 2017

self.password = str(password)

# we might have a different port for the database port
# so here we do a check if the port is there
Member

Is that a TODO comment?

Contributor Author

I left it over from an if block I had before; I should've removed it. Noted, thanks.

# according to MongoDB documentation
# https://docs.mongodb.com/getting-started/python/client/

self.uri = 'mongodb://' + self.dbhost + ':' + self.dbport
Member

Building the DB URL (you typed uri, by the way; was it intentional?) can get complicated if we support many options. That would deserve its own method. It could be worth making it a dynamic @property.

Contributor Author

I used the full URI until our discussion last week with Fred that for current localhost testing purposes we shouldn't put a default username and password because it would get passed on. Anyway, I'll comment out this URI and use the one following this line instead.

Member

I think that the @property is a good idea. I am not sure how we should ask for the password. There are multiple ways of authentication here: https://api.mongodb.com/python/current/examples/authentication.html

Also, there is this useful standard-library module in 2.7 and >=3.5: https://docs.python.org/3/library/getpass.html
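A sketch of how getpass could fit in; METAOPT_DB_PASSWORD is a hypothetical variable name for a non-interactive fallback, not anything from this PR:

```python
import getpass
import os

def prompt_password(username):
    # METAOPT_DB_PASSWORD is a hypothetical name for non-interactive runs
    # (CI, scripts); otherwise fall back to a prompt that does not echo
    # the password to the terminal.
    env = os.environ.get("METAOPT_DB_PASSWORD")
    if env is not None:
        return env
    return getpass.getpass("Password for {0}: ".format(username))
```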

conn = MongoClient(self.uri)
self.conn = conn
except pymongo.errors.ConnectionFailure:
raise Exception("Could not connect to server")
Member

Could ConnectionFailure contain important information that we would like to pass on?

Contributor Author

I don't see why we would need it. If we catch pymongo.errors.ConnectionFailure as e and pass on (like in C), the error value is null, so outside of the try/except block we would always return a valid connection object.

Member

By passing on, I meant passing the error information in the raised error so that it finds its way in the stack trace printed to the user.

Contributor Author

Hmmmn conn itself contains enough information to let the user know if things aren't right

Member

It is not a good idea to raise or catch pure Exception objects; there is no way to differentiate between cases. I agree with Dendi's thought to reraise an exception here, but it should be an Exception of our own w.r.t. common database problems, so that we can wrap in the future other libraries as well.

However, I think we should still keep the traceback, Pythonic stuff. raise wrapped_e from e clauses are the answer here, but unfortunately, however beautiful (I find) they are, they are Py3-only. But there is this:
http://python-future.org/compatible_idioms.html#raising-exceptions (from future.utils)
https://pythonhosted.org/six/#six.raise_from (from six, although they do not ship the correct implementation of this function; there's a current PR about this)

For truthful reference: https://www.python.org/dev/peps/pep-3134/#rationale
.. seemore:: difference between implicit and explicit exception chaining :P
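A minimal sketch of the wrapping idea, using a project-specific DatabaseError and Python 3's explicit chaining; connect and make_client are illustrative names, and ConnectionError stands in for pymongo.errors.ConnectionFailure:

```python
class DatabaseError(Exception):
    """Hypothetical project-level wrapper for backend-agnostic DB failures."""

def connect(make_client):
    try:
        return make_client()
    except ConnectionError as exc:  # stand-in for pymongo.errors.ConnectionFailure
        # Python 3 explicit chaining; on Python 2, six.raise_from or
        # future.utils.raise_from gives an equivalent effect.
        raise DatabaseError("Could not connect to server") from exc
```

With chaining, the original ConnectionFailure still shows up in the printed traceback as the `__cause__` of the DatabaseError, so no information is lost.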

Member

@tsirif I referred to this example in another comment for a way to reraise an exception. I agree we should not reraise it as a pure Exception. But again, I'm not sure we would need to reraise it. Isn't the ConnectionFailure explicit enough?

Contributor Author

Either way. I added it to the __init__.py in utils.py and redid all of the exception blocks.

def close_connection(self):
try:
close = self.conn.close()
except pymongo.errors.ConnectionFailure:
Member

Idem

except pymongo.errors.ConnectionFailure:
raise Exception("Could determine unlock due to connection")

def insert(self, experiment_name, trial_field, dictionary, field_filters=None):
Member

trial_field is used as a collection name. It's confusing. Is trial_field supposed to be a string inside the collection experiment_name pointing to a dictionary of trials ({id: trial})? Shouldn't that trial_field be a CONSTANT variable then?

Nevertheless, I'm not sure DB classes should have experiment and trial semantic tied to them. From Christos' UMLs, it seems to me that Experiment and Trial should have their own interface, with each using AbstractDB internally (which can be MongoDB or PostgreSQL, etc). In some way, workers and producers do not need to know they are communicating with a DB, anything could be hidden behind Experiment and Trial objects; Database, Sockets, whatever. @tsirif Any thoughts on that?

# 'hyperparam_type': 'int',
# 'hyperparam_value': '200'}

# check field_filters value
Member

Idem

if field_filters is None:
field_filters = {}

dbcollection = self.db[experiment_name][trial_field]
Member

pep8

return conn

def mongodb_insert_test(self):
result = self.db.insert(self.experiment, self.trial, self.query_test, field_filters=None)
Member

ok I see here what trial_field was intended to be. I'd prefer to combine them inside a single dict inside experiments otherwise they kind of pollute the experiment row.

Member

I am not sure that I understand what you mean. Could you give 2 cases?

Member

{ name: dummy,
  # some other properties,
  trial_1: {
      # properties of trial_1
   },
  # ...
  trial_n: {
      # properties of trial_n
   }
  # some other properties,
}

vs

{ name: dummy,
  # some other properties,
  trials: [
      {
          id: 1, 
          # properties of trial #1 
      },
      # ...
      trial_n: {
          # properties of trial_n
      }
   },
  # some other properties,
}

Trials could be either a list or a dictionary


from metaopt.io.database.mongodb import MongoDB

class MongoDB_test():
Member

Why not using unittest.TestCase and tests directly inside the class rather than gluing them together in a function later?

Contributor Author

Gotcha I adjusted this per assert in the functions inside the class

Member

No need for subclassing in general: https://docs.pytest.org/en/latest/getting-started.html

Member

No but the assert methods coming with unittest.TestCase are quite useful. 🙂

# try to initialize connection
# setting this as a function allows to make sure reconnection
# is properly done in the case of an unreliable network
self.uri = property(self.get_uri, self.set_uri, \
Member

Why not using this syntax?

@property
def uri(self):
    # ...

@uri.setter
def uri(self, value):
    # ...

uri = "mongodb://%s:%s@%s:%s" % (quote_plus(self._username), \
quote_plus(self._password), \
self._dbhost, self._dbport)
self._uri = uri
Member

uri passed to the setter is not used. Do we support editing dbhost, dbport, etc., or editing uri? Or both? I think we should support both, but for simplicity I would only support setting dbhost, dbport and the others for now. If we allow setting the uri, then we need to parse it and set dbhost and the others accordingly. We can keep that for later.
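A sketch of the derived, read-only property being discussed; MongoDBConfig and its attribute names are illustrative, not the merged code:

```python
from urllib.parse import quote_plus  # Python 2: from urllib import quote_plus

class MongoDBConfig(object):
    """Illustrative config object; uri is derived and read-only."""

    def __init__(self, dbhost, dbport, username=None, password=None):
        self.dbhost = dbhost
        self.dbport = dbport
        self.username = username
        self.password = password

    @property
    def uri(self):
        # Rebuilt on every access, so editing dbhost/dbport etc. is enough;
        # no uri setter (and no uri parsing) is needed for now.
        if self.username:
            return "mongodb://%s:%s@%s:%s" % (
                quote_plus(self.username), quote_plus(self.password or ""),
                self.dbhost, self.dbport)
        return "mongodb://%s:%s" % (self.dbhost, self.dbport)
```

Setting dbhost or dbport on the instance is immediately reflected in uri, which is the simpler of the two directions.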

# return connection string for checking purposes
return conn

def check_connection(self):
if self.db:
#print("Connection Successful!")
print("Connection is successful!")
Member

logger.info please :)

return True
else:
#print("Please check your connection")
print("Please check your connection")
Member

idem

return False

def close_connection(self):
try:
close = self.conn.close()
except pymongo.errors.ConnectionFailure:
raise Exception("Could not close socket connection")
except pymongo.errors.ConnectionFailure as e:
Member

I was not saying we should do this. I was just wondering what kind of information is passed through ConnectionFailure. Is there any relevant information that we should not lose? Additionally, I believe raising Exception like this might break the stack trace. Look at this for instance. For simplicity, could the ConnectionFailure error message be good enough to just let it pass with no try-except clause?

@bouthilx
Member

bouthilx commented Nov 15, 2017

@dendisuhubdy It would be preferable to use informative messages for the commits. Taken out of context, "adding suggested changes from code review" doesn't mean anything. Even then, the first thought I had reading the commit's name was "What changes did I ask for exactly? I don't remember". 🙂

A good reading about commit messages: http://who-t.blogspot.ca/2009/12/on-commit-messages.html

Member

@tsirif tsirif left a comment

Gogogo let's close this tomorrow!




Member

@bouthilx bouthilx left a comment

We will need to address the design issue of separating AbstractDB and its children from Experiment and Trial

if __debug__:
self.logger = logging.getLogger('database')
self.fh = logging.FileHandler('database.log')
self.fh.setLevel(logging.DEBUG)
Member

I understand why we would want a detailed log for the database but do we really want to have DEBUG by default? I guess that's like system's log that are always very verbose by default.

return self._uri

@uri.setter
def uri(self):
Member

That's not a correct setter interface. It should be

@uri.setter
def uri(self, value):

But that wouldn't make sense since we set it internally anyway. Maybe we just shouldn't have a setter.

Contributor Author

I'm kind of confused: if we do it this way, what value should we pass, the value of the URI itself?

Member

That's the thing. It would not make sense and that's why I'm saying that maybe there should not be any setter method for uri.

Contributor Author

So there would only be a @property? I'm not that familiar with Python yet, so does it work without a setter?

In the other case would a self.uri as the previous commits be preferable?

Member

It works without a setter. Using property can be more flexible if, for instance, the property is built using other arguments which can change over time. Since uri is built using other arguments, I think we should keep @property.
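A small illustration of the point: a @property with no setter is readable but raises AttributeError on assignment (Conn is a made-up class):

```python
class Conn(object):
    def __init__(self, host, port):
        self.host = host
        self.port = port

    @property
    def uri(self):  # read-only: no @uri.setter is defined
        return "mongodb://%s:%s" % (self.host, self.port)

c = Conn("localhost", 27017)
assert c.uri == "mongodb://localhost:27017"
try:
    c.uri = "mongodb://elsewhere:1234"  # no setter -> AttributeError
except AttributeError:
    pass
```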


return True
else:
#print("Please check your connection")
self.logger.info("Please check your connection")
Member

It seems weird to me to print logging info inside the check_connection like this. If False, I guess the code calling the function might raise an error or something. Why should the check_connection print logging info then? Maybe that would be better to set them as debug. I don't know. @tsirif What do you think?

return False

def close_connection(self):
try:
close = self.conn.close()
except pymongo.errors.ConnectionFailure:
raise Exception("Could not close socket connection")
self.logger.info("Could not close socket connection")
Member

This will never get executed

return close

def is_locked(self):
try:
is_locked = self.conn.is_locked()
except pymongo.errors.ConnectionFailure:
raise Exception("Could not determine if connection is locked")
self.logger.info("Could not determine if connection is locked")
Member

This will also never get executed

# 'hyperparam_type': 'int',
# 'hyperparam_value': '200'
# }
# }
Member

That would be worth turning into a docstring

Contributor Author

Cool

Member

@bouthilx bouthilx left a comment

DB classes still have experiment and trial semantic tied to them. Methods insert, update, load and remove should be more general

formatter = logging.Formatter('%(asctime)s - %(name)s -'
'%(levelname)s - %(message)s')
self.fh.setFormatter(formatter)
self.logger.addHandler(self.fh)
Member

Why set the filehandler as an attribute of the AbstractDB?

Contributor Author

Because it will be used if we would want to have a PostgreSQL or MySQL class down the line

Member

They will change the file handler?

Contributor Author

@dendisuhubdy dendisuhubdy Nov 20, 2017

Shouldn't one prefer just one type of database? Then we would need just one database.log file rather than a mongodb-database.log, and thus one database file handler. What do you think? :)

Member

Yes, but then I still don't see why set self.fh. It seems to me that they can all use self.logger just fine without messing with self.fh.

self.conn = conn
except pymongo.errors.ConnectionFailure as e:
raise_from(DatabaseError("Could not connect to server"), e)
self.logger.debug("Could not connect to server")
Member

This will never be executed

Contributor Author

I don't understand why this would never be executed.

Member

Lines under raise SomeError() never get executed. Once reraising the error, the process will move up to the code calling this method.

Contributor Author

Ok so let's settle for raise Exception then?

Member

He means that commands should be in reverse order 😛

Member

First logger.debug and then raise_from
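That ordering could look like the following sketch, with a hypothetical close_connection and OSError standing in for pymongo.errors.ConnectionFailure:

```python
import logging

logger = logging.getLogger("database")

class DatabaseError(Exception):
    """Hypothetical project-level wrapper for backend failures."""

def close_connection(conn):
    try:
        conn.close()
    except OSError as exc:  # stand-in for pymongo.errors.ConnectionFailure
        # Log first, then reraise: any statement placed after `raise`
        # would be unreachable.
        logger.debug("Could not close socket connection")
        raise DatabaseError("Could not close socket connection.") from exc
```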

Contributor Author

Oh lol 💃

close = self.conn.close()
except pymongo.errors.ConnectionFailure as e:
raise_from(DatabaseError("Could not close socket connection."), e)
self.logger.debug("Could not close socket connection")
Member

idem

is_locked = self.conn.is_locked()
except pymongo.errors.ConnectionFailure as e:
raise_from(DatabaseError("Could not determine if connection is locked."), e)
self.logger.debug("Could not determine if connection is locked.")
Member

idem

unlock = self.conn.unlock()
except pymongo.errors.ConnectionFailure as e:
raise_from(DatabaseError("Could not determine unlock due to connection."), e)
self.logger.debug("Could not determine unlock due to connection")
Member

idem

result = db.remove(experiment, trial, field_filters)
assert result is None

def mongodb_tests():
Member

I don't understand why you moved back to functions like this.

Contributor Author

I thought we agreed on working with pytest; that's why we moved to these functions. Now we've decided on unittest.TestCase, so are we moving back to the previous class?

Member

We can use pytest without unittest if Christos and you prefer this. From what I understand, Christos' comment was not against nosetest but simply that pytest supports class tests without inheriting from special classes like unittest.TestCase. You can avoid writing a test function to combine them all by structuring the tests in a class.
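A sketch of that structure; TestMongoDB and the dict-backed store are stand-ins for the real MongoDB-backed tests:

```python
class TestMongoDB(object):
    """pytest collects classes named Test* and methods named test_* even
    without unittest.TestCase; the dict here is a stand-in for a real DB."""

    def setup_method(self, method):
        # pytest calls this before each test method: a fresh fake database.
        self.db = {}

    def test_insert(self):
        self.db["dummy_experiment"] = {"trials": [{"id": 1}]}
        assert self.db["dummy_experiment"]["trials"][0]["id"] == 1

    def test_remove(self):
        self.db["dummy_experiment"] = {}
        del self.db["dummy_experiment"]
        assert "dummy_experiment" not in self.db
```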




@@ -32,19 +32,19 @@ def close_connection(self):
pass

@abstractmethod
-    def insert(self, experiment_name, trial_field, dictionary, field_filters):
+    def insert(self, collection_name, dictionary, field_filters):
Member

I don't think field_filters is necessary here

pass

@abstractmethod
-    def update(self, experiment_name, trial_field, update_dictionary, field_filters):
+    def update(self, collection_name, update_dictionary, field_filters):
Member

I think (self, collection_name, query, update_dictionary) would be more clear.

pass

@abstractmethod
-    def load(self, experiment_name, experiment_field, field_filters):
+    def load(self, collection_name, field_filters):
Member

I think (self, collection_name, query, projection) or (self, collection_name, query, selection) would be more clear. projection is the name used by mongodb for what is called a selection in SQL DBs. I don't have a clear preference. Sticking to mongodb's nomenclature could make the interface more coherent, but at the same time I always found projection to be less obvious than selection.

pass

@abstractmethod
-    def remove(self, experiment_name, trial_field, field_filters):
+    def remove(self, collection_name, field_filters):
Member

I think (self, collection_name, query) would be more clear.
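Putting the suggested signatures together, the abstract interface might read like this; a sketch of the reviewers' proposals, not the merged code:

```python
from abc import ABCMeta, abstractmethod

class AbstractDB(metaclass=ABCMeta):  # Py3 syntax; on Py2: six.with_metaclass
    """Generalized interface from the review: no experiment/trial semantics,
    only collections, queries, and documents."""

    @abstractmethod
    def insert(self, collection_name, document_dictionaries):
        """Insert documents; a plain insert needs no query."""

    @abstractmethod
    def update(self, collection_name, query, update_dictionary):
        """Apply update_dictionary to documents matching query."""

    @abstractmethod
    def load(self, collection_name, query, projection=None):
        """Read documents matching query; projection selects fields."""

    @abstractmethod
    def remove(self, collection_name, query):
        """Delete documents matching query."""
```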


-    if len(dbdocs) == 0:
+    if len(dbdocs) is 0:
Member

Why switch from == to is?

Contributor Author

Technically, is works on object identity and == works on value comparison. I guess in this case it's a comparison between the lengths; I was just trying to make the code look more functional.

Member

I know the difference :) and I would agree if it was meant for is False but here it's really about integers.
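A short illustration of why == is the right choice for integers (CPython's small-int cache makes `is` work only by accident):

```python
# `==` compares values; `is` compares object identity. CPython caches small
# ints, so `len(x) is 0` works only by accident, and the same pattern can
# break for values outside the cache. Use `==` for numeric comparison.
dbdocs = []
assert len(dbdocs) == 0        # the reliable emptiness check

big = int("1000")              # constructed at runtime, not a cached literal
assert big == 1000             # value equality: always correct
# `big is 1000` is typically False in CPython (and `is` with a literal
# raises a SyntaxWarning on Python 3.8+), so never rely on it.
```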

def insert(self, collection_name, query, dictionary):

dbcollection = self.db[collection_name]
result = dbcollection.insert(query, dictionary)
Member

AFAIK there is no such thing as insert(query, dict) in pymongo. It should be insert_many(query, objects) where objects is a list of dictionaries. Also I can't see why we would want to query something when we do an insert. It's not an update.

Contributor Author

insert does exist; see "What's the difference between insert(), insertOne() and insertMany() methods on MongoDB". I've double-checked it with a local execution against MongoDB on my computer too.

Contributor Author

You're right, though, that it's deprecated; I'll make some changes. objects does sound ambiguous, however; I propose dictionaries?

Member

Oh! I thought it did not exist, looking at the doc here. Still, insert does not take a query as first argument. I don't understand how your tests can pass 🤔

I agree objects is not very clear but dictionaries only tells about the container not the content. What about document_dictionaries?

Contributor Author

document_dictionaries sounds better 👍
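A sketch of the agreed-upon shape, with a fake collection standing in for pymongo's so it runs without a live server; in real pymongo the call would be self.db[collection_name].insert_many(document_dictionaries):

```python
class FakeCollection(object):
    """Test double for pymongo.collection.Collection (illustrative only)."""

    def __init__(self):
        self.documents = []

    def insert_many(self, document_dictionaries):
        # Mirrors pymongo's non-deprecated bulk insert: it takes a list of
        # dicts and no query argument, since an insert is not an update.
        self.documents.extend(document_dictionaries)

def insert(db, collection_name, document_dictionaries):
    return db[collection_name].insert_many(document_dictionaries)
```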

@tsirif
Member

tsirif commented Nov 21, 2017

Public Interface

  • Read data by attribute name from db, experiment as "value"/"list of values": [collection_name, attr_name, **filter] -> data
  1. From collection, with name collection_name, read attribute, with name attr_name and return a "value" (number, string or dict) or a "list of values", corresponding to that attribute.
  2. If attribute, with name attr_name, is known to correspond to a "list of values", then **filter can contain key=(how, value) or key=value entries:
    • key := attribute of an object in that list
    • how := way of comparing; one of ['=', '>=', '<=', '>', '<'] ?
    • value := reference "value"
    • A key=value entry assumes how == '='.
  3. A certain experiment can be accessed with attr_name == exp_name from collection 'experiments'. A certain attribute of an experiment can be accessed by attr_name == '<exp_name>.<attr_name>'.
  • Write data by attribute name to db, experiment as "value"/"list of values": [collection_name, attr_name, data, **filter] -> data
  1. len(filter) != 0 assumes an update at a "list of values" attribute, with name attr_name. data is a "value".
  2. len(filter) == 0 assumes:
    • If attribute, with attr_name, is known to correspond to a "list of values", then insert (many) data in that list.
    • Else, (over)write data of that attribute.
  • Provide context manager interface for a specific experiment, with name exp_name, in a specific collection, with name collection_name.

@tsirif
Member

tsirif commented Nov 21, 2017

@dendisuhubdy To be able not to break CI from now on, it would be better if, instead of merging master back, you would rebase on top of master (after #7 is merged ^_^)

@bouthilx
Member

@tsirif I don't understand why we would need 2 read/write interfaces for the DB rather than a single one. For now I would only rely on MongoDB's methods for doing complex queries and updates. All the information for this can be contained in the dictionaries passed to the DB methods. We can later port those dictionaries to SQL statements.

@tsirif
Member

tsirif commented Nov 22, 2017

Initially I thought it would be convenient to have two different sets, one for experiment entries and one for an experiment's attributes. But I see your point, you are right. I am retracting that.

def test_bad_connection():
    """Raise when connection cannot be achieved."""
    with pytest.raises(DatabaseError) as exc_info:
        MongoDB(dbhost='asdfada', username='uasdfaf', password='paasdfss')
Member

How do you test this? You don't have a database right?

Contributor Author

So for the wrong host we would assert that the connection will definitely fail.

Member

Yes I was confused because I didn't know we had a mongo database setup for the tests. I realized when looking at the CI setup files.

from metaopt.io.database.mongodb import MongoDB


@pytest.fixture(scope='module')
Member

Wow, I didn't know about pytest fixtures, that's nice!

.pylintrc Outdated
@@ -51,7 +51,7 @@ confidence=
# --enable=similarities". If you want to run only the classes checker, but have
# no Warning level messages displayed, use"--disable=all --enable=classes
# --disable=W"
disable=fixme,F0401,intern-builtin,nonzero-method,parameter-unpacking,backtick,raw_input-builtin,dict-view-method,filter-builtin-not-iterating,long-builtin,unichr-builtin,input-builtin,unicode-builtin,file-builtin,map-builtin-not-iterating,delslice-method,apply-builtin,cmp-method,setslice-method,coerce-method,long-suffix,raising-string,import-star-module-level,buffer-builtin,reload-builtin,unpacking-in-except,print-statement,hex-method,old-octal-literal,metaclass-assignment,dict-iter-method,range-builtin-not-iterating,using-cmp-argument,indexing-exception,no-absolute-import,coerce-builtin,getslice-method,suppressed-message,execfile-builtin,round-builtin,useless-suppression,reduce-builtin,old-raise-syntax,zip-builtin-not-iterating,cmp-builtin,xrange-builtin,standarderror-builtin,old-division,oct-method,next-method-called,old-ne-operator,basestring-builtin
disable=redefined-outer-name,fixme,F0401,intern-builtin,nonzero-method,parameter-unpacking,backtick,raw_input-builtin,dict-view-method,filter-builtin-not-iterating,long-builtin,unichr-builtin,input-builtin,unicode-builtin,file-builtin,map-builtin-not-iterating,delslice-method,apply-builtin,cmp-method,setslice-method,coerce-method,long-suffix,raising-string,import-star-module-level,buffer-builtin,reload-builtin,unpacking-in-except,print-statement,hex-method,old-octal-literal,metaclass-assignment,dict-iter-method,range-builtin-not-iterating,using-cmp-argument,indexing-exception,no-absolute-import,coerce-builtin,getslice-method,suppressed-message,execfile-builtin,round-builtin,useless-suppression,reduce-builtin,old-raise-syntax,zip-builtin-not-iterating,cmp-builtin,xrange-builtin,standarderror-builtin,old-division,oct-method,next-method-called,old-ne-operator,basestring-builtin
Member

Why do we disable so many checks?

Member

Most of them are default exclusions; I will try to introduce a small number of them at a time.

@@ -29,7 +29,7 @@ install:
- pip install tox
before_script:
- sleep 15
- mongo metaopt --eval 'db.createUser({user:"user",pwd:"pass",roles:["readWrite"]});'
- mongo metaopt_test --eval 'db.createUser({user:"user",pwd:"pass",roles:["readWrite"]});'
Member

Ah! Here is the test database. Fine.

def __init__(self, dbhost='localhost', dbname='metaopt',
             dbport=27017, username=None, password=None):
    super(MongoDB, self).__init__(dbhost, dbname, dbport, username, password)
def __init__(self, uri):
Member

It seems my comment on slack wasn't clear. You did the exact opposite of what I proposed. 😆

I believe the simplest functional interface for now would be __init__(uri, with, all, the, sugar, already, there, being, optional), but inside __init__ only self._uri = uri would be set. Then, the uri getter would be as you wrote. However, there would be no setter for now, as none of the extra sugar is stored in the database yet.

I know it's tempting to implement all of it already. It's what, just 2 or 3 more functions and a couple of attributes, after all? But then we need to test all use cases, make sure passing a URI works just as well as passing attributes, and think about what to do if the user passes a URI plus other arguments. How do we handle the setter for uri? How do we handle the setters for the other arguments? It's a lot more work than it looks like, enough to make us miss Friday's milestone for the DB. Furthermore, we would have to maintain this code while implementing the rest of the minimum viable product, hence spending even more time maintaining extra features.

Contributor Author

I don't really get it. If you mean we should use the function parser, we need to pass a uri into it. Before, it was __init__(self, dbhost, dbname, dbport, username, password), which you said we shouldn't do; we should just use the parser instead. If that gets parsed again inside the MongoClient, we should just pass in the uri, which is exactly what this looks like.

I don't see how this is incorrect. You don't need to pass both the uri and all of dbhost, dbname, dbpassword, and dbport through __init__ all over again; it's inefficient.

Member

I see why you are confused, I'm sorry I didn't detail enough my thoughts.

I believe we should eventually support both

db = Database(uri)

and

db = Database(dbhost, dbname, dbport, username, password, **options)

Just like pymongo's MongoClient does. But for now only one of the two would be enough for the MVP. I believe Database(uri) would be the simplest one to implement first. You can keep the arguments in the init as they are optional, but they wouldn't be used for now.

We could use the same mechanism than MongoClient:

class pymongo.mongo_client.MongoClient(host='localhost', port=27017, ...)
...
The host parameter can be a full mongodb URI, in addition to a simple hostname
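A hedged sketch of how such a MongoClient-style host parameter could be handled on our side, assuming we only need to tell a full mongodb:// URI apart from a plain hostname (resolve_host is a hypothetical helper, not part of the PR):

```python
def resolve_host(host='localhost', port=27017):
    """Build a connection URI whether `host` is already a URI or a hostname."""
    if host.startswith('mongodb://'):
        # A full URI may embed its own port and credentials; use it as-is.
        return host
    return 'mongodb://%s:%d' % (host, port)
```

This keeps the first positional argument doing double duty, the way pymongo's MongoClient describes it.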

raise_from(DatabaseError("Could not determine unlock due to connection."), e)
return unlock

def insert(self, collection_name, query, dictionaries):
Member

I believe previous discussion still holds: #8 (comment)

@@ -86,10 +86,10 @@ def unlock(self):
raise_from(DatabaseError("Could not determine unlock due to connection."), e)
return unlock

def insert(self, collection_name, query, dictionaries):
def insert(self, collection_name, query, document_dictionaries):
Member

What about the query? Based on the doc, it seems it should be insert_many(document_dictionaries). I don't understand what a query would be used for during an insertion.

# see http://api.mongodb.com/python/current/api/pymongo/collection.html
# for more details

result = dbcollection.update_one(query, update_dictionary)
Member

I guess we should support update_many here.

@dendisuhubdy
Contributor Author

@tsirif @bouthilx are we going to implement #8 (comment), or is this thin wrapper enough?

@bouthilx
Member

@dendisuhubdy I believe the wrapper is enough to handle all of this. We can write _load, _update and _save methods in Experiment and Trial if we want to spare ourselves the burden of always specifying the collection_name and id. This way we have a clear separation between databases and meta-optimization-specific classes.

@tsirif What do you think?

@@ -50,6 +50,7 @@ def initiate_connection(self):
    try:
        conn = MongoClient(self.uri)
        self.conn = conn
        self.conn.admin.command('ismaster')
Member

That's a good example of why detailed commit messages are necessary. I believe that would be a good time to read and apply what I suggested in this comment. In this case, the commit message should explain why we need the ismaster command, and maybe also the alternatives and why command('ismaster') is better.

Writing detailed commit messages does slow down the process of committing code, but it significantly speeds up review and maintenance. Also, you will sometimes realize while writing the commit message that something is wrong or missing in the commit. Many of your commits in this PR have been undo/redo, which is a sign that what you were doing was not totally clear to you. It was my fault many times, because my comments were not detailed enough, but if you had taken the time to write down the commit message you could have realized that information was missing, and you would have asked me for clarifications before committing the changes.


Finds a single document and updates it, returning either the original or the updated document.
Member

Why a string here? It won't be registered as a docstring.

result = dbcollection.find_one_and_update(filter=query,
                                          update=update_dictionary,
                                          projection=selection,
                                          upsert=upsert,
Member

upsert is a nice addition, but why limit ourselves to one update? Please don't make a new commit with update_many. 🙂 I'd prefer we discuss it and see whether we settle on find_one_and_update or update_many.

@tsirif Any thoughts on this?

Contributor Author

We recently encountered a problem when doing a nested query inside a dictionary, and by the way update is deprecated http://api.mongodb.com/python/current/api/pymongo/collection.html (at the bottom of the page)

Member

update is deprecated but update_{one, many} are not.

Do you have an example for the problem? I'd like to understand it better. It's not clear to me that update_one rather than update_many would be the best fix for that kind of problem.

Contributor Author

Also, you cannot update two documents at once with a single MongoDB query; you will always have to do that in two queries. You can of course set a field to the same value, or increment by the same number, but you cannot do two distinct updates in MongoDB with the same query. As a reference https://stackoverflow.com/questions/17207183/way-to-update-multiple-documets-with-different-values

Member

Yes, sure. Different updates need different calls to update(). I don't see that as an argument against update_many. The only argument I see in favor of find_one_and_update is the atomicity of the operation. I'm not sure, however, that it is worth the cost in flexibility.

@@ -188,4 +188,8 @@ def remove(self, collection_name, query):
Returns:
result -- the document/documents
"""
self.db[collection_name].remove(query)
result = self.db[collection_name].find_one_and_delete(filter=query,
Member

To complement this comment, I feel like deletions are very likely to be requested on many documents, and we are tying our hands here.

Contributor Author

Idem. But I'd still rather return the deleted document at the end of the function than use delete_one or delete_many.

Member

Being able to recover the deleted document could be handy but I'm not sure we are likely to encounter such a use case.

Contributor Author

Yes, although rarely. I'm seeing a case where we want to remove a node from the set of resources if it suddenly becomes idle; we need to remove it from the resources, but then if the node shows up again on our ping, we might need to add it back. Just removing it might be dangerous.

Member

@bouthilx bouthilx Nov 23, 2017

I don't know. If resources are represented as objects which can easily register themselves in the database, then it would do no harm to remove one and register it again using the object's info rather than the deleted row's info. To me, the best point in favor of find_and_remove is if we want to delete a row and make sure nobody changes it meanwhile. If we do db.load() and then db.remove(), it could have changed in between. But again, how likely is that to happen given our use cases, and is it worth limiting the flexibility of our interface?

@dendisuhubdy @tsirif What would you think about setting the base methods (insert, load, update, remove) as non-atomic ones supporting many operations, and later on extending the interface with find_and_(update, remove) if needed?

Member

@bouthilx bouthilx left a comment

@dendisuhubdy Please do not make any more modifications for now. Every comment is up for discussion, and it would be preferable that we agree on what to do before changing anything. Otherwise, comments may get hidden by GitHub and it becomes hard to keep track of the discussions.

@tsirif Please take part of the discussion. 🙂

@@ -79,6 +72,15 @@ def is_locked(self):
raise_from(DatabaseError("Could not determine if connection is locked."), e)
return is_locked

def lock(self):
Member

I would not add locks to databases now. As Fred said in previous discussions, locking databases is not trivial. I opened a new issue (#10) to keep track of the idea.

dictionary,
selection,
atomic=False,
upsert=False,
Member

Do we really want to give the option for upsert? I thought it would always be True.

Contributor Author

If we always set it to True, we'd lose the ability to choose between just updating and doing an insert.

Member

No, upsert only inserts if the document is not found using its id. A very common use case will be writing new documents, so it would be weird to always have to set database.write(new_documents, update=True). Also, in our case updates will usually be done by the objects representing the rows (Experiments, Trials, etc.), so it will be simple to pass the id to write() so that it does an update rather than an insert. All of that without specifying upsert.
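The upsert behavior discussed here can be illustrated with a toy in-memory "collection" (a dict keyed by _id). This is only a sketch of the semantics, not pymongo's implementation, and upsert_write is a hypothetical name:

```python
def upsert_write(collection, document):
    """Update the document matching _id, or insert it if absent."""
    doc_id = document['_id']
    if doc_id in collection:
        collection[doc_id].update(document)  # found: update in place
        return 'updated'
    collection[doc_id] = dict(document)      # not found: insert
    return 'inserted'
```

With upsert always on, a caller that supplies a known id gets an update and any other caller gets an insert, with no extra flag to pass.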

Member

  1. I think that dictionary should be renamed to data and put as the second argument (after collection_name).
  2. query and selection should be optional.
  3. I agree that upsert should always be True.
  4. MongoDB specifics should not be asked for in the implementation of an AbstractDB.
  5. So, I believe that the complete argument list should be:
    (self, collection_name, data, query=None, selection=None, atomic=False)

selection,
atomic=False,
upsert=False,
sort=None,
Member

Sort is specific to find_one_and_update. I believe the atomic argument should lead to the same behavior whether True or False, with the only difference being the atomicity of the operation (safer but slower). I see two solutions:

  1. We lock around update_many
  2. We loop on find_one_and_update until all matches are updated (in which case sort argument is useless)

The problem with solution 1 is that database locking is not trivial. The problem with solution 2 is that the operation is only atomic at the row-update level; other rows could be changed by another process while the current process updates one.

Solution 2 seems bad because it is only weakly atomic. Because of that, I would prefer solution 1. Still, since it is not trivial and atomicity is not that important, I would rather leave it for later.

Member

Forget about my last comment. Based on the discussion here #10, it seems atomic operations on many rows are something we should forget about.

Then I think the behaviors for atomic True and False are too different to be crammed inside the write() method. Since the default behavior is not atomic, we could postpone the addition of atomic operations to write/remove and finish this PR with only write(collection_name, query, dictionary, selection) (and maybe upsert if we decide to make it definable). We can decide later whether we integrate atomicity inside write() to keep only write/read/remove, or whether we add new explicit methods for atomic operations.

atomic=False,
upsert=False,
sort=None,
return_document=ReturnDocument.BEFORE,
Member

I feel like in the use cases where we want the document returned, it will be ReturnDocument.AFTER (even though I know BEFORE is the default for MongoDB). Also, shouldn't we make that part of the abstract interface? Other kinds of DBs will want that as well.

Contributor Author

I'm suspicious that for MySQL or PostgreSQL the interface might be different, I'll do research for now and get back to you for that.

Member

The interface will probably be different, but we could handle that inside the method. If we define our own codes for return_document, we could use the same interface for all types of databases. It could be something like this inside the abstractdb python file.

class ReturnDocument(object):
    NONE = 0
    BEFORE = 1
    AFTER = 2

For simplicity I would only implement NONE for now.
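Inside the MongoDB implementation, such backend-agnostic codes could then be translated to pymongo's flag (pymongo's find_one_and_* methods take False for BEFORE and True for AFTER). A sketch under that assumption; to_pymongo is a hypothetical helper:

```python
class ReturnDocument(object):
    # Backend-agnostic codes, as proposed for abstractdb.
    NONE = 0
    BEFORE = 1
    AFTER = 2

def to_pymongo(code):
    """Map our code to pymongo's return_document flag; None means skip fetching."""
    if code == ReturnDocument.NONE:
        return None
    return code == ReturnDocument.AFTER
```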

upsert=False,
sort=None,
return_document=ReturnDocument.BEFORE,
collation=None):
Member

I don't think we should have this specific argument. We can add it later if someone requires it. Otherwise we need to test it and maintain it for nothing.

Contributor Author

Ah I see 👍

Member

@bouthilx bouthilx left a comment

@dendisuhubdy @tsirif Please give your opinions on the review comments. We need to make sure we agree on the modifications before making them. When commits are submitted, comments get hidden and it is hard to keep track. Furthermore, we cannot continue at this pace. Making and reviewing many small edits that may be changed after discussion is sucking up too much time and hurting our other projects (research, namely). If you want to propose small code changes for specific comments, please do so in the comments using markdown's syntax.

Also, I will have to ask you to remove extra features from the PR to speed up our progress. There is for instance the locking system and the support for both URI and attributes. For the atomic option, I believe we still have to discuss whether many-updates or atomicity will be the feature we leave for later; in other words, what the default behavior should be.

For any such feature, please make sure to open an issue and explain it so that we do not lose track of it.

I would like this review to be the penultimate one. This means that all comments should be discussed so that when commits are pushed they address all of the comments. Some comments are requests, others are open questions so they need to be discussed and we need to reach agreement before any commit can address them.

Thanks for your cooperation and continue the good work.

else:
uri = self._uri
ret = parse_uri(uri)
if self._uri is not None:
Member

I disagree with this. It is a weird side effect. If we want the uri to be settable, it should be through an explicit setter. We don't need both for now; we should decide what we support first: URI or attributes. I remember Pascal was in favor of attributes so that we can prompt users for passwords and avoid passwords in URIs. Personally I would start with URI, as it is much simpler to handle, but then we should make support for attributes a priority when the MVP is ready.

Member

I hold the same opinion that I wrote in the comment above. I agree with host only positional, being also the uri. But I think that we should let MongoClient handle the rest.

Member

For reference, comments are here


"""

def __init__(self, uri=None, dbname='metaopt', host='localhost',
Member

What would you think about making host the first argument and definable as a host or URI, like in MongoClient?

The host parameter can be a full mongodb URI, in addition to a simple hostname. It can also be a list of hostnames or URIs. Any port specified in the host string(s) will override the port parameter. If multiple mongodb URIs containing database or auth information are passed, the last database, username, and password present will be used. For username and passwords reserved characters like ‘:’, ‘/’, ‘+’ and ‘@’ must be percent encoded following RFC 2396:

As I saw in one of Christos' commits, dbname could be the first positional (non-optional) argument, as we know we will always work with only one database. Data for an experiment will be scattered across different collections/tables, not across different databases. The database is a singleton anyway.

Member

I think making host/uri the first and only positional argument is the best practice here. dbname can be specified inside the uri. Also, in this case explicitly given arguments can override uri-specified ones. In fact, I suggest that we let MongoDB handle the priority and just pass all the arguments to MongoClient. This in effect would drop the need for a uri property. What do you think?

Member

I love it. Defining the attributes separately in the configuration file is more natural than the URI. The URI is more convenient for passing as a command-line argument.

So, as you propose, I would make host/uri the first and only positional argument, and all the arguments would be passed to MongoClient rather than the URI. That would also address Pascal's concern about the username/password in the URI. In that case, I would drop everything related to URI: the property and self._uri. Until we add support for URI, MongoDB("some_uri") would fail.

"""Disconnect from database."""
pass

@abstractproperty
Member

I would leave lock for later.

Member

I agree. Remove these methods?

pass

@abstractmethod
def write(self, collection_name, data, query, atomic, selection):
Member

Optional arguments should also be in AbstractDB if they are intended to stay the same for all children. The docstring should describe all of that.

Member

Oh indeed. I did not see that. Will do.

pass

@abstractmethod
def read(self, collection_name, query, selection, atomic):
Member

idem

upsert=True)
return result.acknowledged
elif atomic is True:
selected_one_key, selection = _get_selected_one_key(selection)
Member

What is _get_selected_one_key? Why do we select only the first item of the given selection?

Member

If only one item has been specified in the selection, then return the value for the specific projection, not a dictionary whose final value is what we want.

Member

Ok, I understand the intent. Then we would not need to do db.read({}, {'arg': 1})['arg']. But isn't it a bit weird? I'd be surprised as a user if making a selection of a single value suddenly returned items rather than dictionaries. It's convenient, but it makes the method's behavior more complex. What if some function makes dynamic reads and at some point its selection dictionary becomes unitary? It would need to take into account that when its dictionary becomes unitary, it has to rebuild the dictionary because the method suddenly returns an item.

Personally, I would favor coherence and simplicity over a convenience like this. Also, an accumulation of small tweaks like that adds to the burden of testing and maintenance, thus slowing down development. Writing db.read({}, {'arg':1})['arg'] is less elegant but doesn't have such an impact.

Member

The deeper the dotted selections get ('arg1.arg2.arg3'), the uglier it will be. But I agree with your point on coherence and simplicity. Will see what I can do 😉

Member

We should finish this discussion before spending time on selected_one_key, otherwise it might be a waste of time.

upsert=True,
sort=[('$natural', -1)],
return_document=ReturnDocument.AFTER)
return _get_leaf_value(selected_one_key, result) if selected_one_key else result
Member

We return the value of the selected_one_key leaf? Why do we integrate this in this method?

Member

If selected_one_key is not None, return the final value of a nested dictionary like {'a': {'b': {'c': 5}}}.
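The leaf lookup being described could be sketched as follows; get_leaf is an illustrative stand-in for the PR's _get_leaf_value, whose exact signature is not shown here:

```python
def get_leaf(document, dotted_key):
    """Follow a dotted key like 'a.b.c' through nested dicts to the leaf."""
    value = document
    for part in dotted_key.split('.'):
        value = value[part]
    return value
```

So get_leaf({'a': {'b': {'c': 5}}}, 'a.b.c') yields 5 instead of the nested dictionary.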

Member

See comment above

dbdocs = list(cursor)
elif atomic is True:
# Return the most recent of the query
dbdocs = list(dbcollection.find(query, selection, sort=[('$natural', -1)], limit=1))
Member

This is weird; find is atomic anyway, even on many documents. To me, this attempt to unify write/read behaviors highlights the problem of integrating many-updates with one atomic update in the same write() method.

Member

I wanted an interface that returns only the most recent document. atomic means "which cannot be split / have separate parts".

Member

@bouthilx bouthilx Nov 28, 2017

In this case atomic means that the operation is safe for concurrency. The atomicity is about the operation. The operation could return many documents.

With the current implementation, we are forced to return a single document in write() because of MongoDB particularities, but in general write() could be applied to many documents atomically. We are forcing read()'s behavior to stay coherent with write()'s behavior, while the latter was already constrained by a specific database particularity. That doesn't sound like a good design to me.

It is my belief that we should give ourselves some time to figure out what would be the best way to integrate atomicity. Not having it won't hurt us for the MVP.

Member

I see your point. I think, though, that I will find it useful to search for a single most recent document later on, for the MVP. We could change the name of the optional argument to something else, though. I believe that in general this is OK for now. AbstractDB is an internal API; we can change this later.

Member

I understand having a way to fetch the most recent document can be handy, but I disagree this is how it should be done. If I read the following code snippet, it's not clear why the output would be the most recent row unless I read the doc.

most_recent_row = db.read(some_query, atomic=True)

The following is less elegant but more explicit

most_recent_row = max(db.read(some_query), key=lambda d: d["date"])

And could later be turned into

most_recent_row = db.read(some_query, sort={"date": -1})

Anyhow, we don't need sort for now, and adding it later would not impact the interface, so we don't need to rush it. Adding it now, however, would slow our progress. When we need it, we can simply make a PR for sort only.

It might feel like I'm arguing for details and slowing down our progress as I advocate for removing stuff rather than changing or adding, but we need to focus more. The tendency of adding extra features will hurt us more and more as we progress with a larger and more complicated code base. Every extra feature is adding burden on development and maintenance. When the code is designed properly, extra features are easy to integrate later on, so there is no reason to rush them in. What we should rather rush for is users' feedback. We cannot afford to design hastily for that, but we can spare time on extra features.

Member

I would just like an optional interface to return the most recent thing. Databases can generally do that. I would not like a general sort argument; that can be done as you demonstrated. However, searching for the most recent something is probably going to be common. I am not sure.

I agree with your last comment. It takes unjustified motivation to remove stuff I tried right now to facilitate for my future coding. However, I will try to contain my urge to implement in front stuff before they become necessary, in the future.

What would you suggest that I should do? Should I remove it? I think the current tests describe pretty much what I will need to proceed. Interface can be polished later, no need to perfect it right now as you said.

Member

It takes unjustified motivation to remove stuff I tried right now to facilitate for my future coding. However, I will try to contain my urge to implement in front stuff before they become necessary, in the future.

I don't understand what you mean.

What would you suggest that I should do? Should I remove it? I think the current tests describe pretty much what I will need to proceed. Interface can be polished later, no need to perfect it right now as you said.

I think the best would be to move the extra feature to a new branch and point to it in an issue. This way we do not lose track of the code but do not have the burden of maintaining it. From what I see there should be 2 issues: one about atomicity and one about getting the most recent document only. Those two should not be coupled.

# Return the most recent of the query
dbdocs = list(dbcollection.find(query, selection, sort=[('$natural', -1)], limit=1))

if len(dbdocs) is 0:
Member

Please use the pythonic way: if some_integer == 0:

Member

agreed.

@pytest.mark.skip(reason="sugar")
def test_ctx_read_experiment(self, exp_config, moptdb):
"""Fetch a whole experiment entry using an experiment context (always atomic)."""
with moptdb('experiments', {'exp_name': 'supernaedo2',
Member
@bouthilx bouthilx Nov 27, 2017

I don't understand. What is moptdb here? Is it a MongoDB? Where was __enter__ defined?

Member

I have not yet implemented this. These tests are marked to be skipped because of 'sugar'. See line 100.

Member

But this would go to the abstract db, once it gets implemented.

Member

Yes I saw the sugar skip. 😆

If I may ask, could you please remove the tests that are for sugar features? You can keep them in a branch on your fork to avoid losing the code. I would also encourage you to make an issue where you explain your idea of a context manager if you do spend some time on that. Nevertheless, we are past the milestone for the DB and will soon be past the milestone for Experiment/Trials, so I would recommend we focus on the base first.
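For the record, the "sugar" context manager that the skipped tests imagine could look roughly like this. This is a sketch only: the class name, the write-back on exit, and the read()/write() signatures are assumptions, not the PR's implementation:

```python
class DBContext:
    """Hypothetical sketch of the 'sugar' context manager discussed above.

    Assumes a database object exposing read()/write() as in AbstractDB;
    the names and write-back semantics here are illustrative only.
    """

    def __init__(self, db, collection, query):
        self.db = db
        self.collection = collection
        self.query = query
        self.document = None

    def __enter__(self):
        # Fetch the matching document once, on entry.
        docs = self.db.read(self.collection, self.query)
        self.document = docs[0] if docs else None
        return self.document

    def __exit__(self, exc_type, exc_value, traceback):
        # Write back in-place modifications when leaving the block normally.
        if exc_type is None and self.document is not None:
            self.db.write(self.collection, self.document, query=self.query)
        return False  # do not suppress exceptions
```

Usage would then match the shape of the skipped test, e.g. `with DBContext(moptdb, 'experiments', {'exp_name': 'supernaedo2'}) as exp: ...`.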

@@ -6,8 +6,9 @@

@six.add_metaclass(AbstractSingletonType)
class AbstractDB(object):
def __init__(self, dbhost='localhost', dbname='metaopt',
def __init__(self, uri, dbhost='localhost', dbname='metaopt',
Member

@dendisuhubdy @bouthilx Should uri also be optional? In tests for instance, I expect to pass only dbname, username and password.

from urllib import quote_plus

class MongoDB(AbstractDB):
def __init__(self, uri, dbhost='localhost', dbname='metaopt', dbport=27017, username=None, password=None):
Member

uri here should probably be optional (=None).

# setting this as a function allows to make sure reconnection
# is properly done in the case of an unreliable network

self._uri = uri
Member

This is a duplicate from AbstractDB

@property
def uri(self):
""" The URI property of the class """
uri = "mongodb://%s:%s@%s:%s" % (quote_plus(self._username),
Member

Should probably overwrite the private self._uri only if the user has not already provided one (if self._uri is None).
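A minimal sketch of that suggestion, assuming the attribute names from the diff (the exact URI format and defaults here are illustrative, not the PR's final code):

```python
from urllib.parse import quote_plus  # Python 3; `from urllib import quote_plus` in Python 2

class MongoDBSketch:
    """Illustrative only: derive the Mongo URI from its parts, but leave a
    user-supplied URI untouched (the `if self._uri is None` check suggested
    in the review)."""

    def __init__(self, uri=None, dbhost='localhost', dbname='metaopt',
                 dbport=27017, username=None, password=None):
        self._uri = uri
        self._host = dbhost
        self._port = dbport
        self._dbname = dbname
        self._username = username
        self._password = password

    @property
    def uri(self):
        # Only build a URI when the user did not provide one explicitly.
        if self._uri is None:
            self._uri = "mongodb://%s:%s@%s:%d/%s" % (
                quote_plus(self._username or ''),
                quote_plus(self._password or ''),
                self._host, self._port, self._dbname)
        return self._uri
```

Note that `quote_plus` escapes characters such as '@' in credentials, which would otherwise break URI parsing.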

# return connection string for checking purposes
return conn

def check_connection(self):
Member

Name it in a similar fashion to is_locked.


@codecov-io

codecov-io commented Nov 28, 2017

Codecov Report

Merging #8 into master will increase coverage by 52.07%.
The diff coverage is 99.19%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master       #8       +/-   ##
===========================================
+ Coverage   13.63%   65.71%   +52.07%     
===========================================
  Files           7        9        +2     
  Lines         154      385      +231     
  Branches       22       39       +17     
===========================================
+ Hits           21      253      +232     
+ Misses        133      130        -3     
- Partials        0        2        +2
Impacted Files Coverage Δ
tests/dummy_test.py 100% <100%> (ø) ⬆️
src/metaopt/io/database/__init__.py 100% <100%> (ø)
src/metaopt/utils/__init__.py 100% <100%> (ø) ⬆️
tests/core/unittests/mongodb_test.py 100% <100%> (ø)
src/metaopt/io/database/mongodb.py 97.22% <97.22%> (ø)
src/metaopt/worker/__init__.py 0% <0%> (-100%) ⬇️
src/metaopt/cli.py 0% <0%> (ø) ⬆️
... and 1 more

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 679f573...94f8e09. Read the comment docs.

@tsirif
Member

tsirif commented Nov 28, 2017

@dendisuhubdy @bouthilx I think this is done. What do you think? Dendi, could you merge the last commits I made?

@dendisuhubdy
Contributor Author

@tsirif I believe it is done. @bouthilx, could you approve the PR so @tsirif can merge it? :)

@bouthilx
Member

Overall yes but we didn't finish this discussion. When we are done discussing, I'll make a final review and we should be able to merge.

- Public API of AbstractDB class consists of: `initiate_connection`,
`is_connected`, `close_connection`, `write`, `read` and `remove`.
- `read` uses `MongoClient.find`
- `write` uses `MongoClient.insert_many`, when `query` is not supplied,
and `MongoClient.update_many` with `upsert=True` else.
- `remove` uses `MongoClient.delete_many`.
- Initializing `MongoDB` class needs either a proper Mongo URI or
separate args, which overwrite elements inferred from a possible URI.
- Create tests for each interface case.
- AbstractDB is a singleton.
- Fix docs configuration.
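As a quick reference, the mapping from the public API to pymongo calls summarized above can be captured in a small helper. This is purely illustrative, not part of the PR; it only records which pymongo method each AbstractDB operation is said to use:

```python
def pymongo_call_for(operation, query=None):
    """Map AbstractDB public methods to the underlying pymongo calls,
    per the summary above. A documentation helper only; the real class
    dispatches to these methods on a pymongo Collection object."""
    if operation == 'read':
        return 'find'
    if operation == 'write':
        # insert_many when no query is given; otherwise upsert via update_many
        return 'insert_many' if query is None else 'update_many(upsert=True)'
    if operation == 'remove':
        return 'delete_many'
    raise ValueError("unknown operation: %r" % operation)
```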
@dendisuhubdy dendisuhubdy dismissed bouthilx’s stale review November 30, 2017 02:28

Discussion has not yet concluded, but the major working functions of the database have been addressed. Issues have been written, along with associated links toward improvements.

@dendisuhubdy dendisuhubdy merged commit 07f89d9 into Epistimio:master Nov 30, 2017
@bouthilx
Member

bouthilx commented Nov 30, 2017

@dendisuhubdy Why did you dismiss the review? In the main comment of the review I wrote "I would like this review to be the penultimate one". Please do not dismiss reviews. I will never dismiss your reviews either. This mechanism is there for a reason.

@dendisuhubdy dendisuhubdy deleted the database branch January 10, 2018 17:03