Fix #7: implement main worker process, algorithm_registry and logging #15
Conversation
"""Base class for classification algorithms"""

import abc
@seanlip can we use this module now? I remember that we decided not to use this in Oppia because it would deviate from the existing approach.
I'm not too bothered either way, I think; your call!
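For context, a minimal sketch of what using the `abc` module for the base classifier could look like (Python 3 syntax shown; the Oppia codebase at the time was Python 2, where `__metaclass__ = abc.ABCMeta` would be used instead; `DummyClassifier` is a hypothetical subclass added purely for illustration):

```python
import abc


class BaseClassifier(abc.ABC):
    """Illustrative base class; abc prevents direct instantiation."""

    @abc.abstractmethod
    def train(self, training_data):
        """Subclasses must implement training on a list(dict)."""
        raise NotImplementedError


class DummyClassifier(BaseClassifier):
    """Hypothetical concrete subclass, for illustration only."""

    def train(self, training_data):
        # A real classifier would fit a model here; we just count examples.
        return len(training_data)
```

Attempting `BaseClassifier()` raises `TypeError`, which is the enforcement benefit `abc` adds over a plain `raise NotImplementedError` convention.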
@prasanna08 I have taken a first pass. PTAL at the comments. Thanks!
Below are some concepts used in this class.

training_data: list(dict). The training data that is used for training
    the classifier. This field is populated lazily when the job request
@prasanna08 I don't think "This field is populated lazily when the job request ..." is relevant here. It is only true in the case of training jobs.
Yup. Actually, I didn't verify the docstrings in much detail before issuing the PR; I copied the relevant code and took a brief look at the docstrings for consistency. I'll go through them once again.
@abc.abstractmethod
def train(self, training_data):
    """Loads examples for training.
I don't think that is the right description for this method.
"""Loads examples for training. | ||
|
||
Args: | ||
training_data: list(dict). The training data that is used for |
This description is correct on the Oppia side, but on the VM side this field will always be populated. In fact, the VM doesn't need to know anything about lazy population.
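A possible rewording of the docstring on the VM side, dropping the lazy-population claim (purely a sketch; the final wording is up to the PR author):

```python
class BaseClassifier(object):
    """Sketch only, to illustrate the suggested docstring change."""

    def train(self, training_data):
        """Trains the classifier on the given examples.

        Args:
            training_data: list(dict). The training data used to train the
                classifier. On the VM side this is always fully populated
                before the job reaches the algorithm; lazy population is
                an Oppia-side detail.
        """
        raise NotImplementedError
```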
core/services/job_services.py
Outdated
# pylint: disable=too-many-branches
def _validate_job_data(job_data):
    if not isinstance(job_data):
instance of what?
Oh, my bad.
try:
    job_data = job_services.get_next_job()
    if job_data is None:
        logging.info('No pending job requests.')
You might want to add additional info like the time, vm_id, etc. I guess this can be done by configuring the logger to append these details.
Oh sorry, I saw the log config after writing this comment :D
Actually, I have kept the logging format the same as GAE's log format, which includes all the necessary details, I guess.
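As a rough illustration of what such a configuration can look like with the standard `logging` module (the format string below is an assumption in the spirit of GAE-style logs, not the one actually used in the PR):

```python
import logging

# Assumed GAE-like format: timestamp, level, module:line, then the message.
LOG_FORMAT = '%(asctime)s %(levelname)s %(module)s:%(lineno)d] %(message)s'


def configure_logging():
    """Applies the format to the root logger (sketch only)."""
    logging.basicConfig(format=LOG_FORMAT, level=logging.INFO)
```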
vmconf.py
Outdated
FIXED_TIME_WAITING = 'fixed_time_wait'

# Seconds to wait in case of fixed time waiting approach.
FIXED_TIME_WAITING_SECS = 60
a better name perhaps? (something like FIXED_TIME_WAITING_PERIOD?)
Sounds good.

Actually, the idea is to use an exponential backoff algorithm for waiting in PROD and fixed-time waiting in DEV. On local machines there will be at most a few jobs, which can be processed quickly, and we don't want the VM to sleep for a long duration when there are no pending jobs; that's why we use a fixed backoff in DEV. That's not the case with PROD: there will be many jobs, so we can use exponential backoff there, because we also have to be wary of the resources the VM is using, and fixed-time waiting would waste them. However, exponential backoff is still a "future idea" which will be implemented later on.
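A sketch of the DEV/PROD split described above, fixed waiting versus exponential backoff (the function name, the `MAX_BACKOFF_SECS` cap, and the doubling factor are all assumptions; the PR itself only implements fixed-time waiting):

```python
FIXED_TIME_WAITING_PERIOD = 60  # Seconds; the DEV value from vmconf.py.
MAX_BACKOFF_SECS = 3600         # Illustrative cap, not taken from the PR.


def next_wait_secs(empty_polls, use_exponential_backoff):
    """Returns how long to sleep before polling for jobs again."""
    if not use_exponential_backoff:
        # DEV: fixed waiting, so local jobs are picked up quickly.
        return FIXED_TIME_WAITING_PERIOD
    # PROD (future idea): double the wait per empty poll, with a cap,
    # so an idle VM does not burn resources polling constantly.
    return min(FIXED_TIME_WAITING_PERIOD * (2 ** empty_polls),
               MAX_BACKOFF_SECS)
```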
@prasanna08 done! Sorry for the delay, I have taken another pass.
core/services/job_services.py
Outdated
"""This module contains functions used for polling, training and saving jobs.""" | ||
|
||
from core.services import remote_access_services | ||
from core.classifiers import algorithm_registry |
NIT: import order
core/services/job_services.py
Outdated
Args:
    algorithm_id: str. ID of classifier algorithm.
    training_data: dict. A dictionary containing training data.
wouldn't training data be a list of dictionaries?
Yup. My bad.
# See the License for the specific language governing permissions and
# limitations under the License.

"""This module contains functions used for polling, training and saving jobs."""
Don't we need a job_services_test.py?
I don't think so. They are just using the remote_access_services functions, so as long as those are working, these functions should work fine too. (I wasn't going to add this layer initially, but I added it later because higher modularity is always good for future maintenance.)
# See the License for the specific language governing permissions and
# limitations under the License.

"""Registry for classification algorithms/classifiers."""
Point for the future: I know there are no classifiers to test for now, but this should have unit tests.
Yep. I will add a TODO comment.
@prasanna08 LGTM!
@AllanYangZhou you might want to review this?
Hi Prasanna,
I've read through it, but didn't have any comments to make; it looks good to me!
This PR fixes #7, which is a combination of milestone 1.3 and milestone 2.1 of my GSoC project. This PR implements the following: algorithm_registry, which is a mapping from algorithm_id <=> classifier class instance.