Fix #7: implement main worker process, algorithm_registry and logging #15

Merged
merged 6 commits into from Jun 23, 2017

Conversation

prasanna08
Contributor

This PR fixes #7, which is a combination of milestone 1.3 and milestone 2.1 of my GSoC project.

This PR implements the following:

  • The main worker process, which is implemented using polling, training, and storing functions.
  • algorithm_registry, which is a mapping from algorithm_id <=> classifier class instance.
  • Use of the logging module for logging important events.
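
For illustration only, here is a minimal sketch of what a registry mapping algorithm_id to a classifier could look like. It is not the code from this PR: `BaseClassifier`, `DummyClassifier`, `Registry` and `get_classifier_by_algorithm_id` are hypothetical names, and only the abstract `train(self, training_data)` method is taken from the diff under review.

```python
import abc


class BaseClassifier(abc.ABC):
    """Hypothetical base class; only train() appears in the reviewed diff."""

    @abc.abstractmethod
    def train(self, training_data):
        """Trains the classifier on training_data (a list of dicts)."""


class DummyClassifier(BaseClassifier):
    """Hypothetical classifier that exists only to exercise the registry."""

    def __init__(self):
        self.num_examples = 0

    def train(self, training_data):
        # A real classifier would fit a model here; this one just counts.
        self.num_examples = len(training_data)


class Registry:
    """Maps algorithm_id strings to classifier classes (assumed layout)."""

    _classifier_classes = {'DummyClassifier': DummyClassifier}

    @classmethod
    def get_classifier_by_algorithm_id(cls, algorithm_id):
        """Returns a new classifier instance for the given algorithm_id."""
        if algorithm_id not in cls._classifier_classes:
            raise KeyError('Unknown algorithm_id: %s' % algorithm_id)
        return cls._classifier_classes[algorithm_id]()
```

A worker could then look up the classifier named in a job request and call `train()` on it without knowing which concrete class it received.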

@prasanna08 prasanna08 requested review from anmolshkl and AllanYangZhou and removed request for anmolshkl and AllanYangZhou June 16, 2017 03:24

"""Base class for classification algorithms"""

import abc
Contributor

@anmolshkl anmolshkl Jun 16, 2017

@seanlip can we use this module now? I remember that we decided not to use this in Oppia because it would deviate from the existing approach.

@seanlip
Member

seanlip commented Jun 16, 2017 via email

Contributor

@anmolshkl anmolshkl left a comment

@prasanna08 I have taken a first pass. PTAL at the comments. Thanks!


Below are some concepts used in this class.
training_data: list(dict). The training data that is used for training
the classifier. This field is populated lazily when the job request
Contributor

@prasanna08 I don't think "This field is populated lazily when the job request ..." is relevant here. It is only true in the case of training jobs.

Contributor Author

Yup. Actually, I didn't verify the docstrings in much detail before opening the PR. I copied the relevant code and took a brief look at the docstrings for consistency. I guess I'll have to go through them once again.


    @abc.abstractmethod
    def train(self, training_data):
        """Loads examples for training.
Contributor

I don't think that is the right description for this method.

        """Loads examples for training.

        Args:
            training_data: list(dict). The training data that is used for
Contributor

@anmolshkl anmolshkl Jun 16, 2017

This description is correct on the Oppia side, but on the VM side this field will always be populated. In fact, the VM doesn't need to know anything about lazy population.


# pylint: disable=too-many-branches
def _validate_job_data(job_data):
    if not isinstance(job_data):
Contributor

instance of what?

Contributor Author

Oh, my bad.
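
The point of the comment above is that `isinstance` takes two arguments: the object and the type (or tuple of types) to check against. A minimal sketch of such a check follows. It assumes that job_data is a dict carrying an `algorithm_id` string and a `training_data` list of dicts; that shape is inferred from names used elsewhere in this review and is not confirmed by the diff. `validate_job_data` is a hypothetical stand-in, not the PR's `_validate_job_data`.

```python
def validate_job_data(job_data):
    """Raises ValueError if job_data does not have the assumed shape."""
    # isinstance needs the expected type as its second argument.
    if not isinstance(job_data, dict):
        raise ValueError(
            'Expected job_data to be a dict, received %s'
            % type(job_data).__name__)

    # Assumed keys; the real job request may carry different fields.
    for key in ('algorithm_id', 'training_data'):
        if key not in job_data:
            raise ValueError('job_data is missing the key %r' % key)

    if not isinstance(job_data['algorithm_id'], str):
        raise ValueError('Expected algorithm_id to be a str')

    training_data = job_data['training_data']
    if not isinstance(training_data, list) or not all(
            isinstance(item, dict) for item in training_data):
        raise ValueError('Expected training_data to be a list of dicts')
```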

try:
    job_data = job_services.get_next_job()
    if job_data is None:
        logging.info('No pending job requests.')
Contributor

You might want to add additional info like time, vm_id, etc. I guess this can be done by configuring the logger to append these details.

Contributor

Oh sorry, I saw the log config after writing this comment :D

Contributor Author

Actually, I have kept the logging format the same as GAE's log format, which includes all the necessary details, I guess.
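
As a rough illustration of configuring the logger so that every message carries the level, time and source location, here is a sketch using only the standard `logging` module. The format string is an assumption made for this example and is not necessarily the one used in this PR or by GAE; `build_logger` and the logger name are hypothetical.

```python
import io
import logging

# Assumed format: level, timestamp, source file and line, then the message.
LOG_FORMAT = '%(levelname)s %(asctime)s %(filename)s:%(lineno)d] %(message)s'


def build_logger(name, stream):
    """Returns a logger that writes formatted records to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Keep records out of the root logger so output is not duplicated.
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(handler)
    return logger


stream = io.StringIO()
logger = build_logger('vm_worker_sketch', stream)
logger.info('No pending job requests.')
```

With a formatter attached to the handler, call sites such as `logging.info('No pending job requests.')` do not need to add the time or other details themselves.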

vmconf.py Outdated
FIXED_TIME_WAITING = 'fixed_time_wait'

# Seconds to wait in case of fixed time waiting approach.
FIXED_TIME_WAITING_SECS = 60
Contributor

a better name perhaps? (something like FIXED_TIME_WAITING_PERIOD?)

Contributor Author

Sounds good.

Actually, the idea is to use an exponential backoff algorithm for waiting in PROD and fixed-time waiting in DEV. On local machines there will be at most a few jobs, which can be processed quickly, and we don't want the VM to sleep for a long time when there are no pending jobs; that's why we use a fixed wait in DEV. That's not the case in PROD. There will be many jobs, and we also have to be wary of the resources the VM is using, so we can use exponential backoff there; fixed-time waiting would waste resources. However, exponential backoff is still a "future idea" that will be implemented later on.
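
A minimal sketch of how the two waiting strategies described above could be selected. Exponential backoff is not part of this PR, so everything about it here is an assumption: the `EXPONENTIAL_BACKOFF` constant, the `get_waiting_period` helper, and the `base_secs` and `max_secs` parameters are hypothetical. `FIXED_TIME_WAITING` comes from vmconf.py, and `FIXED_TIME_WAITING_PERIOD` is the name suggested in the review.

```python
FIXED_TIME_WAITING = 'fixed_time_wait'
EXPONENTIAL_BACKOFF = 'exponential_backoff'  # Hypothetical, not in the PR.

# Seconds to wait in case of the fixed time waiting approach.
FIXED_TIME_WAITING_PERIOD = 60


def get_waiting_period(
        method, consecutive_empty_polls, base_secs=1, max_secs=3600):
    """Returns the seconds to sleep before polling for the next job.

    consecutive_empty_polls counts the polls in a row that found no job.
    """
    if method == FIXED_TIME_WAITING:
        # Always the same delay, however long the queue has been empty.
        return FIXED_TIME_WAITING_PERIOD
    if method == EXPONENTIAL_BACKOFF:
        # Double the delay after every empty poll, capped at max_secs.
        return min(max_secs, base_secs * 2 ** consecutive_empty_polls)
    raise ValueError('Unknown waiting method: %s' % method)
```

The caller would reset `consecutive_empty_polls` to zero as soon as a job is found, so the delay only grows while the queue stays empty.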

Contributor

@anmolshkl anmolshkl left a comment

@prasanna08 done! Sorry for the delay, I have taken another pass.

"""This module contains functions used for polling, training and saving jobs."""

from core.services import remote_access_services
from core.classifiers import algorithm_registry
Contributor

NIT: import order


    Args:
        algorithm_id: str. ID of classifier algorithm.
        training_data: dict. A dictionary containing training data.
Contributor

wouldn't training data be a list of dictionaries?

Contributor Author

Yup. My bad.

# See the License for the specific language governing permissions and
# limitations under the License.

"""This module contains functions used for polling, training and saving jobs."""
Contributor

Don't we need a job_services_test.py?

Contributor Author

I don't think so. They are just using the remote_access_services functions, so as long as those are working, these functions should work fine, too.

(I wasn't going to add this layer initially, but later on I added it because higher modularity is always good for future maintenance.)

# See the License for the specific language governing permissions and
# limitations under the License.

"""Registry for classification algorithms/classifiers."""
Contributor

Point for the future: I know there are no classifiers to test for now, but this should have unit tests.

Contributor Author

Yep. I will add a TODO comment.

Contributor

@anmolshkl anmolshkl left a comment

@prasanna08 LGTM!

@prasanna08
Contributor Author

@AllanYangZhou you might want to review this?

@AllanYangZhou AllanYangZhou left a comment

Hi Prasanna,

I've read through it, but didn't have any comments to make--it looks good to me!

@prasanna08 prasanna08 merged commit 543df4e into oppia:develop Jun 23, 2017
@prasanna08 prasanna08 deleted the main-process branch June 23, 2017 05:15
Development

Successfully merging this pull request may close these issues.

Implement main worker process.
4 participants