# SOME GOOD CODING PRACTICES FOR DATA SCIENTISTS

# 1. Automate repetitive tasks through functions

Let’s look at a common task in a machine learning pipeline: tuning a model’s hyperparameters. Suppose you’re working on an image classification project and would like to try a Support Vector Machine classifier (SVC()) and an artificial neural network in the form of a multi-layer perceptron classifier (MLPClassifier()). To tune each model’s hyperparameters, we’ll use GridSearchCV() from the scikit-learn machine learning framework.

In [None]:
# imports used by both blocks; X_train and y_train are assumed to be defined already
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# svc model
ml_model = SVC()
hyper_parameter_candidates = {
    "C": [1e-4, 1e-2, 1, 1e2, 1e4],
    "gamma": [1e-3, 1e-2, 1, 1e2, 1e3],
    "class_weight": [None, "balanced"],
    "kernel": ["linear", "poly", "rbf", "sigmoid"],
}
scoring_parameter = "accuracy"
cv_fold = KFold(n_splits=5, shuffle=True, random_state=1)
classifier_model = GridSearchCV(estimator=ml_model,
    param_grid=hyper_parameter_candidates,
    scoring=scoring_parameter, cv=cv_fold)
classifier_model.fit(X_train, y_train)


# ann model
ml_model = MLPClassifier()
hyper_parameter_candidates = {
    # one hidden layer per candidate; (20,) is a one-element tuple, while (20) is just the int 20
    "hidden_layer_sizes": [(20,), (50,), (100,)],
    "max_iter": [500, 800, 1000],
    "activation": ["identity", "logistic", "tanh", "relu"],
    "solver": ["lbfgs", "sgd", "adam"],
}
scoring_parameter = "accuracy"
cv_fold = KFold(n_splits=5, shuffle=True, random_state=1)
classifier_model = GridSearchCV(estimator=ml_model,
    param_grid=hyper_parameter_candidates,
    scoring=scoring_parameter, cv=cv_fold)
classifier_model.fit(X_train, y_train)

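Both blocks repeat the same GridSearchCV() setup and fit; only the estimator and its parameter grid change. Based on the code above, I could write a simple function as: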

In [1]:
from sklearn.model_selection import GridSearchCV

def tune_hyperparameter_model(ml_model, X_train, y_train, hyper_parameter_candidates, scoring_parameter, cv_fold):
    """Run a grid search over the candidate hyperparameters and return the fitted search object."""
    classifier_model = GridSearchCV(estimator=ml_model,
        param_grid=hyper_parameter_candidates,
        scoring=scoring_parameter, cv=cv_fold)
    classifier_model.fit(X_train, y_train)
    return classifier_model
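
To see what the refactoring buys, here is a minimal sketch of calling the function once per model. The synthetic data from make_classification, the tiny parameter grids, and the `candidates` and `results` names are assumptions chosen so the example runs quickly; they are not the article's image data or its full grids.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC


def tune_hyperparameter_model(ml_model, X_train, y_train,
                              hyper_parameter_candidates,
                              scoring_parameter, cv_fold):
    """Same function as above, repeated so this example runs on its own."""
    classifier_model = GridSearchCV(estimator=ml_model,
                                    param_grid=hyper_parameter_candidates,
                                    scoring=scoring_parameter, cv=cv_fold)
    classifier_model.fit(X_train, y_train)
    return classifier_model


# Synthetic stand-in for the image features (assumption: not the article's data).
X_train, y_train = make_classification(n_samples=120, n_features=10, random_state=1)
cv_fold = KFold(n_splits=3, shuffle=True, random_state=1)

# Deliberately tiny grids (assumption) so the example finishes quickly.
candidates = [
    (SVC(), {"C": [1, 100], "kernel": ["linear", "rbf"]}),
    (MLPClassifier(max_iter=500, random_state=1),
     {"hidden_layer_sizes": [(20,), (50,)]}),
]

results = {}
for ml_model, grid in candidates:
    fitted = tune_hyperparameter_model(ml_model, X_train, y_train,
                                       grid, "accuracy", cv_fold)
    model_name = type(ml_model).__name__
    results[model_name] = fitted
    print(model_name, fitted.best_params_, round(fitted.best_score_, 3))
```

Adding a third model now means adding one entry to the list rather than copying the whole block again.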

# 2. Implement error handling

As written, any exception raised during the grid search stops the program with a raw traceback. Let’s add some error handling code:

In [2]:
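import sys
from sklearn.model_selection import GridSearchCV

def tune_hyperparameter_model(ml_model, X_train, y_train, hyper_parameter_candidates, scoring_parameter, cv_fold):
    try:
        classifier_model = GridSearchCV(estimator=ml_model,
           param_grid=hyper_parameter_candidates,
           scoring=scoring_parameter, cv=cv_fold)
        classifier_model.fit(X_train, y_train)
    except Exception:
        # sys.exc_info()[0] is the exception class, e.g. <class 'ValueError'>
        exception_message = sys.exc_info()[0]
        print("An error occurred. {}".format(exception_message))
        # return None on failure instead of an unbound or unfitted object
        classifier_model = None
    return classifier_model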
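
The handler above prints only sys.exc_info()[0], which is the exception class, so the message says what kind of error occurred but not why. A minimal sketch of what sys.exc_info() makes available inside an except block follows; the helper name `describe_current_exception` is hypothetical and used only for illustration.

```python
import sys


def describe_current_exception():
    """Return the (type, value) of the exception currently being handled."""
    exc_type, exc_value, exc_tb = sys.exc_info()
    return exc_type, exc_value


try:
    int("not a number")  # raises ValueError
except Exception:
    exc_type, exc_value = describe_current_exception()
    print("type only  :", exc_type)  # type only  : <class 'ValueError'>
    print("with detail:", exc_type.__name__, "-", exc_value)
```

The second line of output includes the exception's own message, which is usually the part needed to diagnose the failure.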

With some planning, I should be able to write a better generic function, print_exception_message(), that prints the full details of an exception and can be reused in all my functions.

In [3]:
import sys
import time
import traceback

def print_exception_message(message_orientation="horizontal"):
    """
    print full exception message
    :param message_orientation: horizontal or vertical
    :return None
    """
    try:
        exc_type, exc_value, exc_tb = sys.exc_info()
        # the last traceback entry is the frame where the exception was raised
        file_name, line_number, procedure_name, line_code = traceback.extract_tb(exc_tb)[-1]
        time_stamp = " [Time Stamp]: " + str(time.strftime(" %Y-%m-%d %I:%M:%S %p"))
        file_name = " [File Name]: " + str(file_name)
        procedure_name = " [Procedure Name]: " + str(procedure_name)
        error_message = " [Error Message]: " + str(exc_value)
        error_type = " [Error Type]: " + str(exc_type)
        line_number = " [Line Number]: " + str(line_number)
        line_code = " [Line Code]: " + str(line_code)
        if message_orientation == "horizontal":
            print("An error occurred:{};{};{};{};{};{};{}".format(time_stamp, file_name, procedure_name,
               error_message, error_type, line_number, line_code))
        elif message_orientation == "vertical":
            print("An error occurred:\n{}\n{}\n{}\n{}\n{}\n{}\n{}".format(time_stamp, file_name,
               procedure_name, error_message, error_type,
               line_number, line_code))
        else:
            # any other orientation value prints nothing
            pass
    except Exception:
        # fallback if building the detailed message itself fails
        exception_message = sys.exc_info()[0]
        print("An error occurred. {}".format(exception_message))
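
print_exception_message() takes the last entry of traceback.extract_tb(exc_tb) and unpacks it into four fields. A minimal sketch of what those fields contain, using a hypothetical `divide` function that fails on purpose:

```python
import sys
import traceback


def divide(a, b):
    return a / b  # raises ZeroDivisionError when b == 0


try:
    divide(1, 0)
except Exception:
    exc_type, exc_value, exc_tb = sys.exc_info()
    frames = traceback.extract_tb(exc_tb)
    # The last entry is the innermost frame, i.e. where the error was raised.
    file_name, line_number, procedure_name, line_code = frames[-1]
    print("procedure:", procedure_name)  # procedure: divide
    print("error    :", exc_type.__name__, "-", exc_value)
    print("line     :", line_number)
    print("frames   :", [frame.name for frame in frames])
```

Because the innermost frame is used, the reported procedure is the function that actually raised the error, which may be deeper in the call stack than the function containing the try block.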

With this function in place, the exception block needs only a single line of code.

In [4]:
from sklearn.model_selection import GridSearchCV

def tune_hyperparameter_model(ml_model, X_train, y_train, hyper_parameter_candidates, scoring_parameter, cv_fold):
    try:
        classifier_model = GridSearchCV(estimator=ml_model,
           param_grid=hyper_parameter_candidates,
           scoring=scoring_parameter, cv=cv_fold)
        classifier_model.fit(X_train, y_train)
    except Exception:
        print_exception_message()
        # return None on failure instead of an unbound or unfitted object
        classifier_model = None
    return classifier_model
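
One possible way to apply the same reporting to many functions without repeating the try/except in each is a decorator. This is not something the original article does; the sketch below is an assumption, the names `report_exceptions` and `safe_ratio` are hypothetical, and the standard library's traceback.print_exc() stands in for print_exception_message() so the example runs on its own.

```python
import functools
import traceback


def report_exceptions(func):
    """Hypothetical decorator: print any exception's traceback and return None."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Stand-in for print_exception_message(); writes to stderr.
            traceback.print_exc()
            return None
    return wrapper


@report_exceptions
def safe_ratio(numerator, denominator):
    return numerator / denominator


print(safe_ratio(6, 3))  # 2.0
print(safe_ratio(1, 0))  # traceback on stderr, then None
```

Callers of a function wrapped this way need to check for a None result, since the exception is reported rather than propagated.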

SOURCE: https://medium.com/@ernest.bonat/refactoring-python-code-for-machine-learning-projects-python-spaghetti-code-everywhere-daaa6c116bd1