# Imports
First we will import all the newly developed modules from the hupml library:

In [1]:
# Hupml imports
from hupml import LoadConfig
from hupml import MlDataFrame
from hupml import PipelineBase

# Some imports for this demo
import pandas as pd
import json
import os

One module has been developed that we won't be using today: a database connection class. If you ever work for a client with a database, you can use it to kickstart your project. It includes several methods to read from the database into a Pandas DataFrame and to write a DataFrame back to the database.

In [2]:
# from hupml.database_connection import DbConnection 

We also need a couple of other variables and helper functions specifically for this demo:

In [3]:
root_dir = (os.path.abspath(os.path.dirname('../..')))

def print_dict_pretty(d):
    print(json.dumps(d, indent=4))

# Loading configuration files
All configuration files in hupml and ai-template are `.yaml` files. To load these, there is a class in hupml called LoadConfig. This class contains two methods:
1. `load_yaml_as_dict`
2. `load_logger_from_config`

The first method reads the `.yaml` file and converts it to a Python dictionary. This makes it easy to add configuration settings for anything you want to configure. You can, for instance, keep your model parameters in a `.yaml` file instead of hard-coding them. The second method loads the `.yaml` file and configures a logger. More about that in the next section, but first an example of the first method:

In [4]:
# Path to config file
config_path_logging = f'{root_dir}/configs/logging_settings.yaml'

# Load the config from the .yaml file to a Python dictionary
logging_config_dict = LoadConfig.load_yaml_as_dict(config_path_logging)

# Print the dictionary in a readable way
print_dict_pretty(logging_config_dict)

{
    "version": 1,
    "root": {
        "level": "NOTSET",
        "handlers": [
            "console_handler"
        ]
    },
    "disable_existing_loggers": false,
    "loggers": {
        "dev": {
            "level": "DEBUG",
            "handlers": [
                "file_handler"
            ],
            "propagate": true
        },
        "prod": {
            "level": "INFO",
            "handlers": [
                "file_handler"
            ],
            "propagate": false
        }
    },
    "handlers": {
        "console_handler": {
            "class": "logging.StreamHandler",
            "level": "NOTSET",
            "formatter": "standard",
            "stream": "ext://sys.stdout"
        },
        "file_handler": {
            "class": "logging.handlers.RotatingFileHandler",
            "level": "NOTSET",
            "formatter": "standard",
            "filename": "logs/standard.log",
            "maxBytes": 10485760,
            "backupCount": 20,
         

As you can see, the previous code block loaded the configuration settings for the loggers in dictionary format. You can easily access the values in this dictionary by their keys. For instance, `logging_config_dict['version']` gives `1` and `logging_config_dict['loggers']['dev']['level']` gives `DEBUG`.
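To make the key-based access concrete, here is a minimal sketch on a trimmed-down stand-in for the dictionary above. The name `example_config` and the logger name `test` are made up for this illustration; they are not part of hupml or of the real configuration file.

```python
# A trimmed-down stand-in for the dictionary printed above (not the full config).
example_config = {
    "version": 1,
    "loggers": {
        "dev": {"level": "DEBUG", "handlers": ["file_handler"], "propagate": True},
        "prod": {"level": "INFO", "handlers": ["file_handler"], "propagate": False},
    },
}

# Direct indexing raises a KeyError when a key is missing.
dev_level = example_config["loggers"]["dev"]["level"]

# .get() returns a fallback instead, which is handy for optional settings.
# "test" is a hypothetical logger name that is not in the config.
test_level = example_config["loggers"].get("test", {}).get("level", "WARNING")

# Collect the level of every configured logger.
levels = {name: settings["level"] for name, settings in example_config["loggers"].items()}

print(dev_level)
print(test_level)
print(levels)
```

Direct indexing is a good default for required settings, because a missing key fails loudly; `.get()` is better suited to settings that are allowed to be absent.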

# Loggers: Stop using the print function!
The `load_logger_from_config` method automatically converts the `.yaml` configuration file into a ready-to-use logger, so you don't have to do this conversion yourself every time. An example follows in the "Using the default loggers" section.

### Why you need loggers
"Why do I need a logger to print things on my screen?", I hear you ask. "Python has the `print` function, right?" Well, using a logger has a lot of advantages. I want to highlight three of them:
1. You can configure it to show a timestamp.
2. You automatically save what you ran in log files. If you log the settings and output of your models correctly, you can retrace exactly what your model input and output were at a certain point in time. You can essentially see it as a very basic, automatically generated lab journal.
3. There are five standard logging levels (DEBUG, INFO, WARNING, ERROR, CRITICAL) to divide your messages into. It then becomes clearer what _kind_ of message you're logging. Different loggers can output different levels of messages, which is very useful across environments. For instance, while developing you usually want to output messages of level DEBUG and higher. However, when you hand over a model or software product to the client (running in production), you only want to show messages of level INFO and higher. You can even define your own logging levels in addition to the standard ones, if you want.
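The points above can be tried out with nothing but Python's standard `logging` module. The following is a minimal sketch, not hupml code: the logger name `levels_demo` and the messages are made up, and the output goes to an in-memory buffer only to keep the example self-contained.

```python
import io
import logging

# Write to an in-memory buffer so the example is self-contained;
# a real setup would typically write to sys.stdout or a log file.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(
    logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
)

# "levels_demo" is a hypothetical logger name used only for this sketch.
demo_logger = logging.getLogger("levels_demo")
demo_logger.setLevel(logging.INFO)  # drop everything below INFO
demo_logger.propagate = False       # keep the root logger out of this demo
demo_logger.addHandler(handler)

demo_logger.debug("hidden: below the INFO threshold")
demo_logger.info("model training started")
demo_logger.warning("learning rate looks high")
demo_logger.error("training failed")

# Each surviving line starts with a timestamp added by the formatter.
logged_lines = buffer.getvalue().splitlines()
print("\n".join(logged_lines))
```

Changing `logging.INFO` to `logging.DEBUG` in `setLevel` makes the first message appear as well, which is essentially the difference between a development and a production logger.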

### Using the default loggers
You don't need to understand what is in the logger configuration settings to be able to use them. By default, there are two loggers available in the settings: a `dev` logger and a `prod` logger. The `dev` logger outputs messages of level DEBUG and higher; the `prod` logger outputs messages of level INFO and higher. Both loggers save their output to `logs/standard.log`. Moreover, the `prod` logger doesn't output messages on screen at all: it writes them _only to the log file_.

Lastly, in the code cell below we have to pass the `overwrite_file_name_path` argument. As a little challenge, can you figure out why we need to do this? Hint: look at the `logging_config_dict` printed earlier.

In [5]:
logger_dev = LoadConfig.load_logger_from_config(path=config_path_logging, logger_name='dev', 
                                                overwrite_file_name_path=f'{root_dir}/logs/standard.log')
logger_dev.debug('debug')
logger_dev.info('info')
logger_dev.warning('warning')
logger_dev.error('error')

logger_prod = LoadConfig.load_logger_from_config(path=config_path_logging, logger_name='prod', 
                                                 overwrite_file_name_path=f'{root_dir}/logs/standard.log')
logger_prod.debug('debug')
logger_prod.info('info')
logger_prod.warning('warning')
logger_prod.error('error')

2020-01-22 12:11:40,585 - dev - DEBUG - debug
2020-01-22 12:11:40,585 - dev - INFO - info
2020-01-22 12:11:40,589 - dev - ERROR - error


After you have run the previous code block, you can check out the logs folder. Open the `standard.log` file to read back the logger output.
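The difference in on-screen output between the two loggers comes from the `propagate` setting in the configuration. The sketch below reproduces that behaviour with the standard library's `logging.config.dictConfig` on a reduced configuration. Whether hupml uses `dictConfig` internally is an assumption. The logger names `sketch_dev` and `sketch_prod`, the format string of the `standard` formatter, and the temporary log file are made up for this illustration, and a plain `FileHandler` stands in for the rotating one to keep it short.

```python
import logging
import logging.config
import os
import tempfile

# Temporary log file so the sketch does not touch a real logs folder.
log_path = os.path.join(tempfile.mkdtemp(), "standard.log")

# Reduced, hypothetical version of the configuration shown earlier.
sketch_config = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "standard": {"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"}
    },
    "handlers": {
        "console_handler": {
            "class": "logging.StreamHandler",
            "formatter": "standard",
            "stream": "ext://sys.stdout",
        },
        "file_handler": {
            "class": "logging.FileHandler",
            "formatter": "standard",
            "filename": log_path,
        },
    },
    "root": {"level": "NOTSET", "handlers": ["console_handler"]},
    "loggers": {
        "sketch_dev": {"level": "DEBUG", "handlers": ["file_handler"], "propagate": True},
        "sketch_prod": {"level": "INFO", "handlers": ["file_handler"], "propagate": False},
    },
}
logging.config.dictConfig(sketch_config)

dev = logging.getLogger("sketch_dev")
prod = logging.getLogger("sketch_prod")

dev.debug("dev debug")    # file and console (propagates to the root logger)
prod.debug("prod debug")  # dropped: below the INFO threshold
prod.info("prod info")    # file only (propagate is False)

# Read the log file back to see what was saved.
with open(log_path) as log_file:
    file_lines = log_file.read().splitlines()
```

With `propagate` set to true, a message is also passed on to the root logger, whose console handler prints it; with `propagate` set to false, only the logger's own file handler sees it.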

For those who _really, really_ want to stick with the `print` function, you can actually override it and _still_ use loggers (not recommended):

In [6]:
def print(message):
    logger_dev.info(message)

print('Some message. Can you see a timestamp?')

# Reset for the rest of this document
# (in Python 3 the original print lives in the builtins module)
import builtins

def print(message):
    builtins.print(message)

2020-01-22 12:11:41,203 - dev - INFO - Some message. Can you see a timestamp?
