# Foundational model and training loop

This is a high-level roadmap:
1. Data loading of the <strong>.mhd</strong> and <strong>.raw</strong> files.
2. Segmentation (ch13)
3. Grouping (ch14) 
4. Nodule classification (0/1)
5. Nodule analysis and diagnosis (Malignant/Benign)


As a reminder, we will classify candidates as nodules or non-nodules (we’ll build another classifier to attempt to tell malignant nodules from benign ones in chapter 14). That means we’re going to assign a single, specific label to each sample that we present to the model. In this case, those labels are “nodule” and “non-nodule,” since each sample represents a single candidate.

In [1]:
import torch

In [2]:
import datetime

import training
from util.util import importstr
from util.logconf import logging

log = logging.getLogger('nb') #<1>

def run(app, *argv):
    argv = list(argv)
    argv.insert(0, '--num-workers=4') #<2>
    log.info("Running: {}({!r}).main()".format(app, argv))
    app_cls = importstr(*app.rsplit('.', 1)) #<3>
    app_cls(argv).main()
    log.info("Finished: {}({!r}).main()".format(app, argv))

run('training.LunaTrainingApp', '--epochs=1')

2022-10-07 00:16:59,395 INFO     pid:15534 nb:012:run Running: training.LunaTrainingApp(['--num-workers=4', '--epochs=1']).main()
2022-10-07 00:16:59,403 INFO     pid:15534 training:072:main Starting LunaTrainingApp, Namespace(num_workers=4, epochs=1)
2022-10-07 00:16:59,404 INFO     pid:15534 nb:015:run Finished: training.LunaTrainingApp(['--num-workers=4', '--epochs=1']).main()


cpu


1. Logging is the process of writing information about events into log files. Log files record events that happened in an operating system, in software, or in communication. (https://docs.python.org/3/howto/logging.html)
2. We assume you have a four-core, eight-thread CPU. Change the 4 if needed.
3. This is a slightly cleaner call to \_\_import\_\_.
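
The helper behind note 3 is not shown in this notebook. As a minimal sketch of what such a helper might look like, assuming it resolves a dotted module path and an attribute name with the standard library's `importlib` (the name `import_by_name` is hypothetical, and the real `util.util.importstr` may differ):

```python
import importlib


def import_by_name(module_name, attr_name=None):
    """Import a module by dotted path; optionally return one attribute of it."""
    module = importlib.import_module(module_name)
    if attr_name is None:
        return module
    return getattr(module, attr_name)


# Split 'package.ClassName' into a module path and a class name,
# mirroring app.rsplit('.', 1) in run().
module_name, class_name = 'collections.OrderedDict'.rsplit('.', 1)
ordered_dict_cls = import_by_name(module_name, class_name)
print(ordered_dict_cls)
```

Resolving the class from a string is what lets `run` accept `'training.LunaTrainingApp'` as plain text, so the same notebook cell can launch any application class without importing it up front.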

One way to take advantage of being able to invoke our training by either function call or OS-level process is to wrap the function invocations into a Jupyter Notebook so the code can easily be called from either the native CLI or the browser.

### logging
(https://docs.python.org/3/howto/logging.html)<br>


The logging module in Python is ready to use out of the box and scales from small scripts to large applications. By default, there are five standard levels indicating the severity of events, listed here in increasing order of severity. Each has a corresponding method that logs events at that level.
- DEBUG
- INFO
- WARNING
- ERROR
- CRITICAL


With the default configuration, each output line shows the severity level, then the logger name (root, the name the logging module gives to its default logger), then the message, separated by colons (:). This default format can be reconfigured to include details such as a timestamp or line number.
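
To make the level, name, and message layout concrete, here is a small sketch that logs one message per level and captures the text in memory. The logger name `demo` and the helper `demo_logging` are illustrative only. The format string reproduces the module's default layout.

```python
import io
import logging


def demo_logging():
    """Log one message per standard level and return the captured text."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    # Same layout as the logging module's default: LEVEL:name:message
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

    logger = logging.getLogger("demo")
    logger.setLevel(logging.DEBUG)   # let every level through
    logger.propagate = False         # keep the root logger out of it
    logger.handlers = [handler]

    logger.debug("debug message")
    logger.info("info message")
    logger.warning("warning message")
    logger.error("error message")
    logger.critical("critical message")
    return buffer.getvalue()


captured = demo_logging()
print(captured)
```

Adding fields such as `%(asctime)s` or `%(lineno)d` to the format string is how a timestamp or line number ends up in each line, as in the training output earlier in this notebook.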

## "training.py" file

In [None]:

import argparse
import datetime
import sys

import torch
from torch.optim import SGD

from model import LunaModel  # assumption: LunaModel is defined in model.py
from util.logconf import logging

log = logging.getLogger(__name__)


class LunaTrainingApp:
    def __init__(self, sys_argv=None):
        if sys_argv is None: #<a>
            sys_argv = sys.argv[1:]

        parser = argparse.ArgumentParser()
        parser.add_argument('--num-workers',
            help='Number of worker processes for background data loading',
            default=8,
            type=int,
        )
        parser.add_argument('--epochs',
            help='Number of epochs to train for',
            default=1,
            type=int,
        )

        self.cli_args = parser.parse_args(sys_argv)
        self.time_str = datetime.datetime.now().strftime('%Y-%m-%d_%H.%M.%S') #<b>

        # Use the MPS backend only if PyTorch was built with it and the
        # current machine supports it; otherwise fall back to the CPU.
        self.use_mps1 = torch.backends.mps.is_available()
        self.use_mps2 = torch.backends.mps.is_built()
        self.device = torch.device("mps" if self.use_mps1 and self.use_mps2 else "cpu")

        self.model = self.initModel()
        self.optimizer = self.initOptimizer()

    def initModel(self):
        print(self.device)
        model = LunaModel()
        # Move the model's parameters to the selected device.
        model = model.to(self.device)
        return model

    def initOptimizer(self):
        return SGD(self.model.parameters(), lr=0.001, momentum=0.99)

    def main(self):
        log.info("Starting {}, {}".format(type(self).__name__, self.cli_args))


if __name__ == '__main__': #<c>
    LunaTrainingApp().main()

a. If the caller doesn't provide arguments, we get them from the command line. <br>
b. The timestamp is used to help identify training runs. It comes from the <strong>now</strong> method of the datetime library. <br>
c. This instantiates the application object and invokes the <strong>main</strong> method.


The application class <strong>LunaTrainingApp</strong> needs two methods: <strong>\_\_init\_\_</strong> and <strong>main</strong>. We parse arguments in <strong>\_\_init\_\_</strong>, which lets us configure the application separately from invoking it.
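
The split between configuring and invoking can be sketched without any of the training machinery. `DemoApp` below is a hypothetical stand-in for `LunaTrainingApp`: it keeps the same argument handling but only builds a message in `main`.

```python
import argparse
import sys


class DemoApp:
    """Hypothetical stand-in: parse arguments in __init__, act in main."""

    def __init__(self, sys_argv=None):
        if sys_argv is None:
            # No explicit arguments: fall back to the command line.
            sys_argv = sys.argv[1:]

        parser = argparse.ArgumentParser()
        parser.add_argument('--num-workers', default=8, type=int)
        parser.add_argument('--epochs', default=1, type=int)
        self.cli_args = parser.parse_args(sys_argv)

    def main(self):
        return "Starting {}, {}".format(type(self).__name__, self.cli_args)


# Configure first ...
app = DemoApp(['--num-workers=4', '--epochs=2'])
# ... and invoke later, possibly from another notebook cell.
message = app.main()
print(message)
```

Because the constructor accepts an explicit argument list, a notebook can pass options as a Python list while a shell invocation still reads them from `sys.argv`.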

## Workflow 
Before we can begin iterating over each batch in our epoch, some initialization work needs to happen, which includes instantiating the model.  

&emsp;<strong>i.</strong> Initialize our model and optimizer. The model is initialized with random weights. <br>
&emsp;<strong>ii.</strong> Initialize our <strong>Dataset</strong> and <strong>DataLoader</strong> instances. <br>
&emsp;<strong>iii.</strong> Run the training loop: the batch tuple is loaded, the batch is classified, the loss is calculated, the metrics are recorded, and the weights are updated. <br>
&emsp;<strong>iv.</strong> Run the validation loop, which mirrors the training loop but does not update the weights: the validation set is loaded as batch tuples, the batches are classified, the loss is calculated, and the metrics are recorded. <br>
&emsp;<strong>v.</strong> Repeat steps iii. and iv. for a predefined number of epochs.<br>
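
The control flow of steps i. to v. can be sketched in plain Python. This is not the LunaModel training code: it assumes a tiny logistic-regression model on made-up 1-D data so that it runs without PyTorch, and every name in it is hypothetical.

```python
import math
import random
from statistics import mean


def make_dataset(n, rng):
    """Hypothetical data: label is 1.0 when x > 0, else 0.0."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 1.0 if x > 0 else 0.0) for x in xs]


def batches(samples, batch_size):
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def predict(weights, x):
    w, b = weights
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def batch_loss(weights, batch):
    """Mean binary cross-entropy over one batch."""
    eps = 1e-12
    total = 0.0
    for x, y in batch:
        p = predict(weights, x)
        total -= y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps)
    return total / len(batch)


def train(epochs=20, batch_size=8, lr=1.0, seed=0):
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1), 0.0]      # i. random initial weight
    train_set = make_dataset(64, rng)            # ii. datasets
    val_set = make_dataset(32, rng)
    history = []
    for _ in range(epochs):                      # v. repeat per epoch
        rng.shuffle(train_set)
        train_losses = []
        for batch in batches(train_set, batch_size):        # iii. training
            train_losses.append(batch_loss(weights, batch))  # loss + metric
            # Gradient of the mean cross-entropy for logistic regression.
            grad_w = mean((predict(weights, x) - y) * x for x, y in batch)
            grad_b = mean(predict(weights, x) - y for x, y in batch)
            weights[0] -= lr * grad_w                        # weight update
            weights[1] -= lr * grad_b
        # iv. validation: same loss computation, no weight update
        val_losses = [batch_loss(weights, b) for b in batches(val_set, batch_size)]
        history.append((mean(train_losses), mean(val_losses)))
    return weights, history


weights, history = train()
print(history[0], history[-1])
```

The only structural difference between the two inner loops is the weight update, which is why the validation loss can serve as a check on data the model never trained on.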


<strong>LunaDataset</strong> will define the randomized set of samples that make up our training epoch, and our <strong>DataLoader</strong> instance will do the work of loading the data out of our dataset and providing it to our application.
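
As a rough illustration of that division of labor, the sketch below pairs a toy dataset with a `DataLoader`. `ToyCandidateDataset` is a hypothetical stand-in that returns random tensors and 0/1 labels. It is not the real `LunaDataset`, and the tensor shape is chosen arbitrarily.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyCandidateDataset(Dataset):
    """Hypothetical stand-in for LunaDataset: random inputs, 0/1 labels."""

    def __init__(self, n_samples=10, seed=0):
        gen = torch.Generator().manual_seed(seed)
        # One channel, 4x4x4 voxels per sample (arbitrary toy shape).
        self.inputs = torch.randn(n_samples, 1, 4, 4, 4, generator=gen)
        self.labels = torch.randint(0, 2, (n_samples,), generator=gen)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.inputs[idx], self.labels[idx]


# The Dataset answers "what is sample idx?"; the DataLoader handles
# shuffling, batching, and (with num_workers > 0) background loading.
loader = DataLoader(ToyCandidateDataset(), batch_size=4, shuffle=True, num_workers=0)
batch_shapes = [(inputs.shape, labels.shape) for inputs, labels in loader]
print(batch_shapes)
```

With ten samples and a batch size of four, the loader yields two full batches and one partial batch. The `--num-workers` argument of the training application would map to the `num_workers` parameter shown here.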

In [17]:
use_mps1 = torch.backends.mps.is_available()
use_mps2 = False  # torch.backends.mps.is_built()
device = torch.device("mps" if use_mps1 and use_mps2 else "cpu")

In [18]:
device

device(type='cpu')

In [19]:
torch.backends.mps.device_count()

AttributeError: module 'torch.backends.mps' has no attribute 'device_count'