# **Advanced Model Training, Tuning, and Evaluation**

## **Advanced Model Training and Tuning**

![](2024-01-01-20-02-10.png)

![](2024-01-01-20-02-41.png)

![](2024-01-01-20-03-33.png)

Speaking of model evaluation, there is a quote often credited to Peter Drucker, who is widely regarded as the founder of modern management: "If you can't measure it, you can't improve it." This certainly applies to machine learning as well. An important step in model development is to evaluate the final model on a holdout dataset, called the test dataset, that your model has never seen before. These final model metrics can be used to compare and contrast competing models. Typically, the higher this final score is, the better the model's ability to generalize.

Hyperparameter tuning is an important part of the model development process. When you start working on a new model, you will most likely begin by manually selecting hyperparameter values based on the algorithm you chose for your use case. For popular algorithms and use cases, you can generally find great guidance on hyperparameter values from the data science and research community. This is a great starting point and helps you build your own intuition over time. Once you have validated your choice of algorithm, code, and dataset for your machine learning use case, you can leverage automatic model tuning to fine-tune your hyperparameters and find the best performing values. Next, I will introduce four popular algorithms for automatic model tuning: grid search, random search, Bayesian optimization, and Hyperband.

The first approach is grid search. Here, to tune your model, you start by defining the available hyperparameter sets, which include both the name of each hyperparameter and the range of values you want to explore for it. The grid search algorithm trains a model for every combination of hyperparameter values and selects the best performing combination. The advantage of grid search is that it explores all possible combinations. This works really well when you have a small number of hyperparameters and a small range of values to explore for each. However, when the number of hyperparameters or the range of values increases, grid search can become very time consuming. It does not scale well to a large number of parameters.

To address this issue, you can use random search. In random search, once again, you start by defining the available hyperparameter sets, consisting of the names of the hyperparameters and the values you want to explore. Here, instead of searching every single combination, the algorithm picks random hyperparameter values to explore within the defined search space. You can also define stopping criteria, such as the time elapsed or the maximum number of training jobs to complete. Once the stopping criteria are met, you select the best performing set of hyperparameters from the models trained so far. An advantage of random search is that it is much faster than grid search. However, because of the randomness involved in the search process, this algorithm might miss the better performing hyperparameters.
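To make the difference between the two strategies concrete, here is a minimal sketch in plain Python. The search space and the `toy_score` function are hypothetical stand-ins for "train a model and return its validation accuracy"; they are not part of the course material.

```python
import itertools
import random

# Hypothetical search space: hyperparameter name -> candidate values.
search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [64, 128, 256],
}


def toy_score(params):
    """Stand-in for training a model and returning validation accuracy."""
    return (
        1.0
        - abs(params["learning_rate"] - 0.01)
        - abs(params["batch_size"] - 128) / 1000
    )


def grid_search(space, score_fn):
    """Evaluate every combination of values and return the best one."""
    names = list(space)
    candidates = [
        dict(zip(names, values))
        for values in itertools.product(*(space[name] for name in names))
    ]
    return max(candidates, key=score_fn), len(candidates)


def random_search(space, score_fn, max_trials, seed=0):
    """Evaluate max_trials randomly drawn combinations (the stop criterion)."""
    rng = random.Random(seed)
    trials = [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(max_trials)
    ]
    return max(trials, key=score_fn), len(trials)


best_grid, grid_trials = grid_search(search_space, toy_score)
best_random, random_trials = random_search(search_space, toy_score, max_trials=4)
print(best_grid, grid_trials)
print(best_random, random_trials)
```

Grid search needs 9 evaluations for this 3 x 3 space, while random search stops after 4 and may or may not land on the best combination.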

![](2024-01-01-21-09-32.png)

![](2024-01-01-21-10-25.png)

Tuning hyperparameters for classification and regression models is very similar to finding the best possible model parameters by minimizing a loss function. You might be asking, why can't we apply the same process to hyperparameters as well? That is the idea behind Bayesian optimization, which is our next algorithm. In Bayesian optimization, hyperparameter tuning is treated as a regression problem. The hyperparameter values are learned by trying to minimize the loss function of a surrogate model. The algorithm starts with random values for the hyperparameters and continuously narrows down the search space by using the results from previous searches. The strength of Bayesian optimization is that it is much more efficient at finding the best possible hyperparameters, because it keeps improving on the results of previous searches. However, this also means that the algorithm requires sequential execution. There is also a possibility that Bayesian optimization gets stuck in a local minimum, which is a well-known problem with techniques like gradient descent for minimizing a loss function.

The final algorithm that I will discuss today is Hyperband, a relatively new approach to hyperparameter tuning. Hyperband is based on the bandit approach. Bandit approaches typically use a combination of exploitation and exploration to find the best possible hyperparameters, and their strength is the dynamic balance between the two. When applied to hyperparameter tuning, this is how the bandit-based Hyperband algorithm works. You start with a large space of random hyperparameter sets and explore a random subset of them for a few iterations. After the first few iterations, you discard the worst performing half of the hyperparameter sets. In the subsequent few iterations, you continue to explore the best performing hyperparameters from the previous iteration. You continue this process until the set time has elapsed or you are left with just one candidate. Hyperband stands out by spending time much more efficiently than the other approaches we discussed, using the combination of exploitation and exploration. On the downside, it might discard good candidates very early on, and these could be candidates that converge slowly.

That is a wrap on the discussion of the four popular hyperparameter tuning algorithms that can help you automate the hyperparameter tuning process.
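The "discard the worst performing half" loop described for Hyperband can be sketched as a single round of successive halving. This is a simplified illustration, not the full Hyperband algorithm, and `toy_partial_score` is a hypothetical stand-in for "train this configuration for `budget` iterations and report its score".

```python
import random


def successive_halving(configs, partial_score, initial_budget=1):
    """Evaluate the surviving configs, keep the better half, double the budget."""
    survivors = list(configs)
    budget = initial_budget
    while len(survivors) > 1:
        ranked = sorted(
            survivors, key=lambda config: partial_score(config, budget), reverse=True
        )
        survivors = ranked[: max(1, len(ranked) // 2)]
        budget *= 2  # survivors earn more training iterations in the next round
    return survivors[0]


def toy_partial_score(config, budget):
    """Hypothetical score: improves with budget and peaks at learning_rate=0.01."""
    return budget / (budget + 1) - abs(config["learning_rate"] - 0.01)


rng = random.Random(0)
# Start from a larger set of random hyperparameter values (log-uniform here).
candidates = [{"learning_rate": 10 ** rng.uniform(-4, -1)} for _ in range(8)]
winner = successive_halving(candidates, toy_partial_score)
print(winner)
```

With eight starting candidates the loop keeps 4, then 2, then 1. In this toy setup the ranking never changes with budget; in real training a slowly converging candidate could be ranked low in an early round and discarded, which is the downside mentioned above.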

![](2024-01-01-21-14-36.png)

![](2024-01-01-21-18-43.png)


![](2024-01-01-21-26-24.png)

![](2024-01-01-21-30-39.png)

### **Tune a BERT-based Text Classifier**

![](2024-01-01-21-42-25.png)

SageMaker's automatic model tuning, also called hyperparameter tuning, finds the best version of the model by running multiple training jobs on your dataset using the hyperparameter ranges that you specify. In addition to the ranges, you also provide the hyperparameter tuning job with an objective metric and a tuning strategy. For example, you can specify the objective as maximizing the validation accuracy. For tuning strategies, SageMaker natively supports random and Bayesian optimization strategies. You can extend this functionality by providing an implementation of another tuning strategy as a Docker container.

For the specific use case of using BERT to classify product reviews, this is the setup that I will use for the hyperparameter tuning job. My objective is to tune the hyperparameters so that the validation accuracy of the model is maximized. To achieve this, I want to tune two hyperparameters, learning rate and batch size, using the random tuning strategy. Behind the scenes, SageMaker runs multiple training jobs and returns the training job with the best validation accuracy.

There are three steps involved in this process: creating the PyTorch estimator, creating a hyperparameter tuner job, and finally analyzing the results from the tuner job. I will now dive a little deeper into each of these steps. First, let's start with creating a PyTorch estimator. The PyTorch estimator holds the fixed set of hyperparameters that you do not want to tune during this process. These hyperparameters are defined as a dictionary, as you can see here. Here, I'm assuming that hyperparameters like the number of epochs, the number of steps per epoch, and so on do not need to be tuned. These are the fixed hyperparameters for my example. Once I have the fixed hyperparameters, I can create a PyTorch estimator and pass the fixed hyperparameters dictionary to it.

The next step is to create a hyperparameter tuning job. In this step, I define the hyperparameters that I want to tune: the learning rate and the batch size. As you can see, I specified the name of each hyperparameter as well as the range of values that I want to explore. Additionally, I have to specify the hyperparameter type, which can be categorical, continuous, or integer. Before jumping into the next step of creating the hyperparameter tuner job, I will take a brief detour to talk about these hyperparameter types.

Let's start with the categorical type. As the name indicates, use this type if your hyperparameter takes specific categorical values. For example, if you want to test your neural network with different types of optimizers, you can use a categorical parameter. More interestingly, you can also use the categorical type with numerical values if you want to test specific values instead of a range. For example, here, for the batch size parameter, I want to test the specific values 128 and 256 instead of a range, so I can treat the batch size as a categorical parameter. Similarly, if you have parameters that take Boolean values like true and false, you can treat those as categorical as well. The next type is integer. Use this if you would rather explore a range of values for your parameter. For example, here, I want to explore all values between 16 and 1024 as batch sizes, so I use the integer type. If the range of values to explore is large, use a logarithmic scale to optimize the tuning process. The final type is continuous. An example of a continuous hyperparameter is the learning rate. Here, you're asking the tuning job to explore the values between the low and high ends of the range on a linear scale.

Now, back to our hyperparameter tuning job. Now that we have defined the tunable hyperparameters with the name, type, and values to explore, pass those to the HyperparameterTuner object, along with the estimator object that you created in the previous step, the objective type, the objective metric name, and the strategy. Once you have configured the tuner object with all these values, run the tuner job by using the fit method.
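The estimator and tuner setup described above might look roughly like the following sketch with the SageMaker Python SDK. The hyperparameter names (`epochs`, `train_steps_per_epoch`, `train_batch_size`), the metric regex, the instance type, and the framework version are assumptions for illustration; they depend on your own training script and account.

```python
# Fixed hyperparameters that are NOT tuned (names are hypothetical and must
# match the arguments your own training script accepts).
FIXED_HYPERPARAMETERS = {"epochs": 3, "train_steps_per_epoch": 50}


def build_tuner(role, train_script="train.py"):
    """Sketch: fixed hyperparameters go on the estimator, tunable ones on the tuner."""
    # Imported lazily so the sketch can be loaded without the SDK installed.
    from sagemaker.pytorch import PyTorch
    from sagemaker.tuner import (
        CategoricalParameter,
        ContinuousParameter,
        HyperparameterTuner,
    )

    estimator = PyTorch(
        entry_point=train_script,
        role=role,
        instance_count=1,
        instance_type="ml.c5.9xlarge",  # assumption: pick what fits your workload
        framework_version="1.6.0",      # assumption
        py_version="py3",
        hyperparameters=FIXED_HYPERPARAMETERS,
    )

    # Tunable hyperparameters: a continuous range and a categorical set of values.
    hyperparameter_ranges = {
        "learning_rate": ContinuousParameter(0.00001, 0.00005, scaling_type="Linear"),
        "train_batch_size": CategoricalParameter([128, 256]),
    }

    return HyperparameterTuner(
        estimator=estimator,
        objective_metric_name="validation:accuracy",
        objective_type="Maximize",
        # The regex must match whatever your training script prints (assumption).
        metric_definitions=[
            {"Name": "validation:accuracy", "Regex": "val_acc: ([0-9.]+)"}
        ],
        hyperparameter_ranges=hyperparameter_ranges,
        strategy="Random",
        max_jobs=2,
        max_parallel_jobs=2,
    )


# Usage (requires AWS credentials and an execution role):
# tuner = build_tuner(role=my_role_arn)       # my_role_arn is hypothetical
# tuner.fit(inputs=my_data_channels)          # my_data_channels is hypothetical
# results = tuner.analytics().dataframe()
```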

![](2024-01-01-21-46-58.png)

![](2024-01-01-21-47-41.png)

![](2024-01-01-22-07-33.png)

![](2024-01-01-22-09-09.png)

![](2024-01-01-22-09-47.png)

![](2024-01-01-22-10-08.png)

Behind the scenes, the fit method kicks off multiple training jobs that explore the hyperparameter ranges I configured in the tuner job. To analyze the results, I can once again use the tuner object and get a DataFrame from the analytics method. Here, you're seeing a subset of the results from the hyperparameter tuning job: the different learning rates and batch sizes that were explored, and the resulting final objective value, which is the validation accuracy.

Next, I will discuss warm starting a hyperparameter tuning job. Warm start reuses prior results from a previously completed hyperparameter tuning job, or a set of completed tuning jobs, to speed up the optimization process and reduce the overall cost. In the example here, you perform a warm start using a single parent, which is a previously completed tuning job. A warm start is particularly useful if you want to change the hyperparameter tuning ranges from the previous job, or if you want to add new hyperparameters to explore. Both situations can use the knowledge from the previously completed job to speed up the process and find the best model quickly.

Warm start supports two types. The first type is identical data and algorithm. With this type, the new hyperparameter tuning job uses the same input data and training algorithm as the parent tuning job. You have a chance to update the hyperparameter tuning ranges and the maximum number of training jobs. The second type is transfer learning. With this type, the new hyperparameter tuning job uses updated training data and can also use a different version of the training algorithm. Perhaps you have collected more training data since your last tuning job and want to find the best possible model for the entire training data. Or you may have come across a new algorithm that you would like to explore.

This is how it looks in code. You start by specifying the warm start config. To the warm start config, you pass the type of warm start that you want to implement. Here, I'm using identical data and algorithm. Then, you specify the name of the parent tuning job. You pass the warm start config to the hyperparameter tuner object, and finally you run the fit method.

Now that you know how to use hyperparameter tuning with SageMaker, I will briefly discuss a few best practices. The first one is to select a small number of hyperparameters. Hyperparameter tuning, as I discussed before, is a time and compute intensive task. The computational complexity is directly proportional to the number of hyperparameters that you tune. SageMaker does allow you to tune up to 20 different hyperparameters at a time. However, choosing a smaller number of hyperparameters to tune will typically yield better results.

Along the same lines, choose a smaller range of values to explore for the hyperparameters. The values that you choose can significantly affect the success of hyperparameter optimization. Intuitively, you might feel like specifying a large range so that you can explore all possible values. But in fact, you get better results by limiting your search to a small range of values.

Next, enable warm start. As discussed before, when you enable warm start, the hyperparameter tuning job uses results from previously completed jobs to speed up the optimization process and save on tuning cost.

Next, enable early stopping. When you enable early stopping on the hyperparameter tuning job, the individual training jobs launched by the tuning job are terminated early if the objective metric is not continuously improving. This early stopping of individual training jobs leads to earlier completion of the hyperparameter tuning job and reduced costs.

The final best practice here is to use a small number of concurrent training jobs. SageMaker does allow you to run multiple jobs concurrently during the hyperparameter tuning process. On one hand, if you use a larger number of concurrent jobs, the tuning process completes faster. But the tuning process finds the best possible results only by building on previously completed training jobs. So choose a smaller number of concurrent jobs when you run hyperparameter tuning jobs.

Finally, when training and tuning at scale, it is important to continuously monitor and use the right compute resources. While you do have the flexibility of using different instance types and sizes, how do you determine the specific instance type and size to use for your workloads? There is really no standard answer for this. It comes down to understanding your workload well and running empirical tests to determine the best compute resources for your tuning and training workloads. SageMaker training jobs emit Amazon CloudWatch metrics for resource utilization of the underlying infrastructure. You can use these metrics to observe your training utilization and improve your successive training runs. Additionally, when you enable SageMaker Debugger on your training jobs, Debugger provides visibility into the training jobs and the infrastructure that is running them. Debugger also monitors and reports on system resources such as CPU, GPU, and memory, providing you with very useful insights into resource utilization and resource bottlenecks. You can use these insights and recommendations from Debugger as guidance to further optimize your training infrastructure.
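As a rough sketch of the warm start and early stopping practices discussed above, the following uses the SageMaker Python SDK. The metric name, the regex, and the job limits are assumptions; the estimator, ranges, and parent job name are supplied by the caller.

```python
def build_warm_start_tuner(estimator, hyperparameter_ranges, parent_tuning_job_name):
    """Sketch: reuse results from a completed tuning job and stop weak jobs early."""
    # Imported lazily so the sketch can be loaded without the SDK installed.
    from sagemaker.tuner import HyperparameterTuner, WarmStartConfig, WarmStartTypes

    # "Identical data and algorithm": same input data and training algorithm as
    # the parent job; ranges and the maximum number of jobs may change.
    warm_start_config = WarmStartConfig(
        warm_start_type=WarmStartTypes.IDENTICAL_DATA_AND_ALGORITHM,
        parents={parent_tuning_job_name},
    )

    return HyperparameterTuner(
        estimator=estimator,
        objective_metric_name="validation:accuracy",
        objective_type="Maximize",
        # The regex must match whatever the training script prints (assumption).
        metric_definitions=[
            {"Name": "validation:accuracy", "Regex": "val_acc: ([0-9.]+)"}
        ],
        hyperparameter_ranges=hyperparameter_ranges,
        strategy="Bayesian",
        max_jobs=4,
        max_parallel_jobs=1,          # few concurrent jobs, so later jobs can learn from earlier ones
        warm_start_config=warm_start_config,
        early_stopping_type="Auto",   # stop training jobs whose metric is not improving
    )


# Usage (requires AWS credentials; all three arguments are your own objects):
# tuner = build_warm_start_tuner(estimator, ranges, "my-previous-tuning-job")
# tuner.fit(inputs=my_data_channels)
```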

![](2024-01-01-22-10-59.png)

![](2024-01-01-22-11-14.png)

![](2024-01-01-22-11-43.png)

![](2024-01-01-22-12-32.png)

![](2024-01-01-22-17-19.png)

![](2024-01-01-22-18-07.png)

![](2024-01-01-22-22-33.png)

![](2024-01-01-22-23-19.png)

![](2024-01-01-22-24-18.png)

![](2024-01-01-22-25-51.png)

![](2024-01-01-22-26-50.png)

![](2024-01-01-22-30-51.png)

![](2024-01-01-22-32-08.png)

## **ML Training Challenges**

### **Checkpointing**

![](2024-01-01-22-54-00.png)

![](2024-01-01-22-56-45.png)

![](2024-01-01-22-58-48.png)

![](2024-01-01-23-00-42.png)

Checkpointing is a way to save the current state of a running training job so that the job, if it is stopped, can be resumed from a known state. Checkpoints are basically snapshots of the model in training. They include the model architecture, which allows you to recreate the model once training has stopped, and the model weights that have been learned so far. They also include the training configuration, such as the number of epochs that have been executed, the optimizer used, the loss observed so far, and other metadata. Finally, checkpoints include the optimizer state, which allows you to easily resume the training job from where it stopped.

When configuring a new training job with checkpointing, take two things into consideration: the frequency of checkpointing, and the number of checkpoint files you save each time. If you checkpoint frequently and save several files each time, you quickly use up storage. However, this high frequency and high number of checkpoints allow you to resume your training jobs without losing any training state. On the other hand, if the frequency and the number of checkpoints are low, you save on storage space, but some of the training state may be lost when the training job is stopped. When configuring your training jobs with these parameters, balance your storage costs against your productivity requirements.

Next, I will discuss the Amazon SageMaker Managed Spot capability, which allows you to save on training costs. Managed Spot is based on the concept of Spot Instances, which offer spare, unused capacity to users at discounted prices. SageMaker Managed Spot uses these Spot Instances for hyperparameter tuning and training, and leverages checkpointing to resume training jobs easily.

Here's how it works. You start a training job in a Docker container on a Spot Instance, using a training script called train.py. Since Spot Instances can be preempted and terminated with just a two-minute notice, it is important that your train.py file implements the ability to save checkpoints and to resume from checkpoints. SageMaker Managed Spot does the rest. It automatically backs up the checkpoints to an S3 bucket. If a Spot Instance is terminated because of a lack of capacity, SageMaker Managed Spot continues to poll for additional capacity. Once capacity becomes available, a new Spot Instance is created to resume your training, and the service automatically transfers the dataset as well as the checkpoints saved in the S3 bucket to the new instance so that training can be resumed. The key to taking advantage of the Managed Spot capability is implementing your training script so that it periodically saves checkpoints and can resume from a saved checkpoint.
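The save-and-resume behavior that a training script needs can be sketched in plain Python. This is a toy illustration: the "model" is a single weight, the checkpoint is a JSON file in a temporary directory, and the `stop_after` argument simulates an interruption. A real script would save framework-specific model and optimizer state instead.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the training state atomically so a crash never leaves a partial file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or None when no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)


def train(checkpoint_path, total_epochs, stop_after=None):
    """Toy training loop that resumes from the last checkpoint if one exists."""
    state = load_checkpoint(checkpoint_path) or {
        "epoch": 0,
        "weights": [0.0],
        "optimizer_state": {"lr": 0.1},
    }
    epochs_run = 0
    for epoch in range(state["epoch"], total_epochs):
        if stop_after is not None and epochs_run == stop_after:
            break  # simulate the instance being reclaimed
        lr = state["optimizer_state"]["lr"]
        state["weights"] = [w + lr for w in state["weights"]]  # stand-in for a real update
        state["epoch"] = epoch + 1
        save_checkpoint(checkpoint_path, state)  # checkpoint after every epoch
        epochs_run += 1
    return state


checkpoint_path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
interrupted = train(checkpoint_path, total_epochs=5, stop_after=2)
resumed = train(checkpoint_path, total_epochs=5)
print(interrupted["epoch"], resumed["epoch"])
```

Checkpointing after every epoch is the high-frequency end of the trade-off described above; saving less often reduces storage but repeats more work after an interruption.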

### **Distributed Training Strategies**

![](2024-01-01-23-07-05.png)

Training at scale challenges come in two flavors. One is the increased training data volume, and the second is the increased model complexity and model size that result from it. Now, using huge amounts of training data and the resulting model complexity could give you a more accurate model. However, there is always a physical limit on the amount of training data or the size of the model that you can fit in the memory of a single compute instance. Even if you use a very powerful CPU or GPU instance, increased training data volume typically means an increased number of computations during training, and that can lead to long-running training jobs.

Distributed training is a technique often used to address these scale challenges. In distributed training, the training load is split across multiple CPUs and GPUs, also called devices, within a single compute node. Or the load can be distributed across multiple compute nodes or compute instances that form a compute cluster. Regardless of whether you distribute the training load within a single compute node or across multiple compute nodes, there are two distributed training strategies at play: data parallelism and model parallelism.

Let's start with data parallelism. With data parallelism, the training data is split across the multiple nodes of the training cluster. The underlying algorithm or neural network is replicated on each individual node of the cluster. Batches of data are trained on all nodes using the algorithm, and the final model is the result of combining the results from each individual node. On the other hand, in model parallelism, the underlying algorithm, or the neural network in this case, is split across the multiple nodes. Batches of data are sent to all of the nodes so that each batch can be processed by the entire neural network. The results are once again combined for a final model.

Now, how does this look in code? Let's take a look. Here's the code for using a SageMaker Estimator. This code should look very familiar to you by now, since you have used it in multiple labs across the specialization. With the Estimator, in addition to specifying the instance_count, instance_type, and your own training script, you enable distributed training by specifying one additional parameter called distribution. Here, for the distribution parameter, you can see that I'm enabling data parallelism using the smdistributed option. Similarly, if you'd like to enable model parallelism, your code would look like this.

Now that you know how to start implementing the data parallelism and model parallelism strategies in your training jobs, how do you choose which one to use for your specific requirements? When choosing a distributed training strategy, always keep in mind that if you're training across multiple nodes or instances, there is always a certain training overhead. The overhead comes in the form of internode communication, because of the data that needs to be exchanged between the nodes of the cluster.

Let's make this a little clearer using a flow chart that will help us choose a distribution strategy. If the trained model can fit in a single node's memory, use data parallelism. If the model cannot fit in a single node's memory, you have some experimentation to do to see if you can reduce the model size to fit on a single node. One thing you can try is tuning the hyperparameters. Tuning hyperparameters such as the number of layers in your neural network, as well as the optimizer to use, has a considerable effect on the final model size. Another thing you can try is reducing the batch size. Decrease the batch size incrementally to see if the final model can fit in a single node's memory. Additionally, you can try to reduce the model input size. If, for example, your model takes text input, consider embedding the text with a lower-dimensional embedding vector. Or if your model takes images as input, try reducing the image resolution. After trying these experiments, go back and check whether the final model fits in a single node's memory. If it does, use data parallelism. If, even after these experiments, the model is too big to fit in a single node's memory, choose model parallelism.
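The `distribution` values and the selection flow chart described above can be sketched as follows. The dictionaries follow the `smdistributed` layout used by the SageMaker Python SDK; the model parallel parameter values and the `choose_strategy` helper (including its size arguments in GB) are illustrative assumptions, not part of the course code.

```python
# Value for the Estimator's `distribution` argument: data parallelism.
data_parallel_distribution = {"smdistributed": {"dataparallel": {"enabled": True}}}

# Value for the Estimator's `distribution` argument: model parallelism.
# The parameter values below are illustrative assumptions.
model_parallel_distribution = {
    "smdistributed": {
        "modelparallel": {
            "enabled": True,
            "parameters": {"partitions": 2, "microbatches": 4},
        }
    },
    "mpi": {"enabled": True, "processes_per_host": 8},
}


def choose_strategy(model_size_gb, node_memory_gb, resized_model_size_gb=None):
    """Hypothetical helper that follows the flow chart in the text.

    resized_model_size_gb is the model size after experiments such as tuning
    hyperparameters, reducing the batch size, or shrinking the model input.
    """
    if model_size_gb <= node_memory_gb:
        return "data_parallel"
    if resized_model_size_gb is not None and resized_model_size_gb <= node_memory_gb:
        return "data_parallel"
    return "model_parallel"


print(choose_strategy(10, 16))      # model already fits on one node
print(choose_strategy(40, 16, 12))  # fits after resizing
print(choose_strategy(40, 16, 24))  # still too large after resizing
```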

![](2024-01-01-23-13-00.png)

![](2024-01-01-23-14-17.png)

![](2024-01-01-23-14-40.png)

![](2024-01-01-23-16-13.png)

![](2024-01-01-23-17-59.png)

![](2024-01-01-23-18-32.png)

![](2024-01-01-23-18-49.png)

![](2024-01-01-23-19-25.png)

![](2024-01-01-23-20-05.png)

![](2024-01-01-23-23-23.png)

### **Custom Algorithms with Amazon SageMaker**

What if you want to use your own custom algorithm implementation, with your own training and inference logic, but want to take advantage of the infrastructure managed by SageMaker? You can absolutely do that. In this video, I will walk you through the steps for bringing your own custom algorithms onto SageMaker.

But first, here's a quick review of the options available to you on Amazon SageMaker. You can use the SageMaker built-in algorithms, like the BlazingText algorithm for your NLP problems. Using a built-in algorithm looks like this in code. Here, you use the estimator object and pass in the image URI. The image URI points to a container that contains the implementation of the built-in algorithm as well as the training and inference logic. The next option is to bring your own script. Here, you use a SageMaker-provided container, such as a PyTorch container, but you provide your own training script to be used during training with that container. Here's a look at the code. You're using the PyTorch container for the estimator and passing in your own script, train.py, for training. As you can see, there is a common theme of using containers across these options. When it is time to bring your own algorithm, you also create a container and bring that container to SageMaker. That's the third option, bringing your own container.

Bringing your own container to SageMaker consists of four steps. First, you create the code that captures all the logic, and then you containerize the code. Once you have the container ready, you register it with Amazon ECR, the Elastic Container Registry. Once the container is registered with ECR, you can use the image URI of the registered container with the estimator object. Let's dive a little deeper into each of these steps.

The first step is codifying your logic. The code should include the logic for the algorithm that you want to implement, as well as the training logic and the inference logic. Once you have the code ready, the next step is to containerize it. Here, you create a Docker container using the docker build command that you see on the screen. Once you have the Docker container ready, the next step is to register it with Amazon ECR. First, you create a repository to hold your algorithm logic as a container, and into that repository you push the container from the previous step using the docker push command. Once the push command succeeds, you have registered your container with Amazon ECR. The registered container can be accessed with an image URI that you can use to finally create an estimator. The format of the image URI is as shown on the screen. Once you have that image URI, you simply create an estimator object by passing in the URI. After this point, using the estimator is very similar to how you would use an estimator object with a built-in algorithm. Using the four steps outlined in this video, you can bring your custom algorithm implementation, and train and host the model on the infrastructure that is managed by SageMaker.
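As a small illustration of the last step, the image URI of a container registered in Amazon ECR follows the pattern `<account-id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>` in the standard commercial regions. The helper name, account ID, region, and repository below are placeholders, not values from the course.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build the image URI for a container pushed to Amazon ECR."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Placeholder values for illustration only.
image_uri = ecr_image_uri("123456789012", "us-east-1", "my-custom-algorithm")
print(image_uri)

# With the SageMaker Python SDK, the URI would then be passed to a generic
# estimator, roughly like this (requires AWS credentials and your own role):
# from sagemaker.estimator import Estimator
# estimator = Estimator(image_uri=image_uri, role=my_role_arn,
#                       instance_count=1, instance_type="ml.m5.large")
```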

![](2024-01-01-23-30-07.png)

![](2024-01-01-23-30-37.png)

![](2024-01-01-23-31-02.png)

![](2024-01-01-23-32-02.png)

![](2024-01-01-23-32-40.png)

![](2024-01-01-23-33-04.png)

![](2024-01-01-23-34-03.png)

![](2024-01-01-23-34-32.png)

![](2024-01-01-23-35-15.png)

![](2024-01-01-23-35-43.png)

![](2024-01-01-23-36-46.png)

![](2024-01-01-23-37-20.png)