### Example X: Ray ML Workers

This example uses Ray actors to show that actors behave as stateful service centers: each actor holds state (here, a loaded model) between calls.

We start with a simple program that runs the ResNet50 network to classify 1000 images from ImageNet, one for each class. 
The data is from https://github.com/EliSchwartz/imagenet-sample-images

In [None]:
import time
import os

from resnet50 import ResNet50

# an object to run the ResNet50 model
srn50 = ResNet50()

# JPEG image files to classify
directory = '../data/imagenet1000'

start_time = time.time()  # Get the current time

# iterate over the sample images
for filename in os.listdir(directory):
    if filename.endswith(".JPEG"):
        try:
            file_path = os.path.join(directory, filename)
            
            # classify the image and return top predicted classes
            preds = srn50.classify_image(file_path)
            print(f"Filename {filename}: predictions {preds}")
        except Exception as e:
            # catch Exception rather than using a bare except, and report the cause
            print(f"Failed to classify {filename}, probably an image error: {e}")
            
end_time = time.time()  # Get the current time again

execution_time = end_time - start_time
print(f"Execution time: {execution_time:.2f} seconds")

The file [resnet50.py](resnet50.py) shows how little code it takes to run a deep learning model for computer vision. The constructor loads a pre-trained model and the parameters needed to normalize input images. The function `classify_image` converts the image to a normalized tensor, evaluates the model on that tensor, and then extracts the class names for the top predictions.
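
The last step, extracting class names for the top predictions, can be sketched in plain Python. This is a minimal illustration, not the code in `resnet50.py`: the helper `top_k_predictions`, the toy scores, and the four labels are all assumptions made for the example, and a real model would produce 1000 scores as a tensor.

```python
import math


def top_k_predictions(logits, class_names, k=3):
    """Return the k most probable (class_name, probability) pairs."""
    # softmax, subtracting the maximum first for numerical stability
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # indices of the k largest probabilities, most likely first
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(class_names[i], probs[i]) for i in top]


# toy raw scores and labels (hypothetical, for illustration only)
scores = [0.5, 3.0, 1.0, -2.0]
names = ["tench", "goldfish", "shark", "ray"]

top2 = top_k_predictions(scores, names, k=2)
print(top2)
```

The highest raw score belongs to `goldfish`, so it is listed first, followed by `shark`.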

This is a serial implementation: one object runs in a single thread. It could be parallelized in many ways; for example, we could use `joblib` to create multiple processes. Here we use `ray` to build a set of distributed actors. The idea is to instantiate several actors, each of which has loaded the model. Loading is a one-time cost paid at instantiation. We can then call remote functions on the actors to classify images. The actors persist between calls and act as service centers for parallel work.
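
The "load once, serve many calls" idea can be illustrated without Ray. The sketch below is a stand-in built on the standard library, not the Ray implementation: `ToyModel` and `Worker` are hypothetical names, the "model" only counts its calls, and each worker is a single thread rather than a separate process as a Ray actor would be, so it shows the pattern and not the speedup.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyModel:
    """Stand-in for an expensive-to-load model (hypothetical)."""

    loads = 0  # how many times the "expensive" constructor has run

    def __init__(self):
        ToyModel.loads += 1
        self.calls = 0  # per-instance state that survives between calls

    def classify(self, path):
        self.calls += 1
        return f"label-for-{path}"


class Worker:
    """One model plus one thread: calls on a worker run one at a time."""

    def __init__(self):
        self.model = ToyModel()  # one-time cost, paid at instantiation
        self._pool = ThreadPoolExecutor(max_workers=1)

    def submit(self, path):
        # returns a Future immediately; the work happens in the worker's thread
        return self._pool.submit(self.model.classify, path)

    def shutdown(self):
        self._pool.shutdown()


workers = [Worker() for _ in range(4)]
paths = [f"img{i}.JPEG" for i in range(12)]

# hand out work round-robin, then collect results in submission order
futures = [workers[i % len(workers)].submit(p) for i, p in enumerate(paths)]
results = [f.result() for f in futures]

for w in workers:
    w.shutdown()

print(ToyModel.loads, [w.model.calls for w in workers])
```

With four workers and twelve inputs, the model is constructed four times in total and each worker serves three calls, which is the property that makes long-lived actors worthwhile when loading is expensive.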

The Ray implementation in [rayresnet50.py](rayresnet50.py) is the same code with one change: the class carries the `@ray.remote` decorator, which indicates that it will run as a Ray actor. Most of the complexity lies in the driver code, which must launch the remote functions on the actors and collect their results asynchronously.

In [None]:
from rayresnet50 import RayResNet50
import ray
import time
import os

num_actors = 4

# start Ray with one CPU per actor; do not fail if Ray is already initialized
ray.init(num_cpus=num_actors, ignore_reinit_error=True)

# create the actors and store actor handles; each actor loads the model once
actors = [RayResNet50.remote() for _ in range(num_actors)]

directory = '../data/imagenet1000'
files = os.listdir(directory)
roids = [None] * len(files)

start_time = time.time()  # Get the current time

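# submit one classification task per image, round-robin across the actors
for i, filename in enumerate(files):
    if filename.endswith(".JPEG"):
        file_path = os.path.join(directory, filename)
        roids[i] = actors[i % num_actors].classify_image.remote(file_path)

# collect results in submission order; ray.get blocks until each task finishes
for i, filename in enumerate(files):
    if roids[i] is None:
        continue  # not a JPEG, so no task was submitted
    try:
        preds = ray.get(roids[i])
        print(f"Filename {filename}: predictions {preds}")
    except Exception as e:
        # report failures instead of silently discarding them
        print(f"Failed to classify {filename}: {e}")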

end_time = time.time()  # Get the current time again

execution_time = end_time - start_time
print(f"Execution time: {execution_time:.2f} seconds")
