### Activity 7: Communicating Ray Actors

(due Friday December 8, 2023 5:00 pm)

This is a short exercise to demonstrate how actors can communicate through remote object IDs (OIDs).
We are going to break the actor of the ImageNet classification [Example 24](../../examples/24_ex_ray_actors.ipynb) into
two actors: one that transforms the image into a ResNet50-compatible tensor and one that takes
the tensor as input and returns the classification.

You have been given two class files that have been written to be instantiated as Ray actors:
  * [rayresnet50_normalize](./rayresnet50_normalize.py)
  * [rayresnet50_classify](./rayresnet50_classify.py)

To complete the exercise you need to populate the following driver code.  Then answer the questions.

Data is from https://github.com/EliSchwartz/imagenet-sample-images.

Note: check your output to make sure that the predictions match the input file. This classifier should be over 90% correct. You need to be careful to match the return OIDs with files. **Include the cell output in the submitted notebook**.
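
One way to check that predictions line up with their input files is to compare each prediction against the label embedded in the filename. The sketch below rests on two assumptions: that filenames follow the pattern `<wordnet id>_<label>.JPEG`, and that each prediction can be reduced to a single label string. The helper names and the `sample` entries are hypothetical and hand-written for illustration; they are not output from the classifier.

```python
def label_from_filename(filename):
    """Extract the label from a name like 'n01440764_tench.JPEG' (assumed format)."""
    stem = filename.rsplit(".", 1)[0]
    _, _, label = stem.partition("_")  # drop the leading WordNet id
    return label.replace("_", " ").lower()


def fraction_correct(results):
    """Return the share of files whose predicted label matches the filename label.

    `results` maps filename -> predicted label string (hypothetical format).
    """
    if not results:
        return 0.0
    hits = sum(
        1
        for name, pred in results.items()
        if label_from_filename(name) == pred.lower()
    )
    return hits / len(results)


# Hand-written illustrative entries, not real classifier output.
sample = {
    "n01440764_tench.JPEG": "tench",
    "n01443537_goldfish.JPEG": "goldfish",
    "n01484850_great_white_shark.JPEG": "tiger shark",
}
accuracy = fraction_correct(sample)
print(f"{accuracy:.2f}")
```

An exact string comparison is strict; if the classifier returns several candidate labels or differently formatted names, the matching rule would need to be adapted.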

In [None]:
!pip install torchvision


In [1]:
from rayresnet50_normalize import RRN50Normalize
from rayresnet50_classify import RRN50Classify
import ray
import time
import os

num_actors=4

# script to drive parallel program
ray.init(num_cpus=num_actors, ignore_reinit_error=True)

### TODO instantiate 4 normalization actors
normalize_actors = [RRN50Normalize.remote() for _ in range(num_actors)]

### TODO instantiate 4 classification actors
classify_actors = [RRN50Classify.remote() for _ in range(num_actors)]

directory = '../../data/imagenet1000'
files = os.listdir(directory)

classify_oids = {}

start_time = time.time()  # Get the current time

for i, filename in enumerate(files):
    if filename.endswith(".JPEG"):
        file_path = os.path.join(directory, filename)

        ### TODO call remote to normalize image into tensor
        normalized_img = normalize_actors[i % num_actors].normalize_image.remote(file_path)

        ### TODO call remote to classify tensor
        # Passing the object reference makes classification wait for normalization.
        classify_oid = classify_actors[i % num_actors].classify_image.remote(normalized_img)

        ### TODO store the oids needed to complete the computation
        # Key each OID by its filename so results can be matched to inputs.
        classify_oids[filename] = classify_oid

# Retrieve results only for the files that were actually submitted.
for filename, classify_oid in classify_oids.items():
    try:
        preds = ray.get(classify_oid)
        print(f"Filename {filename}: predictions {preds}")
    except Exception as e:
        print(f"Error in retrieving results for file {filename}: {e}")

end_time = time.time()  # Get the current time again

execution_time = end_time - start_time
print("Execution time: ", execution_time, " seconds")
ray.shutdown()



ModuleNotFoundError: No module named 'torchvision'

### Questions

* Question 1: Does the computation for a single input file (normalization and classification) run in serial or parallel?  If serially, how is the dependency enforced?

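Each file is normalized and classified serially, because the classifier requires the normalized tensor (`normalized_img`) as input to the `classify_image` function. The dependency is enforced by Ray: the object reference returned by `normalize_image.remote` is passed as an argument to `classify_image.remote`, so Ray does not run the classification until the normalized tensor is available.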

* Question 2: Does the computation of different files run in serial or parallel?  If parallel, explain why they are independent. 

* Question 3: Your computation needs to collect return identifiers for the classification objects. It is not necessary to collect the OIDs of the normalization function in the driver code. Why?

* Question 4: At any given point in time, how many actors are running and what are they doing?

* Question 5: Is this implementation faster or slower than doing the normalization and classification in one actor?  Can you think of a situation in which it would be faster to do them together?  (By situation, I mean data properties or target hardware system on which this would be preferable.) 