### Model Composition ServerHandle APIs

© 2019-2022, Anyscale. All Rights Reserved

### Learning Objective:
In this tutorial, you will learn how to:

 * compose complex models using ServeHandle APIs
 * deploy each discreate model as a seperate model deployment
 * use a single class deployment to include individual as a single model composition
 * deploy and serve this singluar model composition


In this short tutorial we going to use HuggingFace Transformer 🤗 to accomplish three tasks:
 1. Analyse the sentiment of a tweet: Positive or Negative
 2. Translate it into French
 3. Demonstrate the model composition deployment pattern using ServeHandle APIs


In [1]:
from transformers import TranslationPipeline, TextClassificationPipeline
from transformers import AutoTokenizer, AutoModelWithLMHead, AutoModelForSequenceClassification
import torch
import requests
import ray
from ray import serve

These are example 🐦 tweets, some made up, some extracted from a dog lover's twitter handle. In a real use case,
these could come live from a Tweeter handle using [Twitter APIs](https://developer.twitter.com/en/docs/twitter-api/getting-started/getting-access-to-the-twitter-api). 

In [2]:
TWEETS = ["Tonight on my walk, I got mad because mom wouldn't let me play with this dog. We stared at each other...he never blinked!",
          "Sometimes. when i am bored. i will stare at nothing. and try to convince the human. that there is a ghost",
          "You little dog shit, you peed and pooed on my new carpet. Bad dog!",
          "I would completely believe you. Dogs and little children - very innocent and open to seeing such things",
          "You've got too much time on your paws. Go check on the skittle. under the, fridge",
          "You sneaky little devil, I can't live without you!!!",
          "It's true what they say about dogs: they are you BEST BUDDY, no matter what!",
          "This dog is way dope, just can't enough of her",
          "This dog is way cool, just can't enough of her",
          "Is a dog really the best pet?",
          "Cats are better than dogs",
          "Totally dissastified with the dog. Worst dog ever",
          "Brilliant dog! Reads my moods like a book. Senses my moods and reacts. What a companinon!"
          ]

Utiliy function to fetch a tweet; these could very well be live tweets coming from Twitter API for a user or a #hashtag

In [3]:
def fetch_tweet_text(i):
    text = TWEETS[i]
    return text

### Sentiment model deployment

Our class deployment model to analyse the tweet using a pretrained transformer from HuggingFace 🤗.
Note we have number of `replicas=1` but to scale it, we can increase the number of replicas, as
we have done below.

In [4]:
@serve.deployment(num_replicas=1)
class SentimentTweet:
    def __init__(self):
        # self.tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
        # self.model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
        # self.pipeline = TextClassificationPipeline(model=self.model, tokenizer=self.tokenizer, task="sentiment-analysis")
        pass

    @ray.method(num_returns=2)
    def sentiment(self, text: str):
        # return self.pipeline(text)[0]['label'], self.pipeline(text)[0]['score']
        return ("POSTIVE", 99.9)

### Translation model deployment

Our class to translate a tweet from English --> French using a pretrained Transformer from HuggingFace 🤗

In [6]:
# class to translate a tweet from English --> French 
# using a pretrained Transformer from HuggingFace
@serve.deployment(num_replicas=2)
class TranslateTweet:
    def __init__(self):
        #  self.tokenizer = AutoTokenizer.from_pretrained("t5-small")
        #  self.model = AutoModelWithLMHead.from_pretrained("t5-small")
        #  self.use_gpu = 0 if torch.cuda.is_available() else -1
        #  self.pipeline = TranslationPipeline(self.model, self.tokenizer, task="translation_en_to_fr", device=self.use_gpu)
        pass

    def translate(self, text: str):
        return self.pipeline(text)[0]['translation_text']

### Use the Model Composition pattern

A composed class is deployed with both sentiment analysis and translations models' ServeHandles initialized in the constructor

In [7]:
@serve.deployment(route_prefix="/composed", num_replicas=2)
class ComposedModel:
    def __init__(self, translate, sentiment):
        # fetch and initialize deployment handles
        self.translate_model = translate
        self.sentiment_model = sentiment

    async def __call__(self, http_request):
        data = await http_request.json()
        sentiment_ref, score_ref =  await self.sentiment_model.sentiment.remote(data)
        print(f"sentiment_ref:{sentiment_ref}, score_ref:{score_ref}")
        trans_text_ref = await self.translate_model.translate.remote(data)
        print(f"trans_text_ref:{trans_text_ref}")
        sentiment_val, score_val = ray.get([sentiment_ref, score_ref])
        trans_text = ray.get(trans_text_ref)

        return {'Sentiment': sentiment_val, 'score': score_val, 'Translated Text': trans_text}

Start a Ray Serve instance. Note that if Ray cluster does not exist, it will create one and attach the Ray Serve
instance to it. If one exists it'll run on that Ray cluster instance.

In [8]:
sentiment_cls_node = SentimentTweet.bind()
translate_cls_node = TranslateTweet.bind()
compose_cls_node = ComposedModel.bind(sentiment_cls_node, translate_cls_node)

serve.run(compose_cls_node)

2022-08-09 07:50:39,769	INFO worker.py:1481 -- Started a local Ray instance. View the dashboard at [1m[32mhttp://127.0.0.1:8265[39m[22m.
[2m[36m(ServeController pid=20225)[0m INFO 2022-08-09 07:50:40,764 controller 20225 http_state.py:129 - Starting HTTP proxy with name 'SERVE_CONTROLLER_ACTOR:SERVE_PROXY_ACTOR-c0b996a62ffd847f9b166635a50042959480ac91fc72c268fb72d187' on node 'c0b996a62ffd847f9b166635a50042959480ac91fc72c268fb72d187' listening on '127.0.0.1:8000'
[2m[36m(ServeController pid=20225)[0m INFO 2022-08-09 07:50:41,280 controller 20225 deployment_state.py:1232 - Adding 1 replicas to deployment 'SentimentTweet'.
[2m[36m(ServeController pid=20225)[0m INFO 2022-08-09 07:50:41,307 controller 20225 deployment_state.py:1232 - Adding 2 replicas to deployment 'TranslateTweet'.
[2m[36m(ServeController pid=20225)[0m INFO 2022-08-09 07:50:41,314 controller 20225 deployment_state.py:1232 - Adding 2 replicas to deployment 'ComposedModel'.
[2m[36m(HTTPProxyActor pid=20227

RayServeSyncHandle(deployment='ComposedModel')

### Send HTTP requests to our deployment model

In [9]:
tweet = fetch_tweet_text(0)
print(f"Sending tweet request... : {tweet}")
response = requests.post("http://127.0.0.1:8000/composed", json=tweet)
print(response.text)

Sending tweet request... : Tonight on my walk, I got mad because mom wouldn't let me play with this dog. We stared at each other...he never blinked!
Task Error. Traceback: [36mray::ServeReplica:ComposedModel.handle_request()[39m (pid=20233, ip=127.0.0.1)
  File "/Users/jules/git-repos/ray/python/ray/serve/_private/utils.py", line 231, in wrap_to_ray_error
    raise exception
  File "/Users/jules/git-repos/ray/python/ray/serve/_private/replica.py", line 420, in invoke_single
    result = await method_to_call(*args, **kwargs)
  File "/var/folders/zc/tmtrbwyn321fxfv_4xh_s0qm0000gn/T/ipykernel_19991/1168296825.py", line 10, in __call__
TypeError: cannot unpack non-iterable ray._raylet.ObjectRef object.


In [None]:
for i in range(len(TWEETS)):
    tweet = fetch_tweet_text(i)
    print(f"Sending tweet request... : {tweet}")
    response = requests.post("http://127.0.0.1:8000/composed", json=tweet)
    print(response.text)

Gracefully shutdown the Ray serve instance.

In [None]:
serve.shutdown()

### Exercise

1. Add more tweets to `TWEETS` with different sentiments.
2. Check the score (and if you speak and read French, what you think of the translation?)

### Homework

1. Instead of French, use a language transformer of your choice
2. What about Neutral tweets? Try using [vaderSentiment](https://github.com/cjhutto/vaderSentiment)
3. Solution for 2) is [here](https://github.com/anyscale/academy/blob/main/ray-serve/05-Ray-Serve-SentimentAnalysis.ipynb)

### Next

We'll further explore model composition using [Deploymant Graph APIs](https://docs.ray.io/en/latest/serve/deployment-graph.html).

📖 [Back to Table of Contents](./ex_00_tutorial_overview.ipynb)<br>
➡ [Next notebook](./ex_04_inference_graphs.ipynb) <br>
⬅️ [Previous notebook](./ex_02_ray_serve_fastapi.ipynb) <br>