
Conversation

@sayakpaul
Member

Done with @deep-diver.

Cc: @osanseviero

title: "Deploying 🤗 ViT on Kubernetes with TF Serving"
authors: chansung, sayakpaul
thumbnail: /blog/assets/86_nystromformer/thumbnail.png
date: August 15, 2022
Member Author

Tentative.

Contributor

@merveenoyan left a comment

Thanks a lot for this blog post, I refreshed my GCP knowledge while reviewing! I left a couple of suggestions and nits 🙂

Therefore, we assume some familiarity with Docker and Kubernetes.

This post builds on top of the [<u>previous post</u>](https://huggingface.co/blog/tf-serving-vision). So, we highly
recommend reading it if not already done. You can find all the code
Contributor

Suggested change
recommend reading it if not already done. You can find all the code
recommend reading it if it's not already read. You can find all the code

Member Author

Not too sure about this passive voice.

import base64
import tensorflow as tf

image_path = tf.keras.utils.get_file(
    "image.jpg", "http://images.cocodataset.org/val2017/000000039769.jpg"
)
Contributor

Maybe mention later that they can get it from a bucket URI as well (I think it's cool for the sake of being end-to-end here).

Member Author

Could you elaborate?

Contributor

I think it depends on who the end user is. For service consumers on personal devices, it is common to make requests with data from arbitrary sources. However, if TF Serving is just a part of a microservice architecture, then it would make sense to say "end-to-end" with a GCS bucket URI.

Maybe we could just leave a short comment like "you can get a file from GCS with the get_file API as well".

Member Author

@sayakpaul Aug 9, 2022

I don't think get_file() supports gs:// URIs.

I tried the following (video-api-storage bucket has public viewing enabled):

video_gcs_uri = "gs://video-api-storage/sample_video.mp4"
path_to_downloaded_file = tf.keras.utils.get_file("video.mp4", video_gcs_uri)

It results in:

Exception                                 Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/keras/utils/data_utils.py in get_file(fname, origin, untar, md5_hash, file_hash, cache_subdir, hash_algorithm, extract, archive_format, cache_dir)
    279         raise Exception(error_msg.format(origin, e.code, e.msg))
    280       except urllib.error.URLError as e:
--> 281         raise Exception(error_msg.format(origin, e.errno, e.reason))
    282     except (Exception, KeyboardInterrupt) as e:
    283       if os.path.exists(fpath):

Exception: URL fetch failure on gs://video-api-storage/sample_video.mp4: None -- unknown url type: gs

From the source code, we can see that it doesn't have any GCS-specific utilities built in.
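
For completeness, one way to read an object directly from a publicly readable GCS bucket is tf.io.gfile, which does understand gs:// paths. A minimal sketch, reusing the bucket from the test above and assuming the runtime has TensorFlow's GCS filesystem support available:

import tensorflow as tf

# Unlike tf.keras.utils.get_file(), tf.io.gfile understands gs:// paths
# (provided TensorFlow's GCS filesystem support is available in the runtime).
video_gcs_uri = "gs://video-api-storage/sample_video.mp4"

with tf.io.gfile.GFile(video_gcs_uri, "rb") as f:
    video_bytes = f.read()

print(f"Fetched {len(video_bytes)} bytes from GCS.")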

various options to tailor the deployment based on your application use
case. Below, we briefly discuss some of them.

**`enable_batching`** enables the batch inference capability that
Contributor

Batching in the cloud is particularly useful if you don't want to pay continuously (I think this is super cool info to mention so people can save 💴 💵), but it's your choice 😅

Contributor

Once we have an instance of TF Serving, people can choose where to deploy it, so we left just a general idea about the batching capability. We will discuss it in more depth in the next blog posts.

However, I really appreciate your comment since I hadn't thought about it as a money saver :)

Member Author

It's a money saver only when you can afford to run things in an offline mode. We can definitely mention that in a bit more detail.

Member Author

Added a note.
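
To make the batching discussion concrete: this is a server-side option, so a client request looks the same either way; with batching enabled, TF Serving simply groups concurrent requests before running inference. A minimal sketch of such a request against the deployed REST endpoint (the host, port, and model name below are placeholders, and it assumes the exported model's serving signature accepts base64-encoded image strings):

import base64
import json

import requests
import tensorflow as tf

# Placeholder endpoint; substitute the external IP/hostname of your
# Kubernetes Service and the model name you configured for TF Serving.
ENDPOINT = "http://localhost:8501/v1/models/hf-vit:predict"

image_path = tf.keras.utils.get_file(
    "image.jpg", "http://images.cocodataset.org/val2017/000000039769.jpg"
)
with open(image_path, "rb") as f:
    b64_image = base64.b64encode(f.read()).decode("utf-8")

# Standard TF Serving REST payload; with batching enabled on the server,
# concurrent requests like this one get grouped into batches.
payload = json.dumps({"instances": [b64_image]})
response = requests.post(ENDPOINT, data=payload)
print(response.json())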

sayakpaul and others added 9 commits August 9, 2022 07:47
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
@sayakpaul
Member Author

@merveenoyan @deep-diver addressed almost all the PR comments. PTAL.

@sayakpaul sayakpaul requested a review from merveenoyan August 9, 2022 02:56
Contributor

@osanseviero left a comment

Cool blog post! Two suggestions:

  • Consistently capitalize Pods. It's somewhat inconsistent at the moment.
  • Try to be a bit more consistent between first and second person. Not a big issue though!

Thanks a lot for the contribution 🔥 it's amazing! 🚀

@sayakpaul
Member Author

Try to be a bit more consistent between first and second person. Not a big issue though!

You'd notice both "you" and "we". Yes, we're aware of it.

If you look into it a bit more deeply, you will notice that "we" is used quite sparingly, and only in cases where we deliberately want to highlight something about ourselves rather than address the reader.

In all the other cases, we have tried maintaining "you" throughout.

With this in mind, if you spot any inconsistencies regarding this, please let us know.

deep-diver and others added 5 commits August 9, 2022 21:43
@deep-diver
Contributor

@osanseviero @sayakpaul

addressed the comments

Comment on lines +572 to +573
We applied the same deployment workflow for an ONNX-optimized version of the same
Vision Transformer model. For more details, check out [this link](https://github.com/sayakpaul/deploy-hf-tf-vision-models/tree/main/hf_vision_model_onnx_gke). ONNX-optimized models are especially beneficial if you're using x86 CPUs for deployment.
Member Author

@osanseviero @merveenoyan added a note regarding ONNX deployments since many HF users are already familiar with it. Our repository supports deploying an ONNX-optimized version of the same ViT model to a Kubernetes cluster :)

Cc: @deep-diver

@deep-diver
Contributor

We need an approval from @merveenoyan :) Any more suggestions?

Contributor

@merveenoyan left a comment

LGTM! But I'd like @stevhliu to have a final look to see if there's anything left. @stevhliu, you can also merge, given that Omar and I will be gone next week (Omar is already off).

Member

@stevhliu left a comment

Super interesting read, especially for someone who doesn't know too much about Docker/Kubernetes! 🤗


In this section, you’ll see how to containerize that model using the
[<u>base TensorFlow Serving Image</u>](http://hub.docker.com/r/tensorflow/serving/tags/). TensorFlow Serving consumes models
in the [`SavedModel`](https://www.tensorflow.org/guide/saved_model) format. Recall how you
Member

Same comment as above about having the reader refer to another post.

Member Author

Maybe keep this one to empathize with the reader in case they feel clueless about the SavedModel?

Member

Sure, that's a good compromise :)
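
For readers who haven't gone through the previous post, a minimal sketch of what exporting such a model in the SavedModel format might look like (the checkpoint name and export path are illustrative, and the previous post goes further by embedding preprocessing and postprocessing into the exported model rather than saving the bare model as done here):

import tensorflow as tf
from transformers import TFViTForImageClassification

# Illustrative ViT checkpoint; swap in the model you actually want to serve.
model = TFViTForImageClassification.from_pretrained("google/vit-base-patch16-224")

# TF Serving consumes the SavedModel format; the trailing "1" is the
# version sub-directory TF Serving expects inside the model directory.
tf.saved_model.save(model, "hf-vit/1")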

sayakpaul and others added 14 commits August 11, 2022 21:40
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
@sayakpaul
Member Author

@stevhliu all comments addressed :)

@stevhliu stevhliu merged commit e9df715 into huggingface:main Aug 11, 2022
@sayakpaul sayakpaul deleted the add/tfserving-kubernetes branch August 11, 2022 16:31