
base inference server #1

Open
SkalskiP opened this issue Nov 11, 2022 · 5 comments

@SkalskiP
Owner

SkalskiP commented Nov 11, 2022

overview

The reason for this repository is to give makesense.ai users even more ways to support manual annotation with pre-trained models. So far, we've been using tensorflow.js for this purpose, but quite recently there was an issue SkalskiP/make-sense#293 where users asked us to add support for inference over HTTP. I've been thinking about this for a while now, so we're doing it!

scope

  • To start, the server should support only one architecture - YOLOv5 or YOLOv7. But when writing the service, please keep in mind that we will probably expand this in the future.
  • To start, the server should support only one CV task - object detection. Again, keep in mind that we will probably expand this in the future.
  • Communication must be over HTTP.
  • Authentication is not required, beyond perhaps a simple token that is randomly generated by the server at startup, printed to the console, and then included in each HTTP request.
  • Request and response format is to be determined - I do not have a preference at this time (see the client sketch after this list).
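
To make the HTTP bullets concrete, here is a minimal client sketch. This is not a decided format - the endpoint path, the Authorization header, and the response fields are placeholders for illustration only:

import requests

# Hypothetical endpoint and token - both are placeholders, not the final design.
SERVER_URL = "http://localhost:8080/v1/detect"
TOKEN = "token-printed-by-the-server-at-startup"

with open("image.jpg", "rb") as f:
    response = requests.post(
        SERVER_URL,
        headers={"Authorization": f"Bearer {TOKEN}"},
        files={"image": f},
    )

# A response could carry one entry per detected object, e.g.:
# {"detections": [{"x_min": 10, "y_min": 20, "x_max": 110, "y_max": 220,
#                  "class_name": "person", "confidence": 0.91}]}
print(response.json())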

scope (nice to have)

  • It would be nice if everything ran in Docker.
  • It would be nice if the server could be configured through a YAML file - for example, to define the location of weight files (see the sketch after this list).
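
A minimal sketch of the YAML-driven configuration idea. The file name and keys below are assumptions, only illustrating "define the location of weight files in a config file":

import yaml  # PyYAML

# Hypothetical config.yml:
#   model:
#     architecture: yolov5
#     weights_path: /models/yolov5s.pt
#   server:
#     port: 8080

with open("config.yml") as f:
    config = yaml.safe_load(f)

weights_path = config["model"]["weights_path"]
port = config["server"]["port"]
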
@hardikdava
Collaborator

@SkalskiP I checked the implemented code. In general, I don't think using TorchServe is a good idea. It could be replaced by a REST API server (Flask / FastAPI / Tornado / etc.) with model serving via ONNX, OpenCV DNN, TensorFlow, etc., using OpenCV for image handling. That way, the user is not bound to TorchServe. If I am missing some specific advantage, let me know. I can create such a small server within a day - I have already built something like that.
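
For reference, the kind of minimal REST server described above could look roughly like this. It is only a sketch: the endpoint name and the run_model stub are hypothetical, standing in for whatever backend (ONNX Runtime, OpenCV DNN, TensorFlow, ...) would be plugged in:

import numpy as np
import cv2
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

def run_model(frame):
    # Placeholder for the actual inference backend.
    return []

@app.post("/detect")  # hypothetical endpoint name
async def detect(image: UploadFile = File(...)):
    # Decode the uploaded bytes into an OpenCV image
    data = np.frombuffer(await image.read(), dtype=np.uint8)
    frame = cv2.imdecode(data, cv2.IMREAD_COLOR)
    detections = run_model(frame)
    return {"detections": detections}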

@PawelPeczek
Collaborator

@hardikdava - basically it is done. Obviously, you can implement it another way, but this approach supports ONNX, TRT, TF, and whatever else you want - TorchServe is "Torch"-oriented only in name. We now have YOLOv5 and YOLOv7, it is trivial to deploy TorchHub models, and other models should also work.
Obviously, this is only an example server - everyone can provide their own implementation; only the interface matters.
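
For context, loading YOLOv5 via TorchHub really is a one-liner; treat this as an illustration of the "trivial to deploy TorchHub" point rather than the repository's actual handler code:

import torch

# Load a pretrained YOLOv5s model through TorchHub
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
results = model("image.jpg")   # inference on a local image path
results.print()                # summary of detections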

@PawelPeczek
Collaborator

It is better now to focus on the integration on the labeling app side.

@SkalskiP
Owner Author

Hi @PawelPeczek and @hardikdava 👋!

I apologize for engaging so little here so far. The new job is weighing me down a bit. I promise to improve. :)

Guys, remember that this server is only an example that we will use for development and as a guideline for others to build an API that is compatible with make-sense. Others may use it, but they don't have to :)

@PawelPeczek could you describe in simple words how other, non-torch models can be deployed?

@PawelPeczek
Collaborator

Up to the details described in the readme: a custom dependency can be installed in the environment, and inside the model handler the model first needs to be loaded (this yields an object of some specific type - the torch.device object should only be used there to decide which device to use). Then, at inference time, a properly constructed handler function is given a reference to the model and the input image - this should be enough to run inference. If I were not so lazy, I could add an ONNX model, for instance 😂

As per the readme:

from typing import Any, Callable, List, Tuple
import numpy as np
import torch

ModelObject = Any  # this can be anything, even tensorflow if someone wanted
InferenceFunction = Callable[
    [ModelObject, List[np.ndarray], torch.device], List[InferenceResult]
]

and you are supposed to provide two functions in your module - the first is:

def load_model(
    context: Context, device: torch.device  # Context is the TorchServe handler context
) -> Tuple[ModelObject, InferenceFunction]:
    ...

and the second is the InferenceFunction matching the signature above.
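
As a rough illustration of how a non-torch backend could plug into that interface, here is a minimal ONNX Runtime sketch. The weights path, the input preprocessing, and the way raw outputs would be turned into InferenceResult objects are assumptions, not the repository's actual code:

from typing import Any, Callable, List, Tuple

import numpy as np
import onnxruntime as ort
import torch

ModelObject = Any

def onnx_inference(
    model: ort.InferenceSession,
    images: List[np.ndarray],
    device: torch.device,  # unused here - ONNX Runtime selects its own execution providers
) -> List[Any]:
    input_name = model.get_inputs()[0].name
    results = []
    for image in images:
        # Assumes the exported model expects NCHW float32 input; adjust to the real model.
        blob = image.transpose(2, 0, 1)[np.newaxis].astype(np.float32) / 255.0
        outputs = model.run(None, {input_name: blob})
        # Convert raw outputs into InferenceResult objects here (omitted).
        results.append(outputs)
    return results

def load_model(context: Any, device: torch.device) -> Tuple[ModelObject, Callable]:
    # "model.onnx" is a placeholder path; in practice it would come from the handler context.
    session = ort.InferenceSession("model.onnx")
    return session, onnx_inference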
