[ML] PyTorch Inference #1638
Conversation
The PyTorch repo is no longer cloned into the 3rd_party directory. Note: CI is not building this change; that is beyond the scope of this PR, but the CI change can be made in parallel.
3rd_party/3rd_party.sh
According to pytorch/pytorch#13130 (comment), if you build the binaries yourself then everything still works on aarch64. So I think the assumption should be that by the time we've got this working for all platforms, aarch64 will also work. If it proves to be completely impossible to get working then maybe we can exclude that platform, but that would create headaches for documentation, support and Cloud, and would make the corresponding Java code much more complex, as it would then also have to account for the hardware architecture. So even if it's a month of work to build for aarch64, it's probably less work company-wide than having a chunk of functionality that isn't available.
As noted in the description, aarch64 support is beta in the 1.7 release, which came out as this was being written. I removed the conditional as it is a no-op until the libraries are installed on the platform anyway.
bin/pytorch_inference/Main.cc
`EMPTY, EMPTY` makes this call a no-op. Other comments that have been pasted from the other programs make it sound like logging could go to a named pipe, so I think there should be a TODO here saying that log redirection is still left to do.
I added a log pipe argument to the program and now log to the pipe. `evaluate.py` reads the log pipe output, but in the main thread for now. I also removed the `--xxxIsPipe` arguments as pipes are the only option right now.
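For illustration only, here is a minimal sketch of routing log output to a named pipe. The argument handling and the JSON line format are assumptions, not the PR's actual implementation:

```cpp
// Minimal sketch: write log lines to a named pipe. The usage and the
// JSON format are illustrative assumptions, not the real program.
#include <fstream>
#include <iostream>
#include <string>

int main(int argc, char** argv) {
    if (argc < 2) {
        std::cerr << "usage: " << argv[0] << " <log pipe path>\n";
        return 1;
    }
    std::string logPipePath{argv[1]};

    // Opening a FIFO for writing blocks until a reader (for example
    // evaluate.py) opens the other end.
    std::ofstream logStream{logPipePath};
    if (!logStream.is_open()) {
        std::cerr << "Could not open log pipe " << logPipePath << '\n';
        return 1;
    }
    logStream << R"({"level":"INFO","message":"pytorch_inference started"})"
              << std::endl;
    return 0;
}
```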
> I added a log pipe argument to the program and now log to the pipe. evaluate.py reads the log pipe output but in the main thread for now.

Thanks.

> I also removed the --xxxIsPipe arguments as pipes are the only option right now
One thing to note about this is that it will add complexity in the production Java code in order to simplify a dev Python script - see https://github.com/elastic/elasticsearch/blob/d613a1a1d12890825b982c1a7a4fd6e17f20b325/x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/process/ProcessPipes.java#L110-L133
Theoretically on-disk files could have been used for the inputs if someone put the appropriate binary data in them. We sometimes use this technique for running autodetect through valgrind.
I would keep the arguments consistent between all the processes that connect to Java using named pipes.
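As a sketch of what that consistency might look like, here is a hypothetical option declaration following the `--xxxIsPipe` convention discussed above, using boost::program_options as the other ml-cpp binaries do. The exact option set is an assumption, not this PR's final interface:

```cpp
// Illustrative only: declare pipe arguments matching the --xxxIsPipe
// convention of the other processes that connect to Java.
#include <boost/program_options.hpp>
#include <iostream>
#include <string>

int main(int argc, char** argv) {
    namespace po = boost::program_options;
    po::options_description desc{"pytorch_inference options"};
    desc.add_options()
        ("input", po::value<std::string>(), "source of inference requests")
        ("inputIsPipe", "'input' is a named pipe")
        ("output", po::value<std::string>(), "destination for results")
        ("outputIsPipe", "'output' is a named pipe")
        ("restore", po::value<std::string>(), "source of the TorchScript model")
        ("restoreIsPipe", "'restore' is a named pipe")
        ("logPipe", po::value<std::string>(), "named pipe for log output");

    po::variables_map vm;
    po::store(po::parse_command_line(argc, argv, desc), vm);
    po::notify(vm);

    if (vm.count("logPipe")) {
        std::cout << "logging to " << vm["logPipe"].as<std::string>() << '\n';
    }
    return 0;
}
```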
I took out the `--xxxIsPipe` arguments because that is not the intended usage, but I can see the sense in keeping the arguments consistent with the other apps. It also made me realise the C++ app can be launched by the Python script, making testing a lot easier as you just have to run the Python now.
build-setup/pytorch/setup_pytorch.md
This raises the question of whether this PR is aiming for a dev-only prototype or the beginnings of a productionisation of the functionality.
If the intention is that it's the beginnings of productionisation then it shouldn't be "if you wish"; it should be a compulsory setup step, and the instructions should be in `build-setup/macos.md`, so that people who are setting up from scratch just have one set of instructions to follow. I know since you're specifically working on PyTorch it seems nicer to you to have that bit split out into a separate file, but in the future this will just be part of our required toolchain, and people set up one machine at a time, not one build component at a time, so it's better for the future to have all the instructions for a particular platform in one file.
The language was intentional: this is an optional step for everyone, which is the reason the top level makefile does not build the pytorch binary. In the future, once this code is in production, I can still see the value in having this step be optional; if you just want to build anomaly detection, why should you have to install Anaconda, clone the PyTorch repo, etc.? The use of a separate file in a subfolder follows the pattern set by `clion` and `strptime`, and the initial setup steps are the same for each platform.
It's only an optional step while this is a dev-only prototype. If we are ever going to ship this then it should be in the top level per-OS files.
`clion` is truly optional even for shipped code. It's not installed on any of the CI machines. `strptime` has a separate directory because the patches are too complex to inline in the top level build setup instructions. But the instructions for using the files in that directory are in the top level build setup instructions here.

You could say "the initial setup steps are the same for each platform" for Boost. So instead of having a Boost section in `macos.md`, `linux.md` and `windows.md` we could say in each of these files "now build Boost using the instructions in `boost/boost.md`" and direct people to another file that has separate sections for macOS, Linux and Windows. But I am not sure that would really make life easier for someone who had been tasked with setting up a particular type of build server. They'd have to look at two files instead of one, or N files instead of one if we did this for every component.
So the instructions could stay as-is for this PR, while we're not actually building the new program on every build, but eventually they need to move to be consistent with the other instructions.
build-setup/pytorch/setup_pytorch.md
People setting up build machines shouldn't be making decisions about how to do it. The instructions should tell them exactly which steps to follow. So for a controlled build it sounds like this step should be "You must install Anaconda". I am not sure how much it matters which version of Anaconda is used if we're effectively just using it to run the build script - presumably it doesn't play that much of a role in building the C++ libraries. But if I am wrong and it does affect the functionality of the C++ libraries that get built then the instructions should specify a specific version and link to the specific file to install from https://repo.anaconda.com/archive/ so that somebody setting up an official build server gets the prescribed version even if a more recent one has been released. That's only necessary if it affects the artifacts we'll redistribute though.
There's no need to mention "Additionally a C++14 compiler is required." The build instructions in `build-setup/macos.md` will already have ensured this. (For macOS it just means the Anaconda section should go after the Xcode section.)
The same can be achieved by `pip` installing the dependencies with Python 3.6 or above and installing CMake, but I didn't want to create a maintenance issue by describing the more complex option. You don't have to install Anaconda, but it comes with the build tools invoked by `setup.py`.
There will always be different ways to build things. Consider the instructions for building Boost on Linux. Suppose instead they said "become an expert in how to build Boost, then build it in the most appropriate way". That would be a complete nightmare for somebody who just wanted to get to the point where they could type `make -j 8` at the top level of the ml-cpp repo and get a working build.
You are the expert in how to build PyTorch, so you know several ways to do it that work. But for everyone else it's easiest if you choose one way that works and then write down a set of steps that can be mindlessly followed that get you to a working library.
If other experts read the build setup instructions and decide they know a better way to build one of the required components then that's fine, but the instructions should make it as simple and mechanical as possible for people who don't want to spend the time becoming experts in how to set up the tool chain to get something that works. Imagine that every sentence of the setup instructions needs to unambiguously specify one line of a shell script that automates it (which is pretty much what is required for the Dockerfiles).
build-setup/pytorch/setup_pytorch.md
I don't think the `MACOSX_DEPLOYMENT_TARGET=10.9` bit is worthwhile. We don't build our other code for 10.9. Currently it's 10.11 for 7.x and 10.14 for 8.x, but the default for Apple clang is to support the build platform and higher, and that's what we do for our other tools (i.e., for what we ship to customers we build 7.x on 10.11 and 8.x on 10.14).
From `setup_pytorch.md`:

```
## Build macOS
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
```
There's an assumption here that CMake is available, but it's not on a newly installed Mac, so instructions for installing CMake should be in these instructions. Probably before the `conda install` step, in case installing the Python `cmake` module requires that the `cmake` binary be on the path.
Happy to merge this to the feature branch and iterate
The application connects to 4 named pipes and accepts a TorchScript model on the 'restore' pipe. Input is a series of tokens from the BERT vocabulary and the output is a JSON document containing the results tensor. Logging is written to the 4th pipe. The script evaluate.py will start the app, connect the named pipes and run an example from the examples directory. This is the easiest way to test and develop the app, serving as a proxy for the Java/ES side until that is complete. The app will not be built by a call to the top level makefile, so the dependencies do not have to be installed yet.
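A rough sketch of the four-pipe wiring described above; the struct and function are hypothetical, and in the real app the pipe paths come from command-line arguments:

```cpp
// Hypothetical sketch of the four named-pipe connections. On POSIX
// systems a FIFO can be opened like an ordinary file; opening blocks
// until the other end (Java, or evaluate.py) connects.
#include <fstream>
#include <string>

struct PipeStreams {
    std::ifstream restore; // TorchScript model bytes
    std::ifstream input;   // inference requests (token IDs)
    std::ofstream output;  // JSON results tensor
    std::ofstream log;     // log lines
};

PipeStreams connectPipes(const std::string& restorePath,
                         const std::string& inputPath,
                         const std::string& outputPath,
                         const std::string& logPath) {
    return PipeStreams{std::ifstream{restorePath, std::ios::binary},
                       std::ifstream{inputPath, std::ios::binary},
                       std::ofstream{outputPath},
                       std::ofstream{logPath}};
}
```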
This application reads a TorchScript model from a named pipe; the input is encoded tokens from the BERT dictionary. The script `evaluate.py` starts the C++ app, connects the named pipes and runs a language modelling example where the response is the classification of the tokens in the input sentence. This is the easiest way to test the app and serves as a proxy for the Java/ES side while this is in development.

### Input
LibTorch requires an istream supporting seek and tell to load the model; this cannot be done over a named pipe, so the entire model is first read into a char array. That memory is freed once the model is loaded. The first 4 bytes sent over the restore stream are the size of the model, followed by the model definition. The inference input is a number of integers encoded in network byte order; the output is a JSON document.
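To make the framing concrete, here is a minimal sketch of the receiving side under the assumptions above (a 4-byte network-order size header followed by the serialized model); the helper name is hypothetical, not the actual implementation:

```cpp
// Sketch of the restore-stream handling described above: buffer the
// size-prefixed model in memory, then give LibTorch a seekable stream.
// Assumes the 4-byte size header is in network byte order, matching
// the integer encoding used for the inference inputs.
#include <torch/script.h>

#include <arpa/inet.h> // ntohl
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

torch::jit::script::Module loadModelFromPipe(std::istream& restoreStream) {
    // First 4 bytes: the size of the model definition.
    std::uint32_t rawSize{0};
    restoreStream.read(reinterpret_cast<char*>(&rawSize), sizeof(rawSize));
    std::uint32_t modelSize{ntohl(rawSize)};

    // A named pipe cannot seek, so read the whole model into memory.
    std::string buffer(modelSize, '\0');
    restoreStream.read(&buffer[0], modelSize);

    // std::istringstream supports seek and tell, which torch::jit::load
    // needs; the buffer is freed when the stream goes out of scope.
    std::istringstream seekable{std::move(buffer)};
    return torch::jit::load(seekable);
}
```

The token integers arriving on the input pipe could then be decoded with the same `ntohl` conversion.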
### LibTorch is a dependency

LibTorch has to be built externally and the headers added in order to build this app. `setup_pytorch.md` provides instructions for doing so using Anaconda. The `3rd_party.sh` script will copy the dependencies to the install location.

### Building
I have not added this application to `bin/Makefile`, so it will not break builds where the library does not exist. If you want to build it, follow the instructions in `setup_pytorch.md`, then run `3rd_party/3rd_party.sh --add` and `cd bin/pytorch_inference && make`.

There is no ARM build of PyTorch. UPDATE: ARM support is in beta with the 1.7 release.