
[ML] Merge the feature/pytorch-inference branch #1902


Merged
37 commits merged into master from feature/pytorch-inference on Jun 1, 2021

Conversation

@davidkyle (Member) commented May 24, 2021

Merge the feature/pytorch-inference branch into master.

  • Create a new pytorch_inference binary for evaluating PyTorch Models
  • Add the evaluate.py script for quickly evaluating models with pytorch_inference
  • Document LibTorch build instructions in the setup files (multiple PRs)
  • Build LibTorch for each platform and update the docker build images (multiple PRs)

davidkyle and others added 30 commits January 12, 2021 18:11
The application connects to 4 named pipes and accepts a TorchScript model on
the 'restore' pipe. The input is a series of tokens from the BERT vocabulary and
the output is a JSON document containing the results tensor. Logging is written
to the 4th pipe.

The script evaluate.py will start the app, connect the named pipes and run
an example from the examples directory. This is the easiest way to test and
develop the app, serving as a proxy for the Java/ES side until that is complete.

The app will not be built by a call to the top-level Makefile, so the dependencies
do not have to be installed yet.
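The TorchScript model sent down the 'restore' pipe is a standard torch.jit serialized module. A minimal sketch of producing one for experimentation (the toy module, vocabulary size and file name are illustrative, not the actual models used by evaluate.py):

```python
import torch


class ToyClassifier(torch.nn.Module):
    """Stand-in for a real BERT-based model: maps token ids to class scores."""

    def __init__(self, vocab_size: int = 30522, num_classes: int = 2):
        super().__init__()
        self.embed = torch.nn.EmbeddingBag(vocab_size, 16)
        self.linear = torch.nn.Linear(16, num_classes)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: a [batch, seq_len] tensor of BERT vocabulary ids.
        return self.linear(self.embed(token_ids))


# Serialize with TorchScript; a file like this is what pytorch_inference
# reads from the 'restore' pipe.
module = torch.jit.script(ToyClassifier())
module.save("toy_model.pt")
```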
1. Use vanilla Python instead of Anaconda
2. Don't build all the possible libraries, as we
   don't need them (there may be more to do here
   in the future too)
3. Move the instructions into the main build instructions
   now that we are on the road to incorporating this into
   the core product
- Adds build setup instructions for PyTorch 1.7.1 on Windows.
- Some minor tweaks to the build system to account for the
  way Windows imports DLLs using import libraries, and the
  extra dependencies of the PyTorch DLLs on Windows.
- WinSock2 is needed for ntohl on Windows.
- Reinstate the --version functionality to the new program.
Builds and installs Python 3.7.9 and CMake, then clones
the PyTorch repo and builds libtorch.
This change sets extra flags during the PyTorch build to
avoid building unnecessary components, which dramatically
cuts the build time.  It also changes the location of
debug symbols from the libraries themselves to separate
.pdb files.

A couple of tweaks to the 1.7.1 build files are required
to get these options to work.  One of these is already
in the PyTorch master branch code.  The other may still be
incorporated in time for 1.8.0 too.
1. The IO manager needs to open streams with the binary flag,
   otherwise reading stops prematurely when a binary file
   contains an EOF character.
2. Upload the Windows build tools to S3 so CI can download
   them.
3. Adjust the test script so it uses regular files instead
   of named pipes.  Named pipes are completely different on
   Windows and hard to use from Python.  It's easier to make
   a Python script portable if it uses regular files.
Linux uses the Eigen BLAS library and macOS uses the Accelerate framework,
so MKL is not required. Delete the PyTorch repo from the Docker image once
the build is complete.
FBGEMM requires AVX2 instructions and cannot be used on ARM, so Caffe2 is used
instead, which differs from the x64 builds.

The Linux Dockerfiles are changed to use multi-stage builds so the final image does
not contain the intermediate dependencies used to build the actual dependencies.
Now that we have built libraries for all the different platforms,
pytorch_inference can be added to the Makefile without
breaking CI.

If CI fails for this PR it will smoke out any mistakes in
Docker images or dependency bundles.

Also adding to controller as that's a one-liner.
Delete temp files created by evaluate.py
Defines the input and output format for the PyTorch 3rd party model app
and adds a command processor which parses JSON documents from an
input stream, then calls a handler function for each request.
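For illustration, the same parsing idea expressed in Python (the real command processor is C++; this sketch only mirrors the behaviour of pulling consecutive JSON documents off one stream and handing each to a handler):

```python
import json


def process_commands(stream, handler):
    """Read consecutive JSON documents from a text stream and call
    `handler` once for each parsed request."""
    decoder = json.JSONDecoder()
    buffer = stream.read()  # simplification: read the whole stream up front
    pos = 0
    while pos < len(buffer):
        # Skip whitespace separating the documents.
        while pos < len(buffer) and buffer[pos].isspace():
            pos += 1
        if pos == len(buffer):
            break
        request, pos = decoder.raw_decode(buffer, pos)
        handler(request)
```

Here `handler` would run the restored model on the request's token ids and write a result document to the output stream.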
Upgrades PyTorch from version 1.7.1 to version 1.8.0 on macOS.

It is now possible to build a version that works on Apple Silicon.

The build instructions are also adjusted to remove functions that
call an external compiler to build custom extensions. Although
these would never have worked in our programs due to system call
filtering, it's best if they aren't present at all as they could
alarm heuristic virus scanners.

(Similar upgrades will be done for Windows and Linux in the near
future.)
Upgrades PyTorch from version 1.7.1 to version 1.8.0 on Windows.

The build instructions are also adjusted to remove functions that
call an external compiler to build custom extensions. Although
these would never have worked in our programs due to system call
filtering, it's best if they aren't present at all as they could
alarm heuristic virus scanners.

(A similar upgrade will be done for Linux in the near future.)
Renames the pytorch process logging arguments to be
consistent with the other processes and with what Java
provides.
The build instructions are also adjusted to remove functions that
call an external compiler to build custom extensions. Although
these would never have worked in our programs due to system call
filtering, it's best if they aren't present at all as they could
alarm heuristic virus scanners.
The details are for version 1.8.0.

PyTorch will also show up in the public dependency report
once the feature branch is merged to master.
This adds an error handler to the command processor that is called
in the event of a bad inference request. The error is returned on the
output so a client can expect a response for every request sent.
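A sketch of that behaviour, assuming a handler and output stream like the ones in the sketch above (the "error" field name is illustrative, not the actual schema):

```python
import json


def make_safe_handler(handler, output):
    """Wrap `handler` so that every request produces a response,
    even when inference fails."""

    def safe_handler(request):
        try:
            output.write(json.dumps(handler(request)) + "\n")
        except Exception as err:  # e.g. a bad inference request
            # Return the error on the output stream so the client is
            # never left waiting for a missing response.
            output.write(json.dumps({"error": str(err)}) + "\n")

    return safe_handler
```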
davidkyle and others added 6 commits March 23, 2021 11:43
Evaluate a simple linear model for testing; another use case is
added to the repertoire.
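For reference, a simple linear model of this kind can be produced and checked directly in Python; a minimal sketch (dimensions, values and the file name are made up):

```python
import torch

# A trivial linear model: y = Wx + b.
model = torch.nn.Linear(in_features=4, out_features=2)
traced = torch.jit.trace(model, torch.zeros(1, 4))
traced.save("simple_linear.pt")

# Evaluating it in Python gives the expected results that the
# pytorch_inference output can be checked against.
example = torch.tensor([[1.0, 2.0, 3.0, 4.0]])
print(traced(example))
```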
This commit changes the way the pytorch_inference process writes
results so they fit the format Java expects.
Following #1841, the output is now an array of objects rather than NDJSON.
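For illustration, the difference between the two output styles (the field names here are made up, not the actual schema):

```python
import json

results = [
    {"request_id": "a", "inference": [0.1, 0.9]},
    {"request_id": "b", "inference": [0.7, 0.3]},
]

# Old style: NDJSON, one document per line.
ndjson = "\n".join(json.dumps(r) for r in results)

# New style: a single JSON array of result objects.
array_of_objects = json.dumps(results)
```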
@droberts195 (Contributor) left a comment


LGTM if CI passes

@davidkyle davidkyle changed the title [ML] Merge the feature/pytorch_inference branch [ML] Merge the feature/pytorch-inference branch Jun 1, 2021
@davidkyle davidkyle merged commit e7a0342 into master Jun 1, 2021
@droberts195 droberts195 deleted the feature/pytorch-inference branch July 28, 2021 08:39