EFK with transform for tabular data #616

ryandawsonuk · 2019-06-04T17:24:11Z

Alternative to #610

In this version the engine can transform tabular data into key-value pairs for searching, which makes queries such as value-range searches possible.

It checks whether there's a 'data.names' array before transforming, but more checks are needed to make sure this won't break other types of requests (e.g. what can we assume about the data value type and array shape?). Performance also needs to be considered.
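
To illustrate the idea, here is a minimal Python sketch of the transform and of a value-range query against the result. It is not the engine code in this PR: it assumes a row-major payload with `data.names` and `data.ndarray`, and the function names `rows_to_key_values` and `range_query`, the example field names and the values are all hypothetical.

```python
from typing import Any, Dict, List


def rows_to_key_values(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn a row-major tabular payload into one {name: value} dict per row.

    Returns an empty list when the payload does not look tabular, so other
    request types are left alone.
    """
    data = payload.get("data")
    if not isinstance(data, dict):
        return []
    names = data.get("names")
    rows = data.get("ndarray")  # assumption: row-major list of lists
    if not isinstance(names, list) or not names or not isinstance(rows, list):
        return []
    records = []
    for row in rows:
        # Skip rows whose shape does not match the column names.
        if not isinstance(row, list) or len(row) != len(names):
            continue
        records.append(dict(zip(names, row)))
    return records


def range_query(field: str, low: float, high: float) -> Dict[str, Any]:
    """Build an Elasticsearch range query body for one transformed field."""
    return {"query": {"range": {field: {"gte": low, "lte": high}}}}


# Hypothetical request with two rows and two named columns.
request = {"data": {"names": ["age", "income"], "ndarray": [[34, 52000.0], [51, 87000.0]]}}
records = rows_to_key_values(request)
query = range_query("age", 30, 40)
```

Returning an empty list for anything that does not match the expected shape is one way to keep non-tabular requests from being affected; the real field path to query would depend on how the transformed document is indexed.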

ryandawsonuk · 2019-06-05T08:21:51Z

For images it may be better to store the image on disk or in a bucket and keep a searchable reference to it. I'd like to investigate computing metadata about the image before storing it, so that you could search by metadata (e.g. presence of certain colours) - see https://stackoverflow.com/questions/30440224/possible-to-store-images-in-elasticsearch

We should discuss whether to do this in the engine, in a wrapper library, or even start with just an example of how it can be done in end-user code.
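
As a rough illustration of the metadata idea, the sketch below builds a small searchable document for an image: a reference to where the image is stored plus the names of colours that make up a noticeable share of its pixels. Everything here is an assumption for illustration: the palette, the `image_metadata` function, the `min_share` threshold and the example reference path are hypothetical, and pixels are taken as plain RGB tuples rather than decoded from a real image file.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Pixel = Tuple[int, int, int]

# Hypothetical reference palette; a real one would be chosen per use case.
PALETTE: Dict[str, Pixel] = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
}


def nearest_colour(pixel: Pixel) -> str:
    """Name of the palette colour closest to the pixel (squared RGB distance)."""
    return min(
        PALETTE,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(pixel, PALETTE[name])),
    )


def image_metadata(pixels: Iterable[Pixel], image_ref: str, min_share: float) -> Dict[str, object]:
    """Build a small document: where the image lives and which colours it contains."""
    counts = Counter(nearest_colour(p) for p in pixels)
    total = sum(counts.values())
    colours = sorted(
        name for name, n in counts.items() if total and n / total >= min_share
    )
    return {"image_ref": image_ref, "colours": colours}


# Ten hypothetical pixels: mostly red, some blue, one black.
pixels = [(250, 10, 10)] * 6 + [(5, 5, 240)] * 3 + [(0, 0, 0)]
doc = image_metadata(pixels, "bucket/path/image-001.png", min_share=0.2)
```

Only the small document would be indexed for searching; the image itself stays on disk or in the bucket.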

ryandawsonuk · 2019-06-05T20:20:04Z

Next steps:

use a thread to make the logging asynchronous
consider more examples, e.g. a config option for column-primary data, data.names as a comma-separated env var, a size cap on transform/enrichment, percentage sampling
revisit whether to split out batches or provide an option to do so
look at the binary, tensor and str cases
look at how this is parameterised: which env vars, what defaults and how to apply them (think example helm charts)
if there is a single engine for the graph, is it OK that we won’t know which component is involved? Will we know anyway? Do we need to? What about MABs?
run on Cloud
look at custom metadata
document how it works (mention fluent/fluent-bit#737, "[filter_kubernetes] enhancement: provide mechanism to exclude containers from fluent bit via annotations")
consider KF-serving, where the engine is not present: could it use the same logic in Python as part of its base image(s)? Or should we publish a Python library?
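
A few of the items above (async via a thread, a size cap, percentage sampling) could be sketched together as below. This is a minimal Python illustration, not the engine implementation: the class name `AsyncRequestLogger`, its parameters and the queue size are hypothetical, and the sink is any callable that accepts a string.

```python
import queue
import random
import threading
from typing import Callable, Optional


class AsyncRequestLogger:
    """Hand log messages to a background thread so callers are not blocked."""

    def __init__(
        self,
        sink: Callable[[str], None],
        sample_rate: float = 1.0,   # fraction of messages to keep, 0.0 to 1.0
        max_chars: int = 10_000,    # size cap: longer messages are dropped
        rng: Optional[random.Random] = None,
    ) -> None:
        self._sink = sink
        self._sample_rate = sample_rate
        self._max_chars = max_chars
        self._rng = rng or random.Random()
        self._queue: "queue.Queue[Optional[str]]" = queue.Queue(maxsize=1000)
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def log(self, message: str) -> bool:
        """Queue a message without blocking; return True if it was queued."""
        if len(message) > self._max_chars:
            return False
        if self._rng.random() >= self._sample_rate:
            return False
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            return False  # drop rather than slow down the request path

    def _drain(self) -> None:
        while True:
            message = self._queue.get()
            if message is None:  # sentinel sent by close()
                break
            self._sink(message)

    def close(self) -> None:
        """Flush queued messages and stop the worker thread."""
        self._queue.put(None)
        self._worker.join()


received = []
logger = AsyncRequestLogger(sink=received.append, sample_rate=1.0, max_chars=20)
queued_small = logger.log('{"puid": "abc"}')
queued_large = logger.log("x" * 50)  # over the size cap, so it is dropped
logger.close()
```

Dropping messages when the queue is full or the payload is too large trades completeness of the log for predictable request latency; whether that trade is acceptable is one of the open questions.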

ryandawsonuk · 2019-06-06T09:31:04Z

New idea - look at performing the enrichment in a separate batch job that updates the entries in Elasticsearch, or in an Elasticsearch transform script. Preferably we'd do something Python-based to simplify working with the arrays. In that case the transformation/enrichment part in the engine would be taken out of this PR.
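
One possible shape for such a batch job, sketched in Python under several assumptions: the logged document keeps the payload under a `request.data` field, the enriched rows are written to a field called `elements`, and the index name in the example is made up. The function only builds the newline-delimited body for the Elasticsearch bulk API from search hits; sending it is left out.

```python
import json
from typing import Any, Dict, Iterable


def bulk_update_body(hits: Iterable[Dict[str, Any]]) -> str:
    """Build an NDJSON bulk body that adds key-value rows to each logged entry.

    Each hit is expected to carry `_index`, `_id` and `_source`, as in an
    Elasticsearch search response. Hits without tabular data are skipped.
    """
    lines = []
    for hit in hits:
        # Assumption: the logged payload sits under request.data.
        data = hit["_source"].get("request", {}).get("data", {})
        names, rows = data.get("names"), data.get("ndarray")
        if not names or not rows:
            continue
        elements = [dict(zip(names, row)) for row in rows if len(row) == len(names)]
        # Action line followed by a partial document, as the bulk API expects.
        lines.append(json.dumps({"update": {"_index": hit["_index"], "_id": hit["_id"]}}))
        lines.append(json.dumps({"doc": {"elements": elements}}))
    # A bulk body must end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical hits: one tabular entry and one string entry that is skipped.
hits = [
    {
        "_index": "example-logs",
        "_id": "1",
        "_source": {"request": {"data": {"names": ["a", "b"], "ndarray": [[1, 2]]}}},
    },
    {"_index": "example-logs", "_id": "2", "_source": {"request": {"strData": "hi"}}},
]
body = bulk_update_body(hits)
```

Keeping the enrichment outside the engine in this way means the request path only has to log the raw payload.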

ryandawsonuk · 2019-06-08T14:59:18Z

Related to SeldonIO/seldon-operator#13 but not a dependency

ryandawsonuk · 2019-06-18T17:35:58Z

Depends on SeldonIO/seldon-operator#17

* add UnloadEnvoyRequested state
* transition to UnloadEnvoyRequested
* add UnloadEnvoyRequested in model stats
* add UnloadEnvoyRequested in UnloadingOrUnloaded
* make removeRouteForServerInEnvoy not sync
* update func name to reflect how envoy is called
* modelUpdate changes to proceed with batched rm
* add TODO for pipeline envoy updates
* remove unused field in ModelVersion
* move field to atomic.bool
* fix condition check
* add tests
* fix unload test
* lint
* fix lint issues
* add extra test
* docs update
* add new replicastate in protos
* update generated protos
* add extra test in agent-server sync

ryandawsonuk added 8 commits May 31, 2019 14:10

initial EFK setup

54aeda2

index pattern has underscore in different place and kibana env var name changed vs katacoda

6674e9c

initial engine request/response logging

2159b9c

use newer elasticsearch

d960481

try logging as plain json

c763b2e

show filtering

840f086

now have puid in request and response

a9c51fb

add option to transform data

9069b5b

seldondev added the size/L label Jun 4, 2019

ryandawsonuk mentioned this pull request Jun 4, 2019

WIP: initial EFK setup #610

Closed

now not dependent on double - still assumes rowmajor

8d9e97d

ryandawsonuk changed the title ~~WIP: 545 EFK with in-engine transform for tabular data~~ WIP: 545 EFK with transform for tabular data Jun 8, 2019

only log pods labeled fluentd true

43083a9

seldondev added size/XL and removed size/L labels Jun 8, 2019

note for next steps

67d5587

ryandawsonuk mentioned this pull request Jun 8, 2019

add fluentd label by default SeldonIO/seldon-operator#13

Merged

ryandawsonuk added 2 commits June 10, 2019 10:28

point temporarily to images in personal repo

dd1b991

accessing elastic backend externally

5a68f4f

seldondev added size/L and removed size/XL labels Jun 10, 2019

take out nodeport elastic pending issue on upstream chart

7d433b2

seldondev added size/XL and removed size/L labels Jun 10, 2019

ryandawsonuk added 2 commits June 10, 2019 12:16

start adding labels options into helm charts

33e96d3

setting labels from helm charts

8749541

ryandawsonuk added 7 commits June 17, 2019 15:36

use python impl for logger

312ce45

single-node elastic

33e39c7

format dict as json

8a4ffae

do need to wait on knative install

cc9f5d6

reduce istio resources and note on kafka and cloud

04bdf73

extra clarificatory notes

68cb6ce

note on custom fields

cde596a

ryandawsonuk mentioned this pull request Jun 18, 2019

how to add custom meta fields for tracking #632

Closed

ryandawsonuk added 5 commits June 18, 2019 12:53

fix bug in logger and note on how to debug

eeeb6bd

notes on limitations of logger

6b5d8fb

turn off request logging by default and add option to helm charts

48269eb

Merge branch 'master' into 545-efk-transform

62a1f09

merge in master and fix labels on mab chart

9de5d09

seldondev added size/XXL and removed size/XL labels Jun 18, 2019

add option to set engine env vars from mab chart

8787767

ryandawsonuk changed the title ~~WIP: 545 EFK with transform for tabular data~~ EFK with transform for tabular data Jun 18, 2019

comments in helm charts to explain env vars

f3754d6

seldondev added size/XL and removed size/XXL labels Jun 18, 2019

ryandawsonuk added 2 commits June 18, 2019 16:05

minor cleanup of unit test

83e6861

clean up request logger naming and check names not empty

8c242b7

full setup script

1e92c03

ryandawsonuk mentioned this pull request Jun 19, 2019

How can I modify the logging format of REST API for StackDriver and BigQuery? #557

Closed

ryandawsonuk added 3 commits June 19, 2019 16:54

option to set engine env vars from annotations

55d3eb2

expand annotations list

fa63f98

further small update to docs

ff03b20

ryandawsonuk merged commit aacb697 into SeldonIO:master Jun 19, 2019
