Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POC: Add telemetry #4452

Closed
wants to merge 4 commits into from
Closed

POC: Add telemetry #4452

wants to merge 4 commits into from

Conversation

yannickcr
Copy link
Contributor

@yannickcr yannickcr commented Jul 21, 2020

Summary

This PR is a POC for the telemetry project.

You can read the related RFC here: https://github.com/algolia/instantsearch-rfcs/blob/telemetry/accepted/telemetry.md

And see the telemetry payload in the request done by the e-commerce demo: https://deploy-preview-4452--instantsearchjs.netlify.app/examples/e-commerce/

@codesandbox-ci
Copy link

codesandbox-ci bot commented Jul 21, 2020

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

Latest deployment of this branch, based on commit 522df84:

Sandbox Source
InstantSearch.js Configuration

Co-authored-by: Haroen Viaene <hello@haroen.me>
Copy link
Contributor

@Haroenv Haroenv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some very interesting ideas!

return makeHits({ escapeHTML, transformItems });
return {
...makeHits({ escapeHTML, transformItems }),
$$params: widgetOptions,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead we should forward $$params to the widget itself from the makeWidget call:

return makeHits({ escapeHTML, transformItems, $$params: widgetOptions });

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's also a possibility, I will update the implementation to do this.

src/widgets/index/index.ts Show resolved Hide resolved
Comment on lines +45 to +65
val = +val || 0;

if (val > 0xfffffff || val < 0) throw new Error('Unsupported');

// realloc
let length = this.length || 16;
while (length < this.pos + 4) length *= 2;
if (length !== this.length) {
const buf = new Uint8Array(length);
buf.set(this.buf);
this.buf = buf;
this.length = length;
}

this.buf[this.pos++] = (val & 0x7f) | (val > 0x7f ? 0x80 : 0);
if (val <= 0x7f) return;
this.buf[this.pos++] = ((val >>>= 7) & 0x7f) | (val > 0x7f ? 0x80 : 0);
if (val <= 0x7f) return;
this.buf[this.pos++] = ((val >>>= 7) & 0x7f) | (val > 0x7f ? 0x80 : 0);
if (val <= 0x7f) return;
this.buf[this.pos++] = (val >>> 7) & 0x7f;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this code mean? It's quite dense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, do not focus too much on it. It is a stripped down version of pbf (a protobuf implementation) with only the methods that are useful to us to keep it as light as possible (since the original implementation is not tree-shakable).

src/widgets/index/index.ts Outdated Show resolved Hide resolved
@@ -25,6 +25,82 @@ import {
mergeSearchParameters,
} from '../../lib/utils';

import { WidgetType, WidgetParams, Schema } from '../../../schema';

function Pbf() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that we didn't need to use a dependency here, but I'd probably prefer it in its own file, and with tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, but for now in this POC I just made it "works" 😉, then is still tons of improvement in quality/structure/testing to do.

schema.proto Show resolved Hide resolved
@Haroenv
Copy link
Contributor

Haroenv commented Jul 21, 2020

URLs have a limit of 500± characters in some cases, and we will still want to avoid going over that, however the ecom header is:

https://latency-dsn.algolia.net/1/indexes/*/queries?x-algolia-agent=Algolia%20for%20JavaScript%20(4.1.0)%3B%20Browser%20(lite)%3B%20instantsearch.js%20(4.7.0)%3B%20JS%20Helper%20(3.1.1)&x-algolia-api-key=6be0576ff61c053d5f9a3225e2a90f76&x-algolia-application-id=latency&x-algolia-telemetry=qh8IwCUMyiUC6AeqHxLAJRrKJQz%2FB4cIiAiJCPkH8QeqHwrAJQnKJQT9B%2FEHqh8NwCUEyiUE8Qf5B9AlAaofDMAlBcolBooIiwiMCKofDMAlHsolBv8H%2BQfxB6ofCsAlC8olBPEHgwiqHxLAJRDKJQyNCIQIjgiPCPkH8QeqHw7AJRjKJQj%2FB5AIkQjxB6ofCsAlCsolBPEH%2BQeqHwzAJRnKJQb%2FB%2FkH8QeqHwrAJR3KJQTxB%2FkHqh8MwCUbyiUG8QfyB%2FkHqh8MwCUcyiUG8Qf%2FB4MI

(which is ±600 characters)

Is there some way this can be smaller?

I also notice that there's URL encoded parts in there (%2F, a.k.a /), is that expected?


if (telemetryHeader !== newTelemetryHeader) {
telemetryHeader = newTelemetryHeader;
instantSearchInstance.client.transporter.queryParameters[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is only v4 and our client compatible, we will want to send that as a query parameter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is very hacky here:

  • It should be in headers, not in queryParameters
  • instantSearchInstance.client.transporter.queryParameters is typed at read-only, so to do it properly we'll need to have a proper method to update it

I was not planning to be compatible with the client v3 (since we'll need to do some update to the client to properly set the header). Do you thing this is something we should do?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unknown request parameters of the client will go to either headers or query parameters (v4):

search([], { someParameter: "value", headers: { someHeader: "value })

v3 doesn't have any options for that, but at least it should ignore the second parameter

@Haroenv
Copy link
Contributor

Haroenv commented Jul 21, 2020

What is the process of decoding this string? I know you should be able to do decodeURIComponent(header).split('/'), but I don't know what to learn out of those strings :)

Co-authored-by: Haroen Viaene <hello@haroen.me>
@yannickcr
Copy link
Contributor Author

yannickcr commented Jul 22, 2020

URLs have a limit of 500± characters in some cases, and we will still want to avoid going over that, however the ecom header is:

https://latency-dsn.algolia.net/1/indexes/*/queries?x-algolia-agent=Algolia%20for%20JavaScript%20(4.1.0)%3B%20Browser%20(lite)%3B%20instantsearch.js%20(4.7.0)%3B%20JS%20Helper%20(3.1.1)&x-algolia-api-key=6be0576ff61c053d5f9a3225e2a90f76&x-algolia-application-id=latency&x-algolia-telemetry=qh8IwCUMyiUC6AeqHxLAJRrKJQz%2FB4cIiAiJCPkH8QeqHwrAJQnKJQT9B%2FEHqh8NwCUEyiUE8Qf5B9AlAaofDMAlBcolBooIiwiMCKofDMAlHsolBv8H%2BQfxB6ofCsAlC8olBPEHgwiqHxLAJRDKJQyNCIQIjgiPCPkH8QeqHw7AJRjKJQj%2FB5AIkQjxB6ofCsAlCsolBPEH%2BQeqHwzAJRnKJQb%2FB%2FkH8QeqHwrAJR3KJQTxB%2FkHqh8MwCUbyiUG8QfyB%2FkHqh8MwCUcyiUG8Qf%2FB4MI

(which is ±600 characters)

For what I know in browsers URLs max length is 2048 characters, is there a limitation in the API I'm not aware about ?

The plan is to put it in a HTTP header (which have a limit around 8K), but in this POC I put it in the query parameters since I had some issues sending an unknown header to the API, I need to dig in it. Sending a custom HTTP header forces us to send a preflight request before the actual search request, so this is not a viable solution.

Also I planned to have a limit so we do not send a payload if it is too big.

Is there some way this can be smaller?

I used protobuf to specifically get the smallest possible payload. To get an even smaller one I see 2 solutions:

  • Find a better format than protobuf
  • Send less information

I also notice that there's URL encoded parts in there (%2F, a.k.a /), is that expected?

Yes, we're sending a base64 encoded string and it can contains some /, it should not be an issue.

What is the process of decoding this string? I know you should be able to do decodeURIComponent(header).split('/'), but I don't know what to learn out of those strings :)

To decode the string you'll need 2 things:

  • A protobuf implementation (there are some in multiple language)
  • The schema (schema.proto) to know how to decode it.

Here's how to do it in JS:

import Pbf from 'pbf';
import { WidgetType, WidgetParams, Schema } from '../schema';

const telemetryHeader =
  'qh8IwCUMyiUC6AeqHxLAJRrKJQzxB%2F8HhwiICIkI%2BQeqHwrAJQnKJQTxB%2F0Hqh8NwCUEyiUE8Qf5B9AlAaofDMAlBcolBooIiwiMCKofDMAlHsolBvEH%2Fwf5B6ofCsAlC8olBPEHgwiqHxLAJRDKJQzxB40IhAiOCI8I%2BQeqHw7AJRjKJQjxB%2F8HkAiRCKofCsAlCsolBPEH%2BQeqHwzAJRnKJQbxB%2F8H%2BQeqHwrAJR3KJQTxB%2FkHqh8MwCUbyiUG8QfyB%2FkHqh8MwCUcyiUG8Qf%2FB4MI';

// Decode the header and convert it to an Uint8Array
const arrayBuffer = new Uint8Array(
  Buffer.from(decodeURIComponent(telemetryHeader), 'base64')
);

// Parse the buffer using the schema
const pbf = new Pbf(arrayBuffer);
const output = Schema.read(pbf);

// Map the IDs to the matching strings
const mappedOutput = output.widgets.map(widget => ({
  type: Object.keys(WidgetType).find(
    type => WidgetType[type].value === widget.type
  ),
  params: widget.params.map(paramId =>
    Object.keys(WidgetParams).find(type => WidgetParams[type].value === paramId)
  ),
  useConnector: widget.useConnector,
}));

console.log(JSON.stringify(mappedOutput, null, 2));

And you'll get the following output:

Decoded payload
[
  {
    "type": "ais.index",
    "params": [
      "indexName"
    ],
    "useConnector": false
  },
  {
    "type": "ais.refinementList",
    "params": [
      "container",
      "attribute",
      "searchable",
      "searchablePlaceholder",
      "searchableShowReset",
      "templates"
    ],
    "useConnector": false
  },
  {
    "type": "ais.hierarchicalMenu",
    "params": [
      "container",
      "attributes"
    ],
    "useConnector": false
  },
  {
    "type": "ais.clearRefinements",
    "params": [
      "container",
      "templates"
    ],
    "useConnector": true
  },
  {
    "type": "ais.configure",
    "params": [
      "attributesToSnippet",
      "snippetEllipsisText",
      "removeWordsIfNoResults"
    ],
    "useConnector": false
  },
  {
    "type": "ais.toggleRefinement",
    "params": [
      "container",
      "attribute",
      "templates"
    ],
    "useConnector": false
  },
  {
    "type": "ais.hitsPerPage",
    "params": [
      "container",
      "items"
    ],
    "useConnector": false
  },
  {
    "type": "ais.pagination",
    "params": [
      "container",
      "scrollTo",
      "padding",
      "showFirst",
      "showLast",
      "templates"
    ],
    "useConnector": false
  },
  {
    "type": "ais.rangeSlider",
    "params": [
      "container",
      "attribute",
      "pips",
      "tooltips"
    ],
    "useConnector": false
  },
  {
    "type": "ais.hits",
    "params": [
      "container",
      "templates"
    ],
    "useConnector": false
  },
  {
    "type": "ais.ratingMenu",
    "params": [
      "container",
      "attribute",
      "templates"
    ],
    "useConnector": false
  },
  {
    "type": "ais.stats",
    "params": [
      "container",
      "templates"
    ],
    "useConnector": false
  },
  {
    "type": "ais.searchBox",
    "params": [
      "container",
      "placeholder",
      "templates"
    ],
    "useConnector": false
  },
  {
    "type": "ais.sortBy",
    "params": [
      "container",
      "attribute",
      "items"
    ],
    "useConnector": false
  }
]

@Haroenv
Copy link
Contributor

Haroenv commented Jul 22, 2020

HTTP header

We shouldn't use a header, since that unfortunately changes the network request from a "simple request" (which doesn't need to comply with CORS), to a regular request, and thus needs a synchronous OPTIONS request (which makes the initial search slower)

pips: 1040,
tooltips: 1041,
});
var WidgetParams = exports.WidgetParams = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would there be a way to make this esm instead of commonjs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly pbf only output a commonjs module (issue: mapbox/pbf#97)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you change these lines it should work; since they have an issue, they will probably accept the PR

https://github.com/mapbox/pbf/blob/master/compile.js#L94-L97

@yannickcr
Copy link
Contributor Author

We shouldn't use a header, since that unfortunately changes the network request from a "simple request" (which doesn't need to comply with CORS), to a regular request, and thus needs a synchronous OPTIONS request (which makes the initial search slower)

Oh. You're right. We cannot use headers here then. So we need to keep using the query parameters.

@Haroenv Haroenv closed this Jan 19, 2021
@Haroenv Haroenv deleted the poc/telemetry branch January 19, 2021 12:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants