Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request getting filtered by Replay integration. #9339

Closed
Kobby-Bawuah opened this issue Oct 21, 2023 · 22 comments · Fixed by #9506 or #9623
Closed

Request getting filtered by Replay integration. #9339

Kobby-Bawuah opened this issue Oct 21, 2023 · 22 comments · Fixed by #9506 or #9623
Assignees
Labels
Sync: Jira apply to auto-create a Jira shadow ticket

Comments

@Kobby-Bawuah
Copy link

Kobby-Bawuah commented Oct 21, 2023

Environment

SaaS (https://sentry.io/)

Steps to Reproduce

A user reported that when their settings include the networkDetailAllowUrls integration, the requests to an example domain cease to appear in the Sentry network requests tab during a replay:

When the user has their setting as:

integrations: [ 
 new Replay(), 
 ], 

Then the domain requests show up (just no body).

As soon as the user specifies networkDetailAllowUrls, their example domain requests will stop showing up. So, a config like this makes the domain requests disappear from the Sentry network requests tab in a replay:

integrations: [ 
 new Replay({ 
 networkDetailAllowUrls: ['exampledomain.io'] 
 }), 
 ],

It's important to note that the issue may be related to how the Replay integration handles query parameters in the URL, particularly those involving URLs because it doesn't filter all domains added to the integration.

Please see Jira Sync for more information.

Expected Result

When the Replay integration includes the configuration parameter networkDetailAllowUrls: ['demodomain.io'], it is expected that requests to 'demodomain.io' should be visible in the Sentry network requests tab during a replay. Specifically, the integration should allow the specified URL to be processed without being blocked, ensuring accurate and complete replay of network requests.

Actual Result

Currently, when the networkDetailAllowUrls parameter includes 'demodomain.io', the requests to the specified URL with a URL as a query parameter are blocked. As a result, these requests do not appear in the Sentry network requests tab during a replay, leading to incomplete network data.

This behavior contradicts the expected functionality of the Replay integration, indicating a potential issue or bug in the integration logic.

Product Area

Replays

Link

No response

DSN

No response

Version

No response

┆Issue is synchronized with this Jira Improvement by Unito

@getsantry
Copy link

getsantry bot commented Oct 21, 2023

Routing to @getsentry/product-owners-replays for triage ⏲️

@getsantry
Copy link

getsantry bot commented Oct 21, 2023

Assigning to @getsentry/support for routing ⏲️

@Kobby-Bawuah Kobby-Bawuah added the Sync: Jira apply to auto-create a Jira shadow ticket label Oct 21, 2023
@billyvg billyvg transferred this issue from getsentry/sentry Oct 23, 2023
@billyvg
Copy link
Member

billyvg commented Oct 23, 2023

@mydea another network request issue, there are more details in the zendesk ticket as well

@mydea
Copy link
Member

mydea commented Oct 24, 2023

Hey,

I wasn't able to reproduce this, and I can't really see any more information about this in the linked jira issue - don't have access to the zendesk ticket, not sure if there is more in there.

The networkDetailAllowUrls setting has (or should have, at least) no effect on generally capturing the breadcrumb. To further debug this, I'd need:

  • The actual SDK init code
  • Maybe a link to an example event where this is missing
  • Maybe some more details on what request exactly is failing - e.g. the actual fetch or xhr code they use which is not captured (maybe it is something specific to the actual request, though not sure what that may be)

@dotdev-brendon
Copy link

dotdev-brendon commented Oct 26, 2023

UPDATE: Removing the query string did not actually fix the issue, so that assumption is not accurate. The only other real difference is that this request is a POST/XHR request vs GET/FETCH which are still showing.

+1 I can confirm that the exact issue is when the setting is being used, and the request includes a query string, then the requests are filtered from the network tab in session replays. They do still appear in the error bread crumbs though.

We can enable and disable the rules and see this consistent result/issues 100% of the time. Also our request just included a "?" at the end and no actual query string parameters.

With the setting commented out you can see requests to api.search.reactify.app:
https://industrie.sentry.io/replays/977029eaa9624157a00e401a398ec7a7/?f_n_search=reactify&project=4506114189688832&query=&referrer=%2Freplays%2F&statsPeriod=14d&t_main=network&yAxis=count%28%29

With the setting enabled, api requests are no longer included, note that the config and analytics subdomain requests still show as they don't include a query string:
https://industrie.sentry.io/replays/805c9086865d4902bc3f11d960ed09c1/?f_n_search=reactify&project=4506114189688832&query=&referrer=%2Freplays%2F&statsPeriod=1h&t_main=network&yAxis=count%28%29

Our init code, we have tried with extract URL matches, hostnames and regex and all produce the same result:

window.Sentry.init({
        dsn: "xxxxxx",
        integrations: [
          new window.Sentry.Replay({
            networkDetailAllowUrls: [/^.+\.search\.reactify\.app.*$/],
            networkRequestHeaders: ["x-reactify-shop","x-reactify-mode","x-reactify-client-id","x-reactify-client-version","x-reactify-ga"],
            networkResponseHeaders: ["x-reactify-cluster","x-reactify-shop","x-reactify-cache","x-reactify-cache-ttl","x-reactify-country","x-reactify-timezone"],
          }),
        ],
        replaysSessionSampleRate: 1.0,
        replaysOnErrorSampleRate: 1.0,
      });

Also adding or removing the headers does not change the result.

Request that's missing:
image

curl 'https://api.search.reactify.app/industrie-kids-primary/_msearch' \
  -H 'authority: api.search.reactify.app' \
  -H 'accept: application/json' \
  -H 'accept-language: en-GB,en-US;q=0.9,en;q=0.8,zh-TW;q=0.7,zh;q=0.6' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/x-ndjson' \
  -H 'dnt: 1' \
  -H 'origin: https://www.industriekids.com.au' \
  -H 'pragma: no-cache' \
  -H 'referer: https://www.industriekids.com.au/' \
  -H 'sec-ch-ua: "Chromium";v="118", "Google Chrome";v="118", "Not=A?Brand";v="99"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "macOS"' \
  -H 'sec-fetch-dest: empty' \
  -H 'sec-fetch-mode: cors' \
  -H 'sec-fetch-site: cross-site' \
  -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36' \
  -H 'x-reactify-client-id: theme' \
  -H 'x-reactify-client-version: 5.37.0' \
  -H 'x-reactify-ga: GA1.1.834222806.1697151060' \
  -H 'x-reactify-mode: collection' \
  -H 'x-reactify-shop: industrie-kids.myshopify.com' \
  --data-raw $'{"preference":"page"}\n{"query":{"bool":{"must":[{"bool":{"must":[{"bool":{"must_not":[{"nested":{"path":"curations","query":{"bool":{"must":[{"term":{"curations.collectionHandle.keyword":"online-warehouse-frenzy"}},{"term":{"curations.hidden":true}}]}}}}]}},{"match":{"published":true}},{"bool":{"should":[{"nested":{"path":"collections","query":{"term":{"collections.handle.keyword":"online-warehouse-frenzy"}}}},{"nested":{"path":"curations","query":{"term":{"curations.collectionHandle.keyword":"online-warehouse-frenzy"}}}}]}}]}}]}},"sort":[{"curations.position":{"unmapped_type":"long","order":"asc","nested":{"path":"curations","filter":{"term":{"curations.collectionHandle.keyword":"online-warehouse-frenzy"}}}}},{"_script":{"script":{"source":"(\u0021doc[params.field].empty && (int) Math.round(doc[params.field].value) > params.value) ? 1 : 0","params":{"field":"inventory_total","value":50},"lang":"painless"},"type":"number","order":"desc"}},{"collections.position":{"order":"asc","nested":{"path":"collections","filter":{"term":{"collections.handle.keyword":"online-warehouse-frenzy"}}}}}],"size":28,"_source":{"includes":["storefrontId","presentment_price_ranges","images","tags","variants.*","id","title","type","handle","image","variants.id","variants.presentment_prices","callout"],"excludes":["image"]},"from":0}\n' \
  --compressed

Request that shows:
image

curl 'https://config.search.reactify.app/?shop=industrie-kids.myshopify.com' \
-H 'authority: config.search.reactify.app' \
-H 'accept: */*' \
-H 'accept-language: en-GB,en-US;q=0.9,en;q=0.8,zh-TW;q=0.7,zh;q=0.6' \
-H 'cache-control: no-cache' \
-H 'dnt: 1' \
-H 'origin: https://www.industriekids.com.au' \
-H 'pragma: no-cache' \
-H 'referer: https://www.industriekids.com.au/' \
-H 'sec-ch-ua: "Chromium";v="118", "Google Chrome";v="118", "Not=A?Brand";v="99"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "macOS"' \
-H 'sec-fetch-dest: empty' \
-H 'sec-fetch-mode: cors' \
-H 'sec-fetch-site: cross-site' \
-H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36' \
--compressed

@mydea
Copy link
Member

mydea commented Oct 27, 2023

@dotdev-brendon thanks for all the details!

i tried to replicate this in a small test app like this:

new Sentry.Replay({
  networkDetailAllowUrls: ['localhost'],
});

and

 const xhr = new XMLHttpRequest();

xhr.open('POST', 'http://localhost:7654/foo');
xhr.setRequestHeader('content-type', 'application/x-ndjson');
xhr.send('{"preference":"page"}\n{"query":{"bool":{"must":[{"bool":{"must":[{"bool":{"must_not":[{"nested":{"path":"curations","query":{"bool":{"must":[{"term":{"curations.collectionHandle.keyword":"online-warehouse-frenzy"}},{"term":{"curations.hidden":true}}]}}}}]}},{"match":{"published":true}},{"bool":{"should":[{"nested":{"path":"collections","query":{"term":{"collections.handle.keyword":"online-warehouse-frenzy"}}}},{"nested":{"path":"curations","query":{"term":{"curations.collectionHandle.keyword":"online-warehouse-frenzy"}}}}]}}]}}]}},"sort":[{"curations.position":{"unmapped_type":"long","order":"asc","nested":{"path":"curations","filter":{"term":{"curations.collectionHandle.keyword":"online-warehouse-frenzy"}}}}},{"_script":{"script":{"source":"(\u0021doc[params.field].empty && (int) Math.round(doc[params.field].value) > params.value) ? 1 : 0","params":{"field":"inventory_total","value":50},"lang":"painless"},"type":"number","order":"desc"}},{"collections.position":{"order":"asc","nested":{"path":"collections","filter":{"term":{"collections.handle.keyword":"online-warehouse-frenzy"}}}}}],"size":28,"_source":{"includes":["storefrontId","presentment_price_ranges","images","tags","variants.*","id","title","type","handle","image","variants.id","variants.presentment_prices","callout"],"excludes":["image"]},"from":0}\n');

xhr.addEventListener('readystatechange', function () {
  console.log(...arguments);
});

but I see the request normally 🤔

Could you maybe share how you make the XHR request? Ideally the actual code (e.g. xhr = new XmlHttpRequest...`), and/or which library/tool you use to make the request?

Also could you possibly share the response of the POST request (possibly in a stripped down form)?

@mydea
Copy link
Member

mydea commented Oct 27, 2023

Another thing you could try: enable debug: true in your Sentry.init(), and see if there are any relevant log messages visible e.g. in the replay. Maybe e.g. a [Replay] Failed to capture fetch breadcrumb or something like this would show up somewhere 🤔

@dotdev-brendon
Copy link

dotdev-brendon commented Oct 31, 2023

@mydea The requests are made via the https://github.com/appbaseio/reactivesearch/tree/v3/packages/web package, and it looks like it uses cross-fetch to make the request: https://github.com/appbaseio/appbase-js/blob/e849a3fc672e82201e3b3554ff174bb0e543844b/src/core/request/fetch.js#L2.

Response body can be found here:
https://textbin.net/lpttgqxprt

I've added the debug but can not see anything useful, the requests are there in the error traces but still excluded from session replay: https://industrie.sentry.io/issues/4575323854/events/90166f6c17ac499bb4e89f1d7c176963/?project=4506114189688832

image

@mydea
Copy link
Member

mydea commented Oct 31, 2023

Hmm, the weird thing is that we see we are capturing an xhr request, so that can't come from cross-fetch I believe...? At least when I run cross-fetch manually like this (based on the library you provided):

import fetch from 'cross-fetch';

function fetchRequest(args) {
  return new Promise((resolve, reject) => {
    const parsedArgs = args;
    try {
      const {
        method,
        path,
        params,
        body,
        isRSAPI,
        isSuggestionsAPI,
        isMongoRequest = false,
        httpRequestTimeout = 0, // Default timeout is 0 (no timeout)
      } = parsedArgs;
      const app = isSuggestionsAPI ? '.suggestions' : '.app';
      let bodyCopy = body;
      const contentType =
        path.endsWith('msearch') || path.endsWith('bulk') ? 'application/x-ndjson' : 'application/json';
      const headers = Object.assign(
        {},
        {
          Accept: 'application/json',
          'Content-Type': contentType,
        },
        args.headers,
      );
      const timestamp = Date.now();
      const requestOptions = {
        method,
        headers,
      };
      if (Array.isArray(bodyCopy)) {
        let arrayBody = '';
        bodyCopy.forEach(item => {
          arrayBody += JSON.stringify(item);
          arrayBody += '\n';
        });

        bodyCopy = arrayBody;
      } else {
        bodyCopy = JSON.stringify(bodyCopy) || {};
      }

      if (Object.keys(bodyCopy).length !== 0) {
        requestOptions.body = bodyCopy;
      }

      const handleTransformRequest = res => {
        return Promise.resolve(res);
      };

      let responseHeaders = {};

      const finalURL = `http://localhost:7654/${path}`;

      return handleTransformRequest(
        Object.assign(
          {},
          {
            url: finalURL,
          },
          requestOptions,
        ),
      )
        .then(ts => {
          const transformedRequest = Object.assign({}, ts);
          const { url } = transformedRequest;
          delete transformedRequest.url;

          const controller = new AbortController();
          const { signal } = controller;

          const fetchPromise = fetch(
            url || finalURL,
            Object.assign({}, transformedRequest, {
              // apply timestamp header for RS API
              headers:
                isRSAPI && !isMongoRequest
                  ? Object.assign({}, transformedRequest.headers, {
                      'x-timestamp': new Date().getTime(),
                    })
                  : transformedRequest.headers,
              signal, // Attach the abort signal to the fetch request
            }),
          );

          const timeoutPromise = new Promise((_, rejectTP) => {
            if (httpRequestTimeout > 0) {
              setTimeout(() => {
                rejectTP(new Error('Request timeout'));
                controller.abort();
              }, httpRequestTimeout);
            }
          });

          return Promise.race([fetchPromise, timeoutPromise])
            .then(res => {
              if (res.status >= 500) {
                return reject(res);
              }
              responseHeaders = res.headers;
              return res
                .json()
                .then(data => {
                  if (res.status >= 400) {
                    return reject(res);
                  }
                  if (data && data.error) {
                    return reject(data);
                  }
                  // Handle error from RS API RESPONSE
                  if (isRSAPI && data && Object.prototype.toString.call(data) === '[object Object]') {
                    if (body && body.query && body.query instanceof Array) {
                      let errorResponses = 0;
                      const allResponses = body.query.filter(q => q.execute || q.execute === undefined).length;

                      if (data) {
                        Object.keys(data).forEach(key => {
                          if (
                            data[key] &&
                            Object.prototype.hasOwnProperty.call(data[key], 'error') &&
                            !!data[key].error
                          ) {
                            errorResponses += 1;
                          }
                        });
                      }
                      // reject only when all responses have an error
                      if (errorResponses > 0 && allResponses === errorResponses) {
                        return reject(data);
                      }
                    }
                  }

                  // Handle error from _msearch response
                  if (data && data.responses instanceof Array) {
                    const allResponses = data.responses.length;
                    const errorResponses = data.responses.filter(entry =>
                      Object.prototype.hasOwnProperty.call(entry, 'error'),
                    ).length;
                    // reject only when all responses have an error
                    if (allResponses === errorResponses) {
                      return reject(data);
                    }
                  }
                  const response = Object.assign({}, data, {
                    _timestamp: timestamp,
                    _headers: responseHeaders,
                  });
                  return resolve(response);
                })
                .catch(e => reject(e));
            })
            .catch(e => reject(e));
        })
        .catch(err => reject(err));
    } catch (e) {
      return reject(e);
    }
  });
}

window.fetchRequest = fetchRequest;

and then

fetchRequest({
        path: '_msearch',
        method: 'POST',
        body: '{"preference":"page"}\n{"query":{"bool":{"must":[{"bool":{"must":[{"bool":{"must_not":[{"nested":{"path":"curations","query":{"bool":{"must":[{"term":{"curations.collectionHandle.keyword":"online-warehouse-frenzy"}},{"term":{"curations.hidden":true}}]}}}}]}},{"match":{"published":true}},{"bool":{"should":[{"nested":{"path":"collections","query":{"term":{"collections.handle.keyword":"online-warehouse-frenzy"}}}},{"nested":{"path":"curations","query":{"term":{"curations.collectionHandle.keyword":"online-warehouse-frenzy"}}}}]}}]}}]}},"sort":[{"curations.position":{"unmapped_type":"long","order":"asc","nested":{"path":"curations","filter":{"term":{"curations.collectionHandle.keyword":"online-warehouse-frenzy"}}}}},{"_script":{"script":{"source":"(\u0021doc[params.field].empty && (int) Math.round(doc[params.field].value) > params.value) ? 1 : 0","params":{"field":"inventory_total","value":50},"lang":"painless"},"type":"number","order":"desc"}},{"collections.position":{"order":"asc","nested":{"path":"collections","filter":{"term":{"collections.handle.keyword":"online-warehouse-frenzy"}}}}}],"size":28,"_source":{"includes":["storefrontId","presentment_price_ranges","images","tags","variants.*","id","title","type","handle","image","variants.id","variants.presentment_prices","callout"],"excludes":["image"]},"from":0}\n',
      })

I do see this captured for replay correctly 🤔

Is this only limited to _msearch requests, I wonder? So you'd call this as msearchApi(...)? (not sure how that works, do you call this manually or is this called by some other thing under the hood?).

One thing to verify, maybe to be 1000% sure, can you set networkCaptureBodies: false and see if it works then? So capture it via networkDetailAllowUrls but only capture headers, not the bodies. This would at least help to narrow this down to the bodies themselves.

@dotdev-brendon
Copy link

dotdev-brendon commented Nov 1, 2023

@mydea Unsure to be honest, that's all the deep workings of the package and _msearch is the only calls used.

I've added the networkCaptureBodies: false and that seem's to work 🎉
image

Although we need the body, at least it will help point you in the right direction, there is two warnings, maybe this could be why?
image

I think it's checking the Request Content-Type and expecting it to match the Response Content-Type... rather then matching to the Request Accept header, which is correctly application/json

@mydea
Copy link
Member

mydea commented Nov 2, 2023

Hey, thank you for the continued help in looking into this!

It's good to know that we can narrow this down to the body handling. I still don't really see how this can happen 😬 but I know at least have a better idea where to look into!

@dotdev-brendon
Copy link

Hey, thank you for the continued help in looking into this!

It's good to know that we can narrow this down to the body handling. I still don't really see how this can happen 😬 but I know at least have a better idea where to look into!

@mydea Any updates on this?

@mydea
Copy link
Member

mydea commented Nov 7, 2023

We are currently working on moving the body truncation out, which may help here if this is the underlying problem * (which we haven't found yet, sadly 😬 )

I still haven't managed to reproduce the issue, and we don't have a public reproduction yet (AFAIK), making it tricky to figure out what exactly is breaking there.

@mydea
Copy link
Member

mydea commented Nov 8, 2023

Hey,

we've just released v7.78.0 of the JavaScript SDK, which included some changes to how we process network bodies. Could you by chance try to update to this version and see if this possibly fixes the problem?

@dotdev-brendon
Copy link

Hey,

we've just released v7.78.0 of the JavaScript SDK, which included some changes to how we process network bodies. Could you by chance try to update to this version and see if this possibly fixes the problem?

@mydea That does not seem to fix the issue, no requests when body is enabled... same warning error when body is disabled.

@mydea
Copy link
Member

mydea commented Nov 9, 2023

Thank you for bearing with us. This at least helps to further narrow down where to problem is coming from - I'll do some more investigation this week, and will keep you posted in here!

mydea added a commit that referenced this issue Nov 10, 2023
…9506)

This adds additional safeguards around fetch/xhr body capturing for
replay.
I added additional try-catch in all places that depend on
`networkCaptureBodies`.

This also types the fetch/xhr hints as `Partial` to ensure we guard
against any of the things not actually being defined, to be on the safe
side.

Hopefully fixes
#9339
@mydea mydea reopened this Nov 10, 2023
@mydea
Copy link
Member

mydea commented Nov 14, 2023

Hello, we just published v7.80.1 of the SDK with some further changes to this. We added a bunch of further safeguards to body parsing - could you give it a try? If it still does not work, please try enabling debug: true in your Sentry.init() and check if you see any relevant console logs in your application - we also added some more debugging statements there!

@dotdev-brendon
Copy link

dotdev-brendon commented Nov 15, 2023

Hi @mydea

UPDATED:
The requests are now showing with the request body, but the response body is "undefined"

Looks like we have success... requests and there body data is now coming through successfully on this version.

Appreciate you getting this resolved 🎉

@mydea
Copy link
Member

mydea commented Nov 15, 2023

Hi @mydea

UPDATED: The requests are now showing with the request body, but the response body is "undefined"

Looks like we have success... requests and there body data is now coming through successfully on this version.
Appreciate you getting this resolved 🎉

Ah, that gets us one step closer at least. It seems that somehow parsing the response body is failing, which is why we fall back to undefined in this case. 🤔

Just to be clear, do you see anything console logs if you enable debug: true? At least some of the possible scenarios should show logs in that case, maybe we get one more step closer there 🤔 I also opened a PR to add even more logging, to get to the bottom of this: #9566

mydea added a commit that referenced this issue Nov 16, 2023
We got somewhat closer to figuring out the problem there. Based on this
comment:
#9339 (comment)

> The requests are now showing with the request body, but the response
body is "undefined"

It seems that the problem is with parsing the response body for a XHR
request. I added some more logging there, to be able to debug this
further.
@mydea
Copy link
Member

mydea commented Nov 21, 2023

Hey, we added some further logs in v7.81.0, if somebody that has this problem is able to update to this version and enable debug: true, and look at the console logs on their page, that would be very helpful!

In the meanwhile, I'll investigate further what could be the problem there 🤔

@mydea
Copy link
Member

mydea commented Nov 21, 2023

Hey, so I think I nailed it down. it is because of xhr.responseType = 'json', probably - opened a PR to handle this: #9623

mydea added a commit that referenced this issue Dec 4, 2023
This is stupid, now that I figured it out. Basically, if you set
`xhr.responseType = 'json'`, it will force `xhr.response` to be a POJO -
which we can't parse right now.

We now handle this case specifically. 

This also adds a new `UNPARSEABLE_BODY_TYPE` meta warning if we are not
getting a body because it is not matching any of the known/parsed types.

Closes #9339
@AbhiPrasad
Copy link
Member

Fix released with https://github.com/getsentry/sentry-javascript/releases/tag/7.85.0 - please give it a try! Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Sync: Jira apply to auto-create a Jira shadow ticket
Projects
Archived in project
6 participants