The browser caching issue seems to have gotten worse #8915

Closed
4 of 9 tasks
kbeeveer46 opened this issue Apr 24, 2023 · 24 comments
Labels
area:spfx Category: SharePoint Framework (not extensions related) area:sw Service Worker related issues

Comments

@kbeeveer46

kbeeveer46 commented Apr 24, 2023

What type of issue is this?

Question

What SharePoint development model, framework, SDK or API is this about?

💥 SharePoint Framework

Target SharePoint environment

SharePoint Online

What browser(s) / client(s) have you tested

  • 💥 Internet Explorer
  • 💥 Microsoft Edge
  • 💥 Google Chrome
  • 💥 FireFox
  • 💥 Safari
  • mobile (iOS/iPadOS)
  • mobile (Android)
  • not applicable
  • other (enter in the "Additional environment details" area below)

Additional environment details

  • Latest version of all browsers
  • 1.17.1
  • 16.16.0

Issue description

I was under the impression that once this ticket was completed, a lot of the caching issues would be solved or at least improve, but they have only gotten worse for me. In fact, I barely had caching issues before, but my co-workers did, so I was always looking out for a solution.

SPFX webpart serving old files versions in Chrome #8803

Ever since that ticket was completed (maybe it's just a coincidence) I'm getting caching issues every time I push out a new version of our package to our QA SharePoint site. On top of the caching issues, I'm getting this generic error which I never got before. It goes away after 1 or 2 page refreshes. The only thing that has changed on my end around the same time is I renamed our solution from "xxx-client-side-solution" to just "xxx" but I didn't change the ID/guid.

Screenshot 2023-04-23 at 11 23 29 PM

I tried increasing the version number of the solution (which I never had to do before) but that didn't help. What's even worse is that I'll refresh the page and it'll be on the latest version, refresh it again and it'll go back to the old version, refresh it again and it's back to the new version, refresh it again and it might show the error above. It literally flips back and forth when refreshing the page several times (I'm not sure how it's even possible for it to go back to an old version after it has shown the latest version in the browser). I don't feel like that ever happened until a few weeks ago around the time of that ticket.

Is there something obvious I might be overlooking when packaging the solution for deployment, other than making sure I'm running the clean command, using the --ship flags, and increasing the version number? Was there something I was supposed to add to my solution as a result of that ticket? I've been building this solution for 2 years without any caching issues (or ever getting that error above) up until about 2-3 weeks ago.

@ghost

ghost commented Apr 24, 2023

Thank you for reporting this issue. We will be triaging your incoming issue as soon as possible.

@ghost ghost added the Needs: Triage 🔍 Awaiting categorization and initial review. label Apr 24, 2023
@AJIXuMuK
Collaborator

@kbeeveer46 - could you please check the Network tab and send information about headers (similarly to what has been sent in the issue you're referencing).
Also, please check which scripts are failing to load.

@AJIXuMuK AJIXuMuK added area:spfx Category: SharePoint Framework (not extensions related) Needs: Author Feedback Awaiting response from the original poster of the issue. Marked as stale if no activity for 7 days. and removed Needs: Triage 🔍 Awaiting categorization and initial review. labels Apr 24, 2023
@kbeverforden

kbeverforden commented Apr 24, 2023

@AJIXuMuK I just did a quick test and increased the version number from 4.1.1.0 to 4.1.2.0 and uploaded it to our test site. I was not able to replicate the issue at first. I went through ~10 different pages all with a single web part on them and they showed the latest version # on them. Then I tried 5 minutes later and was able to replicate the issue in both Edge and Firefox on 2 different pages/web parts and they both show different errors. I pasted the screen shots below.

Let me know if you need more details about these 2 web parts. I looked in the dist folder and the guids don't match the guids on the end of these .js files so I'm assuming it's trying to use an old version and the mime type and cors errors are "fallback" errors and red herrings.

Edge:
Edge

Firefox:
Firefox

@AJIXuMuK AJIXuMuK added area:sw Service Worker related issues and removed Needs: Author Feedback Awaiting response from the original poster of the issue. Marked as stale if no activity for 7 days. labels Apr 24, 2023
@kbeverforden

@AJIXuMuK Now that it has been an hour I am not able to replicate the error or see any caching issues on several browsers on a Windows and MacOS machine. It appears it's just happening within a certain amount of time after uploading the new version. We will just have to move our releases to later at night. I just wanted to bring it to your attention because I had never experienced this in the 2 years I've been developing this SPFx solution. There may have been random caching issues here and there but never this error and never on every single new version upload.

@AJIXuMuK
Collaborator

@kbeeveer46 - it might be related to how the CDN propagation works. Maybe there is an interval when the assets are being updated.
We will still check if there is something we can do here.
But good to hear the issue is fixed for you.

@demyren

demyren commented Apr 24, 2023

@kbeverforden If/when it happens again, can you find your file in one of the "Site-[id]" cache storages? I would like to see what got returned from server and cached by the service worker on the initial requests after you deployed a new version. Thank you!
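For anyone unfamiliar with Cache Storage, the lookup requested above can also be scripted instead of done by hand in the browser's developer tools. The following is a minimal TypeScript sketch, not an official SPFx tool. It assumes the service worker caches are named with a `Site-` prefix (following the "Site-[id]" wording above); `listSiteCacheEntries` and the `...Like` interfaces are hypothetical names introduced here, and the only thing relied on is the standard Cache API shape (`keys()` and `open()`).

```typescript
// Minimal structural types so the sketch also runs outside a browser.
// The real `caches` global (CacheStorage) satisfies CacheStorageLike.
interface RequestLike { url: string; }
interface CacheLike { keys(): Promise<ReadonlyArray<RequestLike>>; }
interface CacheStorageLike {
  keys(): Promise<string[]>;
  open(cacheName: string): Promise<CacheLike>;
}

interface CachedEntry { cacheName: string; url: string; }

// Lists cached request URLs containing `fileFragment` in every cache whose
// name starts with "site-" (case-insensitive; the prefix is an assumption).
async function listSiteCacheEntries(
  storage: CacheStorageLike,
  fileFragment: string
): Promise<CachedEntry[]> {
  const found: CachedEntry[] = [];
  for (const cacheName of await storage.keys()) {
    if (cacheName.toLowerCase().indexOf("site-") !== 0) {
      continue; // Not one of the per-site caches.
    }
    const cache = await storage.open(cacheName);
    for (const request of await cache.keys()) {
      if (request.url.indexOf(fileFragment) !== -1) {
        found.push({ cacheName, url: request.url });
      }
    }
  }
  return found;
}
```

Once compiled to JavaScript, this could be run on the affected page by passing the browser's `caches` object and part of the web part's file name as the two arguments.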

@kbeverforden

@demyren I've never been in that caches area so I don't know what the important parts are to screen shot. There were multiple site-id's but only one had the web part that failed. I filtered the list down to that web part and ordered by cache time. The one that is highlighted blue is the one in my dist folder (are those 4 old versions below it supposed to be there?). I also took a screen shot of the console (this time it only had 1 error and doesn't tell me which version of the web part it tried to load). Let me know what else you need screen shots of.

As someone looking at this for the first time, the only thing that stood out to me was the "0" for content length but it's showing the contents of the highlighted web part in the bottom preview and there's content to it.

Screenshot 2023-04-24 at 7 02 51 PM

Screenshot 2023-04-24 at 7 04 45 PM

@kbeverforden

kbeverforden commented Apr 25, 2023

@demyren I don't know if it matters, but we're using a site collection catalog on a site called "ClientPortalQA" which is where we upload the package to for our QA team to test and then it gets uploaded to our tenant app catalog at the end of our sprints. So the same package is on multiple app catalogs at any given time. I think one of the entries in that list above that doesn't have "ClientPortalQA" in it is from the tenant but I'm not sure.

After looking at those 2 screen shots closer, the guid in the error in the dev console that starts with "02f..." matches the guid in the web part I selected in blue. The preview at the bottom has the same guid. EDIT: Nvm, all of them that were cached throughout the day have that same guid because it's the guid of the web part. I was trying to find a link between the dev console error and an entry in the site-id.

Here is the contents of the Headers tab in case it helps

Headers

@kbeverforden

kbeverforden commented Apr 25, 2023

@demyren Sorry for another comment. I just don't want anyone wasting time spinning their wheels based off something I possibly did on my end, so I want to provide any insight I can think of. The screen shot of the dev console error where it has 1.0.0 on the end of the file name reminded me that we never increase the version number in package.json. We just never knew to, and everything always worked fine the past 2 years, so we kept it that way. Could that be causing any issues since 1.0.0 doesn't match 4.1.3.0 in package-solution.json?

@kbeverforden

kbeverforden commented Apr 25, 2023

I replicated the issue several times again last night and I think there are actually 2 different results that I'm seeing when the issue occurs. I don't have a deep enough knowledge of SPFx to interpret these results and tell what's going on.

Result 1: The pictures I posted yesterday when the dev console only has 1 web part error in it ("could not load xxx-web-part in require"). When you look in Cache Storage, the web part appears to be there.

Result 2: The dev console shows multiple web part errors (the error from result 1 along with 404 and MIME type errors). When you look in Cache Storage, the web part is not in any of the site-id's. Here is an example of this result which I didn't post yesterday. You'll see in the bottom part of the picture the web part can't be found in either site-id.

PT Reports Example

@demyren

demyren commented Apr 25, 2023

@kbeverforden Thank you so much for all the input here! I will investigate further at my earliest convenience. One more thing that could be interesting is seeing what the CDN server actually returns when this happens - is the 404 error, or the unexpected application/json content type, possibly being returned by the CDN server? An overview of the network activity from the Network tab could help us learn more.

Then you could also try reloading with CTRL+F5 to bypass the service worker and see if that resolves the issue - though your "Result 2" above may indicate that it is not a service worker-related problem since even when the file is not served by the service worker but rather the network, the error is produced.

@kbeverforden

kbeverforden commented Apr 28, 2023

@demyren I haven't gotten the error as much the past few days but still see the caching issue quite often. A few notes on the caching issue that I was able to replicate again this morning...

  • CTRL+F5 doesn't pull the latest version no matter how many times I use it. In fact, the latest version seems to appear more frequently if I do a normal refresh when CTRL+F5 doesn't work.
  • As I mentioned in my initial post in this thread, refreshing the page (not using CTRL+F5) will flip between versions. I show the package-solution version number in the footer of our site and as I refresh the page every 2 seconds, it will flip between the old and latest version multiple times.
  • We have a single solution with 10+ web parts. We use the quick launch menu at the top of the page to navigate between pages. Every page has a single, full page web part. It's essentially a SPA React app inside SharePoint Online. When clicking between pages using the quick launch navigation it will show the old version on some pages and the new version on others (even though all the web parts are in the same solution). Clicking the refresh button at the top of the browser will eventually pull the latest version on the pages that were showing the old one.

We've had to start increasing the version number every time we push our code to QA and tell QA, when they are testing, to make sure that version is appearing in the footer. We already had an instance today where 1 page was showing the latest version and another wasn't and QA thought we didn't push the fix out for the second page. It was because the page had loaded the old version. This was never an issue until 3-4 weeks ago.
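The workaround described above (bump the version on every QA push, then check the footer) can be sketched as two small helpers. This is only an illustrative TypeScript sketch under stated assumptions: the four-part version string is taken to be the `solution.version` value from `config/package-solution.json`, and `bumpRevision` and `isExpectedVersion` are hypothetical names, not part of any SPFx tooling.

```typescript
// Returns the version with its fourth segment incremented,
// e.g. "4.1.0.30" becomes "4.1.0.31". Throws on malformed input.
function bumpRevision(version: string): string {
  const parts = version.split(".");
  const allNumeric = parts.every((part) => /^\d+$/.test(part));
  if (parts.length !== 4 || !allNumeric) {
    throw new Error("Expected a version like 4.1.0.30, got: " + version);
  }
  parts[3] = String(Number(parts[3]) + 1);
  return parts.join(".");
}

// True when the version shown on the page matches the one just deployed.
function isExpectedVersion(footerVersion: string, deployedVersion: string): boolean {
  return footerVersion.trim() === deployedVersion.trim();
}
```

A QA checklist could then compare the footer text on each page against the bumped version before testing starts, which would catch the case above where one page silently rendered the previous build.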

@demyren

demyren commented Apr 28, 2023

@kbeverforden I understand this is frustrating. We hope to get to the bottom of this as soon as we can.
When you are bouncing between 2 versions, would you be able to record the .aspx JSON response sent from the server for the 2 requests that are giving inconsistent versions, so that we can determine what the server sent us? The server sends the manifest information for the extensions to the service worker, and the service worker loads the scripts according to this manifest information (or should).
Ideally Fiddler can be used, but an F12 network trace can also work.
Also, it would be great if I could see the response headers for the .js files sent from CDN (you could find this in the Site-[id] cache storage).

Between versions, are you reusing the same URL when deploying a new version, or does the deployed file get a new URL?

BTW, when you are refreshing with CTRL+F5, it would be good to see the content of that .aspx HTML response as well. Is CTRL+F5 consistently giving you the wrong version, and if so, how long does it persist before the correct version is given?

It would be great if you could send me the trace data, to demyren@microsoft.com. Thank you so much for your assistance!

@kbeeveer46
Author

kbeeveer46 commented Apr 28, 2023

@demyren It may take me some time to get you that info. We're running into another issue now. We just released our latest code to the tenant app catalog. It's been 20 minutes since we uploaded it. I went to a SharePoint site that I haven't navigated to in months (to make sure I had not cached anything recently), using a browser and computer that I don't believe I've ever visited the site with, and it threw the "Something is wrong" error. The web part in my dist folder is not in Cache Storage. So, even after 20 minutes, sites are still trying to use the old web part. After a few refreshes it finally loads it. But then when I navigate to another page via the quick launch I get the same error again. After getting the error it may take 1 refresh or 10 refreshes to finally get it to load.

I'm not sure what you mean by new URL? We deploy the sppkg file to our QA site daily for testing which is a single SharePoint site and then at the end of the sprint we upload the package to the tenant which then automatically updates the 250+ sites that are using the web parts in the package.

EDIT: I forgot about another issue. We released a hotfix and uploaded our package to the tenant again minutes after the release to our tenant that I mentioned above. I went to another site that I know I haven't visited on this computer and it's loading the first release, not the hotfix.

EDIT 2: Just went to another site. Got error on initial page load. Refreshed page and got version 4.0.0.0. Refreshed again and got version 4.0.0.1 (latest release hotfix). Refreshed again and got the error. Refreshed again and got the latest version. Refreshed again and got the old version. Whenever I get the error, the GUID for the solution is not listed under publiccdn.sharepointonline.com in the sources tab in the developer console. The GUIDs for the extensions/apps that are installed on the site are but not the solution that contains the web parts.

EDIT 3: 2 hours after uploading the hotfix to our tenant, any random site I pick out of the 250+ still gives me the error the first time I visit it and tries to load the version of the package we uploaded to the tenant 4 weeks ago.

EDIT 4: It's been 9 hours now and all sites appear to be using the latest version. I am able to go to any of the 250+ sites and not get the error on initial load like in my previous edits.

Quick question... Is this all happening because we are using the default SharePoint CDN? If we were using our own Azure CDN would all the caching issues be eliminated?

@kbeeveer46
Author

kbeeveer46 commented Apr 30, 2023

@demyren Sorry for only providing little pieces here and there but it takes quite a bit of work to replicate the issues and document my findings and then fix everything. I just replicated the issue of it showing the latest version and then switching back to an old version after refreshing the page by clicking the page's quick launch link. I tried it in Edge first and couldn't replicate it but then tried Chrome and replicated it within a minute or two.

When the page loads an old version of the web part, the network tab loads multiple JS files for the same web part: one is the latest version and one is an old version. Here is an example. "9d68a" is the latest version, 4.1.0.3, but you can see it's loading the other JS file and the previous version number is appearing in the top right.

With Version

The response of the .aspx page has references to both JS files. The latest JS file is referenced as one of the <link> elements

<link rel="preload" href="https://publiccdn.sharepointonline.com/xxxxxx.sharepoint.com/sites/ClientPortalQA/ClientSideAssets/079636b6-dfb8-4ab3-895e-2900871a7662/ttc-filing-calendar-web-part_9d68af0e4eaf1972bddb.js" as="script" crossorigin>

The other one is referenced in one of the <script> elements further down the page. I pasted the start of the script and where it references the JS file

<script nonce="y1ym3707po">(()=>{const e=()=>{location.href='https://xxxxx.sharepoint.com/sites/ClientPortalQA/SitePages/FilingCalendar.aspx?sw=bypass';};
..................

"entryModuleId":"ttc-filing-calendar-web-part","scriptResources":{"ttc-filing-calendar-web-part":{"type":"path","path":"ttc-filing-calendar-web-part_8c1d10a843f834649c53.js"}

..................

</script>
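The mismatch shown above (one hash in the `<link rel="preload">` tag, another in the manifest's `scriptResources`) can be detected mechanically from a saved .aspx response. Below is a minimal TypeScript sketch; `findBundleHashes` and `hasMixedVersions` are hypothetical helper names, and the assumption that bundle files are always named `<bundleName>_<hash>.js` is taken only from the two file names pasted in this comment.

```typescript
// Collects the distinct hashes found in file names of the form
// "<bundleName>_<hash>.js", in order of first appearance in the HTML.
function findBundleHashes(html: string, bundleName: string): string[] {
  // Escape regex metacharacters in the bundle name before embedding it.
  const escapedName = bundleName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(escapedName + "_([0-9a-zA-Z]+)\\.js", "g");
  const hashes: string[] = [];
  let match: RegExpExecArray | null = pattern.exec(html);
  while (match !== null) {
    if (hashes.indexOf(match[1]) === -1) {
      hashes.push(match[1]);
    }
    match = pattern.exec(html);
  }
  return hashes;
}

// A page that references more than one hash for the same bundle is mixing
// versions (for example, a new preload link plus an old manifest entry).
function hasMixedVersions(html: string, bundleName: string): boolean {
  return findBundleHashes(html, bundleName).length > 1;
}
```

Run against the response described above with the bundle name `ttc-filing-calendar-web-part`, this would report both the `9d68af0e4eaf1972bddb` and `8c1d10a843f834649c53` hashes, which is the inconsistency in question.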

Here are the headers of both JS files and of the .aspx "document" and "json" records in the network tab.

JS file "9d68a" which is the latest version and not showing on the page:

9d68

JS file "8c1d1" which is the old version and the one showing on the page:

8c1d1

.aspx "document"

aspx document

.aspx "json"

aspx json

@demyren

demyren commented May 1, 2023

@kbeeveer46, thank you for all the information you have provided! I recognize it can be tedious and time-consuming to collect all the requested data.

This data indicates that it is not a script caching issue on the client that prevents the last version of the script from loading.
Rather, it does seem like the SharePoint responses for the different navigations may return different versions.
You are seeing a case where the <link> tag had the correct URL, but the actual extension manifest further down the page, which dictates the actual extension to load, pointed to the old version.
The <link> tag is intended to speed up loading of the script and is rendered using locally cached information, according to the js that was loaded on the last navigation to the page.
The actual manifest, however, is rendered only when the new response comes in from SharePoint, and it seems that on the second navigation we were served the old manifests, while in a previous navigation we were served the correct, new manifest.

If we compare the .aspx response bodies of 2 SharePoint navigations that ended up loading 2 different versions, we can confirm whether this is the case. In particular, the manifest JSON for your web part is what I believe was served inconsistently: one navigation was correct and contained your latest version, while the other served the older version.

If you're able to collect that information if and when this reproduces again, you can send it to me directly at demyren@microsoft.com. Also, rest assured that we will prioritize further investigation into this issue this week and will get back to you again shortly on next steps. Thank you!

@demyren

demyren commented May 3, 2023

Hello @kbeeveer46, apologies for some delay. We set up an environment on our side that should be similar to yours, in hopes to reproduce the issue.
We created an extension web part and hosted it on publiccdn.sharepointonline.com. I made changes to the web part, rebuilt it and deployed the new sppkg file to SharePoint. I then immediately refreshed the page containing the web part. However, I was not able to reproduce the problem. The new changes were reflected on the first navigation, and multiple subsequent navigations consistently rendered the same version of the web part.

If you'd be able to provide the .aspx responses for 2 SharePoint navigations, one that loaded the expected, newly deployed web part, and one that loaded the unexpected old web part, I could determine if possibly SharePoint may have returned inconsistent results in the 2 navigations. Since your newly deployed web part gets a new URL (with unique hash) we do not believe it can be a client-side caching problem, as the file would not be found in the local cache and would have to go to network. I suspect that somehow, SharePoint may have returned inconsistent results but would like to confirm this next.

Once again, thank you so much for your cooperation and I sincerely hope we can get to the bottom of this.

@kbeeveer46
Author

kbeeveer46 commented May 6, 2023

@demyren Did you create an extension solution or a web part solution? You mentioned both terms, and I just wanted to make it clear we don't have any issues with extensions, only web parts. In our situation, we have many web parts, one on each page (full page web parts). Did you test the scenario of putting a different web part from the same solution on each page and switching between pages using the quick launch? I uploaded our solution 6+ hours ago and at this very moment it still serves me the old version on some pages and the new version on others when switching between pages via the quick launch. Clicking the refresh button instead of using the quick launch eventually serves the latest version after a few refreshes.

At this point, I've gotten it to where it doesn't show the error anymore. I changed so much that I'm not sure what fixed it. It still has some caching issues, but it's not the end of the world if the client sees the old version for a while after release. It was a dealbreaker when it was throwing errors instead of showing the old version. I'm not sure if it's the thing that fixed the errors, but one change I made was to make sure our QA SharePoint site catalog doesn't use the same GUID as the solution does in the tenant catalog. Another thing I tried was uploading the solution to our Azure CDN instead of the default one, and that had no effect. All the same issues appeared when I did that test.

Like I mentioned, we finally have our environment in a stable state so I'm not sure if it's worth it on your end to keep digging into it.

@kbeeveer46
Author

kbeeveer46 commented May 14, 2023

@demyren It appears that the issue has come back. We went all week without seeing the error and today it came back. We haven't made any changes to the way we are uploading the package to our QA site app catalog, and the 4th digit of the version number has increased for every upload. Here are the steps that I just replicated in every browser.

  1. Upload package to site app catalog. We use the PnP PowerShell Add-PnPApp cmdlet. Add-PnPApp -Path ./sharepoint/solution/client-portal.sppkg -Publish -Scope Site -Overwrite -SkipFeatureDeployment
  2. Navigate to root of the site and view home page. Site shows the latest version.
  3. Navigate to another page via the quick launch. Page throws the error.
  4. Refresh the page several times and the error goes away and page shows the latest version.
  5. Navigate to another page via the quick launch. Page may or may not show the latest version. I've even seen it show a version number from 2 versions prior to the latest.
  6. Navigate back to the page from step 3 via the quick launch. Page throws the error again.

That's just one scenario. There doesn't seem to be a common way to replicate it. Sometimes it throws the error on the first page load when viewing the home page right after uploading to the app catalog. Sometimes it might be the 1st or 4th page you navigate to.

EDIT: I just opened a new browser tab and went to the site again after writing this comment to see if giving it some time would help. The footer of the home page showed the latest version (4.1.0.30) but when I navigated to another page via the quick launch the footer is showing 4.1.0.28. The second page is showing the same solution but from 2 versions ago. We've uploaded 2 packages using the -Overwrite flag each time since that version. There is only 1 package in the app catalog so I'm not sure how that's even possible especially after the home page already showed the latest version.

EDIT 2: Just uploaded a new version .31 to the site app catalog and overwrote version .30 using the upload feature in the app catalog instead of using PnP. When it said "Yes" for Current Version Deployed, I went to a completely different computer, closed down Edge completely, opened it, and went to the site, and the home page is showing version .28.

@stkBul

stkBul commented May 16, 2023

I'm a SharePoint developer in a company delivering custom SPFx web parts to hundreds of customers across Europe, and we get almost weekly support tickets with exactly the same issue that @kbeeveer46 has explained so well. We have created two Microsoft support tickets and are in the process of creating a third one, but so far without any results.

The SharePoint modern pages and the Teams client started serving old versions of the web part some months ago this year. The worst-case scenario currently is when SharePoint or Teams tries to serve a version of the web part which no longer exists in the client assets library on the app catalog site, and thus fails with the message "Something went wrong". It happens randomly for different users, on both Edge and Chrome.

The issue can be easily reproduced on a developer tenant with an out-of-the-box SPFx web part.
I will be more than happy to show the issue in Teams call if that will help.

@demyren

demyren commented May 16, 2023

@kbeeveer46 and @stkBul, thank you for reaching out again. I'm truly sorry that you are still seeing these issues when updating your solutions.

@kbeeveer46, in our test we created a web part solution. We only tested full page loads; that is, refreshing a page or using a bookmark or address bar to navigate to a page - and not using the SharePoint quicklaunch. The quicklaunch leverages in-place navigation. In in-place navigation, once a given extension web part version has been loaded (loaded on first page that contains the given web part, following a full page load), it cannot be unloaded in that in-place nav session and other pages that are navigated to using quicklaunch that use the same web part, would (or should) reuse the already loaded version. From what I understand, you also did not change the GUID of the web part in the deployment and so it is technically the same web part, with just a different version.
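To make the in-place navigation behaviour described above concrete, here is a deliberately simplified TypeScript toy model. It is not the SPFx loader and does not reflect its internals; `InPlaceNavSession`, `resolveVersion`, and `fullPageLoad` are hypothetical names, and the model encodes only the rule stated in the comment: the first version loaded for a given web part GUID is reused until the next full page load.

```typescript
// Toy model only: this is NOT the SPFx loader.
class InPlaceNavSession {
  // Maps a web part GUID to the version that was loaded first in this session.
  private loaded: { [webPartId: string]: string } = {};

  // Returns the version that would render when a page's manifest asks for
  // `manifestVersion` of the web part identified by `webPartId`.
  resolveVersion(webPartId: string, manifestVersion: string): string {
    if (!Object.prototype.hasOwnProperty.call(this.loaded, webPartId)) {
      this.loaded[webPartId] = manifestVersion;
    }
    return this.loaded[webPartId];
  }

  // A full page load (refresh, bookmark, address bar) starts a fresh session.
  fullPageLoad(): void {
    this.loaded = {};
  }
}
```

Under this model, a quick launch click after a new deployment keeps rendering the already loaded version, and only a refresh picks up the new one. It would not by itself account for a page flipping back to an older version across full page loads, which is the part of the report that is still unexplained.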

If you do have any Fiddler traces for page navigations that returned the correct version vs. navigations that returned the incorrect version, please send them to me.

We will perform another set of tests and get back to you. @stkBul, thank you for offering a Teams call - if we still cannot repro, indeed this could be a great next step. We intend to resolve the issue at our earliest convenience.

CC @AJIXuMuK

@kbeeveer46
Author

kbeeveer46 commented May 23, 2023

We had a production release tonight at 7 CST in which we uploaded our solution to the tenant app catalog and used the option to deploy to all sites (~300). It's been 2.5 hours and most of the sites don't work and throw the error because they are still loading the previous version. If we make any changes to our API endpoints (like changing the number of parameters they use or the URL of the endpoint) the sites basically all crash and don't work for hours because the endpoints aren't compatible with the code in the previous version. At this point, there's nothing we can do but accept our sites will be down for hours after every release. Is there anything else we can do to guarantee the version in the app catalog will always be shown? I've already tried switching to our Azure CDN instead of the default one and still saw caching issues.

EDIT: No more caching issues as of 5am CST and sites appear to be working fine. Btw, does the number of sites that use a solution/package affect the amount of time it will be cached? Does it take time for the changes to make it to each site, or is it a one-time thing when uploaded to the catalog? Whenever we upload the solution to a single site catalog like our QA or Demo site, the caching issues are never as bad as on our tenant. I just uploaded the code to the Demo site and saw no caching issues at all. Could be a coincidence. Maybe it's a theory to look into?

@AJIXuMuK
Collaborator

AJIXuMuK commented Jun 9, 2023

@kbeeveer46 - caching occurs for 2 reasons: the service worker (discussed above with Dennis) and propagation of the new bits to all WFEs (web front ends).
Usually the second part should take no more than 10 mins.
The much longer delay you experienced was probably a coincidence.

If you experience the issue again, please share the sprequestid response header, the tenant URL, and the timestamp of the network request.
That will allow us to investigate WFE caching issues.
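As a convenience for gathering the three items requested above, here is a small TypeScript sketch that assembles them from one captured network request. `buildWfeReport` and `WfeReport` are hypothetical names introduced for illustration, the headers are assumed to be available as a plain name-to-value object (for example, copied from the browser's network tab), and `contoso` in the usage note is a placeholder tenant.

```typescript
interface WfeReport {
  spRequestId: string | null;
  tenantUrl: string | null;
  timestamp: string;
}

// Builds the requested report from one captured request.
// Header names are matched case-insensitively.
function buildWfeReport(
  requestUrl: string,
  responseHeaders: { [name: string]: string },
  requestTime: Date
): WfeReport {
  let spRequestId: string | null = null;
  for (const name of Object.keys(responseHeaders)) {
    if (name.toLowerCase() === "sprequestid") {
      spRequestId = responseHeaders[name];
    }
  }
  // Scheme plus host, e.g. "https://contoso.sharepoint.com".
  const originMatch = /^https?:\/\/[^\/?#]+/i.exec(requestUrl);
  return {
    spRequestId,
    tenantUrl: originMatch ? originMatch[0] : null,
    timestamp: requestTime.toISOString(), // UTC, ISO 8601
  };
}
```

For example, calling it with a page URL under `https://contoso.sharepoint.com`, the copied response headers, and the time of the request yields an object that can be pasted straight into a new issue.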

I will close this issue for now.

Feel free to open a new one with the information above if you're experiencing the issue again.

Thanks!

@AJIXuMuK AJIXuMuK closed this as completed Jun 9, 2023
@ghost

ghost commented Jun 17, 2023

Issues that have been closed & had no follow-up activity for at least 7 days are automatically locked. Please refer to our wiki for more details, including how to remediate this action if you feel this was done prematurely or in error: Issue List: Our approach to locked issues

@ghost ghost locked as resolved and limited conversation to collaborators Jun 17, 2023