Change resource table to be based on mime type #1698

Merged
merged 10 commits into from
Apr 21, 2024

Conversation

Contributor

@tw4l tw4l commented Apr 18, 2024

Follow-up to #1689

Instead of using resource types, use the first half of the MIME type (the part before the slash), with a few manual substitutions so that favicons, JavaScript, HTML, and stylesheets land in more useful categories than, e.g., just text.
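
For illustration, a minimal sketch of that mapping in TypeScript. The names `MIME_OVERRIDES` and `categoryFromMime`, and the specific substitution entries, are assumptions made up for this example and are not taken from the PR's diff.

```typescript
// Hypothetical substitution table: the entries are illustrative guesses,
// not the actual list used in this PR.
const MIME_OVERRIDES = new Map<string, string>([
  ["text/html", "html"],
  ["text/css", "stylesheet"],
  ["text/javascript", "javascript"],
  ["application/javascript", "javascript"],
  ["image/x-icon", "favicon"],
  ["image/vnd.microsoft.icon", "favicon"],
]);

// Hypothetical helper: map a MIME type to a coarse table category.
function categoryFromMime(mime: string): string {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const essence = (mime.split(";")[0] ?? "").trim().toLowerCase();
  if (!essence) return "other";

  // Manual substitutions take precedence over the generic fallback.
  const override = MIME_OVERRIDES.get(essence);
  if (override !== undefined) return override;

  // Fallback: the part before the slash, e.g. "image" for "image/png".
  return essence.split("/")[0] || "other";
}
```

With this sketch, `categoryFromMime("image/png")` yields `"image"`, while `categoryFromMime("text/html; charset=utf-8")` yields `"html"` rather than `"text"`.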

[Screenshot: Screen Shot 2024-04-18 at 5 12 55 PM]

@tw4l tw4l requested review from SuaYoo and ikreymer April 18, 2024 21:14
@tw4l tw4l marked this pull request as draft April 18, 2024 21:15
Contributor Author

tw4l commented Apr 18, 2024

Converted to draft because of an issue: if a category exists for replay but not for crawl, the code throws an uncaught TypeError.

@tw4l tw4l marked this pull request as ready for review April 18, 2024 22:01
Contributor Author

tw4l commented Apr 18, 2024

TypeErrors fixed; this is now ready for review.

Contributor Author

tw4l commented Apr 18, 2024

Switched to iterating through qaResources, because qaResources frequently has keys that are missing from crawlResources, whereas I haven't seen the reverse.
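
A rough sketch of what that iteration might look like, with a default for categories missing on the crawl side so that no property is read from `undefined`. The `Counts` shape, the `good`/`bad` field names, and `buildRows` are hypothetical; the sketch also assumes `qaResources` holds the replay-side counts.

```typescript
// Hypothetical shape of per-category counts; field names are assumptions.
type Counts = { good: number; bad: number };
type ResourceCounts = Record<string, Counts>;

const EMPTY_COUNTS: Counts = { good: 0, bad: 0 };

// Hypothetical row builder for the resource table.
function buildRows(crawlResources: ResourceCounts, qaResources: ResourceCounts) {
  // Iterate over the QA keys, since they may include categories
  // that the crawl side lacks.
  return Object.keys(qaResources).map((category) => {
    const qa = qaResources[category] ?? EMPTY_COUNTS;
    // A missing crawl entry falls back to zeros instead of throwing.
    const crawl = crawlResources[category] ?? EMPTY_COUNTS;
    return { category, crawl, qa };
  });
}
```

One limitation of this approach: a category present only in `crawlResources` would be dropped, which matches the observation above that this case has not come up.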

@ikreymer
Member

Added a few more improvements, incorporating the resourceType as follows:

  • if resourceType is script, stylesheet, image, font, use that definitively, instead of mime
  • if resourceType is fetch/xhr, also check for extension, specifically for images, which might have text/html mime, especially for 404s
  • also check for extension for json
  • also check for pdf by type + extension
  • categorize 'ping' as 'other'
  • continue to use first part of mime as fallback
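
The precedence above could be sketched roughly as follows. This is not the merged implementation: `categorize`, `extensionOf`, the extension list, the order of the checks, and the reading of "type + extension" as "MIME type or extension" are all assumptions made for the example.

```typescript
// Resource types that are trusted over the MIME type.
const TRUSTED_TYPES = new Set(["script", "stylesheet", "image", "font"]);
// Illustrative list of image extensions (an assumption).
const IMAGE_EXTENSIONS = new Set(["png", "jpg", "jpeg", "gif", "webp", "svg", "ico"]);

// Hypothetical helper: lowercase file extension of a URL's last path segment.
function extensionOf(url: string): string {
  // Strip query string and fragment, then the scheme and host.
  const noQuery = url.split(/[?#]/)[0] ?? "";
  const path = noQuery.replace(/^[a-z][a-z0-9+.-]*:\/\/[^/]*/i, "");
  const lastSegment = path.split("/").pop() ?? "";
  const dot = lastSegment.lastIndexOf(".");
  return dot >= 0 ? lastSegment.slice(dot + 1).toLowerCase() : "";
}

// Hypothetical categorizer combining resourceType, extension, and MIME type.
function categorize(resourceType: string, mime: string, url: string): string {
  const type = resourceType.toLowerCase();
  const ext = extensionOf(url);
  const essence = (mime.split(";")[0] ?? "").trim().toLowerCase();

  // Definitive resource types win over the MIME type.
  if (TRUSTED_TYPES.has(type)) return type;
  // Pings are grouped under "other".
  if (type === "ping") return "other";
  // fetch/xhr images may carry a text/html MIME type (e.g. on a 404),
  // so the extension is checked as well.
  if ((type === "fetch" || type === "xhr") && IMAGE_EXTENSIONS.has(ext)) return "image";
  // JSON and PDF are detected by MIME type or extension (assumed reading).
  if (essence === "application/json" || ext === "json") return "json";
  if (essence === "application/pdf" || ext === "pdf") return "pdf";
  // Fallback: first part of the MIME type.
  return essence.split("/")[0] || "other";
}
```

For example, a `fetch` request for `https://example.com/missing.png?v=2` that came back with `text/html` would be counted as `image` under this sketch, not `text`.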

@ikreymer ikreymer merged commit d31bdd2 into main Apr 21, 2024
2 checks passed
@ikreymer ikreymer deleted the issue-1689-resource-table-followup branch April 21, 2024 02:02

2 participants