Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Excessive downloads of pci.ids #3483

Closed
gollux opened this issue Jan 31, 2024 · 8 comments · Fixed by #3484
Closed

Excessive downloads of pci.ids #3483

gollux opened this issue Jan 31, 2024 · 8 comments · Fixed by #3484
Labels
bug:unconfirmed Someone works on identifying the issue

Comments

@gollux
Copy link

gollux commented Jan 31, 2024

Describe the bug

I am the maintainer of the PCI ID database and I am getting an excessive number of HTTP requests (more than 30000 per day) for the pci.ids file from clients with User-Agent set to HeroicGameLauncher. This is becoming unsustainable, so I would like to kindly request your cooperation with reducing the load.

At the very minimum, you should download the compressed version of the file and cache it locally.

However, it seems that all you need is identifying the GPU of the machine where your software is running. For this purpose, the PCI ID database can be queried via DNS for a specific ID only (see lib/names.c in the pciutils project).

Add logs

Does not apply.

Steps to reproduce

Does not apply.

Expected behavior

Downloading files responsibly.

Screenshots

No response

Heroic Version

Latest Stable

System Information

Linux.

Additional information

No response

@gollux gollux added the bug:unconfirmed Someone works on identifying the issue label Jan 31, 2024
@CommandMC
Copy link
Collaborator

Hello. First off, I'd like to apologize for the mess here. Our intent was to cache the response, however it seems the expiry time conditions were not checked properly.

However, it seems that all you need is identifying the GPU of the machine where your software is running. For this purpose, the PCI ID database can be queried via DNS for a specific ID only (see lib/names.c in the pciutils project).

I'm not the best at reading C, but I don't think NodeJS, this project's language, offers the same low-level DNS functions? I'll look into this once I have some more time though.
We've discussed some alternatives, such as only re-downloading the file if the user's hardware changes, but that'd require more complex changes.

For now, I've created this PR to mitigate the issue (properly caching the file, as mentioned above, and bumping up our cache timeout to 30 days to hopefully lessen the load further).

@gollux
Copy link
Author

gollux commented Feb 1, 2024

Thanks for the quick reaction! A hot fix of caching issues is very welcome.

Let us think about what to do next. I am no NodeJS expert, but I think it should be possible to query DNS for TXT records using the dns.resolveTxt method.

Another possibility is to bundle a snapshot pci.ids with your releases and download a newer version only if the user explicitly asks for it, or perhaps if you fail to identify the device.

(This is based on my assumption that you are using the pci.ids to identify the GPU. Correct me if I am wrong, please.)

@flavioislima
Copy link
Member

This should be fixed now on the latest release v2.12.1.

Let us know if you still gets a lot of request in the upcoming weeks.

For now I think we can close it.

@flavioislima flavioislima linked a pull request Feb 1, 2024 that will close this issue
4 tasks
@CommandMC
Copy link
Collaborator

CommandMC commented Feb 1, 2024

Let us think about what to do next. I am no NodeJS expert, but I think it should be possible to query DNS for TXT records using the dns.resolveTxt method.

Another possibility is to bundle a snapshot pci.ids with your releases and download a newer version only if the user explicitly asks for it, or perhaps if you fail to identify the device.

Both of those options indeed sound doable. I'll have some more time starting Saturday, I'll experiment around with this then.
I hope the band-aid is enough for now. I reckon a 2.13.0 release is not far off either, that could then include proper handling of this (along with other web requests-related fixes I definitely have to get back on)

(This is based on my assumption that you are using the pci.ids to identify the GPU. Correct me if I am wrong, please.)

Yes, this is correct. We could use lspci to do this on Linux, but I don't think a built-in tool for this exists on Windows (WMI-reported device names are sadly not accurate enough)

@gollux
Copy link
Author

gollux commented Feb 4, 2024

Another thing that could help is to download the compressed version of the file (it's about 3 times smaller).

@CommandMC
Copy link
Collaborator

CommandMC commented Feb 5, 2024

I've tried out resolving device names via DNS now, seems to work!

Can we somehow identify ourselves when doing this? I'm thinking that in the case of any further issues, you'd (1) know that it's our fault again and (2) you might be able to block just us, without impacting other tools using this interface

@gollux
Copy link
Author

gollux commented Feb 5, 2024

I've tried out resolving device names via DNS now, seems to work!

Great!

Can we somehow identify ourselves when doing this? I'm thinking that in the case of any further issues, you'd (1) know that it's our fault again and (2) you might be able to block just us, without impacting other tools using this interface

It would be nice, but I don't see how to attach this information to the DNS queries. I will try to find some trick.

@vadcx
Copy link

vadcx commented Feb 15, 2024

but I don't see how to attach this information to the DNS queries.

@gollux I think the only way is to add per-organization subdomains like NTP does with their vendor pools. https://pages.cs.wisc.edu/~plonka/netgear-sntp/

Personally I feel uneasy about this for the potential of long-term fingerprinting of passive network surveillance. Not only would the queries include fingerprintable PCI device IDs, but include a dedicated per-app subdomain too. Although I find the use of DNS TXT queries for lookups was a creative choice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug:unconfirmed Someone works on identifying the issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants