Insufficient privacy sanitization #56

Closed
thehans opened this issue Feb 27, 2020 · 3 comments

@thehans

thehans commented Feb 27, 2020

I recently found this project while looking for EDID info, and thought it would be nice to contribute a probe upload about my hardware.

I naively ran the suggested: sudo -E hw-probe -all -upload (using the AppImage) for the first time without checking the output beforehand.
After seeing that a full probe includes 58 logs, I now regret this and have some concerns about its overall privacy.

The README claims:

Private information (including the username, machine's hostname, IP addresses, MAC addresses and serial numbers) is NOT uploaded to the database.

I didn't know that it would default to collecting so much detailed info (often irrelevant to "hardware"), which it seems could be used to uniquely identify a person/computer.
After the upload, I decided to use the -save option and grep for some things to see what else might be there.

  1. hw.info/logs/efibootmgr DOES CONTAIN my MAC address on a line like: .../MAC(xxxxxxxxxxxx,0)... with the x's being my actual MAC in lowercase hex, no colons.
  2. my username is spattered in a few places:
    Due to byobu: hw.info/logs/dev:/dev/shm/byobu-USERNAME-....
    Due to systemctl (in 2 forms): hw.info/logs/systemctl: media-USERNAME-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.mount
    and in path form (with UUID as a bonus):
                         loaded active mounted   /media/USERNAME/XXXXXXXX-REAL-UUID-HERE-XXXXXXXXXXXX
  3. UUIDs of drives/partitions seem to be masked in some cases, but unmasked versions still slip through all over the place, mainly as /dev/disk/by-uuid or /dev/disk/by-partuuid paths.

  4. Running grep -ri serial over the saved folder shows that many lines have the serial number replaced with an ellipsis:

      hw.info/logs/hwinfo:  Serial ID: "..."

    But for some reason, other entries in the same file are not replaced in the same way.
    Then there are more in hw.info/logs/usb-devices:S: SerialNumber=
    and hw.info/logs/smartctl:Serial Number:
    and hw.info/logs/dmidecode: Serial Number:
    and finally RAM sticks as hw.info/devices:mem:MFG-MODELNUM-serial-ACTUAL_SERIAL_NUM

    Thankfully, at least as far as I can tell, these do get stripped by the time they are stored and displayed by the server, but I would have more peace of mind if I didn't see all these being saved.

    I can only assume these show up because they are used for calculation of the ID mentioned here:

    The tool uploads 32-byte prefix of salted SHA512 hash of MAC addresses and serial numbers to properly identify unique computers and hard drives. All the data is uploaded securely via HTTPS.

    And that corresponds to this line hw.info/logs/dmi_id:board_serial: ?

    But if that calculated ID is already written to a file on its own, then 1) does it really need to leave them unmasked in all the constituent files? and 2) is it really using ALL of those to calculate that ID?

  5. I haven't verified (and can't verify) 100% that there are no other uniquely identifying hardware serial numbers actually stored on the server, but I really wouldn't be surprised if more things are inadvertently slipping through than I've found here, based on the sheer volume of data collected by "-all".

  6. Keeping a listing of all installed deb files just seems particularly excessive and irrelevant to hardware.
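
The manual grep pass described above can be approximated with a short script. This is a minimal sketch and not part of hw-probe: the helper names scan_text and scan_probe_dir, the pattern labels, and the regular expressions are all assumptions, chosen only to match the leak shapes listed above (colon-separated MACs, the MAC(xxxxxxxxxxxx,0) form from efibootmgr, and standard 8-4-4-4-12 UUIDs). Serial numbers have no fixed shape, so they are not covered here.

```python
import re
from pathlib import Path

# Illustrative patterns only; these are NOT hw-probe's own sanitization rules.
PATTERNS = {
    # Conventional colon-separated MAC, e.g. aa:bb:cc:dd:ee:ff
    "mac_colon": re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b"),
    # EFI device-path form reported above: MAC(aabbccddeeff,0)
    "mac_efi": re.compile(r"MAC\(([0-9A-Fa-f]{12}),"),
    # Standard 8-4-4-4-12 UUID, as seen under /dev/disk/by-uuid
    "uuid": re.compile(
        r"\b[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-"
        r"[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}\b"
    ),
}


def scan_text(text, username=None):
    """Return (line_number, kind) pairs for every suspicious line in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, kind))
        # A plain substring check; short usernames may give false positives.
        if username and username in line:
            hits.append((lineno, "username"))
    return hits


def scan_probe_dir(root, username=None):
    """Yield (path, line_number, kind) for every regular file under root."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            text = path.read_text(errors="replace")
            for lineno, kind in scan_text(text, username):
                yield (str(path), lineno, kind)
```

Calling scan_probe_dir on a folder written with the -save option (for example the hw.info folder referenced in the paths above), with your own login name as username, lists each file and line that still carries one of these identifiers, so the output can be reviewed before anything is uploaded.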

I know now that I could have / should have limited which logs to upload, but I feel a bit misled by such privacy statements. And I remain skeptical of the usefulness of collecting ALL of this information.

@linuxhw
Owner

linuxhw commented Feb 27, 2020

Hello.

Thanks a lot for the review!

Are you using the latest master version?

Serial numbers and UUIDs are hashed. You see a hash, not a real number, even if it looks like a real number (this is related to the latest version).
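
For readers unfamiliar with the idea, the kind of hashing the README describes ("32-byte prefix of salted SHA512 hash") could look roughly like the sketch below. This is not hw-probe's code: the function name hashed_id, the salt value, the way the salt is combined with the input, the lowercase/strip normalization, and reading "32-byte prefix" as the first 32 raw digest bytes are all assumptions made for illustration.

```python
import hashlib


def hashed_id(value: str, salt: bytes, prefix_bytes: int = 32) -> str:
    """Return a hex string for the first prefix_bytes of SHA-512(salt + value).

    Assumptions (not taken from hw-probe): the salt is simply prepended,
    and the value is stripped and lowercased before hashing.
    """
    normalized = value.strip().lower().encode("utf-8")
    digest = hashlib.sha512(salt + normalized).digest()
    # 32 raw bytes become 64 hexadecimal characters.
    return digest[:prefix_bytes].hex()
```

The result is stable for the same input and salt, so it can still identify a unique computer or drive, but the original MAC address or serial number does not appear in it. Because the output is hexadecimal, it can resemble a real serial number at a glance.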

Also try Snap or Flatpak to limit access of the app to meaningful system resources: https://github.com/linuxhw/hw-probe#snap

@thehans
Author

thehans commented Feb 27, 2020

I used the AppImage version which is linked from README: hw-probe-1.5-149-x86_64.AppImage
Is it not updated as often as snap/flatpak?

@linuxhw
Owner

linuxhw commented Feb 28, 2020

Fixed by c536ebe.

Images will be rebuilt soon.

Should we delete your first probe from the server?

Thanks a lot.
