fetches code online by default #69

Open
jonassmedegaard opened this issue May 31, 2021 · 11 comments

@jonassmedegaard

bootiso by default fetches SYSLINUX online if the major version of the local installation differs from the one used in an image.

I find it a nice feature that bootiso can fetch an alternate bootloader, but I find it problematic that this is done by default.

Please consider flipping around the logic to operate offline by default.

jonassmedegaard added the bug label May 31, 2021
@jsamr
Owner

jsamr commented May 31, 2021

@jonassmedegaard The rationale is that a major version mismatch carries a significant risk of producing an unbootable device (I remember trying that scenario with some popular recovery ISOs). So I guess the handling of this scenario is very opinionated... What would be the rationale in support of the behavior you are advocating?

@jonassmedegaard
Author

jonassmedegaard commented May 31, 2021

The rationale would be that reading the man page description, "create a bootable USB drive from an image file", gives the impression that this is a tool moving I/O from an image to a device.
To some, fetching executable code online and injecting that into the target device is no big deal, but to others it really is.
Like, what validation was made of the authenticity of that online code (was the PGP signature for the code itself validated against PGP signatures of the authors of the code? Was the TLS certificate of the web hosting provider distributing the code validated against certificate revocation lists and/or Google/Mozilla CA certificate blacklists?).
Or, was the user warned ahead that now their image flashing activity will be revealed to at least their own ISP and the ISP of the code blob distributor and the code blob distributor itself, but potentially also to the NSA via backbone routing providers?
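
To make the authenticity question concrete, here is a minimal sketch of one possible integrity check. It is an assumption for illustration, not something bootiso is known to do: the function name `verify_sha256` is hypothetical, and comparing against a pinned SHA-256 digest is a weaker substitute for the PGP signature validation asked about above (it confirms the bytes match a known value, not who authored them).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of bootiso.
# Returns 0 only if the SHA-256 digest of file $1 equals the pinned digest $2.
verify_sha256() {
  local file=$1 expected=$2 actual
  # sha256sum prints "<digest>  <filename>"; keep only the digest.
  actual=$(sha256sum -- "$file" | cut -d' ' -f1)
  # An empty digest means the file could not be read: treat as failure.
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

A caller would refuse to install the downloaded bootloader unless `verify_sha256 "$download" "$pinnedDigest"` succeeds, where both variables are placeholders for values the tool would have to ship or obtain out of band.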

@jsamr
Owner

jsamr commented May 31, 2021

@jonassmedegaard I understand that security weighs heavily in the balance. I guess we could prompt the user to choose between the two options, unless they provide a flag explicitly preferring one option or the other. Would that address your concerns?

jsamr added the discussion and good first issue labels May 31, 2021
@jonassmedegaard
Author

If you mean by default do nothing, only prompt, then I disagree.

I suggest to flip the logic from...

  • always flash, possibly with inferior code (and then emit a warning but continue without prompting) if network fails

...to...

  • always flash, possibly with inferior code (and then emit a warning but continue without prompting) if user has not explicitly permitted to fetch code online
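
The flipped logic could be sketched roughly as follows. This is only an illustration under stated assumptions: `major_of` and `choose_bootloader_source` are hypothetical names, not functions from bootiso, and the real script tracks versions and user flags differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; names are illustrative, not bootiso's actual code.

# Print the major component of a version string such as "6.04".
major_of() {
  printf '%s\n' "${1%%.*}"
}

# Decide where the bootloader should come from.
#   $1: local SYSLINUX version
#   $2: SYSLINUX version found in the image
#   $3: "true" only if the user explicitly permitted an online fetch
# Prints "local" or "remote" on stdout; warnings go to stderr.
choose_bootloader_source() {
  local localVer=$1 imageVer=$2 allowRemote=${3:-false}
  if [ "$(major_of "$localVer")" = "$(major_of "$imageVer")" ]; then
    echo local
  elif [ "$allowRemote" = true ]; then
    echo remote
  else
    # Offline by default: warn, then continue without prompting.
    echo "warning: image uses SYSLINUX $imageVer, local is $localVer; using local version." >&2
    echo local
  fi
}
```

For example, `choose_bootloader_source 6.04 4.07` prints `local` plus a warning, while passing `true` as the third argument selects `remote`.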

@jonassmedegaard
Author

...but yes, it would address my concerns. It would just raise a different issue: not a concern, but an annoyance about suboptimal user experience :-)

@jonassmedegaard
Author

jonassmedegaard commented May 31, 2021

I am thinking of something like this (untested!):

--- a/bootiso
+++ b/bootiso
@@ -1886,7 +1886,7 @@
   function _checkSyslinuxVersion() {
     local -i _versionsMatch=0
     if [ "$enableForceLocalBootloader" == true ]; then
-      term_echogood "Enforced local SYSLINUX bootloader at version '$_localSyslinuxVersion'."
+      term_echogood "Enforced local SYSLINUX bootloader at version '$_localSyslinuxVersion'.  Use $(term_boldify '--local-bootloader=false') to permit trying to download and execute a version from kernel.org."
       return 1
     fi
     if [ -z "$st_targetSyslinuxVersion" ]; then
@@ -2337,7 +2337,7 @@
   sourceHashFile=${st_userVars['hash-file']:-''}
   targetDevice=${st_userVars[device]:-''}
   targetPartitionLabel=${st_userVars[label]:-''}
-  targetBootloaderVersion=${st_userVars['remote-bootloader']:-'auto'}
+  targetBootloaderVersion=${st_userVars['remote-bootloader']:-''}
   targetDDBusSize=${st_userVars['dd-bs']:-'4M'}
   targetDataPartFstype=${st_userVars['data-part-fs']:-'vfat'}
   assumeImageIs=${st_userVars['assume-image-is']:-''}
@@ -2542,8 +2542,8 @@
       term_echoinfo "However, SYSLINUX version (${st_isoInspections[syslinuxVer]}) in the image file doesn't match the minor part of local version ($_localSyslinuxVersion), which should not cause any problems."
     else
       term_echowarn "SYSLINUX version (${st_isoInspections[syslinuxVer]}) in the image file doesn't match the major part of local version ($_localSyslinuxVersion)." \
-        "$scriptName will try to download and execute this version from kernel.org, unless given the modifier $(term_boldify '--local-bootloader')." \
-        "If that fails, it will attempt installation with the local version of SYSLINUX."
+        "$scriptName will attempt installation with the local version of SYSLINUX regardless." \
+        "If that fails, you may try to have it download and execute the exact version from kernel.org, by providing the modifier $(term_boldify '--local-bootloader=false')."
     fi
   fi
 }

@jsamr
Owner

jsamr commented May 31, 2021

@jonassmedegaard That seems all right, I don't have strong feelings about either solution; would you be willing to submit a PR?

@jonassmedegaard
Author

Sorry, I won't provide a PR: GitHub has problematic terms of service, with a term arguably equivalent to auto-granting a permissive license for all published code (i.e. defeating copyleft licensing like GPL).

What I can provide is a publicly accessible patch like the one quoted above - which (if you don't find time to do a better job at that than me) I will do as part of my Debian packaging.
I'd prefer that you do it, however, since you know the codebase better.

@jsamr
Owner

jsamr commented May 31, 2021

@jonassmedegaard I guess you're referring to this. But anyway, I can craft the patch (and yes, I know git apply); I'll also need to update the man page and completions. Can't commit to a schedule, but I'll address this ASAP, along with other issues you've raised.

@jonassmedegaard
Author

I am referring to the same issue that the FSF covers as well, but I recommend this alternate view on the matter.

@jsamr
Owner

jsamr commented Jun 29, 2021

@jonassmedegaard It looks like the licensing issues aren't theoretical anymore with the new AI-powered GitHub Copilot, see https://news.ycombinator.com/item?id=27676266

In this Hacker News thread, a developer on the team was very evasive on the question of whether the GitHub AI could train on GPL-licensed work and generate suggestions derived from copyleft-licensed code, thus seemingly infringing those licenses while passing under the radar. Indeed, an AI is a black box and there is no way to track down the myriad original sources that contributed to a given suggestion. That is a morally challenging situation, and I guess I have to at least consider migrating this repository to a different hosting platform.

Looks like there will be interesting legal fights ahead!

EDIT

Quotes from Copilot website

What data has GitHub Copilot been trained on?
GitHub Copilot is powered by OpenAI Codex, a new AI system created by OpenAI. It has been trained on a selection of English language and source code from publicly available sources, including code in public repositories on GitHub.

Why was GitHub Copilot trained on data from publicly available sources?
Training machine learning models on publicly available data is considered fair use across the machine learning community. The models gain insight and accuracy from the public collective intelligence. But this is a new space, and we are keen to engage in a discussion with developers on these topics and lead the industry in setting appropriate standards for training AI models.

Yeah, it might be "considered fair use across the machine learning community", but I'm not sure there has been careful consideration of licensed data... Moreover, their poorly defined "publicly available data" could even, theoretically, include data from any publicly available git repository, irrespective of the hosting service.
