Automating the removal of unwanted components #1087

Open
Eloston opened this issue Jun 25, 2020 · 2 comments
Assignees
Labels
meta

Comments

@Eloston
Member

@Eloston Eloston commented Jun 25, 2020

NOTE: This idea is a work-in-progress. I have no idea if it's even feasible yet. I'll keep this updated as I progress.


Inspired by the huge patches that disable Safe Browsing (examples: first, second, and especially third), I wonder if it is possible to automatically generate a patch that removes Safe Browsing, as well as any other unwanted components.

This could be implemented as a hypothetical program that works roughly as follows:

  1. Unwanted components can be thought of as a composition of units, in the generic sense of the word. Units include, but are not limited to: GN targets, directories of code and/or resources, and specific files. We take these units as input to our hypothetical program.
  2. All undesired units must be linked to wanted units in a "computable" fashion. For example, C++ files are linked by their headers, GN targets by their dependents, embedded resources by GRIT files. This is pretty intuitive because the compilation process must know how to combine these units together. We can use this fact to see how our unwanted units are being referenced by wanted units.
  3. Once we determine these links, we need to remove them. This is pretty trivial for GN and GRIT files: simply remove the lines containing those GN targets or resources. The more interesting and difficult challenge is removing the dependencies in code. Removing code correctly and succinctly requires an understanding of how the wanted and unwanted units interact. Additionally, there are multiple possible ways to remove the linkage, each with its own pros and cons. This is what makes updating ungoogled-chromium patches difficult, and it can cause challenging bugs (such as those that led me to create #845).
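
To make steps 1 and 2 concrete, here is a minimal sketch of how links from wanted code to unwanted units could be found for one kind of link, C++ `#include` directives. Everything in it is an assumption for illustration: the function names `is_unwanted` and `find_links`, the idea of passing file contents as a dictionary, and the example paths are all hypothetical. It only handles quoted includes and ignores GN and GRIT entirely.

```python
import re
from pathlib import PurePosixPath

# Matches quoted includes such as: #include "components/foo/bar.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"')


def is_unwanted(path, unwanted_units):
    """Return True if path equals an unwanted unit or lies inside one."""
    p = PurePosixPath(path)
    for unit in unwanted_units:
        u = PurePosixPath(unit)
        if p == u or u in p.parents:
            return True
    return False


def find_links(sources, unwanted_units):
    """Report (file, line number, included path) for every wanted file
    that includes a header belonging to an unwanted unit.

    sources maps a source-tree-relative path to that file's text.
    """
    findings = []
    for path, text in sorted(sources.items()):
        if is_unwanted(path, unwanted_units):
            continue  # files inside an unwanted unit are removed wholesale
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = INCLUDE_RE.match(line)
            if match and is_unwanted(match.group(1), unwanted_units):
                findings.append((path, lineno, match.group(1)))
    return findings
```

Given a mapping of paths to file contents and a list of unwanted directories or files, `find_links` returns the include lines in wanted files that would break if the unwanted units were deleted. It says nothing about how the code that uses those headers should be rewritten, which is the hard part described in step 3.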

In reality, it's very likely we cannot implement full automation, either because of the limits of our capabilities and time, or because we run into an unsolved or unsolvable problem. Instead, we could build up the automation in stages:

  1. We start with a pure detector. It should correctly report all files, and all lines within the files, that need to be modified or removed.
  2. Afterwards, we'd want to automate the removals we are certain are correct. The tool would report the lines it doesn't know how to modify.
  3. Once detection is possible, we can build an interactive tool that'll allow the user to modify files that we cannot automatically modify, or that we are not certain we can modify correctly.
  4. Over time, implement more sophisticated logic that can automate difficult patching problems. This is an open-ended problem that may never be fully automated.
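
As a rough illustration of stage 2, the sketch below removes only the case it can be sure about in a GN file, a line that is nothing but a quoted unwanted label in a list, and reports every other mention for a human to handle. The function name `strip_gn_deps` and the label used in the usage note are hypothetical. It works line by line and does not parse GN, so this is a sketch under that assumption and not a real GN editor.

```python
import re


def strip_gn_deps(gn_text, unwanted_labels):
    """Remove list entries that are exactly an unwanted GN label.

    Returns (new_text, unresolved), where unresolved holds
    (line number, line) pairs that mention an unwanted label in some
    other form and are left untouched for manual review.
    """
    kept = []
    unresolved = []
    for lineno, line in enumerate(gn_text.splitlines(), start=1):
        hit = next((label for label in unwanted_labels if label in line), None)
        if hit is None:
            kept.append(line)
        elif re.fullmatch(r'\s*"' + re.escape(hit) + r'",?\s*', line):
            continue  # certain case: a standalone list entry, drop it
        else:
            kept.append(line)  # uncertain case: keep the line and report it
            unresolved.append((lineno, line))
    return "\n".join(kept) + "\n", unresolved
```

For example, with the label `//components/foo:bar`, a line containing only `"//components/foo:bar",` is dropped, while a line such as `if (x) { deps += [ "//components/foo:bar" ] }` is kept and returned in `unresolved` with its original line number. The interactive tool from stage 3 could then walk through that list.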

As an aside, this should, in theory, mitigate any additional work imposed by #845.

For first steps, I'd probably want to implement #845 first as it'd contain some file processing that'd be helpful in establishing the foundation. Afterwards I should have a better idea of how to write the software requirements for this hypothetical program.

@Eloston Eloston added the meta label Jun 25, 2020
@Eloston Eloston self-assigned this Jun 25, 2020
@jstkdng
Member

@jstkdng jstkdng commented Jul 8, 2020

I have an idea but I'm not sure if it would be feasible.
What if, instead of trying to disable Safe Browsing, we make Safe Browsing work for us without the telemetry? What I mean is creating a Google-compatible API server that returns dummy results: an error, an empty response, or a response with dummy data.
That way we could simplify the patches here, which will otherwise only grow in complexity as Chromium grows. Since the server would do almost nothing, anyone could run it on a cheap VPS. We could also host a public server for anyone who doesn't want to run one, and those who really care about privacy could start their own.
The server could be configurable in Chromium itself, and the patches would just replace the URLs Chromium tries to connect to with our own.
Maybe at some point we could also implement synchronization and accounts for Chromium, like Firefox does.
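
A minimal sketch of the "empty response" variant of such a server, using only the Python standard library, might look like the following. The names `StubHandler` and `make_server` are hypothetical, and it is an untested assumption that Chromium's Safe Browsing client would accept an empty 200 reply. A real stub would have to mimic whatever response format the client expects.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubHandler(BaseHTTPRequestHandler):
    """Answer every GET or POST with an empty 200 response and log nothing."""

    def _reply(self):
        # Drain any request body so the connection is left in a clean state.
        length = int(self.headers.get("Content-Length", 0))
        if length:
            self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_GET = _reply
    do_POST = _reply

    def log_message(self, format, *args):
        pass  # deliberately keep no record of incoming requests


def make_server(host="127.0.0.1", port=0):
    """Create the stub server; port 0 lets the OS pick a free port."""
    return HTTPServer((host, port), StubHandler)
```

Calling `make_server(port=8080).serve_forever()` would run it locally on port 8080. The patches would then only need to point the relevant URLs at that address.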

@Eloston
Member Author

@Eloston Eloston commented Jul 8, 2020

> What if instead of trying to disable safe browsing, we make safe browsing work for us without the telemetry.

You make a good point here. Safe Browsing may become so deeply integrated that it is no longer practical to disable it by removing its code. And even though there'd be more dead code, it would still meet our goals for ungoogled-chromium.

But my understanding is that Safe Browsing is more than just a blacklist service with the introduction of Enhanced Safe Browsing. For example, I don't see why anti-virus scans of downloaded files would require Google's web service to scan the file. It's not necessary to remove these kinds of features, but it's nice to do so.

Also, I'm hoping that this tool will be useful for removing all other components we don't need from the browser. We have a number of patches that partially disable components in the browser instead of removing them entirely, and I think that makes ungoogled-chromium more susceptible to subtle bugs.

Essentially, I think this tool is more than just a "Safe Browsing remover", so I'll open a new issue to discuss Safe Browsing.
