IPFS filtering to allow node operators to decide on content they are willing to serve #8492

Open
3 tasks done
thibmeu opened this issue Oct 6, 2021 · 6 comments
Labels
kind/feature A new feature P1 High: Likely tackled by core team if no one steps up

Comments

@thibmeu
Contributor

thibmeu commented Oct 6, 2021

Checklist

  • My issue is specific & actionable.
  • I am not suggesting a protocol enhancement.
  • I have searched on the issue tracker for my issue.

Description

Cloudflare recently open-sourced a fork of go-ipfs that provides filtering capabilities, grouped under the safemode command. The architecture is described in a dedicated blog post.

The system works by filtering certain CIDs when walking the DAG. This allows node operators to prevent certain CIDs from being provided, both by the HTTP gateway and to the P2P network.
CIDs to be filtered are stored in a blocklist. By default, this blocklist lives in a dedicated mount of the datastore, /safemode.
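
To make the idea of filtering during a DAG walk concrete, here is a minimal sketch. It is not taken from the Cloudflare fork: `Node`, `Blocklist` and `Walk` are hypothetical names, and CIDs are plain strings rather than real `cid.Cid` values.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for a DAG node: a CID plus child links.
type Node struct {
	CID   string
	Links []*Node
}

// Blocklist is a hypothetical in-memory set of blocked CIDs.
type Blocklist map[string]struct{}

// Blocked reports whether cid is on the blocklist.
func (b Blocklist) Blocked(cid string) bool {
	_, ok := b[cid]
	return ok
}

// Walk visits n and its descendants depth-first. A blocked node is skipped
// together with everything reachable only through it.
func Walk(n *Node, bl Blocklist, visit func(cid string)) {
	if n == nil || bl.Blocked(n.CID) {
		return
	}
	visit(n.CID)
	for _, child := range n.Links {
		Walk(child, bl, visit)
	}
}

func main() {
	leaf := &Node{CID: "cid-leaf"}
	bad := &Node{CID: "cid-bad", Links: []*Node{leaf}}
	ok := &Node{CID: "cid-ok"}
	root := &Node{CID: "cid-root", Links: []*Node{bad, ok}}

	bl := Blocklist{"cid-bad": {}}
	Walk(root, bl, func(cid string) { fmt.Println("serving", cid) })
}
```

Note that in this sketch a child of a blocked node is only skipped on that path; if the same child is linked from an unblocked parent, it is still visited.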

The actions that can be performed on a blocklist are (based on the proposed interface):

  • block to add content to the blocklist
  • unblock to remove it
  • purge to remove content from the blockstore. Ideally, this action would be extensible, for instance to also purge a remote datastore or an HTTP cache
  • search to query the blocklist
  • audit to access the log of actions that have been performed against the blocklist
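
The five actions above could be captured in a small Go interface. The sketch below is an illustration only: the method signatures, `AuditEntry`, `MemBlocklist` and the `Blockstore` stand-in are assumptions, not the interface proposed in the fork.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
	"time"
)

// AuditEntry records one action performed against the blocklist.
type AuditEntry struct {
	Time   time.Time
	Action string // "block", "unblock" or "purge"
	CID    string
}

// Blocklist mirrors the five actions listed above (hypothetical signatures).
type Blocklist interface {
	Block(cid string)
	Unblock(cid string)
	Purge(cid string)
	Search(prefix string) []string
	Audit() []AuditEntry
}

// Blockstore is the minimal storage capability Purge needs (hypothetical).
type Blockstore interface{ Delete(cid string) }

// mapStore is a toy blockstore used for the example.
type mapStore map[string][]byte

func (s mapStore) Delete(cid string) { delete(s, cid) }

// MemBlocklist is an in-memory Blocklist with an append-only audit log.
type MemBlocklist struct {
	mu      sync.Mutex
	blocked map[string]struct{}
	log     []AuditEntry
	store   Blockstore
}

var _ Blocklist = (*MemBlocklist)(nil)

func NewMemBlocklist(store Blockstore) *MemBlocklist {
	return &MemBlocklist{blocked: map[string]struct{}{}, store: store}
}

// record appends to the audit log; callers must hold b.mu.
func (b *MemBlocklist) record(action, cid string) {
	b.log = append(b.log, AuditEntry{Time: time.Now(), Action: action, CID: cid})
}

func (b *MemBlocklist) Block(cid string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.blocked[cid] = struct{}{}
	b.record("block", cid)
}

func (b *MemBlocklist) Unblock(cid string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	delete(b.blocked, cid)
	b.record("unblock", cid)
}

// Purge deletes the content from the local blockstore; it does not change
// the blocklist itself.
func (b *MemBlocklist) Purge(cid string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.store.Delete(cid)
	b.record("purge", cid)
}

// Search returns all blocked CIDs starting with prefix, sorted.
func (b *MemBlocklist) Search(prefix string) []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	var out []string
	for cid := range b.blocked {
		if strings.HasPrefix(cid, prefix) {
			out = append(out, cid)
		}
	}
	sort.Strings(out)
	return out
}

// Audit returns a copy of the log so callers cannot mutate it.
func (b *MemBlocklist) Audit() []AuditEntry {
	b.mu.Lock()
	defer b.mu.Unlock()
	return append([]AuditEntry(nil), b.log...)
}

func main() {
	bl := NewMemBlocklist(mapStore{"cid-a": []byte("data")})
	bl.Block("cid-a")
	bl.Purge("cid-a")
	fmt.Println(bl.Search("cid-"), len(bl.Audit()))
}
```

Keeping purge behind a narrow `Blockstore` interface is one way to leave room for the extensibility mentioned above: a remote datastore or HTTP cache would only need its own `Delete`.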

For convenience, the ipfs safemode command provides multiple ways to resolve content. From its documentation:

- IPFS address, i.e. /ipfs/<CID>
- IPNS address, i.e. /ipns/<hash_publickey>
- DNSLink address, i.e. /ipns/example.com
- HTTP URL, i.e. https://example.com/ or https://gateway.example.com/ipfs/<CID>
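
As an illustration of how these four input forms might be told apart before resolution, here is a rough sketch. `Classify` and the `Kind` constants are hypothetical, and the "a dot means a DNSLink name" heuristic is an assumption of this example, not documented safemode behaviour.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Kind names the address form of an input string.
type Kind string

const (
	KindIPFS    Kind = "ipfs"    // /ipfs/<CID>
	KindIPNSKey Kind = "ipns"    // /ipns/<hash_publickey>
	KindDNSLink Kind = "dnslink" // /ipns/example.com
	KindHTTP    Kind = "http"    // https://example.com/...
	KindUnknown Kind = "unknown"
)

// firstSegment returns the path segment right after the given prefix.
func firstSegment(input, prefix string) string {
	rest := strings.TrimPrefix(input, prefix)
	return strings.SplitN(rest, "/", 2)[0]
}

// Classify guesses which of the four address forms input uses.
func Classify(input string) Kind {
	switch {
	case strings.HasPrefix(input, "http://"), strings.HasPrefix(input, "https://"):
		u, err := url.Parse(input)
		if err != nil || u.Host == "" {
			return KindUnknown
		}
		return KindHTTP
	case strings.HasPrefix(input, "/ipfs/"):
		if firstSegment(input, "/ipfs/") == "" {
			return KindUnknown
		}
		return KindIPFS
	case strings.HasPrefix(input, "/ipns/"):
		name := firstSegment(input, "/ipns/")
		switch {
		case name == "":
			return KindUnknown
		case strings.Contains(name, "."):
			// Assumption: encoded public-key hashes contain no dots,
			// so a dotted name is treated as a DNSLink domain.
			return KindDNSLink
		default:
			return KindIPNSKey
		}
	}
	return KindUnknown
}

func main() {
	for _, in := range []string{
		"/ipfs/<CID>",
		"/ipns/<hash_publickey>",
		"/ipns/example.com",
		"https://gateway.example.com/ipfs/<CID>",
	} {
		fmt.Printf("%-42s %s\n", in, Classify(in))
	}
}
```

An HTTP URL would still need a second step (extracting the path or resolving DNSLink for the host) before a CID is known; that step is out of scope for this sketch.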

This is a proposed implementation, which satisfies some of the requirements laid out in ipfs/roadmap#64. It provides a more standardised approach for node operators to filter the content they are willing to provide.

The implementation was developed three years ago, and may not suit the current architecture of the go-ipfs project.

@thibmeu thibmeu added the kind/feature A new feature label Oct 6, 2021
@BigLep
Contributor

BigLep commented Oct 8, 2021

@thibmeu: thanks for bringing this up. I think we need to have a larger discussion about the kind of software Gateway Operators want to have before we keep proceeding with the status quo of go-ipfs serving the wide range of use cases from high-traffic gateways to desktop applications. go-ipfs maintainers are going to link discussions/notes that we're having in 2021Q4 on this topic to #8499. We'll certainly be engaging with Cloudflare as part of this process.

@BigLep BigLep added the status/blocked Unable to be worked further until needs are met label Jan 7, 2022
@BigLep BigLep added P3 Low: Not priority right now and removed status/blocked Unable to be worked further until needs are met labels Jun 3, 2022
@BigLep
Contributor

BigLep commented Jun 3, 2022

2022-06-03 conversation: we have the capability for this in go-bitswap per #8763. If you're interested in contributing a plugin, that would be welcome. Otherwise this isn't a priority for the core maintainers because go-ipfs isn't really designed for large-scale operations, but we'll support operators on any reviews.

@BigLep
Contributor

BigLep commented Jun 3, 2022

@guseggert will link the issue that is actively being worked on right now that will make plugins easier to write/maintain.

@guseggert
Contributor

guseggert commented Jun 3, 2022

The issue is #7653, which allows arbitrary modifications to the go-ipfs dependency graph using a plugin, so that you can inject a custom exchange.Interface (e.g. a Bitswap instance w/ a customized filter).

@lidel lidel added P1 High: Likely tackled by core team if no one steps up and removed P3 Low: Not priority right now labels Aug 2, 2022
@lidel lidel self-assigned this Aug 2, 2022
@lidel
Member

lidel commented Aug 2, 2022

I believe it is time to prioritize this. There is enough need and interest around blocking bad bits for this to be part of Kubo, and not just a plugin.

Quick notes:

  1. Denylists are not enough. It has to be allow and deny lists from the start.
    • Node operators have been asking not only for blocking bad bits, but also for a primitive for blocking everything and only allowing specific CIDs and paths (e.g. a startup only wants to run a gateway to host their user data). If we don't tackle allowlists as part of this, we will end up with a franken-API in the future when allowlists are bolted on awkwardly.
  • MVP:
    • Add a command namespace (tbd, ipfs rules --help is as good as any other) allowing users to build a content policy around allow or deny (and set the default strategy).
    • We don't need to cover all use cases; it should be a low-level primitive that people can implement their own strategies on top of (similar to firewall rules).
      • each CID / path has to be added as an explicit allow or deny entry
      • use the default policy when no entry matches
      • ability to mark an added rule as sensitive (enables us to interop with https://badbits.dwebops.pub/) so it is never stored/exported in cleartext
      • use this during path resolution, bitswap and processing of Gateway requests (covers the common asks from the community)
  2. import and export commands should be part of the UX, but we need to agree on the transport format – gathering feedback in IPIP: format for denylists for IPFS Nodes and Gateways specs#299
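
A rough sketch of the low-level primitive outlined in the MVP notes: explicit allow/deny entries, a default policy, and sensitive rules that are only ever stored as digests. All names are hypothetical, and storing a plain SHA-256 of the key is an assumption of this example; it is not claimed to match the badbits format or whatever specs#299 settles on.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// Action is the outcome of evaluating a rule.
type Action int

const (
	Deny Action = iota
	Allow
)

func (a Action) String() string {
	if a == Allow {
		return "allow"
	}
	return "deny"
}

// Policy holds explicit rules plus the default used when nothing matches.
type Policy struct {
	Default Action
	rules   map[string]Action // key is cleartext, or "sha256:<hex>" if sensitive
}

func NewPolicy(def Action) *Policy {
	return &Policy{Default: def, rules: map[string]Action{}}
}

// hashKey derives the stored form of a sensitive key (assumed scheme).
func hashKey(key string) string {
	sum := sha256.Sum256([]byte(key))
	return "sha256:" + hex.EncodeToString(sum[:])
}

// Add stores an explicit rule for a CID or content path. Sensitive rules
// keep only a digest, so the cleartext never reaches storage or exports.
func (p *Policy) Add(key string, a Action, sensitive bool) {
	if sensitive {
		key = hashKey(key)
	}
	p.rules[key] = a
}

// Decide checks cleartext rules, then hashed rules, then the default.
func (p *Policy) Decide(key string) Action {
	if a, ok := p.rules[key]; ok {
		return a
	}
	if a, ok := p.rules[hashKey(key)]; ok {
		return a
	}
	return p.Default
}

// Export lists the rules in a stable order, as "<action> <stored key>".
func (p *Policy) Export() []string {
	out := make([]string, 0, len(p.rules))
	for k, a := range p.rules {
		out = append(out, a.String()+" "+k)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Allowlist mode: deny everything except explicitly allowed content.
	p := NewPolicy(Deny)
	p.Add("/ipfs/cid-user-data", Allow, false)
	fmt.Println(p.Decide("/ipfs/cid-user-data"), p.Decide("/ipfs/cid-other"))

	// Denylist mode with a sensitive entry.
	q := NewPolicy(Allow)
	q.Add("/ipfs/cid-bad", Deny, true)
	fmt.Println(q.Decide("/ipfs/cid-bad"), q.Export())
}
```

The same `Decide` call could be made from path resolution, bitswap and the gateway handler, which is what keeps this a single primitive rather than three separate filters.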

@lidel
Member

lidel commented Aug 9, 2022

Another requirement from Infra team: ability to allow / deny specific PeerIDs.

This is a real-world need which I have also had in the past. In many cases, we struggle to create deterministic test fixtures. Making sure a node can't dial a specific peer and needs to get data from someone else requires disabling more and more internal services (mdns, routing, relays...) and is very brittle: the test setup can break the moment we introduce a new discovery method.

When we design ipfs rules it should encompass allow / deny rules for:

  • CIDs and content paths
  • PeerIDs and multiaddrs
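
One possible shape for rules that span both groups is to key each entry by a subject kind plus a value. The sketch below is hypothetical (`Subject`, `RuleSet` and the kind names are invented for illustration) and treats all identifiers as opaque strings.

```go
package main

import "fmt"

// Kind says what a rule's subject refers to.
type Kind int

const (
	KindCID Kind = iota
	KindPath
	KindPeerID
	KindMultiaddr
)

// Subject identifies the thing a rule applies to.
type Subject struct {
	Kind  Kind
	Value string
}

// RuleSet holds explicit allow/deny entries and a default.
type RuleSet struct {
	DefaultAllow bool
	rules        map[Subject]bool // true = allow, false = deny
}

func NewRuleSet(defaultAllow bool) *RuleSet {
	return &RuleSet{DefaultAllow: defaultAllow, rules: map[Subject]bool{}}
}

// Allow adds an explicit allow entry for s.
func (r *RuleSet) Allow(s Subject) { r.rules[s] = true }

// Deny adds an explicit deny entry for s.
func (r *RuleSet) Deny(s Subject) { r.rules[s] = false }

// Permits returns the explicit rule for s, or the default if none exists.
func (r *RuleSet) Permits(s Subject) bool {
	if allowed, ok := r.rules[s]; ok {
		return allowed
	}
	return r.DefaultAllow
}

func main() {
	// Test-fixture scenario: forbid dialing one peer so that data has to
	// come from someone else, without disabling discovery services.
	rs := NewRuleSet(true)
	rs.Deny(Subject{Kind: KindPeerID, Value: "peer-A"})

	fmt.Println(rs.Permits(Subject{Kind: KindPeerID, Value: "peer-A"}))
	fmt.Println(rs.Permits(Subject{Kind: KindPeerID, Value: "peer-B"}))
	fmt.Println(rs.Permits(Subject{Kind: KindCID, Value: "peer-A"}))
}
```

Because the kind is part of the key, a rule for a PeerID never accidentally matches a CID or multiaddr with the same string value.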

Projects
Status: 🥞 Todo

4 participants